vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID   IMAGE    COMMAND      CREATED          STATUS
c3bc2d2e37a6   ubuntu   "sleep 10"   46 seconds ago   Up 1 second

We see that when the container goes down, it is restarted. As a result, the application is always in one of two states: up or coming up. If a web server crashes because of some rare error, that is acceptable. More likely, though, there is a bug in request handling, so the server will crash on every such request while monitoring still shows a running container. Such a web server is better dead than half alive. At the same time, a perfectly good web server may fail to start because of a transient error, for example a missing database connection caused by network instability. In that case the application must be able to handle the error and exit. If the crash is caused by a code error, the container should not be restarted, so that the failure is visible and can be handed to the developers for a fix. For an intermittent error, you can retry a limited number of times:

vagrant@ubuntu:~$ sudo docker run -d --restart=on-failure:3 ubuntu sleep 10
056c4fc6986a13936e5270585e0dc1782a9246f01d6243dd247cb03b7789de1c
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID   IMAGE    COMMAND      CREATED         STATUS         PORTS   NAMES
c3bc2d2e37a6   ubuntu   "sleep 10"   9 minutes ago   Up 2 seconds           keen_sinoussi
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID   IMAGE    COMMAND      CREATED          STATUS         PORTS   NAMES
c3bc2d2e37a6   ubuntu   "sleep 10"   10 minutes ago   Up 9 seconds           keen_sinoussi
vagrant@ubuntu:~$ sleep 10
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID   IMAGE    COMMAND      CREATED          STATUS         PORTS   NAMES
c3bc2d2e37a6   ubuntu   "sleep 10"   10 minutes ago   Up 2 seconds           keen_sinoussi
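The advice above, that the application should handle a transient error itself and then exit, can be sketched as a small helper for a container entrypoint. This is only an illustration: the function name retry and its arguments are assumptions, not part of Docker. What matters is the non-zero return after the last attempt, because the on-failure policy restarts a container only when its main process exits with a non-zero code.

```shell
# Hypothetical helper for an entrypoint script (not a Docker feature):
# run a command up to MAX times, pausing DELAY seconds between attempts.
# Usage: retry MAX DELAY COMMAND [ARG...]
retry() {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0                      # the command succeeded, stop retrying
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"                # wait before the next attempt
        fi
    done
    return 1                              # give up: the caller should exit non-zero
}
```

In an entrypoint this might be used as `retry 3 2 check_db || exit 1`, where check_db stands for whatever command verifies the database connection in your image.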

Another question is when a container should be considered dead. By default, that means the process has crashed. But an application does not always crash on an error, and without a crash the container is never restarted. For example, a poorly designed server may try to download required libraries at startup and be unable to do so, for instance because a firewall blocks its requests. In that scenario the server can hang for a long time if no reasonable timeout is set. Here we need to check that the application actually works. For a web server, that means getting a response from a specific URL, for example:

docker run --rm -d \
    --name=elasticsearch \
    --health-cmd="curl --silent --fail localhost:9200/_cluster/health || exit 1" \
    --health-interval=5s \
    --health-retries=12 \
    --health-timeout=20s \
    {image}
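The `|| exit 1` in the check above matters because Docker reads the exit code of the health command: 0 means healthy and 1 means unhealthy. A probe that hangs is as bad as a server that hangs, so it can help to bound the probe itself as well. The following is a minimal sketch, assuming the `timeout` utility (GNU coreutils or BusyBox) is available inside the image; the function name probe is made up for this example.

```shell
# Hypothetical wrapper for a health command (the name "probe" is an assumption).
# Runs COMMAND with a hard time limit and reduces the result to 0 or 1,
# the two exit codes Docker interprets as healthy and unhealthy.
# Usage: probe LIMIT COMMAND [ARG...]
probe() {
    limit="$1"
    shift
    # timeout stops the command if it runs longer than LIMIT seconds
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        return 0    # the check passed in time
    fi
    return 1        # the check failed or took too long
}
```

For instance, `probe 5 curl --silent --fail localhost:9200/_cluster/health` would report a failure if the endpoint answers with an error or does not answer within five seconds.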

For demonstration, we will use a check for the existence of a file. If the application does not reach a working state (here, the file appearing) within the allotted time limit (set to 0 in this example), the container is marked as not working (unhealthy), but only after the specified number of checks has been made:

vagrant@ubuntu:~$ sudo docker run \
    -d --name healt \
    --health-timeout=0s \
    --health-interval=5s \
    --health-retries=3 \
    --health-cmd="ls /halth" \
    ubuntu bash -c 'sleep 1000'
c0041a8d973e74fe8c96a81b6f48f96756002485c74e51a1bd4b3bc9be0d9ec5
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID   IMAGE    COMMAND                  CREATED         STATUS                            PORTS   NAMES
c0041a8d973e   ubuntu   "bash -c 'sleep 1000'"   4 seconds ago   Up 3 seconds (health: starting)           healt
vagrant@ubuntu:~$ sleep 20
vagrant@ubuntu:~$ sudo docker ps
CONTAINER ID   IMAGE    COMMAND                  CREATED          STATUS                      PORTS   NAMES
c0041a8d973e   ubuntu   "bash -c 'sleep 1000'"   38 seconds ago   Up 37 seconds (unhealthy)           healt
vagrant@ubuntu:~$ sudo docker rm -f healt
healt
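The transitions seen in this session (starting, then unhealthy after three failed checks) can be modelled in a few lines. This is a simplified sketch for understanding, not Docker's actual implementation: the function name health_status is invented, and each argument after the retry count stands for the exit code of one health check.

```shell
# Simplified model of the health status transitions (an illustration only).
# Usage: health_status RETRIES EXIT_CODE...
health_status() {
    retries="$1"
    shift
    status="starting"
    streak=0                          # consecutive failed checks so far
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            streak=0
            status="healthy"          # a single successful check is enough
        else
            streak=$((streak + 1))
            if [ "$streak" -ge "$retries" ]; then
                status="unhealthy"    # RETRIES failures in a row
            fi
        fi
    done
    echo "$status"
}

health_status 3 1 1 1    # prints "unhealthy", like the ls /halth check above
health_status 3 1 1      # prints "starting": not enough failures yet
health_status 3 1 0      # prints "healthy"
```

Only consecutive failures count in this model: a successful check resets the streak.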

If at least one of the checks succeeds, the container is marked as healthy immediately: