vagrant @ ubuntu: ~ $ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS
c3bc2d2e37a6 ubuntu sleep 10 "46 seconds ago Up 1 second
We see that when the container falls, it restarts. As a result – we always have an application in two states – raised or raised. If a web server crashes from some rare error, this is the norm, but most likely there is an error in processing requests, and it will crash on every such request, and in monitoring we will see a raised container. Such a web server is better dead than half alive. But, at the same time, a normal web server may not start due to rare errors, for example, due to the lack of connection to the database due to network instability. In such a case, the application must be able to handle errors and exit. And in case of a crash due to code errors, do not restart to see the inoperability and send it to the developers for repair. In the case of a floating error, you can try several times:
vagrant @ ubuntu: ~ $ sudo docker run -d –restart = on-failure: 3 ubuntu sleep 10
056c4fc6986a13936e5270585e0dc1782a9246f01d6243dd247cb03b7789de1c
vagrant @ ubuntu: ~ $ sleep 10
vagrant @ ubuntu: ~ $ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c3bc2d2e37a6 ubuntu "sleep 10" 9 minutes ago Up 2 seconds keen_sinoussi
vagrant @ ubuntu: ~ $ sleep 10
vagrant @ ubuntu: ~ $ sleep 10
vagrant @ ubuntu: ~ $ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c3bc2d2e37a6 ubuntu "sleep 10" 10 minutes ago Up 9 seconds keen_sinoussi
vagrant @ ubuntu: ~ $ sleep 10
vagrant @ ubuntu: ~ $ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c3bc2d2e37a6 ubuntu "sleep 10" 10 minutes ago Up 2 seconds keen_sinoussi
Another aspect is when to consider the container dead. By default, this is a process crash. But, by far, the application does not always crash itself in case of an error in order to allow the container to be restarted. For example, a server may be designed incorrectly and try to download the necessary libraries during its startup, but it does not have this opportunity, for example, due to the blocking of requests by the firewall. In such a scenario, the server can wait a long time if an adequate timeout is not specified. In this case, we need to check the functionality. For a web server, this is a response to a specific url, for example:
docker run –rm -d \
–-name = elasticsearch \
–-health-cmd = "curl –silent –fail localhost: 9200 / _cluster / health || exit 1" \
–-health-interval = 5s \
–-health-retries = 12 \
–-health-timeout = 20s \
{image}
For demonstration, we will use the file creation command. If the application has not reached the working state within the allotted time limit (set to 0) (for example, creating a file), then it is marked as working, but before that the specified number of checks is done:
vagrant @ ubuntu: ~ $ sudo docker run \
–d –name healt \
–-health-timeout = 0s \
–-health-interval = 5s \
–-health-retries = 3 \
–-health-cmd = "ls / halth" \
ubuntu bash -c 'sleep 1000'
c0041a8d973e74fe8c96a81b6f48f96756002485c74e51a1bd4b3bc9be0d9ec5
vagrant @ ubuntu: ~ $ sudo docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c0041a8d973e ubuntu "bash -c 'sleep 1000'" 4 seconds ago Up 3 seconds (health: starting) healt
vagrant @ ubuntu: ~ $ sleep 20
vagrant @ ubuntu: ~ $ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c0041a8d973e ubuntu "bash -c 'sleep 1000'" 38 seconds ago Up 37 seconds (unhealthy) healt
vagrant @ ubuntu: ~ $ sudo docker rm -f healt
healt
If at least one of the checks worked, then the container is marked as healthy immediately: