* distributed (traces);

* synthetic (availability);

* AIOps (forecasting, anomalies).

Monitoring is divided into two parts according to the degree of analysis: logging systems and incident investigation systems. An example of a logging system is the ELK stack, and an example of an incident investigation system is Sentry (SaaS). For microservices, a request tracing system such as Jaeger or Zipkin is also added. The logging system simply writes all the logs that are available. The incident investigation system writes much more information, but only when an error occurs in the application: for example, environment parameters, versions of installed packages, the stack trace and so on. This lets you get the maximum information when viewing the error, rather than collecting it piece by piece from the server and the Git repository. However, the set and format of this information depend on the environment, so the incident system needs to be integrated with various language platforms and, better still, with specific frameworks. For example, Sentry sends environment variables, the code fragment and the location where the error occurred, program and platform environment parameters, and method calls.
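To make the difference from plain logging concrete, here is a minimal Python sketch of the kind of context an incident investigation system gathers at the moment of an error. It is not Sentry's API or data format: the function `build_error_report`, the report keys and the choice of environment variables are all assumptions made for illustration.

```python
import os
import platform
import traceback
from importlib import metadata


def build_error_report(exc, env_keys=("PATH", "HOME")):
    """Collect context about an exception into a plain dict.

    The layout of the dict is hypothetical; a real incident system
    defines its own format and sends it over the network.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1] if frames else None  # innermost frame: where the error occurred
    return {
        "error": f"{type(exc).__name__}: {exc}",
        "stack_trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
        "location": (
            {"file": last.filename, "line": last.lineno, "function": last.name}
            if last
            else None
        ),
        # Only a selected subset of environment variables (assumed names).
        "environment": {key: os.environ.get(key) for key in env_keys},
        "platform": {
            "python": platform.python_version(),
            "system": platform.system(),
        },
        # Versions of installed packages, as seen by the running interpreter.
        "packages": sorted(
            f"{dist.metadata['Name']}=={dist.version}"
            for dist in metadata.distributions()
        ),
    }


def divide(a, b):
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError as exc:
    report = build_error_report(exc)
    print(report["error"], "in", report["location"]["function"])
```

Everything here is collected in the process that failed, which is why such systems need an integration for each language platform or framework.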

Ecosystem monitoring can be divided into:

* Built into the cloud: Azure Monitor, Amazon CloudWatch, Google Cloud Monitoring

* Provided as a service (SaaS) with support for various integrations: Datadog, New Relic

* Cloud native: Prometheus

* For dedicated on-premises servers: Zabbix

Zabbix was developed in 1998 and released as open source under the GPL in 2001. Its interface is traditional for that time: without any design, with a lot of tabs, selectors and the like. Since it was developed for its authors' own needs, it contains specific solutions. It is oriented toward monitoring devices and their components, such as disks, networks, printers, routers and the like. The following can be used for interaction:

* Agents – installed on servers, they collect many metrics and send them to the Zabbix server

* HTTP – Zabbix makes requests over HTTP, for example to printers

* SNMP – a network protocol for communicating with network devices

* IPMI – a protocol for communicating with server hardware
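The agent approach from the list above can be illustrated with a small Python sketch: a process on the monitored server collects a few local metrics that a monitoring server could then receive. This is not the Zabbix agent or its protocol. The function `collect_metrics` and the metric names are hypothetical, and the transport to the server is deliberately left out.

```python
import os
import shutil
import time


def collect_metrics(path="/"):
    """Collect a few host metrics for the filesystem containing `path`.

    The metric names are illustrative and are not Zabbix item keys.
    """
    usage = shutil.disk_usage(path)
    free_percent = round(usage.free / usage.total * 100, 1) if usage.total else 0.0
    return {
        "timestamp": int(time.time()),
        "cpu.count": os.cpu_count(),
        "disk.total_bytes": usage.total,
        "disk.free_bytes": usage.free,
        "disk.free_percent": free_percent,
    }


metrics = collect_metrics(".")
for name, value in metrics.items():
    print(f"{name} = {value}")
```

A real agent would run such a collection on a schedule and send the result to the server, while the HTTP, SNMP and IPMI options work the other way around: the server polls the device.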

In 2019, Gartner presented a rating of monitoring systems in its Magic Quadrant:

** Dynatrace;

** Cisco (AppDynamics);

** New Relic;

** Broadcom (CA Technologies);

** Riverbed and Microsoft;

** IBM;

** Oracle;

** SolarWinds;

** Micro Focus;

** ManageEngine and Tingyun.

Not included in the quadrant:

** Correlsense;

** Datadog;

** Elastic;

** Honeycomb;

** Instana;

** JenniferSoft;

** LightStep;

** Nastel Technologies;

** SignalFx;

** Splunk;

** Sysdig.

When we run an application in a Docker container, all the standard output (what is displayed in the console) of the running program (process) is buffered. We can view this buffer with the docker logs name_container command. If we follow the Docker ideology of "one process, one container", we can view the logs of an individual program. It is convenient to use the less and tail commands to view logs. The first lets you scroll through the logs with the keyboard arrows and search for the desired text by plain match or by regular expression pattern, as in the vi text editor. The second displays only the number of lines we need from the end of the log.
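As a rough illustration of what a tail-style view does with captured output, here is a minimal Python sketch that keeps only the last few lines of a stream. The helper `last_lines` and the sample log lines are hypothetical and are not part of Docker.

```python
from collections import deque


def last_lines(lines, n=10):
    """Return the last n items of an iterable of lines, similar to `tail -n`.

    A bounded deque discards older lines as new ones arrive, so the
    whole log never has to be held in memory.
    """
    return list(deque(lines, maxlen=n))


# Made-up log lines standing in for the captured output of a container.
log = [f"request {i} handled" for i in range(1, 101)]
print("\n".join(last_lines(log, n=3)))
```

The same idea explains why such a view is cheap even on a long log: only the requested number of trailing lines is retained.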

An important criterion for ensuring smooth operation is the control of free space. If there is no space left, the database will not be able to write data, and for other components the consequences can be worse than the loss of new data. Docker has limit settings not only for individual containers, at least 10%. During image building or container startup, an error may be thrown that the specified limits have been exceeded. To change the default settings, you need to pass them to the dockerd daemon, first stopping it with service docker stop (all containers will be stopped) and then starting it again with service docker start (the containers will be resumed). Settings can be set as options: /bin/dockerd --storage-opt dm.basesize=50G --storage-opt