
{beta_kubernetes_io_arch="amd64", beta_kubernetes_io_os="linux", device="tmpfs", id="/", instance="node01", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="node01", kubernetes_io_os="linux"} 0.409296896

If we want to get the minimum disk size, we need to exclude the in-memory tmpfs device from the list: min(container_fs_limit_bytes{device!="tmpfs"}/1000/1000/1000)

{} 253.74174822400002

In addition to gauge metrics, whose value is the measurement itself, there are counter metrics. Their names usually end in "_total". If we look at them, we see an ever-ascending line. To get a usable value, we take the rate of change (using the rate function) over a period of time, which is indicated in square brackets inside the call: rate(metric_name_total[time]). Time is usually given in seconds or minutes. The suffix "s" is used for seconds, for example 40s, 60s; for minutes it is "m", for example 2m, 5m. It is important to note that the range must cover at least two scrape intervals, otherwise rate has nothing to compute a difference from.
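For example, to turn an ever-growing CPU counter into a per-second rate averaged over the last five minutes (the metric name here is only an illustration; any counter works the same way):

rate(container_cpu_usage_seconds_total[5m])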

The raw metric names themselves can be seen at the /metrics path:

controlplane $ curl https://2886795314-9090-ollie08.environments.katacoda.com/metrics 2>/dev/null | head

# HELP go_gc_duration_seconds A summary of the GC invocation durations.

# TYPE go_gc_duration_seconds summary

go_gc_duration_seconds{quantile="0"} 3.536e-05

go_gc_duration_seconds{quantile="0.25"} 7.5348e-05

go_gc_duration_seconds{quantile="0.5"} 0.000163193

go_gc_duration_seconds{quantile="0.75"} 0.001391603

go_gc_duration_seconds{quantile="1"} 0.246707852

go_gc_duration_seconds_sum 0.388611299

go_gc_duration_seconds_count 74

# HELP go_goroutines Number of goroutines that currently exist.
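As an aside, the _sum and _count series of a summary like this can be combined to compute an average, for example the mean GC pause over the last five minutes (our illustration, not part of the session output):

rate(go_gc_duration_seconds_sum[5m]) / rate(go_gc_duration_seconds_count[5m])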

Raising a Prometheus and Grafana bundle

We examined the metrics in an already configured Prometheus; now we will bring up a Prometheus of our own and configure it:

essh@kubernetes-master:~$ docker run -d --net=host --name prometheus prom/prometheus

09416fc74bf8b54a35609a1954236e686f8f6dfc598f7e05fa12234f287070ab

essh@kubernetes-master:~$ docker ps -f name=prometheus

CONTAINER ID IMAGE NAMES

09416fc74bf8 prom/prometheus prometheus

Open the UI with graphs for displaying metrics:

essh@kubernetes-master:~$ firefox localhost:9090

Add the go_gc_duration_seconds{quantile="0"} metric from the list:

essh@kubernetes-master:~$ curl localhost:9090/metrics 2>/dev/null | head -n 4

# HELP go_gc_duration_seconds A summary of the GC invocation durations.

# TYPE go_gc_duration_seconds summary

go_gc_duration_seconds{quantile="0"} 1.0097e-05

go_gc_duration_seconds{quantile="0.25"} 1.7841e-05

Go to the UI at localhost:9090 and select Graph in the menu. Let's add a chart to the dashboard: pick the metric from the "insert metric at cursor" list. Here we see the same metrics as in the localhost:9090/metrics listing, but aggregated by parameters, for example simply go_gc_duration_seconds. We select the go_gc_duration_seconds metric and display it with the Execute button. In the Console tab of the dashboard we see the metrics:

go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0"} 0.000009186

go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.25"} 0.000012056

go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.5"} 0.000023256

go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.75"} 0.000068848

go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="1"} 0.00021869

And by switching to the Graph tab, we get their graphical representation.
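To plot a single series rather than all quantiles at once, the query can be narrowed with a label matcher, for example the median GC pause only:

go_gc_duration_seconds{quantile="0.5"}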

Now Prometheus collects metrics from the current node: go_*, net_*, process_*, prometheus_*, promhttp_*, scrape_* and up. To collect Docker's metrics as well, we tell the Docker daemon to expose them in Prometheus format on port 9323:

essh@kubernetes-master:~$ curl http://localhost:9323/metrics 2>/dev/null | head -n 20

# HELP builder_builds_failed_total Number of failed image builds

# TYPE builder_builds_failed_total counter

builder_builds_failed_total{reason="build_canceled"} 0

builder_builds_failed_total{reason="build_target_not_reachable_error"} 0

builder_builds_failed_total{reason="command_not_supported_error"} 0

builder_builds_failed_total{reason="dockerfile_empty_error"} 0

builder_builds_failed_total{reason="dockerfile_syntax_error"} 0

builder_builds_failed_total{reason="error_processing_commands_error"} 0

builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0

builder_builds_failed_total{reason="unknown_instruction_error"} 0

# HELP builder_builds_triggered_total Number of triggered image builds

# TYPE builder_builds_triggered_total counter

builder_builds_triggered_total 0

# HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action

# TYPE engine_daemon_container_actions_seconds histogram

engine_daemon_container_actions_seconds_bucket{action="changes", le="0.005"} 1

engine_daemon_container_actions_seconds_bucket{action="changes", le="0.01"} 1

engine_daemon_container_actions_seconds_bucket{action="changes", le="0.025"} 1

engine_daemon_container_actions_seconds_bucket{action="changes", le="0.05"} 1

engine_daemon_container_actions_seconds_bucket{action="changes", le="0.1"} 1
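Bucketed histograms like this one are typically queried with histogram_quantile; for example, an estimate of the 95th percentile of container action time over the last five minutes (our illustration, not part of the session):

histogram_quantile(0.95, rate(engine_daemon_container_actions_seconds_bucket[5m]))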

For the Docker daemon to apply the parameters, it must be restarted, which will stop all containers; when the daemon comes back up, the containers will be started again according to their restart policy:

essh@kubernetes-master:~$ sudo chmod a+w /etc/docker/daemon.json

essh@kubernetes-master:~$ echo '{"metrics-addr": "127.0.0.1:9323", "experimental": true}' | jq -M -f /dev/null > /etc/docker/daemon.json

essh@kubernetes-master:~$ cat /etc/docker/daemon.json

{

"metrics-addr": "127.0.0.1:9323",

"experimental": true

}

essh@kubernetes-master:~$ systemctl restart docker

So far Prometheus only scrapes metric sources on the same server. For us to collect metrics from different nodes and see the aggregated result, we need to put an agent that collects metrics on each node:

essh@kubernetes-master:~$ docker run -d \
-v "/proc:/host/proc" \
-v "/sys:/host/sys" \
-v "/:/rootfs" \
--net="host" \
--name=explorer \
quay.io/prometheus/node-exporter:v0.13.0 \
-collector.procfs /host/proc \
-collector.sysfs /host/sys \
-collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)"

1faf800c878447e6110f26aa3c61718f5e7276f93023ab4ed5bc1e782bf39d56

and register it to be scraped at the node's address; for now everything is local, at localhost:9100. Now let's tell Prometheus to listen to the agent and to Docker:

essh@kubernetes-master:~$ mkdir prometheus && cd $_
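A minimal prometheus.yml sketch for this directory might look as follows, assuming the target addresses from the setup above (the job names are our own choice):

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'docker'
    static_configs:
      - targets: ['127.0.0.1:9323']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']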