…cases can be served by Fluentd due to its high functionality (reading application logs, system logs, etc.), its scalability and the ability to roll it out across Kubernetes clusters using a Helm chart, and to monitor an entire data center in the standard package; more about this in the relevant section.
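
As an illustration only, such a rollout with Helm might look like the sketch below; it assumes the official fluent helm-charts repository (https://fluent.github.io/helm-charts) and its fluentd chart, and the namespace and release name are illustrative.

# add the Fluent charts repository and install Fluentd (assumed chart source)
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# typically deployed as a DaemonSet so that every cluster node ships its logs
helm install fluentd fluent/fluentd --namespace logging --create-namespace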

To manage logs you can use Curator, which can archive old logs from ElasticSearch or delete them, improving its performance.
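
A hedged sketch of a Curator action file that deletes daily log- indexes older than 30 days is shown below; the file name and thresholds are illustrative, and the YAML assumes Curator 5.x style action files.

# delete_old_logs.yml – run with: curator --config config.yml delete_old_logs.yml
actions:
  1:
    action: delete_indices
    description: "Remove daily log- indexes older than 30 days"
    options:
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: log-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30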

Logs are collected by dedicated collectors: Logstash, Fluentd, Filebeat, or others.

Fluentd is the less demanding and simpler analogue of Logstash. Configuration is done in /etc/td-agent/td-agent.conf, which contains four blocks:

** source – describes where the data is taken from;
** match – contains settings for transferring the received data;
** include – contains information about file types;
** system – contains system settings.
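
For illustration, a minimal td-agent.conf sketch with one source and one match block might look like this; the file path and tag are illustrative, and the elasticsearch output assumes the fluent-plugin-elasticsearch plugin is installed:

<source>
  @type tail                               # follow a log file, like tail -f
  path /var/log/nginx/access.log           # illustrative path
  pos_file /var/log/td-agent/nginx.pos     # remembers the read position
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.**>
  @type elasticsearch                      # needs fluent-plugin-elasticsearch
  host localhost
  port 9200
  logstash_format true                     # writes to daily logstash-YYYY.MM.dd indexes
</match>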

Logstash provides a much more functional configuration language. The Logstash agent daemon, logstash, monitors changes in files. If the logs are located not locally but on a distributed system, Logstash is installed on each server and run in agent mode: bin/logstash agent -f /env/conf/my.conf. Since running Logstash only as an agent for shipping logs is wasteful, you can use a product from the same developers, Logstash Forwarder (formerly Lumberjack), which forwards logs to the Logstash server via the lumberjack protocol. To track and retrieve data from MySQL, you can use the Packetbeat agent (https://www.8host.com/blog/sbor-metrik-infrastruktury-s-pomoshhyu-packetbeat-i-elk-v-ubuntu-14-04/).

Logstash also allows you to transform data of different types (a combined filter sketch is given after the grok example below):

** grok – set regular expressions to extract fields from a string, often used to turn text logs into JSON;
** date – for archived logs, set the creation date of the entry not to the current date but take it from the log itself;
** kv – for logs in the key=value format;
** mutate – keep only the required fields and change data in fields, for example, replace the "/" character with "_";
** multiline – for multi-line logs with delimiters.

For example, a log entry in the "date level number" format, say "01.01.2021 INFO 1", can be decomposed from the "message" string into separate fields:

filter {
  grok {
    type => "my_log"
    match => ["message", "%{MYDATE:date} %{WORD:loglevel} %{ID:id:int}"]
  }
}

The %{ID:id:int} template uses the ID pattern as the class, the matched value is written to the id field, and the string value is converted to the int type.
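
As promised above, here is a hedged sketch combining several of the other filters; the field names and the date format are illustrative:

filter {
  kv { }                                 # split key=value pairs into separate fields
  date {
    match => ["date", "dd.MM.yyyy"]      # take the timestamp from the log itself
  }
  mutate {
    gsub => ["path", "/", "_"]           # replace "/" with "_" in the path field
    remove_field => ["host"]             # keep only the fields we need
  }
}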

In the "Output" block, we can specify: output data to the console using the "Stdout" block, to a file – "File", transfer via http via JSON REST API – "Elasticsearch" or send by mail – "Email". You can also order conditions for the fields obtained in the filter block. For instance,:

output {
  if [type] == "Info" {
    elasticsearch {
      host => "localhost"
      index => "log-%{+YYYY.MM.dd}"
    }
  }
}

Here the Elasticsearch index (a database, if we draw an analogy with SQL) changes every day. There is no need to create a new index specially: this is how NoSQL databases work, since there is no strict requirement to describe the structure (properties and their types). But it is still recommended to describe it, otherwise all fields will be stored as string values unless a numeric type is specified. To display Elasticsearch data, Kibana is used, a WEB-UI plugin written in AngularJS. To show a timeline in its charts, you need at least one field of the date type, and for aggregate functions a numeric one, be it an integer or floating point. Also, if new fields are added, indexing and displaying them requires re-indexing the whole index, so the most complete description of the structure helps to avoid the very time-consuming reindexing operation.
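
A hedged sketch of describing such a structure in advance, assuming Elasticsearch 7+ on localhost:9200 (on older versions the mapping sits under a type name) and reusing the field names from the grok example above:

curl -X PUT 'localhost:9200/log-2021.01.01' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "properties": {
      "date":     { "type": "date", "format": "dd.MM.yyyy" },
      "loglevel": { "type": "keyword" },
      "id":       { "type": "integer" }
    }
  }
}'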

The index is split by day to speed up Elasticsearch; in Kibana you can then select several indexes by pattern, here log-*, and the limitation of one million documents per index is also removed.

Let's consider the Logstash output plugin in more detail:

output {
  if [type] == "Info" {
    elasticsearch {
      cluster => "elasticsearch"
      action => "create"
      hosts => ["localhost:9200"]
      index => "log-%{+YYYY.MM.dd}"
      document_type => ....
      document_id => "%{id}"
    }
  }
}

Interaction with ElasticSearch is carried out through the JSON REST API, for which there are drivers for most modern languages. But in order not to write code, we will use the Logstash utility, which can also convert text data into JSON based on regular expressions. There are predefined templates, like classes in regular expressions, such as %{IP:client} and others, which can be viewed at https://github.com/elastic/logstash/tree/v1.1.9/patterns. For standard services with standard settings there are many ready-made configs on the Internet, for example, for NGINX: https://github.com/zooniverse/static/blob/master/logstash-Nginx.conf. This is described in more detail in the article https://habr.com/post/165059/.
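
For example, a hedged sketch of a grok match built only from the standard predefined patterns mentioned above; the log format itself is illustrative:

filter {
  grok {
    # client IP, HTTP method and request path extracted with predefined patterns
    match => ["message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}"]
  }
}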

ElasticSearch is a NoSQL database, so you do not need to specify a format (a set of fields and their types) in advance. It still needs one for searching, so it defines the format itself, and with each format change re-indexing occurs, during which work is impossible. To maintain a unified structure, the Serilog logger (.NET) has an EventType field in which you can encode a set of fields and their types; for other loggers you will have to implement this separately. To analyze the logs of an application with a microservice architecture, it is important to set an ID at the moment the request is executed, that is, a request ID that stays unchanged and is passed from microservice to microservice, so that the entire path of the request can be traced.
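
A hedged sketch of extracting such a request ID in Logstash, assuming each service writes it as a UUID right after the log level:

filter {
  grok {
    # the same request_id field will then appear in the logs of every microservice
    match => ["message", "%{TIMESTAMP_ISO8601:date} %{WORD:loglevel} %{UUID:request_id} %{GREEDYDATA:msg}"]
  }
}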

Install ElasticSearch (https://habr.com/post/280488/) and check that curl -X GET localhost:9200 works:

sudo sysctl -w vm.max_map_count=262144

$ curl 'localhost:9200/_cat/indices?v'

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size

green open graylog_0 h2NICPMTQlqQRZhfkvsXRw 4 0 0 0 1kb 1kb

green open .kibana_1 iMJl7vyOTuu1eG8DlWl1OQ 1 0 3 0 11.9kb 11.9kb

yellow open indexname le87KQZwT22lFll8LSRdjw 5 1 1 0 4.5kb 4.5kb

yellow open db i6I2DmplQ7O40AUzyA-a6A 5 1 0 0 1.2kb 1.2kb

Create a document in the blog index and the post type (a database and a table, if we keep the SQL analogy): curl -X PUT "$ES_URL/blog/post/1?pretty" -d '
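
The request body is not shown above; a hedged sketch of what such a request might look like, with illustrative field names and values:

curl -X PUT "$ES_URL/blog/post/1?pretty" -H 'Content-Type: application/json' -d '
{
  "title": "ElasticSearch search engine",
  "published": "2021-01-01"
}'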

ElasticSearch search engine

In the previous section we looked at the ELK stack, which is made up of ElasticSearch, Logstash, and Kibana. The full set is often extended with Filebeat, an agent tailored to work with Logstash for shipping text logs. Despite the fact that Logstash quickly performs its task u

If we have an application, then pure ElasticSearch is used as a search engine, and Kibana serves as a tool for writing and debugging queries, the Dev Tools block. Although relational databases have a long history of development, the principle remains: the more normalized the data, the slower it gets, because the data has to be joined on every request. This problem is solved by creating a View, which stores the resulting selection. But although modern databases have acquired impressive functionality, up to full-text search, they still ca