Elasticsearch - logs and metrics

Elasticsearch -> Logs & Metrics
Or how to make your life easier
Rafał Kuć

Shipping
[Diagram, built up over several slides: three File Shippers send data to a central buffer, which feeds a cluster of Elasticsearch (ES) nodes.]

Log shipper
[Diagram: the same pipeline: File Shippers -> central buffer -> ES nodes.]

What to keep in mind?
[Diagram: metrics flowing into a centralized buffer.]
What should you use to ship logs?
Which protocol should you use?
How should you buffer the data?
Log directly as JSON, or parse?

Buffers
performance & availability
batches & threads
what if the central buffer goes down?

Buffer types
Disk || memory || hybrid approach
At the source || centralized
[Diagram: local buffer: each App has its own Buffer in front of ES.]
[Diagram: central buffer: Apps send to Kafka / Redis / Logstash / etc… in front of ES.]

Central buffer
[Diagram: File Shippers -> central buffer -> ES nodes.]

Why Apache Kafka?
Fast and easy to use
Easy to scale
Fault tolerant & highly available
Streaming support
Works in a publish/subscribe model

Kafka
[Diagram: a ZooKeeper ensemble of three nodes alongside four Kafka brokers.]

Kafka & topics
[Diagram: topics es_metrics, system_metrics, mongo_metrics, app_metrics.]
Kafka stores data in topics, which are persisted on disk

Kafka & topics & partitions & replicas
[Diagram: the metrics topic split into partitions 1-4; each partition is replicated to a "metrics replica" copy of the same partition.]

Scaling
[Diagram, built up over three slides: the metrics topic grows from 1 partition to 4 partitions, and then to 16 partitions.]

Things to remember when using Kafka
It scales by adding partitions, not threads
More IOPS == better
The number of consumers (in a consumer group) should equal the number of partitions
Replicas are used only for HA & FT
The offset is stored per consumer (in Kafka terms, per consumer group and partition)
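
To see why the consumer count should match the partition count, here is a minimal sketch. It assumes a simplified round-robin model and is not Kafka's actual assignor; the names assign_partitions and idle_consumers are made up for this illustration. Within one consumer group a partition is read by at most one consumer, so extra consumers get nothing to do. Partitions are numbered from 0 here, as in Kafka, while the diagrams count from 1.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions round-robin over the consumers of one group.

    Simplified model for illustration only, not Kafka's real assignor.
    """
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def idle_consumers(assignment):
    """Return the consumers that ended up with no partition at all."""
    return [consumer for consumer, parts in assignment.items() if not parts]


if __name__ == "__main__":
    # Hypothetical consumer names; 4 partitions as on the "metrics" slides.
    for group in (["c1", "c2"], ["c1", "c2", "c3", "c4"],
                  ["c1", "c2", "c3", "c4", "c5", "c6"]):
        result = assign_partitions(4, group)
        print(result, "idle:", idle_consumers(result))
```

With fewer consumers than partitions, some consumers carry several partitions. With more consumers than partitions, the extra ones sit idle. That is the reasoning behind scaling with partitions rather than threads.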

Elasticsearch
[Diagram: File Shippers -> centralized buffer -> ES nodes.]

Elasticsearch – cluster architecture
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes.]

Remember about dedicated masters
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes.]
discovery.zen.minimum_master_nodes -> N/2 + 1 (integer division), where N is the number of master-eligible nodes
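
The quorum formula above can be worked through with a small sketch. The function name is made up for this illustration; only the N/2 + 1 rule comes from the slide.

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: N // 2 + 1 (integer division)."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} master-eligible node(s) -> "
              f"discovery.zen.minimum_master_nodes = {minimum_master_nodes(n)}")
```

For the three dedicated masters in the diagram this gives 2, so the cluster can lose one master and still elect a new one. Two halves of a split cluster can never both reach quorum.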

Elasticsearch – Indices
Index – a logical place for data
Index – can be compared to a table in a DB
Index – built from one or more shards
Index – can be distributed

Scaling Elasticsearch
[Diagram: a single metrics index with one shard (Shard1).]

[Diagram: separate indices mongo_metrics, app_metrics, es_metrics, each with one shard.]

[Diagram: the metrics index split into Shard1, Shard2, Shard3, Shard4.]

[Diagram: the four metrics shards spread across nodes.]

[Diagram: each primary shard placed next to a replica of a different shard: Shard1 with Replica4, Shard2 with Replica3, Shard4 with Replica1, Shard3 with Replica2.]

One big index is a bad idea
Insufficient performance for time-based data
Indexing slows down as the amount of data grows
Merges become more and more expensive
Delete by query is needed to control data retention

Daily indices are a good start
2017.11.16 2017.11.17 2017.11.20 2017.11.21 . . .
Indexing is faster on small indices
Deleting data is cheap
Searches run only on the data we want
Static indices are "cache friendly"
[Diagram labels: indexing, most searches.]
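
A minimal sketch of how daily index names and cheap retention could fit together. The metrics_ prefix and the date format follow the index names used elsewhere in this deck. The helper names and the retention policy are assumptions made for this illustration.

```python
from datetime import date, timedelta


def daily_index(prefix, day):
    """Build a daily index name such as metrics_2017.11.22."""
    return f"{prefix}_{day:%Y.%m.%d}"


def indices_to_delete(prefix, existing_days, today, retention_days):
    """Name the daily indices that are older than the retention window.

    Retention becomes dropping whole indices instead of running
    delete by query inside one big index.
    """
    cutoff = today - timedelta(days=retention_days)
    return [daily_index(prefix, day)
            for day in sorted(existing_days) if day < cutoff]


if __name__ == "__main__":
    days = [date(2017, 11, d) for d in (16, 17, 20, 21)]
    print(daily_index("metrics", date(2017, 11, 22)))
    print(indices_to_delete("metrics", days, date(2017, 11, 22), 2))
```

Each name returned by indices_to_delete could then be removed with a single index delete request, which is far cheaper than deleting individual documents.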

Daily indices are not fully optimal
[Diagram: load per day, with Black Friday, Saturday, and Sunday marked.]
The load is not even

Size-based indices
[Diagram: indexing goes into metrics_01 up to a size limit.]

[Diagram: metrics_01 and metrics_02; indexing.]

Size-based indices are optimal
[Diagram: metrics_01, metrics_02, . . ., metrics_N; indexing.]

Predictable performance
Better data balance
Fewer shards
Easier handling of sudden data growth
Lower costs through better hardware utilization
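
The size-based idea can be sketched as a tiny simulation. The function name, the size units, and the exact rollover rule (start a new index when the next batch would exceed the limit) are assumptions for this illustration and are not taken from the slides. The metrics_NN naming follows the diagrams.

```python
def route_by_size(batch_sizes, size_limit, prefix="metrics"):
    """Assign each incoming batch to an index, rolling over at a size limit.

    Returns a list of (index_name, batch_size) pairs.
    """
    sequence, used = 1, 0
    routed = []
    for size in batch_sizes:
        # Roll over before the current index would exceed the limit.
        if used > 0 and used + size > size_limit:
            sequence += 1
            used = 0
        used += size
        routed.append((f"{prefix}_{sequence:02d}", size))
    return routed


if __name__ == "__main__":
    # A quiet period followed by a traffic spike, in arbitrary size units.
    for name, size in route_by_size([10, 10, 10, 60, 60, 60], size_limit=100):
        print(name, size)
```

A spike simply produces more indices of similar size, instead of one oversized daily index. That is what keeps performance predictable.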

Elasticsearch - configuration
Keep index.refresh_interval as high as your use case allows
1 sec -> 100%, 5 sec -> 125%, 30 sec -> 175% (relative indexing throughput)
Tuning merge policy:
- possible because of the use case
- segments_per_tier -> higher
- max_merge_at_once -> higher
- max_merged_segment -> lower
All of the above are prefixed with index.merge.policy
=> faster indexing
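
As a rough sketch, the settings named above could be collected into one index settings body. The setting names come from the slide. The concrete values are illustrative assumptions only, not recommendations from the talk, and the helper name is made up.

```python
import json


def indexing_tuned_settings(refresh_interval="30s",
                            segments_per_tier=50,
                            max_merge_at_once=50,
                            max_merged_segment="2gb"):
    """Build an index settings body tuned for indexing throughput.

    All default values are placeholders chosen for this example.
    """
    prefix = "index.merge.policy"
    return {
        "index.refresh_interval": refresh_interval,
        f"{prefix}.segments_per_tier": segments_per_tier,
        f"{prefix}.max_merge_at_once": max_merge_at_once,
        f"{prefix}.max_merged_segment": max_merged_segment,
    }


if __name__ == "__main__":
    print(json.dumps(indexing_tuned_settings(), indent=2))
```

The printed JSON has the same flat-key shape as the settings bodies sent with curl elsewhere in this deck. Any real values should be validated against your own indexing and search load.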

Elasticsearch - optimization
Because the data is time-based, we can optimize
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes.]

Hot – cold architecture
[Diagram: three nodes: ES hot, ES cold, ES cold.]
-Enode.attr.tag=hot -Enode.attr.tag=cold -Enode.attr.tag=cold

[Diagram: the metrics_2017.11.22 index and the same three nodes.]
-Enode.attr.tag=hot -Enode.attr.tag=cold -Enode.attr.tag=cold
curl -XPUT localhost:9200/metrics_2017.11.22 -H 'Content-Type: application/json' -d '{
  "settings" : {
    "index.routing.allocation.exclude.tag" : "cold",
    "index.routing.allocation.include.tag" : "hot"
  }
}'

[Diagram: metrics_2017.11.22; indexing.]

[Diagram: metrics_2017.11.22 and metrics_2017.11.23; indexing.]

[Diagram: metrics_2017.11.22 and metrics_2017.11.23; indexing.]
The older index is then moved to the cold nodes:
curl -XPUT localhost:9200/metrics_2017.11.22/_settings -H 'Content-Type: application/json' -d '{
  "index.routing.allocation.exclude.tag" : "hot",
  "index.routing.allocation.include.tag" : "cold"
}'
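
The two settings bodies above differ only in which tier is included and which is excluded, so they could be generated by a small helper. This is a sketch: the function name and the fixed two-tier list are assumptions, while the setting keys and the tag attribute are the ones used in the curl commands.

```python
import json

TIERS = ("hot", "cold")


def allocation_settings(target, attr="tag"):
    """Build the settings body that pins an index to one tier.

    Includes the target tier and excludes every other tier, using the
    custom node attribute (node.attr.tag) from the slides.
    """
    if target not in TIERS:
        raise ValueError(f"unknown tier: {target!r}")
    others = [tier for tier in TIERS if tier != target]
    return {
        f"index.routing.allocation.include.{attr}": target,
        f"index.routing.allocation.exclude.{attr}": ",".join(others),
    }


if __name__ == "__main__":
    # Body for creating today's index on the hot tier.
    print(json.dumps({"settings": allocation_settings("hot")}, indent=2))
    # Body for the _settings call that moves an older index to cold.
    print(json.dumps(allocation_settings("cold"), indent=2))
```

A scheduled job could send the cold body to each index that is no longer being written to. Elasticsearch then relocates its shards to the cold nodes.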

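[Diagram: metrics_2017.11.23 on the hot node with indexing; metrics_2017.11.22 on a cold node.]

Hot – cold architecture
[Diagram: metrics_2017.11.23 and metrics_2017.11.24 on the hot node with indexing; metrics_2017.11.22 on a cold node.]

[Diagram: metrics_2017.11.24 on the hot node with indexing; metrics_2017.11.22 and metrics_2017.11.23 on the cold nodes.]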

Hot ES Tier
CPU
I/O
Cold ES Tier
RAM
I/O

Elasticsearch client node requirements
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes.]

Elasticsearch ingest node requirements
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes.]

Elasticsearch master node requirements
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes.]

Thanks!
Rafał
rafal.kuc@sematext.com
@kucrafal
http://sematext.com
@sematext http://sematext.com/jobs
