L'ODYSSÉE DE LA LOG
1
WHO?
@gerald_quintana
2
ITINERARY
3 . 1
TIME-STAMPED DATA
Logs
Metrics
Events
(2017-04-20 18:17:16, Data)
3 . 2
COLLECT
Applications
Agents
3 . 3
STORE
Databases
File system
Cloud
3 . 4
TO START WITH
Collect → Store
3 . 5
TRANSFORM, FILTER, ENRICH
Log processing
Stream processing
3 . 6
TO CONTINUE
Collect → Transform → Store
3 . 7
TRANSPORT, BUFFER
3 . 8
TRANSPORT, BUFFER
Alerting, SIEM...?
3 . 9
TO FINISH
Collect → Buffer → Transform → Store
3 . 10
COLLECT
4 . 1
LOG
2017-03-20 22:42:03 [main] INFO Bonjour à tous
4 . 2
FORMAT
{
"@timestamp":"2017-03-20T22:42:03.522+01:00",
"logger":"mixit",
"level":"INFO",
"message":"Bonjour à tous",
"thread":"main",
"host":"laptop-gerald",
"user":"gerald",
"transactionid": 4567,
"talk":"log-odyssey"
}
4 . 3
EMITTING
File vs TCP/UDP
4 . 4
APPLICATIONS & JSON
...
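With a Logback-based Java application, JSON output matching the format above is commonly produced with the logstash-logback-encoder library. A minimal logback.xml sketch; the appender name and file path are illustrative:

```xml
<configuration>
  <appender name="JSON" class="ch.qos.logback.core.FileAppender">
    <file>/var/log/log-odyssey/application.log</file>
    <!-- writes each event as one JSON line -->
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <root level="INFO">
    <appender-ref ref="JSON"/>
  </root>
</configuration>
```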
4 . 5
APPLICATIONS & KAFKA
... →
4 . 6
BEATS & JSON
filebeat.prospectors:
- input_type: log
  document_type: logback
  paths:
    - /var/log/log-odyssey/application.*.log
  json:
    keys_under_root: true
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
4 . 7
BEATS & KAFKA
filebeat.prospectors:
- input_type: log
  ...
output.kafka:
  hosts: ["kafka:9092"]
  topic: logstash
4 . 8
DOCKER
--log-driver=json-file|syslog|gelf|fluentd|splunk...
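For example, shipping a container's logs to Logstash over GELF; the address and image name are illustrative:

```
docker run --log-driver=gelf \
    --log-opt gelf-address=udp://logstash:12201 \
    my-image
```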
4 . 9
TRANSPORT
5 . 1
KAFKA
5 . 2
PRODUCER / BROKER / CONSUMER
5 . 3
MESSAGE / RECORD
Key Value TS
5 . 4
PRODUCER
[Diagram: producer writing to broker nodes n1, n2, n3]
5 . 5
PARTITIONING
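Partitioning decides which partition, and hence which broker, receives each record: keyed records are hashed, keyless records are spread round-robin style. A simplified sketch of that logic (illustrative only: the real Java client hashes the key with murmur2; CRC32 stands in here):

```ruby
require 'zlib'

# Simplified stand-in for Kafka's default partitioner.
# key: String or nil; counter: a per-producer message counter.
def pick_partition(key, num_partitions, counter)
  # Keyless records rotate across partitions
  return counter % num_partitions if key.nil?
  # Keyed records always land on the same partition
  Zlib.crc32(key) % num_partitions
end
```

Because the hash is deterministic, all records sharing a key preserve their relative order within one partition.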
5 . 6
BEATS & KAFKA
filebeat.prospectors:
- input_type: log
  ...
output.kafka:
  hosts: ["kafka:9092"]
  topic: logstash
  partition.round_robin:
    reachable_only: false
5 . 7
CONSUMER
[Diagram: consumers reading from broker nodes n1, n2, n3]
5 . 8
COMMITFAILEDEXCEPTION
CommitFailedException: Commit cannot be completed since the group has already
rebalanced and assigned the partitions to another member.
5 . 9
CONSUMER REBALANCING
<[... 14:27:40,752] ...: Preparing to restabilize group logstash with old generation 0>
<[... 14:27:40,753] ...: Stabilized group logstash generation 1>
<[... 14:27:40,773] ...: Assignment received from leader for group logstash for generation
<[... 14:27:48,243] ...: Preparing to restabilize group logstash with old generation 1>
<[... 14:27:49,837] ...: Stabilized group logstash generation 2>
<[... 14:27:49,845] ...: Assignment received from leader for group logstash for generation
<[... 14:27:54,969] ...: Preparing to restabilize group logstash with old generation 2>
<[... 14:27:56,621] ...: Stabilized group logstash generation 3>
5 . 10
CONSUMER
[Diagram: consumer on node n3 — subscribe, poll, commit loop]
session.timeout.ms
max.partition.fetch.size, max.poll.records
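When Logstash is the consumer, these settings map onto options of its kafka input plugin. A sketch with illustrative values:

```
input {
  kafka {
    bootstrap_servers => "kafka:9092"
    topics => ["logstash"]
    session_timeout_ms => "30000"
    max_poll_records => "500"
  }
}
```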
5 . 11
PROTOCOL
EOFException: null at o.a.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:
SchemaException: Error reading field 'throttle_time_ms': java.nio.BufferUnderflowException
InvalidRequestException: Error getting request for apiKey: 3 and apiVersion: 2
5 . 12
PROTOCOL
Client version ≤ Server version
http://kafka.apache.org/protocol.html
5 . 13
FILTER, TRANSFORM
6 . 1
PIPELINE LOGSTASH
input → filter → output
6 . 2
CONFIGURATION
input {
  kafka {
    bootstrap_servers => "kafka:9092"
    codec => json
    topics => ["logstash"]
  }
}
filter {
  if [type] == "jetty" {
    grok {
      match => { "message" =>
        "%{COMBINEDAPACHELOG} (?:%{NUMBER:latency:int}|-)" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
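A pipeline like this typically ends with an Elasticsearch output; a minimal sketch (host and index pattern are illustrative):

```
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```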
6 . 3
KAFKA INPUT/OUTPUT
Kafka    Logstash    Plugin
0.8      2.0 - 2.x   <3.0
0.9      2.0 - 2.3   3.x
0.9      2.4 - 5.x   4.x
0.10.0   2.4 - 5.x   5.x
0.10.1   2.4 - 5.x   6.x
6 . 4
PIPELINE LOGSTASH
[Diagram: inputs feed batchers; each pipeline worker runs the filters then the output]
pipeline.workers (-w)
pipeline.batch.size (-b)
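Both knobs can also be set in logstash.yml instead of on the command line; the values here are illustrative:

```
pipeline.workers: 4
pipeline.batch.size: 250
```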
6 . 5
MONITORER
6 . 6
MONITORER
filter {
  ruby {
    init => "require 'time'"
    code => "start_time=Time.now.to_f*1000.0;
             event.set('[@metadata][start_time]', start_time);"
  }
  # Filtering
  #....
  ruby {
    init => "require 'time'"
    code => "end_time=Time.now.to_f*1000.0;
             start_time=event.get('[@metadata][start_time]');
             event.set('[logstash_duration]', end_time - start_time)"
  }
}
metrics
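Logstash's metrics filter can then aggregate such a duration field into periodic summary events; a sketch (field name and interval are illustrative):

```
filter {
  metrics {
    timer => { "logstash_duration" => "%{logstash_duration}" }
    flush_interval => 10
    add_tag => "metric"
  }
}
```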
6 . 7
MONITORER
6 . 8
PIPELINE ELASTICSEARCH
PUT _ingest/pipeline/jetty
{ "description": "Jetty Access Logs",
  "processors": [
    { "grok": {
        "field": "message",
        "patterns": [
          "%{COMBINEDAPACHELOG} (?:%{NUMBER:latency:int}|-)" ] } },
    { "date": {
        "field": "timestamp",
        "formats": [
          "dd/MMM/yyyy:HH:mm:ss Z" ] } },
    { "date_index_name": {
        "field": "@timestamp",
        "index_name_prefix": "logs-",
        "date_rounding": "d" } }
  ]
}
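Documents are then indexed through this pipeline by naming it on the request; a sketch (index and type names are illustrative):

```
POST logs/jetty?pipeline=jetty
{ "message": "127.0.0.1 - - [20/Mar/2017:22:42:03 +0100] \"GET / HTTP/1.1\" 200 1234 \"-\" \"curl/7.52\" 5" }
```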
6 . 9
STORE
7 . 1
ELASTICSEARCH
7 . 2
SCHEMALESS?
Normalize the fields
7 . 3
MAPPINGS
{ "jetty": {
    "properties": {
      "field": {
        "type": "text|keyword|integer|...",
        "index": false?,
        "norms": false?
      } },
    "_all": { "enabled": false }
} }
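A mapping like this is applied when creating the index (or through an index template); a sketch with an illustrative index name and field:

```
PUT logs-2017.04.20
{ "mappings": {
    "jetty": {
      "properties": {
        "latency": { "type": "integer" }
      } } } }
```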
7 . 4
ROUTING & MAPPING
[Diagram: bulk requests routed to the right shards, documents mapped on write]
7 . 5
WRITES
[Diagram: writes land in the in-memory buffer and the translog; refresh turns the buffer into a searchable segment; flush commits segments to disk and clears the translog]
index.refresh_interval 1s
index.translog.flush_threshold_size 512mb
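Both settings can be adjusted per index at runtime, for example relaxing the refresh interval during heavy indexing (index name and value are illustrative):

```
PUT logs-2017.04.20/_settings
{ "index": { "refresh_interval": "30s" } }
```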
7 . 6
MERGES
index.merge.scheduler.max_thread_count (CPU/2)
7 . 7
ARRIVAL
8 . 1
QUESTIONS
?
8 . 2
THANK YOU
@gerald_quintana
8 . 3

L'odyssée de la log