From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity NYC 2015)

10,800 views

This talk covers the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. Topics include:

- Time-based indices and index templates to efficiently slice your data
- Different node tiers to decouple reading from writing and heavy traffic from low traffic
- Tuning various Elasticsearch and OS settings to maximize throughput and search performance
- Configuring tools such as Logstash and rsyslog to maximize throughput and minimize overhead
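
As a rough illustration of the time-based indices mentioned in the list above, here is a minimal Python sketch of how daily index names let you slice data by time. The `logs` prefix and all three function names are hypothetical, chosen only for this example; the `YYYY.MM.dd` suffix mirrors the naming Logstash uses by default for its `logstash-*` indices. Nothing here talks to Elasticsearch, it only computes index names.

```python
from datetime import date, timedelta


def daily_index_name(prefix: str, day: date) -> str:
    """Return the name of the index holding one day's logs."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_for_range(prefix: str, start: date, end: date) -> list[str]:
    """List the daily indices a query for start..end (inclusive) must touch."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [daily_index_name(prefix, start + timedelta(days=i)) for i in range(days + 1)]


def expired_indices(prefix: str, existing: list[str], today: date, keep_days: int) -> list[str]:
    """Return the existing daily indices that fall outside the retention window."""
    oldest_kept = daily_index_name(prefix, today - timedelta(days=keep_days - 1))
    # Names sort chronologically because the suffix is zero-padded year.month.day.
    return sorted(
        name for name in existing
        if name.startswith(prefix + "-") and name < oldest_kept
    )


if __name__ == "__main__":
    print(indices_for_range("logs", date(2015, 10, 30), date(2015, 11, 1)))
    print(expired_indices(
        "logs",
        ["logs-2015.10.10", "logs-2015.10.13", "logs-2015.10.14"],
        today=date(2015, 10, 14),
        keep_days=2,
    ))
```

With this layout, a search over the last few days only needs to hit a handful of indices, and retention becomes a matter of dropping whole expired indices rather than deleting individual documents.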

Published in: Technology

  1. From zero to production hero: Log analysis with Elasticsearch Rafał Kuć Radu Gheorghe
  2. Who are we? Radu Rafał
  3. Our Company → Sematext HQ: NYC + Globally Distributed Team Search & Big Data Consulting Production Support for Solr & Elasticsearch Training for Solr & Elasticsearch (online and onsite) Training in NYC next week! Oct 19 & 20
  4. Our Company → Sematext
  5. Agenda Kibana Elasticsearch essentials, tuning and scaling Logstash rsyslog Logstash + rsyslog Commands & Configs: https://github.com/sematext/velocity
  6. Lucene Essentials {"verb": "GET"} document
  7. Lucene Essentials {"verb": "GET"} 1) GET document stored
  8. Lucene Essentials GET 1,3,5 PUT 2,4 {"verb": "GET"} 1) GET document stored indexed
  9. Analysis (Macintosh; Intel Mac OSX; en) ["Macintosh", "Intel", "Mac", "OSX", "en"] ["macintosh", "intel", "mac", "osx", "en"] standard tokenizer lowercase token filter
  10. Field data GET 1,2 PUT 2,3
  11. Field data GET 1,2 PUT 2,3
  12. Field data GET 1,2 PUT 2,3 1) GET 2) GET,PUT 3) PUT
  13. Field data GET 1,2 PUT 2,3 1) GET 2) GET,PUT 3) PUT expensive
  14. Field data GET 1,2 PUT 2,3 1) GET 2) GET,PUT 3) PUT expensive heap
  15. Field data GET 1,2 PUT 2,3 1) GET 2) GET,PUT 3) PUT expensive heap http://bio-img.s3.amazonaws.com/bds/formhdr-cvr-5-memory-killing-foods-v2.png
  16. DocValues GET 1,2 PUT 2,3 1) GET 2) GET,PUT 3) PUT at index time; on disk https://www.lorextechnology.com/images/products/HDD250GB/900x600/security-certified-HDD250GB-L1.png
  17. DocValues GET 1,2 PUT 2,3 1) GET 2) GET,PUT 3) PUT no uninverting! at index time; on disk https://www.lorextechnology.com/images/products/HDD250GB/900x600/security-certified-HDD250GB-L1.png OS caches instead of heap
  18. Logstash /var/log/apache.log GET /index.html grok { "verb": "GET", "path": "/index.html" } -w $numberOfWorkers workers => 2 filter output input Elasticsearch
  19. rsyslog /var/log/apache.log GET /index.html mmnormalize { "verb": "GET", "path": "/index.html" } queue.workerThreads queue.dequeueBatchSize omelasticsearch imfile input module Elasticsearch main queue (RAM+Disk) queue.type queue.size ...
  20. mmnormalize parse tree (nodes: sys, tem, log, d, -ng) => scales very well with # of rules (performance depends more on log length)
  21. rsyslog + Redis via Kafka rsyslog Apache Kafka Logstash Elasticsearch file input mmnormalize omkafka + JSON template Kafka input + JSON codec Elasticsearch output
  22. Free eBooks @ sematext.com We are hiring too http://sematext.com/about/jobs.html
  23. Thank you! Rafał Kuć @kucrafal rafal.kuc@sematext.com Radu Gheorghe @radu0gheorghe radu.gheorghe@sematext.com Sematext @sematext http://sematext.com
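
The Lucene slides (analysis, the inverted index, field data and DocValues) can be illustrated with a small toy model in Python. This is only a sketch under simplifying assumptions: plain dictionaries stand in for Lucene's on-disk structures, the regex in `analyze` merely approximates the standard tokenizer for the sample input, and every function name is hypothetical. It shows why field data is expensive (the per-document view has to be rebuilt by walking every posting list at search time and then held in memory), while the DocValues-style column is written once at index time.

```python
import re
from collections import defaultdict


def analyze(text: str) -> list[str]:
    """Rough stand-in for a standard tokenizer followed by a lowercase filter."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_inverted_index(docs: dict[int, list[str]]) -> dict[str, list[int]]:
    """Map each term to the sorted IDs of the documents that contain it."""
    postings = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def uninvert(index: dict[str, list[int]]) -> dict[int, list[str]]:
    """Rebuild the doc -> terms view by walking every posting list (field data)."""
    per_doc = defaultdict(list)
    for term, doc_ids in index.items():
        for doc_id in doc_ids:
            per_doc[doc_id].append(term)
    return {doc_id: per_doc[doc_id] for doc_id in sorted(per_doc)}


# Three documents with a (possibly multi-valued) "verb" field, as on the slides.
docs = {1: ["GET"], 2: ["GET", "PUT"], 3: ["PUT"]}

index = build_inverted_index(docs)   # term -> docs: GET 1,2 / PUT 2,3
field_data = uninvert(index)         # doc -> terms, computed at search time
# DocValues-style column: the same doc -> terms view, written at index time.
doc_values = {doc_id: sorted(terms) for doc_id, terms in docs.items()}

if __name__ == "__main__":
    print(analyze("(Macintosh; Intel Mac OSX; en)"))
    print(index)
    print(field_data == doc_values)
```

Both views end up holding the same data; the difference the slides emphasize is when the work happens and where the result lives (heap for field data, disk plus OS cache for DocValues).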
