Night Owl
Log Monitoring using Elasticsearch and Hadoop

Boyd Meier (bmeier@pros.com)
Hadoop Meetup – October 16, 2013

© ...
Problem

© COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
Application Performance Monitoring
● Many servers
● Many applications
● Many log formats
● Many places to go look for info...
Advanced Analysis
● The logs are too low-level
● The servers need the existing capacity
● The amount of data to be analyze...
Proactive Support
● See problems coming before they become crises
● Watch for errors and exceptions
● Track performance of...
Some Analysis Questions
● What errors happen, and how often?
● Who did what, when?
● How long did it take to do a task?
● ...
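As a rough illustration of the first and third questions above, here is a minimal Python sketch that counts errors by type and measures task durations from already-parsed events. The event fields (`ts`, `level`, `error`, `task`, `phase`) and the sample data are hypothetical and not taken from the talk; it only shows the shape of the analysis, not how Night Owl implements it.

```python
from collections import Counter
from datetime import datetime

# Hypothetical parsed log events; field names are assumptions for illustration.
EVENTS = [
    {"ts": "2013-09-01T10:00:00", "level": "INFO", "task": "import", "phase": "start"},
    {"ts": "2013-09-01T10:00:05", "level": "ERROR", "error": "TimeoutException"},
    {"ts": "2013-09-01T10:00:42", "level": "INFO", "task": "import", "phase": "end"},
    {"ts": "2013-09-01T10:01:00", "level": "ERROR", "error": "TimeoutException"},
    {"ts": "2013-09-01T10:02:00", "level": "ERROR", "error": "NullPointerException"},
]


def error_counts(events):
    """What errors happen, and how often?"""
    return Counter(e["error"] for e in events if e.get("level") == "ERROR")


def task_durations(events):
    """How long did a task take? Pairs each start with the next end of the same task."""
    started = {}
    durations = {}
    for e in sorted(events, key=lambda ev: ev["ts"]):
        task = e.get("task")
        if task is None:
            continue  # not a task event
        when = datetime.strptime(e["ts"], "%Y-%m-%dT%H:%M:%S")
        if e["phase"] == "start":
            started[task] = when
        elif e["phase"] == "end" and task in started:
            elapsed = (when - started.pop(task)).total_seconds()
            durations.setdefault(task, []).append(elapsed)
    return durations


if __name__ == "__main__":
    print(error_counts(EVENTS))
    print(task_durations(EVENTS))
```

Answering "who did what, when?" would follow the same pattern, grouping by a user field instead of an error type.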
Constraints
● Very little budget – as much free stuff as possible
● Can’t use client machines
● Communications need to be ...
Approach

Hadoop
● We have a lot of data (~2 GB/day with 3 clients)
● We need to process it in reasonable time
● We can’t afford a b...
Elasticsearch
● Query performance on base Hadoop is painful
● Ad-hoc queries are required
● Hadoop integration
● Cluster d...
Logstash
● Handle many sources, not just logs
● Fan-in architecture to server
● Compressed, SSL encrypted data
● Can offlo...
Kibana
● Backed by Elasticsearch
● Supports dynamic queries
● View information over time
● Built-in support for Logstash
●...
Hadoop Processing
● Pig scripts process the data
● Wonderdog from InfoChimps to integrate Pig and Elasticsearch

– There a...
Demo

Configuration Details

Software
● Ubuntu 12.04.2 LTS (Precise)
● Cloudera CDH 4.3.1

– Hadoop 2.0.0
– HBase 0.94
– Hive 0.10
– Pig 0.11
● Elastic...
Hardware Architecture
● 27 node cluster of commodity machines
● 42 TB of disk space
● Connected via 10 gigabit switch
● Ea...
Performance
● Over the month of September:

– 188 million events ingested from 3 clients
– 57.5 GB storage used (1.92 GB /...
Resources
● Elasticsearch - http://www.elasticsearch.org/overview/
• http://github.com/elasticsearch/elasticsearch

● Logs...
World Headquarters
3100 Main Street, Suite #900
Houston, TX 77002
Phone: +1 713-335-5151
Sales: +1 855-846-0641
Fax: +1 71...
Night Owl by Boyd Meier of PROS

  1. Night Owl
     Log Monitoring using Elasticsearch and Hadoop
     Boyd Meier (bmeier@pros.com)
     Hadoop Meetup – October 16, 2013
     © COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
  2. Problem
  3. Application Performance Monitoring
     ● Many servers
     ● Many applications
     ● Many log formats
     ● Many places to go look for information
     ● What if we could just look in one place and see everything?
  4. Advanced Analysis
     ● The logs are too low-level
     ● The servers need the existing capacity
     ● The amount of data to be analyzed is huge
     ● Some analysis needs to be across multiple servers
     ● What if we want to change the analysis algorithms?
     ● How can we do analysis in the most flexible way possible?
  5. Proactive Support
     ● See problems coming before they become crises
     ● Watch for errors and exceptions
     ● Track performance of the application
     ● Track usage of the application
     ● Enable checks we haven’t thought of yet
  6. Some Analysis Questions
     ● What errors happen, and how often?
     ● Who did what, when?
     ● How long did it take to do a task?
     ● What else was happening on the server?
  7. Constraints
     ● Very little budget – as much free stuff as possible
     ● Can’t use client machines
     ● Communications need to be secure
     ● Large amounts of data (GB/day/client)
     ● Minimize support’s dependence on client IT
  8. Approach
  9. Hadoop
     ● We have a lot of data (~2 GB/day with 3 clients)
     ● We need to process it in reasonable time
     ● We can’t afford a big machine for this
     ● We have lots of old machines lying around
     ● Sounds like a job for the elephant! But what about queries?
  10. Elasticsearch
      ● Query performance on base Hadoop is painful
      ● Ad-hoc queries are required
      ● Hadoop integration
      ● Cluster deployment
      ● Looks promising! How do we get the data into the server?
  11. Logstash
      ● Handle many sources, not just logs
      ● Fan-in architecture to server
      ● Compressed, SSL encrypted data
      ● Can offload some logic on the client if desired
      ● Massively configurable
      ● Output to Elasticsearch
      ● Great! Now how about visualization?
  12. Kibana
      ● Backed by Elasticsearch
      ● Supports dynamic queries
      ● View information over time
      ● Built-in support for Logstash
      ● Configurable, shareable dashboards
  14. Hadoop Processing
      ● Pig scripts process the data
      ● Wonderdog from InfoChimps to integrate Pig and Elasticsearch
        – There are issues:
          • Cluster stability using Wonderdog
          • Wonderdog Pig interface has not been updated in a while
          • Currently evaluating elasticsearch-hadoop project from Elasticsearch.org
      ● Analysis results are stored in Elasticsearch for ease of access
  15. Demo
  16. Configuration Details
  18. Software
      ● Ubuntu 12.04.2 LTS (Precise)
      ● Cloudera CDH 4.3.1
        – Hadoop 2.0.0
        – HBase 0.94
        – Hive 0.10
        – Pig 0.11
      ● Elasticsearch 0.90.3
      ● Logstash 1.1.12
      ● Kibana 3 M3
  19. Hardware Architecture
      ● 27 node cluster of commodity machines
      ● 42 TB of disk space
      ● Connected via 10 gigabit switch
      ● Each machine has:
        – 8 GB RAM
        – 2 TB SATA HDD
        – Gigabit Ethernet
  20. Performance
      ● Over the month of September:
        – 188 million events ingested from 3 clients
        – 57.5 GB storage used (1.92 GB / day)
      ● At that rate, 42 TB is enough space for:
        – 142 billion events
        – 60 years of data from these clients
        – 1 year of data from 180 clients at the same volume per client
  21. Resources
      ● Elasticsearch - http://www.elasticsearch.org/overview/
        • http://github.com/elasticsearch/elasticsearch
      ● Logstash - http://www.elasticsearch.org/overview/logstash/
        • https://github.com/logstash/logstash
      ● Kibana - http://www.elasticsearch.org/overview/kibana/
        • https://github.com/elasticsearch/kibana
      ● ES – Hadoop - http://www.elasticsearch.org/overview/hadoop/
        • http://github.com/elasticsearch/elasticsearch-hadoop
      ● Cloudera - http://www.cloudera.com/
  22. World Headquarters
      3100 Main Street, Suite #900
      Houston, TX 77002
      Phone: +1 713-335-5151
      Sales: +1 855-846-0641
      Fax: +1 713-335-8144

      PROS Germany GmbH
      Feringastrasse 6
      85774 Unterfoehring
      Munich
      Tel.: +49 89 99216 270
      Fax: +49 89 99216 200

      European Headquarters - United Kingdom
      Lakeside House
      1 Furzeground Way
      Stockley Park
      Heathrow UB11 1BD
      Phone: +44 (0) 208 622 3555
      Fax: +44 208 622 3230

      Regional Office - Austin, TX
      3600 Parmer Lane, Suite 205
      Austin, Texas 78727

      Regional Office - Cary, North Carolina
      1000 Centre Green Way, #200
      Cary, NC 27513
      Phone: +1 919-228-6334
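The capacity estimate on the Performance slide can be roughly checked from the figures given there. The sketch below is a back-of-the-envelope calculation, not the author's method: it assumes a 30-day month and 1 TB = 1000 GB (the slides do not state either convention), and it ignores replication and index overhead.

```python
# Figures quoted on the Performance and Hardware Architecture slides.
EVENTS_PER_MONTH = 188e6   # events ingested in September
GB_PER_MONTH = 57.5        # storage used in September
CLIENTS = 3
DISK_TB = 42

# Assumptions not stated in the slides.
DAYS_IN_MONTH = 30
GB_PER_TB = 1000


def capacity_estimate():
    """Extrapolate September's ingest rate to the full 42 TB of disk."""
    gb_per_day = GB_PER_MONTH / DAYS_IN_MONTH
    months_of_space = DISK_TB * GB_PER_TB / GB_PER_MONTH
    years_of_space = months_of_space / 12
    return {
        "gb_per_day": gb_per_day,
        "years_for_current_clients": years_of_space,
        "events_billion": EVENTS_PER_MONTH * months_of_space / 1e9,
        "clients_for_one_year": years_of_space * CLIENTS,
    }


if __name__ == "__main__":
    for name, value in capacity_estimate().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the daily rate, the roughly 60 years of headroom, and the roughly 180 clients for one year all line up with the slide. The event total comes out somewhat below the quoted 142 billion, which may reflect unrounded inputs or a different TB convention.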