This presentation is meant to be an introduction in the world of log monitoring using Logstash and also a showcase for using Logstash in big data setups.
You will be guided in using Logstash from scratch and end up with a complete and centralized setup that should cover all your logging needs.
The presentation covers multiple software solutions like Logstash, Kibana, Elasticsearch, HDFS, Cassandra, HBase and KairosDB.
Similar to OSMC 2014: Processing millions of logs with Logstash and integrating with Elasticsearch, Hadoop and Cassandra | Valentin Fischer-Mitoiu (20)
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
OSMC 2014: Processing millions of logs with Logstash and integrating with Elasticsearch, Hadoop and Cassandra | Valentin Fischer-Mitoiu
1. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Processing millions of logs with Logstash
and integrating with Elasticsearch, Hadoop and Cassandra
Valentin Fischer-Mitoiu, OSMC 2014
November 21, 2014
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
2. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying About me
My name is Valentin Fischer-Mitoiu and I work for the
University of Vienna.
More speci
3. caly in a group called Domainis (Internet domain
administration).
We operate the nameservers, monitoring and much more for
nic.at, the .at registry.
I'm a monitoring guy and sys admin, working daily with systems
like Logstash, Icinga, Elasticsearch, Hadoop, Cassandra.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
4. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Scope of this presentation
1. This presentation is meant as a walk-through for starting
monitoring logs using logstash.
2. Describing possible integration of logstash in "big data"
environments.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
5. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: What is it?
Logstash is a software that allows you to send, receive and
manage log
6. les from various sources.
It appeared in 2011 and it is getting a lot of interest because of its
exibility and power of con
7. guration.
Based on Ruby, works as an agent/daemon and provides a high
level of abstractization for its con
8. guration, similar to puppet.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
9. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: How we use it
Logstash: How we use it
We started by testing it and see what types of logs we can manage
with it. The initial setup was processing logs related to dns, login,
postresql, soap, epp, and system.
We started using it in production and feeding logs from various
sources.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
10. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Why we use it
Logstash: Why we use it
We use logstash because we want to use its power in managing
logs and generate events based on speci
11. c triggers.
We cover states with icinga and adding logstash would cover the
whole monitoring spectrum.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
12. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Deploying fast and simple
Deploying logstash is quite simple, the challenge is in generating
a con
13. guration.
In most cases it's recommended to start with a centralized setup.
This means having a shipper, queue, indexer and some storage
system.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
14. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Overview of a centralized setup
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
15. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Hardware requirements
Logstash: Hardware requirements
In order
16. nd out how much hardware you need for your logstash
setup,
17. rst think how much data you process daily and if you
need to search into it.
Another factor is the complexity of your con
20. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with inputs
Logstash: Working with inputs
This is how a
21. le input looks like
1 i n p u t f
f i l e f
3 codec => " p l a i n "
path => "/ dns / l o g s / p r o c e s s i n g q u e u e / . t a s k s
5 s t a r t p o s i t i o n = [ b e g i n n i n g ]
t ype = p r o c e s s i n g t a s k s
7 g
g
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
22. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with
25. lter looks like
f i l t e r f
2 grok f
add t ag = [ p r o c e s s e d t a s k s ]
4 match = [ mes sage , %fIP : c l i e n t g %fWORD: methodg
%fURIPATHPARAM: r e q u e s t g %fNUMBER: b y t e s g %fNUMBER:
d u r a t i o n g ]
t a g o n f a i l u r e = [ f a i l e d t a k s ]
6 g
g
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
26. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with outputs
Logstash: Working with outputs
This is how an output looks like
1 output f
e l a s t i c s e a r c h f
3 c l u s t e r = s t o r a g ec l u s t e r
codec = p l a i n
5 g
g
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
27. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with Elasticsearch
Logstash: Working with Elasticsearch
Elasticsearch is distributed RESTful search and analytics
engine.
This gives logstash great searching power using the lucene syntax.
It is also a powerful storage system that can make use of
indexes, facets, templates and much more.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
28. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with Elasticsearch
Logstash: Elasticsearch overview
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
29. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with Kibana
Logstash: Working with Kibana
Kibana is a web interface used to search (query) the
elasticsearch cluster.
The interface can be grouped in dashboards, each of one
containing dierent layouts.
Each layout can have: pies, maps, charts, histograms, tables,
trends, etc.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
30. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Working with Kibana
Logstash: Dashboard overview
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
31. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Deployment example
Logstash: Deployment example
You now know the basics, an actual deployment would look like
this.
1. Download a package from www.elasticsearch.org
2. Setup your shipper
3. Setup your queue
4. Setup your indexer
5. Setup your storage
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
32. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Setting up the shipper/forwarder
Logstash: Setting up your shipper/forwarder
A shipper/forwarder is an agent that usually contains some input
(port,
33. le, etc) that gets some data and sends it to an output (a
queue if using a centralized setup).
To setup your shipper you generally need an input and output
section.
You can also do preprocessing in most shipping agents.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
34. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Shipping logs
Logstash: Shipping logs
Log Courier ( lightweight, fast, secure, stable )
Logstash ( heavy, fast, stable, well developed )
Logstash-forwarder ( fast, secure, well developed )
Nxlog ( no java, lightweight, mainly used for windows logs )
Beaver ( lightweight, fast , python)
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
35. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Setting up the indexer
Logstash: Setting up your indexer
An indexer is a logstash agent that contains some logic (
36. lters)
and does some processing on messages.
To setup your indexer you generally need an input,
37. lter and
output section.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
38. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Indexer example
Logstash: An example of a working indexer
An example of a running indexer.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
39. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Setting up the queue
Logstash: Setting up your queue
In order to have a more robust and centralized setup we decided to
use a queue.
This will keep all your messages and wait for indexers to process
them.
The most used queue with logstash is redis and rabbitmq.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
40. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Setting up the storage cluster
Logstash: Setting up your storage cluster
In order to store all our logs, we will setup an elasticsearch cluster.
We will use it also because of its advanced search capabilities.
Con
41. guration is fairly simple. Edit elasticsearch.yml and make
the necessary changes.
After that, set the max RAM limit for your ES node
(ES MAX MEM).
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
42. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Setting up the Kibana instance
Logstash: Setting up your storage cluster
Kibana is a tool used to visualize your data stored in Elasticsearch.
You can do complex queries using the lucene language and graph
the data in various charts, maps, etc.
e.g MESSAGES WHERE LOG SOURCE=(shipper1,shipper2)
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
43. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Debugging and errors
Logstash: Debugging and errors
Logstash is pretty clear and descriptive when it comes to errors but
you can always troubleshoot more using debug mode (-debug).
Another useful tip is sending your logs to stdout output and see
more information if you have issues with your messages format.
To see all the
ags just use -h or {help.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
44. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Getting help
Logstash: Getting help
Usually if you need help you can enter IRC and channel #logstash
on network Freenode. You will
45. nd a large community and people
willing to help you.
I'm also present there with nickname vali.
You can also visit the ocial documentation present on
http://logstash.net
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
46. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Developing your own add-ons
Logstash: Developing your own add-ons
To develop your own addon (input,
48. nd more information at
http://logstash.net/docs/1.4.2/extending/
Basically you follow a standard structure, add your methods and
required arguments. Do some processing with the incoming event,
modify it or return a value.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
49. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Integrating with big data systems
In our case, integrating with big data systems was done via
KairosDB. This acts as a middleware and allows us to use as
storage system Hadoop or Cassandra.
In our case the sending was done via the opentsdb output.
And custom scripts sending directly to KairosDB via a TCP port.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
50. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Using Hadoop
Logstash: Using Hadoop
The target is to develop a big data system that allows use to store
events/data from various systems/sources forever.
The system has to be able to be queried using a simple interface
and its data returned in a useful format (json) and be graphed
easily.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
51. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Using Cassandra
Logstash: Using Cassandra
The same idea as using HBase + HDFS but exploring dierent
storage systems.
We can still use Hive, Pig and MapReduce queries.
More simple to setup
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
52. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Using Elasticsearch
Logstash: Using Elasticsearch
Elasticsearch has the same characteristics as other NOSQL
systems.
Some advantages over casssandra and hadoop:
- direct access from logstash, no need for any other middleware.
- great web interface to allow easy access to the data and
excellent graphing options.
- powerful API to retrieve data
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
53. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Logstash: Conclusion
In our case logstash seems to be a great candidate for covering
the log management spectrum.
Its ease of usage, great collection of plugins and easy
con
55. nitely a tool worth exploring for this kind of monitoring.
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
56. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying Questions
This would be a great time to ask questions. :)
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash
57. About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying The end
Thank you!
Valentin Fischer-Mitoiu, OSMC 2014
Processing millions of logs with Logstash