Jeremy Cohoe presented on using the ELK (Elasticsearch, Logstash, Kibana) stack for log analysis. He began with an overview of what ELK is and its components: Logstash parses logs, Elasticsearch is the database, and Kibana provides the GUI. Cohoe then demonstrated using ELK to monitor 802.11 client probe requests and to analyze FLEX pager signals decoded with a software-defined radio. Finally, he discussed implementing ELK in production for a Linux central syslog system, including scaling out with Redis, common plugins, and cluster monitoring tools.
2. Intro and agenda:
1. What is ELK?
- Elasticsearch - Database
- Logstash - Log Parser
- Kibana - GUI
2. Using ELK for fun and profit ...demo
- 802.11 client probe monitoring
- with Software Defined Radio
3. Using ELK in Production ...demo
- Linux central syslog, scaling out
- Plugins: head, HQ, marvel
End
3. About me…
Sysadmin, wireless & amateur radio…
Who is this talk for?
- If you look at logs
- If you have logs and you don’t look at them
Familiar with ELK? Who here uses ELK?
Introduction
4. Three open-source projects that have merged into the ELK stack
Commercial support available from Elasticsearch
“Elastic provides a growing platform of open source projects and commercial products designed to search, analyze, and visualize your data, allowing you to get actionable insight in real time” - Elasticsearch.com
Logstash - Log Parser
Elasticsearch - Database
Kibana - GUI (html5)
QuickELK
1. What is ELK?
8. Logstash Filters
Grok - Parser
“Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable”
Mutate - lowercase, merge, replace, split, strip
Drop, Clone
GeoIp
grok debugger
(the secret sauce for success)
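To illustrate what the Grok and Mutate filters do conceptually, here is a minimal Python sketch. It is not Logstash configuration: it uses a named-group regular expression as a stand-in for a grok pattern, and the sample log line, field names, and `parse_line` helper are all hypothetical.

```python
import re

# Stand-in for a grok pattern: named groups become structured fields.
# The syslog-style layout below is an assumption made for illustration.
SYSLOG_PATTERN = re.compile(
    r"(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[\w\-/]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Mutate-style cleanup: lowercase the host and strip the message.
    event["host"] = event["host"].lower()
    event["message"] = event["message"].strip()
    return event


# Hypothetical sample line, not taken from the talk.
sample = "Mar  3 14:07:01 WEB01 sshd[2211]: Accepted publickey for alice "
event = parse_line(sample)
print(event)
```

Once a line is split into fields like `host` and `program`, each field can be searched and charted on its own, which is the point of parsing unstructured logs before they reach Elasticsearch.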
12. Elasticsearch
Automatic clustering and replication
Rolling upgrades
Types of nodes: Master, Data, Client
“Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License.” - Wikipedia: Elasticsearch
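As a rough illustration of the "RESTful web interface and JSON documents" idea, the sketch below builds the URL and JSON body for a search against a daily Logstash index. It only constructs the request and sends nothing. The host name, query text, and helper names are hypothetical; the index name follows Logstash's default `logstash-YYYY.MM.DD` pattern and port 9200 is the Elasticsearch default.

```python
import json
from datetime import date


def daily_index(day, prefix="logstash-"):
    """Name of the daily index, e.g. logstash-2015.03.03."""
    return prefix + day.strftime("%Y.%m.%d")


def build_search(host, day, query_text, size=10):
    """Return (url, json_body) for a search request; nothing is sent."""
    url = "http://%s:9200/%s/_search" % (host, daily_index(day))
    body = {
        "query": {"query_string": {"query": query_text}},
        "size": size,
    }
    return url, json.dumps(body, sort_keys=True)


# Hypothetical host and query, chosen only for illustration.
url, body = build_search("localhost", date(2015, 3, 3), "program:sshd AND message:Accepted")
print(url)
print(body)
```

Kibana issues requests of broadly this shape on the user's behalf, so the same body could be posted with any HTTP client to inspect the raw results.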
14. Kibana 3
Easy to install
Download tarball, unzip, edit config.js
Limited security - Must use custom solutions
nginx as a reverse proxy
mod_auth_ldap
iptables
Dashboard Setup
Once Logstash and Elasticsearch are configured, most time will be spent in Kibana
Dashboard complexity depends on number of fields/variables in your data
15. Kibana 4
Released Feb 2015
Built-in web server on port 5601 using the JRE
Connects to the Elasticsearch cluster as a client
SSL, native LDAP, and role-based access (with the Shield plugin, $$$)
Demo and screenshots are from Kibana3 :(
17. 2. Using ELK for fun and profit
802.11 client probe monitoring
with Software Defined Radio
18. 2. Using ELK for fun and profit
802.11 client probe monitoring
Analyzing client probe requests
Tshark and an Alfa wireless card on a Raspberry Pi to monitor the 802.11 RF airspace for client probe requests
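The slides do not show the capture command or the Logstash input used, so the following is only a sketch under stated assumptions: tshark is assumed to print one probe request per line as two tab-separated columns (client MAC address, then SSID), and the `probe_to_event` helper and sample values are hypothetical. It turns each line into a JSON event that a log pipeline could ingest.

```python
import json


def probe_to_event(line):
    """Convert one 'mac<TAB>ssid' line into a dict, or None if malformed."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 2 or not parts[0]:
        return None
    mac, ssid = parts
    return {
        "client_mac": mac.lower(),
        # An empty SSID means the client sent a broadcast (wildcard) probe.
        "ssid": ssid if ssid else None,
        "broadcast": ssid == "",
    }


# Hypothetical captured lines, not data from the talk.
lines = ["00:11:22:AA:BB:CC\tCoffeeShop\n", "00:11:22:AA:BB:CC\t\n", "garbage\n"]
events = [e for e in (probe_to_event(l) for l in lines) if e is not None]
for e in events:
    print(json.dumps(e, sort_keys=True))
```

Emitting one JSON object per line keeps the parsing on the Logstash side trivial, and fields such as `client_mac` and `ssid` become directly usable in Kibana dashboards.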
20. 2. Using ELK for fun and profit
with Software Defined Radio
Using the SDR + Raspberry Pi to decode FLEX pager signals
21. Use GNU Radio and rtl_flex from GitHub to decode signals
https://github.com/zarya/sdr/tree/master/receivers/flex
Setup: Install GNU Radio
Download the rtl_flex Python scripts from GitHub
Start it up:
Decoding FLEX Signals
24. 3. Using ELK in Production
Linux Central Syslog
Scaling with Redis and Elasticsearch
Plugins are easy to install:
elasticsearch/bin/plugin --install mobz/elasticsearch-head
Plugins: head, HQ, marvel
Tools: Curator
Stats: Log retention, events per second
25. Scaling Elasticsearch
Implement Redis as a log broker
Ability to perform rolling restarts and upgrades without data loss or interruption to search capabilities
Split database functions into dedicated VMs
- Master: Keeps track of data and cluster management tasks, shard routing
- Data: Does the heavy lifting, searching, indexing
- Client: Load balances requests from Kibana, custom scripts and clients
Cluster resource monitoring is important!!!
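The value of a broker is that log shippers can keep sending while the indexing side is restarted. The sketch below illustrates that buffering behavior only: a Python `deque` stands in for the Redis list, and the `Broker` class, its method names, and the batch size are hypothetical rather than anything shown in the talk.

```python
from collections import deque


class Broker:
    """In-memory stand-in for a Redis list used as a log queue."""

    def __init__(self):
        self._queue = deque()

    def push(self, event):
        """Shipper side: append an event to the tail of the queue."""
        self._queue.append(event)

    def pop_batch(self, max_events):
        """Indexer side: remove up to max_events from the head, in order."""
        batch = []
        while self._queue and len(batch) < max_events:
            batch.append(self._queue.popleft())
        return batch

    def __len__(self):
        return len(self._queue)


broker = Broker()
for i in range(5):
    broker.push({"seq": i})

# While the indexer is down (e.g. a rolling restart), events accumulate.
backlog = len(broker)

# When the indexer returns, it drains the backlog in batches, in order.
indexed = []
while len(broker):
    indexed.extend(broker.pop_batch(2))
print(backlog, [e["seq"] for e in indexed])
```

In a real deployment the queue lives in Redis, so it survives restarts of either side as long as Redis itself has enough memory for the backlog, which is one reason resource monitoring matters.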
29. Tools and Stats
“Curator: Tending your time-series indices in Elasticsearch”
Central Syslog Stats: about 100 million events per day, 1500 events per second average, 256 GB RAM and 16 TB disk distributed across 8 VMs (32 GB RAM and 2 TB disk each). Events are retained for 7, 30, 90, or 365 days.
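Retention of time-series indices comes down to comparing the date in each index name against a cutoff. The sketch below shows that selection logic in plain Python; it is not Curator's API. The `expired_indices` helper and the sample index names are hypothetical, and the `logstash-YYYY.MM.DD` naming is assumed from Logstash's default.

```python
from datetime import date, datetime


def expired_indices(index_names, today, keep_days, prefix="logstash-"):
    """Return the daily indices whose date is more than keep_days old."""
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not a daily log index, leave it alone
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, skip it
        if (today - day).days > keep_days:
            expired.append(name)
    return sorted(expired)


# Hypothetical index names and a 7-day retention policy.
names = ["logstash-2015.02.20", "logstash-2015.02.25", "logstash-2015.03.03", "kibana-int"]
print(expired_indices(names, date(2015, 3, 3), keep_days=7))
```

Running the same selection with different prefixes and `keep_days` values (7, 30, 90, 365) would give one retention tier per class of log.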