Implement Real-time Centralized logging System
by Elastic Stack
Len Chang, WeMo Scooter
1
Who am I
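● Software engineer, database architect, data scientist, HadoopCon 2015 Speaker
● Enjoys daydreaming, yoga, dancing, and all kinds of ball sports. And of course, writing code is already a part of life.
● Technologies I currently follow and know best: C#, Python, Elastic Stack, PostgreSQL, Spark
● Currently working at WeMo Scooter, a fun small startup whose goal is to promote electric scooter rental within cities, and thereby reduce exhaust emissions and the number of scooters.
● LinkedIn: https://tw.linkedin.com/in/huailunchang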
Agenda
● Elastic stack
○ Arch.
■ Open Source
■ License
● How to set up a simple Elastic Stack?
○ filebeat
○ elasticsearch
○ kibana
● Use Case: How to make the log timestamp the sorting standard
○ logstash
● Q & A
● WeMo Scooter
Elastic stack
Overview
● Elastic's open source products address a growing list of search and log analysis use cases.
● They help you take data from any source, in any format, and search, analyze, and visualize it in real time.
Architecture
[Figure: architecture diagram; labeled components: DB, Application, sys log, Web Console, RESTful API]
Open Source Hadoop Ecosystem
License
How to set up a simple Elastic Stack?
Beats
https://www.elastic.co/downloads/beats
Beats - filebeat
filebeat - core
filebeat - Index Template Setting
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-template.html
filebeat - Index Template Description
● Log fields are typed in template.json:
○ hostname: string (not analyzed)
○ msg: string (analyzed)
○ msg_count: number
○ ...
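To make the field descriptions above concrete, here is a minimal sketch that builds such a template as a Python dictionary and serializes it to JSON. It assumes the Elasticsearch 2.x mapping syntax that was current at the time of this talk (`string` with `index: not_analyzed`); newer versions use `keyword` and `text` instead. The `filebeat-*` index pattern and the `long` type for msg_count are assumptions for illustration, not values taken from the slides.

```python
import json

# Assumed Elasticsearch 2.x-style index template (illustrative only).
template = {
    "template": "filebeat-*",  # assumed index name pattern
    "mappings": {
        "_default_": {
            "properties": {
                # Stored as one exact value: good for filtering and aggregation.
                "hostname": {"type": "string", "index": "not_analyzed"},
                # Tokenized by an analyzer: good for full-text search.
                "msg": {"type": "string", "index": "analyzed"},
                # Numeric field ("long" is an assumed choice of number type).
                "msg_count": {"type": "long"},
            }
        }
    },
}

# This string is what a template.json file could contain.
template_json = json.dumps(template, indent=2)
print(template_json)
```

Keeping hostname unanalyzed means a value such as a hyphenated host name is matched as a whole rather than split into tokens.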
filebeat - Index Template Content
filebeat - Index Template Demo
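filebeat - tabs vs. spaces
● Remember to indent the config with spaces, not tabs: the filebeat config file is YAML, and YAML does not allow tabs for indentation.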
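A quick way to catch this mistake before starting filebeat is to scan the config text for tab characters in the leading whitespace. The following is a small sketch using only the Python standard library; the function name and the sample config fragment are hypothetical and only serve the illustration.

```python
from typing import List


def find_tab_indented_lines(config_text: str) -> List[int]:
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad_lines = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


# Hypothetical config fragment: the third line is indented with a tab.
sample = "filebeat:\n  prospectors:\n\t- paths:\n"
print(find_tab_indented_lines(sample))
```

For the sample above, the function reports line 3.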
Elasticsearch
Elasticsearch - Deploy by RPM
Category | Explanation | Destination
conf | Configuration files elasticsearch.yml and logging.yml. | /etc/elasticsearch
conf | Environment variables including heap size, file descriptors. | /etc/sysconfig/elasticsearch
Elasticsearch - Heap Tuning
https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
● Give (less than) Half Your Memory to Lucene
○ Lucene needs memory outside the JVM heap, because it relies on the OS file system cache.
● Don’t Cross 32 GB!
○ Compressed oops (ordinary object pointers) have an upper bound (~32 GB)
■ With compressed oops, a 32-bit pointer references four billion objects, rather than four billion bytes
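The two rules above can be combined into a small calculation. This is a sketch, not an official sizing formula: the 31 GB cap is an assumed conservative value chosen to stay below the ~32 GB boundary, and the function name is hypothetical. The last lines show where the boundary comes from, assuming the JVM's usual 8-byte object alignment.

```python
def recommended_heap_gb(total_ram_gb: float, cap_gb: float = 31.0) -> float:
    """Half of RAM for the heap, but never above the compressed-oops cap."""
    return min(total_ram_gb / 2, cap_gb)


print(recommended_heap_gb(16))   # 8.0
print(recommended_heap_gb(128))  # 31.0

# A 32-bit pointer addresses 2**32 objects; with 8-byte alignment (assumed),
# that covers 2**32 * 8 bytes, which is exactly 32 GiB.
addressable_bytes = 2**32 * 8
print(addressable_bytes // 2**30)  # 32
```

On a 128 GB machine the remaining memory is not wasted: it stays available to the OS file system cache that Lucene depends on.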
Kibana
<Use Case>
How to make the log timestamp the sorting standard
Kibana - Search
Kibana - log sample
Why can’t we use Beats to do it?
Filebeat config
Logstash
Logstash - Filter plugins
grok
● https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
● Parse arbitrary text and structure it.
● Grok is currently the best way in logstash to parse crappy unstructured log data into something structured and queryable.
date
● https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
● The date filter is used for parsing dates from fields, and then using that date or timestamp as the logstash timestamp for the event.
Logstash - Filter Config
PREFIX_TIMESTAMP ^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}
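The PREFIX_TIMESTAMP pattern above is an ordinary regular expression, so its behavior can be checked outside Logstash. The sketch below emulates the two steps in plain Python: first extract the timestamp prefix (the role grok plays), then parse it into a real date value (the role the date filter plays). This is not Logstash configuration; the sample log line and the function name are hypothetical, and the result carries no time zone.

```python
import re
from datetime import datetime
from typing import Optional

# Same expression as the PREFIX_TIMESTAMP pattern definition.
PREFIX_TIMESTAMP = re.compile(
    r"^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}"
)


def extract_event_time(line: str) -> Optional[datetime]:
    """Return the timestamp at the start of a log line, or None if absent."""
    match = PREFIX_TIMESTAMP.match(line)
    if match is None:
        return None
    # %f accepts the three millisecond digits after the comma.
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f")


# Hypothetical log line in the format the pattern expects.
sample_line = "2016-09-10 08:15:30,123 INFO hypothetical message"
event_time = extract_event_time(sample_line)
print(event_time.isoformat())
```

Once the event time comes from the log line itself rather than from the moment of ingestion, sorting by timestamp reflects the order in which events actually happened.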
Logstash - Result
Q & A
WeMo Scooter
Official Website
● http://www.wemoscooter.com/
Video Introduction
● https://www.youtube.com/watch?v=Ne1kg3KeoRs
If you want to join us as a software engineer…
● len.chang@wemoscooter.com
○ Assistant software engineer / software engineer
■ Django / Python
■ ASP.NET MVC 5 / C#
■ Others...
Thanks

Hadoop con2016 - Implement Real-time Centralized logging System by Elastic Stack