Log mining

Structured Logging and mining as a data treatment

Published in: Software

  1. Diginsight: LOG MINING. fanjiang@thoughtworks.com, https://github.com/tcz001
  2. TECH RADAR TREND: structured-logging
  3. WHAT IS A LOG?

          > tail -f /usr/local/log
          INFO [2014-11-13 12:23:36,173] com.thoughtworks.forcetalk.resources.ContactResource: Updated Contact {"FirstName":"Alper","LastName":"Mermer","Employee_ID__c":"16906","Email":"amermer@thoughtworks.com","Grade__c":"Senior Consultant"}
          ERROR [2014-11-13 11:45:33,892] com.thoughtworks.forcetalk.validators.ForceQueryResultsValidator: Unable to retrieve Project for Opportunity with id: 0065000000TE2evAAD
          INFO [2014-11-13 12:23:36,505] com.thoughtworks.tetalk.resources.UserResource: Contact Update Response SObjectResponse{successful=true, id='null', errorMessage='null', errorField='null', errorCode='null'}

      Highlighted fields: INFO | 2014-11-13 12:23:36,173 | com.thoughtworks.forcetalk.resources.ContactResource | ERROR
  4. WHAT IS A GOOD LOG? (http://juliusdavies.ca/logging/llclc.html)

      Best logs:
      ▫ Tell you exactly what happened: when, where, and how.
      ▫ Are suitable for manual, semi-automated, or automated analysis.
      ▫ Can be analysed without having the application that produced them at hand.
      ▫ Don't slow the system down.
      ▫ Can be proven reliable (if used as evidence).

      Avoid logs that:
      ▫ Are missing necessary information.
      ▫ Are unsuitable for grep because of redundant information.
      ▫ Split information across more than one line (bad for grep).
      ▫ Omit an error that was reported to the user.

      For security, never include any sensitive data.
  5. A DEVOPS STORY: > rm -rf ALL_THE_LOGS
  6. A DEVOPS STORY: We got an angry user! HELP!
  7. BE REACTIVE
  8. MONITOR IS FAR FROM “TOP”
  9. SAVE OUR LIFE: ?
  10. SAVE OUR LIFE: Splunk (SaaS) OR Logstash (open source)
  11. SAVE OUR LIFE
  12. SAVE OUR LIFE: WHAT TIME IS IT?

          1304060505
          29/Apr/2011:07:05:26 +0000
          Fri, 21 Nov 1997 09:55:06 -0600
          Oct 11 20:21:47
          020805 13:51:24
          110429.071055,118
          @4000000037c219bf2ef02e94

      DATE FILTER FIXES THIS BULLSHIT

          filter {
            date {
              # Turn 020805 13:51:24
              # Into 2002-08-05T13:51:24.000Z
              # match takes the field name, then the Joda-Time pattern
              match => [ "mysqltimestamp", "YYMMdd HH:mm:ss" ]
            }
          }
  13. SAVE OUR LIFE: > 23 INPUTS | 18 FILTERS | 40 OUTPUTS

      More than just the timestamp:
      ▫ LogLevel
      ▫ Source
      ▫ IP => GeoHash
      ▫ Browser/Platform
  14. SAVE OUR LIFE (diagram labels: All Our Services, ElasticSearch Clusters)

      Logstash-server:

          input {
            lumberjack {
              # The port to listen on
              port => 5043
              # The paths to your ssl cert and key
              ssl_certificate => "./logstash.crt"
              ssl_key => "./logstash.key"
              # Set this to whatever you want.
              type => "finance"
            }
          }
          filter {
            if [type] == "finance" {
              grok {
                match => [ "message", "%{LOGLEVEL:loglevel}\s+\[%{TIMESTAMP_IS…mp}\] (?<source>(\w|\.)+): (?<msg>(.*))" ]
                add_tag => [ "grokked" ]
              }
              date {
                match => [ "timestamp", "yyyy-MM-dd HH…" ]
              }
            }
          }
          output {
            if "_grokparsefailure" not in [tags] {
              stdout { codec => rubydebug }
              elasticsearch { host => localhost }
            }
          }

      Logstash-forwarder:

          {
            "network": {
              "servers": [ "localhost:5043" ],
              "ssl ca": "./logstash-forwarder.crt"
            },
            "files": [
              {
                "paths": [ "/usr/local/finance/**/logs/*.log" ],
                "dead time": "8760h",
                "fields": { "type": "finance" }
              }
            ]
          }
  15. ELASTICSEARCH (ElasticSearch + Kibana)
      ▫ RESTful API search engine
      ▫ Multi-cluster support
      ▫ Great community
      ▫ Use it! Throw things into it!
  16. DIGGING DEEPER. Try it!

          curl -XGET 'http://localhost:9200/logstash-*/_search?pretty&search_type=count' -d '{
            "aggregations": {
              "source-aggregation": {
                "terms": { "field": "source", "size": 1000 }
              }
            }
          }'
  17. DIGGING DEEPER: http://localhost:8000/ Zoomable treemap for digging into logs by source, built with the Elasticsearch aggregation API
  18. LEARN FROM LOG: Treat logs as statistical data
  19. AUTO REACTIVE: Be responsive to every exception
  20. OTHER POSSIBILITY
  21. Q&A. Thanks~
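
Slide 12 lists several timestamp formats that all have to end up as one canonical value. As a rough illustration of what the date filter does there, here is a minimal Python sketch, not Logstash itself. The `FORMATS` table and the `normalize` function are hypothetical names, only three of the slide's formats plus epoch seconds are covered, and timestamps without an offset are assumed to be UTC.

```python
from datetime import datetime, timezone

# Hypothetical format table: strptime patterns for three samples from slide 12.
FORMATS = [
    "%d/%b/%Y:%H:%M:%S %z",      # 29/Apr/2011:07:05:26 +0000
    "%a, %d %b %Y %H:%M:%S %z",  # Fri, 21 Nov 1997 09:55:06 -0600
    "%y%m%d %H:%M:%S",           # 020805 13:51:24
]


def normalize(raw: str) -> str:
    """Return an ISO 8601 UTC timestamp for a raw log timestamp string."""
    if raw.isdigit():
        # Treat an all-digit value as Unix epoch seconds.
        return datetime.fromtimestamp(int(raw), tz=timezone.utc).isoformat()
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # try the next known format
        if parsed.tzinfo is None:
            # Assumption: timestamps without an offset are already in UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unrecognized timestamp: {raw!r}")


print(normalize("020805 13:51:24"))
```

Trying formats in order and failing loudly on an unknown one keeps bad timestamps visible instead of silently indexing them under the wrong time.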
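
The grok filter on slide 14 splits each line into `loglevel`, `timestamp`, `source`, and `msg`. The slide's pattern is truncated, so the regular expression below is not a reconstruction of it. It is an assumed equivalent written against the sample lines on slide 3, and `LINE_RE` and `parse_line` are hypothetical names.

```python
import re

# Assumed layout, taken from the sample lines on slide 3:
#   LEVEL [yyyy-MM-dd HH:mm:ss,SSS] dotted.source.Name: message
LINE_RE = re.compile(
    r"(?P<loglevel>[A-Z]+)\s+"
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]\s+"
    r"(?P<source>[\w.]+):\s+"
    r"(?P<msg>.*)"
)


def parse_line(line: str):
    """Return a dict of named fields, or None when the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


sample = (
    "ERROR [2014-11-13 11:45:33,892] "
    "com.thoughtworks.forcetalk.validators.ForceQueryResultsValidator: "
    "Unable to retrieve Project for Opportunity with id: 0065000000TE2evAAD"
)
print(parse_line(sample))
```

Returning `None` for a non-matching line plays the same role as the `_grokparsefailure` tag in the slide's output section: the caller can decide to skip or quarantine such lines.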
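
Slides 16 and 17 go from a terms aggregation on `source` to a zoomable treemap. The slides do not show how the hierarchy is built, so the following is only one plausible sketch: it assumes the buckets have the `key` and `doc_count` fields that Elasticsearch terms aggregations return, and that nesting follows the dotted segments of each source name. `build_tree` is a hypothetical helper, and the counts in the example are made up for illustration.

```python
def build_tree(buckets):
    """Nest terms-aggregation buckets by the dotted segments of each key."""
    root = {"name": "root", "count": 0, "children": {}}
    for bucket in buckets:
        root["count"] += bucket["doc_count"]
        node = root
        for segment in bucket["key"].split("."):
            # Create the child node on first sight, then add this bucket's count.
            child = node["children"].setdefault(
                segment, {"name": segment, "count": 0, "children": {}}
            )
            child["count"] += bucket["doc_count"]
            node = child
    return root


# Illustrative buckets; the doc_count values are invented for this example.
buckets = [
    {"key": "com.thoughtworks.forcetalk.resources.ContactResource", "doc_count": 3},
    {"key": "com.thoughtworks.forcetalk.validators.ForceQueryResultsValidator", "doc_count": 2},
]
tree = build_tree(buckets)
print(tree["children"]["com"]["count"])
```

Each node carries the sum of its descendants, which is the quantity a treemap needs to size its rectangles at every zoom level.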
