Copyright © 2014 Splunk Inc. 
Sustainable Logging: SUCCEEDING WITH SPLUNK
2 
Paul Gilowey 
Foundation Technology Specialist 
paul.gilowey@santam.co.za 
@paulcgt 
Sustainable Logging: 
SUCCEEDING W...
3 
www.dan-dare.org
4 
My technology background
5 
The evolution that led to Splunk
6 
In the beginning there was ONE. 
depotwallpaper.com
7 
Then things got really complex.
8
9
10 
In 2012, a new project
11 
A big decision 
It’s time to say goodbye…
12 
Highly distributed and integrated
13 
A brand new world 
Claims 
Finance 
Docs 
B2B 
Portal 
Legacy 
Reverse 
Proxies 
Load-balancers 
IDM 
Integration 
ESM...
14 
James Wheeler 
souvenirpixels.com 
Too many logs to monitor
15 
capetownstockphotos.com 
So little time to trace problems
16 
Not only in production 
https://www.flickr.com/photos/wsdot/
17 
On a tight timeline
18 
https://www.flickr.com/photos/usnavy/ 
December 2013 Production and Non-Production 20GB
19 
Now what? 
So we’re collecting log events.
20 
Developers like doing things the old way
21 
tail -f ./catalina.out
22 
We like this. It’s comforting.
23 
Effecting change
24 
CTO’s Office 
Splunk users (dev, ops, etc.) 
Choosing your champion
25 
•have influence across departments 
•act as product owner 
•be fanatical 
•be hands-on 
•have a development background...
26 
Tips to help your champion
27 
Help developers troubleshoot (even in dev) 
Ed Yourdon https://www.flickr.com/photos/yourdon/
28 
Change how developers think about log events
29 
Police lazy logging 
[INFO ] Got here 
[INFO ] finished loop 420 
[INFO ] JDE… 
[INFO ] >>>>>>>>AAAAAAAA 
[INFO ] BBBB...
30 
Ops might as well be blindfolded. 
https://www.flickr.com/photos/foxtongue
31 
Do you really want to be called at 2am?
32 
Demonstrate thoughtful logging 
[DEBUG] TxId=328, Counting invoice line items… 
[INFO ] TxId=328, Invoice LineItemsTot...
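A minimal sketch of how events like the ones on this slide could be produced, so that every line of one transaction carries the same `TxId`. Python's standard `logging` module stands in here for whatever framework produced the slide's events; `make_logger` and `count_line_items` are hypothetical names used only for illustration.

```python
import io
import logging


def make_logger(stream):
    """Build a logger that writes '[LEVEL] TxId=<id>, <message>' lines to stream."""
    logger = logging.getLogger("invoice_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    # %(tx_id)s is supplied by the LoggerAdapter below via the record's extra fields.
    handler.setFormatter(
        logging.Formatter("[%(levelname)-5s] TxId=%(tx_id)s, %(message)s")
    )
    logger.addHandler(handler)
    return logger


def count_line_items(logger, tx_id, items):
    """Count invoice line items, tagging every event with the transaction ID."""
    log = logging.LoggerAdapter(logger, {"tx_id": tx_id})
    log.debug("Counting invoice line items...")
    total = len(items)
    # Name the value being reported instead of logging a bare number.
    log.info("Invoice LineItemsTotal=%d", total)
    return total


if __name__ == "__main__":
    out = io.StringIO()
    count_line_items(make_logger(out), 328, ["Motor Vehicle", "Household"])
    print(out.getvalue(), end="")
```

Binding the transaction ID once through the adapter means no individual log call can forget it, which is what makes a search on `TxId=328` return the whole story of one request.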
33 
Show the benefit of structured log events 
[INFO] Purchase complete - total=42 currency=ZAR language=en_ZA priority=13 ...
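To illustrate why `key=value` events pay off, the sketch below writes an event in that shape and then aggregates a few of them the way the search in the full transcript does (`"Purchase complete" priority<4 | stats sum(total) as currencyTotal by currency`). The helper names `format_event`, `parse_fields` and `total_by_currency` are hypothetical, and the parsing is a rough stand-in for field extraction, not a description of how Splunk implements it.

```python
import re
from collections import defaultdict

# key=value pairs; a value is either double-quoted or a run of non-space characters.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def format_event(message, **fields):
    """Render 'message - key=value ...', quoting values that contain spaces.

    Embedded double quotes are not handled in this sketch.
    """
    parts = []
    for key, value in fields.items():
        text = str(value)
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{key}={text}")
    return f"{message} - {' '.join(parts)}" if parts else message


def parse_fields(line):
    """Extract key=value pairs from one log line into a dict of strings."""
    return {key: value.strip('"') for key, value in FIELD.findall(line)}


def total_by_currency(lines, max_priority=4):
    """Sum 'total' per 'currency' for 'Purchase complete' events with priority < max_priority."""
    totals = defaultdict(float)
    for line in lines:
        if "Purchase complete" not in line:
            continue
        fields = parse_fields(line)
        if int(fields["priority"]) < max_priority:
            totals[fields["currency"]] += float(fields["total"])
    return dict(totals)


if __name__ == "__main__":
    events = [
        "[INFO] " + format_event("Purchase complete", total=42, currency="ZAR", priority=3),
        "[INFO] " + format_event("Purchase complete", total=8, currency="ZAR", priority=1),
        "[INFO] " + format_event("Purchase complete", total=5, currency="USD", priority=13),
    ]
    print(total_by_currency(events))
```

Because every value has a name, the aggregation needs no knowledge of the message wording beyond the phrase used to select the events.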
34 
11 Sep 2014 15:05:27,960 [Thread-428] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communi...
35 
… into this.
36 
Formalise stacktrace logging policy 
Function call -> 
Function call -> 
Function call -> 
Function call 
<- Log stack...
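The slide does not spell out which policy to adopt. One possible policy, shown here purely as an assumption, is to let exceptions propagate through the inner calls and log the stack trace exactly once at the outermost boundary, instead of once per returning call. The function names are hypothetical.

```python
import io
import logging


def build_logger(stream):
    """Build a logger that writes '[LEVEL] message' lines (plus tracebacks) to stream."""
    logger = logging.getLogger("stacktrace_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[%(levelname)-5s] %(message)s"))
    logger.addHandler(handler)
    return logger


def load_policy(policy_id):
    """Innermost call: fails, but does not log."""
    raise ValueError(f"policy {policy_id} not found")


def price_policy(policy_id):
    """Middle call: no try/except, so the exception simply propagates."""
    return load_policy(policy_id)


def handle_request(logger, tx_id, policy_id):
    """Outermost boundary: the single place where the stack trace is logged."""
    try:
        price_policy(policy_id)
        return True
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("TxId=%s, Pricing failed", tx_id)
        return False


if __name__ == "__main__":
    out = io.StringIO()
    handle_request(build_logger(out), 328, 42)
    print(out.getvalue(), end="")
```

With this arrangement one failure yields one traceback in the index, rather than a copy for every frame the exception passes through.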
37 
Avoid filtering events. 
[DEBUG] TxId=328, Real important debug statement. 
[INFO ] TxId=328, This would have been use...
38 
Avoid filtering events. 
[WARN ] TxId=328, Real important debug statement. 
[WARN ] TxId=328, This would have been use...
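One reading of these two slides (an interpretation, not something the slides state outright) is that when lower levels are filtered out, useful events vanish, and developers respond by promoting everything to WARN. The sketch below only demonstrates the first half: what a level threshold does to a mixed stream. `emit` is a hypothetical helper, and Python's `logging` module has no TRACE level, so that event is omitted.

```python
import io
import logging


def emit(threshold, events):
    """Send (level, message) pairs through a logger with the given threshold.

    Returns the lines that survived the filter.
    """
    stream = io.StringIO()
    logger = logging.getLogger("filter_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(threshold)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[%(levelname)-5s] %(message)s"))
    logger.addHandler(handler)
    for level, message in events:
        logger.log(level, message)
    return stream.getvalue().splitlines()


EVENTS = [
    (logging.DEBUG, "TxId=328, Real important debug statement."),
    (logging.INFO, "TxId=328, This would have been useful to see..."),
    (logging.WARNING, "TxId=328, Why am I logging at all?"),
]

if __name__ == "__main__":
    print(len(emit(logging.DEBUG, EVENTS)), "events kept with no filtering")
    print(len(emit(logging.WARNING, EVENTS)), "event kept when filtering below WARNING")
```

Keeping the threshold at the lowest level and selecting by level at search time preserves both the events and the meaning of their severity.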
39 
tail -f ./catalina.out
40 
Why developer buy-in matters
41 
“A fool with a tool is still a fool.” Grady Booch
42 
•Laughable deadlines 
•Long days, longer nights 
•Management pressure
43 
If we log excessively…
44 
Bob B. Brown - https://www.flickr.com/photos/beleaveme
45 
tail -f ./catalina.out
46 
Nope, no fires today, folks. 
Robert du Bois https://www.flickr.com/photos/lordisgood
47 
No value, no money. 
Neubie - https://www.flickr.com/photos/neubie/
48 
Shelfware. 
Robert Couse-Baker https://www.flickr.com/photos/29233640@N07/
49 
8 steps to successful implementation
50 
Start small (but plan to grow big) 
Pewstruck.com - https://www.flickr.com/photos/canoodlepets/ 
1
51 
Start with a 
clean slate 
2
52 
Learn 
Implement 
Stabilise 
Spread the word 
Refine 
Take a 
smart approach 
3
53 
Dashboards are pretty, alerts are king 
Reactive becomes proactive 
Register defects (ERROR = defect) 
Filter, don’t flood...
54 
Get a feel for the pain 
Make sure filtering is working 
Police false positives 
Receive 
all alerts 
yourself 
5
55 
Mine their data yourself 
–Find what’s difficult to show 
–Build dashboards to showcase their solutions Broaden their m...
56 
“Not too hot, not too cold, just right!” 
“Meh – too sloooow…” 
“Too expensive!” 
Apply the Goldilocks Principle 
7
57 
Monitor licence usage by source or source type 
index=_internal source=*metrics.log 
group="per_sourcetype_thruput" 
|...
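For readers without a search head to hand, the sketch below mirrors the aggregation in the full search from the transcript (`stats sum(kb) as KB by series | where KB > 20000`) over a handful of sample lines. The sample lines imitate the comma-separated `key=value` layout of `metrics.log`, but their exact shape is an assumption, as are the helper names.

```python
import re
from collections import defaultdict

# key=value pairs separated by commas; values may be double-quoted.
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def kb_by_series(lines, group="per_sourcetype_thruput"):
    """Sum the 'kb' field per 'series' for lines belonging to the given group."""
    totals = defaultdict(float)
    for line in lines:
        fields = {key: value.strip('"') for key, value in PAIR.findall(line)}
        if fields.get("group") == group and "series" in fields and "kb" in fields:
            totals[fields["series"]] += float(fields["kb"])
    return dict(totals)


def over_threshold(totals, limit_kb=20000):
    """Keep only the series whose summed KB exceeds the limit."""
    return {series: kb for series, kb in totals.items() if kb > limit_kb}


# Assumed sample data, not taken from a real Splunk instance.
SAMPLE = [
    'INFO Metrics - group=per_sourcetype_thruput, series="log4j", kbps=1.0, eps=2.0, kb=15000.5, ev=10',
    'INFO Metrics - group=per_sourcetype_thruput, series="log4j", kbps=1.0, eps=2.0, kb=6000.0, ev=10',
    'INFO Metrics - group=per_sourcetype_thruput, series="access_combined", kbps=1.0, eps=2.0, kb=100.0, ev=10',
    'INFO Metrics - group=per_index_thruput, series="main", kbps=1.0, eps=2.0, kb=99999.0, ev=10',
]

if __name__ == "__main__":
    print(over_threshold(kb_by_series(SAMPLE)))
```

The threshold of 20000 KB comes from the slide; a suitable value for another installation depends on its licence size.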
58 
Wrapping up
59 
Encourage thoughtful logging 
Promote good logging practices 
Police bad behaviour 
Be intimately involved 
Adopt a he...
Thanks for listening! 
Paul Gilowey 
Foundation Technology Specialist 
paul.gilowey@santam.co.za 
@paulcgt
Sustainable Logging – SplunkLive! 2014

There are several factors that will make your Splunk implementation a success. This presentation covers why our organisation implemented Splunk for log management and the steps you can take to make your implementation successful.

  1. Copyright © 2014 Splunk Inc. Sustainable Logging: SUCCEEDING WITH SPLUNK
  2. Paul Gilowey, Foundation Technology Specialist, paul.gilowey@santam.co.za, @paulcgt. Sustainable Logging: SUCCEEDING WITH SPLUNK. Words and thoughts expressed herein are my own, and not those of Santam.
  3. www.dan-dare.org
  4. My technology background
  5. The evolution that led to Splunk
  6. In the beginning there was ONE. (depotwallpaper.com)
  7. Then things got really complex.
  8.
  9.
  10. In 2012, a new project
  11. A big decision: It’s time to say goodbye…
  12. Highly distributed and integrated
  13. A brand new world: Claims Finance Docs B2B Portal Legacy Reverse Proxies Load-balancers IDM Integration ESM Virtualisation New Policy Administration MDM
  14. Too many logs to monitor (James Wheeler, souvenirpixels.com)
  15. So little time to trace problems (capetownstockphotos.com)
  16. Not only in production (https://www.flickr.com/photos/wsdot/)
  17. On a tight timeline
  18. December 2013 Production and Non-Production 20GB (https://www.flickr.com/photos/usnavy/)
  19. So we’re collecting log events. Now what?
  20. Developers like doing things the old way
  21. tail -f ./catalina.out
  22. We like this. It’s comforting.
  23. Effecting change
  24. Choosing your champion: CTO’s Office; Splunk users (dev, ops, etc.)
  25. Your champion should… (Dave Keeshan - https://www.flickr.com/photos/spudmurphy/)
      • have influence across departments
      • act as product owner
      • be fanatical
      • be hands-on
      • have a development background
      • be an architect
  26. Tips to help your champion
  27. Help developers troubleshoot (even in dev) (Ed Yourdon, https://www.flickr.com/photos/yourdon/)
  28. Change how developers think about log events
  29. Police lazy logging
      [INFO ] Got here
      [INFO ] finished loop 420
      [INFO ] JDE…
      [INFO ] >>>>>>>>AAAAAAAA
      [INFO ] BBBBBBBBBBBBBBB
      [ERROR] It failed!!!!!!
  30. Ops might as well be blindfolded. (https://www.flickr.com/photos/foxtongue)
  31. Do you really want to be called at 2am?
  32. Demonstrate thoughtful logging
      [DEBUG] TxId=328, Counting invoice line items…
      [INFO ] TxId=328, Invoice LineItemsTotal=420
      [DEBUG] TxId=328, Calling remote service JDE…
      [TRACE] TxId=328, JDE Request: {"TxID":"328", "Items":[{"desc":"Motor Vehicle","prem":305.24},…
      [WARN ] TxId=328, Timed out while calling remote service JDE… target system may be down. Will retry in 30s.
  33. Show the benefit of structured log events
      [INFO] Purchase complete - total=42 currency=ZAR language=en_ZA priority=13
      "Purchase complete" priority<4 | stats sum(total) as currencyTotal by currency | table currency, currencyTotal
  34. Change this…
      11 Sep 2014 15:05:27,960 [Thread-428] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver - btid=77320d33-5f8c-4178-b13e-c594816463d8, cmpid=za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver, uid=System, za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver.processStatusMessage : Status [STATUS_PROCESSING_COMPLETED = 6], will act on [STATUS_FINISHED = 1], for now only GENERATE_DIGITAL_DOCUMENT.
      11 Sep 2014 15:05:36,272 [Thread-428] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communication.outboundcommunicationmanager.RunnableReceiver - btid=e76665e2-e876-455a-a087-aeb5ba97d5a8, cmpid=za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver, uid=System, za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver.processMessages : Blocking(2000) read storage until message arrives...
      11 Sep 2014 15:05:36,472 [Thread-427] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communication.outboundcommunicationmanager.RunnableReceiver - btid=e76665e2-e876-455a-a087-aeb5ba97d5a8, cmpid=za.co.santam.communication.outboundcommunicationmanager.RunnableStorageReceiver, uid=System, za.co.santam.communication.outboundcommunicationmanager.RunnableStorageReceiver.processMessages : message received.
      11 Sep 2014 15:05:36,475 [Thread-427] [TRACE] [com.tibco.amx.platform] com.tibco.governance.amxagent.msginterceptor.component.AMXGovMsgInterceptorComponent - Target URI : urn:amx:env2/stm.amx.communication.outboundcommunicationmanager/StatusReceiver_1.2.0.v2014-09-10-1604#reference(StatusReceiver_ContentManagerProxyAsync_v4_Int).
  35. … into this.
  36. Formalise stacktrace logging policy
      Function call -> Function call -> Function call -> Function call
      <- Log stacktrace <- Log stacktrace <- Log stacktrace <- Log stacktrace
  37. Avoid filtering events.
      [DEBUG] TxId=328, Real important debug statement.
      [INFO ] TxId=328, This would have been useful to see...
      [DEBUG] TxId=328, Useful when we really need it.
      [TRACE] TxId=328, Oh man, I need this event so bad.
      [DEBUG] TxId=328, Flippin’ important debug message.
      [INFO ] TxId=328, This would have been useful to see...
      [WARN ] TxId=328, Why am I logging at all?
  38. Avoid filtering events.
      [WARN ] TxId=328, Real important debug statement.
      [WARN ] TxId=328, This would have been useful to see...
      [WARN ] TxId=328, Useful when we really need it.
      [WARN ] TxId=328, Oh man, I need this event so bad.
      [WARN ] TxId=328, Flippin’ important debug message.
      [WARN ] TxId=328, Cummon, I *really* wanna see this!
      [WARN ] TxId=328, Why am I logging at all?
  39. tail -f ./catalina.out
  40. Why developer buy-in matters
  41. “A fool with a tool is still a fool.” Grady Booch
  42. • Laughable deadlines
      • Long days, longer nights
      • Management pressure
  43. If we log excessively…
  44. Bob B. Brown - https://www.flickr.com/photos/beleaveme
  45. tail -f ./catalina.out
  46. Nope, no fires today, folks. (Robert du Bois, https://www.flickr.com/photos/lordisgood)
  47. No value, no money. (Neubie - https://www.flickr.com/photos/neubie/)
  48. Shelfware. (Robert Couse-Baker, https://www.flickr.com/photos/29233640@N07/)
  49. 8 steps to successful implementation
  50. Step 1: Start small (but plan to grow big) (Pewstruck.com - https://www.flickr.com/photos/canoodlepets/)
  51. Step 2: Start with a clean slate
  52. Step 3: Take a smart approach
      • Learn
      • Implement
      • Stabilise
      • Spread the word
      • Refine
  53. Step 4: Build alerts and set policy
      • Dashboards are pretty, alerts are king
      • Reactive becomes proactive
      • Register defects (ERROR = defect)
      • Filter, don’t flood mailboxes
  54. Step 5: Receive all alerts yourself
      • Get a feel for the pain
      • Make sure filtering is working
      • Police false positives
  55. Step 6: Help managers look good
      • Mine their data yourself
        – Find what’s difficult to show
        – Build dashboards to showcase their solutions
      • Broaden their minds: complement traditional BI by using log events
  56. Step 7: Apply the Goldilocks Principle
      “Not too hot, not too cold, just right!”
      “Meh – too sloooow…”
      “Too expensive!”
  57. Step 8: Monitor licence usage by source or source type
      index=_internal source=*metrics.log group="per_sourcetype_thruput" | stats sum(kb) as KB by series | where KB > 20000
  58. Wrapping up
  59. To be successful:
      • Encourage thoughtful logging
      • Promote good logging practices
      • Police bad behaviour
      • Be intimately involved
      • Adopt a helpful attitude
      • Make sure you show value
  60. Thanks for listening! Paul Gilowey, Foundation Technology Specialist, paul.gilowey@santam.co.za, @paulcgt
