Splunk for Real time alerting and monitoring. www.gtri.com

Splunk for Real Time monitoring.

Published in: Technology, Business

  1. Real Time Alerting & Monitoring. Ledion Bitincka. Copyright © 2012 Splunk Inc.
  2. Got Alerts? Alerting basics; modes of alerting; control knobs; managing; questions?
  3. Intro: Sr Software Architect, 1870 days @Splunk. Scheduler & Alerting; Summary Indexing; Field Extractions.
  4. Alert anatomy (basics): data search; condition; notification (SMS, Email, SNMP, Script).
  5. Types of alerts (basics): Search type: historical or real time. Notification type: digest or per result.
  6. Real-time search primer (basics): Search forward in time. Never complete (unless stopped). Constantly updating result set. Only generates results preview. All search commands supported. Timeline labels: historical search, now, RT search.
  7. Per result alerting (basics): New in 4.3. One notification per result. Per result suppression. Example: send me an email for each user who has more than 5 failed logins in a 30 minute window.
  8. Scheduler (basics): Periodically executes searches. Evaluates conditions. Executes notifications. Used by alerts, summary indexing, dashboards.
  9. Scheduled search alerts (basics). Flowchart between splunkd/scheduler and the search process over time: start historical search; search done?; condition met on results?; suppress?; if not suppressed, execute actions, update artifact TTL, update suppression, alert manager; notify splunkd; done. Logging: audit.log, search.log, splunkd_access.log, scheduler.log.
  10. Real-time alerts (basics). Flowchart between splunkd/scheduler and the search process over time: start RT search; condition evaluated on each results preview snapshot, repeatedly; suppress?; if not suppressed, execute actions, update artifact TTL, update suppression, alert manager; notify splunkd; done. Logging: audit.log, search.log, splunkd_access.log, scheduler.log.
  11. Alerting modes: event occurrence; periodic aggregate; sliding aggregate.
  12. Event occurrence (modes): Search: all time, real time. Condition: always. Notification: per result. Use when: you absolutely need to know when something (fatal) happens ASAP.
  13. Periodic aggregate (modes): Search: historical. Condition: use case specific. Notification: digest or per result. Use when: medium priority alerts that need to be evaluated over a set of results.
  14. Sliding aggregate (modes): Search: windowed real time. Condition: use case specific. Notification: digest or per result. Use when: higher priority, need to know when a sliding window matches the condition.
  15. Control knobs: scheduling; suppression; customization.
  16. Scheduling (knobs): Condition evaluation frequency. Should match search range. Limited resources. Queues & skips.
  17. Suppression (knobs): Stops notification. Time based. Real-time & historical searches. Field based suppression. Example: alert me for each user who has more than 5 failed logins in a 30 minute window, but not more than once an hour for each user.
  18. Customizing (knobs; slides 18-21): Email fields; scripts; custom alert actions. To add a custom alert action: 1. Build an external search cmd. 2. Declare it as an alert action in alert_actions.conf. 3. Reference the action in savedsearches.conf as action.<action-name>.
  22. Managing alerts: alert manager; scheduler dashboards; capacity planning; logs.
  23. Alert manager (manage): Collection of triggered alerts. See all alerts in one place.
  24. Scheduler dashboards (manage): Troubleshooting. Understanding load. Tracing load origin.
  25. Capacity planning (manage).
  26. Capacity planning, basics (manage): Alert == search. Search bandwidth limited by #CPUs: Limit = 4 x #CPU. Scheduler limited to 25%. Chart labels: Scheduler, Ad hoc.
  27. Capacity planning, options (manage): Use the right alert mode. Schedule alerts at reasonable periods (there are 1440 minutes / day). Consider increasing scheduler limit. Increase search bandwidth.
  28. Logs & .conf (manage): scheduler.log; savedsearches.conf; alert_actions.conf; limits.conf.
  29. Alerting summary: basics; control knobs; customizing; managing.
  30. Questions?
  31. You might also like these sessions …
  32. Expiration (knobs): Alert tracking. How long is the alert kept. Alert manager. Affects TTL.
  33. Artifact TTL (knobs): Painful to understand! Base TTL: 2 x scheduled period. Alert TTL: max TTL specified by actions OR alert expiration.
  34. Artifact TTL, exercise (knobs; built up row by row across slides 34-38):

          #  Schedule period  Actions  Expiration  TTL       Artifacts (24 hours)
          1  Hourly           None     None        2 hours   2
          2  Hourly           Email    None        24 hours  24
          3  5 minutes        None     24 hours    24 hours  288
          4  minute           Email    12 hours    24 hours  1440
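
The artifact TTL exercise on the last slides can be checked with a few lines of arithmetic. The sketch below is not Splunk code; it only encodes the two rules stated on the Artifact TTL slide (base TTL is 2 x the scheduled period; alert TTL is the largest of the action TTLs and the alert expiration). It assumes that the alert TTL replaces the base TTL whenever an action or an expiration is set, that every scheduled run triggers the alert, and that the email action carries a 24 hour TTL, which is what row 2 of the exercise table implies. The names `artifact_ttl`, `artifacts_alive` and `EMAIL_TTL` are made up for this illustration.

```python
from datetime import timedelta

# Assumption read off the exercise table (row 2): email action => 24 hour TTL.
EMAIL_TTL = timedelta(hours=24)


def artifact_ttl(period, action_ttls=(), expiration=None):
    """Return the TTL of one search artifact under the slide's two rules."""
    base_ttl = 2 * period  # Base TTL: 2 x scheduled period
    candidates = list(action_ttls)
    if expiration is not None:
        candidates.append(expiration)
    if not candidates:
        return base_ttl  # no actions, no expiration: base TTL applies
    # Alert TTL: max TTL specified by actions OR alert expiration
    return max(candidates)


def artifacts_alive(period, ttl, window=timedelta(hours=24)):
    """Artifacts on disk once the schedule has been running for `window`."""
    return min(ttl, window) // period


if __name__ == "__main__":
    rows = [
        ("Hourly, no actions", timedelta(hours=1), (), None),
        ("Hourly, email", timedelta(hours=1), (EMAIL_TTL,), None),
        ("5 minutes, expires 24h", timedelta(minutes=5), (), timedelta(hours=24)),
        ("Minute, email, expires 12h", timedelta(minutes=1), (EMAIL_TTL,), timedelta(hours=12)),
    ]
    for label, period, actions, expiration in rows:
        ttl = artifact_ttl(period, actions, expiration)
        print(label, ttl, artifacts_alive(period, ttl))
```

Under these assumptions the four rows give 2, 24, 288 and 1440 artifacts, matching the exercise table.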

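The field based suppression example on slide 17 (alert for each user, but not more than once an hour for each user) can be illustrated with a small throttle keyed on the field value. This is a conceptual sketch, not how splunkd implements suppression; the class name `FieldSuppressor` and its method are hypothetical, and timestamps are plain seconds.

```python
class FieldSuppressor:
    """Allow at most one notification per field value per suppression window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self._last_notified = {}  # field value -> time of last notification

    def should_notify(self, key, now):
        """Return True and record the time if `key` is not currently suppressed."""
        last = self._last_notified.get(key)
        if last is not None and now - last < self.window:
            return False  # still inside the window for this key
        self._last_notified[key] = now
        return True


if __name__ == "__main__":
    throttle = FieldSuppressor(window_seconds=3600)
    # (user, time in seconds) pairs for results that matched the alert condition
    for user, t in [("alice", 0), ("alice", 1800), ("bob", 1800), ("alice", 3600)]:
        print(user, t, throttle.should_notify(user, t))
```

In this sketch a suppressed result does not extend the window; whether suppression should restart on every match is a design choice the slides do not settle.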
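
The capacity planning numbers on slides 26 and 27 are simple to work through. The sketch below uses only the figures given on the slides (limit = 4 x #CPU, scheduler limited to 25%, 1440 minutes per day); the function names are hypothetical, and real deployments may compute these limits differently depending on version and configuration.

```python
MINUTES_PER_DAY = 1440  # "there are 1440 minutes / day"


def concurrent_search_limit(cpus, per_cpu=4):
    """Total concurrent searches, using the slide's rule: limit = 4 x #CPU."""
    return per_cpu * cpus


def scheduler_search_limit(cpus, scheduler_pct=25, per_cpu=4):
    """Share of the total limit available to the scheduler (25% on the slide)."""
    return concurrent_search_limit(cpus, per_cpu) * scheduler_pct // 100


def runs_per_day(period_minutes):
    """How many times an alert scheduled every `period_minutes` runs in a day."""
    return MINUTES_PER_DAY // period_minutes


if __name__ == "__main__":
    cpus = 8  # example value, not from the slides
    print("total concurrent searches:", concurrent_search_limit(cpus))
    print("scheduler share:", scheduler_search_limit(cpus))
    print("runs per day at a 5 minute period:", runs_per_day(5))
```

With 8 CPUs this gives 32 concurrent searches, 8 of them for the scheduler, which shows why scheduling many alerts every minute can quickly lead to queues and skips.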