Splunk for Real-Time Alerting and Monitoring. www.gtri.com

Transcript

  • 1. Copyright © 2012 Splunk Inc. Real Time Alerting & Monitoring. Ledion Bitincka
  • 2. Got Alerts? Alerting basics; Modes of alerting; Control knobs; Managing; Questions?
  • 3. Intro: Sr Software Architect; 1870 days @Splunk; Scheduler & Alerting; Summary Indexing; Field Extractions
  • 4. Alert anatomy: Data (search); Condition; Notification (SMS, Email, SNMP, Script)
  • 5. Types of alerts. By search type: historical, real time. By notification type: digest, per result.
  • 6. Real-time search primer: Search forward in time; Never complete (unless stopped); Constantly updating result set; Only generates results preview; All search commands supported. Diagram: historical search before "now", RT search after "now".
  • 7. Per result alerting: New in 4.3; One notification per result; Per result suppression. Example: Send me an email for each user who has more than 5 failed logins in a 30 minute window.
  • 8. Scheduler: Periodically executes searches; Evaluates conditions; Executes notifications. Drives: Alerts, Summary Indexing, Dashboards.
  • 9. Scheduled search alerts (flowchart across splunkd/scheduler and the search process, over time): Start historical search; Search done; Results; Condition (Y/N); Suppress? (Y/N); Execute actions; Update artifact TTL; Suppression update; Alert manager; Notify splunkd; Done. Logging: audit.log, search.log, splunkd_access.log, scheduler.log.
  • 10. Real-time alerts (flowchart across splunkd/scheduler and the search process, over time): Start RT search; Results preview evaluated against the condition (Y/N), repeatedly; Results snapshot; Suppress? (Y/N); Execute actions; Update artifact TTL; Suppression update; Alert manager; Notify splunkd; Done. Logging: audit.log, search.log, splunkd_access.log, scheduler.log.
  • 11. Alerting modes: Event occurrence; Periodic aggregate; Sliding aggregate
  • 12. Event occurrence. Search: all time, real time. Condition: always. Notification: per result. Use when: absolutely need to know when something (fatal) happens ASAP.
  • 13. Periodic aggregate. Search: historical. Condition: use case specific. Notification: digest or per-result. Use when: medium priority alerts that need to be evaluated over a set of results.
  • 14. Sliding aggregate. Search: windowed real time. Condition: use case specific. Notification: digest or per result. Use when: higher priority, need to know when a sliding window matches condition.
  • 15. Control knobs: Scheduling; Suppression; Customization
  • 16. Scheduling: Condition evaluation frequency; Should match search range; Limited resources; Queues & skips
  • 17. Suppression: Stops notification; Time based; Real-time & historical searches; Field based suppression. Example: alert me for each user who has more than 5 failed logins in a 30 minute window, but not more than once an hour for each user.
  • 18. Customizing: Email fields; Scripts; Custom alert actions
  • 19. Customizing: Email fields; Scripts; Custom alert actions
  • 20. Customizing: Email fields; Scripts; Custom alert actions
  • 21. Customizing: Email fields; Scripts; Custom alert actions. 1. Build an external search cmd. 2. Declare it as an alert action in alert_actions.conf. 3. Reference the action in savedsearches.conf as action.<action-name>.
  • 22. Managing alerts: Alert manager; Scheduler dashboards; Capacity planning; Logs
  • 23. Alert manager: Collection of triggered alerts; See all alerts in one place
  • 24. Scheduler dashboards: Troubleshooting; Understanding load; Tracing load origin
  • 25. Capacity planning
  • 26. Capacity planning, basics: Alert == search; Search bandwidth limited by #CPUs; Limit = 4 x #CPU; Scheduler limited to 25%. Diagram: scheduler vs. ad hoc share of search bandwidth.
  • 27. Capacity planning, options: Use the right alert mode; Schedule alerts at reasonable periods (there are 1440 minutes / day); Consider increasing scheduler limit; Increase search bandwidth
  • 28. Logs & .conf: scheduler.log; savedsearches.conf; alert_actions.conf; limits.conf
  • 29. Alerting Summary: Basics; Control knobs; Customizing; Managing
  • 30. Questions?
  • 31. You might also like these sessions …
  • 32. Expiration: Alert tracking; How long is the alert kept; Alert manager; Affects TTL
  • 33. Artifact TTL: Painful to understand! Base TTL: 2 x scheduled period. Alert TTL: max TTL specified by actions OR alert expiration.
  • 34. Artifact TTL, exercise. Columns: Schedule period | Actions | Expiration | TTL | Artifacts (24 hours)
  • 35. Artifact TTL, exercise. Columns: Schedule period | Actions | Expiration | TTL | Artifacts (24 hours). Row 1: Hourly | None | None | 2 hours | 2
  • 36. Artifact TTL, exercise. Columns: Schedule period | Actions | Expiration | TTL | Artifacts (24 hours). Row 1: Hourly | None | None | 2 hours | 2. Row 2: Hourly | Email | None | 24 hours | 24
  • 37. Artifact TTL, exercise. Columns: Schedule period | Actions | Expiration | TTL | Artifacts (24 hours). Row 1: Hourly | None | None | 2 hours | 2. Row 2: Hourly | Email | None | 24 hours | 24. Row 3: 5 minutes | None | 24 hours | 24 hours | 288
  • 38. Artifact TTL, exercise. Columns: Schedule period | Actions | Expiration | TTL | Artifacts (24 hours). Row 1: Hourly | None | None | 2 hours | 2. Row 2: Hourly | Email | None | 24 hours | 24. Row 3: 5 minutes | None | 24 hours | 24 hours | 288. Row 4: minute | Email | 12 hours | 24 hours | 1440
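
The artifact TTL rules on slide 33 and the exercise table on slides 34 to 38 can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not Splunk code: the function names are hypothetical, the 24 hour email action TTL is an assumption inferred from row 2 of the exercise table, and the sketch assumes the alert triggers on every scheduled run so that the alert TTL always applies when an action or expiration is set.

```python
from datetime import timedelta
from typing import Optional, Sequence

# Assumption: inferred from row 2 of the exercise table (Hourly, Email -> 24 hours).
EMAIL_ACTION_TTL = timedelta(hours=24)


def artifact_ttl(period: timedelta,
                 action_ttls: Sequence[timedelta] = (),
                 expiration: Optional[timedelta] = None) -> timedelta:
    """Return the artifact TTL for one scheduled search (hypothetical helper).

    Alert TTL: the largest TTL among the actions and the alert expiration.
    Base TTL: 2 x the scheduled period, used when neither is set.
    """
    candidates = list(action_ttls)
    if expiration is not None:
        candidates.append(expiration)
    if candidates:
        return max(candidates)
    return 2 * period


def artifacts_kept(period: timedelta, ttl: timedelta) -> int:
    """Number of artifacts alive at once: one per run that is still within its TTL."""
    return ttl // period


# The four rows of the exercise table: (schedule period, action TTLs, expiration).
rows = [
    (timedelta(hours=1), [], None),
    (timedelta(hours=1), [EMAIL_ACTION_TTL], None),
    (timedelta(minutes=5), [], timedelta(hours=24)),
    (timedelta(minutes=1), [EMAIL_ACTION_TTL], timedelta(hours=12)),
]

for period, actions, expiration in rows:
    ttl = artifact_ttl(period, actions, expiration)
    print(period, ttl, artifacts_kept(period, ttl))
```

Under these assumptions the sketch reproduces the TTL and artifact counts in the table (2, 24, 288 and 1440), which shows why a per-minute alert with an email action accumulates so many artifacts.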