Splunk for Real time alerting and monitoring. www.gtri.com
Splunk for Real Time monitoring.


Presentation Transcript

  • Real Time Alerting & Monitoring. Ledion Bitincka. Copyright © 2012 Splunk Inc.
  • Got Alerts? Agenda: alerting basics; modes of alerting; control knobs; managing; questions.
  • Intro: Sr. Software Architect, 1870 days @Splunk. Areas: Scheduler & Alerting, Summary Indexing, Field Extractions.
  • Alert anatomy: data (search), condition, notification (SMS, email, SNMP, script).
  • Types of alerts. By search type: historical or real time. By notification type: digest or per result.
  • Real-time search primer: Searches forward in time. Never completes (unless stopped). Constantly updating result set. Only generates a results preview. All search commands supported. (Diagram: a historical search runs up to now; an RT search runs from now onward.)
  • Per result alerting: New in 4.3. One notification per result. Per result suppression. Example: send me an email for each user who has more than 5 failed logins in a 30 minute window.
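The example condition on this slide can be illustrated outside Splunk. The following is a minimal sketch in plain Python, not Splunk's search language or API; the function name, the event tuples, and the sample users are all hypothetical. It tracks failed logins per user in a sliding 30 minute window and flags each user who exceeds 5 failures, which corresponds to one result (and so one notification) per user.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)
THRESHOLD = 5  # alert when a user has MORE than 5 failures in the window


def users_over_threshold(events):
    """events: (timestamp, user) failed-login tuples, sorted by timestamp.

    Returns the set of users who at some point had more than THRESHOLD
    failures inside a sliding WINDOW.
    """
    recent = defaultdict(deque)  # user -> timestamps still inside the window
    flagged = set()
    for ts, user in events:
        q = recent[user]
        q.append(ts)
        # Drop failures that have fallen out of the 30 minute window.
        while ts - q[0] >= WINDOW:
            q.popleft()
        if len(q) > THRESHOLD:
            flagged.add(user)
    return flagged


# Hypothetical sample data: alice fails 6 times in 6 minutes,
# bob fails 6 times spread over 50 minutes.
start = datetime(2012, 1, 1, 9, 0)
events = [(start + timedelta(minutes=i), "alice") for i in range(6)]
events += [(start + timedelta(minutes=10 * i), "bob") for i in range(6)]
events.sort()
print(users_over_threshold(events))
```

Only alice is flagged here, because bob never has more than three failures inside any 30 minute span.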
  • Scheduler: Periodically executes searches. Evaluates conditions. Executes notifications. Used by alerts, summary indexing, and dashboards.
  • Scheduled search alerts (flow diagram). Lanes: splunkd/scheduler and the search process, over time. Steps shown: search start, historical search, search done, notify splunkd, condition evaluated against results (Y/N), suppress? (Y/N), execute actions, update artifact TTL, suppression update, alert manager, done. Logging: audit.log, search.log, splunkd_access.log, scheduler.log.
  • Real-time alerts (flow diagram). Lanes: splunkd/scheduler and the search process, over time. Steps shown: RT search start, RT search, notify splunkd, condition evaluated repeatedly against results previews (ResPrev) (Y/N), suppress? (Y/N), execute actions, update artifact TTL, suppression update, alert manager, done. A results snapshot is taken. Logging: audit.log, search.log, splunkd_access.log, scheduler.log.
  • Alerting modes: Event occurrence. Periodic aggregate. Sliding aggregate.
  • Event occurrence. Search: all time, real time. Condition: always. Notification: per result. Use when: you absolutely need to know when something (fatal) happens ASAP.
  • Periodic aggregate. Search: historical. Condition: use case specific. Notification: digest or per-result. Use when: medium priority alerts that need to be evaluated over a set of results.
  • Sliding aggregate. Search: windowed real time. Condition: use case specific. Notification: digest or per result. Use when: higher priority, need to know when a sliding window matches the condition.
  • Control knobs: Scheduling. Suppression. Customization.
  • Scheduling: Condition evaluation frequency. Should match search range. Limited resources. Queues & skips.
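To see why the schedule should match the search range, here is a small illustrative sketch. The function name and return strings are hypothetical and this is not a Splunk API; it only compares how often a search runs with how far back each run looks.

```python
def coverage(period_minutes, range_minutes):
    """Compare the schedule period with the time range each run searches."""
    if range_minutes < period_minutes:
        # Some events fall between runs and are never evaluated.
        return "gap of %d minutes per run" % (period_minutes - range_minutes)
    if range_minutes > period_minutes:
        # The same events are evaluated by more than one run.
        return "overlap of %d minutes per run" % (range_minutes - period_minutes)
    return "match"


print(coverage(60, 60))  # hourly search over the last hour
print(coverage(60, 30))  # hourly search over the last 30 minutes
print(coverage(5, 60))   # every 5 minutes over the last hour
```

A gap means some events are never checked against the condition, while an overlap means the same events can trigger the alert repeatedly and cost extra search time.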
  • Suppression: Stops notification. Time based. Works for real-time & historical searches. Field based suppression: alert me for each user who has more than 5 failed logins in a 30 minute window, but not more than once an hour for each user.
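Field based suppression as described on this slide can be sketched as a per-key throttle. The class and method names below are hypothetical, and this is an illustration of the idea in plain Python, not how Splunk implements it: each value of the suppression field (here, the user) gets its own quiet period.

```python
from datetime import datetime, timedelta


class FieldSuppressor:
    """Allow at most one notification per field value per suppression period."""

    def __init__(self, period):
        self.period = period
        self._last_sent = {}  # field value -> time of the last notification

    def should_notify(self, key, now):
        last = self._last_sent.get(key)
        if last is not None and now - last < self.period:
            return False  # still inside the quiet period for this key
        self._last_sent[key] = now
        return True


suppressor = FieldSuppressor(timedelta(hours=1))
t0 = datetime(2012, 1, 1, 9, 0)
print(suppressor.should_notify("alice", t0))                          # first alert for alice
print(suppressor.should_notify("alice", t0 + timedelta(minutes=30)))  # suppressed
print(suppressor.should_notify("bob", t0 + timedelta(minutes=30)))    # bob is tracked separately
print(suppressor.should_notify("alice", t0 + timedelta(hours=1)))     # quiet period is over
```

Keying the state on the field value is what lets one noisy user be silenced without hiding alerts for everyone else.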
  • Customizing: Email fields. Scripts. Custom alert actions. To add a custom alert action: 1. Build an external search command. 2. Declare it as an alert action in alert_actions.conf. 3. Reference the action in savedsearches.conf as action.<action-name>.
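For the "Scripts" option, the script receives details of the triggered alert as positional command-line arguments. The sketch below shows one way such a script might read them. The argument positions used here (1 for the event count, 4 for the saved search name, 5 for the trigger reason, 8 for the results file path) are an assumption that should be checked against the documentation for your Splunk version, and the function and dictionary names are hypothetical.

```python
import sys

# ASSUMPTION: positional layout of scripted-alert arguments; verify it
# against the documentation for your Splunk version before relying on it.
ARG_NAMES = {
    1: "event_count",
    4: "saved_search_name",
    5: "trigger_reason",
    8: "results_file",
}


def parse_alert_args(argv):
    """Map positional arguments to names; positions not supplied become None."""
    return {
        name: (argv[pos] if pos < len(argv) else None)
        for pos, name in ARG_NAMES.items()
    }


if __name__ == "__main__":
    info = parse_alert_args(sys.argv)
    print("alert %(saved_search_name)s fired: %(trigger_reason)s" % info)
```

Keeping the position-to-name mapping in one table makes it easy to correct if the layout differs in a given release.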
  • Managing alerts: Alert manager. Scheduler dashboards. Capacity planning. Logs.
  • Alert manager: Collection of triggered alerts. See all alerts in one place.
  • Scheduler dashboards: Troubleshooting. Understanding load. Tracing load origin.
  • Capacity planning.
  • Capacity planning, basics: Alert == search. Search bandwidth is limited by the number of CPUs: limit = 4 x #CPU. The scheduler is limited to 25%. (Chart: scheduler vs. ad hoc share.)
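The arithmetic on this slide can be worked through for a few machine sizes. This is a sketch that uses only the slide's simplified figures (4 searches per CPU, 25% for the scheduler); the function and parameter names are hypothetical, and the actual limits on a given installation come from limits.conf and may differ.

```python
def search_capacity(cpu_count, searches_per_cpu=4, scheduler_percent=25):
    """Return (total concurrent searches, searches available to the scheduler)."""
    total = searches_per_cpu * cpu_count
    scheduler = total * scheduler_percent // 100  # integer share for the scheduler
    return total, scheduler


for cpus in (2, 8, 16):
    total, scheduler = search_capacity(cpus)
    print("%2d CPUs: %2d concurrent searches, %2d for the scheduler" % (cpus, total, scheduler))
```

Since every alert is a search, the scheduler figure is the number of alerts that can run at the same moment before others have to wait or are skipped.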
  • Capacity planning, options: Use the right alert mode. Schedule alerts at reasonable periods (there are 1440 minutes / day). Consider increasing the scheduler limit. Increase search bandwidth.
  • Logs & .conf: scheduler.log, savedsearches.conf, alert_actions.conf, limits.conf.
  • Alerting summary: Basics. Control knobs. Customizing. Managing.
  • Questions?
  • You might also like these sessions …
  • Expiration: Alert tracking. How long the alert is kept. Alert manager. Affects TTL.
  • Artifact TTL: Painful to understand! Base TTL: 2 x scheduled period. Alert TTL: max TTL specified by actions OR alert expiration.
  • Artifact TTL, exercise:
      #  Schedule period  Actions  Expiration  TTL       Artifacts (24 hours)
      1  Hourly           None     None        2 hours   2
      2  Hourly           Email    None        24 hours  24
      3  5 minutes        None     24 hours    24 hours  288
      4  minute           Email    12 hours    24 hours  1440
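The exercise rows can be reproduced with a short calculation. This is a sketch under stated assumptions rather than Splunk's actual logic: it reads the TTL rule as "the largest of the action TTLs and the alert expiration, falling back to 2 x the schedule period", it assumes the email action carries a 24 hour TTL (inferred from row 2), and it assumes the alert triggers on every run. All names are hypothetical.

```python
HOUR = 60              # all durations are in minutes
EMAIL_TTL = 24 * HOUR  # ASSUMPTION: inferred from row 2 of the exercise


def artifact_ttl(period, action_ttls=(), expiration=None):
    """TTL of a search artifact, following the rule on the Artifact TTL slide."""
    alert_ttls = list(action_ttls)
    if expiration is not None:
        alert_ttls.append(expiration)
    if alert_ttls:
        return max(alert_ttls)  # alert TTL: max of action TTLs and expiration
    return 2 * period           # base TTL: 2 x scheduled period


def artifact_count(period, ttl):
    """Artifacts kept at once, assuming every run leaves one artifact."""
    return ttl // period


rows = [
    (HOUR, (), None),               # 1: hourly, no actions, no expiration
    (HOUR, (EMAIL_TTL,), None),     # 2: hourly, email
    (5, (), 24 * HOUR),             # 3: every 5 minutes, expiration 24 hours
    (1, (EMAIL_TTL,), 12 * HOUR),   # 4: every minute, email, expiration 12 hours
]
for period, actions, expiration in rows:
    ttl = artifact_ttl(period, actions, expiration)
    print("TTL %4d min, artifacts %4d" % (ttl, artifact_count(period, ttl)))
```

Row 4 shows why the rule matters: shortening the expiration to 12 hours does not help while the email action still holds each artifact for 24 hours.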