Splunk Application logging Best Practices

Best-practices PDF on application logging, presented by Clint Sharp at Splunk Conf 2012.

Published in: Technology, Education

Splunk Application logging Best Practices

  1. Application Logging Best Practices. Clint Sharp, Geek Marketeer. #datajourney. Copyright © 2012 Splunk Inc.
  2. Legal Notices
     During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
     Splunk, the engine for machine data [MODIFY THIS TO LIST THOSE SPLUNK TRADEMARKS REFERENCED IN PRESENTATION] are registered trademarks or trademarks of Splunk Inc. and/or its subsidiaries and/or affiliates in the United States and/or other jurisdictions. All other brand names, product names or trademarks belong to their respective holders. ©2012 Splunk Inc. All rights reserved.
  3. Agenda
     • Setting some context
     • Early vs. Late Binding Schema
     • Logging best practices
     • Basic Operational best practices
     • Developer best practices
  4. [Slide contains no text]
  5. Why Should You Care How to Log?
     • Isn't logging only for errors?
     • How much code is that?
     • What will it get me?
     • Why wouldn't I just use a Byte-Code Instrumentation product?
     I'll give you a hint: I'm going to answer all my own questions.
  6. Life Sucks for Developers
     • You have to debug complex distributed applications
     • You might need expensive/heavy tools in development (can't be moved to production)
     • Need many different tools for different purposes
     • Lots of code is NOT under your control, only pieces
  7. Life is Great for Developers
     • At least you have a job in this economy
     • You get paid well (!?!?!?)
     • You can dress however you like (kilts, etc)
  8. "Semantic Logging"
     • You have no control over other systems' events
     • You have full control over events that YOU write
     • Most events are written by developers to help them debug
     • Some events are written to form an audit trail
     Semantic Events are written explicitly for the gathering of analytics
  9. Late Binding Schema
     Splunk knows virtually nothing about the data as it is indexed
  10. Late Binding Schema
      Splunk applies structure at search time
      We call this "Late Binding Schema"
  11. Early vs. Late Binding Schema
      Early Structure Binding - Traditional
      SELECT customers.* FROM customers WHERE customers.customer_id NOT IN (SELECT customer_id FROM orders WHERE year(orders.order_date) = 2004)
      Structure:
      • Schema: created at design time
      • Queries: understood at design time for maximum performance
      Data:
      • Homogeneous: must fit into tables or be converted to fit into tables
      • Must exactly match constraints
  12. Early vs. Late Binding Schema
      Late Structure Binding - Splunk
      Structure:
      • Schema-less
      • Created at search time
      • Queries/searches can be ad-hoc
      Data:
      • Heterogeneous: can come from any textual source
      • Constantly changing
      • No conversion required, no constraints
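The late-binding idea can be sketched in a few lines of Python: raw text is stored untouched, and fields are pulled out only when a question is asked. This is an illustration of the concept, not a description of how Splunk works internally; the sample events and the helper names `extract_fields` and `search` are made up for this sketch.

```python
import re

# Raw events are kept as plain text; no schema is applied when they are stored.
RAW_EVENTS = [
    "2012-09-10 10:00:01 INFO action=submitPurchaseStart purchaseId=17",
    "2012-09-10 10:00:02 INFO action=submitPurchaseCompleted purchaseId=17",
    "2012-09-10 10:00:05 WARN disk usage at 91 percent on /var",
]

KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_line):
    """Apply structure at 'search time': pull key=value pairs out of raw text."""
    return dict(KV_PATTERN.findall(raw_line))


def search(events, **criteria):
    """Return the raw events whose extracted fields match every criterion."""
    matches = []
    for line in events:
        fields = extract_fields(line)
        if all(fields.get(key) == value for key, value in criteria.items()):
            matches.append(line)
    return matches


completed = search(RAW_EVENTS, action="submitPurchaseCompleted")
```

A new question (say, filtering on `purchaseId`) needs no migration of the stored data, only a different call to `search`. The third event has no key=value pairs and still sits in the same store.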
  13. Analytics
      Early Structure Binding (Days, Weeks Or Months & Destructive):
      • Decide the question(s) you want to ask
      • Design the Schema
      • Normalize the data and write DB insertion code
      • Create SQL & Feed into Analytics Tool
      Late Binding Schema (Minutes & Non-Destructive):
      • Write Semantic Events
      • Collect with Splunk
      • Create Searches, Reports & Graphs
  14. Logging Best Practices
      Create Human Readable Events
      For the most part
  15. Logging Best Practices
      • Log in Text: Binary sounds good because it's compressed, but it requires decoding and will not segment
      • Make it easy for humans: Try not to use complex encoding that requires lookups
      • Categorize: Use INFO, WARN, ERROR, DEBUG, etc
      • Don't use XML: Unless you absolutely need multi-depth nesting
        - We're happy for you to pay us to log in XML, but JSON is much easier to read
      • JSON is better: Splunk has native JSON support, even for nested structures
      • Keep multi-line events to a minimum
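As one way to follow these points in Python, the standard `logging` module can emit one categorized JSON object per line. This is a minimal sketch, not a Splunk-provided API: the `JsonFormatter` class, the logger name `shop.orders`, and the `fields` key passed through `extra` are all choices made for this example. Note that Python names the level `WARNING` rather than `WARN`.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record, "%Y-%m-%d %H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra structured fields are passed via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(stream):
    logger = logging.getLogger("shop.orders")
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.warning("payment retry", extra={"fields": {"orderId": 42, "attempt": 2}})
event = json.loads(buffer.getvalue())
```

Because `json.dumps` escapes newlines inside strings, each event stays on one line, which also keeps multi-line events to a minimum.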
  16. Logging Best Practices
      Clearly Timestamp Every Event
      • Do not use time offsets
      • Use human readable timestamps
      • Favor the beginning of the line: the farther you place the timestamp from the beginning, the more difficult it is to tell it's a timestamp and not other data
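A possible Python rendering of the timestamp advice: put an absolute, human-readable timestamp first on every line. The sketch below assumes UTC with millisecond precision, which the slide does not prescribe; `UtcFormatter` and the logger name are made up for this example.

```python
import io
import logging
import time


class UtcFormatter(logging.Formatter):
    # Use absolute UTC time instead of local time or offsets from start-up.
    converter = time.gmtime


formatter = UtcFormatter(
    fmt="%(asctime)s.%(msecs)03dZ %(levelname)s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(formatter)

log = logging.getLogger("timestamp.demo")
log.setLevel(logging.INFO)
log.handlers.clear()
log.propagate = False
log.addHandler(handler)

log.info("action=login user=alice")
line = stream.getvalue().strip()
```

The resulting line starts with a value such as `2012-09-10T10:00:01.123Z`, so the timestamp is the first thing on the line and cannot be confused with other data.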
  17. Logging Best Practices
      Log more than just Debugging Events
      Log anything that can add value when aggregated, charted or further analyzed
      Example Bogus Pseudo-Code:
          void submitPurchase(purchaseId)
          {
              log.info("action=submitPurchaseStart, purchaseId=%d", purchaseId)
              //these calls throw an exception on error
              submitToCreditCard(...)
              generateInvoice(...)
              generateFullfillmentOrder(...)
              log.info("action=submitPurchaseCompleted, purchaseId=%d", purchaseId)
          }
      • Graph purchase volume by hour, by day, by month.
      • How long are purchases taking during different times of the day and different days of the week?
      • Are purchases taking longer than they did last month?
      • Are my systems getting slower and slower, or are they ok?
      • How many purchases are failing? Graph the failures over time.
      • Which specific purchases are failing?
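The slide's pseudo-code could look like this as runnable Python. Two things go beyond the slide and are assumptions of this sketch: an explicit `submitPurchaseFailed` event and a `durationMs` field, both added so the failure and timing questions can be answered from single events. The `steps` argument stands in for the credit card, invoice, and fulfillment calls, which are hypothetical callables that raise on error.

```python
import io
import logging
import time

log = logging.getLogger("purchases")


def submit_purchase(purchase_id, steps):
    """Run each step in order; steps raise an exception on error."""
    log.info("action=submitPurchaseStart, purchaseId=%d", purchase_id)
    started = time.monotonic()
    try:
        for step in steps:
            step(purchase_id)
    except Exception as exc:
        log.error("action=submitPurchaseFailed, purchaseId=%d, reason=%s",
                  purchase_id, type(exc).__name__)
        raise
    elapsed_ms = int((time.monotonic() - started) * 1000)
    log.info("action=submitPurchaseCompleted, purchaseId=%d, durationMs=%d",
             purchase_id, elapsed_ms)


# Demo: capture the events in memory instead of a real log file.
stream = io.StringIO()
log.handlers.clear()
log.addHandler(logging.StreamHandler(stream))
log.setLevel(logging.INFO)
log.propagate = False


def decline(purchase_id):
    raise ValueError("card declined")


submit_purchase(17, [lambda purchase_id: None])
try:
    submit_purchase(18, [decline])
except ValueError:
    pass

lines = stream.getvalue().splitlines()
```

Counting `submitPurchaseStart` events over time gives purchase volume, and `submitPurchaseFailed` events identify which specific purchases failed.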
  18. Logging Best Practices
      Clearly mark key/value pairs
      Splunk loves key value pairs that look like:
          key=value, key2=value2, key3=value3…
      Look at the following events:
          1) Log.debug("error %d", userId)
          2) Log.debug("orderstatus=error errorcode=454 user=%d", userId)
      Searching for "error" when logging with #1 will probably bring back all kinds of errors, but searching for orderstatus=error will bring back only the ones you really want.
      Sure, it's verbose, but because Splunk compresses data, the repeated terms compress well.
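A small helper can keep key/value formatting consistent across a codebase. The function below is a hypothetical sketch: the name `kv` and the quoting of values that contain spaces or commas are assumptions of this example, not rules stated on the slide.

```python
def kv(**fields):
    """Render fields as 'key=value, key2=value2', quoting awkward values."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        # Assumption: wrap values containing spaces, commas, or quotes in
        # double quotes so the pair can still be read as one unit.
        if " " in text or "," in text or '"' in text:
            text = '"' + text.replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return ", ".join(parts)


event = kv(orderstatus="error", errorcode=454, user=17)
quoted = kv(msg="card declined")
```

The result can be passed to any logger call, for example `log.debug(kv(orderstatus="error", errorcode=454, user=user_id))`.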
  19. Logging Best Practices
      Break multi-value information into separate events
      Example: Events represent what apps are installed on a mobile device
          <TS> phonenumber=333-444-4444, app=angrybirds, installdate=xx/xx/xx
          <TS> phonenumber=333-444-4444,app=facebook,installdate=yy/yy/yy
      Use the "transaction" search command to group them
      If you do this, you'll have to edit a config file:
          <TS> phonenumber=333-444-4444,app=angrybirds,facebook
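In code, this means looping over the multi-valued data and emitting one event per value rather than joining the values into a single field. A minimal sketch, with a hypothetical `app_events` helper and made-up install dates:

```python
def app_events(timestamp, phone_number, installed_apps):
    """Build one event line per installed app instead of one combined line."""
    return [
        f"{timestamp} phonenumber={phone_number}, app={name}, installdate={date}"
        for name, date in installed_apps
    ]


events = app_events(
    "2012-09-10 10:00:00",
    "333-444-4444",
    [("angrybirds", "2012-08-01"), ("facebook", "2012-08-15")],
)
```

Each resulting line carries the shared `phonenumber`, so the events can be grouped again later without any special parsing of a comma-separated value.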
  20. Logging Best Practices
      • Log Unique Identifiers
      • Carry Unique Identifiers through multiple touch points if possible
        - enables transaction search
      • Use Transitive Closure if you need to:
          transid=abcdef, . . .
          transid=abcdef, otherid=qrstuv, . . .
          otherid=qrstuv
        (all three events belong to one Transaction)
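Transitive closure over identifiers can be illustrated with a small union-find: events that share any identifier, directly or through a chain of other events, end up in the same group. This is a conceptual sketch in Python, not the implementation of Splunk's transaction search; `group_transactions` and the field names are taken from the slide's example or invented here.

```python
def group_transactions(events, id_fields=("transid", "otherid")):
    """Group event indexes that are linked by shared identifier values."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def ids_of(event):
        return [(field, event[field]) for field in id_fields if field in event]

    # Link every identifier that appears together in one event.
    for event in events:
        ids = ids_of(event)
        for other in ids[1:]:
            parent[find(ids[0])] = find(other)

    groups = {}
    for index, event in enumerate(events):
        ids = ids_of(event)
        if ids:
            groups.setdefault(find(ids[0]), []).append(index)
    return sorted(groups.values())


sample = [
    {"transid": "abcdef"},
    {"transid": "abcdef", "otherid": "qrstuv"},
    {"otherid": "qrstuv"},
    {"transid": "zzz999"},
]
transactions = group_transactions(sample)
```

The first and third events share no field, yet they land in the same group because the second event links `transid=abcdef` to `otherid=qrstuv`.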
  21. Quick Semantic Logging Demo
  22. Why JSON
      • Direct to/from Data Structure in Modern Languages
        - Python, Ruby, Javascript, etc
      • Easy to serialize/de-serialize objects to/from JSON
        - Thus storing and retrieving objects via Splunk
      • It's the "Lingua Franca" of light weight Cloud Services
        - Web Hooks and push
  23. JSON Search Examples
      {"web-app": {
        "servlet": [
          {
            "servlet-name": "cofaxCDS",
            "servlet-class": "org.cofax.cds.CDSServlet",
            "init-param": {
              "configGlossary:installationAt": "Philadelphia, PA",
              "configGlossary:adminEmail": "ksm@pobox.com",
              "maxUrlLength": 500}},
          {
            "servlet-name": "cofaxEmail",
            "servlet-class": "org.cofax.cds.EmailServlet",
            "init-param": {
              "mailHost": "mail1",
              "mailHostOverride": "mail2"}},
          {
            "servlet-name": "cofaxAdmin",
            "servlet-class": "org.cofax.cds.AdminServlet"},
          {
            "servlet-name": "fileServlet",
            "servlet-class": "org.cofax.cds.FileServlet"},
          . . . . . . . .
      source="/Users/wma/splunk/si-staging/sample.json" | spath output=foo path=web-app.servlet{2}.servlet-class | top foo
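For readers who think in code, a rough Python analogue of "extract a nested field from each JSON event, then count the most common values" is shown below. It is only an analogy: it collects the `servlet-class` of every servlet rather than one array position, and it makes no claim about how `spath` numbers array elements. The shortened `SAMPLE` document and the `servlet_classes` helper are made up for this sketch.

```python
import json
from collections import Counter

# A shortened, complete version of the sample document from the slide.
SAMPLE = """{"web-app": {"servlet": [
  {"servlet-name": "cofaxCDS", "servlet-class": "org.cofax.cds.CDSServlet"},
  {"servlet-name": "cofaxEmail", "servlet-class": "org.cofax.cds.EmailServlet"},
  {"servlet-name": "cofaxAdmin", "servlet-class": "org.cofax.cds.AdminServlet"}
]}}"""


def servlet_classes(raw_event):
    """Walk web-app -> servlet -> servlet-class for every array element."""
    document = json.loads(raw_event)
    return [servlet["servlet-class"] for servlet in document["web-app"]["servlet"]]


counts = Counter()
for raw in [SAMPLE, SAMPLE]:
    counts.update(servlet_classes(raw))
```

The nested structure maps directly onto dictionaries and lists, which is the "direct to/from data structure" property that makes JSON convenient for logged events.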
  24. Operational Best Practices for Splunk
      • Log locally to files
      • Use rotation policies: destroy or back up (your choice)
      • Run Splunk Forwarders
        - provides elastic buffering, or else production applications can block!
      Diagram: Event Producing Application -> Local Log File -> Splunk Forwarder -> Network -> Splunk Indexer or Storm
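Logging locally with a rotation policy is available in the Python standard library through `logging.handlers.RotatingFileHandler`. The sketch below uses deliberately tiny limits and a temporary directory so the rotation is visible; the logger name, file name, and sizes are assumptions for illustration, and a forwarder would be configured separately to watch the resulting files.

```python
import logging
import logging.handlers
import os
import tempfile


def build_file_logger(path, max_bytes=1_000_000, backups=5):
    """Log to a local file, keeping at most `backups` rotated copies."""
    logger = logging.getLogger("app.local")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Tiny limits so that 20 short events force several rotations.
log = build_file_logger(log_path, max_bytes=200, backups=2)
for sequence in range(20):
    log.info("action=heartbeat, sequence=%d", sequence)

for handler in log.handlers:
    handler.close()

files = sorted(os.listdir(log_dir))
```

With `backups=2`, older files beyond `app.log.1` and `app.log.2` are destroyed; a back-up policy would instead copy them elsewhere before they roll off.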
  25. Operational Best Practices for Splunk
      • Syslog is great for large volumes of low value data.
        - Obviously lossy
        - But has existing services on Unix
      • Syslog NG is better, but watch your configuration
      • Syslog can't handle multi-line events. Packet sizes are too small.
  26. Operational Best Practices for Enterprise Splunk
      • Over provision indexers
        - More indexers = better search performance
        - I've seen too many people under-power their Splunk machines and then complain that Splunk is slow
      More indexers will add more parallelization to searches
  27. Operational Best Practices for Splunk
      The more you put in Splunk, the more visibility you have:
      • Application logs
      • Database logs
      • Network logs
      • Configuration files
      • Performance data (iostat, vmstat, ps, etc)
      • Anything that has a time component
  28. Treat Splunk as part of your development software stack
  29. Use Splunk as your Analytics Engine
      Collect events from every single machine
  30. Development Best Practices
      • Developer teams are now required to create tags and notations in logs for easier identification
      • Part of each application backlog includes creating custom Splunk reports, dashboards and alerts
      • Enrich your Logs!
        - Build in specific tags and keywords
        - Standardize and optimize your log formats
      Source: Washington Post Splunk Presentation
  31. Your Code Isn't Considered "Delivered" Until You Have Built Analytics that Support it!
  32. What It Gets You
  33. [Slide contains no text]
  34. Well Instrumented Applications Can Get You
      • Per API performance metrics with almost no overhead
      • Detailed tracing of where an error occurs in a flow
      • Better monitoring
      • Interesting "Found Data"
      • Business Analytics along with Performance Analytics
        - Be a hero to your boss by accident!
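Per-API metrics can come from a thin wrapper around each call that logs one key=value event with the elapsed time. The decorator below is a hypothetical sketch; the name `timed`, the field names, and the sample function are assumptions, and the actual overhead depends on the logging setup.

```python
import functools
import io
import logging
import time


def timed(logger):
    """Decorator factory: log one event per call with status and duration."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                elapsed_ms = (time.perf_counter() - started) * 1000
                logger.info("call=%s status=%s responseTimeMs=%.1f",
                            func.__name__, status, elapsed_ms)
        return wrapper
    return decorator


# Demo: capture the metric events in memory.
stream = io.StringIO()
api_log = logging.getLogger("api.metrics")
api_log.handlers.clear()
api_log.addHandler(logging.StreamHandler(stream))
api_log.setLevel(logging.INFO)
api_log.propagate = False


@timed(api_log)
def get_customer_information(customer_id):
    return {"id": customer_id}


result = get_customer_information(7)
metric_line = stream.getvalue().strip()
```

Because the `status` field is set to `error` before the exception propagates, the same events also show where in a flow a failure occurred.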
  35. Middleware Performance Example
      2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=ValidateActivationPayment type=Requests val=1 newval=109083 oldval=109082
      2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=GetCustomerInformation type=ResponseTime val=1142 newval=1142 oldval=1318
      2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=UpdateActivationPayment type=Successful val=3 newval=103334 oldval=103331
      2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=ValidateActivationPayment type=RequestsOneMinuteCount val=1 newval=1 oldval=0
      2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=PostPaygoPayment type=Successful val=6 newval=178006 oldval=178000
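Events in this shape are easy to turn into per-call metrics. As a sketch of the kind of aggregation a search or report would perform, the Python below parses the sample lines and averages `val` for `type=ResponseTime` events by `call`. The helper names are made up, and treating `val` as the response-time measurement is an assumption based on the field names.

```python
import re
from collections import defaultdict

EVENTS = [
    "2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=ValidateActivationPayment type=Requests val=1 newval=109083 oldval=109082",
    "2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=GetCustomerInformation type=ResponseTime val=1142 newval=1142 oldval=1318",
    "2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=UpdateActivationPayment type=Successful val=3 newval=103334 oldval=103331",
    "2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=ValidateActivationPayment type=RequestsOneMinuteCount val=1 newval=1 oldval=0",
    "2011-07-28 09:21:47 server=sandapcspapl1 adaptor=APL call=PostPaygoPayment type=Successful val=6 newval=178006 oldval=178000",
]


def parse_event(line):
    """Split a line into its leading timestamp and its key=value fields."""
    fields = dict(re.findall(r"(\w+)=(\S+)", line))
    fields["timestamp"] = line[:19]
    return fields


def mean_response_times(events):
    """Average the 'val' field of ResponseTime events, grouped by call."""
    samples = defaultdict(list)
    for line in events:
        fields = parse_event(line)
        if fields.get("type") == "ResponseTime":
            samples[fields["call"]].append(int(fields["val"]))
    return {call: sum(values) / len(values) for call, values in samples.items()}


averages = mean_response_times(EVENTS)
```

Only one of the five sample events is a `ResponseTime` measurement, so the result has a single entry; with a day of data the same function yields one average per API call.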
  36. Per API Response Times
  37. Business Metrics (Sales)
  38. Thanks! Questions?
