Making Reg[Ee]x Your Buddy




August 15, 2011
(?i)(mi(chael|ke) wilde), Splunk Ninja
Thursday, August 18, 11
Hi, I’m Michael Wilde
• You may know me from:




                          Splunk Worldwide Users’ Conference     2                    © Copyright Splunk 2011
What is RegEx
“Finite Automata”

• Regular expressions invented in the 1950s by mathematician Stephen Cole Kleene
• Implemented by “ed” and “grep” creator Ken Thompson in 1973
• Pattern matching language for text processing
• Has slightly different implementations (Perl, POSIX)
• Way cryptic at first sight


Why should you care
• Field extraction is a requirement for reporting
• Index-time filtering & routing
• You’ll seem smart
• It will be useful beyond Splunk
• You might score with the (ladies|dudes) at (MakersFaire|ComiCon).


Thinking Regex

• Log events are a great place to start; they have structure
• Don’t overthink it. The pattern is there waiting to be discovered
• Don’t be lazy and use wildcards too much
• Learn to love “NOT” regexes: \S+ \D+ \W+ [^,]+
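
A quick way to get a feel for the “NOT” classes is to run them against a single line. The sketch below uses Python's `re` module, which spells these classes the same way as the PCRE-style regexes in the slides. The sample log line and its field layout are made up for illustration.

```python
import re

# Hypothetical comma-separated log line (not from the slides).
line = "2011-08-15 12:00:01,web01,GET /index.html,200"

# [^,]+ means "one or more characters that are not a comma",
# so each match stops exactly at a field boundary.
fields = re.findall(r"[^,]+", line)

# \S+ is "not whitespace", \D+ is "not a digit", \W+ is "not a word character".
first_token = re.match(r"\S+", line).group(0)
leading_non_digits = re.match(r"\D+", "status=200").group(0)
separators = re.findall(r"\W+", "GET /index.html")

print(fields)
print(first_token, leading_non_digits, separators)
```

Each pattern says what to stop at rather than “match anything”, which is the habit the last bullet is asking for.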


Be nice to your RegEx engine
• MS-DOS taught us to be laaaaaaaaaaaaaaaaazy with *.*
• A regex engine matches character by character, and then does backtracking.
• Match in as few steps as possible
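
To make the backtracking point concrete, here is a small Python sketch comparing a lazy-habit wildcard with a negated character class. The event string is hypothetical; only the two pattern styles come from the slides.

```python
import re

# Hypothetical event: host, user, action, status separated by commas.
event = "web01,alice,login,ok"

# Lazy habit: ".*" first swallows the whole line, then the engine
# backtracks one character at a time until a comma can match.
greedy = re.match(r"(.*),", event).group(1)

# Specific pattern: "[^,]+" can never run past a comma, so the engine
# stops at the first field boundary without backing up.
precise = re.match(r"([^,]+),", event).group(1)

print(greedy)   # everything up to the last comma
print(precise)  # just the first field
```

The wildcard version does more work and also captures more than intended, which is usually the first symptom people notice.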




Regexes in Splunk


                          Search Language: “rex”, “erex”, “regex”
                          Indexing: Filtering data (in|out), line breaking,
                          timestamp extraction
                          Field Extraction



IFX
• Splunk has a built-in "interactive field extractor"
• It can be useful. Give it samples of data, and it will attempt to learn a regex and persist a single field
• It limits the number of events displayed in its viewer.
• You might not see your search results when using it? Huh?




what if we could use that "intelligent" stuff IFX was doing but in the search language
  




meet "erex"
• Allows you to give it examples, but it works on your search results
• Allows you to give it counterexamples of stuff you don't want to match on
• Builds you a proper rex command




...there's an app for that. right?
  




Field Extractor App
• Imagine you could use your mouse, highlight fields, name them, persist them, go home early and never write regex.
• David Carasso's Field Extractor app is like a "workbench for field extraction"
• Download it from SplunkBase


searching with regex


the | regex search command
• Did you know Splunk crushes all terms to lower case?
• If you need to look for specific patterns or even words and respect the case the original events are in, use | regex
• index=splunktv | regex _raw="(MP3|M4A)"  <-- notice this is a case-sensitive pattern match.
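
The filtering behaviour can be imitated outside Splunk. The following Python sketch keeps only the events whose raw text matches the slide's pattern. The three events are invented, and this mirrors the idea of `| regex`, not Splunk's implementation.

```python
import re

# Hypothetical raw events; only the pattern comes from the slide.
events = [
    "played track01.MP3 from library",
    "played track02.mp3 from library",
    "converted intro.M4A to wav",
]

pattern = re.compile(r"(MP3|M4A)")  # case sensitive unless told otherwise

# Keep only events whose raw text contains the pattern.
kept = [e for e in events if pattern.search(e)]

# The same pattern with (?i) ignores case, like the (?i) on the title slide.
kept_any_case = [e for e in events if re.search(r"(?i)(MP3|M4A)", e)]

print(kept)
print(kept_any_case)
```

The lower-case "mp3" event only survives once `(?i)` is added, which is the distinction the bullet is pointing at.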


What about good ole Rex?
• Search-time field extractions via your own regexes -- in the search language
• Name your fields
• Reuse everyone else's work!



a few more tricks for you



host extraction irritates me



regex in host extraction
• Splunk will attempt to do the right thing. The log source will likely make it hard for Splunk -- and you'll blame Splunk
• props.conf & transforms.conf are needed to properly extract hostnames in some cases (F5 Big-IP and HP networking gear)
• Use the default settings in props.conf and use your own settings as well


priority boarding in props.conf

    [source::...a...]
    TRANSFORMS-ahosts = ahostextraction
    priority = 1

    [source::...z...]
    TRANSFORMS-zhosts = zhostextraction
    priority = 99

What if the source we were matching against had the word "arizona" in it? It will match both, right? Use "priority" to control matching. 99 is higher than 1, so 99 is the higher priority. Yeah, I know... weird.


Basic Training Complete!

Let's do something more difficult


Splunk is so smart
except when it's not

    <policy id="3">Finjan HTTPS policy</policy>
    <cp id="5" name="Active Content" display_name="Active Content"/>
    <group id="5002" cp_id="5" type="0">Full profile - Binary Behavior</group>
    <item id="28015">Format error in CRL lastUpdate field</item>
    <item id="3265747">*.served.com/*</item>
    <rule_comment id="2" name="Block certificate validation errors">&lt;![CDATA[Block HTTPS content without a valid certificate]]&gt;</rule_comment>

AUTO-KV pulled the “id” field out of every event. Yay!!!


“id” is not the field name
look closer, Agent Starling

    <policy id="3">Finjan HTTPS policy</policy>
    <cp id="5" name="Active Content" display_name="Active Content"/>
    <group id="5002" cp_id="5" type="0">Full profile - Binary Behavior</group>
    <item id="28015">Format error in CRL lastUpdate field</item>
    <rule_comment id="2" name="Block certificate validation errors">&lt;![CDATA[Block HTTPS content without a valid certificate]]&gt;</rule_comment>

We can educate Splunk on dynamically pulling the KEY and VALUE with...


Dynamic Key Value Extraction
...but tailored for our needs

REGEX for the “KEY” is <([^=]+)=
Less than, followed by (anything that is “not an equal sign” -- greedy match), followed by an equal sign

REGEX for the “VALUE” is "([^"]+)">
A quote, followed by (anything that is not a quote -- greedy match), followed by a quote, followed by a greater-than sign

Sample tags shown alongside both patterns:
    <policy id="3">
    <cp id="5"
    <item id="28015">
keep going dude!
Persist your sweet dynamic KV patterns
props.conf & transforms.conf required

Create an entry in props.conf like this:

    [m86_dynamic_kv]
    REPORT-m86fields = mym86kv

Create an entry in transforms.conf like this:

    [mym86kv]
    REGEX = <([^=]+)="([^"]+)">
    FORMAT = $1::$2

Example: in <policy id="3">Finjan HTTPS policy</policy>, $1 is the text between "<" and "=", and $2 is the quoted value.

Dang it! It wasn’t perfect
some of our events don’t finish their XML tag right after a quote

Create an entry in props.conf like this:

    [m86_dynamic_kv]
    REPORT-m86fields = mym86kv

Create an entry in transforms.conf like this:

    [mym86kv]
    REGEX = <([^=]+)="([^"]+)[^>]+>
    FORMAT = $1::$2

Example event (cut off on the slide):
    <rule_comment id="690" name="Log everythin
    Image files">&lt;![CDATA[Logs all content passin
    the system except for ......
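
The difference between the two REGEX settings is easy to check outside Splunk. This Python sketch runs both patterns over three of the sample tags from the earlier slides and returns the pair that FORMAT = $1::$2 would use. The `extract` helper is a hypothetical name, and the sketch only imitates the regex step, not Splunk's field handling.

```python
import re

# Sample tags taken from the slides above.
events = [
    '<policy id="3">Finjan HTTPS policy</policy>',
    '<cp id="5" name="Active Content" display_name="Active Content"/>',
    '<group id="5002" cp_id="5" type="0">Full profile - Binary Behavior</group>',
]

strict = re.compile(r'<([^=]+)="([^"]+)">')       # tag must close right after the value
relaxed = re.compile(r'<([^=]+)="([^"]+)[^>]+>')  # allows more text before ">"

def extract(pattern, event):
    """Return (key, value) like FORMAT = $1::$2, or None when nothing matches."""
    m = pattern.search(event)
    return (m.group(1), m.group(2)) if m else None

strict_results = [extract(strict, e) for e in events]
relaxed_results = [extract(relaxed, e) for e in events]

print(strict_results)
print(relaxed_results)
```

The strict pattern only matches the first tag, because the other two have more attributes after the first quoted value. The relaxed pattern matches all three.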

Think you’re good?
Try extracting the “service” field

    2011/07/21 19:27:22.071 [(ninja-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ninja-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

Your job is to create a multi-valued field, as the “service” field exists multiple times in each event

Look for the obvious patterns

    2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

Your brain will tell you to look for “anything after the first comma” after that left bracket and before the second comma

...and your brain was wrong.

    2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

This is NOT a “service”

Dang... what are we gonna do now?

What is common with “services”

    2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

They’re all alphanumeric or “word” characters: 0-9A-Za-z_
But what about the preceding text

    2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

Left bracket followed by some stuff, followed by a comma... but it's not consistent. Sometimes a “(” left paren is in there.


This is a better match

    2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

    \[[(\-a-zA-Z0-9]+,([a-zA-Z]+),

Say the matching pattern out loud. It will help.
Left bracket, followed by anything in this character list (greedy). Followed by a comma, and then create a capturing group of text that matches upper or lower case roman alphabet -- greedy (as many times as possible). End capturing group, then followed by a comma.
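
To see why the tighter pattern is better, both ideas can be run against the event in Python. The naive regex below is one reading of “anything after the first comma”, not a pattern shown on the slides. The better regex is the slide's pattern with its backslash escapes written out.

```python
import re

# Event text from the slides, joined onto one line.
event = (
    "2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 "
    "19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] "
    "recoverSubject(V1.21.47,OSM:1t7Dg201000:i:"
    "1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,"
    "172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:"
    "1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms"
)

# Naive idea: left bracket, anything up to a comma, then capture up to the next comma.
naive = re.findall(r"\[[^,]+,([^,]+),", event)

# Tighter idea: restrict what may follow the bracket, and capture letters only.
better = re.findall(r"\[[(\-a-zA-Z0-9]+,([a-zA-Z]+),", event)

print(naive)
print(better)
```

The naive pattern also picks up the long `OSM:...` tokens further along the line. The restricted character list rejects them and leaves only the two services.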


Can’t be too hard to extend it, right?

    2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21
    19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service]
    recoverSubject(V1.21.47,OSM:1t7Dg201000:i:
    1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,
    172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:
    1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms

    \[[(\-a-zA-Z0-9]+,([a-zA-Z]+),[^[]+\[[(\-a-zA-Z0-9]+,([a-zA-Z]+),

Left bracket, followed by anything in this character list (greedy). Followed by a comma, and then create a capturing group of text that matches upper or lower case roman alphabet -- greedy (as many times as possible). End capturing group, then followed by a comma. Followed by anything that is NOT a left bracket, followed by.....


Sad Trombone
This one has four services

    2011/07/21 19:27:27.596 [(ninja4-fe29,genie,/handle,131292312,2011/07/21
    19:27:27.310)[ninja4-be716,lmt,PbContentService.write<tetherAccountData;default>]
    [ninja4-be05,tether,TetherAccountService.bindAccount]
    [ninja4-be393,auth,Auth2Service.upgradeSubject]] [] [Auth2Service]
    upgradeSubject(V1.21.49,"INT",[LIM:131292312:s:
    1311276361:b8f677d957eb3f7b9622247b72374c791720bc17,true],
    {internalAppName=twitter-sync},"tether",null)=[Principal[2],[INT:
    131292312/twitter-sync:
    1311276447:df9dd0175bd2e6107c2dfae36dfd9a9dc11f0631,false,20y]] in 15ms




Remember “rex”?
He devours data

But you can make “rex” very hungry and control how much lunch he eats. By default, he only gets “one helping of meat”



Using max_match with rex
You limit or expand the number of times it runs

    rex max_match=20 "\[[(\-a-zA-Z0-9]+,(?<service>[a-zA-Z]+),"

Instead of that last regex that matched “two” services, let's just match one, and tell rex to repeat our pattern matching
  




You can persist this in config files
props.conf & transforms.conf required

Create an entry in props.conf like this:

    [ninjasocial]
    REPORT-ninjafields = myepicregex

Create an entry in transforms.conf like this:

    [myepicregex]
    REGEX = \[[(\-a-zA-Z0-9]+,(?<service>[a-zA-Z]+),
    MV_ADD = TRUE
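
The “match one service, then repeat” idea can be sketched in Python against the four-service event from the Sad Trombone slide. `extract_services` and its `max_match` argument are hypothetical stand-ins that imitate the behaviour described for rex. They are not Splunk code.

```python
import re

# Four-service event from the slides, joined onto one line.
event = (
    "2011/07/21 19:27:27.596 [(ninja4-fe29,genie,/handle,131292312,2011/07/21 "
    "19:27:27.310)[ninja4-be716,lmt,PbContentService.write<tetherAccountData;default>]"
    "[ninja4-be05,tether,TetherAccountService.bindAccount]"
    "[ninja4-be393,auth,Auth2Service.upgradeSubject]] [] [Auth2Service] "
    'upgradeSubject(V1.21.49,"INT",[LIM:131292312:s:'
    "1311276361:b8f677d957eb3f7b9622247b72374c791720bc17,true],"
    '{internalAppName=twitter-sync},"tether",null)=[Principal[2],[INT:'
    "131292312/twitter-sync:"
    "1311276447:df9dd0175bd2e6107c2dfae36dfd9a9dc11f0631,false,20y]] in 15ms"
)

# Python spells named groups (?P<name>...) where the slide's rex uses (?<name>...).
SERVICE = re.compile(r"\[[(\-a-zA-Z0-9]+,(?P<service>[a-zA-Z]+),")

def extract_services(text, max_match=1):
    """Collect up to max_match values of the 'service' group, in order."""
    values = []
    for m in SERVICE.finditer(text):
        if len(values) >= max_match:
            break
        values.append(m.group("service"))
    return values

one = extract_services(event)                 # default: a single helping
many = extract_services(event, max_match=20)  # like rex max_match=20

print(one)
print(many)
```

With the default of one match only the first service comes back. Raising the limit returns all four as a list, which is the multi-valued field the exercise asked for.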




And now for something difficult
gaming logs - Team Fortress

    L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed
    "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position
    "-2677 2177 -127") (victim_position "-2555 2323 -127")


I need the data
gaming logs - Team Fortress

    L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed
    "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position
    "-2677 2177 -127") (victim_position "-2555 2323 -127")


Who’s who?
How do we know who did what to whom?

    L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed
    "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position
    "-2677 2177 -127") (victim_position "-2555 2323 -127")

actor            actor_id           actor_team   actor_type


    L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed
    "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position
    "-2677 2177 -127") (victim_position "-2555 2323 -127")


                            actee              actee_id          actee_type     actee_team

Didn’t we see this slide before?
How do we know who did what to whom?

    L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed
    "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position
    "-2677 2177 -127") (victim_position "-2555 2323 -127")

See that pattern? Remember “max_match”?

    L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed
    "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position
    "-2677 2177 -127") (victim_position "-2555 2323 -127")


See that pattern? Remember “max_match”?

    "The Administrator<61><BOT><Red>"
    "MoreGun<56><BOT><Blue>"

Using rex / mv_add, let's capture it into some temporary “multi-value” fields




“Temporary” MultiValue Fields

    actor_name_z    The Administrator,MoreGun
    actor_id_z      61,56
    actor_type_z    BOT,BOT
    actor_team_z    Red,Blue

Using rex / mv_add, let's capture it into some temporary “multi-value” fields



Evaluate & Transform with “mvindex”
multi-value fields have a “position value” in the array

    mvindex         0                    1
    actor_name_z    The Administrator    MoreGun
    actor_id_z      61                   56
    actor_type_z    BOT                  BOT
    actor_team_z    Red                  Blue



It's time for our fields to split up!
multi-value fields have a “position value” in the array

    | eval actor_name = mvindex(actor_name_z,0) | eval actee_name = mvindex(actor_name_z,1)

    actor_name = The Administrator
    actee_name = MoreGun
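
The whole capture-then-split flow can be sketched in Python. The slides do not show the rex that was used, so the `PLAYER` pattern below is an assumption based on the `"name<id><type><team>"` shape visible in the log line. Plain list indexing stands in for mvindex.

```python
import re

# Log line from the slides, joined onto one line.
event = (
    'L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed '
    '"MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position '
    '"-2677 2177 -127") (victim_position "-2555 2323 -127")'
)

# Assumed pattern (not from the slides): quoted name, then <id><type><team>.
PLAYER = re.compile(r'"([^"<]+)<(\d+)><([^>]+)><([^>]+)>"')

matches = PLAYER.findall(event)

# Temporary multi-value fields: one list per captured group.
actor_name_z = [m[0] for m in matches]
actor_id_z = [m[1] for m in matches]
actor_type_z = [m[2] for m in matches]
actor_team_z = [m[3] for m in matches]

# List indexing plays the role of mvindex(field, n).
actor_name, actee_name = actor_name_z[0], actor_name_z[1]

print(actor_name, "->", actee_name)
```

Position 0 is whoever appears first in the event (the actor) and position 1 is the second (the actee). The same split applies to the id, type and team lists.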

Resources

                          •   regexlib.com
                          •   regular-expressions.info
                          •   gskinner.com/RegExr
                          •   Reggy / RegExhibit
                          •   RegexBuddy (JGSoft.com)



Questions, just ask!
                          Michael Wilde, Splunk Ninja
                              ninja@splunk.com




Network automation ansible_nx-api
Network automation ansible_nx-apiNetwork automation ansible_nx-api
Network automation ansible_nx-apiJoel W. King
 
Introduction to Ansible (Pycon7 2016)
Introduction to Ansible (Pycon7 2016)Introduction to Ansible (Pycon7 2016)
Introduction to Ansible (Pycon7 2016)Ivan Rossi
 

What's hot (20)

Red Hat Enteprise Linux Open Stack Platfrom Director
Red Hat Enteprise Linux Open Stack Platfrom DirectorRed Hat Enteprise Linux Open Stack Platfrom Director
Red Hat Enteprise Linux Open Stack Platfrom Director
 
Openstack Quantum + Devstack Tutorial
Openstack Quantum + Devstack TutorialOpenstack Quantum + Devstack Tutorial
Openstack Quantum + Devstack Tutorial
 
Chef 11 Preview/Chef for OpenStack
Chef 11 Preview/Chef for OpenStackChef 11 Preview/Chef for OpenStack
Chef 11 Preview/Chef for OpenStack
 
Verizon k8-ignite-meetup
Verizon k8-ignite-meetupVerizon k8-ignite-meetup
Verizon k8-ignite-meetup
 
TripleO
 TripleO TripleO
TripleO
 
Boston/NYC Chef for OpenStack Hack Days
Boston/NYC Chef for OpenStack Hack DaysBoston/NYC Chef for OpenStack Hack Days
Boston/NYC Chef for OpenStack Hack Days
 
Falcon feedenhancement
Falcon feedenhancementFalcon feedenhancement
Falcon feedenhancement
 
Shareplex Presentation
Shareplex PresentationShareplex Presentation
Shareplex Presentation
 
Chef for OpenStack December 2012
Chef for OpenStack December 2012Chef for OpenStack December 2012
Chef for OpenStack December 2012
 
Puppet at Spotify
Puppet at SpotifyPuppet at Spotify
Puppet at Spotify
 
Building and Running OpenStack on POWER8
Building and Running OpenStack on POWER8Building and Running OpenStack on POWER8
Building and Running OpenStack on POWER8
 
OpenStack Austin Meetup January 2014: Chef + OpenStack
OpenStack Austin Meetup January 2014: Chef + OpenStackOpenStack Austin Meetup January 2014: Chef + OpenStack
OpenStack Austin Meetup January 2014: Chef + OpenStack
 
Triple o overview
Triple o overviewTriple o overview
Triple o overview
 
Deploy like a Boss: Using Kubernetes and Apache Ignite!
Deploy like a Boss: Using Kubernetes and Apache Ignite!Deploy like a Boss: Using Kubernetes and Apache Ignite!
Deploy like a Boss: Using Kubernetes and Apache Ignite!
 
Puppet and the Model-Driven Infrastructure
Puppet and the Model-Driven InfrastructurePuppet and the Model-Driven Infrastructure
Puppet and the Model-Driven Infrastructure
 
Puppet Camp Charlotte 2015: Exporting Resources: There and Back Again
Puppet Camp Charlotte 2015: Exporting Resources: There and Back AgainPuppet Camp Charlotte 2015: Exporting Resources: There and Back Again
Puppet Camp Charlotte 2015: Exporting Resources: There and Back Again
 
Network automation ansible_nx-api
Network automation ansible_nx-apiNetwork automation ansible_nx-api
Network automation ansible_nx-api
 
Sf k8-ignite-meetup
Sf k8-ignite-meetupSf k8-ignite-meetup
Sf k8-ignite-meetup
 
Dev stacklabguide
Dev stacklabguideDev stacklabguide
Dev stacklabguide
 
Introduction to Ansible (Pycon7 2016)
Introduction to Ansible (Pycon7 2016)Introduction to Ansible (Pycon7 2016)
Introduction to Ansible (Pycon7 2016)
 

Viewers also liked

Regular Expression (Regex) Fundamentals
Regular Expression (Regex) FundamentalsRegular Expression (Regex) Fundamentals
Regular Expression (Regex) FundamentalsMesut Günes
 
Machine Data 101
Machine Data 101Machine Data 101
Machine Data 101Splunk
 
Machine Learning + Analytics in Splunk
Machine Learning + Analytics in SplunkMachine Learning + Analytics in Splunk
Machine Learning + Analytics in SplunkSplunk
 
Machine Data 101
Machine Data 101Machine Data 101
Machine Data 101Splunk
 
Threat Hunting with Splunk
Threat Hunting with SplunkThreat Hunting with Splunk
Threat Hunting with SplunkSplunk
 
Splunk Enterprise for IT Troubleshooting Hands-On
Splunk Enterprise for IT Troubleshooting Hands-OnSplunk Enterprise for IT Troubleshooting Hands-On
Splunk Enterprise for IT Troubleshooting Hands-OnSplunk
 
Splunk for Security-Hands On
Splunk for Security-Hands OnSplunk for Security-Hands On
Splunk for Security-Hands OnSplunk
 
Machine Data 101 Hands-on
Machine Data 101 Hands-onMachine Data 101 Hands-on
Machine Data 101 Hands-onSplunk
 
Splunk Overview
Splunk OverviewSplunk Overview
Splunk OverviewSplunk
 
Threat Hunting with Splunk Hands-on
Threat Hunting with Splunk Hands-onThreat Hunting with Splunk Hands-on
Threat Hunting with Splunk Hands-onSplunk
 
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...44CON
 
Data Onboarding
Data Onboarding Data Onboarding
Data Onboarding Splunk
 
SplunkLive! Washington DC May 2013 - Search Language Beginner
SplunkLive! Washington DC May 2013 - Search Language BeginnerSplunkLive! Washington DC May 2013 - Search Language Beginner
SplunkLive! Washington DC May 2013 - Search Language BeginnerSplunk
 
Getting started with Splunk
Getting started with SplunkGetting started with Splunk
Getting started with SplunkSanjib Dhar
 
Power of SPL
Power of SPLPower of SPL
Power of SPLTian Chen
 
Using splunk6.2 labs
Using splunk6.2 labsUsing splunk6.2 labs
Using splunk6.2 labsJagadish a
 
Security Hands-On - Splunklive! Houston
Security Hands-On - Splunklive! HoustonSecurity Hands-On - Splunklive! Houston
Security Hands-On - Splunklive! HoustonSplunk
 

Viewers also liked (20)

Regular Expression (Regex) Fundamentals
Regular Expression (Regex) FundamentalsRegular Expression (Regex) Fundamentals
Regular Expression (Regex) Fundamentals
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
Machine Data 101
Machine Data 101Machine Data 101
Machine Data 101
 
Machine Learning + Analytics in Splunk
Machine Learning + Analytics in SplunkMachine Learning + Analytics in Splunk
Machine Learning + Analytics in Splunk
 
Machine Data 101
Machine Data 101Machine Data 101
Machine Data 101
 
Andrei's Regex Clinic
Andrei's Regex ClinicAndrei's Regex Clinic
Andrei's Regex Clinic
 
Threat Hunting with Splunk
Threat Hunting with SplunkThreat Hunting with Splunk
Threat Hunting with Splunk
 
Splunk Enterprise for IT Troubleshooting Hands-On
Splunk Enterprise for IT Troubleshooting Hands-OnSplunk Enterprise for IT Troubleshooting Hands-On
Splunk Enterprise for IT Troubleshooting Hands-On
 
Splunk for Security-Hands On
Splunk for Security-Hands OnSplunk for Security-Hands On
Splunk for Security-Hands On
 
Machine Data 101 Hands-on
Machine Data 101 Hands-onMachine Data 101 Hands-on
Machine Data 101 Hands-on
 
Splunk Overview
Splunk OverviewSplunk Overview
Splunk Overview
 
Threat Hunting with Splunk Hands-on
Threat Hunting with Splunk Hands-onThreat Hunting with Splunk Hands-on
Threat Hunting with Splunk Hands-on
 
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
44CON London 2015 - Indicators of Compromise: From malware analysis to eradic...
 
Data Onboarding
Data Onboarding Data Onboarding
Data Onboarding
 
SplunkLive! Washington DC May 2013 - Search Language Beginner
SplunkLive! Washington DC May 2013 - Search Language BeginnerSplunkLive! Washington DC May 2013 - Search Language Beginner
SplunkLive! Washington DC May 2013 - Search Language Beginner
 
Getting started with Splunk
Getting started with SplunkGetting started with Splunk
Getting started with Splunk
 
Power of SPL
Power of SPLPower of SPL
Power of SPL
 
Using splunk6.2 labs
Using splunk6.2 labsUsing splunk6.2 labs
Using splunk6.2 labs
 
Security Hands-On - Splunklive! Houston
Security Hands-On - Splunklive! HoustonSecurity Hands-On - Splunklive! Houston
Security Hands-On - Splunklive! Houston
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 

Similar to Field Extractions: Making Regex Your Buddy

dlux splunk>live! 2012 Beginners Session
dlux splunk>live! 2012 Beginners Sessiondlux splunk>live! 2012 Beginners Session
dlux splunk>live! 2012 Beginners SessionDavid Lutz
 
SplunkLive 2011 Beginners Session
SplunkLive 2011 Beginners SessionSplunkLive 2011 Beginners Session
SplunkLive 2011 Beginners SessionSplunk
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch BasicsShifa Khan
 
3D in the Browser via WebGL: It's Go Time
3D in the Browser via WebGL: It's Go Time 3D in the Browser via WebGL: It's Go Time
3D in the Browser via WebGL: It's Go Time Pascal Rettig
 
Image and Music: Processing plus Pure Data with libpd library
Image and Music: Processing plus Pure Data with libpd libraryImage and Music: Processing plus Pure Data with libpd library
Image and Music: Processing plus Pure Data with libpd libraryPETER KIRN
 
Migration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntoshMigration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntoshlucenerevolution
 
Instrumentation with Splunk
Instrumentation with SplunkInstrumentation with Splunk
Instrumentation with SplunkDatavail
 
MongoDB at Sailthru: Scaling and Schema Design
MongoDB at Sailthru: Scaling and Schema DesignMongoDB at Sailthru: Scaling and Schema Design
MongoDB at Sailthru: Scaling and Schema DesignDATAVERSITY
 
4Developers 2015: Lessons for Erlang VM - Michał Ślaski
4Developers 2015: Lessons for Erlang VM - Michał Ślaski4Developers 2015: Lessons for Erlang VM - Michał Ślaski
4Developers 2015: Lessons for Erlang VM - Michał ŚlaskiPROIDEA
 
Why Functional Programming and Clojure - LightningTalk
Why Functional Programming and Clojure - LightningTalkWhy Functional Programming and Clojure - LightningTalk
Why Functional Programming and Clojure - LightningTalkJakub Holy
 
Play concurrency
Play concurrencyPlay concurrency
Play concurrencyJustin Long
 
SplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptxSplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptxKhongHieu2
 
SplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptxSplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptxCazlp1
 
Conquistando el Servidor con Node.JS
Conquistando el Servidor con Node.JSConquistando el Servidor con Node.JS
Conquistando el Servidor con Node.JSCaridy Patino
 
Puppet camp europe 2011 hackability
Puppet camp europe 2011   hackabilityPuppet camp europe 2011   hackability
Puppet camp europe 2011 hackabilityPuppet
 
Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)
Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)
Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)OpenBlend society
 

Similar to Field Extractions: Making Regex Your Buddy (20)

dlux splunk>live! 2012 Beginners Session
dlux splunk>live! 2012 Beginners Sessiondlux splunk>live! 2012 Beginners Session
dlux splunk>live! 2012 Beginners Session
 
SplunkLive 2011 Beginners Session
SplunkLive 2011 Beginners SessionSplunkLive 2011 Beginners Session
SplunkLive 2011 Beginners Session
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
 
3D in the Browser via WebGL: It's Go Time
3D in the Browser via WebGL: It's Go Time 3D in the Browser via WebGL: It's Go Time
3D in the Browser via WebGL: It's Go Time
 
Image and Music: Processing plus Pure Data with libpd library
Image and Music: Processing plus Pure Data with libpd libraryImage and Music: Processing plus Pure Data with libpd library
Image and Music: Processing plus Pure Data with libpd library
 
Migration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntoshMigration from Fast ESP to Lucene Solr - Michael McIntosh
Migration from Fast ESP to Lucene Solr - Michael McIntosh
 
Erlang os
Erlang osErlang os
Erlang os
 
eZ Publish nextgen
eZ Publish nextgeneZ Publish nextgen
eZ Publish nextgen
 
Instrumentation with Splunk
Instrumentation with SplunkInstrumentation with Splunk
Instrumentation with Splunk
 
MongoDB at Sailthru: Scaling and Schema Design
MongoDB at Sailthru: Scaling and Schema DesignMongoDB at Sailthru: Scaling and Schema Design
MongoDB at Sailthru: Scaling and Schema Design
 
4Developers 2015: Lessons for Erlang VM - Michał Ślaski
4Developers 2015: Lessons for Erlang VM - Michał Ślaski4Developers 2015: Lessons for Erlang VM - Michał Ślaski
4Developers 2015: Lessons for Erlang VM - Michał Ślaski
 
Why Functional Programming and Clojure - LightningTalk
Why Functional Programming and Clojure - LightningTalkWhy Functional Programming and Clojure - LightningTalk
Why Functional Programming and Clojure - LightningTalk
 
Play concurrency
Play concurrencyPlay concurrency
Play concurrency
 
SplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptxSplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptx
 
SplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptxSplunkGettingStartedWorkshop.pptx
SplunkGettingStartedWorkshop.pptx
 
Caridy patino - node-js
Caridy patino - node-jsCaridy patino - node-js
Caridy patino - node-js
 
Conquistando el Servidor con Node.JS
Conquistando el Servidor con Node.JSConquistando el Servidor con Node.JS
Conquistando el Servidor con Node.JS
 
Puppet camp europe 2011 hackability
Puppet camp europe 2011   hackabilityPuppet camp europe 2011   hackability
Puppet camp europe 2011 hackability
 
JavaSE 7
JavaSE 7JavaSE 7
JavaSE 7
 
Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)
Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)
Java SE 7 - The Platform Evolves, Dalibor Topić (Oracle)
 

More from Michael Wilde

DockerCon17 - Building The Super-Dynamic Demo Center
DockerCon17 - Building The Super-Dynamic Demo CenterDockerCon17 - Building The Super-Dynamic Demo Center
DockerCon17 - Building The Super-Dynamic Demo CenterMichael Wilde
 
Social media & sentiment analysis splunk conf2012
Social media & sentiment analysis   splunk conf2012Social media & sentiment analysis   splunk conf2012
Social media & sentiment analysis splunk conf2012Michael Wilde
 
Do gooders unite: Save the world with technology!
Do gooders unite: Save the world with technology!Do gooders unite: Save the world with technology!
Do gooders unite: Save the world with technology!Michael Wilde
 
Interop - Exploring Machine Data
Interop - Exploring Machine DataInterop - Exploring Machine Data
Interop - Exploring Machine DataMichael Wilde
 
Big Data for Everyman
Big Data for EverymanBig Data for Everyman
Big Data for EverymanMichael Wilde
 
Splunk User Group - Austin - Kickoff Meeting
Splunk User Group - Austin - Kickoff MeetingSplunk User Group - Austin - Kickoff Meeting
Splunk User Group - Austin - Kickoff MeetingMichael Wilde
 
Splunk @ Amazon Startup - Austin, TX - 9/11/2008
Splunk @ Amazon Startup - Austin, TX - 9/11/2008Splunk @ Amazon Startup - Austin, TX - 9/11/2008
Splunk @ Amazon Startup - Austin, TX - 9/11/2008Michael Wilde
 

More from Michael Wilde (7)

DockerCon17 - Building The Super-Dynamic Demo Center
DockerCon17 - Building The Super-Dynamic Demo CenterDockerCon17 - Building The Super-Dynamic Demo Center
DockerCon17 - Building The Super-Dynamic Demo Center
 
Social media & sentiment analysis splunk conf2012
Social media & sentiment analysis   splunk conf2012Social media & sentiment analysis   splunk conf2012
Social media & sentiment analysis splunk conf2012
 
Do gooders unite: Save the world with technology!
Do gooders unite: Save the world with technology!Do gooders unite: Save the world with technology!
Do gooders unite: Save the world with technology!
 
Interop - Exploring Machine Data
Interop - Exploring Machine DataInterop - Exploring Machine Data
Interop - Exploring Machine Data
 
Big Data for Everyman
Big Data for EverymanBig Data for Everyman
Big Data for Everyman
 
Splunk User Group - Austin - Kickoff Meeting
Splunk User Group - Austin - Kickoff MeetingSplunk User Group - Austin - Kickoff Meeting
Splunk User Group - Austin - Kickoff Meeting
 
Splunk @ Amazon Startup - Austin, TX - 9/11/2008
Splunk @ Amazon Startup - Austin, TX - 9/11/2008Splunk @ Amazon Startup - Austin, TX - 9/11/2008
Splunk @ Amazon Startup - Austin, TX - 9/11/2008
 

Recently uploaded

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Recently uploaded (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 

Field Extractions: Making Regex Your Buddy

  • 1. Making Reg[Ee]x Your Buddy August  15,  2011 (?i)(mi(chael|ke)  wilde),  Splunk  Ninja Thursday, August 18, 11
  • 2. Hi,  I’m  Michael  Wilde • You  may  know  me  from: Splunk Worldwide Users’ Conference 2 © Copyright Splunk 2011 Thursday, August 18, 11
  • 3. What  is  RegEx “Finite  Automata” •Regular  Expression  invented  in  the  1950’s  by   mathemaUcian  Stephen  Cole  Kleene •Implemented  by  “ed”  and  “grep”  creator  Ken   Thompson  in  1973 Pa[ern  matching  language  for  text  processing •Has  slightly  different  implementaUons  (PERL,  POSIX)   •Way  crypUc  at  first  sight Splunk Worldwide Users’ Conference 3 © Copyright Splunk 2011 Thursday, August 18, 11
  • 4. Why  should  you  care •Field  extracUon  is  a  requirement  for  reporUng •Index-­‐Ume  filtering  &  rouUng •You’ll  seem  smart •It  will  be  useful  beyond  Splunk •You  might  score  with  the  (ladies|dudes)  at   (MakersFaire  |ComiCon). Splunk Worldwide Users’ Conference 4 © Copyright Splunk 2011 Thursday, August 18, 11
  • 6. Thinking  Regex •Log  Events  are  a  great  place  to  start,  they  have  structure •Don’t  overthink  it.    The  pa[ern  is  there  waiUng  to   discovered •Don’t  be  lazy  and  use  wildcards  too  much •Learn  to  love  “NOT”  regexes.  S+  D+  W+  [^,]+ Splunk Worldwide Users’ Conference 6 © Copyright Splunk 2011 Thursday, August 18, 11
  • 7. Splunk Worldwide Users’ Conference 7 © Copyright Splunk 2011 Thursday, August 18, 11
  • 8. Be  nice  to  your  RegEx  engine • MS-­‐DOS  taught  us  to  be   laaaaaaaaaaaaaaaaazy  with  *.* • A  regex  engine  matches  character  by   character,  and  then  does  backtracking. • Match  in  as  few  steps  as  possible Splunk Worldwide Users’ Conference 8 © Copyright Splunk 2011 Thursday, August 18, 11
  • 9. Regexes  in  Splunk Search Language: “rex”, “erex”, “regex” Indexing: Filtering data (in|out), line breaking, timestamp extraction Field Extraction Thursday, August 18, 11
  • 10. IFX • Splunk  has  a  built  in  "interacUve  field  extractor" • It  can  be  useful.  Give  it  samples  of  data,  and  it  will  a[empt  to   learn  a  regex  and  persist  a  single  field • It  has  a  limitaUon  of  the  amount  of  events  to  display  in  its   viewer. • You  might  not  see  your  search  results  when  using  it?    Huh? Splunk Worldwide Users’ Conference 10 © Copyright Splunk 2011 Thursday, August 18, 11
  • 11. what  if  we  could  use  that  "intelligent"  stuff IFX  was  doing  but  in  the  search  language   • Thursday, August 18, 11 Splunk Worldwide Users’ Conference 11 © Copyright Splunk 2011
  • 12. meet  "erex" • Allows  you  to  give  it  examples,  but  it  works  on  your   search  results • Allows  you  to  give  it  counterexamples  of  stuff  you   don't  want  to  match  on • Builds  you  a  proper  rex  command Splunk Worldwide Users’ Conference 12 © Copyright Splunk 2011 Thursday, August 18, 11
  • 13. ...there's  an  app  for  that. right?   Splunk Worldwide Users’ Conference 13 © Copyright Splunk 2011 Thursday, August 18, 11
  • 14. Field  Extractor  App • Imagine  you  could  use  your   mouse,  highlight  fields,  name   them,  persist  them,  go  home   early  and  never  write  regex. • David  Carasso's  Field  Extractor   app  is  like  a  "workbench  for  field   extracUon" • Download  it  from  SplunkBase Splunk Worldwide Users’ Conference 14 © Copyright Splunk 2011 Thursday, August 18, 11
  • 16. the  |  regex  search  command • Did  you  know  splunk  crushes  all  terms  to  lower  case? • If  you  need  to  look  for  specific  pa;erns  or  even   words  and  respect  the  case  the  original  events  are  in,   use  |  regex • index=splunktv|regex  _raw="(MP3|M4A)"  <-­‐-­‐noMce   this  is  a  case  sensiMve  pa;ern  match. Splunk Worldwide Users’ Conference 16 © Copyright Splunk 2011 Thursday, August 18, 11
  • 17. What  about  good  ole  Rex? • Search  Ume  field   extracUons  via  your  own   regexes  -­‐-­‐  in  the  search   language • Name  your  fields • Reuse  everyone  elses   work! Splunk Worldwide Users’ Conference 17 © Copyright Splunk 2011 Thursday, August 18, 11
  • 18. a  few  more  tricks  for  you Splunk Worldwide Users’ Conference 18 © Copyright Splunk 2011 Thursday, August 18, 11
  • 19. host  extracUon  irritates  me Splunk Worldwide Users’ Conference 19 © Copyright Splunk 2011 Thursday, August 18, 11
  • 20. regex in host extraction • Splunk will attempt to do the right thing. The log source will likely make it hard for Splunk, and you'll blame Splunk • Props.conf & transforms.conf are needed to properly extract hostnames in some cases (F5 Big-IP and HP networking gear) • Use default settings in props.conf and use your own settings as well
  • 21. priority boarding in props.conf
          [source::...a...]
          TRANSFORMS-ahosts = ahostextraction
          priority = 1
          [source::...z...]
          TRANSFORMS-zhosts = zhostextraction
          priority = 99
      What if the source we were matching against had the word "arizona" in it? It will match both, right? Use "Priority" to control matching. 99 is higher than 1, so 99 is a higher priority. Yeah, I know... weird.
  • 22. Basic Training Complete! Let's do something more difficult
  • 23. Splunk is so smart, except when it's not
      <policy id="3">Finjan HTTPS policy</policy>
      <cp id="5" name="Active Content" display_name="Active Content"/>
      <group id="5002" cp_id="5" type="0">Full profile - Binary Behavior</group>
      <item id="28015">Format error in CRL lastUpdate field</item>
      <item id="3265747">*.served.com/*</item>
      <rule_comment id="2" name="Block certificate validation errors">&lt;![CDATA[Block HTTPS content without a valid certificate]]&gt;</rule_comment>
      AUTO-KV pulled the “id” field out of every event. Yay!!!
  • 24. “id” is not the field name. Look closer, Agent Starling
      <policy id="3">Finjan HTTPS policy</policy>
      <cp id="5" name="Active Content" display_name="Active Content"/>
      <group id="5002" cp_id="5" type="0">Full profile - Binary Behavior</group>
      <item id="28015">Format error in CRL lastUpdate field</item>
      <rule_comment id="2" name="Block certificate validation errors">&lt;![CDATA[Block HTTPS content without a valid certificate]]&gt;</rule_comment>
      We can educate Splunk on dynamically pulling the KEY and VALUE with...
  • 25. Dynamic Key Value Extraction ...but tailored for our needs
      REGEX for the “KEY” is <([^=]+)=
      Less than, followed by (anything that is “not an equal sign”, greedy match), followed by an equal sign. Keep going dude!
      REGEX for the “VALUE” is "([^"]+)">
      A quote, (followed by anything that is not a quote, greedy match), followed by a quote, followed by a greater than sign.
      Sample tags: <policy id="3">   <cp id="5"   <item id="28015">
  • 26. Persist your sweet dynamic KV patterns (props.conf & transforms.conf required)
      Create an entry in props.conf like this:
          [m86_dynamic_kv]
          REPORT-m86fields = mym86kv
      Create an entry in transforms.conf like this:
          [mym86kv]
          REGEX = <([^=]+)="([^"]+)">
          FORMAT = $1::$2
      $1 is the key capture group and $2 is the value capture group, as in: <policy id="3">Finjan HTTPS policy</policy>
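As a rough check of what that REGEX captures, the sketch below runs the same pattern with Python's `re` module (an assumption: Python stands in for Splunk's engine here, and `group(1)`/`group(2)` play the role of `$1`/`$2`). The sample lines are copied from the earlier slides.

```python
import re

# Sample lines copied from the earlier slides.
lines = [
    '<policy id="3">Finjan HTTPS policy</policy>',
    '<item id="28015">Format error in CRL lastUpdate field</item>',
    '<cp id="5" name="Active Content" display_name="Active Content"/>',
]

# Same pattern as the REGEX line in the transforms.conf entry.
kv = re.compile(r'<([^=]+)="([^"]+)">')

# One search per line; None means the pattern did not match that line.
results = [kv.search(line) for line in lines]

for line, m in zip(lines, results):
    # group(1) is the key ($1), group(2) is the value ($2).
    print(line, "->", m.groups() if m else None)
```

The first two lines yield a key/value pair, while the `<cp ...>` line does not match at all, because its first quoted value is not immediately followed by `>`.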
  • 27. Dang it! It wasn’t perfect: some of our events don’t finish their XML tag right after a quote
      Create an entry in props.conf like this:
          [m86_dynamic_kv]
          REPORT-m86fields = mym86kv
      Create an entry in transforms.conf like this:
          [mym86kv]
          REGEX = <([^=]+)="([^"]+)[^>]+>
          FORMAT = $1::$2
      Example event (truncated on the slide): <rule_comment id="690" name="Log everythin Image files">&lt;![CDATA[Logs all content passin the system except for ......
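A small sketch of the relaxed pattern, again using Python's `re` as a stand-in for Splunk's engine (an assumption), shows that tags with extra attributes now match. The sample lines come from the earlier slides.

```python
import re

# Sample lines copied from the earlier slides.
lines = [
    '<policy id="3">Finjan HTTPS policy</policy>',
    '<cp id="5" name="Active Content" display_name="Active Content"/>',
    '<group id="5002" cp_id="5" type="0">Full profile - Binary Behavior</group>',
]

# Relaxed pattern from this slide: after the first quoted value,
# [^>]+ swallows everything up to the closing '>' of the tag.
relaxed = re.compile(r'<([^=]+)="([^"]+)[^>]+>')

# (key, value) for the first attribute of each tag.
pairs = [relaxed.search(line).groups() for line in lines]
print(pairs)
```

Note that in this sketch only the first attribute of each tag is captured; the remaining attributes are consumed by `[^>]+` rather than extracted.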
  • 28. Think you’re good? Try extracting the “service” field
      2011/07/21 19:27:22.071 [(ninja-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ninja-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      Your job is to create a multi-valued field, as the “service” field exists multiple times in each event
  • 29. Look for the obvious patterns
      2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      Your brain will tell you to look for “anything after the first comma” after that left bracket and before the second comma
  • 30. ...and your brain was wrong.
      2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      This is NOT a “service”. Dang... what are we gonna do now?
  • 31. What is common with “services”
      2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      They’re all alphanumeric or “word” characters: 0-9A-Za-z_
  • 32. But what about the preceding text
      2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      Left bracket followed by some stuff, followed by a comma... but it's not consistent. Sometimes a “(” left paren is in there.
  • 33. This is a better match
      2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      Pattern: [[(-a-zA-Z0-9]+,([a-zA-Z]+),
      Say the matching pattern out loud. It will help.
      Left bracket, followed by anything in this character list (greedy). Followed by a comma, and then create a capturing group of text that matches upper or lower case roman alphabet, greedy (as many times as possible). End capturing group, then followed by a comma.
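The narration can be tried out in a few lines of Python. This is a sketch under stated assumptions: Python's `re` stands in for Splunk's engine, the slide's line wraps are removed from the event, and the pattern is written the way the narration reads it, that is, an escaped literal left bracket followed by a character class of `(`, `-`, letters and digits. As transcribed, the slide's pattern opens a single character class instead, so treat the escaping here as an interpretation, not the author's exact regex.

```python
import re

# Event from the slide, with the slide's line wraps removed (assumption).
event = (
    "2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 "
    "19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] "
    "recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:"
    "1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)="
    "[Principal[3],[OSM:1t7Dg201000:i:1311276439:"
    "20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms"
)

# Literal '[', then one or more of ( - a-z A-Z 0-9, then a comma,
# then a captured run of letters, then another comma.
service_re = re.compile(r"\[[(\-a-zA-Z0-9]+,([a-zA-Z]+),")

# A single search returns only the first service in the event.
first_service = service_re.search(event).group(1)
print(first_service)
```

A single search stops at the first hit, which is why the following slides look for ways to pick up the remaining services.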
  • 34. Can’t be too hard to extend it, right?
      2011/07/21 19:27:22.071 [(ela4-fe96,opensocial,/makeRequest,2011/07/21 19:27:21.978)[ela4-be04,auth,Auth2Service.recoverSubject]] [] [Auth2Service] recoverSubject(V1.21.47,OSM:1t7Dg201000:i:1311276436:1d00a2fc1f9addd936af12ed5c430a169c362af8,null,shindig,172.17.207.243,)=[Principal[3],[OSM:1t7Dg201000:i:1311276439:20d1d0b474927a301376d70f2ad5949a2241e271,false,1h]] in 1ms
      Pattern: [[(-a-zA-Z0-9]+,([a-zA-Z]+),[^[]+[[(-a-zA-Z0-9]+,([a-zA-Z]+),
      Left bracket, followed by anything in this character list (greedy). Followed by a comma, and then create a capturing group of text that matches upper or lower case roman alphabet, greedy (as many times as possible). End capturing group, then followed by a comma. Followed by anything that is NOT a left bracket, followed by.....
  • 35. Sad Trombone: this one has four services
      2011/07/21 19:27:27.596 [(ninja4-fe29,genie,/handle,131292312,2011/07/21 19:27:27.310)[ninja4-be716,lmt,PbContentService.write<tetherAccountData;default>][ninja4-be05,tether,TetherAccountService.bindAccount][ninja4-be393,auth,Auth2Service.upgradeSubject]] [] [Auth2Service] upgradeSubject(V1.21.49,"INT",[LIM:131292312:s:1311276361:b8f677d957eb3f7b9622247b72374c791720bc17,true],{internalAppName=twitter-sync},"tether",null)=[Principal[2],[INT:131292312/twitter-sync:1311276447:df9dd0175bd2e6107c2dfae36dfd9a9dc11f0631,false,20y]] in 15ms
  • 36. Remember “rex”? He devours data. But you can make “rex” very hungry and control how much lunch he eats. By default, he only gets “one helping of meat”.
  • 37. Using max_match with rex: you limit or expand the number of times it runs
          rex max_match=20 "[[(-a-zA-Z0-9]+,(?<service>[a-zA-Z]+),"
      Instead of that last regex that matched “two” services, let's just match one, and tell rex to repeat our pattern matching
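The idea of matching one service and repeating the match can be sketched in Python with `finditer`. Assumptions, stated loudly: Python's `re` stands in for Splunk's engine; Python spells the named group `(?P<service>...)` where the slide's rex uses `(?<service>...)`; the first bracket and the hyphen are escaped to follow the spoken description (literal left bracket, then a character class), which is an interpretation of the slide's pattern; and the `max_match` variable is only a stand-in that caps the number of matches kept. The event is the four-service one from the earlier slide with its line wraps removed.

```python
import re

# Four-service event from the earlier slide, line wraps removed (assumption).
event = (
    '2011/07/21 19:27:27.596 [(ninja4-fe29,genie,/handle,131292312,2011/07/21 '
    '19:27:27.310)[ninja4-be716,lmt,PbContentService.write<tetherAccountData;default>]'
    '[ninja4-be05,tether,TetherAccountService.bindAccount]'
    '[ninja4-be393,auth,Auth2Service.upgradeSubject]] [] [Auth2Service] '
    'upgradeSubject(V1.21.49,"INT",[LIM:131292312:s:1311276361:'
    'b8f677d957eb3f7b9622247b72374c791720bc17,true],'
    '{internalAppName=twitter-sync},"tether",null)=[Principal[2],'
    '[INT:131292312/twitter-sync:1311276447:'
    'df9dd0175bd2e6107c2dfae36dfd9a9dc11f0631,false,20y]] in 15ms'
)

# One service per match; the named group mirrors the slide's "service" field.
service_re = re.compile(r"\[[(\-a-zA-Z0-9]+,(?P<service>[a-zA-Z]+),")

# Stand-in for rex's max_match: keep at most this many matches.
max_match = 20
services = [m.group("service") for m in service_re.finditer(event)][:max_match]
print(services)
```

Repeating one simple pattern scales to any number of services per event, whereas the chained two-service pattern has to be rewritten every time the count changes.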
  • 38. You can persist this in config files (props.conf & transforms.conf required)
      Create an entry in props.conf like this:
          [ninjasocial]
          REPORT-ninjafields = myepicregex
      Create an entry in transforms.conf like this:
          [myepicregex]
          REGEX = [[(-a-zA-Z0-9]+,(?<service>[a-zA-Z]+),
          MV_ADD = TRUE
  • 39. And now for something difficult: gaming logs - Team Fortress
      L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")
  • 40. I need the data: gaming logs - Team Fortress
      L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")
  • 41. Who’s who? How do we know who did what to whom?
      L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")
  • 42. Fields for the first player: actor, actor_id, actor_team, actor_type
      L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")
      Fields for the second player: actee, actee_id, actee_type, actee_team
  • 43. Didn’t we see this slide before? How do we know who did what to whom?
      L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")
  • 44. See that pattern? Remember “max_match”?
      L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed "MoreGun<56><BOT><Blue>" with "flamethrower" (attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")
  • 45. See that pattern? Remember “max_match”?
      "The Administrator<61><BOT><Red>"
      "MoreGun<56><BOT><Blue>"
      Using rex / mv_add, let's capture it into some temporary “multi-value” fields
  • 46. “Temporary” Multi-Value Fields
      actor_name_z    The Administrator,MoreGun
      actor_id_z      61,56
      actor_type_z    BOT,BOT
      actor_team_z    Red,Blue
      Using rex / mv_add, let's capture it into some temporary “multi-value” fields
  • 47. Evaluate & Transform with “mvindex”: multi-value fields have a “position value” in the array
      mvindex         0                    1
      actor_name_z    The Administrator    MoreGun
      actor_id_z      61                   56
      actor_type_z    BOT                  BOT
      actor_team_z    Red                  Blue
  • 48. It's time for our fields to split up! Multi-value fields have a “position value” in the array
          | eval actor_name = mvindex(actor_name_z,0) | eval actee_name = mvindex(actor_name_z,1)
      actor_name = The Administrator
      actee_name = MoreGun
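The capture-then-split idea can be sketched end to end in Python. The assumptions are considerable: Python's `re` stands in for rex, Python lists stand in for multi-value fields, list indexing stands in for `mvindex`, and the player pattern below is hypothetical (the slides do not show the regex that was used). Only the event text and the field names come from the slides.

```python
import re

# Team Fortress event from the slides.
event = (
    'L 08/02/2011 - 11:46:05: "The Administrator<61><BOT><Red>" killed '
    '"MoreGun<56><BOT><Blue>" with "flamethrower" '
    '(attacker_position "-2677 2177 -127") (victim_position "-2555 2323 -127")'
)

# Hypothetical pattern: a quoted "name<id><type><team>" block.
player_re = re.compile(r'"([^"<]+)<(\d+)><([^>]+)><([^>]+)>"')

# Every match is one player; findall repeats the pattern like max_match would.
players = player_re.findall(event)

# Transpose the matches into the temporary "multi-value" fields.
actor_name_z, actor_id_z, actor_type_z, actor_team_z = (
    list(column) for column in zip(*players)
)

# List indexing plays the role of mvindex(field, position).
actor_name = actor_name_z[0]
actee_name = actor_name_z[1]

print(actor_name_z, actor_id_z, actor_type_z, actor_team_z)
print(actor_name, "->", actee_name)
```

Position 0 holds the player who acted and position 1 the player acted upon, so the same indexing would presumably give actor_id/actee_id, actor_team/actee_team and so on.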
  • 49. Resources • regexlib.com • regular-expressions.info • gskinner.com/RegExr • Reggy / RegExhibit • RegexBuddy (JGSoft.com)
  • 50. Questions, just ask! Michael Wilde, Splunk Ninja ninja@splunk.com