Splunk Dynamic lookup


Published in: Technology

  1. 1. Dynamic Lookups
  2. 2. Agenda
      • Lookups in General
      • Static Lookups
      • Dynamic Lookups
        - Retrieve fields from a web site
        - Retrieve fields from a database
        - Retrieve fields from a persistent cache
  3. 3. Enrich Your Events with Fields from External Sources
  4. 4. Splunk: The Engine for Machine Data
      • Customer Facing Data: Click-stream data, Shopping cart data, Online transaction data
      • Outside the Datacenter: Manufacturing, logistics…, CDRs & IPDRs, Power consumption, RFID data, GPS data
      • Data types: Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Changes, Tickets
      • Virtualization & Cloud: Hypervisor, Guest OS, Apps, Cloud
      • Windows: Registry, Event logs, File system, sysinternals
      • Linux/Unix: Configurations, syslog, File system, ps, iostat, top
      • Applications: Web logs, Log4J, JMS, JMX, .NET events, Code and scripts
      • Databases: Configurations, Audit/query logs, Tables, Schemas
      • Networking: Configurations, syslog, SNMP, netflow
  9. 9. Interesting Things to Lookup
      • User’s Mailing Address
      • Error Code Descriptions
      • Product Names
      • Stock Symbol (from CUSIP)
      • External Host Address
      • Database Query
      • Web Service Call for Status
      • Geo Location
  10. 10. Other Reasons For Lookup
      • Bypass a static developer or vendor that does not enrich logs
      • Imaginative correlations
        - Example: A website URL with “Like” or “Dislike” count stored in an external source
      • Make your data more interesting
        - Better to see textual descriptions than arcane codes
  11. 11. Agenda
      • Lookups in General
      • Static Lookups
      • Dynamic Lookups
        - Retrieve fields from a web site
        - Retrieve fields from a database
        - Retrieve fields from a persistent cache
  12. 12. Static vs. Dynamic Lookup
      • Static: external data comes from a CSV file
      • Dynamic: external data comes from the output of an external script, which resembles a CSV file
  13. 13. Static Lookup Review
      • Pick the input fields that will be used to get output fields
      • Create or locate a CSV file that has all the fields you need in the proper order
      • Tell Splunk via the Manager about your CSV file and your lookup
        - You can also define lookups manually via props.conf and transforms.conf
        - If you use automatic lookups, they will run every time the source, sourcetype or associated host stanza is used in a search
        - Non-automatic lookups run only when the lookup command is invoked in the search
  14. 14. Example Static Lookup Conf Files
      props.conf
          [access_combined]
          lookup_http = http_status status OUTPUT status_description, status_type
      transforms.conf
          [http_status]
          filename = http_status.csv
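As a rough illustration of what the static lookup above does, the following sketch joins an event's `status` field against an in-memory copy of a CSV file. It is not Splunk code: the CSV rows and the helper names `load_table` and `apply_static_lookup` are assumptions made up for this example, and only the field names come from the slide.

```python
import csv
import io

# Hypothetical contents of http_status.csv; only the column names come from the slide.
HTTP_STATUS_CSV = """status,status_description,status_type
200,OK,Successful
404,Not Found,Client Error
500,Internal Server Error,Server Error
"""

def load_table(csv_text, key_field):
    """Index the CSV rows by the input field of the lookup."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}

def apply_static_lookup(event, table, key_field, output_fields):
    """Return a copy of the event with the output fields added when the key matches."""
    enriched = dict(event)
    match = table.get(event.get(key_field))
    if match is not None:
        for field in output_fields:
            enriched[field] = match[field]
    return enriched

table = load_table(HTTP_STATUS_CSV, "status")
event = {"clientip": "10.0.0.1", "status": "404"}
result = apply_static_lookup(event, table, "status",
                             ["status_description", "status_type"])
print(result)
```

Events whose status has no row in the table pass through unchanged, which mirrors how an unmatched lookup simply adds no fields.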
  15. 15. Permissions
      Define lookups via Splunk Manager and set permissions there.
      local.meta
          [lookups/http_status.csv]
          export = system
          [transforms/http_status]
          export = system
  16. 16. Example Automatic Static Lookup
  17. 17. Agenda
      • Lookups in General
      • Static Lookups
      • Dynamic Lookups
        - Retrieve fields from a web site
        - Retrieve fields from a database
        - Retrieve fields from a persistent cache
  18. 18. Dynamic Lookups
      • Write the script to simulate access to the external source
      • Test the script with one set of inputs
      • Create the Splunk version of the lookup script
      • Register the script with Splunk via Manager or conf files
      • Test the script explicitly before using automatic lookups
  19. 19. Lookups vs Custom Command
      • Use dynamic lookups when returning fields given input fields
        - Standard use case for users who are already familiar with lookups
      • Use a custom command when doing MORE than a lookup
        - Not all use cases involve just returning fields
        - Decrypt event data
        - Translate event data from one format to another with new fields (e.g. FIX)
  20. 20. Write/Test External Field Gathering Script
      [Diagram: Your Python Script sends input fields to External Data in Cloud, which returns output fields]
  21. 21. Example Script to Test External Lookup
      import socket
      # Given a host, find the corresponding IP address
      def mylookup(host):
          try:
              # Returns a (hostname, aliaslist, ipaddrlist) tuple
              ipaddrlist = socket.gethostbyname_ex(host)
              return ipaddrlist
          except socket.error:
              return []
  22. 22. External Field Gathering Script with Splunk
      [Diagram labels: External Data in Cloud; Your Python Script; Return: Output Fields]
  23. 23. Script for Splunk Simulates Reading Input CSV
      hostname, ip
  24. 24. Output of Script Returns Logically Complete CSV
      hostname, ip,,,
  25. 25. transforms.conf for Dynamic Lookup
      [NameofLookup]
      external_cmd = <name>.py field1 … fieldN
      external_type = python
      fields_list = field1, …, fieldN
  26. 26. Example Dynamic Lookup conf files
      transforms.conf
          # Note: this is an explicit lookup
          [whoisLookup]
          external_cmd = ip whois
          external_type = python
          fields_list = ip, whois
  27. 27. Dynamic Lookup Python Flow
      def lookup(input):
          Perform external lookup based on input.
          Return result
      main()
          Check standard input for CSV headers.
          Write headers to standard output.
          For each line in standard input (input fields):
              Gather input fields into a dictionary (key-value structure)
              ret = lookup(input fields)
              If ret:
                  Send to standard output input values and return values from lookup
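The flow above can be sketched as a small, self-contained Python 3 program. This is a minimal sketch under stated assumptions, not the deck's actual script: `run_lookup`, the in-memory `fake_source` data, and the sample host names are hypothetical, and the streams are passed in as arguments so the logic can be exercised without Splunk (a real external lookup would pass `sys.stdin` and `sys.stdout`).

```python
import csv
import io

def lookup(value):
    """Stand-in for the external source; a real script would call out here."""
    fake_source = {"www.example.com": "192.0.2.10"}  # hypothetical data
    return fake_source.get(value, "")

def run_lookup(in_field, out_field, infile, outfile):
    """Read CSV rows, fill in the missing output field, write CSV rows back."""
    reader = csv.DictReader(infile)
    header = reader.fieldnames or []
    if in_field not in header or out_field not in header:
        raise ValueError("input and output fields must both be in the CSV header")
    writer = csv.DictWriter(outfile, fieldnames=header,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only call the external source when the output field is still empty.
        if row[in_field] and not row[out_field]:
            row[out_field] = lookup(row[in_field])
        # Rows with no result are skipped, as in the "If ret:" step of the flow.
        if row[out_field]:
            writer.writerow(row)

sample_in = io.StringIO("hostname,ip\nwww.example.com,\nunknown.invalid,\n")
sample_out = io.StringIO()
run_lookup("hostname", "ip", sample_in, sample_out)
print(sample_out.getvalue())
```

Passing the streams in rather than hard-coding standard input and output is a design choice for testability; it matches the earlier advice to test the script explicitly before registering it.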
  28. 28. Whois Lookup
      import csv
      import sys
      def main():
          if len(sys.argv) != 3:
              print("Usage: python [ip field] [whois field]")
              sys.exit(0)
          ipf = sys.argv[1]
          whoisf = sys.argv[2]
          r = csv.reader(sys.stdin)
          w = None
          header = []
          first = True
          …
  29. 29. Whois Lookup (cont.) to Read CSV Header
      # First read the "CSV header" and output the field names
      for line in r:
          if first:
              header = line
              if whoisf not in header or ipf not in header:
                  print("IP and whois fields must exist in CSV data")
                  sys.exit(0)
              csv.writer(sys.stdout).writerow(header)
              w = csv.DictWriter(sys.stdout, header)
              first = False
              continue
          …
  30. 30. Whois Lookup (cont.) to Populate Input Fields
          # Read the result and populate the values for the input fields (ipaddress in our case)
          result = {}
          i = 0
          while i < len(header):
              if i < len(line):
                  result[header[i]] = line[i]
              else:
                  result[header[i]] = ''
              i += 1
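The padding loop on this slide can also be written more compactly. The helper below is a hypothetical equivalent (the name `row_to_dict` is not from the deck): it maps each header name to the row's value and uses an empty string for any trailing values the row does not supply.

```python
def row_to_dict(header, line):
    """Map header names to the row's values, using '' for missing trailing values."""
    padded = list(line) + [""] * (len(header) - len(line))
    # zip stops at the shorter sequence, so extra values in the row are ignored.
    return dict(zip(header, padded))

print(row_to_dict(["ip", "whois"], ["192.0.2.1"]))
```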
  31. 31. Whois Lookup (cont.) to Populate Input Fields
          # Perform the whois lookup if necessary
          if len(result[ipf]) and len(result[whoisf]):
              w.writerow(result)
          # Else call external website to get whois field from the ip address as the key
          elif len(result[ipf]):
              result[whoisf] = lookup(result[ipf])
              if len(result[whoisf]):
                  w.writerow(result)
  32. 32. Whois Lookup Function
      import urllib
      LOCATION_URL = "…"
      # Given an ip, return the whois response
      def lookup(ip):
          try:
              # Python 2 API; Python 3 moved this to urllib.request.urlopen
              whois_ret = urllib.urlopen(LOCATION_URL + ip)
              lines = whois_ret.readlines()
              return lines
          except:
              return ''
  33. 33. Database Lookup
      • Acquire proper modules to connect to the database
      • Connect and authenticate to the database
        - Use a connection pool if possible
      • Have the lookup function query the database
        - Return a list ([]) of results
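The steps above can be sketched with Python's built-in `sqlite3` module, swapped in here for a real database server so the example runs on its own. The `city_country` table name follows the MySQL example later in this deck; the sample rows and the `connect` helper are assumptions for illustration only.

```python
import sqlite3

def connect():
    """Open one connection up front and reuse it for every lookup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE city_country (city TEXT PRIMARY KEY, country TEXT)")
    conn.executemany(
        "INSERT INTO city_country (city, country) VALUES (?, ?)",
        [("Paris", "France"), ("Osaka", "Japan")],  # hypothetical sample rows
    )
    return conn

def lookup(city, conn):
    """Return a list of matching countries; the list is empty if the city is unknown."""
    # The city is passed as a bound parameter, never concatenated into the SQL text.
    cur = conn.execute(
        "SELECT country FROM city_country WHERE city = ?", (city,)
    )
    return [row[0] for row in cur.fetchall()]

conn = connect()
print(lookup("Paris", conn))
print(lookup("Atlantis", conn))
```

Binding the input as a parameter matters for lookups in particular, because the key comes straight from event data that the script does not control.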
  34. 34. Database Lookup vs. Database Sent To Index
      • Well, it depends…
      • Use a lookup when:
        - Using needle-in-the-haystack searches with a few users
        - Using form searches returning few results
      • Index the database table or view when:
        - Having LOTS of users and ad hoc reporting is needed
        - It’s OK to have “stale” data (N minutes old) for a dynamic database
  35. 35. Example Database Lookup using MySQL
      # First connect to the DB outside of the for loop
      conn = MySQLdb.connect(host="localhost",
                             user="name of user",
                             passwd="password",
                             db="Name of DB")
      cursor = conn.cursor()
  36. 36. Example Database Lookup (cont.) using MySQL
      import MySQLdb
      …
      # Given a city, find its country
      def lookup(city, cur):
          try:
              # Pass the city as a query parameter instead of concatenating it into the SQL
              cur.execute("SELECT country FROM city_country WHERE city = %s", (city,))
              row = cur.fetchone()
              return row[0]
          except:
              return []
  37. 37. Lookup Using Key Value Persistent Cache
      • Download and install Redis
      • Download and install the Redis Python module
      • Import the Redis module in Python and populate the key-value DB
      • Import the Redis module in the lookup function given to Splunk to look up a value given a key
      Sidebar: Redis is an open source, advanced key-value store.
  38. 38. Redis Lookup
      ### CHANGE PATH according to your Redis install ###
      sys.path.append("/Library/Python/2.6/…/redis-2.4.5-py.egg")
      import redis
      …
      def main():
          …
          # Connect to Redis; change for your distribution
          pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
          redp = redis.Redis(connection_pool=pool)
  39. 39. Redis Lookup (cont.)
      def lookup(redp, mykey):
          try:
              return redp.get(mykey)
          except:
              return ""
  40. 40. Combine Persistent Cache with External Lookup
      • For data that is “relatively static”
        - First see if the data is in the persistent cache
        - If not, look it up in the external source such as a database or web service
        - If results come back, add results to the persistent cache and return results
      • For data that changes often, you will need to create your own cache retention policies
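The cache-then-source pattern above, plus one possible retention policy, can be sketched as follows. Everything here is an assumption for illustration: `DictCache` is an in-memory stand-in for a persistent store such as Redis, `external_lookup` fakes the slow source, and the time-to-live rule is just one simple policy among many.

```python
import time

class DictCache:
    """Tiny in-memory stand-in for a persistent key-value store such as Redis."""
    def __init__(self, ttl_seconds=None, clock=time.time):
        self._data = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self._ttl is not None and self._clock() - stored_at > self._ttl:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self._clock())

calls = []

def external_lookup(key):
    """Hypothetical slow source (database, web service); records each call."""
    calls.append(key)
    return {"192.0.2.1": "Example Net"}.get(key, "")

def cached_lookup(cache, key):
    """Check the cache first, fall back to the external source, then store the result."""
    value = cache.get(key)
    if value:
        return value
    value = external_lookup(key)
    if value:
        cache.set(key, value)
    return value

cache = DictCache(ttl_seconds=300)
first = cached_lookup(cache, "192.0.2.1")
second = cached_lookup(cache, "192.0.2.1")
print(first, second, calls)
```

Only non-empty results are cached, so a failed or empty external lookup is retried on the next request instead of being remembered.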
  41. 41. Combining Redis with Whois Lookup
      def lookup(redp, ip):
          try:
              ret = redp.get(ip)
              if ret is not None and ret != '':
                  return ret
              else:
                  whois_ret = urllib.urlopen(LOCATION_URL + ip)
                  lines = whois_ret.readlines()
                  if lines:
                      redp.set(ip, lines)
                  return lines
          …
          except:
              …
  42. 42. Where do I get the add-ons from today? Splunkbase!
      Add-On                               | Download Location                           | Release
      Whois add-on                         | http://splunk-…                             | 4.x
      DBLookup                             | …using-a-database                           | 4.x
      Redis Lookup                         | …lookup                                     | 4.x
      Geo IP Lookup (not in these slides)  | …location-lookup-script-powered-by-maxmind  | 4.x
  43. 43. Conclusion
      Lookups are a powerful way to enhance your search experience beyond indexing the data.
  44. 44. Thank You