Splunk Dynamic lookup



Published in: Technology
  • Splunk is a data engine for your machine data. It gives you real-time visibility and intelligence into what's happening across your IT infrastructure, whether it's physical, virtual or in the cloud. Everybody now recognizes the value of this data; the problem up to now has been getting to it. At Splunk we applied the search engine paradigm to rapidly harness any and all machine data, wherever it originates. The "no predefined schema" design means you can point Splunk at any of your data, regardless of format, source or location. There is no need to build custom parsers or connectors, there's no traditional RDBMS, and there's no need to filter and forward. Here we see just a sample of the kinds of data Splunk can 'eat'. Reminder: what's the 'big deal' about machine data? It holds a categorical record of the following: user transactions, customer behavior, machine behavior, security threats and fraudulent activity. You can imagine that a single user transaction can span many systems and sources of this data, or that a single service relies on many underlying systems. Splunk gives you one place to search, report on, analyze and visualize all this data.

    1. 1. Dynamic Lookups
    2. 2. Agenda
        • Lookups in General
        • Static Lookups
        • Dynamic Lookups
          - Retrieve fields from a web site
          - Retrieve fields from a database
          - Retrieve fields from a persistent cache
    3. 3. Enrich Your Events with Fields from External Sources
    4. 4. Splunk: The Engine for Machine Data
        • Customer Facing Data: click-stream data, shopping cart data, online transaction data
        • Outside the Datacenter: manufacturing, logistics…, CDRs & IPDRs, power consumption, RFID data, GPS data
        • Machine data types: logfiles, configs, messages, traps, alerts, metrics, scripts, changes, tickets
        • Virtualization & Cloud: hypervisor, guest OS, apps, cloud
        • Windows: registry, event logs, file system, sysinternals
        • Linux/Unix: configurations, syslog, file system, ps, iostat, top
        • Applications: web logs, Log4J, JMS, JMX, .NET events, code and scripts
        • Databases: configurations, audit/query logs, tables, schemas
        • Networking: configurations, syslog, SNMP, netflow
    9. 9. Interesting Things to Lookup
        • User's Mailing Address
        • Error Code Descriptions
        • Product Names
        • Stock Symbol (from CUSIP)
        • External Host Address
        • Database Query
        • Web Service Call for Status
        • Geo Location
    10. 10. Other Reasons For Lookup
        • Work around a developer or vendor who does not enrich the logs
        • Imaginative correlations
          - Example: a website URL with a "Like" or "Dislike" count stored in an external source
        • Make your data more interesting
          - Better to see textual descriptions than arcane codes
    11. 11. Agenda
        • Lookups in General
        • Static Lookups
        • Dynamic Lookups
          - Retrieve fields from a web site
          - Retrieve fields from a database
          - Retrieve fields from a persistent cache
    12. 12. Static vs. Dynamic Lookup
        • Static: external data comes from a CSV file
        • Dynamic: external data comes from the output of an external script, which resembles a CSV file
    13. 13. Static Lookup Review
        • Pick the input fields that will be used to get output fields
        • Create or locate a CSV file that has all the fields you need in the proper order
        • Tell Splunk via the Manager about your CSV file and your lookup
          - You can also define lookups manually via props.conf and transforms.conf
          - If you use automatic lookups, they will run every time the source, sourcetype or associated host stanza is used in a search
          - Non-automatic lookups run only when the lookup command is invoked in the search
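To make the mechanics concrete, here is a minimal Python sketch of what a static lookup does conceptually: it matches an input field in each event against a CSV table and appends the output fields. It is an illustration only, not Splunk's implementation. The function name `apply_static_lookup` and the sample table rows are assumptions made for this sketch; the field names follow the `http_status` example used in these slides.

```python
import csv
import io

def apply_static_lookup(events, csv_text, input_field, output_fields):
    """Enrich each event dict with output fields from a CSV lookup table."""
    # Index the table by the input field so each event is matched in one step.
    table = {row[input_field]: row for row in csv.DictReader(io.StringIO(csv_text))}
    enriched = []
    for event in events:
        out = dict(event)
        match = table.get(event.get(input_field))
        if match is not None:
            for field in output_fields:
                out[field] = match[field]
        enriched.append(out)
    return enriched

# Hypothetical table contents, for illustration only.
HTTP_STATUS_CSV = (
    "status,status_description,status_type\n"
    "200,OK,Successful\n"
    "404,Not Found,Client Error\n"
)

events = [{"status": "404", "uri": "/missing"}, {"status": "999", "uri": "/odd"}]
result = apply_static_lookup(
    events, HTTP_STATUS_CSV, "status", ["status_description", "status_type"]
)
```

Events whose input value is not in the table pass through unchanged, which mirrors the idea that a lookup only adds fields when it finds a match.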
    14. 14. Example Static Lookup Conf Files
        props.conf
            [access_combined]
            lookup_http = http_status status OUTPUT status_description, status_type
        transforms.conf
            [http_status]
            filename = http_status.csv
    15. 15. Permissions
        Define lookups via Splunk Manager and set permissions there.
        local.meta
            [lookups/http_status.csv]
            export = system
            [transforms/http_status]
            export = system
    16. 16. Example Automatic Static Lookup
    17. 17. Agenda
        • Lookups in General
        • Static Lookups
        • Dynamic Lookups
          - Retrieve fields from a web site
          - Retrieve fields from a database
          - Retrieve fields from a persistent cache
    18. 18. Dynamic Lookups
        • Write the script to simulate access to the external source
        • Test the script with one set of inputs
        • Create the Splunk version of the lookup script
        • Register the script with Splunk via Manager or conf files
        • Test the script explicitly before using automatic lookups
    19. 19. Lookups vs Custom Command
        • Use dynamic lookups when returning fields given input fields
          - Standard use case for users who are already familiar with lookups
        • Use a custom command when doing MORE than a lookup
          - Not all use cases involve just returning fields
          - Decrypt event data
          - Translate event data from one format to another with new fields (e.g. FIX)
    20. 20. Write/Test External Field Gathering Script
        Diagram: your Python script sends the input fields to the external data in the cloud and returns the output fields.
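    21. 21. Example Script to Test External Lookup
        import socket

        # Given a host, find the corresponding IP addresses
        def mylookup(host):
            try:
                # Returns a (hostname, aliaslist, ipaddrlist) tuple
                ipaddrlist = socket.gethostbyname_ex(host)
                return ipaddrlist
            except socket.error:
                # Resolution failed: report no result
                return []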
    22. 22. External Field Gathering Script with Splunk
        Diagram: your Python script queries the external data in the cloud and returns the output fields.
    23. 23. Script for Splunk Simulates Reading Input CSV
        hostname, ip
        a.b.c.com
        zorrosty.com
        seemanny.com
    24. 24. Output of Script Returns Logically Complete CSV
        hostname, ip
        a.b.c.com,
        zorrosty.com,
        seemanny.com,
    25. 25. transforms.conf for Dynamic Lookup
        [NameofLookup]
        external_cmd = <name>.py field1 … fieldN
        external_type = python
        fields_list = field1, …, fieldN
    26. 26. Example Dynamic Lookup Conf Files
        transforms.conf
            # Note: this is an explicit lookup
            [whoisLookup]
            external_cmd = whois_lookup.py ip whois
            external_type = python
            fields_list = ip, whois
    27. 27. Dynamic Lookup Python Flow
        def lookup(input):
            Perform external lookup based on input.
            Return result
        main()
            Check standard input for CSV headers.
            Write headers to standard output.
            For each line in standard input (input fields):
                Gather input fields into a dictionary (key-value structure)
                ret = lookup(input fields)
                If ret:
                    Send to standard output input values and return values from lookup
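The flow above can be sketched as a small, self-contained Python function. This is a minimal sketch under stated assumptions, not the script from the slides: the names `run_lookup`, `FAKE_DNS` and `demo` are hypothetical, the IP addresses are made-up sample data, and an in-memory dictionary stands in for the real external source so the example runs without network access.

```python
import csv
import io

def run_lookup(in_stream, out_stream, key_field, value_field, lookup):
    """Read CSV rows, fill value_field via lookup(key), write complete rows."""
    reader = csv.DictReader(in_stream)
    header = reader.fieldnames or []
    if key_field not in header or value_field not in header:
        raise ValueError("both fields must appear in the CSV header")
    writer = csv.DictWriter(out_stream, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[key_field] and not row[value_field]:
            # Only consult the external source when the value is missing.
            row[value_field] = lookup(row[key_field]) or ""
        if row[value_field]:
            # Emit the row only when there is a value to return.
            writer.writerow(row)

# Stand-in for a real external source (hypothetical data).
FAKE_DNS = {"a.b.c.com": "10.0.0.1", "zorrosty.com": "10.0.0.2"}

def demo():
    src = io.StringIO("hostname,ip\na.b.c.com,\nzorrosty.com,\nseemanny.com,\n")
    dst = io.StringIO()
    run_lookup(src, dst, "hostname", "ip", FAKE_DNS.get)
    return dst.getvalue()
```

In a real lookup script the streams would be `sys.stdin` and `sys.stdout`, and the field names would come from the command-line arguments given in `external_cmd`. Passing the lookup function in as a parameter keeps the CSV handling testable without touching the external source.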
    28. 28. Whois Lookup
        import csv
        import sys

        def main():
            # Expect the names of the ip and whois fields as arguments
            if len(sys.argv) != 3:
                print("Usage: python whois_lookup.py [ip field] [whois field]")
                sys.exit(0)
            ipf = sys.argv[1]
            whoisf = sys.argv[2]
            r = csv.reader(sys.stdin)
            w = None
            header = []
            first = True
            …
    29. 29. Whois Lookup (cont.) to Read CSV Header
            # First read the "CSV header" and output the field names
            for line in r:
                if first:
                    header = line
                    if whoisf not in header or ipf not in header:
                        print("IP and whois fields must exist in CSV data")
                        sys.exit(0)
                    csv.writer(sys.stdout).writerow(header)
                    w = csv.DictWriter(sys.stdout, header)
                    first = False
                    continue
                …
    30. 30. Whois Lookup (cont.) to Populate Input Fields
                # Read the row and populate the values for the input fields (IP address in our case)
                result = {}
                i = 0
                while i < len(header):
                    if i < len(line):
                        result[header[i]] = line[i]
                    else:
                        # Short row: treat the missing field as empty
                        result[header[i]] = ''
                    i += 1
    31. 31. Whois Lookup (cont.) to Populate Input Fields
                # Both fields are already present: pass the row through unchanged
                if len(result[ipf]) and len(result[whoisf]):
                    w.writerow(result)
                # Else call the external website to get the whois field, using the IP address as the key
                elif len(result[ipf]):
                    result[whoisf] = lookup(result[ipf])
                    if len(result[whoisf]):
                        w.writerow(result)
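    32. 32. Whois Lookup Function
        import urllib

        LOCATION_URL = "http://some.url.com?query="

        # Given an IP, return the whois response
        def lookup(ip):
            try:
                # Python 2 call; in Python 3 use urllib.request.urlopen
                whois_ret = urllib.urlopen(LOCATION_URL + ip)
                lines = whois_ret.readlines()
                return lines
            except IOError:
                # Empty result, so the caller's len() check skips the row
                return ''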
    33. 33. Database Lookup
        • Acquire the proper modules to connect to the database
        • Connect and authenticate to the database
          - Use a connection pool if possible
        • Have the lookup function query the database
          - Return a list ([]) of results
    34. 34. Database Lookup vs. Database Sent To Index
        • Well, it depends…
        • Use a lookup when:
          - Using needle-in-the-haystack searches with a few users
          - Using form searches returning few results
        • Index the database table or view when:
          - Having LOTS of users and ad hoc reporting is needed
          - It's OK to have "stale" data (N minutes old) for a dynamic database
    35. 35. Example Database Lookup using MySQL
        import MySQLdb

        # First connect to the DB outside of the for loop
        conn = MySQLdb.connect(host="localhost",
                               user="name of user",
                               passwd="password",
                               db="Name of DB")
        cursor = conn.cursor()
    36. 36. Example Database Lookup (cont.) using MySQL
        import MySQLdb
        …
        # Given a city, find its country
        def lookup(city, cur):
            try:
                # Let the driver quote the value instead of concatenating it into the SQL
                cur.execute("SELECT country FROM city_country WHERE city = %s", (city,))
                row = cur.fetchone()
                if row is None:
                    return []
                return row[0]
            except MySQLdb.Error:
                return []
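A database lookup like this is best written with a parameterized query, so that the looked-up value is never spliced into the SQL text. The sketch below shows the pattern using Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, purely so it runs without a server. The table name follows the slide; the sample rows are hypothetical. Note that `sqlite3` uses `?` as its placeholder, whereas MySQLdb uses `%s`.

```python
import sqlite3

def lookup(city, cur):
    """Return the country for a city, or None when there is no match."""
    # The driver binds the value; it is never concatenated into the SQL string.
    cur.execute("SELECT country FROM city_country WHERE city = ?", (city,))
    row = cur.fetchone()
    return row[0] if row else None

# Build a throwaway in-memory table (hypothetical sample data).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE city_country (city TEXT PRIMARY KEY, country TEXT)")
cur.executemany(
    "INSERT INTO city_country VALUES (?, ?)",
    [("Paris", "France"), ("Osaka", "Japan")],
)
conn.commit()
```

Because the value is bound rather than interpolated, an input such as `x' OR '1'='1` is treated as an ordinary (non-matching) city name instead of altering the query.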
    37. 37. Lookup Using Key Value Persistent Cache
        • Download and install Redis
        • Download and install the Redis Python module
        • Import the Redis module in Python and populate the key-value DB
        • Import the Redis module in the lookup function given to Splunk to look up a value given a key
        Note: Redis is an open source, advanced key-value store.
    38. 38. Redis Lookup
        import sys

        ### CHANGE PATH according to your Redis install ###
        sys.path.append("/Library/Python/2.6/…/redis-2.4.5-py.egg")
        import redis
        …
        def main():
            …
            # Connect to Redis; change for your distribution
            pool = redis.ConnectionPool(host="localhost", port=6379, db=0)
            redp = redis.Redis(connection_pool=pool)
    39. 39. Redis Lookup (cont.)
        def lookup(redp, mykey):
            try:
                # Returns None when the key is not in the cache
                return redp.get(mykey)
            except Exception:
                return ""
    40. 40. Combine Persistent Cache with External Lookup
        • For data that is "relatively static":
          - First see if the data is in the persistent cache
          - If not, look it up in the external source, such as a database or web service
          - If results come back, add the results to the persistent cache and return them
        • For data that changes often, you will need to create your own cache retention policies
    41. 41. Combining Redis with Whois Lookup
        def lookup(redp, ip):
            try:
                # Try the persistent cache first
                ret = redp.get(ip)
                if ret is not None and ret != "":
                    return ret
                else:
                    # Cache miss: ask the external source, then remember the answer
                    whois_ret = urllib.urlopen(LOCATION_URL + ip)
                    lines = whois_ret.readlines()
                    if lines:
                        redp.set(ip, lines)
                    return lines
                …
            except Exception:
                …
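For data that changes often, the cache needs a retention policy. One simple option is a time-to-live (TTL) on each entry. The sketch below illustrates the cache-then-external-source pattern with expiry. It is an assumption-laden illustration rather than the add-on's code: `TTLCache`, `cached_lookup` and `fake_whois` are hypothetical names, an in-memory dictionary stands in for Redis, and the clock is injectable so the expiry behavior can be exercised without waiting.

```python
import time

class TTLCache:
    """Tiny in-memory stand-in for a persistent key-value cache."""
    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self.store[key]  # retention policy: drop stale entries
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)

def cached_lookup(cache, key, fetch):
    """Return the cached value, else call fetch(key) and cache a non-empty result."""
    value = cache.get(key)
    if value is not None:
        return value
    value = fetch(key)
    if value:
        cache.set(key, value)
    return value

calls = []
def fake_whois(ip):
    # Hypothetical external source that records how often it is called.
    calls.append(ip)
    return "owner-of-" + ip

now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
first = cached_lookup(cache, "192.0.2.1", fake_whois)   # miss: fetched
second = cached_lookup(cache, "192.0.2.1", fake_whois)  # hit: served from cache
now[0] = 61.0
third = cached_lookup(cache, "192.0.2.1", fake_whois)   # expired: fetched again
```

With Redis itself, expiry can be delegated to the server, since Redis supports per-key expiration (the EXPIRE and SETEX commands), so the lookup script does not have to track timestamps.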
    42. 42. Where do I get the add-ons from today? Splunkbase!
        • Whois (release 4.x): http://splunk-base.splunk.com/apps/22381/whois-add-on
        • DBLookup (release 4.x): http://splunk-base.splunk.com/apps/22394/example-lookup-using-a-database
        • Redis Lookup (release 4.x): http://splunk-base.splunk.com/apps/27106/redis-lookup
        • Geo IP Lookup, not in these slides (release 4.x): http://splunk-base.splunk.com/apps/22282/geo-location-lookup-script-powered-by-maxmind
    43. 43. Conclusion
        Lookups are a powerful way to enhance your search experience beyond indexing the data.
    44. 44. Thank You