Why it matters to IT
MACY CRONKRITE
@MACYCRON
www.facebook.com/safehex
Data Mining for Organization Value
Data you are already processing has value
• Audit trail & application status
• Automatic monitoring for errors and warnings
• Helping track down configuration problems
• Helping track down bugs
• Micro analysis of user behavior ("click stream") and complex events
• No more email to "monitor a process"
• Get alerts only when something critically fails.
What is it?
• Search and analysis engine
• Google-like search of your log data
RDBMS??? ARE YOU KIDDING?
Organization Data is BIG DATA (velocity, variety, volume)
So Map Reduce – Key Value Pairs FTW!!!!
Old way (RDBMS) >>> New way (Map Reduce)
• Could be better than user-supplied info (AKA tickets and complaints), and it also catches unreported errors.
• Behavior Analysis (Good and Bad)
MACHINE DATA
• Most sensors create log files
• Anything with a time-stamp
• Unstructured data (many source types)
• Anything that the system does on behalf of a user can be tracked, aggregated, and correlated across servers and applications
• At minimum, two keys are needed:
– timestamp
– unique user session id
Why: Event Correlation
• It leverages a natural query language to perform searches and analysis of log files.
• A single search can cross multiple disparate logs looking for key words and other structures
• Splunk is licensed per volume of data indexed, not on a per-server basis
• Build Apps (custom views) for specific ROLES
Mix Human Event Reports AND Machine Events
• Correlate your 1X / base case instantly
• LOGS are on all layers of your application stack
• Alert when the combination of events meets criteria.
• Less for humans to parse. Whew!!
• Less data overload/ignore. You won't go back.
What is Splunk?
• Sounds like it's expensive or it takes weeks to set up.
• There's a free license. It installs in 15 minutes. On your laptop, while you're testing it out, search billions of events in seconds. When you're ready, scale up to your datacenter and search trillions. Basic searching and quite a lot of the reporting will work right out of the box.
• Bullsxxx.
Well, I'm not saying that 15 minutes in, it's going to be emailing your boss a PDF pie chart of "lost revenue – top causes". But that's seriously possible in a couple of hours. Out of the box, Splunk will parse your data and extract a lot of meaning, and if it doesn't get everything, teaching it how to extract the juicy numbers and names from your events is really pretty straightforward. Then, once all the numbers and names are extracted and ready to be reported on, you'll be able to do real searches and reports that help your people solve real problems. And when you get to that point, from then on it's pretty much crack. My goal in this document is to get you addicted. Sorry.
• Download Splunk for free and try it for yourself from splunk.com, right now.
Uses
• Right now we are using Splunk to calculate our VPN metrics for the Remote Access service
• Total sessions
– index="vpn" user authentication Successful | stats count AS Logins
• Unique users
– index="vpn" %ASA-6-113004 | rex field=_raw "user = (?<Username>.*)" | dedup Username | stats count AS UniqueUsers
• For information usage, "non 'mm' machines"
– index="vpn" Received request for DHCP hostname for DDNS | rex field=_raw "hostname for DDNS is: (?<Machine>.*)!" | eval machine=lower(Machine) | search machine!="mm*" | rex field=_raw "Username = (?<User>.*), IP" | table User, Machine
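To make the first two searches concrete, here is a minimal Python sketch of what they compute: count the successful logins, then extract the username with a named capture group, de-duplicate, and count. The sample log lines are hypothetical, loosely modeled on the %ASA-6-113004 message named above, and plain substring checks stand in for Splunk's keyword matching.

```python
import re

# Hypothetical sample events; real VPN logs will differ in detail.
SAMPLE_EVENTS = [
    "%ASA-6-113004: AAA user authentication Successful : server = 10.0.0.5 : user = alice",
    "%ASA-6-113004: AAA user authentication Successful : server = 10.0.0.5 : user = bob",
    "%ASA-6-113004: AAA user authentication Successful : server = 10.0.0.6 : user = alice",
    "%ASA-6-722022: unrelated event",
]

# Python spells named groups (?P<name>...) where Splunk's rex uses (?<name>...).
USER_PATTERN = re.compile(r"user = (?P<Username>.*)")


def vpn_metrics(events):
    """Return login and unique-user counts, mirroring the two searches."""
    logins = 0
    unique_users = set()  # a set plays the role of "dedup Username"
    for event in events:
        # Simplified stand-in for the keyword search "user authentication Successful".
        if "user authentication Successful" in event:
            logins += 1
        match = USER_PATTERN.search(event)
        if "%ASA-6-113004" in event and match:
            unique_users.add(match.group("Username"))
    return {"Logins": logins, "UniqueUsers": len(unique_users)}


print(vpn_metrics(SAMPLE_EVENTS))
```

The set does the work of dedup followed by stats count: each username is stored once, so its size is the number of unique users.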
Transactions ACROSS devices
• Can we calculate the transaction duration IN SPLUNK, i.e. the time between the timestamp where a transaction started and the timestamp where it ended? This works IF we standardize on the keys that mark the start and the end.
• This is a different approach to solving "duration"
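The duration idea can be sketched in a few lines of Python, assuming every device logs a timestamp, a session id, and one of two agreed markers. The marker names TXN_START and TXN_END and the sample events are hypothetical, chosen only for illustration.

```python
from datetime import datetime

# Hypothetical events from several devices: (timestamp, session_id, marker).
EVENTS = [
    ("2024-01-01T10:00:00", "s1", "TXN_START"),
    ("2024-01-01T10:00:05", "s2", "TXN_START"),
    ("2024-01-01T10:00:30", "s1", "TXN_END"),
    ("2024-01-01T10:01:05", "s2", "TXN_END"),
    ("2024-01-01T10:02:00", "s3", "TXN_START"),  # still open, no end marker yet
]


def transaction_durations(events, start_key="TXN_START", end_key="TXN_END"):
    """Pair start and end markers per session and return durations in seconds."""
    starts = {}
    durations = {}
    # ISO-8601 timestamps sort chronologically as plain strings.
    for raw_ts, session_id, marker in sorted(events):
        ts = datetime.fromisoformat(raw_ts)
        if marker == start_key:
            starts[session_id] = ts
        elif marker == end_key and session_id in starts:
            durations[session_id] = (ts - starts.pop(session_id)).total_seconds()
    return durations


print(transaction_durations(EVENTS))
```

Sessions without an end marker are simply left out of the result. Inside Splunk, the transaction command is the usual tool for this kind of grouping; the sketch only shows why standardized start and end keys make the calculation possible.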
Splunk Navigation and Basic Searching REVIEW
• Splunk comes with several Apps, but the only relevant one now is the Search app, which is the interface for generic searching. To begin your Splunk search, type in terms you might expect to find in your data. For example, if you want to find events that might be HTTP 404 errors (i.e., webpage not found), type in the keywords:
• http 404
You'll get back all the events that have both HTTP and 404 in their text. Notice that search terms are implicitly ANDed together. The search was the same as "http AND 404". Let's make the search narrower:
• http 404 "like gecko"
Using quotes tells Splunk to search for the literal phrase "like gecko", which returns more specific results than just searching for "like" and "gecko" because the words must be adjacent as a phrase.
• Splunk supports the Boolean operators AND, OR, and NOT (must be capitalized), as well as parentheses to enforce grouping. To get all HTTP error events (i.e., status code other than 200), not including 403 or 404, use this:
• http NOT (200 OR 403 OR 404)
Again, the AND operator is implied; the previous search is the same as http AND NOT (200 OR 403 OR 404).
• Splunk supports the asterisk (*) wildcard for searching. For example, to retrieve events that have the 40x and 50x classes of HTTP status codes, you could try: http (40* OR 50*)
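The Boolean rules above can be modeled with a few small Python predicates. This is a rough sketch, not Splunk's actual matching: it assumes events are split into simple word tokens, and the sample events are made up.

```python
import re


def term(pattern):
    """Keyword match against whole tokens, case-insensitive, with * as a wildcard."""
    regex = re.compile(re.escape(pattern).replace(r"\*", r"\w*"), re.IGNORECASE)
    return lambda event: any(regex.fullmatch(tok) for tok in re.findall(r"\w+", event))


def all_of(*preds):  # AND (also what a space between terms implies)
    return lambda event: all(p(event) for p in preds)


def any_of(*preds):  # OR
    return lambda event: any(p(event) for p in preds)


def negate(pred):  # NOT
    return lambda event: not pred(event)


# Hypothetical events.
EVENTS = [
    "GET /index.html HTTP/1.1 200",
    "GET /missing HTTP/1.1 404",
    "GET /api HTTP/1.1 500",
    "GET /admin HTTP/1.1 403",
    "sshd login failed",
]

# http NOT (200 OR 403 OR 404)
errors = all_of(term("http"), negate(any_of(term("200"), term("403"), term("404"))))
# http (40* OR 50*)
classes = all_of(term("http"), any_of(term("40*"), term("50*")))

print([e for e in EVENTS if errors(e)])
print([e for e in EVENTS if classes(e)])
```

Writing the implied AND out as all_of makes the grouping explicit: the NOT applies to the whole parenthesized OR, not just to the first status code.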
Intermediate Searching
• Splunk's search language is much more powerful than you think it is. So far we've only been talking about search, which retrieves your indexed data, but there are dozens of other operations you can perform on your data. You can "pipe" (i.e., transfer) the results of a search to other commands to filter, modify, reorder, and group your results.
• If Google were Splunk, you'd be able to search the web for every single page mentioning your ex-girlfriends, extract geographical information, remove results without location info, sort the results by when they were written, keep only the most recent page per ex-girlfriend, and finally generate a state-by-state count of where Mr. Don Juan's ladies currently live. But Google isn't Splunk, so good luck with that.
• Let's do something similar, though, with our web data: let's find some interesting things about URIs that have 404s. Here's our basic search:
• status=404
• Now let's take the result of that search and sort the results by URI:
• status=404 | sort - uri
• That special "pipe" character ("|") says "take the results of the thing on the left and process it, in this case, with the sort operator".
Splunk Navigation and Basic Searching REVIEW
• Wildcards can appear anywhere in a term, so "f*ck" will return all events with fack, feck, fick, fock, or flapjack, among others. A search for "*" will return all events. Note that in these searches we've been playing fast and loose with precision. Any event that has 50 in it (e.g. "12:18:50") would also unfortunately match. Let's fix that.
• When you index data, Splunk automatically adds fields (i.e., attributes) to each of your events. You can always add your own extraction rules for pulling out additional fields. To narrow results with a search, just add attribute=value to your search:
• sourcetype=access_combined status=404
• This search is a much more precise version of our first search (i.e., "http 404") because it will only return events that come from access_combined sources (i.e., webserver events) and that have a status code of 404, which is different from just having a 404 somewhere in the text. In addition to <attribute>=<value>, you can also use != (not equals), and <, >, >=, and <= for numeric fields.
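The precision problem is easy to reproduce outside Splunk. The Python sketch below compares a keyword-style match on "50*" with a match on an extracted status field. The log lines and the extraction regex are hypothetical, and the tokenizer is a simplification of how Splunk segments events.

```python
import re

# Hypothetical web access events.
EVENTS = [
    '10.1.1.1 - - [01/Jan/2024:12:18:50] "GET /a HTTP/1.1" 200 512',
    '10.1.1.2 - - [01/Jan/2024:12:19:03] "GET /b HTTP/1.1" 503 128',
    '10.1.1.3 - - [01/Jan/2024:12:20:11] "GET /c HTTP/1.1" 404 64',
]

# Assumed field extraction rule for this sample format.
FIELDS = re.compile(r'"(?P<method>\S+) (?P<uri>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)')


def keyword_match(event, prefix):
    """Keyword search for prefix*: any token in the raw text may match."""
    return any(tok.startswith(prefix) for tok in re.findall(r"\w+", event))


def extract_fields(event):
    match = FIELDS.search(event)
    return match.groupdict() if match else {}


def field_match(event, name, prefix):
    """Field search for name=prefix*: only the named field is compared."""
    return extract_fields(event).get(name, "").startswith(prefix)


keyword_hits = [i for i, e in enumerate(EVENTS) if keyword_match(e, "50")]
field_hits = [i for i, e in enumerate(EVENTS) if field_match(e, "status", "50")]
print(keyword_hits, field_hits)
```

The first event is a false positive for the keyword search because its timestamp ends in 50, while the field search only returns the event whose status really is 503.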
Continued
• status=404 | top 5 referer_domain | search count>2
• OK math geeks, suppose you want to calculate a new field based on other fields. You can use the eval command. Let's make a new field, kbytes, on the fly, based on the bytes field:
• * | eval kbytes = bytes/1024
• And now for something completely different: assuming you had indexed data from a dating site, search for the smartest girl of each hair and eye color variation, calculating her BMI:
• gender=female | sort -iq | dedup hair, eyes | eval bmi=weight/pow(height,2)
• No hate mail.
• We've just shown you a tiny, tiny window of what is possible in a Splunk search. See the Appendix for a quick cheatsheet of search commands and examples.
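Each pipe stage takes the rows produced by the stage before it, which the following Python sketch mimics for the first search and the kbytes example. The sample events are hypothetical, and this simplified top returns only value and count (Splunk's top also reports a percentage).

```python
from collections import Counter

# Hypothetical, already-parsed web events.
EVENTS = [
    {"status": 404, "referer_domain": "a.example", "bytes": 2048},
    {"status": 404, "referer_domain": "a.example", "bytes": 512},
    {"status": 404, "referer_domain": "a.example", "bytes": 1024},
    {"status": 404, "referer_domain": "b.example", "bytes": 256},
    {"status": 200, "referer_domain": "a.example", "bytes": 4096},
]


def top(events, field, limit):
    """Most common values of a field, as rows with the value and its count."""
    counts = Counter(e[field] for e in events if field in e)
    return [{field: value, "count": n} for value, n in counts.most_common(limit)]


def frequent_404_referers(events):
    """status=404 | top 5 referer_domain | search count>2"""
    not_found = [e for e in events if e.get("status") == 404]  # status=404
    ranked = top(not_found, "referer_domain", 5)               # | top 5 referer_domain
    return [row for row in ranked if row["count"] > 2]         # | search count>2


def with_kbytes(events):
    """* | eval kbytes = bytes/1024 (events without bytes pass through unchanged)."""
    return [{**e, "kbytes": e["bytes"] / 1024} if "bytes" in e else dict(e) for e in events]


print(frequent_404_referers(EVENTS))
print(with_kbytes(EVENTS)[0]["kbytes"])
```

Note that the last stage filters on count, a field that did not exist in the raw events: it was created by top, which is why the second search can only come after the pipe.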