Splunk Live Beginner Training NYC


This is the slide deck used in the beginner workshop at Splunk Live NYC on May 2nd, 2013.

Published in: Technology
  • Hopefully you are starting to see the power of Splunk. On the left here is a typical way organizations use Splunk: index your IT data, use Splunk to search and investigate, have users add knowledge such as saved searches, monitor and alert, report and analyze the data, and review trends and other findings to become more proactive in a cycle of improved IT operations. Our customers typically start by using Splunk to solve a specific problem area, often application management and troubleshooting, security monitoring and incident investigation, or compliance. After quickly making their initial use case an internal success, they typically deploy Splunk into other areas of IT; these ten areas are the ones where Splunk is most often deployed. Customers who get maximum value from Splunk understand the value of having a single IT data engine that gives anyone the complete view needed to do their job in a far more productive and effective way. We work with customers to leverage these capabilities across every functional or organizational silo in their IT organization. Splunk delivers value to dev teams, server administrators, network managers, security analysts, auditors, and others.
  • Follow along if you like! See the full list of supported platforms in the Installation Manual. You can choose a different directory during installation.
  • Good analogy for Apps is iPhone/iPad. Same data, many uses. Apps change the presentation layer.
  • Illustrate adding data, creating a new index, and the *nix app to show performance metrics.
  • This is the unix app in action. In this example, we’re pulling a number of scripted inputs such as top, iostat, network, etc.
  • Wildcards are supported (*). Search terms are case insensitive. Boolean searches are supported with AND, OR, NOT; just remember that Booleans must be uppercase. There is an implied AND between search terms, and for complex searches, use parentheses: (error OR failed). You can also quote phrases such as "Login Failure". Search modes!
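  • Put together, a single search can combine all of these rules. A sketch (the sourcetype value and terms are illustrative, not from the deck):

```
sourcetype=access_* (error OR failed) NOT debug "login failure"
```

This uses a wildcarded sourcetype, an implied AND between terms, an uppercase Boolean group, a NOT exclusion, and a quoted phrase.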
  • This is an example of a search for error OR failed but includes some Boolean exclusions using NOT.
  • The search assistant offers quick reference for the Splunk search language that updates as you type. That includes links to online documentation, and shows matching searches along with their count, matching terms and examples. It also shows you your history of searches.
  • A search becomes a job for Splunk to process. While a search is processing, the job can be canceled, paused, sent to the background, or finalized. The ability to cancel is handy if you made a mistake or chose the wrong time range. Finalizing stops processing events but keeps the count of events found so far. Jobs can be accessed while running or afterward through the Jobs menu; there, paused jobs can be resumed and those sent to the background can be accessed. Job results are kept for a configurable time, 10 minutes by default.
  • The Splunk search language is very Unix-like: use the pipe symbol to pass search results to search commands, and chain commands together. You can even create your own custom search commands. These are the common commands we find most useful for analyzing and filtering data. <review each command> The Search Reference is available online in addition to the search assistant and covers all search commands.
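  • A sketch of chaining: the following pipeline filters events, aggregates them by host, and keeps the top results (the terms are generic placeholders):

```
error OR failed | stats count by host | sort - count | head 5
```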
  • Much like with *nix operating systems, chances are you're not going to memorize all of the commands. You'll memorize a handful and rely on the "man pages" for additional context. We SEs here at Splunk use maybe twenty commands in our day-to-day work.
  • Fields give you much more precision in searches. Fields are key-value pairs associated with your data by Splunk, for example host=www1 or status=503. There are two specific types of fields: default fields (source, sourcetype, and host), which are added to every event by Splunk during indexing, and data-specific fields, such as action="purchase" or status="503".
  • What's the difference between sources, sourcetypes, and hosts? A host is the hostname, IP address, or name of the network host from which events originate; an example would be a single Windows server or a specific firewall. A source is the name of a file, a stream, or some other input on a server, such as a config file, process, application, or event log. Per our Windows server example, sources on that server might include Exchange logs, DNS/DHCP logs, performance metrics, and the Windows event logs from the Windows Event Viewer; each of these is a different source. A sourcetype is a specific data format, a high-level group such as ALL Exchange logs or ALL Cisco ASA logs. This lets you run searches against, say, a sourcetype of Windows Event Log Security across multiple servers.
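  • For instance, a search scoped by these default fields might look like the following; the host name and sourcetype value here are illustrative:

```
host=www1 sourcetype="WinEventLog:Security" "Failure Audit"
```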
  • Event types can help you automatically identify events based on a search. An event type is a field based on a search; it's a way of classifying data for searching and reporting, and it's useful for capturing and sharing user knowledge. Tags are different, in that they allow you to search for events with related field values. You can assign any field/value combination. As an example, server names aren't always helpful and sometimes contain ambiguous information; using tags you can use a more meaningful term. The Splunk Manager allows you to enable/disable, copy, delete, and edit tags that you've created.
  • Extracting fields that aren't already pulled out at search time is a necessary step toward doing more with your data, such as reporting. Show an example of field extraction with the Interactive Field Extractor (IFX) and an example using rex. Show the other field extraction methods.
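  • A minimal rex sketch against the linux_secure data used later in this deck, assuming the standard sshd "Failed password" message format:

```
sourcetype=linux_secure "Failed password"
| rex "Failed password for (?<user>\S+) from (?<src_ip>\d+\.\d+\.\d+\.\d+)"
| stats count by user, src_ip
```

The extracted user and src_ip fields can then be used for reporting just like any automatically extracted field.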
  • Show an alert in real time: sourcetype=linux_secure fail* root. Real-time alerts always trigger immediately for every returned result. Real-time monitored alerts monitor a real-time window and can trigger immediately, or you can define conditions. Scheduled alerts run a search on a regular interval that you define and trigger based on conditions that you define.
  • Run an alert in Splunk. Splunk alerts are based on searches and can run either on a regular scheduled interval or in real time. Alerts are triggered when the results of the search meet a specific condition that you define. Based on your needs, alerts can send emails, trigger scripts, and write to RSS feeds.
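  • The scheduled example from the slides could be saved as an alert using a search like this, scheduled every 15 minutes over the last 15 minutes and triggered when the count exceeds 10 (a sketch; the trigger condition is set in the alert UI, not in the search itself):

```
sourcetype=linux_secure "Failed password" | stats count
```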
  • Consider how you might use a scripted alert.
  • Demo building a report
  • Demo building a traditional report. Reports can also be dashboards mailed out.
  • Demo building a report and dashboard.
  • Demo new dashboard workflow
  • Show dashboard examples:
  • Splunk can be divided into four logical functions. First, from the bottom up, collection. Splunk forwarders come in two packages: the full Splunk distribution and a dedicated Universal Forwarder. The full distribution can be configured to filter data before transmitting, execute scripts locally, or run Splunk Web, giving you several options depending on the footprint your endpoints can tolerate. The Universal Forwarder is an ultra-lightweight agent designed to collect data in the smallest possible footprint. Both flavors of forwarder come with automatic load balancing, SSL encryption and data compression, and the ability to route data to multiple Splunk instances or third-party systems. To manage your distributed Splunk environment, there is the Deployment Server. It helps you synchronize the configuration of your search heads during distributed searching, as well as your forwarders, to centrally manage distributed data collection. Of course, Splunk has a simple flat-file configuration system, so feel free to use your own config management tools if you're more comfortable with what you already have. The core of the Splunk infrastructure is indexing. An indexer does two things: it accepts and processes new data, adding it to the index and compressing it on disk, and it services search requests, looking through the data via its indices and returning the appropriate results to the searcher over a compressed communication channel. Indexers scale out almost limitlessly with almost no degradation in overall performance, allowing Splunk to scale from single-instance small deployments to truly massive Big Data challenges. Finally, the Splunk most users see is the search head: the web server and app-interpreting engine that provides the primary, web-based user interface. Since most of the data interpretation happens as needed at search time, the role of the search head is to translate user and app requests into actionable searches for its indexer(s) and display the results. The Splunk web UI is highly customizable, either through our own view and app system, or by embedding Splunk searches in your own web apps via includes or our API.
  • Getting data into Splunk is designed to be as flexible and easy as possible. Because the indexing engine is so flexible and doesn't generally require configuration for most IT data, all that remains is how to collect and ship the data to your Splunk. There are many options. First, you can collect data over the network, without an agent. The most common network input is syslog; Splunk is a fully compliant and customizable syslog listener over both TCP and UDP. Further, because Splunk is just software, any remote file share you can mount or symlink to via the operating system is available for indexing as well. To facilitate remote Windows data collection, Splunk has its own WMI query tool that can remotely collect Windows Event logs and performance counters from your Windows systems. Finally, Splunk has an AD monitoring tool that can connect to AD and get your user metadata to enhance your searching context and monitor AD for replication, policy, or user security changes. When Splunk is running locally as an indexer or forwarder, you have additional options and greater control. Splunk can directly monitor hundreds or thousands of local files, index them, and detect changes. Additionally, many customers use our out-of-the-box scripts and tools to generate data; common examples include performance polling scripts on *nix hosts, API calls to collect hypervisor statistics, and detailed monitoring of custom apps running in debug modes. Also, Splunk has Windows-specific collection tools, including native Event Log access, registry monitoring drivers, performance monitoring, and AD monitoring that can run locally with a minimal footprint.
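  • As a sketch, local file monitoring and a network syslog listener can both be declared in inputs.conf; the path and port here are illustrative:

```
# inputs.conf: monitor a local log file
[monitor:///var/log/messages]
sourcetype = syslog

# listen for syslog from network devices on UDP 514
[udp://514]
sourcetype = syslog
```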
  • Historically, a Splunk forwarder was a stripped down version of the full Splunk distribution. Certain features, such as Splunk Web, were turned off to decrease footprint on a remote host. Our customers asked us for something even lighter and we delivered. The Universal Forwarder is a new, dedicated package specifically designed for collecting and sending data to Splunk. It’s super light on resources, easy to install, but still includes all the current Splunk inputs, without requiring python. Most deployments should only require the use of the Universal Forwarder but we have kept all features of forwarding in the Regular (or Heavy) Forwarder for cases when you need specific capabilities.
  • A single indexer can index 50-100 gigabytes per day, depending on the data sources and the load from searching. If you have terabytes a day, you can linearly scale a single logical Splunk deployment by adding index servers, using Splunk's built-in forwarder load balancing to distribute the data, and using distributed search to provide a single view across all of these servers. Unlike some log management products, you get fully consolidated reporting and alerting, not simply merged query results. When in doubt, the first rule of scaling is "add another commodity indexer." Splunk indexers are designed to enable nearly limitless fan-out with linear scalability by leveraging techniques like MapReduce to fan out work in a highly efficient manner.
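  • Forwarder load balancing across indexers is configured in outputs.conf on each forwarder; a minimal sketch with hypothetical hostnames:

```
# outputs.conf: auto load balance across two indexers
[tcpout:my_indexers]
server = indexer1.example.com:9997, indexer2.example.com:9997
autoLB = true
```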
  • Leverage distributed search to give each locale access to their own data, while providing a combined view to central teams back at headquarters. Whether to optimize your network traffic or meet data segmentation requirements, feel free to build your Splunk infrastructure as it makes sense for your organization. Further, each distributed search head automatically creates the correct app and user context while searching across other datasets. No specific custom configuration management is required; Splunk handles it for you.
  • The insights from your data are mission-critical. With Splunk Enterprise 5 we wanted to deliver a highly available system with enterprise-grade data resiliency, even as you scale on commodity storage, while maintaining Splunk's robust real-time and ease-of-use features. Splunk indexers can now be grouped together to replicate each other's data, maintaining multiple copies of all data, preventing data loss and delivering highly available data for Splunk search. Using index replication, if one or more indexers fail, incoming data continues to get indexed and indexed data continues to be searchable. By spreading data across multiple indexers, searches can read from many indexers in parallel, improving parallelism of operations and performance. All as you scale on commodity servers and storage, and without a SAN.
  • For high availability and scale-out, combine auto load balancing with data cloning. Each clone group has one complete set of the overall data for redundancy, while load balancing within each clone group spreads the load and the data between indexers for efficient scaling. As long as one indexer remains in a clone group, that group will remain synced with the entirety of the data. Search head pooling lets search heads share the same application and user configurations and coordinate the scheduling of searches. This allows one logical pool of search heads to service large numbers of users with minimal downtime should a search head become unavailable. Additionally, by leveraging LDAP authentication, such as Active Directory, users can be directed to any search head as needed for load balancing or failover. NOTE: the second indexer needs to be licensed with an HA license, at 50% of the regular enterprise license.
  • Splunk isn’t the only technology that can benefit from IT data collection, so let Splunk help send the data to those systems that need it. For those systems that want a direct tap into the raw data, Splunk can forward all or a subset of data in real time via TCP as raw text or RFC-compliant syslog. This can be done on the forwarder or centrally via the indexer without incrementing your daily indexing volume. Separately, Splunk can schedule sophisticated correlation searches and configure them to open tickets or insert events into SIEMs or operation event consoles. This allows you to summarize, mash-up and transform the data with the full power of the search language and import data into these other systems in a controlled fashion, even if they don’t natively support all the data types Splunk does. MSSP, Cloud Services, etc.
  • Your logs and other IT data are important but often cryptic. You can extend Splunk’s search with lookups to external data sources as well as automate tagging of hosts, users, sources, IP addresses and other fields that appear in your IT data. This enables you to find and summarize IT data according to business impact, logical application, user role and other logical business mappings. In the example shown, Splunk is looking up the server’s IP address to determine which domain the servicing web host is located in, and the customer account number to show which local market the customer is coming from. Using these fields, a search user could create reports pivoted on this information easily. Illustrate Lookups:
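  • A lookup like the one described could be backed by a small CSV (for example, ip_to_domain.csv with columns ip,domain) defined as a lookup table, then invoked at search time; the lookup name and field names here are hypothetical:

```
sourcetype=access_* | lookup ip_to_domain ip AS clientip OUTPUT domain | stats count by domain
```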
  • Splunk allows you to extend your existing AAA systems into the Splunk search system for both security and convenience. Splunk can connect to your LDAP based systems, like AD, and directly map your groups and users to Splunk users and roles. From there, define what users and groups can access Splunk, which apps and searches they have access to, and automatically (and transparently) filter their results by any search you can define. That allows you to not only exclude whole events that are inappropriate for a user to see, but also mask or hide specific fields in the data – such as customer names or credit card numbers – from those not authorized to see the entire event.
  • Centralized License Management provides for a holistic approach in your multi-indexer distributed Splunk environment. You can aggregate compatible licenses into stacks of available license volume and define pools of indexers to use license volume from a given stack.
  • Splunk deployments can grow to encompass thousands of Splunk instances, including forwarders, indexers, and search heads. Splunk offers a deployment monitor app that helps you effectively manage medium- to large-scale deployments, keeping track of all your Splunk instances and providing early warning of unexpected or abnormal behavior. The deployment monitor provides chart-rich dashboards and drilldown pages that offer a wealth of information to help you monitor the health of your system, including: index throughput over time; number of forwarders connecting to the indexer over time; indexer and forwarder abnormalities; details for individual forwarders and indexers, such as status and forwarding volume over time; source types being indexed by the system; and license usage.
  • With thousands of enterprise customers and an order of magnitude more actual users, we have a thriving community. We launched a dev portal a few months back and already have over 1,000 unique visitors per week. We have over 300 apps contributed by ourselves, our partners, and our community. Our knowledge exchange, the Answers site, has over 20,000 questions answered. And in August 2012 we ran our third users' conference, with over 1,000 users in attendance, over 100 sessions of content, and customers presenting. Best of all, this community demands more from Splunk and gives us incredible feedback.

    1. Copyright © 2013 Splunk Inc. May 2nd, 2013. Technical Workshops: Getting Started User Training Workshop. Dimitri McKay, Jedi Master
    2. Agenda: Getting Started with Splunk • Search • Alert • Dashboard • Deployment and Integration • Community • Help & Questions
    3. Getting Started With Splunk
    4. One Splunk. Many Uses.
    5. Install Splunk: www.splunk.com/download (32 or 64 bit? Indexer or Universal Forwarder?). Start Splunk. WIN: \Program Files\Splunk\bin\splunk.exe start (services start). *NIX: /opt/splunk/bin/splunk start. Splunk Home. WIN: \Program Files\Splunk. Other: /opt/splunk (/Applications/splunk)
    6. Splunk Licenses. The free download limits indexing to 500MB/day. The Enterprise Trial License expires after 60 days and reverts to the Free License. Features disabled in the Free License: multiple user accounts and role-based access controls; distributed search; forwarding to non-Splunk instances; deployment management; scheduled saved searches and alerting; summary indexing. Other license types: Enterprise, Forwarder, Trial.
    7. Splunk Web Basics. Browser support: Firefox 3.6, 10.x and latest; Internet Explorer 6, 7, 8 and 9; Safari (latest); Chrome (latest). Default on install is http://localhost:8000. Index some data: Add data; Getting Started App; install an App (Splunk for Windows, *NIX).
    8. Splunk Web Basics cont. Splunk Apps: Splunk Home -> Find more apps. Apps create different contexts for your data out of sets of views, dashboards, and configurations. You can create your own! Search is an App. Summary will show everything you have indexed, updated in real time. Click on any source, sourcetype, or host to look at events.
    9. Optional: add some test data. Download the sample file: follow this link, save the file to your desktop, then unzip: http://bit.ly/UBPFWP (Using Splunk Book). Or, to follow along locally, you can download the slides, lookups and data samples at: http://bit.ly/UjkNt6 (Dropbox). To add the file to Splunk: from the Welcome screen, click Add Data; click From files and directories on the bottom half of the screen; select Skip preview; click the radio button next to Upload and index a file; click Save. Install the *nix or Windows app to test drive your local OS data!
    10. *nix app in action:
    11. Best practice note: create an individual index based on sourcetype. Easier to re-index data if you make a mistake; easier to remove data; easier to define permissions and data retention.
    12. Search Basics
    13. Search app, Summary view: current view, global stats, app navigation, time range picker, data sources, start search, search box
    14. Searching. Search > * . Select a time range: historical, custom, or real-time. Using the timeline: click events and zoom in and out; click and drag over events for a specific range. New for 5.0: search modes!
    15. Everything is searchable. The * wildcard is supported; search terms are case insensitive; Booleans AND, OR, NOT (Booleans must be uppercase; there is an implied AND between terms; use () for complex searches); quote phrases. Examples: fail* ; fail* nfs ; error OR 404 ; error OR failed OR (sourcetype=access_* (500 OR 503)) ; "login failure"
    16. Example search:
    17. Search Assistant: contextual help (advanced type-ahead); history (searches, commands); Search Reference (short/long descriptions, examples). Suggests search terms and displays counts, updates as you type, shows examples and help. Toggle it off/on.
    18. Job Management. Searches can be managed as asynchronous processes. Jobs can be scheduled, moved to background tasks, paused, stopped, resumed, finalized, managed, archived, and cancelled. Controls: send to background, pause, finalize, cancel.
    19. Search Commands. Search > error | head 1. Search results are "piped" to the command. Commands for: manipulating fields, formatting, handling results, reporting.
    20. Over 100 commands! http://www.splunk.com/base/Documentation/latest/SearchReference/SearchCheatsheet
    21. Field Extraction Fun
    22. Fields. Default fields: host, source, sourcetype, linecount, etc. View them in the left panel of search results, or all of them in the field picker. Where do fields come from? Pre-defined by sourcetypes; automatically extracted key-value pairs; user defined.
    23. Sources, sourcetypes, hosts. Source: the name of the file, stream, or other input. Sourcetype: a specific data type or data format. Host: the hostname, IP address, or name of the network host from which the events originated.
    24. Tagging and Event Typing. Eventtypes make reports more human-readable and help categorize and make sense of mountains of data; punctuation helps find events with similar patterns. Search > eventtype=failed_login instead of Search > "failed login" OR "FAILED LOGIN" OR "Authentication failure" OR "Failed to authenticate user". Tags are labels: apply ad-hoc knowledge; create logical divisions or groups; tag hosts, sources, fields, even eventtypes. Search > tag=web_servers instead of Search > host="apache1.splunk.com" OR host="apache2.splunk.com" OR host="apache3.splunk.com"
    25. Extract Fields. Interactive Field Extractor: generates PCRE, editable regex, preview/save. Configuration files (manual field extraction, delim-based extractions): props.conf [mysourcetype] REPORT-myclass = myFields; transforms.conf [myFields] REGEX = ^(\w+)\s FORMAT = myFieldLabel::$1. Rex search command: ... | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
    26. Saved Search & Alert Basics
    27. Saved Searches and Alerting. Find something interesting?
    28. Alerting cont. Searches run on a schedule and fire an alert. Example: run a search for "Failed password" every 15 min over the last 15 min and alert if the number of events is greater than 10. Searches run in real time and fire an alert. Example: run a search for "Failed password user=john.doe" in a 1-minute window and alert if an event is found.
    29. Alerting Actions: send email; RSS; execute a script; track in Alert Manager.
    30. Report & Dashboard Wackiness
    31. Reporting. Build reports from the results of any search. Select the type of report (Values over time, Top Values, Rare Values) and on which fields to report or perform statistics. Choose the type of chart (line, area, column, etc.) and other formatting options.
    33. Reporting Examples: use the wizard or reporting commands (timechart, top, etc.); build real-time reports with real-time searches; save reports for use on dashboards.
    34. Dashboards. Create dashboards from search results.
    35. Dashboard Examples
    36. Splunk Manager. Now manage all of that cool stuff you just created (and more!): permissions; saved searches/reports; custom views; distributed Splunk; deployment server; license usage…
    37. Deployment and Integration
    38. Splunk Has Four Primary Functions: Searching and Reporting (Search Head); Indexing and Search Services (Indexer); Local and Distributed Management (Deployment Server); Data Collection and Forwarding (Forwarder). A Splunk install can be one or all roles…
    39. Getting Data Into Splunk. Agent and agent-less approaches for flexibility. Agent-less data input: syslog over TCP/UDP from syslog-compatible hosts and network devices; mounted file systems; WMI (Event Logs, performance) and Active Directory monitoring for Windows hosts. Splunk Forwarder: local file monitoring (log files, config files, dumps and trace files); Windows inputs (Event Logs, performance counters, registry monitoring, Active Directory monitoring); scripted inputs (shell scripts, custom parsers, batch loading) for custom apps and scripted API connections.
    40. Understanding the Universal Forwarder. Forward data without negatively impacting production performance. Monitor files, changes, and the system registry; capture metrics and status; central deployment management. Universal Forwarder vs. Regular (Heavy) Forwarder: both monitor all supported inputs and support routing, filtering, and cloning; Splunk Web, Python libraries, event-based routing, and scripted inputs are features of the Heavy Forwarder.
    41. Horizontal Scaling. Load-balanced search and indexing for massive, linear scale-out: forwarder auto load balancing plus distributed search.
    42. Multiple Datacenters. Index and store locally; distribute searches to datacenters, networks, and geographies (e.g., headquarters plus London, Hong Kong, Tokyo, New York).
    43. High Availability, On Commodity Servers and Storage. Index replication: as Splunk collects data, it keeps multiple identical copies. If an indexer fails, incoming data continues to get indexed and indexed data continues to be searchable. Easy setup and administration. Data integrity and resilience without a SAN. Splunk Universal Forwarder pool; constant uptime.
    44. High Availability. Combine auto load balancing and cloning for HA at every Splunk tier. Clone Group 1: complete dataset. Clone Group 2: complete dataset. Data cloning and auto load balancing, distributed search, shared storage.
    45. Send Data to Other Systems. Route raw data in real time or send alerts based on searches (service desk, event console, SIEM).
    46. Integrate External Data. Extend search with lookups to external data sources (LDAP, AD, watch lists, CRM/ERP, CMDB). Correlate IP addresses with locations, accounts with regions.
    47. Integrate Users and Roles. Integrate authentication with LDAP and Active Directory. Map LDAP and AD groups to flexible Splunk roles; define any search as a filter (e.g., NOT tag=PCI, App=ERP). Manage users, manage indexes, set capabilities and filters; save and share searches.
    48. Centralized Licensing Management. Groups, stacks, and pools for enterprise deployments.
    49. Deployment Monitoring. Keep tabs on your Splunk Enterprise deployment: forwarders, indexers, sourcetypes, licenses.
    50. Support and Community
    51. Support Through the Splunk Community: Splunkbase
    52. Where to Go for Help. Documentation: http://www.splunk.com/base/Documentation. Technical Support: http://www.splunk.com/support. Videos: http://www.splunk.com/videos. Education: http://www.splunk.com/goto/education. Community: http://answers.splunk.com. Splunk Book: http://splunkbook.com
    53. Thank you. November 12th, 2012. Technical Workshops: Getting Started User Training