CeBIT Big Data 2012 - Raanan Dagan, Big Data Product Marketing, Splunk

  • 1. Real-time Analytics from Small Data, Big Data and Huge Data. Raanan Dagan, Big Data Solutions, Splunk. Copyright © 2012 Splunk Inc.
  • 2. What I’ll Talk About: Machine Data; Splunk and Big Data, Real-time Analytics; Customer Use Cases
  • 3. Big Data Comes from Machines. Volume | Velocity | Variety | Variability. Machine-generated data is one of the fastest growing, most complex and most valuable segments of big data. GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops
  • 4. What Does Machine Data Look Like? Sources: Order Processing, Middleware Error, Care IVR, Twitter
  • 5. Machine Data Contains Critical Insights. Sources: Order Processing (Customer ID, Order ID, Product ID); Middleware Error (Order ID, Customer ID); Care IVR (Time Waiting On Hold, Customer ID); Twitter (Customer’s Tweet ID, Company’s Twitter ID)
  • 6. Splunk: The Platform for Machine Data. Machine Data; Splunk Index; Operational Intelligence: Insight and Visualizations for Executives, Statistical Analysis, Proactive Monitoring, Search and Investigation
  • 7. Splunk Collects and Indexes Machine Data. No upfront schema. No RDBMS. No custom connectors. Customer Facing Data: Click-stream data, Shopping cart data, Online transaction data. Outside the Datacenter: Manufacturing, logistics…, CDRs & IPDRs, Power consumption, RFID data, GPS data. Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Changes, Tickets. Windows: Registry, Event logs, File system, sysinternals. Linux/Unix: Configurations, syslog, File system, ps, iostat, top. Virtualization & Cloud: Hypervisor, Guest OS, Apps, Cloud. Applications: Web logs, Log4J, JMS, JMX, .NET events, Code and scripts. Databases: Configurations, Audit/query logs, Tables, Schemas. Networking: Configurations, syslog, SNMP, netflow
  • 8. Operational Intelligence for IT and Business Users. IT Operations Management, Web Intelligence, Application Management, Business Analytics, Security & Compliance. Customer Support, LOB Owners/Executives, Operations Teams, Website/Business Analysts, System Administrator, IT Executives, Development Teams, Security Analysts, Auditors
  • 9. The Technical part
  • 10. Splunk Has Four Primary Functions: Searching and Reporting (Search Head); Indexing and Search Services (Indexer); Local and Distributed Management (Deployment Server); Data Collection and Forwarding (Forwarder). A Splunk install can be one or all roles…
  • 11. Scalability to Tens of TBs/Day on Commodity Servers. Offload search load to Splunk Search Heads. Auto load-balanced forwarding to as many Splunk Indexers as you need to index terabytes/day. Send data from 1000s of servers using a combination of Splunk Forwarders, syslog, WMI, message queues, or other remote protocols
  • 12. Analyzing Heterogeneous Data. Universal Indexing: No data normalization; Automatically handles timestamps; Parsers not required; Index every term & pattern “blindly”; No attempt to “understand” up front. Late Structure Binding: Knowledge applied at search-time; No brittle schema to work around; Multiple views into the same data; Find transactions, patterns and trends. Analysis and Visualization: Normalization as it’s needed; Faster implementation; Easy search language; Multiple views into the same data. Rapid time-to-deploy: hours or days
  • 13. Real-time Analytics. Data inputs: Monitor Input, TCP/UDP Input, Scripted Input. Parsing Queue; Parsing Pipeline (Source, event typing; Character set normalization; Line breaking; Timestamp identification; Regex transforms); Index Queue; Real-time Buffer; Real-time Search Process; Indexing Pipeline; Splunk Index (Raw data, Index Files)
  • 14. Splunk and Hadoop. Splunk: Collection and Analysis; Real-time Dashboards, Reports, Access Controls. Splunk Hadoop Connect: Reliable Data Export; Import Hadoop Data. Splunk App for HadoopOps: End-to-end monitoring, troubleshooting, analysis of Hadoop environment
  • 15. Splunk Hadoop Connect. Delivers reliable integration between Splunk and Hadoop. Export events collected and aggregated in Splunk to HDFS. Explore and browse HDFS directories and files. Import and index data from HDFS for secure searching, reporting, analysis and visualizations in Splunk
  • 16. Splunk App for HadoopOps. End-to-end monitoring and troubleshooting for Hadoop. Monitoring of entire Hadoop environment (Network, Switch, Operating System and Database). Integrated alerting to track and respond to activities from MapReduce to the individual node in the cluster. Centralized real-time view of Hadoop nodes using intuitive heatmap display
  • 17. Summary - Splunk Big Data Solution: Product-based Solution; Integrated and End-to-end; Performance at scale
  • 18. Thank You
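
Slide 11 mentions auto load-balanced forwarding to multiple indexers but does not say how a forwarder chooses its target. The sketch below assumes the simplest possible policy, a per-event round-robin rotation, purely to illustrate the idea of spreading one data stream across several indexers. The function name and the indexer labels are hypothetical and are not part of any Splunk API.

```python
from collections import defaultdict
from itertools import cycle


def forward_round_robin(events, indexers):
    """Assign each event to the next indexer in rotation.

    Illustrative assumption only: the slides do not describe the
    actual selection policy used by Splunk forwarders.
    """
    if not indexers:
        raise ValueError("at least one indexer is required")
    assignment = defaultdict(list)
    targets = cycle(indexers)  # endless rotation over the indexer list
    for event in events:
        assignment[next(targets)].append(event)
    return dict(assignment)
```

Calling `forward_round_robin(["e1", "e2", "e3"], ["idx-a", "idx-b"])` sends `e1` and `e3` to `idx-a` and `e2` to `idx-b`. Adding an indexer to the list spreads the same stream over more targets, which is the scaling property the slide describes.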
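
Slide 12 contrasts indexing every term "blindly" with applying knowledge at search time. A minimal sketch of that idea, assuming a plain in-memory inverted index and a caller-supplied regular expression for field extraction, might look like the following. The class `TermIndex`, the function `extract`, and the tokenizing rule are hypothetical and say nothing about how Splunk stores or searches its index.

```python
import re
from collections import defaultdict

# Assumed tokenizer: runs of letters, digits, underscore and dot.
TOKEN = re.compile(r"[A-Za-z0-9_.]+")


class TermIndex:
    """Index every term of every raw event, with no schema up front."""

    def __init__(self):
        self.events = []                  # raw events, kept unmodified
        self.postings = defaultdict(set)  # term -> ids of events containing it

    def add(self, raw):
        event_id = len(self.events)
        self.events.append(raw)
        for term in TOKEN.findall(raw.lower()):
            self.postings[term].add(event_id)
        return event_id

    def search(self, *terms):
        """Return raw events that contain all of the given terms."""
        ids = None
        for term in terms:
            hits = self.postings.get(term.lower(), set())
            ids = hits if ids is None else ids & hits
        return [self.events[i] for i in sorted(ids or [])]


def extract(events, pattern):
    """Late binding: apply a field-extraction regex only at search time."""
    rx = re.compile(pattern)
    return [m.groupdict() for m in (rx.search(e) for e in events) if m]
```

Because structure is imposed only when `extract` runs, the same stored events can be viewed through different patterns (for example `customer=(?P<customer>\d+)` in one search and `order=(?P<order>\d+)` in another) without re-indexing.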
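
Slide 13 lists line breaking, timestamp identification and regex transforms among the parsing pipeline steps. The sketch below strings three such steps together under simplifying assumptions: every event starts with a `YYYY-MM-DD HH:MM:SS` timestamp, continuation lines belong to the preceding event, and the only transform is masking a `password=` value. All function names, the timestamp format and the masking rule are hypothetical illustrations, not Splunk's implementation or configuration.

```python
import re
from datetime import datetime

# Assumed event prefix: "YYYY-MM-DD HH:MM:SS"
TS = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
# Assumed transform: hide whatever follows "password="
MASK = re.compile(r"(password=)\S+")


def break_lines(raw):
    """Line breaking: a line that starts with a timestamp opens a new
    event; any other line (e.g. a stack trace) joins the current event."""
    events = []
    for line in raw.splitlines():
        if TS.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def identify_timestamp(event):
    """Timestamp identification: parse the leading timestamp, if any."""
    m = TS.match(event)
    return datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S") if m else None


def transform(event):
    """Regex transform: mask sensitive values before indexing."""
    return MASK.sub(r"\1****", event)


def parse(raw):
    """Run the three steps and emit one record per event."""
    return [
        {"_time": identify_timestamp(e), "_raw": transform(e)}
        for e in break_lines(raw)
    ]
```

Feeding `parse` a block of text containing a two-line error with an indented continuation yields a single record for that error, with `_time` taken from its first line. The slide's other steps (source and event typing, character set normalization) are left out of this sketch.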