Machine Data Analytics
Gain New Insights by Analyzing Machine Logs using Machine Data Analytics and BigInsights.

Half of Fortune 500 companies experience more than 80 hours of system downtime annually. Spread evenly over a year, that amounts to approximately 13 minutes every day. As a consumer, the thought of online banking operations being inaccessible so frequently is disturbing. As a business owner, when systems go down, all processes come to a stop: work in progress is lost, and failure to meet SLAs and contractual obligations can result in expensive fees, adverse publicity, and loss of current and potential future customers. Ultimately, the inability to provide a reliable and stable system results in lost revenue. While the failure of these systems is inevitable, the ability to predict failures in a timely manner and intercept them before they occur is now a requirement.
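The 13-minutes-a-day figure is simple arithmetic; a quick back-of-envelope check (not from the deck itself):

```python
# 80 hours of annual downtime, spread evenly over 365 days.
annual_downtime_hours = 80
minutes_per_day = annual_downtime_hours * 60 / 365
print(f"{minutes_per_day:.1f} minutes/day")  # roughly 13 minutes/day
```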

A possible solution to the problem can be found in the huge volumes of diagnostic big data generated at the hardware, firmware, middleware, application, storage, and management layers, indicating failures or errors. Machine analysis and understanding of this data is becoming an important part of debugging, performance analysis, root cause analysis, and business analysis. In addition to preventing outages, machine data analysis can also provide insights for fraud detection, customer retention, and other important use cases.


Machine Data Analytics: Presentation Transcript

  • Big Data Meetup: Machine Data Analytics. Raghuram Velega, IBM Software Architect, Big Data Analytics. © 2013 IBM Corporation
  • Relevant Operations Data is Huge. A typical enterprise of 5000 servers with 125 applications across 2 or 3 data centers generates in excess of 1.4 TB of data per day; operational data is growing 15-20% per year. Daily metric output: 250 MB of event data from 125,000 events; 125 MB of endpoint management data from 5K servers; 12 GB of performance data for 5000 servers; 1 GB of performance data for 5000 virtual machines; 8 GB of application middleware data; 9 GB of storage data per day; 500 MB of application transaction tracking data for 125 applications; 1 TB of log file data per day; 2 GB of network performance data for data center networks (not access networks); 0.35 TB of security data collected per day. Assumptions: 40% of servers run monitored middleware, averaging 60 metrics each, collected every 15 minutes; average PMDB insert of 1000 bytes, 40 inserts per server; 175K fiber ports with 10 metrics per port, collected every 5 minutes, at 0.5 KB per port; 25K volumes with 10 metrics per volume at 0.5 KB per volume; 0.5 KB * (65K ports and volumes) * 12 * 24 = 9.3 GB/day; 200 MB of log data on average per server (some smaller, some larger; WAS instances typically produce 400-750 MB of logs per day); 180 64-port switches and 4 routers to manage the physical network. Overall data flow is approximately 1 TB of unstructured data and 0.4 TB of metric data per day; scaled to 20K servers, approximately 4 TB unstructured and 1.6 TB metric data.
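The 9.3 GB/day storage figure can be reproduced from the per-port numbers on the slide (a back-of-envelope sketch; the slide's formula is read as 0.5 KB per sample, consistent with the ".5KB per port" assumption stated alongside it):

```python
# Reproduce the slide's storage-data estimate:
# 65K ports and volumes, sampled every 5 minutes (12x/hour, 24 hours/day),
# at 0.5 KB per sample.
ports_and_volumes = 65_000
samples_per_day = 12 * 24        # one sample every 5 minutes
kb_per_sample = 0.5
gb_per_day = ports_and_volumes * samples_per_day * kb_per_sample / 1e6
print(f"{gb_per_day:.1f} GB/day")  # ~9.4 GB/day, matching the slide's 9.3
```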
  • Shifting Market for IT Operations: Operational Visibility. An APM Digest survey* of senior IT operations staff at Fortune 500 companies found: 50% report growing dissatisfaction with traditional performance management solutions for production IT, citing an inability to adapt to rapidly changing applications and workloads; 30% believe they do not have a way to proactively detect problems; many are looking to operate on raw data and gain actionable insights. IT is overwhelmed by data; IT analytics solutions can predict, detect, and help solve problems by churning through piles of data and translating it into understandable, relevant information and actionable insights. * Source: APMDigest: http://apmdigest.com/it-analytics-emerging-as-dissatisfaction-grows-with-apm-and-bsm-tools
  • Exploiting IBM's Breadth of Analytics Initiatives. Proactively mitigate risk, attain insights to optimize actions, and reduce cost of ownership across Business, IT Operations, Asset Management, and more. Leveraging analytics for IT Operations includes: simple ad hoc and scheduled reporting to enable comparison of multiple metrics and data sources; self-learning capabilities to automatically adapt to change; reduced false alerts to lower management costs; noticing problems sooner and more accurately; performance trending to plan for growth; automated threshold setting for quicker deployment; detecting capacity issues prior to business impact; streaming data analytics to provide real-time information and process big data volumes easily; predictive analytics enabling forecasting and trending to provide foresight into resource demand, capacity, and availability and clarify potential risks; holistic and accurate diagnosis using guiding technology with behavioral learning capabilities; and advanced correlation and pattern recognition to identify and resolve complex and otherwise undetectable events in real time. (InfoSphere BigInsights)
  • IT Operations needs analytics to predict, to search, and to optimize. Predict: How can we get early warning of failures in my critical retail applications? Can we predict/project failure occurrences for specific asset types? Can I predict which KPIs are going to cause application issues without manually configuring thresholds? (I have hundreds of thousands of KPIs.) I want to predict my online banking outages and take corrective actions before customers hit them. What is driving my high maintenance costs, and what can I do to address this? Search: How do we make sense of the terabytes of metric and log data generated by our applications and the infrastructure on which they run, to isolate problems and reduce downtime? How can I reduce reserved material inventory due to work order backlog? Can I use analysis of my channel traffic to achieve improved customer insight and intelligence? "What if" we change our preventive maintenance strategy? Optimize: Help me track capacity and performance of applications and services in cloud/virtual environments; when do I need to add more capacity? Show me how to reduce the cost of running my virtual infrastructure and make it more compliant with best practices. How should I plan maintenance to efficiently keep my assets operational, given what I know today about my six-month resource availability?
  • How Can the Big Data Platform Help? Raghuram Velega, IBM Software Architect (Big Data Analytics)
  • IBM Provides a Holistic and Integrated Approach to Big Data and Analytics. Layers: CONSULTING and IMPLEMENTATION SERVICES; SOLUTIONS (Sales | Marketing | Finance | Operations | IT | Risk | HR; Industry); ANALYTICS (Decision Management, Content Analytics, Business Intelligence and Predictive Analytics, Risk Analytics, Performance Management, Content Management); BIG DATA PLATFORM (Hadoop System, Stream Computing, Data Warehouse, Information Integration and Governance); SECURITY, SYSTEMS, STORAGE AND CLOUD. Enabling organizations to: assemble and combine a relevant mix of information; discover and explore with smart visualizations; analyze, predict, and automate for more accurate answers; take action and automate processes; optimize analytical performance and IT costs; reduce infrastructure complexity and cost; manage, govern, and secure information.
  • The Platform for New Insight and Applications. BIG DATA PLATFORM: Systems Management, Application Development, Discovery, Accelerators; Hadoop System, Stream Computing, Data Warehouse, Information Integration & Governance. InfoSphere Data Explorer: discover, understand, search, and navigate federated sources of big data. InfoSphere BigInsights: cost-effectively analyze petabytes of unstructured and structured data. InfoSphere Streams: analyze streaming data and large data bursts for real-time insights. Data types: media, content, machine, social.
  • The 5 High-Value Big Data Use Cases. Big Data Exploration: find, visualize, and understand all big data to improve business knowledge. Enhanced 360° View of the Customer: achieve a true unified view, incorporating internal and external sources. Security/Intelligence Extension: lower risk, detect fraud, and monitor cyber security in real time. Operations Analysis: analyze a variety of machine data for improved business results. Data Warehouse Augmentation: integrate big data and data warehouse capabilities to increase operational efficiency.
  • Observed Big Data Use Cases (from a bar chart of use-case frequency; counts ranged from 197 down to 4): Machine Data Analysis; Customer behavior/Social analysis; Database offload, reporting, mining; Text Analytics; Telco apps; Audio, video, image analysis; Analytic apps; Cyber security; Geospatial location/Space exploration; Statistical/predictive analysis; Financial apps; Algo trading; Fraud/Risk; Real-time processing; Environmental sensor apps; Smart grid apps; Event processing; File storage or ECM offload; Medical/Transcriptional profiling; Transportation/SCM; BigInsights as NoSQL store. Source: multiple websites; n=933, available data for n=812; count of use cases is not mutually exclusive. (12/11/2013)
  • Big Data Creates a Challenge – And an Opportunity. What if you could: leverage all of the data captured; reduce the effort required to leverage data; let data lead the way, and continuously explore; leverage data as it is captured – in motion – rather than taking the traditional big data approach?
  • IBM InfoSphere BigInsights: Machine Data Analytics
  • Machine Data Analytics: Customer Example • Intelligent Infrastructure Management: log analytics, energy bill forecasting, energy consumption optimization, anomalous energy usage detection, presence-aware energy management • Optimized building energy consumption with centralized monitoring; Automated preventive and corrective maintenance • Utilized InfoSphere Streams, InfoSphere BigInsights, IBM Cognos Would Operations Analysis benefit you? Do you deal with large volumes of machine data? How do you access and search that data? How do you perform root cause analysis? How do you perform complex real-time analysis to correlate across different data sets? How do you monitor and visualize streaming data in real time and generate alerts? Product Starting Point: InfoSphere BigInsights, InfoSphere Streams
  • BigInsights : Machine Data Analytics Raw Logs and Machine Data Indexing, Search Only store what is needed Statistical Modeling Machine Data Accelerator Root Cause Analysis Real-time Analysis Federated Navigation & Discovery
  • Taking Full Advantage of Machine Data Requires New Thinking Machine Data Characteristics From variety of complex systems with complex formats – no standards May not always have context Structured and unstructured data Extremely large volumes of data Streaming data as well as data at rest Time sensitive - agile in interpretation and ability to respond Requires sophisticated text analysis Adaptive/dynamic algorithms to efficiently process data Large scale indexing
  • Taking Full Advantage of Machine Data Requires New Thinking Correlation across different data sets and/or different environments Data may need to be enriched or transformed to provide proper context Causal analysis (if problem on Tuesday, what happened on Monday to cause this) Pattern analysis Time and spatial based analysis Unique Visualization/UI needs based on data type and industry/application Sophisticated search capabilities.
  • Customer Usage Pattern of Log Analysis with MDA Step 1: − “What is happening in my systems?” Step 2: − “Let me try to use my experience to correlate the events and sequence” Step 3: − “I need a tool to do Step 2 – I have too many systems and too many logs” Step 4: − “I need to combine with my system KPI data and monitor / report in a dashboard. Provide possible solutions to the problem / anomaly” Step 5: − “I need to predict the behavior when I make changes, add error codes, or add new systems”
  • Step 1: What is happening in my system? This is accomplished by ingesting all the log data, then extracting, parsing, indexing, and searching it through a faceted interface. This is also the phase where basic event-level metrics – max, min, counts, built-in range metrics, alerts when KPIs are not in range – are desired and tested. Dashboards that are dynamic and actionable in sync with the searches are highly desirable. The MDA provides the faceted search interface. KEY TECHNOLOGIES – Text Analytics, Faceted Search, BI
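The event-level metrics described here can be sketched over hypothetical parsed records (in MDA these would come out of the text-extraction stage; the field names and KPI threshold below are assumptions):

```python
from collections import Counter

# Hypothetical parsed log records, as produced by an extract stage.
records = [
    {"severity": "ERROR", "latency_ms": 950},
    {"severity": "INFO",  "latency_ms": 120},
    {"severity": "ERROR", "latency_ms": 1800},
    {"severity": "WARN",  "latency_ms": 300},
]

# Basic event-level metrics: counts, min/max, and a simple KPI range alert.
counts = Counter(r["severity"] for r in records)
latencies = [r["latency_ms"] for r in records]
alerts = [r for r in records if r["latency_ms"] > 1000]  # assumed threshold

print(counts["ERROR"], min(latencies), max(latencies), len(alerts))
```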
  • Step 2: Let me correlate events In this phase, the customer performs searches and endeavors to make sense of the events and sequences − We usually work side by side with the customer in this stage − We extract the vital tribal knowledge about the applications in the domain − We log their “experiential” notions of event sequences and correlations – this is essential to verify results when the user wants to go to Step 3. KEY TECHNOLOGIES – BigSheets
  • Step 3: I have too many systems and logs to correlate In this phase, the customer essentially wants to find relationships and patterns of occurrence between log events across systems and applications. The MDA uses sessionization and sequence mining capabilities to accomplish this step. KEY TECHNOLOGIES – Text Analytics, Machine Learning
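The idea can be sketched with a fixed time-gap sessionizer and consecutive-pair counting (a simplified stand-in for MDA's sequence mining; the event codes and the 60-second gap rule are hypothetical):

```python
from collections import Counter

# Hypothetical (timestamp_seconds, event_code) stream from one system.
events = [(0, "A"), (5, "B"), (8, "C"), (200, "A"), (204, "B"), (500, "X")]

# Sessionize: start a new session when the gap between events exceeds 60s.
sessions, current, last_t = [], [], None
for t, code in events:
    if last_t is not None and t - last_t > 60:
        sessions.append(current)
        current = []
    current.append(code)
    last_t = t
sessions.append(current)

# Count consecutive event pairs across sessions: frequent pairs hint at
# recurring event sequences worth mining further.
pairs = Counter((s[i], s[i + 1]) for s in sessions for i in range(len(s) - 1))
print(sessions, pairs.most_common(1))
```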
  • Step 4: Combine with my KPI, Topology data Once Step 3 is completed, the integration with the KPI, topology, and monitoring data is possible. This step allows us to expose the capabilities to the Network Operator and end user. KEY TECHNOLOGIES – Data Joins, SQL/JAQL, Big Sheets, Reporting Dashboards
  • Step 5: Predict events based on patterns The more advanced customers and network operators would like to build predictive models based on the patterns they see in the events in log data. Customers want to build models that help with meeting enterprise SLAs for systems Downtime scheduling for systems is a complex problem for most data centers. KEY TECHNOLOGIES – Machine Learning (R, SPSS, System ML)
  • High-Level Workflow Apply Adapter
  • Import What – Copy the logs from the machines where they are generated into HDFS. How – BigInsights Distributed Copy App + MDA extensions Advantages • Uses ftp/sftp protocols supported by the Distributed Copy App • MDA extensions allow batch incremental processing and batch replacement • MDA extensions allow associating metadata, like server names or any other, which is available to downstream analysis
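The batch-incremental idea can be sketched as follows (the function name, file layout, and state file are hypothetical; the real app copies into HDFS via the Distributed Copy App over ftp/sftp):

```python
import json
from pathlib import Path

def incremental_import(src_dir: str, state_file: str, server_name: str):
    """Select only log files not seen in a previous batch, tagging each with
    metadata (here just the source server name) for downstream analysis."""
    state_path = Path(state_file)
    seen = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    batch = []
    for path in sorted(Path(src_dir).glob("*.log")):
        if path.name not in seen:
            batch.append({"file": path.name, "server": server_name})
            seen.add(path.name)
    state_path.write_text(json.dumps(sorted(seen)))
    return batch  # in MDA, this batch would then be copied into HDFS
```

Running it a second time against the same directory returns an empty batch, which is the incremental behavior the extensions provide.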
  • Extract What – Identify log record boundaries – Extract information from log records in text and XML How – BigInsights Text Analytics Advantages – Robust text extraction using an SQL-like language • Avoid ‘brittle’ custom parsers – Library of extractors for common log files • Syslogs, websphere, web access, datapower, csv, generic – Extensive tooling for custom extractor development and app customization • Eclipse based IDE
  • The Extract Stage: Text analytics applied to log files. Raw log files (HDFS/GPFS) → record splitting → log records (text) → field and entity extraction (AQL) → semi-structured data (JSON) → to Transform stage. AQL extractors are available for many common formats [syslog, websphere, csv, ...]; BigInsights ships with tools for creating new extractors.
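The splitting-and-extraction flow can be sketched in plain Python (a hedged stand-in: MDA itself uses AQL extractors, and the syslog-like format and regexes below are hypothetical):

```python
import re

raw = """\
Jan 02 10:15:01 host1 app[42]: ERROR timeout
  retrying connection
Jan 02 10:15:05 host1 app[42]: INFO recovered"""

# Record boundary: a line starting with a syslog-style timestamp.
# Indented continuation lines stay attached to the preceding record.
boundary = re.compile(r"(?m)^(?=[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} )")
records = [r for r in boundary.split(raw) if r]

# Field extraction into semi-structured (JSON-like) form.
field = re.compile(
    r"(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<proc>[^:]+): (?P<level>\w+) (?P<msg>.+)", re.S
)
parsed = [field.match(r).groupdict() for r in records]
print(len(parsed), parsed[0]["level"])  # 2 ERROR
```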
  • Index What − Index and facet extracted records and fields so they are available for searching via the faceted search user interface How − BigInsights BigIndex Advantages Find correlated log entries based on time through an interactive UI Add/inject other data (e.g., Excel) to enrich log context Allow operations staff to quickly find log entries based on search terms such as web service name, server name, exception code, or transaction ID
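At its core, faceting maps each extracted field value to the set of matching record IDs; a toy sketch of that postings structure (not the BigIndex API; record IDs and field names are hypothetical):

```python
from collections import defaultdict

# Hypothetical extracted records, each with an assigned record ID.
records = [
    {"id": 1, "server": "web01", "exception": "Timeout"},
    {"id": 2, "server": "web02", "exception": "Timeout"},
    {"id": 3, "server": "web01", "exception": "NullPointer"},
]

# Build facet -> value -> record-ID postings, the core of faceted search.
facets = defaultdict(lambda: defaultdict(set))
for r in records:
    for field in ("server", "exception"):
        facets[field][r[field]].add(r["id"])

# Drill-down = set intersection: all Timeout records on web01.
hits = facets["exception"]["Timeout"] & facets["server"]["web01"]
print(sorted(facets["exception"]["Timeout"]), sorted(hits))
```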
  • Transform What – Link and enrich log information from different entities • Find relationships between log records • Integrate structured data with log data – network configuration, user account information… How – JAQL Advantages – High level language that is Big Data aware – Out of the box transformers – Extensive tooling for application customization • Eclipse IDE
  • The Transform Stage: Linking logs and other information from varied sources. Input: parsed log records (HDFS/GPFS) plus additional structured data from non-log sources (network configuration, performance data, fault data). Link logs corresponding to: 1. IT logs of a single business activity or transaction, up and down the IT stack (web log, MQ log, transaction log, server log, network log); 2. logs of an activity across one layer of the IT stack (e.g., the OS layer, or messages flowing through a sequence of routers); 3. ... Output: individual log records from different IT entities, linked and enriched (HDFS/GPFS), feeding correlations, predictive models, and outlier detection.
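The linking step amounts to a join between parsed log records and structured context on a shared key; a minimal sketch (the field names are hypothetical; MDA performs this at scale with JAQL):

```python
# Hypothetical parsed log records and structured network-configuration rows.
logs = [
    {"host": "app01", "event": "ERROR db timeout"},
    {"host": "app02", "event": "INFO started"},
]
config = {
    "app01": {"datacenter": "dc-east", "tier": "web"},
    "app02": {"datacenter": "dc-west", "tier": "batch"},
}

# Enrich each log record with context from the structured source,
# joining on the shared host key.
enriched = [{**r, **config.get(r["host"], {})} for r in logs]
print(enriched[0]["datacenter"])  # dc-east
```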
  • Analyze What – Correlate across fields – Find frequently occurring sequences and combinations of events – Potential for predictive modeling in the future How – System ML Advantages – Scalable to perform analytics on Big Data – Flexible and customizable – Easy to plug into applications via a JAQL/Java interface
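As a small illustration of correlating across fields, a plain Pearson correlation between two hypothetical KPI series (a stand-in for System ML's scalable statistics; the series values are made up):

```python
import statistics

# Hypothetical per-hour KPI series: error count vs. CPU utilization (%).
errors = [2, 5, 9, 4, 12, 7]
cpu    = [30, 45, 70, 40, 85, 55]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient over two equal-length series."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# A value near 1.0 suggests the error KPI tracks CPU load.
print(f"{pearson(errors, cpu):.2f}")
```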
  • Agenda Introduction High Level Workflow Some Highlights Demo
  • Machine Data Adapters What are Adapters − Adapt a variety of inputs to a standard output Why do we need Machine Data Adapters − To handle different ‘machine data’ formats
  • Adapters in High-Level Workflow Apply Adapter
  • Adapter Functions Create − Enter Adapter-Name, LogType, ‘sample machine data’, and the first ‘timestamp’ in the ‘sample machine data’ − Check the recommended ‘DataTime Format’ and ‘preTimeStamp Regex’, and select defaults like ‘timezone’, ‘year’, and ‘month’ − Verify the extracted output and save it if it looks good − If the extracted output is bad, go back and edit the ‘Data Source Type’, ‘DataTime Format’, and ‘preTimeStamp Regex’ parameters Edit View Apply Delete
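The create flow boils down to a record-boundary regex plus a datetime format; a hedged sketch of how such an adapter might apply them (the function and parameter names are hypothetical, and for simplicity the regex here matches the timestamp itself):

```python
import re
from datetime import datetime

def apply_adapter(sample: str, ts_regex: str, datetime_fmt: str):
    """Split raw machine data into records at each timestamp match and
    normalize the timestamp -- roughly what an MDA adapter configures."""
    matches = list(re.compile(ts_regex).finditer(sample))
    out = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(sample)
        out.append({
            "timestamp": datetime.strptime(m.group(0), datetime_fmt),
            "body": sample[m.end():end].strip(),
        })
    return out

sample = "2013-12-11 09:00:01 disk warning\n2013-12-11 09:05:33 disk failure"
records = apply_adapter(sample, r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}",
                        "%Y-%m-%d %H:%M:%S")
print(len(records), records[1]["body"])  # 2 disk failure
```

Verifying the extracted output, as the slide describes, corresponds to checking that the regex and format pair splits and parses a sample correctly before saving the adapter.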
  • Create Machine Data Adapter – Step-1
  • Create Machine Data Adapter – Step-2
  • Create Machine Data Adapter – Step-3
  • Display Machine Data Adapter
  • Edit Machine Data Adapter – Step-1
  • Edit Machine Data Adapter – Step-2
  • Edit Machine Data Adapter – Step-3
  • Display Machine Data Adapter
  • Apply Machine Data Adapter
  • Verify the Adapter (metadata.json)
  • Delete
  • Data Explorer for Indexing Application Data Explorer Index Configuration File to support generic schema for extracted machine data. Parallelizing data pushing to Data Explorer Indexer. Run Data Explorer Index Application
  • Data Explorer Index Configuration File The Data Explorer index config file specifies which fields to index, which field contains the record ID, and the Data Explorer index field definitions: field name, type, searchable, retrievable, filterable, and sortable. A default index configuration file is provided. Example:
    { "source": { "fieldName": "LogDatetime[].normalized_text", "dateFormat": "MMM dd yyyy HH:mm:ss.SSS Z" },
      "target": { "deFieldName": "LogDatetime", "type": "Date", "searchable": true, "retrievable": true, "filterable": true, "sortable": true, "isRecordID": false, "suppress": false } }
  • Parallelizing data pushing to the Data Explorer Indexer The application uses an Oozie Jaql action to parallelize the job into multiple tasks: the indexing app (on the BI platform/IDE) launches Jaql Hadoop tasks 1..M against HDFS; each task locates shards via the BigSearch ZooKeeper cluster and pushes data to the Data Explorer back ends hosting shards 1..N.
  • Run Data Explorer index Application
  • Basic Facet Search UI on Application Builder
  • BI Log Monitoring and Analysis • Ingest BigInsights logs in HBase in real time. • Create Log Monitoring Extraction application that extracts log records from HBase. • Create Index Management application to delete old index log records from DFS. • Embed the MDA Search UI within the BigInsights Dashboard for BigInsights log search.
  • Ingesting BigInsights Logs into HBase Chukwa agents are set up on the Name Node and each of the Data Nodes. Adapters are programmatically installed and removed depending on user configuration. A custom Chukwa writer class was created to add logs into HBase in real time. A Log4j interface streams logs to the adapters, which stream logs to HBase. Different log types are concurrently recorded in HBase in a single table.
  • Data Collection Diagram The deployment spans a Name Node (Hadoop Name Node, Hadoop Secondary Name Node, Hadoop JobTracker) and Data Nodes 1-3 (each with a Hadoop Data Node, Hadoop Task Tracker, and Hadoop Task Attempt), with HBase alongside HDFS with Apache MapReduce. For an HDFS with Symphony MapReduce installation: Hadoop Data Node, Hadoop Name Node, and Hadoop Secondary Name Node logs are supported. For a GPFS with Apache MapReduce installation: Hadoop JobTracker, Hadoop Task Tracker, and Hadoop Task Attempt logs are supported. For a GPFS with Symphony MapReduce installation: only Hadoop Task Attempt logs are supported.
  • BigInsights Dashboard The user starts BigInsights log collection from the LogCollection app, and can stop it from the LogCollection app or by turning off monitoring. The MDA Search UI is wrapped in a frame in the BigInsights Dashboard.
  • Dashboard
  • LogCollection app.
  • BigInsights Log Monitoring Application Is a BigInsights chained application containing the Log Monitoring Extraction application and the Index application. It assumes that the Log Monitoring Extraction application is running in schedule mode, that the BigInsights Logs workflow is selected for the Index application, that any configuration files are the default configuration files installed with MDA, and that the “Index Only New Logs” check-box in the Index application is unchecked.
  • BigInsights Log Monitoring Application
  • Agenda Introduction High Level Workflow New Features in MDA 2.1 Demo