Harnessing the value of big data analytics

Published in: Education, Technology

  • 1. BIG DATA is not just HADOOP
      • Understand and navigate federated big data sources: Federated Discovery and Navigation
      • Manage & store huge volume of any data: Hadoop File System, MapReduce
      • Structure and control data: Data Warehousing
      • Manage streaming data: Stream Computing
      • Analyze unstructured data: Text Analytics Engine
      • Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM
  • 2. Business-Centric Big Data Enables You to Start With a Critical Business Pain and Expand the Foundation for Future Requirements
      Corresponding tools/products
      • “Big data” isn’t just a technology; it’s a business strategy for capitalizing on information resources
      • Getting started is crucial
      • Success at each entry point is accelerated by products within the Big Data platform
      • Build the foundation for future requirements by expanding further into the big data platform
  • 3. Velocity, Variety, Volume
  • 4. Merging the Traditional and Big Data Approaches
      • Traditional Approach: Structured & Repeatable Analysis
          • Business users determine what question to ask
          • IT structures the data to answer that question
          • Examples: monthly sales reports, profitability analysis, customer surveys
      • Big Data Approach: Iterative & Exploratory Analysis
          • IT delivers a platform to enable creative discovery
          • Business explores what questions could be asked
          • Examples: brand sentiment, product strategy, maximum asset utilization
  • 5. Raw Data → Valuable Data Assets
  • 6.
      A) Data Refinery Platform
      B) Data Discovery Platform
      C) Analytical Tools and Techniques
      D) Integrated Data Warehouse
      E) Distinct Execution Engine
      F) Library of Pre-Built Analytic Functions
      G) Interactive Development Tool
  • 7.
      • SQL for structured data and MapReduce (MR) for large-scale process analytics
      • Manage relational and non-relational data in and out of the data warehouse
      • Iterative analytics with greater accuracy and effectiveness
      • Dig deeper for insights
      • Within budget
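Slide 7 pairs SQL with MapReduce (MR). As a rough illustration of the MR half, the sketch below runs the map, shuffle and reduce steps inside a single Python process. It is not Hadoop code and not taken from the slides: the log format, `status_mapper`, `count_reducer` and the sample lines are all assumptions made for this example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to each record and collect its (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def status_mapper(line):
    # Assumed (hypothetical) log format: "<date> <status> <message...>"
    parts = line.split()
    return [(parts[1], 1)] if len(parts) >= 2 else []


def count_reducer(key, values):
    return sum(values)


# Hypothetical sample input, including one line that does not parse.
LOG_LINES = [
    "2014-01-01 ERROR payment timeout",
    "2014-01-01 OK order placed",
    "2014-01-02 ERROR inventory lookup failed",
    "malformed",
]

status_counts = reduce_phase(shuffle(map_phase(LOG_LINES, status_mapper)), count_reducer)
print(status_counts)
```

On a real cluster the map and reduce functions would run in parallel across many nodes; the single-process version only shows how the three steps fit together.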
  • 8. Data Task and Potential Workloads
      • Low-cost storage
          • Retains raw data in a manner that can provide low TCO-per-terabyte storage costs and retention
          • Requires access in deep storage, but not at the same speeds as in a front-line system
      • Loading
          • Brings data into the system from the source system
      • Pre-processing/prep/cleansing/constraint validation
          • Prepares data for downstream processing by, for example, fetching dimension data, recording a new incoming batch, or archiving an old window batch
      • Transformation
          • Converts one structure of data into another structure. This may require going from third-normal form in a relational database to a star or snowflake schema, or from text to a relational database, or from relational technology to a graph, as with structural transformations.
      • Reporting
          • Queries historical data such as what happened, where it happened, how much happened, who did it (e.g., sales of a given product by region)
      • Analytics (including user-driven, interactive, or ad-hoc)
          • Performs relationship modeling via declarative SQL (e.g., scoring or basic stats)
          • Performs relationship modeling via procedural MapReduce (e.g., model building or time series)
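The Reporting row on slide 8 gives "sales of a given product by region" as its example of a declarative SQL query over historical data. A minimal sketch of such a query, using Python's built-in `sqlite3` module with an in-memory database, might look like the following. The `sales` table, its columns and the sample rows are hypothetical and are not part of the presentation.

```python
import sqlite3


def sales_by_region(rows, product):
    """Return (region, total_amount) pairs for one product, sorted by region."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical flat sales table; a real warehouse would use a star schema.
        conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales "
            "WHERE product = ? GROUP BY region ORDER BY region",
            (product,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data.
SAMPLE_ROWS = [
    ("widget", "east", 100.0),
    ("widget", "west", 50.0),
    ("widget", "east", 25.0),
    ("gadget", "east", 10.0),
]

report = sales_by_region(SAMPLE_ROWS, "widget")
print(report)
```

The query states what result is wanted (totals grouped by region) and leaves the execution plan to the database, which is the sense in which the slide calls SQL declarative.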
  • 9.
      • Stable (Structured)
          • Relatively fixed, infrequent change
          • Leverage strength of relational model & SQL
      • Evolving (Semi-Structured)
          • Fixed and variable schema elements, but changes occur too quickly
          • Leverage backend RDBMS, “late binding” of structure by queries
      • No Schema (Has Format Only)
          • Less relational, no semantics; stored in native file formats
          • Via MapReduce: interpret the format and pull out the required data
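One way to read the "late binding" idea on slide 9 is that semi-structured records are stored as they arrive, and each query decides which fields it needs at read time. The sketch below illustrates that reading with JSON lines. The `query` helper, the field names and the sample events are assumptions made for this example, not something the slides define.

```python
import json

# Hypothetical raw events stored as-is; the records do not share one fixed schema.
RAW_EVENTS = [
    '{"user": "a1", "event": "call", "duration_s": 120}',
    '{"user": "b2", "event": "click", "page": "/home"}',
    '{"user": "a1", "event": "call", "duration_s": 60, "agent": "x9"}',
]


def query(raw_lines, fields, where=None):
    """Parse each line at read time and project only the requested fields.

    Fields a record does not have come back as None instead of failing,
    so new attributes can appear in the data without a schema migration.
    """
    results = []
    for line in raw_lines:
        record = json.loads(line)
        if where is not None and not where(record):
            continue
        results.append(tuple(record.get(name) for name in fields))
    return results


calls = query(RAW_EVENTS, ["user", "duration_s"], where=lambda r: r.get("event") == "call")
pages = query(RAW_EVENTS, ["user", "page"])
print(calls)
print(pages)
```

The trade-off is that every query pays the parsing cost, whereas a fixed relational schema pays it once at load time.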
  • 10.
      • Stable: ERP data, inventory records
      • Evolving: web logs, call records, Twitter feeds
      • No Schema: images, videos, web pages
  • 11. What Does Machine Data Look Like?
      Sources: Order Processing, Middleware Error, Care IVR, Twitter
  • 12. Machine Data Contains Critical Insights
      Sources:
      • Order Processing: Customer ID, Order ID, Product ID
      • Middleware Error: Order ID, Customer ID
      • Care IVR: Time Waiting On Hold, Customer ID
      • Twitter: Twitter ID, Company’s Twitter ID, Customer’s Tweet
  • 13. Machine Data Contains Critical Insights
      Sources:
      • Order Processing: Customer ID, Order ID, Product ID
      • Middleware Error: Order ID, Customer ID
      • Care IVR: Time Waiting On Hold, Customer ID
      • Twitter: Twitter ID, Company’s Twitter ID, Customer’s Tweet
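The "Machine Data Contains Critical Insights" slides show the same identifiers (Customer ID, Order ID) recurring across otherwise unrelated sources. A rough sketch of how those shared keys could be used to assemble one view of a customer follows. The record layouts, the sample values and in particular the `twitter_lookup` mapping from Customer ID to Twitter ID are assumptions made for this example; the slides do not say how a customer is matched to a Twitter account.

```python
def customer_view(customer_id, orders, errors, ivr_calls, tweets, twitter_lookup):
    """Correlate records from four sources through their shared identifiers."""
    order_ids = {o["order_id"] for o in orders if o["customer_id"] == customer_id}
    # Assumed lookup table; returns None when the customer has no known handle.
    twitter_id = twitter_lookup.get(customer_id)
    return {
        "orders": sorted(order_ids),
        "failed_orders": sorted(
            e["order_id"] for e in errors if e["order_id"] in order_ids
        ),
        "hold_seconds": sum(
            c["hold_seconds"] for c in ivr_calls if c["customer_id"] == customer_id
        ),
        "tweets": [
            t["text"]
            for t in tweets
            if twitter_id is not None and t["twitter_id"] == twitter_id
        ],
    }


# Hypothetical sample records, one list per source named on the slide.
ORDERS = [
    {"customer_id": "C1", "order_id": "O1", "product_id": "P9"},
    {"customer_id": "C2", "order_id": "O2", "product_id": "P3"},
]
ERRORS = [{"order_id": "O1", "customer_id": "C1"}]
IVR_CALLS = [{"customer_id": "C1", "hold_seconds": 600}]
TWEETS = [
    {"twitter_id": "@c1_handle", "company_twitter_id": "@acme", "text": "Still on hold."}
]
TWITTER_LOOKUP = {"C1": "@c1_handle"}

view = customer_view("C1", ORDERS, ERRORS, IVR_CALLS, TWEETS, TWITTER_LOOKUP)
print(view)
```

Read together, the joined records tell a story that no single source does: an order failed in middleware, the customer waited on hold, and then complained publicly.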
  • 14. Hadoop captures, stores and transforms images and call records
      Diagram labels:
      • Data Sources: Call Center Voice Records, Check Images, Social and web data
      • Capture, Retention and Transformation Layer: Hadoop
      • Traditional workflow: ETL tools, Dimensional Data, Integrated DW
      • Discovery Platform: path and sentiment analysis with multistructured data
      • Analytic Results: Analysis and Marketing Automation (Customer Retention Campaign)