Dai Clegg (Big Data Evangelist IBM) - Babies, Buses and Movies; some examples of the value in big data analytics



Presentation by Big Data Evangelist Dai Clegg (IBM), 'Babies, Buses and Movies; some examples of the value in big data analytics', given at the Big Data Analytics seminar on 14 June, organised by Almere DataCapital in Almere.



  • Here is another example: something the University of Southern California (USC) Annenberg School of Communication did with the BigSheets technology in the IBM big data platform. USC Annenberg created the Film Forecaster tool and used it to correctly predict the summer blockbusters of 2011 by scraping Twitter and scoring the tweets against a simple lexicon of words that indicate a positive or negative reception for a movie. The work made quite an impact: it was featured on ABC News (a national news network in the USA). More striking is the quote: the application was built by a communication master’s student who learned BigSheets in a day.
  • This picture is a little simplistic, for two reasons.
    First, it gives pre-eminence to Netezza. That is because Netezza’s simplicity, performance and agile support for ad-hoc analysis often make it the default proposition for an analytic warehouse in a greenfield situation (though this is not necessarily true if there is an existing commitment to Power or to DB2).
    Second, it does not recognise the difference between exploratory analysis and repeated analysis.
    If you are doing exploratory analysis of relational (i.e. structured) data, Netezza is the better platform; it thrives on ad-hoc analysis and has very rich tooling for analytics (INZA, SPSS etc.).
    Exploratory analysis of unstructured data clearly belongs on BigInsights (BigI). Exploratory analysis of something in between (e.g. CDRs) could be done on Netezza, but if the data is not already being loaded (and even at a Netezza customer the raw XDRs are probably not loaded into the warehouse), then exploration in a low-cost Hadoop grid makes a lot of sense. We have at least one customer use case of this, where the analysis was implemented in Netezza once it became repeatable. But there are also use cases where the repeated analysis remains in BigInsights, exploiting its differentiating enterprise readiness.
  • If it’s data in motion (remember the babies being monitored), it has to be real-time, so it has to be Streams. That’s the easy one.
    If it’s unstructured data at rest, the best place to start is BigInsights, though you may subsequently load data into the relational warehouse for further insight.
    If it’s relational data, it’s unlikely you are going to move it to Hadoop.
    If it’s semi-structured, you have a choice, and you’ll be influenced by these other development factors:
      • An organization may already have developed a map-reduce solution that delivers a high-value analysis of data that was unloaded from the corporate EDW. Is the right answer to say ‘great, now you know the solution, re-code it in SQL using in-database analytics and implement it on your warehouse’? Maybe a better one is to implement BigInsights to enterprise-harden the Hadoop environment and run the application as is, but with the reliability and supportability of a production application.
      • The volume may be so huge that a DWH can’t handle it, and certainly can’t handle it economically (think Vestas).
      • It may be better to go to the platform for which more of the appropriate analytic skills or other development resources are available.
      • The customer may want to build their capability in Hadoop because they will have more challenging use cases later that will be clear-cut BigInsights use cases.
      • The customer may just want to experiment cheaply and quickly (though that’s really more of a BigInsights Basic Edition use case; we’ll be looking to enterprise-harden it later).
    But remember, these are influencers, not deciders. IBMers can adapt to whatever best matches the customer’s needs, because of the comprehensive nature of our big data portfolio.
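The platform choice described in these notes can be read as a small decision procedure. The sketch below is only an illustration of that reasoning, not an IBM tool: the `Workload` class, its field names and the `suggest_platform` function are hypothetical, and the way the "influencers" are collapsed into a single yes/no lean is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an analytics workload (all names are illustrative)."""
    in_motion: bool                        # data must be analysed as it arrives
    structure: str                         # "relational", "semi-structured" or "unstructured"
    has_mapreduce_solution: bool = False   # influencer: a working map-reduce job already exists
    too_big_for_warehouse: bool = False    # influencer: volume is uneconomic for a DWH
    hadoop_skills_available: bool = False  # influencer: skills and resources sit on the Hadoop side
    wants_cheap_experiment: bool = False   # influencer: quick, low-cost exploration


def suggest_platform(w: Workload) -> str:
    """Return a starting suggestion; the notes stress influencers are not deciders."""
    if w.in_motion:
        return "InfoSphere Streams"
    if w.structure == "unstructured":
        return "InfoSphere BigInsights"
    if w.structure == "relational":
        return "Netezza"
    if w.structure == "semi-structured":
        # Assumption: any single influencer is enough to lean towards Hadoop.
        leans_hadoop = any([
            w.has_mapreduce_solution,
            w.too_big_for_warehouse,
            w.hadoop_skills_available,
            w.wants_cheap_experiment,
        ])
        return "InfoSphere BigInsights" if leans_hadoop else "Netezza"
    raise ValueError(f"unknown structure: {w.structure!r}")


if __name__ == "__main__":
    cdrs = Workload(in_motion=False, structure="semi-structured",
                    too_big_for_warehouse=True)
    print(suggest_platform(cdrs))
```

In practice the result is a conversation starter rather than a verdict, since the notes allow the repeated analysis to end up on either platform.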

Presentation Transcript

  • 1. Embracing & Exploiting Big Data: Babies, Motors & Movies. Dai Clegg, IBM big data evangelist. © 2012 IBM Corporation
  • 2. Information Management
    Utilities: Weather impact on power generation; Transmission monitoring; Smart grid management
    Financial Services: Fraud detection; Risk management; 360° View of the Customer
    Transportation: Weather and traffic impact on logistics and fuel consumption
    IT: Transition log analysis for multiple systems; Cybersecurity
    Health & Life Sciences: Epidemic early warning; ICU monitoring; Healthcare monitoring
    Retail: Customer 360° View; Click-stream analysis; Real-time promotions
    Telecommunications: CDR processing; Churn prediction; Geomapping / marketing; Network monitoring
    Law Enforcement: Real-time multimodal surveillance; Situational awareness; Cyber security detection
    Variety: Manage the complexity of multiple relational and non-relational data types and schemas
    Velocity: Streaming data and large volume data movement
    Volume: Scale from terabytes to zettabytes
  • 3. Big: Broad: Brainless: Big + Smart = Insights!
  • 4. Babies. Use case: neonatal infant monitoring; predict infection in ICU 24 hours in advance. Solutions: 120 children monitored (120K msg/sec, billion msg/day); trials expanding to include hospitals in US and China.
  • 5. Babies
  • 6. Motors: Policy & Claims System; Service Centre; Mobile Data Feed; Customer Portal; Analytic Reporting
  • 7. Movies. USC’s Film Forecaster correctly predicted a clamor for "Hangover 2" that resulted in a $100 million opening over Memorial Day weekend. It looked at 250K-500K tweets and broke down positive and negative messages using a lexicon of 1,700 words. "The Film Forecaster sounds like a big undertaking for USC, but it really came down to one communications masters student who learned Big Sheets in a day, then pulled in the tweets and analyzed them." (Ryan Kim)
  • 8. Movies
  • 9. IBM big data platform
    Hadoop: InfoSphere BigInsights. Hadoop-based analytics for variety and volume
    Stream Computing: InfoSphere Streams. Low-latency analytics for streaming data
    Information Integration: InfoSphere Information Server. High-volume data integration and transformation
    MPP Data Warehouse: IBM optimized workload data warehouses. Scalable, high-performance, mixed-workload analytics on structured data
  • 10. IBM big data platform
  • 11. IBM big data platform
    Analytics on Big Data at Rest: InfoSphere BigInsights (Unstructured); IBM Netezza (Structured)
    Analytics on Big Data in Motion: InfoSphere Streams
  • 12. IBM big data platform
  • 13. IBM big data platform
    • Big Data: Volume, Velocity, Variety
    • Combining data types & sources
    • Combining technologies to analyse it
    • Complementing the relational warehouse
  • 14. Information Management © 2012 IBM Corporation
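
The Movies slide describes breaking tweets down into positive and negative messages using a word lexicon. The sketch below shows the general idea of lexicon-based scoring in plain Python. It is not the Film Forecaster or BigSheets implementation: the tiny word lists, the sample tweets and the function names `classify` and `tally` are all made up for illustration, and the real lexicon is said to have had about 1,700 words.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon; the real one reportedly held about 1,700 words.
POSITIVE = {"hilarious", "love", "awesome", "great"}
NEGATIVE = {"boring", "awful", "hate", "terrible"}


def classify(tweet: str) -> str:
    """Label a tweet by counting lexicon hits: positive minus negative."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tally(tweets) -> Counter:
    """Count how many tweets fall into each sentiment label."""
    return Counter(classify(t) for t in tweets)


if __name__ == "__main__":
    sample = [
        "Hangover 2 was hilarious, love it",
        "So boring and awful",
        "Seeing a movie tonight",
    ]
    print(tally(sample))
```

A forecast would then compare the share of positive messages across films; how the original tool weighted or aggregated the counts is not stated in the slides.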