Future of Power: Big Data - Søren Ravn

Transcript

  • 1. © 2013 IBM Corporation Søren Ravn (sravn@dk.ibm.com) Big Data Architect IBM Software Group, Information Management September 4th, 2013 Big Data a Paradigm Shift August 2013 IBM Future of Power Event 2013
  • 2. What is Big Data? Where is it coming from? Where is it going? What can I do with it?
  • 3. A Google search for “What is Big Data” returns 2.9 million hits
  • 4. What is Big Data? A definition: Big Data are datasets that grow so large and/or varied that they become awkward to work with using traditional information management technologies.
  • 5. What is Big Data or Big Data Analytics?
    TDWI: Big data analytics is the application of advanced analytic techniques to very large, diverse data sets that often include varied data types and streaming data.
    Wikipedia: Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization.
    Forbes: Big data is new and “ginormous” and scary – very, very scary…
    U.S. federal Big Data commission report: Big Data is a phenomenon defined by the rapid acceleration in the expanding volume of high velocity, complex, and diverse types of data. Big Data is often defined along three dimensions: volume, velocity, and variety.
    McKinsey Global Institute: Big data: The next frontier for innovation, competition, and productivity
  • 6. U.S. federal Big Data commission report
    The Big Data Commission will provide guidance to the White House and Congress on the use of big data to improve government efficiency, services and capabilities, and drive innovation and the economy.
    • The Commission was formed in May 2012
    • Steve Mills co-chair
    • Brought together experts from Government, Academia and Industry
    • The report seeks to demystify big data, and focus on the business and mission value it will deliver
    • Intent is to provide clear recommendations and a roadmap for getting started
    Find it here: http://ibmdatamag.com/2012/11/demystifying-big-data/
  • 7. “Data is the New Oil”
    “We have for the first time an economy based on a key resource [Information] that is not only renewable, but self-generating. Running out of it is not a problem, but drowning in it is.” – John Naisbitt
    Harvesting any resource requires Mining, Refining and Delivering
    Big Data is the next Natural Resource
  • 8. Integration & Analytics (DW, MDM,…) The unseen information Governance Operational systems
  • 9. Where is big data coming from?
    • 2+ billion people on the Web by end 2011
    • 30 billion RFID tags today (1.3B in 2005)
    • 4.6 billion camera phones worldwide
    • 100s of millions of GPS-enabled devices sold annually
    • 76 million smart meters in 2009… 200M by 2014
    • 12+ TBs of tweet data every day
    • 25+ TBs of log data every day
    • ? TBs of data every day
  • 10. Big data is a hot topic because technology makes it possible to analyze ALL available data
    Cost effectively manage and analyze all available data, in its native form: unstructured, structured, streaming
    ERP; CRM; RFID; Website; Network Switches; Social Media; Billing
  • 11. The characteristics of big data
    • Volume: cost efficiently processing the growing Volume (50x, 35 ZB, 2010 to 2020)
    • Velocity: responding to the increasing Velocity (30 billion RFID sensors and counting)
    • Variety: collectively analyzing the broadening Variety (80% of the world's data is unstructured)
    • Veracity: establishing the Veracity of big data sources (1 in 3 business leaders don’t trust the information they use to make decisions)
  • 12. Extending and Integrating Big Data requires a Holistic Approach
    Traditional Approach: structured, analytical, logical
    • Data Warehouse: Structured, Repeatable, Linear
    • Traditional Sources: Internal App Data; Transaction Data; Mainframe Data; OLTP System Data; ERP Data
    New Approach: creative, holistic thought, intuition
    • Hadoop and Streams: Unstructured, Exploratory, Dynamic
    • New Sources: Multimedia; Web Logs; Social Data; Sensor data: images; RFID; Text Data: emails
  • 13. New Architecture to Leverage All Data and Analytics
    Data in Motion; Data at Rest; Data in Many Forms
    Information Ingestion and Operational Information: Stream Processing; Data Integration; Master Data; Streams
    Landing Area, Analytics Zone and Archive: Raw Data; Structured Data; Text Analytics; Data Mining; Entity Analytics; Machine Learning
    Real-time Analytics: Video/Audio; Network/Sensor; Entity Analytics; Predictive
    Exploration, Integrated Warehouse, and Mart Zones: Discovery; Deep Reflection; Operational; Predictive
    Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis
    Information Governance, Security and Business Continuity
  • 14. Big Data Use Cases
    • Big Data Exploration: Find, visualize, understand all big data to improve decision making
    • Enhanced 360º View of the Customer: Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
    • Operations Analysis: Analyze a variety of machine data for improved business results
    • Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency
    • Security/Intelligence Extension: Lower risk, detect fraud and monitor cyber security in real time
  • 16. Vestas optimizes capital investments based on 2.5 Petabytes of information
    Need
    • Model the weather to optimize placement of turbines, maximizing power generation and life expectancy
    Benefits
    • Reduces the time required to identify turbine placement from weeks to hours
    • Reduces IT footprint and costs, and decreases energy consumption by 40%, while increasing computational power
    • Incorporates 2.5 PB of structured and semi-structured information flows; data volume is expected to grow to 6 PB
  • 18. How do you correlate information across different data sets, e.g., social media and trusted enterprise data? How do you decide the next best action when dealing with customers? How do you monitor and visualize data in real time and generate alerts? Is your customer data distributed among many different applications and sources? How do you deliver it in usable form to the employees who need it?
  • 19. Enhanced 360º View of the Customer
    Optimize every customer interaction by knowing everything about them
    Requirements
    • Create a connected picture of the customer
    • Mine all existing and new sources of information
    • Analyze social media to uncover sentiment about products
    • Add value by optimizing every client interaction
    Industry Examples
    • Smart meter analysis
    • Telco data location monetization
    • Retail marketing optimization
    • Travel and Transport customer analytics and loyalty marketing
    • Financial Services Next Best Action and customer retention
    • Automotive warranty claims
  • 20. Enhanced 360º View of the Customer: In Practice
    360º View of Party Identity
    SOURCE SYSTEMS
    • CRM: Name: J Robertson; Address: 35 West 15th; Address: Pittsburgh, PA 15213
    • ERP: Name: Janet Robertson; Address: 35 West 15th St.; Address: Pittsburgh, PA 15213
    • Legacy: Name: Jan Robertson; Address: 36 West 15th St.; Address: Pittsburgh, PA 15213
    Unified View of Party’s Information
    • First: Janet; Last: Robertson; Address: 35 West 15th St; City: Pittsburgh; State/Zip: PA / 15213; Gender: F; Age: 48; DOB: 1/4/64
    InfoSphere MDM; BigInsights; Streams; Warehouse; InfoSphere Data Explorer
  • 22. Security/Intelligence Extension
    Security/Intelligence Extension enhances traditional security solutions by analyzing all types and sources of under-leveraged data
    Enhanced Intelligence & Surveillance Insight: analyze data-in-motion & at rest to:
    • Find associations
    • Uncover patterns and facts
    • Maintain currency of information
    Real-time Cyber Attack Prediction & Mitigation: analyze network traffic to:
    • Discover new threats early
    • Detect known complex threats
    • Take action in real time
    Crime prediction & protection: analyze Telco & social data to:
    • Gather criminal evidence
    • Prevent criminal activities
    • Proactively apprehend criminals
  • 24. Handling Machine Data Brings Unique Challenges
    Data Sources and Integration
    • Complex formats, no standards
    • Extremely large data volumes
    • Mix of enterprise and machine data
    • Streaming data as well as data at rest
    Analytics
    • Large scale indexing
    • Correlation across different data sets
    • Advanced analytics for different data types
    Visualizations / Actions / Outputs
    • New visualizations for streaming and massive data sets
    • Real-time dashboards
    • Geospatial mash-up
    Outcome
    • Gain deep insights into operations, customer experience, transactions and behavior
    • Proactive planning to increase operational efficiency
    • Troubleshoot problems and investigate security incidents
    • Monitor end-to-end infrastructure to avoid service degradation or outages
  • 26. Data Warehouse Augmentation: Value & Diagram
    (1) Pre-Processing Hub; (2) Query-able Archive; (3) Ad hoc & Exploratory Analysis
    Streams: real-time processing
    BigInsights: landing zone for all data
    Information Integration; Data Warehouse
    Data Warehouse and BigInsights: combined with unstructured & new kind of data
  • 27. Data Warehouse Augmentation: Next Generation Enterprise Data Warehouse Architecture
    Predictive Analytics; BI & Reporting; Visualization & Discovery; Custom Applications; Hadoop Analytics & Visualization
    Operational Warehouse Zone; Analytics Warehouse Zone; Real time Analytics Zone
    Hadoop Zone: Preprocessing, Queriable Archive, Ad Hoc Analysis
    Information Integration and Governance: Integration; Master Data; Governance
    Structured; Semi Structured; Unstructured
  • 28. IBM Big Data Platform - Move the Analytics Closer to the Data
    IBM Big Analytics
    IBM Big Data Platform: Systems Management; Application Development; Visualization & Discovery; Accelerators; Information Integration & Governance; Hadoop System; Stream Computing; Data Warehouse
    New analytic applications drive the requirements for a big data platform
    • Integrate and manage the full variety, velocity and volume of data
    • Apply advanced analytics to information in its native form
    • Visualize all available data for ad-hoc analysis
    • Development environment for building new analytic applications
    • Workload optimization and scheduling
    • Security and Governance
  • 29. IBM Big Analytics
    Assemble & Distill; Consume & Deliver; Explore & Experiment; Report & Act; Applied Analytics; Predict & Analyze
    Next wave of analytics harnesses the value of the new mix of information
    • Visualize and explore the variety, velocity and volume of big data
    • Apply advanced analytics to uncover patterns previously hidden
    • Blend traditional structured information with data previously unavailable
    • Optimize access and delivery to take insight to action
    • Extend existing capabilities to address specific analytic applications
    Hadoop System; Stream Computing; Data Warehouse; Operational Sources; Big Data
  • 30. PureData System for Hadoop: Bringing Big Data to the enterprise (IBM Confidential)
    Beyond today’s big data appliances
    • Simplify the delivery of unstructured data to the enterprise
    • Integrate Hadoop with the data warehouse
    • Leverage Hadoop for data archive
    • Provide best in class security
    • Provide data exploration across structured and unstructured data
    • Accelerate insight with machine data
    • Accelerate insight with social data
  • 31. Pre-defined PowerLinux Hadoop/BigInsights Configurations
  • 32. Big Data considerations
    • How to find out that the datasets exist?
    • How to get permission to access and use them?
    • Privacy, confidentiality, security?
    • How to combine disparate datasets and sources?
    • How to normalize and integrate?
    • How to reconcile standards and metadata considerations?
    • Underlying data structures?
    • Interoperability?
    • How to get the people who collect these disparate data types to communicate with one another? (And with the computer people?)
    • If they understand one another better, will combining diverse data be easier and more useful?
    • How to get people who don’t understand data structures and architecture to understand them well enough to make analysis and modeling more possible and successful?
  • 33. How do you address these challenges?
    “These experiences reveal a great irony -- that while the impact of Big Data will be transformational, the path to effectively harnessing it is not. The journey is evolutionary versus revolutionary, incremental and iterative” – Demystifying Big Data, TechAmerica Report, October 2012
    Is your organization characterized by one or more of the following traits?
    1. ✔ Executive Management wants a big data plan
    2. ✔ Executive Management wants it to be realistic and drive value as it is being implemented
    3. ✔ Wants a partner to rely on for guidance & expertise to lower risk
    4. ✔ Big Data must be leveraged with the existing infrastructure
    5. ✔ Concerned about the complexity & risk of Big Data acquisition
  • 34. Patterns of organizational behavior are consistent across four stages of big data adoption
    Big data adoption: when segmented into four groups based on current levels of big data activity, respondents showed significant consistency in organizational behaviors.
    Total respondents n = 1061. Totals do not equal 100% due to rounding.
  • 35. Importance of Hadoop & Big Data
    “We believe that more than half of the world’s data will be stored in Apache Hadoop within five years” – Hortonworks
    IBM INTERNAL USE ONLY
  • 36. Gartner on Hadoop: Don’t Delay
    • Big data analytics and the Apache Hadoop open source project are rapidly emerging as the preferred solution to address business and technology trends that are disrupting traditional data management and processing. Enterprises can gain a competitive advantage by being early adopters of big data analytics.
    • Enterprises should consider adopting a packaged Hadoop distribution . . . to reduce the technical risk and increase speed of implementation of the Hadoop initiative.
    • Enterprises should not delay implementation just because of the technical nature of big data analytics.... Early adopters will gain competitive advantage and invaluable experience, which will sustain the advantage as the technology matures and gains wider acceptance.
    • Adopt big data analytics and . . . Hadoop . . . to meet the challenges of the changing business and technology landscape.
    IBM INTERNAL USE ONLY
  • 39. So... don’t get lost in the sea of data
  • 40. THINK
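
Slide 20 shows three source records for the same person (CRM, ERP, Legacy) being consolidated into one unified view. The deck does not say how InfoSphere MDM decides which values survive, so the following is only a minimal Python sketch under assumed rules: the function names (`street_key`, `majority`, `unify`) and the "majority value, longest spelling" rule are hypothetical illustrations, not the product's matching logic. Gender, age and date of birth appear in the unified view on the slide but not in the three source records shown, so the sketch leaves them out.

```python
from collections import Counter

# Source records as transcribed from slide 20.
SOURCES = [
    {"system": "CRM", "name": "J Robertson",
     "street": "35 West 15th", "city_line": "Pittsburgh, PA 15213"},
    {"system": "ERP", "name": "Janet Robertson",
     "street": "35 West 15th St.", "city_line": "Pittsburgh, PA 15213"},
    {"system": "Legacy", "name": "Jan Robertson",
     "street": "36 West 15th St.", "city_line": "Pittsburgh, PA 15213"},
]


def street_key(street):
    """Normalize a street so '35 West 15th' and '35 West 15th St.' compare equal."""
    tokens = street.lower().replace(".", "").split()
    if tokens and tokens[-1] in ("st", "street"):
        tokens.pop()
    return " ".join(tokens)


def majority(values, key=lambda v: v):
    """Return the longest raw value among those sharing the most common key."""
    counts = Counter(key(v) for v in values)
    winner, _ = counts.most_common(1)[0]
    return max((v for v in values if key(v) == winner), key=len)


def unify(records):
    """Build one unified record from several source records (assumed rules)."""
    firsts = [r["name"].split()[0] for r in records]
    lasts = [r["name"].split()[-1] for r in records]
    # Keep the longest first name, but only if every variant is a prefix of it.
    first = max(firsts, key=len)
    if not all(first.lower().startswith(f.lower()) for f in firsts):
        raise ValueError("first names disagree; manual review needed")
    city, state_zip = majority([r["city_line"] for r in records]).split(", ")
    state, zip_code = state_zip.split()
    return {
        "first": first,
        "last": majority(lasts),
        "address": majority([r["street"] for r in records], key=street_key),
        "city": city,
        "state": state,
        "zip": zip_code,
    }


if __name__ == "__main__":
    for field, value in unify(SOURCES).items():
        print(f"{field}: {value}")
```

Under these assumed rules the Legacy record's "36 West 15th St." is outvoted by the two records that agree on number 35, which matches the address shown in the slide's unified view. A real deployment would need probabilistic matching and stewardship workflows that this sketch does not attempt.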