Big data tim

  1. Today we measure available data in zettabytes
     • In 2011, the amount of data surpassed 1.8 zettabytes.
     • 90% of the data in the world today has been created in the last two years alone.
     • 1.8 zettabytes = 57.5 billion 32 GB iPads = $34.4 trillion, the combined GDP of the US, France, Japan, the UK, China, Italy, and Germany.
     • Source: IDC Digital Universe Study, "Extracting Value from Chaos".
     © 2013 SAP AG. All rights reserved. Confidential.
  2. Where is this data?
     • Types and volumes of data have grown dramatically.
     • Traditional content types, including unstructured data, are growing by up to 80% per year.
     • Structured data grew by more than 40% per year.
     • Diagram labels: CRM Systems, ERP Systems, Legacy ERP, Legacy EDW, M2M data, Transactions, Sales Order, Demand, Planning, Inventory, Customer, Mobile, Instant Messages, Email, Things.
  3. What can’t we see?
     What critical “new signals” might we be missing? Is it in our ERP systems? Our M2M data? Social?
  4. Big Data - Definition
     • “Big Data” refers to the problems of capturing, storing, managing, and analyzing massive amounts of various types of data.
     • Big Data challenge: turn raw data into insights that drive business value, and manage it in a cost-effective manner.
     • Most commonly this refers to terabytes or petabytes of data, stored in multiple formats, from different internal and external sources, with strict demands for speed and complexity of analysis.
  5. The SAP you need to know
     • System of Engagement (“Newer SAP”): SAP Cloud; Maintenance & Operations (24/7, SLAs, DR & HA, elasticity); mobile
     • System of Record (“Foundational SAP”): Business Suite (ERP); Business Analytics
     • Data Logistics/Quality: ETL
     • In Memory Database Platform: In Memory / Columnar / MPP / Federation
  6. In Memory Database Platform: Digging Deeper
     In Memory / Columnar / MPP / Federation
     [Architecture diagram; labels as extracted, grouping approximate]
     • User Interface & Applications: SAP Business Suite (ERP, CRM, SCM, SRM, PLM), SAP BW, BI, Custom Apps, HANA Native Apps, HTTP
     • Engines: Core, OLTP, OLAP, Text, Predictive, Geospatial
     • Models: Logical
     • Storage: Physical Table(s), Virtual Tables; HOT, WARM, COLD; memory, disk, cached
     • Ingest Engines: Bulk/Streaming/Real-time
     • Federation; Data Logistics (Data Services, SLT, CEP)
     • External sources: Other DB, Other ERP, Other Data …
  7. Open Hadoop Strategy
  8. Accelerated BI with SAP BusinessObjects and SAP HANA
     One unified and complete BI Suite addressing the full spectrum of BI on SAP HANA
     • Discovery and Analysis (Discover. Predict. Create.)
       • Discover areas to optimize your business
       • Adapt data to business needs
       • Tell your story with beautiful visualizations
     • Dashboards and Apps (Build Engaging Experiences)
       • Deliver engaging information to users where they need it
       • Track key performance indicators and summary data
       • Build custom experiences so users get what they need quickly
     • Reporting (Share Information)
       • Securely distribute information across your organization
       • Give users the ability to ask and answer their own questions
       • Build printable reports for operational efficiency
  9. Data Logistics
     [Data flow diagram; labels as extracted, grouping approximate]
     • Sources: SAP Business Suite, Non SAP Data Sources, Event Streams, M2M
     • Trigger Based, Real Time: SAP LT Replication Server
     • ETL, Batch: SAP BOBJ Data Services
     • Log Based: Sybase Replication Server
     • Event streams: SAP Event Stream Processor *
     • Target: SAP HANA (SAP In-Memory Database; In Memory Models; Column Store)
     • Consumers: SAP BusinessObjects tools, SAP BW, Other query tools, HANA Studio
     • Interfaces: DB Connection, SQL, BICS, MDX, ODBC, ECDA/ODBC
     * SAP HANA Roadmap
     ** SAP ERP & BW Extractors
  10. SAP Big Data Apps
     • Customer Engagement Intelligence
     • Predictive Analytics RDS
     See overview: https://community.wdf.sap.corp/docs/DOC-222087
  11. Delivering On Your Business Imperatives: Data Science Services
     • Forecasting Sales and Demand
       • Forecast demand and manage inventory levels in perishable CPG
       • Model variant cannibalization and its impact on manufacturer forecasts
       • Utility load demand forecasting
     • Check and Compliance
       • Deliver faster response time and higher throughput of compliance checks to enable competitive advantage
       • Tackle public fraud, waste, and abuse by analyzing records for tax discovery
     • Optimization Performance and Insights
       • Optimize transport and logistics; recover from unforeseen disruptions
       • Maximize guest / customer experience
       • Optimize depth and timing of retail markdowns to boost sales
       • Assess the impact of promotions, and improve profitability
       • Grow deposits, not excessive interest costs
       • Directional insight on growing revenues and basket sizes
     Contact “DL BigDataSalesSupport” for more information about SAP Data Science Services.
  12. HANA + Hadoop
  13. What is Hadoop
     • Open source project inspired by Google/Yahoo
     • Used at Yahoo, Facebook, eBay, LinkedIn, startups, and Fortune 500 enterprises to store and process petabytes of data on thousands of servers
     • Hadoop components
       • Cluster of commodity servers
       • Distributed storage layer (Hadoop Distributed File System, or HDFS)
       • Distributed processing infrastructure (MapReduce programming model)
     [Diagram: Cluster of Commodity Servers] NameNode; 10s to 1000s of DataNode(s)
     [Diagram: Hadoop Software Architecture] Hadoop computation engines (Hive, HBase, Mahout, Pig, Sqoop, …) on top of Map-Reduce, on top of data storage (Hadoop Distributed File System)
  14. Apache Hadoop
     Software framework for distributed data processing
     • Hadoop Distributed File System (HDFS): reliable data storage on commodity hardware
     • Hive: data warehousing solution on top of Hadoop with direct access to HDFS and HBase
     • MapReduce: programming model for parallel data processing and query execution
     [Diagram: HDFS] A client talks to the Name Node (stores metadata) and to Data Nodes (store actual data in blocks), with replication between Data Nodes
     [Diagram: MapReduce] HDFS input, MapReduce process, HDFS output
  15. Why Hadoop?
     • Pros
       • Free software
       • Cheap hardware: commodity servers
       • Scalable to thousands of nodes and petabytes of data
       • Highly fault-tolerant storage and processing
       • Flexible: write Java MapReduce programs to do any kind of processing on any data; no fixed schema needed
       • Open source libraries & tools
     • Cons
       • Specialized skillset needed to administer and develop: Hadoop is not free!
       • Requires more development (programming MapReduce & other NoSQL tools) than relational technologies (SQL, stored procedures)
       • Hive/Pig/Impala are not as performant or as mature as relational technologies
       • Batch-oriented jobs, not real-time
       • Less mature in enterprise readiness: security, ETL, management, monitoring, etc.
  16. SAP HANA + Hadoop Provides Real-Time on BIG DATA
     Combine INSTANT results with INFINITE storage
     • SAP HANA: instant results (1.0 sec)
       • Modern in-memory platform
       • Transact/analyze in real-time
       • Native predictive, text, and spatial algorithms
     • Hadoop: infinite storage
       • Distributed disk platform
       • Store infinite amounts of unstructured data
       • No-SQL access
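
Slides 13 to 15 describe MapReduce as a programming model for parallel data processing. As a rough illustration only, the plain-Python sketch below mimics the map, shuffle, and reduce phases for a word count on a single machine. It does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical; none of them come from the deck.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group emitted values by key, as a framework would
    do between the map and reduce phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines read from HDFS.
    sample = ["big data big insight", "data in hadoop"]
    print(word_count(sample))
```

In a real cluster the map and reduce functions would run on many nodes at once and the shuffle would move data over the network; the sketch only shows how the three phases divide the work.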

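
Slide 14 describes HDFS as a Name Node that stores metadata and Data Nodes that store the actual data in replicated blocks. The toy in-memory model below illustrates that split. It is a simplification under stated assumptions: the class `ToyCluster`, the tiny block size, and the round-robin placement are all hypothetical, and real HDFS uses far larger blocks and a more elaborate placement policy.

```python
from typing import Dict, FrozenSet, List, Tuple

BlockId = Tuple[str, int]  # (file name, block index)


class ToyCluster:
    """Toy model: a NameNode holding metadata and DataNodes holding blocks."""

    def __init__(self, num_datanodes: int, block_size: int = 4, replication: int = 3):
        if replication > num_datanodes:
            raise ValueError("replication cannot exceed the number of DataNodes")
        self.block_size = block_size
        self.replication = replication
        # Each DataNode maps a block id to that block's bytes.
        self.datanodes: List[Dict[BlockId, bytes]] = [{} for _ in range(num_datanodes)]
        # NameNode metadata: file name -> [(block id, DataNode indexes)].
        self.namenode: Dict[str, List[Tuple[BlockId, List[int]]]] = {}

    def write(self, name: str, data: bytes) -> None:
        """Split data into blocks and store each block on several DataNodes."""
        entries = []
        for i, start in enumerate(range(0, len(data), self.block_size)):
            block_id = (name, i)
            block = data[start:start + self.block_size]
            # Assumption: simple round-robin placement on distinct nodes.
            targets = [(i + r) % len(self.datanodes) for r in range(self.replication)]
            for t in targets:
                self.datanodes[t][block_id] = block
            entries.append((block_id, targets))
        self.namenode[name] = entries

    def read(self, name: str, failed: FrozenSet[int] = frozenset()) -> bytes:
        """Reassemble a file, skipping DataNodes listed as failed."""
        out = b""
        for block_id, targets in self.namenode[name]:
            alive = [t for t in targets if t not in failed]
            if not alive:
                raise IOError(f"all replicas of {block_id} are unavailable")
            out += self.datanodes[alive[0]][block_id]
        return out
```

With four Data Nodes and a replication factor of 2, a file written through `write` can still be read after any single node is marked as failed, which is the fault-tolerance property slide 15 lists under the pros.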