Leveraging SAP, Hadoop, and Big Data to Redefine Business
Balaji Krishna – SAP HANA Product Management
@balajivkrishna
© 2015 SAP SE or an SAP affiliate company. All rights reserved.
Agenda
1 Big Data Overview and Market opportunity
2 Big Data Solutions from SAP
3 SDA, vUDF, Spark Integration and beyond
4 Smart Data Streaming, SDI
5 Analytics & Predictive
6 Use Cases and Customer deployments
7 Q & A
SAP’s data footprint is widespread
For more than 40 years, SAP has been embedding itself in Business Data
At the global level At the business level At the personal level
74% of the world’s transaction revenue touches an SAP system
97% of subscribers reached by SAP mobile solutions via text messaging
98% of the top 100 most valued brands are SAP customers
Simplifying the data landscape is key to maximizing value
10.2% or $237B* of profits lost by top 200 global companies due to hidden costs of complexity
*Global Simplicity Index, 2015
Scattered information | Technology limitation | Batch orientation
Big Data is gaining momentum in the market - $50B
IDC | Allied Market Research | Wikibon
HAVING DATA ISN’T VALUABLE
USING IT IS!
+70% of IT projects use more than 1 platform*
* Operationalizing the Buzz: Big Data, an Enterprise Management Associates (EMA) Research Report
Standardization can drive transformation
Hadoop is a key part of SAP’s open source development usage
[Chart: open source consumption vs. open source contribution, logarithmic axis from 1 to 10,000]
SAP Contributes to over 100 Open Source Projects
SAP HANA Platform
The SAP focus: End-to-end value chain
[Diagram: value chain layers SOURCE, INGEST, STORAGE, COMPUTE, CONSUME]
Data sources: Logs, Text, OLTP, Social, Machine, Geo, ERP, Sensor
Smart Data Integration, Smart Data Quality: transformations & cleansing
Smart Data Streaming: stream processing
Store & forward
Smart Data Access: virtual tables, user defined functions
Dynamic Tiering: aged data on disk
In-Memory: data model & data
Calculation engine: fast computing
Column Storage: high performance analytics
Series Data Storage: store time-series data
Hadoop / NoSQL: MapReduce, YARN, HDFS
Engines: spatial processing; stream processing; analytics, text, graph, predictive
Application Development Environment
Lumira / BI: Reporting & Dashboards, High Performance Applications, Data Exploration & Visualization, Ad hoc & OLAP Analytics, Predictive Analysis, Business Planning & Forecasting
Mobile applications and BI
HANA Data Management
Technical Foundation for End-to-End Big Data
In-Memory: sub-second response
Column Storage: high performance analytics
Dynamic Tiering: warm data to disk
Smart Data Access: remote source as virtual tables
Virtual UDF: HDFS and MapReduce
Smart Data Streaming: on-the-fly stream analysis
Smart Data Integration: extend HANA with Hadoop stores
Smart Data Quality: cleansing and transformation
Replication server: real-time data movement to Hadoop
Smart Data Preparation: clean data for better decisions
Data Services: Big Data and NoSQL transformations
Data Warehouse Foundation: aging rules and automated data movement from HANA to Hadoop
SAP & Hortonworks, partnering to accelerate innovation for all
SAP & Hortonworks co-engineering partnership
Commit to open source community
Data governance enhancements via Apache Atlas project
Open cloud infrastructure & solutions
HANA Data Platform – “Hadoop Inside”
Big Data | Vision
HANA native Big Data
• Dynamic Tiering
• Smart Data Streaming
• NoSQL | Graph | Geo | TimeSeries
HANA & Hadoop
• SDA: Hive | Spark
• MapReduce | HDFS
• Admin & Monitoring
• User Mgmt / Security
Hadoop Extension
• Spark integration
• Integrated with HANA and Hadoop
HANA Data Management Platform
Instant Results (0.0 sec): SAP HANA In-Memory
Warm Data: HANA Dynamic Tiering
Infinite Storage (∞), Raw Data: Hadoop
Information Management | Text | Search | Graph | Geospatial | Predictive
Smart Data Streaming
Administration | Monitoring | Operations | User Management | Security
SAP and Hadoop / NoSQL Integration
Open Strategy
Hadoop / NoSQL
MapReduce / YARN / AWS Elastic MapReduce: Distributed Processing Framework
Hive: SQL Query
Spark: in-memory
Pig: Scripting
HDFS: Hadoop Distributed File System
SAP EIM
SAP Data Services: Hadoop / NoSQL Adapters
SAP HANA Platform
Smart Data Integration: Adapter
Smart Event Processing: Adapter
Virtual User Defined Operators: RFC Hadoop (WebHCat, WebHDFS)
Smart Data Access: ODBC Driver
SAP HANA and Hadoop
• GUI for design & development
• High-performance reading from and loading into Hadoop
• Extended optimizer: HiveQL and Pig aware
SAP HANA | SAP Data Services
• MapReduce pushdown
• Text Data Processing (Entity Extraction)
SAP HANA Smart Data Access (Data Virtualization)
Virtual Table to Hive
Capabilities
• Real-time, virtualized data access to external sources
• SAP sources: HANA, ASE, IQ, MaxDB, ESP, SQLA
• Databases: Teradata, Microsoft SQL Server, Oracle, IBM DB2, IBM Netezza
• Hadoop: Hive ODBC driver to Cloudera, Hortonworks, MapR
Benefits
• Optimized performance
• Complements existing enterprise investments
• Lower development costs by using data directly from its source system
Remote caching for Hadoop sources
When SAP HANA dispatches a federated query to Hive, the query runs as a series of map and reduce jobs. Depending on the data size in Hadoop and the current cluster capacity, a query can take from a few minutes to hours to complete.
In most cases, the data in the Hadoop cluster is not updated frequently, so successive executions of the same map/reduce jobs are likely to return the same tuples.
As of SPS07, HANA allows this result set to be materialized in the remote system, which avoids repeated execution of the same query.
This behavior is controlled by hinting the optimizer to use remote caching.
Syntax
SELECT * FROM hive_activity_log WHERE incident_type = 'ERROR' AND plant = '001' WITH HINT (USE_REMOTE_CACHE)
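The effect of the remote caching hint can be pictured as result memoization keyed by the query text. The following is a conceptual sketch in Python, not HANA's implementation: `RemoteResultCache` and `run_hive_query` are hypothetical names, and the slow map/reduce execution is simulated by a plain function that returns made-up sample rows.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, ...]


class RemoteResultCache:
    """Conceptual model of remote caching: materialize a result once, reuse it afterwards."""

    def __init__(self, execute: Callable[[str], List[Row]]) -> None:
        self._execute = execute  # stands in for the expensive map/reduce execution
        self._materialized: Dict[str, List[Row]] = {}
        self.executions = 0  # how often the expensive path actually ran

    def query(self, sql: str, use_remote_cache: bool = False) -> List[Row]:
        # Collapse whitespace so reformatted copies of a statement share one entry.
        key = " ".join(sql.split())
        if use_remote_cache and key in self._materialized:
            return self._materialized[key]
        self.executions += 1
        rows = self._execute(sql)
        if use_remote_cache:
            self._materialized[key] = rows
        return rows


def run_hive_query(sql: str) -> List[Row]:
    # Hypothetical stand-in for a Hive job; the rows are invented sample data.
    return [("ERROR", "001", "pump failure"), ("ERROR", "001", "sensor timeout")]


cache = RemoteResultCache(run_hive_query)
statement = "SELECT * FROM hive_activity_log WHERE incident_type = 'ERROR' AND plant = '001'"
first = cache.query(statement, use_remote_cache=True)
second = cache.query(statement, use_remote_cache=True)
```

In this sketch the second call is served from the materialized result, so the simulated job runs only once. A call without the hint always takes the expensive path, which mirrors the opt-in nature of the hint.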
HANA Data Platform
HANA & Hadoop Integration
HANA & Hadoop Integration (SPS09)
• SQL on Hadoop via SDA (virtual tables): Hive (SPS07) or Spark
• Execution of MapReduce jobs via HANA (virtual functions)
• Access to HDFS (via virtual function)
• Integration for storage & processing
Next Steps (SPS10)
• Spark SQL adapter via SDA
• Join relocation to Hadoop through Spark RDD
• Unified administration through Ambari integration
• Tiering to Hadoop using DLM
SAP HANA
Virtual User Defined Function
Capabilities
• User defined function for data virtualization
• Direct access to HDFS via RFC Hadoop function (WebHCat, WebHDFS) without need for package, mapper, and reducer specification
• Invoke custom MapReduce jobs; store them as a JAR file that can be called from SQL
• Ad hoc query capabilities and processing of unstructured data
Benefits
• Provides flexibility, supporting use cases beyond Hive via SAP HANA smart data access
[Diagram: SAP HANA (vUDF operator), RFC Hadoop, Hadoop (MapReduce, HDFS)]
Virtual UDF for HDFS and MapReduce Integration
Syntax
Highlights
• Syntax:
CREATE VIRTUAL FUNCTION <func_name> [(<parameter_clause>)]
RETURNS <return_table_type>
[SQL SECURITY <mode>]
[<package_clause>]
CONFIGURATION <remote_proc_properties>
AT <remote_source_name>;
• Virtual Function Properties
– Can be used in place of a table or derived table, where the RETURNS clause describes the result set
– Many configuration parameters, depending on whether the function reads HDFS or calls a MapReduce job
– Points to a remote Hadoop cluster defined by the CREATE REMOTE SOURCE DDL
HANA Data Platform
HDFS Integration
Feature Highlights
• Query native HDFS (Hadoop Distributed File System) data
• Read-only access to HDFS files
• The vUDF must define the schema of the returned result set with the TABLE clause
• Some relevant configuration parameters are listed here; more are in the SPS09 Administration Guide
Parameter Name | Description
hdfs_location | Location of the HDFS file, e.g. /user/hive/tpch/products
hdfs_field_delimiter | Character that separates fields in the file pointed to by hdfs_location
datetime_format | ISO datetime format of a datetime-typed column in the file
date_format | ISO date format of a date-typed column in the file, e.g. yyyy-MM-dd
time_format | ISO time format of a time-typed column in the file
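The parameters above describe how a delimited file becomes typed rows: a field delimiter splits each line, and a format pattern tells the reader how to interpret date text. A minimal Python sketch of that idea follows. The functions `to_strptime` and `parse_delimited`, the two-column layout, and the sample data are all hypothetical, and the pattern translation handles only the six tokens yyyy, MM, dd, HH, mm, and ss.

```python
import csv
import io
from datetime import date, datetime
from typing import List, Tuple

# Pattern tokens in the style of the slide's example (yyyy-MM-dd) and their strptime directives.
# Only these six tokens are translated; any other text passes through unchanged.
_TOKENS = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"), ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]


def to_strptime(pattern: str) -> str:
    """Translate a yyyy-MM-dd style pattern into a Python strptime format string."""
    for token, directive in _TOKENS:
        pattern = pattern.replace(token, directive)
    return pattern


def parse_delimited(text: str, hdfs_field_delimiter: str, date_format: str) -> List[Tuple[str, date]]:
    """Parse rows of (name, date) from delimited text, typing the second column as a date."""
    fmt = to_strptime(date_format)
    reader = csv.reader(io.StringIO(text), delimiter=hdfs_field_delimiter)
    return [(name, datetime.strptime(day, fmt).date()) for name, day in reader]


# Invented sample content standing in for a file under hdfs_location.
sample = "widget|2015-03-01\ngadget|2015-04-15\n"
rows = parse_delimited(sample, hdfs_field_delimiter="|", date_format="yyyy-MM-dd")
```

The fixed two-column layout plays the role of the schema that the virtual function declares for its result set; a file whose rows do not match it raises an error instead of being silently misread.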
SAP HANA Smart Data Streaming
Real-time Event Streams
Capabilities
• Capture, filter, analyze, and act on millions of events per second in real time
• Capture high-value data in SAP HANA and direct other data into Hadoop (adapter for HDFS or MapReduce job into Hive)
• Stream live information to operational dashboards
• Perform continuous queries using declarative (CCL) or model-driven approaches
Benefits
• Real-time insight from streaming event data
[Diagram: incoming streams, stream (push), SAP HANA Streaming Service]
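Capturing high-value data in SAP HANA while directing the rest into Hadoop amounts to a continuous filter over an event stream. The sketch below illustrates the routing idea in Python only; in the product such logic would be written in CCL. The `Event` type, the `route_events` function, the threshold rule, and the sample readings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    sensor_id: str
    value: float


def route_events(events: Iterable[Event], threshold: float) -> Iterator[Tuple[str, Event]]:
    """Tag each event as it arrives: values at or above the threshold go to 'hana', the rest to 'hadoop'."""
    for event in events:
        yield ("hana" if event.value >= threshold else "hadoop", event)


# Invented readings; a real stream would be unbounded and processed one event at a time.
incoming = [Event("s1", 12.0), Event("s2", 97.5), Event("s1", 3.2), Event("s3", 88.0)]
hana_sink: List[Event] = []
hadoop_sink: List[Event] = []
for target, event in route_events(incoming, threshold=80.0):
    (hana_sink if target == "hana" else hadoop_sink).append(event)
```

Because `route_events` is a generator, each event is classified as it is consumed and nothing is buffered, which is the property that separates a continuous query from a batch job.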
SAP HANA and Apache Spark
Enterprise Fabric for Big Data
[Diagram: layered architecture for real-time applications and interactive analysis]
Data Source: SCM, ERP, CRM, Text, Geospatial, Sensor, Social Media, Logs
Data Access: SQL, Java, Scala, Python, Other | SQL, .NET, JavaScript, MDX, NodeJS, Other
In-Memory Processing (SAP HANA): Predictive, Text / NLP, Geospatial, Planning / Rules
In-Memory Processing (Spark): Spark SQL / Shark, Spark Streaming, MLlib, GraphX (graph)
In-Memory Persistence: In-memory Columnar Data
Distributed File Persistence: HDFS / Any Hadoop, Fault Tolerant DFS Mgmt
SAP HANA smart data access
Integration between SAP HANA and Spark is via SAP HANA Smart Data Access
Done with Spark SQL
Requires Shark ODBC driver and unixODBC Driver Manager
HANA Data Platform and Hadoop
Where we are heading
Some relevant features:
• Lightweight and fast data replication/movement from HANA to Hadoop
• Data Aging solution for HANA via Data Lifecycle Management utility to define aging rules and relocate aged data to Hadoop
• SDA support for Data Provisioning for the SAP HANA Service/Adapter Framework
• SDA performance optimization: maintain statistics
• Optimize SAP HANA and Spark SQL Integration
• Leverage HANA/Hadoop Security capabilities for User Authentication
• Single UI for HANA and Hadoop cluster Administration & Monitoring (through Ambari)
• SQL on Hadoop
SAP BusinessObjects BI / SAP Lumira & Hadoop / NoSQL
Combined With SAP HANA
[Diagram: Hadoop and other sources feeding SAP HANA, SAP BusinessObjects BI, and SAP Lumira]
Sources: Log Files, Text Data Sources, Structured Data Sources, SAP Sources, Non-SAP, Hadoop
Data Integration: available as of SAP Data Services 4.1 (Hive & HDFS)
SAP HANA Platform: SAP HANA smart data access (Hive), available as of SAP HANA SPS6
SAP BusinessObjects BI (BI Universe): Hive, Amazon EMR, Impala available as of BI 4.0 FP3*
* BI 4.0 FP3 for single-source universe, BI 4.0 FP5 for multi-source universe
SAP Lumira Desktop: Hive 0.1, Amazon EMR 0,8; EMR, Hive 0.13, Impala, support planned for 1.21
SAP Lumira Cloud: Hive, EMR
HANA Cloud Integration
SAP Predictive Analytics & Hadoop / NoSQL
SAP Predictive Analytics 2.0
Hadoop / NoSQL: Spark, HDFS, Hive, Greenplum DB
Capabilities
• Unified UI for business analysts and data scientists
• Packaged business applications
• Extensive predictive library plus R, Hadoop, and NoSQL integration (Hive, HDFS, Spark, and Greenplum)
• Cloud ready
Benefits
• Improved forecasts from analysis of Big Data
• Support for business users & data scientists
SAP Big Data Strategy Methodology
Example Successes
Drive innovation by improving forecasting models*
Analyze Big Data from sensors. Watch the video.
Engaging customers and building fandom*
Built state of the art experience for fans. Watch the video.
Predict customer purchase sentiment
Seasonality Analysis in 5 seconds. Watch the video.
Cost Reduction, Tire Life Extension*
40 billion events per year analyzed. Read about Services Success
Customer Insight
Detect critical signals from 50+ PBs of data in eBay EDW. Watch the video
Performance Insight of Customer Behavior*
Improved ROI of campaigns by targeting the right audience. Watch the video.
Customer
Hi Tech Industry Information Mgmt Landscape
Data Storage/Analytics: HANA Revision 58; Cloudera CDH 4.2.1 (Hive, HDFS)
Source Systems: ECC, CRM, Others …
ETL Tools: Data Services 4.1 SP 1, Patch 4; SAP Landscape Transformation (SLT)
Gaming Industry
• Look at game play metrics (various frequency distributions of the data). What are the top performing games from a revenue standpoint?
• Using SDA, access the game data in Hadoop and BW to calculate wins sorted by top playing games (tie revenue to individual product)
• Calculate average session length per game, total events per session, and pulls per session
• Analyze the most profitable games with the least number of plays/handle pulls
• Access revenue on games by account manager level
• Royalty tracking: look at what royalties are owed and how they relate to revenue generation for each product, using SDA to access the game data and royalty data
• Predictive use case: track the performance of each game and predict future performance
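The per-game metrics listed above are plain aggregations over session records. The following Python sketch shows the arithmetic under assumed inputs: the `Session` fields, the game names, and the figures are invented for illustration, and in the landscape described here the same calculation would more likely be a SQL aggregation over virtual tables.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Session(NamedTuple):
    game: str
    minutes: float
    events: int
    revenue: float


def per_game_metrics(sessions: List[Session]) -> Dict[str, Dict[str, float]]:
    """Aggregate session count, minutes, events, and revenue per game, then derive averages."""
    totals: Dict[str, List[float]] = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for s in sessions:
        t = totals[s.game]
        t[0] += 1          # number of sessions
        t[1] += s.minutes  # total session length
        t[2] += s.events   # total events
        t[3] += s.revenue  # total revenue
    return {
        game: {
            "avg_session_minutes": minutes / count,
            "avg_events_per_session": events / count,
            "revenue": revenue,
        }
        for game, (count, minutes, events, revenue) in totals.items()
    }


# Invented sample sessions.
sessions = [
    Session("Reel Rush", 10.0, 40, 25.0),
    Session("Reel Rush", 20.0, 60, 35.0),
    Session("Lucky Sevens", 5.0, 10, 90.0),
]
metrics = per_game_metrics(sessions)
top_by_revenue = sorted(metrics, key=lambda g: metrics[g]["revenue"], reverse=True)
```

Ranking by revenue while keeping the play counts alongside makes it easy to spot the case called out in the list: a game that earns the most with the fewest plays.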
SAP Consumer Insights 365
[Diagram: data flow from mobile operators to SAP Consumer Insights 365]
Mobile Operator Data Center: Existing Mobile Operator Systems; Mediation Zone (files anonymized); raw data files sent via S/FTP
SAP Mobile Services Global Data Centers (US, EMEA, APJ, etc…): Mediation Zone; raw data files; HANA load files
MapR Distribution for Hadoop: NFS, MapR File System; Hive, MapR-DB
Predictive Analytics, Machine Learning
SAP Consumer Insights 365 Subscriber
Conclusion
Bringing Big Data to mainstream Enterprise Data
ONE PLATFORM | ALL WORKLOADS | INTEGRATED | ALL DATA | SIMPLE | OPEN
Thank You
Balaji Krishna - SAP HANA Product Management
AskSAPHANA@sap.com
© 2015 SAP AG or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other
countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices.
Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable
for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and
services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop
or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform
directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time
for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to
various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only
as of their dates, and they should not be relied upon in making purchasing decisions.

More Related Content

What's hot

What's hot (20)

Big data/Hadoop/HANA Basics
Big data/Hadoop/HANA BasicsBig data/Hadoop/HANA Basics
Big data/Hadoop/HANA Basics
 
Flexpod with SAP HANA and SAP Applications
Flexpod with SAP HANA and SAP ApplicationsFlexpod with SAP HANA and SAP Applications
Flexpod with SAP HANA and SAP Applications
 
Building Custom Advanced Analytics Applications with SAP HANA
Building Custom Advanced Analytics Applications with SAP HANABuilding Custom Advanced Analytics Applications with SAP HANA
Building Custom Advanced Analytics Applications with SAP HANA
 
SAP Vora CodeJam
SAP Vora CodeJamSAP Vora CodeJam
SAP Vora CodeJam
 
Integration of SAP HANA with Hadoop
Integration of SAP HANA with HadoopIntegration of SAP HANA with Hadoop
Integration of SAP HANA with Hadoop
 
SAP EIM Overview
SAP EIM OverviewSAP EIM Overview
SAP EIM Overview
 
SAP HANA SPS09 - HANA IM Services
SAP HANA SPS09 - HANA IM ServicesSAP HANA SPS09 - HANA IM Services
SAP HANA SPS09 - HANA IM Services
 
What's Planned for SAP HANA SPS10
What's Planned for SAP HANA SPS10What's Planned for SAP HANA SPS10
What's Planned for SAP HANA SPS10
 
Hadoop integration with SAP HANA
Hadoop integration with SAP HANAHadoop integration with SAP HANA
Hadoop integration with SAP HANA
 
What's New for SAP HANA Smart Data Integration & Smart Data Quality
What's New for SAP HANA Smart Data Integration & Smart Data QualityWhat's New for SAP HANA Smart Data Integration & Smart Data Quality
What's New for SAP HANA Smart Data Integration & Smart Data Quality
 
Finance month closing with HANA
Finance month closing with HANAFinance month closing with HANA
Finance month closing with HANA
 
SAP HANA and SAP Vora
SAP HANA and SAP VoraSAP HANA and SAP Vora
SAP HANA and SAP Vora
 
SAP Helps Reduce Silos Between Business and Spatial Data
SAP Helps Reduce Silos Between Business and Spatial DataSAP Helps Reduce Silos Between Business and Spatial Data
SAP Helps Reduce Silos Between Business and Spatial Data
 
How can Hadoop & SAP be integrated
How can Hadoop & SAP be integratedHow can Hadoop & SAP be integrated
How can Hadoop & SAP be integrated
 
SAP HANA SPS10- Hadoop Integration
SAP HANA SPS10- Hadoop IntegrationSAP HANA SPS10- Hadoop Integration
SAP HANA SPS10- Hadoop Integration
 
SQL Anywhere and the Internet of Things
SQL Anywhere and the Internet of ThingsSQL Anywhere and the Internet of Things
SQL Anywhere and the Internet of Things
 
SAP HANA One
SAP HANA OneSAP HANA One
SAP HANA One
 
Big Data, Big Thinking: Simplified Architecture Webinar Fact Sheet
Big Data, Big Thinking: Simplified Architecture Webinar Fact SheetBig Data, Big Thinking: Simplified Architecture Webinar Fact Sheet
Big Data, Big Thinking: Simplified Architecture Webinar Fact Sheet
 
Build and run an sql data warehouse on sap hana
Build and run an sql data warehouse on sap hanaBuild and run an sql data warehouse on sap hana
Build and run an sql data warehouse on sap hana
 
Spotlight on Financial Services with Calypso and SAP ASE
Spotlight on Financial Services with Calypso and SAP ASESpotlight on Financial Services with Calypso and SAP ASE
Spotlight on Financial Services with Calypso and SAP ASE
 

Viewers also liked

B2B Target Marketing Agency in korea
B2B Target Marketing Agency in koreaB2B Target Marketing Agency in korea
B2B Target Marketing Agency in korea
ArunJin
 
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Alexandre Morgaut
 

Viewers also liked (16)

Big Data Taiwan 2014 Track2-1: SAP 善用足跡,預測未來 - 全方位的行銷視野
Big Data Taiwan 2014 Track2-1: SAP 善用足跡,預測未來 - 全方位的行銷視野Big Data Taiwan 2014 Track2-1: SAP 善用足跡,預測未來 - 全方位的行銷視野
Big Data Taiwan 2014 Track2-1: SAP 善用足跡,預測未來 - 全方位的行銷視野
 
What's new on SAP HANA Smart Data Access
What's new on SAP HANA Smart Data AccessWhat's new on SAP HANA Smart Data Access
What's new on SAP HANA Smart Data Access
 
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
 
Job experience in De Nora
Job experience in De NoraJob experience in De Nora
Job experience in De Nora
 
B2B Target Marketing Agency in korea
B2B Target Marketing Agency in koreaB2B Target Marketing Agency in korea
B2B Target Marketing Agency in korea
 
Frank Celler – Processing large-scale graphs with Google(TM) Pregel - NoSQL m...
Frank Celler – Processing large-scale graphs with Google(TM) Pregel - NoSQL m...Frank Celler – Processing large-scale graphs with Google(TM) Pregel - NoSQL m...
Frank Celler – Processing large-scale graphs with Google(TM) Pregel - NoSQL m...
 
Employing Graph Databases as a Standardization Model towards Addressing Heter...
Employing Graph Databases as a Standardization Model towards Addressing Heter...Employing Graph Databases as a Standardization Model towards Addressing Heter...
Employing Graph Databases as a Standardization Model towards Addressing Heter...
 
Inside Google Knowledge Graph
Inside Google Knowledge GraphInside Google Knowledge Graph
Inside Google Knowledge Graph
 
Experimenting with Google Knowledge Graph & How Can we Potentially use it in...
 Experimenting with Google Knowledge Graph & How Can we Potentially use it in... Experimenting with Google Knowledge Graph & How Can we Potentially use it in...
Experimenting with Google Knowledge Graph & How Can we Potentially use it in...
 
Drupal 6 Database layer
Drupal 6 Database layerDrupal 6 Database layer
Drupal 6 Database layer
 
Intel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data SuccessIntel and Cloudera: Accelerating Enterprise Big Data Success
Intel and Cloudera: Accelerating Enterprise Big Data Success
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge Graph
 
AWS Big Data Platform
AWS Big Data PlatformAWS Big Data Platform
AWS Big Data Platform
 
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
 
Enterprise knowledge graphs
Enterprise knowledge graphsEnterprise knowledge graphs
Enterprise knowledge graphs
 
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
Wakanda: NoSQL for Model-Driven Web applications - NoSQL matters 2012
 

Similar to Leveraging SAP, Hadoop, and Big Data to Redefine Business

How is sap data services unique for sap hana integration
How is sap data services unique for sap hana integrationHow is sap data services unique for sap hana integration
How is sap data services unique for sap hana integration
Flavio Alejandro Corradini
 
Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...
Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...
Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...
SreeGe1
 

Similar to Leveraging SAP, Hadoop, and Big Data to Redefine Business (20)

What's New in SPS11 Overview
What's New in SPS11 OverviewWhat's New in SPS11 Overview
What's New in SPS11 Overview
 
SAP HANA SQL Data Warehousing (Sefan Linders)
SAP HANA SQL Data Warehousing (Sefan Linders)SAP HANA SQL Data Warehousing (Sefan Linders)
SAP HANA SQL Data Warehousing (Sefan Linders)
 
Sap bw4 hana
Sap bw4 hanaSap bw4 hana
Sap bw4 hana
 
Analytics Products L2 public 2020-23 Black.pptx
Analytics Products L2 public 2020-23 Black.pptxAnalytics Products L2 public 2020-23 Black.pptx
Analytics Products L2 public 2020-23 Black.pptx
 
IMCSummit 2015 - Day 1 IT Business Track - In-memory computing with SAP HANA:...
IMCSummit 2015 - Day 1 IT Business Track - In-memory computing with SAP HANA:...IMCSummit 2015 - Day 1 IT Business Track - In-memory computing with SAP HANA:...
IMCSummit 2015 - Day 1 IT Business Track - In-memory computing with SAP HANA:...
 
SAP HANA Cloud Platform Expert Session - SAP HANA Cloud Platform Analytics
SAP HANA Cloud Platform Expert Session - SAP HANA Cloud Platform AnalyticsSAP HANA Cloud Platform Expert Session - SAP HANA Cloud Platform Analytics
SAP HANA Cloud Platform Expert Session - SAP HANA Cloud Platform Analytics
 
SAP HANA - Big Data and Fast Data
SAP HANA - Big Data and Fast DataSAP HANA - Big Data and Fast Data
SAP HANA - Big Data and Fast Data
 
How is sap data services unique for sap hana integration
How is sap data services unique for sap hana integrationHow is sap data services unique for sap hana integration
How is sap data services unique for sap hana integration
 
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
 
Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA
Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA
Dmm302 - Sap Hana Data Warehousing: Models for Sap Bw and SQL DW on SAP HANA
 
Development to Deployment with SAP HANA
Development to Deployment with SAP HANADevelopment to Deployment with SAP HANA
Development to Deployment with SAP HANA
 
SAP TechEd 2015 | DEV109 | Extending Cloud Solutions from SAP using SAP HANA ...
SAP TechEd 2015 | DEV109 | Extending Cloud Solutions from SAP using SAP HANA ...SAP TechEd 2015 | DEV109 | Extending Cloud Solutions from SAP using SAP HANA ...
SAP TechEd 2015 | DEV109 | Extending Cloud Solutions from SAP using SAP HANA ...
 
Extend SAP S/4HANA to deliver real-time intelligent processes
Extend SAP S/4HANA to deliver real-time intelligent processesExtend SAP S/4HANA to deliver real-time intelligent processes
Extend SAP S/4HANA to deliver real-time intelligent processes
 
Deploy s4 hana
Deploy s4 hanaDeploy s4 hana
Deploy s4 hana
 
Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...
Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...
Data Migration Tools for the MOVE to SAP S_4HANA - Comparison_ MC _ RDM _ LSM...
 
SAP Data Hub e SUSE Container as a Service Platform
SAP Data Hub e SUSE Container as a Service PlatformSAP Data Hub e SUSE Container as a Service Platform
SAP Data Hub e SUSE Container as a Service Platform
 
SAP TechEd 2013: CD105: Extending SuccessFactors EmployeeCentral with apps on...
SAP TechEd 2013: CD105: Extending SuccessFactors EmployeeCentral with apps on...SAP TechEd 2013: CD105: Extending SuccessFactors EmployeeCentral with apps on...
SAP TechEd 2013: CD105: Extending SuccessFactors EmployeeCentral with apps on...
 
SAP HANA SPS08 Overview
SAP HANA SPS08 OverviewSAP HANA SPS08 Overview
SAP HANA SPS08 Overview
 
SAP HANA Cloud: From Your Datacenter to the Cloud and Back
SAP HANA Cloud: From Your Datacenter to the Cloud and Back  SAP HANA Cloud: From Your Datacenter to the Cloud and Back
SAP HANA Cloud: From Your Datacenter to the Cloud and Back
 
データベースMeetup Vol3
データベースMeetup Vol3データベースMeetup Vol3
データベースMeetup Vol3
 

More from DataWorks Summit

HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 

Leveraging SAP, Hadoop, and Big Data to Redefine Business

  • 1. Leveraging SAP, Hadoop, and Big Data to Redefine Business Balaji Krishna – SAP HANA Product Management @balajivkrishna
  • 2. © 2015 SAP SE or an SAP affiliate company. All rights reserved. 2Internal Agenda 1 Big Data Overview and Market opportunity 2 Big Data Solutions from SAP 3 SDA, vUDF, Spark Integration and beyond 4 Smart Data Streaming, SDI 5 Analytics & Predictive 6 Use Cases and Customer deployments 7 Q & A
  • 3. SAP’s data footprint is widespread. For more than 40 years, SAP has been embedding itself in business data. At the global level: 74% of the world’s transaction revenue touches an SAP system. At the business level: 98% of the top 100 most valued brands are SAP customers. At the personal level: 97% of subscribers reached by SAP mobile solutions via text messaging.
  • 4. Simplifying the data landscape is key to maximizing value. 10.2% or $237B* of profits lost by top 200 global companies due to hidden costs of complexity (*Global Simplicity Index, 2015). Scattered information; technology limitation; batch orientation.
  • 5. Big Data is gaining momentum in the market - $50B (IDC, Allied Market Research, Wikibon)
  • 6. HAVING DATA ISN’T VALUABLE. USING IT IS! +70% of IT projects use more than 1 platform* (* Operationalizing the Buzz: Big Data, an Enterprise Management Associates (EMA) research report)
  • 7. Standardization can drive transformation
  • 8. Hadoop is a key part of SAP’s open source development usage. [Chart: open source consumption vs. open source contribution, axis from 1 to 10,000.] SAP contributes to over 100 open source projects.
  • 9. Agenda 1 Big Data Overview and Market opportunity 2 Big Data Solutions from SAP 3 SDA, vUDF, Spark Integration and beyond 4 Smart Data Streaming, SDI 5 Analytics & Predictive 6 Use Cases and Customer deployments 7 Q & A
  • 10. SAP HANA Platform. The SAP focus: end-to-end value chain. SOURCE: logs, text, OLTP, social, machine, geo, ERP, sensor. INGEST: Smart Data Integration and Smart Data Quality (transformations & cleansing); Smart Data Streaming (stream processing); store & forward. STORAGE: In-Memory (data model & data); Column Storage (high performance analytics); Series Data Storage (store time-series data); Dynamic Tiering (aged data in disk); Smart Data Access (virtual tables, user defined functions); Hadoop / NoSQL (MapReduce, YARN, HDFS). COMPUTE: calculation engine (fast computing); spatial processing; analytics, text, graph, predictive engines; stream processing. CONSUME: application development environment; mobile applications and BI; reporting & dashboards; high performance applications; data exploration & visualization; ad hoc & OLAP analytics; predictive analysis; business planning & forecasting; Lumira / BI.
  • 11. HANA Data Management: Technical Foundation for End-to-End Big Data. In-Memory: sub-second response. Column Storage: high performance analytics. Dynamic Tiering: warm data to disk. Smart Data Access: remote source as virtual tables. Virtual UDF: HDFS and MapReduce. Smart Data Streaming: on-the-fly stream analysis. Smart Data Integration: extend HANA with Hadoop stores. Smart Data Quality: cleansing and transformation. Replication Server: real-time data movement to Hadoop. Smart Data Preparation: clean data for better decisions. Data Services: Big Data and NoSQL transformations. Data Warehouse Foundation: aging rules and automated data movement from HANA to Hadoop.
  • 12. SAP & Hortonworks co-engineering partnership Commit to open source community Data governance enhancements via Apache Atlas project Open cloud infrastructure & solutions SAP & Hortonworks, partnering to accelerate innovation for all
  • 13. HANA Data Platform – “Hadoop Inside”. Big Data | Vision. HANA native Big Data: Dynamic Tiering; Smart Data Streaming; NoSQL | Graph | Geo | TimeSeries. HANA & Hadoop: SDA; Hive | Spark; MapReduce | HDFS; Admin & Monitoring; User Mgmt / Security. Hadoop Extension: Spark integration; integrated with HANA and Hadoop. HANA Data Management Platform: Instant Results (SAP HANA In-Memory, 0.0 sec); Warm Data (HANA Dynamic Tiering); Raw Data (Hadoop, ∞ infinite storage). Information Management | Text | Search | Graph | Geospatial | Predictive | Smart Data Streaming. Administration | Monitoring | Operations | User Management | Security.
  • 14. Agenda 1 Big Data Overview and Market opportunity 2 Big Data Solutions from SAP 3 SDA, vUDF, Spark Integration and beyond 4 Smart Data Streaming, SDI 5 Analytics & Predictive 6 Use Cases and Customer deployments 7 Q & A
  • 15. SAP and Hadoop / NoSQL Integration: Open Strategy. Hadoop / NoSQL: HDFS (Hadoop Distributed File System); MapReduce / YARN / AWS Elastic MapReduce (distributed processing framework); Hive (SQL query); Spark (in-memory); Pig (scripting). SAP EIM: SAP Data Services, Hadoop / NoSQL adapters. SAP HANA Platform: Smart Data Integration (adapter); Smart Event Processing (adapter); Virtual User Defined Operators (RFC Hadoop: WebHCat, WebHDFS); Smart Data Access (ODBC driver).
  • 16. SAP HANA and Hadoop (SAP HANA, SAP Data Services): GUI for design & development; high-performance reading from and loading into Hadoop; extended optimizer: HiveQL and Pig aware; MapReduce pushdown; text data processing (entity extraction).
  • 17. SAP HANA Smart Data Access (Data Virtualization): Virtual Table to Hive. Capabilities: real-time, virtualized data access to external sources. SAP sources: HANA, ASE, IQ, MaxDB, ESP, SQLA. Databases: Teradata, Microsoft SQL Server, Oracle, IBM DB2, IBM Netezza. Hadoop: Hive ODBC driver to Cloudera, Hortonworks, MapR. Benefits: optimized performance; complements existing enterprise investments; lower development costs by using data directly from its source system.
  • 18. Remote caching for Hadoop sources. When SAP HANA dispatches a federated query to Hive, the query runs as a series of map and reduce jobs. Depending on the data size in Hadoop and the current cluster capacity, this can take from a few minutes to hours. In most cases, the data in the Hadoop cluster is not frequently updated, so successive executions of the map/reduce jobs are likely to return the same tuples. As of SP07, HANA allows this result view to be materialized in the remote system, which avoids repeated execution of the same query. This behavior is controlled by hinting the optimizer to use remote caching. Syntax: SELECT * FROM hive_activity_log WHERE incident_type = 'ERROR' AND plant = '001' WITH HINT (USE_REMOTE_CACHE)
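The effect of the hint on slide 18 can be illustrated with a small Python sketch. Only the `WITH HINT (USE_REMOTE_CACHE)` clause and the sample query come from the slide; the helper `with_remote_cache` and the `ResultCache` class are hypothetical and only mimic the idea of running a slow remote job once per distinct query. They are not how HANA implements remote caching.

```python
def with_remote_cache(query: str) -> str:
    """Append the USE_REMOTE_CACHE hint unless the query already carries it."""
    stripped = query.rstrip().rstrip(";")
    if "USE_REMOTE_CACHE" in stripped.upper():
        return stripped
    return f"{stripped} WITH HINT (USE_REMOTE_CACHE)"


class ResultCache:
    """Toy stand-in for a materialized remote result (hypothetical, not a HANA API)."""

    def __init__(self, run_remote):
        self._run_remote = run_remote  # callable that executes the slow remote job
        self._results = {}             # query text -> materialized rows
        self.remote_runs = 0           # how many times the remote job actually ran

    def execute(self, query):
        # Run the remote job only the first time a given query text is seen.
        if query not in self._results:
            self.remote_runs += 1
            self._results[query] = self._run_remote(query)
        return self._results[query]


query = with_remote_cache(
    "SELECT * FROM hive_activity_log "
    "WHERE incident_type = 'ERROR' AND plant = '001'"
)
cache = ResultCache(lambda q: [("ERROR", "001")])  # made-up rows for illustration
first = cache.execute(query)
second = cache.execute(query)  # served from the cache, no second remote run
```

The second call returns the stored rows without invoking the remote callable again, which is the saving the slide describes for rarely updated Hadoop data.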
  • 19. HANA Data Platform: HANA & Hadoop Integration. HANA & Hadoop Integration (SPS09): SQL on Hadoop via SDA (virtual tables), with Hive (SPS07) or Spark; execution of MR jobs via HANA (virtual functions); access to HDFS (via virtual function); integration for storage & processing. Next Steps (SP10): Spark SQL adapter via SDA; join relocation to Hadoop through Spark RDD; unified admin through Ambari integration; tiering to Hadoop using DLM.
  • 20. SAP HANA Virtual User Defined Function. Capabilities: user-defined function for data virtualization; direct access to HDFS via RFC Hadoop function (WebHCat, WebHDFS) without need for package, mapper, and reducer specification; invoke custom MapReduce jobs, stored as a JAR file that can be called from SQL; ad-hoc query capabilities and processing of unstructured data. Benefits: provides flexibility, supporting use cases beyond Hive via SAP HANA smart data access. Diagram: SAP HANA, vUDF operator, RFC Hadoop, Hadoop (MapReduce, HDFS).
  • 21. Virtual UDF for HDFS and MapReduce Integration: Syntax Highlights. Syntax: CREATE VIRTUAL FUNCTION <func_name> [(<parameter_clause>)] RETURNS <return_table_type> [SQL SECURITY <mode>] [<package_clause>] CONFIGURATION <remote_proc_properties> AT <remote_source_name>; Virtual function properties: can be used in place of a table or derived table, where the return clause represents the result set; many configuration parameters, depending on whether the call is an HDFS or a MapReduce job call; points to a remote Hadoop cluster defined by the CREATE REMOTE SOURCE DDL.
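As a rough illustration of the template on slide 21, the following Python sketch assembles a `CREATE VIRTUAL FUNCTION` statement from its parts. The clause order follows the slide; everything else is an assumption: the function name `PRODUCTS_FROM_HDFS`, the remote source name `HADOOP_SOURCE`, the column list, and in particular the encoding of the configuration as a single quoted string of semicolon-separated `key=value` pairs. Check the SPS09 Administration Guide for the exact configuration format before relying on it.

```python
def create_virtual_function(name, returns, configuration, remote_source, parameters=""):
    """Build the DDL text following the clause order shown on the slide.

    Assumption: the CONFIGURATION clause is one quoted string of
    semicolon-separated key=value pairs.
    """
    config = ";".join(f"{key}={value}" for key, value in configuration.items())
    config = config.replace("'", "''")  # escape single quotes inside the SQL literal
    return (
        f"CREATE VIRTUAL FUNCTION {name}({parameters}) "
        f"RETURNS {returns} "
        f"CONFIGURATION '{config}' "
        f"AT {remote_source};"
    )


# Hypothetical names; only the hdfs_* parameter names and the path appear in the deck.
statement = create_virtual_function(
    name="PRODUCTS_FROM_HDFS",
    returns="TABLE (PRODUCT_ID INTEGER, NAME NVARCHAR(60))",
    configuration={
        "hdfs_location": "/user/hive/tpch/products",
        "hdfs_field_delimiter": "|",
    },
    remote_source="HADOOP_SOURCE",
)
print(statement)
```

Because the function is declared with a table return type, it could then be queried in place of a table, as the slide notes.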
  • 22. HANA Data Platform: HDFS Integration. Feature highlights: query native HDFS (Hadoop Distributed File System) data; read-only access to HDFS files; the vUDF needs to define the schema of the returned result set with the TABLE clause; some relevant configuration parameters are listed here, more in the SPS09 Administration Guide. Parameters (name: description): hdfs_location: where the HDFS file is located, e.g. /user/hive/tpch/products; hdfs_field_delimiter: the character that separates fields in the file pointed to by hdfs_location; datetime_format: defines the ISO datetime format of a date_time column in the file; date_format: defines the ISO date format of a date-typed column in the file, e.g. yyyy-MM-dd; time_format: the same for a time-typed column.
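To make the roles of `hdfs_field_delimiter` and `date_format` on slide 22 concrete, here is a minimal Python sketch that parses one delimited text line into typed columns. It runs locally on a made-up sample line and does not touch HDFS or HANA; the schema, the sample values, and the small pattern mapping from `yyyy-MM-dd` to Python's `strptime` codes are all assumptions for illustration.

```python
from datetime import date, datetime

# Assumed mapping from the ISO-style pattern tokens on the slide to strptime codes.
ISO_TO_STRPTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def to_strptime(pattern):
    """Translate a pattern such as yyyy-MM-dd into %Y-%m-%d."""
    for iso_token, code in ISO_TO_STRPTIME.items():
        pattern = pattern.replace(iso_token, code)
    return pattern


def parse_line(line, schema, field_delimiter, date_format):
    """Split one line on the delimiter and convert each field per the schema."""
    values = line.rstrip("\n").split(field_delimiter)
    if len(values) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(values)}")
    row = {}
    for (column, column_type), raw in zip(schema, values):
        if column_type is date:
            row[column] = datetime.strptime(raw, to_strptime(date_format)).date()
        else:
            row[column] = column_type(raw)
    return row


# Hypothetical schema, playing the role of the TABLE clause of the vUDF.
schema = [("product_id", int), ("name", str), ("released", date)]
row = parse_line("42|Widget|2015-06-09", schema, field_delimiter="|", date_format="yyyy-MM-dd")
```

The schema plays the same role as the result-set definition in the virtual function: the file itself carries no types, so they have to be declared by the reader.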
  • 23. SAP HANA Smart Data Streaming: Real-time Event Streams. Capabilities: capture, filter, analyze, and act on millions of events per second in real time; capture high-value data in SAP HANA and direct other data into Hadoop (adapter for HDFS or MapReduce job into Hive); stream live information to operational dashboards; perform continuous queries using declarative (CCL) or model-driven approaches. Benefits: real-time insight from streaming event data. Diagram: incoming streams, stream (push), SAP HANA streaming service.
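The routing idea on slide 23 (high-value events into SAP HANA, everything else into Hadoop) can be sketched in plain Python. This is not CCL and not the Smart Data Streaming API; the function `route_stream`, the event fields, and the threshold are hypothetical, and the two sinks are ordinary lists standing in for the real targets.

```python
def route_stream(events, is_high_value, to_hana, to_hadoop):
    """Send each event to one of two sinks as it arrives and count the routing."""
    counts = {"hana": 0, "hadoop": 0}
    for event in events:
        if is_high_value(event):
            to_hana(event)
            counts["hana"] += 1
        else:
            to_hadoop(event)
            counts["hadoop"] += 1
    return counts


# Made-up sensor readings; the threshold of 100 is an arbitrary illustration.
readings = iter([
    {"sensor": "s1", "value": 120},
    {"sensor": "s2", "value": 15},
    {"sensor": "s1", "value": 300},
])
hana_rows, hadoop_rows = [], []
counts = route_stream(
    readings,
    is_high_value=lambda event: event["value"] >= 100,
    to_hana=hana_rows.append,
    to_hadoop=hadoop_rows.append,
)
```

Passing the sinks as callables keeps the filter independent of where the data lands, so the same loop works with any iterator of events.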
  • 24. SAP HANA and Apache Spark: Enterprise Fabric for Big Data. Real-time applications, interactive analysis. Data sources: SCM, ERP, CRM, text, geospatial, sensor, social media, logs. Data access: SQL, Java, Scala, Python, .NET, JavaScript, MDX, NodeJS, other. SAP HANA (in-memory persistence): in-memory columnar data; predictive; text / NLP; geospatial; planning / rules. Spark (in-memory processing): Spark SQL / Shark; Spark Streaming; MLlib; GraphX (graph). HDFS / any Hadoop (distributed file persistence): fault-tolerant DFS; Mgmt. Integration between SAP HANA and Spark is via SAP HANA smart data access, done with Spark SQL; it requires the Shark ODBC driver and the unixODBC Driver Manager.
  • 25. HANA Data Platform and Hadoop: Where we are heading. Some relevant features: lightweight and fast data replication/movement from HANA to Hadoop; data aging solution for HANA via the Data Lifecycle Management utility to define aging rules and relocate aged data to Hadoop; SDA support for Data Provisioning for the SAP HANA Service/Adapter Framework; SDA performance optimization: maintain statistics; optimize SAP HANA and Spark SQL integration; leverage HANA/Hadoop security capabilities for user authentication; single UI for HANA and Hadoop cluster administration & monitoring (through Ambari); SQL on Hadoop.
  • 26. SAP BusinessObjects BI / SAP Lumira & Hadoop / NoSQL Combined With SAP HANA. Data sources: log files, text data sources, structured data sources (SAP sources, non-SAP), Hadoop. Data Integration: available as of SAP Data Services 4.1 (Hive & HDFS). SAP HANA Platform: SAP HANA smart data access (Hive), available as of SAP HANA SPS6. SAP BusinessObjects BI (BI Universe): Hive, Amazon EMR, Impala available as of BI 4.0 FP3* (* BI 4.0 FP3 for single-source universe, BI 4.0 FP5 for multi-source universe). SAP Lumira Desktop: Hive 0.1, Amazon EMR 0,8 EMR, Hive 0.13, Impala, support planned for 1.21. SAP Lumira Cloud: Hive, EMR. Hana Cloud Integration.
  • 27. SAP Predictive Analytics & Hadoop / NoSQL. SAP Predictive Analytics 2.0 with Hadoop / NoSQL: Spark, HDFS, Hive, Greenplum DB. Capabilities: unified UI for business analysts and data scientists; packaged business applications; extensive predictive library plus R, Hadoop, and NoSQL integration (Hive, HDFS, Spark, and Greenplum); cloud ready. Benefits: improved forecasts from analysis of Big Data; support for business users & data scientists.
  • 28. Agenda 1 Big Data Overview and Market opportunity 2 Big Data Solutions from SAP 3 SDA, vUDF, Spark Integration and beyond 4 Smart Data Streaming, SDI 5 Analytics & Predictive 6 Use Cases and Customer deployments 7 Q & A
  • 29. SAP Big Data Strategy Methodology: Example Successes. Drive innovation by improving forecasting models*: analyze Big Data from sensors. Engaging customers and building fandom*: built state of the art experience for fans. Predict customer purchase sentiment: seasonality analysis in 5 seconds. Cost reduction, tire life extension*: 40 billion events per year analyzed. Customer insight: detect critical signals from 50+ PBs of data in eBay EDW. Performance insight of customer behavior*: improved ROI of campaigns by targeting the right audience.
  • 30. Hi Tech Industry: Information Mgmt Landscape. Source systems: ECC, CRM, others …. ETL tools: Data Services 4.1 SP 1, Patch 4; SAP Landscape Transformation (SLT). Data storage/analytics: HANA Revision 58; Cloudera CDH 4.2.1 (Hive, HDFS).
  • 31. Gaming Industry • Look at game play metrics (various/frequency distribution of data): what are the top performing games from a revenue standpoint? • Using SDA, access the game data in Hadoop and BW to calculate wins sorted by top playing games (tie revenue to individual product). • Calculate average session length per game, total events per session, and pulls per session. • Analyze the MOST profitable games with the LEAST number of plays/handle pulls. • Access revenue on games by account manager level. • Royalty tracking: look at what royalties are owed and how they relate to revenue generation for each product, using SDA to access the game data and royalty data. • Predictive use case: track the performance of each game and predict future performance.
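The session metrics named on slide 31 (average session length per game, events per session, pulls per session) can be computed as in the following Python sketch. The event layout (`game`, `session_id`, `timestamp` in seconds, `type`) and the sample records are invented for illustration; in the deployment described on the slide the data would be read from Hadoop and BW through SDA rather than from an in-memory list.

```python
from collections import defaultdict


def session_metrics(events):
    """Aggregate raw play events into per-game session averages."""
    sessions = defaultdict(lambda: {"start": None, "end": None, "events": 0, "pulls": 0})
    for event in events:
        session = sessions[(event["game"], event["session_id"])]
        timestamp = event["timestamp"]
        # Track the first and last timestamp seen for this session.
        session["start"] = timestamp if session["start"] is None else min(session["start"], timestamp)
        session["end"] = timestamp if session["end"] is None else max(session["end"], timestamp)
        session["events"] += 1
        if event["type"] == "pull":
            session["pulls"] += 1

    per_game = defaultdict(list)
    for (game, _session_id), session in sessions.items():
        per_game[game].append(session)

    return {
        game: {
            "avg_session_length": sum(s["end"] - s["start"] for s in group) / len(group),
            "avg_events_per_session": sum(s["events"] for s in group) / len(group),
            "avg_pulls_per_session": sum(s["pulls"] for s in group) / len(group),
        }
        for game, group in per_game.items()
    }


# Made-up sample events for one hypothetical game.
sample_events = [
    {"game": "A", "session_id": 1, "timestamp": 0, "type": "pull"},
    {"game": "A", "session_id": 1, "timestamp": 60, "type": "pull"},
    {"game": "A", "session_id": 2, "timestamp": 100, "type": "bonus"},
    {"game": "A", "session_id": 2, "timestamp": 120, "type": "pull"},
]
metrics = session_metrics(sample_events)
```

Joining these per-game figures with revenue data would give the "most profitable games with the least number of pulls" view mentioned in the same slide.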
  • 32. SAP Consumer Insights 365. Diagram labels: MapR Distribution for Hadoop; NFS, MapR File System; MapR Distribution; SAP Mobile Services Global Data Centers (US, EMEA, APJ, etc…); Predictive Analytics; Machine Learning; Hive, MapR-DB; raw data files; HANA load files; Mediation Zone; Mobile Operator Data Center; S/FTP; existing mobile operator systems; files anonymized; SAP Consumer Insights 365 subscriber.
  • 33. Conclusion: Bringing Big Data to mainstream Enterprise Data. ONE PLATFORM; ALL WORKLOADS; INTEGRATED; ALL DATA; SIMPLE; OPEN.
  • 34. Thank You. Balaji Krishna - SAP HANA Product Management, AskSAPHANA@sap.com
  • 35. © 2015 SAP AG or an SAP affiliate company. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.