Leveraging SAP, Hadoop, and Big Data to Redefine
Business
Javier Cuerva | Enterprise Solution Architect | SAP Global CoE
April 16th, 2015 Public
© 2015 SAP SE or an SAP affiliate company. All rights reserved. 2Public
More Data, Different Data, Faster Data = Big Data
• Digital Universe Exploding*
– Until 2020, the digital universe will double every 2 years, reaching 40 zettabytes
• No longer only relational data
• Machine-generated data is the trend "du jour"
– Internet of Things or Industrial Data
*Source: IDC, "The Digital Universe in 2020". 1 zettabyte (ZB) = 1 million petabytes (PB)
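The growth figure can be checked with a little arithmetic. The sketch below assumes a clean doubling every two years that ends at 40 ZB in 2020; the function name and the back-projected years are illustrative and are not figures taken from IDC. The conversion factor is the slide's 1 ZB = 1 million PB.

```python
ZB_IN_PB = 1_000_000  # from the slide: 1 ZB = 1 million PB


def projected_size_zb(year, target_year=2020, target_zb=40.0, doubling_years=2.0):
    """Size of the digital universe in ZB, assuming steady doubling every
    `doubling_years` that reaches `target_zb` in `target_year`."""
    return target_zb * 2 ** ((year - target_year) / doubling_years)


if __name__ == "__main__":
    # Back-project a few years under the doubling assumption.
    for year in (2014, 2016, 2018, 2020):
        zb = projected_size_zb(year)
        print(f"{year}: {zb:5.1f} ZB = {zb * ZB_IN_PB:,.0f} PB")
```

Under this assumption, each two-year step back halves the total, so 2018 comes out at 20 ZB and 2014 at 5 ZB.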
Big Data Economics
Generating significant financial value across sectors
US healthcare
• $300 billion value per year
• ~0.7% annual GDP
Manufacturing
• Up to 50% lower assembly costs
• Up to 7% reduction in working capital
Retail
• 60%+ increase in net margin possible
• 0.5-1.0% annual GDP
Global personal location data
• $100 billion+ revenue for service providers
Source: McKinsey Global Institute analysis
SAP Focus
End-to-End Value Chain
[Diagram: HANA Data Platform value chain with the stages Source, Ingest, Storage, Compute, and Consume.
Sources shown: logs, text, OLTP, social, machine, geo, ERP, sensor (store & forward).
Platform components shown: Smart Data Integration and Smart Data Quality (transformations & cleansing); Smart Data Streaming (stream processing); Smart Data Access (virtual tables, user-defined functions); in-memory data model & data; calculation engine (fast computing); column storage (high-performance analytics); series data storage (time-series data); Dynamic Tiering (aged data on disk); spatial processing; analytics, text, graph, and predictive engines; application development environment; Hadoop / NoSQL (MapReduce, YARN, HDFS).
Consumption shown: reporting & dashboards; high-performance applications; data exploration & visualization; ad hoc & OLAP analytics; predictive analysis; business planning & forecasting; Lumira / BI; mobile applications and BI.]
HANA Data Management
Technical Foundation for End-to-End Big Data
• In-Memory: sub-second response
• Column Storage: high-performance analytics
• Dynamic Tiering: warm data to disk
• Smart Data Access: remote sources as virtual tables
• Virtual UDF: HDFS and MapReduce
• Smart Data Streaming: on-the-fly stream analysis
• Smart Data Integration: extend HANA with Hadoop stores
• Smart Data Quality: cleansing and transformation
• Replication Server: real-time data movement to Hadoop
• Smart Data Preparation: clean data for better decisions
• Data Services: Big Data and NoSQL transformations
• Data Warehouse Foundation: aging rules and automated data movement from HANA to Hadoop
HANA Data Platform
HANA Data Management Platform for Big Data
SAP HANA: In-Memory, 0.1 sec, Instant Results
HANA Dynamic Tiering: Warm Data
Hadoop: Compute & Storage (∞)
Text | Search | Graph | GeoSpatial | Predictive | Time Series
Administration | Monitoring | Operations | User Management | Security
HANA Data Platform
Big Data Features
HANA native Big Data
• Dynamic Tiering
• Smart Data Streaming
• Graph | Geo | Time Series
HANA & Hadoop
• Smart Data Access → Hive | Spark
• MapReduce | HDFS
HANA Data Platform
Dynamic Tiering
HANA Dynamic Tiering
• Native Big Data solution: real-time insights on ALL enterprise data
• Manage data cost-effectively
• Terabytes to petabytes
• Application-defined temperature
• Single database experience
• Centralized operational control
CREATE TABLE "demo"."SalesOrders_WARM" (
  ID Integer NOT NULL,
  CustomerID Integer NOT NULL,
  OrderDate date NOT NULL,
  …,
  PRIMARY KEY (ID)
) USING EXTENDED STORAGE;
INSERT INTO "demo"."SalesOrders_WARM" VALUES ( … );
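One way to picture "application-defined temperature" is a rule in the application that decides which table a row belongs in. The sketch below is only an illustration: the one-year threshold, the function name, and the hot table name "SalesOrders" are assumptions, and only "SalesOrders_WARM" comes from the slide.

```python
from datetime import date, timedelta

HOT_WINDOW = timedelta(days=365)  # hypothetical threshold, not an SAP default


def target_table(order_date, today):
    """Return the table an order row would be routed to under this rule."""
    if today - order_date <= HOT_WINDOW:
        return "SalesOrders"       # in-memory (hot) table; name is assumed
    return "SalesOrders_WARM"      # extended-storage (warm) table from the slide


if __name__ == "__main__":
    today = date(2015, 4, 16)
    for order_date in (date(2015, 3, 1), date(2012, 7, 9)):
        print(order_date, "->", target_table(order_date, today))
```

Recent orders stay in memory, while older ones go to the warm table on disk; in both cases the application addresses a single database.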
HANA Data Platform
Hadoop Integration
HANA & Hadoop Integration
• SQL on Hadoop via SDA with virtual tables
– Hive or Spark
• Execution of MapReduce jobs via virtual functions
• Access to HDFS
• Calculation view support for virtual tables
HANA Data Platform
Hive, Spark Integration
Feature Highlights
• Data virtualization (Smart Data Access) to Hive via ODBC connectivity
• Richer SQL access from SAP HANA studio to Hive tables
• Compute SQL operations between Hive tables and HANA tables from HANA studio
• Remote caching of Hive data in queries, e.g.:
SELECT * FROM HIVE_LINEITEMS WHERE ORDER_ID=6 WITH HINT ( USE_REMOTE_CACHE )
Virtual UDF for HDFS and MapReduce Integration
Architecture
Virtual UDF for HDFS and MapReduce Integration
Syntax
Highlights
• Syntax:
CREATE VIRTUAL FUNCTION <func_name> [(<parameter_clause>)]
RETURNS <return_table_type>
[SQL SECURITY <mode>]
[<package_clause>]
CONFIGURATION <remote_proc_properties>
AT <remote_source_name>;
• Virtual Function Properties
– Can be used in place of a table or derived table, where the RETURNS clause describes the result set
– Many configuration parameters, depending on whether the call targets HDFS or a MapReduce job
– Points to a remote Hadoop cluster defined by the CREATE REMOTE SOURCE DDL
HANA Data Platform
HDFS Integration
Feature Highlights
• Query native HDFS (Hadoop Distributed File System) data
• Read-only access to HDFS files
• The vUDF needs to define the schema of the returned result set with the TABLE clause
• Some relevant configuration parameters (more in the SPS09 Administration Guide):
Parameter Name | Description
hdfs_location | Location of the HDFS file, e.g. /user/hive/tpch/products
hdfs_field_delimiter | Character that separates fields in the file at hdfs_location
datetime_format | ISO datetime format of a date_time column in the file
date_format | ISO date format of a date column in the file, e.g. yyyy-MM-dd
time_format | ISO time format of a time column in the file
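To make the role of the field delimiter and date format concrete, here is a minimal Python sketch of how one line of a delimited text file could be turned into a typed row. It is not how HANA reads HDFS internally; the function, the column list, the sample line, and the mapping from the yyyy-MM-dd pattern to Python's strptime codes are all assumptions made for illustration.

```python
from datetime import datetime

# Assumed translation of the ISO-style pattern from the slide to strptime codes.
ISO_TO_STRPTIME = {"yyyy-MM-dd": "%Y-%m-%d"}


def parse_line(line, columns, field_delimiter=",", date_format="yyyy-MM-dd"):
    """Split one text line into a dict, converting 'int' and 'date' columns."""
    values = line.rstrip("\n").split(field_delimiter)
    if len(values) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(values)}")
    row = {}
    for (name, kind), raw in zip(columns, values):
        if kind == "date":
            row[name] = datetime.strptime(raw, ISO_TO_STRPTIME[date_format]).date()
        elif kind == "int":
            row[name] = int(raw)
        else:
            row[name] = raw
    return row


if __name__ == "__main__":
    # Hypothetical schema, playing the role of the vUDF's TABLE clause.
    columns = [("product_id", "int"), ("name", "str"), ("released", "date")]
    print(parse_line("42|Widget|2015-04-16", columns, field_delimiter="|"))
```

The column list plays the part of the result-set schema, and the two keyword arguments correspond loosely to hdfs_field_delimiter and date_format.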
HDFS Demo with Virtual UDF
First, create a remote source pointing to the WebHDFS and WebHCat servers
• Use the CREATE REMOTE SOURCE statement for that
Create a virtual user-defined function (vUDF)
• Point it to the HDFS file and specify the type of data returned
Access the HDFS file
• Call the vUDF
HANA Data Platform
Map Reduce Integration
Feature Highlights
• Capability to invoke MapReduce jobs from HANA
• End-to-end development:
– Define Mapper and Reducer Java classes in HANA studio by creating a Java project with the SAP HANA Development perspective
– Deploy MapReduce jobs from HANA studio
• The vUDF needs to define the schema of the returned result set with the TABLE clause
• Some relevant configuration parameters (more in the SPS09 Administration Guide):
Parameter Name | Description
mapred_mapper | Fully qualified Java class name for the map phase
mapred_reducer | Fully qualified Java class name for the reduce phase
mapred_input | Initial file to be used by MapReduce, an intermediate result when chaining MapReduce calls, or the input directory to read the data from
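For readers new to the map and reduce phases that mapred_mapper and mapred_reducer refer to, the following single-process Python sketch shows the data flow. In the deck the classes are written in Java and run on a Hadoop cluster; the word-count job, the function names, and the input lines here are assumptions chosen only for illustration.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def run_job(lines):
    """Run map, shuffle (group by key), and reduce in a single process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    print(run_job(["big data", "fast data"]))
```

The dictionary returned by run_job is the kind of tabular result (word, count) whose schema a vUDF would declare in its TABLE clause.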
MapReduce Demo with Virtual UDF
First, create a remote source pointing to the WebHDFS and WebHCat servers
• Use the CREATE REMOTE SOURCE statement for that
Create a virtual user-defined function (vUDF)
• Reference the Mapper class name
• Reference the Reducer class name
• Reference the input file location that the MapReduce job should read from
Call the MapReduce job
• Call the vUDF
HANA Data Platform and Hadoop
Where we are heading
Some relevant features:
• Lightweight and fast data replication/movement from HANA to Hadoop
• Data aging solution for HANA via the Data Lifecycle Management utility, to define aging rules and relocate aged data to Hadoop
• SDA support for data provisioning for the SAP HANA Service/Adapter Framework
• SDA performance optimization: maintain statistics
• Optimize SAP HANA and Spark SQL integration
• Leverage HANA/Hadoop security capabilities for user authentication
• Single UI for HANA and Hadoop cluster administration & monitoring (through Ambari)
Conclusion
Bringing Big Data to Mainstream Enterprise Data
ONE
PLATFORM
ALL
WORKLOADS
INTEGRATED
ALL DATA
SIMPLE
OPEN
Thank you
Contact information:
Javier Cuerva
Enterprise Solution Architect
SAP Global Center of Expertise
javier.cuerva@sap.com
Backup Slides
HANA Data Platform
[Diagram: SAP Business Suite and BW (ABAP app server) and any other apps (any app server) run on the HANA Platform.
Data types shown: transaction, unstructured, machine, Hadoop, real-time, location, other apps.
Platform services shown: SQL, SQLScript, JavaScript; spatial; text search; text analysis & mining; stored procedures & data models; application & UI services; Business Function Library; Predictive Analysis Library; database services; series data; rules engine; integration & streaming services.]
SAP HANA is the platform for ALL applications
A true platform
• Converged OLTP + OLAP
• Native processing services
• Embedded business logic
Supports any application
• 60% of HANA use cases are outside of the SAP landscape
• 1,300+ start-ups & ISVs developing on HANA
Supports any device
SAP HANA Smart Data Integration & Smart Data Quality
Replication, Batch Integration, and Data Virtualization
Capabilities
• Real-time replication & CDC on select sources
• Bulk integration (metadata / data)
• Data virtualization via Smart Data Access
• Real-time data cleansing and transformation
• Data enrichment with geospatial information
• SAP HANA studio to define data transformation flows
• Support for on-premise and cloud sources
• Open SDK and built-in adapters, including Hive
Benefits
• Simplified landscape: one environment to provision data
• Real-time: lower latency with in-memory performance
• Open & extensible: supports data of any shape or size
[Diagram: Smart Data Integration and Smart Data Quality in SAP HANA. Built-in and custom adapters (e.g. OData, DB2, Oracle, SQL Server) plug into the adapter framework, which feeds metadata and transformations.]
SAP HANA Smart Data Access
Virtual Table
Capabilities
• Real-time, virtualized data access to external sources
• SAP sources: HANA, ASE, IQ, MaxDB, ESP, SQLA
• Databases: Teradata, Microsoft SQL Server, Oracle, IBM DB2, IBM Netezza
• Hadoop: Hive ODBC driver to Cloudera, Hortonworks, MapR
• NoSQL: Spark
Benefits
• Optimized performance
• Complements existing enterprise investments
• Lower development costs by using data directly from its source system
SAP BusinessObjects BI / SAP Lumira & Hadoop / NoSQL
Combined With SAP HANA
[Diagram components: SAP BusinessObjects BI (Data Integration, BI Universe), SAP Lumira Desktop, SAP Lumira Cloud, and the SAP HANA Platform, alongside Hadoop / NoSQL sources: Hive (SQL query), Impala (MPP SQL query), MongoDB (document DB), Cassandra (NoSQL DB), and MapReduce / YARN / AWS Elastic MapReduce (distributed processing framework).]
Capabilities
• SAP Lumira Desktop and SAP BusinessObjects BI can integrate with Hadoop via SAP HANA
• SAP BusinessObjects BI 4.0 FP 3 (Universe) integrates with Hive, Cloudera Impala, and AWS EMR
• SAP Lumira Desktop integrates with Hive and AWS EMR
• SAP Lumira comes with a Datasource Extension Framework that developers can use to build additional data source access: MongoDB, DataStax, and Spark SQL are the most recent examples
• SAP Lumira Cloud integrates with Hive (0.13), Cloudera Impala (1.21), and AWS EMR
Benefits
• Flexible choice of how to access Hadoop / NoSQL
• Greater insight from Big Data analytics
Integration paths labeled in the diagram: Smart Data Access; SAP Data Services 4.1 (Hive & HDFS)
SAP Predictive Analytics & Hadoop / NoSQL
SAP Predictive Analytics 2.0
Capabilities
• Unified UI for business analysts and data scientists
• Extensive predictive library, including R algorithms
• Big Data ready, with support for Hive, Spark, and Greenplum. Custom data extensions are available for HDFS and virtually any NoSQL database
• Cloud services & SDK ready, with full process automation capabilities
Benefits
• Can be packaged into business applications
• Improved prediction & insights from Big Data analysis
[Diagram: Hadoop / NoSQL sources for SAP Predictive Analytics 2.0: Greenplum (SQL DB), Hive (SQL), Spark (in-memory processing), HDFS (Hadoop Distributed File System).]

Developing and Deploying Applications on the SAP HANA Platform
 
Build and run an sql data warehouse on sap hana
Build and run an sql data warehouse on sap hanaBuild and run an sql data warehouse on sap hana
Build and run an sql data warehouse on sap hana
 

More from DataWorks Summit (20)

Data Science Crash Course
Floating on a RAFT: HBase Durability with Apache Ratis
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
HBase Tales From the Trenches - Short stories about most common HBase operati...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Managing the Dewey Decimal System
Practical NoSQL: Accumulo's dirlist Example
HBase Global Indexing to support large-scale data ingestion at Uber
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Security Framework for Multitenant Architecture
Presto: Optimizing Performance of SQL-on-Anything Engine
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Extending Twitter's Data Platform to Google Cloud
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Computer Vision: Coming to a Store Near You
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark

Leveraging SAP, Hadoop, and Big Data to Redefine Business

  • 1. Leveraging SAP, Hadoop, and Big Data to Redefine Business
       Javier Cuerva | Enterprise Solution Architect | SAP Global CoE
       April 16th, 2015 | Public
  • 2. More Data, Different Data, Faster Data = Big Data
       - Digital universe exploding*
         – Until 2020, the digital universe will double every 2 years, reaching 40 zettabytes
       - No longer only relational data
       - Machine-generated data is the trend "du jour"
         – Internet of Things or industrial data
       * Source: IDC, "The Digital Universe in 2020". 1 zettabyte (ZB) = 1 million petabytes (PB)
       © 2015 SAP SE or an SAP affiliate company. All rights reserved. Public
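The growth claim on slide 2 can be turned into a small worked calculation. The sketch below assumes the slide's two figures are taken literally (40 ZB in 2020, doubling every 2 years) and back-projects earlier years from them; the function name and the chosen years are illustrative only and are not taken from the IDC study.

```python
ZB_IN_PB = 1_000_000  # from the slide: 1 ZB = 1 million PB


def universe_size_zb(year, anchor_year=2020, anchor_zb=40.0, doubling_years=2.0):
    """Size in ZB implied by 'doubles every 2 years, reaching 40 ZB in 2020'."""
    return anchor_zb * 2 ** ((year - anchor_year) / doubling_years)


# Back-project a few years from the 2020 anchor point.
for year in (2014, 2016, 2018, 2020):
    zb = universe_size_zb(year)
    print(f"{year}: {zb:5.1f} ZB = {zb * ZB_IN_PB:,.0f} PB")
```

Each step back of two years halves the volume, so under these assumptions the 40 ZB figure corresponds to 5 ZB six years earlier.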
  • 3. Big Data Economics
       Generating significant financial value across sectors
       - US healthcare
         – $300 billion value per year
         – ~0.7% annual GDP
       - Manufacturing
         – Up to 50% lower assembly costs
         – Up to 7% reduction in working capital
       - Retail
         – 60%+ increase in net margin possible
         – 0.5-1.0% annual GDP
       - Global personal location data
         – $100 billion+ revenue for service providers
       Source: McKinsey Global Institute analysis
  • 4. SAP Focus: End-to-End Value Chain
       [Architecture diagram: HANA Data Platform]
       - Layers: Source, Ingest, Storage, Compute, Consume
       - Sources: logs, text, OLTP, social, machine, geo, ERP, sensor, store & forward
       - Ingest: Smart Data Integration and Smart Data Quality (transformations & cleansing), Smart Data Streaming (stream processing)
       - Platform components: in-memory (data model & data), column storage (high performance analytics), series data storage (store time-series data), Dynamic Tiering (aged data on disk), Smart Data Access (virtual tables, user-defined functions), calculation engine (fast computing), spatial processing, stream processing, analytics / text / graph / predictive engines, application development environment
       - Hadoop / NoSQL: MapReduce, YARN, HDFS
       - Consumption (Lumira / BI): mobile applications and BI, reporting & dashboards, high performance applications, data exploration & visualization, ad hoc & OLAP analytics, predictive analysis, business planning & forecasting
  • 5. HANA Data Management: Technical Foundation for End-to-End Big Data
       - In-Memory: sub-second response
       - Column Storage: high performance analytics
       - Dynamic Tiering: warm data to disk
       - Smart Data Access: remote source as virtual tables
       - Virtual UDF: HDFS and MapReduce
       - Smart Data Streaming: on-the-fly stream analysis
       - Smart Data Integration: extend HANA with Hadoop stores
       - Smart Data Quality: cleansing and transformation
       - Replication Server: real-time data movement to Hadoop
       - Smart Data Preparation: clean data for better decisions
       - Data Services: Big Data and NoSQL transformations
       - Data Warehouse Foundation: aging rules and automated data movement from HANA to Hadoop
  • 6. HANA Data Platform: HANA Data Management Platform for Big Data
       - SAP HANA: in-memory, 0.1 sec, instant results
       - HANA Dynamic Tiering: warm data
       - Hadoop: compute & storage (∞)
       - Engines: Text | Search | Graph | GeoSpatial | Predictive | Time Series
       - Operations: Administration | Monitoring | Operations | User Management | Security
  • 7. HANA Data Platform: Big Data Features
       HANA native Big Data
       - Dynamic Tiering
       - Smart Data Streaming
       - Graph | Geo | Time Series
       HANA & Hadoop
       - Smart Data Access
       - Hive | Spark
       - MapReduce | HDFS
  • 8. HANA Data Platform: Dynamic Tiering
       HANA Dynamic Tiering
       - Native Big Data solution
         – Real-time insights
         – ALL enterprise data
       - Manage data cost-effectively
       - Terabytes to petabytes
       - Application-defined temperature
       - Single database experience
       - Centralized operational control
       Example:
           CREATE TABLE "demo"."SalesOrders_WARM" (
             ID Integer NOT NULL,
             CustomerID Integer NOT NULL,
             OrderDate date NOT NULL,
             …,
             PRIMARY KEY (id)
           ) USING EXTENDED STORAGE;
           INSERT INTO "demo"."SalesOrders_WARM" VALUES ( … );
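"Application-defined temperature" on slide 8 means the application, not the database, decides which rows go to the warm (extended storage) table. A minimal sketch of such a rule follows, assuming an age-based policy with a 90-day threshold. The threshold, the `SalesOrder` class, and the hot table name `SalesOrders` are hypothetical; only `SalesOrders_WARM` comes from the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class SalesOrder:
    id: int
    customer_id: int
    order_date: date


def choose_table(order, today, hot_days=90):
    """Pick the target table by row age (assumed policy, not SAP's rule)."""
    age = today - order.order_date
    if age <= timedelta(days=hot_days):
        return "SalesOrders"       # hypothetical in-memory (hot) table
    return "SalesOrders_WARM"      # extended-storage table from the slide


today = date(2015, 4, 16)
recent = SalesOrder(1, 7, date(2015, 4, 1))
old = SalesOrder(2, 7, date(2014, 1, 1))
print(choose_table(recent, today))
print(choose_table(old, today))
```

The application would then issue its INSERT against whichever table the rule returns; both tables remain queryable through the same database.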
  • 9. HANA Data Platform: Hadoop Integration
       HANA & Hadoop Integration
       - SQL on Hadoop via SDA with virtual tables
         – Hive or Spark
       - Execution of MapReduce jobs via virtual functions
       - Access to HDFS
       - Calculation view support for virtual tables
  • 10. HANA Data Platform: Hive, Spark Integration
       Feature Highlights
       - Data virtualization (Smart Data Access) to Hive via ODBC connectivity
       - Richer SQL access from SAP HANA studio to Hive tables
       - Compute SQL operations between Hive tables and HANA tables from HANA studio
       - Remote caching of Hive data in queries, for example:
           SELECT * FROM HIVE_LINEITEMS WHERE ORDER_ID=6 WITH HINT ( USE_REMOTE_CACHE )
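The slide does not describe how remote caching works internally. As a generic illustration of the idea only, the sketch below caches the result of a remote query by its SQL text for a fixed validity period, so that repeated queries skip the slow remote call. The class, the `fetch` callable, and the 60-second validity are assumptions for illustration and do not represent SAP HANA's implementation.

```python
import time


class RemoteResultCache:
    """Cache remote query results by SQL text for a limited time."""

    def __init__(self, fetch, validity_seconds, clock=time.monotonic):
        self.fetch = fetch                  # callable that runs the remote query
        self.validity = validity_seconds
        self.clock = clock
        self._entries = {}                  # sql -> (stored_at, rows)

    def query(self, sql):
        now = self.clock()
        hit = self._entries.get(sql)
        if hit is not None and now - hit[0] < self.validity:
            return hit[1]                   # still valid: skip the remote call
        rows = self.fetch(sql)
        self._entries[sql] = (now, rows)
        return rows


def slow_hive_query(sql):
    # Stand-in for a remote Hive round trip.
    print("remote call:", sql)
    return [(6, "item-a"), (6, "item-b")]


cache = RemoteResultCache(slow_hive_query, validity_seconds=60)
sql = "SELECT * FROM HIVE_LINEITEMS WHERE ORDER_ID=6"
cache.query(sql)   # goes to the remote source
cache.query(sql)   # served from the cache
```

The trade-off is staleness: a cached result can lag the data in Hive by up to the validity period.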
  • 11. Virtual UDF for HDFS and MapReduce Integration: Architecture
  • 12. Virtual UDF for HDFS and MapReduce Integration: Syntax Highlights
       - Syntax:
           CREATE VIRTUAL FUNCTION <func_name> [(<parameter_clause>)]
             RETURNS <return_table_type>
             [SQL SECURITY <mode>]
             [<package_clause>]
             CONFIGURATION <remote_proc_properties>
             AT <remote_source_name>;
       - Virtual function properties
         – Can be used in place of a table or derived table, where the return clause represents the result set
         – Many configuration parameters, depending on whether the call targets HDFS or a MapReduce job
         – Points to a remote Hadoop cluster defined by the CREATE REMOTE SOURCE DDL
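To make the syntax template on slide 12 more concrete, here is a small sketch that assembles a statement from its parts in the order the slide gives. Everything passed in the example call is hypothetical (function name, column list, remote source name), and the layout of the configuration string as a quoted `key=value` pair is an assumption; the slide only shows `<remote_proc_properties>` as a placeholder.

```python
def render_virtual_function(func_name, return_table_type, configuration,
                            remote_source, parameter_clause="",
                            sql_security=None, package_clause=None):
    """Fill in the CREATE VIRTUAL FUNCTION template from the slide."""
    lines = [f"CREATE VIRTUAL FUNCTION {func_name}({parameter_clause})",
             f"  RETURNS {return_table_type}"]
    if sql_security:                      # optional [SQL SECURITY <mode>]
        lines.append(f"  SQL SECURITY {sql_security}")
    if package_clause:                    # optional [<package_clause>]
        lines.append(f"  {package_clause}")
    lines.append(f"  CONFIGURATION {configuration}")
    lines.append(f"  AT {remote_source};")
    return "\n".join(lines)


# All values below are illustrative placeholders.
statement = render_virtual_function(
    func_name="PRODUCTS_UDF",
    return_table_type='TABLE ("id" INTEGER, "name" VARCHAR(100))',
    configuration="'hdfs_location=/user/hive/tpch/products'",  # assumed format
    remote_source="MY_HADOOP_SOURCE",
)
print(statement)
```

The generated text is only as valid as its inputs; the exact clause contents should be checked against the SAP HANA administration guide before running it.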
  • 13. HANA Data Platform: HDFS Integration
       Feature Highlights
       - Query native HDFS (Hadoop Distributed File System) data
       - Read-only access to HDFS files
       - The vUDF needs to define the schema of the returned result set with the TABLE clause
       - Some relevant configuration parameters (more in the SPS09 Administration Guide):
           Parameter name         Description
           hdfs_location          Where the HDFS file is located, e.g. /user/hive/tpch/products
           hdfs_field_delimiter   The character that separates fields in the file pointed to by hdfs_location
           datetime_format        Defines the ISO datetime format of a datetime column in the file
           date_format            Defines the ISO date format of a date column in the file, e.g. yyyy-MM-dd
           time_format            Defines the ISO time format of a time column in the file
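The parameters on slide 13 describe how a delimited file is turned into typed rows. The sketch below mimics that idea locally in plain Python, assuming a pipe-delimited file and a `yyyy-MM-dd` date pattern. The sample data, column names, and the small pattern translation table are made up for illustration; this is not how HANA reads HDFS.

```python
import csv
import io
from datetime import datetime

# Minimal translation of the pattern letters used on the slide.
PATTERN_TO_STRPTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def to_strptime(pattern):
    for token, directive in PATTERN_TO_STRPTIME.items():
        pattern = pattern.replace(token, directive)
    return pattern


def read_delimited(text, field_delimiter, date_format, schema):
    """Parse rows using a schema of (column_name, kind) pairs."""
    fmt = to_strptime(date_format)
    converters = {
        "int": int,
        "str": str,
        "date": lambda value: datetime.strptime(value, fmt).date(),
    }
    rows = []
    for record in csv.reader(io.StringIO(text), delimiter=field_delimiter):
        rows.append({name: converters[kind](value)
                     for (name, kind), value in zip(schema, record)})
    return rows


sample = "1|Widget|2015-04-16\n2|Gadget|2015-01-02\n"   # made-up data
schema = [("id", "int"), ("name", "str"), ("created", "date")]
for row in read_delimited(sample, "|", "yyyy-MM-dd", schema):
    print(row)
```

The schema argument plays the role of the TABLE clause: without it the file is just untyped text.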
  • 14. HDFS Demo with Virtual UDF
       1. First, create a remote server pointing to the WebHDFS and WebHCat servers
          - Use the CREATE REMOTE SOURCE statement for that
       2. Create a virtual user-defined function
          - Point it to the HDFS file and specify the type of data returned
       3. Access the HDFS file
          - Call the vUDF
  • 15. HANA Data Platform: MapReduce Integration
       Feature Highlights
       - Capability to invoke MapReduce jobs from HANA
       - End-to-end development:
         – Define Mapper and Reducer Java classes in HANA studio by creating a Java project in the SAP HANA Development perspective
         – MapReduce deployment from HANA studio
       - The vUDF needs to define the schema of the returned result set with the TABLE clause
       - Some relevant configuration parameters (more in the SPS09 Administration Guide):
           Parameter name   Description
           mapred_mapper    The full Java class name for the map phase
           mapred_reducer   The full Java class name for the reduce phase
           mapred_input     The initial file to be used by MapReduce, an intermediate result when chaining MapReduce calls, or the input directory to read the data from
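For readers unfamiliar with the map and reduce phases that `mapred_mapper` and `mapred_reducer` refer to, the following is a local, single-process sketch of the pattern using a word count. On the slide the mapper and reducer are Java classes run by Hadoop; the Python functions and the tiny driver here are stand-ins written for illustration only.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit (word, 1) for every word in an input line."""
    for word in line.split():
        yield word.lower(), 1


def reducer(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def run_job(lines, mapper, reducer):
    """Tiny driver: map every line, group by key, then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


counts = run_job(["big data", "Big Hadoop"], mapper, reducer)
print(counts)
```

The dictionary returned by the driver corresponds to the result set whose schema the vUDF declares with the TABLE clause.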
  • 16. MapReduce Demo with Virtual UDF
       1. First, create a remote server pointing to the WebHDFS and WebHCat servers
          - Use the CREATE REMOTE SOURCE statement for that
       2. Create a virtual user-defined function
          - Reference the Mapper class name
          - Reference the Reducer class name
          - Reference the input file location that the MapReduce job should read from
       3. Call the MapReduce job
          - Call the vUDF
  • 17. HANA Data Platform and Hadoop: Where We Are Heading
       Some relevant features:
       - Lightweight and fast data replication/movement from HANA to Hadoop
       - Data aging solution for HANA via the Data Lifecycle Management utility, to define aging rules and relocate aged data to Hadoop
       - SDA support for data provisioning for the SAP HANA Service/Adapter Framework
       - SDA performance optimization: maintain statistics
       - Optimize SAP HANA and Spark SQL integration
       - Leverage HANA/Hadoop security capabilities for user authentication
       - Single UI for HANA and Hadoop cluster administration & monitoring (through Ambari)
  • 18. Conclusion: Bringing Big Data to Mainstream Enterprise Data
       ONE PLATFORM | ALL WORKLOADS | INTEGRATED | ALL DATA | SIMPLE | OPEN
  • 19. Thank you
       Contact information:
       Javier Cuerva
       Enterprise Solution Architect
       SAP Global Center of Expertise
       javier.cuerva@sap.com
  • 20. Backup Slides
  • 21. HANA Data Platform: SAP HANA Is the Platform for ALL Applications
       [Diagram]
       - Applications: any apps on any app server; SAP Business Suite and BW on the ABAP app server
       - HANA Platform: SQL, SQLScript, JavaScript; spatial; text search; text analysis & mining; stored procedures & data models; application & UI services; Business Function Library; Predictive Analysis Library; database services; series data; rules engine; integration & streaming services
       - Data: transaction, unstructured, machine, Hadoop, real-time, location, other apps
       A true platform
       - Converged OLTP + OLAP
       - Native processing services
       - Embedded business logic
       Supports any application
       - 60% of HANA use cases are outside of the SAP landscape
       - 1,300+ start-ups & ISVs developing on HANA
       Supports any device
  • 22. SAP HANA Smart Data Integration & Smart Data Quality
       Replication, Batch Integration, and Data Virtualization
       Capabilities
       - Real-time replication & CDC on select sources
       - Bulk integration (metadata / data)
       - Data virtualization via Smart Data Access
       - Real-time data cleansing and transformation
       - Data enrichment with geospatial information
       - SAP HANA Studio to define data transformation flows
       - Support for on-premise and cloud sources
       - Open SDK and built-in adapters, including Hive
       Benefits
       - Simplified landscape: one environment to provision data
       - Real-time: lower latency with in-memory performance
       - Open & extensible: supports data of any shape or size
       [Diagram: adapter framework with built-in adapters (OData; DB2, Oracle, SQL Server) and custom adapters; transformations; SAP HANA metadata; Smart Data Integration; Smart Data Quality]
  • 23. SAP HANA Smart Data Access: Virtual Table
       Capabilities
       - Real-time, virtualized data access to external sources
       - SAP sources: HANA, ASE, IQ, MaxDB, ESP, SQLA
       - Databases: Teradata, Microsoft SQL Server, Oracle, IBM DB2, IBM Netezza
       - Hadoop: Hive ODBC driver to Cloudera, Hortonworks, MapR
       - NoSQL: Spark
       Benefits
       - Optimized performance
       - Complements existing enterprise investments
       - Lower development costs by using data directly from its source system
  • 24. SAP BusinessObjects BI / SAP Lumira & Hadoop / NoSQL Combined with SAP HANA
       [Diagram]
       - Hadoop / NoSQL: Hive (SQL query), Impala (MPP SQL query), MongoDB (document DB), Cassandra (NoSQL DB), MapReduce / YARN / AWS Elastic MapReduce (distributed processing framework)
       - SAP HANA Platform: Smart Data Access, SAP Data Services 4.1 (Hive & HDFS)
       - SAP BusinessObjects BI: data integration, BI Universe
       - SAP Lumira Desktop, SAP Lumira Cloud
       Capabilities
       - SAP Lumira Desktop and SAP BusinessObjects BI can integrate with Hadoop via SAP HANA
       - SAP BusinessObjects BI 4.0 FP 3 (Universe) integrates with Hive, Cloudera Impala, and AWS EMR
       - SAP Lumira Desktop integrates with Hive and AWS EMR
       - SAP Lumira comes with a Datasource Extension Framework that developers can use to build additional data source access; MongoDB, DataStax, and Spark SQL are the most recent examples
       - SAP Lumira Cloud integrates with Hive (0.13), Cloudera Impala (1.21), and AWS EMR
       Benefits
       - Flexible choice of how to access Hadoop / NoSQL
       - Greater insight from Big Data analytics
  • 25. SAP Predictive Analytics & Hadoop / NoSQL
       [Diagram]
       - SAP Predictive Analytics 2.0
       - Hadoop / NoSQL: Greenplum (SQL DB), Hive (SQL), Spark (in-memory processing), HDFS (Hadoop Distributed File System)
       Capabilities
       - Unified UI for business analysts and data scientists
       - Extensive predictive library, including R algorithms
       - Big Data ready, with support for Hive and Spark as well as Greenplum; custom data extensions are available for HDFS and virtually any NoSQL database
       - Cloud services & SDK ready, with full process automation capabilities
       Benefits
       - Packable in business applications
       - Improved prediction & insights from Big Data analysis