Leveraging SAP, Hadoop, and Big Data to Redefine Business
Javier Cuerva | Enterprise Solution Architect | SAP Global CoE
April 16th, 2015 | Public
© 2015 SAP SE or an SAP affiliate company. All rights reserved.
More Data, Different Data, Faster Data = Big Data
• Digital universe exploding*
  – Until 2020, the digital universe will double every 2 years, reaching 40 zettabytes
• No longer only relational data
• Machine-generated data is the trend "du jour"
  – Internet of Things or industrial data
*Source: IDC, "The Digital Universe in 2020". 1 zettabyte (ZB) = 1 million petabytes (PB)
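The growth figures on this slide can be checked with a little arithmetic. The sketch below is a minimal illustration, not IDC's own model: it takes the slide's numbers (40 ZB in 2020, doubling every 2 years, 1 ZB = 1 million PB) and back-computes the sizes that the doubling rule implies for earlier years. The function names `implied_size_zb` and `zb_to_pb` are hypothetical, and the earlier-year values are extrapolations from the rule, not published data.

```python
# Figures taken from the slide: 40 ZB in 2020, doubling every 2 years,
# 1 ZB = 1 million PB. Everything else here is an illustrative assumption.
TARGET_YEAR = 2020
TARGET_SIZE_ZB = 40.0
DOUBLING_PERIOD_YEARS = 2
PB_PER_ZB = 1_000_000


def implied_size_zb(year):
    """Size implied by the doubling rule, anchored at 40 ZB in 2020."""
    periods = (TARGET_YEAR - year) / DOUBLING_PERIOD_YEARS
    return TARGET_SIZE_ZB / (2 ** periods)


def zb_to_pb(size_zb):
    """Convert zettabytes to petabytes using the slide's conversion factor."""
    return size_zb * PB_PER_ZB


for year in (2014, 2016, 2018, 2020):
    size = implied_size_zb(year)
    print(f"{year}: {size:.1f} ZB = {zb_to_pb(size):,.0f} PB")
```

Read this way, the slide's claim means that each two-year step adds as much data as all previous years combined.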
Big Data Economics
Generating significant financial value across sectors
US healthcare
• $300 billion value per year
• ~0.7% annual GDP
Manufacturing
• Up to 50% lower assembly costs
• Up to 7% reduction in working capital
Retail
• 60%+ increase in net margin possible
• 0.5-1.0% annual GDP
Global personal location data
• $100 billion+ revenue for service providers
Source: McKinsey Global Institute analysis
SAP Focus
End-to-End Value Chain
[Diagram: HANA DATA PLATFORM value chain with the layers SOURCE, INGEST, STORAGE, COMPUTE, CONSUME]
• Sources: logs, text, OLTP, social, machine, geo, ERP, sensor
• Smart Data Integration / Smart Data Quality: transformations & cleansing
• Smart Data Streaming: stream processing
• Store & forward
• Smart Data Access: virtual tables, user-defined functions
• In-Memory: data model & data
• Column Storage: high-performance analytics
• Series Data Storage: store time-series data
• Dynamic Tiering: aged data on disk
• Hadoop / NoSQL: MapReduce, YARN, HDFS
• Calculation engine: fast computing
• Engines: spatial processing; stream processing; analytics, text, graph, predictive
• Application development environment
• Consumption: reporting & dashboards, high-performance applications, data exploration & visualization, ad hoc & OLAP analytics, predictive analysis, business planning & forecasting, mobile applications and BI, Lumira / BI
HANA Data Management
Technical Foundation for End-to-End Big Data
• In-Memory: sub-second response
• Column Storage: high-performance analytics
• Dynamic Tiering: warm data to disk
• Smart Data Access: remote sources as virtual tables
• Virtual UDF: HDFS and MapReduce
• Smart Data Streaming: on-the-fly stream analysis
• Smart Data Integration: extend HANA with Hadoop stores
• Smart Data Quality: cleansing and transformation
• Replication Server: real-time data movement to Hadoop
• Smart Data Preparation: clean data for better decisions
• Data Services: Big Data and NoSQL transformations
• Data Warehouse Foundation: aging rules and automated data movement from HANA to Hadoop
HANA Data Platform
HANA Data Management Platform for Big Data
• SAP HANA: in-memory, instant results (0.1 sec)
• HANA Dynamic Tiering: warm data
• Hadoop: compute & storage (∞)
• Text | Search | Graph | GeoSpatial | Predictive | Time Series
• Administration | Monitoring | Operations | User Management | Security
HANA Data Platform
Big Data Features
HANA native Big Data
• Dynamic Tiering
• Smart Data Streaming
• Graph | Geo | Time Series
HANA & Hadoop
• Smart Data Access: Hive | Spark
• MapReduce | HDFS
HANA Data Platform
Dynamic Tiering
HANA Dynamic Tiering
• Native Big Data solution: real-time insights on ALL enterprise data
• Manage data cost-effectively
• Terabytes to petabytes
• Application-defined temperature
• Single database experience
• Centralized operational control
CREATE TABLE "demo"."SalesOrders_WARM" (
    ID         INTEGER NOT NULL,
    CustomerID INTEGER NOT NULL,
    OrderDate  DATE    NOT NULL,
    …,
    PRIMARY KEY (ID)
) USING EXTENDED STORAGE;
INSERT INTO "demo"."SalesOrders_WARM" VALUES ( … );
HANA Data Platform
Hadoop Integration
HANA & Hadoop Integration
• SQL on Hadoop via SDA with virtual tables
  – Hive or Spark
• Execution of MapReduce jobs via virtual functions
• Access to HDFS
• Calculation view support for virtual tables
HANA Data Platform
Hive, Spark Integration
Feature Highlights
• Data virtualization (Smart Data Access) to Hive via ODBC connectivity
• Richer SQL access from SAP HANA studio to Hive tables
• Compute SQL operations between Hive tables and HANA tables from HANA studio
• Remote caching of Hive data in queries, for example:
    SELECT * FROM HIVE_LINEITEMS WHERE ORDER_ID = 6 WITH HINT (USE_REMOTE_CACHE)
Virtual UDF for HDFS and MapReduce Integration
Architecture
Virtual UDF for HDFS and MapReduce Integration
Syntax
Highlights
• Syntax:
    CREATE VIRTUAL FUNCTION <func_name> [(<parameter_clause>)]
    RETURNS <return_table_type>
    [SQL SECURITY <mode>]
    [<package_clause>]
    CONFIGURATION <remote_proc_properties>
    AT <remote_source_name>;
• Virtual function properties
  – Can be used in place of a table or derived table; the RETURNS clause describes the result set
  – Many configuration parameters, depending on whether the call targets HDFS or a MapReduce job
  – Points to a remote Hadoop cluster defined by the CREATE REMOTE SOURCE DDL
HANA Data Platform
HDFS Integration
Feature Highlights
• Query native HDFS (Hadoop Distributed File System) data
• Read-only access to HDFS files
• The vUDF must define the schema of the returned result set with the TABLE clause
• Some relevant configuration parameters (more in the SPS09 Administration Guide):
Parameter Name         Description
hdfs_location          Location of the HDFS file, e.g. /user/hive/tpch/products
hdfs_field_delimiter   Character that separates the fields in the file at hdfs_location
datetime_format        ISO datetime format of a datetime column in the file
date_format            ISO date format of a date column in the file, e.g. yyyy-MM-dd
time_format            ISO time format of a time column in the file
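To make the role of the delimiter and format parameters concrete, here is a minimal Python sketch of what they describe: how one line of a delimited file becomes a typed row. It does not call HANA or HDFS. The `config` dictionary, the sample line, and the helpers `to_strptime` and `parse_line` are assumptions made up for illustration; only the parameter names and the yyyy-MM-dd pattern come from the slide.

```python
from datetime import date, datetime

# Hypothetical stand-ins for the slide's parameters; the values are assumptions.
config = {
    "hdfs_field_delimiter": "|",
    "date_format": "yyyy-MM-dd",  # pattern style shown on the slide
}


def to_strptime(pattern):
    """Translate the slide-style pattern into Python strptime directives.

    Only the three tokens used in this example (yyyy, MM, dd) are covered.
    """
    return pattern.replace("yyyy", "%Y").replace("MM", "%m").replace("dd", "%d")


def parse_line(line, cfg):
    """Split one delimited line and convert each field to its column type."""
    product_id, name, added = line.rstrip("\n").split(cfg["hdfs_field_delimiter"])
    added_on = datetime.strptime(added, to_strptime(cfg["date_format"])).date()
    return int(product_id), name, added_on


sample_line = "42|Widget|2015-04-16\n"  # made-up record, not real TPC-H data
row = parse_line(sample_line, config)
print(row)
```

The typed tuple plays the part of one row of the result set whose schema the vUDF declares in its TABLE clause.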
HDFS Demo with Virtual UDF
First, create a remote source pointing to the WebHDFS and WebHCat servers
• Use the CREATE REMOTE SOURCE statement for this
Create a virtual user-defined function (vUDF)
• Point it to the HDFS file and specify the type of data returned
Access the HDFS file
• Call the vUDF
HANA Data Platform
Map Reduce Integration
Feature Highlights
• Capability to invoke MapReduce jobs from HANA
• End-to-end development:
  – Define the Mapper and Reducer Java classes in HANA studio by creating a Java project with the SAP HANA Development perspective
  – Deploy MapReduce jobs from HANA studio
• The vUDF must define the schema of the returned result set with the TABLE clause
• Some relevant configuration parameters (more in the SPS09 Administration Guide):
Parameter Name   Description
mapred_mapper    Fully qualified Java class name for the map phase
mapred_reducer   Fully qualified Java class name for the reduce phase
mapred_input     Initial file to be used by MapReduce, an intermediate result when chaining
                 MapReduce calls, or the input directory to read the data from
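For readers new to MapReduce, the two phases named by mapred_mapper and mapred_reducer can be sketched in a few lines. The example below is a plain Python simulation of a word count, assuming a tiny in-memory input in place of the file named by mapred_input. It is not the Java Mapper and Reducer classes the slide refers to, and the names `mapper`, `reducer`, and `run_job` are hypothetical.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce phase: combine all values emitted for one key."""
    return word, sum(counts)


def run_job(lines):
    """Run map, then group by key (the shuffle step), then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


sample_input = ["big data", "fast data", "big big data"]  # stands in for the input file
result = run_job(sample_input)
print(result)
```

The resulting word and count pairs correspond to the rows a vUDF would return, which is why the vUDF has to declare a matching result schema.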
MapReduce Demo with Virtual UDF
First, create a remote source pointing to the WebHDFS and WebHCat servers
• Use the CREATE REMOTE SOURCE statement for this
Create a virtual user-defined function (vUDF)
• Reference the Mapper class name
• Reference the Reducer class name
• Reference the input file location that the MapReduce job should read from
Call the MapReduce job
• Call the vUDF
HANA Data Platform and Hadoop
Where we are heading
Some relevant features:
• Lightweight and fast data replication/movement from HANA to Hadoop
• Data aging solution for HANA via the Data Lifecycle Management utility to define aging rules and relocate aged data to Hadoop
• SDA support for data provisioning for the SAP HANA Service/Adapter Framework
• SDA performance optimization: maintain statistics
• Optimize SAP HANA and Spark SQL integration
• Leverage HANA/Hadoop security capabilities for user authentication
• Single UI for HANA and Hadoop cluster administration & monitoring (through Ambari)
Conclusion
Bringing Big Data to mainstream Enterprise Data
ONE PLATFORM | ALL WORKLOADS | INTEGRATED | ALL DATA | SIMPLE | OPEN
Thank you
Contact information:
Javier Cuerva
Enterprise Solution Architect
SAP Global Center of Expertise
javier.cuerva@sap.com
Backup Slides
HANA Data Platform
[Diagram: HANA Platform]
• Applications: any apps on any app server; SAP Business Suite and BW on the ABAP app server; other apps
• Data: transaction, unstructured, machine, Hadoop, real-time, location
• Languages: SQL, SQLScript, JavaScript
• Processing: spatial, text search, text analysis & mining, stored procedures & data models, Business Function Library, Predictive Analysis Library, series data, rules engine
• Services: application & UI services, database services, integration & streaming services
SAP HANA is the platform for ALL applications
A true platform
• Converged OLTP + OLAP
• Native processing services
• Embedded business logic
Supports any application
• 60% of HANA use cases are outside of the SAP landscape
• 1,300+ start-ups & ISVs developing on HANA
Supports any device
SAP HANA Smart Data Integration & Smart Data Quality
Replication, Batch Integration, and Data Virtualization
Capabilities
• Real-time replication & CDC on select sources
• Bulk integration (metadata / data)
• Data virtualization via Smart Data Access
• Real-time data cleansing and transformation
• Data enrichment with geospatial information
• SAP HANA studio to define data transformation flows
• Support for on-premise and cloud sources
• Open SDK and built-in adapters, including Hive
Benefits
• Simplified landscape: one environment to provision data
• Real-time: lower latency with in-memory performance
• Open & extensible: supports data of any shape or size
[Diagram labels: Smart Data Integration, Smart Data Quality, adapter framework, built-in adapters, custom adapters, transformations, metadata, SAP HANA; example sources: OData, DB2, Oracle, SQL Server]
SAP HANA Smart Data Access
Virtual Table
Capabilities
• Real-time, virtualized data access to external sources
• SAP sources: HANA, ASE, IQ, MaxDB, ESP, SQLA
• Databases: Teradata, Microsoft SQL Server, Oracle, IBM DB2, IBM Netezza
• Hadoop: Hive ODBC driver to Cloudera, Hortonworks, MapR
• NoSQL: Spark
Benefits
• Optimized performance
• Complements existing enterprise investments
• Lower development costs by using data directly from its source system
SAP BusinessObjects BI / SAP Lumira & Hadoop / NoSQL
Combined With SAP HANA
[Diagram labels: SAP BusinessObjects BI (data integration, BI universe), SAP Lumira Desktop, SAP Lumira Cloud, SAP HANA Platform, Smart Data Access, SAP Data Services 4.1 (Hive & HDFS)]
Hadoop / NoSQL
• Hive: SQL query
• Impala: MPP SQL query
• MongoDB: document DB
• Cassandra: NoSQL DB
• MapReduce / YARN / AWS Elastic MapReduce: distributed processing framework
Capabilities
• SAP Lumira Desktop & SAP BusinessObjects BI can integrate with Hadoop via SAP HANA
• SAP BusinessObjects BI 4.0 FP 3 (Universe) integrates with Hive, Cloudera Impala, and AWS EMR
• SAP Lumira Desktop integrates with Hive and AWS EMR
• SAP Lumira comes with a Datasource Extension Framework that developers can use to build additional data source access; MongoDB, DataStax, and Spark SQL are the most recent examples
• SAP Lumira Cloud integrates with Hive (0.13), Cloudera Impala (1.21), and AWS EMR
Benefits
• Flexible choice of how to access Hadoop / NoSQL
• Greater insight from Big Data analytics
SAP Predictive Analytics & Hadoop / NoSQL
SAP Predictive Analytics 2.0
Hadoop / NoSQL
• Hive: SQL
• Spark: in-memory processing
• HDFS: Hadoop Distributed File System
• Greenplum: SQL DB
Capabilities
• Unified UI for business analysts and data scientists
• Extensive predictive library, including R algorithms
• Big Data ready, with support for Hive and Spark as well as Greenplum; custom data extensions are available for HDFS and virtually any NoSQL database
• Cloud services & SDK ready, with full process automation capabilities
Benefits
• Can be packaged into business applications
• Improved prediction & insights from Big Data analysis
