PUBLIC
SAP HANA Data Management Suite
Sefan Linders
Big Data Warehouse Architect
Customer Innovation & Enterprise Platform
November 2018
PUBLIC © 2018 SAP SE or an SAP affiliate company. All rights reserved.
Legal disclaimer
The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission
of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP.
SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop
or release any functionality mentioned therein. This document, or any related presentation, and SAP’s strategy and possible
future developments, products, and platforms, directions, and functionality are all subject to change and may be changed
by SAP at any time for any reason without notice. The information in this document is not a commitment, promise, or legal
obligation to deliver any material, code, or functionality. This document is provided without a warranty of any kind, either
express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose,
or noninfringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes
no responsibility for errors or omissions in this document, except if such damages were caused by SAP’s willful misconduct
or gross negligence.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ
materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements,
which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
What problem are we addressing?
• Business users need to have all the data relevant to their decision, and they need to trust the security and accuracy of their data
• Businesses need to harness the power of all their data – business and new data types – and to anticipate and influence business outcomes
• Businesses need to provide all users with the right information in context at the right moment for the task at hand
Today: Data sprawl, impossible to govern, security complexity
Supply Chain | Finance | HR | Manufacturing | Sales | Connected Assets
Third-party Finance and Planning | Visualization Tools | Statistical Analytics | Spreadsheets | SAP BusinessObjects | Decision Intelligence Systems | BW
TACTICAL REPORTS | FUNCTIONAL REPORTS | STRATEGIC REPORTS | INNOVATION APPS
Business situation and implications
• Most enterprises now have data in 6-8 clouds
• Data has become less accessible because cloud-based solutions and applications built by business units have proliferated, further fragmenting the data landscape
• Companies' understanding of their customers, suppliers, and products has been in decline because the data is inaccessible
• Substantial legal risks due to lack of governance, e.g. GDPR
• Difficulty of operationalizing data science in everyday business processes
More trusted data | More connected, intelligent data | More cloud and architecture flexibility
Vision: Common data model, all data used by everyone, simple
Supply Chain | Finance | HR | Manufacturing | Sales | Connected Assets
Third-party Finance and Planning | Visualization Tools | Statistical Analytics | Spreadsheets | SAP BusinessObjects | Decision Intelligence Systems | BW
TACTICAL REPORTS | FUNCTIONAL REPORTS | STRATEGIC REPORTS | INNOVATION APPS
SAP HANA DATA MANAGEMENT SUITE
In-Memory Data Management | Single logical data model across entire organization | Data Flow Modeling and Control | Insights from powerful analytics engines
SAP HANA Data Management Suite
Trusted Data | Connected, Intelligent Data | Cloud Architecture Flexibility
SAP Intelligent Enterprise Suite | SAP Leonardo and SAP Analytics Cloud | Third-Party Applications
In-memory transaction & analytics | Data discovery & governance | Data orchestration & integration | Data cleansing & enrichment | Data storage & compute
SAP HANA | SAP Data Hub | SAP Enterprise Architecture Designer | SAP Big Data Services
Third-Party Services & Products: Spark | Hadoop | Third-party Databases | Third-party Data Management
SAP Add-On API Services & Products: SAP HANA Spatial services | SAP HANA Blockchain service | SAP HANA Streaming Analytics | Other SAP Cloud Platform and SAP Leonardo Services | SAP EIM Solutions
Hybrid Cloud Management | SAP Cloud Platform
Business Data | Cloud Application Data | IoT | Spatial | Social | Image
On Premises | Multi-Cloud | Hybrid
SAP HANA Data Management Suite
Common capabilities today and tomorrow
• Development platform for applications that need analytics on real-time transactions
• Harmonized UX across administration and development tools
• Data governance, anonymization, and pipeline flow to protect and refine data across the landscape
• Modelling across business, data, and technology
• Applied AI to automate data operations and pre-defined business application scenarios
• In-memory multi-model analytics and data processing on a distributed computing framework
• Common metadata catalog, business models, and comprehensive data governance
On Premise | Hybrid | Multi-cloud
Today | Future
Seamless Cloud Service
2019
SAP HANA and SAP Data Hub engine integration
Shared capabilities in SAP HANA & SAP Data Hub: spatial data,
SQL, Graph, Doc store and common SQL
Connectivity
Automatic connectivity between HANA and Data Hub
Lifecycle management / DevOps / deployment
Data Hub as a Service (beta)
Tooling / UX
Consistent navigation through HDMS tooling
Metadata model and content repository
Common metadata catalog across HDMS and third-party stores
and data orchestration with end-to-end lineage
Security and system enablement
Enhanced secure connections between HDMS components
(hybrid, multicloud, on-premises)
SAP HANA and SAP Data Hub engine integration
Extension of shared capabilities in HANA & Data Hub: spatial (adv), graph & doc data types, loading of Parquet or ORC files
Data tiering
Data Tiering as cloud service with BDS integration & HANA Native storage extension
Lifecycle management / DevOps / deployment
Scenario based HDMS deployment of HANA & Data Hub in SAP Cloud Platform
Cross-cloud federation support
One Backup, recovery, and High Availability approach
Common lifecycle handling: content lifecycle and platform lifecycle (e.g. upgrade) across all HDMS components and engines
Further deployment options for cloud providers & data center
Deeper EAD integration with metadata catalog and lineage
Data Science
Data Hub to execute pipelines using common ML libraries (PAL, APL) with HANA, consume additional ML frameworks
(Leonardo, 3rd party services, etc.)
Common custom ML operators for TF & R serving deployed by Data Hub and consumed in HANA
Tooling / UX
Alignment and harmonization of tooling
Metadata model and content repository
Partner ecosystem for SAP HANA Data Management Suite content
Security and system enablement
Streamlined user management, authorizations and authentications across full logical DW managed by HDMS
SAP HANA Data Management Suite
Roadmap
2019 | 2018
The SAP HANA Data Management Suite roadmap follows a ‘cloud first’ strategy. Relevant capabilities will be available in on-premises versions at later dates.
Multiple Patterns from One Architecture
Cloud and architecture flexibility
Cloud freedom for data systems, applications, and system development
Big Data Warehouse | Leonardo Platform | S/4HANA Expansion | Spatial Analytics | Analytics Data Mart
SAP BW/4HANA | SAP Leonardo | SAP S/4HANA | SAP HANA | Earth Observation Analysis | SAP Cloud Platform Spatial | Business Intelligence Tools
SAP HANA | SAP Data Hub | Big Data services from SAP | SAP EA Designer
SAP Enterprise Architecture Designer
Architecture and design
Key capabilities
▪ Strategy: Define the business strategy with common business architecture standards to build a plan to act
▪ Design: Create business and technical architecture using industry-standard models to define the implementation
▪ Implementation: Align development with strategy and design to drive or represent the implementation
▪ Consume: Communicate understanding and drive action across all stakeholders
Cloud | On premise
Developer | Business user | Architect | Knowledge worker
Strategy | Design | Implementation
Landscape | Big Data | Databases | Requirements | Capabilities | Processes
SAP Data Hub
What’s New
What is SAP Data Hub?
The Lego Analogy
Disparate Data Landscapes:
• Streams: live data feed (e.g. audio, video, Twitter)
• Events: alert/notification (e.g. IoT)
• Semi-structured: JSON, XML
• Structured: RDBMS, CRM, ERP, Legacy, File, etc.
• Unstructured: PowerPoint and Word documents, video, audio, image
Information Catalog | Monitoring & Scheduling | Orchestration | Pipelines
Stream | Subscribe | Ingest | Validate | Transform | Enrich | Compute | Machine Learning | Mask | Custom Code | Image Processing | Refine | Publish | Trigger Action
Data Consumption: Intelligent apps | Automated processes
Hybrid | On-Premises | Cloud
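The stage names on this slide (ingest, validate, transform, mask, publish) can be read as small, composable steps that are snapped together like bricks. The following is a minimal Python sketch of that idea only; it is not SAP Data Hub's API, and every function name and record field in it is a hypothetical chosen for illustration.

```python
import hashlib
import json
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Stage = Callable[[Iterator[Record]], Iterator[Record]]


def ingest(lines: Iterable[str]) -> Iterator[Record]:
    """Parse semi-structured input: one JSON document per line."""
    for line in lines:
        yield json.loads(line)


def validate(records: Iterator[Record]) -> Iterator[Record]:
    """Drop records that lack the fields later stages rely on."""
    for rec in records:
        if "customer" in rec and isinstance(rec.get("amount"), (int, float)):
            yield rec


def transform(records: Iterator[Record]) -> Iterator[Record]:
    """Derive a new field from an existing one."""
    for rec in records:
        yield {**rec, "amount_cents": round(rec["amount"] * 100)}


def mask(records: Iterator[Record]) -> Iterator[Record]:
    """Replace the customer name with a short, non-reversible digest."""
    for rec in records:
        digest = hashlib.sha256(str(rec["customer"]).encode("utf-8")).hexdigest()
        yield {**rec, "customer": digest[:8]}


def run_pipeline(source: Iterator[Record], stages: List[Stage]) -> List[Record]:
    """Chain the stages lazily, then materialize the result ("publish")."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return list(stream)


# Hypothetical input: the second record is incomplete and gets filtered out.
raw = [
    '{"customer": "Alice", "amount": 12.5}',
    '{"customer": "Bob"}',
    '{"customer": "Carol", "amount": 3}',
]
published = run_pipeline(ingest(raw), [validate, transform, mask])
```

Because each stage only consumes and produces an iterator of records, stages can be reordered, removed, or replaced without touching the others, which is the point of the Lego analogy.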
Release Cycle - SAP Data Hub version 2.3
SAP Data Hub 1.4 | SAP Vora 2.2 | Innovation | SAP Data Hub 2.3
Release Scope:
• Lean deployment and installation with a complete containerized setup ready for any deployment
• Unified User Experience in one modeling environment
• Introducing Metadata Explorer and Cataloging
• Unifying SAP Vora & SAP Data Hub release cycle with a synchronized delivery
Motivation: Enables enterprises to build scalable data-driven applications rapidly
Release Theme – SAP Data Hub version 2.3
Deployment & Consumption
• Deployment on cloud environments with managed Kubernetes
• Individual SAP Data Hub Applications
• All components are containerized
User Experience
• Unified Modeling Tool for Workflows, Pipelines and Data Transforms
• Self Service Data Preparation with SAP Agile Data Preparation
• Comprehensive Monitoring & Diagnostic Framework
Metadata Governance
• Information Catalog to discover, define and understand sources
• Search for Metadata attributes and Tags
• Automated Metadata Crawling for SAP HANA, Cloud Stores, & SAP Vora
Data Integration & Processing
• Enhanced Connectivity (Databases, Big Data Stores, Cloud native Technologies)
• Data Integration into SAP S/4HANA, SAP Cloud solutions (Hybris, etc.), Master Data Management
• Data Quality Management
Deployment & Consumption
Cloud Deployments and Decoupling of Hadoop & HANA
Simplified deployment of SAP Data Hub in cloud and on-premise environments
• All components are fully containerized and delivered as Docker images, including SAP HANA
  • Removes the prerequisite of installing the SAP HANA database and XS advanced
  • Removes the prerequisite of setting up a Hadoop cluster
• Data processing is decoupled from storage platforms (any supported cloud stores). All runtime execution now occurs in Kubernetes
• Deployable on the most popular managed Kubernetes environments*. Supports:
  • managed Kubernetes services of the major cloud providers (i.e. AWS, Microsoft Azure, Google Cloud Platform),
  • private cloud, and
  • on-premise installations
* See Product Availability Matrix for detailed version dependencies
User Experience
Introducing Launchpad – a fresh new-look UI for SAP Data Hub v2.3
One central entry point to all services and applications
• Connection Management
• Monitoring
• Metadata Explorer
• Modeler
Metadata Governance
SAP Data Hub Metadata Explorer
A centralized location for:
browsing connections | monitoring | metadata catalog | searching datasets | publications | labels
Data Integration & Processing
The unified connectivity framework
The connectivity framework (FlowAgent) serves as the underlying infrastructure for rapidly growing and enhancing the native connectivity and integration functionality:
SAP Data Hub Metadata & Applications | SAP Data Hub Connectivity Framework (FlowAgent) | Metadata Extractor | Adapter | HDFS, BW/4HANA, Oracle, S3, …
1. Metadata Services (Browsing, Profiling, Data Preview)
   • Hadoop (HDFS)
   • Cloud object storage (AWS S3, GCP GCS, Azure Data Lake, WASB)
   • Oracle*, ABAP/ODP*, OData*
2. Connection Operators (Consumer, Producer)
   • HDFS, S3, GCS, ADL, WASB
   • Oracle**, ABAP/ODP**, OData**
   • Support for custom adapters
3. Spark code generation
   • HDFS
* Profiling is planned for a future release
** Producer is planned for a future release
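One way to picture why a single connectivity framework helps: if every source type implements the same small adapter contract, then browsing, data preview, and profiling only have to be written once. The sketch below illustrates that design in plain Python. It is an assumption-laden illustration, not FlowAgent's real interface; `Adapter`, `InMemoryAdapter`, and `MetadataService` are hypothetical names.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Dict, List


class Adapter(ABC):
    """Hypothetical adapter contract: one implementation per source type."""

    @abstractmethod
    def browse(self) -> List[str]:
        """Return the names of the datasets the source exposes."""

    @abstractmethod
    def read(self, dataset: str) -> List[dict]:
        """Return the rows of one dataset."""


class InMemoryAdapter(Adapter):
    """Stand-in for a real source (HDFS, S3, Oracle, ...) used for illustration."""

    def __init__(self, datasets: Dict[str, List[dict]]):
        self._datasets = datasets

    def browse(self) -> List[str]:
        return sorted(self._datasets)

    def read(self, dataset: str) -> List[dict]:
        return list(self._datasets[dataset])


class MetadataService:
    """Browsing, preview, and profiling written once against the contract."""

    def __init__(self, adapter: Adapter):
        self._adapter = adapter

    def browse(self) -> List[str]:
        return self._adapter.browse()

    def preview(self, dataset: str, limit: int = 5) -> List[dict]:
        return self._adapter.read(dataset)[:limit]

    def profile(self, dataset: str) -> dict:
        """Count rows, and per column the non-null values and observed types."""
        rows = self._adapter.read(dataset)
        columns: Dict[str, dict] = {}
        for row in rows:
            for name, value in row.items():
                stats = columns.setdefault(name, {"non_null": 0, "types": Counter()})
                if value is not None:
                    stats["non_null"] += 1
                    stats["types"][type(value).__name__] += 1
        return {"rows": len(rows), "columns": columns}


service = MetadataService(
    InMemoryAdapter({"orders": [{"id": 1, "city": "Walldorf"}, {"id": 2, "city": None}]})
)
summary = service.profile("orders")
```

Supporting a new source in this sketch means writing one more `Adapter` subclass; the metadata functions are untouched.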
Orchestration (external):
▪ SAP BW Process Chain
– Trigger execution of a process chain on a BW system
▪ Data Transfer (BW)
– Transfer data from a BW system into Vora tables (created on the fly)
▪ Data Services
– Execute remote Data Services jobs (demo)
▪ SAP HANA Flowgraph
– Trigger execution of a HANA flowgraph using SDI REST API (XSC)
▪ Spark / Hadoop
– Submit Spark jobs, Hive queries, etc. to Hadoop clusters
Execution (internal):
▪ Pipeline
– Start a pipeline on a local or remote SAP Data Hub Pipeline engine
– Wait for completion of pipeline (or if set continue immediately)
▪ Data Transform
– Run relational transformations (join, union, filter, etc.) on structured data
(tables, CSV, Parquet, etc.)
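The relational operations named for Data Transform (join, union, filter) can be illustrated on rows held as plain dictionaries. This is a minimal sketch of the operations themselves, not of how SAP Data Hub executes them; the function names and the sample tables are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def filter_rows(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Keep only the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


def union_all(left: List[Row], right: List[Row]) -> List[Row]:
    """Concatenate two row sets with the same columns (duplicates are kept)."""
    return list(left) + list(right)


def inner_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Hash join: index the right side by key, then probe it with the left side."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical sample data standing in for tables or CSV/Parquet files.
orders_eu = [
    {"order_id": 1, "customer_id": 10, "total": 250.0},
    {"order_id": 2, "customer_id": 11, "total": 40.0},
]
orders_us = [{"order_id": 3, "customer_id": 10, "total": 90.0}]
customers = [
    {"customer_id": 10, "name": "Acme"},
    {"customer_id": 11, "name": "Globex"},
]

all_orders = union_all(orders_eu, orders_us)
large_orders = filter_rows(all_orders, lambda row: row["total"] >= 90.0)
result = inner_join(large_orders, customers, "customer_id")
```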
User Experience
Workflows Definition
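The pipeline task on this slide either waits for completion or continues immediately. A small Python sketch of that choice follows, using the standard library thread pool. The `Task` class, the task names, and the log format are hypothetical; real workflow tasks would call external systems instead of returning a constant.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Task:
    name: str
    action: Callable[[], str]
    wait_for_completion: bool = True  # False: start it and move on


def run_workflow(tasks: List[Task]) -> List[str]:
    """Run tasks in order; non-waiting tasks are collected at the end."""
    log: List[str] = []
    background: List[Tuple[str, Future]] = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for task in tasks:
            future = pool.submit(task.action)
            if task.wait_for_completion:
                # Block until this step is done before starting the next one.
                log.append(f"{task.name}: {future.result()}")
            else:
                background.append((task.name, future))
                log.append(f"{task.name}: started")
        # Collect the results of the tasks that were left running.
        for name, future in background:
            log.append(f"{name}: {future.result()}")
    return log


workflow = [
    Task("bw_process_chain", lambda: "finished"),
    Task("pipeline", lambda: "finished", wait_for_completion=False),
    Task("data_transform", lambda: "finished"),
]
log = run_workflow(workflow)
```

In this sketch a failing task raises from `future.result()`, which stops the workflow at that point; whether a real workflow should stop or continue is a design decision the slide does not cover.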
Connectivity:
Connectivity via Flowagent:
DQMm: Leonardo ML:
Data Integration & Processing
Predefined Connectivity Snapshot
- Azure Data Lake (ADL)
- Local File System (file)
- Google Cloud Storage (GCS)
- HDFS
- Amazon S3
- Azure Storage Blob (WASB)
- WebHDFS
SAP Vora:
Spark / Hadoop:
- Spark
- Spark SQL
- PySpark
- Hive
Subengines:
▪ Develop and compile new operators locally using SDKs
▪ Register and run custom operators in available pipeline subengine
Process / Command Executors:
▪ Run a process within a pipeline and feed it a continuous stream of data
▪ Run a shell command for each message that arrives within a pipeline
Programming Operators:
▪ Write and run custom scripts for data manipulation within a pipeline
▪ Build re-usable operators in different programming languages
Data Integration & Processing
Data Processing
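The two executor styles differ in process lifetime: one command invocation per message versus one long-lived process that receives the whole stream. The sketch below shows both with Python's `subprocess` module. It is a generic illustration, not SAP Data Hub operator code; the helper names are hypothetical, and the "command" is a small Python one-liner so that the example runs on any platform.

```python
import subprocess
import sys
from typing import List

# Hypothetical external command: upper-cases whatever arrives on stdin.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_command_per_message(messages: List[str], command: List[str]) -> List[str]:
    """Command-executor style: start one process for each incoming message."""
    outputs = []
    for message in messages:
        completed = subprocess.run(
            command, input=message, capture_output=True, text=True, check=True
        )
        outputs.append(completed.stdout.strip())
    return outputs


def stream_to_process(messages: List[str], command: List[str]) -> List[str]:
    """Process-executor style: one process receives all messages as a stream."""
    stream = "".join(message + "\n" for message in messages)
    completed = subprocess.run(
        command, input=stream, capture_output=True, text=True, check=True
    )
    return completed.stdout.splitlines()


messages = ["sensor ok", "sensor hot"]
per_message = run_command_per_message(messages, UPPERCASE)
streamed = stream_to_process(messages, UPPERCASE)
```

Starting a process per message keeps messages isolated but pays the start-up cost every time; a single streaming process amortizes that cost and suits high message rates.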
Thank you.
© 2018 SAP SE or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of
SAP SE or an SAP affiliate company.
The information contained herein may be changed without prior notice. Some software products marketed by SAP SE and its
distributors contain proprietary software components of other software vendors. National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or
warranty of any kind, and SAP or its affiliated companies shall not be liable for errors or omissions with respect to the materials.
The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty
statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional
warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or
any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation,
and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platforms, directions, and
functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason
without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or
functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ
materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, and they
should not be relied upon in making purchasing decisions.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered
trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names
mentioned are the trademarks of their respective companies.
See https://www.sap.com/copyright for additional trademark information and notices.
www.sap.com/contactsap

SAP Data Hub – What is it, and what’s new? (Sefan Linders)

  • 1.
    PUBLIC SAP HANA DataManagement Suite Sefan Linders Big Data Warehouse Architect Customer Innovation & Enterprise Platform November 2018
  • 2.
    2PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Legal disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP’s strategy and possible future developments, products, and platforms, directions, and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or noninfringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP’s willful misconduct or gross negligence. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
  • 3.
    3PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ What problem are we addressing? Business users need to have all the data relevant to their decision and they need to trust the security and accuracy of their data Businesses need to harness the power of all their data – business and new data types – and to anticipate and influence business outcomes Businesses need to provide all users with the right information in context at the right moment for the task at hand
  • 4.
    4PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Supply Chain Finance HR Manufacturing Sales Connected Assets Third-party Finance and Planning Visualization Tools Statistical Analytics Spreadsheets SAP BusinessObjects Decision Intelligence Systems TACTICAL REPORTS FUNCTIONAL REPORTS STRATEGIC REPORTS INNOVATION APPS BW Today: Data sprawl, impossible to govern, security complexity
  • 5.
    5PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Business situation and implications Most enterprises now have data in 6-8 clouds Data has become less accessible due to the proliferation of cloud based solutions and business unit build applications further fragmenting the data landscape Company’s understanding of their customers, suppliers, products has been in decline, caused by data being inaccessible Substantial legal risks due to lack of governance, e.g. GDPR Difficulty of operationalizing data science use in everyday business processes
  • 6.
    6PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Business situation and implications Most enterprises now have data in 6-8 clouds Data has become less accessible due to the proliferation of cloud based solutions and business unit build applications further fragmenting the data landscape Company’s understanding of their customers, suppliers, products has been in decline, caused by data being inaccessible Substantial legal risks due to lack of governance, e.g. GDPR Difficulty of operationalizing data science use in everyday business processes More trusted data More connected, intelligent data More cloud and architecture flexibility
  • 7.
    8PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Third-party Finance and Planning Visualization Tools Statistical Analytics Spreadsheets SAP BusinessObjects Decision Intelligence Systems Supply Chain Finance HR Manufacturing Sales Connected Assets Vision: Common data model, all data used by everyone, simple TACTICAL REPORTS FUNCTIONAL REPORTS STRATEGIC REPORTS INNOVATION APPS BW SAP HANA DATA MANAGEMENT SUITE In-Memory Data Management | Single logical data model across entire organization | Data Flow Modeling and Control | Insights from powerful analytics engines
  • 8.
    11PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ SAP HANA Data Management Suite Trusted Data | Connected, Intelligent Data | Cloud Architecture Flexibility SAP Intelligent Enterprise Suite SAP Leonardo and SAP Analytics Cloud Third-Party Applications SAP HANA Data Management Suite In-memory transaction & analytics Data discovery & governance Data orchestration & integration Data cleansing & enrichment Data storage & compute SAP HANA SAP Data Hub SAP Enterprise Architecture Designer SAP Big Data Services Third-Party Services & Products Spark Hadoop Third-party Databases Third-party Data Management HybridCloudManagement SAPCloudPlatform Business Data Cloud Application Data IoT Spatial Social Image On Premises Multi-Cloud SAP Add-On API Services & Products SAP HANA Spatial services SAP HANA Blockchain service SAP HANA Streaming Analytics Other SAP Cloud Platform and SAP Leonardo Services SAP EIM Solutions Hybrid
  • 9.
    13PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Development platform for applications that need analytics on real-time transactions Harmonized UX across administration and development tools Data governance, anonymization, and pipeline flow to protect and refine data across the landscape Modelling across business, data, and technology Applied AI to automate data operations and pre-defined business application scenarios In-memory multi-model analytics and data processing on a distributed computing framework Common metadata catalog, business models, and comprehensive data governance SAP HANA Data Management Suite SAP HANA Data Management Suite Common capabilities today and tomorrow On Premise | Hybrid | Multi-cloud Today Future Seamless Cloud Service
  • 10.
    14PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ 2019 SAP HANA and SAP Data Hub engine integration Shared capabilities in SAP HANA & SAP Data Hub: spatial data, SQL, Graph, Doc store and common SQL Connectivity Automatic connectivity between HANA and Data Hub Lifecycle management / DevOps / deployment Data Hub as a Service (beta) Tooling / UX Consistent navigation through HDMS tooling Meta data model and content repository Common Meta Data Catalog across HDMS and 3rd party stores and data orchestration with end-to-end lineage Security and system enablement Enhanced secure connections between HDMS components (hybrid, multicloud, on-premises) SAP HANA and SAP Data Hub engine integration Extension of shared capabilities in HANA & Data Hub: spatial (adv), graph & doc data types, loading of parquet or OCR files Data tiering Data Tiering as cloud service with BDS integration & HANA Native storage extension Lifecycle management / DevOps / deployment Scenario based HDMS deployment of HANA & Data Hub in SAP Cloud Platform Cross-cloud federation support One Backup, recovery, and High Availability approach Common Lifecycle handling –content lifecycle, platform lifecycle(e.g. upgrade) across all HDMS components and engines Further deployment options for cloud providers & data center Deeper EAD integration w meta data catalog and lineage Data Science Data Hub to execute pipelines using common ML libraries (PAL, APL) with HANA, consume additional ML frameworks (Leonardo, 3rd party services, etc.) 
Common custom ML operators for TF & R serving deployed by Data Hub and consumed in HANA Tooling / UX Alignment and harmonization of tooling Meta data model and content repository Partner ecosystem for SAP Hana Data Management Suite content Security and system enablement Streamlined user management, authorizations and authentications across full logical DW managed by HDMS SAP HANA Data Management Suite Roadmap 20192018 The SAP HANA Data Management Suite roadmap follows a ‘cloud first’ strategy. Relevant capabilities will be available in on-premises versions on later dates.
  • 11.
    15PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Big Data Warehouse Leonardo Platform S/4HANA Expansion Spatial Analytics Analytics Data Mart SAP BW/4HANA SAP Leonardo SAP S/4HANA SAP HANA Earth Observation Analysis SAP Cloud Platform Spatial SAP HANA SAP Data Hub Business Intelligence Tools Multiple Patterns from One Architecture Cloud and architecture flexibility Cloud freedom for data systems, applications, and system development SAP HANA SAP Data Hub Big Data services from SAP SAP EA Designer SAP HANA SAP Data Hub Big Data services from SAP SAP EA Designer SAP HANA SAP Data Hub Big Data services from SAP SAP EA Designer Business Intelligence Tools SAP EA Designer
  • 12.
    16PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ SAP HANA Data Management Suite Trusted Data | Connected, Intelligent Data | Cloud Architecture Flexibility SAP Intelligent Enterprise Suite SAP Leonardo and SAP Analytics Cloud Third-Party Applications SAP HANA Data Management Suite In-memory transaction & analytics Data discovery & governance Data orchestration & integration Data cleansing & enrichment Data storage & compute SAP HANA SAP Data Hub SAP Enterprise Architecture Designer SAP Big Data Services Third-Party Services & Products Spark Hadoop Third-party Databases Third-party Data Management HybridCloudManagement SAPCloudPlatform Business Data Cloud Application Data IoT Spatial Social Image On Premises Multi-Cloud SAP Add-On API Services & Products SAP HANA Spatial services SAP HANA Blockchain service SAP HANA Streaming Analytics Other SAP Cloud Platform and SAP Leonardo Services SAP EIM Solutions Hybrid
  • 13.
    22PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Key capabilities ▪ Strategy: Define the business strategy with common business architecture standards to build a plan to act ▪ Design: Create business and technical architecture using industry- standard models to define the implementation ▪ Implementation: Align development with strategy and design to drive or represent the implementation ▪ Consume: Communicate understanding and drive action across all stakeholders SAP Enterprise Architecture Designer Architecture and design Cloud | On premise DeveloperBusiness user Architect Strategy Design Implementation SAP Enterprise Architecture Designer Knowledge worker Landscape Big Data DatabasesRequirements Capabilities Processes
  • 14.
    23PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ
  • 15.
  • 16.
    26PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ What is SAP Data Hub? The Lego Analogy Streams: live data feed (e.g. audio, video, twitter) Events : alert/notification (e.g. IoT) Semi-structured: JSON, XML Structured: RDBMS, CRM, ERP, Legacy, File, etc. Unstructured: PPTs, Words, video, audio, image Information Catalog | Monitoring & Scheduling | Orchestration | Pipelines Hybrid Stream Subscribe Ingest Validate TransformEnrich Compute Machine Learning Mask Custom Code Image Processing Compute Refine Publish Trigger Action Data Consumption Disparate Data Landscapes Intelligent apps Automated processes On-Premises Cloud
  • 17.
    27PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Release Cycle - SAP Data Hub version 2.3 SAP Data Hub 1.4 SAP Vora 2.2 Innovation SAP Data Hub 2.3 Release Scope:  Lean deployment and installation with a complete containerized setup ready for any deployment  Unified User Experience in one modeling environment  Introducing Metadata Explorer and Cataloging  Unifying SAP Vora & SAP Data Hub release cycle with a synchronized delivery Motivation: Enables enterprises to build scalable data-driven applications rapidly
  • 18.
    28PUBLIC© 2018 SAPSE or an SAP affiliate company. All rights reserved. ǀ Release Theme – SAP Data Hub version 2.3 Deployment & Consumption User Experience Metadata Governance Data Integration & Processing • Deployment on cloud environments with managed Kubernetes • Individual SAP Data Hub Applications • All components are containerized • Unified Modeling Tool for Workflows, Pipelines and Data Transforms • Self Service Data Preparation with SAP Agile Data Preparation • Comprehensive Monitoring & Diagnostic Framework • Information Catalog to discover, define and understand sources • Search for Metadata attributes and Tags • Automated Metadata Crawling for SAP HANA, Cloud Stores, & SAP Vora • Enhanced Connectivity (Databases, Big Data Stores, Cloud native Technologies) • Data Integration into SAP S/4HANA, SAP Coud solutions (Hybris, etc), Master Data management • Data Quality Management
  • 19.
    Deployment & Consumption: Cloud Deployments and Decoupling of Hadoop & HANA
    Simplified deployment of SAP Data Hub in cloud and on-premise environments
    • All components are fully containerized and delivered as Docker images, including SAP HANA
      • Removes the prerequisite of installing an SAP HANA database and XS advanced
      • Removes the prerequisite of setting up a Hadoop cluster
    • Data processing is decoupled from storage platforms (any supported cloud store). All runtime execution now occurs in Kubernetes
    • Deployable on the most popular managed Kubernetes environments*. Supports:
      • managed Kubernetes services of the major cloud providers (i.e. AWS, Microsoft Azure, Google Cloud Platform),
      • private cloud, and
      • on-premise installations
    * See the Product Availability Matrix for detailed version dependencies
  • 20.
    User Experience: Introducing Launchpad, a fresh new UI (SAP Data Hub v2.3)
    One central entry point to all services and applications
    • Connection Management
    • Monitoring
    • Metadata Explorer
    • Modeler
  • 21.
    Metadata Governance: SAP Data Hub Metadata Explorer
    A centralized location for: browsing connections | monitoring | metadata catalog | searching datasets | publications | labels
  • 22.
    Data Integration & Processing: The unified connectivity framework
    The connectivity framework (Flowagent) serves as the underlying infrastructure, with the goal of rapidly growing and enhancing the native connectivity and integration functionality.
    Diagram labels: SAP Data Hub Metadata & Applications | SAP Data Hub Connectivity Framework (FlowAgent) | Metadata Extractor | Adapter | HDFS, BW4HANA, Oracle, S3, …
    1. Metadata Services (Browsing, Profiling, Data Preview)
       • Hadoop (HDFS)
       • Cloud object storage (AWS S3, GCP GCS, Azure Data Lake, WASB)
       • Oracle*, ABAP/ODP*, OData*
    2. Connection Operators (Consumer, Producer)
       • HDFS, S3, GCS, ADL, WASB
       • Oracle**, ABAP/ODP**, OData**
       • Support for custom adapters
    3. Spark code generation
       • HDFS
    * Profiling is planned for a future release
    ** Producer is planned for a future release
  • 23.
    User Experience: Workflows Definition
    Orchestration (external):
    ▪ SAP BW Process Chain – Trigger execution of a process chain on a BW system
    ▪ Data Transfer (BW) – Transfer data from a BW system into Vora tables (created on the fly)
    ▪ Data Services – Execute remote Data Services jobs (demo)
    ▪ SAP HANA Flowgraph – Trigger execution of a HANA flowgraph using the SDI REST API (XSC)
    ▪ Spark / Hadoop – Submit Spark jobs, Hive queries, etc. to Hadoop clusters
    Execution (internal):
    ▪ Pipeline – Start a pipeline on a local or remote SAP Data Hub Pipeline engine, then wait for the pipeline to complete (or, if so configured, continue immediately)
    ▪ Data Transform – Run relational transformations (join, union, filter, etc.) on structured data (tables, CSV, Parquet, etc.)
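To make the relational transformations named under Data Transform (join, union, filter) concrete, here is a minimal sketch in plain Python over rows represented as dictionaries. It only illustrates the semantics of the three operations; it is not how SAP Data Hub implements or exposes them, and the function names, the `customer_id` key, and the sample rows are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def filter_rows(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]


def union_rows(left: Iterable[Row], right: Iterable[Row]) -> List[Row]:
    """Concatenate two row sets (UNION ALL semantics, duplicates kept)."""
    return list(left) + list(right)


def join_rows(left: Iterable[Row], right: Iterable[Row], key: str) -> List[Row]:
    """Inner join on a shared key column, using a hash index on the right side."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


if __name__ == "__main__":
    # Hypothetical sample data
    orders_eu = [{"customer_id": 1, "amount": 120}, {"customer_id": 2, "amount": 40}]
    orders_us = [{"customer_id": 1, "amount": 75}]
    customers = [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]

    all_orders = union_rows(orders_eu, orders_us)
    large_orders = filter_rows(all_orders, lambda row: row["amount"] >= 50)
    for row in join_rows(large_orders, customers, "customer_id"):
        print(row)
```

Filtering before joining keeps the intermediate result small, which is the usual ordering for this kind of transformation regardless of the engine that runs it.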
  • 24.
    Data Integration & Processing: Predefined Connectivity Snapshot
    Connectivity: | Connectivity via Flowagent: | DQMm: | Leonardo ML:
    - Azure Data Lake (ADL)
    - Local File System (file)
    - Google Cloud Storage (GCS)
    - HDFS
    - Amazon S3
    - Azure Storage Blob (WASB)
    - WebHDFS
    SAP Vora:
    Spark / Hadoop:
    - Spark
    - Spark SQL
    - PySpark
    - Hive
  • 25.
    Data Integration & Processing: Data Processing
    Subengines:
    ▪ Develop and compile new operators locally using SDKs
    ▪ Register and run custom operators in an available pipeline subengine
    Process / Command Executors:
    ▪ Run a process within a pipeline and feed it a continuous stream
    ▪ Run a shell command for each message that arrives within a pipeline
    Programming Operators:
    ▪ Write and run custom scripts for data manipulation within a pipeline
    ▪ Build reusable operators in different programming languages
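The command-executor idea (run a command for each message that arrives) can be sketched with Python's standard `subprocess` module. This is a generic illustration under stated assumptions, not the SAP Data Hub operator itself: `run_command_per_message` and the `UPPERCASE` command are hypothetical names, and the command is a small Python one-liner so the example does not depend on any particular shell.

```python
import subprocess
import sys
from typing import Iterable, Iterator, List


def run_command_per_message(messages: Iterable[str], command: List[str]) -> Iterator[str]:
    """Start `command` once per message, pass the message on stdin, yield its stdout."""
    for message in messages:
        completed = subprocess.run(
            command,
            input=message,
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if the command exits non-zero
        )
        yield completed.stdout


# Hypothetical command: upper-case whatever arrives on stdin.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    for output in run_command_per_message(["sensor=1", "sensor=2"], UPPERCASE):
        print(output)
```

Starting a new process per message is simple but pays the process start-up cost every time; the process-executor variant on the slide, which keeps one process alive and streams messages into it, avoids that cost at the price of managing the process lifetime.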
  • 27.
  • 28.
    © 2018 SAP SE or an SAP affiliate company. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. The information contained herein may be changed without prior notice. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platforms, directions, and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, and they should not be relied upon in making purchasing decisions.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names mentioned are the trademarks of their respective companies. See https://www.sap.com/copyright for additional trademark information and notices. www.sap.com/contactsap