Modernizing Business Processes with Big Data: Real-World Use Cases for Production

Modernizing Business Processes with Big Data:
Real-World Use Cases for Production
Christoph Streubert & Amit Satoor
April 2017

© 2017 SAP SE or an SAP affiliate company. All rights reserved. 2PUBLIC
Agenda
• Use-Cases
• What is Vora
• Architecture

Utilities rise to the
smart meter
challenge The mass of information from smart
meters is leading utility suppliers to
reconsider how they use their data
• Smart meters generate TBs of data/month
• Regulatory requirement to retain data for 10 years
• Forecasting energy usage
• Benefits of integrating data
• Meter data could help fraud detection, predict
maintenance requirements and eventually lead to
smart grids which respond intelligently to variations
in supply and demand

Agriculture takes advantage of
Precision Farming
Top Line Revenue
Growth and Lower
Costs
• Run Reports in Minutes
versus a Day or Two Days
• Improved and Scalable
Architecture Lowering
Costs
• Accurate Weather
Forecasting Leads to an
Increase in Production
Business
Challenge
SAP HANA with SAP Vora
• Migrate DW to the SAP HANA
• Leverage in-DB Machine Learning for predictive analytics
• Hadoop and Vora for low cost storage and compute of
unstructured data
Technical
Enablers
Cost Effectiveness & Improve Product Yield
• Increase in costs and lost revenue due to forecasting
challenges
• Sugar production requires accurate timing
• Managing strategic acquisitions and multiple farms
Benefits Improve Speed and Accuracy of Weather Data
• Leverage IOT data
• OCR parsing of satellite imagery data
• Focus on automation and improvement of forecasting process
• Improve the UX presentation and options
Business Benefits

Agenda
• Use-Cases
• What is Vora
• Architecture

Data Storage
Scalable and unified storage
across data types and sources
Data Compute
Data processing and analysis,
discovery, enhancement, and
governance, making data usable
Data Consumption
Data-driven insight connected to action
Unifying
the Data
Landscape
Integrating across
storage, compute
and consumption
CIO Imperatives & Challenges
Common Lesson
Big Data journey is incomplete
without business transformation
1.
2.
3.
4.
Big Data Journey
53%:
difficulty
integrating with
other enterprise
systems
49% can’t apply external data quickly enough
to enable context-based decision making
59% Only few analysts
with specialized training can
analyze big data
Harvard Business Reivew Analytic Services in Sep 2015
Need lower skill, production support and
performance optimization costs

Draft
SAP Vora
SAP Vora is an enterprise-ready, easy-to-use in-
memory distributed computing solution to help
organizations uncover actionable insights from big data.
Builds upon
Apache Spark
Seamless Integration
with SAP HANA
Runs on
Hadoop
SAP
Vora

Distributed Computing for the Digital Enterprise
Hortonworks Data Platform
Spark
Distributed Transaction Log
Disk-to-Memory Accelerator
Data Modeler
OLAP Time Series Graph Doc Store
SAP Vora
OPEN CONSUM PTION
Data Science, Predictive, Business Intelligence, Visualization Apps
Insights from
one single solution
In-memory distributed computing engines:
OLAP, Time Series, Graph, JSON/Doc
Disk-to-memory accelerator
Enterprise-ready
Production-ready, integrated solution
Integration with SAP HANA
Easier to use
Intuitive web interface
One SQL entry point
Open consumption

OLAP on
Hadoop for 360º
view of data Creating business scenarios views:
• Data Browser for viewing and exporting data
• SQL Editor for writing and running SQL scripts
• Modeler to visually create data models with
intuitive web interface

Time series data
analysis across
Big Data
-30
-20
-10
0
10
Temperature °C
Halifax Waterloo
Trend | Cyclical | Seasonal | Random | Exception
Efficiently analyze time series data in
distributed environments:
• Interactive access to standard time series
analysis functions using the well-known SQL
language
• Efficient compression allowing analysis of more
data using less memory
• Build time series models visually using Vora
Data Modeler

Graph engine
to uncover
connected
data
relationships
Native graph processing for:
• Interactive analysis of graphs using graph extension for SQL
• Supports directed and undirected graphs
• Algorithms for pattern matching, shortest path, and connected
components
1:Actor
NAME=‘Brad Pitt’ 4:MOVIE
TITLE=‘Mr. & Mrs. Smith’
YEAR=2005
RATING:6.5
7:DIRECTOR
NAME=‘Doug Liman’
1:Actor
NAME=‘Angelina Jolie’
3:Actor
NAME=‘Shah Rukh Khan’
5:MOVIE
TITLE=‘Kal Ho Naa Ho’
YEAR=2003
RATING=8.1
4:MOVIE
TITLE=‘My Name is Khan’
YEAR=2010
RATING-8.90
Plays in
Plays in
Director

Flexible
storage with
document
store
Support for collection of documents with
different structures:
• Interactive analysis of schema-less JSON data using the
well-known SQL language
• Capability to flexibly add or remove fields from any JSON docs
Document #1
Key: Value
Document #2
{Key: Value, Key:
Value}
Document #3
{Key: Value}
Document #4
Key: {Key: Value,
Key: Value|
Collection
Collection
Collection
Document Store

Big Data is complex
It gets more complicated as you scale

Introducing: SAP Cloud Platform Big Data Services
Fully Managed Big Data Cloud offering for Production Use
Data Centers optimized for Hadoop
Automated Operations Center
Unified Control Plane
Workbench
Business
Analytics
Search &
Discovery
Data
Exploration
Data Science
& Modeling
Custom
Applications
DataTransfer
Portal
ProactiveHelpdesk
SAP Vora

Agenda
• Use-Cases
• What is Vora
• Architecture

“Big Data” Style
 Opportunity Oriented
 Bottom-up Experimentation
 Immediate use and gratification
 Tool proliferation
 “World of Hadoop”
 Hackathons
 Better business
 Open Source
Suit vs. Hoddie
Traditional IM
 Requirements based
 Top-down design
 Integration and re-use
 Technology Consolidation
 World of EDW, CRM, ERP, ECM
 Competence Centers
 Better decisions
 Commercial Software
SAP
Vora

SAP centric viewCOMPUTEConsumeDataStore
GBs - TBs TBs - 10s of TBs 10s of TBs - PBs
In-
Memory
System of Record
HANA / BW/4HANA
Data
Tiering
In-
Memory
Structured data
for fast analytics
Less frequently
accessed,
structured data
Raw data:
semi-structured,
unstructured,
streaming data etc.
Data Lake
On-Premises In the Cloud
Hadoop and Spark
SAP Vora
Next Generation
Data Warehouse
= SCPBDS

Hadoop/Spark centric viewCOMPUTEIngestSourcesConsumeDataStore
GBs - TBs TBs - 10s of TBs 100s of TBs - PBs
Smart Data Streaming Data Services
Log Data Sensors Machine Data
In-
Memory
System of Record
HANA / BW/4HANA
Data
Tiering
In-
Memory
Structured data
for fast analytics
Less frequently
accessed,
structured data
Raw data:
semi-structured,
unstructured,
streaming data etc.
Data Lake
On-Premises In the Cloud
Hadoop and Spark
SAP Vora
Next Generation
Data Warehouse
etc.
etc.
= SCPBDS

InfrastructureFunctions&datatypes&toolsDataaccess
methods
Vora Value in the Hadoop world (what we mean by )
Graph
function
TimeSeries
function
JSONDocu
Store
OLAP
functions/
relational
modeling
Cypher, SQL
browser
SQL editor
Dedicated
infrastructure
Data model Data model Data model Data model
JavaScript
shell
Dedicated
infrastructure
Dedicated
infrastructure
Dedicated
infrastructure
SQL editor
Complex federation
etc.
etc.
etc.
Security Security Security Security
Distributed Transaction Log
Disk-to-Memory Accelerator
Vora Tools
(Data Browser, SQL Editor, OLAP Modeler)
Graph
JSONDoc
Store
Relational
e.g.hierarchies,
curr.conversion
Time
Series
Future
text,spatial,
video,etc.
Integrated Engines
Data Exchange
Security
Shared Hadoop Infrastructure
SAP HANA Vora
etc.
complex custom code

Architecture
• Delivers an in-memory relational engine that
processes Hadoop data stored in HDFS, S3,
parquet, ORC
• Combines multiple processing engines like
Time Series, Graph and Document Store
• Uses Dlog for storing metadata catalog from
Vora and HANA data sources
• Leverages SparkSQL to build native
VoraSQLcontext to provide distributed
processing of relational and other workloads
• Connects via Spark Thriftserver to provide web
based modeling UI to build “olap” models on
Hadoop data

Vora Cluster Manager – nodes and services assignment

Integration with Hadoop Management Tools
• Admin tools like Apache Ambari, are
used to administer and monitor the
Hadoop landscape

Native HANA
Integration
(no need for Spark
adapter)
Improved Usability of
Vora Modeler
Four New Engines:
Disk, Graph, Doc Store,
Time Series
Improved Vora Core:
Stability & enhancement
Key Capabilities of SAP Vora
New with Vora 1.3
Vora Vora Vora
Vora Vora Vora
Vora Vora Vora
Native „Data store“
in Hadoop
Multiple Engines
Relational
(OLAP)
In-
Memor
Time
Series
Doc Store
Graphs
Intuitive Tools
Tight HANA
Integration
0.1sec
∞
HANA
Hadoo
p
New

New in 1.4
• New Product Name "SAP Vora“
• Installation Package
• Supported Platforms (HDP2.5)
• Next-generation in-memory engine featuring SAP
Vora's native distribution technology
• Additional functions in engines
• Avro support
• Data preview

How can you get started with SAP Vora?
1. Blog: https://blogs.sap.com/2016/12/19/a-
look-at-the-sap-hana-vora-1.3-new-analysis-
engines/
2. Developer Community:
https://www.sap.com/developer/topics/hana-
vora.html
Download
and Install
Access From
the Cloud
1. Access from the SAP Cloud Appliance Library
https://www.sap.com/developer/topics/hana-
vora.html
2. Enter credentials
3. Get up and running in the cloud
* Free SAP Vora Developer Edition plus infrastructure cost

More Information!
Technical documentation
https://help.sap.com/viewer/p/SAP_VORA
Developer downloads
https://www.sap.com/developer/topics/vora.html#freetria
Helpful links
https://blogs.sap.com/2017/03/30/useful-sap-vora-links/
CTA

Modernizing Business Processes with Big Data: Real-World Use Cases for Production

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Modernizing Business Processes with Big Data: Real-World Use Cases for Production

Similar to Modernizing Business Processes with Big Data: Real-World Use Cases for Production (20)

More from DataWorks Summit/Hadoop Summit

More from DataWorks Summit/Hadoop Summit (20)

Recently uploaded

Recently uploaded (20)

Modernizing Business Processes with Big Data: Real-World Use Cases for Production