Spark Usage in Enterprise Business Operations

At Spark Summit East 2016, SAP’s Ken Tsai highlighted how SAP HANA Vora extends Apache Spark to provide OLAP modeling capabilities and real-time query federation to enterprise data. You will learn about real-world use cases where instant insight from a combination of enterprise and Hadoop data makes an impact on everyday business operations.

Published in: Data & Analytics

  1. Spark Usage in Enterprise Business Operations. Ken Tsai, VP, Data Management & Platform-as-Services, SAP (@kentsaiSAP). 2.17.16: Spark Summit, NYC
  2. © 2016 SAP SE or an SAP affiliate company. All rights reserved. Spark Summit New York, 2.17.16. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
  3. SAP – Our Quick Snapshot in the Enterprise Computing World. 74% of the world’s transaction revenue touches an SAP system. SAP’s product focus: Enterprise Applications; Business Networks; Platforms (15 yrs on IMC). SAP customers represent 87% of Forbes Global 2,000 companies. SAP touches $16 trillion of world consumer purchases.
  4. SAP HANA – An In-Memory Platform to Enable New Business Scenarios Previously Not Feasible. [Diagram: BKPF and BSEG tables; no indices, no aggregates, no redundancies; core data structure remains unchanged.] • Soft financial close anytime • Real-time revenue and cost analysis • Real-time liquidity forecasts • Real-time alerts and blocks on suspicious transactions
  5. Distributed Big Data Is Everywhere: How to better use it in core enterprise business applications? ~79% of data reservoirs/lakes are still disconnected from core business operations. How do I embed big data signal into my business applications and enterprise analytics? Survey findings: 53%: Difficulty integrating with CRM and/or other systems; 49%: Unable to apply or integrate external data quickly enough to inform real-time decision making; 59%: Only a few analysts with specialized training can analyze big data. Source: Harvard Business Review Analytic Services, Global Survey of 251 Respondents, Sept. 2015
  6. Introducing SAP HANA Vora: an in-memory query engine that extends the Apache Spark execution framework to enrich the interactive analytics experiences on massively distributed computing clusters. • OLAP processing • In-memory computing for high performance • Connecting to enterprise systems • Unified system management [Diagram labels: SAP HANA, ERP data, big data, parallelized queries, Vora.]
  7. Key Open Source Contribution to Apache Spark Ecosystem: Spark to HANA Push-downs & Data Hierarchies

     Data hierarchies:

         scala> val hierarchy = sqlContext.sql(
           s"""
             SELECT LVL, COUNT(*), ROUND(AVG(P_RETAILPRICE), 2)
             FROM (
               SELECT LEVEL(node) AS LVL, P_RETAILPRICE
               FROM HIERARCHY(
                 USING PART_HIERARCHY AS c
                 JOIN PARENT p ON c.P_PARENT = p.P_PARTKEY
                 SEARCH BY P_PARTKEY ASC
                 START WHERE P_PARTKEY = 1
                 SET node
               ) AS H0
             ) T1
             GROUP BY LVL
           """.stripMargin
         ).collect().foreach(println)

     [Diagram: part hierarchy tree with node values 901, 903, 904, 911, 912, 913.]

         +-----+-----+------------------+
         |LEVEL|COUNT|AVG(P_RETAILPRICE)|
         +-----+-----+------------------+
         |    0|    1|               901|
         |    1|    2|             903.5|
         |    2|    3|               912|
         +-----+-----+------------------+

     Spark to HANA push-downs:

         val options = Map(
           "dbschema" -> config.user,
           "host"     -> config.host,
           "instance" -> config.instance)

         // HANA Live CustomerBasicData Virtual Data Model
         val custConf = options + ("path" -> s"""sap.hba.ecc/CustomerBasicData""")
         val cust = sqlContext.read.format("com.sap.spark.hana").options(custConf).load()
         cust.registerTempTable("customer")

         // HANA Live SalesOrderHeader VDM
         val sohConf = options + ("path" -> s"""sap.hba.ecc/SalesOrderHeader""")
         val soh = sqlContext.read.format("com.sap.spark.hana").options(sohConf).load()
         // Registered as "salesOrder" because the query below refers to that name
         soh.registerTempTable("salesOrder")

         // Top 5 Countries by Sales Order Volume
         val salesOrder = sqlContext.sql(
           """select Country, count(*) as Frequency
              from salesOrder as s
              LEFT OUTER JOIN customer as c on s.soldToParty = c.Customer
              GROUP BY Country
              ORDER BY Frequency desc""")
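To make the level-wise aggregation in the hierarchy query on slide 7 concrete, here is a minimal sketch in plain Scala collections, with no Spark or Vora involved. The `Part` rows, their keys, and their parent links are hypothetical: they were chosen only so that the per-level counts and averages match the result table on the slide. The sketch assumes the parent/child data is acyclic.

```scala
// Hypothetical stand-in for PART_HIERARCHY; keys and parent links are invented.
final case class Part(partKey: Int, parent: Option[Int], retailPrice: Double)

object HierarchyLevels {
  val parts: Seq[Part] = Seq(
    Part(1, None, 901.0),
    Part(2, Some(1), 903.0),
    Part(3, Some(1), 904.0),
    Part(4, Some(2), 911.0),
    Part(5, Some(2), 912.0),
    Part(6, Some(3), 913.0)
  )

  /** Walks the tree breadth-first from `rootKey`, returning (level, part) pairs. */
  def levels(rows: Seq[Part], rootKey: Int): Seq[(Int, Part)] = {
    // Index children by their parent key (root rows have no parent).
    val childrenOf: Map[Int, Seq[Part]] =
      rows.filter(_.parent.isDefined).groupBy(_.parent.get)

    @annotation.tailrec
    def loop(frontier: Seq[Part], level: Int, acc: Seq[(Int, Part)]): Seq[(Int, Part)] =
      if (frontier.isEmpty) acc
      else {
        val next = frontier.flatMap(p => childrenOf.getOrElse(p.partKey, Seq.empty))
        loop(next, level + 1, acc ++ frontier.map(p => (level, p)))
      }

    loop(rows.filter(_.partKey == rootKey), 0, Seq.empty)
  }

  /** Per level: (level, node count, average price rounded to 2 decimals). */
  def summary(rows: Seq[Part], rootKey: Int): Seq[(Int, Int, Double)] =
    levels(rows, rootKey)
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1)
      .map { case (lvl, ps) =>
        val avg = ps.map(_._2.retailPrice).sum / ps.size
        (lvl, ps.size, math.round(avg * 100) / 100.0)
      }
}
```

Calling `HierarchyLevels.summary(HierarchyLevels.parts, 1)` yields one tuple per level, mirroring the `GROUP BY LVL` step of the query.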
  8. Airline Use Case – Optimize MRO Scheduling with Sensor Data. Challenges: • $10,000 loss for every hour spent on maintenance, repair, and overhaul (MRO) • Predictive MRO generates TB of sensor data per flight. Solution: • SAP HANA Vora rapidly processes sensor data in HDFS and combines it with flight schedule and staffing data in SAP HANA to prioritize maintenance jobs and accelerate MRO. Why SAP HANA Vora: • Optimize MRO operations with interactive, on-demand drill-down by airport, flight route, etc.
  9. Utility Use Case – CenterPoint Energy. Challenge: • Smart meters generate TBs of data/month • Regulatory requirement to retain data for 10 years • Current storage solution full by end-2016 • Need to leverage HDFS as an additional tier for storage. Solution: • SAP HANA for most recent sensor signal and operational data, Dynamic Tiering for 1–2 year old data, HDFS for historical sensor data • SAP HANA Vora accesses and queries data across all tiers. Why SAP HANA Vora: • SAP HANA Vora provides an enterprise analytics and OLAP-like experience across data warehouse and HDFS.
  10. Utility Use Case – How It Works: CenterPoint Energy. “Our benchmark tests proved that SAP HANA paired with SAP HANA Vora are the right solutions for us. We expect immediate cost benefits and to see competitive differentiation in the future.” Gary Hayes, CIO & SVP at CenterPoint Energy. [Diagram: SAP HANA (most recent sensor data), Dynamic Tiering (1–2 yr old data), HDFS (historical sensor data); parallelized queries; query data within and across tiers.]
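The three-tier layout in the utility use case can be pictured as a routing decision: given the date range of a query, which tiers have to be touched? The sketch below is a plain-Scala illustration only. The tier names, the one-year and two-year cut-offs, and the `TierRouter` helper are all assumptions made for the example; the slides say only "most recent", "1–2 yr old", and "historical", and do not describe how Vora itself plans such queries.

```scala
import java.time.LocalDate

sealed trait Tier
case object HanaInMemory extends Tier   // most recent sensor data
case object DynamicTiering extends Tier // roughly 1-2 year old data
case object Hdfs extends Tier           // historical sensor data

object TierRouter {
  // Hypothetical cut-offs; not taken from the talk.
  val hotYears = 1
  val warmYears = 2

  // Tiers ordered from oldest data to newest data.
  val order: Vector[Tier] = Vector(Hdfs, DynamicTiering, HanaInMemory)

  /** Tier that holds a reading taken on `readingDate`, as seen from `today`. */
  def tierFor(readingDate: LocalDate, today: LocalDate): Tier =
    if (readingDate.isAfter(today.minusYears(hotYears))) HanaInMemory
    else if (readingDate.isAfter(today.minusYears(warmYears))) DynamicTiering
    else Hdfs

  /** All tiers a query over the inclusive range [from, to] has to read. */
  def tiersFor(from: LocalDate, to: LocalDate, today: LocalDate): Set[Tier] = {
    require(!to.isBefore(from), "range end must not precede range start")
    val lo = order.indexOf(tierFor(from, today))
    val hi = order.indexOf(tierFor(to, today))
    order.slice(lo, hi + 1).toSet
  }
}
```

Because a reading's tier depends only on its age, any date range maps to a contiguous run of tiers, which is why a slice over the ordered tier list is enough here.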
  11. Financial Services Use Case – Extend Fraud Pattern Detection. Challenges: • 100+ million business transactions daily, 25% growth YoY • Limited access to archived data • Difficult to detect patterns in historical transactions. Solution: • Current transactions in SAP HANA, historical transactions in HDFS clusters • Real-time detection of abnormalities. Why SAP HANA Vora: • Real-time, aggregated insights from current and historical transactions
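As a rough illustration of why historical transactions matter for spotting abnormalities, the following plain-Scala sketch builds a per-account baseline from historical data and flags current transactions that sit far above it. This is a simple mean-plus-k-standard-deviations threshold chosen for the example, not the detection method from the talk; the `Txn` type, the `FraudSketch` helper, and the default `k = 3.0` are all hypothetical.

```scala
final case class Txn(account: String, amount: Double)

object FraudSketch {
  final case class Baseline(mean: Double, stdDev: Double)

  /** Per-account mean and population standard deviation of historical amounts. */
  def baselines(historical: Seq[Txn]): Map[String, Baseline] =
    historical.groupBy(_.account).map { case (acct, txns) =>
      val amounts = txns.map(_.amount)
      val mean = amounts.sum / amounts.size
      val variance = amounts.map(a => math.pow(a - mean, 2)).sum / amounts.size
      acct -> Baseline(mean, math.sqrt(variance))
    }

  /** Current transactions whose amount exceeds mean + k * stdDev for their account.
    * Accounts with no history are skipped rather than flagged. */
  def flag(current: Seq[Txn], base: Map[String, Baseline], k: Double = 3.0): Seq[Txn] =
    current.filter { t =>
      base.get(t.account).exists(b => t.amount > b.mean + k * b.stdDev)
    }
}
```

In the setting the slide describes, the historical side would come from HDFS and the current side from SAP HANA; here both are ordinary in-memory sequences so the logic can be read in isolation.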
  12. 2016 and the Road Ahead. Customers in North America, APJ, and EMEA; Dev edition available on AWS (TODAY). General Availability; Vora Modeler to build and query OLAP-style cubes on data (COMING SOON). Planning (HR, Financial); extend engine support for time series; transaction management; analytics on archived ERP data in Hadoop (FUTURE).
  13. Contribute to Spark Ecosystem, Embrace Best of Community Innovation. Contribution to open source: hierarchy capabilities. Connection to ERP: predicate pushdown to HANA. On-the-market solution: SAP HANA Vora.
  14. Thank you! Ken Tsai: ken.tsai@sap.com, @kentsaiSAP. Enter to win a GoPro HERO4 Session at SAP Booth 102. Learn more @ hana.sap.com/vora | Try Dev Edition: bit.ly/1K1qLyo | We’re hiring: https://spark-summit.org/east-2016/jobs/
