SAP HANA - Big Data and Fast Data

1,646 views
1,464 views

Published on

Slides from a session at GigaCon Big Data conference in Warsaw, Poland on Jan 27, 2014. Updated with the content presented at Big Data Budapest on Aug 18, 2014.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,646
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
78
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

SAP HANA - Big Data and Fast Data

  1. 1. Big Data && Fast Data Vitaliy Rudnytskiy, SAP August 2014, Budapest
  2. 2. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 2 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.
  3. 3. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 3 Let’s start with… … me :) Vitaliy Rudnytskiy @Sygyzmundovych SAP’s Developer Center team - developers.sap.com - Data Management and Analytics Live in Wrocław, Poland
  4. 4. What is … Big Data? 4
  5. 5. What is Big Data? 5 3xV: Volume, Variety, Velocity 4xV: + Veracity 5xV: + Value 6xV: + Vitaliy 
  6. 6. „One Terabyte Club” sounds like terayears ago! 7
  7. 7. Big Data Definitions accordingly to CIOs 8
  8. 8. Instant Access to Data 9
  9. 9. SAP Road to In-Memory Technology 11 Source: “SAP HANA Essentials”, Jeffrey Word
  10. 10. Principle #1 Keep all required data (aka “hot data”) in computers’ main memory Rarely accessed data (aka “cold data”) can be moved to cheaper storage
  11. 11. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 13 Data Access Latencies and Bandwidth Source: Intel
  12. 12. Principle #2 Compress data to minimize the footprint
  13. 13. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 17 Columnar and Row Based Data Storage
  14. 14. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 18 Data Compression in Column Store 1. Dictionary Compression 2. Compression of the Value ID Sequence • Prefix encoding • Run Length encoding • Cluster encoding • Sparse encoding • Indirect encoding 3. Further Compression for Main Dictionary
  15. 15. Principle #3 Cache-sensitive data layout and cache-aware algorithms
  16. 16. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 21 „RAM Locality is King” Source: Microsoft Research: research.microsoft.com/en.../Flash_is_Good.ppt
  17. 17. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 22 Bandwidth between CPU and Main Memory (a) Intel Shared Front Side Bus (UMA Architecture) (b) Intel Quick Path Interconnect (NUMA Architecture) Source: “In-Memory Data Management: An Inflection Point for Enterprise Applications” by Hasso Plattner, Alexander Zeier
  18. 18. Principle #4 Software performance growth is in the parallel processing, not in increasing clock speed
  19. 19. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 25 Examples of Parallelization in a Column Store
  20. 20. Principle #5 Move data-intensive operations to the data layer Integrate data processing engines (relational, text, graph, streaming…) Include built-in libraries (geospecial, predictive, 3rd parties)
  21. 21. Have you seen… 1 Petabyte? 31
  22. 22. The SAP HANA One Petabyte Test 33 Source: http://www.saphana.com/community/blogs/blog/2012/11/12/the-sap-hana-one-petabyte-test
  23. 23. Guinness World Record: 12.1 PB DW 34 Source: http://www.guinnessworldrecords.com/world-records/5000/largest-data-warehouse
  24. 24. Here comes Hadoop 2.8 ZB in 2012 85% from New Data Types 15x Machine Data by 2020 40 ZB by 2020 New Sources (Sentiment, Clickstream, Geo, Sensor) Ref:Hortonworks 37
  25. 25. SAP HANA && Hadoop: Comparison 38Source: „How to Use Hadoop with Your SAP® Software Landscape”, http://www.saphana.com/docs/DOC-3777
  26. 26. Big Data && Fast Data 39 Query FederationTwo-Speed Analytics
  27. 27. Big Data is Strategic to SAP and Partners 40 http://www.cloudera.com/content/cloudera/en/ solutions/partner/SAP.html http://hortonworks.com/partner/sap/
  28. 28. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 41 Planned Innovations Future DirectionToday SAP HANA and Hadoop Product road map overview – key themes and capabilities This is the current state of planning and may be changed by SAP at any time. Integrate:  Usability – Data Virtualization (Smart Data Access ) to Hive via ODBC connectivity – Richer SQL access from SAP HANA studio to Hive Tables  Co-processing – Compute SQL operation between Hive and HANA  Performance – SDA optimizations: function completion – Remote caching of HIVE jobs  Enterprise readiness – Hadoop Distributions Offering from our Partners a. Co-innovation & Reselling with Intel and Hortonworks b. Open for other distro’s using HIVE odbc drivers Optimize:  Usability – Execute custom map reduce methods from SAP HANA and consume the results from HANA queries via SQL User Defined Function (UDF). – SDA support for Data Provisioning for the SAP HANA Service/Adapter Framework  Performance – SDA optimization for connectivity, semi joins & relocation when large amounts of data is queried from SAP HANA and the number of Hadoop nodes are increased – Concurrent execution of HIVE jobs from SAP HANA  Enterprise readiness – * Support for HANA Authorization/ Authentication and encryption capabilities with partner Hadoop Disributions – * Integrate and validate Cloudera CDH-5.1 and Hortonworks HDP-2.0 Hadoop distributions with latest SAP HANA platform – Support Apache Spark SQL via partners distro See Appendix for abbreviations Synthesize:  Usability – Trigger and consume output of PIG jobs from HANA – Support HBase as a store for HANA federated queries  Co-processing – Advanced SDA for Hadoop windowing, table partitioned functions – Hadoop as extended data store for HANA (Big Unified Table) – Tiered storage: hot/cold data controlled by HANA  Performance – Distributed in-memory caching of hot HDFS data – Parallel connections from HANA to Hadoop for loading and queries – Extend HANA compute engine to Hadoop  Enterprise readiness – Integration of Partner Hadoop Distro features into SAP HANA Cockpit – Send Hadoop logs to SAP Solution Manager – HANA snapshot to HDFS with restore – Common authentication and authorization
  29. 29. Want to learn and code in… SAP HANA? 42
  30. 30. 43 Free online education: open.sap.com
  31. 31. 45 Community and developer access: http://developers.sap.com/hana • Free developer editions hosted in the cloud (AWS, Azure, CloudShare…) • How-to guides • Community support • Blogs
  32. 32. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 46 Examples of some cool ideas from the community „HADOOP HDFS Explorer built with HANA XS and SAPUI5” by Aron MacDonald http://scn.sap.com/community/developer-center/hana/blog/2014/07/03/hadoop-hdfs-explorer-built-with-hana-xs-and-sapui5
  33. 33. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 47 Examples of some cool ideas from the community „Experiences with SAP HANA Geo-Spatial Features” by Trinoy Hazarika http://scn.sap.com/community/developer-center/hana/blog/2014/02/25/experiences-with-sap-hana-geo-spatial-features-part-1
  34. 34. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 48 Examples of some cool ideas from the community „Predicting My Next Twitter Follower with SAP HANA PAL” by Lucas Sparvieri *PAL – Predictive Analysis Library http://scn.sap.com/community/developer-center/hana/blog/2013/09/02/predicting-my-next-twitter-follower-with-sap-hana-pal
  35. 35. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 49 Examples of some cool ideas from the community „Detecting World Cup GOAL using Twitter and SAP HANA” by Stevanic Artana http://scn.sap.com/community/developer-center/hana/blog/2014/07/03/goal-detection-using-twitter-and-sap-hana
  36. 36. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 50 Examples of some cool ideas from the community „A Simple Door Monitoring System with HANA XS and Raspberry Pi” by Ferry Gunawan http://scn.sap.com/community/developer-center/hana/blog/2014/07/09/build-a-door-sensor-with-raspberry-pi-and-hana
  37. 37. 1000+ Startups working with us 51More info and how to join: http://startups.saphana.com
  38. 38. Semantic Vision from Czech Rep. 52 More: http://www.saphana.com/community/learn/startups/news-views/blog/2013/10/04/beyond-the-big-hype- using-semantic-search-to-predict-the-way-the-universe-will-behave
  39. 39. Startups in SAP Startup Forum Program 53 2/3rds in these 4 categories Area of Startup Focus % of Startups Visualization / BI / Market Insight 25% Social Media / Collaboration / Gaming 17% Predictive Analytics / Complex Analytics 13% Sensor Network Data / Internet of Things 10% Geospatial / Geo-Location / 3D Analysis 8% Primarily a Mobile Solution 7% Big Data Infrastructure / Appliance + SAP HANA 6% Enterprise Process Acceleration 6% Extension of specific SAP Solutions and SAP Expertise 4% Energy Management / Sustainability 3%
  40. 40. 54 Community and developer access: developers.sap.com Contact information: Vitaliy Rudnytskiy vitaliy.rudnytskiy@sap.com @Sygyzmundovych
  41. 41. © 2013 SAP AG or an SAP affiliate company. All rights reserved. 55 © 2013 SAP AG or an SAP affiliate company. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for additional trademark information and notices.

×