Impact Of Column Oriented Main Memory Databases On Enterprise Applications
Transcript

  • 1. BI123 Impact of Column-Oriented Main-Memory Databases on Enterprise Applications Dr. Alexander Zeier, Matthieu-P. Schapranow, Christian Tinnefeld Hasso Plattner Institute October 15, 2009
  • 2. Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent. © SAP & HPI 2009 / SAP TechEd 09 / Impact of Column-Oriented Main-Memory Databases on Enterprise Applications, Dr. Alexander Zeier, Matthieu-P. Schapranow, Christian Tinnefeld / BI123 / Page 2
  • 3. Agenda
      1. The Hasso Plattner Institute
      2. Technical Foundation of Columnar In-Memory Databases
      3. Impact on Enterprise Applications
  • 5. Key Facts about the Hasso Plattner Institute
      • Founded as a public-private partnership in 1998 in Potsdam near Berlin, Germany
      • Institute belongs to the University of Potsdam
      • Ranked 1st in “CHE”
      • 340 B.Sc. and M.Sc. students
      • 10 professors, 91 PhD students
      • Course of study: IT Systems Engineering
  • 6. Research Group Enterprise Platform & Integration Concepts (Prof. Dr. h.c. Hasso Plattner / Dr. Alexander Zeier)
      • Research focuses on the technical aspects of enterprise software and design of complex applications
          • Memory-Based Data Management for Enterprise Applications
          • Human-Centered Software Design and Engineering
          • Maintenance and Evolution of Service-Oriented Enterprise Software
          • Integration of RFID Technology in Enterprise Platforms
          • Architecture-based Performance Simulation
      • Research co-operations with Stanford, MIT, etc.
      • Industry co-operations with SAP, Siemens, Audi, etc.
      • Partner of Stanford Center for Design Research
      • Partner of MIT in Supply Chain Innovation
  • 7. Agenda
      1. The Hasso Plattner Institute
      2. Technical Foundation of Columnar In-Memory Databases
      3. Impact on Enterprise Applications
  • 8. Two separate worlds: OLTP and OLAP?

      | Aspect               | OLTP                                    | OLAP/DSS                                   |
      | Level of operation   | Full row                                | Selected attributes only                   |
      | Query complexity     | Simple                                  | Complex                                    |
      | Level of detail      | Row-level, e.g. entire customer record  | Column-level, e.g. aggregation or group-by |
      | Dominant operation   | INSERT, UPDATE, and SELECT              | Mainly SELECT                              |
      | Transaction duration | Short running                           | Long running                               |
      | Size of result set   | Small                                   | Large                                      |
      | Query forecast       | Pre-determined                          | Ad hoc                                     |
      | Processing           | Real-time updates                       | Batch updates                              |
  • 10. 3 Aspects for a Hybrid Solution
      • Columnar Storage
          • New database layout accessing only needed portions of data
          • Improve access for subsets of attributes
      • In-Memory
          • Fastest possible data access
          • Spatial proximity
      • Compression
          • Reduce amount of data to fit in main memory
          • Use cache and bus capacities more efficiently
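The columnar-storage point above (reading only the needed portion of the data) can be illustrated with a minimal sketch. The table, attribute names, and values below are made up for illustration and do not come from the presentation.

```python
# Hypothetical three-attribute table; names and values are illustrative only.
rows = [
    {"id": 1, "customer": "A", "amount": 100},
    {"id": 2, "customer": "B", "amount": 250},
    {"id": 3, "customer": "A", "amount": 75},
]

# Row layout: one record per entry, so an aggregate walks every full record.
row_total = sum(record["amount"] for record in rows)

# Column layout: one contiguous list per attribute.
columns = {name: [record[name] for record in rows] for name in rows[0]}

# The same aggregate now reads a single column and never touches the others.
column_total = sum(columns["amount"])

print(row_total, column_total)
```

Both totals are equal; the difference lies in how much data has to be read to compute them, which is where the columnar layout is expected to help for attribute-subset queries.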
  • 11. Columnar Storage: Architecture
      • Claim: Columnar storage is suited for update-intensive applications
  • 12. In-Memory: Aggregate Processing Time
      • The value of an attribute changes by calculation
  • 13. Compression: Types
      • Ordered, few distinct values: sequence of triples (value, offset position, number of occurrences)
      • Ordered, many distinct values: delta representation
      • Unordered, few distinct values: sequence of tuples (value, bitmap for positional occurrence)
      • Unordered, many distinct values: ?
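The three encodings named on this slide can be sketched as follows. This is a minimal illustration under the assumption that a column is a plain Python list; the function names are hypothetical and are not taken from any SAP or HPI code.

```python
def run_length_encode(column):
    """Ordered, few distinct values: (value, offset, occurrences) triples."""
    triples = []
    for position, value in enumerate(column):
        if triples and triples[-1][0] == value:
            last_value, offset, count = triples[-1]
            triples[-1] = (last_value, offset, count + 1)
        else:
            triples.append((value, position, 1))
    return triples


def delta_encode(column):
    """Ordered, many distinct values: first value, then successive differences."""
    if not column:
        return []
    return [column[0]] + [b - a for a, b in zip(column, column[1:])]


def bitmap_encode(column):
    """Unordered, few distinct values: one positional bitmap per value."""
    bitmaps = {}
    for position, value in enumerate(column):
        bitmap = bitmaps.setdefault(value, [0] * len(column))
        bitmap[position] = 1
    return bitmaps


print(run_length_encode(["DE", "DE", "DE", "US", "US"]))
print(delta_encode([100, 103, 104, 110]))
print(bitmap_encode(["DE", "US", "DE"]))
```

The bitmaps are kept as lists of 0 and 1 for readability; a real engine would presumably pack them into machine words.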
  • 14. Scalability: Multiple CPU Cores
      • Set processing is the most frequent access type in enterprise applications (EAs); scan is the dominant pattern
      • Sequential column-wise scans show the best bandwidth utilization between CPU cores and main memory
      • Independence of tuples per column allows:
          • easy partitioning, and
          • parallel processing (see Hennessy [1])
      • Faster memory scans by improved memory bandwidth in next-generation CPUs
      • Neither materialized views nor aggregates: everything is calculated on the fly
      • [1] John L. Hennessy, David A. Patterson: Computer Architecture: A Quantitative Approach
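The partitioning and parallel-processing point can be sketched as a partitioned column scan. This is an illustration only: the helper names are hypothetical, and threads in CPython do not give true CPU parallelism for pure-Python work, so the sketch shows the structure of the approach rather than its speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(column, parts):
    """Split a column into at most `parts` contiguous chunks."""
    size = max(1, -(-len(column) // parts))  # ceiling division, at least 1
    return [column[i:i + size] for i in range(0, len(column), size)]


def parallel_sum(column, parts=4):
    """Aggregate each chunk independently, then combine the partial sums."""
    chunks = partition(column, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(sum, chunks))


amounts = list(range(1, 101))  # made-up column values
print(parallel_sum(amounts))
```

Because the values of one column are independent of each other, the partial sums can be computed in any order and on any core before being combined.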
  • 15. Myth 1: Adapting existing databases leverages column-oriented performance improvement
      • Traditional (diagram layers): Application Cache, Database Cache, Pre-Built Aggregates, Raw Data
      • Column-Oriented:
          • Neither application nor database caches are necessary
          • Redundant data objects are eliminated
          • Neither indices nor aggregates need to be maintained
          • Number of layers is minimized
          • No updates
          • Application logic is adjacent to raw data
          • No database locks required
          • Data movements are minimized
          • Sustain use of existing resources
      • Diagram labels: + Stored Procedures, + Mathematical Algorithms
  • 16. Myth 2: The entire set of business data does not fit into main memory
      • Diagram labels: SCM, SRM, CRM, FI, etc.
      • Use cumulated memory capacity of various blades
      • Only a few columns have many different attribute values
      • Up to ten times higher compression possible
      • Only relevant data in memory
      • Partitioning across hardware
      • Redundancy-free data
  • 17. Myth 3: Update/Insert of Huge Amounts of Data Degrades Columnar Performance
      • Diagram labels: Traditional Storing, Columnar Storing, Updates, Insert, Insert Only
      • Our research activities at the HPI in Potsdam showed:
          • Updates are performed rarely
          • Only very few columns are affected by updates
      • Further insights available at SAP TechEd 2009 HPI booth.
  • 18. Agenda
      1. The Hasso Plattner Institute
      2. Technical Foundation of Columnar In-Memory Databases
      3. Impact on Enterprise Applications
  • 19. Architecture of Existing Financials Systems
  • 20. Architecture of Simplified Financials Systems
      • Only base tables and algorithms
  • 21. Analyzing Real Customer Data

      |       | Customer1 | Customer2 | Customer3 | Customer4 |
      | BKPF  | 23M       | 20M       | 13M       | 122K      |
      | BSEG  | 268M      | 85M       | 28M       | 1M        |
      | Years | 2003-2008 | 2004-2008 | 2003-2007 | 2008/2009 |

      1M records in BSEG ~ 1 GB disk storage
  • 22. Accounting Document Header (figure panels: Customer 1, Customer 2, Customer 3, Customer 4; 99 attributes per customer)
  • 23. Value Updates (figure: percentage of rows updated)
  • 24. Dunning
  • 25. Available to Promise
  • 26. Demand Planning
  • 27. Insert Only
      • Tuple visibility indicated by timestamps (POSTGRES-style time-travel [2])
      • Additional storage requirements can be neglected due to low update frequency
      • Timestamp columns are not compressed to avoid additional merge costs
      • Snapshot isolation
      • Application-level locks
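Timestamp-based tuple visibility in an insert-only store can be sketched as below. This is a simplified illustration, not the authors' implementation: the class name, the logical clock, and the sample keys are assumptions made for the example.

```python
import itertools


class InsertOnlyTable:
    """Hypothetical append-only table: an update is a new version, never an overwrite."""

    def __init__(self):
        self._versions = []               # (key, value, timestamp), in insertion order
        self._clock = itertools.count(1)  # logical timestamps 1, 2, 3, ...

    def write(self, key, value):
        """Append a new version and return its timestamp."""
        timestamp = next(self._clock)
        self._versions.append((key, value, timestamp))
        return timestamp

    def read(self, key, as_of=None):
        """Return the newest value visible at `as_of` (latest if None), else None."""
        result = None
        for stored_key, value, timestamp in self._versions:
            if stored_key == key and (as_of is None or timestamp <= as_of):
                result = value  # later versions overwrite earlier matches
        return result


table = InsertOnlyTable()
first = table.write("doc-1", "open")
table.write("doc-1", "cleared")
print(table.read("doc-1"), table.read("doc-1", as_of=first))
```

A reader that fixes `as_of` at the start of its work keeps seeing the same versions regardless of later inserts, which is the basic idea behind the snapshot isolation bullet.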
  • 28. Memory Consumption
      • Experiments show a general compression factor of 10 (using dictionary compression and bit vector encoding)
      • Removing materialized aggregates saves a further ~2×
      • Keeping only the active partition of the data in memory (based on fiscal year) saves ~5×
      • Next-generation blade servers will allow up to 500 GB RAM
      • Arrays of 100 blades are already available
      • 50 TB of main memory would cover the majority of SAP Business Suite customers
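The figures on this slide can be combined into a back-of-the-envelope calculation. The sketch assumes that the three savings multiply independently and that 1 TB = 1000 GB; neither assumption is stated in the presentation. The record count and the 1 GB per 1M BSEG records ratio are the Customer1 figures from the customer data slide.

```python
# Figures quoted in the slides.
COMPRESSION_FACTOR = 10       # dictionary compression and bit vector encoding
AGGREGATE_REMOVAL_FACTOR = 2  # no materialized aggregates
ACTIVE_PARTITION_FACTOR = 5   # only the active fiscal-year partition in memory
BLADES = 100
RAM_PER_BLADE_GB = 500

# Assumption: the three savings are independent and therefore multiply.
combined_factor = (
    COMPRESSION_FACTOR * AGGREGATE_REMOVAL_FACTOR * ACTIVE_PARTITION_FACTOR
)

# Customer1: 268M BSEG records at roughly 1 GB of disk storage per 1M records.
bseg_on_disk_gb = 268 * 1.0
bseg_in_memory_gb = bseg_on_disk_gb / combined_factor

# Cumulated main memory of the blade array (assuming 1 TB = 1000 GB).
cluster_memory_tb = BLADES * RAM_PER_BLADE_GB / 1000

print(combined_factor, bseg_in_memory_gb, cluster_memory_tb)
```

Under these assumptions the largest BSEG table in the sample shrinks from roughly 268 GB on disk to a few gigabytes in memory, and the blade array provides the 50 TB quoted on the slide.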
  • 29. Impact on Application Development
      • Formalized logic must be moved close to the engine: calculations must take place close to the data
      • Reduction of application code
      • OLTP queries must use minimal projections (SELECT * is not allowed)
      • Caching is no longer necessary
  • 30. Conclusion
      • Technology improvements allow re-thinking of how we build enterprise apps:
          • A combined OLTP and OLAP system can share the same in-memory column store database
          • Our experiments with real applications and data prove it
      • Open research challenges:
          • Disaster recovery, extension for unstructured data, life-cycle-based data management
  • 31. Further Information
      • SAP Public Web:
          • EPIC@HPI: https://epic.hpi.uni-potsdam.de
          • Hasso Plattner Institute: http://www.hpi-web.de
  • 32. Work at the speed of thought: Memory-Resident Technology and the Future of Business
      • Strategy Session and One-on-One Conversations about what faster, more flexible data access could mean to you now and in the future
      • Today | 12:30 – 14:00 | West Meeting Room 103A
  • 33. Thank you! Contact us!
      • Hasso Plattner Institute, EA²L / Enterprise Platform & Integration Concepts, August-Bebel-Str. 88, D-14482 Potsdam, Germany
      • Matthieu-P. Schapranow, matthieu.schapranow@hpi.uni-potsdam.de
      • Responsible: Dr. Alexander Zeier, Deputy Prof. of Prof. Hasso Plattner, zeier@hpi.uni-potsdam.de
  • 34. Feedback
      • Please complete your session evaluation.
      • Be courteous: deposit your trash, and do not take the handouts for the following session.
      • Thank you!