SAP World Tour 2010: Impact of Column-Oriented Main-Memory Databases on Enterprise Applications

  • Moore’s Law: “…number of transistors … doubling approximately every two years”. CPU frequency hit its limit in 2002, but Moore’s law still holds today. How? Multi-core and parallelization.
  • Select required attributes only
  • X: number of aggregates; Y: logarithmic time required for aggregate calculation
  • Ordered/few: tariff rates; unordered/few: sex; ordered/distinct: temperature values
  • Partitioning!
  • Remove data redundancy
  • Partitioning!
  • Insert-Only
  • Stress on: materialized aggregates, materialized views, indices, redundant data in cubes, change history, …
  • Analysis of accounting tables: BKPF = accounting document headers; BSEG = accounting document line items
  • Many SELECTs, e.g. the dunning run (sometimes very complex): a row-oriented, relational programming pattern. Selecting via attributes (column-wise), compared to OLAP, needs a rewrite!
  • What about rescheduling for a high-priority customer? Today, manual rescheduling is necessary. Thanks to main memory, recalculating every time becomes possible, which allows rescheduling on demand. ATP can be combined with pricing: e.g. when a customer asks for a certain price per product, you can name a shipping date (copper, metals, oil, etc.).
  • Aggregates narrow your flexibility; interactive planning

    1. Impact of Column-Oriented Main-Memory Databases on Enterprise Applications
        Dr. Alexander Zeier, Matthieu-P. Schapranow, Christian Tinnefeld
        Hasso Plattner Institute
        March 02, 2010
    2. Disclaimer
        This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.
        © HPI & SAP 2010 / SAP World Tour 10 / Impact of Column-Oriented Main-Memory Databases on Enterprise Applications, Dr. Alexander Zeier, Matthieu-P. Schapranow, Christian Tinnefeld / Page 2
    3. Agenda
        The Hasso Plattner Institute
        Technical Foundation of Columnar In-Memory Databases
        Impact on Enterprise Applications
    4. Agenda
        The Hasso Plattner Institute
        Technical Foundation of Columnar In-Memory Databases
        Impact on Enterprise Applications
    5. Key Facts about the Hasso Plattner Institute
        Founded as a public-private partnership in 1998 in Potsdam near Berlin, Germany
        Institute belongs to the University of Potsdam
        Ranked 1st in “CHE”
        340 B.Sc. and M.Sc. students
        10 professors, 91 PhD students
        Course of study: IT Systems Engineering
    6. Prof. Dr. h.c. Hasso Plattner / Dr. Alexander Zeier
        Research focuses on the technical aspects of enterprise software and design of complex applications
          • Memory-Based Data Management for Enterprise Applications
          • Human-Centered Software Design and Engineering
          • Maintenance and Evolution of Service-Oriented Enterprise Software
          • Integration of RFID Technology in Enterprise Platforms
          • Architecture-based Performance Simulation
        Research co-operations with Stanford, MIT, etc.
        Industry co-operations with SAP, Siemens, Audi, etc.
        Research Group Enterprise Platform & Integration Concepts
        Partner of Stanford Center for Design Research
        Partner of MIT in Supply Chain Innovation
    7. Agenda
        The Hasso Plattner Institute
        Technical Foundation of Columnar In-Memory Databases
        Impact on Enterprise Applications
    8. Two separate worlds: OLTP and OLAP?
    9. Two separate worlds: OLTP and OLAP?
    10. Dominant Hardware Trends
        Multi-Core Technology
          • Moore’s Law: “…number of transistors … doubling approximately every two years”
          • CPU frequency hit its limit in 2002, but Moore’s law holds today
        In-Memory Technology
          • Increased size: up to 2 TB of main memory on one main board in 2010
          • Constantly dropping costs
    11. 3 Aspects for a Hybrid Solution
        Columnar Storage
          • New database layout accessing only needed portions of data
          • Improve access for subsets of attributes
        In-Memory
          • Fastest possible data access
          • Spatial proximity
        Compression
          • Reduce amount of data to fit in main memory
          • Use cache and bus capacities more efficiently
    12. Storages: Row vs. Column
        Row Store
        Column Store
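The difference between the two layouts can be sketched roughly as follows. This is a minimal illustration, assuming a hypothetical three-row table; the attribute names and values are invented for the example and do not come from the slides.

```python
# Hypothetical sample table; names and values are illustrative only.
rows = [
    {"doc_id": 1, "customer": "A", "amount": 100.0},
    {"doc_id": 2, "customer": "B", "amount": 250.0},
    {"doc_id": 3, "customer": "A", "amount": 75.0},
]

# Row store: all attributes of one tuple sit next to each other.
row_store = [tuple(r.values()) for r in rows]

# Column store: all values of one attribute sit next to each other.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount_row(store):
    # Every tuple is visited in full, although only one attribute is needed.
    return sum(t[2] for t in store)


def total_amount_column(store):
    # Only the "amount" column is read.
    return sum(store["amount"])
```

Both functions return the same total; the point of the sketch is that the column variant reads one contiguous list, while the row variant walks over every tuple.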
    13. Columnar Storage: Architecture
        Claim: Columnar storage is suited for update-intensive applications
    14. In-Memory: Aggregate Processing Time
        The value of an attribute changes by calculation
    15. Compression: Types
    16. Compression: Advantages of Columnar Storages
        Dictionaries
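Dictionary compression of a single column might look like the sketch below. It assumes a sorted dictionary of distinct values and fixed-width value IDs; the helper names and the sample column are hypothetical, and the slides do not specify the actual encoding used.

```python
import math


def dictionary_encode(column):
    """Replace each value by its position in a sorted dictionary of distinct values."""
    dictionary = sorted(set(column))
    position = {value: i for i, value in enumerate(dictionary)}
    value_ids = [position[v] for v in column]
    return dictionary, value_ids


def bits_per_value(dictionary):
    """Bits needed to address one dictionary entry (at least 1)."""
    return max(1, math.ceil(math.log2(len(dictionary))))


def dictionary_decode(dictionary, value_ids):
    """Restore the original column from dictionary and value IDs."""
    return [dictionary[i] for i in value_ids]


# Hypothetical low-cardinality column.
country = ["DE", "US", "DE", "FR", "DE", "US"]
dictionary, value_ids = dictionary_encode(country)
```

With three distinct values, each entry needs only 2 bits instead of a full string, which is why columns with few distinct values compress well.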
    17. Scalability: Multiple CPU Cores
        Set processing is the most frequent access type in EAs (scan is the dominant pattern)
        Sequential column-wise scans show best bandwidth utilization between CPU cores and main memory
        Independence of tuples per column allows:
          • easy partitioning, and
          • parallel processing (see Hennessy [1])
        Faster memory scans by improved memory bandwidth in next generation CPUs
        Neither materialized views nor aggregates: everything is calculated on-the-fly
        [1] John L. Hennessy, David A. Patterson: Computer Architecture: A Quantitative Approach
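The partition-then-scan pattern can be sketched as below. This is an illustration under stated assumptions, not the system described in the talk: the function names and the sample column are hypothetical, and Python threads only show the structure, since CPython's global interpreter lock limits real speed-up for pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(column, parts):
    """Split a column into contiguous chunks of near-equal size."""
    size = max(1, -(-len(column) // parts))  # ceiling division, never zero
    return [column[i:i + size] for i in range(0, len(column), size)]


def parallel_sum(column, workers=4):
    """Scan each partition independently, then combine the partial results."""
    chunks = partition(column, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


amounts = list(range(1, 1001))  # hypothetical column values
total = parallel_sum(amounts)   # aggregate computed on the fly, not materialized
```

Because the values of one column are independent of each other, each chunk can be scanned without coordination, and only the small partial results need to be merged.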
    18. Myth 1: Adapting existing databases leverages column-oriented performance improvement
        Column-Oriented
        Traditional
          • Neither application nor database caches are necessary
          • Redundant data objects are eliminated
          • Neither indices nor aggregates need to be maintained
          • Number of layers is minimized
          • No updates
          • Application logic is adjacent to raw data
          • No database locks required
          • Data movements are minimized
          • Sustain use of existing resources
        Application Cache
        Database Cache
        Pre-Built Aggregates
        Raw Data
        + Stored Procedures
        + Mathematical Algorithms
    27. Myth 2: The entire set of business data does not fit into main memory
        SRM
        SCM
        etc.
        CRM
        FI
        Use cumulated memory capacity of various blades
          • Only relevant data in memory
          • Partitioning across hardware
          • Redundancy-free data
          • Only few columns have many different attribute values
          • Up to ten times higher compression possible
    31. Myth 3: Update/Insert of Huge Amounts of Data Degrades Columnar Performance
        Columnar Storing
        Traditional Storing
        Updates
        Insert
        Insert Only
        Our research activities at the HPI in Potsdam showed:
          • Updates are performed rarely
          • Only very few columns are affected by updates
        Further insights available at SAP World Tour 2010 HPI booth 1.19.
    34. Agenda
        The Hasso Plattner Institute
        Technical Foundation of Columnar In-Memory Databases
        Impact on Enterprise Applications
    35. Architecture of Existing Financials Systems
    36. Architecture of Simplified Financials Systems
        Only base tables, algorithms, and some indices
    37. Analyzing Real Customer Data
        1M records in BSEG ~ 1 GB disk storage
    38. Results: Distinct Values per Attribute
        Results on analyzing Financials
        Distinct values in accounting document headers (99 attributes)
        CPG
        Logistics
        Banking
        High Tech
        Discrete Manufacturing
    39. Results: Accounting Document Updates
        Percentage of rows updated
    40. Dunning
    41. Available to Promise
    42. Demand Planning
    43. Insert Only
          • Tuple visibility indicated by timestamps (POSTGRES-style time-travel [2])
          • Additional storage requirements can be neglected due to low update frequency
          • Timestamp columns are not compressed to avoid additional merge costs
          • Snapshot isolation
          • Application-level locks
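The idea of insert-only storage with timestamp-based visibility can be sketched as follows. This is a toy model under loud assumptions: a logical counter stands in for real timestamps, each version carries a single valid-from stamp, and the class and method names are hypothetical. It is not the POSTGRES or HPI implementation.

```python
import itertools


class InsertOnlyTable:
    """Toy insert-only table: an update appends a new version, nothing is overwritten."""

    def __init__(self):
        self._clock = itertools.count(1)  # logical timestamps (assumption)
        self._versions = []               # list of (key, value, valid_from)

    def put(self, key, value):
        """Append a new version of key and return its timestamp."""
        ts = next(self._clock)
        self._versions.append((key, value, ts))
        return ts

    def get(self, key, as_of=None):
        """Return the newest version of key that is visible at timestamp as_of."""
        visible = [
            (ts, value)
            for k, value, ts in self._versions
            if k == key and (as_of is None or ts <= as_of)
        ]
        return max(visible)[1] if visible else None


table = InsertOnlyTable()
t1 = table.put("doc-1", "open")
t2 = table.put("doc-1", "cleared")
```

Reading with `as_of=t1` still returns the earlier value, which is the time-travel property; readers never block writers because old versions stay in place.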
    44. Memory Consumption
          • Experiments show a general factor of 10 in compression (using dictionary compression and bit vector encoding)
          • Additional storage savings by removing materialized aggregates, save ~2×
          • Keep only the active partition of the data in memory (based on fiscal year), save ~5×
          • Next generation blade servers will allow up to 512 GB RAM
          • Arrays of 100 blades already available
          • 50 TB of main memory would cover the majority of SAP Business Suite customers
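The arithmetic behind these figures can be checked with a few lines. The input numbers are taken from the slide; treating the three savings as independent factors that multiply is an assumption of this sketch, as is the convention 1 TB = 1024 GB.

```python
# Figures taken from the slide.
compression_factor = 10        # dictionary compression and bit vector encoding
aggregate_savings = 2          # removing materialized aggregates
active_partition_savings = 5   # keeping only the active fiscal-year partition

# Assumption: the three savings are independent and therefore multiply.
combined_reduction = compression_factor * aggregate_savings * active_partition_savings

blades = 100
ram_per_blade_gb = 512
total_ram_tb = blades * ram_per_blade_gb / 1024  # assuming 1 TB = 1024 GB
```

Under these assumptions the combined reduction is a factor of 100, and 100 blades with 512 GB each give the 50 TB named on the slide.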
    45. Impact on Application Development
          • Formalized logic must be moved close to the engine
          • Calculations must take place close to the data
          • Reduction of application code
          • OLTP queries must use minimal projections (SELECT * is not allowed)
          • No caching necessary anymore
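Why minimal projections matter in a column store can be illustrated with a small sketch. The table, the attribute names, and the `select` helper are hypothetical; the second return value simply counts how many columns had to be accessed to rebuild one tuple.

```python
# Hypothetical column store with five attributes.
column_store = {
    "doc_id":   [1, 2, 3],
    "customer": ["A", "B", "A"],
    "amount":   [100.0, 250.0, 75.0],
    "currency": ["EUR", "USD", "EUR"],
    "status":   ["open", "open", "cleared"],
}


def select(store, row_index, attributes=None):
    """Rebuild one tuple; attributes=None plays the role of SELECT *."""
    names = list(store) if attributes is None else attributes
    row = {name: store[name][row_index] for name in names}
    return row, len(names)  # second value: number of columns accessed


full_row, full_cost = select(column_store, 1)
amount_only, amount_cost = select(column_store, 1, ["amount"])
```

Rebuilding the full tuple touches all five columns, while the narrow projection touches one. On a table with hundreds of attributes the gap grows accordingly.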
    46. Conclusion
        Technology improvements allow re-thinking of how we build enterprise apps:
          • A combined OLTP and OLAP system can share the same in-memory column store database
          • Our experiments with real applications and data prove it
        Open research challenges:
          • Disaster recovery, extension for unstructured data, life cycle based data management
    47. Further Information
        SAP Public Web:
        EPIC@HPI: https://epic.hpi.uni-potsdam.de
        Hasso Plattner Institute: http://www.hpi-web.de
    48. Thank you! Contact us!
        Hasso Plattner Institute
        EA²L / Enterprise Platform & Integration Concepts
        Matthieu-P. Schapranow
        August-Bebel-Str. 88
        D-14482 Potsdam, Germany
        matthieu.schapranow@hpi.uni-potsdam.de
        Responsible: Deputy Prof. of Prof. Hasso Plattner
        Dr. Alexander Zeier
        zeier@hpi.uni-potsdam.de
    49. Feedback
        Please complete your session evaluation.
        Be courteous: deposit your trash, and do not take the handouts for the following session.
        Thank you!
