IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT: SAP HANA DATABASE

  1. IN-MEMORY DATABASE SYSTEMS FOR BIG DATA MANAGEMENT: SAP HANA DATABASE
     Presented by: George Joseph, S7 CS Alpha, Roll No. 39, RSET, Kerala
  2. AGENDA
     • Revisiting traditional RDBMS
     • Defining IMDB
     • A look at a few IMDB products on the market
     • SAP HANA database in detail
  3. What is a database?
     • An organised collection of information.
     • Allows reading and writing.
     • Provides authorisation and authentication.
     • Provides some level of data safety.
  4. Traditional RDBMS
     • The relational model was proposed by E. F. Codd in 1970.
     • The model is based on tables, rows, and columns, and on the manipulation of the data stored in them.
     • A relational database is the collection of all these tables.
     • Examples: Oracle, MySQL, and Microsoft Access.
  6. Data store for a typical RDBMS
     • Data resides on disk.
     • Data may be cached in memory for faster access.
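The disk-plus-cache arrangement on this slide can be sketched in a few lines of Python. This is only an illustration: the `BufferCache` class, the least-recently-used eviction policy, and the dict standing in for the disk are all assumptions, not details from the slides.

```python
from collections import OrderedDict


class BufferCache:
    """Tiny LRU page cache in front of a slow 'disk' (here, a plain dict)."""

    def __init__(self, disk, capacity):
        self.disk = disk            # hypothetical stand-in for on-disk pages
        self.capacity = capacity    # number of pages that fit in memory
        self.pages = OrderedDict()  # cached pages, least recently used first
        self.disk_reads = 0         # counts the expensive accesses

    def read(self, page_id):
        if page_id in self.pages:
            # Cache hit: served from memory, mark as most recently used.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Cache miss: go to disk, then keep the page in memory.
        self.disk_reads += 1
        data = self.disk[page_id]
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return data


disk = {i: f"page-{i}" for i in range(10)}
cache = BufferCache(disk, capacity=2)
for page_id in [0, 1, 0, 2, 0, 1]:
    cache.read(page_id)
print("disk reads:", cache.disk_reads)
```

Every miss pays the disk latency, which is the cost an in-memory database tries to avoid by keeping the whole data set in RAM.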
  7. PROBLEM
     • Existing disk-based systems can no longer offer timely responses because of the high access latency of hard disks.
     • This unacceptable performance is an obstacle to meaningful real-time services.
     • Examples: real-time bidding, advertising, social gaming, stock markets.
  8. “Memory is the new disk, disk is the new tape.”
     Jim Gray, computer scientist and contributor to IBM System R
  9. Hardware Advances: Moore’s Law - DRAM Pricing (© 2013 SAP AG. All rights reserved. Public)
     • 1980: Memory $10,000/MB
     • 2000: Memory $1/MB
     • 2013: Memory $0.004/MB
     [Figure: memory cost / speed over time]
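To put these per-megabyte prices in perspective, the short calculation below works out what 1 TB of DRAM would cost at each price point. The prices come from the slide; the 1 TB target size and the binary convention (1 TB = 1024 × 1024 MB) are assumptions chosen for illustration.

```python
# Prices per MB are the slide's figures; the 1 TB target is an assumed example.
PRICE_PER_MB = {1980: 10_000.0, 2000: 1.0, 2013: 0.004}
MB_PER_TB = 1024 * 1024  # binary convention, assumed


def cost_of_one_tb(price_per_mb: float) -> float:
    """Return the cost in dollars of 1 TB of memory at the given price per MB."""
    return price_per_mb * MB_PER_TB


for year, price in PRICE_PER_MB.items():
    print(f"{year}: ${cost_of_one_tb(price):,.2f} per TB")
```

At the 2013 price, a terabyte comes to roughly $4,200, against about $10 billion at the 1980 price, which is why holding an entire database in main memory became practical.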
  10. Hardware Advances: Moore’s Law - CPUs
     • 2002: 1 core; 32 bits; 4MB
     • 2007: 2 cores; 2 CPUs per server; external controllers
     • 2010: 8 cores, 16 threads per CPU; 4 CPUs per server; on-chip memory control; quick interconnect; VM and vector support; 64 bits; 256 GB - 1 TB
     • 2013: more cores, bigger caches; 16 ... 64 CPUs per server; greater on-chip integration (PCIe, network, ...); data-direct I/O; tens of TBs
     Images: Intel, Danilo Rizzuti / FreeDigitalPhotos.net
  11. IN-MEMORY DATABASE SYSTEMS
     • In an in-memory database, data resides permanently in main memory.
     • Source data is loaded into system memory in a compressed, non-relational format.
     • Only a backup copy is kept on disk.
     • Memory-optimised data structures are used.
  12. Disk vs. Memory
     • Access times for main memory are orders of magnitude lower than for disk.
     • Main memory is normally volatile, while disk storage is not.
     • Data layout is much more critical on disk than in main memory, because sequential disk access is far cheaper than random access.
  13. MMDB PRODUCTS AVAILABLE
  15. SAP HANA
     • SAP HANA is a leading IMDB system. It is also a platform for big data processing, analysis, and prediction.
     • SAP HANA helps businesses build real-time applications and analytics that speed up their processes.
  16. In-Memory, Column Database, Massively Parallel Processing, Optimized Calculation Engine
     • Columnar storage increases the amount of data that can be stored in limited memory (compared to disk).
     • Column databases enable easier parallelization of queries.
     • Row buffer: fast transactional processing.
     • In-memory processing gives more time for relatively slow updates to column data.
     • In-memory allows sophisticated calculations in real time.
     • MPP-optimized software enables linear performance scaling, making sophisticated calculations like allocations possible.
     • Each technology works well on its own, but combining them all is the real opportunity: it provides all of the upside benefits while mitigating the downsides.
     • SAP in-memory innovations make the “New Way” a reality.
  17. SAP HANA: Column Store

     Order   Country   Product   Sales
     456     France    corn      1000
     457     Italy     wheat     900
     458     Italy     corn      600
     459     Spain     rice      800

     • Typical database (row order): 456 France corn 1000 | 457 Italy wheat 900 | 458 Italy corn 600 | 459 Spain rice 800
     • SAP HANA (column order): 456 457 458 459 | France Italy Italy Spain | corn wheat corn rice | 1000 900 600 800
     • Example query: SELECT Country, SUM(sales) FROM SalesOrders WHERE Product = 'corn' GROUP BY Country
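The difference between the two layouts can be sketched in Python using the slide's example table. This is a simplified illustration, not how HANA is implemented: the function name `sales_by_country` and the use of plain Python lists for columns are assumptions.

```python
from collections import defaultdict

# Row layout: one tuple per order (values from the slide's example table).
rows = [
    (456, "France", "corn", 1000),
    (457, "Italy", "wheat", 900),
    (458, "Italy", "corn", 600),
    (459, "Spain", "rice", 800),
]

# Column layout: one list per attribute; index i in every list is the same row.
columns = {
    "Order": [r[0] for r in rows],
    "Country": [r[1] for r in rows],
    "Product": [r[2] for r in rows],
    "Sales": [r[3] for r in rows],
}


def sales_by_country(cols, product):
    """Equivalent of: SELECT Country, SUM(sales) ... WHERE Product = ? GROUP BY Country."""
    totals = defaultdict(int)
    # Scan only the Product column, then touch Country and Sales for matches.
    for i, value in enumerate(cols["Product"]):
        if value == product:
            totals[cols["Country"][i]] += cols["Sales"][i]
    return dict(totals)


print(sales_by_country(columns, "corn"))
```

The query never reads the Order column at all, whereas a row-oriented scan would pull every field of every row through memory.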
  18. SAP HANA: Data Compression
     • Efficient compression methods (dictionary, run length, cluster, prefix, etc.).
     • Compression works well with columns and can speed up operations on columns (by a factor of roughly 10).
     • Because of compression, changes are written into a less compressed delta storage.
     • The delta needs to be merged into the columns from time to time, or when a certain size is exceeded.
     • The delta merge can be done in the background.
     • There is a trade-off between compression ratio and delta merge runtime.
     • Updates go into the delta storage and are periodically merged into the main data storage.
     • High write performance is not affected by compression.
     • Data is written to delta storage with less compression, which is optimized for write access. It is merged into the main area of the column store later on.
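The write path described on this slide can be sketched as a toy column with a compressed main part and an uncompressed delta. This is a minimal sketch under stated assumptions: the `DeltaMergeColumn` class, the `merge_threshold` parameter, and the synchronous merge are hypothetical simplifications (the slide says the real merge can run in the background), and dictionary encoding stands in for whatever compression the main store uses.

```python
class DeltaMergeColumn:
    """Toy column: dictionary-compressed main store plus a write-optimized delta."""

    def __init__(self, merge_threshold=3):
        self.merge_threshold = merge_threshold  # assumed size trigger for the merge
        self.dictionary = []  # sorted distinct values of the main store
        self.main_ids = []    # one dictionary index per row in the main store
        self.delta = []       # recent writes, stored uncompressed

    def insert(self, value):
        # Writes only append to the delta, so they never touch compressed data.
        self.delta.append(value)
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def scan(self):
        # Reads must combine the main store and the not-yet-merged delta.
        return [self.dictionary[i] for i in self.main_ids] + self.delta

    def merge(self):
        # Rebuild the dictionary and re-encode all rows, then empty the delta.
        values = self.scan()
        self.dictionary = sorted(set(values))
        position = {v: i for i, v in enumerate(self.dictionary)}
        self.main_ids = [position[v] for v in values]
        self.delta = []


col = DeltaMergeColumn(merge_threshold=3)
for product in ["corn", "wheat", "corn", "rice"]:
    col.insert(product)
print(col.dictionary, col.main_ids, col.delta)
```

The merge re-encodes the whole column, which is why it is deferred and batched: a larger threshold means fewer expensive merges but a larger uncompressed delta to scan.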
  19. SAP HANA: Dictionary Compression
     • Column “Name” (uncompressed): Jones, Miller, Millman, Zsuwalski, Baker, Miller, John, Miller, Johnson, Jones
     • Column “Name” (dictionary compressed): a value-ID sequence with one element for each row in the column; the value IDs point into the dictionary. Value IDs on the slide: 4 1 5 N 0 4 2 4 3 1
     • Dictionary (sorted): 0 Baker, 1 John, 2 Johnson, 3 Jones, 4 Miller, 5 Millman, ..., N Zsuwalski
     • The value ID is implicitly given by the sequence in which the values are stored.
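Dictionary compression as described here can be sketched in a few lines of Python. The function names are hypothetical, and the row order of the example column is an assumption taken from the order in which the names appear in the transcript, so the resulting value IDs are those of this sketch rather than those of the slide's figure.

```python
def dictionary_encode(values):
    """Return (sorted dictionary, value-ID sequence) for a column of values."""
    dictionary = sorted(set(values))
    # The value ID is simply the position of the value in the sorted dictionary.
    value_id = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [value_id[v] for v in values]


def dictionary_decode(dictionary, ids):
    """Rebuild the original column by looking each value ID up in the dictionary."""
    return [dictionary[i] for i in ids]


# Example column; the row order is assumed for illustration.
names = ["Jones", "Miller", "Millman", "Zsuwalski", "Baker",
         "Miller", "John", "Miller", "Johnson", "Jones"]

dictionary, ids = dictionary_encode(names)
print(dictionary)
print(ids)
assert dictionary_decode(dictionary, ids) == names
```

Each repeated name is stored once in the dictionary and represented in the column by a small integer. Because the dictionary is sorted, comparisons on values can be carried out on the integer IDs.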
  20. SAP HANA: Scalability
     Scales from very small servers to very large clusters.
     • Single server: 2 CPUs with 128 GB up to 8 CPUs with 1 TB
     • Scale-out cluster:
       - 2 to n servers per cluster
       - Largest certified configuration: 16 servers
       - Largest tested configuration: 100+ servers
       - Support for high availability and disaster tolerance
     • Cloud deployment
  21. What is inside HANA?
     • Core: ACID-compliant database, in-memory, column store
     • Components: Parallel Execution, Scripting Engine, Business Function Library, Unstructured (Text), Predictive Analysis Library, OLAP, XS App Server, “R”, HS, ESP, Spatial / Geospatial, HANA Studio
     • In:
       - Data Services integration: 1. Batch transfer 2. SAP & non-SAP 3. Extensive transformations 4. Structured & unstructured 5. Hadoop integration
       - Query Federation: 1. IQ / ASE 2. Teradata / Oracle 3. Hadoop
       - Replication Services: 1. Near real time 2. Non-SAP
     • Out:
       - SQL: 1. ODBC / JDBC 2. 3rd-party apps 3. 3rd-party tools
       - BICS: 1. BICS 2. NetWeaver BW 3. SAP BOBJ
       - MDX: 1. ODBO 2. MS Excel 3. 3rd-party OLAP tools
       - JSON / XML: 1. HTTP 2. RESTful services 3. OData compliant
  22. REFERENCES
     • Hao Zhang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, and Meihui Zhang. In-Memory Big Data Management and Processing.
     • Juchang Lee, Yong Sik Kwon, Franz Färber, Michael Muehle, Chulwon. SAP HANA Distributed In-Memory Database System: Transaction, Session, and Metadata Management. SAP Labs, Korea.
     • In-memory database. www.wikipedia.org