"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB

Webinar presentation delivered by Dr. Michael Stonebraker and Scott Jarr of VoltDB on December 11, 2012. www.voltdb.com

The design decisions you make today will have a huge performance impact down the line. Until recently, when it came to databases, the choice was easy. Essentially, you had one option: the RDBMS. Today, there's a new universe of databases being thrown into production — and not always with the greatest success. How do you make the right choice for your next application? Database pioneer Dr. Michael Stonebraker and VoltDB co-founder Scott Jarr have some thoughts.

Published in: Technology

Transcript of "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB

  1. Navigating the Database Universe
     Dr. Michael Stonebraker and Scott Jarr
  2. About Our Presenters
     Mike Stonebraker, Co-founder & CTO, VoltDB
     • A pioneer of database research and technology for more than a quarter of a century, and the main architect of the Ingres relational DBMS and the object-relational DBMS PostgreSQL
     Scott Jarr, Co-founder & Chief Strategy Officer, VoltDB
     • More than 20 years of experience building, launching and growing technology companies from inception to market leadership in the search, mobile, security, storage and virtualization markets
  3. Agenda
     • The (proper) design of DBMSs
       – Presented by Dr. Michael Stonebraker
     • The database universe
     • Where the future value comes from
  4. We Believe…
     • “Big Data” is a rare, transformative market
     • Velocity is becoming the cornerstone
     • Specialized databases (working together) are the answer
     • Products must provide tangible customer value... Fast
  5. THE (PROPER) DESIGN OF THE DBMS
     Dr. Michael Stonebraker
  6. Lessons from 40 Years of Database Design
     1. Get the user interaction right
        – Bet on a small number of easy-to-understand constructs
        – Plus standards
     2. Get the implementation right
        – Bet on a small number of easy-to-understand constructs
     3. One size does not fit all
        – At least not if you want fast, big or complex
     “Those who don’t learn from history are destined to repeat it.” -Winston Churchill
  7. #1: Get the User Interaction Right
     Historical Lesson: RDBMS vs. CODASYL vs. OODB
     Winner: RDBMS
     • Simple data model (tables)
     • Simple access language (SQL)
     • ACID (transactions)
     • Standards (SQL)
     Loser: CODASYL
     • Complicated data model (records; participate in “sets”; set has one owner and, perhaps, many members, etc.)
     • Messy access language (sea of “cursors”; some -- but not all -- move on every command, navigation programming)
     Loser: OODBs
     • Complex data model (hierarchical records, pointers, sets, arrays, etc.)
     • Complex access language (navigation, through this sea)
     • No standards
  8. Interaction Take Away: Simple is Good
     • ACID was easy for people to understand
     • SQL provided a standard, high-level language and made people productive (transportable skills)
  9. #2: Get the Implementation Right
     Historical Winners
     • Leverage a few simple ideas: Early relational implementations
       – System R storage system dropped links
       – Views (protection, schema modification, performance)
       – Cost-based optimizer
     • Leverage a few simple ideas: Postgres
       – User-defined data types and functions (adopted by most everybody)
       – Rules/triggers
       – No-overwrite storage
     • Leverage a few simple ideas: Vertica
       – Store data by column
       – Compressed up the ging gong
       – Parallel load without compromising ACID
  10. #3: One Size Does NOT Fit All
     • OSFA is an old technology with hundreds of bags hanging off it
     • It breaks 100% of the time when under load
     • Load = size or speed or complexity
     • Load is increasing at a startling rate
     • Purpose-built will exceed by 10x to 100x
     • History has not been completely written yet…but let’s look at VoltDB as an example
     “…specialized systems can each be a factor of 50 faster than the single ‘one size fits all’ system…A factor of 50 is nothing to sneeze at.” -My Top 10 Assertions About Data Warehouses, 2010
  11. Example: VoltDB
     • Get the interface right
       – SQL
       – ACID
     • Implementation: Leverage a few simple ideas
       – Main memory
       – Stored procedures
       – Deterministic scheduling
     • Specialization
       – OLTP focus allowed for above implementation choices
  12. Proving the Theory
     • Challenge: OLTP performance
       – TPC-C CPU cycles
       – On the Shore DBMS prototype
       – Elephants should be similar
     Chart (share of CPU cycles):
       – Useful Work: 4%
       – Recovery: 24%
       – Latching: 24%
       – Buffer Pool: 24%
       – Locking: 24%
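The breakdown on slide 12 implies a ceiling on how much faster a system could run if the overhead components were eliminated. The sketch below is an illustration only, not a calculation from the webinar: it assumes an Amdahl-style model in which a removed component costs exactly zero cycles and nothing new is added in its place. The function name `speedup_if_removed` is hypothetical.

```python
# CPU-cycle shares taken from slide 12 (percent of total cycles).
breakdown = {
    "useful_work": 4,
    "recovery": 24,
    "latching": 24,
    "buffer_pool": 24,
    "locking": 24,
}


def speedup_if_removed(shares, removed):
    """Upper-bound speedup if the named components cost nothing (assumed model)."""
    total = sum(shares.values())
    remaining = total - sum(shares[name] for name in removed)
    return total / remaining


all_overhead = ["recovery", "latching", "buffer_pool", "locking"]

# Removing a single component helps only modestly under this model...
one_component = speedup_if_removed(breakdown, ["buffer_pool"])
# ...while removing all four leaves only the useful work.
everything = speedup_if_removed(breakdown, all_overhead)

print(f"buffer pool only: {one_component:.2f}x")
print(f"all four:         {everything:.2f}x")
```

Under these assumptions, dropping one component alone yields roughly a 1.3x gain, which is one way to read the argument for removing all four together.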
  13. Implementation Construct #1: Main Memory
     • Main memory format for data
       – Disk format gets you buffer pool overhead
     • What happens if data doesn’t fit?
       – Return to disk-buffer pool architecture (slow)
       – Anti-caching
         • Main memory format for data
         • When memory fills up, then bundle together elderly tuples and write them out
         • Run a transaction in “sleuth mode”; find the required records and move to main memory (and pin)
         • Run Xact normally
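The anti-caching steps on slide 13 can be sketched roughly as follows. This is a toy model under stated assumptions, not VoltDB's implementation: the class `AntiCacheStore` and its methods are hypothetical, a plain dict stands in for the on-disk blocks, and the "sleuth mode" pass is simplified to a list of keys that the transaction declares up front (all of which are assumed to exist).

```python
from collections import OrderedDict


class AntiCacheStore:
    """Toy anti-cache: memory is primary; the oldest tuples spill to 'disk'."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = OrderedDict()  # key -> tuple, oldest first
        self.disk = {}               # stand-in for blocks of evicted tuples
        self.pinned = set()          # keys that must stay in memory

    def put(self, key, value):
        self.disk.pop(key, None)
        self.memory[key] = value
        self.memory.move_to_end(key)
        self._evict_if_full()

    def _evict_if_full(self):
        # When memory fills up, write out the oldest unpinned tuples.
        while len(self.memory) > self.capacity:
            victim = next((k for k in self.memory if k not in self.pinned), None)
            if victim is None:
                break  # everything is pinned; nothing can be evicted
            self.disk[victim] = self.memory.pop(victim)

    def run(self, keys, txn):
        # "Sleuth" pass (simplified): find required records not in memory.
        missing = [k for k in keys if k not in self.memory]
        self.pinned.update(keys)
        for k in missing:
            self.memory[k] = self.disk.pop(k)  # move to main memory and pin
        try:
            for k in keys:
                self.memory.move_to_end(k)     # mark as recently used
            return txn(self.memory)            # run the transaction normally
        finally:
            self.pinned.difference_update(keys)
            self._evict_if_full()


store = AntiCacheStore(capacity=2)
store.put("a", 1)
store.put("b", 2)
store.put("c", 3)  # "a" is the oldest tuple, so it is written out
result = store.run(["a"], lambda rows: rows["a"] + 1)
```

The point of the design is that the in-memory format stays primary: tuples are fetched back before the transaction runs, so the transaction itself never touches a buffer pool.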
  14. Implementation Construct #2: Stored Procedures
     • Round trip to the DBMS is expensive
       – Do it once per transaction
       – Not once per command
       – Or even once per cursor move
     • Ad-hoc queries supported
       – Turn them into dynamic stored procedures
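The round-trip argument on slide 14 can be made concrete by counting trips. The sketch below is a minimal illustration, not VoltDB's client API: `CountingServer`, `execute`, `register` and `call` are hypothetical names, and the "server" is an in-process object that only increments a counter where a real system would cross the network.

```python
class CountingServer:
    """In-process stand-in for a DBMS that counts client round trips."""

    def __init__(self):
        self.round_trips = 0
        self.balances = {"alice": 100, "bob": 50}
        self.procedures = {}

    def execute(self, statement, *args):
        """One round trip per individual command."""
        self.round_trips += 1
        return statement(self.balances, *args)

    def register(self, name, procedure):
        self.procedures[name] = procedure

    def call(self, name, *args):
        """One round trip for the whole transaction."""
        self.round_trips += 1
        return self.procedures[name](self.balances, *args)


def read(balances, who):
    return balances[who]


def write(balances, who, amount):
    balances[who] = amount


def transfer(balances, src, dst, amount):
    """The same logic as the four commands below, run server-side."""
    if balances[src] < amount:
        raise ValueError("insufficient funds")
    balances[src] -= amount
    balances[dst] += amount


server = CountingServer()

# Once per command: two reads and two writes cost four round trips.
alice = server.execute(read, "alice")
bob = server.execute(read, "bob")
server.execute(write, "alice", alice - 10)
server.execute(write, "bob", bob + 10)
per_command_trips = server.round_trips

# Once per transaction: the stored procedure costs a single round trip.
server.round_trips = 0
server.register("transfer", transfer)
server.call("transfer", "alice", "bob", 10)
per_transaction_trips = server.round_trips
```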
  15. Implementation Construct #3: Deterministic and Non-deterministic Scheduling
     • Non-deterministic (can’t tell order until commit time)
       – MVCC
       – Dynamic locking
     • Deterministic
       – Time stamp order
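A small sketch of what timestamp-order scheduling on slide 15 buys: if every copy of the data executes the same transactions serially in timestamp order, the outcome does not depend on arrival order. The helper names below are hypothetical, timestamps are assumed to be unique, and the example ignores partitioning and failures.

```python
def run_in_timestamp_order(transactions, initial_state):
    """Execute (timestamp, txn) pairs serially, ordered by timestamp."""
    state = dict(initial_state)
    for _, txn in sorted(transactions, key=lambda item: item[0]):
        txn(state)
    return state


def deposit(amount):
    def txn(state):
        state["balance"] += amount
    return txn


def double():
    def txn(state):
        state["balance"] *= 2
    return txn


# These transactions do not commute, so execution order matters.
txns = [(1, deposit(10)), (2, double()), (3, deposit(5))]

# Two replicas receive the same transactions in different arrival orders.
replica_a = run_in_timestamp_order(txns, {"balance": 0})
replica_b = run_in_timestamp_order(list(reversed(txns)), {"balance": 0})
```

Both replicas end in the same state because the order is fixed before execution rather than discovered at commit time, as it is with MVCC or dynamic locking.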
  16. Result of Design Principles: VoltDB Example
     • Good interface decisions made developers more productive
       – SQL & ACID
     • Leveraging a few simple implementation ideas made VoltDB wicked fast
       – Main memory
       – Stored procedures
       – Deterministic scheduling
  17. Proving the Theory
     • Answer: OLTP performance
       – 3 million transactions per second
       – 7x Cassandra
       – 15 million SQL statements per second
       – 100,000+ transactions per commodity server
     “…we are heading toward a world with at least 5 (and probably more) specialized engines and the death of the ‘one size fits all’ legacy systems.” -The End of an Architectural Era (It’s Time for a Complete Rewrite), 2007
  18. THE DATABASE UNIVERSE
     Scott Jarr
  19. Technology Meets the Market
     Believe
       – “Big Data” is a rare, transformative market
       – Velocity is becoming the cornerstone
       – Specialized databases (working together) are the answer
       – Products must provide tangible customer value… Fast
     Observations
       – Noisy, crowded and new, kinda like Christmas shopping at the mall
       – Everyone wants to understand where the pieces fit
       – Analysts build maps on technology NOT use cases
     What we need is…
  20. Data Value Chain
     Horizontal axis: Age of Data
     • Interactive (Milliseconds)
       – Place trade
       – Serve ad
       – Enrich stream
       – Examine packet
       – Approve trans.
     • Real-time Analytics (Hundredths of seconds)
       – Calculate risk
       – Leaderboard
       – Aggregate
       – Count
     • Record Lookup (Second(s))
       – Retrieve click stream
       – Show orders
     • Historical Analytics (Minutes)
       – Backtest algo
       – BI
       – Daily reports
     • Exploratory Analytics (Hours)
       – Algo discovery
       – Log analysis
       – Fraud pattern match
  21. Data Value Chain
     Same chart as slide 20 (Age of Data, from Interactive at milliseconds to Exploratory Analytics at hours, with the same example tasks), with these labels added:
       – Value of Individual Data Item
       – Aggregate Data Value
       – Data Value
  22. The Database Universe
     [Diagram]
     • Axis labels: Simple to Complex, Slow to Fast, Small to Large; Application Complexity; Data Value
     • Curves: Value of Individual Data Item; Aggregate Data Value
     • Regions along the bottom: Transactional, Analytic, Exploratory
     • Stages: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics
     • Plotted: Traditional RDBMS
  23. The Database Universe
     [Diagram]
     • Axis labels: Simple to Complex, Slow to Fast, Small to Large; Application Complexity; Data Value
     • Curves: Value of Individual Data Item; Aggregate Data Value
     • Regions along the bottom: Transactional, Analytic, Exploratory
     • Stages: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics
     • Plotted: Traditional RDBMS, NewSQL, NoSQL, Data Warehouse, Hadoop, etc.
     • Additional label: Velocity
  24. Closed-loop Big Data
     [Diagram]
     • Inputs: logins, trades, authorizations, clicks, sensors, orders, impressions
     • Stages: Interactive & Real-time Analytics; Historical Reports & Analytics; Exploratory Analytics
  25. Closed-loop Big Data
     [Diagram]
     • Inputs: logins, trades, authorizations, clicks, sensors, orders, impressions
     • Stages: Interactive & Real-time Analytics; Historical Reports & Analytics; Exploratory Analytics
     • Label: Knowledge
     Callouts:
     • Make the most informed decision every time there is an interaction
     • Real-time decisions are informed by operational analytics and past knowledge
  26. The Velocity Use Case
     What’s it look like?
       – High throughput, relentless data feeds
       – Fast decisions on high-value data
       – Real-time, operational analytics present immediate visibility
     What’s the big deal?
       – Batch converts to real time = efficiency
       – Decisions made at time of event = better decisions
       – Ability to micro segment/target/personalize/etc. = conversion, satisfaction
     More data is coming at you, use it to improve your business
  27. Next Up
     QUESTIONS AND ANSWERS
  28. THANK YOU
     www.voltdb.com