Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

From SQL to NoSQL: the fourth time’s the charm – Couchbase Connect 2016

512 views

Published on

In the beginning was the Word, and the Word was with Codd, and the Word was Codd. The beginning – of the database field as we know it today, that is – was 1970. And of course, the Word was the relational model: rows and columns, a normalized (and flat) world. The spartan simplicity of that early relational model, together with the very idea of a logical data model, enabled revolutionary changes: declarative queries, transparent indexing, query optimization, and scalable run-time parallelism, among others. Also from this beginning grew a multi-billion dollar market for relational database systems and tools. And the rest, as they say, is history...

While relational languages and systems have served us well for nearly half a century, they have always been a bit of a "misfit" when it comes to application data and requirements, particularly for user-facing applications. Time and again this issue has been identified and "solved" – or not. Three notable attempts to address this "impedance mismatch" between applications and databases were object-oriented databases, object-relational databases, and XML databases. Each attempt, in the end, "fell flat" (so to speak) as compared to the success of relational databases. Another attempt is happening today, in the "Big Data" era, in the form of "NoSQL" databases. In this talk, the speaker will share his views about what went wrong the first three times, what lasting lessons resulted, and why differences in the environment today lead to the conclusion that the fourth time will be the charm.

Published in: Software

From SQL to NoSQL: the fourth time’s the charm – Couchbase Connect 2016

  1. 1. ©2016 Couchbase Inc. 1 The Couchbase Connect16 mobile app Take our in-app survey!
  2. 2. ©2016 Couchbase Inc. Beyond Rows and Columns: Is the FourthTime the Charm? MichaelJ. Carey University of California, Irvine and Couchbase, Inc. 2
  3. 3. ©2016 Couchbase Inc. 3 Michael J. Carey Bren Professor of I&CS, UC Irvine Consulting Architect, Couchbase Database researcher and architect. Background includes BEA, IBM, Propel Software, University of Wisconsin-Madison, and UC Irvine. ACM Fellow, member of the National Academy of Engineering, and recipient of the ACM SIGMOD E.F. Codd Innovations Award. Current interests all center around data-intensive computing and scalable data management (a.k.a. Big Data). Personal goal: Retire early enough to go to music school and have a second (unsuccessful but fun ) career playing music.
  4. 4. ©2016 Couchbase Inc.©2016 Couchbase Inc. Databases and Data Models from Pre-SQL to “NoSQL” • The pre-relational era • The relational database era • Beyond rows and columns? 1. The object-oriented database era 2. The object-relational database era 3. The XML database era 4. The NoSQL database era • Where does Couchbase technology fit in? • Reflections, challenges, and Q&A 4
  5. 5. ©2016 Couchbase Inc.©2016 Couchbase Inc. The Birth ofToday’s DBMS Field • In the beginning was theWord, and the Word was with Codd, and theWord was Codd... • 1970 CACM paper: “A relational model of data for large shared data banks” • Some view this as the first generation of (real) DB management systems... 5
  6. 6. ©2016 Couchbase Inc.©2016 Couchbase Inc. The First Decade B.C. • The need for a data management library, or a database management system, had actually been well recognized before Codd • Hierarchical database systems (e.g., IMS from IBM) • Network database systems (most notably the CODASYL standard) • These systems provided navigational APIs • Systems provided files, records, pointers, and indexes • Programmers had to (carefully!) scan or search for records, follow parent/child structures or pointers, and maintain their code for correctness and performance when anything physical changed 6
  7. 7. ©2016 Couchbase Inc.©2016 Couchbase Inc. The First Decade B.C. (cont.) 7 Order (id, custName, custCity, total) Item (ino, qty, price) Product (sku, name, listPrice, size, power) Item-ProductItem-Order Item-Order 123 Fred LA 25.97 401 Garfield T-Shirt 9.99 XL - 544 USB Charger 5.99 - 115V1 2 9.99 2 1 3.99 Order Item Item Product Product Item-Product Item-Product Set Member Set Owner Set
  8. 8. ©2016 Couchbase Inc.©2016 Couchbase Inc. Enter the Relational DB Era • Be sure to notice that • Everything’s now (logical) rows and columns • The world is flat; columns are atomic (1NF) • Data is now connected via values (foreign/primary keys) 8 Order (id, custName, custCity, total) Item (order-id, ino, product-sku, qty, price) Product (sku, name, listPrice, size, power) 123 Fred LA 25.97 401 Garfield T-Shirt 9.99 XL null 544 USB Charger 5.99 null 115V 123 1 401 2 9.99 123 2 544 1 3.99
  9. 9. ©2016 Couchbase Inc.©2016 Couchbase Inc. As the Relational Era Unfolded • The Spartan simplicity of the relational data model made it possible to start tackling the many opportunities (and challenges) of a logical data model • Declarative queries (Relational algebra, Relational calculus,Quel, QBE, SQL, ...) • Transparent indexing (physical data independence) • Query optimization and (efficient) execution • Views, constraints, referential integrity, security, ... • Scalable (shared-nothing) parallel processing • Today’s multi-$B database industry was slowly but surely born • Serious commercial adoption took ~10-15 years • Commercial parallel DB systems took ~5 more years 9
  10. 10. ©2016 Couchbase Inc.©2016 Couchbase Inc. Enter the Object-Oriented DB Era • Notice that: • Data model contains objects and pointers (OIDs) • The world is no longer flat – the Order and Item schemas now “contain” set(Item) and Product , respectively • The world is no longer straight-jacketed – Product details can vary (subject to OO model) 10 123 Fred LA 25.97 {  } 401 Garfield T-Shirt 9.99 XL 544 USB Charger 5.99 115V1 2 9.99  2 1 3.99  Order Item Item ClothingProduct ElectricProduct
  11. 11. ©2016 Couchbase Inc.©2016 Couchbase Inc. What OODBs Sought to Offer • Motivated largely by late 1980’s CAx applications (e.g., mechanical CAD, VLSI CAD, software CAD, ...) • Rich schemas with inheritance, complex objects, object identity, references, ... • Methods (“behavior”) as well as data in the DBMS • Tight bindings with (OO) programming languages • Fast navigation, some declarative querying • Ex: Gemstone, Ontos, Objectivity, Versant, Object Design, O2, ... 11
  12. 12. ©2016 Couchbase Inc.©2016 Couchbase Inc. Why OODBs “Fell Flat” • Too soon for another (radical) DB technology • Also technically immature relative to RDBMSs • Tight PL bindings were a double-edged sword • Data most often shared, and outlives programming languages • Bindings led to significant system heterogeneity • Also made schema evolution a major challenge • Systems “overfitted” in some dimensions • Inheritance, version management, ... • Focused on thick clients (e.g.,CAD workstations) 12
  13. 13. ©2016 Couchbase Inc.©2016 Couchbase Inc. Enter the Object-Relational DB Era • Be sure to notice: • “One size fits all!” () • UDTs/UDFs, table hierarchies, references, ... • But the world got flatter again... (Timing lagged OODBs by just a few years) 13 Product (sku, name, listPrice) ClothingProduct (size) under Product ElectricProduct (power) under Product Order (id, customer, total) Item (order-id, ino, product-sku, qty, price) 401 Garfield T-Shirt 9.99 XL 544 USB Charger 5.99 115V (123) 1 (401) 2 9.99 (123) 2 (544) 1 3.99 123 Fred LA 25.97
  14. 14. ©2016 Couchbase Inc.©2016 Couchbase Inc. What O-R DBs Sought to Offer • Motivated in large part by newly emerging application opportunities (multimedia, spatial, text, ...) • User-defined functions (UDTs/UDFs) & aggregates • Data blades (UDTs/UDFs + indexing support) • OO goodies for tables: row types, references, ... • Nested tables (well, at least Oracle added these) • Back to a programming model where applications were loosely bound to the DBMS (e.g., ODBC/JDBC) • Ex: ADT-Ingres, Postgres, Starburst, UniSQL, Illustra, DB2, Oracle 14
  15. 15. ©2016 Couchbase Inc.©2016 Couchbase Inc. Why O-R DBs “Fell Flat” • Significant differences across DB vendors • SQL standardization lagged somewhat • Didn’t include details of UDT/UDF extensions • Tough to extend the system innards (for indexing) • Application issues (and multiple platform support) • Least common denominator vs. coolest features • Tools (e.g., DB design tools, ORM layers, ...) • Also still probably a bit too much too soon • IT departments still rolling in RDBMSs and creating relational data warehouses 15
  16. 16. ©2016 Couchbase Inc.©2016 Couchbase Inc. Then Came the XML DB Era 16 <Order id=”123”> <Customer> <custName>Fred</custName> <custCity>LA</custCity> </Customer> <total>25.97</total> <Items> <Item ino=”1”> <product-sku>401</product-sku> <qty>2</qty> <price>9.99</price> </Item> <Item ino=“2”> < product-sku>544</product-sku> <qty>1</qty> <price>3.99</price> </Item ino=”2”> </Items> </Order> <Product sku=”401”> <name>Garfield T-Shirt</name> <listPrice>9.99</listPrice> <size>XL</size> </Product> <Product sku=”544”> <name>USB Charger</name> <listPrice>5.99</listPrice> <power>115V</power> </Product> Note that - The world’s less flat again - We’re now in the 2000’s
  17. 17. ©2016 Couchbase Inc.©2016 Couchbase Inc. What XML DBs Sought to Offer • One <flexible/> data model fits all (XML) • Origins in document markup (SGML) • Nested data • Schema variety/optionality • New declarative query language (XQuery) • Designed both for querying and transformation • Early standardization effort (W3C) • Two different DB-related use cases, in reality • Data storage: Lore (pre-XML), Natix,Timber, Ipedo, MarkLogic, BaseX; also DB2, Oracle, SQL Server • Data integration: NimbleTechnology, BEA Liquid Data (from Enosys), BEA AquaLogic Data Services Platform 17
  18. 18. ©2016 Couchbase Inc.©2016 Couchbase Inc. Why XML DBs “Fell Flat”Too • Document-centric origins (vs. data use cases) of XML Schema and XQuery made a mess of things • W3C XPATH legacy () • Document identity, document order, ... • Attributes vs. elements, nulls, ... • Mixed content (overkill for non-document data) • Two other external trends also played a role • SOA andWeb services came, but then went • JSON (and RESTful services) appeared on the scene • Note: Likely still an important niche market... 18
  19. 19. ©2016 Couchbase Inc.©2016 Couchbase Inc. Now the NoSQL DB Era? • Not born in the DB world • Distributed systems folks built them • Also various startups • From caches  K/V use cases • Needed huge (SN) scale-out • OLTP (vs. parallel DB) apps • Simple, low-latencyAPI • Need a key, but want no schema forV • Record-level ACIDity, replica consistency varies • In the context of this talk, NoSQL does not mean • Hadoop (nor SQL on Hadoop! ) • Graph databases or graph analytics platforms 19
  20. 20. ©2016 Couchbase Inc.©2016 Couchbase Inc. NoSQL Data (JSON-based) Collection(Order) {“id: “123”, “Customer”: { “custName”: “Fred”, “custCity”: “LA” } “total”: 25.97, “Items”: [ {“product-sku”: 401, “qty”: 2, “price”: 9.99 }, {“product-sku”: 544, “qty”: 1, “price”: 3.99 } ] } 20 Collection(Product) {“ sku”: 401, “name”: “Garfield T-Shirt”, “listPrice”: 9.99, “size”: “XL” } {“sku”: 544, “name”: “USB Charger”, “listPrice”: 5.99, “power”: “115V” } Note that - The world’s not flat, but now it’s less <messy/> - We’re now in the 2010’s, timing-wise
  21. 21. ©2016 Couchbase Inc.©2016 Couchbase Inc. Current NoSQLTrends • Users coveting the benefits of many DB goodies • Secondary indexing and non-key access • Declarative queries • Aggregates and now (initially small) joins • We seem to be heading towards a world of ... • BDMS (think scalable,OLTP-aimed, parallel DBMS) • Declarative queries and query optimization applied to schema-less data • Return of (some, optional!) schema information 21
  22. 22. ©2016 Couchbase Inc.©2016 Couchbase Inc. One Early Example: Apache AsterixDB http://asterixdb.apache.org/
  23. 23. ©2016 Couchbase Inc.©2016 Couchbase Inc. Queries in AsterixDB: AQL (Asterix Query Language) 23 • Ex: List the user name and messages sent by those users who joined the Mugshot social network in a certain time window: from $user in dataset MugshotUsers where $user.user-since >= datetime('2010-07-22T00:00:00') and $user.user-since <= datetime('2012-07-29T23:59:59') select { "uname" : $user.name, "messages" : from $message in dataset MugshotMessages where $message.author-id = $user.id select $message.message };
  24. 24. ©2016 Couchbase Inc.©2016 Couchbase Inc. Another “Relevant” Example: Couchbase Server (!) 24 CouchDB Memcached Membase + Caching Distributed Key-Value Document Operational Big Data Database
  25. 25. ©2016 Couchbase Inc.©2016 Couchbase Inc. Couchbase Data Management and Data Access 25 • Everything is built on top of Key-Value storage • A Document store is a special case of Key-Value • Views provide aggregation and real-time analytics through incremental map-reduce • Global Secondary Indexes provide low latency/high throughput indexes • N1QL is a language that provides a powerful and expressive (and SQL-like!) way of accessing and manipulating JSON documents – including joins (on keys), aggregates, ..., and access path selection
  26. 26. ©2016 Couchbase Inc.©2016 Couchbase Inc. Coming Next: Couchbase Analytics (DP) 26 One stop shopping for both operations and analytics Couchbase Query Optimized for operational (narrow) queries Many queries Each touches a little data Couchbase Analytics Fewer queries Each touches a lot of data Optimized for analytical (big) queries
  27. 27. ©2016 Couchbase Inc.©2016 Couchbase Inc. Couchbase: A Database Redefined 27 Data Query Index Search AnalyticsTransport Unified Administration Unified Declarative Programming Interface
  28. 28. ©2016 Couchbase Inc.©2016 Couchbase Inc. I Believe that the FourthTime’s the Charm (!) • Applications have long needed a shared “place for their stuff” • Relational databases and SQL (along with 1NF) drastically changed the game in the 70’s and 80’s • Rows, columns, and declarative queries • Tremendous advances in algorithms, indexing, and parallelism • Since then, applications have unsuccessfully sought more “stuff-friendly” places • Object-oriented DB systems • Object-relational DB systems • XML DB systems • But this time it’s different – really! • NoSQL != no queries, but rather, no 1NF and no (or optional) schema • AsterixDB, Couchbase Server, and now Couchbase Analytics are all examples 28
  29. 29. ©2016 Couchbase Inc.©2016 Couchbase Inc. Related Connect16 Sessions • Overview of Couchbase Server Architecture • David Haikney • Tuesday, 10:00 am - 10:50 am • Great America 2 • How to Understand and Use the Query Optimizer • Keshav Murthy • Tuesday, 11:00 am - 11:50 am • Great America 2 • Keynote: Beyond NoSQL – Building a Digital Future • Ravi Mayuram • Tuesday, 1:45 pm - 2:45 pm • Hall D 29 • Utilizing Arrays: Modeling, Indexing, and Querying • Keshav Murthy • Wednesday, 9:00 am -9:50 am • Great America K • Sneak Peek: Couchbase Analytics • TillWestmann &Yingyi Bu • Wednesday, 2:00 pm - 2:50 pm • Great America 2 • SQL++: SQL for NoSQL • Yannis Papakonstantinou • Wednesday, 3:10 pm - 4:00 pm • Great America 2
  30. 30. ©2016 Couchbase Inc. Q&A 30
  31. 31. ©2016 Couchbase Inc. ThankYou! 31
  32. 32. ©2016 Couchbase Inc. 32 Share your opinion on Couchbase 1. Go here: http://gtnr.it/2eRxYWn 2. Create a profile 3. Provide feedback (~15 minutes)

×