Polyglot Persistence vs Multi-Model Databases


Many complex applications scale by using several different databases, selecting the best DBMS for each use case. This tends to complicate the architecture with many products from different vendors, no standards, and a lot of ETL, which ultimately causes unpredictable results and a lot of headaches. Multi-Model DBMSs were created to make your life easier, giving you the option of using one NoSQL product with powerful multi-purpose engines capable of handling complex domains. Could one DBMS handle all your needs, including speed and scalability, in the age of Big Data? Luca will walk you through the benefits and trade-offs of multi-model DBMSs and show you how easy it is to set up one open source database to handle many different use cases, saving you time and money.

Presented at Data Day Texas - Austin (TX) - USA

Published in: Software
  • #57 slide - It would be better to change the columns order, because now OrientDB logo seems located in the "Polyglot" column.

  1. 1. Polyglot Persistence vs Multi-Model Databases Luca Garulli, Founder and CEO at OrientDB
  2. 2. Relational Databases are the Most Successful Technology Ever and Ruled for the Last 45 Years
  3. 3. The World Has Changed Relational (1970): Structured Data • Small Datasets • Few Relationships • Waterfall Approach • Scale Up • CIO NoSQL (2009): Unstructured Data • Large Volume • Connected Data • Agile Approach • Scale Out • Developers A NoSQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include: simplicity of design, "horizontal" scaling, which is a problem for relational databases, and finer control over availability. What's Next?
  4. 4. DBMS Quadrant Relationship Complexity > Data Complexity > Relational Key Value Column Graph Document
  5. 5. Polyglot Persistence Polyglot Persistence is a fancy term to describe that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by individual applications or components. http://www.jamesserra.com/archive/2015/07/what-is-polyglot-persistence/
  6. 6. Multi-Model A multi-model database is designed to support multiple data models against a single, integrated backend. Multi-model databases are intended to offer the data modeling advantages of polyglot persistence without its disadvantages. Complexity, in particular, is reduced. https://en.wikipedia.org/wiki/Multi-model_database
  7. 7. What's a Multi-Model DBMS? Graph Document Object Key/Value Full-Text Spatial Multi-Model represents the intersection of multiple models in just one product
  8. 8. Multi-Model Snow Patrol (Band) Luca (Account) Indie (Genre) 123, 1st Street Austin, Jill (Account) Graphs { "@rid": "12:382", "@class": "Customer", "name": "Jill", "surname": "Raggio", "phone": "+39 33123212", "details": { "city": "London", "tags": "millennial" } } Schema-less structures Object Oriented Key-Value pairs Geo-Spatial Full-Text OrientDB (Multi-Model)
  9. 9. Multi-Model Snow Patrol (Band) Luca (Account) Indie (Genre) 123, 1st Street Austin, Jill (Account) Graphs { "@rid": "12:382", "@class": "Customer", "name": "Jill", "surname": "Raggio", "phone": "+39 33123212", "details": { "city": "London", "tags": "millennial" } } Schema-less structures Object Oriented Key-Value pairs Geo-Spatial Full-Text OrientDB (Multi-Model)
  10. 10. DBMS Quadrant: Multi-Model Relationship Complexity > Data Complexity > Relational Key Value Column Graph Document Multi-Model
  11. 11. OrientDB • First Multi-Model DBMS with a Graph Engine • Open Source, Apache 2 license • Data Models are built into the core engine • The Graph Database engine allows O(1) performance when traversing relationships, against O(log N) for RDBMSs and for any other Multi-Model DBMS built as layers • Schema-less, Schema-full and Schema-mixed • Uses Apache Lucene for Full-Text and Spatial • Written in Java (runs on every platform) • Zero-config HA
  12. 12. Online Hotel Booking Application
  13. 13. Product Catalog Recommendations for similar products
  14. 14. Transactional data about rooms left at this moment Consider the active sessions on the same Product
  15. 15. The initial reaction of developers when the CTO agrees to use Polyglot Persistence
  16. 16. I can use the best tools for my use cases! No limits anymore! This is Developer FREEDOM! No more Tables?
  17. 17. Deployment
  18. 18. Polyglot Persistence in Action Example: Hotel Booking Application KEY/VALUE: User Sessions. Rapid access for reads and writes. No need to be durable. RELATIONAL: Financial Data. Needs transactional updates. It will manage orders and payments. GRAPH: Recommendations. Rapidly traverse links between friends, product purchases, and ratings. DOCUMENT: Product Catalog. Lots of reads, infrequent writes. Products make natural aggregates. SEARCH: Search Engine. Full-Text Search. Support for faceted search and suggestions.
  19. 19. Polyglot Persistence Application DOCUMENT KEY/VALUE GRAPH APPLICATION RELATIONAL User Sessions Product Catalog Recommendation Financial Data SEARCH Search Engine
  20. 20. Multi-Model in Action Example: Hotel Booking Application User Sessions Rapid Access for reads and writes. No need to be durable. Financial Data Needs transactional updates. It will manage orders and payments. Recommendations Rapidly traverse links between friends, product purchases, and ratings. Product Catalog Lots of reads, infrequent writes. Products make natural aggregates. Search Engine Full-Text Search. Support for faceted search and suggestions
  21. 21. Multi-Model Application APPLICATION User Sessions Product Catalog Recommendation Financial Data Search Engine
  22. 22. Deployment Multi-Model: • Only 1 product to learn • Only 1 server to configure and deploy • Only 1 vendor in case of support Polyglot: • 5 products to learn • 5 servers to configure and deploy • 5 vendors in case of support
  23. 23. Polyglot Deployment • 5 PRODUCTS TO LEARN No standard, all products are different. Even in the same category, they have different APIs (e.g. MongoDB and CouchDB). Every developer has to learn multiple products, or you have to hire multiple developers with specific skills for each product. • 5 SERVERS TO CONFIGURE AND DEPLOY It's usually a bad idea to put several databases on the same machine due to their aggressive use of resources such as RAM and disk. • 5 VENDORS IN CASE OF SUPPORT This means 5 contracts with 5 different vendors.
  24. 24. Domain design
  25. 25. Domain Design Product User Session Order Review
  26. 26. Polyglot Domain Design Product User Session Order Review
  27. 27. Multi-Model Domain Design Product User Session Order Review
  28. 28. Domain Design Multi-Model: • The entire domain is represented in just one model in the same database • All data is interconnected and easy to access • Easy to refactor Polyglot: • 5 different designs, each reproducing part of the data on one product • Application-level management of relationships between data held in different datasets and represented in different ways • Hard to refactor
  29. 29. Performance
  30. 30. Polyglot: Sequence Diagram APPLICATION (2) Get Product Details (3) Get Recommendation for the current product (5) Get orders to check availability (6) Check concurrent user activity on the same product (7) Update current user activity (in background) (4) Get basic information for each recommended product (1) Request Product Detail Page
  31. 31. Polyglot: Performance APPLICATION (5) Get orders to check availability (1) Request Product Detail Page (6) Check concurrent user activity on the same product = 10ms = 50ms = 200ms = 150ms = 20ms = 10ms Total Time = 530ms (7) Update current user activity (in background) (2) Get Product Details (3) Get Recommendation for the current product (4) Get basic information for each recommended product = 100ms
  32. 32. Multi-Model: Sequence Diagram APPLICATION (1) Request Product Detail Page (2) Get Product Details (3) Get Recommendation for the current product (5) Get orders to check availability (7) Update concurrent user activity (in background) (6) Check concurrent users activity on the same product (4) Get basic information for each recommended product
  33. 33. Multi-Model: Performance APPLICATION (1) Request Product Detail Page = 10ms Total Time = 300ms APPLICATION = 290ms (2) Get Product Details (3) Get Recommendation for the current product (5) Get orders to check availability (7) Update concurrent user activity (in background) (6) Check concurrent users activity on the same product (4) Get basic information for each recommended product
  34. 34. Caching to the Rescue (2) Get Product Details (3) Get Recommendation for the current product (4) Get basic information for each recommended product (1) Request Product Detail Page (6) Check concurrent users activity on the same product = 200ms (7) Update current user activity (in background) = 10ms = 50ms = 150ms = 20ms = 10ms If product descriptions don't change very often, they can be cached. Caching recommendations means losing the ability to recommend per user; only per-product recommendations remain. (5) Get orders to check availability = 100ms
  35. 35. Polyglot: Parallel Async Execution (2) Get Product Details (3) Get Recommendation for the current product (5) Get orders to check availability (1) Request Product Detail Page (6) Check concurrent users activity on the same product = 200ms (7) Update current user activity = 10ms = 50ms = 150ms = 20ms = 10ms = 310ms APPLICATION (4) Get basic information for each recommended product = 100ms
  36. 36. Performance With complex domains, Multi-Model is faster than Polyglot. But when the domain is simple, using specific products could give you better performance.
  37. 37. Performance continued... • With OrientDB, we have many stories about users who switched from a pure Graph Database to OrientDB. In all cases, they had comparable or better performance. • On the other hand, we don't have many stories about users who switched from a Key-Value store to OrientDB. • Performance depends on the Multi-Model product. • With Multi-Model, it's very important to have the models built into the engine. If they are just layers, you'll have a lot of compromises in terms of flexibility and performance.
  38. 38. Features
  39. 39. Features Multi-Model: Even though Multi-Model products are feature-rich, you may not find the feature you need. Polyglot: You can choose from 300 products, giving you access to all the available features.
  40. 40. Data Synchronization
  41. 41. Synchronization Multi-Model: All data is in the same datastore, so no synchronization is needed. Polyglot: There is no standard between products, so synchronization is entirely up to the developer, via ETL or at the application level.
  42. 42. Polyglot: Synchronization by ETL DOCUMENT GRAPH RELATIONAL In order to use the Recommendation engine, you have to develop the ETL to pump data into the Graph Database every hour/day, mixing data of products and sales. The Search Engine, instead, only needs data from the Product Catalog. ETL ETL ETL
  43. 43. Polyglot: Synchronization by App DOCUMENT GRAPH RELATIONAL APPLICATION You can avoid ETL if the application is responsible for populating all the DBMSs and keeping them in sync.
  44. 44. Let's put everything in High Availability (HA)
  45. 45. Polyglot Persistence in High Availability (HA)
  46. 46. Redis in HA Server A Sentinel A Server B Sentinel B Server C Sentinel C Suggested Configuration: Deploy at least 3 Redis Server + Redis Sentinel on 3 separate Boxes http://redis.io/topics/sentinel
  47. 47. Neo4j in HA Suggested Configuration: Deploy at least 3 Neo4j Servers http://neo4j.com/docs/stable/ha-architecture.html
  48. 48. MongoDB in HA Secondary 1 Suggested Configuration: Deploy at least 3 MongoDB Servers (1 Primary and 2 Secondary Servers) Primary Secondary 2 https://docs.mongodb.org/manual/core/replica-set-members/
  49. 49. ElasticSearch in HA Suggested Configuration: Deploy at least 2 ElasticSearch Servers https://www.elastic.co/guide/en/elasticsearch/guide/current/_add_failover.html
  50. 50. MySQL in HA Sorry, but the ways to put MySQL in HA are too many… I found this configuration with 2 master servers that should be the minimum for HA.
  51. 51. Polyglot Persistence in HA APPLICATION Servers = 13
  52. 52. The beauty of me is that I’m very rich. But this setup would cost too much even for me! You’re fired!!!
  53. 53. Multi-Model in High Availability (HA)
  54. 54. Multi-Model in HA APPLICATION Servers = 3 • OrientDB supports Multi-Master replication with flexible sharding • Zero-config cluster deployment lets you create a cluster of servers in a few minutes • When a new server connects to the cluster, the database is automatically shared • All clients are always notified about new servers, so in case of a crash a client can automatically switch to another available server with no failure at the application level
  55. 55. Final score
  56. 56. Final Score (1-3) Multi-Model Polyglot Low TCO (Costs) Easy Maintenance Easy Scalability Performance Easy Deployment Minimal Skills Product Variety Easy Synchronization Number of Features 3 3 1 1 1 3 3 3 3 3 3 3 2 1 1 1 3 1
  57. 57. Confidential OrientDB At a Glance 70,000 Downloads per month from 200+ countries 100+ Code contributors on Github and 15,000+ commits 1,000s Users from SMBs to Fortune 10 Companies 17+ Years of Product Research Global Coverage and 24x7 Support
  58. 58. Awards and Press Coverage 2015 Bossie Award Winner OrientDB is an interesting hybrid in the NoSQL world, combining features from a document database and a graph database. A new breed of database hopes to blend the best of NoSQL and RDBMS Multi-model databases may help tame the growing complexity of enterprise data. 11 cutting-edge databases worth exploring now OrientDB packages itself as a "second-generation graph database." In other words, the nodes in the graphs are documents waiting for arbitrary key-value pairs.
  59. 59. A Bright Future Graph DBMS increased their popularity by 500% within the last 2 years. Document DBMS are the 3rd fastest growing category. Forrester estimates that over 25 percent of enterprises will use graph databases by 2017. Among the top 50, OrientDB is the technology with the largest year-on-year growth (+22 positions).
  60. 60. Don't miss my presentation tomorrow at GraphDay, 10:00am: Working Towards an Unbreakable Graph Database that Scales
  61. 61. Thank you! @lgarulli Join the community, visit orientdb.com
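
Slide 11 contrasts O(1) relationship traversal with the O(log N) cost of index-based lookups. The sketch below illustrates only the general idea, in plain Python; it is not OrientDB's implementation. The `Vertex` class, both helper functions, and the sample data (borrowed from the names on slide 8) are hypothetical.

```python
import bisect


class Vertex:
    """A record that keeps direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.out = []  # direct links: following one needs no lookup


def neighbours_by_link(vertex):
    # Direct-link traversal: each hop is a pointer dereference, O(1) per edge.
    return [v.name for v in vertex.out]


def neighbours_by_index(sorted_edges, names, source_id):
    # Join-style traversal: binary search in a sorted (source, target) index
    # costs O(log N) before the first matching edge is even found.
    i = bisect.bisect_left(sorted_edges, (source_id,))
    result = []
    while i < len(sorted_edges) and sorted_edges[i][0] == source_id:
        result.append(names[sorted_edges[i][1]])
        i += 1
    return result


# Hypothetical sample data.
luca, band, genre = Vertex("Luca"), Vertex("Snow Patrol"), Vertex("Indie")
luca.out.append(band)
band.out.append(genre)

names = {0: "Luca", 1: "Snow Patrol", 2: "Indie"}
edges = sorted([(0, 1), (1, 2)])
```

Both functions return the same neighbours; the difference is that the index-based version pays a search cost that grows with the total number of edges, while the link-based version does not.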
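
Slides 31 and 35 give totals of 530ms (sequential) and 310ms (parallel async) for the polyglot page request. The transcript does not say unambiguously which timing belongs to which step, so the mapping below is an assumption, chosen because it reproduces both totals: the background update (10ms) is left out of the total, and step 4 is assumed to wait for step 3.

```python
# Assumed step-to-latency mapping in milliseconds (not stated on the slides).
STEPS_MS = {
    "request_page": 10,               # (1)
    "product_details": 50,            # (2)
    "recommendations": 200,           # (3)
    "recommended_product_info": 100,  # (4) assumed to depend on (3)
    "orders_availability": 150,       # (5)
    "concurrent_activity": 20,        # (6)
}
BACKGROUND_UPDATE_MS = 10  # (7) runs in background, excluded from the totals


def sequential_total(steps):
    # Every call waits for the previous one to finish.
    return sum(steps.values())


def parallel_total(steps):
    # After the initial request, independent branches run concurrently,
    # so the slowest branch determines the response time.
    branches = [
        steps["product_details"],
        steps["recommendations"] + steps["recommended_product_info"],
        steps["orders_availability"],
        steps["concurrent_activity"],
    ]
    return steps["request_page"] + max(branches)
```

Under these assumptions the parallel polyglot figure (310ms) is close to the 300ms that slide 33 reports for the multi-model version, which needs no application-level orchestration to get there.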
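
Slide 34 suggests caching product descriptions that rarely change. A minimal sketch using Python's standard `functools.lru_cache` follows; the in-memory `_PRODUCT_STORE`, the product data, and the read counter are hypothetical stand-ins for a real document database call.

```python
from functools import lru_cache

# Hypothetical stand-in for the product catalog database.
_PRODUCT_STORE = {42: {"name": "Hotel Austin", "description": "Downtown"}}
STORE_READS = {"count": 0}


@lru_cache(maxsize=1024)
def get_product_details(product_id):
    # Only reached on a cache miss; repeated requests skip the store.
    STORE_READS["count"] += 1
    # Return an immutable snapshot so callers cannot alter the cached value.
    return tuple(sorted(_PRODUCT_STORE[product_id].items()))
```

When a description does change, `get_product_details.cache_clear()` drops the cached entries. The same approach applied to recommendations would key the cache by product only, which is the trade-off the slide points out.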
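
Slide 43 makes the application responsible for keeping every DBMS in sync. The sketch below shows, under simplified assumptions, why that is harder than it sounds: a write that reaches one store but fails on the next must be undone by hand. `Store` and `save_product` are hypothetical, in-memory stand-ins, not real database clients.

```python
class Store:
    """Hypothetical in-memory stand-in for one database in the polyglot stack."""

    def __init__(self, name, fail=False):
        self.name = name
        self.data = {}
        self.fail = fail  # simulate an unreachable server

    def put(self, key, value):
        if self.fail:
            raise ConnectionError(f"{self.name} is unavailable")
        self.data[key] = value


def save_product(product_id, product, stores):
    # Write to every store; on failure, undo the writes already made so the
    # stores do not drift apart. A real system would also need retries and
    # would still be exposed to a crash between the write and the undo.
    written = []
    try:
        for store in stores:
            store.put(product_id, product)
            written.append(store)
    except ConnectionError:
        for store in written:
            store.data.pop(product_id, None)
        return False
    return True
```

With a single datastore, as on slide 41, this compensation logic is not needed because there is only one write.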
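
The "Servers = 13" figure on slide 51 follows from the minimum HA configurations given on slides 46 to 50. The arithmetic, with the per-product minimums taken from those slides:

```python
# Minimum servers per product for HA, as stated on slides 46-50.
HA_MINIMUM_SERVERS = {
    "Redis": 3,          # 3 boxes, each running a server and a Sentinel
    "Neo4j": 3,
    "MongoDB": 3,        # 1 primary and 2 secondaries
    "ElasticSearch": 2,
    "MySQL": 2,          # 2 masters
}

polyglot_servers = sum(HA_MINIMUM_SERVERS.values())
multi_model_servers = 3  # slide 54
```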
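
Slide 54 says clients are notified about new servers and can switch to another one after a crash. The sketch below shows that behaviour in the abstract; `FailoverClient`, its methods, and the `send` callback are hypothetical and do not reflect the actual OrientDB client API.

```python
class FailoverClient:
    """Hypothetical client that retries a request on the next known server."""

    def __init__(self, servers):
        self.servers = list(servers)

    def on_server_joined(self, server):
        # Called when the cluster announces a new member.
        if server not in self.servers:
            self.servers.append(server)

    def execute(self, send):
        # Try each known server in turn; the caller only sees an error
        # if every server is unreachable.
        last_error = None
        for server in list(self.servers):
            try:
                return send(server)
            except ConnectionError as error:
                last_error = error
        raise ConnectionError("no server available") from last_error
```

A caller passes a function that performs the request against one server; if that server is down, the same request is sent to the next one without the application handling the failure itself.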
