Polyglot Persistence
      Two Great Tastes
  That Taste Great Together!


                      John Wood
               ...
About Me
●   Software Developer at Interactive Mediums
●   Primarily work on a web application that allows
    our custome...
You Now Have A Choice
The RDBMS Is No Longer The
      Default Choice
●   Can be very difficult to scale horizontally
●   Schemas can be di...
NoSQL Databases Have Stepped
  Up To Address These Issues
●   Schema-less
●   Little to no data integrity enforcement...
But The RDBMS Is Far From Dead
●   Incredibly mature, and battle tested
●   Immediate and constant consistency
●   Integri...
Choice is good...right?
Decisions, Decisions...
You Don't Have to
     Choose
“You've got your chocolate in my peanut butter!”
Polyglot Persistence
pol●y●glot - Adjective
Knowing or using several languages
        per●sist●ence - Noun
The continued or prolonged exi...
Polyglot Persistence
The continued or prolonged existence of
   something using several languages
Polyglot Persistence
The continued or prolonged existence of
   something using several databases
“Polyglot Persistence, like
  polyglot programming, is all
    about choosing the right
persistence option for the task at...
Why On Earth Would
You Want To Do This?
CAP Theorem



  http://en.wikipedia.org/wiki/CAP_theorem
http://blog.nahurst.com/visual-guide-to-nosql-systems
Compromise
Consistency and
 Data Integrity
       +
 Scalability and
   Flexibility
Support A Wide Range
     of Storage
   Requirements
Get The Job Done
Faster, With Better
     Quality
DB Doesn't Just Stand For
       Database
Don't Swim Upstream
Possible Use Cases
Use A NoSQL Database
    For A Particular
  Application Feature
Use A NoSQL Database
  For Speedy Batch
      Processing
Use A NoSQL Database
For Distributed Logging
Use A NoSQL Database
   For Large Tables
Use A RDBMS For
    Reporting
Sounds Great!
What's The Catch?
Difficult For Data In
Different Databases To
        Interact
You Now Have To
Decide Where To Store
        Data
Increased Application
  And Deployment
     Complexity
Additional
Administrative
Responsibilities
Training
What Will This Do To
My Beautiful Code?
It's All About The Layers
class User < ActiveRecord::Base
end


class ContestEntry < CouchRest::ExtendedDocument
 property :entry_number
end
class User < ActiveRecord::Base
 def contest_entries
   ContestEntry.entries_for_user(self.id)
 end
end

class ContestEntr...
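
The layering on the two slides above can be sketched in plain Ruby. This is only an illustration, not the talk's actual code: the in-memory `RELATIONAL_USERS` hash and `DOCUMENT_ENTRIES` array are hypothetical stand-ins for the MySQL table and the CouchDB database, so the example runs without ActiveRecord or CouchRest. The point it tries to show is that each model owns the call that crosses into the other store, so callers never need to know which database holds which record.

```ruby
# Hypothetical in-memory stand-ins for the two stores. A real application
# would use ActiveRecord (relational) and a CouchDB client here instead.
RELATIONAL_USERS = {}   # plays the role of the users table
DOCUMENT_ENTRIES = []   # plays the role of the CouchDB database

class User
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  def save
    RELATIONAL_USERS[id] = self
    self
  end

  def self.find_by_id(id)
    RELATIONAL_USERS[id]
  end

  # Crosses the store boundary: asks the document side for this user's entries.
  def contest_entries
    ContestEntry.entries_for_user(id)
  end
end

class ContestEntry
  attr_reader :entry_number, :user_id

  def initialize(entry_number, user_id)
    @entry_number = entry_number
    @user_id = user_id
  end

  def save
    DOCUMENT_ENTRIES << self
    self
  end

  # Stands in for executing a CouchDB view keyed on user_id.
  def self.entries_for_user(user_id)
    DOCUMENT_ENTRIES.select { |entry| entry.user_id == user_id }
  end

  # Crosses the boundary the other way: looks the owner up relationally.
  def user
    User.find_by_id(user_id)
  end
end

alice = User.new(1, "Alice").save
ContestEntry.new(101, alice.id).save
ContestEntry.new(102, alice.id).save
ContestEntry.new(201, 2).save   # belongs to a user that was never saved

puts alice.contest_entries.map(&:entry_number).inspect
puts ContestEntry.entries_for_user(1).first.user.name
```

Because nothing enforces the link between the two stores, an entry can point at a user that does not exist (entry 201 here), which is the kind of integrity check the application layer has to take on itself.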
Additional Options
    Available
So, Who Is Actually
    Doing This?
●   Primary MySQL database with a backup
●   A few very large tables, containing 5M – 30M
    rows each, and growing quick...
●   Brought in a consultant to help us optimize our
    MySQL setup
●   Optimized slow queries
●   Added some indexes
●   ...
+
●   Migrated old data from large tables to CouchDB
●   Using CouchDB views to aggregate summary
    data
●   Data is impor...
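
To make the "views to aggregate summary data" idea concrete, here is a rough plain-Ruby imitation of the map/reduce pattern behind a CouchDB view. Real CouchDB views are normally written as JavaScript functions stored in a design document. The document fields (`campaign`, `status`) and the sample data are invented for illustration and do not come from the talk.

```ruby
# Hypothetical archived documents; the field names are invented.
docs = [
  { "campaign" => "spring", "status" => "delivered" },
  { "campaign" => "spring", "status" => "delivered" },
  { "campaign" => "spring", "status" => "failed" },
  { "campaign" => "summer", "status" => "delivered" }
]

# Map step: emit zero or more [key, value] pairs per document.
# Here, one pair for every delivered message, keyed by campaign.
map_fn = lambda do |doc|
  doc["status"] == "delivered" ? [[doc["campaign"], 1]] : []
end

# Reduce step: combine all values emitted under the same key.
reduce_fn = ->(values) { values.sum }

# Run the map over every document, group the pairs by key, then reduce.
emitted = docs.flat_map { |doc| map_fn.call(doc) }
summary = emitted
  .group_by(&:first)
  .transform_values { |pairs| reduce_fn.call(pairs.map(&:last)) }

puts summary.inspect
```

In CouchDB the grouped result is stored in the view index rather than recomputed on every read, which is one reason a statistics query against an up-to-date view can be fast.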
It's Not All Rainbows And Unicorns
●   CouchDB databases and views can be very
    large on disk
●   Some queries could not be substituted with
    CouchDB v...
http://twitter.com/about/opensource
●   Vertically and horizontally partitioned MySQL
●   Several layers of aggressive caching, all
    application managed
● ...
HBase



FlockDB
●   Migrating from MySQL to Cassandra as their
    main online data store
●   Hadoop/HBase used for people search feature
...
●   Increased availability
●   The ability to support new features
●   The ability to analyze their massive amount of
    ...
Right Tool For The Job
Thanks!
john_p_wood@yahoo.com
      @johnpwood
Polyglot Persistence - Two Great Tastes That Taste Great Together

The days of the relational database being a one-stop shop for all of your persistence needs are over. Although NoSQL databases address some issues that can't be addressed by relational databases, the opposite is true as well. The relational database offers an unparalleled feature set and rock-solid stability. The importance of using the right tool for the job should not be underestimated, and for some jobs, one tool is not enough. This talk focuses on the strengths and weaknesses of both relational and NoSQL databases, the benefits and challenges of polyglot persistence, and examples of polyglot persistence in the wild.

These slides were presented at WindyCityDB 2010.

Published in: Technology

Polyglot Persistence - Two Great Tastes That Taste Great Together

  1. 1. Polyglot Persistence Two Great Tastes That Taste Great Together! John Wood john_p_wood@yahoo.com @johnpwood
  2. 2. About Me ● Software Developer at Interactive Mediums ● Primarily work on a web application that allows our customers to engage and interact with their customers ● Writing code for about 15 years ● Tinkering with NoSQL for about 1.5 years ● Have a NoSQL solution that has been running in production for a year
  3. 3. You Now Have A Choice
  4. 4. You Now Have A Choice
  5. 5. You Now Have A Choice
  6. 6. You Now Have A Choice
  7. 7. You Now Have A Choice
  8. 8. You Now Have A Choice
  9. 9. You Now Have A Choice
  10. 10. You Now Have A Choice
  11. 11. You Now Have A Choice
  12. 12. You Now Have A Choice
  13. 13. The RDBMS Is No Longer The Default Choice
  14. 14. The RDBMS Is No Longer The Default Choice ● Can be very difficult to scale horizontally ● Schemas can be difficult to maintain and migrate ● For some applications, the data integrity features of the RDBMS are an unnecessary overhead ● Data constraints and JOINs can be expensive at runtime
  15. 15. NoSQL Databases Have Stepped Up To Address These Issues
  16. 16. NoSQL Databases Have Stepped Up To Address These Issues ● Schema-less ● Little to no data integrity enforcement ● Self-contained data ● Eventually consistent ● Easy to scale horizontally to add processing power and storage
  17. 17. But The RDBMS Is Far From Dead
  18. 18. But The RDBMS Is Far From Dead ● Incredibly mature, and battle tested ● Immediate and constant consistency ● Integrity of data is enforced ● Efficient use of storage space if data normalized properly ● Supported by everyone and everything (tools, frameworks, libraries, etc) ● Incredibly flexible and powerful query language ● Help is plentiful and easy to find
  19. 19. Choice is good...right?
  20. 20. Decisions, Decisions...
  21. 21. You Don't Have to Choose
  22. 22. “You've got your chocolate in my peanut butter!”
  23. 23. Polyglot Persistence
  24. 24. pol●y●glot - Adjective Knowing or using several languages
  25. 25. pol●y●glot - Adjective Knowing or using several languages per●sist●ence - Noun The continued or prolonged existence of something
  26. 26. Polyglot Persistence The continued or prolonged existence of something using several languages
  27. 27. Polyglot Persistence The continued or prolonged existence of something using several languages databases
  28. 28. “Polyglot Persistence, like polyglot programming, is all about choosing the right persistence option for the task at hand.” - Scott Leberknight, October, 2008 http://www.nearinfinity.com/blogs/scott_leberknight/polyglot_persistence.html
  29. 29. Why On Earth Would You Want To Do This?
  30. 30. CAP Theorem http://en.wikipedia.org/wiki/CAP_theorem
  31. 31. http://blog.nahurst.com/visual-guide-to-nosql-systems
  32. 32. Compromise
  33. 33. Consistency and Data Integrity + Scalability and Flexibility
  34. 34. Support A Wide Range of Storage Requirements
  35. 35. Get The Job Done Faster, With Better Quality
  36. 36. DB Doesn't Just Stand For Database
  37. 37. Don't Swim Upstream
  38. 38. Possible Use Cases
  39. 39. Use A NoSQL Database For A Particular Application Feature
  40. 40. Use A NoSQL Database For Speedy Batch Processing
  41. 41. Use A NoSQL Database For Distributed Logging
  42. 42. Use A NoSQL Database For Large Tables
  43. 43. Use A RDBMS For Reporting
  44. 44. Sounds Great! What's The Catch?
  45. 45. Difficult For Data In Different Databases To Interact
  46. 46. You Now Have To Decide Where To Store Data
  47. 47. Increased Application And Deployment Complexity
  48. 48. Additional Administrative Responsibilities
  49. 49. Training
  50. 50. What Will This Do To My Beautiful Code?
  51. 51. It's All About The Layers
  52. 52. class User < ActiveRecord::Base end class ContestEntry < CouchRest::ExtendedDocument property :entry_number end
  53. 53. class User < ActiveRecord::Base def contest_entries ContestEntry.entries_for_user(self.id) end end class ContestEntry < CouchRest::ExtendedDocument property :entry_number property :user_id def self.entries_for_user(user_id) # Execute your view to fetch the contest entries end def user User.find_by_id(user_id) end end
  54. 54. Additional Options Available
  55. 55. So, Who Is Actually Doing This?
  56. 56. ● Primary MySQL database with a backup ● A few very large tables, containing 5M – 30M rows each, and growing quickly ● Increasing query execution time ● Some pages on the web app were timing out ● Increasing database migration time ● Rigid schema of the RDBMS was preventing some planned features from moving forward
  57. 57. ● Brought in a consultant to help us optimize our MySQL setup ● Optimized slow queries ● Added some indexes ● Offloaded some work to the backup database ● Considered the use of summary tables for statistics
  58. 58. +
  59. 59. ● Migrated old data from large tables to CouchDB ● Using CouchDB views to aggregate summary data ● Data is imported and views are updated nightly ● Queries for statistics now very fast ● Using Lucene (via couchdb-lucene) for full text searching ● Taking full advantage of CouchDB's schema-less nature in several new application features
  60. 60. It's Not All Rainbows And Unicorns
  61. 61. ● CouchDB databases and views can be very large on disk ● Some queries could not be substituted with CouchDB views ● Indexing tens of millions of documents for full text search with Lucene takes weeks ● Development takes longer, as the map/reduce model requires additional thought and planning ● Changing/Upgrading views in production not straightforward http://www.couch.io/migrating-to-couchdb
  62. 62. http://twitter.com/about/opensource
  63. 63. ● Vertically and horizontally partitioned MySQL ● Several layers of aggressive caching, all application managed ● Schema changes impossible, resulting in the use of bitfields and piggyback tables ● Hardware intensive ● Error prone ● Hitting MySQL limits ● Already eventually consistent
  64. 64. HBase FlockDB
  65. 65. ● Migrating from MySQL to Cassandra as their main online data store ● Hadoop/HBase used for people search feature ● FlockDB used to manage the social graph ● Hadoop for analytics ● “As with all NoSQL systems, strengths in different situations” - Kevin Weil, Analytics Lead, Twitter http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010
  66. 66. ● Increased availability ● The ability to support new features ● The ability to analyze their massive amount of data in a reasonable amount of time http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010
  67. 67. Right Tool For The Job
  68. 68. Thanks! john_p_wood@yahoo.com @johnpwood