Polyglottany Is Not a Sin (MongoDB Boston, 2012-10-24)

  • SimpleReach is a social intelligence tool for content creators. We track every social action, on every major network, across the entire web in real time. That means every like, tweet, pin, stumble, and many more.

    1. Polyglottany Is Not A Sin. Eric Lubow, @elubow, elubow@simplereach.com, #MongoBoston
    8. Overview
        • SimpleReach
        • Definitions and Data Stores
        • Evolution to Polyglottany
        • Tie It Together
        • Final Thoughts
        • Questions
    9. Socially Intelligent
    12. Size
        • 150 million events recorded per day and growing
        • 600 million pageviews per month and growing
    13. Polyglot Persistence: "Polyglot Persistence, like polyglot programming, is all about choosing the right persistence option for the task at hand." (http://www.sleberknight.com/blog/sleberkn/entry/polyglot_persistence)
    14. Right Tool For The Job
    15. Decisions. Decisions.
        Data
        • What are my query patterns?
        • Is my data ingestion high volume/high velocity?
        • Am I batch loading data?
        • Am I write heavy or read heavy?
        • Are data relationships important?
        • Does my data need to be immediately available everywhere?
        • Are my display requirements for realtime data?
        • Do I need to aggregate data on the fly?
        • Is my data structured or unstructured?
        • Does my data lend itself to a specific design pattern?
        Tech
        • Is the encryption/authentication/authorization support sufficient for my needs?
        • Are there monitoring architectures already built?
        • Are there best practices guides already
        • Will the data need to be distributed?
        • How fault tolerant is the system?
        • What supporting tools do I need?
        • Is there support for my language?
        Financial
        • Am I cloud based?
        • Am I hardware based?
        • Am I a cloud/iron hybrid?
        • How much am I willing to spend?
        • How much am I willing to spend if something goes wrong?
        Other
        • Do I have legal requirements (HIPAA/FIPS/Sarbanes Oxley/PII)?
        • What kind of enterprise support is available?
        • What is the community like?
        • Does the product roadmap pertain to my roadmap?
    16. No One Size Fits All
    17. Tools C*
    18. Free vs. Cost
    19. Languages
    20. Pre-Scale
    21. SimpleReach Pre-Scale
    22. Scale
    23. SimpleReach C*
    24. Mongo Conference
    32. Cassandra C*
        • Large data volume ingestion at high velocity
        • Really fast writes to many locations (eventual consistency)
        • Query by column groups within rows (slicing)
        • OpsCenter
        • Data toolkit: more than a data storage layer
        • TTLs for small group aggregation
        • Wrote Helenus, Node.js driver for Cassandra
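The "query by column groups within rows (slicing)" bullet can be pictured with a small pure-Python model. This is only a sketch of the idea, not Cassandra's API or the Helenus driver: the `WideRow` class, its method names, and the sample column names are hypothetical, and it assumes a row whose columns are kept sorted by name so that a contiguous range can be read in one pass.

```python
import bisect


class WideRow:
    """Toy model of one wide row: columns kept sorted by column name."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> value

    def insert(self, name, value):
        # Keep names sorted so that a slice is a contiguous range.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


# Hypothetical example: per-hour event counts stored as columns of one row.
row = WideRow()
for hour, count in [("2012-10-24T09", 120), ("2012-10-24T10", 340),
                    ("2012-10-24T11", 275), ("2012-10-25T09", 90)]:
    row.insert(hour, count)

# Slice out every column for 2012-10-24 without touching other days.
day = row.slice("2012-10-24T00", "2012-10-24T23")
print(day)
```

Naming columns by timestamp is one common way to make a time window a single slice; whether the talk's schema did this is not stated in the slides.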
    40. MongoDB
        • Fast atomic increments (Node.js is native JSON)
        • Sharding
        • Solid ORM for Rails (Mongoid)
        • Fast access for pub/sub of durable/persisted documents
        • B-Tree Indexes
        • Document based via JSON
        • TTLs for ephemeral data
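The "fast atomic increments" bullet most likely refers to MongoDB's `$inc` update operator. As a rough illustration of what such an update document looks like and what it does to a stored document, here is a pure-Python sketch that applies `$inc` to an in-memory dict. The `apply_inc` helper and the field names (`url`, `counts.tweets`, `counts.likes`) are hypothetical; a real deployment would send the same update document to the server through a driver, and the atomicity comes from the server, not from this code.

```python
def apply_inc(document, update):
    """Apply the {"$inc": {...}} part of an update to a dict, in place.

    Dotted field names such as "counts.tweets" address nested fields,
    and missing fields start from 0, mirroring how $inc behaves.
    """
    for path, amount in update.get("$inc", {}).items():
        *parents, leaf = path.split(".")
        target = document
        for key in parents:
            target = target.setdefault(key, {})
        target[leaf] = target.get(leaf, 0) + amount
    return document


# Hypothetical per-article counter document.
article = {"url": "http://example.com/post", "counts": {"tweets": 41}}

# One update document bumps several counters at once.
update = {"$inc": {"counts.tweets": 1, "counts.likes": 3}}
apply_inc(article, update)
print(article)
```

Because the update is plain JSON, a Node.js process can build it as an ordinary object literal, which may be what the "Node.js is native JSON" remark on the slide points at.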
    47. Redis
        • Supports hundreds of thousands of transactions per second
        • Great caching engine
        • Supports useful variable types like sets, sorted sets, lists
        • Everything is guaranteed to be memory mapped (mmap)
        • Transactional and supports bulk operations
        • Centralized queueing and locking system
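Sorted sets are the kind of type that makes Redis handy for rankings such as "most shared right now." The sketch below is a pure-Python stand-in, not the Redis client API: the `SortedSet` class only mimics the behavior of the real `ZINCRBY` and `ZREVRANGE` commands, and the member names are made up for illustration.

```python
class SortedSet:
    """Toy stand-in for a Redis sorted set: unique members, each with a score."""

    def __init__(self):
        self._scores = {}

    def zincrby(self, amount, member):
        # Like ZINCRBY: create the member at 0 if absent, then add.
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def zrevrange(self, start, stop):
        # Like ZREVRANGE: highest score first, inclusive stop index.
        # Only non-negative indexes are handled in this sketch.
        ranked = sorted(self._scores.items(),
                        key=lambda kv: (kv[1], kv[0]), reverse=True)
        return ranked[start:stop + 1]


# Hypothetical stream of share events, one per article id.
trending = SortedSet()
for article_id in ["a", "b", "a", "c", "a", "b"]:
    trending.zincrby(1, article_id)

top_two = trending.zrevrange(0, 1)
print(top_two)
```

In Redis itself the ranking is maintained as scores change, so reading the top entries does not require re-sorting on every call as this toy version does.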
    53. Infobright
        • Works with standard MySQL driver
        • Column stores for ad-hoc analytics queries in SQL
        • Databases built for business intelligence
        • Heavy compression of data
        • Pre-aggregated data (Knowledge Grid)
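The "pre-aggregated data (Knowledge Grid)" bullet is easier to picture with a toy model. The sketch below is an assumption-laden illustration, not Infobright's implementation: it splits one column into fixed-size packs, keeps min, max, sum, and count for each pack, and uses those summaries to answer a range query while scanning only the packs that straddle a range boundary. The `Pack` class, the `build_packs` and `sum_between` functions, and the pack size are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pack:
    values: list  # raw column values in this pack
    lo: int       # minimum value in the pack
    hi: int       # maximum value in the pack
    total: int    # sum of the values
    count: int    # number of values


def build_packs(column, pack_size):
    """Split a column into packs and precompute a summary for each."""
    packs = []
    for i in range(0, len(column), pack_size):
        chunk = column[i:i + pack_size]
        packs.append(Pack(chunk, min(chunk), max(chunk), sum(chunk), len(chunk)))
    return packs


def sum_between(packs, low, high):
    """SUM(value) WHERE low <= value <= high, plus how many packs were scanned."""
    total, scanned = 0, 0
    for p in packs:
        if p.hi < low or p.lo > high:
            continue                  # irrelevant pack: skipped entirely
        if low <= p.lo and p.hi <= high:
            total += p.total          # fully inside: answered from the summary
        else:
            scanned += 1              # straddles a boundary: must read values
            total += sum(v for v in p.values if low <= v <= high)
    return total, scanned


pageviews = list(range(1, 101))       # hypothetical column holding 1..100
packs = build_packs(pageviews, pack_size=10)
total, scanned = sum_between(packs, 25, 80)
print(total, scanned)
```

The payoff depends on how well the data is clustered: with sorted or time-ordered values most packs are either skipped or answered from their summary, while randomly ordered values force more scans.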
    58. Ruby, Node.js, Python
        • Polyglottany doesn’t only apply to data stores
        • Each language has its own benefit to each data storage layer
        • Each language has its own individual benefits
        • JSON, APIs, Performance
    59. Choice
    67. Cons
        • Redis: can only utilize a single core. Serialization/deserialization (SerDe) cost.
        • MySQL Column Store: DELETEs and UPDATEs are VERY expensive
        • Cassandra: no B-tree indexes
        • Mongo: indexes must fit in memory. Forced replica ping times
        • Python: whitespace. Community
        • Ruby: not high performance enough for our standards
        • JavaScript (Node.js): bad for CPU or IO intensive workloads
    68. Tying It Together: "Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data."
    74. Tying It Together
        • Service Oriented Architecture (Internal API)
        • Data accuracy checks: visual and programmatic
        • Built framework for testing out storage engines
        • Access to many toolsets (for all languages and DBs)
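One way to picture the "internal API" and "framework for testing out storage engines" bullets: every store sits behind the same small interface, and one programmatic accuracy check runs against each candidate. The sketch is hypothetical throughout (the `CounterStore` protocol, the two in-memory stand-ins, and `check_engine` are not from the talk); real adapters would wrap Cassandra, MongoDB, or Redis drivers behind the same two methods.

```python
from collections import Counter
from typing import Protocol


class CounterStore(Protocol):
    """Minimal interface a candidate storage engine adapter must offer."""

    def incr(self, key: str, amount: int = 1) -> None: ...
    def get(self, key: str) -> int: ...


class DictStore:
    """Stand-in for a store that updates counters in place."""

    def __init__(self):
        self._data = {}

    def incr(self, key, amount=1):
        self._data[key] = self._data.get(key, 0) + amount

    def get(self, key):
        return self._data.get(key, 0)


class LogStore:
    """Stand-in for an append-only store that aggregates on read."""

    def __init__(self):
        self._log = []

    def incr(self, key, amount=1):
        self._log.append((key, amount))

    def get(self, key):
        return sum(a for k, a in self._log if k == key)


def check_engine(store: CounterStore, events):
    """Replay events into a store and compare with an independent tally."""
    expected = Counter()
    for key, amount in events:
        store.incr(key, amount)
        expected[key] += amount
    return all(store.get(k) == v for k, v in expected.items())


# Hypothetical event stream: (article id, number of social actions).
events = [("post-1", 1), ("post-2", 5), ("post-1", 2)]
results = {type(s).__name__: check_engine(s, events)
           for s in (DictStore(), LogStore())}
print(results)
```

Keeping callers on the interface rather than on a specific driver is what lets a new engine be trialed by writing one adapter and rerunning the same check.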
    75. Service Architecture (diagram labels: Analytics, C*, Real-time, C*, Internal API)
    76. Distributed Architecture (diagram)
        • Availability zones: US-EAST-1a, US-EAST-1b, US-EAST-1e
        • CASSANDRA-0001, CASSANDRA-0002, CASSANDRA-0003, CASSANDRA-0010, CASSANDRA-0011, CASSANDRA-0012
        • REDIS-0001A, REDIS-0001B
        • MYSQL-0001, MYSQL-0002
        • MONGO-SHARD-0000-A, MONGO-SHARD-0000-B, MONGO-SHARD-0001-B, MONGO-SHARD-0001-A, MONGO-SHARD-0002-B, MONGO-SHARD-0002-A
        • iAPI-0001, iAPI-0002, iAPI-0003
    82. Points To Consider
        • Data consistency: same in all data stores
        • How important is data durability?
        • Managing many servers (Chef, AWS, CSSH)
        • Managing and learning many different applications and tuning for them
        • Expertise
    86. Expertise
        • What happens when you need help?
        • How do you become experts?
        • What happens when you need more experts?
    92. Summary
        • Polyglottany is not a sin
        • Know your data read/write patterns
        • Know the tools available to you
        • Know your compromises
        • Expertise
    93. We’re Hiring
    94. Questions are guaranteed in life. Answers aren’t. Eric Lubow, @elubow, elubow@simplereach.com, #MongoBoston. Thank you.
