Micro-Blogging for The Enterprise (MongoDB)

Transition from relational structure to MongoDB

Published in: Technology
1 Comment
  • MongoDB looks very appealing in how it organizes data as documents. It can improve speed and performance.
Notes
  • The enterprise is a closed ecosystem compared with the openness of the Internet, and it has its own set of rules.
    You must comply with internal technical standards, local regulations, enterprise architecture regulations and so on.
    This project was no different: we started building the application on a traditional enterprise relational database, which in this case was Oracle.
    The DB structure was simple, with only a few tables, but the solution later became convoluted as the business rules grew more complex.
    On top of that, the performance of the app degraded as the volume of data grew.
    We took a step back and translated the solution into the elegance of a document-based DB... yeah, MongoDB.
  • I am going to focus on:
    - What the enterprise wants from a micro-blogging tool, and what the major requirements are.
    - The kind of relational DB structure we started with.
    - Performance: as the data volume grew, the app started to degrade. We did some server-side fine-tuning, but we were not solving the actual problem.
    - Caching: this is the very first step we developers take, cache it and hide the problem under the covers :))
    - How MongoDB helped us, and how we took the hybrid approach.
  • - Enterprise micro-blogging = a combination of traditional micro-blogging like Twitter and social networking sites like Facebook.
    - The best of the two worlds (a thread-like structure of activities).
    - Next, let's have a look at some of the requirements we had to start with.
  • - A thread-like structure of activities.
    - Activities done by the people you follow flow to your timeline.
    - Having to add filters was the tipping point where the code started to get complex.
    - Let's have a look at a typical activity thread.
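As a rough illustration of the two rules in the note above (activities by people you follow flow to your timeline, and threads can then be filtered), here is a minimal Python sketch. The `Activity` fields, the `build_timeline` helper and all sample data are hypothetical; the role and office filters follow the deck's "Filtering messages (role/office etc)" requirement, but nothing here is taken from the original application.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Activity:
    """One post in the activity stream (hypothetical fields)."""
    author: str
    text: str
    role: str    # role of the author, e.g. "developer"
    office: str  # office of the author, e.g. "Pune"


def build_timeline(
    activities: Iterable[Activity],
    following: Set[str],
    role: Optional[str] = None,
    office: Optional[str] = None,
) -> List[Activity]:
    """Keep activities by followed authors, then apply the optional filters."""
    timeline = []
    for activity in activities:
        if activity.author not in following:
            continue  # only people I follow reach my timeline
        if role is not None and activity.role != role:
            continue
        if office is not None and activity.office != office:
            continue
        timeline.append(activity)
    return timeline


stream = [
    Activity("alice", "Deployed the new build", "developer", "Pune"),
    Activity("bob", "Quarterly numbers are out", "manager", "Mumbai"),
    Activity("carol", "Looking for reviewers", "developer", "Mumbai"),
]

mine = build_timeline(stream, following={"alice", "carol"})
mine_mumbai = build_timeline(stream, following={"alice", "carol"}, office="Mumbai")
```

Each extra filter adds one more condition to the loop, which hints at how the real query logic could grow once filters are combined with threads, comments and likes.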
  • - This is my timeline.
    - A message from John Doe appears on my timeline because I follow him.
    - Commenting and liking are the ways to participate.
    - I can filter threads based on role and office.
    - Let's have a look at the DB structure we had to start with.
  • - Integration with the legacy database was one of the reasons we started with a relational Oracle DB.
    - Let's quickly look at another set of requirements, which started to make things more complex.
  • We treated hash tags as virtual persons.
    To make matters even worse, we had to accommodate for...
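One way to read "hash tags as virtual persons" is that a hash tag becomes just another followable entry in the same follow list as a person, so the timeline rule needs no special case for tags. The sketch below is an assumption about that idea, not the original schema; `FollowGraph`, `reaches` and the sample data are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set

HASHTAG = re.compile(r"#\w+")


class FollowGraph:
    """Follow list in which people and hash tags share one namespace."""

    def __init__(self) -> None:
        self._following: Dict[str, Set[str]] = defaultdict(set)

    def follow(self, follower: str, target: str) -> None:
        # target is either a user name or a tag such as "#mongodb"
        self._following[follower].add(target.lower())

    def reaches(self, follower: str, author: str, text: str) -> bool:
        """True if a post should flow to the follower's timeline."""
        # The post's sources are its real author plus its "virtual" tag authors.
        sources = {author.lower()} | {t.lower() for t in HASHTAG.findall(text)}
        return bool(sources & self._following[follower])


graph = FollowGraph()
graph.follow("me", "martha")
graph.follow("me", "#mongodb")

from_followed_person = graph.reaches("me", "martha", "Lunch anyone?")
from_followed_tag = graph.reaches("me", "stranger", "Trying out #MongoDB today")
unrelated = graph.reaches("me", "stranger", "Hello world")
```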
  • Two things to observe:
    - The thread came to my profile because I am following the hash tag.
    - Martha (whom I follow) commented.
    Let's have a look at the impact these requirements had on the DB structure.
  • - Two STIs (single-table inheritance hierarchies).
    - And this is not the end.
    - To build the visual threads with all the requirements we saw, we had to start writing hand-crafted SQL queries.
    - That is not the usual Rails way of using ActiveRecord.
    - Let me show you what they look like.
  • - The code became complex and unmaintainable.
  • - We did a quick benchmark of the SQL queries.
  • - How many users under load? (10-user load test)
    - Benchmark environment (MySQL or Oracle).
    - Response time by data volume: 2m (200 ms), 4m (600 ms), 8m (1100 ms), 10m (1800 ms), 12m (2500 ms).
    - The time increases steeply, faster than linearly overall.
    - The first thing that struck us: let's cache, baby!!
    - And the default option was Redis.
  • - But we soon realized that caching was not going to solve our problem; in fact, it would have made matters worse.
    - We started looking at whether a document-based DB could help solve the problem and, at the same time, scale with increasingly complex business requirements.
  • - Which is true.
    - Each activity thread is like a small document.
  • - We are storing the user activity as a relational structure.
    - We convert the relational structure, run our business rules and convert the result into JSON format.
    - We give it to the view layer to lay out the result.
    - Instead, we can store the activities as documents and cut short the whole deal of business rules etc.
    - Store them the way the business wants us to show them.
    - We started evaluating document-based DBs, and again the default was MongoDB.
    - You might have this question...
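To make "store them the way the business wants us to show them" concrete, here is a small sketch of one activity thread held as a single nested document, using a plain Python dict so that it runs without a database. Every field name and value is hypothetical and not taken from the original schema.

```python
import json

# One activity thread stored as a single nested document.
# All field names and values here are hypothetical.
thread = {
    "_id": "thread-1",
    "author": "john",
    "message": "Release 1.2 is live",
    "hashtags": ["#release"],
    "likes": ["martha"],
    "comments": [
        {"author": "martha", "text": "Great work!", "likes": []},
    ],
}


def add_comment(doc: dict, author: str, text: str) -> None:
    """Append a comment in place; the thread stays one document."""
    doc["comments"].append({"author": author, "text": text, "likes": []})


add_comment(thread, "amit", "Thanks, Martha")

# The view layer can consume the stored shape directly as JSON.
payload = json.dumps(thread, indent=2)
```

Because the message, its comments and its likes live together, rendering a thread is a single read with no join and no relational-to-JSON conversion step.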
  • - I am pretty sure everyone sitting in this room is convinced why MongoDB :)))
    - We took the hybrid approach because we also had to deal with integrating the legacy tables.
  • - Let's have a look at the simple architecture diagram.
  • We are still in the early stages, and there are so many things to learn from this community.
    - The next obvious question: how did we migrate the GBs of data from the relational to the document structure?
  • - We already had the code to translate the relational structure to JSON format (for the view layer).
    - We used it to generate the document structure for Mongo.
    - Let's have a look at some benchmarks.
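The migration idea in the note above, reusing the relational-to-JSON translation to produce documents, might look roughly like the following. The row layouts for `messages`, `comments` and `likes` and the `rows_to_documents` helper are hypothetical; the original translation code is not shown in the deck.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical rows as they might come back from three relational tables.
messages = [
    {"id": 1, "author": "john", "body": "Release 1.2 is live"},
    {"id": 2, "author": "martha", "body": "Team lunch on Friday"},
]
comments = [
    {"id": 10, "message_id": 1, "author": "martha", "body": "Great work!"},
    {"id": 11, "message_id": 1, "author": "amit", "body": "Congrats"},
]
likes = [
    {"message_id": 1, "user": "amit"},
    {"message_id": 2, "user": "john"},
]


def rows_to_documents(messages, comments, likes) -> List[Dict]:
    """Group child rows under their parent message, one document per thread."""
    comments_by_message = defaultdict(list)
    for row in comments:
        comments_by_message[row["message_id"]].append(
            {"author": row["author"], "text": row["body"]}
        )
    likes_by_message = defaultdict(list)
    for row in likes:
        likes_by_message[row["message_id"]].append(row["user"])

    return [
        {
            "_id": m["id"],
            "author": m["author"],
            "message": m["body"],
            "comments": comments_by_message[m["id"]],
            "likes": likes_by_message[m["id"]],
        }
        for m in messages
    ]


documents = rows_to_documents(messages, comments, likes)
```

Each resulting dict could then be handed to a MongoDB driver's insert call; that step is left out so the sketch runs without a database.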
  • SQL:
    - 2m (200 ms), 4m (600 ms), 8m (1100 ms), 10m (1800 ms), 12m (2500 ms)
    MongoDB:
    - ~280 ms
    - So now our code is...
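The figures quoted in the note above can be put side by side with a little arithmetic. This sketch only reuses the numbers from the note; the deck does not say at which data volume the ~280 ms MongoDB figure was measured, so comparing it against every SQL data point is an assumption.

```python
# Figures quoted in the note: data volume in millions -> SQL response in ms.
sql_ms = {2: 200, 4: 600, 8: 1100, 10: 1800, 12: 2500}
mongo_ms = 280  # the single "~280ms" figure quoted for MongoDB

# Response time per million: a flat value would mean linear scaling.
ms_per_million = {volume: ms / volume for volume, ms in sql_ms.items()}

# How many times slower SQL is than the quoted MongoDB figure at each volume
# (assumes the MongoDB figure is comparable at every volume).
slowdown = {volume: round(ms / mongo_ms, 1) for volume, ms in sql_ms.items()}

volume_growth = 12 / 2                # data grew 6x from 2m to 12m
time_growth = sql_ms[12] / sql_ms[2]  # response time grew 12.5x over that range
```

Response time per million roughly doubles between the smallest and largest volumes, which is why the SQL curve is better described as faster than linear than as strictly exponential.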
  • - Not a foolproof solution.
    - But it has worked for us.
  • A few gotchas:
    - Do not forget the schema design.
    - We named the collection bloggers and later changed it to blogger.
  • Micro-Blogging for The Enterprise (MongoDB)

    1. MICRO-BLOGGING FOR THE ENTERPRISE
    2-7. Who Am I: Amit Kumar; Consultant and Rubyist; TATA Consultancy Services Limited; twitter.com/toamit; github.com/toamitkumar
    8-14. AGENDA: Micro-Blogging for “The Enterprise”; Relational DB Structure; Performance issues; Caching?; MongoDB; Hybrid Approach
    15. What Micro-Blogging Means To Enterprise
    16-22. Enterprise Demands: Message threads (Facebook-like experience); Activities of follower list; Two timelines (Individual, Public); Filtering messages (role/office etc.)
    23-28. Timeline
    29. Relational DB Structure
    30-32. Traditional Relational Design
    33-35. Enterprise Demands - II: Most popular posts; Hash-tags (#hashtag)
    36-38. Enterprise Demands - II: Ability to follow hash-tags
    39-43. Enterprise Demands - II: Activity on hash-tag flows to timeline
    44-46. Enterprise Demands - II: Hash-tag on comment will tag the whole thread
    47-50. Traditional Relational Design
    51-55. COMPLEX AND UNMAINTAINABLE CODE
    56. PERFORMANCE DEGRADES WITH INCREASING DATA VOLUME
    57-61. Performance Benchmark: SQL query benchmarks; 10,000,000 messages; 200,000,000 comments; 50,000,000 message likes. [Chart: response time in ms (axis 0 to 3000) against data volume in millions (2, 4, 8, 10, 12)]
    62. Caching? (Redis)
    63-67. Caching Solution Like: Break point in Architecture; No Query Interface; No clustering support; More messy code
    68-69. A Document Based DB?: Activity threads are like nested “DOCUMENTS”
    70. One Document
    71-74. A Document Based DB?: Storing them as relational structure; Converting them back to documents; Store them as “Activity Documents”
    75-76. Why MONGODB? It is Cool
    77. Hybrid Approach (Relational DB + MongoDB)
    78-79. Hybrid Approach
    80-81. Architecture
    82. Data Migration From Relational DB TO MongoDB WAS EASY!!
    83-84. Performance Benchmark: SQL QUERY vs MongoDB. [Chart: response in ms (axis 0 to 3000) against data volume in millions (2, 4, 8, 10, 12); series: SQL, MongoDB]
    85-89. Simple, Clean And Maintainable Code. BUT WAIT !!! TRANSACTION??
    90-91. Transaction Management
    92-95. Scaling with MongoDB: Indexing; Sharding?; 8 GB of Data
    96. Monitoring Mongos: mongostat; Http Console (localhost:28017)
    97. Monitoring Mongos
    98-99. Relational DB And MongoDB Live In Harmony Together
    100. THANK YOU
