
NoSQL


My presentation in the Architects forum

  • SQL databases approach data in the form of sets and tables. Incidentally, this strength soon becomes a weakness. Assumptions made: data is represented in the form of tables (rows and columns); data in each table can be related to data in another; data can, or has to, be searchable through all columns. Strengths: data manipulation through set theory; relational constraints enforced by the management system. Weaknesses: relatedness becomes an overhead once the data grows really large; a large volume of writes in a SQL database puts a heavy burden on the DBMS as well as on the storage disk.
  • NoSQL is a collection of databases that avoid the drawbacks of RDBMS without completely giving up on relational models. They are not strict about certain core RDBMS concepts such as ACID compliance and other integrity constraints. The priority is to support high levels of scalability through easy partitioning across multiple cheap, commodity machines, by giving up on the consistency that SQL databases aim to deliver, along with some of the relatedness of the data.
  • The CAP theorem states that any shared-data system can achieve only two of these three properties: Consistency (all database clients see the same data, even with concurrent updates), Availability (all database clients are able to access some version of the data), and Partition tolerance (the database can be split over multiple servers). See: http://www.julianbrowne.com/article/viewer/brewers-cap-theorem http://devblog.streamy.com/2009/08/24/cap-theorem/ http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/

    1. 1. Data, data, data. I cannot make bricks without clay. <br />Sherlock Holmes, Sherlock Holmes [2009]<br />
    2. 2. Data<br />Qualitative or Quantitative attributes of a variable or set of variables<br />Lowest level of abstraction from which information and then knowledge are derived.<br />Representation of a fact, figure and idea.<br />
    3. 3. A well organized newspaper or a clumsy, cluttered one?<br />
    4. 4. Data explosion<br />From Gigabytes to Terabytes to Petabytes to perhaps (I’m out of nomenclature)-bytes<br />
    5. 5. NoSQL<br />= Not Only SQL<br />!= No to SQL<br />!= Never SQL<br />
    6. 6. Open Source<br />Abridged version of this presentation and notes will be available for everyone.<br />Distributed under no License<br />FREE AS IN SPEECH AND BEER<br />
    7. 7. WEB 2.0<br />DDBMS<br />RDBMS performance<br />OODB<br />RnD<br />Cloud Computing<br />Multiple Solutions<br />Necessity is the mother of Invention<br />
    8. 8. SQL Databases, the ‘Hammer’<br />It’s a wonderful tool<br />
    9. 9. Commercial SQL Databases<br />Even Gods use it<br />Design<br />Power<br />Ergonomics<br />Ease of use<br />Features<br />Warranty<br />Upgrades<br />Apart from<br />Hole in the Pocket<br />
    10. 10. Nail is a nail, Screw is a screw<br />Hammering a screw or Screw driving a nail is FOOLISHNESS!<br />
    11. 11. Non-relational next generation operational data stores and databases<br />What?<br />NoSQL is a new look at data to deliver:<br />High Performance
    12. 12. Unlimited horizontal scalability
    13. 13. Economic, common, unreliable hardware
    14. 14. Auto Sharding
    15. 15. Support for wide range of data
    16. 16. Recursive, Hierarchical
    17. 17. Non-Rigid
    18. 18. High Availability<br />What? (Continued…)<br />Partly or completely independent of RDBMS concepts<br />No specific implementation<br />Breakthrough Approaches<br />Key:<br />Non-relational approach<br />Non-ACIDness<br />A STEP BACKWARDS, THEN MANY STEPS FORWARD<br />
    19. 19. NoSQL, the ‘screwdriver’<br />Yet another tool in our repository to go along with the hammer<br />
    20. 20. NoSQL is about choice<br />Not all problems are nails.<br />Not all screws are same.<br />GOOD PROGRAMMING PRACTICE: <br />Know your tools and use them appropriately<br />
    21. 21. SQL Databases<br />Data<br />Relational<br />Tabular – Rows/Columns<br />Interface<br />SQL<br />Basic Design Inspiration<br />Set Theory<br />ACID Design<br />Scale Up Design<br />Oracle
    22. 22. MySQL
    23. 23. Teradata
    24. 24. SQLite
    25. 25. SQL Server<br />And many more<br />
    26. 26. Why?<br />Is all data really relational?
    27. 27. If consistency is already ensured, do we have to enforce/check it again at the database level?
    28. 28. Are RDBMS ready for the challenges of the future, like:
    29. 29. Dynamic schema/metadata
    30. 30. Huge amounts of data
    31. 31. Through horizontal auto scaling
    32. 32. Ability to handle complex data types
    33. 33. Images, video, audio and much more<br />Not Really!<br />
    34. 34. Why? (Continued…)<br />RDBMS drawbacks:<br />Scalability<br />CRUD<br />Performance<br />Write Overhead<br />Limited by single disk architecture<br />Lack of In Memory design<br />Rigid schema design<br />And more …..<br />
    35. 35. HAMMERS<br />Are under some<br />Hammering<br />
    36. 36. DRAWBACKS<br />DEEP DIVE<br />
    37. 37. Scalability<br />True Scalability<br />Horizontal Scaling<br />Transparency to the application<br />No single point of failure<br />Problems with SQL databases<br />Vertical Scaling<br />Partitioning aka Sharding<br />Read Slaves<br />Anti Patterns<br />Normalized Data<br />Joins<br />ACID Transactions<br />
    38. 38. No Breadcrumbs<br />CRUD is crude<br />Delete/Update strategy is improper<br />CRA!<br />Create, Read, Archive – way to go ahead<br />Audit information is lost in CRUD but not in the case of CRA<br />
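The Create, Read, Archive idea on this slide can be sketched as an append-only store in which a write never overwrites the previous value. This is only an illustration of the idea, not the API of any particular database; the `ArchiveStore` class and its method names are hypothetical.

```python
class ArchiveStore:
    """Append-only store: writes never overwrite, so history is kept."""

    def __init__(self):
        self._versions = {}  # key -> list of every value ever written

    def create(self, key, value):
        # A new write is appended; earlier values remain as the audit trail.
        self._versions.setdefault(key, []).append(value)

    def read(self, key):
        # The current value is simply the most recent version.
        return self._versions[key][-1]

    def history(self, key):
        # Archived versions, oldest first.
        return list(self._versions[key])


store = ArchiveStore()
store.create("user:1", {"city": "Pune"})
store.create("user:1", {"city": "Mumbai"})
```

With update-in-place, the first value would be gone after the second write; here `store.history("user:1")` still returns both.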
    39. 39. Naive Data Support<br />Not designed for <br />Complex Data Structures<br />Recursive<br />Hierarchical<br />Ordered List<br />Circular<br />Dynamic Metadata<br />
    40. 40. Logical/Physical separation concerns<br />Relational model -> Logical Model<br />RDBMS implement it at physical level<br />Using Multiple indices<br />Artificial overhead in managing the database<br />Frequent drop and create index to make DB perform<br />
    41. 41. Spinning Disk Storage<br />Design flaw for most RDBMS systems<br />With cheaper memory, a memory-based approach should also be included in the design<br />Defiance of Moore’s law<br />Disk reads grew only 12.5 times in about 50 years<br />Disk writes grew even less.<br />Disk write is expensive.<br />RDBMS make things worse by writing more.<br />ACID rains are UNHEALTHY<br />
    42. 42. Think ‘Out of the ROM’<br />
    43. 43. At Snail’s pace<br />RDBMS engine growth – SLOW<br />Optimizations have been minor since initial days<br />Majority of growth due to Moore’s law<br />Faster hardware<br />Slightly faster storage<br />Faster memory<br />What happens when Moore’s law slows down due to external factors such as heat generation?<br />
    44. 44. Database size limits<br />RDBMS are too slow<br />Over multiterabyte and petabyte databases<br />Purpose-designed parallel processing would be needed to handle such volumes of data in an RDBMS.<br />
    45. 45. RDBMS<br /> has been around for years <br />and is proven technology<br />What about NoSQL?<br />
    46. 46. RDBMS<br />grew fast but <br />growth slowed down over time and <br />might eventually plateau<br />NoSQL<br />unarguably a new, immature tool, <br />has been growing faster than RDBMS ever did<br />and is being supported by the Big Players<br />
    47. 47. Did you say<br />BIG PLAYERS!<br />WHO?<br />
    48. 48. NoSQL Real World Implementations<br />Google – BigTable
    49. 49. Facebook – HBase
    50. 50. Digg – Cassandra
    51. 51. Amazon – Dynamo
    52. 52. Trend Micro – HBase
    53. 53. Netflix – Amazon SimpleDB
    54. 54. Shutterfly – MongoDB
    55. 55. LinkedIn – Voldemort<br />and more<br />Microsoft is considering NoSQL for Azure services, and so is Twitter<br />Are we next?<br />Major IT companies have implemented, or even better, created their own NoSQL stores to manage huge data stores that couldn’t be managed by SQL databases.<br />
    56. 56. We are used to <br />SQL and relatedness, <br />why can’t they just fix RDBMS<br />to handle Big Data<br />STORAGE SEEK RATES<br />Large writes and ACID being a huge limitation<br />Big Data can be handled via <br />Scale Out/Partitionability across Multiple Nodes<br />
    57. 57. CAP Theorem<br />Applies to distributed shared data system<br />
    58. 58. CAP THEOREM <br />
    59. 59. A Deeper look<br />Consistency: The system is in a consistent state after an operation<br />All clients see the same data<br />Strong Consistency(ACID) vs. Eventual (BASE)<br />Availability: ‘Always On’ mode, no downtime<br />All clients can find some available replica<br />Software/hardware upgrade tolerance<br />Partition Tolerance: The system continues to function even when split into disconnected subsets (by a network disruption)<br />Reads and Writes combined<br />
    60. 60. CP<br />Some data may be inaccessible but the rest is accurate/consistent
    61. 61. Sharded database
    62. 62. TERADATA comes here<br />CA<br />Single Site Clusters<br />RDBMS<br />Paxos<br />NoSQL<br />AP<br />System is still available under partitioning but some of the data returned may be inaccurate<br />Atomicity: All of the operations in the transaction will complete, or none will.<br />Consistency: The database will be in a consistent state when the transaction begins and ends.<br />Isolation: The transaction will behave as if it is the only operation being performed upon the database.<br />Durability: Upon completion of the transaction, the operation will not be reversed.<br />
    63. 63. Basically<br />Available<br />Soft State<br />Eventually<br /> Consistent<br />When Availability and Partitionability are prioritized over Consistency, think in terms of BASE<br />
    64. 64. Eventual Consistency<br />If no new updates are made to the object, eventually all accesses will return the last updated value.<br />Ex: Domain Name System (DNS)<br />
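The definition on this slide can be illustrated with a toy simulation: one replica accepts writes immediately, the others receive them later, and once updates stop and delivery finishes, every replica returns the last written value. This is a minimal sketch under simplifying assumptions (three in-process replicas, last-writer-wins by version number); the `Replica` class and the `write` and `propagate` helpers are made up for the illustration.

```python
class Replica:
    def __init__(self):
        self.value, self.version = None, 0

    def apply(self, value, version):
        # Last-writer-wins: keep only the newest version seen so far.
        if version > self.version:
            self.value, self.version = value, version


replicas = [Replica() for _ in range(3)]
pending = []  # undelivered updates: (replica_index, value, version)


def write(value, version):
    # The write lands on replica 0 at once; the others get it later.
    replicas[0].apply(value, version)
    pending.extend((i, value, version) for i in (1, 2))


def propagate():
    # Deliver every queued update, in arbitrary (here: reversed) order.
    while pending:
        i, value, version = pending.pop()
        replicas[i].apply(value, version)


write("a", 1)
write("b", 2)
stale = replicas[2].value        # replica 2 has not caught up yet
propagate()
converged = [r.value for r in replicas]
```

Before `propagate()` runs, a read from replica 2 is stale; afterwards all replicas agree, even though the updates were delivered out of order.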
    65. 65. Types of Eventual Consistency<br />Read-your-write consistency<br />Session consistency<br />Monotonic read consistency<br />Monotonic write consistency<br />Causal consistency<br />Practically, Read-your-write consistency and monotonic read consistency are desirable in an eventually consistent system<br />
    66. 66. Hash()<br />Different Apps – Different CAP requirement<br />Prioritize among<br />Consistency – Availability<br />Availability – Partitionability<br />Consistency - Partitionability<br />
    67. 67. WHERE?<br />So will NoSQL eventually replace RDBMSs everywhere?<br />No, RDBMS are here to stay.<br />NoSQL is here to help.<br />
    68. 68. Wherever you want to take<br />Advantage<br />of <br />NoSQL<br />
    69. 69. Big Data<br />Denormalize<br />Shard<br />Scale Out<br />And look no further than NoSQL<br />
    70. 70. Write Intensive Applications<br />IOPS of the best storage device <<< n * IOPS of relatively cheaper storage devices<br />In simple terms: ‘HARNESS THE POWER OF YOUR CLOUD’<br />
    71. 71. Fast Key-Value Access<br />NoSQL – ‘User, you are looking for $value’<br />RDBMS – ‘Query executing ….’<br />An O(1) hash operation vs. O(log n) B/B+ tree traversals<br />
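The contrast between a hash lookup and an ordered-index traversal can be sketched in a few lines. A Python `dict` stands in for the hash table, and binary search over a sorted list stands in for a B/B+ tree index; real database indexes are far more involved, and the record names below are invented for the example.

```python
import bisect

records = sorted(("user%05d" % i, i) for i in range(10000))

hash_index = dict(records)               # hash table: O(1) average lookup
sorted_keys = [k for k, _ in records]    # ordered index: O(log n) lookup
sorted_vals = [v for _, v in records]


def hash_lookup(key):
    # One hash computation, then a direct jump to the value.
    return hash_index[key]


def ordered_lookup(key):
    # Binary search halves the candidate range at every step.
    pos = bisect.bisect_left(sorted_keys, key)
    if pos == len(sorted_keys) or sorted_keys[pos] != key:
        raise KeyError(key)
    return sorted_vals[pos]
```

Both functions return the same value for a given key; the difference is in how many steps each takes as the number of records grows. The ordered index, unlike the hash table, can also answer range queries.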
    72. 72. Flexible Schema and Data types<br />Field – ‘I once was an integer, then a string, then a date; what am I?’<br />RDBMS – ‘WTH! Whatever you are, You are beyond my scope’<br />
    73. 73. Transient Data<br />Data – ‘I’m here only for a while and want to get my work done fast’<br />RDBMS – ‘You are data and you shall be treated like the rest’<br />NoSQL – ‘Okay, I’ll allot you space in the RAM using Memcached if available; otherwise you still have my cloud’<br />
    74. 74. High Write Availability<br />Warning - Incoming data ….<br />NoSQL – ‘Anytime you like, user’<br />RDBMS – ‘This is insane, I’m already busy with other things’<br />
    75. 75. ECONOMICS<br />RDBMS – ‘I’m powered by a wonderful, beautiful rabbit’<br />NoSQL – ‘I’m powered by many cute little hamsters’ <br />
    76. 76. No Single Point of Failure<br />Designed to run over<br />Economic<br />Commonly Available<br />Unreliable hardware<br />
    77. 77. Full table scan operations<br />MapReduce:<br />Map: <br />To divide your problem into sub-problems which can be computed in parallel and reduced later<br />Reduce:<br />To merge the sub-problem solutions into the result<br />Divide and Conquer your way to Victory<br />Powered by MapReduce! Or something similar<br />
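The Map and Reduce steps described on this slide can be sketched with a word count over a table that has been split into chunks. This runs in a single process and only shows the shape of the computation; a real MapReduce framework would distribute the map calls across machines. The function names and sample rows are invented for the example.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk):
    # Solve one sub-problem: count the words in a single chunk of rows.
    return Counter(word for row in chunk for word in row.split())


def reduce_phase(left, right):
    # Merge two partial results into one.
    return left + right


chunks = [["nosql sql", "nosql"], ["sql sql"], ["graph nosql"]]
partials = [map_phase(c) for c in chunks]   # independent, so parallelizable
totals = reduce(reduce_phase, partials, Counter())
```

Because each `map_phase` call touches only its own chunk, the full scan is split into pieces that never need to coordinate until the final merge.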
    78. 78. Ability to restore, maintain, repair itself<br />No DBA required Design<br />
    79. 79. HOW?<br />Let us welcome <br />Keys, Values, Collections, Data Structures, Objects, Documents, Graphs<br />
    80. 80. NoSQL View<br />The basic approach at data:<br />Key/Value store<br />Run on multiple machines<br />Partitions and Replication across these machines<br />Relax consistency<br />Aim at Eventual Consistency<br />Asynchronous replication<br />But not all NoSQL take the same path.<br />
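Partitioning and replication across machines can be sketched as a function that maps each key to the nodes responsible for it. This is a deliberately simple modulo scheme for illustration; production systems typically use consistent hashing so that adding a node moves fewer keys. The node names and the replica count are assumptions.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical machines
REPLICAS = 2                                      # copies kept of each key


def owners(key):
    # Hash the key to pick a primary node, then use the following
    # nodes in the list as replicas.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]
```

Any client can compute `owners("user:42")` locally and reach the right machines without a central lookup, which is what lets the store spread data over many nodes.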
    81. 81. Document Store<br />Key-Value Store<br />Object<br />NoSQL<br />Multivalue<br />Graph Stores<br />BigTable Clones<br />Tuple Store<br />
    82. 82. Key-Value Stores<br />One key, one value, no duplicates and crazy fast<br />Distributed hash tables<br />The value is stored as binary object – BLOB<br />The DB doesn’t understand it and doesn’t want to<br />Ex: Amazon Dynamo, MemcacheDB<br />
    83. 83. Key4<br />Key3<br />Key2<br />Key1<br />Key/Value store doesn’t know what is in here<br />
    84. 84. Document Store<br />Key-value store, but the value is structured and understood by the DB<br />Querying data is possible<br />On not just the key<br />Ex: MongoDB, CouchDB, Riak, etc.<br />
    85. 85. Each database has collections<br />Each collection has a set of documents<br />They are well-designed for access through applications<br />Suitable for web applications<br />A few document databases now provide an SQL-like query interface<br />
    86. 86. Key4<br />Key3<br />Key2<br />Key1<br />Name: $Name<br />Value: $Value<br />Version: $Version<br />Type: $Type<br />Emb Object1<br />Objects inside Objects<br />CRAZY!<br />Emb Object2<br />
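The difference from a plain key-value store is that a document store understands the value, so it can query on fields inside it. A minimal in-memory sketch follows; the `documents` contents and the `find` helper are invented for illustration and are not the query API of MongoDB or CouchDB.

```python
documents = {
    "Key1": {"Name": "post", "Type": "blog", "Version": 2},
    "Key2": {"Name": "photo", "Type": "image", "Version": 1,
             "Meta": {"Width": 800}},   # an embedded object
}


def find(field, value):
    # Query on a field inside the value, not just on the key.
    return sorted(k for k, doc in documents.items() if doc.get(field) == value)
```

A key-value store holding the same data as opaque BLOBs could only answer "give me Key2"; here `find("Type", "image")` locates the document by its content.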
    87. 87. BigTable & its Clones<br />Database, tables, rows, columns and ‘SuperColumn’<br />Row consists of columns and SuperColumns<br />A few supercolumns can be made mandatory<br />Each supercolumn – arbitrary set of columns<br />Rows are typically versioned by a system assigned timestamp.<br />
    88. 88. Intended for tables with huge number of columns<br />Millions can also be supported very easily<br />‘a sparse, distributed multi-dimensional sorted map’<br />Also referred to as Wide Column stores<br />Ex: Google BigTable, Cassandra, HBase, Voldemort, Azure Tables<br />
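The ‘sparse, multi-dimensional map’ description can be made concrete by treating each cell as an entry keyed by row, column and timestamp. In this sketch a plain dictionary stands in for the distributed sorted map, so ordering and distribution are not modeled; the `put` and `get_latest` helpers and the sample cells are assumptions for the example.

```python
table = {}  # (row_key, column, timestamp) -> value; absent cells cost nothing


def put(row, column, timestamp, value):
    table[(row, column, timestamp)] = value


def get_latest(row, column):
    # Pick the cell with the highest timestamp for this row and column.
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if r == row and c == column]
    if not versions:
        return None
    return max(versions, key=lambda pair: pair[0])[1]


put("row1", "contents", 1, "<html>v1")
put("row1", "contents", 2, "<html>v2")
put("row2", "anchor:home", 1, "Home")
```

Each row stores only the columns it actually has, which is why millions of possible columns are affordable, and older timestamps remain available as earlier versions.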
    89. 89. Key1<br />Key2<br />Key3<br />
    90. 90. Graph Databases<br />Nodes, Edges, Properties<br />Replace traditional tables, columns, rows<br />Graph database can be implemented in different ways<br />Key/value store, columnar, bigtable clone or even a combination of these<br />Fields are used to directly store the id of another entity forming the edge<br />
    91. 91. Graph database is a multi-relational graph<br />No need for secondary indexes<br />Relationships in RDBMS are ‘weak’<br />Relationships in Graphs are ‘strong’<br />The rest don’t really care about relations at db level<br />
    92. 92. Address<br />Age: 32<br />Matt<br />Mobile<br />April<br />Is related to<br />SSN<br />Spouse<br />owns<br />Drives<br />Honda<br />Model<br />City<br />registration<br />
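The diagram on this slide can be approximated as nodes with properties plus labeled edges. The exact edge directions in the original picture are not recoverable from the transcript, so the ones below are assumptions, as is the `neighbours` helper.

```python
nodes = {
    "Matt": {"Age": 32},
    "April": {},
    "Honda": {},
}

# (source, relationship, target); directions are assumed for illustration.
edges = [
    ("Matt", "Spouse", "April"),
    ("Matt", "Drives", "Honda"),
    ("April", "owns", "Honda"),
]


def neighbours(node, label):
    # Follow edges of one relationship type starting from a node.
    return [dst for src, lbl, dst in edges if src == node and lbl == label]
```

Relationships are stored as first-class data, so a question such as "what does Matt drive?" is a direct edge traversal rather than a join.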
    93. 93. Key-Value Store<br />Size<br />Document Store<br />BigTable Clone<br />Graph Databases<br />Complexity<br />
    94. 94. Too Many Cooks and Recipes<br />No specific recipe!<br />Major implementations:<br />Graph<br />Document store<br />Tabular<br />Key value store<br />Eventually consistent<br />Hierarchical<br />Ordered<br />Other Known Recipes:<br />Multivalue<br />Object<br />Tuple Store<br />
    95. 95. The Menu<br />On Disk: BigTable, Membase, Tokyo Cabinet<br />In RAM: Memcached, Velocity<br />Eventually Consistent: Cassandra, Dynamo, Riak<br />Hierarchical: GT.M<br />Ordered: Berkeley DB, NMDB, C-ISAM<br />Multivalue: eXe, OpenQM<br />Document Store: CouchDB, Lotus Notes, MongoDB<br />Graph: AllegroGraph, Neo4j, DEX<br />Tabular: BigTable, HBase, HyperTable<br />The list isn’t even a quarter of the whole<br />
    96. 96. _theOpenSourceIssue<br />Most of them are open source <br />Thus fork-able, like Linux<br />The first of the lot<br />Google’s BigTable<br />Amazon’s Dynamo<br />All in all, there are about 10 roots with 4 major ones.<br />
    97. 97. No single database to rule them all<br />
    98. 98. Real World Implementations<br />Digg’s 3TB for Green Badges [CASSANDRA]<br />Facebook’s 50TB for Inbox Search [HBASE]<br />eBay’s 2PB overall data<br />Google’s <br />
    99. 99. Naïve Recipe<br />
    100. 100. MongoDB<br />Document Store<br />JSON Storage<br />REST ….. Not out of the box<br />Map/Reduce<br />Master slave replication<br />Strong suite of query APIs<br />Good support for SQL<br />Work in Progress:<br />Autosharding based scalability<br />Failover support<br />Open Source<br />Non Relational<br />Scalable<br />Schemaless<br />Queryable<br />
    101. 101. Document Oriented<br />Mongo stores documents in collections<br />Documents are slightly enhanced JSON Objects<br />Complex data structures are very much possible<br />Data Modelling is a more natural process<br />
    102. 102. Embeddable Objects<br />Complexity.begin()<br />Embed objects within a single document<br />Document is an enhanced form of object, as mentioned earlier<br />The same thing in RDBMS can be achieved using multiple tables and joining them together<br />Consider that our requirement is to store a blog post with this information<br />Post Content<br />Post Title<br />Post Author <br />Comments<br />Comment order<br />Comment content<br />Comment author <br />
    103. 103. RDBMS solution<br />
    104. 104. MongoDB Solution<br />Documents …. Each one of them is a post<br />{ Name: $name, <br />Author: $author,<br />Comment: [ { Author: $author1, <br />Comment: $comment1} , <br /> { Author: $author2,<br />Comment: $comment2,<br />Replies: [ { Author: $author3,<br />Comment: $comment3} ] } <br /> ]<br /> }<br />
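The nested post document on this slide can be mirrored as a plain Python dictionary to show how comments and replies travel with the post. The field names follow the slide; the concrete values and the `count_comments` helper are invented for the example, and no MongoDB driver is involved.

```python
post = {
    "Name": "Why NoSQL",
    "Author": "alice",
    "Comment": [
        {"Author": "bob", "Comment": "Nice post"},
        {"Author": "carol", "Comment": "I disagree",
         "Replies": [{"Author": "alice", "Comment": "Why?"}]},
    ],
}


def count_comments(comments):
    # Walk comments and their nested replies recursively.
    return sum(1 + count_comments(c.get("Replies", [])) for c in comments)


total = count_comments(post["Comment"])
```

The order of the list preserves comment order, and the whole thread is read with a single document fetch instead of joins across post, comment and reply tables.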
    105. 105. RDBMS Viewpoint<br />
    106. 106. ODF<br />Mongodb’ed<br />
    107. 107.
    108. 108. Schema-less<br />No database-enforced schema<br />Adding and deleting columns is simple<br />It’s about how the application uses APIs<br />Data definition need not be done up front.<br />
    109. 109. Other Features<br />Data Tagging<br />Caching<br />Real Time Analytics<br />Image Storage<br />Dynamic Queries<br />Binary Storage<br />
    110. 110. MongoDB - Why Not? <br />Lacks transactions<br />Doesn’t completely support SQL<br />Lacks built-in revisioning system like CouchDB<br />Lacks full text searching features<br />
    111. 111. Try MongoDB @<br />http://try.mongodb.org/<br />
    112. 112. n<br />EOL<br />
    113. 113. Calm down!<br />Eventually Answered System<br />All your questions will be answered eventually<br />