Drop acid


Session on NoSQL databases and MongoDB. I stole the title from someone who deserves credit, but unfortunately I can't remember who. I blame the acid.

Published in: Technology

  1. 1. NoSQL - Death to Relational Databases
     Mike Feltman, F1 Technologies
  2. 2. Agenda
     • The NoSQL Movement
     • MongoDB Discussion & Demo
     • Discussion
  3. 3. The NoSQL Movement
     NoSQL databases:
     • Non-relational
     • Less ACID
     • More BASE
     • CAP Trading
     • Highly Scalable
     • Highly Performant
     NoSQL = Not Only SQL
  4. 4. Less ACID
     • Atomic
       – Basically means it supports transactions
     • Consistent
       – Has hard constraints & rejects non-conforming data
     • Isolated
       – No peeking at incomplete commits
     • Durable
       – Once a commit is finished, it lasts forever.
  5. 5. More BASE
     • Basically Available
     • Soft-state
     • Eventually consistent
  6. 6. CAP Trading
     • Consistency (client perceives set of operations completed)
     • Availability (operations terminate with an expected result)
     • Partition tolerance (operations will complete, even if a required resource is unavailable)
     • Only 2 are possible in distributed systems. – Eric Brewer
  7. 7. The NoSQL Movement
     Why:
     • SQL is tedious and difficult
     • Strongly typed schemas are inflexible and painful to maintain
     • Inadequate performance of RDBMS on huge data stores
     • Poor scalability of RDBMS
     • Poor replication support
  8. 8. Types of NoSQL Databases
     • Document Stores
     • Graph
     • Key/Value Store
     • Object Database
     • Tabular
  9. 9. Major Players
     • MongoDB (10gen)
     • CouchDB (Apache)
     • Cassandra (Apache – formerly Facebook)
     • BigTable (Google)
     • Berkeley DB (Oracle)
     • Dynamo (Amazon)
     • MObStor (Yahoo)
     • Haystack (Facebook)
     • Voldemort (LinkedIn)
     • HBase/Hadoop (Apache & Microsoft)
  10. 10. MongoDB
      Combining the best features of document databases, key-value stores, and RDBMSes.
      • Scalable
      • High-Performance
      • Open Source
      • Schema-free
      • Document Oriented
  11. 11. MongoDB Features
      • Document-oriented storage (BSON)
      • Dynamic Queries
      • Full index support (including embedded objects & arrays)
      • Fast, in-place updates
      • Efficient Blob storage
      • Replication
      • Auto-sharding
      • MapReduce
      • Driver support for many languages
      • Cross-Platform
      • Admin Tools
  12. 12. Document Oriented Storage
      • Data is stored in BSON
        – Binary-encoded serialization of JSON-like documents.
        – Lightweight, traversable & efficient
        – Supports embedded objects & arrays
        – Document = Record
      Example document:
        { firstName: "Nicklas",
          lastName: "Lidstrom",
          team: "Red Wings",
          stanleyCups: [1997, 1998, 2002, 2008],
          norrisTrophies: [2001, 2002, 2003, 2006, 2007, 2008] }
  13. 13. Dynamic Queries
      • No indexes required to find data.
      • RDBMSes all support this as well.
      Examples
      • All records: db.players.find({})
      • All Red Wings: db.players.find({"team": "Red Wings"})
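The two queries on the Dynamic Queries slide can be mimicked without a server. The sketch below is plain JavaScript (runnable in Node), not the MongoDB shell: `find` here is a hypothetical helper that scans an in-memory array and handles only exact-match keys, and every record except Lidstrom's is made-up sample data. It shows roughly what an unindexed equality query has to do: look at every document and keep the ones that match.

```javascript
// In-memory stand-in for the "players" collection (sample data for illustration).
const players = [
  { firstName: "Nicklas", lastName: "Lidstrom", team: "Red Wings" },
  { firstName: "Example", lastName: "Winger", team: "Red Wings" },
  { firstName: "Example", lastName: "Goalie", team: "Other Team" },
];

// Hypothetical helper: keep documents whose values equal every key in the query.
// An empty query matches every document, like db.players.find({}).
function find(collection, query) {
  const keys = Object.keys(query);
  return collection.filter((doc) => keys.every((k) => doc[k] === query[k]));
}

const all = find(players, {});
const redWings = find(players, { team: "Red Wings" });

console.log(all.length);
console.log(redWings.map((p) => p.lastName));
```

Because no index is involved, the cost grows with the size of the collection, which is why the index support described on the next slide matters.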
  14. 14. Index Support
      • B-Tree format
      • Default index on PK (the _id key)
      • Supports unique, compound, and document indexes (indexes on nested documents), plus multikey indexes (which allow indexing of arrays of values)
  15. 15. Fast in-place updates
      • Updates are made to existing documents within a collection.
      • Many “NoSQL” databases (such as CouchDB) do not support updates and instead store versions of records.
  16. 16. Efficient Blob Storage
      • Blob = Binary Large Object
      • Up to 4MB within document
      • GridFS specification is followed for larger items and external files
  17. 17. Replication
      • Enhanced master-slave configuration
        – One server active for writes at a time.
        – Provides failover and redundancy
        – Implemented with Replica Pairs
          • When master fails slave takes over
          • When slave fails control reverts to master
      • Limited Master-master
  18. 18. Auto-Sharding
      • Sharding:
        – Breaking database down into “shards” and spreading those across distributed/commodity servers.
        – Highly scalable approach for increased throughput and performance of high-transaction, large database applications.
        – MongoDB manages data storage and retrieval behind the scenes.
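The idea of spreading documents across shards can be sketched in a few lines of plain JavaScript. This is an illustration only: it routes by a simple hash of the key, which is an assumption made for brevity and not a description of how MongoDB actually partitions data. The names `SHARD_COUNT`, `hashKey`, `shardFor`, `insert` and `findById` are all hypothetical, and each "shard" is just an in-memory Map.

```javascript
const SHARD_COUNT = 3; // assumed number of servers

// Simple deterministic string hash (illustrative, not cryptographic).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Every key maps to exactly one shard.
function shardFor(key) {
  return hashKey(key) % SHARD_COUNT;
}

// Each shard is modelled as a Map from _id to document.
const shards = Array.from({ length: SHARD_COUNT }, () => new Map());

function insert(doc) {
  shards[shardFor(doc._id)].set(doc._id, doc);
}

// The caller never names a shard; routing happens behind the scenes.
function findById(id) {
  return shards[shardFor(id)].get(id);
}

insert({ _id: "lidstrom", team: "Red Wings" });
insert({ _id: "player-2", team: "Other Team" });
insert({ _id: "player-3", team: "Red Wings" });

console.log(findById("lidstrom"));
console.log(shards.map((s) => s.size));
```

The point of the sketch is the last bullet on the slide: the application calls `insert` and `findById` the same way regardless of how many shards exist.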
  19. 19. MapReduce
      • Term comes from Google.
        – Patented framework for processing huge datasets on certain kinds of distributable problems using a large number of servers.
        – MongoDB applies it to single server instances as well.
      • Useful for batch operations
      • Aggregation: NoSQL answer to GROUP BY
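To make the "answer to GROUP BY" point concrete, here is a minimal single-process sketch of the map/reduce pattern in plain JavaScript. It is not MongoDB's mapReduce command: the `mapReduce` helper is hypothetical and passes `emit` to the map function as an argument. The Lidstrom record comes from the earlier slide; the other two players are invented sample data. It computes the equivalent of a SUM grouped by team.

```javascript
// Hypothetical helper: run map over every document, group emitted values by key,
// then reduce each group to a single value.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit);

  const out = {};
  for (const [key, values] of groups) out[key] = reduce(key, values);
  return out;
}

const players = [
  { lastName: "Lidstrom", team: "Red Wings", stanleyCups: [1997, 1998, 2002, 2008] },
  { lastName: "Example One", team: "Red Wings", stanleyCups: [2002, 2008] }, // sample
  { lastName: "Example Two", team: "Other Team", stanleyCups: [1999] }, // sample
];

// Roughly: SELECT team, SUM(number of cups) FROM players GROUP BY team
const cupsByTeam = mapReduce(
  players,
  (doc, emit) => emit(doc.team, doc.stanleyCups.length),
  (key, values) => values.reduce((a, b) => a + b, 0)
);

console.log(cupsByTeam);
```

Because each group is reduced independently, the same two functions could in principle be run on many servers at once, which is where the pattern gets its scalability.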
  20. 20. Drivers
      • .NET (C#)
      • JavaScript
      • Python
      • PHP
      • Ruby
      • Java
      • C++
      • Perl
      • JVM
        – Clojure
        – Groovy
        – Scala
  21. 21. Cross-Platform
      • 32 bit & 64 bit versions available for:
        – Windows
        – OS X
        – Linux
        – Solaris
  22. 22. Admin Tools
      • Command Shell
      • Simple limited REST (http) Interface
      • Mongostat
      • Mongosniff (Unix only – use tcpdump on Windows)
      • Backup & Restore
  23. 23. MongoDB Terminology
      Traditional RDBMS    MongoDB
      Database             Database
      Table                Collection
      Record               Document
      Field                Key
  24. 24. Demo!
      • Start the server (if it’s not running): C:\mongodb\bin\mongod
      • Start the shell: C:\mongodb\bin\mongo
  25. 25. The MongoDB Shell
  26. 26. Database Commands
      • Open Database
        – use (database name)
      • Create Database
        – use (database name)
  27. 27. How it works
      • Focused on documents
        – Document = sequence of key/value pairs in BSON
          • Value can be another document
          • Additional types vs. JSON, e.g. dates, regexp
      • Messages (passed over TCP/IP) are in BSON; drivers convert code to BSON
      • Memory-mapped storage engine (MMSE)
        – All disk access takes place through MMSE
      • Query Optimizer:
        – find({x: 10, y: "foo"})
        – Launches multiple simultaneous queries based on indexes & table scan. Stops when one finishes and remembers which one was the fastest for future similar queries. Can use the hint option to specify which index to use.
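The query optimizer described on the slide can be approximated with a toy model. The sketch below is plain JavaScript and simplifies heavily: instead of launching plans simultaneously and stopping when the first one finishes, it runs each applicable candidate in turn, compares how many documents each examined, and caches the winner per query shape. The data set, the `indexX` plan, `planCache` and the `hint` parameter are all assumptions made for illustration, not MongoDB internals.

```javascript
// 100 sample documents; x repeats every 20, y alternates between "foo" and "bar".
const docs = [];
for (let i = 0; i < 100; i++) {
  docs.push({ x: i % 20, y: i % 2 === 0 ? "foo" : "bar" });
}

// Hypothetical index on x: value -> positions of documents with that value.
const indexOnX = new Map();
docs.forEach((d, pos) => {
  if (!indexOnX.has(d.x)) indexOnX.set(d.x, []);
  indexOnX.get(d.x).push(pos);
});

const matches = (d, q) => Object.keys(q).every((k) => d[k] === q[k]);

// Candidate plans report their results and how many documents they examined.
const plans = [
  {
    name: "tableScan",
    applies: () => true,
    run: (q) => ({ results: docs.filter((d) => matches(d, q)), examined: docs.length }),
  },
  {
    name: "indexX",
    applies: (q) => "x" in q, // only usable when the query constrains x
    run: (q) => {
      const positions = indexOnX.get(q.x) || [];
      const results = positions.map((p) => docs[p]).filter((d) => matches(d, q));
      return { results, examined: positions.length };
    },
  },
];

const planCache = new Map(); // query shape -> name of the winning plan
const shapeOf = (q) => Object.keys(q).sort().join(",");

function find(q, hint) {
  const shape = shapeOf(q);
  const chosen = hint || planCache.get(shape);
  if (chosen) {
    const plan = plans.find((p) => p.name === chosen);
    return { plan: chosen, ...plan.run(q) };
  }
  // No hint and no cached winner: try every applicable plan, keep the cheapest.
  let best = null;
  for (const p of plans.filter((p) => p.applies(q))) {
    const run = p.run(q);
    if (best === null || run.examined < best.examined) best = { plan: p.name, ...run };
  }
  planCache.set(shape, best.plan);
  return best;
}

const first = find({ x: 10, y: "foo" });
const second = find({ x: 3, y: "bar" }); // same shape, reuses the cached plan
const hinted = find({ x: 10, y: "foo" }, "tableScan");
console.log(first.plan, first.examined, second.plan, hinted.examined);
```

The hinted call shows the trade-off the slide mentions: a hint bypasses the remembered plan, so a poor hint can force far more work than the optimizer's own choice.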
  28. 28. Why?
      • Applications where schema gets in the way
      • Performance
      • Scalability
      • RAD
      • More natural fit with OO Languages
  29. 29. Resources
      • www.mongodb.org