Drop acid


Session on NoSQL Databases and MongoDB. I stole the title from someone who deserves credit, but unfortunately, can't remember who. I blame the acid.

Published in: Technology

  • 1. NoSQL - Death to Relational Databases
      Mike Feltman, F1 Technologies
  • 2. Agenda
      • The NoSQL Movement
      • MongoDB Discussion & Demo
      • Discussion
  • 3. The NoSQL Movement
      NoSQL databases:
      • Non-relational
      • Less ACID
      • More BASE
      • CAP Trading
      • Highly Scalable
      • Highly Performant
      NoSQL = Not Only SQL
  • 4. Less ACID
      • Atomic
          – Basically means it supports transactions
      • Consistent
          – Has hard constraints & rejects non-conforming data
      • Isolated
          – No peeking at incomplete commits
      • Durable
          – Once a commit is finished, it lasts forever.
  • 5. More BASE
      • Basically Available
      • Soft-state
      • Eventually consistent
  • 6. CAP Trading
      • Consistency (client perceives set of operations completed)
      • Availability (operations terminate with an expected result)
      • Partition tolerance (operations will complete, even if a required resource is unavailable)
      • Only 2 are possible in distributed systems. – Eric Brewer
  • 7. The NoSQL Movement
      Why:
      • SQL is tedious and difficult
      • Strongly typed schemas are inflexible and painful to maintain
      • Inadequate performance of RDBMS on huge data stores
      • Poor scalability of RDBMS
      • Poor replication support
  • 8. Types of NoSQL Databases
      • Document Stores
      • Graph
      • Key/Value Store
      • Object Database
      • Tabular
  • 9. Major Players
      • MongoDB (10gen)
      • CouchDB (Apache)
      • Cassandra (Apache – formerly Facebook)
      • BigTable (Google)
      • Berkeley DB (Oracle)
      • Dynamo (Amazon)
      • MObStor (Yahoo)
      • Haystack (Facebook)
      • Voldemort (LinkedIn)
      • HBase/Hadoop (Apache & Microsoft)
  • 10. MongoDB
      Combining the best features of document databases, key-value stores, and RDBMSes.
      • Scalable
      • High-Performance
      • Open Source
      • Schema-free
      • Document Oriented
  • 11. MongoDB Features
      • Document-oriented storage (BSON)
      • Dynamic Queries
      • Full index support (including embedded objects & arrays)
      • Fast, in-place updates
      • Efficient Blob storage
      • Replication
      • Auto-sharding
      • MapReduce
      • Driver support for many languages
      • Cross-Platform
      • Admin Tools
  • 12. Document Oriented Storage
      • Data is stored in BSON
          – Binary-encoded serialization of JSON-like documents.
          – Lightweight, traversable & efficient
          – Supports embedded objects & arrays
          – Document = Record
      Example document:
          { firstName: "Nicklas",
            lastName: "Lidstrom",
            team: "Red Wings",
            stanleyCups: [1997, 1998, 2002, 2008],
            norrisTrophies: [2001, 2002, 2003, 2006, 2007, 2008] }
  • 13. Dynamic Queries
      • No indexes required to find data.
      • RDBMSes all support this as well.
      Examples
      • All records:
          db.players.find({})
      • All Red Wings:
          db.players.find({"team": "Red Wings"})
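
To make the idea of an index-free query concrete, here is a minimal sketch in plain JavaScript of what a query-by-example scan could look like. It is not MongoDB's implementation: the `players` array, the second player record, and the `matches` and `find` helpers are all made up for illustration, and only top-level equality (plus "array contains value") is handled.

```javascript
// Hypothetical in-memory stand-in for the "players" collection.
// The second record is made-up sample data.
const players = [
  { firstName: "Nicklas", lastName: "Lidstrom", team: "Red Wings",
    stanleyCups: [1997, 1998, 2002, 2008] },
  { firstName: "Example", lastName: "Player", team: "Other Team",
    stanleyCups: [] },
];

// True when every key in the query equals the document's value,
// or is contained in it when the document's value is an array.
function matches(doc, query) {
  return Object.keys(query).every((key) => {
    const value = doc[key];
    return Array.isArray(value)
      ? value.includes(query[key])
      : value === query[key];
  });
}

// Full scan over the collection: no index is consulted.
function find(collection, query) {
  return collection.filter((doc) => matches(doc, query));
}

const allPlayers = find(players, {});                  // empty query matches everything
const redWings = find(players, { team: "Red Wings" }); // equality on one field
const won2002 = find(players, { stanleyCups: 2002 });  // value inside an array
```

An empty query object matches every document, which mirrors the "all records" example on the slide.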
  • 14. Index Support
      • B-Tree format
      • Default index on PK
      • Supports unique, compound, and document indexes (indexes on nested documents), plus multikey indexes (which allow indexing of arrays of values)
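
As a rough illustration of what a multikey index does, the sketch below builds one index entry per array element. It assumes a plain `Map` in place of a B-tree, and the `buildMultikeyIndex` helper and the sample `articles` data are hypothetical, not part of MongoDB.

```javascript
// Build an index from field value -> positions of documents holding that value.
// When the field is an array, every element gets its own entry (multikey).
function buildMultikeyIndex(collection, field) {
  const index = new Map();
  collection.forEach((doc, position) => {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(position);
    }
  });
  return index;
}

// Made-up sample documents with an array-valued field.
const articles = [
  { title: "first", tags: ["nosql", "mongodb"] },
  { title: "second", tags: ["mongodb"] },
  { title: "third", tags: ["sql"] },
];

const tagIndex = buildMultikeyIndex(articles, "tags");

// Lookup goes through the index instead of scanning every document.
const mongoTitles = (tagIndex.get("mongodb") || []).map((i) => articles[i].title);
```

A real B-tree also keeps keys ordered, which is what makes range queries cheap; the `Map` here only covers exact-match lookups.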
  • 15. Fast in-place updates
      • Updates are made to existing documents within a collection.
      • Many "NoSQL" databases (such as CouchDB) do not support updates and instead store versions of records.
  • 16. Efficient Blob Storage
      • Blob = Binary Large Object
      • Up to 4MB within document
      • GridFS specification is followed for larger items and external files
  • 17. Replication
      • Enhanced master-slave configuration
          – One server active for writes at a time.
          – Provides failover and redundancy
          – Implemented with Replica Pairs
              • When master fails, slave takes over
              • When slave fails, control reverts to master
      • Limited master-master
  • 18. Auto-Sharding
      • Sharding:
          – Breaking a database down into "shards" and spreading those across distributed/commodity servers.
          – Highly scalable approach for increased throughput and performance of high-transaction, large database applications.
          – MongoDB manages data storage and retrieval behind the scenes.
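
The routing idea behind sharding can be sketched in a few lines. This example assumes simple hash-based placement on a shard key, which is only one possible scheme and is not claimed to be how MongoDB distributes data; `hashKey`, `createCluster`, and the sample team names are hypothetical.

```javascript
// Small deterministic string hash (illustrative only, not cryptographic).
function hashKey(key) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// A "cluster" is just an array of shards; each shard is an array of documents.
function createCluster(shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  const shardFor = (value) => shards[hashKey(value) % shardCount];
  return {
    shards,
    // Route the document to a shard based on its shard key value.
    insert(doc, shardKey) {
      shardFor(doc[shardKey]).push(doc);
    },
    // A lookup on the shard key only has to visit one shard.
    findByKey(shardKey, value) {
      return shardFor(value).filter((doc) => doc[shardKey] === value);
    },
  };
}

const cluster = createCluster(3);
["Red Wings", "Team B", "Team C", "Red Wings"].forEach((team, id) => {
  cluster.insert({ id, team }, "team");
});

const redWingDocs = cluster.findByKey("team", "Red Wings");
const totalStored = cluster.shards.reduce((sum, shard) => sum + shard.length, 0);
```

The caller never names a shard: placement and lookup are both derived from the key, which is the "behind the scenes" part the slide refers to.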
  • 19. MapReduce
      • Term comes from Google.
          – Patented framework for processing huge datasets on certain kinds of distributable problems using a large number of servers.
          – MongoDB applies it to single server instances as well.
      • Useful for batch operations
      • Aggregation: NoSQL answer to GROUP BY
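
To show why MapReduce is described as the answer to GROUP BY, here is a small single-process sketch in plain JavaScript. The `mapReduce` helper, its `emit` callback, and the `roster` data are made up for illustration; MongoDB's own map functions have a different calling convention, so treat this as the general pattern rather than its API.

```javascript
// Run a map function over every document, group emitted values by key,
// then reduce each group to a single value.
function mapReduce(collection, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  collection.forEach((doc) => mapFn(doc, emit));

  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

// Made-up sample data.
const roster = [
  { team: "Red Wings", goals: 10 },
  { team: "Red Wings", goals: 5 },
  { team: "Team B", goals: 7 },
];

// Roughly: SELECT team, SUM(goals) FROM roster GROUP BY team
const goalsByTeam = mapReduce(
  roster,
  (doc, emit) => emit(doc.team, doc.goals),
  (key, values) => values.reduce((sum, goals) => sum + goals, 0)
);
```

Because each key's values are reduced independently, the map and reduce steps can be spread across servers, which is the distributable property the slide mentions.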
  • 20. Drivers
      • .NET (C#)
      • JavaScript
      • Python
      • PHP
      • Ruby
      • Java
      • C++
      • Perl
      • JVM
          – Clojure
          – Groovy
          – Scala
  • 21. Cross-Platform
      • 32 bit & 64 bit versions available for:
          – Windows
          – OS X
          – Linux
          – Solaris
  • 22. Admin Tools
      • Command Shell
      • Simple, limited REST (HTTP) interface
      • Mongostat
      • Mongosniff (Unix only – use tcpdump on Windows)
      • Backup & Restore
  • 23. MongoDB Terminology
      Traditional RDBMS    MongoDB
      • Database           • Database
      • Table              • Collection
      • Record             • Document
      • Field              • Key
  • 24. Demo!
      • Start the server (if it's not running).
          C:\mongodb\bin\mongod
      • Start the shell
          C:\mongodb\bin\mongo
  • 25. The MongoDB Shell
  • 26. Database Commands
      • Open Database
          – use (database name)
      • Create Database
          – use (database name)
  • 27. How it works
      • Focused on documents
          – Document = sequence of key/value pairs in BSON
              • Value can be another document
              • Additional types vs. JSON, e.g. dates, regexp
      • Messages (passed over TCP/IP) are in BSON; drivers convert code to BSON
      • Memory-mapped storage engine (MMSE)
          – All disk access takes place through MMSE
      • Query Optimizer:
          – find({x: 10, y: "foo"})
          – Launches multiple simultaneous queries based on indexes & table scan. Stops when one finishes and remembers which one was the fastest for future similar queries. The hint option can specify which index to use.
  • 28. Why?
      • Applications where schema gets in the way
      • Performance
      • Scalability
      • RAD
      • More natural fit with OO languages
  • 29. Resources