Data as Documents: Overview and intro to MongoDB
This is from my talk at BigDive in Turin, Italy 2013. The talk is generally about databases and how we evolved to where we are. There is a lot of command line stuff that is not shown here though - this is mostly for attendees for reference.

Published in Technology

Transcript

  • 1. DATA AS DOCUMENTS: Mitch Pirtle, BigDive 2013, Turin, Italy
  • 2. ABOUT ME
      • Moved from NYC to TO in 2011
      • Recovering Joomla! founder
      • CTO @soundaymusic
      • Use primarily PHP (Lithium), Node.js
      • MongoDB Master
  • 3. ABOUT THIS TALK
      • Background on database history
      • Impact from the Web
      • Emerging solutions and technologies
      • Hands-on session
      • Close with Q&A
  • 4. Are you done with lunch?
  • 5. IN THE BEGINNING
      • Data was simple.
      • Performance was simpler.
      • Scale was a rare need.
  • 6. BIRTH OF RELATIONAL DATA
      • Applications got more complex.
      • Many apps, one database pushed logic into the data tier.
      • “Business rules” was the king buzzword.
  • 7. BIRTH OF WEB
      • Very complex architecture
      • Very high scale requirements
      • Rapid application development
  • 8. WRONG TOOL RIGHT JOB?
      • Was great for data consistency and features, but...
      • Impossible to scale
      • Impedance mismatch with modern apps
  • 9. ALTERNATIVES
      • Key / Value
      • Documents
      • Memory-only*
  • 10. KEY VALUE
      • EXAMPLES: Memcache, Voldemort, Cassandra, Dynamo, Hibari, Riak
      • No schema needed
      • Blazing fast
      • Minimal features
  • 11. DOCUMENT
      • EXAMPLES: MongoDB, SimpleDB, ElasticSearch, OrientDB
      • Rich datatypes matching modern apps
      • More features
      • Mostly JSON based
  • 12. EXAMPLE PLATFORMS
  • 13. MONGODB
      • Document database, uses JSON
      • Many user/developer features
      • Many deployment features
      • Designed specifically for modern scale challenges and programming languages
  • 14. REDIS
      • Key-value database
      • Extended data types
      • Many features
      • Similar facilities for scale and performance
  • 15. VOLDEMORT
      • Key-value
      • Extreme scale
  • 16. HADOOP
      • Framework, not really a database
      • Born from Google’s map reduce and distributed file system efforts
  • 17. DEPLOYMENT OVERVIEW
  • 18. DEDICATED SYSTEMS
      • Low cost, simple to set up
      • Great performance
      • Difficult to scale
      • Require constant management
  • 19. TETHERED CLOUD
      • Takes a dedicated environment and extends it with cloud infrastructure for scale
      • Extremely flexible
      • Even more management and administration
  • 20. FULL CLOUD
      • High initial effort
      • Much simpler to manage long term
      • Extreme scale
      • Possibility for equally extreme cost savings*
  • 21. DEVELOPERS!
  • 22. (hang on a minute)
  • 23. DEVELOPERS!
  • 24. (much better)
  • 25. (ok now to get serious)
  • 26. WORKING WITH SQL
      • Crap, now I need an ORM!
      • Disconnect between relational data and object languages
      • Tons of debugging fun!
  • 27. WORKING WITH MONGODB
      • Simplifies data access
      • Simplifies code
      • Fewer execution steps make faster and lighter apps
  • 28. COMMON TERMS
      • database <-> database
      • table <-> collection
      • result <-> document
      • column <-> property
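
The contrast between the SQL and MongoDB slides above can be sketched in plain JavaScript, with no database involved. The sample data, the field names, and the `loadPersonFromRows` helper are all hypothetical and exist only for illustration; the point is that relational rows need a mapping step before they look like an application object, while a document is already stored in that shape.

```javascript
// Hypothetical relational data: one logical "person" split across two tables.
const peopleRows = [{ id: 1, name: "Ada", zipcode: "10001" }];
const phoneRows = [
  { person_id: 1, kind: "home", number: "555-0100" },
  { person_id: 1, kind: "work", number: "555-0101" },
];

// The kind of mapping step an ORM performs: join rows back into one object.
function loadPersonFromRows(id) {
  const row = peopleRows.find((p) => p.id === id);
  if (!row) return null;
  const phones = phoneRows
    .filter((ph) => ph.person_id === id)
    .map(({ kind, number }) => ({ kind, number }));
  return { name: row.name, zipcode: row.zipcode, phones };
}

// The same person as one document: the stored shape is already the object shape.
const personDocument = {
  name: "Ada",
  zipcode: "10001",
  phones: [
    { kind: "home", number: "555-0100" },
    { kind: "work", number: "555-0101" },
  ],
};

// Both routes end at the same object; only the first needed a mapping step.
console.log(
  JSON.stringify(loadPersonFromRows(1)) === JSON.stringify(personDocument)
);
```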
  • 29. WHAT IS JSON?
  • 30. DOCUMENT DESIGN
      MongoDB documents are BSON:
      • strings
      • integers
      • arrays
      • objects
      • dates
      • boolean
      • regex
      • symbol
      • javascript
      • ObjectID
      • timestamps
      • GridFS
  • 32. DATATYPE: OBJECTID
      • MongoDB’s ObjectID is a 12-byte BSON type, composed of Unix seconds since the epoch (4 bytes), a machine identifier (3 bytes), a process id (2 bytes), and a counter that starts at a random value (3 bytes).
  • 33. DATATYPE: OBJECTID
      ObjectId("4ee75a9c318b9d2c640001a6")
  • 34. DATATYPE: OBJECTID
      • An ObjectID is not a string. Always reference one as ObjectId("..."); comparisons against the bare string will not match.
  • 35. DATATYPE: OBJECTID
      > x = ObjectId()
      ObjectId("51b73dff884498553b746046")
      > x.getTimestamp()
      ISODate("2013-06-11T15:10:55Z")
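
The byte layout described above can be checked without a database. This is a minimal sketch in plain JavaScript; `decodeObjectId` is a hypothetical helper, not a MongoDB API, and it assumes the 4/3/2/3-byte layout given on the slide. It splits the 24-character hex form of the ObjectId from the shell session into its four fields.

```javascript
// Split a 24-character hex ObjectId string into the four fields from the slide.
// Hypothetical helper for illustration; not part of any MongoDB driver.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3: Unix seconds
  return {
    timestamp: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),                  // bytes 4-6: machine identifier
    processId: parseInt(hex.slice(14, 18), 16),   // bytes 7-8: process id
    counter: parseInt(hex.slice(18, 24), 16),     // bytes 9-11: counter
  };
}

const parts = decodeObjectId("51b73dff884498553b746046");
console.log(parts.timestamp.toISOString()); // 2013-06-11T15:10:55.000Z
```

The decoded timestamp agrees with what `x.getTimestamp()` printed in the shell session.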
  • 36. DATATYPE: DATE
      • MongoDB’s Date is a signed 64-bit integer that holds the number of milliseconds since the Unix epoch. Negative values represent dates before 1970.
  • 37. DATATYPE: DATE
      > when = new Date()
      ISODate("2013-06-11T15:18:30.241Z")
      > when.toString()
      Tue Jun 11 2013 17:18:30 GMT+0200 (CEST)
      > when.getMonth()
      5
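
The shell uses ordinary JavaScript `Date` objects, so the behaviour above can be reproduced in any JavaScript runtime. The sketch below reuses the timestamp from the session and uses the UTC accessors so that the result does not depend on the local time zone.

```javascript
// The same instant as in the shell session above.
const when = new Date("2013-06-11T15:18:30.241Z");

// The underlying value is a count of milliseconds since 1970-01-01T00:00:00Z.
console.log(when.getTime());

// Months are zero-based, which is why June is reported as 5.
console.log(when.getUTCMonth()); // 5

// A negative millisecond count lands before 1970.
const beforeEpoch = new Date(-1);
console.log(beforeEpoch.toISOString()); // 1969-12-31T23:59:59.999Z
```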
  • 38. DATATYPE: GRIDFS
      • MongoDB’s GridFS is a facility that allows you to store binary files within the database, and allows you to extend them with JSON metadata.
  • 39. (ok this part is easier on the command line. more on this later in this class.)
  • 40. COMMON TASKS
      • find(), findOne()
      • findAndModify()
      • ensureIndex()
      • drop()
      • insert()
      • update()
      • upsert (an option to update())
      • save()
      • remove()
      • stats()
  • 41. INDEXES
      • MongoDB’s indexes support a variety of types and needs
      • Indexing overview
  • 42. INDEX TYPES
      • Standard (_id)
      • Secondary
      • Subdocuments
      • Embedded fields
      • Compound
      • ASC and DESC keys
      • Multikeys
      • Unique
      • Sparse
      • Hash
  • 43. INDEX CREATION
  • 48. INDEX CREATION
      • Standard:
        db.people.ensureIndex( { zipcode: 1 } )
      • Background:
        db.people.ensureIndex( { zipcode: 1 }, { background: true } )
      • Background Sparse:
        db.people.ensureIndex( { zipcode: 1 }, { background: true, sparse: true } )
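
What `sparse: true` changes can be shown with a toy model in plain JavaScript. This is only an illustration under stated assumptions: `buildIndex` is a hypothetical helper, the `people` data is invented, and a `Map` stands in for the real index structure. It models one behaviour only: a sparse index leaves out documents that lack the indexed field, while a standard index records them under `null`.

```javascript
// Toy index: maps each value of `field` to the positions of matching documents.
// Hypothetical helper; a Map stands in for MongoDB's real index structure.
function buildIndex(docs, field, { sparse = false } = {}) {
  const index = new Map();
  docs.forEach((doc, position) => {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) return;          // sparse: skip documents without the field
    const key = hasField ? doc[field] : null; // standard: a missing field is indexed as null
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(position);
  });
  return index;
}

// Invented sample collection; Bob has no zipcode.
const people = [
  { name: "Ann", zipcode: "10001" },
  { name: "Bob" },
  { name: "Cy", zipcode: "10001" },
];

const standardIndex = buildIndex(people, "zipcode");
const sparseIndex = buildIndex(people, "zipcode", { sparse: true });
console.log(standardIndex.size, sparseIndex.size); // 2 1
```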
  • 49. GRIDFS
      • Drivers support GridFS with helper methods, and the mongofiles command line tool is distributed with MongoDB.
      • Crazy, whack-daddy fast.
      • Dead simple to use.
  • 50. (drop to console)
  • 51. NOTE: MongoDB provides many command line tools to work with your database. They are listed and documented in great detail online.
  • 52. HOW MONGODB SCALES
      • Vertically: Replication
      • Horizontally: Sharding
  • 53. REPLICATION
      • MongoDB’s Replica Sets keep copies of your data on several members: a single primary accepts writes, and secondaries provide redundancy and can serve reads
      • Many tutorials and procedures
  • 54. REPLICATION
      Diagram: members M1, M2, M3, H1, D1, D2
      Legend: (M)ember, (H)idden, (D)elayed
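
The member layout in the diagram could be written as a replica set configuration object along the following lines. This is a sketch, not the speaker's configuration: the set name, host names, and delay values are hypothetical, and `findInvalidMembers` is an invented helper. The field names `priority`, `hidden`, and `slaveDelay` are the ones used by MongoDB releases of that era (later releases renamed `slaveDelay`). In the shell, an object of this shape is what gets passed to `rs.initiate()`.

```javascript
// Hypothetical set name and hosts; roles follow the diagram: M1-M3, H1, D1, D2.
const replicaSetConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "m1.example.net:27017" },
    { _id: 1, host: "m2.example.net:27017" },
    { _id: 2, host: "m3.example.net:27017" },
    { _id: 3, host: "h1.example.net:27017", priority: 0, hidden: true },
    { _id: 4, host: "d1.example.net:27017", priority: 0, slaveDelay: 3600 },
    { _id: 5, host: "d2.example.net:27017", priority: 0, slaveDelay: 86400 },
  ],
};

// Hidden and delayed members should never become primary, so they need priority 0.
// Invented sanity check; returns the members that break that rule.
function findInvalidMembers(config) {
  return config.members.filter(
    (m) => (m.hidden === true || m.slaveDelay > 0) && m.priority !== 0
  );
}

console.log(findInvalidMembers(replicaSetConfig).length); // 0
```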
  • 55. AGGREGATION FRAMEWORK
      • Aggregation Framework provides GROUP BY like functionality without map reduce
      • Many examples
      • Detailed reference
  • 56. Sample documents:
      {
        "_id" : ObjectId("51b833cd884498553b746047"),
        "title" : "Book 1",
        "author" : "Ima Writer",
        "tags" : [ "awesome", "ok", "lousy", "ok", "meh", "meh" ]
      }
      {
        "_id" : ObjectId("51b833ee884498553b746048"),
        "title" : "Book 2",
        "author" : "Heesan Author",
        "tags" : [ "awesome", "ok", "lousy", "awesome", "good", "good" ]
      }
  • 57. Pipeline:
      db.articles.aggregate(
        { $project : { author : 1, tags : 1 } },
        { $unwind : "$tags" },
        { $group : {
            _id : { tags : "$tags" },
            authors : { $addToSet : "$author" }
        } }
      );
  • 58. Result:
      {
        "result" : [
          {
            "_id" : { "tags" : "good" },
            "authors" : [ "Heesan Author" ]
          },
          {
            "_id" : { "tags" : "meh" },
            "authors" : [ "Heesan Author", "Ima Writer" ]
          }
        ],
        "ok" : 1
      }
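
The three pipeline stages can be emulated in plain JavaScript to see what each one contributes. This is a rough sketch rather than how MongoDB executes a pipeline: `authorsByTag` is a hypothetical helper, and it runs only against the two sample documents shown on slide 56, so its output depends entirely on that input.

```javascript
// The two sample documents from slide 56 (without _id, which the pipeline ignores).
const articles = [
  { title: "Book 1", author: "Ima Writer",
    tags: ["awesome", "ok", "lousy", "ok", "meh", "meh"] },
  { title: "Book 2", author: "Heesan Author",
    tags: ["awesome", "ok", "lousy", "awesome", "good", "good"] },
];

// Hypothetical helper that mimics $project, $unwind and $group/$addToSet.
function authorsByTag(docs) {
  const groups = new Map();
  for (const { author, tags } of docs) {   // $project: keep only author and tags
    for (const tag of tags) {              // $unwind: one entry per tag value
      if (!groups.has(tag)) groups.set(tag, new Set());
      groups.get(tag).add(author);         // $group with $addToSet: unique authors per tag
    }
  }
  return [...groups].map(([tag, authors]) => ({
    _id: { tags: tag },
    authors: [...authors],
  }));
}

console.log(JSON.stringify(authorsByTag(articles), null, 2));
```

Using a `Set` per tag is what stands in for `$addToSet`: a repeated tag on the same document adds its author only once.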
  • 59. SHARDING
      • MongoDB’s Sharding allows you to scale your data beyond one physical machine:
          - need more RAM
          - need more CPU
          - need more disk
  • 60. SHARDING DEPLOYMENT
      Diagram: S1, S2, S3; M1, M2, M3; C1
      Legend: (C)onfig, (S)hard server (mongos), (M)ongo shard node (mongod)
  • 61. MAP REDUCE
      • MongoDB’s mapReduce performs complex aggregation operations
      • Many examples
      • Even more fun than regex!
  • 62. Map Reduce is covered in detail in a later class at BIGDIVE
  • 63. QUESTIONS AND ANSWERS
  • 64. THANK YOU