Data as Documents: Overview and intro to MongoDB
These slides are from my talk at BigDive 2013 in Turin, Italy. The talk is about databases in general and how we evolved to where we are today. A lot of command-line work is not shown here; the slides are mostly a reference for attendees.

    Presentation Transcript

    • DATA AS DOCUMENTS
        • Mitch Pirtle
        • BigDive 2013
        • Turin, Italy
    • ABOUT ME
        • Moved from NYC to TO in 2011
        • Recovering Joomla! founder
        • CTO @soundaymusic
        • Use primarily PHP (Lithium), Node.js
        • MongoDB Master
    • ABOUT THIS TALK
        • Background on database history
        • Impact from the Web
        • Emerging solutions and technologies
        • Hands-on session
        • Close with Q&A
    • Are you done with lunch?
    • IN THE BEGINNING
        • Data was simple.
        • Performance was simpler.
        • Scale was a rare need.
    • BIRTH OF RELATIONAL DATA
        • Applications got more complex.
        • Many apps, one database pushed logic into the data tier.
        • “Business rules” was the king buzzword.
    • BIRTH OF WEB
        • Very complex architecture
        • Very high scale requirements
        • Rapid application development
    • WRONG TOOL RIGHT JOB?
        • Was great for data consistency and features, but...
        • Impossible to scale
        • Impedance mismatch with modern apps
    • ALTERNATIVES
        • Key / Value
        • Documents
        • Memory-only*
    • KEY VALUE
        • EXAMPLES: Memcache, Voldemort, Cassandra, Dynamo, Hibari, Riak
        • No schema needed
        • Blazing fast
        • Minimal features
    • DOCUMENT
        • EXAMPLES: MongoDB, SimpleDB, ElasticSearch, OrientDB
        • Rich datatypes matching modern apps
        • More features
        • Mostly JSON based
    • EXAMPLE PLATFORMS
    • MONGODB
        • Document database, uses JSON
        • Many user/developer features
        • Many deployment features
        • Designed specifically for modern scale challenges and programming languages
    • REDIS
        • Key-value database
        • Extended data types
        • Many features
        • Similar facilities for scale and performance
    • VOLDEMORT
        • Key-value
        • Extreme scale
    • HADOOP
        • Framework, not really a database
        • Born from Google’s MapReduce and distributed file system efforts
    • DEPLOYMENT OVERVIEW
    • DEDICATED SYSTEMS
        • Low cost, simple to set up
        • Great performance
        • Difficult to scale
        • Require constant management
    • TETHERED CLOUD
        • Takes dedicated environment and extends with cloud infrastructure for scale
        • Extremely flexible
        • Even more management and administration
    • FULL CLOUD
        • High initial effort
        • Much simpler to manage long term
        • Extreme scale
        • Possibility for equally extreme cost savings*
    • DEVELOPERS!
    • (hang on a minute)
    • DEVELOPERS!
    • (much better)
    • (ok now to get serious)
    • WORKING WITH SQL
        • Crap, now I need an ORM!
        • Disconnect between relational data and object languages
        • Tons of debugging fun!
    • WORKING WITH MONGODB
        • Simplifies data access
        • Simplifies code
        • Fewer execution steps make faster and lighter apps
    • COMMON TERMS (SQL <-> MongoDB)
        • database <-> database
        • table <-> collection
        • row (result) <-> document
        • column <-> property (field)
    • WHAT IS JSON?
    • DOCUMENT DESIGN
        MongoDB documents are BSON:
        • strings
        • integers
        • arrays
        • objects
        • dates
        • boolean
        • regex
        • symbol
        • javascript
        • ObjectID
        • timestamps
        • GridFS
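
One way to see why the slide says BSON rather than JSON: plain JSON text has no date or regex type, while BSON does. The sketch below is ordinary JavaScript (runnable in Node.js, no database needed); the document and its field names are invented for illustration.

```javascript
// A plain JavaScript object using several of the value types listed above.
// Field names and values are made up for illustration.
const doc = {
  title: "Book 1",                            // string
  pages: 250,                                 // integer
  tags: ["awesome", "ok"],                    // array
  publisher: { city: "Turin" },               // embedded object
  published: new Date(Date.UTC(2013, 5, 11)), // date
  inPrint: true,                              // boolean
  isbnPattern: /^978-/,                       // regex
};

// A round trip through JSON text loses the date and regex types.
const roundTripped = JSON.parse(JSON.stringify(doc));

console.log(doc.published instanceof Date);            // true
console.log(typeof roundTripped.published);            // string
console.log(JSON.stringify(roundTripped.isbnPattern)); // {}
```

A typed binary format avoids this loss, which is presumably why the driver layer exchanges BSON instead of JSON text.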
    • DATATYPE: OBJECTID
        • MongoDB’s ObjectID is a 12-byte BSON type, composed of Unix seconds since the epoch (4 bytes), a machine identifier (3 bytes), a process id (2 bytes), and a counter that starts at a random value (3 bytes).
    • DATATYPE: OBJECTID
        ObjectId("4ee75a9c318b9d2c640001a6")
    • DATATYPE: OBJECTID
        • ObjectID is not a string. Always reference it as ObjectId("..."); comparisons against the bare string will not match.
    • DATATYPE: OBJECTID
        > x = ObjectId()
        ObjectId("51b73dff884498553b746046")
        > x.getTimestamp()
        ISODate("2013-06-11T15:10:55Z")
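
As a rough illustration of the byte layout described on the earlier slide, the following plain JavaScript splits the ObjectId from the shell session above into its four fields. `parseObjectId` is a hypothetical helper written for this note, not part of the shell or any driver, and it assumes the 4/3/2/3 layout exactly as the slide states it.

```javascript
// Split a 24-character hex ObjectId into the four fields from the slide.
// parseObjectId is a hypothetical helper, not a MongoDB API.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // 4 bytes: Unix seconds
  return {
    timestamp: new Date(seconds * 1000), // Date expects milliseconds
    machine: hex.slice(8, 14),           // 3 bytes
    pid: hex.slice(14, 18),              // 2 bytes
    counter: hex.slice(18, 24),          // 3 bytes
  };
}

const parts = parseObjectId("51b73dff884498553b746046");
console.log(parts.timestamp.toISOString()); // 2013-06-11T15:10:55.000Z
console.log(parts.machine, parts.pid, parts.counter); // 884498 553b 746046
```

The decoded timestamp matches what `getTimestamp()` printed in the session, which is why an ObjectId can stand in for a creation time.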
    • DATATYPE: DATE
        • MongoDB’s Date is a 64-bit integer that counts milliseconds since the Unix epoch. It is signed; negative values represent dates before 1970.
    • DATATYPE: DATE
        > when = new Date()
        ISODate("2013-06-11T15:18:30.241Z")
        > when.toString()
        Tue Jun 11 2013 17:18:30 GMT+0200 (CEST)
        > when.getMonth()
        5
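
The millisecond representation can be checked with the JavaScript `Date` type itself, which the shell session above is using. This small sketch reuses the instant printed on the slide; it only shows the JavaScript side and makes no claim about how the value is laid out on disk.

```javascript
// The instant from the slide, as milliseconds since 1970-01-01T00:00:00Z.
const when = new Date("2013-06-11T15:18:30.241Z");
console.log(when.getTime());     // 1370963910241
console.log(when.getUTCMonth()); // 5 (months are zero-based, so 5 is June)

// A negative millisecond count is an instant before the epoch.
const before = new Date(-1000);
console.log(before.toISOString()); // 1969-12-31T23:59:59.000Z
```

The zero-based month explains why `getMonth()` returned 5 for a June date in the session.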
    • DATATYPE: GRIDFS
        • MongoDB’s GridFS is a facility that allows you to store binary files within the database, and allows you to extend them with JSON metadata.
    • (ok this part is easier on the command line. more on this later in this class.)
    • COMMON TASKS
        • find(), findOne()
        • findAndModify()
        • ensureIndex()
        • drop()
        • insert()
        • update()
        • upsert (an option to update(), not a separate method)
        • save()
        • remove()
        • stats()
    • INDEXES
        • MongoDB’s indexes support a variety of types and needs
        • Indexing overview
    • INDEX TYPES
        • Standard (_id)
        • Secondary
        • Subdocuments
        • Embedded fields
        • Compound
        • ASC and DESC keys
        • Multikeys
        • Unique
        • Sparse
        • Hashed
    • INDEX CREATION
        • Standard:
            db.people.ensureIndex( { zipcode: 1 } )
        • Background:
            db.people.ensureIndex( { zipcode: 1 }, { background: true } )
        • Background Sparse:
            db.people.ensureIndex( { zipcode: 1 }, { background: true, sparse: true } )
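
To make the `sparse` option concrete, here is a toy in-memory model in plain JavaScript. It is only a sketch of the idea: `buildIndex`, its options, and the sample people are hypothetical, and a sorted array stands in for the B-tree a real index uses. The `background` option has no counterpart here because it affects how the index is built, not what it contains.

```javascript
// Toy model of a single-field ascending index: a sorted list of
// [key, documentPosition] pairs. buildIndex is a hypothetical helper and
// does not reflect how MongoDB stores indexes internally.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  docs.forEach((doc, position) => {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) return; // sparse: skip documents without the field
    entries.push([hasField ? doc[field] : null, position]); // else index as null
  });
  // Ascending order, as in { zipcode: 1 }; null keys sort first.
  return entries.sort(([a], [b]) =>
    a === b ? 0 : a === null ? -1 : b === null ? 1 : a < b ? -1 : 1);
}

const people = [
  { name: "Ann", zipcode: "10121" },
  { name: "Bob" }, // no zipcode
  { name: "Cy", zipcode: "10001" },
];

console.log(JSON.stringify(buildIndex(people, "zipcode")));
// [[null,1],["10001",2],["10121",0]]
console.log(JSON.stringify(buildIndex(people, "zipcode", { sparse: true })));
// [["10001",2],["10121",0]]
```

The sparse variant is smaller because documents lacking the field never enter it, which is the trade-off the third form on the slide opts into.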
    • GRIDFS
        • Drivers support GridFS with helper methods, as well as the mongofiles command line tool that is distributed with MongoDB.
        • Crazy, whack-daddy fast.
        • Dead simple to use.
    • (drop to console)
    • NOTE: MongoDB provides many command line tools to work with your database. They are listed and documented in great detail online.
    • HOW MONGODB SCALES
        • Vertically: Replication
        • Horizontally: Sharding
    • REPLICATION
        • MongoDB’s Replica Sets keep copies of your data on several members: one primary accepts all writes, and secondaries replicate it for failover and can also serve reads
        • Many tutorials and procedures
    • REPLICATION (diagram)
        • Nodes: M1, M2, M3, H1, D1, D2
        • Legend: (M)ember, (H)idden, (D)elayed
    • AGGREGATION FRAMEWORK
        • Aggregation Framework provides GROUP BY like functionality without map reduce
        • Many examples
        • Detailed reference
    • Sample documents:
        {
            "_id" : ObjectId("51b833cd884498553b746047"),
            "title" : "Book 1",
            "author" : "Ima Writer",
            "tags" : [
                "awesome",
                "ok",
                "lousy",
                "ok",
                "meh",
                "meh"
            ]
        }
        {
            "_id" : ObjectId("51b833ee884498553b746048"),
            "title" : "Book 2",
            "author" : "Heesan Author",
            "tags" : [
                "awesome",
                "ok",
                "lousy",
                "awesome",
                "good",
                "good"
            ]
        }
    • Aggregation pipeline:
        db.articles.aggregate(
            { $project : { author : 1, tags : 1 } },
            { $unwind : "$tags" },
            { $group : {
                _id : { tags : "$tags" },
                authors : { $addToSet : "$author" }
            } }
        );
    • Result:
        {
            "result" : [
                {
                    "_id" : {
                        "tags" : "good"
                    },
                    "authors" : [
                        "Heesan Author"
                    ]
                },
                {
                    "_id" : {
                        "tags" : "meh"
                    },
                    "authors" : [
                        "Sheesan Author",
                        "Ima Writer"
                    ]
                }
            ],
            "ok" : 1
        }
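
For readers without a server at hand, the three pipeline stages can be imitated in plain JavaScript. This is a sketch of what each stage does conceptually, not how the server executes it, and it runs only over the two sample documents shown on the slide. The result printed on the slide names a third author, so it appears to come from a larger collection, and the output here will differ accordingly.

```javascript
// The two sample documents from the slide (without _id, which is not needed here).
const articles = [
  { title: "Book 1", author: "Ima Writer",
    tags: ["awesome", "ok", "lousy", "ok", "meh", "meh"] },
  { title: "Book 2", author: "Heesan Author",
    tags: ["awesome", "ok", "lousy", "awesome", "good", "good"] },
];

// $project: keep only author and tags.
const projected = articles.map(({ author, tags }) => ({ author, tags }));

// $unwind: one output record per element of the tags array.
const unwound = projected.flatMap(({ author, tags }) =>
  tags.map((tag) => ({ author, tags: tag })));

// $group with $addToSet: collect the distinct authors for each tag.
const groups = new Map();
for (const { author, tags } of unwound) {
  if (!groups.has(tags)) groups.set(tags, new Set());
  groups.get(tags).add(author);
}

const result = [...groups].map(([tag, authors]) => ({
  _id: { tags: tag },
  authors: [...authors],
}));

console.log(JSON.stringify(result, null, 2));
```

Using a `Set` per tag mirrors `$addToSet`: an author who used the same tag twice still appears only once in that tag's list.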
    • SHARDING
        • MongoDB’s Sharding allows you to scale your data beyond one physical machine:
            - need more RAM
            - need more CPU
            - need more disk
    • SHARDING DEPLOYMENT (diagram)
        • Nodes: S1, S2, S3; M1, M2, M3; C1
        • Legend: (C)onfig, (S)hard server (mongos), (M)ongo shard node (mongod)
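
The routing role of the mongos tier in the diagram can be pictured with a toy range table in plain JavaScript. Everything here is an assumption made for illustration: the shard key (a zipcode string), the ranges, and the `route` helper are invented, and the shard names are borrowed from the diagram labels. In a real cluster the chunk ranges live on the config servers and mongos consults them.

```javascript
// Toy router: each shard owns a half-open range [min, max) of the shard key.
// Ranges are invented for illustration; shard names follow the diagram.
const chunks = [
  { shard: "M1", min: "00000", max: "30000" },
  { shard: "M2", min: "30000", max: "60000" },
  { shard: "M3", min: "60000", max: "99999\uffff" }, // upper bound past any zipcode
];

// Return the shard whose range contains the given key.
function route(zipcode) {
  const chunk = chunks.find((c) => zipcode >= c.min && zipcode < c.max);
  if (!chunk) throw new Error(`no chunk covers ${zipcode}`);
  return chunk.shard;
}

console.log(route("10121")); // M1
console.log(route("30000")); // M2
console.log(route("99999")); // M3
```

Because each key maps to exactly one range, a write or a key-based read only has to touch one shard, which is where the extra RAM, CPU, and disk capacity comes from.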
    • MAP REDUCE
        • MongoDB’s mapReduce performs complex aggregation operations
        • Many examples
        • Even more fun than regex!
    • Map Reduce is covered in detail in a later class at BIGDIVE
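
Until that class, the shape of a map/reduce job can be sketched in plain JavaScript. This is a toy, single-process imitation that counts tag occurrences over the two sample books from the aggregation slides; the `mapReduce`, `map`, and `reduce` functions below are written for this note and are not the server's `mapReduce` command, although the output follows the same `_id`/`value` shape.

```javascript
const books = [
  { author: "Ima Writer",
    tags: ["awesome", "ok", "lousy", "ok", "meh", "meh"] },
  { author: "Heesan Author",
    tags: ["awesome", "ok", "lousy", "awesome", "good", "good"] },
];

// map: produce one [key, value] pair per tag occurrence.
function map(doc) {
  return doc.tags.map((tag) => [tag, 1]);
}

// reduce: fold all values emitted for one key into a single value.
function reduce(key, values) {
  return values.reduce((sum, n) => sum + n, 0);
}

// Toy driver: group the mapped pairs by key, then reduce each group.
function mapReduce(docs, mapFn, reduceFn) {
  const grouped = new Map();
  for (const doc of docs) {
    for (const [key, value] of mapFn(doc)) {
      if (!grouped.has(key)) grouped.set(key, []);
      grouped.get(key).push(value);
    }
  }
  return [...grouped].map(([key, values]) => ({
    _id: key,
    value: reduceFn(key, values),
  }));
}

const counts = mapReduce(books, map, reduce);
console.log(JSON.stringify(counts));
```

The split into a per-document map step and a per-key reduce step is what lets the same job be spread across many machines.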
    • QUESTIONS AND ANSWERS
    • THANK YOU