• Session 1: All about MongoDB (this one)!
• Session 2: All about Node.js (that’s next)
• Session 3: The coolness of both together
That’s a lotta
lotta stuff to
All About MongoDB
• Brief introduction to MongoDB
• Really cool discoveries and surprises
• Shameful admissions and painful stories
In The Beginning
• We had relational databases. Back then they
were called “databases” and that’s where you
stored your data.
• Primary focus: atomicity, consistency, reliability.
• Was normal to spend 6 hours. ON ONE QUERY.
• I love vacuum tubes, keep you warm in winter.
• Life was good.
• Hello, Internet!
• Databases became immediate source of pain for
• Traffic grew, and along with it came bigger
expectations, infinitely more complexity, a slew
of new platforms, and Big Data™
• Web languages gravitated toward objects, not
• Size of data needed to live on more than one
• Performance requirements needed to be far
Along came sharding
• Can split your data across multiple machines
• Also splits your query load across multiple
• Like RAID for your data, right?
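To make the "split your data across multiple machines" idea concrete, here is a minimal sketch of hash-based routing, where each document's key decides which machine holds it. This is a generic illustration and not MongoDB's actual sharding mechanism; `shard_for` and `ShardedStore` are hypothetical names, and each "machine" is simulated with an in-memory dict.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (same key, same shard)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy key/document store spread across N in-memory 'machines'."""

    def __init__(self, num_shards: int):
        # Each dict stands in for one machine (an assumption for illustration).
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, doc: dict) -> None:
        # Writes go only to the shard that owns the key.
        self.shards[shard_for(key, len(self.shards))][key] = doc

    def get(self, key: str):
        # Reads by key touch exactly one shard, which is what splits query load.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=3)
for user_id in ("alice", "bob", "carol", "dave"):
    store.put(user_id, {"name": user_id})
```

A lookup by key only ever hits one shard. Queries that do not include the key have to visit every shard, which is where the problems on the next slide come from.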
What sharding brought
along for the ride
• How do you back this stuff
• How do you spread a group
query across N machines
• How do you run a join query
that spans a sharded table?
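One common answer to the "group query across N machines" question is scatter-gather: each machine computes a partial result over its own data, and a coordinator merges the partials. The sketch below assumes each shard is just a list of dicts; `group_count_on_shard` and `scatter_gather_count` are hypothetical helper names, not part of any database API.

```python
from collections import Counter


def group_count_on_shard(docs, field):
    """Partial GROUP BY ... COUNT computed locally on one shard."""
    return Counter(doc[field] for doc in docs if field in doc)


def scatter_gather_count(shards, field):
    """Ask every shard for its partial counts, then merge them."""
    total = Counter()
    for docs in shards:
        # Counter.update adds counts together rather than overwriting them.
        total.update(group_count_on_shard(docs, field))
    return dict(total)


# Three simulated shards holding order documents (made-up sample data).
shards = [
    [{"state": "NY"}, {"state": "CA"}],
    [{"state": "NY"}],
    [{"state": "TX"}, {"state": "CA"}, {"state": "NY"}],
]
counts = scatter_gather_count(shards, "state")
```

Counts and sums merge cleanly this way. An average does not, so each shard would need to return a sum and a count separately. Joins across shards have no equally simple merge step, which is the pain the slide is pointing at.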
All those hours, spent
mastering 3NF and
The Promises of MongoDB
• Speed - crazy whack-daddy fast
• Simplicity - JSON documents FTW
• Embedded documents
• 16MB limit
• Scale - sharding, replication out of the box
• Yes, I said whack-daddy.
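As a rough illustration of "JSON documents" with "embedded documents", the sketch below models an order as one nested structure, so a single lookup returns the customer and line items together. The field names and sample values are made up. The size check uses JSON length only as a stand-in, since real documents are stored as BSON and their size will differ.

```python
import json

# MongoDB's per-document size limit from the slide.
MAX_DOC_BYTES = 16 * 1024 * 1024

# One document with an embedded customer and an embedded list of items
# (hypothetical schema, for illustration only).
order = {
    "_id": "order-1001",
    "customer": {"name": "Jane Doe", "email": "jane@example.com"},
    "items": [
        {"sku": "TOY-1", "qty": 2, "price": 9.99},
        {"sku": "BIB-7", "qty": 1, "price": 4.50},
    ],
}


def order_total(doc):
    """Compute the total from the embedded items, with no join required."""
    return round(sum(item["qty"] * item["price"] for item in doc["items"]), 2)


def approx_size_bytes(doc):
    """Approximate the document size via its JSON encoding (not true BSON size)."""
    return len(json.dumps(doc).encode("utf-8"))


def fits_in_one_document(doc):
    """Rough guard against the 16MB limit, using the JSON approximation."""
    return approx_size_bytes(doc) <= MAX_DOC_BYTES
```

The limit matters mostly for embedded lists that grow without bound, such as comments on a popular item. Those are usually better kept in their own collection.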
Wait, there’s more
• Fulltext: Allows for compound indexes, supports
• Sharding: You can scale collections across N
• GridFS: Simple interface to store files in your
• Replication: Replica Sets make it possible for
read slaves, failover, redundancy
Mini Case Study: Totsy
• First ecommerce site to rely on MongoDB for all
data. Everything. Even product images and
• I suspected it would be fast.
• I suspected we could develop quickly.
(This was important, as they only let me hire one guy.)
• Went live with MongoDB on a quad-core
consumer grade el-cheapo machine, only 2GB
• I was terrified.
• Over a million moms waiting for the launch.
• Upon launch, load was 0.05. Highest it ever got
was around 0.5.
• Simple models make for less code. There were
no sixteen-table joins and no ORM; one result had
all the data needed from a single query.
• Less code makes for fewer bugs. No more
six-hour query debugging marathons. No more
learning why UNION was faster than JOIN…
• Fewer bugs leave time for more code. Did I
mention they only let me hire one guy?
Even moar impact
• Used GridFS for all media storage.
• Allowed free MD5 checking for duplicates.
• Allowed storage of metadata per file (views,
comments, ratings, whatever else we wanted).
• No need for NFS, clumsy rsync cronjobs, high
costs of NAS or iSCSI.
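The GridFS points above (chunked file storage, MD5-based duplicate detection, per-file metadata) can be sketched with an in-memory stand-in. This is not the real GridFS API: `TinyFileStore` is a hypothetical class, the chunk size is an assumed value, and keying files by their MD5 is a simplification chosen here to make duplicate detection obvious.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # assumed chunk size for this sketch


class TinyFileStore:
    """In-memory imitation of a files collection plus a chunks collection."""

    def __init__(self):
        self.files = {}   # md5 -> file document (length, filename, metadata)
        self.chunks = {}  # (md5, chunk number) -> bytes

    def put(self, data: bytes, filename: str, **metadata) -> str:
        md5 = hashlib.md5(data).hexdigest()
        if md5 in self.files:
            # Same content already stored: skip the write (duplicate check).
            return md5
        for n, start in enumerate(range(0, len(data), CHUNK_SIZE)):
            self.chunks[(md5, n)] = data[start:start + CHUNK_SIZE]
        self.files[md5] = {
            "filename": filename,
            "length": len(data),
            "md5": md5,
            "metadata": dict(metadata),  # views, comments, ratings, etc.
        }
        return md5

    def get(self, md5: str) -> bytes:
        doc = self.files[md5]
        count = -(-doc["length"] // CHUNK_SIZE)  # ceiling division
        return b"".join(self.chunks[(md5, n)] for n in range(count))


store = TinyFileStore()
image = b"\x89PNG" + b"\x00" * (CHUNK_SIZE + 10)  # spans two chunks
first_id = store.put(image, "stroller.png", views=0)
second_id = store.put(image, "stroller-copy.png", views=0)
```

Because the metadata lives in the same document as the file record, updating a view count is an ordinary document update rather than a write to a separate table or sidecar file.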