Chris Lea - What does NoSQL Mean for You

What Does NoSQL Mean for You?
Chris Lea
(mt) Media Temple
FOWA Dublin 2010

For Starters: What does it mean at all?

“NoSQL is a blanket term used to describe
structured storage that doesn’t rely on SQL
to be accessed in a useful way.”

-- Chris Lea

“NoSQL” DOES NOT mean “SQL is Bad”

MySQL does what I need, why should I care?

“If I’d asked my customers what they wanted,
they’d have said a faster horse.” -- Henry Ford

RDBMS                               NoSQL
Designed for generic workloads      Designed to solve specific problems
Large (and growing) feature sets    Trades features for performance

The NoSQL Umbrella

Key / Value Caches
• Redis
• Memcached

Key / Value Stores
• Tokyo Cabinet
• MemcacheDB
• Project Voldemort
• Cassandra

Tabular
• HBase
• Hypertable

Document
• CouchDB
• MongoDB
• Jackrabbit

Should I be Thinking about NoSQL?

Do you need transactions?
• No: think about NoSQL.
• Yes: can you sanely do what you need in the app?
  • Yes: think about NoSQL.
  • No: you probably need an RDBMS.

NoSQL Systems Typically Don’t Do Transactions or Joins
• If you really need transactions, stick with RDBMS
• Not having joins turns out to be not such a big deal

MongoDB is an excellent use case example

Why MongoDB?
• Comfortable if you are coming from MySQL
• Written in C++, so it runs as native machine code
  • no Erlang / Java / virtual machines
• Tools like mongo (shell), mongodump, mongostat, mongoimport
• Native drivers in languages you care about
  • no Thrift / REST / code generation steps

Why MongoDB?
• No complex transactions
  • If you don’t use them, this is a non-issue
• No joins
  • This generally turns out not to be a big deal, because we’re going to rethink our data modeling

Transactions and joins carry a huge computational overhead, even if you don’t use them!

Thinking About Your Data (RDBMS)
• Look at data, determine logical groupings
  • (hope structure never changes)
• Make tables based on groups, link with ID fields
• Break up data on insert, put into appropriate tables
• Use joins on select to re-assemble data
• Create indexes as needed for fast queries

user_t:     user_id, user_name
post_t:     post_id, user_id, post_title, post_body
comment_t:  comment_id, post_id, comment_body


This leads to queries such as:

SELECT post_title, post_body, post_id FROM post_t, user_t WHERE
user_t.user_name = 'Lorraine' AND post_t.user_id = user_t.user_id LIMIT 1;

SELECT comment_body FROM comment_t WHERE comment_t.post_id = $post_id;
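
The RDBMS workflow above can be sketched end to end with Python's built-in sqlite3 module. This is only an illustration under assumptions: SQLite stands in for MySQL, the sample rows are made up, and the helper names build_blog_db and load_post are hypothetical. It shows one logical post being split across three tables on insert and re-assembled with a join plus a second query on select.

```python
import sqlite3


def build_blog_db():
    """Create the three tables from the slide in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_t (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE post_t (post_id INTEGER PRIMARY KEY, user_id INTEGER,
                             post_title TEXT, post_body TEXT);
        CREATE TABLE comment_t (comment_id INTEGER PRIMARY KEY, post_id INTEGER,
                                comment_body TEXT);
    """)
    # Break the data up on insert: one logical post touches three tables.
    conn.execute("INSERT INTO user_t VALUES (1, 'Lorraine')")
    conn.execute("INSERT INTO post_t VALUES (10, 1, 'Hello', 'First post')")
    conn.executemany("INSERT INTO comment_t VALUES (?, ?, ?)",
                     [(100, 10, "Nice"), (101, 10, "Agreed")])
    return conn


def load_post(conn, user_name):
    """Re-assemble one post and its comments: a join, then a second query."""
    row = conn.execute(
        "SELECT post_t.post_id, post_title, post_body "
        "FROM post_t JOIN user_t ON post_t.user_id = user_t.user_id "
        "WHERE user_t.user_name = ? LIMIT 1", (user_name,)).fetchone()
    if row is None:
        return None
    post_id, title, body = row
    comments = [c for (c,) in conn.execute(
        "SELECT comment_body FROM comment_t WHERE post_id = ? "
        "ORDER BY comment_id", (post_id,))]
    return {"title": title, "body": body, "comments": comments}


conn = build_blog_db()
print(load_post(conn, "Lorraine"))
```

The dictionary that load_post returns is roughly the shape a document store would keep as a single record in the first place.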

Thinking About Your Data (MongoDB)

• Figure out how you will eventually use the data
• Store it that way
• Create indexes as needed for fast queries


import datetime

from pymongo import MongoClient

# Connect to a MongoDB server on localhost (default port)
client = MongoClient()
db = client['blog']

posts = db['posts']

post = {"author": "Lorraine",
        "title": "Who on Earth lets Chris Lea Talk on Stage?",
        "post": "Seriously. That's just not cool.",
        "comments": ["Is he really that bad?", "Yes, he really is."],
        "date": datetime.datetime.now(datetime.timezone.utc)}

# Store the whole post, comments included, as one document
posts.insert_one(post)



posts = db['posts']

post = posts.find_one({"author": "Lorraine"})

Say Goodbye to Schemas


If you want new fields... just start using them!

posts = db['posts']

post = {
    # ... same fields as the earlier post ...
    "tags": ["fowa", "nosql", "nerds"],
}

posts.insert_one(post)
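
One practical consequence of adding fields on the fly is that older documents will not have them, so reading code has to tolerate a missing field. The sketch below illustrates that with plain Python dictionaries standing in for stored documents. The names stored_posts, tags_of and posts_tagged are hypothetical, and no MongoDB server is involved.

```python
# Hypothetical stand-in for a collection: a plain list of dict "documents".
stored_posts = [
    {"author": "Lorraine", "title": "Old post"},  # written before "tags" existed
    {"author": "Lorraine", "title": "New post", "tags": ["fowa", "nosql"]},
]


def tags_of(post):
    """Return a post's tags, treating a missing field as an empty list."""
    return post.get("tags", [])


def posts_tagged(posts, tag):
    """Titles of all posts that carry the given tag."""
    return [p["title"] for p in posts if tag in tags_of(p)]


print(posts_tagged(stored_posts, "nosql"))
```

Defaulting a missing field at read time keeps old and new documents usable side by side without a migration step.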

Enjoy a Wealth of Query Options


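posts = db['posts']

# First post with an exact author match
posts.find_one({"author": "Lorraine"})

# At most five posts by that author
posts.find({"author": "Lorraine"}).limit(5)

# Authors whose name starts with "Lor" (regular expression match)
posts.find({"author": {"$regex": "^Lor"}})

# Every author except Lorraine
posts.find({"author": {"$ne": "Lorraine"}})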
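
To make the meaning of these query shapes concrete, here is a toy matcher over plain Python dictionaries. It is a simplified sketch, not how MongoDB evaluates queries: it supports only exact equality, a regular-expression test and a not-equal test (spelled with MongoDB's `$regex` and `$ne` operator names), and the functions matches and find are hypothetical helpers.

```python
import re


def matches(doc, query):
    """Return True if doc satisfies every field condition in query."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            # Operator form, e.g. {"$ne": "Lorraine"}
            for op, arg in cond.items():
                if op == "$ne":
                    if value == arg:
                        return False
                elif op == "$regex":
                    if not isinstance(value, str) or re.search(arg, value) is None:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value != cond:
            # Plain value means exact equality
            return False
    return True


def find(docs, query, limit=None):
    """Return matching documents in order, optionally capped at limit."""
    found = [d for d in docs if matches(d, query)]
    return found if limit is None else found[:limit]


docs = [{"author": "Lorraine"}, {"author": "Lorenzo"}, {"author": "Chris"}]
print(find(docs, {"author": {"$regex": "^Lor"}}))
print(find(docs, {"author": {"$ne": "Lorraine"}}))
```

Because a query is itself just a nested dictionary, it can be built, stored and combined like any other data, which is part of what makes this style comfortable from application code.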

Enjoy a Massive Performance Jump

• Mileage will vary, but 10x is not uncommon
  • For reads and writes
• Writes happen at near-native disk speed
  • Logging to MongoDB is perfectly acceptable
• Reads for active data approach Memcached speeds

Ability to write bad queries is
enormously reduced!

• No joins means less need for complex indexes
• Chances of index / query mismatches are vastly lower
• Disk I/O is much less complex, and therefore much faster

Caveats for MongoDB

• Really should use 64-bit machines for production
  • 32-bit builds are limited to about 2 GB of data in total
• Happiest with lots of RAM relative to active data
• Under heavy development
  • Features / drivers / docs changing rapidly
