Chris Lea - What does NoSQL Mean for You
 

From FOWA Dublin 2010

Video: http://www.ustream.tv/myvideos/1/6906682

    Presentation Transcript

    • What Does NoSQL Mean for You? Chris Lea, (mt) Media Temple. FOWA Dublin 2010
    • For Starters: What does it mean at all?
      • “NoSQL is a blanket term used to describe structured storage that doesn’t rely on SQL to be accessed in a useful way”. -- Chris Lea
      • “NoSQL” DOES NOT mean “SQL is Bad”
    • MySQL does what I need, why should I care?
      • “If I’d asked my customers what they wanted, they’d have said a faster horse.” -- Henry Ford
      • RDBMS
        • Designed for generic workloads
        • Large (and growing) feature sets
      • NoSQL
        • Designed to solve specific problems
        • Trades features for performance
    • (the NoSQL umbrella)
      • Key / Value Caches
        • Redis
        • Memcached
      • Key / Value Stores
        • Tokyo Cabinet
        • MemcacheDB
        • Project Voldemort
        • Cassandra
      • Tabular
        • HBase
        • Hypertable
      • Document
        • CouchDB
        • MongoDB
        • Jackrabbit
    • Should I be Thinking about NoSQL?
      • Do you need transactions?
        • No: Think about NoSQL.
        • Yes: Can you sanely do what you need in the app?
          • Yes: Think about NoSQL.
          • No: Probably need RDBMS.
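One way to read the flowchart on this slide, as a small Python sketch. The function name and its two boolean parameters are made up for illustration, and the branch order is an interpretation of the slide rather than something the talk spells out.

```python
def storage_suggestion(needs_transactions: bool, can_handle_in_app: bool = False) -> str:
    """Return the slide's suggestion for a workload (hypothetical helper)."""
    if not needs_transactions:
        # No transactional requirement: nothing ties you to an RDBMS.
        return "Think about NoSQL."
    if can_handle_in_app:
        # Transactions are needed, but the application can sanely cover them.
        return "Think about NoSQL."
    # Transactions are needed and the application cannot reasonably provide them.
    return "Probably need RDBMS."


for needs_tx, in_app in [(False, False), (True, True), (True, False)]:
    print(needs_tx, in_app, "->", storage_suggestion(needs_tx, in_app))
```

The second parameter only matters when transactions are needed, which is why it defaults to False.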
    • NoSQL Systems Typically Don’t do Transactions or Joins
      • If you really need transactions, stick with RDBMS
      • Not having joins turns out to be not such a big deal
      • MongoDB is an excellent use case example
    • Why MongoDB?
      • Comfortable if you are coming from MySQL
      • Written in C++, which means all machine code
        • no Erlang / Java / virtual machines
      • Tools like mongo (shell), mongodump, mongostat, mongoimport
      • Native drivers in languages you care about
        • no Thrift / REST / code generation steps
    • Why MongoDB?
      • Transactions and joins are a huge computational overhead, even if you don’t use them!
      • No complex transactions
        • If you don’t use them, this is a non-issue
      • No joins
        • This turns out to not be a big deal generally, because we’re going to rethink our data modeling
    • Thinking About Your Data (RDBMS)
      • Look at data, determine logical groupings
        • (hope structure never changes)
      • Make tables based on groups, link with ID fields
      • Break up data on insert, put into appropriate tables
      • Use joins on select to re-assemble data
      • Create indexes as needed for fast queries
    • Thinking About Your Data (RDBMS)
      • user_t: user_id, user_name
      • post_t: post_id, user_id, post_title, post_body
      • comment_t: comment_id, post_id, comment_body
    • Thinking About Your Data (RDBMS)
      • This leads to queries such as:
        SELECT post_title, post_body, post_id
        FROM post_t, user_t
        WHERE user_t.user_name = 'Lorraine'
          AND post_t.user_id = user_t.user_id
        LIMIT 1;

        SELECT comment_body
        FROM comment_t
        WHERE comment_t.post_id = $post_id;
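To try the relational workflow end to end without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the slides; the column types, the ID values, and the choice of SQLite are assumptions made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE user_t (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE post_t (post_id INTEGER PRIMARY KEY, user_id INTEGER,
                         post_title TEXT, post_body TEXT);
    CREATE TABLE comment_t (comment_id INTEGER PRIMARY KEY, post_id INTEGER,
                            comment_body TEXT);
""")

# Break the data up on insert: one blog post becomes rows in three tables.
conn.execute("INSERT INTO user_t (user_id, user_name) VALUES (1, 'Lorraine')")
conn.execute(
    "INSERT INTO post_t (post_id, user_id, post_title, post_body) VALUES (?, ?, ?, ?)",
    (10, 1, "Who on Earth lets Chris Lea Talk on Stage?", "Seriously. That's just not cool."),
)
conn.executemany(
    "INSERT INTO comment_t (post_id, comment_body) VALUES (?, ?)",
    [(10, "Is he really that bad?"), (10, "Yes, he really is.")],
)

# Re-assemble on select: first the post (joined to its author), then its comments.
post_title, post_body, post_id = conn.execute(
    "SELECT post_title, post_body, post_id FROM post_t, user_t "
    "WHERE user_t.user_name = ? AND post_t.user_id = user_t.user_id LIMIT 1",
    ("Lorraine",),
).fetchone()
comments = [
    body
    for (body,) in conn.execute(
        "SELECT comment_body FROM comment_t WHERE comment_t.post_id = ? ORDER BY comment_id",
        (post_id,),
    )
]
print(post_title, comments)
```

The `?` placeholders play the role of `$post_id` on the slide: the second query cannot run until the first one has returned the post's ID.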
    • Thinking About Your Data (MongoDB)
      • Figure out how you will eventually use the data
      • Store it that way
      • Create indexes as needed for fast queries
    • Thinking About Your Data (MongoDB)
        import datetime
        from pymongo import MongoClient  # the 2010 slides used Connection, removed in pymongo 3

        connection = MongoClient()
        db = connection['blog']
        posts = db['posts']
        # One document holds the post together with its comments.
        post = {"author": "Lorraine",
                "title": "Who on Earth lets Chris Lea Talk on Stage?",
                "post": "Seriously. That's just not cool.",
                "comments": ["Is he really that bad?", "Yes, he really is."],
                "date": datetime.datetime.utcnow()}
        posts.insert_one(post)  # insert() on the original slide
    • Thinking About Your Data (MongoDB)
        # Same posts collection as on the previous slide;
        # one lookup returns the post with its comments.
        post = posts.find_one({"author": "Lorraine"})
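To make "store it that way" concrete, the sketch below folds rows shaped like the user_t / post_t / comment_t tables from the RDBMS slides into a single document shaped like the MongoDB post. It uses plain dicts and no database; `build_post_document` and the sample rows are hypothetical, introduced only for illustration.

```python
def build_post_document(user, post, comments):
    """Fold one user row, one post row, and that post's comment rows into one document."""
    return {
        "author": user["user_name"],
        "title": post["post_title"],
        "post": post["post_body"],
        # Comments are embedded in the post instead of living in their own table.
        "comments": [c["comment_body"] for c in comments if c["post_id"] == post["post_id"]],
    }


user_row = {"user_id": 1, "user_name": "Lorraine"}
post_row = {"post_id": 10, "user_id": 1,
            "post_title": "Who on Earth lets Chris Lea Talk on Stage?",
            "post_body": "Seriously. That's just not cool."}
comment_rows = [
    {"comment_id": 1, "post_id": 10, "comment_body": "Is he really that bad?"},
    {"comment_id": 2, "post_id": 10, "comment_body": "Yes, he really is."},
    {"comment_id": 3, "post_id": 11, "comment_body": "A comment on some other post."},
]

document = build_post_document(user_row, post_row, comment_rows)
print(document)
```

The ID columns disappear from the result: they existed only to link the tables back together, and the embedded document no longer needs them.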
    • Say Goodbye to Schemas
      • If you want new fields... just start using them!
        import datetime
        from pymongo import MongoClient  # the 2010 slides used Connection, removed in pymongo 3

        connection = MongoClient()
        db = connection['blog']
        posts = db['posts']
        post = {"author": "Lorraine",
                "title": "Who on Earth lets Chris Lea Talk on Stage?",
                "post": "Seriously. That's just not cool.",
                "comments": ["Is he really that bad?", "Yes, he really is."],
                "tags": ["fowa", "nosql", "nerds"],  # new field, no schema change needed
                "date": datetime.datetime.utcnow()}
        posts.insert_one(post)  # insert() on the original slide
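The flip side of adding fields freely is that older documents will not have them, so reading code has to tolerate a missing key. A minimal sketch with plain dicts standing in for stored documents; `tags_of` is a hypothetical helper, not part of pymongo.

```python
def tags_of(post):
    """Return a post's tags, treating a missing "tags" field as an empty list."""
    return post.get("tags", [])


# A document written before the "tags" field existed...
old_post = {"author": "Lorraine", "title": "An older post"}
# ...and one written after.
new_post = {"author": "Lorraine", "title": "A newer post",
            "tags": ["fowa", "nosql", "nerds"]}

# Both shapes can be processed by the same code path.
all_tags = sorted({tag for post in (old_post, new_post) for tag in tags_of(post)})
print(all_tags)
```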
    • Enjoy a Wealth of Query Options
        import re
        from pymongo import MongoClient  # the 2010 slides used Connection, removed in pymongo 3

        connection = MongoClient()
        db = connection['blog']
        posts = db['posts']

        posts.find_one({"author": "Lorraine"})        # first matching post
        posts.find({"author": "Lorraine"}).limit(5)   # at most five matching posts
        posts.find({"author": re.compile("^Lor")})    # author starts with "Lor"
        posts.find({"author": {"$ne": "Lorraine"}})   # author is anyone but Lorraine
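The query documents on this slide are ordinary Python dicts, so they can be built and inspected without a running server. The sketch below wraps the three filter shapes in small helpers; the helper names are made up for illustration, while `$ne` and compiled `re` patterns are standard MongoDB / pymongo filter values.

```python
import re


def author_is(name):
    """Exact match on the author field."""
    return {"author": name}


def author_starts_with(prefix):
    """Prefix match; pymongo accepts a compiled Python pattern as a regex value."""
    return {"author": re.compile("^" + re.escape(prefix))}


def author_is_not(name):
    """Everything except the given author, using the $ne operator."""
    return {"author": {"$ne": name}}


prefix_filter = author_starts_with("Lor")
print(author_is("Lorraine"))
print(prefix_filter["author"].pattern)
print(author_is_not("Lorraine"))
```

Each result would be passed straight to a collection, for example `posts.find(author_is_not("Lorraine")).limit(5)`. `re.escape` keeps a prefix containing regex metacharacters from being interpreted as a pattern.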
    • Enjoy a Massive Performance Jump
      • Mileage will vary, but 10x is not uncommon
        • For reads and writes
      • Writes happen at near disk native speed
        • Logging to MongoDB is perfectly acceptable
      • Reads for active data near Memcached speeds
    • Ability to write bad queries is enormously reduced!
      • No joins means need for complex indexes reduced
      • Chances of index / query mismatches vastly lower
      • Disk I/O much less complex, and therefore much faster
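With one collection and no joins, matching an index to a query is mostly a matter of indexing the field you filter on. A hedged sketch: `ensure_author_index` is a hypothetical helper that calls the `create_index` method of whatever collection it is given, and `RecordingCollection` is a stand-in written for this example so it runs without a MongoDB server.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def ensure_author_index(posts):
    """Ask the collection for an ascending index on "author", the field the queries filter on."""
    return posts.create_index([("author", ASCENDING)])


class RecordingCollection:
    """Stand-in for a pymongo collection: records index requests instead of contacting a server."""

    def __init__(self):
        self.index_specs = []

    def create_index(self, keys):
        self.index_specs.append(keys)
        # Build a name from field and direction, e.g. "author_1".
        return "_".join(f"{field}_{direction}" for field, direction in keys)


fake_posts = RecordingCollection()
index_name = ensure_author_index(fake_posts)
print(index_name, fake_posts.index_specs)
```

Against a real deployment, the same helper would be called with `db['posts']`; a query such as `posts.find({"author": "Lorraine"})` filters on exactly the indexed field.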
    • Caveats for MongoDB
      • Really should use 64bit machines for production
        • 32bit has 2G limit per collection (table)
      • Happiest with lots of RAM relative to active data
      • Under heavy development
        • Features / drivers / docs changing rapidly
    • Questions?