NoSQL
Now we know what it’s not... what is it?
What are we running
from?
• Relational databases are the de facto
standard for storing data in a web
application.
• A lot of times, that data isn’t really
relational at all.
• RDBMSs have lots of rules that can impact
performance.
Rules? What Rules?
• Classic relational databases follow the
ACID rules:
• Atomicity
• Consistency
• Isolation
• Durability
Atomicity
• If any part of the update fails, it all fails.
• Databases have to be able to lock tables
and rows for operations, which can block
or delay other incoming requests.
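A minimal sketch of the all-or-nothing idea, assuming a made-up in-memory store (the TinyStore class and the account data are hypothetical, not any real database API). It works on a copy and swaps it in only on success. Real databases use locks and logs instead, which is where the blocking comes from.

```ruby
# Hypothetical in-memory "accounts table"; not a real database.
class TinyStore
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # Run the block against a working copy and swap it in only if
  # every step succeeds. Any exception leaves @data untouched.
  def transaction
    working = Marshal.load(Marshal.dump(@data)) # deep copy
    yield working
    @data = working
    true
  rescue StandardError
    false
  end
end

store = TinyStore.new("alice" => 100, "bob" => 20)

# A transfer that fails halfway: bob is credited, then alice's debit is refused.
ok = store.transaction do |accounts|
  accounts["bob"] += 150
  raise ArgumentError, "insufficient funds" if accounts["alice"] < 150
  accounts["alice"] -= 150
end

p ok         # false
p store.data # unchanged: alice still has 100, bob still has 20
```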
Consistency
• After a transaction, the data must be in a
valid state with every integrity rule still
satisfied, and all copies of the data must be
consistent with each other (my
interpretation).
• Replication across lots of shards is
expensive, especially if there’s locking
involved.
Isolation
• Data involved in a transaction must be
inaccessible to other operations until the
transaction finishes, so nobody sees a
half-done update.
• Remember the thing about locked rows
and tables?
• It’s a bummer.
Durability
• Once a user is notified that a transaction
has completed, the data must survive (even
a crash or restart), stay accessible, and
satisfy all integrity constraints.
I come not to bury
MySQL...
• Relational databases are great for a lot of
uses.
• If you have data that’s actually relational and
you need transactions, joins and have a
limited number of data types, then an
RDBMS will work for you.
But...
• RDBMSs have been
treated like hammers
and used for things
they’re not good at and
weren’t designed for.
• Like the web...
Thus were born...
• Key-Value Stores
• Wide-Column Stores
• Document Stores/Databases
• Graph Databases
All thrown together &
clumsily dubbed...
NoSQL
Which, despite its
negative sound,
supposedly means:
“Not Only SQL”
Yeah, I don’t believe it
either...
Key-Value
Just what it sounds like. You set a Key to a Value and
can then retrieve it.
Key-Value Benefits
• Simple
• High performance (usually) because there
are no transactions or relations so it’s a
simple bucket and lookup.
• Extremely flexible
• Commonly used as caches in front of
slower resources (like MySQL - bazinga!)
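The caching use can be sketched in a few lines. This assumes a plain Ruby Hash as the key-value store and a made-up SlowDatabase class standing in for MySQL. Neither is a real memcached or MySQL client.

```ruby
# Hypothetical stand-in for a slow backing store such as MySQL.
class SlowDatabase
  attr_reader :reads

  def initialize(rows)
    @rows = rows
    @reads = 0
  end

  def find(key)
    @reads += 1
    @rows[key]
  end
end

# Read-through cache: check the key-value store first, fall back to
# the slow resource on a miss, then remember the answer.
class ReadThroughCache
  def initialize(backend)
    @backend = backend
    @store = {} # the "bucket": key => value
  end

  def get(key)
    return @store[key] if @store.key?(key)
    @store[key] = @backend.find(key)
  end

  # Drop a key after the underlying row changes.
  def expire(key)
    @store.delete(key)
  end
end

db = SlowDatabase.new("user:1" => "Kevin")
cache = ReadThroughCache.new(db)

cache.get("user:1") # miss: goes to the database
cache.get("user:1") # hit: served from the hash
puts db.reads       # 1
```

The expire method matters as much as get: a cache in front of a database is only useful if stale keys are dropped when the underlying row changes.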
Popular Players
• memcached - in memory only, extremely
efficient hashing algorithm allows you to
scale easily to hundreds of nodes.
• Redis - persistent, slightly more complex
than memcached (has support for lists, sets
and hashes) but still highly performant.
• Riak - The Rails Machine guys love it. Jesse?
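A rough sketch of how hashing lets a client spread keys across many cache nodes, assuming a made-up node list and a hypothetical node_for helper. This is plain modulo hashing, the simplest scheme. Many production clients use consistent hashing instead, so that adding a node remaps fewer keys.

```ruby
require "zlib"

# Map a key to a node by hashing it and taking the remainder.
# The same key always lands on the same node, so no lookup table is needed.
def node_for(key, nodes)
  nodes[Zlib.crc32(key) % nodes.size]
end

# Hypothetical node addresses; real clients read these from configuration.
nodes = ["cache1:11211", "cache2:11211", "cache3:11211"]

%w[user:1 user:2 post:42 session:abc].each do |key|
  puts "#{key} -> #{node_for(key, nodes)}"
end
```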
My Uses
• memcached: Read-through cache for
Rails with cache-money.
• redis: persistent cache for results from
our algorithm, partitioned by version and
instance.
Wide Column
• Family of databases modeled on either
Google’s BigTable or Amazon’s Dynamo.
• Pick two out of three from the CAP
theorem in order to get horizontal
scalability.
• Data stored by column instead of by row.
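To make "by column instead of by row" concrete, here is a small sketch with made-up sample data. It is a simplification: real wide-column stores add things like column families and on-disk layout that are not modeled here.

```ruby
# The same three records, laid out two ways (made-up sample data).
rows = [
  { id: 1, artist: "Low",   plays: 120 },
  { id: 2, artist: "Wilco", plays: 300 },
  { id: 3, artist: "Low",   plays: 80 }
]

# Column-oriented layout: one array per column, aligned by position.
columns = rows.each_with_object(Hash.new { |h, k| h[k] = [] }) do |row, cols|
  row.each { |name, value| cols[name] << value }
end

# Row layout: summing plays walks every record.
total_from_rows = rows.sum { |r| r[:plays] }

# Column layout: the same sum reads a single array.
total_from_columns = columns[:plays].sum

puts total_from_rows    # 500
puts total_from_columns # 500
```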
CAP?
• Consistency: All clients always have the
same view of the data.
• Availability: Each client can always read
and write.
• Partition Tolerance: The system works
well despite physical network partitions.
Use cases
• Making sense out of large amounts of data
where you know your query scenario
ahead of time.
• Large = 100s of millions of records.
• Data-mining log files and other sources of
similar data.
Big Players
• HBase
• Cassandra
• Hypertable
• Amazon’s SimpleDB
• Google’s BigTable (the granddaddy of all of
them)
Graph Databases
• Store nodes, edges and properties
• Think of them as Things, Connections and
Properties
• Good for storing properties and
relationships.
• Honestly, I don’t fully understand them...
anyone?
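One way to picture Things, Connections and Properties is a tiny in-memory graph. This is only a sketch with made-up names (TinyGraph, Node, Edge), not how Neo4j or any real graph database is implemented.

```ruby
# Nodes and edges both carry a property hash (all names are made up).
Node = Struct.new(:id, :properties)
Edge = Struct.new(:from, :to, :type, :properties)

class TinyGraph
  def initialize
    @nodes = {}
    @edges = []
  end

  def add_node(id, properties = {})
    @nodes[id] = Node.new(id, properties)
  end

  def connect(from, to, type, properties = {})
    @edges << Edge.new(from, to, type, properties)
  end

  # Follow outgoing edges of one type from a node.
  def neighbors(id, type)
    @edges.select { |e| e.from == id && e.type == type }
          .map { |e| @nodes[e.to] }
  end
end

graph = TinyGraph.new
graph.add_node(:alice, name: "Alice")
graph.add_node(:bob, name: "Bob")
graph.add_node(:mongo, name: "MongoDB")
graph.connect(:alice, :bob, :knows)
graph.connect(:alice, :mongo, :uses)

p graph.neighbors(:alice, :knows).map { |n| n.properties[:name] } # ["Bob"]
```

The query is a walk along connections rather than a join, which is the kind of question graph databases are built to answer.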
The Players
• Neo4j
• FlockDB
• HyperGraphDB
Document Stores
• Short on relationships, tall on rich data
types.
• Big on eventual consistency and flexible
schemas.
• Hybrid of traditional RDBMS and Key-Value
stores.
Use Cases
• Content Management Systems
• Applications with rapid partial updates
• Anything you don’t need joins or
transactions for that you would normally
use an RDBMS for.
The Players
• CouchDB
• MongoDB
• Terrastore
MongoDB
• Support for rich data types: arrays, hashes,
embedded documents, etc.
• Support for adding and removing things
from arrays and embedded documents
($addToSet, for example).
• Map/Reduce support and strong indexes.
• Regular expression support in queries.
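What the array operations do can be shown on a plain Ruby hash. This is a sketch of the semantics only: the add_to_set and pull helpers below are hypothetical stand-ins written for this example, and no MongoDB driver is involved.

```ruby
# A document is just a nested hash here; no MongoDB involved.
doc = { "title" => "NoSQL", "tags" => ["databases"] }

# Rough equivalent of $addToSet: append only if not already present.
def add_to_set(document, field, value)
  document[field] ||= []
  document[field] << value unless document[field].include?(value)
  document
end

# Rough equivalent of $pull: remove every matching element.
def pull(document, field, value)
  (document[field] || []).delete(value)
  document
end

add_to_set(doc, "tags", "mongodb")
add_to_set(doc, "tags", "mongodb") # second call is a no-op
pull(doc, "tags", "databases")

p doc["tags"] # ["mongodb"]

# Regular-expression matching against a field, as a query filter would.
docs = [doc, { "title" => "MySQL tips", "tags" => [] }]
p docs.select { |d| d["title"] =~ /sql\z/i }.map { |d| d["title"] } # ["NoSQL"]
```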
Design Considerations
• Embedded Documents - Use only if
the embedded document will always be
selected with the parent.
• Indexes - MongoDB punishes you much
earlier for missing indexes than MySQL.
• Document size - At the time of this talk,
documents were limited to 4MB (later
releases raised the limit to 16MB), which
should be large enough, but if it’s not...
Real-World MongoDB
• We use MongoDB heavily at MIS.
• Statistics application and reporting
• Top-secret new application
• Web crawler and indexer
• CMS
Real-World Example
Let’s do tags. Everything is taggable now, right?
The MySQL Way
Schema
And to get a “thing’s”
tags?
SELECT `tags`.* FROM `tags`
INNER JOIN `taggings` ON `tags`.id = `taggings`.tag_id
WHERE ((`taggings`.taggable_id = 237)
AND (`taggings`.taggable_type = 'Song'))
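To see what that query is doing, here is the same lookup over in-memory arrays. The rows are made up, and the name column on tags is an assumption; only the id, tag_id, taggable_id and taggable_type columns appear in the query itself.

```ruby
# Made-up rows standing in for the two tables in the query above.
tags = [
  { id: 1, name: "indie" },
  { id: 2, name: "live" },
  { id: 3, name: "cover" }
]
taggings = [
  { tag_id: 1, taggable_id: 237, taggable_type: "Song" },
  { tag_id: 3, taggable_id: 237, taggable_type: "Song" },
  { tag_id: 2, taggable_id: 237, taggable_type: "Album" },
  { tag_id: 2, taggable_id: 12,  taggable_type: "Song" }
]

# WHERE clause: keep taggings for Song 237 only.
wanted_ids = taggings
  .select { |t| t[:taggable_id] == 237 && t[:taggable_type] == "Song" }
  .map { |t| t[:tag_id] }

# JOIN: pick the tag rows those taggings point at. This matches the
# INNER JOIN as long as no tagging row is duplicated.
song_tags = tags.select { |tag| wanted_ids.include?(tag[:id]) }

p song_tags.map { |t| t[:name] } # ["indie", "cover"]
```

Two tables and a type column are needed just to answer "what are this song's tags?", which is the pain the next slides complain about.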
Yuck!
That’s a lot of pain for something so simple.
And I didn’t even show you finding things with tag “x”.
Or how to set and unset tags on a “thing”.
Ouch.
The MongoDB Way
Using MongoMapper and Rails 3
class Post
  include MongoMapper::Document

  key :title, String
  key :body, String
  key :tags, Array

  ensure_index :tags
end
Let’s Make This Easy...
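# Instance methods on Post
def add_tag(tag)
  tag = Post.clean_tag(tag)
  # Keep the in-memory array free of duplicates, matching add_to_set.
  self.tags << tag unless self.tags.include?(tag)
  self.add_to_set(:tags => tag) unless self.new_record?
end

def remove_tag(tag)
  tag = Post.clean_tag(tag)
  self.tags.delete(tag)
  self.pull(:tags => tag) unless self.new_record?
end

# Class methods on Post
# Normalize a tag: trim, lowercase, spaces to dashes, drop other characters.
def self.clean_tag(str)
  str.strip.downcase.gsub(" ", "-").gsub(/[^a-z0-9-]/, "")
end

# Split a comma-separated string into cleaned tags.
def self.clean_tags(str)
  str.split(",").map { |t| clean_tag(t) }
end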
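For anyone reading this without the console demo, the tag-cleaning logic can be tried on its own. The TagCleaner module below is a hypothetical wrapper added so the snippet runs without MongoMapper; the two method bodies follow the slide.

```ruby
# Standalone copy of the tag-cleaning logic (no MongoMapper needed).
module TagCleaner
  # Trim, lowercase, turn spaces into dashes, drop everything else.
  def self.clean_tag(str)
    str.strip.downcase.gsub(" ", "-").gsub(/[^a-z0-9-]/, "")
  end

  # Split a comma-separated string and clean each piece.
  def self.clean_tags(str)
    str.split(",").map { |t| clean_tag(t) }
  end
end

p TagCleaner.clean_tag("  Ruby on Rails! ")    # "ruby-on-rails"
p TagCleaner.clean_tags("NoSQL, Mongo DB,C++") # ["nosql", "mongo-db", "c"]
```

Note the last result: punctuation is stripped outright, so "C++" collapses to "c". That may or may not be what you want for your own tags.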
Demo Time
Sorry if you’re looking at this later, but it’s console time!
Why I Love MongoDB
• Document model fits how I build web apps.
• For most apps, I don’t need transactions.
• Eventual consistency is actually OK.
• Partial updates and arrays make things that
are a pain in SQL-land absolutely painless.
• It’s just smart enough without getting in the
way.
What’s NoSQL, really?
• The right tool for the job.
• We’ve got lots of options for storing
application data.
• The key is picking the one that solves our
real problem.
• And if an RDBMS is the right tool, that’s OK
too.
Questions?
Further Reading
• Visual NoSQL: http://blog.nahurst.com/visual-guide-to-nosql-systems
• MongoDB: http://mongodb.org
• MongoMapper: http://mongomapper.com/
Thanks!
• Kevin Lawver
• @kplawver
• kevin@lawver.net
• http://kevinlawver.com
