SlideShare a Scribd company logo
Intro to NoSQL databases
with MongoDB
Hector Correa
hector@hectorcorrea.com
@hectorjcorrea
Monday, June 3, 13
Agenda
• What are NoSQL databases
• Why NoSQL
• Examples using MongoDB
• Pros and cons
• Q&A
Monday, June 3, 13
What are NoSQL databases
Monday, June 3, 13
•A type of databases
•Don’t use the relational model
•Good fit for distributed environments
•(Usually) don’t use SQL
•(Most of them) are open source
What are NoSQL databases
Source: http://nosql-database.org
Monday, June 3, 13
“NoSQL refers to an ill-defined
set of mostly open-source
databases, mostly developed in
the early 21st century, and
mostly not using SQL”
What are NoSQL databases
Source: NoSQL Distilled by Sadalage and Fowler
Monday, June 3, 13
NoSQL is a movement
not a specific technology or
product
What are NoSQL databases
Monday, June 3, 13
What are NoSQL databases
NOSQL Meetup
This meetup is about "open source,
distributed, non relational databases".
Have you run into limitations with
traditional relational databases? Don't
mind trading a query language for
scalability? [...]
NoSQL term coined in 2009 by Johan Oskarsson
while organizing a meetup
Source: http://nosql.eventbrite.com
Monday, June 3, 13
It doesn’t mean “Screw SQL”
It’s more like “Not Only SQL”
What are NoSQL databases
Monday, June 3, 13
NoSQL has very little to do with
SQL (structured query language)
It should have been called Not Only
Relational Databases
#NoRDBMS anyone?
What are NoSQL databases
Monday, June 3, 13
Quick Demo
with MongoDB
Monday, June 3, 13
Why NoSQL?
Monday, June 3, 13
Why NoSQL?
Very large and distributed databases
Rise of unstructured data
Ease of development
Monday, June 3, 13
Why NoSQL? - large datasets
Massive datasets (Google,Amazon, Facebook)
Distributed environments (hundreds of nodes)
RDBMS scale up but not scale out
Monday, June 3, 13
Why NoSQL? - large datasets
But we don't have that problem
at my company...
Monday, June 3, 13
Why NoSQL? - large datasets
...true, and we also thought that
640 KB of RAM should be
enough for everybody.
Source: http://www.youtube.com/watch?v=qI_g07C_Q5I
But we don't have that problem
at my company...
Monday, June 3, 13
Why NoSQL? - unstructured data
Web pages (Google,Yahoo)
Log data, scientific data
Content Management Systems
Field1, field2, field3, fieldN
Storing field definitions as rows
Tracking changes (usually a BLOB)
Monday, June 3, 13
Why NoSQL? - ease of development
Data impedance mismatch (OO vs RDBMS)
Applies to both structured and unstructured data
Aggregates are desirable in a cluster environment
NoSQL can reduce this friction
Monday, June 3, 13
Types of NoSQL
databases
Monday, June 3, 13
•Key-value
SimpleDB, Redis, Dynamo,Voldemort, Riak
•Document-oriented
MongoDB, CouchDB, RavenDB
•Column-oriented
BigTable, HBase, CASSANDRA, PNUTS
Types of NoSQL databases
Sources:
Book: NoSQL Distilled by Sadalage and Fowler
Paper: NoSQL Databases: a step to database scalability in Web environment by Jaroslav Pokorny
Monday, June 3, 13
Types of NoSQL databases
•Not using the relational model
•Run well on clusters
•Can handle huge amount of data
•Open Source
•Build for 21st century web access
•Schema-less / schema-free
•BASE (not ACID)
Common Characteristics
Monday, June 3, 13
Types of NoSQL databases
ACID Transactions
Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128
• Atomicity. Transactions are all or nothing.
• Consistency. The database will be in a
consistent state when the transaction begins and
ends.
• Isolation. The transaction will behave as if it is
the only operation being performed upon the
database.
• Durability. Upon completion of the transaction,
the operation will not be reversed.
Monday, June 3, 13
Types of NoSQL databases
BASE
Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128
• Basically Available, Soft state, Eventually
consistent
• BASE is diametrically opposed to ACID
• ACID is pessimistic and forces consistency at the
end of every operation
• BASE is optimistic and accepts that the database
consistency will be in a state of flux
Monday, June 3, 13
Types of NoSQL databases
“BASE is optimistic and accepts
that the database consistency will
be in a state of flux”
This is really a business requirement, not a
technical one (overbooked planes, oversold
items)
Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128
Monday, June 3, 13
Types of NoSQL databases
“it leads to levels of scalability
that cannot be obtained with ACID”
BASE sounds scary at first, but...
Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128
Monday, June 3, 13
Document-oriented
Open source
Free
Multi-platform
Maintained by 10gen
http://www.mongodb.org
Image source: http://upload.wikimedia.org/wikipedia/commons/e/eb/MongoDB_Logo.png
Monday, June 3, 13
“bridge the gap between key-
value stores (which are fast
and highly scalable) and
traditional RDBMS systems
(which provide rich queries
and deep functionality).” -
Mike Dirolf
Source: http://www.10gen.com/presentations/webinar/introduction-to-mongodb
Goal of MongoDB
Monday, June 3, 13
MongoDB Examples
Monday, June 3, 13
INSERT INTO table (f1, f2, f3)
VALUES (v1, v2, v3)
db.collection.insert(
{f1: v1, f2: v2, f3: v3}
)
MongoDB Examples
Monday, June 3, 13
db.blog.insert({
url: “blog-1”,
title:"Blog 1",
text: “blah blah blah”,
author: "jdoe",
tags: ["software", "databases"],
addedOn: "2013-04-01"
});
MongoDB Examples
Monday, June 3, 13
UPDATE table
SET f1 = v1, f2 = v2
WHERE f3 = v3
db.collection.update(
{f3: v3}, // where
{$set:
{f1: v1, f2: v2}
}
)
MongoDB Examples
Monday, June 3, 13
SELECT f1, f2
FROM table
WHERE f3 = “X”
db.collection.find(
{f3: “X”}, // where
{f1: 1, f2: 1} // projection
)
MongoDB Examples
Monday, June 3, 13
SELECT i.id, i.date, i.total,
c.name, c.address,
t.qty, p.name, t.price
FROM invoices i
INNER JOIN customers c ON i.custId = c.id
INNER JOIN items t ON t.invoiceId = i.id
INNER JOIN prods p ON t.prodId = p.id
WHERE i.id = 34
MongoDB Examples
Monday, June 3, 13
MongoDB Examples
id date total name address qty name price
34 2013-04-01 100 Customer A 123 main 2 item A 30
34 2013-04-01 100 Customer A 123 main 1 item B 40
Monday, June 3, 13
db.invoices.find({id: 34})
MongoDB Examples
Monday, June 3, 13
MongoDB Examples
{
id: 34,
date: “2013-04-01”,
total: 100,
customer: {
name: “Customer A”,
address: “123 main”
},
items: [
{qty: 2, name: “item A”, price: 30},
{qty: 1, name: “item B”, price: 40},
]
}
Monday, June 3, 13
SELECT title, addedOn
FROM blog
WHERE addedOn >= “2013-04-01”
and addedOn <= “2013-04-15”
db.blog.find(
{addedOn:{
$gte: "2013-04-01",
$lte: "2013-04-15"}
},
{title: 1, addedOn: 1}
)
MongoDB Examples
Monday, June 3, 13
• Insert/Update/Find
• Arrays & nested documents
• Java / C# Sample
• Indices
• MapReduce
• Server Functions
• Aggregation Framework
MongoDB Examples
Code can be found at: https://github.com/hectorcorrea/intro-to-nosql-with-mongodb
Monday, June 3, 13
Replication & Sharding
Monday, June 3, 13
MongoDB Replication
Replication - saving the same data multiple times
Gives you redundancy in case one server fails
Allows you to spread your reads
[server 1]
All
Customers
[server 2]
All
Customers
[server 3]
All
Customers
Monday, June 3, 13
MongoDB Replication
Master-Slave
• One master (you define it)
• Many slaves
[server 1]$ mongod --master
[server 2]$ mongod --slave --source server1
[server 3]$ mongod --slave --source server1
Monday, June 3, 13
MongoDB Replication
Replica Sets
• Like master-slave...
• ...but the master is designated by the set
• Automatic failover
[server 1]$ mongod --replSet name/server2,server3
[server 2]$ mongod --replSet name/server1,server3
[server 3]$ mongod --replSet name/server1,server2
Monday, June 3, 13
$ mongo
> config = {
_id: "rs1",
members: [
{_id: 0, host: "server1"},
{_id: 1, host: "server2"},
{_id: 2, host: "server3"}
]
}
> rs.initiate(config)
> rs.status()
MongoDB Replication
Monday, June 3, 13
MongoDB Sharding
Sharding is a fancy word for “data partitioning”
MongoDB supports automatic sharding
[server 1]
Customers
West
[server 2]
Customers
MidWest
[server 3]
Customers
East
db.runCommand({
“shardcollection”:”customers”,
“key”: {“region”:1}
})
Monday, June 3, 13
MongoDB Sharding
[server 1]
Customers
West
[server 2]
Customers
MidWest
[server 3]
Customers
East
[server X]
mongos
db.customers.find({region:“WEST”})
Monday, June 3, 13
MongoDB Sharding
[server 1]
Customers
West
[server 2]
Customers
MidWest
[server 3]
Customers
East
[server X]
mongos
db.customers.find({name:“John”})
Monday, June 3, 13
To SQL or
no to SQL,
that is the
question
http://en.wikipedia.org/wiki/File:Edwin_Booth_Hamlet_1870.jpg
Monday, June 3, 13
Advantages of Relational Databases
• Some data fits the relational model nicely
• SQL is a declarative language
• SQL is universal
• Joins
• Multi-row / multi-table ACID transactions
• One size fits most
• Well known technology
To SQL or no to SQL
Monday, June 3, 13
Advantages of NoSQL Databases
• Cluster friendly / scales out
• Tend to be very fast
• Handle complex data nicely
• Reduced data impedance mismatch
• Joins don’t needed as much
• Multi-row / multi-table transactions don’t
needed as much
To SQL or no to SQL
Monday, June 3, 13
Disadvantages of NoSQL Databases
• Many different data modes
• No standard/universal query syntax
• Each product uses a different one
• Vendor lock-in?
• Learning curve (on your data layer!)
• Can you live with BASE?
To SQL or no to SQL
Monday, June 3, 13
• Not using referential integrity
• Minimizing the use of JOINS
• Denormalizing (a lot) of your data
• Saying “no” to some features
To SQL or no to SQL
Consider NoSQL Databases
if for performance reasons you are...
Monday, June 3, 13
Try One
NoSQL database
Monday, June 3, 13
Recommended Books
• NoSQL Distilled by Pramod Sadalage
and Martin Fowler
• MongoDB The Definitive Guide by
Kristina Chodorow and Michael Dirolf
Monday, June 3, 13
Thank you!
Hector Correa
hector@hectorcorrea.com
@hectorjcorrea
Monday, June 3, 13
• Bigtable: Google NoSQL database
• Big Data: Buzz word
• HBase:Apache’s NoSQL database
• MapReduce: Programming model to process data in a cluster
(ETL)
• Hadoop: Apache product to run MapReduce jobs
• ACID: Mantra for Relational Databases. Properties for
database transactions (atomicity, consistency, isolation,
durability)
• BASE: Mantra for NoSQL databases. Basically Available, Soft
state, and eventually consistent.
Glossary
Monday, June 3, 13

More Related Content

What's hot

NoSQL and MongoDB Introdction
NoSQL and MongoDB IntrodctionNoSQL and MongoDB Introdction
NoSQL and MongoDB Introdction
Brian Enochson
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB
 
SQL vs NoSQL
SQL vs NoSQLSQL vs NoSQL
SQL vs NoSQL
Jacinto Limjap
 
Making MySQL Agile-ish
Making MySQL Agile-ishMaking MySQL Agile-ish
Making MySQL Agile-ish
Dave Stokes
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/Specialist
Tony Rogerson
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
Filip Ilievski
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational Databases
Chris Baglieri
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
Dan Gunter
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
Steven Francia
 
Introduction to Sql on Hadoop
Introduction to Sql on HadoopIntroduction to Sql on Hadoop
Introduction to Sql on Hadoop
Samuel Yee
 
An introduction to Nosql
An introduction to NosqlAn introduction to Nosql
An introduction to Nosql
greprep
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
John Kerley-Weeks
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
Ramakant Soni
 
MongoDB
MongoDBMongoDB
Polyglot Persistence vs Multi-Model Databases
Polyglot Persistence vs Multi-Model DatabasesPolyglot Persistence vs Multi-Model Databases
Polyglot Persistence vs Multi-Model Databases
Luca Garulli
 
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
J Singh
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and cons
Fabio Fumarola
 
Multi-model database
Multi-model databaseMulti-model database
Multi-model database
Jiaheng Lu
 

What's hot (20)

NoSQL and MongoDB Introdction
NoSQL and MongoDB IntrodctionNoSQL and MongoDB Introdction
NoSQL and MongoDB Introdction
 
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
MongoDB Schema Design (Richard Kreuter's Mongo Berlin preso)
 
SQL vs NoSQL
SQL vs NoSQLSQL vs NoSQL
SQL vs NoSQL
 
Making MySQL Agile-ish
Making MySQL Agile-ishMaking MySQL Agile-ish
Making MySQL Agile-ish
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/Specialist
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational Databases
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
 
NoSQL databases and managing big data
NoSQL databases and managing big dataNoSQL databases and managing big data
NoSQL databases and managing big data
 
Introduction to Sql on Hadoop
Introduction to Sql on HadoopIntroduction to Sql on Hadoop
Introduction to Sql on Hadoop
 
An introduction to Nosql
An introduction to NosqlAn introduction to Nosql
An introduction to Nosql
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
MongoDB
MongoDBMongoDB
MongoDB
 
No sql
No sqlNo sql
No sql
 
Polyglot Persistence vs Multi-Model Databases
Polyglot Persistence vs Multi-Model DatabasesPolyglot Persistence vs Multi-Model Databases
Polyglot Persistence vs Multi-Model Databases
 
NoSQL and MapReduce
NoSQL and MapReduceNoSQL and MapReduce
NoSQL and MapReduce
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and cons
 
Hdfs Dhruba
Hdfs DhrubaHdfs Dhruba
Hdfs Dhruba
 
Multi-model database
Multi-model databaseMulti-model database
Multi-model database
 

Viewers also liked

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Ravi Teja
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
Lee Theobald
 
Redis vs Aerospike
Redis vs AerospikeRedis vs Aerospike
Redis vs Aerospike
Sayyaparaju Sunil
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
Ahmed Soliman
 
Scaling Teams, Processes and Architectures
Scaling Teams, Processes and ArchitecturesScaling Teams, Processes and Architectures
Scaling Teams, Processes and Architectures
Lorenzo Alberton
 
Introduction to Aerospike
Introduction to AerospikeIntroduction to Aerospike
Introduction to Aerospike
Aerospike, Inc.
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
Alex Sharp
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
Lorenzo Alberton
 
Monitoring at scale - Intuitive dashboard design
Monitoring at scale - Intuitive dashboard designMonitoring at scale - Intuitive dashboard design
Monitoring at scale - Intuitive dashboard design
Lorenzo Alberton
 

Viewers also liked (10)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
An Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDBAn Introduction To NoSQL & MongoDB
An Introduction To NoSQL & MongoDB
 
Redis vs Aerospike
Redis vs AerospikeRedis vs Aerospike
Redis vs Aerospike
 
NoSQL Introduction
NoSQL IntroductionNoSQL Introduction
NoSQL Introduction
 
Scaling Teams, Processes and Architectures
Scaling Teams, Processes and ArchitecturesScaling Teams, Processes and Architectures
Scaling Teams, Processes and Architectures
 
Introduction to Aerospike
Introduction to AerospikeIntroduction to Aerospike
Introduction to Aerospike
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Monitoring at scale - Intuitive dashboard design
Monitoring at scale - Intuitive dashboard designMonitoring at scale - Intuitive dashboard design
Monitoring at scale - Intuitive dashboard design
 

Similar to Introduction to NoSQL with MongoDB

Symfony2 and MongoDB - MidwestPHP 2013
Symfony2 and MongoDB - MidwestPHP 2013   Symfony2 and MongoDB - MidwestPHP 2013
Symfony2 and MongoDB - MidwestPHP 2013
Pablo Godel
 
elasticsearch
elasticsearchelasticsearch
elasticsearch
Satish Mohan
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappe
Spyros Passas
 
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
✔ Eric David Benari, PMP
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?
Ivan Zoratti
 
NoSQL Now 2013 Presentation
NoSQL Now 2013 PresentationNoSQL Now 2013 Presentation
NoSQL Now 2013 Presentation
Arjen Schoneveld
 
Cassandra at scale
Cassandra at scaleCassandra at scale
Cassandra at scale
Patrick McFadin
 
Client-side storage
Client-side storageClient-side storage
Client-side storage
Ruben Tan
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
Claudio Montoya
 
Creating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDBCreating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDB
Lewis Lin 🦊
 
Gray_Compass99.ppt
Gray_Compass99.pptGray_Compass99.ppt
Gray_Compass99.ppt
PokinMorakrant
 
NoSQL: An Analysis
NoSQL: An AnalysisNoSQL: An Analysis
NoSQL: An Analysis
Andrew Brust
 
No More SQL
No More SQLNo More SQL
No More SQL
Glenn Street
 
Vital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent AppsVital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent Apps
Vital.AI
 
Google jeff dean lessons learned while building infrastructure software at go...
Google jeff dean lessons learned while building infrastructure software at go...Google jeff dean lessons learned while building infrastructure software at go...
Google jeff dean lessons learned while building infrastructure software at go...xu liwei
 
Polylog: A Log-Based Architecture for Distributed Systems
Polylog: A Log-Based Architecture for Distributed SystemsPolylog: A Log-Based Architecture for Distributed Systems
Polylog: A Log-Based Architecture for Distributed Systems
Kamil Sindi
 
MongoDB
MongoDBMongoDB

Similar to Introduction to NoSQL with MongoDB (20)

Symfony2 and MongoDB - MidwestPHP 2013
Symfony2 and MongoDB - MidwestPHP 2013   Symfony2 and MongoDB - MidwestPHP 2013
Symfony2 and MongoDB - MidwestPHP 2013
 
elasticsearch
elasticsearchelasticsearch
elasticsearch
 
Mongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappeMongo db php_shaken_not_stirred_joomlafrappe
Mongo db php_shaken_not_stirred_joomlafrappe
 
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
Making MySQL Flexible with ParElastic Database Scalability, Amrith Kumar, Fou...
 
What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?What can we learn from NoSQL technologies?
What can we learn from NoSQL technologies?
 
NoSQL Now 2013 Presentation
NoSQL Now 2013 PresentationNoSQL Now 2013 Presentation
NoSQL Now 2013 Presentation
 
10082013
1008201310082013
10082013
 
Cassandra at scale
Cassandra at scaleCassandra at scale
Cassandra at scale
 
Client-side storage
Client-side storageClient-side storage
Client-side storage
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Creating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDBCreating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDB
 
Gray_Compass99.ppt
Gray_Compass99.pptGray_Compass99.ppt
Gray_Compass99.ppt
 
NoSQL: An Analysis
NoSQL: An AnalysisNoSQL: An Analysis
NoSQL: An Analysis
 
No More SQL
No More SQLNo More SQL
No More SQL
 
Vital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent AppsVital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent Apps
 
Slides
SlidesSlides
Slides
 
Couchbase
CouchbaseCouchbase
Couchbase
 
Google jeff dean lessons learned while building infrastructure software at go...
Google jeff dean lessons learned while building infrastructure software at go...Google jeff dean lessons learned while building infrastructure software at go...
Google jeff dean lessons learned while building infrastructure software at go...
 
Polylog: A Log-Based Architecture for Distributed Systems
Polylog: A Log-Based Architecture for Distributed SystemsPolylog: A Log-Based Architecture for Distributed Systems
Polylog: A Log-Based Architecture for Distributed Systems
 
MongoDB
MongoDBMongoDB
MongoDB
 

Recently uploaded

Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
abdulrafaychaudhry
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
Roshan Dwivedi
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)
abdulrafaychaudhry
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaTop 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Yara Milbes
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 

Recently uploaded (20)

Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaTop 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 

Introduction to NoSQL with MongoDB

  • 1. Intro to NoSQL databases with MongoDB Hector Correa hector@hectorcorrea.com @hectorjcorrea Monday, June 3, 13
  • 2. Agenda • What are NoSQL databases • Why NoSQL • Examples using MongoDB • Pros and cons • Q&A Monday, June 3, 13
  • 3. What are NoSQL databases Monday, June 3, 13
  • 4. •A type of databases •Don’t use the relational model •Good fit for distributed environments •(Usually) don’t use SQL •(Most of them) are open source What are NoSQL databases Source: http://nosql-database.org Monday, June 3, 13
  • 5. “NoSQL refers to an ill-defined set of mostly open-source databases, mostly developed in the early 21st century, and mostly not using SQL” What are NoSQL databases Source: NoSQL Distilled by Sadalage and Fowler Monday, June 3, 13
  • 6. NoSQL is a movement not a specific technology or product What are NoSQL databases Monday, June 3, 13
  • 7. What are NoSQL databases NOSQL Meetup This meetup is about "open source, distributed, non relational databases". Have you run into limitations with traditional relational databases? Don't mind trading a query language for scalability? [...] NoSQL term coined in 2009 by Johan Oskarsson while organizing a meetup Source: http://nosql.eventbrite.com Monday, June 3, 13
  • 8. It doesn’t mean “Screw SQL” It’s more like “Not Only SQL” What are NoSQL databases Monday, June 3, 13
  • 9. NoSQL has very little to do with SQL (structured query language) It should have been called Not Only Relational Databases #NoRDBMS anyone? What are NoSQL databases Monday, June 3, 13
  • 12. Why NoSQL? Very large and distributed databases Rise of unstructured data Ease of development Monday, June 3, 13
  • 13. Why NoSQL? - large datasets Massive datasets (Google,Amazon, Facebook) Distributed environments (hundreds of nodes) RDBMS scale up but not scale out Monday, June 3, 13
  • 14. Why NoSQL? - large datasets But we don't have that problem at my company... Monday, June 3, 13
  • 15. Why NoSQL? - large datasets ...true, and we also thought that 640 KB of RAM should be enough for everybody. Source: http://www.youtube.com/watch?v=qI_g07C_Q5I But we don't have that problem at my company... Monday, June 3, 13
  • 16. Why NoSQL? - unstructured data Web pages (Google,Yahoo) Log data, scientific data Content Management Systems Field1, field2, field3, fieldN Storing field definitions as rows Tracking changes (usually a BLOB) Monday, June 3, 13
  • 17. Why NoSQL? - ease of development Data impedance mismatch (OO vs RDBMS) Applies to both structured and unstructured data Aggregates are desirable in a cluster environment NoSQL can reduce this friction Monday, June 3, 13
  • 19. •Key-value SimpleDB, Redis, Dynamo,Voldemort, Riak •Document-oriented MongoDB, CouchDB, RavenDB •Column-oriented BigTable, HBase, CASSANDRA, PNUTS Types of NoSQL databases Sources: Book: NoSQL Distilled by Sadalage and Fowler Paper: NoSQL Databases: a step to database scalability in Web environment by Jaroslav Pokorny Monday, June 3, 13
  • 20. Types of NoSQL databases •Not using the relational model •Run well on clusters •Can handle huge amount of data •Open Source •Build for 21st century web access •Schema-less / schema-free •BASE (not ACID) Common Characteristics Monday, June 3, 13
  • 21. Types of NoSQL databases ACID Transactions Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128 • Atomicity. Transactions are all or nothing. • Consistency. The database will be in a consistent state when the transaction begins and ends. • Isolation. The transaction will behave as if it is the only operation being performed upon the database. • Durability. Upon completion of the transaction, the operation will not be reversed. Monday, June 3, 13
  • 22. Types of NoSQL databases BASE Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128 • Basically Available, Soft state, Eventually consistent • BASE is diametrically opposed to ACID • ACID is pessimistic and forces consistency at the end of every operation • BASE is optimistic and accepts that the database consistency will be in a state of flux Monday, June 3, 13
  • 23. Types of NoSQL databases “BASE is optimistic and accepts that the database consistency will be in a state of flux” This is really a business requirement, not a technical one (overbooked planes, oversold items) Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128 Monday, June 3, 13
  • 24. Types of NoSQL databases “it leads to levels of scalability that cannot be obtained with ACID” BASE sounds scary at first, but... Source: BASE:An Acid Alternative by Dan Pritchett http://queue.acm.org/detail.cfm?id=1394128 Monday, June 3, 13
  • 25. Document-oriented Open source Free Multi-platform Maintained by 10gen http://www.mongodb.org Image source: http://upload.wikimedia.org/wikipedia/commons/e/eb/MongoDB_Logo.png Monday, June 3, 13
  • 26. “bridge the gap between key- value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality).” - Mike Dirolf Source: http://www.10gen.com/presentations/webinar/introduction-to-mongodb Goal of MongoDB Monday, June 3, 13
  • 28. INSERT INTO table (f1, f2, f3) VALUES (v1, v2, v3) db.collection.insert( {f1: v1, f2: v2, f3: v3} ) MongoDB Examples Monday, June 3, 13
  • 29. db.blog.insert({ url: “blog-1”, title:"Blog 1", text: “blah blah blah”, author: "jdoe", tags: ["software", "databases"], addedOn: "2013-04-01" }); MongoDB Examples Monday, June 3, 13
  • 30. UPDATE table SET f1 = v1, f2 = v2 WHERE f3 = v3 db.collection.update( {f3: v3}, // where {$set: {f1: v1, f2: v2} } ) MongoDB Examples Monday, June 3, 13
  • 31. SELECT f1, f2 FROM table WHERE f3 = “X” db.collection.find( {f3: “X”}, // where {f1: 1, f2: 1} // projection ) MongoDB Examples Monday, June 3, 13
  • 32. SELECT i.id, i.date, i.total, c.name, c.address, t.qty, p.name, t.price FROM invoices i INNER JOIN customers c ON i.custId = c.id INNER JOIN items t ON t.invoiceId = i.id INNER JOIN prods p ON t.prodId = p.id WHERE i.id = 34 MongoDB Examples Monday, June 3, 13
  • 33. MongoDB Examples id date total name address qty name price 34 2013-04-01 100 Customer A 123 main 2 item A 30 34 2013-04-01 100 Customer A 123 main 1 item B 40 Monday, June 3, 13
  • 35. MongoDB Examples { id: 34, date: “2013-04-01”, total: 100, customer: { name: “Customer A”, address: “123 main” }, items: [ {qty: 2, name: “item A”, price: 30}, {qty: 1, name: “item B”, price: 40}, ] } Monday, June 3, 13
  • 36. SELECT title, addedOn FROM blog WHERE addedOn >= “2013-04-01” and addedOn <= “2013-04-15” db.blog.find( {addedOn:{ $gte: "2013-04-01", $lte: "2013-04-15"} }, {title: 1, addedOn: 1} ) MongoDB Examples Monday, June 3, 13
  • 37. • Insert/Update/Find • Arrays & nested documents • Java / C# Sample • Indices • MapReduce • Server Functions • Aggregation Framework MongoDB Examples Code can be found at: https://github.com/hectorcorrea/intro-to-nosql-with-mongodb Monday, June 3, 13
  • 39. MongoDB Replication Replication - saving the same data multiple times Gives you redundancy in case one server fails Allows you to spread your reads [server 1] All Customers [server 2] All Customers [server 3] All Customers Monday, June 3, 13
  • 40. MongoDB Replication Master-Slave • One master (you define it) • Many slaves [server 1]$ mongod --master [server 2]$ mongod --slave --source server1 [server 3]$ mongod --slave --source server1 Monday, June 3, 13
  • 41. MongoDB Replication Replica Sets • Like master-slave... • ...but the master is designated by the set • Automatic failover [server 1]$ mongod --replSet name/server2,server3 [server 2]$ mongod --replSet name/server1,server3 [server 3]$ mongod --replSet name/server1,server2 Monday, June 3, 13
  • 42. $ mongo > config = { _id: "rs1", members: [ {_id: 0, host: "server1"}, {_id: 1, host: "server2"}, {_id: 2, host: "server3"} ] } > rs.initiate(config) > rs.status() MongoDB Replication Monday, June 3, 13
  • 43. MongoDB Sharding Sharding is a fancy word for “data partitioning” MongoDB supports automatic sharding [server 1] Customers West [server 2] Customers MidWest [server 3] Customers East db.runCommand({ “shardcollection”:”customers”, “key”: {“region”:1} }) Monday, June 3, 13
  • 44. MongoDB Sharding [server 1] Customers West [server 2] Customers MidWest [server 3] Customers East [server X] mongos db.customers.find({region:“WEST”}) Monday, June 3, 13
  • 45. MongoDB Sharding [server 1] Customers West [server 2] Customers MidWest [server 3] Customers East [server X] mongos db.customers.find({name:“John”}) Monday, June 3, 13
  • 46. To SQL or no to SQL, that is the question http://en.wikipedia.org/wiki/File:Edwin_Booth_Hamlet_1870.jpg Monday, June 3, 13
  • 47. Advantages of Relational Databases • Some data fits the relational model nicely • SQL is a declarative language • SQL is universal • Joins • Multi-row / multi-table ACID transactions • One size fits most • Well known technology To SQL or no to SQL Monday, June 3, 13
  • 48. Advantages of NoSQL Databases • Cluster friendly / scales out • Tend to be very fast • Handle complex data nicely • Reduced data impedance mismatch • Joins don’t needed as much • Multi-row / multi-table transactions don’t needed as much To SQL or no to SQL Monday, June 3, 13
  • 49. Disadvantages of NoSQL Databases • Many different data modes • No standard/universal query syntax • Each product uses a different one • Vendor lock-in? • Learning curve (on your data layer!) • Can you live with BASE? To SQL or no to SQL Monday, June 3, 13
  • 50. • Not using referential integrity • Minimizing the use of JOINS • Denormalizing (a lot) of your data • Saying “no” to some features To SQL or no to SQL Consider NoSQL Databases if for performance reasons you are... Monday, June 3, 13
  • 52. Recommended Books • NoSQL Distilled by Pramod Sadalage and Martin Fowler • MongoDB The Definitive Guide by Kristina Chodorow and Michael Dirolf Monday, June 3, 13
  • 54. • Bigtable: Google NoSQL database • Big Data: Buzz word • HBase:Apache’s NoSQL database • MapReduce: Programming model to process data in a cluster (ETL) • Hadoop: Apache product to run MapReduce jobs • ACID: Mantra for Relational Databases. Properties for database transactions (atomicity, consistency, isolation, durability) • BASE: Mantra for NoSQL databases. Basically Available, Soft state, and eventually consistent. Glossary Monday, June 3, 13