MongoDB Workshop
Workshop held at NYC Open Data Meetup

Published in: Technology

  1. 1. MONGODB WORKSHOP { meetup: “NYC Open Data”, presenters: [“Kannan Sankaran”, “Roman Kubiak”], host: “Vivian”, location: “ThoughtWorks”, audience: “You guys” }
  2. 2. MONGODB WORKSHOP { meetup: “NYC Open Data”, presenters: [“Kannan Sankaran”, “Roman Kubiak”], host: “Vivian is awesome, THANK YOU”, location: “ThoughtWorks is awesome, THANK YOU”, audience: “You guys are awesome, THANK YOU” }
  3. 3. OUR TOPICS OVERVIEW OF DATABASES WHAT IS MONGODB? MONGODB, NOSQL, AND RELATIONAL DATABASES A PEEK AT MONGODB COMMANDS SHARDING AND REPLICATION IN MONGODB FUTURE OF MONGODB AND US DEMO WORKSHOP
  4. 4. MONGO PIE ARCHITECT
  5. 5. OVERVIEW OF DATABASES
  6. 6. ORGANIZING DATA ROWS COLUMNS TABLES
  7. 7. DATA SPREAD OUT IN VARIOUS TABLES
  8. 8. DATA MAY BE RELATED
  9. 9. DATABASES AND THEIR GROWTH 1970s: RELATIONAL DATABASES (RDBMS) CREATED; STRUCTURED QUERY LANGUAGE (SQL) CREATED. 1980s: RDBMS CONTINUE TO BE POPULAR; CLIENT/SERVER MODEL. 1990s: INTERNET ARRIVES. 2000s: INTERNET GROWS; NoSQL DATABASES EMERGE. 2007: MONGODB CREATED.
  10. 10. WHAT IS NoSQL?
  11. 11. A TWITTER HASHTAG #nosql
  12. 12. NOSQL GENERALLY REFERS TO DATABASES THAT DO NOT HAVE A FIXED ROW-COLUMN DATA ORGANIZATION STRUCTURE.
  13. 13. WHAT IS MONGODB?
  14. 14. A HUMONGOUS NoSQL DB
  15. 15. A HUMONGOUS NoSQL DB WHERE DATA IS ORGANIZED BY DOCUMENTS NOT ROWS COLLECTIONS NOT TABLES
  16. 16. WHAT IS A DOCUMENT?
  17. 17. A DOCUMENT IS LIKE A ROW… { _id: ObjectID(“12AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014” }
  18. 18. …BUT IT IS MORE FLEXIBLE { _id: ObjectID(“12AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014”, payments: { car: “100.50”, hotel: “200” } } THAT LOOKS LIKE A DOCUMENT WITHIN ANOTHER DOCUMENT! { _id: ObjectID(“12AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014”, payments: { car: “100.50”, hotel: “200” }, tags: [“shirt”, “tie”] } WHAT IS THIS? MULTIPLE VALUES WITHIN A COLUMN?
  19. 19. HOW LARGE CAN THIS DOCUMENT BE? { _id: ObjectID(“12AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014”, payments: { car: “100.50”, hotel: “200” } … … … } UP TO 16 MB. LEO TOLSTOY’S 1,225-PAGE WAR AND PEACE CAN FIT IN 1 DOCUMENT, AS IT IS ONLY AROUND 3 MB.
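The 16 MB limit above can be checked roughly before inserting a document. The sketch below is only an approximation: it assumes the UTF-8 length of the JSON text is close enough to the stored size, and the function names are made up for illustration.

```javascript
// Rough size check against the 16 MB per-document limit quoted on the slide.
// Assumption: the UTF-8 length of the JSON text stands in for the stored size;
// the real binary encoding differs by a few bytes per field.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: size of the document when serialized as JSON text.
function approximateSizeInBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Hypothetical helper: true when the approximate size is within the limit.
function fitsInOneDocument(doc) {
  return approximateSizeInBytes(doc) <= MAX_DOCUMENT_BYTES;
}

const order = {
  name: "Ed Brown",
  orderDate: "2-1-2014",
  payments: { car: "100.50", hotel: "200" },
};

console.log(approximateSizeInBytes(order), fitsInOneDocument(order));
```

A small order document like this one is a few dozen bytes, far below the limit; only very large arrays or embedded blobs come close.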
  20. 20. ISN’T THAT JSON? WELL, ALMOST!
  21. 21. WHAT IS JSON? WEB SERVER { “make”: “Chevy”, “model”: “Malibu”, “year”: 2014 } MONGODB DATABASE { “vehicle”: “Chevy Malibu 2014”, “price”: { “min”: 22340, “max”: 29950 }, “citympg”: 25 }
  22. 22. WHAT IS JSON? JAVASCRIPT OBJECT NOTATION NAME-VALUE PAIRS { vehicle: “car”, make: “Malibu”, color: “blue” } { name: “Kannan”, gender: “male”, favorites: { color: “blue” }, interests: [“MongoDB”, “R”] }
  23. 23. MONGODB DOCUMENT { _id: ObjectID(“12AB34CD56EF”), name: “Kannan”, gender: “male”, favorites: { color: “blue” }, interests: [“MongoDB”, “R”], date: new Date() }
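One way to see why the answer to "ISN’T THAT JSON?" is "WELL, ALMOST!": the document above holds a real date value, which plain JSON cannot represent. A minimal sketch in plain JavaScript, using a fixed date instead of `new Date()` so the result is repeatable:

```javascript
// JSON has no date type, so a Date survives a JSON round trip only as a string.
const doc = {
  name: "Kannan",
  interests: ["MongoDB", "R"],
  date: new Date(0), // fixed timestamp chosen for the example
};

const roundTripped = JSON.parse(JSON.stringify(doc));

console.log(doc.date instanceof Date);  // the original field is a Date object
console.log(typeof roundTripped.date);  // after the round trip it is a string
```

A document format with a dedicated date type keeps the value as a date, so range queries and sorting on it behave as expected.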
  24. 24. WHAT IS A COLLECTION?
  25. 25. A GROUP OF DOCUMENTS SIMILAR: { _id: ObjectID(“12AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014” } { _id: ObjectID(“78AB34CD56EF”), name: “Roman Ku”, orderDate: “2-1-2014” } { _id: ObjectID(“56AB34CD56EF”), name: “Eva Green”, orderDate: “2-1-2014” } DIFFERENT: { _id: ObjectID(“34AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014”, tags: [“shirt”, “tie”] } { _id: ObjectID(“90AB34CD56EF”), name: “Roman Ku”, orderDate: “2-1-2014”, payments: { car: “100.50”, hotel: “200” } } { _id: ObjectID(“13AB34CD56EF”), name: “Eva Green”, orderDate: “2-1-2014” } VERY DIFFERENT: { _id: ObjectID(“35AB34CD56EF”), name: “Ed Brown”, orderDate: “2-1-2014” } { _id: ObjectID(“79AB34CD56EF”), vehicle: “car”, make: “Malibu”, color: “blue” } { _id: ObjectID(“57AB34CD56EF”), name: “Eva Green”, orderDate: “2-1-2014”, tags: [“shirt”, “tie”] }
  26. 26. MONGODB IS... A DOCUMENT-ORIENTED NOSQL DATABASE WHERE DATA CONSISTS OF DOCUMENTS STORED IN COLLECTIONS.
  27. 27. MONGODB FEATURES EASY TO LEARN; DYNAMIC QUERY LANGUAGE - SEARCH BY FIELDS, REGULAR EXPRESSIONS - USER-DEFINED JAVASCRIPT FUNCTIONS - AGGREGATION, INCLUDING MAP/REDUCE; INDEXING – SINGLE, COMPOUND, GEOSPATIAL; REPLICATION; LOAD BALANCING USING SHARDING; GRIDFS TO STORE FILES
  28. 28. MONGODB USAGE CONTENT MANAGEMENT SYSTEMS E-COMMERCE WEBSITES LOG DATA AND HIERARCHICAL AGGREGATION REAL-TIME ANALYTICS
  29. 29. MONGODB, NOSQL, AND RELATIONAL DATABASES
  30. 30. DATABASE MANAGEMENT SYSTEMS (MOST SYSTEMS USE SOME FLAVOR OF SQL) 1970s: BERKELEY INGRES, ORACLE. 1980s: INFORMIX, DB2, SYBASE. 1990s: SQL SERVER, MS ACCESS, POSTGRESQL, MYSQL. 2000s: NETEZZA, GREENPLUM, VERTICA, MARIADB. 2007: MONGODB.
  31. 31. RELATIONAL DATABASES WERE / STILL ARE THE DE FACTO STANDARD IN MANY COMPANIES.
  32. 32. RELATIONAL DATABASE FEATURES C.R.U.D. OPERATIONS STRUCTURED QUERY LANGUAGE (SQL) FIXED DATABASE SCHEMAS NORMALIZATION REFERENTIAL INTEGRITY (E.G. FOREIGN KEYS, CONSTRAINTS) JOINS TRANSACTIONS - A.C.I.D. PROPERTIES INDEXES
  33. 33. IN THE LATE 90s/EARLY 2000s… DOT COM BUBBLE DOT COM BUST WEB SERVICES SOCIAL NETWORKS GOOGLE, AMAZON COMPUTER OWNERS/USERS WEBSITE DATA COLLECTION DATABASE SIZES
  34. 34. COMPUTING/STORAGE RESOURCES BECAME A CHALLENGE FOR THEN-SMALLER COMPANIES LIKE GOOGLE AND AMAZON THAT HAD LOTS OF DATA.
  35. 35. SCALE UP: BIGGER MACHINE, MORE DISK SPACE, MORE RAM, MORE PROCESSORS, MORE EXPENSIVE, SINGLE POINT OF FAILURE. HARDWARE HAS LIMITS! SCALE OUT: SMALLER MACHINES, LESS DISK SPACE, LESS RAM, FEWER PROCESSORS, LESS EXPENSIVE, NO SINGLE POINT OF FAILURE, HIGHER RELIABILITY DESPITE FAILURE OF INDIVIDUAL MACHINES.
  36. 36. RELATIONAL DATABASES WERE DESIGNED TO OPERATE ON A SINGLE MACHINE, AND SCALING OUT MEANT A LOT OF CHALLENGES.
  37. 37. SPLITTING DATA FOR SCALE OUT BY COLUMNS BY ROWS
  38. 38. WORDPRESS MYSQL SCHEMA WITH 2 TABLES
  39. 39. A JOIN QUERY IN MYSQL (TABLES: WP_POSTS, WP_COMMENTS) SELECT p.post_author, p.post_date, c.comment_author, c.comment_date FROM wp_posts AS p INNER JOIN wp_comments AS c ON p.ID = c.comment_post_ID WHERE p.ID = 1;
  40. 40. WP_POSTS A JOIN QUERY IN MYSQL WP_COMMENTS RESULT
  41. 41. SCALE OUT DATA BY ROWS WP_POSTS A B C WP_COMMENTS D
  42. 42. HOW COMPLICATED WOULD SCALING THIS BE?
  43. 43. JOINS MAY GET REALLY MESSY WITH MANY MACHINES (DISTRIBUTED JOINS)
  44. 44. TRANSACTIONS MUST SATISFY A.C.I.D. PROPERTIES (TABLES: WP_POSTS, WP_COMMENTS) START TRANSACTION; DELETE FROM wp_comments WHERE comment_post_ID = 1; DELETE FROM wp_posts WHERE ID = 1; COMMIT; (IF EITHER DELETE FAILS, ISSUE ROLLBACK; INSTEAD OF COMMIT;)
  45. 45. TRANSACTIONS MAY TAKE A LONG TIME TO EXECUTE IF DATA IS ON DIFFERENT MACHINES (DISTRIBUTED TRANSACTIONS)
  46. 46. TO SPLIT THE DATA, A WHOLE BUNCH OF COMPROMISES MUST BE MADE IN RELATIONAL DATABASES
  47. 47. THIS GAVE RISE TO NONRELATIONAL SOLUTIONS
  48. 48. GOOGLE AMAZON
  49. 49. NoSQL SYSTEM CHARACTERISTICS C.R.U.D. OPERATIONS STRUCTURED QUERY LANGUAGE (SQL) FIXED DATABASE SCHEMAS NORMALIZATION REFERENTIAL INTEGRITY (E.G. FOREIGN KEYS, CONSTRAINTS) JOINS TRANSACTIONS – LIMITED A.C.I.D. PROPERTIES INDEXES OPEN SOURCE
  50. 50. HOW IS THIS SCALABILITY ACHIEVED IN MONGODB?
  51. 51. STACKING THE DATA
  52. 52. STACKING THE DATA (WP_POSTS AND WP_COMMENTS IN ONE DOCUMENT, NO NEED TO JOIN) { _id: 1, post_author: “Amy W”, post_date: “1/1/2014”, comments: [{ comment_author: “bestguy”, comment_date: “1/1/2014” },{ comment_author: “baddie”, comment_date: “1/10/2014” },{ comment_author: “clever24”, comment_date: “1/11/2014” }] }
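A minimal sketch of what "stacking" means, in plain JavaScript rather than database commands: rows from the two relational tables are folded into one document per post, with the comments embedded as an array. The function name `stackPosts` and the in-memory arrays are illustrative assumptions; the field names follow the slides.

```javascript
// Rows as they might come out of the two WordPress tables (sample data from the slides).
const wpPosts = [{ ID: 1, post_author: "Amy W", post_date: "1/1/2014" }];
const wpComments = [
  { comment_post_ID: 1, comment_author: "bestguy", comment_date: "1/1/2014" },
  { comment_post_ID: 1, comment_author: "baddie", comment_date: "1/10/2014" },
  { comment_post_ID: 1, comment_author: "clever24", comment_date: "1/11/2014" },
];

// Hypothetical helper: embed each post's comments inside the post document,
// so reading a post with its comments no longer needs a join.
function stackPosts(posts, comments) {
  return posts.map((post) => ({
    _id: post.ID,
    post_author: post.post_author,
    post_date: post.post_date,
    comments: comments
      .filter((c) => c.comment_post_ID === post.ID)
      .map(({ comment_author, comment_date }) => ({ comment_author, comment_date })),
  }));
}

const postDocuments = stackPosts(wpPosts, wpComments);
console.log(JSON.stringify(postDocuments, null, 2));
```

The foreign key `comment_post_ID` disappears from the result because the nesting itself records which post a comment belongs to.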
  53. 53. NOW, EACH DOCUMENT CAN BE IN A DIFFERENT MACHINE
  54. 54. WHAT ABOUT TRANSACTIONS?
  55. 55. MONGODB DOES NOT SUPPORT MULTI-DOCUMENT TRANSACTIONS (BEFORE VERSION 4.0)
  56. 56. BUT SINGLE DOCUMENT UPDATE IS ATOMIC { _id: 1, post_author: “Amy W”, post_date: “1/1/2014”, comments: [{ comment_author: “bestguy”, comment_date: “1/1/2014” },{ comment_author: “baddie”, comment_date: “1/10/2014” },{ comment_author: “clever24”, comment_date: “1/11/2014” }] }
  57. 57. THE KEY IS TO FOCUS ON THE DATA MODEL
  58. 58. MONGODB CHARACTERISTICS C.R.U.D. OPERATIONS; DYNAMIC QUERY LANGUAGE (INSTEAD OF STRUCTURED QUERY LANGUAGE (SQL)); FLEXIBLE DATABASE SCHEMAS (INSTEAD OF FIXED DATABASE SCHEMAS); NORMALIZATION; REFERENTIAL INTEGRITY (E.G. FOREIGN KEYS, CONSTRAINTS); JOINS; TRANSACTIONS – LIMITED A.C.I.D. PROPERTIES; INDEXES; OPEN SOURCE
  59. 59. WHEN NOT TO USE MONGODB IF TRANSACTIONS ARE A MUST IF JOINS ARE ABSOLUTELY NECESSARY SOFTWARE PRODUCTS LIKE WORDPRESS THAT ALREADY HAVE TONS OF SUPPORT FOR RELATIONAL DATABASES
  60. 60. FOR MONGODB vs MYSQL ARGUMENTS, WATCH… Source: http://www.youtube.com/watch?v=b2F-DItXtZs
  61. 61. A PEEK AT MONGODB COMMANDS
  62. 62. MONGODB IS A DOCUMENT-ORIENTED DATABASE { _id: ObjectID(“A1234566789”), name: “Ed Brown”, orderDate: “2-1-2014” } { _id: ObjectID(“A1234566789”), name: “Roman Ku”, orderDate: “1-1-2014” } { _id: ObjectID(“A1234566789”), name: “Eva Green”, orderDate: “10-12-2013” } DOCUMENTS ARE INTERNALLY STORED AS BSON (BINARY JSON)
  63. 63. MONGODB FEATURES EASY TO LEARN; DYNAMIC QUERY LANGUAGE - SEARCH BY FIELDS, REGULAR EXPRESSIONS - USER-DEFINED JAVASCRIPT FUNCTIONS - AGGREGATION, INCLUDING MAP/REDUCE; INDEXING – SINGLE, COMPOUND, GEOSPATIAL; REPLICATION; LOAD BALANCING USING SHARDING; GRIDFS TO STORE FILES
  64. 64. MONGODB SYNTAX SEEMS TO BE BORROWED FROM… MYSQL, JSON, JAVASCRIPT, UNIX
  65. 65. MONGODB SUPPORTS SEVERAL LANGUAGES DRIVERS FOR - PYTHON - NODE.JS - C# - HADOOP - R AND MANY MORE
  66. 66. MONGODB TERMINOLOGY (RDBMS → MONGODB) DATABASE → DATABASE, TABLE → COLLECTION, ROW → DOCUMENT. A DATABASE CAN HAVE 1 OR MORE COLLECTIONS. A COLLECTION CAN HAVE 1 OR MORE DOCUMENTS. A DOCUMENT CAN HAVE 1 OR MORE NAME-VALUE PAIRS, AND/OR 1 OR MORE EMBEDDED DOCUMENTS.
  67. 67. MONGODB SUPPORTS SEVERAL DATA TYPES STRING NUMBER BOOLEAN ARRAY DATE EMBEDDED DOCUMENT NULL
  68. 68. MONGODB OPERATIONS C.R.U.D. CREATE READ UPDATE DELETE
  69. 69. CONNECTING TO MONGODB MONGO SHELL IS A JAVASCRIPT INTERPRETER. ROBOMONGO HAS THE SAME JAVASCRIPT ENGINE AS THE MONGO SHELL. BOTH MONGO AND ROBOMONGO CONNECT TO MONGOD.
  70. 70. IMPORT JSON TO MONGO COLLECTION mongoimport -d tennis -c ParksNYC --type json --drop < ParksNYC.json
  71. 71. CREATE COLLECTION SQL CREATE TABLE ParksNYC ( id int identity(1, 1), Prop_ID varchar(10), Name varchar(50) not null, Location varchar(20) not null, EstablishedOn datetime ) MONGODB
  72. 72. CREATE DOCUMENT SQL INSERT INTO ParksNYC (Prop_ID, Name, Location, EstablishedOn) VALUES ('Q900', 'Ridge Park', '1843 Norman St.', '1/1/1970') RESULT: Prop_ID | Name | Location | EstablishedOn: Q900 | Ridge Park | 1843 Norman St. | 1/1/1970 MONGODB db.ParksNYC.insert( { Prop_ID : "Q900", Name : "Ridge Park", Location : "1843 Norman St.", EstablishedOn: "1/1/1970" })
  73. 73. READ ALL DOCUMENTS SQL SELECT * FROM ParksNYC MONGODB db.ParksNYC.find()
  74. 74. READ SPECIFIC DOCUMENT SQL SELECT * FROM ParksNYC WHERE Name = 'Ridge Park' MONGODB db.ParksNYC.find( { Name : "Ridge Park" })
  75. 75. READ FIRST DOCUMENT SQL SELECT TOP 1 * FROM ParksNYC MONGODB db.ParksNYC.findOne()
  76. 76. READ SPECIFIC FIELDS IN DOCUMENT SQL SELECT id, Name FROM ParksNYC MONGODB db.ParksNYC.find( { }, { _id: 1, Name: 1 } )
  77. 77. READ DOCUMENTS WITH RANGE CRITERIA SQL SELECT id, Name FROM ParksNYC WHERE Courts > 5 AND Courts <= 8 MONGODB db.ParksNYC.find( { Courts: { $gt: 5, $lte: 8} } )
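To make the meaning of the range operators concrete, here is a small plain-JavaScript sketch that applies the same criteria object to an in-memory array. It only imitates the comparison semantics; `matchesRange` is a made-up helper, and the park names and court counts are invented sample data.

```javascript
// Comparison operators used in range criteria, mapped to plain JavaScript tests.
const rangeChecks = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
};

// Hypothetical helper: true when the value satisfies every operator in the criteria.
function matchesRange(value, criteria) {
  return Object.entries(criteria).every(([operator, bound]) => {
    if (!Object.prototype.hasOwnProperty.call(rangeChecks, operator)) {
      throw new Error("Unsupported operator: " + operator);
    }
    return rangeChecks[operator](value, bound);
  });
}

// Invented sample data for illustration only.
const parks = [
  { Name: "Park A", Courts: 5 },
  { Name: "Park B", Courts: 6 },
  { Name: "Park C", Courts: 8 },
  { Name: "Park D", Courts: 9 },
];

// Same criteria as the slide: Courts > 5 AND Courts <= 8.
const inRange = parks.filter((park) => matchesRange(park.Courts, { $gt: 5, $lte: 8 }));
console.log(inRange.map((park) => park.Name)); // [ 'Park B', 'Park C' ]
```

Listing both operators under one field name is what expresses the AND from the SQL version.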
  78. 78. READ DOCUMENTS THAT START WITH A LETTER (REGULAR EXPRESSION) SQL SELECT id, Name FROM ParksNYC WHERE Name LIKE 'F%' MONGODB db.ParksNYC.find( { Name: /^F/ } )
  79. 79. UPDATE FIELD IN DOCUMENT SQL UPDATE ParksNYC SET VisitDate = '1/1/2014' MONGODB db.ParksNYC.update( { }, { $set: { VisitDate: "1/1/2014" } }, { multi: true } )
  80. 80. DELETE DOCUMENT SQL DELETE FROM ParksNYC WHERE Name = 'Ridge Park' MONGODB db.ParksNYC.remove( { Name : "Ridge Park" })
  81. 81. GROUP BY AND SUM SQL SELECT COUNT(Name) AS Parks_Number, SUM(Courts) AS Courts_Number FROM ParksNYC GROUP BY Accessible MONGODB db.ParksNYC.aggregate([ { $group : { _id : "$Accessible", Parks_Number : { $sum : 1 }, Courts_Number : { $sum : "$Courts" } } } ])
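What the grouping stage computes can be sketched in plain JavaScript over an in-memory array: one output document per distinct `Accessible` value, counting parks and summing courts. The helper name and the sample rows (including the "Y"/"N" values) are assumptions for illustration.

```javascript
// Hypothetical helper: group parks by Accessible, counting parks and summing courts.
function groupByAccessible(parks) {
  const groups = new Map();
  for (const park of parks) {
    const key = park.Accessible;
    const group = groups.get(key) ?? { _id: key, Parks_Number: 0, Courts_Number: 0 };
    group.Parks_Number += 1;           // like { $sum: 1 }
    group.Courts_Number += park.Courts; // like { $sum: "$Courts" }
    groups.set(key, group);
  }
  return [...groups.values()];
}

// Invented sample data for illustration only.
const sampleParks = [
  { Name: "Park A", Courts: 4, Accessible: "Y" },
  { Name: "Park B", Courts: 6, Accessible: "N" },
  { Name: "Park C", Courts: 10, Accessible: "Y" },
];

console.log(groupByAccessible(sampleParks));
```

The `_id` of each result plays the role of the GROUP BY column in the SQL version.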
  82. 82. SHARDING AND REPLICATION IN MONGODB
  83. 83. EACH DOCUMENT CAN BE IN A DIFFERENT MACHINE
  84. 84. HOW DOES MONGODB DO THIS?
  85. 85. AUTOSHARDING, FOR A COLLECTION
  86. 86. MONGODB CLUSTER MONGOD MONGOD MONGOD MONGOD MONGOS CLIENT CLIENT
  87. 87. SHARDING STEPS 1. ENABLE SHARDING ON DATABASE. 2. PICK A SHARD KEY FROM THE COLLECTION. MAKE SURE THE KEY IS - INDEXED - SUFFICIENTLY VARIED, SO IT HAS MANY DISTINCT VALUES. 3. SIT BACK AND RELAX. MONGODB WILL AUTOMATICALLY DO THE SHARDING.
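Step 2 asks for a key with many distinct values. One rough way to compare candidate fields before choosing is to measure how many distinct values each has relative to the number of documents. This is a sketch over an in-memory sample, not a database command; `distinctRatio` is a made-up helper and the `status` field is a hypothetical second candidate.

```javascript
// Hypothetical helper: fraction of documents that carry a distinct value for the field.
// Values near 1 mean many distinct values; values near 0 mean few.
function distinctRatio(docs, field) {
  if (docs.length === 0) return 0;
  const distinctValues = new Set(docs.map((doc) => doc[field]));
  return distinctValues.size / docs.length;
}

// Invented sample for illustration; "status" is a hypothetical low-variety field.
const samplePosts = [
  { post_author: "Amy W", status: "publish" },
  { post_author: "CarlZ", status: "publish" },
  { post_author: "FrankT", status: "publish" },
  { post_author: "RobG", status: "publish" },
];

console.log(distinctRatio(samplePosts, "post_author")); // 1
console.log(distinctRatio(samplePosts, "status"));      // 0.25
```

A field like `status` would leave almost every document under the same key value, so the data could not be spread evenly across machines.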
  88. 88. SHARDING WP_POSTS COLLECTION { _id: 1, post_author: “Amy W”, post_date: “1/1/2014”, comments: [{ comment_author: “bestguy”, comment_date: “1/1/2014” },{ comment_author: “baddie”, comment_date: “1/10/2014” },{ comment_author: “clever24”, comment_date: “1/11/2014” }] } SHARD KEY
  89. 89. BREAKING THE USERS INTO CHUNKS [$minKey, Abba1234] [Abba1235, CarlW] [CarlZ, FrankT] [FrankY, JackA] [JackB, LambV] [LambW, RobF] [RobG, TimA] [TimB, $maxKey]
  90. 90. BREAKING THE RANGE INTO CHUNKS CLIENT → MONGOS → SHARD0000 (MONGOD): [$minKey, Abba1234] [RobG, TimA] [LambW, RobF]; SHARD0001 (MONGOD): [TimB, $maxKey] [CarlZ, FrankT]; SHARD0002 (MONGOD): [FrankY, JackA] [Abba1235, CarlW] [JackB, LambV]
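The routing idea behind the diagram can be sketched in plain JavaScript: keep the chunks sorted by their upper bound and send each key to the shard that owns the first chunk able to hold it. Several things here are assumptions: the chunk-to-shard assignment is one reading of the flattened diagram, upper bounds are treated as inclusive as drawn on the slide, `null` stands in for `$maxKey`, and strings are compared with JavaScript's built-in ordering.

```javascript
// Chunks sorted by upper bound. The shard assignment is an assumed reading of the slide.
const chunks = [
  { upTo: "Abba1234", shard: "SHARD0000" },
  { upTo: "CarlW", shard: "SHARD0002" },
  { upTo: "FrankT", shard: "SHARD0001" },
  { upTo: "JackA", shard: "SHARD0002" },
  { upTo: "LambV", shard: "SHARD0002" },
  { upTo: "RobF", shard: "SHARD0000" },
  { upTo: "TimA", shard: "SHARD0000" },
  { upTo: null, shard: "SHARD0001" }, // null stands in for $maxKey
];

// Hypothetical router: the first chunk whose upper bound is not below the key owns it.
function routeToShard(key) {
  const chunk = chunks.find((c) => c.upTo === null || key <= c.upTo);
  return chunk.shard;
}

console.log(routeToShard("Aaron")); // falls in the first chunk
console.log(routeToShard("Greg"));  // falls between FrankY and JackA
console.log(routeToShard("Zed"));   // falls in the last chunk
```

Because the lookup touches only the small chunk table, the router can pick a shard without asking every machine.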
  91. 91. BENEFITS OF SHARDING 1. INCREASES AVAILABLE MEMORY. 2. REDUCES LOAD ON THE SERVER. 3. INCREASES HARD DISK SPACE. 4. LOCATION-BASED SHARD KEYS CAN PUT DATA CLOSE TO THE USERS AND KEEP RELATED DATA TOGETHER.
  92. 92. MASTER-SLAVE REPLICATION REPLICA SET MASTER SLAVE SLAVE MONGOD MONGOD MONGOD CLIENT
  93. 93. MASTER-SLAVE REPLICATION REPLICA SET MASTER SLAVE SLAVE MONGOD MONGOD MONGOD CLIENT ELECTION
  94. 94. MASTER-SLAVE REPLICATION REPLICA SET MASTER MONGOD CLIENT SLAVE MONGOD MONGOD MINIMUM 3 MEMBERS TO FORM REPLICA SET
  95. 95. MASTER-SLAVE REPLICATION REPLICA SET SLAVE MASTER SLAVE MONGOD MONGOD MONGOD CLIENT REPLICATION SOLVES THE PROBLEM OF AVAILABILITY AND FAULT TOLERANCE
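The election shown in these slides needs a majority of the set's voting members, which is one reason three members is the usual minimum. A minimal arithmetic sketch in plain JavaScript (the function names are made up for illustration):

```javascript
// Votes needed for a strict majority of the replica set's voting members.
function votesNeeded(members) {
  return Math.floor(members / 2) + 1;
}

// Hypothetical helper: can the members that are still reachable elect a new master?
function canElectMaster(members, reachable) {
  return reachable >= votesNeeded(members);
}

console.log(canElectMaster(3, 2)); // true: one of three is down, two remain
console.log(canElectMaster(2, 1)); // false: one of two is down, no majority
```

With only two members, losing either one leaves the survivor without a majority, so it cannot take over as master.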
  96. 96. FUTURE OF MONGODB AND US
  97. 97. COMPANIES USING MONGODB
  98. 98. MONGODB WINS AWARD
  99. 99. 36 MOST VALUABLE STARTUPS ON EARTH
  100. 100. POLYGLOT PERSISTENCE: POSTGRESQL, RIAK, MONGODB, NEO4J, SQL SERVER, MYSQL, ORACLE, DREMEL, ? GOOD TO KNOW BOTH SQL AND NOSQL
  101. 101. WHAT WE DID NOT COVER SECURITY BACKUP/RECOVERY DATA MODELING ARCHITECT
  102. 102. THANK YOU VERY MUCH
  103. 103. AND THANK YOU TO EVERYONE WHO HELPED US DR. BILL HOWE, UNIVERSITY OF WASHINGTON JASON CHEN, MONGODB RECRUITER KRISTINA CHODOROW (DEFINITIVE GUIDE AUTHOR) FRANCESCA KRIHELY (MONGODB COMMUNITY MANAGER) DR. MARKUS SCHMIDBERGER, RMONGODB JOHANNES BRANDSTETTER, MONGOSOUP (THE FIRST EUROPEAN PARTNER OF MONGODB TO PROVIDE MONGODB AS A SERVICE) DR. RAMNATH VAIDYANATHAN, RCHARTS
  104. 104. REFERENCES MongoDB: http://www.mongodb.org; Book: MongoDB, The Definitive Guide – Kristina Chodorow; Book: NoSQL Distilled – Pramod J. Sadalage and Martin Fowler; NoSQL: http://en.wikipedia.org/wiki/NoSQL; MongoDB Use Cases: http://www.mongodb.com/use-cases; First NoSQL Meetup Notes: http://developer.yahoo.com/blogs/ydn/notes-nosql-meetup7663.html; Billion dollar club: http://graphics.wsj.com/billion-dollar-club/; Photos from Google
  105. 105. DEMO