Schema Design

Robert Stam's presentation at Mongo Atlanta

Published in: Technology
  • Transcript

    • 1. Mongo Atlanta
        Schema Design
        Robert Stam
        robert@10gen.com
    • 2. Topics
        Introduction
          • Basic data modeling
          • Manipulating data
          • Evolving a schema
        Common patterns
          • Single table inheritance
          • One-to-many
          • Many-to-many
          • Trees
          • Queues
    • 3. Benefit of relational
        Before relational model
          • Data and logic combined
        After relational model
          • Separation of concerns
          • Data model independent of logic
          • Logic freed from concerns of data design
        MongoDB continues this separation
    • 4. Normalization
        Goals
          • Avoid anomalies when inserting, updating or deleting
          • Minimize redesign when extending the schema
          • Make the model informative to users
          • Avoid bias toward a particular query
        In MongoDB
          • Similar goals apply
          • But rules are different
    • 5. Relational model makes normalized data look like
    • 6. Document databases make normalized data look like
    • 7. Terminology
        Relational       MongoDB
        Table            Collection
        Row(s)           Documents
        Index            Index
        Join             Embedding and linking
        Partition        Shard
        Partition key    Shard key
    • 8. Collections
          • Cheap to create (max 24000)
          • Collections don't have a schema
          • Individual documents have a schema
          • Common for documents in a collection to share a schema
          • Document schema can evolve
          • Consider using multiple related collections tied together by a naming convention:
              • e.g. LogData-2011-02-08
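The naming convention on slide 8 can be generated in application code. The following is a minimal sketch in plain JavaScript; the helper name `dailyCollectionName` and the choice of UTC dates are assumptions made for illustration, not something the slides specify.

```javascript
// Hypothetical helper: builds a per-day collection name such as
// "LogData-2011-02-08". Using UTC for the date parts is an assumption.
function dailyCollectionName(prefix, date) {
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `${prefix}-${yyyy}-${mm}-${dd}`;
}

const name = dailyCollectionName("LogData", new Date(Date.UTC(2011, 1, 8)));
console.log(name); // LogData-2011-02-08
```

Zero-padding the month and day keeps the names sortable as plain strings, which is probably the main reason to prefer this format.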
    • 9. Document basics
          • Zero or more elements
          • Elements are name/value pairs
          • Rich data types for values
          • JSON
          • BSON
    • 10. Data types
          • Numeric (Int32, Int64, Double)
          • String
          • Boolean
          • DateTime
          • ObjectId
          • Others (JavaScript, Regex, Binary, Null, ...)
          • Array
          • Nested document
    • 11. Experimenting with MongoDB
          • Mongo shell
          • JavaScript
        $ mongo
        MongoDB shell version: 1.7.5
        connecting to: test
        > db.books.find()
        {
            _id : ObjectId("12345678901234567890abcd"),
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea"
        }
    • 12. Sample rich document
        > db.orders.findOne()
        {
            _id : 1,
            customer : {
                customer_id : 1234,
                name : "John Doe",
                address : {
                    line1 : "123 Main St",
                    city : "Duncannon",
                    state : "PA",
                    zip : "12345-6789"
                }
            },
            items : [
                { item_id : 111, ... }, // data for first item
                { item_id : 222, ... }, // data for next item
                ...
            ]
        }
    • 13. Rich document advantages
          • Holistic representation
          • Still easy to manipulate
          • Pre-joined for fast retrieval
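To make "still easy to manipulate" concrete, here is a minimal in-memory sketch that navigates the sample order with a dotted path. It is plain JavaScript and does not talk to MongoDB; the helper `getPath` is hypothetical, and the two items carry only the `item_id` values shown on the slide because the rest of each item is elided there.

```javascript
// In-memory copy of the sample order from slide 12.
const order = {
  _id: 1,
  customer: {
    customer_id: 1234,
    name: "John Doe",
    address: { line1: "123 Main St", city: "Duncannon", state: "PA", zip: "12345-6789" },
  },
  items: [{ item_id: 111 }, { item_id: 222 }],
};

// Hypothetical helper: resolve a dotted path such as "customer.address.city".
// Returns undefined as soon as any step along the path is missing.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

console.log(getPath(order, "customer.address.city")); // Duncannon
console.log(order.items.map((item) => item.item_id)); // [ 111, 222 ]
```

The customer, the address and the items all arrive in one read, which is what "pre-joined" refers to.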
    • 14. Document size
          • Max 4MB in earlier MongoDB versions
          • Max 16MB in current versions
          • Performance considerations long before reaching the maximum size
    • 15. Database considerations
        How can we manipulate this data?
          • Dynamic queries
          • Secondary indexes
          • Atomic updates
        What are the access patterns?
          • Read/write ratio
          • Types of updates
          • Types of queries
          • Data life-cycle
        Considerations
          • No joins
          • Document writes are atomic
    • 16. Document design
          • Design documents that map simply to your application data
        > book = {
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            tags : ["American Literature", "Sea", "Large Fish"]
        }
        > db.books.insert(book)
    • 17. Find the document
        > db.books.find({ author : "Ernest Hemingway" })
        {
            _id : ObjectId("12345678901234567890abcd"),
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            tags : ["American Literature", "Sea", "Large Fish"]
        }
        Notes:
          • Every document must have a unique _id
          • MongoDB will generate one automatically if your document does not have an _id
    • 18. Find via index
        > db.books.ensureIndex({ author : 1 })
        > db.books.find({ author : "Ernest Hemingway" })
        {
            _id : ObjectId("12345678901234567890abcd"),
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            tags : ["American Literature", "Sea", "Large Fish"]
        }
    • 19. Verify index exists
        > db.books.getIndexes()
        {
            ...,
            {
                _id : ObjectId("12345678901234567890abcd"),
                ns : "test.books",
                key : { author : 1 },
                name : "author_1"
            },
            ...
        }
    • 20. Verify index is used
        Examine the query plan
        > db.books.find({ author : "Ernest Hemingway" }).explain()
        {
            cursor : "BtreeCursor author_1",
            nscanned : 1,
            nscannedObjects : 1,
            n : 1,
            millis : 1,
            indexBounds : {
                author : [
                    [ "Ernest Hemingway", "Ernest Hemingway" ]
                ]
            }
        }
    • 21. Query operators
        Conditional operators
          • equals ({ author : "..." })
          • matches ({ author : /^e/i })
          • $ne, $in, $nin, $mod, $all, $size, $exists, $type, $lt, $lte, $gt, $gte
    • 22. Sample queries
        // find books by "Ernest Hemingway"
        > db.books.find({ author : "Ernest Hemingway" })

        // find books by authors whose name starts with "e"
        > db.books.find({ author : /^e/i })

        // find books tagged "American Literature"
        > db.books.find({ tags : "American Literature" })

        // find books that have a tags element
        > db.books.find({ tags : { $exists : true } })

        // count books by authors whose name starts with "e"
        > db.books.find({ author : /^e/i }).count()
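The matching rules behind these sample queries can be approximated in plain JavaScript. The sketch below is a simplified in-memory model and not MongoDB's implementation: it covers only equality, regular expressions, `$exists`, `$gt` and `$gte`, and the functions `matchesField` and `find` as well as the second book are hypothetical.

```javascript
// Simplified in-memory model of a few query conditions (not MongoDB code).
function matchesField(doc, field, condition) {
  const present = Object.prototype.hasOwnProperty.call(doc, field);
  const value = doc[field];
  // An array field matches when any one of its elements matches.
  const values = Array.isArray(value) ? value : [value];

  if (condition instanceof RegExp) {
    return present && values.some((v) => typeof v === "string" && condition.test(v));
  }
  if (condition !== null && typeof condition === "object") {
    return Object.entries(condition).every(([op, arg]) => {
      if (op === "$exists") return present === arg;
      if (op === "$gt") return present && values.some((v) => v > arg);
      if (op === "$gte") return present && values.some((v) => v >= arg);
      throw new Error(`operator not covered by this sketch: ${op}`);
    });
  }
  return present && values.some((v) => v === condition);
}

function find(collection, query) {
  return collection.filter((doc) =>
    Object.entries(query).every(([field, cond]) => matchesField(doc, field, cond))
  );
}

const books = [
  {
    author: "Ernest Hemingway",
    title: "The Old Man and the Sea",
    tags: ["American Literature", "Sea", "Large Fish"],
  },
  { author: "Another Author", title: "Untitled" }, // hypothetical second book, no tags
];

console.log(find(books, { author: /^e/i }).length); // 1
console.log(find(books, { tags: "American Literature" }).length); // 1
console.log(find(books, { tags: { $exists: true } }).length); // 1
```

Note how `{ tags : "American Literature" }` matches a document whose `tags` value is an array containing that string; this is what lets an embedded array stand in for a join table.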
    • 23. Extending the schema
        > comment = {
            author : "Robert",
            text : "Great book",
            date : Date() // Date() yields a string; new Date() would store a DateTime
        }
        > db.books.update(
            { title : "The Old Man and the Sea" },
            {
                $inc : { comments_count : 1 },
                $push : { comments : comment }
            }
        )
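What `$inc` and `$push` do to a document that has neither field yet can be modeled in a few lines. This is an in-memory sketch under simplifying assumptions, not MongoDB's implementation; `applyUpdate` is a hypothetical helper.

```javascript
// Simplified in-memory versions of $inc and $push (not MongoDB code).
function applyUpdate(doc, update) {
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // a missing counter starts from 0
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (doc[field] === undefined) doc[field] = []; // a missing array is created
    doc[field].push(value);
  }
  return doc;
}

const book = {
  author: "Ernest Hemingway",
  title: "The Old Man and the Sea",
  tags: ["American Literature", "Sea", "Large Fish"],
};

const comment = { author: "Robert", text: "Great book", date: new Date() };

applyUpdate(book, {
  $inc: { comments_count: 1 },
  $push: { comments: comment },
});

console.log(book.comments_count); // 1
console.log(book.comments.length); // 1
```

The document gains two new fields without any change to the other documents in the collection, which is the sense in which the schema evolves per document.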
    • 24. Extended schema
        > db.books.find({ title : "The Old Man and the Sea" })
        {
            _id : ObjectId("12345678901234567890abcd"),
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            tags : ["American Literature", "Sea", "Large Fish"],
            comments_count : 1,
            comments : [
                {
                    author : "Robert",
                    text : "Great book",
                    date : "Wed Feb 02 2011 10:36:18 ..."
                }
            ]
        }
    • 25. Using the extended schema
        // create index on nested element
        > db.books.ensureIndex({ "comments.author" : 1 })

        // find books Robert has commented on
        > db.books.find({ "comments.author" : "Robert" })

        // find book with most comments
        > db.books.find().sort({ "comments_count" : -1 }).limit(1)

        // when sorting, check if you need an index
    • 26. Watch for full table scans
        Examine the query plan
        > db.books.find().sort({ "comments_count" : -1 }).limit(1).explain()
        {
            cursor : "BasicCursor",
            nscanned : 12345,
            nscannedObjects : 12345,
            n : 1,
            millis : 123,
            indexBounds : { }
        }
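The gap between `nscanned : 12345` here and `nscanned : 1` on slide 20 comes from whether an ordered structure already exists for the sort key. The sketch below imitates that difference in memory; the 1000 generated books, the sorted-array stand-in for an index, and both function names are assumptions for illustration only.

```javascript
// Hypothetical data: 1000 books whose comments_count values are all distinct.
const books = Array.from({ length: 1000 }, (_, i) => ({
  _id: i,
  comments_count: (i * 37) % 1000,
}));

// Without an index every document has to be examined to find the maximum.
function mostCommentedByScan(docs) {
  let scanned = 0;
  let best = null;
  for (const doc of docs) {
    scanned += 1;
    if (best === null || doc.comments_count > best.comments_count) best = doc;
  }
  return { best, scanned };
}

// Stand-in for an index on comments_count (descending): built once, ahead of
// the query, and kept sorted, so the top entry can be read directly.
const commentsCountIndex = books
  .map((doc) => ({ key: doc.comments_count, doc }))
  .sort((a, b) => b.key - a.key);

function mostCommentedByIndex(index) {
  return { best: index[0].doc, scanned: 1 };
}

const viaScan = mostCommentedByScan(books);
const viaIndex = mostCommentedByIndex(commentsCountIndex);
console.log(viaScan.scanned, viaIndex.scanned); // 1000 1
console.log(viaScan.best._id === viaIndex.best._id); // true
```

In the shell, the corresponding step would presumably be an `ensureIndex` call on `comments_count`, followed by another look at `explain()`.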
    • 27. Inheritance
    • 28. Single table inheritance
        Shapes table:
        id   type     area   radius   side   length   width
        1    circle   3.14   1
        2    square   4               2
        3    rect     10                     5        2
    • 29. Single table inheritance: MongoDB
        > db.shapes.find()
        { _id : 1, type : "circle", area : 3.14, radius : 1 }
        { _id : 2, type : "square", area : 4, side : 2 }
        { _id : 3, type : "rect", area : 10, length : 5, width : 2 }

        // find shapes where radius > 0
        > db.shapes.find({ radius : { $gt : 0 } })
        { _id : 1, type : "circle", area : 3.14, radius : 1 }

        // find shapes where area >= 4
        > db.shapes.find({ area : { $gte : 4 } })
        { _id : 2, type : "square", area : 4, side : 2 }
        { _id : 3, type : "rect", area : 10, length : 5, width : 2 }

        // db.shapes.ensureIndex({ radius : 1 })
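On the application side, the `type` field acts as a discriminator that tells the code which of the optional fields to read. A minimal sketch, assuming the three shape documents above; the `areaByType` table and `computedArea` are hypothetical and simply recompute the stored `area` from the type-specific fields.

```javascript
const shapes = [
  { _id: 1, type: "circle", area: 3.14, radius: 1 },
  { _id: 2, type: "square", area: 4, side: 2 },
  { _id: 3, type: "rect", area: 10, length: 5, width: 2 },
];

// Each type reads only the fields that its own documents carry.
const areaByType = {
  circle: (s) => Math.PI * s.radius ** 2,
  square: (s) => s.side ** 2,
  rect: (s) => s.length * s.width,
};

function computedArea(shape) {
  const compute = areaByType[shape.type];
  if (!compute) throw new Error(`unknown shape type: ${shape.type}`);
  return Math.round(compute(shape) * 100) / 100; // two decimals, as on the slide
}

for (const shape of shapes) {
  console.log(shape.type, computedArea(shape) === shape.area);
}
```

Unlike the relational Shapes table, no document has to carry empty columns for fields that belong to another type.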
    • 30. One-to-many
        Options
          • Embedded Array
          • Embedded Document
          • Normalized
    • 31. One-to-many: embedded array
        > db.books.find()
        {
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            comments : [
                { author : "Robert", text : "Great book" },
                { author : "Jim", text : "I didn't like it" }
            ]
        }
    • 32. One-to-many: embedded trees
        > db.books.find()
        {
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            comments : [
                {
                    author : "Robert",
                    text : "Great book",
                    replies : [
                        {
                            author : "Jim",
                            text : "I didn't like it"
                        }
                    ]
                }
            ]
        }
    • 33. One-to-many: normalized
        > db.books.find()
        {
            _id : 1,
            author : "Ernest Hemingway",
            title : "The Old Man and the Sea",
            comment_ids : [1, 2]
        }
        > db.comments.find()
        { _id : 1, book_id : 1, author : "Robert", text : "Great book" }
        { _id : 2, book_id : 1, author : "Jim", text : "I didn't like it" }
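With the normalized layout there is no join, so the application follows the links itself. The sketch below models both link directions in memory and checks that they agree; the two helper functions are hypothetical, and against a real database each would be a separate query.

```javascript
const books = [
  { _id: 1, author: "Ernest Hemingway", title: "The Old Man and the Sea", comment_ids: [1, 2] },
];
const comments = [
  { _id: 1, book_id: 1, author: "Robert", text: "Great book" },
  { _id: 2, book_id: 1, author: "Jim", text: "I didn't like it" },
];

// Direction 1: start from the book and follow its comment_ids array.
function commentsViaIds(bookId) {
  const book = books.find((b) => b._id === bookId);
  return book ? comments.filter((c) => book.comment_ids.includes(c._id)) : [];
}

// Direction 2: ask the comments which of them point back at the book.
function commentsViaBookId(bookId) {
  return comments.filter((c) => c.book_id === bookId);
}

console.log(commentsViaIds(1).map((c) => c.author)); // [ 'Robert', 'Jim' ]
console.log(commentsViaBookId(1).map((c) => c.author)); // [ 'Robert', 'Jim' ]
```

Storing both `comment_ids` and `book_id` means two places must be kept consistent on every insert or delete; keeping only one direction is a reasonable simplification when the access pattern allows it.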
    • 34. Many-to-many
        Example:
          • Product can be in many categories
          • Category has many products
    • 35. Many-to-many: products and categories
        > db.products.find()
        {
            _id : 1,
            name : "Baseball bat",
            category_ids : [1, 2]
        }
        > db.categories.find()
        {
            _id : 1,
            name : "Sports Equipment",
            product_ids : [1]
        }
        {
            _id : 2,
            name : "Baseball",
            product_ids : [1, ...]
        }
    • 36. Many-to-many: queries
        // all products for a given category
        > db.products.find({ category_ids : 1 })

        // all categories for a given product
        > db.categories.find({ product_ids : 1 })
    • 37. Many-to-many: products and categories (normalized)
        > db.products.find()
        {
            _id : 1,
            name : "Baseball bat",
            category_ids : [1, 2]
        }
        > db.categories.find()
        {
            _id : 1,
            name : "Sports Equipment"
        }
        {
            _id : 2,
            name : "Baseball"
        }
    • 38. Many-to-many: queries (normalized)
        // all products for a given category
        > db.products.find({ category_ids : 1 })

        // all categories for a given product
        > product = db.products.findOne({ _id : 1 })
        > db.categories.find({ _id : { $in : product.category_ids } })
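The asymmetry of the normalized variant (one query in one direction, two in the other) shows up clearly in an in-memory model. This is a sketch in plain JavaScript with hypothetical function names, not shell code.

```javascript
const products = [{ _id: 1, name: "Baseball bat", category_ids: [1, 2] }];
const categories = [
  { _id: 1, name: "Sports Equipment" },
  { _id: 2, name: "Baseball" },
];

// One step: the link lives on the product, so filter products directly.
function productsInCategory(categoryId) {
  return products.filter((p) => p.category_ids.includes(categoryId));
}

// Two steps: load the product first, then fetch the categories it names,
// mirroring findOne followed by a $in query.
function categoriesOfProduct(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  return categories.filter((c) => product.category_ids.includes(c._id));
}

console.log(productsInCategory(2).map((p) => p.name)); // [ 'Baseball bat' ]
console.log(categoriesOfProduct(1).map((c) => c.name)); // [ 'Sports Equipment', 'Baseball' ]
```

The extra round trip buys a single source of truth: the links are stored only on the product, so they cannot drift out of sync with a second copy on the category.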
    • 39. Trees
        Options:
          • Full tree in document
          • Parent links
          • Child links
          • Parent and child links
          • Array of ancestors
          • Ancestor paths
    • 40. Trees: full tree in document
        {
            comments : [
                { author : "Robert", text : "...",
                    replies : [
                        { author : "Jim", text : "...",
                            replies : []
                        }
                    ]
                }
            ]
        }
        Pros: single document, performance, intuitive
        Cons: hard to search, hard to get partial results, document size limit could be reached
    • 41. Trees: Parent and child links
        Parent links
          • Each node is stored as a document
          • Contains the id of the parent
        Child links
          • Each node is stored as a document
          • Contains the ids of the children
        In some cases you might do both
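A small in-memory sketch of parent links follows. The node ids, the `parent: null` marker for the root, and both functions are assumptions for illustration; against a database, each step up the tree would be one more query.

```javascript
// Hypothetical nodes using parent links only; the root has parent null.
const nodes = new Map([
  [1, { _id: 1, parent: null }],
  [2, { _id: 2, parent: 1 }],
  [3, { _id: 3, parent: 2 }],
  [5, { _id: 5, parent: 1 }],
  [6, { _id: 6, parent: 5 }],
]);

// Walking up costs one lookup per level of the tree.
function ancestorsOf(id) {
  const result = [];
  let current = nodes.get(id);
  while (current && current.parent !== null) {
    result.push(current.parent);
    current = nodes.get(current.parent);
  }
  return result; // nearest ancestor first
}

// Children are found by asking which nodes name this id as their parent.
function childrenOf(id) {
  return [...nodes.values()].filter((n) => n.parent === id).map((n) => n._id);
}

console.log(ancestorsOf(6)); // [ 5, 1 ]
console.log(childrenOf(1)); // [ 2, 5 ]
```

Parent links make moving a subtree cheap (one field changes) but make "all ancestors" or "all descendants" a multi-step walk.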
    • 42. Trees: array of ancestors
        > db.nodes.find()
        { _id : 1 }
        { _id : 2, ancestors : [1], parent : 1 }
        { _id : 3, ancestors : [1, 2], parent : 2 }
        { _id : 4, ancestors : [1, 2], parent : 2 }
        { _id : 5, ancestors : [1], parent : 1 }
        { _id : 6, ancestors : [1, 5], parent : 5 }
    • 43. Trees: array of ancestors (queries)
        // find all children of 2
        > db.nodes.find({ parent : 2 })

        // find all descendants of 2
        > db.nodes.find({ ancestors : 2 })

        // find all ancestors of 6
        > node = db.nodes.findOne({ _id : 6 })
        > db.nodes.find({ _id : { $in : node.ancestors } })

        // find all siblings of 3
        > node = db.nodes.findOne({ _id : 3 })
        > db.nodes.find({ parent : node.parent, _id : { $ne : 3 } })
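The four queries can be checked against the six sample nodes with an in-memory model. This is a sketch in plain JavaScript, not shell code; the function names are hypothetical and each `filter` stands in for one `find`.

```javascript
const nodes = [
  { _id: 1 },
  { _id: 2, ancestors: [1], parent: 1 },
  { _id: 3, ancestors: [1, 2], parent: 2 },
  { _id: 4, ancestors: [1, 2], parent: 2 },
  { _id: 5, ancestors: [1], parent: 1 },
  { _id: 6, ancestors: [1, 5], parent: 5 },
];

const ids = (docs) => docs.map((d) => d._id);

const childrenOf = (id) => nodes.filter((n) => n.parent === id);

// Array membership: a node is a descendant if the id appears in its ancestors.
const descendantsOf = (id) => nodes.filter((n) => (n.ancestors || []).includes(id));

// Two steps: read the node, then fetch every id in its ancestors array.
const ancestorsOf = (id) => {
  const node = nodes.find((n) => n._id === id);
  const wanted = (node && node.ancestors) || [];
  return nodes.filter((n) => wanted.includes(n._id));
};

// Two steps: read the node, then match its parent while excluding the node itself.
const siblingsOf = (id) => {
  const node = nodes.find((n) => n._id === id);
  if (!node) return [];
  return nodes.filter((n) => n.parent === node.parent && n._id !== id);
};

console.log(ids(childrenOf(2))); // [ 3, 4 ]
console.log(ids(descendantsOf(2))); // [ 3, 4 ]
console.log(ids(ancestorsOf(6))); // [ 1, 5 ]
console.log(ids(siblingsOf(3))); // [ 4 ]
```

In this sample tree the children and descendants of 2 happen to coincide because nodes 3 and 4 are leaves; `descendantsOf(1)` returns every other node.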
    • 44. Trees: paths
          • Store the hierarchy as a path expression
          • Separate each node by a delimiter (avoid "/" and ".")
          • Use regular expressions to find parts of a tree
        > db.nodes.find()
        { _id : 1, path : ",1," }
        { _id : 2, path : ",1,2," }
        { _id : 3, path : ",1,2,3," }
        { _id : 4, path : ",1,2,4," }
        { _id : 5, path : ",1,5," }
        { _id : 6, path : ",1,5,6," }
        Variations:
          • Don't store the leading or trailing delimiter
          • Don't store the final id (it's the same as _id)
    • 45. Trees: paths (queries)
        // find all descendants of 2
        > db.nodes.find({ path : /,2,/ })

        // find all children of 2
        > db.nodes.find({ path : /,2,[^,]+,$/ })
        or
        > db.nodes.find({ path : /,2,$/ }) // if _id is not on path

        // find all ancestors of 6
        // not so easy

        // find all siblings of 3
        // not so easy
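The path queries, including the two marked "not so easy", can be explored in plain JavaScript. The sketch below is one possible application-side approach under stated assumptions, not the presenter's method: it assumes the `_id` is stored as the last element of the path, as in the sample data, and all function names are hypothetical. Note that with that layout the pattern `/,2,/` also matches node 2 itself, so the descendant check here requires at least one more id after `,2,`.

```javascript
const nodes = [
  { _id: 1, path: ",1," },
  { _id: 2, path: ",1,2," },
  { _id: 3, path: ",1,2,3," },
  { _id: 4, path: ",1,2,4," },
  { _id: 5, path: ",1,5," },
  { _id: 6, path: ",1,5,6," },
];

const ids = (docs) => docs.map((d) => d._id);

// Descendants of 2: ",2," followed by at least one further id.
const descendantsOf2 = nodes.filter((n) => /,2,.+/.test(n.path));

// Children of 2: ",2," followed by exactly one further id.
const childrenOf2 = nodes.filter((n) => /,2,[^,]+,$/.test(n.path));

// Ancestors: parse the node's own path instead of querying with a regex.
function ancestorIdsFromPath(node) {
  const parts = node.path.split(",").filter(Boolean).map(Number); // ",1,5,6," -> [1, 5, 6]
  return parts.slice(0, -1); // drop the node's own id
}

// Siblings: other nodes whose path minus its last id equals this node's parent path.
const parentPathOf = (node) => node.path.replace(/[^,]+,$/, ""); // ",1,2,3," -> ",1,2,"
function siblingsOf(node) {
  const parentPath = parentPathOf(node);
  return nodes.filter((n) => n._id !== node._id && parentPathOf(n) === parentPath);
}

console.log(ids(descendantsOf2)); // [ 3, 4 ]
console.log(ids(childrenOf2)); // [ 3, 4 ]
console.log(ancestorIdsFromPath(nodes[5])); // [ 1, 5 ]
console.log(ids(siblingsOf(nodes[2]))); // [ 4 ]
```

The leading and trailing delimiters are what keep `,2,` from matching inside a longer id such as 12 or 20.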
    • 46. Queues
          • Need to maintain order and state
          • Ensure that updates to the queue are atomic
        > db.queue.find()
        { _id : 1, inprogress : false, priority : 1, job : ... }

        // take the highest priority pending job
        > db.queue.findAndModify({
            query : { inprogress : false },
            sort : { priority : -1 },
            update : {
                $set : {
                    inprogress : true,
                    started : Date()
                }
            },
            new : true
        })
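The claim-a-job step can be modeled in memory to show the intended behavior. This is a sketch with hypothetical job data and a hypothetical `takeNextJob` function; it is single-threaded, so nothing can interleave between selecting and marking a job. Against a real server, that guarantee is what the slide relies on `findAndModify` for.

```javascript
// Hypothetical queue contents; the slide elides the job payload.
const queue = [
  { _id: 1, inprogress: false, priority: 1, job: "low priority job" },
  { _id: 2, inprogress: false, priority: 5, job: "high priority job" },
  { _id: 3, inprogress: true, priority: 9, job: "already running" },
];

// Select the highest-priority pending job and mark it as started in one step.
function takeNextJob(now = new Date()) {
  const pending = queue.filter((j) => !j.inprogress);
  if (pending.length === 0) return null;
  const next = pending.reduce((best, j) => (j.priority > best.priority ? j : best));
  next.inprogress = true;
  next.started = now;
  return next; // the updated document, comparable to new : true
}

const first = takeNextJob();
const second = takeNextJob();
const third = takeNextJob();
console.log(first._id, second._id, third); // 2 1 null
```

If the select and the update were two separate operations, two workers could both pick job 2 before either marked it; combining them into one atomic operation is the point of the pattern.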
    • 47. Summary
          • Schema design is different in MongoDB
          • Basic principles stay the same
          • Use rich documents
          • There's more than one right way
          • Focus on how your application uses the data
          • Rapidly evolve the schema to meet your requirements
    • 48. Thank you
        Learn more
          • www.mongodb.org
          • www.10gen.com/events
          • www.10gen.com/webinars
