From Relational Databases to MongoDB – What You Need to Know
Bryan Reinero
Consulting Engineer, MongoDB

Webinar: From Relational Databases to MongoDB - What You Need to Know


Relational databases weren't designed to cope with the scale and agility challenges that face modern applications. MongoDB can offer scalability, performance and ease of use - but proper design will be a critical factor to that success. We'll take a dive into how MongoDB works to better understand what non-relational design is, why we might use it and what advantages it gives us. We'll develop schema designs by example, and consider strategies for scale out.

Published in: Technology
  • Webinar: From Relational Databases to MongoDB - What You Need to Know

    1. From Relational Databases to MongoDB – What You Need to Know. Bryan Reinero, Consulting Engineer, MongoDB
    2. Unhelpful Terms • NoSQL • Big Data • Distributed. What’s the data model?
    3. MongoDB • Non-relational • Scalable • Highly available • Full featured • Document database
    4. Terminology (RDBMS ➜ MongoDB)
        Table, View ➜ Collection
        Row ➜ Document
        Index ➜ Index
        Join ➜ Embedded Document
        Foreign Key ➜ Reference
        Partition ➜ Shard
    5. Sample Document
        {
          maker : "Agusta",
          type : "sportbike",
          rake : 7,
          trail : 3.93,
          engine : {
            type : "internal combustion",
            layout : "inline",
            cylinders : 4,
            displacement : 750
          },
          transmission : {
            type : "cassette",
            speeds : 6,
            pattern : "sequential",
            ratios : [ 2.7, 1.94, 1.34, 1, 0.83, 0.64 ]
          }
        }
    6. Relational DBs • Attribute columns are valid for every row • Duplicate rows are not allowed • Every column has the same type and same meaning. As a document store, MongoDB supports a flexible schema.
    7. 1st Normal Form: No repeating groups
        Product_id | Maker   | Name   | Categories
        1234       | Nokia   | Lumia  | "electronics,hand held, smart phone"
        5678       | Apple   | iPad   | "PDA,tablet"
        91011      | Samsung | Galaxy | "smart phone,tablet"
        • Can't use equality to match elements
    8. 1st Normal Form: No repeating groups
        Product_id | Maker   | Name   | Categories
        1234       | Nokia   | Lumia  | "electronics,hand held, smart phone"
        5678       | Apple   | iPad   | "PDA,tablet"
        91011      | Samsung | Galaxy | "smart phone,tablet"
        • Can't use equality to match elements
        • Must use regular expressions to find data
    9. 1st Normal Form: No repeating groups
        Product_id | Maker   | Name   | Categories
        1234       | Nokia   | Lumia  | "electronics,hand held, smart phone"
        5678       | Apple   | iPad   | "PDA,tablet"
        91011      | Samsung | Galaxy | "smart phone,tablet"
        • Can't use equality to match elements
        • Must use regular expressions to find data
        • Aggregate functions are difficult
    10. 1st Normal Form: No repeating groups
        Product_id | Maker   | Name   | Categories
        1234       | Nokia   | Lumia  | "electronics,hand held, smart phone"
        5678       | Apple   | iPad   | "PDA,tablet"
        91011      | Samsung | Galaxy | "smart phone,tablet"
        • Can't use equality to match elements
        • Must use regular expressions to find data
        • Aggregate functions are difficult
        • Updating a specific element is difficult
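
The problems listed on slides 7 to 10 can be illustrated without a database. The sketch below is plain JavaScript, not SQL or MongoDB code. The `rows` array mirrors the slide's table, while `findBySubstring` and `findByElement` are hypothetical helper names introduced here for illustration.

```javascript
// Rows mirroring the slide's table: categories packed into one string column.
const rows = [
  { product_id: 1234, maker: "Nokia", name: "Lumia", categories: "electronics,hand held, smart phone" },
  { product_id: 5678, maker: "Apple", name: "iPad", categories: "PDA,tablet" },
  { product_id: 91011, maker: "Samsung", name: "Galaxy", categories: "smart phone,tablet" },
];

// Pattern matching on the packed string: "phone" also hits "smart phone".
function findBySubstring(rows, category) {
  return rows
    .filter((row) => row.categories.includes(category))
    .map((row) => row.product_id);
}

// The same data with categories unpacked into arrays, as a document would hold them.
const docs = rows.map((row) => ({
  ...row,
  categories: row.categories.split(",").map((c) => c.trim()),
}));

// Element equality: a category matches only if it equals a whole array element.
function findByElement(docs, category) {
  return docs
    .filter((doc) => doc.categories.includes(category))
    .map((doc) => doc.product_id);
}

console.log(findBySubstring(rows, "phone")); // [ 1234, 91011 ] (false positives)
console.log(findByElement(docs, "phone"));   // []
console.log(findByElement(docs, "tablet"));  // [ 5678, 91011 ]
```

The substring search cannot tell the category "phone" from "smart phone", which is why the packed column pushes queries toward carefully anchored regular expressions.
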
    11. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
    12. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
        // querying is easy
        db.products.find( { "categories" : "handheld" } );
    13. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
        // querying is easy
        db.products.find( { "categories" : "handheld" } );
        // can be indexed
        db.products.ensureIndex( { "categories" : 1 } );
    14. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
        // Updates are easy
        db.products.update(
          { "categories" : "electronics" },
          { $set : { "categories.$" : "consumer electronics" } }
        );
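
To make the effect of the positional `$` update on slide 14 concrete, here is a minimal plain-JavaScript sketch of what it does to one matching document: the first array element that satisfied the query is replaced. `setFirstMatch` is a hypothetical helper written for this illustration, not part of any MongoDB API.

```javascript
// Replace the first array element equal to `match`, mimicking the effect of
// the positional `$` operator in the slide's update (illustration only).
function setFirstMatch(doc, field, match, value) {
  const index = doc[field].indexOf(match);
  if (index === -1) {
    return false; // the query would not have matched this document
  }
  doc[field][index] = value;
  return true;
}

const product = {
  maker: "Nokia",
  name: "Lumia",
  categories: ["electronics", "handheld", "smart phone"],
};

setFirstMatch(product, "categories", "electronics", "consumer electronics");
console.log(product.categories);
// [ 'consumer electronics', 'handheld', 'smart phone' ]
```
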
    15. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
        db.products.aggregate(
          { $unwind : "$categories" },
          { $group : { "_id" : "$categories", "counts" : { "$sum" : 1 } } }
        );
    16. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
        db.products.aggregate(
          { $unwind : "$categories" },  // Unwind the array
          { $group : { "_id" : "$categories", "counts" : { "$sum" : 1 } } }
        );
    17. The Tao of MongoDB
        { _id : ObjectId(), maker : "Nokia", name : "Lumia", categories : [ "electronics", "handheld", "smart phone" ] }
        db.products.aggregate(
          { $unwind : "$categories" },  // Unwind the array
          { $group : { "_id" : "$categories", "counts" : { "$sum" : 1 } } }  // Tally the occurrences
        );
    18. The Tao of MongoDB
        db.products.aggregate(
          { $unwind : "$categories" },
          { $group : { "_id" : "$categories", "counts" : { "$sum" : 1 } } }
        );
        "result" : [
          { "_id" : "smart phone", "counts" : 1589 },
          { "_id" : "handheld", "counts" : 2403 },
          { "_id" : "electronics", "counts" : 4767 }
        ]
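
The two pipeline stages on slides 15 to 18 can be sketched in plain JavaScript to show what each one contributes. This is an illustration of the idea only, not how the server implements it. The helper names `unwind` and `groupCount` are assumptions, and apart from the first document the sample data is made up.

```javascript
// Made-up sample data; only the first document's categories come from the slides.
const products = [
  { name: "Lumia", categories: ["electronics", "handheld", "smart phone"] },
  { name: "Sample A", categories: ["electronics", "smart phone"] },
  { name: "Sample B", categories: ["electronics"] },
];

// Unwind: emit one copy of the document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

// Group: tally how many unwound documents share each value of `field`.
function groupCount(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return Array.from(counts, ([_id, n]) => ({ _id, counts: n }));
}

const unwound = unwind(products, "categories");
const tallies = groupCount(unwound, "categories");
console.log(unwound.length); // 6
console.log(tallies);
// [ { _id: 'electronics', counts: 3 },
//   { _id: 'handheld', counts: 1 },
//   { _id: 'smart phone', counts: 2 } ]
```

The sketch returns groups in first-seen order; the order of groups coming out of a real `$group` stage should not be relied on without an explicit sort.
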
    19. Meh, big deal… Right? Aren’t nested structures just a pre-joined schema? • I could use an adjacency list • I could use an intersection table
    20. Goals of Normalization • Model data in an understandable form • Reduce fact redundancy and data inconsistency • Enforce integrity constraints. Performance is not a primary goal.
    21. Normalize or Denormalize. It is commonly held that denormalization is faster.
    22. Normalize or Denormalize. It is commonly held that denormalization is faster. • Normalization can be fast, right?
    23. Normalize or Denormalize. It is commonly held that denormalization is faster. • Normalization can be fast, right? Requires proper indexing, and indexing affects write performance.
    24. Normalize or Denormalize. It is commonly held that denormalization is faster. • Normalization can be fast, right? Requires proper indexing, and indexing affects write performance. • Does denormalization commit me to a join strategy?
    25. Normalize or Denormalize. It is commonly held that denormalization is faster. • Normalization can be fast, right? Requires proper indexing, and indexing affects write performance. • Does denormalization commit me to a join strategy? Indexing overhead is a commitment too.
    26. Normalize or Denormalize. It is commonly held that denormalization is faster. • Normalization can be fast, right? Requires proper indexing, and indexing affects write performance. • Does denormalization commit me to a join strategy? Indexing overhead is a commitment too. • Does denormalization improve a finite set of queries at the cost of several others?
    27. Normalize or Denormalize. It is commonly held that denormalization is faster. • Normalization can be fast, right? Requires proper indexing, and indexing affects write performance. • Does denormalization commit me to a join strategy? Indexing overhead is a commitment too. • Does denormalization improve a finite set of queries at the cost of several others? MongoDB works best in service to an application.
    28. Object–Relational Impedance Mismatch • Inheritance hierarchies • Polymorphic associations
    29. Table Per Subclass. Vehicles: vin, registration, maker. Motorcycle: engine, rake, trail. Racebike: racing number, class, team, rider.
    30. Table Per Subclass. Vehicles - electric - car - bus - motorcycle - internal combustion - motorcycle - aircraft - human powered - bicycle - skateboard - horsedrawn
    31. Table Per Concrete Class • Each class is mapped to a separate table • Inherited fields are present in each class’ table • Can’t support polymorphic relationships
    32. Table Per Concrete Class • Each class is mapped to a separate table • Inherited fields are present in each class’ table • Can’t support polymorphic relationships
        SELECT maker FROM Motorcycles WHERE Motorcycles.country = 'Italy'
        UNION
        SELECT maker FROM Automobiles WHERE Automobiles.country = 'Italy'
    33. Table Per Class Family • Classes mapped to a single table
        Vehicle_id | Maker       | Name      | Type
        1234       | M.V. Agusta | F4        | sportbike
        5678       | M.V. Agusta | A104      | helicopter
        91011      | Triton      | Triton 95 | submarine
    34. Table Per Class Family • Classes mapped to a single table • Discriminator column to identify class
        Vehicle_id | Maker       | Name      | Type (discriminator)
        1234       | M.V. Agusta | F4        | sportbike
        5678       | M.V. Agusta | A104      | helicopter
        91011      | Triton      | Triton 95 | submarine
    35. Table Per Class Family • Classes mapped to a single table • Discriminator column to identify class • Many empty columns, nullability issues
        Vehicle_id | Maker       | Name      | Type
        1234       | M.V. Agusta | F4        | sportbike
        5678       | M.V. Agusta | A104      | helicopter
        91011      | Triton      | Triton 95 | submarine
    36. Table Per Class Family • Classes mapped to a single table • Discriminator column to identify class • Many empty columns, nullability issues
        Vehicle_id | Maker       | Name      | Type
        1234       | M.V. Agusta | F4        | sportbike
        5678       | M.V. Agusta | A104      | helicopter
        91011      | Triton      | Triton 95 | submarine
        Overlay: maker = "M.V. Agusta", type = "sportbike", num_doors = 0, wing_area = 0, maximum_depth = 0 ???
    37. The Tao of MongoDB
        { maker : "M.V. Agusta", type : "sportbike", engine : { type : "internal combustion", cylinders : 4, displacement : 750 }, rake : 7, trail : 3.93 }
        { maker : "M.V. Agusta", type : "helicopter", engine : { type : "turboshaft", layout : "axial", massflow : 1318 }, blades : 4, undercarriage : "fixed" }
    38. The Tao of MongoDB
        { maker : "M.V. Agusta", type : "sportbike", engine : { type : "internal combustion", cylinders : 4, displacement : 750 }, rake : 7, trail : 3.93 }
        { maker : "M.V. Agusta", type : "helicopter", engine : { type : "turboshaft", layout : "axial", massflow : 1318 }, blades : 4, undercarriage : "fixed" }
        Callout: Discriminator column (the type field)
    39. The Tao of MongoDB
        { maker : "M.V. Agusta", type : "sportbike", engine : { type : "internal combustion", cylinders : 4, displacement : 750 }, rake : 7, trail : 3.93 }
        { maker : "M.V. Agusta", type : "helicopter", engine : { type : "turboshaft", layout : "axial", massflow : 1318 }, blades : 4, undercarriage : "fixed" }
        Callout: Shared indexing strategy
    40. The Tao of MongoDB
        { maker : "M.V. Agusta", type : "sportbike", engine : { type : "internal combustion", cylinders : 4, displacement : 750 }, rake : 7, trail : 3.93 }
        { maker : "M.V. Agusta", type : "helicopter", engine : { type : "turboshaft", layout : "axial", massflow : 1318 }, blades : 4, undercarriage : "fixed" }
        Callout: Polymorphic attributes
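
A minimal plain-JavaScript sketch of how an application might consume such polymorphic documents, modeled on the two documents from slides 37 to 40. The `describe` function and its output strings are hypothetical. The point is that shared fields such as `maker` are queried uniformly, while the `type` field selects the attributes that exist only for that kind of vehicle.

```javascript
// Two documents from one collection, modeled on the slides.
const vehicles = [
  {
    maker: "M.V. Agusta",
    type: "sportbike",
    engine: { type: "internal combustion", cylinders: 4, displacement: 750 },
    rake: 7,
    trail: 3.93,
  },
  {
    maker: "M.V. Agusta",
    type: "helicopter",
    engine: { type: "turboshaft", layout: "axial", massflow: 1318 },
    blades: 4,
    undercarriage: "fixed",
  },
];

// Shared attribute: one predicate works for every kind of vehicle.
const byMaker = vehicles.filter((v) => v.maker === "M.V. Agusta");

// Discriminator: `type` tells the application which attributes to expect.
function describe(vehicle) {
  switch (vehicle.type) {
    case "sportbike":
      return `${vehicle.maker} sportbike, rake ${vehicle.rake}, trail ${vehicle.trail}`;
    case "helicopter":
      return `${vehicle.maker} helicopter, ${vehicle.blades} blades, ${vehicle.undercarriage} undercarriage`;
    default:
      return `${vehicle.maker} ${vehicle.type}`;
  }
}

console.log(byMaker.length); // 2
console.log(vehicles.map(describe));
```

Unlike the single-table mapping, neither document carries placeholder values for attributes that do not apply to it.
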
    41. Relaxed ACID • Atomic operations at the Document level
    42. Relaxed ACID • Atomic operations at the Document level • Consistency – strong / eventual
    43. Replication
    44. Relaxed ACID • Atomic operations at the Document level • Consistency – strong / eventual • Isolation – read lock, write lock / logical database
    45. Relaxed ACID • Atomic operations at the Document level • Consistency – strong / eventual • Isolation – read lock, write lock / logical database • Durability – write ahead journal, replication
    46. The Tao of MongoDB • Document database • Flexible schema • Relaxed ACID. This favors denormalization. What’s the consequence?
    47. Scaling MongoDB (diagram: Client Application; MongoDB as Single Instance or Replica Set; Sharded cluster)
    48. Partitioning • User defines shard key • Shard key defines range of data • Key space is like points on a line • Range is a segment of that line
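
The "points on a line" picture from slide 48 can be sketched as a lookup over half-open ranges. This is plain JavaScript for illustration only. The chunk boundaries and the `findChunk` helper are assumptions, chosen to sit between the vehicle ids used on the following slides, with each range including its lower bound and excluding its upper bound.

```javascript
// Hypothetical chunk boundaries on a numeric shard key.
// Each chunk covers the segment [min, max) of the key line.
const chunks = [
  { min: -Infinity, max: 2345, shard: "Shard 1" },
  { min: 2345, max: 3456, shard: "Shard 2" },
  { min: 3456, max: 4567, shard: "Shard 3" },
  { min: 4567, max: Infinity, shard: "Shard 4" },
];

// Find the one segment of the line that contains the key.
function findChunk(chunks, key) {
  return chunks.find((chunk) => key >= chunk.min && key < chunk.max);
}

console.log(findChunk(chunks, 1234).shard); // Shard 1
console.log(findChunk(chunks, 2345).shard); // Shard 2 (lower bound is inclusive)
console.log(findChunk(chunks, 5678).shard); // Shard 4
```

Because the segments do not overlap and together cover the whole line, every key value lands in exactly one chunk.
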
    49. The Mechanism of Sharding (diagram: Complete Data Set; shard key defined on vehicle id; key values 1234, 2345, 3456, 4567, 5678)
    50. The Mechanism of Sharding (diagram: data set split into two chunks; shard key defined on vehicle id; key values 1234, 2345, 3456, 4567, 5678)
    51. The Mechanism of Sharding (diagram: data set split into four chunks; shard key defined on vehicle id; key values 1234, 2345, 3456, 4567, 5678)
    52. The Mechanism of Sharding (diagram: four chunks on the vehicle id key line 1234, 2345, 3456, 4567, 5678, distributed across Shard 1, Shard 2, Shard 3, Shard 4)
    53. Targeted Operations (diagram: Client, mongos, Shard 1, Shard 2, Shard 3, Shard 4)
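
A rough sketch of the routing decision behind targeted operations, in plain JavaScript. It assumes a shard key named `vehicle_id` and a hypothetical chunk table. `routeQuery` is an illustrative function, not the mongos implementation. A query that includes the shard key can be sent to the single shard owning that key, while a query without it has to be sent to every shard.

```javascript
// Hypothetical chunk table: each entry covers [min, max) of the shard key.
const chunkTable = [
  { min: -Infinity, max: 2345, shard: "Shard 1" },
  { min: 2345, max: 3456, shard: "Shard 2" },
  { min: 3456, max: 4567, shard: "Shard 3" },
  { min: 4567, max: Infinity, shard: "Shard 4" },
];

// Decide which shards must see the query.
function routeQuery(chunkTable, query) {
  const key = query.vehicle_id;
  if (typeof key === "number") {
    // Shard key present: target only the shard whose chunk holds the key.
    const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
    return [chunk.shard];
  }
  // Shard key absent: every shard may hold matching documents.
  return [...new Set(chunkTable.map((c) => c.shard))];
}

console.log(routeQuery(chunkTable, { vehicle_id: 3456 }));
// [ 'Shard 3' ]
console.log(routeQuery(chunkTable, { maker: "Triton" }));
// [ 'Shard 1', 'Shard 2', 'Shard 3', 'Shard 4' ]
```
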
    54. Data Growth (diagram: Shard 1, Shard 2, Shard 3, Shard 4)
    55. Load Balancing (diagram: Shard 1, Shard 2, Shard 3, Shard 4)
    56. Relational if you need to • Enforce data constraints • Service a broad set of queries • Minimize redundancy
    57. The Tao of MongoDB • Avoid ad-hoc queries • Model data for use, not storage • Index effectively, index efficiently
    58. Next Steps • Webinar: From Relational Databases to MongoDB - Considerations and Best Practices – November 7 – 11am PT / 2pm ET / 6pm UTC • Register now at: mongodb.com/events
    59. Questions?
    60. Thank You
