#MDBlocal
Jump Start:
Introduction to Schema Design
20 MARCH, 2018
#MDBlocal
Andrew Ryder
Consulting Engineer, MongoDB
ryder@mongodb.com
#MDBlocal
Why a Schema Design Jumpstart?
● Schema Design is important in all databases
● Many performance issues we see in MongoDB installations are due to
poor schema design decisions
● MongoDB schema design is different from relational DBs
● MongoDB supports some interesting new patterns
But wait! Isn’t MongoDB “schema-less”?
#MDBlocal
Goals for this session
At the end of this session, you should be able to:
● Explain the document model
● Explain the difference between relational DBs and document-based DBs
● Understand how schema design impacts performance
● Think about schema design in a new way
#MDBlocal
High-Level Schema Design
Start with the Consumer
#MDBlocal
Relational Schemas
#MDBlocal
MongoDB Schemas
#MDBlocal
How is Data Stored in MongoDB?
#MDBlocal
MongoDB Documents are Rich Data Structures
{
  first_name: 'Paul',
  surname: 'Miller',
  cell: 447557505611,
  city: 'London',
  location: { type: 'Point',
              coordinates: [-0.223, 51.52] },
  profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Fields
Typed field values
Array
Fields can contain an array of sub-documents (JSON objects)
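A minimal sketch of how a dotted path such as "cars.year" reaches into an array of sub-documents. It runs on a plain JavaScript object, not against a MongoDB server. The helper name valuesAtPath is hypothetical, and the sample document is an abbreviated copy of the one on the slide.

```javascript
// Sample document modelled on the slide above (abbreviated).
const collector = {
  first_name: "Paul",
  surname: "Miller",
  location: { type: "Point", coordinates: [-0.223, 51.52] },
  cars: [
    { model: "Bentley", year: 1973, value: 100000 },
    { model: "Rolls Royce", year: 1965, value: 330000 },
  ],
};

// Hypothetical helper: resolve a dotted path such as "cars.year".
// When a step lands on an array, the rest of the path is applied to each
// element, which is roughly how dotted paths reach into arrays of
// sub-documents. Numeric array indexes in the path are not handled here.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    current = current
      .flatMap((value) => (Array.isArray(value) ? value : [value]))
      .filter((value) => value !== null && typeof value === "object" && key in value)
      .map((value) => value[key]);
  }
  return current;
}

console.log(valuesAtPath(collector, "cars.year"));     // [ 1973, 1965 ]
console.log(valuesAtPath(collector, "location.type")); // [ 'Point' ]
```

The same dotted path is what an index key or a query filter would name, so one field name can address every element of the embedded array.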
#MDBlocal
Document Schemas are Flexible
{
  sku: 'PAINTZXC123',
  product_name: 'Metallic Paint',
  product_type: 'paint',
  colours: ['Red', 'Green'],
  size_litres: [5, 10]
}
{
  sku: 'TSHRTASD43546',
  product_name: 'T-shirt',
  product_type: 'MongoDB swag',
  colours: ['Heather Gray' … ],
  size: ['S', 'M', 'L', 'XL'],
  material: '100% cotton',
  wash: 'cold',
  dry: 'tumble dry low'
}
● Different Documents in the same ProductsCatalog collection in MongoDB
● Polymorphic Schema - think base class & derived classes
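One way to picture the "base class & derived classes" idea is application code that reads the shared fields unconditionally and the type-specific fields only when they are present. The sketch below uses plain JavaScript objects. The function name describeProduct is hypothetical, and the T-shirt's colours array is shortened because the slide elides it.

```javascript
// Two differently shaped documents from the same catalog (abbreviated).
const productsCatalog = [
  { sku: "PAINTZXC123", product_name: "Metallic Paint", product_type: "paint",
    colours: ["Red", "Green"], size_litres: [5, 10] },
  { sku: "TSHRTASD43546", product_name: "T-shirt", product_type: "MongoDB swag",
    colours: ["Heather Gray"], size: ["S", "M", "L", "XL"], material: "100% cotton" },
];

// Shared ("base class") fields are read the same way for every product.
// Type-specific ("derived class") fields are read only when they exist.
function describeProduct(product) {
  const base = `${product.product_name} (${product.sku}) in ${product.colours.join(", ")}`;
  if (Array.isArray(product.size_litres)) {
    return `${base}; tins of ${product.size_litres.join(" or ")} litres`;
  }
  if (Array.isArray(product.size)) {
    return `${base}; sizes ${product.size.join("/")}`;
  }
  return base;
}

for (const product of productsCatalog) {
  console.log(describeProduct(product));
}
```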
#MDBlocal
Rich Indexes
db.collectors.createIndex({
"location":"2dsphere",
"cars.year": 1
});
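A sketch of a query filter that this compound index could serve, assuming the documents look like the car collector shown earlier. The code only builds the filter object and prints it, with no server involved. The helper name nearbyOwnersFilter and the sample coordinates and distance are illustrative assumptions.

```javascript
// Index specification copied from the slide.
const indexSpec = { location: "2dsphere", "cars.year": 1 };

// Hypothetical helper: build a filter for collectors near a point who own
// at least one car built before a given year.
function nearbyOwnersFilter(longitude, latitude, maxMetres, beforeYear) {
  return {
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [longitude, latitude] },
        $maxDistance: maxMetres, // metres, because the point is GeoJSON
      },
    },
    "cars.year": { $lt: beforeYear },
  };
}

const filter = nearbyOwnersFilter(-0.223, 51.52, 5000, 1970);
console.log(JSON.stringify(filter, null, 2));
// In the mongo shell the filter would be passed as db.collectors.find(filter).
```

Both fields named in the filter are also keys of the index, which is the point of indexing the geospatial field and the embedded array field together.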
#MDBlocal
Expressive Queries
#MDBlocal
2 Big “Rules”:
● Consider how the data will be used
● Data that works together lives together
#MDBlocal
Example > Guitar Collectors
Terminology
Relational DB → MongoDB
Table → Collection
Column → Field
Row → Document
#MDBlocal
Example > Guitar Collectors
Relational
Consider your consumers:
how will this data be used?
#MDBlocal
Example > Guitar Collectors
MongoDB - Embedding
#MDBlocal
Example > Guitar Collectors
Typical queries from a consumer of this data might be:
● “What guitars does Aimee Doe own?”
● “Show me all the people who own a Gibson
Les Paul of any age”
● “How many people within 100 miles of Seattle
own guitars made before 1970?”
Consider how the data will be used
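The three questions above could be expressed as filters roughly like the following. The collector document for this example is not shown in the text, so every field name here (name, location, guitars, make, model, year) is an assumption, and the Seattle coordinates are approximate. The code only builds the filter objects.

```javascript
// Miles-to-radians conversion for $centerSphere (Earth radius in miles).
const EARTH_RADIUS_MILES = 3963.2;

// 1. "What guitars does Aimee Doe own?"
//    With embedding, one document answers the question.
const guitarsOfAimee = {
  filter: { name: "Aimee Doe" },
  projection: { guitars: 1, _id: 0 },
};

// 2. "Show me all the people who own a Gibson Les Paul of any age"
//    $elemMatch requires make and model to match on the same guitar.
const gibsonLesPaulOwners = {
  guitars: { $elemMatch: { make: "Gibson", model: "Les Paul" } },
};

// 3. "How many people within 100 miles of Seattle own guitars made before 1970?"
//    $geoWithin is used rather than $near so the filter also works for counting.
function ownersNear(longitude, latitude, miles, beforeYear) {
  return {
    location: {
      $geoWithin: { $centerSphere: [[longitude, latitude], miles / EARTH_RADIUS_MILES] },
    },
    "guitars.year": { $lt: beforeYear },
  };
}

const oldGuitarsNearSeattle = ownersNear(-122.33, 47.61, 100, 1970);
console.log(JSON.stringify(oldGuitarsNearSeattle));
```

None of the three needs a join, because the guitars live inside the owner's document.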
#MDBlocal
Design Patterns
Data Relationships
● One : One
● One : Many
● Many : Many
MongoDB Patterns
● Embedding
● Reference
● Subset
● Time Series
● Tree Structure
#MDBlocal
Example > Healthcare
We want to build a mobile app for patients and providers.
Within our medical system:
● Patients have multiple medical procedures
● Doctors work at several hospitals & clinics, and vice versa
#MDBlocal
1:1 Relationships
← Embed – weak entity
Medical Procedures
Data that works together
lives together.
#MDBlocal
One : Many Relationships
Embed
Patients
{
  _id: 2,
  first: "Joe",
  last: "Patient",
  addr: { … },
  procedures: [
    {
      id: 12345,
      date: ISODate("2015-02-15"),
      type: "CAT scan",
      … },
    {
      id: 12346,
      date: ISODate("2015-02-15"),
      type: "blood test",
      … }]
}
OR
Reference
Patients
{
  _id: 2,
  first: "Joe",
  last: "Patient",
  addr: { … },
  procedures: [12345, 12346]
}
Procedures
{
  _id: 12345,
  date: ISODate("2015-02-15"),
  type: "CAT scan",
  … }
{
  _id: 12346,
  date: ISODate("2015-02-15"),
  type: "blood test",
  … }
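A small in-memory sketch of the trade-off between the two designs: the embedded patient answers "which procedures has Joe had?" from one document, while the referenced patient needs a second lookup per procedure id. A Map stands in for the Procedures collection, dates are kept as plain strings for simplicity, and the function names are hypothetical.

```javascript
// Embedded design: one document carries everything.
const embeddedPatient = {
  _id: 2, first: "Joe", last: "Patient",
  procedures: [
    { id: 12345, date: "2015-02-15", type: "CAT scan" },
    { id: 12346, date: "2015-02-15", type: "blood test" },
  ],
};

// Referenced design: the patient stores ids; procedures live elsewhere.
const referencedPatient = { _id: 2, first: "Joe", last: "Patient", procedures: [12345, 12346] };
const proceduresCollection = new Map([
  [12345, { _id: 12345, date: "2015-02-15", type: "CAT scan" }],
  [12346, { _id: 12346, date: "2015-02-15", type: "blood test" }],
]);

// One read: the data is already inside the patient document.
function procedureTypesEmbedded(patient) {
  return patient.procedures.map((procedure) => procedure.type);
}

// Two steps: read the patient, then resolve each referenced id.
function procedureTypesReferenced(patient, procedures) {
  return patient.procedures
    .map((id) => procedures.get(id))
    .filter((procedure) => procedure !== undefined) // skip ids that resolve to nothing
    .map((procedure) => procedure.type);
}

console.log(procedureTypesEmbedded(embeddedPatient));
console.log(procedureTypesReferenced(referencedPatient, proceduresCollection));
```

Both calls produce the same list; what differs is how many lookups were needed and where the procedure data has to be maintained.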
#MDBlocal
Many : Many Relationship
As with a One-to-Many relationship, you can either embed sub-documents or reference them. Which
approach you take depends on data access patterns and document sizes.
The relational way:
Hospitals
● Id
● Name
● Location
● Foo
● Bar
Hospitals_to_Physicians
● Hospital_id
● Physician_id
Physicians
● Id
● Name
● Address
● Phone
● etc
“Who works at this hospital?”
“At which clinics does my doctor work?”
#MDBlocal
Many : Many Relationship using Embedding
{
  _id: 1,
  name: "Oak Valley Hospital",
  city: "New York",
  beds: 131,
  physicians: [
    {
      id: 12345,
      name: "Joe Doctor",
      address: { … },
      … },
    {
      id: 12346,
      name: "Mary Well",
      address: { … },
      … }]
}
{
  _id: 2,
  name: "Plainmont Hospital",
  city: "Omaha",
  beds: 85,
  physicians: [
    {
      id: 63633,
      name: "Harold Green",
      address: { … },
      … },
    {
      id: 12345,
      name: "Joe Doctor",
      address: { … },
      … }]
}
Data Duplication is Fine! (usually)
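A sketch of what the "(usually)" hides: with embedded copies, reads stay simple, but a change to a physician has to be applied to every copy. The data is an abbreviated version of the two hospital documents above. The function names and the new name "Joseph Doctor" are made up for illustration.

```javascript
// Abbreviated copies of the two hospital documents.
const hospitals = [
  { _id: 1, name: "Oak Valley Hospital", physicians: [
    { id: 12345, name: "Joe Doctor" },
    { id: 12346, name: "Mary Well" },
  ] },
  { _id: 2, name: "Plainmont Hospital", physicians: [
    { id: 63633, name: "Harold Green" },
    { id: 12345, name: "Joe Doctor" },
  ] },
];

// Read: every hospital document already contains its physicians.
function hospitalsFor(physicianId) {
  return hospitals
    .filter((hospital) => hospital.physicians.some((p) => p.id === physicianId))
    .map((hospital) => hospital.name);
}

// Write: a duplicated physician must be updated in every copy.
function renamePhysician(physicianId, newName) {
  let copiesUpdated = 0;
  for (const hospital of hospitals) {
    for (const physician of hospital.physicians) {
      if (physician.id === physicianId) {
        physician.name = newName;
        copiesUpdated += 1;
      }
    }
  }
  return copiesUpdated;
}

console.log(hospitalsFor(12345));                     // both hospitals
console.log(renamePhysician(12345, "Joseph Doctor")); // 2 copies touched
```

Duplication tends to pay off when the duplicated data is read often and changed rarely.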
#MDBlocal
Many : Many Relationship using References
Hospitals
{
  _id: 1,
  name: "Oak Valley Hospital",
  city: "New York",
  beds: 131,
  physicians: [12345, 12346]
}
{
  _id: 2,
  name: "Plainmont Hospital",
  city: "Omaha",
  beds: 85,
  physicians: [63633, 12346]
}
Physicians
{
  _id: 63633,
  name: "Harold Green",
  hospitals: [2],
  … }
{
  _id: 12345,
  name: "Joe Doctor",
  hospitals: [1],
  … }
{
  _id: 12346,
  name: "Mary Well",
  hospitals: [1, 2],
  … }
Note that references can go in either direction. MongoDB has no foreign key constraints, so the database does not enforce that the two sides agree.
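A sketch of using references in both directions to answer the two questions from the relational slide, plus a check for one-sided references, since nothing in the database keeps the two arrays in agreement. The sample data is adapted from the slide so that both sides agree, plain arrays stand in for the two collections, and all function names are hypothetical.

```javascript
// Plain arrays standing in for the Hospitals and Physicians collections.
const hospitals = [
  { _id: 1, name: "Oak Valley Hospital", physicians: [12345, 12346] },
  { _id: 2, name: "Plainmont Hospital", physicians: [63633, 12346] },
];
const physicians = [
  { _id: 63633, name: "Harold Green", hospitals: [2] },
  { _id: 12345, name: "Joe Doctor", hospitals: [1] },
  { _id: 12346, name: "Mary Well", hospitals: [1, 2] },
];

// "Who works at this hospital?" follows hospital -> physician references.
function physiciansAt(hospitalId) {
  const hospital = hospitals.find((h) => h._id === hospitalId);
  if (!hospital) return [];
  return physicians.filter((p) => hospital.physicians.includes(p._id)).map((p) => p.name);
}

// "At which clinics does my doctor work?" follows physician -> hospital references.
function hospitalsOf(physicianId) {
  const physician = physicians.find((p) => p._id === physicianId);
  if (!physician) return [];
  return hospitals.filter((h) => physician.hospitals.includes(h._id)).map((h) => h.name);
}

// Report references that exist on one side only.
function oneSidedReferences() {
  const problems = [];
  for (const hospital of hospitals) {
    for (const physicianId of hospital.physicians) {
      const physician = physicians.find((p) => p._id === physicianId);
      if (!physician || !physician.hospitals.includes(hospital._id)) {
        problems.push(`hospital ${hospital._id} -> physician ${physicianId}`);
      }
    }
  }
  for (const physician of physicians) {
    for (const hospitalId of physician.hospitals) {
      const hospital = hospitals.find((h) => h._id === hospitalId);
      if (!hospital || !hospital.physicians.includes(physician._id)) {
        problems.push(`physician ${physician._id} -> hospital ${hospitalId}`);
      }
    }
  }
  return problems;
}

console.log(physiciansAt(1));
console.log(hospitalsOf(12346));
console.log(oneSidedReferences());
```

Storing the reference on both sides makes each question a direct lookup, at the cost of two writes whenever a physician joins or leaves a hospital.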
#MDBlocal
Key Points
● Schema design in MongoDB is important
● Differs from relational, but some concepts are still useful
● MongoDB's new patterns are more difficult to implement in relational DBs
● Many performance issues are due to poor schema design
#MDBlocal
Useful Tools
#MDBlocal
Atlas
Deploy, operate, and scale a MongoDB database in the cloud with just a few clicks
“Database as a Service”
Cloud-hosted MongoDB instances
● You pick the cloud provider and plan, including a free tier on AWS!
Stitch
Configure your backend application in the cloud.
“Backend as a Service”
Cloud-hosted app engine that does a lot more than just access your MongoDB instance:
● Integrate with 3rd party services
● Use OAuth services
● Create custom functions for event-driven programming
#MDBlocal
Compass
#MDBlocal
THANK YOU
FOR JOINING US!
