JSON Data Modeling
Matthew D. Groves, @mgroves
Modeling Data in a Relational World
[Diagram: related tables Customer, Billing, Connections, Purchases, Contacts]
AGENDA
01/ Why NoSQL?
02/ JSON Data Modeling
03/ Accessing Data
04/ Migrating Data
05/ Summary / Q&A
Why NoSQL?
NoSQL Landscape
Document
• Couchbase
• MongoDB
• DynamoDB
• CosmosDB
Graph
• OrientDB
• Neo4J
• CosmosDB
Key-Value
• Couchbase
• DynamoDB
• CosmosDB
• Redis
Wide Column
• HBase
• Cassandra
• CosmosDB
NoSQL Landscape
• Get by key(s)
• Set by key(s)
• Replace by key(s)
• Delete by key(s)
Document
• Couchbase
• MongoDB
• DynamoDB
• CosmosDB
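The four key-based operations listed above can be sketched with a toy in-memory store. This is only an illustration, not any vendor's client API: the class name `KeyValueStore` and its method names are assumptions made for this example, and the distinction drawn here (set creates or overwrites, replace requires an existing key) is one common reading of those operations.

```python
class KeyValueStore:
    """Toy in-memory stand-in for a key-value store (hypothetical, not a real client)."""

    def __init__(self):
        self._docs = {}

    def get(self, key):
        # Get by key: raises KeyError if the key does not exist.
        return self._docs[key]

    def set(self, key, doc):
        # Set by key: creates the document or overwrites an existing one.
        self._docs[key] = doc

    def replace(self, key, doc):
        # Replace by key: only succeeds if the key already exists.
        if key not in self._docs:
            raise KeyError(key)
        self._docs[key] = doc

    def delete(self, key):
        # Delete by key: raises KeyError if the key does not exist.
        del self._docs[key]


store = KeyValueStore()
store.set("user::120902", {"name": "matt groves"})
store.replace("user::120902", {"firstName": "matt", "lastName": "groves"})
print(store.get("user::120902"))
store.delete("user::120902")
```

Every operation takes the key as its first argument, which is why the choice of key matters so much in a key-value model.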
What's NoSQL?
Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved.
Why NoSQL? Scalability
Why NoSQL? Flexibility
DocumentKey: user::120902
{
  "name" : "matt groves"
}
DocumentKey: user::930912
{
  "firstName" : "jeff",
  "lastName" : "morris"
}
Why NoSQL? Availability
Why NoSQL? Performance
Use Cases for NoSQL
• Communication
• Gaming
• Advertising
• Travel booking
• Loyalty programs
• Fraud monitoring
• Social media
• Finance
• Caching
• Session
• User profile
• Catalog
• Content management
• Personalization
• Customer 360
• IoT
https://www.couchbase.com/customers
Use Cases
JSON Data Modeling
Properties of Real-World Data
Modeling Data in a Relational World
[Diagram: related tables Customer, Billing, Connections, Purchases, Contacts]
Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30
Customer DocumentKey: CBL20
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30"
}
Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30
Table: Purchases
CustomerID  Item    Amount   Date
CBL2015     laptop  1499.99  2019-03
Customer DocumentKey: CBL20
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    }
  ]
}
Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30
Table: Purchases
CustomerID  Item    Amount   Date
CBL2015     laptop  1499.99  2019-03
CBL2015     phone   99.99    2018-12
Customer DocumentKey: CBL20
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}
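The mapping from the Customer and Purchases rows to a single nested document can be sketched as follows. This is a minimal illustration under stated assumptions: the rows are represented as plain dictionaries, and the function name `build_customer_documents` is hypothetical. The lowercase field names inside "Purchases" follow the JSON shown on the slide.

```python
import json

# Rows as they might come out of the two relational tables.
customer_rows = [
    {"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"},
]
purchase_rows = [
    {"CustomerID": "CBL2015", "Item": "laptop", "Amount": 1499.99, "Date": "2019-03"},
    {"CustomerID": "CBL2015", "Item": "phone", "Amount": 99.99, "Date": "2018-12"},
]


def build_customer_documents(customers, purchases):
    """Nest each customer's purchases inside that customer's document."""
    docs = {}
    for c in customers:
        # The primary key becomes the document key, not a field in the document.
        docs[c["CustomerID"]] = {"Name": c["Name"], "DOB": c["DOB"], "Purchases": []}
    for p in purchases:
        # The foreign key tells us which parent document receives the child.
        docs[p["CustomerID"]]["Purchases"].append(
            {"item": p["Item"], "amount": p["Amount"], "date": p["Date"]}
        )
    return docs


documents = build_customer_documents(customer_rows, purchase_rows)
print(json.dumps(documents["CBL2015"], indent=2))
```

The foreign key disappears from the child objects because the nesting itself now expresses the relationship.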
Table: Connections
CustomerID  ConnId  Relation
CBL2015     XYZ987  Brother
CBL2015     SKR007  Father
Customer DocumentKey: CBL
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-...",
      "expiry" : "2019-03"
    }, ...
  ],
  "Connections" : [
    {
      "ConnId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "ConnId" : "SKR007",
      "Relation" : "Father"
    }
  ]
}
DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "cardnum" : "5827-2842…",
  "expiry" : "2019-03",
  "cardType" : "visa",
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "CustId" : "SKR007",
      "Relation" : "Father"
    }
  ],
  "Purchases" : [
    { "id" : 12, "item" : "mac", "amt" : 2823.52 },
    { "id" : 19, "item" : "ipad2", "amt" : 623.52 }
  ]
}
Table: Customer
CustomerID  Name        DOB         Cardnum     Expiry   CardType
CBL2015     Jane Smith  1990-01-30  5827-2842…  2019-03  visa
Table: Connections
CustomerID  ConnId  Relation
CBL2015     XYZ987  Brother
CBL2015     SKR007  Father
Table: Purchases
CustomerID  item   amt
CBL2015     mac    2823.52
CBL2015     ipad2  623.52
Table: Contacts
CustomerID  ConnId  Name
CBL2015     XYZ987  Joe Smith
CBL2015     SKR007  Sam Smith
DocumentKey: CBL2016
{
  "Name" : "Bob Jones",
  "DOB" : "1980-01-29",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5927-2842-2847-3909",
      "expiry" : "2020-03"
    },
    {
      "type" : "master",
      "cardnum" : "6273-2842-2847-3909",
      "expiry" : "2019-11"
    }
  ],
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "CustId" : "PQR823",
      "Relation" : "Father"
    }
  ],
  "Purchases" : [
    { "id" : 12, "item" : "mac", "amt" : 2823.52 },
    { "id" : 19, "item" : "ipad2", "amt" : 623.52 }
  ]
}
Table: Customer
CustomerID  Name       DOB
CBL2016     Bob Jones  1980-01-29
Table: Billing
CustomerID  Type    Cardnum  Expiry
CBL2016     visa    5927…    2020-03
CBL2016     master  6273…    2019-11
Table: Connections
CustomerID  ConnId  Relation
CBL2016     XYZ987  Brother
CBL2016     SKR007  Father
Table: Purchases
CustomerID  item   amt
CBL2016     mac    2823.52
CBL2016     ipad2  623.52
Table: Contacts
CustomerID  ConnId  Name
CBL2016     XYZ987  Joe Smith
CBL2016     SKR007  Sam Smith
Versioning approach 1: Version Numbers
DocumentKey: user::120902
{
  "name" : "matt groves",
  "version" : 1
}
DocumentKey: user::930912
{
  "firstName" : "jeff",
  "lastName" : "morris",
  "version" : 2
}
Versioning approach 2: Big Bang Re-versioning
DocumentKey: user::120902 (before)
{
  "name" : "matt groves"
}
DocumentKey: user::120902 (after)
{
  "firstName" : "matt",
  "lastName" : "groves"
}
Versioning approach 3: Cooperative Re-versioning
DocumentKey: user::120902 (before)
{
  "name" : "matt groves"
}
DocumentKey: user::120902 (after)
{
  "firstName" : "matt",
  "lastName" : "groves"
}
[Diagram label: Web application]
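One way to read the cooperative approach is that the web application upgrades each old-shaped document the first time it reads it, then writes the new shape back. The sketch below illustrates that reading only; the function names `upgrade_user` and `read_user`, the use of a plain dictionary as the store, and splitting the name on the first space are all assumptions made for this example.

```python
def upgrade_user(doc):
    """Return the document in the newer shape; already-upgraded documents pass through."""
    if "firstName" in doc and "lastName" in doc:
        return doc
    # Assumption: the old "name" field is "<first> <last>", split on the first space.
    first, _, last = doc["name"].partition(" ")
    upgraded = {k: v for k, v in doc.items() if k != "name"}
    upgraded["firstName"] = first
    upgraded["lastName"] = last
    return upgraded


def read_user(store, key):
    """Read a user document, upgrading and writing it back if it is still old-shaped."""
    doc = store[key]
    upgraded = upgrade_user(doc)
    if upgraded is not doc:
        store[key] = upgraded  # write the new shape back under the same key
    return upgraded


store = {
    "user::120902": {"name": "matt groves"},
    "user::930912": {"firstName": "jeff", "lastName": "morris"},
}
print(read_user(store, "user::120902"))
print(read_user(store, "user::930912"))
```

With this design, documents that are never read stay in the old shape, so the application has to keep tolerating both shapes until every document has been touched.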
Modeling tools
• Hackolade
• Erwin DM NoSQL
• Idera ER/Studio
• http://jsoneditoronline.org
Accessing Data
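Key/Value
// _collection is assumed to be a Couchbase collection (ICouchbaseCollection).
// Fetch a shopping cart document by key and deserialize it.
public async Task<ShoppingCart> GetCartById(string id)
{
    var cart = await _collection.GetAsync(id);
    return cart.ContentAs<ShoppingCart>();
}
// Insert a new shopping cart under a freshly generated GUID key.
public async Task CreateShoppingCart()
{
    await _collection.InsertAsync(
        Guid.NewGuid().ToString(),
        new ShoppingCart { . . . }
    );
}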
Key/Value: Recommendations for keys
• Natural Keys
• Human Readable
• Deterministic
• Semantic
Key/Value: Example keys
• author::matt
• author::matt::blogs
• blog::csharp_9_features
• blog::csharp_9_features::comments
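Keys like the ones above can be built deterministically from values the application already knows, so related documents can be fetched without a query. The helper below is a small sketch of that idea; the function name `make_key` and the rule that parts may not be empty or contain the separator are assumptions for this example, not part of any SDK.

```python
SEPARATOR = "::"


def make_key(*parts):
    """Join key parts with '::', rejecting parts that would make the key ambiguous."""
    if not parts:
        raise ValueError("a key needs at least one part")
    for part in parts:
        if not part or SEPARATOR in part:
            raise ValueError(f"invalid key part: {part!r}")
    return SEPARATOR.join(parts)


author_key = make_key("author", "matt")
author_blogs_key = make_key("author", "matt", "blogs")
comments_key = make_key("blog", "csharp_9_features", "comments")
print(author_key, author_blogs_key, comments_key)
```

Because the key for an author's blog list is derived from the author key, an application holding `author::matt` can compute `author::matt::blogs` directly.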
Modeling your data: Strategies / rules of thumb
Relationship is one-to-one or one-to-many
Store related data as nested objects
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}
Relationship is many-to-one or many-to-many
Store related data as separate documents
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Connections" : [
    "XYZ987",
    "PQR823",
    "PQR828"
  ]
}
Data reads are mostly parent fields
Store children as separate documents
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Connections" : [
    "XYZ987",
    "PQR823",
    "PQR828"
  ]
}
Data reads are mostly parent + child fields
Store children as nested objects
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}
Data writes are mostly parent or child (not both)
Store children as separate documents
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Connections" : [
    "XYZ987",
    "PQR823",
    "PQR828"
  ]
}
Data writes are mostly parent and child (both)
Store children as nested objects
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}
Modeling your data: Key/Value Strategies
If …                                               Then …
Relationship is one-to-one or one-to-many          Store related data as nested objects
Relationship is many-to-one or many-to-many        Store related data as separate documents
Data reads are mostly parent fields                Store children as separate documents
Data reads are mostly parent + child fields        Store children as nested objects
Data writes are mostly parent or child (not both)  Store children as separate documents
Data writes are mostly parent and child (both)     Store children as nested objects
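When children are stored as separate documents, the parent holds only their keys, and the application fetches the children by key when it needs them. The sketch below shows that access pattern with a dictionary standing in for the database; the function name `load_with_connections` and the names attached to the connection documents are illustrative assumptions.

```python
# A dictionary stands in for the database: document key -> document.
documents = {
    "CBL2015": {
        "Name": "Jane Smith",
        "DOB": "1990-01-30",
        "Connections": ["XYZ987", "PQR823", "PQR828"],
    },
    # Illustrative child documents; "PQR828" is deliberately missing.
    "XYZ987": {"Name": "Joe Smith"},
    "PQR823": {"Name": "Sam Smith"},
}


def load_with_connections(store, key):
    """Fetch the parent, then fetch each referenced child by its key."""
    parent = store[key]
    children = {}
    for child_key in parent.get("Connections", []):
        # A reference may point at a document that no longer exists.
        if child_key in store:
            children[child_key] = store[child_key]
    return parent, children


parent, children = load_with_connections(documents, "CBL2015")
print(parent["Name"], sorted(children))
```

Reading only the parent costs one lookup, while reading parent and children costs one lookup per reference, which is the trade-off the read and write rules above describe.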
Subdocument access
{
"username": "mgroves",
"profile": {
"phoneNumber": "123-456-7890",
"address": {
"street": "123 main st",
"city": "Grove City",
"state": "Ohio"
}
}
}
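Subdocument access means reading or changing one nested field by its path rather than handling the whole document. The sketch below shows only the path-walking idea on an in-memory dictionary; it does not call any database SDK, and the helper names `get_path` and `set_path`, the dotted path syntax, and the replacement phone number are assumptions for this example.

```python
profile_doc = {
    "username": "mgroves",
    "profile": {
        "phoneNumber": "123-456-7890",
        "address": {"street": "123 main st", "city": "Grove City", "state": "Ohio"},
    },
}


def get_path(doc, path):
    """Return the value at a dotted path such as 'profile.address.city'."""
    current = doc
    for part in path.split("."):
        current = current[part]
    return current


def set_path(doc, path, value):
    """Set the value at a dotted path; the parent objects must already exist."""
    *parents, leaf = path.split(".")
    current = doc
    for part in parents:
        current = current[part]
    current[leaf] = value


city = get_path(profile_doc, "profile.address.city")
set_path(profile_doc, "profile.phoneNumber", "555-000-0000")
print(city, profile_doc["profile"]["phoneNumber"])
```

In a real database the benefit comes from the server doing this walk, so only the requested fragment crosses the network.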
Other ways to access data (Couchbase)
• Key-Value (CRUD): Documents
• N1QL (SQL Query): Indexes
• Full Text (Search): Indexes
• Views (JS Query): MapReduce
• Analytics (Query): SQL++
N1QL
Understanding your Query Plan
Full Text Search
Accessing your data: Strategies and recommendations
(Each concept is followed by its strategies and recommendations.)
Key-Value operations provide the best possible performance
• Create an effective key naming strategy
• Create an optimized data model
Full Text Search is well-suited to text
• Facets / ranges / geography
• Language aware
• Inverted index
N1QL queries provide the most flexibility (everything else)
• Query data regardless of how it is modeled
• Good indexing is vital
• B-Tree
Migrating Data
How do you migrate?
1. Rewrite: No migration, write the whole thing over
2. Redesign Schema: Keep your business logic, rewrite your data layer, and totally redesign your schema as a NoSQL-optimized model
3. Refactor First: Keep everything but refactor your data logic and RDBMS schema into a NoSQL-optimized model
4. Optimize Later: Host your schema with as few changes as possible, get the application running on the new technology, refactor/optimize the schema as necessary for performance
5. Just Host It: Host your schema with as few changes as possible
[Diagram axes: Risk, Effort]
Migration options: Tools
Migration options: BYO
Migration options: KISS (level 5)
[Diagram labels: Relational, Export, Transform, Import, NoSQL (raw), NoSQL (optimized)]
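The export, transform, and import steps for a "just host it" migration can be sketched end to end with Python's built-in `sqlite3` module standing in for the relational source and a dictionary standing in for the NoSQL target. Everything here is an assumption for illustration: the function names, the `Table::PrimaryKey` key format, and the added `type` field are not taken from the SqlServerToCouchbase tool.

```python
import sqlite3


def export_rows(conn, table):
    """Export: read every row of a table as a plain dictionary."""
    conn.row_factory = sqlite3.Row
    # The table name is interpolated directly, so it must come from trusted code.
    return [dict(row) for row in conn.execute(f"SELECT * FROM {table}")]


def transform(table, row, key_column):
    """Transform: one row becomes one document, keyed by table name and primary key."""
    key = f"{table}::{row[key_column]}"
    doc = dict(row)
    doc["type"] = table  # assumed convention so documents from different tables can be told apart
    return key, doc


def import_documents(target, pairs):
    """Import: write each (key, document) pair into the target store."""
    for key, doc in pairs:
        target[key] = doc


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT, DOB TEXT)")
source.execute("INSERT INTO Customer VALUES ('CBL2015', 'Jane Smith', '1990-01-30')")

bucket = {}
rows = export_rows(source, "Customer")
import_documents(bucket, (transform("Customer", r, "CustomerID") for r in rows))
print(bucket)
```

Each row maps to exactly one document with no nesting, which keeps the migration simple and leaves any optimization of the model for later.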
Migration options: KISS (levels 4, 3, 2)
[Diagram labels: Relational, Export, Transform, Import, NoSQL "staging", NoSQL (optimized)]
Migration Recommendations: Align
SqlServerToCouchbase
Demo
Summary
• Pick the right application
• Proof of Concept
• Match the data access method to requirements
Next Steps
• Download Couchbase 7: https://couchbase.com/downloads
• https://connect.couchbase.com
• https://github.com/mgroves/SqlServerToCouchbase

Data Modeling and Relational to NoSQL
