The document discusses JSON data modeling and accessing data in NoSQL databases. It covers why organizations adopt NoSQL, modeling data in relational versus JSON document models, strategies for modeling different types of data, and methods for accessing data, including key-value operations, queries using N1QL and MapReduce, and migrating data into NoSQL databases from relational sources. The presentation aims to help attendees understand how to design their data model and choose the best approach to working with data in a NoSQL database like Couchbase.
JSON Data Modeling - July 2018 - Tulsa Techfest - Matthew Groves
If you’re thinking about using a document database, it can be intimidating to start. A flexible data model gives you a lot of choices, but which way is the right way? Is a document database even the right tool? In this session we’ll go over the basics of data modeling using JSON. We’ll compare and contrast with traditional RDBMS modeling. Impact on application code will be discussed, as well as some tooling that could be helpful along the way. The examples use the free, open-source Couchbase Server document database, but the principles from this session can also be applied to CosmosDb, Mongo, RavenDb, etc.
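The central choice this session describes, embedding related data in one document versus referencing it by key as you would in an RDBMS, can be pictured with a small sketch. The document shapes and key names below are invented for illustration, not taken from the talk.

```python
# A minimal sketch (illustrative, not from the talk) contrasting the two
# basic JSON modeling choices: embedding related data in one document
# versus referencing it by key. Key names and shapes are assumptions.

# Embedded: the order carries its line items; one read fetches everything.
order_embedded = {
    "id": "order::1001",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [
        {"sku": "widget", "qty": 2, "price": 9.99},
        {"sku": "gadget", "qty": 1, "price": 24.99},
    ],
}

# Referenced: the order stores keys to other documents; the application
# resolves them with extra key-value lookups, like a foreign-key join.
customers = {"customer::7": {"name": "Ada", "email": "ada@example.com"}}
order_referenced = {"id": "order::1001", "customer_id": "customer::7"}

def order_total(order):
    """Total an embedded order without any joins."""
    return sum(i["qty"] * i["price"] for i in order["items"])

def load_customer(order):
    """Resolve a reference with a key lookup."""
    return customers[order["customer_id"]]

print(round(order_total(order_embedded), 2))
print(load_customer(order_referenced)["name"])
```

Embedding favors read performance and atomicity of a single document; referencing favors data that is shared, unbounded, or updated independently.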
Bringing SQL to NoSQL: Rich, Declarative Query for NoSQL - Keshav Murthy
Abstract
NoSQL databases bring the benefits of schema flexibility and elastic scaling to the enterprise. Until recently, these benefits have come at the expense of giving up rich declarative querying as represented by SQL.
In today’s world of agile business, developers and organizations need the benefits of both NoSQL and SQL in a single platform. NoSQL (document) databases provide schema flexibility, fast lookup, and elastic scaling. SQL-based querying provides expressive data access and transformation; separation of querying from modeling and storage; and a unified interface for applications, tools, and users.
Developers need to deliver applications that can easily evolve, perform, and scale. Otherwise, the cost, effort, and delay in keeping up with changing business needs will become significant disadvantages. Organizations need sophisticated and rapid access to their operational data in order to maintain insight into their business. This access should support both pre-defined and ad-hoc querying, and should integrate with standard analytical tools.
This talk will cover how to build applications that combine the benefits of NoSQL and SQL to deliver agility, performance, and scalability. It includes:
- N1QL, which extends SQL to JSON
- JSON data modeling
- Indexing and performance
- Transparent scaling
- Integration and ecosystem
You will walk away with an understanding of the design patterns and best practices for effective use of NoSQL document databases - all using open-source technologies.
Agile Data Engineering: Introduction to Data Vault 2.0 (2018) - Kent Graziano
(updated slides used for North Texas DAMA meetup Oct 2018) As we move more and more towards the need for everyone to do Agile Data Warehousing, we need a data modeling method that can be agile with us. Data Vault Data Modeling is an agile data modeling technique for designing highly flexible, scalable, and adaptable data structures for enterprise data warehouse repositories. It is a hybrid approach using the best of 3NF and dimensional modeling. It is not a replacement for star schema data marts (and should not be used as such). This approach has been used in projects around the world (Europe, Australia, USA) for over 15 years and is now growing in popularity. The purpose of this presentation is to provide attendees with an introduction to the components of the Data Vault Data Model, what they are for and how to build them. The examples will give attendees the basics:
• What the basic components of a DV model are
• How to build and design structures incrementally, without constant refactoring
Relational databases were conceived to digitize paper forms and automate well-structured business processes, and still have their uses. But RDBMS cannot model or store data and its relationships without complexity, which means performance degrades with the increasing number and levels of data relationships and data size. Additionally, new types of data and data relationships require schema redesign that increases time to market.
A graph database like Neo4j naturally stores, manages, analyzes, and uses data within the context of connections, meaning Neo4j delivers faster query performance and far greater flexibility in handling complex hierarchies than SQL. Join this webinar to learn why companies are shifting away from RDBMS toward graphs to unlock the business value in their data relationships.
Ryan Boyd, Developer Relations at Neo4j
Ryan is a SF-based software engineer focused on helping developers understand the power of graph databases. Previously he was a product manager for architectural software, built applications and web hosting environments for higher education, and worked in developer relations for twenty products during his 8 years at Google. He enjoys cycling, sailing, skydiving, and many other adventures when not in front of his computer.
Couchbase and Apache Kafka - Bridging the Gap Between RDBMS and NoSQL - DATAVERSITY
Thousands of companies, from Uber and Netflix to Goldman Sachs and Cisco, use Apache Kafka to transform and reshape their data architectures. Kafka is frequently used as the bridge between legacy RDBMS and new NoSQL database systems, effectively transforming SQL table data into JSON documents and vice versa. Many companies also use Kafka for business-critical applications that drive real-time stream processing and analytics, intersystem messaging, high-volume data ingestion, and operational metrics collection.
Couchbase and Kafka can be used together to address high throughput, distributed data management, and transformation challenges.
In this webinar we’ll explore:
Where Kafka fits into the big data ecosystem
How companies are using Kafka for both real-time processing and as a bus for data exchange
An example of how Kafka can bridge legacy RDBMS and new NoSQL database systems
Several real-world use case architectures
[Given at DAMA WI, Nov 2018] With the increasing prevalence of semi-structured data from IoT devices, web logs, and other sources, data architects and modelers have to learn how to interpret and project data from things like JSON. While the concept of loading data without upfront modeling is appealing to many, ultimately, in order to make sense of the data and use it to drive business value, we have to turn that schema-on-read data into a real schema! That means data modeling! In this session I will walk through both simple and complex JSON documents, decompose them, then turn them into a representative data model using Oracle SQL Developer Data Modeler. I will show you how they might look using both traditional 3NF and data vault styles of modeling. In this session you will:
1. See what a JSON document looks like
2. Understand how to read it
3. Learn how to convert it to a standard data model
Presentation at Data/Graph Day Texas Conference.
Austin, Texas
January 14, 2017
This talk grew out of Juan Sequeda's office hours following the Seattle Graph Meetup. Some of the questions posed were: How do I recognize a problem best solved with a graph solution? How do I determine the best type of graph to solve the problem? How do I manage the data when both graph and relational operations will be performed? Juan did such a great job of explaining the options that we asked him to develop his responses into a formal talk.
Virtualizing Relational Databases as Graphs: A Multi-Model Approach - Juan Sequeda
Talk given at Smart Data 2017
Relational databases are inflexible due to the rigid constraints of the relational data model. If you have new data that doesn’t fit your schema, you need to alter the schema (add a column or a new table), which is not always possible: IT departments don't have the time, or won't allow it, and the common workaround of extra nullable columns can degrade query performance.
A goal of graph databases is to address this problem with their schema-less graph data model. However, many businesses have large investments in commercial RDBMSs and their associated applications and can't expect to move all of their data to a graph database.
In this talk, I will present a multi-model graph/relational architecture solution. Keep your relational data where it is, virtualize it as a graph, and then connect it with additional data stored in a graph database. This way, both graph and relational technologies can seamlessly interact together.
Integrating Semantic Web in the Real World: A Journey between Two Cities - Juan Sequeda
Keynote at The 9th International Conference on Knowledge Capture (KCAP2017), Austin, Texas, Dec 2017
An early vision in Computer Science has been to create intelligent systems capable of reasoning on large amounts of data. Today, this vision can be delivered by integrating Relational Databases with the Semantic Web using the W3C standards: a graph data model (RDF), ontology language (OWL), mapping language (R2RML) and query language (SPARQL). The research community has successfully been showing how intelligent systems can be created with Semantic Web technologies, dubbed now as Knowledge Graphs.
However, where is the mainstream industry adoption? What are the barriers to adoption? Are these engineering and social barriers or are they open scientific problems that need to be addressed?
This talk will chronicle our journey of deploying Semantic Web technologies with real world users to address Business Intelligence and Data Integration needs, describe technical and social obstacles that are present in large organizations, and scientific challenges that require attention.
Big data, agile development, and cloud computing are driving new requirements for database management systems. These requirements are in turn driving the next phase of growth in the database industry, mirroring the evolution of the OLAP industry. This document describes this evolution, the new application workload, and how MongoDB is uniquely suited to address these challenges.
Graph Query Languages: Update from LDBC - Juan Sequeda
The Linked Data Benchmark Council (LDBC) is a non-profit organization dedicated to establishing benchmarks, benchmark practices and benchmark results for graph data management software. The Graph Query Language task force of LDBC is studying query languages for graph data management systems, and specifically those systems storing so-called Property Graph data. The goals of the GraphQL task force are to:
Devise a list of desired features and functionalities of a graph query language.
Evaluate a number of existing languages (e.g., Cypher, Gremlin, PGQL, SPARQL, SQL) and identify possible issues.
Provide a better understanding of the design space and state-of-the-art.
Develop proposals for changes to existing query languages or even a new graph query language.
This query language should cover the needs of the most important use-cases for such systems, such as social network and Business Intelligence workloads.
This talk will present an update of the work accomplished by the LDBC GraphQL task force. We also look for input from the graph community.
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach - DATAVERSITY
After three decades of relational data modeling, everyone’s pretty comfortable with schemas, tables, and entity-relationships. As more and more Global 2000 companies choose NoSQL databases to power their Digital Economy applications, they need to think about how to best model their data. How do they move from a constrained, table-driven model to an agile, flexible data model based on JSON documents?
This webinar is intended for architects and application developers who want to learn about new JSON document data modeling approaches, techniques, and best practices. This webinar will show you how to get started building a JSON document data model, how to migrate a table-based data model to JSON documents, and how to optimize your design to enable fast query performance.
This webinar will provide practical, experience-based advice and best practices for modeling JSON documents, including:
- When to embed or not embed objects in your JSON document
- Data modeling using a practical data access pattern approach
- Indexing your JSON documents
- Querying your data using N1QL (SQL for JSON)
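The migration from a table-based model to JSON documents that the webinar covers can be pictured with a small sketch. The table names, columns, and document shape below are hypothetical, chosen only to illustrate how a parent/child foreign-key join becomes embedding.

```python
# A rough sketch (illustrative, not from the webinar) of migrating a
# table-based model to JSON documents: rows from a hypothetical
# `invoices` table and its `invoice_lines` child table are folded into
# one document per invoice, so the SQL join becomes embedding.

invoices = [
    {"invoice_id": 1, "customer": "ACME"},
    {"invoice_id": 2, "customer": "Globex"},
]
invoice_lines = [
    {"invoice_id": 1, "product": "anvil", "qty": 3},
    {"invoice_id": 1, "product": "rocket", "qty": 1},
    {"invoice_id": 2, "product": "widget", "qty": 5},
]

def to_documents(parents, children, key):
    """Group child rows under their parent rows to form documents."""
    docs = {}
    for p in parents:
        doc = dict(p)
        doc["lines"] = []          # embed children instead of joining
        docs[p[key]] = doc
    for c in children:
        line = {k: v for k, v in c.items() if k != key}  # drop the FK
        docs[c[key]]["lines"].append(line)
    return list(docs.values())

docs = to_documents(invoices, invoice_lines, "invoice_id")
print(docs[0]["customer"], len(docs[0]["lines"]))  # ACME 2
```

Each resulting document is self-contained, so the access pattern "fetch an invoice with its lines" becomes a single key-value read instead of a join.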
JSON Data Modeling - GDG Indy - April 2020 - Matthew Groves
Presented virtually at GDG Indy - https://www.meetup.com/indy-gdg/events/269467916/
If you’re thinking about using a document database, it can be intimidating to start. A flexible data model gives you a lot of choices, but which way is the right way? Is a document database even the right tool? In this session we’ll go over the basics of data modeling using JSON. We’ll compare and contrast with traditional RDBMS modeling. Impact on application code will be discussed, as well as some tooling that could be helpful along the way. The examples use the free, open-source Couchbase Server document database, but the principles from this session can also be applied to CosmosDb, Mongo, RavenDb, etc.
JSON Data Modeling in a Document Database - DATAVERSITY
Making the move to a document database can be intimidating. Yes, its flexible data model gives you a lot of choices, but it also raises questions: Which way is the right way? Is a document database even the right tool?
Join this live session on the basics of data modeling with JSON to learn:
- How a document database compares to a traditional RDBMS
- What JSON data modeling means for your application code
- Which tools might be helpful along the way
The examples in this session use the free, open-source Couchbase Server document database, but the principles you’ll learn can also be applied to Cosmos DB, MongoDB, RavenDB, and others.
SQL for JSON: Rich, Declarative Querying for NoSQL Databases and Applications - Keshav Murthy
In today’s world of agile business, Java developers and organizations benefit when JSON-based NoSQL databases and SQL-based querying come together. NoSQL provides schema flexibility and elastic scaling. SQL provides expressive, independent data access. Java developers need to deliver apps that readily evolve, perform, and scale with changing business needs. Organizations need rapid access to their operational data, using standard analytical tools, for insight into their business. In this session, you will learn to build apps that combine NoSQL and SQL for agility, performance, and scalability. This includes:
• JSON data modeling
• Indexing
• Tool integration
Querying NoSQL with SQL - KCDC - August 2017 (Matthew Groves)
Until recently, agile business had to choose between the benefits of JSON-based NoSQL databases and the benefits of SQL-based querying. NoSQL provides schema flexibility, high performance, and elastic scaling, while SQL provides expressive, independent data access. Recent convergence allows developers and organizations to have the best of both worlds.
Developers need to deliver apps that readily evolve, perform, and scale, all to match changing business needs. Organizations need rapid access to their operational data, using standard analytical tools, for insight into their business. In this session, you will learn the ways that SQL can be applied to NoSQL databases (N1QL, SQL++, ODBC, JDBC, and others), and what additional features are needed to deal with JSON documents. SQL for JSON, JSON data modeling, indexing, and tool integration will be covered.
From SQL to NoSQL: Structured Querying for JSON (Keshav Murthy)
Can SQL be used to query JSON? SQL is the universally known structured query language, used for well-defined, uniformly structured data, while JSON is the lingua franca of flexible data management, used to define complex, variably structured data objects.
Yes! SQL can most definitely be used to query JSON, with Couchbase's SQL query language for JSON, called N1QL (pronounced "nickel").
In this session, we will explore how N1QL extends SQL to provide the flexibility and agility inherent in JSON while leveraging the universality of SQL as a query language.
We will discuss utilizing SQL to query complex JSON objects that include arrays, sets and nested objects.
You will learn about the powerful query expressiveness of N1QL, including the latest features that have been added to the language. We will cover how using N1QL can solve your real-world application challenges, based on the actual queries of Couchbase end-users.
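As a sketch of what querying nested JSON involves, the snippet below emulates in plain Python what an UNNEST-style N1QL query over an array of order documents would return. The data, names, and the N1QL text in the comment are illustrative assumptions, not actual Couchbase output:

```python
# Sample documents, as a NoSQL bucket might hold them (hypothetical data).
orders = [
    {"id": "o1", "customer": "Acme",
     "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
    {"id": "o2", "customer": "Globex",
     "items": [{"sku": "A", "qty": 5}]},
]

# Roughly what a query like the following would return (N1QL sketch):
#   SELECT o.customer, i.sku, i.qty
#   FROM orders o UNNEST o.items i
#   WHERE i.qty > 1;
result = [
    {"customer": o["customer"], "sku": i["sku"], "qty": i["qty"]}
    for o in orders
    for i in o["items"]   # UNNEST flattens the nested array into rows
    if i["qty"] > 1
]
```

The point of the extension is exactly this flattening: plain SQL has no notion of an array inside a row, so SQL-for-JSON languages add operators to step into nested structures.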
The Data & Analytics Journey – Why it’s more attainable for your company than... (John Head)
The typical perception of Big Data, Analytics, and Predictive/AI is that only the big companies can reap the benefits. Many believe they need a data warehouse, expensive reporting software, and an army of data scientists to get any value out of the effort and cost. This session will explore and debunk that myth and showcase how companies of any size can participate. While there are many maturity models available, most are not designed to be practical guides to solving common business problems. Because of the explosion in cloud services, the barrier to entry has eroded significantly. We will look at some practical steps to access these capabilities and provide examples of where market-leading and growth companies have seen large benefits. Attendees will walk away with a broader understanding of what’s possible to move their company through the journey in 2017. We will take a close look at IBM Watson solutions and how they integrate with IBM Collaboration and Social solutions.
MWLUG2017 - The Data & Analytics Journey 2.0 (John Head)
During this Big Data Warehousing Meetup, Caserta Concepts and Databricks addressed the number one operational and analytic goal of nearly every organization today – to have complete view of every customer. Customer Data Integration (CDI) must be implemented to cleanse and match customer identities within and across various data systems. CDI has been a long-standing data engineering challenge, not just one of logic and complexity but also of performance and scalability.
The speakers brought together best practice techniques with Apache Spark to achieve complete CDI.
Speakers:
Joe Caserta, President, Caserta Concepts
Kevin Rasmussen, Big Data Engineer, Caserta Concepts
Vida Ha, Lead Solutions Engineer, Databricks
The sessions covered a series of problems that are adequately solved with Apache Spark, as well as those that require additional technologies to implement correctly. Topics included:
· Building an end-to-end CDI pipeline in Apache Spark
· What works, what doesn’t, and how our use of Spark evolves
· Innovation with Spark including methods for customer matching from statistical patterns, geolocation, and behavior
· Using PySpark and Python’s rich module ecosystem for data cleansing, standardization, and matching
· Using GraphX for matching and scalable clustering
· Analyzing large data files with Spark
· Using Spark for ETL on large datasets
· Applying Machine Learning & Data Science to large datasets
· Connecting BI/Visualization tools to Apache Spark to analyze large datasets internally
The speakers also touched on data governance, on-boarding new data rapidly, how to balance rapid agility and time to market with critical decision support and customer interaction. They also shared examples of problems that Apache Spark is not optimized for.
For more information on the services offered by Caserta Concepts, visit our website: http://casertaconcepts.com/
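The matching topics listed above can be sketched without a cluster. The toy example below shows the normalize-then-group step a CDI pipeline would distribute with Spark; it is pure Python for illustration, and the records, field names, and matching rule are invented:

```python
# Toy sketch of customer matching for CDI: cleanse/standardize identity
# fields, then group records that share a match key. A real pipeline would
# run this per-partition in Spark and use fuzzier rules.

records = [
    {"src": "crm", "email": "Joe@Example.com ", "name": "Joseph C."},
    {"src": "web", "email": "joe@example.com",  "name": "Joe Smith"},
    {"src": "pos", "email": "v.ha@example.org", "name": "V. Ha"},
]

def match_key(rec):
    # Cleansing/standardization: trim whitespace and lowercase before matching.
    return rec["email"].strip().lower()

# Group records sharing a match key into candidate customer clusters.
clusters = {}
for rec in records:
    clusters.setdefault(match_key(rec), []).append(rec)

# Two source systems resolve to the same customer identity.
```

In Spark terms this is essentially a `groupByKey` on the standardized key; the hard parts the talk covers (statistical patterns, geolocation, behavior) replace the exact-match rule with probabilistic ones.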
This introduction to graph databases is specifically designed for Enterprise Architects who need to map business requirements to architectural components like graph databases. It explains how and why graphs matter for Enterprise Architecture and reviews the architectural differences between relational and graph models.
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote (Caserta)
The “Big Data era” has ushered in an avalanche of new technologies and approaches for delivering information and insights to business users. What is the role of the cloud in your analytical environment? How can you make your migration as seamless as possible? This closing keynote, delivered by Joe Caserta, a prominent consultant who has helped many global enterprises adopt Big Data, provided the audience with the inside scoop needed to supplement data warehousing environments with data intelligence—the amalgamation of Big Data and business intelligence.
This presentation was given as the closing keynote at DBTA's annual Data Summit in NYC.
Webinar: Building a Multi-Cloud Strategy with Data Autonomy featuring 451 Res... (DataStax)
Data autonomy goes hand-in-hand with building a powerful, multi-cloud data management strategy. Enterprises today are rethinking their data management tactics in light of what they can achieve with the correct usage of the public clouds. In this on-demand webcast, guest speaker James Curtis, Senior Analyst, Data Platforms & Analytics, 451 Research, discussed how enterprises are getting valuable data autonomy and building game-changing multi-cloud database management strategies.
View recording: https://youtu.be/RMoEaATgGO8
Explore all DataStax webinars: https://www.datastax.com/resources/webinars
Business-centric data models are key to gaining a clear view of the data that drives the business – from customers to products to invoices and more. They offer a clear, visual way for both business and technical stakeholders to communicate around the crucial business rules and definitions that drive both operational usage of data as well as analytics and reporting. This webinar will provide practical, concrete steps in creating valuable, business-centric data models that can show immediate value to the organization, while at the same time building towards a full-enterprise view.
Data Pipelines and Tools to Integrate with Power BI and Spotfire.pdf (GregKreutzer2)
The goal is to highlight the business value of utilizing these skills in oil and gas applications while creating a workshop that can still accommodate and provide value to a larger audience that has a wide range of analytic and data science skills, roles and interests.
The Why, When, and How of NoSQL - A Practical Approach (DATAVERSITY)
More and more Fortune 1000 companies like Marriott, Cars.com, Gannett, and PayPal are choosing NoSQL over relational databases like Oracle, SQL Server, and DB2 to power their web, mobile, and IoT applications. Why? Lower costs, higher performance and availability, better agility, and easier scalability. According to The Forrester Wave™: Big Data NoSQL, Q3 2016 report, “NoSQL is no longer an option.” Come see why.
This webinar is intended for developers, architects, and database engineers who are considering NoSQL as an alternative to relational databases. If you’re looking to add NoSQL to your environment, this webinar will show you how to get started and avoid potential pitfalls.
You’ll get practical advice, including:
•Key considerations in moving from relational to NoSQL
•How to identify applications that benefit most from NoSQL
•Data modeling and querying with NoSQL
•Migrating your data to NoSQL
•Best practices for making the switch
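The "migrating your data" step above often amounts to denormalizing child rows into their parent documents. A minimal sketch, assuming hypothetical customer and address tables extracted from a relational source (plain Python; all names are invented):

```python
# Hypothetical relational extract: customers and their addresses.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
addresses = [
    {"customer_id": 1, "city": "Columbus"},
    {"customer_id": 1, "city": "Tulsa"},
    {"customer_id": 2, "city": "Austin"},
]

def to_documents(customers, addresses):
    """Fold child rows into their parent to produce one JSON document each,
    keyed by a type-prefixed document key (a common NoSQL convention)."""
    docs = {}
    for c in customers:
        docs[f"customer::{c['id']}"] = {"name": c["name"], "addresses": []}
    for a in addresses:
        docs[f"customer::{a['customer_id']}"]["addresses"].append(
            {"city": a["city"]}
        )
    return docs

docs = to_documents(customers, addresses)
```

The resulting documents can then be bulk-loaded through whatever import tooling the target database provides; the join work happens once, at migration time, instead of on every read.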
Similar to Json data modeling june 2017 - pittsburgh tech fest (20)
CREAM - That Conference Austin - January 2024.pptx (Matthew Groves)
Caching can bring speed to the slow systems in your enterprise. In this session, we'll explore how to avoid expensive computations and reduce database (or other systems) load with caching. Considerations include when to cache (cache policy), how to cache, and when/how to evict. This session will also explore and discuss problems and gotchas like cache sizing, eviction mistakes, and where your cache should live. Finally, we'll look at how caching is implemented in the "memory-first" Couchbase architecture.
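As a rough illustration of the cache-policy and eviction ideas above, here is a minimal time-to-live cache in plain Python. The class and names are invented for this sketch and are not Couchbase's (or any library's) API:

```python
import time

class TTLCache:
    """Toy cache: entries expire after a fixed TTL (the eviction policy)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def get(self, key, compute):
        now = time.monotonic()
        hit = self.store.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]            # fresh hit: skip the expensive computation
        value = compute()            # miss or stale: recompute and cache
        self.store[key] = (value, now + self.ttl)
        return value

cache = TTLCache(ttl_seconds=60)
calls = []
cache.get("report", lambda: calls.append(1) or "result")
cache.get("report", lambda: calls.append(1) or "result")
# The expensive compute ran once; the second get was served from cache.
```

Even this toy exposes the real design questions the session covers: how big the store may grow (sizing), what TTL is safe (staleness), and what happens when everyone misses at once.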
I just made a change to the database schema, but now the team needs it for my feature to work. How can I keep track of my database changes and communicate them to the rest of the team? Migrations give a structured way to structurally alter your database structure as your application evolves . . . structurally. They also provide a way for everyone on the team: developers, testers, CI admins, DBAs, etc, to apply the latest changes wherever they are needed - with uniformity and low friction. Fluent Migrations for .NET provide a discoverable, human readable API that supports dozens of different databases (including SQL Server, PostgreSQL, Oracle). Topics covered in this session:
* Why you should use migrations
* How to write fluent migrations
* A look behind the scenes of how fluent migrations work
* Drawbacks/downsides to using migrations
* Other migration options for EF and NoSQL (Couchbase)
Ad hoc SQL scripts make you want to flip a desk? Keep your team on the same page with fluent migrations.
(This session will briefly mention EF Migrations, but is not primarily about EF).
Cache Rules Everything Around Me - DevIntersection - December 2022 (Matthew Groves)
Presented at DevIntersection 2022 in Las Vegas
Putting the SQL Back in NoSQL - October 2022 - All Things Open (Matthew Groves)
Do you like the familiarity of SQL, but need the speed and flexibility of JSON data that NoSQL databases can provide? You don't have to choose anymore. SQL++ is an emerging standard to apply SQL to JSON data. In this session, you'll learn how SQL++ eases the transition to building an application with modern NoSQL technology. The basics of SQL++ and the necessary extensions to working with JSON technology will be covered. Finally, you'll learn how to start using a SQL++ implementation in production with Couchbase Capella, a cloud DBaaS with one of the top SQL++ implementations available.
Cache Rules Everything Around Me - Momentum - October 2022.pptx (Matthew Groves)
NoSQL document databases provide unique capabilities of scaling, flexibility, and performance for a wide variety of use cases. However, many developers from relational backgrounds are understandably nervous (for a variety of reasons) about using NoSQL in their next project. This session will address one of those reasons: ACID transactions (or lack thereof). This session will start with some background about why NoSQL databases didn’t (initially) have full ACID capabilities. Next, we’ll look at why lack of ACID may not be a big deal and some of the data modeling and querying techniques to use instead. Finally, we’ll look at the more recent trend of document databases adding distributed multi-document ACID capabilities and show a live demo of a NoSQL transaction. You’ll leave this session with a better understanding of how ACID works and when to use it.
ACID and NoSQL are no longer exclusive. This session explains what ACID is, the reasons why it's difficult for distributed NoSQL, some of the workarounds that are still relevant, and how to use it today.
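One of the modeling techniques alluded to above is to put data that must change together into a single document, since a single-document write is atomic in most document databases. A hedged sketch, with invented account and transfer document shapes:

```python
# Instead of a multi-document transaction that debits one account document
# and credits another, model the transfer itself as one document: writing it
# is a single atomic operation in most document stores (illustrative sketch).
transfer_doc = {
    "type": "transfer",
    "from": "account::alice",
    "to": "account::bob",
    "amount": 25.00,
}

def balance(account_id, opening, transfers):
    """Derive a balance by folding transfer documents (event-style reads)."""
    for t in transfers:
        if t["from"] == account_id:
            opening -= t["amount"]
        if t["to"] == account_id:
            opening += t["amount"]
    return opening

alice = balance("account::alice", 100.00, [transfer_doc])
bob = balance("account::bob", 100.00, [transfer_doc])
```

The trade-off is that balances become derived values rather than stored ones, which is exactly the kind of modeling decision the session weighs against using the newer distributed ACID support.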
Demystifying NoSQL - All Things Open - October 2020 (Matthew Groves)
We’ve been using relational databases like SQL Server, Postgres, MySQL, and Oracle for a long time. Tables are practically ingrained into our thought processes. But many organizations and businesses are turning to NoSQL options to solve problems of scale, performance, and flexibility. What is a long-time relational database-using developer supposed to do? Do I just forget about all that SQL that I learned? (Spoiler alert: NO). Come to this session with all your burning questions about data modeling, transactions, schema, migration, how to get started, and more. Let’s find out if a NoSQL tool like Couchbase, CosmosDb, Mongo, etc, is the right fit for your next project.
Autonomous Microservices - Manning - July 2020 (Matthew Groves)
Everybody loves microservices, but we all know how difficult it is to get them right. Distributed systems are much more complex to develop and maintain, and over time, we even miss the simplicity of old monoliths. In this talk, I propose a combination of infrastructure, architecture, and design principles to make your microservices bulletproof and easy to maintain, with a combination of high scalability, elasticity, fault tolerance, and resilience. This session will also include a discussion of some microservices blueprints, like asynchronous communications, how to avoid cascading failures in synchronous calls, and why you should use different data stores according to the use case: document databases to speed up performance, RDBMSs for transactions, graphs for recommendations, etc.
Background Tasks Without a Separate Service: Hangfire for ASP.NET - KCDC - Ju... (Matthew Groves)
If you’re a web developer, eventually you’ll need to do some background processing. This has often meant running separate daemons, services, or Cron jobs, potentially complicating your integration and deployment. With Hangfire, you can create background tasks that run right inside the same .NET or .NET Core application. Hangfire background tasks can scale easily to multiple servers and can use a variety of durable storage options. You even get a monitoring UI right out of the box. In this session, we’ll look at the basics of setting up Hangfire, and how to perform fire-and-forget, delayed, recurring, and continuations of background tasks. We’ll also look at possible gotchas: debugging, failed jobs, cloud deployment.
Autonomous Microservices - CodeMash - January 2019 (Matthew Groves)
Everybody loves microservices, but it's difficult to do it right. Distributed systems are much more complex to develop and maintain. Over time, you may even miss the simplicity of old monoliths. In this session, I propose a combination of infrastructure, architecture, and design principles to bulletproof your microservices and make them easy to maintain with a combination of high scalability, elasticity, fault tolerance, and resilience. This session will include a discussion of microservices blueprints like: asynchronous communications, avoiding cascading failures in synchronous calls, and how distributed NoSQL databases become valuable in terms of scalability and performance when combined with your microservices in a Kubernetes deployment.
5 Popular Choices for NoSQL on a Microsoft Platform - Tulsa - July 2018 (Matthew Groves)
If you are thinking of trying out a NoSQL document database, there are many good options available to Microsoft-oriented developers. In this session, we’ll compare some of the more popular databases, including: CosmosDb, Couchbase, MongoDb, CouchDb, and RavenDb. We’ll look at the strengths and weaknesses of each system. Querying, scaling, usability, speed, deployment, support and flexibility will all be covered. This session will include a discussion about when NoSQL is right for your project and give you an idea of which technology to pursue for your use case.
Full stack development with node and NoSQL - All Things Open - October 2017 (Matthew Groves)
What is different about this generation of web applications? A solid development approach must consider the latency, throughput, and interactivity demanded by users across mobile devices, web browsers, and IoT. These applications often use NoSQL to support the flexible data model and easy scalability required for modern development.
A full stack application (composed of Couchbase, WebAPI, Angular2, and ASP.NET/ASP.NET Core) will be demonstrated in this session. The individual parts of a stack may vary, but the overall design is the focus.
5 Popular Choices for NoSQL on a Microsoft Platform - All Things Open - Octob... (Matthew Groves)
My toaster stores data without SQL and without tables. But making a choice based on what something doesn’t have isn’t terribly useful. “NoSQL” is an increasingly inaccurate catch-all term that covers a lot of different types of data storage. Let’s make more sense of this new breed of database management systems and go beyond the buzzword. In this session, the four main data models that make up the NoSQL movement will be covered: key-value, document, columnar and graph. How they differ and when you might want to use each one will be discussed.
This session will be looking at the whole ecosystem, with a more detailed focus on Couchbase, Cassandra, Riak KV, and Neo4j.
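The four data models named above can be illustrated with a single record. These Python stand-ins are invented for illustration and do not reflect any particular product's storage format or API:

```python
# The same user record shaped for each of the four NoSQL data models.

# Key-value: an opaque blob behind a key; the store knows nothing inside it.
kv = {"user::42": '{"name": "Sam", "follows": ["user::7"]}'}

# Document: the value is structured JSON the database can index and query.
doc = {"id": "user::42", "name": "Sam", "follows": ["user::7"]}

# Columnar: values grouped by column rather than by row, which favors
# scanning one attribute across many records.
columnar = {"id": ["user::42"], "name": ["Sam"]}

# Graph: entities as nodes, relationships as first-class edges.
nodes = {"user::42": {"name": "Sam"}, "user::7": {"name": "Pat"}}
edges = [("user::42", "FOLLOWS", "user::7")]
```

The shapes make the trade-offs visible: the key-value form is fastest to fetch but cannot be queried by content, while the graph form makes the "follows" relationship the thing you traverse.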
I Have a NoSQL Toaster - Troy .NET User Group - July 2017 (Matthew Groves)
I have a NoSQL Toaster - ConnectJS - October 2016 (Matthew Groves)
NoSQL is a catch-all term that covers a lot of different types of data storage. Is it really helpful to group them together by one thing they don't have? Think about it like this: my toaster is as much NoSQL as any database! So, how can we make more sense of this new breed of database management systems?
Spend just a little time on why people are using NoSQL
Talk about how data is modeled differently in JSON
Let’s talk about why SQL is good and why SQL for JSON is needed
Let’s talk about the exciting stuff happening in the database ecosystem
Including but not limited to the stuff Couchbase is doing
If we have time, we’ll look at how a .NET developer (or Java developer, etc) would interact with SQL for JSON
This session is a WIP. It’s based on my knowledge of Couchbase, my SQL Server experience, and David Segleau’s engagement and lessons learned with customers, all combined into an hour-long presentation.
David likes bullet points, I like to break up bullet points and use lots of pictures.
David works with customers, I work with dev community.
So you’re going to see a meshing of that, hopefully it works.
What’s also interesting is that we’re seeing the use of NoSQL expand inside many of these companies. Orbitz, the online travel company, is a great example – they started using Couchbase to store their hotel rate data, and now they use Couchbase in many other ways.
Same with eBay: they recently presented at the Couchbase conference with a chart tracking how many instances of various NoSQL databases are in use. Cassandra and Mongo are growing, and Couchbase has actually surpassed them within eBay.
SQL (relational) databases are great. They give you a LOT of functionality.
Great set of abstractions (tables, columns, data types, constraints, triggers, SQL, ACID TRANSACTIONS, stored procedures and more) at a highly reasonable cost.
Change is inevitable
One thing RDBMS does not handle well is CHANGE.
Change of schema (both logical and physical), change of hardware, change of capacity.
NoSQL databases ESPECIALLY ONES DESIGNED TO BE DISTRIBUTED tend to help solve problems with: agility, scalability, performance, and availability
Let’s talk about what NoSQL is, first.
NoSQL generally refers to databases that lack SQL or don’t use a relational model.
Once the SQL language and transactions became optional, a flurry of databases was created, each using distinct approaches for common use cases.
Key-value stores simply provide quick access to data for a given KEY.
Wide-column databases can store a large number of arbitrary columns in each row.
Graph databases store data and relationships as first class concepts
Document databases aggregate data into a hierarchical structure.
JSON is a means to that end. Document databases provide flexible schema, built-in data types, rich structure, and implicit relationships using JSON.
When we look at document databases, they originally came with a minimal set of APIs and features.
But as they continue to mature, we’re seeing more features being added
And generally I’m seeing a convergent trend between SQL and NoSQL
But anyway, this minimal feature set, lacking a SQL language and tables, gives us the buzzword “NoSQL”.
Elastic scaling
Size your cluster for today
Scale out on demand
Cost effective scaling
Commodity hardware
On premise or on cloud
Scale OUT instead of Scale UP
[example: changing the channel to a soccer game or Game of Thrones, everyone makes the same API request in the same 5 minutes]
[example: TV show lets watchers vote during some period of the week, so you can scale up during that period of time]
[example: black Friday]
Schema flexibility
Easier management of change in the business requirements
Easier management of change in the structure of the data
Sometimes you're pulling together data, integrating from different sources (e.g. ELT) and that flexibility helps
A document database means that you have no rigid schema. You can do whatever the heck you want.
That being said, you SHOULDN’T. You should still have discipline about your data.
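A minimal sketch of what that discipline can look like in practice: the database won’t enforce a schema, so the application validates documents before writing them. The field names and the `type` field convention below are hypothetical, just for illustration.

```javascript
// Hypothetical application-side validation for customer documents.
// The database won't enforce this; the application should.
function validateCustomer(doc) {
  const errors = [];
  if (doc.type !== "customer") errors.push("type must be 'customer'");
  if (typeof doc.name !== "string" || doc.name.length === 0) errors.push("name is required");
  if (doc.billing !== undefined && !Array.isArray(doc.billing)) errors.push("billing must be an array");
  return errors;
}

console.log(validateCustomer({ type: "customer", name: "Jane Doe" })); // → [] (valid)
console.log(validateCustomer({ type: "order" }));                      // → two error messages
```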
NoSQL systems are optimized for specific access patterns
Low response time for web & mobile user experience
Millisecond latency
Consistently high throughput to handle growth
[perf measures can be subjective – talk about architecture, integrated cache, maybe mention MDS too]
If one machine goes down, customers can still use the other.
Or if you need to perform maintenance, upgrade, etc, you don't have to take the whole system down
This is related to scaling
Built-in replication and fail-over
No application downtime when hardware fails
Online maintenance & upgrade
No application downtime
Let’s talk about data modeling a bit, because storing data in JSON is different than storing it in tables.
So I want to compare the approaches over 4 key areas.
I’m going to fill in this table, traditional SQL on the left and JSON on the right
Let’s look at modeling Customer data. This is an example of what a customer might look like
There is a rich structure: attributes, potentially sub-attributes (first name and last name)
Relationships: to other data (other customers, to products perhaps)
Value evolution: Maybe we’d start with one connection, change to multiple (data is updated)
Structure evolution: Maybe we start without connections and add those later, or we evolve name field to be more than first and last name (data is reshaped)
Rich Structure
In a relational database, this customer data would be stored in five normalized tables.
Each time you want to construct a customer object, you JOIN the data in these tables;
Each time you persist, you find the appropriate rows in relevant tables and insert/update.
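A sketch of what that looks like from the application’s point of view. The table and column names are made up, and only three of the tables are shown for brevity; the point is that reconstructing the object means joining every time.

```javascript
// Hypothetical normalized rows, as they might come back from a relational store.
const customers = [{ id: 1, firstName: "Jane", lastName: "Doe" }];
const contacts = [
  { customerId: 1, kind: "phone", value: "555-0100" },
  { customerId: 1, kind: "email", value: "jane@example.com" },
];
const billing = [{ customerId: 1, type: "visa", number: "XXXX-1234" }];

// Constructing the customer object means joining on customerId, EACH time.
function loadCustomer(id) {
  const c = customers.find((row) => row.id === id);
  return {
    ...c,
    contacts: contacts.filter((row) => row.customerId === id),
    billing: billing.filter((row) => row.customerId === id),
  };
}

const jane = loadCustomer(1);
// jane.contacts has 2 rows, jane.billing has 1
```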
Relationship
Enforcement is via referential constraints. Objects are constructed by JOINS, EACH time.
Value Evolution
Additional values of the SAME TYPE (e.g. an additional phone or an additional address) are managed by additional ROWS in one of the tables.
Customer:contacts will have a 1:n relationship.
Structure Evolution:
Imagine we didn't start with a billing table.
This is the most difficult part. Changing the structure is difficult, both within a table and across tables.
While you can do this via ALTER TABLE, it requires downtime, migration, and application versioning.
This is one of the problems document databases try to handle by representing data in JSON.
Let’s see how to represent customer data in JSON.
The primary key (CustomerID) becomes the DocumentKey.
Each column name-column value pair becomes a KEY-VALUE pair.
We aren’t in normal form anymore.
Rich Structure & Relationships
Billing information is stored as a sub-document
There could be more than a single credit card. So, use an array.
Value evolution
Simply add additional array element or update a value.
Structure evolution
Simply add new key-value pairs
No downtime to add new KV pairs
Applications can validate data
Structure evolution over time.
Relations via Reference
So, finally, you have a JSON document that represents a CUSTOMER.
In a single JSON document, relationship between the data is implicit by use of sub-structures and arrays and arrays of sub-structures.
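Putting those pieces together, a sketch of the same customer as one document (field names are illustrative; the document key might be something like "customer::1"):

```javascript
// The whole customer lives in a single document; relationships are implicit
// via sub-documents and arrays. Field names here are hypothetical.
const customer = {
  firstName: "Jane",
  lastName: "Doe",
  contacts: [{ kind: "phone", value: "555-0100" }],
  billing: [{ type: "visa", number: "XXXX-1234" }], // sub-documents in an array
};

// Value evolution: simply add another array element.
customer.contacts.push({ kind: "email", value: "jane@example.com" });

// Structure evolution: simply add a new key-value pair. No ALTER TABLE, no downtime.
customer.loyaltyTier = "gold";
```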
Reference slide
What types of relationships are being modeled?
How are the relationships accessed?
Let’s talk about data modeling a bit, because storing data in JSON is different than storing it in tables.
We’ll focus on N1QL for now.
Notice I’m using Guid
That may not be a good idea
N1QL is powerful in its flexibility and declarative nature; it’s familiar to developers, supports JOINs, etc.
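To make that concrete, here’s the flavor of query this enables. The N1QL string below is illustrative (hypothetical bucket and field names), and the surrounding JavaScript just simulates what that predicate would return against two sample documents:

```javascript
// Illustrative N1QL: declarative SQL over JSON documents, including
// predicates that reach into arrays.
const n1ql = `
  SELECT c.firstName, c.lastName
  FROM customers c
  WHERE ANY card IN c.billing SATISFIES card.type = "visa" END;
`;

// A rough in-memory simulation of that query over sample documents:
const docs = [
  { firstName: "Jane", lastName: "Doe", billing: [{ type: "visa" }] },
  { firstName: "John", lastName: "Roe", billing: [{ type: "amex" }] },
];
const result = docs
  .filter((c) => (c.billing || []).some((card) => card.type === "visa"))
  .map((c) => ({ firstName: c.firstName, lastName: c.lastName }));
// result: [{ firstName: "Jane", lastName: "Doe" }]
```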
Indexing is very important, as it's not as performant as key/value or map/reduce
(Maybe talk about indexing on a SQL table vs indexing on a whole bucket)
Couchbase 5.0 has introduced some tools for analyzing query performance
So you can see what indexes are being used, where the biggest costs are in the query
And so on.
There are a lot of different types of indexes for N1QL
This is kinda like a materialized view
It's powerful in that it can be run in parallel, can use JavaScript to do filtering/mapping, great for aggregation.
It's limited in that it can't do anything like a JOIN, can't get input from other views, and more
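As a rough sketch of how that works: a Couchbase-style view is defined by a JavaScript map function that emits key/value pairs per document. (In the real API the signature is `function (doc, meta)` with `emit` in scope; here `emit` is passed in explicitly so the simulation is self-contained, and the document shape is hypothetical.)

```javascript
// A view-style map function: called once per document, emits key/value pairs.
function map(doc, meta, emit) {
  if (doc.type === "customer" && Array.isArray(doc.billing)) {
    for (const card of doc.billing) {
      emit(card.type, 1); // key: card type; value: 1, suitable for a _count-style reduce
    }
  }
}

// Simulate running the view over a bucket and reducing by counting keys.
const bucket = [
  { type: "customer", billing: [{ type: "visa" }, { type: "amex" }] },
  { type: "customer", billing: [{ type: "visa" }] },
  { type: "order" }, // filtered out by the map function
];
const counts = {};
for (const doc of bucket) {
  map(doc, {}, (key, _value) => { counts[key] = (counts[key] || 0) + 1; });
}
// counts: { visa: 2, amex: 1 }
```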
Let’s talk about data modeling a bit, because storing data in JSON is different than storing it in tables.
Are you going to take the time to clean up the data? Do you need to?
Do you need to enrich or restructure the data to take advantage of Json?
Duration v resources: how long is it going to take? What tools and resources are available to you? What’s your biggest constraint: time or resources? Do you need to get the migration done in 1 hour (and have it use as many parallel resources as needed), or do you need to minimize/manage the resource impact on the existing system, no matter how long it takes?
Data governance: what are the rules for moving data, auditing, etc.? Do you need to keep track of where the data came from and who is allowed to access it? Many newer systems need to track where sensitive data originated.
A whole bunch at a time, or one at a time
Single threaded – easier
Multi-threaded – faster, complicated
Is the migration a one-time event, or does it need to happen incrementally (every day, or over a 2-3 month period where the old system and the new system operate in parallel)? Do you plan to do the data migration as a single thread (read all the data, write all the data), or using a multi-threaded or multi-process approach where each thread or process reads some percentage of the data?
If you're writing your own, Entity Framework can be helpful, because it can do the mapping of aggregate root C# objects for you, which you can then write to a document database
So if you already have EF mappings created, you're part way there.
KISS: Either export to CSV and use N1QL to do any ETL that’s required (assuming it’s simple), or use SQL to do simple ETL on export and then just import into CB.
Basically keep it as simple as you can and plan for failure. Developers often think of the migration process as “One and Done”, but the reality is that data migration is often an ongoing headache that DevOps needs to monitor and manage in a production environment. Make everyone’s life easier by thinking about the long game as much as possible.
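A stripped-down sketch of the single-threaded, batched shape described above. `readBatch` and `writeDocs` are hypothetical adapters standing in for the source (relational) reader and the target (document) writer:

```javascript
// Single-threaded, batched migration loop.
// readBatch(offset, n) returns up to n source rows; writeDocs(docs) persists them.
function migrate(readBatch, writeDocs, batchSize) {
  let offset = 0;
  let total = 0;
  for (;;) {
    const rows = readBatch(offset, batchSize);
    if (rows.length === 0) break;  // done (or nothing left after a resume)
    writeDocs(rows);               // simple ETL could reshape rows into documents here
    total += rows.length;
    offset += batchSize;           // track progress so a failed run can resume, not restart
  }
  return total;
}
```

Swapping in a multi-threaded version means partitioning the offset ranges across workers, which is faster but, as noted, more complicated to monitor and to make restartable.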
From NoSQL to relational
From relational to NoSQL:
GoldenGate is from Oracle
CData for SSIS and Couchbase
https://github.com/mahurtado/CouchbaseGoldenGateAdapter
https://www.cdata.com/drivers/couchbase
Make it part of your application directly
May or may not be reusable
This is a lot of work, so make sure you have a good reason
Let’s talk about data modeling a bit, because storing data in JSON is different than storing it in tables.
Focus, Success Criteria, Review Architecture
consider using a tool like Hackolade to define models rigorously and collaboratively
Start the animation
vs. Mongo: Couchbase features N1QL, XDCR, Full Text Search, and Mobile & Sync, plus a memory-first architecture and proven, easy scaling.
vs. CouchDB: Couchbase started a long time ago as a whole new piece of software, basically a combination of memcached and CouchDB, but it has grown far beyond that. Couchbase isn’t a fork of CouchDB, or vice versa. They share part of a name and they’re both NoSQL, like MySQL and SQL Server, for instance.
Open source apache license for community edition, enterprise edition on a faster release schedule, some advanced features, and support license.
Couchbase is software you can run in the cloud, on a VM, or in your own data center. CosmosDB is a managed cloud service, but there is an emulator you can run locally.
Transactions: if you can use nested modeling, you don't need multi-document transactions.
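A sketch of that idea (field names illustrative): because an order and its line items live in one document, adding an item and updating the total is a single-document write, which the database can apply atomically, so no multi-document transaction is needed.

```javascript
// Order with line items nested in one document; field names are illustrative.
const order = {
  type: "order",
  items: [{ sku: "A-100", qty: 1, price: 25 }],
  total: 25,
};

// Adding an item and updating the total touches one document, so a single
// atomic document write keeps the order internally consistent.
function addItem(doc, item) {
  doc.items.push(item);
  doc.total += item.qty * item.price;
  return doc;
}

addItem(order, { sku: "B-200", qty: 2, price: 10 });
// order.total is now 45, order.items has 2 entries
```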