Data modelling and partitioning in Azure Cosmos DB
Session's objectives
What is Azure Cosmos DB?
Non-relational and horizontally scalable
So is Azure Cosmos DB suitable for relational workloads?
Let's look at a concrete example
Identifying the operations we have to serve
Now let's implement this model on Azure Cosmos DB!
Starting with the Customer entity
To embed or to reference?
Our first entity: Customer
Customer
customers
PK: ?
What is partitioning?
logical partitions
Logical partitions, one per username: Andrew, Theo, Mark, Tim, Deborah, Luis
Max size of a logical partition: 20 GB
Max size of an item: 2 MB
SELECT * FROM c WHERE c.username = 'Mark'
username is our partition key, so this query is routed to the single logical partition for 'Mark'
SELECT * FROM c WHERE c.favoriteColor = 'orange'
favoriteColor is not the partition key, so this query has to fan out to every partition
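The difference between the two queries above can be sketched with a small in-memory model. This is only an illustration, not how Cosmos DB is implemented: the `Container` class, the MD5-based hash, the count of three physical partitions, and the favorite colours assigned to each user are all assumptions made up for the example.

```python
import hashlib
from collections import defaultdict

PHYSICAL_PARTITIONS = 3  # hypothetical number of physical partitions


def physical_partition(partition_key_value: str) -> int:
    """Map a partition key value to a physical partition (illustrative hash only)."""
    digest = hashlib.md5(partition_key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % PHYSICAL_PARTITIONS


class Container:
    """Toy container that stores items by the hash of their partition key."""

    def __init__(self, partition_key: str):
        self.partition_key = partition_key
        self.partitions = defaultdict(list)  # physical partition -> items

    def insert(self, item: dict) -> None:
        self.partitions[physical_partition(item[self.partition_key])].append(item)

    def query(self, field: str, value: str):
        """Return (matching items, number of physical partitions visited)."""
        if field == self.partition_key:
            # Filter on the partition key: route to exactly one partition.
            targets = [physical_partition(value)]
        else:
            # Filter on any other field: every partition has to be checked.
            targets = list(range(PHYSICAL_PARTITIONS))
        hits = [i for p in targets for i in self.partitions[p] if i.get(field) == value]
        return hits, len(targets)


customers = Container(partition_key="username")
sample = [("Andrew", "blue"), ("Theo", "orange"), ("Mark", "green"),
          ("Tim", "orange"), ("Deborah", "red"), ("Luis", "blue")]
for name, color in sample:  # colours are invented for the example
    customers.insert({"username": name, "favoriteColor": color})

by_name, visited_by_name = customers.query("username", "Mark")
by_color, visited_by_color = customers.query("favoriteColor", "orange")
print(visited_by_name, visited_by_color)
```

The query on `username` touches one partition, while the query on `favoriteColor` touches all of them, which is why the partition key should match the filter of the most frequent queries.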
Choosing a partition key for customers
customers
PK: ?
Choosing a partition key for customers
customers
PK: id
Next: product categories
Product categories
productCategories
PK: ?
SELECT * FROM c
Product categories
productCategories
PK: type
Next: product tags
Product tags
productTags
PK: ?
Product tags
productTags
PK: type
Next: products
Products
products
PK: ?
CategoryA CategoryB CategoryC
SELECT * FROM c WHERE c.categoryId = 'CategoryA'
Products
products
PK: categoryId
category name?
tag names?
Products: how to return category and tag names?
products
SELECT * FROM c WHERE c.categoryId = 'CategoryA'
productCategories
SELECT c.name FROM c WHERE c.id = 'CategoryA'
productTags
SELECT * FROM c
WHERE c.id IN ('<tagId1>', '<tagId2>', '<tagId3>')
Introducing denormalization
Products: denormalizing category and tag names
products
PK: categoryId
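A denormalized product item might be built as in the sketch below. The field names `categoryName`, `tags`, and `tagIds`, the helper `denormalize_product`, and all sample values are assumptions for illustration. The point is only that the category and tag names are copied into the product, so a single query on the products container returns everything.

```python
def denormalize_product(product: dict, categories: dict, tags: dict) -> dict:
    """Copy the category name and tag names into the product item."""
    doc = {k: v for k, v in product.items() if k != "tagIds"}
    doc["categoryName"] = categories[product["categoryId"]]["name"]
    doc["tags"] = [{"id": t, "name": tags[t]["name"]} for t in product["tagIds"]]
    return doc


# Hypothetical sample data; names and ids are made up.
categories = {"CategoryA": {"id": "CategoryA", "type": "category", "name": "Bikes"}}
tags = {
    "Tag1": {"id": "Tag1", "type": "tag", "name": "Road"},
    "Tag2": {"id": "Tag2", "type": "tag", "name": "Carbon"},
}
product = {
    "id": "Product1",
    "categoryId": "CategoryA",  # still the partition key
    "name": "Sample bike",
    "tagIds": ["Tag1", "Tag2"],
}

product_doc = denormalize_product(product, categories, tags)
print(product_doc)
```

The trade-off is that the copied names can go stale when a category or tag is renamed, so the copies have to be kept in sync.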
Products: keeping everything in sync
productCategories
productTags
products
Cosmos DB's change feed
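One way to picture the synchronization is a handler that receives each changed category or tag and rewrites the copies embedded in products. The sketch below simulates this with plain Python lists and does not use the Cosmos DB SDK. The `handle_change` function, the `change_feed` list, and the sample items are hypothetical.

```python
def handle_change(changed: dict, products: list) -> int:
    """Propagate a changed category or tag into the products that embed it.

    Returns the number of embedded copies that were rewritten.
    """
    updated = 0
    for product in products:
        if changed["type"] == "category" and product["categoryId"] == changed["id"]:
            product["categoryName"] = changed["name"]
            updated += 1
        elif changed["type"] == "tag":
            for tag in product["tags"]:
                if tag["id"] == changed["id"]:
                    tag["name"] = changed["name"]
                    updated += 1
    return updated


# Denormalized products (made-up sample data).
products = [
    {"id": "Product1", "categoryId": "CategoryA", "categoryName": "Bikes",
     "tags": [{"id": "Tag1", "name": "Road"}]},
    {"id": "Product2", "categoryId": "CategoryB", "categoryName": "Helmets",
     "tags": [{"id": "Tag1", "name": "Road"}]},
]

# Simulated change feed: one renamed category, then one renamed tag.
change_feed = [
    {"id": "CategoryA", "type": "category", "name": "Bicycles"},
    {"id": "Tag1", "type": "tag", "name": "Road cycling"},
]

touched = sum(handle_change(change, products) for change in change_feed)
print(touched, products)
```

In a real deployment the handler would be triggered by the change feed of the category and tag containers and would write the updated products back to the products container. Here both sides are just in-memory lists.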
Next: sales orders
Sales orders
salesOrders
PK: ?
CustomerA CustomerB CustomerC
SELECT * FROM c WHERE c.customerId = 'CustomerA'
Sales orders
salesOrders
PK: customerId
Sales orders
salesOrders
PK: customerId
customers
PK: id
Mixing entities in the same container?
Sales orders: mixing with customers
customers
PK: id
Sales orders: mixing with customers
customers
PK: customerId
Sales orders: mixing with customers
Logical partitions: CustomerA, CustomerB, CustomerC
Each partition holds one customer and that customer's sales orders
customers
PK: customerId
Sales orders
customers
PK: customerId
SELECT * FROM c WHERE c.customerId = 'CustomerA'
AND c.type = 'salesOrder'
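The query above relies on every item carrying a `type` discriminator and a `customerId`. A minimal in-memory sketch of that layout follows. The `customers_container` dictionary, the `insert` and `query` helpers, and the sample items are assumptions for illustration, with one dictionary key standing in for one logical partition.

```python
from collections import defaultdict

# customerId -> items in that logical partition (toy stand-in for the container)
customers_container = defaultdict(list)


def insert(item: dict) -> None:
    """Store an item in the logical partition named by its customerId."""
    customers_container[item["customerId"]].append(item)


def query(customer_id: str, item_type: str) -> list:
    """Mimic: WHERE c.customerId = <customer_id> AND c.type = <item_type>."""
    return [i for i in customers_container.get(customer_id, []) if i["type"] == item_type]


# A customer item uses its own id as customerId, so it lands next to its orders.
insert({"id": "CustomerA", "customerId": "CustomerA", "type": "customer"})
insert({"id": "Order1", "customerId": "CustomerA", "type": "salesOrder"})
insert({"id": "Order2", "customerId": "CustomerA", "type": "salesOrder"})
insert({"id": "CustomerB", "customerId": "CustomerB", "type": "customer"})
insert({"id": "Order3", "customerId": "CustomerB", "type": "salesOrder"})

orders_a = query("CustomerA", "salesOrder")
print([o["id"] for o in orders_a])
```

Both the customer lookup and the sales order lookup filter on `customerId`, so each one is served from a single logical partition.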
Denormalizing the count of sales orders per customer
Logical partitions: CustomerA, CustomerB, CustomerC
Within the customer's partition: add a sales order and update the customer (its salesOrderCount)
customers
PK: customerId
Sales orders
customers
PK: customerId
SELECT * FROM c WHERE c.type = 'customer'
ORDER BY c.salesOrderCount DESC
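Because a customer and its sales orders share one logical partition, adding an order and bumping `salesOrderCount` can be treated as a single all-or-nothing step. The sketch below models that with a copy-then-swap in plain Python. It is not Cosmos DB code, and `add_sales_order`, `top_customers`, and the sample data are hypothetical names.

```python
import copy


def add_sales_order(partition: list, order: dict) -> list:
    """Return a new partition state with the order added and the count incremented.

    Works on a copy, so if anything fails the original partition is untouched.
    """
    staged = copy.deepcopy(partition)
    customer = next(i for i in staged if i["type"] == "customer")
    customer["salesOrderCount"] += 1
    staged.append(order)
    return staged


def top_customers(partitions: dict) -> list:
    """Mimic: WHERE c.type = 'customer' ORDER BY c.salesOrderCount DESC."""
    customers = [i for items in partitions.values() for i in items if i["type"] == "customer"]
    return sorted(customers, key=lambda c: c["salesOrderCount"], reverse=True)


partitions = {
    "CustomerA": [{"id": "CustomerA", "customerId": "CustomerA",
                   "type": "customer", "salesOrderCount": 0}],
    "CustomerB": [{"id": "CustomerB", "customerId": "CustomerB",
                   "type": "customer", "salesOrderCount": 0}],
}

for order_id, customer_id in [("Order1", "CustomerB"), ("Order2", "CustomerA"),
                              ("Order3", "CustomerA")]:
    order = {"id": order_id, "customerId": customer_id, "type": "salesOrder"}
    partitions[customer_id] = add_sales_order(partitions[customer_id], order)

ranking = [(c["id"], c["salesOrderCount"]) for c in top_customers(partitions)]
print(ranking)
```

The ranking query filters on `type` rather than on `customerId`, so unlike the per-customer queries it has to read across partitions. The precomputed count keeps it from also having to aggregate sales orders.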
Our final design
customers
PK: customerId
productCategories
PK: type
productTags
PK: type
products
PK: categoryId
Our final design, optimized!
customers
PK: customerId
productMeta
PK: type
products
PK: categoryId
Key takeaways
Going further
https://docs.microsoft.com/azure/cosmos-db/modeling-data
https://docs.microsoft.com/azure/cosmos-db/how-to-model-partition-example
https://devblogs.microsoft.com/cosmosdb/data-modeling-and-partitioning-for-relational-workloads/
https://github.com/AzureCosmosDB/labs/blob/master/readme.md
https://github.com/AzureCosmosDB/labs/blob/master/decks/Data-Modeling.pptx

[2nd Azure Cosmos DB Study Group] Data modelling and partitioning in Azure Cosmos DB