2. Azure DocumentDb Training – Resource Model
How do we model Data
inside DocumentDb?
3. Azure DocumentDb Training – Resource Model
Recap: Why DocumentDb
Resource Model
Document and Resource Units
Database Account, Accessibility and Consistency
Database and Namespace
Containers and Partitioning
Conclusions
5. Azure DocumentDb Training – Resource Model
CAP Theorem: a set of principles that a distributed system can satisfy
Consistency:
All nodes should see the same data at the same time
"Result is always up to date" (no out-of-date data / stale data)
Availability:
Every request receives a response about whether it succeeded or failed
"There is always a result" (accepting out-of-date data / stale data)
Partition-tolerance:
The system continues to operate despite arbitrary partitioning due to network failures
"Data can be served by any one of multiple nodes" (some nodes can be out-of-date / stale)
A distributed system can satisfy any two of these principles at the same time, but not all three
6. Azure DocumentDb Training – Resource Model
A DBMS is a Distributed System
It lives under CAP Theorem
A DBMS must give up one of the three principles:
A relational DBMS gives up partition tolerance to guarantee strict consistency and
availability
A NoSQL DBMS typically moves to eventual consistency (relaxing, not
giving up, consistency) to guarantee partition tolerance and availability (it accepts stale data)
7. Azure DocumentDb Training – Resource Model
With Eventual Consistency, a DBMS cannot run some typical
relational features
referential integrity
check constraints
schema
8. Azure DocumentDb Training – Resource Model
DocumentDb is a schemaless Db
DocumentDb is a Document-Oriented Db
A document is a JSON document
Promotes code-first development (mapping objects to JSON)
Resilient to iterative schema changes
No ORM required
It’s great for Catalog Data, Preference and State, Event Store,
User Generated Content, Data Exchange
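A minimal sketch of the code-first idea, using only the Python standard library. The `Product` class and its fields are hypothetical; the point is that a plain object maps straight to a JSON document, and that a property added in a later iteration does not force older documents to be migrated.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Product:
    # Hypothetical catalog entity, used only for illustration.
    id: str
    name: str
    price: float
    tags: list = field(default_factory=list)  # property added in a later iteration


# Map the object straight to a JSON document: no ORM, no table schema.
doc = json.dumps(asdict(Product(id="p1", name="Kettle", price=24.5, tags=["kitchen"])))

# An older document that predates the "tags" property still loads;
# the missing property simply falls back to its default.
older = json.loads('{"id": "p0", "name": "Mug", "price": 4.0}')
restored = Product(**older)
```

Documents written before and after the change can sit side by side, which is what "resilient to iterative schema changes" means in practice.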
10. Azure DocumentDb Training – Resource Model
DocumentDb is Platform as a Service
No on-premises deployment
RESTful API
All DocDb elements are public and accessible through a resource URI
Resource
Json Resources
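As a rough illustration of "everything is a resource URI", the helper below builds the address of a database, collection, or document. The `dbs/.../colls/.../docs/...` layout follows the service's REST addressing; the account name `myaccount` and the ids are hypothetical, and `resource_uri` is an illustrative helper, not part of any SDK.

```python
def resource_uri(account, database, collection=None, document=None):
    """Build the URI of a database, collection, or document resource."""
    parts = ["dbs", database]
    if collection is not None:
        parts += ["colls", collection]
        if document is not None:
            parts += ["docs", document]
    return "https://{}.documents.azure.com/{}".format(account, "/".join(parts))


# Hypothetical account and resource ids.
doc_uri = resource_uri("myaccount", "shop", "products", "p1")
coll_uri = resource_uri("myaccount", "shop", "products")
```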
16. Azure DocumentDb Training – Resource Model
When to reference (normalize):
Representing one-to-many relationships.
Representing many-to-many relationships.
Related data changes frequently.
Referenced data could be unbounded
Provides more flexibility than embedding
More round trips to read data
Normalizing typically provides better write performance
17. Azure DocumentDb Training – Resource Model
When to embed:
There are containment relationships between entities.
There are one-to-few relationships between entities.
There is embedded data that changes infrequently.
There is embedded data that won't grow without bound.
There is embedded data that is integral to data in a document.
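The two modeling styles can be sketched with plain Python dictionaries standing in for JSON documents. All entity names and fields are hypothetical: addresses are one-to-few and integral to the customer, so they are embedded; orders can grow without bound, so they live in their own documents and point back to the customer.

```python
# Embedded: the addresses travel with the customer in a single document.
customer_embedded = {
    "id": "c1",
    "name": "Ada",
    "addresses": [{"city": "Turin"}, {"city": "Milan"}],
}

# Referenced: each order is its own document holding the customer's id.
customer = {"id": "c1", "name": "Ada"}
orders = [
    {"id": "o1", "customerId": "c1", "total": 30.0},
    {"id": "o2", "customerId": "c1", "total": 12.5},
    {"id": "o3", "customerId": "c2", "total": 8.0},
]


def orders_for(customer_id, all_orders):
    """Second round trip needed by the referenced model to resolve related data."""
    return [o for o in all_orders if o["customerId"] == customer_id]


ada_orders = orders_for("c1", orders)
```

Reading the embedded customer takes one request; the referenced model needs an extra lookup, but adding an order never rewrites the customer document.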
18. Azure DocumentDb Training – Resource Model
Resource Unit
DocumentDb is Platform as a Service: no perception of physical resource allocation
A throughput currency
1 RU: the throughput needed to read a 1 KB JSON document
Many factors impacting RU usage: Document size. Document property count. Data
consistency. Indexed properties. Document indexing. Query patterns. Script usage.
Reservation Model
You are billed for the amount of throughput reserved for the collection, regardless of how
much of that throughput is actively used.
There is a pricing calculator available to help calculate costs:
https://www.documentdb.com/capacityplanner
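A back-of-the-envelope sketch of sizing a reservation from the "1 RU reads a 1 KB document" rule. It assumes, purely for illustration, that read cost grows linearly with document size; the factors listed above (indexing, consistency, query patterns, scripts) change real numbers, so the capacity planner remains the reference.

```python
import math


def estimated_read_rus(reads_per_second, doc_size_kb):
    # Assumption (not from the slides): cost scales linearly with size,
    # starting from 1 RU for reading a 1 KB document.
    return math.ceil(reads_per_second * doc_size_kb)


def fits_reservation(reads_per_second, doc_size_kb, reserved_rus):
    """True when the estimated read load stays within the reserved RU/s."""
    return estimated_read_rus(reads_per_second, doc_size_kb) <= reserved_rus


small_docs = estimated_read_rus(300, 1)    # 300 reads/s of 1 KB documents
large_docs = estimated_read_rus(300, 2)    # same rate, 2 KB documents
ok = fits_reservation(300, 1, 400)
too_much = fits_reservation(300, 2, 400)
```

Because billing follows the reservation, not the usage, the estimate matters mostly for choosing how much throughput to reserve.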
21. Azure DocumentDb Training – Resource Model
Unit of Authorization
Unit of Consistency
22. Azure DocumentDb Training – Resource Model
Master keys
Upon creation of a DocumentDB account, two master keys (primary and secondary) are
created. These keys enable full administrative access to all resources within the
DocumentDB account.
Read-only keys
Upon creation of a DocumentDB account, two read-only keys (primary and secondary) are
created. These keys enable read-only access to all resources within the DocumentDB
account.
Resource tokens
A resource token is associated with a DocumentDB permission resource and captures the
relationship between the user of a database and the permission that user has for a
specific DocumentDB application resource (e.g. collection, document).
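For REST calls, a master key is not sent as-is: it signs each request. The sketch below follows the signing scheme described in the service's REST access-control documentation (HMAC-SHA256 over verb, resource type, resource link, and date). The key and the resource link are made-up placeholders, and `master_key_token` is an illustrative helper, not an SDK function.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def master_key_token(verb, resource_type, resource_link, date, master_key):
    """Build the URL-encoded authorization token for a master-key request."""
    key = base64.b64decode(master_key)
    # String to sign: verb, resource type, resource link, date, then an empty line.
    text = "{}\n{}\n{}\n{}\n\n".format(
        verb.lower(), resource_type.lower(), resource_link, date.lower()
    )
    digest = hmac.new(key, text.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return quote("type=master&ver=1.0&sig=" + signature, safe="")


# Placeholder key and resource link, for illustration only.
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
token = master_key_token(
    "GET", "docs", "dbs/shop/colls/products/docs/p1",
    "Thu, 27 Apr 2017 00:51:12 GMT", fake_key,
)
```

The resulting value goes in the request's `Authorization` header, with the same date sent in `x-ms-date`. Resource tokens are issued by the service for a permission and are passed in the same header without client-side signing.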
23. Azure DocumentDb Training – Resource Model
Query / transaction throughput (and reliability – i.e., hardware
failure) depend on replication!
All writes to the primary are replicated across two secondary replicas
All reads are distributed across three copies
“Scalability of throughput” – allowing different clients to read from different replicas
helps prevent bottlenecks
BUT replication takes time!
Potential scenario: some clients are reading while another is writing
Now, the data is stale (out-of-date), inconsistent!
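The stale-read scenario can be reproduced with a toy in-memory model: one primary, two secondaries, and replication that only happens when `replicate()` is called. This is a simplified simulation with hypothetical names, not how the service is implemented.

```python
class ReplicatedStore:
    """Toy model: writes land on the primary, then reach secondaries later."""

    def __init__(self):
        self.primary = {}
        self.secondaries = [{}, {}]
        self.pending = []  # writes not yet copied to the secondaries

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Copy every pending write to both secondaries, in order.
        for key, value in self.pending:
            for replica in self.secondaries:
                replica[key] = value
        self.pending.clear()

    def read(self, key, replica_index):
        # Index 0 is the primary; 1 and 2 are the secondaries.
        copies = [self.primary] + self.secondaries
        return copies[replica_index].get(key)


store = ReplicatedStore()
store.write("stock", 10)
store.replicate()
store.write("stock", 9)              # replication has not caught up yet
fresh = store.read("stock", 0)       # primary sees the new value
stale = store.read("stock", 1)       # secondary still returns the old one
store.replicate()
caught_up = store.read("stock", 1)   # consistent again after replication
```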
24. Azure DocumentDb Training – Resource Model
Trade-off: speed (performance & availability) or consistency
(data correctness)?
“Does every read need the MOST current data?”
“Or do I need every request to be handled and handled quickly?”
4 options …
Strong, Session, Bounded Staleness, Eventual
A default consistency level applies to the entire Db
Per collection in a future release
Per query (optional parameter on the CreateDocumentQuery method)
30. Azure DocumentDb Training – Resource Model
A unit of scale for transactions
for stored procedures and triggers
A unit of query throughput
(capacity units allocated uniformly across all collections)
A unit of replication
A collection is replicated three times
A container of JSON documents
JSON docs inside of a collection can vary dramatically
31. Azure DocumentDb Training – Resource Model
Collection-based RU Reservation
(Capacity units allocated uniformly across all collections)
Standard pricing tier with hourly billing
$0.042/hour for a minimum of 400 RU/s
Performance levels can be adjusted
Each collection = 10GB of SSD
Limit of 100 collections (1 TB)
Soft limit, can be lifted as needed per account
(with Support)
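The figures on this slide can be turned into quick arithmetic. The sketch assumes a 30-day month and that every collection is reserved at the 400 RU/s minimum; both are assumptions for illustration, and the rate is the one quoted above, which may no longer be current.

```python
HOURLY_RATE_USD = 0.042      # quoted rate for the 400 RU/s minimum
GB_PER_COLLECTION = 10
MAX_COLLECTIONS = 100        # soft limit per account


def monthly_cost_usd(collections, hours_per_month=24 * 30):
    # Assumption: a 30-day month, each collection at the minimum reservation.
    return collections * HOURLY_RATE_USD * hours_per_month


one_collection = monthly_cost_usd(1)
full_account = monthly_cost_usd(MAX_COLLECTIONS)
max_storage_gb = MAX_COLLECTIONS * GB_PER_COLLECTION  # the "1 TB" on the slide
```

One collection comes to roughly 30 USD per month under these assumptions, whether or not the reserved throughput is actually used.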
32. Azure DocumentDb Training – Resource Model
Partitioning
Data Size
A single collection (currently*) holds 10GB
Throughput
3 Performance tiers with a max of 2,500 RU/sec
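When data or throughput outgrows one collection, documents have to be spread across several. The sketch below sizes the number of collections from the two limits on this slide and routes a document by hashing a partition key. The function names and the choice of SHA-256 are illustrative assumptions, not the SDK's partition resolver.

```python
import hashlib
import math


def collections_needed(data_gb, ru_per_sec, gb_per_coll=10, ru_per_coll=2500):
    """Collections required to satisfy both the size and the throughput limit."""
    by_size = math.ceil(data_gb / gb_per_coll)
    by_throughput = math.ceil(ru_per_sec / ru_per_coll)
    return max(by_size, by_throughput, 1)


def pick_collection(partition_key, collection_count):
    """Map a partition key to a collection index with a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % collection_count


count = collections_needed(data_gb=35, ru_per_sec=6000)  # size needs 4, throughput needs 3
target = pick_collection("user-42", count)               # same key always lands in the same collection
```

A stable hash keeps all documents for one key together, but changing the collection count remaps keys, so growth has to be planned for.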
35. Azure DocumentDb Training – Resource Model
DocumentDb is a RESTful service
Documents define the unit of cost through Resource Units
Database Account defines Accessibility and Consistency
Database is a Namespace placeholder
Containers are the unit of Scale
Editor's Notes
Instead of taking the business subject / domain entity and breaking it up into multiple relational structures, store the business subject in the minimal number of documents.