Cosmos DB for DBAs & DEVs
Niko Neugebauer – Consultant @ OH22
Professional Focus
• Data Platform (especially from Microsoft)
• Columnstore Blogger (110+) at http://www.nikoport.com/columnstore
• Creator of CISL – Columnstore Indexes Script Library (https://github.com/NikoNeugebauer/CSIL)
Community
• Led the first international SQLSaturday
• PASS User Group Leader
• TUGA Non-Profit Association Leader
Speaker
• Niko speaks regularly at events such as PASS Summit, SQLRally, SQLBits, and SQLSaturday events around the world.
• /in/webcaravela/ • @NikoNeugebauer
CAP Theorem – old wisdom: pick just 2!
• Consistency
• Availability
• Partition tolerance
So close, so far ...
CosmosDB:
The new Wisdom
Agenda
• What is CosmosDB ?
• Why CosmosDB ?
• How CosmosDB ?
• Use CosmosDB
• CosmosDB for Developers
• CosmosDB for DBAs
CosmosDB:
What is it ?
What is CosmosDB
• Azure Cosmos DB is Microsoft's globally distributed, multi-model
database.
• With the click of a button, Azure Cosmos DB enables you to
elastically and independently scale throughput and storage across
any number of Azure's geographic regions.
• It offers throughput, latency, availability, and consistency guarantees
with comprehensive service level agreements (SLAs), something no
other database service can offer.
Data Models in CosmosDB
• The database engine operates on an atom-record-sequence (A-R-S) based type system.
• All data models are translated to A-R-S.
• APIs and wire protocols are supported via extensible modules.
Currently supported data models:
• Documents, Graphs, Key-Value, Column-Value
APIs (as of 30-11-2017)
• DocumentDB API
• SQL-like API
• MongoDB API
• Table API
• Graph API (TinkerPop, Gremlin/Groovy)
• Cassandra API
• Spark
• Geospatial support
• more will be coming!
A word on Table API vs Azure Table Storage comparison
| | Table Storage | Cosmos Table API |
|---|---|---|
| Latency | Fast | Single-digit millisecond latency |
| Throughput | Variable, scalable up to 20,000 operations/second | Highly scalable with dedicated reserved throughput per table, up to 10 million operations/sec |
| Global Distribution | Single region | Turnkey global distribution |
| Indexing | Only primary index on PartitionKey and RowKey | Automatic and complete indexing on all properties, no index management (LOL). |
| Query | Query execution uses index for primary key, and scans otherwise. | Queries can take advantage of automatic indexing on properties for fast query times. |
| Consistency | Strong in primary region, eventual in secondary region | 5 well-defined consistency levels |
Resource Model
CosmosDB: Partitioning
CosmosDB Partitioning
Partitioning
• Implemented on the Tenant-level (Collection, Graph, Table)
• A resource partition is a resource-governed primitive, which is
limited to a subset of keys.
• Partitions can be split, merged, etc.
Partitioning Best Practices
- Select a PartitionKey for the best data distribution
- Use location-aware partition key for the best access locality
- Select a PartitionKey which can be a transaction scope
- Don’t use Timestamps for write-heavy workloads. Use time ranges
(hour, month, week, day, year) for even data distribution.
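The time-range advice above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of any Cosmos DB SDK: the helper name `time_bucket_key`, the `device_id` prefix, and the key format are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical helper: none of these names come from the Cosmos DB SDK.
_FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%dT%H",
}


def time_bucket_key(device_id: str, ts: datetime, granularity: str = "hour") -> str:
    """Build a partition key from an entity id plus a coarse time range."""
    if granularity not in _FORMATS:
        raise ValueError(f"unsupported granularity: {granularity}")
    # Normalise to UTC so the same instant always lands in the same bucket.
    bucket = ts.astimezone(timezone.utc).strftime(_FORMATS[granularity])
    return f"{device_id}_{bucket}"


if __name__ == "__main__":
    first = datetime(2017, 11, 30, 14, 5, 9, tzinfo=timezone.utc)
    later = datetime(2017, 11, 30, 14, 59, 59, tzinfo=timezone.utc)
    # Both writes fall into the same hourly bucket for this device.
    print(time_bucket_key("sensor-42", first))
    print(time_bucket_key("sensor-42", later))
```

Prefixing the bucket with an entity id is a design choice of this sketch: it spreads writes from the same hour across many keys instead of sending them all to one.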
CosmosDB: Why
Why create CosmosDB?
• Traditional relational databases were designed in the 1970s-80s
• Data is Growing (Petabytes, Exabytes, etc)
• Think about Internet-Scale and distributed systems
• Provide API Choices
Think about:
• Availability
• Performance
• Costs
CosmosDB: the focus on the performance
| Percentile | Reads (1KB) | Indexed Writes (1KB) |
|---|---|---|
| 50th | < 2ms | < 6ms |
| 99th | < 10ms | < 15ms |
▪ Globally distributed with reads and writes served from/to local
region
▪ Write-optimised, latch-free engine designed for SSD
▪ Synchronous/Asynchronous automatic indexing
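To check your own measurements against the latency figures above, a small percentile helper is enough. The sketch below uses the nearest-rank method; the function names and the sample data are hypothetical, and only the thresholds come from the slide.

```python
import math

# Latency targets quoted on the slide, in milliseconds.
TARGETS_MS = {
    "read": {50: 2.0, 99: 10.0},
    "write": {50: 6.0, 99: 15.0},
}


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_targets(samples_ms, operation):
    """Return {percentile: bool} for the slide's targets for one operation."""
    return {
        pct: percentile(samples_ms, pct) < limit
        for pct, limit in TARGETS_MS[operation].items()
    }


if __name__ == "__main__":
    # Made-up client-side measurements for 1KB reads.
    measured = [1.0] * 98 + [9.0, 30.0]
    print(meets_targets(measured, "read"))
```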
Azure Cosmos DB
• Azure Cosmos DB is fully schema agnostic.
• Uses JSON to describe the supported data models
• Automatic indexing of all ingested content
• Resource Governed, write-optimised engine
• Online Index operations
Core pieces of CosmosDB Architecture
• Global distribution
• Resource Governance
• Schema-agnostic service
Consistency Levels (and there are 5 of them):
• You pick a stronger consistency level, like strong or bounded staleness,
for your account because a critical path in your e-commerce/LOB
application needs the guarantee.
• But for some less-critical operations (like a reporting dashboard
query), you would choose a weaker consistency level because it
consumes only half the throughput.
• The current offering for the consistency levels is:
Strong / Bounded Staleness / Session / Consistent Prefix / Eventual
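The idea of an account default plus a per-request choice can be sketched in plain Python. This is not SDK code. It assumes the rule that a request may weaken the account's default level but not strengthen it, and the names `Consistency` and `effective_level` are hypothetical.

```python
from enum import IntEnum


class Consistency(IntEnum):
    # Ordered from weakest to strongest, matching the list on the slide.
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def effective_level(account_default, requested=None):
    """Pick the level for one request: it may weaken the default, never strengthen it."""
    if requested is None:
        return account_default
    if requested > account_default:
        raise ValueError(
            f"{requested.name} is stronger than the account default "
            f"{account_default.name}"
        )
    return requested


if __name__ == "__main__":
    default = Consistency.BOUNDED_STALENESS
    # Critical path: use the account default.
    print(effective_level(default).name)
    # Reporting dashboard: relax to eventual for this request only.
    print(effective_level(default, Consistency.EVENTUAL).name)
```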
Consistency Levels in 1 Picture:
Default Consistency Levels:
• Strong - Linearizability. Reads are guaranteed to return the most recent
version of an item.
• Bounded Staleness - Consistent Prefix. Reads lag behind writes by at most k
versions (prefixes) or a time interval t.
• Session - Consistent Prefix. Monotonic reads, monotonic writes,
read-your-writes, write-follows-reads in your geographical location.
• Consistent Prefix - Updates returned are some prefix of all the
updates, with no gaps. If you applied sequential transactions, the
previous ones are available on request.
• Eventual - Reads may be out of order.
Indexing & Consistency Levels:
| Indexing Mode | Reads | Queries |
|---|---|---|
| Consistent | Select from strong, bounded staleness, session, consistent prefix, or eventual | Select from strong, bounded staleness, session, or eventual |
| Lazy | Select from strong, bounded staleness, session, consistent prefix, or eventual | Eventual |
| None | Select from strong, bounded staleness, session, consistent prefix, or eventual | Eventual |
Throughput
• RU – Request Unit
• % Memory / % CPU / % IOPS, just like for Azure SQL DB
• READ / INSERT / UPSERT / DELETE / QUERY - operations
• QUERY = Scans + Index Lookups + Query Complexity + Instruction Cost
• Everything is calculated by Azure ML
Throughput
• RU – Request Units
• 400 RU/sec – 10,000 RU/sec (Collections)
• 2,500 RU/sec – Unlimited? RU/sec (Partitioned Collections)
• Min Increase / Decrease is 100 RU/sec
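The limits above translate into simple arithmetic when sizing a single collection. The sketch below rounds a required figure up to the next value that can be provisioned. The function names are hypothetical, and the per-operation costs (1 RU per 1KB read, 5 RU per 1KB write) are rough placeholder assumptions, so measure the real request charges for your workload.

```python
import math

MIN_RU = 400    # smallest throughput for a single collection, per the slide
STEP_RU = 100   # smallest increase or decrease, per the slide


def estimate_ru(reads_per_sec, writes_per_sec, ru_per_read=1.0, ru_per_write=5.0):
    """Rough RU/sec estimate; the default per-operation costs are assumptions."""
    return reads_per_sec * ru_per_read + writes_per_sec * ru_per_write


def provisioned_ru(required_ru_per_sec):
    """Round a required RU/sec figure up to a value that can be provisioned."""
    stepped = math.ceil(required_ru_per_sec / STEP_RU) * STEP_RU
    return max(MIN_RU, stepped)


if __name__ == "__main__":
    needed = estimate_ru(reads_per_sec=500, writes_per_sec=100)
    print(needed, provisioned_ru(needed))
    print(provisioned_ru(1234))
    print(provisioned_ru(50))
```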
Scaling Cosmos DB Up & Out
• Scale Up – Increase the number of RUs
• Scale Out – Increase the number of partitions for your
collections/graphs/tables
Stored Procs, User-Defined Functions, Triggers, etc
• Server-side JavaScript programming
• Procedural Logic
• Atomic Transactions
• Batching
• Pre-Compilation
• Encapsulation
Stored Procs for CosmosDB
User-Defined Functions
Triggers (validation and Node.JS registration)
Stored Procedures using Javascript API
DO NOT!
Azure Functions
Are supported
Real Life Problems
• Data Quality (Data Types Casting, Missing Connections)
• Complex Questions (joins)
CosmosDB: Behind the Scenes
CosmosDB
• Introduction (Availability (Ring 0), Consistency, 5 9s, PaaS, Scaling)
• Blah
• Stored Procedures
• UDFs
• Triggers
At the Data Centre
• Solid State Drives storage (SSD)
• Fusion IO 160GB Drives
• Fast Private Network Connections
Move to CosmosDB
Azure CosmosDB Data Migration Tool
• Allows you to migrate your data into CosmosDB
• Supports a range of sources
• Does not support GraphDB ... yet
CosmosDB: Developers
CosmosDB Query Playground
• https://www.documentdb.com/sql/demo
Try CosmosDB for free (need an Azure account):
• https://azure.microsoft.com/en-us/try/cosmosdb/
CosmosDB in Azure Storage Explorer
Azure Cosmos DB Emulator
Software requirements:
• Windows Server 2012 R2, Windows Server 2016, or Windows 10
Minimum Hardware requirements:
• 2 GB RAM
• 10 GB available hard disk space
CosmosDB: DBAs
DBA as in DCT = Data Care Taker
Indexing Example:
Indexing Policy Modes
• Consistent – follows the same consistency level as specified for point
reads (i.e. strong, bounded staleness, session or eventual). The index is
updated synchronously as part of the document update.
The workload target is “write quickly, query immediately”.
• Lazy - To allow maximum document ingestion throughput, an Azure
Cosmos DB collection can be configured with lazy consistency, meaning
queries are eventually consistent.
The index is updated asynchronously when an Azure Cosmos DB
collection is quiescent.
• None - A collection marked with index mode of “None” has no index
associated with it. This is commonly used if Azure Cosmos DB is utilized as
a key-value store and documents are accessed only by their ID
property.
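The three modes above end up as one field in a collection's indexing policy document. The sketch below builds such a document as a plain Python dictionary. The helper name `make_indexing_policy` is hypothetical, and the choice to set `automatic` to false for mode `none` and to include the root path `/*` otherwise is an assumption to verify against the Indexing Policies documentation.

```python
import json

VALID_MODES = ("consistent", "lazy", "none")


def make_indexing_policy(mode="consistent", excluded_paths=()):
    """Return an indexing policy dictionary for the given mode (illustrative only)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown indexing mode: {mode}")
    policy = {
        "indexingMode": mode,
        # Assumption: a collection without an index does not index automatically.
        "automatic": mode != "none",
    }
    if mode != "none":
        # Assumption: index everything from the root, minus the excluded paths.
        policy["includedPaths"] = [{"path": "/*"}]
        policy["excludedPaths"] = [{"path": p} for p in excluded_paths]
    return policy


if __name__ == "__main__":
    # Bulk import first, then switch back to consistent indexing.
    print(json.dumps(make_indexing_policy("lazy"), indent=2))
    print(json.dumps(make_indexing_policy("consistent", ["/nonIndexedContent/*"]), indent=2))
```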
Indexing Policy Modes
| Consistency | Indexing Mode: Consistent | Indexing Mode: Lazy |
|---|---|---|
| Strong | Strong | Eventual |
| Bounded Staleness | Bounded Staleness | Eventual |
| Session | Session | Eventual |
| Eventual | Eventual | Eventual |
Indexing Policy Modes with EnableScanInQuery
| Consistency | Indexing Mode: Consistent | Indexing Mode: Lazy | Indexing Mode: None |
|---|---|---|---|
| Strong | Strong | Eventual | Strong |
| Bounded Staleness | Bounded Staleness | Eventual | Bounded Staleness |
| Session | Session | Eventual | Session |
| Eventual | Eventual | Eventual | Eventual |
Indexing Paths
| Path | Description |
|---|---|
| / | Default path for the collection. Recursive |
| /name/? | Hash or Range indexes for predicates and sorts |
| /name/* | Index path for all paths under the specified label (multiple levels down) |
| /name/[]/prop/? | Index path required to serve iteration and JOIN queries against arrays of objects like [{prop: "a"}, {prop: "b"}] |
Indexes Types, Kinds & Precisions
DataTypes:
• String
• Number
• Point
• Polygon
• LineString
Indexes Types, Kinds & Precisions
Index Types:
• Hash – Hash Indexes, think Hekaton (Hash Indexes). Supports
equality and JOIN queries. For most queries the default precision of 3
bytes is sufficient. DataType can be String or Number.
• Range – Range Indexes, think Hekaton (Bw-Tree). Supports equality
& range queries (<,>,<=,>=,!=) and ORDER BY queries. DataType
can be String or Number.
• Spatial – Spatial indexes for Points, Polygons & LineStrings. Supports
efficient spatial queries (within & distance).
Indexes Precision
Lets you trade off index storage overhead against query
performance.
For numbers, Microsoft recommends using the default
precision of -1 (“maximum”). Note that numbers are 8 bytes in
JSON.
Picking smaller values for precision (1-7) means collisions
and hence more RU consumption.
For string ranges, which can be of arbitrary length, the index
precision can impact the performance of range search
queries and the storage used.
The precision can be specified between 1 and 100.
Important: if you need sorting on the results (ORDER BY), you
must specify the precision of 100.
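Why a smaller precision costs more RUs can be illustrated with truncated hashes: the fewer bytes kept per value, the more distinct values share an index entry, and every shared entry means extra documents to read and filter. The sketch below uses SHA-256 purely as a stand-in, since the hash function Cosmos DB uses internally is not stated here, and `count_collisions` is a hypothetical helper.

```python
import hashlib


def count_collisions(values, precision_bytes):
    """Count values whose truncated hash was already produced by an earlier value."""
    seen = set()
    collisions = 0
    for value in values:
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        key = digest[:precision_bytes]  # keep only the first N bytes
        if key in seen:
            collisions += 1
        else:
            seen.add(key)
    return collisions


if __name__ == "__main__":
    names = [f"item-{i}" for i in range(1000)]
    for precision in (1, 2, 3, 8):
        print(precision, count_collisions(names, precision))
```

With 1 byte there are only 256 possible entries, so 1,000 distinct values must collide at least 744 times, and the count can only fall as the precision grows.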
Indexes Inclusion / Exclusion
{
  "includedPaths": [
    {
      "path": "/mainContent/*",
      "indexes": [
        {
          "kind": "Hash",
          "dataType": "String",
          "precision": 20
        }
      ]
    }
  ],
  "excludedPaths": [
    {
      "path": "/nonIndexedContent/*"
    }
  ]
}
Indexing Policy Changes – What for ?
• When importing bulk data using the lazy indexing mode
for faster writes, then switching to consistent indexing
for regular operation.
• When reducing the throughput for writes as well as the
storage space used by hand selecting the properties to
be indexed and changing them over time, or by varying
the index precision of individual properties.
• When using new indexing features on your current
DocumentDB collections like Order By and string range
queries which require the newly introduced string
range index kind.
Indexing Policy Changes - how ?
CosmosDB: Backups
Backups for the CosmosDB:
Backup for DBAs:
• Every 4 hours (approx.) a backup is taken (to Azure BLOB
Storage)
• At least 2 backups are stored at all times
• If you lose your data, you need to contact Azure Support
within 8 hours
• Backup retention: 30 days for deleted partitions/databases
• If you want to maintain your own snapshots, you can use
the export to JSON option in the Azure Cosmos DB Data
Migration tool to schedule additional backups.
Backup for DBAs – read carefully:
• As soon as corruption is detected, the user should delete
the corrupted container (collection/graph/table) so that
backups are protected from being overwritten with
corrupted data.
Source: https://docs.microsoft.com/en-us/azure/cosmos-db/online-backup-and-restore
Backup for DBAs – the alternative:
• Extract JSON files of your databases/collections/graphs with
the help of the Azure Cosmos DB Data Migration Tool
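Once JSON exports exist, keeping your own rotation of snapshots is straightforward. The sketch below shows only the file handling around an export: the documents are stand-in data, the export itself would come from the Data Migration Tool, and `write_snapshot`, its parameters, and the file naming are hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_snapshot(documents, target_dir, keep=2, now=None):
    """Write documents to a timestamped JSON file and prune older snapshots."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    path = target / f"snapshot-{stamp}.json"
    path.write_text(json.dumps(list(documents), indent=2), encoding="utf-8")
    # File names sort chronologically because of the timestamp format.
    snapshots = sorted(target.glob("snapshot-*.json"))
    for old in snapshots[:-keep]:
        old.unlink()
    return path


def demo():
    """Take three snapshots four hours apart and return the file names that remain."""
    docs = [{"id": "1", "name": "Azure"}]  # stand-in for exported documents
    with tempfile.TemporaryDirectory() as tmp:
        for hour in (0, 4, 8):
            now = datetime(2017, 11, 30, hour, tzinfo=timezone.utc)
            write_snapshot(docs, tmp, keep=2, now=now)
        return sorted(p.name for p in Path(tmp).glob("snapshot-*.json"))


if __name__ == "__main__":
    print(demo())
```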
CosmosDB: Failovers
Global Distribution aka Geo-Replication aka Regional Failover
Manual Failover
Manual Failover Scenarios:
• Follow the clock model: If your applications have predictable traffic patterns
based on the time of the day, you can periodically change the write status to
the most active geographic region based on time of the day.
• Service update: Certain globally distributed application deployments may
involve rerouting traffic to a different region via Traffic Manager during their
planned service update. Such deployments can now use manual
failover to keep the write status in the region where there is going to be
active traffic during the service update window.
• Business Continuity and Disaster Recovery (BCDR) and High Availability
and Disaster Recovery (HADR) drills: Most enterprise applications include
business continuity tests as part of their development and release process.
BCDR and HADR testing is often an important step in compliance
certifications and guaranteeing service availability in the case of regional
outages. You can test the BCDR readiness of your applications that use
Cosmos DB for storage by triggering a manual failover of your Cosmos DB
account and/or adding and removing a region dynamically.
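The follow-the-clock scenario boils down to a lookup from time of day to write region. The sketch below shows only that decision; triggering the actual failover is left out. The schedule, the chosen regions, and the function name `write_region_for` are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical schedule: (start hour UTC, end hour UTC, write region).
SCHEDULE = [
    (0, 8, "Southeast Asia"),
    (8, 16, "West Europe"),
    (16, 24, "West US"),
]


def write_region_for(ts):
    """Return the region that should hold the write status at the given time."""
    hour = ts.astimezone(timezone.utc).hour
    for start, end, region in SCHEDULE:
        if start <= hour < end:
            return region
    raise ValueError(f"schedule does not cover hour {hour}")


if __name__ == "__main__":
    morning = datetime(2017, 11, 30, 9, 0, tzinfo=timezone.utc)
    evening = datetime(2017, 11, 30, 20, 0, tzinfo=timezone.utc)
    print(write_region_for(morning))
    print(write_region_for(evening))
```

A scheduled job could compare this result with the current write region and start a manual failover only when they differ.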
Global Distribution aka Geo-Replication aka Regional Failover
• Configuration
• First, deploy your application in multiple regions
• To ensure low latency access from every region your application is deployed in,
configure the corresponding preferred regions list for each region via one of
the supported SDKs.
GraphDB
GraphDB
• Based on Apache TinkerPop (open source)
• Supporting Gremlin & Groovy (How much?) languages
GraphDB - possibilities
• Querying across graph collections - not supported right now
• Duplicate Edges detection
• Duplicate Vertex detection
• Betweenness Centrality
• Eigenvector (PageRank)
• Recommendation (as Products in SSAS)
• ...
GraphDB Gremlin querying
• g.V().count(); // Number of vertices (documents)
• g.V().hasLabel('person').has('age', gt(40)); // People aged over 40
• g.V().hasLabel('person').values('firstName'); // List people's first names
Under the hood, the query
• g.V().hasLabel('Azure')
transforms into
• {"query":"SELECT N_2 FROM Node N_2 WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = 'Azure'))"}
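The translation shown above can be mimicked for this one query shape. The sketch below only reproduces the string from the slide for a `hasLabel` filter and is not how the service itself performs the translation. The function name `has_label_query` is hypothetical, and labels containing quotes or backslashes are rejected rather than escaped.

```python
import json


def has_label_query(label, alias="N_2"):
    """Build the SQL payload shown on the slide for g.V().hasLabel(label)."""
    if "'" in label or "\\" in label:
        # Escaping rules are not covered here, so refuse rather than guess.
        raise ValueError("label must not contain quotes or backslashes")
    sql = (
        f"SELECT {alias} FROM Node {alias} WHERE "
        f"(IS_DEFINED({alias}._isEdge) = false AND ({alias}.label = '{label}'))"
    )
    # Compact separators reproduce the payload layout from the slide.
    return json.dumps({"query": sql}, separators=(",", ":"))


if __name__ == "__main__":
    print(has_label_query("Azure"))
```

The `IS_DEFINED(..._isEdge) = false` predicate is what restricts the result to vertices, because edges are stored as documents in the same collection.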
GraphDB Migrations
• Neo4J: https://github.com/bsherwin/neo2cosmos
• Migration Tool (soon)
Data Migration Tool:
• https://www.microsoft.com/en-us/download/details.aspx?id=46436
Limitations:
• Returning large amounts of data
• No support for GROUP BY (SQL API)
PowerBI
• Via Spark - https://github.com/Azure/azure-cosmosdb-spark/wiki/Configuring-Power-BI-Direct-Query-to-Azure-Cosmos-DB-via-Apache-Spark-(HDI)
Geospatial
• Working with geospatial and GeoJSON location data in Azure Cosmos DB:
https://docs.microsoft.com/en-us/azure/cosmos-db/geospatial
• Azure Cosmos DB: Expanded geospatial support, including
automatic indexing of Polygon and LineString objects:
https://azure.microsoft.com/en-us/updates/documentdb-expanded-geospatial-support-including-automatic-indexing-of-polygons-and-lines/
CosmosDB Links
• Data Migration Tool:
https://www.microsoft.com/en-us/download/details.aspx?id=46436
• Tunable data consistency levels in Azure Cosmos DB:
https://docs.microsoft.com/en-us/azure/cosmos-db/consistency-levels
• Use the Azure Cosmos DB Emulator for local development and testing:
https://docs.microsoft.com/en-us/azure/cosmos-db/local-emulator
• Indexing Policies:
https://docs.microsoft.com/en-us/azure/cosmos-db/indexing-policies
• Gremlin Console:
http://tinkerpop.apache.org/docs/current/tutorials/the-gremlin-console/
QUESTIONS?
THANK YOU FOR PARTICIPATING
Up next: Database Console Commands
Rodrigo Crespi, SQL Server specialist

The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 

CosmosDB for DBAs & Developers

  • 1. Cosmos DB for DBAs & DEVs Niko Neugebauer – Consultant @ OH22
  • 2. Speaker Niko speaks regularly at events such as PASS Summit, SQLRally, SQLBits, and SQLSaturday events around the world. Niko Neugebauer Professional Focus Community Lead the first international SQLSaturday PASS User Group Leader TUGA Non-Profit Association Leader /in/webcaravela/ @NikoNeugebauer Data Platform (especially from Microsoft) Columnstore Blogger (110+) at http://www.nikoport.com/columnstore Creator of CISL – Columnstore Indexes Script Library (https://github.com/NikoNeugebauer/CSIL)
  • 3. Niko Neugebauer Consultant, OH22 IS Professional Focus Data Platform (especially from Microsoft) Columnstore Blogger (110+) at http://www.nikoport.com/columnstore Creator of CISL – Columnstore Indexes Script Library (https://github.com/NikoNeugebauer/CSIL) Lead the first international SQLSaturday PASS User Group Leader TUGA Non-Profit Association Leader Speaker Niko speaks regularly at events such as PASS Summit, SQLRally, SQLBits, and SQLSaturday events around the world. • /in/webcaravela/ • @NikoNeugebauer
  • 4. CAP Theorem – old wisdom: pick just 2! • Consistency • Availability • Partition tolerance
  • 5. So close, so far ...
  • 7. Agenda • What is CosmosDB ? • Why CosmosDB ? • How CosmosDB ? • Use CosmosDB • CosmosDB for Developers • CosmosDB for DBAs
  • 9. What is CosmosDB • Azure Cosmos DB is Microsoft's globally distributed, multi-model database. • With the click of a button, Azure Cosmos DB enables you to elastically and independently scale throughput and storage across any number of Azure's geographic regions. • It offers throughput, latency, availability, and consistency guarantees with comprehensive service level agreements (SLAs), something no other database service can offer.
  • 11. Data Models in CosmosDB • Database engine operates on atom-record-sequence based type system. All data models translated to A-R-S • API and wire protocols supported via extensible modules Currently supported data models: • Documents, Graphs, Key-Value, Column-Value
  • 12. API (30-11-2017) • DocumentDB API • SQL-like API • MongoDB API • Table API • Graph API (TinkerPop, Gremlin/Groovy) • Cassandra API • Spark • Geospatial support • more will be coming!
  • 13. A word on Table API vs Azure Table Storage comparison
      Attribute | Table Storage | Cosmos Table API
      Latency | Fast | Single-digit millisecond latency
      Throughput | Variable, scalable up to 20,000 operations/second | Highly scalable with dedicated reserved throughput per table, up to 10 million operations/second
      Global Distribution | Single region | Turnkey global distribution
      Indexing | Only a primary index on PartitionKey and RowKey | Automatic and complete indexing on all properties, no index management (LOL)
      Query | Query execution uses the index for the primary key, and scans otherwise | Queries can take advantage of automatic indexing on properties for fast query times
      Consistency | Strong in the primary region, eventual in the secondary region | 5 well-defined consistency levels
  • 18. Partitioning • Implemented at the tenant level (Collection, Graph, Table) • A resource partition is a resource-governed primitive, which is limited to a subset of keys. • Partitions can be split, merged, etc.
  • 19. Partitioning Best Practices - Select a PartitionKey for the best data distribution - Use location-aware partition key for the best access locality - Select a PartitionKey which can be a transaction scope - Don’t use Timestamps for write-heavy workloads. Use time ranges (hour, month, week, day, year) for even data distribution.
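The time-range advice above can be sketched in a few lines of Python. This is only an illustration: the `sensor-17` device id, the function names, and the idea of combining a device id with the time bucket are assumptions made for the example, not something the slides prescribe.

```python
from datetime import datetime, timezone

# strftime patterns for the time ranges named on the slide (week omitted for brevity).
BUCKET_FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%dT%H",
}

def time_bucket(ts: datetime, granularity: str = "hour") -> str:
    """Collapse a full timestamp into a coarser time-range label."""
    return ts.strftime(BUCKET_FORMATS[granularity])

def partition_key(device_id: str, ts: datetime, granularity: str = "hour") -> str:
    """Hypothetical key: a device id plus a time bucket instead of the raw timestamp."""
    return f"{device_id}_{time_bucket(ts, granularity)}"

reading_time = datetime(2017, 11, 30, 14, 25, 7, tzinfo=timezone.utc)
print(partition_key("sensor-17", reading_time))         # sensor-17_2017-11-30T14
print(partition_key("sensor-17", reading_time, "day"))  # sensor-17_2017-11-30
```

All readings from the same device within the same hour (or day) share one key value, so the number of distinct keys stays manageable while writes are still spread over many values.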
  • 21. Why create CosmosDB? • Traditional relational databases were designed in the 1970s-80s • Data is growing (petabytes, exabytes, etc.) • Think about Internet-scale and distributed systems • Provide API choices Think about: • Availability • Performance • Costs
  • 22. CosmosDB: the focus on performance
      Percentile | Reads (1KB) | Indexed Writes (1KB)
      50th | < 2ms | < 6ms
      99th | < 10ms | < 15ms
      ▪ Globally distributed with reads and writes served from/to local region ▪ Write-optimised, latch-free engine designed for SSD ▪ Synchronous/Asynchronous automatic indexing
  • 23. Azure Cosmos DB • Azure Cosmos DB is fully schema agnostic. • Uses JSON to describe the supported data models • Automatic indexing of all ingested content • Resource Governed, write-optimised engine • Online Index operations
  • 24. Core pieces of CosmosDB Architecture • Global distribution • Resource Governance • Schema-agnostic service
  • 25. Consistency Levels (and there are 5 of them): • You pick a stronger consistency level like strong/bounded staleness for your account because a critical path in your e-commerce/LOB application needs the guarantee • But for some less-critical operations (like a reporting dashboard query), you would choose a weaker consistency level because it consumes only half the throughput. • The current offering for the Consistency levels is: Strong / Bounded Staleness / Session / Consistent Prefix / Eventual
  • 26. Consistency Levels in 1 Picture:
  • 27. Default Consistency Levels: • Strong - Linearizable. Reads are guaranteed to return the most recent version of an item. • Bounded Staleness - Consistent Prefix. Reads lag behind writes by k prefixes or t interval • Session - Consistent Prefix. Monotonic reads, monotonic writes, read-your-writes, write-follows-reads in your geographical location. • Consistent Prefix - Updates returned are some prefix of all the updates, with no gaps. If you applied sequential transactions, the previous ones are available on request. • Eventual - Out of order reads
  • 28. Indexing & Consistency Levels:
      Indexing Mode | Reads | Queries
      Consistent | Select from strong, bounded staleness, session, consistent prefix, or eventual | Select from strong, bounded staleness, session, or eventual
      Lazy | Select from strong, bounded staleness, session, consistent prefix, or eventual | Eventual
      None | Select from strong, bounded staleness, session, consistent prefix, or eventual | Eventual
  • 29. Throughput • RU – Request Unit • % Memory / % CPU / % IOPS just like for Azure SQLDB • READ / INSERT / UPSERT / DELETE / QUERY - operations • QUERY = Scans + Index Lookups + Query Complexity + Instruction Cost • Everything is calculated by Azure ML
  • 30. Throughput • RU – Request Unit • 400 RU/sec – 10,000 RU/sec (Collections) • 2,500 RU/sec – Unlimited? RU/sec (Partitioned Collections) • Min Increase / Decrease is 100 RU/sec
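The limits on this slide amount to simple arithmetic, sketched below for a single-partition collection. The function name is hypothetical, and the constants are just the figures quoted on the slide (as of 2017), so treat this as an illustration rather than the service's actual validation logic.

```python
import math

# Figures quoted on the slide for a (single-partition) collection.
MIN_RU = 400      # lowest provisionable throughput, RU/sec
MAX_RU = 10_000   # highest provisionable throughput, RU/sec
STEP_RU = 100     # minimum increase / decrease, RU/sec

def provisionable_throughput(requested_ru: float) -> int:
    """Round a desired throughput up to the next value the slide's limits allow."""
    if requested_ru > MAX_RU:
        raise ValueError("above the single-partition limit; consider a partitioned collection")
    stepped = math.ceil(requested_ru / STEP_RU) * STEP_RU
    return max(MIN_RU, stepped)

print(provisionable_throughput(250))    # 400
print(provisionable_throughput(1234))   # 1300
```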
  • 31. Scaling Cosmos DB Up & Out • Scale Up – Increase the number of RUs • Scale Out – Increase the number of partitions for your collections/graphs/tables
  • 32. Stored Procs, User-Defined Functions, Triggers, etc • Server-side JavaScript programming • Procedural Logic • Atomic Transactions • Batching • Pre-Compilation • Encapsulation
  • 33. Stored Procs for CosmosDB
  • 35. Triggers (validation and Node.JS registration)
  • 36. Stored Procedures using Javascript API DO NOT!
  • 38. Real Life Problems • Data Quality (Data Types Casting, Missing Connections) • Complex Questions (joins)
  • 40. CosmosDB • Introduction (Availability (Ring 0), Consistency, 5 9s, PaaS, Scaling) • Blah • Stored Procedures • UDFs • Triggers
  • 41. At the Data Centre • Solid State Drives storage (SSD) • Fusion IO 160GB Drives • Fast Private Network Connections
  • 43. Azure CosmosDB Data Migration Tool • Allows you to migrate your data into the CosmosDB • Supports a range of the sources • Does not support GraphDB ... yet
  • 45. CosmosDB Query Playground • https://www.documentdb.com/sql/demo
  • 46. Try CosmosDB for free (need an Azure account): • https://azure.microsoft.com/en-us/try/cosmosdb/
  • 47. CosmosDB in Azure Storage Explorer
  • 48. Azure Cosmos DB Emulator Software requirements: • Windows Server 2012 R2, Windows Server 2016, or Windows 10 Minimum Hardware requirements: • 2 GB RAM • 10 GB available hard disk space
  • 49. CosmosDB: DBAs DBA as in DCT = Data Care Taker
  • 51. Indexing Policy Modes • Consistent – follows the same consistency level as specified for the point-reads (i.e. strong, bounded-staleness, session or eventual). The index is updated synchronously as part of the document update. The workload target is “write quickly, query immediately”. • Lazy - To allow maximum document ingestion throughput, an Azure Cosmos DB collection can be configured with lazy consistency, meaning queries are eventually consistent. The index is updated asynchronously when an Azure Cosmos DB collection is quiescent. • None - A collection marked with index mode of “None” has no index associated with it. This is commonly used if Azure Cosmos DB is utilized as a key-value storage and documents are accessed only by their ID property.
  • 52. Indexing Policy Modes
      Consistency | Indexing Mode: Consistent | Indexing Mode: Lazy
      Strong | Strong | Eventual
      Bounded Staleness | Bounded Staleness | Eventual
      Session | Session | Eventual
      Eventual | Eventual | Eventual
  • 53. Indexing Policy Modes with EnableScanInQuery
      Consistency | Indexing Mode: Consistent | Indexing Mode: Lazy | Indexing Mode: None
      Strong | Strong | Eventual | Strong
      Bounded Staleness | Bounded Staleness | Eventual | Bounded Staleness
      Session | Session | Eventual | Session
      Eventual | Eventual | Eventual | Eventual
  • 54. Indexing Paths
      Path | Description
      / | Default path for the collection. Recursive
      /name/? | Hash or Range Indexes for predicates and sorts
      /name/* | Index path for all paths under the specified label. (multiple levels down)
      /name/[]/prop/? | Index path required to serve iteration and JOIN queries against arrays of objects like [{prop: "a"}, {prop: "b"}]
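The two tables on slides 52 and 53 boil down to a small lookup, sketched here in Python. The function and parameter names are hypothetical, and the behaviour for indexing mode None without EnableScanInQuery (raising an error) is an assumption of this sketch; the slides only cover the case where scans are enabled.

```python
# The four consistency levels that appear as rows in the tables on slides 52 and 53.
CONSISTENCY_LEVELS = ("Strong", "Bounded Staleness", "Session", "Eventual")

def query_consistency(account_level: str, indexing_mode: str,
                      enable_scan_in_query: bool = False) -> str:
    """Return the consistency a query effectively gets, following the slide tables."""
    if account_level not in CONSISTENCY_LEVELS:
        raise ValueError(f"unknown consistency level: {account_level}")
    if indexing_mode == "Consistent":
        return account_level          # index is updated synchronously
    if indexing_mode == "Lazy":
        return "Eventual"             # index catches up asynchronously
    if indexing_mode == "None":
        if not enable_scan_in_query:  # assumption: no index and no scan means no query
            raise ValueError("indexing mode None needs EnableScanInQuery for queries")
        return account_level          # a scan reads the documents directly
    raise ValueError(f"unknown indexing mode: {indexing_mode}")

print(query_consistency("Session", "Consistent"))                       # Session
print(query_consistency("Strong", "Lazy"))                              # Eventual
print(query_consistency("Strong", "None", enable_scan_in_query=True))   # Strong
```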
  • 55. Index Types, Kinds & Precisions DataTypes: • String • Number • Point • Polygon • LineString
  • 56. Index Types, Kinds & Precisions Index Types: • Hash – Hash indexes, think Hekaton (hash indexes). Supports equality and JOIN queries; for most queries the default precision of 3 bytes is sufficient. DataType can be String or Number. • Range – Range indexes, think Hekaton (Bw-tree). Supports equality & range queries (<,>,<=,>=,!=) and ORDER BY queries. DataType can be String or Number. • Spatial – Spatial indexes for Points, Polygons & LineStrings. Supports efficient spatial queries (within & distance).
  • 57. Index Precision Lets you trade off between index storage overhead and query performance. For numbers, Microsoft recommends using the default precision of -1 (“maximum”). Notice that numbers are 8 bytes in JSON. Picking smaller values for precision (1-7) means collisions and hence higher RU consumption. For String ranges, which can be of arbitrary length, the index precision can impact the performance of range search queries and the storage used. The precision can be specified between 1 and 100. Important: if you need sorting on the results (ORDER BY), you must specify the precision of 100.
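The trade-off between precision and collisions can be illustrated with a toy experiment. The sketch below truncates a SHA-256 digest to a given number of bytes; SHA-256 is only a stand-in chosen for the example, since the slides do not say which hash function Cosmos DB uses, and the `customer-N` values are made up.

```python
import hashlib

def truncated_hash(value: str, precision_bytes: int) -> bytes:
    """Keep only the first `precision_bytes` bytes of a hash (stand-in for index precision)."""
    return hashlib.sha256(value.encode("utf-8")).digest()[:precision_bytes]

def distinct_index_terms(values, precision_bytes: int) -> int:
    """Count how many distinct index terms survive at a given precision."""
    return len({truncated_hash(v, precision_bytes) for v in values})

# 10,000 made-up property values.
values = [f"customer-{i}" for i in range(10_000)]

for precision in (1, 2, 3):
    terms = distinct_index_terms(values, precision)
    print(f"precision={precision} bytes: {terms} distinct terms for {len(values)} values")
```

At 1 byte there can be at most 256 distinct terms, so many values collide and a query has to inspect more candidate documents. At 3 bytes almost every value gets its own term, at the cost of a larger index.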
  • 58. Index Inclusion / Exclusion
      "includedPaths": [
        {
          "path": "/mainContent/*",
          "indexes": [
            { "kind": "Hash", "dataType": "String", "precision": 20 }
          ]
        }
      ],
      "excludedPaths": [
        { "path": "/nonIndexedContent/*" }
      ]
  • 59. Indexing Policy Changes – What for ? • When importing bulk data using the lazy indexing mode for faster writes, then switching to consistent indexing for regular operation. • When reducing the throughput for writes as well as the storage space used by hand-selecting the properties to be indexed and changing them over time, or by varying the index precision of individual properties. • When using new indexing features on your current DocumentDB collections like Order By and string range queries which require the newly introduced string range index kind.
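As a minimal sketch, the same policy fragment can be assembled in Python and serialized to JSON before it is handed to whatever SDK or tool applies it. The helper name `build_indexing_policy` is hypothetical; the keys and values are the ones shown on the slide.

```python
import json

def build_indexing_policy(included_path: str, excluded_path: str, precision: int = 20) -> dict:
    """Assemble the included/excluded path sections shown on the slide."""
    return {
        "includedPaths": [
            {
                "path": included_path,
                "indexes": [
                    {"kind": "Hash", "dataType": "String", "precision": precision}
                ],
            }
        ],
        "excludedPaths": [
            {"path": excluded_path}
        ],
    }

policy = build_indexing_policy("/mainContent/*", "/nonIndexedContent/*")
print(json.dumps(policy, indent=2))
```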
  • 62. Backups for the CosmosDB:
  • 63. Backup for DBAs: • Every 4 hours (approx.) a backup is taken (to Azure BLOB Storage) • At least 2 backups are stored at all times • If you lost your data, you need to contact Azure Support within 8 hours • Backup retention: 30 days for deleted partitions/databases • If you want to maintain your own snapshots, you can use the export to JSON option in the Azure Cosmos DB Data Migration tool to schedule additional backups.
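The 8-hour figure follows from the other two numbers on the slide: with a backup roughly every 4 hours and only the 2 most recent kept, a given backup survives for about 8 hours before it is overwritten. The sketch below is a simplified model under that reading (exact 4-hour intervals, exactly 2 retained backups); the function names and timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

BACKUP_INTERVAL = timedelta(hours=4)  # approximate interval from the slide
RETAINED_BACKUPS = 2                  # backups kept at any time, from the slide

def backup_expiry(taken_at: datetime) -> datetime:
    """A backup is overwritten once RETAINED_BACKUPS newer backups have been taken."""
    return taken_at + RETAINED_BACKUPS * BACKUP_INTERVAL

def time_left_to_call_support(last_good_backup: datetime, corruption_at: datetime) -> timedelta:
    """How long the last clean backup still exists after the corruption happened."""
    return backup_expiry(last_good_backup) - corruption_at

last_good = datetime(2017, 11, 30, 8, 0)
corrupted = datetime(2017, 11, 30, 11, 0)
print(backup_expiry(last_good))                         # 2017-11-30 16:00:00
print(time_left_to_call_support(last_good, corrupted))  # 5:00:00
```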
  • 64. Backup for DBAs – read carefully: • As soon as corruption is detected, the user should delete the corrupted container (collection/graph/table) so that backups are protected from being overwritten with corrupted data. Source: https://docs.microsoft.com/en-us/azure/cosmos-db/online-backup-and-restore
  • 65. Backup for DBAs – the alternative: • Extract JSON files of your databases/collections/graphs with the help of the Azure Migration Tool
  • 67. Global Distribution aka Geo-Replication aka Regional Failover
  • 68. Global Distribution aka Geo-Replication aka Regional Failover
  • 69. Global Distribution aka Geo-Replication aka Regional Failover
  • 71. Manual Failover Scenarios: • Follow the clock model: If your applications have predictable traffic patterns based on the time of the day, you can periodically change the write status to the most active geographic region based on time of the day. • Service update: Certain globally distributed application deployments may involve rerouting traffic to a different region via Traffic Manager during their planned service update. Such application deployments can now use manual failover to keep the write status in the region where there is going to be active traffic during the service update window. • Business Continuity and Disaster Recovery (BCDR) and High Availability and Disaster Recovery (HADR) drills: Most enterprise applications include business continuity tests as part of their development and release process. BCDR and HADR testing is often an important step in compliance certifications and guaranteeing service availability in the case of regional outages. You can test the BCDR readiness of your applications that use Cosmos DB for storage by triggering a manual failover of your Cosmos DB account and/or adding and removing a region dynamically.
  • 72. Global Distribution aka Geo-Replication aka Regional Failover • Configuration • First, deploy your application in multiple regions • To ensure low latency access from every region where your application is deployed, configure the corresponding preferred regions list for each region via one of the supported SDKs.
  • 74. GraphDB • Based on Apache TinkerPop (open source) • Supporting Gremlin & Groovy (How much?) languages
  • 75. GraphDB - possibilities • Querying across graph collections - not supported right now • Duplicate Edges detection • Duplicate Vertex detection • Betweenness Centrality • Eigenvector (PageRank) • Recommendation (as Products in SSAS) • ...
  • 76. GraphDB Gremlin querying
      g.V().count(); // Count of vertices (documents)
      g.V().hasLabel('person').has('age', gt(40)); // People aged over 40
      g.V().hasLabel('person').values('firstName'); // List people's first names
      Under the hood, the query
      g.V().hasLabel('Azure')
      transforms into
      {"query":"SELECT N_2 FROM Node N_2 WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = 'Azure'))"}
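To make the "under the hood" translation concrete, here is a toy Python function that reproduces the SQL-like payload shown on the slide for the single case of `hasLabel`. It is purely illustrative: the function name and the `N_2` alias handling are assumptions, and this is not how the Cosmos DB graph engine is actually implemented.

```python
import json

def has_label_to_sql(label: str, alias: str = "N_2") -> str:
    """Build the JSON payload the slide shows for g.V().hasLabel(<label>)."""
    if "'" in label:
        # Keep the toy example simple: no quote escaping is attempted.
        raise ValueError("labels containing single quotes are not handled in this sketch")
    query = (
        f"SELECT {alias} FROM Node {alias} "
        f"WHERE (IS_DEFINED({alias}._isEdge) = false AND ({alias}.label = '{label}'))"
    )
    # Compact separators match the payload format on the slide.
    return json.dumps({"query": query}, separators=(",", ":"))

print(has_label_to_sql("Azure"))
```

The `IS_DEFINED(..._isEdge) = false` predicate is what restricts the result to vertices, which matches the slide's point that vertices and edges are stored as documents in the same collection.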
  • 77. GraphDB Migrations • Neo4J: https://github.com/bsherwin/neo2cosmos • Migration Tool (soon)
  • 78. Data Migration Tool: • https://www.microsoft.com/en-us/download/details.aspx?id=46436
  • 79. Limitations: • Returning large amounts of data • No support for GROUP BY (SQL API)
  • 80. PowerBI • Via Spark - https://github.com/Azure/azure-cosmosdb-spark/wiki/Configuring-Power-BI-Direct-Query-to-Azure-Cosmos-DB-via-Apache-Spark-(HDI)
  • 81. Geospatial • Working with geospatial and GeoJSON location data in Azure Cosmos DB: https://docs.microsoft.com/en-us/azure/cosmos-db/geospatial • Azure Cosmos DB: Expanded geospatial support, including automatic indexing of Polygon and LineString objects: https://azure.microsoft.com/en-us/updates/documentdb-expanded-geospatial-support-including-automatic-indexing-of-polygons-and-lines/
  • 82. CosmosDB Links • https://www.microsoft.com/en-us/download/details.aspx?id=46436 • Indexing Policies: https://docs.microsoft.com/en-us/azure/cosmos-db/indexing-policies • Use the Azure Cosmos DB Emulator for local development and testing: https://docs.microsoft.com/en-us/azure/cosmos-db/local-emulator • Tunable data consistency levels in Azure Cosmos DB: https://docs.microsoft.com/en-us/azure/cosmos-db/consistency-levels
  • 83. CosmosDB Links • Gremlin Console: http://tinkerpop.apache.org/docs/current/tutorials/the-gremlin-console/
  • 87. Up next: Database Console Commands, Rodrigo Crespi, SQL Server specialist