The event, held on 27th April 2019, was part of the Global Azure Bootcamp and covered Microsoft's Cosmos DB, more specifically:
- Introduction to Cosmos DB, its features, internals, resource models, and request units.
- DEMO: Create a SQL API account, download the sample .NET app, run simple queries.
- Covered the Change Feed and showcased various use-case scenarios.
- Detailed Global Distribution and the implications of the Consistency Models.
- DEMO: Mongo lift-and-shift. Run simple .NET code against MongoDB (in a Docker container) and against Cosmos DB.
- Introduction to TinkerPop graphs.
- DEMO: Graph API. Download the sample .NET app and run simple queries.
https://techspark.mt/global-azure-bootcamp-27th-april-2019/
DAT102 Introduction to Amazon DynamoDB - AWS re:Invent 2012 | Amazon Web Services
Learn why Amazon DynamoDB is the fastest-growing service in AWS history. DynamoDB is a NoSQL database service that lets you scale from one to hundreds of thousands of I/Os per second (and beyond) with the push of a button. It's designed to give you scalability and high performance with minimal administration and enables you to scale your app while keeping costs down. You also learn about the service’s design principles, its history, and about how some of our customers are using DynamoDB in their applications.
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ... | Amazon Web Services
At Librato, a SolarWinds company, we run hundreds of Cassandra instances across multiple rings and use it as our primary data store. In the past year, we embarked on a process to upgrade our fleet of Cassandra Amazon EC2 instances from instance store to instances using Amazon EBS and attached elastic network interfaces (ENIs). We find running Cassandra on EBS gives us the flexibility to choose the best instances for the best performance of our workload while saving us significant costs on infrastructure. In this session, we discuss how Librato operates Cassandra on EBS. Topics include how we chose the right instance for our workload, how we use detached EBS volumes and ENI mobility to reduce MTTR, how we use mixed EBS storage types for the best cost/performance tradeoff, how we debug performance issues, and how we continuously monitor Cassandra to get the most from AWS. We also look at performance tradeoffs made in the implementation of storage engines of large data systems like Cassandra.
MongoDB Ops Manager and Kubernetes - James Broadhead | MongoDB
Review the core technologies, such as containers, Kubernetes, and MongoDB Ops Manager. You'll also have a chance to see live demos of MongoDB running on Kubernetes and managed with MongoDB Ops Manager via the MongoDB Enterprise Kubernetes Operator.
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013 | Amazon Web Services
SmugMug.com is a popular hosting and commerce platform for photo enthusiasts with hundreds of thousands of subscribers and millions of viewers. Learn how SmugMug uses Amazon DynamoDB to provide customers detailed information about millions of daily image and video views. SmugMug shares code and information about their stats stack, which includes an HTTP interface to Amazon DynamoDB and also interfaces with their internal PHP stack and other tools such as Memcached. Get a detailed picture of lessons learned and the methods SmugMug uses to create a system that is easy to use, reliable, and high performing.
(SDD424) Simplifying Scalable Distributed Applications Using DynamoDB Streams... | Amazon Web Services
DynamoDB Streams provides a stream of all the updates made to your DynamoDB table. It is a simple but extremely powerful primitive that enables developers to easily build solutions like cross-region replication and to host additional materialized views, for instance an Elasticsearch index, on top of DynamoDB tables. In this session we dive deep into the details of DynamoDB Streams and how customers can leverage them to build custom solutions and extend the functionality of DynamoDB. We also give a demo of an example application built on top of DynamoDB Streams to demonstrate their power and simplicity.
Meet Up - Spark Stream Processing + Kafka | Knoldus Inc.
Stream processing is the real-time processing of data continuously, concurrently, and in a record-by-record fashion.
It treats data not as static tables or files, but as a continuous infinite stream of data integrated from both live and historical sources.
In these slides we'll be looking into Spark Stream Processing with Kafka.
Amazon DynamoDB is a fully managed, highly scalable NoSQL database service. We will deep dive into how DynamoDB scaling and partitioning works, how to do data modeling based on access patterns using primitives such as hash/range keys, secondary indexes, conditional writes and query filters. We will also discuss how to use DynamoDB Streams to build cross-region replication and integrate with other services (such as Amazon S3, Amazon CloudSearch, Amazon ElastiCache, Amazon Redshift) to enable logging, search, analytics and caching. You will learn design patterns and best practices on how to use DynamoDB to build highly scalable applications, with the right performance characteristics at the right cost.
Amazon DynamoDB is a fully managed NoSQL database service for applications that need consistent, single-digit millisecond latency at any scale. This talk explores DynamoDB capabilities and benefits in detail and discusses how to get the most out of your DynamoDB database. We go over schema design best practices with DynamoDB across multiple use cases, including gaming, AdTech, IoT, and others. We also explore designing efficient indexes, scanning, and querying, and go into detail on a number of recently released features, including JSON document support, Streams, and more.
(ARC311) Extreme Availability for Mission-Critical Applications | AWS re:Inve... | Amazon Web Services
More and more businesses are deploying their mission-critical applications on AWS, and one of their concerns is how to improve the availability of their services, going beyond traditional availability concepts. In this session, you will learn how to architect different layers of your application, beginning with an extremely available front-end layer with Amazon EC2, Elastic Load Balancing, and Auto Scaling, and going all the way to a protected multitiered information layer, including cross-region replicas for relational and NoSQL databases. The concepts that we will share, using services like Amazon RDS, Amazon DynamoDB, and Amazon Route 53, will provide a framework you can use to keep your application running even with multiple failures. Additionally, you will hear from Magazine Luiza, in an interactive session, on how they run a large e-commerce application with a multiregion architecture using a combination of features and services from AWS to achieve extreme availability.
An overview of Apache Spark and AWS Glue.
Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL.
The Data Pipeline team at Demonware (Activision) routes large amounts of data from various sources to many destinations every day.
Our team always wanted to be able to query processed data for debugging and analytical purposes, but creating large data warehouses was never our priority, since it usually happens downstream.
Amazon Athena is a completely serverless query service that requires no infrastructure setup or complex provisioning. We just needed to save some of our data streams to AWS S3 and define a schema. Just a few simple steps, but in the end we were able to write complex SQL queries against gigabytes of data and get results in seconds.
In this presentation I want to show multiple ways to stream your data to AWS S3, explain some underlying tech, show how to define a schema and finally share some of the best practices we applied.
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all of your data for a fraction of the cost of traditional data warehouses. In this webinar, we take an in-depth look at data warehousing with Amazon Redshift for big data analytics. We cover best practices to take advantage of Amazon Redshift's columnar technology and parallel processing capabilities to deliver high throughput and query performance.
Learning Objectives:
• Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
• Learn how to design schemas and load data efficiently
• Learn best practices for workload management, distribution and sort keys, and optimizing queries
Modeling data and best practices for Azure Cosmos DB | Mohammad Asif
Azure Cosmos DB is Microsoft's globally distributed, multi-model database service. In this session we covered data modeling with the NoSQL Cosmos database and how it helps distributed applications maintain high availability, scale across multiple regions, and manage throughput.
Azure Cosmos DB - The Swiss Army NoSQL Cloud Database | BizTalk360
Microsoft Cosmos DB is the Swiss army NoSQL database in the cloud. It is a multi-model, multi-API, globally distributed, highly available, and secure NoSQL database in Azure. In this session, we will explore its capabilities and features through several demos.
Here is a brief introduction to Azure Data Explorer, with many examples using the Kusto dialect and the C# client.
With a particular focus on IIoT contexts and process control data, let's discover how to implement time series analysis in terms of pattern recognition and trend correlation.
Traditional data warehouses become expensive and slow down as the volume of your data grows. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze all of your data using existing business intelligence tools for 1/10th the traditional cost. This session will provide an introduction to Amazon Redshift and cover the essentials you need to deploy your data warehouse in the cloud so that you can achieve faster analytics and save costs. We’ll also cover the recently announced Redshift Spectrum, which allows you to query unstructured data directly from Amazon S3.
Amazon Aurora is a MySQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora is disruptive technology in the database space, bringing a new architectural model and distributed systems techniques to provide far higher performance, availability, and durability than was previously available using conventional monolithic database techniques. In this session, we dive deep into some of the key innovations behind Amazon Aurora, discuss best practices and migration from other databases to Amazon Aurora, and share early customer experiences from the field.
Traditional data warehouses become expensive and slow down as the volume of your data grows. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze all of your data using existing business intelligence tools for 1/10th the traditional cost. This session will provide an introduction to Amazon Redshift and cover the essentials you need to deploy your data warehouse in the cloud so that you can achieve faster analytics and save costs.
Connector Corner: Automate dynamic content and events by pushing a button | DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Deliver the message to managers and peers, along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But if the "Reject" button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Transcript: Selling digital books in 2024: Insights from industry leaders - T... | BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
UiPath Test Automation using UiPath Test Suite series, part 3 | DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
- UI automation introduction
- UI automation sample
- Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Epistemic Interaction - tuning interfaces to provide information for AI support | Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Key Trends Shaping the Future of Infrastructure.pdf | Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
This talk covers the key trends across hardware, cloud, and open source, exploring how these areas are likely to mature and develop over the short and long term, and considering how organisations can position themselves to adapt and thrive.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... | UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf | 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... | Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
JMeter webinar - integration with InfluxDB and Grafana | RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
3. Agenda: Exploring Cosmos DB
What is it?
Internals
Resource Model
Try it out!
DEMO: Create a SQL API account & download the sample .NET app
Change Feed
Global Distribution
Use Cases
Consistency Models
Request Units
DEMO: Mongo - Lift and shift
TinkerPop graphs
DEMO: Graphs
5. History: Project Florence (2010) → DocumentDB (2014/2015) → Azure Cosmos DB (2017)
• Originally started to address the problems faced by large-scale apps inside Microsoft
• Built from the ground up for the cloud
• Used extensively inside Microsoft
• One of the fastest growing services on Azure
6. Azure Cosmos DB: a globally distributed, massively scalable, multi-model database service
• Turnkey global distribution
• Elastic scale-out of storage & throughput
• Guaranteed low latency at the 99th percentile
• Comprehensive SLAs
• Five well-defined consistency models
7. (Build of the previous slide.) Adds the supported data models: key-value, column-family, document, and graph.
8. (Further build.) Data models: key-value, column-family, document, graph. APIs surfaced over them include the Table API and Cosmos DB's API for MongoDB.
9. Your application and Cosmos DB
• Your app consists of your app logic plus a database client library
• It talks to Cosmos DB through the Graph API, the MongoDB API, or any other supported API
• You keep the open-source driver of your choice* and switch backends with a change of connection string*
* Depending on feature supportability
12. Resource Hierarchy
Tenants → Containers → Resource Partitions
CONTAINERS: logical resources "surfaced" to APIs as tables, collections, or graphs, which are made up of one or more physical partitions or servers.
RESOURCE PARTITIONS:
• Consistent, highly available, and resource-governed coordination primitives
• Consist of replica sets, with each replica hosting an instance of the database engine
• Each replica set has a Leader, Followers, and a Forwarder that replicates to remote resource partition(s)
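As a concrete illustration of containers and their provisioned throughput, here is a minimal sketch of creating a partitioned collection with the v2 .NET DocumentDB SDK (the database name "demo", collection id "telemetry", partition key path "/deviceId", and throughput figure are assumptions for this example; client is an already-constructed DocumentClient):

using System;
using System.Collections.ObjectModel;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Define a container ("collection" in the SQL API) partitioned on /deviceId.
DocumentCollection collection = new DocumentCollection
{
    Id = "telemetry",                              // assumed container id
    PartitionKey = new PartitionKeyDefinition
    {
        Paths = new Collection<string> { "/deviceId" }  // assumed partition key
    }
};

// Create it with 1,000 RU/s of provisioned throughput.
ResourceResponse<DocumentCollection> created = client.CreateDocumentCollectionAsync(
    UriFactory.CreateDatabaseUri("demo"),          // assumed database name
    collection,
    new RequestOptions { OfferThroughput = 1000 }
).Result;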
29. Multi-Master: Read/Write in Any Region
Benefits:
• Write scalability around the world
• Low-latency writes (<10 ms at P99 for a 1 KB document) around the world
• 99.999% high availability around the world
• Well-defined consistency models
• Automatic conflict management
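A minimal sketch of opting a client into multi-region writes with the v2 .NET SDK (the endpoint, key, and region preferences are placeholders; the UseMultipleWriteLocations property assumes SDK 2.x):

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Enable multi-region writes and prefer the closest regions.
ConnectionPolicy policy = new ConnectionPolicy
{
    UseMultipleWriteLocations = true
};
policy.PreferredLocations.Add(LocationNames.WestEurope);   // first preference
policy.PreferredLocations.Add(LocationNames.NorthEurope);  // fallback

DocumentClient client = new DocumentClient(
    new Uri("https://<account>.documents.azure.com:443/"),  // placeholder endpoint
    "<primary-key>",                                        // placeholder key
    policy);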
34. Internet of Things: Telemetry & Sensor Data
Azure IoT Hub ingests device events, Apache Storm on Azure HDInsight processes them, and Azure Cosmos DB stores the telemetry and device state. Azure Web Jobs (a change feed processor) fan changes out to an Azure Function (latest state) and to Azure Data Lake (archival).
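For the "change feed processor" box above, a minimal sketch of reading the change feed with the v2 .NET SDK (the collection link and partition key range id are placeholders; production apps usually use the Change Feed Processor library rather than polling a single range):

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Read all changes for one partition key range, starting from the beginning.
IDocumentQuery<Document> query = client.CreateDocumentChangeFeedQuery(
    "dbs/iot/colls/telemetry",           // placeholder collection link
    new ChangeFeedOptions
    {
        PartitionKeyRangeId = "0",       // real apps enumerate all ranges
        StartFromBeginning = true,
        MaxItemCount = 100
    });

while (query.HasMoreResults)
{
    FeedResponse<Document> changes = query.ExecuteNextAsync<Document>().Result;
    foreach (Document changed in changes)
    {
        Console.WriteLine(changed.Id);   // hand each change to downstream consumers
    }
}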
35. Retail Product Catalogs
An Azure Web App (the e-commerce app) backed by Azure Cosmos DB for the product catalog and for session state, Azure Search for the full-text index, and Azure Storage for logs and static catalog content.
37. Real-Time Personalization / Recommendations
Components: Azure API Apps, Azure Cosmos DB as the event store, Azure Web Jobs (change feed processor), Azure Data Lake Storage as an archive of events, Azure Machine Learning, and Azure Cosmos DB as a low-latency user profile store.
39. Well-Defined Consistency Models
Strong: Linearizability (once an operation completes, it is visible to all).
Bounded Staleness: Consistent prefix; reads lag behind writes by at most k prefixes or a time interval t. Similar properties to strong consistency (except within the staleness window), while preserving 99.99% availability and low latency.
Session: Consistent prefix; within a session you get monotonic reads, monotonic writes, read-your-writes, and write-follows-reads. Predictable consistency for a session, with high read throughput and low latency.
Consistent Prefix: Reads never see out-of-order writes (no gaps).
Eventual: Potential for out-of-order reads; the lowest read cost of all consistency levels.
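A minimal sketch of selecting the consistency level when constructing a client; the .NET SDK lets you relax (never strengthen) the account's default level. The endpoint and key are placeholders:

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Request Session consistency for everything this client does.
DocumentClient client = new DocumentClient(
    new Uri("https://<account>.documents.azure.com:443/"),  // placeholder endpoint
    "<primary-key>",                                        // placeholder key
    connectionPolicy: null,
    desiredConsistencyLevel: ConsistencyLevel.Session);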
41. Well-Defined Consistency Models
Session consistency: the session is controlled using a "session token".
• Session tokens are automatically cached by the client SDK
• A token can be pulled out and used to override other requests (to preserve the session across multiple clients)

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

string sessionToken;

// Write with one client and capture the session token from the response.
using (DocumentClient client = new DocumentClient(new Uri(""), ""))
{
    ResourceResponse<Document> response = client.CreateDocumentAsync(
        collectionLink,
        new { id = "an id", value = "some value" }
    ).Result;
    sessionToken = response.SessionToken;
}

// Read with a second client, passing the captured token so the read
// observes the earlier write (read-your-writes across clients).
using (DocumentClient client = new DocumentClient(new Uri(""), ""))
{
    ResourceResponse<Document> read = client.ReadDocumentAsync(
        documentLink,
        new RequestOptions { SessionToken = sessionToken }
    ).Result;
}
44. Billing Model
Two components: storage + throughput
You are billed on consumed storage and provisioned throughput
Collections in a database can share throughput
Unit prices (for most Azure regions)*:
• SSD storage: $0.25 per GB per month
• Provisioned throughput (single-region writes): $0.008/hour per 100 RU/s
• Provisioned throughput (multi-region writes): $0.016/hour per 100 multi-master RU/s
* Pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-db/
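A rough worked example from the unit prices above (illustrative only; assumes a 730-hour month and single-region writes):
1,000 RU/s provisioned = 10 x $0.008/hour = $0.08/hour ≈ $58.40/month
50 GB of SSD storage = 50 x $0.25 = $12.50/month
Total ≈ $70.90/month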
45. Request Units
Request Units (RUs) are a rate-based currency (e.g. 1,000 RU/s) that abstracts the physical resources (% CPU, % memory, % IOPS) consumed when serving requests.
46. Request Units
Each request consumes a number of RUs:
• GET (read): approx. 1 RU for a 1 KB document
• POST / PUT (write): approx. 5 RU for a 1 KB document
• Query: depends on the query and the documents involved
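A quick sizing sketch using the approximations above (illustrative only): an app performing 500 reads/s and 100 writes/s of 1 KB documents needs roughly 500 x 1 RU + 100 x 5 RU = 1,000 RU/s of provisioned throughput.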
47. Request Units: Provisioned Throughput
• Provisioned in terms of RU/s, e.g. 1,000 RU/s
• Billed for the highest RU/s provisioned in each hour
• Easy to increase and decrease on demand
• Rate limiting is based on the amount of throughput provisioned
• Background processes such as TTL expiration and index transformations are scheduled when the system is quiescent
• Minimum throughput scales with storage: 40 RU/s per 1 GB of data
[Chart: incoming requests plotted against the provisioned min/max RU/s; below the provisioned rate there is no rate limiting and background operations run, above it requests are rate limited and the SDK retries.]
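When requests exceed the provisioned rate, the service rejects them with HTTP 429 ("Request rate too large") and the client SDK retries automatically. A minimal sketch of tuning that retry behavior in the v2 .NET SDK (the endpoint and key are placeholders):

using System;
using Microsoft.Azure.Documents.Client;

// Retry throttled (HTTP 429) requests up to 9 times, waiting at most
// 30 seconds in total across retries before surfacing the error.
ConnectionPolicy policy = new ConnectionPolicy();
policy.RetryOptions.MaxRetryAttemptsOnThrottledRequests = 9;
policy.RetryOptions.MaxRetryWaitTimeInSeconds = 30;

DocumentClient client = new DocumentClient(
    new Uri("https://<account>.documents.azure.com:443/"),  // placeholder endpoint
    "<primary-key>",                                        // placeholder key
    policy);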
54. Example property graph:
• Vertex "Kobe Bryant" (label: person; properties: age 39, height 6'6")
• Vertex "Los Angeles Lakers" (label: team; properties: state CA)
• Award vertices "NBA Champion" for 2000, 2001, 2002, 2009, and 2010 (label: award; property: obtained <year>)
• Edge labeled isPartOf from Kobe Bryant to the Los Angeles Lakers
• Edges labeled hasNbaChampionship from Kobe Bryant to each award vertex
55. The same graph, extended:
• New vertex "Oscar 2018" (label: award; properties: obtained 2018, category "Best Animated Short Film")
• New edge labeled hasAcademyAward from Kobe Bryant to Oscar 2018
56. Extended again:
• New vertex "Tom Cruise" (label: person; properties: awards null)
57. Extended once more:
• New vertex "Hollywood Celebrity" (label: status)
• New edges labeled status connecting both Kobe Bryant and Tom Cruise to Hollywood Celebrity
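The Graphs demo runs simple queries against a graph like this one. A sketch of what those Gremlin (TinkerPop) traversals could look like; the vertex ids used here are assumptions, since the slides only show display names:

// Count Kobe Bryant's NBA championships.
g.V('kobe-bryant').out('hasNbaChampionship').count()

// Which team is he part of?
g.V('kobe-bryant').out('isPartOf')

// In which years did he obtain awards (championships and the Oscar)?
g.V('kobe-bryant').out().hasLabel('award').values('obtained')

// Who shares the "Hollywood Celebrity" status?
g.V('hollywood-celebrity').in('status')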
Editor's Notes
Azure Cosmos DB offers the first globally distributed, multi-model database service for building planet scale apps. It’s been powering Microsoft’s internet-scale services for years, and now it’s ready to launch yours.
Only Azure Cosmos DB makes global distribution turn-key.
You can add Azure locations to your database anywhere across the world, at any time, with a single click. Cosmos DB will seamlessly replicate your data and make it highly available.
Cosmos DB allows you to scale throughput and storage elastically, and globally! You only pay for the throughput and storage you need – anywhere in the world, at any time.
The number of RU’s each operation consumes depends on many factors which include:
Document size
Number of indexed fields
Type of indexes
Consistency model choice
Not all queries will consume equal numbers of RU’s. Some operations are more computationally complex or require scans through more documents and therefore use more RU’s.