CONCURRENCY PATTERNS WITH MONGODB
Yann Cluchey
CTO @ Cogenta
• Real-time retail intelligence
• Gather products and prices from web
• MongoDB in production
• Millions of updates per day, 3K/s peak
• Data in SQL, Mongo, ElasticSearch
Concurrency Patterns: Why?
• MongoDB: atomic updates, no transactions
• Need to ensure consistency & correctness
• What are my options with Mongo?
• Shortcuts
• Different approaches
Concurrency Control Strategies
• Pessimistic
• Suited for frequent conflicts
• http://bit.ly/two-phase-commits
• Optimistic
• Efficient when conflicts are rare
• http://bit.ly/isolate-sequence
• Multi-version
• All versions stored, client resolves conflict
• e.g. CouchDB
Optimistic Concurrency Control (OCC)
• No locks
• Prevent dirty writes
• Uses timestamp or a revision number
• Client checks & replays transaction
Example
Original
{ _id: 23, comment: "The quick brown fox…" }
Edit 1
{ _id: 23,
  comment: "The quick brown fox prefers SQL" }
Edit 2
{ _id: 23,
  comment: "The quick brown fox prefers MongoDB" }
Example
Edit 1
db.comments.update(
  { _id: 23 },
  { _id: 23,
    comment: "The quick brown fox prefers SQL" })
Edit 2
db.comments.update(
  { _id: 23 },
  { _id: 23,
    comment: "The quick brown fox prefers MongoDB" })
Outcome: One update is lost, other might be wrong
OCC Example
Original
{ _id: 23, rev: 1,
  comment: "The quick brown fox…" }
Update a specific revision (edit 1)
db.comments.update(
  { _id: 23, rev: 1 },
  { _id: 23, rev: 2,
    comment: "The quick brown fox prefers SQL" })
OCC Example
Edit 2
db.comments.update(
  { _id: 23, rev: 1 },
  { _id: 23, rev: 2,
    comment: "The quick brown fox prefers MongoDB" })
...fails
{ updatedExisting: false, n: 0,
  err: null, ok: 1 }
• Caveat: Only works if all clients follow convention
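The check-and-replay convention above can be sketched without a database. This is a minimal in-memory illustration, not MongoDB driver code: `store`, `updateIfRev`, and `editWithReplay` are hypothetical names, and `updateIfRev` only stands in for an update whose query includes both `_id` and `rev`.

```javascript
// In-memory stand-in for a collection keyed by _id (assumption for illustration only).
const store = new Map([[23, { _id: 23, rev: 1, comment: "The quick brown fox" }]]);

// Mimics update({ _id, rev }, replacement): applies only if the stored rev matches.
// Returns the number of documents modified (0 or 1), like the `n` field shown above.
function updateIfRev(id, expectedRev, replacement) {
  const current = store.get(id);
  if (!current || current.rev !== expectedRev) return 0;
  store.set(id, { ...replacement, _id: id, rev: expectedRev + 1 });
  return 1;
}

// Read the current document, apply the edit, and replay if another writer got in first.
// Returns the number of attempts it took.
function editWithReplay(id, edit, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = store.get(id);
    if (!doc) throw new Error("document not found");
    const n = updateIfRev(id, doc.rev, edit({ ...doc }));
    if (n === 1) return attempt;
  }
  throw new Error("too many conflicts");
}
```

The replay re-reads the document before re-applying the edit, so the second attempt builds on whatever the competing writer stored rather than overwriting it blindly.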
Update Operators in Mongo
• Avoid full document replacement by using operators
• Powerful operators such as $inc, $set, $push
• Many operators can be grouped into single atomic update
• More efficient (data over wire, parsing, etc.)
• Use as much as possible
• http://bit.ly/update-operators
Still Need OCC?
A hit counter
{ _id: 1, hits: 5040 }
Edit 1
db.stats.update({ _id: 1 },
{ $set: { hits: 5045 } })
Edit 2
db.stats.update({ _id: 1 },
{ $set: { hits: 5055 } })
Still Need OCC?
Edit 1
db.stats.update({ _id: 1 },
{ $inc: { hits: 5 } })
Edit 2
db.stats.update({ _id: 1 },
{ $inc: { hits: 10 } })
• Sequence of updates might vary
• Outcome always the same
• But what if sequence is important?
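The difference between the two hit-counter slides can be shown with a small in-memory sketch. `applySet` and `applyInc` are hypothetical helpers that only imitate what `$set` and `$inc` do to a single document; they are not driver calls.

```javascript
// Hypothetical in-memory counterparts of $set and $inc (not driver code).
const applySet = (doc, fields) => ({ ...doc, ...fields });
const applyInc = (doc, deltas) => {
  const next = { ...doc };
  for (const [field, delta] of Object.entries(deltas)) {
    next[field] = (next[field] || 0) + delta;
  }
  return next;
};

const original = { _id: 1, hits: 5040 };

// $set: each client writes an absolute total, so the last writer wins.
const setAB = applySet(applySet(original, { hits: 5045 }), { hits: 5055 });
const setBA = applySet(applySet(original, { hits: 5055 }), { hits: 5045 });

// $inc: only deltas are sent, so both orderings end at the same total.
const incAB = applyInc(applyInc(original, { hits: 5 }), { hits: 10 });
const incBA = applyInc(applyInc(original, { hits: 10 }), { hits: 5 });

console.log(setAB.hits, setBA.hits); // the two orderings disagree
console.log(incAB.hits, incBA.hits); // the two orderings agree
```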
Still Need OCC?
• Operators can offset need for concurrency control
• Support for complex atomic manipulation
• Depends on use case
• You’ll need it for
• Opaque changes (e.g. text)
• Complex update logic in app domain (e.g. changing a value affects some calculated fields)
• Sequence is important and can’t be inferred
Update Commands
• Update
• Specify query to match one or more documents
• Use { multi: true } to update multiple documents
• Must call Find() separately if you want a copy of the doc
• FindAndModify
• Update single document only
• Find + Update in single hit (atomic)
• Returns the doc before or after update
• Whole doc or subset
• Upsert (update or insert)
• Important feature. Works with OCC..?
Consistent Update Example
• Have a customer document
• Want to set lastOrderValue and return the previous value
db.customers.findAndModify({
  query: { _id: 16, rev: 45 },
  update: {
    $set: { lastOrderValue: 699 },
    $inc: { rev: 1 }
  },
  new: false
})
Consistent Update Example
• Customer has since been updated, or doesn’t exist
• Client should replay
null
• Intended version of customer successfully updated
• Original version is returned
{ _id: 16, rev: 45, lastOrderValue: 145 }
• Useful if client has got partial information and needs the full document
• A separate Find() could introduce inconsistency
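The two outcomes above (a `null` result versus the pre-update document) can be imitated in memory. This is only a sketch under stated assumptions: `customers` and `findAndModifyRev` are hypothetical stand-ins for the collection and for a findAndModify call with `new: false`, not real driver code.

```javascript
// Hypothetical in-memory stand-in for findAndModify with new: false (not driver code).
const customers = new Map([[16, { _id: 16, rev: 45, lastOrderValue: 145 }]]);

function findAndModifyRev(id, expectedRev, setFields) {
  const before = customers.get(id);
  // No document matches { _id, rev }: return null so the caller knows to replay.
  if (!before || before.rev !== expectedRev) return null;
  customers.set(id, { ...before, ...setFields, rev: before.rev + 1 });
  return before; // the pre-update document, as with new: false
}

// First caller holds the current revision and gets the previous document back.
const first = findAndModifyRev(16, 45, { lastOrderValue: 699 });
// Second caller still holds revision 45, which is now stale.
const stale = findAndModifyRev(16, 45, { lastOrderValue: 700 });
```

Because the match and the write happen in one step, the previous value comes back without a separate read that could observe a different revision.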
Independent Update with Upsert
• Keep stats about customers
• Want to increment numOrders and return new total
• Customer document might not be there
• Independent operation still needs protection
db.customerStats.findAndModify({
  query: { _id: 16 },
  update: {
    $inc: { numOrders: 1, rev: 1 },
    $setOnInsert: { name: "Yann" }
  },
  new: true,
  upsert: true
})
Independent Update with Upsert
• First run, document is created
{ _id: 16, numOrders: 1, rev: 1, name: "Yann" }
• Second run, document is updated
{ _id: 16, numOrders: 2, rev: 2, name: "Yann" }
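The first-run and second-run behaviour can be reproduced with an in-memory sketch. `customerStats` and `upsertIncrement` are hypothetical names; the function only imitates an upsert that combines `$inc` with `$setOnInsert` and returns the new document, and it is not the MongoDB implementation.

```javascript
// Hypothetical in-memory imitation of an upsert with $inc and $setOnInsert (new: true).
const customerStats = new Map();

function upsertIncrement(id, incFields, setOnInsert) {
  const existing = customerStats.get(id);
  // setOnInsert fields are applied only when the document is being created.
  const doc = existing ? { ...existing } : { _id: id, ...setOnInsert };
  for (const [field, delta] of Object.entries(incFields)) {
    doc[field] = (doc[field] || 0) + delta; // a missing field starts from 0
  }
  customerStats.set(id, doc);
  return doc; // post-update document
}

const firstRun = upsertIncrement(16, { numOrders: 1, rev: 1 }, { name: "Yann" });
const secondRun = upsertIncrement(16, { numOrders: 1, rev: 1 }, { name: "ignored" });
```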
Subdocuments
• Common scenario
• e.g. Customer and Orders in single document
• Clients like having everything
• Powerful operators for matching and updating subdocuments
• $elemMatch, $, $addToSet, $push
• Alternatives to “Fat” documents:
• Client-side joins
• Aggregation
• MapReduce
Concurrency Control and Subdocuments
• Patterns described here still work, but might be impractical
• Docs are large
• More collisions
• Solve with scale?
Subdocument Example
• Customer document contains orders
• Want to independently update orders
• Correct order #471 value to £260
{
  _id: 16,
  rev: 20,
  name: "Yann",
  orders: {
    "471": { id: 471, value: 250, rev: 4 }
  }
}
Subdocument Example
db.customers.findAndModify({
  query: { "orders.471.rev": { $lte: 4 } },
  update: {
    $set: { "orders.471.value": 260 },
    $inc: { rev: 1, "orders.471.rev": 1 },
    $setOnInsert: {
      name: "Yann",
      "orders.471.id": 471 }
  },
  new: true,
  upsert: true
})
Subdocument Example
• First run, order updated successfully
• Could create if not exists
{
  _id: 16,
  rev: 21,
  name: "Yann",
  orders: {
    "471": { id: 471, value: 260, rev: 5 }
  }
}
Subdocument Example
• Second conflicting run
• Query didn’t match revision, duplicate document created
{
  _id: ObjectId("533bf88a50dbb55a8a9b9128"),
  rev: 1,
  name: "Yann",
  orders: {
    "471": { id: 471, value: 260, rev: 1 }
  }
}
Subdocument Example
• Solve with unique index (good idea anyway)
db.customers.ensureIndex(
  { "orders.id" : 1 },
  {
    "name" : "orderids",
    "unique" : true
  })
Subdocument Example
Client can handle findAndModify result accordingly:
• Successful update
{ updatedExisting: true }
• New document created
{ updatedExisting: false, n: 1 }
• Conflict, need to replay
{ errmsg: "exception: E11000 duplicate key error index: db.customers.$orderids dup key…" }
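The three outcomes above could be told apart with a small helper. This is a sketch that assumes the client sees result objects shaped like the ones on this slide; `classifyResult` and its return labels are hypothetical, and real drivers may surface a duplicate key error as an exception rather than as an `errmsg` field.

```javascript
// Hypothetical helper: maps a result object (shaped as on the slide) to an outcome label.
function classifyResult(result) {
  // Duplicate key on the unique index: another client won, so replay.
  if (typeof result.errmsg === "string" && result.errmsg.includes("E11000")) {
    return "conflict";
  }
  if (result.updatedExisting === true) return "updated";
  if (result.updatedExisting === false && result.n === 1) return "created";
  return "unknown";
}

console.log(classifyResult({ updatedExisting: true }));          // "updated"
console.log(classifyResult({ updatedExisting: false, n: 1 }));   // "created"
console.log(classifyResult({ errmsg: "exception: E11000 duplicate key error index" })); // "conflict"
```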
Final Words
• Don’t forget deletes
• Gotchas about subdocument structure
orders: [ { id: 471 }, … ]
orders: { "471": { }, … }
orders: { "471": { id: 471 }, … }
• Coming in 2.6.0 stable
$setOnInsert: { _id: .. }
• Sharding..?

More Related Content

What's hot

Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsArun Kejariwal
 
NGINX High-performance Caching
NGINX High-performance CachingNGINX High-performance Caching
NGINX High-performance CachingNGINX, Inc.
 
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1Etsuji Nakai
 
Using openCV 2.0 with Dev C++
Using openCV 2.0 with Dev C++Using openCV 2.0 with Dev C++
Using openCV 2.0 with Dev C++Wei-Wen Hsu
 
A Deep Dive into Spring Application Events
A Deep Dive into Spring Application EventsA Deep Dive into Spring Application Events
A Deep Dive into Spring Application EventsVMware Tanzu
 
Deep Dive into Kubernetes - Part 1
Deep Dive into Kubernetes - Part 1Deep Dive into Kubernetes - Part 1
Deep Dive into Kubernetes - Part 1Imesh Gunaratne
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...confluent
 
Slides du meetup devops aix-marseille d'ocotbre 2023
Slides du meetup devops aix-marseille d'ocotbre 2023Slides du meetup devops aix-marseille d'ocotbre 2023
Slides du meetup devops aix-marseille d'ocotbre 2023Frederic Leger
 
FreeSWITCH Cluster by K8s
FreeSWITCH Cluster by K8sFreeSWITCH Cluster by K8s
FreeSWITCH Cluster by K8sChien Cheng Wu
 
Fallacies of distributed computing with Kubernetes on AWS
Fallacies of distributed computing with Kubernetes on AWSFallacies of distributed computing with Kubernetes on AWS
Fallacies of distributed computing with Kubernetes on AWSRaffaele Di Fazio
 
Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding KubernetesTu Pham
 
Container Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesContainer Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesWill Hall
 
Cluster management with Kubernetes
Cluster management with KubernetesCluster management with Kubernetes
Cluster management with KubernetesSatnam Singh
 
Introduction to the Disruptor
Introduction to the DisruptorIntroduction to the Disruptor
Introduction to the DisruptorTrisha Gee
 
Service-mesh options with Linkerd, Consul, Istio and AWS AppMesh
Service-mesh options with Linkerd, Consul, Istio and AWS AppMeshService-mesh options with Linkerd, Consul, Istio and AWS AppMesh
Service-mesh options with Linkerd, Consul, Istio and AWS AppMeshChristian Posta
 
Introduction to InfluxDB and TICK Stack
Introduction to InfluxDB and TICK StackIntroduction to InfluxDB and TICK Stack
Introduction to InfluxDB and TICK StackAhmed AbouZaid
 
Unique ID generation in distributed systems
Unique ID generation in distributed systemsUnique ID generation in distributed systems
Unique ID generation in distributed systemsDave Gardner
 
Ansible를 통한 컨테이너 환경 자동화
Ansible를 통한 컨테이너 환경 자동화Ansible를 통한 컨테이너 환경 자동화
Ansible를 통한 컨테이너 환경 자동화Opennaru, inc.
 

What's hot (20)

InfluxDB & Grafana
InfluxDB & GrafanaInfluxDB & Grafana
InfluxDB & Grafana
 
Real Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and SystemsReal Time Analytics: Algorithms and Systems
Real Time Analytics: Algorithms and Systems
 
NGINX High-performance Caching
NGINX High-performance CachingNGINX High-performance Caching
NGINX High-performance Caching
 
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1
 
Using openCV 2.0 with Dev C++
Using openCV 2.0 with Dev C++Using openCV 2.0 with Dev C++
Using openCV 2.0 with Dev C++
 
A Deep Dive into Spring Application Events
A Deep Dive into Spring Application EventsA Deep Dive into Spring Application Events
A Deep Dive into Spring Application Events
 
Deep Dive into Kubernetes - Part 1
Deep Dive into Kubernetes - Part 1Deep Dive into Kubernetes - Part 1
Deep Dive into Kubernetes - Part 1
 
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
Apache Kafka vs. Traditional Middleware (Kai Waehner, Confluent) Frankfurt 20...
 
Slides du meetup devops aix-marseille d'ocotbre 2023
Slides du meetup devops aix-marseille d'ocotbre 2023Slides du meetup devops aix-marseille d'ocotbre 2023
Slides du meetup devops aix-marseille d'ocotbre 2023
 
FreeSWITCH Cluster by K8s
FreeSWITCH Cluster by K8sFreeSWITCH Cluster by K8s
FreeSWITCH Cluster by K8s
 
Fallacies of distributed computing with Kubernetes on AWS
Fallacies of distributed computing with Kubernetes on AWSFallacies of distributed computing with Kubernetes on AWS
Fallacies of distributed computing with Kubernetes on AWS
 
Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding Kubernetes
 
Container Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and KubernetesContainer Orchestration with Docker Swarm and Kubernetes
Container Orchestration with Docker Swarm and Kubernetes
 
Cluster management with Kubernetes
Cluster management with KubernetesCluster management with Kubernetes
Cluster management with Kubernetes
 
Introduction to the Disruptor
Introduction to the DisruptorIntroduction to the Disruptor
Introduction to the Disruptor
 
Service-mesh options with Linkerd, Consul, Istio and AWS AppMesh
Service-mesh options with Linkerd, Consul, Istio and AWS AppMeshService-mesh options with Linkerd, Consul, Istio and AWS AppMesh
Service-mesh options with Linkerd, Consul, Istio and AWS AppMesh
 
Introduction to InfluxDB and TICK Stack
Introduction to InfluxDB and TICK StackIntroduction to InfluxDB and TICK Stack
Introduction to InfluxDB and TICK Stack
 
Introduction to influx db
Introduction to influx dbIntroduction to influx db
Introduction to influx db
 
Unique ID generation in distributed systems
Unique ID generation in distributed systemsUnique ID generation in distributed systems
Unique ID generation in distributed systems
 
Ansible를 통한 컨테이너 환경 자동화
Ansible를 통한 컨테이너 환경 자동화Ansible를 통한 컨테이너 환경 자동화
Ansible를 통한 컨테이너 환경 자동화
 

Similar to Concurrency Patterns with MongoDB

Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6MongoDB
 
Novedades de MongoDB 3.6
Novedades de MongoDB 3.6Novedades de MongoDB 3.6
Novedades de MongoDB 3.6MongoDB
 
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveTechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveIntergen
 
WinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at ParseTravis Redman
 
Advanced Benchmarking at Parse
Advanced Benchmarking at ParseAdvanced Benchmarking at Parse
Advanced Benchmarking at ParseMongoDB
 
What's new in MongoDB 3.6?
What's new in MongoDB 3.6?What's new in MongoDB 3.6?
What's new in MongoDB 3.6?MongoDB
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward
 
Eventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real WorldEventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real WorldBeyondTrees
 
Webinar: What's new in the .NET Driver
Webinar: What's new in the .NET DriverWebinar: What's new in the .NET Driver
Webinar: What's new in the .NET DriverMongoDB
 
MSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance AppsMSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance AppsMarc Obaldo
 
No REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIsNo REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIsC4Media
 
Lessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDBLessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDBOren Eini
 
Designing your API Server for mobile apps
Designing your API Server for mobile appsDesigning your API Server for mobile apps
Designing your API Server for mobile appsMugunth Kumar
 
Cost-based Query Optimization in Hive
Cost-based Query Optimization in HiveCost-based Query Optimization in Hive
Cost-based Query Optimization in HiveDataWorks Summit
 
Cost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveCost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveJulian Hyde
 
Grails patterns and practices
Grails patterns and practicesGrails patterns and practices
Grails patterns and practicespaulbowler
 
MongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDBMongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDBMongoDB
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsMatt Kuklinski
 
Improving the Quality of Existing Software
Improving the Quality of Existing SoftwareImproving the Quality of Existing Software
Improving the Quality of Existing SoftwareSteven Smith
 

Similar to Concurrency Patterns with MongoDB (20)

Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6
 
Novedades de MongoDB 3.6
Novedades de MongoDB 3.6Novedades de MongoDB 3.6
Novedades de MongoDB 3.6
 
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep DiveTechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
TechEd AU 2014: Microsoft Azure DocumentDB Deep Dive
 
WinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release Pipelines
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at Parse
 
Advanced Benchmarking at Parse
Advanced Benchmarking at ParseAdvanced Benchmarking at Parse
Advanced Benchmarking at Parse
 
What's new in MongoDB 3.6?
What's new in MongoDB 3.6?What's new in MongoDB 3.6?
What's new in MongoDB 3.6?
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
 
Eventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real WorldEventually Elasticsearch: Eventual Consistency in the Real World
Eventually Elasticsearch: Eventual Consistency in the Real World
 
Webinar: What's new in the .NET Driver
Webinar: What's new in the .NET DriverWebinar: What's new in the .NET Driver
Webinar: What's new in the .NET Driver
 
MSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance AppsMSFT Dumaguete 061616 - Building High Performance Apps
MSFT Dumaguete 061616 - Building High Performance Apps
 
No REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIsNo REST - Architecting Real-time Bulk Async APIs
No REST - Architecting Real-time Bulk Async APIs
 
Lessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDBLessons from the Trenches - Building Enterprise Applications with RavenDB
Lessons from the Trenches - Building Enterprise Applications with RavenDB
 
Designing your API Server for mobile apps
Designing your API Server for mobile appsDesigning your API Server for mobile apps
Designing your API Server for mobile apps
 
Cost-based Query Optimization in Hive
Cost-based Query Optimization in HiveCost-based Query Optimization in Hive
Cost-based Query Optimization in Hive
 
Cost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveCost-based query optimization in Apache Hive
Cost-based query optimization in Apache Hive
 
Grails patterns and practices
Grails patterns and practicesGrails patterns and practices
Grails patterns and practices
 
MongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDBMongoDB World 2019: Scaling Real-time Collaboration with MongoDB
MongoDB World 2019: Scaling Real-time Collaboration with MongoDB
 
Boosting the Performance of your Rails Apps
Boosting the Performance of your Rails AppsBoosting the Performance of your Rails Apps
Boosting the Performance of your Rails Apps
 
Improving the Quality of Existing Software
Improving the Quality of Existing SoftwareImproving the Quality of Existing Software
Improving the Quality of Existing Software
 

More from Yann Cluchey

Implementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchImplementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchYann Cluchey
 
Annotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchAnnotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchYann Cluchey
 
Machine Learning and the Elastic Stack
Machine Learning and the Elastic StackMachine Learning and the Elastic Stack
Machine Learning and the Elastic StackYann Cluchey
 
Elasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowElasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowYann Cluchey
 
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...Yann Cluchey
 
Lightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaLightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaYann Cluchey
 

More from Yann Cluchey (6)

Implementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with ElasticsearchImplementing Keyword Sort with Elasticsearch
Implementing Keyword Sort with Elasticsearch
 
Annotated Text feature in Elasticsearch
Annotated Text feature in ElasticsearchAnnotated Text feature in Elasticsearch
Annotated Text feature in Elasticsearch
 
Machine Learning and the Elastic Stack
Machine Learning and the Elastic StackMachine Learning and the Elastic Stack
Machine Learning and the Elastic Stack
 
Elasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindowElasticsearch at AffiliateWindow
Elasticsearch at AffiliateWindow
 
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
GOTO Aarhus 2014: Making Enterprise Data Available in Real Time with elastics...
 
Lightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at CogentaLightning talk: elasticsearch at Cogenta
Lightning talk: elasticsearch at Cogenta
 

Recently uploaded

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 

Recently uploaded (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 

Concurrency Patterns with MongoDB

  • 2.
      • Real-time retail intelligence
      • Gather products and prices from web
      • MongoDB in production
      • Millions of updates per day, 3K/s peak
      • Data in SQL, Mongo, ElasticSearch
  • 3. Concurrency Patterns: Why?
      • MongoDB: atomic updates, no transactions
      • Need to ensure consistency & correctness
      • What are my options with Mongo?
      • Shortcuts
      • Different approaches
  • 4. Concurrency Control Strategies
      • Pessimistic
          • Suited for frequent conflicts
          • http://bit.ly/two-phase-commits
      • Optimistic
          • Efficient when conflicts are rare
          • http://bit.ly/isolate-sequence
      • Multi-version
          • All versions stored, client resolves conflict
          • e.g. CouchDB
  • 5. Optimistic Concurrency Control (OCC)
      • No locks
      • Prevent dirty writes
      • Uses timestamp or a revision number
      • Client checks & replays transaction
  • 6. Example
      Original
        { _id: 23, comment: "The quick brown fox…" }
      Edit 1
        { _id: 23, comment: "The quick brown fox prefers SQL" }
      Edit 2
        { _id: 23, comment: "The quick brown fox prefers MongoDB" }
  • 7. Example
      Edit 1
        db.comments.update(
          { _id: 23 },
          { _id: 23, comment: "The quick brown fox prefers SQL" })
      Edit 2
        db.comments.update(
          { _id: 23 },
          { _id: 23, comment: "The quick brown fox prefers MongoDB" })
      Outcome: One update is lost, other might be wrong
  • 8. OCC Example
      Original
        { _id: 23, rev: 1, comment: "The quick brown fox…" }
      Update a specific revision (edit 1)
        db.comments.update(
          { _id: 23, rev: 1 },
          { _id: 23, rev: 2, comment: "The quick brown fox prefers SQL" })
  • 9. OCC Example
      Edit 2
        db.comments.update(
          { _id: 23, rev: 1 },
          { _id: 23, rev: 2, comment: "The quick brown fox prefers MongoDB" })
      ..fails
        { updatedExisting: false, n: 0, err: null, ok: 1 }
      • Caveat: Only works if all clients follow convention
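
The check-and-replay cycle from slides 5 to 9 can be sketched without a server. What follows is plain JavaScript over an in-memory `Map` standing in for `db.comments`; `replaceIfRev` and `editWithReplay` are hypothetical helpers invented for this illustration, not MongoDB or driver APIs, and the replayed comment text is made up.

```javascript
// In-memory stand-in for the comments collection (not a MongoDB API).
const comments = new Map([
  [23, { _id: 23, rev: 1, comment: "The quick brown fox…" }],
]);

// Mimics db.comments.update({ _id: id, rev: expectedRev }, newDoc):
// the replacement is applied only if the stored revision still matches.
function replaceIfRev(store, id, expectedRev, newDoc) {
  const current = store.get(id);
  if (!current || current.rev !== expectedRev) {
    return { updatedExisting: false, n: 0 };
  }
  store.set(id, newDoc);
  return { updatedExisting: true, n: 1 };
}

// Check-and-replay: read, build the next revision, write against the
// revision that was read, and start over if another writer got in first.
function editWithReplay(store, id, change, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = store.get(id);
    if (!doc) return null;
    const next = { ...doc, ...change(doc), rev: doc.rev + 1 };
    if (replaceIfRev(store, id, doc.rev, next).n === 1) return next;
  }
  throw new Error("gave up after repeated conflicts");
}

// Both editors start from revision 1.
const seen = comments.get(23);
const edit1 = replaceIfRev(comments, 23, seen.rev, {
  _id: 23, rev: 2, comment: "The quick brown fox prefers SQL",
});
const edit2 = replaceIfRev(comments, 23, seen.rev, {
  _id: 23, rev: 2, comment: "The quick brown fox prefers MongoDB",
});

// Edit 2 was rejected, so its author replays it against the latest revision.
const replayed = editWithReplay(comments, 23, (doc) => ({
  comment: doc.comment + ", but also likes MongoDB",
}));
```

Against a real collection the conditional write would be the `update` call from slide 8, and the client would inspect `n` in the result exactly as the sketch does.
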
  • 10. Update Operators in Mongo
      • Avoid full document replacement by using operators
      • Powerful operators such as $inc, $set, $push
      • Many operators can be grouped into single atomic update
      • More efficient (data over wire, parsing, etc.)
      • Use as much as possible
      • http://bit.ly/update-operators
  • 11. Still Need OCC?
      A hit counter
        { _id: 1, hits: 5040 }
      Edit 1
        db.stats.update({ _id: 1 }, { $set: { hits: 5045 } })
      Edit 2
        db.stats.update({ _id: 1 }, { $set: { hits: 5055 } })
  • 12. Still Need OCC?
      Edit 1
        db.stats.update({ _id: 1 }, { $inc: { hits: 5 } })
      Edit 2
        db.stats.update({ _id: 1 }, { $inc: { hits: 10 } })
      • Sequence of updates might vary
      • Outcome always the same
      • But what if sequence is important?
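
Why the `$inc` version of the hit counter is order-independent while the `$set` version is not can be checked with a few lines of plain JavaScript. The `set`, `inc` and `applyAll` functions below are hypothetical in-memory equivalents written for this sketch; they only imitate what the operators do to a single field.

```javascript
// Hypothetical in-memory equivalents of { $set: { hits: v } } and { $inc: { hits: d } }.
const set = (value) => (doc) => ({ ...doc, hits: value });
const inc = (delta) => (doc) => ({ ...doc, hits: doc.hits + delta });

// Apply a sequence of updates, in order, to a starting document.
const applyAll = (doc, updates) => updates.reduce((d, u) => u(d), doc);

const start = { _id: 1, hits: 5040 };

// $set: the last writer wins, so the result depends on arrival order.
const setForward = applyAll(start, [set(5045), set(5055)]).hits;   // 5055
const setReversed = applyAll(start, [set(5055), set(5045)]).hits;  // 5045

// $inc: additions commute, so both orders give the same total.
const incForward = applyAll(start, [inc(5), inc(10)]).hits;        // 5055
const incReversed = applyAll(start, [inc(10), inc(5)]).hits;       // 5055
```
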
  • 13. Still Need OCC?
      • Operators can offset need for concurrency control
      • Support for complex atomic manipulation
      • Depends on use case
      • You’ll need it for
          • Opaque changes (e.g. text)
          • Complex update logic in app domain (e.g. changing a value affects some calculated fields)
          • Sequence is important and can’t be inferred
  • 14. Update Commands
      • Update
          • Specify query to match one or more documents
          • Use { multi: true } to update multiple documents
          • Must call Find() separately if you want a copy of the doc
      • FindAndModify
          • Update single document only
          • Find + Update in single hit (atomic)
          • Returns the doc before or after update
          • Whole doc or subset
      • Upsert (update or insert)
          • Important feature. Works with OCC..?
  • 15. Consistent Update Example
      • Have a customer document
      • Want to set the LastOrderValue and return the previous value
        db.customers.findAndModify({
          query: { _id: 16, rev: 45 },
          update: {
            $set: { lastOrderValue: 699 },
            $inc: { rev: 1 } },
          new: false })
  • 16. Consistent Update Example
      • Customer has since been updated, or doesn’t exist
          • findAndModify returns null
          • Client should replay
      • Intended version of customer successfully updated
          • Original version is returned
            { _id: 16, rev: 45, lastOrderValue: 145 }
      • Useful if client has got partial information and needs the full document
          • A separate Find() could introduce inconsistency
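
The two outcomes on slide 16 (the previous document on success, `null` on a revision mismatch) can be imitated in memory. This is a sketch only: `findAndModifyByRev` is a hypothetical stand-in for the `findAndModify` call on slide 15 with `new: false`, and the second order value of 700 is invented for the example.

```javascript
// In-memory stand-in for the customers collection (not a MongoDB API).
const customers = new Map([[16, { _id: 16, rev: 45, lastOrderValue: 145 }]]);

// Imitates findAndModify({ query: { _id, rev }, update: { $set: fields,
// $inc: { rev: 1 } }, new: false }): match on id AND revision, apply the
// change, and hand back the document as it was BEFORE the update.
function findAndModifyByRev(store, id, rev, fields) {
  const before = store.get(id);
  if (!before || before.rev !== rev) return null; // stale revision or missing doc
  store.set(id, { ...before, ...fields, rev: before.rev + 1 });
  return before;
}

// First writer knows revision 45: succeeds and receives the previous value.
const previous = findAndModifyByRev(customers, 16, 45, { lastOrderValue: 699 });

// Second writer still believes the revision is 45: gets null and must replay.
const conflict = findAndModifyByRev(customers, 16, 45, { lastOrderValue: 700 });
```

Because the read and the write happen in one step, the value in `previous` is guaranteed to be the one that was overwritten, which is the point slide 16 makes about a separate `Find()`.
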
  • 17. Independent Update with Upsert
      • Keep stats about customers
      • Want to increment NumOrders and return new total
      • Customer document might not be there
      • Independent operation still needs protection
        db.customerStats.findAndModify({
          query: { _id: 16 },
          update: {
            $inc: { numOrders: 1, rev: 1 },
            $setOnInsert: { name: "Yann" } },
          new: true,
          upsert: true })
  • 18. Independent Update with Upsert
      • First run, document is created
        { _id: 16, numOrders: 1, rev: 1, name: "Yann" }
      • Second run, document is updated
        { _id: 16, numOrders: 2, rev: 2, name: "Yann" }
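
The create-then-update behaviour on slides 17 and 18 can be sketched in memory. `upsertOrderCount` is a hypothetical helper that imitates the combination of `$inc`, `$setOnInsert`, `upsert: true` and `new: true` for this one document shape; the name "Someone Else" is made up to show that insert-only fields are ignored once the document exists.

```javascript
// In-memory stand-in for the customerStats collection (not a MongoDB API).
const customerStats = new Map();

// Imitates findAndModify with upsert: true and new: true:
//  - $setOnInsert fields are applied only when the document is created
//  - $inc on a missing field starts from 0, so the first run yields 1
function upsertOrderCount(store, id, setOnInsert) {
  const existing = store.get(id);
  const base = existing ?? { _id: id, ...setOnInsert, numOrders: 0, rev: 0 };
  const next = { ...base, numOrders: base.numOrders + 1, rev: base.rev + 1 };
  store.set(id, next);
  return next; // new: true returns the document after the update
}

const firstRun = upsertOrderCount(customerStats, 16, { name: "Yann" });
const secondRun = upsertOrderCount(customerStats, 16, { name: "Someone Else" });
```
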
  • 19. Subdocuments
      • Common scenario
          • e.g. Customer and Orders in single document
          • Clients like having everything
      • Powerful operators for matching and updating subdocuments
          • $elemMatch, $, $addToSet, $push
      • Alternatives to “Fat” documents:
          • Client-side joins
          • Aggregation
          • MapReduce
  • 20. Concurrency Control and Subdocuments
      • Patterns described here still work, but might be impractical
          • Docs are large
          • More collisions
      • Solve with scale?
  • 21. Subdocument Example
      • Customer document contains orders
      • Want to independently update orders
      • Correct order #471 value to £260
        { _id: 16, rev: 20, name: "Yann",
          orders: { "471": { id: 471, value: 250, rev: 4 } } }
  • 22. Subdocument Example
        db.customers.findAndModify({
          query: { "orders.471.rev": { $lte: 4 } },
          update: {
            $set: { "orders.471.value": 260 },
            $inc: { rev: 1, "orders.471.rev": 1 },
            $setOnInsert: { name: "Yann", "orders.471.id": 471 } },
          new: true,
          upsert: true })
  • 23. Subdocument Example
      • First run, order updated successfully
      • Could create if not exists
        { _id: 16, rev: 21, name: "Yann",
          orders: { "471": { id: 471, value: 260, rev: 5 } } }
  • 24. Subdocument Example
      • Second conflicting run
      • Query didn’t match revision, duplicate document created
        { _id: ObjectId("533bf88a50dbb55a8a9b9128"), rev: 1, name: "Yann",
          orders: { "471": { id: 471, value: 260, rev: 1 } } }
  • 25. Subdocument Example
      • Solve with unique index (good idea anyway)
        db.customers.ensureIndex(
          { "orders.id" : 1 },
          { "name" : "orderids", "unique" : true })
  • 26. Subdocument Example
      Client can handle findAndModify result accordingly:
      • Successful update
        { updatedExisting: true }
      • New document created
        { updatedExisting: false, n: 1 }
      • Conflict, need to replay
        { errmsg: "exception: E11000 duplicate key error index: db.customers.$orderids dup key…" }
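
The three-way branch on slide 26 might look like this on the client. `classifyOutcome` is a hypothetical helper, and the field names (`updatedExisting`, `n`, `errmsg`) are taken from the slide; the exact result shape depends on the driver and server version, so treat this as a sketch rather than a driver contract.

```javascript
// Map a findAndModify result (shape as shown on slide 26) to a client action.
function classifyOutcome(result) {
  // A duplicate-key error from the unique index means another writer won.
  if (typeof result.errmsg === "string" && result.errmsg.includes("E11000")) {
    return "conflict"; // re-read and replay
  }
  if (result.updatedExisting === true) return "updated";
  if (result.updatedExisting === false && result.n === 1) return "created";
  return "unknown";
}

const outcomes = [
  classifyOutcome({ updatedExisting: true }),
  classifyOutcome({ updatedExisting: false, n: 1 }),
  classifyOutcome({
    errmsg: "exception: E11000 duplicate key error index: db.customers.$orderids dup key…",
  }),
];
```
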
  • 27. Final Words
      • Don’t forget deletes
      • Gotchas about subdocument structure
        orders: [ { id: 471 }, … ]
        orders: { "471": { }, … }
        orders: { "471": { id: 471 }, … }
      • Coming in 2.6.0 stable
        $setOnInsert: { _id: .. }
      • Sharding..?

Editor's Notes

  1. 4xSQL, 1xMongo, 5xElastic
  2. No transactions. Different databases have different features. Cheating is fun. How can we avoid problems entirely? Efficiency is key. Understand what’s achievable in a single update
  3. All doable in MongoDB. Optimistic usually the best option for typical MongoDB project. How to roll yo
  4. Two users simultaneously editing
  5. Not excitingly clever, or efficient..?
  6. Not excitingly clever, or efficient..?
  7. Not excitingly clever, or efficient..?
  8. They are your friends, go play wit
  9. Two users simultaneously editing
  10. Bank balance!
  11. Not going to get into that…
  12. Not going to get into that…
  13. Data is mastered elsewhere