SlideShare a Scribd company logo
MongoDB 3.2 – Document
Validation
LondonMUG–15/9/2015
Agenda
Value of flexible schemas
Downside of flexible schemas
What 3.2 adds
What 3.2 doesn’t add
Options
Production lifecycle to add Document Validation
Demo
Power of flexible schemas
RDBMS MongoDB
{
_id: ObjectId("4c4ba5e5e8aabf3"),
employee_name: {First: "Billy",
Last: "Fish"},
department: "Engineering",
title: "Aquarium design",
pay_band: "C",
benefits: [
{ type: "Health",
plan: "PPO Plus" },
{ type: "Dental",
plan: "Standard" }
]
}
Power of flexible schemas
{
_id:
ObjectId("4c4ba5e5e8aabf3"),
employee_name: {First: "Billy",
Last: "Fish"},
department: "Engineering",
title: "Aquarium design",
pay_band: "C",
benefits: [
{ type: "Health",
plan: "PPO Plus" },
{ type: "Dental",
plan: "Standard" }
]
}
• Relational
– Up-front schema definition phase
– Adding new column takes time to
develop & *lots* of time to roll out in
production
– Existing rows must be reformatted
• MongoDB:
– Start hacking your app right away
– Want to store new type of information?
• Just start adding it
• If it doesn’t apply to all instances
– just leave it out
Power of flexible schemas
{
_id:
ObjectId("4c4ba5e5e8aabf3"),
employee_name: {First: "Billy”,
Middle: "The",
Last: "Fish"},
department: "Engineering",
title: "Aquarium design",
pay_band: "C",
benefits: [
{ type: "Health",
plan: "PPO Plus" },
{ type: "Dental",
plan: "Standard" }
]
}
• Relational
– Up-front schema definition phase
– Adding new column takes time to
develop & *lots* of time to roll out in
production
– Existing rows must be reformatted
• MongoDB:
– Start hacking your app right away
– Want to store new type of information?
• Just start adding it
• If it doesn’t apply to all instances
– just leave it out
6
Why validate documents?
• Many people writing to the database
– Many developers
– Many teams
– Many companies
– Many development languages
• Multiple applications want to exploit the same data, need to agree on what’s there
• Usually a core subset of keys you always want to be there
• For any key may care about:
– Existence
– Type
– Format
– Value
– Existence in combination with other keys (e.g. need a phone number or an email address)
7
Data Access Layer
Why validate documents?
• Good to have a ‘contract’ for what’s in a collection
– When reading from the “subscriber” collection, I
know that every document will include a
subscription plan name:
db.subscriptions.find(
name: "Billy Fish",
$and:[{plan:{$exists: true}},
{plan:{$type: 2}}]})
• < MongoDB 3.2, this is an application responsibility
– 3rd party tools like Mongoose can help
• Best implemented as a layer between the application
and driver (Data Access Layer)
App App App
Driver
Database
Get the database to do the work!
9
Document Validation - MongoDB 3.2
• Configure document validation within the database
• Use familiar MongoDB query expressions
• Automatically tests each insert/update; delivers warning or error if a rule is broken
• You choose what keys to validate and how
db.runCommand({
collMod: "contacts",
validator: {
$and: [
{year_of_birth: {$lte: 1994}},
{$or: [
{phone: { $type: ”string"}},
{email: { $type: ”string"}}
]}]
}})
10
Document Validation - MongoDB 3.2
db.getCollectionInfos({name:"contacts"})
[
{
"name": "contacts",
"options": {
"validator": {
"$and": [
{"year_of_birth": {
"$lte": 1994}},
{"$or": [
{"phone": {"$type": ”string"}},
{"email": {"$type": ”string"}}
]}
]},
"validationLevel": "strict",
"validationAction": "error"
}
}
]
11
Document Validation - MongoDB 3.2
db.contacts.insert(
name: "Fred",
email: "fred@clusterdb.com",
year_of_birth: 2012
})
Document failed validation
WriteResult({
"nInserted": 0,
"writeError": {
"code": 121,
"errmsg": "Document failed validation”}})
12
Document Validation - MongoDB 3.2
db.runCommand({collMod: "bleh",
validator: {
rogue: {$exists:false}
}
}
});
13
Document Validation - MongoDB 3.2
• Can check most things that work with a find expression
– Existence
– Non-existence
– Data type of values
– <, <=, >, >=, ==, !=
– AND, OR
– Regular expressions
– Some geospatial operators (e.g. $geoWithin & $geoIntersects)
– …
14
MongoDB 3.2 Limitations
• Generic error message
– Application needs to figure out what part of the constraints failed
• Cannot compare 1 key with another
– Either within the same document or between documents
• Some operations not supported:
– $text, $geoNear, $near, $nearSphere, $where
• Applications responsibility to bring legacy data into compliance with new
rules
– No audit or tools
15
What validations remain in the app
• User interface
– Don’t have the database be the first place to detect that an email is poorly
formatted
• Any validations that involve comparisons with
– Other data in the same document
– Data from other documents
– External information (e.g. time of day)
• Semantic checks that are designed to fail frequently
– e.g. user is in wrong country to use this service
– Database should typically be testing for coding errors rather than implementing
your business logic
• Determining why the database rejected a document in order to provide a meaningful
error to the user
16
Where MongoDB Validation excels(vs.RDBMS)
• Simple
– Use familiar search expressions
– No need for stored procedures
• Flexible
– Only enforced on mandatory parts of the schema
– Can start adding new data at any point and then add validation later if needed
• Practical to deploy
– Simple to role out new rules across thousands of production servers
• Light weight
– Negligible impact to performance
17
Cleaning up legacy data
• Validator does not check if existing
documents in the collection meet the
new validation rules
• User/app can execute a query to
identify & update any document
which don’t meet the new rules
– Use $nor on the full expression
• Be cautious about system impacts:
– Could push working data set out of
memory
– Extra load if many documents need
to be updated
– Execute on secondary
secondary> db.runCommand({collMod: "bleh",
validator: {
a: {$lt:4}
}
});
secondary> db.bleh.find({
a:{$not:{$lt:4}}}).count()
secondary> db.bleh.update(
{a:{$not:{$lt:4}}},
{$set:{a:3}},
{multi:true})
18
Controlling validation
validationLevel
off moderate strict
validationAction
warn
No checks
Warn on non-compliance for
inserts & updates to existing
valid documents. Updates to
existing invalid docs OK.
Warn on any non-compliant
document for any insert or
update.
error
No checks
Reject on non-compliant
inserts & updates to existing
valid documents. Updates to
existing invalid docs OK.
Reject any non-compliant
document for any insert or
update.
DEFAULT
19
Controlling validation
• Set behavior:
db.bleh.runCommand("collMod",
{validationLevel: "moderate",
validationAction: "warn"})
– Note that the warnings are written to the log
• Override for a single operation (not working in MongoDB 3.1.7):
db.bleh.insert({a:999},{bypassDocumentValidation:true})
20
Lifecycle
Hacking
(Day one)
•No document validation
•Release quick & often
Analyze De facto
Schema
•MongoDB Scout
Add document
validation rules
•Query & fix existing docs
•Log any new documents
that break rules:
{validationLevel:
"moderate",
validationAction:
"warn}
Fix application
•Follow all schema rules
•When no new problems
being logged, enter strict
mode:
{validationLevel:
”strict",
validationAction:
”error}
Application uses
new data
(Application
evolves/additional
app)
•If not mandated, stop
here
?
Analyze De-facto
Schema
•MongoDB Scout
Add document
validation rules
•Query & fix existing docs
•Log any new documents
that break rules:
{validationLevel:
"moderate",
validationAction:
"warn}
Fix application
•Follow all schema rules
•When no new problems
being logged, enter strict
mode:
{validationLevel:
”strict",
validationAction:
”error}
21
Versioning of Validations (optional)
• Application can lazily update documents with an older version or with no version set
at all
db.runCommand({
collMod: "contacts",
validator:
{$or: [{version: {"$exists": false}},
{version: 1,
$and: [{Name: {"$exists": true}}]
},
{version: 2,
$and: [{Name: {"$type": ”string"}}]
}
]
}
})
Demo
Next Steps
• “Document Validation and What Dynamic Schema Means” – Eliot Horowitz
– http://www.eliothorowitz.com/blog/2015/09/11/document-validation-and-what-
dynamic-schema-means/
• “Bulletproof Data Management” – MongoDB World 2015
– https://www.mongodb.com/presentations/data-management-3-bulletproof-data-
management
• Documentation
– http://docs.mongodb.org/manual/release-notes/3.1-dev-series/#document-
validation
• Not yet ready for production but download and try MongoDB 3.1!
– https://www.mongodb.org/downloads#development
• Feedback
– https://jira.mongodb.org/
MongoDB Days 2015
October 6, 2015
October 20, 2015
November 5, 2015
December 2, 2015
France
Germany
UK
Silicon Valley
25
London MUG - Plans
• Would like to have a regular cadence
– 4-6 weeks?
• What topics would you like to see covered?
• Anyone interested in running these sessions (with support from MongoDB
team)

More Related Content

What's hot

The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
DataStax
 
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax
 
Data Management 3: Bulletproof Data Management
Data Management 3: Bulletproof Data ManagementData Management 3: Bulletproof Data Management
Data Management 3: Bulletproof Data Management
MongoDB
 
CouchDB : More Couch
CouchDB : More CouchCouchDB : More Couch
CouchDB : More Couch
delagoya
 
Document Validation in MongoDB 3.2
Document Validation in MongoDB 3.2Document Validation in MongoDB 3.2
Document Validation in MongoDB 3.2
MongoDB
 
MongoDB Best Practices for Developers
MongoDB Best Practices for DevelopersMongoDB Best Practices for Developers
MongoDB Best Practices for Developers
Moshe Kaplan
 
Think Distributed: The Hazelcast Way
Think Distributed: The Hazelcast WayThink Distributed: The Hazelcast Way
Think Distributed: The Hazelcast Way
Rahul Gupta
 
Log analysis with elastic stack
Log analysis with elastic stackLog analysis with elastic stack
Log analysis with elastic stack
Bangladesh Network Operators Group
 
Hermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDBHermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDB
MongoDB
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
DataStax
 
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...
DataStax
 
Hazelcast
HazelcastHazelcast
Hazelcast
oztalip
 
What's the Scoop on Hadoop? How It Works and How to WORK IT!
What's the Scoop on Hadoop? How It Works and How to WORK IT!What's the Scoop on Hadoop? How It Works and How to WORK IT!
What's the Scoop on Hadoop? How It Works and How to WORK IT!
MongoDB
 
Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016
Dave Stokes
 
Time series databases
Time series databasesTime series databases
Time series databases
Source Ministry
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax
 
What Your Database Query is Really Doing
What Your Database Query is Really DoingWhat Your Database Query is Really Doing
What Your Database Query is Really Doing
Dave Stokes
 
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence ArchitectureMongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
MongoDB
 
Php &amp; my sql - how do pdo, mysq-li, and x devapi do what they do
Php &amp; my sql  - how do pdo, mysq-li, and x devapi do what they doPhp &amp; my sql  - how do pdo, mysq-li, and x devapi do what they do
Php &amp; my sql - how do pdo, mysq-li, and x devapi do what they do
Dave Stokes
 

What's hot (20)

The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
The Promise and Perils of Encrypting Cassandra Data (Ameesh Divatia, Baffle, ...
 
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
DataStax | Building a Spark Streaming App with DSE File System (Rocco Varela)...
 
Data Management 3: Bulletproof Data Management
Data Management 3: Bulletproof Data ManagementData Management 3: Bulletproof Data Management
Data Management 3: Bulletproof Data Management
 
CouchDB : More Couch
CouchDB : More CouchCouchDB : More Couch
CouchDB : More Couch
 
Document Validation in MongoDB 3.2
Document Validation in MongoDB 3.2Document Validation in MongoDB 3.2
Document Validation in MongoDB 3.2
 
MongoDB Best Practices for Developers
MongoDB Best Practices for DevelopersMongoDB Best Practices for Developers
MongoDB Best Practices for Developers
 
Think Distributed: The Hazelcast Way
Think Distributed: The Hazelcast WayThink Distributed: The Hazelcast Way
Think Distributed: The Hazelcast Way
 
Log analysis with elastic stack
Log analysis with elastic stackLog analysis with elastic stack
Log analysis with elastic stack
 
Hermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDBHermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDB
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
 
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...
Oracle: Let My People Go! (Shu Zhang, Ilya Sokolov, Symantec) | Cassandra Sum...
 
Hazelcast
HazelcastHazelcast
Hazelcast
 
What's the Scoop on Hadoop? How It Works and How to WORK IT!
What's the Scoop on Hadoop? How It Works and How to WORK IT!What's the Scoop on Hadoop? How It Works and How to WORK IT!
What's the Scoop on Hadoop? How It Works and How to WORK IT!
 
Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016
 
Time series databases
Time series databasesTime series databases
Time series databases
 
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
DataStax | Data Science with DataStax Enterprise (Brian Hess) | Cassandra Sum...
 
What Your Database Query is Really Doing
What Your Database Query is Really DoingWhat Your Database Query is Really Doing
What Your Database Query is Really Doing
 
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence ArchitectureMongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
MongoDB in the Middle of a Hybrid Cloud and Polyglot Persistence Architecture
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
Php &amp; my sql - how do pdo, mysq-li, and x devapi do what they do
Php &amp; my sql  - how do pdo, mysq-li, and x devapi do what they doPhp &amp; my sql  - how do pdo, mysq-li, and x devapi do what they do
Php &amp; my sql - how do pdo, mysq-li, and x devapi do what they do
 

Similar to Document validation in MongoDB 3.2

Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6
MongoDB
 
Novedades de MongoDB 3.6
Novedades de MongoDB 3.6Novedades de MongoDB 3.6
Novedades de MongoDB 3.6
MongoDB
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
MongoDB
 
Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21
Stamatis Zampetakis
 
Evolving your Data Access with MongoDB Stitch - Drew Di Palma
Evolving your Data Access with MongoDB Stitch - Drew Di PalmaEvolving your Data Access with MongoDB Stitch - Drew Di Palma
Evolving your Data Access with MongoDB Stitch - Drew Di Palma
MongoDB
 
Evolutionary db development
Evolutionary db development Evolutionary db development
Evolutionary db development
Open Party
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your Code
Nate Abele
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0
Nate Abele
 
Stored-Procedures-Presentation
Stored-Procedures-PresentationStored-Procedures-Presentation
Stored-Procedures-PresentationChuck Walker
 
Iac d.damyanov 4.pptx
Iac d.damyanov 4.pptxIac d.damyanov 4.pptx
Iac d.damyanov 4.pptx
Dimitar Damyanov
 
Eagle6 mongo dc revised
Eagle6 mongo dc revisedEagle6 mongo dc revised
Eagle6 mongo dc revisedMongoDB
 
Eagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational AwarenessEagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational Awareness
MongoDB
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
Sean Chittenden
 
Centralizing users’ authentication at Active Directory level 
Centralizing users’ authentication at Active Directory level Centralizing users’ authentication at Active Directory level 
Centralizing users’ authentication at Active Directory level 
Hossein Sarshar
 
Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2
MongoDB
 
Testing Big Data solutions fast and furiously
Testing Big Data solutions fast and furiouslyTesting Big Data solutions fast and furiously
Testing Big Data solutions fast and furiously
Katherine Golovinova
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
Mark Kromer
 
ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...
ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...
ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...
Agile Testing Alliance
 
Easy integration of Bluemix services with your applications
Easy integration of Bluemix services with your applicationsEasy integration of Bluemix services with your applications
Easy integration of Bluemix services with your applications
Jack-Junjie Cai
 
DQ Product Usage Methodology Highlights_v6_ltd
DQ Product Usage Methodology Highlights_v6_ltdDQ Product Usage Methodology Highlights_v6_ltd
DQ Product Usage Methodology Highlights_v6_ltdDigendra Vir Singh (DV)
 

Similar to Document validation in MongoDB 3.2 (20)

Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6Neue Features in MongoDB 3.6
Neue Features in MongoDB 3.6
 
Novedades de MongoDB 3.6
Novedades de MongoDB 3.6Novedades de MongoDB 3.6
Novedades de MongoDB 3.6
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
 
Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21Apache Calcite Tutorial - BOSS 21
Apache Calcite Tutorial - BOSS 21
 
Evolving your Data Access with MongoDB Stitch - Drew Di Palma
Evolving your Data Access with MongoDB Stitch - Drew Di PalmaEvolving your Data Access with MongoDB Stitch - Drew Di Palma
Evolving your Data Access with MongoDB Stitch - Drew Di Palma
 
Evolutionary db development
Evolutionary db development Evolutionary db development
Evolutionary db development
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your Code
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0
 
Stored-Procedures-Presentation
Stored-Procedures-PresentationStored-Procedures-Presentation
Stored-Procedures-Presentation
 
Iac d.damyanov 4.pptx
Iac d.damyanov 4.pptxIac d.damyanov 4.pptx
Iac d.damyanov 4.pptx
 
Eagle6 mongo dc revised
Eagle6 mongo dc revisedEagle6 mongo dc revised
Eagle6 mongo dc revised
 
Eagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational AwarenessEagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational Awareness
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
 
Centralizing users’ authentication at Active Directory level 
Centralizing users’ authentication at Active Directory level Centralizing users’ authentication at Active Directory level 
Centralizing users’ authentication at Active Directory level 
 
Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2
 
Testing Big Data solutions fast and furiously
Testing Big Data solutions fast and furiouslyTesting Big Data solutions fast and furiously
Testing Big Data solutions fast and furiously
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...
ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...
ATAGTR2017 Test Approach for Re-engineering Legacy Applications based on Micr...
 
Easy integration of Bluemix services with your applications
Easy integration of Bluemix services with your applicationsEasy integration of Bluemix services with your applications
Easy integration of Bluemix services with your applications
 
DQ Product Usage Methodology Highlights_v6_ltd
DQ Product Usage Methodology Highlights_v6_ltdDQ Product Usage Methodology Highlights_v6_ltd
DQ Product Usage Methodology Highlights_v6_ltd
 

More from Andrew Morgan

MongoDB 3.4 webinar
MongoDB 3.4 webinarMongoDB 3.4 webinar
MongoDB 3.4 webinar
Andrew Morgan
 
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...
Andrew Morgan
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEA
Andrew Morgan
 
The rise of microservices - containers and orchestration
The rise of microservices - containers and orchestrationThe rise of microservices - containers and orchestration
The rise of microservices - containers and orchestration
Andrew Morgan
 
PistonHead's use of MongoDB for Analytics
PistonHead's use of MongoDB for AnalyticsPistonHead's use of MongoDB for Analytics
PistonHead's use of MongoDB for Analytics
Andrew Morgan
 
What's new in MySQL Cluster 7.4 webinar charts
What's new in MySQL Cluster 7.4 webinar chartsWhat's new in MySQL Cluster 7.4 webinar charts
What's new in MySQL Cluster 7.4 webinar charts
Andrew Morgan
 
MySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinarMySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinar
Andrew Morgan
 
FOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worldsFOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worlds
Andrew Morgan
 
MySQL Replication: What’s New in MySQL 5.7 and Beyond
MySQL Replication: What’s New in MySQL 5.7 and BeyondMySQL Replication: What’s New in MySQL 5.7 and Beyond
MySQL Replication: What’s New in MySQL 5.7 and Beyond
Andrew Morgan
 
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQL
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQLNoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQL
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQL
Andrew Morgan
 
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)
Andrew Morgan
 
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013
Andrew Morgan
 
NoSQL and SQL - blending the best of both worlds
NoSQL and SQL - blending the best of both worldsNoSQL and SQL - blending the best of both worlds
NoSQL and SQL - blending the best of both worlds
Andrew Morgan
 
Mysql cluster introduction
Mysql cluster introductionMysql cluster introduction
Mysql cluster introduction
Andrew Morgan
 
Developing high-throughput services with no sql ap-is to innodb and mysql clu...
Developing high-throughput services with no sql ap-is to innodb and mysql clu...Developing high-throughput services with no sql ap-is to innodb and mysql clu...
Developing high-throughput services with no sql ap-is to innodb and mysql clu...
Andrew Morgan
 

More from Andrew Morgan (15)

MongoDB 3.4 webinar
MongoDB 3.4 webinarMongoDB 3.4 webinar
MongoDB 3.4 webinar
 
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEA
 
The rise of microservices - containers and orchestration
The rise of microservices - containers and orchestrationThe rise of microservices - containers and orchestration
The rise of microservices - containers and orchestration
 
PistonHead's use of MongoDB for Analytics
PistonHead's use of MongoDB for AnalyticsPistonHead's use of MongoDB for Analytics
PistonHead's use of MongoDB for Analytics
 
What's new in MySQL Cluster 7.4 webinar charts
What's new in MySQL Cluster 7.4 webinar chartsWhat's new in MySQL Cluster 7.4 webinar charts
What's new in MySQL Cluster 7.4 webinar charts
 
MySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinarMySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinar
 
FOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worldsFOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worlds
 
MySQL Replication: What’s New in MySQL 5.7 and Beyond
MySQL Replication: What’s New in MySQL 5.7 and BeyondMySQL Replication: What’s New in MySQL 5.7 and Beyond
MySQL Replication: What’s New in MySQL 5.7 and Beyond
 
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQL
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQLNoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQL
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQL
 
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)
 
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013
 
NoSQL and SQL - blending the best of both worlds
NoSQL and SQL - blending the best of both worldsNoSQL and SQL - blending the best of both worlds
NoSQL and SQL - blending the best of both worlds
 
Mysql cluster introduction
Mysql cluster introductionMysql cluster introduction
Mysql cluster introduction
 
Developing high-throughput services with no sql ap-is to innodb and mysql clu...
Developing high-throughput services with no sql ap-is to innodb and mysql clu...Developing high-throughput services with no sql ap-is to innodb and mysql clu...
Developing high-throughput services with no sql ap-is to innodb and mysql clu...
 

Recently uploaded

Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Anthony Dahanne
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web Services
KrzysztofKkol1
 
Software Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdfSoftware Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdf
MayankTawar1
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
varshanayak241
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
Peter Caitens
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
IES VE
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 

Recently uploaded (20)

Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web Services
 
Software Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdfSoftware Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdf
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 

Document validation in MongoDB 3.2

  • 1. MongoDB 3.2 – Document Validation LondonMUG–15/9/2015
  • 2. Agenda Value of flexible schemas Downside of flexible schemas What 3.2 adds What 3.2 doesn’t add Options Production lifecycle to add Document Validation Demo
  • 3. Power of flexible schemas RDBMS MongoDB { _id: ObjectId("4c4ba5e5e8aabf3"), employee_name: {First: "Billy", Last: "Fish"}, department: "Engineering", title: "Aquarium design", pay_band: "C", benefits: [ { type: "Health", plan: "PPO Plus" }, { type: "Dental", plan: "Standard" } ] }
  • 4. Power of flexible schemas { _id: ObjectId("4c4ba5e5e8aabf3"), employee_name: {First: "Billy", Last: "Fish"}, department: "Engineering", title: "Aquarium design", pay_band: "C", benefits: [ { type: "Health", plan: "PPO Plus" }, { type: "Dental", plan: "Standard" } ] } • Relational – Up-front schema definition phase – Adding new column takes time to develop & *lots* of time to roll out in production – Existing rows must be reformatted • MongoDB: – Start hacking your app right away – Want to store new type of information? • Just start adding it • If it doesn’t apply to all instances – just leave it out
  • 5. Power of flexible schemas { _id: ObjectId("4c4ba5e5e8aabf3"), employee_name: {First: "Billy”, Middle: "The", Last: "Fish"}, department: "Engineering", title: "Aquarium design", pay_band: "C", benefits: [ { type: "Health", plan: "PPO Plus" }, { type: "Dental", plan: "Standard" } ] } • Relational – Up-front schema definition phase – Adding new column takes time to develop & *lots* of time to roll out in production – Existing rows must be reformatted • MongoDB: – Start hacking your app right away – Want to store new type of information? • Just start adding it • If it doesn’t apply to all instances – just leave it out
  • 6. 6 Why validate documents? • Many people writing to the database – Many developers – Many teams – Many companies – Many development languages • Multiple applications want to exploit the same data, need to agree on what’s there • Usually a core subset of keys you always want to be there • For any key may care about: – Existence – Type – Format – Value – Existence in combination with other keys (e.g. need a phone number or an email address)
  • 7. 7 Data Access Layer Why validate documents? • Good to have a ‘contract’ for what’s in a collection – When reading from the “subscriber” collection, I know that every document will include a subscription plan name: db.subscriptions.find( name: "Billy Fish", $and:[{plan:{$exists: true}}, {plan:{$type: 2}}]}) • < MongoDB 3.2, this is an application responsibility – 3rd party tools like Mongoose can help • Best implemented as a layer between the application and driver (Data Access Layer) App App App Driver Database
  • 8. Get the database to do the work!
  • 9. 9 Document Validation - MongoDB 3.2 • Configure document validation within the database • Use familiar MongoDB query expressions • Automatically tests each insert/update; delivers warning or error if a rule is broken • You choose what keys to validate and how db.runCommand({ collMod: "contacts", validator: { $and: [ {year_of_birth: {$lte: 1994}}, {$or: [ {phone: { $type: ”string"}}, {email: { $type: ”string"}} ]}] }})
  • 10. 10 Document Validation - MongoDB 3.2 db.getCollectionInfos({name:"contacts"}) [ { "name": "contacts", "options": { "validator": { "$and": [ {"year_of_birth": { "$lte": 1994}}, {"$or": [ {"phone": {"$type": ”string"}}, {"email": {"$type": ”string"}} ]} ]}, "validationLevel": "strict", "validationAction": "error" } } ]
  • 11. 11 Document Validation - MongoDB 3.2 db.contacts.insert( name: "Fred", email: "fred@clusterdb.com", year_of_birth: 2012 }) Document failed validation WriteResult({ "nInserted": 0, "writeError": { "code": 121, "errmsg": "Document failed validation”}})
  • 12. 12 Document Validation - MongoDB 3.2 db.runCommand({collMod: "bleh", validator: { rogue: {$exists:false} } } });
  • 13. 13 Document Validation - MongoDB 3.2 • Can check most things that work with a find expression – Existence – Non-existence – Data type of values – <, <=, >, >=, ==, != – AND, OR – Regular expressions – Some geospatial operators (e.g. $geoWithin & $geoIntersects) – …
  • 14. 14 MongoDB 3.2 Limitations • Generic error message – Application needs to figure out what part of the constraints failed • Cannot compare 1 key with another – Either within the same document or between documents • Some operations not supported: – $text, $geoNear, $near, $nearSphere, $where • Applications responsibility to bring legacy data into compliance with new rules – No audit or tools
  • 15. 15 What validations remain in the app • User interface – Don’t have the database be the first place to detect that an email is poorly formatted • Any validations that involve comparisons with – Other data in the same document – Data from other documents – External information (e.g. time of day) • Semantic checks that are designed to fail frequently – e.g. user is in wrong country to use this service – Database should typically be testing for coding errors rather than implementing your business logic • Determining why the database rejected a document in order to provide a meaningful error to the user
  • 16. 16 Where MongoDB Validation excels(vs.RDBMS) • Simple – Use familiar search expressions – No need for stored procedures • Flexible – Only enforced on mandatory parts of the schema – Can start adding new data at any point and then add validation later if needed • Practical to deploy – Simple to role out new rules across thousands of production servers • Light weight – Negligible impact to performance
  • 17. 17 Cleaning up legacy data • Validator does not check if existing documents in the collection meet the new validation rules • User/app can execute a query to identify & update any document which don’t meet the new rules – Use $nor on the full expression • Be cautious about system impacts: – Could push working data set out of memory – Extra load if many documents need to be updated – Execute on secondary secondary> db.runCommand({collMod: "bleh", validator: { a: {$lt:4} } }); secondary> db.bleh.find({ a:{$not:{$lt:4}}}).count() secondary> db.bleh.update( {a:{$not:{$lt:4}}}, {$set:{a:3}}, {multi:true})
  • 18. 18 Controlling validation validationLevel off moderate strict validationAction warn No checks Warn on non-compliance for inserts & updates to existing valid documents. Updates to existing invalid docs OK. Warn on any non-compliant document for any insert or update. error No checks Reject on non-compliant inserts & updates to existing valid documents. Updates to existing invalid docs OK. Reject any non-compliant document for any insert or update. DEFAULT
  • 19. 19 Controlling validation • Set behavior: db.bleh.runCommand("collMod", {validationLevel: "moderate", validationAction: "warn"}) – Note that the warnings are written to the log • Override for a single operation (not working in MongoDB 3.1.7): db.bleh.insert({a:999},{bypassDocumentValidation:true})
  • 20. 20 Lifecycle Hacking (Day one) •No document validation •Release quick & often Analyze De facto Schema •MongoDB Scout Add document validation rules •Query & fix existing docs •Log any new documents that break rules: {validationLevel: "moderate", validationAction: "warn} Fix application •Follow all schema rules •When no new problems being logged, enter strict mode: {validationLevel: ”strict", validationAction: ”error} Application uses new data (Application evolves/additional app) •If not mandated, stop here ? Analyze De-facto Schema •MongoDB Scout Add document validation rules •Query & fix existing docs •Log any new documents that break rules: {validationLevel: "moderate", validationAction: "warn} Fix application •Follow all schema rules •When no new problems being logged, enter strict mode: {validationLevel: ”strict", validationAction: ”error}
  • 21. 21 Versioning of Validations (optional) • Application can lazily update documents with an older version or with no version set at all db.runCommand({ collMod: "contacts", validator: {$or: [{version: {"$exists": false}}, {version: 1, $and: [{Name: {"$exists": true}}] }, {version: 2, $and: [{Name: {"$type": ”string"}}] } ] } })
  • 22. Demo
  • 23. Next Steps • “Document Validation and What Dynamic Schema Means” – Eliot Horowitz – http://www.eliothorowitz.com/blog/2015/09/11/document-validation-and-what- dynamic-schema-means/ • “Bulletproof Data Management” – MongoDB World 2015 – https://www.mongodb.com/presentations/data-management-3-bulletproof-data- management • Documentation – http://docs.mongodb.org/manual/release-notes/3.1-dev-series/#document- validation • Not yet ready for production but download and try MongoDB 3.1! – https://www.mongodb.org/downloads#development • Feedback – https://jira.mongodb.org/
  • 24. MongoDB Days 2015 October 6, 2015 October 20, 2015 November 5, 2015 December 2, 2015 France Germany UK Silicon Valley
  • 25. 25 London MUG - Plans • Would like to have a regular cadence – 4-6 weeks? • What topics would you like to see covered? • Anyone interested in running these sessions (with support from MongoDB team)