Presented by Sigfrido Narvaez, Senior Solutions Architect, MongoDB
Experience level: Introductory
When it comes time to select database software for your project, there are a bewildering number of choices. How do you know if your project is a good fit for a relational database, or whether one of the many NoSQL options is a better choice? In this session you will learn when to use MongoDB and how to evaluate if MongoDB is a fit for your project. You will see how MongoDB's flexible document model is solving business problems in ways that were not previously possible, and how MongoDB's built-in features allow running at scale.
3. Factors Driving Modern Applications
Data
• 90% of data was created in the last 2 years
• 80% of enterprise data is unstructured
• Unstructured data is growing at 2X the rate of structured data
Mobile
• 2 Billion smartphones by 2015
• Mobile now >50% internet use
• 26 Billion devices on IoT by 2020
Social
• 72% of internet use is social media
• 2 Billion active users monthly
• 93% of businesses use social media
Cloud
• Compute costs declining 33% YOY
• Storage costs declining 38% YOY
• Network costs declining 27% YOY
7. How Databases Stack Up
Requirement          RDBMS   Key/value   Wide column   MongoDB
Hierarchical data    Poor    Poor        Good          Great
Dynamic schema       Poor    Poor        Poor          Great
Native OOP language  Poor    Great       Great         Great
Software cost        Poor    Great       Great         Great
Performance          Poor    Great       Great         Great
Scale                Poor    Great       Great         Great
Data consistency     Great   Poor        Poor          Great
Rich querying        Great   Poor        Poor          Great
Ease of use          Good    Good        Poor          Great
VALUE OF SQL
8. How Databases Stack Up
VALUE OF NOSQL
9. How Databases Stack Up
VALUE OF MONGODB
11. Mobile HR App
One of the largest HCM solution providers builds an app for a single view of HR, serving 1M+ users globally
Results:
Able to serve 1M users and 41K companies across 17 countries
99.999% uptime (5.26 min/yr)
Top iOS Business App
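The "five nines" uptime figure above can be checked with one line of arithmetic. A minimal sketch, assuming a 365.25-day year (the slide does not say which year length it used):

```python
# Downtime budget implied by an availability target.
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes per year a service may be down at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(0.99999)
print(f"{five_nines:.2f} min/yr")  # prints "5.26 min/yr"
```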
15. Internet Of Things
Expands 3M car pilot to 300M cars
https://www.mongodb.com/use-cases/internet-of-things
Problem (before MongoDB)       Why MongoDB
Rigid schemas                  New devices and data
Scale-up limits                Horizontal scalability
Inadequate query performance   In-place analytics
17. Documents are Rich Data Structures
{
  first_name: 'Paul',
  surname: 'Miller',
  cell: 447557505611,
  city: 'London',
  location: [45.123, 47.232],
  Profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
• Fields
• Typed field values
• Fields can contain arrays
• Fields can contain an array of sub-documents
• Fields can be indexed at any level
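To make the structure concrete, the same document can be written as a plain Python dict. This is only a sketch: the fields elided with "…" on the slide are left out, and `get_path` is a hypothetical helper (not a MongoDB API) that imitates dotted-path addressing of nested fields and array elements.

```python
from typing import Any

# The slide's document as a Python dict (elided fields omitted).
person = {
    "first_name": "Paul",
    "surname": "Miller",
    "cell": 447557505611,                              # typed value: integer
    "city": "London",
    "location": [45.123, 47.232],                      # array of numbers
    "Profession": ["banking", "finance", "trader"],    # array of strings
    "cars": [                                          # array of sub-documents
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}


def get_path(doc: Any, path: str) -> Any:
    """Hypothetical helper: walk a dotted path such as 'cars.0.model'."""
    current = doc
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]   # numeric part selects an array element
        else:
            current = current[part]        # otherwise select a field by name
    return current


print(get_path(person, "cars.1.model"))  # prints "Rolls Royce"
print(type(person["cell"]).__name__)     # prints "int"
```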
18. Do More With Your Data
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980
Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.
Text Search
• Find all the cars described as having leather seats
Aggregation
• Calculate the average value of Paul’s car collection
Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
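Two of the questions above can be sketched as query documents. The example below is an illustration under assumptions, not the presenter's code: the filter and pipeline use standard MongoDB operators (`$elemMatch`, `$gte`, `$lte`, `$match`, `$unwind`, `$group`, `$avg`), but nothing is sent to a server. With a driver they would be passed to a collection's `find` and `aggregate` calls; here a pure-Python equivalent runs over an in-memory list so the logic can be checked. The second person is invented for illustration.

```python
from statistics import mean

people = [
    {"first_name": "Paul", "surname": "Miller", "city": "London",
     "cars": [{"model": "Bentley", "year": 1973, "value": 100000},
              {"model": "Rolls Royce", "year": 1965, "value": 330000}]},
    # Invented document, present only so the filter has something to exclude.
    {"first_name": "Ana", "surname": "Example", "city": "Paris",
     "cars": [{"model": "Hypothetical", "year": 1975, "value": 50000}]},
]

# "Find everybody in London with a car built between 1970 and 1980"
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# "Calculate the average value of Paul's car collection"
avg_value_pipeline = [
    {"$match": {"first_name": "Paul"}},
    {"$unwind": "$cars"},
    {"$group": {"_id": "$first_name", "avg_value": {"$avg": "$cars.value"}}},
]


def in_london_with_1970s_car(doc: dict) -> bool:
    """Pure-Python equivalent of london_1970s_filter."""
    return doc["city"] == "London" and any(
        1970 <= car["year"] <= 1980 for car in doc["cars"]
    )


london_owners = [d["first_name"] for d in people if in_london_with_1970s_car(d)]

# Pure-Python equivalent of avg_value_pipeline.
paul_avg = mean(
    car["value"]
    for d in people if d["first_name"] == "Paul"
    for car in d["cars"]
)

print(london_owners)  # prints "['Paul']"
print(paul_avg)       # prints "215000"
```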
23. Seismic Modeling
• 2000 x 2000 x 2000 cubic data set
• 8 billion floats
• Relational model can take several minutes for some calculations
• MongoDB query performs in ~1 second (4M docs, or 2000 x 2000)
{
"_id": ObjectId("55e7358e1a317d0fb177b31e"),
"x": 100,
"y": 25,
"z": [0.8506244646719524,
0.18891124618195854,
0.14090160846138955, ...
]
}
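The numbers on this slide follow from storing one document per (x, y) position, with the whole z column held in an array. A small sketch of that arithmetic, assuming the layout shown in the sample document; `make_column_doc` is a hypothetical helper and its z values are random placeholders, not seismic data.

```python
import random

N = 2000                                  # edge length of the cubic data set
total_floats = N ** 3                     # every sample in the cube
doc_count = N * N                         # one document per (x, y) position
floats_per_doc = total_floats // doc_count


def make_column_doc(x: int, y: int, depth: int = N) -> dict:
    """Hypothetical helper: build one (x, y) document with a z column."""
    rng = random.Random(x * N + y)        # seeded so the sketch is repeatable
    return {"x": x, "y": y, "z": [rng.random() for _ in range(depth)]}


doc = make_column_doc(100, 25)
print(total_floats, doc_count, len(doc["z"]))  # prints "8000000000 4000000 2000"
```

Under this layout, reading a full vertical column is a single-document lookup by x and y rather than a scan over 2000 rows.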
24. Molecular Similarity Database
• Store Chemical Compound Fingerprints
• Find compounds which are “close” to a given compound
• Tanimoto association coefficient compares two compounds based on their common fingerprints
• Aggregation framework $setIntersection
Source: Chemical Similarity Search in MongoDB by Matt Swain
Fingerprint example: the bit string 01001011 stored as the positions of its set bits, [2, 5, 7, 8, …]
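The Tanimoto coefficient is the size of the intersection of two fingerprints divided by the size of their union. A minimal sketch using Python sets in place of the aggregation framework's `$setIntersection`; the second bit string is a made-up candidate for illustration.

```python
from typing import Iterable, List


def bits_to_positions(bits: str) -> List[int]:
    """Return the 1-based positions of the set bits in a bit string."""
    return [i for i, bit in enumerate(bits, start=1) if bit == "1"]


def tanimoto(a: Iterable[int], b: Iterable[int]) -> float:
    """Tanimoto coefficient: |A ∩ B| / |A ∪ B| over fingerprint positions."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0                         # two empty fingerprints: define as 0
    return len(set_a & set_b) / len(union)


query = bits_to_positions("01001011")      # the slide's example bit string
candidate = bits_to_positions("01001101")  # hypothetical second compound
score = tanimoto(query, candidate)

print(query)      # prints "[2, 5, 7, 8]"
print(candidate)  # prints "[2, 5, 6, 8]"
print(score)      # prints "0.6"
```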
26. Replica Sets High Availability
Replica Set – 2 to 50 copies
Self-healing shard
Data Center Aware
Addresses availability considerations:
• High Availability
• Disaster Recovery
• Maintenance
• Workload Isolation: operational & analytics
27. Data Hub for Large Investment Bank
Feeds & Batch data
• Pricing
• Accounts
• Securities Master
• Corporate actions
[Diagram: MongoDB Primary and MongoDB Secondaries, with links labeled "Real-time"]
Each represents
• Less hardware $
• Less license $
• No penalty $
• & many fewer problems
28. Automatic Sharding High Scalability
Three types: hash-based, range-based, location-aware
Increase or decrease capacity as you go
Automatic balancing
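The three routing styles named above can be illustrated with a toy router. This is a sketch under stated assumptions, not MongoDB's implementation: the shard names, split points and region tags are hypothetical, SHA-256 stands in for whatever hash the database uses internally, and chunk splitting and balancing are not modeled.

```python
import hashlib
from bisect import bisect_right

SHARDS = ["shard0", "shard1", "shard2"]   # hypothetical shard names


def hash_shard(key: str) -> str:
    """Hash-based: a digest of the key spreads writes across all shards."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Range-based: keys below "h" go to shard0, keys from "h" up to (not
# including) "p" go to shard1, everything else goes to shard2.
SPLIT_POINTS = ["h", "p"]


def range_shard(key: str) -> str:
    """Range-based: neighbouring keys land on the same shard."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


# Location-aware: a region tag on the document pins it to a shard.
ZONES = {"EU": "shard0", "US": "shard1", "APAC": "shard2"}


def zone_shard(region: str) -> str:
    """Location-aware: route by a region tag (hypothetical tags)."""
    return ZONES[region]


for name in ["alice", "mallory", "zoe"]:
    print(name, hash_shard(name), range_shard(name))
```

Range-based routing keeps range scans on one shard, while hash-based routing trades that locality for an even write distribution.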
30. Measuring Scale
Performance: 250M Ticks/Sec · 300K+ Ops/Sec · 500K+ Ops/Sec
Cluster: 1,400 Servers · 1,000+ Servers · 250+ Servers
Data: Petabytes · 10s of billions of objects · 13B documents
Customers named on the slide: Fed Agency, Entertainment Co., Asian Internet Co.
https://www.mongodb.com/mongodb-scale
31. Case Study Results
Competitive Edge in Trading Space
Built single platform for all financial data on MongoDB (open sourced!)
60% less disk, 40% savings w/commodity SSDs
100x faster data retrieval
250M ticks per second - 25x!
32. Case Study Results
UK gov’s “Digital Strategy”: up to 2 years each to deliver new tax services
Pluggable micro-services, supporting CD
Accelerated to 40-50 releases per week
New tax services developed in 3 weeks
New Paperless Tax Notifications saved £3M/month
33. Case Study Results
Modern Instruments vs. Legacy Databases
Schema holds up research by 3-6 months
Flexible Schema removes impedance
New tests in weeks, not months
Reduced time to introduce new drugs: a big difference to patients
34. Case Study Results
SQL Server instances per game
Single flexible DB spanning all titles
Cost center became Profit center: DBaaS for 3rd Party
3-week queries now run in 2 mins
Insights fed into game behavior in real time
36. What MongoDB does well
• Agile development in most programming languages
• High Availability and automatic failover
• High performance on mixed workloads of reads, writes and updates
• Operational data analytics in real time
• Scale horizontally on demand at your data center or the cloud
48. What does your GIANT IDEA need?
• Develop AGILE applications
• 99.999% availability
• Deploy rapidly and SCALE on demand
• Real time analysis in the database, under load
• GEOSPATIAL querying
• Processing in REAL TIME, not in batch
• Deploy over commodity computing and storage architectures
• Point in Time RECOVERY
• Need strong data consistency
• Advanced SECURITY
Tell me how I did today on Guidebook and enter for a chance to win one of these
How to do it:
Download the Guidebook App
Search for MongoDB Silicon Valley
Submit session feedback