Real World Application Performance with MongoDB
1. Real World Application
Performance with MongoDB
FireScope History
FireScope Unify
FireScope Stratis
Scalable Architecture
Pete Whitney, VP Cloud Development, FireScope, Inc.
2. Real World Application
Performance with MongoDB
[Architecture diagram, top to bottom: Edge Devices and Web Servers; Load Balancer; App Servers; three Mongo Shards. Arrows are labeled Write Activity and Read Activity.]
3. Document Marshaling
Performance Research
Performance Driven Selection Criteria
− FireScope collects, stores, and normalizes
hundreds of metrics each second
− As enterprises grow, the demands on
FireScope Stratis increase
− Performance of a single application server
is critically important
4. Document Marshaling
Performance Research
Mongo Java Driver
− Stores object data in a HashMap
− HashMap<String, Object>
− HashMap<field_name, value>
− HashMap data is accessible by extending
com.mongodb.BasicDBObject
− BasicDBObject provides getters/setters for
all common Java types (String, Integer, ...)
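A minimal sketch of the map-backed approach described above. To stay runnable without the driver on the classpath it extends the JDK's LinkedHashMap rather than com.mongodb.BasicDBObject; the class name MapBackedMetric and its accessors are hypothetical.

```java
import java.util.LinkedHashMap;

// Hypothetical stand-in for a class that would extend com.mongodb.BasicDBObject.
// It uses only the JDK so it runs without the MongoDB driver.
class MapBackedMetric extends LinkedHashMap<String, Object> {
    // Typed accessors read and write the underlying map directly; no reflection.
    String getRefId() { return (String) get("ref_id"); }
    void setRefId(String refId) { put("ref_id", refId); }

    long getTime() { return ((Number) get("time")).longValue(); }
    void setTime(long time) { put("time", time); }

    public static void main(String[] args) {
        MapBackedMetric m = new MapBackedMetric();
        m.setRefId("ABC123");
        m.setTime(1336780800L);
        // A field without a dedicated accessor still lives in the map.
        m.put("unmodeled_field", "kept");
        System.out.println(m.getRefId() + " " + m.getTime() + " " + m.keySet());
    }
}
```

Because the object is the map, any field that arrives from the database is retained whether or not the class has an accessor for it.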
5. Document Marshaling
Performance Research
Spring Data
− Stores object data in object class variables
− Annotations map document
field_name/value to corresponding class
variables
− Reflection marshals BasicDBObject fields
into the appropriate class variable
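The reflection step can be illustrated with a small JDK-only sketch. This is not Spring Data's implementation: the DocField annotation, the Metric class, and the marshal method are hypothetical names chosen for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

class ReflectionMarshalSketch {
    // Hypothetical mapping annotation; Spring Data ships its own annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface DocField { String value(); }

    static class Metric {
        @DocField("ref_id") String refId;
        @DocField("time") long time;
    }

    // Copies each annotated field out of the map into a new instance.
    // Map keys that no annotated field refers to are silently ignored.
    static <T> T marshal(Map<String, Object> doc, Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            T target = ctor.newInstance();
            for (Field f : type.getDeclaredFields()) {
                DocField mapping = f.getAnnotation(DocField.class);
                if (mapping == null || !doc.containsKey(mapping.value())) continue;
                f.setAccessible(true);
                f.set(target, doc.get(mapping.value()));
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot marshal " + type, e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("ref_id", "ABC123");
        doc.put("time", 1336780800L);
        doc.put("unmapped", "ignored by marshal");
        Metric m = marshal(doc, Metric.class);
        System.out.println(m.refId + " " + m.time);
    }
}
```

Each field costs a reflective lookup and assignment on every marshal, which is one plausible source of overhead compared with reading straight from the map.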
6. Document Marshaling
Performance Research
Testing Process
− Saved and retrieved large graphs of nested
object instances
− Tested variations in the nesting depth and
number of instances
− Issued several test runs
− Alternated between the Mongo Java Driver
and Spring Data
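A sketch of how an alternating test run might be structured. The harness below is an assumption, not the code used for this research; the two Runnable arguments stand for "save and retrieve with the Mongo Java Driver" and "save and retrieve with Spring Data".

```java
class AlternatingBenchmarkSketch {
    // Runs the two candidates in alternation so that drift over time
    // (JIT warm-up, cache state, background load) affects both about equally.
    // Returns elapsed nanoseconds: row 0 for candidate a, row 1 for candidate b.
    static long[][] run(Runnable a, Runnable b, int runs) {
        long[][] nanos = new long[2][runs];
        for (int i = 0; i < runs; i++) {
            nanos[0][i] = time(a);
            nanos[1][i] = time(b);
        }
        return nanos;
    }

    static long time(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Placeholder workloads; real tests would save and load object graphs.
        long[][] t = run(() -> Math.sqrt(2.0), () -> Math.cbrt(2.0), 5);
        System.out.println(t[0].length + " runs per candidate");
    }
}
```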
7. Document Marshaling
Performance Research
8. Document Marshaling
Performance Research
Analysis Results
− Mongo Java Driver out-performed Spring
Data
− Result trends were based on instance count
− Spring Data added an overhead of between
2X and 4X
− Spring Data also resulted in dropped fields
if data in the HashMap was not mapped via
an annotation to a class variable
9. Minimal Field Retrieval
Research
MongoDB provides the ability to retrieve a
subset of fields for any query
Most use cases do not require all fields of an
underlying document
Reducing the consumed bandwidth between
servers is always a good thing
Being new to MongoDB, we did not know if
the overhead of limiting the returned fields
would offset the benefit of reduced bandwidth
10. Minimal Field Retrieval
Research
Testing Process
− Queried, changed, and saved objects without
specifying a set of fields to retrieve
− Queried, changed, and saved objects while
specifying a set of fields to retrieve (¼ of the
fields of the entire instance)
− Alternated between the two test scenarios
− Issued multiple test runs
11. Minimal Field Retrieval
Research
Analysis Results
− Minimal field retrieval provided a surprising
9X performance improvement vs. full field
retrieval
− Use minimal field retrieval with caution:
  − If you inadvertently save a minimally
  populated object, you will lose the
  unpopulated fields
  − If you need an additional field that you
  did not specify in the query, it will appear as a
  null value
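Both cautions can be reproduced with an in-memory stand-in for a collection. Everything here is hypothetical (class and method names included) and uses no MongoDB API. In MongoDB itself, the targeted update shown as setFields would correspond to an update with the $set operator.

```java
import java.util.HashMap;
import java.util.Map;

class ProjectionPitfallSketch {
    // In-memory stand-in for a collection: id -> document.
    final Map<String, Map<String, Object>> store = new HashMap<>();

    // Returns a copy holding only the requested fields (a projection).
    Map<String, Object> findProjected(String id, String... fields) {
        Map<String, Object> full = store.get(id);
        Map<String, Object> partial = new HashMap<>();
        for (String f : fields) {
            if (full.containsKey(f)) partial.put(f, full.get(f));
        }
        return partial;
    }

    // Whole-document save: replaces everything stored under the id.
    void saveWhole(String id, Map<String, Object> doc) {
        store.put(id, new HashMap<>(doc));
    }

    // Targeted update: touches only the named fields and leaves the rest intact.
    void setFields(String id, Map<String, Object> changes) {
        store.get(id).putAll(changes);
    }

    public static void main(String[] args) {
        ProjectionPitfallSketch db = new ProjectionPitfallSketch();
        Map<String, Object> full = new HashMap<>();
        full.put("ref_id", "ABC123");
        full.put("name", "cpu");
        full.put("value", 50);
        db.saveWhole("ABC123", full);

        Map<String, Object> partial = db.findProjected("ABC123", "value");
        System.out.println(partial.get("name")); // not requested, so null

        partial.put("value", 51);
        db.setFields("ABC123", partial);          // safe: other fields survive
        System.out.println(db.store.get("ABC123").size());

        db.saveWhole("ABC123", partial);          // unsafe: only "value" remains
        System.out.println(db.store.get("ABC123").keySet());
    }
}
```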
12. Data Aggregation
Thanks to Foursquare Labs, Inc. for calling
our attention to this approach
Aggregating historically collected data into a
single document improves access time
But wait, index sizes are significantly reduced
too
MongoDB performs best when its indexes fit in
memory, so reduced index size = improved
performance and smaller memory footprint
13. Data Aggregation
Single Element Storage
{ ref_id : ABC123, time : 1336780800, value : 50% }
{ ref_id : ABC123, time : 1336780830, value : 51% }
{ ref_id : ABC123, time : 1336780860, value : 48% }
14. Data Aggregation
Aggregated Storage
{ ref_id : ABC123, midnight : 1336780800, values : [
{ time : 1336780800, value : 50% },
{ time : 1336780830, value : 51% },
{ time : 1336780860, value : 48% },
{ time : 1336780890, value : 0% },
{ time : 1336780920, value : 0% }
]}
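A sketch of how a sample's timestamp could be mapped to its daily document and to a position in the values array. It assumes UTC midnight boundaries and a 30-second sampling interval, both inferred from the sample timestamps on these slides; the class and method names are hypothetical.

```java
class DayBucketSketch {
    static final long SECONDS_PER_DAY = 86_400L;
    // Assumption: one sample every 30 seconds, as in the sample records.
    static final long SAMPLE_INTERVAL_SECONDS = 30L;

    // Epoch seconds of the UTC midnight that starts the sample's day.
    static long midnightOf(long epochSeconds) {
        return epochSeconds - Math.floorMod(epochSeconds, SECONDS_PER_DAY);
    }

    // Index of the sample within that day's values array.
    static int slotOf(long epochSeconds) {
        return (int) (Math.floorMod(epochSeconds, SECONDS_PER_DAY) / SAMPLE_INTERVAL_SECONDS);
    }

    // Samples per day, which is also the number of separate documents (and
    // index entries) per metric per day under single element storage.
    static int slotsPerDay() {
        return (int) (SECONDS_PER_DAY / SAMPLE_INTERVAL_SECONDS);
    }

    public static void main(String[] args) {
        long t = 1336780860L;
        System.out.println(midnightOf(t) + " slot " + slotOf(t) + " of " + slotsPerDay());
    }
}
```

Under these assumptions a day holds 86,400 / 30 = 2,880 samples, so aggregation replaces 2,880 documents and their index entries per metric per day with one.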
15. Early Space Allocation
Documents for an entire day of metric collection are
created by a scheduled process that runs once per day
Updates to the document occur throughout the day
Updates do not change the size of the document
Updates do not change the indexes associated with
the document
Only the times and values are changed
No size or index change results in optimal updating
Also use WriteConcern.NORMAL (raises exceptions for network errors only)
16. Early Space Allocation
Daily History Record
{ ref_id : ABC123, midnight : 1336780800, values : [
{ time : 1336780800, value : 0% },
{ time : 1336780830, value : 0% },
{ time : 1336780860, value : 0% },
{ time : 1336780890, value : 0% },
{ time : 1336780920, value : 0% }, ...
]}
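A JDK-only sketch of the preallocate-then-update pattern, using plain maps and lists in place of driver objects. The 30-second interval, the integer percent values, and all names are assumptions. Against MongoDB, the in-place write in record would typically be a $set on a dotted path such as values.2.value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PreallocatedDaySketch {
    static final long INTERVAL_SECONDS = 30L;          // assumed sampling interval
    static final int SLOTS_PER_DAY = (int) (86_400L / INTERVAL_SECONDS);

    // Builds the full-size document for one day with every value zeroed,
    // as the once-per-day scheduled process would.
    static Map<String, Object> newDayDocument(String refId, long midnight) {
        List<Map<String, Object>> values = new ArrayList<>(SLOTS_PER_DAY);
        for (int i = 0; i < SLOTS_PER_DAY; i++) {
            Map<String, Object> slot = new LinkedHashMap<>();
            slot.put("time", midnight + i * INTERVAL_SECONDS);
            slot.put("value", 0);
            values.add(slot);
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("ref_id", refId);
        doc.put("midnight", midnight);
        doc.put("values", values);
        return doc;
    }

    // Overwrites one existing slot; the array length never changes.
    @SuppressWarnings("unchecked")
    static void record(Map<String, Object> doc, long time, int value) {
        long midnight = (Long) doc.get("midnight");
        int slot = (int) ((time - midnight) / INTERVAL_SECONDS);
        List<Map<String, Object>> values = (List<Map<String, Object>>) doc.get("values");
        values.get(slot).put("value", value);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = newDayDocument("ABC123", 1336780800L);
        record(doc, 1336780860L, 48);
        List<?> values = (List<?>) doc.get("values");
        System.out.println(values.size() + " slots, slot 2 = " + values.get(2));
    }
}
```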
17. Summary
Mongo Java Driver provides ~3X performance
improvement vs. Spring Data
Minimal field retrieval provides ~9X performance
improvement vs. full document retrieval
Data aggregation improves performance via
improved data locality and reduced index size
Early space allocation eliminates document size
changes and index changes, which significantly
improves document update performance
18. Parting Thoughts
If you are not consistently gathering data from your
IT infrastructure, do you know when some
important portion of your business is failing to
meet your customers' needs?
When you change something in your IT
infrastructure, do you know that the change is
performing up to the expectations of your
customers?
Do you have the tools in place today that allow you
to explore and identify issues, problems, and
bottlenecks throughout all of your IT assets?
If not, then check out: www.firescope.com
19. References
Article “Real World Application
Performance with MongoDB”
java.sys-con.com/node/2414029
FireScope www.firescope.com
Foursquare www.foursquare.com
MongoDB www.mongodb.org
Morphia code.google.com/p/morphia
Pete Whitney pwhitney@firescope.com
Spring Data www.springsource.org/spring-data