© CGI Group Inc.
From Traditional Oil & Gas to Sustainable Energy with MongoDB
Underpinned by a traditional RDBMS migration to MongoDB
Technology because it is cool?
What do we do in Subsurface and Wells
A quick summary
• We model and simulate what happens underground:
• Identify what layers the earth consists of
• What geological faults exist
• What wells we are using
• And a “grid” where the 3D structure is discretized in space
• Where each cell has physical and dynamic properties
• What kind of surface network is used to get the oil from the well to a desired location
• Then a simulation determines ‘what will happen’ when a well is created, which creates a pressure difference and starts ‘pumping stuff’ up out of the ground
What does that look like?
Dynamic Simulations and a lot of Uncertainty
• Simulating Subsurface Flow is like predicting the weather
• Only the average meteorologist has less uncertainty
• Simulations can simulate up to 40 years for a full field lifecycle
• For a yearly operational plan, usually about 10,000 (1E+04) realizations are created
• For a single asset
• With a major having 100+ assets that is a lot!
• What kind of information is processed?
• Distribution of properties (Property Arrays)
• Production Curves
• Operating curves of surface equipment
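The scale implied by these bullets can be checked with a few lines of arithmetic. This is only a rough sketch: both inputs are the round figures quoted above, with "100+ assets" taken as a lower bound, and the names are illustrative.

```python
# Rough count of simulation runs behind one yearly operational plan.
# Both constants are the round figures from the slide, not measured values.
REALIZATIONS_PER_ASSET = 10_000  # "E+04 realizations" for a single asset
ASSETS_PER_MAJOR = 100           # "100+ assets", used here as a lower bound


def yearly_realizations(assets: int = ASSETS_PER_MAJOR,
                        per_asset: int = REALIZATIONS_PER_ASSET) -> int:
    """Lower-bound number of realizations across all assets of one major."""
    return assets * per_asset


if __name__ == "__main__":
    print(f"{yearly_realizations():,} realizations per yearly plan")
```

With these inputs the lower bound is one million realizations per planning cycle, each producing property arrays, production curves and operating curves.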
Integration with a Surface Network
All the equipment/hardware used to manage the hydrocarbon production of a field: A Digital Twin
Key words
• Expensive
• Determines field production
• Needs maintenance
• Controls the field
• Provides safety to a well
• Etc.
More and more focus on integrating the surface network into the simulation
• Lots and lots of measurement points
• Measurements from many different viewpoints (operations, safety, quality of service, planning, etc.)
• Lots and lots of opportunity to ‘match’ against actually measured values
• Orthogonal to the subsurface, where more than 10 yards away from the well you have to ‘guess’ what is happening
Data Types
• Time Series Property Arrays
• Multi-dimensional double arrays, recorded over time
• Stored on disk
• State Model
• The object-oriented in-memory model of the simulator
• Stored in tables in an RDBMS
• Time Series / Production Results
• Time Series per ‘property’ on a piece of equipment (pressure, temperature, flow rate, rotations, on/off, etc.)
• Stored in an EAV model in an RDBMS
• Log Data
• Scientific Software
• Simulation is about solving Jacobian matrices
• Log is used to find out “why” a simulation does or does not converge
• Stored in a single log-table
[Diagram: Domain Object model with Well, Pump, Fluid Model, Trajectory, Black Oil and Thermal]
Data Distribution
• Grid (on disk): 40%
• Properties: 25%
• Logs: 25%
• Domain: 10%
5 – 10 TB per month!
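To put the chart in absolute terms, the shares can be applied to the quoted monthly volume. A minimal sketch, assuming the percentages apply uniformly across the 5 to 10 TB range; the dictionary and function names are illustrative.

```python
# Shares of the monthly data volume, as read from the chart above.
SHARE = {"grid": 0.40, "properties": 0.25, "logs": 0.25, "domain": 0.10}


def monthly_volume_tb(total_tb: float) -> dict[str, float]:
    """Split a monthly total (in TB) over the four data categories."""
    return {name: round(total_tb * share, 2) for name, share in SHARE.items()}


if __name__ == "__main__":
    for total in (5, 10):
        print(total, "TB/month ->", monthly_volume_tb(total))
```

At the low end this gives roughly 2 TB of grid data and 1.25 TB each of properties and logs per month.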
RDBMS and where is the pain?
• Maintenance of state table structure
• Any update to the domain model results in a new database configuration
• Upgrading is hard
• Single table with a lot of information: log data
• Full text
• Not easily searchable
• All logs from all realizations in one table
• EAV
• Every single double is stored as a record in a single table
• Heavy reliance on indexes (foreign keys) to find the right value for
• The right asset
• The right realization
• The right piece of equipment
• This is the biggest problem in the database, a heavily indexed table with billions of rows
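The EAV pain described above can be illustrated by regrouping EAV rows into documents. This is a sketch under assumptions: the `EavRow` layout and the field names are hypothetical, since the slides do not show the actual table definition.

```python
from collections import defaultdict
from typing import NamedTuple


class EavRow(NamedTuple):
    """One row of a hypothetical EAV results table: a single double per record."""
    asset: str
    realization: str
    equipment: str
    timestep: int
    attribute: str
    value: float


def rows_to_documents(rows: list[EavRow]) -> list[dict]:
    """Group EAV rows into one document per (asset, realization, equipment, timestep)."""
    grouped: dict[tuple, dict[str, float]] = defaultdict(dict)
    for row in rows:
        key = (row.asset, row.realization, row.equipment, row.timestep)
        grouped[key][row.attribute] = row.value
    return [
        {"asset": a, "realization": r, "equipment": e, "timestep": t, "properties": props}
        for (a, r, e, t), props in grouped.items()
    ]


rows = [
    EavRow("field-a", "r-001", "pump01", 42, "acc_inflow", 100501.0),
    EavRow("field-a", "r-001", "pump01", 42, "acc_outflow", 100501.0),
    EavRow("field-a", "r-001", "pump01", 42, "avg_pressuredrop", 25.0),
    EavRow("field-a", "r-001", "pump01", 43, "acc_inflow", 100733.0),
]
documents = rows_to_documents(rows)
```

Four indexed rows collapse into two documents here; the lookup keys are stored once per document instead of once per value.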
New Energies: Optimizers and the Surface Network
The simulator optimizes for production
• Whether that is oil or solar power or wind power: the optimizer is domain agnostic
• Delivering the energy product is constrained by the surface network
• Surface Network Complexity:
• Oil & Gas → Very Complex
• New Energies → Super Complex
Characteristics:
• Much more equipment
• More complex connections
• Very variable production curves (it gets dark, no wind, clouds, too much wind, etc.)
Result: many more measurements
EAV becomes unmanageable
Reasons to Migrate to MongoDB
• EAV was the first problem to tackle:
• Organizational convenience: MongoDB already exists in the landscape
• Time Series Database (Influx) versus MongoDB:
• The EAV problem is ‘disguised’ as a time-series problem
• Not a real-time measurement problem
• Instead:
• In the context of an uncertainty realization, NOT a single property continuously measured
• Equipment & Contracts & Total Values
• The equipment or contract the measurement belongs to is important
• It is simulated data: 85% or more of the data needs to be thrown away again
• And the rest:
• While architecting, the domain and log data emerged as ideal candidates for a document database as well.
The Domain Model
A lucky break
• Domain model was translated to RDBMS with a POCO Layer
• Each Domain object has a POCO definition and mapping
• RDBMS database is generated based on POCO definitions
• All ‘relational’ logic exists in the application → none in the database
• Simple surgery: instead of saving the POCO into an existing table, use MongoDB’s capability → Collection is generated automatically.
Enhancements:
• Version number of POCO definition
• Collections with version numbers
Results:
• Upgradable database
• No downtime with new data-models
• Auto generation
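The versioning idea above can be sketched in a few lines. This is a Python analogue of a POCO class, not the original implementation: the `Well` fields, the `SCHEMA_VERSION` attribute and the `Name_vN` collection naming are all assumptions made for illustration. It relies only on the fact that MongoDB creates a collection implicitly on the first insert.

```python
from dataclasses import asdict, dataclass
from typing import ClassVar


@dataclass
class Well:
    """Hypothetical plain data class; bump SCHEMA_VERSION whenever fields change."""
    SCHEMA_VERSION: ClassVar[int] = 2
    name: str
    depth_m: float


def collection_name(obj) -> str:
    """Derive a versioned collection name such as 'Well_v2' (assumed convention)."""
    cls = type(obj)
    return f"{cls.__name__}_v{cls.SCHEMA_VERSION}"


def to_document(obj) -> dict:
    """Serialize the object and stamp it with the version of its definition."""
    doc = asdict(obj)  # ClassVar attributes are not dataclass fields, so they are skipped
    doc["schema_version"] = type(obj).SCHEMA_VERSION
    return doc


well = Well(name="W-1", depth_m=2500.0)
print(collection_name(well), to_document(well))
```

Because old and new versions land in different collections, a new data model can be rolled out while documents of the previous version stay readable.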
[Diagram: Domain Object model (Well, Pump, Fluid Model, Trajectory, Black Oil, Thermal) with a POCO Layer]
Log Data
• Collect Log Data in a new Log-Sink
• Log Sink’s responsibility:
• Create log records collection per realization and time-step
• i.e.: create a document
• Save / Update document in MongoDB
Results:
• Super fast
• Searchable
• But too much “home grown”
Decision:
• remove all logging from any database and push log data to a vendor solution like Splunk or Elastic
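The log sink described above can be sketched as a small in-memory buffer. The class and field names are hypothetical, and the point where a real sink would save or update the document in the database is only marked by a comment.

```python
class LogSink:
    """Hypothetical log sink: one document per (realization, timestep)."""

    def __init__(self) -> None:
        self._docs: dict[tuple[str, int], dict] = {}

    def write(self, realization: str, timestep: int, level: str, message: str) -> None:
        """Append a log record to the document of its realization and time step."""
        key = (realization, timestep)
        doc = self._docs.setdefault(
            key, {"realization": realization, "timestep": timestep, "records": []}
        )
        doc["records"].append({"level": level, "message": message})

    def flush(self) -> list[dict]:
        """Return the buffered documents and clear the buffer.

        A real sink would save or update each document in the store here.
        """
        docs = list(self._docs.values())
        self._docs.clear()
        return docs


sink = LogSink()
sink.write("r-001", 42, "WARN", "time step cut: no convergence")
sink.write("r-001", 42, "INFO", "converged after retry")
sink.write("r-001", 43, "INFO", "converged")
flushed = sink.flush()
```

Grouping by realization and time step keeps the "why did this step not converge" question to a single document lookup.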
Time Series Data Context
• Different views and usages
• Common concept theme:
• Complete piece of equipment
• Individual measurements are a ‘drill down’ result
• Always in context of uncertainty scenario: realization
• Timestep: when was this measured and how large was the time step
• Run Time Context information
• Timestep sizes are dynamic (e.g. solver ‘cuts down’ on time to converge to a solution)
• Simulation sizes are dynamic:
• Week / Quarter / Asset Specific Period / Year / Full Lifecycle
• Not a ‘single’ measurement:
• Totals
• Averages
We’re in the business of deleting data!
Uncertainty & Optimization in a nutshell:
• Creating many realizations
• Analyzing the results of those realizations
• Categorizing and choosing the realizations that are relevant
• Deleting the results of the realizations that are not relevant
Example:
• Sensitivity Analysis: 10,000 realizations
• 3 Realizations are used (P10, P50, P90)
• 9,997 realizations are thrown away
• Uncertainty Propagation: 25 categorical realizations
• Top 5 selection
• Each with 10,000 realizations → 50,000 total
• P10, P50, P90
• 49,997 realizations are thrown away
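The sensitivity-analysis example above can be reproduced with a short sketch. Assumptions: each realization is summarized by one outcome value (random numbers stand in for simulated results), percentiles are picked with the nearest-rank method, and P10 is read as the low case. Industry conventions for P10 and P90 differ, so the ordering may need to be flipped.

```python
import random


def pick_percentile_realizations(outcomes: dict[str, float],
                                 percentiles: tuple[int, ...] = (10, 50, 90)) -> dict[int, str]:
    """Map each percentile to the realization id at its nearest rank."""
    ranked = sorted(outcomes, key=outcomes.get)  # ids ordered by outcome value
    n = len(ranked)
    picked = {}
    for p in percentiles:
        rank = max(1, (p * n + 99) // 100)  # integer ceiling of p/100 * n
        picked[p] = ranked[rank - 1]
    return picked


random.seed(7)
# Stand-in outcome per realization, e.g. a cumulative production figure.
outcomes = {f"r-{i:05d}": random.gauss(100.0, 15.0) for i in range(10_000)}

keep = set(pick_percentile_realizations(outcomes).values())
discard = [rid for rid in outcomes if rid not in keep]
print(len(keep), len(discard))  # prints "3 9997"
```

The `discard` list is what drives deletion: every document tagged with one of those realization ids can go.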
Recommended Best Practices For Time Series
Time Series Recommendation for Document Databases:
• Create a time- or size-based bucketing strategy
• Choose a time-frame (e.g. a day) for a property
• One document for that time-frame
• Any value recorded in that timeframe is inserted as part of that document
{
    "_id" : ObjectId("5b5279d1e303d394db6ea0f8"),
    "p" : {
        "0" : 56.56,
        "1" : 56.56,
        "2" : 56.58,
        …
        "59" : 57.02
    },
    "symbol" : "MDB",
    "d" : ISODate("2018-06-30T00:00:00Z")
},
{
    "_id" : ObjectId("5b5279d1e303d394db6ea134"),
    "p" : {
        "0" : 69.47,
        "1" : 69.47,
        "2" : 68.46,
        ...
        "59" : 69.45
    },
    "symbol" : "TSLA",
    "d" : ISODate("2018-06-30T00:01:00Z")
},...
https://www.mongodb.com/blog/post/time-series-data-and-mongodb-part-2-schema-design-best-practices
MongoDB Blog: Time Series Schema Design Best Practices
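The time-based bucketing shown in the sample documents can be sketched as follows. This reads the sample as one document per minute with one slot per second, which is an interpretation of the keys "0" to "59"; the function name is illustrative and nothing is written to a database.

```python
from datetime import datetime, timezone


def bucket_by_minute(symbol: str, samples: list[tuple[datetime, float]]) -> list[dict]:
    """Group per-second samples into one document per minute, keyed by second."""
    buckets: dict[datetime, dict] = {}
    for ts, value in samples:
        minute = ts.replace(second=0, microsecond=0)  # start of the bucket
        doc = buckets.setdefault(minute, {"symbol": symbol, "d": minute, "p": {}})
        doc["p"][str(ts.second)] = value
    return list(buckets.values())


samples = [
    (datetime(2018, 6, 30, 0, 0, 0, tzinfo=timezone.utc), 56.56),
    (datetime(2018, 6, 30, 0, 0, 1, tzinfo=timezone.utc), 56.56),
    (datetime(2018, 6, 30, 0, 1, 0, tzinfo=timezone.utc), 56.58),
]
bucketed = bucket_by_minute("MDB", samples)
```

The pattern assumes a fixed recording rate and a fixed bucket length, which is exactly what simulation output does not provide.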
Reality of Time Series in Simulations
• Time step sizes vary
• Number of time steps varies
• Equipment is not always known
• (infill strategies: do we need an extra well, do we need extra wind turbines, etc.)
Causes
• User selected length of simulation (1 year, 1 month, 2 years, 10 years, 1 day, etc.)
• User selected time step size and recording rate (1 day, 1 hour, 1 month, etc.)
• Solver ‘cuts’ time step sizes (on non-convergence, the parameter that can be changed is time)
• Equipment is created/invented dynamically
Other Requirements
• Gather all data of a single piece of equipment or a collection of equipment and properties (e.g. all power cap switches, energy contracts, accumulated power per sector, etc.)
• Deletion Based on Realization
• Deletion is not based on time-based retention
“Our” Time Series Document
{
    "realization" : "f4db735b-9411-4dc0-8156-55c5e2ef2493",
    "timestep" : 42,
    "start" : "2012-04-23T18:25:43.511Z",
    "stop" : "2012-05-23T18:25:43.511Z",
    "asset" : "6afbf5e2-6042-4509-9303-ebe4d5d95cec",
    "assetname" : "pump01",
    "properties":
    {
        "acc_inflow": "100501",
        "acc_outflow": "100501",
        "avg_pressuredrop": "25",
        "shutdown" :
        {
            "start" : "2012-04-28T15:14:00.000Z",
            "stop" : "2012-04-29T18:00:05.080Z"
        }
    }
}
Query Analysis for Indexes:
1. Realization (10%)
2. Asset (35%)
3. Realization & Asset (50%)
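The document shape and the query analysis above suggest a small set of indexes. The sketch below is an assumption, not the index set actually deployed: it relies on the MongoDB rule that a compound index also serves queries on its leading field, so two indexes cover the three query shapes. The index keys are plain data in the list-of-pairs form that pymongo's `create_index` accepts; no database connection is made, and the helper names are illustrative.

```python
from datetime import datetime, timezone


def make_timeseries_doc(realization: str, timestep: int, start: datetime, stop: datetime,
                        asset: str, assetname: str, properties: dict) -> dict:
    """Build one document per piece of equipment, realization and time step."""
    if stop <= start:
        raise ValueError("stop must be later than start")
    return {
        "realization": realization,
        "timestep": timestep,
        "start": start.isoformat(),  # explicit start/stop because step sizes vary
        "stop": stop.isoformat(),
        "asset": asset,
        "assetname": assetname,
        "properties": properties,
    }


# Assumed index set: the compound index serves "realization" and
# "realization & asset" queries, the single-field index serves "asset" queries.
INDEX_SPECS = [
    [("realization", 1), ("asset", 1)],
    [("asset", 1)],
]


def usable_indexes(query: dict) -> list[list[tuple[str, int]]]:
    """Indexes whose leading field is constrained by the query (prefix rule)."""
    return [spec for spec in INDEX_SPECS if spec[0][0] in query]


def delete_filter(realization: str) -> dict:
    """Filter selecting every document of one realization, e.g. for a bulk delete."""
    return {"realization": realization}


doc = make_timeseries_doc(
    "f4db735b-9411-4dc0-8156-55c5e2ef2493", 42,
    datetime(2012, 4, 23, tzinfo=timezone.utc), datetime(2012, 5, 23, tzinfo=timezone.utc),
    "6afbf5e2-6042-4509-9303-ebe4d5d95cec", "pump01", {"avg_pressuredrop": 25.0},
)
```

Deleting by realization then becomes a single filtered delete that can use the compound index, rather than a time-based retention job.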
Speed Improvements
Add some redundant data:
• Collection listing available Scenarios
• (User) Selection of scenarios becomes ‘fast’
• Collection listing available Realizations
• Relating Time Series Documents to a Scenario is faster than relating to a realization ‘nested’ in the Scenario Document
• Collection with Available Equipment & Descriptions
• Pick & Choose / Slice & Dice on equipment or production curves to plot
• “Cheating” by updating a nested collection with the realizations the equipment or properties exist in
• Not ‘ideal’ but no real performance impact or limits on document size
Some classic RDBMS / ERD Thinking can speed up
performance!
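The redundant equipment collection with its nested list of realizations can be sketched in memory. The catalog layout and function names are hypothetical; the duplicate check mimics what MongoDB's `$addToSet` update operator does for an array field.

```python
def register_equipment(catalog: dict[str, dict], asset: str, assetname: str,
                       realization: str) -> None:
    """Record that a piece of equipment occurs in a realization."""
    entry = catalog.setdefault(
        asset, {"asset": asset, "assetname": assetname, "realizations": []}
    )
    if realization not in entry["realizations"]:  # set semantics, like $addToSet
        entry["realizations"].append(realization)


def equipment_in_realization(catalog: dict[str, dict], realization: str) -> list[str]:
    """Names of all equipment that exists in the given realization."""
    return [e["assetname"] for e in catalog.values() if realization in e["realizations"]]


catalog: dict[str, dict] = {}
register_equipment(catalog, "a-1", "pump01", "r-001")
register_equipment(catalog, "a-1", "pump01", "r-001")  # repeated write is a no-op
register_equipment(catalog, "a-1", "pump01", "r-002")
register_equipment(catalog, "a-2", "turbine07", "r-002")
```

A selection screen can then list equipment per realization from this small collection without scanning the time series documents.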
[Diagram: Scenario, Realization, Equipment and Time Series collections]
Time Series Results
Pros:
• Writing is Super Fast
• Writing is Super Easy → Very low complexity in code
• Reading is Super Fast
• Most important: Removing data is Super Fast!
Cons:
• Super fast initial database growth
• Stabilizing after a few weeks
• Deletion regime is responsible for that
• Many documents → index memory requirements
Cons are ‘minor’ compared to the Pros → decisions are taken much faster, which is worth much more than the few extra $ spent on storage and memory
Note: something that still needs to be evaluated is the attribute pattern
https://www.mongodb.com/blog/post/building-with-patterns-the-attribute-pattern
Document Design vs No Design
Documents should be designed with purpose/goals in mind:
• Fast Writing
• Type of Querying
• Presentation
• Etc.
Simulator Domain Storage:
• Dump the Domain Model as is, Collection Per Class
Design exists in application and is ‘unusable’ by other consumers
Strategic decision: Domain Model is for ‘State’ and Application Optimization
The “interesting” data are Input & Results which is sufficient
Stuff we didn’t do (yet)
• Sharding
• One database per asset
• Legal constraints prevent data sharing
• Data that is sharable is master data, for which a different platform is used
• Reduce use of MongoDB
• Now we store ‘everything’
• Time Series data is streaming data
• Evaluating Kafka as intermediate layer
• Only store the interesting stuff (P10, P50, P90)
• Store the Grid Data
• Still file based
• GridFS was considered, but seems cumbersome
Conclusions
• Document Models are great!
• Super Flexible
• Performance of a database like MongoDB is great
• Own the Domain Logic!
• Ability to evaluate the solution on your domain knowledge
• MongoDB consultancy is excellent, but some domain terminology is only ‘seemingly’ compatible with other domains
• Don’t be afraid to experiment
• We didn’t do a ‘study up front’
• Timebox your effort and check on potential
• Use the right motivation
• We didn’t do this because NoSQL is cool (it is, but we have other stuff to do as well)
• Recognize that a data model (EAV) is problematic
• Evaluate ‘what solutions’ can negate that problem
(that is the part where you can study ‘upfront’)
Questions
Our commitment to you
We approach every engagement with one
objective in mind: to help clients succeed

MongoDB World 2019: From Traditional Oil and Gas to Sustainable Energy OR From a Traditional RDBMS to MongoDB

Marine Air Ground Task Force Command & Control Systems Software Deployment an...
Marine Air Ground Task Force Command & Control Systems Software Deployment an...Marine Air Ground Task Force Command & Control Systems Software Deployment an...
Marine Air Ground Task Force Command & Control Systems Software Deployment an...LaurenWendler
 
Simulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightningSimulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightningCloudLightning
 
RECAP: The Simulation Approach
RECAP: The Simulation ApproachRECAP: The Simulation Approach
RECAP: The Simulation ApproachRECAP Project
 
Aged Data Center Infrastructure.pptx
Aged Data Center Infrastructure.pptxAged Data Center Infrastructure.pptx
Aged Data Center Infrastructure.pptxSchneider Electric
 
Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospectc-bslim
 
VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...
VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...
VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...VMworld
 
Addressing Uncertainty How to Model and Solve Energy Optimization Problems
Addressing Uncertainty How to Model and Solve Energy Optimization ProblemsAddressing Uncertainty How to Model and Solve Energy Optimization Problems
Addressing Uncertainty How to Model and Solve Energy Optimization Problemsoptimizatiodirectdirect
 
Filipe paternot - Case Study: Zabbix Deployment at Globo.com
Filipe paternot - Case Study: Zabbix Deployment at Globo.comFilipe paternot - Case Study: Zabbix Deployment at Globo.com
Filipe paternot - Case Study: Zabbix Deployment at Globo.comZabbix
 
IBM Maximo Performance Tuning
IBM Maximo Performance TuningIBM Maximo Performance Tuning
IBM Maximo Performance TuningFMMUG
 
Evolving for Kubernetes
Evolving for KubernetesEvolving for Kubernetes
Evolving for KubernetesChris McEniry
 
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & ConsCognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & ConsSenturus
 
Virtualization for efficiency: by Kathrin Winkler, The green grid
Virtualization for efficiency: by Kathrin Winkler, The green gridVirtualization for efficiency: by Kathrin Winkler, The green grid
Virtualization for efficiency: by Kathrin Winkler, The green gridDCC Mission Critical
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications OpenEBS
 
Cloud-Native & Sustainability: How and Why to Build Sustainable Workloads
Cloud-Native & Sustainability: How and Why to Build Sustainable WorkloadsCloud-Native & Sustainability: How and Why to Build Sustainable Workloads
Cloud-Native & Sustainability: How and Why to Build Sustainable WorkloadsNico Meisenzahl
 
RightScale Webinar feat. Redapt: How to Build a Private or Hybrid Cloud
RightScale Webinar feat. Redapt:  How to Build a Private or Hybrid CloudRightScale Webinar feat. Redapt:  How to Build a Private or Hybrid Cloud
RightScale Webinar feat. Redapt: How to Build a Private or Hybrid CloudRightScale
 
Presentation cmg2016 capacity management essentials-boston
Presentation   cmg2016 capacity management essentials-bostonPresentation   cmg2016 capacity management essentials-boston
Presentation cmg2016 capacity management essentials-bostonMohit Verma
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIRightScale
 

Similar to MongoDB World 2019: From Traditional Oil and Gas to Sustainable Energy OR From a Traditional RDBMS to MongoDB (20)

Marine Air Ground Task Force Command & Control Systems Software Deployment an...
Marine Air Ground Task Force Command & Control Systems Software Deployment an...Marine Air Ground Task Force Command & Control Systems Software Deployment an...
Marine Air Ground Task Force Command & Control Systems Software Deployment an...
 
Simulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightningSimulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightning
 
RECAP: The Simulation Approach
RECAP: The Simulation ApproachRECAP: The Simulation Approach
RECAP: The Simulation Approach
 
Aged Data Center Infrastructure.pptx
Aged Data Center Infrastructure.pptxAged Data Center Infrastructure.pptx
Aged Data Center Infrastructure.pptx
 
Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospect
 
Designing Scalable Applications
Designing Scalable ApplicationsDesigning Scalable Applications
Designing Scalable Applications
 
VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...
VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...
VMworld 2013: US Air National Guard - DoD Private Cloud Initiative –How Virtu...
 
Addressing Uncertainty How to Model and Solve Energy Optimization Problems
Addressing Uncertainty How to Model and Solve Energy Optimization ProblemsAddressing Uncertainty How to Model and Solve Energy Optimization Problems
Addressing Uncertainty How to Model and Solve Energy Optimization Problems
 
Filipe paternot - Case Study: Zabbix Deployment at Globo.com
Filipe paternot - Case Study: Zabbix Deployment at Globo.comFilipe paternot - Case Study: Zabbix Deployment at Globo.com
Filipe paternot - Case Study: Zabbix Deployment at Globo.com
 
IBM Maximo Performance Tuning
IBM Maximo Performance TuningIBM Maximo Performance Tuning
IBM Maximo Performance Tuning
 
Release it! - Takeaways
Release it! - TakeawaysRelease it! - Takeaways
Release it! - Takeaways
 
Evolving for Kubernetes
Evolving for KubernetesEvolving for Kubernetes
Evolving for Kubernetes
 
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & ConsCognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
 
Virtualization for efficiency: by Kathrin Winkler, The green grid
Virtualization for efficiency: by Kathrin Winkler, The green gridVirtualization for efficiency: by Kathrin Winkler, The green grid
Virtualization for efficiency: by Kathrin Winkler, The green grid
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications
 
Cloud-Native & Sustainability: How and Why to Build Sustainable Workloads
Cloud-Native & Sustainability: How and Why to Build Sustainable WorkloadsCloud-Native & Sustainability: How and Why to Build Sustainable Workloads
Cloud-Native & Sustainability: How and Why to Build Sustainable Workloads
 
Java Performance Tuning
Java Performance TuningJava Performance Tuning
Java Performance Tuning
 
RightScale Webinar feat. Redapt: How to Build a Private or Hybrid Cloud
RightScale Webinar feat. Redapt:  How to Build a Private or Hybrid CloudRightScale Webinar feat. Redapt:  How to Build a Private or Hybrid Cloud
RightScale Webinar feat. Redapt: How to Build a Private or Hybrid Cloud
 
Presentation cmg2016 capacity management essentials-boston
Presentation   cmg2016 capacity management essentials-bostonPresentation   cmg2016 capacity management essentials-boston
Presentation cmg2016 capacity management essentials-boston
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROI
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...

MongoDB World 2019: From Traditional Oil and Gas to Sustainable Energy OR From a Traditional RDBMS to MongoDB

  • 1. © CGI Group Inc. From Traditional Oil & Gas to Sustainable Energy with MongoDB: underpinned with a traditional RDBMS migration to MongoDB
  • 2. Technology because it is cool?
  • 3. What do we do in Subsurface and Wells: a quick summary
      • We model and simulate what happens underground:
          • Identify what layers the earth consists of
          • What geological faults exist
          • What wells we are using
          • And a “grid” where the 3D structure is discretized in space
              • Where each cell has physical and dynamic properties
          • What kind of surface network is used to get the oil from the well to a desired location
      • Then a simulation determines ‘what will happen’ when a well is created, creating a pressure difference and starting to ‘pump stuff’ up out of the ground
      What does that look like?
  • 9. Dynamic Simulations and a lot of Uncertainty
      • Simulating subsurface flow is like predicting the weather
          • Only the average meteorologist has less uncertainty
      • Simulations can cover up to 40 years for a full field lifecycle
      • For a yearly operational plan, usually on the order of 10,000 (E+04) realizations are created
          • For a single asset
          • With a major having 100+ assets, that is a lot!
      • What kind of information is processed?
          • Distribution of properties (Property Arrays)
          • Production Curves
          • Operating curves of surface equipment
  • 10. Integration with a Surface Network
      All the equipment/hardware used to manage the hydrocarbon production of a field: a Digital Twin
      Key words:
      • Expensive
      • Determines field production
      • Needs maintenance
      • Controls the field
      • Provides safety to a well
      • Etc.
      More and more focus on integrating the surface network into the simulation:
      • Lots and lots of measurement points
      • Measurements from many different viewpoints (operations, safety, quality of service, planning, etc.)
      • Lots and lots of opportunity to ‘match’ against actually measured values
      • Orthogonal to the subsurface: more than 10 yards away from the well and you have to ‘guess’ what is happening
  • 11. Data Types
      • Time Series Property Arrays
          • Multi-dimensional double arrays, recorded over time
          • Stored on disk
      • State Model
          • The object-oriented in-memory model of the simulator
          • Stored in tables in an RDBMS
      • Time Series / Production Results
          • Time series per ‘property’ on a piece of equipment (pressure, temperature, flow rate, rotations, on/off, etc.)
          • Stored in an EAV (entity-attribute-value) model in an RDBMS
      • Log Data
          • Scientific Software
          • Simulation is about solving Jacobian matrices
          • Log is used to find out “why” a simulation does or does not converge
          • Stored in a single log table
      [Diagram labels: Domain Object, Well, Pump, Fluid Model, Trajectory, Black Oil, Thermal]
  • 12. Data Distribution
      [Chart legend: Grid, Properties, Log, Domain (Disk)]
      • Grid: 40%
      • Properties: 25%
      • Logs: 25%
      • Domain: 10%
      5-10 TB per month!
  • 13. RDBMS and where is the pain?
      • Maintenance of state table structure
          • Any update to the domain model results in a new database configuration
          • Upgrading is hard
      • Single table with a lot of information: log data
          • Full text
          • Not easily searchable
          • All logs from all realizations in one table
      • EAV
          • Every single double is stored as a record in a single table
          • Heavy reliance on indexes (foreign keys) to find the right value for
              • The right asset
              • The right realization
              • The right piece of equipment
          • This is the biggest problem in the database: a heavily indexed table with billions of rows
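To make the EAV pain concrete, here is a minimal Python sketch. The row layout, field names, and values are assumptions for illustration, not the talk's actual schema. It shows how the values of one piece of equipment at one time step are scattered over several EAV rows, and how grouping them yields one document per realization, equipment, and time step.

```python
from collections import defaultdict

# Hypothetical EAV rows: (realization, equipment, timestep, attribute, value).
# In the RDBMS each of these would be one record in a single, heavily indexed table.
eav_rows = [
    ("r1", "pump01", 42, "acc_inflow", 100501.0),
    ("r1", "pump01", 42, "acc_outflow", 100501.0),
    ("r1", "pump01", 42, "avg_pressuredrop", 25.0),
    ("r1", "pump02", 42, "acc_inflow", 90310.0),
]


def eav_to_documents(rows):
    """Group EAV rows into one document per (realization, equipment, timestep)."""
    grouped = defaultdict(dict)
    for realization, equipment, timestep, attribute, value in rows:
        grouped[(realization, equipment, timestep)][attribute] = value
    return [
        {"realization": r, "assetname": e, "timestep": t, "properties": props}
        for (r, e, t), props in grouped.items()
    ]


documents = eav_to_documents(eav_rows)
```

Reading all properties of `pump01` at time step 42 then means fetching one document instead of joining three indexed rows.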
  • 14. New Energies: Optimizers and the Surface Network
      The simulator optimizes for production
      • Whether that is oil or solar power or wind power: the optimizer is domain agnostic
      • Delivering the energy product is constrained by the surface network
      • Surface Network Complexity:
          • Oil & Gas → Very Complex
          • New Energies → Super Complex
      Characteristics:
      • Much more equipment
      • More complex connections
      • Very variable production curves (it gets dark, no wind, clouds, too much wind, etc.)
      Result: many more measurements; EAV becomes unmanageable
  • 15. Reasons to Migrate to MongoDB
      • EAV was the first problem to tackle:
          • Organizational convenience: MongoDB already exists in the landscape
          • Time Series Database (Influx) versus MongoDB:
              • The EAV problem is ‘disguised’ as a time-series problem
              • Not a real-time measurement problem
              • Instead:
                  • In context of an uncertainty realization, NOT a single property continuously measured
                  • Equipment & Contracts & Total Values
                  • The equipment or contract the measurement belongs to is important
                  • It is simulated data: 85% or more of the data needs to be thrown away again
      • And the rest:
          • While architecting, the domain and log data emerged as ideal candidates for a document database as well
  • 16. The Domain Model: a lucky break
      • Domain model was translated to RDBMS with a POCO (Plain Old CLR Object) layer
          • Each Domain object has a POCO definition and mapping
          • RDBMS database is generated based on POCO definitions
          • All ‘relational’ logic exists in the application → none in the database
      • Simple surgery: instead of saving the POCO into an existing table, use MongoDB’s capability → Collection is generated automatically
      Enhancements:
      • Version number of POCO definition
      • Collections with version numbers
      Results:
      • Upgradable database
      • No downtime with new data models
      • Auto generation
      [Diagram labels: Domain Object, Well, Pump, Fluid Model, Trajectory, Black Oil, Thermal, POCO Layer]
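The slide does not say how version numbers are attached to collections. One possible scheme, sketched in Python rather than the .NET the talk implies, is to derive the collection name from the class name and a version constant. The `Well` class, its fields, the `SCHEMA_VERSION` attribute, and the `name_vN` naming are all hypothetical, and the dictionary stands in for a real database.

```python
from dataclasses import dataclass, asdict
from typing import ClassVar


@dataclass
class Well:
    # Hypothetical version of this object's definition; bump it when fields change.
    SCHEMA_VERSION: ClassVar[int] = 2
    name: str
    trajectory: str
    depth_m: float


def collection_name(obj) -> str:
    """Derive a versioned collection name, e.g. 'well_v2' (assumed naming scheme)."""
    return f"{type(obj).__name__.lower()}_v{obj.SCHEMA_VERSION}"


def to_document(obj) -> dict:
    """Turn the object into a plain document and record its definition version."""
    doc = asdict(obj)
    doc["schema_version"] = obj.SCHEMA_VERSION
    return doc


def save(store: dict, obj) -> None:
    """Append to the versioned collection, creating it on first use."""
    store.setdefault(collection_name(obj), []).append(to_document(obj))


store = {}  # in-memory stand-in: collection name -> list of documents
save(store, Well(name="W-1", trajectory="deviated", depth_m=3200.0))
```

Under this assumption a new definition version writes to a new collection, so documents in the old layout stay readable while the application upgrades.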
  • 17. Log Data
      • Collect log data in a new Log Sink
      • Log Sink’s responsibility:
          • Create a log records collection per realization and time step, i.e. create a document
          • Save / update the document in MongoDB
      Results:
      • Super fast
      • Searchable
      • But too much “home grown”
      Decision:
      • Remove all logging from any database and push log data to a vendor solution like Splunk or Elastic
  • 18. Time Series Data Context
      • Different views and usages
      • Common concept theme:
          • Complete piece of equipment
          • Individual measurements are a ‘drill down’ result
          • Always in context of an uncertainty scenario: realization
          • Timestep: when was this measured and how large was the time step
      • Run Time Context information
          • Timestep sizes are dynamic (e.g. solver ‘cuts down’ on time to converge to a solution)
          • Simulation sizes are dynamic:
              • Week / Quarter / Asset Specific Period / Year / Full Lifecycle
      • Not a ‘single’ measurement:
          • Totals
          • Averages
  • 19. We’re in the business of deleting data!
      Uncertainty & Optimization in a nutshell:
      • Creating many realizations
      • Analyzing the results of those realizations
      • Categorizing and choosing the realizations that are relevant
      • Deleting the results of the realizations that are not relevant
      Example:
      • Sensitivity Analysis: 10,000 realizations
          • 3 realizations are used (P10, P50, P90)
          • 9,997 realizations are thrown away
      • Uncertainty Propagation: 25 categorical realizations
          • Top 5 selection
          • Each with 10,000 realizations → 50,000 total
          • P10, P50, P90
          • 49,997 realizations are thrown away
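The arithmetic of the sensitivity-analysis example can be sketched in a few lines of Python. This assumes each realization is summarized by one hypothetical scalar outcome and that P10, P50, and P90 are picked by plain nearest-rank percentile on the sorted outcomes. Industry conventions for which end of the distribution P10 and P90 refer to differ, so treat the ranking direction as an assumption as well.

```python
import random


def pick_percentile_realizations(outcomes, percentiles=(10, 50, 90)):
    """Return {percentile: realization_id} using the nearest-rank method."""
    ranked = sorted(outcomes.items(), key=lambda item: item[1])
    n = len(ranked)
    picks = {}
    for p in percentiles:
        rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
        picks[p] = ranked[rank - 1][0]
    return picks


# 10,000 realizations with a made-up scalar outcome each (seeded for repeatability).
rng = random.Random(0)
outcomes = {f"realization-{i}": rng.gauss(100.0, 15.0) for i in range(10_000)}

picks = pick_percentile_realizations(outcomes)
keep = set(picks.values())
discard = set(outcomes) - keep  # everything else is a candidate for deletion
```

With 10,000 realizations the three picks sit at distinct ranks, so `discard` holds the remaining 9,997 realization ids.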
  • 20. Recommended Best Practices for Time Series
      Time series recommendation for document databases:
      • Create a time- or size-based bucketing strategy
          • Choose a time frame (e.g. a day) for a property
          • One document for that time frame
          • Any value recorded in that time frame is inserted as part of that document
      Example documents:
          {
            "_id" : ObjectId("5b5279d1e303d394db6ea0f8"),
            "p" : { "0" : 56.56, "1" : 56.56, "2" : 56.58, … "59" : 57.02 },
            "symbol" : "MDB",
            "d" : ISODate("2018-06-30T00:00:00Z")
          },
          {
            "_id" : ObjectId("5b5279d1e303d394db6ea134"),
            "p" : { "0" : 69.47, "1" : 69.47, "2" : 68.46, ... "59" : 69.45 },
            "symbol" : "TSLA",
            "d" : ISODate("2018-06-30T00:01:00Z")
          },
          ...
      Source: MongoDB Blog, Time Series Schema Design Best Practices: https://www.mongodb.com/blog/post/time-series-data-and-mongodb-part-2-schema-design-best-practices
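The time-based bucketing recommended above can be sketched as follows. This is a minimal illustration in plain Python, assuming one bucket per symbol per minute with the second as the key inside `p`, which mirrors the shape of the example documents. The function name and the sample values are made up.

```python
from datetime import datetime, timezone


def bucket_by_minute(symbol, samples):
    """Group (timestamp, value) samples into one document per minute.

    Each document keeps the minute in "d" and the values in "p", keyed by second.
    """
    buckets = {}
    for ts, value in samples:
        minute = ts.replace(second=0, microsecond=0)
        doc = buckets.setdefault(minute, {"symbol": symbol, "d": minute, "p": {}})
        doc["p"][str(ts.second)] = value
    return list(buckets.values())


samples = [
    (datetime(2018, 6, 30, 0, 0, 0, tzinfo=timezone.utc), 56.56),
    (datetime(2018, 6, 30, 0, 0, 1, tzinfo=timezone.utc), 56.56),
    (datetime(2018, 6, 30, 0, 1, 0, tzinfo=timezone.utc), 56.60),
]
bucket_docs = bucket_by_minute("MDB", samples)
```

The scheme relies on a fixed time frame per document, which is the assumption that varying time step sizes in a simulation can break.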
  • 21. Reality of Time Series in Simulations
      • Time step sizes vary
      • Number of time steps varies
      • Equipment is not always known (infill strategies: do we need an extra well, do we need extra wind turbines, etc.)
      Causes:
      • User-selected length of simulation (1 year, 1 month, 2 years, 10 years, 1 day, etc.)
      • User-selected time step size and recording rate (1 day, 1 hour, 1 month, etc.)
      • Solver ‘cuts’ time step sizes (on non-convergence, the parameter that can be changed is time)
      • Equipment is created/invented dynamically
      Other requirements:
      • Gather all data of a single piece of equipment, or of a collection of equipment and properties (e.g. all power cap switches, energy contracts, accumulated power per sector, etc.)
      • Deletion based on realization
      • Deletion is not based on time-based retention
  • 22. “Our” Time Series Document
          {
            "realization" : "f4db735b-9411-4dc0-8156-55c5e2ef2493",
            "timestep" : 42,
            "start" : "2012-04-23T18:25:43.511Z",
            "stop" : "2012-05-23T18:25:43.511Z",
            "asset" : "6afbf5e2-6042-4509-9303-ebe4d5d95cec",
            "assetname" : "pump01",
            "properties": {
              "acc_inflow": "100501",
              "acc_outflow": "100501",
              "avg_pressuredrop": "25",
              "shutdown" : {
                "start" : "2012-04-28T15:14:00.000Z",
                "stop" : "2012-04-29T18:00:05.080Z"
              }
            }
          }
      Query Analysis for Indexes:
      1. Realization (10%)
      2. Asset (35%)
      3. Realization & Asset (50%)
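The three query shapes and realization-based deletion can be illustrated with a small in-memory sketch in plain Python. The `find` and `delete_many` helpers are stand-ins written for this example, not driver calls, and the index choice is one possible reading of the query analysis rather than the indexes the team actually created. The key specifications use the list of (field, direction) pairs that PyMongo's `create_index` accepts.

```python
# One possible index choice (an assumption): a compound index on
# (realization, asset) serves queries 1 and 3 via its prefix; a separate
# index on asset serves query 2. Direction 1 means ascending.
INDEX_REALIZATION_ASSET = [("realization", 1), ("asset", 1)]
INDEX_ASSET = [("asset", 1)]


def _matches(doc, query):
    """True when every field in the equality filter matches the document."""
    return all(doc.get(field) == value for field, value in query.items())


def find(collection, query):
    """In-memory stand-in for an equality-filter query."""
    return [doc for doc in collection if _matches(doc, query)]


def delete_many(collection, query):
    """Remove all matching documents in place and return how many were removed."""
    kept = [doc for doc in collection if not _matches(doc, query)]
    deleted = len(collection) - len(kept)
    collection[:] = kept
    return deleted


# Hypothetical documents in the shape shown above (ids shortened).
timeseries = [
    {"realization": "real-A", "timestep": 41, "asset": "pump-1", "assetname": "pump01",
     "properties": {"acc_inflow": "100100"}},
    {"realization": "real-A", "timestep": 42, "asset": "pump-1", "assetname": "pump01",
     "properties": {"acc_inflow": "100501"}},
    {"realization": "real-B", "timestep": 42, "asset": "pump-1", "assetname": "pump01",
     "properties": {"acc_inflow": "99870"}},
]

by_realization = find(timeseries, {"realization": "real-A"})                # query 1
by_asset = find(timeseries, {"asset": "pump-1"})                            # query 2
by_both = find(timeseries, {"realization": "real-B", "asset": "pump-1"})    # query 3

# Deletion is driven by realization, not by age: drop everything for real-A.
removed = delete_many(timeseries, {"realization": "real-A"})
```

Because every document carries its realization id, throwing away an irrelevant realization is a single filter on one field.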
  • 23. Speed Improvements
      Add some redundant data:
      • Collection listing available Scenarios
          • (User) selection of scenarios becomes ‘fast’
      • Collection listing available Realizations
          • Relating Time Series Documents to a Scenario is faster than relating to a realization ‘nested’ in the Scenario Document
      • Collection with available Equipment & Descriptions
          • Pick & Choose / Slice & Dice on equipment or production curves to plot
          • “Cheating” by updating a nested collection with the realizations the equipment or properties exist in
          • Not ‘ideal’ but no real performance impact or limits on document size
      Some classic RDBMS / ERD thinking can speed up performance!
      [Diagram labels: Scenario, Equipment, Time Series, Realization]
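A rough Python sketch of the redundant lookup data described above, using an in-memory dictionary as a stand-in for the database. The collection names, the `record_time_series` helper, and the field layout are assumptions for illustration. The idea is that each write of a time series document also updates small catalogs of scenarios, realizations, and equipment, so pick lists never need to scan the time series itself.

```python
def new_store():
    """In-memory stand-in for the database and its redundant catalogs."""
    return {"timeseries": [], "scenarios": set(), "realizations": {}, "equipment": {}}


def record_time_series(store, scenario, doc):
    """Store a time series document and keep the lookup catalogs up to date."""
    store["timeseries"].append(doc)
    store["scenarios"].add(scenario)
    # Flat realization -> scenario mapping instead of nesting inside the scenario.
    store["realizations"][doc["realization"]] = scenario
    entry = store["equipment"].setdefault(
        doc["asset"],
        {"assetname": doc["assetname"], "properties": set(), "realizations": set()},
    )
    entry["properties"].update(doc["properties"].keys())
    # The "cheat": remember which realizations this equipment appears in.
    entry["realizations"].add(doc["realization"])


store = new_store()
record_time_series(store, "scenario-1", {
    "realization": "real-A", "timestep": 42, "asset": "pump-1", "assetname": "pump01",
    "properties": {"acc_inflow": "100501", "avg_pressuredrop": "25"},
})
record_time_series(store, "scenario-1", {
    "realization": "real-B", "timestep": 42, "asset": "pump-1", "assetname": "pump01",
    "properties": {"acc_inflow": "99870"},
})
```

A user interface can then list scenarios, realizations, or plottable properties per piece of equipment from the catalogs alone.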
  • 24. Time Series Results
      Pros:
      • Writing is Super Fast
      • Writing is Super Easy → very low complexity in code
      • Reading is Super Fast
      • Most important: removing data is Super Fast!
      Cons:
      • Super fast initial database growth
          • Stabilizing after a few weeks
          • Deletion regime is responsible for that
      • Many documents → index memory requirements
      Cons are ‘minor’ compared to the Pros → decisions are taken much faster, which is worth much more than the few extra $ spent on storage and memory
      Note: something that still needs to be evaluated is the attribute pattern: https://www.mongodb.com/blog/post/building-with-patterns-the-attribute-pattern
  • 25. Document Design vs No Design
      Documents should be designed with purpose/goals in mind:
      • Fast Writing
      • Type of Querying
      • Presentation
      • Etc.
      Simulator Domain Storage:
      • Dump the Domain Model as is, Collection Per Class
      Design exists in the application and is ‘unusable’ by other consumers
      Strategic decision: Domain Model is for ‘State’ and Application Optimization
      The “interesting” data are Input & Results, which is sufficient
      [Diagram labels: Domain Object, Well, Pump, Fluid Model, Trajectory, Black Oil, Thermal, POCO Layer]
  • 26. Stuff we didn’t do (yet)
      • Sharding
          • One database per asset
          • Legal constraints prevent data sharing
          • Data that is sharable is master data, for which a different platform is used
      • Reduce use of MongoDB
          • Now we store ‘everything’
          • Time Series data is streaming data
          • Evaluating Kafka as intermediate layer
          • Only store the interesting stuff (P10, P50, P90)
      • Store the Grid Data
          • Still file based
          • GridFS has been considered, but seems cumbersome
  • 27. Conclusions
      • Document Models are great!
          • Super Flexible
          • Performance of a database like MongoDB is great
      • Own the Domain Logic!
          • Ability to evaluate the solution on your domain knowledge
          • MongoDB consultancy is excellent, but some domain terminology is only ‘seemingly’ compatible with other domains
      • Don’t be afraid to experiment
          • We didn’t do a ‘study up front’
          • Timebox your effort and check on potential
      • Use the right motivation
          • We didn’t do this because NoSQL is cool (it is, but we have other stuff to do as well)
          • Recognize that a data model (EAV) is problematic
          • Evaluate ‘what solutions’ can negate that problem (that is the part where you can study ‘upfront’)
  • 28. Questions
  • 29. Our commitment to you: we approach every engagement with one objective in mind: to help clients succeed