3. Geosciences
Data Modeling in GeoSciences, Feb. 2016, Tehran, Iran
• The science of Earth is complicated…
Hence, the data!
4. Data in Geosciences
• Data in Geoscience is VERY
– Big
– Diverse
– Complex
– Volatile
– Inter-connected
• Look at
– EPA
– USGS
– OneGeology
– GEON
– EarthCube
5. Paradigm Shift
• From:
– Experimental
– Theoretical
– Computational
• Data Intensive Science has emerged!
– Doing science by analyzing data
6. Modeling
• A representation of:
– Process
– Concept
– Operation
of a System
8. Data Model
• Representation of a system in terms of:
– Entities
– Relationships
– Data Flows
– Workflows
Analogous to Geographic maps
9. Data Modeling
• The process of creating a data model
• For an information system
• By applying formal techniques
• Using proper tools (usually)
Analogous to Cartography
10. Data Modeling in Geoscience
“A rock is a naturally occurring solid aggregate of one or more minerals or mineraloids” – Wikipedia
Samples of rocks on Earth and Mars
• Should it be natural?
• Can’t it be soft?
• Aggregate OR Composite?
• What about the proportion of minerals?
11. Data Modeling in Geosciences
• British Geological Survey
– Open Geological Data Models
• Geochemistry Data Model
[Diagram: entities of the Geochemistry Data Model]
– SITE
– SAMPLE
– BATCH
– ANALYSIS
– ANALYTE_DETERMINATION_LIMITS
– ANALYTE_DETERMINATION
– DIC_Laboratory
– DIC_Analysis_Method
– DIC_Analysis_Preparation
– DIC_Analyte
Example identifiers:
– Sample_Ids: A, B, C
– Batch_Ids: X, Y
– Sample_ID, Batch_Id: (A, X), (B, Y)
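The sample and batch identifiers on this slide can be read as a link between two entities. The sketch below is one possible way to express that with Python's sqlite3 module, assuming the pairs describe which sample was analyzed in which batch. The table and column names are hypothetical and simplified; they are not the actual BGS Geochemistry Data Model definitions.

```python
import sqlite3

# Hypothetical, simplified schema: names are illustrative only and are
# not the actual BGS Geochemistry Data Model definitions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE sample (sample_id TEXT PRIMARY KEY);
    CREATE TABLE batch  (batch_id  TEXT PRIMARY KEY);
    -- Link table: one row per (sample, batch) pairing.
    CREATE TABLE sample_batch (
        sample_id TEXT NOT NULL REFERENCES sample(sample_id),
        batch_id  TEXT NOT NULL REFERENCES batch(batch_id),
        PRIMARY KEY (sample_id, batch_id)
    );
""")

# The identifiers listed on the slide.
conn.executemany("INSERT INTO sample VALUES (?)", [("A",), ("B",), ("C",)])
conn.executemany("INSERT INTO batch VALUES (?)", [("X",), ("Y",)])
conn.executemany("INSERT INTO sample_batch VALUES (?, ?)",
                 [("A", "X"), ("B", "Y")])


def samples_without_batch(connection):
    """Return the ids of samples that are not linked to any batch."""
    rows = connection.execute("""
        SELECT s.sample_id
        FROM sample s
        LEFT JOIN sample_batch sb ON sb.sample_id = s.sample_id
        WHERE sb.sample_id IS NULL
        ORDER BY s.sample_id
    """).fetchall()
    return [row[0] for row in rows]


def rejects_unknown_sample(connection):
    """Return True if linking a non-existent sample is refused."""
    try:
        connection.execute("INSERT INTO sample_batch VALUES ('Z', 'X')")
    except sqlite3.IntegrityError:
        return True
    return False
```

With the slide's data, sample C has no batch, and the foreign keys refuse a pairing that mentions an unknown sample. This is the kind of rule a data model makes explicit.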
12. The Geologic Map of NADM
• NADM Conceptual Model 1.0
• Geologic concept hierarchy
14. GSI Iran
• I know of a lot of work done
– Unclear licensing!
– Not published!!
• So…
– Not Accessible!
22. Modeling Approaches
• Network/Graph Data Model
– Water Grid Modeling
– Process Modeling
• RDF
• Graph Databases
– Neo4J
– IBM System G
– Info Grid
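A minimal sketch of the network/graph approach, assuming a water grid is modeled as a directed graph whose nodes are facilities and whose edges are pipes. The facility names are hypothetical. A graph database would store the same structure natively; plain Python dictionaries are used here only to show the idea.

```python
from collections import deque

# Hypothetical water grid: keys are nodes, values are the nodes they feed.
water_grid = {
    "reservoir": ["treatment_plant"],
    "treatment_plant": ["pump_station"],
    "pump_station": ["district_a", "district_b"],
    "district_a": [],
    "district_b": [],
}


def downstream(graph, start):
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order
```

A question such as "which districts are affected if the treatment plant fails?" then becomes a traversal: downstream(water_grid, "treatment_plant").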
24. Modeling Approaches: Object Oriented Modeling
[UML class diagram centered on Data Container]
• Classes shown:
– Data Container
– Data Attribute
– Metadata Attribute
– Extended Property
– Globalization Info
– Measurement Scale «enumeration»
– Data Type
– Unit
– Methodology
– Aggregate Function
– Semantic
– Description
• Data Container Constraint:
– Default Value
– Domain Value
– Validator
• Labels shown: 0..1, 1, +Applies To, {No Duplicate}, {No Extended Property}
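A rough object-oriented sketch of a few classes named in this diagram, written with Python dataclasses. It makes several assumptions, because the diagram survives only as text: that Data Attribute and Metadata Attribute specialize Data Container, that a container has one data type and at most one unit, and that the Measurement Scale enumeration has the four values below. None of these are confirmed by the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MeasurementScale(Enum):
    # Assumed values; the slide only names the enumeration.
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


@dataclass
class Unit:
    name: str
    abbreviation: str


@dataclass
class DataContainer:
    name: str
    data_type: str                  # assumed multiplicity: exactly one
    scale: MeasurementScale
    unit: Optional[Unit] = None     # assumed multiplicity: 0..1
    description: str = ""

    def label(self):
        """Return a display label such as 'depth [m]'."""
        if self.unit is None:
            return self.name
        return f"{self.name} [{self.unit.abbreviation}]"


@dataclass
class DataAttribute(DataContainer):
    """A container that describes a variable in a dataset (assumed role)."""


@dataclass
class MetadataAttribute(DataContainer):
    """A container that describes a metadata field (assumed role)."""
```

For example, DataAttribute("depth", "float", MeasurementScale.RATIO, Unit("metre", "m")) and MetadataAttribute("title", "string", MeasurementScale.NOMINAL) share all behavior defined on DataContainer.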
25. Modeling Techniques
• ERDs:
– Are mostly relational
– Do not capture behaviors
– Do not capture processes and sequences
26. Modeling techniques
• OOM (Object Oriented Modeling)
– More natural to Objects/features/behaviors
– Flexible relationships
– Various aspect models
• Structural
• Behavioral
• Sequences
• Timing
27. Modeling Techniques: Structural Aspect
[UML class diagram of the metadata structure]
• Classes shown:
– Metadata Structure
– Metadata Package
– Metadata Attribute
– Metadata
– Metadata Attribute Value
– Mapping Info
– Dataset Version
– Dataset
– Metadata Compound Attribute
– Metadata Simple Attribute
– Data Container
• Usage classes:
– Metadata Package Usage
– Metadata Attribute Usage
– Metadata Compound Usage
– Base Usage
• Labels shown: +Parent, +Children, {No Extended Property}
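The +Parent/+Children roles together with the simple and compound metadata attributes suggest a tree of attributes. The sketch below shows one way such a structure could be expressed, assuming a compound attribute holds child attributes and a simple attribute is a leaf. This is an interpretation of the diagram, not the BExIS implementation, and the example attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataAttribute:
    name: str

    def leaf_names(self):
        """Return the names of all simple attributes below this one."""
        return [self.name]


@dataclass
class MetadataSimpleAttribute(MetadataAttribute):
    """A leaf attribute that carries a single value."""


@dataclass
class MetadataCompoundAttribute(MetadataAttribute):
    # Assumed reading of the +Children role: an ordered list of attributes.
    children: List[MetadataAttribute] = field(default_factory=list)

    def leaf_names(self):
        names = []
        for child in self.children:
            names.extend(child.leaf_names())
        return names


# Hypothetical example: a contact block with a nested address.
contact = MetadataCompoundAttribute("contact", [
    MetadataSimpleAttribute("email"),
    MetadataCompoundAttribute("address", [
        MetadataSimpleAttribute("city"),
        MetadataSimpleAttribute("country"),
    ]),
])
```

Calling contact.leaf_names() walks the tree and collects the simple attributes in order.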
28. Why do modeling?
29. Benefits: Communication
• Various stakeholders
– Domain experts
– Principal Investigators
– Developers
– Managers
• Visual
• Formal (little or no room for interpretation)
• Contracting/ Outsourcing
• Standardization (if well-modeled and comprehensive)
• Publishing
30. Benefits: System Generation
• System Specification
• Automatic Database Generation
• Model Driven Development (MDD)
• Reproducibility
• Cost reduction
• Multi-platform targeting
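A minimal sketch of automatic database generation, assuming the model is available as a plain Python dictionary. The entity and column names are hypothetical, and the generator targets SQLite only; a real model-driven toolchain would read a richer model and could target several platforms.

```python
import sqlite3

# Hypothetical model description: entity names, attributes, and types are
# illustrative and not taken from any published schema.
MODEL = {
    "site": {
        "site_id": "INTEGER PRIMARY KEY",
        "name": "TEXT NOT NULL",
    },
    "sample": {
        "sample_id": "INTEGER PRIMARY KEY",
        "site_id": "INTEGER NOT NULL REFERENCES site(site_id)",
        "collected_on": "TEXT",
    },
}


def generate_ddl(model):
    """Turn each entity of the model into a CREATE TABLE statement."""
    statements = []
    for table, columns in model.items():
        body = ",\n    ".join(
            f"{name} {sql_type}" for name, sql_type in columns.items()
        )
        statements.append(f"CREATE TABLE {table} (\n    {body}\n);")
    return statements


def build_database(model):
    """Create an in-memory SQLite database from the model."""
    conn = sqlite3.connect(":memory:")
    for statement in generate_ddl(model):
        conn.execute(statement)
    return conn


def table_names(conn):
    """List the tables that exist in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Because the database is derived from the model, regenerating it after a model change is repeatable, which is where the reproducibility and cost benefits come from.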
31. Benefits: Project Management
• Work Breakdown
• Cost/Effort Estimation
• Subcontracting/Outsourcing
• Monitoring
32. Benefits: Ontology
• OOMs can be transformed to Ontologies
• To provide geosciences data that is:
– Formal
– Machine-enforceable
– Domain-specific
– Semantically annotated
• Improves cross-project / cross-domain:
– Data integration
– Data Discovery
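A small sketch of how an object-oriented model might be transformed into an ontology, assuming classes map to owl:Class, inheritance maps to rdfs:subClassOf, and attributes map to owl:DatatypeProperty. The class model and the example.org namespace are placeholders. The output is N-Triples text built with plain string formatting; no RDF library is used.

```python
# Hypothetical class model; the names and the namespace are placeholders.
CLASSES = {
    "Rock": {"parent": None, "attributes": ["name"]},
    "IgneousRock": {"parent": "Rock", "attributes": ["silicaContent"]},
}

BASE = "http://example.org/geo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"


def to_triples(classes):
    """Map classes, inheritance, and attributes to (s, p, o) triples."""
    triples = []
    for name, spec in classes.items():
        triples.append((BASE + name, RDF_TYPE, OWL + "Class"))
        if spec["parent"]:
            triples.append(
                (BASE + name, RDFS + "subClassOf", BASE + spec["parent"])
            )
        for attr in spec["attributes"]:
            triples.append((BASE + attr, RDF_TYPE, OWL + "DatatypeProperty"))
            triples.append((BASE + attr, RDFS + "domain", BASE + name))
    return triples


def to_ntriples(triples):
    """Serialize triples whose terms are all IRIs as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

The resulting triples can be loaded into any RDF store, which is what makes cross-project integration and discovery possible.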
33. Benefits: Data Validation
• Model items as rules
• Domain-specific constraints can be incorporated
• Automatic Data Validation
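A minimal sketch of "model items as rules", assuming each field of a record is described by a small rule dictionary. The field names, limits, and domain values are hypothetical and chosen only to illustrate type, range, and domain checks.

```python
# Hypothetical rules for a geochemistry record; the field names and limits
# are illustrative assumptions, not taken from any published model.
RULES = {
    "sample_id": {"type": str, "required": True},
    "ph": {"type": float, "required": True, "min": 0.0, "max": 14.0},
    "rock_type": {
        "type": str,
        "required": False,
        "domain": {"igneous", "sedimentary", "metamorphic"},
    },
}


def validate(record, rules):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    for field_name, rule in rules.items():
        if field_name not in record:
            if rule.get("required"):
                errors.append(f"{field_name}: missing")
            continue
        value = record[field_name]
        if not isinstance(value, rule["type"]):
            errors.append(f"{field_name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{field_name}: below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{field_name}: above {rule['max']}")
        if "domain" in rule and value not in rule["domain"]:
            errors.append(f"{field_name}: not in domain")
    return errors
```

Since the rules are data rather than code, they could in principle be derived from the model itself, so validation stays in step with the model.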
34. Case Study: BExIS
• BExIS
– A Generic Data Management System
– Complex Conceptual Model
– Multiple Teams work on different parts
– Automatic database generation
– Conceptual Model <-> Ontology
36. Some resources used
• BGS Rock Classification Scheme,
see: https://www.bgs.ac.uk/bgsrcs/
• NADM Conceptual Model 1.0—A conceptual
model for geologic map information:
http://pubs.usgs.gov/of/2004/1334
• Semantic Web for Earth and Environmental
Terminology (SWEET): http://sweet.jpl.nasa.gov/
37. Related Work
• A conceptual model for data management in the field
of ecology, J. Chamanara, B. König-Ries, 2013
• An Extensible Conceptual Model for Tabular Scientific
Datasets, J. Chamanara, M. Owonibi, A. Algergawy, R.
Gerlach
• T. Kiani, 2010, Modeling for geospatial database:
Application to structural geology data. Dissertation,
Pierre and Marie Curie University, 295 p.
38. Online Resources
• The BExIS complete conceptual model:
http://fusion.cs.uni-jena.de/bppCM/index.htm
• A public talk on the BExIS conceptual model:
http://www.db-thueringen.de/servlets/DocumentServlet?id=27235
39. Feedback
Thank YOU
Sources of the examples/photos are in the slide notes
Editor's Notes
A photo of Earth
A photo of a data scientist and a huge amount of data, but without a hammer!
The scientist sees the world through data
A data model is a set of symbols and text used for communicating a precise representation of an information landscape.
Data entities are determined by the requirements.
Its purpose, scope, methods, and tools are set
http://www.bgs.ac.uk/services/dataModels/geochemistry.html
Basic, foundation of other works
The actual implementation can be specified at generation time. The model can be implemented on different platforms
A WBS, a handshake
An ontology is an explicit specification of a conceptualization
Ontology-driven conceptual modeling
Basin ontology
Reservoir ontology
Task ontology
World Oil and Gas Atlas ontology
SWEET
Unit of Measurement / Geographical Time Unit / Conversion