Inferring Versioned
Schemas from NoSQL
Databases and its
Applications
ER’15
Stockholm, October 2015
[{ "id": "90234af", "value": { "author": "Diego Sevilla Ruiz",
                               "e-mail": "dsevilla@um.es",
                               "institution": "U. of Murcia"}},
 { "id": "a243bb5", "value": { "author": "Severino Feliciano Morales",
                               "e-mail": "severino.feliciano@um.es",
                               "institution": "U. of Murcia"}},
 { "id": "096705d", "value": { "author": "Jesús García Molina",
                               "e-mail": "jmolina@um.es",
                               "institution": "U. of Murcia"}}]
Motivation
NoSQL Databases are Schemaless
Benefits
▶ No need to define a Schema in advance
▶ Non-uniform data
▶ Custom fields
▶ Non-uniform types
▶ Easier evolution
Drawbacks
▶ Harder to reason about
the DB
▶ Static checking is lost
▶ Some of the data logic is
in the application code
(more error prone)
▶ Some utilities need
Schema information to
work
Schemas for NoSQL Databases
▶ How to alleviate the problems of schemaless
databases? ⇒ Inferring a Schema
▶ The Schema Model contains information about
Entities and Relationships
▶ Take into account the different Entity Versions in
the Database
▶ Heterogeneity is usually due to slight variations in Entities
▶ We obtain a precise database model
▶ The Schema allows us to automate the construction
of tools:
▶ migration, refactoring, visualization, …
Related Work
▶ JSON Schema
▶ Object versions and relationships are not considered
▶ Apache Spark SQL/Drill: SQL-like schemas
▶ Union of all fields, nullable ⇒ incorrect combinations
▶ Over-generalization to String
▶ Aggregations and Reference relations not considered
▶ MongoDB-Schema
▶ Prototype to infer schemas from MongoDB
collections
▶ Same limitations as Spark SQL
▶ JSON Discoverer
▶ An MDE solution to infer domain models from REST
web services (i.e. JSON documents)
▶ Not database-oriented; Object versions not
considered
Spark SQL Example
{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}
{"name":"Peter", "age":"tiny"}
{"name":"Martina", "address":"home!"}
> people.printSchema
root
|-- address: string (nullable = true)
|-- age: string (nullable = true)
|-- name: string (nullable = true)
▶ age promoted to string
▶ age and address are never part of the same object
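The difference between a union-of-fields schema and a versioned one can be illustrated on the five objects above. The following is a minimal JavaScript sketch, not Spark code: the helper names `unionSchema` and `versionSchemas` are assumptions made for this illustration, and types are reported with JavaScript's `typeof` rather than Spark's type names.

```javascript
// The five sample objects from the Spark SQL example.
const people = [
  { name: "Michael" },
  { name: "Andy", age: 30 },
  { name: "Justin", age: 19 },
  { name: "Peter", age: "tiny" },
  { name: "Martina", address: "home!" },
];

// Union-of-fields view: every field ever seen; conflicting types are
// widened to "string" (imitates the over-generalization described above).
function unionSchema(objects) {
  const schema = new Map();
  for (const obj of objects) {
    for (const [key, value] of Object.entries(obj)) {
      const t = typeof value;
      const previous = schema.get(key);
      schema.set(key, previous !== undefined && previous !== t ? "string" : t);
    }
  }
  return Object.fromEntries(schema);
}

// Versioned view: one schema per distinct combination of fields and types.
function versionSchemas(objects) {
  const seen = new Map();
  for (const obj of objects) {
    const entries = Object.keys(obj).sort().map((k) => [k, typeof obj[k]]);
    const signature = JSON.stringify(entries);
    if (!seen.has(signature)) seen.set(signature, Object.fromEntries(entries));
  }
  return [...seen.values()];
}

console.log(unionSchema(people));    // { name: 'string', age: 'string', address: 'string' }
console.log(versionSchemas(people)); // four versions; none contains both age and address
```

The versioned view keeps the numeric `age` apart from the string `age`, and it never suggests that `age` and `address` occur together.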
{
  "rows": [
    {
      "content": { "chapters": 33, "pages": 527 },
      "authors": [
        {
          "company": { "country": "USA", "name": "IBM" },
          "name": "Grady Booch",
          "_id": "210"
        },
        {
          "company": { "country": "USA", "name": "IBM" },
          "name": "James Rumbaugh",
          "_id": "310"
        },
        {
          "country": "USA",
          "company": "Ivar Jacobson Consulting",
          "name": "Ivar Jacobson",
          "_id": "410"
        }
      ],
      "type": "book",
      "year": 2013,
      "publisher_id": "345679",
      "title": "The Unified Modeling Language",
      "_id": "1"
    },
    {
      "discipline": "software engineering",
      "issn": ["0098-5589", "1939-3520"],
      "name": "IEEE Trans. on Software Engineering",
      "type": "journal",
      "_id": "11"
    },
    {
      "name": "Automated Software Engineering",
      "issn": ["0928-8910", "1573-7535"],
      "discipline": "software engineering",
      "type": "journal",
      "_id": "12",
      "number": 10515
    },
    {
      "city": "Barcelona",
      "name": "Omega",
      "type": "publisher",
      "_id": "123451"
    },
    {
      "type": "publisher",
      "city": "Newton",
      "name": "O’Reilly Media",
      "_id": "928672"
    },
    {
      "type": "book",
      "author": {
        "_id": "101",
        "name": "Bradley Holt",
        "company": { "country": "USA", "name": "IBM Cloudant" }
      },
      "title": "Writing and Querying MapReduce Views in CouchDB",
      "publisher_id": "928672",
      "_id": "2"
    },
    {
      "name": "Addison-Wesley",
      "type": "publisher",
      "_id": "345679"
    },
    {
      "type": "publisher",
      "journals": ["11", "12"],
      "name": "IEEE Publications",
      "_id": "907863"
    }
  ]
}
NoSQL Database Model
▶ Objects (Entities) and Entity Versions
▶ Attributes
▶ Relationships
▶ Aggregation
▶ References
{
  "type": "publisher",
  "city": "Newton",
  "name": "O’Reilly Media",
  "_id": "928672"
},
{
  "type": "book",
  "author": {
    "_id": "101",
    "name": "Bradley Holt",
    "company": {
      "country": "USA",
      "name": "IBM Cloudant"
    }
  },
  "title": "Writing and Querying MapReduce Views in CouchDB",
  "publisher_id": "928672",
  "_id": "2"
},
Schema & Entity Versions Description
Entity Publisher {
Version 1 {
name: String
city: String
}
Version 2 {
name: String
}
Version 3 {
name: String
journal[+]: [Ref]->[Journal] (opposite=False)
}
}
Entity Journal {
Version 1 {
issn: Tuple [String, String]
name: String
discipline: String
}
Version 2 {
issn: Tuple [String, String]
name: String
discipline: String
number: int
}
}
Entity Book {
Version 1 {
title: String
year: int
publisher[1]: [Ref]->[Publisher] (opposite=False)
content[1]: [Aggregate]Content1
author[+]: [Aggregate]Author1
}
Version 2 {
title: String
publisher[1]: [Ref]->[Publisher] (opposite=False)
author[1]: [Aggregate]Author1
}
}
Entity Author {
Version 1 {
name: String
company[1]: [Aggregate]Company
}
Version 2 {
country: String
company: String
name: String
}
}
Entity Company {
Version 1 {
name: String
country: String
}
}
Entity Content {
Version 1 {
chapters: int
pages: int
}
}
[Figure, panels (a) and (b): entity diagram. Relationship labels: [1..1] company, [1..1] publisher, [1..1] content, [1..*] authors, [1..*] journals]
Solution Design Considerations
▶ We have to process all the objects in the Database
⇒ Map-Reduce
▶ Natural data processing on NoSQL databases
▶ Leverage MDE technologies
▶ Reuse EMF/Ecore tooling to show entity diagrams
▶ Automation & Code Generation by Metamodeling &
Model Transformations
Proposed MDE Architecture
[Figure: architecture pipeline. NoSQL Database → MapReduce → Object Versions (JSON) → JSON Injection → JSON Model (instance of the JSON Metamodel) → Schema Reverse Eng → Schema Model (instance of the Schema Metamodel) → Application Generation → Applications (Schema Viewer / Data Validator / Migration Assistant)]
Reverse Engineering Process (i)
▶ Map-Reduce process
▶ Map: obtains the Raw Schema for each object
▶ Reduce: selects an archetype for each Entity Version
▶ Entity Type
▶ Root objects ⇒ “type” field or collection name
▶ Aggregated objects ⇒ key of the pair (e.g. “author”)
JSON object:
  {name:"Omega", city:"Barcelona"}
Raw Schema:
  {name:String, city:String}

JSON object:
  {title:"Writing and...",
   publisher_id:"928672",
   author:{name:"Bradley Holt",
           company:{country:"USA",
                    name:"IBM Cloudant"} } }
Raw Schema:
  {title:String,
   publisher_id:String,
   author:{name:String,
           company:{country:String,
                    name:String} } }
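The map and reduce steps can be sketched in plain JavaScript, outside any real MapReduce engine. This is an illustrative approximation rather than the authors' implementation: the function names `rawSchema` and `inferVersions` are assumptions, as are the type names other than `String` and `int`, the choice to drop `type` and `_id` from the raw schema, and the use of the first object seen as the archetype. Arrays are compared element by element, so arrays of different lengths yield different versions in this sketch.

```javascript
// Map step: replace every primitive value by the name of its type,
// keeping the nesting of objects and arrays. Keys are sorted so that
// the result does not depend on the field order in the document.
function rawSchema(value) {
  if (Array.isArray(value)) return value.map(rawSchema);
  if (value !== null && typeof value === "object") {
    const schema = {};
    for (const key of Object.keys(value).sort()) {
      schema[key] = rawSchema(value[key]);
    }
    return schema;
  }
  if (typeof value === "string") return "String";
  if (typeof value === "number") return Number.isInteger(value) ? "int" : "float";
  if (typeof value === "boolean") return "boolean";
  return "null"; // assumption: anything else is treated as null
}

// Reduce step: group raw schemas by entity type and keep one archetype
// object per distinct raw schema, i.e. per entity version.
function inferVersions(objects) {
  const entities = new Map(); // entity name -> Map(signature -> version)
  for (const obj of objects) {
    const { type, _id, ...fields } = obj;
    const schema = rawSchema(fields);
    const signature = JSON.stringify(schema);
    if (!entities.has(type)) entities.set(type, new Map());
    const versions = entities.get(type);
    if (!versions.has(signature)) {
      versions.set(signature, { schema, archetype: obj });
    }
  }
  return entities;
}

// The four publisher objects from the example database.
const publishers = [
  { city: "Barcelona", name: "Omega", type: "publisher", _id: "123451" },
  { type: "publisher", city: "Newton", name: "O'Reilly Media", _id: "928672" },
  { name: "Addison-Wesley", type: "publisher", _id: "345679" },
  { type: "publisher", journals: ["11", "12"], name: "IEEE Publications", _id: "907863" },
];

const publisherVersions = inferVersions(publishers).get("publisher");
for (const { schema } of publisherVersions.values()) {
  console.log(JSON.stringify(schema));
}
```

On this input the sketch finds three publisher versions (name and city; name only; name and journals), which matches the three Publisher versions in the schema description.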
Reverse Engineering Process (ii)
▶ Attributes: primitive or tuple
▶ Aggregated Entities
▶ Value of the pair is an Object (or array of objects)
▶ Entity type inferred from the key
▶ References
▶ Heuristics/Conventions
▶ Key: <entity_name>_id
▶ Value: MongoDB’s DBRef abstraction:
{"$ref": "<entity_name>", "$id": <id_value>}
▶ Honor cardinalities (arrays)
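The classification rules listed above can be sketched as a single function over one key/value pair. This is a hedged illustration: `classifyPair`, `isObject`, and the shape of the returned record are assumptions, and only the two reference conventions named on the slide are covered (a key of the form `<entity_name>_id` and a DBRef-shaped value). Other conventions, such as an array of ids under a plural key, would need extra rules.

```javascript
function isObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Classify one key/value pair of a JSON object as a reference,
// an aggregated entity, or a plain attribute (primitive or tuple).
function classifyPair(key, value) {
  const many = Array.isArray(value);
  const first = many ? value[0] : value;
  const cardinality = many ? "+" : "1"; // arrays become [+], single values [1]

  // Convention 1: key named <entity_name>_id
  const match = /^(.+)_id$/.exec(key);
  if (match) {
    return { kind: "reference", target: match[1], cardinality };
  }
  // Convention 2: value shaped like a MongoDB DBRef
  if (isObject(first) && "$ref" in first && "$id" in first) {
    return { kind: "reference", target: first.$ref, cardinality };
  }
  // Objects (or arrays of objects) are aggregated entities named after the key
  if (isObject(first)) {
    return { kind: "aggregate", target: key, cardinality };
  }
  // Everything else is a primitive or tuple attribute
  return { kind: "attribute", cardinality };
}

console.log(classifyPair("publisher_id", "928672"));
console.log(classifyPair("authors", [{ name: "Grady Booch" }]));
console.log(classifyPair("year", 2013));
```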
Example NoSQL Applications
▶ From the DBSchema model, using Model
Transformations and Model-to-Text transformations
(Code Generation), we can:
▶ Generate models that Characterize each Entity
Version
▶ That characterization can be used to Visualize the
Database
▶ And also to generate code to Validate objects
entering the Database
▶ Generate models that allow Database Migration to
the desired Entity Versions
Type Discrimination/Characterization
Metamodel
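// Returns true only if obj has exactly the shape of Book, version 2.
function isOfExactTypeBook_2(obj) {
  // The object must declare its entity type.
  if (!("type" in obj)) {
    return false;
  }
  if (obj["type"] !== "book") {
    return false;
  }
  // Fields required by this version.
  if (!("title" in obj)) {
    return false;
  }
  if (!("author" in obj)) {
    return false;
  }
  // Fields that would indicate a different version must be absent.
  if ("publisher" in obj) {
    return false;
  }
  if ("content" in obj) {
    return false;
  }
  if ("year" in obj) {
    return false;
  }
  return true;
}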
Generated using a Model-to-Text transformation from an instance of the
previous Type Discrimination Metamodel
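A Model-to-Text step of this kind can be approximated with plain string generation. In the sketch below, the `book2` record is a hypothetical, much simplified stand-in for an instance of the Type Discrimination Metamodel, and `generateCheck` is an assumed helper name; the actual transformation language and metamodel structure are not shown in the slides. The type value `"book"` follows the sample documents.

```javascript
// Hypothetical, simplified stand-in for a Type Discrimination model instance.
const book2 = {
  functionName: "isOfExactTypeBook_2",
  typeValue: "book",
  required: ["title", "author"],
  forbidden: ["publisher", "content", "year"],
};

// Model-to-text step: emit the JavaScript source of the check function.
function generateCheck(model) {
  const lines = [`function ${model.functionName}(obj) {`];
  lines.push(`  if (obj["type"] !== ${JSON.stringify(model.typeValue)}) return false;`);
  for (const key of model.required) {
    lines.push(`  if (!(${JSON.stringify(key)} in obj)) return false;`);
  }
  for (const key of model.forbidden) {
    lines.push(`  if (${JSON.stringify(key)} in obj) return false;`);
  }
  lines.push("  return true;", "}");
  return lines.join("\n");
}

const generatedSource = generateCheck(book2);
console.log(generatedSource);

// Compile the generated text so it can be tried out directly.
const generatedCheck = new Function(
  `${generatedSource}\nreturn ${book2.functionName};`
)();

console.log(generatedCheck({
  type: "book",
  title: "Writing and Querying MapReduce Views in CouchDB",
  author: { name: "Bradley Holt" },
  publisher_id: "928672",
})); // true
```

Generating text rather than building a closure mirrors the idea that the validator is an artifact produced from the model and can be shipped separately from the tool that inferred the schema.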
Entity Versions
Alternate: D3.js Treemap
Type Transformation Metamodel
db.<collection>.update(
  <query>,
  <update>,
  {
    multi: true
  }
)
Obtained by Entity Type Characterization
Generate the correct MongoDB update statement using $set, $push, etc.,
maybe via user assistance through a DSL.
For example, for Journal_1 to Journal_2:
$set: { "number": 1 }
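One way to derive such a statement is to compare the attribute lists of the source and target versions. The following sketch only builds the query and update documents and does not talk to a database. `buildMigration` and its parameters are assumptions for illustration; default values for new fields must be supplied by the caller (for instance through the DSL mentioned above). The `$exists`, `$set` and `$unset` operators are standard MongoDB operators.

```javascript
// Build a { query, update } pair that moves documents of one entity
// version to another. `defaults` supplies values for newly added fields.
function buildMigration(typeValue, fromFields, toFields, defaults) {
  const added = toFields.filter((f) => !fromFields.includes(f));
  const removed = fromFields.filter((f) => !toFields.includes(f));

  // Query: characterize the source version (fields present and absent).
  const query = { type: typeValue };
  for (const f of fromFields) query[f] = { $exists: true };
  for (const f of added) query[f] = { $exists: false };

  // Update: set the new fields, unset the ones the target version lacks.
  const update = {};
  if (added.length > 0) {
    update.$set = Object.fromEntries(added.map((f) => [f, defaults[f]]));
  }
  if (removed.length > 0) {
    update.$unset = Object.fromEntries(removed.map((f) => [f, ""]));
  }
  return { query, update };
}

// Journal_1 -> Journal_2, as in the example above.
const journalMigration = buildMigration(
  "journal",
  ["issn", "name", "discipline"],
  ["issn", "name", "discipline", "number"],
  { number: 1 }
);

console.log(JSON.stringify(journalMigration.update)); // {"$set":{"number":1}}
console.log(JSON.stringify(journalMigration.query));
```

The resulting pair fills the `<query>` and `<update>` slots of the `update` call shown above, with `multi: true` so that every matching document is migrated.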
Conclusions & Future work
▶ A process for obtaining Conceptual Model Schemas
for NoSQL Databases is shown
▶ The process takes into account the different Entity
Versions present in the Database
▶ An MDE process allows us to automate the
production of several applications from the Schemas
▶ Example applications that allow Database
Visualization and Migration are shown
Conclusions & Future work (ii)
▶ Future work includes:
▶ Building a NoSQL Database Tool Set (NoSQL Data
Engineering)
▶ DSL for Entity Version migration
▶ Refining the Schema to allow a richer Type System
▶ Allow value ranges or enumerated sets
▶ Infer attribute dependencies (derived attributes,
i.e. the value of an attribute dictates the value of
another attribute)
▶ etc.

 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 

Inferring Versioned Schemas from NoSQL Databases and its Applications

  • 1. Inferring Versioned Schemas from NoSQL Databases and its Applications ER’15 Stockholm, October 2015 [{ "id": "90234 af", "value": { "author": "Diego Sevilla Ruiz", "e-mail": "dsevilla@um.es", "institution": "U. of Murcia"}}, { "id": "a243bb5", "value": { "author": "Severino Feliciano Morales", "e-mail": "severino.feliciano@um.es", "institution": "U. of Murcia"}}, { "id": "096705d", "value": { "author": "Jesús García Molina", "e-mail": "jmolina@um.es", "institution": "U. of Murcia"}}]
  • 2. Motivation NoSQL Databases are Schemaless Benefits ▶ No need to define a Schema beforehand ▶ Non-uniform data ▶ Custom fields ▶ Non-uniform types ▶ Easier evolution Drawbacks ▶ Harder to reason about the DB ▶ Static checking is lost ▶ Some of the data logic is in the application code (more error-prone) ▶ Some utilities need Schema information to work
  • 3. Schemas for NoSQL Databases ▶ How to alleviate the problems of schemaless databases? ⇒ Inferring a Schema ▶ The Schema Model contains information about Entities and Relationships ▶ Take into account the different Entity Versions in the Database ▶ Heterogeneity is usually due to slight variations in Entities ▶ We obtain a precise database model ▶ The Schema allows us to automate the construction of tools: ▶ migration, refactoring, visualization, …
  • 4. Related Work ▶ JSON Schema ▶ Object versions and relationships are not considered ▶ Apache Spark SQL/Drill: SQL-like schemas ▶ Union of all fields, nullable ⇒ incorrect combinations ▶ Over-generalization to String ▶ Aggregations and Reference relations not considered ▶ MongoDB-Schema ▶ Prototype to infer schemas from MongoDB collections ▶ Same limitations as Spark SQL ▶ JSON Discoverer ▶ An MDE solution to infer domain models from REST web services (i.e. JSON documents) ▶ Not database-oriented; Object versions not considered
  • 5. Spark SQL Example {"name":"Michael"} {"name":"Andy", "age":30} {"name":"Justin", "age":19} {"name":"Peter", "age":"tiny"} {"name":"Martina", "address":"home!"} > people.printSchema root |-- address: string (nullable = true) |-- age: string (nullable = true) |-- name: string (nullable = true) ▶ age promoted to string ▶ age and address are never part of the same object
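The two problems noted on this slide can be reproduced with a small sketch. The helper below is hypothetical and is not Spark's actual inference algorithm; it only assumes a naive "union of all fields, widen conflicts to string" strategy, applied to the same five records.

```javascript
// Hypothetical helper (not a Spark API): JSON-level type name of a value.
function jsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "string", "number", "boolean", "object"
}

// Naive union-of-fields inference: one flat schema for the whole collection.
function unionSchema(objects) {
  const schema = {};
  for (const obj of objects) {
    for (const [key, value] of Object.entries(obj)) {
      const t = jsonType(value);
      if (!(key in schema)) {
        schema[key] = t;
      } else if (schema[key] !== t) {
        schema[key] = "string"; // conflicting types are widened to string
      }
    }
  }
  return schema;
}

const people = [
  { name: "Michael" },
  { name: "Andy", age: 30 },
  { name: "Justin", age: 19 },
  { name: "Peter", age: "tiny" },
  { name: "Martina", address: "home!" },
];

// The flat schema lists name, age and address together, all as strings.
const flat = unionSchema(people);

// The field combinations that really occur in the data.
const fieldSets = new Set(people.map((p) => Object.keys(p).sort().join(",")));
```

Here `flat` reports `age` as a string and offers `age` and `address` side by side, while `fieldSets` shows that no record actually carries both fields. That gap is what entity versions are meant to capture.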
  • 6. { "rows":[ { "content":{ "chapters":33, "pages":527 }, "authors":[ { "company":{ "country":"USA", "name":"IBM" }, "name":"Grady Booch", "_id":"210" }, { "company":{ "country":"USA", "name":"IBM" }, "name":"James Rumbaugh", "_id":"310" }, { "country":"USA", "company":"Ivar Jacobson Consulting", "name":"Ivar Jacobson", "_id":"410" }], "type":"book", "year":2013, "publisher_id":"345679", "title":"The Unified Modeling Language", "_id":"1" }, { "discipline":"software engineering", "issn":[ "0098-5589", "1939-3520" ], "name":"IEEE Trans. on Software Engineering", "type":"journal", "_id":"11" }, { "name":"Automated Software Engineering", "issn":[ "0928-8910", "1573-7535" ], "discipline":"software engineering", "type":"journal", "_id":"12", "number":10515 }, { "city":"Barcelona", "name":"Omega", "type":"publisher", "_id":"123451" }, { "type":"publisher", "city":"Newton", "name":"O’Reilly Media", "_id":"928672" }, { "type":"book", "author":{ "_id":"101", "name":"Bradley Holt", "company":{ "country":"USA", "name":"IBM Cloudant" } }, "title":"Writing and Querying MapReduce Views in CouchDB", "publisher_id":"928672", "_id":"2" }, { "name":"Addison-Wesley", "type":"publisher", "_id":"345679" }, { "type":"publisher", "journals":[ "11", "12" ], "name":"IEEE Publications", "_id":"907863" }]}
  • 7. NoSQL Database Model ▶ Objects (Entities) and Entity Versions ▶ Attributes ▶ Relationships ▶ Aggregation ▶ References { "type":"publisher", "city":"Newton", "name":"O’Reilly Media", "_id":"928672" }, { "type":"book", "author":{ "_id":"101", "name":"Bradley Holt", "company":{ "country":"USA", "name":"IBM Cloudant" } }, "title":"Writing and Querying MapReduce Views in CouchDB", "publisher_id":"928672", "_id":"2" },
  • 8. Schema & Entity Versions Description Entity Publisher { Version 1 { name: String city: String } Version 2 { name: String } Version 3 { name: String journal[+]: [Ref]->[Journal] (opposite=False) } } Entity Journal { Version 1 { issn: Tuple [String, String] name: String discipline: String } Version 2 { issn: Tuple [String, String] name: String discipline: String number: int } } Entity Book { Version 1 { title: String year: int publisher[1]: [Ref]->[Publisher] (opposite=False) content[1]: [Aggregate]Content1 author[+]: [Aggregate]Author1 } Version 2 { title: String publisher[1]: [Ref]->[Publisher] (opposite=False) author[1]: [Aggregate]Author1 } } Entity Author { Version 1 { name: String company[1]: [Aggregate]Company } Version 2 { country: String company: String name: String } } Entity Company { Version 1 { name: String country: String } } Entity Content { Version 1 { chapters: int pages: int } } [Figure, panels (a) and (b); diagram edge labels: company [1..1], publisher [1..1], content [1..1], authors [1..*], journals [1..*]]
  • 9. Solution Design Considerations ▶ We have to process all the objects in the Database ⇒ Map-Reduce ▶ Natural data processing on NoSQL databases ▶ Leverage MDE technologies ▶ Reuse EMF/Ecore tooling to show entity diagrams ▶ Automation & Code Generation by Metamodeling & Model Transformations
  • 11. Reverse Engineering Process (i) ▶ Map-Reduce process ▶ Map: obtains the Raw Schema for each object ▶ Reduce: selects an archetype for each Entity Version ▶ Entity Type ▶ Root objects ⇒ "type" field or collection name ▶ Aggregated objects ⇒ key of the pair (e.g. "author") JSON object ⇒ Raw Schema: {name:"Omega", city:"Barcelona"} ⇒ {name:String, city:String}; {title:"Writing and...", publisher_id:"928672", author:{name:"Bradley Holt", company:{country:"USA", name:"IBM Cloudant"} } } ⇒ {title:String, publisher_id:String, author:{name:String, company:{country:String, name:String} } }
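The Map and Reduce steps on this slide can be sketched in plain JavaScript, outside any database engine. The function names `rawSchema` and `inferVersions` are illustrative only, and the sketch makes several assumptions not stated in the slides: the entity type comes from the "type" field, "type" and "_id" are left out of the raw schema, keys are sorted so field order does not matter, and arrays of different lengths count as different shapes.

```javascript
// Map step (sketch): replace every primitive value with its type name.
function rawSchema(value) {
  if (Array.isArray(value)) return value.map(rawSchema);
  if (value !== null && typeof value === "object") {
    const out = {};
    // Sorted keys make the result independent of field order.
    for (const key of Object.keys(value).sort()) out[key] = rawSchema(value[key]);
    return out;
  }
  if (typeof value === "string") return "String";
  if (typeof value === "number") return Number.isInteger(value) ? "int" : "float";
  if (typeof value === "boolean") return "Boolean";
  return "null";
}

// Reduce step (sketch): keep one archetype document per distinct raw schema
// within each entity type; each distinct raw schema is one entity version.
function inferVersions(docs) {
  const versions = new Map(); // entity name -> Map(raw schema JSON -> archetype)
  for (const doc of docs) {
    const { type, _id, ...fields } = doc; // assumption: ignore type and _id
    const shape = JSON.stringify(rawSchema(fields));
    if (!versions.has(type)) versions.set(type, new Map());
    const byShape = versions.get(type);
    if (!byShape.has(shape)) byShape.set(shape, doc);
  }
  return versions;
}

// The four publisher documents from the example database.
const publishers = [
  { city: "Barcelona", name: "Omega", type: "publisher", _id: "123451" },
  { type: "publisher", city: "Newton", name: "O'Reilly Media", _id: "928672" },
  { name: "Addison-Wesley", type: "publisher", _id: "345679" },
  { type: "publisher", journals: ["11", "12"], name: "IEEE Publications", _id: "907863" },
];

const publisherVersions = inferVersions(publishers).get("publisher");
```

With this input, `publisherVersions` holds three entries, matching the three Publisher versions in the schema description: name and city, name only, and name with journals.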
  • 12. Reverse Engineering Process (ii) ▶ Attributes: primitive or tuple ▶ Aggregated Entities ▶ Value of the pair is an Object (or array of objects) ▶ Entity type inferred from the key ▶ References ▶ Heuristics/Conventions ▶ Key: <entity_name>_id ▶ Value: MongoDB’s DBRef abstraction: {"$ref": "<entity_name>", "$id": <id_value>} ▶ Honor cardinalities (arrays)
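The rules on this slide can be read as a small classifier over key/value pairs. The sketch below is one possible reading, not the authors' implementation: `classifyField` and its result shape are hypothetical, and only the two reference conventions listed above are checked (a key ending in `_id`, or a DBRef-shaped value).

```javascript
// Hypothetical classifier for one key/value pair of a JSON object.
function classifyField(key, value) {
  const many = Array.isArray(value);       // arrays carry the cardinality
  const first = many ? value[0] : value;   // inspect the first element only

  if (first !== null && typeof first === "object") {
    // DBRef convention: {"$ref": "<entity_name>", "$id": <id_value>}
    if ("$ref" in first && "$id" in first) {
      return { kind: "reference", target: first.$ref, many };
    }
    // Any other object (or array of objects) is an aggregated entity,
    // with the entity type inferred from the key.
    return { kind: "aggregate", target: key, many };
  }

  // Key convention: <entity_name>_id
  const match = /^(.+)_id$/.exec(key);
  if (match) return { kind: "reference", target: match[1], many };

  // Everything else is treated as a primitive or tuple attribute.
  return { kind: "attribute" };
}

const byKey = classifyField("publisher_id", "345679");
const byDbRef = classifyField("publisher", { $ref: "publisher", $id: "345679" });
const aggregated = classifyField("authors", [{ name: "Grady Booch" }]);
const plain = classifyField("year", 2013);
```

A field such as `journals: ["11", "12"]` matches neither convention, so this sketch would label it an attribute; recognising it as a reference would need an additional heuristic.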
  • 14. Example NoSQL Applications ▶ From the DBSchema model, using Model Transformations and Model-to-Text transformations (Code Generation), we can: ▶ Generate models that Characterize each Entity Version ▶ That characterization can be used to Visualize the Database ▶ And also to generate code to Validate objects entering the Database ▶ Generate models that allow Database Migration to the desired Entity Versions
  • 16. function isOfExactTypeBook_2(obj) { if (! ("type" in obj)) { return false; } if (obj["type"] !== "Book") { return false; } if (! ("title" in obj)) { return false; } if (! ("author" in obj)) { return false; } if ("publisher" in obj) { return false; } if ("content" in obj) { return false; } if ("year" in obj) { return false; } return true; } Generated using a Model-to-Text transformation from an instance of the previous Type Discrimination Metamodel
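A Model-to-Text step of this kind can be approximated with a string template. The sketch below is a guess at the idea, not the transformation used by the authors: the `model` object shape (`entity`, `version`, `required`, `forbidden`) is hypothetical and stands in for an instance of the Type Discrimination Metamodel.

```javascript
// Hypothetical generator: emits the source text of a validator function
// from a plain-object description of one entity version.
function generateValidator(model) {
  const lines = [`function isOfExactType${model.entity}_${model.version}(obj) {`];
  lines.push(`  if (!("type" in obj)) { return false; }`);
  lines.push(`  if (obj["type"] !== ${JSON.stringify(model.entity)}) { return false; }`);
  for (const field of model.required) {
    lines.push(`  if (!(${JSON.stringify(field)} in obj)) { return false; }`);
  }
  for (const field of model.forbidden) {
    lines.push(`  if (${JSON.stringify(field)} in obj) { return false; }`);
  }
  lines.push("  return true;", "}");
  return lines.join("\n");
}

// Description mirroring the checks in the generated function shown above.
const book2 = {
  entity: "Book",
  version: 2,
  required: ["title", "author"],
  forbidden: ["publisher", "content", "year"],
};

const source = generateValidator(book2);

// For a quick check only: turn the generated text into a callable function.
const isBook2 = new Function(`${source}\nreturn isOfExactTypeBook_2;`)();
```

Calling `isBook2` on an object with "type", "title" and "author" returns true, and adding a "year" field makes it return false, which is the discrimination behaviour the slide describes.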
  • 21. Type Transformation Metamodel db.<collection>.update(<query>, <update>, { multi: true }) Obtained by Entity Type Characterization Generate the correct update MongoDB statement using $set, $push, etc., maybe via user assistance through a DSL. For example, for Journal_1 to Journal_2: $set: { "number": 1 }
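As a rough illustration, the query and update documents for such a statement could be derived from the field lists of the two versions. The helper `migrationStatement` is hypothetical, the default values stand in for the user assistance mentioned on the slide, and only the `$set`, `$unset` and `$exists` operators are used; the collection name in the final comment is an assumption.

```javascript
// Hypothetical helper: builds the <query> and <update> arguments for
// migrating documents from one entity version to another.
function migrationStatement(entityType, fromFields, toFields, defaults) {
  const query = { type: entityType };
  const set = {};
  const unset = {};

  for (const field of fromFields) {
    query[field] = { $exists: true }; // the source version has this field
    if (!toFields.includes(field)) unset[field] = ""; // dropped in the target
  }
  for (const field of toFields) {
    if (fromFields.includes(field)) continue;
    query[field] = { $exists: false }; // the source version lacks this field
    if (!(field in defaults)) throw new Error(`no default value for "${field}"`);
    set[field] = defaults[field];
  }

  const update = {};
  if (Object.keys(set).length > 0) update.$set = set;
  if (Object.keys(unset).length > 0) update.$unset = unset;
  return { query, update };
}

// Journal_1 to Journal_2, using the field lists from the schema description.
const journal1 = ["issn", "name", "discipline"];
const journal2 = ["issn", "name", "discipline", "number"];
const { query, update } = migrationStatement("journal", journal1, journal2, { number: 1 });

// In the mongo shell this would be used as (collection name assumed):
// db.journals.update(query, update, { multi: true })
```

For this pair of versions, `update` is the `$set` of "number" to 1 given on the slide, and `query` selects only documents that still have the Journal_1 shape.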
  • 22. Conclusions & Future work ▶ A process for obtaining Conceptual Model Schemas for NoSQL Databases is shown ▶ The process takes into account the different Entity Versions present in the Database ▶ An MDE process allows us to automate the production of several applications from the Schemas ▶ Example applications that allow Database Visualization and Migration are shown
  • 23. Conclusions & Future work (ii) ▶ Future work includes: ▶ Building a NoSQL Database Tool Set (NoSQL Data Engineering) ▶ DSL for Entity Version migration ▶ Refining the Schema to allow a richer Type System ▶ Allow value ranges or enumerated sets ▶ Infer attribute dependencies (derived attributes, i.e. the value of an attribute dictates the value of another attribute) ▶ etc.