SlideShare a Scribd company logo
1 of 18
Download to read offline
MUSYOP: Towards a Query Optimization for
Heterogeneous Distributed Database System
in Energy Data Management
Zhan Liu, Fabian Cretton, Anne Le Calvé, Nicole Glassey,
Alexandre Cotting, Fabrice Chapuis
Institute of Business Information Systems
University of Applied Sciences and Arts Western Switzerland
HES-SO Valais, Sierre, Switzerland
OVERVIEW
• Who Are We ?
• Introduction
• Objectives
• Research methodologies
• Architecture of Heterogeneous Distributed Database System
• Implementation and Use Cases
• Conclusion and Future Research
WHO ARE WE ?
WHO ARE WE ?
• 72 staff members
– 20 Professors
– 14 Research scientists
– 29 Research assistants
– 9 Administrative collaborators
and interns
• 92 Scientific publications
• 291 Projects (2013)
• CHF 8,85 millions (Turnover
in 2013)
CORE COMPETENCIES
eHealtheEnergy
BPM
Cloud Computing
Data Intelligence
Ergonomy, Usability
Intelligent Agents
Internet of Things
Medical Information
Processing
Semantic Web
Software Engineering
eGov ERP eServices
INTRODUCTION
• Databases in the context of an energy database management system
(EDMS) are non-integrated, distributed and heterogeneous
• As a result, information integration is becoming increasing important
• BUT, It consumes “a great deal of time and money”
OBJECTIVES
• To propose a uniform approach by using Semantic Web
technologies for SPARQL querying a heterogeneous distributed
database system named MUSYOP for energy data management
• To provide transparent query access to multiple heterogeneous data
sources
• Query optimization to speed-up search executions
RESEARCH METHODOLOGIES
• (1) Warehouse approach (centralized system)
 High efficiency and the capability of extracting deeper information for decision
making
 Must set up “data cleansing” and “data standardization” before actual use
 Could take months of planning
• (2) Federated approach (decentralized system)
 High adaptability to frequent changes of data sources
 support of large numbers of data sources and data sources with high heterogeneity
• MUSYOP focused on the federated approach
ARCHITECTURE OF HETEROGENEOUS
DISTRIBUTED DATABASE SYSTEM
COMPONENTS OF ARCHITECTURE (1)
• Distributed Query Decomposer
– Generates a number of transaction to match remote data sources
– Assembles transaction results and returns to user
• Query Optimizer
– Data source optimization: data source selection and building sub-queries
– Join order optimization: determine the numbers of intermediate return results
• Distributed Transaction Coordinator
– Detects and handles persistent records and manages the communications with
databases
Distributed
Query
Decomposer
Query
Optimizer
Distributed
Transaction
Coordinator
3
Distributed
transactions
Optimal
sub-queries
4Query Parser 2
Parsed query
COMPONENTS OF ARCHITECTURE (2)
• D2RQ
– Query relational database using SPARQL
– Custom D2RQ mapping file for describing the relation between an ontology and an
relational data model
• SPARQL Endpoint
– Query the knowledge bases, such as TripleStore via SPARQL protocole services
• SPARQL2XQUERY
– Framework provides a generic method for SPARQL to XQuery translation
• ALLEGROGRAPH
– Query MongoDB databases using SPARQL
– Manual transform datasets from MongoDB to TripleStore
– SPARQLverse
D2RQ
SPARQL
ENDPOINT
SPARQL2XQUERY ALLEGROGRAPH
ARCHITECTURE IMPLEMENTATION
User’s notification
settings
Frequent data of
consumption
Daily, monthly and
yearly consumption data
and user alert
management data
Building information,
cities location information
and weather information
D2RQ
Local Schema
DBMS
Local Query
Convertor
Local Schema
DBMS
MYSQL
Distributed Database Server
SPARQL
ENDPOINT
Local Schema
DBMS
Local Query
Convertor
Local Schema
DBMS
MYSQL
SPARQL2XQUERY
Local Schema
DBMS
Local Query
Convertor
Local Schema
DBMS
MYSQL
ALLEGROGRAPH
Local Schema
DBMS
Relational
DB
Triple
Store XML NOSQL
Network
6 6 6 67 77 7
Local
query
Result
Local
query
Result
Local
query
Result
Local
query
Result
Network Network
Distributed Database ServerDistributed Database Server Distributed Database Server
Matching
Link
Aggregation
ENERGY USE CASE
SPARQL QUERY
PREFIX db: <http://localhost:2020/resource/>
PREFIX vocab: <http://localhost:2020/resource/vocab/>
prefix intbatXtra: <http://websemantique.ch/onto/intBatXtra#>
prefix musyopOnto: <http://www.websematique.ch/voc/musyop#>
prefix dbp-ont: <http://dbpedia.org/ontology/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?userID ?buildingID ?buildingLabel ?cpID ?cons1 ?cons2 ?consDiff
?city ?elevation
{
SERVICE <http://localhost:2020/sparql> {
?config_port a vocab:config_ports;
vocab:config_ports_affectation "Gaz";
vocab:config_ports_batiment_id ?buildingID ;
vocab:config_ports_config_port_id ?cpID.
?donneeMens2011 vocab:donnees_mensuelles_config_port ?config_port;
vocab:donnees_mensuelles_timestamp "2011-01-
01T00:00:00"^^xsd:dateTime;
vocab:donnees_mensuelles_cons ?cons1.
?donneeMens2012 vocab:donnees_mensuelles_config_port ?config_port;
vocab:donnees_mensuelles_timestamp "2012-01-
01T00:00:00"^^xsd:dateTime;
vocab:donnees_mensuelles_cons ?cons2.
?buildingUser rdf:type vocab:building_user;
vocab:building_user_building_id ?buildingID;
vocab:building_user_utilisateur_id ?userID;
}
?build a intbatXtra:Object;
musyopOnto:hasID ?buildingID ;
rdfs:label ?buildingLabel;
musyopOnto:locatedIn ?city.
?city dbp-ont:elevation ?elevation.
Filter(?elevation < ?paramElevationMax)
BIND((xsd:double(?cons2)/xsd:double(?cons1)) as ?consDiff).
Filter(?consDiff > ?paramConsDiffMax)
}
ORDER BY ?buildingID ?userID
limit 100
VALUES (?userID) { USERID_VALUES }
GRAPHICAL USER INTERFACES
GRAPHICAL USER INTERFACES
GRAPHICAL USER INTERFACES
CONCLUSION AND FUTURE RESEARCH
• Presented an architecture of heterogeneous distributed database
system for energy data management
• Integration of a mediator server to access and coordinate with four
different databases: relational database, NOSQL, Triple Store and
XML
• Proposed an approach for query optimization based on our
architecture
• Future work required on the evaluation our system with a large
amount of energy data in Switzerland

More Related Content

What's hot

Introduction of big data unit 1
Introduction of big data unit 1Introduction of big data unit 1
Introduction of big data unit 1RojaT4
 
Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...
Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...
Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...Globus
 
Big data technology unit 3
Big data technology unit 3Big data technology unit 3
Big data technology unit 3RojaT4
 
Fundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopFundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopArchana Gopinath
 
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerHow Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerAhsan Javed Awan
 
Implementation of Multi-node Clusters in Column Oriented Database using HDFS
Implementation of Multi-node Clusters in Column Oriented Database using HDFSImplementation of Multi-node Clusters in Column Oriented Database using HDFS
Implementation of Multi-node Clusters in Column Oriented Database using HDFSIJEACS
 
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...Ahsan Javed Awan
 
Big data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edgeBig data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edgeBhavya Gulati
 
Big data analysis using hadoop cluster
Big data analysis using hadoop clusterBig data analysis using hadoop cluster
Big data analysis using hadoop clusterFurqan Haider
 
Real time big data analytical architecture for remote sensing application
Real time big data analytical architecture for remote sensing applicationReal time big data analytical architecture for remote sensing application
Real time big data analytical architecture for remote sensing applicationLeMeniz Infotech
 
An Overview of VIEW
An Overview of VIEWAn Overview of VIEW
An Overview of VIEWShiyong Lu
 
Hadoop - A big data initiative
Hadoop - A big data initiativeHadoop - A big data initiative
Hadoop - A big data initiativeMansi Mehra
 
Big Data Analysis in Hydrogen Station using Spark and Azure ML
Big Data Analysis in Hydrogen Station using Spark and Azure MLBig Data Analysis in Hydrogen Station using Spark and Azure ML
Big Data Analysis in Hydrogen Station using Spark and Azure MLJongwook Woo
 
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science CentralCloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science CentralPaolo Missier
 
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...IJCERT JOURNAL
 
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKMACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKAbhi Jit
 
Big Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewBig Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewIRJET Journal
 
Big Data with SQL Server
Big Data with SQL ServerBig Data with SQL Server
Big Data with SQL ServerMark Kromer
 

What's hot (20)

Introduction of big data unit 1
Introduction of big data unit 1Introduction of big data unit 1
Introduction of big data unit 1
 
Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...
Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...
Modern Scientific Data Management Practices: The Atmospheric Radiation Measur...
 
Big data technology unit 3
Big data technology unit 3Big data technology unit 3
Big data technology unit 3
 
Fundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopFundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and Hadoop
 
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerHow Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
 
Implementation of Multi-node Clusters in Column Oriented Database using HDFS
Implementation of Multi-node Clusters in Column Oriented Database using HDFSImplementation of Multi-node Clusters in Column Oriented Database using HDFS
Implementation of Multi-node Clusters in Column Oriented Database using HDFS
 
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...
Performance Characterization of In-Memory Data Analytics on a Modern Cloud Se...
 
Hadoop
HadoopHadoop
Hadoop
 
Big data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edgeBig data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edge
 
Big data analysis using hadoop cluster
Big data analysis using hadoop clusterBig data analysis using hadoop cluster
Big data analysis using hadoop cluster
 
Real time big data analytical architecture for remote sensing application
Real time big data analytical architecture for remote sensing applicationReal time big data analytical architecture for remote sensing application
Real time big data analytical architecture for remote sensing application
 
An Overview of VIEW
An Overview of VIEWAn Overview of VIEW
An Overview of VIEW
 
Hadoop - A big data initiative
Hadoop - A big data initiativeHadoop - A big data initiative
Hadoop - A big data initiative
 
Big Data Analysis in Hydrogen Station using Spark and Azure ML
Big Data Analysis in Hydrogen Station using Spark and Azure MLBig Data Analysis in Hydrogen Station using Spark and Azure ML
Big Data Analysis in Hydrogen Station using Spark and Azure ML
 
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science CentralCloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
 
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
 
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKMACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
 
Big Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A ReviewBig Data Processing with Hadoop : A Review
Big Data Processing with Hadoop : A Review
 
Big Data with SQL Server
Big Data with SQL ServerBig Data with SQL Server
Big Data with SQL Server
 
CDSS
CDSSCDSS
CDSS
 

Similar to Query Optimization for Heterogeneous Energy Databases

A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...Ilkay Altintas, Ph.D.
 
Heterogeneous data transfer and loader
Heterogeneous data transfer and loaderHeterogeneous data transfer and loader
Heterogeneous data transfer and loadereSAT Publishing House
 
Heterogeneous data transfer and loader
Heterogeneous data transfer and loaderHeterogeneous data transfer and loader
Heterogeneous data transfer and loadereSAT Journals
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreHPCC Systems
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdataTom Rogers
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data Geoffrey Fox
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark Summit
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data ScientistsRichard Garris
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGijiert bestjournal
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 
Where Should You Deliver Database Services From?
Where Should You Deliver Database Services From?Where Should You Deliver Database Services From?
Where Should You Deliver Database Services From?EDB
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataAshnikbiz
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET Journal
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET Journal
 

Similar to Query Optimization for Heterogeneous Energy Databases (20)

A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o...
 
Heterogeneous data transfer and loader
Heterogeneous data transfer and loaderHeterogeneous data transfer and loader
Heterogeneous data transfer and loader
 
Heterogeneous data transfer and loader
Heterogeneous data transfer and loaderHeterogeneous data transfer and loader
Heterogeneous data transfer and loader
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
Query Optimization for Big Data Analytics
Query Optimization for Big Data AnalyticsQuery Optimization for Big Data Analytics
Query Optimization for Big Data Analytics
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdata
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
 
ICT L5+.pptx
ICT L5+.pptxICT L5+.pptx
ICT L5+.pptx
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with Spark
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data Scientists
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 
No sql database
No sql databaseNo sql database
No sql database
 
Where Should You Deliver Database Services From?
Where Should You Deliver Database Services From?Where Should You Deliver Database Services From?
Where Should You Deliver Database Services From?
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big Data
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
 

More from Institute of Information Systems (HES-SO)

Classification of noisy free-text prostate cancer pathology reports using nat...
Classification of noisy free-text prostate cancer pathology reports using nat...Classification of noisy free-text prostate cancer pathology reports using nat...
Classification of noisy free-text prostate cancer pathology reports using nat...Institute of Information Systems (HES-SO)
 
Machine learning assisted citation screening for Systematic Reviews - Anjani ...
Machine learning assisted citation screening for Systematic Reviews - Anjani ...Machine learning assisted citation screening for Systematic Reviews - Anjani ...
Machine learning assisted citation screening for Systematic Reviews - Anjani ...Institute of Information Systems (HES-SO)
 
Exploiting biomedical literature to mine out a large multimodal dataset of ra...
Exploiting biomedical literature to mine out a large multimodal dataset of ra...Exploiting biomedical literature to mine out a large multimodal dataset of ra...
Exploiting biomedical literature to mine out a large multimodal dataset of ra...Institute of Information Systems (HES-SO)
 
Studying Public Medical Images from Open Access Literature and Social Network...
Studying Public Medical Images from Open Access Literature and Social Network...Studying Public Medical Images from Open Access Literature and Social Network...
Studying Public Medical Images from Open Access Literature and Social Network...Institute of Information Systems (HES-SO)
 
Risques opérationnels et le système de contrôle interne : les limites d’un te...
Risques opérationnels et le système de contrôle interne : les limites d’un te...Risques opérationnels et le système de contrôle interne : les limites d’un te...
Risques opérationnels et le système de contrôle interne : les limites d’un te...Institute of Information Systems (HES-SO)
 
Le contrôle interne dans les administrations publiques tient-il toutes ses pr...
Le contrôle interne dans les administrations publiques tient-il toutes ses pr...Le contrôle interne dans les administrations publiques tient-il toutes ses pr...
Le contrôle interne dans les administrations publiques tient-il toutes ses pr...Institute of Information Systems (HES-SO)
 
Le système de contrôle interne : Présentation générale, enjeux et méthodes
Le système de contrôle interne : Présentation générale, enjeux et méthodesLe système de contrôle interne : Présentation générale, enjeux et méthodes
Le système de contrôle interne : Présentation générale, enjeux et méthodesInstitute of Information Systems (HES-SO)
 
A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...
A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...
A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...Institute of Information Systems (HES-SO)
 
NOSE: une approche Smart-City pour les zones périphériques et extra-urbaines
NOSE: une approche Smart-City pour les zones périphériques et extra-urbainesNOSE: une approche Smart-City pour les zones périphériques et extra-urbaines
NOSE: une approche Smart-City pour les zones périphériques et extra-urbainesInstitute of Information Systems (HES-SO)
 

More from Institute of Information Systems (HES-SO) (20)

MIE20232.pptx
MIE20232.pptxMIE20232.pptx
MIE20232.pptx
 
Classification of noisy free-text prostate cancer pathology reports using nat...
Classification of noisy free-text prostate cancer pathology reports using nat...Classification of noisy free-text prostate cancer pathology reports using nat...
Classification of noisy free-text prostate cancer pathology reports using nat...
 
Machine learning assisted citation screening for Systematic Reviews - Anjani ...
Machine learning assisted citation screening for Systematic Reviews - Anjani ...Machine learning assisted citation screening for Systematic Reviews - Anjani ...
Machine learning assisted citation screening for Systematic Reviews - Anjani ...
 
Exploiting biomedical literature to mine out a large multimodal dataset of ra...
Exploiting biomedical literature to mine out a large multimodal dataset of ra...Exploiting biomedical literature to mine out a large multimodal dataset of ra...
Exploiting biomedical literature to mine out a large multimodal dataset of ra...
 
L'IoT dans les usines. Quels avantages ?
L'IoT dans les usines. Quels avantages ?L'IoT dans les usines. Quels avantages ?
L'IoT dans les usines. Quels avantages ?
 
Studying Public Medical Images from Open Access Literature and Social Network...
Studying Public Medical Images from Open Access Literature and Social Network...Studying Public Medical Images from Open Access Literature and Social Network...
Studying Public Medical Images from Open Access Literature and Social Network...
 
Risques opérationnels et le système de contrôle interne : les limites d’un te...
Risques opérationnels et le système de contrôle interne : les limites d’un te...Risques opérationnels et le système de contrôle interne : les limites d’un te...
Risques opérationnels et le système de contrôle interne : les limites d’un te...
 
Le contrôle interne dans les administrations publiques tient-il toutes ses pr...
Le contrôle interne dans les administrations publiques tient-il toutes ses pr...Le contrôle interne dans les administrations publiques tient-il toutes ses pr...
Le contrôle interne dans les administrations publiques tient-il toutes ses pr...
 
Le système de contrôle interne : Présentation générale, enjeux et méthodes
Le système de contrôle interne : Présentation générale, enjeux et méthodesLe système de contrôle interne : Présentation générale, enjeux et méthodes
Le système de contrôle interne : Présentation générale, enjeux et méthodes
 
Crowdsourcing-based Mobile Application for Wheelchair Accessibility
Crowdsourcing-based Mobile Application for Wheelchair AccessibilityCrowdsourcing-based Mobile Application for Wheelchair Accessibility
Crowdsourcing-based Mobile Application for Wheelchair Accessibility
 
Quelle(s) valeur(s) pour le leadership stratégique ?
Quelle(s) valeur(s) pour le leadership stratégique ?Quelle(s) valeur(s) pour le leadership stratégique ?
Quelle(s) valeur(s) pour le leadership stratégique ?
 
A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...
A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...
A 3-D Riesz-Covariance Texture Model for the Prediction of Nodule Recurrence ...
 
Challenges in medical imaging and the VISCERAL model
Challenges in medical imaging and the VISCERAL modelChallenges in medical imaging and the VISCERAL model
Challenges in medical imaging and the VISCERAL model
 
NOSE: une approche Smart-City pour les zones périphériques et extra-urbaines
NOSE: une approche Smart-City pour les zones périphériques et extra-urbainesNOSE: une approche Smart-City pour les zones périphériques et extra-urbaines
NOSE: une approche Smart-City pour les zones périphériques et extra-urbaines
 
Medical image analysis and big data evaluation infrastructures
Medical image analysis and big data evaluation infrastructuresMedical image analysis and big data evaluation infrastructures
Medical image analysis and big data evaluation infrastructures
 
Medical image analysis, retrieval and evaluation infrastructures
Medical image analysis, retrieval and evaluation infrastructuresMedical image analysis, retrieval and evaluation infrastructures
Medical image analysis, retrieval and evaluation infrastructures
 
How to detect soft falls on devices
How to detect soft falls on devicesHow to detect soft falls on devices
How to detect soft falls on devices
 
FUNDAMENTALS OF TEXTURE PROCESSING FOR BIOMEDICAL IMAGE ANALYSIS
FUNDAMENTALS OF TEXTURE PROCESSING FOR BIOMEDICAL IMAGE ANALYSISFUNDAMENTALS OF TEXTURE PROCESSING FOR BIOMEDICAL IMAGE ANALYSIS
FUNDAMENTALS OF TEXTURE PROCESSING FOR BIOMEDICAL IMAGE ANALYSIS
 
MOBILE COLLECTION AND DISSEMINATION OF SENIORS’ SKILLS
MOBILE COLLECTION AND DISSEMINATION OF SENIORS’ SKILLSMOBILE COLLECTION AND DISSEMINATION OF SENIORS’ SKILLS
MOBILE COLLECTION AND DISSEMINATION OF SENIORS’ SKILLS
 
Enhanced Students Laboratory The GET project
Enhanced Students Laboratory The GET projectEnhanced Students Laboratory The GET project
Enhanced Students Laboratory The GET project
 

Recently uploaded

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Recently uploaded (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Query Optimization for Heterogeneous Energy Databases

  • 1. MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database System in Energy Data Management Zhan Liu, Fabian Cretton, Anne Le Calvé, Nicole Glassey, Alexandre Cotting, Fabrice Chapuis Institute of Business Information Systems University of Applied Sciences and Arts Western Switzerland HES-SO Valais, Sierre, Switzerland
  • 2. OVERVIEW • Who Are We ? • Introduction • Objectives • Research methodologies • Architecture of Heterogeneous Distributed Database System • Implementation and Use Cases • Conclusion and Future Research
  • 4. WHO ARE WE ? • 72 staff members – 20 Professors – 14 Research scientists – 29 Research assistants – 9 Administrative collaborators and interns • 92 Scientific publications • 291 Projects (2013) • CHF 8,85 millions (Turnover in 2013)
  • 5. CORE COMPETENCIES eHealtheEnergy BPM Cloud Computing Data Intelligence Ergonomy, Usability Intelligent Agents Internet of Things Medical Information Processing Semantic Web Software Engineering eGov ERP eServices
  • 6. INTRODUCTION • Databases in the context of an energy database management system (EDMS) are non-integrated, distributed and heterogeneous • As a result, information integration is becoming increasing important • BUT, It consumes “a great deal of time and money”
  • 7. OBJECTIVES • To propose a uniform approach by using Semantic Web technologies for SPARQL querying a heterogeneous distributed database system named MUSYOP for energy data management • To provide transparent query access to multiple heterogeneous data sources • Query optimization to speed-up search executions
  • 8. RESEARCH METHODOLOGIES • (1) Warehouse approach (centralized system)  High efficiency and the capability of extracting deeper information for decision making  Must set up “data cleansing” and “data standardization” before actual use  Could take months of planning • (2) Federated approach (decentralized system)  High adaptability to frequent changes of data sources  support of large numbers of data sources and data sources with high heterogeneity • MUSYOP focused on the federated approach
  • 10. COMPONENTS OF ARCHITECTURE (1) • Distributed Query Decomposer – Generates a number of transaction to match remote data sources – Assembles transaction results and returns to user • Query Optimizer – Data source optimization: data source selection and building sub-queries – Join order optimization: determine the numbers of intermediate return results • Distributed Transaction Coordinator – Detects and handles persistent records and manages the communications with databases Distributed Query Decomposer Query Optimizer Distributed Transaction Coordinator 3 Distributed transactions Optimal sub-queries 4Query Parser 2 Parsed query
  • 11. COMPONENTS OF ARCHITECTURE (2) • D2RQ – Query relational database using SPARQL – Custom D2RQ mapping file for describing the relation between an ontology and an relational data model • SPARQL Endpoint – Query the knowledge bases, such as TripleStore via SPARQL protocole services • SPARQL2XQUERY – Framework provides a generic method for SPARQL to XQuery translation • ALLEGROGRAPH – Query MongoDB databases using SPARQL – Manual transform datasets from MongoDB to TripleStore – SPARQLverse D2RQ SPARQL ENDPOINT SPARQL2XQUERY ALLEGROGRAPH
  • 12. ARCHITECTURE IMPLEMENTATION User’s notification settings Frequent data of consumption Daily, monthly and yearly consumption data and user alert management data Building information, cities location information and weather information D2RQ Local Schema DBMS Local Query Convertor Local Schema DBMS MYSQL Distributed Database Server SPARQL ENDPOINT Local Schema DBMS Local Query Convertor Local Schema DBMS MYSQL SPARQL2XQUERY Local Schema DBMS Local Query Convertor Local Schema DBMS MYSQL ALLEGROGRAPH Local Schema DBMS Relational DB Triple Store XML NOSQL Network 6 6 6 67 77 7 Local query Result Local query Result Local query Result Local query Result Network Network Distributed Database ServerDistributed Database Server Distributed Database Server Matching Link Aggregation
  • 14. SPARQL QUERY PREFIX db: <http://localhost:2020/resource/> PREFIX vocab: <http://localhost:2020/resource/vocab/> prefix intbatXtra: <http://websemantique.ch/onto/intBatXtra#> prefix musyopOnto: <http://www.websematique.ch/voc/musyop#> prefix dbp-ont: <http://dbpedia.org/ontology/> prefix xsd: <http://www.w3.org/2001/XMLSchema#> select ?userID ?buildingID ?buildingLabel ?cpID ?cons1 ?cons2 ?consDiff ?city ?elevation { SERVICE <http://localhost:2020/sparql> { ?config_port a vocab:config_ports; vocab:config_ports_affectation "Gaz"; vocab:config_ports_batiment_id ?buildingID ; vocab:config_ports_config_port_id ?cpID. ?donneeMens2011 vocab:donnees_mensuelles_config_port ?config_port; vocab:donnees_mensuelles_timestamp "2011-01- 01T00:00:00"^^xsd:dateTime; vocab:donnees_mensuelles_cons ?cons1. ?donneeMens2012 vocab:donnees_mensuelles_config_port ?config_port; vocab:donnees_mensuelles_timestamp "2012-01- 01T00:00:00"^^xsd:dateTime; vocab:donnees_mensuelles_cons ?cons2. ?buildingUser rdf:type vocab:building_user; vocab:building_user_building_id ?buildingID; vocab:building_user_utilisateur_id ?userID; } ?build a intbatXtra:Object; musyopOnto:hasID ?buildingID ; rdfs:label ?buildingLabel; musyopOnto:locatedIn ?city. ?city dbp-ont:elevation ?elevation. Filter(?elevation < ?paramElevationMax) BIND((xsd:double(?cons2)/xsd:double(?cons1)) as ?consDiff). Filter(?consDiff > ?paramConsDiffMax) } ORDER BY ?buildingID ?userID limit 100 VALUES (?userID) { USERID_VALUES }
  • 18. CONCLUSION AND FUTURE RESEARCH • Presented an architecture of heterogeneous distributed database system for energy data management • Integration of a mediator server to access and coordinate with four different databases: relational database, NOSQL, Triple Store and XML • Proposed an approach for query optimization based on our architecture • Future work required on the evaluation our system with a large amount of energy data in Switzerland