Daniele Bailo
METADATA & BROKERING
a modern approach
EPISODE #2
Previously on…
Metadata &
Brokering#1
Main concepts
- Digital Data
- Metadata
- Brokering system
- The triad <PID, MD, DO>
- Database
- APIs (web services)
Side concepts
- Ontologies / Semantics
- PID
- Digital Object
- Standard
- Interoperability
- Open Access
[Diagram: data sets exposed through APIs]
Discovery (DC, CKAN, eGMS)
Contextual (CERIF metadata model)
Detailed (community specific)
Features
1. APIs
2. <PID, metadata, DO>
3. Contextualization
metadata
4. Support ontologies
Data from Irpinia
<PID, metadata, DO>
request response
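As a rough illustration of the triad <PID, metadata, DO> and the request/response exchange, the sketch below models one triad as a small Python record that is looked up by PID. The class names, field names and the sample identifier are hypothetical and do not come from any real system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DigitalRecord:
    """One <PID, metadata, DO> triad (field names are illustrative)."""
    pid: str                   # persistent identifier
    metadata: Dict[str, str]   # descriptive metadata
    digital_object: bytes      # the data itself


class TriadStore:
    """In-memory stand-in for a system that answers requests by PID."""

    def __init__(self) -> None:
        self._records: Dict[str, DigitalRecord] = {}

    def add(self, record: DigitalRecord) -> None:
        self._records[record.pid] = record

    def request(self, pid: str) -> Optional[DigitalRecord]:
        # Returns the full triad, or None when the PID is unknown.
        return self._records.get(pid)


store = TriadStore()
store.add(DigitalRecord(
    pid="example:irpinia-0001",  # made-up identifier
    metadata={"title": "Data from Irpinia", "format": "text"},
    digital_object=b"sample payload",
))
response = store.request("example:irpinia-0001")
```

The response carries all three parts together, so a client that knows only the PID gets both the description and the object.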
THE PERFECT
SYSTEM
#6 Metadata driven
canonical Brokering
with contextualization
& PID
NEW & OLD CHARACTERS
Metadata
Purposes
1. Discovery (humans &
machines)
2. Contextualization:
what is the context of
the data
3. Use it for processing
or other advanced
tasks
Usually attached to D.O.
Interoperability
What & Why
Enables two systems to
1. Exchange information
2. Understand information
Usually achieved
through:
- Agreed language
- Software “translators”
(interfaces, thin layers)
...are you speaking
Arabic???
Ontologies
Why an ontology?
It is the way machines
manage “meaning”
How does it work?
1. Connects concepts
2. Needs vocabulary
Issues
• Many ontologies exist
• Vocabulary Mapping
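[Diagram: example ontology graph with nodes Michelini, CNT, INGV, Gresta, Sailing, Trieste, Italy, Boat and Sea, linked by relations such as “is director of”, “is section of”, “is president of”, “has hobby”, “is born”, “located in” and “use”]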
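One way to picture "connects concepts" and "vocabulary mapping" is a list of subject/predicate/object triples plus a small translation table between vocabularies. This is only a sketch: the node names come from the example graph, but the exact pairing of nodes and relations is a guess, and the second vocabulary ("directs", "part_of") is invented for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Edges are a guess at how the example graph connected its nodes.
TRIPLES: List[Triple] = [
    ("Michelini", "is_director_of", "CNT"),
    ("CNT", "is_section_of", "INGV"),
    ("Michelini", "has_hobby", "Sailing"),
    ("Trieste", "located_in", "Italy"),
]

# Vocabulary mapping: another ontology names the same relations differently.
# The foreign terms on the left are invented for this example.
VOCABULARY_MAP: Dict[str, str] = {
    "directs": "is_director_of",
    "part_of": "is_section_of",
}


def normalize(predicate: str) -> str:
    """Translate a foreign predicate into the local vocabulary."""
    return VOCABULARY_MAP.get(predicate, predicate)


def objects_of(subject: str, predicate: str) -> Set[str]:
    """Return every object linked to `subject` by `predicate`."""
    wanted = normalize(predicate)
    return {o for s, p, o in TRIPLES if s == subject and p == wanted}


directed = objects_of("Michelini", "directs")
```

Without the mapping table the query for "directs" would find nothing, which is the vocabulary-mapping issue in miniature.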
Metadata
Catalogue#1
Purposes
Store metadata:
e.g. 1. producer
2. date of creation
3. data format
Misleading
Example (why?)
Metadata
Catalogue#2
How to implement it?
Single table (bad habit)
One table with all data
Multi table (good habit)
- Data is stored in
multiple tables (one per
concept)
- Tables are linked
- Can contextualize data
Metadata catalogue =
relational database *
Single table
Multi table
Multi table and
contextualization
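The multi-table habit can be sketched with Python's built-in sqlite3 module: one table per concept, linked by a key, so that a query can put a data set in context. Table names, column names and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE producer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE dataset (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        created     TEXT NOT NULL,      -- ISO 8601 date
        data_format TEXT NOT NULL,
        producer_id INTEGER NOT NULL REFERENCES producer(id)
    );
""")

# Sample rows are invented for illustration.
conn.execute("INSERT INTO producer (id, name) VALUES (1, 'INGV')")
conn.executemany(
    "INSERT INTO dataset (title, created, data_format, producer_id) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Station list", "2015-01-10", "text", 1),
        ("Waveforms", "2015-02-03", "miniSEED", 1),
    ],
)

# The link between the two tables is what contextualizes each data set.
rows = conn.execute("""
    SELECT d.title, d.data_format, p.name
    FROM dataset AS d
    JOIN producer AS p ON p.id = d.producer_id
    ORDER BY d.title
""").fetchall()
```

In the single-table habit the producer name would be repeated on every row; here it is stored once and joined in when needed.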
Catalogue Interface
Human interface (GUI)
Website or portal
Machine interface
- API or Web service
- which executes scripts
or queries
- Returns metadata in a
given standard
What is it?
It does something for the
user
(deliver value to
customer)*
A “thin layer”
We usually don’t know
what’s under the hood
Examples
- FDSN stations
(web) service
[Diagram: FDSN stations service on top of a database (MD catalogue); FDSN Dataselect service on top of a waveform repository]
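To show how thin the layer is from the caller's side, the sketch below only builds query URLs in the style of the FDSN web service specification (fdsnws-station and fdsnws-dataselect). The base URL is a placeholder and the network and station codes are example values; the function names are hypothetical. No request is actually sent.

```python
from urllib.parse import urlencode


def station_query_url(base_url: str, network: str, station: str,
                      level: str = "station") -> str:
    """Build a station-metadata query (answered from the MD catalogue)."""
    params = {
        "network": network,
        "station": station,
        "level": level,
        "format": "text",
    }
    return f"{base_url}/fdsnws/station/1/query?{urlencode(params)}"


def dataselect_query_url(base_url: str, network: str, station: str,
                         starttime: str, endtime: str) -> str:
    """Build a waveform query (answered from the waveform repository)."""
    params = {
        "network": network,
        "station": station,
        "starttime": starttime,
        "endtime": endtime,
    }
    return f"{base_url}/fdsnws/dataselect/1/query?{urlencode(params)}"


BASE = "http://example.org"  # placeholder host, not a real data centre
station_url = station_query_url(BASE, "IV", "ACER")
waveform_url = dataselect_query_url(
    BASE, "IV", "ACER", "2015-01-01T00:00:00", "2015-01-01T00:10:00"
)
```

The caller sees only a URL and a response format; whether a relational database or a file archive sits underneath stays hidden.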
CKAN
CKAN GUI
METADATA
catalogue
CKAN APIs
EIDA stations ISIDE stations
Metadata
replication
What is it?
- Metadata Catalogue
- With interfaces
(GUI+API)
- No direct
CKAN <-> sources
connection
Examples
- Works with FDSN stations
- Doesn’t work with
FDSN dataselect
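A minimal sketch of the machine interface, assuming a CKAN site reachable at a placeholder URL: it builds a `package_search` call for the CKAN Action API and reads dataset titles out of a response body. The sample response is hand-written in the shape the Action API returns; the helper names are hypothetical and nothing is fetched over the network.

```python
import json
from typing import List
from urllib.parse import urlencode


def package_search_url(site_url: str, query: str, rows: int = 10) -> str:
    """Build a CKAN Action API search URL for replicated metadata."""
    params = urlencode({"q": query, "rows": rows})
    return f"{site_url}/api/3/action/package_search?{params}"


def titles_from_response(body: str) -> List[str]:
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [package["title"] for package in payload["result"]["results"]]


# Hand-written sample response; the dataset entry is invented.
sample_body = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "eida-stations", "title": "EIDA stations"}],
    },
})

search_url = package_search_url("http://example.org", "stations", rows=5)
titles = titles_from_response(sample_body)
```

The search runs against the replicated metadata only, which is why station listings fit this model while on-demand waveform extraction does not.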
[Diagram: plugins]
Brokering System
(e.g. VERCE framework)
BROKER GUI
METADATA
catalogue
BROKER APIs
EIDA stations
ISIDE
stations
Metadata
replication
What is it?
- Metadata Catalogue
- With interfaces
(GUI+API)
- System manager
- Other modules
- BROKER <-> sources
interactive connection
Examples
- EIDA stations
- EIDA dataselect
- Processing Job at
System
manager
Interactive
access to
service
EIDA
dataselect
Processing
facility
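The difference from a plain catalogue can be sketched as follows: the broker keeps a metadata catalogue for discovery, and also holds live connections to the sources so that data is fetched on request. This is a toy model, not the VERCE implementation; every class, method and identifier here is hypothetical, and the source is a fake function standing in for a real service call.

```python
from typing import Callable, Dict, List

# A source adapter takes a dataset id and returns the data (assumed signature).
Source = Callable[[str], bytes]


class Broker:
    """Toy broker: a metadata catalogue plus live connections to sources."""

    def __init__(self) -> None:
        self._catalogue: Dict[str, Dict[str, str]] = {}
        self._sources: Dict[str, Source] = {}

    def register_source(self, name: str, fetch: Source) -> None:
        self._sources[name] = fetch

    def add_metadata(self, dataset_id: str, source: str,
                     description: str) -> None:
        self._catalogue[dataset_id] = {
            "source": source,
            "description": description,
        }

    def discover(self, keyword: str) -> List[str]:
        """Search the catalogue only; no source is contacted."""
        key = keyword.lower()
        return sorted(
            dataset_id
            for dataset_id, md in self._catalogue.items()
            if key in md["description"].lower()
        )

    def access(self, dataset_id: str) -> bytes:
        """Follow the catalogue entry to its source and fetch the data."""
        entry = self._catalogue[dataset_id]
        return self._sources[entry["source"]](dataset_id)


def fake_dataselect(dataset_id: str) -> bytes:
    # Stand-in for a real waveform service call.
    return f"waveform for {dataset_id}".encode()


broker = Broker()
broker.register_source("dataselect", fake_dataselect)
broker.add_metadata("ds-1", "dataselect", "Waveforms from one station")
found = broker.discover("waveforms")
payload = broker.access(found[0])
```

Discovery and access are separate calls on purpose: the first touches only metadata, the second goes through to the source.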
? ? ?
Comments
&
Questions
Why was the example
misleading?
A global view
Data initiatives
RDA
- “regulate” data
sharing/use
EUDAT
- Common data
infrastructure
EGI
- Organize National Grid
Infrastructures (CINECA)
EPOS
- ESFRI integrating Solid
RDA
Do for data what has been
done for the internet
(TCP/IP)
RDA concepts
Data Fabric
What?
Identifies mechanisms,
standards, components and
interfaces making data
science efficient and
cost-effective
Data Management Plan
• Data management
• Data analysis
• Data preservation
• Data publication
• Data sharing
[UK data Archive http://www.data-archive.ac.uk/]
RDA concepts
Data Fabric
[RDA WG outputs https://indico.cern.ch/event/370271/session/2/contribution/6/material/0/0.pdf]
How to store?
How to register?
How to discover?
How to cite?
How to document processing?
How to integrate?
How to collect new DP?
How to access data?
How to discover
data?
Metadata system
WE ALREADY KNOW
EVERYTHING ABOUT IT
METADATA
catalogue
standards?
How to preserve data?
Registry system
What?
An agreed/legacy catalog
of:
- data formats (schemas)
- metadata formats
- Vocabularies & semantic
categories
- Data types
- Trusted repositories
- ….
Registry
Aha.. so in practice
it's a
database..
…exactly…
How to register/cite data or
publications?
PID system
Purpose
- DO / publication can be
uniquely referenced
- Assign a PID at data
creation time
Issues
- Need for a simple
mechanism to implement
it
- Now EUDAT can help
- Peter & Massimo
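The two purposes above (unique reference, PID assigned at creation time) can be sketched with a tiny in-memory registry. This is an assumption-laden toy: real PID systems are external services, and the class, the "example" prefix and the URLs below are all made up.

```python
import uuid
from typing import Dict


class PidRegistry:
    """Toy PID registry: mints identifiers and resolves them to locations."""

    def __init__(self, prefix: str = "example") -> None:
        self._prefix = prefix
        self._locations: Dict[str, str] = {}

    def mint(self, location: str) -> str:
        """Assign a new PID at data creation time."""
        pid = f"{self._prefix}/{uuid.uuid4()}"
        self._locations[pid] = location
        return pid

    def relocate(self, pid: str, new_location: str) -> None:
        """Move the object; the PID itself never changes."""
        if pid not in self._locations:
            raise KeyError(pid)
        self._locations[pid] = new_location

    def resolve(self, pid: str) -> str:
        return self._locations[pid]


registry = PidRegistry()
pid = registry.mint("http://example.org/data/v1/file.dat")
registry.relocate(pid, "http://example.org/archive/file.dat")
current_location = registry.resolve(pid)
```

A citation that uses the PID keeps working after the move, which is the point of separating the identifier from the location.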
How to access data?
AAI system
(federated &
distributed)
Purpose
- Authenticate users
- Authorize users
Issues
- Delegation
- Many systems,
sometimes not
interoperable
How to store data?
Data repository
(trusted)
What?
- Store data
- Couple with PIDs
- Ensure preservation (not
curation)
- Can be trusted (DSA)
Opportunity
- INGV DSA repository…
How to document data
processing?
Workflow engines
Purpose
- Tracks data
transformation
- Allows versioning
- Allows reproducibility
Comments
- Interoperability among
various workflow engines
- VERCE did it
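Tracking data transformation can be pictured as logging a fingerprint of the input and output of every step. The sketch below is not a real workflow engine and is unrelated to how VERCE does it; the class name, step names and sample data are hypothetical, and SHA-256 digests are simply one convenient fingerprint.

```python
import hashlib
from typing import Callable, Dict, List


def digest(data: bytes) -> str:
    """Fingerprint of a piece of data."""
    return hashlib.sha256(data).hexdigest()


class TrackedWorkflow:
    """Runs steps one by one and records what went in and what came out."""

    def __init__(self) -> None:
        self.log: List[Dict[str, str]] = []

    def run(self, step: str, func: Callable[[bytes], bytes],
            data: bytes) -> bytes:
        result = func(data)
        self.log.append({
            "step": step,
            "input": digest(data),
            "output": digest(result),
        })
        return result


def strip_whitespace(data: bytes) -> bytes:
    return data.strip()


def to_upper(data: bytes) -> bytes:
    return data.upper()


workflow = TrackedWorkflow()
cleaned = workflow.run("strip", strip_whitespace, b"  raw signal  ")
final = workflow.run("upper", to_upper, cleaned)
```

Because each step's output digest matches the next step's input digest, the log forms a chain; re-running the same steps on the same input should produce an identical log, which is a simple check of reproducibility.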
Brokering System
(e.g. VERCE framework)
BROKER GUI
METADATA
catalogue
BROKER APIs
Full version includes
- Metadata Catalogue
- interfaces (GUI+API)
- System manager
- AAI system
- Workflow engine
External actors
- PID System
- Trusted repositories
- Registries
- Processing facilities
[Diagram: broker with system manager, AAI system and workflow engine, connected to data sets through APIs and to external trusted repositories, a registry, a PID system and an HPC center]
Q&A

 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

Metadata & brokering - a modern approach #2

  • 1. Daniele Bailo. METADATA & BROKERING: a modern approach. EPISODE #2
  • 2. Previously on… Metadata & Brokering#1 Main concepts - Digital Data - Metadata - Brokering system - The triad <PID, MD, DO> - Database - APIs (web services) Side concepts - Ontologies / Semantics - PID - Digital Object - Standard - Interoperability - Open Access
  • 3. [Diagram: data sets exposed through APIs; a request for "Data from Irpinia" gets <PID, metadata, DO> as response] Discovery (DC) and (CKAN, eGMS); Contextual (CERIF metadata model); Detailed (community specific). Features: 1. APIs 2. <PID, metadata, DO> 3. Contextualization metadata 4. Support ontologies. THE PERFECT SYSTEM #6: Metadata driven canonical Brokering with contextualization & PID
  • 4. NEW & OLD CHARACTERS
  • 5. Metadata Purposes 1. Discovery (humans & machines) 2. Contextualization: which is the context of the data 3. Use it for processing or other advanced tasks Usually attached to D.O.
  • 6. Interoperability What & Why Enables two systems to 1. Exchange information 2. Understand information Usually achieved through: - Agreed language - Software “translators” (interfaces, thin layers) ...are you speaking Arabic???
  • 7. Ontologies Why an ontology? It is the way machines manage “meaning” How does it work? 1. Connects concepts 2. Needs vocabulary Issues • Many ontologies exist • Vocabulary Mapping Michelini CNT Is Director of INGV Is section of Gresta Is president of Sailing Has hobby Trieste Is Born Italy Located in Boat use sea use
  • 8. Metadata Catalogue#1 Purposes Store metadata: e.g. 1. producer 2. date of creation 3. data format Misleading Example (why?)
  • 9. Metadata Catalogue#2 How to implement it? Single table (bad habit): one table with all data. Multi table (good habit): - Data is stored in multiple tables (one per concept) - Tables are linked - Can contextualize data. Metadata catalogue = relational database * [Figure labels: Single table, Multi table]
  • 10. Metadata Catalogue#2 How to implement it? Single table (bad habit): one table with all data. Multi table (good habit): - Data is stored in unique tables (one per concept) - Tables are linked - Can contextualize data. Metadata catalogue = relational database * [Figure labels: Single table, Multi table and contextualization]
  • 11. Catalogue Interface Human interface (GUI): website or portal. Machine interface: - API or Web service - which executes scripts or queries - Returns metadata in a given standard
  • 12. What is it? It does something for the user (deliver value to customer)* A “thin layer” We usually don’t know what’s under the hood Examples - FDSN stations (web) service FDSN stations FDSN Dataselect Database (MD catalogue) Waveform repository
  • 13. CKAN [Diagram: CKAN GUI, METADATA catalogue, CKAN APIs, plugins; EIDA stations and ISIDE stations feed the catalogue through metadata replication] What is it? - Metadata Catalogue - With interfaces (GUI+API) - No direct CKAN <-> sources connection Examples - Works with FDSN stations - Doesn’t work with FDSN dataselect
  • 14. Brokering System (e.g. VERCE framework) BROKER GUI METADATA catalogue BROKER APIs EIDA stations ISIDE stations Metadata replication What is it? - Metadata Catalogue - With interfaces (GUI+API) - System manager - Other modules - BROKER <-> sources interactive connection Examples - EIDA stations - EIDA dataselect - Processing Job at System manager Interactive access to service EIDA dataselect Processing facility ? ? ?
  • 16. A global view Data initiatives RDA -”regulate” data sharing/use EUDAT - Common data infrastructure EGI - Organize National Grid Infrastructures (CINECA) EPOS - ESFRI integrating Solid
  • 17. RDA Do for data what has been done for the internet (TCP/IP)
  • 18. RDA concepts Data Fabric What? Identifies mechanisms, standards, components and interfaces making data science efficient and cost effective Data Management Plan • Data management • Data analysis • Data preservation • Data publication • Data sharing [UK data Archive http://www.data-archive.ac.uk/]
  • 19. RDA concepts Data Fabric [RDA WG outputs https://indico.cern.ch/event/370271/session/2/contribution/6/material/0/0.pdf] How to store? How to register? How to discover? How to cite? How to document processing? How to integrate? How to collect new DP? How to access?
  • 20. How to discover data? Metadata system WE ALREADY KNOW EVERYTHING ABOUT IT METADATA catalogue
  • 21. How to preserve data? Registry system What? An agreed/legacy catalog of: - data formats (schemas) - metadata formats - Vocabularies & semantic categories - Data types - Trusted repositories - …. Registry "Aha.. so in practice it's a database.." "…exactly…"
  • 22. How to register/cite data or publications? PID system Purpose - DO / publication can be uniquely referenced - Assign a PID at data creation time Issues - Need for a simple mechanism to implement it - Now EUDAT can help - Peter & Massimo
  • 23. How to access data? AAI system (federated & distributed) Purpose - Authenticate users - Authorize users Issues - Delegation - Many systems, sometimes non-interoperable
  • 24. How to store data? Data repository (trusted) What? - Store data - Couple with PIDs - Ensure preservation (not curation) - Can be trusted (DSA) Opportunity - INGV DSA repository…
  • 25. How to document data processing? Workflow engines Purpose - Tracks data transformation - Allows versioning - Allows reproducibility Comments - Interoperability among various workflow engines - VERCE did it
  • 26. Brokering System (e.g. VERCE framework) Full version includes - Metadata Catalogue - interfaces (GUI+API) - System manager - AAI system - Workflow engine External actors - PID System - Trusted repositories - Registries - Processing facilities [Diagram: BROKER GUI, METADATA catalogue, BROKER APIs, System manager, AAI system and Workflow Engine, connected to data sets behind APIs, trusted repositories, Registry, PID system and HPC center]
  • 27. Q&A
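
The multi-table catalogue of slides 9 and 10 can be sketched as a small relational database. This is only an illustration, assuming SQLite and a made-up schema: the table names, column names, the PID string, the producer name, the date and the data format below are all hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical schema: one table per concept, linked by foreign keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organisation (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE person (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    organisation_id INTEGER REFERENCES organisation(id)
);
CREATE TABLE dataset (
    pid         TEXT PRIMARY KEY,   -- persistent identifier
    title       TEXT NOT NULL,
    created     TEXT,               -- date of creation
    data_format TEXT,
    producer_id INTEGER REFERENCES person(id)
);
""")

# Illustrative rows only; every value here is invented for the example.
conn.execute("INSERT INTO organisation VALUES (1, 'INGV')")
conn.execute("INSERT INTO person VALUES (1, 'Example Producer', 1)")
conn.execute(
    "INSERT INTO dataset VALUES (?, ?, ?, ?, ?)",
    ("pid:example/0001", "Data from Irpinia", "2015-01-01", "miniSEED", 1),
)


def datasets_by_organisation(connection, org_name):
    """Contextualization: follow dataset -> producer -> organisation links."""
    return connection.execute(
        """
        SELECT d.pid, d.title, p.name
        FROM dataset d
        JOIN person p       ON p.id = d.producer_id
        JOIN organisation o ON o.id = p.organisation_id
        WHERE o.name = ?
        ORDER BY d.pid
        """,
        (org_name,),
    ).fetchall()


print(datasets_by_organisation(conn, "INGV"))
```

With a single flat table the organisation would be repeated as free text in every row. Here it is stored once, and the join is what lets the catalogue answer contextual questions such as "which data sets were produced by people at this organisation".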

Editor's Notes

  1. DIGITAL DATA: sequence of (digital) symbols; with a meaning; can be stored; can be transmitted; can be computed. METADATA: data about data; what is metadata to me can be data to others; many standards; ontologies. BROKERING SYSTEM: intermediary software; accesses several systems on your behalf; collects data for you (integration). DATABASE: collection of (organized) data; usually has a DBMS. APIs: Application Programming Interface; standard procedures or instructions to access a service (or function).
  2. Identity card example
  3. Identity card example
  4. Identity card example
  5. Identity card example
  6. Identity card example
  7. Identity card example
  8. Identity card example
  9. Identity card example
  10. Identity card example
  11. Identity card example
  12. Data management: enterprise to build a data repository, manage an information catalog, & enforce management policy. Data analysis: enterprise to process a data collection, apply analysis tools, and automate a processing pipeline. Data preservation: enterprise to build reference collections and knowledge bases that comprise the intellectual capital, while managing technology evolution. Data publication: discovery and access of data collections. Data sharing: controlled sharing of a data collection, shared analysis workflows, and information catalogs.
  13. Data management: enterprise to build a data repository, manage an information catalog, & enforce management policy. Data analysis: enterprise to process a data collection, apply analysis tools, and automate a processing pipeline. Data preservation: enterprise to build reference collections and knowledge bases that comprise the intellectual capital, while managing technology evolution. Data publication: discovery and access of data collections. Data sharing: controlled sharing of a data collection, shared analysis workflows, and information catalogs.
  14. Identity card example
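
Note 1 describes a brokering system as intermediary software that accesses several systems on the user's behalf and collects data for them. A minimal sketch of that idea follows, assuming each source can be wrapped as a search function. The `make_source` helper, the `Broker` class, the record fields and the demo PIDs are hypothetical, and the two sources are in-memory stand-ins named after the slide examples, not real service clients.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Source = Callable[[str], List[Record]]


def make_source(name: str, records: List[Record]) -> Source:
    """Build an in-memory stand-in for a remote metadata service."""
    def search(keyword: str) -> List[Record]:
        # Tag each hit with the source it came from.
        return [
            dict(record, source=name)
            for record in records
            if keyword.lower() in record["title"].lower()
        ]
    return search


class Broker:
    """Queries every source and integrates the results, deduplicated by PID."""

    def __init__(self, sources: List[Source]) -> None:
        self.sources = sources

    def search(self, keyword: str) -> List[Record]:
        merged: Dict[str, Record] = {}
        for source in self.sources:
            for record in source(keyword):
                # The first source to report a PID wins.
                merged.setdefault(record["pid"], record)
        return sorted(merged.values(), key=lambda record: record["pid"])


# Invented demo records; the PIDs and titles are placeholders.
eida = make_source("EIDA stations", [
    {"pid": "pid:demo/1", "title": "Irpinia station A"},
])
iside = make_source("ISIDE stations", [
    {"pid": "pid:demo/1", "title": "Irpinia station A"},
    {"pid": "pid:demo/2", "title": "Irpinia station B"},
    {"pid": "pid:demo/3", "title": "Unrelated station"},
])

broker = Broker([eida, iside])
for hit in broker.search("irpinia"):
    print(hit["pid"], hit["title"], hit["source"])
```

Deduplicating on the PID is one reason the <PID, metadata, DO> triad matters for a broker: without a shared identifier, the same object reported by two sources could not be recognised as one.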