Data Linkage
Alasdair J G Gray
A.J.G.Gray@hw.ac.uk
alasdairjggray.co.uk
@gray_alasdair
Estuarine Flooding
 Financial implications
 Damage
 Loss of business
 Personal factors
 Emotional impact
 Flood prediction
 Locations
 Severity
 Requires correlating
 Sea-state data
 Weather forecasts
 Details of sea defences
 Response Planning
 Evacuation routes
 Personnel deployment
 …
 Requires more data
 Traffic reports
 Shipping
 …
8 April 2015 SICSA Env. & Social Databases 2
Image: http://www.metro.co.uk/
Flood Prediction
Solent Use Case
 Busy shipping channel
 Two major ports
 Complex tidal and wave patterns
Flood defences data (database)
Flood Detection
“Detect overtopping events in the Solent region”
sea-level > sea-defence
• Sea-level: sensors
• Defence heights: databases
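The overtopping condition above (sea-level > sea-defence) amounts to joining a stream of sensor readings with a table of defence heights. A minimal sketch in Python, assuming one defence height per site; the site names, heights and field names are invented for illustration and are not taken from the Solent deployment.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class SeaLevelReading:
    site: str        # hypothetical site identifier
    timestamp: str   # ISO 8601 time of the observation
    level_m: float   # observed sea level in metres


def detect_overtopping(
    readings: Iterable[SeaLevelReading],
    defence_height_m: Dict[str, float],
) -> Iterator[SeaLevelReading]:
    """Yield readings whose sea level exceeds the defence height at that site."""
    for reading in readings:
        height = defence_height_m.get(reading.site)
        # Sites with no known defence height cannot be evaluated, so skip them.
        if height is not None and reading.level_m > height:
            yield reading


# Illustrative values only (assumptions, not real measurements).
DEFENCE_HEIGHT_M = {"site-a": 3.2, "site-b": 2.8}
READINGS = [
    SeaLevelReading("site-a", "2015-04-08T10:00:00Z", 3.0),
    SeaLevelReading("site-b", "2015-04-08T10:00:00Z", 2.9),
    SeaLevelReading("site-c", "2015-04-08T10:00:00Z", 5.0),
]

overtopped = list(detect_overtopping(READINGS, DEFENCE_HEIGHT_M))
```

Here the sensor side is an iterable, so the same function would work over a live feed, while the defence heights stand in for a database lookup.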
Real-time sensor data: wave, wind, tide
Meteorological forecasts
Response Planning
“Provide contextual information”
• Web feeds
• Other sources: maps, models
• Real-time merging of datasets
Other sources: maps, models, …
Data Linkage and Querying
Web of Data
1. Global ID – URI
2. Resolvable ID
3. Useful content
 HTML for humans
 RDF for machines
4. Link to other resources
Like the Web, but for data!
Linked Data Approach
“RDF and OWL do not solve the interoperability problem, they just lay it bare on the table!”
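The four Linked Data principles on this slide can be illustrated with a toy in-memory "web" in which every URI resolves to a list of triples and some triples link to other URIs. This is only a sketch: the example.org URIs, the vocabulary terms and the `resolve` and `crawl` helpers are hypothetical, and a real deployment would serve RDF over HTTP rather than from a dictionary.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject URI, predicate URI, object URI or literal)

# A toy "web": each resolvable URI maps to the RDF-like triples served for it.
# All example.org URIs and predicates are hypothetical.
WEB: Dict[str, List[Triple]] = {
    "http://example.org/sensor/42": [
        ("http://example.org/sensor/42",
         "http://example.org/vocab/observes", "sea level"),
        ("http://example.org/sensor/42",
         "http://example.org/vocab/locatedAt", "http://example.org/place/solent"),
    ],
    "http://example.org/place/solent": [
        ("http://example.org/place/solent",
         "http://example.org/vocab/label", "Solent"),
    ],
}


def resolve(uri: str) -> List[Triple]:
    """Principles 2 and 3: looking up an ID returns useful content about it."""
    return WEB.get(uri, [])


def crawl(start: str) -> Set[str]:
    """Principle 4: follow links from one resource to discover others."""
    seen: Set[str] = set()
    frontier = [start]
    while frontier:
        uri = frontier.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for _subject, _predicate, obj in resolve(uri):
            # Only objects that are themselves resolvable URIs count as links.
            if obj in WEB:
                frontier.append(obj)
    return seen


reachable = crawl("http://example.org/sensor/42")
```

Starting from the sensor URI, the crawl also reaches the place it links to, which is the sense in which the data behaves "like the Web".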
Olympics 2012
Linking Data
Querying Approach
Use ontologies as common model
Requires:
 Representation of data: sensors and databases
 Establishing mappings between ontology models and data source schemas
 Accessing data sources through queries over ontology model
 Expressing continuous queries over sensors
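Two of the requirements in the querying approach, mapping source schemas onto a common model and running continuous queries over sensor streams, can be sketched in a few lines. This is an illustration under assumptions: the source names, field names and ontology terms are invented, and the sliding-window mean merely stands in for whatever continuous query language a real system would use.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator

# Mapping from each source's schema to the common (ontology) terms.
# Source names, field names and terms are hypothetical.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "tide_gauge": {"stn": "site", "h": "seaLevel"},
    "wave_buoy": {"location": "site", "height_m": "seaLevel"},
}


def to_common_model(source: str, record: Dict[str, object]) -> Dict[str, object]:
    """Rename a source record's fields to the shared ontology terms.

    Fields with no mapping are dropped.
    """
    mapping = MAPPINGS[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def windowed_mean(values: Iterable[float], size: int) -> Iterator[float]:
    """Continuous query: emit the mean of the last `size` values as each arrives."""
    window: Deque[float] = deque(maxlen=size)
    for value in values:
        window.append(value)
        yield sum(window) / len(window)


a = to_common_model("tide_gauge", {"stn": "site-a", "h": 2.5})
b = to_common_model("wave_buoy", {"location": "site-a", "height_m": 3.5, "battery": 0.9})
means = list(windowed_mean([1.0, 3.0, 5.0], size=2))
```

After mapping, records from both sources share the same keys, so a single query written against the common model applies to either.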
WSN Resource Concerns
 Energy
 Running off battery
 Computation Capabilities
 Limited CPU
 Limited memory
 Limited storage
 Radio Transmission
 Limited range
 Energy impact
 Lost transmissions
Data Matching
Administrative Data Research Centre - Scotland
Messy data
Probabilistic matches
Schema matching
John Grant
Fisherman
Fiona Sinclair
Ian Grant
Smithy
Born: 1861
Stuart Adam
Wheelwright
Morag Scott
Flora Adam
Seamstress
Born: 1866
Married: 1884
John Grant
Farmer
Fiona Grant
Iain Grant
Born: 1860
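The records above suggest why matching has to be probabilistic: "Ian Grant, born 1861" and "Iain Grant, born 1860" may well be the same person even though no field agrees exactly. A minimal sketch of a weighted similarity score follows. Treating John Grant as the father in both records is an assumed reading of the slide, and the field names, weights and year tolerance are arbitrary choices for illustration, not the method used by the project.

```python
from difflib import SequenceMatcher
from typing import Optional


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def year_similarity(a: Optional[int], b: Optional[int], tolerance: int = 2) -> float:
    """1.0 for equal years, falling linearly to 0.0 beyond the tolerance."""
    if a is None or b is None:
        return 0.5  # unknown: neither evidence for nor against
    return max(0.0, 1.0 - abs(a - b) / (tolerance + 1))


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted score combining field similarities (weights are arbitrary)."""
    return (
        0.4 * name_similarity(rec_a["name"], rec_b["name"])
        + 0.3 * name_similarity(rec_a["father"], rec_b["father"])
        + 0.3 * year_similarity(rec_a.get("born"), rec_b.get("born"))
    )


# Records transcribed from the slide; the "father" role is an assumption.
groom = {"name": "Ian Grant", "father": "John Grant", "born": 1861}
bride = {"name": "Flora Adam", "father": "Stuart Adam", "born": 1866}
birth = {"name": "Iain Grant", "father": "John Grant", "born": 1860}

likely = match_score(groom, birth)
unlikely = match_score(bride, birth)
```

A real linkage pipeline would choose a threshold on such a score and would usually estimate the weights from data rather than fixing them by hand.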
Administrative Data Research Network
Administrative Data Service
ADRC-Scotland
 Co-located with Farr Institute, Scottish Government and NHS.
 Universities of Aberdeen, Dundee, Edinburgh, Glasgow, Heriot-Watt, St Andrews and Stirling.
 Expertise in administrative data and public engagement, linkage, law and relevant computer science techniques.
 Provide research support, facilities, training
Research Focus
http://www.gov.scot/Resource/0044/00442276-390.jpg
 Schools, colleges and universities
 The criminal and justice system
 Social work services
 Social welfare
 Housing system
 Transport system
 Health system
 Historical administrative data
Multiple Identities
Andy Law's Third Law
“The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study”
http://bioinformatics.roslin.ac.uk/lawslaws/
P12047
X31045
GB:29384
http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL1642
https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1642
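One common way to cope with multiple identifiers is to record pairwise "same as" assertions and group them into equivalence sets. The sketch below uses a union-find structure for this. The `IdentifierSets` class is hypothetical, and the links asserted between P12047, X31045 and GB:29384 are assumptions made purely for illustration; the slide does not state which identifiers co-refer.

```python
from typing import Dict


class IdentifierSets:
    """Union-find over identifiers asserted to denote the same thing."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def find(self, identifier: str) -> str:
        """Return the representative of the set containing `identifier`."""
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[identifier] != root:
            self._parent[identifier], identifier = root, self._parent[identifier]
        return root

    def link(self, a: str, b: str) -> None:
        """Record that `a` and `b` identify the same entity."""
        self._parent[self.find(a)] = self.find(b)

    def same(self, a: str, b: str) -> bool:
        """True if `a` and `b` have been linked, directly or transitively."""
        return self.find(a) == self.find(b)


ids = IdentifierSets()
# Two URLs for the same ChEMBL molecule: RDF resource and web page.
ids.link(
    "http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL1642",
    "https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1642",
)
# Illustrative assumption: these three identifiers refer to one entity.
ids.link("P12047", "X31045")
ids.link("X31045", "GB:29384")
```

Because equivalence is transitive here, linking P12047 to X31045 and X31045 to GB:29384 also makes P12047 and GB:29384 equivalent, which is convenient but means a single wrong link merges two sets.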
Query Performance
 Response time
 Data freshness
 Reliability
 Volume of requests
 Hosting resources
Data Source
Data Source
Data Warehouse
Queries
Data Source
Data Source
Mediator
Queries
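The two architectures in the diagram trade response time against data freshness. A minimal sketch, assuming each data source is simply a function returning its current rows: the warehouse copies the data in advance, while the mediator forwards each query to the sources. The class names and the toy sources are hypothetical.

```python
from typing import Callable, Dict, List

Source = Callable[[], List[int]]  # a data source returns its current rows


class Warehouse:
    """Copies all sources up front; queries read the local copy."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self._sources = sources
        self._copy: List[int] = []
        self.refresh()

    def refresh(self) -> None:
        """Reload the local copy from every source."""
        self._copy = [row for fetch in self._sources.values() for row in fetch()]

    def query(self, predicate: Callable[[int], bool]) -> List[int]:
        return [row for row in self._copy if predicate(row)]


class Mediator:
    """Holds no data; each query is forwarded to the sources when it is asked."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self._sources = sources

    def query(self, predicate: Callable[[int], bool]) -> List[int]:
        return [
            row
            for fetch in self._sources.values()
            for row in fetch()
            if predicate(row)
        ]


# Two toy in-memory sources (hypothetical).
rows_a = [1, 2]
rows_b = [3]
sources = {"a": lambda: list(rows_a), "b": lambda: list(rows_b)}

warehouse = Warehouse(sources)
mediator = Mediator(sources)

rows_a.append(10)  # a source changes after the warehouse was loaded

stale = warehouse.query(lambda r: r > 1)
fresh = mediator.query(lambda r: r > 1)
```

The warehouse answers from its copy without touching the sources, so it misses the new row until `refresh()` is called; the mediator sees it immediately but pays for a round trip to every source on every query.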
How FAIR is your Data?
Summary
 Web of Data
 Global identifiers
 Interoperable data
 Domain ontologies
 Challenges
 Data matching
 Multiple identifiers
 Query performance
 FAIR data
www.alasdairjggray.co.uk
A.J.G.Gray@hw.ac.uk
@gray_alasdair


Data Linkage

  • 1. Data Linkage Alasdair J G Gray A.J.G.Gray@hw.ac.uk alasdairjggray.co.uk @gray_alasdair
  • 2. Estuarine Flooding: Financial implications (damage, loss of business); Personal factors (emotional impact); Flood prediction (locations, severity), requires correlating sea-state data, weather forecasts, details of sea defences; Response planning (evacuation routes, personnel deployment, …), requires more data (traffic reports, shipping, …). 8 April 2015 SICSA Env. & Social Databases. Image: http://www.metro.co.uk/
  • 3. Flood Prediction, Solent Use Case: Busy shipping channel; Two major ports; Complex tidal and wave patterns
  • 4. Flood Detection: “Detect overtopping events in the Solent region” (sea-level > sea-defence). Sea-level: sensors. Defence heights: databases. Diagram labels: Flood defences data (database); Real-time sensor data (wave, wind, tide)
  • 5. Response Planning: “Provide contextual information”. Web feeds; Other sources: maps, models; Real-time merging of datasets. Diagram labels: Meteorological forecasts; Other sources (maps, models, …)
  • 6. Data Linkage and Querying: Web of Data
  • 7. Linked Data Approach: 1. Global ID (URI) 2. Resolvable ID 3. Useful content (HTML for humans, RDF for machines) 4. Link to other resources. Like the Web, but for data! “RDF and OWL do not solve the interoperability problem, they just lay it bare on the table!”
  • 8. Olympics 2012
  • 9. Linking Data
  • 10. Querying Approach: Use ontologies as common model. Requires: representation of data (sensors and databases); establishing mappings between ontology models and data source schemas; accessing data sources through queries over ontology model; expressing continuous queries over sensors
  • 11. WSN Resource Concerns: Energy (running off battery); Computation capabilities (limited CPU, limited memory, limited storage); Radio transmission (limited range, energy impact, lost transmissions)
  • 12. Data Matching (Administrative Data Research Centre - Scotland): Messy data; Probabilistic matches; Schema matching. Example records: John Grant, Fisherman; Fiona Sinclair; Ian Grant, Smithy, Born: 1861; Stuart Adam, Wheelwright; Morag Scott; Flora Adam, Seamstress, Born: 1866; Married: 1884; John Grant, Farmer; Fiona Grant; Iain Grant, Born: 1860
  • 13. Administrative Data Research Network: Administrative Data Research Centre - Scotland; Administrative Data Service
  • 14. ADRC-Scotland (Administrative Data Research Centre - Scotland): Co-located with Farr Institute, Scottish Government and NHS. Universities of Aberdeen, Dundee, Edinburgh, Glasgow, Heriot-Watt, St Andrews and Stirling. Expertise in administrative data and public engagement, linkage, law and relevant computer science techniques. Provide research support, facilities, training
  • 15. Research Focus (Administrative Data Research Centre - Scotland; http://www.gov.scot/Resource/0044/00442276-390.jpg): Schools, colleges and universities; The criminal and justice system; Social work services; Social welfare; Housing system; Transport system; Health system; Historical administrative data
  • 16. Multiple Identities: Andy Law's Third Law, “The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study” (http://bioinformatics.roslin.ac.uk/lawslaws/). Example identifiers: P12047; X31045; GB:29384; http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL1642; https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1642
  • 17. Query Performance: Response time; Data freshness; Reliability; Volume of requests; Hosting resources. Diagram labels: Data Source, Data Source, Data Warehouse, Queries; Data Source, Data Source, Mediator, Queries
  • 18. How FAIR is your Data?
  • 19. Summary: Web of Data (global identifiers, interoperable data, domain ontologies); Challenges (data matching, multiple identifiers, query performance, FAIR data). www.alasdairjggray.co.uk A.J.G.Gray@hw.ac.uk @gray_alasdair
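The overtopping rule on the Flood Detection slide (sea-level > sea-defence, with sea levels from sensors and defence heights from a database) can be illustrated with a minimal Python sketch. The site names, heights and readings below are hypothetical placeholders, not values from the talk, and a real system would evaluate this as a continuous query over sensor streams rather than over an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sea-level observation from a sensor (hypothetical shape)."""
    site: str
    sea_level_m: float


# Hypothetical defence heights in metres; the slides source these
# from a flood-defence database instead of a hard-coded dictionary.
DEFENCE_HEIGHT_M = {"site-a": 4.2, "site-b": 3.8}


def overtopping_events(readings, defence_heights):
    """Return the readings whose sea level exceeds the defence height at that site."""
    events = []
    for reading in readings:
        height = defence_heights.get(reading.site)
        # Sites with no known defence height are skipped rather than guessed.
        if height is not None and reading.sea_level_m > height:
            events.append(reading)
    return events


readings = [Reading("site-a", 4.0), Reading("site-b", 4.1), Reading("site-c", 9.9)]
events = overtopping_events(readings, DEFENCE_HEIGHT_M)
for event in events:
    print(f"Overtopping at {event.site}: {event.sea_level_m} m")
```

The join between the sensor reading and the database record is done on a shared site key here; agreeing on that shared identifier across sources is exactly the linkage problem the later slides discuss.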

Editor's Notes

  1. Environmental decision support systems. Flood emergency response: real-time data mash-ups, real-time data linkage.
  2. Strait of water separating the Isle of Wight from the English mainland. Two high tides -> increased opportunities for getting ships in and out -> better for business. Complex tidal pattern. Non-standard models.
  3. Overtopping: a wave or tide exceeds the height of the sea defence (simplified as a threshold in the graph). Sensor data provides current sea-state conditions. National Flood and Coastal Defences Database (NFCDD) provides height of sea walls, etc. Lots of forms of heterogeneity in the system.
  4. Contextual data. Weather feed provides predicted wind speed and direction (contextual streaming data). Maps -> contextual visual data. Report data in a form understandable to the user: ontology.
  5. Data from heterogeneous sources: discover relevant sources; different temporal modalities; different data models and representations. Interlink data: common representation, align data models/schemas, identify common entities. Query decomposition across distributed sources. Efficient in-network processing: save energy, increase network lifetime. Enable new insights through novel user interfaces.
  6. Linked data offers a platform on which to do data science. Linked Data hugely successful since inception in 2006, revision 2009. About 300 datasets published. Wide range of topics.
  7. Coverage of 10,000+ athletes, 200+ countries, 400-500 disciplines and 30 venues. Page for every athlete and country, drawing on open data.
  8. Internally DBPedia and Geonames
  9. Previous streaming extensions to SPARQL have problems
  10. Bird habitat monitoring, coastal monitoring, glacier movement, farms, volcanoes… Cost-effective monitoring, high spatial/temporal resolution. What is the underlying technology/software?
  11. Trade-off of capabilities vs QoS vs lifetime. Every system performed its own bespoke evaluation; how do you compare?
  12. Social science example from ADRC Scotland. Same problem in environmental science: bore holes in the North Sea.
  13. Four Administrative Data Research Centres (ADRCs), one in each UK country. England: led by University of Southampton. Northern Ireland: led by Queen's University Belfast. Scotland: led by University of Edinburgh. Wales: led by Swansea University. Coordinating Administrative Data Service (ADS): led by University of Essex.
  14. Each captures a subtly different view of the world. Are they the same? … depends on your point of view. Different URIs for different representations (content negotiation).
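The "probabilistic matches" idea from the Data Matching slide can be sketched as a weighted similarity score between two records. This is only an illustration: the pairing of names and birth years is one plausible reading of the slide, and the weights, the five-year tolerance and the use of Python's difflib for name similarity are assumptions rather than the method used by ADRC Scotland.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Person:
    name: str
    born: int


def match_score(a, b, name_weight=0.7, year_tolerance=5):
    """Score in [0, 1]: weighted mix of name similarity and birth-year closeness.

    The weights and tolerance are illustrative assumptions.
    """
    name_sim = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    # Birth-year closeness falls linearly to 0 once the gap reaches the tolerance.
    year_sim = max(0.0, 1.0 - abs(a.born - b.born) / year_tolerance)
    return name_weight * name_sim + (1.0 - name_weight) * year_sim


# Names and years as read from the slide; the record boundaries are assumed.
marriage_groom = Person("Ian Grant", 1861)
marriage_bride = Person("Flora Adam", 1866)
birth_record = Person("Iain Grant", 1860)

likely = match_score(marriage_groom, birth_record)
unlikely = match_score(marriage_bride, birth_record)
print(f"Ian Grant vs Iain Grant: {likely:.2f}")
print(f"Flora Adam vs Iain Grant: {unlikely:.2f}")
```

A score like this only ranks candidate pairs. Deciding where to place the accept/reject threshold, and whether extra evidence such as parents' names or occupations should contribute, is the messy part the slide points at.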