FAIR Linked Data
Publishing them, using them, and
why it doesn’t take a giant to do it.
Alessandro Adamou
Open Scholarship Week 2021
WHOAMI
Digital Humanities Scientist at
Bibliotheca Hertziana - Max Planck Institute for Art History
prior: Research Fellow at Data Science Institute, NUI Galway
Computer scientist background, eventually chose DH as
application domain
Research projects in multiple domains:
- Education (academic and informal), music history,
eGov, smart cities, literature, industry 4.0
My roles in those projects involved:
- Creating data myself
- Cataloguing/integrating data by others
MSc degree studies (until 2007):
Was initially introduced to structured data
- Storing in a relational DBMS (MySQL, MS Access) or
native XML database (BaseX, eXist-db)
- Publishing: XML and not much more than that
- No training in how to design good data schemas,
other than RDBMS optimisation
Otherwise, was perfectly happy with calling “data” the
tables in an HTML page, an Excel sheet, or even text in a
PDF or Word document.
First contact with a Data Paradigm
Rationales for hierarchical data
● Both are built according to somewhat sensible
rationales
● Neither way is standard or conformant to a shared
logical paradigm.
● Just good enough for a thesaurus
Masters and Doctoral theses (2008-12):
Machine-readable data (mostly structured data with
annotations)
- Learned about triple stores but didn’t use them (more
on that later)
- Publishing: RDF format and the many ways to
serialise it: XML, JSON, Turtle
- The world of “good data schemas” opens to me:
ontologies and semantics
Evolution of the Data Paradigm
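To make "many ways to serialise" concrete, here is a minimal sketch that writes one and the same RDF statement in two of the syntaxes mentioned: Turtle and JSON (as expanded JSON-LD). The `http://example.org/` namespace and the function names are assumptions made up for illustration; a real project would more likely rely on an RDF library than on string formatting.

```python
import json

# Hypothetical namespace, used only for illustration.
EX = "http://example.org/"


def to_turtle(subject, predicate, obj):
    """Serialise one triple of ex:-prefixed local names as Turtle."""
    return f"@prefix ex: <{EX}> .\n\nex:{subject} ex:{predicate} ex:{obj} .\n"


def to_jsonld(subject, predicate, obj):
    """Serialise the same triple as expanded JSON-LD (full IRIs, no @context)."""
    doc = [{"@id": EX + subject, EX + predicate: [{"@id": EX + obj}]}]
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    # Same statement, two syntaxes: the data model (RDF) stays the same.
    print(to_turtle("Galway", "locatedIn", "Connacht"))
    print(to_jsonld("Galway", "locatedIn", "Connacht"))
```

The point of the sketch is that the syntax is interchangeable: both outputs carry exactly one subject, one relation and one object.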
No longer just hierarchies!
● Hierarchy is only taxonomical: what is
of one type is also of the type above
● I can make up as many types of relation
as I want
[Diagram. Data level: Galway locatedIn Connacht locatedIn Ireland, typed ("a") as City, Region and Country respectively. ONTOLOGY level: City, Settlement, Region, Country, Place, with (locatedIn) relations.]
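The place example can be sketched in a few lines of plain Python: types form a hierarchy, while locatedIn is a relation of its own kind. The exact shape of the type hierarchy (City under Settlement under Place) and the reading of locatedIn as something that can be followed step by step are assumptions drawn from the diagram, not a statement about any particular ontology.

```python
# Hypothetical mini-dataset mirroring the diagram; the hierarchy is assumed.
SUBCLASS_OF = {
    "City": "Settlement",
    "Settlement": "Place",
    "Region": "Place",
    "Country": "Place",
}
TYPE_OF = {"Galway": "City", "Connacht": "Region", "Ireland": "Country"}
LOCATED_IN = {"Galway": "Connacht", "Connacht": "Ireland"}


def all_types(thing):
    """Taxonomy: what is of one type is also of every type above it."""
    types = []
    current = TYPE_OF.get(thing)
    while current is not None:
        types.append(current)
        current = SUBCLASS_OF.get(current)
    return types


def all_locations(thing):
    """Follow the locatedIn relation step by step (assumed transitive)."""
    places = []
    current = LOCATED_IN.get(thing)
    while current is not None:
        places.append(current)
        current = LOCATED_IN.get(current)
    return places


if __name__ == "__main__":
    print(all_types("Galway"))
    print(all_locations("Galway"))
```

The two dictionaries answer different questions: one says what kind of thing Galway is, the other says how Galway relates to other things.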
Linked Data (2006)
“Like the web of hypertext, the
web of data is constructed with
documents on the web. However,
unlike the web of hypertext,
where links are relationships
anchors in hypertext documents
written in HTML, for data they are
links between arbitrary things.”
- Tim Berners-Lee, 2006
Berners-Lee, T. “Linked Data: Design Issues” (2006).
https://www.w3.org/DesignIssues/LinkedData.html
Linked Data
● Not a format, but:
○ a set of principles and recommendations,
○ for all of which, Web standards and technologies exist
PRINCIPLE / STANDARD / TECHNOLOGY
1. Have a system of identifiers as names for all things in your data (people, artworks, time periods, books, events…)
- Standard: URI
- Technology: Some custom code (for convention); or: Linked Data API implementations
2. Make it possible to look up those names
- Standard: HTTP URI
- Technology: Web/application servers (Apache, Jetty, Nginx...)
3. When looked up, the information returned is formatted using standards.
- Standard: RDF, SPARQL
- Technology: Triple stores / Graph databases with SPARQL servers; Client programming libraries
4. This information contains links to other things, using the same system of identifiers.
- Standard: External URIs
- Technology: External LD services (e.g. Wikidata)
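The four principles can be walked through with a toy, in-memory dataset. This is only a sketch: the `http://data.example.org/` base, the `vocab/locatedIn` property and the function names are hypothetical, and the Wikidata identifier Q27 is assumed here to be the item for Ireland. A real deployment would put a Web server and a triple store where the dictionary is.

```python
BASE = "http://data.example.org/"  # hypothetical domain you would control
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
LOCATED_IN = BASE + "vocab/locatedIn"  # hypothetical property

# Principle 1: every thing in the data is named by a URI (the dict keys).
DATASET = {
    BASE + "place/galway": [(LOCATED_IN, BASE + "place/connacht")],
    BASE + "place/connacht": [(LOCATED_IN, BASE + "place/ireland")],
    # Principle 4: link out to someone else's identifier for the same thing.
    BASE + "place/ireland": [(OWL_SAME_AS, "http://www.wikidata.org/entity/Q27")],
}


def lookup(uri):
    """Principles 2 and 3: looking up a name returns data in a standard
    format, here N-Triples (one '<s> <p> <o> .' statement per line)."""
    statements = DATASET.get(uri, [])
    return "".join(f"<{uri}> <{p}> <{o}> .\n" for p, o in statements)


def external_links(uri):
    """List the links that point outside our own identifier space."""
    return [o for _, o in DATASET.get(uri, []) if not o.startswith(BASE)]


if __name__ == "__main__":
    print(lookup(BASE + "place/galway"))
    print(external_links(BASE + "place/ireland"))
```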
Post-doctoral work (2013-):
Linked data
- Shared conventions encompass not only the data
schemas (ontologies), but also the data elements!
- Storage: Graph Databases - triple stores and more
(Virtuoso, Jena, GraphDB, Neo4J...)
- Publishing: RDF format + public query endpoint
(SPARQL language)
Evolution of the Data Paradigm
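A public query endpoint is less exotic than it sounds: the query travels in an ordinary HTTP request, and the answer comes back as JSON. The sketch below builds the request URL and flattens a result document in the standard SPARQL JSON results layout. The endpoint address, the query and the sample response are all made up for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and query, for illustration only.
ENDPOINT = "http://example.org/sparql"
QUERY = (
    "SELECT ?place WHERE { "
    "?place <http://example.org/locatedIn> <http://example.org/Ireland> }"
)

# A hand-written response in the SPARQL JSON results layout.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["place"]},
    "results": {"bindings": [
        {"place": {"type": "uri", "value": "http://example.org/Connacht"}},
    ]},
})


def build_query_url(endpoint, query):
    """A query can be sent with HTTP GET in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def rows(results_json):
    """Flatten the result bindings into plain {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: binding[var]["value"] for var in binding}
        for binding in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
    print(rows(SAMPLE_RESPONSE))
```

When sending the request for real, an `Accept: application/sparql-results+json` header asks the endpoint for this JSON layout.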
Many ways to publish Linked Data
Embedded inside HTML pages.
Serialise to XML, JSON or
plaintext and provide a
downloadable data dump.
Publish a Web service for users
to query with the SPARQL
language.
Make a URI point to RDF
snippets that describe that
thing in your data.
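Of these options, embedding is probably the least demanding for a publisher. One common way to do it is a JSON-LD block inside a `script` element of type `application/ld+json`. The sketch below pulls such blocks out of a page with Python's standard library only; the sample page and its content are invented for illustration, and other embedding syntaxes (such as RDFa) are not covered.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the JSON-LD documents embedded in an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Invented sample page, for illustration only.
PAGE = """<html><head><title>Example</title>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "MusicEvent", "name": "Example concert"}
</script></head><body><p>A concert.</p></body></html>"""

if __name__ == "__main__":
    print(extract_jsonld(PAGE))
```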
Linked Open Data (2010*)
(*) originally introduced alongside Linked Data, but formalised later
The Open Data stars
A 5-Star deployment scheme for your data.
★ Open and on the Web
“Make your stuff available on the Web (whatever format) under an open license.”
★★ Machine-readable
“Make it available as structured data (e.g., Excel instead of image scan of a table).”
★★★ Open format
“Make it available in a non-proprietary open format (e.g. CSV instead of Excel).”
★★★★ URIs for everything
“Use URIs to denote things, so that people can point at your stuff.”
★★★★★ Linking
“Link your data to other data to provide context.” → Linked Open Data
Source: “5 ★ Open Data”. https://5stardata.info/
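The scheme is cumulative: each star presupposes the ones before it. As a rough sketch of that reading (the function and parameter names are made up here, and the yes/no answers are of course a simplification of a real assessment):

```python
def open_data_stars(open_licence, structured, open_format, uses_uris, linked):
    """Count stars cumulatively: a level only counts if all lower ones hold."""
    stars = 0
    for criterion in (open_licence, structured, open_format, uses_uris, linked):
        if not criterion:
            break
        stars += 1
    return stars


if __name__ == "__main__":
    # An Excel sheet on the Web under an open licence: structured, but proprietary.
    print(open_data_stars(True, True, False, False, False))
    # The same table as CSV.
    print(open_data_stars(True, True, True, False, False))
```

Under this reading, the Excel sheet earns two stars and the CSV file three; data without an open licence earns none, however well linked it is.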
Linked Open Data Cloud
Maintained by the Data Science Institute,
NUI Galway, https://lod-cloud.net/
● Linked (Open) Data are a set of principles born from questions arising within
the scientific community.
○ Several of the resulting technologies were born here in NUI Galway 😊
● They are standardised at a scholarly/technical level (W3C)...
● but not at a political level (European Union, G20)
● Legitimation of LOD by policy makers had not been sought, but a reference
framework to evaluate LOD against was still missing...
The scholarly-political gap
2016
“What constitutes ‘good data
management’ is largely undefined,
and is generally left as a decision
for the data or repository owner.
Therefore, bringing some clarity
around the goals and desiderata of
good data management and
stewardship [...] would be of great
utility.”
- Mark D. Wilkinson et al, 2016
FAIR: Findable, Accessible, Interoperable, Reusable
FAIR Principles
The first set of guidelines to have gained support from policy-makers.
★ Findable
Data are assigned a globally unique and persistent identifier and indexed in
a searchable source.
★ Accessible
Standard and open communications protocol; metadata always available.
★ Interoperable
Shared knowledge representation language. Data schemas are also FAIR.
★ Reusable
Data are richly described, with a clear license and provenance information.
Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and
stewardship. Sci. Data 3:160018 doi: 10.1038/sdata.2016.18 (2016).
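5★ data as a FAIR implementation
[Table: the five Open Data levels (Online - Open, Machine-readable, Non-proprietary, Standards, Linked) as rows against the FAIR principles (Findable, Accessible, Interoperable, Reusable) as columns; an X marks each principle that a level contributes to.]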
What you need to publish 5★ data
- Manage a Web domain/host to publish HTTP URIs with
  - e.g. bnb.data.bl.uk , wikidata.org …
- Storage: a Triple store / Graph database with SPARQL
  - Many open source or free-to-use solutions (e.g. Jena, Virtuoso, GraphDB, Blazegraph)
- CPU power to handle user queries
  - Varies, but affordable hosted solutions
- Programming skills: only some Web development (HTML+JS+CSS, Web apps)
  - Or, CMS and Linked Data frameworks
Data stores: Many triple stores and graph databases come in an open source “community edition” with limited performance capabilities, and an “uncapped” proprietary edition.
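The "some Web development" part mostly comes down to serving the same URI in different formats depending on who asks: HTML for browsers, RDF for programs. The sketch below shows one possible way to pick a format from the HTTP `Accept` header. The set of available formats and the fallback to HTML are assumptions for illustration; wildcards such as `text/*` are not handled, and a stricter server might answer 406 Not Acceptable instead of falling back.

```python
# Formats this hypothetical server can produce for each thing it describes.
AVAILABLE = ("text/turtle", "application/ld+json", "text/html")


def parse_accept(header):
    """Return the acceptable media types from an Accept header, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type, quality = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if media_type and quality > 0:
            # Sort by descending quality, then by order of appearance.
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def choose_representation(accept_header, default="text/html"):
    """Pick the client's most preferred format that we can serve."""
    for media_type in parse_accept(accept_header):
        if media_type in AVAILABLE:
            return media_type
        if media_type == "*/*":
            return default
    return default


if __name__ == "__main__":
    print(choose_representation("text/turtle, text/html;q=0.8"))
    print(choose_representation("text/html,application/xhtml+xml,*/*;q=0.8"))
```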
What you need to use 5★ data
- Human consumption: a Web browser is enough
  - Especially if the linked data are embedded in pages
- Developers:
  - curl program alone can do wonders!
  - Client libraries available for many languages (Java, Python, NodeJS...)
  - But even without them, HTTP client libraries will do the job just fine!
- Core resource is network traffic
Did you know? Search engines like Google use Linked Data to display “rich” interactive search result entries and the info boxes that often appear to the right of search results.
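As a sketch of the "plain HTTP client" route, the snippet below asks for the RDF description of a thing using nothing but Python's standard library. It is roughly what `curl -L -H "Accept: text/turtle" <uri>` does on the command line. The function names are made up here, the Wikidata URI is only an example (Q27 is assumed to be the item for Ireland), and whether a given server honours the `Accept` header is up to its publisher.

```python
from urllib.request import Request, urlopen


def build_lookup_request(uri, media_type="text/turtle"):
    """Prepare a GET request that asks for RDF rather than an HTML page."""
    return Request(uri, headers={"Accept": media_type})


def fetch(uri, media_type="text/turtle", timeout=10):
    """Send the request (redirects are followed) and return the body as text."""
    with urlopen(build_lookup_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Needs network access; prints the beginning of the returned description.
    print(fetch("http://www.wikidata.org/entity/Q27")[:500])
```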
An emerging new paradigm
Scholars in cultural heritage and digital humanities
are among the most vocal in lamenting that LOD
principles do not address every common issue.
In particular, while understanding the principles may
be easy, dealing with FAIR LOD may not be!
Conforming data to scholarly recommendations does
not necessarily entail visibility and uptake!
Linked Open Data → Linked Open Usable Data (2018)
“If our data isn’t used, then no
value is gained from the
resources that were invested in
its creation, publication,
maintenance and improvement.
If we want our data to be used,
then it needs to be usable.”
- Robert Sanderson, 2018
The LOUD stars
Not quite (yet) as formalised as the Open Data stars.
★ The right abstraction for the audience
“Some use cases and requirements should drive
the interoperability layer between systems.”
★ Few barriers to entry
“If it takes time to understand the model, [...]
query syntax and so forth, then developers will
look for easier targets.”
★ Comprehensible by introspection
“Data should be understandable by looking at it,
rather than requiring the developer to read the
ontology and vocabularies.”
★ Documentation with working examples
“Documentation clarifies the patterns that the
developer can expect to encounter, such that they
can implement robustly.”
★ Few exceptions, many consistent patterns
“While not everything is homogenous, a set of
patterns that manage exceptions well is better than
many custom fields.”
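"Comprehensible by introspection" can be illustrated by comparing a record keyed by full property URIs with the same record keyed by short, readable terms. The sketch below does the renaming with a simple term mapping, in the spirit of a JSON-LD context. It is not a JSON-LD processor, and the mapping and sample record are made up for illustration.

```python
# Hypothetical term mapping, in the spirit of a JSON-LD @context.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "locatedIn": "http://example.org/vocab/locatedIn",
}

# What a developer sees when the data uses full property URIs as keys.
RAW_RECORD = {
    "@id": "http://example.org/place/galway",
    "http://xmlns.com/foaf/0.1/name": "Galway",
    "http://example.org/vocab/locatedIn": "http://example.org/place/connacht",
}


def compact(record, context):
    """Replace full property URIs with the short terms defined in context."""
    short = {iri: term for term, iri in context.items()}
    return {short.get(key, key): value for key, value in record.items()}


if __name__ == "__main__":
    print(compact(RAW_RECORD, CONTEXT))
```

The compacted record can be read at a glance, while the mapping keeps the link back to the shared vocabularies.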
The Listening Experience Database,
https://led.kmi.open.ac.uk
➢ A music history catalogue
➢ Developed since 2013
- (i.e. LOD but before FAIR)
➢ Linked Data features
- Data license is CC BY-NC-SA
- Data documentation page
- External query endpoint
- Links to MusicBrainz and more
- Embedded data in pages
➢ How many Open Data stars?
➢ How FAIR? How LOUD?
A born-linked Humanities data project
Just like Napalm Death music:
To be FAIR, it must be LOUD!
Image: Sven Mandel / CC-BY-SA-4.0
Thank You!
Alessandro Adamou
Open Scholarship Week 2021

FAIR data: LOUD for all audiences

  • 1. FAIR Linked Data Publishing them, using them, and why it doesn’t take a giant to do it. Alessandro Adamou Open Scholarship Week 2021
  • 2. WHOAMI Digital Humanities Scientist at Bibliotheca Hertziana - Max Planck Institute for Art History prior: Research Fellow at Data Science Institute, NUI Galway Computer scientist background, eventually chose DH as application domain Research projects in multiple domains: - Education (academic and informal), music history, eGov, smart cities, literature, industry 4.0 My roles in those projects involved: - Creating data myself - Cataloguing/integrating data by others
  • 3. MSc degree studies (until 2007): Was initially introduced to structured data - Storing in a relational DBMS (MySQL, MSAccess) or native XML database (basex, eXistDB) - Publishing: XML and not much more than that - No formation on how to make good data schemas, other than RDBMS optimisation Otherwise, was perfectly happy with calling “data” the tables in an HTML page, an Excel sheet, or even text in a PDF or Word document. First contact with a Data Paradigm
  • 4. Rationales for hierarchical data ● Both are built according to somewhat sensible rationales ● Neither way is standard or conformant to a shared logical paradigm. ● Just good enough for a thesaurus
  • 5. Masters and Doctoral theses (2008-12): Machine-readable data (mostly structured data with annotations) - Learned about triple stores but didn’t use them (more on that later) - Publishing: RDF format and the many ways to serialise it: XML, JSON, Turtle - The world of “good data schemas” opens to me: ontologies and semantics Evolution of the Data Paradigm
  • 6. No longer just hierarchies! ● Hierarchy is only taxonomical: what is of one type is also of the type above ● I can make up as many types of relation as I want Galway Connacht Ireland locatedIn locatedIn City Region Country Place a a a Settlement (locatedIn) (locatedIn) ONTOLOGY
  • 7. Linked Data (2006) “Like the web of hypertext, the web of data is constructed with documents on the web. However, unlike the web of hypertext, where links are relationships anchors in hypertext documents written in HTML, for data they are links between arbitrary things.” - Tim Berners-Lee, 2006. Berners-Lee, T. “Linked Data: Design Issues” (2006). https://www.w3.org/DesignIssues/LinkedData.html
  • 8. Linked Data ● Not a format, but: ○ a set of principles and recommendations, ○ for all of which, Web standards and technologies exist. Each PRINCIPLE is paired with a STANDARD and a TECHNOLOGY: 1. Have a system of identifiers as names for all things in your data (people, artworks, time periods, books, events…). Standard: URI. Technology: some custom code (for convention); or: Linked Data API implementations. 2. Make it possible to look up those names. Standard: HTTP URI. Technology: Web/application servers (Apache, Jetty, Nginx...). 3. When looked up, the information returned is formatted using standards. Standard: RDF, SPARQL. Technology: triple stores / graph databases with SPARQL servers; client programming libraries. 4. This information contains links to other things, using the same system of identifiers. Standard: external URIs. Technology: external LD services (e.g. Wikidata). Berners-Lee, T. “Linked Data: Design Issues” (2006). https://www.w3.org/DesignIssues/LinkedData.html
  • 9. Post-doctoral work (2013-): Linked data - Shared conventions encompass not only the data schemas (ontologies), but also the data elements! - Storage: Graph Databases - triple stores and more (Virtuoso, Jena, GraphDB, Neo4J...) - Publishing: RDF format + public query endpoint (SPARQL language) Evolution of the Data Paradigm
  • 10. Many ways to publish Linked Data Embedded inside HTML pages. Serialise to XML, JSON or plaintext and provide a downloadable data dump. Publish a Web service for users to query with the SPARQL language. Make a URI point to RDF snippets that describe that thing in your data.
  • 11. Linked Open Data, 2010* (*) originally introduced alongside Linked Data, but formalised later
  • 12. The Open Data stars: a 5-star deployment scheme for your data. ★ Open and on the Web: “Make your stuff available on the Web (whatever format) under an open license.” ★★ Machine-readable: “Make it available as structured data (e.g., Excel instead of image scan of a table).” ★★★ Open format: “Make it available in a non-proprietary open format (e.g. CSV instead of Excel).” ★★★★ URIs for everything: “Use URIs to denote things, so that people can point at your stuff.” ★★★★★ Linking: “Link your data to other data to provide context.” → Linked Open Data. Source: “5 ★ Open Data”. https://5stardata.info/
  • 13. Linked Open Data Cloud. Maintained by the Data Science Institute, NUI Galway, https://lod-cloud.net/
  • 14. ● Linked (Open) Data are a set of principles born from questions arising within the scientific community. ○ Several of the resulting technologies were born here in NUI Galway 😊 ● They are standardised at a scholarly/technical level (W3C)... ● but not at a political level (European Union, G20) ● Legitimation of LOD by policy makers had not been sought, but a reference framework to evaluate LOD against was still missing... The scholarly-political gap
  • 15. Findable, Accessible, Interoperable, Reusable (2016) “What constitutes ‘good data management’ is largely undefined, and is generally left as a decision for the data or repository owner. Therefore, bringing some clarity around the goals and desiderata of good data management and stewardship [...] would be of great utility.” - Mark D. Wilkinson et al, 2016
  • 16. FAIR Principles The first set of guidelines to have gained support by policy-makers. ★ Findable Data are assigned a globally unique and persistent identifier and indexed in a searchable source. ★ Accessible Standard and open communications protocol; metadata always available. ★ Interoperable Shared knowledge representation language. Data schemas are also FAIR. ★ Reusable Data are richly described, with a clear license and provenance information. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 doi: 10.1038/sdata.2016.18 (2016).
  • 17. 5★ data as a FAIR implementation. [Matrix figure: the 5★ properties (Online - Open, Machine-readable, Non-proprietary, Standards, Linked) marked against the FAIR principles (Findable, Accessible, Interoperable, Reusable).]
  • 18. What you need to publish 5★ data - Manage a Web domain/host to publish HTTP URIs with - e.g. bnb.data.bl.uk, wikidata.org … - Storage: a Triple store / Graph database with SPARQL - Many open source solutions (e.g. Jena, Virtuoso, GraphDB, Blazegraph) - CPU power to handle user queries - Varies, but affordable hosted solutions - Programming skills: only some Web development (HTML+JS+CSS, Web apps) - Or, CMS and Linked Data frameworks. Data stores: many triple stores and graph databases come in an open source “community edition” with limited performance capabilities, and an “uncapped” proprietary edition.
  • 19. What you need to use 5★ data - Human consumption: a Web browser is enough - Especially if the linked data are embedded in pages - Developers: - curl program alone can do wonders! - Client libraries available for many languages (Java, Python, NodeJS...) - But even without them, HTTP client libraries will do the job just fine! - Core resource is network traffic. Did you know? Search engines like Google use Linked Data to display “rich” interactive search result entries and the info boxes that often appear to the right of search results.
  • 20. An emerging new paradigm Scholars in cultural heritage and digital humanities are among the most vocal in lamenting that LOD principles do not address every common issue. In particular, while understanding the principles may be easy, dealing with FAIR LOD may not be! Conforming data to scholarly recommendations does not necessarily entail visibility and uptake!
  • 22. Linked Open Usable Data (LOUD), 2018 “If our data isn’t used, then no value is gained from the resources that were invested in its creation, publication, maintenance and improvement. If we want our data to be used, then it needs to be usable.” - Robert Sanderson, 2018
  • 23. The LOUD stars Not quite (yet) as formalised as open data stars. ★ The right abstraction for the audience “Some use cases and requirements should drive the interoperability layer between systems.” ★ Few barriers to entry “If it takes time to understand the model, [...] query syntax and so forth, then developers will look for easier targets.”
  • 24. The LOUD stars ★ Comprehensible by introspection “Data should be understandable by looking at it, rather than requiring the developer to read the ontology and vocabularies.” ★ Documentation with working examples “Documentation clarifies the patterns that the developer can expect to encounter, such that they can implement robustly.” ★ Few exceptions, many consistent patterns “While not everything is homogenous, a set of patterns that manage exceptions well is better than many custom fields.”
  • 25. The Listening Experience Database, https://led.kmi.open.ac.uk ➢ A music history catalogue ➢ Developed since 2013 - (i.e. LOD but before FAIR) ➢ Linked Data features - Data license is CC BY-NC-SA - Data documentation page - External query endpoint - Links to MusicBrainz and more - Embedded data in pages ➢ How many Open Data stars? ➢ How FAIR? How LOUD? A born-linked Humanities data project
  • 26. Just like Napalm Death music: To be FAIR, it must be LOUD! Image Sven Mandel / CC-BY-SA-4.0
  • 27. Thank You! Alessandro Adamou Open Scholarship Week 2021
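
Slide 19 remarks that, even without dedicated client libraries, an ordinary HTTP client is enough to consume Linked Data. The following is a minimal sketch of that idea using only the Python standard library; it is an illustration, not code from the talk. It assumes the Wikidata public SPARQL endpoint as the example service (the slides name Wikidata only as an external LD service), and the function names and the `User-Agent` string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Wikidata's public SPARQL endpoint, used purely as an example service.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_lookup_request(uri: str) -> urllib.request.Request:
    """Prepare a GET request that asks for RDF (Turtle) rather than HTML.

    This is HTTP content negotiation: the same URI that shows a Web page
    to a browser may return machine-readable data to a program, if the
    server supports it.
    """
    return urllib.request.Request(uri, headers={"Accept": "text/turtle"})


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Prepare a SPARQL query sent via GET, asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical client name; public endpoints often expect one.
            "User-Agent": "loud-demo/0.1 (illustrative)",
        },
    )


def parse_bindings(body: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON results document into plain dictionaries."""
    doc = json.loads(body)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def fetch(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With network access, `parse_bindings(fetch(build_sparql_request(SPARQL_ENDPOINT, "SELECT ?s WHERE { ?s ?p ?o } LIMIT 3")))` would be one way to run a query and get back a list of rows; whether a given server honours the Turtle `Accept` header in `build_lookup_request` depends on that server.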