Linkage in Haze
Challenges and take-home messages of crowd-sourcing
vagueness in musical data
Alessandro Adamou
Listening to music: people, practices and experiences
Sunday, October 25, 2015
What we capture in a listening experience
what they said
where
documented where
who
what
when
how
A well-formed listening experience
On July 19, 2014
Leonie Holmes (a professor of Music in New Zealand)*
was listening to Richard Strauss’ “Don Juan”
and “Also sprach Zarathustra”
and Sarah Ballard’s “Synergos”
played by the NZSO National Youth Orchestra
and Alexander Shelley
using harp and double bass (+others?)
in the Aotea Centre.
(*) plus a generic public, which does not pose a problem in the representation.
A worst-case (but more likely)
listening experience
One evening between May and September
in the late 1950s
a group of war veterans, a reporter and an
unknown female
were listening to Chopin and an anthem
played by a string orchestra
in a concert hall in South London.
Goal: to capture both fact sets
as structured data
(interconnected or prepared for
refinement)
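A minimal sketch of what "prepared for refinement" could look like, assuming a record type in which every entity keeps its free-text label and gains an identifier only once it has been identified. The class names and the example.org URIs are hypothetical and are not taken from the LED data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """A participant, place or work mentioned in a listening experience."""
    label: str                  # free-text description taken from the source
    uri: Optional[str] = None   # filled in only once the entity is identified

    @property
    def is_named(self) -> bool:
        return self.uri is not None


@dataclass
class ListeningExperience:
    when: str                   # kept as text: may be exact or vague
    where: Entity
    listeners: List[Entity] = field(default_factory=list)
    works: List[Entity] = field(default_factory=list)

    def vague_entities(self) -> List[Entity]:
        """Entities that still await refinement (no identifier yet)."""
        everything = [self.where, *self.listeners, *self.works]
        return [e for e in everything if not e.is_named]


# Hypothetical placeholder URIs, for illustration only.
well = ListeningExperience(
    when="2014-07-19",
    where=Entity("Aotea Centre", "http://example.org/place/aotea-centre"),
    listeners=[Entity("Leonie Holmes", "http://example.org/agent/leonie-holmes")],
    works=[Entity("Don Juan", "http://example.org/work/don-juan")],
)
vague = ListeningExperience(
    when="one evening between May and September, late 1950s",
    where=Entity("a concert hall in South London"),
    listeners=[Entity("a group of war veterans"), Entity("a reporter"),
               Entity("an unknown female")],
    works=[Entity("Chopin (unspecified work)"), Entity("an anthem")],
)
```

Both fact sets fit the same structure; the worst-case record simply reports more entities from `vague_entities()`.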
Other issues
• Source
– unpublished manuscripts
• Unaligned semantic layers
– instrument category | instrument name | brand and model
– generic occupations | gender-dependent | personal titles
Monarch
King (Queen), Emperor (Empress)…
King of England, fourth Sultan of Zanzibar…
Chords
Electric Guitar
Gibson Les Paul Custom Sunburst
Factors contributing to fuzziness
• Domain knowledge of the evidence author(s)
• Deterioration of evidence
• Crowd-sourcing community issues:
– Misaligned semantics
– Varying scholarly rigour
– Popularity of the domain of interest
Data representation in LED
• Linked [Open] Data http://linkeddata.org
– Formalism for machine-readable and human-readable data
– Object identifiers are URIs
– Standard representation and query languages
(RDF, SPARQL…)
– The meaning of links between objects is
globally understood.
Identity in Linked Data
• http://musicbrainz.org/artist/f5aca88c-e3c1-4bc2-af33-68a9a9f7b56a#_
– (the band Killing Joke, as in MusicBrainz)
• http://bnb.data.bl.uk/id/agent/DailyMirror
– (The Daily Mirror on the British National Bibliography)
• http://dbpedia.org/resource/London
– (London, as in Wikipedia/DBpedia)
• http://reference.data.gov.uk/doc/day/2015-10-23
– (last Friday, as in the UK Government Calendar data)
• http://led.kmi.open.ac.uk/term/Medium.Live
– (the concept of live music, as in LED)
Easy: these are all named entities…
Goal: to capture both fact sets
as linked data
There are no right or wrong
ways to do it, only linkable or
unlinkable.
Linked Data encourage reuse…
No two things are either distinct or equal until
some LD node asserts or implies so.
– e.g. Bono on MusicBrainz and Bono on DBpedia
– Groups too, if it can be demonstrated they are an
exact match
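The rule above can be sketched as a three-valued comparison: two identifiers are "equal" or "distinct" only when some assertion says so, and "unknown" otherwise. This is an illustrative sketch, not LED code; the class name and the shorthand identifiers (`mb:Bono`, `dbpedia:Bono`, `led:...`) are hypothetical stand-ins for full URIs.

```python
class IdentityGraph:
    """Tracks asserted equalities and differences between identifiers."""

    def __init__(self):
        self._parent = {}      # union-find forest over identifiers
        self._different = []   # pairs explicitly asserted to be distinct

    def _find(self, uri):
        # Root lookup with path halving.
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            self._parent[uri] = self._parent[self._parent[uri]]
            uri = self._parent[uri]
        return uri

    def assert_same(self, a, b):
        self._parent[self._find(a)] = self._find(b)

    def assert_different(self, a, b):
        self._different.append((a, b))

    def compare(self, a, b):
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return "equal"
        for x, y in self._different:
            if {self._find(x), self._find(y)} == {ra, rb}:
                return "distinct"
        return "unknown"   # open world: nothing has been said either way


g = IdentityGraph()
g.assert_same("mb:Bono", "dbpedia:Bono")
g.assert_different("led:MournersBerlin", "led:MournersFuneral")
```

Here `g.compare("mb:Bono", "dbpedia:Bono")` yields "equal", while comparing either of them with an identifier nobody has linked yields "unknown" rather than "distinct".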
…but fuzzy concepts have caveats
Group entity “Mourners of Felix Mendelssohn” (attending
the arrival of his body in Berlin)
See http://led.kmi.open.ac.uk/entity/lexp/1434029100189
• The identifier of the group,
http://data.open.ac.uk/led/agent/Mourners+of+Felix+Mendelssohn/1434029100190
should not be reused when modelling an entry about
Mendelssohn’s funeral service.
See http://led.kmi.open.ac.uk/entity/lexp/1434029387526
– There, the identifier of the group is
http://data.open.ac.uk/led/person/Mourners+at+the+Funeral+Service+of+Felix+Mendelssohn/1434029247629
Blank nodes
• Fallback mechanism for providing data about objects
without having a naming convention for them.
• Reference something not by name, but by description.
• Example:
:performance/Messiah/12345 mo:listener [
    a foaf:Group ;
    dc:description "Foreign ambassadors" ;
    :occupation dbpedia:Ambassador
] .
• Generally not an advisable solution:
– Cannot perform matching on blank nodes
– Querying or detecting changes in the data is much harder
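The matching problem can be illustrated with a small sketch in which triples are plain tuples. It assumes that every parsed document hands out fresh blank-node labels, and it contrasts that with a name minted from the description. The helper names and the example.org base are hypothetical; the LED identifiers shown earlier also carry a numeric suffix, which this sketch omits.

```python
import itertools
from urllib.parse import quote_plus

_counter = itertools.count(1)


def blank_node():
    """Return a fresh, document-local label, as a parser would assign."""
    return f"_:b{next(_counter)}"


def minted_uri(description, base="http://example.org/agent/"):
    """Derive a stable name from the description (hypothetical scheme)."""
    return base + quote_plus(description)


def describe(subject):
    """The same two facts about a group, attached to the given subject."""
    return {
        (subject, "rdf:type", "foaf:Group"),
        (subject, "dc:description", "Foreign ambassadors"),
    }


# Two sources describe the same group. With blank nodes the merged data
# holds two unrelated subjects; with minted names the triples coincide.
merged_blank = describe(blank_node()) | describe(blank_node())
merged_named = (describe(minted_uri("Foreign ambassadors"))
                | describe(minted_uri("Foreign ambassadors")))
```

The merged blank-node data contains four triples about two subjects that cannot be told to be the same group, whereas the named variant collapses to two triples about one subject.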
Ontological classes
• Model vague objects as formally-specified categories rather than named
entities
• e.g. “the class of all people whose occupation is Ambassador and who were
at the Royal Albert Hall on May 12, 1876”
• Pros:
– Allows separation of “known” and “generic” entities
– Semantically cleaner and easier to store and manage
• Cons:
– Still need to make URIs for each class
– They have to be instantiated before they can be used in a listening
experience
– Harder to apply changes to the data without fixed classes
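As a rough illustration of a formally specified category, the sketch below treats a vague class as a named membership condition rather than as an individual. All names, the people and the example.org URI are hypothetical; a real deployment would express the condition in an ontology language rather than in Python.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Person:
    name: str
    occupation: str
    attended: FrozenSet[Tuple[str, date]]   # (venue, day) pairs


@dataclass(frozen=True)
class VagueClass:
    """A category defined by conditions; it still needs its own URI."""
    uri: str
    occupation: str
    venue: str
    day: date

    def has_member(self, person: Person) -> bool:
        return (person.occupation == self.occupation
                and (self.venue, self.day) in person.attended)


concert = ("Royal Albert Hall", date(1876, 5, 12))
ambassadors = VagueClass(
    uri="http://example.org/class/ambassadors-rah-1876-05-12",  # hypothetical
    occupation="Ambassador",
    venue=concert[0],
    day=concert[1],
)
known = Person("A. Example", "Ambassador", frozenset({concert}))
other = Person("B. Example", "Reporter", frozenset({concert}))
```

A known individual can later be recognised as a member (`ambassadors.has_member(known)`) without the class itself ever standing in for a person.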
Countermeasures in LED
• No blank nodes
• For unaligned semantic layers (cf.
example on instruments and
occupations):
– Use lax model properties
– Enforce reuse of external taxonomies
• ‘rich’ real-time recommendations
Countermeasures in LED
Data reconciliation
Access is currently restricted,
but there are plans to open it
to crowd-sourcing
Countermeasures in LED
• Ad-hoc formal models for underspecified data.
• Example: Extended Date/Time Format (standard draft, Library of
Congress, 2012)
– Allows formalisation of underspecified points in time and intervals, e.g.
“187u-05-uu”
– We extended it to support subjective fuzzy intervals (e.g. early/mid/late)
and ranges (from-to)
– Made available in RDF through data.open.ac.uk
• Example 2: GeoSPARQL
– Used to support geospatial queries in Linked Data
– Named entity recognition on arbitrary text for locations (recently)
– We compute location URIs by hashing each description together with all the
locations extracted from it, which are related to it via geosparql:sfIntersects
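To make the underspecified date example concrete, the sketch below turns a string such as "187u-05-uu" into the earliest and latest calendar days it could denote. It is a simplification under stated assumptions: only the YYYY-MM-DD shape with "u" placeholders is handled, a month or day containing any "u" is treated as wholly unknown, and the fuzzy early/mid/late qualifiers and ranges mentioned above are not covered. The function name is hypothetical.

```python
import calendar
from datetime import date


def edtf_bounds(value):
    """Return (earliest, latest) dates compatible with a 'u'-style value."""
    year_s, month_s, day_s = value.split("-")

    # Unknown year digits range over 0..9.
    lo_year = int(year_s.replace("u", "0"))
    hi_year = int(year_s.replace("u", "9"))

    # Simplification: any 'u' makes the whole month unknown.
    if "u" in month_s:
        lo_month, hi_month = 1, 12
    else:
        lo_month = hi_month = int(month_s)

    # Simplification: any 'u' makes the whole day unknown.
    if "u" in day_s:
        lo_day = 1
        hi_day = calendar.monthrange(hi_year, hi_month)[1]  # last day of month
    else:
        lo_day = hi_day = int(day_s)

    return date(lo_year, lo_month, lo_day), date(hi_year, hi_month, hi_day)


earliest, latest = edtf_bounds("187u-05-uu")
```

The result is only an outer envelope: "187u-05-uu" means some day in May of some year in the 1870s, so not every date between the two bounds is a candidate.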
How thick is the mist in LED?
                Named   Vague                           Total
Participants    802     260                             1062
Locations       136     15 (cannot pinpoint)            151*
Times           826     843 (ranges, not qualified)     1669
Musical works   1550    1263                            2813
(*) since database opened to arbitrary experience locations
Figures for LED public dataset
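The proportion of vague entries per category follows directly from the figures above; the short sketch below does the arithmetic. The variable and function names are illustrative.

```python
# (named, vague) counts per category, copied from the table above.
counts = {
    "Participants": (802, 260),
    "Locations": (136, 15),
    "Times": (826, 843),
    "Musical works": (1550, 1263),
}


def vague_share(named, vague):
    """Fraction of entries in a category that are vague."""
    return vague / (named + vague)


shares = {name: vague_share(*pair) for name, pair in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} vague")
```

Times come out as the haziest category, at roughly half of all entries, while locations are the least affected.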
Lessons learnt
• Advantages
– Open-world semantics: minimises the risk of ambiguities
generated by name clashes and allows for coherent management
– Monotonic: data are refined by addition of facts
– Can be reasoned upon by machine-learning agents working
on the native data structure
– Incorporates reuse for the benefit of the whole data cloud.
• Disadvantages
– No reuse entails heavy replication
– Data cleansing may require a large context for detecting entities
that can be reconciled
Lessons learnt
• Most, if not all, representational issues with vagueness can be
addressed in LD without resorting to blank nodes and without
introducing ambiguity.
– Far more powerful than traditional database systems.
• Data providers have yet to reach an agreement, even a tacit one, on:
– representational paradigms for entities commonly at risk of
underspecification, such as spatio-temporal ones;
– how to name their objects.
• Most are making it easy for themselves when it comes to LD
• The way to go is de facto standards
Where to go next
• Model ontological classes as their instances
(equivalence classes?)
• Increase context for fact-based data alignment
(opening reconciliation facilities to the public –
with voting?)
• Argumentation on every statement in LED
• Dissemination of controlled vocabularies and
naming convention for managed vague entities.
Are Linked Data mature for
representing vagueness?
• The technology is.
• The data out there aren’t.
– (but that is the part that can be improved)
Further reading
• Eero Hyvönen, Publishing and Using Cultural Heritage Linked
Data on the Semantic Web (Morgan & Claypool, 2012)
• Daniel J. Lewis and Trevor P. Martin, Managing Vagueness with
Fuzzy in Hierarchical Big Data. In 2015 INNS Conference on Big
Data (Elsevier, 2015), Procedia Computer Science, Vol. 53, p. 19-28
• Fuzzy Logic and the Semantic Web, Elie Sanchez (ed.) (Elsevier,
2006)
Thank you!
Q&A time
alessandro.adamou@open.ac.uk

 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 

Linkage in Haze: challenges and take-home messages of crowd-sourcing vagueness in musical data

  • 1. Linkage in Haze: challenges and take-home messages of crowd-sourcing vagueness in musical data. Alessandro Adamou. Listening to music: people, practices and experiences, Sunday, October 25, 2015
  • 2. What we capture in a listening experience
  • 3. What we capture in a listening experience: what they said | where | documented where | who | what | when | how
  • 4. A well-formed listening experience: On July 19, 2014, Leonie Holmes (a professor of Music in New Zealand)* was listening to Richard Strauss’s “Don Juan” and “Also sprach Zarathustra” and Sarah Ballard’s “Synergos”, played by the NZSO National Youth Orchestra and Alexander Shelley, using harp and double bass (+others?), in the Aotea Centre. (*) plus a generic public, which does not pose a problem in the representation.
  • 5. A worst-case (but more likely) listening experience: One evening between May and September in the late 1950s, a group of war veterans, a reporter and an unknown female were listening to Chopin and an anthem, played by a string orchestra, in a concert hall in South London.
  • 6. Goal: to capture both fact sets as structured data (interconnected or prepared for refinement)
  • 7. Other issues
      • Source
          – unpublished manuscripts
      • Unaligned semantic layers
          – instrument category | instrument name | brand and model
            e.g. Chords | Electric Guitar | Gibson Les Paul Custom Sunburst
          – generic occupations | gender-dependent | personal titles
            e.g. Monarch | King (Queen), Emperor (Empress)… | King of England, fourth Sultan of Zanzibar…
  • 8. Factors contributing to fuzziness
      • Domain knowledge of the evidence author(s)
      • Deterioration of evidence
      • Crowd-sourcing community issues:
          – Misaligned semantics
          – Varying scholarly rigour
          – Popularity of the domain of interest
  • 9. Data representation in LED
      • Linked [Open] Data http://linkeddata.org
          – Formalism for machine-readable and human-readable data
          – Object identifiers are URIs
          – Standard representation and query languages (RDF, SPARQL…)
          – The meaning of links between objects is globally understood.
  • 10. Identity in Linked Data
      • http://musicbrainz.org/artist/f5aca88c-e3c1-4bc2-af33-68a9a9f7b56a#_
          – (the band Killing Joke, as in MusicBrainz)
      • http://bnb.data.bl.uk/id/agent/DailyMirror
          – (The Daily Mirror on the British National Bibliography)
      • http://dbpedia.org/resource/London
          – (London, as in Wikipedia/DBpedia)
      • http://reference.data.gov.uk/doc/day/2015-10-23
          – (last Friday, as in the UK Government Calendar data)
      • http://led.kmi.open.ac.uk/term/Medium.Live
          – (the concept of live music, as in LED)
  • 11. Identity in Linked Data (the same five identifiers as on slide 10). Easy: these are all named entities…
  • 12. Goal: to capture both fact sets as linked data There are no right or wrong ways to do it, only linkable or unlinkable.
  • 13. Linked Data encourage reuse…
      • No two things are either distinct or equal until some LD node asserts or implies otherwise.
          – e.g. Bono on MusicBrainz and Bono on DBpedia
          – Groups too, if it can be demonstrated that they are an exact match
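The "neither distinct nor equal until asserted" rule can be illustrated with a minimal Python sketch that groups identifiers by the owl:sameAs links actually present in the data. This is only an illustration, not how LED implements reconciliation: the function names are hypothetical, and the MusicBrainz identifier is a placeholder rather than a real MBID.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Illustrative identifiers. The MusicBrainz one is a placeholder, not a real MBID.
BONO_DBPEDIA = "http://dbpedia.org/resource/Bono"
BONO_MUSICBRAINZ = "http://musicbrainz.org/artist/EXAMPLE-MBID#_"


def same_as_classes(triples):
    """Map every subject and object to a representative of its owl:sameAs class."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for subject, predicate, obj in triples:
        root_s, root_o = find(subject), find(obj)
        if predicate == OWL_SAME_AS:
            parent[root_s] = root_o  # merge the two classes
    return {node: find(node) for node in list(parent)}


def asserted_equal(triples, a, b):
    """True only if the data link a and b. False means 'not known', not 'distinct'."""
    classes = same_as_classes(triples)
    return a in classes and b in classes and classes[a] == classes[b]


facts = [(BONO_DBPEDIA, "http://xmlns.com/foaf/0.1/name", "Bono")]
before = asserted_equal(facts, BONO_DBPEDIA, BONO_MUSICBRAINZ)

# Adding one fact is enough to change the answer; nothing is retracted.
facts.append((BONO_MUSICBRAINZ, OWL_SAME_AS, BONO_DBPEDIA))
after = asserted_equal(facts, BONO_DBPEDIA, BONO_MUSICBRAINZ)
```

Here `before` is false and `after` is true. A false result only says that no link has been asserted yet, which is the open-world reading the slide describes.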
  • 14. …but fuzzy concepts have caveats
      • Group entity “Mourners of Felix Mendelssohn” (attending the arrival of his body in Berlin). See http://led.kmi.open.ac.uk/entity/lexp/1434029100189
          – The identifier of the group is http://data.open.ac.uk/led/agent/Mourners+of+Felix+Mendelssohn/1434029100190
      • That identifier should not be reused when modelling an entry about Mendelssohn’s funeral service. See http://led.kmi.open.ac.uk/entity/lexp/1434029387526
          – The identifier of that group is http://data.open.ac.uk/led/person/Mourners+at+the+Funeral+Service+of+Felix+Mendelssohn/1434029247629
  • 15. Blank nodes
      • Fallback mechanism for providing data about objects without having a naming convention for them.
      • Reference something not by name, but by description.
      • Example: :performance/Messiah/12345 mo:listener [ a foaf:Group ; dc:description "Foreign ambassadors" ; :occupation dbpedia:Ambassador ] .
      • Generally not an advisable solution:
          – Cannot perform matching on blank nodes
          – Querying or detecting changes in the data is much harder
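Why matching fails on blank nodes can be shown with a small Python sketch. It models graphs as sets of (subject, predicate, object) tuples and merges them the way RDF requires: blank node labels are local to their source, so they are renamed apart on merge. The example.org URIs, the `merge` and `listeners` helpers, and the two "sources" are all assumptions made for illustration.

```python
# Hypothetical identifiers used only for this illustration.
PERFORMANCE = "http://example.org/performance/Messiah/12345"
GROUP_URI = "http://example.org/agent/foreign-ambassadors"


def _rename(term, index):
    """Blank node labels ("_:x") are scoped to one graph, so make them unique."""
    if term.startswith("_:"):
        return f"_:g{index}_{term[2:]}"
    return term


def merge(*graphs):
    """Merge triple sets, keeping the blank nodes of different graphs apart."""
    merged = set()
    for index, graph in enumerate(graphs):
        for s, p, o in graph:
            merged.add((_rename(s, index), p, _rename(o, index)))
    return merged


def listeners(graph, performance):
    """All nodes recorded as listeners of the given performance."""
    return {o for s, p, o in graph if s == performance and p == "mo:listener"}


def source_a(group):
    return {
        (PERFORMANCE, "mo:listener", group),
        (group, "rdf:type", "foaf:Group"),
        (group, "dc:description", "Foreign ambassadors"),
    }


def source_b(group):
    return {
        (PERFORMANCE, "mo:listener", group),
        (group, ":occupation", "dbpedia:Ambassador"),
    }


# Same group described twice: once anonymously, once with a shared URI.
with_blank_nodes = merge(source_a("_:b0"), source_b("_:b0"))
with_uri = merge(source_a(GROUP_URI), source_b(GROUP_URI))
```

With blank nodes the merged data contain two unrelated listener groups, each carrying half of the description. With a shared URI there is one group and the facts accumulate on it.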
  • 16. Ontological classes
      • Model vague objects as formally-specified categories rather than named entities
      • e.g. “the class of all people whose occupation is Ambassador and who were at the Royal Albert Hall on May 12, 1876”
      • Pros:
          – Allows separation of “known” and “generic” entities
          – Semantically cleaner and easier to store and manage
      • Cons:
          – Still need to make URIs for each class
          – They have to be instantiated before they can be used in a listening experience
          – Harder to apply changes to the data without fixed classes
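A rough Python sketch of the idea, under stated assumptions: the class is defined by three restrictions (occupation, place, date), its URI is derived deterministically from that definition, and an instance has to be minted before the class can take part in a listening experience. The namespace, the `VagueClass` type and the URI layout are hypothetical and are not LED's actual scheme.

```python
from dataclasses import dataclass
from urllib.parse import quote

BASE = "http://example.org/class/"  # hypothetical namespace


@dataclass(frozen=True)
class VagueClass:
    """A category defined by restrictions rather than by a name."""
    occupation: str
    place: str
    date: str  # ISO 8601 date, e.g. "1876-05-12"

    def uri(self):
        """The class still needs a URI; derive it from the definition."""
        parts = (self.occupation, self.place, self.date)
        return BASE + "/".join(quote(part, safe="") for part in parts)

    def has_member(self, person):
        """Membership test against a dict describing one attendance."""
        return (
            person.get("occupation") == self.occupation
            and person.get("place") == self.place
            and person.get("date") == self.date
        )

    def instance_uri(self, label):
        """Mint an instance URI so the class can be used as a listener."""
        return f"{self.uri()}#{quote(label, safe='')}"


ambassadors = VagueClass("Ambassador", "Royal Albert Hall", "1876-05-12")
same_definition = VagueClass("Ambassador", "Royal Albert Hall", "1876-05-12")
attendee = {
    "occupation": "Ambassador",
    "place": "Royal Albert Hall",
    "date": "1876-05-12",
}
```

Two contributors who state the same restrictions obtain the same class URI, which is what makes the class reusable. The drawback noted in the list above still applies: changing any restriction yields a different URI.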
  • 17. Countermeasures in LED
      • No blank nodes
      • For unaligned semantic layers (cf. example on instruments and occupations):
          – Use lax model properties
          – Enforce reuse of external taxonomies
      • ‘rich’ real-time recommendations
  • 18. Countermeasures in LED: data reconciliation. Currently with restricted access, but there are plans to open it to crowd-sourcing.
  • 19. Countermeasures in LED
      • Ad-hoc formal models for underspecified data.
      • Example: Extended Date/Time Format (standard draft, Library of Congress, 2012)
          – Allows formalisation of underspecified points in time and intervals, e.g. “187u-05-uu”
          – We extended it to support subjective fuzzy intervals (e.g. early/mid/late) and ranges (from-to)
          – Made available in RDF through data.open.ac.uk
      • Example 2: GeoSPARQL
          – Used to support geospatial queries in Linked Data
          – Named entity recognition on arbitrary text for locations (recently)
          – We compute location URIs by hashing their descriptions and all the locations extracted from them, which are related via geosparql:sfIntersects
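To make the “187u-05-uu” notation concrete, here is a minimal Python sketch that turns such a value into the earliest and latest calendar dates it could denote. It assumes the 2012 draft convention that `u` marks an unspecified digit, handles only the `YYYY-MM-DD` shape with a wholly unspecified month or day, and does not cover LED's own early/mid/late or from-to extensions. The function name `edtf_bounds` is hypothetical.

```python
import calendar
from datetime import date


def edtf_bounds(value):
    """Return (earliest, latest) dates for 'YYYY-MM-DD' where 'u' is unspecified.

    The result is an enclosing interval: "187u-05-uu" means some day in May of
    some year in the 1870s, which lies between 1870-05-01 and 1879-05-31.
    """
    year_s, month_s, day_s = value.split("-")

    # Unspecified year digits range from 0 to 9.
    lo_year = int(year_s.replace("u", "0"))
    hi_year = int(year_s.replace("u", "9"))

    if month_s == "uu":
        lo_month, hi_month = 1, 12
    else:
        lo_month = hi_month = int(month_s)  # partial values like "1u" raise ValueError

    if day_s == "uu":
        lo_day = 1
        hi_day = calendar.monthrange(hi_year, hi_month)[1]  # last day of that month
    else:
        lo_day = hi_day = int(day_s)

    return date(lo_year, lo_month, lo_day), date(hi_year, hi_month, hi_day)


earliest, latest = edtf_bounds("187u-05-uu")
```

A fully specified value such as "1876-05-12" yields the same date for both bounds, so named and vague times can be compared with one interval test.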
  • 20. How thick is the mist in LED? (Figures for the LED public dataset)
                        Named   Vague                          Total
      Participants        802     260                           1062
      Locations           136      15 (cannot pinpoint)          151*
      Times               826     843 (ranges, not qualified)   1669
      Musical works      1550    1263                           2813
      (*) since the database opened to arbitrary experience locations
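The proportions behind these figures can be worked out with a few lines of Python. The numbers are taken directly from the slide; the dictionary layout and the `vague_share` helper are just one way to arrange the arithmetic.

```python
# (named, vague, total) per entity type, as reported for the LED public dataset.
FIGURES = {
    "Participants": (802, 260, 1062),
    "Locations": (136, 15, 151),
    "Times": (826, 843, 1669),
    "Musical works": (1550, 1263, 2813),
}


def vague_share(named, vague):
    """Fraction of entities of one type that are vague."""
    return vague / (named + vague)


shares = {}
for kind, (named, vague, total) in FIGURES.items():
    assert named + vague == total  # the slide's totals are consistent
    shares[kind] = vague_share(named, vague)

for kind, share in shares.items():
    print(f"{kind}: {share:.1%} vague")
```

Roughly a quarter of participants, a tenth of locations, half of the times and just under half of the musical works are vague.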
  • 21. Lessons learnt
      • Advantages
          – Open-world semantics: minimises the risk of ambiguities generated by name clashes and allows for coherent management
          – Monotonic: data are refined by addition of facts
          – Can be reasoned upon by machine-learning agents working on the native data structure
          – Incorporates reuse for the benefit of the whole data cloud.
      • Disadvantages
          – No reuse entails heavy replication
          – Data cleansing may require a large context for detecting entities that can be reconciled
  • 22. Lessons learnt
      • Most, if not all, representational issues with vagueness can be addressed in LD without resorting to blank nodes and without introducing ambiguity.
          – Way more powerful than traditional database systems.
      • Data providers have yet to reach an agreement, even a tacit one, on:
          – representational paradigms for entities commonly at risk of underspecification, such as spatio-temporal ones;
          – how to name their objects.
      • Most are making it easy for themselves when it comes to LD
      • The way to go is de facto standards
  • 23. Where to go next
      • Model ontological classes as their instances (equivalence classes?)
      • Increase context for fact-based data alignment (opening reconciliation facilities to the public, with voting?)
      • Argumentation on every statement in LED
      • Dissemination of controlled vocabularies and naming conventions for managed vague entities.
  • 24. Are Linked Data mature for representing vagueness?
      • The technology is.
      • The data out there aren’t.
          – (but that is the part that can be improved)
  • 25. Further reading
      • Eero Hyvönen, Publishing and Using Cultural Heritage Linked Data on the Semantic Web (Morgan & Claypool, 2012)
      • Daniel J. Lewis and Trevor P. Martin, Managing Vagueness with Fuzzy in Hierarchical Big Data. In 2015 INNS Conference on Big Data (Elsevier, 2015), Procedia Computer Science, Vol. 53, pp. 19-28
      • Elie Sanchez (ed.), Fuzzy Logic and the Semantic Web (Elsevier, 2006)