
Modeling the Complexity of Music Metadata in Semantic Graphs for Exploration and Discovery


Presentation of DOREMUS work at the 4th International Digital Libraries for Musicology workshop (DLfM 2017)

Published in: Engineering

  1. Modeling the Complexity of Music Metadata in Semantic Graphs for Exploration and Discovery
     ANR-14-CE24-0020
     Pasquale Lisena, Raphaël Troncy, Konstantin Todorov, Manel Achichi
     @pasqlisena | pasquale.lisena@eurecom.fr
     Digital Libraries for Musicology (DLfM) Workshop, 28th October 2017, Shanghai Conservatory of Music
  2. https://list.indiana.edu/sympa/arc/mla-l/2017-08/msg00248.html
  3. Information contained in librarian knowledge but not publicly available
     Hard question for current music models and ontologies
     Different practical implications (MIR, concert and radio programming, music recommendation)
  5. Project Goals
     • Improve music description to foster music exchange and reuse
     • Connect sources, multiply usage, enrich user experience
     • Music-specific data model
     • Vocabularies and data publicly available as Linked Open Data
     • Tools for visualization, interconnection, recommendation
     • Experience and praxis for other institutions
  6. Source Datasets
     • Works 62 550 | XML; Scores 9 154 | XML; Concerts 340 609 | XML; Discs 9 500 | XML
     • Works 6 846 | UNIMARC; Scores 30 319 | UNIMARC; Concerts 5 164 | XML; Discs 8 602 | XML
     • Works 135 940 | INTERMARC; Scores 89 184 | INTERMARC
  7. Source Datasets
     DATASET: Works, Scores, Concerts, Discs
     Classic work, Jazz improvisation, Ethnic/World/Traditional music
     How to manage this complex metadata?
  8. State of the Art: Music Ontology
     - One of the first examples of describing music using the Semantic Web
     - Extends FRBR, Timeline Ontology, Event Ontology
     - Uses vocabularies for Keys, Musical Instruments (by MusicBrainz), Genres (DBpedia)
     Yves Raimond, Samer A. Abdallah, Mark B. Sandler, and Frederick Giasson. 2007. The Music Ontology. In 8th International Conference on Music Information Retrieval (ISMIR). 417–422
  9. The DOREMUS model
     Diagram: F15 Work, F22 Expression, F28 Expression Creation
     - Music-specific extension of FRBRoo
     - Triplet pattern: Work-Expression-Event
     - Dynamic: every triplet is autonomous and linkable to the other ones
     - Relies on Linked Data principles (everything is a URI, RDF model)
     http://data.doremus.org/ontology
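The Work-Expression-Event triplet pattern can be sketched as a handful of plain triples. This is only an illustration under stated assumptions: the class labels come from the slides, "R3 is realized in" appears in the example graph on the next slide, while the two event properties ("R17 created", "R19 created a realisation of") are assumed FRBRoo labels, and the `work/` and `event/` URI paths and the UUID scheme are hypothetical.

```python
import uuid
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


BASE = "http://data.doremus.org/"


def work_expression_event(seed: str) -> List[Triple]:
    """Build one autonomous Work-Expression-Event triplet for a seed string."""
    # Hypothetical identifier scheme: one deterministic UUID per triplet.
    ident = uuid.uuid5(uuid.NAMESPACE_URL, seed)
    work = f"{BASE}work/{ident}"              # path is an assumption
    expression = f"{BASE}expression/{ident}"  # path seen in the slides
    event = f"{BASE}event/{ident}"            # path is an assumption
    return [
        Triple(work, "rdf:type", "F14 Work"),
        Triple(expression, "rdf:type", "F22 Expression"),
        Triple(event, "rdf:type", "F28 Expression Creation"),
        Triple(work, "R3 is realized in", expression),
        # Assumed FRBRoo property labels linking the event to the other two nodes.
        Triple(event, "R17 created", expression),
        Triple(event, "R19 created a realisation of", work),
    ]


triples = work_expression_event("Sonate pour violoncelle et piano no 1")
for t in triples:
    print(t.subject, "|", t.predicate, "|", t.obj)
```

Because every node is a URI, a second triplet (for example a performance or a publication) can point at any of these three nodes without modifying them.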
  10. Diagram: example graph for “Sonate pour violoncelle et piano no 1”@fr in the DOREMUS model
      Classes: F14 Work, F15 Complex Work, F19 Publication Work, F22 Expression, F24 Publication Expression, F28 Expression Creation, F30 Publication Event, E7 Activity, M2 Opus Statement, M6 Casting, M23 Casting Detail (twice), M42 Performed Expression Creation, M43 Performed Expression, M44 Performed Work
      Properties: R3 is realized in, R10 has member, P4 has time span, P7 took place at, P9 consists of, P14 carried out by, P98 born, P100 died, P102 has title, P165 incorporates, U2 foresees mop, U4 had princeps publication, U5 had premiere, U12 has genre, U17 has opus statement, U30 quantity, U31 had function of type, U38 has descriptive expression, U54 is performed expression of
      Titles: “Sonate pour violoncelle et piano no 1”@fr, "Sonates", "Sonata in F"
      Opus statement: 5, 1
      Person: Ludwig van Beethoven, Ludwig von Beethoven; born 1770, died 1827
      Function: composer, compositeur@fr, compositore@it
      Genre: Sonata, sonata@it, sonate@fr, klaviersonate@de
      Key: F Major, F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es
      Casting: Piano (Pianoforte@it, Fortepian@pl), quantity 1; Cello (Violoncello@it, Violoncelle@fr), quantity 1
      Time spans and places: 1796; 1796, Berlin; 1797, Vienna
  12. Controlled Vocabularies
      <http://data.doremus.org/vocabulary/iaml/mop/wsa>
      • Alternate labels: “Sax”@en
      • Alternate languages: “Saxophone”@en, “Saxofone”@pt, “Sassofono”@it, “Saxophone”@fr
      • Notes: “English term is preferred globally”
      • Hierarchy: “Woodwinds”@en, “Legni”@it, “Baritone Saxophone”@en
      APPLICATIONS
      • Disambiguation
      • Search
      • Graph-based analysis
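As a rough illustration of the disambiguation and search uses listed on this slide, multilingual labels can be indexed so that any of them resolves to the same concept URI. The URI and labels below are taken from the slide; the dictionary layout and the helper names `build_index` and `string_to_uri` are hypothetical, not part of any DOREMUS tool.

```python
from typing import Dict, List, Optional, Set

SAX_URI = "http://data.doremus.org/vocabulary/iaml/mop/wsa"

# Labels per concept, keyed by language tag (layout is an assumption).
VOCABULARY: Dict[str, Dict[str, List[str]]] = {
    SAX_URI: {
        "en": ["Saxophone", "Sax"],
        "pt": ["Saxofone"],
        "it": ["Sassofono"],
        "fr": ["Saxophone"],
    },
}


def build_index(vocabulary: Dict[str, Dict[str, List[str]]]) -> Dict[str, Set[str]]:
    """Map every lower-cased label to the set of concept URIs that carry it."""
    index: Dict[str, Set[str]] = {}
    for uri, labels_by_lang in vocabulary.items():
        for labels in labels_by_lang.values():
            for label in labels:
                index.setdefault(label.lower(), set()).add(uri)
    return index


def string_to_uri(text: str, index: Dict[str, Set[str]]) -> Optional[str]:
    """Return the concept URI for a label, or None if unknown or ambiguous."""
    candidates = index.get(text.strip().lower(), set())
    return next(iter(candidates)) if len(candidates) == 1 else None


index = build_index(VOCABULARY)
print(string_to_uri("Sassofono", index))
```

Returning `None` for ambiguous labels is a deliberate choice in this sketch: a label shared by several concepts needs context (or a human) to resolve.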
  13. Controlled Vocabularies
      GENRES: Diabolo (629), IAML (607), Itema3 (212), Redomi (313), RAMEAU (654)
      Medium of performance: MIMO (2480), Itema3 (314), IAML (419), Diabolo (2117), RAMEAU (876), Redomi (179)
      Musical keys: 29
      Modes: 22
      Catalogues: 151
      Derivation types: 16
      Functions: ~30, coming soon
      http://data.doremus.org/vocabularies
  14. Interlinking: Vocabularies
      http://data.doremus.org/vocabulary/iaml/genre/cha “cha-cha-cha”
      =
      http://data.doremus.org/vocabulary/diabolo/genre/cha_cha_cha “cha cha cha”
      http://yamplusplus.lirmm.fr/
      • String matching + graph traversal
      • Interface for validating the matching
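The string-matching half of this alignment can be illustrated with a small label comparison. This is a minimal sketch and not how YAM++ works internally: it normalises punctuation and case, then scores the labels with `difflib.SequenceMatcher` from the Python standard library. The function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(label: str) -> str:
    """Lower-case a label and collapse punctuation, underscores and spaces."""
    return re.sub(r"[\W_]+", " ", label.lower()).strip()


def label_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two labels after normalisation."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


iaml = ("http://data.doremus.org/vocabulary/iaml/genre/cha", "cha-cha-cha")
diabolo = ("http://data.doremus.org/vocabulary/diabolo/genre/cha_cha_cha", "cha cha cha")

score = label_similarity(iaml[1], diabolo[1])
print(iaml[0], "=", diabolo[0], "score:", score)
```

A candidate pair found this way would still go through graph traversal and manual validation, as the slide indicates.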
  15. 001 FRBNF139081882FR
      100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827
      144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
      Annotated parts of field 144: LANG, TITLE, MOP, OPUS, KEY
      “MARC must die” -- Roy Tennant, 2002
      http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die/#_
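To make the record above easier to read, here is a small sketch that splits a MARC field into its `$`-prefixed subfields. It is not the marc2rdf parser. The mapping of subfield codes to TITLE, MOP, OPUS and KEY is inferred from the order of the annotations on the slide and should be treated as an assumption.

```python
import re
from typing import List, Tuple

FIELD_144 = "$w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"

# Assumed meaning of the subfield codes, inferred from the slide annotations.
ASSUMED_LABELS = {"a": "TITLE", "b": "MOP", "p": "OPUS", "t": "KEY"}


def parse_subfields(field: str) -> List[Tuple[str, str]]:
    """Split a field into (code, value) pairs; a list keeps repeated codes."""
    return re.findall(r"\$(\w)([^$]*)", field)


subfields = parse_subfields(FIELD_144)
for code, value in subfields:
    print(ASSUMED_LABELS.get(code, code), "->", value)
```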
  16. MARC issues
      • Different variants: UNIMARC, INTERMARC
      • Free text fields: different practices in describing the same information, e.g. “Op. 27 n. 2” vs. “Op. 27 no 2”
      • Frequent mistakes in editorial work: wrong fields, typos, wrong punctuation
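The free-text problem can be illustrated with the opus variants quoted on this slide. The regular expression below is a hypothetical sketch that covers only these spellings (plus the comma variant from the MARC record); real catalogue data would need many more cases.

```python
import re
from typing import Optional, Tuple

# Matches "Op. 27 n. 2", "Op. 27 no 2", "Op. 27, no 2", "op 27 n° 2", "Op. 5".
OPUS_RE = re.compile(r"op\.?\s*(\d+)\s*,?\s*(?:n[o°]?\.?\s*(\d+))?", re.IGNORECASE)


def parse_opus(text: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (opus number, sub-number or None), or None if nothing matches."""
    match = OPUS_RE.search(text)
    if match is None:
        return None
    number = int(match.group(1))
    sub = int(match.group(2)) if match.group(2) else None
    return number, sub


variants = ["Op. 27 n. 2", "Op. 27 no 2", "Op. 27, no 2"]
parsed = [parse_opus(v) for v in variants]
print(parsed)
```

Once the variants are reduced to the same tuple, they can be compared or turned into a single opus statement.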
  17. Data conversion
      marc2rdf, expert-made mapping rules, controlled vocabularies
      https://github.com/DOREMUS-ANR/marc2rdf/
      TASKS
      • Field parsing and mapping
      • NLP techniques
      • Graph generation
      • String2URI
  18. Interlinking: Works
      http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea “Sonata quasi una fantasia”
      =
      http://data.doremus.org/expression/22679001-2cd0-3f84-b502-0f337429966f “Quasi una fantasia”
      Legato: https://github.com/DOREMUS-ANR/legato
      F-measure > 0.85, Precision > 0.87, Recall > 0.82
  19. Interlinking: Works
      1. Data cleaning: removing “noisy” properties, e.g. identifiers, comments, …
      2. Instance profiling: represent each resource as a sub-graph
      3. Instance indexing and matching: convert the sub-graph into a set of keywords in order to apply text document matching techniques
      4. Post-processing: clustering of the datasets, identifying false positives from the previous steps
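Steps 1 to 3 above can be sketched in a few lines. This is not Legato's implementation: resources are modelled as plain dictionaries, the list of noisy properties is hypothetical, the composer and the third resource are made-up illustration data, and Jaccard overlap stands in for the text document matching techniques mentioned on the slide.

```python
import re
from typing import Dict, List, Set

Resource = Dict[str, List[str]]

# Step 1: properties treated as noise (hypothetical list).
NOISY_PROPERTIES = {"identifier", "comment"}


def keywords(resource: Resource) -> Set[str]:
    """Steps 1-3: drop noisy properties, then flatten the values to keywords."""
    words: Set[str] = set()
    for prop, values in resource.items():
        if prop in NOISY_PROPERTIES:
            continue
        for value in values:
            words.update(re.findall(r"\w+", value.lower()))
    return words


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of keywords that two resources have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Titles come from the slides; the other values are illustration data.
res_a = {"identifier": ["d72301f0"], "title": ["Sonata quasi una fantasia"],
         "composer": ["Ludwig van Beethoven"]}
res_b = {"identifier": ["22679001"], "title": ["Quasi una fantasia"],
         "composer": ["Ludwig van Beethoven"]}
res_c = {"identifier": ["00000000"], "title": ["cha cha cha"],
         "composer": ["unknown"]}

same_work = jaccard(keywords(res_a), keywords(res_b))
different_work = jaccard(keywords(res_a), keywords(res_c))
print(same_work, different_work)
```

Step 4 (clustering and false-positive detection) is left out of this sketch.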
  20. Visualizing
      http://overture.doremus.org
      Prototype of a web app that uses the DOREMUS dataset
      • Follow the links like in the graph
      • Enriched experience: DBpedia, GeoNames, …
      • Timeline of related events
      • Similar works recommendation
  21. Future Work
      • Pivot vocabularies of Genres and MoPs, as a result of the interconnection task
      • Recommendation system; first step: “Combining Music Specific Embeddings for Computing Artist Similarity” @ISMIR2017
      • Schema.org injection in all pages; goals: SEO, simplification of the data in order to extend their usage
  22. But what about this?
  23. results
      This and more questions: https://github.com/DOREMUS-ANR/knowledge-base/tree/master/query-examples
  24. Links
      • DOREMUS Website: http://www.doremus.org/
      • GitHub page with tools, converters, ontologies, ...: https://github.com/DOREMUS-ANR/
      • Dataset & SPARQL Endpoint: https://data.doremus.org/sparql, https://data.doremus.org/fct
      • OVERTURE: https://overture.doremus.org/
      • This presentation: https://www.slideshare.net/squalelis
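For readers who want to try the SPARQL endpoint listed above, here is a minimal sketch that prepares a query request using only the Python standard library. The endpoint URL comes from the slide; the generic query is a placeholder rather than one of the project's query examples, and passing the query as a `query` URL parameter with a JSON `Accept` header follows the standard SPARQL protocol, assuming the endpoint supports it.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://data.doremus.org/sparql"

# Placeholder query: any five triples from the dataset.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare (but do not send) a GET request for a SPARQL SELECT query."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(QUERY)
print(request.full_url)

# To run the query, uncomment the lines below (requires network access):
# import json
# from urllib.request import urlopen
# with urlopen(request) as response:
#     print(json.load(response)["results"]["bindings"])
```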
