DOREMUS - a Graph of Interlinked Musical Work

Presentation at ISWC 2018
17th International Semantic Web Conference
11th October, 2018 - Monterey, California, USA

Published in: Data & Analytics

  1. DOREMUS: a Graph of Interlinked Musical Work. Pasquale Lisena (EURECOM, France), @pasqLisena. Authors: M. Achichi, P. Lisena, K. Todorov, R. Troncy, J. Delahousse.
  2. Which works did Mozart compose before he was 10? How many works were composed and first performed in the same city? Which composers conducted their own work in a performance during the last decade?
  3. Music knowledge graph: metadata about artists, works, performances, scores. Tools for converting and interlinking: used for building the knowledge graph; open-source, reusable.
  4. Music is complex.
  5. POP vs. CLASSICAL. "For pop music the experience of the music is really defined by the recording. When it comes to classical music, on the other hand, it's much more about the composition itself, because even though the interpretation can vary in various subtle ways." Source: M. Lasar (2011). Digging into Pandora's Music Genome with musicologist Nolan Gasser. https://arstechnica.com/tech-policy/2011/01/digging-into-pandoras-music-genome-with-musicologist-nolan-gasser/
  6. POP vs. CLASSICAL. Pop: track-based; 60 years of history; songs; major, minor. Classical: work-based; a thousand years, from Gregorian chant to a work written last Tuesday; multi-movement works; polyphonic, homophonic, monophonic.
  7. (No text on this slide.)
  8. Music archives have very detailed knowledge. PROBLEMS: ● Multiple formats ● No interoperability ● Need for discovering overlapping knowledge ● Information codified as free text ● Not always publicly accessible. APPROACH: Semantic Web!
  9. Improve music description to foster music exchange and reuse. Travel to the heart of the musical archives in France's greatest institutions. Connect sources, multiply usage, enrich user experience.
  10. Building the DOREMUS graph: DATA MODELING → DATA CONVERSION (marc2rdf, string2vocabulary, ... custom converters) → DATA LINKING (legato) → LINK VALIDATION.
  11. [DATA MODELING] The DOREMUS Model: music-specific extension of FRBRoo; dynamic, made up of autonomous modules that can be combined; relies on Linked Data principles (everything is a URI, RDF model). Diagram labels: FRBR, bibliographic records, museum information. Reference: Choffé, Pierre, and Françoise Leresche. DOREMUS: connecting sources, enriching catalogues and user experience. In 24th IFLA World Library and Information Congress. 2016.
  12. [DATA MODELING] The building blocks: Work-Expression-Event. F14 Work → R3 is realized in → F22 Expression. F28 Expression Creation → R17 created → F22 Expression. F28 Expression Creation → R19 created a realization of → F14 Work.
  13. [DATA MODELING] Diagram: DOREMUS description of a Beethoven cello sonata (layout lost in extraction). Recoverable content:
        Classes: F14 Work, F15 Complex Work, F19 Publication Work, M44 Performed Work, F22 Expression, F24 Publication Expression, M43 Performed Expression, F28 Expression Creation, F30 Publication Event, M42 Performed Expression Creation, E7 Activity, M2 Opus Statement, M6 Casting, M23 Casting Detail.
        Properties: R3 is realized in, R10 has member, R17 created, R19 created a realization of, U2 foresees mop, U4 had princeps publication, U5 had premiere, U11 has key, U12 has genre, U13 has casting, U17 has opus statement, U30 quantity, U31 had function of type, U38 has descriptive expression, U54 is performed expression of, P4 has time span, P7 took place at, P9 consists of, P14 carried out by, P98 born, P100 died, P102 has title, P165 incorporates.
        Values: titles "Sonate pour violoncelle et piano no 1"@fr, "Sonates", "Sonata in F"; Ludwig van Beethoven (also "Ludwig von Beethoven"), born 1770, died 1827; function composer (compositeur@fr, compositore@it); opus statement 5, 1; genre Sonata (sonata@it, sonate@fr, klaviersonate@de); key F Major (F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es); casting 1 Piano (Pianoforte@it, Fortepian@pl) and 1 Cello (Violoncello@it, Violoncelle@fr); time spans 1796, 1796 (Berlin), 1797 (Vienna).
  14. [DATA MODELING] Diagram, casting detail: F22 Expression → U13 has casting → M6 Casting, with two M23 Casting Detail nodes: (U30 quantity 1; U2 foresees mop Piano, Pianoforte@it, Fortepian@pl) and (U30 quantity 1; U2 foresees mop Cello, Violoncello@it, Violoncelle@fr).
  15. Controlled Vocabularies for Music Metadata. GENRES (interlinked): Diabolo, IAML, Itema3, Redomi, RAMEAU. Medium of performance (interlinked): MIMO, Itema3, IAML, Diabolo, RAMEAU, Redomi. Also: musical keys, modes, catalogues, derivation types, functions; more available at http://data.doremus.org/vocabularies. 23 families of vocabularies · 11,000+ concepts · 610 links between terms. Published at ISMIR 2018.
  16. [DATA CONVERSION] Dealing with different formats. Works: INTERMARC; Scores: INTERMARC; Discs: INTERMARC. Works: UNIMARC; Scores: INTERMARC; Performances: XML. Works, Recordings, Scores: 3 different XML sources. A pre-digital archive format in Radio France.
  17. Source datasets (count | format):
        Works 62 550 | XML;  Scores 9 154 | XML;  Concerts 340 609 | XML;  Discs 9 500 | XML
        Works 6 846 | UNIMARC;  Scores 30 319 | UNIMARC;  Concerts 5 164 | XML;  Discs 8 602 | XML
        Works 135 940 | INTERMARC;  Scores 89 184 | INTERMARC
  18. Source datasets. DATASET: Works, Scores, Concerts, Discs. Content: classic work, jazz improvisation, ethnic/world/traditional music.
  19. [DATA CONVERSION] MARC FILE:
        001 FRBNF139081882FR
        100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827
        144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
      Annotations on field 144: LANG (fre), TITLE (Sonates), MOP (Piano), OPUS (Op. 27, no 2), KEY (Do dièse mineur).
      "MARC must die" (Roy Tennant, 2002). http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die
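The 144 field on the slide above packs language, title, medium of performance, opus and key into `$`-delimited subfields. A minimal sketch of splitting such a field value, assuming every subfield starts with `$` followed by a one-character code; the function name `parse_subfields` is hypothetical and this is not the marc2rdf implementation.

```python
import re


def parse_subfields(field_value: str) -> dict:
    """Split a MARC field value into {subfield code: content}.

    Assumption: each subfield is '$' + one alphanumeric code + content,
    as in the example record. A repeated code would overwrite the
    earlier value in this simplified version.
    """
    # Each match is (code, content up to the next '$' or end of string).
    pairs = re.findall(r"\$(\w)([^$]*)", field_value)
    return {code: content for code, content in pairs}


field_144 = "$w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"
subfields = parse_subfields(field_144)
print(subfields["a"], "|", subfields["b"], "|", subfields["p"], "|", subfields["t"])
```

The language code sits inside the positional `$w` subfield, so extracting it would need a further, format-specific rule that the slide does not spell out.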
  20. [DATA CONVERSION] marc2rdf. Inputs: MARC files, mapping rules. MARC PARSER: ● Parsing of the file ● Interpretation of the fields ● Graph generation
  21. [DATA CONVERSION] MAPPING RULES for the field 144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
        UNIT OF INFORMATION: F22 Expression: Opus Number
        PATH: F22 Self-Contained Expression U17 has opus statement M2 Opus Statement [U42 has opus number M12 Opus Number] + [U43 has opus subnumber M13 Opus Subnumber]
        INTERMARC BNF: TUM : 144 $p, chain of digits; TUM : 144 $p, chain of digits before the comma
        TRANSFER RULE: Remove the abbreviation "Op." before the number
        EXAMPLE: 144 $pOp. 352 --> M12 = 352; 144 $pOp. 27, no 2 --> M12 = 27, M13 = 2
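The transfer rule on the slide above can be sketched in a few lines. This is an illustration under stated assumptions, not the marc2rdf code: the function name `parse_opus` is hypothetical, and the `, no N` pattern for the subnumber is generalised from the single example on the slide.

```python
import re
from typing import Optional, Tuple


def parse_opus(subfield_p: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (M12 opus number, M13 opus subnumber) from a 144 $p value."""
    # Transfer rule from the slide: remove the abbreviation "Op." before the number.
    text = re.sub(r"^\s*Op\.\s*", "", subfield_p)
    # Opus number: the chain of digits at the start (before any comma).
    number = re.match(r"\d+", text)
    # Opus subnumber: digits after ", no" (assumed pattern, see lead-in).
    sub = re.search(r",\s*no\s*(\d+)", text)
    return (number.group(0) if number else None,
            sub.group(1) if sub else None)


print(parse_opus("Op. 352"))
print(parse_opus("Op. 27, no 2"))
```

Values are kept as strings because the slide does not say whether M12 and M13 are typed as numbers in the graph.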
  22. [DATA CONVERSION] marc2rdf: MARC PARSER + FREE TEXT INTERPRETER. Inputs: MARC files, vocabularies. ● Extracting info from the text through empirical rules ● Disambiguation of vocabulary terms and artists. Example text: "1st performance in Moscow, December 29, 1956, by Mstislav Rostropovich on cello and A. Dedukhin on piano". Example URI: http://data.doremus.org/performance/8abb8e71-1593-36b9-a998-80b437258ef4
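As an illustration of what one "empirical rule" for free text could look like, here is a sketch that handles only the sentence pattern quoted on the slide. The regular expression, the name `parse_premiere` and the output shape are assumptions made for this example; the actual interpreter rules are not shown in the presentation, and the disambiguation step against vocabularies and artists is left out.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Assumed pattern: "1st performance in <place>, <Month> <day>, <year>, by <performers>"
PREMIERE = re.compile(
    r"1st performance in (?P<place>[^,]+), "
    r"(?P<month>[A-Z][a-z]+) (?P<day>\d{1,2}), (?P<year>\d{4}), "
    r"by (?P<performers>.+)"
)


def parse_premiere(note: str):
    """Extract place, date and (name, instrument) pairs, or None if no match."""
    m = PREMIERE.match(note)
    if m is None or m.group("month") not in MONTHS:
        return None
    performers = []
    for part in m.group("performers").split(" and "):
        # "<name> on <instrument>"; instrument is empty if " on " is absent.
        name, _, instrument = part.partition(" on ")
        performers.append((name.strip(), instrument.strip()))
    when = date(int(m.group("year")),
                MONTHS.index(m.group("month")) + 1,
                int(m.group("day")))
    return {"place": m.group("place"), "date": when, "performers": performers}


note = ("1st performance in Moscow, December 29, 1956, by Mstislav "
        "Rostropovich on cello and A. Dedukhin on piano")
premiere = parse_premiere(note)
print(premiere)
```

In a full pipeline the extracted strings ("cello", "Moscow", the performer names) would still have to be resolved to URIs, which is the disambiguation step the slide mentions.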
  23. [DATA CONVERSION] marc2rdf: MARC PARSER + FREE TEXT INTERPRETER + STRING 2 VOCABULARY. Inputs: MARC files, vocabularies. ● Replace labels with URIs from controlled vocabularies. Example: "Violoncelle"@fr becomes <http://www.mimo-db.eu/InstrumentsKeywords/3582>
  24. [DATA CONVERSION] STRING 2 VOCABULARY (https://github.com/DOREMUS-ANR/string2vocabulary). ● Match against a family of vocabularies ● 2 passes ○ Exact label + language ○ Exact label, any language ● Correction of editorial mistakes. Diagram labels: "Soprano"@it; MIMO, IAML, DIABOLO, ITEMA3, REDOMI, RAMEAU; "C Major"@en; GENRE; KEY; vocabulary:key/c.
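The two-pass lookup described on the slide above can be sketched as follows. This is not the string2vocabulary code: the function name `match_label`, the case-insensitive comparison and the `example.org` URIs are assumptions for illustration; only the MIMO URI comes from the presentation.

```python
from typing import List, Optional, Tuple

# Toy vocabulary family: (label, language, concept URI).
# The example.org URIs are hypothetical placeholders.
VOCABULARY: List[Tuple[str, str, str]] = [
    ("Violoncelle", "fr", "http://www.mimo-db.eu/InstrumentsKeywords/3582"),
    ("alto", "fr", "http://example.org/mop/viola"),
    ("alto", "en", "http://example.org/mop/alto-voice"),
]


def match_label(label: str, lang: str,
                vocabulary: List[Tuple[str, str, str]]) -> Optional[str]:
    """Return the URI for a label, or None when nothing matches."""
    wanted = label.strip().casefold()  # case-insensitivity is an assumption
    # Pass 1: exact label + language.
    for text, text_lang, uri in vocabulary:
        if text.casefold() == wanted and text_lang == lang:
            return uri
    # Pass 2: exact label, any language (first hit in list order wins).
    for text, _text_lang, uri in vocabulary:
        if text.casefold() == wanted:
            return uri
    return None


print(match_label("alto", "fr", VOCABULARY))         # pass 1, French sense
print(match_label("alto", "en", VOCABULARY))         # pass 1, English sense
print(match_label("Violoncelle", "it", VOCABULARY))  # pass 2 fallback
```

The "alto" entries show why the language-aware pass runs first: the same string names a viola in French and a voice type in English, so falling back to any language too early would pick the wrong concept.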
  25. [DATA CONVERSION] Converters, combined with STRING 2 VOCABULARY: INTERMARC → marc2rdf → GRAPH BNF; UNIMARC → marc2rdf → GRAPH PHILHARMONIE; EUTERPE XML → euterpe converter → GRAPH EUTERPE; ITEMA3 XML → itema3 converter → GRAPH ITEMA3; DIABOLO XML → diabolo converter → GRAPH DIABOLO.
  26. GRAPH BNF and GRAPH PHILHARMONIE: two descriptions linked by sameAs.
        http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea
          "Quasi una fantasia" | COMPOSER Beethoven | ORDER NUM 14 | OPUS 27, n 2 | GENRE sonata | CASTING piano | KEY C sharp major | 1st PUB ? | PREMIERE ?
        http://data.doremus.org/expression/37932fbc-fef3-3edb-9fae-1eec9b4be01d
          "Sonata quasi una fantasia" | COMPOSER Beethoven | ORDER NUM 14 | OPUS 27, n 2 | GENRE sonata, romantic music | CASTING piano (1) | KEY C sharp major | 1st PUB 1802, Vienna | PREMIERE ?
  27. [DATA LINKING] Challenges: ● Not all the works have values for all the properties (lack of attributes) ● Similar values do not necessarily imply a match, e.g. Beethoven's Sonata n. 1, Sonata n. 2, Sonata n. 3 ● Lexical, semantic, transliteration, orthographic mismatches. Image caption: On the left: Beethoven. On the right: (the same) Beethoven.
  28. [DATA LINKING] First linking: Composer + Catalogue. (Wolfgang Amadeus Mozart, Eine kleine Nachtmusik, K 525) sameAs (Wolfgang Amadeus Mozart, Serenade No. 13 in G major, KV 525).
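A composer + catalogue key makes the Mozart example above link even though the titles differ completely. The sketch below illustrates the idea on toy records; the `urn:example:` URIs, the `KV` to `K` alias table and the matching of composers by name string are assumptions for this example, not details given in the presentation.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Work(NamedTuple):
    uri: str
    composer: str
    title: str
    catalogue: str


# Assumed alias table: "KV" and "K" both denote the Köchel catalogue.
CATALOGUE_ALIASES = {"KV": "K"}


def catalogue_key(catalogue: str) -> Optional[Tuple[str, int]]:
    """Normalise e.g. 'KV 525' and 'K 525' to ('K', 525)."""
    m = re.fullmatch(r"\s*([A-Za-z]+)\.?\s*(\d+)\s*", catalogue)
    if m is None:
        return None
    prefix = m.group(1).upper()
    return CATALOGUE_ALIASES.get(prefix, prefix), int(m.group(2))


def link_by_composer_and_catalogue(left: List[Work],
                                   right: List[Work]) -> List[Tuple[str, str]]:
    """Return (left URI, right URI) pairs sharing composer and catalogue key."""
    index = {}
    for work in right:
        key = catalogue_key(work.catalogue)
        if key is not None:
            index.setdefault((work.composer, key), []).append(work.uri)
    links = []
    for work in left:
        key = catalogue_key(work.catalogue)
        if key is not None:
            links.extend((work.uri, other) for other in index.get((work.composer, key), []))
    return links


graph_a = [Work("urn:example:a/1", "Wolfgang Amadeus Mozart", "Eine kleine Nachtmusik", "K 525")]
graph_b = [
    Work("urn:example:b/7", "Wolfgang Amadeus Mozart", "Serenade No. 13 in G major", "KV 525"),
    Work("urn:example:b/8", "Wolfgang Amadeus Mozart", "Symphony No. 40", "KV 550"),
]
links = link_by_composer_and_catalogue(graph_a, graph_b)
print(links)
```

Titles are deliberately ignored here; works without a catalogue number fall through this step and would need a similarity-based linker.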
  29. [DATA LINKING] Legato: a new linking system. Existing data linking systems were not satisfactory.
  30. [DATA LINKING] * works to be compared are grouped by composer *
  31. [DATA LINKING] (No further text on this slide.)
  32. [DATA LINKING] Legato performance at the OAEI 2017 campaign. DOREMUS: Heterogeneities Task, False Positive Trap. SPIMBENCH: sandbox, mainbox.
  33. [LINK VALIDATION] Link patterns: SINGLE LINK, TRIANGLE, MISSING LINK, CONFLICT. Outcomes shown: certain links; confidence score + experts' validation (?); inference if experts' validation; remove with experts' check.
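The four patterns named on the slide above can be detected mechanically once the sameAs links between three graphs are known. The sketch below is one possible reading, not the project's validation code: it assumes a triangle is three mutually linked works from three datasets, a missing link is the absent third side of such a triangle, a conflict is one work linked to two works of the same dataset, and a single link is an isolated pair. The function name `classify_links` and the toy identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def classify_links(links):
    """Classify sameAs links between (dataset, work id) nodes by pattern."""
    links = {frozenset(pair) for pair in links}
    neighbours = defaultdict(set)
    for link in links:
        a, b = sorted(link)
        neighbours[a].add(b)
        neighbours[b].add(a)

    triangles, missing, conflicts = set(), set(), set()
    for node, linked in neighbours.items():
        for x, y in combinations(sorted(linked), 2):
            if x[0] == y[0]:
                # Same node linked to two works of one dataset.
                conflicts.update({frozenset((node, x)), frozenset((node, y))})
            elif frozenset((x, y)) in links:
                triangles.add(frozenset((node, x, y)))
            else:
                # x and y are both linked to node but not to each other.
                missing.add(frozenset((x, y)))
    single = {l for l in links if all(len(neighbours[n]) == 1 for n in l)}
    return {"triangle": triangles, "missing_link": missing,
            "conflict": conflicts, "single_link": single}


toy_links = [
    (("bnf", "w1"), ("philharmonie", "w1")),
    (("philharmonie", "w1"), ("itema3", "w1")),
    (("bnf", "w1"), ("itema3", "w1")),            # closes a triangle
    (("bnf", "w2"), ("philharmonie", "w2")),
    (("philharmonie", "w2"), ("itema3", "w2")),   # bnf/w2 - itema3/w2 is missing
    (("bnf", "w3"), ("philharmonie", "w3")),      # isolated single link
    (("bnf", "w4"), ("philharmonie", "w4a")),
    (("bnf", "w4"), ("philharmonie", "w4b")),     # conflict
]
report = classify_links(toy_links)
for pattern, found in report.items():
    print(pattern, len(found))
```

How each pattern is then handled (accepted as certain, inferred, removed, or sent to experts) is a policy decision layered on top; the pairing of patterns with outcomes on the slide is not fully recoverable from the transcript.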
  34. What is in the Knowledge Graph? 89.872 persons (composers, performers, …); 18.075 corporate bodies (orchestras, choruses, publishers, …); 357.451 musical works (16k components, 4k derived works); 193.412 concerts and studio recordings; 469.131 performed works; 3.833 foreseen concerts; 31.296 publications; 48.006 scores.
  35. Future Work: ● More interlinking with MusicBrainz ● Internal interlinking of performances ● Create bridges with other communities (musicologists, streaming services, …). Applications: ● Explorative Search Engine ● KG-Based Recommender System (http://overture.doremus.org/) ● DOREMUS CHATBOT (https://chatbot.doremus.org/)
  36. GitHub page (converters, interlinking tool, data dumps, ...): github.com/DOREMUS-ANR/. OVERTURE (discover DOREMUS data): overture.doremus.org. DOREMUS website: www.doremus.org. CHATBOT (q&a system for classical music): chatbot.doremus.org. THIS PRESENTATION: https://goo.gl/1UmKnV. Contact: pasquale.lisena@eurecom.fr, @pasqLisena
  37. Persons: 291.421 in the whole graph; 89.872 active* (* with 1 or more compositions, performances, dedications, ...).
        By source: euterpe 9.269 | diabolo 1.503 | itema3 9.040 | philharmonie 8.419 | bnf 19.881 | bnf bib 54.675
        By role: dedicatees 1.479 | subjects 529 | composers 21.626 | conductors 7.830 | performers 3.583 | text authors 13.242
  38. Corporate Bodies: 45.743 in the whole graph; 18.075 active* (* with 1 or more compositions, performances, dedications, ...).
        By source: euterpe 1.001 | diabolo 0 | itema3 39 | philharmonie 1.603 | bnf 855 | bnf bib 14.657
        By role: dedicatees 6 | subjects 7 | orchestras + ensembles 517 | choruses 192 | publishers 6.099 | producers 2.194
  39. Works: 420.733 expressions (include movements); 357.451 complex works.
        source       | f15     | f14     | f22
        euterpe      | -       | 10.587  | 10.587
        diabolo      | 9.343   | 12.344  | 12.344
        itema3       | --      | 15.016  | 15.016
        philharmonie | 5.762   | 14.527  | 14.875
        bnf          | 135.749 | 134.973 | 134.973
        bnf bib      | 245.069 | 223.357 | 279.641
  40. Works: 16.132 components* (* movements, parts, acts, selections (extraits) ...). Derivations: 4.619 arrangements, 293 transcriptions, 43 orchestrations; 4.884 total. 420.733 expressions (include movements); 357.451 complex works.
  41. Performances: 193.065 concerts (performances); 5.702 converted from specific records; 469.131 interpretations of 288.298 distinct works.
        source       | f31     | m43
        diabolo      | 2.294   | 2.294
        itema3       | 2.296   | 12.602
        philharmonie | 7.107   | 47.119
        bnf          | 14.115  | 15.221
        bnf bib      | 165.225 | 387.519
  42. Foreseen Concerts: 3.833 concerts; 13.520 interpretations of 10.759 distinct works; 17 artistic seasons; 281 cycles; 33 festivals.
        source  | m26   | f25 > f22
        euterpe | 3.833 | 13.520
  43. Recordings: 397.597 recordings; 15.267 supports; 198.693 publications.
        source       | f26     | f4     | f3
        itema3       | 2.296   | 2.842  | -
        philharmonie | 3.406   | 11.681 | -
        bnf bib      | 392.020 | 744    | 199.339
  44. Scores: 31.296 publications; 48.006 scores; 44.668 distinct works.
        source  | f24    | f24 > f22
        bnf bib | 31.296 | 48.006
