
One Score To Rule Them All: Semantics in Music Notation

In this talk, Albert Meroño Peñuela will summarize the ongoing efforts to bridge this gap by means of knowledge representations used in the Semantic Web (RDF and ontologies). In particular, he will describe recent research at the Vrije Universiteit Amsterdam on applying semantic models to the popular digital music format MIDI, and its implications for a future Web capable of providing a universal interface to musical knowledge.

Published in: Data & Analytics

  1. 1. ONE SCORE TO RULE THEM ALL: SEMANTICS IN MUSIC NOTATION Albert Meroño-Peñuela, et al. DHDK seminar, University of Bologna, 13/02/2018
  2. 2. REFINING STATISTICAL DATA ON THE WEB
  3. 3. REFINING STATISTICAL DATA ON THE WEB
  4. 4. Vrije Universiteit Amsterdam ME • Postdoc researcher at VU University Amsterdam, Knowledge Representation & Reasoning • Computer Science! • Interfaces between the Digital Humanities and the Semantic Web • Representation of and access to cultural knowledge, such as that contained in historical objects, music sheets, and statistical registers • Ontologies, Linked Data, Semantic Music, APIs, reproducibility, provenance • “Refining Statistical Data on the Web”
  5. 5. OUTLINE • Digital Data-driven Humanities > The human-machine spectrum of DH > Beyond text processing • Enabling a Global & Repeatable Social History > Data preparation > Data integration > Reusing and publishing schemas > Accessing the data OR Asking the same questions to different datasets • One score to rule them all > Music on the Web > The MIDI Linked Data Cloud > Creative applications • This slide deck at http://tinyurl.com/semanticmusic
  6. 6. WHAT IS DIGITAL HUMANITIES? “to study human culture in a more scientific way” “to compute data from the humanities” • Albert: “doing humanities is exactly equal to doing science” > Repeatability > Hypothesis testing > Pragmatic, clean, idealized • Jacky: “doing humanities is completely different to doing science” > Interpretative approach, relativistic > Give value to argumentation and vagueness instead of truth > Focus on the questions we do ask • https://storify.com/ingorohlfing/overly-honest-methods-in-science
  7. 7. THE HUMAN-MACHINE SPECTRUM OF DH From purely machine-based to purely human-based
  8. 8. BEYOND TEXT PROCESSING
  9. 9. BEYOND TEXT PROCESSING
  10. 10. BEYOND TEXT PROCESSING
  11. 11. ENABLING SOCIAL HISTORY ON THE WEB
  12. 12. WHAT IS SOCIAL HISTORY? • Contrasted with political history, intellectual history and the history of great men • Explains history from the perspective of ordinary people (demography, work, family, migration) • Uses (to a great degree) social science methods → Data science!
  13. 13. THE (HISTORICAL) KNOWLEDGE DISCOVERY PROCESS [Figure labels: Volume, Variety]
  14. 14. DATA PREPARATION • Present data = high volume • Historical data = high variety > Multiple legacy (tabular) formats > Diverse identity, unity, rigidity and dependence • Preparing them to gain knowledge is expensive > Manual data munging > Hardly reproducible
  15. 15. DATA PREPARATION This ‘data preparation’ step can take up to 60–80% of the total work
  16. 16. REFINING STATISTICAL DATA ON THE WEB We do this repeatedly for the same datasets!
  17. 17. CEDAR / CLARIAH [Timeline figure: ?, 1795, 1830, 1889, 1930, 1971]
  18. 18. TOWARDS 5-STAR HISTORICAL STATISTICAL DATA [Figure labels: more than 4 years ago; 4 years ago]
  19. 19. LINKED DATA – THE RDF GRAPH DATA MODEL The Divine Comedy was written by Dante
  20. 20. LINKED DATA – THE RDF GRAPH DATA MODEL The Divine Comedy (subject) was written by (predicate) Dante (object)
  21. 21. LINKED DATA – THE RDF GRAPH DATA MODEL The Divine Comedy (subject) was written by (predicate) Dante (object) dbr:Divine_Comedy dbp:author dbr:Dante_Alighieri .
  22. 22. LINKED DATA – THE RDF GRAPH DATA MODEL The Divine Comedy (subject) was written by (predicate) Dante (object) dbr:Divine_Comedy dbp:author dbr:Dante_Alighieri . dbr: <http://dbpedia.org/resource/...> dbp: <http://dbpedia.org/property/...>
  23. 23. LINKED DATA – THE RDF GRAPH DATA MODEL The Divine Comedy (subject) was written by (predicate) Dante (object) dbr:Divine_Comedy dbp:author dbr:Dante_Alighieri . dbr: <http://dbpedia.org/resource/...> dbp: <http://dbpedia.org/property/...> dbr:Divine_Comedy rdf:type owl:Thing , dbo:Poem . dbr:Divine_Comedy :completed "1320" . …
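The build-up on the RDF slides can be sketched in a few lines of Python. This is only an illustration with the standard library, not an RDF toolkit: the `PREFIXES` table holds the two DBpedia namespaces shown on the slide, while the helper names `expand` and `to_ntriples` are hypothetical and introduced here for the example.

```python
# Namespace prefixes as shown on the slide.
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "dbp": "http://dbpedia.org/property/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'dbr:Dante_Alighieri' into a full IRI."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of prefixed names as a single N-Triples line."""
    return "<{}> <{}> <{}> .".format(expand(subject), expand(predicate), expand(obj))


# "The Divine Comedy was written by Dante" as subject, predicate, object.
triple = ("dbr:Divine_Comedy", "dbp:author", "dbr:Dante_Alighieri")
print(to_ntriples(*triple))
```

The point of the expansion step is that every part of the statement becomes a globally unique, dereferenceable identifier instead of a local string.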
  24. 24. GENERATING LINKED DATA FROM EXCEL https://github.com/Data2Semantics/TabLinker Credits to Rinke Hoekstra
  25. 25. GENERATING LINKED DATA FROM CSV • Semi-automatic • Generic • Domain independent • Microdata = CSVW [COW] • Macrodata = RDF Data Cube [QBer] [TabLinker] Credits to Rinke Hoekstra
  26. 26. LSD DIMENSIONS – FINDING THE VERB http://lsd-dimensions.org/ Index of statistical dimensions and associated concept schemes on the Web
  27. 27. REFINING STATISTICAL DATA ON THE WEB
  28. 28. REFINING STATISTICAL DATA ON THE WEB
  29. 29. REFINING STATISTICAL DATA ON THE WEB
  30. 30. REFINING STATISTICAL DATA ON THE WEB New code lists • HISCO http://historyofwork.iisg.nl/ Credits to Richard Zijdeman
  31. 31. REFINING STATISTICAL DATA ON THE WEB New code lists • Gemeentegeschiedenis.nl http://www.gemeentegeschiedenis.nl/ Credits to Ivo Zandhuis
  32. 32. REFINING STATISTICAL DATA ON THE WEB New code lists http://licr.io/ Credits to Ashkan Ashkpour
  33. 33. REFINING STATISTICAL DATA ON THE WEB New code lists http://licr.io/ Credits to Ashkan Ashkpour
  34. 34. REFINING STATISTICAL DATA ON THE WEB Credits to Richard Zijdeman http://nlgis.nl/
  36. 36. GITHUB AS A HUB OF SPARQL QUERIES • One .rq file per SPARQL query • Good support of query curation processes > Versioning > Branching > Clone-pull-push • Web-friendly features! > One URI per query > Uniquely identifiable > De-referenceable (raw.githubusercontent.com)
  37. 37. http://grlc.io/
  38. 38. THE GRLC SERVICE • Assuming your repo is at https://github.com/:owner/:repo and your grlc instance at :host, > http://:host/:owner/:repo/spec returns the JSON swagger spec > http://:host/:owner/:repo/api-docs returns the swagger UI > http://:host/:owner/:repo/:operation?p_1=v_1...p_n=v_n calls operation with the specified parameter values > Uses BASIL’s SPARQL variable name convention for query parameters • Sends requests to > https://api.github.com/repos/:owner/:repo to look for SPARQL queries and their decorators > https://raw.githubusercontent.com/:owner/:repo/master/file.rq to dereference queries, get the SPARQL, and parse it
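The URL patterns on the grlc slide can be made concrete with a small sketch. The route shapes (`/spec`, `/api-docs`, `/:operation?...` and the raw.githubusercontent.com path) are taken from the slide; the function names, the example host `grlc.example.org`, the owner, repository, operation and parameter are all hypothetical, and joining several parameters with `&` is assumed from ordinary query-string practice.

```python
from urllib.parse import urlencode


def grlc_base(host: str, owner: str, repo: str) -> str:
    """Base URL of the API that a grlc instance builds for one GitHub repo."""
    return "http://{}/{}/{}".format(host, owner, repo)


def spec_url(host: str, owner: str, repo: str) -> str:
    """URL that returns the JSON swagger spec."""
    return grlc_base(host, owner, repo) + "/spec"


def api_docs_url(host: str, owner: str, repo: str) -> str:
    """URL that returns the swagger UI."""
    return grlc_base(host, owner, repo) + "/api-docs"


def operation_url(host: str, owner: str, repo: str, operation: str, **params) -> str:
    """URL that calls one operation (one stored query) with parameter values."""
    url = grlc_base(host, owner, repo) + "/" + operation
    query = urlencode(params)
    return url + "?" + query if query else url


def raw_query_url(owner: str, repo: str, filename: str) -> str:
    """Location from which the .rq file itself is dereferenced (master branch)."""
    return "https://raw.githubusercontent.com/{}/{}/master/{}".format(owner, repo, filename)


# Hypothetical example values, for illustration only.
print(operation_url("grlc.example.org", "alice", "queries", "occupations", year=1889))
print(raw_query_url("alice", "queries", "occupations.rq"))
```

A client of such an API only needs an HTTP library; the SPARQL stays in the repository.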
  39. 39. SPARQL DECORATOR SYNTAX
  40. 40. SPICED-UP SWAGGER UI
  41. 41. EVALUATION – USE CASES • CEDAR: Access to census data for historians > Hides SPARQL > Allows them to fill query parameters through forms > Co-existence of SPARQL and non-SPARQL clients • CLARIAH - Born Under a Bad Sign: Do prenatal and early-life conditions have an impact on socioeconomic and health outcomes later in life? (uses 1891 Canada and Sweden Linked Census Data) > Reduction of coupling between SPARQL libs and R > Shorter R code – input stream as CSV
  42. 42. QUALITATIVE EVALUATION > “multiple copies of the same queries in different places (…) was problematic. grlc allows queries to be maintained in a single location” > “with grlc the R code becomes clearer due to the decoupling with SPARQL; and shorter, since a curl suffices to retrieve the data” > “it allows us to manage SPARQL queries separate from the rest of the API – this enables, for instance, to have different queries without having to deploy a new version of the API” > “we use grlc to provide FAQ for those who would prefer REST over SPARQL, but also to explore the data” > “we use grlc to expose the ECAI conference proceedings not only as Linked Data that can be used by Semantic Web practitioners, but also as a Web API that web developers can consume” > “grlc helps to share, extend and repurpose queries by providing a URI for the resulted queries and by supporting collaborative update of those queries”
  43. 43. QUANTITATIVE EVALUATION • The cost of grlc is independent of the dataset size • HTTP requests and payloads are important costs
  44. 44. ONE SCORE TO RULE THEM ALL
  45. 45. SEMANTIC WEB AND THE HUMANITIES • The “digital” as an instrument for the Humanities
  46. 46. ISWC 2013 JAM SESSION Jam’s “metadata”
  47. 47. REPRESENTING MUSIC IN RDF? • The jam became global (i.e. de-referenceable URIs from anywhere) rather than local > But any video stream would have been more accurate (for humans) • The jam became machine readable > But not all of it • Digital music as Linked Data? • But why?
  48. 48. THE WEB MUSIC ECOSYSTEM
  49. 49. LINKED MUSIC ON THE WEB Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/ [Figure label: Etree] See Daquino et al. 2017 (WHiSe II), “Characterizing the Landscape of Musical Data on the Web: state of the art and challenges”
  50. 50. COOL, BUT… • Symbolic music databases (MusicXML, MIDI, NIFF, MEI) are non-interoperable • From Daquino et al. (WHiSe 2017): > “Repositories and digital libraries are the most representative resources collecting musical data. They mainly offer digitisations of scores and lyrics (77%), published as PDF and/or JPG (40%)” > “The more the scale of repositories increases, the less structured formats for representing symbolic notation seem to be used and the less depth of analysis is provided” > “Larger collections are more likely to feature melody” • Can we find ways of increasing the level of structure of musical data without compromising its scalability?
  51. 51. MIDI • MIDI: Digital music representation protocol > (i.e. leaving nothing to analog signals → actual instruments) • Popular/abundant, production, standard • Musical Instrument Digital Interface (1983) > Universal synthesizer interface > Roland (I. Kakehashi), Yamaha, Korg, Kawai (1981) > Digital, fine-grained representation of musical tracks and events > Wide range of controllers and instruments
  52. 52. BUT WHAT IS MIDI? [ 144, 60, 100 ] Thanks @rumyra! https://www.youtube.com/watch?v=khsBjXKJOPs
  53. 53. BUT WHAT IS MIDI? [ 144, 60, 100 ] [ 128, 60, 64 ] Thanks @rumyra! https://www.youtube.com/watch?v=khsBjXKJOPs
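The two byte triples on these slides are a note-on and a note-off message. A minimal decoding sketch follows; the function name `decode_note_message` is hypothetical, only note-on and note-off are handled, and the octave numbering assumes the common convention in which note 60 is written C4 (some tools call it C3).

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def decode_note_message(message):
    """Decode a 3-byte MIDI note-on or note-off message into named fields."""
    status, note, velocity = message
    kind = status & 0xF0           # upper four bits: message type
    channel = (status & 0x0F) + 1  # lower four bits: channel, shown 1-based
    if kind == 0x90:
        event = "note_on"
    elif kind == 0x80:
        event = "note_off"
    else:
        raise ValueError("only note-on and note-off are handled in this sketch")
    # Assumed naming convention: note number 60 is C4.
    name = NOTE_NAMES[note % 12] + str(note // 12 - 1)
    return {"event": event, "channel": channel, "note": name, "velocity": velocity}


print(decode_note_message([144, 60, 100]))
print(decode_note_message([128, 60, 64]))
```

So `[144, 60, 100]` starts a C on channel 1 at velocity 100, and `[128, 60, 64]` releases the same note.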
  54. 54. MIDI2RDF & RDF2MIDI • midi2rdf: lossless conversion of MIDI to RDF (and back). Albert Meroño-Peñuela, Rinke Hoekstra. “The Song Remains the Same: Lossless Conversion and Streaming of MIDI to RDF and Back”. In: 13th Extended Semantic Web Conference (ESWC 2016), posters and demos track. May 29th to June 2nd, Heraklion, Crete, Greece (2016). • rdf2midi, direct stream mapping https://midi-ld.github.io/
  55. 55. WEEKEND EXPERIMENT • Music representation format which is > 100% digital (i.e. leaving nothing to analog signals) > Secondary list • MIDI (Musical Instrument Digital Interface) > Universal synthesizer interface > Roland (I. Kakehashi), Yamaha, Korg, Kawai (1981) > Digital, fine-grained representation of musical events > Wide range of controllers and instruments
  56. 56. MIDI LINKED DATA http://purl.org/midi-ld/pattern/635f0b49bb3f62c3a76cc58f979bd858
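The pattern URI above ends in a content hash. The sketch below shows how such an identifier could be derived; it is an assumption, not the project's documented procedure. The 32 hexadecimal characters are consistent with an MD5 digest, so MD5 over the raw file bytes is used here, and the function name `pattern_uri` is hypothetical.

```python
import hashlib

# Namespace taken from the slide.
PATTERN_BASE = "http://purl.org/midi-ld/pattern/"


def pattern_uri(midi_bytes: bytes) -> str:
    """Build a content-addressed URI for a MIDI file.

    Assumption: the identifier is the MD5 hex digest of the file's bytes.
    """
    digest = hashlib.md5(midi_bytes).hexdigest()
    return PATTERN_BASE + digest


# Stand-in bytes for illustration; a real call would read a .mid file.
example = b"abc"
print(pattern_uri(example))
```

Whatever the exact hash, the design consequence is the same: identical MIDI content always maps to the same URI, so duplicates across collections collapse into one resource.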
  57. 57. MIDI SCHEMA http://purl.org/midi-ld/midi#
  58. 58. MIDI LINKED DATA RESOURCES • MIDI Pieces http://purl.org/midi-ld/piece/ > Access to MIDI level triples > Cryptographic hash for unique MIDI content http://purl.org/midi-ld/pattern/87dd99fb346cd4c7934cb36a00868cbe • MIDI Notes http://purl.org/midi-ld/notes/ > Type, label, octave, pitch value • MIDI Programs http://purl.org/midi-ld/programs/ > All instruments linked to DBpedia • MIDI Chords http://purl.org/midi-ld/chords/ > Label, quality, number of pitch classes, intervals • Enrichments > Provenance > Integrated lyrics (mostly from karaoke data) > Key (Krumhansl-Schmuckler), scale degree, metric accents
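The key enrichment names the Krumhansl-Schmuckler algorithm. The following is a generic sketch of that algorithm, not the project's own code: it correlates a 12-bin pitch-class histogram with the 24 rotations of the Krumhansl-Kessler major and minor profiles and returns the best match. The function names are hypothetical, and weighting the histogram by note duration is the usual formulation rather than something the slides state.

```python
from math import sqrt

# Krumhansl-Kessler probe-tone profiles, indexed by semitones above the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def estimate_key(histogram):
    """Return (tonic pitch class, mode) for a 12-bin histogram, 0 = C ... 11 = B."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so its tonic weight lands on pitch class `tonic`.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(histogram, rotated)
            if best is None or score > best[0]:
                best = (score, tonic, mode)
    return best[1], best[2]


# Sanity check: a histogram shaped exactly like the major profile on G (7).
g_major_like = [MAJOR[(pc - 7) % 12] for pc in range(12)]
print(estimate_key(g_major_like))
```

In practice the histogram would come from summing note durations per pitch class over the note events of a piece.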
  59. 59. MIDI LINKED DATA RESOURCES Current collections • The largest MIDI collection on the Internet (thanks @midi_man) • Lakh MIDI dataset (thanks @colinraffel) • MySongBook MIDI • Yours! https://midi-ld.github.com • 308,443 interconnected MIDI files • 10,215,557,355 triples • Full dump, SPARQL endpoint, RESTful API
  60. 60. ENABLING SEMANTIC WEB RESEARCH • Data integration > Further format interoperability: MIDI, MusicXML, NIFF, MEI > Integration with formats of other arts: LabanXML • Entity linking > Audio (Spotify URIs), symbolic notation (MIDI), metadata (MusicBrainz) > High heterogeneity, low overlap > Challenge to entity linking algorithms • Semantics and ontologies > Music Ontology, Chord Ontology, Timeline Ontology > Underspecification of musical concepts > Reasoning > Challenge for ontology alignment
  61. 61. ENABLING MUSICOLOGY RESEARCH • Analysis of chords, patterns and melodies at Web scale > Integrating knowledge from external databases > Historical, geographical, cultural, economic, stylistic contexts • Everything has a URI > Annotation tasks, workflow descriptions • Establishing standard Web vocabularies > Chords (iReal Pro), melodies, metadata • Recommender systems > Collaborative filtering, content-based feature extraction, hybrid > Notation-based support for abstract representation of musical concepts • Machine learning (multimodal training data, convincing samples) • Audiolisation
  63. 63. SPARQL-DJ Web-based tool that finds, selects, plays, mixes, beat-syncs and generates MIDI mashups from a very large MIDI Linked Data collection
  64. 64. SONIC PI
  65. 65. RDF PI https://github.com/midi-ld/Web-MIDI-API Live coding music directly in RDF (MIDI) Everything happens in your browser (RDF parsing, Web MIDI API)
  66. 66. THE MUSIC SEMANTIC GAP • MIR tasks have a performance ceiling of 65% accuracy, independently of the method • Cause: semantic gap • The closer to the gap, the harder the task • Some ontologies in place, BUT: > Metadata > Audio features > Ignore notation
  67. 67. THE MUSIC SEMANTIC GAP What knowledge representations and algorithms are needed to generalize music symbolic notation and include it into the existing music retrieval formalisms, in order to reduce the semantic gap? • A knowledge graph of symbolic notation • Data and methods Challenges: 1. KR for notation (horizontal gap) ← machine learning, ontology engineering 2. Bridging notation and humans (vertical gap) ← ontology matching 3. Multimodal entity linking (inter-dataset gap) ← hybrid FT, DTW + LIMES
  68. 68. Music and Knowledge Representation "Music impregnates every person’s memory, reasoning, and language. And yet, we lack a global view of all of humankind’s musical knowledge, telling us precisely what music we know, how much there is, and how it differs across societies."
  69. 69. CONCLUSIONS (I) • Semantic Web and Digital Humanities: to science, or not to science? • Data preparation = 80% of work > We throw it away after use! • Linked Data based solutions > Use RDF to make research repeatable – but more intuitive tools needed > Statistical dimensions & codelists – but hard to find, might be missing > GitHub for queries as Linked Data APIs – enables reproducibility, you need an expert JUST ONCE
  70. 70. CONCLUSIONS (AND II) • One score to rule them all > General knowledge representation language (RDF) for music (MIDI) > Mappings for MusicXML, MEI, NIFF, and others > The spectrum of symbolic music vs low level audio signal • Quality (& automatic) links to external Linked Datasets > MusicBrainz, DBpedia, etc. > Hybrid approaches (metadata, lyrics, incipits, MIR algorithms) • Tools > (Contextual) querying > Annotation (every note has a URL!) > Workflow recording • Your ideas & contributions most welcome! https://midi-ld.github.io/
  71. 71. PUBLICATIONS
      > Albert Meroño-Peñuela. “Humanists And Scientists: More Alike Than Different”. eHumanities Magazine, number 7, February 2016.
      > Albert Meroño-Peñuela, Rinke Hoekstra. “grlc Makes GitHub Taste Like Linked Data APIs”. SALAD 2016: Services and Applications over Linked Data APIs and Data. International workshop, ESWC 2016, May 29th, Heraklion, Crete, Greece (2016).
      > Rinke Hoekstra, Albert Meroño-Peñuela, Kathrin Dentler, Auke Rijpma, Richard Zijdeman, Ivo Zandhuis. “An Ecosystem for Linked Humanities Data”. In: Proceedings of the 1st Workshop on Humanities in the SEmantic web (WHiSE 2016). ESWC 2016, May 29th, Heraklion, Crete, Greece (2016).
      > Albert Meroño-Peñuela, Rinke Hoekstra. “The Song Remains the Same: Lossless Conversion and Streaming of MIDI to RDF and Back”. In: 13th Extended Semantic Web Conference (ESWC 2016), posters and demos track. May 29th to June 2nd, Heraklion, Crete, Greece (2016).
      > Albert Meroño-Peñuela. “Refining Statistical Data on the Web”. Vrije Universiteit Amsterdam (2016).
      > Albert Meroño-Peñuela, Christophe Guéret, Stefan Schlobach. “Linked Edit Rules: A Web Friendly Way of Checking Quality of RDF Data Cubes”. Proceedings of the 3rd International Workshop on Semantic Statistics (SemStats 2015), ISWC 2015, Bethlehem, PA, USA (2015).
      > Bas Stringer, Albert Meroño-Peñuela, Antonis Loizou, Sanne Abeln, Jaap Heringa. “To SCRY Linked Data: Extending SPARQL the Easy Way”. Diversity++ workshop, ISWC 2015, Bethlehem, PA, USA (2015).
      > Albert Meroño-Peñuela, Ashkan Ashkpour, Marieke van Erp, Kees Mandemakers, Leen Breure, Andrea Scharnhorst, Stefan Schlobach, Frank van Harmelen. “Semantic Technologies for Historical Research: A Survey”. Semantic Web – Interoperability, Usability, Applicability, 6(6), pp. 539–564. IOS Press (2015).
      > Albert Meroño-Peñuela, Ashkan Ashkpour, Christophe Guéret, Stefan Schlobach. “CEDAR: The Dutch Historical Censuses as Linked Open Data”. Semantic Web – Interoperability, Usability, Applicability, 8(2), pp. 297–310. IOS Press (2015).
  72. 72. THANK YOU! @albertmeronyo DATALEGEND.NET CLARIAH.NL
  73. 73. A BASIC WEB SYSTEMS COMMUNICATION TOOLKIT 1. Endpoint location is volatile: names encapsulate semantics of operations → should be meaningless, just as email addresses. HTTP: http://example.org/canihasdata 2. Consensus on data semantics is necessary: simple object exchange format + 15 years of Web ontology development to semantically describe data. JSON-LD: [{ "@id": "eg:Albert", "rdf:type": [{ "@id": "foaf:Person" }]}]
  74. 74. LINKED DATA NOTIFICATIONS https://www.w3.org/TR/ldn/ Thanks to Sarven Capadisli
  75. 75. IMPLEMENTATIONS http://pyldn.amp.ops.labs.vu.nl/ https://github.com/albertmeronyo/pyldn/
