Music Information Retrieval is about retrieving information from music entities.
The slides introduce the basic concepts of the music language, walk through different kinds of music representations, and end by describing some low-level features that are used when dealing with music entities.
Introduction to Music Information Retrieval
1. London Information Retrieval Meetup
19 Feb 2019
Introduction to Music Information Retrieval
Thoughts from a former bass player
Andrea Gazzarini, Software Engineer
19th February 2019
2. London Information Retrieval Meetup
Who I am
▪ Software Engineer (1999-)
▪ “Hermit” Software Engineer (2010-)
▪ Passionate about Java & Information Retrieval
▪ Former Apache Qpid Committer
▪ Husband & Father
▪ Bass Player
Andrea Gazzarini, “Gazza”
3. London Information Retrieval Meetup
Sease
Search Services
● Open Source Enthusiasts
● Apache Lucene/Solr experts
● Community Contributors
● Active Researchers
● Hot Trends: Learning To Rank, Document Similarity, Search Quality Evaluation, Relevancy Tuning
4. London Information Retrieval Meetup
✓ Music Information Retrieval (MIR)?
➢ Music Essentials
➢ Audio Processing
➢ Q&A
Agenda
5. London Information Retrieval Meetup
“MIR is concerned with the extraction, analysis and usage of information about any kind of music entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI representation of a piece of music, or name of a music artist).”
Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web. Dissertation, Johannes Kepler University, Linz (2008)
Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. MIR is a small but growing field of research with many real-world applications. Those involved in MIR may have a background in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these.
https://en.wikipedia.org/wiki/Music_information_retrieval
Music Information Retrieval (MIR)
6. London Information Retrieval Meetup
AUDIO IDENTIFICATION
GENRE IDENTIFICATION
TRANSCRIPTION
RECOMMENDATION
COVER SONG DETECTION
SYMBOLIC SIMILARITY
MOOD
SOURCE SEPARATION
INSTRUMENT RECOGNITION
TEMPO ESTIMATION
SCORE ALIGNMENT
SONG STRUCTURE
BEAT TRACKING
KEY DETECTION
QUERY BY HUMMING
Music Information Retrieval (MIR)
7. London Information Retrieval Meetup
[Diagram labels: Computational Factors; Music Content; Music Context; State; Context]
Music Content includes all the low-level features we can extract from the audio signal (e.g. time, frequencies, loudness).
Music Context defines additional metadata that cannot be extracted from the audio signal (e.g. lyrics, tags, artists, feedback, posts).
Listener State describes the state of the listener at a given moment (e.g. mood, musical knowledge, preferences).
Listener Context relates to the environment the listener is in at a given moment (e.g. political, geographical, social).
Factors in Music Perception
8. London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
✓ Music Essentials
‣ Essentials
‣ Score Music Representation
‣ Symbolic Representations
‣ Audio Representation
➢ Audio Processing
➢ Q&A
Agenda
9. London Information Retrieval Meetup
A note denotes a sound, together with its pitch and duration
A sound is the audio signal produced by a vibrating body
Notes are associated with graphical symbols (indicating the pitch and the duration)
Two notes whose fundamental frequencies are in a ratio equal to an integer power of two (i.e. one or more octaves apart) are perceived as similar. As a
consequence, we say they belong to the same pitch class
A note is also used to denote a pitch class. Traditional music theory identifies 12 pitch classes
Notes and pitch classes are associated with mnemonic codes (e.g. C, D, E, F, G, A, B or DO, RE, MI, FA, SOL, LA, SI)
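The octave relation above can be sketched in a few lines of Python. This is only an illustration: it assumes 12-tone equal temperament with A4 tuned to 440 Hz and the MIDI convention that A4 is note 69; the function name pitch_class is hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Return the name of the pitch class closest to the given frequency."""
    # Distance from A4 in semitones: 12 semitones per doubling of frequency.
    semitones_from_a4 = round(12 * math.log2(frequency_hz / a4_hz))
    # Assumed MIDI convention: A4 is note 69, multiples of 12 are C.
    midi_note = 69 + semitones_from_a4
    return PITCH_CLASSES[midi_note % 12]


# 110, 220, 440 and 880 Hz differ by powers of two, so they share a pitch class.
octaves_of_a = [pitch_class(f) for f in (110.0, 220.0, 440.0, 880.0)]
```

Frequencies that differ by a factor of 2, 4, 8 and so on land on the same index because the note number changes by a multiple of 12.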
[Figure: the 12 pitch classes. Ascending: C D E F G A B C, with sharps C# D# F# G# A#. Descending: C B A G F E D C, with flats Bb Ab Gb Eb Db]
Music Language Essentials
10. London Information Retrieval Meetup
Text: Letter, Word, Phrase, Sentence
Music: Note, Chord, Ghost Note, Phrase
Text vs Music
11. London Information Retrieval Meetup
[Figure: annotated score with labels for Clef, Key Signature, Time Signature, Tempo, Note, Chord and Reference Chord]
Score music representation
12. London Information Retrieval Meetup
Symbolic music representations comprise any
kind of score representation with an explicit
encoding of notes or other musical events.
Piano Roll originally denoted a roll of paper
with holes that controlled the performance of a
melody on a self-playing device. Nowadays the
term refers to a digital visualisation that
shows pitches over time.
Musical Instrument Digital Interface (MIDI) is
another widely adopted representation for
musical events (e.g. pitch, velocity,
duration, intensity)
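As a rough illustration of the piano roll idea, the sketch below renders a few note events as a text grid, with one row per pitch and one column per time step. The event format (MIDI pitch number, start step, duration in steps) and the function name piano_roll are assumptions made for this example, not part of any MIDI library.

```python
def piano_roll(notes, n_steps, low_pitch=60, high_pitch=72):
    """Build a text grid: one row per pitch (highest first), one column per step."""
    rows = []
    for pitch in range(high_pitch, low_pitch - 1, -1):
        row = ["."] * n_steps
        for note_pitch, start, duration in notes:
            if note_pitch == pitch:
                # Mark every time step during which the note sounds.
                for step in range(start, min(start + duration, n_steps)):
                    row[step] = "#"
        rows.append("".join(row))
    return rows


# Hypothetical events: C4 (60), E4 (64) and G4 (67) as (pitch, start, duration).
melody = [(60, 0, 2), (64, 2, 2), (67, 4, 4)]
roll = piano_roll(melody, n_steps=8)
print("\n".join(roll))
```

Reading the grid left to right gives time, reading it bottom to top gives increasing pitch, which is the same layout a digital piano roll editor shows.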
Piano Roll & MIDI
Symbolic music representation
13. London Information Retrieval Meetup
MusicXML [1] is an XML dialect for expressing music
notation.
As you can imagine from the example on the right,
encoding a whole song results in a huge and
verbose textual representation (that’s XML!).
For that reason MusicXML 2.0 introduced a
compressed format with a .mxl suffix
• Widely supported (scorewriting, OCR, sequencer)
• Easy to understand
• Full support of music features
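To give a feeling for the format, here is a minimal sketch that parses a tiny hand-written MusicXML fragment with Python's standard xml.etree.ElementTree module. The fragment (one part, one 4/4 measure, two half notes) and the helper name read_notes are made up for illustration; real files carry far more detail.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment: one part, one measure in 4/4 with a treble clef.
SCORE = """
<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Music</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <attributes>
        <divisions>1</divisions>
        <key><fifths>0</fifths></key>
        <time><beats>4</beats><beat-type>4</beat-type></time>
        <clef><sign>G</sign><line>2</line></clef>
      </attributes>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>2</duration>
        <type>half</type>
      </note>
      <note>
        <pitch><step>E</step><octave>4</octave></pitch>
        <duration>2</duration>
        <type>half</type>
      </note>
    </measure>
  </part>
</score-partwise>
"""


def read_notes(xml_text):
    """Return (step, octave, duration) for every pitched note in the score."""
    root = ET.fromstring(xml_text.strip())
    notes = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests have no <pitch> element
            continue
        notes.append((pitch.findtext("step"),
                      int(pitch.findtext("octave")),
                      int(note.findtext("duration"))))
    return notes


notes = read_notes(SCORE)
```

Even this two-note measure takes more than twenty lines of XML, which is the verbosity the slide refers to.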
[Figure: MusicXML example with the Part, Time, Clef and Note(s) elements highlighted]
[1] https://www.musicxml.com
MusicXML
14. London Information Retrieval Meetup
The Parsons code, formally named the Parsons
code for melodic contours, is a simple notation
used to identify a piece of music through melodic
motion — movements of the pitch up and down.
(https://en.wikipedia.org/wiki/Parsons_code)
The encoding focuses on the pitch relation between
consecutive notes. Main points about this method are:
• Simplicity
• Being a textual encoding, it can be indexed and queried
with text search engines, which poses interesting challenges
• Limited: it completely ignores important
features like timing, interval size, pauses and ghost
notes
Parsons Code
Symbol   Description
*        First note of a sequence
u, /     “up”: the note is higher than the previous one
d, \     “down”: the note is lower than the previous one
r, -     “repeat”: the note is the same as the previous one
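The symbol table above translates almost directly into code. The sketch below assumes the melody is already available as a list of comparable pitch values (MIDI note numbers here); the function name parsons_code is hypothetical.

```python
def parsons_code(pitches):
    """Encode a pitch sequence as a Parsons code string using *, u, d and r."""
    if not pitches:
        return ""
    symbols = ["*"]  # the first note of a sequence is always "*"
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            symbols.append("u")  # up: higher than the previous note
        elif current < previous:
            symbols.append("d")  # down: lower than the previous note
        else:
            symbols.append("r")  # repeat: same pitch as the previous note
    return "".join(symbols)


# Opening of "Twinkle, Twinkle, Little Star": C C G G A A G as MIDI numbers.
twinkle = [60, 60, 67, 67, 69, 69, 67]
code = parsons_code(twinkle)
```

Only the direction of each step survives: transposing the melody or stretching its intervals yields the same string, which is both the strength and the limitation listed above.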
Parsons Code (1/4)
16. London Information Retrieval Meetup
[Figure: Parsons code symbols annotated over score excerpts]
* * r u u r r u r u r d r d r d r d r u r u r u r u r
* u d d d u u u X u d d d u u u X d
Money, Pink Floyd
Parsons Code (3/4)
18. London Information Retrieval Meetup
An audio signal is a representation of sound: it describes the fluctuation in air pressure
caused by a vibrating body as a function of time. Unlike sheet music or symbolic representations,
audio representations encode everything that is necessary to reproduce an acoustic realization
of a piece of music.
Digital computers can only capture this data at discrete moments in time. The rate at which a
computer captures audio data is called the sampling frequency or sampling rate.
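A minimal sketch of what sampling means in practice: a continuous sine wave is evaluated only at the discrete instants n / sampling_rate. The 440 Hz tone and the 44100 Hz rate (the CD-audio rate) are example values chosen for this sketch, and sample_sine is a hypothetical helper.

```python
import math


def sample_sine(frequency_hz, duration_s, sampling_rate_hz):
    """Return the samples of a sine wave taken at discrete moments in time."""
    n_samples = int(round(duration_s * sampling_rate_hz))
    # Sample number n is taken at time t = n / sampling_rate_hz seconds.
    return [math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(n_samples)]


# 10 ms of a 440 Hz tone sampled 44100 times per second.
samples = sample_sine(440.0, 0.01, 44100)
```

Halving the sampling rate halves the number of samples that describe the same 10 ms of sound.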
Audio Representation: Time Domain
19. London Information Retrieval Meetup
The Frequency Domain (FD) representation
decomposes the audio signal into a number of
waves oscillating at different frequencies.
An FD plot shows the frequencies on the
horizontal axis and their corresponding
magnitude (power) on the vertical axis.
This representation, among other things, can
be used for highlighting the dominant
frequencies of a musical tone.
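The idea of a dominant frequency can be illustrated with a naive discrete Fourier transform written with the standard library only. This is a sketch for tiny inputs (real systems use an FFT); the test tone, the 64 Hz sampling rate and the function names are assumptions for the example.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT: magnitudes of the bins from 0 Hz up to the Nyquist frequency."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append(abs(total))
    return spectrum


def dominant_frequency(samples, sampling_rate_hz):
    """Return the frequency (Hz) of the bin with the largest magnitude."""
    spectrum = magnitude_spectrum(samples)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    # Bin k corresponds to k * sampling_rate / N hertz.
    return peak_bin * sampling_rate_hz / len(samples)


# One second of a strong 8 Hz component plus a weaker 20 Hz component.
sampling_rate = 64
tone = [math.sin(2 * math.pi * 8 * i / sampling_rate)
        + 0.3 * math.sin(2 * math.pi * 20 * i / sampling_rate)
        for i in range(sampling_rate)]
peak = dominant_frequency(tone, sampling_rate)
```

Plotting the values returned by magnitude_spectrum against their bin frequencies gives exactly the kind of FD plot described above.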
Frequency Domain
20. London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Music Essentials
✓ Audio Processing
‣ Basic Pipeline
‣ Time Domain Features
‣ Frequency Domain Features
‣ Chroma Features
➢ Q&A
Agenda
21. London Information Retrieval Meetup
Analog Signal → Sampling / Quantization → Framing
Framing → Time Domain Features Extraction
Framing → Windowing → FFT → Frequency Domain Features Extraction
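The framing and windowing steps can be sketched as follows. The frame size, the hop size and the choice of a Hann window are assumptions for this example (the slides do not name a specific window), and all function names are hypothetical.

```python
import math


def frame_signal(samples, frame_size, hop_size):
    """Split a signal into (possibly overlapping) frames of equal length."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frames.append(samples[start:start + frame_size])
    return frames


def hann_window(size):
    """Hann window: tapers smoothly to zero at both ends of the frame."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))
            for n in range(size)]


def apply_window(frame, window):
    """Multiply a frame by the window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


signal = [1.0] * 10
frames = frame_signal(signal, frame_size=4, hop_size=2)
windowed = [apply_window(frame, hann_window(4)) for frame in frames]
```

Time-domain features would be computed on frames directly, while windowed frames are what gets passed to the FFT for the frequency-domain features.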
Basic Audio Processing Pipeline
22. London Information Retrieval Meetup
Feature: Description
Amplitude Envelope (AE): max amplitude within a frame
Root-Mean-Square Energy (RMS): perceived sound intensity
Zero Crossing Rate (ZCR): number of times the amplitude changes its sign within a frame
Usage: Loudness Estimation, Timbre Analysis, Speech Recognition, Audio Segmentation, Onset Detection
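The three time-domain features can be computed per frame with a few lines each. This sketch follows the descriptions above literally (the zero-crossing function returns a count rather than a normalised rate); the function names and the toy frame are assumptions.

```python
import math


def amplitude_envelope(frame):
    """Maximum absolute amplitude within the frame."""
    return max(abs(x) for x in frame)


def rms_energy(frame):
    """Square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossings(frame):
    """Number of sign changes between consecutive samples in the frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


# Toy frame that flips sign at every sample.
frame = [0.5, -0.5, 0.5, -0.5]
features = (amplitude_envelope(frame), rms_energy(frame), zero_crossings(frame))
```

Dividing the crossing count by the frame length would give a rate that is comparable across different frame sizes.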
Time Domain Features
23. London Information Retrieval Meetup
Feature: Description
Band Energy Ratio (BER): ratio between the energy in the lower and the higher frequency bands
Spectral Centroid: frequency band where most of the energy is concentrated (the "centre of gravity" of the spectrum)
Bandwidth (BW): spectral range of the interesting part of a signal
Spectral Flux: how much the spectrum changes from one frame to the next
Usage: Timbre Analysis, Speech Recognition, Onset Detection, Speech/Music Discrimination
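A minimal sketch of three of these features, computed from magnitude spectra given as plain lists. It assumes common textbook definitions: the centroid as the magnitude-weighted mean frequency, the band energy ratio as low-band over high-band energy around a split bin, and the flux as the squared difference between the spectra of two consecutive frames. The function names and the toy spectra are hypothetical.

```python
def spectral_centroid(magnitudes, frequencies):
    """Magnitude-weighted mean frequency of one frame."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(frequencies, magnitudes)) / total


def band_energy_ratio(magnitudes, split_bin):
    """Energy below the split bin divided by the energy at or above it."""
    low = sum(m * m for m in magnitudes[:split_bin])
    high = sum(m * m for m in magnitudes[split_bin:])
    return low / high if high else float("inf")


def spectral_flux(previous, current):
    """Sum of squared bin-by-bin differences between two consecutive frames."""
    return sum((c - p) ** 2 for p, c in zip(previous, current))


# Toy spectra with four bins at 0, 100, 200 and 300 Hz.
frequencies = [0.0, 100.0, 200.0, 300.0]
frame_a = [0.0, 1.0, 1.0, 0.0]
frame_b = [0.0, 0.0, 1.0, 1.0]
centroid = spectral_centroid(frame_a, frequencies)
ratio = band_energy_ratio(frame_a, split_bin=2)
flux = spectral_flux(frame_a, frame_b)
```

In the toy data the energy moves up by one bin between the two frames, which raises the centroid and produces a non-zero flux.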
Frequency Domain Features
24. London Information Retrieval Meetup
Chroma features are a powerful representation for
music audio in which the entire spectrum is
projected onto 12 bins representing the 12 distinct
semitones (or chroma) of the musical octave.
This kind of analysis bridges low-level
and mid-level features, moving the audio signal
representation toward something more
readable from a functional perspective.
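The projection onto 12 bins can be sketched by folding each spectral component onto its nearest pitch class. The sketch assumes equal temperament with A4 at 440 Hz and takes the spectrum as two plain lists (bin frequencies and magnitudes); chroma_vector is a hypothetical name, not a library function.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_vector(frequencies, magnitudes, a4_hz=440.0):
    """Accumulate spectral magnitudes into 12 pitch-class bins (index 0 = C)."""
    chroma = [0.0] * 12
    for frequency, magnitude in zip(frequencies, magnitudes):
        if frequency <= 0:
            continue  # the 0 Hz bin has no pitch
        # Nearest MIDI note number, assuming A4 = note 69.
        midi_note = round(69 + 12 * math.log2(frequency / a4_hz))
        chroma[midi_note % 12] += magnitude
    return chroma


# Toy spectrum: energy at A3, A4, A5 and a little at C4.
chroma = chroma_vector([220.0, 440.0, 880.0, 261.63], [1.0, 2.0, 1.0, 0.5])
strongest = PITCH_CLASSES[chroma.index(max(chroma))]
```

All three A components collapse into a single bin regardless of octave, which is what makes the representation readable in terms of notes.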
Chroma Features
Chroma Features (1/2)
25. London Information Retrieval Meetup
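[Figure: chromagram with Time on the horizontal axis and the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) on the vertical axis; one region is labelled NOISE]
Note sequence shown along the time axis: A A A A A A C A | F F F F F F F G | C C C C C C D C | B B B B B B C B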
Chroma Features (2/2)
26. London Information Retrieval Meetup
FALCON: FAst Lucene-based Cover sOng identification | chromaprint (part of AcoustID)
Interesting Projects
27. London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Music Representation
➢ Audio Processing
✓ Q&A
Agenda
28. London Information Retrieval Meetup
19 Feb 2019
Thank you!
Introduction to Music Information Retrieval
Thoughts from a former bass player
Andrea Gazzarini, Software Engineer
19th February 2019