
IMT Colloquium - 04/04/2019 - AI at the heart of industrial transformations - Machine listening: AI for sounds and music.


IMT Colloquium - AI at the heart of industrial transformations - Session on data and knowledge: machine listening: AI for sounds and music.

Published in: Engineering

  1. "MACHINE LISTENING: ARTIFICIAL INTELLIGENCE FOR SOUNDS AND MUSIC". IMT COLLOQUIUM - ARTIFICIAL INTELLIGENCE AT THE HEART OF INDUSTRIAL TRANSFORMATIONS. APRIL 4TH, 2019. Gaël RICHARD, Professor, Head of the Image, Data, Signal department
  2. MACHINE LISTENING? A well-established domain for speech… Speech Recognition: « What a great workshop! » Speech Emotion Recognition: « The person who's speaking is happy but anxious » Speaker Recognition: « Gaël Richard is speaking » Speech Translation: « Quel excellent colloque ! » / « What a great workshop! » (Footer: 09/04/2019.)
  3. MACHINE LISTENING FOR AUDIO AND MUSIC. Goal: to extract « information » from the audio signal. Music recognition (or Audio ID): identifying the music recording; Music information retrieval: music transcription, music similarity, music genre recognition, musical instrument recognition, music recommendation, autotagging; Audio scene recognition: sound recorded in streets, in subway stations, in offices, …; Audio source separation: demixing music, singing voice extraction, source localisation; Audio event recognition: speech, music, car noises, birds, …; Audio sound capture: localisation, dereverberation, denoising, …; Music emotion recognition: sad vs happy vs dance music, …; Bioacoustics: wildlife monitoring, biodiversity, species ID, …
  4. SOME APPLICATIONS… Music streaming, music recommendation; music education; music identification, audio fingerprint; karaoke, speech-to-rap conversion; sound recognition, smart homes, smart hearables; bmat; Audeering; music branding strategy, music supervision (advertising, films); bioacoustics; vocal separation, music separation
  5. SOURCE SEPARATION
  6. SOURCE SEPARATION FOR REMIXING
  7. SOURCE SEPARATION (LEGLAIVE & AL.) Model for audio sources; model for reverberation
  8. AN EXAMPLE OF MODEL FOR AUDIO SOURCES: Non-negative matrix factorization
  9. SOURCE SEPARATION - S. LEGLAIVE (EXAMPLE REMIXED BY RADIO FRANCE). S. Leglaive, R. Badeau, G. Richard, "Multichannel Audio Source Separation with Probabilistic Reverberation Priors", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, no. 12, December 2016. S. Leglaive, R. Badeau, G. Richard, "Separating Time-Frequency Sources from Time-Domain Convolutive Mixtures Using Non-negative Matrix Factorization", WASPAA, Oct. 2017, New Paltz, US.
  10. AUDIO IDENTIFICATION, OR AUDIO ID. Audio ID = find high-level metadata from a music recording. Challenges: efficiency in adverse conditions (distortion, noise, …); scaling to "big data" (databases > millions of titles); speed / real time. Product example. Audio identification: information about the recording (title, artist, etc.)
  11. REAL-TIME AUDIO IDENTIFICATION (FENET & AL.). Audio recordings recognition: • Identical • Approximate (live vs studio) • Real-time demonstrator • For music recommendation, second-screen applications, … S. Fenet, Y. Grenier, G. Richard, "An Extended Audio Fingerprint Method with Capabilities for Similar Music Detection", ISMIR 2013, pp. 569-574.
  12. SOUND EVENTS AND ACOUSTIC SCENE RECOGNITION
  13. SOME APPROACHES
  14. AN EXAMPLE OF A RECENT APPROACH. V. Bisot, R. Serizel, S. Essid, G. Richard, "Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, Special Issue on Sound Scene and Event Analysis.
  15. EXAMPLE FOR SCENE CLASSIFICATION
  16. UNSUPERVISED NMF FOR ACOUSTIC SCENE RECOGNITION
  17. UNSUPERVISED NMF FOR ACOUSTIC SCENE RECOGNITION
  18. EXAMPLE WITH DNN: ACOUSTIC SCENE RECOGNITION. V. Bisot & al., "Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017. V. Bisot & al., "Leveraging deep neural networks with nonnegative representations for improved environmental sound classification", IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Sep. 2017, Tokyo.
  19. TYPICAL PERFORMANCE OF ACOUSTIC SCENE RECOGNITION (DCASE 2016 CHALLENGE). A. Mesaros & al., "Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26 (2), 379-393.
  20. MUSIC INFORMATION RETRIEVAL. Major topics: • Music transcription (multiple F0 estimation, beat/downbeat detection, instrument classification, …) • Music recommendation • Source separation • Multimodal music processing
  21. MIR: AN EXAMPLE WITH DOWNBEAT ESTIMATION (DURAND & AL. 2017)
  22. MIR: AN EXAMPLE WITH DOWNBEAT ESTIMATION (DURAND & AL. 2017). S. Durand & al., "Robust Downbeat Tracking Using an Ensemble of Convolutional Networks", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, no. 1, 2017.
  23. DOWNBEAT ESTIMATION: DEMO. Examples at the output of each network: https://simondurand.github.io/dnn_audio.html Other audio examples: JBB (Downbeat), JBB (Tatum), Example (Downbeat), Example (Tatum)
  24. MUSIC INFORMATION RETRIEVAL. Some current trends: context-aware music recommendation (with …); text-informed lead vocal extraction (with …); EEG-driven music processing (with … (ex Technicolor)); context-driven style transfer (with … (ex Technicolor)); conditional audio generation (with …); 5 new PhD grants within ITN-MIP-Frontiers projects
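
Slides 8, 16 and 17 name non-negative matrix factorization (NMF) as a model for audio sources and as an unsupervised feature learner, but the transcript does not say which cost function or update rule the cited works use. As a rough illustration only, the sketch below assumes a Euclidean cost and the classic multiplicative updates, and factorizes a toy "spectrogram" V (frequency bins by time frames) into spectral templates W and activations H. The function names, the toy data and the rank are all hypothetical, and the code uses only the Python standard library.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate V by W H with non-negative W and H (assumed Euclidean cost)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random initialisation.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(rank)]
             for f in range(n_rows)]
        errors.append(squared_error(v, w, h))
    return w, h, errors


# Toy data (hypothetical): two "sources", each with a fixed spectral template
# (columns of w_true) switched on and off over five frames (rows of h_true).
w_true = [[1, 0], [0, 1], [1, 0], [0, 2]]
h_true = [[1, 0, 2, 0, 1], [0, 3, 0, 1, 1]]
v = matmul(w_true, h_true)

w, h, errors = nmf(v, rank=2)
print("reconstruction error: first %.4f, last %.6f" % (errors[0], errors[-1]))
```

In a source separation setting, each column of W would play the role of a spectral pattern and each row of H its activation over time; in a feature learning setting, the columns of H could serve as frame-level features for a classifier. Both readings are illustrative assumptions about how the factors might be used, not a description of the cited systems.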
