Data augmentation for sound data
First-year master's student, The University of Tokyo
Tomoya Koike
What is data augmentation
Increase data volume by deforming the input in ways that can happen in the real world
e.g. Flip
Data augmentation can be useful if augmented data…
1. Can be observed in the real world (or test dataset)
2. Does not affect the features required for classification
Focus area in this survey
Automatic Speech Recognition (ASR)
Speech Emotion Recognition
Environmental Sound Classification (ESC)
Acoustic Scene Classification (ASC)
Audio Tagging
Traditional audio augmentation
• Pitch shifting
• Time stretching
• Loudness variation / Changing gain
• Adding background noise
• Adding reverberation noise[1]
• Time shifting
• Crop/Sub-sequence sampling
• Shuffling frames[2]
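A minimal sketch of four of the waveform-level operations listed above (changing gain, adding background noise, time shifting, cropping), assuming a mono waveform held in a 1-D NumPy array. The function names and parameters are illustrative only and do not come from the slides. Pitch shifting and time stretching are left out because they usually rely on a DSP library such as librosa.

```python
import numpy as np


def change_gain(wave, gain_db):
    """Scale the amplitude by gain_db decibels (20 dB is a factor of 10)."""
    return wave * (10.0 ** (gain_db / 20.0))


def add_noise(wave, noise, snr_db):
    """Mix background noise into wave at the requested signal-to-noise ratio.

    Assumes the noise clip is not all zeros.
    """
    noise = np.resize(noise, wave.shape)  # repeat or truncate to the same length
    signal_power = np.mean(wave ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return wave + scale * noise


def time_shift(wave, shift):
    """Circularly shift the waveform by `shift` samples."""
    return np.roll(wave, shift)


def random_crop(wave, length, rng):
    """Cut a random sub-sequence of `length` samples."""
    start = rng.integers(0, len(wave) - length + 1)
    return wave[start:start + length]


rng = np.random.default_rng(0)
sr = 16000  # hypothetical sampling rate
wave = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # 1 s test tone
noise = rng.standard_normal(sr // 4)                  # stand-in for a noise clip

augmented = random_crop(
    time_shift(add_noise(change_gain(wave, -6.0), noise, snr_db=10.0), 800),
    length=sr // 2,
    rng=rng,
)
```

The order and strength of the operations are arbitrary here; in practice they would be chosen so that the augmented audio still satisfies the two conditions from the "What is data augmentation" slide.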
SpecAugment[8]
Simple, but strong
Procedure
1. Convert audio into a Mel spectrogram
2. Drop some sections along the time axis and the frequency axis
https://ai.googleblog.com/2019/04/specaugment-new-data-augmentation.html
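Step 2 of the procedure can be sketched as follows, assuming the Mel spectrogram has already been computed and is stored as a 2-D NumPy array of shape (mel bins, time frames). The function name, the single mask per axis, and the mask widths are assumptions for illustration. The SpecAugment paper [8] also describes time warping, which is omitted here.

```python
import numpy as np


def mask_spectrogram(spec, max_freq_width, max_time_width, rng, fill=0.0):
    """Return a copy of spec with one frequency band and one time span masked."""
    out = spec.copy()
    n_freq, n_time = out.shape

    # Frequency mask: blank out f consecutive mel bins starting at f0.
    f = rng.integers(0, min(max_freq_width, n_freq) + 1)
    f0 = rng.integers(0, n_freq - f + 1)
    out[f0:f0 + f, :] = fill

    # Time mask: blank out t consecutive frames starting at t0.
    t = rng.integers(0, min(max_time_width, n_time) + 1)
    t0 = rng.integers(0, n_time - t + 1)
    out[:, t0:t0 + t] = fill
    return out


rng = np.random.default_rng(0)
spec = np.ones((80, 200))  # placeholder for a real (80 mel bins x 200 frames) spectrogram
masked = mask_spectrogram(spec, max_freq_width=27, max_time_width=40, rng=rng)
```

The mask is drawn anew on every call, so the same clip is seen with different dropped sections in each epoch.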
Spectrogram augmentation[11]
Demo page
Mixup[3]/BC-learning[4]
Mixup
Between-Class (BC) learning
cf. SamplePairing[6], Extrapolation[5]
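A minimal sketch of Mixup [3], assuming one-hot label vectors and inputs of equal shape. Two examples and their labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. The function name and the value of alpha are illustrative. BC learning [4] likewise mixes two sounds from different classes at a random ratio and trains the model to predict that ratio; its handling of sound pressure is not shown here.

```python
import numpy as np


def mixup(x1, y1, x2, y2, alpha, rng):
    """Blend two (input, one-hot label) pairs with a Beta-distributed weight."""
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam


rng = np.random.default_rng(0)
x_a, y_a = rng.standard_normal(16000), np.array([1.0, 0.0, 0.0])  # class 0
x_b, y_b = rng.standard_normal(16000), np.array([0.0, 0.0, 1.0])  # class 2

x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=rng)
```

The mixed label is a soft target that still sums to 1, so it can be used with a cross-entropy loss that accepts probability targets.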
cVAE / ACGAN[7]
Reference
[1] https://ieeexplore.ieee.org/abstract/document/7472835
[2] Domestic Activities Classification Based on CNN Using Shuffling and Mixing Data Augmentation
[3] https://arxiv.org/pdf/1710.09412.pdf
[4] https://arxiv.org/pdf/1711.10282.pdf
[5] https://arxiv.org/pdf/1808.03883.pdf
[6] https://arxiv.org/pdf/1801.02929.pdf
[7] http://dcase.community/documents/challenge2019/technical_reports/DCASE2019_Zhang_34.pdf
[8] https://arxiv.org/pdf/1904.08779.pdf
[9] https://arxiv.org/pdf/2002.12231.pdf
[10] https://www.sciencedirect.com/science/article/abs/pii/S0003682X19309442
[11] https://arxiv.org/pdf/2001.01401.pdf
[12] https://arxiv.org/pdf/1904.05862.pdf
