Submit Search
Upload
[DL Hacks]Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
•
1 like
•
661 views
Deep Learning JP
Follow
2018/05/07 Deep Learning JP: http://deeplearning.jp/hacks/
Read less
Read more
Technology
Slideshow view
Report
Share
Slideshow view
Report
Share
1 of 26
Download now
Download to read offline
Recommended
自然言語処理 BERTに関する論文紹介とまとめ
自然言語処理 BERTに関する論文紹介とまとめ
KeisukeNakazono
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
Deep Learning JP
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
Deep Learning JP
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
Deep Learning JP
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
Deep Learning JP
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
Deep Learning JP
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM
Deep Learning JP
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
Deep Learning JP
Recommended
自然言語処理 BERTに関する論文紹介とまとめ
自然言語処理 BERTに関する論文紹介とまとめ
KeisukeNakazono
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
Deep Learning JP
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
Deep Learning JP
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
Deep Learning JP
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
Deep Learning JP
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
Deep Learning JP
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM
Deep Learning JP
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
Deep Learning JP
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
Deep Learning JP
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
Deep Learning JP
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
Deep Learning JP
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
Deep Learning JP
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
Deep Learning JP
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
Deep Learning JP
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
Deep Learning JP
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
Deep Learning JP
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
Deep Learning JP
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
Deep Learning JP
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
Deep Learning JP
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
Deep Learning JP
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
Deep Learning JP
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
Deep Learning JP
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
Deep Learning JP
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
Deep Learning JP
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
Deep Learning JP
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
Deep Learning JP
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】大量API・ツールの扱いに特化したLLM
Deep Learning JP
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
Deep Learning JP
クラウド時代におけるSREとUPWARDの取組ーUPWARD株式会社 CTO門畑
クラウド時代におけるSREとUPWARDの取組ーUPWARD株式会社 CTO門畑
Akihiro Kadohata
ロボットマニピュレーションの作業・動作計画 / rosjp_planning_for_robotic_manipulation_20240521
ロボットマニピュレーションの作業・動作計画 / rosjp_planning_for_robotic_manipulation_20240521
Satoshi Makita
More Related Content
More from Deep Learning JP
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
Deep Learning JP
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
Deep Learning JP
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
Deep Learning JP
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
Deep Learning JP
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
Deep Learning JP
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
Deep Learning JP
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
Deep Learning JP
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
Deep Learning JP
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
Deep Learning JP
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
Deep Learning JP
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
Deep Learning JP
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
Deep Learning JP
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
Deep Learning JP
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
Deep Learning JP
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
Deep Learning JP
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
Deep Learning JP
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
Deep Learning JP
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
Deep Learning JP
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】大量API・ツールの扱いに特化したLLM
Deep Learning JP
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
Deep Learning JP
More from Deep Learning JP
(20)
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
Recently uploaded
クラウド時代におけるSREとUPWARDの取組ーUPWARD株式会社 CTO門畑
クラウド時代におけるSREとUPWARDの取組ーUPWARD株式会社 CTO門畑
Akihiro Kadohata
ロボットマニピュレーションの作業・動作計画 / rosjp_planning_for_robotic_manipulation_20240521
ロボットマニピュレーションの作業・動作計画 / rosjp_planning_for_robotic_manipulation_20240521
Satoshi Makita
2024年5月25日Serverless Meetup大阪 アプリケーションをどこで動かすべきなのか.pptx
2024年5月25日Serverless Meetup大阪 アプリケーションをどこで動かすべきなのか.pptx
ssuserbefd24
20240523_IoTLT_vol111_kitazaki_v1___.pdf
20240523_IoTLT_vol111_kitazaki_v1___.pdf
Ayachika Kitazaki
5/22 第23回 Customer系エンジニア座談会のスライド 公開用 西口瑛一
5/22 第23回 Customer系エンジニア座談会のスライド 公開用 西口瑛一
瑛一 西口
論文紹介:ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
論文紹介:ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
Toru Tamaki
部内勉強会(IT用語ざっくり学習) 実施日:2024年5月17日(金) 対象者:営業部社員
部内勉強会(IT用語ざっくり学習) 実施日:2024年5月17日(金) 対象者:営業部社員
Sadaomi Nishi
Amazon Cognitoで実装するパスキー (Security-JAWS【第33回】 勉強会)
Amazon Cognitoで実装するパスキー (Security-JAWS【第33回】 勉強会)
keikoitakurag
論文紹介: Exploiting semantic segmentation to boost reinforcement learning in vid...
論文紹介: Exploiting semantic segmentation to boost reinforcement learning in vid...
atsushi061452
論文紹介: Offline Q-Learning on diverse Multi-Task data both scales and generalizes
論文紹介: Offline Q-Learning on diverse Multi-Task data both scales and generalizes
atsushi061452
Intranet Development v1.0 (TSG LIVE! 12 LT )
Intranet Development v1.0 (TSG LIVE! 12 LT )
iwashiira2ctf
論文紹介:Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers
論文紹介:Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers
Toru Tamaki
Recently uploaded
(12)
クラウド時代におけるSREとUPWARDの取組ーUPWARD株式会社 CTO門畑
クラウド時代におけるSREとUPWARDの取組ーUPWARD株式会社 CTO門畑
ロボットマニピュレーションの作業・動作計画 / rosjp_planning_for_robotic_manipulation_20240521
ロボットマニピュレーションの作業・動作計画 / rosjp_planning_for_robotic_manipulation_20240521
2024年5月25日Serverless Meetup大阪 アプリケーションをどこで動かすべきなのか.pptx
2024年5月25日Serverless Meetup大阪 アプリケーションをどこで動かすべきなのか.pptx
20240523_IoTLT_vol111_kitazaki_v1___.pdf
20240523_IoTLT_vol111_kitazaki_v1___.pdf
5/22 第23回 Customer系エンジニア座談会のスライド 公開用 西口瑛一
5/22 第23回 Customer系エンジニア座談会のスライド 公開用 西口瑛一
論文紹介:ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
論文紹介:ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
部内勉強会(IT用語ざっくり学習) 実施日:2024年5月17日(金) 対象者:営業部社員
部内勉強会(IT用語ざっくり学習) 実施日:2024年5月17日(金) 対象者:営業部社員
Amazon Cognitoで実装するパスキー (Security-JAWS【第33回】 勉強会)
Amazon Cognitoで実装するパスキー (Security-JAWS【第33回】 勉強会)
論文紹介: Exploiting semantic segmentation to boost reinforcement learning in vid...
論文紹介: Exploiting semantic segmentation to boost reinforcement learning in vid...
論文紹介: Offline Q-Learning on diverse Multi-Task data both scales and generalizes
論文紹介: Offline Q-Learning on diverse Multi-Task data both scales and generalizes
Intranet Development v1.0 (TSG LIVE! 12 LT )
Intranet Development v1.0 (TSG LIVE! 12 LT )
論文紹介:Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers
論文紹介:Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers
[DL Hacks]Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
1.
Power-Normalized Cepstral Coefficients (PNCC) for
Robust Speech Recognition 東京大学工学部システム創成学科Cコース B3 中村泰貴
2.
自己紹介 ・東京大学工学部システム創成学科Cコース B3 中村泰貴 ・音声(深層学習を絡めた)や信号処理の技術に興味あります ・今回が初回発表です...
3.
書誌情報 ・論文名 ・Power-Normalized Cepstral Coefficients
(PNCC) for Robust Speech Recognition ・著者 ・Chanwoo Kim(Google) ・Richard M Stern(Carnegie Mellon University) ・公開日 ・2016/06/24 ・論文URL ・http://www.cs.cmu.edu/ robust/Papers/ OnlinePNCC_V25.pdf
4.
背景 ・音声認識で用いられる特徴抽出 ・MFCCかmelspectrogramがほとんど ・別な特徴抽出方法はないのか... ・Robust性も欲しい!! ・試してみる価値はある deep speech2 PNCC!!!
5.
PNCCとは ・主な特徴 ・MFCCなどは対数を用いているのに対し、 PNCCは冪乗則を用いる ・雑音低減させるasymmetric filtering ・様々なタイプの雑音環境下、エコーがかかる環境下で MFCCやPLPより認識精度が向上 ・従来の特徴抽出との差異 ・計算コストがよりかかる ・clean音声でも認識精度が落ちない
6.
まずは結果から... LibriSpeech dev-cleanの音声に SNR=4[db]ほどのノイズを環境雑音を付加
7.
まずは結果から... mel spectrogram PNCC
8.
まずは結果から...
9.
PNCCの機構
10.
11.
Gammatone Frequency Integration ・Filtabank http://aidiary.hatenablog.com/ entry/20120225/1330179868
12.
13.
Medium-Time Power Calculation ・M
= 2 ・Pの移動平均 ・ガウスノイズに効果的
14.
15.
Asymmetric Noise Suppression floor
level noise を検出
16.
Asymmetric Noise Suppression 有声音などの励起関数によって 駆動されていないと思われる 信号にlowpass
filteringを 適用すると認識精度が向上する この動作は複数回のローパスフィルタに なるため音声のパワー係数をぼかし、 認識精度を低下させるため、音声セグメントに 対して適用しない
17.
Asymmetric Noise Suppression 信号がそれ自身の下側崩落線の定数倍より 小さいならばそれは励起されていないもの と考える c
= 2 がホワイトノイズに対して もっとも効果的
18.
Temporal masking 最終的なR[m, l]の値は... R[m,
l] = Rsp[m, l] (excitation) R[m, l] = Qf[m, l] (non-excitaion) となる
19.
20.
Weight Smoothing
21.
22.
Mean power normalization
23.
24.
Power Function nonlinearity MFCCによる処理 PNCCによる処理
25.
EXPERIMENTAL RESULTS (a)white noise (b)street
noise (c) background music (d) interfering speech (e) artificial reverberation
26.
Computational Complexity
Download now