[DL Hacks]Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition


Deep Learning JP

2018/05/07 Deep Learning JP: http://deeplearning.jp/hacks/

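Power-Normalized Cepstral
Coefficients (PNCC)
for Robust Speech Recognition
Course C, Department of Systems Innovation, Faculty of Engineering, The University of Tokyo
中村泰貴 (B3, third-year undergraduate)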

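Self-introduction
・中村泰貴, B3 (third-year undergraduate), Course C, Department of Systems Innovation, Faculty of Engineering, The University of Tokyo
・Interested in speech (including its combination with deep learning) and signal processing
・This is my first presentation...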

Bibliographic information
・Title: Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
・Authors: Chanwoo Kim (Google), Richard M. Stern (Carnegie Mellon University)
・Publication date: 2016/06/24
・Paper URL: http://www.cs.cmu.edu/~robust/Papers/OnlinePNCC_V25.pdf

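Background
・Feature extraction used in speech recognition
・Almost always MFCC or the mel spectrogram
・Are there no alternative feature extraction methods?
・Robustness is also desirable
・Worth trying
Deep Speech 2
PNCC!!!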

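What is PNCC?
・Main features
・Whereas MFCC and similar features use a logarithmic nonlinearity, PNCC uses a power law
・Asymmetric filtering that suppresses noise
・Higher recognition accuracy than MFCC or PLP under various types of noise and in reverberant environments
・Differences from conventional feature extraction
・Higher computational cost
・Recognition accuracy does not degrade on clean speech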

Results first...
Environmental noise was added to LibriSpeech dev-clean speech
at an SNR of about 4 dB
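
A minimal sketch of how noise can be mixed into speech at a target SNR such as the 4 dB condition above. The function name `mix_at_snr` and the synthetic stand-in signals are hypothetical; the slides do not say how the mixing was actually done.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches snr_db, then add it."""
    noise = np.resize(noise, speech.shape)        # repeat or trim noise to the speech length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # stand-in for an utterance
noise = rng.standard_normal(8000)                             # stand-in for environmental noise
noisy = mix_at_snr(speech, noise, snr_db=4.0)
```

In practice `speech` and `noise` would be loaded from audio files at the same sampling rate.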

Results first...
[Figure: mel spectrogram and PNCC]

How PNCC works

Gammatone Frequency Integration
・Filterbank
http://aidiary.hatenablog.com/entry/20120225/1330179868
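
A rough sketch of this stage: each gammatone channel weights the short-time power spectrum and the result is summed over frequency. The ERB formulas, the 4th-order magnitude approximation, and all parameter values (40 channels, 200 Hz to 8 kHz, 1024-point FFT, 16 kHz sampling) are assumptions for illustration and are not stated on the slide.

```python
import numpy as np

def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore form)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_rate(f_hz):
    return 21.4 * np.log10(4.37 * f_hz / 1000.0 + 1.0)

def inv_erb_rate(e):
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_weights(n_channels=40, n_fft=1024, sr=16000,
                      f_min=200.0, f_max=8000.0, order=4):
    """Approximate |H_l(f)|^2, shape (n_channels, n_fft // 2 + 1), plus centre frequencies."""
    # Centre frequencies equally spaced on the ERB-rate scale.
    fc = inv_erb_rate(np.linspace(erb_rate(f_min), erb_rate(f_max), n_channels))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    b = 1.019 * erb(fc)[:, None]                  # bandwidth parameter per channel
    h2 = (1.0 + ((freqs[None, :] - fc[:, None]) / b) ** 2) ** (-order)
    return h2, fc

def gammatone_power(frames, weights):
    """P[m, l] = sum_k |X[m, k]|^2 * |H_l[k]|^2 for windowed frames (one frame per row)."""
    n_fft = 2 * (weights.shape[1] - 1)
    spec_power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    return spec_power @ weights.T

weights, fc = gammatone_weights()
frames = np.random.default_rng(0).standard_normal((5, 400))  # 5 stand-in frames
P = gammatone_power(frames, weights)                          # shape (5, 40)
```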

Medium-Time Power Calculation
・M = 2
・Moving average of the short-time power P over 2M + 1 frames
・Effective against Gaussian noise
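
The moving average can be sketched as below, with P stored as a (frames, channels) array. Repeating the edge frames at the start and end of the utterance is an assumption; the slide does not say how the boundaries are handled.

```python
import numpy as np

def medium_time_power(P, M=2):
    """Q[m, l] = mean of P[m - M .. m + M, l] along the frame axis."""
    padded = np.pad(P, ((M, M), (0, 0)), mode="edge")   # boundary handling is an assumption
    kernel = np.ones(2 * M + 1) / (2 * M + 1)
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="valid"), 0, padded
    )

P = np.random.default_rng(0).random((10, 3))   # stand-in short-time power
Q = medium_time_power(P)                        # same shape as P
```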

Asymmetric Noise Suppression
Detects the floor-level noise
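
One way to track a noise floor is an asymmetric first-order lowpass filter that rises slowly and falls quickly, so its output hugs the lower envelope of the input. This is a sketch under assumptions: the update rule, the forgetting factors 0.999 and 0.5, and the 0.9 initialisation are not given on the slide.

```python
import numpy as np

def asymmetric_lowpass(q_in, lam_a=0.999, lam_b=0.5):
    """Track the lower envelope of q_in (frames x channels) along the frame axis."""
    q_out = np.empty_like(q_in, dtype=float)
    q_out[0] = 0.9 * q_in[0]                      # initialisation is an assumption
    for m in range(1, len(q_in)):
        rising = q_in[m] >= q_out[m - 1]
        lam = np.where(rising, lam_a, lam_b)      # slow when input is above, fast when below
        q_out[m] = lam * q_out[m - 1] + (1.0 - lam) * q_in[m]
    return q_out

q = np.ones((200, 1))
q[50:60] = 10.0                                   # a short burst on top of a constant floor
floor = asymmetric_lowpass(q)                     # stays close to the floor level
```

The burst barely moves the estimate because the filter reacts slowly to inputs above its current output.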

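Asymmetric Noise Suppression
Applying lowpass filtering to signal segments
that do not appear to be driven by
an excitation function such as voiced speech
improves recognition accuracy.
Because this operation amounts to repeated lowpass filtering,
it would blur the power coefficients of speech
and lower recognition accuracy,
so it is not applied to speech segments.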

Asymmetric Noise Suppression
If the signal is smaller than a constant multiple c
of its own lower envelope,
it is regarded as not excited.
c = 2 is the most effective
for white noise.
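
The excitation test above reduces to an elementwise comparison. In this sketch `Q` is the medium-time power and `Q_le` its lower envelope; both names and the toy values are hypothetical.

```python
import numpy as np

def excitation_mask(Q, Q_le, c=2.0):
    """True where Q[m, l] >= c * Q_le[m, l], i.e. the segment is treated as excited."""
    return Q >= c * Q_le

Q = np.array([[1.0, 5.0], [1.2, 0.9], [4.0, 1.0]])   # toy medium-time power
Q_le = np.ones_like(Q)                               # toy lower envelope
mask = excitation_mask(Q, Q_le)
```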

Temporal masking
The final value of R[m, l] is:
R[m, l] = Rsp[m, l] (excitation)
R[m, l] = Qf[m, l] (non-excitation)
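
A sketch of how Rsp could be computed and then combined with Qf. The peak-tracking rule and the constants 0.85 and 0.2 are assumptions, as are the function names; only the final selection between Rsp and Qf comes from the slide.

```python
import numpy as np

def temporal_masking(Q0, lam_t=0.85, mu_t=0.2):
    """Return R_sp: keep onsets, replace frames that fall below the decaying peak."""
    Qp = np.empty_like(Q0, dtype=float)           # online peak power per channel
    R_sp = np.empty_like(Q0, dtype=float)
    Qp[0] = Q0[0]
    R_sp[0] = Q0[0]
    for m in range(1, len(Q0)):
        decayed = lam_t * Qp[m - 1]
        Qp[m] = np.maximum(decayed, Q0[m])
        R_sp[m] = np.where(Q0[m] >= decayed, Q0[m], mu_t * Qp[m - 1])
    return R_sp

def select_output(R_sp, Q_f, excited):
    """R[m, l] = R_sp[m, l] where excited, Q_f[m, l] otherwise."""
    return np.where(excited, R_sp, Q_f)

Q0 = np.array([[1.0], [10.0], [1.0], [9.0]])      # toy noise-subtracted power
R_sp = temporal_masking(Q0)
Q_f = np.full_like(Q0, 0.5)                       # toy floor power
excited = np.array([[True], [True], [False], [True]])
R = select_output(R_sp, Q_f, excited)
```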

Weight Smoothing
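
The slide gives only the title for this stage. One plausible reading, stated here as an assumption, is that the per-channel gain R / Q is averaged over neighbouring channels before being applied to P. The window half-width N = 4 and the function name are hypothetical.

```python
import numpy as np

def smooth_weights(R, Q, N=4, eps=1e-12):
    """S[m, l] = mean of R / Q over channels l - N .. l + N, clipped to the valid range."""
    ratio = R / np.maximum(Q, eps)                # eps guards against division by zero
    n_ch = ratio.shape[1]
    S = np.empty_like(ratio)
    for l in range(n_ch):
        lo, hi = max(l - N, 0), min(l + N, n_ch - 1)
        S[:, l] = ratio[:, lo:hi + 1].mean(axis=1)
    return S

Q = np.ones((2, 10))
R = np.tile(np.arange(10.0), (2, 1))              # toy values so the ratio equals the channel index
S = smooth_weights(R, Q)
```

Under this reading the smoothed weights would then scale the short-time power, for example `T = P * S`.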

Mean power normalization
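
The slide gives only the title here as well. A minimal sketch, assuming the normalisation divides each frame by a running mean of the per-frame power; the forgetting factor 0.999, the constant k, and the initialisation from the first frame are all assumptions.

```python
import numpy as np

def mean_power_normalize(T, lam_mu=0.999, k=1.0, mu_init=None):
    """U[m, l] = k * T[m, l] / mu[m], where mu is a running mean of the frame power."""
    frame_power = T.mean(axis=1)
    mu = frame_power[0] if mu_init is None else mu_init
    U = np.empty_like(T, dtype=float)
    for m in range(len(T)):
        mu = lam_mu * mu + (1.0 - lam_mu) * frame_power[m]
        U[m] = k * T[m] / mu
    return U

T = np.random.default_rng(0).random((50, 40)) + 0.1   # stand-in channel powers
U = mean_power_normalize(T)
U_loud = mean_power_normalize(100.0 * T)              # same utterance at a higher level
```

With this initialisation the output does not depend on the overall input level, which is the purpose of the step.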

Power Function nonlinearity
MFCC processing
PNCC processing
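
The difference between the two nonlinearities can be seen numerically. The exponent 1/15 is an assumption not shown on the slide, and the function names are hypothetical.

```python
import numpy as np

def power_law(U, exponent=1.0 / 15.0):
    """Power-law compression as used in the PNCC-style pipeline."""
    return U ** exponent

def log_compress(U, eps=1e-12):
    """Logarithmic compression as used in the MFCC-style pipeline."""
    return np.log(np.maximum(U, eps))

U = np.array([0.0, 1e-8, 1e-4, 1.0, 1e4])
print(power_law(U))
print(log_compress(U))
```

For very small power values the logarithm swings to large negative numbers, while the power law stays bounded and goes to zero, so low-energy regions dominated by noise have less influence on the features.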

EXPERIMENTAL RESULTS
(a) white noise
(b) street noise
(c) background music
(d) interfering speech
(e) artificial reverberation

Computational Complexity

