Super-resolution in sound field recording and
reproduction based on sparse representation
Shoichi Koyama1,2, Naoki Murata1, and Hiroshi Saruwatari1
1The University of Tokyo
2Paris Diderot University / Institut Langevin
November 29, 2016
Sound field reproduction for audio system
(Figure: microphone array in the recording area → loudspeaker array in the target area)
 A large listening area can be achieved
 Listeners can perceive source distance
 Real-time recording and reproduction can be achieved without recording engineers
Sound field reproduction for audio system
 Applications: telecommunication systems (over a network), home theatre, live broadcasting
Sound field reproduction for audio system
Improve reproduction accuracy when # of array elements is small
 # of microphones > # of loudspeakers
– Higher reproduction accuracy within local region of target area
 # of microphones < # of loudspeakers
– Higher reproduction accuracy of sources in local region of recording area
[Koyama+ IEEE JSTSP2015], [Koyama+ ICASSP 2014, 2015]
[Ahrens+ AES Conv. 2010], [Ueno+ ICASSP 2017 (submitted)]
Sound Field Recording and Reproduction
Obtain driving signals of the secondary sources (= loudspeakers) arranged on the boundary of the target area so that the desired sound field is reconstructed inside it
 In principle, both the sound pressure and its gradient on the boundary are required to obtain the driving signals, but usually only the sound pressure is known
 Signal conversion for sound field recording and reproduction with ordinary acoustic sensors and transducers is therefore necessary
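For reference, the pressure-and-gradient requirement comes from boundary-integral representations such as the Kirchhoff-Helmholtz integral; the sketch below uses assumed notation and a common sign convention, and is not copied from the slide:

p(\mathbf{r},\omega) = \oint_{\partial\Omega} \left[ G(\mathbf{r}\,|\,\mathbf{r}',\omega)\, \frac{\partial p(\mathbf{r}',\omega)}{\partial n} - p(\mathbf{r}',\omega)\, \frac{\partial G(\mathbf{r}\,|\,\mathbf{r}',\omega)}{\partial n} \right] \mathrm{d}S', \qquad \mathbf{r} \in \Omega

Both p and its normal derivative on the boundary ∂Ω appear, which is why pressure-only (omnidirectional) microphone recordings require an additional signal-conversion step.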
Conventional: WFR filtering method
(Figure: primary sources and receiving plane in the recording area → signal conversion → secondary source plane in the target area)
[Koyama+ IEEE TASLP 2013]
Received signals and driving signals are both represented as plane-wave (spatial frequency) spectra
Each plane-wave component determines the entire sound field
Signal conversion can therefore be performed in the spatial frequency domain
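Schematically, the conversion reduces to a multiplication per spatial frequency; the exact WFR filter of [Koyama+ IEEE TASLP 2013] is not reproduced here, and the notation is assumed:

\tilde{D}(\mathbf{k}_{\parallel},\omega) = \tilde{F}_{\mathrm{WFR}}(\mathbf{k}_{\parallel},\omega)\, \tilde{P}(\mathbf{k}_{\parallel},\omega)

where \tilde{P} is the spatial Fourier transform of the received pressure on the receiving plane, \tilde{F}_{\mathrm{WFR}} the conversion (WFR) filter, and \tilde{D} the spectrum of the loudspeaker driving signals.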
Conventional: WFR filtering method
Spatial aliasing artifacts arise from the plane wave decomposition
Significant error at high frequencies, even when # of microphones < # of loudspeakers
[Koyama+ IEEE TASLP 2013]
Sound field representation for super-resolution
 Plane wave decomposition suffers from spatial aliasing artifacts because many basis functions are used
 Observed signals should be represented by a few basis functions for accurate interpolation of the sound field
 Appropriate basis functions may be close to the pressure distributions originating from the sound sources
 To obtain the driving signals of the loudspeakers, the basis functions must be fundamental solutions of the Helmholtz equation (e.g., Green's functions)
Sound field decomposition into fundamental solutions of the Helmholtz equation is necessary
Generative model of sound field
 Inhomogeneous and homogeneous Helmholtz equations
 Sound field is divided into two regions: a region containing the distribution of source components and a source-free region
[Koyama+ ICASSP 2014]
Generative model of sound field
 Inhomogeneous and homogeneous Helmholtz equations
 Sound field = inhomogeneous term (Green's functions weighted by the source distribution) + homogeneous term (plane waves)
[Koyama+ ICASSP 2014]
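A minimal equation-form sketch of this generative model, with symbols assumed here rather than taken from the slide:

p(\mathbf{r},\omega) = \underbrace{\int_{\Omega_{\mathrm{src}}} G(\mathbf{r}\,|\,\mathbf{r}',\omega)\, q(\mathbf{r}',\omega)\, \mathrm{d}\mathbf{r}'}_{\text{inhomogeneous term (source components)}} + \underbrace{\sum_{\nu} a_{\nu}(\omega)\, e^{-\mathrm{j} \mathbf{k}_{\nu} \cdot \mathbf{r}}}_{\text{homogeneous term (ambient plane waves)}}

where q is the distribution of source components and each plane-wave wavenumber satisfies |\mathbf{k}_{\nu}| = \omega / c.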
Generative model of sound field
 Observe the sound pressure distribution on the receiving plane
 Conversion into driving signals:
– Direct source components: synthesize monopole sources [Spors+ AES Conv. 2008]
– Ambient components: apply the WFR filtering method [Koyama+ IEEE TASLP 2013]
Decomposition into these two components can lead to higher reproduction accuracy above the spatial Nyquist frequency
Sparse sound field representation
(Figure: microphone array observation, discretization of source components onto grid points, and sparsity-based signal decomposition)
Observed signal = (dictionary matrix of Green's functions) × (distribution of source components) + ambient components
Only a few elements of the source-component vector have non-zero values under the assumption of a spatially sparse source distribution
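In vector form this representation can be sketched as follows (the vector and matrix symbols are assumptions, not the slide's own notation):

\mathbf{y} = \mathbf{D}\mathbf{x} + \mathbf{z}

where \mathbf{y} \in \mathbb{C}^{M} stacks the M microphone signals at one temporal frequency, \mathbf{D} \in \mathbb{C}^{M \times N} is the dictionary of Green's functions from the N grid points to the microphones, \mathbf{x} is the discretized (sparse) distribution of source components, and \mathbf{z} is the ambient component.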
Sparse signal decomposition
 Sparse signal representation in vector form
 Signal decomposition based on the sparsity of the source-component vector
Minimize a sparsity-inducing norm of the source-component vector
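A generic sparsity-based decomposition of this kind can be written as follows; the norm order used on the slide is not reproduced here (ℓ_p with 0 < p ≤ 1 is typical for FOCUSS-type methods), and how the ambient component is handled (as the residual or via a separate dictionary) is likewise left open:

\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{x}\|_{p} \quad \text{s.t.} \quad \|\mathbf{y} - \mathbf{D}\mathbf{x}\|_{2} \leq \epsilon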
Group sparsity based on physical properties
 Sparse signal representation in vector form, with a structure of sparsity induced by physical properties of the sound field
 Group sparse signal models for robust decomposition, with groups formed over:
• Multiple time frames
• Temporal frequencies
• Multipole components
 Decomposition algorithm extending FOCUSS [Koyama+ ICASSP 2015] (an illustrative sketch follows below)
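As an illustration of this kind of group-sparse decomposition, here is a minimal iteratively reweighted least-squares (M-FOCUSS-style) sketch with groups taken over multiple time frames. It is not the authors' exact algorithm from [Koyama+ ICASSP 2015]; all function names, parameter values, and the toy data are assumptions.

import numpy as np

def group_focuss(D, Y, p=0.8, n_iter=50, lam=1e-3, eps=1e-8):
    """Iteratively reweighted least-squares (M-FOCUSS-style) solver that
    promotes row-wise (group) sparsity of X in Y ~ D @ X.

    D : (M, N) complex dictionary (e.g., Green's functions, mics x grid points)
    Y : (M, T) observations over T time frames; one group = one row of X
    p : sparsity exponent, 0 < p <= 1 (smaller p -> stronger sparsity)
    """
    M, N = D.shape
    w = np.ones(N)                                  # per-group weights
    X = np.zeros((N, Y.shape[1]), dtype=complex)
    for _ in range(n_iter):
        W = np.diag(w)                              # diagonal weighting matrix
        DW = D @ W
        # Tikhonov-regularized weighted minimum-norm solution
        G = DW @ DW.conj().T + lam * np.eye(M)
        X = W @ (DW.conj().T @ np.linalg.solve(G, Y))
        # Update weights from the l2 norm of each group (row of X)
        g = np.linalg.norm(X, axis=1)
        w = (g + eps) ** (1.0 - p / 2.0)
    return X

# Toy usage: 16 mics, 64 grid points, 8 time frames, 3 active sources
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64)) + 1j * rng.standard_normal((16, 64))
X_true = np.zeros((64, 8), dtype=complex)
X_true[[5, 20, 41], :] = rng.standard_normal((3, 8)) + 1j * rng.standard_normal((3, 8))
Y = D @ X_true
X_hat = group_focuss(D, Y)
print(np.argsort(np.linalg.norm(X_hat, axis=1))[-3:])   # indices of the strongest groups

Grouping the rows of X across time frames encodes the physical assumption that the source positions stay fixed over those frames; the other groupings listed on the slide (temporal frequencies, multipole components) would be handled analogously by redefining the groups.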
Block diagram of signal conversion
 Decomposition stage
– Group sparse decomposition of the observed signals into source and ambient components
 Reconstruction stage
– The source and ambient components are separately converted into driving signals
– The final driving signal is obtained as the sum of the two components (see the schematic below)
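Schematically, with operator symbols assumed here rather than taken from the slide:

\mathbf{d} = \mathcal{T}_{\mathrm{mono}}(\hat{\mathbf{x}}) + \mathcal{T}_{\mathrm{WFR}}(\hat{\mathbf{z}})

where \hat{\mathbf{x}} and \hat{\mathbf{z}} are the estimated source and ambient components, \mathcal{T}_{\mathrm{mono}} denotes driving-signal synthesis of monopole sources [Spors+ AES Conv. 2008], \mathcal{T}_{\mathrm{WFR}} the WFR filtering of the ambient part, and \mathbf{d} the resulting loudspeaker driving signals.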
Simulation Experiment
 Proposed method (Proposed), WFR filtering method (WFR), and Sound Pressure Control method (SPC) were compared
 32 microphones (6 cm intervals) and 48 loudspeakers (4 cm intervals) (see the spatial-Nyquist note below)
 Rectangular region of 2.4 m x 2.4 m; grid points at (10 cm, 20 cm) intervals
 Source directivity: unidirectional
 Source signal: single-frequency sine wave
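A rough spatial-Nyquist estimate for these spacings, added here for context (assuming the half-wavelength criterion and a sound speed of roughly 343 m/s; this calculation is not on the slide):

f_{\mathrm{Nyq}} \approx \frac{c}{2\,\Delta x}: \qquad \frac{343}{2 \times 0.06} \approx 2.9\ \mathrm{kHz}\ \text{(microphones)}, \qquad \frac{343}{2 \times 0.04} \approx 4.3\ \mathrm{kHz}\ \text{(loudspeakers)}

This is consistent with the later results: 1.0 kHz lies below the microphone-array limit, while 4.0 kHz lies above it.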
Simulation Experiment
 Evaluation measure: signal-to-distortion ratio of reproduction (SDRR), comparing the reproduced pressure distribution in the target area with the original pressure distribution in the recording area
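The SDRR is typically defined as follows in the authors' related papers; the slide's exact expression is not reproduced here, so treat this as the standard form:

\mathrm{SDRR} = 10 \log_{10} \frac{\int_{\Omega} |p_{\mathrm{des}}(\mathbf{r},\omega)|^{2}\, \mathrm{d}\mathbf{r}}{\int_{\Omega} |p_{\mathrm{rep}}(\mathbf{r},\omega) - p_{\mathrm{des}}(\mathbf{r},\omega)|^{2}\, \mathrm{d}\mathbf{r}}\ \ [\mathrm{dB}]

where p_des is the original (desired) pressure distribution and p_rep the reproduced one.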
Frequency vs. SDRR
SDRRs above the spatial Nyquist frequency were improved
 Source location: (-0.32, -0.84, 0.0) m
Reproduced sound pressure distribution (1.0 kHz)
(Figure: pressure and error distributions for Proposed, WFR, and SPC)
 Source location: (-0.32, -0.84, 0.0) m
SDRR: Proposed 18.1 dB, WFR 18.0 dB, SPC 19.4 dB
Reproduced sound pressure distribution (4.0 kHz)
(Figure: pressure and error distributions for Proposed, WFR, and SPC)
 Source location: (-0.32, -0.84, 0.0) m
SDRR: Proposed 19.7 dB, WFR 6.8 dB, SPC 7.8 dB
Frequency response of reproduced sound field
 Frequency response at (0.0, 1.0, 0.0) m
Reproduced frequency response was improved
Conclusion
 Super-resolution sound field recording and reproduction based on sparse representation
– Conventional plane wave decomposition suffers from spatial aliasing artifacts
– Sound field representation using source and plane wave components
– Sound field decomposition based on spatial sparsity of the source components
– Group sparsity based on physical properties of the sound field
– Experimental results indicated that reproduction accuracy above the spatial Nyquist frequency can be improved
Thank you for your attention!
Related publications
• S. Koyama and H. Saruwatari, “Sound field decomposition in reverberant environment using sparse and low-rank signal models,” Proc. IEEE ICASSP, 2016.
• N. Murata, S. Koyama, et al., “Sparse sound field decomposition with multichannel extension of complex NMF,” Proc. IEEE ICASSP, 2016.
• S. Koyama, et al., “Sparse sound field decomposition using group sparse Bayesian learning,” Proc. APSIPA ASC, 2015.
• N. Murata, S. Koyama, et al., “Sparse sound field decomposition with parametric dictionary learning for super-resolution recording and reproduction,” Proc. IEEE CAMSAP, 2015.
• S. Koyama, et al., “Source-location-informed sound field recording and reproduction with spherical arrays,” Proc. IEEE WASPAA, 2015.
• S. Koyama, et al., “Source-location-informed sound field recording and reproduction,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 5, pp. 881-894, 2015.
• S. Koyama, et al., “Structured sparse signal models and decomposition algorithm for super-resolution in sound field recording and reproduction,” Proc. IEEE ICASSP, 2015.
• S. Koyama, et al., “Sparse sound field representation in recording and reproduction for reducing spatial aliasing artifacts,” Proc. IEEE ICASSP, 2014.
Editor's Notes

  1. These are the reproduced sound pressure and time-averaged squared error distributions of the three methods at 1 kHz. The white region indicates the region of high reproduction accuracy. Because 1.0 kHz is below the spatial Nyquist frequency, the sound field was accurately reproduced using these three methods.