Brain signal seminar
1. Scientific Reports - Nature
Speech synthesis from ECoG using
densely connected 3D convolutional neural
networks
-
Miguel Angrick, Christian Herff, Emily Mugler, Matthew C. Tate,
Marc W. Slutzky, Dean J. Krusienski and Tanja Schultz
16 April 2019
NAFIZ ISHTIAQUE AHMED
LITERATURE REVIEW
UOU – UNIVERSITY OF ULSAN
2. Introduction
Simple linear models cannot capture the relationship between neural
activity and continuous spoken speech.
The article shows that deep neural networks can be used to map ECoG
to generate speech.
4. Introduction
ECoG signals, which provide the necessary temporal and spatial
resolution, could offer a fast and natural means of communication
for people with neurological diseases.
However, simple linear models are not sufficient to capture the
relation between neural activity and continuous spoken speech.
Thus, deep neural networks can be used to map ECoG from speech
production areas onto an intermediate representation of speech.
5. Introduction
Brainstem stroke can result in a loss of the ability to speak.
A BCI based on ECoG is particularly well suited for decoding speech
processes from invasively measured brain activity.
Densely connected convolutional neural networks are applied to ECoG
data, reconstructing high-quality audio from neural signals during
speech production.
6. Experiment
ECoG was recorded from six native English-speaking participants. All
subjects had normal speech and language function and normal hearing.
ECoG was recorded with a medium-density, 64-channel, 8
× 8 electrode grid.
Participants read between 244 and 372 single words shown
to them on a screen.
7. Architecture of the decoding approach
ECoG features for each time window are fed into a DenseNet regression
model to reconstruct the logarithmic mel-scaled spectrogram. WaveNet
is then used to reconstruct an audio waveform from the spectrogram.
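The defining idea of DenseNet is dense connectivity: every layer receives the concatenation of all preceding feature maps as input. A minimal sketch of that connectivity pattern in plain NumPy (not the paper's implementation; layer count, growth rate, and the use of simple linear layers with ReLU are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_block(x, num_layers, growth_rate):
    """DenseNet-style block: each layer sees the concatenation of all
    previous feature maps (toy linear layers + ReLU for illustration)."""
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=-1)      # dense connectivity
        w = rng.standard_normal((inp.shape[-1], growth_rate)) * 0.1
        out = np.maximum(inp @ w, 0.0)               # ReLU activation
        features.append(out)
    return np.concatenate(features, axis=-1)

# Toy "ECoG feature" vector for one time window (e.g. 64 channels)
x = rng.standard_normal((1, 64))
y = dense_block(x, num_layers=4, growth_rate=12)
print(y.shape)  # (1, 64 + 4*12) = (1, 112)
```

In the actual model, 3D convolutions over the electrode grid and time replace these toy linear layers, and a final regression head maps the concatenated features to the log mel-scaled spectrogram bins.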
8. Reconstruction performance
(a) Pearson correlation coefficients between original and reconstructed
spectrograms for each participant. Bars indicate the mean over all
logarithmic mel-scaled coefficients.
(b) Detailed performance across all spectral bins for participant 5.
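The evaluation above computes a Pearson correlation per spectral bin and then averages over bins. A small sketch of that metric on synthetic data (shapes and the noisy-copy "reconstruction" are illustrative assumptions, not the paper's data):

```python
import numpy as np

def pearson_per_bin(original, reconstructed):
    """Pearson r between original and reconstructed spectrograms,
    computed independently for each spectral bin (column)."""
    rs = []
    for b in range(original.shape[1]):
        o = original[:, b]
        r = reconstructed[:, b]
        num = np.sum((o - o.mean()) * (r - r.mean()))
        den = np.sqrt(np.sum((o - o.mean()) ** 2) * np.sum((r - r.mean()) ** 2))
        rs.append(num / den)
    return np.array(rs)

rng = np.random.default_rng(1)
orig = rng.standard_normal((200, 40))                        # 200 frames, 40 mel bins
recon = 0.7 * orig + 0.3 * rng.standard_normal((200, 40))    # noisy "reconstruction"
rs = pearson_per_bin(orig, recon)
mean_r = rs.mean()  # mean over all mel coefficients, as in the bar plot
```

Averaging per-bin correlations (rather than correlating the flattened spectrograms) matches the per-coefficient bars shown for each participant.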
9. Reconstruction example for visual inspection
(a) A time-aligned excerpt of the original and reconstructed
spectrograms for participant 5.
(b) The generated waveform representation of the same excerpt as in
the spectrogram comparison.
10. Discussion
It is evident that the model has learned a distinguishable
representation between silence and acoustic speech, and captures
many of the intricate dynamics of human speech.
The network transforms the measured brain activity into spectral
features of speech. Correlations up to r = 0.69 across all frequency
bands were achieved by this network.
This is the first time that high-quality audio of speech has been
reconstructed from neural recordings of speech production using
deep neural networks.