This document summarizes a presentation about real-time pitch detection and speech recognition in Python. It introduces a system that performs multilingual lyric transcription and pitch detection on a singing voice in real time. The system acquires audio data, computes a spectrogram for pitch detection, sends the audio to a speech recognition API, and displays the transcribed lyrics together with the tracked pitch. The presentation demonstrates the system recognizing lyrics and tracking the pitch of songs sung in different languages.
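The spectrogram step described above can be illustrated with a minimal sketch. It is not the presenter's code: the sample rate, frame size, and function names are assumptions made for this example, and a synthetic sine tone stands in for microphone input. It computes one spectrogram column (a windowed DFT magnitude) in pure Python and takes the strongest bin as the pitch estimate.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, assumed for this sketch
FRAME_SIZE = 512     # samples per analysis frame, assumed


def synth_frame(freq_hz, n=FRAME_SIZE, rate=SAMPLE_RATE):
    """Return one frame of a pure sine tone, standing in for microphone input."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def magnitude_spectrum(frame):
    """One spectrogram column: Hann-windowed DFT magnitudes for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def peak_frequency(frame, rate=SAMPLE_RATE):
    """Estimate pitch as the frequency of the strongest non-DC bin."""
    mags = magnitude_spectrum(frame)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return k * rate / len(frame)


estimate = peak_frequency(synth_frame(440.0))
```

With these assumed values the frequency resolution is 8000 / 512, about 15.6 Hz per bin, so the estimate can only land on a bin center. A real singing voice also has strong harmonics, so the strongest bin is not always the fundamental; a practical system would need something more robust than this peak pick.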
This document introduces Renyuan Lyu and his research on real-time speech recognition and pitch detection in Python. It discusses Lyu's background and credentials, including being a professor in Taiwan and a speaker at PyCon JP. It then provides an overview of his research, including multi-threading for audio processing, pitch detection algorithms, and speech recognition with Google APIs. Examples of demonstrations integrating these techniques are provided.
This document introduces Renyuan Lyu and his research on real-time speech recognition and pitch detection in Python. It summarizes his work developing a system that performs multilingual lyric transcription and pitch detection on a singing voice in real time. It describes the technical components of the system, including audio signal processing, pitch detection algorithms, speech recognition using Google APIs, and a multi-threaded design that handles the real-time aspects.
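One common way to structure the multi-threading mentioned above is a producer/consumer pair joined by a queue. The sketch below is an assumption about the general shape, not the design from the talk: `capture` stands in for an audio callback and `analyze` for pitch detection, and both names and the dummy frames are hypothetical.

```python
import queue
import threading


def capture(frames_out, n_frames):
    """Producer: stands in for a microphone callback; emits numbered dummy frames."""
    for i in range(n_frames):
        frames_out.put([i] * 4)   # placeholder samples, not real audio
    frames_out.put(None)          # sentinel: no more audio


def analyze(frames_in, results):
    """Consumer: stands in for pitch detection; here it just sums each frame."""
    while True:
        frame = frames_in.get()
        if frame is None:
            break
        results.append(sum(frame))


frames = queue.Queue(maxsize=8)   # bounded, so a slow consumer blocks the producer
results = []
worker = threading.Thread(target=analyze, args=(frames, results))
worker.start()
capture(frames, 5)
worker.join()
```

Keeping capture and analysis on separate threads means a slow analysis step does not stall audio acquisition until the queue fills. Whether to block or drop frames at that point is a design choice that this sketch does not settle.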
This document discusses pitch detection in singing evaluation for karaoke scoring. It introduces pitch detection and evaluation, describing the standard pitch A440 (440 Hz). It discusses using dynamic time warping with difflib to compare a student's singing pitch to a teacher's sample; the score is the ratio of matching blocks between the two sequences. The document also lists the Python modules used, such as PyAudio and Pygame for audio input/output, and Tkinter and VPython for the GUI. It provides a link to sample kernel code using autocorrelation for pitch detection in the frequency domain.
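The scoring idea can be sketched as follows, under stated assumptions. Each pitch value is quantized to the nearest equal-tempered note relative to A4 = 440 Hz, and the two note sequences are compared with `difflib.SequenceMatcher`, whose `ratio()` is 2*M/T (M matched elements, T total elements in both sequences). Note that `SequenceMatcher` finds matching blocks rather than performing dynamic time warping in the strict sense. The function names and sample pitch values are hypothetical.

```python
import difflib
import math

A4_HZ = 440.0
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_note(freq_hz):
    """Map a frequency to the nearest equal-tempered note name, with A4 = 440 Hz."""
    midi = round(69 + 12 * math.log2(freq_hz / A4_HZ))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def score(teacher_hz, student_hz):
    """Similarity of two pitch tracks as the ratio of matching note blocks."""
    teacher = [hz_to_note(f) for f in teacher_hz]
    student = [hz_to_note(f) for f in student_hz]
    return difflib.SequenceMatcher(None, teacher, student).ratio()


# Hypothetical pitch tracks for the opening of "Twinkle, twinkle, little star".
teacher_track = [261.63, 261.63, 392.00, 392.00, 440.00, 440.00, 392.00]
student_track = [262.00, 262.00, 392.00, 392.00, 440.00, 466.16, 392.00]
result = score(teacher_track, student_track)
```

Quantizing to note names before matching makes the score tolerant of small tuning deviations (262 Hz still counts as C4) while still penalizing a wrong note.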
This document describes a machine translation system called CguTranslate that translates Python programs from English to Traditional Chinese. It summarizes the challenges in machine translation including issues with identifiers, function names, keywords, and symbols. It then outlines the approach taken by CguTranslate which involves separate translation modules for names, documentation, modules, and programs. Statistics are provided showing over 60% of items were successfully translated. The goal of the system is to enable people to code in their native language.
Translation of Python Program into non-English Languages for Learners without English Proficiency,
a talk at PyCon Japan 2015, by Renyuan Lyu from Taiwan
This document describes a presentation given in Taiwan on translating Python programs into non-English languages to improve readability for non-native English speakers. It provides an example of translating 18 Turtle graphics demo programs into Traditional Chinese. The document discusses how Python 3 allows non-ASCII characters in identifiers and uses UTF-8 as the default source encoding. It shows examples of Python code in both English and Traditional Chinese and argues that code written in one's native language can be more readable and compact.
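A small illustration of non-ASCII identifiers in Python 3 follows. The Traditional Chinese names are made up for this example and are not taken from the talk; keywords such as `def` and `return` remain in English.

```python
# Traditional Chinese identifiers; the names are illustrative, not from the talk.
def 平方(數):
    """Return the square of a number (平方 = "square", 數 = "number")."""
    return 數 * 數


結果 = [平方(n) for n in range(4)]   # 結果 = "result"
```

Only the identifiers are translated here, which is why translating keywords and library names needs separate handling.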
This document describes a karaoke-style read-aloud system that uses speech alignment and text-to-speech technology. A text-to-speech API generates an audio file from text, and the audio is then aligned with the text using the Hidden Markov Model Toolkit (HTK) to create a timed text file. This allows text to be highlighted as it is read, as in a karaoke system, and has applications in language learning because it lets learners shadow the speech. The process involves text preprocessing, audio generation and processing, phonetic transcription, forced alignment with HTK, and output of a timed text file.
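Once a timed text file exists, highlighting reduces to looking up which word spans the current playback time. The sketch below assumes a hypothetical in-memory format of (start, end, word) entries sorted by start time; the actual file format produced by the described system is not given here, and the timings are invented.

```python
import bisect

# Hypothetical timed-text entries: (start_seconds, end_seconds, word).
TIMED_TEXT = [
    (0.00, 0.45, "Twinkle,"),
    (0.45, 0.90, "twinkle,"),
    (0.90, 1.30, "little"),
    (1.30, 2.00, "star,"),
]
STARTS = [start for start, _, _ in TIMED_TEXT]


def word_at(t):
    """Return the word being spoken at time t, or None outside any word."""
    i = bisect.bisect_right(STARTS, t) - 1   # last entry starting at or before t
    if i < 0:
        return None
    start, end, word = TIMED_TEXT[i]
    return word if t < end else None
```

A player would call `word_at` with the current playback position on each screen refresh and highlight the returned word.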
PyCon APAC 2014, Taipei, Taiwan
A real-time audio spectrogram in Python 3,
importing PyAudio, Pygame, and Pylab
with comments on native language programming
4. Multilingual Lyrics
「キラキラ星」
きらきら光る お空の星よ
瞬きしては 皆を見てる
きらきら光る お空の星よ
“Twinkle Star”
Twinkle, twinkle, little star,
How I wonder what you are.
Up above the world so high,
Like a diamond in the sky.
Twinkle, twinkle, little star,
How I wonder what you are!
【小星星】
一閃一閃亮晶晶,滿天都是小星星
掛在天上放光明,好像許多小眼睛
一閃一閃亮晶晶,滿天都是小星星
Starting the demo... https://youtu.be/n4tEBu4mUMA
5. Come to the Poster Session for Details
Thank you for your attention.
Please come to see the poster session tomorrow.
Renyuan LYU
CGU, TAIWAN