InterSpeech 2012
13th Annual Conference of the International Speech Communication Association
September 9-13, 2012 | Portland, Oregon
ISSN:
1990-9770
Computer-Assisted Language
Learning (CALL) Systems
Overview
Computer-assisted language learning (CALL) provides an effective
learning environment so that students can practice in an interactive
manner using multi-media content, either with the supervision of
teachers or at their own pace in self-learning. The advancement of
speech and language technologies has opened new perspectives on
CALL systems, such as automatic pronunciation assessment and
simulated conversational-style lessons. CALL is also regarded as one
of the new and promising applications of speech analysis, recognition and
synthesis. CALL covers a variety of aspects including segmental,
prosodic and lexical features. Modeling non-native speech to correctly
segment/recognize utterances while detecting errors included in them
poses a number of challenges in speech processing. Assessing
intelligibility of non-native speech or proficiency of non-native
speakers is also an important issue. In this tutorial, we will give an
overview of these issues and current solutions. The tutorial is mainly
targeted at speech researchers and engineers interested in CALL, but
also at those engaged in language teaching or learning technology.
First we review speech recognition technologies for pronunciation
learning, specifically pronunciation evaluation and error detection.
Statistical approaches to these problems are formulated, and then
acoustic and pronunciation modeling of non-native speech is described.
Unlike conventional non-native speech recognition, CALL requires an
error detection capability, so an effective error prediction scheme is
vitally important. Next, we address the modeling and evaluation of
prosodic features such as duration, stress and tones, and then the use
of speech synthesis technologies including re-synthesis and morphing.
After the review of basic component technologies, we introduce a
number of practical CALL systems which have been developed as
commercial products or deployed in classrooms, including those in our
universities. The majority of them focus on learning English as a second
language (ESL), but some deal with other languages such as Japanese
and Chinese. We also review databases of non-native speech, which
are necessary to develop CALL systems.
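As an illustration of the statistical view of pronunciation evaluation and error detection mentioned above, the following is a minimal sketch of one commonly used idea: score each phone segment by a frame-normalised log-likelihood ratio between the expected (canonical) phone and the best-scoring phone, and flag segments whose score falls below a threshold. This is not necessarily the formulation presented in the tutorial. The class and function names, the threshold value, and the log-likelihood numbers are all hypothetical; in a real system the log-likelihoods would come from an acoustic model after forced alignment.

```python
from dataclasses import dataclass


@dataclass
class PhoneSegment:
    canonical: str         # phone the learner was expected to produce
    num_frames: int        # segment length in frames
    log_likelihoods: dict  # phone label -> acoustic log-likelihood (hypothetical values)


def pronunciation_score(seg: PhoneSegment) -> float:
    """Frame-normalised log-likelihood ratio: canonical phone vs. best-scoring phone."""
    best = max(seg.log_likelihoods.values())
    return (seg.log_likelihoods[seg.canonical] - best) / seg.num_frames


def detect_errors(segments, threshold=-0.5):
    """Return the canonical phones whose score falls below the (assumed) threshold."""
    return [s.canonical for s in segments if pronunciation_score(s) < threshold]


# Toy input: the first segment is better explained by "l" than by the expected "r".
segments = [
    PhoneSegment("r", 10, {"r": -52.0, "l": -40.0, "w": -60.0}),
    PhoneSegment("ae", 8, {"ae": -30.0, "eh": -31.0, "ah": -35.0}),
]
print(detect_errors(segments))
```

A score of zero means the canonical phone was also the best-scoring one. More negative scores indicate that some competing phone explains the audio better, which is the kind of evidence an error detector can act on.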
Outline
1. Introduction and Overview (Kawahara)
Review the history and categories of CALL systems.
2. Segmental aspect and speech recognition technology
(Kawahara)
2.1. Speech analysis for CALL
2.2. Segmentation of non-native speech
2.3. Error detection of non-native speech
2.4. Scoring of non-native speech
2.5. Acoustic model for non-native speech
2.6. Pronunciation model for non-native speech
2.7. Discriminative modeling
3. Prosodic aspect (Minematsu)
3.1. Prosodic deviations found in non-native pronunciation
3.2. Duration modeling & evaluation
3.3. Stress and tone modeling & evaluation
3.4. Intonation modeling & evaluation
4. Speech synthesis technology for CALL (Minematsu)
4.1. Text-to-speech for CALL
4.2. Re-synthesis for CALL
4.3. Morphing for CALL
5. Practical CALL systems (Kawahara)
Review major CALL systems that have been developed and
deployed for learning English and other languages.
6. Database for CALL (Minematsu)
Review major databases of non-native speech, which are
critical resources in developing CALL systems.
Short Biographies
Tatsuya Kawahara is a professor in the Academic Center for Computing
and Media Studies and an affiliated professor in the School of Informatics,
Kyoto University.
He has also been an invited researcher at ATR and NICT. He was a
visiting researcher at Bell Laboratories from 1995 to 1996. He has
published more than 200 technical papers on speech recognition,
spoken language processing, and spoken dialog systems. He has been
managing several speech-related projects, including the free speech
recognition engine Julius (http://julius.sourceforge.jp/) and the
automatic transcription system for the Japanese Parliament (Diet).
From 2003 to 2006, he was a member of the IEEE SPS Speech Technical
Committee. Since 2011, he has been a secretary of the IEEE SPS Japan Chapter.
He was a general chair of the IEEE Automatic Speech Recognition &
Understanding workshop (ASRU 2007). He has also served as a
tutorial chair of INTERSPEECH 2010 and a local arrangement chair of
ICASSP 2012. He is an editorial board member of Elsevier Journal of
Computer Speech and Language, ACM Transactions on Speech and
Language Processing, and APSIPA Transactions on Signal and
Information Processing. He is a senior member of IEEE.
E-mail: kawahara@i.kyoto-u.ac.jp
Webpage: http://www.ar.media.kyoto-u.ac.jp/members/kawahara/
Nobuaki Minematsu is an associate professor in the Graduate School of
Information Science and Technology, the University of Tokyo. He was
a visiting researcher at the Royal Institute of Technology, Sweden (KTH)
from 2002 to 2003. He has wide-ranging interests in speech
communication, from science to engineering. He has published
more than 200 scientific and technical papers, including conference
papers, on speech analysis, speech perception, speech
recognition, speech synthesis, language learning systems, etc. He was a
member of the organizing committees of Speech Prosody 2004, L2WS
2010, and INTERSPEECH 2010. Since 2006, he has been a member of SLaTE
(ISCA SIG on Speech and Language Technology in Education). Since
2011, he has been a treasurer of the IEEE SPS Japan Chapter. He has also been
serving as an editorial board member of the Acoustical Society of Japan, the
Institute of Electronics, Information and Communication Engineers,
and the Information Processing Society of Japan.
