Automated Lecture Transcription at OCW Consortium Global Meeting 2009

Introduction and background to the automated lecture transcription/lecture transcription service project by MIT's Office of Educational Innovation and Technology (OEIT). Presented by Brandon Muramatsu at the OCW Consortium Global Meeting in Monterrey, Mexico, April 22, 2009.

Published in: Education, Business, Technology

Transcript

  • 1. Automated Lecture Transcription Brandon Muramatsu [email_address] MIT, Office of Educational Innovation and Technology (Really, I’m just moonlighting as an OCWC Staffer…) Citation: Muramatsu, B. (2009). Automated Lecture Transcription. Presented at the OpenCourseWare Global Meeting. Monterrey, Mexico. April 22, 2009.
  • 2. Motivation
      • More & more academic videos on the Web
          • Universities recording lectures
          • Cultural organizations interviewing experts
      MIT OCW 8.01: Professor Lewin puts his life on the line in Lecture 11 by demonstrating his faith in the Conservation of Mechanical Energy.
  • 3. Motivation
      • Challenges
          • Volume
          • Search
          • Accessibility
  • 4. Research: Spoken Lecture Project
      • Speech recognition & automated transcription of lectures
      • Why lectures?
          • Conversational, spontaneous, starts/stops
          • Different from broadcast news, other types of speech recognition
          • Specialized vocabularies
      James Glass [email_address]
  • 5. Research: Spoken Lecture Project
      • Processor, browser, workflow
          • web.sls.csail.mit.edu/lectures/
      • Prototyped with lecture & seminar video
          • MIT OCW (~300 hours, lectures)
          • MIT World (~80 hours, seminar speakers)
      • Supported with iCampus MIT/Microsoft Alliance funding
      James Glass [email_address]
  • 6. What problems are we trying to solve? For Learners? For Content Producers?
      • Finding…(primary)
          • Content in videos (text metadata)
          • Specific “phrase” in video (via transcript)
          • Specific “concept” in video
      • Facilitating…(secondary)
          • Accessibility (closed captioning)
          • Translations
  • 7. Transition: Towards a Lecture Transcription Service
      • Develop a prototype production service
          • MIT, University of Queensland
          • Engage external partners (hosted service?, community?)
      • Requirements gathering
          • Internal MIT customers (OCW, AMPS)
          • External (OpenCast, UC Berkeley, Others)
  • 8. MIT Projects/Customers
      • OpenCourseWare
          • (Production support)
          • Existing videos & audio, new video
          • Lecture notes, slides, etc. for domain model
          • Multiple videos/audio by same lecturer for speaker model
          • Diverse topics/disciplines
          • Improve search and retrieval (more granularity)
          • English transcripts can facilitate translation
      • MIT 150th Celebration (AMPS)
          • Highly produced, individual speakers
          • Full transcripts available
          • Facilitate search
  • 9. External Customers/Interest
      • University of Queensland
          • Lecture podcasting
          • 25 years of interviews with world-class scientists, Australian Broadcasting Company
      • UC Berkeley
          • Lecture podcasting, 500+ hours of new content per term
          • Improve search and retrieval
      • OpenCast Project (www.opencast.org)
          • Extend generic podcast production workflow
      • Harvard University Extension
          • 100th Anniversary
  • 10. Lecture Transcription Workflow
  • 11. Demo
      • Spoken Lecture Browser
          • web.sls.csail.mit.edu/lectures
          • Requires Real Player 10
      • Alternate UI, Google Audio Indexing
          • labs.google.com/gaudi
          • U.S. political coverage (2008 elections, CSPAN)
  • 12. Spoken Lecture Browser
      • web.sls.csail.mit.edu/lectures
  • 13. A Lecture Transcription Service?
      • Under consideration
      • Limitations (anticipated, may change)
          • Lecture-style content (technology optimized)
          • Approximately 80% accuracy
          • Probably NOT full accessibility solution
          • Other languages? (not sure)
          • Browser open-sourced (expected)
          • Processing hosted/limited to MIT (current thinking)
              • So will submit jobs via MIT-run service
          • Audio extract, domain models and transcripts available donated for further research
  • 14. Thanks! Brandon Muramatsu [email_address] MIT, Office of Educational Innovation and Technology Citation: Muramatsu, B. (2009). Automated Lecture Transcription. Presented at the OpenCourseWare Global Meeting. Monterrey, Mexico. April 22, 2009.
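
Slide 6 lists finding a specific "phrase" in a video via its transcript as a primary goal. The sketch below illustrates that idea only; it is not the Spoken Lecture Browser's implementation. It assumes the transcript is available as time-stamped segments, and the `Segment` class, the `find_phrase` function, and the sample text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One hypothetical transcript segment: start time in seconds plus its text."""
    start: float
    text: str


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so matching ignores case and spacing.
    return " ".join(text.lower().split())


def find_phrase(segments: list[Segment], phrase: str) -> list[float]:
    """Return the start times of every segment whose text contains the phrase."""
    needle = normalize(phrase)
    return [seg.start for seg in segments if needle in normalize(seg.text)]


# Made-up sample data, loosely themed on the lecture mentioned on slide 2.
segments = [
    Segment(0.0, "Today we look at the conservation of mechanical energy"),
    Segment(12.5, "A pendulum trades potential energy for kinetic energy"),
    Segment(31.0, "Mechanical energy is conserved if friction is negligible"),
]

print(find_phrase(segments, "mechanical energy"))  # [0.0, 31.0]
```

A player could seek to the returned start times. This simple version misses phrases that straddle two segments, and with roughly 80% transcript accuracy (slide 13) some spoken phrases will not appear in the text at all.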
