2. In this video:
• Uses of transcripts
• Manual transcription
• Automatic speech recognition
3. Uses of transcripts
• Reading
– hearing impairment
– non-native speaker
– in a hurry
• Searching
• Subtitles/captions: alignment
• Translation
Screenshot of YouTube
4. Manual transcription
• Script for video?
• Type while listening
– Player with global hotkeys
– Slow down the video
• Amara, also crowdsourcing
• Online services
Screenshot of VLC media player
5. Automatic speech recognition
• Still requires checking
• Punctuation missing
• “Funny” results if dictionary not restricted: YouTube
• Better results with EMMA, Dragon NaturallySpeaking
Screenshot of YouTube
9. The LoCoMoTion project is funded by the Erasmus+ programme of the European Union. The European Commission
support for the production of this publication does not constitute an endorsement of the contents which reflects the
views only of the authors, and the Commission cannot be held responsible for any use which may be made of the
information contained therein.
Editor's Notes
Solutions to create audio transcripts. In the past, audio transcripts were hard and costly to produce, but today there are manageable ways to turn videos into readable and searchable text, and into captions.
I'll discuss the different ways in which transcripts can be used and how they can be created, both by hand and by machine. Admittedly, these days a machine can do only part of the job.
People may have trouble understanding what's being said in the video, for one reason or another. But there is also a trivial reason for providing a text version: it's easier to skim through text than through a video. From reading the transcript, you can quickly judge whether or not you should watch the video.
You can use the regular search function to find a keyword in the text and, if the transcript and the video are aligned, also the corresponding position in the video. Luckily, YouTube can do this alignment fully automatically. It can even export the timings it has found.
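To make the idea of searching an aligned transcript concrete, here is a minimal sketch. It assumes the transcript is already available as a list of (start time, text) pairs; that layout, the sample segments, and the function names `find_keyword` and `as_clock` are made up for illustration and are not an export format of YouTube or any other tool.

```python
from typing import List, Tuple

# Hypothetical aligned transcript: (start time in seconds, text of the segment).
Segment = Tuple[float, str]


def find_keyword(segments: List[Segment], keyword: str) -> List[float]:
    """Return the start times of all segments that contain the keyword."""
    needle = keyword.lower()
    return [start for start, text in segments if needle in text.lower()]


def as_clock(seconds: float) -> str:
    """Format seconds as m:ss, the way a video player shows a position."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Made-up sample data, only to show the call.
segments = [
    (0.0, "Solutions to create audio transcripts."),
    (12.5, "You can type what you hear while listening."),
    (75.0, "Automatic speech recognition still requires checking."),
]

hits = find_keyword(segments, "speech recognition")
print([as_clock(t) for t in hits])  # prints ['1:15']
```

The search itself is plain text matching; the alignment is what turns a hit into a position you can jump to in the video.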
Alignment is essential for turning a transcript into subtitles or captions. If you really want to invest the time and money, or if there's a crowd out there to support you, a transcript is also the basis for a translation, as text or as subtitles. Again, the automatic solutions require much editing.
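As an illustration of why the timings matter, the following sketch writes timed transcript segments in the widely used SubRip (.srt) subtitle layout: a running number, a line with start and end time, the text, and a blank line. The cue data and the helper names `srt_timestamp` and `to_srt` are assumptions made up for this example; it is not the export routine of any particular service.

```python
from typing import List, Tuple

# Hypothetical cue: (start in seconds, end in seconds, caption text).
Cue = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, as used in .srt files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Turn aligned cues into the text of an .srt subtitle file."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # A blank line separates one cue from the next.
    return "\n".join(blocks)


# Made-up sample cues, only to show the call.
cues = [
    (0.0, 2.5, "Solutions to create audio transcripts."),
    (2.5, 6.0, "How can transcripts be used?"),
]
print(to_srt(cues))
```

Without the start and end times there is nothing to put on the timing line, which is why a bare transcript has to be aligned before it can serve as subtitles.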
The easiest way to create a transcript is to start with a word-for-word script and to read the script rather than improvise. Then the script becomes the transcript. Whether or not this style of video works for you and your audience is a different question.
If you do not have a full script, you can type what you hear while listening to the recording. This is far easier to do if you use a media player that can be controlled by the keyboard even when the word processor is in the foreground. The open-source VLC media player can do that.
Another helpful feature of VLC media player, of Windows Media Player, and of the HTML video players in modern web browsers is their speed control: you can slow down the video.
Amara is a website that enables the “crowd” to create subtitles for your video.
And, if all else fails, there are transcription services that you book online and pay for by the minute.
Given the amount of labor that manual transcription requires, automatic speech recognition looks promising. Its quality has improved greatly, but you are still rather lucky if, on average, only every 20th word is wrong.
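"Every 20th word is wrong" corresponds to a word error rate of 1/20, or 5 percent. As a rough sketch of how such a rate can be measured, the code below counts the minimum number of word substitutions, insertions, and deletions between a corrected transcript and the machine output, and divides by the length of the corrected transcript. The sample sentences and the function names are made up for illustration; this is the common textbook definition, not the measure used by any particular product.

```python
from typing import List


def word_errors(reference: List[str], hypothesis: List[str]) -> int:
    """Minimum number of word substitutions, insertions, and deletions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing in output
                current[j - 1] + 1,      # extra word in output
                previous[j - 1] + cost,  # same word or substituted word
            ))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word errors divided by the number of words in the reference."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")
    return word_errors(ref_words, hyp_words) / len(ref_words)


# Made-up example: 20 words, one of them misrecognized ("hear" -> "here").
reference = ("you can type what you hear while listening to the recording "
             "and then check the text for errors by hand")
hypothesis = ("you can type what you here while listening to the recording "
              "and then check the text for errors by hand")
print(word_error_rate(reference, hypothesis))  # prints 0.05
```

Note that punctuation and capitalization are ignored or simplified here, so the figure says nothing about the commas and periods that still have to be added by hand.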
YouTube doesn't guess the punctuation. At the very least, you have to put in all the commas and periods yourself.
YouTube does not use a specific dictionary, which means it will miss technical terms and use expressions that are obviously out of place. At first this can be hilarious, but it demands lots of editing. For that, admittedly, there is a great editor in YouTube.
Automatic speech recognition with a specific dictionary produces far fewer funny results. The automatic transcription of the MOOC platform EMMA and the dictation and transcription software Dragon NaturallySpeaking belong to that class.