January 30, 2020
Tools and Tips for Transcribing Video
Whether it’s a how-to video on YouTube, a short corporate training segment, or a feature-
length documentary, producing video content is a major undertaking with lots of moving
parts. As a video editor, you play a crucial role in the post-production process and what
the consumer ultimately sees. Ideally you are devoting most of your time to the artistic
process of determining what to include versus cut, but it can be easy to get lost in
technical processes and details.
One key way of both streamlining your editing process and expanding your viewership is
by leveraging video transcriptions. In this guide, we will explore why video transcriptions
should be a fundamental component of your work, how they can be used, and what tools
are available to create them.
What is Video Transcription?
Video transcription is the process of converting video content into a text format. Video
transcriptions can then be used for a variety of purposes such as captioning video for a
hearing-impaired audience, subtitling video with different languages, streamlining the
post-production editing process, or creating text-only versions of video that are more
easily indexed and therefore more discoverable online for search engine optimization
(SEO) purposes.
Different Types of Video Transcriptions
Captioning
Captioning is typically used to provide deaf or hearing-impaired communities as rich a
viewing experience as possible. Video captioning assumes that viewers cannot hear the
audio. Sometimes called same-language subtitling, video captioning denotes not only
dialogue but also other relevant audio content such as soundtracks and background
noise in text.
Text usually appears white in a black box at the base of the screen, and non-dialogue
content is typically presented in brackets (e.g., [knocking on door], [violin music begins]).
Captions are time-synchronized so that the audience can read the text as that same
content is being spoken on video.
Video captions may be closed, open, or live. Closed captioning refers to captions that
viewers can choose to turn on or off. In contrast, open captioning is always visible and
cannot be turned off by viewers. Live captioning occurs during news, sporting events, or
other live broadcasts. A stenographer listens to the broadcast and types what he or she
hears into a specialized device and computer program so that captions appear just
seconds after something is spoken.
Subtitling
Subtitles assume that the audience can hear the audio content but cannot understand
the dialogue because it is in an unfamiliar language. Subtitles translate dialogue content
into a different language but don’t include descriptions of background noise, music, or
other audio cues. For instance, an English-speaker viewing a French movie on Netflix can
turn on subtitles to read all the dialogue in English. Like captions, subtitles can be either
closed (i.e., optional) or open (i.e., permanent).
Differences between Subtitling and Captioning
To summarize, captioning assists audiences who are deaf, hard-of-hearing, or who must
mute a video’s audio content; subtitling translates video content into a viewer’s native or
preferred language. Because of this, audio cues and background noises are denoted in
brackets for captions but these are omitted for subtitles.
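The distinction can be illustrated with a minimal sketch, assuming caption text marks non-dialogue cues in square brackets as described above. The function name `strip_audio_cues` is hypothetical, and a real subtitle would still need translation; this only shows which parts of a caption carry over.

```python
import re

# Matches a bracketed audio cue such as [knocking on door], plus trailing spaces.
CUE_PATTERN = re.compile(r"\[[^\]]*\]\s*")


def strip_audio_cues(caption_text: str) -> str:
    """Return caption text with bracketed non-dialogue cues removed."""
    without_cues = CUE_PATTERN.sub("", caption_text)
    # Collapse any doubled whitespace left behind and trim the ends.
    return re.sub(r"\s{2,}", " ", without_cues).strip()


caption = "[knocking on door] Who is it? [violin music begins]"
print(strip_audio_cues(caption))
```

Only the dialogue remains, which is the text a translator would then work from when preparing subtitles.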
Subtitles also tend to have greater flexibility with fonts, colors, and positioning than
captions do. While white text with a black rim or shadow is most common for subtitles,
this can be altered. Similarly, their position is most commonly found at the lower portion of
the screen but is also more easily altered.
Alternate Uses of Video Transcriptions
While captions and subtitles are the most common application of video transcriptions, this
text content can also be used outside of video editing. People will often transcribe videos
to improve the searchability of their video content online. Because search engines do not
index audio or visual content, publishing a transcript alongside a video on a site makes that
content both easier for potential viewers to discover and more accessible.
Uses of Video Transcriptions
As a video editor, your work with video transcriptions can span a wide variety of sectors
and specialties. Here are some of the most common uses of transcribed video:
Entertainment - Transcribing videos and creating captions and subtitles improves
the distribution and reach of movies, documentaries, TV shows, live sporting
events, awards ceremonies, and other entertainment content.
Education - Video transcripts and captions make educational materials deaf-
friendly and more accessible to hearing-impaired communities. Education content
includes lectures, how-to or training videos, webinars, interviews, and other
interactive materials.
Sermons - Including captions or subtitles with online videos of sermons makes
content accessible to a much wider audience.
International shows and movies - Independent filmmakers rely on subtitles and
captions when submitting their films to international film festivals such as Sundance
or Cannes.
Repurposing video content - Video transcriptions can be repurposed for other
uses such as writing articles, how-to guides, study guides, product descriptions, or
as a foundation for other written content.
Advantages of Getting Videos Transcribed
More efficient post-production editing process
As a video editor, you typically have to condense a large amount of raw footage into
featured content that is much shorter. Video transcriptions are one way of making
this process more efficient. Transcripts help you locate
specific scenes or soundbites and facilitate paper edits.
A paper edit is a time-coded list of the segments you want to incorporate in the order you
plan on using them. This list can be paired with notes on associated footage you plan on
including (e.g., B-roll footage of interviewee eating at a restaurant). Creating a good
paper edit can be a major challenge, especially if you are juggling large quantities of
interview footage. Accurate transcriptions make it easier to scan through, highlight, edit,
and re-order content during paper edits.
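As a rough illustration, a paper edit can be modeled as a small list of time-coded segments. The sketch below is one possible layout, not a standard format; the `Segment` fields, the function names, and the sample quotes are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    order: int      # position in the planned cut
    start: str      # source timecode where the segment begins (HH:MM:SS)
    end: str        # source timecode where the segment ends (HH:MM:SS)
    quote: str      # transcript text for the soundbite
    note: str = ""  # optional note on associated footage


def to_seconds(timecode: str) -> int:
    """Convert an HH:MM:SS timecode to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_paper_edit(segments) -> str:
    """List the segments in planned order, one line per segment."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.order):
        line = f"{seg.order}. [{seg.start} - {seg.end}] {seg.quote}"
        if seg.note:
            line += f" (note: {seg.note})"
        lines.append(line)
    return "\n".join(lines)


def total_runtime(segments) -> int:
    """Sum the duration of all segments, in seconds."""
    return sum(to_seconds(s.end) - to_seconds(s.start) for s in segments)


# Hypothetical interview segments, entered in the order they were found.
segments = [
    Segment(2, "00:14:05", "00:14:35", "We opened the restaurant in 2009."),
    Segment(1, "00:02:10", "00:02:25", "Cooking was always in the family.",
            note="B-roll of interviewee eating at a restaurant"),
]
print(format_paper_edit(segments))
print("Planned runtime (seconds):", total_runtime(segments))
```

Re-ordering the cut then means changing the `order` values rather than scrubbing through footage, which is the main time saving a transcript-based paper edit offers.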
Section 508 and ADA compliant
Improving accessibility to videos for people with disabilities via captioning is not only
business-savvy and the right thing to do, it is also often a legal requirement. Section 508 of
the Rehabilitation Act requires that content developed, purchased, or distributed by
the federal government be accessible to people with disabilities, and the Americans with
Disabilities Act (ADA) prohibits discrimination against people with disabilities more broadly.
Captions created from video transcriptions help you meet these requirements for the
hearing-impaired and deaf community.
Better search engine visibility
Video transcripts also play a major role in search engine optimization (SEO) because
search engines do not index audio or video files. By transforming video to text, you
improve its searchability. For instance, academics can transcribe conference
presentations to increase exposure to their findings. Webinars, vlogs, speeches,
sermons, and how-to videos are just some of the other source materials that gain SEO
benefits from video transcription.
Better social media visibility
Social networks such as Facebook often play videos without sound by default. Using video
transcripts to create captions increases these embedded videos’ visibility, particularly
when people view them in locations such as airports or hospitals where full volume
viewing would be disruptive.
Increased viewership
By improving both accessibility and visibility for videos, you ultimately increase total
viewership.
Elements of Video Transcripts
Timecodes/timestamps - Video producers rely on timecodes to synchronize various
components of their work, such as shots taken from multiple cameras or audio that
is recorded separately from video. Timecodes also help editors reference particular
frames or scenes more easily. Timestamps are embedded in transcripts; readers
can click on the timestamp and immediately refer to the corresponding video
content. Pairing timecodes with timestamps keeps captions and subtitles time-synchronized
so that the audience can read the text while its corresponding
images are on-screen.
Audio descriptions - Captions also require that all audio content is described. While
automated transcription options may work for dialogue-only video, human
transcriptionists are more effective at including descriptive audio content such as
[gust of wind and windows rattling] or [cheering crowd]. Audio descriptions must be
succinct but also descriptive enough to convey the video’s original intent and
atmosphere.
Use of punctuation - Another area that requires either manual editing or human
transcription is punctuation. The placement of punctuation may alter a caption or
subtitle’s meaning and automated transcriptions do not address these subtleties
very effectively. For example, “We’re going to learn to draw kids!” has a different
meaning than “We’re going to learn to draw, kids!” Accurately conveying pauses,
tone, and meaning requires effective use of punctuation.
Timing - Timing captions goes beyond the use of timecodes and timestamps. As a
video editor, you must not only make sure that viewers cannot read ahead of what they
are seeing but also time audio descriptions precisely. For instance, letting caption
readers see [gunshot] before the sound occurs may ruin the director’s
intended experience.
File types - As a video editor, you may encounter a variety of file types when using
transcripts. File types vary on multiple levels:
Files may be binary—only readable by computers or specialized hardware—or
they may be readable text. Some may incorporate elements of both.
Files may be standardized so that they are accessible to anyone or proprietary
and require one manufacturer’s suite of tools.
Files may have a simple default (i.e., white text centered at the bottom of the
screen) or they may allow for a wide variety of customization. Some
commonly used open formats include TTML, STL, SRT, and IMSC. Which file
types you ultimately use will depend largely on the nature of the job (e.g.,
broadcasting, online clips, film, depositions).
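To make the idea of a readable text format concrete, here is a minimal sketch that writes time-synchronized cues in SRT layout: a running index, a line with start and end times in HH:MM:SS,mmm form separated by `-->`, the caption text, and a blank line between cues. The function names and the sample cues are illustrative assumptions, not part of any particular tool.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}")
    # Cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


# Hypothetical cues: an audio description followed by a line of dialogue.
cues = [
    (1.0, 3.5, "[knocking on door]"),
    (3.5, 6.25, "Who is it?"),
]
print(to_srt(cues))
```

Because the result is plain text, it can be opened and corrected in any text editor, which is one reason simple open formats are common for online clips.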
Tools for Video Transcription
Depending on the nature of your video editing project, you may use different types of
tools to generate video transcriptions. The most basic options use speech recognition
software to automatically transcribe content while the most sophisticated rely more
heavily on human transcribers.
When doing small scale projects, you may be able to tap into existing voice recognition
software for free using your phone or computer. For instance, select “voice typing” on
Google Docs on your computer while playing the video. Alternately, use the microphone
on a word processing app on your phone to transcribe the recording while it plays.
Automatic transcription can also be done using paid versions of software programs (e.g.,
Adobe’s Premiere Pro, InqScribe). These can be purchased and downloaded onto your
computer. Alternatively, you can upload your files to a web-based service (e.g., Trint, Rev)
that uses AI-based automated transcription. These services’ rates will vary depending on
your content and additional features you want (e.g., human editing of automated
transcripts).
While it is possible to make straightforward transcripts from video content with
applications like these, the transcripts they generate are difficult to use as captions or
subtitles without timestamps. If you’re publishing videos to a platform like YouTube,
captions and subtitles can often be generated automatically, so you won’t have to transcribe
the audio first. You will almost certainly have to edit them, however.
All these forms of speech recognition software share similar drawbacks: background
noise, jargon, dialects and accents, slurring, mispronunciation, or mumbled words can all
negatively affect accuracy. These tools almost always require manual editing to ensure
there is no misrepresentation of the video. While they may be suitable for a simple how-to
video, they will be frustrating to use for something like a documentary.
Depending on the length and complexity of the video you are editing, manually editing an
automated transcript may become so time-intensive that using video transcription
services makes more sense. Subtitling services and human transcriptionists can provide
much greater accuracy and offer more advanced tools to properly sync text with the
corresponding visual content.
For instance, TranscriptionWing™ offers a 3-stage proofing process to ensure both the
accuracy of the transcript and the precision of time-synchronization. Timestamps are
typically included and human transcriptionists are far better at differentiating speakers,
understanding subtle changes in accents, and describing important audio cues for
captions. They can also offer greater customization and more easily accommodate
different file types.
Common Guidelines for Using Video Transcripts
Creating accurate and effective subtitles and captions may sound straightforward, but the
process is often more art than science. Captions must parallel a viewing experience with
sound while remaining short enough to be readable. Subtitles and captions must match
the timing of the dialogue while being placed without blocking important imagery.
Following are some core guidelines for using transcripts for captions and subtitles:
Succinctness - Keep your captions and subtitles short, using no more than three
lines of text.
Consistency - Maintain consistency with the videographer’s intent with regard to both
meaning and style. For example, write words exactly as they are spoken; don’t
change “yeah” to “yes” in a caption. Also, take account of important stylistic choices,
noting speakers’ accents, music lyrics, and tone.
Differentiation - Choose a style of speaker differentiation that makes it easiest for
viewers to follow dialogue. Alter the color of text for speakers or label
speakers. Be sure to note if someone off-screen is speaking so that a scene is
clearly understood.
Positioning - Make sure your subtitles and captions do not obscure other visual
information. For instance, additional information often appears at the base of the
screen during news broadcasts; you may have to move your live captioning to the
top of the screen. Positioning captions may also become relevant when assisting
viewers with speaker differentiation. Sometimes captions can be positioned next to
the person who is speaking rather than using labels.
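Two of these guidelines lend themselves to a small sketch: keeping each caption to three lines and labeling speakers. The three-line limit comes from the succinctness guideline; the 37-character line width, the function names, and the sample text are assumptions chosen only for illustration.

```python
import textwrap

MAX_LINES = 3            # from the succinctness guideline
MAX_CHARS_PER_LINE = 37  # assumed width for illustration; not from this guide


def split_into_captions(text, max_lines=MAX_LINES, width=MAX_CHARS_PER_LINE):
    """Wrap text to the line width, then group lines into captions."""
    lines = textwrap.wrap(text, width=width)
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]


def label_speaker(speaker, text, off_screen=False):
    """Prefix dialogue with a speaker label, noting off-screen speech."""
    tag = f"{speaker} (off-screen)" if off_screen else speaker
    return f"{tag}: {text}"


# Hypothetical dialogue, written exactly as spoken ("yeah", not "yes").
dialogue = label_speaker(
    "ANNA",
    "Yeah, we filmed for about three weeks, mostly in the evenings, "
    "because that was the only time the kitchen was quiet enough to record.",
    off_screen=True,
)
for caption in split_into_captions(dialogue):
    print(caption)
    print("---")
```

Each printed group is one caption of at most three lines; timing and on-screen positioning would still have to be set by hand or in a captioning tool.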
Conclusion
If you’re not already using video transcriptions during your editing process or to create
captions and subtitles, then you’re missing a major source of time-savings and means of
adding value to your work. Avoid headaches during paper edits, increase viewership of
your work by expanding its accessibility, and preserve more time for your artistic eye rather than
irritating technicalities. Video transcriptions are just a small component of your work, but
they will move you one step closer to seeing the forest for the trees.
