Kaltura, Captioning Tools

Captions are automatically added to all new media in Kaltura using ASR (automatic speech recognition). Owners are responsible for editing the captions and publishing them to show in the player.

Automatic Speech Recognition (ASR) captioning is available via Kaltura (in MediaSpace, Canvas, and Moodle).

Before you start
Why should I caption?
Request captions
Edit captions
Release captions
Hints, tips, and best practices

Accessible and searchable media is very important to the University of Illinois. We strive to caption all media to enable discovery and to ensure that accessibility needs are met. Captioned video is useful for those with hearing loss and for users with certain learning disabilities. It can also help anyone who struggles to understand an accent, because reading text while listening assists with comprehension.

Before you start, keep in mind:

If an entry was created before 10 October 2019 or the video started as a Zoom recording to cloud, you will still need to request captions.
Captions will not display until you choose to show them on the player.
Plan on spending time checking the captions for accuracy.
It would be a good idea to practice this process the first time with a short private video.

Media creators are responsible for captioning their content. Content that is not captioned may be removed from public view. If you have a student or colleague who will benefit from this captioning feature, you must make that content accessible to them.

Important Note regarding YouTube videos: Kaltura has implemented a new policy and no longer allows captions to be generated, edited, or displayed on YouTube videos played through the MediaSpace website. If you order captions, an "Error" will be displayed. We are working with the company on a solution. In the meantime, we apologize for the difficulties.


Benefits (Why should I create captions?)

Accessible content has many benefits for you, the media owner, and for viewers in general. Beyond following the law, it means being a good neighbor. Accessible content:

Is necessary for people who are deaf or hard of hearing to understand what is happening in an audio or video file. Federal law (Section 508) and Illinois state law (IITAA), as well as campus regulations, require that multimedia and web content be accessible to all users.
Is useful to those with learning disabilities. Many students find it easier to focus if written words accompany the media.
Can make it easier to understand someone with an accent.
Enables non-native English speakers to better understand what is being said and follow along.
Helps certain learners process information more effectively.
Enables viewing of content in a loud space or if one lacks headphones.
Is easier to reuse in later semesters, saving instructors time.
Is more easily found. As a content creator, you will get more views if your content is captioned, because captioned content is searchable. Really, it's a great feature in Kaltura.

Take a moment and see for yourself. Go to https://mediaspace.illinois.edu/

In the search bar that says Search All Media, type "MOOC." The initial results returned are all instances where MOOC appears in metadata.
In the results window, select the Search in Video tab. You are now seeing every instance where the word MOOC appears in a caption file. Click the blue time code on one of the videos, and it will take you to that exact point in the media entry. This is all possible because of the associated caption file, and it works with captions in any language. Captions make your content more readily viewable and searchable for everyone.

One more example: go to this video example. In the Search in Video box below the video, type "Kaltura" or "video." The results will give you links to the points in this video where these words were used. Captioning lets viewers find and review key points in your media.

This entry does not seek to offer expertise in captioning. The Division of Disability Resources and Educational Services (DRES) is the definitive campus source of information and expertise on accessibility. The tools described below support the efforts of DRES. Students who require accessibility services and human-based captioning (for best accuracy) should work with DRES.

Captioning in Kaltura

Automatic Speech Recognition (ASR) captioning produces a text file and then associates it with the media in Kaltura. Captions are time coded to specific points in a video or audio file.

ASR is currently around 90% accurate, depending on various factors. That may sound like a passing grade, but it isn't. Even a high-fidelity recording of a speaker with perfect diction will need some editing. Technical terms, acronyms, proper names, and both common and uncommon words may not appear as you expect them to. For example, early tests returned "Amino acids" as "I mean no acids." Also, punctuation will not be added, aside from where ASR thinks a pause is long enough to merit a period. You will need to review and edit captions for media you own.
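To get a feel for what 90% accuracy means in practice, here is a rough back-of-the-envelope estimate in Python. Only the 90% figure comes from the text above; the lecture length and speaking rate are illustrative assumptions, not measurements from Kaltura.

```python
def expected_caption_errors(minutes, words_per_minute, accuracy):
    """Estimate how many words in a transcript are likely to be wrong."""
    total_words = minutes * words_per_minute
    return round(total_words * (1 - accuracy))

# Assumed values: a 50-minute lecture spoken at about 150 words per minute.
errors = expected_caption_errors(minutes=50, words_per_minute=150, accuracy=0.90)
print(errors)  # 750
```

Under those assumptions, a single lecture could contain several hundred words that need correcting, which is why reviewing the captions matters.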

The process of using ASR for captioning has two components: requesting the captions and editing the captions.

The following assumes you have used Kaltura before.


Request captions

Note well: Generally, you will not have to request captions for your videos. They are generated automatically for all content you add to Kaltura. However:

If your video started as a Zoom recording to cloud and was automatically transferred to Kaltura, you may need to order captions.

If an entry was created before 10 October 2019, you will still need to request captions.

Captions can be requested by owners, co-editors, or co-publishers of media. (For information on adding NetIDs as co-editors or co-publishers, see Kaltura, Adding collaborators.)

To request captions:

Log in to mediaspace.illinois.edu and go to My Media, or go to the media entry directly.
If you started from My Media, find the video or audio file you want to caption and open its entry page.
Under the video, click the Actions button and choose Captions and Enrich from the drop-down menu.
Click the Order button.
Click Submit.


Edit the Captions

Once the ASR captions are returned, they must be reviewed and edited. Owners, co-editors, or co-publishers of media can edit captions. For information on adding NetIDs as co-editors, see Kaltura, Adding collaborators.

(Advanced users can edit the captions locally by downloading the .srt file and using a desktop editor. Most Illinois media owners will want to use the editor in Kaltura.)
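For advanced users who do edit the .srt file locally, recurring ASR mistakes can be corrected in bulk. The following is a minimal Python sketch, not a Kaltura tool: the function name, the corrections list, and the sample captions are all hypothetical, and it assumes the standard .srt layout of a cue number, a time code line, and then the caption text. It changes only the caption text and leaves cue numbers and time codes alone.

```python
import re

# Hypothetical corrections for phrases ASR got wrong; build your own list.
CORRECTIONS = {
    "I mean no acids": "amino acids",
}

# Matches an .srt time code line such as "00:00:01,000 --> 00:00:04,000".
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")

def fix_srt_text(srt_text, corrections):
    """Apply phrase corrections to caption text only."""
    fixed_lines = []
    for line in srt_text.splitlines():
        stripped = line.strip()
        is_cue_number = stripped.isdigit()
        is_timecode = TIMECODE.match(stripped) is not None
        if not (is_cue_number or is_timecode):
            for wrong, right in corrections.items():
                # Case-insensitive, literal replacement of the wrong phrase.
                line = re.sub(re.escape(wrong), lambda m: right, line,
                              flags=re.IGNORECASE)
        fixed_lines.append(line)
    return "\n".join(fixed_lines)

# Made-up sample captions for illustration.
sample = """1
00:00:01,000 --> 00:00:04,000
Proteins are built from I mean no acids

2
00:00:04,500 --> 00:00:07,000
There are twenty common I mean no acids
"""

result = fix_srt_text(sample, CORRECTIONS)
print(result)
```

To run this on a downloaded file, read it with open(path, encoding="utf-8") and write the result to a new file so the original stays untouched. A bulk replacement does not remove the need to read through the captions afterward.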

Note well: Captions can be requested only once for each video, and we automatically request them for each new video, so you cannot simply order another set. This limit is set not by our license with Kaltura but by *their* license with the captions provider. Sometimes a set of captions needs so much work, for example because it fell out of sync during editing, that you want to start over. The only way to get a fresh set is to launch the online editor and choose Save a Copy; the new entry will then get its own set of captions.

Accessing the editor:

Go back to Captions and Enrich on the Actions menu.
Click the pencil icon next to the completed captions file.
Alternate method: Choose Edit from the Actions menu.
Go to the Captions tab.
Click the pencil icon next to the completed captions file.
Here are detailed instructions from Kaltura on using the editor: https://knowledge.kaltura.com/editing-captions-reach-v2#reach+v2

You can edit captions anywhere you can access your My Media, including Canvas.


Release the Captions

Owners and co-editors can release the captions to show in the video player.

Choose Edit from the Actions menu to edit properties.
Click the Captions tab.
Click the last icon next to the captions file, the one with the tooltip Show in player.

[Image: an arrow pointing to the Show in player button, with a zoomed-in view.]


Important hints and tips

Please note the following hints and tips. Reading and understanding these may save you time and frustration later.

When you are in the online editor, we recommend that you not make changes to the time codes unless it is critical and you know what you are doing.
You should be in the habit of saving your work often.
You should also frequently test the video to make sure your time codes are in sync.
When you edit a caption for the first time (or two), we suggest that you practice on a private video and not one other viewers may see. This takes some of the pressure off you as you learn the system.
If you mess up the caption edits (you remove too much, you alter the times), don't panic. You can always save a copy of your video (Actions menu > Launch Editor), and the new entry will have a new set of ASR captions available for editing. On a long media file you may lose some work, but this always gives you a way to start over with a fresh caption file.
At this time ASR only works in English. Multiple languages can be manually associated with a single media asset.
At this time only ASR is available via the Kaltura interface. If you need a human to caption a file, you can contact DRES.
You can only request captions for a video hosted in Kaltura, e.g., not one linked from YouTube.
If you need to take an English caption file and translate it to another language, contact the Kaltura team via techsvc-kaltura-help@illinois.edu and we will try to assist you with some suggestions.
You can ignore the color and speaker tools in the editor; our captioning solution does not accommodate these at the moment.
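Playing the video is the only real test of whether captions are in sync. If you edit an .srt file locally, though, a script can at least flag time codes that were clearly damaged during editing. The Python sketch below is a hypothetical helper, not part of Kaltura; it assumes the standard .srt time code format and reports cues that end before they start or that begin before the previous cue has finished.

```python
import re

# Captures the start and end time of each cue in an .srt file.
CUE_TIMES = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

def to_ms(hours, minutes, seconds, millis):
    """Convert time code fields (as strings) to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)

def find_timing_problems(srt_text):
    """Return a list of readable descriptions of suspicious cue timings."""
    problems = []
    previous_end = 0
    for match in CUE_TIMES.finditer(srt_text):
        parts = match.groups()
        start, end = to_ms(*parts[:4]), to_ms(*parts[4:])
        if end <= start:
            problems.append(f"{match.group(0)}: cue ends before it starts")
        if start < previous_end:
            problems.append(f"{match.group(0)}: cue overlaps the previous cue")
        previous_end = max(previous_end, end)
    return problems

# Made-up sample with two deliberate mistakes.
sample = """1
00:00:01,000 --> 00:00:04,000
First caption

2
00:00:03,500 --> 00:00:06,000
Second caption starts too early

3
00:00:08,000 --> 00:00:07,000
Third caption has swapped times
"""

problems = find_timing_problems(sample)
for problem in problems:
    print(problem)
```

A clean report does not prove the captions match the audio; it only rules out the most obvious editing damage.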

Best practices and recommendations to create better captions

Effective and efficient captioning is a practiced skill, and this service does not pretend to replace our campus experts at DRES. This service is provided to the campus to improve accessibility for content that would otherwise not be captioned by a person. The ASR tool does not record everything verbatim; for example, it drops "ummms" and "uhhhs." It also does not provide important punctuation cues. You, the editor, can take some small steps that will greatly enhance the experience for someone using the captions.

If there is a period of silence or music, don't leave the captions blank. Add text that says [MUSIC] or [SILENCE] so the reader knows nothing is being missed.
If you cannot understand what was said, enter [UNKNOWN] or [INAUDIBLE].
Use other descriptors when relevant, such as [CROSSTALK], [MUSIC], [NOISE], [LAUGH], [COUGH], [FOREIGN], [SOUND], [BLANK_AUDIO], and [APPLAUSE].
If more than one speaker is present in the media, particularly if there is a back-and-forth discussion, identify the speaker when the person changes. For example:

Dialogue or conversation

Professor X: The past: a new and uncertain world. A world of endless possibilities and infinite outcomes.
Peter: With great power comes great responsibility.

Question in the middle of a lecture with one main speaker:

Student: Will this be on the test?

Captions should preserve and identify slang or accents.
Do not correct errors in what was said. The captions should reflect exactly what is said, and not correct a misspoken phrase or word.
"Neutral" accents will result in better captions, as will enunciating clearly. Mumbled words will not convert well.
A better-than-average audio source in a quiet room will produce better captions than a recording made in a noisy space. Media with music and sound effects will not convert well either.
A camcorder at the back of a loud classroom using the built-in mic will produce very poor ASR results. Use a mic attached to the speaker, or at least one in very close proximity.
Practice, practice, practice.
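The first recommendation above, marking silence or music rather than leaving captions blank, is easier to follow if you know where the blank stretches are. For those editing an .srt file locally, the following Python sketch lists gaps between consecutive captions. It is a hypothetical helper: the function name and the 5-second threshold are assumptions chosen for illustration, and you still need to listen to each gap to decide whether [MUSIC], [SILENCE], or something else is the right label.

```python
import re

# Captures the start and end time of each cue in an .srt file.
CUE_TIMES = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

def to_seconds(hours, minutes, seconds, millis):
    """Convert time code fields (as strings) to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000

def find_uncaptioned_gaps(srt_text, min_gap_seconds=5.0):
    """Return (gap_start, gap_end) pairs, in seconds, between consecutive cues."""
    gaps = []
    previous_end = None
    for match in CUE_TIMES.finditer(srt_text):
        parts = match.groups()
        start, end = to_seconds(*parts[:4]), to_seconds(*parts[4:])
        if previous_end is not None and start - previous_end >= min_gap_seconds:
            gaps.append((previous_end, start))
        previous_end = end
    return gaps

# Made-up sample: nothing is captioned between 7 and 20 seconds.
sample = """1
00:00:01,000 --> 00:00:04,000
Welcome to the lecture

2
00:00:04,500 --> 00:00:07,000
Let's start with a short clip

3
00:00:20,000 --> 00:00:23,000
As you just saw
"""

gaps = find_uncaptioned_gaps(sample)
print(gaps)  # [(7.0, 20.0)]
```

Each reported gap is a candidate for a descriptor caption; short pauses below the threshold are ignored so that normal breaks between sentences are not flagged.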
