Automatic Speech Recognition (ASR) captioning is available via Kaltura (in MediaSpace, Compass2g, and Moodle).
Accessible and searchable media is very important to the University of Illinois. We strive to make all media captioned to enable discovery and to ensure that accessibility needs are met. Captioned video is useful for those with a hearing loss and for users with certain learning disabilities. It can also be a useful tool to anyone who struggles to understand an accent, as reading text along with listening assists with comprehension.
Benefits (Why should I create captions?)
Accessible content has many benefits for you, the media owner, and for viewers in general. Beyond following the law, it means being a good neighbor. Accessible content:
- Is necessary for the deaf or hard of hearing to understand what is happening in an audio or video file. Federal law (Section 508) and Illinois state law (IITAA), as well as campus regulations, require that multimedia and web content be accessible to all users.
- Is useful to those with learning disabilities. Many students find it easier to focus if written words accompany the media.
- Can make it easier to understand someone with an accent.
- Enables non-native English speakers to better understand what is being said and follow along.
- Helps certain learners process information more effectively.
- Enables viewing of content in a loud space or if one lacks headphones.
- Is easier to reuse in later semesters, saving instructors time.
- Is more easily found. As a content creator you will have more views if your content is captioned because the content is searchable. Really, it's a great feature in Kaltura.
- Take a moment and see for yourself. Go to https://mediaspace.illinois.edu/
- In the search bar that says Search All Media, type "MOOC." The initial results returned are all instances where MOOC appears in metadata.
- In the results window select the Search in Video tab. You are now seeing every instance where the word MOOC appears in a caption file. Click the blue time code on one of the videos and it will take you to that exact point in the media entry. This is all possible because of the associated caption file. It also works with captions in any language. Your content will be more readily viewable and searchable to everyone because of captions.
- One more example: go to this video example. In the Search in Video box below the video, type "Kaltura" or "video." The results returned will give you links to points in this video where these words were used. Viewers can find and review key points in your media through captioning.
This entry does not seek to offer expertise in captioning. The Division of Disability Resources and Educational Services (DRES) is the definitive campus source of information and expertise on accessibility. The tools described below are in support of the efforts of DRES. Students who require accessibility services and human-based captioning (for best accuracy) should work with DRES.
Captioning in Kaltura
Automatic Speech Recognition (ASR) captioning produces a text file and then associates it with the media in Kaltura. Captions are time coded to specific points in a video or audio file.
ASR is 70%-90% accurate, based on various factors. That may sound like a passing grade, but it isn't. Even a recording of a speaker with perfect diction and fidelity in the recording will need some editing. Technical terms, acronyms, proper names, and both common and uncommon words may not appear as you expect them to. For example, early tests returned "Amino acids" as "I mean no acids." Also, punctuation will not be added, aside from where ASR thinks a pause is long enough to merit a period. You will need to review and edit captions for media you own.
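This article does not say how the 70%-90% figure is measured. One common yardstick for ASR output is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the ASR text into what was actually said, divided by the number of words spoken. The Python sketch below is only an illustration under that assumption; `word_error_rate` is a hypothetical helper and is not part of Kaltura.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a spoken word is missing from the ASR text
                curr[j - 1] + 1,     # the ASR text has an extra word
                prev[j - 1] + cost,  # words match, or one was substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


spoken = "amino acids"          # what the speaker said
asr_output = "I mean no acids"  # what the early test returned
print(word_error_rate(spoken, asr_output))
```

Here a two-word phrase yields three word-level errors, so the script prints 1.5. A single misheard technical term can garble a whole caption line, which is why every caption file needs review.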
The process of using ASR for captioning has two components: requesting the captions and editing the captions.
The following assumes you have used Kaltura before.
Generally, you will not have to request captions for your videos. They are generated automatically for all content you add to Kaltura. If an entry was created before 10 October 2019, you will still need to request captions. Captions can be requested by owners, co-editors, or co-publishers of media. (For information on adding netids as co-editors or co-publishers, see Kaltura, Adding collaborators.)
To request captions:
- Log in (to MediaSpace/Compass2g/Moodle) and go to MyMedia or to the media entry directly.
- Go to the video/audio file you want to caption and go to that video's entry page.
- Under the video entry click on the Actions button and choose Captions and Enrich from the drop-down menu.
- Click the Order button.
- Click Submit.
Edit the Captions
Once the ASR captions are returned, they must be reviewed and edited. Owners, co-editors, or co-publishers of media can edit captions. For information on adding netids as co-editors, see Kaltura, Adding collaborators.
(If you are an advanced user, you can edit the captions locally by downloading the .srt file and using a desktop editor. Most Illinois media owners will want to use the editor in Kaltura.)
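For advanced users who do edit the .srt file locally, the following Python sketch shows one way to correct recurring ASR mistakes in the caption text while leaving every time code untouched. It assumes the downloaded file follows the usual SubRip layout (a cue number, a timing line such as 00:00:01,000 --> 00:00:04,000, one or more text lines, then a blank line). The names `Cue`, `parse_srt`, `fix_terms`, and `to_srt`, and the sample captions, are made up for illustration; none of them come from Kaltura.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: str
    timing: str  # kept verbatim so the captions stay in sync
    text: str


def parse_srt(srt_text: str) -> list[Cue]:
    """Split SRT text into cues; blocks are separated by blank lines."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # skip anything that is not a complete cue
        cues.append(Cue(lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def fix_terms(cues: list[Cue], corrections: dict[str, str]) -> list[Cue]:
    """Replace known misrecognitions in the caption text only."""
    for cue in cues:
        for wrong, right in corrections.items():
            cue.text = re.sub(
                re.escape(wrong),
                lambda _match, r=right: r,  # insert the replacement literally
                cue.text,
                flags=re.IGNORECASE,
            )
    return cues


def to_srt(cues: list[Cue]) -> str:
    """Write the cues back out in SRT layout."""
    return "\n\n".join(f"{c.index}\n{c.timing}\n{c.text}" for c in cues) + "\n"


SAMPLE = """1
00:00:01,000 --> 00:00:04,000
today we look at I mean no acids

2
00:00:04,500 --> 00:00:07,000
and how they form proteins
"""

cues = fix_terms(parse_srt(SAMPLE), {"I mean no acids": "amino acids"})
print(to_srt(cues))
```

Only the text field of each cue is changed; the timing lines are written back exactly as they were read, so the edited file should stay aligned with the media.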
Note well: You cannot order another set of captions. Sometimes, a set of captions needs so much work, such as being out of sync after editing, that you want to start over. The only way to request captions again is to launch the online editor and Save a Copy. The new entry will then get a set of captions. They can only be requested once for each video and we automatically request them for each new video. This limit is set, not by our license with Kaltura, but by *their* license with the captions provider.
Accessing the editor:
Owners and co-editors can release the captions to show in the video player.
- Choose Edit from the actions menu to edit properties.
- Click on the Captions tab.
- Click the last icon next to the captions file with the tool tip Show in player.
Important hints and tips
Please note the following hints and tips. Reading and understanding these may save you time and frustration later.
- When you are in the online editor, we recommend that you not make changes to the time codes unless it is critical and you know what you are doing.
- When you edit a caption for the first time (or two), we suggest that you practice on a private video and not one other viewers may see. This takes some of the pressure off you as you learn the system.
- If you mess up the edit of the captions (you remove too much, you alter the times), don't panic. You cannot request ASR captioning again for the same entry, but you can launch the online editor and Save a Copy; the new entry will then get its own set of captions for editing. On a long media file you may lose some work, but you can always start over this way.
- At this time ASR only works in English. Multiple languages can be manually associated with a single media asset.
- At this time only ASR is available via the Kaltura interface. If you need a human to caption a file, you can contact DRES.
- You can only request captions for a video hosted in Kaltura, e.g., not one linked from YouTube.
- If you need to take an English caption file and translate it to another language, contact the Kaltura team via email@example.com and we will try to assist you with some suggestions.
- You can ignore the color and speaker tools in the editor; our captioning solution does not accommodate these at the moment.
Best practices and recommendations to create better captions
Effective and efficient captioning is a practiced skill and this service does not pretend to replace our campus experts at DRES. This service is provided to the campus in order to provide more accessibility for content that otherwise would not be captioned by a person. The ASR tool does not record everything verbatim; for example, it drops ummms and uhhhs. It also does not provide important punctuation cues. You, the editor, can take some small steps that will greatly enhance the experience for someone using the captions.
- If there is a period of silence or music, don't leave the captions blank. Add text that says [MUSIC] or [SILENCE] so the reader knows nothing is being missed.
- If you cannot understand what was said, enter [UNKNOWN] or [INAUDIBLE].
- Use other descriptors when relevant, such as [CROSSTALK], [MUSIC], [NOISE], [LAUGH], [COUGH], [FOREIGN], [SOUND], [BLANK_AUDIO], and [APPLAUSE].
- If more than one speaker is present in the media, particularly if there is a back and forth discussion, identify the speaker when the person changes. For example:
- Dialogue or conversation
- Professor X: The past: a new and uncertain world. A world of endless possibilities and infinite outcomes.
- Peter: With great power comes great responsibility.
- Question in the middle of a lecture with one main speaker:
- Student: Will this be on the test?
- Captions should preserve and identify slang or accents.
- Do not correct errors in what was said. The captions should reflect exactly what is said, and not correct a misspoken phrase or word.
- "Neutral" accents will result in better captions, as will enunciating clearly. Mumbled words will not convert well.
- Better-than-average audio sources in a quiet room will result in better captions than ASR from a recording in a noisy space. Media with music and sound effects will not convert well either.
- A camcorder at the back of a loud classroom using the built-in mic will result in very poor ASR results. Use a mic attached to the speaker or at least one in very close proximity.
- Practice, practice, practice.
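The recommendation above about not leaving silent or musical stretches blank can be supported with a small script. The Python sketch below scans the timing lines of a downloaded .srt file and lists stretches where no caption is shown, so you know where a [MUSIC] or [SILENCE] descriptor may be needed. It assumes standard SubRip timing lines; the function name `find_gaps` and the 3-second threshold are arbitrary choices for illustration, not Kaltura settings.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def find_gaps(srt_text: str, min_gap: float = 3.0) -> list[tuple[float, float]]:
    """Return (gap_start, gap_end) pairs, in seconds, with no caption shown."""
    spans = []
    for match in TIMING.finditer(srt_text):
        parts = match.groups()
        spans.append((to_seconds(*parts[:4]), to_seconds(*parts[4:])))

    gaps = []
    # Compare the end of each cue with the start of the next one.
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        if next_start - prev_end >= min_gap:
            gaps.append((prev_end, next_start))
    return gaps


SAMPLE = """1
00:00:01,000 --> 00:00:04,000
welcome to the lecture

2
00:00:12,500 --> 00:00:15,000
let's get started
"""

for start, end in find_gaps(SAMPLE):
    print(f"No captions from {start:.1f}s to {end:.1f}s; consider [MUSIC] or [SILENCE]")
```

The script cannot tell silence from music or noise; it only points to the places a person should listen to and label.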