Kaltura, Captioning Tools
This article is an overview and walkthrough of the captioning tools in Kaltura.
Captions are automatically added to all new media in Kaltura using ASR (automatic speech recognition). Owners are responsible for editing the captions and publishing them to show in the player.
Automatic Speech Recognition (ASR) captioning is available via Kaltura (in MediaSpace, Canvas, and Blackboard).
Accessible and searchable media is very important to the University of Illinois Springfield. We strive to make all media captioned to enable discovery and to ensure that accessibility needs are met. Captioned video is useful for those with a hearing loss and for users with certain learning disabilities. It can also be a useful tool to anyone who struggles to understand an accent, as reading text along with listening assists with comprehension.
Before you start, keep in mind:
- If an entry was created before 10 October 2019, you will still need to request captions.
- Captions will not display until you choose to show them on the player.
- Plan on spending time checking the captions for accuracy.
- It would be a good idea to practice this process the first time with a short private video.
Media creators are responsible for captioning their content. Content that is not captioned may be removed from public view. If you have a student or colleague who will benefit from this captioning feature, you must make that content accessible to them.
Benefits (Why should I create captions?)
Accessible content has many benefits for you, the media owner, and for viewers in general. Beyond following the law, it means being a good neighbor. Accessible content:
- Is necessary for the deaf or hard of hearing to understand what is happening in an audio or video file. Federal law (Section 508) and Illinois state law (IITAA), as well as campus regulations, require that multimedia and web content be accessible to all users.
- Is useful to those with learning disabilities. Many students find it easier to focus if written words accompany the media.
- Can make it easier to understand someone with an accent.
- Enables non-native English speakers to better understand what is being said and follow along.
- Helps certain learners process information more effectively.
- Enables viewing of content in a loud space or if one lacks headphones.
- Is easier to reuse in later semesters, saving instructors time.
- Is more easily found. As a content creator you will have more views if your content is captioned because the content is searchable. Really, it's a great feature in Kaltura.
This entry does not seek to offer expertise in captioning. Students who require accessibility services and human-based captioning (for best accuracy) should work with the Office of Disability Services.
Captioning in Kaltura
Automatic Speech Recognition (ASR) captioning produces a text file and then associates it with the media in Kaltura. Captions are time coded to specific points in a video or audio file.
ASR is 70%-90% accurate, based on various factors. That may sound like a passing grade, but it isn't. Even a recording of a speaker with perfect diction and high audio fidelity will need some editing. Technical terms, acronyms, proper names, and both common and uncommon words may not appear as you expect them to. For example, early tests returned "Amino acids" as "I mean no acids." Also, punctuation will not be added, aside from where ASR thinks a pause is long enough to merit a period. You will need to review and edit captions for media you own.
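To put the 70%-90% figure in perspective, the short Python sketch below estimates how many words might need fixing in a recording. The speaking rate of 150 words per minute and the 10-minute length are assumptions chosen for illustration only; they are not Kaltura figures, and real results will vary with the speaker and the audio quality.

```python
def expected_errors(word_count: int, accuracy_percent: int) -> int:
    """Estimate how many words ASR may get wrong at a given accuracy."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return word_count * (100 - accuracy_percent) // 100


# Assumption for illustration: about 150 spoken words per minute.
WORDS_PER_MINUTE = 150
minutes = 10
words = WORDS_PER_MINUTE * minutes

best_case = expected_errors(words, 90)   # ASR at 90% accuracy
worst_case = expected_errors(words, 70)  # ASR at 70% accuracy
print(f"{words} words: roughly {best_case} to {worst_case} words to review")
```

Under these assumptions, even the best case leaves well over a hundred words to correct in a 10-minute recording, which is why reviewing the captions is not optional.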
The process of using ASR for captioning has two components: requesting the captions and editing the captions.
The following assumes you have used Kaltura before.
Request captions
Captions can be requested by owners, co-editors, or co-publishers of media. (For information on adding NetIDs as co-editors or co-publishers, see Kaltura, Adding collaborators.)
To request captions:
- Log in (to MediaSpace/Canvas/Blackboard) and go to MyMedia or to the media entry directly.
- Go to the video/audio file you want to caption and go to that video's entry page.
- Under the video entry, click the Actions button and choose Captions and Enrich from the drop-down menu.
- Click the Order button.
- Click Submit.
Edit the Captions
Once the ASR captions are returned, they must be reviewed and edited. Owners, co-editors, or co-publishers of media can edit captions. For information on adding NetIDs as co-editors, see Kaltura, Adding collaborators.
(Advanced users can edit the captions locally by downloading the .srt file and using a desktop editor. The majority of Illinois media owners will want to use the editor in Kaltura.)
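For advanced users who do download the .srt file, a small script can apply recurring corrections before a manual review. The Python sketch below is one possible approach, not a Kaltura tool: the `CORRECTIONS` table and the `fix_srt_text` function are hypothetical names made up for this example. It only changes caption text and passes the cue numbers and time code lines through untouched.

```python
import re

# Hypothetical corrections table: phrases ASR got wrong, and the fix.
CORRECTIONS = {
    "I mean no acids": "amino acids",
}

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def fix_srt_text(srt: str, corrections: dict) -> str:
    """Apply phrase corrections to caption text lines only."""
    fixed_lines = []
    for line in srt.splitlines():
        # Lines that are only digits (cue numbers) and timing lines
        # are kept exactly as they are.
        if line.strip().isdigit() or TIMECODE.match(line):
            fixed_lines.append(line)
            continue
        for wrong, right in corrections.items():
            line = line.replace(wrong, right)  # case-sensitive match
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


sample = """1
00:00:01,000 --> 00:00:04,000
proteins are built from I mean no acids

2
00:00:04,500 --> 00:00:07,000
there are twenty of them"""

print(fix_srt_text(sample, CORRECTIONS))
```

A script like this only catches mistakes you already know about. You still need to read the full caption file against the audio afterward.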
Accessing the editor:
Owners and co-editors can release the captions to show in the video player.
- Choose Edit from the actions menu to edit properties.
- Click on the Captions tab.
- Click the last icon next to the captions file, the one with the tooltip Show in player.
Important hints and tips
Please note the following hints and tips. Reading and understanding these may save you time and frustration later.
- When you are in the online editor, we recommend that you not make changes to the time codes unless it is critical and you know what you are doing.
- When you edit a caption for the first time (or two), we suggest that you practice on a private video and not one other viewers may see. This takes some of the pressure off you as you learn the system.
- If you mess up the edit of the captions (you remove too much, you alter the times), don't panic. You can always go back and request ASR captioning again, and a new file will be available for editing. On a long media file you may lose some work, but you can always start over by requesting a new caption file.
- At this time ASR only works in English. Multiple languages can be manually associated with a single media asset.
- At this time only ASR is available via the Kaltura interface. If you need a human to caption a file, visit the Center for Online Learning, Research and Service's Accessibility at UIS webpage for more information and contact details.
- You can only request captions for a video hosted in Kaltura, e.g., not one linked from YouTube.
- You can ignore the color and speaker tools in the editor. Our captioning solution does not accommodate these at the moment.
Best practices and recommendations to create better captions
Effective and efficient captioning is a practiced skill, and this service does not pretend to replace our campus experts. This service is provided to the campus in order to provide more accessibility for content that otherwise would not be captioned by a person. The ASR tool does not record everything verbatim. For example, it drops ummms and uhhhs. It also does not provide important punctuation cues. You, the editor, can take some small steps that will greatly enhance the experience for someone using the captions.
- If there is a period of silence or music, don't leave the captions blank. Add text that says [MUSIC] or [SILENCE] so the reader knows nothing is being missed.
- If you cannot understand what was said, enter [UNKNOWN] or [INAUDIBLE].
- Use other descriptors when relevant, such as [CROSSTALK], [MUSIC], [NOISE], [LAUGH], [COUGH], [FOREIGN], [SOUND], [BLANK_AUDIO], and [APPLAUSE].
- If more than one speaker is present in the media, particularly if there is a back and forth discussion, identify the speaker when the person changes. For example:
  - Dialogue or conversation:
    - Professor X: The past: a new and uncertain world. A world of endless possibilities and infinite outcomes.
    - Peter: With great power comes great responsibility.
  - Question in the middle of a lecture with one main speaker:
    - Student: Will this be on the test?
- Captions should preserve and identify slang or accents.
- Do not correct errors in what was said. The captions should reflect exactly what is said, and not correct a misspoken phrase or word.
- "Neutral" accents will result in better captions, as will enunciating clearly. Mumbled words will not convert well.
- Better-than-average audio sources in a quiet room will result in better captions than a recording made in a noisy space. Media with music and sound effects will not convert well either.
- A camcorder at the back of a loud classroom using the built-in mic will result in very poor ASR results. Use a mic attached to the speaker or at least one in very close proximity.
- Practice, practice, practice.
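The first recommendation above, marking silence or music instead of leaving the captions blank, is easier to follow if you know where the blank stretches are. The Python sketch below is one way to list them from a downloaded .srt file. The `find_gaps` function and the 5-second threshold are assumptions made for this example and are not part of Kaltura. The script only reports the gaps, so you still decide whether each one needs [MUSIC], [SILENCE], or another descriptor.

```python
import re

# Captures start and end times from an SRT timing line.
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert SRT time parts (strings) to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def find_gaps(srt: str, min_gap: float = 5.0):
    """Return (gap_start, gap_end) pairs, in seconds, with no caption shown.

    min_gap is an assumed threshold: shorter pauses are ignored.
    """
    gaps = []
    previous_end = 0.0
    for match in TIMING.finditer(srt):
        parts = match.groups()
        start = to_seconds(*parts[:4])
        end = to_seconds(*parts[4:])
        if start - previous_end >= min_gap:
            gaps.append((previous_end, start))
        previous_end = max(previous_end, end)
    return gaps


sample = """1
00:00:01,000 --> 00:00:04,000
Welcome to the lecture.

2
00:00:12,500 --> 00:00:15,000
Let's get started."""

for gap_start, gap_end in find_gaps(sample):
    print(f"No captions from {gap_start}s to {gap_end}s")
```

In the sample, the stretch between the end of the first caption and the start of the second is reported because it is longer than the assumed 5-second threshold.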