Captions are a vital part of making your video content accessible to all users. While captions certainly benefit the millions of people worldwide with hearing impairments, they are also used by many other people, from those watching videos with the sound off in public spaces to second-language learners. Below are some best practices for creating video captions, links to resources on captioning, and free web tools that make captioning easier.
Captions need to be synchronized with the spoken words and sounds in the video as closely as possible.
Captions need to accurately reflect the spoken word, background noises, and other relevant sounds, and should run from the beginning of a video to the end. Auto-generated captions should always be checked and edited for accuracy before a video is uploaded online.
Captions should be written in a sans serif font for easiest readability. Sans serif fonts lack the small, decorative strokes that serif fonts have at the ends of the larger letter strokes. Examples of sans serif fonts are Arial, Calibri, and Tahoma. Examples of serif fonts, which are best avoided for captions, are Times New Roman, Georgia, and Garamond.
Font color and background must have an appropriate level of contrast. Many free resources on the web will help with this. Some, such as Color Safe, generate text and background color combinations that comply with WCAG 2. Others, such as WebAIM's Contrast Checker, let you enter the hex codes of the colors you are using to check that they meet accessibility standards.
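For readers who would like to see what a contrast checker does with two hex codes, here is a minimal Python sketch of the contrast ratio calculation defined in WCAG 2. It is an illustration only, not the code behind Color Safe or WebAIM's Contrast Checker. The function names (`relative_luminance`, `contrast_ratio`, `meets_aa_normal_text`) are made up for this example, and the 4.5:1 threshold used here is the WCAG 2 Level AA minimum for normal-size text; large text and Level AAA use different thresholds.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel (0-255) to its linear value."""
    c = channel / 255
    # Piecewise formula as published in the WCAG 2 definition of relative luminance.
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB' or '#RGB'."""
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand such as 'fff' to 'ffffff'
        h = "".join(ch * 2 for ch in h)
    if len(h) != 6:
        raise ValueError(f"Not a valid hex color: {hex_color!r}")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG 2 contrast ratio, from 1.0 (identical colors) up to 21.0."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa_normal_text(foreground: str, background: str) -> bool:
    """True if the pair reaches the 4.5:1 Level AA minimum for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5


if __name__ == "__main__":
    # White caption text on a black background, a common caption style.
    print(round(contrast_ratio("#FFFFFF", "#000000"), 2))
    print(meets_aa_normal_text("#FFFFFF", "#000000"))
```

White text on a black background gives the highest ratio the formula can produce, 21:1, which is one reason that combination is so common for captions. For real projects, an established checker is still the safer choice, since it also accounts for text size and conformance level.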
Captions should not block other important visual information on the screen. Ideally, captions should be moveable (platforms like YouTube allow this), though not all captioning services offer this feature.
Caption content should be easy to find and clearly marked for people who need or like to use captions.
A note on auto captions: Many of the tools below generate auto captions by extracting the audio from a video, and then let you edit the result. While auto captioning can make the process of captioning go much faster, it is not without its faults. Much of this technology has been built on an oversampling of white, American voices, which means that it tends to be less accurate for speakers outside of that group.