To comply with the Americans with Disabilities Act (ADA) and WCAG standards, a video player must support accessibility features such as captions, transcripts, audio descriptions, keyboard navigation, and screen reader compatibility. Building a custom video player requires implementing these features at both the UI and playback levels.
Captions and Subtitles
Captions are required for users who cannot hear the audio. They are usually created in the WebVTT format and linked to the video with a track element inside the video tag. The browser then makes the captions available to the viewer, who can enable or disable them as needed.
Captions should not only display speech but also describe important sounds like background music or sound effects. This ensures that people who are deaf or hard of hearing can still follow the video's content.
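As a rough illustration of what such a caption file contains, the sketch below builds WebVTT text from a list of cues. The function names, the cue objects, and the sample dialogue are hypothetical and only show speech and non-speech sounds sitting side by side in one file.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.mmm).
function formatTimestamp(totalSeconds) {
  const ms = Math.round(totalSeconds * 1000);
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  const millis = ms % 1000;
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(millis, 3)}`;
}

// Build the text of a caption file: a WEBVTT header, then one block per cue,
// with a blank line between blocks.
function buildCaptionFile(cues) {
  const blocks = cues.map(
    (cue) => `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}\n${cue.text}`
  );
  return ['WEBVTT', ...blocks].join('\n\n') + '\n';
}

// Hypothetical cues: speech is attributed, non-speech sounds are bracketed.
const exampleCues = [
  { start: 0, end: 4, text: '[Upbeat music plays]' },
  { start: 4, end: 7.5, text: 'Narrator: Welcome to the tour.' },
  { start: 7.5, end: 9, text: '[Door slams]' },
];

console.log(buildCaptionFile(exampleCues));
```

The printed text would be saved as the caption file that the track element's src attribute points to.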
<video controls>
  <source src="video.mp4" type="video/mp4">
  <track src="captions.vtt" kind="captions" srclang="en" label="English" default>
</video>
Keyboard Navigation
All player controls must be accessible through the keyboard using the Tab and Shift+Tab keys. Common shortcuts should also be supported to control playback, seek, volume, and captions. For example, pressing the spacebar or Enter key should toggle play and pause, arrow keys should handle seeking and volume, and a dedicated shortcut should toggle captions.
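One way to cover the full set of shortcuts is to separate the key mapping from the player itself, which keeps the logic easy to test. The sketch below is only an illustration: the action names, the KeyC shortcut for captions, the 5-second seek step, and the 0.1 volume step are all assumptions, and the state object merely mirrors a few properties of a media element.

```javascript
// Map a KeyboardEvent.code value to a player action name, or null if unbound.
// The action names and the 'KeyC' captions shortcut are assumptions.
function actionForKey(code) {
  switch (code) {
    case 'Space':
    case 'Enter':
      return 'togglePlay';
    case 'ArrowRight':
      return 'seekForward';
    case 'ArrowLeft':
      return 'seekBackward';
    case 'ArrowUp':
      return 'volumeUp';
    case 'ArrowDown':
      return 'volumeDown';
    case 'KeyC':
      return 'toggleCaptions';
    default:
      return null;
  }
}

// Return a new state with the action applied. Seek and volume are clamped
// so they never leave the valid range.
function applyAction(state, action) {
  const clamp = (value, min, max) => Math.min(max, Math.max(min, value));
  switch (action) {
    case 'togglePlay':
      return { ...state, paused: !state.paused };
    case 'seekForward':
      return { ...state, currentTime: clamp(state.currentTime + 5, 0, state.duration) };
    case 'seekBackward':
      return { ...state, currentTime: clamp(state.currentTime - 5, 0, state.duration) };
    case 'volumeUp':
      return { ...state, volume: clamp(state.volume + 0.1, 0, 1) };
    case 'volumeDown':
      return { ...state, volume: clamp(state.volume - 0.1, 0, 1) };
    case 'toggleCaptions':
      return { ...state, captionsOn: !state.captionsOn };
    default:
      return state; // unbound key: nothing changes
  }
}

// Example: replay a few key presses against a hypothetical starting state.
let state = { paused: true, currentTime: 2, duration: 60, volume: 1, captionsOn: false };
for (const code of ['Space', 'ArrowLeft', 'ArrowUp', 'KeyC']) {
  state = applyAction(state, actionForKey(code));
}
console.log(state);
```

Clamping matters in a real player because assigning a volume outside the range 0 to 1 to a media element throws an error. In the browser, the resulting state would be written back to the video element inside a keydown listener.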
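const video = document.querySelector('video');
document.addEventListener('keydown', e => {
  if (e.target.closest('button, input, select, textarea')) return; // focused controls handle their own keys
  if (e.code === 'Space') {
    e.preventDefault(); // stop the page from scrolling
    video.paused ? video.play() : video.pause();
  }
  if (e.code === 'ArrowRight') video.currentTime += 5;
  if (e.code === 'ArrowLeft') video.currentTime -= 5;
});
Screen Reader Support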
A screen reader translates on-screen elements into speech or braille for people who are blind or visually impaired. To work properly with screen readers, each control in the video player needs a descriptive ARIA label. For example, a button should be labeled "Play video" or "Pause video" rather than left without a label.
Toggle controls, such as captions or mute buttons, should use the aria-pressed attribute so the screen reader can announce whether the option is active. By doing this, blind users can navigate and operate the player with the same level of understanding as sighted users.
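The labeling rules above can be sketched as small helpers that compute the ARIA attributes from the player state. The function names are hypothetical, and the sketch assumes a single play/pause button whose label changes, plus toggle buttons whose label stays fixed while aria-pressed carries the state.

```javascript
// A combined play/pause button changes its label to describe the action
// it will perform next.
function playButtonAttributes(paused) {
  return { 'aria-label': paused ? 'Play video' : 'Pause video' };
}

// A toggle button keeps a constant label; aria-pressed reports whether the
// option is currently active ('true' or 'false' as strings).
function toggleButtonAttributes(label, active) {
  return { 'aria-label': label, 'aria-pressed': String(active) };
}

// Copy a set of attributes onto an element (anything with setAttribute).
function applyAttributes(element, attributes) {
  for (const [name, value] of Object.entries(attributes)) {
    element.setAttribute(name, value);
  }
}

console.log(playButtonAttributes(true));
console.log(toggleButtonAttributes('Captions', false));
```

In a player, applyAttributes would be called with the real button element each time playback, captions, or mute state changes, so the screen reader always announces the current state.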
<button type="button" aria-label="Play video" onclick="document.querySelector('video').play()">Play</button>
<button type="button" aria-label="Pause video" onclick="document.querySelector('video').pause()">Pause</button>
Visual Contrast and Focus States
People with low vision or color blindness need clear visibility of controls. Text in the player should maintain a contrast ratio of at least 4.5:1 against the background (WCAG Success Criterion 1.4.3), and icons and control boundaries should reach at least 3:1 (Success Criterion 1.4.11). In addition, when a user navigates with the keyboard, the control in focus must be highlighted with a strong outline or similar visual indicator. These focus states prevent confusion and make sure users know which element they are interacting with at any given moment.
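Contrast ratios can be checked in code rather than by eye. The sketch below follows the relative luminance and contrast ratio formulas from WCAG 2.x. The function names are hypothetical, and it assumes colors are given as 3-digit or 6-digit hex strings.

```javascript
// Parse '#rgb' or '#rrggbb' into [r, g, b] channels in the range 0 to 255.
function parseHexColor(hex) {
  let digits = hex.replace('#', '');
  if (digits.length === 3) {
    digits = digits.split('').map((d) => d + d).join('');
  }
  if (!/^[0-9a-fA-F]{6}$/.test(digits)) {
    throw new Error(`Unsupported color: ${hex}`);
  }
  return [0, 2, 4].map((i) => parseInt(digits.slice(i, i + 2), 16));
}

// Relative luminance as defined by WCAG 2.x: linearize each sRGB channel,
// then take the weighted sum.
function relativeLuminance(hex) {
  const [r, g, b] = parseHexColor(hex).map((channel) => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, always 1 or greater.
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

console.log(contrastRatio('#00f', '#fff').toFixed(2));
```

Pure blue (#00f) against a white background comes out at about 8.6:1, comfortably above both thresholds. The same blue on a dark background would score far lower, so the check needs the actual background color of the player.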
button:focus {
  outline: 3px solid #00f;
}
Audio Descriptions
Audio descriptions provide a narrated explanation of key visual elements in a video, such as actions, scene changes, and on-screen text, for users who are blind or have low vision. There are two primary methods for delivering audio descriptions: a separate audio track that is played alongside the main video or a secondary video with the descriptions baked in.
Implementing a Separate Audio Description Track
A common approach in HTML uses a <track> element with kind="descriptions". This track points to a WebVTT file containing timed text descriptions rather than recorded audio. Because browsers generally do not voice description tracks on their own, a JavaScript function can toggle the track and pass each description to the Web Speech API (speechSynthesis), which reads it aloud at the correct time.
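In the browser, the text track API reports which cue is active, so a player rarely needs to compute this itself. Still, the timing rule is simple enough to sketch in plain JavaScript, for example to test description files outside the browser. The function names below are hypothetical, and the cues reuse the timings and text from the sample descriptions.vtt file in this section.

```javascript
// Convert a WebVTT timestamp ('hh:mm:ss.mmm' or 'mm:ss.mmm') to seconds.
function parseTimestamp(timestamp) {
  return timestamp
    .split(':')
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Return the cue that covers the given playback time, or null if none does.
// A cue is active from its start time up to, but not including, its end time.
function activeDescription(cues, currentTime) {
  return cues.find((cue) => currentTime >= cue.start && currentTime < cue.end) || null;
}

const descriptionCues = [
  {
    start: parseTimestamp('00:00:05.000'),
    end: parseTimestamp('00:00:10.000'),
    text: 'A red car speeds down a deserted highway, kicking up dust.',
  },
  {
    start: parseTimestamp('00:00:15.000'),
    end: parseTimestamp('00:00:20.000'),
    text: 'John looks nervously in the rearview mirror, his eyes wide with fear.',
  },
];

console.log(activeDescription(descriptionCues, 7));
console.log(activeDescription(descriptionCues, 12));
```

At 7 seconds the first description is active, and at 12 seconds nothing is, which is the gap in which the first description should already have finished being spoken.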
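<video id="mainVideo" controls>
  <source src="movie.mp4" type="video/mp4">
  <track src="descriptions.vtt" kind="descriptions" srclang="en" label="Audio Descriptions">
</video>
<button type="button" aria-pressed="false" onclick="toggleAudioDescriptions(this)">Toggle Audio Descriptions</button>
<script>
  const descriptions = document.getElementById('mainVideo').textTracks[0];
  // Speak each description when its cue becomes active.
  descriptions.addEventListener('cuechange', () => {
    for (const cue of Array.from(descriptions.activeCues || [])) {
      speechSynthesis.speak(new SpeechSynthesisUtterance(cue.text));
    }
  });
  function toggleAudioDescriptions(button) {
    const wasOn = descriptions.mode === 'hidden';
    descriptions.mode = wasOn ? 'disabled' : 'hidden'; // 'hidden' fires cuechange without drawing text
    if (wasOn) speechSynthesis.cancel(); // stop any description in progress
    button.setAttribute('aria-pressed', String(!wasOn));
  }
</script>
Example of a WebVTT file for descriptions (descriptions.vtt):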
WEBVTT

00:00:05.000 --> 00:00:10.000
A red car speeds down a deserted highway, kicking up dust.

00:00:15.000 --> 00:00:20.000
John looks nervously in the rearview mirror, his eyes wide with fear.

Providing Full Text Transcripts
A transcript is a complete text version of the video and audio content, including spoken dialogue, speaker identification, and descriptions of all relevant visual and sound events. WCAG Success Criterion 1.2.1 (Audio-only and Video-only (Prerecorded)) requires a text alternative for audio-only content, and for video with sound a full transcript is one way to meet Success Criterion 1.2.3 (Audio Description or Media Alternative (Prerecorded)). Transcripts serve users who are deaf-blind and rely on braille displays, as well as anyone who prefers to read or cannot play the video.
The transcript should be placed on the same page as the video, either directly below it or linked clearly nearby. It must be a static text element, not inside a <track> tag, to be accessible to all users and assistive technologies at all times.
<video controls>
<source src="instructional-video.mp4" type="video/mp4">
</video>
<br>
<a href="#transcript">Jump to Transcript</a> <!-- Link to anchor -->
<!-- ... other page content ... -->
<h2 id="transcript">Video Transcript: How to Assemble a Desk</h2>
<p>[Video begins with soft instrumental music playing]</p>
<p><strong>Sarah:</strong> Hi, welcome to our tutorial. Today we're assembling the Spectrum desk.</p>
<p>[Sarah is shown standing in a workshop in front of a box of parts. She smiles at the camera.]</p>
<p><strong>Visual:</strong> Text on screen: "Episode 12: Desk Assembly"</p>
<p><strong>Sarah:</strong> First, open the box and lay out all the components. You should have four legs, a tabletop, and a bag of hardware.</p>
<p>[Sound of cardboard box flaps opening]</p>
... <!-- Full transcript continues -->
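If the transcript is stored as structured data, for instance alongside the caption cues, static markup like the example above could be generated instead of written by hand. The sketch below is one possible approach. The function names and the entry shape (an optional speaker plus text) are assumptions, and it emits one paragraph element per entry.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render transcript entries as paragraphs. Entries with a speaker become
// attributed dialogue; entries without one are treated as bracketed
// descriptions of sounds or visuals.
function renderTranscript(entries) {
  return entries
    .map((entry) => {
      const text = escapeHtml(entry.text);
      if (entry.speaker) {
        return `<p><strong>${escapeHtml(entry.speaker)}:</strong> ${text}</p>`;
      }
      return `<p>[${text}]</p>`;
    })
    .join('\n');
}

const transcriptEntries = [
  { text: 'Video begins with soft instrumental music playing' },
  { speaker: 'Sarah', text: "Hi, welcome to our tutorial. Today we're assembling the Spectrum desk." },
  { text: 'Sound of cardboard box flaps opening' },
];

console.log(renderTranscript(transcriptEntries));
```

Generating the markup on the server or at build time keeps the transcript as ordinary page text, so it remains available even when scripts or the video itself fail to load.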
