Captions versus transcripts for online video content

Abstract
Captions provide deaf and hard of hearing (DHH) users access to the audio component of web videos and television. While hearing consumers can watch and listen simultaneously, the transformation of audio to text requires deaf viewers to watch two simultaneous visual streams: the video and the textual representation of the audio. This can be a problem when the video has a lot of text or the content is dense, e.g., in Massive Open Online Courses (MOOCs). We explore the effect of providing caption history on users' ability to follow captions and on their engagement. We compare traditional on-video captions, which display a few words at a time, to off-video transcripts, which can display many more words at once, and investigate the trade-off between requiring more effort to switch between the transcript and visuals and being able to review more content history. We find a significant difference in users' preferences for viewing video with on-screen captions over off-screen transcripts in terms of readability, but no significant difference in users' preferences in following and understanding the video and narration content. We attribute this to viewers' perceived understanding significantly improving when using transcripts over captions, even though transcripts were less easy to track. We then discuss the implications of these results for online education, and conclude with an overview of potential methods for combining the benefits of both on-screen captions and transcripts.
Funding Information
  • Division of Information and Intelligent Systems (IIS-121-70356, IIS-1218056)
