Closed captioning (CC) allows deaf and hard-of-hearing (hearing-impaired) people, people learning English as an additional language, people first learning how to read, people in noisy environments, and others to read a transcript of the audio portion of a video, film, or other presentation. As the video plays, text captions are displayed that transcribe, although not always verbatim, what is said and by whom, and indicate other relevant sounds.
The term "closed" in closed captioning means that not all viewers see the captions—only those who decode or activate them. This is distinguished from "open captions," where the captions are visible to all viewers. Open captions are sometimes referred to as "in-vision" in the UK. Captions that are permanently visible in a video, film, or other medium are called "burned-in" captions.
In the US and Canada, "captions" are distinguished from "subtitles". In these countries, "subtitles" assume the viewer can hear but cannot understand the language, so they only translate dialogue and some onscreen text. "Captions" aim to describe all significant audio content, as well as "non-speech information," such as the identity of speakers and their manner of speaking; sometimes music or sound effects are also described using words or symbols within the closed caption. The distinction between subtitles and closed captions is not always made in the United Kingdom and Ireland, where the term "subtitles" is a general term.
It has been suggested that the largest audience for closed captioning is now in fact hearing people in ESL communities. In the US, the National Captioning Institute noted that ESL learners were the largest group buying decoders in the late 1980s and early 1990s (before built-in decoders became a standard feature of U.S. television sets).
Television and video
For live programs, spoken words comprising the television program's soundtrack are transcribed by an operator using a stenotype or stenomask machine, whose phonetic output is instantly translated into text by a computer and displayed on the screen. This technique was developed in the 1970s as an initiative of the BBC's Ceefax teletext service. In collaboration with the BBC, a university student took on the research project of writing the first phonetics-to-text conversion program for this purpose. Automatic computer speech recognition now works well when trained to recognize a single voice, so since 2003 the BBC has done live subtitling by having someone re-speak what is being broadcast.
In some cases the transcript is available beforehand and captions are simply displayed during the program after being edited. For programs that have a mix of pre-prepared and live content, such as news bulletins, a combination of the above techniques is used.
For prerecorded programs and home videos, audio is transcribed and captions are prepared, positioned, and timed in advance.
For all types of NTSC programming, captions are "encoded" into Line 21 of the vertical blanking interval – a part of the TV picture that sits just above the visible portion and is usually unseen. For ATSC (digital television) programming, three streams are encoded in the video: two are backward compatible Line 21 captions, and the third is a set of up to 63 additional caption streams encoded in EIA-708 format.
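As a rough illustration of what "encoding" a caption character can involve, the sketch below assumes the commonly described Line 21 (EIA-608) data format, in which each video field carries two bytes, each holding a 7-bit character code plus an odd-parity bit in the top position. The function names are hypothetical, and using `ord()` for the character code is a simplification that only holds for basic letters and digits, since the Line 21 character set differs from ASCII for some codes.

```python
def add_odd_parity(code: int) -> int:
    """Return the 8-bit byte for a 7-bit code, with bit 7 set so that
    the total number of 1 bits in the byte is odd."""
    if not 0 <= code <= 0x7F:
        raise ValueError("expected a 7-bit code")
    ones = bin(code).count("1")
    # If the 7 data bits already hold an odd number of ones, the parity
    # bit stays 0; otherwise it is set to make the count odd.
    return code | 0x80 if ones % 2 == 0 else code


def encode_pair(first: str, second: str) -> bytes:
    """Pack two basic characters into the two bytes carried by one field.
    Assumes both characters share their code with ASCII."""
    return bytes([add_odd_parity(ord(first)), add_odd_parity(ord(second))])


def parity_ok(byte: int) -> bool:
    """A decoder-side check: a received byte is valid only if it has
    an odd number of 1 bits."""
    return bin(byte).count("1") % 2 == 1
```

A decoder built along these lines could discard any byte that fails `parity_ok` rather than display a corrupted character.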
Captioning is transmitted and stored differently in PAL and SECAM countries, where teletext is used rather than Line 21, but the methods of preparation are similar. For home videotapes, however, a variation of the Line 21 system is used in PAL countries. Teletext captions cannot be stored on a standard VHS tape, although they are available on S-VHS tapes.
For older televisions, a set-top box or other decoder is usually required. In the U.S., since the passage of the Television Decoder Circuitry Act, manufacturers of most television receivers sold in the country have been required to include closed captioning display capability. High-definition TV sets, receivers, and tuner cards are also covered, though the technical specifications are different. Canada has no similar law, but receives the same sets as the U.S. in most cases.
There are three styles of Line 21 closed captioning:
Roll-up or scroll-up or scrolling: The words appear from left to right, up to one line at a time; when a line is filled, the whole line scrolls up to make way for a new line, and the line on top is erased. The captions usually appear at the bottom of the screen, but can actually be placed anywhere to avoid covering graphics or action. This method is used for live events, where a sequential word-by-word captioning process is needed.
Pop-on or pop-up or block: A caption appears anywhere on the screen as a whole, followed by another caption or no captions. This method is used for most pre-taped television and film programming.
Paint-on: The caption, whether it is a single word or a line, appears on the screen letter-by-letter from left to right, but ends up as a stationary block like pop-on captions. Rarely used; most often seen in very first captions when little time is available to read the caption or in "overlay" captions added to an existing caption.
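The roll-up behavior described above can be sketched as a small simulation. This is a minimal illustration, not a decoder implementation: the function name, the default 32-character row width, and the two-row window are assumptions chosen for the example, and a word longer than the row width is simply placed on its own row.

```python
from collections import deque


def roll_up(words, width=32, rows=2):
    """Simulate a roll-up caption window.

    Words are appended to the bottom row; when the next word would not
    fit, a new bottom row starts and, once the window is full, the top
    row is erased. Returns the visible rows after each word arrives.
    """
    window = deque([""], maxlen=rows)  # appending past maxlen drops the top row
    frames = []
    for word in words:
        current = window[-1]
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            window[-1] = candidate  # the word still fits on the bottom row
        else:
            window.append(word)     # scroll up and start a new bottom row
        frames.append(list(window))
    return frames


if __name__ == "__main__":
    for frame in roll_up(["aa", "bb", "cc", "dd", "ee"], width=5, rows=2):
        print(frame)
```

A pop-on caption, by contrast, would be modeled as a complete block shown all at once, with no intermediate frames.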
A single program may include scroll-up and pop-on captions (e.g., scroll-up for narration and pop-on for song lyrics). A musical note symbol is used to indicate song lyrics or background music. Generally, lyrics are preceded and followed by music notes, while song titles are bracketed like a sound effect. Standards vary from country to country and company to company.
For live programs, some soap operas, and other shows captioned using scroll-up, Line 21 caption text includes the symbols '>>' to indicate a new speaker (the name of the new speaker sometimes appears as well), and '>>>' in news reports to identify a new story. In some cases, '>>' means one person is talking and '>>>' means two or more people are talking. Capitals are frequently used because many older home caption decoder fonts had no descenders for the lowercase letters g, j, p, q, and y, though virtually all modern TVs have caption character sets with descenders. Text can be italicized, among a few other style choices. Captions can be presented in different colors as well. Coloration is rarely used in North America, but is often used in the United Kingdom and Australia for speaker differentiation.
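The speaker and story markers lend themselves to simple text processing. The sketch below splits a run of scroll-up caption text at '>>' and '>>>' and labels each segment. It assumes the first convention described above ('>>' for a new speaker, '>>>' for a new story); the function name and the segment labels are hypothetical.

```python
import re

# Try the three-character marker first so '>>>' is not read as '>>' plus '>'.
MARKER = re.compile(r"(>>>|>>)")


def split_turns(text):
    """Split caption text into (kind, text) segments at '>>' and '>>>'."""
    parts = MARKER.split(text)  # [lead, marker, body, marker, body, ...]
    segments = []
    lead = parts[0].strip()
    if lead:
        # Text before the first marker continues the previous speaker.
        segments.append(("continuation", lead))
    for marker, body in zip(parts[1::2], parts[2::2]):
        kind = "new_story" if marker == ">>>" else "new_speaker"
        segments.append((kind, body.strip()))
    return segments


if __name__ == "__main__":
    sample = ">>> GOOD EVENING. >> THANKS, TOM. >> BACK TO YOU."
    for kind, body in split_turns(sample):
        print(kind, "|", body)
```

Where a broadcaster uses the alternative convention, in which the markers distinguish one speaker from several, only the labels would need to change.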
There were many shortcomings in the original Line 21 specification from a typographic standpoint, since, for example, it lacked many of the characters required for captioning in languages other than English. Since that time, the core Line 21 character set has been expanded to include quite a few more characters, handling most requirements for languages common in North and South America such as French, Spanish, and Portuguese, though those extended characters are not required in all decoders and are thus unreliable in everyday use. The problem has been almost eliminated with the EIA-708 standard for digital television, which boasts a far more comprehensive character set.
Captions are often edited to make them easier to read and to reduce the amount of text displayed onscreen. This editing can range from very minor, with only a few unimportant lines occasionally omitted, to severe, where virtually every line spoken by the actors is condensed. The measure used to guide this editing is words per minute, commonly varying from 180 to 300, depending on the type of program. Offensive words are also captioned, but if the program is censored for TV broadcast, the broadcaster might not have arranged for the captioning to be edited or censored as well. A television set-top box is available to parents who wish to censor offensive language in programs: the video signal is fed into the box, and if it detects an offensive word in the captioning, the audio signal is bleeped or muted for that period of time.
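The words-per-minute measure is straightforward to compute for a single caption. The following is a minimal sketch: the function names are hypothetical, words are counted naively by splitting on whitespace, and the default limit of 180 is simply the lower end of the range quoted above rather than a fixed standard.

```python
def words_per_minute(text, start_s, end_s):
    """Reading rate of one caption, given its display start and end
    times in seconds."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("caption must have a positive duration")
    return len(text.split()) * 60.0 / duration


def needs_condensing(text, start_s, end_s, max_wpm=180):
    """True if the caption exceeds the chosen reading-rate limit."""
    return words_per_minute(text, start_s, end_s) > max_wpm


if __name__ == "__main__":
    # Six words shown for two seconds is exactly 180 words per minute.
    print(words_per_minute("one two three four five six", 10.0, 12.0))
```

An editor could use such a check to find the captions most in need of shortening or of a longer display time.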
There are some instances when the audio track of a TV program is altered (unneeded dialog is silenced, words are bleeped, a licensed song in a syndicated TV episode is removed, etc.) but the captions of the removed dialog or lyrics remain. This can have serious consequences, as when a person's name is bleeped in the audio track for legal reasons but is included in the captions.
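One way to catch this kind of mismatch is to compare caption timings against the time spans where the audio was altered. The sketch below is illustrative only: the `Caption` type, the function name, and the representation of edits as (start, end) pairs in seconds are all assumptions, and a flagged caption still needs human review to decide whether its text should change.

```python
from typing import NamedTuple


class Caption(NamedTuple):
    start: float  # display start time, in seconds
    end: float    # display end time, in seconds
    text: str


def captions_overlapping_edits(captions, edited_spans):
    """Return the captions whose display time overlaps any span where
    the audio was bleeped, silenced, or replaced."""
    flagged = []
    for cap in captions:
        # Two intervals overlap when each one starts before the other ends.
        if any(cap.start < e_end and e_start < cap.end
               for e_start, e_end in edited_spans):
            flagged.append(cap)
    return flagged


if __name__ == "__main__":
    caps = [
        Caption(0.0, 2.0, "HELLO."),
        Caption(2.0, 4.5, "THE SUSPECT IS [NAME]."),
        Caption(5.0, 7.0, "MORE AT ELEVEN."),
    ]
    for cap in captions_overlapping_edits(caps, [(3.0, 3.8)]):
        print(cap.text)
```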
There are several competing technologies used to provide captioning for movies in theaters. Just as with television captioning, they fall into two broad categories: open and closed. The definition of "closed" captioning in this context is a bit different from television, as it refers to any technology that allows some of the viewers to use captions while others in the same theater at the same time do not see captions.
Open captioning in a theater can be accomplished through burned-in captions, projected bitmaps, or (rarely) a display located above or below the movie screen. Typically, this display is a large LED sign.
Probably the best-known closed captioning option for theaters is the Rear Window Captioning System from the National Center for Accessible Media. Upon entering the theater, viewers requiring captions are given a panel of flat translucent glass or plastic on a gooseneck stalk, which can be mounted in front of the viewer's seat. In the back of the theater is an LED display that shows the captions in mirror-image. The panel reflects the captions for the viewer, but is nearly invisible to surrounding patrons. The panel can be positioned so that the viewer watches the movie through the panel and captions appear either on or near the movie image. A company called Cinematic Captioning Systems has a similar reflective system called Bounce Back.
Other closed captioning technologies for movies include hand-held displays similar to a personal digital assistant (PDA); eyeglasses fitted with a prism over one lens; and projected bitmap captions. The PDA and eyeglass systems use a wireless transmitter to send the captions to the display device.