Text capture: Difference between revisions

Revision as of 22:23, 27 April 2005

Introduction

Teletext is popular in Europe and provides both informational pages and captions or subtitles to television programs. In 1992, teletext was provided in 18 countries.

In North America, US federal law requires closed captioning of all non-exempt programs starting in 2006. Some broadcasters are implementing XDS, or Extended Data Services.

TV capture chipsets implement teletext and closed captioning in different ways, and the free software code to support text capture is still missing or incomplete for some chipsets.


Supported cards

bt8x8

Cards based on the bt8x8 chip (cf. bttv devices) have excellent support for text capture for both PAL/SECAM and NTSC.

Commandline output is provided by both ntsc-cc and the test/capture utility in zvbi.

saa713x

Cards based on saa713x chips (cf. saa7134 devices) have excellent support for text capture under PAL/SECAM.

NTSC support is sketchy but close; cf. this failed attempt to get closed captioning working.

ivtv

This exchange on the ivtv list suggests there is support for CC, VPS and WSS signals for the PVR-350, but that hardware limitations prevent teletext.

Similarly, this exchange from January 2005 discusses an ongoing project adding vbi support to PVR-x50. Finally, in this exchange from April 2005, Chris Kennedy discusses vbi support for pvr150/500.

The README.vbi in a current ivtv release has lots of details, and attempts to improve VBI support are ongoing.

DVB

The zvbi library supports the European standards ETSI EN 300 472 "Specification for conveying ITU-R System B Teletext in DVB bitstreams" and ETSI EN 301 775 "Specification for the carriage of Vertical Blanking Information (VBI) data in DVB bitstreams".

It can read VBI PES packets from Linux DVB devices and extract Teletext, VPS, WSS and Caption data. A demultiplexer is available to extract VBI data from DVB MPEG-2 program streams and a multiplexer to convert sliced VBI data, e. g. captured from analog devices, to DVB format. For details see the zvbi documentation. The test/capture utility in the source tarball can demonstrate these capabilities.

The American standard ATSC A/53 "ATSC Digital Television Standard" which also covers Closed Caption transport and the "digital" Closed Caption standard EIA 708-B are not currently supported by the zvbi library.

cx88

The latest improvements on vbi support for cx88-based cards, aside from kernel-related stuff and modules reorganization, were made by Tom Zoerner in March of 2004. The README.cx88 still says: "some code present. Doesn't crash any more, but also doesn't work yet ..."

However, Tom sketched out exactly what needs to be done in this 22 May 2004 e-mail.

In the meantime, nxtvepg (which uses vbi information) is working on cx88 PAL tuners, cf. system requirements.


Applications

alevt

AleVT is a teletext/videotext decoder and browser for the bttv driver (/dev/vbi) and X11.

gstreamer

The application gstreamer has incorporated support for closed captioning (they also mention some tweaks for Canadian English and French television); see Freedesktop's repository.

nxtvepg

The Nextview EPG decoder and browser is an Electronic TV Programme Guide for the analog domain (as opposed to the various digital EPGs that come with most digital broadcasts). It allows you to decode and browse TV programme listings for most of the major networks in Germany, Austria, France and Switzerland. The EPG information is read from /dev/vbi.

ntsc-cc

The application ntsc-cc handles closed captioning on bttv devices only, because it implements only the old v4l API. The saa7134 chip uses other sample rates.

For ntsc-cc to work, you typically need to be running an application for viewing or recording television, such as xawtv or mencoder. If no such application is running, ntsc-cc tends to produce garbled output.

The downside to ntsc-cc is that it doesn't link against libzvbi, which is the core of text capture under Linux, and it's not actively maintained. On the other hand, it's a great workhorse for bttv cards. Michael Schimek's test/capture utility will hopefully begin to cover the functionality of ntsc-cc.

tvtime

tvtime has built-in support for closed captioning for bttv and saa7134 cards (also others?).

In early 2004, Kevin Ko wrote a patch with useful comments to tvtime's vbidata.c; see his detailed account and the tvtime bugreport.

zapzilla

Zapping has a built-in teletext viewer called Zapzilla.

In addition, Zapping provides subtitle overlay through the closed captioning decoder built into libzvbi.

zvbi

"The zvbi library provides functions to read from Linux V4L, V4L2 and FreeBSD BKTR raw VBI capture devices, from Linux DVB devices and from a VBI proxy to share V4L and V4L2 VBI devices between multiple applications.

It can demodulate raw to sliced VBI data in software, with support for a wide range of formats, has functions to decode several popular services including Teletext and Closed Caption, a Teletext cache with search function, various text export and rendering functions.

Basically zvbi offers all functions needed by VBI applications except for the user interface. The library was written for the Zapping TV viewer http://zapping.sourceforge.net."

From the zvbi README, copyright Michael H. Schimek, Iñaki García Etxebarria, and Tom Zoerner. For further information, see the zvbi documentation, the zvbi wiki, and the zapping-misc mailing list.

There are utilities for testing in the tarball, available on zvbi's project page; cf. cvs and my failed attempt to use some of these to get cc working on a saa7133 card.


Technical background

Analog television

The vertical blanking interval (VBI) is an interval in a television signal that temporarily suspends transmission of the signal for the electron gun to move back up to the first line of the television screen to trace the next screen field. The vertical blanking interval can be used to carry data, since anything sent during the VBI would naturally not be displayed; various test signals, closed captioning, and other digital data can be sent during this time period.

In the PAL standard, Teletext and Caption data are digitally encoded in the vertical blanking interval (vbi) on lines 17 through 20.

In North America, closed captioning uses line 21 of the vertical blanking interval (NTSC standard).
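
As a rough illustration of what it means to recover data from one digitised scan line such as line 21, the following Python sketch thresholds a synthetic line at the centre of each bit and packs the bits into two bytes, least significant bit first. The sample levels, the samples-per-bit figure and all function names are assumptions made for this example; a real slicer (such as the one in libzvbi) locks onto the clock run-in and start bits instead of being told where the first bit is.

```python
def synth_line(data, samples_per_bit, lead_in):
    """Build a fake digitised scan line: flat lead-in, then one level per bit."""
    samples = [16] * lead_in                      # blanking level (assumed value)
    for byte in data:
        for j in range(8):                        # least significant bit first
            level = 120 if (byte >> j) & 1 else 16
            samples.extend([level] * samples_per_bit)
    samples.extend([16] * lead_in)
    return samples


def slice_bits(samples, samples_per_bit, first_bit_centre, nbits=16):
    """Sample the line at each bit centre and compare against a mid-level threshold."""
    threshold = (min(samples) + max(samples)) / 2
    bits = []
    for i in range(nbits):
        idx = int(round(first_bit_centre + i * samples_per_bit))
        bits.append(1 if samples[idx] > threshold else 0)
    return bits


def bits_to_bytes(bits):
    """Pack bits into bytes, first bit received = bit 0."""
    out = []
    for k in range(0, len(bits), 8):
        value = 0
        for j, bit in enumerate(bits[k:k + 8]):
            value |= bit << j
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    line = synth_line(bytes([0xC1, 0x43]), samples_per_bit=8, lead_in=20)
    bits = slice_bits(line, samples_per_bit=8, first_bit_centre=24)
    print(bits_to_bytes(bits).hex())
```

The two recovered bytes are what the rest of this page calls sliced data for that line; parity and control codes are still in place at this stage.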

Digital television

North American digital television (HDTV) uses the ATSC A/53 standard, cf. atsc.org. It transmits CC bytes in a user data field following a picture header, inside the video elementary stream. If the driver doesn't extract CC data on its own, and the "broadcast flag" permits this, one could perhaps read video packets from the device, or a complete MPEG-2 program stream from disk, and demultiplex in software. Libzvbi does something similar for DVB, so it shouldn't be hard to implement, if it hasn't already been done to extract DVD captions. A freestanding CC capture application would still need to tune in and choose a program ID; that's beyond the scope of libzvbi.
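
A minimal Python sketch of the software approach described above, assuming the video elementary stream has already been demultiplexed into a byte string. The field layout used here (user data start code 0x000001B2, the "GA94" identifier, type code 0x03, a flags byte carrying cc_count, one em_data byte, then three-byte caption entries) is my reading of the A/53 picture user data syntax and should be checked against the standard; the function name is made up for this example, and none of this is libzvbi code.

```python
USER_DATA_START = b"\x00\x00\x01\xb2"   # MPEG-2 user_data_start_code
ATSC_ID = b"GA94"                       # ATSC identifier (assumed per A/53)
CC_TYPE_CODE = b"\x03"                  # user_data_type_code for caption data


def extract_cc_triplets(es):
    """Return (cc_type, byte1, byte2) for every valid caption entry found."""
    found = []
    pos = es.find(USER_DATA_START)
    while pos != -1:
        p = pos + 4
        header_ok = (es[p:p + 4] == ATSC_ID
                     and es[p + 4:p + 5] == CC_TYPE_CODE
                     and len(es) >= p + 7)
        if header_ok:
            flags = es[p + 5]
            process_cc_data = bool(flags & 0x40)
            cc_count = flags & 0x1F
            q = p + 7                    # skip the flags byte and the em_data byte
            if process_cc_data:
                for i in range(cc_count):
                    entry = es[q + 3 * i:q + 3 * i + 3]
                    if len(entry) < 3:
                        break            # truncated input
                    if entry[0] & 0x04:  # cc_valid bit
                        found.append((entry[0] & 0x03, entry[1], entry[2]))
        pos = es.find(USER_DATA_START, pos + 4)
    return found


if __name__ == "__main__":
    sample = (USER_DATA_START + ATSC_ID + CC_TYPE_CODE
              + bytes([0x40 | 2, 0xFF])          # process_cc_data set, cc_count = 2
              + bytes([0xFC, 0x94, 0x2C])        # valid entry, cc_type 0
              + bytes([0xF9, 0x80, 0x80])        # cc_valid clear, ignored
              + b"\xff")
    print(extract_cc_triplets(sample))
```

Entries with cc_type 0 or 1 would carry line 21 style byte pairs for the two fields; other types belong to the "digital" caption service and need a separate decoder.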

The European DVB standard transmits sliced VBI data in a separate elementary stream. Current drivers in libzvbi provide full support.

Testing

Recommended applications for testing under PAL/SECAM are mtt, zapping/zapzilla and alevt/d.

For lower-level testing, use the utilities in the test/ directory of the zvbi tarball, available on zvbi's project page; these are actively maintained.

The test/capture utility currently just dumps printable characters on stdout. This output needs to be "sliced". Sliced VBI is the data transmitted on each scan line. It still contains multiple logical streams, parity bits and control codes. In libzvbi "formatting" takes one stream, converts characters to Unicode and interprets control codes, giving one page of text for display. An export function would convert the text to ASCII.
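
To make the parity bits and control codes mentioned above more concrete, here is a small Python sketch for line 21 caption byte pairs. It assumes odd parity in the top bit of each byte and treats any pair whose first byte is below 0x20 after parity removal as non-text (control codes, padding and the like). Mapping the remaining values straight to ASCII is a simplification, since a few caption character codes differ from ASCII, and the function names are invented for this example. libzvbi performs this work itself; the sketch only shows the idea.

```python
def strip_odd_parity(byte):
    """Return the 7-bit value if the byte has odd parity, else None."""
    if bin(byte).count("1") % 2 == 1:
        return byte & 0x7F
    return None


def printable_text(pairs):
    """Collect printable characters from sliced caption byte pairs."""
    chars = []
    for b1, b2 in pairs:
        c1, c2 = strip_odd_parity(b1), strip_odd_parity(b2)
        if c1 is None or c2 is None:
            continue                 # parity error: drop the pair
        if c1 < 0x20:
            continue                 # control code, padding or other non-text pair
        for c in (c1, c2):
            if 0x20 <= c <= 0x7E:
                chars.append(chr(c))
    return "".join(chars)


if __name__ == "__main__":
    pairs = [(0x94, 0x2C),   # control code pair
             (0xC8, 0x49),   # "HI" with parity bits
             (0x48, 0x49),   # first byte fails the parity check
             (0x80, 0x80)]   # padding
    print(printable_text(pairs))
```

This is roughly the level of output a raw character dump gives; keeping track of channels, cursor position and styles is what the formatting stage adds on top.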

To visualize the Vertical Blanking Interval on NTSC, issue

ntsc-cc -d /dev/vbi -c -w -r 27

Both bttv and saa7134 cards show orderly signals, while cx88 shows just noise.

In the zvbi test/ suite, osc provides a similar visualization (for some reason, mine is not compiling).

Displaying captured text

To get formatted output for overlay on the television image, Michael Schimek's decoder (part of libzvbi) works like a terminal emulator, printing captions into virtual screen memory. As long as you have "pop on" style captions, streaming is easy. For "roll up" captions it needs a more character- or line-oriented interface. Perhaps it would suffice to call the client and clear the screen before any vertical cursor motions and scrolling.
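
The difference between the two caption styles can be sketched with a toy screen model in Python. This is not libzvbi's decoder: the class, its method names, the row count and the callback are all assumptions made for illustration. Pop-on text is built in a hidden buffer and handed to the client in one piece when the caption is flipped on screen; roll-up text is written straight to the visible bottom row, and the client is called just before the window scrolls, along the lines suggested above.

```python
class CaptionModel:
    """Toy model of caption screen memory (hypothetical, for illustration)."""

    def __init__(self, rows=15, window=2, on_update=None):
        self.rows = rows
        self.window = window                  # number of rows in the roll-up window
        self.displayed = [""] * rows
        self.hidden = [""] * rows
        self.on_update = on_update or (lambda screen: None)

    # Pop-on style: compose off screen, then show everything at once.
    def pop_on_write(self, row, text):
        self.hidden[row] += text

    def end_of_caption(self):
        # Simplification: the old displayed page is discarded, not swapped back.
        self.displayed, self.hidden = self.hidden, [""] * self.rows
        self.on_update(list(self.displayed))

    # Roll-up style: write to the visible bottom row and scroll on carriage return.
    def roll_up_write(self, text):
        self.displayed[-1] += text

    def carriage_return(self):
        self.on_update(list(self.displayed))  # tell the client before scrolling
        top = self.rows - self.window
        for r in range(top, self.rows - 1):
            self.displayed[r] = self.displayed[r + 1]
        self.displayed[-1] = ""


if __name__ == "__main__":
    screen = CaptionModel(rows=4, window=2, on_update=print)
    screen.pop_on_write(3, "HELLO")
    screen.end_of_caption()
    screen.roll_up_write("ONE")
    screen.carriage_return()
    screen.roll_up_write("TWO")
    print(screen.displayed)
```

In the pop-on case the client only ever sees complete pages, which is why streaming is straightforward; in the roll-up case it sees each finished line once, at the moment it is about to scroll.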