GStreamer


GStreamer is a multimedia processing library which front-end applications can leverage to provide a wide variety of functions, such as audio and video playback, streaming, non-linear video editing and V4L2 capture support. It is also available for many other platforms, for example Windows. It is mainly used as an API working in the background of applications such as Totem, but with the command-line tools gst-launch and entrans it is also possible to create and run GStreamer pipelines directly on the command line.

Getting GStreamer

GStreamer, the most common GStreamer plugins and the most important tools like gst-launch are available through your distribution's package management. entrans and some of the plugins used in the examples below are not; you can find their sources bundled by the GEntrans project at SourceForge, and a web search may turn up precompiled packages for your distribution. There are the old GStreamer 0.10 and the new GStreamer 1.0. They are incompatible but can be installed in parallel. Everything described below can be done with both GStreamer 0.10 and GStreamer 1.0; you simply have to use the appropriate commands, e.g. gst-launch-0.10 vs. gst-launch-1.0 (gst-launch is linked to one of them). Note that the ff(mpeg) plugins have been renamed to av: for example, ffenc_mp2 of GStreamer 0.10 is called avenc_mp2 in GStreamer 1.0.
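
To check whether a given element is available in your installation, query it with gst-inspect (the element names here are just examples):

gst-inspect-0.10 ffenc_mp2
gst-inspect-1.0 avenc_mp2

If the element is installed this prints its pads and properties; otherwise it reports "No such element or plugin".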

Entrans versus gst-launch

gst-launch is better documented and part of all distributions, but entrans is a bit smarter for the following reasons:

  • It partly automates the composition of GStreamer pipelines.
  • It allows cutting of the streams. The simplest application of this feature is to capture for a fixed length of time, which lets the muxers properly close the captured files and write correct headers; this is not guaranteed if you finish capturing with gst-launch by simply typing Ctrl+C. To use this feature one has to insert a dam element after the first queue of each part of the pipeline, as in the sketch below.
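
A minimal video-only sketch of such a timed capture (device, encoder and muxer are placeholders, adapt them to your setup): entrans stops after 60 seconds, and the dam element sits directly after the first queue:

entrans -s cut-time -c 0-60 --dam -- --raw \
v4l2src device=/dev/video0 ! queue ! dam ! \
ffenc_mpeg2video ! ffmux_mpeg ! filesink location=test.mpg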

Using GStreamer for V4L TV capture

Why prefer GStreamer?

Despite the fact that GStreamer is much more flexible than other solutions, most other tools, especially those based on the ffmpeg library (e.g. mencoder), have by design difficulties in managing A/V synchronisation: they process the audio and the video stream independently, frame by frame, as frames come in. Afterwards the streams are muxed relying on their nominal rates. E.g. if you have a 25fps video stream and a 48,000Hz audio stream, the muxer simply takes 1 video frame, 1920 audio samples, 1 video frame and so on. This is likely to lead to sync issues for at least three reasons:

  • If frames get dropped, audio and video shift against each other. For example, if your CPU is not fast enough and occasionally drops a video frame, after 25 dropped video frames the video is one second in advance. (Using mencoder this can be fixed in most situations by the -harddup option, which makes mencoder watch for dropped video frames and duplicate other frames to fill the gaps before muxing.)
  • The audio and video devices need different amounts of time to start their streams. For example, after you start capturing, the first audio frame may be grabbed 0.01 seconds later while the first video frame is grabbed 0.7 seconds later. This leads to a constant time offset between audio and video. (Using mencoder this can be fixed with the -delay option, but you have to find the appropriate delay by trial and error, and it won't be very precise. Another problem is that things often change when you update your software, as for example the buffers and timeouts of the drivers and the audio framework change, so you have to repeat the procedure after each update.)
  • If the clocks of your audio source and your video source are not accurate enough (very common in low-cost consumer products) and your webcam delivers 25.02 fps instead of 25 fps, or your audio source delivers 47,999Hz instead of 48,000Hz, audio and video slowly drift apart over time; after an hour or so of capturing they may differ by a second or so. This is not only an issue of inaccurate clocks. Analogue TV is not completely analogue: the frames are discrete, so video capturing devices can't rely on their internal clocks but have to synchronize to the incoming signal. That's no problem when capturing live TV, but it is when you capture the signal of a VHS recorder: these recorders can't instantly change the speed of the tape and need to slightly adjust the fps for accurate tracking, especially if the quality of the tape is bad. (This issue can't be addressed using mencoder at all.)

To be more robust and accurate, GStreamer attaches a timestamp to each incoming frame at the very beginning, using the PC's clock, and muxing is done at the end according to these timestamps. Many container formats, for example Matroska, support timestamps, so the GStreamer timestamps are written to the file when capturing to those formats. Others like AVI can't handle timestamps: if you write streams with a varying framerate (for example due to frame drops) to such files, they are played back with varying speed (but nevertheless correct sync) as the timestamps get lost. To prevent this you have to put the videorate plugin at the end of the pipeline's video part and the audiorate plugin at the end of the pipeline's audio part; they produce streams with constant rates by duplicating and dropping frames according to the timestamps. To further improve A/V sync one should use the do-timestamp=true option on each capturing source, which takes the timestamps from the drivers instead of letting GStreamer do the stamping. This is even more accurate, as effects of the hardware buffering are also taken into account. Even if the source doesn't support this (e.g. many v4l2 drivers don't support timestamping) there is no harm in setting it, as GStreamer falls back to the normal behaviour in those cases.
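
Putting these rules together, a skeleton 0.10 pipeline for an AVI capture might look like this (a sketch only; devices, caps and encoders are placeholders meant to show where do-timestamp, videorate and audiorate belong):

gst-launch-0.10 avimux name=mux ! filesink location=capture.avi \
v4l2src do-timestamp=true ! queue ! ffmpegcolorspace ! videorate ! \
video/x-raw-yuv,width=640,height=480,framerate=25/1 ! ffenc_mpeg4 ! queue ! mux. \
alsasrc do-timestamp=true ! queue ! audioconvert ! audiorate ! \
audio/x-raw-int,rate=48000,channels=2,depth=16 ! lame ! queue ! mux.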

Common capturing issues and their solutions

Jerking

  • If your hardware is not fast enough you will probably get frame drops; an indication is high CPU load. If there are only a few frame drops you can try to address the issue by enlarging the buffers. (If that doesn't help you have to change the codecs or their parameters and/or lower the resolution.) Enlarge the queues as well as the buffers of your sources, e.g. queue max-size-buffers=0 max-size-time=0 max-size-bytes=0, v4l2src queue-size=16, pulsesrc buffer-time=2000000. Increasing the buffer sizes doesn't do any harm even when it isn't necessary; big buffers only increase the pipeline's latency, which doesn't matter for capturing purposes.
  • Some muxers need big chunks of data at once, so situations arise in which the muxer waits, for example, for more audio data and doesn't process any video data. If this takes longer than it takes to fill up all the video buffers of the pipeline, video frames get dropped. To prevent the resulting jerking one should enlarge all the buffers. queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 should disable any size restrictions of the queue, but in fact there seems to be a hard-coded restriction in some cases; if needed you can work around this by adding additional queue elements to the pipeline. You can test whether too-small buffers cause your jerking by changing the pipeline to write the audio and the video stream into different files instead of muxing them together, as in the sketch after this list. If these files don't judder, too-small buffers are your problem.
  • Many v4l2 drivers don't support timestamping, so even if do-timestamp=true is given, GStreamer has to do the timestamping when it gets the frames from the driver. Most drivers tend to fill up their internal buffers and pass many frames as a cluster, which makes GStreamer give all the frames of such a cluster (nearly) the same timestamp. As those inaccuracies are typically very small this doesn't disturb A/V sync, but without videorate it leads to jerking, as the frames are played back in clusters. With videorate the jerking is even worse: videorate, relying on the inaccurate timestamps, drops all but one frame of each cluster and duplicates that frame to fill the gaps. (If videorate is used with the silent=false option it reports many frame drops and copies even though the CPU load is low.) To solve this problem use the stamp plugin between v4l2src and queue, for example v4l2src do-timestamp=true ! stamp sync-margin=2 sync-interval=5 ! queue. The stamp plugin inspects the buffers and uses a smart algorithm to correct the timestamps if the driver doesn't support timestamping. With the sync options, stamp can additionally drop or duplicate frames to achieve a close to constant framerate. In most cases this doesn't completely replace videorate, and it's safe to use videorate in addition.
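
A diagnostic sketch for the buffer test mentioned in the second item (encoders are placeholders): audio and video are written to two separate files, so no common muxer can stall the pipeline:

gst-launch-0.10 \
v4l2src do-timestamp=true ! queue ! ffmpegcolorspace ! theoraenc ! \
oggmux ! filesink location=video_only.ogg \
alsasrc do-timestamp=true ! queue ! audioconvert ! vorbisenc ! \
oggmux ! filesink location=audio_only.ogg

If these files play smoothly while the muxed capture judders, too-small buffers are the culprit.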

Capturing disturbed video signals

  • Most video capturing devices send an EndOfStream signal if the quality of the input signal is too bad or if there is a period of snow. This aborts the capturing process. To prevent the device from sending EOS, set num-buffers=-1 on the v4l2src element.
  • The stamp plugin gets confused by periods of snow and produces faulty timestamps and frame dropping. This effect itself doesn't matter, as stamp recovers normal behaviour when the disturbance is over, but chances are good that the buffers are then full of old, weirdly stamped frames. stamp drops only one of them per sync-interval, so it can take quite a long time (minutes) until everything works fine again. To solve this problem, set leaky=2 on each queue element to allow dropping of old frames that are no longer needed.
  • If you encode with a variable bitrate, the bitrate rises very sharply during periods of bad signal quality or snow. Afterwards the codec uses a very low bitrate to reach the desired average, resulting in poor quality. To prevent this, and to stay within the limits allowed for your target format, don't forget to specify a maximum and a minimum for the variable bitrate, as the fragment below shows.
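
All three countermeasures are plain element properties. The fragment below (the ellipses stand for the rest of a real pipeline) marks where each one goes:

v4l2src num-buffers=-1 ... ! \
queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! ... ! \
ffenc_mpeg2video bitrate=4000000 rc-min-rate=3500000 rc-max-rate=7000000 ! ...

The complete DVD-compliant example below uses these settings.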

Sample commands

Record to ogg theora

gst-launch-0.10 oggmux name=mux ! filesink location=test0.ogg v4l2src device=/dev/video2 ! \
video/x-raw-yuv,width=640,height=480,framerate=\(fraction\)30000/1001 ! ffmpegcolorspace ! \
theoraenc ! queue ! mux. alsasrc device=hw:2,0 ! audio/x-raw-int,channels=2,rate=32000,depth=16 ! \
audioconvert ! vorbisenc ! mux.

The files will play in mplayer, using the codec Theora. Note the required workaround to get sound on a saa7134 card, which is set at 32000Hz (cf. bug). However, I was still unable to get sound output, though mplayer claimed there was sound -- the video is good quality:

VIDEO:  [theo]  640x480  24bpp  29.970 fps    0.0 kbps ( 0.0 kbyte/s)
Selected video codec: [theora] vfm: theora (Theora (free, reworked VP3))
AUDIO: 32000 Hz, 2 ch, s16le, 112.0 kbit/10.94% (ratio: 14000->128000)
Selected audio codec: [ffvorbis] afm: ffmpeg (FFmpeg Vorbis decoder)
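
With GStreamer 1.0 the same capture should look roughly like this (an untested translation of the pipeline above: the raw caps become video/x-raw and audio/x-raw, and videoconvert replaces ffmpegcolorspace):

gst-launch-1.0 oggmux name=mux ! filesink location=test0.ogg \
v4l2src device=/dev/video2 ! video/x-raw,width=640,height=480,framerate=30000/1001 ! \
videoconvert ! theoraenc ! queue ! mux. \
alsasrc device=hw:2,0 ! audio/x-raw,format=S16LE,channels=2,rate=32000 ! \
audioconvert ! vorbisenc ! mux.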

Record to mpeg4

Or mpeg4 with an avi container (Debian has disabled its ffmpeg encoders, so install Marillat's package or use the example above):

gst-launch-0.10 avimux name=mux ! filesink location=test0.avi v4l2src device=/dev/video2 ! \
video/x-raw-yuv,width=640,height=480,framerate=\(fraction\)30000/1001 ! ffmpegcolorspace ! \
ffenc_mpeg4 ! queue ! mux. alsasrc device=hw:2,0 ! audio/x-raw-int,channels=2,rate=32000,depth=16 ! \
audioconvert ! lame ! mux.

I get a file out of this that plays in mplayer, with blocky video and no sound. Avidemux cannot open the file.
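
A GStreamer 1.0 version would use the renamed encoder elements, avenc_mpeg4 instead of ffenc_mpeg4 and lamemp3enc instead of lame; an untested sketch:

gst-launch-1.0 avimux name=mux ! filesink location=test0.avi \
v4l2src device=/dev/video2 ! video/x-raw,width=640,height=480,framerate=30000/1001 ! \
videoconvert ! avenc_mpeg4 ! queue ! mux. \
alsasrc device=hw:2,0 ! audio/x-raw,format=S16LE,channels=2,rate=32000 ! \
audioconvert ! lamemp3enc ! mux.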

Record to DVD-compliant MPEG2

  entrans -s cut-time -c 0-180 -v -x '.*caps' --dam -- --raw \
  v4l2src queue-size=16 do-timestamp=true device=/dev/video0 norm=PAL-BG num-buffers=-1 ! stamp silent=false progress=0 sync-margin=2 sync-interval=5 ! \
     queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! dam ! \
     cogcolorspace ! videorate silent=false ! \
     'video/x-raw-yuv,width=720,height=576,framerate=25/1,interlaced=true,aspect-ratio=4/3' ! \
     queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! \
     ffenc_mpeg2video rc-buffer-size=1500000 rc-max-rate=7000000 rc-min-rate=3500000 bitrate=4000000 max-key-interval=15 pass=pass1 ! \
     queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! mux. \
  pulsesrc buffer-time=2000000 do-timestamp=true ! \
     queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! dam ! \
     audioconvert ! audiorate silent=false ! \
     audio/x-raw-int,rate=48000,channels=2,depth=16 ! \
     queue silent=false max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! \
     ffenc_mp2 bitrate=192000 ! \
     queue silent=false leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! mux. \
  ffmux_mpeg name=mux ! filesink location=my_recording.mpg

This captures 3 minutes (180 seconds, see the first line of the command) to my_recording.mpg and even works with bad input signals.

  • I wasn't able to figure out how to produce an MPEG with AC3 sound, as neither ffmux_mpeg nor mpegpsmux support AC3 streams at the moment. mplex does, but I wasn't able to get it working: it needs very big buffers to keep the pipeline from stalling, and at least my GStreamer build didn't allow for buffers that big.
  • The limited buffer size on my system is also the reason why I had to add a third queue element to the middle of the audio part as well as of the video part of the pipeline to prevent jerking.
  • In many HOWTOs you find ffmpegcolorspace instead of cogcolorspace. That works as well, but cogcolorspace is much faster.
  • It seems to be important that the video/x-raw-yuv,width=720,height=576,framerate=25/1,interlaced=true,aspect-ratio=4/3 statement comes after videorate, as videorate otherwise seems to drop the aspect-ratio metadata, resulting in files with aspect ratio 1 in their headers. Such files are probably played back warped, and programs like dvdauthor complain.

Record to raw video

If you don't care about sound, this simple version works for uncompressed video:

gst-launch-0.10 v4l2src device=/dev/video5 ! video/x-raw-yuv,width=640,height=480 ! avimux ! \
filesink location=test0.avi

tcprobe says this video-only file uses the I420 codec and gives the framerate as correct NTSC:

$ tcprobe -i test1.avi
[tcprobe] RIFF data, AVI video
[avilib] V: 29.970 fps, codec=I420, frames=315, width=640, height=480
[tcprobe] summary for test1.avi, (*) = not default, 0 = not detected
import frame size: -g 640x480 [720x576] (*)
       frame rate: -f 29.970 [25.000] frc=4 (*)
   no audio track: use "null" import module for audio
           length: 315 frames, frame_time=33 msec, duration=0:00:10.510

The files will play in mplayer, using the codec [raw] RAW Uncompressed Video.
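
The GStreamer 1.0 counterpart should be (untested, with the raw caps renamed as above):

gst-launch-1.0 v4l2src device=/dev/video5 ! video/x-raw,width=640,height=480 ! avimux ! \
filesink location=test0.avi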

Bash script to digitize video tapes easily

#!/bin/bash
 
 targetdirectory="$HOME/videos"   # a quoted "~" is not expanded, hence $HOME
 
 
 # Check whether another instance is already running
 
 if [[ -e "$HOME/.lock_shutdown.digitalisieren" ]]; then
     echo ""
     echo ""
     echo "Capturing already running. It is impossible to capture two tapes simultaneously. Hit a key to abort."
     read -n 1
     exit
 fi
 
 # trap keyboard interrupt (control-c) and other fatal signals
 # (SIGKILL cannot be trapped; trapping 0/EXIT would run the cleanup twice)
 trap control_c SIGHUP SIGINT SIGQUIT SIGABRT SIGALRM SIGSEGV SIGTERM
 
 control_c()
 # run if user hits control-c
 {
   cleanup
   exit $?
 }
 
 cleanup()
 {
   rm ~/.lock_shutdown.digitalisieren
   return $?
 }
 
 touch "$HOME/.lock_shutdown.digitalisieren"
 
 echo ""
 echo ""
 echo "Please enter the length of the tape in minutes and press ENTER. (Press Ctrl+C to abort.)"
 echo ""
 while read -e laenge; do
     if [[ $laenge == [0-9]* ]]; then
         break    # "break 2" would fail here: there is only one enclosing loop
     else
         echo ""
         echo ""
         echo "That's not a number."
         echo "Please enter the length of the tape in minutes and press ENTER. (Press Ctrl+C to abort.)"
         echo ""
     fi
 done
 
 let laenge=laenge+10  # safety margin in case the tape runs longer than stated
 let laenge=laenge*60  # minutes to seconds
 
 echo ""
 echo ""
 echo "Please type in the description of the tape."
 echo "Don't forget to rewind the tape?"
 echo "Hit ENTER to start capturing. Press Ctrl+C to abort."
 echo ""
 read -e name;
 name=${name//\//_}
 name=${name//\"/_}
 name=${name//:/_}
 
 # If the target file name already exists, append a number
 if [[ -e "$targetdirectory/$name.mpg" ]]; then
     nummer=0
     while [[ -e "$targetdirectory/$name.$nummer.mpg" ]]; do
        let nummer=nummer+1
     done
     name=$name.$nummer
 fi
 
 # Audio settings: unmute capture and set the level
 amixer -D pulse cset name='Capture Switch' 1 >& /dev/null      # enable the capture channel
 amixer -D pulse cset name='Capture Volume' 20724 >& /dev/null  # set the capture level
 
 # Select the video input and configure the card
 v4l2-ctl --set-input 3 >& /dev/null
 v4l2-ctl -c saturation=80 >& /dev/null
 v4l2-ctl -c brightness=130 >& /dev/null
 
 let ende=$(date +%s)+laenge
 
 echo ""
 echo "Working"
 echo "Capturing will be finished at "$(date -d @$ende +%H.%M)"."
 echo ""
 echo "Press Ctrl+C to finish capturing now."
 
 
 nice -n -10 entrans -s cut-time -c 0-$laenge -m --dam -- --raw \
 v4l2src queue-size=16 do-timestamp=true device=/dev/video0 norm=PAL-BG num-buffers=-1 ! stamp sync-margin=2 sync-interval=5 silent=false progress=0 ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! dam ! \
    cogcolorspace ! videorate ! \
    'video/x-raw-yuv,width=720,height=576,framerate=25/1,interlaced=true,aspect-ratio=4/3' ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! \
    ffenc_mpeg2video rc-buffer-size=1500000 rc-max-rate=7000000 rc-min-rate=3500000 bitrate=4000000 max-key-interval=15 pass=pass1 ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! mux. \
 pulsesrc buffer-time=2000000 do-timestamp=true ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! dam ! \
    audioconvert ! audiorate ! \
    audio/x-raw-int,rate=48000,channels=2,depth=16 ! \
    queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! \
    ffenc_mp2 bitrate=192000 ! \
    queue leaky=2 max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! mux. \
 ffmux_mpeg name=mux ! filesink location=\"$targetdirectory/$name.mpg\" >& /dev/null
 
 echo "Finished Capturing"
 rm ~/.lock_shutdown.digitalisieren

The script uses an entrans command line similar to the one shown above to produce a DVD-compliant MPEG2 file.

  • The script aborts if another instance is already running.
  • Otherwise it asks for the length of the tape and for a description.
  • It records to description.mpg, or, if that file already exists, to description.0.mpg and so on, for the given time plus 10 minutes. The target directory has to be specified at the beginning of the script.
  • As setting the inputs and parameters of the capture device is only partly possible via GStreamer, other tools (amixer, v4l2-ctl) are used.
  • Adjust the settings to match your input sources, the recording volume, the capturing saturation and so on.
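
Saved as, say, digitize.sh (the file name is arbitrary), the script is made executable and run like any other shell script:

chmod +x digitize.sh
./digitize.sh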


Record from a bad analog signal to MJPEG video and RAW mono audio

All of the advice above still applies. However, in GStreamer 1.0 the stamp element no longer exists, and cogcolorspace and ffmpegcolorspace have been replaced by videoconvert.

gst-launch-1.0 -v avimux name=mux v4l2src device=/dev/video1 do-timestamp=true ! 'video/x-raw,format=(string)YV12,width=(int)720,height=(int)576' !\
videorate ! 'video/x-raw,format=(string)YV12,framerate=25/1' ! videoconvert ! 'video/x-raw,format=(string)YV12,width=(int)720,height=(int)576' !\
jpegenc  ! queue ! mux. alsasrc device=hw:2,0 ! 'audio/x-raw,format=(string)S16LE,rate=(int)48000,channels=(int)2' !\
audiorate ! audioresample ! 'audio/x-raw,rate=(int)44100' ! audioconvert ! 'audio/x-raw,channels=(int)1' ! queue ! mux. mux. ! filesink location=test.avi

This pipeline grabs video from the V4L2 device /dev/video1 (compressing it to MJPEG) and sound from the ALSA device hw:2,0 and records both into an AVI file. Adapt the input formats to your chip and do any conversions after videorate/audiorate.

As stated above, it is best to use both audiorate and videorate: you probably use the same chip to capture the audio and the video, so the audio part is subject to disturbances as well.


Using GStreamer to view pictures from a Webcam

gst-launch-0.10 v4l2src use-fixed-fps=false ! video/x-raw-yuv,format=\(fourcc\)UYVY,width=320,height=240 \
! ffmpegcolorspace ! ximagesink
gst-launch-0.10 v4lsrc autoprobe-fps=false device=/dev/video0 ! "video/x-raw-yuv, width=160, height=120, \
framerate=10, format=(fourcc)I420" ! xvimagesink  
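
With GStreamer 1.0 a corresponding viewer would be (an untested sketch; the fourcc caps field is replaced by format):

gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,format=UYVY,width=320,height=240 ! \
videoconvert ! ximagesink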

Converting formats

To convert the files for use with Matlab (this didn't work for me):

mencoder test0.avi -ovc raw -vf format=bgr24 -o test0m.avi -ffourcc none 

For details, see the gst-launch documentation and search the web; the plugins in particular are poorly documented so far.

