Hi,
Morfsta wrote:
What I do not understand and perhaps you could help me with, is that you say that there is a problem with the firmware, yet there are entries in the log from VDR saying that it is dropping x bytes to sync on the next audio frame... This suggests to me that VDR is doing *something* with respect to audio synchronisation. Why then is the problem always reported as being an issue with the firmware?
As I tried to explain already, VDR doesn't see the TS packets when you for example just watch live TV on a FF card (i. e. no recording running in background), and the watched channel is available via the FF card.
When VDR runs a transfer thread e. g. when a budget card supplies a channel that is to be watched on a FF card or when VDR is recording, then the TS packets travel through cAudioRepacker and sync messages are reported.
What happens when e. g. a TS packet gets lost in an audio stream?
Consider the following TS packets numbered Tx. An even and an odd numbered packet together shall build an audio frame (i. e., the length information at the beginning of an audio packet shall report a length which is consistent with this assumption) which is stored in a PES packet numbered Pxy:
  T0 T1   T2 T3   T4 T5   T6 T7
   \ /     \ /     \ /     \ /
   P01     P23     P45     P67
Now assume that T3 gets lost:
  T0 T1   T2 T4   T5   T6 T7
   \ /     \ /          \ /
   P01     P24          P67
According to the length information, cAudioRepacker bundles T2 and T4 into PES packet P24. As the tail of this audio packet is incorrect, playing it might result in some noise. cAudioRepacker will go out of sync when reading packet T5, as it doesn't find an audio frame sync marker there, and starts skipping bytes until it reaches T6. There it finds the audio frame sync marker and reports that syncing skipped some bytes. After that it continues its work normally.
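To make the resync behaviour a bit more tangible, here is a small Python toy model of the scenario above. It is not VDR's cAudioRepacker code: the one-byte sync marker, the one-byte length field, the 4-byte TS payload size and the names make_ts_packets and repack are all assumptions made up for this sketch.

```python
SYNC = 0xFF                # hypothetical one-byte frame sync marker
TS_SIZE = 4                # toy TS payload size in bytes
FRAME_SIZE = 2 * TS_SIZE   # one audio frame spans exactly two TS packets


def make_ts_packets(n_frames):
    """Build toy TS payloads; frame i is split into packets T(2i) and T(2i+1)."""
    packets = []
    for i in range(n_frames):
        # Frame layout (assumed): sync marker, total length, then payload bytes.
        frame = bytes([SYNC, FRAME_SIZE]) + bytes([i]) * (FRAME_SIZE - 2)
        packets.append(frame[:TS_SIZE])
        packets.append(frame[TS_SIZE:])
    return packets


def repack(stream):
    """Cut the byte stream into frames, skipping bytes whenever sync is lost.

    Returns (frames, skipped) where skipped lists the byte counts dropped
    each time the repacker had to search for the next sync marker.
    """
    frames, skipped = [], []
    pos = 0
    while pos < len(stream):
        start = pos
        while pos < len(stream) and stream[pos] != SYNC:
            pos += 1                      # out of sync: drop bytes
        if pos > start:
            skipped.append(pos - start)   # "skipped x bytes to sync"
        if pos + 2 > len(stream):
            break
        length = stream[pos + 1]
        if length < 2 or pos + length > len(stream):
            break                         # incomplete or invalid frame
        frames.append(stream[pos:pos + length])
        pos += length
    return frames, skipped


packets = make_ts_packets(4)              # T0 .. T7
lossy = packets[:3] + packets[4:]         # T3 gets lost
frames, skipped = repack(b"".join(lossy))
print(len(frames), skipped)
```

In this toy run the second frame consists of T2 followed by T4, the four bytes of T5 are skipped, and only three frames come out, which matches the picture above.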
BTW: just in case you have a look at the code, please keep in mind that this explanation has been simplified for clarity -- the real work is done differently.
What's the result of the above process?
In the error case you will get only 3 audio packets. Moreover, if you simply concatenate them, packet P67 is replayed at the point in time where the original packet P45 would have been replayed. If the same technique is used for video packets and frames, it is obvious that audio and video are no longer in sync with each other.
The issue is solved by adding a so called presentation time stamp (PTS) to some PES packets. Let's assume that packet P01 has a PTS of 0 and packet P67 has a PTS of 6. When replaying, the PTS of packet P01 is stored internally and according to the length information of the audio packet, the internal counter is advanced by 2. The next packet P24 doesn't have a PTS value so it is assumed it would have the PTS value of the internal counter which is 2 at that time. This packet is simply output and the internal counter is advanced by 2, resulting in 4. Now, when packet P67 is arriving, a PTS of 6 is read, but the internal counter just shows 4, which means that there is a gap of duration 2 which needs to be filled for example with silence to stay in sync. After that packet P67 is replayed.
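The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from VDR or xine; the function name schedule_audio and the tuple layout are assumptions, and a PTS that lies behind the internal counter is simply ignored here.

```python
def schedule_audio(packets):
    """Place audio packets on a timeline using PTS and an internal counter.

    packets: list of (name, pts_or_None, duration).
    Returns a list of (start_time, item, duration), with "silence"
    inserted wherever a PTS reveals a gap.
    """
    timeline = []
    counter = None
    for name, pts, duration in packets:
        if counter is None:
            # First packet: take its PTS as the starting point.
            counter = pts if pts is not None else 0
        elif pts is not None and pts > counter:
            # The PTS is ahead of the counter: fill the gap with silence.
            timeline.append((counter, "silence", pts - counter))
            counter = pts
        timeline.append((counter, name, duration))
        counter += duration   # advance by the length of the audio packet
    return timeline


# The example from the text: P24 carries no PTS, P67 has a PTS of 6.
for entry in schedule_audio([("P01", 0, 2), ("P24", None, 2), ("P67", 6, 2)]):
    print(entry)
```

For the three packets of the example this yields P01 at 0, P24 at 2, silence of duration 2 at 4, and P67 at 6.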
Keeping audio and video in sync is done by using similar PTS values on both video and audio PES packets. Consider two buffers for decoded audio and video frames. The frames in each buffer were assigned either the PTS value contained in the PES packet or the internally determined PTS value (e. g. in the sample above, for the audio frame in packet P24 a PTS value of 2 was determined). It's now quite easy to replay audio and video in sync by introducing the so-called system time counter (STC), which shall periodically advance by one -- in the above example starting at 0. To replay audio and video in sync, those frames whose assigned PTS value is less than the current STC value are taken from the buffers and presented to the user.
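As a rough illustration of the STC idea, the following Python sketch steps a counter and releases frames from two buffers. It is a toy, not xine's implementation; the function name present, the frame names and the video PTS values are invented for the example.

```python
from collections import deque


def present(audio, video, ticks):
    """Step the STC from 0 to ticks-1 and release frames that are due.

    audio, video: lists of (pts, name), sorted by PTS.
    A frame is released once its PTS is less than the current STC.
    Returns a list of (stc, [names presented at that tick]).
    """
    audio, video = deque(audio), deque(video)
    log = []
    for stc in range(ticks):
        shown = []
        for buf in (audio, video):
            while buf and buf[0][0] < stc:
                shown.append(buf.popleft()[1])
        if shown:
            log.append((stc, shown))
    return log


# Audio frames from the example (P45 is missing), video assumed complete.
audio = [(0, "A01"), (2, "A24"), (6, "A67")]
video = [(0, "V0"), (2, "V2"), (4, "V4"), (6, "V6")]
for stc, shown in present(audio, video, 8):
    print(stc, shown)
```

Although one audio frame is missing, A67 and V6 still come out at the same STC tick, because both buffers are driven by the same counter and not by the order of arrival.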
BTW: this was a simplified explanation how xine manages sync of audio and video while replaying.
Actually, I don't know how this is done in the case of a FF card and what the firmware has to do in this regard. A guess -- which could explain the issues you see -- would be that sync is not maintained continuously. So after having maintained sync for some time, audio and video frames are simply taken out of some FIFOs at a constant rate and presented to the user -- this should keep audio and video in sync as originally maintained. But when, for example, an audio frame then gets lost due to a lost TS packet, audio and video get out of sync, as the lost packet breaks filling the FIFOs at a constant rate. When you try to reproduce this effect by seeking back in the recording, sync is maintained actively again and you don't see this issue at that position in the recording.
Please keep in mind that the last paragraph was just a guess -- I do not want to blame anybody with this email.
Bye.