Hi,
as you may have noticed: ZDF increased its bitrate to about 7.5 Mbit/s. As a result, less powerful VDR systems "suffered" when taking a recording or when running in transfer mode.
The attached patch speeds up cVideoRepacker by using a well-known search algorithm for finding start codes in the video stream.
With this patch and a similar one for xine-lib I was finally able to watch and record ZDF at the same time on my 600 MHz "budget only" EPIA VDR system.
Bye.
Reinhard Nissl wrote:
Did you consider using memchr()? e.g. something like
...
    while (Data < Limit) {
        Data = memchr(Data, 0x01, Limit - Data);
        if (Data == NULL)
            break;
        if (Data[-2] != 0x00 || Data[-1] != 0x00)
            Data += 3;
        ...
It makes no noticeable difference on my AMD64 machine (<1%), but maybe it is worth trying on your EPIA?
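For readers following along, here is a minimal, self-contained C sketch of the memchr() idea. It is not the attached patch: the function name count_start_codes_memchr and the choice to count matches instead of repacking are assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the patch): count start code prefixes
 * (00 00 01) in buf[0..len) by letting memchr() locate each 0x01 byte. */
size_t count_start_codes_memchr(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    const unsigned char *p;
    const unsigned char *limit;

    if (len < 3)
        return 0;

    p = buf + 2;          /* earliest position where 0x01 can end a prefix */
    limit = buf + len;

    while (p < limit) {
        p = memchr(p, 0x01, (size_t)(limit - p));
        if (p == NULL)
            break;
        if (p[-2] == 0x00 && p[-1] == 0x00)
            count++;
        /* A 0x01 at p rules out a prefix ending at p+1 or p+2, because
         * both would need a 0x00 at p. So the search resumes at p+3. */
        if ((size_t)(limit - p) <= 3)
            break;
        p += 3;
    }
    return count;
}
```

Calling it on the bytes AA 00 00 01 B3 00 00 01 01 01 00 00 00 01 should report three prefixes, ending at offsets 3, 7 and 13.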
Jon
Hi,
Jon Burgess wrote:
I don't think that it is worth a try as it tests every byte while the above code tests most of the time only every third byte.
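To make "only every third byte" concrete, here is a rough C sketch of such a scan. It is an assumption about the general technique, not a copy of the attached patch, and the function name count_start_codes_skip is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: count start code prefixes (00 00 01) in buf[0..len)
 * while skipping three bytes whenever the inspected byte rules out a prefix. */
size_t count_start_codes_skip(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    size_t i = 2;                 /* candidate position of the 0x01 byte */

    while (i < len) {
        unsigned char b = buf[i];

        if (b > 0x01) {
            /* b is neither 00 nor 01, so no prefix can end at i, i+1 or i+2 */
            i += 3;
        } else if (b == 0x01) {
            if (buf[i - 1] == 0x00 && buf[i - 2] == 0x00)
                count++;
            /* a 0x01 at i also rules out prefixes ending at i+1 and i+2 */
            i += 3;
        } else {
            /* b == 0x00: a prefix may end at the very next byte */
            i += 1;
        }
    }
    return count;
}
```

Since most bytes in a video stream are larger than 01, the first branch is taken most of the time, which is where the "every third byte" behaviour comes from.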
Consider the following data:
hh ii jj kk [ 00 00 01 01 ] mm nn oo 01 01 pp 01
^^^^^^^^^^^^^^^^^^^    ++++++++++^^^   ^   ^^
         6                   4         0   1
This gives the distances shown above between consecutive 01 bytes that are not part of a start code (i.e. the first slice start code [ 00 00 01 01 ]) or that immediately follow a start code (the area marked with +++++). The ++ and ^^ markers indicate the bytes that count towards each distance.
For the above case, a statistical analysis gives these numbers:
distance: 0, count: 1
distance: 1, count: 1
distance: 2, count: 0
distance: 3, count: 0
distance: 4, count: 1
distance: 5, count: 0
distance: 6, count: 1
distance: >, count: 0
Now consider real data, like 001.vdr of "Wetten, dass ..?" which was broadcast last Saturday on ZDF:
distance: 0, count: 75248
distance: 1, count: 519949
distance: 2, count: 316632
distance: 3, count: 331874
distance: 4, count: 381855
distance: 5, count: 367280
distance: 6, count: 370620
distance: 7, count: 369649
distance: 8, count: 405861
distance: 9, count: 360555
distance: 10, count: 332593
distance: 11, count: 366192
distance: 12, count: 319692
distance: 13, count: 313795
distance: 14, count: 309567
distance: 15, count: 305986
distance: 16, count: 297510
distance: 17, count: 291607
distance: 18, count: 286045
distance: >, count: 18794252
As you can easily see, it's a waste of time to test every byte (= distance 0) for 01 as it is very unlikely to find one.
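For anyone who wants to gather similar statistics on their own recordings, here is a small C sketch. It encodes one reading of the distance definition that reproduces the 6 / 4 / 0 / 1 example: every 01 byte closes the current gap, and the three bytes right after a start code prefix are skipped untested but still count towards the next gap. The names distance_histogram and MAX_DISTANCE are hypothetical, and this is not the tool that produced the numbers above.

```c
#include <stddef.h>

#define MAX_DISTANCE 18   /* larger gaps go into the ">" bucket */

/* Hypothetical helper: hist[d] counts gaps of d bytes for d = 0..MAX_DISTANCE,
 * hist[MAX_DISTANCE + 1] counts all larger gaps. */
void distance_histogram(const unsigned char *buf, size_t len,
                        unsigned long hist[MAX_DISTANCE + 2])
{
    size_t gap = 0;
    size_t i = 0;
    size_t d;

    for (d = 0; d < MAX_DISTANCE + 2; d++)
        hist[d] = 0;

    while (i < len) {
        if (buf[i] != 0x01) {         /* ordinary byte: the gap grows */
            gap++;
            i++;
            continue;
        }

        /* a 01 byte closes the current gap */
        hist[gap > MAX_DISTANCE ? MAX_DISTANCE + 1 : gap]++;

        if (i >= 2 && buf[i - 1] == 0x00 && buf[i - 2] == 0x00) {
            /* start code prefix: skip the next three bytes without testing
             * them, but let them count towards the next gap (the + area) */
            size_t rest = len - i - 1;
            size_t skip = rest < 3 ? rest : 3;
            gap = skip;
            i += 1 + skip;
        } else {
            gap = 0;
            i++;
        }
    }
}
```

Fed with the 15 example bytes (any values other than 00 and 01 for hh, ii, ...), it fills the buckets for distances 0, 1, 4 and 6 with one entry each. A gap still open at the end of the buffer is not recorded.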
Bye.
Reinhard Nissl wrote:
I don't think that it is worth a try as it tests every byte while the above code tests most of the time only every third byte.
I agree that your algorithm is clever and does greatly cut down the number of comparisons as compared to the old code.
The glibc memchr() implementation does the comparisons 4 bytes at a time using a clever algorithm. It also has assembler-optimised variants for some CPUs. I don't think that only doing a comparison of every 3rd byte wins you anything over memchr().
I believe the bulk of the time taken by the routine is transferring all the data from memory into the CPU. Every byte of the data will have to be read into the CPU caches due to cacheline effects. I believe that the asm optimisations will take into account the possibilities of speculative readahead etc. I've not looked into the assembler to see whether it actually exploits this.
I've attached the quickly hacked up test program that I wrote. The output is the time taken for many iterations of the 2 different algorithms. For me the difference is within the measurement noise. It certainly isn't any slower. I'd be interested to know whether it makes any difference on your EPIA, both in the test program and in VDR.
$ ./search /video0/%Click_Online/2005-04-10.04:28.99.99.rec/001.vdr
Found 10585344 matches in 12.5873 seconds
Found 10585344 matches in 12.6235 seconds
Jon
On Wednesday 01 February 2006 00:33, Jon Burgess wrote:
Just done a quick test on my Epia MII-12000 system (whilst under heavy load running softdevice, etc. so the numbers are probably crap!):
laz@vdr-tng 2006-02-01.18.28.50.50.del $ ~/stmp 001.vdr
Found 12721024 matches in 276.0596 seconds
Found 12721024 matches in 201.9245 seconds
I'll try to test it again later when it has finished recording so it is more idle! With the above numbers, it looks like the memchr() is quite a bit faster, at least under these conditions!
Cheers,
Laz
On Wednesday 01 February 2006 21:54, Laz wrote:
And not under load:
laz@vdr-tng 2006-02-01.18.28.50.50.del $ ~/stmp 001.vdr
Found 12721024 matches in 47.0021 seconds
Found 12721024 matches in 32.8197 seconds
memchr() definitely looks better for my Epia system.
I'll try Reinhard's new patch next...
Cheers,
Laz
Hi,
Jon Burgess wrote:
You were right. Using memchr() reduces CPU load on my 600 MHz EPIA System by 1 % for channel ZDF and by 4 % for the HDTV channel HDFORUM. The numbers were taken by just running VDR in transfer mode for the mentioned channel (= no xine attached to VDR).
I also gave memmem() a try but the CPU load was increased by this change.
Attached you'll find an updated patch according to your suggestion.
Bye.