On Wednesday 01 February 2006 00:33, Jon Burgess wrote:
Reinhard Nissl wrote:
I don't think that it is worth a try as it tests every byte while the above code tests most of the time only every third byte.
I agree that your algorithm is clever and does greatly cut down the number of comparisons as compared to the old code.
The glibc memchr() implementation does the comparisons 4 bytes at a time using a clever algorithm. It also has assembler-optimised variants for some CPUs. I don't think that only comparing every 3rd byte wins you anything over memchr().
I believe the bulk of the time taken by the routine is transferring all the data from memory into the CPU. Every byte of the data will have to be read into the CPU caches due to cacheline effects. I believe that the asm optimisations will take into account the possibilities of speculative readahead etc. I've not looked into the assembler to see whether it actually exploits this.
I've attached the quickly hacked up test program that I wrote. The output is the time taken for many iterations of the 2 different algorithms. For me the difference is within the measurement noise. It certainly isn't any slower. I'd be interested to know whether it makes any difference on your EPIA, both in the test program and in VDR.
$ ./search /video0/%Click_Online/2005-04-10.04:28.99.99.rec/001.vdr
Found 10585344 matches in 12.5873 seconds
Found 10585344 matches in 12.6235 seconds
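For anyone reading without the attachment, the two approaches being compared can be sketched roughly as below. This is not the attached test program and not the VDR code. It assumes the pattern being searched for is the three-byte MPEG start code prefix 00 00 01, which the thread does not state explicitly, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: count 00 00 01 prefixes by looking at the byte where
 * the 0x01 would sit. Any value above 1 rules out a prefix ending at that
 * byte or at either of the next two, so the scan can jump three bytes. */
size_t count_startcodes_stepping(const unsigned char *data, size_t len)
{
    size_t count = 0;
    size_t i = 2;                     /* candidate position of the 0x01 byte */

    if (len < 3)
        return 0;
    while (i < len) {
        if (data[i] > 1) {
            i += 3;                   /* neither 0x00 nor 0x01: skip ahead */
        } else if (data[i] == 0) {
            i += 1;                   /* may be part of the 00 00 run */
        } else {                      /* data[i] == 1 */
            if (data[i - 1] == 0 && data[i - 2] == 0)
                count++;
            i += 3;                   /* a 0x01 cannot start a new prefix */
        }
    }
    return count;
}

/* Hypothetical sketch: let memchr() find the next 0x00, then check the two
 * bytes that follow it. Assumes the search byte is 0x00. */
size_t count_startcodes_memchr(const unsigned char *data, size_t len)
{
    size_t count = 0;
    const unsigned char *p = data;
    const unsigned char *end;

    if (len < 3)
        return 0;
    end = data + len;
    while (end - p >= 3) {
        /* Only search positions where a full 3-byte prefix still fits. */
        p = memchr(p, 0x00, (size_t)(end - p) - 2);
        if (p == NULL)
            break;
        if (p[1] == 0x00 && p[2] == 0x01) {
            count++;
            p += 3;
        } else {
            p += 1;
        }
    }
    return count;
}
```

Both functions should report the same count on the same buffer; only the time taken would differ, and that is what the numbers above compare.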
Just done a quick test on my Epia MII-12000 system (whilst under heavy load running softdevice, etc., so the numbers are probably crap!):
laz@vdr-tng 2006-02-01.18.28.50.50.del $ ~/stmp 001.vdr
Found 12721024 matches in 276.0596 seconds
Found 12721024 matches in 201.9245 seconds
I'll try to test it again later when it has finished recording and the machine is more idle! With the above numbers, it looks like memchr() is quite a bit faster, at least under these conditions!
Cheers,
Laz