(On the other hand, as far as I'm concerned, the whole optimized memcpy and related stuff is on its way out: it is being used in only one place in the whole plugin and I'm told that the performance of memcpy doesn't really matter even there. If someone has something against it being removed, now would be a good time to yell... :)
Interesting that you mention that now. Over the last few days I have been experimenting with dxr3 on my VIA Eden board with a C3 processor. memcpy_mmx makes the plugin segfault every time you start it, with the message "unknown machine code". This is caused by the assembler code in the function mmx_memcpy in dxr3memcpy.c. I removed that function from cDxr3MemCpyBench::cDxr3MemCpyBench and then it worked. The strange thing is that I remember it working before ...
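A less drastic alternative to removing the routine might be to check the CPU at run time and fall back to plain memcpy when the feature is missing. The following is only a rough sketch, not the plugin's actual code: select_copy and mmx_copy_placeholder are made-up names, the placeholder just forwards to memcpy so the example stays runnable, and it relies on the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports on x86. I have not verified which instruction actually faults on the C3, so a check for the "mmx" flag alone may not be enough if the routine uses instructions beyond plain MMX.

```c
#include <stddef.h>
#include <string.h>

/* Common signature for all copy routines (same shape as memcpy). */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Hypothetical stand-in for the plugin's MMX routine. It only forwards
 * to memcpy here so that the sketch compiles and runs anywhere. */
static void *mmx_copy_placeholder(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick a copy routine at run time. Assumption: the optimized routine
 * uses nothing beyond what the tested CPU flag guarantees. */
static copy_fn select_copy(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("mmx"))
        return mmx_copy_placeholder;
#endif
    /* Safe default on CPUs without the feature or on other compilers. */
    return memcpy;
}
```

The benchmark constructor could then skip any routine whose feature check fails instead of running it unconditionally.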
Jan