On Sun, Dec 01, 2024 at 02:18:02PM +0100, schorpp wrote:
HA! I've got this bitch of intermittent bug finally:
Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0xad0ffb40 (LWP 27522)]
0x08136f6a in cFrameDetector::Analyze (this=0x9ade480, Data=<optimized out>, Length=296852) at remux.c:1567
1567 uint32_t Delta = ptsValues[0] / (framesPerPayloadUnit + parser->IFrameTemporalReferenceOffset());
It would have been useful to include the disassembly of the function. Maybe also the output of the following, if the values are known to the debugger:
print ptsValues[0]
print framesPerPayloadUnit
print parser->iFrameTemporalReferenceOffset
I was curious about this. I am able to reproduce SIGFPE on both x86-64 and i386 when compiling the following C program without optimization:
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

int main()
{
  int a = 0, b = -1;
  uint32_t pts = 1U << 31;
  int32_t delta = ((int32_t)pts) / (a + b);
  printf("%" PRIi32 "\n", delta);
  return 0;
}
Initially I had "uint32_t delta" with no type cast, and PRIu32, to exactly match the data types that are involved in VDR. That variant would produce the incorrect result 0. This is of course a bad approximation of the code in VDR, because above it is possible to perform all the arithmetic at compilation time. In VDR, the values would be determined at runtime.
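The unsigned variant can be sketched with the operands passed in as function arguments, so that the division is not folded at compilation time. The name unsigned_delta is made up for illustration and is not VDR code; the sketch assumes that int is 32 bits wide and that uint32_t is unsigned int. Under those assumptions the divisor -1 is converted to 4294967295 before the unsigned division, which would explain the result 0:

```c
#include <stdint.h>

/* Hypothetical helper, not VDR code: mirrors "uint32_t / (int + int)". */
uint32_t unsigned_delta(uint32_t pts, int a, int b)
{
    /* a + b is computed as int. Because the other operand is uint32_t,
       the sum is then converted to unsigned, so -1 becomes 4294967295
       (assuming 32-bit int). A sum of 0 would still divide by zero. */
    return pts / (a + b);
}
```

Calling unsigned_delta(1U << 31, 0, -1) performs an unsigned division of 2147483648 by 4294967295, so no overflow can occur in this variant; only a zero divisor could still trap.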
Curiously, if I compile the above program with GCC 14.2.0 -O2, then it will print the incorrect result -2147483648 instead of an approximation like 2147483647 (which would be one less than the correct result, which cannot be represented in int32_t). If I look at the disassembly, the compiler has performed an incorrect constant folding for "delta".
If I compile the program with -fsanitize=undefined, it will flag an error:
runtime error: division of -2147483648 by -1 cannot be represented in type 'int'
For the non-optimized case, for both i386 and x86-64, I see that the SIGFPE is being raised by an idiv instruction that is preceded by cltd a.k.a. cdq: https://www.felixcloutier.com/x86/cwd:cdq:cqo
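What goes wrong in that instruction pair can be modelled in plain C. This is only a sketch of the 32-bit case, and idiv32_faults is a made-up name: the first instruction sign-extends EAX into EDX:EAX, idiv divides that 64-bit value by its 32-bit operand, and the quotient has to fit back into 32-bit EAX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not CPU or VDR code: returns true if a 32-bit
   sign-extend followed by idiv would raise #DE for these operands. */
bool idiv32_faults(int32_t eax, int32_t divisor)
{
    if (divisor == 0)
        return true;                       /* division by zero */

    int64_t edx_eax = eax;                 /* sign-extend EAX into EDX:EAX */
    int64_t quotient = edx_eax / divisor;  /* done in 64 bits, cannot overflow */

    /* idiv has to store the quotient in the 32-bit EAX register */
    return quotient < INT32_MIN || quotient > INT32_MAX;
}
```

With a sign-extended 32-bit dividend, the only nonzero divisor for which this model reports a fault is -1 combined with the dividend -2147483648, which is the case in the test program.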
Assume it is a divide by zero exception?
I don't know if your case involves the idiv instruction, but https://www.felixcloutier.com/x86/idiv mentions that #DE may be raised both on a division by zero and on overflow.
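Whichever of the two causes applies, both can be ruled out before dividing. A minimal sketch in C, assuming signed 32-bit operands; checked_div_i32 is a hypothetical helper and not something that VDR provides:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores dividend / divisor in *result and returns
   true, or returns false without dividing when the division would trap. */
bool checked_div_i32(int32_t dividend, int32_t divisor, int32_t *result)
{
    if (divisor == 0)
        return false;                      /* would be a division by zero */
    if (dividend == INT32_MIN && divisor == -1)
        return false;                      /* quotient 2147483648 overflows */
    *result = dividend / divisor;
    return true;
}
```

A caller could treat a false return as "no usable delta" instead of letting the CPU raise the exception.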
Can you post the output of "disassemble" and "info registers" for the innermost stack frame?
Marko