Hi,
I'm facing a deadlock when the code below is modified to ignore the
"r == 0" cases (which is what the original code in vdr-xine-0.7.2 does):

int cXineLib::xread(int f, void *b, int n)
{
  bool atEOF = false;
  int t = 0;
  while (t < n)
  {
    void (* const sigPipeHandler)(int) = ::signal(SIGPIPE, SIG_IGN);
    errno = 0;
    int r = ::read(f, ((char *)b) + t, n - t);
    int myErrno = errno;
    ::signal(SIGPIPE, sigPipeHandler);
    if (r < 0
        || (r == 0 && atEOF))
    {
      if (EAGAIN == myErrno)
        continue;
      fprintf(stderr, "lib::read(%d) failed (atEOF: %d) %d: ", n,
              atEOF, myErrno);
      errno = myErrno;
      perror("");
      disconnect();
      return r;
    }
    else if (r == 0)
    {
      cPoller Poller(f);
      atEOF = Poller.Poll(0);
      fprintf(stderr, "--- lib read 0, atEOF %d\n", atEOF);
    }
    else
      atEOF = false;
    t += r;
  }
  return t;
}

Some more information:
- File descriptor "f" represents a FIFO, which is opened for reading and
should be in blocking mode by design.
- As this function is called by different threads (synchronized outside
via a mutex), I save and restore the handler for signal SIGPIPE, as I
don't want any of VDR's signal handlers to be triggered.
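
The save/restore of the SIGPIPE handler could also be packaged as a small scope guard. This is only a minimal sketch of that idea, not code from vdr-xine: the names ScopedSigPipeIgnore and currentSigPipeHandler are made up for illustration, and it uses sigaction() instead of signal(). Signal dispositions are process-wide, so like the code above it still relies on the outside mutex.

```cpp
#include <signal.h>

typedef void (*SigHandler)(int);

// Hypothetical RAII guard (not part of vdr-xine): ignores SIGPIPE for the
// lifetime of the object and restores the previous disposition afterwards.
class ScopedSigPipeIgnore
{
public:
  ScopedSigPipeIgnore()
    : m_previous(), m_valid(false)
  {
    struct sigaction ignore = {};
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    // Remember the old disposition so the destructor can put it back.
    m_valid = (::sigaction(SIGPIPE, &ignore, &m_previous) == 0);
  }

  ~ScopedSigPipeIgnore()
  {
    if (m_valid)
      ::sigaction(SIGPIPE, &m_previous, nullptr);
  }

  ScopedSigPipeIgnore(const ScopedSigPipeIgnore &) = delete;
  ScopedSigPipeIgnore &operator=(const ScopedSigPipeIgnore &) = delete;

private:
  struct sigaction m_previous;
  bool m_valid;
};

// Inspection helper for the example: returns the handler currently
// installed for SIGPIPE without changing it.
SigHandler currentSigPipeHandler()
{
  struct sigaction current = {};
  ::sigaction(SIGPIPE, nullptr, &current);
  return current.sa_handler;
}
```

Inside the loop, a block like "{ ScopedSigPipeIgnore guard; r = ::read(...); }" would then replace the two ::signal() calls.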
The read() should block until data is available and typically return the
number of bytes read, which should be greater than zero. But there are
some cases where read() returns zero:
a) the writing side of the FIFO was closed.
b) a signal caused the block to break.
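
Case a) can be reproduced in isolation. The following is a minimal sketch that uses an anonymous pipe in place of the real FIFO (an assumption made to keep the example self-contained). The function name is made up for this example, and only case a) is exercised.

```cpp
#include <unistd.h>

// Hypothetical demo: returns what read() reports once the only writer of
// a pipe has closed its end and all buffered data has been consumed.
// Returns -2 if the setup itself fails.
int readResultAfterWriterClosed()
{
  int fds[2];
  if (::pipe(fds) != 0)
    return -2;

  const char msg[] = "abc";
  ssize_t written = ::write(fds[1], msg, 3);
  ::close(fds[1]);                       // the writing side goes away

  char buf[8];
  ssize_t first = ::read(fds[0], buf, sizeof(buf));   // drains "abc"
  ssize_t second = ::read(fds[0], buf, sizeof(buf));  // nothing left, no writer
  ::close(fds[0]);

  if (written != 3 || first != 3)
    return -2;
  return static_cast<int>(second);
}
```

The second read() does not block because no writer is left, and it reports end-of-file by returning 0.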
The original code simply ignores the result of zero, as for case a) a
different function (xwrite), which should be called by a different
thread, should see a SIGPIPE and initiate the disconnect(). Case b)
should simply go on with reading the remaining data. But this leads to a
deadlock situation where the read() never returns anything != 0 and
therefore the original loop spins forever. This is most likely to
trigger if you move cutting marks (on my machine it only triggers when
xine uses -V xshm and when moving cutting marks in HDTV recordings).
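
To make the spinning visible, here is a minimal sketch of a loop that ignores zero returns. It is a hypothetical reconstruction for illustration only, not the actual vdr-xine-0.7.2 code: the function name and the maxSpins cap are my additions so that the example terminates.

```cpp
#include <unistd.h>
#include <cerrno>

// Hypothetical sketch: tries to read n bytes from fd while ignoring zero
// returns, and counts how often read() returned 0. Gives up after
// maxSpins zero returns (the cap does not exist in the original loop).
// Returns the number of zero returns seen, or -1 on a read error.
int countZeroReads(int fd, void *b, int n, int maxSpins)
{
  int t = 0;
  int spins = 0;
  while (t < n && spins < maxSpins)
  {
    ssize_t r = ::read(fd, static_cast<char *>(b) + t, n - t);
    if (r < 0)
    {
      if (errno == EAGAIN || errno == EINTR)
        continue;
      return -1;
    }
    if (r == 0)
    {
      ++spins;          // zero is ignored, the loop simply tries again
      continue;
    }
    t += static_cast<int>(r);
  }
  return spins;
}
```

If the writer has closed its end after sending fewer than n bytes, every further read() returns 0 immediately, so without the cap the loop would never leave.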
The new code above tries to detect case a) by asking a cPoller whether
data is available on file descriptor f after the read returned 0.
Let's assume Poll() returns true because data is available: atEOF
is set pessimistically, but the next read() should return a value > 0,
which resets atEOF. The loop continues.
When Poll() returns false because no further data is available, the
next read() should block. The loop continues.
But Poll() might also return true in an error condition (e.g. the
writing side of the FIFO was closed). Then the next read() is expected
to return a value <= 0. The loop terminates and a disconnect() happens.
The strange thing now is that a disconnect() occasionally happens when
moving cutting marks.
Any help appreciated! Thanks!
Bye.
--
Dipl.-Inform. (FH) Reinhard Nissl
mailto:rnissl@gmx.de