Hi,
I would like to raise an issue with SVDRP. When the client doesn't send QUIT but simply closes the connection, VDR happily ignores the EOF returned by read. The socket will remain in state CLOSE_WAIT until it's finally closed due to the SVDRP timeout. In the meantime the SVDRP port remains blocked for other clients.
I think the way it was implemented until vdr-1.2.1 was correct. Then a broken pipe problem was reported (http://linvdr.org/mailinglists/vdr/2003/07/msg00254.html) and the implementation of cSVDRP::Process was modified to the way which is still used in 1.4.1.
The way it is implemented now is IMHO broken. So I would suggest to return to the way it was initially implemented or - if someone has an idea where broken pipes might come from - find a better way to fix it. Maybe the broken pipes at that time even had a totally different cause?
Code <= 1.2.1: while (file.Ready(false)) { unsigned char c; int r = safe_read(file, &c, 1); if (r > 0) { ... } else if (r <= 0) { isyslog("lost connection to SVDRP client"); Close(); } }
Code >= 1.2.4 while (file.Ready(false)) { unsigned char c; int r = safe_read(file, &c, 1); if (r > 0) { ... } else if (r < 0) { isyslog("lost connection to SVDRP client"); Close(); } else break; }
Cheers, Frank
Quoting Frank Schmirler vdr@schmirler.de:
Hi,
I would like to raise an issue with SVDRP. When the client doesn't send QUIT but simply closes the connection, VDR happily ignores the EOF returned by read. The socket will remain in state CLOSE_WAIT until it's finally closed due to the SVDRP timeout. In the meantime the SVDRP port remains blocked for other clients.
I think the way it was implemented until vdr-1.2.1 was correct. Then a broken pipe problem was reported (http://linvdr.org/mailinglists/vdr/2003/07/msg00254.html) and the implementation of cSVDRP::Process was modified to the way which is still used in 1.4.1.
The way it is implemented now is IMHO broken. So I would suggest to return to the way it was initially implemented or - if someone has an idea where broken pipes might come from - find a better way to fix it. Maybe the broken pipes at that time even had a totally different cause?
Code <= 1.2.1: while (file.Ready(false)) { unsigned char c; int r = safe_read(file, &c, 1); if (r > 0) { ... } else if (r <= 0) { isyslog("lost connection to SVDRP client"); Close(); } }
read() returning 0 means eof. So above code should be the correct one (tm) . But safe_read() has to be modified, as socket is non-blocking (oldflags |= O_NONBLOCK;), EAGAIN should be handled too:
ssize_t safe_read(int filedes, void *buffer, size_t size) { for (;;) { ssize_t p = read(filedes, buffer, size); if (p < 0) { if (errno == EINTR) { dsyslog("EINTR while reading from file handle %d - retrying", filedes); continue; } if (errno == EAGAIN) { dsyslog("EAGAIN while reading from file handle %d - retrying", filedes); usleep (10000); continue; } } return p; } }
Code >= 1.2.4 while (file.Ready(false)) { unsigned char c; int r = safe_read(file, &c, 1); if (r > 0) { ... } else if (r < 0) { isyslog("lost connection to SVDRP client"); Close(); } else break; }
Stefan Lucke
On Wed, 19 Jul 2006 14:20:26 +0200, Stefan Lucke wrote
But safe_read() has to be modified, as socket is non- blocking (oldflags |= O_NONBLOCK;), EAGAIN should be handled too:
Before safe_read() select() is called through file.Ready(). So EAGAIN should not occur with SVDRP. But of course you're right if safe_read() is intended to be called for nonblocking IO without prior select().
Cheers, Frank
Quoting Frank Schmirler vdr@schmirler.de:
On Wed, 19 Jul 2006 14:20:26 +0200, Stefan Lucke wrote
But safe_read() has to be modified, as socket is non- blocking (oldflags |= O_NONBLOCK;), EAGAIN should be handled too:
Before safe_read() select() is called through file.Ready(). So EAGAIN should not occur with SVDRP. But of course you're right if safe_read() is intended to be called for nonblocking IO without prior select().
But select() on files with O_NONBLOCK is like a noop according to this:
http://man.linuxquestions.org/index.php?query=select&type=2%C2%A7ion=2
Three independent sets of descriptors are watched. Those listed in readfds will be watched to see if characters become available for read- ing (more precisely, to see if a read will not block - in particular, a file descriptor is also ready on end-of-file), those in writefds will be watched to see if a write will not block, and those in exceptfds will be watched for exceptions. On exit, the sets are modified in place to indicate which descriptors actually changed status.
as with O_NONBLOCK a read would never block :-) .
Stefan
On Wed, 19 Jul 2006 16:35:27 +0200, Stefan Lucke wrote
Quoting Frank Schmirler vdr@schmirler.de:
On Wed, 19 Jul 2006 14:20:26 +0200, Stefan Lucke wrote
But safe_read() has to be modified, as socket is non- blocking (oldflags |= O_NONBLOCK;), EAGAIN should be handled too:
Before safe_read() select() is called through file.Ready(). So EAGAIN should not occur with SVDRP. But of course you're right if safe_read() is intended to be called for nonblocking IO without prior select().
But select() on files with O_NONBLOCK is like a noop according to this:
http://man.linuxquestions.org/index.php?query=select&type=2%C2%A7ion=2
Three independent sets of descriptors are watched. Those
listed in readfds will be watched to see if characters become available for read- ing (more precisely, to see if a read will not block - in particular, a file descriptor is also ready on end-of-file), those in writefds will be watched to see if a write will not block, and those in exceptfds will be watched for exceptions. On exit, the sets are modified in place to indicate which descriptors actually changed status.
as with O_NONBLOCK a read would never block :-) .
This is from the posix standard IEEE Std 1003.1 definition of select():
"A descriptor shall be considered ready for reading when a call to an input function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully. (The function might return data, an end-of-file indication, or an error other than one indicating that it is blocked, and in each of these cases the descriptor shall be considered ready for reading.)"
Cheers, Frank
Hi Frank,
Could this be the cause of the watchdog exit I get more or less regularly when my infosat/tvm2vdr.pl scripts are sending status updates via SVDRP ? Afaik could this cause the watchdog to fire when another SVDRP command comes in ?
The script does a little more status messages than the original scripts. SVDRP messages are send around each ~30sec.
However I may give your suggestion a try. I would happily live again with a broken pipe instead of the exit ;-)
regards Peter
P.S. it's quite anoying here as it causes from time to time kernel panic caused by some race condition when reloading the DVB drivers.
vdr-bounces@linuxtv.org schrieb am 19.07.2006 12:59:36:
Hi,
I would like to raise an issue with SVDRP. When the client doesn't send
QUIT
but simply closes the connection, VDR happily ignores the EOF returned
by
read. The socket will remain in state CLOSE_WAIT until it's finally
closed due
to the SVDRP timeout. In the meantime the SVDRP port remains blockedfor
other
clients.
I think the way it was implemented until vdr-1.2.1 was correct. Then a
broken
pipe problem was reported (http://linvdr.org/mailinglists/vdr/2003/07/msg00254.html) and the implementation of cSVDRP::Process was modified to the way which is still
used
in 1.4.1.
On Wed, 19 Jul 2006 15:43:22 +0200, Peter Dittmann wrote
Could this be the cause of the watchdog exit I get more or less regularly when my infosat/tvm2vdr.pl scripts are sending status updates via SVDRP ? Afaik could this cause the watchdog to fire when another SVDRP command comes in ?
The script does a little more status messages than the original scripts. SVDRP messages are send around each ~30sec.
However I may give your suggestion a try. I would happily live again with a broken pipe instead of the exit ;-)
Hi Peter,
I doubt it will change anything for you, but you never know... To find out if really SVDRP itself is responsible, I'd suggest to add some syslog calls to cSVDRP::Process. Check how much time it spends in there.
Cheers, Frank
vdr-bounces@linuxtv.org schrieb am 19.07.2006 18:04:16:
On Wed, 19 Jul 2006 15:43:22 +0200, Peter Dittmann wrote
Could this be the cause of the watchdog exit I get more or less regularly when my infosat/tvm2vdr.pl scripts are sending status updates via SVDRP ? Afaik could this cause the watchdog to fire when another SVDRP command comes in ?
The script does a little more status messages than the original scripts. SVDRP messages are send around each ~30sec.
However I may give your suggestion a try. I would happily live again with a broken pipe instead of the exit ;-)
Hi Peter,
I doubt it will change anything for you, but you never know... To find
out if
really SVDRP itself is responsible, I'd suggest to add some syslog calls
to
cSVDRP::Process. Check how much time it spends in there.
The log already shows it's triggered by SVDRP. SVDRP command is send but not received correctly. VDR did not log here that he received the MESG command but the log tells the SVDRP connection was opened. 30sec later (at a watchdog timout of 30) I get the emergency exit.
Here a small log:
Jul 13 04:00:01 vdr /USR/SBIN/CRON[4117]: (root) CMD ( /usr/bin/infosat.sh) Jul 13 04:00:01 vdr vdr[1401]: connect from 127.0.0.1, port 32770 - accepted Jul 13 04:00:01 vdr vdr[1401]: SVDRP message: 'EPG Update started' Jul 13 04:00:01 vdr vdr[1401]: info: EPG Update started Jul 13 04:00:01 vdr vdr[1401]: closing SVDRP connection Jul 13 04:00:01 vdr vdr[1401]: connect from 127.0.0.1, port 32771 - accepted Jul 13 04:00:01 vdr vdr[1401]: switching to channel 5633 Jul 13 04:00:01 vdr vdr[2419]: TS buffer on device 1 thread ended (pid=2419, tid=213006) Jul 13 04:00:01 vdr vdr[2418]: buffer stats: 11092 (0%) used Jul 13 04:00:01 vdr vdr[2418]: receiver on device 1 thread ended (pid=2418, tid=196621) Jul 13 04:00:01 vdr vdr[1401]: buffer stats: 0 (0%) used Jul 13 04:00:02 vdr vdr[1401]: closing SVDRP connection Jul 13 04:00:07 vdr vdr[1401]: ERROR: can't open XPM file '/var/lib/vdrdevel/plugins/skinelchi/logos/RADIOROPA-BERLIN.xpm' Jul 13 04:00:08 vdr vdr[1411]: changing portal name of channel 511 from '' to 'Info Tafel' Jul 13 04:00:08 vdr vdr[1411]: linking channel 511 from none to 5001 5634 5162 5163 5758 Jul 13 04:00:08 vdr vdr[1411]: changing portal name of channel 509 from '' to 'Direkt Portal' Jul 13 04:00:08 vdr vdr[1411]: changing name of channel 5005 from 'D 1 - 20:30,;' to 'D 1 - 02:30,;' Jul 13 04:00:08 vdr vdr[1411]: changing name of channel 5616 from 'D 1 - 21:00,;' to 'D 1 - 03:00,;' Jul 13 04:00:08 vdr vdr[1411]: changing name of channel 5635 from 'D 2 - 20:30,;' to 'D 2 - 03:15,;' Jul 13 04:00:08 vdr vdr[1411]: changing name of channel 5008 from 'D 3 - 20:30,;' to 'D 3 - 03:15,;' Jul 13 04:00:08 vdr vdr[1411]: changing name of channel 5010 from 'D 4 - 20:30,;' to 'D 4 - 03:15,;' Jul 13 04:00:08 vdr vdr[1411]: linking channel 509 from none to 5005 5616 5635 5008 5010 Jul 13 04:01:02 vdr vdr[1401]: connect from 127.0.0.1, port 32772 - accepted Jul 13 04:01:02 vdr vdr[1401]: SVDRP message: 'Infosat:Received 0 of 1500 data blocks [ 0.00%]'
this is normal flow, connect and report SVDRP message
Jul 13 04:01:02 vdr vdr[1401]: info: Infosat:Received 0 of 1500 data blocks [ 0.00%] Jul 13 04:01:02 vdr vdr[1401]: closing SVDRP connection Jul 13 04:01:02 vdr vdr[1401]: connect from 127.0.0.1, port 32773 - accepted
This should be an SVDRP CHAN 1
Jul 13 04:01:02 vdr vdr[1401]: switching to channel 1 Jul 13 04:01:02 vdr vdr[1401]: graphlcd plugin: logo /var/lib/vdrdevel/plugins/graphlcd/logos/DAS ERSTE_l.glcd loaded. delay 2 Jul 13 04:01:02 vdr vdr[1401]: creating directory /var/cache/vdr/vtx/S19.2E-1-1101-28106 Jul 13 04:01:02 vdr vdr[4460]: receiver on device 1 thread started (pid=4460, tid=245773) Jul 13 04:01:02 vdr vdr[4461]: TS buffer on device 1 thread started (pid=4461, tid=262158) Jul 13 04:01:02 vdr vdr[1401]: closing SVDRP connection
I'm personally missing here a second SVDRP HITK POWER
Jul 13 04:01:57 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:02:20 vdr vdr[1411]: changing name of channel 1041 from 'NDR 90,3;ARD NDR' to 'NDR 90,3,;ARD NDR' Jul 13 04:02:20 vdr vdr[1411]: changing name of channel 1053 from 'radioBERLIN 88,8;ARD rbb' to 'radioBERLIN 88,8,;ARD rbb' Jul 13 04:02:56 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:03:57 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:04:56 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:05:01 vdr /USR/SBIN/CRON[5629]: (root) CMD ( run-parts --report /etc/cron.hourly) Jul 13 04:05:58 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:06:56 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:07:58 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:08:57 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:09:46 vdr vdr[1401]: connect from 127.0.0.1, port 32774 - accepted
this is possibly VDRADMIN
Jul 13 04:09:57 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:10:01 vdr /USR/SBIN/CRON[7221]: (root) CMD (test -x /usr/sbin/anacron || run-parts --report /etc/cron.daily) Jul 13 04:10:05 vdr vdr[1401]: max. latency time 19 seconds Jul 13 04:10:27 vdr vdr[1401]: closing SVDRP connection Jul 13 04:10:55 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:11:57 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:12:07 vdr vdr[1411]: changing pids of channel 5015 from 160+160:80=fra:0 to 160+160:80=fra,81=eng:32 Jul 13 04:12:07 vdr vdr[1411]: changing caids of channel 5015 from 500,100 to 100 Jul 13 04:12:56 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:13:58 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:14:43 vdr vdr[1411]: changing caids of channel 531 from 100 to 0 Jul 13 04:14:56 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:15:56 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:16:58 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:17:56 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:18:55 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204 Jul 13 04:19:58 vdr vdr[1408]: changing pids of channel 5013 from 901+901:902:204 to 701+701:702:204 Jul 13 04:20:29 vdr vdr[1401]: connect from 127.0.0.1, port 32775 - accepted
next connect
Jul 13 04:20:58 vdr vdr[1408]: changing pids of channel 5013 from 701+701:702:204 to 901+901:902:204
60sec later we get the PANIC
Jul 13 04:21:31 vdr vdr[1401]: PANIC: watchdog timer expired - exiting! Jul 13 04:21:31 vdr lircd 0.7.1pre2[840]: removed client Jul 13 04:21:31 vdr vdr[2416]: KBD remote control thread ended (pid=2416, tid=163851) Jul 13 04:21:32 vdr vdr[2405]: graphlcd plugin: Display update thread ended (pid=2405) Jul 13 04:21:32 vdr vdr[4461]: TS buffer on device 1 thread ended (pid=4461, tid=262158) Jul 13 04:21:32 vdr vdr[4460]: buffer stats: 10152 (0%) used Jul 13 04:21:32 vdr vdr[4460]: receiver on device 1 thread ended (pid=4460, tid=245773) Jul 13 04:21:32 vdr vdr[1401]: buffer stats: 0 (0%) used Jul 13 04:21:32 vdr runvdrdevel: restarting VDR Jul 13 04:21:37 vdr kernel: saa7146: unregister extension 'dvb'.
I't not realy clear what kills VDR. But it seems to be triggered by SVDRP.
P.S. It's a vdr 1.3.24, the scripts are placing a high stress as they never exit SVDRP correctly. Besides I still see a lot of broken pipes for SVDRP too, mainly comming from VDRAdmin.
kind regards Peter
On Thu, 20 Jul 2006 13:11:23 +0200, Peter Dittmann wrote
vdr-bounces@linuxtv.org schrieb am 19.07.2006 18:04:16:
On Wed, 19 Jul 2006 15:43:22 +0200, Peter Dittmann wrote
Could this be the cause of the watchdog exit I get more or less regularly when my infosat/tvm2vdr.pl scripts are sending status updates via SVDRP ? Afaik could this cause the watchdog to fire when another SVDRP command comes in ?
The script does a little more status messages than the original scripts. SVDRP messages are send around each ~30sec.
However I may give your suggestion a try. I would happily live again with a broken pipe instead of the exit ;-)
Hi Peter,
I doubt it will change anything for you, but you never know... To find
out if
really SVDRP itself is responsible, I'd suggest to add some syslog calls
to
cSVDRP::Process. Check how much time it spends in there.
The log already shows it's triggered by SVDRP. SVDRP command is send but not received correctly. VDR did not log here that he received the MESG command but the log tells the SVDRP connection was opened. 30sec later (at a watchdog timout of 30) I get the emergency exit.
...
I't not realy clear what kills VDR. But it seems to be triggered by SVDRP.
P.S. It's a vdr 1.3.24, the scripts are placing a high stress as they never exit SVDRP correctly. Besides I still see a lot of broken pipes for SVDRP too, mainly comming from VDRAdmin.
Well, it's not very hard to trigger the watchdog with SVDRP ;-) All you need to do is feed input fast enough so that it never leaves the loop in cSVDRP::Process.
In your log everything looks fine in respect of connect and disconnect until the watchdog fires. Would be interesting to see which command triggers the watchdog. Could it be an EPG upload with PUTE? If there's enough data to keep SVDRP busy this would explain it.
Try to find out which section of which script (tvm2vdr.pl?) is run when the watchdog expires. A sleep() every now and then in the loop which sends the actual data might solve your problem.
BTW: Did you ever try it with the watchdog disabled?
Good luck, Frank
On Wed, 19 Jul 2006 12:59:36 +0200, Frank Schmirler wrote
I would like to raise an issue with SVDRP. When the client doesn't send QUIT but simply closes the connection, VDR happily ignores the EOF returned by read. The socket will remain in state CLOSE_WAIT until it's finally closed due to the SVDRP timeout. In the meantime the SVDRP port remains blocked for other clients.
An update on this issue: with the implementation used before vdr-1.2.2 you *will* get a broken pipe message, as cSVDRP::Close() tries to send a goodby message to the client. This is not a problem, but as a "clean" solution I would suggest:
--- svdrp.c.orig 2006-07-20 11:34:10.000000000 +0200 +++ svdrp.c 2006-07-20 11:39:51.000000000 +0200 @@ -1575,8 +1575,11 @@ isyslog("lost connection to SVDRP client"); Close(); } - else - break; + else { + isyslog("SVDRP client closed connection"); + file.Close(); + DELETENULL(PUTEhandler); + } } if (Setup.SVDRPTimeout && time(NULL) - lastActivity > Setup.SVDRP Timeout) { isyslog("timeout on SVDRP connection");
Cheers, Frank
vdr-bounces@linuxtv.org schrieb am 20.07.2006 12:03:09:
On Wed, 19 Jul 2006 12:59:36 +0200, Frank Schmirler wrote
I would like to raise an issue with SVDRP. When the client doesn't send QUIT but simply closes the connection, VDR happily ignores the EOF returned by read. The socket will remain in state CLOSE_WAIT until it's finally closed due to the SVDRP timeout. In the meantime the SVDRP port remains blocked for other clients.
An update on this issue: with the implementation used before vdr-1.2.2
you
*will* get a broken pipe message, as cSVDRP::Close() tries to send a
goodby
message to the client. This is not a problem, but as a "clean" solution
I
would suggest:
--- svdrp.c.orig 2006-07-20 11:34:10.000000000 +0200 +++ svdrp.c 2006-07-20 11:39:51.000000000 +0200 @@ -1575,8 +1575,11 @@ isyslog("lost connection to SVDRP client"); Close(); }
else
break;
else {
isyslog("SVDRP client closed connection");
file.Close();
DELETENULL(PUTEhandler);
} } if (Setup.SVDRPTimeout && time(NULL) - lastActivity > Setup.SVDRP
Timeout) { isyslog("timeout on SVDRP connection");
I will try this workarounds by backporting it to my productive system.
Had another kernel crash this morning with stuck SVDRP:
Jul 24 04:09:03 vdr vdr[1399]: closing SVDRP connection Jul 24 04:09:44 vdr vdr[1399]: connect from 127.0.0.1, port 32781 - accepted Jul 24 04:10:01 vdr /USR/SBIN/CRON[7313]: (root) CMD (test -x /usr/sbin/anacron || run-parts --report /etc/cr Jul 24 04:10:08 vdr vdr[1399]: max. latency time 23 seconds Jul 24 04:10:28 vdr vdr[1399]: timer 45 (55 2050-2110 'kinder~Bernd~BRAVO BERND~der etwas andere Sandmann') a
this shurely is VDRAdmin adding timers looks like it takes 54 second here =0
Jul 24 04:10:38 vdr vdr[1399]: closing SVDRP connection Jul 24 04:10:38 vdr vdr[1399]: connect from 127.0.0.1, port 32782 - accepted Jul 24 04:10:38 vdr vdr[1399]: ERROR (svdrp.c,365): Broken pipe Jul 24 04:11:03 vdr vdr[1399]: connect from 127.0.0.1, port 32784 - accepted Jul 24 04:11:03 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1415 of 1500 data blocks [ 94.33%]' Jul 24 04:11:03 vdr vdr[1399]: info: Infosat:Received 1415 of 1500 data blocks [ 94.33%] Jul 24 04:11:04 vdr vdr[1399]: closing SVDRP connection Jul 24 04:12:03 vdr vdr[1399]: connect from 127.0.0.1, port 32785 - accepted Jul 24 04:12:03 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1426 of 1500 data blocks [ 95.07%]' Jul 24 04:12:03 vdr vdr[1399]: info: Infosat:Received 1426 of 1500 data blocks [ 95.07%] Jul 24 04:12:03 vdr vdr[1399]: closing SVDRP connection Jul 24 04:13:04 vdr vdr[1399]: connect from 127.0.0.1, port 32786 - accepted Jul 24 04:13:04 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1440 of 1500 data blocks [ 96.00%]' Jul 24 04:13:04 vdr vdr[1399]: info: Infosat:Received 1440 of 1500 data blocks [ 96.00%] Jul 24 04:13:04 vdr vdr[1399]: closing SVDRP connection Jul 24 04:14:04 vdr vdr[1399]: connect from 127.0.0.1, port 32787 - accepted Jul 24 04:14:04 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1449 of 1500 data blocks [ 96.60%]' Jul 24 04:14:04 vdr vdr[1399]: info: Infosat:Received 1449 of 1500 data blocks [ 96.60%] Jul 24 04:14:04 vdr vdr[1399]: closing SVDRP connection Jul 24 04:14:47 vdr vdr[1410]: changing caids of channel 531 from 100 to 0 Jul 24 04:15:04 vdr vdr[1399]: connect from 127.0.0.1, port 32788 - accepted Jul 24 04:15:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1464 of 1500 data blocks [ 97.60%]' Jul 24 04:15:05 vdr vdr[1399]: info: Infosat:Received 1464 of 1500 data blocks [ 97.60%] Jul 24 04:15:05 vdr vdr[1399]: closing SVDRP connection Jul 24 04:16:04 vdr vdr[1399]: connect from 127.0.0.1, port 32789 - accepted Jul 24 04:16:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1472 of 1500 data blocks [ 98.13%]' Jul 24 04:16:05 vdr vdr[1399]: info: Infosat:Received 1472 of 1500 data blocks [ 98.13%] Jul 24 04:16:05 vdr vdr[1399]: closing SVDRP connection Jul 24 04:17:04 vdr vdr[1399]: connect from 127.0.0.1, port 32790 - accepted Jul 24 04:17:04 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1476 of 1500 data blocks [ 98.40%]' Jul 24 04:17:04 vdr vdr[1399]: info: Infosat:Received 1476 of 1500 data blocks [ 98.40%] Jul 24 04:17:04 vdr vdr[1399]: closing SVDRP connection Jul 24 04:18:05 vdr vdr[1399]: connect from 127.0.0.1, port 32791 - accepted Jul 24 04:18:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1481 of 1500 data blocks [ 98.73%]' Jul 24 04:18:05 vdr vdr[1399]: info: Infosat:Received 1481 of 1500 data blocks [ 98.73%] Jul 24 04:18:05 vdr vdr[1399]: closing SVDRP connection Jul 24 04:19:05 vdr vdr[1399]: connect from 127.0.0.1, port 32792 - accepted Jul 24 04:19:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1490 of 1500 data blocks [ 99.33%]' Jul 24 04:19:05 vdr vdr[1399]: info: Infosat:Received 1490 of 1500 data blocks [ 99.33%] Jul 24 04:19:05 vdr vdr[1399]: closing SVDRP connection Jul 24 04:20:05 vdr vdr[1399]: connect from 127.0.0.1, port 32793 - accepted Jul 24 04:20:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1497 of 1500 data blocks [ 99.80%]' Jul 24 04:20:05 vdr vdr[1399]: info: Infosat:Received 1497 of 1500 data blocks [ 99.80%] Jul 24 04:20:05 vdr vdr[1399]: closing SVDRP connection
this seems to be VDRAdin as it's not the same 60sec raster as the
messages from infosatepg
Jul 24 04:20:41 vdr vdr[1399]: connect from 127.0.0.1, port 32794 - accepted Jul 24 04:21:42 vdr vdr[1399]: PANIC: watchdog timer expired - exiting! Jul 24 04:21:43 vdr lircd 0.7.1pre2[839]: removed client Jul 24 04:21:43 vdr vdr[2415]: KBD remote control thread ended (pid=2415, tid=163851) Jul 24 04:21:43 vdr vdr[2404]: graphlcd plugin: Display update thread ended (pid=2404) Jul 24 04:21:43 vdr runvdrdevel: restarting VDR
Watchdog is 60sec and SVDRP timeout is 30 sec. VDRAdmin is 0.97-am3.3-2, vdr is 1.3.24 ctvdr-multipatch (please no request to test vanilla, this is the productive system and the crash comes every other month, no way I'm living with vanilla vdr this long ;)) )
Infosat should not do anything other than messages at this time.
Crash comes after restarting with an killed interrupt handler and infosat being blaimed. As a crash woraround I now happily kill infosatepg in my runvdr. That should keep me at least from crashing.
regards Peter
On Mon, 24 Jul 2006 09:58:00 +0200, Peter Dittmann wrote
Had another kernel crash this morning with stuck SVDRP:
(...)
Jul 24 04:19:05 vdr vdr[1399]: connect from 127.0.0.1, port 32792 - accepted Jul 24 04:19:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1490 of 1500 data blocks [ 99.33%]' Jul 24 04:19:05 vdr vdr[1399]: info: Infosat:Received 1490 of 1500 data blocks [ 99.33%] Jul 24 04:19:05 vdr vdr[1399]: closing SVDRP connection Jul 24 04:20:05 vdr vdr[1399]: connect from 127.0.0.1, port 32793 - accepted Jul 24 04:20:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1497 of 1500 data blocks [ 99.80%]' Jul 24 04:20:05 vdr vdr[1399]: info: Infosat:Received 1497 of 1500 data blocks [ 99.80%] Jul 24 04:20:05 vdr vdr[1399]: closing SVDRP connection
this seems to be VDRAdin as it's not the same 60sec raster as the
messages from infosatepg
Jul 24 04:20:41 vdr vdr[1399]: connect from 127.0.0.1, port 32794 - accepted
Jul 24 04:21:42 vdr vdr[1399]: PANIC: watchdog timer expired - exiting!
I don't know infosatepg, but as far as I've read, it first downloads the data via DVB-S and then tvmovie2vdr uploads it to SVDRP. The whole process is controled by infosatepg.sh.
According to the log, infosatepg receives about 5-10 packets per minute, so it is most likely finished at Jul 24 04:20:41. At that time the SVDRP upload starts which is simply too fast.
Again: try to find out what's really happening when the watchdog is triggered. Use a sniffer, add some debug lines to vdr - whatever.
Good luck, Frank
------------------------------------------------------------------------------------ Peter Dittmann APM-DS Philips Technologie GmbH Automotive Playback Modules Wetzlar / GERMANY Tel: +49 (0) 6441 3722 759 Fax: +49 (0) 6441 3722 566 E-mail: : peter.dittmann@philips.com ------------ http:\www.apm.philips.com ---------------------------------------------------------------
vdr-bounces@linuxtv.org schrieb am 24.07.2006 17:51:55:
On Mon, 24 Jul 2006 09:58:00 +0200, Peter Dittmann wrote
Had another kernel crash this morning with stuck SVDRP:
(...)
Jul 24 04:19:05 vdr vdr[1399]: connect from 127.0.0.1, port 32792 -
accepted
Jul 24 04:19:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received
1490 of 1500
data blocks [ 99.33%]' Jul 24 04:19:05 vdr vdr[1399]: info: Infosat:Received 1490 of 1500
data
blocks [ 99.33%] Jul 24 04:19:05 vdr vdr[1399]: closing SVDRP connection Jul 24 04:20:05 vdr vdr[1399]: connect from 127.0.0.1, port 32793 -
accepted
Jul 24 04:20:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1497 of 1500 data blocks [ 99.80%]' Jul 24 04:20:05 vdr vdr[1399]: info: Infosat:Received 1497 of 1500 data blocks [ 99.80%] Jul 24 04:20:05 vdr vdr[1399]: closing SVDRP connection
this seems to be VDRAdin as it's not the same 60sec raster as the
messages from infosatepg
Jul 24 04:20:41 vdr vdr[1399]: connect from 127.0.0.1, port 32794 - accepted
Jul 24 04:21:42 vdr vdr[1399]: PANIC: watchdog timer expired -
exiting!
I don't know infosatepg, but as far as I've read, it first downloads the
data
via DVB-S and then tvmovie2vdr uploads it to SVDRP. The whole process is controled by infosatepg.sh.
According to the log, infosatepg receives about 5-10 packets per minute,
so it
is most likely finished at Jul 24 04:20:41. At that time the SVDRP
upload
starts which is simply too fast.
Nope, infosatepg is not filling the data linearly. It takes around three runs here to accumulate all blocks as the FF has a high packet error rate. I'm shure that infosatepg would have been happily busy untin 4:35. This is supported by infosatepg still running and being the cause of the kernel crash once VDR has restarted.
Again: try to find out what's really happening when the watchdog is
triggered.
Use a sniffer, add some debug lines to vdr - whatever.
It's a ctvdr productive system, not much possibilities here to not break all plugin packages. However the problem seems to rise from VDRAdmin and the long timeout when adding timers. I had removed VDRAdmin already from the init and run it only during the infosatepg update time. I had strange crashes before. Especially when mp3ng plugin was running and a SVDRP message was comming in. This crashed VDR almost guaranteed. It looks like Morones mp3ng does not like messages completely. But that's different issue.
I will try to backport the SVDRP changes to my productive system and check were this leads.
regards Peter