On Wed, Aug 13, 2008 at 09:09:45PM +0100, Gavin Hamill wrote:
OK, found a suitable recording, and after a couple of 'play 1 begin' to SVDRP it starts OK. The picture is pretty good, but it still shifts around the screen a bit:
that's because PLL encounters very large increments of 'sync point displacement' it must compensate for. We must find the cause where these leaps come from.
BTW: If I start a kernel build in the background I get similar effects:)
In your case it appears to be the Xserver itself consuming huge amounts of CPU resources for some yet unknown reason (see below).
[...]
overall compensation: -11 sync point displacement: -685 <------+ drift speed: 393 | overall compensation: -10 | there is no sync point displacement: 3175 <------+ stability drift speed: 8509 | overall compensation: 402 | sync point displacement: 6186 <------+
[...]
FWIW, the output is not once per second.. there is often a delay of up to 4 seconds before another group of 3 lines is displayed.
strange. The output here gets exactly every second directly to the screen. And also is updated every second through 'tail -F /var/log/Xorg.0.log'. I think that could fit to the '40% Xserver-CPU' phenomenon (see below).
So, I started looking for other reasons. Whilst X + vdr are running, the Xorg process is taking 40% CPU, with vdr taking 25%. The 'system' CPU usage is 32%, with 16% for user processes. I thought maybe it was using X11 output rather than xv, and thus causing a drain on the system...
oh - a very interesting fact. that's different to mine (see my output of top below). Xorg takes only 0.7%(!) CPU on my system. Are there some special patches in ubuntu that causes this?
This appears be the root cause of our problem!
Does the Xserver poll for some resources not available or something? A value of 40% CPU is way too much. The only process consuming some CPU power should be 'vdr' whilst decoding. Most other processes don't have to do much all over the time. We must dig deeper into that '40% Xserver-CPU' phenomenon! DISPLAY environment variable is set to DISPLAY=:0 ?
again a typical Xserver output:
sync point displacement: -26 drift speed: -71 overall compensation: -3 sync point displacement: -31 drift speed: 100 overall compensation: 2 sync point displacement: -25 drift speed: -57 overall compensation: -2 sync point displacement: -23 drift speed: 12 overall compensation: 0 completed sync point displacement: 24 drift speed: 63 overall compensation: 3 sync point displacement: 6 drift speed: -72 overall compensation: -2 sync point displacement: -10 drift speed: -24 overall compensation: -1 completed sync point displacement: 5 drift speed: 60 overall compensation: 2
while at the same time you get these messages in '/var/log/messages'. You see the correction is only floating a little:
kernel: [drm] changed radeon drift trim from 00520101 -> 00520105 kernel: [drm] changed radeon drift trim from 00520105 -> 00520104 kernel: [drm] changed radeon drift trim from 00520104 -> 00520101 kernel: [drm] changed radeon drift trim from 00520101 -> 00520103 kernel: [drm] changed radeon drift trim from 00520103 -> 00520101 kernel: [drm] changed radeon drift trim from 00520101 -> 00520104 kernel: [drm] changed radeon drift trim from 00520104 -> 00520102 kernel: [drm] changed radeon drift trim from 00520102 -> 00520101 kernel: [drm] changed radeon drift trim from 00520101 -> 00520103
at the same time I get following values through 'vmstat 1':
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 349304 11228 107228 0 0 1788 0 286 777 22 1 77 0 0 0 0 349296 11228 107216 0 0 0 0 294 787 22 0 78 0 0 0 0 347364 11232 109012 0 0 1804 0 299 780 22 2 76 0 0 0 0 347364 11232 109024 0 0 0 0 281 767 23 1 76 0 0 0 0 345572 11232 110824 0 0 1796 0 300 782 24 0 76 0 0 0 0 345564 11240 110820 0 0 0 72 295 799 24 1 75 0 0 0 0 343896 11240 112596 0 0 1800 0 294 781 24 1 75 0 0 0 0 343896 11240 112596 0 0 0 0 287 776 25 0 75 0 0 0 0 342104 11240 114396 0 0 1808 0 293 781 24 2 74 0 0 0 0 342096 11240 114404 0 0 0 20 291 780 26 1 73 0 0 0 0 340304 11248 116196 0 0 1800 56 307 779 25 1 74 0 0 0 0 340296 11248 116204 0 0 0 0 281 768 24 2 74 0 0 0 0 338504 11248 118004 0 0 1788 0 285 764 21 4 75 0 0 0 0 338512 11248 117992 0 0 0 0 283 745 27 0 73 0 0 0 0 344776 11248 111580 0 0 1788 4 300 775 23 2 75 0
and top:
top - 10:48:13 up 1:33, 8 users, load average: 0.22, 0.09, 0.02 Tasks: 58 total, 2 running, 56 sleeping, 0 stopped, 0 zombie Cpu(s): 1.0%us, 0.3%sy, 24.3%ni, 74.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 516368k total, 173356k used, 343012k free, 11416k buffers Swap: 3903784k total, 0k used, 3903784k free, 113200k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 8858 root 20 0 149m 25m 14m S 16.0 5.0 0:31.03 vdr 8798 root 20 0 294m 15m 12m S 0.7 3.1 0:01.54 Xorg 8894 root 20 0 2316 1096 872 R 0.7 0.2 0:00.28 top 1 root 20 0 2028 708 604 S 0.0 0.1 0:01.26 init 2 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 kthreadd 3 root RT -5 0 0 0 S 0.0 0.0 0:00.00 migration/0 4 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
You see Xorg is almost not noticable on my system!
Can you strace the Xserver? Maybe you can try Debian experimental packages like I do? Don't the run on Ubuntu as well?
If it would help you I can offer you to make a copy of my entire development system (about 800MB as compressed tar image). It's based on actual Debian lenny. This way you would have a system running instantly as expected.
From there you could try to stepwise activate your additional components.
And you have a chance to see which components causes the failure.
Cheers Thomas