vdr uses quite a lot of memory; RSS here used to be ~70M, so I decided to see if it could be reduced a bit. The first suspects were the ring buffers, because I was seeing a lot of log lines such as "buffer stats: 96068 (1%) used". It turns out many buffers are way oversized: typically only a few percent of the memory is used, yet because of how a ring buffer works _all_ of it is constantly accessed. I first tried to make the ring buffer users smarter about picking the size; e.g. after changing the transfer buffer to start out small and switch to bigger sizes only when the current one isn't large enough, it turned out a more reasonable size for that vdr instance was 384k instead of 2M. Similarly, the TS buffer could be just a few hundred K instead of 2M.
But that approach was not ideal, as there isn't much one can do after realizing a buffer is too small; it could be grown the next time it is created, e.g. after a channel switch, but that's it. Also, some of the huge buffers came from plugins, and I wasn't all that excited about fixing all of them (one example with a one-digit % fill was streamdev-server). And it is hard to estimate what the proper size would be under load, with many simultaneous recordings. So I attacked the issue differently -- by making the ring buffers themselves smarter.
The patch below changes the interpretation of the 'Size' parameter given when creating a buffer from a fixed length to a maximum limit. The buffers start out much smaller (currently 128k, instead of many megabytes) and grow automatically when they become almost full. This way all callers benefit without having to be modified at all.
The buffers are actually still allocated at the maximum size, but only the necessary part is used; the rest of the memory is never touched, so on a demand-paged system it never becomes resident. It's done this way so that no extra locking is necessary: the buffer pointer never changes, only the logical size does.
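Here is a minimal, self-contained sketch of the idea (not the actual VDR code; the class name and members are made up for illustration): the full maximum is malloc()ed up front, but the read/write indices only ever wrap at the current logical size, so the pages beyond it are never written.

#include <cstdlib>
#include <algorithm>

class cGrowingRingDemo {
private:
  unsigned char *buffer;
  int size;     // current logical size; head/tail wrap at this offset
  int maxsize;  // allocation size = upper limit for growing
  int head, tail;
public:
  cGrowingRingDemo(int MaxSize, int StartSize)
  {
    maxsize = MaxSize;
    size = std::min(StartSize, MaxSize);
    buffer = (unsigned char *)malloc(MaxSize); // only the first 'size' bytes will ever be touched
    head = tail = 0;
  }
  ~cGrowingRingDemo() { free(buffer); }
  int Available(void) const
  {
    int diff = head - tail;
    return (diff >= 0) ? diff : size + diff;
  }
  void TryGrow(void)
  {
    // Growing is just raising 'size'. It is only safe while the stored data does
    // not wrap around the end of the logical buffer (head >= tail), because the
    // reader must keep wrapping at the same point the writer wrapped at.
    if (head >= tail && size < maxsize)
       size = std::min(size + size / 2, maxsize);
  }
};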
I've tested this on two vdrs for a few hours and it seems to work, but it needs a lot more testing. There is some extra logging in the patch so that you can see what's going on: every time a ring buffer gets deleted, it logs a line which tells you how much memory was requested and how much was actually needed.
vdr: [19729] Deleting ring buffer "TS" size: 196608 / 2097152 (used 9.375% of requested size)
vdr: [19718] Deleting ring buffer "Result" size: 262144 / 262144 (used 100% of requested size)
vdr: [19718] Deleting ring buffer "Recorder" size: 442368 / 5242880 (used 8.4375% of requested size)
vdr: [19718] Deleting ring buffer "Result" size: 262144 / 262144 (used 100% of requested size)
vdr: [19718] Deleting ring buffer "Recorder" size: 131072 / 5242880 (used 2.5% of requested size)
vdr: [19733] Deleting ring buffer "TS" size: 131072 / 2097152 (used 6.25% of requested size)
vdr: [19718] Deleting ring buffer "Result" size: 262144 / 262144 (used 100% of requested size)
vdr: [19718] Deleting ring buffer "Recorder" size: 442368 / 5242880 (used 8.4375% of requested size)
So each of the several 'Recorder' buffers was an order of magnitude smaller than without this patch (e.g. 442368 instead of 5242880 bytes, i.e. ~4.6M less per buffer), and several megabytes less RAM were used overall.
artur
diff --git a/ringbuffer.c b/ringbuffer.c
index 0633bd3..8dd1efe 100644
--- a/ringbuffer.c
+++ b/ringbuffer.c
@@ -151,13 +151,29 @@ void cRingBufferLinear::PrintDebugRBL(void)
 }
 #endif
 
+// cRingBufferLinear are dynamically sized, or at least we can pretend they are ;)
+// We treat 'Size' as the maximum size, but start with a small buffer, which can
+// grow later as it fills up. Memory is always allocated for the full buffer, but
+// the unused RAM portion remains untouched until (if at all) it is actually needed.
+// Note that we can not start with a larger than requested buffer because there are
+// ring buffer users that cause crashes then (eg softdevice mpegdecoder absolutely
+// needs 32/64k).
+
+// Startup size. 64k still causes overflows before buffer starts to grow, 128k doesn't.
+// The buffers grow to 200..500k anyway, so maybe increasing this a bit more would
+// make sense, but let's first see if it's really needed.
+#define DEFRBSIZE KILOBYTE(128)
+
 cRingBufferLinear::cRingBufferLinear(int Size, int Margin, bool Statistics, const char *Description)
-:cRingBuffer(Size, Statistics)
+:cRingBuffer(min(Size,DEFRBSIZE), Statistics)
 {
   description = Description ? strdup(Description) : NULL;
   tail = head = margin = Margin;
   gotten = 0;
   buffer = NULL;
+  maxsize = Size;
+  growbuf = 0;
+  dsyslog("New ring buffer \"%s\" size: %d margin: %d", description, Size, Margin );
   if (Size > 1) { // 'Size - 1' must not be 0!
      if (Margin <= Size / 2) {
         buffer = MALLOC(uchar, Size);
@@ -183,6 +199,8 @@ cRingBufferLinear::~cRingBufferLinear()
 #ifdef DEBUGRINGBUFFERS
   DelDebugRBL(this);
 #endif
+  dsyslog("Deleting ring buffer \"%s\" size: %d / %d (used %g%% of requested size)", description,
+          size, maxsize, size/(double)maxsize*100.0 );
   free(buffer);
   free(description);
 }
@@ -207,6 +225,13 @@ void cRingBufferLinear::Clear(void)
 
 int cRingBufferLinear::Read(int FileHandle, int Max)
 {
+  if (Available() >= size*2/4)
+     growbuf = 1;
+  if (growbuf && head>=tail && size<maxsize) {
+     size = min(size+size/2, maxsize);
+     dsyslog("Enlarging ring buffer \"%s\": %d bytes (Read())", description, size);
+     growbuf = 0;
+     }
   int Tail = tail;
   int diff = Tail - head;
   int free = (diff > 0) ? diff - 1 : Size() - head;
@@ -236,6 +261,8 @@ int cRingBufferLinear::Read(int FileHandle, int Max)
   lastHead = head;
   lastPut = Count;
 #endif
+  if (Available() >= size*3/4)
+     growbuf = 1;
   EnableGet();
   if (free == 0)
      WaitForPut();
@@ -245,6 +272,13 @@ int cRingBufferLinear::Read(int FileHandle, int Max)
 int cRingBufferLinear::Put(const uchar *Data, int Count)
 {
   if (Count > 0) {
+     if (Available()+Count >= size*2/4)
+        growbuf = 1;
+     if (growbuf && head>=tail && size<maxsize) {
+        size = min(size+size/2, maxsize);
+        dsyslog("Enlarging ring buffer \"%s\": %d bytes", description, size);
+        growbuf = 0;
+        }
      int Tail = tail;
      int rest = Size() - head;
      int diff = Tail - head;
diff --git a/ringbuffer.h b/ringbuffer.h
index dba0e61..f417d57 100644
--- a/ringbuffer.h
+++ b/ringbuffer.h
@@ -18,11 +18,11 @@ private:
   cCondWait readyForPut, readyForGet;
   int putTimeout;
   int getTimeout;
-  int size;
   time_t lastOverflowReport;
   int overflowCount;
   int overflowBytes;
 protected:
+  int size;
   tThreadId getThreadTid;
   int maxFill;//XXX
   int lastPercent;
@@ -59,6 +59,8 @@ private:
   int margin, head, tail;
   int gotten;
   uchar *buffer;
+  int growbuf;
+  int maxsize;
   char *description;
 public:
   cRingBufferLinear(int Size, int Margin = 0, bool Statistics = false, const char *Description = NULL);
I've tested this on two vdrs for a few hours and it seems to work, but it needs a lot more testing.
I did a few test recordings (total duration likely more than 24h) and the new code seems mostly fine; only some tweaking of the sizes and grow heuristics is left.
However, one problem did appear after many hours of simultaneous recordings:
08:17:38 vdr: [24052] buffer usage: 70% (tid=24051)
08:17:39 vdr: [24052] buffer usage: 80% (tid=24051)
08:17:39 vdr: [24052] buffer usage: 90% (tid=24051)
08:17:39 vdr: [24052] buffer usage: 100% (tid=24051)
08:17:39 vdr: [24052] ERROR: 1 ring buffer overflow (61 bytes dropped)
08:17:45 vdr: [24052] ERROR: 9053 ring buffer overflows (1701964 bytes dropped)
08:17:47 vdr: [24052] Enlarging ring buffer "Recorder": 442368 bytes
08:17:47 vdr: [24052] buffer usage: 0% (tid=24051)
Apparently the 'Recorder' consumer stalled for several seconds at a point when growing the ring buffer wasn't possible. To keep Get/Put lockless, resizing can only happen while the buffered data is contiguous (head >= tail); if the consumer stops making progress at the wrong moment, the buffer cannot be enlarged and overflows. In this case the buffer was ~300k and ~2M would have been needed. I can think of a few ways to make the growing more aggressive, but they all make the buffer handling much more complex, so I'd like to avoid that.
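To illustrate the constraint (just a sketch of the two states, not something from the patch):

   growable (data contiguous, head >= tail):

     0        tail            head            size       maxsize
     |.........DDDDDDDDDDDDDDDD.................|...........|
                                 raising 'size' only adds free space after 'head'

   not growable (data wraps, head < tail):

     0      head                 tail          size       maxsize
     |DDDDDD.....................DDDDDDDDDDDDDDD|...........|
                                                ^
              the reader still has to wrap at the old 'size'; raising it now would
              put never-written bytes into the middle of the stream it has yet to read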
Looking at the caller:
void cRecorder::Receive(uchar *Data, int Length)
{
  if (Running()) {
     int p = ringBuffer->Put(Data, Length);
     if (p != Length && Running())
        ringBuffer->ReportOverflow(Length - p);
     }
}
it simply drops any data that does not fit into the buffer, which would be fine for live viewing, but isn't ideal when recording.
Can we try harder not to lose data here? IOW, can this function sleep (and retry)?
(Not that I think doing this alone would be enough; it could help with short stalls, but for the longer ones a much larger buffer size is necessary, and I'll have to find a clean way to handle that anyway. It's kind of hard to test, though, as the overflow only happened once during ~24h of recording...)
artur
It took a while to trigger the condition again, but now it has happened, and trying a bit harder paid off. Running vdr with the following patch resulted in this log (and no overflow):
20:31:35 vdr: [16328] buffer usage: 70% (tid=16327)
20:31:35 vdr: [16328] buffer usage: 80% (tid=16327)
20:31:35 vdr: [16328] buffer usage: 90% (tid=16327)
20:31:35 vdr: [16328] buffer usage: 100% (tid=16327)
20:31:35 vdr: [16327] Enlarging ring buffer "Result": 262144 bytes (trigger 3)
20:31:35 vdr: [16329] Enlarging ring buffer "TS": 262144 bytes (trigger 2)
20:31:35 vdr: [16328] Enlarging ring buffer "Recorder": 262144 bytes (trigger 3)
20:31:35 vdr: [16328] buffer usage: 0% (tid=16327)
20:31:35 vdr: [16328] saved extra 153 bytes in recorder ring buffer after 80 ms delay
artur
diff --git a/recorder.c b/recorder.c
index 8bb1621..3c0e002 100644
--- a/recorder.c
+++ b/recorder.c
@@ -157,8 +157,20 @@ void cRecorder::Receive(uchar *Data, int Length)
 {
   if (Running()) {
      int p = ringBuffer->Put(Data, Length);
-     if (p != Length && Running())
+     if (p != Length && Running()) {
+        for (int ms=20; ms<1000; ms+=ms) {
+            cCondWait::SleepMs(ms);
+            if (!Running())
+               return;
+            int r = ringBuffer->Put(Data+p, Length-p);
+            p += r;
+            if (r)
+               dsyslog("saved extra %d bytes in recorder ring buffer after %d ms delay", r, ms);
+            if (p == Length || !Running())
+               return;
+            }
         ringBuffer->ReportOverflow(Length - p);
+        }
      }
 }
On 05/09/07 20:56, Artur Skawina wrote:
From receiver.h:
  virtual void Receive(uchar *Data, int Length) = 0;
               ///< This function is called from the cDevice we are attached to, and
               ///< delivers one TS packet from the set of PIDs the cReceiver has requested.
               ///< The data packet must be accepted immediately, and the call must return
                                     ****************************                  ******
               ///< as soon as possible, without any unnecessary delay. Each TS packet
                    **************************************************
               ///< will be delivered only ONCE, so the cReceiver must make sure that
               ///< it will be able to buffer the data if necessary.
Klaus
Klaus Schmidinger wrote:
On 05/09/07 20:56, Artur Skawina wrote:
void cRecorder::Receive(uchar *Data, int Length)
Yes, I saw that; that's why I asked if it couldn't sleep after all... And it is why I made it sleep only for short periods -- ASAP is relative ;) Avoiding corrupted recordings is worth some small glitches in live viewing. Now, the max total sleep of 1.26s (20+40+80+160+320+640 ms) is probably too much, but I'll need to run with this for a few weeks to gather some meaningful statistics. So far the overflow has triggered twice in three days: the first time (vanilla Receive()) some data was lost, the second time the "unnecessary" delays rescued a few bytes. Another way to look at this is -- given how rarely this happens, I don't think it's statistically significant ;) There will likely be many more errors due to other factors during those ~36h of recordings. So this change isn't strictly necessary, but I'd like to find out whether the data loss can be avoided, and it lets me monitor that.
The advantage of auto-tuning the buffer sizes is that the maximum size can be increased cheaply, as the unneeded part won't be used at all; I'll do this for the TS buffer. Whether that will be enough to cope with the extra delays I'll probably find out in a few days...
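For example (I'm quoting the call site from memory, so the exact file, line and constant may differ between versions -- treat this as an illustration, not a tested patch), raising the TS buffer's maximum would be a one-line change, and with the auto-sizing it no longer means 8M of resident memory per device:

--- a/dvbdevice.c
+++ b/dvbdevice.c
@@ ... @@ bool cDvbDevice::OpenDvr(void)
-     tsBuffer = new cTSBuffer(fd_dvr, MEGABYTE(2), CardIndex() + 1);
+     tsBuffer = new cTSBuffer(fd_dvr, MEGABYTE(8), CardIndex() + 1);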
artur