The other day I tried to find out whether anybody had ever implemented something to keep vdr from trashing the file system buffers. I did this because I hate waiting 20 seconds for vdr to re-read my recordings list. So I read about the usage of O_DIRECT and the trouble with it. While googling for O_DIRECT I found the function posix_fadvise - which I had never heard of before - and it can do something similar without having such an impact on the current programming scheme.
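For readers who have not met posix_fadvise() either, here is a minimal sketch of the idea in C. It is not taken from the patch; the function name stream_file_uncached() and the 64 KiB chunk size are assumptions made for illustration. It reads a file sequentially and, after each chunk, tells the kernel that the pages just consumed are no longer needed.

```c
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a whole file sequentially and ask the kernel
 * to drop each chunk from the page cache once it has been consumed.
 * Returns the number of bytes read, or -1 if open() or read() failed. */
long long stream_file_uncached(const char *path)
{
    char buf[64 * 1024];            /* chunk size is an arbitrary choice */
    long long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* The advice is only a hint, so a failure here is not fatal. */
        (void)posix_fadvise(fd, (off_t)total, (off_t)n, POSIX_FADV_DONTNEED);
        total += n;
    }

    close(fd);
    return n < 0 ? -1 : total;
}
```

The file is opened and read exactly as before; the only addition is the advice call, which is why this approach is far less intrusive than O_DIRECT.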
Using this function I wrote a little patch for vdr 1.3.30 - if you are interested have a look here:
http://vdr.unetz.com/download/patches/vdr-avoidTrashing-0.2.0-plain-1.3.30.d...
The same patch against enAIO-2.5-rm-a:
http://vdr.unetz.com/download/patches/vdr-avoidTrashing-0.2.0.diff.gz
For sure the whole thing only makes sense when all programs dealing with video streams on a vdr do the same. So I did some first steps patching noad. I'm not finished with it - there seems to be a read() I missed. Anyway - the preliminary patch for noad 0.6.0 can be found here:
http://vdr.unetz.com/download/patches/noad-avoidTrashing-0.0.1.diff.gz
Further candidates for posix_fadvise are mkisofs and the burn-plugin.
Regards Ralf Müller
Ralf Müller rmvdr@bj-ig.de wrote:
For sure the whole thing only makes sense when all programs dealing with video streams on a vdr do the same. So I did some first steps patching noad.
Very interesting. I would suggest that vdr wrap all relevant calls in its own API and make it public to plugins. Additionally, you could write defines for (f)open(), read(), write(), ... and redirect them to your replacements, so that at least plugins that read from the hard disk by themselves would benefit instantly.
On Montag 22 August 2005 20:43, Clemens Kirchgatterer wrote:
Ralf Müller rmvdr@bj-ig.de wrote:
For sure the whole thing only makes sense when all programs dealing with video streams on a vdr do the same. So I did some first steps patching noad.
Very interesting. I would suggest that vdr wrap all relevant calls in its own API and make it public to plugins.
Yes - it would be interesting to have that functionality in core vdr.
Additionally, you could write defines for (f)open(), read(), write(), ... and redirect them to your replacements, so that at least plugins that read from the hard disk by themselves would benefit instantly.
From my point of view I would like to _know_ what I call. That's why I chose the names OpenStream(), ReadStream() ... just to make clear what kind of file is meant. I know that in plain vdr there are only two types of files: tiny ones, which are read nearly atomically and for which it makes nearly no difference whether they are cached or not, and the streaming type of file. The point is the _nearly_ no difference. I came to write this patch because the file system structure and the small info.vdr files were out of the buffer cache every time they were needed. Just reading all the info.vdr files on my machine takes 6 seconds when they are not already buffered - if read() and write() were wrapped by #defines, these files would never go into the buffer cache. That's not what I want.
Ralf
On Mon, Aug 22, 2005 at 11:17:21AM +0200, Ralf Müller wrote:
The other day I tried to find out whether anybody had ever implemented something to keep vdr from trashing the file system buffers. I did this because I hate waiting 20 seconds for vdr to re-read my recordings list.
Thanks, this is very much appreciated. I applied this patch on my vdr-1.3.30 today, and I hope it will remove the regular spin-ups of other video disks while playing back recordings from one disk (caused by some housekeeping task, perhaps related to timers, EPG or purging deleted recordings).
So I read about the usage of O_DIRECT and the trouble with it.
Is there a summary of the problems somewhere? I've heard of Linux file system corruption in heavy database use (MySQL/InnoDB), but I'm not sure if there have been cases that have been tracked down to the use of O_DIRECT. I guess O_DIRECT won't work on NFS, but it would be a very bad idea to run a database on NFS anyway.
While googling for O_DIRECT I found the function posix_fadvise - which I had never heard of before - and it can do something similar without having such an impact on the current programming scheme.
Using this function I wrote a little patch for vdr 1.3.30 - if you are interested have a look here:
http://vdr.unetz.com/download/patches/vdr-avoidTrashing-0.2.0-plain-1.3.30.d...
Hmm, is there a reason why your WriteStream function doesn't simply do posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED)? That would simplify the logic, and if I understood correctly, it should work equally well, except with some early 2.5 kernels. I understand that you will need to keep track of the offsets in ReadStream because of read-ahead.
Marko
Marko Mäkelä wrote:
So I read about the usage of O_DIRECT and the trouble with it.
Is there a summary of the problems somewhere? I've heard of Linux file system corruption in heavy database use (MySQL/InnoDB), but I'm not sure if there have been cases that have been tracked down to the use of O_DIRECT. I guess O_DIRECT won't work on NFS, but it would be a very bad idea to run a database on NFS anyway.
Actually I meant the trouble with programming. It makes a difference whether you have to align all buffers to 512 bytes and read/write in 512-byte blocks, or can just read and write what you want. This is especially true for programs which have not been designed this way from the very beginning. I first started on a patch based on O_DIRECT - this would have been much more intrusive.
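To illustrate the kind of bookkeeping O_DIRECT forces on a program, here is a small sketch in C. It assumes the 512-byte alignment mentioned above (the real requirement depends on the device and file system), and the helper names are hypothetical. Both the buffer address and the transfer length have to be multiples of the block size.

```c
#define _GNU_SOURCE         /* O_DIRECT is a Linux extension */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define DIRECT_ALIGN 512    /* assumed block size, as in the message above */

/* Round n up to the next multiple of DIRECT_ALIGN. */
size_t round_up_to_block(size_t n)
{
    return (n + DIRECT_ALIGN - 1) / DIRECT_ALIGN * DIRECT_ALIGN;
}

/* Allocate a buffer whose address and size both satisfy the alignment.
 * Stores the rounded size in *actual; returns NULL on failure.
 * The caller releases the buffer with free(). */
void *alloc_direct_buffer(size_t wanted, size_t *actual)
{
    void *p = NULL;
    size_t len = round_up_to_block(wanted);

    if (len == 0)
        len = DIRECT_ALIGN;
    if (posix_memalign(&p, DIRECT_ALIGN, len) != 0)
        return NULL;
    *actual = len;
    return p;
}

/* Read the first len bytes of a file with O_DIRECT.
 * Returns bytes read, or -1 (for example if the file system
 * does not support O_DIRECT). */
ssize_t direct_read_first_block(const char *path, void *buf, size_t len)
{
    int fd;
    ssize_t n;

    /* Misaligned arguments would make read() fail with EINVAL. */
    assert((uintptr_t)buf % DIRECT_ALIGN == 0 && len % DIRECT_ALIGN == 0);

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;
    n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Every existing read() and write() in a program would have to be routed through buffers like this, which is what makes an O_DIRECT patch so intrusive.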
Hmm, is there a reason why your WriteStream function doesn't simply do posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED)? That would simplify the logic, and if I understood correctly, it should work equally well, except with some early 2.5 kernels. I understand that you will need to keep track of the offsets in ReadStream because of read-ahead.
It seemed to me that different streams interfere with each other. So the posix_fadvise() of a recording thread seemed to kill the read-ahead buffer of a player thread for the same recording - very annoying.
I would be really happy to simplify the code if you can prove me wrong. Another nice thing to have would be a way to tell the kernel to forget dirty buffers after they have been written to disk, without the need to call fdatasync() first. The forced sync definitely is less than optimal.
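The difference between dropping the whole file and dropping only what the writer produced can be sketched in C as follows. This is not the code from the patch; the struct, the function name and the 1 MiB threshold are assumptions for illustration, and the sketch assumes the file is written sequentially from offset 0.

```c
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() and fdatasync() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define FLUSH_THRESHOLD (1024 * 1024)   /* hypothetical: flush every 1 MiB */

/* Hypothetical per-writer state; initialise written and flushed to 0. */
struct write_stream {
    int   fd;
    off_t written;   /* total bytes written so far */
    off_t flushed;   /* everything below this offset was synced and dropped */
};

/* Write data and, once enough has accumulated, sync it and drop only the
 * range this writer produced since the last flush. Returns write()'s result. */
ssize_t write_stream_put(struct write_stream *ws, const void *data, size_t len)
{
    ssize_t n;

    assert(ws != NULL);
    n = write(ws->fd, data, len);
    if (n <= 0)
        return n;
    ws->written += n;

    if (ws->written - ws->flushed >= FLUSH_THRESHOLD) {
        /* Dirty pages are not dropped by the advice, so sync them first. */
        if (fdatasync(ws->fd) == 0) {
            /* Restrict the advice to [flushed, written) rather than using
             * (fd, 0, 0), so pages a player of the same recording is
             * using elsewhere in the file are left alone. */
            (void)posix_fadvise(ws->fd, ws->flushed,
                                ws->written - ws->flushed,
                                POSIX_FADV_DONTNEED);
            ws->flushed = ws->written;
        }
    }
    return n;
}
```

With a smaller threshold the fdatasync() calls become more frequent, which is one reason the forced sync is unattractive.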
Ralf
Ralf Müller wrote:
reply to myself ...
It seemed to me that different streams interfere with each other. So the posix_fadvise() of a recording thread seemed to kill the read-ahead buffer of a player thread for the same recording - very annoying.
This was even more annoying when I did my first tests with a smaller write buffer ...
Ralf
On Tue, Aug 23, 2005 at 09:00:14PM +0200, Ralf Müller wrote:
Actually I meant the trouble with programming. It makes a difference whether you have to align all buffers to 512 bytes and read/write in 512-byte blocks, or can just read and write what you want. This is especially true for programs which have not been designed this way from the very beginning. I first started on a patch based on O_DIRECT - this would have been much more intrusive.
I see, you would need to introduce a buffering layer.
Hmm, is there a reason why your WriteStream function doesn't simply do posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED)? That would simplify the logic, and if I understood correctly, it should work equally well, except with some early 2.5 kernels. I understand that you will need to keep track of the offsets in ReadStream because of read-ahead.
It seemed to me that different streams interfere with each other. So the posix_fadvise() of a recording thread seemed to kill the read-ahead buffer of a player thread for the same recording - very annoying.
Sorry, I didn't think of that.
Another nice thing to have would be a way to tell the kernel to forget dirty buffers after they have been written to disk, without the need to call fdatasync() first. The forced sync definitely is less than optimal.
In theory, this could be solved by writing a write completion callback function that would invoke posix_fadvise(), but this asynchronous I/O would be real overkill. As far as I know, aio (asynchronous I/O) is only available in 2.6 series kernels (and perhaps some Red Hat 2.4 kernels).
Marko
Ralf Müller wrote:
The other day I tried to find out whether anybody had ever implemented something to keep vdr from trashing the file system buffers. I did this because I hate waiting 20 seconds for vdr to re-read my recordings list.
Excellent work, Ralf! I have 4 disks in my VDR system and 5 more VDR disks in my server, most of them are in standby most of the time, so it takes even longer here to bring up the recordings menu for the first time. I'll try your patch next weekend.
Thanks,
Carsten.
Ralf Müller wrote:
The other day I tried to find out whether anybody had ever implemented something to keep vdr from trashing the file system buffers. I did this because I hate waiting 20 seconds for vdr to re-read my recordings list.
I tried your patch today and watched the available memory (using top) when replaying. The number stayed the same. Excellent!
However, when I did FF or FR, the number decreased. Ralf, can you reproduce the problem? I am using plain vanilla VDR 1.3.30 with a few small patches (SourceCaps, DeleteResume, burn) and of course your patch.
Carsten.
Carsten Koch wrote:
I tried your patch today and watched the available memory (using top) when replaying. The number stayed the same. Excellent!
Thanks a lot.
However, when I did FF or FR, the number decreased. Ralf, can you reproduce the problem?
With FF there is no decrease here - but with FR there is. I hadn't noticed that before (for whatever reason), but I'm happy you found it. When jumping backward I forgot to clear the read-ahead buffer. Interestingly, clearing that buffer doesn't help - this may have to do with the man page description for POSIX_FADV_WILLNEED: "start a non blocking read". It seems this read may or may not be finished when I try to say that I no longer need the previously requested read-ahead buffer. So as a workaround I dropped the POSIX_FADV_WILLNEED call. This also helped for the noad patch.
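A rough sketch in C of the read side with this workaround might look as follows. It is an illustration only, not the code from the patch: the struct, the function name and the READAHEAD_GUESS value are assumptions. The reader never issues POSIX_FADV_WILLNEED; it drops what it has consumed and, on a jump, also drops a guessed window after the old position where the kernel's own read-ahead may have left pages.

```c
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() and pread() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define READAHEAD_GUESS (128 * 1024)  /* hypothetical size of the window to drop */

/* Hypothetical per-reader state; initialise next to 0. */
struct read_stream {
    int   fd;
    off_t next;   /* offset where a purely sequential reader would continue */
};

/* Read len bytes at offset off. Returns pread()'s result. */
ssize_t read_stream_get(struct read_stream *rs, void *buf, size_t len, off_t off)
{
    ssize_t n;

    assert(rs != NULL && off >= 0);

    if (off != rs->next) {
        /* A jump (fast forward or rewind): pages read ahead past the old
         * position will not be used, so advise the kernel to drop them. */
        (void)posix_fadvise(rs->fd, rs->next, READAHEAD_GUESS,
                            POSIX_FADV_DONTNEED);
    }

    n = pread(rs->fd, buf, len, off);
    if (n > 0) {
        /* Drop what was just consumed; no WILLNEED request is made. */
        (void)posix_fadvise(rs->fd, off, n, POSIX_FADV_DONTNEED);
        rs->next = off + n;
    }
    return n;
}
```

Because no asynchronous read is requested, there is no window in which a DONTNEED call can race with a WILLNEED read that is still in flight.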
I will try to release a new patch this afternoon or evening.
Regards Ralf