[linux-dvb] [PATCH] dvb-core: Fix several locking related problems.

Simon Arlott simon at fire.lp0.eu
Mon Mar 5 19:04:44 CET 2007


On 05/03/07 17:19, Oliver Endriss wrote:
> Simon Arlott wrote:
>> On Mon, March 5, 2007 11:19, Johannes Stezenbach wrote:
>>> On Mon, Mar 05, 2007 at 01:58:14AM +0100, Oliver Endriss wrote:
>>>> Simon Arlott wrote:
>>>>> Is any part of the patch going to be applied? I mentioned this
>>>>> problem in September last year and it looks like it's existed for
>>>>> years (the semaphore locking did the same thing).
>>>> Well, I hoped that someone more familiar with the demuxer stuff would
>>>> comment on the patch. I am not very happy about using non-interruptible
>>>> lock operations...
>> Why? If there are deadlocks these should be fixed, not ignored.
> 
> Yes, they should be fixed, but they should not require a reboot.

What!? The fix for a deadlock is not a reboot... perhaps you misunderstood 
what I meant - whatever ends up holding the lock forever should be fixed.

> From 'Linux Device Drivers' (replace 'down' by 'mutex_lock'):
> | ...
> | down decrements the value of the semaphore and waits as long as need
> | be. down_ interruptible does the same, but the operation is
> | interruptible. The interruptible version is almost always the one you
> | will want; it allows a user-space process that is waiting on a
> | semaphore to be interrupted by the user. You do not, as a general
>                                            ^^^^^^^^^^^^^^^^^^^^^^^^
> | rule, want to use noninterruptible operations unless there truly is no
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> | alternative. Noninterruptible operations are a good way to create
>   ^^^^^^^^^^^
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> | unkillable processes (the dreaded  D state  seen in ps), and annoy
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> | your users. Using down_interruptible requires some extra care,
    ^^^^^^^^^^
I don't see this advice in Documentation/mutex-design.txt (although it 
doesn't advise either way) or in Documentation/ at all.

                                         ^^^^^^^^^^^^^^^^^^^^^^^^^
> | however, if the operation is interrupted, the function returns a
> | nonzero value, and the caller does not hold the semaphore. Proper use
> | of down_interruptible requires always checking the return value and
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Until you do this, you can't use down_interruptible.

> | responding accordingly.            
> | ...

>>>> Anyway, I'll apply the patch to HG master if you submit an updated patch:
>>>> - Please add a line of comment to each mutex_lock() stating _why_ the
>>>>   non-interruptible lock has to be used at this place.
>> What's the point of doing that?
> 
> The point is to document why we do not use the interruptible version.
> Next year someone might submit a patch replacing mutex_lock by
> mutex_lock_interruptible, and no one remembers why mutex_lock is
> required at this place.

There is no danger of this, if someone submits a patch replacing it with 
mutex_lock_interruptible and doesn't take into account every possible 
calling function's check of the return value then their patch needs 
changing before it can be accepted.
 
>>> IMHO using mutex_lock_interruptible() is almost always wrong.
>>>
>>> The only use it has in dvb-core is to recover from locking bugs --
>>> if it deadlocks you can Ctrl-C out of it
>>> (instead of being left with a non-killable program -> reboot).

The key phrase here is "locking *bugs*", if the code can lock forever 
it needs to be fixed. The real cause of bugs will never be found if 
interruptible locking is used everywhere when when it's not necessary :(

Also, this is in dvb-core not actual card drivers - why would there be  
locking bugs in dvb-core itself?

>> This is what lockdep is for.
> 
> Is lockdep activated in distribution kernels?

No, but developers could use it while testing. It only reports locking 
bugs by checking how locks depend on other locks (see lockdep-design.txt), 
I don't think it would spot code that just locked a mutex and then waited 
forever (which is really bad code anyway).

-- 
Simon Arlott



More information about the linux-dvb mailing list