Beautiful Code [58]
| - Attempts to promote holder of | | |
| upibp->upib_lock (Thread B) | | |
| - Acquires B's thread lock | | - Drops upibp->upib_lock |
| - Adjusts B's priority | | |
| - Drops B's thread lock | | |
| - Seeing that B's wchan is not | | |
| NULL, attempts to continue | | |
| priority inheritance | | |
| - Calls SOBJ_OWNER() on B's wchan | | |
| - Seeing that owner of B's wchan | | |
| is A, panics with "Deadlock: | | |
| cycle in blocking chain" | | |
| | | |
------------------------------------+-+--------------------------------------+
As the above sequence implies, the problem is in turnstile_block()
THREAD_SLEEP(t, &tc->tc_lock);
t->t_wchan = sobj;
t->t_sobj_ops = sobj_ops;
...
/*
* Follow the blocking chain to its end, or until we run out of
* inversions, willing our priority to everyone who's in our way.
*/
while (inverted && t->t_sobj_ops != NULL &&
(owner = SOBJ_OWNER(t->t_sobj_ops, t->t_wchan)) != NULL) {
...
}
(1) --> thread_unlock_nopreempt(t);
/*
* At this point, "t" may not be curthread. So, use "curthread", from
* now on, instead of "t".
*/
if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
(2) --> mutex_exit(mp);
...
We're dropping the thread lock of the blocking thread (at (1)) before we drop
the upibp->upib_lock at (2). From (1) until (2) we are violating one of
the invariants of SOBJ_USER_PI locks: when sleeping on a SOBJ_USER_PI lock,
_no_ kernel locks may be held; any held kernel locks can yield a deadlock
panic.
Understanding the analysis requires some knowledge of implementation nomenclature:
upibp
A pointer to the in-kernel state associated with a held user-level priority inheriting lock; the upib_lock is the lock that protects this state.
t_wchan
The member of the thread structure that contains the pointer to the synchronization primitive upon which the thread (if any) is blocked.[§]
[§]wchan stands for wait channel, a term that dates back to the earliest days of UNIX at Bell Labs, and is itself almost certainly a bastardization of event channels from Multics.
SOBJ_TYPE
A macro that takes the ops vector for a synchronization primitive and returns a constant denoting the type; SOBJ_USER_PI is the constant that denotes a user-level, priority-inheriting lock.
The essence of the problem is this: for user-level locks, we normally keep track of the state associated with the lock (e.g., whether or not there's a waiter) at user-level—that information is considered purely advisory by the kernel. (There are several situations in which the waiters bit can't be trusted, and the kernel knows not to trust it in these situations.)
To implement priority inheritance for user-level locks, however, one must become much more precise about ownership; the ownership must be tracked the same way we track ownership for kernel-level synchronization primitives. That is, when we're doing the complicated thread lock dance in turnstile_interlock(), we can't be doing loads from user-level memory to determine ownership. The nasty implication of this is that the kernel-level state tracking the ownership of the user-level lock must itself be protected by a lock, and that (in-kernel) lock must itself implement priority inheritance to avoid a potential inversion.
This leads us to a deadlock that we did not predict: the in-kernel lock must be acquired and dropped both to acquire the user-level lock and to drop it. That is, there are conditions in which a thread owns the in-kernel lock and wants the user-level lock, and there are conditions in which a thread owns the user-level lock and wants the in-kernel lock. As a result, there can exist blocking chains that appear circular—which will cause the kernel to induce an explicit panic. And indeed, that's exactly what happened in the failure analyzed above: thread A owned the user-level lock and wanted the in-kernel lock (upib_lock), and thread B owned the in-kernel lock and wanted the user-level lock—deadlock!
Once I understood the problem, it was disconcertingly