Online Book Reader

Home Category

Beautiful Code [54]

By Root 5286 0
moment, I can't help but illuminate an elegant detail in the code that executes when a per-CPU magazine is empty and the depot lock (which is global per-cache) must be acquired:

/*

* If we can't get the depot lock without contention,

* update our contention count. We use the depot

* contention rate to determine whether we need to

* increase the magazine size for better scalability.

*/

if (!mutex_tryenter(&cp->cache_depot_lock)) {

mutex_enter(&cp->cache_depot_lock);

cp->cache_depot_contention++;

}

This code doesn't simply acquire the lock, it attempts to acquire the lock, keeping track of the number of times that this attempt fails because the lock was held. The resulting count is a rough indicator of contention at the global layer, and if the count becomes too high in a given interval of time, the system increases the number of buffers stored at the per-CPU layer, reducing the contention at the global layer. This simple mechanism thus allows the subsystem to dynamically adjust its structures to reduce its own contention! Beautiful code, for certain.

Let's return to the construction of our example, and now to ZFS, where all we need to know is that files and directories have an in-memory structure called a znode.

Given this background about the kernel memory allocator and ZFS, we can envision the following sequence of events:

A thread T1, attempts an allocation from the kmem_alloc_32 cache on CPU 2, which requires taking the lock for CPU 2's kmem_alloc_32 magazine. Discovering that the magazines for CPU 2 are all empty, T1 acquires the depot lock for the kmem_alloc_32 cache, and it is then preempted, with both the CPU 2 magazine lock and the depot lock held.

A second thread T2, running on CPU 3, attempts an unrelated allocation from the kmem_alloc_32 cache. As bad luck would have it, its magazines are also empty. T2 attempts to acquire the depot lock for the kmem_alloc_32 cache—but it sees that the lock is held by T1, and it blocks.

A third thread T3, runs on CPU 3 after T2 has blocked. This thread is attempting to create the ZFS file /foo/bar/mumble. As part of this operation, it must create a ZFS directory entry lock for the entry mumble in the directory /foo/bar. It acquires the lock on the znode that corresponds to /foo/bar and then attempts to allocate a zfs_dirlock_t. Because a zfs_dirlock_t is 32 bytes in size, this allocation is to be satisfied from the kmem_alloc_32 cache, and T3 therefore attempts to acquire the magazine lock for the kmem_alloc_32 cache on CPU 3—but it sees that the lock is held by T2, and it blocks.

A fourth thread, T4, attempts to examine the contents of the directory /foo/bar. As a part of this operation, it attempts to acquire the lock on the znode that corresponds to /foo/bar—but it sees that the lock is held by T3, and it blocks.

When T4 blocks, it is blocking on T3, which is in turn blocked on T2, which is in turn blocked on T1—and it is this chain of threads that constitutes the blocking chain. Having seen what a blocking chain might actually look like in the wild, it might be easier to appreciate the essential subtlety of getting priority inheritance correct: when a blocking thread wills its priority to its blocking chain, we must iterate over the entire blocking chain coherently. That is, when we iterate over the blocking chain, we must see a consistent snapshot of all of the threads that were on the blocking chain at that instant—no more and no less. In the context of this example, we wouldn't want to will our priority to T1 after it released the lock blocking T2 (and thus, transitively, T4)—this would potentially leave T1 at an artificially high priority.

So, how can we iterate over the blocking chain coherently? In Solaris, a thread's dispatcher state (e.g., whether it's running, enqueued to run, or sleeping) is protected by acquiring a special spin lock known as its thread lock; it might be tempting to believe that in order to process the blocking chain coherently, we should simply acquire all of the thread locks at once. This won't

Return Main Page Previous Page Next Page

®Online Book Reader