Dynamic Cycle Detection for Lock Ordering
Recently, I discovered that an algorithm of mine from a few years back is being used in both TensorFlow and the Abseil C++ library (see here and here). That is of course pretty exciting since they are both widely used libraries! So, I thought it would be interesting to look at what it is being used for.
In Abseil, the algorithm is used in
to ensure locks are acquired in a consistent
order. Suppose we
have two mutexes
M1 which can be held at the same time by
two threads. A deadlock can easily occur if, for example, the first
M1, whilst the second acquires
M0. This doesn’t mean a deadlock will happen every time.
But if, by chance, the first thread acquires
M0 at the same time as
the second aqcuires
M1 — then we have a deadlock.
On the other hand, if mutexes are acquired according to a globally
consistent ordering (e.g.
M0 always acquired before
M1), then no
deadlock can arise. The challenge is to determine an appropriate
ordering of mutexes. In fact,
mutex does not attempt to determine
this statically (presumably this is considered too hard). Instead, it
simply observes program execution and reconstructs the ordering
dynamically. Then, during execution, if some thread attempts to
acquire a mutex in an order inconsistent with this, a potential
deadlock is reported.
To detect deadlocks,
mutex maintains a (global) ordering of lock
acquisitions called the aquires-before graph. This is implemented
using a global variable called
deadlock_graph which stores a
directed acyclic graph, such as the following:
Here, we have four mutexes and an edge
Mx -> My means
Mx must be
My. Actually, it indicates that
Mx has been
My in all lock acquisitions observed thus far. As
an example, consider mutex
M1 must be acquired
M3. In contrast, it doesn’t matter whether
M2 is acquired
M3 or not.
Mutexes which are unordered with respect to each other (as for
M3 above) are ordered on demand. For example, if a thread
comes along and acquires
M2 whilst holding
M3, the ordering is
From now on, any attempt to acquire
M3 generates an
error message highlighting the potential deadlock. To make this work,
every thread is associated with the mutexes it currently holds. When
a thread holding mutex
Mx attempts to acquire mutex
corresponding edge is added to
deadlock_graph. If that edge
introduces a cycle, we have a potential deadlock. Otherwise, the
ordering is updated (as above) and execution proceeds.
At this point, it is becoming clear that performance is an issue.
Whenever any thread acquires a lock, the
deadlock_graph must be
updated and the current ordering recalculated. For this reason,
deadlock detection in
mutex is only enabled when debugging.
Furthermore, an efficient algorithm for updating the ordering is
desirable (and this is where my algorithm comes in).
Dynamic Cycle Detection
Detecting cycles in a directed graph is actually pretty easy. We can just traverse the entire graph using a depth-first search and, if we encounter a vertex already visited, then we’ve found a cycle. We can also use Tarjan’s algorithm for detecting strongly connected components if we want to know what’s in the cycle.
Whilst detecting cycles is easy, the challenge lies in doing it
efficiently after an edge has been inserted. A simple solution is
just to retraverse the entire graph (e.g. using Tarjan’s algorithm),
but this is quite wasteful. Instead, my algorithm limits the traveral
as much as possible by maintaining a topoligical (i.e. consistent)
ordering of the
graph. For example, consider adding the edge
M3 --> M1 to the
Observe, when inserting an edge
Mx --> My, there is a 50% chance
My are already correctly ordered and, hence, no
further work is required! However, if they are incorrectly ordered
(as above), then either: they do form a cycle (hence, we have detected
a potential deadlock); or, they don’t and the ordering needs updating.
To figure this out, we must traverse some (or all) of the graph. My
algorithm improves upon the naive approach (i.e. always traversing the
whole graph) by limiting the search to just the affected region.
That is, those mutexes beteween the two end points of the edge being
inserted (as shown above). Of course, in the worst case, the affected
region is the whole graph! But, in the average case, it is often much
less (as above). The key is that my algorithm never does more work
than the naive approach, and usually does a lot less. For graphs with
a reasonable number of vertices, this offers considerable performance
improvements (hence, presumably why the Abseil developers chose it).
Hopefully, that’s given you an insight into the deadlock detection algorithm used in Abseil. It’s an interesting problem that turns out to be ideally suited to my algorithm, and something I had never thought of. That’s the beauty of algorithms — sometimes they have a life of their own!
Follow the discussion on Twitter or Reddit