this module provides several implicitly thread-safe data structures. rather than allowing only one thread to access them at a time, their operations are carefully implemented such that they take effect in one atomic step. data consistency problems are thus avoided. this novel approach to synchronization has several advantages:

  • deadlocks are impossible;
  • overhead due to OS kernel entry is avoided;
  • graceful scaling to multiple processors is ensured.


the basic primitive that makes this possible is "compare and swap", a CPU instruction that performs both steps atomically. it compares a machine word against the expected value; if equal, the new value is written and an indication returned. otherwise, another thread must have been writing to the same location; the operation is typically retried.

this instruction is available on all modern architectures; in some cases, emulation in terms of an alternate primitive (LL/SC) is necessary.

memory management

one major remaining problem is how to free no longer needed nodes in the data structure. in general, we want to reclaim their memory for arbitrary use; this isn't safe as long as other threads are still accessing them.

the RCU algorithm recognizes that all CPUs having entered a quiescent state means that no threads are still referencing data. lacking such kernel support, we use a similar mechanism - "hazard pointers" are set before accessing data; only if none are pointing to a node can it be freed. until then, they are stored in a per-thread 'waiting list'.

this approach has several advantages over previous algorithms (typically involving reference count): the CAS primitive need only operate on single machine words, and space/time overhead is much reduced.

usage notes

useful "payload" in the data structures is allocated when inserting each item: additional_bytes are appended. rationale: see struct Node definition.

since lock-free algorithms are subtle and easy to get wrong, an extensive self-test is included; #define PERFORM_SELF_TEST 1 to activate.


; "atomic" : means indivisible; in this case, other CPUs cannot interfere with such an operation. ; "race conditions" : are potential data consistency problems resulting from lack of thread synchronization. ; "deadlock" : is a state where several threads are waiting on one another and no progress is possible. ; "thread-safety" : is understood to mean the preceding two problems do not occur. ; "scalability" : is a measure of how efficient synchronization is; overhead should not increase significantly with more processors. ; "linearization point" : denotes the time at which an external observer believes a lock-free operation to have taken effect.

Last modified 10 years ago Last modified on Feb 23, 2008, 4:19:01 AM