[KEEP IN SYNC WITH LOCKFREE.H]

overview
--------

this module provides several implicitly thread-safe data structures.
rather than allowing only one thread to access them at a time, their
operations are carefully implemented such that they take effect in
one atomic step. data consistency problems are thus avoided.
this approach to synchronization has several advantages:
* deadlocks are impossible;
* overhead due to OS kernel entry is avoided;
* graceful scaling to multiple processors is ensured.


mechanism
---------

the basic primitive that makes this possible is "compare and swap" (CAS),
a CPU instruction that performs the comparison and the write in one
atomic step. it compares a machine word against an expected value; if
they are equal, the new value is written and success is indicated.
otherwise, another thread must have written to the same location in the
meantime; the operation is typically retried.

this instruction is available on all modern architectures; on some, it
must be emulated in terms of an alternate primitive, LL/SC
(load-linked/store-conditional).


memory management
-----------------

one major remaining problem is how to free nodes in the data structure
that are no longer needed. in general, we want to reclaim their memory
for arbitrary use; this isn't safe as long as other threads may still be
accessing them.

the RCU algorithm exploits the fact that once every CPU has passed
through a quiescent state, no thread can still hold a reference to old
data. lacking such kernel support, we use a similar mechanism: "hazard
pointers" are set before accessing data, and a node can only be freed
once no hazard pointer refers to it. until then, retired nodes are kept
in a per-thread 'waiting list'.

this approach has several advantages over previous algorithms
(typically involving reference counts): the CAS primitive need only
operate on single machine words, and space/time overhead is much reduced.


usage notes
-----------

useful "payload" in the data structures is allocated together with each
item when it is inserted: additional_bytes are appended. rationale: see
the struct Node definition.

since lock-free algorithms are subtle and easy to get wrong, an extensive
self-test is included; #define PERFORM_SELF_TEST 1 to activate it.


terminology
-----------

; "atomic" : indivisible; other CPUs cannot observe or interfere with such an operation while it is in progress.
; "race condition" : a potential data consistency problem resulting from a lack of thread synchronization.
; "deadlock" : a state in which several threads are waiting on one another and no progress is possible.
; "thread-safety" : understood here to mean that the preceding two problems do not occur.
; "scalability" : a measure of how efficient synchronization is; overhead should not increase significantly with more processors.
; "linearization point" : the instant at which an external observer considers a lock-free operation to have taken effect.