Lock-free collection in libgee: hazard pointer

From version 0.8 libgee have support for lock-free collections. While they can be used as a drop-in replacement for any other there are a few ‘gotchas’ that user needs to be aware of if (s)he wants to use them efficiently, of if (s)he wants to implement collection themselves.

One of the problems in programming is memory management – if thread 1 uses some resource thread 2 cannot free it. In many languages it is solved by tracing garbage collection, in others it is required to manually manage memory. Some languages use reference counting either by libraries, like Rust (which actually have 2 implementations for different purposes), C++ and others, or natively, like Python’s CPython implementation or Vala.

While both methods have some advantages and disadvantages there is one disadvantage to reference counting – pure reference counting works with atomic object only when you have double compare and swap. To explain – a CAS (compare and swap) is a basic primitive in lock-free algorithms which takes atomic variable, expected value and new value and, atomically, compares atomic with expected value and if it succeeds writes new value to atomic. DCAS (double compare and swap) performs 2 such operations atomically if both comparatives succeeds. Unfortunately DCAS is not implemented on any processor but some old Motorola 68k processors.

Fortunately there are hazard pointers, which libgee contains implementation of. The basic idea is that you create a (global) list of pointers that are currently in use. Now each thread if it wants to read a pointer gets a node from the list, reads a pointer and writes it to the node. Now any thread that wants to use the resource will know that it’s still accessed by that thread. Of course someone might change the resource between read and write so we need to ensure that the current value (pointer) is still there.

Now if a thread wants to free the resource it can do it as long as a) it clears the pointer (so there are no dangling pointers) b) scans the list of pointers to ensure that no thread is accessing it. If there is no one accessing it thread can free it and if there is one it needs to store it in per-thread free list, which is periodically scanned.

For libgee however direct implementation seemed to be unsuitable. If a thread with large free list was not accessing lock-free API for longer period the resources might not be freed during that time. While it was unlikely that it would constitute an increasing resource leak it might have been undesired. Therefore a concept of contexts were added.

Hazard pointer context have an associated array of not-yet-freed resources and policy dictating what should happen when context is destroyed. One of policies is release policy which sends the remaining resources to a global helper – either a thread (by default) or glib main loop. In addition any of those policies may be executed manually while context is alive.

Policies are organized in the stack and if child context is destroyed remaining elements are passed to parent. In effect the contexts constitute a second, parallel stack and should be used as such – any case when context is accessed from different thread or when parent is destroyed before child is illegal.

By default the methods of concurrent collections create their own context and execute within. If there is no context added they will execute a default thread-exit action (namely release of resources). While it is ‘sane’ default it is not exactly lock-free. Therefore if speed or lock-freeness is concerned it is recommended to create contexts to manually control the lifetime. It can give some benefits (I get varied results depending on allocation speed, parameters, phase of moon etc.).

In the end while ConcurrentList and ConcurrentSet are drop-in replacements for List/SortedSet purely using them as such may be suboptimal.

This entry was posted in Libgee and tagged , , , . Bookmark the permalink.

Leave a comment