`folly/Synchronized.h` introduces a simple abstraction for mutex-
based concurrency. It replaces convoluted, unwieldy, and just
plain wrong code with simple constructs that are easy to get
right and difficult to get wrong.
### Motivation

Many of our multithreaded C++ programs use shared data structures
associated with locks. This follows the time-honored adage of
mutex-based concurrency control: "associate mutexes with data, not code".
Consider the following example:
```Cpp
class RequestHandler {
  ...
  RequestQueue requestQueue_;
  SharedMutex requestQueueMutex_;

  std::map<std::string, Endpoint> requestEndpoints_;
  SharedMutex requestEndpointsMutex_;

  HandlerState workState_;
  SharedMutex workStateMutex_;
  ...
};
```
Whenever the code needs to read or write some of the protected
data, it acquires the mutex for reading or for reading and
writing:
```Cpp
void RequestHandler::processRequest(const Request& request) {
  stop_watch<> watch;
  checkRequestValidity(request);
  SharedMutex::WriteHolder lock(requestQueueMutex_);
  requestQueue_.push_back(request);
  stats_->addStatValue("requestEnqueueLatency", watch.elapsed());
  LOG(INFO) << "enqueued request ID " << request.getID();
}
```
However, the correctness of the technique is entirely predicated on
convention. Developers manipulating these data members must take care
to explicitly acquire the correct lock for the data they wish to access.
There is no ostensible error for code that:

* manipulates a piece of data without acquiring its lock first
* acquires a different lock instead of the intended one
* acquires a lock in read mode but modifies the guarded data structure
* acquires a lock in read-write mode although it only has `const` access
### Introduction to `folly/Synchronized.h`

The same code sample could be rewritten with `Synchronized`
as follows:
```Cpp
class RequestHandler {
  ...
  Synchronized<RequestQueue> requestQueue_;
  Synchronized<std::map<std::string, Endpoint>> requestEndpoints_;
  Synchronized<HandlerState> workState_;
  ...
};
```
```Cpp
void RequestHandler::processRequest(const Request& request) {
  stop_watch<> watch;
  checkRequestValidity(request);
  requestQueue_.wlock()->push_back(request);
  stats_->addStatValue("requestEnqueueLatency", watch.elapsed());
  LOG(INFO) << "enqueued request ID " << request.getID();
}
```
The rewrite does at maximum efficiency what needs to be done:
acquires the lock associated with the `RequestQueue` object, writes to
the queue, and releases the lock immediately thereafter.
On the face of it, that's not much to write home about, and not
an obvious improvement over the previous state of affairs. But
the features at work invisible in the code above are as important
as those that are visible:

* Unlike before, the data and the mutex protecting it are
  inextricably encapsulated together.
* If you tried to use `requestQueue_` without acquiring the lock you
  wouldn't be able to; it is virtually impossible to access the queue
  without acquiring the correct lock.
* The lock is released immediately after the insert operation is
  performed, and is not held for operations that do not need it.
If you need to perform several operations while holding the lock,
`Synchronized` provides several options for doing this.

The `wlock()` method (or `lock()` if you have a non-shared mutex type)
returns a `LockedPtr` object that can be stored in a variable. The lock
will be held for as long as this object exists, similar to a
`std::unique_lock`. This object can be used as if it were a pointer to
the underlying locked object:
```Cpp
{
  auto lockedQueue = requestQueue_.wlock();
  lockedQueue->push_back(request1);
  lockedQueue->push_back(request2);
}
```
The `rlock()` function is similar to `wlock()`, but acquires a shared lock
rather than an exclusive lock.

We recommend explicitly opening a new nested scope whenever you store a
`LockedPtr` object, to help visibly delineate the critical section, and
to ensure that the `LockedPtr` is destroyed as soon as it is no longer
needed.
Alternatively, `Synchronized` also provides mechanisms to run a function while
holding the lock. This makes it possible to use lambdas to define brief
critical sections:
```Cpp
void RequestHandler::processRequest(const Request& request) {
  stop_watch<> watch;
  checkRequestValidity(request);
  requestQueue_.withWLock([&](auto& queue) {
    // withWLock() automatically holds the lock for the
    // duration of this lambda function
    queue.push_back(request);
  });
  stats_->addStatValue("requestEnqueueLatency", watch.elapsed());
  LOG(INFO) << "enqueued request ID " << request.getID();
}
```
One advantage of the `withWLock()` approach is that it forces a new
scope to be used for the critical section, making the critical section
more obvious in the code, and helping to encourage code that releases
the lock as soon as possible.
### Template class `Synchronized<T>`

#### Template Parameters

`Synchronized` is a template with two parameters, the data type and a
mutex type: `Synchronized<T, Mutex>`.
If not specified, the mutex type defaults to `std::mutex`. However, any
mutex type supported by `folly::LockTraits` can be used instead.
`folly::LockTraits` can be specialized to support other custom mutex
types that it does not know about out of the box. See
`folly/LockTraitsBoost.h` for an example of how to support additional
mutex types.
`Synchronized` provides slightly different APIs when instantiated with a
shared mutex type than with a plain exclusive mutex type. When used with
a shared mutex type, it has separate `wlock()` and `rlock()` methods,
rather than a single `lock()` method. Similarly, it has `withWLock()`
and `withRLock()` rather than `withLock()`. When using a shared mutex
type, these APIs ensure that callers make an explicit choice to acquire
a shared or an exclusive lock, and that callers do not
unintentionally lock the mutex in the incorrect mode. The `rlock()`
APIs only provide `const` access to the underlying data type, ensuring
that it cannot be modified when only holding a shared lock.
#### Constructors

The default constructor default-initializes the data and its
associated mutex.

The copy constructor locks the source for reading and copies its
data into the target. (The target is not locked, as an object
under construction is only accessed by one thread.)
Finally, `Synchronized<T>` defines an explicit constructor that
takes an object of type `T` and copies it. For example:

```Cpp
// Default constructed
Synchronized<map<string, int>> syncMap1;

// Copy constructed
Synchronized<map<string, int>> syncMap2(syncMap1);

// Initializing from an existing map
map<string, int> init;
init["world"] = 0;
Synchronized<map<string, int>> syncMap3(init);
EXPECT_EQ(syncMap3->size(), 1);
```
#### Assignment, swap, and copying

The canonical assignment operator locks both objects involved and
then copies the underlying data objects. The mutexes are not
copied. The locks are acquired in increasing address order, so
deadlock is avoided. For example, there is no problem if one
thread assigns `a = b` and another assigns `b = a` (other than
that design probably deserving a Razzie award). Similarly, the
`swap` method takes a reference to another `Synchronized<T>`
object and swaps the data. Again, locks are acquired in a well-
defined order. The mutexes are not swapped.
An additional assignment operator accepts a `const T&` on the
right-hand side. The operator copies the datum inside a critical
section.

In addition to assignment operators, `Synchronized<T>` has move
assignment operators.
An additional `swap` method accepts a `T&` and swaps the data
inside a critical section. This is by far the preferred method of
changing the guarded datum wholesale because it keeps the lock
only for a short time, thus lowering the pressure on the mutex.

To get a copy of the guarded data, there are two methods
available: `void copy(T*)` and `T copy()`. The first copies data
to a provided target and the second returns a copy by value. Both
operations are done under a read lock. Example:
```Cpp
Synchronized<vector<string>> syncVec1, syncVec2;
vector<string> vec;

// Assign
syncVec1 = syncVec2;
// Assign straight from vector
syncVec1 = vec;

// Swap
syncVec1.swap(syncVec2);
// Swap with vector
syncVec1.swap(vec);

// Copy to given target
syncVec1.copy(&vec);
// Get a copy by value
auto copy = syncVec1.copy();
```
#### `lock()`

If the mutex type used with `Synchronized` is a simple exclusive mutex
type (as opposed to a shared mutex), `Synchronized<T>` provides a
`lock()` method that returns a `LockedPtr<T>` to access the data while
holding the lock.

The `LockedPtr` object returned by `lock()` holds the lock for as long
as it exists. Whenever possible, prefer declaring a separate inner
scope for storing this variable, to make sure the `LockedPtr` is
destroyed as soon as the lock is no longer needed:
```Cpp
void fun(Synchronized<vector<string>, std::mutex>& vec) {
  {
    auto locked = vec.lock();
    locked->push_back("hello");
    locked->push_back("world");
  }
  LOG(INFO) << "successfully added greeting";
}
```
#### `wlock()` and `rlock()`

If the mutex type used with `Synchronized` is a shared mutex type,
`Synchronized<T>` provides a `wlock()` method that acquires an exclusive
lock, and an `rlock()` method that acquires a shared lock.

The `LockedPtr` returned by `rlock()` only provides const access to the
internal data, to ensure that it cannot be modified while only holding a
shared lock:
```Cpp
int computeSum(const Synchronized<vector<int>>& vec) {
  int sum = 0;
  auto locked = vec.rlock();
  for (int n : *locked) {
    sum += n;
  }
  return sum;
}

void doubleValues(Synchronized<vector<int>>& vec) {
  auto locked = vec.wlock();
  for (int& n : *locked) {
    n *= 2;
  }
}
```
This example brings us to a cautionary discussion. The `LockedPtr`
object returned by `lock()`, `wlock()`, or `rlock()` only holds the lock
as long as it exists. This object makes it difficult to access the data
without holding the lock, but not impossible. In particular you should
never store a raw pointer or reference to the internal data for longer
than the lifetime of the `LockedPtr` object.
For instance, if we had written the following code in the examples
above, this would have continued accessing the vector after the lock
had been released:

```Cpp
// Do NOT do this.
for (int& n : *vec.wlock()) {
  n *= 2;
}
```
The `vec.wlock()` return value is destroyed in this case as soon as the
internal range iterators are created. The range iterators point into
the vector's data, but the lock is released immediately, before the
loop body executes.

Needless to say, this is a crime punishable by long debugging nights.

Range-based for loops are slightly subtle about the lifetime of objects
used in the initializer statement. Most other problematic use cases are
a bit easier to spot than this, since the lifetime of the `LockedPtr` is
more explicitly visible.
#### `withLock()`

As an alternative to the `lock()` API, `Synchronized` also provides a
`withLock()` method that executes a function or lambda expression while
holding the lock. The function receives a reference to the data as its
argument.
This has a few benefits compared to `lock()`:

* The lambda expression requires its own nested scope, making critical
  sections more visible in the code. Callers are recommended to define
  a new scope when using `lock()`, but this is not required.
  `withLock()` ensures that a new scope must always be defined.
* Because a new scope is required, `withLock()` also helps encourage
  users to release the lock as soon as possible. Because the critical
  section scope is easily visible in the code, it is harder to
  accidentally put extraneous code inside the critical section without
  realizing it.
* The separate lambda scope makes it more difficult to store raw
  pointers or references to the protected data and continue using those
  pointers outside the critical section.
For example, `withLock()` makes the range-based for loop mistake from
above much harder to accidentally run into:
```Cpp
vec.withLock([](auto& locked) {
  for (int& n : locked) {
    n *= 2;
  }
});
```
This code does not have the same problem as the counter-example with
`wlock()` above, since the lock is held for the duration of the loop.

When using `Synchronized` with a shared mutex type, it provides separate
`withWLock()` and `withRLock()` methods instead of `withLock()`.
#### Timed lock acquisition

When `Synchronized` is used with a mutex type that supports timed lock
acquisition, `lock()`, `wlock()`, and `rlock()` can all take an optional
`std::chrono::duration` argument. This argument specifies a timeout to
use for acquiring the lock. If the lock is not acquired before the
timeout expires, a null `LockedPtr` object will be returned. Callers
must explicitly check the return value before using it:
```Cpp
void fun(Synchronized<vector<string>>& vec) {
  {
    auto locked = vec.lock(10ms);
    if (!locked) {
      throw std::runtime_error("failed to acquire lock");
    }
    locked->push_back("hello");
    locked->push_back("world");
  }
  LOG(INFO) << "successfully added greeting";
}
```
#### `unlock()` and `scopedUnlock()`

`Synchronized` is a good mechanism for enforcing scoped
synchronization, but it has the inherent limitation that it
requires the critical section to be, well, scoped. Sometimes the
code structure requires a fleeting "escape" from the iron fist of
synchronization, while still inside the critical section scope.

One common pattern is releasing the lock early on error code paths,
prior to logging an error message. The `LockedPtr` class provides an
`unlock()` method that makes this possible:
```Cpp
Synchronized<map<int, string>> dic;
// ...
{
  auto locked = dic.rlock();
  auto iter = locked->find(0);
  if (iter == locked->end()) {
    locked.unlock(); // don't hold the lock while logging
    LOG(ERROR) << "key 0 not found";
    return;
  }
  processValue(*iter);
}
LOG(INFO) << "succeeded";
```
For more complex nested control flow scenarios, `scopedUnlock()` returns
an object that will release the lock for as long as it exists, and will
reacquire the lock when it goes out of scope. For example:
```Cpp
Synchronized<map<int, string>> dic;
// ...
{
  auto locked = dic.wlock();
  auto iter = locked->find(0);
  if (iter == locked->end()) {
    {
      auto unlocker = locked.scopedUnlock();
      LOG(INFO) << "Key 0 not found, inserting it.";
    }
    locked->emplace(0, "zero");
  }
}
```
Clearly `scopedUnlock()` comes with specific caveats and
liabilities. You must assume that during the `scopedUnlock()`
section, other threads might have changed the protected structure
in arbitrary ways. In the example above, you cannot use the
iterator `iter` and you cannot assume that the key `0` is not in the
map; another thread might have inserted it while you were
bragging on `LOG(INFO)`.
Whenever a `LockedPtr` object has been unlocked, whether with `unlock()`
or `scopedUnlock()`, it will behave as if it is null. `isNull()` will
return true. Dereferencing an unlocked `LockedPtr` is not allowed and
will result in undefined behavior.
#### `Synchronized` and `std::condition_variable`

When used with a `std::mutex`, `Synchronized` supports using a
`std::condition_variable` with its internal mutex. This allows a
`condition_variable` to be used to wait for a particular change to occur
in the internal data.

The `LockedPtr` returned by `Synchronized<T, std::mutex>::lock()` has a
`getUniqueLock()` method that returns a reference to a
`std::unique_lock<std::mutex>`, which can be given to the
`std::condition_variable`:
```Cpp
Synchronized<vector<string>, std::mutex> vec;
std::condition_variable emptySignal;

// Assuming some other thread will put data on vec and signal
// emptySignal, we can then wait on it as follows:
auto locked = vec.lock();
emptySignal.wait(locked.getUniqueLock(),
                 [&] { return !locked->empty(); });
```
### `acquireLocked()`

Sometimes locking just one object won't be able to cut the mustard.
Consider a function that needs to lock two `Synchronized` objects at the
same time - for example, to copy some data from one to the other.
At first sight, it looks like sequential `wlock()` calls will work just
fine:
```Cpp
void fun(Synchronized<vector<int>>& a, Synchronized<vector<int>>& b) {
  auto lockedA = a.wlock();
  auto lockedB = b.wlock();
  // ... use lockedA and lockedB ...
}
```
This code compiles and may even run most of the time, but embeds
a deadly peril: if one thread calls `fun(x, y)` and another
thread calls `fun(y, x)`, then the two threads are liable to
deadlock, as each thread will be waiting for a lock the other
is holding. This issue is a classic that applies regardless of
the fact that the objects involved have the same type.
This classic problem has a classic solution: all threads must
acquire locks in the same order. The actual order is not
important, just the fact that the order is the same in all
threads. Many libraries simply acquire mutexes in increasing
order of their address, which is what we'll do, too. The
`acquireLocked()` function takes care of all details of proper
locking of two objects and offering their innards. It returns a
`std::tuple` of `LockedPtr`s:
```Cpp
void fun(Synchronized<vector<int>>& a, Synchronized<vector<int>>& b) {
  auto ret = folly::acquireLocked(a, b);
  auto& lockedA = std::get<0>(ret);
  auto& lockedB = std::get<1>(ret);
  // ... use lockedA and lockedB ...
}
```
Note that C++17 introduces
[structured binding syntax](http://wg21.link/P0144r2),
which makes the returned tuple more convenient to use:
```Cpp
void fun(Synchronized<vector<int>>& a, Synchronized<vector<int>>& b) {
  auto [lockedA, lockedB] = folly::acquireLocked(a, b);
  // ... use lockedA and lockedB ...
}
```
An `acquireLockedPair()` function is also available, which returns a
`std::pair` instead of a `std::tuple`. This is more convenient to use
in many situations, until compiler support for structured bindings is
more widely available.
### Synchronizing several data items with one mutex

The library is geared at protecting one object of a given type
with a mutex. However, sometimes we'd like to protect two or more
members with the same mutex. Consider for example a bidirectional
map, i.e. a map that holds an `int` to `string` mapping and also
the converse `string` to `int` mapping. The two maps would need
to be manipulated simultaneously. There are at least two designs
that come to mind.
#### Using a nested `struct`

You can easily pack the needed data items in a little struct.
For example:
```Cpp
struct BiMap {
  map<int, string> direct;
  map<string, int> inverse;
};

Synchronized<BiMap> bimap_;
```
```Cpp
bimap_.withLock([](auto& locked) {
  locked.direct[0] = "zero";
  locked.inverse["zero"] = 0;
});
```
With this code in tow you get to use `bimap_` just like any other
`Synchronized` object, without much effort.

#### Using `std::tuple`

If you won't stop short of using a spaceship-era approach,
`std::tuple` is there for you. The example above could be
rewritten for the same functionality like this:
```Cpp
Synchronized<tuple<map<int, string>, map<string, int>>> bimap_;
```
```Cpp
bimap_.withLock([](auto& locked) {
  get<0>(locked)[0] = "zero";
  get<1>(locked)["zero"] = 0;
});
```
The code uses `std::get` with compile-time integers to access the
fields in the tuple. The relative advantages and disadvantages of
using a local struct vs. `std::tuple` are quite obvious - in the
first case you need to invest in the definition, in the second
case you need to put up with slightly more verbose and less clear
access code.
### Summary

`Synchronized` and its supporting tools offer you a simple,
robust paradigm for mutual exclusion-based concurrency. Instead
of manually pairing data with the mutexes that protect it and
relying on convention to use them appropriately, you can benefit
from encapsulation and typechecking to offload a large part of that
task and to provide good guarantees.