4 `folly/Synchronized.h` introduces a simple abstraction for mutex-
5 based concurrency. It replaces convoluted, unwieldy, and just
6 plain wrong code with simple constructs that are easy to get
7 right and difficult to get wrong.
11 Many of our multithreaded C++ programs use shared data structures
12 associated with locks. This follows the time-honored adage of
13 mutex-based concurrency control "associate mutexes with data, not code".
14 Consider the following example:
18 class RequestHandler {
20 RequestQueue requestQueue_;
21 SharedMutex requestQueueMutex_;
23 std::map<std::string, Endpoint> requestEndpoints_;
24 SharedMutex requestEndpointsMutex_;
26 HandlerState workState_;
27 SharedMutex workStateMutex_;
32 Whenever the code needs to read or write some of the protected
33 data, it acquires the mutex for reading or for reading and
37 void RequestHandler::processRequest(const Request& request) {
39 checkRequestValidity(request);
40 SharedMutex::WriteHolder lock(requestQueueMutex_);
41 requestQueue_.push_back(request);
42 stats_->addStatValue("requestEnqueueLatency", watch.elapsed());
43 LOG(INFO) << "enqueued request ID " << request.getID();
47 However, the correctness of the technique is entirely predicated on
48 convention. Developers manipulating these data members must take care
49 to explicitly acquire the correct lock for the data they wish to access.
50 There is no ostensible error for code that:
52 * manipulates a piece of data without acquiring its lock first
53 * acquires a different lock instead of the intended one
54 * acquires a lock in read mode but modifies the guarded data structure
55 * acquires a lock in read-write mode although it only has `const` access
58 ### Introduction to `folly/Synchronized.h`
60 The same code sample could be rewritten with `Synchronized`
64 class RequestHandler {
66 Synchronized<RequestQueue> requestQueue_;
67 Synchronized<std::map<std::string, Endpoint>> requestEndpoints_;
68 Synchronized<HandlerState> workState_;
72 void RequestHandler::processRequest(const Request& request) {
74 checkRequestValidity(request);
75 requestQueue_.wlock()->push_back(request);
76 stats_->addStatValue("requestEnqueueLatency", watch.elapsed());
77 LOG(INFO) << "enqueued request ID " << request.getID();
81 The rewrite does at maximum efficiency what needs to be done:
82 acquires the lock associated with the `RequestQueue` object, writes to
83 the queue, and releases the lock immediately thereafter.
85 On the face of it, that's not much to write home about, and not
86 an obvious improvement over the previous state of affairs. But
87 the features at work invisible in the code above are as important
88 as those that are visible:
90 * Unlike before, the data and the mutex protecting it are
91 inextricably encapsulated together.
92 * If you tried to use `requestQueue_` without acquiring the lock you
93 wouldn't be able to; it is virtually impossible to access the queue
94 without acquiring the correct lock.
95 * The lock is released immediately after the insert operation is
96 performed, and is not held for operations that do not need it.
98 If you need to perform several operations while holding the lock,
99 `Synchronized` provides several options for doing this.
101 The `wlock()` method (or `lock()` if you have a non-shared mutex type)
102 returns a `LockedPtr` object that can be stored in a variable. The lock
103 will be held for as long as this object exists, similar to a
104 `std::unique_lock`. This object can be used as if it were a pointer to
105 the underlying locked object:
109 auto lockedQueue = requestQueue_.wlock();
110 lockedQueue->push_back(request1);
111 lockedQueue->push_back(request2);
115 The `rlock()` function is similar to `wlock()`, but acquires a shared lock
116 rather than an exclusive lock.
118 We recommend explicitly opening a new nested scope whenever you store a
119 `LockedPtr` object, to help visibly delineate the critical section, and
120 to ensure that the `LockedPtr` is destroyed as soon as it is no longer
123 Alternatively, `Synchronized` also provides mechanisms to run a function while
124 holding the lock. This makes it possible to use lambdas to define brief
128 void RequestHandler::processRequest(const Request& request) {
130 checkRequestValidity(request);
131 requestQueue_.withWLock([](auto& queue) {
132 // withWLock() automatically holds the lock for the
133 // duration of this lambda function
134 queue.push_back(request);
136 stats_->addStatValue("requestEnqueueLatency", watch.elapsed());
137 LOG(INFO) << "enqueued request ID " << request.getID();
141 One advantage of the `withWLock()` approach is that it forces a new
142 scope to be used for the critical section, making the critical section
143 more obvious in the code, and helping to encourage code that releases
144 the lock as soon as possible.
146 ### Template class `Synchronized<T>`
148 #### Template Parameters
150 `Synchronized` is a template with two parameters, the data type and a
151 mutex type: `Synchronized<T, Mutex>`.
153 If not specified, the mutex type defaults to `folly::SharedMutex`. However, any
154 mutex type supported by `folly::LockTraits` can be used instead.
155 `folly::LockTraits` can be specialized to support other custom mutex
156 types that it does not know about out of the box. See
157 `folly/LockTraitsBoost.h` for an example of how to support additional mutex
160 `Synchronized` provides slightly different APIs when instantiated with a
161 shared mutex type or an upgrade mutex type then with a plain exclusive mutex.
162 If instantiated with either of the two mutex types above (either through
163 having a member called lock_shared() or specializing `LockTraits` as in
164 `folly/LockTraitsBoost.h`) the `Synchronized` object has corresponding
165 `wlock`, `rlock` or `ulock` methods to acquire different lock types. When
166 using a shared or upgrade mutex type, these APIs ensure that callers make an
167 explicit choice to acquire a shared, exclusive or upgrade lock and that
168 callers do not unintentionally lock the mutex in the incorrect mode. The
169 `rlock()` APIs only provide `const` access to the underlying data type,
170 ensuring that it cannot be modified when only holding a shared lock.
174 The default constructor default-initializes the data and its
178 The copy constructor locks the source for reading and copies its
179 data into the target. (The target is not locked as an object
180 under construction is only accessed by one thread.)
182 Finally, `Synchronized<T>` defines an explicit constructor that
183 takes an object of type `T` and copies it. For example:
186 // Default constructed
187 Synchronized<map<string, int>> syncMap1;
190 Synchronized<map<string, int>> syncMap2(syncMap1);
192 // Initializing from an existing map
193 map<string, int> init;
195 Synchronized<map<string, int>> syncMap3(init);
196 EXPECT_EQ(syncMap3->size(), 1);
199 #### Assignment, swap, and copying
201 The canonical assignment operator locks both objects involved and
202 then copies the underlying data objects. The mutexes are not
203 copied. The locks are acquired in increasing address order, so
204 deadlock is avoided. For example, there is no problem if one
205 thread assigns `a = b` and the other assigns `b = a` (other than
206 that design probably deserving a Razzie award). Similarly, the
207 `swap` method takes a reference to another `Synchronized<T>`
208 object and swaps the data. Again, locks are acquired in a well-
209 defined order. The mutexes are not swapped.
211 An additional assignment operator accepts a `const T&` on the
212 right-hand side. The operator copies the datum inside a
215 In addition to assignment operators, `Synchronized<T>` has move
216 assignment operators.
218 An additional `swap` method accepts a `T&` and swaps the data
219 inside a critical section. This is by far the preferred method of
220 changing the guarded datum wholesale because it keeps the lock
221 only for a short time, thus lowering the pressure on the mutex.
223 To get a copy of the guarded data, there are two methods
224 available: `void copy(T*)` and `T copy()`. The first copies data
225 to a provided target and the second returns a copy by value. Both
226 operations are done under a read lock. Example:
229 Synchronized<vector<string>> syncVec1, syncVec2;
234 // Assign straight from vector
238 syncVec1.swap(syncVec2);
242 // Copy to given target
244 // Get a copy by value
245 auto copy = syncVec1.copy();
250 If the mutex type used with `Synchronized` is a simple exclusive mutex
251 type (as opposed to a shared mutex), `Synchronized<T>` provides a
252 `lock()` method that returns a `LockedPtr<T>` to access the data while
255 The `LockedPtr` object returned by `lock()` holds the lock for as long
256 as it exists. Whenever possible, prefer declaring a separate inner
257 scope for storing this variable, to make sure the `LockedPtr` is
258 destroyed as soon as the lock is no longer needed:
261 void fun(Synchronized<vector<string>, std::mutex>& vec) {
263 auto locked = vec.lock();
264 locked->push_back("hello");
265 locked->push_back("world");
267 LOG(INFO) << "successfully added greeting";
271 #### `wlock()` and `rlock()`
273 If the mutex type used with `Synchronized` is a shared mutex type,
274 `Synchronized<T>` provides a `wlock()` method that acquires an exclusive
275 lock, and an `rlock()` method that acquires a shared lock.
277 The `LockedPtr` returned by `rlock()` only provides const access to the
278 internal data, to ensure that it cannot be modified while only holding a
282 int computeSum(const Synchronized<vector<int>>& vec) {
284 auto locked = vec.rlock();
285 for (int n : *locked) {
291 void doubleValues(Synchronized<vector<int>>& vec) {
292 auto locked = vec.wlock();
293 for (int& n : *locked) {
299 This example brings us to a cautionary discussion. The `LockedPtr`
300 object returned by `lock()`, `wlock()`, or `rlock()` only holds the lock
301 as long as it exists. This object makes it difficult to access the data
302 without holding the lock, but not impossible. In particular you should
303 never store a raw pointer or reference to the internal data for longer
304 than the lifetime of the `LockedPtr` object.
306 For instance, if we had written the following code in the examples
307 above, this would have continued accessing the vector after the lock had
312 for (int& n : *vec.wlock()) {
317 The `vec.wlock()` return value is destroyed in this case as soon as the
318 internal range iterators are created. The range iterators point into
319 the vector's data, but lock is released immediately, before executing
322 Needless to say, this is a crime punishable by long debugging nights.
324 Range-based for loops are slightly subtle about the lifetime of objects
325 used in the initializer statement. Most other problematic use cases are
326 a bit easier to spot than this, since the lifetime of the `LockedPtr` is
327 more explicitly visible.
331 As an alternative to the `lock()` API, `Synchronized` also provides a
332 `withLock()` method that executes a function or lambda expression while
333 holding the lock. The function receives a reference to the data as its
336 This has a few benefits compared to `lock()`:
338 * The lambda expression requires its own nested scope, making critical
339 sections more visible in the code. Callers are recommended to define
340 a new scope when using `lock()` if they choose to, but this is not
341 required. `withLock()` ensures that a new scope must always be
343 * Because a new scope is required, `withLock()` also helps encourage
344 users to release the lock as soon as possible. Because the critical
345 section scope is easily visible in the code, it is harder to
346 accidentally put extraneous code inside the critical section without
348 * The separate lambda scope makes it more difficult to store raw
349 pointers or references to the protected data and continue using those
350 pointers outside the critical section.
352 For example, `withLock()` makes the range-based for loop mistake from
353 above much harder to accidentally run into:
356 vec.withLock([](auto& locked) {
357 for (int& n : locked) {
363 This code does not have the same problem as the counter-example with
364 `wlock()` above, since the lock is held for the duration of the loop.
366 When using `Synchronized` with a shared mutex type, it provides separate
367 `withWLock()` and `withRLock()` methods instead of `withLock()`.
369 #### `ulock()` and `withULockPtr()`
371 `Synchronized` also supports upgrading and downgrading mutex lock levels as
372 long as the mutex type used to instantiate the `Synchronized` type has the
373 same interface as the mutex types in the C++ standard library, or if
374 `LockTraits` is specialized for the mutex type and the specialization is
375 visible. See below for an intro to upgrade mutexes.
377 An upgrade lock can be acquired as usual either with the `ulock()` method or
378 the `withULockPtr()` method as so
382 // only const access allowed to the underlying object when an upgrade lock
384 auto ulock = vec.ulock();
385 auto newSize = ulock->size();
388 auto newSize = vec.withULockPtr([](auto ulock) {
389 // only const access allowed to the underlying object when an upgrade lock
391 return ulock->size();
395 An upgrade lock acquired via `ulock()` or `withULockPtr()` can be upgraded or
396 downgraded by calling any of the following methods on the `LockedPtr` proxy
398 * `moveFromUpgradeToWrite()`
399 * `moveFromWriteToUpgrade()`
400 * `moveFromWriteToRead()`
401 * `moveFromUpgradeToRead()`
403 Calling these leaves the `LockedPtr` object on which the method was called in
404 an invalid `null` state and returns another LockedPtr proxy holding the
405 specified lock. The upgrade or downgrade is done atomically - the
406 `Synchronized` object is never in an unlocked state during the lock state
407 transition. For example
410 auto ulock = obj.ulock();
411 if (ulock->needsUpdate()) {
412 auto wlock = ulock.moveFromUpgradeToWrite();
420 This "move" can also occur in the context of a `withULockPtr()`
421 (`withWLockPtr()` or `withRLockPtr()` work as well!) function as so
424 auto newSize = obj.withULockPtr([](auto ulock) {
425 if (ulock->needsUpdate()) {
427 // release upgrade lock get write lock atomically
428 auto wlock = ulock.moveFromUpgradeToWrite();
432 // release write lock and acquire read lock atomically
433 auto rlock = wlock.moveFromWriteToRead();
435 return rlock->newSize();
439 // release upgrade lock and acquire read lock atomically
440 auto rlock = ulock.moveFromUpgradeToRead();
442 return rlock->newSize();
447 #### Intro to upgrade mutexes:
449 An upgrade mutex is a shared mutex with an extra state called `upgrade` and an
450 atomic state transition from `upgrade` to `unique`. The `upgrade` state is more
451 powerful than the `shared` state but less powerful than the `unique` state.
453 An upgrade lock permits only const access to shared state for doing reads. It
454 does not permit mutable access to shared state for doing writes. Only a unique
455 lock permits mutable access for doing writes.
457 An upgrade lock may be held concurrently with any number of shared locks on the
458 same mutex. An upgrade lock is exclusive with other upgrade locks and unique
459 locks on the same mutex - only one upgrade lock or unique lock may be held at a
462 The upgrade mutex solves the problem of doing a read of shared state and then
463 optionally doing a write to shared state efficiently under contention. Consider
464 this scenario with a shared mutex:
468 bool isUpdateRequired() const;
472 struct MyContainingObject {
473 folly::Synchronized<MyObject> sync;
475 void mightHappenConcurrently() {
477 if (!sync.rlock()->isUpdateRequired()) {
480 sync.withWLock([&](auto& state) {
482 if (!state.isUpdateRequired()) {
491 Here, the second `isUpdateRequired` check happens under a unique lock. This
492 means that the second check cannot be done concurrently with other threads doing
493 first `isUpdateRequired` checks under the shared lock, even though the second
494 check, like the first check, is read-only and requires only const access to the
497 This may even introduce unnecessary blocking under contention. Since the default
498 mutex type, `folly::SharedMutex`, has write priority, the unique lock protecting
499 the second check may introduce unnecessary blocking to all the other threads
500 that are attempting to acquire a shared lock to protect the first check. This
501 problem is called reader starvation.
503 One solution is to use a shared mutex type with read priority, such as
504 `folly::SharedMutexReadPriority`. That can introduce less blocking under
505 contention to the other threads attemping to acquire a shared lock to do the
506 first check. However, that may backfire and cause threads which are attempting
507 to acquire a unique lock (for the second check) to stall, waiting for a moment
508 in time when there are no shared locks held on the mutex, a moment in time that
509 may never even happen. This problem is called writer starvation.
511 Starvation is a tricky problem to solve in general. But we can partially side-
514 An alternative solution is to use an upgrade lock for the second check. Threads
515 attempting to acquire an upgrade lock for the second check do not introduce
516 unnecessary blocking to all other threads that are attempting to acquire a
517 shared lock for the first check. Only after the second check passes, and the
518 upgrade lock transitions atomically from an upgrade lock to a unique lock, does
519 the unique lock introduce *necessary* blocking to the other threads attempting
520 to acquire a shared lock. With this solution, unlike the solution without the
521 upgrade lock, the second check may be done concurrently with all other first
522 checks rather than blocking or being blocked by them.
524 The example would then look like:
528 bool isUpdateRequired() const;
532 struct MyContainingObject {
533 folly::Synchronized<MyObject> sync;
535 void mightHappenConcurrently() {
537 if (!sync.rlock()->isUpdateRequired()) {
540 sync.withULockPtr([&](auto ulock) {
542 if (!ulock->isUpdateRequired()) {
545 auto wlock = ulock.moveFromUpgradeToWrite();
552 Note: Some shared mutex implementations offer an atomic state transition from
553 `shared` to `unique` and some upgrade mutex implementations offer an atomic
554 state transition from `shared` to `upgrade`. These atomic state transitions are
555 dangerous, however, and can deadlock when done concurrently on the same mutex.
556 For example, if threads A and B both hold shared locks on a mutex and are both
557 attempting to transition atomically from shared to upgrade locks, the threads
558 are deadlocked. Likewise if they are both attempting to transition atomically
559 from shared to unique locks, or one is attempting to transition atomically from
560 shared to upgrade while the other is attempting to transition atomically from
561 shared to unique. Therefore, `LockTraits` does not expose either of these
562 dangerous atomic state transitions even when the underlying mutex type supports
563 them. Likewise, `Synchronized`'s `LockedPtr` proxies do not expose these
564 dangerous atomic state transitions either.
568 When `Synchronized` is used with a mutex type that supports timed lock
569 acquisition, `lock()`, `wlock()`, and `rlock()` can all take an optional
570 `std::chrono::duration` argument. This argument specifies a timeout to
571 use for acquiring the lock. If the lock is not acquired before the
572 timeout expires, a null `LockedPtr` object will be returned. Callers
573 must explicitly check the return value before using it:
576 void fun(Synchronized<vector<string>>& vec) {
578 auto locked = vec.lock(10ms);
580 throw std::runtime_error("failed to acquire lock");
582 locked->push_back("hello");
583 locked->push_back("world");
585 LOG(INFO) << "successfully added greeting";
589 #### `unlock()` and `scopedUnlock()`
591 `Synchronized` is a good mechanism for enforcing scoped
592 synchronization, but it has the inherent limitation that it
593 requires the critical section to be, well, scoped. Sometimes the
594 code structure requires a fleeting "escape" from the iron fist of
595 synchronization, while still inside the critical section scope.
597 One common pattern is releasing the lock early on error code paths,
598 prior to logging an error message. The `LockedPtr` class provides an
599 `unlock()` method that makes this possible:
602 Synchronized<map<int, string>> dic;
605 auto locked = dic.rlock();
606 auto iter = locked->find(0);
607 if (iter == locked.end()) {
608 locked.unlock(); // don't hold the lock while logging
609 LOG(ERROR) << "key 0 not found";
614 LOG(INFO) << "succeeded";
617 For more complex nested control flow scenarios, `scopedUnlock()` returns
618 an object that will release the lock for as long as it exists, and will
619 reacquire the lock when it goes out of scope.
623 Synchronized<map<int, string>> dic;
626 auto locked = dic.wlock();
627 auto iter = locked->find(0);
628 if (iter == locked->end()) {
630 auto unlocker = locked.scopedUnlock();
631 LOG(INFO) << "Key 0 not found, inserting it."
633 locked->emplace(0, "zero");
640 Clearly `scopedUnlock()` comes with specific caveats and
641 liabilities. You must assume that during the `scopedUnlock()`
642 section, other threads might have changed the protected structure
643 in arbitrary ways. In the example above, you cannot use the
644 iterator `iter` and you cannot assume that the key `0` is not in the
645 map; another thread might have inserted it while you were
646 bragging on `LOG(INFO)`.
648 Whenever a `LockedPtr` object has been unlocked, whether with `unlock()`
649 or `scopedUnlock()`, it will behave as if it is null. `isNull()` will
650 return true. Dereferencing an unlocked `LockedPtr` is not allowed and
651 will result in undefined behavior.
653 #### `Synchronized` and `std::condition_variable`
655 When used with a `std::mutex`, `Synchronized` supports using a
656 `std::condition_variable` with its internal mutex. This allows a
657 `condition_variable` to be used to wait for a particular change to occur
658 in the internal data.
660 The `LockedPtr` returned by `Synchronized<T, std::mutex>::lock()` has a
661 `getUniqueLock()` method that returns a reference to a
662 `std::unique_lock<std::mutex>`, which can be given to the
663 `std::condition_variable`:
666 Synchronized<vector<string>, std::mutex> vec;
667 std::condition_variable emptySignal;
669 // Assuming some other thread will put data on vec and signal
670 // emptySignal, we can then wait on it as follows:
671 auto locked = vec.lock();
672 emptySignal.wait(locked.getUniqueLock(),
673 [&] { return !locked->empty(); });
676 ### `acquireLocked()`
678 Sometimes locking just one object won't be able to cut the mustard. Consider a
679 function that needs to lock two `Synchronized` objects at the
680 same time - for example, to copy some data from one to the other.
681 At first sight, it looks like sequential `wlock()` calls will work just
685 void fun(Synchronized<vector<int>>& a, Synchronized<vector<int>>& b) {
686 auto lockedA = a.wlock();
687 auto lockedB = b.wlock();
688 ... use lockedA and lockedB ...
692 This code compiles and may even run most of the time, but embeds
693 a deadly peril: if one threads call `fun(x, y)` and another
694 thread calls `fun(y, x)`, then the two threads are liable to
695 deadlocking as each thread will be waiting for a lock the other
696 is holding. This issue is a classic that applies regardless of
697 the fact the objects involved have the same type.
699 This classic problem has a classic solution: all threads must
700 acquire locks in the same order. The actual order is not
701 important, just the fact that the order is the same in all
702 threads. Many libraries simply acquire mutexes in increasing
703 order of their address, which is what we'll do, too. The
704 `acquireLocked()` function takes care of all details of proper
705 locking of two objects and offering their innards. It returns a
706 `std::tuple` of `LockedPtr`s:
709 void fun(Synchronized<vector<int>>& a, Synchronized<vector<int>>& b) {
710 auto ret = folly::acquireLocked(a, b);
711 auto& lockedA = std::get<0>(ret);
712 auto& lockedB = std::get<1>(ret);
713 ... use lockedA and lockedB ...
717 Note that C++ 17 introduces
718 (structured binding syntax)[(http://wg21.link/P0144r2)]
719 which will make the returned tuple more convenient to use:
722 void fun(Synchronized<vector<int>>& a, Synchronized<vector<int>>& b) {
723 auto [lockedA, lockedB] = folly::acquireLocked(a, b);
724 ... use lockedA and lockedB ...
728 An `acquireLockedPair()` function is also available, which returns a
729 `std::pair` instead of a `std::tuple`. This is more convenient to use
730 in many situations, until compiler support for structured bindings is
731 more widely available.
733 ### Synchronizing several data items with one mutex
735 The library is geared at protecting one object of a given type
736 with a mutex. However, sometimes we'd like to protect two or more
737 members with the same mutex. Consider for example a bidirectional
738 map, i.e. a map that holds an `int` to `string` mapping and also
739 the converse `string` to `int` mapping. The two maps would need
740 to be manipulated simultaneously. There are at least two designs
743 #### Using a nested `struct`
745 You can easily pack the needed data items in a little struct.
751 map<int, string> direct;
752 map<string, int> inverse;
754 Synchronized<BiMap> bimap_;
758 bimap_.withLock([](auto& locked) {
759 locked.direct[0] = "zero";
760 locked.inverse["zero"] = 0;
764 With this code in tow you get to use `bimap_` just like any other
765 `Synchronized` object, without much effort.
767 #### Using `std::tuple`
769 If you won't stop short of using a spaceship-era approach,
770 `std::tuple` is there for you. The example above could be
771 rewritten for the same functionality like this:
775 Synchronized<tuple<map<int, string>, map<string, int>>> bimap_;
779 bimap_.withLock([](auto& locked) {
780 get<0>(locked)[0] = "zero";
781 get<1>(locked)["zero"] = 0;
785 The code uses `std::get` with compile-time integers to access the
786 fields in the tuple. The relative advantages and disadvantages of
787 using a local struct vs. `std::tuple` are quite obvious - in the
788 first case you need to invest in the definition, in the second
789 case you need to put up with slightly more verbose and less clear
794 `Synchronized` and its supporting tools offer you a simple,
795 robust paradigm for mutual exclusion-based concurrency. Instead
796 of manually pairing data with the mutexes that protect it and
797 relying on convention to use them appropriately, you can benefit
798 of encapsulation and typechecking to offload a large part of that
799 task and to provide good guarantees.