/*
 * Copyright 2016 Facebook, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * N.B. You most likely do _not_ want to use RWSpinLock or any other
 * kind of spinlock. Use SharedMutex instead.
 *
 * In short, spinlocks in preemptive multi-tasking operating systems
 * have serious problems and fast mutexes like SharedMutex are almost
 * certainly the better choice, because letting the OS scheduler put a
 * thread to sleep is better for system responsiveness and throughput
 * than wasting a timeslice repeatedly querying a lock held by a
 * thread that's blocked, and you can't prevent userspace
 * programs from being preempted. Spinlocks in an operating system
 * kernel make much more sense than they do in userspace.
 *
 * -------------------------------------------------------------------
 *
 * Two Read-Write spin lock implementations.
 *
 * Ref: http://locklessinc.com/articles/locks
 *
 * Both locks here are faster than pthread_rwlock and have very low
 * overhead (usually 20-30ns). They don't use any system mutexes and
 * are very compact (4/8 bytes), so are suitable for per-instance
 * based locking, particularly when contention is not expected.
 *
 * For a spinlock, RWSpinLock is a reasonable choice. (See the note
 * above for why a spin lock is frequently a bad idea generally.)
 * RWSpinLock has minimal overhead, and comparable contention
 * performance when the number of competing threads is less than or
 * equal to the number of logical CPUs. Even as the number of
 * threads gets larger, RWSpinLock can still be very competitive in
 * READ, although it is slower on WRITE, and also inherently unfair
 * to writers.
 *
 * RWTicketSpinLock shows more balanced READ/WRITE performance. If
 * your application really needs a lot more threads, and a
 * higher-priority writer, prefer one of the RWTicketSpinLock locks.
 *
 * Caveats:
 *
 *   RWTicketSpinLock locks can only be used with GCC on x86/x86-64
 *   based systems.
 *
 *   RWTicketSpinLock<32> only allows up to 2^8 - 1 concurrent
 *   readers and writers.
 *
 *   RWTicketSpinLock<64> only allows up to 2^16 - 1 concurrent
 *   readers and writers.
 *
 *   RWTicketSpinLock<..., true> (kFavorWriter = true, that is, strict
 *   writer priority) is NOT reentrant, even for lock_shared().
 *
 *   The lock will not grant any new shared (read) accesses while a thread
 *   attempting to acquire the lock in write mode is blocked. (That is,
 *   if the lock is held in shared mode by N threads, and a thread attempts
 *   to acquire it in write mode, no one else can acquire it in shared mode
 *   until these N threads release the lock and then the blocked thread
 *   acquires and releases the exclusive lock.) This also applies for
 *   attempts to reacquire the lock in shared mode by threads that already
 *   hold it in shared mode, making the lock non-reentrant.
 *
 *   RWSpinLock handles 2^30 - 1 concurrent readers.
 *
 * @author Xin Liu <xliux@fb.com>
 */

#pragma once
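
// The non-reentrancy caveat above is easy to trip over. A minimal sketch of
// the hazard (the functions and call pattern are hypothetical, not part of
// this header), using a writer-priority ticket lock:
//
//   folly::RWTicketSpinLockT<32, true> lock;  // kFavorWriter = true
//
//   void inner() {
//     lock.lock_shared();  // a pending writer blocks new readers: deadlock
//     // ...
//     lock.unlock_shared();
//   }
//
//   void outer() {         // thread A
//     lock.lock_shared();
//     // ... thread B calls lock.lock() here and blocks ...
//     inner();             // A now waits on B, while B waits on A
//     lock.unlock_shared();
//   }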

/*
========================================================================
Benchmark on (Intel(R) Xeon(R) CPU L5630 @ 2.13GHz) 8 cores(16 HTs)
========================================================================

------------------------------------------------------------------------------
1. Single thread benchmark (read/write lock + unlock overhead)
Benchmark                                    Iters   Total t    t/iter iter/sec
-------------------------------------------------------------------------------
*      BM_RWSpinLockRead                     100000  1.786 ms  17.86 ns   53.4M
+30.5% BM_RWSpinLockWrite                    100000  2.331 ms  23.31 ns  40.91M
+85.7% BM_RWTicketSpinLock32Read             100000  3.317 ms  33.17 ns  28.75M
+96.0% BM_RWTicketSpinLock32Write            100000    3.5 ms     35 ns  27.25M
+85.6% BM_RWTicketSpinLock64Read             100000  3.315 ms  33.15 ns  28.77M
+96.0% BM_RWTicketSpinLock64Write            100000    3.5 ms     35 ns  27.25M
+85.7% BM_RWTicketSpinLock32FavorWriterRead  100000  3.317 ms  33.17 ns  28.75M
+29.7% BM_RWTicketSpinLock32FavorWriterWrite 100000  2.316 ms  23.16 ns  41.18M
+85.3% BM_RWTicketSpinLock64FavorWriterRead  100000  3.309 ms  33.09 ns  28.82M
+30.2% BM_RWTicketSpinLock64FavorWriterWrite 100000  2.325 ms  23.25 ns  41.02M
+ 175% BM_PThreadRWMutexRead                 100000  4.917 ms  49.17 ns   19.4M
+ 166% BM_PThreadRWMutexWrite                100000  4.757 ms  47.57 ns  20.05M

------------------------------------------------------------------------------
2. Contention Benchmark      90% read  10% write
Benchmark                hits     average   min     max      sigma
------------------------------------------------------------------------------
---------- 8  threads ------------
RWSpinLock       Write   142666   220ns     78ns    40.8us   269ns
RWSpinLock       Read    1282297  222ns     80ns    37.7us   248ns
RWTicketSpinLock Write   85692    209ns     71ns    17.9us   252ns
RWTicketSpinLock Read    769571   215ns     78ns    33.4us   251ns
pthread_rwlock_t Write   84248    2.48us    99ns    269us    8.19us
pthread_rwlock_t Read    761646   933ns     101ns   374us    3.25us

---------- 16 threads ------------
RWSpinLock       Write   124236   237ns     78ns    261us    801ns
RWSpinLock       Read    1115807  236ns     78ns    2.27ms   2.17us
RWTicketSpinLock Write   81781    231ns     71ns    31.4us   351ns
RWTicketSpinLock Read    734518   238ns     78ns    73.6us   379ns
pthread_rwlock_t Write   83363    7.12us    99ns    785us    28.1us
pthread_rwlock_t Read    754978   2.18us    101ns   1.02ms   14.3us

---------- 50 threads ------------
RWSpinLock       Write   131142   1.37us    82ns    7.53ms   68.2us
RWSpinLock       Read    1181240  262ns     78ns    6.62ms   12.7us
RWTicketSpinLock Write   83045    397ns     73ns    7.01ms   31.5us
RWTicketSpinLock Read    744133   386ns     78ns    11ms     31.4us
pthread_rwlock_t Write   80849    112us     103ns   4.52ms   263us
pthread_rwlock_t Read    728698   24us      101ns   7.28ms   194us
*/

#include <folly/Portability.h>
#include <folly/portability/Asm.h>

#if defined(__GNUC__) && \
    (defined(__i386) || FOLLY_X64 || \
     defined(ARCH_K8))
# define RW_SPINLOCK_USE_X86_INTRINSIC_
# include <x86intrin.h>
#elif defined(_MSC_VER) && defined(FOLLY_X64)
# define RW_SPINLOCK_USE_X86_INTRINSIC_
# include <intrin.h>
#else
# undef RW_SPINLOCK_USE_X86_INTRINSIC_
#endif

// iOS doesn't define _mm_cvtsi64_si128 and friends
#if (FOLLY_SSE >= 2) && !FOLLY_MOBILE
#define RW_SPINLOCK_USE_SSE_INSTRUCTIONS_
#else
#undef RW_SPINLOCK_USE_SSE_INSTRUCTIONS_
#endif

#include <algorithm>
#include <atomic>

#include <sched.h>
#include <glog/logging.h>

#include <folly/Likely.h>

namespace folly {

/*
 * A simple, small (4-bytes), but unfair rwlock. Use it when you want
 * a nice writer and don't expect a lot of write/read contention, or
 * when you need small rwlocks since you are creating a large number
 * of them.
 *
 * Note that the unfairness here is extreme: if the lock is
 * continually accessed for read, writers will never get a chance. If
 * the lock can be that highly contended this class is probably not an
 * ideal choice anyway.
 *
 * It currently implements most of the Lockable, SharedLockable and
 * UpgradeLockable concepts except the TimedLockable related locking/unlocking
 * interfaces.
 */
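// A minimal usage sketch (illustrative; names of the guarded data are
// hypothetical): the RAII holder types nested in the class are the
// preferred way to take and release the lock.
//
//   folly::RWSpinLock counterLock;
//   int64_t counter = 0;  // hypothetical shared state
//
//   int64_t readCounter() {
//     folly::RWSpinLock::ReadHolder guard(&counterLock);   // shared access
//     return counter;
//   }
//
//   void bumpCounter() {
//     folly::RWSpinLock::WriteHolder guard(&counterLock);  // exclusive access
//     ++counter;
//   }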
class RWSpinLock {
  enum : int32_t { READER = 4, UPGRADED = 2, WRITER = 1 };

 public:
  constexpr RWSpinLock() : bits_(0) {}

  RWSpinLock(RWSpinLock const&) = delete;
  RWSpinLock& operator=(RWSpinLock const&) = delete;

  // Lockable Concept
  void lock() {
    int count = 0;
    while (!LIKELY(try_lock())) {
      if (++count > 1000) sched_yield();
    }
  }

  // Writer is responsible for clearing up both the UPGRADED and WRITER bits.
  void unlock() {
    static_assert(READER > WRITER + UPGRADED, "wrong bits!");
    bits_.fetch_and(~(WRITER | UPGRADED), std::memory_order_release);
  }

  // SharedLockable Concept
  void lock_shared() {
    int count = 0;
    while (!LIKELY(try_lock_shared())) {
      if (++count > 1000) sched_yield();
    }
  }

  void unlock_shared() {
    bits_.fetch_add(-READER, std::memory_order_release);
  }

  // Downgrade the lock from writer status to reader status.
  void unlock_and_lock_shared() {
    bits_.fetch_add(READER, std::memory_order_acquire);
    unlock();
  }

  // UpgradeLockable Concept
  void lock_upgrade() {
    int count = 0;
    while (!try_lock_upgrade()) {
      if (++count > 1000) sched_yield();
    }
  }

  void unlock_upgrade() {
    bits_.fetch_add(-UPGRADED, std::memory_order_acq_rel);
  }

  // unlock upgrade and try to acquire write lock
  void unlock_upgrade_and_lock() {
    int64_t count = 0;
    while (!try_unlock_upgrade_and_lock()) {
      if (++count > 1000) sched_yield();
    }
  }

  // unlock upgrade and read lock atomically
  void unlock_upgrade_and_lock_shared() {
    bits_.fetch_add(READER - UPGRADED, std::memory_order_acq_rel);
  }

  // write unlock and upgrade lock atomically
  void unlock_and_lock_upgrade() {
    // need to do it in two steps here -- as the UPGRADED bit might be OR-ed
    // at the same time when other threads are trying to do try_lock_upgrade().
    bits_.fetch_or(UPGRADED, std::memory_order_acquire);
    bits_.fetch_add(-WRITER, std::memory_order_release);
  }
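
  // A sketch of the intended upgrade pattern (the caller code below is
  // hypothetical): hold the upgrade lock while deciding, and promote to a
  // full write lock only when a modification is actually needed.
  //
  //   lock.lock_upgrade();               // blocks new readers, not existing
  //   if (needsRebuild()) {              // hypothetical predicate
  //     lock.unlock_upgrade_and_lock();  // promote once readers drain
  //     rebuild();                       // hypothetical mutation
  //     lock.unlock();                   // release exclusive access
  //   } else {
  //     lock.unlock_upgrade();
  //   }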

  // Attempt to acquire writer permission. Return false if we didn't get it.
  bool try_lock() {
    int32_t expect = 0;
    return bits_.compare_exchange_strong(expect, WRITER,
      std::memory_order_acq_rel);
  }

  // Try to get reader permission on the lock. This can fail if we
  // find out someone is a writer or upgrader.
  // Setting the UPGRADED bit would allow a writer-to-be to indicate
  // its intention to write and block any new readers while waiting
  // for existing readers to finish and release their read locks. This
  // helps avoid starving writers (promoted from upgraders).
  bool try_lock_shared() {
    // fetch_add is considerably (100%) faster than compare_exchange,
    // so here we are optimizing for the common (lock success) case.
    int32_t value = bits_.fetch_add(READER, std::memory_order_acquire);
    if (UNLIKELY(value & (WRITER|UPGRADED))) {
      bits_.fetch_add(-READER, std::memory_order_release);
      return false;
    }
    return true;
  }

  // try to unlock upgrade and write lock atomically
  bool try_unlock_upgrade_and_lock() {
    int32_t expect = UPGRADED;
    return bits_.compare_exchange_strong(expect, WRITER,
      std::memory_order_acq_rel);
  }

  // try to acquire an upgradable lock.
  bool try_lock_upgrade() {
    int32_t value = bits_.fetch_or(UPGRADED, std::memory_order_acquire);

    // Note: when failed, we cannot flip the UPGRADED bit back,
    // as in this case there is either another upgrade lock or a write lock.
    // If it's a write lock, the bit will get cleared up when that lock's done
    // with unlock().
    return ((value & (UPGRADED | WRITER)) == 0);
  }

  // mainly for debugging purposes.
  int32_t bits() const { return bits_.load(std::memory_order_acquire); }

  class ReadHolder;
  class UpgradedHolder;
  class WriteHolder;

  class ReadHolder {
   public:
    explicit ReadHolder(RWSpinLock* lock = nullptr) : lock_(lock) {
      if (lock_) lock_->lock_shared();
    }

    explicit ReadHolder(RWSpinLock& lock) : lock_(&lock) {
      lock_->lock_shared();
    }

    ReadHolder(ReadHolder&& other) noexcept : lock_(other.lock_) {
      other.lock_ = nullptr;
    }

    // down-grade the upgrade lock to a shared lock
    explicit ReadHolder(UpgradedHolder&& upgraded) : lock_(upgraded.lock_) {
      upgraded.lock_ = nullptr;
      if (lock_) lock_->unlock_upgrade_and_lock_shared();
    }

    explicit ReadHolder(WriteHolder&& writer) : lock_(writer.lock_) {
      writer.lock_ = nullptr;
      if (lock_) lock_->unlock_and_lock_shared();
    }

    ReadHolder& operator=(ReadHolder&& other) {
      using std::swap;
      swap(lock_, other.lock_);
      return *this;
    }

    ReadHolder(const ReadHolder& other) = delete;
    ReadHolder& operator=(const ReadHolder& other) = delete;

    ~ReadHolder() { if (lock_) lock_->unlock_shared(); }

    void reset(RWSpinLock* lock = nullptr) {
      if (lock == lock_) return;
      if (lock_) lock_->unlock_shared();
      lock_ = lock;
      if (lock_) lock_->lock_shared();
    }

    void swap(ReadHolder* other) {
      std::swap(lock_, other->lock_);
    }

   private:
    friend class UpgradedHolder;
    friend class WriteHolder;
    RWSpinLock* lock_;
  };

  class UpgradedHolder {
   public:
    explicit UpgradedHolder(RWSpinLock* lock = nullptr) : lock_(lock) {
      if (lock_) lock_->lock_upgrade();
    }

    explicit UpgradedHolder(RWSpinLock& lock) : lock_(&lock) {
      lock_->lock_upgrade();
    }

    // down-grade the write lock to an upgrade lock
    explicit UpgradedHolder(WriteHolder&& writer) {
      lock_ = writer.lock_;
      writer.lock_ = nullptr;
      if (lock_) lock_->unlock_and_lock_upgrade();
    }

    UpgradedHolder(UpgradedHolder&& other) noexcept : lock_(other.lock_) {
      other.lock_ = nullptr;
    }

    UpgradedHolder& operator=(UpgradedHolder&& other) {
      using std::swap;
      swap(lock_, other.lock_);
      return *this;
    }

    UpgradedHolder(const UpgradedHolder& other) = delete;
    UpgradedHolder& operator=(const UpgradedHolder& other) = delete;

    ~UpgradedHolder() { if (lock_) lock_->unlock_upgrade(); }

    void reset(RWSpinLock* lock = nullptr) {
      if (lock == lock_) return;
      if (lock_) lock_->unlock_upgrade();
      lock_ = lock;
      if (lock_) lock_->lock_upgrade();
    }

    void swap(UpgradedHolder* other) {
      using std::swap;
      swap(lock_, other->lock_);
    }

   private:
    friend class WriteHolder;
    friend class ReadHolder;
    RWSpinLock* lock_;
  };

  class WriteHolder {
   public:
    explicit WriteHolder(RWSpinLock* lock = nullptr) : lock_(lock) {
      if (lock_) lock_->lock();
    }

    explicit WriteHolder(RWSpinLock& lock) : lock_(&lock) {
      lock_->lock();
    }

    // promoted from an upgrade lock holder
    explicit WriteHolder(UpgradedHolder&& upgraded) {
      lock_ = upgraded.lock_;
      upgraded.lock_ = nullptr;
      if (lock_) lock_->unlock_upgrade_and_lock();
    }

    WriteHolder(WriteHolder&& other) noexcept : lock_(other.lock_) {
      other.lock_ = nullptr;
    }

    WriteHolder& operator=(WriteHolder&& other) {
      using std::swap;
      swap(lock_, other.lock_);
      return *this;
    }

    WriteHolder(const WriteHolder& other) = delete;
    WriteHolder& operator=(const WriteHolder& other) = delete;

    ~WriteHolder() { if (lock_) lock_->unlock(); }

    void reset(RWSpinLock* lock = nullptr) {
      if (lock == lock_) return;
      if (lock_) lock_->unlock();
      lock_ = lock;
      if (lock_) lock_->lock();
    }

    void swap(WriteHolder* other) {
      using std::swap;
      swap(lock_, other->lock_);
    }

   private:
    friend class ReadHolder;
    friend class UpgradedHolder;
    RWSpinLock* lock_;
  };

  // Synchronized<> adaptors
  friend void acquireRead(RWSpinLock& l) { return l.lock_shared(); }
  friend void acquireReadWrite(RWSpinLock& l) { return l.lock(); }
  friend void releaseRead(RWSpinLock& l) { return l.unlock_shared(); }
  friend void releaseReadWrite(RWSpinLock& l) { return l.unlock(); }
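  // These adaptors are what let folly::Synchronized use RWSpinLock as its
  // mutex type, e.g. (illustrative sketch, not part of this header):
  //   folly::Synchronized<std::map<int, int>, RWSpinLock> data_;
  // Readers then go through acquireRead() and writers through
  // acquireReadWrite() without touching the lock directly.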

 private:
  std::atomic<int32_t> bits_;
};

#ifdef RW_SPINLOCK_USE_X86_INTRINSIC_
// A more balanced Read-Write spin lock implemented based on GCC intrinsics.

namespace detail {
template <size_t kBitWidth> struct RWTicketIntTrait {
  static_assert(kBitWidth == 32 || kBitWidth == 64,
      "bit width has to be either 32 or 64");
};

template <>
struct RWTicketIntTrait<64> {
  typedef uint64_t FullInt;
  typedef uint32_t HalfInt;
  typedef uint16_t QuarterInt;

#ifdef RW_SPINLOCK_USE_SSE_INSTRUCTIONS_
  static __m128i make128(const uint16_t v[4]) {
    return _mm_set_epi16(0, 0, 0, 0, v[3], v[2], v[1], v[0]);
  }
  static inline __m128i fromInteger(uint64_t from) {
    return _mm_cvtsi64_si128(from);
  }
  static inline uint64_t toInteger(__m128i in) {
    return _mm_cvtsi128_si64(in);
  }
  static inline uint64_t addParallel(__m128i in, __m128i kDelta) {
    return toInteger(_mm_add_epi16(in, kDelta));
  }
#endif
};

template <>
struct RWTicketIntTrait<32> {
  typedef uint32_t FullInt;
  typedef uint16_t HalfInt;
  typedef uint8_t QuarterInt;

#ifdef RW_SPINLOCK_USE_SSE_INSTRUCTIONS_
  static __m128i make128(const uint8_t v[4]) {
    return _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, v[3], v[2], v[1], v[0]);
  }
  static inline __m128i fromInteger(uint32_t from) {
    return _mm_cvtsi32_si128(from);
  }
  static inline uint32_t toInteger(__m128i in) {
    return _mm_cvtsi128_si32(in);
  }
  static inline uint32_t addParallel(__m128i in, __m128i kDelta) {
    return toInteger(_mm_add_epi8(in, kDelta));
  }
#endif
};
}  // namespace detail

template <size_t kBitWidth, bool kFavorWriter = false>
class RWTicketSpinLockT {
  typedef detail::RWTicketIntTrait<kBitWidth> IntTraitType;
  typedef typename detail::RWTicketIntTrait<kBitWidth>::FullInt FullInt;
  typedef typename detail::RWTicketIntTrait<kBitWidth>::HalfInt HalfInt;
  typedef typename detail::RWTicketIntTrait<kBitWidth>::QuarterInt
    QuarterInt;

  union RWTicket {
    constexpr RWTicket() : whole(0) {}
    FullInt whole;
    HalfInt readWrite;
    __extension__ struct {
      QuarterInt write;
      QuarterInt read;
      QuarterInt users;
    };
  } ticket;

 private: // Some x64-specific utilities for atomic access to ticket.
  template <class T> static T load_acquire(T* addr) {
    T t = *addr; // acquire barrier
    asm_volatile_memory();
    return t;
  }

  template <class T>
  static void store_release(T* addr, T v) {
    asm_volatile_memory();
    *addr = v; // release barrier
  }

 public:

  constexpr RWTicketSpinLockT() {}

  RWTicketSpinLockT(RWTicketSpinLockT const&) = delete;
  RWTicketSpinLockT& operator=(RWTicketSpinLockT const&) = delete;

  void lock() {
    if (kFavorWriter) {
      writeLockAggressive();
    } else {
      writeLockNice();
    }
  }

  /*
   * Both try_lock and try_lock_shared diverge in our implementation from the
   * lock algorithm described in the link above.
   *
   * In the read case, it is undesirable that the readers could wait
   * for another reader (before increasing ticket.read in the other
   * implementation). Our approach gives up on
   * first-come-first-serve, but our benchmarks showed improved
   * performance for both readers and writers under heavily contended
   * cases, particularly when the number of threads exceeds the number
   * of cores.
   *
   * We have writeLockAggressive() using the original implementation
   * for a writer, which gives some advantage to the writer over the
   * readers---for that path it is guaranteed that the writer will
   * acquire the lock after all the existing readers exit.
   */
  bool try_lock() {
    RWTicket t;
    FullInt old = t.whole = load_acquire(&ticket.whole);
    if (t.users != t.write) return false;
    ++t.users;
    return __sync_bool_compare_and_swap(&ticket.whole, old, t.whole);
  }

  /*
   * Call this if you want to prioritize writer to avoid starvation.
   * Unlike writeLockNice, immediately acquires the write lock when
   * the existing readers (arriving before the writer) finish their
   * turns.
   */
  void writeLockAggressive() {
    // sched_yield() is needed here to avoid a pathology if the number
    // of threads attempting concurrent writes is >= the number of real
    // cores allocated to this process. This is less likely than the
    // corresponding situation in lock_shared(), but we still want to
    // avoid it.
    int count = 0;
    QuarterInt val = __sync_fetch_and_add(&ticket.users, 1);
    while (val != load_acquire(&ticket.write)) {
      asm_volatile_pause();
      if (UNLIKELY(++count > 1000)) sched_yield();
    }
  }

  // Call this when the writer should be nicer to the readers.
  void writeLockNice() {
    // Here it doesn't cpu-relax the writer.
    //
    // This is because usually we have many more readers than the
    // writers, so the writer has less chance to get the lock when
    // there are a lot of competing readers. The aggressive spinning
    // can help to avoid starving writers.
    //
    // We don't worry about sched_yield() here because the caller
    // has already explicitly abandoned fairness.
    while (!try_lock()) {}
  }

  // Atomically unlock the write-lock from writer and acquire the read-lock.
  void unlock_and_lock_shared() {
    QuarterInt val = __sync_fetch_and_add(&ticket.read, 1);
  }

  // Release writer permission on the lock.
  void unlock() {
    RWTicket t;
    t.whole = load_acquire(&ticket.whole);
    FullInt old = t.whole;

#ifdef RW_SPINLOCK_USE_SSE_INSTRUCTIONS_
    // SSE2 can reduce the lock and unlock overhead by 10%
    static const QuarterInt kDeltaBuf[4] = { 1, 1, 0, 0 };   // write/read/user
    static const __m128i kDelta = IntTraitType::make128(kDeltaBuf);
    __m128i m = IntTraitType::fromInteger(old);
    t.whole = IntTraitType::addParallel(m, kDelta);
#else
    ++t.read;
    ++t.write;
#endif
    store_release(&ticket.readWrite, t.readWrite);
  }

  void lock_shared() {
    // sched_yield() is important here because we can't grab the
    // shared lock if there is a pending writeLockAggressive, so we
    // need to let threads that already have a shared lock complete.
    int count = 0;
    while (!LIKELY(try_lock_shared())) {
      asm_volatile_pause();
      if (UNLIKELY((++count & 1023) == 0)) sched_yield();
    }
  }

  bool try_lock_shared() {
    RWTicket t, old;
    old.whole = t.whole = load_acquire(&ticket.whole);
    old.users = old.read;
#ifdef RW_SPINLOCK_USE_SSE_INSTRUCTIONS_
    // SSE2 may reduce the total lock and unlock overhead by 10%
    static const QuarterInt kDeltaBuf[4] = { 0, 1, 1, 0 };   // write/read/user
    static const __m128i kDelta = IntTraitType::make128(kDeltaBuf);
    __m128i m = IntTraitType::fromInteger(old.whole);
    t.whole = IntTraitType::addParallel(m, kDelta);
#else
    ++t.read;
    ++t.users;
#endif
    return __sync_bool_compare_and_swap(&ticket.whole, old.whole, t.whole);
  }

  void unlock_shared() {
    QuarterInt val = __sync_fetch_and_add(&ticket.write, 1);
  }

  class WriteHolder;

  typedef RWTicketSpinLockT<kBitWidth, kFavorWriter> RWSpinLock;

  class ReadHolder {
   public:
    ReadHolder(ReadHolder const&) = delete;
    ReadHolder& operator=(ReadHolder const&) = delete;

    explicit ReadHolder(RWSpinLock *lock = nullptr) :
      lock_(lock) {
      if (lock_) lock_->lock_shared();
    }

    explicit ReadHolder(RWSpinLock &lock) : lock_(&lock) {
      if (lock_) lock_->lock_shared();
    }

    // atomically unlock the write-lock from writer and acquire the read-lock
    explicit ReadHolder(WriteHolder *writer) : lock_(nullptr) {
      std::swap(this->lock_, writer->lock_);
      if (lock_) {
        lock_->unlock_and_lock_shared();
      }
    }

    ~ReadHolder() {
      if (lock_) lock_->unlock_shared();
    }

    void reset(RWSpinLock *lock = nullptr) {
      if (lock_) lock_->unlock_shared();
      lock_ = lock;
      if (lock_) lock_->lock_shared();
    }

    void swap(ReadHolder *other) {
      std::swap(this->lock_, other->lock_);
    }

   private:
    RWSpinLock *lock_;
  };

  class WriteHolder {
   public:
    WriteHolder(WriteHolder const&) = delete;
    WriteHolder& operator=(WriteHolder const&) = delete;

    explicit WriteHolder(RWSpinLock *lock = nullptr) : lock_(lock) {
      if (lock_) lock_->lock();
    }
    explicit WriteHolder(RWSpinLock &lock) : lock_(&lock) {
      if (lock_) lock_->lock();
    }

    ~WriteHolder() {
      if (lock_) lock_->unlock();
    }

    void reset(RWSpinLock *lock = nullptr) {
      if (lock == lock_) return;
      if (lock_) lock_->unlock();
      lock_ = lock;
      if (lock_) lock_->lock();
    }

    void swap(WriteHolder *other) {
      std::swap(this->lock_, other->lock_);
    }

   private:
    friend class ReadHolder;
    RWSpinLock *lock_;
  };

  // Synchronized<> adaptors.
  friend void acquireRead(RWTicketSpinLockT& mutex) {
    mutex.lock_shared();
  }
  friend void acquireReadWrite(RWTicketSpinLockT& mutex) {
    mutex.lock();
  }
  friend void releaseRead(RWTicketSpinLockT& mutex) {
    mutex.unlock_shared();
  }
  friend void releaseReadWrite(RWTicketSpinLockT& mutex) {
    mutex.unlock();
  }
};

typedef RWTicketSpinLockT<32> RWTicketSpinLock32;
typedef RWTicketSpinLockT<64> RWTicketSpinLock64;
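
// A short usage sketch (illustrative only; the guarded container is
// hypothetical): the ticket locks expose the same lock()/lock_shared()
// interface and nested holder types as RWSpinLock, so swapping one in is
// mostly a type change.
//
//   folly::RWTicketSpinLock32 lock;
//   std::vector<int> values;  // hypothetical shared data
//
//   void append(int v) {
//     folly::RWTicketSpinLock32::WriteHolder guard(&lock);
//     values.push_back(v);
//   }
//
//   size_t count() {
//     folly::RWTicketSpinLock32::ReadHolder guard(&lock);
//     return values.size();
//   }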

#endif  // RW_SPINLOCK_USE_X86_INTRINSIC_

}  // namespace folly

#ifdef RW_SPINLOCK_USE_X86_INTRINSIC_
#undef RW_SPINLOCK_USE_X86_INTRINSIC_
#endif