3 #ifndef CDSLIB_CONTAINER_IMPL_FELDMAN_HASHSET_H
4 #define CDSLIB_CONTAINER_IMPL_FELDMAN_HASHSET_H
6 #include <cds/intrusive/impl/feldman_hashset.h>
7 #include <cds/container/details/feldman_hashset_base.h>
9 namespace cds { namespace container {
11 /// Hash set based on multi-level array
12 /** @ingroup cds_nonintrusive_set
13 @anchor cds_container_FeldmanHashSet_hp
16 - [2013] Steven Feldman, Pierre LaBorde, Damian Dechev "Concurrent Multi-level Arrays:
17 Wait-free Extensible Hash Maps"
19 [From the paper] The hardest problem encountered while developing a parallel hash map is how to perform
20 a global resize, the process of redistributing the elements in a hash map that occurs when adding new
21 buckets. The negative impact of blocking synchronization is multiplied during a global resize, because all
22 threads will be forced to wait on the thread that is performing the involved process of resizing the hash map
23 and redistributing the elements. \p %FeldmanHashSet implementation avoids global resizes through new array
24 allocation. By allowing concurrent expansion this structure is free from the overhead of an explicit resize,
25 which facilitates concurrent operations.
27 The presented design includes dynamic hashing, the use of sub-arrays within the hash map data structure;
28 which, in combination with <b>perfect hashing</b>, means that each element has a unique final, as well as current, position.
29 It is important to note that the perfect hash function required by our hash map is trivial to realize as
30 any hash function that permutes the bits of the key is suitable. This is possible because of our approach
31 to the hash function; we require that it produces hash values that are equal in size to that of the key.
32 We know that if we expand the hash map a fixed number of times there can be no collision as duplicate keys
33 are not provided for in the standard semantics of a hash map.
35 \p %FeldmanHashSet is a multi-level array which has an internal structure similar to a tree:
36 @image html feldman_hashset.png
37 The multi-level array differs from a tree in that each position on the tree could hold an array of nodes or a single node.
38 A position that holds a single node is a \p dataNode which holds the hash value of a key and the value that is associated
39 with that key; it is a simple struct holding two variables. A \p dataNode in the multi-level array could be marked.
40 A \p markedDataNode refers to a pointer to a \p dataNode that has been bitmarked at the least significant bit (LSB)
41 of the pointer to the node. This signifies that this \p dataNode is contended. An expansion must occur at this node;
42 any thread that sees this \p markedDataNode will try to replace it with an \p arrayNode; which is a position that holds
43 an array of nodes. The pointer to an \p arrayNode is differentiated from that of a pointer to a \p dataNode by a bitmark
44 on the second-least significant bit.
46 \p %FeldmanHashSet multi-level array is similar to a tree in that we keep a pointer to the root, which is a memory array
47 called \p head. The length of the \p head memory array is unique, whereas every other \p arrayNode has a uniform length;
48 a normal \p arrayNode has a fixed power-of-two length equal to the binary logarithm of a variable called \p arrayLength.
49 The maximum depth of the tree, \p maxDepth, is the maximum number of pointers that must be followed to reach any node.
50 We define \p currentDepth as the number of memory arrays that we need to traverse to reach the \p arrayNode on which
51 we need to operate; this is initially one, because of \p head.
53 That approach to the structure of the hash set uses an extensible hashing scheme; <b> the hash value is treated as a bit
54 string</b> and rehash incrementally.
56 @note Two important things you should keep in mind when you're using \p %FeldmanHashSet:
57 - all keys must be fixed-size. It means that you cannot use \p std::string as a key for \p %FeldmanHashSet.
58 Instead, for the strings you should use well-known hashing algorithms like <a href="https://en.wikipedia.org/wiki/Secure_Hash_Algorithm">SHA1, SHA2</a>,
59 <a href="https://en.wikipedia.org/wiki/MurmurHash">MurmurHash</a>, <a href="https://en.wikipedia.org/wiki/CityHash">CityHash</a>
60 or its successor <a href="https://code.google.com/p/farmhash/">FarmHash</a> and so on, which
61 converts variable-length strings to fixed-length bit-strings, and use that hash as a key in \p %FeldmanHashSet.
62 - \p %FeldmanHashSet uses a perfect hashing. It means that if two different keys, for example, of type \p std::string,
63 have identical hash then you cannot insert both that keys in the set. \p %FeldmanHashSet does not maintain the key,
64 it maintains its fixed-size hash value.
66 The set supports @ref cds_container_FeldmanHashSet_iterators "bidirectional thread-safe iterators".
69 - \p GC - safe memory reclamation schema. Can be \p gc::HP, \p gc::DHP or one of \ref cds_urcu_type "RCU type"
70 - \p T - a value type to be stored in the set
71 - \p Traits - type traits, the structure based on \p feldman_hashset::traits or result of \p feldman_hashset::make_traits metafunction.
72 \p Traits is the mandatory argument because it has one mandatory type - an @ref feldman_hashset::traits::hash_accessor "accessor"
73 to hash value of \p T. The set algorithm does not calculate that hash value.
75 There are several specializations of \p %FeldmanHashSet for each \p GC. You should include:
76 - <tt><cds/container/feldman_hashset_hp.h></tt> for \p gc::HP garbage collector
77 - <tt><cds/container/feldman_hashset_dhp.h></tt> for \p gc::DHP garbage collector
78 - <tt><cds/container/feldman_hashset_rcu.h></tt> for \ref cds_intrusive_FeldmanHashSet_rcu "RCU type". RCU specialization
79 has a slightly different interface.
84 #ifdef CDS_DOXYGEN_INVOKED
85 , class Traits = feldman_hashset::traits
91 #ifdef CDS_DOXYGEN_INVOKED
92 : protected cds::intrusive::FeldmanHashSet< GC, T, Traits >
94 : protected cds::container::details::make_feldman_hashset< GC, T, Traits >::type
98 typedef cds::container::details::make_feldman_hashset< GC, T, Traits > maker;
99 typedef typename maker::type base_class;
103 typedef GC gc; ///< Garbage collector
104 typedef T value_type; ///< type of value stored in the set
105 typedef Traits traits; ///< Traits template parameter, see \p feldman_hashset::traits
107 typedef typename base_class::hash_accessor hash_accessor; ///< Hash accessor functor
108 typedef typename base_class::hash_type hash_type; ///< Hash type deduced from \p hash_accessor return type
109 typedef typename base_class::hash_comparator hash_comparator; ///< hash compare functor based on \p opt::compare and \p opt::less option setter
111 typedef typename traits::item_counter item_counter; ///< Item counter type
112 typedef typename traits::allocator allocator; ///< Element allocator
113 typedef typename traits::node_allocator node_allocator; ///< Array node allocator
114 typedef typename traits::memory_model memory_model; ///< Memory model
115 typedef typename traits::back_off back_off; ///< Backoff strategy
116 typedef typename traits::stat stat; ///< Internal statistics type
118 typedef typename gc::template guarded_ptr< value_type > guarded_ptr; ///< Guarded pointer
120 /// Count of hazard pointers required
121 static CDS_CONSTEXPR size_t const c_nHazardPtrCount = base_class::c_nHazardPtrCount;
123 typedef typename base_class::iterator iterator; ///< @ref cds_container_FeldmanHashSet_iterators "bidirectional iterator" type
124 typedef typename base_class::const_iterator const_iterator; ///< @ref cds_container_FeldmanHashSet_iterators "bidirectional const iterator" type
125 typedef typename base_class::reverse_iterator reverse_iterator; ///< @ref cds_container_FeldmanHashSet_iterators "bidirectional reverse iterator" type
126 typedef typename base_class::const_reverse_iterator const_reverse_iterator; ///< @ref cds_container_FeldmanHashSet_iterators "bidirectional reverse const iterator" type
130 typedef typename maker::cxx_node_allocator cxx_node_allocator;
131 typedef std::unique_ptr< value_type, typename maker::node_disposer > scoped_node_ptr;
135 /// Creates empty set
137 @param head_bits: 2<sup>head_bits</sup> specifies the size of head array, minimum is 4.
138 @param array_bits: 2<sup>array_bits</sup> specifies the size of array node, minimum is 2.
140 Equation for \p head_bits and \p array_bits:
142 sizeof(hash_type) * 8 == head_bits + N * array_bits
144 where \p N is multi-level array depth.
146 FeldmanHashSet( size_t head_bits = 8, size_t array_bits = 4 )
147 : base_class( head_bits, array_bits )
150 /// Destructs the set and frees all data
154 /// Inserts new element
156 The function creates an element with copy of \p val value and then inserts it into the set.
158 The type \p Q should contain as minimum the complete hash for the element.
159 The object of \ref value_type should be constructible from a value of type \p Q.
160 In trivial case, \p Q is equal to \ref value_type.
162 Returns \p true if \p val is inserted into the set, \p false otherwise.
164 template <typename Q>
165 bool insert( Q const& val )
167 scoped_node_ptr sp( cxx_node_allocator().New( val ));
168 if ( base_class::insert( *sp )) {
175 /// Inserts new element
177 The function allows to split creating of new item into two part:
178 - create item with key only
179 - insert new item into the set
180 - if inserting is success, calls \p f functor to initialize value-fields of \p val.
182 The functor signature is:
184 void func( value_type& val );
186 where \p val is the item inserted. User-defined functor \p f should guarantee that during changing
187 \p val no any other changes could be made on this set's item by concurrent threads.
188 The user-defined functor is called only if the inserting is success.
190 template <typename Q, typename Func>
191 bool insert( Q const& val, Func f )
193 scoped_node_ptr sp( cxx_node_allocator().New( val ));
194 if ( base_class::insert( *sp, f )) {
201 /// Updates the element
203 The operation performs inserting or replacing with lock-free manner.
205 If the \p val key not found in the set, then the new item created from \p val
206 will be inserted into the set iff \p bInsert is \p true.
207 Otherwise, if \p val is found, it is replaced with new item created from \p val
208 and previous item is disposed.
209 In both cases \p func functor is called.
211 The functor \p Func signature:
214 void operator()( value_type& cur, value_type * prev );
218 - \p cur - current element
219 - \p prev - pointer to previous element with such hash. \p prev is \p nullptr
220 if \p cur was just inserted.
222 The functor may change non-key fields of the \p item; however, \p func must guarantee
223 that during changing no any other modifications could be made on this item by concurrent threads.
225 Returns <tt> std::pair<bool, bool> </tt> where \p first is \p true if operation is successfull,
226 i.e. the item has been inserted or updated,
227 \p second is \p true if the new item has been added or \p false if the item with key equal to \p val
230 template <typename Q, typename Func>
231 std::pair<bool, bool> update( Q const& val, Func func, bool bInsert = true )
233 scoped_node_ptr sp( cxx_node_allocator().New( val ));
234 std::pair<bool, bool> bRes = base_class::do_update( *sp, func, bInsert );
240 /// Inserts data of type \p value_type created in-place from <tt>std::forward<Args>(args)...</tt>
242 Returns \p true if inserting successful, \p false otherwise.
244 template <typename... Args>
245 bool emplace( Args&&... args )
247 scoped_node_ptr sp( cxx_node_allocator().New( std::forward<Args>(args)... ));
248 if ( base_class::insert( *sp )) {
255 /// Deletes the item from the set
257 The function searches \p hash in the set,
258 deletes the item found, and returns \p true.
259 If that item is not found the function returns \p false.
261 bool erase( hash_type const& hash )
263 return base_class::erase( hash );
266 /// Deletes the item from the set
268 The function searches \p hash in the set,
269 call \p f functor with item found, and deltes the element from the set.
271 The \p Func interface is
274 void operator()( value_type& item );
278 If \p hash is not found the function returns \p false.
280 template <typename Func>
281 bool erase( hash_type const& hash, Func f )
283 return base_class::erase( hash, f );
286 /// Deletes the item pointed by iterator \p iter
288 Returns \p true if the operation is successful, \p false otherwise.
290 The function does not invalidate the iterator, it remains valid and can be used for further traversing.
292 bool erase_at( iterator const& iter )
294 return base_class::erase_at( iter );
297 bool erase_at( reverse_iterator const& iter )
299 return base_class::erase_at( iter );
303 /// Extracts the item with specified \p hash
305 The function searches \p hash in the set,
306 unlinks it from the set, and returns a guarded pointer to the item extracted.
307 If \p hash is not found the function returns an empty guarded pointer.
309 The item returned is reclaimed by garbage collector \p GC
310 when returned \ref guarded_ptr object to be destroyed or released.
311 @note Each \p guarded_ptr object uses the GC's guard that can be limited resource.
315 typedef cds::container::FeldmanHashSet< your_template_args > my_set;
319 my_set::guarded_ptr gp( theSet.extract( 5 ));
324 // Destructor of gp releases internal HP guard
328 guarded_ptr extract( hash_type const& hash )
330 return base_class::extract( hash );
333 /// Finds an item by it's \p hash
335 The function searches the item by \p hash and calls the functor \p f for item found.
336 The interface of \p Func functor is:
339 void operator()( value_type& item );
342 where \p item is the item found.
344 The functor may change non-key fields of \p item. Note that the functor is only guarantee
345 that \p item cannot be disposed during the functor is executing.
346 The functor does not serialize simultaneous access to the set's \p item. If such access is
347 possible you must provide your own synchronization schema on item level to prevent unsafe item modifications.
349 The function returns \p true if \p hash is found, \p false otherwise.
351 template <typename Func>
352 bool find( hash_type const& hash, Func f )
354 return base_class::find( hash, f );
357 /// Checks whether the set contains \p hash
359 The function searches the item by its \p hash
360 and returns \p true if it is found, or \p false otherwise.
362 bool contains( hash_type const& hash )
364 return base_class::contains( hash );
367 /// Finds an item by it's \p hash and returns the item found
369 The function searches the item by its \p hash
370 and returns the guarded pointer to the item found.
371 If \p hash is not found the function returns an empty \p guarded_ptr.
373 @note Each \p guarded_ptr object uses one GC's guard which can be limited resource.
377 typedef cds::container::FeldmanHashSet< your_template_params > my_set;
381 my_set::guarded_ptr gp( theSet.get( 5 ));
382 if ( theSet.get( 5 )) {
386 // Destructor of guarded_ptr releases internal HP guard
390 guarded_ptr get( hash_type const& hash )
392 return base_class::get( hash );
395 /// Clears the set (non-atomic)
397 The function unlink all data node from the set.
398 The function is not atomic but is thread-safe.
399 After \p %clear() the set may not be empty because another threads may insert items.
406 /// Checks if the set is empty
408 Emptiness is checked by item counting: if item count is zero then the set is empty.
409 Thus, the correct item counting feature is an important part of the set implementation.
413 return base_class::empty();
416 /// Returns item count in the set
419 return base_class::size();
422 /// Returns const reference to internal statistics
423 stat const& statistics() const
425 return base_class::statistics();
428 /// Returns the size of head node
429 size_t head_size() const
431 return base_class::head_size();
434 /// Returns the size of the array node
435 size_t array_node_size() const
437 return base_class::array_node_size();
440 /// Collects tree level statistics into \p stat
441 /** @copydetails cds::intrusive::FeldmanHashSet::get_level_statistics
443 void get_level_statistics(std::vector< feldman_hashset::level_statistics>& stat) const
445 base_class::get_level_statistics(stat);
449 ///@name Thread-safe iterators
450 /** @anchor cds_container_FeldmanHashSet_iterators
451 The set supports thread-safe iterators: you may iterate over the set in multi-threaded environment.
452 It is guaranteed that the iterators will remain valid even if another thread deletes the node the iterator points to:
453 Hazard Pointer embedded into the iterator object protects the node from physical reclamation.
455 @note Since the iterator object contains hazard pointer that is a thread-local resource,
456 the iterator should not be passed to another thread.
458 Each iterator object supports the common interface:
459 - dereference operators:
461 value_type [const] * operator ->() noexcept
462 value_type [const] & operator *() noexcept
464 - pre-increment and pre-decrement. Post-operators is not supported
465 - equality operators <tt>==</tt> and <tt>!=</tt>.
466 Iterators are equal iff they point to the same cell of the same array node.
467 Note that for two iterators \p it1 and \p it2, the conditon <tt> it1 == it2 </tt>
468 does not entail <tt> &(*it1) == &(*it2) </tt>
469 - helper member function \p release() that clears internal hazard pointer.
470 After \p release() the iterator points to \p nullptr but it still remain valid: further iterating is possible.
472 During iteration you may safely erase any item from the set;
473 @ref erase_at() function call doesn't invalidate any iterator.
474 If some iterator points to the item to be erased, that item is not deleted immediately
475 but only after that iterator will be advanced forward or backward.
477 @note It is possible the item can be iterated more that once, for example, if an iterator points to the item
478 in array node that is being splitted.
482 /// Returns an iterator to the beginning of the set
485 return base_class::begin();
488 /// Returns an const iterator to the beginning of the set
489 const_iterator begin() const
491 return base_class::begin();
494 /// Returns an const iterator to the beginning of the set
495 const_iterator cbegin()
497 return base_class::cbegin();
500 /// Returns an iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
503 return base_class::end();
506 /// Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
507 const_iterator end() const
509 return base_class::end();
512 /// Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
513 const_iterator cend()
515 return base_class::cend();
518 /// Returns a reverse iterator to the first element of the reversed set
519 reverse_iterator rbegin()
521 return base_class::rbegin();
524 /// Returns a const reverse iterator to the first element of the reversed set
525 const_reverse_iterator rbegin() const
527 return base_class::rbegin();
530 /// Returns a const reverse iterator to the first element of the reversed set
531 const_reverse_iterator crbegin()
533 return base_class::crbegin();
536 /// Returns a reverse iterator to the element following the last element of the reversed set
538 It corresponds to the element preceding the first element of the non-reversed container.
539 This element acts as a placeholder, attempting to access it results in undefined behavior.
541 reverse_iterator rend()
543 return base_class::rend();
546 /// Returns a const reverse iterator to the element following the last element of the reversed set
548 It corresponds to the element preceding the first element of the non-reversed container.
549 This element acts as a placeholder, attempting to access it results in undefined behavior.
551 const_reverse_iterator rend() const
553 return base_class::rend();
556 /// Returns a const reverse iterator to the element following the last element of the reversed set
558 It corresponds to the element preceding the first element of the non-reversed container.
559 This element acts as a placeholder, attempting to access it results in undefined behavior.
561 const_reverse_iterator crend()
563 return base_class::crend();
568 }} // namespace cds::container
570 #endif // #ifndef CDSLIB_CONTAINER_IMPL_FELDMAN_HASHSET_H