X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=0b1a875da1abee5f68b76acf1dd82b216848f357;hb=1aee22e0720932a82dd3bf3fc8be804fff6bb89a;hp=bfa721ddc95a26d23a1a1501cf604a985ad23479;hpb=9d69d4aadd4a58aba5634d5c3d2c2a6d8d284134;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index bfa721ddc95..0b1a875da1a 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -68,6 +68,13 @@ option
for ( ... ) { std::vector<foo> V; - use V; + // make use of V. }
std::vector<foo> V; for ( ... ) { - use V; + // make use of V. V.clear(); }@@ -1209,9 +1212,187 @@ std::priority_queue, std::stack, etc. These provide simplified access to an underlying container but don't affect the cost of the container itself. + + + +
+There are a variety of ways to pass around and use strings in C and C++, and +LLVM adds a few new options to choose from. Pick the first option on this list +that will do what you need; they are ordered according to their relative cost. +
++Note that it is generally preferred not to pass strings around as +"const char*"s. These have a number of problems, including the fact +that they cannot represent embedded nul ("\0") characters, and do not have a +length available efficiently. The general replacement for 'const +char*' is StringRef. +
+ +For more information on choosing string containers for APIs, please see +Passing strings.
+ + + ++The StringRef class is a simple value class that contains a pointer to a +character and a length, and is closely related to the ArrayRef class (but specialized for arrays of +characters). Because StringRef carries a length with it, it safely handles +strings with embedded nul characters, getting the length does not require +a strlen call, and it even has very convenient APIs for slicing and dicing the +character range that it represents. +
+ ++StringRef is ideal for passing simple strings around that are known to be live, +either because they are C string literals, std::string, a C array, or a +SmallVector. Each of these cases has an efficient implicit conversion to +StringRef, which doesn't result in a dynamic strlen being executed. +
+ +StringRef has a few major limitations which make more powerful string +containers useful:
+ +Because of its strengths and limitations, it is very common for a function to +take a StringRef and for a method on an object to return a StringRef that +points into some string that it owns.
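The pointer-plus-length idea described above can be sketched with the standard library alone. This is a minimal illustration that uses std::string_view (C++17) as a stand-in for StringRef; it is not LLVM's API, and the function names countNuls, dropFront and firstComponent are hypothetical helpers invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Takes a non-owning view (pointer + length), so embedded nul characters are
// ordinary data and no strlen call is needed to find the end.
std::size_t countNuls(std::string_view S) {
  std::size_t N = 0;
  for (char C : S)
    if (C == '\0')
      ++N;
  return N;
}

// Slicing a view only adjusts the pointer and length; no characters are copied.
std::string_view dropFront(std::string_view S, std::size_t N) {
  assert(N <= S.size() && "cannot drop more characters than the view holds");
  return S.substr(N);
}

// Returns a view into storage owned by Owner. The result is only valid while
// Owner is alive and unmodified, which mirrors the lifetime caveat in the text.
std::string_view firstComponent(const std::string &Owner) {
  std::size_t Dot = Owner.find('.');
  return std::string_view(Owner).substr(0, Dot);
}
```

For an owner holding "blarg.42", firstComponent returns a view of "blarg" that points directly into the owner's buffer, which is the "take a view, return a view into something you own" pattern described above.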
+ ++ The Twine class is used as an intermediary datatype for APIs that want to take + a string that can be constructed inline with a series of concatenations. + Twine works by forming recursive instances of the Twine datatype (a simple + value object) on the stack as temporary objects, linking them together into a + tree which is then linearized when the Twine is consumed. Twine is only safe + to use as the argument to a function, and should always be a const reference, + e.g.: +
+ ++ void foo(const Twine &T); + ... + StringRef X = ... + unsigned i = ... + foo(X + "." + Twine(i)); ++ +
This example forms a string like "blarg.42" by concatenating the values + together, and does not form intermediate strings containing "blarg" or + "blarg.". +
+ +Because Twine is constructed with temporary objects on the stack, and + because these instances are destroyed at the end of the current statement, + it is an inherently dangerous API. For example, this simple variant contains + undefined behavior and will probably crash:
+ ++ void foo(const Twine &T); + ... + StringRef X = ... + unsigned i = ... + const Twine &Tmp = X + "." + Twine(i); + foo(Tmp); ++ +
... because the temporaries are destroyed before the call. That said, + Twines are much more efficient than intermediate std::string temporaries, and + they work really well with StringRef. Just be aware of their limitations.
+SmallString is a subclass of SmallVector that +adds some convenience APIs like += that takes StringRefs. SmallString avoids +allocating memory in the case when the preallocated space is enough to hold its +data, and it falls back to general heap allocation when required. Since it owns +its data, it is very safe to use and supports full mutation of the string.
+ +As with SmallVector, the big downside to SmallString is its sizeof. While +SmallStrings are optimized for small strings, they themselves are not particularly +small. This means that they work great for temporary scratch buffers on the +stack, but should not generally be put into the heap: it is very rare to +see a SmallString as the member of a frequently-allocated heap data structure +or returned by-value. +
+ +The standard C++ std::string class is a very general class that (like + SmallString) owns its underlying data. sizeof(std::string) is very reasonable + so it can be embedded into heap data structures and returned by-value. + On the other hand, std::string is highly inefficient for inline editing (e.g. + concatenating a bunch of stuff together) and because it is provided by the + standard library, its performance characteristics depend a lot on the host + standard library (e.g. libc++ and MSVC provide a highly optimized string + class, GCC contains a really slow implementation). +
+ +The major disadvantage of std::string is that almost every operation that + makes it larger can allocate memory, which is slow. As such, it is better + to use SmallVector or Twine as a scratch buffer, but then use std::string to + persist the result.
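The "build in a scratch buffer, then persist once" advice can be illustrated without LLVM headers. The sketch below is an assumption-laden stand-in: a fixed-size stack array plays the role of the SmallString/SmallVector scratch space, std::string_view stands in for StringRef, and makeLabel is a hypothetical function invented for this example. Note that the "%.*s" conversion stops at an embedded nul, so this particular sketch is only suitable for nul-free input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Formats "<Base>.<Index>" into a stack buffer and creates the std::string
// only once, from the finished text. Falls back to a single heap-allocated
// buffer when the result does not fit in the stack scratch space.
std::string makeLabel(std::string_view Base, unsigned Index) {
  char Scratch[64];
  const int BaseLen = static_cast<int>(Base.size());
  int Len = std::snprintf(Scratch, sizeof(Scratch), "%.*s.%u", BaseLen,
                          Base.data(), Index);
  assert(Len >= 0 && "snprintf reported an encoding error");
  const std::size_t Needed = static_cast<std::size_t>(Len);

  // Common case: everything fit in the stack buffer, so persist it directly.
  if (Needed < sizeof(Scratch))
    return std::string(Scratch, Needed);

  // Rare case: size the string exactly and format straight into it.
  std::string Result(Needed, '\0');
  std::snprintf(Result.data(), Needed + 1, "%.*s.%u", BaseLen, Base.data(),
                Index);
  return Result;
}
```

Calling makeLabel("blarg", 42) yields the string "blarg.42" with at most one std::string allocation, rather than one per concatenation step.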
+ + +SetVector is an adapter class that defaults to using std::vector and std::set -for the underlying containers, so it is quite expensive. However, -"llvm/ADT/SetVector.h" also provides a SmallSetVector class, which -defaults to using a SmallVector and SmallSet of a specified size. If you use -this, and if your sets are dynamically smaller than N, you will save a lot of -heap traffic.
+SetVector is an adapter class that defaults to + using std::vector and a size 16 SmallSet for the underlying + containers, so it is quite expensive. However, + "llvm/ADT/SetVector.h" also provides a SmallSetVector + class, which defaults to using a SmallVector and SmallSet + of a specified size. If you use this, and if your sets are dynamically + smaller than N, you will save a lot of heap traffic.
@@ -1428,6 +1610,29 @@ factors, and produces a lot of malloc traffic. It should be avoided. + ++ImmutableSet is an immutable (functional) set implementation based on an AVL +tree. +Adding or removing elements is done through a Factory object and results in the +creation of a new ImmutableSet object. +If an ImmutableSet already exists with the given contents, then the existing one +is returned; equality is compared with a FoldingSetNodeID. +The time and space complexity of add or remove operations is logarithmic in the +size of the original set. + +
+There is no method for returning an element of the set; you can only check for +membership. + +
There are several aspects of DenseMap that you should be aware of, however. The -iterators in a densemap are invalidated whenever an insertion occurs, unlike +iterators in a DenseMap are invalidated whenever an insertion occurs, unlike map. Also, because DenseMap allocates space for a large number of key/value pairs (it starts with 64 by default), it will waste a lot of space if your keys or values are large. Finally, you must implement a partial specialization of @@ -1556,6 +1761,14 @@ DenseMapInfo for the key that you want, if it isn't already supported. This is required to tell DenseMap about two special marker values (which can never be inserted into the map) that it needs internally.
++DenseMap's find_as() method supports lookup operations using an alternate key +type. This is useful in cases where the normal key type is expensive to +construct, but cheap to compare against. The DenseMapInfo is responsible for +defining the appropriate comparison and hashing methods for each alternate +key type used. +
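The idea of looking up with an alternate key type has a close analogue in the standard library, sketched here for readers without an LLVM checkout. This example does not use DenseMap or find_as(); it uses std::set with the transparent comparator std::less<> (C++14) and std::string_view (C++17), and the names NameSet and containsName are hypothetical.

```cpp
#include <functional>
#include <set>
#include <string>
#include <string_view>

// Keys are owned std::strings, which are comparatively expensive to construct.
// std::less<> is a transparent comparator, so lookups may use any type that
// can be compared against std::string.
using NameSet = std::set<std::string, std::less<>>;

// Looks up Key without building a temporary std::string: the comparison is
// done directly between the stored strings and the view.
bool containsName(const NameSet &Names, std::string_view Key) {
  return Names.find(Key) != Names.end();
}
```

The design trade-off is the same one the text describes: the container's comparison (or, for a hash table, hashing) machinery must agree across the normal and alternate key types, otherwise lookups with the alternate key would not find entries inserted with the normal one.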
+ @@ -1632,6 +1845,25 @@ it can be edited again. + ++ImmutableMap is an immutable (functional) map implementation based on an AVL +tree. +Adding or removing elements is done through a Factory object and results in the +creation of a new ImmutableMap object. +If an ImmutableMap already exists with the given key set, then the existing one +is returned; equality is compared with a FoldingSetNodeID. +The time and space complexity of add or remove operations is logarithmic in the +size of the original map. + +
-TODO: const char* vs stringref vs smallstring vs std::string. Describe twine, -xref to #string_apis. -
- -