X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=64ddb9d105d7ce7e2942b9208d8920bce84df005;hb=7eafc3e7be067709c6fcdae7b7fc4994c7ec2377;hp=5565973ac1b8b8ec63c8827b1dc6a0b896e15459;hpb=d475c105080cc5082ca8b8e87c89fa7c2dade39e;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index 5565973ac1b..64ddb9d105d 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -4,7 +4,7 @@
The Twine class is an efficient way for APIs to accept concatenated -strings. For example, a common LLVM paradigm is to name one instruction based on +
The Twine class is an +efficient way for APIs to accept concatenated strings. For example, a common +LLVM paradigm is to name one instruction based on the name of another instruction with a suffix, for example:
The Twine class is effectively a -lightweight rope +
The Twine class is effectively a lightweight +rope which points to temporary (stack allocated) objects. Twines can be implicitly constructed as the result of the plus operator applied to strings (i.e., a C -strings, an std::string, or a StringRef). The twine delays the -actual concatenation of strings until it is actually required, at which point -it can be efficiently rendered directly into a character array. This avoids -unnecessary heap allocation involved in constructing the temporary results of -string concatenation. See -"llvm/ADT/Twine.h" -for more information.
+strings, an std::string, or a StringRef). The twine delays +the actual concatenation of strings until it is actually required, at which +point it can be efficiently rendered directly into a character array. This +avoids unnecessary heap allocation involved in constructing the temporary +results of string concatenation. See +"llvm/ADT/Twine.h" +and here for more information.As with a StringRef, Twine objects point to external memory and should almost never be stored or mentioned directly. They are intended @@ -883,7 +892,7 @@ cost of adding the elements to the container.
TinyPtrVector<Type> is a highly specialized collection class +that is optimized to avoid allocation in the case when a vector has zero or one +elements. It has two major restrictions: 1) it can only hold values of pointer +type, and 2) it cannot hold a null pointer.
+ +Since this container is highly specialized, it is rarely used.
+ +for ( ... ) { std::vector<foo> V; - use V; + // make use of V. }
std::vector<foo> V; for ( ... ) { - use V; + // make use of V. V.clear(); }@@ -1190,9 +1215,187 @@ std::priority_queue, std::stack, etc. These provide simplified access to an underlying container but don't affect the cost of the container itself. + + + +
+There are a variety of ways to pass around and use strings in C and C++, and +LLVM adds a few new options to choose from. Pick the first option on this list +that will do what you need; they are ordered according to their relative cost. +
++Note that it is generally preferred to not pass strings around as +"const char*"'s. These have a number of problems, including the fact +that they cannot represent embedded nul ("\0") characters, and do not have a +length available efficiently. The general replacement for 'const +char*' is StringRef. +
+ +For more information on choosing string containers for APIs, please see +Passing strings.
+ + + ++The StringRef class is a simple value class that contains a pointer to a +character and a length, and is quite related to the ArrayRef class (but specialized for arrays of +characters). Because StringRef carries a length with it, it safely handles +strings with embedded nul characters in it, getting the length does not require +a strlen call, and it even has very convenient APIs for slicing and dicing the +character range that it represents. +
+ ++StringRef is ideal for passing simple strings around that are known to be live, +either because they are C string literals, std::string, a C array, or a +SmallVector. Each of these cases has an efficient implicit conversion to +StringRef, which doesn't result in a dynamic strlen being executed. +
+ +StringRef has a few major limitations which make more powerful string +containers useful:
+ +Because of its strengths and limitations, it is very common for a function to +take a StringRef and for a method on an object to return a StringRef that +points into some string that it owns.
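The pointer-plus-length idea described above can be sketched without LLVM headers. The block below is only an illustration: it uses std::string_view (C++17) as a stand-in for StringRef, and the helper names countByte and stem are hypothetical, not part of any LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// std::string_view stands in for llvm::StringRef here. This is an assumption
// made so the sketch compiles without LLVM; the real class has its own API.

// Counts occurrences of C in S. Because the view carries a length, embedded
// nul bytes are ordinary characters and no strlen call is needed.
std::size_t countByte(std::string_view S, char C) {
  std::size_t N = 0;
  for (char X : S)
    if (X == C)
      ++N;
  return N;
}

// Returns the part of S before the first '.', or all of S if there is none.
// The result points into the caller's buffer; nothing is copied, so the
// caller must keep that buffer alive while the result is in use.
std::string_view stem(std::string_view S) {
  return S.substr(0, S.find('.'));
}
```

Both functions accept a string literal or a std::string without an explicit conversion, which mirrors the "take a view as a parameter, return a view into storage the callee or caller owns" pattern described above.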
++ The Twine class is used as an intermediary datatype for APIs that want to take + a string that can be constructed inline with a series of concatenations. + Twine works by forming recursive instances of the Twine datatype (a simple + value object) on the stack as temporary objects, linking them together into a + tree which is then linearized when the Twine is consumed. Twine is only safe + to use as the argument to a function, and should always be a const reference, + e.g.: +
+ ++ void foo(const Twine &T); + ... + StringRef X = ... + unsigned i = ... + foo(X + "." + Twine(i)); ++ +
This example forms a string like "blarg.42" by concatenating the values + together, and does not form intermediate strings containing "blarg" or + "blarg.". +
+ +Because Twine is constructed with temporary objects on the stack, and + because these instances are destroyed at the end of the current statement, + it is an inherently dangerous API. For example, this simple variant contains + undefined behavior and will probably crash:
+ ++ void foo(const Twine &T); + ... + StringRef X = ... + unsigned i = ... + const Twine &Tmp = X + "." + Twine(i); + foo(Tmp); ++ +
... because the temporaries are destroyed before the call. That said, + Twine's are much more efficient than intermediate std::string temporaries, and + they work really well with StringRef. Just be aware of their limitations.
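To make the "tree of temporaries that is linearized on consumption" description concrete, here is a minimal sketch of the idea. MiniTwine and render are hypothetical names invented for this illustration; the real Twine in "llvm/ADT/Twine.h" supports more leaf kinds and is implemented differently in detail.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for llvm::Twine: a node is either a leaf (a view of
// external characters) or a concatenation of two other nodes. Nodes only
// point at their operands; they never copy character data.
class MiniTwine {
  const MiniTwine *LHS = nullptr;
  const MiniTwine *RHS = nullptr;
  std::string_view Leaf;

public:
  MiniTwine(std::string_view S) : Leaf(S) {}
  MiniTwine(const char *S) : Leaf(S) {}
  MiniTwine(const MiniTwine &L, const MiniTwine &R) : LHS(&L), RHS(&R) {}

  // Linearize the tree into Out with a left-to-right walk.
  void appendTo(std::string &Out) const {
    if (LHS) {
      assert(RHS && "a concatenation node has two children");
      LHS->appendTo(Out);
      RHS->appendTo(Out);
    } else {
      Out.append(Leaf);
    }
  }
};

// Builds a concatenation node that refers to both operands by address.
inline MiniTwine operator+(const MiniTwine &L, const MiniTwine &R) {
  return MiniTwine(L, R);
}

// Consumer: the only place where characters are actually copied.
std::string render(const MiniTwine &T) {
  std::string Out;
  T.appendTo(Out);
  return Out;
}
```

A call such as render(MiniTwine("blarg") + "." + "42") is safe because every temporary node lives until the end of that full expression. Binding the result of the + expression to a named reference first, as in the crashing variant above, would leave the root node pointing at temporaries that have already been destroyed.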
+ +SmallString is a subclass of SmallVector that +adds some convenience APIs like += that takes StringRef's. SmallString avoids +allocating memory in the case when the preallocated space is enough to hold its +data, and it falls back to general heap allocation when required. Since it owns +its data, it is very safe to use and supports full mutation of the string.
+ +Like SmallVector's, the big downside to SmallString is their sizeof. While +they are optimized for small strings, they themselves are not particularly +small. This means that they work great for temporary scratch buffers on the +stack, but should not generally be put into the heap: it is very rare to +see a SmallString as the member of a frequently-allocated heap data structure +or returned by-value. +
+ +The standard C++ std::string class is a very general class that (like + SmallString) owns its underlying data. sizeof(std::string) is very reasonable + so it can be embedded into heap data structures and returned by-value. + On the other hand, std::string is highly inefficient for inline editing (e.g. + concatenating a bunch of stuff together) and because it is provided by the + standard library, its performance characteristics depend a lot on the host + standard library (e.g. libc++ and MSVC provide a highly optimized string + class, GCC contains a really slow implementation). +
+ +The major disadvantage of std::string is that almost every operation that + makes it larger can allocate memory, which is slow. As such, it is better + to use SmallVector or Twine as a scratch buffer, but then use std::string to + persist the result.
+ + +SparseSet holds a small number of objects identified by unsigned keys of +moderate size. It uses a lot of memory, but provides operations that are +almost as fast as a vector. Typical keys are physical registers, virtual +registers, or numbered basic blocks.
+ +SparseSet is useful for algorithms that need very fast clear/find/insert/erase +and fast iteration over small sets. It is not intended for building composite +data structures.
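One classic way to get this trade-off (memory proportional to the key universe, constant-time clear/find/insert/erase, iteration over members only) is a sparse array of indices paired with a dense array of members. The sketch below illustrates that technique; MiniSparseSet is a hypothetical name, and whether "llvm/ADT/SparseSet.h" matches it in detail is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a sparse set over keys in [0, Universe).
class MiniSparseSet {
  std::vector<unsigned> Dense;  // the members, tightly packed
  std::vector<unsigned> Sparse; // Sparse[K] = position of K in Dense, if present

public:
  explicit MiniSparseSet(unsigned Universe) : Sparse(Universe, 0) {}

  bool contains(unsigned K) const {
    assert(K < Sparse.size() && "key outside the universe");
    unsigned I = Sparse[K];
    // A stale Sparse entry is harmless: it either points past the end of
    // Dense or at a slot that holds a different key.
    return I < Dense.size() && Dense[I] == K;
  }

  bool insert(unsigned K) {
    if (contains(K))
      return false;
    Sparse[K] = static_cast<unsigned>(Dense.size());
    Dense.push_back(K);
    return true;
  }

  bool erase(unsigned K) {
    if (!contains(K))
      return false;
    // Move the last member into the vacated slot, then shrink by one.
    unsigned I = Sparse[K];
    unsigned Last = Dense.back();
    Dense[I] = Last;
    Sparse[Last] = I;
    Dense.pop_back();
    return true;
  }

  // Constant time: Sparse is deliberately left untouched.
  void clear() { Dense.clear(); }

  std::size_t size() const { return Dense.size(); }
  std::vector<unsigned>::const_iterator begin() const { return Dense.begin(); }
  std::vector<unsigned>::const_iterator end() const { return Dense.end(); }
};
```

The Sparse array is sized for the whole universe even when only a handful of keys are present, which is where the memory cost comes from; in exchange, clear() does not need to touch it.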
+ +SetVector is an adapter class that defaults to using std::vector and std::set -for the underlying containers, so it is quite expensive. However, -"llvm/ADT/SetVector.h" also provides a SmallSetVector class, which -defaults to using a SmallVector and SmallSet of a specified size. If you use -this, and if your sets are dynamically smaller than N, you will save a lot of -heap traffic.
+SetVector is an adapter class that defaults to + using std::vector and a size 16 SmallSet for the underlying + containers, so it is quite expensive. However, + "llvm/ADT/SetVector.h" also provides a SmallSetVector + class, which defaults to using a SmallVector and SmallSet + of a specified size. If you use this, and if your sets are dynamically + smaller than N, you will save a lot of heap traffic.
@@ -1409,6 +1631,29 @@ factors, and produces a lot of malloc traffic. It should be avoided. + ++ImmutableSet is an immutable (functional) set implementation based on an AVL +tree. +Adding or removing elements is done through a Factory object and results in the +creation of a new ImmutableSet object. +If an ImmutableSet already exists with the given contents, then the existing one +is returned; equality is compared with a FoldingSetNodeID. +The time and space complexity of add or remove operations is logarithmic in the +size of the original set. + +
+There is no method for returning an element of the set; you can only check for +membership. + +
StringMap also provides query methods that take byte ranges, so it only ever copies a string if a value is inserted into the table.
+ +StringMap iteration order, however, is not guaranteed to be deterministic, +so any uses which require that should instead use a std::map.
@@ -1529,7 +1777,7 @@ pointers, or map other small types to each other.There are several aspects of DenseMap that you should be aware of, however. The -iterators in a densemap are invalidated whenever an insertion occurs, unlike +iterators in a DenseMap are invalidated whenever an insertion occurs, unlike map. Also, because DenseMap allocates space for a large number of key/value pairs (it starts with 64 by default), it will waste a lot of space if your keys or values are large. Finally, you must implement a partial specialization of @@ -1537,6 +1785,14 @@ DenseMapInfo for the key that you want, if it isn't already supported. This is required to tell DenseMap about two special marker values (which can never be inserted into the map) that it needs internally.
++DenseMap's find_as() method supports lookup operations using an alternate key +type. This is useful in cases where the normal key type is expensive to +construct, but cheap to compare against. The DenseMapInfo is responsible for +defining the appropriate comparison and hashing methods for each alternate +key type used. +
+ @@ -1593,6 +1849,24 @@ another element takes place). + + +MapVector<KeyT,ValueT> provides a subset of the DenseMap interface. + The main difference is that the iteration order is guaranteed to be + the insertion order, making it an easy (but somewhat expensive) solution + for non-deterministic iteration over maps of pointers.
+ +It is implemented by mapping from key to an index in a vector of key,value + pairs. This provides fast lookup and iteration, but has two main drawbacks: + The key is stored twice and it doesn't support removing elements.
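The key-to-index layout described above can be sketched with standard containers. MiniMapVector is a hypothetical name, and std::unordered_map stands in for the DenseMap that the real class would use; treat this as an illustration of the layout rather than the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: a map whose iteration order is insertion order.
template <typename KeyT, typename ValueT> class MiniMapVector {
  std::unordered_map<KeyT, std::size_t> Index;  // key -> position in Entries
  std::vector<std::pair<KeyT, ValueT>> Entries; // key,value pairs in insertion order

public:
  // Returns the value for K, inserting a default-constructed one if needed.
  ValueT &operator[](const KeyT &K) {
    auto Result = Index.try_emplace(K, Entries.size()); // C++17
    if (Result.second)
      Entries.emplace_back(K, ValueT()); // the key is stored a second time here
    assert(Result.first->second < Entries.size());
    return Entries[Result.first->second].second;
  }

  // Returns a pointer to the value for K, or nullptr if K is absent.
  const ValueT *lookup(const KeyT &K) const {
    auto It = Index.find(K);
    return It == Index.end() ? nullptr : &Entries[It->second].second;
  }

  // Iteration walks the vector, so it never depends on hash order.
  auto begin() const { return Entries.begin(); }
  auto end() const { return Entries.end(); }
  std::size_t size() const { return Entries.size(); }
};
```

Both drawbacks are visible in the sketch: each key lives in Index and again in Entries, and there is no erase, because removing an entry from the middle of the vector would invalidate every index stored after it.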
+ ++ImmutableMap is an immutable (functional) map implementation based on an AVL +tree. +Adding or removing elements is done through a Factory object and results in the +creation of a new ImmutableMap object. +If an ImmutableMap already exists with the given key set, then the existing one +is returned; equality is compared with a FoldingSetNodeID. +The time and space complexity of add or remove operations is logarithmic in the +size of the original map. + +
-TODO: const char* vs stringref vs smallstring vs std::string. Describe twine, -xref to #string_apis. -
- -Replacing individual instructions
+Including "llvm/Transforms/Utils/BasicBlockUtils.h" permits use of two very useful replace functions: ReplaceInstWithValue @@ -2319,6 +2598,7 @@ and ReplaceInstWithInst.
Replacing multiple uses of Users and Values
+You can use Value::replaceAllUsesWith and User::replaceUsesOfWith to change more than one use at a time. See the @@ -3049,13 +3331,12 @@ helpful member functions that try to make common operations easy.
Constructing a Module is easy. You can optionally +
Constructing a Module is easy. You can optionally provide a name for it (probably based on the name of the translation unit).
+Look up the specified function in the Module SymbolTable. If it does not exist, return @@ -3602,7 +3884,7 @@ is its address (after linking) which is guaranteed to be constant.
*Ty, LinkageTypes Linkage, const std::string &N = "", Module* Parent = 0)Constructor used when you need to create new Functions to add - the the program. The constructor must specify the type of the function to + the program. The constructor must specify the type of the function to create and what type of linkage the function should have. The FunctionType argument specifies the formal arguments and return value for the function. The same