+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_uniquevector">"llvm/ADT/UniqueVector.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+UniqueVector is similar to <a href="#dss_setvector">SetVector</a>, but it
+retains a unique ID for each element inserted into the set. It internally
+contains a map and a vector, and it assigns a unique ID for each value inserted
+into the set.</p>
+
+<p>UniqueVector is very expensive: its cost is the sum of the cost of
+maintaining both the map and vector, it has high complexity, high constant
+factors, and produces a lot of malloc traffic. It should be avoided.</p>
+
+</div>
+
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_otherset">Other Set-Like Container Options</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+The STL provides several other options, such as std::multiset and the various
+"hash_set" like containers (whether from C++ TR1 or from the SGI library).</p>
+
+<p>std::multiset is useful if you're not interested in elimination of
+duplicates, but has all the drawbacks of std::set. A sorted vector (where you
+don't delete duplicate entries) or some other approach is almost always
+better.</p>
+
+<p>The various hash_set implementations (exposed portably by
+"llvm/ADT/hash_set") is a simple chained hashtable. This algorithm is as malloc
+intensive as std::set (performing an allocation for each element inserted,
+thus having really high constant factors) but (usually) provides O(1)
+insertion/deletion of elements. This can be useful if your elements are large
+(thus making the constant-factor cost relatively low) or if comparisons are
+expensive. Element iteration does not visit elements in a useful order.</p>
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="ds_map">Map-Like Containers (std::map, DenseMap, etc)</a>
+</div>
+
+<div class="doc_text">
+Map-like containers are useful when you want to associate data to a key. As
+usual, there are a lot of different ways to do this. :)
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_sortedvectormap">A sorted 'vector'</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+If your usage pattern follows a strict insert-then-query approach, you can
+trivially use the same approach as <a href="#dss_sortedvectorset">sorted vectors
+for set-like containers</a>. The only difference is that your query function
+(which uses std::lower_bound to get efficient log(n) lookup) should only compare
+the key, not both the key and value. This yields the same advantages as sorted
+vectors for sets.
+</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_stringmap">"llvm/ADT/StringMap.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+Strings are commonly used as keys in maps, and they are difficult to support
+efficiently: they are variable length, inefficient to hash and compare when
+long, expensive to copy, etc. StringMap is a specialized container designed to
+cope with these issues. It supports mapping an arbitrary range of bytes to an
+arbitrary other object.</p>
+
+<p>The StringMap implementation uses a quadratically-probed hash table, where
+the buckets store a pointer to the heap allocated entries (and some other
+stuff). The entries in the map must be heap allocated because the strings are
+variable length. The string data (key) and the element object (value) are
+stored in the same allocation with the string data immediately after the element
+object. This container guarantees the "<tt>(char*)(&Value+1)</tt>" points
+to the key string for a value.</p>
+
+<p>The StringMap is very fast for several reasons: quadratic probing is very
+cache efficient for lookups, the hash value of strings in buckets is not
+recomputed when lookup up an element, StringMap rarely has to touch the
+memory for unrelated objects when looking up a value (even when hash collisions
+happen), hash table growth does not recompute the hash values for strings
+already in the table, and each pair in the map is store in a single allocation
+(the string data is stored in the same allocation as the Value of a pair).</p>
+
+<p>StringMap also provides query methods that take byte ranges, so it only ever
+copies a string if a value is inserted into the table.</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_indexedmap">"llvm/ADT/IndexedMap.h"</a>
+</div>
+
+<div class="doc_text">
+<p>
+IndexedMap is a specialized container for mapping small dense integers (or
+values that can be mapped to small dense integers) to some other type. It is
+internally implemented as a vector with a mapping function that maps the keys to
+the dense integer range.
+</p>
+
+<p>
+This is useful for cases like virtual registers in the LLVM code generator: they
+have a dense mapping that is offset by a compile-time constant (the first
+virtual register ID).</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_densemap">"llvm/ADT/DenseMap.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+DenseMap is a simple quadratically probed hash table. It excels at supporting
+small keys and values: it uses a single allocation to hold all of the pairs that
+are currently inserted in the map. DenseMap is a great way to map pointers to
+pointers, or map other small types to each other.
+</p>
+
+<p>
+There are several aspects of DenseMap that you should be aware of, however. The
+iterators in a densemap are invalidated whenever an insertion occurs, unlike
+map. Also, because DenseMap allocates space for a large number of key/value
+pairs (it starts with 64 by default), it will waste a lot of space if your keys
+or values are large. Finally, you must implement a partial specialization of
+DenseMapKeyInfo for the key that you want, if it isn't already supported. This
+is required to tell DenseMap about two special marker values (which can never be
+inserted into the map) that it needs internally.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_map"><map></a>
+</div>
+
+<div class="doc_text">
+
+<p>
+std::map has similar characteristics to <a href="#dss_set">std::set</a>: it uses
+a single allocation per pair inserted into the map, it offers log(n) lookup with
+an extremely large constant factor, imposes a space penalty of 3 pointers per
+pair in the map, etc.</p>
+
+<p>std::map is most useful when your keys or values are very large, if you need
+to iterate over the collection in sorted order, or if you need stable iterators
+into the map (i.e. they don't get invalidated if an insertion or deletion of
+another element takes place).</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_othermap">Other Map-Like Container Options</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+The STL provides several other options, such as std::multimap and the various
+"hash_map" like containers (whether from C++ TR1 or from the SGI library).</p>
+
+<p>std::multimap is useful if you want to map a key to multiple values, but has
+all the drawbacks of std::map. A sorted vector or some other approach is almost
+always better.</p>
+
+<p>The various hash_map implementations (exposed portably by
+"llvm/ADT/hash_map") are simple chained hash tables. This algorithm is as
+malloc intensive as std::map (performing an allocation for each element
+inserted, thus having really high constant factors) but (usually) provides O(1)
+insertion/deletion of elements. This can be useful if your elements are large
+(thus making the constant-factor cost relatively low) or if comparisons are
+expensive. Element iteration does not visit elements in a useful order.</p>
+
+</div>
+
+
+<!-- *********************************************************************** -->
+<div class="doc_section">
+ <a name="common">Helpful Hints for Common Operations</a>
+</div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+
+<p>This section describes how to perform some very simple transformations of
+LLVM code. This is meant to give examples of common idioms used, showing the
+practical side of LLVM transformations. <p> Because this is a "how-to" section,
+you should also read about the main classes that you will be working with. The
+<a href="#coreclasses">Core LLVM Class Hierarchy Reference</a> contains details
+and descriptions of the main classes that you should know about.</p>
+
+</div>
+
+<!-- NOTE: this section should be heavy on example code -->
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="inspection">Basic Inspection and Traversal Routines</a>
+</div>
+
+<div class="doc_text">
+
+<p>The LLVM compiler infrastructure have many different data structures that may
+be traversed. Following the example of the C++ standard template library, the
+techniques used to traverse these various data structures are all basically the
+same. For a enumerable sequence of values, the <tt>XXXbegin()</tt> function (or
+method) returns an iterator to the start of the sequence, the <tt>XXXend()</tt>
+function returns an iterator pointing to one past the last valid element of the
+sequence, and there is some <tt>XXXiterator</tt> data type that is common
+between the two operations.</p>
+
+<p>Because the pattern for iteration is common across many different aspects of
+the program representation, the standard template library algorithms may be used
+on them, and it is easier to remember how to iterate. First we show a few common
+examples of the data structures that need to be traversed. Other data
+structures are traversed in very similar ways.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="iterate_function">Iterating over the </a><a
+ href="#BasicBlock"><tt>BasicBlock</tt></a>s in a <a
+ href="#Function"><tt>Function</tt></a>
+</div>
+
+<div class="doc_text">
+
+<p>It's quite common to have a <tt>Function</tt> instance that you'd like to
+transform in some way; in particular, you'd like to manipulate its
+<tt>BasicBlock</tt>s. To facilitate this, you'll need to iterate over all of
+the <tt>BasicBlock</tt>s that constitute the <tt>Function</tt>. The following is
+an example that prints the name of a <tt>BasicBlock</tt> and the number of
+<tt>Instruction</tt>s it contains:</p>
+
+<div class="doc_code">
+<pre>
+// <i>func is a pointer to a Function instance</i>
+for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
+ // <i>Print out the name of the basic block if it has one, and then the</i>
+ // <i>number of instructions that it contains</i>
+ llvm::cerr << "Basic block (name=" << i->getName() << ") has "
+ << i->size() << " instructions.\n";
+</pre>
+</div>
+
+<p>Note that i can be used as if it were a pointer for the purposes of
+invoking member functions of the <tt>Instruction</tt> class. This is
+because the indirection operator is overloaded for the iterator
+classes. In the above code, the expression <tt>i->size()</tt> is
+exactly equivalent to <tt>(*i).size()</tt> just like you'd expect.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="iterate_basicblock">Iterating over the </a><a
+ href="#Instruction"><tt>Instruction</tt></a>s in a <a
+ href="#BasicBlock"><tt>BasicBlock</tt></a>
+</div>
+
+<div class="doc_text">
+
+<p>Just like when dealing with <tt>BasicBlock</tt>s in <tt>Function</tt>s, it's
+easy to iterate over the individual instructions that make up
+<tt>BasicBlock</tt>s. Here's a code snippet that prints out each instruction in
+a <tt>BasicBlock</tt>:</p>
+
+<div class="doc_code">
+<pre>
+// <i>blk is a pointer to a BasicBlock instance</i>
+for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
+ // <i>The next statement works since operator<<(ostream&,...)</i>
+ // <i>is overloaded for Instruction&</i>
+ llvm::cerr << *i << "\n";
+</pre>
+</div>
+
+<p>However, this isn't really the best way to print out the contents of a
+<tt>BasicBlock</tt>! Since the ostream operators are overloaded for virtually
+anything you'll care about, you could have just invoked the print routine on the
+basic block itself: <tt>llvm::cerr << *blk << "\n";</tt>.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="iterate_institer">Iterating over the </a><a
+ href="#Instruction"><tt>Instruction</tt></a>s in a <a
+ href="#Function"><tt>Function</tt></a>
+</div>
+
+<div class="doc_text">
+
+<p>If you're finding that you commonly iterate over a <tt>Function</tt>'s
+<tt>BasicBlock</tt>s and then that <tt>BasicBlock</tt>'s <tt>Instruction</tt>s,
+<tt>InstIterator</tt> should be used instead. You'll need to include <a
+href="/doxygen/InstIterator_8h-source.html"><tt>llvm/Support/InstIterator.h</tt></a>,
+and then instantiate <tt>InstIterator</tt>s explicitly in your code. Here's a
+small example that shows how to dump all instructions in a function to the standard error stream:<p>
+
+<div class="doc_code">
+<pre>
+#include "<a href="/doxygen/InstIterator_8h-source.html">llvm/Support/InstIterator.h</a>"
+
+// <i>F is a pointer to a Function instance</i>
+for (inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i)
+ llvm::cerr << *i << "\n";
+</pre>
+</div>
+
+<p>Easy, isn't it? You can also use <tt>InstIterator</tt>s to fill a
+work list with its initial contents. For example, if you wanted to
+initialize a work list to contain all instructions in a <tt>Function</tt>
+F, all you would need to do is something like:</p>
+
+<div class="doc_code">
+<pre>
+std::set<Instruction*> worklist;
+worklist.insert(inst_begin(F), inst_end(F));
+</pre>
+</div>
+
+<p>The STL set <tt>worklist</tt> would now contain all instructions in the
+<tt>Function</tt> pointed to by F.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="iterate_convert">Turning an iterator into a class pointer (and
+ vice-versa)</a>
+</div>
+
+<div class="doc_text">
+
+<p>Sometimes, it'll be useful to grab a reference (or pointer) to a class
+instance when all you've got at hand is an iterator. Well, extracting
+a reference or a pointer from an iterator is very straight-forward.
+Assuming that <tt>i</tt> is a <tt>BasicBlock::iterator</tt> and <tt>j</tt>
+is a <tt>BasicBlock::const_iterator</tt>:</p>
+
+<div class="doc_code">
+<pre>
+Instruction& inst = *i; // <i>Grab reference to instruction reference</i>
+Instruction* pinst = &*i; // <i>Grab pointer to instruction reference</i>
+const Instruction& inst = *j;
+</pre>
+</div>
+
+<p>However, the iterators you'll be working with in the LLVM framework are
+special: they will automatically convert to a ptr-to-instance type whenever they
+need to. Instead of dereferencing the iterator and then taking the address of
+the result, you can simply assign the iterator to the proper pointer type and
+you get the dereference and address-of operation as a result of the assignment
+(behind the scenes, this is a result of overloading casting mechanisms). Thus
+the last line of the last example,</p>
+
+<div class="doc_code">
+<pre>
+Instruction* pinst = &*i;
+</pre>
+</div>
+
+<p>is semantically equivalent to</p>
+
+<div class="doc_code">
+<pre>
+Instruction* pinst = i;
+</pre>
+</div>
+
+<p>It's also possible to turn a class pointer into the corresponding iterator,
+and this is a constant time operation (very efficient). The following code
+snippet illustrates use of the conversion constructors provided by LLVM
+iterators. By using these, you can explicitly grab the iterator of something
+without actually obtaining it via iteration over some structure:</p>
+
+<div class="doc_code">
+<pre>
+void printNextInstruction(Instruction* inst) {
+ BasicBlock::iterator it(inst);
+ ++it; // <i>After this line, it refers to the instruction after *inst</i>
+ if (it != inst->getParent()->end()) llvm::cerr << *it << "\n";
+}
+</pre>
+</div>
+
+</div>
+
+<!--_______________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="iterate_complex">Finding call sites: a slightly more complex
+ example</a>
+</div>
+
+<div class="doc_text">
+
+<p>Say that you're writing a FunctionPass and would like to count all the
+locations in the entire module (that is, across every <tt>Function</tt>) where a
+certain function (i.e., some <tt>Function</tt>*) is already in scope. As you'll
+learn later, you may want to use an <tt>InstVisitor</tt> to accomplish this in a
+much more straight-forward manner, but this example will allow us to explore how
+you'd do it if you didn't have <tt>InstVisitor</tt> around. In pseudo-code, this
+is what we want to do:</p>
+
+<div class="doc_code">
+<pre>
+initialize callCounter to zero
+for each Function f in the Module
+ for each BasicBlock b in f
+ for each Instruction i in b
+ if (i is a CallInst and calls the given function)
+ increment callCounter
+</pre>
+</div>
+
+<p>And the actual code is (remember, because we're writing a
+<tt>FunctionPass</tt>, our <tt>FunctionPass</tt>-derived class simply has to
+override the <tt>runOnFunction</tt> method):</p>
+
+<div class="doc_code">
+<pre>
+Function* targetFunc = ...;
+
+class OurFunctionPass : public FunctionPass {
+ public:
+ OurFunctionPass(): callCounter(0) { }
+
+ virtual runOnFunction(Function& F) {
+ for (Function::iterator b = F.begin(), be = F.end(); b != be; ++b) {
+ for (BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) {
+ if (<a href="#CallInst">CallInst</a>* callInst = <a href="#isa">dyn_cast</a><<a
+ href="#CallInst">CallInst</a>>(&*i)) {
+ // <i>We know we've encountered a call instruction, so we</i>
+ // <i>need to determine if it's a call to the</i>
+ // <i>function pointed to by m_func or not</i>
+
+ if (callInst->getCalledFunction() == targetFunc)
+ ++callCounter;
+ }
+ }
+ }
+ }
+
+ private:
+ unsigned callCounter;
+};
+</pre>
+</div>
+
+</div>
+
+<!--_______________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="calls_and_invokes">Treating calls and invokes the same way</a>
+</div>
+
+<div class="doc_text">
+
+<p>You may have noticed that the previous example was a bit oversimplified in
+that it did not deal with call sites generated by 'invoke' instructions. In
+this, and in other situations, you may find that you want to treat
+<tt>CallInst</tt>s and <tt>InvokeInst</tt>s the same way, even though their
+most-specific common base class is <tt>Instruction</tt>, which includes lots of
+less closely-related things. For these cases, LLVM provides a handy wrapper
+class called <a
+href="http://llvm.org/doxygen/classllvm_1_1CallSite.html"><tt>CallSite</tt></a>.
+It is essentially a wrapper around an <tt>Instruction</tt> pointer, with some
+methods that provide functionality common to <tt>CallInst</tt>s and
+<tt>InvokeInst</tt>s.</p>
+
+<p>This class has "value semantics": it should be passed by value, not by
+reference and it should not be dynamically allocated or deallocated using
+<tt>operator new</tt> or <tt>operator delete</tt>. It is efficiently copyable,
+assignable and constructable, with costs equivalents to that of a bare pointer.
+If you look at its definition, it has only a single pointer member.</p>
+
+</div>
+
+<!--_______________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="iterate_chains">Iterating over def-use & use-def chains</a>
+</div>
+
+<div class="doc_text">
+
+<p>Frequently, we might have an instance of the <a
+href="/doxygen/classllvm_1_1Value.html">Value Class</a> and we want to
+determine which <tt>User</tt>s use the <tt>Value</tt>. The list of all
+<tt>User</tt>s of a particular <tt>Value</tt> is called a <i>def-use</i> chain.
+For example, let's say we have a <tt>Function*</tt> named <tt>F</tt> to a
+particular function <tt>foo</tt>. Finding all of the instructions that
+<i>use</i> <tt>foo</tt> is as simple as iterating over the <i>def-use</i> chain
+of <tt>F</tt>:</p>
+
+<div class="doc_code">
+<pre>
+Function* F = ...;
+
+for (Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i)
+ if (Instruction *Inst = dyn_cast<Instruction>(*i)) {
+ llvm::cerr << "F is used in instruction:\n";
+ llvm::cerr << *Inst << "\n";
+ }
+</pre>
+</div>
+
+<p>Alternately, it's common to have an instance of the <a
+href="/doxygen/classllvm_1_1User.html">User Class</a> and need to know what
+<tt>Value</tt>s are used by it. The list of all <tt>Value</tt>s used by a
+<tt>User</tt> is known as a <i>use-def</i> chain. Instances of class
+<tt>Instruction</tt> are common <tt>User</tt>s, so we might want to iterate over
+all of the values that a particular instruction uses (that is, the operands of
+the particular <tt>Instruction</tt>):</p>
+
+<div class="doc_code">
+<pre>
+Instruction* pi = ...;
+
+for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) {
+ Value* v = *i;
+ // <i>...</i>
+}
+</pre>
+</div>
+
+<!--
+ def-use chains ("finding all users of"): Value::use_begin/use_end
+ use-def chains ("finding all values used"): User::op_begin/op_end [op=operand]
+-->
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="simplechanges">Making simple changes</a>
+</div>
+
+<div class="doc_text">
+
+<p>There are some primitive transformation operations present in the LLVM
+infrastructure that are worth knowing about. When performing
+transformations, it's fairly common to manipulate the contents of basic
+blocks. This section describes some of the common methods for doing so
+and gives example code.</p>
+
+</div>
+
+<!--_______________________________________________________________________-->
+<div class="doc_subsubsection">
+ <a name="schanges_creating">Creating and inserting new
+ <tt>Instruction</tt>s</a>
+</div>
+
+<div class="doc_text">
+
+<p><i>Instantiating Instructions</i></p>
+
+<p>Creation of <tt>Instruction</tt>s is straight-forward: simply call the
+constructor for the kind of instruction to instantiate and provide the necessary
+parameters. For example, an <tt>AllocaInst</tt> only <i>requires</i> a
+(const-ptr-to) <tt>Type</tt>. Thus:</p>
+
+<div class="doc_code">
+<pre>
+AllocaInst* ai = new AllocaInst(Type::IntTy);
+</pre>
+</div>
+
+<p>will create an <tt>AllocaInst</tt> instance that represents the allocation of
+one integer in the current stack frame, at run time. Each <tt>Instruction</tt>
+subclass is likely to have varying default parameters which change the semantics
+of the instruction, so refer to the <a
+href="/doxygen/classllvm_1_1Instruction.html">doxygen documentation for the subclass of
+Instruction</a> that you're interested in instantiating.</p>