<li><a href="#general">General Information</a>
<ul>
<li><a href="#stl">The C++ Standard Template Library</a>
- <li>The isa<>, cast<> and dyn_cast<> templates
+ <li><a href="#isa">The <tt>isa<></tt>, <tt>cast<></tt> and
+ <tt>dyn_cast<></tt> templates</a>
</ul>
<li><a href="#common">Helpful Hints for Common Operations</a>
<ul>
<li>
<li>
</ul>
--->
<li>Useful LLVM APIs
<ul>
- <li>isa<>, cast<>, and dyn_cast<> templates
-<!--
<li>The general graph API
<li>The <tt>InstVisitor</tt> template
<li>The DEBUG() macro
to write maintainable code more than where to put your curly braces.<p>
+<!-- ======================================================================= -->
+</ul><table width="100%" bgcolor="#441188" border=0 cellpadding=4 cellspacing=0>
+<tr><td> </td><td width="100%">
+<font color="#EEEEFF" face="Georgia,Palatino"><b>
+<a name="isa">The isa<>, cast<> and dyn_cast<> templates</a>
+</b></font></td></tr></table><ul>
+
+The LLVM source-base makes extensive use of a custom form of RTTI. These
+templates have many similarities to the C++ <tt>dynamic_cast<></tt>
+operator, but they don't have some drawbacks (primarily stemming from the fact
+that <tt>dynamic_cast<></tt> only works on classes that have a v-table).
+Because they are used so often, you must know what they do and how they work.
+All of these templates are defined in the <a
+href="/doxygen/Casting_8h-source.html"><tt>Support/Casting.h</tt></a> file (note
+that you very rarely have to include this file directly).<p>
+
+<dl>
+
+<dt><tt>isa<></tt>:
+
+<dd>The <tt>isa<></tt> operator works exactly like the Java
+"<tt>instanceof</tt>" operator. It returns true or false depending on whether a
+reference or pointer points to an instance of the specified class. This can be
+very useful for constraint checking of various sorts (example below).<p>
+
+
+<dt><tt>cast<></tt>:
+
+<dd>The <tt>cast<></tt> operator is a "checked cast" operation. It
+converts a pointer or reference from a base class to a derived cast, causing an
+assertion failure if it is not really an instance of the right type. This
+should be used in cases where you have some information that makes you believe
+that something is of the right type. An example of the <tt>isa<></tt> and
+<tt>cast<></tt> template is:<p>
+
+<pre>
+static bool isLoopInvariant(const <a href="#Value">Value</a> *V, const Loop *L) {
+ if (isa<<a href="#Constant">Constant</a>>(V) || isa<<a href="#Argument">Argument</a>>(V) || isa<<a href="#GlobalValue">GlobalValue</a>>(V))
+ return true;
+
+ <i>// Otherwise, it must be an instruction...</i>
+ return !L->contains(cast<<a href="#Instruction">Instruction</a>>(V)->getParent());
+</pre><p>
+
+Note that you should <b>not</b> use an <tt>isa<></tt> test followed by a
+<tt>cast<></tt>, for that use the <tt>dyn_cast<></tt> operator.<p>
+
+
+<dt><tt>dyn_cast<></tt>:
+
+<dd>The <tt>dyn_cast<></tt> operator is a "checking cast" operation. It
+checks to see if the operand is of the specified type, and if so, returns a
+pointer to it (this operator does not work with references). If the operand is
+not of the correct type, a null pointer is returned. Thus, this works very much
+like the <tt>dynamic_cast</tt> operator in C++, and should be used in the same
+circumstances. An example is:<p>
+
+<pre>
+ <i>// Loop over all of the phi nodes in a basic block</i>
+ BasicBlock::iterator BBI = BB->begin();
+ for (; <a href="#PhiNode">PHINode</a> *PN = dyn_cast<<a href="#PHINode">PHINode</a>>(&*BBI); ++BBI)
+ cerr << *PN;
+</pre><p>
+
+Note that you should not use the <tt>dyn_cast<></tt> operator in a series
+of chained if statements, use an visitor instead... FIXME: continue.<p>
+
+
+</dl>
+
+
+
<!-- *********************************************************************** -->
</ul><table width="100%" bgcolor="#330077" border=0 cellpadding=4 cellspacing=0>
<a name="inspection">Basic Inspection and Traversal Routines</a>
</b></font></td></tr></table><ul>
+The LLVM compiler infrastructure have many different data structures that may be
+traversed. Following the example of the C++ standard template library, the
+techniques used to traverse these various data structures are all basically the
+same. For a enumerable sequence of values, the <tt>XXXbegin()</tt> function (or
+method) returns an iterator to the start of the sequence, the <tt>XXXend()</tt>
+function returns an iterator pointing to one past the last valid element of the
+sequence, and there is some <tt>XXXiterator</tt> data type that is common
+between the two operations.<p>
+
+Because the pattern for iteration is common across many different aspects of the
+program representation, the standard template library algorithms may be used on
+them, and it is easier to remember how to iterate. First we show a few common
+examples of the data structures that need to be traversed. Other data
+structures are traversed in very similar ways.<p>
-<!-- LLVM has heirarchical representation: Module, Function, BasicBlock,
-Instruction. Common patterns for all levels. -->
<!-- _______________________________________________________________________ -->
-</ul><h4><a name="iterate_function"><hr size=0>Iterating over the
-<tt>BasicBlock</tt>s in a <tt>Function</tt> </h4><ul>
+</ul><h4><a name="iterate_function"><hr size=0>Iterating over the <a
+href="#BasicBlock"><tt>BasicBlock</tt></a>s in a <a
+href="#Function"><tt>Function</tt></a> </h4><ul>
It's quite common to have a <tt>Function</tt> instance that you'd like
to transform in some way; in particular, you'd like to manipulate its
exactly equivalent to <tt>(*i).size()</tt> just like you'd expect.
<!-- _______________________________________________________________________ -->
-</ul><h4><a name="iterate_basicblock"><hr size=0>Iterating over the
-<tt>Instruction</tt>s in a <tt>BasicBlock</tt> </h4><ul>
+</ul><h4><a name="iterate_basicblock"><hr size=0>Iterating over the <a
+href="#Instruction"><tt>Instruction</tt></a>s in a <a
+href="#BasicBlock"><tt>BasicBlock</tt></a> </h4><ul>
Just like when dealing with <tt>BasicBlock</tt>s in
<tt>Function</tt>s, it's easy to iterate over the individual
<pre>
// blk is a pointer to a BasicBlock instance
- for(BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i) {
+ for(BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
// the next statement works since operator<<(ostream&,...)
// is overloaded for Instruction&
- cerr << *i << endl;
+ cerr << *i << "\n";
</pre>
However, this isn't really the best way to print out the contents of a
be removed in the future, so it's best not to depend on it. To print out the
pointer value for now, you must cast to <tt>void*</tt>.<p>
-<!-- _______________________________________________________________________ -->
-</ul><h4><a name="iterate_institer"><hr size=0>Iterating over the
-<tt>Instruction</tt>s in a <tt>Function</tt></h4><ul>
-<!-- Using llvm/Support/InstIterator.h to directly get at the instructions in a
-function.
+<!-- _______________________________________________________________________ -->
+</ul><h4><a name="iterate_institer"><hr size=0>Iterating over the <a
+href="#Instruction"><tt>Instruction</tt></a>s in a <a
+href="#Function"><tt>Function</tt></a></h4><ul>
+
+If you're finding that you commonly iterate over a <tt>Function</tt>'s
+<tt>BasicBlock</tt>s and then that <tt>BasicBlock</tt>'s
+<tt>Instruction</tt>s, <tt>InstIterator</tt> should be used instead.
+You'll need to include <a href="/doxygen/InstIterator_8h-source.html"><tt>llvm/Support/InstIterator.h</tt></a>, and then
+instantiate <tt>InstIterator</tt>s explicitly in your code. Here's a
+small example that shows how to dump all instructions in a function to
+stderr (<b>Note:</b> Dereferencing an <tt>InstIterator</tt> yields an
+<tt>Instruction*</tt>, <i>not</i> an <tt>Instruction&</tt>!):
-Warning: *I returns an Instruction*, not an Instruction&
+<pre>
+#include "<a href="/doxygen/InstIterator_8h-source.html">llvm/Support/InstIterator.h</a>"
+...
+// Suppose F is a ptr to a function
+for(inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i)
+ cerr << **i << "\n";
+</pre>
- -->
+Easy, isn't it? You can also use <tt>InstIterator</tt>s to fill a
+worklist with its initial contents. For example, if you wanted to
+initialize a worklist to contain all instructions in a
+<tt>Function</tt> F, all you would need to do is something like:
+<pre>
+std::set<Instruction*> worklist;
+worklist.insert(inst_begin(F), inst_end(F));
+</pre>
+The STL set <tt>worklist</tt> would now contain all instructions in
+the <tt>Function</tt> pointed to by F.
<!-- _______________________________________________________________________ -->
</ul><h4><a name="iterate_convert"><hr size=0>Turning an iterator into a class
-pointer </h4><ul>
+pointer (and vice-versa) </h4><ul>
Sometimes, it'll be useful to grab a reference (or pointer) to a class
instance when all you've got at hand is an iterator. Well, extracting
<pre>Instruction* pinst = i;</pre>
-<b>Caveat emptor</b>: The above syntax works <i>only</i> when you're
-<i>not</i> working with <tt>dyn_cast</tt>. The template definition of
-<tt>dyn_cast</tt> isn't implemented to handle this yet, so you'll
+<b>Caveat emptor</b>: The above syntax works <i>only</i> when you're <i>not</i>
+working with <tt>dyn_cast</tt>. The template definition of <tt><a
+href="#isa">dyn_cast</a></tt> isn't implemented to handle this yet, so you'll
still need the following in order for things to work properly:
<pre>
BasicBlock::iterator bbi = ...;
-BranchInst* b = dyn_cast<BranchInst>(&*bbi);
+<a href="#BranchInst">BranchInst</a>* b = <a href="#isa">dyn_cast</a><<a href="#BranchInst">BranchInst</a>>(&*bbi);
</pre>
-The following code snippet illustrates use of the conversion
-constructors provided by LLVM iterators. By using these, you can
-explicitly grab the iterator of something without actually obtaining
-it via iteration over some structure:
+It's also possible to turn a class pointer into the corresponding
+iterator. Usually, this conversion is quite inexpensive. The
+following code snippet illustrates use of the conversion constructors
+provided by LLVM iterators. By using these, you can explicitly grab
+the iterator of something without actually obtaining it via iteration
+over some structure:
<pre>
void printNextInstruction(Instruction* inst) {
BasicBlock::iterator it(inst);
++it; // after this line, it refers to the instruction after *inst.
- if(it != inst->getParent()->end()) cerr << *it << endl;
+ if(it != inst->getParent()->end()) cerr << *it << "\n";
}
</pre>
Of course, this example is strictly pedagogical, because it'd be much
better to explicitly grab the next instruction directly from inst.
-<!-- dereferenced iterator = Class &
- iterators have converting constructor for 'Class *'
- iterators automatically convert to 'Class *' except in dyn_cast<> case
- -->
<!--_______________________________________________________________________-->
</ul><h4><a name="iterate_complex"><hr size=0>Finding call sites: a slightly
more complex example </h4><ul>
Say that you're writing a FunctionPass and would like to count all the
-locations in the entire module (that is, across every <tt>Function</tt>)
-where a certain function named foo (that takes an int and returns an
-int) is called. As you'll learn later, you may want to use an
-<tt>InstVisitor</tt> to accomplish this in a much more straightforward
-manner, but this example will allow us to explore how you'd do it if
-you didn't have <tt>InstVisitor</tt> around. In pseudocode, this is
-what we want to do:
+locations in the entire module (that is, across every
+<tt>Function</tt>) where a certain function (i.e. some
+<tt>Function</tt>*) already in scope. As you'll learn later, you may
+want to use an <tt>InstVisitor</tt> to accomplish this in a much more
+straightforward manner, but this example will allow us to explore how
+you'd do it if you didn't have <tt>InstVisitor</tt> around. In
+pseudocode, this is what we want to do:
<pre>
initialize callCounter to zero
for each Function f in the Module
for each BasicBlock b in f
for each Instruction i in b
- if(i is a CallInst and foo is the function it calls)
+ if(i is a CallInst and calls the given function)
increment callCounter
</pre>
And the actual code is (remember, since we're writing a
-<tt>FunctionPass</tt> our <tt>FunctionPass</tt>-derived class simply
+<tt>FunctionPass</tt>, our <tt>FunctionPass</tt>-derived class simply
has to override the <tt>runOnFunction</tt> method...):
<pre>
-
-// Assume callCounter is a private member of the pass class being written,
-// and has been initialized in the pass class constructor.
-
-virtual runOnFunction(Function& F) {
-
- // Remember, we assumed that the signature of foo was "int foo(int)";
- // the first thing we'll do is grab the pointer to that function (as a
- // Function*) so we can use it later when we're examining the
- // parameters of a CallInst. All of the code before the call to
- // Module::getOrInsertFunction() is in preparation to do symbol-table
- // to find the function pointer.
-
- vector<const Type*> params;
- params.push_back(Type::IntTy);
- const FunctionType* fooType = FunctionType::get(Type::IntTy, params);
- Function* foo = F.getParent()->getOrInsertFunction("foo", fooType);
-
- // Start iterating and (as per the pseudocode), increment callCounter.
-
- for(Function::iterator b = F.begin(), be = F.end(); b != be; ++b) {
- for(BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) {
- if(CallInst* callInst = dyn_cast<CallInst>(&*inst)) {
- // we know we've encountered a call instruction, so we
- // need to determine if it's a call to foo or not
-
- if(callInst->getCalledFunction() == foo)
- ++callCounter;
- }
- }
+Function* targetFunc = ...;
+
+class OurFunctionPass : public FunctionPass {
+ public:
+ OurFunctionPass(): callCounter(0) { }
+
+ virtual runOnFunction(Function& F) {
+ for(Function::iterator b = F.begin(), be = F.end(); b != be; ++b) {
+ for(BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) {
+ if (<a href="#CallInst">CallInst</a>* callInst = <a href="#isa">dyn_cast</a><<a href="#CallInst">CallInst</a>>(&*inst)) {
+ // we know we've encountered a call instruction, so we
+ // need to determine if it's a call to the
+ // function pointed to by m_func or not.
+
+ if(callInst->getCalledFunction() == targetFunc)
+ ++callCounter;
+ }
+ }
}
-}
+
+ private:
+ unsigned callCounter;
+};
</pre>
-We could then print out the value of callCounter (if we wanted to)
-inside the doFinalization method of our FunctionPass.
-
-
<!--_______________________________________________________________________-->
</ul><h4><a name="iterate_chains"><hr size=0>Iterating over def-use &
use-def chains</h4><ul>
-
<!--
def-use chains ("finding all users of"): Value::use_begin/use_end
use-def chains ("finding all values used"): User::op_begin/op_end [op=operand]
<a href="mailto:sabre@nondot.org">Chris Lattner</a></address>
<!-- Created: Tue Aug 6 15:00:33 CDT 2002 -->
<!-- hhmts start -->
-Last modified: Mon Sep 9 00:48:53 CDT 2002
+Last modified: Mon Sep 9 19:38:23 CDT 2002
<!-- hhmts end -->
</font></body></html>