X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=85d53dc25193738170405593ee472d2a83300d8f;hb=9fff7e194a2d8aa3abe92efa506b1fbe83583f53;hp=4f458fac553fa7422ba6508b8ad98169b42d1b96;hpb=72ef35ea5f4d53e94160f21bc8fa0412b7774301;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index 4f458fac553..85d53dc2519 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -12,7 +12,24 @@
Written by Dinakar Dhurjati - Chris Lattner, and +
Written by Chris Lattner, + Dinakar Dhurjati, and Joel Stanley
@@ -157,9 +161,15 @@ the subject that you can get, so it will not be discussed in this document.
Here are some useful links:
-
-Helpful Hints for Common Operations +Important and useful LLVM APIs |
+ + +
+ +The isa<>, cast<> and dyn_cast<> templates + |
+ +
+ + +
+ +
+static bool isLoopInvariant(const Value *V, const Loop *L) { + if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V)) + return true; + + // Otherwise, it must be an instruction... + return !L->contains(cast<Instruction>(V)->getParent()); +
+ +Note that you should not use an isa<> test followed by a +cast<>, for that use the dyn_cast<> operator.
+ + +
+ +
+ if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) { + ... + } +
+ +This form of the if statement effectively combines together a call to +isa<> and a call to cast<> into one statement, +which is very convenient.
+ +Another common example is:
+ +
+ // Loop over all of the phi nodes in a basic block + BasicBlock::iterator BBI = BB->begin(); + for (; PHINode *PN = dyn_cast<PHINode>(BBI); ++BBI) + cerr << *PN; +
+ +Note that the dyn_cast<> operator, like C++'s +dynamic_cast or Java's instanceof operator, can be abused. In +particular you should not use big chained if/then/else blocks to check +for lots of different variants of classes. If you find yourself wanting to do +this, it is much cleaner and more efficient to use the InstVisitor class to +dispatch over the instruction type directly.
+ + +
+ + +
+ +
+ + + +
+ +The DEBUG() macro & -debug option + |
+ +Naturally, because of this, you don't want to delete the debug printouts, but +you don't want them to always be noisy. A standard compromise is to comment +them out, allowing you to enable them if you need them in the future.
+ +The "Support/Statistic.h" +file provides a macro named DEBUG() that is a much nicer solution to +this problem. Basically, you can put arbitrary code into the argument of the +DEBUG macro, and it is only executed if 'opt' is run with the +'-debug' command line argument: + +
+ ... + DEBUG(std::cerr << "I am here!\n"); + ... +
+ +Then you can run your pass like this:
+ +
+ $ opt < a.bc > /dev/null -mypass + <no output> + $ opt < a.bc > /dev/null -mypass -debug + I am here! + $ +
+ +Using the DEBUG() macro instead of a home brewed solution allows you to +now have to create "yet another" command line option for the debug output for +your pass. Note that DEBUG() macros are disabled for optimized builds, +so they do not cause a performance impact at all (for the same reason, they +should also not contain side-effects!).
+ +One additional nice thing about the DEBUG() macro is that you can +enable or disable it directly in gdb. Just use "set DebugFlag=0" or +"set DebugFlag=1" from the gdb if the program is running. If the +program hasn't been started yet, you can always just run it with +-debug.
+ + + +
+ +The Statistic template & -stats +option + |
+ +Often you may run your pass on some big program, and you're interested to see +how many times it makes a certain transformation. Although you can do this with +hand inspection, or some ad-hoc method, this is a real pain and not very useful +for big programs. Using the Statistic template makes it very easy to +keep track of this information, and the calculated information is presented in a +uniform manner with the rest of the passes being executed.
+ +There are many examples of Statistic users, but this basics of using it +are as follows:
+ +
+ +
+static Statistic<> NumXForms("mypassname", "The # of times I did stuff"); +
+ +The Statistic template can emulate just about any data-type, but if you +do not specify a template argument, it defaults to acting like an unsigned int +counter (this is usually what you want).
+ +
+ +
+ ++NumXForms; // I did stuff +
+ +
+ +That's all you have to do. To get 'opt' to print out the statistics +gathered, use the '-stats' option:
+ +
+ $ opt -stats -mypassname < program.bc > /dev/null + ... statistic output ... +
+ +When running gccas on a C file from the SPEC benchmark suite, it gives +a report that looks like this:
+ +
+ 7646 bytecodewriter - Number of normal instructions + 725 bytecodewriter - Number of oversized instructions + 129996 bytecodewriter - Number of bytecode bytes written + 2817 raise - Number of insts DCEd or constprop'd + 3213 raise - Number of cast-of-self removed + 5046 raise - Number of expression trees converted + 75 raise - Number of other getelementptr's formed + 138 raise - Number of load/store peepholes + 42 deadtypeelim - Number of unused typenames removed from symtab + 392 funcresolve - Number of varargs functions resolved + 27 globaldce - Number of global variables removed + 2 adce - Number of basic blocks removed + 134 cee - Number of branches revectored + 49 cee - Number of setcc instruction eliminated + 532 gcse - Number of loads removed + 2919 gcse - Number of instructions removed + 86 indvars - Number of cannonical indvars added + 87 indvars - Number of aux indvars removed + 25 instcombine - Number of dead inst eliminate + 434 instcombine - Number of insts combined + 248 licm - Number of load insts hoisted + 1298 licm - Number of insts hoisted to a loop pre-header + 3 licm - Number of insts hoisted to multiple loop preds (bad, no loop pre-header) + 75 mem2reg - Number of alloca's promoted + 1444 cfgsimplify - Number of blocks simplified +
+ +Obviously, with so many optimizations, having a unified framework for this stuff +is very nice. Making your pass fit well into the framework makes it more +maintainable and useful.
+ + + +
+Helpful Hints for Common Operations + |
@@ -206,13 +474,26 @@ that you should know about.
Basic Inspection and Traversal Routines
+ +Because the pattern for iteration is common across many different aspects of the +program representation, the standard template library algorithms may be used on +them, and it is easier to remember how to iterate. First we show a few common +examples of the data structures that need to be traversed. Other data +structures are traversed in very similar ways.
- -
// blk is a pointer to a BasicBlock instance - for(BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i) { + for(BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i) // the next statement works since operator<<(ostream&,...) // is overloaded for Instruction& - - cerr << *i << endl; - } + cerr << *i << "\n";However, this isn't really the best way to print out the contents of a @@ -271,9 +551,45 @@ the pointer value you might expect. This is a deprecated interface that will be removed in the future, so it's best not to depend on it. To print out the pointer value for now, you must cast to void*.
+ + +
+#include "llvm/Support/InstIterator.h" +... +// Suppose F is a ptr to a function +for(inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i) + cerr << **i << "\n"; ++ +Easy, isn't it? You can also use InstIterators to fill a +worklist with its initial contents. For example, if you wanted to +initialize a worklist to contain all instructions in a +Function F, all you would need to do is something like: + +
+std::set<Instruction*> worklist; +worklist.insert(inst_begin(F), inst_end(F)); ++ +The STL set worklist would now contain all instructions in +the Function pointed to by F. +
Instruction* pinst = i;-Caveat emptor: The above syntax works only when you're -not working with dyn_cast. The template definition of -dyn_cast isn't implemented to handle this yet, so you'll -still need the following in order for things to work properly: - -
-BasicBlock::iterator bbi = ...; -BranchInst* b = dyn_cast<BranchInst>(&*bbi); -- -The following code snippet illustrates use of the conversion -constructors provided by LLVM iterators. By using these, you can -explicitly grab the iterator of something without actually obtaining -it via iteration over some structure: +It's also possible to turn a class pointer into the corresponding +iterator. Usually, this conversion is quite inexpensive. The +following code snippet illustrates use of the conversion constructors +provided by LLVM iterators. By using these, you can explicitly grab +the iterator of something without actually obtaining it via iteration +over some structure:
void printNextInstruction(Instruction* inst) { BasicBlock::iterator it(inst); ++it; // after this line, it refers to the instruction after *inst. - if(it != inst->getParent()->end()) cerr << *it << endl; + if(it != inst->getParent()->end()) cerr << *it << "\n"; }- -Of course, this example is strictly pedagogical, because it'd be -better to do something like - -
if(inst->getNext()) cerr << inst->getNext() << endl;+Of course, this example is strictly pedagogical, because it'd be much +better to explicitly grab the next instruction directly from inst. - - -
initialize callCounter to zero for each Function f in the Module for each BasicBlock b in f for each Instruction i in b - if(i is a CallInst and foo is the function it calls) + if(i is a CallInst and calls the given function) increment callCounterAnd the actual code is (remember, since we're writing a -FunctionPass our FunctionPass-derived class simply +FunctionPass, our FunctionPass-derived class simply has to override the runOnFunction method...):
+Function* targetFunc = ...; + +class OurFunctionPass : public FunctionPass { + public: + OurFunctionPass(): callCounter(0) { } + + virtual runOnFunction(Function& F) { + for(Function::iterator b = F.begin(), be = F.end(); b != be; ++b) { + for(BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) { + if (CallInst* callInst = dyn_cast<CallInst>(&*i)) { + // we know we've encountered a call instruction, so we + // need to determine if it's a call to the + // function pointed to by m_func or not. + + if(callInst->getCalledFunction() == targetFunc) + ++callCounter; + } + } + } + + private: + unsigned callCounter; +}; +-// Assume callCounter is a private member of the pass class being written, -// and has been initialized in the pass class constructor. + +
+Function* F = ...; - vector- // Start iterating and (as per the pseudocode), increment callCounter. +Alternately, it's common to have an instance of the User Class and need to know what +Values are used by it. The list of all Values used +by a User is known as a use-def chain. Instances of +class Instruction are common Users, so we might want +to iterate over all of the values that a particular instruction uses +(that is, the operands of the particular Instruction): - for(Function::iterator b = F.begin(), be = F.end(); b != be; ++b) { - for(BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) { - if(CallInst* callInst = dyn_castparams; - params.push_back(Type::IntTy); - const FunctionType* fooType = FunctionType::get(Type::IntTy, params); - Function* foo = F.getParent()->getOrInsertFunction("foo", fooType); +for(Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i) { + if(Instruction* Inst = dyn_cast<Instruction>(*i)) { + cerr << "F is used in instruction:\n"; + cerr << *Inst << "\n"; + } +} +
+Instruction* pi = ...; - if(callInst->getCalledFunction() == foo) - ++callCounter; - } - } - } +for(User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) { + Value* v = *i; + ... }+ -We could then print out the value of callCounter (if we wanted to) -inside the doFinalization method of our FunctionPass. - +
Creation of Instructions is straightforward: simply call the +constructor for the kind of instruction to instantiate and provide the +necessary parameters. For example, an AllocaInst only +requires a (const-ptr-to) Type. Thus: + +
AllocaInst* ai = new AllocaInst(Type::IntTy);+ +will create an AllocaInst instance that represents the +allocation of one integer in the current stack frame, at runtime. +Each Instruction subclass is likely to have varying default +parameters which change the semantics of the instruction, so refer to +the doxygen documentation for +the subclass of Instruction that you're interested in +instantiating. + +
Naming values
+ ++It is very useful to name the values of instructions when you're able +to, as this facilitates the debugging of your transformations. If you +end up looking at generated LLVM machine code, you definitely want to +have logical names associated with the results of instructions! By +supplying a value for the Name (default) parameter of the +Instruction constructor, you associate a logical name with +the result of the instruction's execution at runtime. For example, +say that I'm writing a transformation that dynamically allocates space +for an integer on the stack, and that integer is going to be used as +some kind of index by some other code. To accomplish this, I place an +AllocaInst at the first point in the first +BasicBlock of some Function, and I'm intending to +use it within the same Function. I might do: + +
AllocaInst* pa = new AllocaInst(Type::IntTy, 0, "indexLoc");+ +where indexLoc is now the logical name of the instruction's +execution value, which is a pointer to an integer on the runtime +stack. + + +
Inserting instructions
+ ++There are essentially two ways to insert an Instruction into +an existing sequence of instructions that form a BasicBlock: +
Given a BasicBlock* pb, an Instruction* pi within +that BasicBlock, and a newly-created instruction +we wish to insert before *pi, we do the following: ---> +
+BasicBlock* pb = ...; +Instruction* pi = ...; +Instruction* newInst = new Instruction(...); +pb->getInstList().insert(pi, newInst); // inserts newInst before pi in pb ++ + +
Instruction instances that are already in +BasicBlocks are implicitly associated with an existing +instruction list: the instruction list of the enclosing basic block. +Thus, we could have accomplished the same thing as the above code +without being given a BasicBlock by doing: +
+Instruction* pi = ...; +Instruction* newInst = new Instruction(...); +pi->getParent()->getInstList().insert(pi, newInst); ++In fact, this sequence of steps occurs so frequently that the +Instruction class and Instruction-derived classes +provide constructors which take (as a default parameter) a pointer to +an Instruction which the newly-created Instruction +should precede. That is, Instruction constructors are +capable of inserting the newly-created instance into the +BasicBlock of a provided instruction, immediately before that +instruction. Using an Instruction constructor with a +insertBefore (default) parameter, the above code becomes: +
+Instruction* pi = ...; +Instruction* newInst = new Instruction(..., pi); ++which is much cleaner, especially if you're creating a lot of +instructions and adding them to BasicBlocks. + + +
+ +For example:
+ +
+ Instruction *I = .. ; + BasicBlock *BB = I->getParent(); + BB->getInstList().erase(I); +
+ + +
Replacing individual instructions
++Including "llvm/Transforms/Utils/BasicBlockUtils.h +" permits use of two very useful replace functions: +ReplaceInstWithValue and ReplaceInstWithInst. + +
This function replaces all uses (within a basic block) of a given +instruction with a value, and then removes the original instruction. +The following example illustrates the replacement of the result of a +particular AllocaInst that allocates memory for a single +integer with an null pointer to an integer.
+ ++AllocaInst* instToReplace = ...; +BasicBlock::iterator ii(instToReplace); +ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii, + Constant::getNullValue(PointerType::get(Type::IntTy))); ++ +
This function replaces a particular instruction with another +instruction. The following example illustrates the replacement of one +AllocaInst with another.
+ +
+AllocaInst* instToReplace = ...; +BasicBlock::iterator ii(instToReplace); +ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii, + new AllocaInst(Type::IntTy, 0, "ptrToReplacedInt")); ++
Replacing multiple uses of Users and + Values
+ +You can use Value::replaceAllUsesWith and +User::replaceUsesOfWith to change more than one use at a +time. See the doxygen documentation for the Value Class and User Class, respectively, for more +information. + +