X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=61fc213190b912d66393cb92a7a2815ae7accb3f;hb=367373053b4c50cf84ec54d2300ddab14c60063e;hp=28a5079385735da767b9ffd2d13f19c0774c3564;hpb=a9030cb414dfc4d2e709e4ade2c06fe80ae9ab72;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index 28a50793857..61fc213190b 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -12,8 +12,28 @@
  • General Information +
  • Important and useful LLVM APIs +
  • Helpful Hints for Common Operations -
  • Useful LLVM APIs - -
  • The Core LLVM Class Hierarchy Reference @@ -161,9 +165,15 @@ the subject that you can get, so it will not be discussed in this document.

    Here are some useful links:

      -
    1. Dinkumware C++ +
    2. Dinkumware C++ Library reference - an excellent reference for the STL and other parts of -the standard C++ library.
      +the standard C++ library. + +
    3. C++ In a Nutshell - This is an +O'Reilly book in the making. It has a decent Standard Library +Reference that rivals Dinkumware's, and is actually free until the book is +published.
    4. C++ Frequently Asked Questions @@ -183,6 +193,16 @@ href="CodingStandards.html">LLVM Coding Standards guide which focuses on how to write maintainable code more than where to put your curly braces.

      + + +
      +Important and useful LLVM APIs +

        + + +Here we highlight some LLVM APIs that are generally useful and good to know +about when writing transformations.

        +

         @@ -224,7 +244,7 @@ static bool isLoopInvariant(const Value *V, const Loop *L) return true; // Otherwise, it must be an instruction... - return !L->contains(cast<Instruction>(V)->getParent()); + return !L->contains(cast<Instruction>(V)->getParent());

      Note that you should not use an isa<> test followed by a @@ -255,8 +275,8 @@ Another common example is:

         // Loop over all of the phi nodes in a basic block
      -  BasicBlock::iterator BBI = BB->begin();
      -  for (; PHINode *PN = dyn_cast<PHINode>(&*BBI); ++BBI)
      +  BasicBlock::iterator BBI = BB->begin();
      +  for (; PHINode *PN = dyn_cast<PHINode>(BBI); ++BBI)
           cerr << *PN;
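The loop above stops at the first instruction that is not a PHINode, because the cast then yields a null pointer and the loop condition fails. A minimal standalone sketch of that idiom follows. It is not LLVM code: InstSketch, PhiSketch, AddSketch and sumLeadingPhiIds are hypothetical names, and the standard dynamic_cast stands in for dyn_cast<>.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature class hierarchy (not the LLVM classes).
struct InstSketch { virtual ~InstSketch() {} };
struct PhiSketch : InstSketch {
  int Id;
  explicit PhiSketch(int I) : Id(I) {}
};
struct AddSketch : InstSketch {};

// Sums the ids of the phi nodes at the front of a block. The walk ends at the
// first non-phi element, where the checked cast returns a null pointer.
int sumLeadingPhiIds(const std::vector<InstSketch *> &Block) {
  int Sum = 0;
  std::vector<InstSketch *>::const_iterator I = Block.begin(), E = Block.end();
  for (; I != E; ++I) {
    assert(*I && "block must not contain null entries");
    if (PhiSketch *PN = dynamic_cast<PhiSketch *>(*I))
      Sum += PN->Id;   // still inside the leading run of phi nodes
    else
      break;           // first non-phi: stop, as the loop above does
  }
  return Sum;
}
```

Unlike the original loop, the sketch also checks for the end of the container, since a plain vector has no terminator instruction to stop on.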
       

      @@ -292,13 +312,203 @@ Describing this is currently outside the scope of this document, but there are lots of examples in the LLVM source base.

      + + +
         + +The DEBUG() macro & -debug option +

        + +Often when working on your pass you will put a bunch of debugging printouts and +other code into your pass. After you get it working, you want to remove +it... but you may need it again in the future (to work out new bugs that you run +across).

        + +Naturally, because of this, you don't want to delete the debug printouts, but +you don't want them to always be noisy. A standard compromise is to comment +them out, allowing you to enable them if you need them in the future.

        + +The "Support/Debug.h" file +provides a macro named DEBUG() that is a much nicer solution to this +problem. Basically, you can put arbitrary code into the argument of the +DEBUG macro, and it is only executed if 'opt' (or any other +tool) is run with the '-debug' command line argument: + +

        +     ... 
        +     DEBUG(std::cerr << "I am here!\n");
        +     ...
        +

        + +Then you can run your pass like this:

        + +

        +  $ opt < a.bc > /dev/null -mypass
        +    <no output>
        +  $ opt < a.bc > /dev/null -mypass -debug
        +    I am here!
        +  $
        +

        + +Using the DEBUG() macro instead of a home-brewed solution allows you to +not have to create "yet another" command line option for the debug output for +your pass. Note that DEBUG() macros are disabled for optimized builds, +so they do not cause a performance impact at all (for the same reason, they +should also not contain side-effects!).

        + +One additional nice thing about the DEBUG() macro is that you can +enable or disable it directly in gdb. Just use "set DebugFlag=0" or +"set DebugFlag=1" from gdb while the program is running. If the +program hasn't been started yet, you can always just run it with +-debug.
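To make the mechanism concrete, here is a minimal standalone sketch of how a macro in the spirit of DEBUG() can be built around a single global flag. It is an illustration only, not the contents of "Support/Debug.h": SketchDebugFlag, SKETCH_DEBUG and runPassSketch are hypothetical names, and the sketch does not model the optimized-build case in which the macro is disabled.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the flag that '-debug' (or gdb) would set.
static bool SketchDebugFlag = false;

// Hypothetical stand-in for DEBUG(): the argument runs only when the flag is on.
#define SKETCH_DEBUG(X) \
  do { if (SketchDebugFlag) { X; } } while (0)

// A pretend pass: it always does its work, and emits a debug message to OS
// only when debugging is enabled. Returns the number of changes made.
int runPassSketch(std::ostream &OS) {
  assert(OS.good() && "output stream must be usable");
  int NumChanges = 0;
  SKETCH_DEBUG(OS << "I am here!\n");
  ++NumChanges;  // the real work happens regardless of the debug flag
  return NumChanges;
}
```

Because the message sits inside the macro argument, nothing is printed unless the flag is set, which is also why flipping the flag from a debugger takes effect immediately.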

        + + +


      Fine grained debug info with + DEBUG_TYPE() and the -debug-only option

        + +Sometimes you may find yourself in a situation where enabling -debug +just turns on too much information (such as when working on the code +generator). If you want to enable debug information with more fine-grained +control, you define the DEBUG_TYPE macro and use the -debug-only +option as follows:

        + +

        +     ...
        +     DEBUG(std::cerr << "No debug type\n");
        +     #undef  DEBUG_TYPE
        +     #define DEBUG_TYPE "foo"
        +     DEBUG(std::cerr << "'foo' debug type\n");
        +     #undef  DEBUG_TYPE
        +     #define DEBUG_TYPE "bar"
        +     DEBUG(std::cerr << "'bar' debug type\n");
        +     #undef  DEBUG_TYPE
        +     #define DEBUG_TYPE ""
        +     DEBUG(std::cerr << "No debug type (2)\n");
        +     ...
        +

        + +Then you can run your pass like this:

        + +

        +  $ opt < a.bc > /dev/null -mypass
        +    <no output>
        +  $ opt < a.bc > /dev/null -mypass -debug
        +    No debug type
        +    'foo' debug type
        +    'bar' debug type
        +    No debug type (2)
        +  $ opt < a.bc > /dev/null -mypass -debug-only=foo
        +    'foo' debug type
        +  $ opt < a.bc > /dev/null -mypass -debug-only=bar
        +    'bar' debug type
        +  $
        +

        + +Of course, in practice, you should only set DEBUG_TYPE at the top of a +file, to specify the debug type for the entire module (if you do this before you +#include "Support/Debug.h", you don't have to insert the ugly +#undef's). Also, you should use names more meaningful than "foo" and +"bar", because there is no system in place to ensure that names do not conflict: +if different modules use the same string, all of them will be turned on when +the name is specified. This allows, say, all instruction scheduling debug +information to be enabled with -debug-only=InstrSched, even if the +source lives in multiple files.
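The filtering behavior shown in the transcript above can be sketched in a few lines of standalone C++. This is an assumption about one way to get that behavior, not LLVM's actual implementation: SketchTypedDebugFlag, SketchOnlyType, SKETCH_TYPED_DEBUG and runTypedSketch are hypothetical names, and an empty SketchOnlyType models plain -debug.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical state: the flag is on for '-debug' and '-debug-only=<name>';
// SketchOnlyType holds <name>, or stays empty when every type is wanted.
static bool SketchTypedDebugFlag = false;
static std::string SketchOnlyType;

static bool isWantedTypeSketch(const char *Type) {
  assert(Type && "DEBUG_TYPE must be a string");
  return SketchOnlyType.empty() || SketchOnlyType == Type;
}

// DEBUG_TYPE is expanded where the macro is used, so each message is tagged
// with whatever DEBUG_TYPE is defined at that point in the file.
#define SKETCH_TYPED_DEBUG(X)                                        \
  do {                                                               \
    if (SketchTypedDebugFlag && isWantedTypeSketch(DEBUG_TYPE)) { X; } \
  } while (0)

#define DEBUG_TYPE ""
std::string runTypedSketch() {
  std::ostringstream OS;
  SKETCH_TYPED_DEBUG(OS << "No debug type\n");
#undef DEBUG_TYPE
#define DEBUG_TYPE "foo"
  SKETCH_TYPED_DEBUG(OS << "'foo' debug type\n");
#undef DEBUG_TYPE
#define DEBUG_TYPE "bar"
  SKETCH_TYPED_DEBUG(OS << "'bar' debug type\n");
  return OS.str();
}
```

The point of the design is that the type string is chosen by the preprocessor at each use site, so a file only needs one definition at the top to tag all of its messages.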

        + + + +

      +
         + +The Statistic template & -stats +option +
        + +The "Support/Statistic.h" +file provides a template named Statistic that is used as a unified way +to keep track of what the LLVM compiler is doing and how effective various +optimizations are. It is useful to see what optimizations are contributing to +making a particular program run faster.

        + +Often you may run your pass on some big program, and you're interested to see +how many times it makes a certain transformation. Although you can do this with +hand inspection, or some ad-hoc method, this is a real pain and not very useful +for big programs. Using the Statistic template makes it very easy to +keep track of this information, and the calculated information is presented in a +uniform manner with the rest of the passes being executed.

        + +There are many examples of Statistic users, but the basics of using it +are as follows:

        + +

          +
        1. Define your statistic like this:

          + +

          +static Statistic<> NumXForms("mypassname", "The # of times I did stuff");
          +

          + +The Statistic template can emulate just about any data-type, but if you +do not specify a template argument, it defaults to acting like an unsigned int +counter (this is usually what you want).

          + +

        2. Whenever you make a transformation, bump the counter:

          + +

          +   ++NumXForms;   // I did stuff
          +

          + +

        + +That's all you have to do. To get 'opt' to print out the statistics +gathered, use the '-stats' option:

        + +

        +   $ opt -stats -mypassname < program.bc > /dev/null
        +    ... statistic output ...
        +

        + +When running gccas on a C file from the SPEC benchmark suite, it gives +a report that looks like this:

        + +

        +   7646 bytecodewriter  - Number of normal instructions
        +    725 bytecodewriter  - Number of oversized instructions
        + 129996 bytecodewriter  - Number of bytecode bytes written
        +   2817 raise           - Number of insts DCEd or constprop'd
        +   3213 raise           - Number of cast-of-self removed
        +   5046 raise           - Number of expression trees converted
        +     75 raise           - Number of other getelementptr's formed
        +    138 raise           - Number of load/store peepholes
        +     42 deadtypeelim    - Number of unused typenames removed from symtab
        +    392 funcresolve     - Number of varargs functions resolved
        +     27 globaldce       - Number of global variables removed
        +      2 adce            - Number of basic blocks removed
        +    134 cee             - Number of branches revectored
        +     49 cee             - Number of setcc instruction eliminated
        +    532 gcse            - Number of loads removed
        +   2919 gcse            - Number of instructions removed
        +     86 indvars         - Number of canonical indvars added
        +     87 indvars         - Number of aux indvars removed
        +     25 instcombine     - Number of dead inst eliminate
        +    434 instcombine     - Number of insts combined
        +    248 licm            - Number of load insts hoisted
        +   1298 licm            - Number of insts hoisted to a loop pre-header
        +      3 licm            - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
        +     75 mem2reg         - Number of alloca's promoted
        +   1444 cfgsimplify     - Number of blocks simplified
        +

        + +Obviously, with so many optimizations, having a unified framework for this stuff +is very nice. Making your pass fit well into the framework makes it more +maintainable and useful.
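For readers who want to see the shape of such a counter without the LLVM headers, here is a minimal standalone sketch. It is not the real Statistic template: SketchStatistic, NumXFormsSketch, applyTransformsSketch and the report() layout are hypothetical, and the sketch omits the registration that lets '-stats' print every counter at exit.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical counter template; defaults to an unsigned counter, mirroring
// the default described in the text above.
template <typename DataType = unsigned>
class SketchStatistic {
  const char *Name;
  const char *Desc;
  DataType Value;

public:
  SketchStatistic(const char *N, const char *D) : Name(N), Desc(D), Value() {
    assert(N && D && "a statistic needs a name and a description");
  }

  SketchStatistic &operator++() { ++Value; return *this; }
  operator DataType() const { return Value; }

  // One report line: right-aligned value, padded pass name, then description.
  std::string report() const {
    std::ostringstream OS;
    OS << std::right << std::setw(7) << Value << " "
       << std::left << std::setw(15) << Name << " - " << Desc << "\n";
    return OS.str();
  }
};

static SketchStatistic<> NumXFormsSketch("mypassname",
                                         "The # of times I did stuff");

// Pretend pass body: performs N transformations and bumps the counter each time.
void applyTransformsSketch(unsigned N) {
  for (unsigned i = 0; i != N; ++i)
    ++NumXFormsSketch;  // I did stuff
}
```

Keeping the counter as a file-level static means the pass code only ever needs the one-line increment at each transformation site.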

        +

      Helpful Hints for Common Operations -
        - +
        This section describes how to perform some very simple transformations of LLVM code. This is meant to give examples of common idioms used, showing the @@ -350,7 +560,7 @@ contains:
           // func is a pointer to a Function instance
        -  for(Function::iterator i = func->begin(), e = func->end(); i != e; ++i) {
        +  for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i) {
         
               // print out the name of the basic block if it has one, and then the
               // number of instructions that it contains
        @@ -363,7 +573,7 @@ contains:
         Note that i can be used as if it were a pointer for the purposes of
         invoking member functions of the Instruction class.  This is
         because the indirection operator is overloaded for the iterator
        -classes.  In the above code, the expression i->size() is
        +classes.  In the above code, the expression i->size() is
         exactly equivalent to (*i).size() just like you'd expect.
         
         
        @@ -378,7 +588,7 @@ that prints out each instruction in a BasicBlock:
         
         
           // blk is a pointer to a BasicBlock instance
        -  for(BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
        +  for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
              // the next statement works since operator<<(ostream&,...) 
              // is overloaded for Instruction&
              cerr << *i << "\n";
        @@ -415,7 +625,7 @@ stderr (Note: Dereferencing an InstIterator yields an
         #include "llvm/Support/InstIterator.h"
         ...
         // Suppose F is a ptr to a function
        -for(inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i)
        +for (inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i)
           cerr << **i << "\n";
         
        @@ -462,16 +672,6 @@ is semantically equivalent to
        Instruction* pinst = i;
        -Caveat emptor: The above syntax works only when you're not -working with dyn_cast. The template definition of dyn_cast isn't implemented to handle this yet, so you'll -still need the following in order for things to work properly: - -
        -BasicBlock::iterator bbi = ...;
        -BranchInst* b = dyn_cast<BranchInst>(&*bbi);
        -
        - It's also possible to turn a class pointer into the corresponding iterator. Usually, this conversion is quite inexpensive. The following code snippet illustrates use of the conversion constructors @@ -483,7 +683,7 @@ over some structure: void printNextInstruction(Instruction* inst) { BasicBlock::iterator it(inst); ++it; // after this line, it refers to the instruction after *inst. - if(it != inst->getParent()->end()) cerr << *it << "\n"; + if (it != inst->getParent()->end()) cerr << *it << "\n"; }
        Of course, this example is strictly pedagogical, because it'd be much @@ -496,8 +696,8 @@ more complex example
          Say that you're writing a FunctionPass and would like to count all the locations in the entire module (that is, across every -Function) where a certain function (i.e. some -Function*) already in scope. As you'll learn later, you may +Function) where a certain function (i.e., some +Function*) is already in scope. As you'll learn later, you may want to use an InstVisitor to accomplish this in a much more straightforward manner, but this example will allow us to explore how you'd do it if you didn't have InstVisitor around. In @@ -508,7 +708,7 @@ initialize callCounter to zero for each Function f in the Module for each BasicBlock b in f for each Instruction i in b - if(i is a CallInst and calls the given function) + if (i is a CallInst and calls the given function) increment callCounter @@ -524,14 +724,14 @@ class OurFunctionPass : public FunctionPass { OurFunctionPass(): callCounter(0) { } virtual runOnFunction(Function& F) { - for(Function::iterator b = F.begin(), be = F.end(); b != be; ++b) { - for(BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) { + for (Function::iterator b = F.begin(), be = F.end(); b != be; ++b) { + for (BasicBlock::iterator i = b->begin(); ie = b->end(); i != ie; ++i) { if (CallInst* callInst = dyn_cast<CallInst>(&*i)) { // we know we've encountered a call instruction, so we // need to determine if it's a call to the // function pointed to by m_func or not. - if(callInst->getCalledFunction() == targetFunc) + if (callInst->getCalledFunction() == targetFunc) ++callCounter; } } @@ -559,10 +759,10 @@ all Users of a particular Value is called a
           Function* F = ...;
           
          -for(Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i) {
          -    if(Instruction* i = dyn_cast<Instruction>(*i)) {
          -        cerr << "F is used in instruction:\n\t";
          -        cerr << *i << "\n";
          +for (Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i) {
          +    if (Instruction *Inst = dyn_cast<Instruction>(*i)) {
          +        cerr << "F is used in instruction:\n";
          +        cerr << *Inst << "\n";
               }
           }
           
          @@ -578,7 +778,7 @@ to iterate over all of the values that a particular instruction uses
           Instruction* pi = ...;
           
          -for(User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) {
          +for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) {
               Value* v = *i;
               ...
           }
          @@ -661,24 +861,23 @@ that BasicBlock, and a newly-created instruction
           we wish to insert before *pi, we do the following:
           
           
          -BasicBlock* pb = ...;
          -Instruction* pi = ...;
          -Instruction* newInst = new Instruction(...);
          -pb->getInstList().insert(pi, newInst); // inserts newInst before pi in pb
          +  BasicBlock *pb = ...;
          +  Instruction *pi = ...;
          +  Instruction *newInst = new Instruction(...);
          +  pb->getInstList().insert(pi, newInst); // inserts newInst before pi in pb
           

        • Insertion into an implicit instruction list -

          -Instruction instances that are already in +

          Instruction instances that are already in BasicBlocks are implicitly associated with an existing instruction list: the instruction list of the enclosing basic block. Thus, we could have accomplished the same thing as the above code without being given a BasicBlock by doing:

          -Instruction* pi = ...;
          -Instruction* newInst = new Instruction(...);
          -pi->getParent()->getInstList().insert(pi, newInst);
          +  Instruction *pi = ...;
          +  Instruction *newInst = new Instruction(...);
          +  pi->getParent()->getInstList().insert(pi, newInst);
           
          In fact, this sequence of steps occurs so frequently that the Instruction class and Instruction-derived classes @@ -695,7 +894,7 @@ Instruction* newInst = new Instruction(..., pi);
        • which is much cleaner, especially if you're creating a lot of instructions and adding them to BasicBlocks. -

          +

        @@ -714,21 +913,64 @@ For example:

           Instruction *I = .. ;
        -  BasicBlock *BB = I->getParent();
        -  BB->getInstList().erase(I);
        +  BasicBlock *BB = I->getParent();
        +  BB->getInstList().erase(I);
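The insert-before and erase operations used in these examples follow the usual linked-list conventions. The standalone sketch below shows the same two operations on a std::list of strings, which serves here only as a stand-in for an instruction list; InstListSketch, insertBeforeSketch and eraseSketch are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block's instruction list.
typedef std::list<std::string> InstListSketch;

// Inserts NewInst immediately before Pos and returns an iterator to it,
// analogous to insert(pi, newInst) in the examples above.
InstListSketch::iterator insertBeforeSketch(InstListSketch &List,
                                            InstListSketch::iterator Pos,
                                            const std::string &NewInst) {
  return List.insert(Pos, NewInst);
}

// Removes the element at Pos and returns an iterator to the element after it.
InstListSketch::iterator eraseSketch(InstListSketch &List,
                                     InstListSketch::iterator Pos) {
  assert(Pos != List.end() && "cannot erase the end iterator");
  return List.erase(Pos);
}
```

In both operations the other iterators into the list remain valid, which is what makes it practical to edit a list while walking it.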
         

        -


      Replacing an Instruction with another Value