X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=46b82ddc21fb6f23c335d58a578b248562ae68c7;hb=95df6b3603e228cea714be21997fec82cb03011e;hp=aa1fe402ff9b6e88e149f3fd9fec0c258589cffe;hpb=485bad1a09550e2ec182a78dba0ed97bf1d9f5ae;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index aa1fe402ff9..46b82ddc21f 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -62,6 +62,7 @@ option
  • A sorted 'vector'
  • "llvm/ADT/SmallSet.h"
  • "llvm/ADT/SmallPtrSet.h"
  • +
  • "llvm/ADT/DenseSet.h"
  • "llvm/ADT/FoldingSet.h"
  • <set>
  • "llvm/ADT/SetVector.h"
  • @@ -77,6 +78,11 @@ option
  • <map>
  • Other Map-Like Container Options
  • +
  • BitVector-like containers +
  • Helpful Hints for Common Operations @@ -97,6 +103,8 @@ complex example
  • the same way
  • Iterating over def-use & use-def chains
  • +
  • Iterating over predecessors & +successors of blocks
  • Making simple changes @@ -106,6 +114,7 @@ use-def chains
  • Deleting Instructions
  • Replacing an Instruction with another Value
  • +
  • Deleting GlobalVariables
  • +
    + "llvm/ADT/DenseSet.h" +
    + +
    + +

    +DenseSet is a simple quadratically probed hash table. It excels at supporting +small values: it uses a single allocation to hold all of the pairs that +are currently inserted in the set. DenseSet is a great way to unique small +values that are not simple pointers (use SmallPtrSet for pointers). Note that DenseSet has +the same requirements for the value type that DenseMap has. +
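To make "quadratically probed" and "single allocation" concrete, here is a minimal sketch of such a table for 32-bit keys. It is not the DenseSet implementation: the class name `TinyDenseSet`, the helpers `emptyKey()` and `tombstoneKey()`, the two reserved key values, and the 3/4 load limit are all assumptions made for illustration. The two reserved keys play the role of the special marker values that can never be inserted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical illustration only; this is not llvm::DenseSet.
class TinyDenseSet {
  // Two reserved marker keys (values chosen arbitrarily for this sketch).
  static uint32_t emptyKey() { return 0xFFFFFFFFu; }     // bucket never used
  static uint32_t tombstoneKey() { return 0xFFFFFFFEu; } // bucket was erased

  std::vector<uint32_t> Buckets; // one contiguous allocation, power-of-two size
  size_t NumEntries = 0;
  size_t NumTombstones = 0;

  // Quadratic (triangular) probing: the offsets from the home bucket are
  // 1, 3, 6, 10, ... Returns the bucket holding Key, or the bucket where
  // Key should be stored if it is absent.
  size_t probe(uint32_t Key) const {
    const size_t Mask = Buckets.size() - 1;
    const size_t None = Buckets.size();
    size_t FirstTombstone = None;
    size_t Idx = std::hash<uint32_t>()(Key) & Mask;
    for (size_t Step = 1;; ++Step) {
      const uint32_t B = Buckets[Idx];
      if (B == Key)
        return Idx;
      if (B == emptyKey())
        return FirstTombstone != None ? FirstTombstone : Idx;
      if (B == tombstoneKey() && FirstTombstone == None)
        FirstTombstone = Idx;
      Idx = (Idx + Step) & Mask;
    }
  }

  // Double the table and reinsert every live key; tombstones are dropped.
  void grow() {
    std::vector<uint32_t> Old(Buckets.size() * 2, emptyKey());
    Old.swap(Buckets);
    NumEntries = NumTombstones = 0;
    for (uint32_t K : Old)
      if (K != emptyKey() && K != tombstoneKey())
        insert(K);
  }

public:
  TinyDenseSet() : Buckets(64, emptyKey()) {}

  size_t size() const { return NumEntries; }

  bool contains(uint32_t Key) const {
    return Key != emptyKey() && Key != tombstoneKey() &&
           Buckets[probe(Key)] == Key;
  }

  // Returns true if Key was newly inserted.
  bool insert(uint32_t Key) {
    assert(Key != emptyKey() && Key != tombstoneKey() && "reserved marker");
    // Keep at least a quarter of the buckets empty so probing terminates.
    if ((NumEntries + NumTombstones + 1) * 4 > Buckets.size() * 3)
      grow();
    const size_t Idx = probe(Key);
    if (Buckets[Idx] == Key)
      return false;
    if (Buckets[Idx] == tombstoneKey())
      --NumTombstones;
    Buckets[Idx] = Key;
    ++NumEntries;
    return true;
  }

  // Returns true if Key was present and has been removed.
  bool erase(uint32_t Key) {
    if (!contains(Key))
      return false;
    Buckets[probe(Key)] = tombstoneKey();
    --NumEntries;
    ++NumTombstones;
    return true;
  }
};
```

Because the values live directly in the bucket array, every bucket costs `sizeof(value)` whether it is used or not, which is why this kind of table suits small values.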

    + +
    +
    "llvm/ADT/FoldingSet.h" @@ -1224,7 +1259,7 @@ iterators in a densemap are invalidated whenever an insertion occurs, unlike map. Also, because DenseMap allocates space for a large number of key/value pairs (it starts with 64 by default), it will waste a lot of space if your keys or values are large. Finally, you must implement a partial specialization of -DenseMapKeyInfo for the key that you want, if it isn't already supported. This +DenseMapInfo for the key that you want, if it isn't already supported. This is required to tell DenseMap about two special marker values (which can never be inserted into the map) that it needs internally.

    @@ -1275,6 +1310,52 @@ expensive. Element iteration does not visit elements in a useful order.

    + +
    + Bit storage containers (BitVector, SparseBitVector) +
    + +
    +

    Unlike the other containers, there are only two bit storage containers, and +choosing when to use each is relatively straightforward.

    + +

    One additional option is +std::vector<bool>: we discourage its use for two reasons: 1) the +implementation in many common compilers (e.g. commonly available versions of +GCC) is extremely inefficient, and 2) the C++ standards committee is likely to +deprecate this container and/or change it significantly somehow. In any case, +please don't use it.

    +
    + + +
    + BitVector +
    + +
    +

    The BitVector container provides a fixed-size set of bits for manipulation. +It supports individual bit setting/testing, as well as set operations. The set +operations take time O(size of bitvector), but operations are performed one word +at a time, instead of one bit at a time. This makes the BitVector very fast for +set operations compared to other containers. Use the BitVector when you expect +the number of set bits to be high (i.e., a dense set). +
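The "one word at a time" point can be sketched as follows. This is a minimal illustration and not the BitVector interface: the class name `TinyBitVector`, its member names, and the choice of 64-bit words are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; this is not llvm::BitVector.
class TinyBitVector {
  std::vector<uint64_t> Words; // 64 bits per word, zero-initialized
  size_t NumBits;

public:
  explicit TinyBitVector(size_t N) : Words((N + 63) / 64), NumBits(N) {}

  size_t size() const { return NumBits; }

  void set(size_t I) {
    assert(I < NumBits);
    Words[I / 64] |= uint64_t(1) << (I % 64);
  }

  bool test(size_t I) const {
    assert(I < NumBits);
    return (Words[I / 64] >> (I % 64)) & 1;
  }

  // Set union: one OR per 64-bit word rather than one per bit.
  TinyBitVector &operator|=(const TinyBitVector &RHS) {
    assert(NumBits == RHS.NumBits);
    for (size_t W = 0; W < Words.size(); ++W)
      Words[W] |= RHS.Words[W];
    return *this;
  }

  // Set intersection: one AND per 64-bit word.
  TinyBitVector &operator&=(const TinyBitVector &RHS) {
    assert(NumBits == RHS.NumBits);
    for (size_t W = 0; W < Words.size(); ++W)
      Words[W] &= RHS.Words[W];
    return *this;
  }

  // Number of set bits; each inner iteration clears the lowest set bit.
  size_t count() const {
    size_t N = 0;
    for (uint64_t W : Words)
      for (; W; W &= W - 1)
        ++N;
    return N;
  }
};
```

The loops in `operator|=` and `operator&=` run once per word regardless of how many bits are set, which matches the O(size of bitvector) cost stated above.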

    +
    + + +
    + SparseBitVector +
    + +
    +

    The SparseBitVector container is much like BitVector, with one major +difference: only the bits that are set are stored. This makes the +SparseBitVector much more space efficient than BitVector when the set is sparse, +as well as making set operations O(number of set bits) instead of O(size of +universe). The downside to the SparseBitVector is that setting and testing of random bits is O(N), and on large SparseBitVectors, this can be slower than BitVector. In our implementation, setting or testing bits in sorted order +(either forwards or reverse) is O(1) worst case. Testing and setting bits within 128 bits (depends on size) of the current bit is also O(1). As a general statement, testing/setting bits in a SparseBitVector is O(distance away from last set bit). +
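The idea of storing only the set bits can be sketched as below. This is an illustration under stated assumptions, not the SparseBitVector implementation: the class name `TinySparseBits` is hypothetical, and the sketch keeps its chunks in a `std::map`, so lookups here cost O(log n) rather than the position-dependent costs described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical illustration only; this is not llvm::SparseBitVector.
class TinySparseBits {
  // Chunk index -> 64 bits. A chunk whose bits are all zero is never stored.
  std::map<size_t, uint64_t> Chunks;

public:
  void set(size_t I) { Chunks[I / 64] |= uint64_t(1) << (I % 64); }

  bool test(size_t I) const {
    auto It = Chunks.find(I / 64);
    return It != Chunks.end() && ((It->second >> (I % 64)) & 1);
  }

  void reset(size_t I) {
    auto It = Chunks.find(I / 64);
    if (It == Chunks.end())
      return;
    It->second &= ~(uint64_t(1) << (I % 64));
    if (It->second == 0)
      Chunks.erase(It); // drop chunks that became empty
  }

  // Set union: the work is proportional to the chunks stored in RHS,
  // not to the size of the universe.
  TinySparseBits &operator|=(const TinySparseBits &RHS) {
    for (const auto &KV : RHS.Chunks)
      Chunks[KV.first] |= KV.second;
    return *this;
  }

  size_t storedChunks() const { return Chunks.size(); }
};
```

Setting bit 3 and bit 1000000 stores two chunks, whereas a dense bit vector covering the same range would need more than 15000 words.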

    +
    @@ -1405,8 +1486,8 @@ small example that shows how to dump all instructions in a function to the stand #include "llvm/Support/InstIterator.h" // F is a pointer to a Function instance -for (inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i) - llvm::cerr << *i << "\n"; +for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) + llvm::cerr << *I << "\n";
    @@ -1418,7 +1499,10 @@ F, all you would need to do is something like:

     std::set<Instruction*> worklist;
    -worklist.insert(inst_begin(F), inst_end(F));
    +// or better yet, SmallPtrSet<Instruction*, 64> worklist;
    +
    +for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
    +   worklist.insert(&*I);
     
    @@ -1459,7 +1543,7 @@ the last line of the last example,

    -Instruction* pinst = &*i;
    +Instruction *pinst = &*i;
     
    @@ -1467,7 +1551,7 @@ Instruction* pinst = &*i;
    -Instruction* pinst = i;
    +Instruction *pinst = i;
     
    @@ -1535,8 +1619,7 @@ class OurFunctionPass : public FunctionPass { href="#CallInst">CallInst>(&*i)) { // We know we've encountered a call instruction, so we // need to determine if it's a call to the - // function pointed to by m_func or not - + // function pointed to by m_func or not. if (callInst->getCalledFunction() == targetFunc) ++callCounter; } @@ -1545,7 +1628,7 @@ class OurFunctionPass : public FunctionPass { } private: - unsigned callCounter; + unsigned callCounter; }; @@ -1597,7 +1680,7 @@ of F:

    -Function* F = ...;
    +Function *F = ...;
     
     for (Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i)
       if (Instruction *Inst = dyn_cast<Instruction>(*i)) {
    @@ -1617,10 +1700,10 @@ the particular Instruction):

    -Instruction* pi = ...;
    +Instruction *pi = ...;
     
     for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) {
    -  Value* v = *i;
    +  Value *v = *i;
       // ...
     }
     
    @@ -1633,6 +1716,36 @@ for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i)
    + + + +
    + +

    Iterating over the predecessors and successors of a block is quite easy +with the routines defined in "llvm/Support/CFG.h". Just use code like +this to iterate over all predecessors of BB:

    + +
    +
    +#include "llvm/Support/CFG.h"
    +BasicBlock *BB = ...;
    +
    +for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
    +  BasicBlock *Pred = *PI;
    +  // ...
    +}
    +
    +
    + +

    Similarly, to iterate over successors use +succ_iterator/succ_begin/succ_end.

    + +
    + +
    Making simple changes @@ -1665,7 +1778,7 @@ parameters. For example, an AllocaInst only requires a
    -AllocaInst* ai = new AllocaInst(Type::IntTy);
    +AllocaInst* ai = new AllocaInst(Type::Int32Ty);
     
    @@ -1693,7 +1806,7 @@ used as some kind of index by some other code. To accomplish this, I place an
    -AllocaInst* pa = new AllocaInst(Type::IntTy, 0, "indexLoc");
    +AllocaInst* pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc");
     
    @@ -1806,9 +1919,7 @@ erase function to remove your instruction. For example:

     Instruction *I = .. ;
    -BasicBlock *BB = I->getParent();
    -
    -BB->getInstList().erase(I);
    +I->eraseFromParent();
     
    @@ -1845,7 +1956,7 @@ AllocaInst* instToReplace = ...; BasicBlock::iterator ii(instToReplace); ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii, - Constant::getNullValue(PointerType::get(Type::IntTy))); + Constant::getNullValue(PointerType::get(Type::Int32Ty)));
  • ReplaceInstWithInst @@ -1860,7 +1971,7 @@ AllocaInst* instToReplace = ...; BasicBlock::iterator ii(instToReplace); ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii, - new AllocaInst(Type::IntTy, 0, "ptrToReplacedInt")); + new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt"));
  • @@ -1878,6 +1989,28 @@ ReplaceInstWithValue, ReplaceInstWithInst --> + +
    + Deleting GlobalVariables +
    + +
    + +

    Deleting a global variable from a module is just as easy as deleting an +Instruction. First, you must have a pointer to the global variable that you wish + to delete. You use this pointer to erase it from its parent, the module. + For example:

    + +
    +
    +GlobalVariable *GV = .. ;
    +
    +GV->eraseFromParent();
    +
    +
    + +
    +
    Advanced Topics @@ -1912,7 +2045,7 @@ recursive types and late resolution of opaque types makes the situation very difficult to handle. Fortunately, for the most part, our implementation makes most clients able to be completely unaware of the nasty internal details. The primary case where clients are exposed to the inner workings of it are when -building a recursive type. In addition to this case, the LLVM bytecode reader, +building a recursive type. In addition to this case, the LLVM bitcode reader, assembly parser, and linker also have to be aware of the inner workings of this system.

    @@ -1957,7 +2090,7 @@ To build this, use the following LLVM APIs: PATypeHolder StructTy = OpaqueType::get(); std::vector<const Type*> Elts; Elts.push_back(PointerType::get(StructTy)); -Elts.push_back(Type::IntTy); +Elts.push_back(Type::Int32Ty); StructType *NewSTy = StructType::get(Elts); // At this point, NewSTy = "{ opaque*, i32 }". Tell VMCore that @@ -2045,12 +2178,8 @@ Type is maintained by PATypeHolder objects.

    Some data structures need to perform more complex updates when types get -resolved. The SymbolTable class, for example, needs -move and potentially merge type planes in its representation when a pointer -changes.

    - -

    -To support this, a class can derive from the AbstractTypeUser class. This class +resolved. To support this, a class can derive from the AbstractTypeUser class. +This class allows it to get callbacks when certain types are resolved. To register to get callbacks for a particular type, the DerivedType::{add/remove}AbstractTypeUser methods can be called on a type. Note that these methods only work for @@ -2062,164 +2191,264 @@ objects) can never be refined.

    - The SymbolTable class + The ValueSymbolTable and + TypeSymbolTable classes
    -

    This class provides a symbol table that the The +ValueSymbolTable class provides a symbol table that the Function and -Module classes use for naming definitions. The symbol table can -provide a name for any Value. -SymbolTable is an abstract data type. It hides the data it contains -and provides access to it through a controlled interface.

    +Module classes use for naming value definitions. The symbol table +can provide a name for any Value. +The +TypeSymbolTable class is used by the Module class to store +names for types.

    Note that the SymbolTable class should not be directly accessed by most clients. It should only be used when iteration over the symbol table names themselves is required, which is very special purpose. Note that not all LLVM -Values have names, and those without names (i.e. they have +Values have names, and those without names (i.e. they have an empty name) do not exist in the symbol table.

    -

    To use the SymbolTable well, you need to understand the -structure of the information it holds. The class contains two -std::map objects. The first, pmap, is a map of -Type* to maps of name (std::string) to Value*. -Thus, Values are stored in two-dimensions and accessed by Type and -name.

    +

    These symbol tables support iteration over the values/types in the symbol +table with begin/end/iterator and support querying to see if a +specific name is in the symbol table (with lookup). The +ValueSymbolTable class exposes no public mutator methods; instead, +simply call setName on a value, which will auto-insert it into the +appropriate symbol table. For types, use the Module::addTypeName method to +insert entries into the symbol table.

    -

    The interface of this class provides three basic types of operations: -

      -
    1. Accessors. Accessors provide read-only access to information - such as finding a value for a name with the - lookup method.
    2. -
    3. Mutators. Mutators allow the user to add information to the - SymbolTable with methods like - insert.
    4. -
    5. Iterators. Iterators allow the user to traverse the content - of the symbol table in well defined ways, such as the method - plane_begin.
    6. -
    +
    -

    Accessors

    -
    -
    Value* lookup(const Type* Ty, const std::string& name) const: -
    -
    The lookup method searches the type plane given by the - Ty parameter for a Value with the provided name. - If a suitable Value is not found, null is returned.
    - -
    bool isEmpty() const:
    -
    This function returns true if both the value and types maps are - empty
    -
    -

    Mutators

    -
    -
    void insert(Value *Val):
    -
    This method adds the provided value to the symbol table. The Value must - have both a name and a type which are extracted and used to place the value - in the correct type plane under the value's name.
    - -
    void remove(Value* Val):
    -
    This method removes a named value from the symbol table. The - type and name of the Value are extracted from \p N and used to - lookup the Value in the correct type plane. If the Value is - not in the symbol table, this method silently ignores the - request.
    -
    + + -

    Iteration

    -

    The following functions describe three types of iterators you can obtain -the beginning or end of the sequence for both const and non-const. It is -important to keep track of the different kinds of iterators. There are -three idioms worth pointing out:

    - - - - - - - - - - - -
    UnitsIteratorIdiom
    Planes Of name/Value mapsPI
    
    -for (SymbolTable::plane_const_iterator PI = ST.plane_begin(),
    -     PE = ST.plane_end(); PI != PE; ++PI ) {
    -  PI->first  // This is the Type* of the plane
    -  PI->second // This is the SymbolTable::ValueMap of name/Value pairs
    -}
    -    
    name/Value pairs in a planeVI
    
    -for (SymbolTable::value_const_iterator VI = ST.value_begin(SomeType),
    -     VE = ST.value_end(SomeType); VI != VE; ++VI ) {
    -  VI->first  // This is the name of the Value
    -  VI->second // This is the Value* value associated with the name
    -}
    -    
    +
    +

    The +User class provides a base for expressing the ownership of User +towards other +Values. The +Use helper class is employed to do the bookkeeping and to facilitate O(1) +addition and removal.

    -

    Using the recommended iterator names and idioms will help you avoid -making mistakes. Of particular note, make sure that whenever you use -value_begin(SomeType) that you always compare the resulting iterator -with value_end(SomeType) not value_end(SomeOtherType) or else you -will loop infinitely.

    + + -
    +
    +

    +A subclass of User can choose between incorporating its Use objects +or referring to them out-of-line by means of a pointer. A mixed variant +(some Uses inline, others hung off) is impractical and breaks the invariant +that the Use objects belonging to the same User form a contiguous array. +

    +
    -
    plane_iterator plane_begin():
    -
    Get an iterator that starts at the beginning of the type planes. - The iterator will iterate over the Type/ValueMap pairs in the - type planes.
    +

    +We have 2 different layouts in the User (sub)classes: +

      +
    • Layout a) +The Use object(s) are inside (resp. at fixed offset) of the User +object and there are a fixed number of them.

      + +
    • Layout b) +The Use object(s) are referenced by a pointer to an +array from the User object and there may be a variable +number of them.

      +
    +

    +Initially each layout will possess a direct pointer to the +start of the array of Uses. Though not mandatory for layout a), +we stick to this redundancy for the sake of simplicity. +The User object will also store the number of Use objects it +has. (Theoretically this information can also be calculated +given the scheme presented below.)

    +

    +Special forms of allocation operators (operator new) +will enforce the following memory layouts:

    -
    plane_const_iterator plane_begin() const:
    -
    Get a const_iterator that starts at the beginning of the type - planes. The iterator will iterate over the Type/ValueMap pairs - in the type planes.
    +
      +
    • Layout a) will be modelled by prepending the Use[] array to the User object.

      -
      plane_iterator plane_end():
      -
      Get an iterator at the end of the type planes. This serves as - the marker for end of iteration over the type planes.
      +
      +...---.---.---.---.-------...
      +  | P | P | P | P | User
      +'''---'---'---'---'-------'''
      +
      -
      plane_const_iterator plane_end() const:
      -
      Get a const_iterator at the end of the type planes. This serves as - the marker for end of iteration over the type planes.
      +
    • Layout b) will be modelled by pointing at the Use[] array.

      +
      +.-------...
      +| User
      +'-------'''
      +    |
      +    v
      +    .---.---.---.---...
      +    | P | P | P | P |
      +    '---'---'---'---'''
      +
      +
    +(In the above figures 'P' stands for the Use** that + is stored in each Use object in the member Use::Prev) -
    value_iterator value_begin(const Type *Typ):
    -
    Get an iterator that starts at the beginning of a type plane. - The iterator will iterate over the name/value pairs in the type plane. - Note: The type plane must already exist before using this.
    + + -
    value_const_iterator value_begin(const Type *Typ) const:
    -
    Get a const_iterator that starts at the beginning of a type plane. - The iterator will iterate over the name/value pairs in the type plane. - Note: The type plane must already exist before using this.
    +
    +

    +Since the Use objects will be deprived of the direct pointer to +their User objects, there must be a fast and exact method to +recover it. This is accomplished by the following scheme:

    +
    -
    value_iterator value_end(const Type *Typ):
    -
    Get an iterator to the end of a type plane. This serves as the marker - for end of iteration of the type plane. - Note: The type plane must already exist before using this.
    +A bit-encoding in the 2 LSBits (least significant bits) of Use::Prev makes it possible to find the +start of the User object: +
      +
    • 00 —> binary digit 0
    • +
    • 01 —> binary digit 1
    • +
    • 10 —> stop and calculate (s)
    • +
    • 11 —> full stop (S)
    • +
    +

    +Given a Use*, all we have to do is to walk till we get +a stop and we either have a User immediately behind or +we have to walk to the next stop picking up digits +and calculating the offset:

    +
    +.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.----------------
    +| 1 | s | 1 | 0 | 1 | 0 | s | 1 | 1 | 0 | s | 1 | 1 | s | 1 | S | User (or User*)
    +'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'----------------
    +    |+15                |+10            |+6         |+3     |+1
    +    |                   |               |           |       |__>
    +    |                   |               |           |__________>
    +    |                   |               |______________________>
    +    |                   |______________________________________>
    +    |__________________________________________________________>
    +
    +

    +Only the significant number of bits need to be stored between the +stops, so that the worst case is 20 memory accesses when there are +1000 Use objects associated with a User.

    -
    value_const_iterator value_end(const Type *Typ) const:
    -
    Get a const_iterator to the end of a type plane. This serves as the - marker for end of iteration of the type plane. - Note: the type plane must already exist before using this.
    + + -
    plane_const_iterator find(const Type* Typ ) const:
    -
    This method returns a plane_const_iterator for iteration over - the type planes starting at a specific plane, given by \p Ty.
    +
    +

    +The following literate Haskell fragment demonstrates the concept:

    +
    -
    plane_iterator find( const Type* Typ :
    -
    This method returns a plane_iterator for iteration over the - type planes starting at a specific plane, given by \p Ty.
    +
    +
    +> import Test.QuickCheck
    +> 
    +> digits :: Int -> [Char] -> [Char]
    +> digits 0 acc = '0' : acc
    +> digits 1 acc = '1' : acc
    +> digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
    +> 
    +> dist :: Int -> [Char] -> [Char]
    +> dist 0 [] = ['S']
    +> dist 0 acc = acc
    +> dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
    +> dist n acc = dist (n - 1) $ dist 1 acc
    +> 
    +> takeLast n ss = reverse $ take n $ reverse ss
    +> 
    +> test = takeLast 40 $ dist 20 []
    +> 
    +
    +
    +

    +Printing <test> gives: "1s100000s11010s10100s1111s1010s110s11s1S"

    +

    +The reverse algorithm computes the length of the string just by examining +a certain prefix:

    -
    +
    +
    +> pref :: [Char] -> Int
    +> pref "S" = 1
    +> pref ('s':'1':rest) = decode 2 1 rest
    +> pref (_:rest) = 1 + pref rest
    +> 
    +> decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
    +> decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
    +> decode walk acc _ = walk + acc
    +> 
    +
    +

    +Now, as expected, printing <pref test> gives 40.
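For readers more at home in C++, the two Haskell functions can be transcribed roughly as follows. This is a sketch over the same character-string model ('0', '1', 's', 'S'), not over real Use objects; the function names `binaryDigits`, `buildTags` and `distanceToEnd` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Binary digits of N, most significant digit first (like `digits`).
std::string binaryDigits(size_t N) {
  std::string Out;
  do {
    Out.insert(Out.begin(), char('0' + (N & 1)));
    N >>= 1;
  } while (N);
  return Out;
}

// Like `dist n []`: start from the full stop "S" and repeatedly prepend a
// stop 's' followed by the length of everything behind it, in binary.
std::string buildTags(unsigned Stops) {
  std::string Tags = "S";
  for (unsigned I = 0; I < Stops; ++I)
    Tags = "s" + binaryDigits(Tags.size()) + Tags;
  return Tags;
}

// Like `pref`: starting at Pos, compute how many tags lie between Pos and
// the end of the string (inclusive) by examining only a short prefix.
size_t distanceToEnd(const std::string &Tags, size_t Pos) {
  assert(Pos < Tags.size());
  size_t Walked = 0;
  for (;;) {
    const char C = Tags[Pos];
    if (C == 'S') // full stop: the end is immediately behind
      return Walked + 1;
    if (C == 's') {
      // Stop: pick up the digits that follow, up to the next stop.
      size_t Value = 0;
      ++Pos;
      ++Walked;
      while (Tags[Pos] == '0' || Tags[Pos] == '1') {
        Value = Value * 2 + size_t(Tags[Pos] - '0');
        ++Pos;
        ++Walked;
      }
      return Walked + Value;
    }
    // A digit belonging to an earlier stop: skip it and keep walking.
    ++Pos;
    ++Walked;
  }
}
```

With these definitions, the last 40 characters of `buildTags(20)` are the string shown for `test` above, and `distanceToEnd` at the start of that 40-character string returns 40.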

    +

    +We can quickCheck this with the following property:

    +
    +
    +> testcase = dist 2000 []
    +> testcaseLength = length testcase
    +> 
    +> identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
    +>     where arr = takeLast n testcase
    +> 
    +
    +
    +

    +As expected, <quickCheck identityProp> gives:

    +
    +*Main> quickCheck identityProp
    +OK, passed 100 tests.
    +
    +

    +Let's be a bit more exhaustive:

    - +
    +
    +> 
    +> deepCheck p = check (defaultConfig { configMaxTest = 500 }) p
    +> 
    +
    +
    +

    +And here is the result of <deepCheck identityProp>:

    + +
    +*Main> deepCheck identityProp
    +OK, passed 500 tests.
    +
    + + + + +

    +To maintain the invariant that the 2 LSBits of each Use** in Use +never change after being set up, setters of Use::Prev must re-tag the +new Use** on every modification. Accordingly getters must strip the +tag bits.
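The re-tagging setter and tag-stripping getter can be sketched like this. The names `TaggedPtr` and `Tag` and the enumerator names are hypothetical and do not come from the Use class; the sketch also assumes that pointers round-trip through `std::uintptr_t` with their low bits intact, which holds on common platforms, and that the pointee is at least 4-byte aligned so the two low bits are free.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the two tag bits use the encoding listed earlier.
enum Tag : std::uintptr_t {
  ZeroDigit = 0, // 00: binary digit 0
  OneDigit = 1,  // 01: binary digit 1
  Stop = 2,      // 10: stop and calculate (s)
  FullStop = 3   // 11: full stop (S)
};

template <typename T> class TaggedPtr {
  std::uintptr_t Bits; // pointer value with the tag in the 2 low bits

  static std::uintptr_t toBits(T *P) {
    const std::uintptr_t V = reinterpret_cast<std::uintptr_t>(P);
    assert((V & 3) == 0 && "pointee must be at least 4-byte aligned");
    return V;
  }

public:
  TaggedPtr(T *P, Tag TagBits) : Bits(toBits(P) | TagBits) {}

  // Getter strips the tag bits before handing out the pointer.
  T *getPointer() const {
    return reinterpret_cast<T *>(Bits & ~std::uintptr_t(3));
  }

  Tag getTag() const { return static_cast<Tag>(Bits & 3); }

  // Setter re-applies the existing tag, so the tag never changes
  // after construction.
  void setPointer(T *P) { Bits = toBits(P) | (Bits & 3); }
};
```

Note that `TaggedPtr` deliberately has no way to change the tag after construction, which is one way to express the stated invariant in the type itself.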

    +

    +For layout b) instead of the User we will find a pointer (User* with LSBit set). +Following this pointer brings us to the User. A portable trick will ensure +that the first bytes of User (if interpreted as a pointer) will never have +the LSBit set.

    + +
    + + @@ -2322,7 +2551,7 @@ the lib/VMCore directory.

    point type.
    StructType
    Subclass of DerivedTypes for struct types.
    -
    FunctionType
    +
    FunctionType
    Subclass of DerivedTypes for function types.
    • bool isVarArg() const: Returns true if it's a vararg @@ -2516,7 +2745,7 @@ method. In addition, all LLVM values can be named. The "name" of the
    -

    The name of this instruction is "foo". NOTE +

    The name of this instruction is "foo". NOTE that the name of any value may be missing (an empty string), so names should ONLY be used for debugging (making the source code easier to read, debugging printouts), they should not be used to keep track of values or map @@ -2748,10 +2977,20 @@ a subclass, which represents the address of a global variable or function.

  • ConstantInt : This subclass of Constant represents an integer constant of any width.