X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=46b82ddc21fb6f23c335d58a578b248562ae68c7;hb=95df6b3603e228cea714be21997fec82cb03011e;hp=aa1fe402ff9b6e88e149f3fd9fec0c258589cffe;hpb=485bad1a09550e2ec182a78dba0ed97bf1d9f5ae;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index aa1fe402ff9..46b82ddc21f 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -62,6 +62,7 @@ option
+DenseSet is a simple quadratically probed hash table. It excels at supporting +small values: it uses a single allocation to hold all of the elements that +are currently inserted in the set. DenseSet is a great way to unique small +values that are not simple pointers (use SmallPtrSet for pointers). Note that DenseSet has +the same requirements for the value type that DenseMap has. +
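To make "quadratically probed" and "single allocation" concrete, here is a minimal sketch of such a table. It is not LLVM's DenseSet: the class name ToyDenseSet, the hash, the growth policy, and the reserved EmptyKey value are all assumptions made for illustration. The reserved key stands in for the kind of requirement a dense table places on its value type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is not LLVM's DenseSet.
// One key value is reserved to mark empty buckets (an assumption of this sketch).
const unsigned EmptyKey = ~0u;

class ToyDenseSet {
  std::vector<unsigned> Buckets; // one flat allocation; size is a power of two
  std::size_t NumEntries;

  // Return the bucket holding Key, or the first empty bucket on its probe
  // sequence. Steps of 1, 2, 3, ... give the triangular (quadratic) sequence,
  // which visits every bucket of a power-of-two table.
  std::size_t findBucket(unsigned Key) const {
    std::size_t Mask = Buckets.size() - 1;
    std::size_t Idx = (Key * 37u) & Mask;
    for (std::size_t Probe = 1;; ++Probe) {
      if (Buckets[Idx] == Key || Buckets[Idx] == EmptyKey)
        return Idx;
      Idx = (Idx + Probe) & Mask;
    }
  }

  // Double the table and reinsert every stored key.
  void grow() {
    std::vector<unsigned> Old;
    Old.swap(Buckets);
    Buckets.assign(Old.size() * 2, EmptyKey);
    for (std::size_t I = 0; I != Old.size(); ++I)
      if (Old[I] != EmptyKey)
        Buckets[findBucket(Old[I])] = Old[I];
  }

public:
  ToyDenseSet() : Buckets(8, EmptyKey), NumEntries(0) {}

  std::size_t size() const { return NumEntries; }

  // Returns true if Key was newly inserted.
  bool insert(unsigned Key) {
    assert(Key != EmptyKey && "the reserved key cannot be stored");
    if ((NumEntries + 1) * 4 > Buckets.size() * 3) // keep load factor <= 3/4
      grow();
    std::size_t Idx = findBucket(Key);
    if (Buckets[Idx] == Key)
      return false;
    Buckets[Idx] = Key;
    ++NumEntries;
    return true;
  }

  bool count(unsigned Key) const {
    return Key != EmptyKey && Buckets[findBucket(Key)] == Key;
  }
};
```

Because the load factor stays at or below 3/4, every probe sequence is guaranteed to reach an empty bucket, so lookups terminate. Erasure is left out; supporting it in an open-addressed table needs extra bookkeeping that this sketch does not attempt.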
+ +Unlike the other containers, there are only two bit storage containers, and +choosing when to use each is relatively straightforward.
+ +One additional option is +std::vector<bool>: we discourage its use for two reasons: 1) the +implementation in many common compilers (e.g. commonly available versions of +GCC) is extremely inefficient and 2) the C++ standards committee is likely to +deprecate this container and/or change it significantly somehow. In any case, +please don't use it.
+The BitVector container provides a fixed size set of bits for manipulation. +It supports individual bit setting/testing, as well as set operations. The set +operations take time O(size of bitvector), but operations are performed one word +at a time, instead of one bit at a time. This makes the BitVector very fast for +set operations compared to other containers. Use the BitVector when you expect +the number of set bits to be high (i.e., a dense set). +
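The "one word at a time" point can be illustrated with a small sketch. This is not the BitVector implementation: ToyBitVector and its 64-bit word size are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; this is not LLVM's BitVector.
class ToyBitVector {
  std::vector<std::uint64_t> Words; // 64 bits packed into each word
  std::size_t NumBits;

public:
  explicit ToyBitVector(std::size_t N) : Words((N + 63) / 64, 0), NumBits(N) {}

  void set(std::size_t I) {
    assert(I < NumBits && "bit index out of range");
    Words[I / 64] |= std::uint64_t(1) << (I % 64);
  }

  bool test(std::size_t I) const {
    assert(I < NumBits && "bit index out of range");
    return (Words[I / 64] >> (I % 64)) & 1;
  }

  // Set union: a single OR handles 64 bits, so the loop runs size/64 times.
  ToyBitVector &operator|=(const ToyBitVector &RHS) {
    assert(NumBits == RHS.NumBits && "sizes must match");
    for (std::size_t W = 0; W != Words.size(); ++W)
      Words[W] |= RHS.Words[W];
    return *this;
  }

  // Set intersection, again one word per step.
  ToyBitVector &operator&=(const ToyBitVector &RHS) {
    assert(NumBits == RHS.NumBits && "sizes must match");
    for (std::size_t W = 0; W != Words.size(); ++W)
      Words[W] &= RHS.Words[W];
    return *this;
  }

  // Number of set bits; clears the lowest set bit of each word repeatedly.
  std::size_t count() const {
    std::size_t N = 0;
    for (std::uint64_t W : Words)
      for (; W; W &= W - 1)
        ++N;
    return N;
  }
};
```

The cost of a union or intersection is still linear in the size of the vector, but the constant factor is one machine operation per 64 bits rather than per bit.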
+The SparseBitVector container is much like BitVector, with one major +difference: only the bits that are set are stored. This makes the +SparseBitVector much more space efficient than BitVector when the set is sparse, +as well as making set operations O(number of set bits) instead of O(size of +universe). The downside to the SparseBitVector is that setting and testing of random bits is O(N), and on large SparseBitVectors, this can be slower than BitVector. In our implementation, setting or testing bits in sorted order +(either forwards or reverse) is O(1) worst case. Testing and setting bits within 128 bits (depends on size) of the current bit is also O(1). As a general statement, testing/setting bits in a SparseBitVector is O(distance away from last set bit). +
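The space trade-off can be sketched as follows. This sketch deliberately swaps in a different technique from the one described above: it keys 64-bit chunks in a std::map, so random access is O(log n) and it has none of the "current position" behaviour that makes sorted access cheap. ToySparseBits and storedWords are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical illustration only; this is not LLVM's SparseBitVector.
// Only words containing at least one set bit are stored.
class ToySparseBits {
  std::map<std::size_t, std::uint64_t> Elements; // word index -> nonzero word

public:
  void set(std::size_t I) {
    Elements[I / 64] |= std::uint64_t(1) << (I % 64);
  }

  bool test(std::size_t I) const {
    auto It = Elements.find(I / 64);
    return It != Elements.end() && ((It->second >> (I % 64)) & 1);
  }

  // Union touches only the words RHS actually stores, not the whole universe.
  ToySparseBits &operator|=(const ToySparseBits &RHS) {
    for (auto It = RHS.Elements.begin(); It != RHS.Elements.end(); ++It)
      Elements[It->first] |= It->second;
    return *this;
  }

  // Memory use is proportional to this, not to the largest bit index.
  std::size_t storedWords() const { return Elements.size(); }
};
```

Setting bit 5 and bit 1000000 stores two words here, whereas a dense vector would have to allocate space for every bit in between.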
+std::set<Instruction*> worklist;
-worklist.insert(inst_begin(F), inst_end(F));
+// or better yet, SmallPtrSet<Instruction*, 64> worklist;
+
+for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
+  worklist.insert(&*I);
-Instruction* pinst = &*i; +Instruction *pinst = &*i;
-Instruction* pinst = i; +Instruction *pinst = i;
-Function* F = ...; +Function *F = ...; for (Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i) if (Instruction *Inst = dyn_cast<Instruction>(*i)) { @@ -1617,10 +1700,10 @@ the particular Instruction):+ + + +-Instruction* pi = ...; +Instruction *pi = ...; for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) { - Value* v = *i; + Value *v = *i; // ... }@@ -1633,6 +1716,36 @@ for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i)+ ++ +Iterating over the predecessors and successors of a block is quite easy +with the routines defined in "llvm/Support/CFG.h". Just use code like +this to iterate over all predecessors of BB:
+
+
+#include "llvm/Support/CFG.h"
+BasicBlock *BB = ...;
+
+for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
+  BasicBlock *Pred = *PI;
+  // ...
+}
+
+Similarly, to iterate over successors use
+succ_iterator/succ_begin/succ_end.
+ +Making simple changes @@ -1665,7 +1778,7 @@ parameters. For example, an AllocaInst only requires a@@ -1693,7 +1806,7 @@ used as some kind of index by some other code. To accomplish this, I place an-AllocaInst* ai = new AllocaInst(Type::IntTy); +AllocaInst* ai = new AllocaInst(Type::Int32Ty);@@ -1806,9 +1919,7 @@ erase function to remove your instruction. For example:-AllocaInst* pa = new AllocaInst(Type::IntTy, 0, "indexLoc"); +AllocaInst* pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc");@@ -1845,7 +1956,7 @@ AllocaInst* instToReplace = ...; BasicBlock::iterator ii(instToReplace); ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii, - Constant::getNullValue(PointerType::get(Type::IntTy))); + Constant::getNullValue(PointerType::get(Type::Int32Ty)));Instruction *I = .. ; -BasicBlock *BB = I->getParent(); - -BB->getInstList().erase(I); +I->eraseFromParent();
Deleting a global variable from a module is just as easy as deleting an +Instruction. First, you must have a pointer to the global variable that you wish + to delete. You use this pointer to erase it from its parent, the module. + For example:
+ ++GlobalVariable *GV = .. ; + +GV->eraseFromParent(); ++
Some data structures need to perform more complex updates when types get -resolved. The SymbolTable class, for example, needs -move and potentially merge type planes in its representation when a pointer -changes.
- -
-To support this, a class can derive from the AbstractTypeUser class. This class
+resolved. To support this, a class can derive from the AbstractTypeUser class.
+This class
allows it to get callbacks when certain types are resolved. To register to get
callbacks for a particular type, the DerivedType::{add/remove}AbstractTypeUser
methods can be called on a type. Note that these methods only work for
@@ -2062,164 +2191,264 @@ objects) can never be refined.
This class provides a symbol table that the The
+ValueSymbolTable class provides a symbol table that the Function and
-Module classes use for naming definitions. The symbol table can
-provide a name for any Value.
-SymbolTable is an abstract data type. It hides the data it contains
-and provides access to it through a controlled interface. Note that the SymbolTable class should not be directly accessed
by most clients. It should only be used when iteration over the symbol table
names themselves is required, which is very special purpose. Note that not
all LLVM
-Values have names, and those without names (i.e. they have
+Values have names, and those without names (i.e. they have
an empty name) do not exist in the symbol table.
To use the SymbolTable well, you need to understand the
-structure of the information it holds. The class contains two
-std::map objects. The first, pmap, is a map of
-Type* to maps of name (std::string) to Value*.
-Thus, Values are stored in two-dimensions and accessed by Type and
-name. These symbol tables support iteration over the values/types in the symbol
+table with begin/end/iterator and support querying to see if a
+specific name is in the symbol table (with lookup). The
+ValueSymbolTable class exposes no public mutator methods; instead,
+simply call setName on a value, which will autoinsert it into the
+appropriate symbol table. For types, use the Module::addTypeName method to
+insert entries into the symbol table. The interface of this class provides three basic types of operations:
-
-
+Accessors
-
-
-Mutators
-
-
+
+
-Iteration
-
The following functions describe three types of iterators you can obtain -the beginning or end of the sequence for both const and non-const. It is -important to keep track of the different kinds of iterators. There are -three idioms worth pointing out:
- -Units | Iterator | Idiom |
---|---|---|
Planes Of name/Value maps | PI | --for (SymbolTable::plane_const_iterator PI = ST.plane_begin(), - PE = ST.plane_end(); PI != PE; ++PI ) { - PI->first // This is the Type* of the plane - PI->second // This is the SymbolTable::ValueMap of name/Value pairs -} - |
-
name/Value pairs in a plane | VI | --for (SymbolTable::value_const_iterator VI = ST.value_begin(SomeType), - VE = ST.value_end(SomeType); VI != VE; ++VI ) { - VI->first // This is the name of the Value - VI->second // This is the Value* value associated with the name -} - |
-
The +User class provides a base for expressing the ownership of User +towards other +Values. The +Use helper class is employed to do the bookkeeping and to facilitate O(1) +addition and removal.
-Using the recommended iterator names and idioms will help you avoid -making mistakes. Of particular note, make sure that whenever you use -value_begin(SomeType) that you always compare the resulting iterator -with value_end(SomeType) not value_end(SomeOtherType) or else you -will loop infinitely.
+ + -+A subclass of User can choose between incorporating its Use objects +or referring to them out-of-line by means of a pointer. A mixed variant +(some Uses inline, others hung off) is impractical and breaks the invariant +that the Use objects belonging to the same User form a contiguous array. +
++We have 2 different layouts in the User (sub)classes: +
Layout a) +The Use object(s) are inside (or at a fixed offset from) the User +object, and there is a fixed number of them.
+ +Layout b) +The Use object(s) are referenced by a pointer to an +array from the User object and there may be a variable +number of them.
++Initially each layout will possess a direct pointer to the +start of the array of Uses. Though not mandatory for layout a), +we stick to this redundancy for the sake of simplicity. +The User object will also store the number of Use objects it +has. (Theoretically this information can also be calculated +given the scheme presented below.)
++Special forms of allocation operators (operator new) +will enforce the following memory layouts:
-Layout a) will be modelled by prepending the User object by the Use[] array.
-+...---.---.---.---.-------... + | P | P | P | P | User +'''---'---'---'---'-------''' +-
Layout b) will be modelled by pointing at the Use[] array.
++.-------... +| User +'-------''' + | + v + .---.---.---.---... + | P | P | P | P | + '---'---'---'---''' ++
+Since the Use objects will be deprived of the direct pointer to +their User objects, there must be a fast and exact method to +recover it. This is accomplished by the following scheme:
+Given a Use*, all we have to do is walk until we reach +a stop. Either a User follows immediately behind it, or +we walk on to the next stop, picking up digits +and calculating the offset:
++.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---------------- +| 1 | s | 1 | 0 | 1 | 0 | s | 1 | 1 | 0 | s | 1 | 1 | s | 1 | S | User (or User*) +'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---------------- + |+15 |+10 |+6 |+3 |+1 + | | | | |__> + | | | |__________> + | | |______________________> + | |______________________________________> + |__________________________________________________________> ++
+Only the significant bits need to be stored between the +stops, so that the worst case is 20 memory accesses when there are +1000 Use objects associated with a User.
-+The following literate Haskell fragment demonstrates the concept:
+
+> import Test.QuickCheck
+>
+> digits :: Int -> [Char] -> [Char]
+> digits 0 acc = '0' : acc
+> digits 1 acc = '1' : acc
+> digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
+>
+> dist :: Int -> [Char] -> [Char]
+> dist 0 [] = ['S']
+> dist 0 acc = acc
+> dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
+> dist n acc = dist (n - 1) $ dist 1 acc
+>
+> takeLast n ss = reverse $ take n $ reverse ss
+>
+> test = takeLast 40 $ dist 20 []
+>
+
+
+Printing <test> gives: "1s100000s11010s10100s1111s1010s110s11s1S"
++The reverse algorithm computes the length of the string just by examining +a certain prefix:
-
+> pref :: [Char] -> Int
+> pref "S" = 1
+> pref ('s':'1':rest) = decode 2 1 rest
+> pref (_:rest) = 1 + pref rest
+>
+> decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
+> decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
+> decode walk acc _ = walk + acc
+>
+
+Now, as expected, printing <pref test> gives 40.
+We can quickCheck this with the following property:
+
+> testcase = dist 2000 []
+> testcaseLength = length testcase
+>
+> identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
+>     where arr = takeLast n testcase
+>
+
+
+As expected <quickCheck identityProp> gives:
++*Main> quickCheck identityProp +OK, passed 100 tests. ++
+Let's be a bit more exhaustive:
- ++> +> deepCheck p = check (defaultConfig { configMaxTest = 500 }) p +> ++
+And here is the result of <deepCheck identityProp>:
+ ++*Main> deepCheck identityProp +OK, passed 500 tests. ++ + + + +
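For readers less familiar with Haskell, the dist/pref pair above can be transliterated into C++ roughly as follows. This is a sketch of the same string model only, not of the Use class itself; the names binaryDigits, buildTags and distanceToEnd are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Binary digits of N, most significant first. Called with N >= 1 only.
static std::string binaryDigits(std::size_t N) {
  std::string Out;
  do {
    Out.insert(Out.begin(), char('0' + (N & 1)));
    N >>= 1;
  } while (N != 0);
  return Out;
}

// Mirrors `dist`: start from the full stop "S" and, Steps times, prepend a
// stop 's' followed by the binary length of everything that comes after it.
std::string buildTags(unsigned Steps) {
  std::string Tags = "S";
  for (unsigned I = 0; I != Steps; ++I)
    Tags = "s" + binaryDigits(Tags.size()) + Tags;
  return Tags;
}

// Mirrors `pref`: compute how many tags lie between Pos and the end of the
// string while reading only a short prefix starting at Pos.
std::size_t distanceToEnd(const std::string &Tags, std::size_t Pos) {
  assert(Pos < Tags.size() && "position out of range");
  std::size_t Walked = 0;
  // Walk forward to the next stop.
  while (Tags[Pos] != 's' && Tags[Pos] != 'S') {
    ++Pos;
    ++Walked;
  }
  if (Tags[Pos] == 'S') // full stop: the end is right here
    return Walked + 1;
  // Skip the 's', then decode the digits that follow it.
  ++Pos;
  ++Walked;
  std::size_t Offset = 0;
  while (Tags[Pos] == '0' || Tags[Pos] == '1') {
    Offset = Offset * 2 + std::size_t(Tags[Pos] - '0');
    ++Pos;
    ++Walked;
  }
  return Walked + Offset; // tags consumed so far plus tags still ahead
}
```

As in the Haskell version, distanceToEnd never scans the whole string: it stops at the first full stop, or after the digits following the first stop it meets.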
+To maintain the invariant that the 2 LSBits of each Use** in Use +never change after being set up, setters of Use::Prev must re-tag the +new Use** on every modification. Accordingly getters must strip the +tag bits.
++For layout b) instead of the User we will find a pointer (User* with LSBit set). +Following this pointer brings us to the User. A portable trick will ensure +that the first bytes of User (if interpreted as a pointer) will never have +the LSBit set.
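The re-tagging rule for setters and the stripping rule for getters can be sketched with a small wrapper around an integer-sized pointer. The names ToyUse and TaggedPrev are hypothetical and the sketch assumes pointers are at least 4-byte aligned; it is not the layout of the real Use class.

```cpp
#include <cassert>
#include <cstdint>

struct ToyUse; // hypothetical stand-in; only pointers to it are needed

const std::uintptr_t TagMask = 3; // the two least significant bits

// Hypothetical illustration only: a ToyUse** with a 2-bit tag in its low bits.
class TaggedPrev {
  std::uintptr_t Bits;

public:
  // The tag is fixed at construction and never changes afterwards.
  explicit TaggedPrev(unsigned Tag) : Bits(Tag) {
    assert(Tag <= TagMask && "tag must fit in two bits");
  }

  unsigned getTag() const { return unsigned(Bits & TagMask); }

  // Getter: strip the tag bits before handing out the pointer.
  ToyUse **getPointer() const {
    return reinterpret_cast<ToyUse **>(Bits & ~TagMask);
  }

  // Setter: install the new pointer and re-apply the existing tag.
  void setPointer(ToyUse **P) {
    std::uintptr_t Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & TagMask) == 0 && "pointer must be at least 4-byte aligned");
    Bits = Raw | (Bits & TagMask);
  }
};
```

Routing every write through setPointer is what keeps the tag stable: callers cannot overwrite the low bits by accident, and readers always see a clean pointer.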
+ +The name of this instruction is "foo". NOTE +
The name of this instruction is "foo". NOTE that the name of any value may be missing (an empty string), so names should ONLY be used for debugging (making the source code easier to read, debugging printouts), they should not be used to keep track of values or map @@ -2748,10 +2977,20 @@ a subclass, which represents the address of a global variable or function.