X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FStacker.html;h=81b623efa9a15eca20c51a6d4b78b7f755147b7c;hb=f399235a03e5318804a6833580cd35cbbf9cd4db;hp=41b77fe55fc20ea919b37017a2fa58760aa31e0c;hpb=fd90f88b586a9a1a7d36a38b69b1f3d51dd9217b;p=oota-llvm.git diff --git a/docs/Stacker.html b/docs/Stacker.html index 41b77fe55fc..81b623efa9a 100644 --- a/docs/Stacker.html +++ b/docs/Stacker.html @@ -4,10 +4,6 @@ Stacker: An Example Of Using LLVM - @@ -33,7 +29,7 @@
  • Comments
  • Literals
  • Words
  • -
  • Standard Style
  • +
  • Standard Style
  • Built-Ins
  • Prime: A Complete Example
  • @@ -51,8 +47,8 @@ -
    -

    Written by Reid Spencer

    +
    +

    Written by Reid Spencer

    @@ -131,31 +127,28 @@ I noted that most of the important LLVM IR (Intermediate Representation) C++ classes were derived from the Value class. The full power of that simple design only became fully understood once I started constructing executable expressions for Stacker.

    +

    This really makes your programming go faster. Think about compiling code for the following C/C++ expression: (a|b)*((x+1)/(y+1)). Assuming the values are on the stack in the order a, b, x, y, this could be expressed in stacker as: 1 + SWAP 1 + / ROT2 OR *. -You could write a function using LLVM that computes this expression like this:

    -
    
    +You could write a function using LLVM that computes this expression like 
    +this: 

    + +
     Value* 
     expression(BasicBlock* bb, Value* a, Value* b, Value* x, Value* y )
     {
    -    Instruction* tail = bb->getTerminator();
    -    ConstantSInt* one = ConstantSInt::get( Type::IntTy, 1);
    -    BinaryOperator* or1 = 
    -	BinaryOperator::create( Instruction::Or, a, b, "", tail );
    -    BinaryOperator* add1 = 
    -	BinaryOperator::create( Instruction::Add, x, one, "", tail );
    -    BinaryOperator* add2 =
    -	BinaryOperator::create( Instruction::Add, y, one, "", tail );
    -    BinaryOperator* div1 = 
    -	BinaryOperator::create( Instruction::Div, add1, add2, "", tail);
    -    BinaryOperator* mult1 = 
    -	BinaryOperator::create( Instruction::Mul, or1, div1, "", tail );
    -
    +    ConstantInt* one = ConstantInt::get(Type::IntTy, 1);
    +    BinaryOperator* or1 = BinaryOperator::createOr(a, b, "", bb);
    +    BinaryOperator* add1 = BinaryOperator::createAdd(x, one, "", bb);
    +    BinaryOperator* add2 = BinaryOperator::createAdd(y, one, "", bb);
    +    BinaryOperator* div1 = BinaryOperator::createDiv(add1, add2, "", bb);
    +    BinaryOperator* mult1 = BinaryOperator::createMul(or1, div1, "", bb);
         return mult1;
     }
    -
    +
    +

    "Okay, big deal," you say? It is a big deal. Here's why. Note that I didn't have to tell this function which kinds of Values are being passed in. They could be Instructions, Constants, GlobalVariables, or @@ -205,7 +198,7 @@ should be constructed. In general, here's what I learned:

  • Create your blocks early. While writing your compiler, you will encounter several situations where you know apriori that you will need several blocks. For example, if-then-else, switch, while, and for - statements in C/C++ all need multiple blocks for expression in LVVM. + statements in C/C++ all need multiple blocks for expression in LLVM. The rule is, create them early.
  • Terminate your blocks early. This just reduces the chances that you forget to terminate your blocks which is required (go @@ -226,8 +219,8 @@ should be constructed. In general, here's what I learned:

    The foregoing is such an important principal, its worth making an idiom:

    -BasicBlock* bb = new BasicBlock();
    -bb->getInstList().push_back( new Branch( ... ) );
    +BasicBlock* bb = BasicBlock::Create();
    +bb->getInstList().push_back( BranchInst::Create( ... ) );
     new Instruction(..., bb->getTerminator() );
     

    To make this clear, consider the typical if-then-else statement @@ -236,19 +229,19 @@ in a single function using LLVM in the following way:

     using namespace llvm;
     BasicBlock*
    -MyCompiler::handle_if( BasicBlock* bb, SetCondInst* condition )
    +MyCompiler::handle_if( BasicBlock* bb, ICmpInst* condition )
     {
         // Create the blocks to contain code in the structure of if/then/else
    -    BasicBlock* then_bb = new BasicBlock(); 
    -    BasicBlock* else_bb = new BasicBlock();
    -    BasicBlock* exit_bb = new BasicBlock();
    +    BasicBlock* then_bb = BasicBlock::Create(); 
    +    BasicBlock* else_bb = BasicBlock::Create();
    +    BasicBlock* exit_bb = BasicBlock::Create();
     
         // Insert the branch instruction for the "if"
    -    bb->getInstList().push_back( new BranchInst( then_bb, else_bb, condition ) );
    +    bb->getInstList().push_back( BranchInst::Create( then_bb, else_bb, condition ) );
     
         // Set up the terminating instructions
    -    then->getInstList().push_back( new BranchInst( exit_bb ) );
    -    else->getInstList().push_back( new BranchInst( exit_bb ) );
    +    then->getInstList().push_back( BranchInst::Create( exit_bb ) );
    +    else->getInstList().push_back( BranchInst::Create( exit_bb ) );
     
         // Fill in the then part .. details excised for brevity
         this->fill_in( then_bb );
    @@ -315,9 +308,9 @@ things, this leads to the idiom:
     

     std::vector<Value*> index_vector;
    -index_vector.push_back( ConstantSInt::get( Type::LongTy, 0 );
    +index_vector.push_back( ConstantInt::get( Type::LongTy, 0 );
     // ... push other indices ...
    -GetElementPtrInst* gep = new GetElementPtrInst( ptr, index_vector );
    +GetElementPtrInst* gep = GetElementPtrInst::Create( ptr, index_vector );
     

    For example, suppose we have a global variable whose type is [24 x int]. The variable itself represents a pointer to that array. To subscript the @@ -374,9 +367,9 @@ functions in the LLVM IR that make things easier. Here's what I learned:

    @@ -470,6 +463,11 @@ unit. Anything declared with FORWARD is an external symbol for linking.

    +
    Standard Style
    +
    +

    TODO

    +
    +
    Built In Words

    The built-in words of the Stacker language are put in several groups @@ -513,16 +511,16 @@ using the following construction:

    - - - - +
    Definition Of Operation Of Built In Words
    LOGICAL OPERATIONS
    + + + - + @@ -579,7 +577,7 @@ using the following construction:

    - + @@ -621,7 +619,7 @@ using the following construction:

    are bitwise exclusive OR'd together and pushed back on the stack. For example, The sequence 1 3 XOR yields 2. - + @@ -702,7 +700,7 @@ using the following construction:

    - + @@ -789,7 +787,7 @@ using the following construction:

    - + @@ -847,7 +845,7 @@ using the following construction:

    how much to rotate. That is, ROLL with n=1 is the same as ROT and ROLL with n=2 is the same as ROT2. - + @@ -900,7 +898,7 @@ using the following construction:

    pushed back on the stack so this doesn't count as a "use ptr" in the FREE idiom. - + @@ -948,26 +946,30 @@ using the following construction:

    executed. In either case, after the (words....) have executed, execution continues immediately following the ENDIF. - - + + - - - + 10 WHILE >d -- END
    + This will print the numbers from 10 down to 1. 10 is pushed on the + stack. Since that is non-zero, the while loop is entered. The top of + the stack (10) is printed out with >d. The top of the stack is + decremented, yielding 9 and control is transfered back to the WHILE + keyword. The process starts all over again and repeats until + the top of stack is decremented to 0 at which point the WHILE test + fails and control is transfered to the word after the END. + + + @@ -1294,13 +1296,26 @@ remainder of the story. +

    The source code, test programs, and sample programs can all be found -under the LLVM "projects" directory. You will need to obtain the LLVM sources -to find it (either via anonymous CVS or a tarball. See the -Getting Started document).

    -

    Under the "projects" directory there is a directory named "Stacker". That -directory contains everything, as follows:

    +in the LLVM repository named llvm-stacker This should be checked out to +the projects directory so that it will auto-configure. To do that, make +sure you have the llvm sources in llvm +(see Getting Started) and then use these +commands:

    + +
    +
    +% svn co http://llvm.org/svn/llvm-project/llvm-top/trunk llvm-top
    +% cd llvm-top
    +% make build MODULE=stacker
    +
    +
    + +

    Under the projects/llvm-stacker directory you will find the +implementation of the Stacker compiler, as follows:

    +
    • lib - contains most of the source code
        @@ -1315,35 +1330,38 @@ directory contains everything, as follows:

      • sample - contains the sample programs
    +
    The Lexer
    +
    -

    See projects/Stacker/lib/compiler/Lexer.l

    +

    See projects/llvm-stacker/lib/compiler/Lexer.l

    +
    The Parser
    -

    See projects/Stacker/lib/compiler/StackerParser.y

    +

    See projects/llvm-stacker/lib/compiler/StackerParser.y

    The Compiler
    -

    See projects/Stacker/lib/compiler/StackerCompiler.cpp

    +

    See projects/llvm-stacker/lib/compiler/StackerCompiler.cpp

    The Runtime
    -

    See projects/Stacker/lib/runtime/stacker_rt.c

    +

    See projects/llvm-stacker/lib/runtime/stacker_rt.c

    Compiler Driver
    -

    See projects/Stacker/tools/stkrc/stkrc.cpp

    +

    See projects/llvm-stacker/tools/stkrc/stkrc.cpp

    Test Programs
    -

    See projects/Stacker/test/*.st

    +

    See projects/llvm-stacker/test/*.st

    @@ -1374,16 +1392,9 @@ interested, here are some things that could be implemented better:

  • Write an LLVM pass to compute the correct stack depth needed by the program. Currently the stack is set to a fixed number which means programs with large numbers of definitions might fail.
  • -
  • Enhance to run on 64-bit platforms like SPARC. Right now the size of a - pointer on 64-bit machines will cause incorrect results because of the - 32-bit size of a stack element currently supported. This feature was not - implemented because LLVM needs a union type to be able to support the - different sizes correctly (portably and efficiently).
  • Write an LLVM pass to optimize the use of the global stack. The code emitted currently is somewhat wasteful. It gets cleaned up a lot by existing passes but more could be done.
  • -
  • Add -O -O1 -O2 and -O3 optimization switches to the compiler driver to - allow LLVM optimization without using "opt."
  • Make the compiler driver use the LLVM linking facilities (with IPO) before depending on GCC to do the final link.
  • Clean up parsing. It doesn't handle errors very well.
  • @@ -1409,7 +1420,7 @@ interested, here are some things that could be implemented better:

    src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!">Reid Spencer
    - LLVM Compiler Infrastructure
    + LLVM Compiler Infrastructure
    Last modified: $Date$
    Definition Of Operation Of Built In Words
    LOGICAL OPERATIONS
    Word Name Operation Description
    < LT w1 w2 -- b -- b The boolean value TRUE (-1) is pushed on to the stack.
    BITWISE OPERATORS
    BITWISE OPERATORS
    Word Name
    ARITHMETIC OPERATORS
    ARITHMETIC OPERATORS
    Word Name Two values are popped off the stack. The larger value is pushed back on to the stack.
    STACK MANIPULATION OPERATORS
    STACK MANIPULATION OPERATORS
    Word Name
    RROT RROTw1 w2 w3 -- w2 w3 w1w1 w2 w3 -- w3 w1 w2 Reverse rotation. Like ROT, but it rotates the other way around. Essentially, the third element on the stack is moved to the top of the stack.
    MEMORY OPERATORS
    MEMORY OPERATORS
    Word Name
    CONTROL FLOW OPERATORS
    CONTROL FLOW OPERATORS
    Word Name
    WHILE (words...) ENDWHILE (words...) END
    WHILE word ENDWHILE word END b -- b The boolean value on the top of the stack is examined. If it is non-zero then the - "words..." between WHILE and END are executed. Execution then begins again at the WHILE where another - boolean is popped off the stack. To prevent this operation from eating up the entire - stack, you should push on to the stack (just before the END) a boolean value that indicates - whether to terminate. Note that since booleans and integers can be coerced you can - use the following "for loop" idiom:
    - (push count) WHILE (words...) -- END
    +
    The boolean value on the top of the stack is examined (not popped). If + it is non-zero then the "word" between WHILE and END is executed. + Execution then begins again at the WHILE where the boolean on the top of + the stack is examined again. The stack is not modified by the WHILE...END + loop, only examined. It is imperative that the "word" in the body of the + loop ensure that the top of the stack contains the next boolean to examine + when it completes. Note that since booleans and integers can be coerced + you can use the following "for loop" idiom:
    + (push count) WHILE word -- END
    For example:
    - 10 WHILE DUP >d -- END
    - This will print the numbers from 10 down to 1. 10 is pushed on the stack. Since that is - non-zero, the while loop is entered. The top of the stack (10) is duplicated and then - printed out with >d. The top of the stack is decremented, yielding 9 and control is - transfered back to the WHILE keyword. The process starts all over again and repeats until - the top of stack is decremented to 0 at which the WHILE test fails and control is - transfered to the word after the END.
    INPUT & OUTPUT OPERATORS
    INPUT & OUTPUT OPERATORS
    Word Name