.. contents::
:local:
-Written by `Chris Lattner <mailto:sabre@nondot.org>`_
-
Chapter 4 Introduction
======================
In order to get per-function optimizations going, we need to set up a
`FunctionPassManager <../WritingAnLLVMPass.html#passmanager>`_ to hold
and organize the LLVM optimizations that we want to run. Once we have
-that, we can add a set of optimizations to run. The code looks like
-this:
+that, we can add a set of optimizations to run. We'll need a new
+FunctionPassManager for each module that we want to optimize, so we'll
+write a function to create and initialize both the module and pass manager
+for us:
.. code-block:: c++
- FunctionPassManager OurFPM(TheModule);
+ void InitializeModuleAndPassManager(void) {
+ // Open a new module.
+ TheModule = llvm::make_unique<Module>("my cool jit", getGlobalContext());
+ TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
+
+ // Create a new pass manager attached to it.
+ TheFPM = llvm::make_unique<FunctionPassManager>(TheModule.get());
- // Set up the optimizer pipeline. Start with registering info about how the
- // target lays out data structures.
- OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
// Provide basic AliasAnalysis support for GVN.
- OurFPM.add(createBasicAliasAnalysisPass());
+ TheFPM->add(createBasicAliasAnalysisPass());
// Do simple "peephole" optimizations and bit-twiddling optzns.
- OurFPM.add(createInstructionCombiningPass());
+ TheFPM->add(createInstructionCombiningPass());
// Reassociate expressions.
- OurFPM.add(createReassociatePass());
+ TheFPM->add(createReassociatePass());
// Eliminate Common SubExpressions.
- OurFPM.add(createGVNPass());
+ TheFPM->add(createGVNPass());
// Simplify the control flow graph (deleting unreachable blocks, etc).
- OurFPM.add(createCFGSimplificationPass());
-
- OurFPM.doInitialization();
-
- // Set the global so the code gen can use this.
- TheFPM = &OurFPM;
+ TheFPM->add(createCFGSimplificationPass());
- // Run the main "interpreter loop" now.
- MainLoop();
+ TheFPM->doInitialization();
+ }
-This code defines a ``FunctionPassManager``, "``OurFPM``". It requires a
-pointer to the ``Module`` to construct itself. Once it is set up, we use
-a series of "add" calls to add a bunch of LLVM passes. The first pass is
-basically boilerplate, it adds a pass so that later optimizations know
-how the data structures in the program are laid out. The
-"``TheExecutionEngine``" variable is related to the JIT, which we will
-get to in the next section.
+This code initializes the global module ``TheModule``, and the function pass
+manager ``TheFPM``, which is attached to ``TheModule``. Once the pass manager is
+set up, we use a series of "add" calls to add a bunch of LLVM passes.
-In this case, we choose to add 4 optimization passes. The passes we
-chose here are a pretty standard set of "cleanup" optimizations that are
-useful for a wide variety of code. I won't delve into what they do but,
-believe me, they are a good starting place :).
+In this case, we choose to add five passes: one analysis pass (alias analysis),
+and four optimization passes. The passes we choose here are a pretty standard set
+of "cleanup" optimizations that are useful for a wide variety of code. I won't
+delve into what they do but, believe me, they are a good starting place :).
Once the PassManager is set up, we need to make use of it. We do this by
running it after our newly created function is constructed (in
-``FunctionAST::Codegen``), but before it is returned to the client:
+``FunctionAST::codegen()``), but before it is returned to the client:
.. code-block:: c++
- if (Value *RetVal = Body->Codegen()) {
+ if (Value *RetVal = Body->codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
be able to call it from the command line.
In order to do this, we first declare and initialize the JIT. This is
-done by adding a global variable and a call in ``main``:
+done by adding a global variable ``TheJIT``, and initializing it in
+``main``:
.. code-block:: c++
- static ExecutionEngine *TheExecutionEngine;
+ static std::unique_ptr<KaleidoscopeJIT> TheJIT;
...
int main() {
..
- // Create the JIT. This takes ownership of the module.
- TheExecutionEngine = EngineBuilder(TheModule).create();
- ..
+ TheJIT = llvm::make_unique<KaleidoscopeJIT>();
+
+ // Run the main "interpreter loop" now.
+ MainLoop();
+
+ return 0;
}
-This creates an abstract "Execution Engine" which can be either a JIT
-compiler or the LLVM interpreter. LLVM will automatically pick a JIT
-compiler for you if one is available for your platform, otherwise it
-will fall back to the interpreter.
+The KaleidoscopeJIT class is a simple JIT built specifically for these
+tutorials. In later chapters we will look at how it works and extend it with
+new features, but for now we will take it as given. Its API is very simple:
+``addModule`` adds an LLVM IR module to the JIT, making its functions
+available for execution; ``removeModule`` removes a module, freeing any
+memory associated with the code in that module; and ``findSymbol`` allows us
+to look up pointers to the compiled code.
-Once the ``ExecutionEngine`` is created, the JIT is ready to be used.
-There are a variety of APIs that are useful, but the simplest one is the
-"``getPointerToFunction(F)``" method. This method JIT compiles the
-specified LLVM Function and returns a function pointer to the generated
-machine code. In our case, this means that we can change the code that
-parses a top-level expression to look like this:
+We can take this simple API and change our code that parses top-level expressions to
+look like this:
.. code-block:: c++
static void HandleTopLevelExpression() {
// Evaluate a top-level expression into an anonymous function.
- if (FunctionAST *F = ParseTopLevelExpr()) {
- if (Function *LF = F->Codegen()) {
- LF->dump(); // Dump the function for exposition purposes.
+ if (auto FnAST = ParseTopLevelExpr()) {
+ if (FnAST->codegen()) {
+
+ // JIT the module containing the anonymous expression, keeping a handle so
+ // we can free it later.
+ auto H = TheJIT->addModule(std::move(TheModule));
+ InitializeModuleAndPassManager();
- // JIT the function, returning a function pointer.
- void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
+ // Search the JIT for the __anon_expr symbol.
+ auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
+ assert(ExprSymbol && "Function not found");
- // Cast it to the right type (takes no arguments, returns a double) so we
- // can call it as a native function.
- double (*FP)() = (double (*)())(intptr_t)FPtr;
+ // Get the symbol's address and cast it to the right type (takes no
+ // arguments, returns a double) so we can call it as a native function.
+ double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
fprintf(stderr, "Evaluated to %f\n", FP());
+
+ // Delete the anonymous expression module from the JIT.
+ TheJIT->removeModule(H);
}
-Recall that we compile top-level expressions into a self-contained LLVM
-function that takes no arguments and returns the computed double.
-Because the LLVM JIT compiler matches the native platform ABI, this
-means that you can just cast the result pointer to a function pointer of
-that type and call it directly. This means, there is no difference
-between JIT compiled code and native machine code that is statically
-linked into your application.
+If parsing and codegen succeed, the next step is to add the module containing
+the top-level expression to the JIT. We do this by calling ``addModule``, which
+triggers code generation for all the functions in the module, and returns a
+handle that can be used to remove the module from the JIT later. Once the module
+has been added to the JIT it can no longer be modified, so we also open a new
+module to hold subsequent code by calling ``InitializeModuleAndPassManager()``.
+
+Once we've added the module to the JIT we need to get a pointer to the final
+generated code. We do this by calling the JIT's ``findSymbol`` method, and passing
+the name of the top-level expression function: ``__anon_expr``. Since we just
+added this function, we assert that ``findSymbol`` returned a result.
+
+Next, we get the in-memory address of the ``__anon_expr`` function by calling
+``getAddress()`` on the symbol. Recall that we compile top-level expressions
+into a self-contained LLVM function that takes no arguments and returns the
+computed double. Because the LLVM JIT compiler matches the native platform ABI,
+this means that you can just cast the result pointer to a function pointer of
+that type and call it directly. As a result, there is no difference between
+JIT-compiled code and native machine code that is statically linked into your
+application.
+
+Finally, since we don't support re-evaluation of top-level expressions, we
+remove the module from the JIT when we're done to free the associated memory.
+Recall, however, that the module we created a few lines earlier (via
+``InitializeModuleAndPassManager``) is still open and waiting for new code to be
+added.
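
The address-to-function-pointer step can be tried without a JIT at all. The following is a minimal sketch, not the tutorial's code: ``anonExprStandIn`` and ``standInGetAddress`` are hypothetical stand-ins for the JIT-compiled ``__anon_expr`` and for the address that ``getAddress()`` returns, and the cast assumes a mainstream platform where a function address fits in ``intptr_t``.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the JIT-compiled __anon_expr function:
// it takes no arguments and returns a double.
extern "C" double anonExprStandIn() { return 4.0 + 5.0; }

// Hypothetical stand-in for the symbol's getAddress(): hands back the
// function's location as a plain integer address.
std::intptr_t standInGetAddress() {
  return reinterpret_cast<std::intptr_t>(&anonExprStandIn);
}

// Cast the integer address to the matching function-pointer type and call it,
// mirroring the cast used in HandleTopLevelExpression.
double callAsNative(std::intptr_t Addr) {
  assert(Addr != 0 && "Function not found");
  double (*FP)() = (double (*)())Addr;
  return FP();
}
```

Calling ``callAsNative(standInGetAddress())`` returns ``9.0``. The cast is only valid because the function-pointer type matches the real signature of the code at that address.
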
With just these two changes, lets see how Kaleidoscope works now!
Evaluated to 24.000000
-This illustrates that we can now call user code, but there is something
-a bit subtle going on here. Note that we only invoke the JIT on the
-anonymous functions that *call testfunc*, but we never invoked it on
-*testfunc* itself. What actually happened here is that the JIT scanned
-for all non-JIT'd functions transitively called from the anonymous
-function and compiled all of them before returning from
-``getPointerToFunction()``.
+ ready> testfunc(5, 10);
+ ready> LLVM ERROR: Program used external function 'testfunc' which could not be resolved!
+
+
+Function definitions and calls also work, but something went very wrong on that
+last line. The call looks valid, so what happened? As you may have guessed from
+the API, a module is a unit of allocation for the JIT, and ``testfunc`` was part
+of the same module that contained the anonymous expression. When we removed that
+module from the JIT to free the memory for the anonymous expression, we deleted
+the definition of ``testfunc`` along with it. Then, when we tried to call
+``testfunc`` a second time, the JIT could no longer find it.
+
+The easiest way to fix this is to put the anonymous expression in a separate
+module from the rest of the function definitions. The JIT will happily resolve
+function calls across module boundaries, as long as each of the functions called
+has a prototype, and is added to the JIT before it is called. By putting the
+anonymous expression in a different module we can delete it without affecting
+the rest of the functions.
+
+In fact, we're going to go a step further and put every function in its own
+module. Doing so allows us to exploit a useful property of the KaleidoscopeJIT
+that will make our environment more REPL-like: Functions can be added to the
+JIT more than once (unlike a module where every function must have a unique
+definition). When you look up a symbol in KaleidoscopeJIT it will always return
+the most recent definition:
+
+::
+
+ ready> def foo(x) x + 1;
+ Read function definition:
+ define double @foo(double %x) {
+ entry:
+ %addtmp = fadd double %x, 1.000000e+00
+ ret double %addtmp
+ }
+
+ ready> foo(2);
+ Evaluated to 3.000000
+
+ ready> def foo(x) x + 2;
+ define double @foo(double %x) {
+ entry:
+ %addtmp = fadd double %x, 2.000000e+00
+ ret double %addtmp
+ }
+
+ ready> foo(2);
+ Evaluated to 4.000000
+
+
+To allow each function to live in its own module we'll need a way to
+re-generate previous function declarations into each new module we open:
+
+.. code-block:: c++
+
+ static std::unique_ptr<KaleidoscopeJIT> TheJIT;
+
+ ...
+
+ Function *getFunction(std::string Name) {
+ // First, see if the function has already been added to the current module.
+ if (auto *F = TheModule->getFunction(Name))
+ return F;
+
+ // If not, check whether we can codegen the declaration from some existing
+ // prototype.
+ auto FI = FunctionProtos.find(Name);
+ if (FI != FunctionProtos.end())
+ return FI->second->codegen();
+
+ // If no existing prototype exists, return null.
+ return nullptr;
+ }
+
+ ...
+
+ Value *CallExprAST::codegen() {
+ // Look up the name in the global module table.
+ Function *CalleeF = getFunction(Callee);
+
+ ...
+
+ Function *FunctionAST::codegen() {
+ // Transfer ownership of the prototype to the FunctionProtos map, but keep a
+ // reference to it for use below.
+ auto &P = *Proto;
+ FunctionProtos[Proto->getName()] = std::move(Proto);
+ Function *TheFunction = getFunction(P.getName());
+ if (!TheFunction)
+ return nullptr;
+
+
+To enable this, we'll start by adding a new global, ``FunctionProtos``, that
+holds the most recent prototype for each function. We'll also add a convenience
+method, ``getFunction()``, to replace calls to ``TheModule->getFunction()``.
+Our convenience method searches ``TheModule`` for an existing function
+declaration, falling back to generating a new declaration from FunctionProtos if
+it doesn't find one. In ``CallExprAST::codegen()`` we just need to replace the
+call to ``TheModule->getFunction()``. In ``FunctionAST::codegen()`` we need to
+update the FunctionProtos map first, then call ``getFunction()``. With this
+done, we can always obtain a function declaration in the current module for any
+previously declared function.
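
The lookup-then-regenerate pattern described above can be modelled without LLVM. The sketch below is an illustration under loud assumptions: ``ToyModule``, ``ToyProto``, ``ToyFunctionProtos``, ``rememberToyProto``, ``openNewToyModule`` and ``toyGetFunction`` are all hypothetical names, a "module" is reduced to the set of function names declared in it, and "codegen" of a prototype just records a declaration.

```cpp
#include <map>
#include <memory>
#include <set>
#include <string>

// Toy module (assumption): only tracks which function names are declared in it.
struct ToyModule {
  std::set<std::string> Declared;
};

// Toy prototype (assumption): "codegen" emits a declaration into a module.
struct ToyProto {
  std::string Name;
  bool codegen(ToyModule &M) const {
    M.Declared.insert(Name);
    return true;
  }
};

static std::unique_ptr<ToyModule> TheToyModule(new ToyModule());
static std::map<std::string, std::unique_ptr<ToyProto>> ToyFunctionProtos;

// Models handing the current module to the JIT and opening a fresh one.
void openNewToyModule() { TheToyModule.reset(new ToyModule()); }

// Records the most recent prototype for a function name.
void rememberToyProto(const std::string &Name) {
  ToyFunctionProtos[Name].reset(new ToyProto{Name});
}

// Mirrors the shape of getFunction(): check the current module first, then
// fall back to regenerating the declaration from a saved prototype.
bool toyGetFunction(const std::string &Name) {
  if (TheToyModule->Declared.count(Name))
    return true;
  auto FI = ToyFunctionProtos.find(Name);
  if (FI != ToyFunctionProtos.end())
    return FI->second->codegen(*TheToyModule);
  return false; // No existing prototype.
}
```

After ``rememberToyProto("foo")`` and ``openNewToyModule()``, a call to ``toyGetFunction("foo")`` still succeeds and re-declares ``foo`` in the fresh module, which is the property the real ``getFunction()`` relies on.
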
+
+We also need to update HandleDefinition and HandleExtern:
+
+.. code-block:: c++
+
+ static void HandleDefinition() {
+ if (auto FnAST = ParseDefinition()) {
+ if (auto *FnIR = FnAST->codegen()) {
+ fprintf(stderr, "Read function definition:");
+ FnIR->dump();
+ TheJIT->addModule(std::move(TheModule));
+ InitializeModuleAndPassManager();
+ }
+ } else {
+ // Skip token for error recovery.
+ getNextToken();
+ }
+ }
+
+ static void HandleExtern() {
+ if (auto ProtoAST = ParseExtern()) {
+ if (auto *FnIR = ProtoAST->codegen()) {
+ fprintf(stderr, "Read extern: ");
+ FnIR->dump();
+ FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
+ }
+ } else {
+ // Skip token for error recovery.
+ getNextToken();
+ }
+ }
+
+In HandleDefinition, we add two lines to transfer the newly defined function to
+the JIT and open a new module. In HandleExtern, we just need to add one line to
+add the prototype to FunctionProtos.
+
+With these changes made, let's try our REPL again (I removed the dump of the
+anonymous functions this time; you should get the idea by now :) :
+
+::
+
+ ready> def foo(x) x + 1;
+ ready> foo(2);
+ Evaluated to 3.000000
+
+ ready> def foo(x) x + 2;
+ ready> foo(2);
+ Evaluated to 4.000000
+
+It works!
-The JIT provides a number of other more advanced interfaces for things
-like freeing allocated machine code, rejit'ing functions to update them,
-etc. However, even with this simple code, we get some surprisingly
-powerful capabilities - check this out (I removed the dump of the
-anonymous functions, you should get the idea by now :) :
+Even with this simple code, we get some surprisingly powerful capabilities -
+check this out:
::
Evaluated to 1.000000
-Whoa, how does the JIT know about sin and cos? The answer is
-surprisingly simple: in this example, the JIT started execution of a
-function and got to a function call. It realized that the function was
-not yet JIT compiled and invoked the standard set of routines to resolve
-the function. In this case, there is no body defined for the function,
-so the JIT ended up calling "``dlsym("sin")``" on the Kaleidoscope
-process itself. Since "``sin``" is defined within the JIT's address
-space, it simply patches up calls in the module to call the libm version
-of ``sin`` directly.
-
-The LLVM JIT provides a number of interfaces (look in the
-``ExecutionEngine.h`` file) for controlling how unknown functions get
-resolved. It allows you to establish explicit mappings between IR
-objects and addresses (useful for LLVM global variables that you want to
-map to static tables, for example), allows you to dynamically decide on
-the fly based on the function name, and even allows you to have the JIT
-compile functions lazily the first time they're called.
-
-One interesting application of this is that we can now extend the
-language by writing arbitrary C++ code to implement operations. For
-example, if we add:
+Whoa, how does the JIT know about sin and cos? The answer is surprisingly
+simple: The KaleidoscopeJIT has a straightforward symbol resolution rule that
+it uses to find symbols that aren't available in any given module: First
+it searches all the modules that have already been added to the JIT, from the
+most recent to the oldest, to find the newest definition. If no definition is
+found inside the JIT, it falls back to calling "``dlsym("sin")``" on the
+Kaleidoscope process itself. Since "``sin``" is defined within the JIT's
+address space, it simply patches up calls in the module to call the libm
+version of ``sin`` directly.
+
+In the future we'll see how tweaking this symbol resolution rule can be used to
+enable all sorts of useful features, from security (restricting the set of
+symbols available to JIT'd code), to dynamic code generation based on symbol
+names, and even lazy compilation.
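
The resolution rule described above (newest module first, then the host process) can be sketched in a few lines. This is not the KaleidoscopeJIT implementation: ``ToyJIT``, ``ToyModuleSymbols`` and ``ProcessSymbols`` are hypothetical, addresses are plain integers, and the ``ProcessSymbols`` map is only a stand-in for the ``dlsym`` fallback.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ToyAddress = std::uintptr_t;
using ToyModuleSymbols = std::map<std::string, ToyAddress>;

struct ToyJIT {
  std::vector<ToyModuleSymbols> Modules; // Stored oldest first.
  ToyModuleSymbols ProcessSymbols;       // Stand-in for dlsym on the host.

  void addModule(ToyModuleSymbols M) { Modules.push_back(std::move(M)); }

  // Search added modules from most recent to oldest, then fall back to the
  // host process symbols. Returns 0 if the name cannot be resolved.
  ToyAddress findSymbol(const std::string &Name) const {
    for (auto It = Modules.rbegin(); It != Modules.rend(); ++It) {
      auto Found = It->find(Name);
      if (Found != It->end())
        return Found->second;
    }
    auto Host = ProcessSymbols.find(Name);
    return Host != ProcessSymbols.end() ? Host->second : 0;
  }
};
```

Walking the module list in reverse is what makes a later definition of ``foo`` shadow an earlier one, while names such as ``sin`` that no module defines fall through to the host lookup.
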
+
+One immediate benefit of the symbol resolution rule is that we can now extend
+the language by writing arbitrary C++ code to implement operations. For example,
+if we add:
.. code-block:: c++
/// putchard - putchar that takes a double and returns 0.
- extern "C"
- double putchard(double X) {
- putchar((char)X);
+ extern "C" double putchard(double X) {
+ fputc((char)X, stderr);
return 0;
}
.. code-block:: bash
# Compile
- clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
# Run
./toy
Here is the code:
-.. code-block:: c++
-
- #include "llvm/DerivedTypes.h"
- #include "llvm/ExecutionEngine/ExecutionEngine.h"
- #include "llvm/ExecutionEngine/JIT.h"
- #include "llvm/IRBuilder.h"
- #include "llvm/LLVMContext.h"
- #include "llvm/Module.h"
- #include "llvm/PassManager.h"
- #include "llvm/Analysis/Verifier.h"
- #include "llvm/Analysis/Passes.h"
- #include "llvm/DataLayout.h"
- #include "llvm/Transforms/Scalar.h"
- #include "llvm/Support/TargetSelect.h"
- #include <cstdio>
- #include <string>
- #include <map>
- #include <vector>
- using namespace llvm;
-
- //===----------------------------------------------------------------------===//
- // Lexer
- //===----------------------------------------------------------------------===//
-
- // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
- // of these for known things.
- enum Token {
- tok_eof = -1,
-
- // commands
- tok_def = -2, tok_extern = -3,
-
- // primary
- tok_identifier = -4, tok_number = -5
- };
-
- static std::string IdentifierStr; // Filled in if tok_identifier
- static double NumVal; // Filled in if tok_number
-
- /// gettok - Return the next token from standard input.
- static int gettok() {
- static int LastChar = ' ';
-
- // Skip any whitespace.
- while (isspace(LastChar))
- LastChar = getchar();
-
- if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
- IdentifierStr = LastChar;
- while (isalnum((LastChar = getchar())))
- IdentifierStr += LastChar;
-
- if (IdentifierStr == "def") return tok_def;
- if (IdentifierStr == "extern") return tok_extern;
- return tok_identifier;
- }
-
- if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
- std::string NumStr;
- do {
- NumStr += LastChar;
- LastChar = getchar();
- } while (isdigit(LastChar) || LastChar == '.');
-
- NumVal = strtod(NumStr.c_str(), 0);
- return tok_number;
- }
-
- if (LastChar == '#') {
- // Comment until end of line.
- do LastChar = getchar();
- while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
-
- if (LastChar != EOF)
- return gettok();
- }
-
- // Check for end of file. Don't eat the EOF.
- if (LastChar == EOF)
- return tok_eof;
-
- // Otherwise, just return the character as its ascii value.
- int ThisChar = LastChar;
- LastChar = getchar();
- return ThisChar;
- }
-
- //===----------------------------------------------------------------------===//
- // Abstract Syntax Tree (aka Parse Tree)
- //===----------------------------------------------------------------------===//
-
- /// ExprAST - Base class for all expression nodes.
- class ExprAST {
- public:
- virtual ~ExprAST() {}
- virtual Value *Codegen() = 0;
- };
-
- /// NumberExprAST - Expression class for numeric literals like "1.0".
- class NumberExprAST : public ExprAST {
- double Val;
- public:
- NumberExprAST(double val) : Val(val) {}
- virtual Value *Codegen();
- };
-
- /// VariableExprAST - Expression class for referencing a variable, like "a".
- class VariableExprAST : public ExprAST {
- std::string Name;
- public:
- VariableExprAST(const std::string &name) : Name(name) {}
- virtual Value *Codegen();
- };
-
- /// BinaryExprAST - Expression class for a binary operator.
- class BinaryExprAST : public ExprAST {
- char Op;
- ExprAST *LHS, *RHS;
- public:
- BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
- : Op(op), LHS(lhs), RHS(rhs) {}
- virtual Value *Codegen();
- };
-
- /// CallExprAST - Expression class for function calls.
- class CallExprAST : public ExprAST {
- std::string Callee;
- std::vector<ExprAST*> Args;
- public:
- CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
- : Callee(callee), Args(args) {}
- virtual Value *Codegen();
- };
-
- /// PrototypeAST - This class represents the "prototype" for a function,
- /// which captures its name, and its argument names (thus implicitly the number
- /// of arguments the function takes).
- class PrototypeAST {
- std::string Name;
- std::vector<std::string> Args;
- public:
- PrototypeAST(const std::string &name, const std::vector<std::string> &args)
- : Name(name), Args(args) {}
-
- Function *Codegen();
- };
-
- /// FunctionAST - This class represents a function definition itself.
- class FunctionAST {
- PrototypeAST *Proto;
- ExprAST *Body;
- public:
- FunctionAST(PrototypeAST *proto, ExprAST *body)
- : Proto(proto), Body(body) {}
-
- Function *Codegen();
- };
-
- //===----------------------------------------------------------------------===//
- // Parser
- //===----------------------------------------------------------------------===//
-
- /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
- /// token the parser is looking at. getNextToken reads another token from the
- /// lexer and updates CurTok with its results.
- static int CurTok;
- static int getNextToken() {
- return CurTok = gettok();
- }
-
- /// BinopPrecedence - This holds the precedence for each binary operator that is
- /// defined.
- static std::map<char, int> BinopPrecedence;
-
- /// GetTokPrecedence - Get the precedence of the pending binary operator token.
- static int GetTokPrecedence() {
- if (!isascii(CurTok))
- return -1;
-
- // Make sure it's a declared binop.
- int TokPrec = BinopPrecedence[CurTok];
- if (TokPrec <= 0) return -1;
- return TokPrec;
- }
-
- /// Error* - These are little helper functions for error handling.
- ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
- PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
- FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
-
- static ExprAST *ParseExpression();
-
- /// identifierexpr
- /// ::= identifier
- /// ::= identifier '(' expression* ')'
- static ExprAST *ParseIdentifierExpr() {
- std::string IdName = IdentifierStr;
-
- getNextToken(); // eat identifier.
-
- if (CurTok != '(') // Simple variable ref.
- return new VariableExprAST(IdName);
-
- // Call.
- getNextToken(); // eat (
- std::vector<ExprAST*> Args;
- if (CurTok != ')') {
- while (1) {
- ExprAST *Arg = ParseExpression();
- if (!Arg) return 0;
- Args.push_back(Arg);
-
- if (CurTok == ')') break;
-
- if (CurTok != ',')
- return Error("Expected ')' or ',' in argument list");
- getNextToken();
- }
- }
-
- // Eat the ')'.
- getNextToken();
-
- return new CallExprAST(IdName, Args);
- }
-
- /// numberexpr ::= number
- static ExprAST *ParseNumberExpr() {
- ExprAST *Result = new NumberExprAST(NumVal);
- getNextToken(); // consume the number
- return Result;
- }
-
- /// parenexpr ::= '(' expression ')'
- static ExprAST *ParseParenExpr() {
- getNextToken(); // eat (.
- ExprAST *V = ParseExpression();
- if (!V) return 0;
-
- if (CurTok != ')')
- return Error("expected ')'");
- getNextToken(); // eat ).
- return V;
- }
-
- /// primary
- /// ::= identifierexpr
- /// ::= numberexpr
- /// ::= parenexpr
- static ExprAST *ParsePrimary() {
- switch (CurTok) {
- default: return Error("unknown token when expecting an expression");
- case tok_identifier: return ParseIdentifierExpr();
- case tok_number: return ParseNumberExpr();
- case '(': return ParseParenExpr();
- }
- }
-
- /// binoprhs
- /// ::= ('+' primary)*
- static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
- // If this is a binop, find its precedence.
- while (1) {
- int TokPrec = GetTokPrecedence();
-
- // If this is a binop that binds at least as tightly as the current binop,
- // consume it, otherwise we are done.
- if (TokPrec < ExprPrec)
- return LHS;
-
- // Okay, we know this is a binop.
- int BinOp = CurTok;
- getNextToken(); // eat binop
-
- // Parse the primary expression after the binary operator.
- ExprAST *RHS = ParsePrimary();
- if (!RHS) return 0;
-
- // If BinOp binds less tightly with RHS than the operator after RHS, let
- // the pending operator take RHS as its LHS.
- int NextPrec = GetTokPrecedence();
- if (TokPrec < NextPrec) {
- RHS = ParseBinOpRHS(TokPrec+1, RHS);
- if (RHS == 0) return 0;
- }
-
- // Merge LHS/RHS.
- LHS = new BinaryExprAST(BinOp, LHS, RHS);
- }
- }
-
- /// expression
- /// ::= primary binoprhs
- ///
- static ExprAST *ParseExpression() {
- ExprAST *LHS = ParsePrimary();
- if (!LHS) return 0;
-
- return ParseBinOpRHS(0, LHS);
- }
-
- /// prototype
- /// ::= id '(' id* ')'
- static PrototypeAST *ParsePrototype() {
- if (CurTok != tok_identifier)
- return ErrorP("Expected function name in prototype");
-
- std::string FnName = IdentifierStr;
- getNextToken();
-
- if (CurTok != '(')
- return ErrorP("Expected '(' in prototype");
-
- std::vector<std::string> ArgNames;
- while (getNextToken() == tok_identifier)
- ArgNames.push_back(IdentifierStr);
- if (CurTok != ')')
- return ErrorP("Expected ')' in prototype");
-
- // success.
- getNextToken(); // eat ')'.
-
- return new PrototypeAST(FnName, ArgNames);
- }
-
- /// definition ::= 'def' prototype expression
- static FunctionAST *ParseDefinition() {
- getNextToken(); // eat def.
- PrototypeAST *Proto = ParsePrototype();
- if (Proto == 0) return 0;
-
- if (ExprAST *E = ParseExpression())
- return new FunctionAST(Proto, E);
- return 0;
- }
-
- /// toplevelexpr ::= expression
- static FunctionAST *ParseTopLevelExpr() {
- if (ExprAST *E = ParseExpression()) {
- // Make an anonymous proto.
- PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
- return new FunctionAST(Proto, E);
- }
- return 0;
- }
-
- /// external ::= 'extern' prototype
- static PrototypeAST *ParseExtern() {
- getNextToken(); // eat extern.
- return ParsePrototype();
- }
-
- //===----------------------------------------------------------------------===//
- // Code Generation
- //===----------------------------------------------------------------------===//
-
- static Module *TheModule;
- static IRBuilder<> Builder(getGlobalContext());
- static std::map<std::string, Value*> NamedValues;
- static FunctionPassManager *TheFPM;
-
- Value *ErrorV(const char *Str) { Error(Str); return 0; }
-
- Value *NumberExprAST::Codegen() {
- return ConstantFP::get(getGlobalContext(), APFloat(Val));
- }
-
- Value *VariableExprAST::Codegen() {
- // Look this variable up in the function.
- Value *V = NamedValues[Name];
- return V ? V : ErrorV("Unknown variable name");
- }
-
- Value *BinaryExprAST::Codegen() {
- Value *L = LHS->Codegen();
- Value *R = RHS->Codegen();
- if (L == 0 || R == 0) return 0;
-
- switch (Op) {
- case '+': return Builder.CreateFAdd(L, R, "addtmp");
- case '-': return Builder.CreateFSub(L, R, "subtmp");
- case '*': return Builder.CreateFMul(L, R, "multmp");
- case '<':
- L = Builder.CreateFCmpULT(L, R, "cmptmp");
- // Convert bool 0/1 to double 0.0 or 1.0
- return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
- "booltmp");
- default: return ErrorV("invalid binary operator");
- }
- }
-
- Value *CallExprAST::Codegen() {
- // Look up the name in the global module table.
- Function *CalleeF = TheModule->getFunction(Callee);
- if (CalleeF == 0)
- return ErrorV("Unknown function referenced");
-
- // If argument mismatch error.
- if (CalleeF->arg_size() != Args.size())
- return ErrorV("Incorrect # arguments passed");
-
- std::vector<Value*> ArgsV;
- for (unsigned i = 0, e = Args.size(); i != e; ++i) {
- ArgsV.push_back(Args[i]->Codegen());
- if (ArgsV.back() == 0) return 0;
- }
-
- return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
- }
-
- Function *PrototypeAST::Codegen() {
- // Make the function type: double(double,double) etc.
- std::vector<Type*> Doubles(Args.size(),
- Type::getDoubleTy(getGlobalContext()));
- FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
- Doubles, false);
-
- Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
-
- // If F conflicted, there was already something named 'Name'. If it has a
- // body, don't allow redefinition or reextern.
- if (F->getName() != Name) {
- // Delete the one we just made and get the existing one.
- F->eraseFromParent();
- F = TheModule->getFunction(Name);
-
- // If F already has a body, reject this.
- if (!F->empty()) {
- ErrorF("redefinition of function");
- return 0;
- }
-
- // If F took a different number of args, reject.
- if (F->arg_size() != Args.size()) {
- ErrorF("redefinition of function with different # args");
- return 0;
- }
- }
-
- // Set names for all arguments.
- unsigned Idx = 0;
- for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
- ++AI, ++Idx) {
- AI->setName(Args[Idx]);
-
- // Add arguments to variable symbol table.
- NamedValues[Args[Idx]] = AI;
- }
-
- return F;
- }
-
- Function *FunctionAST::Codegen() {
- NamedValues.clear();
-
- Function *TheFunction = Proto->Codegen();
- if (TheFunction == 0)
- return 0;
-
- // Create a new basic block to start insertion into.
- BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
- Builder.SetInsertPoint(BB);
-
- if (Value *RetVal = Body->Codegen()) {
- // Finish off the function.
- Builder.CreateRet(RetVal);
-
- // Validate the generated code, checking for consistency.
- verifyFunction(*TheFunction);
-
- // Optimize the function.
- TheFPM->run(*TheFunction);
-
- return TheFunction;
- }
-
- // Error reading body, remove function.
- TheFunction->eraseFromParent();
- return 0;
- }
-
- //===----------------------------------------------------------------------===//
- // Top-Level parsing and JIT Driver
- //===----------------------------------------------------------------------===//
-
- static ExecutionEngine *TheExecutionEngine;
-
- static void HandleDefinition() {
- if (FunctionAST *F = ParseDefinition()) {
- if (Function *LF = F->Codegen()) {
- fprintf(stderr, "Read function definition:");
- LF->dump();
- }
- } else {
- // Skip token for error recovery.
- getNextToken();
- }
- }
-
- static void HandleExtern() {
- if (PrototypeAST *P = ParseExtern()) {
- if (Function *F = P->Codegen()) {
- fprintf(stderr, "Read extern: ");
- F->dump();
- }
- } else {
- // Skip token for error recovery.
- getNextToken();
- }
- }
-
- static void HandleTopLevelExpression() {
- // Evaluate a top-level expression into an anonymous function.
- if (FunctionAST *F = ParseTopLevelExpr()) {
- if (Function *LF = F->Codegen()) {
- fprintf(stderr, "Read top-level expression:");
- LF->dump();
-
- // JIT the function, returning a function pointer.
- void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
-
- // Cast it to the right type (takes no arguments, returns a double) so we
- // can call it as a native function.
- double (*FP)() = (double (*)())(intptr_t)FPtr;
- fprintf(stderr, "Evaluated to %f\n", FP());
- }
- } else {
- // Skip token for error recovery.
- getNextToken();
- }
- }
-
- /// top ::= definition | external | expression | ';'
- static void MainLoop() {
- while (1) {
- fprintf(stderr, "ready> ");
- switch (CurTok) {
- case tok_eof: return;
- case ';': getNextToken(); break; // ignore top-level semicolons.
- case tok_def: HandleDefinition(); break;
- case tok_extern: HandleExtern(); break;
- default: HandleTopLevelExpression(); break;
- }
- }
- }
-
- //===----------------------------------------------------------------------===//
- // "Library" functions that can be "extern'd" from user code.
- //===----------------------------------------------------------------------===//
-
- /// putchard - putchar that takes a double and returns 0.
- extern "C"
- double putchard(double X) {
- putchar((char)X);
- return 0;
- }
-
- //===----------------------------------------------------------------------===//
- // Main driver code.
- //===----------------------------------------------------------------------===//
-
- int main() {
- InitializeNativeTarget();
- LLVMContext &Context = getGlobalContext();
-
- // Install standard binary operators.
- // 1 is lowest precedence.
- BinopPrecedence['<'] = 10;
- BinopPrecedence['+'] = 20;
- BinopPrecedence['-'] = 20;
- BinopPrecedence['*'] = 40; // highest.
-
- // Prime the first token.
- fprintf(stderr, "ready> ");
- getNextToken();
-
- // Make the module, which holds all the code.
- TheModule = new Module("my cool jit", Context);
-
- // Create the JIT. This takes ownership of the module.
- std::string ErrStr;
- TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
- if (!TheExecutionEngine) {
- fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
- exit(1);
- }
-
- FunctionPassManager OurFPM(TheModule);
-
- // Set up the optimizer pipeline. Start with registering info about how the
- // target lays out data structures.
- OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
- // Provide basic AliasAnalysis support for GVN.
- OurFPM.add(createBasicAliasAnalysisPass());
- // Do simple "peephole" optimizations and bit-twiddling optzns.
- OurFPM.add(createInstructionCombiningPass());
- // Reassociate expressions.
- OurFPM.add(createReassociatePass());
- // Eliminate Common SubExpressions.
- OurFPM.add(createGVNPass());
- // Simplify the control flow graph (deleting unreachable blocks, etc).
- OurFPM.add(createCFGSimplificationPass());
-
- OurFPM.doInitialization();
-
- // Set the global so the code gen can use this.
- TheFPM = &OurFPM;
-
- // Run the main "interpreter loop" now.
- MainLoop();
-
- TheFPM = 0;
-
- // Print out all of the generated code.
- TheModule->dump();
-
- return 0;
- }
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter4/toy.cpp
+ :language: c++
`Next: Extending the language: control flow <LangImpl5.html>`_