A more interesting example is included in Chapter 6 where we write a
little Kaleidoscope application that `displays a Mandelbrot
-Set <LangImpl6.html#example>`_ at various levels of magnification.
+Set <LangImpl6.html#kicking-the-tires>`_ at various levels of magnification.
Lets dive into the implementation of this language!
}
With this, we have the complete lexer for the basic Kaleidoscope
-language (the `full code listing <LangImpl2.html#code>`_ for the Lexer
+language (the `full code listing <LangImpl2.html#full-code-listing>`_ for the Lexer
is available in the `next chapter <LangImpl2.html>`_ of the tutorial).
Next we'll `build a simple parser that uses this to build an Abstract
Syntax Tree <LangImpl2.html>`_. When we have that, we'll include a
can define a helper function to wrap it together into one entry point.
We call this class of expressions "primary" expressions, for reasons
that will become more clear `later in the
-tutorial <LangImpl6.html#unary>`_. In order to parse an arbitrary
+tutorial <LangImpl6.html#user-defined-unary-operators>`_. In order to parse an arbitrary
primary expression, we need to determine what sort of expression it is:
.. code-block:: c++
The driver for this simply invokes all of the parsing pieces with a
top-level dispatch loop. There isn't much interesting here, so I'll just
-include the top-level loop. See `below <#code>`_ for full code in the
+include the top-level loop. See `below <#full-code-listing>`_ for full code in the
"Top-Level Parsing" section.
.. code-block:: c++
This code simply checks to see that the specified name is in the map (if
not, an unknown variable is being referenced) and returns the value for
it. In future chapters, we'll add support for `loop induction
-variables <LangImpl5.html#for>`_ in the symbol table, and for `local
-variables <LangImpl7.html#localvars>`_.
+variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and for `local
+variables <LangImpl7.html#user-defined-local-variables>`_.
.. code-block:: c++
suffix. Local value names for instructions are purely optional, but it
makes it much easier to read the IR dumps.
-`LLVM instructions <../LangRef.html#instref>`_ are constrained by strict
+`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
rules: for example, the Left and Right operators of an `add
-instruction <../LangRef.html#i_add>`_ must have the same type, and the
+instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
result type of the add must match the operand types. Because all values
in Kaleidoscope are doubles, this makes for very simple code for add,
sub and mul.
On the other hand, LLVM specifies that the `fcmp
-instruction <../LangRef.html#i_fcmp>`_ always returns an 'i1' value (a
+instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
one bit integer). The problem with this is that Kaleidoscope wants the
value to be a 0.0 or 1.0 value. In order to get these semantics, we
combine the fcmp instruction with a `uitofp
-instruction <../LangRef.html#i_uitofp>`_. This instruction converts its
+instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
input integer into a floating point value by treating the input as an
unsigned value. In contrast, if we used the `sitofp
-instruction <../LangRef.html#i_sitofp>`_, the Kaleidoscope '<' operator
+instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
would return 0.0 and -1.0, depending on the input value.
.. code-block:: c++
Once we have the function to call, we recursively codegen each argument
that is to be passed in, and create an LLVM `call
-instruction <../LangRef.html#i_call>`_. Note that LLVM uses the native C
+instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
calling conventions by default, allowing these calls to also call into
standard library functions like "sin" and "cos", with no additional
effort.
we call the ``codegen()`` method for the root expression of the function. If no
error happens, this emits code to compute the expression into the entry block
and returns the value that was computed. Assuming no error, we then create an
-LLVM `ret instruction <../LangRef.html#i_ret>`_, which completes the function.
+LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the function.
Once the function is built, we call ``verifyFunction``, which is
provided by LLVM. This function does a variety of consistency checks on
the generated code, to determine if our compiler is doing everything
Note how the parser turns the top-level expression into anonymous
functions for us. This will be handy when we add `JIT
-support <LangImpl4.html#jit>`_ in the next chapter. Also note that the
+support <LangImpl4.html#adding-a-jit-compiler>`_ in the next chapter. Also note that the
code is very literally transcribed, no optimizations are being performed
except simple constant folding done by IRBuilder. We will `add
-optimizations <LangImpl4.html#trivialconstfold>`_ explicitly in the next
+optimizations <LangImpl4.html#trivial-constant-folding>`_ explicitly in the next
chapter.
::
optimizer until the entire file has been parsed.
In order to get per-function optimizations going, we need to set up a
-`FunctionPassManager <../WritingAnLLVMPass.html#passmanager>`_ to hold
+`FunctionPassManager <../WritingAnLLVMPass.html#what-passmanager-does>`_ to hold
and organize the LLVM optimizations that we want to run. Once we have
that, we can add a set of optimizations to run. We'll need a new
FunctionPassManager for each module that we want to optimize, so we'll
}
This code initializes the global module ``TheModule``, and the function pass
-manager ``TheFPM``, which is attached to ``TheModule``. One the pass manager is
+manager ``TheFPM``, which is attached to ``TheModule``. Once the pass manager is
set up, we use a series of "add" calls to add a bunch of LLVM passes.
In this case, we choose to add five passes: one analysis pass (alias analysis),
To visualize the control flow graph, you can use a nifty feature of the
LLVM '`opt <http://llvm.org/cmds/opt.html>`_' tool. If you put this LLVM
IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
-window will pop up <../ProgrammersManual.html#ViewGraph>`_ and you'll
+window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
see this graph:
.. figure:: LangImpl5-cfg.png
this as a way to show some interesting parsing techniques.
At the end of this tutorial, we'll run through an example Kaleidoscope
-application that `renders the Mandelbrot set <#example>`_. This gives an
+application that `renders the Mandelbrot set <#kicking-the-tires>`_. This gives an
example of what you can build with Kaleidoscope and its feature set.
User-defined Operators: the Idea
return tok_identifier;
This just adds lexer support for the unary and binary keywords, like we
-did in `previous chapters <LangImpl5.html#iflexer>`_. One nice thing
+did in `previous chapters <LangImpl5.html#lexer-extensions-for-if-then-else>`_. One nice thing
about our current AST, is that we represent binary operators with full
generalisation by using their ASCII code as the opcode. For our extended
operators, we'll use this same representation, so we don't need any new
*name* actually refers to the address for that space. Stack variables
work the same way, except that instead of being declared with global
variable definitions, they are declared with the `LLVM alloca
-instruction <../LangRef.html#i_alloca>`_:
+instruction <../LangRef.html#alloca-instruction>`_:
.. code-block:: llvm
funny pointer arithmetic is involved, the alloca will not be
promoted.
#. mem2reg only works on allocas of `first
- class <../LangRef.html#t_classifications>`_ values (such as pointers,
+ class <../LangRef.html#first-class-types>`_ values (such as pointers,
scalars and vectors), and only if the array size of the allocation is
1 (or missing in the .ll file). mem2reg is not capable of promoting
structs or arrays to registers. Note that the "scalarrepl" pass is
As you can see, this is pretty straightforward. Now we need to update
the things that define the variables to set up the alloca. We'll start
-with ``ForExprAST::codegen()`` (see the `full code listing <#code>`_ for
+with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
the unabridged code):
.. code-block:: c++
...
This code is virtually identical to the code `before we allowed mutable
-variables <LangImpl5.html#forcodegen>`_. The big difference is that we
+variables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we
no longer have to construct a PHI node, and we use load/store to access
the variable as needed.
====================
Similar to the ``IRBuilder`` class we have a
-```DIBuilder`` <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
+`DIBuilder <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
that helps in constructing debug metadata for an llvm IR file. It
corresponds 1:1 similarly to ``IRBuilder`` and llvm IR, but with nicer names.
Using it does require that you be more familiar with DWARF terminology than
you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
read through the general documentation on the
-```Metadata Format`` <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
+`Metadata Format <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
should be a little more clear. We'll be using this class to construct all
of our IR level descriptions. Construction for it takes a module so we
need to construct it shortly after we construct our module. We've left it
extending the type system in all sorts of interesting ways. Simple
arrays are very easy and are quite useful for many different
applications. Adding them is mostly an exercise in learning how the
- LLVM `getelementptr <../LangRef.html#i_getelementptr>`_ instruction
+ LLVM `getelementptr <../LangRef.html#getelementptr-instruction>`_ instruction
works: it is so nifty/unconventional, it `has its own
FAQ <../GetElementPtr.html>`_! If you add support for recursive types
(e.g. linked lists), make sure to read the `section in the LLVM
A more interesting example is included in Chapter 6 where we write a
little Kaleidoscope application that `displays a Mandelbrot
-Set <OCamlLangImpl6.html#example>`_ at various levels of magnification.
+Set <OCamlLangImpl6.html#kicking-the-tires>`_ at various levels of magnification.
Lets dive into the implementation of this language!
| [< >] -> [< >]
With this, we have the complete lexer for the basic Kaleidoscope
-language (the `full code listing <OCamlLangImpl2.html#code>`_ for the
+language (the `full code listing <OCamlLangImpl2.html#full-code-listing>`_ for the
Lexer is available in the `next chapter <OCamlLangImpl2.html>`_ of the
tutorial). Next we'll `build a simple parser that uses this to build an
Abstract Syntax Tree <OCamlLangImpl2.html>`_. When we have that, we'll
process. For each production in our grammar, we'll define a function
which parses that production. We call this class of expressions
"primary" expressions, for reasons that will become more clear `later in
-the tutorial <OCamlLangImpl6.html#unary>`_. In order to parse an
+the tutorial <OCamlLangImpl6.html#user-defined-unary-operators>`_. In order to parse an
arbitrary primary expression, we need to determine what sort of
expression it is. For numeric literals, we have:
The driver for this simply invokes all of the parsing pieces with a
top-level dispatch loop. There isn't much interesting here, so I'll just
-include the top-level loop. See `below <#code>`_ for full code in the
+include the top-level loop. See `below <#full-code-listing>`_ for full code in the
"Top-Level Parsing" section.
.. code-block:: ocaml
arguments. This code simply checks to see that the specified name is in
the map (if not, an unknown variable is being referenced) and returns
the value for it. In future chapters, we'll add support for `loop
-induction variables <LangImpl5.html#for>`_ in the symbol table, and for
-`local variables <LangImpl7.html#localvars>`_.
+induction variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and for
+`local variables <LangImpl7.html#user-defined-local-variables>`_.
.. code-block:: ocaml
suffix. Local value names for instructions are purely optional, but it
makes it much easier to read the IR dumps.
-`LLVM instructions <../LangRef.html#instref>`_ are constrained by strict
+`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
rules: for example, the Left and Right operators of an `add
-instruction <../LangRef.html#i_add>`_ must have the same type, and the
+instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
result type of the add must match the operand types. Because all values
in Kaleidoscope are doubles, this makes for very simple code for add,
sub and mul.
On the other hand, LLVM specifies that the `fcmp
-instruction <../LangRef.html#i_fcmp>`_ always returns an 'i1' value (a
+instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
one bit integer). The problem with this is that Kaleidoscope wants the
value to be a 0.0 or 1.0 value. In order to get these semantics, we
combine the fcmp instruction with a `uitofp
-instruction <../LangRef.html#i_uitofp>`_. This instruction converts its
+instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
input integer into a floating point value by treating the input as an
unsigned value. In contrast, if we used the `sitofp
-instruction <../LangRef.html#i_sitofp>`_, the Kaleidoscope '<' operator
+instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
would return 0.0 and -1.0, depending on the input value.
.. code-block:: ocaml
Once we have the function to call, we recursively codegen each argument
that is to be passed in, and create an LLVM `call
-instruction <../LangRef.html#i_call>`_. Note that LLVM uses the native C
+instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
calling conventions by default, allowing these calls to also call into
standard library functions like "sin" and "cos", with no additional
effort.
This indicates the type and name to use, as well as which module to
insert into. By default we assume a function has
``Llvm.Linkage.ExternalLinkage``. "`external
-linkage <LangRef.html#linkage>`_" means that the function may be defined
+linkage <../LangRef.html#linkage>`_" means that the function may be defined
outside the current module and/or that it is callable by functions
outside the module. The "``name``" passed in is the name the user
specified: this name is registered in "``Codegen.the_module``"s symbol
method for the root expression of the function. If no error happens,
this emits code to compute the expression into the entry block and
returns the value that was computed. Assuming no error, we then create
-an LLVM `ret instruction <../LangRef.html#i_ret>`_, which completes the
+an LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the
function. Once the function is built, we call
``Llvm_analysis.assert_valid_function``, which is provided by LLVM. This
function does a variety of consistency checks on the generated code, to
Note how the parser turns the top-level expression into anonymous
functions for us. This will be handy when we add `JIT
-support <OCamlLangImpl4.html#jit>`_ in the next chapter. Also note that
+support <OCamlLangImpl4.html#adding-a-jit-compiler>`_ in the next chapter. Also note that
the code is very literally transcribed, no optimizations are being
performed. We will `add
-optimizations <OCamlLangImpl4.html#trivialconstfold>`_ explicitly in the
+optimizations <OCamlLangImpl4.html#trivial-constant-folding>`_ explicitly in the
next chapter.
::
optimizer until the entire file has been parsed.
In order to get per-function optimizations going, we need to set up a
-`Llvm.PassManager <../WritingAnLLVMPass.html#passmanager>`_ to hold and
+`Llvm.PassManager <../WritingAnLLVMPass.html#what-passmanager-does>`_ to hold and
organize the LLVM optimizations that we want to run. Once we have that,
we can add a set of optimizations to run. The code looks like this:
To visualize the control flow graph, you can use a nifty feature of the
LLVM '`opt <http://llvm.org/cmds/opt.html>`_' tool. If you put this LLVM
IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
-window will pop up <../ProgrammersManual.html#ViewGraph>`_ and you'll
+window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
see this graph:
.. figure:: LangImpl5-cfg.png
this as a way to show some interesting parsing techniques.
At the end of this tutorial, we'll run through an example Kaleidoscope
-application that `renders the Mandelbrot set <#example>`_. This gives an
+application that `renders the Mandelbrot set <#kicking-the-tires>`_. This gives an
example of what you can build with Kaleidoscope and its feature set.
User-defined Operators: the Idea
| "unary" -> [< 'Token.Unary; stream >]
This just adds lexer support for the unary and binary keywords, like we
-did in `previous chapters <OCamlLangImpl5.html#iflexer>`_. One nice
+did in `previous chapters <OCamlLangImpl5.html#lexer-extensions-for-if-then-else>`_. One nice
thing about our current AST, is that we represent binary operators with
full generalisation by using their ASCII code as the opcode. For our
extended operators, we'll use this same representation, so we don't need
*name* actually refers to the address for that space. Stack variables
work the same way, except that instead of being declared with global
variable definitions, they are declared with the `LLVM alloca
-instruction <../LangRef.html#i_alloca>`_:
+instruction <../LangRef.html#alloca-instruction>`_:
.. code-block:: llvm
funny pointer arithmetic is involved, the alloca will not be
promoted.
#. mem2reg only works on allocas of `first
- class <../LangRef.html#t_classifications>`_ values (such as pointers,
+ class <../LangRef.html#first-class-types>`_ values (such as pointers,
scalars and vectors), and only if the array size of the allocation is
1 (or missing in the .ll file). mem2reg is not capable of promoting
structs or arrays to registers. Note that the "scalarrepl" pass is
As you can see, this is pretty straightforward. Now we need to update
the things that define the variables to set up the alloca. We'll start
-with ``codegen_expr Ast.For ...`` (see the `full code listing <#code>`_
+with ``codegen_expr Ast.For ...`` (see the `full code listing <#id1>`_
for the unabridged code):
.. code-block:: ocaml
...
This code is virtually identical to the code `before we allowed mutable
-variables <OCamlLangImpl5.html#forcodegen>`_. The big difference is that
+variables <OCamlLangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that
we no longer have to construct a PHI node, and we use load/store to
access the variable as needed.
extending the type system in all sorts of interesting ways. Simple
arrays are very easy and are quite useful for many different
applications. Adding them is mostly an exercise in learning how the
- LLVM `getelementptr <../LangRef.html#i_getelementptr>`_ instruction
+ LLVM `getelementptr <../LangRef.html#getelementptr-instruction>`_ instruction
works: it is so nifty/unconventional, it `has its own
FAQ <../GetElementPtr.html>`_! If you add support for recursive types
(e.g. linked lists), make sure to read the `section in the LLVM