===================================
Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. The structure of this graph is completely determined
by plugins, which can be either statically or dynamically linked. This
makes it possible to easily adapt LLVMC for other purposes - for
example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to override the language detected from the
extension (for example, to compile a file ending with ".cpp" as C),
use the ``-x`` option, just like you would with ``gcc``::

    $ # hello.cpp is really a C file
    $ llvmc2 -x c hello.cpp

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o

LLVMC has some built-in options that can't be overridden:

* ``-o FILE`` - Output file name.

* ``-x LANGUAGE`` - Specify the language of the following input files
  until the next ``-x`` option.

* ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
  ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.

* ``-v`` - Enable verbose mode, i.e. print out all executed commands.

* ``--view-graph`` - Show a graphical representation of the compilation
  graph. Requires that you have ``dot`` and ``gv`` programs
  installed. Hidden option, useful for debugging.

* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
  current directory with the compilation graph description in the
  Graphviz format. Hidden option, useful for debugging.

* ``--save-temps`` - Write temporary files to the current directory
  and do not delete them on exit. Hidden option, useful for debugging.

* ``--help``, ``--help-hidden``, ``--version`` - These options have
  their standard meaning.

Compiling LLVMC plugins
=======================

It's easiest to start working on your own LLVMC plugin by copying the
skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::

    $ cd $LLVMC_DIR/plugins
    $ cp -r Simple MyPlugin
    Makefile PluginMain.cpp Simple.td

As you can see, our basic plugin consists of only two files (not
counting the build script). ``Simple.td`` contains the TableGen
description of the compilation graph; its format is documented in the
following sections. ``PluginMain.cpp`` is just a helper file used to
compile the auto-generated C++ code produced from the TableGen source.
It can also contain hook definitions (see the section on hooks below).

The first thing that you should do is to change the ``LLVMC_PLUGIN``
variable in the ``Makefile`` to avoid conflicts (since this variable
is used to name the resulting library)::

    LLVMC_PLUGIN=MyPlugin

It is also a good idea to rename ``Simple.td`` to something less
generic::

    $ mv Simple.td MyPlugin.td

Note that the plugin source directory must be placed under
``$LLVMC_DIR/plugins`` to make use of the existing build
infrastructure. To build a version of the LLVMC executable called
``mydriver`` with your plugin compiled in, use the following command::

    $ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver

To build your plugin as a dynamic library, just ``cd`` to its source
directory and run ``make``. The resulting file will be called
``LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
``LLVMCMyPlugin.so``). This library can then be loaded with the
``-load`` option. Example::

    $ cd $LLVMC_DIR/plugins/Simple
    $ make
    $ llvmc2 -load $LLVM_DIR/Release/lib/LLVMCSimple.so

Sometimes, you will want a 'bare-bones' version of LLVMC that has no
built-in plugins. It can be compiled with the following command::

    $ make BUILTIN_PLUGINS=""

How plugins are loaded
======================

It is possible for LLVMC plugins to depend on each other. For example,
one can create edges between nodes defined in some other plugin. To
make this work, however, that plugin should be loaded first. To
achieve this, the concept of plugin priority was introduced. By
default, every plugin has priority zero; to specify the priority
explicitly, put the following line in your ``.td`` file::

    def Priority : PluginPriority<$PRIORITY_VALUE>;
    // where PRIORITY_VALUE is some integer > 0

Plugins are loaded in order of their (increasing) priority, starting
with 0. Therefore, the plugin with the highest priority value will be
loaded last.
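
As a concrete illustration, consider a hypothetical plugin ``MyPlugin``
that adds an edge to a tool provided by another plugin. The names
``MyPlugin`` and ``my_frontend`` are assumptions made for this sketch,
the definition of ``my_frontend`` is omitted, and the edge syntax is
explained in the section on the compilation graph. Giving the plugin a
priority of 1 makes it load after every plugin that keeps the default
priority of zero::

    // MyPlugin.td (hypothetical plugin description).
    include "llvm/CompilerDriver/Common.td"

    // Load after all plugins that keep the default priority of 0.
    def Priority : PluginPriority<1>;

    def CompilationGraph : CompilationGraph<[
        // "my_frontend" is a hypothetical tool defined in this plugin;
        // "llc" is assumed to come from a plugin that is loaded earlier.
        Edge<"root", "my_frontend">,
        Edge<"my_frontend", "llc">
        ]>;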

Customizing LLVMC: the compilation graph
========================================

Each TableGen configuration file should include the common
definitions::

    include "llvm/CompilerDriver/Common.td"

    // Optionally, also include the file below,
    // which contains some useful tool definitions:
    // include "llvm/CompilerDriver/Tools.td"

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file
``plugins/Base/Base.td`` for an example) is just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<"root", "llvm_gcc_c">,
        Edge<"root", "llvm_gcc_assembler">,
        ...

        Edge<"llvm_gcc_c", "llc">,
        Edge<"llvm_gcc_cpp", "llc">,
        ...

        OptionalEdge<"llvm_gcc_c", "opt",
            (case (switch_on "opt"), (inc_weight))>,
        OptionalEdge<"llvm_gcc_cpp", "opt",
            (case (switch_on "opt"), (inc_weight))>,
        ...

        OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are differentiated by an additional ``case`` expression
used to calculate the weight of this edge. Notice also that we refer
to tools via their names (as strings). This makes it possible to add
edges to an existing compilation graph in plugins without having to
know about all tool definitions used in the graph.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets special treatment - there you are allowed to specify one
default edge *per language*).
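
The weight rules can be illustrated with a small sketch. The tool names
(``my_compiler``, ``my_linker``, ``my_fast_linker``) and the ``-fast``
switch are hypothetical and are assumed to be defined elsewhere::

    def CompilationGraph : CompilationGraph<[
        Edge<"root", "my_compiler">,

        // Default edge: weight 1.
        Edge<"my_compiler", "my_linker">,

        // Optional edge: weight 0 when "-fast" is absent,
        // 0 + 2*1 = 2 when it is given.
        OptionalEdge<"my_compiler", "my_fast_linker",
            (case (switch_on "fast"), (inc_weight))>
        ]>;

Under the rules above, without ``-fast`` the optional edge has weight 0,
so the default edge (weight 1) is chosen; with ``-fast`` the optional
edge has weight 2 and is chosen instead.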

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.

Writing a tool description
==========================

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name. Can be either a string or a
    list, in case the tool supports multiple input languages.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.
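
A minimal sketch of a tool that uses only the properties listed above
may make this more concrete. The tool ``my_generator``, the program
``my-gen`` and the language name ``my-lang`` are hypothetical::

    def my_generator : Tool<[
        (in_language "my-lang"),
        (out_language "c"),
        (output_suffix "c"),
        // "my-gen" is assumed to write its result to standard output,
        // so the output is redirected to $OUTFILE.
        (cmd_line "my-gen $INFILE > $OUTFILE")
        ]>;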

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
            (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - an option that takes an argument.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as the ``parameter_option``, but the
    option name and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating
    aliases. Unlike other option types, aliases are not allowed to
    have any properties besides the aliased option name. Usage
    example: ``(alias_option "preprocess", "E")``

* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``forward_as`` - Change the name of this option, but forward the
    argument unchanged. Example: ``(forward_as "--disable-optimize")``.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.
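
The option types and properties above can be combined as in the
following sketch, which is modeled on the tool definitions shown
earlier. The tool ``my_cc``, the program ``my-cc``, the language name
``my-lang`` and all option names are hypothetical, and the comments
only restate the property descriptions from the lists above::

    def my_cc : Tool<[
        (in_language "my-lang"),
        (out_language "object-code"),
        (output_suffix "o"),
        (cmd_line "my-cc -c $INFILE -o $OUTFILE"),

        // -Idir1 -Idir2: name and value are not separated,
        // and the option may occur more than once.
        (prefix_list_option "I", (forward),
            (help "add a directory to the include path")),

        // -Wa,-foo=bar,-baz is split into "-foo=bar -baz".
        (prefix_list_option "Wa", (unpack_values),
            (help "pass options to the assembler")),

        // Forwarded under a different name, argument unchanged.
        (parameter_option "arch", (forward_as "--target"),
            (help "target architecture")),

        // Obligatory option.
        (parameter_option "config", (forward), (required),
            (help "configuration file to use"))
        ]>;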

Option list - specifying all options in a single place
======================================================

It can be handy to have all information about options gathered in a
single place to provide an overview. This can be achieved by using a
so-called ``OptionList``::

    def Options : OptionList<[
        (switch_option "E", (help "Help string")),
        (alias_option "quiet", "q")
        ]>;

``OptionList`` is also a good place to specify option aliases.

Tool-specific option properties like ``append_cmd`` have (obviously)
no meaning in the context of ``OptionList``, so the only properties
allowed there are ``help`` and ``required``.

Option lists are used at the file scope. See file
``plugins/Clang/Clang.td`` for an example of ``OptionList`` usage.

Using hooks and environment variables in the ``cmd_line`` property
==================================================================

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to write your own hooks, just add their definitions to the
``PluginMain.cpp`` or drop a ``.cpp`` file into the
``$LLVMC_DIR/driver`` directory. Hooks should live in the ``hooks``
namespace and have the signature
``std::string hooks::MyHookName(void)``. They can be used from the
``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options use
the ``case`` expression (documented below)::

    (cmd_line
      (case
        (switch_on "E"),
           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default),
           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))

Conditional evaluation: the ``case`` expression
===============================================

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is designed after the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

Examples::

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Evaluates to "cmdline1" if option "-A" is provided on the
    // command line; to "cmdline2" if "-B" is provided;
    // otherwise to "cmdline3".
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")

Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification - in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.

Case expressions can also be nested, i.e. the following is legal::

    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
          (default), ...)

You should, however, try to avoid doing that because it hurts
readability. It is usually better to split tool descriptions and/or
use TableGen inheritance instead.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the section on writing a tool
    description for a discussion of the different kinds of
    command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the language of the input
    file equals the argument. At the moment works only with the
    ``cmd_line`` property on non-join nodes.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user. Example: ``(not_empty "o")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed.

  - ``or`` - Another logical combinator that returns true if at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.
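
Several of these tests can be combined in one ``case`` expression. In
the following sketch the tool names and the ``-static`` switch are
hypothetical and assumed to be defined elsewhere; the weights follow
the rules given in the section on the compilation graph::

    def CompilationGraph : CompilationGraph<[
        OptionalEdge<"my_compiler", "my_cpp_linker",
            (case
                // True if any input is C++ or a C++ linker was requested.
                (or (input_languages_contain "c++"),
                    (parameter_equals "linker", "c++")), (inc_weight),
                // True only if both "-static" and "-o" are given.
                (and (switch_on "static"), (not_empty "o")), (inc_weight 3))>
        ]>;

If only the first test holds, the edge gets a weight of 2; if only the
second holds, 3; if both hold, 2 + 3 = 5.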

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. The language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;
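
As a sketch, a plugin that adds support for a hypothetical language
``my-lang`` could map its file suffixes as follows. The language name
and both suffixes are made up for this example::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"my-lang", ["ml2", "mylang"]>
        ]>;

A tool declared with ``(in_language "my-lang")`` and reachable from the
root node can then accept files with these suffixes.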

When writing LLVMC plugins, it can be useful to get a visual view of
the resulting compilation graph. This can be achieved via the command
line option ``--view-graph``. This command assumes that Graphviz [2]_ and
Ghostview [3]_ are installed. There is also a ``--write-graph`` option that
creates a Graphviz source file (``compilation-graph.dot``) in the
current directory.

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html

.. [2] Graphviz
       http://www.graphviz.org/

.. [3] Ghostview
       http://pages.cs.wisc.edu/~ghost/