"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>TableGen Fundamentals</title>
<link rel="stylesheet" href="llvm.css" type="text/css">
</head>
<body>
-<div class="doc_title">TableGen Fundamentals</div>
+<h1>TableGen Fundamentals</h1>
-<div class="doc_text">
+<div>
<ul>
<li><a href="#introduction">Introduction</a>
<ol>
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="introduction">Introduction</a></div>
+<h2><a name="introduction">Introduction</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>TableGen's purpose is to help a human develop and maintain records of
domain-specific information. Because there may be a large number of these
<tt>llvm/utils/emacs</tt> and <tt>llvm/utils/vim</tt> directories of your LLVM
distribution, respectively.</p>
-</div>
-
<!-- ======================================================================= -->
-<div class="doc_subsection"><a name="concepts">Basic concepts</a></div>
+<h3><a name="concepts">Basic concepts</a></h3>
-<div class="doc_text">
+<div>
<p>TableGen files consist of two key parts: 'classes' and 'definitions', both
of which are considered 'records'.</p>
as "Instruction".</p>
<p><b>TableGen multiclasses</b> are groups of abstract records that are
-instantiated all at once. Each instantiation can result in multiple TableGen
-definitions.</p>
+instantiated all at once. Each instantiation can result in multiple
+TableGen definitions. If a multiclass inherits from another multiclass,
+the definitions in the sub-multiclass become part of the current
+multiclass, as if they were declared in the current multiclass.</p>
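+
+<p>As a minimal sketch of multiclass inheritance (the <tt>Inst</tt> class
+and the multiclass names here are hypothetical, chosen only for
+illustration), a multiclass can list another multiclass after a colon:</p>
+
+<div class="doc_code">
+<pre>
+<b>class</b> Inst<bits<4> opc, string form> {
+  bits<4> opcode = opc;
+  string name = form;
+}
+
+<i>// Base multiclass: two register/immediate forms.</i>
+<b>multiclass</b> ri_forms<bits<4> opc> {
+  <b>def</b> rr : Inst<opc, "rr">;
+  <b>def</b> ri : Inst<opc, "ri">;
+}
+
+<i>// Inherits rr and ri from ri_forms and adds a memory form.</i>
+<b>multiclass</b> rim_forms<bits<4> opc> : ri_forms<opc> {
+  <b>def</b> rm : Inst<opc, "rm">;
+}
+
+<i>// Expected to define ADDrr, ADDri and ADDrm.</i>
+<b>defm</b> ADD : rim_forms<0x1>;
+</pre>
+</div>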
</div>
<!-- ======================================================================= -->
-<div class="doc_subsection"><a name="example">An example record</a></div>
+<h3><a name="example">An example record</a></h3>
-<div class="doc_text">
+<div>
<p>With no other arguments, TableGen parses the specified file and prints out
all of the classes, then all of the definitions. This is a good way to see what
<b>bit</b> isIndirectBranch = 0;
<b>bit</b> isBarrier = 0;
<b>bit</b> isCall = 0;
- <b>bit</b> isSimpleLoad = 0;
+ <b>bit</b> canFoldAsLoad = 0;
<b>bit</b> mayLoad = 0;
<b>bit</b> mayStore = 0;
<b>bit</b> isImplicitDef = 0;
- <b>bit</b> isTwoAddress = 1;
<b>bit</b> isConvertibleToThreeAddress = 1;
<b>bit</b> isCommutable = 1;
<b>bit</b> isTerminator = 0;
<b>bit</b> isReMaterializable = 0;
<b>bit</b> isPredicable = 0;
<b>bit</b> hasDelaySlot = 0;
- <b>bit</b> usesCustomDAGSchedInserter = 0;
+ <b>bit</b> usesCustomInserter = 0;
<b>bit</b> hasCtrlDep = 0;
<b>bit</b> isNotDuplicable = 0;
<b>bit</b> hasSideEffects = 0;
- <b>bit</b> mayHaveSideEffects = 0;
<b>bit</b> neverHasSideEffects = 0;
InstrItinClass Itinerary = NoItinerary;
<b>string</b> Constraints = "";
<p>As you can see, a lot of information is needed for every instruction
supported by the code generator, and specifying it all manually would be
-unmaintainble, prone to bugs, and tiring to do in the first place. Because we
+unmaintainable, prone to bugs, and tiring to do in the first place. Because we
are using TableGen, all of the information was derived from the following
definition:</p>
key feature of TableGen is that it allows the end-user to define the
abstractions they prefer to use when describing their information.</p>
+<p>Each def record has a special entry called "NAME". This is the
+name of the def ("ADD32rr" above). In the general case, def names can
+be formed from various kinds of string-processing expressions, and NAME
+resolves to the final value obtained after resolving all of those
+expressions. The user may refer to NAME anywhere the ultimate name of
+the def is needed. To avoid conflicts, NAME should not be defined
+anywhere else in user code.</p>
+
</div>
<!-- ======================================================================= -->
-<div class="doc_subsection"><a name="running">Running TableGen</a></div>
+<h3><a name="running">Running TableGen</a></h3>
-<div class="doc_text">
+<div>
<p>TableGen runs just like any other LLVM tool. The first (optional) argument
specifies the file to read. If a filename is not specified, <tt>tblgen</tt>
</div>
+</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="syntax">TableGen syntax</a></div>
+<h2><a name="syntax">TableGen syntax</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>TableGen doesn't care about the meaning of data (that is up to the backend to
define), but it does care about syntax, and it enforces a simple type system.
This section describes the syntax and the constructs allowed in a TableGen file.
</p>
-</div>
-
<!-- ======================================================================= -->
-<div class="doc_subsection"><a name="primitives">TableGen primitives</a></div>
+<h3><a name="primitives">TableGen primitives</a></h3>
+
+<div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection"><a name="comments">TableGen comments</a></div>
+<h4><a name="comments">TableGen comments</a></h4>
-<div class="doc_text">
+<div>
<p>TableGen supports BCPL style "<tt>//</tt>" comments, which run to the end of
the line, and it also supports <b>nestable</b> "<tt>/* */</tt>" comments.</p>
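+
+<p>A short sketch of both comment forms (the record name <tt>Example</tt>
+is hypothetical):</p>
+
+<div class="doc_code">
+<pre>
+<i>// A line comment runs to the end of the line.</i>
+
+<i>/* A block comment /* may contain a nested comment */ and ends
+   only at the delimiter that matches its opening one. */</i>
+<b>def</b> Example;
+</pre>
+</div>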
</div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="types">The TableGen type system</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>TableGen files are strongly typed, in a simple (but complete) type-system.
These types are used to perform automatic conversions, check for errors, and to
<dd>This type represents a nestable directed graph of elements.</dd>
<dt><tt><b>code</b></tt></dt>
- <dd>This represents a big hunk of text. NOTE: I don't remember why this is
- distinct from string!</dd>
+ <dd>This represents a big hunk of text. This is lexically distinct from
+ string values because it doesn't require escaping double quotes and other
+ common characters that occur in code.</dd>
</dl>
<p>To date, these types have been sufficient for describing things that
</div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="values">TableGen values and expressions</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>TableGen allows for a pretty reasonable number of different expression forms
when building up values. These forms allow the TableGen file to be written in a
<dd>string value</dd>
<dt><tt>[{ ... }]</tt></dt>
<dd>code fragment</dd>
-<dt><tt>[ X, Y, Z ]</tt></dt>
- <dd>list value.</dd>
+<dt><tt>[ X, Y, Z ]<type></tt></dt>
+ <dd>list value. <type> is the type of the list
+element and is usually optional. In rare cases,
+TableGen is unable to deduce the element type, in
+which case the user must specify it explicitly.</dd>
<dt><tt>{ a, b, c }</tt></dt>
<dd>initializer for a "bits<3>" value</dd>
<dt><tt>value</tt></dt>
<dt><tt>!strconcat(a, b)</tt></dt>
<dd>A string value that is the result of concatenating the 'a' and 'b'
strings.</dd>
+<dt><tt>str1#str2</tt></dt>
+ <dd>"#" (paste) is a shorthand for !strconcat. It may concatenate
+ things that are not quoted strings, in which case an implicit
+ !cast<string> is done on the operand of the paste.</dd>
+<dt><tt>!cast<type>(a)</tt></dt>
+ <dd>A symbol of type <em>type</em> obtained by looking up the string 'a' in
+the symbol table. If the type of 'a' does not match <em>type</em>, TableGen
+aborts with an error. !cast<string> is a special case in that the argument must
+be an object defined by a 'def' construct.</dd>
+<dt><tt>!subst(a, b, c)</tt></dt>
+ <dd>If 'a' and 'b' are of string type or are symbol references, substitute
+'b' for 'a' in 'c'. This operation is analogous to $(subst) in GNU make.</dd>
+<dt><tt>!foreach(a, b, c)</tt></dt>
+ <dd>For each member 'b' of dag or list 'a', apply operator 'c'. 'b' is a
+dummy variable that should be declared as a member variable of an instantiated
+class. This operation is analogous to $(foreach) in GNU make.</dd>
+<dt><tt>!head(a)</tt></dt>
+ <dd>The first element of list 'a'.</dd>
+<dt><tt>!tail(a)</tt></dt>
+ <dd>All elements of list 'a' except the first.</dd>
+<dt><tt>!empty(a)</tt></dt>
+ <dd>An integer {0,1} indicating whether list 'a' is empty.</dd>
+<dt><tt>!if(a,b,c)</tt></dt>
+ <dd>'b' if the result of 'int' or 'bit' operator 'a' is nonzero,
+ 'c' otherwise.</dd>
+<dt><tt>!eq(a,b)</tt></dt>
+ <dd>'bit 1' if 'a' is equal to 'b', 0 otherwise. This
+ only operates on string, int and bit objects. Use !cast<string> to
+ compare other types of objects.</dd>
</dl>
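+
+<p>The following sketch shows a few of these forms together. The class
+<tt>Ops</tt>, its fields, and the def <tt>AddOps</tt> are hypothetical
+names used only for illustration, and the list operators are applied to a
+non-empty list:</p>
+
+<div class="doc_code">
+<pre>
+<b>class</b> Ops<string mnemonic, list<int> vals> {
+  <i>// Two spellings of the same concatenation.</i>
+  string Pasted = mnemonic # "_op";
+  string Concat = !strconcat(mnemonic, "_op");
+
+  <i>// First element and remaining elements of the list.</i>
+  int First = !head(vals);
+  list<int> Rest = !tail(vals);
+
+  <i>// 1 if the mnemonic is "add", 0 otherwise.</i>
+  int IsAdd = !if(!eq(mnemonic, "add"), 1, 0);
+}
+
+<i>// Expected: Pasted = Concat = "add_op", First = 1, Rest = [2, 3], IsAdd = 1.</i>
+<b>def</b> AddOps : Ops<"add", [1, 2, 3]>;
+</pre>
+</div>
+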
<p>Note that all of the values have rules specifying how they convert to values
</div>
+</div>
+
<!-- ======================================================================= -->
-<div class="doc_subsection">
+<h3>
<a name="classesdefs">Classes and definitions</a>
-</div>
+</h3>
-<div class="doc_text">
+<div>
<p>As mentioned in the <a href="#concepts">intro</a>, classes and definitions
(collectively known as 'records') in TableGen are the main high-level unit of
permit the specification of default values for their subclasses, allowing the
subclasses to override them as they wish.</p>
-</div>
-
<!---------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="valuedef">Value definitions</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>Value definitions define named entries in records. A value must be defined
before it can be referred to as the operand for another value definition or
</div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="recordlet">'let' expressions</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>A record-level let expression is used to change the value of a value
definition in a record. This is primarily useful when a superclass defines a
</div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="templateargs">Class template arguments</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>TableGen permits the definition of parameterized classes as well as normal
concrete classes. Parameterized TableGen classes specify a list of variable
</div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="multiclass">Multiclass definitions and instances</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>
While classes with template arguments are a good way to factor commonality
<p>The name of the resultant definitions has the multidef fragment names
appended to them, so this defines <tt>ADD_rr</tt>, <tt>ADD_ri</tt>,
- <tt>SUB_rr</tt>, etc. Using a multiclass this way is exactly equivalent to
- instantiating the classes multiple times yourself, e.g. by writing:</p>
+ <tt>SUB_rr</tt>, etc. A defm may inherit from multiple multiclasses,
+ instantiating definitions from each multiclass. Using a multiclass
+ this way is exactly equivalent to instantiating the classes multiple
+ times yourself, e.g. by writing:</p>
<div class="doc_code">
<pre>
</pre>
</div>
+<p>
+A defm can also be used inside a multiclass, providing several levels of
+multiclass instantiations.
+</p>
+
+<div class="doc_code">
+<pre>
+<b>class</b> Instruction<bits<4> opc, string Name> {
+ bits<4> opcode = opc;
+ string name = Name;
+}
+
+<b>multiclass</b> basic_r<bits<4> opc> {
+ <b>def</b> rr : Instruction<opc, "rr">;
+ <b>def</b> rm : Instruction<opc, "rm">;
+}
+
+<b>multiclass</b> basic_s<bits<4> opc> {
+ <b>defm</b> SS : basic_r<opc>;
+ <b>defm</b> SD : basic_r<opc>;
+ <b>def</b> X : Instruction<opc, "x">;
+}
+
+<b>multiclass</b> basic_p<bits<4> opc> {
+ <b>defm</b> PS : basic_r<opc>;
+ <b>defm</b> PD : basic_r<opc>;
+ <b>def</b> Y : Instruction<opc, "y">;
+}
+
+<b>defm</b> ADD : basic_s<0xf>, basic_p<0xf>;
+...
+
+<i>// Results</i>
+<b>def</b> ADDPDrm { ...
+<b>def</b> ADDPDrr { ...
+<b>def</b> ADDPSrm { ...
+<b>def</b> ADDPSrr { ...
+<b>def</b> ADDSDrm { ...
+<b>def</b> ADDSDrr { ...
+<b>def</b> ADDSSrm { ...
+<b>def</b> ADDSSrr { ...
+<b>def</b> ADDY { ...
+<b>def</b> ADDX { ...
+</pre>
+</div>
+
+<p>
+defm declarations can inherit from classes too. The rule to follow is
+that the class list must start after the last multiclass, and there must
+be at least one multiclass before it.
+</p>
+
+<div class="doc_code">
+<pre>
+<b>class</b> XD { bits<4> Prefix = 11; }
+<b>class</b> XS { bits<4> Prefix = 12; }
+
+<b>class</b> I<bits<4> op> {
+ bits<4> opcode = op;
+}
+
+<b>multiclass</b> R {
+ <b>def</b> rr : I<4>;
+ <b>def</b> rm : I<2>;
+}
+
+<b>multiclass</b> Y {
+ <b>defm</b> SS : R, XD;
+ <b>defm</b> SD : R, XS;
+}
+
+<b>defm</b> Instr : Y;
+
+<i>// Results</i>
+<b>def</b> InstrSDrm {
+ bits<4> opcode = { 0, 0, 1, 0 };
+ bits<4> Prefix = { 1, 1, 0, 0 };
+}
+...
+<b>def</b> InstrSSrr {
+ bits<4> opcode = { 0, 1, 0, 0 };
+ bits<4> Prefix = { 1, 0, 1, 1 };
+}
+</pre>
+</div>
+
+</div>
+
</div>
<!-- ======================================================================= -->
-<div class="doc_subsection">
+<h3>
<a name="filescope">File scope entities</a>
-</div>
+</h3>
+
+<div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="include">File inclusion</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>TableGen supports the '<tt>include</tt>' token, which textually substitutes
the specified file in place of the include directive. The filename should be
specified as a double quoted string immediately after the '<tt>include</tt>'
</div>
<!-- -------------------------------------------------------------------------->
-<div class="doc_subsubsection">
+<h4>
<a name="globallet">'let' expressions</a>
-</div>
+</h4>
-<div class="doc_text">
+<div>
<p>"Let" expressions at file scope are similar to <a href="#recordlet">"let"
expressions within a record</a>, except they can specify a value binding for
end-user to factor out commonality from the records.</p>
<p>File-scope "let" expressions take a comma-separated list of bindings to
-apply, and one of more records to bind the values in. Here are some
+apply, and one or more records to bind the values in. Here are some
examples:</p>
<div class="doc_code">
<b>let</b> Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] <b>in</b> {
- <b>def</b> CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
- "call\t${dst:call}", []>;
- <b>def</b> CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
- "call\t{*}$dst", [(X86call GR32:$dst)]>;
- <b>def</b> CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
- "call\t{*}$dst", []>;
+ <b>def</b> CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
+ "call\t${dst:call}", []>;
+ <b>def</b> CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
+ "call\t{*}$dst", [(X86call GR32:$dst)]>;
+ <b>def</b> CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
+ "call\t{*}$dst", []>;
}
</pre>
</div>
need to be added to several records, and the records do not otherwise need to be
opened, as in the case with the <tt>CALL*</tt> instructions above.</p>
+<p>It's also possible to use "let" expressions inside multiclasses, providing
+more ways to factor out commonality from the records, especially when using
+several levels of multiclass instantiations. This also avoids the need to use
+"let" expressions within subsequent records inside a multiclass.</p>
+
+<pre class="doc_code">
+<b>multiclass </b>basic_r<bits<4> opc> {
+ <b>let </b>Predicates = [HasSSE2] in {
+ <b>def </b>rr : Instruction<opc, "rr">;
+ <b>def </b>rm : Instruction<opc, "rm">;
+ }
+ <b>let </b>Predicates = [HasSSE3] in
+ <b>def </b>rx : Instruction<opc, "rx">;
+}
+
+<b>multiclass </b>basic_ss<bits<4> opc> {
+ <b>let </b>IsDouble = 0 in
+ <b>defm </b>SS : basic_r<opc>;
+
+ <b>let </b>IsDouble = 1 in
+ <b>defm </b>SD : basic_r<opc>;
+}
+
+<b>defm </b>ADD : basic_ss<0xf>;
+</pre>
+</div>
+
+</div>
+
+</div>
+
+<!-- *********************************************************************** -->
+<h2><a name="codegen">Code Generator backend info</a></h2>
+<!-- *********************************************************************** -->
+
+<div>
+
+<p>Expressions used by the code generator to describe instructions and isel
+patterns:</p>
+
+<dl>
+<dt><tt>(implicit a)</tt></dt>
+ <dd>an implicitly defined physical register. This tells the dag instruction
+ selection emitter that the input pattern's extra definitions match implicit
+ physical register definitions.</dd>
+</dl>
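+
+<p>As an illustrative sketch, modeled loosely on the X86 target files (the
+record name, opcode, and operand classes below are assumptions rather than
+a verbatim copy of any backend), a compare instruction might mark its
+flags result like this:</p>
+
+<div class="doc_code">
+<pre>
+<i>// The instruction clobbers EFLAGS; (implicit EFLAGS) in the pattern
+// tells the selector that the pattern's extra result is that register.</i>
+<b>let</b> Defs = [EFLAGS] <b>in</b>
+<b>def</b> CMP32rr : I<0x39, MRMDestReg, (outs), (ins GR32:$src1, GR32:$src2),
+                "cmp{l}\t{$src2, $src1|$src1, $src2}",
+                [(X86cmp GR32:$src1, GR32:$src2), (implicit EFLAGS)]>;
+</pre>
+</div>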
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="backends">TableGen backends</a></div>
+<h2><a name="backends">TableGen backends</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>TODO: How they work, how to write one. This section should not contain
details about any particular backend, except maybe -print-enums as an example.
<hr>
<address>
<a href="http://jigsaw.w3.org/css-validator/check/referer"><img
- src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
+ src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
<a href="http://validator.w3.org/check/referer"><img
- src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
+ src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
<a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
- <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
+ <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
Last modified: $Date$
</address>