<li><a href="#targetmachine">The <tt>TargetMachine</tt> class</a></li>
<li><a href="#targetdata">The <tt>TargetData</tt> class</a></li>
<li><a href="#targetlowering">The <tt>TargetLowering</tt> class</a></li>
- <li><a href="#mregisterinfo">The <tt>MRegisterInfo</tt> class</a></li>
+ <li><a href="#targetregisterinfo">The <tt>TargetRegisterInfo</tt> class</a></li>
<li><a href="#targetinstrinfo">The <tt>TargetInstrInfo</tt> class</a></li>
<li><a href="#targetframeinfo">The <tt>TargetFrameInfo</tt> class</a></li>
<li><a href="#targetsubtarget">The <tt>TargetSubtarget</tt> class</a></li>
</li>
<li><a href="#targetimpls">Target-specific Implementation Notes</a>
<ul>
+ <li><a href="#tailcallopt">Tail call optimization</a></li>
<li><a href="#x86">The X86 backend</a></li>
<li><a href="#ppc">The PowerPC backend</a>
<ul>
<!-- ======================================================================= -->
<div class="doc_subsection">
- <a name="mregisterinfo">The <tt>MRegisterInfo</tt> class</a>
+ <a name="targetregisterinfo">The <tt>TargetRegisterInfo</tt> class</a>
</div>
<div class="doc_text">
-<p>The <tt>MRegisterInfo</tt> class (which will eventually be renamed to
-<tt>TargetRegisterInfo</tt>) is used to describe the register file of the
-target and any interactions between the registers.</p>
+<p>The <tt>TargetRegisterInfo</tt> class is used to describe the register
+file of the target and any interactions between the registers.</p>
<p>Registers in the code generator are represented in the code generator by
unsigned integers. Physical registers (those that actually exist in the target
(used to indicate whether one register overlaps with another).
</p>
-<p>In addition to the per-register description, the <tt>MRegisterInfo</tt> class
-exposes a set of processor specific register classes (instances of the
+<p>In addition to the per-register description, the <tt>TargetRegisterInfo</tt>
+class exposes a set of processor specific register classes (instances of the
<tt>TargetRegisterClass</tt> class). Each register class contains sets of
registers that have the same properties (for example, they are all 32-bit
integer registers). Each SSA virtual register created by the instruction
<div class="doc_code">
<pre>
-int %test(int %X, int %Y) {
- %Z = div int %X, %Y
- ret int %Z
+define i32 @test(i32 %X, i32 %Y) {
+ %Z = udiv i32 %X, %Y
+ ret i32 %Z
}
</pre>
</div>
corresponds one-to-one with the LLVM function input to the instruction selector.
In addition to a list of basic blocks, the <tt>MachineFunction</tt> contains a
a <tt>MachineConstantPool</tt>, a <tt>MachineFrameInfo</tt>, a
-<tt>MachineFunctionInfo</tt>, a <tt>SSARegMap</tt>, and a set of live in and
-live out registers for the function. See
+<tt>MachineFunctionInfo</tt>, and a <tt>MachineRegisterInfo</tt>. See
<tt>include/llvm/CodeGen/MachineFunction.h</tt> for more information.</p>
</div>
edges are represented by instances of the <tt>SDOperand</tt> class, which is
a <tt><SDNode, unsigned></tt> pair, indicating the node and result
value being used, respectively. Each value produced by an <tt>SDNode</tt> has
-an associated <tt>MVT::ValueType</tt> indicating what type the value is.</p>
+an associated <tt>MVT</tt> (Machine Value Type) indicating what the type of the
+value is.</p>
<p>SelectionDAGs contain two different kinds of values: those that represent
data flow and those that represent control flow dependencies. Data values are
marked as <i>aliased</i> in LLVM. Given a particular architecture, you
can check which registers are aliased by inspecting its
<tt>RegisterInfo.td</tt> file. Moreover, the method
-<tt>MRegisterInfo::getAliasSet(p_reg)</tt> returns an array containing
+<tt>TargetRegisterInfo::getAliasSet(p_reg)</tt> returns an array containing
all the physical registers aliased to the register <tt>p_reg</tt>.</p>
<p>Physical registers, in LLVM, are grouped in <i>Register Classes</i>.
bool RegMapping_Fer::compatible_class(MachineFunction &mf,
unsigned v_reg,
unsigned p_reg) {
- assert(MRegisterInfo::isPhysicalRegister(p_reg) &&
+ assert(TargetRegisterInfo::isPhysicalRegister(p_reg) &&
"Target register must be physical");
- const TargetRegisterClass *trc = mf.getSSARegMap()->getRegClass(v_reg);
- return trc->contains(p_reg);
+ const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg);
+ return trc->contains(p_reg);
}
</pre>
</div>
number. The smallest virtual register is normally assigned the number
1024. This may change, so, in order to know which is the first virtual
register, you should access
-<tt>MRegisterInfo::FirstVirtualRegister</tt>. Any register whose
+<tt>TargetRegisterInfo::FirstVirtualRegister</tt>. Any register whose
number is greater than or equal to
-<tt>MRegisterInfo::FirstVirtualRegister</tt> is considered a virtual
+<tt>TargetRegisterInfo::FirstVirtualRegister</tt> is considered a virtual
register. Whereas physical registers are statically defined in a
<tt>TargetRegisterInfo.td</tt> file and cannot be created by the
application developer, that is not the case with virtual registers.
In order to create new virtual registers, use the method
-<tt>SSARegMap::createVirtualRegister()</tt>. This method will return a
+<tt>MachineRegisterInfo::createVirtualRegister()</tt>. This method will return a
virtual register with the highest code.
</p>
<p>There are two ways to map virtual registers to physical registers (or to
memory slots). The first way, that we will call <i>direct mapping</i>,
-is based on the use of methods of the classes <tt>MRegisterInfo</tt>,
+is based on the use of methods of the classes <tt>TargetRegisterInfo</tt>,
and <tt>MachineOperand</tt>. The second way, that we will call
<i>indirect mapping</i>, relies on the <tt>VirtRegMap</tt> class in
order to insert loads and stores sending and getting values to and from
memory. To assign a physical register to a virtual register present in
a given operand, use <tt>MachineOperand::setReg(p_reg)</tt>. To insert
a store instruction, use
-<tt>MRegisterInfo::storeRegToStackSlot(...)</tt>, and to insert a load
-instruction, use <tt>MRegisterInfo::loadRegFromStackSlot</tt>.</p>
+<tt>TargetRegisterInfo::storeRegToStackSlot(...)</tt>, and to insert a load
+instruction, use <tt>TargetRegisterInfo::loadRegFromStackSlot</tt>.</p>
<p>The indirect mapping shields the application developer from the
complexities of inserting load and store instructions. In order to map
<div class="doc_code">
<pre>
%a = MOVE %b
-%a = ADD %a %b
+%a = ADD %a %c
</pre>
</div>
<p>Notice that, internally, the second instruction is represented as
-<tt>ADD %a[def/use] %b</tt>. I.e., the register operand <tt>%a</tt> is
+<tt>ADD %a[def/use] %c</tt>. I.e., the register operand <tt>%a</tt> is
both used and defined by the instruction.</p>
</div>
</div>
<p>Instructions can be folded with the
-<tt>MRegisterInfo::foldMemoryOperand(...)</tt> method. Care must be
+<tt>TargetRegisterInfo::foldMemoryOperand(...)</tt> method. Care must be
taken when folding instructions; a folded instruction can be quite
different from the original instruction. See
<tt>LiveIntervals::addIntervalsForSpills</tt> in
</div>
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="tailcallopt">Tail call optimization</a>
+</div>
+
+<div class="doc_text">
+ <p>Tail call optimization, callee reusing the stack of the caller, is currently supported on x86/x86-64 and PowerPC. It is performed if:
+ <ul>
+ <li>Caller and callee have the calling convention <tt>fastcc</tt>.</li>
+ <li>The call is a tail call - in tail position (ret immediately follows call and ret uses value of call or is void).</li>
+ <li>Option <tt>-tailcallopt</tt> is enabled.</li>
+ <li>Platform specific constraints are met.</li>
+ </ul>
+ </p>
+ <p>x86/x86-64 constraints:
+ <ul>
+ <li>No variable argument lists are used.</li>
+ <li>On x86-64 when generating GOT/PIC code only module-local calls (visibility = hidden or protected) are supported.</li>
+ </ul>
+ </p>
+ <p>PowerPC constraints:
+ <ul>
+ <li>No variable argument lists are used.</li>
+ <li>No byval parameters are used.</li>
+ <li>On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected) are supported.</li>
+ </ul>
+ </p>
+ <p>Example:</p>
+ <p>Call as <tt>llc -tailcallopt test.ll</tt>.
+ <div class="doc_code">
+ <pre>
+declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4)
+
+define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
+ %l1 = add i32 %in1, %in2
+ %tmp = tail call fastcc i32 @tailcallee(i32 %in1 inreg, i32 %in2 inreg, i32 %in1, i32 %l1)
+ ret i32 %tmp
+}</pre>
+ </div>
+ </p>
+ <p>Implications of <tt>-tailcallopt</tt>:</p>
+ <p>To support tail call optimization in situations where the callee has more arguments than the caller a 'callee pops arguments' convention is used. This currently causes each <tt>fastcc</tt> call that is not tail call optimized (because one or more of above constraints are not met) to be followed by a readjustment of the stack. So performance might be worse in such cases.</p>
+ <p>On x86 and x86-64 one register is reserved for indirect tail calls (e.g via a function pointer). So there is one less register for integer argument passing. For x86 this means 2 registers (if <tt>inreg</tt> parameter attribute is used) and for x86-64 this means 5 register are used.</p>
+</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="x86">The X86 backend</a>