The LLVM Target-Independent Code Generator

Target-independent code generation algorithms @@ -857,9 +1030,9 @@ ret SelectionDAG optimizer is run to clean up redundancies exposed by type legalization. -

Legalize SelectionDAG Types — - This stage transforms SelectionDAG nodes to eliminate any types that are - unsupported on the target.

Legalize SelectionDAG Ops — + This stage transforms SelectionDAG nodes to eliminate any operations + that are unsupported on the target.

Optimize SelectionDAG — The SelectionDAG optimizer is run to eliminate inefficiencies introduced by @@ -1386,18 +1559,25 @@ bool RegMapping_Fer::compatible_class(MachineFunction &mf,

Virtual registers are also denoted by integer numbers. Contrary to physical - registers, different virtual registers never share the same number. The - smallest virtual register is normally assigned the number 1024. This may - change, so, in order to know which is the first virtual register, you should - access TargetRegisterInfo::FirstVirtualRegister. Any register whose - number is greater than or equal - to TargetRegisterInfo::FirstVirtualRegister is considered a virtual - register. Whereas physical registers are statically defined in - a TargetRegisterInfo.td file and cannot be created by the - application developer, that is not the case with virtual registers. In order - to create new virtual registers, use the + registers, different virtual registers never share the same number. Whereas + physical registers are statically defined in a TargetRegisterInfo.td + file and cannot be created by the application developer, that is not the case + with virtual registers. In order to create new virtual registers, use the method MachineRegisterInfo::createVirtualRegister(). This method - will return a virtual register with the highest code.

+ will return a new virtual register. Use an IndexedMap<Foo, + VirtReg2IndexFunctor> to hold information per virtual register. If you + need to enumerate all virtual registers, use the function + TargetRegisterInfo::index2VirtReg() to find the virtual register + numbers:

+ +

+  for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) {
+    unsigned VirtReg = TargetRegisterInfo::index2VirtReg(i);
+    stuff(VirtReg);
+  }
+

Before register allocation, the operands of an instruction are mostly virtual registers, although physical registers may also be used. In order to check if @@ -1457,8 +1637,8 @@ bool RegMapping_Fer::compatible_class(MachineFunction &mf, order to get and store values in memory. To assign a physical register to a virtual register present in a given operand, use MachineOperand::setReg(p_reg). To insert a store instruction, - use TargetRegisterInfo::storeRegToStackSlot(...), and to insert a - load instruction, use TargetRegisterInfo::loadRegFromStackSlot.

+ use TargetInstrInfo::storeRegToStackSlot(...), and to insert a + load instruction, use TargetInstrInfo::loadRegFromStackSlot.

The indirect mapping shields the application developer from the complexities of inserting load and store instructions. In order to map a virtual register @@ -1635,25 +1815,228 @@ $ llc -regalloc=pbqp file.bc -o pbqp.s; Late Machine Code Optimizations

- Generating Assembly Code + +

+ +

The code emission step of code generation is responsible for lowering from +the code generator abstractions (like MachineFunction, MachineInstr, etc) down +to the abstractions used by the MC layer (MCInst, +MCStreamer, etc). This is +done with a combination of several different classes: the (misnamed) +target-independent AsmPrinter class, target-specific subclasses of AsmPrinter +(such as SparcAsmPrinter), and the TargetLoweringObjectFile class.

+ +

Since the MC layer works at the level of abstraction of object files, it +doesn't have a notion of functions, global variables etc. Instead, it thinks +about labels, directives, and instructions. A key class used at this time is +the MCStreamer class. This is an abstract API that is implemented in different +ways (e.g. to output a .s file, output an ELF .o file, etc) that is effectively +an "assembler API". MCStreamer has one method per directive, such as EmitLabel, +EmitSymbolAttribute, SwitchSection, etc, which directly correspond to assembly +level directives. +

+ +

If you are interested in implementing a code generator for a target, there +are three important things that you have to implement for your target:

+ +

First, you need a subclass of AsmPrinter for your target. This class +implements the general lowering process converting MachineFunction's into MC +label constructs. The AsmPrinter base class provides a number of useful methods +and routines, and also allows you to override the lowering process in some +important ways. You should get much of the lowering for free if you are +implementing an ELF, COFF, or MachO target, because the TargetLoweringObjectFile +class implements much of the common logic.
Second, you need to implement an instruction printer for your target. The +instruction printer takes an MCInst and renders it to a +raw_ostream as text. Most of this is automatically generated from the .td file +(when you specify something like "add $dst, $src1, $src2" in the +instructions), but you need to implement routines to print operands.
Third, you need to implement code that lowers a MachineInstr to an MCInst, usually implemented in +"<target>MCInstLower.cpp". This lowering process is often target +specific, and is responsible for turning jump table entries, constant pool +indices, global variable addresses, etc into MCLabels as appropriate. This +translation layer is also responsible for expanding pseudo ops used by the code +generator into the actual machine instructions they correspond to. The MCInsts +that are generated by this are fed into the instruction printer or the encoder. +

+ +

Finally, at your choosing, you can also implement an subclass of +MCCodeEmitter which lowers MCInst's into machine code bytes and relocations. +This is important if you want to support direct .o file emission, or would like +to implement an assembler for your target.

+ + + +

+ Implementing a Native Assembler +

+ + +

+ +

Though you're probably reading this because you want to write or maintain a +compiler backend, LLVM also fully supports building a native assemblers too. +We've tried hard to automate the generation of the assembler from the .td files +(in particular the instruction syntax and encodings), which means that a large +part of the manual and repetitive data entry can be factored and shared with the +compiler.

+ +

+ + +

Instruction Parsing

To Be Written

+ + + +

+ Instruction Alias Processing +

+ +

Once the instruction is parsed, it enters the MatchInstructionImpl function. +The MatchInstructionImpl function performs alias processing and then does +actual matching.

+ +

Alias processing is the phase that canonicalizes different lexical forms of +the same instructions down to one representation. There are several different +kinds of alias that are possible to implement and they are listed below in the +order that they are processed (which is in order from simplest/weakest to most +complex/powerful). Generally you want to use the first alias mechanism that +meets the needs of your instruction, because it will allow a more concise +description.

+ +

+ -

- Generating Binary Machine Code +

Mnemonic Aliases

+ +

The first phase of alias processing is simple instruction mnemonic +remapping for classes of instructions which are allowed with two different +mnemonics. This phase is a simple and unconditionally remapping from one input +mnemonic to one output mnemonic. It isn't possible for this form of alias to +look at the operands at all, so the remapping must apply for all forms of a +given mnemonic. Mnemonic aliases are defined simply, for example X86 has: +

+ +

+def : MnemonicAlias<"cbw",     "cbtw">;
+def : MnemonicAlias<"smovq",   "movsq">;
+def : MnemonicAlias<"fldcww",  "fldcw">;
+def : MnemonicAlias<"fucompi", "fucomip">;
+def : MnemonicAlias<"ud2a",    "ud2">;
+

+ +

... and many others. With a MnemonicAlias definition, the mnemonic is +remapped simply and directly. Though MnemonicAlias's can't look at any aspect +of the instruction (such as the operands) they can depend on global modes (the +same ones supported by the matcher), through a Requires clause:

+ +

+def : MnemonicAlias<"pushf", "pushfq">, Requires<[In64BitMode]>;
+def : MnemonicAlias<"pushf", "pushfl">, Requires<[In32BitMode]>;
+

In this example, the mnemonic gets mapped into different a new one depending +on the current instruction set.

+ +

+ + +

Instruction Aliases

For the JIT or .o file writer

+ +

The most general phase of alias processing occurs while matching is +happening: it provides new forms for the matcher to match along with a specific +instruction to generate. An instruction alias has two parts: the string to +match and the instruction to generate. For example: +

+ +

+def : InstAlias<"movsx $src, $dst", (MOVSX16rr8W GR16:$dst, GR8  :$src)>;
+def : InstAlias<"movsx $src, $dst", (MOVSX16rm8W GR16:$dst, i8mem:$src)>;
+def : InstAlias<"movsx $src, $dst", (MOVSX32rr8  GR32:$dst, GR8  :$src)>;
+def : InstAlias<"movsx $src, $dst", (MOVSX32rr16 GR32:$dst, GR16 :$src)>;
+def : InstAlias<"movsx $src, $dst", (MOVSX64rr8  GR64:$dst, GR8  :$src)>;
+def : InstAlias<"movsx $src, $dst", (MOVSX64rr16 GR64:$dst, GR16 :$src)>;
+def : InstAlias<"movsx $src, $dst", (MOVSX64rr32 GR64:$dst, GR32 :$src)>;
+

This shows a powerful example of the instruction aliases, matching the +same mnemonic in multiple different ways depending on what operands are present +in the assembly. The result of instruction aliases can include operands in a +different order than the destination instruction, and can use an input +multiple times, for example:

+ +

+def : InstAlias<"clrb $reg", (XOR8rr  GR8 :$reg, GR8 :$reg)>;
+def : InstAlias<"clrw $reg", (XOR16rr GR16:$reg, GR16:$reg)>;
+def : InstAlias<"clrl $reg", (XOR32rr GR32:$reg, GR32:$reg)>;
+def : InstAlias<"clrq $reg", (XOR64rr GR64:$reg, GR64:$reg)>;
+

+ +

This example also shows that tied operands are only listed once. In the X86 +backend, XOR8rr has two input GR8's and one output GR8 (where an input is tied +to the output). InstAliases take a flattened operand list without duplicates +for tied operands. The result of an instruction alias can also use immediates +and fixed physical registers which are added as simple immediate operands in the +result, for example:

+ +

+// Fixed Immediate operand.
+def : InstAlias<"aad", (AAD8i8 10)>;
+
+// Fixed register operand.
+def : InstAlias<"fcomi", (COM_FIr ST1)>;
+
+// Simple alias.
+def : InstAlias<"fcomi $reg", (COM_FIr RST:$reg)>;
+

+ + +

Instruction aliases can also have a Requires clause to make them +subtarget specific.

+ +

+ + + + +

Instruction Matching

+ +

To Be Written

+ + +

@@ -1664,10 +2047,275 @@ $ llc -regalloc=pbqp file.bc -o pbqp.s;

This section of the document explains features or design decisions that are - specific to the code generator for a particular target.

+ specific to the code generator for a particular target. First we start + with a table that summarizes what features are supported by each target.

+ +

+ + +

+ Target Feature Matrix +

+ +

Note that this table does not include the C backend or Cpp backends, since +they do not use the target independent code generator infrastructure. It also +doesn't list features that are not supported fully by any target yet. It +considers a feature to be supported if at least one subtarget supports it. A +feature being supported means that it is useful and works for most cases, it +does not indicate that there are zero known bugs in the implementation. Here +is the key:

+ + + + + + + + + + + + + + + +

Unknown	No support	Partial Support	Complete Support

+ +

Here is the table:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Feature	ARM	X86
	Target
is generally reliable
assembly parser
disassembler
inline asm		*
jit	*
.o file writing
tail calls

+ +

+ + +

Is Generally Reliable

+ +

This box indicates whether the target is considered to be production quality. +This indicates that the target has been used as a static compiler to +compile large amounts of code by a variety of different people and is in +continuous use.

+ + +

Assembly Parser

+ +

This box indicates whether the target supports parsing target specific .s +files by implementing the MCAsmParser interface. This is required for llvm-mc +to be able to act as a native assembler and is required for inline assembly +support in the native .o file writer.

+ + +

Disassembler

+ +

This box indicates whether the target supports the MCDisassembler API for +disassembling machine opcode bytes into MCInst's.

+ +

+ + +

Inline Asm

+ +

This box indicates whether the target supports most popular inline assembly +constraints and modifiers.

+ +

X86 lacks reliable support for inline assembly +constraints relating to the X86 floating point stack.

+ +

+ + +

JIT Support

+ +

This box indicates whether the target supports the JIT compiler through +the ExecutionEngine interface.

+ +

The ARM backend has basic support for integer code +in ARM codegen mode, but lacks NEON and full Thumb support.

+ +

+ + +

.o File Writing

+ +

This box indicates whether the target supports writing .o files (e.g. MachO, +ELF, and/or COFF) files directly from the target. Note that the target also +must include an assembly parser and general inline assembly support for full +inline assembly support in the .o writer.

+ +

Targets that don't support this feature can obviously still write out .o +files, they just rely on having an external assembler to translate from a .s +file to a .o file (as is the case for many C compilers).

+ +

+ + +

Tail Calls

+ +

This box indicates whether the target supports guaranteed tail calls. These +are calls marked "tail" and use the fastcc +calling convention. Please see the tail call section +more more details.

+ +

+ + + +

Tail call optimization @@ -1884,7 +2532,7 @@ OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg, SignExtImm PhysReg

x86 has an experimental feature which provides +

x86 has a feature which provides the ability to perform loads and stores to different address spaces via the x86 segment registers. A segment override prefix byte on an instruction causes the instruction's memory access to go to the specified