1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
5 <title>Writing an LLVM Compiler Backend</title>
6 <link rel="stylesheet" href="llvm.css" type="text/css">
11 <div class="doc_title">
12 Writing an LLVM Compiler Backend
16 <li><a href="#intro">Introduction</a>
18 <li><a href="#Audience">Audience</a></li>
19 <li><a href="#Prerequisite">Prerequisite Reading</a></li>
20 <li><a href="#Basic">Basic Steps</a></li>
21 <li><a href="#Preliminaries">Preliminaries</a></li>
23 <li><a href="#TargetMachine">Target Machine</a></li>
24 <li><a href="#RegisterSet">Register Set and Register Classes</a>
26 <li><a href="#RegisterDef">Defining a Register</a></li>
27 <li><a href="#RegisterClassDef">Defining a Register Class</a></li>
28 <li><a href="#implementRegister">Implement a subclass of TargetRegisterInfo</a></li>
30 <li><a href="#InstructionSet">Instruction Set</a>
32 <li><a href="#implementInstr">Implement a subclass of TargetInstrInfo</a></li>
33 <li><a href="#branchFolding">Branch Folding and If Conversion</a></li>
35 <li><a href="#InstructionSelector">Instruction Selector</a>
37 <li><a href="#LegalizePhase">The SelectionDAG Legalize Phase</a>
39 <li><a href="#promote">Promote</a></li>
40 <li><a href="#expand">Expand</a></li>
41 <li><a href="#custom">Custom</a></li>
42 <li><a href="#legal">Legal</a></li>
44 <li><a href="#callingConventions">Calling Conventions</a></li>
46 <li><a href="#assemblyPrinter">Assembly Printer</a></li>
47 <li><a href="#subtargetSupport">Subtarget Support</a></li>
48 <li><a href="#jitSupport">JIT Support</a>
50 <li><a href="#mce">Machine Code Emitter</a></li>
51 <li><a href="#targetJITInfo">Target JIT Info</a></li>
55 <div class="doc_author">
56 <p>Written by <a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a></p>
59 <!-- *********************************************************************** -->
60 <div class="doc_section">
61 <a name="intro">Introduction</a>
63 <!-- *********************************************************************** -->
65 <div class="doc_text">
66 <p>This document describes techniques for writing compiler backends
67 that convert the LLVM IR (intermediate representation) to code for a specified
68 machine or other languages. Code intended for a specific machine can take the
69 form of either assembly code or binary code (usable for a JIT compiler). </p>
71 <p>The backend of LLVM features a target-independent code generator
72 that may create output for several types of target CPUs, including X86,
73 PowerPC, Alpha, and SPARC. The backend may also be used to generate code
74 targeted at SPUs of the Cell processor or GPUs to support the execution of
77 <p>The document focuses on existing examples found in subdirectories
78 of <tt>llvm/lib/Target</tt> in a downloaded LLVM release. In particular, this document
79 focuses on the example of creating a static compiler (one that emits text
80 assembly) for a SPARC target, because SPARC has fairly standard
81 characteristics, such as a RISC instruction set and straightforward calling
85 <div class="doc_subsection">
86 <a name="Audience">Audience</a>
89 <div class="doc_text">
90 <p>The audience for this document is anyone who needs to write an
91 LLVM backend to generate code for a specific hardware or software target.</p>
94 <div class="doc_subsection">
95 <a name="Prerequisite">Prerequisite Reading</a>
98 <div class="doc_text">
99 These essential documents must be read before reading this document:
102 <i><a href="http://www.llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></i> -
103 a reference manual for the LLVM assembly language
106 <i><a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM Target-Independent Code Generator </a></i> -
107 a guide to the components (classes and code generation algorithms) for translating
108 the LLVM internal representation to the machine code for a specified target.
109 Pay particular attention to the descriptions of code generation stages:
110 Instruction Selection, Scheduling and Formation, SSA-based Optimization,
111 Register Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations,
115 <i><a href="http://www.llvm.org/docs/TableGenFundamentals.html">TableGen Fundamentals</a></i> -
116 a document that describes the TableGen (tblgen) application that manages domain-specific
117 information to support LLVM code generation. TableGen processes input from a
118 target description file (.td suffix) and generates C++ code that can be used
122 <i><a href="http://www.llvm.org/docs/WritingAnLLVMPass.html">Writing an LLVM Pass</a></i> -
123 The assembly printer is a FunctionPass, as are several SelectionDAG processing steps.
126 To follow the SPARC examples in this document, have a copy of
127 <i><a href="http://www.sparc.org/standards/V8.pdf">The SPARC Architecture Manual, Version 8</a></i>
for reference. For details about the ARM instruction set, refer to the
<i><a href="http://infocenter.arm.com/">ARM Architecture Reference Manual</a></i>.
For more about the GNU Assembler format (GAS), see
<i><a href="http://sourceware.org/binutils/docs/as/index.html">Using As</a></i>,
especially for the assembly printer. <i>Using As</i> contains lists of target machine dependent features.
135 <div class="doc_subsection">
136 <a name="Basic">Basic Steps</a>
138 <div class="doc_text">
139 <p>To write a compiler
140 backend for LLVM that converts the LLVM IR (intermediate representation)
141 to code for a specified target (machine or other language), follow these steps:</p>
145 Create a subclass of the TargetMachine class that describes
146 characteristics of your target machine. Copy existing examples of specific
147 TargetMachine class and header files; for example, start with <tt>SparcTargetMachine.cpp</tt>
148 and <tt>SparcTargetMachine.h</tt>, but change the file names for your target. Similarly,
149 change code that references "Sparc" to reference your target. </li>
151 <li>Describe the register set of the target. Use TableGen to generate
152 code for register definition, register aliases, and register classes from a
target-specific <tt>RegisterInfo.td</tt> input file. You should also write additional
code for a subclass of the TargetRegisterInfo class that represents the
register file data used for register allocation and also describes the
interactions between registers.</li>
158 <li>Describe the instruction set of the target. Use TableGen to
159 generate code for target-specific instructions from target-specific versions of
160 <tt>TargetInstrFormats.td</tt> and <tt>TargetInstrInfo.td</tt>. You should write additional code
161 for a subclass of the TargetInstrInfo
162 class to represent machine
163 instructions supported by the target machine. </li>
165 <li>Describe the selection and conversion of the LLVM IR from a DAG (directed
166 acyclic graph) representation of instructions to native target-specific
167 instructions. Use TableGen to generate code that matches patterns and selects
168 instructions based on additional information in a target-specific version of
169 <tt>TargetInstrInfo.td</tt>. Write code for <tt>XXXISelDAGToDAG.cpp</tt>
170 (where XXX identifies the specific target) to perform pattern
171 matching and DAG-to-DAG instruction selection. Also write code in <tt>XXXISelLowering.cpp</tt>
172 to replace or remove operations and data types that are not supported natively
173 in a SelectionDAG. </li>
175 <li>Write code for an
176 assembly printer that converts LLVM IR to a GAS format for your target machine.
177 You should add assembly strings to the instructions defined in your
178 target-specific version of <tt>TargetInstrInfo.td</tt>. You should also write code for a
179 subclass of AsmPrinter that performs the LLVM-to-assembly conversion and a
180 trivial subclass of TargetAsmInfo.</li>
182 <li>Optionally, add support for subtargets (that is, variants with
183 different capabilities). You should also write code for a subclass of the
184 TargetSubtarget class, which allows you to use the <tt>-mcpu=</tt>
185 and <tt>-mattr=</tt> command-line options.</li>
187 <li>Optionally, add JIT support and create a machine code emitter (subclass
188 of TargetJITInfo) that is used to emit binary code directly into memory. </li>
<p>In the .cpp and .h files, initially stub out these methods and
then implement them later. At first, you may not know which private members
the class will need and which components will need to be subclassed.</p>
196 <div class="doc_subsection">
197 <a name="Preliminaries">Preliminaries</a>
199 <div class="doc_text">
200 <p>To actually create
201 your compiler backend, you need to create and modify a few files. The absolute
202 minimum is discussed here, but to actually use the LLVM target-independent code
203 generator, you must perform the steps described in the <a
204 href="http://www.llvm.org/docs/CodeGenerator.html">LLVM
205 Target-Independent Code Generator</a> document.</p>
208 create a subdirectory under <tt>lib/Target</tt> to hold all the files related to your
209 target. If your target is called "Dummy", create the directory
210 <tt>lib/Target/Dummy</tt>.</p>
213 directory, create a <tt>Makefile</tt>. It is easiest to copy a <tt>Makefile</tt> of another
214 target and modify it. It should at least contain the <tt>LEVEL</tt>, <tt>LIBRARYNAME</tt> and
215 <tt>TARGET</tt> variables, and then include <tt>$(LEVEL)/Makefile.common</tt>. The library can be
216 named LLVMDummy (for example, see the MIPS target). Alternatively, you can
217 split the library into LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of
218 which should be implemented in a subdirectory below <tt>lib/Target/Dummy</tt> (for
219 example, see the PowerPC target).</p>
221 <p>Note that these two
222 naming schemes are hardcoded into <tt>llvm-config</tt>. Using any other naming scheme
223 will confuse <tt>llvm-config</tt> and produce lots of (seemingly unrelated) linker
224 errors when linking <tt>llc</tt>.</p>
226 <p>To make your target
227 actually do something, you need to implement a subclass of TargetMachine. This
implementation should typically be in the file
<tt>lib/Target/Dummy/DummyTargetMachine.cpp</tt>, but any file in the <tt>lib/Target/Dummy</tt> directory will
be built and should work. To use LLVM's target-independent
code generator, you should do what all current machine backends do: create a subclass
232 of LLVMTargetMachine. (To create a target from scratch, create a subclass of
236 actually build and link your target, you need to add it to the <tt>TARGETS_TO_BUILD</tt>
237 variable. To do this, you modify the configure script to know about your target
238 when parsing the <tt>--enable-targets</tt> option. Search the configure script for <tt>TARGETS_TO_BUILD</tt>,
239 add your target to the lists there (some creativity required) and then
reconfigure. Alternatively, you can change <tt>autoconf/configure.ac</tt> and
regenerate configure by running <tt>./autoconf/AutoRegen.sh</tt>.</p>
244 <!-- *********************************************************************** -->
245 <div class="doc_section">
246 <a name="TargetMachine">Target Machine</a>
248 <!-- *********************************************************************** -->
249 <div class="doc_text">
250 <p>LLVMTargetMachine is designed as a base class for targets
251 implemented with the LLVM target-independent code generator. The
252 LLVMTargetMachine class should be specialized by a concrete target class that
253 implements the various virtual methods. LLVMTargetMachine is defined as a
254 subclass of TargetMachine in <tt>include/llvm/Target/TargetMachine.h</tt>. The
255 TargetMachine class implementation (<tt>TargetMachine.cpp</tt>) also processes numerous
256 command-line options. </p>
258 <p>To create a concrete target-specific subclass of
259 LLVMTargetMachine, start by copying an existing TargetMachine class and header.
260 You should name the files that you create to reflect your specific target. For
261 instance, for the SPARC target, name the files <tt>SparcTargetMachine.h</tt> and
<tt>SparcTargetMachine.cpp</tt>.</p>
264 <p>For a target machine XXX, the implementation of XXXTargetMachine
265 must have access methods to obtain objects that represent target components.
266 These methods are named <tt>get*Info</tt> and are intended to obtain the instruction set
267 (<tt>getInstrInfo</tt>), register set (<tt>getRegisterInfo</tt>), stack frame layout
268 (<tt>getFrameInfo</tt>), and similar information. XXXTargetMachine must also implement
269 the <tt>getTargetData</tt> method to access an object with target-specific data
270 characteristics, such as data type size and alignment requirements. </p>
272 <p>For instance, for the SPARC target, the header file <tt>SparcTargetMachine.h</tt>
273 declares prototypes for several <tt>get*Info</tt> and <tt>getTargetData</tt> methods that simply
274 return a class member. </p>
<div class="doc_code">
<pre>namespace llvm {

class SparcTargetMachine : public LLVMTargetMachine {
  const TargetData DataLayout;       // Calculates type size & alignment
  SparcSubtarget Subtarget;
  SparcInstrInfo InstrInfo;
  TargetFrameInfo FrameInfo;

protected:
  virtual const TargetAsmInfo *createTargetAsmInfo() const;

public:
  SparcTargetMachine(const Module &M, const std::string &FS);

  virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
  virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
  virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
  virtual const TargetRegisterInfo *getRegisterInfo() const {
    return &InstrInfo.getRegisterInfo();
  }
  virtual const TargetData *getTargetData() const { return &DataLayout; }
  static unsigned getModuleMatchQuality(const Module &M);

  // Pass Pipeline Configuration
  virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
  virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  virtual bool addAssemblyEmitter(PassManagerBase &PM, bool Fast,
                                  std::ostream &Out);
};

} // end namespace llvm
</pre>
</div>
315 <div class="doc_text">
<li><tt>getInstrInfo</tt></li>
318 <li><tt>getRegisterInfo</tt></li>
319 <li><tt>getFrameInfo</tt></li>
320 <li><tt>getTargetData</tt></li>
321 <li><tt>getSubtargetImpl</tt></li>
323 <p>For some targets, you also need to support the following methods:
<li><tt>getTargetLowering</tt></li>
328 <li><tt>getJITInfo</tt></li>
330 <p>In addition, the XXXTargetMachine constructor should specify a
331 TargetDescription string that determines the data layout for the target machine,
332 including characteristics such as pointer size, alignment, and endianness. For
333 example, the constructor for SparcTargetMachine contains the following: </p>
<div class="doc_code">
<pre>
SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
  : DataLayout("E-p:32:32-f128:128:128"),
    Subtarget(M, FS), InstrInfo(Subtarget),
    FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
}
</pre>
</div>
346 <div class="doc_text">
347 <p>Hyphens separate portions of the TargetDescription string. </p>
349 <li>The "E" in the string indicates a big-endian target data model; a
350 lower-case "e" would indicate little-endian. </li>
351 <li>"p:" is followed by pointer information: size, ABI alignment, and
352 preferred alignment. If only two figures follow "p:", then the first value is
353 pointer size, and the second value is both ABI and preferred alignment.</li>
<li>Then comes a letter for numeric type alignment: "i", "f", "v", or "a"
(corresponding to integer, floating point, vector, or aggregate). "i", "v", or
"a" are followed by ABI alignment and preferred alignment. "f" is followed by
three values: the first indicates the size of a long double, then come ABI alignment
and preferred alignment.</li>
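<p>To make the string format concrete, here is a minimal, self-contained C++
sketch that splits a description such as "E-p:32:32-f128:128:128" on hyphens and
reads only the endianness and pointer portions. It does not use LLVM's TargetData
class. The names <tt>LayoutSketch</tt>, <tt>splitSketch</tt>, and
<tt>parseLayoutSketch</tt> are hypothetical and exist only for this illustration.</p>

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical illustration type; not part of LLVM.
struct LayoutSketch {
  bool BigEndian = false;
  unsigned PointerSize = 0;      // first figure after "p:"
  unsigned PointerABIAlign = 0;  // second figure after "p:"
  unsigned PointerPrefAlign = 0; // third figure, or the second if absent
};

// Split Text at every occurrence of Sep.
inline std::vector<std::string> splitSketch(const std::string &Text, char Sep) {
  std::vector<std::string> Parts;
  std::istringstream In(Text);
  std::string Part;
  while (std::getline(In, Part, Sep))
    Parts.push_back(Part);
  return Parts;
}

// Convert a decimal field to an unsigned value.
inline unsigned toUnsignedSketch(const std::string &Field) {
  return static_cast<unsigned>(std::strtoul(Field.c_str(), nullptr, 10));
}

// Read only the "E"/"e" and "p:" portions; every other portion is ignored.
inline LayoutSketch parseLayoutSketch(const std::string &Description) {
  LayoutSketch L;
  for (const std::string &Portion : splitSketch(Description, '-')) {
    if (Portion == "E") {
      L.BigEndian = true;
    } else if (Portion == "e") {
      L.BigEndian = false;
    } else if (Portion.size() > 2 && Portion[0] == 'p' && Portion[1] == ':') {
      std::vector<std::string> F = splitSketch(Portion.substr(2), ':');
      assert(F.size() >= 2 && "expected size and ABI alignment after p:");
      L.PointerSize = toUnsignedSketch(F[0]);
      L.PointerABIAlign = toUnsignedSketch(F[1]);
      // With only two figures, the second is both ABI and preferred alignment.
      L.PointerPrefAlign =
          F.size() >= 3 ? toUnsignedSketch(F[2]) : L.PointerABIAlign;
    }
  }
  return L;
}
```

<p>Calling <tt>parseLayoutSketch("E-p:32:32-f128:128:128")</tt> reports a big-endian
layout whose pointer size, ABI alignment, and preferred alignment are all 32.</p>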
360 <p>You must also register your target using the RegisterTarget
361 template. (See the TargetMachineRegistry class.) For example, in <tt>SparcTargetMachine.cpp</tt>,
362 the target is registered with:</p>
<div class="doc_code">
<pre>
// Register the target.
RegisterTarget<SparcTargetMachine> X("sparc", "SPARC");
</pre>
</div>
374 <!-- *********************************************************************** -->
375 <div class="doc_section">
376 <a name="RegisterSet">Register Set and Register Classes</a>
378 <!-- *********************************************************************** -->
379 <div class="doc_text">
380 <p>You should describe
381 a concrete target-specific class
382 that represents the register file of a target machine. This class is
called XXXRegisterInfo (where XXX identifies the target) and represents the
register file data that is used for register allocation. It also
describes the interactions between registers. </p>
388 define register classes to categorize related registers. A register class
389 should be added for groups of registers that are all treated the same way for
390 some instruction. Typical examples are register classes that include integer,
floating-point, or vector registers. The register allocator may let an
instruction use any register in a specified register class, because every
member of the class serves that instruction equally well. Instructions are
given virtual registers drawn from these sets, and register classes let the target-independent
register allocator automatically choose the actual registers.</p>
397 <p>Much of the code for registers, including register definition,
398 register aliases, and register classes, is generated by TableGen from
399 <tt>XXXRegisterInfo.td</tt> input files and placed in <tt>XXXGenRegisterInfo.h.inc</tt> and
400 <tt>XXXGenRegisterInfo.inc</tt> output files. Some of the code in the implementation of
401 XXXRegisterInfo requires hand-coding. </p>
404 <!-- ======================================================================= -->
405 <div class="doc_subsection">
406 <a name="RegisterDef">Defining a Register</a>
408 <div class="doc_text">
409 <p>The <tt>XXXRegisterInfo.td</tt> file typically starts with register definitions
410 for a target machine. The Register class (specified in <tt>Target.td</tt>) is used to
411 define an object for each register. The specified string n becomes the Name of
412 the register. The basic Register object does not have any subregisters and does
413 not specify any aliases.</p>
<div class="doc_code">
<pre>
class Register<string n> {
  string Namespace = "";
  int SpillAlignment = 0;
  list<Register> Aliases = [];
  list<Register> SubRegs = [];
  list<int> DwarfNumbers = [];
}
</pre>
</div>
430 <div class="doc_text">
431 <p>For example, in the <tt>X86RegisterInfo.td</tt> file, there are register
432 definitions that utilize the Register class, such as:</p>
<div class="doc_code">
<pre>
def AL : Register<"AL">,
         DwarfRegNum<[0, 0, 0]>;
</pre>
</div>
441 <div class="doc_text">
442 <p>This defines the register AL and assigns it values (with
443 DwarfRegNum) that are used by <tt>gcc</tt>, <tt>gdb</tt>, or a debug information writer (such as
444 DwarfWriter in <tt>llvm/lib/CodeGen</tt>) to identify a register. For register AL,
445 DwarfRegNum takes an array of 3 values, representing 3 different modes: the
446 first element is for X86-64, the second for EH (exception handling) on X86-32,
447 and the third is generic. -1 is a special Dwarf number that indicates the gcc
448 number is undefined, and -2 indicates the register number is invalid for this
451 <p>From the previously described line in the <tt>X86RegisterInfo.td</tt>
452 file, TableGen generates this code in the <tt>X86GenRegisterInfo.inc</tt> file:</p>
<div class="doc_code">
<pre>
static const unsigned GR8[] = { X86::AL, ... };

const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

const TargetRegisterDesc RegisterDescriptors[] = {
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
</pre>
</div>
466 <div class="doc_text">
467 <p>From the register info file, TableGen generates a
468 TargetRegisterDesc object for each register. TargetRegisterDesc is defined in
469 <tt>include/llvm/Target/TargetRegisterInfo.h</tt> with the following fields:</p>
<div class="doc_code">
<pre>
struct TargetRegisterDesc {
  const char     *AsmName;      // Assembly language name for the register
  const char     *Name;         // Printable name for the reg (for debugging)
  const unsigned *AliasSet;     // Register Alias Set
  const unsigned *SubRegs;      // Sub-register set
  const unsigned *ImmSubRegs;   // Immediate sub-register set
  const unsigned *SuperRegs;    // Super-register set
};
</pre>
</div>
484 <div class="doc_text">
485 <p>TableGen uses the entire target description file (<tt>.td</tt>) to
486 determine text names for the register (in the AsmName and Name fields of
487 TargetRegisterDesc) and the relationships of other registers to the defined
488 register (in the other TargetRegisterDesc fields). In this example, other
489 definitions establish the registers "AX", "EAX", and "RAX" as aliases for one
490 another, so TableGen generates a null-terminated array (AL_AliasSet) for this
491 register alias set. </p>
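<p>The 0 terminator matters because code that consumes an alias set walks the
array until it reaches that entry. The following self-contained C++ sketch shows
such a loop. The register numbers and the names <tt>ALAliasSetSketch</tt>,
<tt>countAliasesSketch</tt>, and <tt>isAliasSketch</tt> are hypothetical stand-ins,
not the values or functions that TableGen emits, and the sketch assumes that
register number 0 never names a real register.</p>

```cpp
#include <cassert>

// Hypothetical register numbers. 0 is assumed to mean "no register",
// which is what lets it terminate an alias list.
enum { NoRegSketch = 0, ALSketch, AXSketch, EAXSketch, RAXSketch };

// Alias set for AL, terminated by 0 in the style of the generated AL_AliasSet.
static const unsigned ALAliasSetSketch[] = {AXSketch, EAXSketch, RAXSketch, 0};

// Count the entries by walking until the 0 terminator.
inline unsigned countAliasesSketch(const unsigned *AliasSet) {
  assert(AliasSet && "alias set pointer must not be null");
  unsigned Count = 0;
  for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
    ++Count;
  return Count;
}

// Return true if Reg appears in the given alias set.
inline bool isAliasSketch(const unsigned *AliasSet, unsigned Reg) {
  assert(AliasSet && "alias set pointer must not be null");
  for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
    if (*Alias == Reg)
      return true;
  return false;
}
```

<p>Because the length is implied by the terminator, a table of registers can store
one pointer per alias set without a separate size field.</p>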
493 <p>The Register class is commonly used as a base class for more
494 complex classes. In <tt>Target.td</tt>, the Register class is the base for the
495 RegisterWithSubRegs class that is used to define registers that need to specify
496 subregisters in the SubRegs list, as shown here:</p>
<div class="doc_code">
<pre>
class RegisterWithSubRegs<string n,
                          list<Register> subregs> : Register<n> {
  let SubRegs = subregs;
}
</pre>
</div>
506 <div class="doc_text">
507 <p>In <tt>SparcRegisterInfo.td</tt>, additional register classes are defined
508 for SPARC: a Register subclass, SparcReg, and further subclasses: Ri, Rf, and
509 Rd. SPARC registers are identified by 5-bit ID numbers, which is a feature
510 common to these subclasses. Note the use of ‘let’ expressions to override values
511 that are initially defined in a superclass (such as SubRegs field in the Rd
<div class="doc_code">
<pre>
class SparcReg<string n> : Register<n> {
  field bits<5> Num;
  let Namespace = "SP";
}
// Ri - 32-bit integer registers
class Ri<bits<5> num, string n> :
SparcReg<n> {
  let Num = num;
}
// Rf - 32-bit floating-point registers
class Rf<bits<5> num, string n> :
SparcReg<n> {
  let Num = num;
}
// Rd - Slots in the FP register file for 64-bit
// floating-point values.
class Rd<bits<5> num, string n,
         list<Register> subregs> : SparcReg<n> {
  let Num = num;
  let SubRegs = subregs;
}
</pre>
</div>
538 <div class="doc_text">
539 <p>In the <tt>SparcRegisterInfo.td</tt> file, there are register definitions
540 that utilize these subclasses of Register, such as:</p>
<div class="doc_code">
<pre>
def G0 : Ri< 0, "G0">,
         DwarfRegNum<[0]>;
def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;

def F0 : Rf< 0, "F0">,
         DwarfRegNum<[32]>;
def F1 : Rf< 1, "F1">,
         DwarfRegNum<[33]>;

def D0 : Rd< 0, "F0", [F0, F1]>,
         DwarfRegNum<[32]>;
def D1 : Rd< 2, "F2", [F2, F3]>,
         DwarfRegNum<[34]>;
</pre>
</div>
559 <div class="doc_text">
560 <p>The last two registers shown above (D0 and D1) are double-precision
561 floating-point registers that are aliases for pairs of single-precision
562 floating-point sub-registers. In addition to aliases, the sub-register and
563 super-register relationships of the defined register are in fields of a
564 register’s TargetRegisterDesc.</p>
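<p>The pairing of double-precision and single-precision registers can be modeled in a
few lines of C++. This sketch assumes that the pattern in the two definitions
shown (D0 covers F0 and F1, D1 covers F2 and F3) continues for every
double-precision register. The function names are hypothetical and are not part
of the generated code.</p>

```cpp
#include <cassert>
#include <utility>

// For double-precision register D<K>, return the numbers of the two
// single-precision registers it covers, following D0 = [F0, F1] and
// D1 = [F2, F3]. Extending the pattern to all K is an assumption.
inline std::pair<unsigned, unsigned> subRegsOfDoubleSketch(unsigned K) {
  assert(K < 16 && "sketch assumes D0 through D15");
  return std::make_pair(2 * K, 2 * K + 1);
}

// D<K> and F<F> overlap when F<F> is one of the two sub-registers of D<K>.
inline bool doubleOverlapsSingleSketch(unsigned K, unsigned F) {
  std::pair<unsigned, unsigned> Subs = subRegsOfDoubleSketch(K);
  return F == Subs.first || F == Subs.second;
}
```

<p>A register allocator needs this kind of overlap information so that it does not
hand out D0 while F1 still holds a live value.</p>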
567 <!-- ======================================================================= -->
568 <div class="doc_subsection">
569 <a name="RegisterClassDef">Defining a Register Class</a>
571 <div class="doc_text">
572 <p>The RegisterClass class (specified in <tt>Target.td</tt>) is used to
573 define an object that represents a group of related registers and also defines
574 the default allocation order of the registers. A target description file
575 <tt>XXXRegisterInfo.td</tt> that uses <tt>Target.td</tt> can construct register classes using the
<div class="doc_code">
<pre>
class RegisterClass<string namespace,
                    list<ValueType> regTypes, int alignment,
                    list<Register> regList> {
  string Namespace = namespace;
  list<ValueType> RegTypes = regTypes;
  int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
  int Alignment = alignment;

  // CopyCost is the cost of copying a value between two registers
  // default value 1 means a single instruction
  // A negative value means copying is extremely expensive or impossible
  int CopyCost = 1;
  list<Register> MemberList = regList;

  // for register classes that are subregisters of this class
  list<RegisterClass> SubRegClassList = [];

  code MethodProtos = [{}];  // to insert arbitrary code
  code MethodBodies = [{}];
}
</pre>
</div>
602 <div class="doc_text">
<p>To define a RegisterClass, use the following four arguments:</p>
605 <li>The first argument of the definition is the name of the
608 <li>The second argument is a list of ValueType register type values
609 that are defined in <tt>include/llvm/CodeGen/ValueTypes.td</tt>. Defined values include
610 integer types (such as i16, i32, and i1 for Boolean), floating-point types
611 (f32, f64), and vector types (for example, v8i16 for an 8 x i16 vector). All
612 registers in a RegisterClass must have the same ValueType, but some registers
may store vector data in different configurations. For example, a register that
can process a 128-bit vector may be able to handle sixteen 8-bit integer elements, eight
16-bit integers, four 32-bit integers, and so on. </li>
617 <li>The third argument of the RegisterClass definition specifies the
618 alignment required of the registers when they are stored or loaded to memory.</li>
620 <li>The final argument, <tt>regList</tt>, specifies which registers are in
621 this class. If an <tt>allocation_order_*</tt> method is not specified, then <tt>regList</tt> also
622 defines the order of allocation used by the register allocator.</li>
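<p>The relationship between <tt>regList</tt> order and an <tt>allocation_order_*</tt>
override can be sketched in plain C++. The class below is a hypothetical model,
not the TargetRegisterClass interface: it assumes that registers which must never
be allocated are listed last, so the override only needs to stop early. The real
methods also take a <tt>const MachineFunction &</tt> argument, which this sketch omits.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical register class model. Members are stored in allocation order
// and the registers that must not be allocated are placed at the end.
class RegClassSketch {
  const unsigned *Begin;
  const unsigned *End;
  std::size_t NumReservedAtEnd;

public:
  typedef const unsigned *iterator;

  RegClassSketch(const unsigned *B, const unsigned *E, std::size_t Reserved)
      : Begin(B), End(E), NumReservedAtEnd(Reserved) {
    assert(static_cast<std::size_t>(E - B) >= Reserved &&
           "cannot reserve more registers than the class contains");
  }

  iterator begin() const { return Begin; }
  iterator end() const { return End; }

  // Default behavior: allocation starts at the first member of the list.
  iterator allocation_order_begin() const { return begin(); }

  // Override: stop before the trailing reserved registers.
  iterator allocation_order_end() const { return end() - NumReservedAtEnd; }
};

// Hypothetical register numbers; the last three stand for reserved registers.
static const unsigned RegListSketch[] = {10, 11, 12, 13, 14, 90, 91, 92};
```

<p>With eight members and three reserved at the end, the allocator in this model
iterates over the first five registers only, in the order they were listed.</p>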
625 <p>In <tt>SparcRegisterInfo.td</tt>, three RegisterClass objects are defined:
626 FPRegs, DFPRegs, and IntRegs. For all three register classes, the first
627 argument defines the namespace with the string “SP”. FPRegs defines a group of 32
628 single-precision floating-point registers (F0 to F31); DFPRegs defines a group
of 16 double-precision registers (D0-D15). For IntRegs, the MethodProtos and
MethodBodies fields are used by TableGen to insert the specified code into generated
<div class="doc_code">
<pre>
def FPRegs : RegisterClass<"SP", [f32], 32, [F0, F1, F2, F3, F4, F5, F6, F7,
  F8, F9, F10, F11, F12, F13, F14, F15, F16, F17, F18, F19, F20, F21, F22,
  F23, F24, F25, F26, F27, F28, F29, F30, F31]>;

def DFPRegs : RegisterClass<"SP", [f64], 64, [D0, D1, D2, D3, D4, D5, D6, D7,
  D8, D9, D10, D11, D12, D13, D14, D15]>;

def IntRegs : RegisterClass<"SP", [i32], 32, [L0, L1, L2, L3, L4, L5, L6, L7,
                                     I0, I1, I2, I3, I4, I5,
                                     O0, O1, O2, O3, O4, O5, O7,
                                     G1,
                                     // Non-allocatable regs:
                                     G2, G3, G4,
                                     O6, I6,
                                     I7, // return address
                                     G0,
                                     G5, G6, G7 // reserved for kernel
                                     ]> {
  let MethodProtos = [{
    iterator allocation_order_end(const MachineFunction &MF) const;
  }];
  let MethodBodies = [{
    IntRegsClass::iterator
    IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
      return end()-10;  // Don't allocate special registers
    }
  }];
}
</pre>
</div>
668 <div class="doc_text">
669 <p>Using <tt>SparcRegisterInfo.td</tt> with TableGen generates several output
670 files that are intended for inclusion in other source code that you write.
671 <tt>SparcRegisterInfo.td</tt> generates <tt>SparcGenRegisterInfo.h.inc</tt>, which should be
included in the header file for the SPARC register
implementation that you write (<tt>SparcRegisterInfo.h</tt>). In
<tt>SparcGenRegisterInfo.h.inc</tt>, a new structure called
SparcGenRegisterInfo is defined that uses TargetRegisterInfo as its base. It also
676 specifies types, based upon the defined register classes: DFPRegsClass, FPRegsClass,
677 and IntRegsClass. </p>
<p><tt>SparcRegisterInfo.td</tt> also generates <tt>SparcGenRegisterInfo.inc</tt>,
680 which is included at the bottom of <tt>SparcRegisterInfo.cpp</tt>, the SPARC register
681 implementation. The code below shows only the generated integer registers and
682 associated register classes. The order of registers in IntRegs reflects the
683 order in the definition of IntRegs in the target description file. Take special
684 note of the use of MethodBodies in <tt>SparcRegisterInfo.td</tt> to create code in
685 <tt>SparcGenRegisterInfo.inc</tt>. MethodProtos generates similar code in
686 <tt>SparcGenRegisterInfo.h.inc</tt>.</p>
<div class="doc_code">
<pre>  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3, SP::I4, SP::I5, SP::O0, SP::O1,
    SP::O2, SP::O3, SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3, SP::G4, SP::O6,
    SP::I6, SP::I7, SP::G0, SP::G5, SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    ...
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  }

  // IntRegs Sub-register Classess...
  static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
    ...
  };

  // IntRegs Super-register Classess...
  static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
    ...
  };

  // IntRegs Register Class sub-classes...
  static const TargetRegisterClass* const IntRegsSubclasses [] = {
    ...
  };

  // IntRegs Register Class super-classes...
  static const TargetRegisterClass* const IntRegsSuperclasses [] = {
    ...
  };

  IntRegsClass::iterator
  IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
    return end()-10;  // Don't allocate special registers
  }

  IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
    IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
    IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
</pre>
</div>
743 <!-- ======================================================================= -->
744 <div class="doc_subsection">
745 <a name="implementRegister">Implement a subclass of</a>
746 <a href="http://www.llvm.org/docs/CodeGenerator.html#targetregisterinfo">TargetRegisterInfo</a>
748 <div class="doc_text">
<p>The final step is to hand-code portions of XXXRegisterInfo, which
750 implements the interface described in <tt>TargetRegisterInfo.h</tt>. These functions
751 return 0, NULL, or false, unless overridden. Here’s a list of functions that
752 are overridden for the SPARC implementation in <tt>SparcRegisterInfo.cpp</tt>:</p>
754 <li><tt>getCalleeSavedRegs</tt> (returns a list of callee-saved registers in
755 the order of the desired callee-save stack frame offset)</li>
757 <li><tt>getCalleeSavedRegClasses</tt> (returns a list of preferred register
758 classes with which to spill each callee saved register)</li>
760 <li><tt>getReservedRegs</tt> (returns a bitset indexed by physical register
761 numbers, indicating if a particular register is unavailable)</li>
<li><tt>hasFP</tt> (returns a Boolean indicating if a function should have a
764 dedicated frame pointer register)</li>
766 <li><tt>eliminateCallFramePseudoInstr</tt> (if call frame setup or destroy
767 pseudo instructions are used, this can be called to eliminate them)</li>
769 <li><tt>eliminateFrameIndex</tt> (eliminate abstract frame indices from
770 instructions that may use them)</li>
772 <li><tt>emitPrologue</tt> (insert prologue code into the function)</li>
774 <li><tt>emitEpilogue</tt> (insert epilogue code into the function)</li>
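<p>As one illustration of what such an override computes, <tt>getReservedRegs</tt> is
described above as returning a bitset indexed by physical register number. The
self-contained C++ sketch below models that idea with <tt>std::vector&lt;bool&gt;</tt>
standing in for LLVM's own bit vector type. The register numbers, the choice of
reserved registers, and the function names are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical physical register numbers for this sketch only.
enum {
  G0Sketch = 1, // constant zero on SPARC
  O6Sketch = 2, // stack pointer on SPARC
  I6Sketch = 3, // frame pointer on SPARC
  L0Sketch = 4, // an ordinary local register
  NumRegsSketch = 5
};

// Build a bitset indexed by register number; a set bit means "unavailable".
inline std::vector<bool> getReservedRegsSketch() {
  std::vector<bool> Reserved(NumRegsSketch, false);
  Reserved[G0Sketch] = true;
  Reserved[O6Sketch] = true;
  Reserved[I6Sketch] = true;
  return Reserved;
}

// A register may be allocated only if its bit is clear.
inline bool isAllocatableSketch(const std::vector<bool> &Reserved,
                                unsigned Reg) {
  assert(Reg < Reserved.size() && "register number out of range");
  return !Reserved[Reg];
}
```

<p>Indexing by register number keeps the availability test a single lookup, which
matters because the register allocator asks this question very often.</p>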
778 <!-- *********************************************************************** -->
779 <div class="doc_section">
780 <a name="InstructionSet">Instruction Set</a>
782 <!-- *********************************************************************** -->
783 <div class="doc_text">
784 <p>During the early stages of code generation, the LLVM IR code is
785 converted to a SelectionDAG with nodes that are instances of the SDNode class
786 containing target instructions. An SDNode has an opcode, operands, type
787 requirements, and operation properties (for example, is an operation
788 commutative, does an operation load from memory). The various operation node
789 types are described in the <tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt> file (values
790 of the NodeType enum in the ISD namespace).</p>
792 <p>TableGen uses the following target description (.td) input files
793 to generate much of the code for instruction definition:</p>
795 <li><tt>Target.td</tt>, where the Instruction, Operand, InstrInfo, and other
796 fundamental classes are defined</li>
798 <li><tt>TargetSelectionDAG.td</tt>, used by SelectionDAG instruction selection
799 generators, contains SDTC* classes (selection DAG type constraint), definitions
800 of SelectionDAG nodes (such as imm, cond, bb, add, fadd, sub), and pattern
801 support (Pattern, Pat, PatFrag, PatLeaf, ComplexPattern)</li>
<li><tt>XXXInstrFormats.td</tt>, patterns for definitions of target-specific
instructions</li>
806 <li><tt>XXXInstrInfo.td</tt>, target-specific definitions of instruction
807 templates, condition codes, and instructions of an instruction set. (For architecture
modifications, a different file name may be used. For example, for Pentium with
SSE instructions, this file is <tt>X86InstrSSE.td</tt>, and for Pentium with MMX, this
810 file is <tt>X86InstrMMX.td</tt>.)</li>
812 <p>There is also a target-specific <tt>XXX.td</tt> file, where XXX is the
813 name of the target. The <tt>XXX.td</tt> file includes the other .td input files, but its
814 contents are only directly important for subtargets.</p>
816 <p>You should describe
817 a concrete target-specific class
818 XXXInstrInfo that represents machine
819 instructions supported by a target machine. XXXInstrInfo contains an array of
820 XXXInstrDescriptor objects, each of which describes one instruction. An
821 instruction descriptor defines:</p>
823 <li>opcode mnemonic</li>
825 <li>number of operands</li>
827 <li>list of implicit register definitions and uses</li>
<li>target-independent properties (such as memory access, is
commutative)</li>
832 <li>target-specific flags </li>
835 <p>The Instruction class (defined in <tt>Target.td</tt>) is mostly used as a
836 base for more complex instruction classes.</p>
<div class="doc_code">
<pre>class Instruction {
  string Namespace = "";
  dag OutOperandList;       // A dag containing the MI def operand list.
  dag InOperandList;        // A dag containing the MI use operand list.
  string AsmString = "";    // The .s format to print the instruction with.
  list<dag> Pattern;        // Set to the DAG pattern for this instruction.
  list<Register> Uses = [];
  list<Register> Defs = [];
  list<Predicate> Predicates = [];  // predicates turned into isel match code
  ... remainder not shown for space ...
}
</pre>
</div>
853 <div class="doc_text">
854 <p>A SelectionDAG node (SDNode) should contain an object
855 representing a target-specific instruction that is defined in <tt>XXXInstrInfo.td</tt>. The
856 instruction objects should represent instructions from the architecture manual
857 of the target machine (such as the
SPARC Architecture Manual for the SPARC target).</p>
<p>A single
instruction from the architecture manual is often modeled as multiple target
862 instructions, depending upon its operands. For example, a manual might
863 describe an add instruction that takes a register or an immediate operand. An
864 LLVM target could model this with two instructions named ADDri and ADDrr.</p>
866 <p>You should define a
867 class for each instruction category and define each opcode as a subclass of the
868 category with appropriate parameters such as the fixed binary encoding of
869 opcodes and extended opcodes. You should map the register bits to the bits of
the instruction in which they are encoded (for the JIT). You should also specify
how the instruction should be printed when the automatic assembly printer is
used.</p>
<p>As described in
the SPARC Architecture Manual, Version 8, there are three major 32-bit formats
for instructions. Format 1 is only for the CALL instruction. Format 2 is for
branch on condition codes and SETHI (set high bits of a register) instructions.
Format 3 is for other instructions.</p>
<p>Each of these
formats has corresponding classes in <tt>SparcInstrFormats.td</tt>. InstSP is a base
class for other instruction classes. Additional base classes are specified for
more precise formats: for example, in <tt>SparcInstrFormats.td</tt>, F2_1 is for SETHI,
884 and F2_2 is for branches. There are three other base classes: F3_1 for
885 register/register operations, F3_2 for register/immediate operations, and F3_3 for
886 floating-point operations. <tt>SparcInstrInfo.td</tt> also adds the base class Pseudo for
887 synthetic SPARC instructions. </p>
889 <p><tt>SparcInstrInfo.td</tt>
890 largely consists of operand and instruction definitions for the SPARC target. In
891 <tt>SparcInstrInfo.td</tt>, the following target description file entry, LDrr, defines
892 the Load Integer instruction for a Word (the LD SPARC opcode) from a memory
893 address to a register. The first parameter, the value 3 (11<sub>2</sub>), is
894 the operation value for this category of operation. The second parameter
895 (000000<sub>2</sub>) is the specific operation value for LD/Load Word. The
896 third parameter is the output destination, which is a register operand and
897 defined in the Register target description file (IntRegs). </p>
899 <div class="doc_code">
900 <pre>def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
901 "ld [$addr], $dst",
902 [(set IntRegs:$dst, (load ADDRrr:$addr))]>;
906 <div class="doc_text">
<p>The fourth
parameter is the input source, which uses the address operand MEMrr that is
defined earlier in <tt>SparcInstrInfo.td</tt>:</p>
<div class="doc_code">
<pre>def MEMrr : Operand<i32> {
  let PrintMethod = "printMemOperand";
  let MIOperandInfo = (ops IntRegs, IntRegs);
}
</pre>
</div>
918 <div class="doc_text">
919 <p>The fifth parameter is a string that is used by the assembly
920 printer and can be left as an empty string until the assembly printer interface
921 is implemented. The sixth and final parameter is the pattern used to match the
instruction during the SelectionDAG Select Phase described in
<a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM Target-Independent Code Generator</a>.
924 This parameter is detailed in the next section, <a href="#InstructionSelector">Instruction Selector</a>.</p>
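<p>The first two parameters of these definitions end up as fixed bit fields of
the 32-bit instruction word. As a rough illustration, the following standalone
C++ sketch packs the SPARC V8 format 3 fields by hand. The function names are
hypothetical; in LLVM the encoding is derived by TableGen from the format
classes, not written like this.</p>

```cpp
#include <cassert>
#include <cstdint>

// SPARC V8 format 3, register/register form (i = 0):
//   op[31:30] rd[29:25] op3[24:19] rs1[18:14] i[13] asi[12:5] rs2[4:0]
// The asi field is left at zero in this sketch.
inline std::uint32_t encodeF3_1(unsigned Op, unsigned Op3, unsigned Rd,
                                unsigned Rs1, unsigned Rs2) {
  assert(Op < 4 && Op3 < 64 && Rd < 32 && Rs1 < 32 && Rs2 < 32);
  return (std::uint32_t(Op) << 30) | (std::uint32_t(Rd) << 25) |
         (std::uint32_t(Op3) << 19) | (std::uint32_t(Rs1) << 14) |
         std::uint32_t(Rs2);
}

// SPARC V8 format 3, register/immediate form (i = 1):
//   op[31:30] rd[29:25] op3[24:19] rs1[18:14] i[13] simm13[12:0]
inline std::uint32_t encodeF3_2(unsigned Op, unsigned Op3, unsigned Rd,
                                unsigned Rs1, int Simm13) {
  assert(Op < 4 && Op3 < 64 && Rd < 32 && Rs1 < 32);
  assert(Simm13 >= -4096 && Simm13 <= 4095 && "must fit in 13 signed bits");
  return (std::uint32_t(Op) << 30) | (std::uint32_t(Rd) << 25) |
         (std::uint32_t(Op3) << 19) | (std::uint32_t(Rs1) << 14) |
         (std::uint32_t(1) << 13) | (std::uint32_t(Simm13) & 0x1FFFu);
}
```

<p>With op = 3 and op3 = 000000<sub>2</sub>, as in LDrr,
<tt>encodeF3_1(3, 0, 3, 1, 2)</tt> describes a load through registers 1 and 2 into
register 3.</p>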
926 <p>Instruction class definitions are not overloaded for different
927 operand types, so separate versions of instructions are needed for register,
memory, or immediate value operands. For example, to load a word into a
register from an address formed by a register plus an immediate offset,
the following instruction class is
used:</p>
933 <div class="doc_code">
934 <pre>def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
935 "ld [$addr], $dst",
936 [(set IntRegs:$dst, (load ADDRri:$addr))]>;
939 <div class="doc_text">
940 <p>Writing these definitions for so many similar instructions can
941 involve a lot of cut and paste. In td files, the <tt>multiclass</tt> directive enables
942 the creation of templates to define several instruction classes at once (using
the <tt>defm</tt> directive). For example, in
<tt>SparcInstrInfo.td</tt>, the <tt>multiclass</tt> pattern F3_12 is defined to create two
instruction classes each time F3_12 is invoked:</p>
<div class="doc_code">
<pre>multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
  def rr  : F3_1 <2, Op3Val,
                 (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                 !strconcat(OpcStr, " $b, $c, $dst"),
                 [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
  def ri  : F3_2 <2, Op3Val,
                 (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                 !strconcat(OpcStr, " $b, $c, $dst"),
                 [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
}
</pre>
</div>
960 <div class="doc_text">
961 <p>So when the <tt>defm</tt> directive is used for the XOR and ADD
962 instructions, as seen below, it creates four instruction objects: XORrr, XORri,
963 ADDrr, and ADDri.</p>
965 <div class="doc_code">
966 <pre>defm XOR : F3_12<"xor", 0b000011, xor>;
967 defm ADD : F3_12<"add", 0b000000, add>;
971 <div class="doc_text">
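<p>The effect of the <tt>defm</tt> lines above can be mimicked in plain C++ to see
what the multiclass expands to. This is only a sketch: <tt>ToyInstr</tt> and
<tt>expandF3_12</tt> are hypothetical names, and TableGen itself works on records,
not on structs like these.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// A toy stand-in for one generated instruction record.
struct ToyInstr {
  std::string Name;   // e.g. "XORrr"
  unsigned Op;        // major opcode, always 2 for F3_12
  unsigned Op3;       // 6-bit op3 value passed to the multiclass
  std::string Asm;    // assembly string built like !strconcat
  bool HasImmediate;  // false for the rr form, true for the ri form
};

// Produces the two records that one `defm NAME : F3_12<...>` would create.
inline std::vector<ToyInstr> expandF3_12(const std::string &DefmName,
                                         const std::string &OpcStr,
                                         unsigned Op3Val) {
  assert(Op3Val < 64 && "op3 is a 6-bit field");
  const std::string Asm = OpcStr + " $b, $c, $dst";
  std::vector<ToyInstr> Result;
  Result.push_back({DefmName + "rr", 2, Op3Val, Asm, false});
  Result.push_back({DefmName + "ri", 2, Op3Val, Asm, true});
  return Result;
}
```

<p>Calling <tt>expandF3_12("XOR", "xor", 0x03)</tt> yields the records named XORrr and
XORri, which mirrors how the def names inside the multiclass are appended to
the <tt>defm</tt> name.</p>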
972 <p><tt>SparcInstrInfo.td</tt>
973 also includes definitions for condition codes that are referenced by branch
instructions. The following definitions in <tt>SparcInstrInfo.td</tt> give the numeric
value of each SPARC condition code; for example, the value 10 represents
the ‘greater than’ condition for integers, and the value 22
represents the ‘greater than’ condition for floats. </p>
980 <div class="doc_code">
981 <pre>def ICC_NE : ICC_VAL< 9>; // Not Equal
982 def ICC_E : ICC_VAL< 1>; // Equal
983 def ICC_G : ICC_VAL<10>; // Greater
985 def FCC_U : FCC_VAL<23>; // Unordered
986 def FCC_G : FCC_VAL<22>; // Greater
987 def FCC_UG : FCC_VAL<21>; // Unordered or Greater
992 <div class="doc_text">
993 <p>(Note that <tt>Sparc.h</tt>
994 also defines enums that correspond to the same SPARC condition codes. Care must
995 be taken to ensure the values in <tt>Sparc.h</tt> correspond to the values in
996 <tt>SparcInstrInfo.td</tt>; that is, <tt>SPCC::ICC_NE = 9</tt>, <tt>SPCC::FCC_U = 23</tt> and so on.)</p>
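<p>One way to guard against the two lists drifting apart is a small consistency
check. The sketch below is standalone C++ and only covers the six codes shown
above; the table <tt>TdCondCodes</tt> and the helper <tt>inSync</tt> are hypothetical
and would have to be kept up to date by hand.</p>

```cpp
#include <cassert>
#include <cstring>

namespace SPCC {
// Hand-written enum, in the spirit of the one the text says lives in Sparc.h.
enum CondCodes {
  ICC_NE = 9,  ICC_E = 1,  ICC_G = 10,
  FCC_U = 23,  FCC_G = 22, FCC_UG = 21
};
}

// Values copied from the .td definitions shown above.
struct TdCondCode { const char *Name; int Value; };
static const TdCondCode TdCondCodes[] = {
  {"ICC_NE", 9}, {"ICC_E", 1}, {"ICC_G", 10},
  {"FCC_U", 23}, {"FCC_G", 22}, {"FCC_UG", 21},
};

// Returns true if the enum value agrees with the value recorded for the
// .td definition of the same name; false if it differs or is unknown.
inline bool inSync(const char *Name, SPCC::CondCodes CC) {
  assert(Name && "condition code name required");
  for (const TdCondCode &E : TdCondCodes)
    if (std::strcmp(E.Name, Name) == 0)
      return E.Value == static_cast<int>(CC);
  return false;
}
```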
999 <!-- ======================================================================= -->
1000 <div class="doc_subsection">
1001 <a name="implementInstr">Implement a subclass of </a>
1002 <a href="http://www.llvm.org/docs/CodeGenerator.html#targetinstrinfo">TargetInstrInfo</a>
1005 <div class="doc_text">
<p>The final step is to hand-code portions of XXXInstrInfo, which
implements the interface described in <tt>TargetInstrInfo.h</tt>. These functions return
0 or a Boolean or they assert, unless overridden. Here is a list of functions
that are overridden for the SPARC implementation in <tt>SparcInstrInfo.cpp</tt>:</p>
1011 <li><tt>isMoveInstr</tt> (return true if the instruction is a register to
1012 register move; false, otherwise)</li>
1014 <li><tt>isLoadFromStackSlot</tt> (if the specified machine instruction is a
1015 direct load from a stack slot, return the register number of the destination
1016 and the FrameIndex of the stack slot)</li>
<li><tt>isStoreToStackSlot</tt> (if the specified machine instruction is a
direct store to a stack slot, return the register number of the source register and
the FrameIndex of the stack slot)</li>
1022 <li><tt>copyRegToReg</tt> (copy values between a pair of registers)</li>
1024 <li><tt>storeRegToStackSlot</tt> (store a register value to a stack slot)</li>
1026 <li><tt>loadRegFromStackSlot</tt> (load a register value from a stack slot)</li>
1028 <li><tt>storeRegToAddr</tt> (store a register value to memory)</li>
1030 <li><tt>loadRegFromAddr</tt> (load a register value from memory)</li>
<li><tt>foldMemoryOperand</tt> (attempt to fold a load or store for the
specified operand(s) into the instruction that uses them)</li>
1037 <!-- ======================================================================= -->
1038 <div class="doc_subsection">
1039 <a name="branchFolding">Branch Folding and If Conversion</a>
1041 <div class="doc_text">
1042 <p>Performance can be improved by combining instructions or by eliminating
1043 instructions that are never reached. The <tt>AnalyzeBranch</tt> method in XXXInstrInfo may
1044 be implemented to examine conditional instructions and remove unnecessary
1045 instructions. <tt>AnalyzeBranch</tt> looks at the end of a machine basic block (MBB) for
1046 opportunities for improvement, such as branch folding and if conversion. The
1047 <tt>BranchFolder</tt> and <tt>IfConverter</tt> machine function passes (see the source files
1048 <tt>BranchFolding.cpp</tt> and <tt>IfConversion.cpp</tt> in the <tt>lib/CodeGen</tt> directory) call
<tt>AnalyzeBranch</tt> to improve the control flow graph that represents the
instructions.</p>
1052 <p>Several implementations of <tt>AnalyzeBranch</tt> (for ARM, Alpha, and
1053 X86) can be examined as models for your own <tt>AnalyzeBranch</tt> implementation. Since
SPARC does not implement a useful <tt>AnalyzeBranch</tt>, the ARM target implementation
is shown below.</p>
1057 <p><tt>AnalyzeBranch</tt> returns a Boolean value and takes four parameters:</p>
<li>MachineBasicBlock &MBB – the incoming block to be
examined</li>
<li>MachineBasicBlock *&TBB – a destination block that is
returned; for a conditional branch that evaluates to true, TBB is the
destination</li>
1066 <li>MachineBasicBlock *&FBB – for a conditional branch that
1067 evaluates to false, FBB is returned as the destination</li>
1069 <li>std::vector<MachineOperand> &Cond – list of
1070 operands to evaluate a condition for a conditional branch</li>
1073 <p>In the simplest case, if a block ends without a branch, then it
1074 falls through to the successor block. No destination blocks are specified for
1075 either TBB or FBB, so both parameters return NULL. The start of the <tt>AnalyzeBranch</tt>
1076 (see code below for the ARM target) shows the function parameters and the code
1077 for the simplest case.</p>
<div class="doc_code">
<pre>bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
        MachineBasicBlock *&TBB, MachineBasicBlock *&FBB,
        std::vector<MachineOperand> &Cond) const
{
  MachineBasicBlock::iterator I = MBB.end();
  // No terminator at the end of the block: it falls through.
  if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
    return false;
</pre>
</div>
1091 <div class="doc_text">
1092 <p>If a block ends with a single unconditional branch instruction,
1093 then <tt>AnalyzeBranch</tt> (shown below) should return the destination of that branch
1094 in the TBB parameter. </p>
<div class="doc_code">
<pre>if (LastOpc == ARM::B || LastOpc == ARM::tB) {
  TBB = LastInst->getOperand(0).getMBB();
  return false;
}
</pre>
</div>
1105 <div class="doc_text">
1106 <p>If a block ends with two unconditional branches, then the second
1107 branch is never reached. In that situation, as shown below, remove the last
1108 branch instruction and return the penultimate branch in the TBB parameter. </p>
<div class="doc_code">
<pre>if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
    (LastOpc == ARM::B || LastOpc == ARM::tB)) {
  TBB = SecondLastInst->getOperand(0).getMBB();
  I = LastInst;            // the unreachable second branch
  I->eraseFromParent();
  return false;
}
</pre>
</div>
1121 <div class="doc_text">
<p>A block may end with a single conditional branch instruction that
falls through to the successor block if the condition evaluates to false. In that
1124 case, <tt>AnalyzeBranch</tt> (shown below) should return the destination of that
1125 conditional branch in the TBB parameter and a list of operands in the <tt>Cond</tt>
1126 parameter to evaluate the condition. </p>
<div class="doc_code">
<pre>if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
  // Block ends with fall-through condbranch.
  TBB = LastInst->getOperand(0).getMBB();
  Cond.push_back(LastInst->getOperand(1));
  Cond.push_back(LastInst->getOperand(2));
  return false;
}
</pre>
</div>
1140 <div class="doc_text">
1141 <p>If a block ends with both a conditional branch and an ensuing
1142 unconditional branch, then <tt>AnalyzeBranch</tt> (shown below) should return the
1143 conditional branch destination (assuming it corresponds to a conditional
1144 evaluation of ‘true’) in the TBB parameter and the unconditional branch
1145 destination in the FBB (corresponding to a conditional evaluation of ‘false’).
A list of operands to evaluate the condition should be returned in the <tt>Cond</tt>
parameter.</p>
<div class="doc_code">
<pre>unsigned SecondLastOpc = SecondLastInst->getOpcode();
if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
    (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
  TBB = SecondLastInst->getOperand(0).getMBB();
  Cond.push_back(SecondLastInst->getOperand(1));
  Cond.push_back(SecondLastInst->getOperand(2));
  FBB = LastInst->getOperand(0).getMBB();
  return false;
}
</pre>
</div>
1163 <div class="doc_text">
1164 <p>For the last two cases (ending with a single conditional branch or
1165 ending with one conditional and one unconditional branch), the operands returned
1166 in the <tt>Cond</tt> parameter can be passed to methods of other instructions to create
new branches or perform other operations. An implementation of <tt>AnalyzeBranch</tt>
requires the helper methods <tt>RemoveBranch</tt> and <tt>InsertBranch</tt> to manage subsequent
operations.</p>
1171 <p><tt>AnalyzeBranch</tt> should return false indicating success in most circumstances.
1172 <tt>AnalyzeBranch</tt> should only return true when the method is stumped about what to
1173 do, for example, if a block has three terminating branches. <tt>AnalyzeBranch</tt> may
return true if it encounters a terminator it cannot handle, such as an indirect
branch.</p>
1178 <!-- *********************************************************************** -->
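<p>The cases described in this section can be collected into one small model.
The following standalone C++ sketch is not the ARM implementation: blocks are
plain integers, -1 stands in for NULL, and <tt>ToyInst</tt>, <tt>ToyBlock</tt>, and
<tt>analyzeBranch</tt> are hypothetical names. It keeps the same convention of
returning false on success and true when the block cannot be understood.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Other, Branch, CondBranch, IndirectBranch };

struct ToyInst {
  Kind K;
  int Target;    // destination block id, -1 if none
  int CondCode;  // only meaningful for CondBranch
};

struct ToyBlock { std::vector<ToyInst> Insts; };

inline bool isTerminator(const ToyInst &I) { return I.K != Kind::Other; }

// Returns false on success, true if the terminators are not understood.
inline bool analyzeBranch(ToyBlock &MBB, int &TBB, int &FBB,
                          std::vector<int> &Cond) {
  TBB = FBB = -1;
  Cond.clear();
  const std::size_t N = MBB.Insts.size();
  // Case 1: no terminator, the block falls through.
  if (N == 0 || !isTerminator(MBB.Insts[N - 1]))
    return false;
  const ToyInst Last = MBB.Insts[N - 1];
  assert((Last.K == Kind::IndirectBranch || Last.Target >= 0) &&
         "direct branches need a target block");
  if (N < 2 || !isTerminator(MBB.Insts[N - 2])) {
    if (Last.K == Kind::Branch) {          // single unconditional branch
      TBB = Last.Target;
      return false;
    }
    if (Last.K == Kind::CondBranch) {      // conditional branch + fall-through
      TBB = Last.Target;
      Cond.push_back(Last.CondCode);
      return false;
    }
    return true;                           // e.g. an indirect branch
  }
  // Three or more terminators: give up.
  if (N >= 3 && isTerminator(MBB.Insts[N - 3]))
    return true;
  const ToyInst Second = MBB.Insts[N - 2];
  if (Second.K == Kind::CondBranch && Last.K == Kind::Branch) {
    TBB = Second.Target;
    Cond.push_back(Second.CondCode);
    FBB = Last.Target;
    return false;
  }
  if (Second.K == Kind::Branch && Last.K == Kind::Branch) {
    TBB = Second.Target;
    MBB.Insts.pop_back();                  // second branch is unreachable
    return false;
  }
  return true;
}
```

<p>The order of the checks follows the order in which the cases are presented
above; a real implementation also has to deal with predicated terminators,
which this sketch ignores.</p>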
1179 <div class="doc_section">
1180 <a name="InstructionSelector">Instruction Selector</a>
1182 <!-- *********************************************************************** -->
1184 <div class="doc_text">
1185 <p>LLVM uses a SelectionDAG to represent LLVM IR instructions, and nodes
1186 of the SelectionDAG ideally represent native target instructions. During code
1187 generation, instruction selection passes are performed to convert non-native
1188 DAG instructions into native target-specific instructions. The pass described
1189 in <tt>XXXISelDAGToDAG.cpp</tt> is used to match patterns and perform DAG-to-DAG
1190 instruction selection. Optionally, a pass may be defined (in
1191 <tt>XXXBranchSelector.cpp</tt>) to perform similar DAG-to-DAG operations for branch
instructions. Later,
the code in <tt>XXXISelLowering.cpp</tt> replaces or removes (legalizes) operations
and data types that are not natively supported in a SelectionDAG.</p>
1196 <p>TableGen generates code for instruction selection using the
1197 following target description input files:</p>
<li><tt>XXXInstrInfo.td</tt> contains definitions of instructions in a
target-specific instruction set, and it generates <tt>XXXGenDAGISel.inc</tt>, which is
included in <tt>XXXISelDAGToDAG.cpp</tt>.</li>
1203 <li><tt>XXXCallingConv.td</tt> contains the calling and return value conventions
1204 for the target architecture, and it generates <tt>XXXGenCallingConv.inc</tt>, which is
1205 included in <tt>XXXISelLowering.cpp</tt>.</li>
1208 <p>The implementation of an instruction selection pass must include
1209 a header that declares the FunctionPass class or a subclass of FunctionPass. In
1210 <tt>XXXTargetMachine.cpp</tt>, a Pass Manager (PM) should add each instruction selection
1211 pass into the queue of passes to run.</p>
<p>The LLVM static
compiler (<tt>llc</tt>) is an excellent tool for visualizing the contents of DAGs. To display
1215 the SelectionDAG before or after specific processing phases, use the command
1216 line options for <tt>llc</tt>, described at <a
1217 href="http://llvm.org/docs/CodeGenerator.html#selectiondag_process">
1218 SelectionDAG Instruction Selection Process</a>.
1221 <p>To describe instruction selector behavior, you should add
1222 patterns for lowering LLVM code into a SelectionDAG as the last parameter of
1223 the instruction definitions in <tt>XXXInstrInfo.td</tt>. For example, in
1224 <tt>SparcInstrInfo.td</tt>, this entry defines a register store operation, and the last
1225 parameter describes a pattern with the store DAG operator.</p>
1228 <div class="doc_code">
1229 <pre>def STrr : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
1230 "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
1234 <div class="doc_text">
1235 <p>ADDRrr is a memory mode that is also defined in <tt>SparcInstrInfo.td</tt>:</p>
1238 <div class="doc_code">
1239 <pre>def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
1243 <div class="doc_text">
<p>The definition of ADDRrr refers to SelectADDRrr, which is a function defined in an
implementation of the Instruction Selector (such as <tt>SparcISelDAGToDAG.cpp</tt>). </p>
1247 <p>In <tt>lib/Target/TargetSelectionDAG.td</tt>, the DAG operator for store
1248 is defined below:</p>
<div class="doc_code">
<pre>def store : PatFrag<(ops node:$val, node:$ptr),
                    (st node:$val, node:$ptr), [{
  if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
    return !ST->isTruncatingStore() &&
           ST->getAddressingMode() == ISD::UNINDEXED;
  return false;
}]>;
</pre>
</div>
1261 <div class="doc_text">
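<p>The C++ fragment embedded in the <tt>store</tt> definition above is an ordinary
predicate over a node. To show its logic in isolation, here is a standalone
sketch with a hypothetical <tt>ToyNode</tt> type standing in for SDNode; the opcode
test plays the role of the <tt>dyn_cast</tt>.</p>

```cpp
#include <cassert>

enum class AddrMode { Unindexed, PreIndexed, PostIndexed };

// Hypothetical stand-in for an SDNode that may or may not be a store.
struct ToyNode {
  enum Opcode { Load, Store } Op;
  bool Truncating;  // true for a truncating store
  AddrMode Mode;    // addressing mode of the memory access
};

// Mirrors the predicate: accept only plain (non-truncating, unindexed) stores.
inline bool predicateStore(const ToyNode *N) {
  assert(N && "the matcher always passes a node");
  if (N->Op != ToyNode::Store)  // corresponds to a failed dyn_cast
    return false;
  return !N->Truncating && N->Mode == AddrMode::Unindexed;
}
```

<p>A pattern that uses <tt>store</tt> is only considered for nodes on which this kind
of predicate returns true, which is why truncating or indexed stores need their
own fragments.</p>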
1262 <p><tt>XXXInstrInfo.td</tt> also generates (in <tt>XXXGenDAGISel.inc</tt>) the
1263 <tt>SelectCode</tt> method that is used to call the appropriate processing method for an
1264 instruction. In this example, <tt>SelectCode</tt> calls <tt>Select_ISD_STORE</tt> for the
1265 ISD::STORE opcode.</p>
1268 <div class="doc_code">
1269 <pre>SDNode *SelectCode(SDOperand N) {
1271 MVT::ValueType NVT = N.Val->getValueType(0);
1272 switch (N.getOpcode()) {
1276 return Select_ISD_STORE(N);
1284 <div class="doc_text">
1285 <p>The pattern for STrr is matched, so elsewhere in
<tt>XXXGenDAGISel.inc</tt>, code for STrr is created for <tt>Select_ISD_STORE</tt>. The <tt>Emit_22</tt> method
is also generated in <tt>XXXGenDAGISel.inc</tt> to complete the processing of this
instruction.</p>
1291 <div class="doc_code">
1292 <pre>SDNode *Select_ISD_STORE(const SDOperand &N) {
1293 SDOperand Chain = N.getOperand(0);
1294 if (Predicate_store(N.Val)) {
1295 SDOperand N1 = N.getOperand(1);
1296 SDOperand N2 = N.getOperand(2);
1300 // Pattern: (st:void IntRegs:i32:$src,
1301 // ADDRrr:i32:$addr)<<P:Predicate_store>>
1302 // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
1303 // Pattern complexity = 13 cost = 1 size = 0
1304 if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
1305 N1.Val->getValueType(0) == MVT::i32 &&
1306 N2.Val->getValueType(0) == MVT::i32) {
1307 return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
1313 <!-- ======================================================================= -->
1314 <div class="doc_subsection">
1315 <a name="LegalizePhase">The SelectionDAG Legalize Phase</a>
1317 <div class="doc_text">
1318 <p>The Legalize phase converts a DAG to use types and operations
1319 that are natively supported by the target. For natively unsupported types and
1320 operations, you need to add code to the target-specific XXXTargetLowering implementation
1321 to convert unsupported types and operations to supported ones.</p>
<p>In the constructor for the XXXTargetLowering class, first use the
<tt>addRegisterClass</tt> method to specify which types are supported and which register
classes are associated with them. The code for the register classes is generated
by TableGen from <tt>XXXRegisterInfo.td</tt> and placed in <tt>XXXGenRegisterInfo.h.inc</tt>. For
1327 example, the implementation of the constructor for the SparcTargetLowering
1328 class (in <tt>SparcISelLowering.cpp</tt>) starts with the following code:</p>
1331 <div class="doc_code">
1332 <pre>addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
1333 addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
1334 addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
1338 <div class="doc_text">
1339 <p>You should examine the node types in the ISD namespace
1340 (<tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt>)
1341 and determine which operations the target natively supports. For operations
1342 that do <b>not</b> have native support, add a callback to the constructor for
1343 the XXXTargetLowering class, so the instruction selection process knows what to
1344 do. The TargetLowering class callback methods (declared in
1345 <tt>llvm/Target/TargetLowering.h</tt>) are:</p>
1347 <li><tt>setOperationAction</tt> (general operation)</li>
1349 <li><tt>setLoadExtAction</tt> (load with extension)</li>
1351 <li><tt>setTruncStoreAction</tt> (truncating store)</li>
1353 <li><tt>setIndexedLoadAction</tt> (indexed load)</li>
1355 <li><tt>setIndexedStoreAction</tt> (indexed store)</li>
1357 <li><tt>setConvertAction</tt> (type conversion)</li>
1359 <li><tt>setCondCodeAction</tt> (support for a given condition code)</li>
1362 <p>Note: on older releases, <tt>setLoadXAction</tt> is used instead of <tt>setLoadExtAction</tt>.
1363 Also, on older releases, <tt>setCondCodeAction</tt> may not be supported. Examine your
1364 release to see what methods are specifically supported.</p>
<p>These callbacks are used to record whether an operation does or
does not work with a specified type (or types). In all cases, the third
parameter is a LegalizeAction enum value: <tt>Promote</tt>, <tt>Expand</tt>,
<tt>Custom</tt>, or <tt>Legal</tt>. <tt>SparcISelLowering.cpp</tt>
contains examples of all four LegalizeAction values.</p>
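<p>Conceptually, these callbacks fill in a table keyed by operation and type,
with <tt>Legal</tt> as the default entry. The standalone C++ sketch below models that
table; <tt>ToyTargetLowering</tt>, the opcode enum, and the type enum are
hypothetical and much smaller than the real ISD and MVT lists.</p>

```cpp
#include <cassert>
#include <map>
#include <utility>

enum LegalizeAction { Legal, Promote, Expand, Custom };
enum ToyOpcode { ADD, FSIN, FCOS, FP_TO_SINT, CTPOP, NumToyOpcodes };
enum ToyType { i32, f32, f64, NumToyTypes };

class ToyTargetLowering {
  typedef std::map<std::pair<int, int>, LegalizeAction> ActionMap;
  ActionMap Actions;  // (opcode, type) -> action; missing entries are Legal

public:
  void setOperationAction(ToyOpcode Op, ToyType VT, LegalizeAction A) {
    assert(Op < NumToyOpcodes && VT < NumToyTypes);
    Actions[std::make_pair(int(Op), int(VT))] = A;
  }

  LegalizeAction getOperationAction(ToyOpcode Op, ToyType VT) const {
    ActionMap::const_iterator It =
        Actions.find(std::make_pair(int(Op), int(VT)));
    return It == Actions.end() ? Legal : It->second;
  }
};

// Registers the non-default actions used as examples in this section.
inline ToyTargetLowering makeToyLowering() {
  ToyTargetLowering TL;
  TL.setOperationAction(FSIN, f32, Expand);
  TL.setOperationAction(FCOS, f32, Expand);
  TL.setOperationAction(FP_TO_SINT, i32, Custom);
  return TL;
}
```

<p>Anything that was never registered, such as <tt>ADD</tt> on <tt>i32</tt> here, is
reported as <tt>Legal</tt>.</p>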
1373 <!-- _______________________________________________________________________ -->
1374 <div class="doc_subsubsection">
1375 <a name="promote">Promote</a>
1378 <div class="doc_text">
1379 <p>For an operation without native support for a given type, the
1380 specified type may be promoted to a larger type that is supported. For example,
1381 SPARC does not support a sign-extending load for Boolean values (<tt>i1</tt> type), so
in <tt>SparcISelLowering.cpp</tt> the third
parameter below, <tt>Promote</tt>, changes <tt>i1</tt> type
values to a larger type before loading.</p>
1387 <div class="doc_code">
1388 <pre>setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
1392 <!-- _______________________________________________________________________ -->
1393 <div class="doc_subsubsection">
1394 <a name="expand">Expand</a>
1396 <div class="doc_text">
1397 <p>For a type without native support, a value may need to be broken
1398 down further, rather than promoted. For an operation without native support, a
1399 combination of other operations may be used to similar effect. In SPARC, the
1400 floating-point sine and cosine trig operations are supported by expansion to
1401 other operations, as indicated by the third parameter, <tt>Expand</tt>, to
1402 <tt>setOperationAction</tt>:</p>
1405 <div class="doc_code">
1406 <pre>setOperationAction(ISD::FSIN, MVT::f32, Expand);
1407 setOperationAction(ISD::FCOS, MVT::f32, Expand);
1411 <!-- _______________________________________________________________________ -->
1412 <div class="doc_subsubsection">
1413 <a name="custom">Custom</a>
1415 <div class="doc_text">
1416 <p>For some operations, simple type promotion or operation expansion
may be insufficient. In some cases, a special intrinsic function must be
implemented.</p>
1420 <p>For example, a constant value may require special treatment, or
1421 an operation may require spilling and restoring registers in the stack and
1422 working with register allocators. </p>
<p>As seen in the <tt>SparcISelLowering.cpp</tt> code below, to perform a type
conversion from a floating-point value to a signed integer, first call
<tt>setOperationAction</tt> with <tt>Custom</tt> as the third parameter:</p>
1429 <div class="doc_code">
1430 <pre>setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
1433 <div class="doc_text">
1434 <p>In the <tt>LowerOperation</tt> method, for each <tt>Custom</tt> operation, a case
1435 statement should be added to indicate what function to call. In the following
1436 code, an FP_TO_SINT opcode will call the <tt>LowerFP_TO_SINT</tt> method:</p>
1439 <div class="doc_code">
1440 <pre>SDOperand SparcTargetLowering::LowerOperation(
1441 SDOperand Op, SelectionDAG &DAG) {
1442 switch (Op.getOpcode()) {
1443 case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
1449 <div class="doc_text">
1450 <p>Finally, the <tt>LowerFP_TO_SINT</tt> method is implemented, using an FP
1451 register to convert the floating-point value to an integer.</p>
<div class="doc_code">
<pre>static SDOperand LowerFP_TO_SINT(SDOperand Op, SelectionDAG &DAG) {
  assert(Op.getValueType() == MVT::i32);
  // Convert in an FP register, then reinterpret the bits as an integer.
  Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
  return DAG.getNode(ISD::BIT_CONVERT, MVT::i32, Op);
}
</pre>
</div>
1462 <!-- _______________________________________________________________________ -->
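<p>The two-step shape of this custom lowering (convert inside a floating-point
register, then reinterpret the register bits as an integer) can be imitated in
standalone C++. The sketch below is only an analogy: <tt>ToyFPReg</tt>, <tt>ftoi</tt>,
and <tt>bitConvert</tt> are hypothetical names, and no SelectionDAG nodes are
involved.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw contents of a 32-bit floating-point register.
struct ToyFPReg { std::uint32_t Bits; };

// Step 1 (in the spirit of SPISD::FTOI): convert the float to an integer,
// leaving the integer bit pattern in a floating-point register.
inline ToyFPReg ftoi(float X) {
  assert(X >= -2147483648.0f && X < 2147483648.0f && "value must fit in i32");
  const std::int32_t I = static_cast<std::int32_t>(X);  // truncates toward zero
  ToyFPReg R;
  std::memcpy(&R.Bits, &I, sizeof R.Bits);
  return R;
}

// Step 2 (in the spirit of ISD::BIT_CONVERT): same bits, viewed as an i32.
inline std::int32_t bitConvert(ToyFPReg R) {
  std::int32_t I;
  std::memcpy(&I, &R.Bits, sizeof I);
  return I;
}

inline std::int32_t lowerFpToSint(float X) { return bitConvert(ftoi(X)); }
```

<p>The point of the split is that the first step produces a value whose type is
still a floating-point register type, so a separate bit-level conversion is
needed to obtain the <tt>i32</tt> result.</p>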
1463 <div class="doc_subsubsection">
1464 <a name="legal">Legal</a>
1466 <div class="doc_text">
1467 <p>The <tt>Legal</tt> LegalizeAction enum value simply indicates that an
1468 operation <b>is</b> natively supported. <tt>Legal</tt> represents the default condition,
1469 so it is rarely used. In <tt>SparcISelLowering.cpp</tt>, the action for CTPOP (an
1470 operation to count the bits set in an integer) is natively supported only for
1471 SPARC v9. The following code enables the <tt>Expand</tt> conversion technique for non-v9
1472 SPARC implementations.</p>
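<p>To give a feel for what <tt>Expand</tt> means for CTPOP, the following standalone
C++ sketch computes a population count from shifts, masks, adds, and a
multiply. It is one well-known bit-parallel sequence, offered as an
illustration only; it is not claimed to be the sequence the legalizer emits.
The <tt>IsV9</tt> flag and the function names are hypothetical.</p>

```cpp
#include <cassert>
#include <cstdint>

// Expanded form: built only from simpler integer operations.
inline unsigned expandCtpop32(std::uint32_t V) {
  V = V - ((V >> 1) & 0x55555555u);                  // 2-bit partial sums
  V = (V & 0x33333333u) + ((V >> 2) & 0x33333333u);  // 4-bit partial sums
  V = (V + (V >> 4)) & 0x0F0F0F0Fu;                  // 8-bit partial sums
  return (V * 0x01010101u) >> 24;                    // add the four bytes
}

// Stand-in for a native population-count instruction.
inline unsigned nativeCtpop32(std::uint32_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1)  // clear the lowest set bit each iteration
    ++N;
  return N;
}

// Legal on V9 (use the native operation), Expand elsewhere.
inline unsigned ctpop32(std::uint32_t V, bool IsV9) {
  const unsigned R = IsV9 ? nativeCtpop32(V) : expandCtpop32(V);
  assert(R <= 32 && "a 32-bit value has at most 32 set bits");
  return R;
}
```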
<div class="doc_code">
<pre>setOperationAction(ISD::CTPOP, MVT::i32, Expand);
if (TM.getSubtarget<SparcSubtarget>().isV9())
  setOperationAction(ISD::CTPOP, MVT::i32, Legal);
</pre>
</div>
1488 <!-- ======================================================================= -->
1489 <div class="doc_subsection">
1490 <a name="callingConventions">Calling Conventions</a>
1492 <div class="doc_text">
<p>To support target-specific calling conventions, <tt>XXXCallingConv.td</tt>
uses interfaces (such as CCIfType and CCAssignToReg) that are defined in
<tt>lib/Target/TargetCallingConv.td</tt>. TableGen can take the target descriptor file
<tt>XXXCallingConv.td</tt> and generate the header file <tt>XXXGenCallingConv.inc</tt>, which
is typically included in <tt>XXXISelLowering.cpp</tt>. You can use the interfaces in
1498 <tt>TargetCallingConv.td</tt> to specify:</p>
1500 <li>the order of parameter allocation</li>
1502 <li>where parameters and return values are placed (that is, on the
1503 stack or in registers)</li>
1505 <li>which registers may be used</li>
1507 <li>whether the caller or callee unwinds the stack</li>
1510 <p>The following example demonstrates the use of the CCIfType and
1511 CCAssignToReg interfaces. If the CCIfType predicate is true (that is, if the
1512 current argument is of type f32 or f64), then the action is performed. In this
1513 case, the CCAssignToReg action assigns the argument value to the first
1514 available register: either R0 or R1. </p>
1516 <div class="doc_code">
1517 <pre>CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
1520 <div class="doc_text">
1521 <p><tt>SparcCallingConv.td</tt> contains definitions for a target-specific return-value
1522 calling convention (RetCC_Sparc32) and a basic 32-bit C calling convention
1523 (CC_Sparc32). The definition of RetCC_Sparc32 (shown below) indicates which
registers are used for specified scalar return types. A single-precision float
is returned in register F0, and a double-precision float is returned in register D0. A
32-bit integer is returned in register I0 or I1. </p>
<div class="doc_code">
<pre>def RetCC_Sparc32 : CallingConv<[
  CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
  CCIfType<[f32], CCAssignToReg<[F0]>>,
  CCIfType<[f64], CCAssignToReg<[D0]>>
]>;
</pre>
</div>
1537 <div class="doc_text">
1538 <p>The definition of CC_Sparc32 in <tt>SparcCallingConv.td</tt> introduces
1539 CCAssignToStack, which assigns the value to a stack slot with the specified size
1540 and alignment. In the example below, the first parameter, 4, indicates the size
1541 of the slot, and the second parameter, also 4, indicates the stack alignment
1542 along 4-byte units. (Special cases: if size is zero, then the ABI size is used;
1543 if alignment is zero, then the ABI alignment is used.) </p>
<div class="doc_code">
<pre>def CC_Sparc32 : CallingConv<[
  // All arguments get passed in integer registers if there is space.
  CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
  CCAssignToStack<4, 4>
]>;
</pre>
</div>
1554 <div class="doc_text">
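<p>The register-then-stack behavior of CC_Sparc32 above can be modeled in a few
lines of standalone C++. This is a simplified sketch: every argument is treated
as one 32-bit value, and <tt>ToyLoc</tt>, <tt>ToyCCState</tt>, and
<tt>analyzeCC_Sparc32</tt> are hypothetical names rather than the generated code.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Where one argument ended up.
struct ToyLoc {
  bool InReg;
  std::string Reg;       // register name when InReg is true
  unsigned StackOffset;  // byte offset when InReg is false
};

class ToyCCState {
  std::vector<std::string> Regs;
  std::size_t NextReg;
  unsigned StackSize;

public:
  explicit ToyCCState(const std::vector<std::string> &R)
      : Regs(R), NextReg(0), StackSize(0) {}

  // Like CCAssignToReg: take the first free register, or fail.
  bool assignToReg(ToyLoc &Loc) {
    if (NextReg == Regs.size())
      return false;
    Loc = {true, Regs[NextReg++], 0};
    return true;
  }

  // Like CCAssignToStack<Size, Align>: allocate an aligned stack slot.
  void assignToStack(unsigned Size, unsigned Align, ToyLoc &Loc) {
    assert(Size > 0 && Align > 0 && (Align & (Align - 1)) == 0);
    StackSize = (StackSize + Align - 1) & ~(Align - 1);
    Loc = {false, std::string(), StackSize};
    StackSize += Size;
  }
};

// Tries the actions in order for each argument, as the rule list does.
inline std::vector<ToyLoc> analyzeCC_Sparc32(std::size_t NumArgs) {
  const std::vector<std::string> ArgRegs = {"I0", "I1", "I2",
                                            "I3", "I4", "I5"};
  ToyCCState State(ArgRegs);
  std::vector<ToyLoc> Locs(NumArgs);
  for (ToyLoc &L : Locs)
    if (!State.assignToReg(L))
      State.assignToStack(4, 4, L);
  return Locs;
}
```

<p>With eight arguments, the first six land in I0 through I5 and the last two
receive stack offsets 0 and 4.</p>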
1555 <p>CCDelegateTo is another commonly used interface, which tries to find
1556 a specified sub-calling convention and, if a match is found, it is invoked. In
1557 the following example (in <tt>X86CallingConv.td</tt>), the definition of RetCC_X86_32_C
1558 ends with CCDelegateTo. After the current value is assigned to the register ST0
1559 or ST1, the RetCC_X86Common is invoked.</p>
1562 <div class="doc_code">
1563 <pre>def RetCC_X86_32_C : CallingConv<[
1564 CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
1565 CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
1566 CCDelegateTo<RetCC_X86Common>
<div class="doc_text">
<p>CCIfCC is an interface that attempts to match the given name to
the current calling convention. If the name identifies the current calling
convention, then a specified action is invoked. In the following example (in
<tt>X86CallingConv.td</tt>), if the Fast calling convention is in use, then
RetCC_X86_32_Fast is invoked. If the SSECall calling convention is in use,
then RetCC_X86_32_SSE is invoked.</p>
</div>

<div class="doc_code">
<pre>
def RetCC_X86_32 : CallingConv&lt;[
  CCIfCC&lt;"CallingConv::Fast", CCDelegateTo&lt;RetCC_X86_32_Fast&gt;&gt;,
  CCIfCC&lt;"CallingConv::X86_SSECall", CCDelegateTo&lt;RetCC_X86_32_SSE&gt;&gt;,
  CCDelegateTo&lt;RetCC_X86_32_C&gt;
]&gt;;
</pre>
</div>
<div class="doc_text">
<p>Other calling convention interfaces include:</p>

<ul>
<li>CCIf &lt;predicate, action&gt; - if the predicate matches, apply
the action</li>

<li>CCIfInReg &lt;action&gt; - if the argument is marked with the
‘inreg’ attribute, then apply the action</li>

<li>CCIfNest &lt;action&gt; - if the argument is marked with the
‘nest’ attribute, then apply the action</li>

<li>CCIfNotVarArg &lt;action&gt; - if the current function does not
take a variable number of arguments, apply the action</li>

<li>CCAssignToRegWithShadow &lt;registerList, shadowList&gt; -
similar to CCAssignToReg, but with a shadow list of registers</li>

<li>CCPassByVal &lt;size, align&gt; - assign value to a stack slot
with the minimum specified size and alignment</li>

<li>CCPromoteToType &lt;type&gt; - promote the current value to the specified
type</li>

<li>CallingConv &lt;[actions]&gt; - define each calling convention
that is supported</li>
</ul>
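<p>The rule lists in this section are tried in order for each argument, and
the first action that succeeds decides where the value goes. The following is a
minimal, self-contained C++ sketch of that behavior for the CC_Sparc32 table
shown earlier. It is not the code that TableGen generates and it does not use
LLVM headers: <tt>ValTy</tt>, <tt>Loc</tt>, <tt>CCState</tt>,
<tt>assignToReg</tt>, <tt>assignToStack</tt>, and
<tt>analyzeSparc32Args</tt> are hypothetical names introduced only for
illustration.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced set of value types (not LLVM's own type enum).
enum class ValTy { i32, f32, f64 };

// Where one argument ended up: a register name or a stack offset.
struct Loc {
  bool inReg;
  std::string reg;       // meaningful when inReg is true
  unsigned stackOffset;  // meaningful when inReg is false
};

// Allocation state shared by all rules while one call is analyzed.
struct CCState {
  std::vector<std::string> usedRegs;
  unsigned nextStackOffset = 0;

  bool isUsed(const std::string &r) const {
    for (const std::string &u : usedRegs)
      if (u == r) return true;
    return false;
  }
};

// Models CCAssignToReg<[...]>: take the first free register, or fail.
bool assignToReg(const std::vector<std::string> &regs, CCState &st, Loc &out) {
  for (const std::string &r : regs) {
    if (!st.isUsed(r)) {
      st.usedRegs.push_back(r);
      out = Loc{true, r, 0};
      return true;
    }
  }
  return false;
}

// Models CCAssignToStack<size, align>: round the offset up to the
// alignment, then reserve `size` bytes. This action always succeeds.
bool assignToStack(unsigned size, unsigned align, CCState &st, Loc &out) {
  unsigned off = (st.nextStackOffset + align - 1) / align * align;
  st.nextStackOffset = off + size;
  out = Loc{false, "", off};
  return true;
}

// Models the CC_Sparc32 table: try the register rule, then the stack rule.
std::vector<Loc> analyzeSparc32Args(const std::vector<ValTy> &args) {
  static const std::vector<std::string> intRegs =
      {"I0", "I1", "I2", "I3", "I4", "I5"};
  CCState st;
  std::vector<Loc> locs;
  for (ValTy ty : args) {
    Loc loc{};
    // CCIfType<[i32, f32, f64], ...>
    bool typeMatches =
        ty == ValTy::i32 || ty == ValTy::f32 || ty == ValTy::f64;
    if (!(typeMatches && assignToReg(intRegs, st, loc)))
      assignToStack(4, 4, st, loc);  // CCAssignToStack<4, 4>
    locs.push_back(loc);
  }
  return locs;
}
```

<p>With eight i32 arguments, this sketch places the first six in I0 through I5
and the last two in 4-byte stack slots at offsets 0 and 4.</p>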
</div>

<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="assemblyPrinter">Assembly Printer</a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>During the code emission stage, the code generator may utilize an LLVM pass
to produce assembly output. To do this, implement a printer that converts
LLVM IR to a GAS-format assembly language for your target machine, using the
following steps:</p>

<ul>
<li>Define all the assembly strings for your target, adding them to
the instructions defined in the <tt>XXXInstrInfo.td</tt> file.
(See <a href="#InstructionSet">Instruction Set</a>.)
TableGen will produce an output file (<tt>XXXGenAsmWriter.inc</tt>) with an
implementation of the <tt>printInstruction</tt> method for the XXXAsmPrinter class.</li>

<li>Write <tt>XXXTargetAsmInfo.h</tt>, which contains the bare-bones
declaration of the XXXTargetAsmInfo class (a subclass of TargetAsmInfo).</li>

<li>Write <tt>XXXTargetAsmInfo.cpp</tt>, which contains target-specific values
for TargetAsmInfo properties and sometimes new implementations for methods.</li>

<li>Write <tt>XXXAsmPrinter.cpp</tt>, which implements the AsmPrinter class
that performs the LLVM-to-assembly conversion.</li>
</ul>

<p>The code in <tt>XXXTargetAsmInfo.h</tt> is usually a trivial declaration
of the XXXTargetAsmInfo class for use in <tt>XXXTargetAsmInfo.cpp</tt>. Similarly,
<tt>XXXTargetAsmInfo.cpp</tt> usually has a few declarations of XXXTargetAsmInfo
replacement values that override the default values in <tt>TargetAsmInfo.cpp</tt>.
For example, in <tt>SparcTargetAsmInfo.cpp</tt>:</p>
</div>

<div class="doc_code">
<pre>
SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &amp;TM) {
  Data16bitsDirective = "\t.half\t";
  Data32bitsDirective = "\t.word\t";
  Data64bitsDirective = 0;  // .xword is only supported by V9.
  ZeroDirective = "\t.skip\t";
  CommentString = "!";
  ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
}
</pre>
</div>
<div class="doc_text">
<p>The X86 assembly printer implementation (X86TargetAsmInfo) is an
example where the target-specific TargetAsmInfo class overrides methods:
<tt>ExpandInlineAsm</tt> and <tt>PreferredEHDataFormat</tt>.</p>

<p>A target-specific implementation of AsmPrinter is written in
<tt>XXXAsmPrinter.cpp</tt>, which implements the AsmPrinter class that converts
LLVM to printable assembly. The implementation must include the following
headers, which declare the AsmPrinter and MachineFunctionPass classes.
MachineFunctionPass is a subclass of FunctionPass.</p>
</div>

<div class="doc_code">
<pre>
#include "llvm/CodeGen/AsmPrinter.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
</pre>
</div>
<div class="doc_text">
<p>As a FunctionPass, AsmPrinter first calls <tt>doInitialization</tt> to set
up the AsmPrinter. In SparcAsmPrinter, a Mangler object is instantiated to
process variable names.</p>

<p>In <tt>XXXAsmPrinter.cpp</tt>, the <tt>runOnMachineFunction</tt> method (declared
in MachineFunctionPass) must be implemented for XXXAsmPrinter. In
MachineFunctionPass, the <tt>runOnFunction</tt> method invokes <tt>runOnMachineFunction</tt>.
Target-specific implementations of <tt>runOnMachineFunction</tt> differ, but generally
do the following to process each machine function:</p>

<ul>
<li>call <tt>SetupMachineFunction</tt> to perform initialization</li>

<li>call <tt>EmitConstantPool</tt> to print out (to the output stream)
constants that have been spilled to memory</li>

<li>call <tt>EmitJumpTableInfo</tt> to print out jump tables used by the
current function</li>

<li>print out the label for the current function</li>

<li>print out the code for the function, including basic block labels
and the assembly for the instruction (using <tt>printInstruction</tt>)</li>
</ul>
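<p>The order of these steps can be sketched with a small stand-alone mock.
This is an illustration only and does not use the LLVM classes:
<tt>MockBlock</tt>, <tt>MockFunction</tt>, and <tt>MockAsmPrinter</tt> are
hypothetical types, the instructions are assumed to be pre-formatted strings,
and the <tt>.LCPI</tt> label prefix and <tt>.word</tt> directive are
assumptions chosen for the example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for MachineBasicBlock and MachineFunction.
struct MockBlock {
  std::string label;
  std::vector<std::string> instrs;  // already-formatted assembly strings
};

struct MockFunction {
  std::string name;
  std::vector<std::string> constants;  // constant-pool entries
  std::vector<MockBlock> blocks;
};

class MockAsmPrinter {
  std::ostream &O;

  // No per-function state is needed in this mock.
  void setupMachineFunction(const MockFunction &) {}

  // Print each constant-pool entry under its own label.
  void emitConstantPool(const MockFunction &MF) {
    for (std::size_t i = 0; i < MF.constants.size(); ++i)
      O << ".LCPI" << i << ":\n\t.word\t" << MF.constants[i] << "\n";
  }

  // This mock has no jump tables, so nothing is printed.
  void emitJumpTableInfo(const MockFunction &) {}

  void printInstruction(const std::string &Asm) { O << "\t" << Asm << "\n"; }

public:
  explicit MockAsmPrinter(std::ostream &OS) : O(OS) {}

  bool runOnMachineFunction(const MockFunction &MF) {
    setupMachineFunction(MF);           // 1. initialization
    emitConstantPool(MF);               // 2. constants spilled to memory
    emitJumpTableInfo(MF);              // 3. jump tables
    O << MF.name << ":\n";              // 4. function label
    for (const MockBlock &BB : MF.blocks) {  // 5. blocks and instructions
      O << BB.label << ":\n";
      for (const std::string &I : BB.instrs)
        printInstruction(I);
    }
    return false;  // reports that the function itself was not changed
  }
};
```

<p>Running the mock over a function with one constant and one basic block
prints the constant pool first, then the function label, then the block label
and its instructions.</p>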
<p>The XXXAsmPrinter implementation must also include the code
generated by TableGen that is output in the <tt>XXXGenAsmWriter.inc</tt> file. The code
in <tt>XXXGenAsmWriter.inc</tt> contains an implementation of the <tt>printInstruction</tt>
method that may call these methods:</p>

<ul>
<li><tt>printOperand</tt></li>

<li><tt>printMemOperand</tt></li>

<li><tt>printCCOperand</tt> (for conditional statements)</li>

<li><tt>printDataDirective</tt></li>

<li><tt>printDeclare</tt></li>

<li><tt>printImplicitDef</tt></li>

<li><tt>printInlineAsm</tt></li>

<li><tt>printLabel</tt></li>

<li><tt>printPICJumpTableEntry</tt></li>

<li><tt>printPICJumpTableSetLabel</tt></li>
</ul>

<p>The implementations of <tt>printDeclare</tt>, <tt>printImplicitDef</tt>,
<tt>printInlineAsm</tt>, and <tt>printLabel</tt> in <tt>AsmPrinter.cpp</tt> are generally adequate for
printing assembly and do not need to be overridden. (<tt>printBasicBlockLabel</tt> is
another method that is implemented in <tt>AsmPrinter.cpp</tt> that may be directly used
in an implementation of XXXAsmPrinter.)</p>

<p>The <tt>printOperand</tt> method is implemented with a long switch/case
statement for the type of operand: register, immediate, basic block, external
symbol, global address, constant pool index, or jump table index. For an
instruction with a memory address operand, the <tt>printMemOperand</tt> method should be
implemented to generate the proper output. Similarly, <tt>printCCOperand</tt> should be
used to print a conditional operand.</p>
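<p>The switch over operand kinds might look roughly like the following
stand-alone sketch. It assumes a reduced, hypothetical <tt>MockOperand</tt>
type rather than LLVM's MachineOperand, and the <tt>%</tt> register prefix and
the <tt>.LBB</tt>, <tt>.LCPI</tt>, and <tt>.LJTI</tt> label prefixes are
assumptions made for the example, not the output of any particular target.</p>

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, reduced operand model; LLVM's MachineOperand is richer.
struct MockOperand {
  enum Kind {
    Register, Immediate, BasicBlock, ExternalSymbol,
    GlobalAddress, ConstantPoolIndex, JumpTableIndex
  };
  Kind kind;
  std::string name;  // register, symbol, or global name
  long value;        // immediate, block number, or table index
};

// One case per operand kind, as in a target's printOperand.
void printOperand(const MockOperand &MO, std::ostream &O) {
  switch (MO.kind) {
  case MockOperand::Register:          O << "%" << MO.name;       break;
  case MockOperand::Immediate:         O << MO.value;             break;
  case MockOperand::BasicBlock:        O << ".LBB" << MO.value;   break;
  case MockOperand::ExternalSymbol:
  case MockOperand::GlobalAddress:     O << MO.name;              break;
  case MockOperand::ConstantPoolIndex: O << ".LCPI" << MO.value;  break;
  case MockOperand::JumpTableIndex:    O << ".LJTI" << MO.value;  break;
  }
}

// A memory operand printed as base+offset; a zero immediate offset is omitted.
void printMemOperand(const MockOperand &Base, const MockOperand &Offset,
                     std::ostream &O) {
  printOperand(Base, O);
  if (Offset.kind == MockOperand::Immediate && Offset.value == 0)
    return;
  O << "+";
  printOperand(Offset, O);
}
```

<p>Keeping <tt>printMemOperand</tt> as a thin wrapper over
<tt>printOperand</tt> means each operand kind is formatted in one place
only.</p>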
<p><tt>doFinalization</tt> should be overridden in XXXAsmPrinter, and
it should be called to shut down the assembly printer. During <tt>doFinalization</tt>,
global variables and constants are printed to output.</p>
</div>

<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="subtargetSupport">Subtarget Support</a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>Subtarget support is used to inform the code generation process
of instruction set variations for a given chip set. For example, the LLVM
SPARC implementation covers three major versions of the SPARC
microprocessor architecture: Version 8 (V8, which is a 32-bit architecture),
Version 9 (V9, a 64-bit architecture), and the UltraSPARC architecture. V8 has
16 double-precision floating-point registers that are also usable as either 32
single-precision or 8 quad-precision registers. V8 is also purely big-endian. V9
has 32 double-precision floating-point registers that are also usable as 16
quad-precision registers, but cannot be used as single-precision registers. The
UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
extensions.</p>

<p>If subtarget support is needed, you should implement a
target-specific XXXSubtarget class for your architecture. This class should
process the command-line options <tt>-mcpu=</tt> and <tt>-mattr=</tt>.</p>
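<p>A feature string passed through <tt>-mattr=</tt> is a comma-separated list
in which each entry enables (<tt>+</tt>) or disables (<tt>-</tt>) one feature.
The real parsing code is generated by TableGen, as described below. The
following hand-written sketch only illustrates the idea:
<tt>MockSparcSubtarget</tt> and <tt>parseFeatures</tt> are hypothetical names,
and the feature and attribute names are the SPARC ones from
<tt>Sparc.td</tt> shown later in this section.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical subtarget flags; attribute names follow Sparc.td.
struct MockSparcSubtarget {
  bool IsV9 = false;
  bool V8DeprecatedInsts = false;
  bool IsVIS = false;

  // Apply the "+name" / "-name" entries of a comma-separated feature string.
  // Returns false if an entry is malformed or names an unknown feature.
  bool parseFeatures(const std::string &FS) {
    std::istringstream in(FS);
    std::string item;
    while (std::getline(in, item, ',')) {
      if (item.size() < 2 || (item[0] != '+' && item[0] != '-'))
        return false;
      bool enable = item[0] == '+';
      std::string name = item.substr(1);
      if (name == "v9")
        IsV9 = enable;
      else if (name == "deprecated-v8")
        V8DeprecatedInsts = enable;
      else if (name == "vis")
        IsVIS = enable;
      else
        return false;
    }
    return true;
  }
};
```

<p>For example, parsing <tt>+v9,+vis</tt> with this sketch sets
<tt>IsV9</tt> and <tt>IsVIS</tt> and leaves <tt>V8DeprecatedInsts</tt>
unchanged.</p>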
<p>TableGen uses definitions in the <tt>Target.td</tt> and <tt>Sparc.td</tt> files to
generate code in <tt>SparcGenSubtarget.inc</tt>. In <tt>Target.td</tt>, shown below, the
SubtargetFeature interface is defined. The first four string parameters of the
SubtargetFeature interface are a feature name, an attribute set by the feature,
the value of the attribute, and a description of the feature. (The fifth
parameter is a list of features whose presence is implied, and its default
value is an empty array.)</p>
</div>

<div class="doc_code">
<pre>
class SubtargetFeature&lt;string n, string a, string v, string d,
                       list&lt;SubtargetFeature&gt; i = []&gt; {
  string Name = n;
  string Attribute = a;
  string Value = v;
  string Desc = d;
  list&lt;SubtargetFeature&gt; Implies = i;
}
</pre>
</div>
<div class="doc_text">
<p>In the <tt>Sparc.td</tt> file, the SubtargetFeature is used to define the
following features.</p>
</div>

<div class="doc_code">
<pre>
def FeatureV9 : SubtargetFeature&lt;"v9", "IsV9", "true",
                     "Enable SPARC-V9 instructions"&gt;;
def FeatureV8Deprecated : SubtargetFeature&lt;"deprecated-v8",
                     "V8DeprecatedInsts", "true",
                     "Enable deprecated V8 instructions in V9 mode"&gt;;
def FeatureVIS : SubtargetFeature&lt;"vis", "IsVIS", "true",
                     "Enable UltraSPARC Visual Instruction Set extensions"&gt;;
</pre>
</div>
<div class="doc_text">
<p>Elsewhere in <tt>Sparc.td</tt>, the Proc class is defined and then is used
to define particular SPARC processor subtypes that may have the previously
described features.</p>
</div>

<div class="doc_code">
<pre>
class Proc&lt;string Name, list&lt;SubtargetFeature&gt; Features&gt;
  : Processor&lt;Name, NoItineraries, Features&gt;;

def : Proc&lt;"generic",         []&gt;;
def : Proc&lt;"v8",              []&gt;;
def : Proc&lt;"supersparc",      []&gt;;
def : Proc&lt;"sparclite",       []&gt;;
def : Proc&lt;"f934",            []&gt;;
def : Proc&lt;"hypersparc",      []&gt;;
def : Proc&lt;"sparclite86x",    []&gt;;
def : Proc&lt;"sparclet",        []&gt;;
def : Proc&lt;"tsc701",          []&gt;;
def : Proc&lt;"v9",              [FeatureV9]&gt;;
def : Proc&lt;"ultrasparc",      [FeatureV9, FeatureV8Deprecated]&gt;;
def : Proc&lt;"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]&gt;;
def : Proc&lt;"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]&gt;;
</pre>
</div>
<div class="doc_text">
<p>From the <tt>Target.td</tt> and <tt>Sparc.td</tt> files, the resulting
<tt>SparcGenSubtarget.inc</tt> specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
ParseSubtargetFeatures method, which parses the features string that sets
specified subtarget options. The generated <tt>SparcGenSubtarget.inc</tt> file
should be included in <tt>SparcSubtarget.cpp</tt>. The target-specific
implementation of the XXXSubtarget constructor should follow this
pseudocode:</p>
</div>

<div class="doc_code">
<pre>
XXXSubtarget::XXXSubtarget(const Module &amp;M, const std::string &amp;FS) {
  // Set the default features
  // Determine default and user specified characteristics of the CPU
  // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
  // Perform any additional operations
}
</pre>
</div>
<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="jitSupport">JIT Support</a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>The implementation of a target machine optionally includes a Just-In-Time
(JIT) code generator that emits machine code and auxiliary structures as binary
output that can be written directly to memory.
To do this, implement JIT code generation by performing the following
steps:</p>

<ul>
<li>Write an <tt>XXXCodeEmitter.cpp</tt> file that contains a machine function
pass that transforms target-machine instructions into relocatable machine code.</li>

<li>Write an <tt>XXXJITInfo.cpp</tt> file that implements the JIT interfaces
for target-specific code-generation
activities, such as emitting machine code and stubs.</li>

<li>Modify XXXTargetMachine so that it provides a TargetJITInfo
object through its <tt>getJITInfo</tt> method.</li>
</ul>

<p>There are several different approaches to writing the JIT support
code. For instance, TableGen and target descriptor files may be used for
creating a JIT code generator, but are not mandatory. For the Alpha and PowerPC
target machines, TableGen is used to generate <tt>XXXGenCodeEmitter.inc</tt>, which
contains the binary coding of machine instructions and the
<tt>getBinaryCodeForInstr</tt> method to access those codes. Other JIT implementations
do not.</p>

<p>Both <tt>XXXJITInfo.cpp</tt> and <tt>XXXCodeEmitter.cpp</tt> must include the
<tt>llvm/CodeGen/MachineCodeEmitter.h</tt> header file that defines the MachineCodeEmitter
class containing code for several callback functions that write data (in bytes,
words, strings, etc.) to the output stream.</p>
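<p>To make the idea of byte, word, and string callbacks concrete, here is a
small stand-alone sketch of a buffer-backed emitter. It is not the
MachineCodeEmitter interface: <tt>MockCodeEmitter</tt> and its method names
and signatures are hypothetical and were chosen only to show how such
callbacks append data to an output buffer.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a code emitter that writes into a byte buffer.
class MockCodeEmitter {
  std::vector<std::uint8_t> Buf;

public:
  // Append a single byte.
  void emitByte(std::uint8_t B) { Buf.push_back(B); }

  // Append a 32-bit word, least significant byte first.
  void emitWordLE(std::uint32_t W) {
    for (int i = 0; i < 4; ++i)
      emitByte(static_cast<std::uint8_t>(W >> (8 * i)));
  }

  // Append a 32-bit word, most significant byte first.
  void emitWordBE(std::uint32_t W) {
    for (int i = 3; i >= 0; --i)
      emitByte(static_cast<std::uint8_t>(W >> (8 * i)));
  }

  // Append the characters of S followed by a terminating zero byte.
  void emitString(const std::string &S) {
    for (char C : S)
      emitByte(static_cast<std::uint8_t>(C));
    emitByte(0);
  }

  const std::vector<std::uint8_t> &bytes() const { return Buf; }
};
```

<p>A little-endian target such as X86 would use the least-significant-first
form for immediates, while a big-endian target such as SPARC V8 would use the
most-significant-first form.</p>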
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="mce">Machine Code Emitter</a>
</div>
<div class="doc_text">
<p>In <tt>XXXCodeEmitter.cpp</tt>, a target-specific Emitter class is
implemented as a function pass (subclass of MachineFunctionPass). The
target-specific implementation of <tt>runOnMachineFunction</tt> (invoked by
<tt>runOnFunction</tt> in MachineFunctionPass) iterates through each
MachineBasicBlock and calls <tt>emitInstruction</tt> to process each
instruction and emit binary code. <tt>emitInstruction</tt>
is largely implemented with case statements on the instruction types defined in
<tt>XXXInstrInfo.h</tt>. For example, in <tt>X86CodeEmitter.cpp</tt>, the <tt>emitInstruction</tt> method
is built around the following switch/case statements:</p>
</div>

<div class="doc_code">
<pre>
switch (Desc-&gt;TSFlags &amp; X86::FormMask) {
case X86II::Pseudo:     // for not yet implemented instructions
  ...                   // or pseudo-instructions
  break;
case X86II::RawFrm:     // for instructions with a fixed opcode value
  ...
  break;
case X86II::AddRegFrm:  // for instructions that have one register operand
  ...                   // added to their opcode
  break;
case X86II::MRMDestReg: // for instructions that use the Mod/RM byte
  ...                   // to specify a destination (register)
  break;
case X86II::MRMDestMem: // for instructions that use the Mod/RM byte
  ...                   // to specify a destination (memory)
  break;
case X86II::MRMSrcReg:  // for instructions that use the Mod/RM byte
  ...                   // to specify a source (register)
  break;
case X86II::MRMSrcMem:  // for instructions that use the Mod/RM byte
  ...                   // to specify a source (memory)
  break;
case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
  ...
  break;
case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
  ...
  break;
case X86II::MRMInitReg: // for instructions whose source and
  ...                   // destination are the same register
  break;
}
</pre>
</div>
<div class="doc_text">
<p>The implementations of these case statements often first emit the
opcode and then get the operand(s). Then, depending upon the operand, helper
methods may be called to process the operand(s). For example, in <tt>X86CodeEmitter.cpp</tt>,
for the <tt>X86II::AddRegFrm</tt> case, the first data emitted (by <tt>emitByte</tt>) is the
opcode added to the register operand. Then an object representing the machine
operand, MO1, is extracted. Helper methods such as <tt>isImmediate</tt>,
<tt>isGlobalAddress</tt>, <tt>isExternalSymbol</tt>, <tt>isConstantPoolIndex</tt>, and
<tt>isJumpTableIndex</tt>
determine the operand type. (<tt>X86CodeEmitter.cpp</tt> also has private methods such
as <tt>emitConstant</tt>, <tt>emitGlobalAddress</tt>,
<tt>emitExternalSymbolAddress</tt>, <tt>emitConstPoolAddress</tt>,
and <tt>emitJumpTableAddress</tt> that emit the data into the output stream.)</p>
</div>

<div class="doc_code">
<pre>
case X86II::AddRegFrm:
  MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

  if (CurOp != NumOps) {
    const MachineOperand &amp;MO1 = MI.getOperand(CurOp++);
    unsigned Size = X86InstrInfo::sizeOfImm(Desc);
    if (MO1.isImmediate())
      emitConstant(MO1.getImm(), Size);
    else {
      unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
        : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
      if (Opcode == X86::MOV64ri)
        rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
      if (MO1.isGlobalAddress()) {
        bool NeedStub = isa&lt;Function&gt;(MO1.getGlobal());
        bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
        emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                          NeedStub, isLazy);
      } else if (MO1.isExternalSymbol())
        emitExternalSymbolAddress(MO1.getSymbolName(), rt);
      else if (MO1.isConstantPoolIndex())
        emitConstPoolAddress(MO1.getIndex(), rt);
      else if (MO1.isJumpTableIndex())
        emitJumpTableAddress(MO1.getIndex(), rt);
    }
  }
  break;
</pre>
</div>
<div class="doc_text">
<p>In the previous example, <tt>XXXCodeEmitter.cpp</tt> uses the variable <tt>rt</tt>,
which is a RelocationType enum that may be used to relocate addresses (for
example, a global address with a PIC base offset). The RelocationType enum for
that target is defined in the short target-specific <tt>XXXRelocations.h</tt> file. The
RelocationType is used by the <tt>relocate</tt> method defined in <tt>XXXJITInfo.cpp</tt> to
rewrite addresses for referenced global symbols.</p>

<p>For example, <tt>X86Relocations.h</tt> specifies the following relocation
types for the X86 addresses. In all four cases, the relocated value is added to
the value already in memory. For <tt>reloc_pcrel_word</tt> and <tt>reloc_picrel_word</tt>,
there is an additional initial adjustment.</p>
</div>

<div class="doc_code">
<pre>
enum RelocationType {
  reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
  reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
  reloc_absolute_word = 2, // absolute relocation; no additional adjustment
  reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
};
</pre>
</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="targetJITInfo">Target JIT Info</a>
</div>
<div class="doc_text">
<p><tt>XXXJITInfo.cpp</tt> implements the JIT interfaces for target-specific code-generation
activities, such as emitting machine code and stubs. At minimum,
a target-specific version of XXXJITInfo implements the following:</p>

<ul>
<li><tt>getLazyResolverFunction</tt> - initializes the JIT, gives the
target a function that is used for compilation</li>

<li><tt>emitFunctionStub</tt> - returns a native function with a
specified address for a callback function</li>

<li><tt>relocate</tt> - changes the addresses of referenced globals,
based on relocation types</li>

<li>callback functions that are wrappers to a function stub that is
used when the real target is not initially known</li>
</ul>
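<p>The <tt>relocate</tt> step can be illustrated with the four X86 relocation
types listed earlier. The following stand-alone sketch is not the code in
<tt>X86JITInfo.cpp</tt>: <tt>MockReloc</tt>, <tt>addInPlace</tt>,
<tt>readAt</tt>, and the signature of <tt>relocate</tt> are hypothetical, and
the exact adjustments (subtracting the address just past a 4-byte field for
the PC-relative case, and subtracting a PIC base address for the PIC-relative
case) are simplifying assumptions.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum RelocationType {
  reloc_pcrel_word = 0,
  reloc_picrel_word = 1,
  reloc_absolute_word = 2,
  reloc_absolute_dword = 3
};

// Hypothetical description of one relocation in an emitted code buffer.
struct MockReloc {
  RelocationType type;
  std::size_t offset;    // byte offset of the field inside the buffer
  std::uint64_t target;  // resolved address of the referenced global
};

// Read a T stored at byte offset off.
template <typename T>
T readAt(const std::vector<std::uint8_t> &code, std::size_t off) {
  T v;
  std::memcpy(&v, &code[off], sizeof(T));
  return v;
}

// Add delta to the T already stored at byte offset off.
template <typename T>
void addInPlace(std::vector<std::uint8_t> &code, std::size_t off, T delta) {
  T v = static_cast<T>(readAt<T>(code, off) + delta);
  std::memcpy(&code[off], &v, sizeof(T));
}

// codeBase is the address at which the buffer will run (an assumption of
// this sketch); picBase is the address used as the PIC base.
void relocate(std::vector<std::uint8_t> &code, std::uint64_t codeBase,
              std::uint64_t picBase, const MockReloc &r) {
  switch (r.type) {
  case reloc_pcrel_word: {
    std::uint64_t nextPC = codeBase + r.offset + 4;  // just past the field
    addInPlace<std::uint32_t>(code, r.offset,
                              static_cast<std::uint32_t>(r.target - nextPC));
    break;
  }
  case reloc_picrel_word:
    addInPlace<std::uint32_t>(code, r.offset,
                              static_cast<std::uint32_t>(r.target - picBase));
    break;
  case reloc_absolute_word:
    addInPlace<std::uint32_t>(code, r.offset,
                              static_cast<std::uint32_t>(r.target));
    break;
  case reloc_absolute_dword:
    addInPlace<std::uint64_t>(code, r.offset, r.target);
    break;
  }
}
```

<p>In every case the computed value is added to whatever the emitter already
stored in the field, which matches the description of the relocation types
above.</p>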
<p><tt>getLazyResolverFunction</tt> is generally trivial to implement. It
stores the incoming parameter in the global JITCompilerFunction and returns the
callback function that will be used as a function wrapper. For the Alpha target
(in <tt>AlphaJITInfo.cpp</tt>), the <tt>getLazyResolverFunction</tt> implementation is simply:</p>
</div>

<div class="doc_code">
<pre>
TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                            JITCompilerFn F) {
  JITCompilerFunction = F;
  return AlphaCompilationCallback;
}
</pre>
</div>
<div class="doc_text">
<p>For the X86 target, the <tt>getLazyResolverFunction</tt> implementation is
a little more complicated, because it returns a different callback function
for processors with SSE instructions and XMM registers.</p>

<p>The callback function initially saves and later restores the
callee register values, incoming arguments, and frame and return address. The
callback function needs low-level access to the registers or stack, so it is typically
implemented in assembly.</p>
</div>

<!-- *********************************************************************** -->
<hr>
<address>
  <a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a><br>
  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a>
  <br>
  Last modified: $Date$
</address>