1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
5 <meta http-equiv="Content-Type" Content="text/html; charset=UTF-8" >
6 <title>Accurate Garbage Collection with LLVM</title>
7 <link rel="stylesheet" href="llvm.css" type="text/css">
8 <style type="text/css">
9 .rowhead { text-align: left; background: inherit; }
10 .indent { padding-left: 1em; }
11 .optl { color: #BFBFBF; }
16 <div class="doc_title">
17 Accurate Garbage Collection with LLVM
21 <li><a href="#introduction">Introduction</a>
23 <li><a href="#feature">GC features provided and algorithms
28 <li><a href="#usage">Using the collectors</a>
30 <li><a href="#shadow-stack">ShadowStack -
31 A highly portable collector</a></li>
32 <li><a href="#semispace">SemiSpace -
33 A simple copying collector runtime</a></li>
34 <li><a href="#ocaml">Ocaml -
35 An Objective Caml-compatible collector</a></li>
39 <li><a href="#intrinsics">Collection intrinsics</a>
41 <li><a href="#gcroot">Identifying GC roots on the stack:
42 <tt>llvm.gcroot</tt></a></li>
43 <li><a href="#barriers">Reading and writing references in the heap</a>
45 <li><a href="#gcwrite">Write barrier: <tt>llvm.gcwrite</tt></a></li>
46 <li><a href="#gcread">Read barrier: <tt>llvm.gcread</tt></a></li>
52 <li><a href="#runtime">Recommended runtime interface</a>
54 <li><a href="#initialize">Garbage collector startup and
55 initialization</a></li>
56 <li><a href="#allocate">Allocating memory from the GC</a></li>
57 <li><a href="#explicit">Explicit invocation of the garbage
59 <li><a href="#traceroots">Tracing GC pointers from the program
61 <li><a href="#staticroots">Tracing GC pointers from static roots</a></li>
65 <li><a href="#plugin">Implementing a collector plugin</a>
67 <li><a href="#collector-algos">Overview of available features</a></li>
68 <li><a href="#stack-map">Computing stack maps</a></li>
69 <li><a href="#init-roots">Initializing roots to null:
70 <tt>InitRoots</tt></a></li>
71 <li><a href="#custom">Custom lowering of intrinsics: <tt>CustomRoots</tt>,
72 <tt>CustomReadBarriers</tt>, and <tt>CustomWriteBarriers</tt></a></li>
73 <li><a href="#safe-points">Generating safe points:
74 <tt>NeededSafePoints</tt></a></li>
75 <li><a href="#assembly">Emitting assembly code:
76 <tt>beginAssembly</tt> and <tt>finishAssembly</tt></a></li>
80 <li><a href="#runtime-impl">Implementing a collector runtime</a>
82 <li><a href="#gcdescriptors">Tracing GC pointers from heap
87 <li><a href="#references">References</a></li>
91 <div class="doc_author">
92 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> and
96 <!-- *********************************************************************** -->
97 <div class="doc_section">
98 <a name="introduction">Introduction</a>
100 <!-- *********************************************************************** -->
102 <div class="doc_text">
104 <p>Garbage collection is a widely used technique that frees the programmer from
105 having to know the lifetimes of heap objects, making software easier to produce
106 and maintain. Many programming languages rely on garbage collection for
107 automatic memory management. There are two primary forms of garbage collection:
108 conservative and accurate.</p>
110 <p>Conservative garbage collection often does not require any special support
111 from either the language or the compiler: it can handle non-type-safe
112 programming languages (such as C/C++) and does not require any special
113 information from the compiler. The
114 <a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">Boehm collector</a> is
115 an example of a state-of-the-art conservative collector.</p>
117 <p>Accurate garbage collection requires the ability to identify all pointers in
118 the program at run-time (which requires that the source-language be type-safe in
119 most cases). Identifying pointers at run-time requires compiler support to
120 locate all places that hold live pointer variables at run-time, including the
121 <a href="#gcroot">processor stack and registers</a>.</p>
123 <p>Conservative garbage collection is attractive because it does not require any
124 special compiler support, but it does have problems. In particular, because the
125 conservative garbage collector cannot <i>know</i> that a particular word in the
126 machine is a pointer, it cannot move live objects in the heap (preventing the
127 use of compacting and generational GC algorithms) and it can occasionally suffer
128 from memory leaks due to integer values that happen to point to objects in the
129 program. In addition, some aggressive compiler transformations can break
130 conservative garbage collectors (though these seem rare in practice).</p>
132 <p>Accurate garbage collectors do not suffer from any of these problems, but
133 they can suffer from degraded scalar optimization of the program. In particular,
134 because the runtime must be able to identify and update all pointers active in
135 the program, some optimizations are less effective. In practice, however, the
136 locality and performance benefits of using aggressive garbage allocation
137 techniques dominates any low-level losses.</p>
139 <p>This document describes the mechanisms and interfaces provided by LLVM to
140 support accurate garbage collection.</p>
144 <!-- ======================================================================= -->
145 <div class="doc_subsection">
146 <a name="feature">GC features provided and algorithms supported</a>
149 <div class="doc_text">
151 <p>LLVM's intermediate representation provides <a href="#intrinsics">garbage
152 collection intrinsics</a> which offer support for a broad class of
153 collector models. For instance, the intrinsics permit:</p>
156 <li>semi-space collectors</li>
157 <li>mark-sweep collectors</li>
158 <li>generational collectors</li>
159 <li>reference counting</li>
160 <li>incremental collectors</li>
161 <li>concurrent collectors</li>
162 <li>cooperative collectors</li>
165 <p>We hope that the primitive support built into the LLVM IR is sufficient to
166 support a broad class of garbage collected languages including Scheme, ML, Java,
167 C#, Perl, Python, Lua, Ruby, other scripting languages, and more.</p>
169 <p>However, LLVM does not itself implement a garbage collector. This is because
170 collectors are tightly coupled to object models, and LLVM is agnostic to object
171 models. Since LLVM is agnostic to object models, it would be inappropriate for
172 LLVM to dictate any particular collector. Instead, LLVM provides a framework for
173 garbage collector implementations in two manners:</p>
176 <li><b>At compile time</b> with <a href="#plugin">collector plugins</a> for
177 the compiler. Collector plugins have ready access to important garbage
178 collector algorithms. Leveraging these tools, it is straightforward to
179 emit type-accurate stack maps for your runtime in as little as ~100 lines of
182 <li><b>At runtime</b> with <a href="#runtime">suggested runtime
183 interfaces</a>, which allow front-end compilers to support a range of
184 collection runtimes.</li>
189 <!-- *********************************************************************** -->
190 <div class="doc_section">
191 <a name="usage">Using the collectors</a>
193 <!-- *********************************************************************** -->
195 <div class="doc_text">
197 <p>In general, using a collector implies:</p>
200 <li>Emitting compatible code, including initialization in the main
202 <li>Loading a compiler plugin if the collector is not statically linked with
203 your compiler. For <tt>llc</tt>, use the <tt>-load</tt> option.</li>
204 <li>Selecting the collection algorithm with <tt>llc -gc=</tt> or by setting
205 <tt>llvm::TheCollector</tt>.</li>
206 <li>Linking your final executable with the garbage collector runtime.</li>
209 <p>This table summarizes the available runtimes.</p>
214 <th><tt>llc</tt> arguments</th>
216 <th><tt>gcroot</tt></th>
217 <th><tt>gcread</tt></th>
218 <th><tt>gcwrite</tt></th>
220 <tr valign="baseline">
221 <td><a href="#semispace">SemiSpace</a></td>
222 <td><tt>-gc=shadow-stack</tt></td>
228 <tr valign="baseline">
229 <td><a href="#ocaml">Ocaml</a></td>
230 <td><tt>-gc=ocaml</tt></td>
231 <td><i>provided by ocamlopt</i></td>
238 <p>The sections for <a href="#intrinsics">Collection intrinsics</a> and
239 <a href="#runtime">Recommended runtime interface</a> detail the interfaces that
240 collectors may require user programs to utilize.</p>
244 <!-- ======================================================================= -->
245 <div class="doc_subsection">
246 <a name="shadow-stack">ShadowStack - A highly portable collector</a>
249 <div class="doc_code"><tt>
250 Collector *llvm::createShadowStackCollector();
253 <div class="doc_text">
255 <p>The ShadowStack collector is invoked with <tt>llc -gc=shadow-stack</tt>.
256 Unlike many collectors which rely on a cooperative code generator to generate
257 stack maps, this algorithm carefully maintains a linked list of stack root
258 descriptors [<a href="#henderson02">Henderson2002</a>]. This so-called "shadow
259 stack," mirrors the machine stack. Maintaining this data structure is slower
260 than using stack maps, but has a significant portability advantage because it
261 requires no special support from the target code generator.</p>
263 <p>The ShadowStack collector does not use read or write barriers, so the user
264 program may use <tt>load</tt> and <tt>store</tt> instead of <tt>llvm.gcread</tt>
265 and <tt>llvm.gcwrite</tt>.</p>
267 <p>The ShadowStack collector is a compiler plugin only. It must be paired with a
268 compatible runtime.</p>
272 <!-- ======================================================================= -->
273 <div class="doc_subsection">
274 <a name="semispace">SemiSpace - A simple copying collector runtime</a>
277 <div class="doc_text">
279 <p>The SemiSpace runtime implements with the <a href="runtime">suggested
280 runtime interface</a> and is compatible the ShadowStack collector's code
283 <p>SemiSpace is a very simple copying collector. When it starts up, it
284 allocates two blocks of memory for the heap. It uses a simple bump-pointer
285 allocator to allocate memory from the first block until it runs out of space.
286 When it runs out of space, it traces through all of the roots of the program,
287 copying blocks to the other half of the memory space.</p>
289 <p>This runtime is highly experimental and has not been used in a real project.
290 Enhancements would be welcomed.</p>
294 <!-- ======================================================================= -->
295 <div class="doc_subsection">
296 <a name="ocaml">Ocaml - An Objective Caml-compatible collector</a>
299 <div class="doc_code"><tt>
300 Collector *llvm::createOcamlCollector();
303 <div class="doc_text">
305 <p>The ocaml collector is invoked with <tt>llc -gc=ocaml</tt>. It supports the
306 <a href="http://caml.inria.fr/">Objective Caml</a> language runtime by emitting
307 a type-accurate stack map in the form of an ocaml 3.10.0-compatible frametable.
308 The linkage requirements are satisfied automatically by the <tt>ocamlopt</tt>
309 compiler when linking an executable.</p>
311 <p>The ocaml collector does not use read or write barriers, so the user program
312 may use <tt>load</tt> and <tt>store</tt> instead of <tt>llvm.gcread</tt> and
313 <tt>llvm.gcwrite</tt>.</p>
318 <!-- *********************************************************************** -->
319 <div class="doc_section">
320 <a name="intrinsics">Collection intrinsics</a>
322 <!-- *********************************************************************** -->
324 <div class="doc_text">
326 <p>This section describes the garbage collection facilities provided by the
327 <a href="LangRef.html">LLVM intermediate representation</a>.</p>
329 <p>These facilities are limited to those strictly necessary for compilation.
330 They are not intended to be a complete interface to any garbage collector.
331 Notably, heap allocation is not among the supplied primitives. A user program
332 will also need to interface with the runtime, using either the
333 <a href="#runtime">suggested runtime interface</a> or another interface
334 specified by the runtime.</p>
338 <!-- ======================================================================= -->
339 <div class="doc_subsection">
340 <a name="gcroot">Identifying GC roots on the stack: <tt>llvm.gcroot</tt></a>
343 <div class="doc_code"><tt>
344 void %llvm.gcroot(i8** %ptrloc, i8* %metadata)
347 <div class="doc_text">
349 <p>The <tt>llvm.gcroot</tt> intrinsic is used to inform LLVM of a pointer
350 variable on the stack. The first argument <b>must</b> be an alloca instruction
351 or a bitcast of an alloca. The second contains a pointer to metadata that
352 should be associated with the pointer, and <b>must</b> be a constant or global
353 value address. If your target collector uses tags, use a null pointer for
356 <p>Consider the following fragment of Java code:</p>
360 Object X; // A null-initialized reference to an object
365 <p>This block (which may be located in the middle of a function or in a loop
366 nest), could be compiled to this LLVM code:</p>
370 ;; In the entry block for the function, allocate the
371 ;; stack space for X, which is an LLVM pointer.
374 ;; Tell LLVM that the stack space is a stack root.
375 ;; Java has type-tags on objects, so we pass null as metadata.
376 %tmp = bitcast %Object** %X to i8**
377 call void %llvm.gcroot(%i8** %X, i8* null)
380 ;; "CodeBlock" is the block corresponding to the start
381 ;; of the scope above.
383 ;; Java null-initializes pointers.
384 store %Object* null, %Object** %X
388 ;; As the pointer goes out of scope, store a null value into
389 ;; it, to indicate that the value is no longer live.
390 store %Object* null, %Object** %X
396 <!-- ======================================================================= -->
397 <div class="doc_subsection">
398 <a name="barriers">Reading and writing references in the heap</a>
401 <div class="doc_text">
403 <p>Some collectors need to be informed when the mutator (the program that needs
404 garbage collection) either reads a pointer from or writes a pointer to a field
405 of a heap object. The code fragments inserted at these points are called
406 <em>read barriers</em> and <em>write barriers</em>, respectively. The amount of
407 code that needs to be executed is usually quite small and not on the critical
408 path of any computation, so the overall performance impact of the barrier is
411 <p>Barriers often require access to the <em>object pointer</em> rather than the
412 <em>derived pointer</em> (which is a pointer to the field within the
413 object). Accordingly, these intrinsics take both pointers as separate arguments
414 for completeness. In this snippet, <tt>%object</tt> is the object pointer, and
415 <tt>%derived</tt> is the derived pointer:</p>
419 %class.Array = type { %class.Object, i32, [0 x %class.Object*] }
422 ;; Load the object pointer from a gcroot.
423 %object = load %class.Array** %object_addr
425 ;; Compute the derived pointer.
426 %derived = getelementptr %obj, i32 0, i32 2, i32 %n</pre></blockquote>
430 <!-- ======================================================================= -->
431 <div class="doc_subsubsection">
432 <a name="gcwrite">Write barrier: <tt>llvm.gcwrite</tt></a>
435 <div class="doc_code"><tt>
436 void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived)
439 <div class="doc_text">
441 <p>For write barriers, LLVM provides the <tt>llvm.gcwrite</tt> intrinsic
442 function. It has exactly the same semantics as a non-volatile <tt>store</tt> to
443 the derived pointer (the third argument).</p>
445 <p>Many important algorithms require write barriers, including generational
446 and concurrent collectors. Additionally, write barriers could be used to
447 implement reference counting.</p>
449 <p>The use of this intrinsic is optional if the target collector does use
450 write barriers. If so, the collector will replace it with the corresponding
455 <!-- ======================================================================= -->
456 <div class="doc_subsubsection">
457 <a name="gcread">Read barrier: <tt>llvm.gcread</tt></a>
460 <div class="doc_code"><tt>
461 i8* @llvm.gcread(i8* %object, i8** %derived)<br>
464 <div class="doc_text">
466 <p>For read barriers, LLVM provides the <tt>llvm.gcread</tt> intrinsic function.
467 It has exactly the same semantics as a non-volatile <tt>load</tt> from the
468 derived pointer (the second argument).</p>
470 <p>Read barriers are needed by fewer algorithms than write barriers, and may
471 have a greater performance impact since pointer reads are more frequent than
474 <p>As with <tt>llvm.gcwrite</tt>, a target collector might not require the use
475 of this intrinsic.</p>
479 <!-- *********************************************************************** -->
480 <div class="doc_section">
481 <a name="runtime">Recommended runtime interface</a>
483 <!-- *********************************************************************** -->
485 <div class="doc_text">
487 <p>LLVM specifies the following recommended runtime interface to the garbage
488 collection at runtime. A program should use these interfaces to accomplish the
489 tasks not supported by the intrinsics.</p>
491 <p>Unlike the intrinsics, which are integral to LLVM's code generator, there is
492 nothing unique about these interfaces; a front-end compiler and runtime are free
493 to agree to a different specification.</p>
495 <p class="doc_warning">Note: This interface is a work in progress.</p>
499 <!-- ======================================================================= -->
500 <div class="doc_subsection">
501 <a name="initialize">Garbage collector startup and initialization</a>
504 <div class="doc_text">
506 <div class="doc_code"><tt>
507 void llvm_gc_initialize(unsigned InitialHeapSize);
511 The <tt>llvm_gc_initialize</tt> function should be called once before any other
512 garbage collection functions are called. This gives the garbage collector the
513 chance to initialize itself and allocate the heap. The initial heap size to
514 allocate should be specified as an argument.
519 <!-- ======================================================================= -->
520 <div class="doc_subsection">
521 <a name="allocate">Allocating memory from the GC</a>
524 <div class="doc_text">
526 <div class="doc_code"><tt>
527 void *llvm_gc_allocate(unsigned Size);
530 <p>The <tt>llvm_gc_allocate</tt> function is a global function defined by the
531 garbage collector implementation to allocate memory. It returns a
532 zeroed-out block of memory of the specified size, sufficiently aligned to store
537 <!-- ======================================================================= -->
538 <div class="doc_subsection">
539 <a name="explicit">Explicit invocation of the garbage collector</a>
542 <div class="doc_text">
544 <div class="doc_code"><tt>
545 void llvm_gc_collect();
549 The <tt>llvm_gc_collect</tt> function is exported by the garbage collector
550 implementations to provide a full collection, even when the heap is not
551 exhausted. This can be used by end-user code as a hint, and may be ignored by
552 the garbage collector.
557 <!-- ======================================================================= -->
558 <div class="doc_subsection">
559 <a name="traceroots">Tracing GC pointers from the program stack</a>
562 <div class="doc_text">
563 <div class="doc_code"><tt>
564 void llvm_cg_walk_gcroots(void (*FP)(void **Root, void *Meta));
568 The <tt>llvm_cg_walk_gcroots</tt> function is a function provided by the code
569 generator that iterates through all of the GC roots on the stack, calling the
570 specified function pointer with each record. For each GC root, the address of
571 the pointer and the meta-data (from the <a
572 href="#roots"><tt>llvm.gcroot</tt></a> intrinsic) are provided.
576 <!-- ======================================================================= -->
577 <div class="doc_subsection">
578 <a name="staticroots">Tracing GC pointers from static roots</a>
581 <div class="doc_text">
586 <!-- *********************************************************************** -->
587 <div class="doc_section">
588 <a name="plugin">Implementing a collector plugin</a>
590 <!-- *********************************************************************** -->
592 <div class="doc_text">
594 <p>To implement a collector plugin, it is necessary to subclass
595 <tt>llvm::Collector</tt>, which can be accomplished in a few lines of
596 boilerplate code. LLVM's infrastructure provides access to several important
597 algorithms. For an uncontroversial collector, all that remains may be to emit
598 the assembly code for the collector's unique stack map data structure, which
599 might be accomplished in as few as 100 LOC.</p>
601 <p>To subclass <tt>llvm::Collector</tt> and register a collector:</p>
603 <blockquote><pre>// lib/MyGC/MyGC.cpp - Example LLVM collector plugin
605 #include "llvm/CodeGen/Collector.h"
606 #include "llvm/CodeGen/Collectors.h"
607 #include "llvm/CodeGen/CollectorMetadata.h"
608 #include "llvm/Support/Compiler.h"
610 using namespace llvm;
613 class VISIBILITY_HIDDEN MyCollector : public Collector {
618 CollectorRegistry::Add<MyCollector>
619 X("mygc", "My custom garbage collector.");
622 <p>Using the LLVM makefiles (like the <a
623 href="http://llvm.org/viewvc/llvm-project/llvm/trunk/projects/sample/">sample
624 project</a>), this can be built into a plugin using a simple makefile:</p>
630 LIBRARYNAME = <var>MyGC</var>
633 include $(LEVEL)/Makefile.common</pre></blockquote>
638 <p>Once the plugin is compiled, user code may be compiled using <tt>llc
639 -load=<var>MyGC.so</var> -gc=mygc</tt> (though <var>MyGC.so</var> may have some
640 other platform-specific extension).</p>
642 <!-- BEGIN FIXME: Gross -->
643 <p>To use a collector in a tool other than <tt>llc</tt>, simply assign a
644 <tt>Collector</tt> to the <tt>llvm::TheCollector</tt> variable:</p>
647 >TheCollector = new MyGC();</pre></blockquote>
648 <!-- /FIXME GROSS -->
652 <!-- ======================================================================= -->
653 <div class="doc_subsection">
654 <a name="collector-algos">Overview of available features</a>
657 <div class="doc_text">
659 <p>The boilerplate collector above does nothing. More specifically:</p>
662 <li><tt>llvm.gcread</tt> calls are replaced with the corresponding
663 <tt>load</tt> instruction.</li>
664 <li><tt>llvm.gcwrite</tt> calls are replaced with the corresponding
665 <tt>store</tt> instruction.</li>
666 <li>No stack map is emitted, and no safe points are added.</li>
669 <p><tt>Collector</tt> provides a range of features through which a plugin
670 collector may do useful work. This matrix summarizes the supported (and planned)
671 features and correlates them with the collection techniques which typically
678 <th>shadow stack</th>
687 <th class="rowhead"><a href="#stack-map">stack map</a></th>
698 <th class="rowhead"><a href="#init-roots">initialize roots</a></th>
708 <tr class="doc_warning">
709 <th class="rowhead">derived pointers</th>
720 <th class="rowhead"><em><a href="#custom">custom lowering</a></em></th>
731 <th class="rowhead indent">gcroot</th>
742 <th class="rowhead indent">gcwrite</th>
753 <th class="rowhead indent">gcread</th>
764 <th class="rowhead"><em><a href="#safe-points">safe points</a></em></th>
775 <th class="rowhead indent">in calls</th>
786 <th class="rowhead indent">before calls</th>
796 <tr class="doc_warning">
797 <th class="rowhead indent">for loops</th>
808 <th class="rowhead indent">before escape</th>
818 <tr class="doc_warning">
819 <th class="rowhead">emit code at safe points</th>
830 <th class="rowhead"><em>output</em></th>
841 <th class="rowhead indent"><a href="#assembly">assembly</a></th>
851 <tr class="doc_warning">
852 <th class="rowhead indent">JIT</th>
856 <td class="optl">✘</td>
857 <td class="optl">✘</td>
858 <td class="optl">✘</td>
859 <td class="optl">✘</td>
860 <td class="optl">✘</td>
862 <tr class="doc_warning">
863 <th class="rowhead indent">obj</th>
867 <td class="optl">✘</td>
868 <td class="optl">✘</td>
869 <td class="optl">✘</td>
870 <td class="optl">✘</td>
871 <td class="optl">✘</td>
873 <tr class="doc_warning">
874 <th class="rowhead">live analysis</th>
878 <td class="optl">✘</td>
879 <td class="optl">✘</td>
880 <td class="optl">✘</td>
881 <td class="optl">✘</td>
882 <td class="optl">✘</td>
884 <tr class="doc_warning">
885 <th class="rowhead">register map</th>
889 <td class="optl">✘</td>
890 <td class="optl">✘</td>
891 <td class="optl">✘</td>
892 <td class="optl">✘</td>
893 <td class="optl">✘</td>
897 <div><span class="doc_warning">*</span> Derived pointers only pose a
898 hazard to copying collectors.</div>
899 <div><span class="optl">✘</span> in gray denotes a feature which
900 could be utilized if available.</div>
905 <p>To be clear, the collection techniques above are defined as:</p>
908 <dt>Shadow Stack</dt>
909 <dd>The mutator carefully maintains a linked list of stack root
911 <dt>Reference Counting</dt>
912 <dd>The mutator maintains a reference count for each object and frees an
913 object when its count falls to zero.</dd>
915 <dd>When the heap is exhausted, the collector marks reachable objects starting
916 from the roots, then deallocates unreachable objects in a sweep
919 <dd>As reachability analysis proceeds, the collector copies objects from one
920 heap area to another, compacting them in the process. Copying collectors
921 enable highly efficient "bump pointer" allocation and can improve locality
924 <dd>(Including generational collectors.) Incremental collectors generally have
925 all the properties of a copying collector (regardless of whether the
926 mature heap is compacting), but bring the added complexity of requiring
929 <dd>Denotes a multithreaded mutator; the collector must still stop the mutator
930 ("stop the world") before beginning reachability analysis. Stopping a
931 multithreaded mutator is a complicated problem. It generally requires
932 highly platform specific code in the runtime, and the production of
933 carefully designed machine code at safe points.</dd>
935 <dd>In this technique, the mutator and the collector run concurrently, with
936 the goal of eliminating pause times. In a <em>cooperative</em> collector,
937 the mutator further aids with collection should a pause occur, allowing
938 collection to take advantage of multiprocessor hosts. The "stop the world"
939 problem of threaded collectors is generally still present to a limited
940 extent. Sophisticated marking algorithms are necessary. Read barriers may
944 <p>As the matrix indicates, LLVM's garbage collection infrastructure is already
945 suitable for a wide variety of collectors, but does not currently extend to
946 multithreaded programs. This will be added in the future as there is
951 <!-- ======================================================================= -->
952 <div class="doc_subsection">
953 <a name="stack-map">Computing stack maps</a>
956 <div class="doc_text">
959 >CollectorMetadata &MD = ...;
960 unsigned FrameSize = MD.getFrameSize();
961 size_t RootCount = MD.roots_size();
963 for (CollectorMetadata::roots_iterator RI = MD.roots_begin(),
964 RE = MD.roots_end(); RI != RE; ++RI) {
965 int RootNum = RI->Num;
966 int RootStackOffset = RI->StackOffset;
967 Constant *RootMetadata = RI->Metadata;
970 <p>LLVM automatically computes a stack map. All a <tt>Collector</tt> needs to do
971 is access it using <tt>CollectorMetadata::roots_begin()</tt> and
972 -<tt>end()</tt>. If the <tt>llvm.gcroot</tt> intrinsic is eliminated before code
973 generation by a custom lowering pass, LLVM's stack map will be empty.</p>
978 <!-- ======================================================================= -->
979 <div class="doc_subsection">
980 <a name="init-roots">Initializing roots to null: <tt>InitRoots</tt></a>
983 <div class="doc_text">
986 >MyCollector::MyCollector() {
990 <p>When set, LLVM will automatically initialize each root to <tt>null</tt> upon
991 entry to the function. This prevents the reachability analysis from finding
992 uninitialized values in stack roots at runtime, which will almost certainly
993 cause it to segfault. This initialization occurs before custom lowering, so the
994 two may be used together.</p>
996 <p>Since LLVM does not yet compute liveness information, this feature should be
997 used by all collectors which do not custom lower <tt>llvm.gcroot</tt>, and even
1003 <!-- ======================================================================= -->
1004 <div class="doc_subsection">
1005 <a name="custom">Custom lowering of intrinsics: <tt>CustomRoots</tt>,
1006 <tt>CustomReadBarriers</tt>, and <tt>CustomWriteBarriers</tt></a>
1009 <div class="doc_text">
1011 <p>For collectors with barriers or unusual treatment of stack roots, these
1012 flags allow the collector to perform any required transformation on the LLVM
1016 >class MyCollector : public Collector {
1020 CustomReadBarriers = true;
1021 CustomWriteBarriers = true;
1025 virtual Pass *createCustomLoweringPass() const {
1026 return new MyGCLoweringFunctionPass();
1028 };</pre></blockquote>
1030 <p>If any of these flags are set, then LLVM suppresses its default lowering for
1031 the corresponding intrinsics and instead passes them on to a custom lowering
1032 pass specified by the collector.</p>
1034 <p>LLVM's default action for each intrinsic is as follows:</p>
1037 <li><tt>llvm.gcroot</tt>: Pass through to the code generator to generate a
1039 <li><tt>llvm.gcread</tt>: Substitute a <tt>load</tt> instruction.</li>
1040 <li><tt>llvm.gcwrite</tt>: Substitute a <tt>store</tt> instruction.</li>
1043 <p>If <tt>CustomReadBarriers</tt> or <tt>CustomWriteBarriers</tt> are specified,
1044 the custom lowering pass <strong>must</strong> eliminate the corresponding
1047 <p>This template can be used as a starting point for a lowering pass:</p>
1050 >#include "llvm/Function.h"
1051 #include "llvm/Module.h"
1052 #include "llvm/Instructions.h"
1055 class VISIBILITY_HIDDEN MyGCLoweringFunctionPass : public FunctionPass {
1058 MyGCLoweringFunctionPass() : FunctionPass(intptr_t(&ID)) {}
1060 const char *getPassName() const { return "Lower GC Intrinsics"; }
1062 bool runOnFunction(Function &F) {
1063 Module *M = F.getParent();
1065 Function *GCReadInt = M->getFunction("llvm.gcread"),
1066 *GCWriteInt = M->getFunction("llvm.gcwrite"),
1067 *GCRootInt = M->getFunction("llvm.gcroot");
1069 bool MadeChange = false;
1071 for (Function::iterator BB = F.begin(), E = F.end(); BB != E; ++BB)
1072 for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E;)
1073 if (CallInst *CI = dyn_cast<CallInst>(II++))
1074 if (Function *F = CI->getCalledFunction())
1075 if (F == GCWriteInt) {
1076 // Handle llvm.gcwrite.
1077 CI->eraseFromParent();
1079 } else if (F == GCReadInt) {
1080 // Handle llvm.gcread.
1081 CI->eraseFromParent();
1083 } else if (F == GCRootInt) {
1084 // Handle llvm.gcroot.
1085 CI->eraseFromParent();
1093 char MyGCLoweringFunctionPass::ID = 0;
1094 }</pre></blockquote>
1099 <!-- ======================================================================= -->
1100 <div class="doc_subsection">
1101 <a name="safe-points">Generating safe points: <tt>NeededSafePoints</tt></a>
1104 <div class="doc_text">
1106 <p>LLVM can compute four kinds of safe points:</p>
1110 /// PointKind - The type of a collector-safe point.
1113 Loop, //< Instr is a loop (backwards branch).
1114 Return, //< Instr is a return instruction.
1115 PreCall, //< Instr is a call instruction.
1116 PostCall //< Instr is the return address of a call.
1118 }</pre></blockquote>
1120 <p>A collector can request any combination of the four by setting the
1121 <tt>NeededSafePoints</tt> mask:</p>
1124 >MyCollector::MyCollector() {
1125 NeededSafePoints = 1 << GC::Loop
1126 | 1 << GC::Return
1127 | 1 << GC::PreCall
1128 | 1 << GC::PostCall;
1129 }</pre></blockquote>
1131 <p>It can then use the following routines to access safe points.</p>
1134 CollectorMetadata &MD = ...;
1135 size_t PointCount = MD.size();
1137 for (CollectorMetadata::iterator PI = MD.begin(),
1138 PE = MD.end(); PI != PE; ++PI) {
1139 GC::PointKind PointKind = PI->Kind;
1140 unsigned PointNum = PI->Num;
1141 }</pre></blockquote>
1143 <p>Almost every collector requires <tt>PostCall</tt> safe points, since these
1144 correspond to the moments when the function is suspended during a call to a
1147 <p>Threaded programs generally require <tt>Loop</tt> safe points to guarantee
1148 that the application will reach a safe point within a bounded amount of time,
1149 even if it is executing a long-running loop which contains no function
1152 <p>Threaded collectors may also require <tt>Return</tt> and <tt>PreCall</tt>
1153 safe points to implement "stop the world" techniques using self-modifying code,
1154 where it is important that the program not exit the function without reaching a
1155 safe point (because only the topmost function has been patched).</p>
1160 <!-- ======================================================================= -->
1161 <div class="doc_subsection">
1162 <a name="assembly">Emitting assembly code:
1163 <tt>beginAssembly</tt> and <tt>finishAssembly</tt></a>
1166 <div class="doc_text">
1168 <p>LLVM allows a collector to print arbitrary assembly code before and after
1169 the rest of a module's assembly code. From the latter callback, the collector
1170 can print stack maps from <tt>CollectorModuleMetadata</tt> populated by the code
1173 <p>Note that LLVM does not currently support garbage collection code generation
1174 in the JIT, nor using the object writers.</p>
1177 >class MyCollector : public Collector {
1178 virtual void beginAssembly(Module &M, std::ostream &OS, AsmPrinter &AP,
1179 const TargetAsmInfo &TAI) const;
1181 virtual void finishAssembly(Module &M, CollectorModuleMetadata &MMD,
1182 std::ostream &OS, AsmPrinter &AP,
1183 const TargetAsmInfo &TAI) const;
1184 }</pre></blockquote>
1186 <p>The collector should use <tt>AsmPrinter</tt> and <tt>TargetAsmInfo</tt> to
1187 print portable assembly code to the <tt>std::ostream</tt>. The collector may
1188 access the stack maps for the entire module using the methods of
1189 <tt>CollectorModuleMetadata</tt>. Here's a realistic example:</p>
1192 >#include "llvm/CodeGen/AsmPrinter.h"
1193 #include "llvm/Function.h"
1194 #include "llvm/Target/TargetAsmInfo.h"
1196 void MyCollector::finishAssembly(Module &M,
1197 CollectorModuleMetadata &MMD,
1198 std::ostream &OS, AsmPrinter &AP,
1199 const TargetAsmInfo &TAI) const {
1200 // Set up for emitting addresses.
1201 const char *AddressDirective;
1202 int AddressAlignLog;
1203 if (TAI.getAddressSize() == sizeof(int32_t)) {
1204 AddressDirective = TAI.getData32bitsDirective();
1205 AddressAlignLog = 2;
1207 AddressDirective = TAI.getData64bitsDirective();
1208 AddressAlignLog = 3;
1211 // Put this in the data section.
1212 AP.SwitchToDataSection(TAI.getDataSection());
1214 // For each function...
1215 for (CollectorModuleMetadata::iterator FI = MMD.begin(),
1216 FE = MMD.end(); FI != FE; ++FI) {
1217 CollectorMetadata &MD = **FI;
1219 // Emit this data structure:
1222 // int32_t PointCount;
1224 // void *SafePointAddress;
1225 // int32_t LiveCount;
1226 // int32_t LiveOffsets[LiveCount];
1227 // } Points[PointCount];
1228 // } __gcmap_<FUNCTIONNAME>;
1230 // Align to address width.
1231 AP.EmitAlignment(AddressAlignLog);
1233 // Emit the symbol by which the stack map can be found.
1235 Symbol += TAI.getGlobalPrefix();
1236 Symbol += "__gcmap_";
1237 Symbol += MD.getFunction().getName();
1238 if (const char *GlobalDirective = TAI.getGlobalDirective())
1239 OS << GlobalDirective << Symbol << "\n";
1240 OS << TAI.getGlobalPrefix() << Symbol << ":\n";
1243 AP.EmitInt32(MD.size());
1244 AP.EOL("safe point count");
1246 // And each safe point...
1247 for (CollectorMetadata::iterator PI = MD.begin(),
1248 PE = MD.end(); PI != PE; ++PI) {
1249 // Align to address width.
1250 AP.EmitAlignment(AddressAlignLog);
1252 // Emit the address of the safe point.
1253 OS << AddressDirective
1254 << TAI.getPrivateGlobalPrefix() << "label" << PI->Num;
1255 AP.EOL("safe point address");
1257 // Emit the stack frame size.
1258 AP.EmitInt32(MD.getFrameSize());
1259 AP.EOL("stack frame size");
1261 // Emit the number of live roots in the function.
1262 AP.EmitInt32(MD.live_size(PI));
1263 AP.EOL("live root count");
1265 // And for each live root...
1266 for (CollectorMetadata::live_iterator LI = MD.live_begin(PI),
1267 LE = MD.live_end(PI);
1269 // Print its offset within the stack frame.
1270 AP.EmitInt32(LI->StackOffset);
1271 AP.EOL("stack offset");
1281 <!-- *********************************************************************** -->
1282 <div class="doc_section">
1283 <a name="runtime-impl">Implementing a collector runtime</a>
1285 <!-- *********************************************************************** -->
1287 <div class="doc_text">
1289 <p>Implementing a garbage collector for LLVM is fairly straightforward. The
1290 LLVM garbage collectors are provided in a form that makes them easy to link into
1291 the language-specific runtime that a language front-end would use. They require
1292 functionality from the language-specific runtime to get information about <a
1293 href="#gcdescriptors">where pointers are located in heap objects</a>.</p>
1295 <p>The implementation must include the
1296 <a href="#allocate"><tt>llvm_gc_allocate</tt></a> and
1297 <a href="#explicit"><tt>llvm_gc_collect</tt></a> functions. To do this, it will
1298 probably have to <a href="#traceroots">trace through the roots
1299 from the stack</a> and understand the <a href="#gcdescriptors">GC descriptors
1300 for heap objects</a>. Luckily, there are some <a href="#gcimpls">example
1301 implementations</a> available.
1306 <!-- ======================================================================= -->
1307 <div class="doc_subsection">
1308 <a name="gcdescriptors">Tracing GC pointers from heap objects</a>
1311 <div class="doc_text">
1313 The three most common ways to keep track of where pointers live in heap objects
1314 are (listed in order of space overhead required):</p>
1317 <li>In languages with polymorphic objects, pointers from an object header are
1318 usually used to identify the GC pointers in the heap object. This is common for
1319 object-oriented languages like Self, Smalltalk, Java, or C#.</li>
1321 <li>If heap objects are not polymorphic, often the "shape" of the heap can be
1322 determined from the roots of the heap or from some other meta-data [<a
1323 href="#appel89">Appel89</a>, <a href="#goldberg91">Goldberg91</a>, <a
1324 href="#tolmach94">Tolmach94</a>]. In this case, the garbage collector can
1325 propagate the information around from meta data stored with the roots. This
1326 often eliminates the need to have a header on objects in the heap. This is
1327 common in the ML family.</li>
1329 <li>If all heap objects have pointers in the same locations, or pointers can be
1330 distinguished just by looking at them (e.g., the low order bit is clear), no
1331 book-keeping is needed at all. This is common for Lisp-like languages.</li>
1334 <p>The LLVM garbage collectors are capable of supporting all of these styles of
1335 language, including ones that mix various implementations. To do this, it
1336 allows the source-language to associate meta-data with the <a
1337 href="#roots">stack roots</a>, and the heap tracing routines can propagate the
1338 information. In addition, LLVM allows the front-end to extract GC information
1339 in any form from a specific object pointer (this supports situations #1 and #3).
1345 <!-- *********************************************************************** -->
1346 <div class="doc_section">
1347 <a name="references">References</a>
1349 <!-- *********************************************************************** -->
1351 <div class="doc_text">
1353 <p><a name="appel89">[Appel89]</a> Runtime Tags Aren't Necessary. Andrew
1354 W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.</p>
1356 <p><a name="goldberg91">[Goldberg91]</a> Tag-free garbage collection for
1357 strongly typed programming languages. Benjamin Goldberg. ACM SIGPLAN
1360 <p><a name="tolmach94">[Tolmach94]</a> Tag-free garbage collection using
1361 explicit type parameters. Andrew Tolmach. Proceedings of the 1994 ACM
1362 conference on LISP and functional programming.</p>
1364 <p><a name="henderson02">[Henderson2002]</a> <a
1365 href="http://citeseer.ist.psu.edu/henderson02accurate.html">
1366 Accurate Garbage Collection in an Uncooperative Environment</a>.
1367 Fergus Henderson. International Symposium on Memory Management 2002.</p>
1372 <!-- *********************************************************************** -->
1376 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1377 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1378 <a href="http://validator.w3.org/check/referer"><img
1379 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1381 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1382 <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
1383 Last modified: $Date$