X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;ds=sidebyside;f=docs%2FGarbageCollection.html;h=a226b0ee3e1cc84b01594f941a9a300b97e471dd;hb=b76761351f36ce0a68762ba800dd88f6ecd4bccb;hp=c9324859ba9457b23f24df9a2f3f00f609959246;hpb=0adede059ed76940700195342bb5b02c79e58516;p=oota-llvm.git diff --git a/docs/GarbageCollection.html b/docs/GarbageCollection.html index c9324859ba9..a226b0ee3e1 100644 --- a/docs/GarbageCollection.html +++ b/docs/GarbageCollection.html @@ -13,26 +13,22 @@
-Garbage collection is a widely used technique that frees the programmer from having to know the lifetimes of heap objects, making software easier to produce @@ -135,23 +118,21 @@ conservative garbage collectors (though these seem rare in practice).
they can suffer from degraded scalar optimization of the program. In particular, because the runtime must be able to identify and update all pointers active in the program, some optimizations are less effective. In practice, however, the -locality and performance benefits of using aggressive garbage allocation +locality and performance benefits of using aggressive garbage collection techniques dominates any low-level losses.This document describes the mechanisms and interfaces provided by LLVM to support accurate garbage collection.
-LLVM's intermediate representation provides garbage -collection intrinsics which offer support for a broad class of +collection intrinsics that offer support for a broad class of collector models. For instance, the intrinsics permit:
However, LLVM does not itself implement a garbage collector. This is because -collectors are tightly coupled to object models, and LLVM is agnostic to object -models. Since LLVM is agnostic to object models, it would be inappropriate for -LLVM to dictate any particular collector. Instead, LLVM provides a framework for -garbage collector implementations in two manners:
+However, LLVM does not itself provide a garbage collector—this should +be part of your language's runtime library. LLVM provides a framework for +compile time code generation plugins. The role of these +plugins is to generate code and data structures which conforms to the binary +interface specified by the runtime library. This is similar to the +relationship between LLVM and DWARF debugging info, for example. The +difference primarily lies in the lack of an established standard in the domain +of garbage collection—thus the plugins.
+ +The aspects of the binary interface with which LLVM's GC support is +concerned are:
+ +There are additional areas that LLVM does not directly address:
In general, LLVM's support for GC does not include features which can be +adequately addressed with other features of the IR and does not specify a +particular binary interface. On the plus side, this means that you should be +able to integrate LLVM with an existing runtime. On the other hand, it leaves +a lot of work for the developer of a novel language. However, it's easy to get +started quickly and scale up to a more sophisticated implementation as your +compiler matures.
+In general, using a collector implies:
+Using a GC with LLVM implies many things, for example:
This table summarizes the available runtimes.
- -Collector | -gc attribute | -Linkage | -gcroot | -gcread | -gcwrite | -
---|---|---|---|---|---|
SemiSpace | -gc "shadow-stack" | -TODO FIXME | -required | -optional | -optional | -
Ocaml | -gc "ocaml" | -provided by ocamlopt | -required | -optional | -optional | -
The sections for Collection intrinsics and -Recommended runtime interface detail the interfaces that -collectors may require user programs to utilize.
- -To help with several of these tasks (those indicated with a *), LLVM +includes a highly portable, built-in ShadowStack code generator. It is compiled +into llc and works even with the interpreter and C backends.
- +To turn the shadow stack on for your functions, first call:
-The ShadowStack backend is invoked with the gc "shadow-stack" -function attribute. -Unlike many collectors which rely on a cooperative code generator to generate -stack maps, this algorithm carefully maintains a linked list of stack root -descriptors [Henderson2002]. This so-called "shadow -stack" mirrors the machine stack. Maintaining this data structure is slower -than using stack maps, but has a significant portability advantage because it -requires no special support from the target code generator.
+F.setGC("shadow-stack");
The ShadowStack collector does not use read or write barriers, so the user -program may use load and store instead of llvm.gcread -and llvm.gcwrite.
+for each function your compiler emits. Since the shadow stack is built into +LLVM, you do not need to load a plugin.
-ShadowStack is a code generator plugin only. It must be paired with a -compatible runtime.
+Your compiler must also use @llvm.gcroot as documented. +Don't forget to create a root for each intermediate value that is generated +when evaluating an expression. In h(f(), g()), the result of +f() could easily be collected if evaluating g() triggers a +collection.
-There's no need to use @llvm.gcread and @llvm.gcwrite over +plain load and store for now. You will need them when +switching to a more advanced GC.
- - -The SemiSpace runtime implements with the suggested -runtime interface and is compatible the ShadowStack backend.
- -SemiSpace is a very simple copying collector. When it starts up, it -allocates two blocks of memory for the heap. It uses a simple bump-pointer -allocator to allocate memory from the first block until it runs out of space. -When it runs out of space, it traces through all of the roots of the program, -copying blocks to the other half of the memory space.
- -This runtime is highly experimental and has not been used in a real project. -Enhancements would be welcomed.
+ +The shadow stack doesn't imply a memory allocation algorithm. A semispace +collector or building atop malloc are great places to start, and can +be implemented with very little code.
+ +When it comes time to collect, however, your runtime needs to traverse the +stack roots, and for this it needs to integrate with the shadow stack. Luckily, +doing so is very simple. (This code is heavily commented to help you +understand the data structure, but there are only 20 lines of meaningful +code.)
+ ++/// @brief The map for a single function's stack frame. One of these is +/// compiled as constant data into the executable for each function. +/// +/// Storage of metadata values is elided if the %metadata parameter to +/// @llvm.gcroot is null. +struct FrameMap { + int32_t NumRoots; //< Number of roots in stack frame. + int32_t NumMeta; //< Number of metadata entries. May be < NumRoots. + const void *Meta[0]; //< Metadata for each root. +}; + +/// @brief A link in the dynamic shadow stack. One of these is embedded in the +/// stack frame of each function on the call stack. +struct StackEntry { + StackEntry *Next; //< Link to next stack entry (the caller's). + const FrameMap *Map; //< Pointer to constant FrameMap. + void *Roots[0]; //< Stack roots (in-place array). +}; + +/// @brief The head of the singly-linked list of StackEntries. Functions push +/// and pop onto this in their prologue and epilogue. +/// +/// Since there is only a global list, this technique is not threadsafe. +StackEntry *llvm_gc_root_chain; + +/// @brief Calls Visitor(root, meta) for each GC root on the stack. +/// root and meta are exactly the values passed to +/// @llvm.gcroot. +/// +/// Visitor could be a function to recursively mark live objects. Or it +/// might copy them to another heap or generation. +/// +/// @param Visitor A function to invoke for every GC root on the stack. +void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) { + for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) { + unsigned i = 0; + + // For roots [0, NumMeta), the metadata pointer is in the FrameMap. + for (unsigned e = R->Map->NumMeta; i != e; ++i) + Visitor(&R->Roots[i], R->Map->Meta[i]); + + // For roots [NumMeta, NumRoots), the metadata pointer is null. + for (unsigned e = R->Map->NumRoots; i != e; ++i) + Visitor(&R->Roots[i], NULL); + } +}
Unlike many GC algorithms which rely on a cooperative code generator to +compile stack maps, this algorithm carefully maintains a linked list of stack +roots [Henderson2002]. This so-called "shadow stack" +mirrors the machine stack. Maintaining this data structure is slower than using +a stack map compiled into the executable as constant data, but has a significant +portability advantage because it requires no special support from the target +code generator, and does not require tricky platform-specific code to crawl +the machine stack.
-The ocaml backend is invoked with the gc "ocaml" function attribute. -It supports the -Objective Caml language runtime by emitting -a type-accurate stack map in the form of an ocaml 3.10.0-compatible frametable. -The linkage requirements are satisfied automatically by the ocamlopt -compiler when linking an executable.
+The tradeoff for this simplicity and portability is:
-The ocaml collector does not use read or write barriers, so the user program -may use load and store instead of llvm.gcread and -llvm.gcwrite.
+Still, it's an easy way to get started. After your compiler and runtime are +up and running, writing a plugin will allow you to take +advantage of more advanced GC features of LLVM +in order to improve performance.
This section describes the garbage collection facilities provided by the -LLVM intermediate representation.
- -These facilities are limited to those strictly necessary for compilation. -They are not intended to be a complete interface to any garbage collector. -Notably, heap allocation is not among the supplied primitives. A user program -will also need to interface with the runtime, using either the -suggested runtime interface or another interface -specified by the runtime.
+LLVM intermediate representation. The exact behavior +of these IR features is specified by the binary interface implemented by a +code generation plugin, not by this document. -These facilities are limited to those strictly necessary; they are not +intended to be a complete interface to any garbage collector. A program will +need to interface with the GC library using the facilities provided by that +program.
- + + +The gc function attribute is used to specify the desired collector -algorithm to the compiler. It is equivalent to specify the collector name -programmatically using the setCollector method of +
The gc function attribute is used to specify the desired GC style +to the compiler. Its programmatic equivalent is the setGC method of Function.
-Specifying the collector on a per-function basis allows LLVM to link together -programs which use different garbage collection algorithms.
+Setting gc "name" on a function triggers a search for a +matching code generation plugin "name"; it is that plugin which defines +the exact nature of the code generated to support GC. If none is found, the +compiler will raise an error.
+ +Specifying the GC style on a per-function basis allows LLVM to link together +programs that use different garbage collection algorithms (or none at all).
The llvm.gcroot intrinsic is used to inform LLVM that a stack +variable references an object on the heap and is to be tracked for garbage +collection. The exact impact on generated code is specified by a compiler plugin.
-The llvm.gcroot intrinsic is used to inform LLVM of a pointer -variable on the stack. The first argument must be an alloca instruction +
A compiler which uses mem2reg to raise imperative code using alloca +into SSA form need only add a call to @llvm.gcroot for those variables +which a pointers into the GC heap.
+ +It is also important to mark intermediate values with llvm.gcroot. +For example, consider h(f(), g()). Beware leaking the result of +f() in the case that g() triggers a collection.
+ +The first argument must be a value referring to an alloca instruction or a bitcast of an alloca. The second contains a pointer to metadata that should be associated with the pointer, and must be a constant or global value address. If your target collector uses tags, use a null pointer for metadata.
+The %metadata argument can be used to avoid requiring heap objects +to have 'isa' pointers or tag bits. [Appel89, Goldberg91, Tolmach94] If +specified, its value will be tracked along with the location of the pointer in +the stack frame.
+Consider the following fragment of Java code:
-+{ Object X; // A null-initialized reference to an object ... @@ -390,7 +463,7 @@ metadata.This block (which may be located in the middle of a function or in a loop nest), could be compiled to this LLVM code:
-+Entry: ;; In the entry block for the function, allocate the ;; stack space for X, which is an LLVM pointer. @@ -399,7 +472,7 @@ Entry: ;; Tell LLVM that the stack space is a stack root. ;; Java has type-tags on objects, so we pass null as metadata. %tmp = bitcast %Object** %X to i8** - call void %llvm.gcroot(%i8** %X, i8* null) + call void @llvm.gcroot(i8** %X, i8* null) ... ;; "CodeBlock" is the block corresponding to the start @@ -419,11 +492,11 @@ CodeBlock:
Some collectors need to be informed when the mutator (the program that needs garbage collection) either reads a pointer from or writes a pointer to a field @@ -439,218 +512,135 @@ object). Accordingly, these intrinsics take both pointers as separate arguments for completeness. In this snippet, %object is the object pointer, and %derived is the derived pointer:
--;; An array type. ++ %derived = getelementptr %object, i32 0, i32 2, i32 %n+ ;; An array type. %class.Array = type { %class.Object, i32, [0 x %class.Object*] } -... + ... ;; Load the object pointer from a gcroot. %object = load %class.Array** %object_addr ;; Compute the derived pointer. - %derived = getelementptr %obj, i32 0, i32 2, i32 %n
LLVM does not enforce this relationship between the object and derived +pointer (although a plugin might). However, it would be +an unusual collector that violated it.
+ +The use of these intrinsics is naturally optional if the target GC does +require the corresponding barrier. Such a GC plugin will replace the intrinsic +calls with the corresponding load or store instruction if they +are used.
- + + +For write barriers, LLVM provides the llvm.gcwrite intrinsic function. It has exactly the same semantics as a non-volatile store to -the derived pointer (the third argument).
+the derived pointer (the third argument). The exact code generated is specified +by a compiler plugin.Many important algorithms require write barriers, including generational and concurrent collectors. Additionally, write barriers could be used to implement reference counting.
-The use of this intrinsic is optional if the target collector does use -write barriers. If so, the collector will replace it with the corresponding -store.
-For read barriers, LLVM provides the llvm.gcread intrinsic function. It has exactly the same semantics as a non-volatile load from the -derived pointer (the second argument).
+derived pointer (the second argument). The exact code generated is specified by +a compiler plugin.Read barriers are needed by fewer algorithms than write barriers, and may have a greater performance impact since pointer reads are more frequent than writes.
-As with llvm.gcwrite, a target collector might not require the use -of this intrinsic.
- -LLVM specifies the following recommended runtime interface to the garbage -collection at runtime. A program should use these interfaces to accomplish the -tasks not supported by the intrinsics.
- -Unlike the intrinsics, which are integral to LLVM's code generator, there is -nothing unique about these interfaces; a front-end compiler and runtime are free -to agree to a different specification.
- -Note: This interface is a work in progress.
- --The llvm_gc_initialize function should be called once before any other -garbage collection functions are called. This gives the garbage collector the -chance to initialize itself and allocate the heap. The initial heap size to -allocate should be specified as an argument. -
- -The llvm_gc_allocate function is a global function defined by the -garbage collector implementation to allocate memory. It returns a -zeroed-out block of memory of the specified size, sufficiently aligned to store -any object.
- --The llvm_gc_collect function is exported by the garbage collector -implementations to provide a full collection, even when the heap is not -exhausted. This can be used by end-user code as a hint, and may be ignored by -the garbage collector. -
- --The llvm_cg_walk_gcroots function is a function provided by the code -generator that iterates through all of the GC roots on the stack, calling the -specified function pointer with each record. For each GC root, the address of -the pointer and the meta-data (from the llvm.gcroot intrinsic) are provided. -
User code specifies which collector plugin to use with the gc -function attribute or, equivalently, with the setCollector method of +
User code specifies which GC code generation to use with the gc +function attribute or, equivalently, with the setGC method of Function.
-To implement a collector plugin, it is necessary to subclass -llvm::Collector, which can be accomplished in a few lines of +
To implement a GC plugin, it is necessary to subclass +llvm::GCStrategy, which can be accomplished in a few lines of boilerplate code. LLVM's infrastructure provides access to several important -algorithms. For an uncontroversial collector, all that remains may be to emit -the assembly code for the collector's unique stack map data structure, which -might be accomplished in as few as 100 LOC.
+algorithms. For an uncontroversial collector, all that remains may be to +compile LLVM's computed stack map to assembly code (using the binary +representation expected by the runtime library). This can be accomplished in +about 100 lines of code. + +This is not the appropriate place to implement a garbage collected heap or a +garbage collector itself. That code should exist in the language's runtime +library. The compiler plugin is responsible for generating code which +conforms to the binary interface defined by library, most essentially the +stack map.
-To subclass llvm::Collector and register a collector:
+To subclass llvm::GCStrategy and register it with the compiler:
-// lib/MyGC/MyGC.cpp - Example LLVM collector plugin ++// lib/MyGC/MyGC.cpp - Example LLVM GC plugin -#include "llvm/CodeGen/Collector.h" -#include "llvm/CodeGen/Collectors.h" -#include "llvm/CodeGen/CollectorMetadata.h" +#include "llvm/CodeGen/GCStrategy.h" +#include "llvm/CodeGen/GCMetadata.h" #include "llvm/Support/Compiler.h" using namespace llvm; namespace { - class VISIBILITY_HIDDEN MyCollector : public Collector { + class LLVM_LIBRARY_VISIBILITY MyGC : public GCStrategy { public: - MyCollector() {} + MyGC() {} }; - CollectorRegistry::Add<MyCollector> + GCRegistry::Add<MyGC> X("mygc", "My bespoke garbage collector."); }This boilerplate collector does nothing. More specifically:
+ ++
+- llvm.gcread calls are replaced with the corresponding + load instruction.
+- llvm.gcwrite calls are replaced with the corresponding + store instruction.
+- No safe points are added to the code.
+- The stack map is not compiled into the executable.
+Using the LLVM makefiles (like the sample -project), this can be built into a plugin using a simple makefile:
+project), this code can be compiled as a plugin using a simple +makefile:# lib/MyGC/Makefile @@ -676,29 +666,18 @@ $ llvm-as < sample.ll | llc -load=MyGC.soIt is also possible to statically link the collector plugin into tools, such as a language-specific compiler front-end.
-
The boilerplate collector above does nothing. More specifically:
- -Collector provides a range of features through which a plugin -collector may do useful work. This matrix summarizes the supported (and planned) -features and correlates them with the collection techniques which typically -require them.
+GCStrategy provides a range of features through which a plugin +may do useful work. Some of these are callbacks, some are algorithms that can +be enabled, disabled, or customized. This matrix summarizes the supported (and +planned) features and correlates them with the collection techniques which +typically require them.