From 304aa0291287b6146791f986b7e2217701394d04 Mon Sep 17 00:00:00 2001
From: Philip Reames
Date: Wed, 26 Aug 2015 23:13:35 +0000
Subject: [PATCH] [docs][Statepoints] More on base pointers

Expand the information on base pointers to include an example, the assumptions a collector is allowed to make, legal optimizations over gc.relocates, and the assumptions made by RewriteStatepointsForGC.

This is the result of a recent conversation with folks from LLIC and the confusions that came to light therein.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246103 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/Statepoints.rst | 80 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 62 insertions(+), 18 deletions(-)

diff --git a/docs/Statepoints.rst b/docs/Statepoints.rst
index 6f1a5a4671d..d6a4b9c5fe2 100644
--- a/docs/Statepoints.rst
+++ b/docs/Statepoints.rst
@@ -209,20 +209,49 @@ This example was taken from the tests for the :ref:`RewriteStatepointsForGC` uti
 Base & Derived Pointers
 ^^^^^^^^^^^^^^^^^^^^^^^
 
-A base pointer is one which points to the base of an allocation (object). A
-derived pointer is one which is offset from a base pointer by some amount.
-When relocating objects, a garbage collector needs to be able to relocate each
-derived pointer associated with an allocation to the same offset from the new
-address.
-
-Derived pointers fall in to two categories:
- * "Interior derived pointers" remain within the bounds of the allocation
-   they're associated with. As a result, the base object can be found at
-   runtime provided the bounds of allocations are known to the runtime system.
- * "Exterior derived pointers" are outside the bounds of the associated object;
-   they may even fall within *another* allocations address range. As a result,
-   there is no way for a garbage collector to determine which allocation they
-   are associated with at runtime and compiler support is needed.
+A "base pointer" is one which points to the starting address of an allocation
+(object). A "derived pointer" is one which is offset from a base pointer by
+some amount. When relocating objects, a garbage collector needs to be able
+to relocate each derived pointer associated with an allocation to the same
+offset from the new address.
+
+"Interior derived pointers" remain within the bounds of the allocation
+they're associated with. As a result, the base object can be found at
+runtime provided the bounds of allocations are known to the runtime system.
+
+"Exterior derived pointers" are outside the bounds of the associated object;
+they may even fall within *another* allocation's address range. As a result,
+there is no way for a garbage collector to determine which allocation they
+are associated with at runtime and compiler support is needed.
+
+The ``gc.relocate`` intrinsic supports an explicit operand for describing the
+allocation associated with a derived pointer. This operand is frequently
+referred to as the base operand. Strictly speaking, it does not have to be
+a base pointer, but it does need to lie within the bounds of the associated
+allocation. Some collectors may require that the operand be an actual base
+pointer rather than merely an interior derived pointer. Note that during
+lowering both the base and derived pointer operands are required to be live
+over the associated call safepoint even if the base is otherwise unused
+afterwards.
+
+If we extend our previous example to include a pointless derived pointer,
+we get:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
+         gc "statepoint-example" {
+    %gep = getelementptr i8, i8 addrspace(1)* %obj, i64 20000
+    %token = call i32 (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj, i8 addrspace(1)* %gep)
+    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %token, i32 7, i32 7)
+    %gep.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(i32 %token, i32 7, i32 8)
+    %p = getelementptr i8, i8 addrspace(1)* %gep.relocated, i64 -20000
+    ret i8 addrspace(1)* %p
+  }
+
+Note that in this example %p and %obj.relocated are the same address and we
+could replace one with the other, potentially removing the derived pointer
+from the live set at the safepoint entirely.
 
 GC Transitions
 ^^^^^^^^^^^^^^^^^^
@@ -486,9 +515,14 @@ Despite the typing of this as a generic i32, *only* the value defined by a
 ``gc.statepoint`` is legal here.
 
 The second argument is an index into the statepoints list of arguments
-which specifies the base pointer for the pointer being relocated.
+which specifies the allocation for the pointer being relocated.
 This index must land within the 'gc parameter' section of the
-statepoint's argument list.
+statepoint's argument list. The associated value must be within the
+object with which the pointer being relocated is associated. The optimizer
+is free to change *which* interior derived pointer is reported, provided that
+it does not replace an actual base pointer with another interior derived
+pointer. Collectors are allowed to rely on the base pointer operand
+remaining an actual base pointer if so constructed.
 
 The third argument is an index into the statepoint's list of arguments
 which specify the (potentially) derived pointer being relocated. It
@@ -631,8 +665,18 @@ non references. Address space 1 is not globally reserved for this purpose.
 This pass can be used an utility function by a language frontend that doesn't
 want to manually reason about liveness, base pointers, or relocation when
 constructing IR. As currently implemented, RewriteStatepointsForGC must be
-run after SSA construction (i.e. mem2ref).
-
+run after SSA construction (i.e. mem2reg).
+
+RewriteStatepointsForGC will ensure that appropriate base pointers are listed
+for every relocation created. It will do so by duplicating code as needed to
+propagate the base pointer associated with each pointer being relocated to
+the appropriate safepoints. The implementation assumes that the following
+IR constructs produce base pointers: loads from the heap, addresses of global
+variables, function arguments, and function return values. Constant pointers
+(such as null) are also assumed to be base pointers. In practice, this
+constraint can be relaxed to allow interior derived pointers, provided the
+target collector can find the associated allocation from an arbitrary interior
+derived pointer.
 
 In practice, RewriteStatepointsForGC can be run much later in the pass
 pipeline, after most optimization is already done. This helps to improve
--
2.34.1