X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FExceptionHandling.rst;h=74827c02a272963a6e2b704a20d30700544d7055;hb=955318d58de6f3a5ca768c00de715891f607f64f;hp=fe6876ad83a792801f5cfe5808a1fa4c04067967;hpb=745f3905122092c409f18e91787b0394b48c66e2;p=oota-llvm.git diff --git a/docs/ExceptionHandling.rst b/docs/ExceptionHandling.rst index fe6876ad83a..74827c02a27 100644 --- a/docs/ExceptionHandling.rst +++ b/docs/ExceptionHandling.rst @@ -67,17 +67,10 @@ exception handling is generally preferred to SJLJ. Windows Runtime Exception Handling ----------------------------------- -Windows runtime based exception handling uses the same basic IR structure as -Itanium ABI based exception handling, but it relies on the personality -functions provided by the native Windows runtime library, ``__CxxFrameHandler3`` -for C++ exceptions: ``__C_specific_handler`` for 64-bit SEH or -``_frame_handler3/4`` for 32-bit SEH. This results in a very different -execution model and requires some minor modifications to the initial IR -representation and a significant restructuring just before code generation. - -General information about the Windows x64 exception handling mechanism can be -found at `MSDN Exception Handling (x64) -_`. +LLVM supports handling exceptions produced by the Windows runtime, but it +requires a very different intermediate representation. It is not based on the +":ref:`landingpad `" instruction like the other two models, and is +described later in this document under :ref:`wineh`. Overview -------- @@ -169,11 +162,11 @@ pad to the back end. For C++, the ``landingpad`` instruction returns a pointer and integer pair corresponding to the pointer to the *exception structure* and the *selector value* respectively. -The ``landingpad`` instruction takes a reference to the personality function to -be used for this ``try``/``catch`` sequence. The remainder of the instruction is -a list of *cleanup*, *catch*, and *filter* clauses. 
The exception is tested -against the clauses sequentially from first to last. The clauses have the -following meanings: +The ``landingpad`` instruction looks for a reference to the personality +function to be used for this ``try``/``catch`` sequence in the parent +function's attribute list. The instruction contains a list of *cleanup*, +*catch*, and *filter* clauses. The exception is tested against the clauses +sequentially from first to last. The clauses have the following meanings: - ``catch @ExcType`` @@ -278,9 +271,9 @@ there are no catches or filters that require it to. exceptions and throws a third. When all cleanups are finished, if the exception is not handled by the current -function, resume unwinding by calling the `resume -instruction `_, passing in the result of the -``landingpad`` instruction for the original landing pad. +function, resume unwinding by calling the :ref:`resume instruction `, +passing in the result of the ``landingpad`` instruction for the original +landing pad. Throw Filters ------------- @@ -321,97 +314,6 @@ the selector results they understand and then resume exception propagation with the `resume instruction `_ if none of the conditions match. -C++ Exception Handling using the Windows Runtime -================================================= - -(Note: Windows C++ exception handling support is a work in progress and is - not yet fully implemented. The text below describes how it will work - when completed.) - -The Windows runtime function for C++ exception handling uses a multi-phase -approach. When an exception occurs it searches the current callstack for a -frame that has a handler for the exception. If a handler is found, it then -calls the cleanup handler for each frame above the handler which has a -cleanup handler before calling the catch handler. These calls are all made -from a stack context different from the original frame in which the handler -is defined. 
Therefore, it is necessary to outline these handlers from their -original context before code generation. - -Catch handlers are called with a pointer to the handler itself as the first -argument and a pointer to the parent function's stack frame as the second -argument. The catch handler uses the `llvm.recoverframe -`_ to get a -pointer to a frame allocation block that is created in the parent frame using -the `llvm.allocateframe -`_ intrinsic. -The ``WinEHPrepare`` pass will have created a structure definition for the -contents of this block. The first two members of the structure will always be -(1) a 32-bit integer that the runtime uses to track the exception state of the -parent frame for the purposes of handling chained exceptions and (2) a pointer -to the object associated with the exception (roughly, the parameter of the -catch clause). These two members will be followed by any frame variables from -the parent function which must be accessed in any of the functions unwind or -catch handlers. The catch handler returns the address at which execution -should continue. - -Cleanup handlers perform any cleanup necessary as the frame goes out of scope, -such as calling object destructors. The runtime handles the actual unwinding -of the stack. If an exception occurs in a cleanup handler the runtime manages -termination of the process. Cleanup handlers are called with the same arguments -as catch handlers (a pointer to the handler and a pointer to the parent stack -frame) and use the same mechanism described above to access frame variables -in the parent function. Cleanup handlers do not return a value. - -The IR generated for Windows runtime based C++ exception handling is initially -very similar to the ``landingpad`` mechanism described above. 
Calls to -libc++abi functions (such as ``__cxa_begin_catch``/``__cxa_end_catch`` and -``__cxa_throw_exception`` are replaced with calls to intrinsics or Windows -runtime functions (such as ``llvm.eh.begincatch``/``llvm.eh.endcatch`` and -``__CxxThrowException``). - -During the WinEHPrepare pass, the handler functions are outlined into handler -functions and the original landing pad code is replaced with a call to the -``llvm.eh.actions`` intrinsic that describes the order in which handlers will -be processed from the logical location of the landing pad and an indirect -branch to the return value of the ``llvm.eh.actions`` intrinsic. The -``llvm.eh.actions`` intrinsic is defined as returning the address at which -execution will continue. This is a temporary construct which will be removed -before code generation, but it allows for the accurate tracking of control -flow until then. - -A typical landing pad will look like this after outlining: - -.. code-block:: llvm - - lpad: - %vals = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) - cleanup - catch i8* bitcast (i8** @_ZTIi to i8*) - catch i8* bitcast (i8** @_ZTIf to i8*) - %recover = call i8* (...)* @llvm.eh.actions( - i32 3, i8* bitcast (i8** @_ZTIi to i8*), i8* (i8*, i8*)* @_Z4testb.catch.1) - i32 2, i8* null, void (i8*, i8*)* @_Z4testb.cleanup.1) - i32 1, i8* bitcast (i8** @_ZTIf to i8*), i8* (i8*, i8*)* @_Z4testb.catch.0) - i32 0, i8* null, void (i8*, i8*)* @_Z4testb.cleanup.0) - indirectbr i8* %recover, [label %try.cont1, label %try.cont2] - -In this example, the landing pad represents an exception handling context with -two catch handlers and a cleanup handler that have been outlined. If an -exception is thrown with a type that matches ``_ZTIi``, the ``_Z4testb.catch.1`` -handler will be called an no clean-up is needed. 
If an exception is thrown -with a type that matches ``_ZTIf``, first the ``_Z4testb.cleanup.1`` handler -will be called to perform unwind-related cleanup, then the ``_Z4testb.catch.1`` -handler will be called. If an exception is throw which does not match either -of these types and the exception is handled by another frame further up the -call stack, first the ``_Z4testb.cleanup.1`` handler will be called, then the -``_Z4testb.cleanup.0`` handler (which corresponds to a different scope) will be -called, and exception handling will continue at the next frame in the call -stack will be called. One of the catch handlers will return the address of -``%try.cont1`` in the parent function and the other will return the address of -``%try.cont2``, meaning that execution continues at one of those blocks after -an exception is caught. - - Exception Handling Intrinsics ============================= @@ -442,7 +344,7 @@ Uses of this intrinsic are generated by the C++ front-end. .. code-block:: llvm - i8* @llvm.eh.begincatch(i8* %exn) + void @llvm.eh.begincatch(i8* %ehptr, i8* %ehobj) This intrinsic marks the beginning of catch handling code within the blocks @@ -450,11 +352,11 @@ following a ``landingpad`` instruction. The exact behavior of this function depends on the compilation target and the personality function associated with the ``landingpad`` instruction. -The argument to this intrinsic is a pointer that was previously extracted from -the aggregate return value of the ``landingpad`` instruction. The return -value of the intrinsic is a pointer to the exception object to be used by the -catch code. This pointer is returned as an ``i8*`` value, but the actual type -of the object will depend on the exception that was thrown. +The first argument to this intrinsic is a pointer that was previously extracted +from the aggregate return value of the ``landingpad`` instruction. 
The second +argument to the intrinsic is a pointer to stack space where the exception object +should be stored. The runtime handles the details of copying the exception +object into the slot. If the second parameter is null, no copy occurs. Uses of this intrinsic are generated by the C++ front-end. Many targets will use implementation-specific functions (such as ``__cxa_begin_catch``) instead @@ -499,6 +401,20 @@ intrinsic serves as a placeholder to delimit code before a catch handler is outlined. After the handler is outlined, this intrinsic is simply removed. +.. _llvm.eh.exceptionpointer: + +``llvm.eh.exceptionpointer`` +---------------------------- + +.. code-block:: llvm + + i8 addrspace(N)* @llvm.eh.exceptionpointer.pNi8(token %catchpad) + + +This intrinsic retrieves a pointer to the exception caught by the given +``catchpad``. + + SJLJ Intrinsics --------------- @@ -583,10 +499,279 @@ an exception handling frame for each function in a compile unit, plus a common exception handling frame that defines information common to all functions in the unit. +The format of this call frame information (CFI) is often platform-dependent, +however. ARM, for example, defines its own format. Apple has its own compact +unwind info format. On Windows, another format is used for all architectures +since 32-bit x86. LLVM will emit whatever information is required by the +target. + Exception Tables ---------------- An exception table contains information about what actions to take when an -exception is thrown in a particular part of a function's code. There is one -exception table per function, except leaf functions and functions that have -calls only to non-throwing functions. They do not need an exception table. +exception is thrown in a particular part of a function's code. This is typically +referred to as the language-specific data area (LSDA). 
The format of the LSDA +table is specific to the personality function, but the majority of personalities +out there use a variation of the tables consumed by ``__gxx_personality_v0``. +There is one exception table per function, except leaf functions and functions +that have calls only to non-throwing functions. They do not need an exception +table. + +.. _wineh: + +Exception Handling using the Windows Runtime +================================================= + +Background on Windows exceptions +--------------------------------- + +Interacting with exceptions on Windows is significantly more complicated than +on Itanium C++ ABI platforms. The fundamental difference between the two models +is that Itanium EH is designed around the idea of "successive unwinding," while +Windows EH is not. + +Under Itanium, throwing an exception typically involves allocating thread-local +memory to hold the exception, and calling into the EH runtime. The runtime +identifies frames with appropriate exception handling actions, and successively +resets the register context of the current thread to the most recently active +frame with actions to run. In LLVM, execution resumes at a ``landingpad`` +instruction, which produces register values provided by the runtime. If a +function is only cleaning up allocated resources, the function is responsible +for calling ``_Unwind_Resume`` to transition to the next most recently active +frame after it is finished cleaning up. Eventually, the frame responsible for +handling the exception calls ``__cxa_end_catch`` to destroy the exception, +release its memory, and resume normal control flow. + +The Windows EH model does not use these successive register context resets. +Instead, the active exception is typically described by a frame on the stack. +In the case of C++ exceptions, the exception object is allocated in stack memory +and its address is passed to ``__CxxThrowException``. 
General-purpose structured +exceptions (SEH) are more analogous to Linux signals, and they are dispatched by +userspace DLLs provided with Windows. Each frame on the stack has an assigned EH +personality routine, which decides what actions to take to handle the exception. +There are a few major personalities for C and C++ code: the C++ personality +(``__CxxFrameHandler3``) and the SEH personalities (``_except_handler3``, +``_except_handler4``, and ``__C_specific_handler``). All of them implement +cleanups by calling back into a "funclet" contained in the parent function. + +Funclets, in this context, are regions of the parent function that can be called +as though they were a function pointer with a very special calling convention. +The frame pointer of the parent frame is passed into the funclet either using +the standard EBP register or as the first parameter register, depending on the +architecture. The funclet implements the EH action by accessing local variables +in memory through the frame pointer, and returning some appropriate value, +continuing the EH process. No variables live in to or out of the funclet can be +allocated in registers. + +The C++ personality also uses funclets to contain the code for catch blocks +(i.e. all user code between the braces in ``catch (Type obj) { ... }``). The +runtime must use funclets for catch bodies because the C++ exception object is +allocated in a child stack frame of the function handling the exception. If the +runtime rewound the stack back to the frame of the catch, the memory holding the +exception would be overwritten quickly by subsequent function calls. The use of +funclets also allows ``__CxxFrameHandler3`` to implement rethrow without +resorting to TLS. Instead, the runtime throws a special exception, and then uses +SEH (``__try / __except``) to resume execution with new information in the child +frame. 
+ +In other words, the successive unwinding approach is incompatible with Visual +C++ exceptions and general purpose Windows exception handling. Because the C++ +exception object lives in stack memory, LLVM cannot provide a custom personality +function that uses landingpads. Similarly, SEH does not provide any mechanism +to rethrow an exception or continue unwinding. Therefore, LLVM must use the IR +constructs described later in this document to implement compatible exception +handling. + +SEH filter expressions +----------------------- + +The SEH personality functions also use funclets to implement filter expressions, +which allow executing arbitrary user code to decide which exceptions to catch. +Filter expressions should not be confused with the ``filter`` clause of the LLVM +``landingpad`` instruction. Typically filter expressions are used to determine +if the exception came from a particular DLL or code region, or if code faulted +while accessing a particular memory address range. LLVM does not currently have +IR to represent filter expressions because it is difficult to represent their +control dependencies. Filter expressions run during the first phase of EH, +before cleanups run, making it very difficult to build a faithful control flow +graph. For now, the new EH instructions cannot represent SEH filter +expressions, and frontends must outline them ahead of time. Local variables of +the parent function can be escaped and accessed using the ``llvm.localescape`` +and ``llvm.localrecover`` intrinsics. + +New exception handling instructions +------------------------------------ + +The primary design goal of the new EH instructions is to support funclet +generation while preserving information about the CFG so that SSA formation +still works. As a secondary goal, they are designed to be generic across MSVC +and Itanium C++ exceptions. 
They make very few assumptions about the data +required by the personality, so long as it uses the familiar core EH actions: +catch, cleanup, and terminate. However, the new instructions are hard to modify +without knowing details of the EH personality. While they can be used to +represent Itanium EH, the landingpad model is strictly better for optimization +purposes. + +The following new instructions are considered "exception handling pads", in that +they must be the first non-phi instruction of a basic block that may be the +unwind destination of an EH flow edge: +``catchswitch``, ``catchpad``, and ``cleanuppad``. +As with landingpads, when entering a try scope, if the +frontend encounters a call site that may throw an exception, it should emit an +invoke that unwinds to a ``catchswitch`` block. Similarly, inside the scope of a +C++ object with a destructor, invokes should unwind to a ``cleanuppad``. + +New instructions are also used to mark the points where control is transferred +out of a catch/cleanup handler (which will correspond to exits from the +generated funclet). A catch handler which reaches its end by normal execution +executes a ``catchret`` instruction, which is a terminator indicating where in +the function control is returned to. A cleanup handler which reaches its end +by normal execution executes a ``cleanupret`` instruction, which is a terminator +indicating where the active exception will unwind to next. + +Each of these new EH pad instructions has a way to identify which action should +be considered after this action. The ``catchswitch`` instruction is a terminator +and has an unwind destination operand analogous to the unwind destination of an +invoke. The ``cleanuppad`` instruction is not +a terminator, so the unwind destination is stored on the ``cleanupret`` +instruction instead. Successfully executing a catch handler should resume +normal control flow, so neither ``catchpad`` nor ``catchret`` instructions can +unwind. 
All of these "unwind edges" may refer to a basic block that contains an +EH pad instruction, or they may unwind to the caller. Unwinding to the caller +has roughly the same semantics as the ``resume`` instruction in the landingpad +model. When inlining through an invoke, instructions that unwind to the caller +are hooked up to unwind to the unwind destination of the call site. + +Putting things together, here is a hypothetical lowering of some C++ that uses +all of the new IR instructions: + +.. code-block:: c + + struct Cleanup { + Cleanup(); + ~Cleanup(); + int m; + }; + void may_throw(); + int f() noexcept { + try { + Cleanup obj; + may_throw(); + } catch (int e) { + may_throw(); + return e; + } + return 0; + } + +.. code-block:: llvm + + define i32 @f() nounwind personality i32 (...)* @__CxxFrameHandler3 { + entry: + %obj = alloca %struct.Cleanup, align 4 + %e = alloca i32, align 4 + %call = invoke %struct.Cleanup* @"\01??0Cleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) + to label %invoke.cont unwind label %lpad.catch + + invoke.cont: ; preds = %entry + invoke void @"\01?may_throw@@YAXXZ"() + to label %invoke.cont.2 unwind label %lpad.cleanup + + invoke.cont.2: ; preds = %invoke.cont + call void @"\01??_DCleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind + br label %return + + return: ; preds = %invoke.cont.3, %invoke.cont.2 + %retval.0 = phi i32 [ 0, %invoke.cont.2 ], [ %3, %invoke.cont.3 ] + ret i32 %retval.0 + + lpad.cleanup: ; preds = %invoke.cont + %0 = cleanuppad within none [] + call void @"\01??1Cleanup@@QEAA@XZ"(%struct.Cleanup* nonnull %obj) nounwind + cleanupret from %0 unwind label %lpad.catch + + lpad.catch: ; preds = %lpad.cleanup, %entry + %1 = catchswitch within none [label %catch.body] unwind label %lpad.terminate + + catch.body: ; preds = %lpad.catch + %catch = catchpad within %1 [%rtti.TypeDescriptor2* @"\01??_R0H@8", i32 0, i32* %e] + invoke void @"\01?may_throw@@YAXXZ"() + to label %invoke.cont.3 unwind label %lpad.terminate + + 
invoke.cont.3: ; preds = %catch.body + %3 = load i32, i32* %e, align 4 + catchret from %catch to label %return + + lpad.terminate: ; preds = %catch.body, %lpad.catch + cleanuppad within none [] + call void @"\01?terminate@@YAXXZ"() + unreachable + } + +Funclet parent tokens +----------------------- + +In order to produce tables for EH personalities that use funclets, it is +necessary to recover the nesting that was present in the source. This funclet +parent relationship is encoded in the IR using tokens produced by the new "pad" +instructions. The token operand of a "pad" or "ret" instruction indicates which +funclet it is in, or "none" if it is not nested within another funclet. + +The ``catchpad`` and ``cleanuppad`` instructions establish new funclets, and +their tokens are consumed by other "pad" instructions to establish membership. +The ``catchswitch`` instruction does not create a funclet, but it produces a +token that is always consumed by its immediate successor ``catchpad`` +instructions. This ensures that every catch handler modelled by a ``catchpad`` +belongs to exactly one ``catchswitch``, which models the dispatch point after a +C++ try. + +Here is an example of what this nesting looks like using some hypothetical +C++ code: + +.. code-block:: c + + void f() { + try { + throw; + } catch (...) { + try { + throw; + } catch (...) { + } + } + } + +.. 
code-block:: llvm + + define void @f() #0 personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) { + entry: + invoke void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #1 + to label %unreachable unwind label %catch.dispatch + + catch.dispatch: ; preds = %entry + %0 = catchswitch within none [label %catch] unwind to caller + + catch: ; preds = %catch.dispatch + %1 = catchpad within %0 [i8* null, i32 64, i8* null] + invoke void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #1 + to label %unreachable unwind label %catch.dispatch2 + + catch.dispatch2: ; preds = %catch + %2 = catchswitch within %1 [label %catch3] unwind to caller + + catch3: ; preds = %catch.dispatch2 + %3 = catchpad within %2 [i8* null, i32 64, i8* null] + catchret from %3 to label %try.cont + + try.cont: ; preds = %catch3 + catchret from %1 to label %try.cont6 + + try.cont6: ; preds = %try.cont + ret void + + unreachable: ; preds = %catch, %entry + unreachable + } + +The "inner" ``catchswitch`` consumes ``%1``, which is produced by the outer +``catchpad``.
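
The source-level nesting that these tokens encode can also be observed at run
time. The sketch below is a runnable variation on the C++ example above, not
the code that produced the IR: a bare ``throw;`` with no active exception calls
``std::terminate``, so the outer ``try`` throws an ``int`` instead. The
``nested_trace`` helper and its trace string are hypothetical names introduced
only for illustration. The inner handler runs while the outer handler is still
active, which is the parent relationship recorded by ``within %1`` on the inner
``catchswitch``.

.. code-block:: c++

  #include <string>

  // Hypothetical helper: returns the order in which the handlers ran.
  std::string nested_trace() {
    std::string trace;
    try {
      throw 42;              // gives the outer handler an exception to catch
    } catch (...) {
      trace += "outer;";
      try {
        throw;               // rethrows the exception held by the outer handler
      } catch (...) {
        trace += "inner;";   // runs while the outer handler is still active
      }
      trace += "outer-done;";
    }
    return trace;
  }

Calling ``nested_trace()`` from ``main`` should yield
``outer;inner;outer-done;``: the inner handler completes before control returns
to the remainder of the outer handler, mirroring the order of the two
``catchret`` instructions in the IR.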