X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;ds=sidebyside;f=docs%2FLangRef.html;h=45f6f38f598fc9c34bc7b9ce70feb8905d7b2f40;hb=9e6d1d1f5034347d237941f1bf08fba5c1583cd3;hp=9b801cffc3ac6aa9150f82771cf8873445f13248;hpb=e910b4cefe7e964ba76dbd02920f66b8bdc3d9d6;p=oota-llvm.git diff --git a/docs/LangRef.html b/docs/LangRef.html index 9b801cffc3a..45f6f38f598 100644 --- a/docs/LangRef.html +++ b/docs/LangRef.html @@ -5,7 +5,7 @@ LLVM Assembly Language Reference Manual - @@ -22,20 +22,20 @@
  • Module Structure
  • Linkage Types
      -
    1. private
    2. -
    3. linker_private
    4. -
    5. internal
    6. -
    7. available_externally
    8. -
    9. linkonce
    10. -
    11. common
    12. -
    13. weak
    14. -
    15. appending
    16. -
    17. extern_weak
    18. -
    19. linkonce_odr
    20. -
    21. weak_odr
    22. -
    23. externally visible
    24. -
    25. dllimport
    26. -
    27. dllexport
    28. +
    29. 'private' Linkage
    30. +
    31. 'linker_private' Linkage
    32. +
    33. 'internal' Linkage
    34. +
    35. 'available_externally' Linkage
    36. +
    37. 'linkonce' Linkage
    38. +
    39. 'common' Linkage
    40. +
    41. 'weak' Linkage
    42. +
    43. 'appending' Linkage
    44. +
    45. 'extern_weak' Linkage
    46. +
    47. 'linkonce_odr' Linkage
    48. +
    49. 'weak_odr' Linkage
    50. +
    51. 'externally visible' Linkage
    52. +
    53. 'dllimport' Linkage
    54. +
    55. 'dllexport' Linkage
  • Calling Conventions
  • @@ -48,13 +48,15 @@
  • Garbage Collector Names
  • Module-Level Inline Assembly
  • Data Layout
  • +
  • Pointer Aliasing Rules
  • Type System
    1. Type Classifications
    2. -
    3. Primitive Types +
    4. Primitive Types
        +
      1. Integer Type
      2. Floating Point Types
      3. Void Type
      4. Label Type
      5. @@ -63,7 +65,6 @@
      6. Derived Types
          -
        1. Integer Type
        2. Array Type
        3. Function Type
        4. Pointer Type
        5. @@ -82,6 +83,7 @@
        6. Complex Constants
        7. Global Variable and Function Addresses
        8. Undefined Values
        9. +
        10. Addresses of Basic Blocks
        11. Constant Expressions
        12. Embedded Metadata
        @@ -91,6 +93,17 @@
      7. Inline Assembler Expressions
    5. +
    6. Intrinsic Global Variables +
        +
      1. The 'llvm.used' Global Variable
      2. +
      3. The 'llvm.compiler.used' + Global Variable
      4. +
      5. The 'llvm.global_ctors' + Global Variable
      6. +
      7. The 'llvm.global_dtors' + Global Variable
      8. +
      +
    7. Instruction Reference
      1. Terminator Instructions @@ -98,6 +111,7 @@
      2. 'ret' Instruction
      3. 'br' Instruction
      4. 'switch' Instruction
      5. +
      6. 'indirectbr' Instruction
      7. 'invoke' Instruction
      8. 'unwind' Instruction
      9. 'unreachable' Instruction
      10. @@ -144,8 +158,6 @@
      11. Memory Access and Addressing Operations
          -
        1. 'malloc' Instruction
        2. -
        3. 'free' Instruction
        4. 'alloca' Instruction
        5. 'load' Instruction
        6. 'store' Instruction
        7. @@ -261,6 +273,14 @@
        8. llvm.atomic.load.umin
      12. +
      13. Memory Use Markers +
          +
        1. llvm.lifetime.start
        2. +
        3. llvm.lifetime.end
        4. +
        5. llvm.invariant.start
        6. +
        7. llvm.invariant.end
        8. +
        +
      14. General intrinsics
        1. @@ -271,6 +291,8 @@ 'llvm.trap' Intrinsic
        2. 'llvm.stackprotector' Intrinsic
        3. +
        4. + 'llvm.objectsize' Intrinsic
      @@ -318,7 +340,7 @@ IR's", allowing many source languages to be mapped to them). By providing type information, LLVM can be used as the target of optimizations: for example, through pointer analysis, it can be proven that a C automatic - variable is never accessed outside of the current function... allowing it to + variable is never accessed outside of the current function, allowing it to be promoted to a simple SSA value instead of a memory location.

      @@ -339,12 +361,12 @@ -

      ...because the definition of %x does not dominate all of its - uses. The LLVM infrastructure provides a verification pass that may be used - to verify that an LLVM module is well formed. This pass is automatically run - by the parser after parsing input assembly and by the optimizer before it - outputs bitcode. The violations pointed out by the verifier pass indicate - bugs in transformation passes or input to the parser.

      +

      because the definition of %x does not dominate all of its uses. The + LLVM infrastructure provides a verification pass that may be used to verify + that an LLVM module is well formed. This pass is automatically run by the + parser after parsing input assembly and by the optimizer before it outputs + bitcode. The violations pointed out by the verifier pass indicate bugs in + transformation passes or input to the parser.

      @@ -418,8 +440,8 @@
      -add i32 %X, %X           ; yields {i32}:%0
      -add i32 %0, %0           ; yields {i32}:%1
      +%0 = add i32 %X, %X           ; yields {i32}:%0
      +%1 = add i32 %0, %0           ; yields {i32}:%1
       %result = add i32 %1, %1
       
      @@ -437,7 +459,7 @@
    8. Unnamed temporaries are numbered sequentially
    -

    ...and it also shows a convention that we follow in this document. When +

    It also shows a convention that we follow in this document. When demonstrating instructions, we will follow an instruction with a comment that defines the type and name of value produced. Comments are shown in italic text.

    @@ -462,24 +484,21 @@ the "hello world" module:

    -
    ; Declare the string constant as a global constant...
    -@.LC0 = internal constant [13 x i8] c"hello world\0A\00"          ; [13 x i8]*
    +
    +; Declare the string constant as a global constant.
    +@.LC0 = internal constant [13 x i8] c"hello world\0A\00"    ; [13 x i8]*
     
     ; External declaration of the puts function
    -declare i32 @puts(i8 *)                                           ; i32(i8 *)* 
    +declare i32 @puts(i8 *)                                     ; i32(i8 *)* 
     
     ; Definition of main function
    -define i32 @main() {                                              ; i32()* 
    -        ; Convert [13 x i8]* to i8  *...
    -        %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0   ; i8 *
    +define i32 @main() {                                        ; i32()* 
    +  ; Convert [13 x i8]* to i8  *...
    +  %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0   ; i8 *
     
    -        ; Call puts function to write out the string to stdout...
    -        call i32 @puts(i8 * %cast210)                             ; i32
    -        ret i32 0
    }
    +  ; Call puts function to write out the string to stdout.
    +  call i32 @puts(i8 * %cast210)                             ; i32
    +  ret i32 0
    }
    @@ -507,8 +526,7 @@ define i32 @main() { ; i32()*
    -
    private:
    - +
    private
    Global values with private linkage are only directly accessible by objects in the current module. In particular, linking code into a module with a private global value may cause the private to be renamed as necessary to @@ -516,19 +534,20 @@ define i32 @main() { ; i32()* -
    linker_private:
    - +
    linker_private
    Similar to private, but the symbol is passed through the assembler and - removed by the linker after evaluation.
    - -
    internal:
    + removed by the linker after evaluation. Note that (unlike private + symbols) linker_private symbols are subject to coalescing by the linker: + weak symbols get merged and redefinitions are rejected. However, unlike + normal strong symbols, they are removed by the linker from the final + linked image (executable or dynamic library). +
    internal
    Similar to private, but the value shows as a local symbol (STB_LOCAL in the case of ELF) in the object file. This corresponds to the notion of the 'static' keyword in C.
    -
    available_externally:
    - +
    available_externally
    Globals with "available_externally" linkage are never emitted into the object file corresponding to the LLVM module. They exist to allow inlining and other optimizations to take place given knowledge of @@ -537,47 +556,45 @@ define i32 @main() { ; i32()* linkonce_odr. This linkage type is only allowed on definitions, not declarations.
    -
    linkonce:
    - +
    linkonce
    Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded.
    -
    common:
    - -
    "common" linkage is exactly the same as linkonce - linkage, except that unreferenced common globals may not be - discarded. This is used for globals that may be emitted in multiple - translation units, but that are not guaranteed to be emitted into every - translation unit that uses them. One example of this is tentative - definitions in C, such as "int X;" at global scope.
    - -
    weak:
    - -
    "weak" linkage is the same as common linkage, except - that some targets may choose to emit different assembly sequences for them - for target-dependent reasons. This is used for globals that are declared - "weak" in C source code.
    - -
    appending:
    - +
    weak
    +
    "weak" linkage has the same merging semantics as + linkonce linkage, except that unreferenced globals with + weak linkage may not be discarded. This is used for globals that + are declared "weak" in C source code.
    + +
    common
    +
    "common" linkage is most similar to "weak" linkage, but + it is used for tentative definitions in C, such as "int X;" at + global scope. + Symbols with "common" linkage are merged in the same way as + weak symbols, and they may not be deleted if unreferenced. + common symbols may not have an explicit section, + must have a zero initializer, and may not be marked 'constant'. Functions and aliases may not + have common linkage.
    + + +
    appending
    "appending" linkage may only be applied to global variables of pointer to array type. When two global variables with appending linkage are linked together, the two global arrays are appended together. This is the LLVM, typesafe, equivalent of having the system linker append together "sections" with identical names when .o files are linked.
    -
    extern_weak:
    - +
    extern_weak
    The semantics of this linkage follow the ELF object file model: the symbol is weak until linked; if it is not linked, the symbol becomes null instead of being an undefined reference.
    -
    linkonce_odr:
    -
    weak_odr:
    - +
    linkonce_odr
    +
    weak_odr
    Some languages allow differing globals to be merged, such as two functions with different semantics. Other languages, such as C++, ensure that only equivalent globals are ever merged (the "one definition rule" - @@ -587,7 +604,6 @@ define i32 @main() { ; i32()* odr versions.
    externally visible:
    -
    If none of the above identifiers are used, the global is externally visible, meaning that it participates in linkage and can be used to resolve external symbol references.
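    As a rough illustration of the linkage types described above, the following sketch declares a few globals and one function with different linkages. All names are hypothetical and chosen only for this example; the C analogies in the comments follow the descriptions given in the list.

```llvm
; Hypothetical names throughout; a sketch, not taken from the manual.

; Local to this module, like 'static int file_local;' in C.
@file_local = internal global i32 0

; Tentative definition, like 'int tentative;' at global scope in C.
; common globals must have a zero initializer.
@tentative = common global i32 0

; Merged with same-named definitions at link time; not discarded
; when unreferenced.
@overridable = weak global i32 1

; Becomes null if no definition is linked in.
@maybe_absent = extern_weak global i32

; May be merged with equivalent definitions from other modules and
; discarded if unreferenced.
define linkonce_odr i32 @inline_helper(i32 %x) {
  %r = add i32 %x, 1
  ret i32 %r
}

; No linkage keyword: externally visible.
define i32 @entry_point() {
  %v = call i32 @inline_helper(i32 41)
  ret i32 %v
}
```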
    @@ -598,16 +614,14 @@ define i32 @main() { ; i32()*
    -
    dllimport:
    - +
    dllimport
    "dllimport" linkage causes the compiler to reference a function or variable via a global pointer to a pointer that is set up by the DLL exporting the symbol. On Microsoft Windows targets, the pointer name is formed by combining __imp_ and the function or variable name.
    -
    dllexport:
    - +
    dllexport
    "dllexport" linkage causes the compiler to provide a global pointer to a pointer in a DLL, so that it can be referenced with the dllimport attribute. On Microsoft Windows targets, the pointer @@ -646,7 +660,6 @@ define i32 @main() { ; i32()*
    "ccc" - The C calling convention:
    -
    This calling convention (the default if no other calling convention is specified) matches the target C calling conventions. This calling convention supports varargs function calls and tolerates some mismatch in @@ -654,7 +667,6 @@ define i32 @main() { ; i32()*
    "fastcc" - The fast calling convention:
    -
    This calling convention attempts to make calls as fast as possible (e.g. by passing things in registers). This calling convention allows the target to use whatever tricks it wants to produce fast code for the @@ -666,7 +678,6 @@ define i32 @main() { ; i32()*
    "coldcc" - The cold calling convention:
    -
    This calling convention attempts to make code in the caller as efficient as possible under the assumption that the call is not commonly executed. As such, these calls often preserve all registers so that the call does @@ -675,7 +686,6 @@ define i32 @main() { ; i32()*
    "cc <n>" - Numbered convention:
    -
    Any calling convention may be specified by number, allowing target-specific calling conventions to be used. Target specific calling conventions start at 64.
    @@ -699,7 +709,6 @@ define i32 @main() { ; i32()*
    "default" - Default style:
    -
    On targets that use the ELF object file format, default visibility means that the declaration is visible to other modules and, in shared libraries, means that the declared entity may be overridden. On Darwin, default @@ -707,7 +716,6 @@ define i32 @main() { ; i32()*
    "hidden" - Hidden style:
    -
    Two declarations of an object with hidden visibility refer to the same object if they are in the same shared object. Usually, hidden visibility indicates that the symbol will not be placed into the dynamic symbol @@ -715,7 +723,6 @@ define i32 @main() { ; i32()*
    "protected" - Protected style:
    -
    On ELF, protected visibility indicates that the symbol will be placed in the dynamic symbol table, but that references within the defining module will bind to the local symbol. That is, the symbol cannot be overridden by @@ -836,7 +843,7 @@ define i32 @main() { ; i32()* LLVM function declarations consist of the "declare" keyword, an optional linkage type, an optional - visibility style, an optional + visibility style, an optional calling convention, a return type, an optional parameter attribute for the return type, a function name, a possibly empty list of arguments, an optional alignment, and an @@ -863,8 +870,7 @@ define i32 @main() { ; i32()* -
    Syntax:
    - +
    Syntax:
     define [linkage] [visibility]
    @@ -889,8 +895,7 @@ define [linkage] [visibility]
        may have an optional linkage type, and an
        optional visibility style.

    -
    Syntax:
    - +
    Syntax:
     @<Name> = alias [Linkage] [Visibility] <AliaseeTy> @<Aliasee>
    @@ -929,28 +934,24 @@ declare signext i8 @returns_signed_char()
     

    Currently, only the following parameter attributes are defined:

    -
    zeroext
    - +
    zeroext
    This indicates to the code generator that the parameter or return value should be zero-extended to a 32-bit value by the caller (for a parameter) or the callee (for a return value).
    -
    signext
    - +
    signext
    This indicates to the code generator that the parameter or return value should be sign-extended to a 32-bit value by the caller (for a parameter) or the callee (for a return value).
    -
    inreg
    - +
    inreg
    This indicates that this parameter or return value should be treated in a special target-dependent fashion while emitting code for a function call or return (usually, by putting it in a register as opposed to memory, though some targets use it to distinguish between two different kinds of registers). Use of this attribute is target-specific.
    -
    byval
    - +
    byval
    This indicates that the pointer parameter should really be passed by value to the function. The attribute implies that a hidden copy of the pointee is made between the caller and the callee, so the callee is unable to @@ -965,8 +966,7 @@ declare signext i8 @returns_signed_char() generator that usually indicates a desired alignment for the synthesized stack slot.
    -
    sret
    - +
    sret
    This indicates that the pointer parameter specifies the address of a structure that is the return value of the function in the source program. This pointer must be guaranteed by the caller to be valid: loads and @@ -974,8 +974,7 @@ declare signext i8 @returns_signed_char() may only be applied to the first parameter. This is not a valid attribute for return values.
    -
    noalias
    - +
    noalias
    This indicates that the pointer does not alias any global or any other parameter. The caller is responsible for ensuring that this is the case. On a function return value, noalias additionally indicates @@ -985,14 +984,12 @@ declare signext i8 @returns_signed_char() alias analysis.
    -
    nocapture
    - +
    nocapture
    This indicates that the callee does not make any copies of the pointer that outlive the callee itself. This is not a valid attribute for return values.
    -
    nest
    - +
    nest
    This indicates that the pointer parameter can be excised using the trampoline intrinsics. This is not a valid attribute for return values.
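    The parameter attributes above might be combined as in the following sketch. It assumes the typed-pointer syntax used elsewhere in this revision of the manual, and every function and type name is hypothetical.

```llvm
; Hypothetical declarations illustrating parameter attributes.
%struct.pair = type { i32, i32 }

; sret: the first parameter points at storage for the returned struct.
declare void @make_pair(%struct.pair* sret %out, i32 %a, i32 %b)

; byval: the callee receives a hidden copy of the pointee.
declare void @consume_pair(%struct.pair* byval %p)

; noalias on a return value, as an allocator-like function might use.
declare noalias i8* @my_alloc(i32 %size)

; nocapture: the callee keeps no copy of %buf that outlives the call.
declare void @inspect(i8* nocapture %buf)

; zeroext / signext on a parameter and on a return value.
declare signext i16 @narrow(i8 zeroext %c)
```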
    @@ -1012,7 +1009,7 @@ declare signext i8 @returns_signed_char()
    -define void @f() gc "name" { ...
    +define void @f() gc "name" { ... }
     
    @@ -1042,43 +1039,42 @@ define void @f() gc "name" { ... define void @f() noinline { ... } define void @f() alwaysinline { ... } define void @f() alwaysinline optsize { ... } -define void @f() optsize +define void @f() optsize { ... }
    -
    alwaysinline
    - +
    alwaysinline
    This attribute indicates that the inliner should attempt to inline this function into callers whenever possible, ignoring any active inlining size threshold for this caller.
    -
    noinline
    +
    inlinehint
    +
    This attribute indicates that the source code contained a hint that inlining + this function is desirable (such as the "inline" keyword in C/C++). It + is just a hint; it imposes no requirements on the inliner.
    +
    noinline
    This attribute indicates that the inliner should never inline this function in any situation. This attribute may not be used together with the alwaysinline attribute.
    -
    optsize
    - +
    optsize
    This attribute suggests that optimization passes and code generator passes make choices that keep the code size of this function low, and otherwise do optimizations specifically to reduce code size.
    -
    noreturn
    - +
    noreturn
    This function attribute indicates that the function never returns normally. This produces undefined behavior at runtime if the function ever does dynamically return.
    -
    nounwind
    - +
    nounwind
    This function attribute indicates that the function never returns with an unwind or exceptional control flow. If the function does unwind, its runtime behavior is undefined.
    -
    readnone
    - +
    readnone
    This attribute indicates that the function computes its result (or decides to unwind an exception) based strictly on its arguments, without dereferencing any pointer arguments or otherwise accessing any mutable @@ -1089,8 +1085,7 @@ define void @f() optsize exceptions by calling the C++ exception throwing methods, but could use the unwind instruction.
    -
    readonly
    - +
    readonly
    This attribute indicates that the function does not write through any pointer arguments (including byval arguments) or otherwise modify any state (e.g. memory, control registers, @@ -1101,8 +1096,7 @@ define void @f() optsize exception by calling the C++ exception throwing methods, but may use the unwind instruction.
    -
    ssp
    - +
    ssp
    This attribute indicates that the function should emit a stack smashing protector. It is in the form of a "canary"—a random value placed on the stack before the local variables that's checked upon return from the @@ -1113,28 +1107,24 @@ define void @f() optsize function that doesn't have an ssp attribute, then the resulting function will have an ssp attribute.
    -
    sspreq
    - +
    sspreq
    This attribute indicates that the function should always emit a stack smashing protector. This overrides - the ssp function attribute. - - If a function that has an sspreq attribute is inlined into a - function that doesn't have an sspreq attribute or which has - an ssp attribute, then the resulting function will have - an sspreq attribute.
    - -
    noredzone
    + the ssp function attribute.
    +
    + If a function that has an sspreq attribute is inlined into a + function that doesn't have an sspreq attribute or which has + an ssp attribute, then the resulting function will have + an sspreq attribute.
    +
    noredzone
    This attribute indicates that the code generator should not use a red zone, even if the target-specific ABI normally permits it.
    -
    noimplicitfloat
    - +
    noimplicitfloat
    This attribute disables implicit floating point instructions.
    -
    naked
    - +
    naked
    This attribute disables prologue / epilogue emission for the function. This can have very system-specific consequences.
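    As a small sketch of how several of these function attributes could appear together, consider the following; the function names are hypothetical, and the attribute placement follows the `define void @f() noinline { ... }` form shown earlier.

```llvm
; Hypothetical functions illustrating function attributes.

; Computes its result purely from its argument and never unwinds.
define i32 @square(i32 %x) nounwind readnone alwaysinline {
  %r = mul i32 %x, %x
  ret i32 %r
}

; Declared as never returning normally.
declare void @fatal_error() noreturn nounwind

; Ask for small code and a stack protector; forbid inlining.
define void @rarely_called() noinline optsize ssp {
  ret void
}
```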
    @@ -1193,48 +1183,47 @@ target datalayout = "layout specification"
    E
    -
    Specifies that the target lays out data in big-endian form. That is, the bits with the most significance have the lowest address location.
    e
    -
    Specifies that the target lays out data in little-endian form. That is, the bits with the least significance have the lowest address location.
    p:size:abi:pref
    - -
    This specifies the size of a pointer and its abi and +
    This specifies the size of a pointer and its abi and preferred alignments. All sizes are in bits. Specifying the pref alignment is optional. If omitted, the preceding : should be omitted too.
    isize:abi:pref
    -
    This specifies the alignment for an integer type of a given bit size. The value of size must be in the range [1,2^23).
    vsize:abi:pref
    - -
    This specifies the alignment for a vector type of a given bit +
    This specifies the alignment for a vector type of a given bit size.
    fsize:abi:pref
    - -
    This specifies the alignment for a floating point type of a given bit +
    This specifies the alignment for a floating point type of a given bit size. The value of size must be either 32 (float) or 64 (double).
    asize:abi:pref
    -
    This specifies the alignment for an aggregate type of a given bit size.
    ssize:abi:pref
    -
    This specifies the alignment for a stack object of a given bit size.
    + +
    nsize1:size2:size3...
    +
    This specifies a set of native integer widths for the target CPU + in bits. For example, it might contain "n32" for 32-bit PowerPC, + "n32:64" for PowerPC 64, or "n8:16:32:64" for X86-64. Elements of + this set are considered to support most general arithmetic + operations efficiently.
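    Putting the specifications above together, a layout string for a little-endian 64-bit target might look like the following. The particular alignment values are illustrative assumptions, not the documented layout of any real target.

```llvm
; Illustrative only: little-endian, 64-bit pointers with 64-bit ABI and
; preferred alignment, natural alignment for i32/i64/float/double, and
; native integer widths of 8, 16, 32 and 64 bits.
target datalayout = "e-p:64:64:64-i32:32:32-i64:64:64-f32:32:32-f64:64:64-n8:16:32:64"
```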

    When constructing the data layout for a given target, LLVM starts with a @@ -1282,6 +1271,58 @@ target datalayout = "layout specification" + +

    + +
    + +

    Any memory access must be done through a pointer value associated +with an address range of the memory access, otherwise the behavior +is undefined. Pointer values are associated with address ranges +according to the following rules:

    + +
      +
    • A pointer value formed from a + getelementptr instruction + is associated with the addresses associated with the first operand + of the getelementptr.
    • +
    • An address of a global variable is associated with the address + range of the variable's storage.
    • +
    • The result value of an allocation instruction is associated with + the address range of the allocated storage.
    • +
    • A null pointer in the default address-space is associated with + no address.
    • +
    • A pointer value formed by an + inttoptr is associated with all + address ranges of all pointer values that contribute (directly or + indirectly) to the computation of the pointer's value.
    • +
    • The result value of a + bitcast is associated with all + addresses associated with the operand of the bitcast.
    • +
    • An integer constant other than zero or a pointer value returned + from a function not defined within LLVM may be associated with address + ranges allocated through mechanisms other than those provided by + LLVM. Such ranges shall not overlap with any ranges of addresses + allocated by mechanisms provided by LLVM.
    • +
    + +
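    The association rules listed above can be traced through a short function. This is a sketch in the typed-pointer syntax of this revision; the global and function names are hypothetical.

```llvm
; Hypothetical example tracing which address ranges each pointer is
; associated with under the rules above.
@table = global [4 x i32] zeroinitializer

define i32 @second_element() {
  ; getelementptr: associated with the addresses of its first operand,
  ; i.e. the storage of @table.
  %p = getelementptr [4 x i32]* @table, i64 0, i64 1

  ; bitcast: associated with the same addresses as %p.
  %q = bitcast i32* %p to i8*

  ; inttoptr: associated with the ranges of every pointer that
  ; contributed to the integer, here %q (and therefore @table).
  %i = ptrtoint i8* %q to i64
  %r = inttoptr i64 %i to i32*

  ; The access goes through a pointer associated with @table's
  ; storage, so it is well defined.
  %v = load i32* %r
  ret i32 %v
}
```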

    LLVM IR does not associate types with memory. The result type of a +load merely indicates the size and +alignment of the memory from which to load, as well as the +interpretation of the value. The first operand of a +store similarly only indicates the size +and alignment of the store.

    + +

    Consequently, type-based alias analysis, aka TBAA, aka +-fstrict-aliasing, is not applicable to general unadorned +LLVM IR. Metadata may be used to encode +additional information which specialized optimization passes may use +to implement type-based alias analysis.
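    Because memory itself is untyped, storing a value with one type and loading it back with another is permitted in unadorned IR. A minimal sketch, with a hypothetical function name:

```llvm
; Hypothetical example: reinterpret the bits of a float as an i32 by
; storing with one type and loading with another. The load's result type
; only determines the size, alignment and interpretation of the access.
define i32 @float_bits(float %f) {
  %slot = alloca float
  store float %f, float* %slot
  %as_int = bitcast float* %slot to i32*
  %bits = load i32* %as_int
  ret i32 %bits
}
```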

    + +
    + @@ -1353,7 +1394,7 @@ Classifications

    The first class types are perhaps the most important. Values of these types are the only ones which can be produced by - instructions, passed as arguments, or used as operands to instructions.

    + instructions.

    @@ -1367,6 +1408,42 @@ Classifications + + + +
    + +
    Overview:
    +

    The integer type is a very simple type that simply specifies an arbitrary + bit width for the integer type desired. Any bit width from 1 bit to + 2^23-1 (about 8 million) can be specified.

    + +
    Syntax:
    +
    +  iN
    +
    + +

    The number of bits the integer will occupy is specified by the N + value.

    + +
    Examples:
    + + + + + + + + + + + + + +
    i1a single-bit integer.
    i32a 32-bit integer.
    i1942652a really big integer of over 1 million bits.
    + +
    + @@ -1389,44 +1466,47 @@ Classifications
    +
    Overview:

    The void type does not represent any value and has no size.

    Syntax:
    -
       void
     
    +
    +
    Overview:

    The label type represents code labels.

    Syntax:
    -
       label
     
    +
    +
    Overview:
    -

    The metadata type represents embedded metadata. The only derived type that - may contain metadata is metadata* or a function type that returns or - takes metadata typed parameters, but not pointer to metadata types.

    +

    The metadata type represents embedded metadata. No derived types may be + created from metadata except for function + arguments.

    Syntax:
    -
       metadata
     
    +
    @@ -1437,50 +1517,10 @@ Classifications

    The real power in LLVM comes from the derived types in the system. This is what allows a programmer to represent arrays, functions, pointers, and other - useful types. Note that these derived types may be recursive: For example, - it is possible to have a two dimensional array.

    - - - - - - -
    - -
    Overview:
    -

    The integer type is a very simple derived type that simply specifies an - arbitrary bit width for the integer type desired. Any bit width from 1 bit to - 2^23-1 (about 8 million) can be specified.

    - -
    Syntax:
    - -
    -  iN
    -
    - -

    The number of bits the integer will occupy is specified by the N - value.

    - -
    Examples:
    - - - - - - - - - - - - - -
    i1a single-bit integer.
    i32a 32-bit integer.
    i1942652a really big integer of over 1 million bits.
    - -

    Note that the code generator does not yet support large integer types to be - used as function return types. The specific limit on how large a return type - the code generator can currently handle is target-dependent; currently it's - often 64 bits for 32-bit targets and 128 bits for 64-bit targets.

    + useful types. Each of these types contains one or more element types which + may be a primitive type, or another derived type. For example, it is + possible to have a two dimensional array, using an array as the element type + of another array.

    @@ -1495,7 +1535,6 @@ Classifications and an underlying data type.

    Syntax:
    -
       [<# elements> x <elementtype>]
     
    @@ -1534,17 +1573,12 @@ Classifications -

    Note that 'variable sized arrays' can be implemented in LLVM with a zero - length array. Normally, accesses past the end of an array are undefined in - LLVM (e.g. it is illegal to access the 5th element of a 3 element array). As - a special case, however, zero length arrays are recognized to be variable - length. This allows implementation of 'pascal style arrays' with the LLVM - type "{ i32, [0 x float]}", for example.

    - -

    Note that the code generator does not yet support large aggregate types to be - used as function return types. The specific limit on how large an aggregate - return type the code generator can currently handle is target-dependent, and - also dependent on the aggregate element types.

    +

    There is no restriction on indexing beyond the end of the array implied by + a static type (though there are restrictions on indexing beyond the bounds + of an allocated object in some cases). This means that single-dimension + 'variable sized array' addressing can be implemented in LLVM with a zero + length array type. An implementation of 'pascal style arrays' in LLVM could + use the type "{ i32, [0 x float]}", for example.
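    A 'pascal style array' built on the type mentioned above could be indexed as in the following sketch. The type and function names are hypothetical, and the caller is assumed to have allocated enough storage after the length field.

```llvm
; Hypothetical example: a length-prefixed array using a zero length
; array type as the trailing member.
%pascal_array = type { i32, [0 x float] }

define float @get_element(%pascal_array* %a, i64 %index) {
  ; Field 1 is the [0 x float] member; indexing past its static bound
  ; is allowed by the type, provided the access stays inside the
  ; allocated object.
  %elt = getelementptr %pascal_array* %a, i64 0, i32 1, i64 %index
  %v = load float* %elt
  ret float %v
}
```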

    @@ -1561,9 +1595,8 @@ Classifications and the struct must have at least one element.

    Syntax:
    -
    -  <returntype list> (<parameter list>)
    +  <returntype> (<parameter list>)
     

    ...where '<parameter list>' is a comma-separated list of type @@ -1571,8 +1604,8 @@ Classifications which indicates that the function takes a variable number of arguments. Variable argument functions can access their arguments with the variable argument handling intrinsic - functions. '<returntype list>' is a comma-separated list of - first class type specifiers.

    + functions. '<returntype>' is any type except + label.

    Examples:
    @@ -1583,22 +1616,22 @@ Classifications - - -
    float (i16 signext, i32 *) * Pointer to a function that takes - an i16 that should be sign extended and a - pointer to i32, returning + Pointer to a function that takes + an i16 that should be sign extended and a + pointer to i32, returning float.
    i32 (i8*, ...)A vararg function that takes at least one - pointer to i8 (char in C), - which returns an integer. This is the signature for printf in + A vararg function that takes at least one + pointer to i8 (char in C), + which returns an integer. This is the signature for printf in LLVM.
    {i32, i32} (i32)A function taking an i32, returning two - i32 values as an aggregate of type { i32, i32 } + A function taking an i32, returning a + structure containing two i32 values
    @@ -1621,8 +1654,9 @@ Classifications the 'getelementptr' instruction.

    Syntax:
    - -
      { <type list> }
    +
    +  { <type list> }
    +
    Examples:
    @@ -1638,11 +1672,6 @@ Classifications
    -

    Note that the code generator does not yet support large aggregate types to be - used as function return types. The specific limit on how large an aggregate - return type the code generator can currently handle is target-dependent, and - also dependent on the aggregate element types.

    - @@ -1662,8 +1691,9 @@ Classifications the 'getelementptr' instruction.

    Syntax:
    - -
      < { <type list> } > 
    +
    +  < { <type list> } >
    +
    Examples:
    @@ -1697,8 +1727,9 @@ Classifications permit pointers to labels (label*). Use i8* instead.

    Syntax:
    - -
      <type> *
    +
    +  <type> *
    +
    Examples:
    @@ -1731,12 +1762,10 @@ Classifications

    A vector type is a simple derived type that represents a vector of elements. Vector types are used when multiple primitive data are operated on in parallel using a single instruction (SIMD). A vector type requires a size (number of - elements) and an underlying primitive data type. Vectors must have a power - of two length (1, 2, 4, 8, 16 ...). Vector types are considered + elements) and an underlying primitive data type. Vector types are considered first class.

    Syntax:
    -
       < <# elements> x <elementtype> >
     
    @@ -1745,7 +1774,6 @@ Classifications integer or floating point type.

    Examples:
    -
    @@ -1761,11 +1789,6 @@ Classifications
    <4 x i32>
    -

    Note that the code generator does not yet support large vector types to be - used as function return types. The specific limit on how large a vector - return type codegen can currently handle is target-dependent; currently it's - often a few times longer than a hardware vector register.

    - @@ -1779,13 +1802,11 @@ Classifications a structure type).

    Syntax:
    -
       opaque
     
    Examples:
    - @@ -1822,7 +1843,6 @@ Classifications in llvm IR).

    Syntax:
    -
        \<level>
     
    @@ -1830,7 +1850,6 @@ Classifications

    The level is the count of the lexical type that is being referred to.

    Examples:
    -
    opaque
    @@ -1863,18 +1882,15 @@ Classifications
    Boolean constants
    -
    The two strings 'true' and 'false' are both valid - constants of the i1 type.
    + constants of the i1 type.
    Integer constants
    -
    Standard integers (such as '4') are constants of the integer type. Negative numbers may be used with integer types.
    Floating point constants
    -
    Floating point constants use standard decimal notation (e.g. 123.421), exponential notation (e.g. 1.23421e+2), or a more precise hexadecimal notation (see below). The assembler requires the exact decimal value of a @@ -1883,7 +1899,6 @@ Classifications constants must have a floating point type.
    Null pointer constants
    -
    The identifier 'null' is recognized as a null pointer constant and must be of pointer type.
    @@ -1916,8 +1931,8 @@ Classifications
    @@ -1927,7 +1942,6 @@ Classifications
    Structure constants
    -
    Structure constants are represented with notation similar to structure type definitions (a comma separated list of elements, surrounded by braces ({})). For example: "{ i32 4, float 17.0, i32* @G }", @@ -1937,7 +1951,6 @@ Classifications type.
    Array constants
    -
    Array constants are represented with notation similar to array type definitions (a comma separated list of elements, surrounded by square brackets ([])). For example: "[ i32 42, i32 11, i32 74 @@ -1946,7 +1959,6 @@ Classifications type.
    Vector constants
    -
    Vector constants are represented with notation similar to vector type definitions (a comma separated list of elements, surrounded by less-than/greater-than's (<>)). For example: "< i32 @@ -1955,7 +1967,6 @@ Classifications elements must match those specified by the type.
    Zero initialization
    -
    The string 'zeroinitializer' can be used to zero initialize a value of any type, including scalar and aggregate types. This is often used to avoid having to print large zero initializers @@ -1963,7 +1974,6 @@ Classifications zero initializers.
    Metadata node
    -
    A metadata node is a structure-like constant with metadata type. For example: "metadata !{ i32 0, metadata !"test" }". Unlike other constants that are meant to @@ -2001,15 +2011,179 @@ Classifications
    -

    The string 'undef' is recognized as a type-less constant that has no - specific value. Undefined values may be of any type and be used anywhere a - constant is permitted.

    +

    The string 'undef' can be used anywhere a constant is expected, and + indicates that the user of the value may receive an unspecified bit-pattern. + Undefined values may be of any type (other than label or void) and be used + anywhere a constant is permitted.

    + +

    Undefined values are useful because they indicate to the compiler that the + program is well defined no matter what value is used. This gives the + compiler more freedom to optimize. Here are some examples of (potentially + surprising) transformations that are valid (in pseudo IR):

    + -

    Undefined values indicate to the compiler that the program is well defined no - matter what value is used, giving the compiler more freedom to optimize.

    +
    +
    +  %A = add %X, undef
    +  %B = sub %X, undef
    +  %C = xor %X, undef
    +Safe:
    +  %A = undef
    +  %B = undef
    +  %C = undef
    +
    +
    + +

    This is safe because all of the output bits are affected by the undef bits. +Any output bit can have a zero or one depending on the input bits.

    + +
    +
    +  %A = or %X, undef
    +  %B = and %X, undef
    +Safe:
    +  %A = -1
    +  %B = 0
    +Unsafe:
    +  %A = undef
    +  %B = undef
    +
    +
    + +

    These logical operations have bits that are not always affected by the input. +For example, if "%X" has a zero bit, then the output of the 'and' operation will +always be a zero, no matter what the corresponding bit from the undef is. As +such, it is unsafe to optimize or assume that the result of the and is undef. +However, it is safe to assume that all bits of the undef could be 0, and +optimize the and to 0. Likewise, it is safe to assume that all the bits of +the undef operand to the or could be set, allowing the or to be folded to +-1.
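The folding rules above can be checked by brute force. The following is a minimal sketch in Python, not LLVM's implementation: it assumes a hypothetical 4-bit integer type and models undef as "any bit-pattern". The names `reachable`, `can_fold_to_undef`, and `can_fold_to_constant` are illustrative only.

```python
WIDTH = 4                      # assumed toy bit width, small enough to enumerate
MASK = (1 << WIDTH) - 1

# Each operation takes a known value x and one choice u for the undef operand.
OPS = {
    "add": lambda x, u: (x + u) & MASK,
    "sub": lambda x, u: (x - u) & MASK,
    "xor": lambda x, u: x ^ u,
    "and": lambda x, u: x & u,
    "or":  lambda x, u: x | u,
}

def reachable(op, x):
    """Set of results op(x, u) can take as u ranges over every bit-pattern."""
    return {OPS[op](x, u) for u in range(MASK + 1)}

def can_fold_to_undef(op):
    """True if, for every x, op(x, undef) can produce every bit-pattern."""
    full = set(range(MASK + 1))
    return all(reachable(op, x) == full for x in range(MASK + 1))

def can_fold_to_constant(op, c):
    """True if, for every x, some choice of undef makes op(x, undef) == c."""
    return all(c in reachable(op, x) for x in range(MASK + 1))
```

Under this model, `add`, `sub`, and `xor` can be folded to undef, while `and` and `or` can only be folded to the constants 0 and -1 (all ones) respectively.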

    + +
    +
    +  %A = select undef, %X, %Y
    +  %B = select undef, 42, %Y
    +  %C = select %X, %Y, undef
    +Safe:
    +  %A = %X     (or %Y)
    +  %B = 42     (or %Y)
    +  %C = %Y
    +Unsafe:
    +  %A = undef
    +  %B = undef
    +  %C = undef
    +
    +
    + +

    This set of examples shows that undefined select (and conditional branch) +conditions can go "either way", but they have to come from one of the two +operands. In the %A example, if %X and %Y were both known to have a clear low +bit, then %A would have to have a cleared low bit. However, in the %C example, +the optimizer is allowed to assume that the undef operand could be the same as +%Y, allowing the whole select to be eliminated.
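The select rules can be modeled the same way. This is a rough sketch under stated assumptions: a hypothetical 4-bit type, with Python's `None` standing in for undef. The helper `select_results` is illustrative and not part of LLVM.

```python
WIDTH = 4                          # assumed toy bit width
ALL = set(range(1 << WIDTH))       # every bit-pattern an undef could take

def select_results(cond, tval, fval):
    """Possible results of `select cond, tval, fval`; None stands for undef."""
    conds = (0, 1) if cond is None else (cond,)
    out = set()
    for c in conds:
        chosen = tval if c else fval
        # An undef operand may be any bit-pattern; a known operand is itself.
        out |= ALL if chosen is None else {chosen}
    return out
```

With an undef condition the result set is exactly the two operands, so folding to undef would be wrong. With an undef operand, the other operand is always among the possible results, so folding to it is valid.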

    + + +
    +
    +  %A = xor undef, undef
    +
    +  %B = undef
    +  %C = xor %B, %B
    +
    +  %D = undef
    +  %E = icmp lt %D, 4
    +  %F = icmp gte %D, 4
    +
    +Safe:
    +  %A = undef
    +  %B = undef
    +  %C = undef
    +  %D = undef
    +  %E = undef
    +  %F = undef
    +
    +
    + +

    This example points out that two undef operands are not necessarily the same. +This can be surprising to people (though it matches C semantics) who +assume that "X^X" is always zero, even if X is undef. This isn't true for a +number of reasons, but the short answer is that an undef "variable" can +arbitrarily change its value over its "live range". This is true because the +"variable" doesn't actually have a live range. Instead, the value is +logically read from arbitrary registers that happen to be around when needed, +so the value is not necessarily consistent over time. In fact, %A and %C need +to have the same semantics or the core LLVM "replace all uses with" concept +would not hold.
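As a small illustration of why "X^X" need not be zero, the sketch below compares the two readings of `xor undef, undef` on an assumed 4-bit type. Both function names are hypothetical.

```python
WIDTH = 4                          # assumed toy bit width
VALUES = range(1 << WIDTH)

def xor_independent_uses():
    """Results of `xor undef, undef` when each use picks its own bit-pattern."""
    return {a ^ b for a in VALUES for b in VALUES}

def xor_single_value():
    """Results if both uses were forced to observe the same bit-pattern."""
    return {a ^ a for a in VALUES}
```

Only the first reading matches the text: every bit-pattern is reachable, so the xor may be folded to undef.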

    + +
    +
    +  %A = fdiv undef, %X
    +  %B = fdiv %X, undef
    +Safe:
    +  %A = undef
    +b: unreachable
    +
    +
    + +

    These examples show the crucial difference between an undefined +value and undefined behavior. An undefined value (like undef) is +allowed to have an arbitrary bit-pattern. This means that the %A operation +can be constant folded to undef because the undef could be an SNaN, and fdiv is +not (currently) defined on SNaN's. However, in the second example, we can make +a more aggressive assumption: because the undef is allowed to be an arbitrary +value, we are allowed to assume that it could be zero. Since a divide by zero +has undefined behavior, we are allowed to assume that the operation +does not execute at all. This allows us to delete the divide and all code after +it: since the undefined operation "can't happen", the optimizer can assume that +it occurs in dead code. +

    + +
    +
    +a:  store undef -> %X
    +b:  store %X -> undef
    +Safe:
    +a: <deleted>
    +b: unreachable
    +
    +
    + +

    These examples reiterate the fdiv example: a store "of" an undefined value +can be assumed to not have any effect; we can assume that the value is +overwritten with bits that happen to match what was already there. However, a +store "to" an undefined location could clobber arbitrary memory; therefore, it +has undefined behavior.

    + + +
    + +

    blockaddress(@function, %block)

    + +

    The 'blockaddress' constant computes the address of the specified + basic block in the specified function, and always has an i8* type. Taking + the address of the entry block is illegal.

    + +

    This value only has defined behavior when used as an operand to the + 'indirectbr' instruction or for comparisons + against null. Pointer equality tests between label addresses are undefined + behavior, though, again, comparison against null is ok, and no label is + equal to the null pointer. This may also be passed around as an opaque + pointer-sized value as long as the bits are not inspected. This allows + ptrtoint and arithmetic to be performed on these values so long as + the original value is reconstituted before the indirectbr.

    + +

    Finally, some targets may provide defined semantics when + using the value as the operand to an inline assembly, but that is target + specific. +

    + +
    + + @@ -2024,36 +2198,30 @@ Classifications
    trunc ( CST to TYPE )
    -
    Truncate a constant to another type. The bit size of CST must be larger than the bit size of TYPE. Both types must be integers.
    zext ( CST to TYPE )
    -
    Zero extend a constant to another type. The bit size of CST must be smaller or equal to the bit size of TYPE. Both types must be integers.
    sext ( CST to TYPE )
    -
    Sign extend a constant to another type. The bit size of CST must be smaller or equal to the bit size of TYPE. Both types must be integers.
    fptrunc ( CST to TYPE )
    -
    Truncate a floating point constant to another floating point type. The size of CST must be larger than the size of TYPE. Both types must be floating point.
    fpext ( CST to TYPE )
    -
    Floating point extend a constant to another type. The size of CST must be smaller or equal to the size of TYPE. Both types must be floating point.
    fptoui ( CST to TYPE )
    -
    Convert a floating point constant to the corresponding unsigned integer constant. TYPE must be a scalar or vector integer type. CST must be of scalar or vector floating point type. Both CST and TYPE must be scalars, @@ -2061,7 +2229,6 @@ Classifications integer type, the results are undefined.
    fptosi ( CST to TYPE )
    -
    Convert a floating point constant to the corresponding signed integer constant. TYPE must be a scalar or vector integer type. CST must be of scalar or vector floating point type. Both CST and TYPE must be scalars, @@ -2069,7 +2236,6 @@ Classifications integer type, the results are undefined.
    uitofp ( CST to TYPE )
    -
    Convert an unsigned integer constant to the corresponding floating point constant. TYPE must be a scalar or vector floating point type. CST must be of scalar or vector integer type. Both CST and TYPE must be scalars, or @@ -2077,7 +2243,6 @@ Classifications floating point type, the results are undefined.
    sitofp ( CST to TYPE )
    -
    Convert a signed integer constant to the corresponding floating point constant. TYPE must be a scalar or vector floating point type. CST must be of scalar or vector integer type. Both CST and TYPE must be scalars, or @@ -2085,61 +2250,51 @@ Classifications floating point type, the results are undefined.
    ptrtoint ( CST to TYPE )
    -
    Convert a pointer typed constant to the corresponding integer constant. TYPE must be an integer type. CST must be of pointer type. The CST value is zero extended, truncated, or unchanged to make it fit in TYPE.
    inttoptr ( CST to TYPE )
    -
    Convert an integer constant to a pointer constant. TYPE must be a pointer type. CST must be of integer type. The CST value is zero extended, truncated, or unchanged to make it fit in a pointer size. This one is really dangerous!
    bitcast ( CST to TYPE )
    -
    Convert a constant, CST, to another TYPE. The constraints of the operands are the same as those for the bitcast instruction.
    getelementptr ( CSTPTR, IDX0, IDX1, ... )
    - +
    getelementptr inbounds ( CSTPTR, IDX0, IDX1, ... )
    Perform the getelementptr operation on constants. As with the getelementptr instruction, the index list may have zero or more indexes, which are required to make sense for the type of "CSTPTR".
    select ( COND, VAL1, VAL2 )
    -
    Perform the select operation on constants.
    icmp COND ( VAL1, VAL2 )
    -
    Performs the icmp operation on constants.
    fcmp COND ( VAL1, VAL2 )
    -
    Performs the fcmp operation on constants.
    extractelement ( VAL, IDX )
    -
    Perform the extractelement operation on constants.
    insertelement ( VAL, ELT, IDX )
    -
    Perform the insertelement operation on constants.
    shufflevector ( VEC1, VEC2, IDXMASK )
    -
    Perform the shufflevector operation on constants.
    OPCODE ( LHS, RHS )
    -
    Perform the specified operation of the LHS and RHS constants. OPCODE may be any of the binary or bitwise binary operations. The constraints @@ -2166,7 +2321,7 @@ Classifications the two digit hex code. For example: "!"test\00"".

    Metadata nodes are represented with notation similar to structure constants - (a comma separated list of elements, surrounded by braces and preceeded by an + (a comma separated list of elements, surrounded by braces and preceded by an exclamation point). For example: "!{ metadata !"test\00", i32 10}".

    @@ -2179,59 +2334,166 @@ Classifications computable. Similarly, the code generator may expect a certain metadata format to be used to express debugging information.

    - + + + + + + + + + +
    + +

    LLVM supports inline assembler expressions (as opposed + to Module-Level Inline Assembly) through the use of + a special value. This value represents the inline assembler as a string + (containing the instructions to emit), a list of operand constraints (stored + as a string), a flag that indicates whether or not the inline asm + expression has side effects, and a flag indicating whether the function + containing the asm needs to align its stack conservatively. An example + inline assembler expression is:

    + +
    +
    +i32 (i32) asm "bswap $0", "=r,r"
    +
    +
    + +

    Inline assembler expressions may only be used as the callee operand of + a call instruction. Thus, typically we + have:

    + +
    +
    +%X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
    +
    +
    + +

    Inline asms with side effects not visible in the constraint list must be + marked as having side effects. This is done through the use of the + 'sideeffect' keyword, like so:

    + +
    +
    +call void asm sideeffect "eieio", ""()
    +
    +
    + +

    In some cases inline asms will contain code that will not work unless the + stack is aligned in some way, such as calls or SSE instructions on x86, + yet will not contain code that does that alignment within the asm. + The compiler should make conservative assumptions about what the asm might + contain and should generate its usual stack alignment code in the prologue + if the 'alignstack' keyword is present:

    + +
    +
    +call void asm alignstack "eieio", ""()
    +
    +
    + +

    If both keywords appear, the 'sideeffect' keyword must come + first.

    + +

    TODO: The format of the asm and constraints strings still needs to be + documented here. Constraints on what can be done (e.g. duplication, moving, + etc.) need to be documented. This is probably best done by reference to + another document that covers inline asm from a holistic perspective.

    + +
    + + + + + + +

    LLVM has a number of "magic" global variables that contain data that affect +code generation or other IR semantics. These are documented here. All globals +of this sort should have a section specified as "llvm.metadata". This +section and all globals that start with "llvm." are reserved for use +by LLVM.

    + + + + +
    + +

    The @llvm.used global is an array with i8* element type which has appending linkage. This array contains a list of +pointers to global variables and functions which may optionally have a pointer +cast formed of bitcast or getelementptr. For example, a legal use of it is:

    + +
    +  @X = global i8 4
    +  @Y = global i32 123
    +
    +  @llvm.used = appending global [2 x i8*] [
    +     i8* @X,
    +     i8* bitcast (i32* @Y to i8*)
    +  ], section "llvm.metadata"
    +
    + +

    If a global variable appears in the @llvm.used list, then the +compiler, assembler, and linker are required to treat the symbol as if there is +a reference to the global that it cannot see. For example, if a variable has +internal linkage and no references other than that from the @llvm.used +list, it cannot be deleted. This is commonly used to represent references from +inline asms and other things the compiler cannot "see", and corresponds to +"__attribute__((used))" in GNU C.

    - - - +

    On some targets, the code generator must emit a directive to the assembler or +object file to prevent the assembler and linker from altering or removing the symbol.

    + +
    -

    LLVM supports inline assembler expressions (as opposed - to Module-Level Inline Assembly) through the use of - a special value. This value represents the inline assembler as a string - (containing the instructions to emit), a list of operand constraints (stored - as a string), and a flag that indicates whether or not the inline asm - expression has side effects. An example inline assembler expression is:

    +

    The @llvm.compiler.used directive is the same as the +@llvm.used directive, except that it only prevents the compiler from +touching the symbol. On targets that support it, this allows an intelligent +linker to optimize references to the symbol without being impeded as it would be +by @llvm.used.

    -
    -
    -i32 (i32) asm "bswap $0", "=r,r"
    -
    -
    +

    This construct should only be used in rare circumstances, and +should not be exposed to source languages.

    -

    Inline assembler expressions may only be used as the callee operand of - a call instruction. Thus, typically we - have:

    +
    -
    -
    -%X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
    -
    + + -

    Inline asms with side effects not visible in the constraint list must be - marked as having side effects. This is done through the use of the - 'sideeffect' keyword, like so:

    +
    + +

    TODO: Describe this.

    -
    -
    -call void asm sideeffect "eieio", ""()
    -
    -

    TODO: The format of the asm and constraints string still need to be - documented here. Constraints on what can be done (e.g. duplication, moving, - etc need to be documented). This is probably best done by reference to - another document that covers inline asm from a holistic perspective.

    + + + +
    + +

    TODO: Describe this.

    + @@ -2264,6 +2526,7 @@ Instructions
    'ret' instruction, the 'br' instruction, the 'switch' instruction, the + 'indirectbr' instruction, the 'invoke' instruction, the 'unwind' instruction, and the 'unreachable' instruction.

    @@ -2283,7 +2546,6 @@ Instruction
    Overview:
    -

    The 'ret' instruction is used to return control flow (and optionally a value) from a function back to the caller.

    @@ -2292,7 +2554,6 @@ Instruction occur.

    Arguments:
    -

    The 'ret' instruction optionally accepts a single argument, the return value. The type of the return value must be a 'first class' type.

    @@ -2304,7 +2565,6 @@ Instruction return value.

    Semantics:
    -

    When the 'ret' instruction is executed, control flow returns back to the calling function's context. If the caller is a "call" instruction, execution continues at the @@ -2315,21 +2575,12 @@ Instruction value.

    Example:
    -
       ret i32 5                       ; Return an integer value of 5
       ret void                        ; Return from a void function
       ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
     
    -

    Note that the code generator does not yet fully support large - return values. The specific sizes that are currently supported are - dependent on the target. For integers, on 32-bit targets the limit - is often 64 bits, and on 64-bit targets the limit is often 128 bits. - For aggregate types, the current limits are dependent on the element - types; for example targets are often limited to 2 total integer - elements and 2 total floating-point elements.

    - @@ -2360,8 +2611,16 @@ Instruction control flows to the 'iffalse' label argument.

    Example:
    -
    Test:
    %cond = icmp eq i32 %a, %b
    br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
    ret i32 1
    IfUnequal:
    ret i32 0
    +
    +Test:
    +  %cond = icmp eq i32 %a, %b
    +  br i1 %cond, label %IfEqual, label %IfUnequal
    +IfEqual:
    +  ret i32 1
    +IfUnequal:
    +  ret i32 0
    +
    + @@ -2392,8 +2651,8 @@ Instruction

    The switch instruction specifies a table of values and destinations. When the 'switch' instruction is executed, this table is searched for the given value. If the value is found, control flow is - transfered to the corresponding destination; otherwise, control flow is - transfered to the default destination.

    + transferred to the corresponding destination; otherwise, control flow is + transferred to the default destination.

    Implementation:

    Depending on properties of the target machine and the particular @@ -2418,6 +2677,55 @@ Instruction + + +

    + +
    + +
    Syntax:
    +
    +  indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
    +
    + +
    Overview:
    + +

    The 'indirectbr' instruction implements an indirect branch to a label + within the current function, whose address is specified by + "address". Address must be derived from a blockaddress constant.

    + +
    Arguments:
    + +

    The 'address' argument is the address of the label to jump to. The + rest of the arguments indicate the full set of possible destinations that the + address may point to. Blocks are allowed to occur multiple times in the + destination list, though this isn't particularly useful.

    + +

    This destination list is required so that dataflow analysis has an accurate + understanding of the CFG.

    + +
    Semantics:
    + +

    Control transfers to the block specified in the address argument. All + possible destination blocks must be listed in the label list, otherwise this + instruction has undefined behavior. This implies that jumps to labels + defined in other functions have undefined behavior as well.

    + +
    Implementation:
    + +

    This is typically implemented with a jump through a register.

    + +
    Example:
    +
    + indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
    +
    + +
    + +
    'invoke' Instruction @@ -2578,14 +2886,16 @@ Instruction
    Syntax:
    -  <result> = add <ty> <op1>, <op2>   ; yields {ty}:result
    +  <result> = add <ty> <op1>, <op2>          ; yields {ty}:result
    +  <result> = add nuw <ty> <op1>, <op2>      ; yields {ty}:result
    +  <result> = add nsw <ty> <op1>, <op2>      ; yields {ty}:result
    +  <result> = add nuw nsw <ty> <op1>, <op2>  ; yields {ty}:result
     
    Overview:

    The 'add' instruction returns the sum of its two operands.

    Arguments:
    -

    The two arguments to the 'add' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -2599,6 +2909,11 @@ Instruction

    Because LLVM integers use a two's complement representation, this instruction is appropriate for both signed and unsigned integers.

    +

    nuw and nsw stand for "No Unsigned Wrap" + and "No Signed Wrap", respectively. If the nuw and/or + nsw keywords are present, the result value of the add + is undefined if unsigned and/or signed overflow, respectively, occurs.
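To make the two wrap conditions concrete, here is a minimal Python sketch, not LLVM code, that computes a fixed-width result and reports whether unsigned or signed wrap occurred. The names `to_signed` and `overflow_flags`, and the default width of 32, are assumptions for illustration. The same check applies to the nuw and nsw forms of sub and mul.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def to_signed(value, width):
    """Reinterpret the low `width` bits of value as a two's complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value

def overflow_flags(op, a, b, width=32):
    """Return (result_bits, unsigned_wrap, signed_wrap) for a width-bit op.

    a and b are unsigned bit-patterns in [0, 2**width).
    """
    mask = (1 << width) - 1
    exact_unsigned = OPS[op](a, b)                 # unbounded unsigned result
    exact_signed = OPS[op](to_signed(a, width), to_signed(b, width))
    result = exact_unsigned & mask                 # what the hardware would keep
    unsigned_wrap = exact_unsigned != result       # nuw would be violated
    signed_wrap = exact_signed != to_signed(result, width)  # nsw violated
    return result, unsigned_wrap, signed_wrap
```

When a flag comes back true and the matching keyword is present on the instruction, the result value is undefined according to the text above.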

    +
    Example:
       <result> = add i32 4, %var          ; yields {i32}:result = 4 + %var
    @@ -2645,7 +2960,10 @@ Instruction 
     
     
    Syntax:
    -  <result> = sub <ty> <op1>, <op2>   ; yields {ty}:result
    +  <result> = sub <ty> <op1>, <op2>          ; yields {ty}:result
    +  <result> = sub nuw <ty> <op1>, <op2>      ; yields {ty}:result
    +  <result> = sub nsw <ty> <op1>, <op2>      ; yields {ty}:result
    +  <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields {ty}:result
     
    Overview:
    @@ -2671,6 +2989,11 @@ Instruction

    Because LLVM integers use a two's complement representation, this instruction is appropriate for both signed and unsigned integers.

    +

    nuw and nsw stand for "No Unsigned Wrap" + and "No Signed Wrap", respectively. If the nuw and/or + nsw keywords are present, the result value of the sub + is undefined if unsigned and/or signed overflow, respectively, occurs.

    +
    Example:
       <result> = sub i32 4, %var          ; yields {i32}:result = 4 - %var
    @@ -2700,7 +3023,7 @@ Instruction 
        representations.

    Arguments:
    -

    The two arguments to the 'fsub' instruction must be floating point or vector of floating point values. Both arguments must have identical types.

    @@ -2724,7 +3047,10 @@ Instruction
    Syntax:
    -  <result> = mul <ty> <op1>, <op2>   ; yields {ty}:result
    +  <result> = mul <ty> <op1>, <op2>          ; yields {ty}:result
    +  <result> = mul nuw <ty> <op1>, <op2>      ; yields {ty}:result
    +  <result> = mul nsw <ty> <op1>, <op2>      ; yields {ty}:result
    +  <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields {ty}:result
     
    Overview:
    @@ -2734,7 +3060,7 @@ Instruction

    The two arguments to the 'mul' instruction must be integer or vector of integer values. Both arguments must have identical types.

    - +
    Semantics:

    The value produced is the integer product of the two operands.

    @@ -2749,6 +3075,11 @@ Instruction be sign-extended or zero-extended as appropriate to the width of the full product.

    +

    nuw and nsw stand for "No Unsigned Wrap" + and "No Signed Wrap", respectively. If the nuw and/or + nsw keywords are present, the result value of the mul + is undefined if unsigned and/or signed overflow, respectively, occurs.

    +
    Example:
       <result> = mul i32 4, %var          ; yields {i32}:result = 4 * %var
    @@ -2801,7 +3132,7 @@ Instruction 
     

    The 'udiv' instruction returns the quotient of its two operands.

    Arguments:
    -

    The two arguments to the 'udiv' instruction must be +

    The two arguments to the 'udiv' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -2828,14 +3159,15 @@ Instruction
    Syntax:
    -  <result> = sdiv <ty> <op1>, <op2>   ; yields {ty}:result
    +  <result> = sdiv <ty> <op1>, <op2>         ; yields {ty}:result
    +  <result> = sdiv exact <ty> <op1>, <op2>   ; yields {ty}:result
     
    Overview:

    The 'sdiv' instruction returns the quotient of its two operands.

    Arguments:
    -

    The two arguments to the 'sdiv' instruction must be +

    The two arguments to the 'sdiv' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -2850,6 +3182,10 @@ Instruction undefined behavior; this is a rare case, but can occur, for example, by doing a 32-bit division of -2147483648 by -1.

    +

    If the exact keyword is present, the result value of the + sdiv is undefined if the result would be rounded or if overflow + would occur.
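The condition for 'exact' can be sketched in Python as follows. This is an illustrative model rather than LLVM's implementation; it assumes sdiv rounds toward zero, takes operands as signed Python integers, and uses the hypothetical helper names `sdiv` and `sdiv_exact_is_defined`.

```python
def sdiv(a, b):
    """Signed quotient rounded toward zero (b must be nonzero)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def sdiv_exact_is_defined(a, b, width=32):
    """True if `sdiv exact a, b` has a defined result for signed width-bit a, b."""
    int_min = -(1 << (width - 1))
    if b == 0:
        return False              # division by zero
    if a == int_min and b == -1:
        return False              # quotient does not fit in width bits
    return a % b == 0             # defined only if no rounding is needed
```

For example, 8 / 2 is exact, while 7 / 2 would be rounded and is therefore undefined when 'exact' is present.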

    +
    Example:
       <result> = sdiv i32 4, %var          ; yields {i32}:result = 4 / %var
    @@ -2902,7 +3238,7 @@ Instruction 
        division of its two arguments.

    Arguments:
    -

    The two arguments to the 'urem' instruction must be +

    The two arguments to the 'urem' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -2942,7 +3278,7 @@ Instruction elements must be integers.

    Arguments:
    -

    The two arguments to the 'srem' instruction must be +

    The two arguments to the 'srem' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -3037,7 +3373,7 @@ Instruction

    Both arguments to the 'shl' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.

    - +
    Semantics:

    The value produced is op1 * 2^op2 mod 2^n, where n is the width of the result. If op2 @@ -3073,7 +3409,7 @@ Instruction operand shifted to the right a specified number of bits with zero fill.

    Arguments:
    -

    Both arguments to the 'lshr' instruction must be the same +

    Both arguments to the 'lshr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.

    @@ -3113,7 +3449,7 @@ Instruction extension.

    Arguments:
    -

    Both arguments to the 'ashr' instruction must be the same +

    Both arguments to the 'ashr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.

    @@ -3153,7 +3489,7 @@ Instruction operands.

    Arguments:
    -

    The two arguments to the 'and' instruction must be +

    The two arguments to the 'and' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -3212,7 +3548,7 @@ Instruction two operands.

    Arguments:
    -

    The two arguments to the 'or' instruction must be +

    The two arguments to the 'or' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -3275,7 +3611,7 @@ Instruction complement" operation, which is the "~" operator in C.

    Arguments:
    -

    The two arguments to the 'xor' instruction must be +

    The two arguments to the 'xor' instruction must be integer or vector of integer values. Both arguments must have identical types.

    @@ -3323,7 +3659,7 @@ Instruction -
    + @@ -3369,7 +3705,7 @@ Instruction
    Example:
    -  %result = extractelement <4 x i32> %vec, i32 0    ; yields i32
    +  <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
     
    @@ -3405,7 +3741,7 @@ Instruction
    Example:
    -  %result = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
    +  <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
     
    @@ -3446,20 +3782,20 @@ Instruction
    Example:
    -  %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, 
    +  <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                               <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
    -  %result = shufflevector <4 x i32> %v1, <4 x i32> undef, 
    +  <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
    -  %result = shufflevector <8 x i32> %v1, <8 x i32> undef, 
    +  <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
    -  %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, 
    +  <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                               <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
     
    -
    + @@ -3498,7 +3834,7 @@ Instruction
    Example:
    -  %result = extractvalue {i32, float} %agg, 0    ; yields i32
    +  <result> = extractvalue {i32, float} %agg, 0    ; yields i32
     
    @@ -3537,14 +3873,14 @@ Instruction
    Example:
    -  %result = insertvalue {i32, float} %agg, i32 1, 0    ; yields {i32, float}
    +  <result> = insertvalue {i32, float} %agg, i32 1, 0    ; yields {i32, float}
     
    -
    + @@ -3552,93 +3888,11 @@ Instruction

    A key design point of an SSA-based representation is how it represents memory. In LLVM, no memory locations are in SSA form, which makes things - very simple. This section describes how to read, write, allocate, and free + very simple. This section describes how to read, write, and allocate memory in LLVM.

    - - - -
    - -
    Syntax:
    -
    -  <result> = malloc <type>[, i32 <NumElements>][, align <alignment>]     ; yields {type*}:result
    -
    - -
    Overview:
    -

    The 'malloc' instruction allocates memory from the system heap and - returns a pointer to it. The object is always allocated in the generic - address space (address space zero).

    - -
    Arguments:
    -

    The 'malloc' instruction allocates - sizeof(<type>)*NumElements bytes of memory from the operating - system and returns a pointer of the appropriate type to the program. If - "NumElements" is specified, it is the number of elements allocated, otherwise - "NumElements" is defaulted to be one. If a constant alignment is specified, - the value result of the allocation is guaranteed to be aligned to at least - that boundary. If not specified, or if zero, the target can choose to align - the allocation on any convenient boundary compatible with the type.

    - -

    'type' must be a sized type.

    - -
    Semantics:
    -

    Memory is allocated using the system "malloc" function, and a - pointer is returned. The result of a zero byte allocation is undefined. The - result is null if there is insufficient memory available.

    - -
    Example:
    -
    -  %array  = malloc [4 x i8]                     ; yields {[%4 x i8]*}:array
    -
    -  %size   = add i32 2, 2                        ; yields {i32}:size = i32 4
    -  %array1 = malloc i8, i32 4                    ; yields {i8*}:array1
    -  %array2 = malloc [12 x i8], i32 %size         ; yields {[12 x i8]*}:array2
    -  %array3 = malloc i32, i32 4, align 1024       ; yields {i32*}:array3
    -  %array4 = malloc i32, align 1024              ; yields {i32*}:array4
    -
    - -

    Note that the code generator does not yet respect the alignment value.

    - -
    - - - - -
    - -
    Syntax:
    -
    -  free <type> <value>                           ; yields {void}
    -
    - -
    Overview:
    -

    The 'free' instruction returns memory back to the unused memory heap - to be reallocated in the future.

    - -
    Arguments:
    -

    'value' shall be a pointer value that points to a value that was - allocated with the 'malloc' instruction.

    - -
    Semantics:
    -

    Access to the memory pointed to by the pointer is no longer defined after - this instruction executes. If the pointer is null, the operation is a - noop.

    - -
    Example:
    -
    -  %array  = malloc [4 x i8]                     ; yields {[4 x i8]*}:array
    -            free   [4 x i8]* %array
    -
    - -
    -
    'alloca' Instruction @@ -3803,6 +4057,7 @@ Instruction
    Syntax:
       <result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}*
    +  <result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}*
     
    Overview:
    @@ -3812,7 +4067,7 @@ Instruction
    Arguments:

    The first argument is always a pointer, and forms the basis of the - calculation. The remaining arguments are indices, that indicate which of the + calculation. The remaining arguments are indices that indicate which of the elements of the aggregate object are indexed. The interpretation of each index is dependent on the type being indexed into. The first index always indexes the pointer value given as the first argument, the second index @@ -3824,9 +4079,10 @@ Instruction calculation.

    The type of each index argument depends on the type it is indexing into. - When indexing into a (packed) structure, only i32 integer + When indexing into a (optionally packed) structure, only i32 integer constants are allowed. When indexing into an array, pointer or - vector, integers of any width are allowed (also non-constants).

    + vector, integers of any width are allowed, and they are not required to be + constant.

    For example, let's consider a C code fragment and how it gets compiled to LLVM:

    @@ -3857,7 +4113,7 @@ int *foo(struct ST *s) { %RT = type { i8 , [10 x [20 x i32]], i8 } %ST = type { i32, double, %RT } -define i32* %foo(%ST* %s) { +define i32* @foo(%ST* %s) { entry: %reg = getelementptr %ST* %s, i32 1, i32 2, i32 1, i32 5, i32 13 ret i32* %reg @@ -3881,7 +4137,7 @@ entry: the given testcase is equivalent to:

    -  define i32* %foo(%ST* %s) {
    +  define i32* @foo(%ST* %s) {
         %t1 = getelementptr %ST* %s, i32 1                        ; yields %ST*:%t1
         %t2 = getelementptr %ST* %t1, i32 0, i32 2                ; yields %RT*:%t2
         %t3 = getelementptr %RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
    @@ -3891,12 +4147,22 @@ entry:
       }
     
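To make the equivalence above more concrete, here is a hedged C sketch of the same address computation. The struct definitions are assumed C counterparts of %RT and %ST, and `foo_by_offsets` is a hypothetical helper that adds one byte offset per index, mirroring the step-by-step getelementptr sequence.

```c
#include <stddef.h>

/* Assumed C counterparts of %RT and %ST from the example. */
struct RT { char A; int B[10][20]; char C; };
struct ST { int X; double Y; struct RT Z; };

/* Single-step form, corresponding to one getelementptr with
 * indices 1, 2, 1, 5, 13. */
static int *foo(struct ST *s)
{
    return &s[1].Z.B[5][13];
}

/* Step-by-step form: each index contributes its own byte offset. */
static int *foo_by_offsets(struct ST *s)
{
    char *p = (char *)s;
    p += 1 * sizeof(struct ST);      /* index 1: step over one ST      */
    p += offsetof(struct ST, Z);     /* index 2: third field, Z        */
    p += offsetof(struct RT, B);     /* index 1: second field, B       */
    p += 5 * sizeof(int[20]);        /* index 5: sixth row of B        */
    p += 13 * sizeof(int);           /* index 13: element in that row  */
    return (int *)p;
}
```

Both functions yield the same pointer for any valid `s`, which is the sense in which the two getelementptr forms are equivalent.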
    -

    Note that it is undefined to access an array out of bounds: array and pointer - indexes must always be within the defined bounds of the array type when - accessed with an instruction that dereferences the pointer (e.g. a load or - store instruction). The one exception for this rule is zero length arrays. - These arrays are defined to be accessible as variable length arrays, which - requires access beyond the zero'th element.

    +

    If the inbounds keyword is present, the result value of the + getelementptr is undefined if the base pointer is not an + in bounds address of an allocated object, or if any of the addresses + that would be formed by successive addition of the offsets implied by the + indices to the base address with infinitely precise arithmetic are not an + in bounds address of that allocated object. + The in bounds addresses for an allocated object are all the addresses + that point into the object, plus the address one byte past the end.

    + +

    If the inbounds keyword is not present, the offsets are added to + the base address with silently-wrapping two's complement arithmetic, and + the result value of the getelementptr may be outside the object + pointed to by the base pointer. The result value may not necessarily be + used to access memory though, even if it happens to point into allocated + storage. See the Pointer Aliasing Rules + section for more information.
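The difference between the two forms can be sketched in C by treating addresses as plain 64-bit integers. This is a simplified model under stated assumptions: `gep_wrapping` and `is_in_bounds` are hypothetical names, the 64-bit address width is assumed, and the sketch checks a single address rather than every intermediate address formed with infinitely precise arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Without inbounds: the offset is added with wrapping
 * two's complement arithmetic (unsigned addition modulo 2^64). */
static uint64_t gep_wrapping(uint64_t base, int64_t offset)
{
    return base + (uint64_t)offset;
}

/* The in bounds addresses of an object occupying [obj, obj + size)
 * are all addresses pointing into the object, plus the address
 * one byte past the end. */
static bool is_in_bounds(uint64_t addr, uint64_t obj, uint64_t size)
{
    return addr >= obj && addr - obj <= size;
}
```

A result of `gep_wrapping` that fails `is_in_bounds` is still a well-defined value in the non-inbounds case, but it may not be used to access memory.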

    The getelementptr instruction is often confusing. For some more insight into how it works, see the getelementptr FAQ.

    @@ -3960,7 +4226,7 @@ entry:
       %X = trunc i32 257 to i8              ; yields i8:1
       %Y = trunc i32 123 to i1              ; yields i1:true
    -  %Y = trunc i32 122 to i1              ; yields i1:false
    +  %Z = trunc i32 122 to i1              ; yields i1:false
     
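The three trunc results above follow from keeping only the low-order bits of the operand. A minimal C sketch, with hypothetical helper names chosen for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Keep the low 8 bits: 257 is 0x101, so the result is 1. */
static uint8_t trunc_i32_to_i8(uint32_t v)
{
    return (uint8_t)(v & 0xFFu);
}

/* Keep the low bit: odd values give true, even values give false. */
static bool trunc_i32_to_i1(uint32_t v)
{
    return (v & 1u) != 0;
}
```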
    @@ -3977,15 +4243,15 @@ entry:
    Overview:
    -

    The 'zext' instruction zero extends its operand to type +

    The 'zext' instruction zero extends its operand to type ty2.

    Arguments:
    -

    The 'zext' instruction takes a value to cast, which must be of +

    The 'zext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, which must also be of integer type. The bit size of the - value must be smaller than the bit size of the destination type, + value must be smaller than the bit size of the destination type, ty2.

    Semantics:
    @@ -4017,10 +4283,10 @@ entry:

    The 'sext' sign extends value to the type ty2.

    Arguments:
    -

    The 'sext' instruction takes a value to cast, which must be of +

    The 'sext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, which must also be of integer type. The bit size of the - value must be smaller than the bit size of the destination type, + value must be smaller than the bit size of the destination type, ty2.
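The contrast between zext and sext can be sketched in C for an i8 to i32 conversion. The helper names are hypothetical, and the sign extension is written out arithmetically so that it does not rely on implementation-defined narrowing conversions.

```c
#include <stdint.h>

/* Zero extension: the new high bits are all zero. */
static uint32_t zext_i8_to_i32(uint8_t v)
{
    return (uint32_t)v;
}

/* Sign extension: the new high bits copy the sign bit (bit 7). */
static int32_t sext_i8_to_i32(uint8_t v)
{
    return (v & 0x80u) ? (int32_t)v - 256 : (int32_t)v;
}
```

The same bit pattern 0xFF therefore becomes 255 under zero extension and -1 under sign extension.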

    Semantics:
    @@ -4058,12 +4324,12 @@ entry:

    The 'fptrunc' instruction takes a floating point value to cast and a floating point type to cast it to. The size of value must be larger than the size of - ty2. This implies that fptrunc cannot be used to make a + ty2. This implies that fptrunc cannot be used to make a no-op cast.

    Semantics:

    The 'fptrunc' instruction truncates a value from a larger - floating point type to a smaller + floating point type to a smaller floating point type. If the value cannot fit within the destination type, ty2, then the results are undefined.

    @@ -4092,7 +4358,7 @@ entry: floating point value.

    Arguments:
    -

    The 'fpext' instruction takes a +

    The 'fpext' instruction takes a floating point value to cast, and a floating point type to cast it to. The source type must be smaller than the destination type.

    @@ -4135,7 +4401,7 @@ entry: vector integer type with the same number of elements as ty

    Semantics:
    -

    The 'fptoui' instruction converts its +

    The 'fptoui' instruction converts its floating point operand into the nearest (rounding towards zero) unsigned integer value. If the value cannot fit in ty2, the results are undefined.

    @@ -4144,7 +4410,7 @@ entry:
       %X = fptoui double 123.0 to i32      ; yields i32:123
       %Y = fptoui float 1.0E+300 to i1     ; yields undefined:1
    -  %X = fptoui float 1.04E+17 to i8     ; yields undefined:1
    +  %Z = fptoui float 1.04E+17 to i8     ; yields undefined:1
     
    @@ -4161,7 +4427,7 @@ entry:
    Overview:
    -

    The 'fptosi' instruction converts +

    The 'fptosi' instruction converts floating point value to type ty2.

    @@ -4173,7 +4439,7 @@ entry: vector integer type with the same number of elements as ty

    Semantics:
    -

    The 'fptosi' instruction converts its +

    The 'fptosi' instruction converts its floating point operand into the nearest (rounding towards zero) signed integer value. If the value cannot fit in ty2, the results are undefined.

    @@ -4182,7 +4448,7 @@ entry:
       %X = fptosi double -123.0 to i32      ; yields i32:-123
       %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
    -  %X = fptosi float 1.04E+17 to i8      ; yields undefined:1
    +  %Z = fptosi float 1.04E+17 to i8      ; yields undefined:1
     
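Rounding towards zero, as described for fptoui and fptosi, matches what a C cast from floating point to integer does for in-range values. A small sketch with hypothetical helper names; as in LLVM, out-of-range inputs are undefined and callers are assumed to avoid them.

```c
#include <stdint.h>

/* Signed conversion: the fractional part is discarded,
 * so -123.9 becomes -123 rather than -124. */
static int32_t fptosi_f64_to_i32(double v)
{
    return (int32_t)v;
}

/* Unsigned conversion with the same truncation rule. */
static uint32_t fptoui_f64_to_i32(double v)
{
    return (uint32_t)v;
}
```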
    @@ -4326,8 +4592,8 @@ entry:
    Example:
       %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
    -  %X = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
    -  %Y = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
    +  %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
    +  %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
     
    @@ -4370,7 +4636,7 @@ entry:
       %X = bitcast i8 255 to i8              ; yields i8 :-1
       %Y = bitcast i32* %x to sint*          ; yields sint*:%x
    -  %Z = bitcast <2 x int> %V to i64;      ; yields i64: %V   
    +  %Z = bitcast <2 x int> %V to i64;      ; yields i64: %V
     
    @@ -4426,15 +4692,15 @@ entry:
    Semantics:

    The 'icmp' compares op1 and op2 according to the condition code given as cond. The comparison performed always yields - either an i1 or vector of i1 + either an i1 or vector of i1 result, as follows:

      -
    1. eq: yields true if the operands are equal, +
    2. eq: yields true if the operands are equal, false otherwise. No sign interpretation is necessary or performed.
    3. -
    4. ne: yields true if the operands are unequal, +
    5. ne: yields true if the operands are unequal, false otherwise. No sign interpretation is necessary or performed.
    6. @@ -4503,7 +4769,7 @@ entry: values based on comparison of its operands.

      If the operands are floating point scalars, then the result type is a boolean -(i1).

      +(i1).

      If the operands are floating point vectors, then the result type is a vector of boolean with the same number of elements as the operands being @@ -4545,48 +4811,48 @@ entry:

      The 'fcmp' instruction compares op1 and op2 according to the condition code given as cond. If the operands are vectors, then the vectors are compared element by element. Each comparison - performed always yields an i1 result, as + performed always yields an i1 result, as follows:

      1. false: always yields false, regardless of operands.
      2. -
      3. oeq: yields true if both operands are not a QNAN and +
      4. oeq: yields true if both operands are not a QNAN and op1 is equal to op2.
      5. ogt: yields true if both operands are not a QNAN and op1 is greater than op2.
      6. -
      7. oge: yields true if both operands are not a QNAN and +
      8. oge: yields true if both operands are not a QNAN and op1 is greater than or equal to op2.
      9. -
      10. olt: yields true if both operands are not a QNAN and +
      11. olt: yields true if both operands are not a QNAN and op1 is less than op2.
      12. -
      13. ole: yields true if both operands are not a QNAN and +
      14. ole: yields true if both operands are not a QNAN and op1 is less than or equal to op2.
      15. -
      16. one: yields true if both operands are not a QNAN and +
      17. one: yields true if both operands are not a QNAN and op1 is not equal to op2.
      18. ord: yields true if both operands are not a QNAN.
      19. -
      20. ueq: yields true if either operand is a QNAN or +
      21. ueq: yields true if either operand is a QNAN or op1 is equal to op2.
      22. -
      23. ugt: yields true if either operand is a QNAN or +
      24. ugt: yields true if either operand is a QNAN or op1 is greater than op2.
      25. -
      26. uge: yields true if either operand is a QNAN or +
      27. uge: yields true if either operand is a QNAN or op1 is greater than or equal to op2.
      28. -
      29. ult: yields true if either operand is a QNAN or +
      30. ult: yields true if either operand is a QNAN or op1 is less than op2.
      31. -
      32. ule: yields true if either operand is a QNAN or +
      33. ule: yields true if either operand is a QNAN or op1 is less than or equal to op2.
      34. -
      35. une: yields true if either operand is a QNAN or +
      36. une: yields true if either operand is a QNAN or op1 is not equal to op2.
      37. uno: yields true if either operand is a QNAN.
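The ordered and unordered predicates listed above differ only in how they treat NaN operands. The following C sketch models a few of them on doubles; the function names are hypothetical and simply mirror the condition codes.

```c
#include <math.h>
#include <stdbool.h>

/* ord: neither operand is a NaN.  uno: at least one operand is a NaN. */
static bool fcmp_ord(double a, double b) { return !isnan(a) && !isnan(b); }
static bool fcmp_uno(double a, double b) { return isnan(a) || isnan(b); }

/* Ordered predicates require both operands to be non-NaN. */
static bool fcmp_oeq(double a, double b) { return fcmp_ord(a, b) && a == b; }
static bool fcmp_olt(double a, double b) { return fcmp_ord(a, b) && a < b; }
static bool fcmp_one(double a, double b) { return fcmp_ord(a, b) && a != b; }

/* Unordered predicates also yield true when either operand is a NaN. */
static bool fcmp_ueq(double a, double b) { return fcmp_uno(a, b) || a == b; }
static bool fcmp_ult(double a, double b) { return fcmp_uno(a, b) || a < b; }
static bool fcmp_une(double a, double b) { return fcmp_uno(a, b) || a != b; }
```

With a NaN operand every ordered predicate here is false and every unordered one is true, regardless of the other operand.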
      38. @@ -4780,6 +5046,12 @@ Loop: ; Infinite loop that counts from 0 on up... %ZZ = call zeroext i32 @bar() ; Return value is %zero extended
    +

    LLVM treats calls to some functions with names and arguments that match the +standard C99 library as being the C99 library functions, and may perform +optimizations or generate code for them under that assumption. This is +something we'd like to change in the future to provide better support for +freestanding environments and non-C-based languages.

    + @@ -4872,7 +5144,7 @@ Loop: ; Infinite loop that counts from 0 on up... suffix is required. Because the argument's type is matched against the return type, it does not require its own name suffix.

    -

    To learn how to add an intrinsic function, please see the +

    To learn how to add an intrinsic function, please see the Extending LLVM Guide.

    @@ -6307,11 +6579,11 @@ LLVM.

    • ll: All loads before the barrier must complete before any load after the barrier begins.
    • -
    • ls: All loads before the barrier must complete before any +
    • ls: All loads before the barrier must complete before any store after the barrier begins.
    • -
    • ss: All stores before the barrier must complete before any +
    • ss: All stores before the barrier must complete before any store after the barrier begins.
    • -
    • sl: All stores before the barrier must complete before any +
    • sl: All stores before the barrier must complete before any load after the barrier begins.
    @@ -6325,7 +6597,8 @@ LLVM.

    Example:
    -%ptr      = malloc i32
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
                 store i32 4, %ptr
     
     %result1  = load i32* %ptr      ; yields {i32}:result1 = 4
    @@ -6376,7 +6649,8 @@ LLVM.

    Examples:
    -%ptr      = malloc i32
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
                 store i32 4, %ptr
     
     %val1     = add i32 4, 4
    @@ -6431,7 +6705,8 @@ LLVM.

    Examples:
    -%ptr      = malloc i32
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
                 store i32 4, %ptr
     
     %val1     = add i32 4, 4
    @@ -6486,8 +6761,9 @@ LLVM.

    Examples:
    -%ptr      = malloc i32
    -        store i32 4, %ptr
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
    +            store i32 4, %ptr
     %result1  = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 4 )
                                     ; yields {i32}:result1 = 4
     %result2  = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 2 )
    @@ -6520,7 +6796,7 @@ LLVM.

    Overview:
    -

    This intrinsic subtracts delta to the value stored in memory at +

    This intrinsic subtracts delta from the value stored in memory at ptr. It yields the original value at ptr.
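The "yields the original value" behaviour corresponds to a fetch-and-subtract operation. As a rough analogy only, and not how the intrinsic is implemented, C11's `atomic_fetch_sub` behaves the same way; the wrapper name `load_sub` is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically subtract delta from *ptr and return the value
 * that was stored there before the subtraction. */
static int32_t load_sub(_Atomic int32_t *ptr, int32_t delta)
{
    return atomic_fetch_sub(ptr, delta);
}
```

Starting from a stored value of 8, subtracting 4 returns 8 and leaves 4 in memory; a further subtraction of 2 returns 4 and leaves 2.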

    Arguments:
    @@ -6537,8 +6813,9 @@ LLVM.

    Examples:
    -%ptr      = malloc i32
    -        store i32 8, %ptr
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
    +            store i32 8, %ptr
     %result1  = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 4 )
                                     ; yields {i32}:result1 = 8
     %result2  = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 2 )
    @@ -6614,8 +6891,9 @@ LLVM.

    Examples:
    -%ptr      = malloc i32
    -        store i32 0x0F0F, %ptr
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
    +            store i32 0x0F0F, %ptr
     %result0  = call i32 @llvm.atomic.load.nand.i32.p0i32( i32* %ptr, i32 0xFF )
                                     ; yields {i32}:result0 = 0x0F0F
     %result1  = call i32 @llvm.atomic.load.and.i32.p0i32( i32* %ptr, i32 0xFF )
    @@ -6674,7 +6952,7 @@ LLVM.

    Overview:
    -

    These intrinsics takes the signed or unsigned minimum or maximum of +

    These intrinsics take the signed or unsigned minimum or maximum of delta and the value stored in memory at ptr. They yield the original value at ptr.
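C11 has no direct fetch-and-min operation, so the following sketch models the signed variants with a compare-and-exchange loop. It is an assumption-laden illustration of the semantics (store the minimum or maximum, return the original value), not a description of how LLVM lowers these intrinsics; `load_min` and `load_max` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically store min(*ptr, delta) and return the original *ptr. */
static int32_t load_min(_Atomic int32_t *ptr, int32_t delta)
{
    int32_t old = atomic_load(ptr);
    /* On failure the exchange refreshes old with the current value,
     * and the loop re-checks whether a store is still needed. */
    while (delta < old &&
           !atomic_compare_exchange_weak(ptr, &old, delta)) {
    }
    return old;
}

/* Atomically store max(*ptr, delta) and return the original *ptr. */
static int32_t load_max(_Atomic int32_t *ptr, int32_t delta)
{
    int32_t old = atomic_load(ptr);
    while (delta > old &&
           !atomic_compare_exchange_weak(ptr, &old, delta)) {
    }
    return old;
}
```

The unsigned variants would follow the same pattern with `uint32_t` comparisons.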

    @@ -6692,8 +6970,9 @@ LLVM.

    Examples:
    -%ptr      = malloc i32
    -        store i32 7, %ptr
    +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
    +%ptr      = bitcast i8* %mallocP to i32*
    +            store i32 7, %ptr
     %result0  = call i32 @llvm.atomic.load.min.i32.p0i32( i32* %ptr, i32 -2 )
                                     ; yields {i32}:result0 = 7
     %result1  = call i32 @llvm.atomic.load.max.i32.p0i32( i32* %ptr, i32 8 )
    @@ -6707,6 +6986,133 @@ LLVM.

    + + + + +
    + +

    This class of intrinsics exists to provide information about the lifetime of memory + objects and ranges where variables are immutable.

    + +
    + + + + +
    + +
    Syntax:
    +
    +  declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
    +
    + +
    Overview:
    +

    The 'llvm.lifetime.start' intrinsic specifies the start of a memory + object's lifetime.

    + +
    Arguments:
    +

    The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.

    + +
    Semantics:
    +

    This intrinsic indicates that before this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. A load from the pointer that + precedes this intrinsic can be replaced with + 'undef'.

    + +
    + + + + +
    + +
    Syntax:
    +
    +  declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
    +
    + +
    Overview:
    +

    The 'llvm.lifetime.end' intrinsic specifies the end of a memory + object's lifetime.

    + +
    Arguments:
    +

    The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.

    + +
    Semantics:
    +

    This intrinsic indicates that after this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. Any stores into the memory object + following this intrinsic may be removed as dead. + +

    + + + + +
    + +
    Syntax:
    +
    +  declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>) readonly
    +
    + +
    Overview:
    +

    The 'llvm.invariant.start' intrinsic specifies that the contents of + a memory object will not change.

    + +
    Arguments:
    +

    The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.

    + +
    Semantics:
    +

    This intrinsic indicates that until an llvm.invariant.end that uses + the return value, the referenced memory location is constant and + unchanging.

    + +
    + + + + +
    + +
    Syntax:
    +
    +  declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>)
    +
    + +
    Overview:
    +

    The 'llvm.invariant.end' intrinsic specifies that the contents of + a memory object are mutable.

    + +
    Arguments:
    +

    The first argument is the matching llvm.invariant.start intrinsic. + The second argument is a constant integer representing the size of the + object, or -1 if it is variable sized, and the third argument is a pointer + to the object.

    + +
    Semantics:
    +

    This intrinsic indicates that the memory is mutable again.

    + +
    +
    General Intrinsics @@ -6842,6 +7248,61 @@ LLVM.

    + + + +
    + +
    Syntax:
    +
    +  declare i32 @llvm.objectsize.i32( i8* <ptr>, i32 <type> )
    +  declare i64 @llvm.objectsize.i64( i8* <ptr>, i32 <type> )
    +
    + +
    Overview:
    +

    The llvm.objectsize intrinsic is designed to provide information + to the optimizers so that they can either a) discover at compile time when an + operation like memcpy will overflow a buffer that corresponds to + an object, or b) determine that a runtime check for overflow isn't + necessary. An object in this context means an allocation of a + specific type.

    + +
    Arguments:
    +

    The llvm.objectsize intrinsic takes two arguments. The first + argument, ptr, is a pointer to the object. The second argument, + type, is an integer that ranges from 0 to 3. The first bit of + type selects whether the return value is based on the whole object, + and the second bit selects whether the maximum or minimum number of + remaining bytes is returned.

    +
    +  type (binary)   result
    +  00              whole object, maximum number of bytes
    +  01              partial object, maximum number of bytes
    +  10              whole object, minimum number of bytes
    +  11              partial object, minimum number of bytes
    + +
    Semantics:
    +

    The llvm.objectsize intrinsic is lowered to either a constant + representing the size of the object concerned or i32/i64 -1 or 0 + (depending on the type argument) if the size cannot be determined + at compile time.
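The text does not spell out which type values map to -1 and which to 0 when the size is unknown. The sketch below assumes the convention used by GCC's `__builtin_object_size`, where maximum queries (types 0 and 1) fall back to -1 and minimum queries (types 2 and 3) fall back to 0. The function name `objectsize_unknown` is hypothetical.

```c
#include <stdint.h>

/* Fallback result when the object size cannot be determined.
 * Assumption: bit 1 of type set means a minimum query, whose
 * conservative answer is 0; otherwise it is a maximum query,
 * whose conservative answer is -1 (all bits set). */
static int64_t objectsize_unknown(uint32_t type)
{
    return (type & 2u) ? 0 : -1;
}
```

Under this assumption each fallback is the safe extreme for its query: "at most everything" for a maximum and "at least nothing" for a minimum.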

    + + +