X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FLangRef.rst;h=b75058fa9a5f50136180fe8f066ee69118a0abf8;hb=e4d0a5ec1841ac5a407c3a07b62749923dda74c2;hp=a6595f7fd068ea679848e9ca0256ddadc2d1541f;hpb=710c1a449dd7bee747ecf9c344a6f6d5461a158d;p=oota-llvm.git diff --git a/docs/LangRef.rst b/docs/LangRef.rst index a6595f7fd06..b75058fa9a5 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -117,8 +117,8 @@ And the hard way: .. code-block:: llvm - %0 = add i32 %X, %X ; yields {i32}:%0 - %1 = add i32 %0, %0 ; yields {i32}:%1 + %0 = add i32 %X, %X ; yields i32:%0 + %1 = add i32 %0, %0 ; yields i32:%1 %result = add i32 %1, %1 This last way of multiplying ``%X`` by 8 illustrates several important @@ -440,7 +440,10 @@ styles: defining module will bind to the local symbol. That is, the symbol cannot be overridden by another module. -.. _namedtypes: +A symbol with ``internal`` or ``private`` linkage must have ``default`` +visibility. + +.. _dllstorageclass: DLL Storage Classes ------------------- @@ -461,6 +464,36 @@ DLL storage class: exists for defining a dll interface, the compiler, assembler and linker know it is externally referenced and must refrain from deleting the symbol. +.. _tls_model: + +Thread Local Storage Models +--------------------------- + +A variable may be defined as ``thread_local``, which means that it will +not be shared by threads (each thread will have a separated copy of the +variable). Not all targets support thread-local variables. Optionally, a +TLS model may be specified: + +``localdynamic`` + For variables that are only used within the current shared library. +``initialexec`` + For variables in modules that will not be loaded dynamically. +``localexec`` + For variables defined in the executable and only used within it. + +If no explicit model is given, the "general dynamic" model is used. 
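As an illustration of the syntax described above, here is a minimal sketch of two thread-local globals, one with an explicit model and one relying on the default. The names ``@counter`` and ``@buffer`` are hypothetical and chosen only for this example.

.. code-block:: llvm

    ; Explicit "initial exec" model (hypothetical variable name).
    @counter = thread_local(initialexec) global i32 0, align 4
    ; No model given, so the "general dynamic" model is used.
    @buffer = thread_local global i32 0, align 4

As noted in the text, the target may still pick a different model than the one requested.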
+ +The models correspond to the ELF TLS models; see `ELF Handling For +Thread-Local Storage `_ for +more information on the circumstances under which the different models may +be used. The target may choose a different TLS model if the specified +model is not supported, or if a better choice of model can be made. + +A model can also be specified in an alias, but then it only governs how +the alias is accessed. It will not have any effect on the aliasee. + +.. _namedtypes: + Structure Types --------------- @@ -486,29 +519,13 @@ Global Variables Global variables define regions of memory allocated at compilation time instead of run-time. -Global variables definitions must be initialized, may have an explicit section -to be placed in, and may have an optional explicit alignment specified. +Global variable definitions must be initialized. Global variables in other translation units can also be declared, in which case they don't have an initializer. -A variable may be defined as ``thread_local``, which means that it will -not be shared by threads (each thread will have a separated copy of the -variable). Not all targets support thread-local variables. Optionally, a -TLS model may be specified: - -``localdynamic`` - For variables that are only used within the current shared library. -``initialexec`` - For variables in modules that will not be loaded dynamically. -``localexec`` - For variables defined in the executable and only used within it. - -The models correspond to the ELF TLS models; see `ELF Handling For -Thread-Local Storage `_ for -more information on under which circumstances the different models may -be used. The target may choose a different TLS model if the specified -model is not supported, or if a better choice of model can be made. +Either global variable definitions or declarations may have an explicit section +to be placed in and may have an optional explicit alignment specified. 
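A minimal sketch of the rule stated above, assuming the parser accepts ``section`` and ``align`` on declarations as the new wording implies. The names ``@defined`` and ``@declared`` and the section name ``"foo"`` are hypothetical.

.. code-block:: llvm

    ; Definition: has an initializer, an explicit section and an alignment.
    @defined = global i32 0, section "foo", align 4
    ; Declaration of a variable from another translation unit: no
    ; initializer, but a section and an alignment may still be given.
    @declared = external global i32, section "foo", align 4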
A variable may be defined as a global ``constant``, which indicates that the contents of the variable will **never** be modified (enabling better @@ -567,11 +584,14 @@ iteration. Globals can also have a :ref:`DLL storage class `. +Variables and aliases can have a +:ref:`Thread Local Storage Model `. + Syntax:: [@ =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] - [AddrSpace] [unnamed_addr] [ExternallyInitialized] - + [unnamed_addr] [AddrSpace] [ExternallyInitialized] + [] [, section "name"] [, align ] For example, the following defines a global in a numbered address space @@ -661,29 +681,40 @@ Syntax:: Aliases ------- -Aliases act as "second name" for the aliasee value (which can be either -function, global variable, another alias or bitcast of global value). +Aliases, unlike functions or variables, don't create any new data. They +are just a new symbol and metadata for an existing position. + +Aliases have a name and an aliasee that is either a global value or a +constant expression. + Aliases may have an optional :ref:`linkage type `, an optional -:ref:`visibility style `, and an optional :ref:`DLL storage class -`. +:ref:`visibility style `, an optional :ref:`DLL storage class +` and an optional :ref:`tls model `. Syntax:: - @ = [Visibility] [DLLStorageClass] alias [Linkage] @ + @ = [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias [Linkage] @ The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``, ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers -might not correctly handle dropping a weak symbol that is aliased by a non-weak -alias. +might not correctly handle dropping a weak symbol that is aliased. Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as -the aliasee. +the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point +to the same content. 
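The address guarantee above can be sketched as follows. This assumes the aliasee is written as a pointer type followed by the global name, which is how aliases were spelled in LLVM assembly of this period; the placeholders in the syntax line above were lost, so treat that spelling as an assumption. All names are hypothetical.

.. code-block:: llvm

    @target = global i32 0

    ; Not unnamed_addr: guaranteed to have the same address as @target.
    @same_address = alias i32* @target
    ; unnamed_addr: only guaranteed to point to the same content.
    @same_content = unnamed_addr alias i32* @target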
+ +Since aliases are only a second name, some restrictions apply, of which +some can only be checked when producing an object file: -The aliasee must be a definition. +* The expression defining the aliasee must be computable at assembly + time. Since it is just a name, no relocations can be used. -Aliases are not allowed to point to aliases with linkages that can be -overridden. Since they are only a second name, the possibility of the -intermediate alias being overridden cannot be represented in an object file. +* No alias in the expression can be weak as the possibility of the + intermediate alias being overridden cannot be represented in an + object file. + +* No global value in the expression can be a declaration, since that + would require a relocation, which is not possible. .. _namedmetadatastructure: @@ -844,6 +875,13 @@ Currently, only the following parameter attributes are defined: operands for the :ref:`bitcast instruction `. This is not a valid attribute for return values and can only be applied to one parameter. +``nonnull`` + This indicates that the parameter or return pointer is not null. This + attribute may only be applied to pointer typed parameters. This is not + checked or enforced by LLVM, the caller must ensure that the pointer + passed in is non-null, or the callee must ensure that the returned pointer + is non-null. + .. _gc: Garbage Collector Names @@ -985,6 +1023,14 @@ example: inlining this function is desirable (such as the "inline" keyword in C/C++). It is just a hint; it imposes no requirements on the inliner. +``jumptable`` + This attribute indicates that the function should be added to a + jump-instruction table at code-generation time, and that all address-taken + references to this function should be replaced with a reference to the + appropriate jump-instruction-table function pointer. Note that this creates + a new pointer for the original function, which means that code that depends + on function-pointer identity can break. 
So, any function annotated with + ``jumptable`` must also be ``unnamed_addr``. ``minsize`` This attribute suggests that optimization passes and code generator passes make choices that keep the code size of this function as small @@ -2703,11 +2749,12 @@ number representing the maximum relative error, for example: '``range``' Metadata ^^^^^^^^^^^^^^^^^^^^ -``range`` metadata may be attached only to loads of integer types. It -expresses the possible ranges the loaded value is in. The ranges are -represented with a flattened list of integers. The loaded value is known -to be in the union of the ranges defined by each consecutive pair. Each -pair has the following properties: +``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of +integer types. It expresses the possible ranges the loaded value or the value +returned by the called function at this call site is in. The ranges are +represented with a flattened list of integers. The loaded value or the value +returned is known to be in the union of the ranges defined by each consecutive +pair. Each pair has the following properties: - The type must match the type loaded by the instruction. - The pair ``a,b`` represents the range ``[a,b)``. @@ -2725,8 +2772,9 @@ Examples: %a = load i8* %x, align 1, !range !0 ; Can only be 0 or 1 %b = load i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1 - %c = load i8* %z, align 1, !range !2 ; Can only be 0, 1, 3, 4 or 5 - %d = load i8* %z, align 1, !range !3 ; Can only be -2, -1, 3, 4 or 5 + %c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5 + %d = invoke i8 @bar() to label %cont + unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5 ... !0 = metadata !{ i8 0, i8 2 } !1 = metadata !{ i8 255, i8 2 } @@ -2775,15 +2823,29 @@ for optimizations are prefixed with ``llvm.mem``. 
'``llvm.mem.parallel_loop_access``' Metadata ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -For a loop to be parallel, in addition to using -the ``llvm.loop`` metadata to mark the loop latch branch instruction, -also all of the memory accessing instructions in the loop body need to be -marked with the ``llvm.mem.parallel_loop_access`` metadata. If there -is at least one memory accessing instruction not marked with the metadata, -the loop must be considered a sequential loop. This causes parallel loops to be -converted to sequential loops due to optimization passes that are unaware of -the parallel semantics and that insert new memory instructions to the loop -body. +The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier, +or metadata containing a list of loop identifiers for nested loops. +The metadata is attached to memory accessing instructions and denotes that +no loop carried memory dependence exist between it and other instructions denoted +with the same loop identifier. + +Precisely, given two instructions ``m1`` and ``m2`` that both have the +``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the +set of loops associated with that metadata, respectively, then there is no loop +carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and +``L2``. + +As a special case, if all memory accessing instructions in a loop have +``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the +loop has no loop carried memory dependences and is considered to be a parallel +loop. + +Note that if not all memory access instructions have such metadata referring to +the loop, then the loop is considered not being trivially parallel. Additional +memory dependence analysis is required to make that determination. 
As a fail +safe mechanism, this causes loops that were originally parallel to be considered +sequential (if optimization passes that are unaware of the parallel semantics +insert new memory instructions into the loop body). Example of a loop that is considered parallel due to its correct use of both ``llvm.loop`` and ``llvm.mem.parallel_loop_access`` @@ -3149,14 +3211,18 @@ The '``llvm.global_ctors``' Global Variable .. code-block:: llvm - %0 = type { i32, void ()* } - @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor }] + %0 = type { i32, void ()*, i8* } + @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }] The ``@llvm.global_ctors`` array contains a list of constructor -functions and associated priorities. The functions referenced by this -array will be called in ascending order of priority (i.e. lowest first) -when the module is loaded. The order of functions with the same priority -is not defined. +functions, priorities, and an optional associated global or function. +The functions referenced by this array will be called in ascending order +of priority (i.e. lowest first) when the module is loaded. The order of +functions with the same priority is not defined. + +If the third field is present, non-null, and points to a global variable +or function, the initializer function will only run if the associated +data from the current module is not discarded. .. _llvmglobaldtors: @@ -3165,14 +3231,18 @@ The '``llvm.global_dtors``' Global Variable .. code-block:: llvm - %0 = type { i32, void ()* } - @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor }] + %0 = type { i32, void ()*, i8* } + @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }] -The ``@llvm.global_dtors`` array contains a list of destructor functions -and associated priorities. The functions referenced by this array will -be called in descending order of priority (i.e. 
highest first) when the -module is loaded. The order of functions with the same priority is not -defined. +The ``@llvm.global_dtors`` array contains a list of destructor +functions, priorities, and an optional associated global or function. +The functions referenced by this array will be called in descending +order of priority (i.e. highest first) when the module is unloaded. The +order of functions with the same priority is not defined. + +If the third field is present, non-null, and points to a global variable +or function, the destructor function will only run if the associated +data from the current module is not discarded. Instruction Reference ===================== @@ -3509,9 +3579,9 @@ Example: .. code-block:: llvm %retval = invoke i32 @Test(i32 15) to label %Continue - unwind label %TestCleanup ; {i32}:retval set + unwind label %TestCleanup ; i32:retval set %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue - unwind label %TestCleanup ; {i32}:retval set + unwind label %TestCleanup ; i32:retval set .. _i_resume: @@ -3600,10 +3670,10 @@ Syntax: :: - = add , ; yields {ty}:result - = add nuw , ; yields {ty}:result - = add nsw , ; yields {ty}:result - = add nuw nsw , ; yields {ty}:result + = add , ; yields ty:result + = add nuw , ; yields ty:result + = add nsw , ; yields ty:result + = add nuw nsw , ; yields ty:result Overview: """"""""" @@ -3639,7 +3709,7 @@ Example: .. code-block:: llvm - = add i32 4, %var ; yields {i32}:result = 4 + %var + = add i32 4, %var ; yields i32:result = 4 + %var .. _i_fadd: @@ -3651,7 +3721,7 @@ Syntax: :: - = fadd [fast-math flags]* , ; yields {ty}:result + = fadd [fast-math flags]* , ; yields ty:result Overview: """"""""" @@ -3678,7 +3748,7 @@ Example: .. 
code-block:: llvm - = fadd float 4.0, %var ; yields {float}:result = 4.0 + %var + = fadd float 4.0, %var ; yields float:result = 4.0 + %var '``sub``' Instruction ^^^^^^^^^^^^^^^^^^^^^ @@ -3688,10 +3758,10 @@ Syntax: :: - = sub , ; yields {ty}:result - = sub nuw , ; yields {ty}:result - = sub nsw , ; yields {ty}:result - = sub nuw nsw , ; yields {ty}:result + = sub , ; yields ty:result + = sub nuw , ; yields ty:result + = sub nsw , ; yields ty:result + = sub nuw nsw , ; yields ty:result Overview: """"""""" @@ -3730,8 +3800,8 @@ Example: .. code-block:: llvm - = sub i32 4, %var ; yields {i32}:result = 4 - %var - = sub i32 0, %val ; yields {i32}:result = -%var + = sub i32 4, %var ; yields i32:result = 4 - %var + = sub i32 0, %val ; yields i32:result = -%var .. _i_fsub: @@ -3743,7 +3813,7 @@ Syntax: :: - = fsub [fast-math flags]* , ; yields {ty}:result + = fsub [fast-math flags]* , ; yields ty:result Overview: """"""""" @@ -3773,8 +3843,8 @@ Example: .. code-block:: llvm - = fsub float 4.0, %var ; yields {float}:result = 4.0 - %var - = fsub float -0.0, %val ; yields {float}:result = -%var + = fsub float 4.0, %var ; yields float:result = 4.0 - %var + = fsub float -0.0, %val ; yields float:result = -%var '``mul``' Instruction ^^^^^^^^^^^^^^^^^^^^^ @@ -3784,10 +3854,10 @@ Syntax: :: - = mul , ; yields {ty}:result - = mul nuw , ; yields {ty}:result - = mul nsw , ; yields {ty}:result - = mul nuw nsw , ; yields {ty}:result + = mul , ; yields ty:result + = mul nuw , ; yields ty:result + = mul nsw , ; yields ty:result + = mul nuw nsw , ; yields ty:result Overview: """"""""" @@ -3827,7 +3897,7 @@ Example: .. code-block:: llvm - = mul i32 4, %var ; yields {i32}:result = 4 * %var + = mul i32 4, %var ; yields i32:result = 4 * %var .. _i_fmul: @@ -3839,7 +3909,7 @@ Syntax: :: - = fmul [fast-math flags]* , ; yields {ty}:result + = fmul [fast-math flags]* , ; yields ty:result Overview: """"""""" @@ -3866,7 +3936,7 @@ Example: .. 
code-block:: llvm - = fmul float 4.0, %var ; yields {float}:result = 4.0 * %var + = fmul float 4.0, %var ; yields float:result = 4.0 * %var '``udiv``' Instruction ^^^^^^^^^^^^^^^^^^^^^^ @@ -3876,8 +3946,8 @@ Syntax: :: - = udiv , ; yields {ty}:result - = udiv exact , ; yields {ty}:result + = udiv , ; yields ty:result + = udiv exact , ; yields ty:result Overview: """"""""" @@ -3910,7 +3980,7 @@ Example: .. code-block:: llvm - = udiv i32 4, %var ; yields {i32}:result = 4 / %var + = udiv i32 4, %var ; yields i32:result = 4 / %var '``sdiv``' Instruction ^^^^^^^^^^^^^^^^^^^^^^ @@ -3920,8 +3990,8 @@ Syntax: :: - = sdiv , ; yields {ty}:result - = sdiv exact , ; yields {ty}:result + = sdiv , ; yields ty:result + = sdiv exact , ; yields ty:result Overview: """"""""" @@ -3956,7 +4026,7 @@ Example: .. code-block:: llvm - = sdiv i32 4, %var ; yields {i32}:result = 4 / %var + = sdiv i32 4, %var ; yields i32:result = 4 / %var .. _i_fdiv: @@ -3968,7 +4038,7 @@ Syntax: :: - = fdiv [fast-math flags]* , ; yields {ty}:result + = fdiv [fast-math flags]* , ; yields ty:result Overview: """"""""" @@ -3995,7 +4065,7 @@ Example: .. code-block:: llvm - = fdiv float 4.0, %var ; yields {float}:result = 4.0 / %var + = fdiv float 4.0, %var ; yields float:result = 4.0 / %var '``urem``' Instruction ^^^^^^^^^^^^^^^^^^^^^^ @@ -4005,7 +4075,7 @@ Syntax: :: - = urem , ; yields {ty}:result + = urem , ; yields ty:result Overview: """"""""" @@ -4037,7 +4107,7 @@ Example: .. code-block:: llvm - = urem i32 4, %var ; yields {i32}:result = 4 % %var + = urem i32 4, %var ; yields i32:result = 4 % %var '``srem``' Instruction ^^^^^^^^^^^^^^^^^^^^^^ @@ -4047,7 +4117,7 @@ Syntax: :: - = srem , ; yields {ty}:result + = srem , ; yields ty:result Overview: """"""""" @@ -4092,7 +4162,7 @@ Example: .. code-block:: llvm - = srem i32 4, %var ; yields {i32}:result = 4 % %var + = srem i32 4, %var ; yields i32:result = 4 % %var .. 
_i_frem: @@ -4104,7 +4174,7 @@ Syntax: :: - = frem [fast-math flags]* , ; yields {ty}:result + = frem [fast-math flags]* , ; yields ty:result Overview: """"""""" @@ -4132,7 +4202,7 @@ Example: .. code-block:: llvm - = frem float 4.0, %var ; yields {float}:result = 4.0 % %var + = frem float 4.0, %var ; yields float:result = 4.0 % %var .. _bitwiseops: @@ -4153,10 +4223,10 @@ Syntax: :: - = shl , ; yields {ty}:result - = shl nuw , ; yields {ty}:result - = shl nsw , ; yields {ty}:result - = shl nuw nsw , ; yields {ty}:result + = shl , ; yields ty:result + = shl nuw , ; yields ty:result + = shl nsw , ; yields ty:result + = shl nuw nsw , ; yields ty:result Overview: """"""""" @@ -4194,9 +4264,9 @@ Example: .. code-block:: llvm - = shl i32 4, %var ; yields {i32}: 4 << %var - = shl i32 4, 2 ; yields {i32}: 16 - = shl i32 1, 10 ; yields {i32}: 1024 + = shl i32 4, %var ; yields i32: 4 << %var + = shl i32 4, 2 ; yields i32: 16 + = shl i32 1, 10 ; yields i32: 1024 = shl i32 1, 32 ; undefined = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4> @@ -4208,8 +4278,8 @@ Syntax: :: - = lshr , ; yields {ty}:result - = lshr exact , ; yields {ty}:result + = lshr , ; yields ty:result + = lshr exact , ; yields ty:result Overview: """"""""" @@ -4243,10 +4313,10 @@ Example: .. 
code-block:: llvm - = lshr i32 4, 1 ; yields {i32}:result = 2 - = lshr i32 4, 2 ; yields {i32}:result = 1 - = lshr i8 4, 3 ; yields {i8}:result = 0 - = lshr i8 -2, 1 ; yields {i8}:result = 0x7F + = lshr i32 4, 1 ; yields i32:result = 2 + = lshr i32 4, 2 ; yields i32:result = 1 + = lshr i8 4, 3 ; yields i8:result = 0 + = lshr i8 -2, 1 ; yields i8:result = 0x7F = lshr i32 1, 32 ; undefined = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1> @@ -4258,8 +4328,8 @@ Syntax: :: - = ashr , ; yields {ty}:result - = ashr exact , ; yields {ty}:result + = ashr , ; yields ty:result + = ashr exact , ; yields ty:result Overview: """"""""" @@ -4294,10 +4364,10 @@ Example: .. code-block:: llvm - = ashr i32 4, 1 ; yields {i32}:result = 2 - = ashr i32 4, 2 ; yields {i32}:result = 1 - = ashr i8 4, 3 ; yields {i8}:result = 0 - = ashr i8 -2, 1 ; yields {i8}:result = -1 + = ashr i32 4, 1 ; yields i32:result = 2 + = ashr i32 4, 2 ; yields i32:result = 1 + = ashr i8 4, 3 ; yields i8:result = 0 + = ashr i8 -2, 1 ; yields i8:result = -1 = ashr i32 1, 32 ; undefined = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0> @@ -4309,7 +4379,7 @@ Syntax: :: - = and , ; yields {ty}:result + = and , ; yields ty:result Overview: """"""""" @@ -4346,9 +4416,9 @@ Example: .. 
code-block:: llvm - = and i32 4, %var ; yields {i32}:result = 4 & %var - = and i32 15, 40 ; yields {i32}:result = 8 - = and i32 4, 8 ; yields {i32}:result = 0 + = and i32 4, %var ; yields i32:result = 4 & %var + = and i32 15, 40 ; yields i32:result = 8 + = and i32 4, 8 ; yields i32:result = 0 '``or``' Instruction ^^^^^^^^^^^^^^^^^^^^ @@ -4358,7 +4428,7 @@ Syntax: :: - = or , ; yields {ty}:result + = or , ; yields ty:result Overview: """"""""" @@ -4395,9 +4465,9 @@ Example: :: - = or i32 4, %var ; yields {i32}:result = 4 | %var - = or i32 15, 40 ; yields {i32}:result = 47 - = or i32 4, 8 ; yields {i32}:result = 12 + = or i32 4, %var ; yields i32:result = 4 | %var + = or i32 15, 40 ; yields i32:result = 47 + = or i32 4, 8 ; yields i32:result = 12 '``xor``' Instruction ^^^^^^^^^^^^^^^^^^^^^ @@ -4407,7 +4477,7 @@ Syntax: :: - = xor , ; yields {ty}:result + = xor , ; yields ty:result Overview: """"""""" @@ -4445,10 +4515,10 @@ Example: .. code-block:: llvm - = xor i32 4, %var ; yields {i32}:result = 4 ^ %var - = xor i32 15, 40 ; yields {i32}:result = 39 - = xor i32 4, 8 ; yields {i32}:result = 12 - = xor i32 %V, -1 ; yields {i32}:result = ~%V + = xor i32 4, %var ; yields i32:result = 4 ^ %var + = xor i32 15, 40 ; yields i32:result = 39 + = xor i32 4, 8 ; yields i32:result = 12 + = xor i32 %V, -1 ; yields i32:result = ~%V Vector Operations ----------------- @@ -4470,7 +4540,7 @@ Syntax: :: - = extractelement > , i32 ; yields + = extractelement > , ; yields Overview: """"""""" @@ -4484,7 +4554,7 @@ Arguments: The first operand of an '``extractelement``' instruction is a value of :ref:`vector ` type. The second operand is an index indicating the position from which to extract the element. The index may be a -variable. +variable of any integer type. 
Semantics: """""""""" @@ -4510,7 +4580,7 @@ Syntax: :: - = insertelement > , , i32 ; yields > + = insertelement > , , ; yields > Overview: """"""""" @@ -4525,7 +4595,7 @@ The first operand of an '``insertelement``' instruction is a value of :ref:`vector ` type. The second operand is a scalar value whose type must equal the element type of the first operand. The third operand is an index indicating the position at which to insert the value. The -index may be a variable. +index may be a variable of any integer type. Semantics: """""""""" @@ -4714,7 +4784,7 @@ Syntax: :: - = alloca [inalloca] [, ] [, align ] ; yields {type*}:result + = alloca [inalloca] [, ] [, align ] ; yields type*:result Overview: """"""""" @@ -4756,10 +4826,10 @@ Example: .. code-block:: llvm - %ptr = alloca i32 ; yields {i32*}:ptr - %ptr = alloca i32, i32 4 ; yields {i32*}:ptr - %ptr = alloca i32, i32 4, align 1024 ; yields {i32*}:ptr - %ptr = alloca i32, align 1024 ; yields {i32*}:ptr + %ptr = alloca i32 ; yields i32*:ptr + %ptr = alloca i32, i32 4 ; yields i32*:ptr + %ptr = alloca i32, i32 4, align 1024 ; yields i32*:ptr + %ptr = alloca i32, align 1024 ; yields i32*:ptr .. _i_load: @@ -4842,9 +4912,9 @@ Examples: .. code-block:: llvm - %ptr = alloca i32 ; yields {i32*}:ptr - store i32 3, i32* %ptr ; yields {void} - %val = load i32* %ptr ; yields {i32}:val = i32 3 + %ptr = alloca i32 ; yields i32*:ptr + store i32 3, i32* %ptr ; yields void + %val = load i32* %ptr ; yields i32:val = i32 3 .. _i_store: @@ -4856,8 +4926,8 @@ Syntax: :: - store [volatile] , * [, align ][, !nontemporal !] ; yields {void} - store atomic [volatile] , * [singlethread] , align ; yields {void} + store [volatile] , * [, align ][, !nontemporal !] ; yields void + store atomic [volatile] , * [singlethread] , align ; yields void Overview: """"""""" @@ -4921,9 +4991,9 @@ Example: .. 
code-block:: llvm - %ptr = alloca i32 ; yields {i32*}:ptr - store i32 3, i32* %ptr ; yields {void} - %val = load i32* %ptr ; yields {i32}:val = i32 3 + %ptr = alloca i32 ; yields i32*:ptr + store i32 3, i32* %ptr ; yields void + %val = load i32* %ptr ; yields i32:val = i32 3 .. _i_fence: @@ -4935,7 +5005,7 @@ Syntax: :: - fence [singlethread] ; yields {void} + fence [singlethread] ; yields void Overview: """"""""" @@ -4978,8 +5048,8 @@ Example: .. code-block:: llvm - fence acquire ; yields {void} - fence singlethread seq_cst ; yields {void} + fence acquire ; yields void + fence singlethread seq_cst ; yields void .. _i_cmpxchg: @@ -4991,14 +5061,14 @@ Syntax: :: - cmpxchg [volatile] * , , [singlethread] ; yields {ty} + cmpxchg [weak] [volatile] * , , [singlethread] ; yields { ty, i1 } Overview: """"""""" The '``cmpxchg``' instruction is used to atomically modify memory. It loads a value in memory and compares it to a given value. If they are -equal, it stores a new value into the memory. +equal, it tries to store a new value into the memory. Arguments: """""""""" @@ -5015,10 +5085,10 @@ to modify the number or order of execution of this ``cmpxchg`` with other :ref:`volatile operations `. The success and failure :ref:`ordering ` arguments specify how this -``cmpxchg`` synchronizes with other atomic operations. The both ordering -parameters must be at least ``monotonic``, the ordering constraint on failure -must be no stronger than that on success, and the failure ordering cannot be -either ``release`` or ``acq_rel``. +``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters +must be at least ``monotonic``, the ordering constraint on failure must be no +stronger than that on success, and the failure ordering cannot be either +``release`` or ``acq_rel``. 
The optional "``singlethread``" argument declares that the ``cmpxchg`` is only atomic with respect to code (usually signal handlers) running in @@ -5031,10 +5101,17 @@ equal to the size in memory of the operand. Semantics: """""""""" -The contents of memory at the location specified by the '````' -operand is read and compared to '````'; if the read value is the -equal, '````' is written. The original value at the location is -returned. +The contents of memory at the location specified by the '````' operand +is read and compared to '````'; if the read value is the equal, the +'````' is written. The original value at the location is returned, together +with a flag indicating success (true) or failure (false). + +If the cmpxchg operation is marked as ``weak`` then a spurious failure is +permitted: the operation may not write ```` even if the comparison +matched. + +If the cmpxchg operation is strong (the default), the i1 value is 1 if and only +if the value loaded equals ``cmp``. A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic @@ -5046,14 +5123,15 @@ Example: .. code-block:: llvm entry: - %orig = atomic load i32* %ptr unordered ; yields {i32} + %orig = atomic load i32* %ptr unordered ; yields i32 br label %loop loop: %cmp = phi i32 [ %orig, %entry ], [%old, %loop] %squared = mul i32 %cmp, %cmp - %old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields {i32} - %success = icmp eq i32 %cmp, %old + %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 } + %value_loaded = extractvalue { i32, i1 } %val_success, 0 + %success = extractvalue { i32, i1 } %val_success, 1 br i1 %success, label %done, label %loop done: @@ -5069,7 +5147,7 @@ Syntax: :: - atomicrmw [volatile] * , [singlethread] ; yields {ty} + atomicrmw [volatile] * , [singlethread] ; yields ty Overview: """"""""" @@ -5130,7 +5208,7 @@ Example: .. 
code-block:: llvm - %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields {i32} + %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields i32 .. _i_getelementptr: @@ -5864,7 +5942,7 @@ Syntax: :: - = icmp , ; yields {i1} or {}:result + = icmp , ; yields i1 or :result Overview: """"""""" @@ -5955,7 +6033,7 @@ Syntax: :: - = fcmp , ; yields {i1} or {}:result + = fcmp , ; yields i1 or :result Overview: """"""""" @@ -6260,7 +6338,7 @@ Example: call void %foo(i8 97 signext) %struct.A = type { i32, i8 } - %r = call %struct.A @foo() ; yields { 32, i8 } + %r = call %struct.A @foo() ; yields { i32, i8 } %gr = extractvalue %struct.A %r, 0 ; yields i32 %gr1 = extractvalue %struct.A %r, 1 ; yields i8 %Z = call void @foo() noreturn ; indicates that %foo never returns normally @@ -6804,6 +6882,51 @@ Note that calling this intrinsic does not prevent function inlining or other aggressive transformations, so the value returned may not be that of the obvious source-language caller. +.. _int_read_register: +.. _int_write_register: + +'``llvm.read_register``' and '``llvm.write_register``' Intrinsics +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Syntax: +""""""" + +:: + + declare i32 @llvm.read_register.i32(metadata) + declare i64 @llvm.read_register.i64(metadata) + declare void @llvm.write_register.i32(metadata, i32 @value) + declare void @llvm.write_register.i64(metadata, i64 @value) + !0 = metadata !{metadata !"sp\00"} + +Overview: +""""""""" + +The '``llvm.read_register``' and '``llvm.write_register``' intrinsics +provides access to the named register. The register must be valid on +the architecture being compiled to. The type needs to be compatible +with the register being read. + +Semantics: +"""""""""" + +The '``llvm.read_register``' intrinsic returns the current value of the +register, where possible. The '``llvm.write_register``' intrinsic sets +the current value of the register, where possible. 
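A minimal sketch of reading the stack pointer with the declarations shown in the syntax block above. The function name ``@get_stack_pointer`` is hypothetical, and the example assumes a 64-bit target on which ``sp`` is a valid register name.

.. code-block:: llvm

    declare i64 @llvm.read_register.i64(metadata)

    ; Hypothetical helper that returns the current stack pointer.
    define i64 @get_stack_pointer() {
      %sp = call i64 @llvm.read_register.i64(metadata !0)
      ret i64 %sp
    }

    ; Names the register to read, as in the syntax block above.
    !0 = metadata !{metadata !"sp\00"}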
+ +This is useful to implement named register global variables that need +to always be mapped to a specific register, as is common practice in +bare-metal programs including OS kernels. + +The compiler doesn't check for register availability or for use of the +register in surrounding code, including inline assembly. Because of that, +allocatable registers are not supported. + +Warning: So far it only works with the stack pointer on selected +architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of +work is needed to support other registers and, even more so, allocatable +registers. + .. _int_stacksave: '``llvm.stacksave``' Intrinsic @@ -8377,7 +8500,7 @@ Examples: .. code-block:: llvm - %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields {float}:r2 = (a * b) + c + %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c Half Precision Floating Point Intrinsics ----------------------------------------