X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FLangRef.html;h=9a093577025e1381da9c160d9bc9c87b3e743a88;hb=dc896a41182bec1b8f9a59ca5d81770c16bdacf1;hp=1331b021c101b8729ccbb42c6d739d1432277fd2;hpb=07de8d1acf10117cac2cc53f1cb0ae888be6a5bc;p=oota-llvm.git diff --git a/docs/LangRef.html b/docs/LangRef.html index 1331b021c10..9a093577025 100644 --- a/docs/LangRef.html +++ b/docs/LangRef.html @@ -5,7 +5,7 @@
...because the definition of %x does not dominate all of its - uses. The LLVM infrastructure provides a verification pass that may be used - to verify that an LLVM module is well formed. This pass is automatically run - by the parser after parsing input assembly and by the optimizer before it - outputs bitcode. The violations pointed out by the verifier pass indicate - bugs in transformation passes or input to the parser.
+because the definition of %x does not dominate all of its uses. The + LLVM infrastructure provides a verification pass that may be used to verify + that an LLVM module is well formed. This pass is automatically run by the + parser after parsing input assembly and by the optimizer before it outputs + bitcode. The violations pointed out by the verifier pass indicate bugs in + transformation passes or input to the parser.
@@ -430,8 +454,8 @@-add i32 %X, %X ; yields {i32}:%0 -add i32 %0, %0 ; yields {i32}:%1 +%0 = add i32 %X, %X ; yields {i32}:%0 +%1 = add i32 %0, %0 ; yields {i32}:%1 %result = add i32 %1, %1
...and it also shows a convention that we follow in this document. When +
It also shows a convention that we follow in this document. When demonstrating instructions, we will follow an instruction with a comment that defines the type and name of value produced. Comments are shown in italic text.
@@ -474,31 +498,33 @@ the "hello world" module:; Declare the string constant as a global constant... -@.LC0 = internal constant [13 x i8] c"hello world\0A\00" ; [13 x i8]* ++; Declare the string constant as a global constant. +@.LC0 = internal constant [13 x i8] c"hello world\0A\00" ; [13 x i8]* ; External declaration of the puts function -declare i32 @puts(i8 *) ; i32(i8 *)* +declare i32 @puts(i8 *) ; i32(i8 *)* ; Definition of main function -define i32 @main() { ; i32()* - ; Convert [13 x i8]* to i8 *... - %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 * +define i32 @main() { ; i32()* + ; Convert [13 x i8]* to i8 *... + %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 * + + ; Call puts function to write out the string to stdout. + call i32 @puts(i8 * %cast210) ; i32 + ret i32 0
} - ; Call puts function to write out the string to stdout... - call i32 @puts(i8 * %cast210) ; i32 - ret i32 0
}
+; Named metadata +!1 = metadata !{i32 41} +!foo = !{!1, null}
This example is made up of a global variable named - ".LC0", an external declaration of the "puts" function, and + ".LC0", an external declaration of the "puts" function, a function definition for - "main".
+ "main" and named metadata + "foo".In general, a module is made up of a list of global values, where both functions and global variables are global values. Global values are @@ -519,7 +545,7 @@ define i32 @main() { ; i32()* linkage:
__imp_ and the function or variable name. LLVM allows an explicit section to be specified for globals. If the target supports it, it will emit globals to the section specified.
-An explicit alignment may be specified for a global. If not present, or if - the alignment is set to zero, the alignment of the global is set by the - target to whatever it feels convenient. If an explicit alignment is - specified, the global is forced to have at least that much alignment. All - alignments must be a power of 2.
+An explicit alignment may be specified for a global, which must be a power + of 2. If not present, or if the alignment is set to zero, the alignment of + the global is set by the target to whatever it feels convenient. If an + explicit alignment is specified, the global is forced to have exactly that + alignment. Targets and optimizers are not allowed to over-align the global + if the global has an assigned section. In this case, the extra alignment + could be observable: for example, code could assume that the globals are + densely packed in their section and try to iterate over them as an array; + alignment padding would break this iteration.
For example, the following defines a global in a numbered address space with an initializer, section, and alignment:
@@ -823,7 +880,7 @@ define i32 @main() { ; i32()*LLVM function definitions consist of the "define" keyord, an +
LLVM function definitions consist of the "define" keyword, an optional linkage type, an optional visibility style, an optional calling convention, a return type, an optional @@ -836,7 +893,7 @@ define i32 @main() { ; i32()*
LLVM function declarations consist of the "declare" keyword, an optional linkage type, an optional - visibility style, an optional + visibility style, an optional calling convention, a return type, an optional parameter attribute for the return type, a function name, a possibly empty list of arguments, an optional alignment, and an @@ -897,6 +954,27 @@ define [linkage] [visibility]
Named metadata is a collection of metadata. Metadata + nodes (but not metadata strings) and null are the only valid operands for + a named metadata.
+ ++!1 = metadata !{metadata !"one"} +!name = !{null, !1} ++
Currently, only the following parameter attributes are defined:
-define void @f() gc "name" { ... +define void @f() gc "name" { ... }
When constructing the data layout for a given target, LLVM starts with a - default set of specifications which are then (possibly) overriden by the + default set of specifications which are then (possibly) overridden by the specifications in the datalayout keyword. The default specifications are given in this list:
Certain memory accesses, such as loads, stores, and llvm.memcpys may be marked volatile. +The optimizers must not change the number of volatile operations or change their +order of execution relative to other volatile operations. The optimizers +may change the order of volatile operations relative to non-volatile +operations. This is not Java's "volatile" and has no cross-thread +synchronization behavior.
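The volatile rules above can be illustrated with a minimal sketch in the IR syntax of this revision. The global @g and the function @poll are hypothetical names used only for illustration.

```llvm
; Hypothetical global and function, for illustration only.
@g = global i32 0

define i32 @poll() {
entry:
  ; Both volatile loads must be kept, and must stay in this order
  ; relative to each other and to the volatile store below.
  %first = volatile load i32* @g
  %second = volatile load i32* @g
  volatile store i32 0, i32* @g

  ; This add is not volatile, so it may be reordered freely
  ; relative to the volatile operations.
  %sum = add i32 %first, %second
  ret i32 %sum
}
```

Without the volatile markers, an optimizer would be free to merge the two loads into one.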
+ +The first class types are perhaps the most important. Values of these types are the only ones which can be produced by - instructions, passed as arguments, or used as operands to instructions.
+ instructions. @@ -1394,6 +1503,42 @@ Classifications + + + +The integer type is a very simple type that simply specifies an arbitrary + bit width for the integer type desired. Any bit width from 1 bit to + 2^23-1 (about 8 million) can be specified.
+ ++ iN ++ +
The number of bits the integer will occupy is specified by the N + value.
+ +i1 | +a single-bit integer. | +
i32 | +a 32-bit integer. | +
i1942652 | +a really big integer of over 1 million bits. | +
The metadata type represents embedded metadata. The only derived type that - may contain metadata is metadata* or a function type that returns or - takes metadata typed parameters, but not pointer to metadata types.
+The metadata type represents embedded metadata. No derived types may be + created from metadata except for function + arguments.
@@ -1467,49 +1612,25 @@ Classifications
The real power in LLVM comes from the derived types in the system. This is what allows a programmer to represent arrays, functions, pointers, and other - useful types. Note that these derived types may be recursive: For example, - it is possible to have a two dimensional array.
+ useful types. Each of these types contains one or more element types which + may be a primitive type, or another derived type. For example, it is + possible to have a two dimensional array, using an array as the element type + of another array. + - +The integer type is a very simple derived type that simply specifies an - arbitrary bit width for the integer type desired. Any bit width from 1 bit to - 2^23-1 (about 8 million) can be specified.
- -- iN -- -
The number of bits the integer will occupy is specified by the N - value.
- -i1 | -a single-bit integer. | -
i32 | -a 32-bit integer. | -
i1942652 | -a really big integer of over 1 million bits. | -
Aggregate Types are a subset of derived types that can contain multiple + member types. Arrays, + structs, vectors and + unions are aggregate types.
-Note that the code generator does not yet support large integer types to be - used as function return types. The specific limit on how large a return type - the code generator can currently handle is target-dependent; currently it's - often 64 bits for 32-bit targets and 128 bits for 64-bit targets.
+Note that 'variable sized arrays' can be implemented in LLVM with a zero - length array. Normally, accesses past the end of an array are undefined in - LLVM (e.g. it is illegal to access the 5th element of a 3 element array). As - a special case, however, zero length arrays are recognized to be variable - length. This allows implementation of 'pascal style arrays' with the LLVM - type "{ i32, [0 x float]}", for example.
- -Note that the code generator does not yet support large aggregate types to be - used as function return types. The specific limit on how large an aggregate - return type the code generator can currently handle is target-dependent, and - also dependent on the aggregate element types.
+There is no restriction on indexing beyond the end of the array implied by + a static type (though there are restrictions on indexing beyond the bounds + of an allocated object in some cases). This means that single-dimension + 'variable sized array' addressing can be implemented in LLVM with a zero + length array type. An implementation of 'pascal style arrays' in LLVM could + use the type "{ i32, [0 x float]}", for example.
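As a minimal sketch of the 'pascal style array' addressing described above, in the IR syntax of this revision: the type name %pascal and the function @get_element are hypothetical, and the caller is assumed to have allocated enough storage after the length field.

```llvm
; Hypothetical length-prefixed array: an i32 length followed by the elements.
%pascal = type { i32, [0 x float] }

define float @get_element(%pascal* %p, i64 %i) {
entry:
  ; Step through the pointer (i64 0), select struct field 1 (the
  ; zero-length array), then index element %i of that array.
  %elt = getelementptr %pascal* %p, i64 0, i32 1, i64 %i
  %val = load float* %elt
  ret float %val
}
```

The static type [0 x float] does not limit %i; only the bounds of the underlying allocated object do.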
@@ -1584,13 +1700,13 @@ ClassificationsThe function type can be thought of as a function signature. It consists of a return type and a list of formal parameter types. The return type of a - function type is a scalar type, a void type, or a struct type. If the return - type is a struct type then all struct elements must be of first class types, - and the struct must have at least one element.
+ function type is a scalar type, a void type, a struct type, or a union + type. If the return type is a struct type then all struct elements must be + of first class types, and the struct must have at least one element.- <returntype list> (<parameter list>) + <returntype> (<parameter list>)
...where '<parameter list>' is a comma-separated list of type @@ -1598,8 +1714,8 @@ Classifications which indicates that the function takes a variable number of arguments. Variable argument functions can access their arguments with the variable argument handling intrinsic - functions. '<returntype list>' is a comma-separated list of - first class type specifiers.
+ functions. '<returntype>' is any type except + label.function taking an i32, returning an i32 | |||
float (i16 signext, i32 *) *
+ float (i16, i32 *) *
|
- Pointer to a function that takes
- an i16 that should be sign extended and a
- pointer to i32, returning
- float.
+ | Pointer to a function that takes
+ an i16 and a pointer to i32,
+ returning float.
|
|
i32 (i8*, ...) | -A vararg function that takes at least one - pointer to i8 (char in C), - which returns an integer. This is the signature for printf in + | A vararg function that takes at least one + pointer to i8 (char in C), + which returns an integer. This is the signature for printf in LLVM. | |
{i32, i32} (i32) | -A function taking an i32, returning two - i32 values as an aggregate of type { i32, i32 } + | A function taking an i32, returning a + structure containing two i32 values |
Structures are accessed using 'load and - 'store' by getting a pointer to a field with - the 'getelementptr' instruction.
- +Structures in memory are accessed using 'load' + and 'store' by getting a pointer to a field + with the 'getelementptr' instruction. + Structures in registers are accessed using the + 'extractvalue' and + 'insertvalue' instructions.
{ <type list> } @@ -1666,11 +1783,6 @@ Classifications -Note that the code generator does not yet support large aggregate types to be - used as function return types. The specific limit on how large an aggregate - return type the code generator can currently handle is target-dependent, and - also dependent on the aggregate element types.
- @@ -1711,16 +1823,66 @@ Classifications + + + ++ ++Overview:
+A union type describes an object with size and alignment suitable for + an object of any one of a given set of types (also known as an "untagged" + union). It is similar in concept and usage to a + struct, except that all members of the union + have an offset of zero. The elements of a union may be any type that has a + size. Unions must have at least one member - empty unions are not allowed. +
+ +The size of the union as a whole will be the size of its largest member, + and the alignment requirements of the union as a whole will be the largest + alignment requirement of any member.
+ +Union members are accessed using 'load' and + 'store' by getting a pointer to a field with + the 'getelementptr' instruction. + Since all members are at offset zero, the getelementptr instruction does + not affect the address, only the type of the resulting pointer.
+ +Syntax:
++ union { <type list> } ++ +Examples:
++
+ ++ union { i32, i32*, float } +A union of three types: an i32, a pointer to + an i32, and a float. ++ ++ union { float, i32 (i32) * } +A union, where the first element is a float and the + second element is a pointer to a + function that takes an i32, returning + an i32. +Overview:
-As in many languages, the pointer type represents a pointer or reference to - another object, which must live in memory. Pointer types may have an optional - address space attribute defining the target-specific numbered address space - where the pointed-to object resides. The default address space is zero.
+The pointer type is used to specify memory locations. + Pointers are commonly used to reference objects in memory.
+ +Pointer types may have an optional address space attribute defining the + numbered address space where the pointed-to object resides. The default + address space is number zero. The semantics of non-zero address + spaces are target-specific.
Note that LLVM does not permit pointers to void (void*) nor does it permit pointers to labels (label*). Use i8* instead.
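A minimal sketch of the address space attribute on pointer types, in the IR syntax of this revision. The global @counter, the function @read_counter, and the choice of address space 5 are hypothetical; what a non-zero address space means depends on the target.

```llvm
; Hypothetical global placed in numbered address space 5.
@counter = addrspace(5) global i32 0

define i32 @read_counter() {
entry:
  ; The pointer type carries the address space: i32 addrspace(5)*.
  %v = load i32 addrspace(5)* @counter
  ret i32 %v
}
```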
@@ -1761,8 +1923,7 @@ ClassificationsA vector type is a simple derived type that represents a vector of elements. Vector types are used when multiple primitive data are operated in parallel using a single instruction (SIMD). A vector type requires a size (number of - elements) and an underlying primitive data type. Vectors must have a power - of two length (1, 2, 4, 8, 16 ...). Vector types are considered + elements) and an underlying primitive data type. Vector types are considered first class.
Syntax:
@@ -1789,11 +1950,6 @@ Classifications -Note that the code generator does not yet support large vector types to be - used as function return types. The specific limit on how large a vector - return type codegen can currently handle is target-dependent; currently it's - often a few times longer than a hardware vector register.
- @@ -1888,7 +2044,7 @@ Classifications
The string 'undef' can be used anywhere a constant is expected, and - indicates that the user of the value may recieve an unspecified bit-pattern. + indicates that the user of the value may receive an unspecified bit-pattern. Undefined values may be of any type (other than label or void) and be used anywhere a constant is permitted.
@@ -2059,9 +2224,9 @@ Unsafe: For example, if "%X" has a zero bit, then the output of the 'and' operation will always be a zero, no matter what the corresponding bit from the undef is. As such, it is unsafe to optimize or assume that the result of the and is undef. -However, it is safe to assume that all bits of the undef could be 0, and -optimize the and to 0. Likewise, it is safe to assume that all the bits of -the undef operand to the or could be set, allowing the or to be folded to +However, it is safe to assume that all bits of the undef could be 0, and +optimize the and to 0. Likewise, it is safe to assume that all the bits of +the undef operand to the or could be set, allowing the or to be folded to -1.%A = xor undef, undef - + %B = undef %C = xor %B, %B @@ -2116,7 +2281,7 @@ number of reasons, but the short answer is that an undef "variable" can arbitrarily change its value over its "live range". This is true because the "variable" doesn't actually have a live range. Instead, the value is logically read from arbitrary registers that happen to be around when needed, -so the value is not neccesarily consistent over time. In fact, %A and %C need +so the value is not necessarily consistent over time. In fact, %A and %C need to have the same semantics or the core LLVM "replace all uses with" concept would not hold. @@ -2142,7 +2307,7 @@ does not execute at all. This allows us to delete the divide and all code after it: since the undefined operation "can't happen", the optimizer can assume that it occurs in dead code. - +a: store undef -> %X @@ -2154,13 +2319,149 @@ b: unreachableThese examples reiterate the fdiv example: a store "of" an undefined value -can be assumed to not have any effect: we can assume that the value is +can be assumed to not have any effect: we can assume that the value is overwritten with bits that happen to match what was already there. 
However, a store "to" an undefined location could clobber arbitrary memory, therefore, it has undefined behavior.
Trap values are similar to undef values, however + instead of representing an unspecified bit pattern, they represent the + fact that an instruction or constant expression which cannot evoke side + effects has nevertheless detected a condition which results in undefined + behavior.
+ +There is currently no way of representing a trap value in the IR; they + only exist when produced by operations such as + add with the nsw flag.
+ +Trap value behavior is defined in terms of value dependence:
+ ++
Whenever a trap value is generated, all values which depend on it evaluate + to trap. If they have side effects, they evoke their side effects as if each + operand with a trap value were undef. If they have externally-visible side + effects, the behavior is undefined.
+ +Here are some examples:
+ ++entry: + %trap = sub nuw i32 0, 1 ; Results in a trap value. + %still_trap = and i32 %trap, 0 ; Whereas (and i32 undef, 0) would return 0. + %trap_yet_again = getelementptr i32* @h, i32 %still_trap + store i32 0, i32* %trap_yet_again ; undefined behavior + + store i32 %trap, i32* @g ; Trap value conceptually stored to memory. + %trap2 = load i32* @g ; Returns a trap value, not just undef. + + volatile store i32 %trap, i32* @g ; External observation; undefined behavior. + + %narrowaddr = bitcast i32* @g to i16* + %wideaddr = bitcast i32* @g to i64* + %trap3 = load i16* %narrowaddr ; Returns a trap value. + %trap4 = load i64* %wideaddr ; Returns a trap value. + + %cmp = icmp slt i32 %trap, 0 ; Returns a trap value. + br i1 %cmp, label %true, label %end ; Branch to either destination. + +true: + volatile store i32 0, i32* @g ; This is control-dependent on %cmp, so + ; it has undefined behavior. + br label %end + +end: + %p = phi i32 [ 0, %entry ], [ 1, %true ] + ; Both edges into this PHI are + ; control-dependent on %cmp, so this + ; always results in a trap value. + + volatile store i32 0, i32* @g ; %end is control-equivalent to %entry + ; so this is defined (ignoring earlier + ; undefined behavior in this example). ++
blockaddress(@function, %block)
+ +The 'blockaddress' constant computes the address of the specified + basic block in the specified function, and always has an i8* type. Taking + the address of the entry block is illegal.
+ +This value only has defined behavior when used as an operand to the + 'indirectbr' instruction or for comparisons + against null. Pointer equality tests between label addresses are undefined + behavior - though, again, comparison against null is ok, and no label is + equal to the null pointer. This may also be passed around as an opaque + pointer sized value as long as the bits are not inspected. This allows + ptrtoint and arithmetic to be performed on these values so long as + the original value is reconstituted before the indirectbr.
+ +Finally, some targets may provide defined semantics when + using the value as the operand to an inline assembly, but that is target + specific. +
+ +Embedded metadata provides a way to attach arbitrary data to the instruction - stream without affecting the behaviour of the program. There are two - metadata primitives, strings and nodes. All metadata has the - metadata type and is identified in syntax by a preceding exclamation - point ('!').
- -A metadata string is a string surrounded by double quotes. It can contain - any character by escaping non-printable characters with "\xx" where "xx" is - the two digit hex code. For example: "!"test\00"".
- -Metadata nodes are represented with notation similar to structure constants - (a comma separated list of elements, surrounded by braces and preceeded by an - exclamation point). For example: "!{ metadata !"test\00", i32 - 10}".
- -A metadata node will attempt to track changes to the values it holds. In the - event that a value is deleted, it will be replaced with a typeless - "null", such as "metadata !{null, i32 10}".
- -Optimizations may rely on metadata to provide additional information about - the program that isn't available in the instructions, or that isn't easily - computable. Similarly, the code generator may expect a certain metadata - format to be used to express debugging information.
- -@@ -2357,11 +2628,98 @@ call void asm sideeffect "eieio", ""()
In some cases inline asms will contain code that will not work unless the + stack is aligned in some way, such as calls or SSE instructions on x86, + yet will not contain code that does that alignment within the asm. + The compiler should make conservative assumptions about what the asm might + contain and should generate its usual stack alignment code in the prologue + if the 'alignstack' keyword is present:
+ ++call void asm alignstack "eieio", ""() ++
If both keywords appear the 'sideeffect' keyword must come + first.
+TODO: The format of the asm and constraints string still need to be documented here. Constraints on what can be done (e.g. duplication, moving, etc need to be documented). This is probably best done by reference to another document that covers inline asm from a holistic perspective.
+The call instructions that wrap inline asm nodes may have a "!srcloc" MDNode + attached to them that contains a constant integer. If present, the code + generator will use the integer as the location cookie value when reporting + errors through the LLVMContext error reporting mechanisms. This allows a + front-end to correlate backend errors that occur with inline asm back to the + source code that produced it. For example:
++call void asm sideeffect "something bad", ""(), !srcloc !42 +... +!42 = !{ i32 1234567 } ++
It is up to the front-end to make sense of the magic numbers it places in the + IR.
+ +LLVM IR allows metadata to be attached to instructions in the program that + can convey extra information about the code to the optimizers and code + generator. One example application of metadata is source-level debug + information. There are two metadata primitives: strings and nodes. All + metadata has the metadata type and is identified in syntax by a + preceding exclamation point ('!').
+ +A metadata string is a string surrounded by double quotes. It can contain + any character by escaping non-printable characters with "\xx" where "xx" is + the two digit hex code. For example: "!"test\00"".
+ +Metadata nodes are represented with notation similar to structure constants + (a comma separated list of elements, surrounded by braces and preceded by an + exclamation point). For example: "!{ metadata !"test\00", i32 + 10}". Metadata nodes can have any values as their operand.
+ +A named metadata is a collection of + metadata nodes, which can be looked up in the module symbol table. For + example: "!foo = metadata !{!4, !3}". + +
Metadata can be used as function arguments. Here the llvm.dbg.value + function takes two metadata arguments. + +
+ call void @llvm.dbg.value(metadata !24, i64 0, metadata !25) ++
Metadata can be attached to an instruction. Here metadata !21 is + attached to the add instruction using the !dbg identifier. + +
+ %indvar.next = add i64 %indvar, 1, !dbg !21 ++
TODO: Describe this.
++%0 = type { i32, void ()* } +@llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor }] ++
The @llvm.global_ctors array contains a list of constructor functions and associated priorities. The functions referenced by this array will be called in ascending order of priority (i.e. lowest first) when the module is loaded. The order of functions with the same priority is not defined. +
+%0 = type { i32, void ()* } +@llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor }] +-
TODO: Describe this.
+The @llvm.global_dtors array contains a list of destructor functions and associated priorities. The functions referenced by this array will be called in descending order of priority (i.e. highest first) when the module is unloaded. The order of functions with the same priority is not defined. +
There are six different terminator instructions: the +
There are seven different terminator instructions: the 'ret' instruction, the 'br' instruction, the 'switch' instruction, the + 'indirectbr' instruction, the 'invoke' instruction, the 'unwind' instruction, and the 'unreachable' instruction.
@@ -2539,14 +2907,6 @@ Instruction ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2 -Note that the code generator does not yet fully support large - return values. The specific sizes that are currently supported are - dependent on the target. For integers, on 32-bit targets the limit - is often 64 bits, and on 64-bit targets the limit is often 128 bits. - For aggregate types, the current limits are dependent on the element - types; for example targets are often limited to 2 total integer - elements and 2 total floating-point elements.
- @@ -2617,8 +2977,8 @@ IfUnequal:The switch instruction specifies a table of values and destinations. When the 'switch' instruction is executed, this table is searched for the given value. If the value is found, control flow is - transfered to the corresponding destination; otherwise, control flow is - transfered to the default destination.
+ transferred to the corresponding destination; otherwise, control flow is + transferred to the default destination.Depending on properties of the target machine and the particular @@ -2643,6 +3003,55 @@ IfUnequal: + + +
+ ++ indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ] ++ +
The 'indirectbr' instruction implements an indirect branch to a label + within the current function, whose address is specified by + "address". Address must be derived from a blockaddress constant.
+ +The 'address' argument is the address of the label to jump to. The + rest of the arguments indicate the full set of possible destinations that the + address may point to. Blocks are allowed to occur multiple times in the + destination list, though this isn't particularly useful.
+ +This destination list is required so that dataflow analysis has an accurate + understanding of the CFG.
+ +Control transfers to the block specified in the address argument. All + possible destination blocks must be listed in the label list, otherwise this + instruction has undefined behavior. This implies that jumps to labels + defined in other functions have undefined behavior as well.
+ +This is typically implemented with a jump through a register.
+ ++ indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ] ++ +
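The one-line example above can be placed in context with a minimal sketch that pairs 'indirectbr' with blockaddress constants, in the IR syntax of this revision. The function @dispatch and its labels are hypothetical.

```llvm
; Hypothetical function: picks one of two block addresses, then jumps to it.
define i32 @dispatch(i1 %flag) {
entry:
  ; blockaddress constants always have type i8*. Neither refers to the
  ; entry block, whose address may not be taken.
  %addr = select i1 %flag, i8* blockaddress(@dispatch, %one),
                           i8* blockaddress(@dispatch, %two)

  ; Every block that %addr can point to appears in the destination list.
  indirectbr i8* %addr, [ label %one, label %two ]

one:
  ret i32 1

two:
  ret i32 2
}
```

Leaving %two out of the destination list would make the jump undefined whenever %flag is false.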
Note that the code generator does not yet completely support unwind, and +that the invoke/unwind semantics are likely to change in future versions.
+%retval = invoke i32 @Test(i32 15) to label %Continue @@ -2754,6 +3167,9 @@ Instruction
Note that the code generator does not yet completely support unwind, and +that the invoke/unwind semantics are likely to change in future versions.
+ @@ -2829,7 +3245,8 @@ Instructionnuw and nsw stand for "No Unsigned Wrap" and "No Signed Wrap", respectively. If the nuw and/or nsw keywords are present, the result value of the add - is undefined if unsigned and/or signed overflow, respectively, occurs.
+ is a trap value if unsigned and/or signed overflow, + respectively, occurs.@@ -2909,7 +3326,8 @@ Instructionnuw and nsw stand for "No Unsigned Wrap" and "No Signed Wrap", respectively. If the nuw and/or nsw keywords are present, the result value of the sub - is undefined if unsigned and/or signed overflow, respectively, occurs.
+ is a trap value if unsigned and/or signed overflow, + respectively, occurs.Example:
@@ -2977,7 +3395,7 @@ InstructionThe two arguments to the 'mul' instruction must be integer or vector of integer values. Both arguments must have identical types.
- +Semantics:
The value produced is the integer product of the two operands.
@@ -2995,7 +3413,8 @@ Instructionnuw and nsw stand for "No Unsigned Wrap" and "No Signed Wrap", respectively. If the nuw and/or nsw keywords are present, the result value of the mul - is undefined if unsigned and/or signed overflow, respectively, occurs.
+ is a trap value if unsigned and/or signed overflow, + respectively, occurs.Example:
@@ -3049,7 +3468,7 @@ InstructionThe 'udiv' instruction returns the quotient of its two operands.
Arguments:
-The two arguments to the 'udiv' instruction must be +
The two arguments to the 'udiv' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3084,7 +3503,7 @@ InstructionThe 'sdiv' instruction returns the quotient of its two operands.
Arguments:
-The two arguments to the 'sdiv' instruction must be +
The two arguments to the 'sdiv' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3100,8 +3519,8 @@ Instruction a 32-bit division of -2147483648 by -1.If the exact keyword is present, the result value of the - sdiv is undefined if the result would be rounded or if overflow - would occur.
+ sdiv is a trap value if the result would + be rounded or if overflow would occur.Example:
@@ -3155,7 +3574,7 @@ Instruction division of its two arguments.Arguments:
-The two arguments to the 'urem' instruction must be +
The two arguments to the 'urem' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3195,7 +3614,7 @@ Instruction elements must be integers.Arguments:
-The two arguments to the 'srem' instruction must be +
The two arguments to the 'srem' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3290,7 +3709,7 @@ InstructionBoth arguments to the 'shl' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.
- +Semantics:
The value produced is op1 * 2^op2 mod 2^n, where n is the width of the result. If op2 @@ -3326,7 +3745,7 @@ Instruction operand shifted to the right a specified number of bits with zero fill.
Arguments:
-Both arguments to the 'lshr' instruction must be the same +
Both arguments to the 'lshr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.
@@ -3366,7 +3785,7 @@ Instruction extension.Arguments:
-Both arguments to the 'ashr' instruction must be the same +
Both arguments to the 'ashr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.
@@ -3406,7 +3825,7 @@ Instruction operands.Arguments:
-The two arguments to the 'and' instruction must be +
The two arguments to the 'and' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3465,7 +3884,7 @@ Instruction two operands.Arguments:
-The two arguments to the 'or' instruction must be +
The two arguments to the 'or' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3528,7 +3947,7 @@ Instruction complement" operation, which is the "~" operator in C.Arguments:
-The two arguments to the 'xor' instruction must be +
The two arguments to the 'xor' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3576,7 +3995,7 @@ Instruction -+ @@ -3622,7 +4041,7 @@ InstructionExample:
- %result = extractelement <4 x i32> %vec, i32 0 ; yields i32 + <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32@@ -3658,7 +4077,7 @@ InstructionExample:
- %result = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32> + <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>@@ -3699,26 +4118,27 @@ InstructionExample:
- %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, + <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, <4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32> - %result = shufflevector <4 x i32> %v1, <4 x i32> undef, + <result> = shufflevector <4 x i32> %v1, <4 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle. - %result = shufflevector <8 x i32> %v1, <8 x i32> undef, + <result> = shufflevector <8 x i32> %v1, <8 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, + <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32>-+-@@ -3735,14 +4155,14 @@ InstructionLLVM supports several instructions for working with aggregate values.
+LLVM supports several instructions for working with + aggregate values.
Overview:
-The 'extractvalue' instruction extracts the value of a struct field - or array element from an aggregate value.
+The 'extractvalue' instruction extracts the value of a member field + from an aggregate value.
Arguments:
The first operand of an 'extractvalue' instruction is a value - of struct or array type. The - operands are constant indices to specify which value to extract in a similar - manner as indices in a + of struct, union or + array type. The operands are constant indices to + specify which value to extract in a similar manner as indices in a 'getelementptr' instruction.
Semantics:
@@ -3751,7 +4171,7 @@ InstructionExample:
- %result = extractvalue {i32, float} %agg, 0 ; yields i32 + <result> = extractvalue {i32, float} %agg, 0 ; yields i32@@ -3765,20 +4185,19 @@ InstructionSyntax:
- <result> = insertvalue <aggregate type> <val>, <ty> <val>, <idx> ; yields <n x <ty>> + <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx> ; yields <aggregate type>Overview:
-The 'insertvalue' instruction inserts a value into a struct field or - array element in an aggregate.
- +The 'insertvalue' instruction inserts a value into a member field + in an aggregate value.
Arguments:
The first operand of an 'insertvalue' instruction is a value - of struct or array type. The - second operand is a first-class value to insert. The following operands are - constant indices indicating the position at which to insert the value in a - similar manner as indices in a + of struct, union or + array type. The second operand is a first-class + value to insert. The following operands are constant indices indicating + the position at which to insert the value in a similar manner as indices in a 'getelementptr' instruction. The value to insert must have the same type as the value identified by the indices.
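Because aggregates are SSA values, 'insertvalue' does not modify its operand; it yields a new aggregate. The Python sketch below models that with tuples. It is only an illustration: `UNDEF`, `extractvalue` and `insertvalue` are hypothetical helpers, and the type check required by the text above is not modelled.

```python
UNDEF = object()  # stand-in for LLVM's 'undef' (an assumption of this sketch)


def extractvalue(agg, *indices):
    """Follow constant indices into a nested aggregate and return the member."""
    for idx in indices:
        agg = agg[idx]
    return agg


def insertvalue(agg, elt, *indices):
    """Return a copy of agg with the member selected by indices replaced by elt."""
    head, rest = indices[0], indices[1:]
    member = elt if not rest else insertvalue(agg[head], elt, *rest)
    return agg[:head] + (member,) + agg[head + 1:]


agg0 = (UNDEF, UNDEF)                # {i32, float} undef
agg1 = insertvalue(agg0, 1, 0)       # {i32 1, float undef}
agg2 = insertvalue(agg1, 2.5, 1)     # {i32 1, float 2.5}
```

Note that `agg0` and `agg1` are unchanged after the later inserts, which mirrors how each 'insertvalue' result is a distinct value.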
@@ -3790,14 +4209,15 @@ InstructionExample:
- %result = insertvalue {i32, float} %agg, i32 1, 0 ; yields {i32, float} + %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef} + %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}-+ @@ -3805,93 +4225,11 @@ InstructionA key design point of an SSA-based representation is how it represents memory. In LLVM, no memory locations are in SSA form, which makes things - very simple. This section describes how to read, write, allocate, and free + very simple. This section describes how to read, write, and allocate memory in LLVM.
- - - -- -- - - - -Syntax:
-- <result> = malloc <type>[, i32 <NumElements>][, align <alignment>] ; yields {type*}:result -- -Overview:
-The 'malloc' instruction allocates memory from the system heap and - returns a pointer to it. The object is always allocated in the generic - address space (address space zero).
- -Arguments:
-The 'malloc' instruction allocates - sizeof(<type>)*NumElements bytes of memory from the operating - system and returns a pointer of the appropriate type to the program. If - "NumElements" is specified, it is the number of elements allocated, otherwise - "NumElements" is defaulted to be one. If a constant alignment is specified, - the value result of the allocation is guaranteed to be aligned to at least - that boundary. If not specified, or if zero, the target can choose to align - the allocation on any convenient boundary compatible with the type.
- -'type' must be a sized type.
- -Semantics:
-Memory is allocated using the system "malloc" function, and a - pointer is returned. The result of a zero byte allocation is undefined. The - result is null if there is insufficient memory available.
- -Example:
-- %array = malloc [4 x i8] ; yields {[%4 x i8]*}:array - - %size = add i32 2, 2 ; yields {i32}:size = i32 4 - %array1 = malloc i8, i32 4 ; yields {i8*}:array1 - %array2 = malloc [12 x i8], i32 %size ; yields {[12 x i8]*}:array2 - %array3 = malloc i32, i32 4, align 1024 ; yields {i32*}:array3 - %array4 = malloc i32, align 1024 ; yields {i32*}:array4 -- -Note that the code generator does not yet respect the alignment value.
- -- --Syntax:
-- free <type> <value> ; yields {void} -- -Overview:
-The 'free' instruction returns memory back to the unused memory heap - to be reallocated in the future.
- -Arguments:
-'value' shall be a pointer value that points to a value that was - allocated with the 'malloc' instruction.
- -Semantics:
-Access to the memory pointed to by the pointer is no longer defined after - this instruction executes. If the pointer is null, the operation is a - noop.
- -Example:
-- %array = malloc [4 x i8] ; yields {[4 x i8]*}:array - free [4 x i8]* %array -- -'alloca' Instruction @@ -3951,8 +4289,9 @@ InstructionSyntax:
- <result> = load <ty>* <pointer>[, align <alignment>] - <result> = volatile load <ty>* <pointer>[, align <alignment>] + <result> = load <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] + <result> = volatile load <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] + !<index> = !{ i32 1 }Overview:
@@ -3963,18 +4302,25 @@ Instruction from which to load. The pointer must point to a first class type. If the load is marked as volatile, then the optimizer is not allowed to modify the - number or order of execution of this load with other - volatile load and store - instructions. + number or order of execution of this load with other volatile operations. -The optional constant "align" argument specifies the alignment of the +
The optional constant align argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an - omitted "align" argument means that the operation has the preferential + omitted align argument means that the operation has the preferential alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the - alignment results in an undefined behavior. Underestimating the alignment may + alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.
+The optional !nontemporal metadata must reference a single + metadata name <index> corresponding to a metadata node with + one i32 entry of value 1. The existence of + the !nontemporal metadata on the instruction tells the optimizer + and code generator that this load is not expected to be reused in the cache. + The code generator may select special instructions to save cache bandwidth, + such as the MOVNT instruction on x86.
+Semantics:
The location of memory pointed to is loaded. If the value being loaded is of scalar type then the number of bytes read does not exceed the minimum number @@ -4001,8 +4347,8 @@ Instruction
Syntax:
- store <ty> <value>, <ty>* <pointer>[, align <alignment>] ; yields {void}
- volatile store <ty> <value>, <ty>* <pointer>[, align <alignment>] ; yields {void}
+ store <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields {void}
+ volatile store <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields {void}
Overview:
@@ -4013,11 +4359,10 @@ Instruction and an address at which to store it. The type of the '<pointer>' operand must be a pointer to the first class type of the - '<value>' operand. If the store is marked - as volatile, then the optimizer is not allowed to modify the number - or order of execution of this store with other - volatile load and store - instructions. + '<value>' operand. If the store is marked as + volatile, then the optimizer is not allowed to modify the number or + order of execution of this store with other volatile operations.The optional constant "align" argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an @@ -4027,6 +4372,15 @@ Instruction alignment results in an undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.
+The optional !nontemporal metadata must reference a single metadata + name <index> corresponding to a metadata node with one i32 entry of + value 1. The existence of the !nontemporal metadata on the + instruction tells the optimizer and code generator that this store is + not expected to be reused in the cache. The code generator may + select special instructions to save cache bandwidth, such as the + MOVNT instruction on x86.
Semantics:
The contents of memory are updated to contain '<value>' at the location specified by the '<pointer>' operand. If @@ -4061,8 +4415,8 @@ Instruction
Overview:
The 'getelementptr' instruction is used to get the address of a - subelement of an aggregate data structure. It performs address calculation - only and does not access memory.
+ subelement of an aggregate data structure. + It performs address calculation only and does not access memory.Arguments:
The first argument is always a pointer, and forms the basis of the @@ -4072,15 +4426,15 @@ Instruction indexes the pointer value given as the first argument, the second index indexes a value of the type pointed to (not necessarily the value directly pointed to, since the first index can be non-zero), etc. The first type - indexed into must be a pointer value, subsequent types can be arrays, vectors - and structs. Note that subsequent types being indexed into can never be - pointers, since that would require loading the pointer before continuing - calculation.
+ indexed into must be a pointer value, subsequent types can be arrays, + vectors, structs and unions. Note that subsequent types being indexed into + can never be pointers, since that would require loading the pointer before + continuing calculation.The type of each index argument depends on the type it is indexing into. - When indexing into a (optionally packed) structure, only i32 integer - constants are allowed. When indexing into an array, pointer or - vector, integers of any width are allowed, and they are not required to be + When indexing into a (optionally packed) structure or union, only i32 + integer constants are allowed. When indexing into an array, pointer + or vector, integers of any width are allowed, and they are not required to be constant.
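As a rough illustration of the index rules above, the following Python sketch computes the byte offset that a getelementptr would add to its base pointer. It is not LLVM code: the tuple encoding of types and the names `sizeof` and `gep_offset` are hypothetical, and the layout assumes packed structures with no padding, which real targets generally do not guarantee.

```python
# Hypothetical type encoding: ("int", bits), ("array", count, elem), ("struct", [fields]).

def sizeof(ty):
    """Size in bytes, assuming a packed layout with no padding (an assumption)."""
    kind = ty[0]
    if kind == "int":
        return ty[1] // 8
    if kind == "array":
        return ty[1] * sizeof(ty[2])
    if kind == "struct":
        return sum(sizeof(field) for field in ty[1])
    raise TypeError("unsupported type: %r" % (ty,))


def gep_offset(pointee, indices):
    """Byte offset added to the base pointer; no memory is accessed."""
    # The first index steps over whole objects of the pointed-to type.
    offset = indices[0] * sizeof(pointee)
    ty = pointee
    for idx in indices[1:]:
        if ty[0] == "struct":
            # Struct indices select a field, which is why the IR requires constants.
            offset += sum(sizeof(field) for field in ty[1][:idx])
            ty = ty[1][idx]
        elif ty[0] == "array":
            offset += idx * sizeof(ty[2])
            ty = ty[2]
        else:
            raise TypeError("cannot index into a scalar")
    return offset


i8, i32 = ("int", 8), ("int", 32)
record = ("struct", [i32, ("array", 4, i8)])
field_offset = gep_offset(record, [0, 1, 2])   # third byte of the array field
next_record = gep_offset(record, [1])          # start of the following record
```

The sketch only does arithmetic on the indices, which is the point the text makes: the instruction computes an address and never loads through it.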
For example, let's consider a C code fragment and how it gets compiled to @@ -4147,13 +4501,14 @@ entry:
If the inbounds keyword is present, the result value of the - getelementptr is undefined if the base pointer is not an - in bounds address of an allocated object, or if any of the addresses - that would be formed by successive addition of the offsets implied by the - indices to the base address with infinitely precise arithmetic are not an - in bounds address of that allocated object. - The in bounds addresses for an allocated object are all the addresses - that point into the object, plus the address one byte past the end.
+ getelementptr is a trap value if the + base pointer is not an in bounds address of an allocated object, + or if any of the addresses that would be formed by successive addition of + the offsets implied by the indices to the base address with infinitely + precise arithmetic are not an in bounds address of that allocated + object. The in bounds addresses for an allocated object are all + the addresses that point into the object, plus the address one byte past + the end.If the inbounds keyword is not present, the offsets are added to the base address with silently-wrapping two's complement arithmetic, and @@ -4225,7 +4580,7 @@ entry:
%X = trunc i32 257 to i8 ; yields i8:1 %Y = trunc i32 123 to i1 ; yields i1:true - %Y = trunc i32 122 to i1 ; yields i1:false + %Z = trunc i32 122 to i1 ; yields i1:false@@ -4242,15 +4597,15 @@ entry:Overview:
-The 'zext' instruction zero extends its operand to type +
The 'zext' instruction zero extends its operand to type ty2.
Arguments:
-The 'zext' instruction takes a value to cast, which must be of +
The 'zext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, which must also be of integer type. The bit size of the - value must be smaller than the bit size of the destination type, + value must be smaller than the bit size of the destination type, ty2.
Semantics:
@@ -4282,10 +4637,10 @@ entry:The 'sext' sign extends value to the type ty2.
Arguments:
-The 'sext' instruction takes a value to cast, which must be of +
The 'sext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, which must also be of integer type. The bit size of the - value must be smaller than the bit size of the destination type, + value must be smaller than the bit size of the destination type, ty2.
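The difference between 'zext' and 'sext' is easiest to see on bit patterns. The Python sketch below models both on unsigned bit patterns; the function names and explicit width parameters are assumptions made for this illustration, and the width check mirrors the requirement that the source be strictly narrower than ty2.

```python
def _check_widths(src_bits, dst_bits):
    # Both casts require the source to be strictly narrower than the destination.
    if not 0 < src_bits < dst_bits:
        raise ValueError("source must be narrower than destination")


def zext(value, src_bits, dst_bits):
    """Model of 'zext': the new high bits are all zero."""
    _check_widths(src_bits, dst_bits)
    return value % (1 << src_bits)


def sext(value, src_bits, dst_bits):
    """Model of 'sext': the new high bits copy the source sign bit."""
    _check_widths(src_bits, dst_bits)
    value %= 1 << src_bits
    if value >> (src_bits - 1):      # sign bit set
        value -= 1 << src_bits       # reinterpret as a negative number
    return value % (1 << dst_bits)
```

With these definitions an i8 holding 0xFF becomes 0x000000FF under `zext` to i32 and 0xFFFFFFFF under `sext` to i32, and an i1 `true` sign-extends to all ones.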
Semantics:
@@ -4323,12 +4678,12 @@ entry:The 'fptrunc' instruction takes a floating point value to cast and a floating point type to cast it to. The size of value must be larger than the size of - ty2. This implies that fptrunc cannot be used to make a + ty2. This implies that fptrunc cannot be used to make a no-op cast.
Semantics:
The 'fptrunc' instruction truncates a value from a larger - floating point type to a smaller + floating point type to a smaller floating point type. If the value cannot fit within the destination type, ty2, then the results are undefined.
@@ -4357,7 +4712,7 @@ entry: floating point value.Arguments:
-The 'fpext' instruction takes a +
The 'fpext' instruction takes a floating point value to cast, and a floating point type to cast it to. The source type must be smaller than the destination type.
@@ -4400,7 +4755,7 @@ entry: vector integer type with the same number of elements as tySemantics:
-The 'fptoui' instruction converts its +
The 'fptoui' instruction converts its floating point operand into the nearest (rounding towards zero) unsigned integer value. If the value cannot fit in ty2, the results are undefined.
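The rounding rule and the undefined case can be sketched in Python as follows. This is a model, not LLVM's implementation: returning `None` to stand for "undefined" is a choice made for the example, and `fptoui` and `fptosi` here are ordinary Python functions with an explicit bit width.

```python
import math


def _truncate(x):
    """Round towards zero; NaN and infinities have no integer value."""
    if math.isnan(x) or math.isinf(x):
        return None
    return math.trunc(x)


def fptoui(x, bits):
    """Truncated value if it fits an unsigned integer of the given width, else None."""
    t = _truncate(x)
    if t is not None and 0 <= t < (1 << bits):
        return t
    return None


def fptosi(x, bits):
    """Truncated value if it fits a signed integer of the given width, else None."""
    t = _truncate(x)
    limit = 1 << (bits - 1)
    if t is not None and -limit <= t < limit:
        return t
    return None
```

A value such as 1.04E+17 cannot fit in an i8, so the model reports it as undefined rather than picking a wrapped or saturated result.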
@@ -4409,7 +4764,7 @@ entry:%X = fptoui double 123.0 to i32 ; yields i32:123 %Y = fptoui float 1.0E+300 to i1 ; yields undefined:1 - %X = fptoui float 1.04E+17 to i8 ; yields undefined:1 + %Z = fptoui float 1.04E+17 to i8 ; yields undefined:1@@ -4426,7 +4781,7 @@ entry:
The 'fptosi' instruction converts +
The 'fptosi' instruction converts floating point value to type ty2.
@@ -4438,7 +4793,7 @@ entry: vector integer type with the same number of elements as tyThe 'fptosi' instruction converts its +
The 'fptosi' instruction converts its floating point operand into the nearest (rounding towards zero) signed integer value. If the value cannot fit in ty2, the results are undefined.
@@ -4447,7 +4802,7 @@ entry:%X = fptosi double -123.0 to i32 ; yields i32:-123 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1 - %X = fptosi float 1.04E+17 to i8 ; yields undefined:1 + %Z = fptosi float 1.04E+17 to i8 ; yields undefined:1@@ -4591,8 +4946,8 @@ entry:
%X = inttoptr i32 255 to i32* ; yields zero extension on 64-bit architecture - %X = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture - %Y = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture + %Y = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture + %Z = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture@@ -4635,7 +4990,7 @@ entry:
%X = bitcast i8 255 to i8 ; yields i8 :-1 %Y = bitcast i32* %x to sint* ; yields sint*:%x - %Z = bitcast <2 x int> %V to i64; ; yields i64: %V + %Z = bitcast <2 x int> %V to i64; ; yields i64: %V@@ -4691,15 +5046,15 @@ entry:
The 'icmp' compares op1 and op2 according to the condition code given as cond. The comparison performed always yields - either an i1 or vector of i1 + either an i1 or vector of i1 result, as follows:
If the operands are floating point scalars, then the result type is a boolean -(i1).
+(i1).If the operands are floating point vectors, then the result type is a vector of boolean with the same number of elements as the operands being @@ -4810,48 +5165,48 @@ entry:
The 'fcmp' instruction compares op1 and op2 according to the condition code given as cond. If the operands are vectors, then the vectors are compared element by element. Each comparison - performed always yields an i1 result, as + performed always yields an i1 result, as follows:
This instruction requires several arguments:
llvm::GuaranteedTailCallOpt is true.
LLVM treats calls to some functions with names and arguments that match the -standard C library as being the C library functions, and may perform -optimizations or generate code for them under that assumption. These -functions currently include: -acos, asin, atan, atan2, ceil, cos, cosf, cosh, exp, fabs, floor, fmod, log, -log10, malloc, pow, sin, sinh, sqrt, sqrtf, sin, sinf, tan, tanh.
+standard C99 library as being the C99 library functions, and may perform +optimizations or generate code for them under that assumption. This is +something we'd like to change in the future to provide better support for +freestanding environments and non-C-based languages. @@ -5144,7 +5517,7 @@ log10, malloc, pow, sin, sinh, sqrt, sqrtf, sin, sinf, tan, tanh. suffix is required. Because the argument's type is matched against the return type, it does not require its own name suffix. -To learn how to add an intrinsic function, please see the +
To learn how to add an intrinsic function, please see the Extending LLVM Guide.
@@ -5606,7 +5979,7 @@ LLVM.This intrinsic does not modify the behavior of the program. Backends that do - not support this intrinisic may ignore it.
+ not support this intrinsic may ignore it. @@ -5660,17 +6033,14 @@ LLVM.This is an overloaded intrinsic. You can use llvm.memcpy on any - integer bit width. Not all targets support all bit widths however.
+ integer bit width and for different address spaces. Not all targets support + all bit widths however.- declare void @llvm.memcpy.i8(i8 * <dest>, i8 * <src>, - i8 <len>, i32 <align>) - declare void @llvm.memcpy.i16(i8 * <dest>, i8 * <src>, - i16 <len>, i32 <align>) - declare void @llvm.memcpy.i32(i8 * <dest>, i8 * <src>, - i32 <len>, i32 <align>) - declare void @llvm.memcpy.i64(i8 * <dest>, i8 * <src>, - i64 <len>, i32 <align>) + declare void @llvm.memcpy.p0i8.p0i8.i32(i8 * <dest>, i8 * <src>, + i32 <len>, i32 <align>, i1 <isvolatile>) + declare void @llvm.memcpy.p0i8.p0i8.i64(i8 * <dest>, i8 * <src>, + i64 <len>, i32 <align>, i1 <isvolatile>)
Note that, unlike the standard libc function, the llvm.memcpy.* - intrinsics do not return a value, and takes an extra alignment argument.
+ intrinsics do not return a value, take extra alignment/isvolatile arguments + and the pointers can be in specified address spaces. The first argument is a pointer to the destination, the second is a pointer to the source. The third argument is an integer argument specifying the - number of bytes to copy, and the fourth argument is the alignment of the - source and destination locations.
+ number of bytes to copy, the fourth argument is the alignment of the + source and destination locations, and the fifth is a boolean indicating a + volatile access. -If the call to this intrinisic has an alignment value that is not 0 or 1, +
If the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that both the source and destination pointers are aligned to that boundary.
+If the isvolatile parameter is true, the + llvm.memcpy call is a volatile operation. + The detailed access behavior is not very cleanly specified and it is unwise + to depend on it.
+The 'llvm.memcpy.*' intrinsics copy a block of memory from the source location to the destination location, which are not allowed to overlap. It copies "len" bytes of memory over. If the argument is known to @@ -5708,17 +6087,14 @@ LLVM.
This is an overloaded intrinsic. You can use llvm.memmove on any integer bit - width. Not all targets support all bit widths however.
+ width and for different address spaces. Not all targets support all bit + widths however. - declare void @llvm.memmove.i8(i8 * <dest>, i8 * <src>, - i8 <len>, i32 <align>) - declare void @llvm.memmove.i16(i8 * <dest>, i8 * <src>, - i16 <len>, i32 <align>) - declare void @llvm.memmove.i32(i8 * <dest>, i8 * <src>, - i32 <len>, i32 <align>) - declare void @llvm.memmove.i64(i8 * <dest>, i8 * <src>, - i64 <len>, i32 <align>) + declare void @llvm.memmove.p0i8.p0i8.i32(i8 * <dest>, i8 * <src>, + i32 <len>, i32 <align>, i1 <isvolatile>) + declare void @llvm.memmove.p0i8.p0i8.i64(i8 * <dest>, i8 * <src>, + i64 <len>, i32 <align>, i1 <isvolatile>)
Note that, unlike the standard libc function, the llvm.memmove.* - intrinsics do not return a value, and takes an extra alignment argument.
+ intrinsics do not return a value, take extra alignment/isvolatile arguments + and the pointers can be in specified address spaces. The first argument is a pointer to the destination, the second is a pointer to the source. The third argument is an integer argument specifying the - number of bytes to copy, and the fourth argument is the alignment of the - source and destination locations.
+ number of bytes to copy, the fourth argument is the alignment of the + source and destination locations, and the fifth is a boolean indicating a + volatile access. -If the call to this intrinisic has an alignment value that is not 0 or 1, +
If the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that the source and destination pointers are aligned to that boundary.
+If the isvolatile parameter is true, the + llvm.memmove call is a volatile operation. + The detailed access behavior is not very cleanly specified and it is unwise + to depend on it.
+The 'llvm.memmove.*' intrinsics copy a block of memory from the source location to the destination location, which may overlap. It copies "len" bytes of memory over. If the argument is known to be aligned to some @@ -5758,17 +6143,14 @@ LLVM.
This is an overloaded intrinsic. You can use llvm.memset on any integer bit - width. Not all targets support all bit widths however.
+ width and for different address spaces. Not all targets support all bit + widths however.- declare void @llvm.memset.i8(i8 * <dest>, i8 <val>, - i8 <len>, i32 <align>) - declare void @llvm.memset.i16(i8 * <dest>, i8 <val>, - i16 <len>, i32 <align>) - declare void @llvm.memset.i32(i8 * <dest>, i8 <val>, - i32 <len>, i32 <align>) - declare void @llvm.memset.i64(i8 * <dest>, i8 <val>, - i64 <len>, i32 <align>) + declare void @llvm.memset.p0i8.i32(i8 * <dest>, i8 <val>, + i32 <len>, i32 <align>, i1 <isvolatile>) + declare void @llvm.memset.p0i8.i64(i8 * <dest>, i8 <val>, + i64 <len>, i32 <align>, i1 <isvolatile>)
Note that, unlike the standard libc function, the llvm.memset - intrinsic does not return a value, and takes an extra alignment argument.
+ intrinsic does not return a value, takes extra alignment/volatile arguments, + and the destination can be in an arbitrary address space.The first argument is a pointer to the destination to fill, the second is the @@ -5784,10 +6167,15 @@ LLVM.
specifying the number of bytes to fill, and the fourth argument is the known alignment of destination location. -If the call to this intrinisic has an alignment value that is not 0 or 1, +
If the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that the destination pointer is aligned to that boundary.
+If the isvolatile parameter is true, the + llvm.memset call is a volatile operation. + The detailed access behavior is not very cleanly specified and it is unwise + to depend on it.
+The 'llvm.memset.*' intrinsics fill "len" bytes of memory starting at the destination location. If the argument is known to be aligned to some @@ -6407,6 +6795,97 @@ LLVM.
+ + + +Half precision floating point is a storage-only format. This means that it is + a dense encoding (in memory) but does not support computation in the + format.
+This means that code must first load the half-precision floating point + value as an i16, then convert it to float with llvm.convert.from.fp16. + Computation can then be performed on the float value (including extending to + double etc). To store the value back to memory, it is first converted to + float if needed, then converted to i16 with + llvm.convert.to.fp16, then + stored as an i16 value.
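The load, convert, compute, convert, store sequence described above can be mimicked in Python with the standard `struct` module, whose `e` format is IEEE 754 half precision. This is a sketch of the storage-only idea, not of the intrinsics themselves: `convert_to_fp16` and `convert_from_fp16` are hypothetical helpers, and Python's float is double precision where the intrinsics use single precision.

```python
import struct


def convert_to_fp16(x):
    """Encode a float as the 16-bit pattern that would be stored as an i16."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def convert_from_fp16(bits):
    """Decode a 16-bit pattern (loaded as an i16) back into a float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


stored = convert_to_fp16(1.5)              # what sits in memory
loaded = convert_from_fp16(stored)         # widen before doing any arithmetic
result = convert_to_fp16(loaded * 2.0)     # compute in float, narrow to store
```

All arithmetic happens on the widened value; the 16-bit pattern is only ever moved to and from memory.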
++ declare i16 @llvm.convert.to.fp16(f32 %a) ++ +
The 'llvm.convert.to.fp16' intrinsic function performs + a conversion from single precision floating point format to half precision + floating point format.
+The intrinsic function takes a single argument: the value to be + converted.
+ +The 'llvm.convert.to.fp16' intrinsic function performs + a conversion from single precision floating point format to half precision + floating point format. The return value is an i16 which + contains the converted number.
+ ++ %res = call i16 @llvm.convert.to.fp16(f32 %a) + store i16 %res, i16* @x, align 2 ++ +
+ declare f32 @llvm.convert.from.fp16(i16 %a) ++ +
The 'llvm.convert.from.fp16' intrinsic function performs + a conversion from half precision floating point format to single precision + floating point format.
+The intrinsic function takes a single argument: the value to be + converted.
+The 'llvm.convert.from.fp16' intrinsic function performs a + conversion from half precision floating point format to single + precision floating point format. The input half-float value is represented by + an i16 value.
+ ++ %a = load i16* @x, align 2 + %res = call f32 @llvm.convert.from.fp16(i16 %a) ++ +
The llvm.memory.barrier intrinsic requires five boolean arguments. Each of the first four arguments enables a specific barrier as listed below. The - fith argument specifies that the barrier applies to io or device or uncached + fifth argument specifies that the barrier applies to io or device or uncached memory.
-%ptr = malloc i32 +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* store i32 4, %ptr %result1 = load i32* %ptr ; yields {i32}:result1 = 4 @@ -6648,7 +7128,8 @@ LLVM.Examples:
-%ptr = malloc i32 +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* store i32 4, %ptr %val1 = add i32 4, 4 @@ -6703,7 +7184,8 @@ LLVM.Examples:
-%ptr = malloc i32 +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* store i32 4, %ptr %val1 = add i32 4, 4 @@ -6758,8 +7240,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 4, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 4, %ptr %result1 = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 4 ) ; yields {i32}:result1 = 4 %result2 = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 2 ) @@ -6792,7 +7275,7 @@ LLVM.Overview:
-This intrinsic subtracts delta to the value stored in memory at +
This intrinsic subtracts delta from the value stored in memory at ptr. It yields the original value at ptr.
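The "yields the original value" behaviour is the usual fetch-and-subtract pattern. A minimal Python model follows; the `AtomicCell` class is hypothetical, a lock stands in for hardware atomicity, and the 32-bit wrap-around is an assumption matching the i32 variant used in the examples.

```python
import threading


class AtomicCell:
    """Hypothetical model of an integer memory location updated atomically."""

    def __init__(self, value, bits=32):
        self._modulus = 1 << bits
        self._value = value % self._modulus
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def load_sub(self, delta):
        """Subtract delta from the stored value and return the value seen before."""
        with self._lock:
            old = self._value
            self._value = (old - delta) % self._modulus
            return old


cell = AtomicCell(8)
result1 = cell.load_sub(4)   # returns the original 8, memory now holds 4
result2 = cell.load_sub(2)   # returns 4, memory now holds 2
```

The read and the write happen under one lock acquisition, so no other thread can observe or modify the location between them.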
Arguments:
@@ -6809,8 +7292,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 8, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 8, %ptr %result1 = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 4 ) ; yields {i32}:result1 = 8 %result2 = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 2 ) @@ -6886,8 +7370,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 0x0F0F, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 0x0F0F, %ptr %result0 = call i32 @llvm.atomic.load.nand.i32.p0i32( i32* %ptr, i32 0xFF ) ; yields {i32}:result0 = 0x0F0F %result1 = call i32 @llvm.atomic.load.and.i32.p0i32( i32* %ptr, i32 0xFF ) @@ -6946,7 +7431,7 @@ LLVM.Overview:
-These intrinsics takes the signed or unsigned minimum or maximum of +
These intrinsics take the signed or unsigned minimum or maximum of delta and the value stored in memory at ptr. They yield the original value at ptr.
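Signed and unsigned variants can pick different winners for the same bit patterns. The Python sketch below shows this for 32-bit values; the helper names are hypothetical, atomicity is not modelled, and each function returns a pair of the original value (what the intrinsic yields) and the new stored bit pattern.

```python
BITS = 32
MODULUS = 1 << BITS


def _as_signed(bits):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    bits %= MODULUS
    return bits - MODULUS if bits >= MODULUS // 2 else bits


def load_min(stored, delta):
    """Signed minimum: returns (original value, new stored bit pattern)."""
    new = min(_as_signed(stored), _as_signed(delta)) % MODULUS
    return stored % MODULUS, new


def load_umin(stored, delta):
    """Unsigned minimum: returns (original value, new stored bit pattern)."""
    new = min(stored % MODULUS, delta % MODULUS)
    return stored % MODULUS, new
```

Starting from 7 with a delta of -2, the signed minimum stores -2 (0xFFFFFFFE), while the unsigned minimum keeps 7 because 0xFFFFFFFE is a large unsigned number. In both cases the value yielded is the original 7.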
@@ -6964,8 +7449,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 7, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 7, %ptr %result0 = call i32 @llvm.atomic.load.min.i32.p0i32( i32* %ptr, i32 -2 ) ; yields {i32}:result0 = 7 %result1 = call i32 @llvm.atomic.load.max.i32.p0i32( i32* %ptr, i32 8 ) @@ -6979,6 +7465,133 @@ LLVM.
This class of intrinsics exists to provide information about the lifetime of memory + objects and ranges where variables are immutable.
+ ++ declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>) ++ +
The 'llvm.lifetime.start' intrinsic specifies the start of a memory + object's lifetime.
+ +The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.
+ +This intrinsic indicates that before this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. A load from the pointer that + precedes this intrinsic can be replaced with + 'undef'.
+ ++ declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>) ++ +
The 'llvm.lifetime.end' intrinsic specifies the end of a memory + object's lifetime.
+ +The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.
+ +This intrinsic indicates that after this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. Any stores into the memory object + following this intrinsic may be removed as dead. + +
+ declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>) readonly ++ +
The 'llvm.invariant.start' intrinsic specifies that the contents of + a memory object will not change.
+ +The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.
+ +This intrinsic indicates that until an llvm.invariant.end that uses + the return value, the referenced memory location is constant and + unchanging.
+ ++ declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>) ++ +
The 'llvm.invariant.end' intrinsic specifies that the contents of + a memory object are mutable.
+ +The first argument is the matching llvm.invariant.start intrinsic. + The second argument is a constant integer representing the size of the + object, or -1 if it is variable sized and the third argument is a pointer + to the object.
+ +This intrinsic indicates that the memory is mutable again.
+ ++ declare i32 @llvm.objectsize.i32( i8* <object>, i1 <type> ) + declare i64 @llvm.objectsize.i64( i8* <object>, i1 <type> ) ++ +
The llvm.objectsize intrinsic is designed to provide information + to the optimizers to discover at compile time either a) when an + operation like memcpy will either overflow a buffer that corresponds to + an object, or b) to determine that a runtime check for overflow isn't + necessary. An object in this context means an allocation of a + specific class, structure, array, or other object.
+ +The llvm.objectsize intrinsic takes two arguments. The first + argument is a pointer to or into the object. The second argument + is a boolean 0 or 1. This argument determines whether you want the + maximum (0) or minimum (1) bytes remaining. This needs to be a literal 0 or + 1, variables are not allowed.
+The llvm.objectsize intrinsic is lowered to either a constant + representing the size of the object concerned or i32/i64 -1 or 0 + (depending on the type argument) if the size cannot be determined + at compile time.
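The lowering rule can be summarised as a small decision function. The Python sketch below is only a model: `lower_objectsize` is a hypothetical name, and passing `None` for an unknown size is a convention of this example. It follows the rule that an unknown size becomes -1 when the maximum is requested (type 0) and 0 when the minimum is requested (type 1).

```python
def lower_objectsize(known_size, want_min):
    """Model of the constant that replaces a call to llvm.objectsize.

    known_size: bytes remaining in the object, or None if the compiler
                cannot determine it (a convention of this sketch).
    want_min:   the boolean 'type' argument; False asks for the maximum,
                True asks for the minimum.
    """
    if known_size is not None:
        return known_size
    # Unknown size: answer conservatively for whichever bound was requested.
    return 0 if want_min else -1
```

A caller that guards a copy with the maximum bound sees -1, which as an unsigned quantity is the largest value, so an unknown size never triggers a false overflow report.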
+ +