From: Reid Spencer
Date: Fri, 2 Feb 2007 02:30:19 +0000 (+0000)
Subject: 1. Break long lines to 80 col limit
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=e1888eec1048e251fa8c6ede7e170430b69e95ce;p=oota-llvm.git

1. Break long lines to 80 col limit
2. Fix indentation
3. Renumber the instruction opcodes after the Shift became a binary operator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33777 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/docs/BytecodeFormat.html b/docs/BytecodeFormat.html
index 5ef9e5e300a..115ddeddfd2 100644
--- a/docs/BytecodeFormat.html
+++ b/docs/BytecodeFormat.html
@@ -248,8 +248,8 @@ variable bit rate encoding as described above.

uint64_vbr - A 64-bit unsigned integer that occupies from one to ten - bytes using variable bit rate encoding. + A 64-bit unsigned integer that occupies from one to + ten bytes using variable bit rate encoding. int64_vbr @@ -262,58 +262,60 @@ variable bit rate encoding as described above.

bit(n-m) - A set of bit within some larger integer field. The values - of n and m specify the inclusive range of bits - that define the subfield. The value for m may be omitted if - its the same as n. + A set of bit within some larger integer field. The + values of n and m specify the inclusive range + of bits that define the subfield. The value for m may be + omitted if its the same as n. - float - A floating point value encoded - as a 32-bit IEEE value written in little-endian form.
+ float + + A floating point + value encoded as a 32-bit IEEE value written in little-endian form.
- double - A floating point value encoded - as a64-bit IEEE value written in little-endian form + double + + A floating point value + encoded as a64-bit IEEE value written in little-endian form string - A uint32_vbr indicating the type of the -constant string which also includes its length, immediately followed by -the characters of the string. There is no terminating null byte in the -string. + A uint32_vbr indicating the type of the constant + string which also includes its length, immediately followed by the + characters of the string. There is no terminating null byte in the + string. data - An arbitrarily long segment of data to which -no interpretation is implied. This is used for constant initializers.
+ An arbitrarily long segment of data to which no + interpretation is implied. This is used for constant initializers.
llist(x) - A length list of x. This means the list is -encoded as an uint32_vbr providing the -length of the list, followed by a sequence of that many "x" items. This -implies that the reader should iterate the number of times provided by -the length. + A length list of x. This means the list is encoded + as an uint32_vbr providing the length of the + list, followed by a sequence of that many "x" items. This implies that + the reader should iterate the number of times provided by the length. + zlist(x) - A zero-terminated list of x. This means the -list is encoded as a sequence of an indeterminate number of "x" items, -followed by an uint32_vbr terminating value. -This implies that none of the "x" items can have a zero value (or else -the list terminates). + A zero-terminated list of x. This means the list is + encoded as a sequence of an indeterminate number of "x" items, followed + by an uint32_vbr terminating value. This + implies that none of the "x" items can have a zero value (or else the + list terminates). block - A block of data that is logically related. A -block is an unsigned 32-bit integer that encodes the type of the block -in the low 5 bits and the size of the block in the high 27 bits. The -length does not include the block header or any alignment bytes at the -end of the block. Blocks may compose other blocks. + A block of data that is logically related. A block + is an unsigned 32-bit integer that encodes the type of the block in + the low 5 bits and the size of the block in the high 27 bits. The + length does not include the block header or any alignment bytes at the + end of the block. Blocks may compose other blocks. @@ -333,18 +335,18 @@ following table:

? - The question mark indicates 0 or 1 -occurrences of the thing preceding it. + The question mark indicates 0 or 1 occurrences of + the thing preceding it. * - The asterisk indicates 0 or more occurrences -of the thing preceding it. + The asterisk indicates 0 or more occurrences of the + thing preceding it. + - The plus sign indicates 1 or more occurrences -of the thing preceding it. + The plus sign indicates 1 or more occurrences of the + thing preceding it. () @@ -369,8 +371,8 @@ of the thing preceding it.
  1. An optional string. Matches either nothing or a single string
  2. One or more pairs of uint32_vbr.
  3. -
  4. Zero or more occurrences of either an unsigned followed by a -uint32_vbr or just a uint32_vbr.
  5. +
  6. Zero or more occurrences of either an unsigned followed by a uint32_vbr + or just a uint32_vbr.
  7. An optional length list of unsigned values.
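As a rough illustration of the llist(x) and zlist(x) encodings reflowed in this change, the following Python sketch reads both list forms from a byte buffer. It assumes that the vbr encoding (described elsewhere in BytecodeFormat.html, not in this diff) stores seven value bits per byte, low-order group first, with the high bit marking continuation. The names `read_vbr`, `read_llist`, and `read_zlist` are hypothetical, and the list items are taken to be plain vbr integers for simplicity.

```python
def read_vbr(buf, pos):
    """Decode one unsigned vbr integer starting at buf[pos].

    Assumption (not stated in this diff): 7 value bits per byte, low-order
    group first, high bit set on every byte except the last.
    Returns (value, next_pos).
    """
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def read_llist(buf, pos):
    """Length list: a vbr count, then exactly that many items."""
    count, pos = read_vbr(buf, pos)
    items = []
    for _ in range(count):
        item, pos = read_vbr(buf, pos)
        items.append(item)
    return items, pos


def read_zlist(buf, pos):
    """Zero-terminated list: read items until a zero value is seen."""
    items = []
    while True:
        item, pos = read_vbr(buf, pos)
        if item == 0:
            # The terminator is consumed but not returned as an item.
            return items, pos
        items.append(item)
```

The zlist reader makes the restriction in the text concrete: an item whose value is zero cannot be told apart from the terminator, so it would end the list early.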
@@ -380,13 +382,14 @@ uint32_vbr or just a uint32_vbr.

The bytecode format uses the notion of a "slot" to reference Types and Values. Since the bytecode file is a direct representation of LLVM's intermediate representation, there is a need to represent pointers in -the file. Slots are used for this purpose. For example, if one has the following -assembly: +the file. Slots are used for this purpose. For example, if one has the +following assembly:

%MyType = type { int, sbyte }
%MyVar = external global %MyType
-

there are two definitions. The definition of %MyVar uses %MyType. +

there are two definitions. The definition of %MyVar uses +%MyType. In the C++ IR this linkage between %MyVar and %MyType is explicit through the use of C++ pointers. In bytecode, however, there's no ability to store memory addresses. Instead, we compute and write out @@ -501,8 +504,8 @@ type constant pool, and symbol table for the function. 2       Function Constant Pool - Any constants (including types) used solely -within the function are emitted here in the function constant pool. + Any constants (including types) used solely within + the function are emitted here in the function constant pool. 0x07 @@ -512,9 +515,9 @@ within the function are emitted here in the function constant pool. 2       Instruction List - This block contains all the instructions of -the function. The basic blocks are inferred by terminating -instructions. + This block contains all the instructions of the + function. The basic blocks are inferred by terminating instructions. + 0x04 @@ -524,8 +527,8 @@ instructions. 2       Function Symbol Table - This symbol table provides the names for the -function specific values used (basic block labels mostly). + This symbol table provides the names for the function + specific values used (basic block labels mostly). 0x04 @@ -534,9 +537,9 @@ function specific values used (basic block labels mostly). No 1    Module Symbol Table - This symbol table provides the names for the -various entries in the file that are not function specific (global -vars, and functions mostly). + This symbol table provides the names for the various + entries in the file that are not function specific (global vars, and + functions mostly). @@ -669,11 +672,11 @@ sections.

blocks in a bytecode file. Specifically, instead of encoding the type and size of the block into a 32-bit integer with 5-bits for type and 27-bits for size, the module block header uses two 32-bit unsigned values, one for type, and one
- for size. While the 2^27 byte limit on block size is sufficient for the blocks
- contained in the module, it isn't sufficient for the module block itself
- because we want to ensure that bytecode files as large as 2^32 bytes
- are possible. For this reason, the module block (and only the module block)
- uses a long format header.
+ for size. While the 2^27 byte limit on block size is sufficient
+ for the blocks contained in the module, it isn't sufficient for the module
+ block itself because we want to ensure that bytecode files as large as
+ 2^32 bytes are possible. For this reason, the module block (and
+ only the module block) uses a long format header.
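The two header layouts discussed in this hunk can be sketched as follows. This is only an illustration of the bit arithmetic (type in the low 5 bits, size in the high 27 bits, versus two separate 32-bit values for the module block). The function names are hypothetical, and the on-disk byte order is deliberately left out because this diff does not state it.

```python
TYPE_BITS = 5
SIZE_BITS = 27
MAX_SHORT_SIZE = (1 << SIZE_BITS) - 1  # largest size a short header can hold


def pack_block_header(block_type, size):
    """Short format: one 32-bit word, type in bits 0-4, size in bits 5-31."""
    if not 0 <= block_type < (1 << TYPE_BITS):
        raise ValueError("block type does not fit in 5 bits")
    if not 0 <= size <= MAX_SHORT_SIZE:
        raise ValueError("block size does not fit in 27 bits")
    return (size << TYPE_BITS) | block_type


def unpack_block_header(word):
    """Inverse of pack_block_header; returns (block_type, size)."""
    return word & ((1 << TYPE_BITS) - 1), word >> TYPE_BITS


def pack_module_header(block_type, size):
    """Long format (module block only): two 32-bit unsigned values."""
    if not 0 <= block_type < (1 << 32) or not 0 <= size < (1 << 32):
        raise ValueError("value does not fit in 32 bits")
    return (block_type, size)
```

A size of 2^27 or more is rejected by the short form but accepted by the long form, which is the reason the text gives for treating the module block specially.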

@@ -744,17 +747,16 @@ types. They are encoded simply as their TypeID.

uint24_vbr - Type ID for the primitive types (values 1 to -11) 1 + Type ID for the primitive types (values 1 to 11) + 1 Notes:
    -
  1. The values for the Type IDs for the primitive types are provided -by the definition of the llvm::Type::TypeID enumeration -in include/llvm/Type.h. The enumeration gives the -following mapping: +
  2. The values for the Type IDs for the primitive types are provided by the + definition of the llvm::Type::TypeID enumeration in + include/llvm/Type.h. The enumeration gives the following mapping:
    1. bool
    2. ubyte
    3. @@ -791,8 +793,8 @@ following mapping: uint32_vbr? - Value 0 if this is a varargs function, -missing otherwise. + Value 0 if this is a varargs function, missing + otherwise. @@ -922,36 +924,36 @@ all functions. The format is shown in the table below:

      zlist(globalvar) - A zero terminated list of global var -definitions occurring in the module. + A zero terminated list of global var definitions + occurring in the module. zlist(funcfield) - A zero terminated list of function definitions -occurring in the module. + A zero terminated list of function definitions + occurring in the module. llist(string) - A length list -of strings that specify the names of the libraries that this module -depends upon. + A length list of strings that specify the names of + the libraries that this module depends upon. string - The target -triple for the module (blank means no target triple specified, i.e. a -platform-independent module). + The target triple for the module (blank means no + target triple specified, i.e. a platform-independent module). string - The data layout string describing the endianness, pointer size, and -type alignments for which the module was written (blank means no data layout specified, i.e. a platform-independent module). + The data layout string describing the endianness, + pointer size, and type alignments for which the module was written + (blank means no data layout specified, i.e. a platform-independent + module). llist(string) - A length list -of strings that defines a table of section strings for globals. A global's -SectionID is an index into this table. + A length list of strings that defines a table of + section strings for globals. A global's SectionID is an index into + this table. string @@ -986,9 +988,8 @@ and a an optional initializers for the global var.

      bit(1) - Has initializer? Note that this bit -determines whether the constant initializer field (described below) -follows. + Has initializer? Note that this bit determines + whether the constant initializer field (described below) follows. bit(2-4) @@ -1056,9 +1057,9 @@ and can includes more information:

      An optional section ID number, specifying the string to use for the section of the global. This an index (+1) of an entry - into the SectionID llist in the Module Global - Info block. If this value is 0 or not present, the global has an - empty section string. + into the SectionID llist in the + Module Global Info block. If this value is + 0 or not present, the global has an empty section string. @@ -1108,11 +1109,11 @@ href="#uint32_vbr">uint32_vbr that describes the function.

      bit(4) If this bit is set to 1, the indicated function is - external, and there is no Function Definiton - Block in the bytecode file for the function. If the function is - external and has dllimport or extern_weak linkage additional - field in the extension word is used to indicate the actual linkage - type. + external, and there is no + Function Definiton Block in the bytecode + file for the function. If the function is external and has + dllimport or extern_weak linkage additional field in the + extension word is used to indicate the actual linkage type. bit(5-30) @@ -1171,9 +1172,9 @@ follows with the following fields:

      An optional section ID number, specifying the string to use for the section of the function. This an index (+1) of an entry - into the SectionID llist in the Module Global - Info block. If this value is 0 or not present, the function has an - empty section string. + into the SectionID llist in the + Module Global Info block. If this value is + 0 or not present, the function has an empty section string. @@ -1218,15 +1219,15 @@ both function and module constant pools.

      uint32_vbr - Zero. This identifies the following "plane" -as containing the constant strings. This is needed to identify it -uniquely from other constant planes that follow. + Zero. This identifies the following "plane" as + containing the constant strings. This is needed to identify it uniquely + from other constant planes that follow. uint24_vbr+ - Type slot number of the constant string's type. -Note that the constant string's type implicitly defines the length of -the string. + Type slot number of the constant string's type. Note + that the constant string's type implicitly defines the length of the + string. @@ -1272,21 +1273,20 @@ constant is solely determined by its type. In this case, we have the following field definitions, based on type:

        -
      • Bool. This is written as an uint32_vbr -of value 1U or 0U.
      • -
      • Signed Integers (sbyte,short,int,long). These are written -as an int64_vbr with the corresponding value.
      • -
      • Unsigned Integers (ubyte,ushort,uint,ulong). These are -written as an uint64_vbr with the -corresponding value.
      • -
      • Floating Point. Both the float and double types are -written literally in binary format.
      • -
      • Arrays. Arrays are written simply as a list of uint32_vbr encoded value slot numbers to the constant -element values.
      • -
      • Structures. Structures are written simply as a list of uint32_vbr encoded value slot numbers to the constant -field values of the structure.
      • +
      • Bool. This is written as an uint32_vbr of + value 1U or 0U.
      • +
      • Signed Integers (sbyte,short,int,long). These are written as an + int64_vbr with the corresponding value.
      • +
      • Unsigned Integers (ubyte,ushort,uint,ulong). These are written as + an uint64_vbr with the corresponding value.
      • +
      • Floating Point. Both the float and double types are written + literally in binary format.
      • +
      • Arrays. Arrays are written simply as a list of + uint32_vbr encoded value slot numbers to the + constant element values.
      • +
      • Structures. Structures are written simply as a list of + uint32_vbr encoded value slot numbers to the + constant field values of the structure.
      @@ -1350,18 +1350,18 @@ number of operands+1.

      uint32_vbr - Op code of the instruction for the constant -expression. + Op code of the instruction for the constant + expression. uint32_vbr - The value slot number of the constant value for an -operand.1 + The value slot number of the constant value for an + operand.1 uint24_vbr - The type slot number for the type of the constant -value for an operand.1 + The type slot number for the type of the constant + value for an operand.1 @@ -1391,24 +1391,25 @@ size
      uint32_vbr - The linkage and - visibility style field + + The linkage and visibility + style field block - The constant pool -block for this function.2 + The constant pool block + for this function.2 block - The instruction -list for the function. + The instruction list + for the function. block - The function's symbol -table containing only those symbols pertinent to the function -(mostly block labels). + The function's symbol table + containing only those symbols pertinent to the function (mostly block + labels). @@ -1433,7 +1434,8 @@ other fields will be present as the function is defined elsewhere. bit(0-15) The linkage type of the function: 0=External, 1=Weak, -2=Appending, 3=Internal, 4=LinkOnce, 5=DllImport, 6=DllExport1 + 2=Appending, 3=Internal, 4=LinkOnce, 5=DllImport, + 6=DllExport1 bit(16-18) @@ -1467,8 +1469,8 @@ the block is given in the following table.

      instruction+ - An instruction. Instructions have a variety -of formats. See Instructions for details. + An instruction. Instructions have a variety of + formats. See Instructions for details. @@ -1526,37 +1528,37 @@ possible.

 SRem            14  6  1.9
 FRem            15  6  1.9
 Logical Operators
- And            16  1  1.0
- Or             17  1  1.0
- Xor            18  1  1.0
+ Shl            16  1  1.0
+ LShr           17  6  1.9
+ AShr           18  6  1.9
+ And            19  1  1.0
+ Or             20  1  1.0
+ Xor            21  1  1.0
 Memory Operators
- Malloc         19  1  1.0
- Free           20  1  1.0
- Alloca         21  1  1.0
- Load           22  1  1.0
- Store          23  1  1.0
- GetElementPtr  24  1  1.0
+ Malloc         22  1  1.0
+ Free           23  1  1.0
+ Alloca         24  1  1.0
+ Load           25  1  1.0
+ Store          26  1  1.0
+ GetElementPtr  27  1  1.0
 Cast Operators
- Trunc          25  7  2.0
- ZExt           26  7  2.0
- SExt           27  7  2.0
- FPToUI         28  7  2.0
- FPToSI         29  7  2.0
- UIToFP         30  7  2.0
- SIToFP         31  7  2.0
- FPTrunc        32  7  2.0
- FPExt          33  7  2.0
- PtrToInt       34  7  2.0
- IntToPtr       35  7  2.0
- BitCast        36  7  2.0
+ Trunc          28  7  2.0
+ ZExt           29  7  2.0
+ SExt           30  7  2.0
+ FPToUI         31  7  2.0
+ FPToSI         32  7  2.0
+ UIToFP         33  7  2.0
+ SIToFP         34  7  2.0
+ FPTrunc        35  7  2.0
+ FPExt          36  7  2.0
+ PtrToInt       37  7  2.0
+ IntToPtr       38  7  2.0
+ BitCast        39  7  2.0
 Other Operators
- ICmp           37  7  2.0
- FCmp           38  7  2.0
- PHI            39  1  1.0
- Call           40  1  1.0
- Shl            41  1  1.0
- LShr           42  6  1.9
- AShr           43  6  1.9
+ ICmp           40  7  2.0
+ FCmp           41  7  2.0
+ PHI            42  1  1.0
+ Call           43  1  1.0
 Select          44  2  1.2
 UserOp1         45  1  1.0
 UserOp2         46  1  1.0
@@ -1620,17 +1622,17 @@ encodes the value number of the operand, not the type.

      those cases:

        -
      • getelementptr: the slot numbers for sequential type indexes are shifted up -two bits. This allows the low order bits will encode the type of index used, -as follows: 0=uint, 1=int, 2=ulong, 3=long.
      • -
      • cast: the result type number is encoded as the second operand.
      • -
      • alloca/malloc: If the allocation has an explicit alignment, the log2 of the - alignment is encoded as the second operand.
      • -
      • call: If the tail marker and calling convention cannot be encoded into the opcode of the call, it is passed as an - additional operand. The low bit of the operand is a flag indicating whether - the call is a tail call. The rest of the bits contain the calling - convention number (shifted left by one bit).
      • +
      • getelementptr: the slot numbers for sequential type indexes are shifted + up two bits. This allows the low order bits will encode the type of index + used, as follows: 0=uint, 1=int, 2=ulong, 3=long.
      • +
      • cast: the result type number is encoded as the second operand.
      • +
      • alloca/malloc: If the allocation has an explicit alignment, the log2 of + the alignment is encoded as the second operand.
      • +
      • call: If the tail marker and calling convention cannot be + encoded into the opcode of the call, it is passed as + an additional operand. The low bit of the operand is a flag indicating + whether the call is a tail call. The rest of the bits contain the calling + convention number (shifted left by one bit).
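The operand packing rules in the list above can be sketched in Python as below. The helper names are hypothetical. The bit layouts follow the text (index type in the two low bits of a getelementptr index, tail flag in the low bit of the extra call operand, log2 of the alignment for alloca/malloc), and the sketch additionally assumes that alignments are powers of two.

```python
# Order matches the text: 0=uint, 1=int, 2=ulong, 3=long.
GEP_INDEX_TYPES = ("uint", "int", "ulong", "long")


def encode_gep_index(slot, index_type):
    """Shift the slot number up two bits and store the index type below it."""
    return (slot << 2) | GEP_INDEX_TYPES.index(index_type)


def decode_gep_index(operand):
    """Return (slot, index_type) for a sequential-type index operand."""
    return operand >> 2, GEP_INDEX_TYPES[operand & 0x3]


def encode_call_extra(calling_conv, is_tail):
    """Low bit is the tail-call flag; the calling convention sits above it."""
    return (calling_conv << 1) | (1 if is_tail else 0)


def decode_call_extra(operand):
    """Return (calling_conv, is_tail) for the additional call operand."""
    return operand >> 1, bool(operand & 1)


def encode_alignment(alignment):
    """Return log2 of the alignment (assumed here to be a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return alignment.bit_length() - 1
```

For example, slot 5 indexed with a `ulong` becomes operand 22, and calling convention 8 with the tail flag set becomes operand 17.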
      @@ -1657,10 +1659,10 @@ successive fields.

      uint32_vbr - Specifies the opcode of the instruction. Note -that for compatibility with the other instruction formats, the opcode -is shifted left by 2 bits. Bits 0 and 1 must have value zero for this -format. + Specifies the opcode of the instruction. Note that + for compatibility with the other instruction formats, the opcode is + shifted left by 2 bits. Bits 0 and 1 must have value zero for this + format. uint24_vbr @@ -1832,13 +1834,15 @@ table below.

      Symbol Table Identifier (0x04) - llist(type_entry) + llist(type_entry) + A length list of symbol table entries for Types - llist(symtab_plane) + llist(symtab_plane) + A length list of "type planes" of symbol table entries for Values @@ -1896,7 +1900,8 @@ values of a common type. The encoding is given in the following table:

      uint32_vbr - Type slot number of type for all values in this plane.. + Type slot number of type for all values in this plane. + value_entry+ @@ -1973,8 +1978,8 @@ describes the differences between that version and the one that follows.
      Function Flags

      LLVM bytecode versions prior to 1.4 did not include the 'undef' constant - value, which affects the encoding of Constant - Fields.

      + value, which affects the encoding of Constant Fields. +