   1 =pod
   2
   3 =head1 NAME
   4
   5 llvm-ar - LLVM archiver
   6
   7 =head1 SYNOPSIS
   8
   9 B<llvm-ar> [-]{dmpqrtx}[Rabfikouz] [relpos] [count] <archive> [files...]
  10
  11
  12 =head1 DESCRIPTION
  13
  14 The B<llvm-ar> command is similar to the common Unix utility, C<ar>. It
  15 archives several files together into a single file. The intent for this is
  16 to produce archive libraries of LLVM bytecode that can be linked into an
  17 LLVM program. However, the archive can contain any kind of file. By default,
  18 B<llvm-ar> generates a symbol table that makes linking faster because
  19 only the symbol table needs to be consulted, not each individual file member
  20 of the archive.
  21
  22 The B<llvm-ar> command can be used to I<read> both SVR4 and BSD style archive
  23 files. However, it cannot be used to write them.  While the B<llvm-ar> command
  24 produces files that are I<almost> identical to the format used by other C<ar>
  25 implementations, it has two significant departures in order to make the
  26 archive appropriate for LLVM. The first departure is that B<llvm-ar> only
  27 uses BSD4.4 style long path names (stored immediately after the header) and
  28 never contains a string table for long names. The second departure is that the
  29 symbol table is formatted for efficient construction of an in-memory data
  30 structure that permits rapid (red-black tree) lookups. Consequently, archives
  31 produced with B<llvm-ar> usually won't be readable or editable with any other
  32 C<ar> implementation or useful for linking.  Using the C<f> modifier to flatten
  33 file names will make the archive readable by other C<ar> implementations
  34 but not for linking because the symbol table format for LLVM is unique. If an
  35 SVR4 or BSD style archive is used with the C<r> (replace) or C<q> (quick
  36 update) operations, the archive will be reconstructed in LLVM format. This
  37 means that the string table will be dropped (in deference to BSD 4.4 long names)
  38 and an LLVM symbol table will be added (by default). The system symbol table
  39 will be retained.
  40
  41 Here's where B<llvm-ar> departs from previous C<ar> implementations:
  42
  43 =over
  44
  45 =item I<Symbol Table>
  46
  47 Since B<llvm-ar> is intended to archive bytecode files, the symbol table
  48 won't make much sense to anything but LLVM. Consequently, the symbol table's
  49 format has been simplified. It consists simply of a sequence of pairs
  50 of a file member index number as an LSB 4-byte integer and a null-terminated
  51 string.
  52
  53 =item I<Long Paths>
  54
  55 Some C<ar> implementations (SVR4) use a separate file member to record long
  56 path names (> 15 characters). B<llvm-ar> takes the BSD 4.4 and Mac OS X
  57 approach which is to simply store the full path name immediately preceding
  58 the data for the file. The path name is null terminated and may contain the
  59 slash (/) character.
  60
  61 =item I<Compression>
  62
  63 B<llvm-ar> can compress the members of an archive to save space. The
  64 compression used depends on what's available on the platform and what choices
  65 the LLVM Compressor utility makes. It generally favors bzip2 but will select
  66 between "no compression", bzip2 or zlib depending on what makes sense for the
  67 file's content.
  68
  69 =item I<Directory Recursion>
  70
  71 Most C<ar> implementations do not recurse through directories but simply
  72 ignore directories if they are presented to the program in the F<files>
  73 option. B<llvm-ar>, however, can recurse through directory structures and
  74 add all the files under a directory, if requested.
  75
  76 =item I<TOC Verbose Output>
  77
  78 When B<llvm-ar> prints out the verbose table of contents (C<tv> option), it
  79 precedes the usual output with a character indicating the basic kind of
  80 content in the file. A blank means the file is a regular file. A 'Z' means
  81 the file is compressed. A 'B' means the file is an LLVM bytecode file. An
  82 'S' means the file is the symbol table.
  83
  84 =back
  85
  86 =head1 OPTIONS
  87
  88 The options to B<llvm-ar> are compatible with other C<ar> implementations.
  89 However, there are a few modifiers (F<zR>) that are not found in other
  90 C<ar>s. The options to B<llvm-ar> specify a single basic operation to
  91 perform on the archive, a variety of modifiers for that operation, the
  92 name of the archive file, and an optional list of file names. These options
  93 are used to determine how B<llvm-ar> should process the archive file.
  94
  95 The Operations and Modifiers are explained in the sections below. The minimal
  96 set of options is at least one operation and the name of the archive. Typically
  97 archive files end with a C<.a> suffix, but this is not required. Following
  98 the F<archive-name> comes a list of F<files> that indicate the specific members
  99 of the archive to operate on. If the F<files> option is not specified, it
 100 generally means either "none" or "all" members, depending on the operation.
 101
 102 =head2 Operations
 103
 104 =over
 105
 106 =item d
 107
 108 Delete files from the archive. No modifiers are applicable to this operation.
 109 The F<files> options specify which members should be removed from the
 110 archive. It is not an error if a specified file does not appear in the archive.
 111 If no F<files> are specified, the archive is not modified.
 112
 113 =item m[abi]
 114
 115 Move files from one location in the archive to another. The F<a>, F<b>, and
 116 F<i> modifiers apply to this operation. The F<files> will all be moved
 117 to the location given by the modifiers. If no modifiers are used, the files
 118 will be moved to the end of the archive. If no F<files> are specified, the
 119 archive is not modified.
 120
 121 =item p[k]
 122
 123 Print files to the standard output. The F<k> modifier applies to this
 124 operation. This operation simply prints the F<files> indicated to the
 125 standard output. If no F<files> are specified, the entire archive is printed.
 126 Printing bytecode files is ill-advised as they might confuse your terminal
 127 settings. The F<p> operation never modifies the archive.
 128
 129 =item q[Rfz]
 130
 131 Quickly append files to the end of the archive. The F<R>, F<f>, and F<z>
 132 modifiers apply to this operation.  This operation quickly adds the
 133 F<files> to the archive without checking for duplicates that should be
 134 removed first. If no F<files> are specified, the archive is not modified.
 135 Because of the way that B<llvm-ar> constructs the archive file, it's dubious
 136 whether the F<q> operation is any faster than the F<r> operation.
 137
 138 =item r[Rabfuz]
 139
 140 Replace or insert file members. The F<R>, F<a>, F<b>, F<f>, F<u>, and F<z>
 141 modifiers apply to this operation. This operation will replace existing
 142 F<files> or insert them at the end of the archive if they do not exist. If no
 143 F<files> are specified, the archive is not modified.
 144
 145 =item t[v]
 146
 147 Print the table of contents. Without any modifiers, this operation just prints
 148 the names of the members to the standard output. With the F<v> modifier,
 149 B<llvm-ar> also prints out the file type (B=bytecode, Z=compressed, S=symbol
 150 table, blank=regular file), the permission mode, the owner and group, the
 151 size, and the date. If any F<files> are specified, the listing is only for
 152 those files. If no F<files> are specified, the table of contents for the
 153 whole archive is printed.
 154
 155 =item x[oP]
 156
 157 Extract archive members back to files. The F<o> and F<P> modifiers apply to this
 158 operation. This operation retrieves the indicated F<files> from the archive
 159 and writes them back to the operating system's file system. If no
 160 F<files> are specified, the entire archive is extracted.
 161
 162 =back
 163
 164 =head2 Modifiers (operation specific)
 165
 166 The modifiers below are specific to certain operations. See the Operations
 167 section (above) to determine which modifiers are applicable to which operations.
 168
 169 =over
 170
 171 =item [a]
 172
 173 When inserting or moving member files, this option specifies the destination of
 174 the new files as being C<a>fter the F<relpos> member. If F<relpos> is not found,
 175 the files are placed at the end of the archive.
 176
 177 =item [b]
 178
 179 When inserting or moving member files, this option specifies the destination of
 180 the new files as being C<b>efore the F<relpos> member. If F<relpos> is not
 181 found, the files are placed at the end of the archive. This modifier is
 182 identical to the F<i> modifier.
 183
 184 =item [f]
 185
 186 Normally, B<llvm-ar> stores the full path name to a file as presented to it on
 187 the command line. With this option, truncated (15 characters max) names are
 188 used. This ensures name compatibility with older versions of C<ar> but may also
 189 thwart correct extraction of the files (duplicates may overwrite). If used with
 190 the F<R> option, the directory recursion will be performed but the file names
 191 will all be C<f>lattened to simple file names.
 192
 193 =item [i]
 194
 195 A synonym for the F<b> option.
 196
 197 =item [k]
 198
 199 Normally, B<llvm-ar> will not print the contents of bytecode files when the
 200 F<p> operation is used. This modifier defeats the default and allows the
 201 bytecode members to be printed.
 202
 203 =item [N]
 204
 205 This option is ignored by B<llvm-ar> but provided for compatibility.
 206
 207 =item [o]
 208
 209 When extracting files, this option will cause B<llvm-ar> to preserve the
 210 original modification times of the files it writes.
 211
 212 =item [P]
 213
 214 Use full path names when matching.
 215
 216 =item [R]
 217
 218 This modifier instructs the F<r> operation to recursively process directories.
 219 Without F<R>, directories are ignored and only those F<files> that refer to
 220 files will be added to the archive. When F<R> is used, any directories specified
 221 with F<files> will be scanned (recursively) to find files to be added to the
 222 archive. Any file whose name begins with a dot will not be added.
 223
 224 =item [u]
 225
 226 When replacing existing files in the archive, only replace those files that have
 227 a newer time stamp than the time stamp of the member in the archive.
 228
 229 =item [z]
 230
 231 When inserting or replacing any file in the archive, compress the file first.
 232 The compression will attempt to use the zlib compression algorithm. This
 233 modifier is safe to use when (previously) compressed bytecode files are added to
 234 the archive; the compressed bytecode files will not be doubly compressed.
 235
 236 =back
 237
 238 =head2 Modifiers (generic)
 239
 240 The modifiers below may be applied to any operation.
 241
 242 =over
 243
 244 =item [c]
 245
 246 For all operations, B<llvm-ar> will always create the archive if it doesn't
 247 exist. Normally, B<llvm-ar> will print a warning message indicating that the
 248 archive is being created. Using this modifier turns off that warning.
 249
 250 =item [s]
 251
 252 This modifier requests that an archive index (or symbol table) be added to the
 253 archive. This is the default mode of operation. The symbol table will contain
 254 all the externally visible functions and global variables defined by all the
 255 bytecode files in the archive. Using this modifier is more efficient than using
 256 L<llvm-ranlib|llvm-ranlib> which also creates the symbol table.
 257
 258 =item [S]
 259
 260 This modifier is the opposite of the F<s> modifier. It instructs B<llvm-ar> to
 261 not build the symbol table. If both F<s> and F<S> are used, the last modifier to
 262 occur in the options will prevail.
 263
 264 =item [v]
 265
 266 This modifier instructs B<llvm-ar> to be verbose about what it is doing. Each
 267 editing operation taken against the archive will produce a line of output saying
 268 what is being done.
 269
 270 =back
 271
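The option layout described above can be sketched as a small command-line
builder. This is only an illustration, not part of B<llvm-ar>: the names
C<OPERATION_MODIFIERS>, C<GENERIC_MODIFIERS> and C<build_command> are
hypothetical, the tables simply restate the operation and modifier lists in
this guide, and the sketch assumes that F<relpos> is given exactly when one of
the F<a>, F<b> or F<i> modifiers is used.

```python
from typing import List, Optional, Sequence

# Operation -> operation-specific modifiers, restating the lists in this guide.
OPERATION_MODIFIERS = {
    "d": "",
    "m": "abi",
    "p": "k",
    "q": "Rfz",
    "r": "Rabfuz",
    "t": "v",
    "x": "oP",
}
GENERIC_MODIFIERS = "csSv"   # may be applied to any operation
POSITIONAL_MODIFIERS = "abi"  # assumed to require a relpos argument


def build_command(operation: str, modifiers: str, archive: str,
                  files: Sequence[str] = (),
                  relpos: Optional[str] = None) -> List[str]:
    """Return an argv list laid out as in the SYNOPSIS (hypothetical helper)."""
    if operation not in OPERATION_MODIFIERS:
        raise ValueError("unknown operation: %r" % operation)
    allowed = OPERATION_MODIFIERS[operation] + GENERIC_MODIFIERS
    for modifier in modifiers:
        if modifier not in allowed:
            raise ValueError("modifier %r does not apply to %r"
                             % (modifier, operation))
    needs_relpos = any(m in POSITIONAL_MODIFIERS for m in modifiers)
    if needs_relpos != (relpos is not None):
        raise ValueError("relpos is required with a, b or i, and only then")
    argv = ["llvm-ar", operation + modifiers]
    if relpos is not None:
        argv.append(relpos)
    argv.append(archive)
    argv.extend(files)
    return argv
```

For example, C<build_command("r", "Rz", "libfoo.a", ["src"])> yields the
argument list for C<llvm-ar rRz libfoo.a src>, while an inapplicable
combination such as the F<z> modifier with the F<d> operation is rejected
before any command is run.
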
 272 =head1 FILE FORMAT
 273
 274 The file format for LLVM Archive files is similar to that of BSD 4.4 or Mac OS X
 275 archive files. In fact, except for the symbol table, the C<ar> commands on those
 276 operating systems should be able to read LLVM archive files. The details of the
 277 file format follow.
 278
 279 Each archive begins with the archive magic number which is the eight printable
 280 characters "!<arch>\n" where \n represents the newline character (0x0A).
 281 Following the magic number, the file is composed of even length members that
 282 begin with an archive header and end with a \n padding character if necessary
 283 (to make the length even). Each file member is composed of a header (defined
 284 below), an optional newline-terminated "long file name" and the contents of
 285 the file.
 286
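As a minimal sketch of the two rules above (the magic number and the even
member length), the following Python helpers check the leading bytes of an
archive and compute how many bytes a member occupies once padding is applied.
The names C<ARCHIVE_MAGIC>, C<has_archive_magic> and C<padded_length> are
illustrative only and do not come from the LLVM sources.

```python
ARCHIVE_MAGIC = b"!<arch>\n"  # eight bytes, the last one is newline (0x0A)


def has_archive_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the archive magic number."""
    return data[:len(ARCHIVE_MAGIC)] == ARCHIVE_MAGIC


def padded_length(length: int) -> int:
    """Round a member length up to an even number of bytes.

    An odd-length member is followed by a single newline padding byte.
    """
    return length + (length & 1)
```

A reader would typically call C<has_archive_magic> on the first eight bytes of
the file and then use C<padded_length> to step from one member to the next.
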
 287 The fields of the header are described in the items below. All fields of the
 288 header contain only ASCII characters, are left justified and are right padded
 289 with space characters.
 290
 291 =over
 292
 293 =item name - char[16]
 294
 295 This field of the header provides the name of the archive member. If the name is
 296 longer than 15 characters or contains a slash (/) character, then this field
 297 contains C<#1/nnn> where C<nnn> provides the length of the name and the C<#1/>
 298 is literal.  In this case, the actual name of the file is provided in the C<nnn>
 299 bytes immediately following the header. If the name is 15 characters or less, it
 300 is contained directly in this field and terminated with a slash (/) character.
 301
 302 =item date - char[12]
 303
 304 This field provides the date of modification of the file in the form of a
 305 decimal encoded number that provides the number of seconds since the epoch
 306 (since 00:00:00 Jan 1, 1970) per POSIX specifications.
 307
 308 =item uid - char[6]
 309
 310 This field provides the user id of the file encoded as a decimal ASCII string.
 311 This field might not make much sense on non-Unix systems. On Unix, it is the
 312 same value as the st_uid field of the stat structure returned by the stat(2)
 313 operating system call.
 314
 315 =item gid - char[6]
 316
 317 This field provides the group id of the file encoded as a decimal ASCII string.
 318 This field might not make much sense on non-Unix systems. On Unix, it is the
 319 same value as the st_gid field of the stat structure returned by the stat(2)
 320 operating system call.
 321
 322 =item mode - char[8]
 323
 324 This field provides the access mode of the file encoded as an octal ASCII
 325 string. This field might not make much sense on non-Unix systems. On Unix, it
 326 is the same value as the st_mode field of the stat structure returned by the
 327 stat(2) operating system call.
 328
 329 =item size - char[10]
 330
 331 This field provides the size of the file, in bytes, encoded as a decimal ASCII
 332 string. If the size field is negative (starts with a minus sign, 0x2D), then
 333 the archive member is stored in compressed form. The first byte of the archive
 334 member's data indicates the compression type used. A value of 0 (0x30) indicates
 335 that no compression was used. A value of 1 (0x31) indicates that zlib
 336 compression was used. A value of 2 (0x32) indicates that bzip2 compression was
 337 used.
 338
 339 =item fmag - char[2]
 340
 341 This field is the archive file member magic number. Its content is always the
 342 two characters back tick (0x60) and newline (0x0A). This provides some measure
 343 of utility in identifying archive files that have been corrupted.
 344
 345 =back
 346
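The header fields listed above add up to 60 bytes (16 + 12 + 6 + 6 + 8 + 10 +
2). The sketch below decodes one such header in Python, following only the
field descriptions in this section. C<MemberHeader> and C<parse_header> are
hypothetical names, and treating an all-space numeric field as zero is an
assumption of this sketch rather than documented behavior.

```python
from typing import NamedTuple, Optional

HEADER_SIZE = 16 + 12 + 6 + 6 + 8 + 10 + 2  # 60 bytes


class MemberHeader(NamedTuple):
    name: Optional[str]     # None when the real name follows the header
    long_name_length: int   # nnn from "#1/nnn", otherwise 0
    date: int               # seconds since the epoch
    uid: int
    gid: int
    mode: int               # parsed from octal
    size: int               # absolute value of the size field
    compressed: bool        # True when the size field was negative


def _to_int(text: str, base: int = 10) -> int:
    # Fields are space padded; an empty field is treated as 0 (assumption).
    text = text.strip()
    return int(text, base) if text else 0


def parse_header(raw: bytes) -> MemberHeader:
    """Decode one 60-byte member header (illustrative helper)."""
    if len(raw) != HEADER_SIZE:
        raise ValueError("a member header is exactly 60 bytes")
    if raw[58:60] != b"`\n":
        raise ValueError("bad fmag: expected back tick and newline")
    fields = raw.decode("ascii")
    name_field = fields[0:16].rstrip(" ")
    if name_field.startswith("#1/"):
        name, long_len = None, _to_int(name_field[3:])
    else:
        # Short names are stored in place and terminated with a slash.
        name = name_field[:-1] if name_field.endswith("/") else name_field
        long_len = 0
    size = _to_int(fields[48:58])
    return MemberHeader(
        name=name,
        long_name_length=long_len,
        date=_to_int(fields[16:28]),
        uid=_to_int(fields[28:34]),
        gid=_to_int(fields[34:40]),
        mode=_to_int(fields[40:48], 8),
        size=abs(size),
        compressed=size < 0,
    )
```

When C<name> is C<None>, the caller is expected to read C<long_name_length>
bytes immediately after the header to obtain the real member name.
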
 347 The LLVM symbol table has the special name "#_LLVM_SYM_TAB_#". It is presumed
 348 that no regular archive member file will want this name. The LLVM symbol table
 349 is simply composed of a sequence of triplets: byte offset, length of symbol,
 350 and the symbol itself. Symbols are not null or newline terminated. Here are
 351 the details on each of these items:
 352
 353 =over
 354
 355 =item offset - vbr encoded 32-bit integer
 356
 357 The offset item provides the offset into the archive file where the bytecode
 358 member is stored that is associated with the symbol. The offset value is 0
 359 based at the start of the first "normal" file member. To derive the actual
 360 file offset of the member, you must add the number of bytes occupied by the file
 361 signature (8 bytes) and the symbol tables. The value of this item is encoded
 362 using variable bit rate encoding to reduce the size of the symbol table.
 363 Variable bit rate encoding uses the high bit (0x80) of each byte to indicate
 364 if there are more bytes to follow. The remaining 7 bits in each byte carry bits
 365 from the value. The final byte does not have the high bit set.
 366
 367 =item length - vbr encoded 32-bit integer
 368
 369 The length item provides the length of the symbol that follows. Like the
 370 I<offset> item, the length is variable bit rate encoded.
 371
 372 =item symbol - character array
 373
 374 The symbol item provides the text of the symbol that is associated with the
 375 I<offset>. The symbol is not terminated by any character. Its length is provided
 376 by the I<length> field. Note that it is allowed (but unwise) to use non-printing
 377 characters (even 0x00) in the symbol. This allows for multiple encodings of
 378 symbol names.
 379
 380 =back
 381
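The triplet layout and the variable bit rate encoding described above can be
sketched in Python as follows. The text does not state which 7-bit group is
stored first, so this sketch assumes the least significant group comes first.
All function names are hypothetical, and C<member_file_offset> assumes that
C<symtab_bytes> is the total number of bytes the symbol tables occupy in the
file.

```python
from typing import List, Tuple


def vbr_encode(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low group first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)                      # final byte has the high bit clear
    return bytes(out)


def vbr_decode(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one value starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_symbol_table(buf: bytes) -> List[Tuple[int, bytes]]:
    """Split a symbol table body into (offset, symbol) entries."""
    entries = []
    pos = 0
    while pos < len(buf):
        offset, pos = vbr_decode(buf, pos)
        length, pos = vbr_decode(buf, pos)
        entries.append((offset, buf[pos:pos + length]))
        pos += length
    return entries


def member_file_offset(offset: int, symtab_bytes: int) -> int:
    """Add the 8-byte file signature and the symbol table size to an offset."""
    return 8 + symtab_bytes + offset
```

Symbols are kept as C<bytes> rather than decoded text because, as noted above,
a symbol may contain non-printing characters.
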
 382 =head1 EXIT STATUS
 383
 384 If B<llvm-ar> succeeds, it will exit with 0.  A usage error results
 385 in an exit code of 1. A hard (file system typically) error results in an
 386 exit code of 2. Miscellaneous or unknown errors result in an
 387 exit code of 3.
 388
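A script that drives B<llvm-ar> might translate these exit codes into messages
along the following lines. This is a small sketch: C<EXIT_MEANINGS> and
C<describe_exit> are illustrative names, and the wording of the messages
simply restates the list above.

```python
# Exit codes as documented in this section.
EXIT_MEANINGS = {
    0: "success",
    1: "usage error",
    2: "hard error (typically file system)",
    3: "miscellaneous or unknown error",
}


def describe_exit(code: int) -> str:
    """Return a short description of an llvm-ar exit code."""
    return EXIT_MEANINGS.get(code, "undocumented exit code %d" % code)
```

The return code of a finished process (for example the C<returncode> attribute
of the result of Python's C<subprocess.run>) can be passed straight to
C<describe_exit>.
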
 389 =head1 SEE ALSO
 390
 391 L<llvm-ranlib|llvm-ranlib>
 392
 393 =head1 AUTHORS
 394
 395 Maintained by the LLVM Team (L<http://llvm.cs.uiuc.edu>).
 396
 397 =cut