LLVM 2.3 Release Notes

Introduction @@ -35,284 +33,533 @@

This document contains the release notes for the LLVM compiler -infrastructure, release 2.3. Here we describe the status of LLVM, including -major improvements from the previous release and any known problems. All LLVM -releases may be downloaded from the LLVM -releases web site.

This document contains the release notes for the LLVM Compiler +Infrastructure, release 2.6. Here we describe the status of LLVM, including +major improvements from the previous release and significant known problems. +All LLVM releases may be downloaded from the LLVM releases web site.

For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM developer's mailing -list is a good place to send them.

+LLVM Developer's Mailing +List (http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev) is a good place to send them.

Note that if you are reading this file from a Subversion checkout or the main LLVM web page, this document applies to the next release, not the -current one. To see the release notes for a specific releases, please see the +current one. To see the release notes for a specific release, please see the releases page.

+ + + + + + + +

- Major Changes and Sub-project Status + Sub-project Status Update

+The LLVM 2.6 distribution currently consists of code from the core LLVM +repository (which roughly includes the LLVM optimizers, code generators +and supporting tools), the Clang repository and the llvm-gcc repository. In +addition to this code, the LLVM Project includes other sub-projects that are in +development. Here we include updates on these subprojects. +

+ +

This is the fourteenth public release of the LLVM Compiler Infrastructure. -It includes a large number of features and refinements from LLVM 2.2.

+ +

+Clang: C/C++/Objective-C Frontend Toolkit

- +

+ +

The Clang project is an effort to build +a set of new 'LLVM native' front-end technologies for the C family of languages. +LLVM 2.6 is the first release to officially include Clang, and it provides a +production quality C and Objective-C compiler. If you are interested in fast compiles and +good diagnostics, we +encourage you to try it out. Clang currently compiles typical Objective-C code +3x faster than GCC and compiles C code about 30% faster than GCC at -O0 -g +(which is when the most pressure is on the frontend).

+ +

In addition to supporting these languages, C++ support is also well under way, and mainline +Clang is able to parse the libstdc++ 4.2 headers and even codegen simple apps. +If you are interested in Clang C++ support or any other Clang feature, we +strongly encourage you to get involved on the Clang front-end mailing +list.

+ +

In the LLVM 2.6 time-frame, the Clang team has made many improvements:

+ +

C and Objective-C support are now considered production quality.
AuroraUX, FreeBSD and OpenBSD are now supported.
Most of Objective-C 2.0 is now supported with the GNU runtime.
Many many bugs are fixed and lots of features have been added.

-Major Changes in LLVM 2.3 +Clang Static Analyzer

LLVM 2.3 no longer supports llvm-gcc 4.0; it has been replaced with - llvm-gcc 4.2.

Previously announced in the 2.4 and 2.5 LLVM releases, the Clang project also +includes an early stage static source code analysis tool for automatically finding bugs +in C and Objective-C programs. The tool performs checks to find +bugs that occur on a specific path within a program.

+ +

In the LLVM 2.6 time-frame, the analyzer core has undergone several important +improvements and cleanups and now includes a new Checker interface that +is intended to eventually serve as a basis for domain-specific checks. Further, +in addition to generating HTML files for reporting analysis results, the +analyzer can now also emit bug reports in a structured XML format that is +intended to be easily readable by other programs.

+ +

The set of checks performed by the static analyzer continues to expand, and +future plans for the tool include full source-level inter-procedural analysis +and deeper checks such as buffer overrun detection. There are many opportunities +to extend and enhance the static analyzer, and anyone interested in working on +this project is encouraged to get involved!

LLVM 2.3 no longer includes the llvm-upgrade tool. It was useful - for upgrading LLVM 1.9 files to LLVM 2.x syntax, but you can always use a - previous LLVM release to do this. One nice impact of this is that the LLVM - regression test suite no longer depends on llvm-upgrade, which makes it run - faster.

+ + +

+VMKit: JVM/CLI Virtual Machine Implementation +

The llvm2cpp tool has been folded into llc; use - llc -march=cpp instead of llvm2cpp.

+The VMKit project is an implementation of +a JVM and a CLI Virtual Machine (Microsoft .NET is an +implementation of the CLI) using LLVM for static and just-in-time +compilation.

LLVM API Changes:

+VMKit version 0.26 builds with LLVM 2.6 and you can find it on its +web page. The release includes +bug fixes, cleanup and new features. The major changes are:

Several core LLVM IR classes have migrated to use the - 'FOOCLASS::Create(...)' pattern instead of 'new - FOOCLASS(...)' (e.g. where FOOCLASS=BasicBlock). We hope to - standardize on FOOCLASS::Create for all IR classes in the future, - but not all of them have been moved over yet.
LLVM 2.3 renames the LLVMBuilder and LLVMFoldingBuilder classes to - IRBuilder. -
MRegisterInfo was renamed to - - TargetRegisterInfo.
The MappedFile class is gone, please use - - MemoryBuffer instead.
The '-enable-eh' flag to llc has been removed. Now code should - encode whether it is safe to omit unwind information for a function by - tagging the Function object with the 'nounwind' attribute.
The ConstantFP::get method that uses APFloat now takes one argument - instead of two. The type argument has been removed, and the type is - now inferred from the size of the given APFloat value.
A new llcj tool to generate shared libraries or executables of Java + files.
Cooperative garbage collection.
Fast subtype checking (paper from Click et al [JGI'02]).
Implementation of a two-word header for Java objects instead of the original + three-word header.
Better Java specification-compliance: division by zero checks, stack + overflow checks, finalization and references support.
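The 'FOOCLASS::Create(...)' pattern mentioned in the LLVM API changes is a named-constructor idiom. The following is a minimal sketch of that idiom only: the Block class is a hypothetical stand-in rather than a real LLVM IR class, and returning std::unique_ptr is a choice made for this sketch, not a statement about how the LLVM classes manage ownership.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an IR class; this is not llvm::BasicBlock.
class Block {
public:
    // Named constructor: callers write Block::Create("entry") instead of
    // new Block("entry"), so the class controls how instances are built.
    static std::unique_ptr<Block> Create(const std::string& name) {
        assert(!name.empty() && "blocks in this sketch must be named");
        return std::unique_ptr<Block>(new Block(name));
    }

    const std::string& name() const { return name_; }

private:
    // The constructor is private, so 'new Block(...)' does not compile
    // outside the class.
    explicit Block(const std::string& name) : name_(name) {}

    std::string name_;
};
```

Because construction goes through one function, the class can later change how objects are allocated without touching call sites.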

-Other LLVM Sub-Projects +compiler-rt: Compiler Runtime Library

-The core LLVM 2.3 distribution currently consists of code from the core LLVM -repository (which roughly contains the LLVM optimizer, code generators and -supporting tools) and the llvm-gcc repository. In addition to this code, the -LLVM Project includes other sub-projects that are in development. The two which -are the most actively developed are the new vmkit Project -and the Clang Project. -

+The new LLVM compiler-rt project +is a simple library that provides an implementation of the low-level +target-specific hooks required by code generation and other runtime components. +For example, when compiling for a 32-bit target, converting a double to a 64-bit +unsigned integer is compiled into a runtime call to the "__fixunsdfdi" +function. The compiler-rt library provides highly optimized implementations of +this and other low-level routines (some are 3x faster than the equivalent +libgcc routines).
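As an illustration of the conversion described above, here is a small sketch of source code that can give rise to such a runtime call. The function name to_u64 is an assumption made for this example; whether a given compiler emits a call to "__fixunsdfdi" or inline instructions depends on the target and its configuration.

```cpp
#include <cassert>
#include <cstdint>

// Converts a double to a 64-bit unsigned integer, truncating toward zero.
// On a 32-bit target without a native instruction for this conversion, a
// compiler typically lowers the cast to a runtime helper such as
// __fixunsdfdi rather than expanding it inline.
std::uint64_t to_u64(double value) {
    // The cast is only defined for values in [0, 2^64).
    assert(value >= 0.0 && value < 18446744073709551616.0);
    return static_cast<std::uint64_t>(value);
}
```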

+ +

+All of the code in the compiler-rt project is available under the standard LLVM +License, a "BSD-style" license.

-vmkit +

+KLEE: Symbolic Execution and Automatic Test Case Generator

-The "vmkit" project is a new addition to the LLVM family. It is an -implementation of a JVM and a CLI Virtual Machine (Microsoft .NET is an -implementation of the CLI) using the Just-In-Time compiler of LLVM.

- -

The JVM, called JnJVM, executes real-world applications such as Apache -projects (e.g. Felix and Tomcat) and the SpecJVM98 benchmark. It uses the GNU -Classpath project for the base classes. The CLI implementation, called N3, is -in its early stages but can execute simple applications and the "pnetmark" -benchmark. It uses the pnetlib project as its core library.

- -

The 'vmkit' VMs compare in performance with industrial and top open-source -VMs on scientific applications. Besides the JIT, the VMs use many features of -the LLVM framework, including the standard set of optimizations, atomic -operations, custom function provider and memory manager for JITed methods, and -specific virtual machine optimizations. vmkit is not an official part of LLVM -2.3 release. It is publicly available under the LLVM license and can be -downloaded from: -

+The new LLVM KLEE project is a symbolic +execution framework for programs in LLVM bitcode form. KLEE tries to +symbolically evaluate "all" paths through the application and records state +transitions that lead to fault states. This allows it to construct testcases +that lead to faults and can even be used to verify algorithms. For more +details, please see the OSDI 2008 paper about +KLEE.

+ +

+ + +

+DragonEgg: GCC-4.5 as an LLVM frontend +

-svn co http://llvm.org/svn/llvm-project/vmkit/trunk vmkit +The goal of DragonEgg is to make +gcc-4.5 act like llvm-gcc without requiring any gcc modifications whatsoever. +DragonEgg is a shared library (dragonegg.so) +that is loaded by gcc at runtime. It uses the new gcc plugin architecture to +disable the GCC optimizers and code generators, and schedule the LLVM optimizers +and code generators (or direct output of LLVM IR) instead. Currently only Linux +and Darwin are supported, and only on x86-32 and x86-64. It should be easy to +add additional unix-like architectures and other processor families. In theory +it should be possible to use DragonEgg +with any language supported by gcc, however only C and Fortran work well for the +moment. Ada and C++ work to some extent, while Java, Obj-C and Obj-C++ are so +far entirely untested. Since gcc-4.5 has not yet been released, neither has +DragonEgg. To build +DragonEgg you will need to check out the +development versions of gcc, +llvm and +DragonEgg from their respective +subversion repositories, and follow the instructions in the +DragonEgg README.

+ -

-Clang +

+llvm-mc: Machine Code Toolkit

+The LLVM Machine Code (MC) Toolkit project is a (very early) effort to build +better tools for dealing with machine code, object file formats, etc. The idea +is to be able to generate most of the target specific details of assemblers and +disassemblers from existing LLVM target .td files (with suitable enhancements), +and to build infrastructure for reading and writing common object file formats. +One of the first deliverables is to build a full assembler and integrate it into +the compiler, which is predicted to substantially reduce compile time in some +scenarios. +

The Clang project is an effort to build -a set of new 'LLVM native' front-end technologies for the LLVM optimizer -and code generator. Clang is continuing to make major strides forward in all -areas. Its C and Objective-C parsing support is very solid, and the code -generation support is far enough along to build many C applications. While not -yet production quality, it is progressing very nicely. In addition, C++ -front-end work has started to make significant progress.

- -

At this point, Clang is most useful if you are interested in source-to-source -transformations (such as refactoring) and other source-level tools for C and -Objective-C. Clang now also includes tools for turning C code into pretty HTML, -and includes a new static -analysis tool in development. This tool focuses on automatically finding -bugs in C and Objective-C code.

In the LLVM 2.6 timeframe, the MC framework has grown to the point where it +can reliably parse and pretty print (with some encoding information) a +darwin/x86 .s file, and has the very early phases of a Mach-O +assembler in progress. Beyond the MC framework itself, major refactoring of the +LLVM code generator has started. The idea is to make the code generator reason +about the code it is producing in a much more semantic way, rather than a +textual way. For example, the code generator now uses MCSection objects to +represent section assignments, instead of text strings that print to .section +directives.

+ +

MC is an early and ongoing project that will hopefully continue to lead to +many improvements in the code generator and build infrastructure useful for many +other situations. +

- What's New? + External Open Source Projects Using LLVM 2.6

LLVM 2.3 includes a huge number of bug fixes, performance tweaks and minor -improvements. Some of the major improvements and new features are listed in -this section. -

An exciting aspect of LLVM is that it is used as an enabling technology for + a lot of other language and tools projects. This section lists some of the + projects that have already been updated to work with LLVM 2.6.

-Major New Features +Rubinius

Rubinius is an environment +for running Ruby code which strives to write as much of the core class +implementation in Ruby as possible. Combined with a bytecode interpreting VM, it +uses LLVM to optimize and compile ruby code down to machine code. Techniques +such as type feedback, method inlining, and uncommon traps are all used to +remove dynamism from ruby execution and increase performance.

+ +

Since LLVM 2.5, Rubinius has made several major leaps forward, implementing +a counter based JIT, type feedback and speculative method inlining. +

LLVM 2.3 includes several major new capabilities:

The biggest change in LLVM 2.3 is Multiple Return Value (MRV) support. - MRVs allow LLVM IR to directly represent functions that return multiple - values without having to pass them "by reference" in the LLVM IR. This - allows a front-end to generate more efficient code, as MRVs are generally - returned in registers if a target supports them. See the LLVM IR Reference for more details.
- -
MRVs are fully supported in the LLVM IR, but are not yet fully supported - on all targets. However, it is generally safe to return up to 2 values from - a function: most targets should be able to handle at least that. MRV - support is a critical requirement for X86-64 ABI support, as X86-64 requires - the ability to return multiple registers from functions, and we use MRVs to - accomplish this in a direct way.
LLVM 2.3 includes a complete reimplementation of the "llvmc" - tool. It is designed to overcome several problems with the original - llvmc and to provide a superset of the features of the - 'gcc' driver.
- -
The main features of llvmc2 are: -
- Extended handling of command line options and smart rules for - dispatching them to different tools.
- Flexible (and extensible) rules for defining different tools.
- The different intermediate steps performed by tools are represented - as edges in the abstract graph.
- The 'language' for driver behavior definition is tablegen and thus - it's relatively easy to add new features.
- The definition of driver is transformed into set of C++ classes, thus - no runtime interpretation is needed.
-

+MacRuby +

LLVM 2.3 includes a completely rewritten interface for Link Time Optimization. This interface - is written in C, which allows for easier integration with C code bases, and - incorporates improvements we learned about from the first incarnation of the - interface.

The Kaleidoscope tutorial now - includes a "port" of the tutorial that uses the Ocaml bindings to implement - the Kaleidoscope language.

+MacRuby is an implementation of Ruby on top of +core Mac OS X technologies, such as the Objective-C common runtime and garbage +collector and the CoreFoundation framework. It is principally developed by +Apple and aims at enabling the creation of full-fledged Mac OS X applications. +

+MacRuby uses LLVM for optimization passes, JIT and AOT compilation of Ruby +expressions. It also uses zero-cost DWARF exceptions to implement Ruby exception +handling.

-llvm-gcc 4.2 Improvements +Pure

+Pure +is an algebraic/functional programming language based on term rewriting. +Programs are collections of equations which are used to evaluate expressions in +a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation, +lexical closures, a hygienic macro system (also based on term rewriting), +built-in list and matrix support (including list and matrix comprehensions) and +an easy-to-use C interface. The interpreter uses LLVM as a backend to + JIT-compile Pure programs to fast native code.

+ +

Pure versions 0.31 and later have been tested and are known to work with +LLVM 2.6 (and continue to work with older LLVM releases >= 2.3 as well). +

LLVM 2.3 fully supports the llvm-gcc 4.2 front-end, and includes support -for the C, C++, Objective-C, Ada, and Fortran front-ends.

+ +

+LLVM D Compiler +

+ +

llvm-gcc 4.2 includes numerous fixes to better support the Objective-C -front-end. Objective-C now works very well on Mac OS/X.

LDC

Fortran EQUIVALENCEs are now supported by the gfortran front-end.

+ +

+Roadsend PHP +

llvm-gcc 4.2 includes many other fixes which improve conformance with the -relevant parts of the GCC testsuite.

+Roadsend PHP (rphp) is an open +source implementation of the PHP programming +language that uses LLVM for its optimizer, JIT and static compiler. This is a +reimplementation of an earlier project that is now based on LLVM.

+ +

+Unladen Swallow +

+Unladen Swallow is a +branch of Python intended to be fully +compatible and significantly faster. It uses LLVM's optimization passes and JIT +compiler.

+ +

+llvm-lua +

+ +

+LLVM-Lua uses LLVM to add JIT +and static compiling support to the Lua VM. Lua bytecode is analyzed to +remove type checks, then LLVM is used to compile the bytecode down to machine +code.

-LLVM Core Improvements +IcedTea Java Virtual Machine Implementation

New features include: +

+IcedTea provides a +harness to build OpenJDK using only free software build tools and to provide +replacements for the not-yet free parts of OpenJDK. One of the extensions that +IcedTea provides is a new JIT compiler named Shark which uses LLVM +to provide native code generation without introducing processor-dependent +code.

+ + + + +

+ What's New in LLVM 2.6? +

+ + +

+ +

This release includes a huge number of bug fixes, performance tweaks and +minor improvements. Some of the major improvements and new features are listed +in this section. +

+ +

+ + +

+Major New Features +

+ +

LLVM 2.6 includes several major new capabilities:

LLVM IR now directly represents "common" linkage, instead of representing it -as a form of weak linkage.
New compiler-rt, KLEE + and machine code toolkit sub-projects.
Debug information now includes line numbers when optimizations are enabled. + This allows statistical sampling tools like OProfile and Shark to map + samples back to source lines.
LLVM now includes new experimental backends to support the MSP430, SystemZ + and BlackFin architectures.
LLVM supports a new Gold Linker Plugin which + enables support for transparent + link-time optimization on ELF targets when used with the Gold binutils + linker.
LLVM now supports doing optimization and code generation on multiple + threads. Please see the LLVM + Programmer's Manual for more information.
LLVM now has experimental support for embedded + metadata in LLVM IR, though the implementation is not guaranteed to be + final and the .bc file format may change in future releases. Debug info + does not yet use this format in LLVM 2.6.

+ +

+ + +

+LLVM IR and Core Improvements +

LLVM IR now has support for atomic operations, and this functionality can -be accessed through the llvm-gcc "__sync_synchronize", -"__sync_val_compare_and_swap", and related builtins. Support for atomics is -available in the Alpha, X86, X86-64, and PowerPC backends.

LLVM IR has several new features for better support of new targets and that +expose new optimization opportunities:

The C and Ocaml bindings have been extended to cover pass managers, several -transformation passes, iteration over the LLVM IR, target data, and parameter -attribute lists.

The add, sub and mul + instructions have been split into integer and floating point versions (like + divide and remainder), introducing new fadd, fsub, + and fmul instructions.
The add, sub and mul + instructions now support optional "nsw" and "nuw" bits which indicate that + the operation is guaranteed to not overflow (in the signed or + unsigned case, respectively). This gives the optimizer more information and + can be used for things like C signed integer values, which are undefined on + overflow.
The sdiv instruction now supports an + optional "exact" flag which indicates that the result of the division is + guaranteed to have a remainder of zero. This is useful for optimizing pointer + subtraction in C.
The getelementptr instruction now + supports arbitrary integer index values for array/pointer indices. This + allows for better code generation on 16-bit pointer targets like PIC16.
The getelementptr instruction now + supports an "inbounds" optimization hint that tells the optimizer that the + pointer is guaranteed to be within its allocated object.
LLVM now supports a series of new linkage types for global values which allow + for better optimization and new capabilities: +
- linkonce_odr and + weak_odr have the same linkage + semantics as the non-"odr" linkage types. The difference is that these + linkage types indicate that all definitions of the specified function + are guaranteed to have the same semantics. This allows inlining + template functions in C++ but not inlining weak functions in C, + which previously both got the same linkage type.
- available_externally + is a new linkage type that gives the optimizer visibility into the + definition of a function (allowing inlining and side effect analysis) + but that does not cause code to be generated. This allows better + optimization of "GNU inline" functions, extern templates, etc.
- linker_private is a + new linkage type (which is only useful on Mac OS X) that is used for + some metadata generation and other obscure things.
Finally, target-specific intrinsics can now return multiple values, which + is useful for modeling target operations with multiple results.
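To make the "nsw" and "exact" items above more concrete, here is a hedged source-level sketch of the two situations they describe. The function names are assumptions made for this example, and the comments describe what a front-end and optimizer are permitted to do, not what any particular build is guaranteed to do.

```cpp
#include <cassert>
#include <cstddef>

// Signed overflow is undefined in C and C++, so a front-end may mark this
// add as "nsw". With that guarantee an optimizer is free to fold the
// comparison to true, since x + 1 cannot wrap to a smaller value.
bool successor_is_greater(int x) {
    return x + 1 > x;
}

// Pointer subtraction divides a byte distance by sizeof(int). Both pointers
// must refer to elements of the same array, so the division leaves no
// remainder, which is the property the "exact" flag on sdiv expresses.
std::ptrdiff_t element_distance(const int* first, const int* last) {
    assert(first != nullptr && last != nullptr);
    return last - first;
}
```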

- +

@@ -322,215 +569,443 @@ attribute lists.

In addition to a huge array of bug fixes and minor performance tweaks, the -LLVM 2.3 optimizers support a few major enhancements:

In addition to a large array of minor performance tweaks and bug fixes, this +release includes a few major enhancements and additions to the optimizers:

Loop index set splitting on by default. -This transformation hoists conditions from loop bodies and reduces a loop's -iteration space to improve performance. For example,
+
The Scalar Replacement of Aggregates + pass has many improvements that allow it to better promote vector unions, + variables which are memset, and other unusual code that happens to + do bitfield accesses into register operations. An interesting change is that + it now produces "unusual" integer sizes (like i1704) in some cases and lets + other optimizers clean things up.
The Loop Strength Reduction pass now + promotes small integer induction variables to 64-bit on 64-bit targets, + which provides a major performance boost for much numerical code. It also + promotes shorts to int on 32-bit hosts, etc. LSR now also analyzes pointer + expressions (e.g. getelementptrs), as well as integers.
The GVN pass now eliminates partial + redundancies of loads in simple cases.
The Inliner now reuses stack space when + inlining similar arrays from multiple callees into one caller.
LLVM includes a new experimental Static Single Information (SSI) + construction pass.
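The induction variable promotion done by Loop Strength Reduction can be pictured at the source level as follows. This is only a sketch under the assumption of a 64-bit target: the pass works on LLVM IR rather than on source code, and the function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Source form: a 32-bit signed induction variable indexes the array, so on
// a 64-bit target each access needs the index extended to pointer width.
long long sum_narrow(const int* data, int count) {
    assert(count >= 0);
    long long total = 0;
    for (int i = 0; i < count; ++i)
        total += data[i];
    return total;
}

// Hand-written equivalent of the promoted form: the induction variable is
// already 64 bits wide, so no per-iteration extension is needed.
long long sum_wide(const int* data, int count) {
    assert(count >= 0);
    long long total = 0;
    const std::int64_t limit = count;
    for (std::int64_t i = 0; i < limit; ++i)
        total += data[i];
    return total;
}
```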

-for (i = LB; i < UB; ++i)
-  if (i <= NV)
-    LOOP_BODY
-

is transformed into:

-NUB = min(NV+1, UB)
-for (i = LB; i < NUB; ++i)
-  LOOP_BODY
-
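The two loop forms above can be checked against each other with a small sketch. Here LOOP_BODY is replaced by recording the visited index, the function names are invented for this example, and the split form assumes that NV + 1 does not overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Original form: the condition is tested on every iteration.
std::vector<int> visit_guarded(int lb, int ub, int nv) {
    std::vector<int> visited;
    for (int i = lb; i < ub; ++i)
        if (i <= nv)
            visited.push_back(i);  // stands in for LOOP_BODY
    return visited;
}

// Split form: the condition is folded into a tighter upper bound, so the
// loop body runs unconditionally over a smaller iteration space.
std::vector<int> visit_split(int lb, int ub, int nv) {
    assert(nv < INT_MAX);  // nv + 1 must not overflow
    std::vector<int> visited;
    const int nub = std::min(nv + 1, ub);
    for (int i = lb; i < nub; ++i)
        visited.push_back(i);
    return visited;
}
```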

- -

LLVM now includes a new memcpy optimization pass which removes -dead memcpy calls, unneeded copies of aggregates, and performs -return slot optimization. The LLVM optimizer now notices long sequences of -consecutive stores and merges them into memcpy's where profitable.

- -

Alignment detection for vector memory references and for memcpy and -memset is now more aggressive.

- -

The Aggressive Dead Code Elimination (ADCE) optimization has been rewritten -to make it both faster and safer in the presence of code containing infinite -loops. Some of its prior functionality has been factored out into the loop -deletion pass, which is safe for infinite loops. The new ADCE pass is -no longer based on control dependence, making it run faster.

- -

The 'SimplifyLibCalls' pass, which optimizes calls to libc and libm - functions for C-based languages, has been rewritten to be a FunctionPass - instead of a ModulePass. This allows it to be run more often and to be - included at -O1 in llvm-gcc. It was also extended to include more - optimizations and several corner case bugs were fixed.

- -

LLVM now includes a simple 'Jump Threading' pass, which attempts to simplify - conditional branches using information about predecessor blocks, simplifying - the control flow graph. This pass is pretty basic at this point, but - catches some important cases and provides a foundation to build on.
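A source-level picture of the kind of control flow that jump threading can simplify is sketched below. The pass itself operates on the LLVM IR control flow graph, so this is an analogy, and the function names are invented for this example.

```cpp
#include <cassert>

// Before threading: the second branch tests a value that is fully
// determined by which arm of the first branch was taken.
int classify_unthreaded(int x) {
    int kind;
    if (x < 0)
        kind = 0;
    else
        kind = 1;
    // The outcome of this test is known from the predecessor block.
    if (kind == 0)
        return -x;
    return x * 2;
}

// After threading: each predecessor jumps straight to its known successor,
// and the intermediate test disappears.
int classify_threaded(int x) {
    if (x < 0)
        return -x;
    return x * 2;
}

// Checks that both forms agree on a few small sample inputs.
void check_threading_equivalence() {
    for (int x = -5; x <= 5; ++x)
        assert(classify_unthreaded(x) == classify_threaded(x));
}
```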

- -

Several corner case bugs which could lead to deleting volatile memory - accesses have been fixed.

+ +

+Interpreter and JIT Improvements +

Several optimizations have been sped up, leading to faster code generation - with the same code quality.

- +

+ +

LLVM has a new "EngineBuilder" class which makes it more obvious how to + set up and configure an ExecutionEngine (a JIT or interpreter).
The JIT now supports generating more than 16M of code.
When configured with --with-oprofile, the JIT can now inform + OProfile about JIT'd code, allowing OProfile to get line number and function + name information for JIT'd functions.
When "libffi" is available, the LLVM interpreter now uses it, which supports + calling almost arbitrary external (natively compiled) functions.
Clients of the JIT can now register a 'JITEventListener' object to receive + callbacks when the JIT emits or frees machine code. The OProfile support + uses this mechanism.

-Code Generator Improvements +Target Independent Code Generator Improvements

We put a significant amount of work into the code generator infrastructure, -which allows us to implement more aggressive algorithms and make it run -faster:

We have put a significant amount of work into the code generator +infrastructure, which allows us to implement more aggressive algorithms and make +it run faster:

The code generator now has support for carrying information about memory - references throughout the entire code generation process, via the - - MachineMemOperand class. In the future this will be used to improve - both pre-pass and post-pass scheduling, and to improve compiler-debugging - output.
The target-independent code generator infrastructure now uses LLVM's - APInt - class to handle integer values, which allows it to support integer types - larger than 64 bits (for example i128). Note that support for such types is - also dependent on target-specific support. Use of APInt is also a step - toward support for non-power-of-2 integer sizes.
LLVM 2.3 includes several compile time speedups for code with large basic - blocks, particularly in the instruction selection phase, register - allocation, scheduling, and tail merging/jump threading.
LLVM 2.3 includes several improvements which make llc's - --view-sunit-dags visualization of scheduling dependency graphs - easier to understand.
The code generator allows targets to write patterns that generate subreg - references directly in .td files now.
memcpy lowering in the backend is more aggressive, particularly for - memcpy calls introduced by the code generator when handling - pass-by-value structure argument copies.
Inline assembly with multiple register results now returns those results - directly in the appropriate registers, rather than going through memory. - Inline assembly that uses constraints like "ir" with immediates now use the - 'i' form when possible instead of always loading the value in a register. - This saves an instruction and reduces register use.
Added support for PIC/GOT style tail calls on X86/32 and initial - support for tail calls on PowerPC 32 (it may also work on PowerPC 64 but is - not thoroughly tested).
The llc -asm-verbose option (exposed from llvm-gcc as -dA + and clang as -fverbose-asm or -dA) now adds a lot of + useful information in comments to + the generated .s file. This information includes location information (if + built with -g) and loop nest information.
The code generator now supports a new MachineVerifier pass which is useful + for finding bugs in targets and codegen passes.
The Machine LICM is now enabled by default. It hoists instructions out of + loops (such as constant pool loads, loads from read-only stubs, vector + constant synthesization code, etc.) and is currently configured to only do + so when the hoisted operation can be rematerialized.
The Machine Sinking pass is now enabled by default. This pass moves + side-effect free operations down the CFG so that they are executed on fewer + paths through a function.
The code generator now performs "stack slot coloring" of register spills, + which allows spill slots to be reused. This leads to smaller stack frames + in cases where there are lots of register spills.
The register allocator has many improvements to take better advantage of + commutable operations, various spiller peephole optimizations, and can now + coalesce cross-register-class copies.
Tblgen now supports multiclass inheritance and a number of new string and + list operations like !(subst), !(foreach), !car, + !cdr, !null, !if, !cast. + These make the .td files more expressive and allow more aggressive factoring + of duplication across instruction patterns.
Target-specific intrinsics can now be added without having to hack VMCore to + add them. This makes it easier to maintain out-of-tree targets.
The instruction selector is better at propagating information about values + (such as whether they are sign/zero extended etc.) across basic block + boundaries.
The SelectionDAG datastructure has new nodes for representing buildvector + and vector shuffle operations. This + makes operations and pattern matching more efficient and easier to get + right.
The Prolog/Epilog Insertion Pass now has experimental support for performing + the "shrink wrapping" optimization, which moves spills and reloads around in + the CFG to avoid doing saves on paths that don't need them.
LLVM includes new experimental support for writing ELF .o files directly + from the compiler. It works well for many simple C testcases, but doesn't + support exception handling, debug info, inline assembly, etc.
Targets can now specify register allocation hints through + MachineRegisterInfo::setRegAllocationHint. A regalloc hint consists + of a hint type and a physical register number. A hint type of zero specifies a + register allocation preference. Other hint type values are target specific + and are resolved by TargetRegisterInfo::ResolveRegAllocHint. An + example is the ARM target which uses register hints to request that the + register allocator provide an even / odd register pair to two virtual + registers.
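The effect of the Machine Sinking pass listed above can be pictured with a source-level analogy. The pass works on machine instructions after instruction selection, so the following is only a sketch of the idea, with function names invented for this example.

```cpp
#include <cassert>

// Unsunk form: the product is computed on every path through the function,
// even though only one path uses it.
unsigned scaled_or_zero(unsigned a, unsigned b, bool use_product) {
    unsigned product = a * b;  // side-effect free operation
    if (use_product)
        return product;
    return 0;
}

// Sunk form: the multiply has moved down the CFG into the only block that
// needs its result, so the other path no longer pays for it.
unsigned scaled_or_zero_sunk(unsigned a, unsigned b, bool use_product) {
    if (use_product)
        return a * b;
    return 0;
}

// Checks that both forms agree on a few sample inputs.
void check_sinking_equivalence() {
    for (unsigned a = 0; a < 4; ++a)
        for (unsigned b = 0; b < 4; ++b) {
            assert(scaled_or_zero(a, b, true) == scaled_or_zero_sunk(a, b, true));
            assert(scaled_or_zero(a, b, false) == scaled_or_zero_sunk(a, b, false));
        }
}
```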

+ +

+X86-32 and X86-64 Target Improvements

New features of the X86 target include: +

+ +

SSE 4.2 builtins are now supported.
GCC-compatible soft float modes are now supported, which are typically used + by OS kernels.
X86-64 now models implicit zero extensions better, which allows the code + generator to remove a lot of redundant zexts. It also models the 8-bit "H" + registers as subregs, which allows them to be used in some tricky + situations.
X86-64 now supports the "local exec" and "initial exec" thread local storage models.
The vector forms of the icmp and fcmp instructions now select to efficient + SSE operations.
Support for the win64 calling conventions has improved. The primary missing feature is support for varargs function definitions. It seems to work well for many win64 JIT purposes.
The X86 backend has preliminary support for mapping address spaces to segment + register references. This allows you to write GS or FS relative memory + accesses directly in LLVM IR for cases where you know exactly what you're + doing (such as in an OS kernel). There are some known problems with this + support, but it works in simple cases.
The X86 code generator has been refactored to move all global variable + reference logic to one place + (X86Subtarget::ClassifyGlobalReference) which + makes it easier to reason about.

+ +

-X86/X86-64 Specific Improvements +PIC16 Target Improvements

New target-specific features include: +

New features of the PIC16 target include:

llvm-gcc's X86-64 ABI conformance is far improved, particularly in the - area of passing and returning structures by value. llvm-gcc compiled code - now interoperates very well on X86-64 systems with other compilers.
Support for Win64 was added. This includes code generation itself, JIT - support, and necessary changes to llvm-gcc.
The LLVM X86 backend now supports the support SSE 4.1 instruction set, and - the llvm-gcc 4.2 front-end supports the SSE 4.1 compiler builtins. Various - generic vector operations (insert/extract/shuffle) are much more efficient - when SSE 4.1 is enabled. The JIT automatically takes advantage of these - instructions, but llvm-gcc must be explicitly told to use them, e.g. with - -march=penryn.
The X86 backend now does a number of optimizations that aim to avoid - converting numbers back and forth from SSE registers to the X87 floating - point stack. This is important because most X86 ABIs require return values - to be on the X87 Floating Point stack, but most CPUs prefer computation in - the SSE units.
The X86 backend supports stack realignment, which is particularly useful for - vector code on OS's without 16-byte aligned stacks, such as Linux and - Windows.
The X86 backend now supports the "sseregparm" options in GCC, which allow - functions to be tagged as passing floating point values in SSE - registers.
Support for floating-point, indirect function calls, and + passing/returning aggregate types to functions. +
The code generator is able to generate debug info into output COFF files. +
Support for placing an object into a specific section or at a specific + address in memory.

+ +

Things not yet supported:

+ +

Variable arguments.
Interrupts/programs.

+ +

Trampolines (taking the address of a nested function) now work on - Linux/X86-64.

+ +

+ARM Target Improvements +

__builtin_prefetch is now compiled into the appropriate prefetch - instructions instead of being ignored.

New features of the ARM target include: +

128-bit integers are now supported on X86-64 targets. This can be used - through __attribute__((TImode)) in llvm-gcc.

The register allocator can now rematerialize PIC-base computations, which is - an important optimization for register use.
Preliminary support for processors, such as the Cortex-A8 and Cortex-A9, +that implement version v7-A of the ARM architecture. The ARM backend now +supports both the Thumb2 and Advanced SIMD (Neon) instruction sets.
The "t" and "f" inline assembly constraints for the X87 floating point stack - now work. However, the "u" constraint is still not fully supported.
The AAPCS-VFP "hard float" calling conventions are also supported with the +-float-abi=hard flag.
The ARM calling convention code is now tblgen generated instead of resorting + to C++ code.

- + +

These features are still somewhat experimental +and subject to change. The Neon intrinsics, in particular, may change in future +releases of LLVM. ARMv7 support has progressed a lot on top of tree since 2.6 +branched.

+ +

-Other Target Specific Improvements +Other Target Specific Improvements

New target-specific features include: +

New features of other targets include:

The LLVM C backend now supports vector code.
The Cell SPU backend includes a number of improvements. It generates better - code and its stability/completeness is improving.
Mips now supports O32 Calling Convention.
Many improvements to the 32-bit PowerPC SVR4 ABI (used on powerpc-linux) + support, lots of bugs fixed.
Added support for the 64-bit PowerPC SVR4 ABI (used on powerpc64-linux). + Needs more testing.

- + +

+ + +

+New Useful APIs

+ +

This release includes a number of new APIs that are used internally, which + may also be useful for external clients. +

New + PrettyStackTrace class allows crashes of llvm tools (and applications + that integrate them) to provide more detailed indication of what the + compiler was doing at the time of the crash (e.g. running a pass). + At the top level for each LLVM tool, it includes the command line arguments. +
New StringRef and Twine classes make operations on character ranges and string concatenation more efficient. StringRef is just a const char* with a length; Twine is a light-weight rope.
LLVM has new WeakVH, AssertingVH and CallbackVH classes, which make it easier to write LLVM IR transformations. WeakVH automatically drops to null when the referenced Value is deleted, and is updated across a replaceAllUsesWith operation. AssertingVH aborts the program if the referenced value is destroyed while it is being referenced. CallbackVH is a customizable class for handling value references. See ValueHandle.h for more information.
The new 'Triple + ' class centralizes a lot of logic that reasons about target + triples.
The new ' + llvm_report_error()' set of APIs allows tools to embed the LLVM + optimizer and backend and recover from previously unrecoverable errors.
LLVM has new abstractions for atomic operations + and reader/writer + locks.
LLVM has new + SourceMgr and SMLoc classes which implement caret + diagnostics and basic include stack processing for simple parsers. It is + used by tablegen, llvm-mc, the .ll parser and FileCheck.
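The WeakVH behavior described in the list above (drop to null on deletion, follow a replaceAllUsesWith) can be modeled in a few lines. This is a sketch, not the contents of ValueHandle.h: the Value and WeakHandle classes below are hypothetical miniatures written for this illustration, and the real classes track handles differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class WeakHandle;

// Hypothetical miniature value class; it only tracks the handles watching it.
class Value {
  std::vector<WeakHandle *> Handles;
  friend class WeakHandle;

public:
  Value() {}
  Value(const Value &) = delete;
  Value &operator=(const Value &) = delete;
  ~Value();
  void replaceAllUsesWith(Value *New);
};

// Hypothetical weak handle: registers itself with the value it watches.
class WeakHandle {
  Value *V;
  friend class Value;

public:
  explicit WeakHandle(Value *Val) : V(Val) {
    if (V)
      V->Handles.push_back(this);
  }
  WeakHandle(const WeakHandle &) = delete;
  WeakHandle &operator=(const WeakHandle &) = delete;
  ~WeakHandle() {
    if (!V)
      return;
    std::vector<WeakHandle *> &H = V->Handles;
    H.erase(std::remove(H.begin(), H.end(), this), H.end());
  }
  Value *get() const { return V; }
};

// Deleting a value nulls every handle still watching it.
Value::~Value() {
  for (WeakHandle *H : Handles)
    H->V = nullptr;
}

// Re-point every watching handle at the replacement value.
void Value::replaceAllUsesWith(Value *New) {
  assert(New && New != this && "replacement must be a different live value");
  for (WeakHandle *H : Handles) {
    H->V = New;
    New->Handles.push_back(H);
  }
  Handles.clear();
}
```

A transformation holding such a handle can delete or replace values freely and then check get() for null, instead of tracking by hand which raw pointers have gone stale.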

+ + +

-Other Improvements +Other Improvements and New Features

New features include: -

Other miscellaneous features include:

+ +

LLVM now includes a new internal 'FileCheck' tool which allows + writing much more accurate regression tests that run faster. Please see the + FileCheck section of the Testing + Guide for more information.
LLVM profile information support has been significantly improved to produce correct use counts, and has support for edge profiling with reduced runtime overhead. Combined, the generated profile information is both more correct and imposes about half as much overhead (from 12% to 6% overhead on SPEC CPU2000).
The C bindings (in the llvm/include/llvm-c directory) include many newly + supported APIs.
LLVM 2.6 includes brand new experimental LLVM bindings to the Ada2005 programming language.
The LLVMC driver has several new features: +
- Dynamic plugins now work on Windows.
- New option property: init. Makes it possible to provide default values for options defined in plugins (interface to cl::init).
- New example: Skeleton, shows how to create a standalone LLVMC-based + driver.
- New example: mcc16, a driver for the PIC16 toolchain.
+

+ +

+ + + +

+Major Changes and Removed Features +

+ +

If you're already an LLVM user or developer with out-of-tree changes based +on LLVM 2.5, this section lists some "gotchas" that you may run into upgrading +from the previous release.

+ +

The Itanium (IA64) backend has been removed. It was not actively supported + and had bitrotted.
The BigBlock register allocator has been removed; it had also bitrotted.
The C Backend (-march=c) is no longer considered part of the LLVM release +criteria. We still want it to work, but no one is maintaining it and it lacks +support for arbitrary precision integers and other important IR features.
All LLVM tools now default to overwriting their output file, behaving more + like standard unix tools. Previously, this only happened with the '-f' + option.
LLVM build now builds all libraries as .a files instead of some + libraries as relinked .o files. This requires some APIs like + InitializeAllTargets.h. +

+ + +

In addition, many APIs have changed in this release. Some of the major LLVM +API changes are:

LLVM now builds with GCC 4.3.
Bugpoint now supports running custom scripts (with the -run-custom - option) to determine how to execute the command and whether it is making - forward process.
All uses of hash_set and hash_map have been removed from + the LLVM tree and the wrapper headers have been removed.
The llvm/Streams.h and DOUT member of Debug.h have been removed. The + llvm::Ostream class has been completely removed and replaced with + uses of raw_ostream.
LLVM's global uniquing tables for Types and Constants have + been privatized into members of an LLVMContext. A number of APIs + now take an LLVMContext as a parameter. To smooth the transition + for clients that will only ever use a single context, the new + getGlobalContext() API can be used to access a default global + context which can be passed in any and all cases where a context is + required. +
The getABITypeSize methods are now called getAllocSize.
The Add, Sub and Mul operators are no longer + overloaded for floating-point types. Floating-point addition, subtraction + and multiplication are now represented with new operators FAdd, + FSub and FMul. In the IRBuilder API, + CreateAdd, CreateSub, CreateMul and + CreateNeg should only be used for integer arithmetic now; + CreateFAdd, CreateFSub, CreateFMul and + CreateFNeg should now be used for floating-point arithmetic.
The DynamicLibrary class can no longer be constructed; its functionality has moved to static member functions.
raw_fd_ostream's constructor for opening a given filename now + takes an extra Force argument. If Force is set to + false, an error will be reported if a file with the given name + already exists. If Force is set to true, the file will + be silently truncated (which is the behavior before this flag was + added).
SCEVHandle no longer exists because reference counting is no longer done for SCEV* objects. Use const SCEV* instead.
Many APIs, notably llvm::Value, now use the StringRef +and Twine classes instead of passing const char* +or std::string, as described in +the Programmer's Manual. Most +clients should be unaffected by this transition, unless they are used to +Value::getName() returning a string. Here are some tips on updating to +2.6: +
- getNameStr() is still available, and matches the old behavior. Replacing getName() calls with this is a safe option, although more efficient alternatives are now possible.
- If you were just relying on getName() being able to be sent to + a std::ostream, consider migrating + to llvm::raw_ostream.
- If you were using getName().c_str() to get a const char* pointer to the name, you can use getName().data(). Note that this string (as before) may not be the entire name if the name contains embedded null characters.
- If you were using operator + on the result of getName() and + treating the result as an std::string, you can either + use Twine::str to get the result as an std::string, or + could move to a Twine based design.
- isName() should be replaced with comparison + against getName() (this is now efficient). +
+
The registration interfaces for backend Targets have changed (what was previously TargetMachineRegistry). For backend authors, see the Writing An LLVM Backend guide. For clients, the notable API changes are:
- TargetMachineRegistry has been renamed + to TargetRegistry.
- Clients should move to using the TargetRegistry::lookupTarget() + function to find targets.
+
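The getName() tips in the list above turn on one property of a pointer-plus-length string: its length is stored, not found by scanning for a null byte. The sketch below illustrates the embedded-null caveat with a hypothetical NameRef type. It is a stand-in written for this example, not the real StringRef class, and it assumes the referenced text lives in a std::string so that data() is null terminated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a pointer-plus-length string reference.
// It does not own the characters; the std::string must outlive it.
struct NameRef {
  const char *Data;
  std::size_t Length;

  NameRef(const std::string &S) : Data(S.data()), Length(S.size()) {}

  const char *data() const { return Data; }
  std::size_t size() const { return Length; }

  // Copies exactly Length bytes, embedded nulls included.
  std::string str() const { return std::string(Data, Length); }

  // Length-aware comparison against a C string.
  bool equals(const char *CStr) const {
    return std::strlen(CStr) == Length &&
           std::memcmp(Data, CStr, Length) == 0;
  }
};

// What a C-string consumer sees: it stops at the first null byte.
std::size_t cStringLength(NameRef N) { return std::strlen(N.data()); }
```

For a name such as "a\0b" (three bytes), size() reports 3 and str() round-trips all three bytes, while a consumer of data() that expects a C string sees only "a". That is the truncation the note about embedded null characters warns about.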

- +
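Because Add, Sub and Mul are no longer overloaded for floating-point types, a front-end that used to emit one opcode per source operator now has to choose by operand type. A minimal sketch of that decision follows. The Opcode and Arith enums and the selectOpcode function are hypothetical names for this illustration; real clients would call the corresponding IRBuilder methods (CreateAdd versus CreateFAdd, and so on) named in the list above.

```cpp
#include <cassert>

// Hypothetical opcode and operator tags; they mirror the operator names in
// the notes but are not the LLVM enums.
enum class Opcode { Add, Sub, Mul, FAdd, FSub, FMul };
enum class Arith { Plus, Minus, Times };

// Pick the integer or floating-point opcode for a source-level operator.
Opcode selectOpcode(Arith Op, bool IsFloatingPoint) {
  switch (Op) {
  case Arith::Plus:
    return IsFloatingPoint ? Opcode::FAdd : Opcode::Add;
  case Arith::Minus:
    return IsFloatingPoint ? Opcode::FSub : Opcode::Sub;
  case Arith::Times:
    return IsFloatingPoint ? Opcode::FMul : Opcode::Mul;
  }
  return Opcode::Add; // not reached for valid Arith values
}
```

Keeping the choice in one helper like this means a client upgrading from 2.5 has a single place to audit rather than every arithmetic call site.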

+ +

Portability and Supported Platforms @@ -542,16 +1017,16 @@ faster:

LLVM is known to work on the following platforms:

Intel and AMD machines (IA32) running Red Hat Linux, Fedora Core and FreeBSD - (and probably other unix-like systems).
PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit and - 64-bit modes.
Intel and AMD machines (IA32, X86-64, AMD64, EMT-64) running Red Hat + Linux, Fedora Core, FreeBSD and AuroraUX (and probably other unix-like + systems).
PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit + and 64-bit modes.
Intel and AMD machines running on Win32 using MinGW libraries (native).
Intel and AMD machines running on Win32 with the Cygwin libraries (limited support is available for native builds with Visual C++).
Sun UltraSPARC workstations running Solaris 10.
Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.
Alpha-based machines running Debian GNU/Linux.
Itanium-based (IA64) machines running Linux and HP-UX.

The core LLVM infrastructure uses GNU autoconf to adapt itself @@ -569,12 +1044,26 @@ portability patches and reports of successful builds or error messages.

This section contains all known problems with the LLVM system, listed by -component. As new problems are discovered, they will be added to these -sections. If you run into a problem, please check the This section contains significant known problems with the LLVM system, +listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if there isn't already one.

The llvm-gcc bootstrap will fail with some versions of binutils (e.g. 2.15) + with a message of "Error: can not do 8 + byte pc-relative relocation" when building C++ code. We intend to + fix this on mainline, but a workaround for 2.6 is to upgrade to binutils + 2.17 or later.
LLVM will not correctly compile on Solaris and/or OpenSolaris using the stock GCC 3.x.x series 'out of the box'. See: Broken versions of GCC and other tools. However, A Modern GCC Build for x86/x86-64, meticulously tested for bootstrapping LLVM & Clang, has been made available from the third-party AuroraUX Project.

@@ -592,9 +1081,11 @@ components, please contact us on the LLVMdev list.

The MSIL, IA64, Alpha, SPU, and MIPS backends are experimental.
The llc "-filetype=asm" (the default) is the only supported - value for this option.
The MSIL, Alpha, SPU, MIPS, PIC16, Blackfin, MSP430 and SystemZ backends are + experimental.
The llc "-filetype=asm" (the default) is the only + supported value for this option. The ELF writer is experimental.
The implementation of Andersen's Alias Analysis has many known bugs.

@@ -614,15 +1105,14 @@ href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev list.

The X86 backend generates inefficient floating point code when configured to generate code for systems that don't have SSE2.

Win64 code generation wasn't widely tested. Everything should work, but we - expect small issues to happen. Also, llvm-gcc cannot build mingw64 runtime - currently due + expect small issues to happen. Also, llvm-gcc cannot build the mingw64 + runtime currently due to several - bugs due to lack of support for the - 'u' inline assembly constraint and X87 floating point inline assembly.

The X86-64 backend does not yet support position-independent code (PIC) - generation on Linux targets.

+ bugs and due to lack of support for + the + 'u' inline assembly constraint and for X87 floating point inline assembly.

The X86-64 backend does not yet support the LLVM IR instruction - va_arg. Currently, the llvm-gcc front-end supports variadic + va_arg. Currently, the llvm-gcc and front-ends support variadic argument constructs on X86-64 by lowering them manually.

@@ -650,14 +1140,14 @@ compilation, and lacks support for debug information.

Support for the Advanced SIMD (Neon) instruction set is still incomplete +and not well tested. Some features may not work at all, and the code quality +may be poor in some cases.
Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6 processors, thumb programs can crash or produce wrong results (PR1388).
Compilation for ARM Linux OABI (old ABI) is supported, but not fully tested. +
Compilation for ARM Linux OABI (old ABI) is supported but not fully tested.
There is a bug in QEMU-ARM (<= 0.9.0) which causes it to incorrectly - execute -programs compiled with LLVM. Please use more recent versions of QEMU.

@@ -670,7 +1160,7 @@ programs compiled with LLVM. Please use more recent versions of QEMU.

The SPARC backend only supports the 32-bit SPARC ABI (-m32), it does not +
The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not support the 64-bit SPARC ABI (-m64).

@@ -678,32 +1168,30 @@ programs compiled with LLVM. Please use more recent versions of QEMU.

- Known problems with the Alpha back-end + Known problems with the MIPS back-end

On 21164s, some rare FP arithmetic sequences which may trap do not have the -appropriate nops inserted to ensure restartability.
64-bit MIPS targets are not supported yet.

- Known problems with the IA64 back-end + Known problems with the Alpha back-end

The Itanium backend is highly experimental, and has a number of known - issues. We are looking for a maintainer for the Itanium backend. If you - are interested, please contact the llvmdev mailing list.

On 21164s, some rare FP arithmetic sequences which may trap do not have the +appropriate nops inserted to ensure restartability.

+ +

@@ -718,8 +1206,9 @@ appropriate nops inserted to ensure restartability. inline assembly code.

The C backend violates the ABI of common C++ programs, preventing intermixing between C++ compiled by the CBE and - C++ code compiled with llc or native compilers.

+ C++ code compiled with llc or native compilers.

The C backend does not support all exception handling constructs.

The C backend does not support arbitrary precision integers.

@@ -732,10 +1221,6 @@ appropriate nops inserted to ensure restartability.

llvm-gcc does not currently support Link-Time -Optimization on most platforms "out-of-the-box". Please inquire on the -llvmdev mailing list if you are interested.

The only major language feature of GCC not supported by llvm-gcc is the __builtin_apply family of builtins. However, some extensions are only supported on some targets. For example, trampolines are only @@ -759,13 +1244,24 @@ tested and works for a number of non-trivial programs, including LLVM itself, Qt, Mozilla, etc.

Exception handling works well on the X86 and PowerPC targets, including -X86-64 darwin. This works when linking to a libstdc++ compiled by GCC. It is -supported on X86-64 linux, but that is disabled by default in this release.
Exception handling works well on the X86 and PowerPC targets. Currently + only Linux and Darwin targets are supported (both 32 and 64 bit).

+ +

+ Known problems with the llvm-gcc Fortran front-end +

+ +

Fortran support generally works, but there are still several unresolved bugs + in Bugzilla. Please see the + tools/gfortran component for details.

@@ -773,23 +1269,26 @@ supported on X86-64 linux, but that is disabled by default in this release.

-The llvm-gcc 4.2 Ada compiler works fairly well, however this is not a mature -technology and problems should be expected. +The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature +technology, and problems should be expected.

The Ada front-end currently only builds on X86-32. This is mainly due -to lack of trampoline support (pointers to nested functions) on other platforms, -however it also fails to build on X86-64 +to lack of trampoline support (pointers to nested functions) on other platforms. +However, it also fails to build on X86-64 which does support trampolines.
The Ada front-end fails to bootstrap. -Workaround: configure with --disable-bootstrap.
The c380004 and c393010 ACATS tests -fail (c380004 also fails with gcc-4.2 mainline). When built at -O3, the -cxg2021 ACATS test also fails.
Some gcc specific Ada tests continue to crash the compiler. The testsuite -reports most tests as having failed even though they pass.
The -E binder option (exception backtraces) +This is due to lack of LLVM support for setjmp/longjmp style +exception handling, which is used internally by the compiler. +Workaround: configure with --disable-bootstrap.
The c380004, c393010 +and cxg2021 ACATS tests fail +(c380004 also fails with gcc-4.2 mainline). +If the compiler is built with checks disabled then c393010 +causes the compiler to go into an infinite loop, using up all system memory.
Some GCC specific Ada tests continue to crash the compiler.
The -E binder option (exception backtraces) does not work and will result in programs -crashing if an exception is raised. Workaround: do not use -E.

-E

Only discrete types are allowed to start or finish at a non-byte offset in a record. Workaround: do not pack records or use representation clauses that result in a field of a non-discrete type @@ -803,6 +1302,20 @@ ignored.

+ +

+ Known problems with the O'Caml bindings +

+ +

The Llvm.Linkage module is broken, and has incorrect values. Only +Llvm.Linkage.External, Llvm.Linkage.Available_externally, and +Llvm.Linkage.Link_once will be correct. If you need any of the other linkage +modes, you'll have to write an external C library in order to expose the +functionality. This has been fixed in the trunk.

Additional Information @@ -830,9 +1343,9 @@ lists.


LLVM Compiler Infrastructure
Last modified: $Date$