LLVM 2.4 Release Notes

@@ -72,12 +92,11 @@ current one. To see the release notes for a specific release, please see the

-The LLVM 2.4 distribution currently consists of code from the core LLVM -repository (which roughly includes the LLVM optimizers, code generators and -supporting tools) and the llvm-gcc repository. In addition to this code, the -LLVM Project includes other sub-projects that are in development. The two which -are the most actively developed are the Clang Project and -the VMKit Project. +The LLVM 2.6 distribution currently consists of code from the core LLVM +repository (which roughly includes the LLVM optimizers, code generators +and supporting tools), the Clang repository and the llvm-gcc repository. In +addition to this code, the LLVM Project includes other sub-projects that are in +development. Here we include updates on these subprojects.

@@ -91,15 +110,31 @@ the VMKit Project.

The Clang project is an effort to build -a set of new 'LLVM native' front-end technologies for the LLVM optimizer -and code generator. Clang is continuing to make major strides forward in all -areas. Its C and Objective-C parsing support is very solid, and the code -generation support is far enough along to build many C applications. While not -yet production quality, it is progressing very nicely. In addition, C++ -front-end work has started to make significant progress.

- -

Codegen progress/state [DANIEL]

+a set of new 'LLVM native' front-end technologies for the C family of languages. +LLVM 2.6 is the first release to officially include Clang, and it provides a +production quality C and Objective-C compiler. If you are interested in fast compiles and +good diagnostics, we +encourage you to try it out. Clang currently compiles typical Objective-C code +3x faster than GCC and compiles C code about 30% faster than GCC at -O0 -g +(which is when the most pressure is on the frontend).

+ +

In addition to supporting these languages, C++ support is also well under way, and mainline +Clang is able to parse the libstdc++ 4.2 headers and even codegen simple apps. +If you are interested in Clang C++ support or any other Clang feature, we +strongly encourage you to get involved on the Clang front-end mailing +list.

+ +

In the LLVM 2.6 time-frame, the Clang team has made many improvements:

C and Objective-C support are now considered production quality.
AuroraUX, FreeBSD and OpenBSD are now supported.
Most of Objective-C 2.0 is now supported with the GNU runtime.
Many bugs have been fixed and many features have been added.

@@ -109,25 +144,20 @@ front-end work has started to make significant progress.

The Clang project also includes an early stage static source code analysis -tool for automatically -finding bugs in C and Objective-C programs. The tool performs a growing set -of checks to find bugs that occur on a specific path within a program. Examples -of bugs the tool finds include logic errors such as null dereferences, -violations of various API rules, dead code, and potential memory leaks in -Objective-C programs. Since its inception, public feedback on the tool has been -extremely positive, and conservative estimates put the number of real bugs it -has found in industrial-quality software on the order of thousands.

Previously announced in the 2.4 and 2.5 LLVM releases, the Clang project also +includes an early stage static source code analysis tool for automatically finding bugs +in C and Objective-C programs. The tool performs checks to find +bugs that occur on a specific path within a program.

The tool also provides a simple web GUI to inspect potential bugs found by -the tool. While still early in development, the GUI illustrates some of the key -features of Clang: accurate source location information, which is used by the -GUI to highlight specific code expressions that relate to a bug (including those -that span multiple lines) and built-in knowledge of macros, which is used to -perform inline expansion of macros within the GUI itself.

In the LLVM 2.6 time-frame, the analyzer core has undergone several important +improvements and cleanups and now includes a new Checker interface that +is intended to eventually serve as a basis for domain-specific checks. Further, +in addition to generating HTML files for reporting analysis results, the +analyzer can now also emit bug reports in a structured XML format that is +intended to be easily readable by other programs.

The set of checks performed by the static analyzer is gradually expanding, -and +

The set of checks performed by the static analyzer continues to expand, and future plans for the tool include full source-level inter-procedural analysis and deeper checks such as buffer overrun detection. There are many opportunities to extend and enhance the static analyzer, and anyone interested in working on @@ -143,170 +173,391 @@ this project is encouraged to get involved!

The VMKit project is an implementation of -a JVM and a CLI Virtual Machines (Microsoft .NET is an -implementation of the CLI) using the Just-In-Time compiler of LLVM.

+a JVM and a CLI Virtual Machine (Microsoft .NET is an +implementation of the CLI) using LLVM for static and just-in-time +compilation.

Following LLVM 2.4, VMKit has its first release 0.24 that you can find on -the release page. The release includes -bug fixes, cleanup and new features. The major changes include:

+VMKit version 0.26 builds with LLVM 2.6 and you can find it on its +web page. The release includes +bug fixes, cleanup and new features. The major changes are:

Support for generics in the .Net virtual machine. This was implemented -by Tilmann Scheller during his Google Summer of Code project.

Initial support for the Mono class libraries.
Support for MacOSX/x86, following LLVM's support for exceptions in -JIT on MacOSX/x86. -
A new vmkit driver: a program to run java or .net applications. The -driver supports llvm command line arguments including the new "-fast" option. -
A new memory allocation scheme in the JVM that makes unloading a -class loader very fast.
VMKit now follows the LLVM Makefile machinery.
A new llcj tool to generate shared libraries or executables of Java + files.
Cooperative garbage collection.
Fast subtype checking (paper from Click et al [JGI'02]).
Implementation of a two-word header for Java objects instead of the original + three-word header.
Better Java specification-compliance: division by zero checks, stack + overflow checks, finalization and references support.

+ +

+compiler-rt: Compiler Runtime Library +

+ +

+The new LLVM compiler-rt project +is a simple library that provides an implementation of the low-level +target-specific hooks required by code generation and other runtime components. +For example, when compiling for a 32-bit target, converting a double to a 64-bit +unsigned integer is compiled into a runtime call to the "__fixunsdfdi" +function. The compiler-rt library provides highly optimized implementations of +this and other low-level routines (some are 3x faster than the equivalent +libgcc routines).
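To make the kind of routine described above concrete, here is a minimal sketch in portable C of how a double could be converted to a 64-bit unsigned integer using only 32-bit conversions. The function name `double_to_u64_sketch` is hypothetical, and this is not the compiler-rt implementation of "__fixunsdfdi". Returning 0 for negative or NaN inputs and clamping values that are too large are assumptions made for this sketch only.

```c
#include <stdint.h>

/* Hypothetical illustration only: NOT the compiler-rt source for __fixunsdfdi.
 * Converts a double to a 64-bit unsigned integer by producing the high and
 * low 32-bit halves separately, the way a 32-bit target might have to. */
uint64_t double_to_u64_sketch(double value)
{
    const double two_to_32 = 4294967296.0;           /* 2^32 */
    const double two_to_64 = 18446744073709551616.0; /* 2^64 */

    /* Assumption for this sketch: negative values, zero and NaN map to 0. */
    if (!(value > 0.0))
        return 0;

    /* Assumption for this sketch: out-of-range values are clamped. */
    if (value >= two_to_64)
        return UINT64_MAX;

    /* Dividing by a power of two is exact, so the quotient is below 2^32. */
    uint32_t high = (uint32_t)(value / two_to_32);

    /* Remove the part covered by the high word; the rest is below 2^32. */
    double rest = value - (double)high * two_to_32;
    uint32_t low = (uint32_t)rest;

    return ((uint64_t)high << 32) | (uint64_t)low;
}
```

Calling `double_to_u64_sketch(4294967296.0)` yields 4294967296, and fractional parts are truncated toward zero as in an ordinary C cast.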

+ +

+All of the code in the compiler-rt project is available under the standard LLVM +License, a "BSD-style" license.

+ +

+ + +

+KLEE: Symbolic Execution and Automatic Test Case Generator +

+ +

+The new LLVM KLEE project is a symbolic +execution framework for programs in LLVM bitcode form. KLEE tries to +symbolically evaluate "all" paths through the application and records state +transitions that lead to fault states. This allows it to construct testcases +that lead to faults and can even be used to verify algorithms. For more +details, please see the OSDI 2008 paper about +KLEE.

+ +

+ + +

+DragonEgg: GCC-4.5 as an LLVM frontend +

+ +

+The goal of DragonEgg is to make +gcc-4.5 act like llvm-gcc without requiring any gcc modifications whatsoever. +DragonEgg is a shared library (dragonegg.so) +that is loaded by gcc at runtime. It uses the new gcc plugin architecture to +disable the GCC optimizers and code generators, and schedule the LLVM optimizers +and code generators (or direct output of LLVM IR) instead. Currently only Linux +and Darwin are supported, and only on x86-32 and x86-64. It should be easy to +add additional unix-like architectures and other processor families. In theory +it should be possible to use DragonEgg +with any language supported by gcc; however, only C and Fortran work well for the +moment. Ada and C++ work to some extent, while Java, Obj-C and Obj-C++ are so +far entirely untested. Since gcc-4.5 has not yet been released, neither has +DragonEgg. To build +DragonEgg you will need to check out the +development versions of gcc, +llvm and +DragonEgg from their respective +subversion repositories, and follow the instructions in the +DragonEgg README. +

+ +

+ + + +

+llvm-mc: Machine Code Toolkit +

+ +

+The LLVM Machine Code (MC) Toolkit project is a (very early) effort to build +better tools for dealing with machine code, object file formats, etc. The idea +is to be able to generate most of the target specific details of assemblers and +disassemblers from existing LLVM target .td files (with suitable enhancements), +and to build infrastructure for reading and writing common object file formats. +One of the first deliverables is to build a full assembler and integrate it into +the compiler, which is predicted to substantially reduce compile time in some +scenarios. +

+ +

In the LLVM 2.6 timeframe, the MC framework has grown to the point where it +can reliably parse and pretty print (with some encoding information) a +darwin/x86 .s file successfully, and has the very early phases of a Mach-O +assembler in progress. Beyond the MC framework itself, major refactoring of the +LLVM code generator has started. The idea is to make the code generator reason +about the code it is producing in a much more semantic way, rather than a +textual way. For example, the code generator now uses MCSection objects to +represent section assignments, instead of text strings that print to .section +directives.

+ +

MC is an early and ongoing project that will hopefully continue to lead to +many improvements in the code generator and build infrastructure useful for many +other situations. +

+ +

- What's New in LLVM? + External Open Source Projects Using LLVM 2.6

This release includes a huge number of bug fixes, performance tweaks and -minor improvements. Some of the major improvements and new features are listed -in this section. +

An exciting aspect of LLVM is that it is used as an enabling technology for + a lot of other language and tools projects. This section lists some of the + projects that have already been updated to work with LLVM 2.6.

+ + + +

+Rubinius +

+ +

Rubinius is an environment +for running Ruby code which strives to write as much of the core class +implementation in Ruby as possible. Combined with a bytecode interpreting VM, it +uses LLVM to optimize and compile Ruby code down to machine code. Techniques +such as type feedback, method inlining, and uncommon traps are all used to +remove dynamism from Ruby execution and increase performance.

+ +

Since LLVM 2.5, Rubinius has made several major leaps forward, implementing +a counter based JIT, type feedback and speculative method inlining.

-Major New Features +MacRuby

LLVM 2.4 includes several major new capabilities:

+MacRuby is an implementation of Ruby on top of +core Mac OS X technologies, such as the Objective-C common runtime and garbage +collector and the CoreFoundation framework. It is principally developed by +Apple and aims at enabling the creation of full-fledged Mac OS X applications. +

The most visible end-user change in LLVM 2.4 is that it includes many -optimizations and changes to make -O0 compile times much faster. You should see -improvements on the order of 30% (or more) faster than LLVM 2.3. There are many -pieces to this change, described in more detail below. The speedups and new -components can also be used for JIT compilers that want fast compilation as -well.
The biggest change to the LLVM IR is that Multiple Return Values (which -were introduced in LLVM 2.3) have been generalized to full support for "First -Class Aggregate" values in LLVM 2.4. This means that LLVM IR supports using -structs and arrays as values in a function. This capability is mostly useful -for front-end authors, who prefer to treat things like complex numbers, simple -tuples, dope vectors, etc as Value*'s instead of as a tuple of Value*'s or as -memory values. Bitcode files from LLVM 2.3 will automatically migrate to the -general representation.
LLVM 2.4 also includes an initial port for the PIC16 microprocessor. This -is the LLVM target that only has support for 8 bit registers, and a number of -other crazy constraints. While the port is still in early development stages, -it shows some interesting things you can do with LLVM.

+MacRuby uses LLVM for optimization passes, JIT and AOT compilation of Ruby +expressions. It also uses zero-cost DWARF exceptions to implement Ruby exception +handling.

+ + +

+Pure +

+ +

+Pure +is an algebraic/functional programming language based on term rewriting. +Programs are collections of equations which are used to evaluate expressions in +a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation, +lexical closures, a hygienic macro system (also based on term rewriting), +built-in list and matrix support (including list and matrix comprehensions) and +an easy-to-use C interface. The interpreter uses LLVM as a backend to + JIT-compile Pure programs to fast native code.

+ +

Pure versions 0.31 and later have been tested and are known to work with +LLVM 2.6 (and continue to work with older LLVM releases >= 2.3 as well). +

-llvm-gcc 4.2 Improvements +LLVM D Compiler

+LDC is an implementation of +the D Programming Language using the LLVM optimizer and code generator. +The LDC project works great with the LLVM 2.6 release. General improvements in +this +cycle have included new inline asm constraint handling, better debug info +support, general bug fixes and better x86-64 support. This has allowed +some major improvements in LDC, getting it much closer to being as +fully featured as the original DMD compiler from DigitalMars. +

LLVM fully supports the llvm-gcc 4.2 front-end, which marries the GCC -front-ends and driver with the LLVM optimizer and code generator. It currently -includes support for the C, C++, Objective-C, Ada, and Fortran front-ends.

+ +

+Roadsend PHP +

LLVM 2.4 supports the full set of atomic __sync_* builtins. LLVM -2.3 only supported those used by OpenMP, but 2.4 supports them all. While -llvm-gcc supports all of these builtins, note that not all targets do. X86 -support them all in both 32-bit and 64-bit mode and PowerPC supports them all -except for the 64-bit operations when in 32-bit mode.
llvm-gcc now supports an -flimited-precision option, which tells -the compiler that it is ok to use low-precision approximations of certain libm -functions (like tan, log, etc). This allows you to get high performance if you -only need (say) 14-bits of precision.
llvm-gcc now supports a C language extension known as "Blocks -". This feature is similar to nested functions and closures, but does not -require stack trampolines (with most ABIs) and supports returning closures -from functions that define them. Note that actually using Blocks -requires a small runtime that is not included with llvm-gcc.
llvm-gcc now supports a new -flto option. On systems that support -transparent Link Time Optimization (currently Darwin systems with Xcode 3.1 and -later) this allows the use of LTO with other optimization levels like -Os. -Previously, LTO could only be used with -O4, which implied optimizations in --O3 that can increase code size.

+Roadsend PHP (rphp) is an open +source implementation of the PHP programming +language that uses LLVM for its optimizer, JIT and static compiler. This is a +reimplementation of an earlier project that is now based on LLVM.

+ +

+Unladen Swallow

+Unladen Swallow is a +branch of Python intended to be fully +compatible and significantly faster. It uses LLVM's optimization passes and JIT +compiler.

-LLVM Core Improvements +llvm-lua +

+ +

+LLVM-Lua uses LLVM to add JIT +and static compiling support to the Lua VM. Lua bytecode is analyzed to +remove type checks, then LLVM is used to compile the bytecode down to machine +code.

+ + +

+IcedTea Java Virtual Machine Implementation +

+ +

+IcedTea provides a +harness to build OpenJDK using only free software build tools and to provide +replacements for the not-yet free parts of OpenJDK. One of the extensions that +IcedTea provides is a new JIT compiler named Shark which uses LLVM +to provide native code generation without introducing processor-dependent +code. +

+ + + + +

+ What's New in LLVM 2.6?

New features include: + +

This release includes a huge number of bug fixes, performance tweaks and +minor improvements. Some of the major improvements and new features are listed +in this section.

+ + +

+Major New Features +

+ +

LLVM 2.6 includes several major new capabilities:

A major change to the Use class landed, which shrank it by 25%. Since -this is a pervasive part of the LLVM, it ended up reducing the memory use of -LLVM IR in general by 15% for most programs.
New compiler-rt, KLEE + and machine code toolkit sub-projects.
Debug information now includes line numbers when optimizations are enabled. + This allows statistical sampling tools like OProfile and Shark to map + samples back to source lines.
LLVM now includes new experimental backends to support the MSP430, SystemZ + and BlackFin architectures.
LLVM supports a new Gold Linker Plugin which + enables support for transparent + link-time optimization on ELF targets when used with the Gold binutils + linker.
LLVM now supports doing optimization and code generation on multiple + threads. Please see the LLVM + Programmer's Manual for more information.
LLVM now has experimental support for embedded + metadata in LLVM IR, though the implementation is not guaranteed to be + final and the .bc file format may change in future releases. Debug info + does not yet use this format in LLVM 2.6.

Values with no names are now pretty printed by llvm-dis more -nicely. They now print as "%3 = add i32 %A, 4" instead of -"add i32 %A, 4 ; <i32>:3", which makes it much easier to read. -

LLVM 2.4 includes some changes for better vector support. First, the shift -operations (shl, ashr, lshr) now all support vectors -and do an element-by-element shift (shifts of the whole vector can be -accomplished by bitcasting the vector to <1 x i128> for example). Second, -there is initial support in development for vector comparisons with the -fcmp/icmp -instructions. These instructions compare two vectors and return a vector of -i1's for each result. Note that there is very little codegen support available -for any of these IR features though.

- -

A new DebugInfoBuilder class is available, which makes it much -easier for front-ends to create debug info descriptors, similar to the way that -IRBuilder makes it easier to create LLVM IR.

- -

The IRBuilder class is now parameterized by a class responsible -for constant folding. The default ConstantFolder class does target independent -constant folding. The NoFolder class does no constant folding at all, which is -useful when learning how LLVM works. The TargetFolder class folds the most, -doing target dependent constant folding.

- -

LLVM now supports "function attributes", which allows us to separate return -value attributes from function attributes. LLVM now supports attributes on a -function itself, a return value, and its parameters. New supported function -attributes include noinline/alwaysinline and the "opt-size" flag which says the -function should be optimized for code size.

- -

LLVM IR now directly represents "common" linkage, instead of - representing it as a form of weak linkage.

- + +

+LLVM IR and Core Improvements +

+ +

LLVM IR has several new features for better support of new targets and that +expose new optimization opportunities:

+ +

The add, sub and mul + instructions have been split into integer and floating point versions (like + divide and remainder), introducing new fadd, fsub, + and fmul instructions.
The add, sub and mul + instructions now support optional "nsw" and "nuw" bits which indicate that + the operation is guaranteed to not overflow (in the signed or + unsigned case, respectively). This gives the optimizer more information and + can be used for things like C signed integer values, which are undefined on + overflow.
The sdiv instruction now supports an + optional "exact" flag which indicates that the result of the division is + guaranteed to have a remainder of zero. This is useful for optimizing pointer + subtraction in C.
The getelementptr instruction now + supports arbitrary integer index values for array/pointer indices. This + allows for better code generation on 16-bit pointer targets like PIC16.
The getelementptr instruction now + supports an "inbounds" optimization hint that tells the optimizer that the + pointer is guaranteed to be within its allocated object.
LLVM now supports a series of new linkage types for global values which allow + for better optimization and new capabilities: +
- linkonce_odr and + weak_odr have the same linkage + semantics as the non-"odr" linkage types. The difference is that these + linkage types indicate that all definitions of the specified function + are guaranteed to have the same semantics. This allows inlining + template functions in C++ but not inlining weak functions in C, + which previously both got the same linkage type.
- available_externally + is a new linkage type that gives the optimizer visibility into the + definition of a function (allowing inlining and side effect analysis) + but that does not cause code to be generated. This allows better + optimization of "GNU inline" functions, extern templates, etc.
- linker_private is a + new linkage type (which is only useful on Mac OS X) that is used for + some metadata generation and other obscure things.
Finally, target-specific intrinsics can now return multiple values, which + is useful for modeling target operations with multiple results.
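The "nsw" and "exact" items above refer to C constructs like the ones in this small sketch. The function names are hypothetical, and whether a particular front end actually attaches these flags to the resulting instructions is an assumption. The code only shows the source-level situations the notes mention: signed arithmetic that is undefined on overflow, and pointer subtraction whose division leaves no remainder.

```c
#include <stddef.h>

/* The loop counter `i` is a signed int. Overflowing it is undefined in C,
 * so a front end is free to mark the increment as "no signed wrap" (nsw). */
long sum_first(const int *values, int count)
{
    long total = 0;
    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Pointer subtraction is the byte distance divided by sizeof(int). Both
 * pointers refer to elements of the same array, so the byte distance is a
 * multiple of sizeof(int) and the division has a remainder of zero, which
 * is the property an "exact" sdiv expresses. */
ptrdiff_t elements_between(const int *first, const int *last)
{
    return last - first;
}
```

For an array `int data[4]`, `elements_between(data, data + 4)` is 4 regardless of the size of `int` on the target.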

@@ -318,149 +569,328 @@ function should be optimized for code size.

In addition to a huge array of bug fixes and minor performance tweaks, this +

In addition to a large array of minor performance tweaks and bug fixes, this release includes a few major enhancements and additions to the optimizers:

The Global Value Numbering (GVN) pass now does local Partial Redundancy -Elimination (PRE) to eliminate some partially redundant expressions in cases -where doing so won't grow code size.
The Scalar Replacement of Aggregates + pass has many improvements that allow it to better promote vector unions, + variables which are memset, and other unusual code that happens to + do bitfield accesses into register operations. An interesting change is that + it now produces "unusual" integer sizes (like i1704) in some cases and lets + other optimizers clean things up.
The Loop Strength Reduction pass now + promotes small integer induction variables to 64-bit on 64-bit targets, + which provides a major performance boost for much numerical code. It also + promotes shorts to int on 32-bit hosts, etc. LSR now also analyzes pointer + expressions (e.g. getelementptrs), as well as integers.
The GVN pass now eliminates partial + redundancies of loads in simple cases.
The Inliner now reuses stack space when + inlining similar arrays from multiple callees into one caller.
LLVM includes a new experimental Static Single Information (SSI) + construction pass.
LLVM 2.4 includes a new loop deletion pass (which removes output-free -provably-finite loops) and a rewritten Aggressive Dead Code Elimination (ADCE) -pass that no longer uses control dependence information. These changes speed up -the optimizer and also prevent it from deleting output-free infinite -loops.
The new AddReadAttrs pass works out which functions are read-only or -read-none (these correspond to 'pure' and 'const' in GCC) and marks them -with the appropriate attribute.
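As a source-level picture of what "partial redundancies of loads" means in the GVN item above, consider the following sketch. Both function names are hypothetical. The second function is a hand-written illustration of the effect of the transformation, not output from LLVM, and whether GVN handles this exact case is an assumption.

```c
/* The load of *p in the return statement is redundant only on the path
 * where `cond` is true, because that path has already loaded *p. */
int add_maybe_twice(const int *p, int cond)
{
    int first = 0;
    if (cond)
        first = *p;
    return first + *p;
}

/* Hand-written equivalent after eliminating the partial redundancy: a load
 * is placed on the path that lacked one, so every path performs exactly one
 * load and the load at the join point disappears. */
int add_maybe_twice_pre(const int *p, int cond)
{
    int loaded;
    int first;
    if (cond) {
        loaded = *p;
        first = loaded;
    } else {
        loaded = *p;
        first = 0;
    }
    return first + loaded;
}
```

Both functions return the same value for every input; the second performs one load on the `cond` path where the first performs two.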

LLVM 2.4 now includes a new SparsePropagation framework, which makes it -trivial to build lattice-based dataflow solvers that operate over LLVM IR. Using -this interface means that you just define objects to represent your lattice -values and the transfer functions that operate on them. It handles the -mechanics of worklist processing, liveness tracking, handling PHI nodes, -etc.

The Loop Strength Reduction and induction variable optimization passes have -several improvements to avoid inserting MAX expressions, to optimize simple -floating point induction variables and to analyze trip counts of more -loops.

Various helper functions (ComputeMaskedBits, ComputeNumSignBits, etc) were -pulled out of the Instruction Combining pass and put into a new -ValueTracking.h header, where they can be reused by other passes.

+ +

+Interpreter and JIT Improvements +

The tail duplication pass has been removed from the standard optimizer -sequence used by llvm-gcc. This pass still exists, but the benefits it once -provided are now achieved by other passes.

LLVM has a new "EngineBuilder" class which makes it more obvious how to + set up and configure an ExecutionEngine (a JIT or interpreter).
The JIT now supports generating more than 16M of code.
When configured with --with-oprofile, the JIT can now inform + OProfile about JIT'd code, allowing OProfile to get line number and function + name information for JIT'd functions.
When "libffi" is available, the LLVM interpreter now uses it, which supports + calling almost arbitrary external (natively compiled) functions.
Clients of the JIT can now register a 'JITEventListener' object to receive + callbacks when the JIT emits or frees machine code. The OProfile support + uses this mechanism.

-Code Generator Improvements +Target Independent Code Generator Improvements

We have put a significant amount of work into the code generator infrastructure, -which allows us to implement more aggressive algorithms and make it run -faster:

We have put a significant amount of work into the code generator +infrastructure, which allows us to implement more aggressive algorithms and make +it run faster:

The target-independent code generator supports (and the X86 backend - currently implements) a new interface for "fast" instruction selection. This - interface is optimized to produce code as quickly as possible, sacrificing - code quality to do it. This is used by default at -O0 or when using - "llc -fast" on X86. It is straight-forward to add support for - other targets if faster -O0 compilation is desired.
In addition to the new 'fast' instruction selection path, many existing - pieces of the code generator have been optimized in significant ways. - SelectionDAG's are now pool allocated and use better algorithms in many - places, the ".s" file printers now use raw_ostream to emit text much faster, - etc. The end result of these improvements is that the compiler also takes - substantially less time to generate code that is just as good (and often - better) than before.
Each target has been split to separate the ".s" file printing logic from the - rest of the target. This enables JIT compilers that don't link in the - (somewhat large) code and data tables used for printing a ".s" file.
The code generator now includes a "stack slot coloring" pass, which packs - together individual spilled values into common stack slots. This reduces - the size of stack frames with many spills, which tends to increase L1 cache - effectiveness.
Various pieces of the register allocator (e.g. the coalescer and two-address - operation elimination pass) now know how to rematerialize trivial operations - to avoid copies and include several other optimizations.
The graphs produced by - the llc -view-*-dags options are now significantly prettier and - easier to read.
LLVM 2.4 includes a new register allocator based on Partitioned Boolean - Quadratic Programming (PBQP). This register allocator is still in - development, but is very simple and clean.
The llc -asm-verbose option (exposed from llvm-gcc as -dA + and clang as -fverbose-asm or -dA) now adds a lot of + useful information in comments to + the generated .s file. This information includes location information (if + built with -g) and loop nest information.
The code generator now supports a new MachineVerifier pass which is useful + for finding bugs in targets and codegen passes.
The Machine LICM pass is now enabled by default. It hoists instructions out of + loops (such as constant pool loads, loads from read-only stubs, vector + constant synthesis code, etc.) and is currently configured to only do + so when the hoisted operation can be rematerialized.
The Machine Sinking pass is now enabled by default. This pass moves + side-effect free operations down the CFG so that they are executed on fewer + paths through a function.
The code generator now performs "stack slot coloring" of register spills, + which allows spill slots to be reused. This leads to smaller stack frames + in cases where there are lots of register spills.
The register allocator has many improvements to take better advantage of + commutable operations, various spiller peephole optimizations, and can now + coalesce cross-register-class copies.
Tblgen now supports multiclass inheritance and a number of new string and + list operations like !(subst), !(foreach), !car, + !cdr, !null, !if, !cast. + These make the .td files more expressive and allow more aggressive factoring + of duplication across instruction patterns.
Target-specific intrinsics can now be added without having to hack VMCore to + add them. This makes it easier to maintain out-of-tree targets.
The instruction selector is better at propagating information about values + (such as whether they are sign/zero extended etc.) across basic block + boundaries.
The SelectionDAG data structure has new nodes for representing buildvector + and vector shuffle operations. This + makes operations and pattern matching more efficient and easier to get + right.
The Prolog/Epilog Insertion Pass now has experimental support for performing + the "shrink wrapping" optimization, which moves spills and reloads around in + the CFG to avoid doing saves on paths that don't need them.
LLVM includes new experimental support for writing ELF .o files directly + from the compiler. It works well for many simple C testcases, but doesn't + support exception handling, debug info, inline assembly, etc.
Targets can now specify register allocation hints through + MachineRegisterInfo::setRegAllocationHint. A regalloc hint consists + of hint type and physical register number. A hint type of zero specifies a + register allocation preference. Other hint type values are target specific + and are resolved by TargetRegisterInfo::ResolveRegAllocHint. An + example is the ARM target which uses register hints to request that the + register allocator provide an even / odd register pair to two virtual + registers.
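The Machine Sinking item in the list above can be pictured at the C level with the sketch below. The pass itself works on machine instructions rather than C source, so this is only an analogy, and both function names are hypothetical.

```c
/* Before: the multiplication is executed on every path through the
 * function, even though only one path uses its result. */
unsigned scaled_if_needed(unsigned a, unsigned b, int needed)
{
    unsigned product = a * b;
    if (needed)
        return product;
    return 0u;
}

/* After sinking (hand-written illustration): the side-effect free
 * multiplication has moved down into the only branch that uses it. */
unsigned scaled_if_needed_sunk(unsigned a, unsigned b, int needed)
{
    if (needed) {
        unsigned product = a * b;
        return product;
    }
    return 0u;
}
```

The two versions compute the same result; the second simply skips the multiplication when `needed` is zero.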

+ +

+X86-32 and X86-64 Target Improvements

New features of the X86 target include: +

+ +

SSE 4.2 builtins are now supported.
GCC-compatible soft float modes are now supported, which are typically used + by OS kernels.
X86-64 now models implicit zero extensions better, which allows the code + generator to remove a lot of redundant zexts. It also models the 8-bit "H" + registers as subregs, which allows them to be used in some tricky + situations.
X86-64 now supports the "local exec" and "initial exec" thread local storage + models.
The vector forms of the icmp and fcmp instructions now select to efficient + SSE operations.
Support for the win64 calling conventions has improved. The primary + missing feature is support for varargs function definitions. It seems to + work well for many win64 JIT purposes.
The X86 backend has preliminary support for mapping address spaces to segment + register references. This allows you to write GS or FS relative memory + accesses directly in LLVM IR for cases where you know exactly what you're + doing (such as in an OS kernel). There are some known problems with this + support, but it works in simple cases.
The X86 code generator has been refactored to move all global variable + reference logic to one place + (X86Subtarget::ClassifyGlobalReference) which + makes it easier to reason about.

+ +

+ + +

+PIC16 Target Improvements +

+ +

New features of the PIC16 target include: +

+ +

Support for floating-point, indirect function calls, and + passing/returning aggregate types to functions. +
The code generator is able to generate debug info into output COFF files. +
Support for placing an object into a specific section or at a specific + address in memory.

+ +

Things not yet supported:

+ +

Variable arguments.
Interrupts/programs.

+ +

-Target Specific Improvements +ARM Target Improvements

New target-specific features include: +

New features of the ARM target include:

Exception handling is supported by default on Linux/x86-64.
Position Independent Code (PIC) is now supported on Linux/x86-64.
@llvm.frameaddress now supports getting the frame address of stack frames - > 0 on x86/x86-64.
MIPS floating point support? [BRUNO]
The PowerPC backend now supports trampolines.
Preliminary support for processors, such as the Cortex-A8 and Cortex-A9, +that implement version v7-A of the ARM architecture. The ARM backend now +supports both the Thumb2 and Advanced SIMD (Neon) instruction sets.
The AAPCS-VFP "hard float" calling conventions are also supported with the +-float-abi=hard flag.
The ARM calling convention code is now tblgen generated instead of resorting + to C++ code.

These features are still somewhat experimental +and subject to change. The Neon intrinsics, in particular, may change in future +releases of LLVM. ARMv7 support has progressed a lot on top of tree since 2.6 +branched.

+ + +

+Other Target Specific Improvements

New features of other targets include: +

+ +

Mips now supports O32 Calling Convention.
Many improvements to the 32-bit PowerPC SVR4 ABI (used on powerpc-linux) + support, lots of bugs fixed.
Added support for the 64-bit PowerPC SVR4 ABI (used on powerpc64-linux). + Needs more testing.

+ +

-Other Improvements +New Useful APIs

New features include: + +

This release includes a number of new APIs that are used internally, which + may also be useful for external clients.

llvmc2 (the generic compiler driver) gained plugin - support. It is now easier to experiment with llvmc2 and - build your own tools based on it.
LLVM 2.4 includes a number of new generic algorithms and data structures, - include a scoped hash table, 'immutable' data structures, a simple - free-list manager, and a raw_ostream class. - The raw_ostream class and - format allow for efficient file output, and various pieces of LLVM - have switched over to use it. The eventual goal is to eliminate - std::ostream in favor of it.
New + PrettyStackTrace class allows crashes of llvm tools (and applications + that integrate them) to provide more detailed indication of what the + compiler was doing at the time of the crash (e.g. running a pass). + At the top level for each LLVM tool, it includes the command line arguments. +
New StringRef + and Twine classes + make operations on character ranges and + string concatenation more efficient. StringRef is just a const + char* with a length; Twine is a light-weight rope.
LLVM has new WeakVH, AssertingVH and CallbackVH + classes, which make it easier to write LLVM IR transformations. WeakVH + automatically drops to null when the referenced Value is deleted, + and is updated across a replaceAllUsesWith operation. + AssertingVH aborts the program if the + referenced value is destroyed while it is being referenced. CallbackVH + is a customizable class for handling value references. See ValueHandle.h + for more information.
The new 'Triple + ' class centralizes a lot of logic that reasons about target + triples.
The new ' + llvm_report_error()' set of APIs allows tools to embed the LLVM + optimizer and backend and recover from previously unrecoverable errors.
LLVM has new abstractions for atomic operations + and reader/writer + locks.
LLVM has new + SourceMgr and SMLoc classes which implement caret + diagnostics and basic include stack processing for simple parsers. It is + used by tablegen, llvm-mc, the .ll parser and FileCheck.
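The weak value handle behaviour described in the list above (becoming null when the value is deleted, and following a replaceAllUsesWith) can be sketched with a toy model. `ToyValue` and `ToyWeakHandle` are hypothetical names made up for this example; the code imitates the described behaviour only and is not taken from ValueHandle.h.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model only: ToyValue and ToyWeakHandle are hypothetical names that
// imitate the behaviour described for WeakVH; this is not LLVM's code.
class ToyWeakHandle;

class ToyValue {
  std::vector<ToyWeakHandle *> Handles; // handles currently watching us
  friend class ToyWeakHandle;

public:
  ToyValue() = default;
  ToyValue(const ToyValue &) = delete;
  ToyValue &operator=(const ToyValue &) = delete;
  ~ToyValue();
  void replaceAllUsesWith(ToyValue &New);
};

class ToyWeakHandle {
  ToyValue *V = nullptr;
  friend class ToyValue;

  void attach(ToyValue *NewV) {
    V = NewV;
    if (V) V->Handles.push_back(this);
  }
  void detach() {
    if (!V) return;
    auto &H = V->Handles;
    H.erase(std::remove(H.begin(), H.end(), this), H.end());
    V = nullptr;
  }

public:
  explicit ToyWeakHandle(ToyValue *Val) { attach(Val); }
  ToyWeakHandle(const ToyWeakHandle &) = delete;
  ToyWeakHandle &operator=(const ToyWeakHandle &) = delete;
  ~ToyWeakHandle() { detach(); }
  ToyValue *get() const { return V; }
};

// Deleting a value nulls out every handle that was watching it.
ToyValue::~ToyValue() {
  for (ToyWeakHandle *H : Handles) H->V = nullptr;
}

// Replacing a value moves every watching handle over to the new value.
void ToyValue::replaceAllUsesWith(ToyValue &New) {
  assert(&New != this && "cannot replace a value with itself");
  for (ToyWeakHandle *H : Handles) {
    H->V = &New;
    New.Handles.push_back(H);
  }
  Handles.clear();
}
```

A transformation that keeps a `ToyWeakHandle` instead of a raw pointer can check `get()` after running other code and learn whether the value was deleted or replaced in the meantime. An asserting variant would abort in the destructor instead of setting the pointer to null.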

+ + +

+Other Improvements and New Features +

+ +

Other miscellaneous features include:

+ +

LLVM now includes a new internal 'FileCheck' tool which allows + writing much more accurate regression tests that run faster. Please see the + FileCheck section of the Testing + Guide for more information.
LLVM profile information support has been significantly improved to produce +correct use counts, and has support for edge profiling with reduced runtime +overhead. Combined, the generated profile information is both more correct and +imposes about half as much overhead (e.g. from 12% to 6% overhead on SPEC +CPU2000).
The C bindings (in the llvm/include/llvm-c directory) include many newly + supported APIs.
LLVM 2.6 includes brand new experimental LLVM bindings to the Ada2005 + programming language.
The LLVMC driver has several new features: +
- Dynamic plugins now work on Windows.
- New option property: init. Makes it possible to provide default values for + options defined in plugins (interface to cl::init).
- New example: Skeleton, shows how to create a standalone LLVMC-based + driver.
- New example: mcc16, a driver for the PIC16 toolchain.
+

Major Changes and Removed Features @@ -469,19 +899,24 @@ faster:

If you're already an LLVM user or developer with out-of-tree changes based -on LLVM 2.3, this section lists some "gotchas" that you may run into upgrading +on LLVM 2.5, this section lists some "gotchas" that you may run into upgrading from the previous release.

The LLVM IR generated by llvm-gcc no longer names all instructions. This - makes it run faster, but may be more confusing to some people. If you - prefer to have names, the 'opt -instnamer' pass will add names to - all instructions.
The LoadVN and GCSE passes have been removed from the tree. They are - obsolete and have been replaced with the GVN and MemoryDependence passes. -
The Itanium (IA64) backend has been removed. It was not actively supported + and had bitrotted.
The BigBlock register allocator has been removed; it had also bitrotted.
The C Backend (-march=c) is no longer considered part of the LLVM release +criteria. We still want it to work, but no one is maintaining it and it lacks +support for arbitrary precision integers and other important IR features.
All LLVM tools now default to overwriting their output file, behaving more + like standard unix tools. Previously, this only happened with the '-f' + option.
The LLVM build now builds all libraries as .a files instead of some + libraries as relinked .o files. This requires the use of some APIs, such as those declared in + InitializeAllTargets.h. +
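The last item can be read as follows, and this reading is an assumption rather than something the notes state: a static archive only contributes the object files that something references, so registration that used to happen as a side effect of linking now has to be requested by an explicit call. The toy registry below sketches that pattern. `ToyTarget`, `toyRegistry`, `registerToyTarget`, `initializeAllToyTargets`, and `lookupToyTarget` are all hypothetical names and do not correspond to LLVM's API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: every name in this block is hypothetical, not LLVM's API.
struct ToyTarget {
  std::string Name;
  unsigned PointerBits;
};

// A function-local static avoids static-initialization-order problems.
inline std::map<std::string, ToyTarget> &toyRegistry() {
  static std::map<std::string, ToyTarget> Registry;
  return Registry;
}

inline void registerToyTarget(const ToyTarget &T) {
  assert(!T.Name.empty() && "a target needs a name");
  toyRegistry()[T.Name] = T;
}

// The client calls this once, explicitly, before looking anything up.
inline void initializeAllToyTargets() {
  registerToyTarget({"toy32", 32});
  registerToyTarget({"toy64", 64});
}

// Returns nullptr when the name is unknown or nothing has been registered.
inline const ToyTarget *lookupToyTarget(const std::string &Name) {
  auto It = toyRegistry().find(Name);
  return It == toyRegistry().end() ? nullptr : &It->second;
}
```

Before `initializeAllToyTargets()` runs, every lookup in this sketch returns `nullptr`. A client that forgets the initialization call would see the same symptom: lookups that find nothing.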

@@ -489,31 +924,82 @@ from the previous release.

API changes are:

All uses of hash_set and hash_map have been removed from + the LLVM tree and the wrapper headers have been removed.
The llvm/Streams.h and DOUT member of Debug.h have been removed. The + llvm::Ostream class has been completely removed and replaced with + uses of raw_ostream.
LLVM's global uniquing tables for Types and Constants have + been privatized into members of an LLVMContext. A number of APIs + now take an LLVMContext as a parameter. To smooth the transition + for clients that will only ever use a single context, the new + getGlobalContext() API can be used to access a default global + context which can be passed in any and all cases where a context is + required. +
The getABITypeSize methods are now called getAllocSize.
The Add, Sub and Mul operators are no longer + overloaded for floating-point types. Floating-point addition, subtraction + and multiplication are now represented with new operators FAdd, + FSub and FMul. In the IRBuilder API, + CreateAdd, CreateSub, CreateMul and + CreateNeg should only be used for integer arithmetic now; + CreateFAdd, CreateFSub, CreateFMul and + CreateFNeg should now be used for floating-point arithmetic.
The DynamicLibrary class can no longer be constructed; its functionality has + moved to static member functions.
raw_fd_ostream's constructor for opening a given filename now + takes an extra Force argument. If Force is set to + false, an error will be reported if a file with the given name + already exists. If Force is set to true, the file will + be silently truncated (which is the behavior before this flag was + added).
SCEVHandle no longer exists, because reference counting is no + longer done for SCEV* objects; const SCEV* + should be used instead.
Many APIs, notably llvm::Value, now use the StringRef +and Twine classes instead of passing const char* +or std::string, as described in +the Programmer's Manual. Most +clients should be unaffected by this transition, unless they are used to +Value::getName() returning a string. Here are some tips on updating to +2.6: +
- getNameStr() is still available, and matches the old + behavior. Replacing getName() calls with this is a safe option, + although more efficient alternatives are now possible.
- If you were just relying on getName() being able to be sent to + a std::ostream, consider migrating + to llvm::raw_ostream.
- If you were using getName().c_str() to get a const + char* pointer to the name, you can use getName().data(). + Note that this string (as before) may not be the entire name if the + name contains embedded null characters.
- If you were using operator + on the result of getName() and + treating the result as a std::string, you can either + use Twine::str to get the result as a std::string, or + move to a Twine based design.
- isName() should be replaced with comparison + against getName() (this is now efficient). +
+
Attributes changes [DEVANG]
The DbgStopPointInst methods getDirectory and -getFileName now return Value* instead of strings. These can be -converted to strings using llvm::GetConstantStringInfo defined via -"llvm/Analysis/ValueTracking.h".
The APIs to create various instructions have changed from lower case - "create" methods to upper case "Create" methods (e.g. - BinaryOperator::create). LLVM 2.4 includes both cases, but the - lower case ones are removed in mainline, please migrate.
Various header files like "llvm/ADT/iterator" were given a ".h" suffix. - Change your code to #include "llvm/ADT/iterator.h" instead.
In the code generator, many MachineOperand predicates were renamed to be - shorter (e.g. isFrameIndex() -> isFI()), - SDOperand was renamed to SDValue (and the "Val" - member was changed to be the getNode() accessor), and the - MVT::ValueType enum has been replaced with an "MVT" - struct. The getSignExtended and getValue methods in the - ConstantSDNode class were renamed to getSExtValue and - getZExtValue respectively, to be more consistent with - the ConstantInt class.
The registration interfaces for backend Targets have changed (what was +previously TargetMachineRegistry). For backend authors, see the Writing An LLVM Backend +guide. For clients, the notable API changes are: +
- TargetMachineRegistry has been renamed + to TargetRegistry.
- Clients should move to using the TargetRegistry::lookupTarget() + function to find targets.
+
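The getName() tips in this list mention that a name viewed through a const char* may stop short when it contains embedded null characters. A toy "pointer plus length" class makes the difference concrete. `ToyStringRef` and `visibleAsCString` are hypothetical names written for this sketch; they imitate the idea only and are not llvm::StringRef.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Toy model only: ToyStringRef is a hypothetical stand-in that imitates the
// "pointer plus length" idea; it is not llvm::StringRef.
class ToyStringRef {
  const char *Data;
  std::size_t Length;

public:
  ToyStringRef(const char *D, std::size_t L) : Data(D), Length(L) {
    assert((D || L == 0) && "a null pointer must come with zero length");
  }
  ToyStringRef(const std::string &S) : Data(S.data()), Length(S.size()) {}

  const char *data() const { return Data; } // not necessarily null-terminated
  std::size_t size() const { return Length; }

  // Copy out the full range, embedded nulls included.
  std::string str() const { return std::string(Data, Length); }

  // Length check first, then a single memcmp over the bytes.
  bool equals(ToyStringRef RHS) const {
    return Length == RHS.Length &&
           (Length == 0 || std::memcmp(Data, RHS.Data, Length) == 0);
  }
};

// What a C-string consumer would see: everything up to the first null byte.
inline std::size_t visibleAsCString(ToyStringRef S) {
  if (S.size() == 0) return 0;
  const void *Nul = std::memchr(S.data(), '\0', S.size());
  return Nul ? static_cast<std::size_t>(static_cast<const char *>(Nul) - S.data())
             : S.size();
}
```

For the three-byte name `a\0b`, `size()` reports 3 while a C-string consumer sees only the first byte. Comparing two such references is cheap because the length check can reject most mismatches before any bytes are compared.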

@@ -531,16 +1017,16 @@ converted to strings using llvm::GetConstantStringInfo defined via

LLVM is known to work on the following platforms:

Intel and AMD machines (IA32) running Red Hat Linux, Fedora Core and FreeBSD - (and probably other unix-like systems).
PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit and - 64-bit modes.
Intel and AMD machines (IA32, X86-64, AMD64, EMT-64) running Red Hat + Linux, Fedora Core, FreeBSD and AuroraUX (and probably other unix-like + systems).
PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit + and 64-bit modes.
Intel and AMD machines running on Win32 using MinGW libraries (native).
Intel and AMD machines running on Win32 with the Cygwin libraries (limited support is available for native builds with Visual C++).
Sun UltraSPARC workstations running Solaris 10.
Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.
Alpha-based machines running Debian GNU/Linux.
Itanium-based (IA64) machines running Linux and HP-UX.

The core LLVM infrastructure uses GNU autoconf to adapt itself @@ -558,12 +1044,26 @@ portability patches and reports of successful builds or error messages.

This section contains all known problems with the LLVM system, listed by -component. As new problems are discovered, they will be added to these -sections. If you run into a problem, please check the This section contains significant known problems with the LLVM system, +listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if there isn't already one.

The llvm-gcc bootstrap will fail with some versions of binutils (e.g. 2.15) + with a message of "Error: can not do 8 + byte pc-relative relocation" when building C++ code. We intend to + fix this on mainline, but a workaround for 2.6 is to upgrade to binutils + 2.17 or later.
LLVM will not correctly compile on Solaris and/or OpenSolaris +using the stock GCC 3.x.x series 'out of the box'. +See: Broken versions of GCC and other tools. +However, A Modern GCC Build +for x86/x86-64 has been made available from the third party AuroraUX Project +that has been meticulously tested for bootstrapping LLVM & Clang.

@@ -581,9 +1081,11 @@ components, please contact us on the LLVMdev list.

The MSIL, IA64, Alpha, SPU, MIPS, and PIC16 backends are experimental.
The llc "-filetype=asm" (the default) is the only supported - value for this option.
The MSIL, Alpha, SPU, MIPS, PIC16, Blackfin, MSP430 and SystemZ backends are + experimental.
The llc "-filetype=asm" (the default) is the only + supported value for this option. The ELF writer is experimental.
The implementation of Andersen's Alias Analysis has many known bugs.

@@ -603,13 +1105,14 @@ href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev list.

The X86 backend generates inefficient floating point code when configured to generate code for systems that don't have SSE2.

Win64 code generation wasn't widely tested. Everything should work, but we - expect small issues to happen. Also, llvm-gcc cannot build mingw64 runtime - currently due + expect small issues to happen. Also, llvm-gcc cannot build the mingw64 + runtime currently due to several - bugs due to lack of support for the - 'u' inline assembly constraint and X87 floating point inline assembly.

+ bugs and due to lack of support for + the + 'u' inline assembly constraint and for X87 floating point inline assembly.

The X86-64 backend does not yet support the LLVM IR instruction - va_arg. Currently, the llvm-gcc front-end supports variadic + va_arg. Currently, the llvm-gcc and front-ends support variadic argument constructs on X86-64 by lowering them manually.

@@ -637,14 +1140,14 @@ compilation, and lacks support for debug information.

Support for the Advanced SIMD (Neon) instruction set is still incomplete +and not well tested. Some features may not work at all, and the code quality +may be poor in some cases.
Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6 processors, thumb programs can crash or produce wrong results (PR1388).
Compilation for ARM Linux OABI (old ABI) is supported, but not fully tested. +
Compilation for ARM Linux OABI (old ABI) is supported but not fully tested.
There is a bug in QEMU-ARM (<= 0.9.0) which causes it to incorrectly - execute -programs compiled with LLVM. Please use more recent versions of QEMU.

@@ -657,7 +1160,7 @@ programs compiled with LLVM. Please use more recent versions of QEMU.

The SPARC backend only supports the 32-bit SPARC ABI (-m32), it does not +
The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not support the 64-bit SPARC ABI (-m64).

@@ -665,32 +1168,30 @@ programs compiled with LLVM. Please use more recent versions of QEMU.

- Known problems with the Alpha back-end + Known problems with the MIPS back-end

On 21164s, some rare FP arithmetic sequences which may trap do not have the -appropriate nops inserted to ensure restartability.
64-bit MIPS targets are not supported yet.

- Known problems with the IA64 back-end + Known problems with the Alpha back-end

The Itanium backend is highly experimental, and has a number of known - issues. We are looking for a maintainer for the Itanium backend. If you - are interested, please contact the llvmdev mailing list.

On 21164s, some rare FP arithmetic sequences which may trap do not have the +appropriate nops inserted to ensure restartability.

+ +

@@ -705,8 +1206,9 @@ appropriate nops inserted to ensure restartability. inline assembly code.

The C backend violates the ABI of common C++ programs, preventing intermixing between C++ compiled by the CBE and - C++ code compiled with llc or native compilers.

+ C++ code compiled with llc or native compilers.

The C backend does not support all exception handling constructs.

The C backend does not support arbitrary precision integers.

@@ -719,10 +1221,6 @@ appropriate nops inserted to ensure restartability.

llvm-gcc does not currently support Link-Time -Optimization on most platforms "out-of-the-box". Please inquire on the -llvmdev mailing list if you are interested.

The only major language feature of GCC not supported by llvm-gcc is the __builtin_apply family of builtins. However, some extensions are only supported on some targets. For example, trampolines are only @@ -747,11 +1245,23 @@ itself, Qt, Mozilla, etc.

Exception handling works well on the X86 and PowerPC targets. Currently - only linux and darwin targets are supported (both 32 and 64 bit).

+ +

+ Known problems with the llvm-gcc Fortran front-end +

+ +

Fortran support generally works, but there are still several unresolved bugs + in Bugzilla. Please see the + tools/gfortran component for details.

@@ -759,22 +1269,26 @@ itself, Qt, Mozilla, etc.

-The llvm-gcc 4.2 Ada compiler works fairly well, however this is not a mature -technology and problems should be expected. +The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature +technology, and problems should be expected.

The Ada front-end currently only builds on X86-32. This is mainly due -to lack of trampoline support (pointers to nested functions) on other platforms, -however it also fails to build on X86-64 +to lack of trampoline support (pointers to nested functions) on other platforms. +However, it also fails to build on X86-64 which does support trampolines.
The Ada front-end fails to bootstrap. -Workaround: configure with --disable-bootstrap.


The c380004, c393010 and cxg2021 ACATS tests fail -(c380004 also fails with gcc-4.2 mainline).
Some gcc specific Ada tests continue to crash the compiler.
The -E binder option (exception backtraces) +(c380004 also fails with gcc-4.2 mainline). +If the compiler is built with checks disabled then c393010 +causes the compiler to go into an infinite loop, using up all system memory.
Some GCC specific Ada tests continue to crash the compiler.
The -E binder option (exception backtraces) does not work and will result in programs -crashing if an exception is raised. Workaround: do not use -E.


Only discrete types are allowed to start or finish at a non-byte offset in a record. Workaround: do not pack records or use representation clauses that result in a field of a non-discrete type @@ -788,6 +1302,20 @@ ignored.

+ +

+ Known problems with the O'Caml bindings +

+ +

The Llvm.Linkage module is broken, and has incorrect values. Only +Llvm.Linkage.External, Llvm.Linkage.Available_externally, and +Llvm.Linkage.Link_once will be correct. If you need any of the other linkage +modes, you'll have to write an external C library in order to expose the +functionality. This has been fixed in the trunk.

Additional Information @@ -815,9 +1343,9 @@ lists.

LLVM Compiler Infrastructure
Last modified: $Date$