X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FReleaseNotes.html;h=4feb907d367fef5bf476e37fc4a678670ccf4135;hb=adf01b3f18442ae8db6b8948e70d82d9df415119;hp=48d5c6fe5cdae9996aaead68503ea48376a381c5;hpb=3bdcda1a8b0df7a26116e313d2ac8eadbabed843;p=oota-llvm.git diff --git a/docs/ReleaseNotes.html b/docs/ReleaseNotes.html index 48d5c6fe5cd..4feb907d367 100644 --- a/docs/ReleaseNotes.html +++ b/docs/ReleaseNotes.html @@ -3,13 +3,12 @@ - - LLVM 2.8 Release Notes + LLVM 3.0 Release Notes -
LLVM 2.8 Release Notes
+

LLVM 3.0 Release Notes

LLVM Dragon Logo @@ -17,35 +16,35 @@
  1. Introduction
  2. Sub-project Status Update
  3. -
  4. External Projects Using LLVM 2.8
  5. -
  6. What's New in LLVM 2.8?
  7. +
  8. External Projects Using LLVM 3.0
  9. +
  10. What's New in LLVM 3.0?
  11. Installation Instructions
  12. Known Problems
  13. Additional Information
-

Written by the LLVM Team

+

Written by the LLVM Team

+ --> -
+

Introduction -

+ -
+

This document contains the release notes for the LLVM Compiler -Infrastructure, release 2.8. Here we describe the status of LLVM, including +Infrastructure, release 3.0. Here we describe the status of LLVM, including major improvements from the previous release and significant known problems. All LLVM releases may be downloaded from the LLVM releases web site.

@@ -62,51 +61,37 @@ current one. To see the release notes for a specific release, please see the releases page.

- - - - - - - - -
+

Sub-project Status Update -

+ -
+

-The LLVM 2.8 distribution currently consists of code from the core LLVM +The LLVM 3.0 distribution currently consists of code from the core LLVM repository (which roughly includes the LLVM optimizers, code generators and supporting tools), the Clang repository and the llvm-gcc repository. In addition to this code, the LLVM Project includes other sub-projects that are in development. Here we include updates on these subprojects.

-
- - - + -
+

Clang is an LLVM front end for the C, C++, and Objective-C languages. Clang aims to provide a better user experience @@ -115,96 +100,51 @@ standards, fast compilation, and low memory use. Like LLVM, Clang provides a modular, library-based architecture that makes it suitable for creating or integrating with other development tools. Clang is considered a production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86 -(32- and 64-bit), and for darwin-arm targets.

- -

In the LLVM 2.8 time-frame, the Clang team has made many improvements:

+(32- and 64-bit), and for darwin/arm targets.

-
    -
  • Surely these guys have done something
  • -
  • X86-64 abi improvements? Did they make it in?
  • -
-
- - - - -
- -

The Clang Static Analyzer - project is an effort to use static source code analysis techniques to - automatically find bugs in C and Objective-C programs (and hopefully C++ in the - future!). The tool is very good at finding bugs that occur on specific - paths through code, such as on error conditions.

- -

The LLVM 2.8 release fixes a number of bugs and slightly improves precision - over 2.7, but there are no major new features in the release. +

In the LLVM 3.0 time-frame, the Clang team has made many improvements:

+ +

If Clang rejects your code but another compiler accepts it, please take a +look at the language +compatibility guide to make sure this is not intentional or a known issue.

- +

+DragonEgg: GCC front-ends, LLVM back-end +

-
+

-DragonEgg is a port of llvm-gcc to -gcc-4.5. Unlike llvm-gcc, dragonegg in theory does not require any gcc-4.5 -modifications whatsoever (currently one small patch is needed) thanks to the -new gcc plugin architecture. -DragonEgg is a gcc plugin that makes gcc-4.5 use the LLVM optimizers and code -generators instead of gcc's, just like with llvm-gcc. +DragonEgg is a +gcc plugin that replaces GCC's +optimizers and code generators with LLVM's. +Currently it requires a patched version of gcc-4.5. +The plugin can target the x86-32 and x86-64 processor families and has been +used successfully on the Darwin, FreeBSD and Linux platforms. +The Ada, C, C++ and Fortran languages work well. +The plugin is capable of compiling plenty of Obj-C, Obj-C++ and Java but it is +not known whether the compiled code actually works or not!

-DragonEgg is still a work in progress, but it is able to compile a lot of code, -for example all of gcc, LLVM and clang. Currently Ada, C, C++ and Fortran work -well, while all other languages either don't work at all or only work poorly. -For the moment only the x86-32 and x86-64 targets are supported, and only on -linux and darwin (darwin may need additional gcc patches). -

- -

-The 2.8 release has the following notable changes: +The 3.0 release has the following notable changes:

    -
  • The plugin loads faster due to exporting fewer symbols.
  • -
  • Additional vector operations such as addps256 are now supported.
  • -
  • Ada global variables with no initial value are no longer zero initialized, -resulting in better optimization.
  • -
  • The '-fplugin-arg-dragonegg-enable-gcc-optzns' flag now runs all gcc -optimizers, rather than just a handful.
  • -
  • Fortran programs using common variables now link correctly.
  • -
  • GNU OMP constructs no longer crash the compiler.
  • +
-

- -
- - - -
-

-The VMKit project is an implementation of -a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and -just-in-time compilation. As of LLVM 2.8, VMKit now supports copying garbage -collectors, and can be configured to use MMTk's copy mark-sweep garbage -collector. In LLVM 2.8, the VMKit .NET VM is no longer being maintained. -

- + -
+

The new LLVM compiler-rt project is a simple library that provides an implementation of the low-level @@ -215,21 +155,16 @@ function. The compiler-rt library provides highly optimized implementations of this and other low-level routines (some are 3x faster than the equivalent libgcc routines).

-

-All of the code in the compiler-rt project is available under the standard LLVM -License, a "BSD-style" license. New in LLVM 2.8, compiler_rt now supports -soft floating point (for targets that don't have a real floating point unit), -and includes an extensive testsuite for the "blocks" language feature and the -blocks runtime included in compiler_rt.

+

In the LLVM 3.0 timeframe,

- + -
+

LLDB is a brand new member of the LLVM umbrella of projects. LLDB is a next generation, high-performance debugger. It @@ -238,445 +173,327 @@ libraries in the larger LLVM Project, such as the Clang expression parser, the LLVM disassembler and the LLVM JIT.

-LLDB is in early development and not included as part of the LLVM 2.8 release, -but is mature enough to support basic debugging scenarios on Mac OS X in C, -Objective-C and C++. We'd really like help extending and expanding LLDB to -support new platforms, new languages, new architectures, and new features. -

+LLDB has advanced by leaps and bounds in the 3.0 timeframe. It is +dramatically more stable and useful, and includes both a new tutorial and a side-by-side comparison with +GDB.

- + -
+

-libc++ is another new member of the LLVM +libc++ is another new member of the LLVM family. It is an implementation of the C++ standard library, written from the ground up to specifically target the forthcoming C++'0X standard and focus on delivering great performance.

-As of the LLVM 2.8 release, libc++ is virtually feature complete, but would -benefit from more testing and better integration with Clang++. It is also -looking forward to the C++ committee finalizing the C++'0x standard. +In the LLVM 3.0 timeframe,

+ +

+Like compiler_rt, libc++ is now dual + licensed under the MIT and UIUC license, allowing it to be used more + permissively.

- - - - -
- -

An exciting aspect of LLVM is that it is used as an enabling technology for - a lot of other language and tools projects. This section lists some of the - projects that have already been updated to work with LLVM 2.8.

-
- - +

+LLBrowse: IR Browser +

-
+

-TCE is a toolset for designing -application-specific processors (ASP) based on the Transport triggered -architecture (TTA). The toolset provides a complete co-design flow from C/C++ -programs down to synthesizable VHDL and parallel program binaries. Processor -customization points include the register files, function units, supported -operations, and the interconnection network.

- -

TCE uses llvm-gcc/Clang and LLVM for C/C++ language support, target -independent optimizations and also for parts of code generation. It generates -new LLVM-based code generators "on the fly" for the designed TTA processors and -loads them in to the compiler backend as runtime libraries to avoid per-target -recompilation of larger parts of the compiler chain.

- + + LLBrowse is an interactive viewer for LLVM modules. It can load any LLVM + module and displays its contents as an expandable tree view, facilitating an + easy way to inspect types, functions, global variables, or metadata nodes. It + is fully cross-platform, being based on the popular wxWidgets GUI toolkit. +

- - -
-

-Horizon is a bytecode -language and compiler written on top of LLVM, intended for producing -single-address-space managed code operating systems that -run faster than the equivalent multiple-address-space C systems. -More in-depth blurb is available on the wiki.

- +

+VMKit +

+ +
+

The VMKit project is an implementation + of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and + just-in-time compilation. As of LLVM 3.0, VMKit now supports generational + garbage collectors. The garbage collectors are provided by the MMTk framework, + and VMKit can be configured to use one of the numerous implemented collectors + of MMTk. +

- + + - +
- -
-Pure -
+ +

+ External Open Source Projects Using LLVM 3.0 +

+ -
-

-Pure -is an algebraic/functional -programming language based on term rewriting. Programs are collections -of equations which are used to evaluate expressions in a symbolic -fashion. Pure offers dynamic typing, eager and lazy evaluation, lexical -closures, a hygienic macro system (also based on term rewriting), -built-in list and matrix support (including list and matrix -comprehensions) and an easy-to-use C interface. The interpreter uses -LLVM as a backend to JIT-compile Pure programs to fast native code.

- -

Pure versions 0.44 and later have been tested and are known to work with -LLVM 2.8 (and continue to work with older LLVM releases >= 2.5).

+
-
+

An exciting aspect of LLVM is that it is used as an enabling technology for + a lot of other language and tools projects. This section lists some of the + projects that have already been updated to work with LLVM 3.0.

- +

Crack Programming Language

-
+

-GHC is an open source, -state-of-the-art programming suite for -Haskell, a standard lazy functional programming language. It includes -an optimizing static compiler generating good code for a variety of -platforms, together with an interactive system for convenient, quick -development.

- -

In addition to the existing C and native code generators, GHC 7.0 now -supports an LLVM -code generator. GHC supports LLVM 2.7 and later.

- +Crack aims to provide the +ease of development of a scripting language with the performance of a compiled +language. The language derives concepts from C++, Java and Python, incorporating +object-oriented programming, operator overloading and strong typing.

- + + -
-Clay Programming Language +

TTA-based Codesign Environment (TCE)

+ +
+

TCE is a toolset for designing application-specific processors (ASP) based on +the Transport triggered architecture (TTA). The toolset provides a complete +co-design flow from C/C++ programs down to synthesizable VHDL and parallel +program binaries. Processor customization points include the register files, +function units, supported operations, and the interconnection network.

+ +

TCE uses Clang and LLVM for C/C++ language support, target independent +optimizations and also for parts of code generation. It generates new LLVM-based +code generators "on the fly" for the designed TTA processors and loads them in +to the compiler backend as runtime libraries to avoid per-target recompilation +of larger parts of the compiler chain.

-
-

-Clay is a new systems programming -language that is specifically designed for generic programming. It makes -generic programming very concise thanks to whole program type propagation. It -uses LLVM as its backend.

- -
+ -
-llvm-py Python Bindings for LLVM +

PinaVM

+ +
+

PinaVM is an open +source, SystemC front-end. Unlike many +other front-ends, PinaVM actually executes the elaboration of the +program analyzed using LLVM's JIT infrastructure. It later enriches the +bitcode with SystemC-specific information.

-
-

-llvm-py has been updated to work -with LLVM 2.8. llvm-py provides Python bindings for LLVM, allowing you to write a -compiler backend or a VM in Python.

- + +

Pure

+ +
+

Pure is an + algebraic/functional + programming language based on term rewriting. Programs are collections + of equations which are used to evaluate expressions in a symbolic + fashion. The interpreter uses LLVM as a backend to JIT-compile Pure + programs to fast native code. Pure offers dynamic typing, eager and lazy + evaluation, lexical closures, a hygienic macro system (also based on + term rewriting), built-in list and matrix support (including list and + matrix comprehensions) and an easy-to-use interface to C and other + programming languages (including the ability to load LLVM bitcode + modules, and inline C, C++, Fortran and Faust code in Pure programs if + the corresponding LLVM-enabled compilers are installed).

+ +

Pure version 0.47 has been tested and is known to work with LLVM 3.0 + (and continues to work with older LLVM releases >= 2.5).

- - +

IcedTea Java Virtual Machine Implementation

-
+

-FAUST is a compiled language for real-time -audio signal processing. The name FAUST stands for Functional AUdio STream. Its -programming model combines two approaches: functional programming and block -diagram composition. In addition with the C, C++, JAVA output formats, the -Faust compiler can now generate LLVM bitcode, and works with LLVM 2.7 and -2.8.

+IcedTea provides a +harness to build OpenJDK using only free software build tools and to provide +replacements for the not-yet free parts of OpenJDK. One of the extensions that +IcedTea provides is a new JIT compiler named Shark which uses LLVM +to provide native code generation without introducing processor-dependent +code. +

+

OpenJDK 7 b112, IcedTea6 1.9 and IcedTea7 1.13 and later have been tested +and are known to work with LLVM 3.0 (and continue to work with older LLVM +releases >= 2.6 as well).

- - -
-

Jade -(Just-in-time Adaptive Decoder Engine) is a generic video decoder engine using -LLVM for just-in-time compilation of video decoder configurations. Those -configurations are designed by MPEG Reconfigurable Video Coding (RVC) committee. -MPEG RVC standard is built on a stream-based dataflow representation of -decoders. It is composed of a standard library of coding tools written in -RVC-CAL language and a dataflow configuration &emdash; block diagram &emdash; -of a decoder.

- -

Jade project is hosted as part of the Open -RVC-CAL Compiler and requires it to translate the RVC-CAL standard library -of video coding tools into an LLVM assembly code.

+

Glasgow Haskell Compiler (GHC)

+ +
+

GHC is an open source, state-of-the-art programming suite for Haskell, +a standard lazy functional programming language. It includes an +optimizing static compiler generating good code for a variety of +platforms, together with an interactive system for convenient, quick +development.

+

In addition to the existing C and native code generators, GHC 7.0 now +supports an LLVM code generator. GHC supports LLVM 2.7 and later.

- - -
-

Neko LLVM JIT -replaces the standard Neko JIT with an LLVM-based implementation. While not -fully complete, it is already providing a 1.5x speedup on 64-bit systems. -Neko LLVM JIT requires LLVM 2.8 or later.

- +

Polly - Polyhedral optimizations for LLVM

+ +
+

Polly is a project that aims to provide advanced memory access optimizations +to better take advantage of SIMD units, cache hierarchies, multiple cores or +even vector accelerators for LLVM. Built around an abstract mathematical +description based on Z-polyhedra, it provides the infrastructure to develop +advanced optimizations in LLVM and to connect complex external optimizers. In +its first year of existence Polly already provides an exact value-based +dependency analysis as well as basic SIMD and OpenMP code generation support. +Furthermore, Polly can use PoCC (Pluto), an advanced optimizer for data-locality +and parallelism.

- - -
-

-Crack aims to provide -the ease of development of a scripting language with the performance of a -compiled language. The language derives concepts from C++, Java and Python, -incorporating object-oriented programming, operator overloading and strong -typing. Crack 0.2 works with LLVM 2.7, and the forthcoming Crack 0.2.1 release -builds on LLVM 2.8.

+

Rubinius

+
+

Rubinius is an environment + for running Ruby code which strives to write as much of the implementation in + Ruby as possible. Combined with a bytecode interpreting VM, it uses LLVM to + optimize and compile ruby code down to machine code. Techniques such as type + feedback, method inlining, and deoptimization are all used to remove dynamism + from ruby execution and increase performance.

- - - -
-

-DTMC provides support for -Transactional Memory, which is an easy-to-use and efficient way to synchronize -accesses to shared memory. Transactions can contain normal C/C++ code (e.g., -__transaction { list.remove(x); x.refCount--; }) and will be executed -virtually atomically and isolated from other transactions.

- -
- +

+FAUST Real-Time Audio Signal Processing Language +

-
+

-Kai (Japanese 会 for -meeting/gathering) is an experimental interpreter that provides a highly -extensible runtime environment and explicit control over the compilation -process. Programs are defined using nested symbolic expressions, which are all -parsed into first-class values with minimal intrinsic semantics. Kai can -generate optimised code at run-time (using LLVM) in order to exploit the nature -of the underlying hardware and to integrate with external software libraries. -It is a unique exploration into world of dynamic code compilation, and the -interaction between high level and low level semantics.

- -
+FAUST is a compiled language for real-time +audio signal processing. The name FAUST stands for Functional AUdio STream. Its +programming model combines two approaches: functional programming and block +diagram composition. In addition to the C, C++ and Java output formats, the +Faust compiler can now generate LLVM bitcode, and works with LLVM 2.7-3.0.

- - - -
-

-OSL is a shading -language designed for use in physically based renderers and in particular -production rendering. By using LLVM instead of the interpreter, it was able to -meet its performance goals (>= C-code) while retaining the benefits of -runtime specialization and a portable high-level language. -

- +
- - - +

+ What's New in LLVM 3.0? +

-
+

This release includes a huge number of bug fixes, performance tweaks and minor improvements. Some of the major improvements and new features are listed in this section.

-
- - + -
+
-

LLVM 2.8 includes several major new capabilities:

+

LLVM 3.0 includes several major new capabilities:

    -
  • As mentioned above, libc++ and LLDB are major new additions to the LLVM collective.
  • -
  • LLVM 2.8 now has pretty decent support for debugging optimized code. You - should be able to reliably get debug info for function arguments, assuming - that the value is actually available where you have stopped.
  • -
-
  • A new 'llvm-diff' tool is available that does a semantic diff of .ll - files.
  • -
  • The MC subproject has made major progress in this release. - Direct .o file writing support for darwin/x86[-64] is now reliable and - support for other targets and object file formats are in progress.
  • - + + + +
    - + -
    +

    LLVM IR has several new features for better support of new targets and that expose new optimization opportunities:

      -
    • The memcpy, memmove, and memset - intrinsics now take address space qualified pointers and a bit to indicate - whether the transfer is "volatile" or not. -
    • -
    • Per-instruction debug info metadata is much faster and uses less memory by - using the new DebugLoc class.
    • -
    • LLVM IR now has a more formalized concept of "trap values", which allow the optimizer - to optimize more aggressively in the presence of undefined behavior, while - still producing predictable results.
    • -
    • LLVM IR now supports two new linkage - types (linker_private_weak and linker_private_weak_def_auto) which map - onto some obscure MachO concepts.
    • +
    - + -
    +

    In addition to a large array of minor performance tweaks and bug fixes, this release includes a few major enhancements and additions to the optimizers:

      -
    • As mentioned above, the optimizer now has support for updating debug - information as it goes. A key aspect of this is the new llvm.dbg.value - intrinsic. This intrinsic represents debug info for variables that are - promoted to SSA values (typically by mem2reg or the -scalarrepl passes).
    • - -
    • The JumpThreading pass is now much more aggressive about implied value - relations, allowing it to thread conditions like "a == 4" when a is known to - be 13 in one of the predecessors of a block. It does this in conjunction - with the new LazyValueInfo analysis pass.
    • -
    • The new RegionInfo analysis pass identifies single-entry single-exit regions - in the CFG. You can play with it with the "opt -regions analyze" or - "opt -view-regions" commands.
    • -
    • The loop optimizer has significantly improved strength reduction and analysis - capabilities. Notably it is able to build on the trap value and signed - integer overflow information to optimize <= and >= loops.
    • -
    • The CallGraphSCCPassManager now has some basic support for iterating within - an SCC when an optimizer devirtualizes a function call. This allows inlining - through indirect call sites that are devirtualized by store-load forwarding - and other optimizations.
    • -
    • The new -loweratomic pass is available - to lower atomic instructions into their non-atomic form. This can be useful - to optimize generic code that expects to run in a single-threaded - environment.
    • -
    - + + +
    - + -
    +

    The LLVM Machine Code (aka MC) subsystem was created to solve a number of problems in the realm of assembly, disassembly, object file format handling, and a number of other related areas that CPU instruction-set level tools work in.

    -

    The MC subproject has made great leaps in LLVM 2.8. For example, support for - directly writing .o files from LLC (and clang) now works reliably for - darwin/x86[-64] (including inline assembly support) and the integrated - assembler is turned on by default in Clang for these targets. This provides - improved compile times among other things.

    -
      -
    • The entire compiler has converted over to using the MCStreamer assembler API - instead of writing out a .s file textually.
    • -
    • The "assembler parser" is far more mature than in 2.7, supporting a full - complement of directives, now supports assembler macros, etc.
    • -
    • The "assembler backend" has been completed, including support for relaxation - relocation processing and all the other things that an assembler does.
    • -
    • The MachO file format support is now fully functional and works.
    • -
    • The MC disassembler now fully supports ARM and Thumb. ARM assembler support - is still in early development though.
    • -
    • The X86 MC assembler now supports the X86 AES and AVX instruction set.
    • -
    • Work on ELF and COFF object files and ARM target support is well underway, - but isn't useful yet in LLVM 2.8. Please contact the llvmdev mailing list - if you're interested in this.
    • +

    For more information, please see the Intro to the LLVM MC Project Blog Post.

    -
    - +
    - + -
    +

    We have put a significant amount of work into the code generator infrastructure, which allows us to implement more aggressive algorithms and make it run faster:

      -
    • The clang/gcc -momit-leaf-frame-pointer argument is now supported.
    • -
    • The clang/gcc -ffunction-sections and -fdata-sections arguments are now - supported on ELF targets (like GCC).
    • -
    • The MachineCSE pass is now tuned and on by default. It eliminates common - subexpressions that are exposed when lowering to machine instructions.
    • -
    • The "local" register allocator was replaced by a new "fast" register - allocator. This new allocator (which is often used at -O0) is substantially - faster and produces better code than the old local register allocator.
    • -
    • A new LLC "-regalloc=default" option is available, which automatically - chooses a register allocator based on the -O optimization level.
    • -
    • The common code generator code was modified to promote illegal argument and - return value vectors to wider ones when possible instead of scalarizing - them. For example, <3 x float> will now pass in one SSE register - instead of 3 on X86. This generates substantially better code since the - rest of the code generator was already expecting this.
    • -
    • The code generator uses a new "COPY" machine instruction. This speeds up - the code generator and eliminates the need for targets to implement the - isMoveInstr hook. Also, the copyRegToReg hook was renamed to copyPhysReg - and simplified.
    • -
    • The code generator now has a "LocalStackSlotPass", which optimizes stack - slot access for targets (like ARM) that have limited stack displacement - addressing.
    • -
    • A new "PeepholeOptimizer" is available, which eliminates sign and zero - extends, and optimizes away compare instructions when the condition result - is available from a previous instruction.
    • -
    • Atomic operations now get legalized into simpler atomic operations if not - natively supported, easing the implementation burden on targets.
    • -
    • The bottom-up pre-allocation scheduler is now register pressure aware, - allowing it to avoid overscheduling in high pressure situations while still - aggressively scheduling when registers are available.
    • -
    • A new instruction-level-parallelism pre-allocation scheduler is available, - which is also register pressure aware. This scheduler has shown substantial - wins on X86-64 and is on by default.
    • -
    • The tblgen type inference algorithm was rewritten to be more consistent and - diagnose more target bugs. If you have an out-of-tree backend, you may - find that it finds bugs in your target description. This support also - allows limited support for writing patterns for instructions that return - multiple results (e.g. a virtual register and a flag result). The - 'parallel' modifier in tblgen was removed, you should use the new support - for multiple results instead.
    • -
    • A new (experimental) "-rendermf" pass is available which renders a - MachineFunction into HTML, showing live ranges and other useful - details.
    • - - - - -
    • The -fast-isel instruction selection path (used at -O0 on X86) was rewritten - to work bottom-up on basic blocks instead of top down. This makes it - slightly faster (because the MachineDCE pass is not needed any longer) and - allows it to generate better code in some cases.
    • - +
    - + -
    -

    New features of the X86 target include: +

    +

    New features and major changes in the X86 target include:

      -
    • The X86 backend now supports holding X87 floating point stack values - in registers across basic blocks, dramatically improving performance of code - that uses long double, and when targeting CPUs that don't support SSE.
    • - - New SSEDomainFix pass: - On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a - register in a different domain than where it was defined. Some instructions - have equivalents for different domains, like por/orps/orpd. The - SSEDomainFix pass tries to minimize the number of domain crossings by - changing between equivalent opcodes where possible. - - X86 backend attempts to promote 16-bit integer operations to 32-bits to avoid - 0x66 prefixes, which are slow on some microarchitectures and bloat the code - on others. - - New support for X86 "thiscall" calling convention (x86_thiscallcc in IR) for windows. - - New llvm.x86.int intrinsic (for int $42 and int3) - - Verbose assembly decodes X86 shuffle instructions, e.g.: - insertps $113, %xmm3, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm3[1] - unpcklps %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] - pshufd $1, %xmm1, %xmm1 ## xmm1 = xmm1[1,0,0,0] - - X86 ABI: <2 x float> in IR no longer maps onto MMX, it turns into <4 x float> - - new GHC calling convention +
    • The CRC32 intrinsics have been renamed. The intrinsics were previously + @llvm.x86.sse42.crc32.[8|16|32] and @llvm.x86.sse42.crc64.[8|64]. They have + been renamed to @llvm.x86.sse42.crc32.32.[8|16|32] and + @llvm.x86.sse42.crc32.64.[8|64].
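
For out-of-tree code that still emits the old names, the renaming above can be applied mechanically to textual IR. The following is only a minimal sketch in Python, not part of LLVM: the helper name `rename_crc32_intrinsics` and the regular expression are illustrative assumptions, and the old/new name pairs are taken directly from the note above.

```python
import re

# Old suffix -> new suffix, exactly as listed in the release note above.
_RENAMES = {
    "crc32.8": "crc32.32.8",
    "crc32.16": "crc32.32.16",
    "crc32.32": "crc32.32.32",
    "crc64.8": "crc32.64.8",
    "crc64.64": "crc32.64.64",
}

# Match only complete old names; the negative lookahead stops the pattern
# from matching a prefix of an already-renamed intrinsic.
_OLD_NAME = re.compile(
    r"@llvm\.x86\.sse42\.(crc32\.(?:8|16|32)|crc64\.(?:8|64))(?![\w.])"
)


def rename_crc32_intrinsics(ir_text):
    """Rewrite old CRC32 intrinsic names in textual IR (hypothetical helper)."""
    return _OLD_NAME.sub(
        lambda m: "@llvm.x86.sse42." + _RENAMES[m.group(1)], ir_text
    )
```

Because already-renamed intrinsics are left untouched, running the helper twice over the same file gives the same result as running it once.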
    - + -
    +

    New features of the ARM target include:

      - - NEON: Better performance for QQQQ (4-consecutive Q register) instructions. New reg sequence abstraction? - ARM: Better scheduling (list-hybrid, hybrid?) - ARM: Tail call support. - ARM: General performance work and tuning. - - ARM: Half float support through intrinsics LangRef.html#int_fp16 -
    • ARMGlobalMerge:
    • - -
    • The ARM NEON intrinsics have been substantially reworked to reduce - redundancy and improve code generation. Some of the major changes are: -
        -
      1. - All of the NEON load and store intrinsics (llvm.arm.neon.vld* and - llvm.arm.neon.vst*) take an extra parameter to specify the alignment in bytes - of the memory being accessed. -
      2. -
      3. - The llvm.arm.neon.vaba intrinsic (vector absolute difference and - accumulate) has been removed. This operation is now represented using - the llvm.arm.neon.vabd intrinsic (vector absolute difference) followed by a - vector add. -
      4. -
      5. - The llvm.arm.neon.vabdl and llvm.arm.neon.vabal intrinsics (lengthening - vector absolute difference with and without accumulation) have been removed. - They are represented using the llvm.arm.neon.vabd intrinsic (vector absolute - difference) followed by a vector zero-extend operation, and for vabal, - a vector add. -
      6. -
      7. - The llvm.arm.neon.vmovn intrinsic has been removed. Calls of this intrinsic - are now replaced by vector truncate operations. -
      8. -
      9. - The llvm.arm.neon.vmovls and llvm.arm.neon.vmovlu intrinsics have been - removed. They are now represented as vector sign-extend (vmovls) and - zero-extend (vmovlu) operations. -
      10. -
      11. - The llvm.arm.neon.vaddl*, llvm.arm.neon.vaddw*, llvm.arm.neon.vsubl*, and - llvm.arm.neon.vsubw* intrinsics (lengthening vector add and subtract) have - been removed. They are replaced by vector add and vector subtract operations - where one (vaddw, vsubw) or both (vaddl, vsubl) of the operands are either - sign-extended or zero-extended. -
      12. -
      13. - The llvm.arm.neon.vmulls, llvm.arm.neon.vmullu, llvm.arm.neon.vmlal*, and - llvm.arm.neon.vmlsl* intrinsics (lengthening vector multiply with and without - accumulation and subtraction) have been removed. These operations are now - represented as vector multiplications where the operands are either - sign-extended or zero-extended, followed by a vector add for vmlal or a - vector subtract for vmlsl. Note that the polynomial vector multiply - intrinsic, llvm.arm.neon.vmullp, remains unchanged. -
      14. -
      -
    • +
    - + - - -
    -

    Other miscellaneous features include:

    +

    +Other Target Specific Improvements +

    +
      +
    -
    - - + -
    +
    -

    If you're already an LLVM user or developer with out-of-tree changes based -on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading -from the previous release.

    +

    If you're already an LLVM user or developer with out-of-tree changes based on + LLVM 2.9, this section lists some "gotchas" that you may run into upgrading + from the previous release.

      -
    • The build configuration machinery changed the output directory names. It - wasn't clear to many people that "Release-Asserts" build was a release build - without asserts. To make this more clear, "Release" does not include - assertions and "Release+Asserts" does (likewise, "Debug" and - "Debug+Asserts").
    • -
    • The MSIL Backend was removed, it was unsupported and broken.
    • -
    • The ABCD, SSI, and SCCVN passes were removed. These were not fully - functional and their behavior has been or will be subsumed by the - LazyValueInfo pass.
    • -
    • The LLVM IR 'Union' feature was removed. While this is a desirable feature - for LLVM IR to support, the existing implementation was half baked and - barely useful. We'd really like anyone interested to resurrect the work and - finish it for a future release.
    • -
    • If you're used to reading .ll files, you'll probably notice that .ll file - dumps don't produce #uses comments anymore. To get them, run a .bc file - through "llvm-dis --show-annotations".
    • -
    • Target triples are now stored in a normalized form, and all inputs from - humans are expected to be normalized by Triple::normalize before being - stored in a module triple or passed to another library.
    • +
    • The LowerSetJmp pass wasn't used effectively by any + target and has been removed.
    • +
    • The old TailDup pass was not used in the standard pipeline + and was unable to update SSA form, so it has been removed. +
    • The syntax of volatile loads and stores in IR has been changed to + "load volatile"/"store volatile". The old + syntax ("volatile load"/"volatile store") + is still accepted, but is now considered deprecated.
    +

    Windows (32-bit)

    +
    +
      +
    • On Win32 (MinGW32 and MSVC), Windows 2000 is not supported. + Windows XP or higher is required.
    • +
    +
    + +
    + + +

    +Internal API Changes +

    + +
    +

    In addition, many APIs have changed in this release. Some of the major + LLVM API changes are:

    -

    In addition, many APIs have changed in this release. Some of the major LLVM -API changes are:

      -
    • LLVM 2.8 changes the internal order of operands in InvokeInst - and CallInst. - To be portable across releases, please use the CallSite class and the - high-level accessors, such as getCalledValue and - setUnwindDest. -
    • -
    • - You can no longer pass use_iterators directly to cast<> (and similar), - because these routines tend to perform costly dereference operations more - than once. You have to dereference the iterators yourself and pass them in. -
    • -
    • - llvm.memcpy.*, llvm.memset.*, llvm.memmove.* intrinsics take an extra - parameter now ("i1 isVolatile"), totaling 5 parameters, and the pointer - operands are now address-space qualified. - If you were creating these intrinsic calls and prototypes yourself (as opposed - to using Intrinsic::getDeclaration), you can use - UpgradeIntrinsicFunction/UpgradeIntrinsicCall to be portable across releases. -
    • -
    • - SetCurrentDebugLocation takes a DebugLoc now instead of a MDNode. - Change your code to use - SetCurrentDebugLocation(DebugLoc::getFromDILocation(...)). -
    • -
    • - The RegisterPass and RegisterAnalysisGroup templates are - considered deprecated, but continue to function in LLVM 2.8. Clients are - strongly advised to use the upcoming INITIALIZE_PASS() and - INITIALIZE_AG_PASS() macros instead. -
    • -
    • - The constructor for the Triple class no longer tries to understand odd triple - specifications. Frontends should ensure that they only pass valid triples to - LLVM. The Triple::normalize utility method has been added to help front-ends - deal with funky triples. -
    • +
    • The biggest and most pervasive change is that llvm::Types are no longer + returned or accepted as 'const' values. Instead, just pass around non-const + Types.
    • -
    • - Some APIs got renamed: -
        -
      • llvm_report_error -> report_fatal_error
      • -
      • llvm_install_error_handler -> install_fatal_error_handler
      • -
      • llvm::DwarfExceptionHandling -> llvm::JITExceptionHandling
      • -
      • VISIBILITY_HIDDEN -> LLVM_LIBRARY_VISIBILITY
      • -
      -
    • +
    • PHINode::reserveOperandSpace has been removed. Instead, you + must specify how many operands to reserve space for when you create the + PHINode, by passing an extra argument into PHINode::Create.
    • + +
    • PHINodes no longer store their incoming BasicBlocks as operands. Instead, + the list of incoming BasicBlocks is stored separately, and can be accessed + with new functions PHINode::block_begin + and PHINode::block_end.
    • + +
    • Various functions now take an ArrayRef instead of either a pair + of pointers (or iterators) to the beginning and end of a range, or a pointer + and a length. Others now return an ArrayRef instead of a + reference to a SmallVector or std::vector. These + include: +
        + +
      • CallInst::Create
      • +
      • ComputeLinearIndex (in llvm/CodeGen/Analysis.h)
      • +
      • ConstantArray::get
      • +
      • ConstantExpr::getExtractElement
      • +
      • ConstantExpr::getGetElementPtr
      • +
      • ConstantExpr::getInBoundsGetElementPtr
      • +
      • ConstantExpr::getIndices
      • +
      • ConstantExpr::getInsertElement
      • +
      • ConstantExpr::getWithOperands
      • +
      • ConstantFoldCall (in llvm/Analysis/ConstantFolding.h)
      • +
      • ConstantFoldInstOperands (in llvm/Analysis/ConstantFolding.h)
      • +
      • ConstantVector::get
      • +
      • DIBuilder::createComplexVariable
      • +
      • DIBuilder::getOrCreateArray
      • +
      • ExtractValueInst::Create
      • +
      • ExtractValueInst::getIndexedType
      • +
      • ExtractValueInst::getIndices
      • +
      • FindInsertedValue (in llvm/Analysis/ValueTracking.h)
      • +
      • gep_type_begin (in llvm/Support/GetElementPtrTypeIterator.h)
      • +
      • gep_type_end (in llvm/Support/GetElementPtrTypeIterator.h)
      • +
      • GetElementPtrInst::Create
      • +
      • GetElementPtrInst::CreateInBounds
      • +
      • GetElementPtrInst::getIndexedType
      • +
      • InsertValueInst::Create
      • +
      • InsertValueInst::getIndices
      • +
      • InvokeInst::Create
      • +
      • IRBuilder::CreateCall
      • +
      • IRBuilder::CreateExtractValue
      • +
      • IRBuilder::CreateGEP
      • +
      • IRBuilder::CreateInBoundsGEP
      • +
      • IRBuilder::CreateInsertValue
      • +
      • IRBuilder::CreateInvoke
      • +
      • MDNode::get
      • +
      • MDNode::getIfExists
      • +
      • MDNode::getTemporary
      • +
      • MDNode::getWhenValsUnresolved
      • +
      • SimplifyGEPInst (in llvm/Analysis/InstructionSimplify.h)
      • +
      • TargetData::getIndexedOffset
      • +
    • + +
    • All forms of StringMap::getOrCreateValue have been removed + except for the one which takes a StringRef.
    • + +
    • The LLVMBuildUnwind function from the C API was removed. The + LLVM unwind instruction has been deprecated for a long time and + isn't used by the current front-ends. So this was removed during the + exception handling rewrite.
    • + +
    • The LLVMAddLowerSetJmpPass function from the C API was removed + because the LowerSetJmp pass was removed.
    • + +
    • The DIBuilder interface used by front ends to encode debugging + information in the LLVM IR now expects clients to call DIBuilder::finalize() + at the end of each translation unit to complete debugging information encoding.
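
As a rough illustration of the ArrayRef-style signatures listed above, the following self-contained sketch shows why a single non-owning view parameter can stand in for both the pointer-and-length and container-based overloads. MiniArrayRef and sumIndices are hypothetical names invented for this example; they are not LLVM's ArrayRef or any LLVM API, and the real class offers more than is shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the idea behind an ArrayRef-style parameter:
// a non-owning, read-only view of a contiguous sequence.
// This is NOT LLVM's class, only an illustration of the pattern.
template <typename T>
class MiniArrayRef {
  const T *Data;
  std::size_t Length;

public:
  // View over a pointer and a length.
  MiniArrayRef(const T *D, std::size_t N) : Data(D), Length(N) {}

  // View over a std::vector; the vector must outlive the view.
  MiniArrayRef(const std::vector<T> &V) : Data(V.data()), Length(V.size()) {}

  // View over a fixed-size C array.
  template <std::size_t N>
  MiniArrayRef(const T (&A)[N]) : Data(A), Length(N) {}

  std::size_t size() const { return Length; }

  const T &operator[](std::size_t I) const {
    assert(I < Length && "index out of range");
    return Data[I];
  }

  const T *begin() const { return Data; }
  const T *end() const { return Data + Length; }
};

// Hypothetical helper standing in for a function that used to need separate
// (pointer, length) and container overloads and now takes one view.
unsigned sumIndices(MiniArrayRef<unsigned> Idxs) {
  unsigned Total = 0;
  for (const unsigned *I = Idxs.begin(), *E = Idxs.end(); I != E; ++I)
    Total += *I;
  return Total;
}
```

A caller holding a std::vector<unsigned> or a plain array can pass either one directly to sumIndices. The view never copies or owns the elements, so the underlying storage has to outlive it.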
    -
    +
    - + -
    +

    This section contains significant known problems with the LLVM system, listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if there isn't already one.

    -
    - - + -
    +

    The following components of this LLVM release are either untested, known to be broken or unreliable, or are in early development. These components should @@ -1002,43 +720,54 @@ components, please contact us on the LLVMdev list.

      -
    • The Alpha, Blackfin, CellSPU, MicroBlaze, MSP430, MIPS, PIC16, SystemZ +
    • The Alpha, Blackfin, CellSPU, MicroBlaze, MSP430, MIPS, PTX, SystemZ and XCore backends are experimental.
    • llc "-filetype=obj" is experimental on all targets - other than darwin-i386 and darwin-x86_64.
    • + other than darwin and ELF X86 systems. +
    - + -
    +
    • The X86 backend does not yet support all inline assembly that uses the X86 floating point stack. It supports the 'f' and 't' constraints, but not 'u'.
    • -
    • Win64 code generation wasn't widely tested. Everything should work, but we - expect small issues to happen. Also, llvm-gcc cannot build the mingw64 - runtime currently due to lack of support for the 'u' inline assembly - constraint and for X87 floating point inline assembly.
    • The X86-64 backend does not yet support the LLVM IR instruction va_arg. Currently, front-ends support variadic argument constructs on X86-64 by lowering them manually.
    • +
    • The Windows x64 (aka Win64) code generator has a few issues. +
        +
      • llvm-gcc cannot build the mingw-w64 runtime currently + due to lack of support for the 'u' inline assembly + constraint and for X87 floating point inline assembly.
      • +
      • On mingw-w64, you will see an unresolved symbol __chkstk + due to Bug 8919. + It is fixed in r128206.
      • +
      • Misaligned MOVDQA might crash your program. This is due to + Bug 9483, + a lack of handling for aligned internal globals.
      • +
      +
    • +
    - + -
    +
    • The Linux PPC32/ABI support needs testing for the interpreter and static @@ -1048,11 +777,11 @@ compilation, and lacks support for debug information.
    - + -
    +
    • Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6 @@ -1065,11 +794,11 @@ results (PR1388).
    - + -
    +
    • The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not @@ -1079,11 +808,11 @@ results (PR1388).
    - + -
    +
    • 64-bit MIPS targets are not supported yet.
    • @@ -1092,11 +821,11 @@ results (PR1388).
    - + -
    +
      @@ -1107,11 +836,11 @@ appropriate nops inserted to ensure restartability.
    - + -
    +

    The C backend has numerous problems and is not being actively maintained. Depending on it for anything serious is not advised.

    @@ -1130,11 +859,13 @@ Depending on it for anything serious is not advised.

    - + + +
    -
    +

    LLVM 3.0 will be the last release of llvm-gcc.

    llvm-gcc is generally very stable for the C family of languages. The only major language feature of GCC not supported by llvm-gcc is the @@ -1150,49 +881,23 @@ Depending on it for anything serious is not advised.

    4.2. If you are interested in Fortran, we recommend that you consider using dragonegg instead.

    -

    The llvm-gcc 4.2 Ada compiler has basic functionality. However, this is not a -mature technology, and problems should be expected. For example:

    -
      -
    • The Ada front-end currently only builds on X86-32. This is mainly due -to lack of trampoline support (pointers to nested functions) on other platforms. -However, it also fails to build on X86-64 -which does support trampolines.
    • -
    • The Ada front-end fails to bootstrap. -This is due to lack of LLVM support for setjmp/longjmp style -exception handling, which is used internally by the compiler. -Workaround: configure with --disable-bootstrap.
    • -
    • The c380004, c393010 -and cxg2021 ACATS tests fail -(c380004 also fails with gcc-4.2 mainline). -If the compiler is built with checks disabled then c393010 -causes the compiler to go into an infinite loop, using up all system memory.
    • -
    • Some GCC specific Ada tests continue to crash the compiler.
    • -
    • The -E binder option (exception backtraces) -does not work and will result in programs -crashing if an exception is raised. Workaround: do not use -E.
    • -
    • Only discrete types are allowed to start -or finish at a non-byte offset in a record. Workaround: do not pack records -or use representation clauses that result in a field of a non-discrete type -starting or finishing in the middle of a byte.
    • -
    • The lli interpreter considers -'main' as generated by the Ada binder to be invalid. -Workaround: hand edit the file to use pointers for argv and -envp rather than integers.
    • -
    • The -fstack-check option is -ignored.
    • -
    +

    The llvm-gcc 4.2 Ada compiler has basic functionality, but is no longer being +actively maintained. If you are interested in Ada, we recommend that you +consider using dragonegg instead.

    +
    +
    - + -
    +

    A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also contains versions of the API documentation which is up-to-date with the Subversion version of the source code.