-<li>The ARM backend now generates instructions in unified assembly syntax.</li>
-
-<li>llvm-gcc now has complete support for the ARM v7 NEON instruction set. This
- support differs slightly from the GCC implementation. Please see the
- <a
-href="http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html">
- ARM Advanced SIMD (NEON) Intrinsics and Types in LLVM Blog Post</a> for
- helpful information if migrating code from GCC to LLVM-GCC.</li>
-
-<li>The ARM and Thumb code generators now use register scavenging for stack
- object address materialization. This allows the use of R3 as a general
- purpose register in Thumb1 code, as it was previously reserved for use in
- stack address materialization. Additionally, sequential uses of the same
- value will now re-use the materialized constant.</li>
-
-<li>The ARM backend now has good support for ARMv4 targets and has been tested
- on StrongARM hardware. Previously, LLVM only supported ARMv4T and
- newer chips.</li>
-
-<li>Atomic builtins are now supported for ARMv6 and ARMv7 (__sync_synchronize,
- __sync_fetch_and_add, etc.).</li>
-
-</ul>
-
-
-</div>
-
-<!--=========================================================================-->
-<div class="doc_subsection">
-<a name="newapis">New Useful APIs</a>
-</div>
-
-<div class="doc_text">
-
-<p>This release includes a number of new APIs that are used internally, which
- may also be useful for external clients.
-</p>
-
-<ul>
-<li>The optimizer uses the new CodeMetrics class to measure the size of code.
- Various passes (like the inliner, loop unswitcher, etc) all use this to make
- more accurate estimates of the code size impact of various
- optimizations.</li>
-<li>A new <a href="http://llvm.org/doxygen/InstructionSimplify_8h-source.html">
- llvm/Analysis/InstructionSimplify.h</a> interface is available for doing
- symbolic simplification of instructions (e.g. <tt>a+0</tt> -> <tt>a</tt>)
- without requiring the instruction to exist. This centralizes a lot of
- ad-hoc symbolic manipulation code scattered in various passes.</li>
-<li>The optimizer now uses a new <a
- href="http://llvm.org/doxygen/SSAUpdater_8h-source.html">SSAUpdater</a>
- class which efficiently supports
- doing unstructured SSA update operations. This centralized a bunch of code
- scattered throughout various passes (e.g. jump threading, lcssa,
- loop rotate, etc) for doing this sort of thing. The code generator has a
- similar <a href="http://llvm.org/doxygen/MachineSSAUpdater_8h-source.html">
- MachineSSAUpdater</a> class.</li>
-<li>The <a href="http://llvm.org/doxygen/Regex_8h-source.html">
- llvm/Support/Regex.h</a> header exposes a platform-independent regular
- expression API. Building on this, the <a
- href="TestingGuide.html#FileCheck">FileCheck</a> utility now supports
- regular expressions.</li>
-<li>raw_ostream now supports a circular "debug stream" accessed with "dbgs()".
- By default, this stream works the same way as "errs()", but if you pass
- <tt>-debug-buffer-size=1000</tt> to opt, the debug stream is capped to a
- fixed-size circular buffer and the output is printed at the end of the
- program's execution. This is helpful if you have a long-lived compiler
- process and you're interested in seeing snapshots in time.</li>
-</ul>
-
-
-</div>
-
-<!--=========================================================================-->
-<div class="doc_subsection">
-<a name="otherimprovements">Other Improvements and New Features</a>
-</div>
-
-<div class="doc_text">
-<p>Other miscellaneous features include:</p>
-
-<ul>
-<li>You can now build LLVM as a big dynamic library (e.g. "libllvm2.7.so"). To
- get this, configure LLVM with the --enable-shared option.</li>
-
-<li>LLVM command line tools now overwrite their output by default. Previously,
- they would only do this with -f. This makes them more convenient to use, and
- behave more like standard unix tools.</li>
+<li>The ARM NEON intrinsics have been substantially reworked to reduce
+ redundancy and improve code generation. Some of the major changes are:
+ <ol>
+ <li>
+ All of the NEON load and store intrinsics (llvm.arm.neon.vld* and
+ llvm.arm.neon.vst*) take an extra parameter to specify the alignment in bytes
+ of the memory being accessed.
+ </li>
+ <li>
+ The llvm.arm.neon.vaba intrinsic (vector absolute difference and
+ accumulate) has been removed. This operation is now represented using
+ the llvm.arm.neon.vabd intrinsic (vector absolute difference) followed by a
+ vector add.
+ </li>
+ <li>
+ The llvm.arm.neon.vabdl and llvm.arm.neon.vabal intrinsics (lengthening
+ vector absolute difference without and with accumulation, respectively) have
+ been removed. They are now represented using the llvm.arm.neon.vabd intrinsic
+ (vector absolute difference) followed by a vector zero-extend operation, and
+ for vabal, a vector add.
+ </li>
+ <li>
+ The llvm.arm.neon.vmovn intrinsic has been removed. Calls to this intrinsic
+ are now replaced by vector truncate operations.
+ </li>
+ <li>
+ The llvm.arm.neon.vmovls and llvm.arm.neon.vmovlu intrinsics have been
+ removed. They are now represented as vector sign-extend (vmovls) and
+ zero-extend (vmovlu) operations.
+ </li>
+ <li>
+ The llvm.arm.neon.vaddl*, llvm.arm.neon.vaddw*, llvm.arm.neon.vsubl*, and
+ llvm.arm.neon.vsubw* intrinsics (lengthening vector add and subtract) have
+ been removed. They are replaced by vector add and vector subtract operations
+ where one (vaddw, vsubw) or both (vaddl, vsubl) of the operands are either
+ sign-extended or zero-extended.
+ </li>
+ <li>
+ The llvm.arm.neon.vmulls, llvm.arm.neon.vmullu, llvm.arm.neon.vmlal*, and
+ llvm.arm.neon.vmlsl* intrinsics (lengthening vector multiply,
+ multiply-accumulate, and multiply-subtract) have been removed. These
+ operations are now represented as vector multiplications where the operands
+ are either sign-extended or zero-extended, followed by a vector add for vmlal
+ or a vector subtract for vmlsl. Note that the polynomial vector multiply
+ intrinsic, llvm.arm.neon.vmullp, remains unchanged.
+ </li>
+ </ol>
+</li>