+ available in the debug build. These tests will fail in an optimized or
+ profile build.</p>
+</div>
+
+<div class="question">
+<p>Compiling LLVM with GCC 3.3.2 fails. What should I do?</p>
+</div>
+
+<div class="answer">
+<p>This is <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13392">a bug in
+ GCC</a>, and affects projects other than LLVM. Try upgrading or downgrading
+ your GCC.</p>
+</div>
+
+<div class="question">
+<p>Compiling LLVM with GCC succeeds, but the resulting tools do not work. What
+ can be wrong?</p>
+</div>
+
+<div class="answer">
+<p>Several versions of GCC are known to miscompile the LLVM codebase. Check
+ your compiler version (<tt>gcc --version</tt>) to find out whether it is
+ <a href="GettingStarted.html#brokengcc">broken</a>. If so, your only option
+ is to upgrade GCC to a known good version.</p>
+</div>
+
+<div class="question">
+<p>After Subversion update, rebuilding gives the error "No rule to make
+ target".</p>
+</div>
+
+<div class="answer">
+<p>If the error is of the form:</p>
+
+<pre class="doc_code">
+gmake[2]: *** No rule to make target `/path/to/somefile', needed by
+`/path/to/another/file.d'.<br>
+Stop.
+</pre>
+
+<p>This may occur any time files are moved within the Subversion repository or
+ removed entirely. In this case, the best solution is to erase all
+ <tt>.d</tt> files, which list dependencies for source files, and rebuild:</p>
+
+<pre class="doc_code">
+% cd $LLVM_OBJ_DIR
+% rm -f `find . -name \*\.d`
+% gmake
+</pre>
+
+<p>In other cases, it may be necessary to run <tt>make clean</tt> before
+ rebuilding.</p>
+</div>
+
+<div class="question">
+<p><a name="llvmc">The <tt>llvmc</tt> program gives me errors/doesn't
+ work.</a></p>
+</div>
+
+<div class="answer">
+<p><tt>llvmc</tt> is experimental and isn't really supported. We suggest
+ using <tt>llvm-gcc</tt> instead.</p>
+</div>
+
+<div class="question">
+<p><a name="srcdir-objdir">When I compile LLVM-GCC with srcdir == objdir, it
+ fails. Why?</a></p>
+</div>
+
+<div class="answer">
+<p>The <tt>GNUmakefile</tt> in the top-level directory of LLVM-GCC is a special
+ <tt>Makefile</tt> used by Apple to invoke the <tt>build_gcc</tt> script after
+ setting up a special environment. This has the unfortunate side-effect that
+ trying to build LLVM-GCC with srcdir == objdir in a "non-Apple way" invokes
+ the <tt>GNUmakefile</tt> instead of <tt>Makefile</tt>. Because the
+ environment isn't set up correctly to do this, the build fails.</p>
+
+<p>People not building LLVM-GCC the "Apple way" need to build LLVM-GCC with
+ srcdir != objdir, or simply remove the GNUmakefile entirely.</p>
+
+<p>We regret the inconvenience.</p>
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section"><a name="felangs">Source Languages</a></div>
+
+<div class="question">
+<p><a name="langs">What source languages are supported?</a></p>
+</div>
+
+<div class="answer">
+<p>LLVM currently has full support for C and C++ source languages. These are
+ available through a special version of GCC that LLVM calls the
+ <a href="#cfe">C Front End</a>.</p>
+
+<p>There is an incomplete version of a Java front end available in the
+ <tt>java</tt> module. There is no documentation on this yet so you'll need to
+ download the code, compile it, and try it.</p>
+
+<p>The PyPy developers are working on integrating LLVM into the PyPy backend so
+ that PyPy can translate to LLVM.</p>
+</div>
+
+<div class="question">
+<p><a name="langirgen">I'd like to write a self-hosting LLVM compiler. How
+ should I interface with the LLVM middle-end optimizers and back-end code
+ generators?</a></p>
+</div>
+
+<div class="answer">
+<p>Your compiler front-end will communicate with LLVM by creating a module in
+ the LLVM intermediate representation (IR) format. Assuming you want to write
+ your language's compiler in the language itself (rather than C++), there are
+ 3 major ways to tackle generating LLVM IR from a front-end:</p>
+
+<ul>
+ <li><strong>Call into the LLVM library code using your language's FFI
+ (foreign function interface).</strong>
+
+ <ul>
+ <li><em>for:</em> best tracks changes to the LLVM IR, .ll syntax, and .bc
+ format</li>
+
+ <li><em>for:</em> enables running LLVM optimization passes without an
+ emit/parse overhead</li>
+
+ <li><em>for:</em> adapts well to a JIT context</li>
+
+ <li><em>against:</em> lots of ugly glue code to write</li>
+ </ul></li>
+
+ <li> <strong>Emit LLVM assembly from your compiler's native language.</strong>
+ <ul>
+ <li><em>for:</em> very straightforward to get started</li>
+
+ <li><em>against:</em> the .ll parser is slower than the bitcode reader
+ when interfacing to the middle end</li>
+
+ <li><em>against:</em> you'll have to re-engineer the LLVM IR object model
+ and asm writer in your language</li>
+
+ <li><em>against:</em> it may be harder to track changes to the IR</li>
+ </ul></li>
+
+ <li><strong>Emit LLVM bitcode from your compiler's native language.</strong>
+
+ <ul>
+ <li><em>for:</em> can use the more-efficient bitcode reader when
+ interfacing to the middle end</li>
+
+ <li><em>against:</em> you'll have to re-engineer the LLVM IR object
+ model and bitcode writer in your language</li>
+
+ <li><em>against:</em> it may be harder to track changes to the IR</li>
+ </ul></li>
+</ul>
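+<p>As a rough illustration of the second option, the sketch below builds the
+ text of a trivial function as LLVM assembly. It is written in C++ only for
+ concreteness. <tt>emitAddFunction</tt> is a hypothetical helper, not part of
+ LLVM, and a real front-end would generate this text from its own IR data
+ structures rather than from a fixed template.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM API): renders a function that adds two
// i32 arguments as textual LLVM assembly.
std::string emitAddFunction(const std::string &name) {
  assert(!name.empty() && "function name must not be empty");
  std::ostringstream out;
  out << "define i32 @" << name << "(i32 %a, i32 %b) {\n"
      << "entry:\n"
      << "  %sum = add i32 %a, %b\n"
      << "  ret i32 %sum\n"
      << "}\n";
  return out.str();
}
```

+<p>The returned string can be written to a <tt>.ll</tt> file and handed to the
+ LLVM tools from there.</p>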
+
+<p>If you go with the first option, the C bindings in include/llvm-c should help
+ a lot, since most languages have strong support for interfacing with C. The
+ most common hurdle with calling C from managed code is interfacing with the
+ garbage collector. The C interface was designed to require very little memory
+ management, and so is straightforward in this regard.</p>
+</div>
+
+<div class="question">
+<p><a name="langhlsupp">What support is there for higher-level source language
+ constructs for building a compiler?</a></p>
+</div>
+
+<div class="answer">
+<p>Currently, there isn't much. LLVM supports an intermediate representation
+ which is useful for code representation but will not support the high level
+ (abstract syntax tree) representation needed by most compilers. There are no
+ facilities for lexical or semantic analysis. There is, however, a <i>mostly
+ implemented</i> configuration-driven
+ <a href="CompilerDriver.html">compiler driver</a> which simplifies the task
+ of running optimizations, linking, and executable generation.</p>
+</div>
+
+<div class="question">
+<p><a name="getelementptr">I don't understand the GetElementPtr
+ instruction. Help!</a></p>
+</div>
+
+<div class="answer">
+<p>See <a href="GetElementPtr.html">The Often Misunderstood GEP
+ Instruction</a>.</p>
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section">
+ <a name="cfe">Using the GCC Front End</a>
+</div>
+
+<div class="question">
+<p>When I compile software that uses a configure script, the configure script
+ thinks my system has all of the header files and libraries it is testing for.
+ How do I get configure to work correctly?</p>
+</div>
+
+<div class="answer">
+<p>The configure script is getting things wrong because the LLVM linker allows
+ symbols to be undefined at link time (so that they can be resolved during JIT
+ or translation to the C back end). That is why configure thinks your system
+ "has everything."</p>
+
+<p>To work around this, perform the following steps:</p>
+
+<ol>
+ <li>Make sure the CC and CXX environment variables contain the full path to
+ the LLVM GCC front end.</li>
+
+ <li>Make sure that the regular C compiler is first in your PATH. </li>
+
+ <li>Add the string "-Wl,-native" to your CFLAGS environment variable.</li>
+</ol>
+
+<p>This will allow the <tt>llvm-ld</tt> linker to create a native code
+ executable instead of a shell script that runs the JIT. Creating native code
+ requires standard linkage, which in turn will allow the configure script to
+ find out if code is not linking on your system because the feature isn't
+ available on your system.</p>
+</div>
+
+<div class="question">
+<p>When I compile code using the LLVM GCC front end, it complains that it cannot
+ find libcrtend.a.
+</p>
+</div>
+
+<div class="answer">
+<p>The only way this can happen is if you haven't installed the runtime
+ library. To correct this, do:</p>
+
+<pre class="doc_code">
+% cd llvm/runtime
+% make clean ; make install-bytecode
+</pre>
+</div>
+
+<div class="question">
+<p>How can I disable all optimizations when compiling code using the LLVM GCC
+ front end?</p>
+</div>
+
+<div class="answer">
+<p>Passing "-Wa,-disable-opt -Wl,-disable-opt" will disable <em>all</em> cleanup
+ and optimizations done at the LLVM level, leaving you with the truly horrible
+ code that you desire.</p>
+</div>
+
+
+<div class="question">
+<p><a name="translatecxx">Can I use LLVM to convert C++ code to C code?</a></p>
+</div>
+
+<div class="answer">
+<p>Yes, you can use LLVM to convert code from any language LLVM supports to C.
+ Note that the generated C code will be very low level (all loops are lowered
+ to gotos, etc) and not very pretty (comments are stripped, original source
+ formatting is totally lost, variables are renamed, expressions are
+ regrouped), so this may not be what you're looking for. Also, there are
+ several limitations noted below.</p>
+
+<p>Use commands like this:</p>
+
+<ol>
+ <li><p>Compile your program with llvm-g++:</p>
+
+<pre class="doc_code">
+% llvm-g++ -emit-llvm x.cpp -o program.bc -c
+</pre>
+
+ <p>or:</p>
+
+<pre class="doc_code">
+% llvm-g++ a.cpp -c -emit-llvm
+% llvm-g++ b.cpp -c -emit-llvm
+% llvm-ld a.o b.o -o program
+</pre>
+
+ <p>This will generate program and program.bc. The .bc
+ file is the LLVM version of the program all linked together.</p></li>
+
+ <li><p>Convert the LLVM code to C code, using the LLC tool with the C
+ backend:</p>
+
+<pre class="doc_code">
+% llc -march=c program.bc -o program.c
+</pre></li>
+
+ <li><p>Finally, compile the C file:</p>
+
+<pre class="doc_code">
+% cc program.c -lstdc++
+</pre></li>
+
+</ol>
+
+<p>Using LLVM does not eliminate the need for C++ library support. If you use
+ the llvm-g++ front-end, the generated code will depend on g++'s C++ support
+ libraries in the same way that code generated from g++ would. If you use
+ another C++ front-end, the generated code will depend on whatever library
+ that front-end would normally require.</p>
+
+<p>If you are working on a platform that does not provide any C++ libraries, you
+ may be able to manually compile libstdc++ to LLVM bitcode, statically link it
+ into your program, then use the commands above to convert the whole result
+ into C code. Alternatively, you might compile the libraries and your
+ application into two different chunks of C code and link them.</p>
+
+<p>Note that, by default, the C back end does not support exception handling.
+ If you want/need it for a certain program, you can enable it by passing
+ "-enable-correct-eh-support" to the llc program. The resultant code will use
+ setjmp/longjmp to implement exception support that is relatively slow, and
+ not C++-ABI-conforming on most platforms, but otherwise correct.</p>
+
+<p>Also, there are a number of other limitations of the C backend that cause it
+ to produce code that does not fully conform to the C++ ABI on most
+ platforms. Some of the C++ programs in LLVM's test suite are known to fail
+ when compiled with the C back end because of ABI incompatibilities with
+ standard C++ libraries.</p>
+</div>
+
+<div class="question">
+<p><a name="platformindependent">Can I compile C or C++ code to
+ platform-independent LLVM bitcode?</a></p>
+</div>
+
+<div class="answer">
+<p>No. C and C++ are inherently platform-dependent languages. The most obvious
+ example of this is the preprocessor. A very common way that C code is made
+ portable is by using the preprocessor to include platform-specific code. In
+ practice, information about other platforms is lost after preprocessing, so
+ the result is inherently dependent on the platform that the preprocessing was
+ targeting.</p>
+
+<p>Another example is <tt>sizeof</tt>. It's common for <tt>sizeof(long)</tt> to
+ vary between platforms. In most C front-ends, <tt>sizeof</tt> is expanded to
+ a constant immediately, thus hard-wiring a platform-specific detail.</p>
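+<p>Both effects can be seen in a small sketch. The names below are hypothetical
+ and chosen only for illustration. By the time IR is emitted, exactly one
+ branch of the <tt>#if</tt> remains and <tt>sizeof(long)</tt> has already been
+ replaced by a number for the target being compiled for.</p>

```cpp
#include <cassert>
#include <cstddef>

// Only one of these definitions survives preprocessing; the other platform's
// code never reaches the front-end, let alone the IR.
#if defined(_WIN32)
static const char *const kPlatform = "windows";
#else
static const char *const kPlatform = "other";
#endif

const char *platformName() { return kPlatform; }

// sizeof(long) is folded to a target-specific constant by the front-end, so
// the multiplication below is by a fixed number in the generated IR.
std::size_t bytesForLongs(std::size_t count) {
  return count * sizeof(long);
}
```
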
+
+<p>Also, since many platforms define their ABIs in terms of C, and since LLVM is
+ lower-level than C, front-ends currently must emit platform-specific IR in
+ order to have the result conform to the platform ABI.</p>
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section">
+ <a name="cfe_code">Questions about code generated by the GCC front-end</a>
+</div>
+
+<div class="question">
+<p><a name="iosinit">What is this <tt>llvm.global_ctors</tt> and
+ <tt>_GLOBAL__I__tmp_webcompile...</tt> stuff that happens when I <tt>#include
+ &lt;iostream&gt;</tt>?</a></p>
+</div>
+
+<div class="answer">
+<p>If you <tt>#include</tt> the <tt>&lt;iostream&gt;</tt> header into a C++
+ translation unit, the file will probably use
+ the <tt>std::cin</tt>/<tt>std::cout</tt>/... global objects. However, C++
+ does not guarantee an order of initialization between static objects in
+ different translation units, so if a static ctor/dtor in your .cpp file
+ used <tt>std::cout</tt>, for example, the object would not necessarily be
+ automatically initialized before your use.</p>
+
+<p>To make <tt>std::cout</tt> and friends work correctly in these scenarios, the
+ STL that we use declares a static object that gets created in every
+ translation unit that includes <tt>&lt;iostream&gt;</tt>. This object has a
+ static constructor and destructor that initialize and destroy the global
+ iostream objects before they could possibly be used in the file. The code
+ that you see in the .ll file corresponds to the constructor and destructor
+ registration code.
+</p>
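+<p>The idiom can be sketched roughly as follows. This is a simplified,
+ hypothetical stand-in and not the library's actual code: the names
+ <tt>Init</tt>, <tt>g_ioinit</tt>, <tt>g_initCount</tt> and
+ <tt>g_streamsReady</tt> are invented for illustration, and a boolean flag
+ stands in for the real stream objects.</p>

```cpp
#include <cassert>

// Hypothetical stand-ins for the library's internal state.
static int g_initCount = 0;          // number of live Init objects
static bool g_streamsReady = false;  // stands in for the global streams

struct Init {
  // The first object to be constructed sets up the streams.
  Init() { if (g_initCount++ == 0) g_streamsReady = true; }
  // The last object to be destroyed tears them down.
  ~Init() { if (--g_initCount == 0) g_streamsReady = false; }
};

// A header would define one such object per translation unit that includes
// it, so setup runs before that unit's later static constructors.
static Init g_ioinit;

bool streamsReady() { return g_streamsReady; }
```

+<p>The constructor and destructor calls for an object like
+ <tt>g_ioinit</tt> are what show up as registration code in the generated
+ output.</p>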
+
+<p>If you would like to make it easier to <b>understand</b> the LLVM code
+ generated by the compiler in the demo page, consider using <tt>printf()</tt>
+ instead of <tt>iostream</tt>s to print values.</p>
+</div>
+
+<!--=========================================================================-->
+
+<div class="question">
+<p><a name="codedce">Where did all of my code go??</a></p>
+</div>
+
+<div class="answer">
+<p>If you are using the LLVM demo page, you may often wonder what happened to
+ all of the code that you typed in. Remember that the demo script is running
+ the code through the LLVM optimizers, so if your code doesn't actually do
+ anything useful, it might all be deleted.</p>
+
+<p>To prevent this, make sure that the code is actually needed. For example, if
+ you are computing some expression, return the value from the function instead
+ of leaving it in a local variable. If you really want to constrain the
+ optimizer, you can read from and assign to <tt>volatile</tt> global
+ variables.</p>
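+<p>A minimal sketch of both suggestions, using hypothetical names: the value is
+ read from and written to <tt>volatile</tt> globals, and it is also returned
+ from the function, so the optimizer has to keep the computation.</p>

```cpp
#include <cassert>

// Hypothetical globals: volatile accesses must be preserved by the optimizer.
volatile int g_input = 6;
volatile int g_output = 0;

int compute() {
  int x = g_input;      // volatile read: the value cannot be assumed
  int result = x * 7;   // stays live because it is stored and returned
  g_output = result;    // volatile write: must be emitted
  return result;
}
```
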
+</div>
+
+<!--=========================================================================-->
+
+<div class="question">
+<p><a name="undef">What is this "<tt>undef</tt>" thing that shows up in my
+ code?</a></p>
+</div>
+
+<div class="answer">
+<p><a href="LangRef.html#undef"><tt>undef</tt></a> is the LLVM way of
+ representing a value that is not defined. You can get these if you do not
+ initialize a variable before you use it. For example, the C function:</p>
+
+<pre class="doc_code">
+int X() { int i; return i; }
+</pre>
+
+<p>Is compiled to "<tt>ret i32 undef</tt>" because "<tt>i</tt>" never has a
+ value specified for it.</p>
+</div>
+
+<!--=========================================================================-->
+
+<div class="question">
+<p><a name="callconvwrong">Why does instcombine + simplifycfg turn
+ a call to a function with a mismatched calling convention into "unreachable"?
+ Why not make the verifier reject it?</a></p>
+</div>
+
+<div class="answer">
+<p>This is a common problem run into by authors of front-ends that are using
+custom calling conventions: you need to make sure to set the right calling
+convention on both the function and on each call to the function. For example,
+this code:</p>
+
+<pre class="doc_code">
+define fastcc void @foo() {
+ ret void
+}
+define void @bar() {
+ call void @foo( )
+ ret void
+}
+</pre>
+
+<p>Is optimized to:</p>
+
+<pre class="doc_code">
+define fastcc void @foo() {
+ ret void
+}
+define void @bar() {
+ unreachable
+}
+</pre>
+
+<p>... with "opt -instcombine -simplifycfg". This often bites people because
+"all their code disappears". Setting the calling convention on the caller and
+callee is required for indirect calls to work, so people often ask why not make
+the verifier reject this sort of thing.</p>
+
+<p>The answer is that this code has undefined behavior, but it is not illegal.
+If we made it illegal, then every transformation that could potentially create
+this would have to ensure that it doesn't, and there is valid code that can
+create this sort of construct (in dead code). The sorts of things that can
+cause this to happen are fairly contrived, but we still need to accept them.
+Here's an example:</p>
+
+<pre class="doc_code">
+define fastcc void @foo() {
+ ret void
+}
+define internal void @bar(void()* %FP, i1 %cond) {
+ br i1 %cond, label %T, label %F
+T:
+ call void %FP()
+ ret void
+F:
+ call fastcc void %FP()
+ ret void
+}
+define void @test() {
+ %X = or i1 false, false
+ call void @bar(void()* @foo, i1 %X)
+ ret void
+}
+</pre>
+
+<p>In this example, "test" always passes @foo/false into bar, which ensures that
+ it is dynamically called with the right calling convention (thus, the code is
+ perfectly well defined). If you run this through the inliner, you get this
+ (the explicit "or" is there so that the inliner doesn't dead code eliminate
+ a bunch of stuff):
+</p>
+
+<pre class="doc_code">
+define fastcc void @foo() {
+ ret void
+}
+define void @test() {
+ %X = or i1 false, false
+ br i1 %X, label %T.i, label %F.i
+T.i:
+ call void @foo()
+ br label %bar.exit
+F.i:
+ call fastcc void @foo()
+ br label %bar.exit
+bar.exit:
+ ret void
+}
+</pre>
+
+<p>Here you can see that the inlining pass made an undefined call to @foo with
+ the wrong calling convention. We really don't want to make the inliner have
+ to know about this sort of thing, so it needs to be valid code. In this case,
+ dead code elimination can trivially remove the undefined code. However, if %X
+ was an input argument to @test, the inliner would produce this:
+</p>
+
+<pre class="doc_code">
+define fastcc void @foo() {
+ ret void
+}
+
+define void @test(i1 %X) {
+ br i1 %X, label %T.i, label %F.i
+T.i:
+ call void @foo()
+ br label %bar.exit
+F.i:
+ call fastcc void @foo()
+ br label %bar.exit
+bar.exit:
+ ret void
+}
+</pre>
+
+<p>The interesting thing about this is that %X <em>must</em> be false for the
+code to be well-defined, but no amount of dead code elimination will be able to
+delete the broken call as unreachable. However, since instcombine/simplifycfg
+turns the undefined call into unreachable, we end up with a branch on a
+condition that goes to unreachable: a branch to unreachable can never happen, so
+"-inline -instcombine -simplifycfg" is able to produce:</p>
+
+<pre class="doc_code">
+define fastcc void @foo() {
+ ret void
+}
+define void @test(i1 %X) {
+F.i:
+ call fastcc void @foo()
+ ret void
+}
+</pre>