docs/CodingStandards.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3 <html>
   4 <head>
   5   <link rel="stylesheet" href="llvm.css" type="text/css">
   6   <title>LLVM Coding Standards</title>
   7 </head>
   8 <body>
   9
  10 <div class="doc_title">
  11   LLVM Coding Standards
  12 </div>
  13
  14 <ol>
  15   <li><a href="#introduction">Introduction</a></li>
  16   <li><a href="#mechanicalissues">Mechanical Source Issues</a>
  17     <ol>
  18       <li><a href="#sourceformating">Source Code Formatting</a>
  19         <ol>
  20           <li><a href="#scf_commenting">Commenting</a></li>
  21           <li><a href="#scf_commentformat">Comment Formatting</a></li>
  22           <li><a href="#scf_includes"><tt>#include</tt> Style</a></li>
  23           <li><a href="#scf_codewidth">Source Code Width</a></li>
  24           <li><a href="#scf_spacestabs">Use Spaces Instead of Tabs</a></li>
  25           <li><a href="#scf_indentation">Indent Code Consistently</a></li>
  26         </ol></li>
  27       <li><a href="#compilerissues">Compiler Issues</a>
  28         <ol>
  29           <li><a href="#ci_warningerrors">Treat Compiler Warnings Like
  30               Errors</a></li>
  31           <li><a href="#ci_portable_code">Write Portable Code</a></li>
  32           <li><a href="#ci_rtti_exceptions">Do not use RTTI or Exceptions</a></li>
  33           <li><a href="#ci_class_struct">Use of <tt>class</tt>/<tt>struct</tt> Keywords</a></li>
  34         </ol></li>
  35     </ol></li>
  36   <li><a href="#styleissues">Style Issues</a>
  37     <ol>
  38       <li><a href="#macro">The High-Level Issues</a>
  39         <ol>
  40           <li><a href="#hl_module">A Public Header File <b>is</b> a
  41               Module</a></li>
  42           <li><a href="#hl_dontinclude"><tt>#include</tt> as Little as Possible</a></li>
  43           <li><a href="#hl_privateheaders">Keep "internal" Headers
  44               Private</a></li>
  45           <li><a href="#hl_earlyexit">Use Early Exits and <tt>continue</tt> to Simplify
  46               Code</a></li>
  47           <li><a href="#hl_else_after_return">Don't use <tt>else</tt> after a
  48               <tt>return</tt></a></li>
  49           <li><a href="#hl_predicateloops">Turn Predicate Loops into Predicate
  50               Functions</a></li>
  51         </ol></li>
  52       <li><a href="#micro">The Low-Level Issues</a>
  53         <ol>
  54           <li><a href="#ll_assert">Assert Liberally</a></li>
  55           <li><a href="#ll_ns_std">Do not use '<tt>using namespace std</tt>'</a></li>
  56           <li><a href="#ll_virtual_anch">Provide a virtual method anchor for
  57               classes in headers</a></li>
  58           <li><a href="#ll_end">Don't evaluate <tt>end()</tt> every time through a
  59               loop</a></li>
  60           <li><a href="#ll_iostream"><tt>#include &lt;iostream&gt;</tt> is
  61               <em>forbidden</em></a></li>
  62           <li><a href="#ll_raw_ostream">Use <tt>raw_ostream</tt></a></li>
  63           <li><a href="#ll_avoidendl">Avoid <tt>std::endl</tt></a></li>
  64         </ol></li>
  65
  66       <li><a href="#nano">Microscopic Details</a>
  67         <ol>
  68           <li><a href="#micro_spaceparen">Spaces Before Parentheses</a></li>
  69           <li><a href="#micro_preincrement">Prefer Preincrement</a></li>
  70           <li><a href="#micro_namespaceindent">Namespace Indentation</a></li>
  71           <li><a href="#micro_anonns">Anonymous Namespaces</a></li>
  72         </ol></li>
  73
  74
  75     </ol></li>
  76   <li><a href="#seealso">See Also</a></li>
  77 </ol>
  78
  79 <div class="doc_author">
  80   <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
  81 </div>
  82
  83
  84 <!-- *********************************************************************** -->
  85 <div class="doc_section">
  86   <a name="introduction">Introduction</a>
  87 </div>
  88 <!-- *********************************************************************** -->
  89
  90 <div class="doc_text">
  91
  92 <p>This document attempts to describe a few coding standards that are being used
  93 in the LLVM source tree.  Although no coding standards should be regarded as
  94 absolute requirements to be followed in all instances, coding standards can be
  95 useful.</p>
  96
  97 <p>This document intentionally does not prescribe fixed standards for religious
  98 issues such as brace placement and space usage.  For issues like this, follow
  99 the golden rule:</p>
 100
 101 <blockquote>
 102
 103 <p><b><a name="goldenrule">If you are adding a significant body of source to a
 104 project, feel free to use whatever style you are most comfortable with.  If you
 105 are extending, enhancing, or bug fixing already implemented code, use the style
 106 that is already being used so that the source is uniform and easy to
 107 follow.</a></b></p>
 108
 109 </blockquote>
 110
 111 <p>The ultimate goal of these guidelines is to increase readability and
 112 maintainability of our common source base. If you have suggestions for topics to
 113 be included, please mail them to <a
 114 href="mailto:sabre@nondot.org">Chris</a>.</p>
 115
 116 </div>
 117
 118 <!-- *********************************************************************** -->
 119 <div class="doc_section">
 120   <a name="mechanicalissues">Mechanical Source Issues</a>
 121 </div>
 122 <!-- *********************************************************************** -->
 123
 124 <!-- ======================================================================= -->
 125 <div class="doc_subsection">
 126   <a name="sourceformating">Source Code Formatting</a>
 127 </div>
 128
 129 <!-- _______________________________________________________________________ -->
 130 <div class="doc_subsubsection">
 131   <a name="scf_commenting">Commenting</a>
 132 </div>
 133
 134 <div class="doc_text">
 135
 136 <p>Comments are one critical part of readability and maintainability.  Everyone
 137 knows they should comment, so should you.  When writing comments, write them as
 138 English prose, which means they should use proper capitalization, punctuation,
 139 etc.  Although we all should probably
 140 comment our code more than we do, there are a few critical places where
 141 documentation is especially useful:</p>
 142
 143 <b>File Headers</b>
 144
 145 <p>Every source file should have a header on it that describes the basic
 146 purpose of the file.  If a file does not have a header, it should not be
 147 checked into Subversion.  Most source trees will probably have a standard
 148 file header format.  The standard format for the LLVM source tree looks like
 149 this:</p>
 150
 151 <div class="doc_code">
 152 <pre>
 153 //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//
 154 //
 155 //                     The LLVM Compiler Infrastructure
 156 //
 157 // This file is distributed under the University of Illinois Open Source
 158 // License. See LICENSE.TXT for details.
 159 //
 160 //===----------------------------------------------------------------------===//
 161 //
 162 // This file contains the declaration of the Instruction class, which is the
 163 // base class for all of the VM instructions.
 164 //
 165 //===----------------------------------------------------------------------===//
 166 </pre>
 167 </div>
 168
 169 <p>A few things to note about this particular format:  The "<tt>-*- C++
 170 -*-</tt>" string on the first line is there to tell Emacs that the source file
 171 is a C++ file, not a C file (Emacs assumes <tt>.h</tt> files are C files by default).
 172 Note that this tag is not necessary in <tt>.cpp</tt> files.  The name of the file is also
 173 on the first line, along with a very short description of the purpose of the
 174 file.  This is important when printing out code and flipping through lots of
 175 pages.</p>
 176
 177 <p>The next section in the file is a concise note that defines the license
 178 that the file is released under.  This makes it perfectly clear what terms the
 179 source code can be distributed under and should not be modified in any way.</p>
 180
 181 <p>The main body of the description does not have to be very long in most cases.
 182 Here it's only two lines.  If an algorithm is being implemented or something
 183 tricky is going on, a reference to the paper where it is published should be
 184 included, as well as any notes or "gotchas" in the code to watch out for.</p>
 185
 186 <b>Class overviews</b>
 187
 188 <p>Classes are one fundamental part of a good object oriented design.  As such,
 189 a class definition should have a comment block that explains what the class is
 190 used for... if it's not obvious.  If it's so completely obvious your grandma
 191 could figure it out, it's probably safe to leave it out.  Naming classes
 192 something sane goes a long way towards avoiding writing documentation.</p>
 193
 194
 195 <b>Method information</b>
 196
 197 <p>Methods defined in a class (as well as any global functions) should also be
 198 documented properly.  A quick note about what it does and a description of the
 199 borderline behaviour is all that is necessary here (unless something
 200 particularly tricky or insidious is going on).  The hope is that people can
 201 figure out how to use your interfaces without reading the code itself... that is
 202 the goal metric.</p>
 203
 204 <p>Good things to talk about here are what happens when something unexpected
 205 happens: does the method return null?  Abort?  Format your hard disk?</p>
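<p>As a rough illustration of the kind of comment meant here, the sketch below
documents a small function, including what it does on unexpected input.  The
function <tt>findFirstSpace</tt> is hypothetical and exists only for this
example; it is not part of LLVM.</p>

```cpp
/// findFirstSpace - Return the index of the first space character in Str.
/// Return -1 if Str is null, if Str is empty, or if Str contains no space.
/// (Hypothetical helper, written only to illustrate the comment style.)
int findFirstSpace(const char *Str) {
  // Borderline case: a null pointer is tolerated rather than dereferenced.
  if (!Str)
    return -1;

  for (int i = 0; Str[i] != '\0'; ++i)
    if (Str[i] == ' ')
      return i;

  // No space was found (this also covers the empty string).
  return -1;
}
```

<p>A caller can tell from the comment alone that a null pointer and an empty
string are both handled, without reading the loop.</p>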
 206
 207 </div>
 208
 209 <!-- _______________________________________________________________________ -->
 210 <div class="doc_subsubsection">
 211   <a name="scf_commentformat">Comment Formatting</a>
 212 </div>
 213
 214 <div class="doc_text">
 215
 216 <p>In general, prefer C++ style (<tt>//</tt>) comments.  They take less space,
 217 require less typing, don't have nesting problems, etc.  There are a few cases
 218 when it is useful to use C style (<tt>/* */</tt>) comments however:</p>
 219
 220 <ol>
 221   <li>When writing C code: Obviously if you are writing C code, use C style
 222       comments.</li>
 223   <li>When writing a header file that may be <tt>#include</tt>d by a C source
 224       file.</li>
 225   <li>When writing a source file that is used by a tool that only accepts C
 226       style comments.</li>
 227 </ol>
 228
 229 <p>To comment out a large block of code, use <tt>#if 0</tt> and <tt>#endif</tt>.
 230 These nest properly and are better behaved in general than C style comments.</p>
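<p>A minimal sketch of the nesting behaviour, assuming a hypothetical function
<tt>scaleValue</tt> invented for this example: the disabled region below
contains both a C style comment and another <tt>#if 0</tt> block, and the
preprocessor still skips exactly the intended lines.</p>

```cpp
// Hypothetical function used only to illustrate disabling code with #if 0.
int scaleValue(int X) {
#if 0
  /* Old implementation, disabled.  Wrapping this region in a C style comment
     instead would fail, because this comment's terminator would end it. */
  #if 0
  X += 1;  // Nested disabled region: the inner #endif pairs with this #if.
  #endif
  return X * 3;
#endif
  // Only this statement is compiled.
  return X * 2;
}
```

<p>Calling <tt>scaleValue(4)</tt> runs only the final <tt>return</tt>, since
everything between the outer <tt>#if 0</tt> and its <tt>#endif</tt> is
skipped.</p>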
 231
 232 </div>
 233
 234 <!-- _______________________________________________________________________ -->
 235 <div class="doc_subsubsection">
 236   <a name="scf_includes"><tt>#include</tt> Style</a>
 237 </div>
 238
 239 <div class="doc_text">
 240
 241 <p>Immediately after the <a href="#scf_commenting">header file comment</a> (and
 242 include guards if working on a header file), the <a
 243 href="#hl_dontinclude">minimal</a> list of <tt>#include</tt>s required by the
 244 file should be listed.  We prefer these <tt>#include</tt>s to be listed in this
 245 order:</p>
 246
 247 <ol>
 248   <li><a href="#mmheader">Main Module Header</a></li>
 249   <li><a href="#hl_privateheaders">Local/Private Headers</a></li>
 250   <li><tt>llvm/*</tt></li>
 251   <li><tt>llvm/Analysis/*</tt></li>
 252   <li><tt>llvm/Assembly/*</tt></li>
 253   <li><tt>llvm/Bitcode/*</tt></li>
 254   <li><tt>llvm/CodeGen/*</tt></li>
 255   <li>...</li>
 256   <li><tt>Support/*</tt></li>
 257   <li><tt>Config/*</tt></li>
 258   <li>System <tt>#includes</tt></li>
 259 </ol>
 260
 261 <p>... and each category should be sorted by name.</p>
 262
 263 <p><a name="mmheader">The "Main Module Header"</a> file applies to <tt>.cpp</tt> files
 264 which implement an interface defined by a <tt>.h</tt> file.  This <tt>#include</tt>
 265 should always be included <b>first</b> regardless of where it lives on the file
 266 system.  By including a header file first in the <tt>.cpp</tt> files that implement the
 267 interfaces, we ensure that the header does not have any hidden dependencies
 268 which are not explicitly #included in the header, but should be.  It is also a
 269 form of documentation in the <tt>.cpp</tt> file to indicate where the interfaces it
 270 implements are defined.</p>
 271
 272 </div>
 273
 274 <!-- _______________________________________________________________________ -->
 275 <div class="doc_subsubsection">
 276   <a name="scf_codewidth">Source Code Width</a>
 277 </div>
 278
 279 <div class="doc_text">
 280
 281 <p>Write your code to fit within 80 columns of text.  This helps those of us who
 282 like to print out code and look at your code in an xterm without resizing
 283 it.</p>
 284
 285 <p>The longer answer is that there must be some limit to the width of the code
 286 in order to reasonably allow developers to have multiple files side-by-side in
 287 windows on a modest display.  If you are going to pick a width limit, it is
 288 somewhat arbitrary but you might as well pick something standard.  Going with
 289 90 columns (for example) instead of 80 columns wouldn't add any significant
 290 value and would be detrimental to printing out code.  Also many other projects
 291 have standardized on 80 columns, so some people have already configured their
 292 editors for it (vs something else, like 90 columns).</p>
 293
 294 <p>This is one of many contentious issues in coding standards, but is not up
 295 for debate.</p>
 296
 297 </div>
 298
 299 <!-- _______________________________________________________________________ -->
 300 <div class="doc_subsubsection">
 301   <a name="scf_spacestabs">Use Spaces Instead of Tabs</a>
 302 </div>
 303
 304 <div class="doc_text">
 305
 306 <p>In all cases, prefer spaces to tabs in source files.  People have different
 307 preferred indentation levels, and different styles of indentation that they
 308 like... this is fine.  What isn't is that different editors/viewers expand tabs
 309 out to different tab stops.  This can cause your code to look completely
 310 unreadable, and it is not worth dealing with.</p>
 311
 312 <p>As always, follow the <a href="#goldenrule">Golden Rule</a> above: follow the
 313 style of existing code if you are modifying and extending it.  If you like four
 314 spaces of indentation, <b>DO NOT</b> do that in the middle of a chunk of code
 315 with two spaces of indentation.  Also, do not reindent a whole source file: it
 316 makes for incredible diffs that are absolutely worthless.</p>
 317
 318 </div>
 319
 320 <!-- _______________________________________________________________________ -->
 321 <div class="doc_subsubsection">
 322   <a name="scf_indentation">Indent Code Consistently</a>
 323 </div>
 324
 325 <div class="doc_text">
 326
 327 <p>Okay, in your first year of programming you were told that indentation is
 328 important.  If you didn't believe and internalize this then, now is the time.
 329 Just do it.</p>
 330
 331 </div>
 332
 333
 334 <!-- ======================================================================= -->
 335 <div class="doc_subsection">
 336   <a name="compilerissues">Compiler Issues</a>
 337 </div>
 338
 339
 340 <!-- _______________________________________________________________________ -->
 341 <div class="doc_subsubsection">
 342   <a name="ci_warningerrors">Treat Compiler Warnings Like Errors</a>
 343 </div>
 344
 345 <div class="doc_text">
 346
 347 <p>If your code has compiler warnings in it, something is wrong: you aren't
 348 casting values correctly, you have "questionable" constructs in your code, or
 349 you are doing something legitimately wrong.  Compiler warnings can cover up
 350 legitimate errors in output and make dealing with a translation unit
 351 difficult.</p>
 352
 353 <p>It is not possible to prevent all warnings from all compilers, nor is it
 354 desirable.  Instead, pick a standard compiler (like <tt>gcc</tt>) that provides
 355 a good thorough set of warnings, and stick to it.  At least in the case of
 356 <tt>gcc</tt>, it is possible to work around any spurious warnings by changing the
 357 syntax of the code slightly.  For example, a warning that annoys me occurs when
 358 I write code like this:</p>
 359
 360 <div class="doc_code">
 361 <pre>
 362 if (V = getValue()) {
 363   ...
 364 }
 365 </pre>
 366 </div>
 367
 368 <p><tt>gcc</tt> will warn me that I probably want to use the <tt>==</tt>
 369 operator, and that I probably mistyped it.  In most cases, I haven't, and I
 370 really don't want the spurious warnings.  To fix this particular problem, I
 371 rewrite the code like this:</p>
 372
 373 <div class="doc_code">
 374 <pre>
 375 if ((V = getValue())) {
 376   ...
 377 }
 378 </pre>
 379 </div>
 380
 381 <p>...which shuts <tt>gcc</tt> up.  Any <tt>gcc</tt> warning that annoys you can
 382 be fixed by massaging the code appropriately.</p>
 383
 384 <p>These are the <tt>gcc</tt> warnings that I prefer to enable: <tt>-Wall
 385 -Winline -W -Wwrite-strings -Wno-unused</tt></p>
 386
 387 </div>
 388
 389 <!-- _______________________________________________________________________ -->
 390 <div class="doc_subsubsection">
 391   <a name="ci_portable_code">Write Portable Code</a>
 392 </div>
 393
 394 <div class="doc_text">
 395
 396 <p>In almost all cases, it is possible and within reason to write completely
 397 portable code.  If there are cases where it isn't possible to write portable
 398 code, isolate it behind a well defined (and well documented) interface.</p>
 399
 400 <p>In practice, this means that you shouldn't assume much about the host
 401 compiler, and Visual Studio tends to be the lowest common denominator.
 402 If advanced features are used, they should only be an implementation detail of
 403 a library which has a simple exposed API, and preferably be buried in
 404 libSystem.</p>
 405
 406 </div>
 407
 408 <!-- _______________________________________________________________________ -->
 409 <div class="doc_subsubsection">
 410 <a name="ci_rtti_exceptions">Do not use RTTI or Exceptions</a>
 411 </div>
 412 <div class="doc_text">
 413
 414 <p>LLVM does not use RTTI (e.g. dynamic_cast&lt;&gt;) or exceptions, in an
 415 effort to reduce code and executable size.  These two language features violate
 416 the general C++ principle of "you only pay for what you use", causing executable
 417 bloat even if exceptions are never used in a code base, or if RTTI is never used
 418 for a class.  Because of this, we turn them off globally in the code.
 419 </p>
 420
 421 <p>
 422 That said, LLVM does make extensive use of a hand-rolled form of RTTI that uses
 423 templates like <a href="ProgrammersManual.html#isa">isa&lt;&gt;, cast&lt;&gt;,
 424 and dyn_cast&lt;&gt;</a>.  This form of RTTI is opt-in and can be added to any
 425 class.  It is also substantially more efficient than dynamic_cast&lt;&gt;.
 426 </p>
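<p>The opt-in idea can be sketched roughly as follows.  This is not LLVM's
actual implementation: the <tt>isa</tt>, <tt>cast</tt>, and <tt>dyn_cast</tt>
templates below are simplified stand-ins, and the <tt>Shape</tt>,
<tt>Circle</tt>, and <tt>Square</tt> classes are hypothetical.  The sketch
assumes each class stores a kind tag and opts in by providing a static
<tt>classof</tt> method.</p>

```cpp
#include <cassert>

// Hypothetical class hierarchy; the base class carries a kind tag.
class Shape {
public:
  enum ShapeKind { SK_Circle, SK_Square };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }
private:
  const ShapeKind Kind;
};

class Circle : public Shape {
public:
  explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double getRadius() const { return Radius; }
  // Opt in: tell the cast templates how to recognize a Circle.
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
private:
  double Radius;
};

class Square : public Shape {
public:
  explicit Square(double S) : Shape(SK_Square), Side(S) {}
  double getSide() const { return Side; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
private:
  double Side;
};

// Simplified stand-ins for the real templates (pointer arguments only).
template <typename To, typename From>
bool isa(const From *V) { return To::classof(V); }

template <typename To, typename From>
To *cast(From *V) {
  assert(isa<To>(V) && "cast<> used on an incompatible type!");
  return static_cast<To *>(V);
}

template <typename To, typename From>
To *dyn_cast(From *V) {
  return isa<To>(V) ? static_cast<To *>(V) : nullptr;
}
```

<p>With these definitions, <tt>dyn_cast&lt;Square&gt;</tt> applied to a
<tt>Shape*</tt> that points at a <tt>Circle</tt> yields a null pointer, and the
check costs one integer comparison.  Classes that never provide
<tt>classof</tt> pay nothing.</p>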
 427
 428 </div>
 429
 430 <!-- _______________________________________________________________________ -->
 431 <div class="doc_subsubsection">
 432 <a name="ci_class_struct">Use of <tt>class</tt> and <tt>struct</tt> Keywords</a>
 433 </div>
 434 <div class="doc_text">
 435
 436 <p>In C++, the <tt>class</tt> and <tt>struct</tt> keywords can be used almost
 437 interchangeably. The only difference is when they are used to define a class:
 438 <tt>class</tt> makes members and base classes private by default while
 439 <tt>struct</tt> makes them public by default.</p>
 440
 441 <p>Unfortunately, not all compilers follow the rules and some will generate
 442 different symbols based on whether <tt>class</tt> or <tt>struct</tt> was used to
 443 declare the symbol.  This can lead to problems at link time.</p>
 444
 445 <p>So, the rule for LLVM is to always use the <tt>class</tt> keyword, unless
 446 <b>all</b> members are public and the type is a C++ "POD" type, in which case
 447 <tt>struct</tt> is allowed.</p>
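<p>A short sketch of how this rule might be applied.  The types
<tt>SourceRange</tt> and <tt>LineCounter</tt> are hypothetical and were made up
for this example.</p>

```cpp
// POD with only public members, so 'struct' is allowed.
struct SourceRange {
  unsigned Begin;
  unsigned End;
};

// Has private state, so it is declared with 'class'.
class LineCounter {
public:
  LineCounter() : NumLines(0) {}
  void addLine() { ++NumLines; }
  unsigned getNumLines() const { return NumLines; }
private:
  unsigned NumLines;
};

// Any other declaration of the type uses the same keyword as its definition,
// so that compilers which encode the keyword in symbol names stay consistent.
class LineCounter;
struct SourceRange;
```

<p>Keeping the keyword identical across every declaration of a type is what
avoids the link-time mismatches described above.</p>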
 448
 449 </div>
 450
 451 <!-- *********************************************************************** -->
 452 <div class="doc_section">
 453   <a name="styleissues">Style Issues</a>
 454 </div>
 455 <!-- *********************************************************************** -->
 456
 457
 458 <!-- ======================================================================= -->
 459 <div class="doc_subsection">
 460   <a name="macro">The High-Level Issues</a>
 461 </div>
 462 <!-- ======================================================================= -->
 463
 464
 465 <!-- _______________________________________________________________________ -->
 466 <div class="doc_subsubsection">
 467   <a name="hl_module">A Public Header File <b>is</b> a Module</a>
 468 </div>
 469
 470 <div class="doc_text">
 471
 472 <p>C++ doesn't do too well in the modularity department.  There is no real
 473 encapsulation or data hiding (unless you use expensive protocol classes), but it
 474 is what we have to work with.  When you write a public header file (in the LLVM
 475 source tree, they live in the top level "include" directory), you are defining a
 476 module of functionality.</p>
 477
 478 <p>Ideally, modules should be completely independent of each other, and their
 479 header files should only include the absolute minimum number of headers
 480 possible. A module is not just a class, a function, or a namespace: <a
 481 href="http://www.cuj.com/articles/2000/0002/0002c/0002c.htm">it's a collection
 482 of these</a> that defines an interface.  This interface may be several
 483 functions, classes or data structures, but the important issue is how they work
 484 together.</p>
 485
 486 <p>In general, a module should be implemented with one or more <tt>.cpp</tt>
 487 files.  Each of these <tt>.cpp</tt> files should include the header that defines
 488 their interface first.  This ensures that all of the dependencies of the module
 489 header have been properly added to the module header itself, and are not
 490 implicit.  System headers should be included after user headers for a
 491 translation unit.</p>
 492
 493 </div>
 494
 495 <!-- _______________________________________________________________________ -->
 496 <div class="doc_subsubsection">
 497   <a name="hl_dontinclude"><tt>#include</tt> as Little as Possible</a>
 498 </div>
 499
 500 <div class="doc_text">
 501
 502 <p><tt>#include</tt> hurts compile time performance.  Don't do it unless you
 503 have to, especially in header files.</p>
 504
 505 <p>But wait, sometimes you need to have the definition of a class to use it, or
 506 to inherit from it.  In these cases go ahead and <tt>#include</tt> that header
 507 file.  Be aware however that there are many cases where you don't need to have
 508 the full definition of a class.  If you are using a pointer or reference to a
 509 class, you don't need the header file.  If you are simply returning a class
 510 instance from a prototyped function or method, you don't need it.  In fact, for
 511 most cases, you simply don't need the definition of a class... and not
 512 <tt>#include</tt>'ing speeds up compilation.</p>
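<p>The cases where a forward declaration is enough can be sketched as below.
For the sake of a single self-contained listing, the pieces that would normally
be split between a header and its <tt>.cpp</tt> file are shown together; the
classes <tt>Widget</tt> and <tt>Gadget</tt> and the file names in the comments
are hypothetical.</p>

```cpp
// --- What would live in a header such as Widget.h ---
class Gadget;  // Forward declaration; Gadget.h is not included here.

class Widget {
public:
  explicit Widget(Gadget *G) : Owner(G) {}
  Gadget *getOwner() const { return Owner; }  // Pointer: declaration suffices.
  void attachTo(Gadget &G);                   // Reference: declaration suffices.
  Gadget makeGadget() const;                  // Prototype returning by value.
private:
  Gadget *Owner;
};

// --- What would live in Widget.cpp ---
// The method bodies below need the full definition, so this is the point
// where Gadget.h would be included.
class Gadget {
public:
  Gadget() : NumWidgets(0) {}
  unsigned NumWidgets;
};

void Widget::attachTo(Gadget &G) {
  Owner = &G;
  ++G.NumWidgets;
}

Gadget Widget::makeGadget() const { return Gadget(); }
```

<p>Files that include only the header never see the definition of
<tt>Gadget</tt>, so changes to it do not force them to be recompiled.</p>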
 513
 514 <p>It is easy to go overboard on this recommendation, however.  You
 515 <b>must</b> include all of the header files that you are using -- you can
 516 include them either directly
 517 or indirectly (through another header file).  To make sure that you don't
 518 accidentally forget to include a header file in your module header, make sure to
 519 include your module header <b>first</b> in the implementation file (as mentioned
 520 above).  This way there won't be any hidden dependencies that you'll find out
 521 about later...</p>
 522
 523 </div>
 524
 525 <!-- _______________________________________________________________________ -->
 526 <div class="doc_subsubsection">
 527   <a name="hl_privateheaders">Keep "internal" Headers Private</a>
 528 </div>
 529
 530 <div class="doc_text">
 531
 532 <p>Many modules have a complex implementation that causes them to use more than
 533 one implementation (<tt>.cpp</tt>) file.  It is often tempting to put the
 534 internal communication interface (helper classes, extra functions, etc) in the
 535 public module header file.  Don't do this.</p>
 536
 537 <p>If you really need to do something like this, put a private header file in
 538 the same directory as the source files, and include it locally.  This ensures
 539 that your private interface remains private and undisturbed by outsiders.</p>
 540
 541 <p>Note however, that it's okay to put extra implementation methods in a public
 542 class itself... just make them private (or protected), and all is well.</p>
 543
 544 </div>
 545
 546 <!-- _______________________________________________________________________ -->
 547 <div class="doc_subsubsection">
 548   <a name="hl_earlyexit">Use Early Exits and <tt>continue</tt> to Simplify Code</a>
 549 </div>
 550
 551 <div class="doc_text">
 552
 553 <p>When reading code, keep in mind how much state and how many previous
 554 decisions have to be remembered by the reader to understand a block of code.
 555 Aim to reduce indentation where possible when it doesn't make it more difficult
 556 to understand the code.  One great way to do this is by making use of early
 557 exits and the <tt>continue</tt> keyword in long loops.  As an example of using an early
 558 exit from a function, consider this "bad" code:</p>
 559
 560 <div class="doc_code">
 561 <pre>
 562 Value *DoSomething(Instruction *I) {
 563   if (!isa&lt;TerminatorInst&gt;(I) &amp;&amp;
 564       I-&gt;hasOneUse() &amp;&amp; SomeOtherThing(I)) {
 565     ... some long code ....
 566   }
 567
 568   return 0;
 569 }
 570 </pre>
 571 </div>
 572
 573 <p>This code has several problems if the body of the 'if' is large.  First, when
 574 you're looking at the top of the function, it isn't immediately clear that this
 575 <em>only</em> does interesting things with non-terminator instructions, and only
 576 applies to things with the other predicates.  Second, it is relatively difficult
 577 to describe (in comments) why these predicates are important because the if
 578 statement makes it difficult to lay out the comments.  Third, when you're deep
 579 within the body of the code, it is indented an extra level.  Finally, when
 580 reading the top of the function, it isn't clear what the result is if the
 581 predicate isn't true; you have to read to the end of the function to know that
 582 it returns null.</p>
 583
 584 <p>It is much preferred to format the code like this:</p>
 585
 586 <div class="doc_code">
 587 <pre>
 588 Value *DoSomething(Instruction *I) {
 589   // Terminators never need 'something' done to them because, ...
 590   if (isa&lt;TerminatorInst&gt;(I))
 591     return 0;
 592
 593   // We conservatively avoid transforming instructions with multiple uses
 594   // because goats like cheese.
 595   if (!I-&gt;hasOneUse())
 596     return 0;
 597
 598   // This is really just here for example.
 599   if (!SomeOtherThing(I))
 600     return 0;
 601
 602   ... some long code ....
 603 }
 604 </pre>
 605 </div>
 606
 607 <p>This fixes these problems.  A similar problem frequently happens in <tt>for</tt>
 608 loops.  A silly example is something like this:</p>
 609
 610 <div class="doc_code">
 611 <pre>
 612   for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; ++II) {
 613     if (BinaryOperator *BO = dyn_cast&lt;BinaryOperator&gt;(II)) {
 614       Value *LHS = BO-&gt;getOperand(0);
 615       Value *RHS = BO-&gt;getOperand(1);
 616       if (LHS != RHS) {
 617         ...
 618       }
 619     }
 620   }
 621 </pre>
 622 </div>
 623
 624 <p>When you have very small loops, this sort of structure is fine, but if
 625 it exceeds 10-15 lines, it becomes difficult for people to read and
 626 understand at a glance.
 627 The problem with this sort of code is that it gets very nested very quickly,
 628 meaning that the reader of the code has to keep a lot of context in their brain
 629 to remember what is going on immediately in the loop, because they don't know
 630 if/when the if conditions will have elses etc.  It is strongly preferred to
 631 structure the loop like this:</p>
 632
 633 <div class="doc_code">
 634 <pre>
 635   for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; ++II) {
 636     BinaryOperator *BO = dyn_cast&lt;BinaryOperator&gt;(II);
 637     if (!BO) continue;
 638
 639     Value *LHS = BO-&gt;getOperand(0);
 640     Value *RHS = BO-&gt;getOperand(1);
 641     if (LHS == RHS) continue;
 642     ...
 643   }
 644 </pre>
 645 </div>
 646
 647 <p>This has all the benefits of using early exits from functions: it reduces
 648 nesting of the loop, it makes it easier to describe why the conditions are true,
 649 and it makes it obvious to the reader that there is no <tt>else</tt> coming up that
 650 they have to push context into their brain for.  If a loop is large, this can
 651 be a big understandability win.</p>
 652
 653 </div>
 654
 655 <!-- _______________________________________________________________________ -->
 656 <div class="doc_subsubsection">
 657   <a name="hl_else_after_return">Don't use <tt>else</tt> after a <tt>return</tt></a>
 658 </div>
 659
 660 <div class="doc_text">
 661
 662 <p>For reasons similar to those above (reduction of indentation and easier reading),
 663    please do not use <tt>else</tt> or '<tt>else if</tt>' after something that interrupts
 664    control flow like <tt>return</tt>, <tt>break</tt>, <tt>continue</tt>, <tt>goto</tt>, etc.  For example, this is
 665    "bad":</p>
 666
 667 <div class="doc_code">
 668 <pre>
 669   case 'J': {
 670     if (Signed) {
 671       Type = Context.getsigjmp_bufType();
 672       if (Type.isNull()) {
 673         Error = ASTContext::GE_Missing_sigjmp_buf;
 674         return QualType();
 675       } else {
 676         break;
 677       }
 678     } else {
 679       Type = Context.getjmp_bufType();
 680       if (Type.isNull()) {
 681         Error = ASTContext::GE_Missing_jmp_buf;
 682         return QualType();
 683       } else {
 684         break;
 685       }
 686     }
 687   }
 688   }
 689 </pre>
 690 </div>
 691
 692 <p>It is better to write this something like:</p>
 693
 694 <div class="doc_code">
 695 <pre>
 696   case 'J':
 697     if (Signed) {
 698       Type = Context.getsigjmp_bufType();
 699       if (Type.isNull()) {
 700         Error = ASTContext::GE_Missing_sigjmp_buf;
 701         return QualType();
 702       }
 703     } else {
 704       Type = Context.getjmp_bufType();
 705       if (Type.isNull()) {
 706         Error = ASTContext::GE_Missing_jmp_buf;
 707         return QualType();
 708       }
 709     }
 710     break;
 711 </pre>
 712 </div>
 713
 714 <p>Or better yet (in this case), as:</p>
 715
 716 <div class="doc_code">
 717 <pre>
 718   case 'J':
 719     if (Signed)
 720       Type = Context.getsigjmp_bufType();
 721     else
 722       Type = Context.getjmp_bufType();
 723
 724     if (Type.isNull()) {
 725       Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :
 726                        ASTContext::GE_Missing_jmp_buf;
 727       return QualType();
 728     }
 729     break;
 730 </pre>
 731 </div>
 732
 733 <p>The idea is to reduce indentation and the amount of code you have to keep
 734    track of when reading the code.</p>
 735
 736 </div>
 737
 738 <!-- _______________________________________________________________________ -->
 739 <div class="doc_subsubsection">
 740   <a name="hl_predicateloops">Turn Predicate Loops into Predicate Functions</a>
 741 </div>
 742
 743 <div class="doc_text">
 744
 745 <p>It is very common to write small loops that just compute a boolean
 746    value.  There are a number of ways that people commonly write these, but an
 747    example of this sort of thing is:</p>
 748
 749 <div class="doc_code">
 750 <pre>
 751   <b>bool FoundFoo = false;</b>
 752   for (unsigned i = 0, e = BarList.size(); i != e; ++i)
 753     if (BarList[i]-&gt;isFoo()) {
 754       <b>FoundFoo = true;</b>
 755       break;
 756     }
 757
 758   <b>if (FoundFoo) {</b>
 759     ...
 760   }
 761 </pre>
 762 </div>
 763
 764 <p>This sort of code is awkward to write, and is almost always a bad sign.
 765 Instead of this sort of loop, we strongly prefer to use a predicate function
 766 (which may be <a href="#micro_anonns">static</a>) that uses
 767 <a href="#hl_earlyexit">early exits</a> to compute the predicate.  We prefer
 768 the code to be structured like this:
 769 </p>
 770
 771
 772 <div class="doc_code">
 773 <pre>
 774 /// ListContainsFoo - Return true if the specified list has an element that is
 775 /// a foo.
 776 static bool ListContainsFoo(const std::vector&lt;Bar*&gt; &amp;List) {
 777   for (unsigned i = 0, e = List.size(); i != e; ++i)
 778     if (List[i]-&gt;isFoo())
 779       return true;
 780   return false;
 781 }
 782 ...
 783
 784   <b>if (ListContainsFoo(BarList)) {</b>
 785     ...
 786   }
 787 </pre>
 788 </div>
 789
 790 <p>There are many reasons for doing this: it reduces indentation and factors out
 791 code which can often be shared by other code that checks for the same predicate.
 792 More importantly, it <em>forces you to pick a name</em> for the function, and
 793 forces you to write a comment for it.  In this silly example, this doesn't add
 794 much value.  However, if the condition is complex, this can make it a lot easier
 795 for the reader to understand the code that queries for this predicate.  Instead
 796 of being faced with the in-line details of how we check to see if the BarList
 797 contains a foo, we can trust the function name and continue reading with better
 798 locality.</p>
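<p>As a slightly less silly sketch of the same idea, the predicate below has a
condition with enough detail that a name and a comment start to pay off.  The
<tt>Bar</tt> fields and the name <tt>AllEnabledBarsAreHeavy</tt> are made up for
illustration and are not part of LLVM:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type, invented for this example.
struct Bar {
  int Weight;
  bool Enabled;
};

/// AllEnabledBarsAreHeavy - Return true if every enabled element of the list
/// weighs at least MinWeight.  Disabled elements are ignored, so an empty list
/// trivially satisfies the predicate.
static bool AllEnabledBarsAreHeavy(const std::vector<Bar> &List,
                                   int MinWeight) {
  assert(MinWeight >= 0 && "Weights are never negative!");
  for (std::size_t i = 0, e = List.size(); i != e; ++i) {
    if (!List[i].Enabled)
      continue;      // Disabled elements don't count.
    if (List[i].Weight < MinWeight)
      return false;  // Early exit: one light element settles the question.
  }
  return true;
}
```

<p>A caller can then write <tt>if (AllEnabledBarsAreHeavy(BarList, 10))</tt> and
keep reading without having to decode the loop.</p>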
 799
 800 </div>
 801
 802
 803 <!-- ======================================================================= -->
 804 <div class="doc_subsection">
 805   <a name="micro">The Low-Level Issues</a>
 806 </div>
 807 <!-- ======================================================================= -->
 808
 809
 810 <!-- _______________________________________________________________________ -->
 811 <div class="doc_subsubsection">
 812   <a name="ll_assert">Assert Liberally</a>
 813 </div>
 814
 815 <div class="doc_text">
 816
<p>Use the "<tt>assert</tt>" macro to its fullest.  Check all of your
preconditions and assumptions; you never know when a bug (not necessarily even
yours) might be caught early by an assertion, which reduces debugging time
dramatically.  The "<tt>&lt;cassert&gt;</tt>" header file is probably already
included by the header files you are using, so it doesn't cost anything to use
it.</p>
 823
 824 <p>To further assist with debugging, make sure to put some kind of error message
 825 in the assertion statement (which is printed if the assertion is tripped). This
 826 helps the poor debugger make sense of why an assertion is being made and
 827 enforced, and hopefully what to do about it.  Here is one complete example:</p>
 828
 829 <div class="doc_code">
 830 <pre>
 831 inline Value *getOperand(unsigned i) {
 832   assert(i &lt; Operands.size() &amp;&amp; "getOperand() out of range!");
 833   return Operands[i];
 834 }
 835 </pre>
 836 </div>
 837
<p>Here are some more examples:</p>
 839
 840 <div class="doc_code">
 841 <pre>
 842 assert(Ty-&gt;isPointerType() &amp;&amp; "Can't allocate a non pointer type!");
 843
 844 assert((Opcode == Shl || Opcode == Shr) &amp;&amp; "ShiftInst Opcode invalid!");
 845
 846 assert(idx &lt; getNumSuccessors() &amp;&amp; "Successor # out of range!");
 847
 848 assert(V1.getType() == V2.getType() &amp;&amp; "Constant types must be identical!");
 849
 850 assert(isa&lt;PHINode&gt;(Succ-&gt;front()) &amp;&amp; "Only works on PHId BBs!");
 851 </pre>
 852 </div>
 853
 854 <p>You get the idea...</p>
 855
<p>Please keep in mind when adding assert statements that not all compilers
understand the semantics of the assert.  In some places, asserts are used to
indicate a piece of code that should not be reached.  These are typically of
the form:</p>
 859
 860 <div class="doc_code">
 861 <pre>
 862 assert(0 &amp;&amp; "Some helpful error message");
 863 </pre>
 864 </div>
 865
 866 <p>When used in a function that returns a value, they should be followed with a return
 867 statement and a comment indicating that this line is never reached.  This will prevent
 868 a compiler which is unable to deduce that the assert statement never returns from
 869 generating a warning.</p>
 870
 871 <div class="doc_code">
 872 <pre>
 873 assert(0 &amp;&amp; "Some helpful error message");
 874 // Not reached
 875 return 0;
 876 </pre>
 877 </div>
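
<p>Putting the pieces together, a value-returning function might look like the
following sketch.  The <tt>Opcode</tt> enum and <tt>getNumOperands</tt> are
hypothetical names used only for illustration:</p>

```cpp
#include <cassert>

// Hypothetical opcode enum, invented for this example.
enum Opcode { Neg, Add, Select };

/// getNumOperands - Return the number of operands the specified opcode takes.
static unsigned getNumOperands(Opcode Op) {
  switch (Op) {
  case Neg:    return 1;
  case Add:    return 2;
  case Select: return 3;
  }
  // Every valid opcode returned above, so getting here means Op is corrupt.
  assert(0 && "Unknown opcode!");
  // Not reached
  return 0;
}
```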
 878
 879 <p>Another issue is that values used only by assertions will produce an "unused
 880  value" warning when assertions are disabled.  For example, this code will warn:
 881 </p>
 882
 883 <div class="doc_code">
 884 <pre>
 885   unsigned Size = V.size();
 886   assert(Size &gt; 42 &amp;&amp; "Vector smaller than it should be");
 887
 888   bool NewToSet = Myset.insert(Value);
 889   assert(NewToSet &amp;&amp; "The value shouldn't be in the set yet");
 890 </pre>
 891 </div>
 892
<p>These are two interestingly different cases: in the first case, the call to
<tt>V.size()</tt> is only useful for the assert, and we don't want it executed
when assertions are disabled.  Code like this should move the call into the
assert itself.  In the second case, the side effects of the call must happen
whether the assert is enabled or not.  In this case, the value should be cast to
void to disable the warning.  To be specific, it is preferred to write the code
like this:</p>
 900
 901 <div class="doc_code">
 902 <pre>
 903   assert(V.size() &gt; 42 &amp;&amp; "Vector smaller than it should be");
 904
 905   bool NewToSet = Myset.insert(Value); (void)NewToSet;
 906   assert(NewToSet &amp;&amp; "The value shouldn't be in the set yet");
 907 </pre>
 908 </div>
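
<p>For a version that compiles on its own, here is a sketch that uses
<tt>std::set</tt>, whose <tt>insert</tt> returns a pair with the "was it
inserted" flag in <tt>.second</tt> (the <tt>Myset</tt> in the snippet above is
assumed to return a plain <tt>bool</tt>).  The function name
<tt>recordValues</tt> is invented for this example:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

/// recordValues - Add every element of V to Seen and return the new size of
/// Seen.  V must hold more than two elements, none of them already in Seen.
static std::size_t recordValues(std::set<int> &Seen,
                                const std::vector<int> &V) {
  // size() is needed only by the assertion, so the call lives inside it.
  assert(V.size() > 2 && "Vector smaller than it should be");

  for (std::size_t i = 0, e = V.size(); i != e; ++i) {
    // The insertion must happen in every build, so it stays outside the
    // assert; the cast to void silences the warning when asserts are off.
    bool NewToSet = Seen.insert(V[i]).second; (void)NewToSet;
    assert(NewToSet && "The value shouldn't be in the set yet");
  }
  return Seen.size();
}
```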
 909
 910
 911 </div>
 912
 913 <!-- _______________________________________________________________________ -->
 914 <div class="doc_subsubsection">
 915   <a name="ll_ns_std">Do not use '<tt>using namespace std</tt>'</a>
 916 </div>
 917
 918 <div class="doc_text">
 919 <p>In LLVM, we prefer to explicitly prefix all identifiers from the standard
 920 namespace with an "<tt>std::</tt>" prefix, rather than rely on
 921 "<tt>using namespace std;</tt>".</p>
 922
 923 <p> In header files, adding a '<tt>using namespace XXX</tt>' directive pollutes
 924 the namespace of any source file that <tt>#include</tt>s the header.  This is
 925 clearly a bad thing.</p>
 926
 927 <p>In implementation files (e.g. <tt>.cpp</tt> files), the rule is more of a stylistic
 928 rule, but is still important.  Basically, using explicit namespace prefixes
 929 makes the code <b>clearer</b>, because it is immediately obvious what facilities
 930 are being used and where they are coming from, and <b>more portable</b>, because
 931 namespace clashes cannot occur between LLVM code and other namespaces.  The
 932 portability rule is important because different standard library implementations
 933 expose different symbols (potentially ones they shouldn't), and future revisions
 934 to the C++ standard will add more symbols to the <tt>std</tt> namespace.  As
 935 such, we never use '<tt>using namespace std;</tt>' in LLVM.</p>
 936
 937 <p>The exception to the general rule (i.e. it's not an exception for
 938 the <tt>std</tt> namespace) is for implementation files.  For example, all of
 939 the code in the LLVM project implements code that lives in the 'llvm' namespace.
 940 As such, it is ok, and actually clearer, for the <tt>.cpp</tt> files to have a
 941 '<tt>using namespace llvm</tt>' directive at their top, after the
 942 <tt>#include</tt>s.  This reduces indentation in the body of the file for source
 943 editors that indent based on braces, and keeps the conceptual context cleaner.
 944 The general form of this rule is that any <tt>.cpp</tt> file that implements
 945 code in any namespace may use that namespace (and its parents'), but should not
 946 use any others.</p>
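
<p>A minimal sketch of how this looks in practice, assuming a made-up
<tt>mylib</tt> namespace standing in for <tt>llvm</tt> (the header and
<tt>.cpp</tt> parts are shown in one listing so that it compiles on its
own):</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// --- What would normally live in a public header ---
namespace mylib {
  struct Word {
    std::string Text;
    bool Hidden;
  };

  /// joinVisible - Concatenate the visible words, separated by single spaces.
  std::string joinVisible(const std::vector<Word> &Words);
}

// --- What would normally live in the matching .cpp file ---
using namespace mylib;   // The namespace this file implements: allowed.
                         // Standard library names keep their std:: prefix.

/// isVisible - File-local helper.  "Word" needs no mylib:: prefix here
/// because of the using directive above.
static bool isVisible(const Word &W) {
  return !W.Hidden && !W.Text.empty();
}

std::string mylib::joinVisible(const std::vector<Word> &Words) {
  std::string Result;
  for (std::size_t i = 0, e = Words.size(); i != e; ++i) {
    if (!isVisible(Words[i]))
      continue;
    if (!Result.empty())
      Result += ' ';
    Result += Words[i].Text;
  }
  return Result;
}
```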
 947
 948 </div>
 949
 950 <!-- _______________________________________________________________________ -->
 951 <div class="doc_subsubsection">
 952   <a name="ll_virtual_anch">Provide a virtual method anchor for classes
 953   in headers</a>
 954 </div>
 955
 956 <div class="doc_text">
 957
 958 <p>If a class is defined in a header file and has a v-table (either it has
 959 virtual methods or it derives from classes with virtual methods), it must
 960 always have at least one out-of-line virtual method in the class.  Without
 961 this, the compiler will copy the vtable and RTTI into every <tt>.o</tt> file
 962 that <tt>#include</tt>s the header, bloating <tt>.o</tt> file sizes and
 963 increasing link times.</p>
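
<p>One way to satisfy this rule is sketched below, assuming a hypothetical
<tt>Shape</tt> hierarchy: each class declares a private virtual method (named
<tt>anchor</tt> here purely by convention) whose empty definition lives in
exactly one <tt>.cpp</tt> file.  On ABIs that use a "key function", such as the
Itanium C++ ABI, the vtable is then emitted only in the file that defines the
anchor:</p>

```cpp
#include <cassert>

// --- Header portion (class names are invented for this example) ---
class Shape {
  virtual void anchor();  // Out-of-line on purpose; see the definition below.
public:
  virtual ~Shape() {}
  virtual unsigned getNumSides() const = 0;
};

class Triangle : public Shape {
  virtual void anchor();  // Each class with a vtable gets its own anchor.
public:
  virtual unsigned getNumSides() const { return 3; }
};

// --- Portion that belongs in exactly one .cpp file ---
void Shape::anchor() {}
void Triangle::anchor() {}
```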
 964
 965 </div>
 966
 967 <!-- _______________________________________________________________________ -->
 968 <div class="doc_subsubsection">
 969   <a name="ll_end">Don't evaluate <tt>end()</tt> every time through a loop</a>
 970 </div>
 971
 972 <div class="doc_text">
 973
 974 <p>Because C++ doesn't have a standard "foreach" loop (though it can be emulated
 975 with macros and may be coming in C++'0x) we end up writing a lot of loops that
 976 manually iterate from begin to end on a variety of containers or through other
 977 data structures.  One common mistake is to write a loop in this style:</p>
 978
 979 <div class="doc_code">
 980 <pre>
  BasicBlock *BB = ...
  for (BasicBlock::iterator I = BB-&gt;begin(); I != <b>BB-&gt;end()</b>; ++I)
     ... use I ...
 984 </pre>
 985 </div>
 986
<p>The problem with this construct is that it evaluates "<tt>BB-&gt;end()</tt>"
every time through the loop.  Instead of writing the loop like this, we strongly
prefer loops to be written so that they evaluate it once before the loop starts.
A convenient way to do this is like so:</p>
 991
 992 <div class="doc_code">
 993 <pre>
  BasicBlock *BB = ...
  for (BasicBlock::iterator I = BB-&gt;begin(), E = <b>BB-&gt;end()</b>; I != E; ++I)
     ... use I ...
 997 </pre>
 998 </div>
 999
<p>The observant may quickly point out that these two loops may have different
semantics: if the container (a basic block in this case) is being mutated, then
"<tt>BB-&gt;end()</tt>" may change its value every time through the loop and the
second loop may not in fact be correct.  If you actually do depend on this
behavior, please write the loop in the first form and add a comment indicating
that you did it intentionally.</p>
1006
<p>Why do we prefer the second form (when correct)?  Writing the loop in the
first form has two problems.  First, it may be less efficient than evaluating
the end expression once at the start of the loop.  In this case, the cost is
probably minor: a few extra loads every time through the loop.  However, if the
base expression is more complex, then the cost can rise quickly.  I've seen
loops where the end expression was actually something like
"<tt>SomeMap[x]-&gt;end()</tt>", and map lookups really aren't cheap.  By writing
it in the second form consistently, you eliminate the issue entirely and don't
even have to think about it.</p>
1015
1016 <p>The second (even bigger) issue is that writing the loop in the first form
1017 hints to the reader that the loop is mutating the container (a fact that a
1018 comment would handily confirm!).  If you write the loop in the second form, it
1019 is immediately obvious without even looking at the body of the loop that the
1020 container isn't being modified, which makes it easier to read the code and
1021 understand what it does.</p>
1022
1023 <p>While the second form of the loop is a few extra keystrokes, we do strongly
1024 prefer it.</p>
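
<p>To make the difference concrete, here is a small sketch built around a
made-up <tt>CountingList</tt> wrapper (not an LLVM class) that counts how often
its <tt>end()</tt> is evaluated by each loop form:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container, invented for this example, that records every
// call to end().
class CountingList {
  std::vector<int> Items;
  mutable unsigned NumEndCalls;
public:
  typedef std::vector<int>::const_iterator iterator;

  explicit CountingList(std::size_t N) : Items(N, 1), NumEndCalls(0) {}

  iterator begin() const { return Items.begin(); }
  iterator end() const { ++NumEndCalls; return Items.end(); }
  unsigned getNumEndCalls() const { return NumEndCalls; }
};

/// sumReevaluatingEnd - First form: end() runs on every loop test.
static int sumReevaluatingEnd(const CountingList &L) {
  int Sum = 0;
  for (CountingList::iterator I = L.begin(); I != L.end(); ++I)
    Sum += *I;
  return Sum;
}

/// sumCachingEnd - Second form: end() runs once, before the loop starts.
static int sumCachingEnd(const CountingList &L) {
  int Sum = 0;
  for (CountingList::iterator I = L.begin(), E = L.end(); I != E; ++I)
    Sum += *I;
  return Sum;
}
```

<p>For a list of four elements, the first form calls <tt>end()</tt> five times
(once per loop test, including the final failing one), while the second form
calls it exactly once.</p>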
1025
1026 </div>
1027
1028 <!-- _______________________________________________________________________ -->
1029 <div class="doc_subsubsection">
1030   <a name="ll_iostream"><tt>#include &lt;iostream&gt;</tt> is forbidden</a>
1031 </div>
1032
1033 <div class="doc_text">
1034
<p>The use of <tt>#include &lt;iostream&gt;</tt> in library files is
hereby <b><em>forbidden</em></b>. The primary reason for doing this is to
support clients using LLVM libraries as part of larger systems. In particular,
we statically link LLVM into some dynamic libraries. Even if LLVM isn't used,
the static c'tors are run whenever an application that uses the dynamic
library starts up. There are two problems with this:</p>
1041
1042 <ol>
1043   <li>The time to run the static c'tors impacts startup time of
1044       applications&mdash;a critical time for GUI apps.</li>
1045   <li>The static c'tors cause the app to pull many extra pages of memory off the
1046       disk: both the code for the static c'tors in each <tt>.o</tt> file and the
1047       small amount of data that gets touched. In addition, touched/dirty pages
1048       put more pressure on the VM system on low-memory machines.</li>
1049 </ol>
1050
1051 <p>Note that using the other stream headers (<tt>&lt;sstream&gt;</tt> for
1052 example) is not problematic in this regard (just <tt>&lt;iostream&gt;</tt>).
1053 However, <tt>raw_ostream</tt> provides various APIs that are better performing for almost
1054 every use than <tt>std::ostream</tt> style APIs.
1055 <b>Therefore new code should always
1056 use <a href="#ll_raw_ostream"><tt>raw_ostream</tt></a> for writing, or
1057 the <tt>llvm::MemoryBuffer</tt> API for reading files.</b></p>
1058
1059 </div>
1060
1061
1062 <!-- _______________________________________________________________________ -->
1063 <div class="doc_subsubsection">
1064   <a name="ll_raw_ostream">Use <tt>raw_ostream</tt></a>
1065 </div>
1066
1067 <div class="doc_text">
1068
1069 <p>LLVM includes a lightweight, simple, and efficient stream implementation
1070 in <tt>llvm/Support/raw_ostream.h</tt> which provides all of the common features
1071 of <tt>std::ostream</tt>.  All new code should use <tt>raw_ostream</tt> instead
1072 of <tt>ostream</tt>.</p>
1073
1074 <p>Unlike <tt>std::ostream</tt>, <tt>raw_ostream</tt> is not a template and can
1075 be forward declared as <tt>class raw_ostream</tt>.  Public headers should
1076 generally not include the <tt>raw_ostream</tt> header, but use forward
1077 declarations and constant references to <tt>raw_ostream</tt> instances.</p>
1078
1079 </div>
1080
1081
1082 <!-- _______________________________________________________________________ -->
1083 <div class="doc_subsubsection">
1084   <a name="ll_avoidendl">Avoid <tt>std::endl</tt></a>
1085 </div>
1086
1087 <div class="doc_text">
1088
<p>The <tt>std::endl</tt> manipulator, when used with iostreams, outputs a
newline to the output stream specified.  In addition to doing this, however, it
also flushes the output stream.  In other words, these are equivalent:</p>
1092
1093 <div class="doc_code">
1094 <pre>
1095 std::cout &lt;&lt; std::endl;
1096 std::cout &lt;&lt; '\n' &lt;&lt; std::flush;
1097 </pre>
1098 </div>
1099
1100 <p>Most of the time, you probably have no reason to flush the output stream, so
1101 it's better to use a literal <tt>'\n'</tt>.</p>
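
<p>The extra flushes can be observed with a sketch like the one below.  The
<tt>FlushCountingBuf</tt> class and <tt>countFlushes</tt> function are invented
for this example; the buffer counts calls to <tt>sync()</tt>, which is what a
flush on the owning stream ends up invoking:</p>

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical stream buffer that counts how many times it is asked to flush.
class FlushCountingBuf : public std::stringbuf {
  unsigned NumFlushes;
public:
  FlushCountingBuf() : NumFlushes(0) {}
  unsigned getNumFlushes() const { return NumFlushes; }
protected:
  virtual int sync() { ++NumFlushes; return 0; }
};

/// countFlushes - Write NumLines lines to a scratch stream, ending each one
/// with either std::endl or '\n', and return how many flushes that caused.
static unsigned countFlushes(bool UseEndl, unsigned NumLines) {
  FlushCountingBuf Buf;
  std::ostream OS(&Buf);
  for (unsigned i = 0; i != NumLines; ++i) {
    if (UseEndl)
      OS << "line" << std::endl;  // Newline plus a flush.
    else
      OS << "line" << '\n';       // Newline only.
  }
  return Buf.getNumFlushes();
}
```

<p>Writing three lines with <tt>std::endl</tt> flushes three times; writing them
with <tt>'\n'</tt> does not flush at all.</p>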
1102
1103 </div>
1104
1105
1106 <!-- ======================================================================= -->
1107 <div class="doc_subsection">
1108   <a name="nano">Microscopic Details</a>
1109 </div>
1110 <!-- ======================================================================= -->
1111
1112 <p>This section describes preferred low-level formatting guidelines along with
1113 reasoning on why we prefer them.</p>
1114
1115 <!-- _______________________________________________________________________ -->
1116 <div class="doc_subsubsection">
1117   <a name="micro_spaceparen">Spaces Before Parentheses</a>
1118 </div>
1119
1120 <div class="doc_text">
1121
1122 <p>We prefer to put a space before an open parenthesis only in control flow
1123 statements, but not in normal function call expressions and function-like
1124 macros.  For example, this is good:</p>
1125
1126 <div class="doc_code">
1127 <pre>
1128   <b>if (</b>x) ...
1129   <b>for (</b>i = 0; i != 100; ++i) ...
1130   <b>while (</b>llvm_rocks) ...
1131
1132   <b>somefunc(</b>42);
1133   <b><a href="#ll_assert">assert</a>(</b>3 != 4 &amp;&amp; "laws of math are failing me");
1134
1135   a = <b>foo(</b>42, 92) + <b>bar(</b>x);
1136   </pre>
1137 </div>
1138
1139 <p>... and this is bad:</p>
1140
1141 <div class="doc_code">
1142 <pre>
1143   <b>if(</b>x) ...
1144   <b>for(</b>i = 0; i != 100; ++i) ...
1145   <b>while(</b>llvm_rocks) ...
1146
1147   <b>somefunc (</b>42);
1148   <b><a href="#ll_assert">assert</a> (</b>3 != 4 &amp;&amp; "laws of math are failing me");
1149
1150   a = <b>foo (</b>42, 92) + <b>bar (</b>x);
1151 </pre>
1152 </div>
1153
<p>The reason for doing this is not completely arbitrary.  This style makes
   control flow operators stand out more, and makes expressions flow better. The
   function call operator binds very tightly as a postfix operator.  Putting
   a space after a function name (as in the last example) makes it look as
   though the argument list of the function on the left-hand side of a binary
   operator might group with the name of the function on the right-hand side.
   More specifically, it is easy to misread the "a" example as:</p>
1161
1162 <div class="doc_code">
1163 <pre>
1164   a = foo <b>(</b>(42, 92) + bar<b>)</b> (x);
1165 </pre>
1166 </div>
1167
<p>... when skimming through the code.  By avoiding a space in a function call,
we avoid this misinterpretation.</p>
1170
1171 </div>
1172
1173 <!-- _______________________________________________________________________ -->
1174 <div class="doc_subsubsection">
1175   <a name="micro_preincrement">Prefer Preincrement</a>
1176 </div>
1177
1178 <div class="doc_text">
1179
<p>Hard and fast rule: Preincrement (<tt>++X</tt>) is no slower than
postincrement (<tt>X++</tt>) and could very well be a lot faster than it.  Use
preincrementation whenever possible.</p>
1183
<p>The semantics of postincrement include making a copy of the value being
incremented, preincrementing the "work value", and then returning the copy.  For
primitive types, this isn't a big deal... but for iterators, it can be a huge
issue (for example, some iterators contain stack and set objects in them...
copying an iterator could invoke the copy ctors of these as well).  In general,
get in the habit of always using preincrement, and you won't have a problem.</p>
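
<p>The copy can be made visible with a sketch like this one, which uses a
made-up iterator-like class (<tt>CountingIter</tt>, not an LLVM type) that
counts its own copy constructions:</p>

```cpp
#include <cassert>

// Hypothetical iterator-like class, invented for this example, that counts
// how many times it is copy constructed.
class CountingIter {
  int Pos;
public:
  static unsigned NumCopies;

  explicit CountingIter(int P) : Pos(P) {}
  CountingIter(const CountingIter &RHS) : Pos(RHS.Pos) { ++NumCopies; }

  // Preincrement: bump the position in place, no copy needed.
  CountingIter &operator++() { ++Pos; return *this; }

  // Postincrement: must save the old value before bumping the position.
  CountingIter operator++(int) {
    CountingIter Tmp(*this);
    ++Pos;
    return Tmp;
  }

  bool operator!=(const CountingIter &RHS) const { return Pos != RHS.Pos; }
};

unsigned CountingIter::NumCopies = 0;

/// copiesWithPreincrement - Walk N steps with ++I and report the copies made.
static unsigned copiesWithPreincrement(int N) {
  CountingIter::NumCopies = 0;
  for (CountingIter I(0), E(N); I != E; ++I) {}
  return CountingIter::NumCopies;
}

/// copiesWithPostincrement - Walk N steps with I++ and report the copies made.
static unsigned copiesWithPostincrement(int N) {
  CountingIter::NumCopies = 0;
  for (CountingIter I(0), E(N); I != E; I++) {}
  return CountingIter::NumCopies;
}
```

<p>The preincrement loop makes no copies, while the postincrement loop makes at
least one per step (the exact number depends on how much copy elision the
compiler performs).</p>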
1190
1191 </div>
1192
1193 <!-- _______________________________________________________________________ -->
1194 <div class="doc_subsubsection">
1195   <a name="micro_namespaceindent">Namespace Indentation</a>
1196 </div>
1197
1198 <div class="doc_text">
1199
1200 <p>
1201 In general, we strive to reduce indentation wherever possible.  This is useful
1202 because we want code to <a href="#scf_codewidth">fit into 80 columns</a> without
1203 wrapping horribly, but also because it makes it easier to understand the code.
1204 Namespaces are a funny thing: they are often large, and we often desire to put
1205 lots of stuff into them (so they can be large).  Other times they are tiny,
1206 because they just hold an enum or something similar.  In order to balance this,
1207 we use different approaches for small versus large namespaces.
1208 </p>
1209
1210 <p>
1211 If a namespace definition is small and <em>easily</em> fits on a screen (say,
1212 less than 35 lines of code), then you should indent its body.  Here's an
1213 example:
1214 </p>
1215
1216 <div class="doc_code">
1217 <pre>
1218 namespace llvm {
1219   namespace X86 {
1220     /// RelocationType - An enum for the x86 relocation codes. Note that
1221     /// the terminology here doesn't follow x86 convention - word means
1222     /// 32-bit and dword means 64-bit.
1223     enum RelocationType {
1224       /// reloc_pcrel_word - PC relative relocation, add the relocated value to
1225       /// the value already in memory, after we adjust it for where the PC is.
1226       reloc_pcrel_word = 0,
1227
1228       /// reloc_picrel_word - PIC base relative relocation, add the relocated
1229       /// value to the value already in memory, after we adjust it for where the
1230       /// PIC base is.
1231       reloc_picrel_word = 1,
1232
1233       /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just
1234       /// add the relocated value to the value already in memory.
1235       reloc_absolute_word = 2,
1236       reloc_absolute_dword = 3
1237     };
1238   }
1239 }
1240 </pre>
1241 </div>
1242
1243 <p>Since the body is small, indenting adds value because it makes it very clear
1244 where the namespace starts and ends, and it is easy to take the whole thing in
1245 in one "gulp" when reading the code.  If the blob of code in the namespace is
1246 larger (as it typically is in a header in the <tt>llvm</tt> or <tt>clang</tt> namespaces), do not
1247 indent the code, and add a comment indicating what namespace is being closed.
1248 For example:</p>
1249
1250 <div class="doc_code">
1251 <pre>
1252 namespace llvm {
1253 namespace knowledge {
1254
1255 /// Grokable - This class represents things that Smith can have an intimate
1256 /// understanding of and contains the data associated with it.
1257 class Grokable {
1258 ...
1259 public:
1260   explicit Grokable() { ... }
1261   virtual ~Grokable() = 0;
1262
1263   ...
1264
1265 };
1266
1267 } // end namespace knowledge
1268 } // end namespace llvm
1269 </pre>
1270 </div>
1271
1272 <p>Because the class is large, we don't expect that the reader can easily
1273 understand the entire concept in a glance, and the end of the file (where the
1274 namespaces end) may be a long ways away from the place they open.  As such,
1275 indenting the contents of the namespace doesn't add any value, and detracts from
1276 the readability of the class.  In these cases it is best to <em>not</em> indent
1277 the contents of the namespace.</p>
1278
1279 </div>
1280
1281 <!-- _______________________________________________________________________ -->
1282 <div class="doc_subsubsection">
1283   <a name="micro_anonns">Anonymous Namespaces</a>
1284 </div>
1285
1286 <div class="doc_text">
1287
1288 <p>After talking about namespaces in general, you may be wondering about
1289 anonymous namespaces in particular.
1290 Anonymous namespaces are a great language feature that tells the C++ compiler
1291 that the contents of the namespace are only visible within the current
1292 translation unit, allowing more aggressive optimization and eliminating the
1293 possibility of symbol name collisions.  Anonymous namespaces are to C++ as
1294 "static" is to C functions and global variables.  While "static" is available
1295 in C++, anonymous namespaces are more general: they can make entire classes
1296 private to a file.</p>
1297
1298 <p>The problem with anonymous namespaces is that they naturally want to
1299 encourage indentation of their body, and they reduce locality of reference: if
1300 you see a random function definition in a C++ file, it is easy to see if it is
1301 marked static, but seeing if it is in an anonymous namespace requires scanning
1302 a big chunk of the file.</p>
1303
1304 <p>Because of this, we have a simple guideline: make anonymous namespaces as
1305 small as possible, and only use them for class declarations.  For example, this
1306 is good:</p>
1307
1308 <div class="doc_code">
1309 <pre>
<b>namespace {</b>
  class StringSort {
  ...
  public:
    StringSort(...);
    bool operator&lt;(const char *RHS) const;
  };
<b>} // end anonymous namespace</b>

static void Helper() {
  ...
}

bool StringSort::operator&lt;(const char *RHS) const {
  ...
}
1327 </pre>
1328 </div>
1329
1330 <p>This is bad:</p>
1331
1332
1333 <div class="doc_code">
1334 <pre>
<b>namespace {</b>
class StringSort {
...
public:
  StringSort(...);
  bool operator&lt;(const char *RHS) const;
};

void Helper() {
  ...
}

bool StringSort::operator&lt;(const char *RHS) const {
  ...
}

<b>} // end anonymous namespace</b>
1353 </pre>
1354 </div>
1355
1356
<p>This is bad specifically because if you're looking at "Helper" in the middle
of a large C++ file, you have no immediate way to tell if it is local to
the file.  When it is marked static explicitly, this is immediately obvious.
Also, there is no reason to enclose the definition of "operator&lt;" in the
namespace just because it was declared there.
</p>
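
<p>Filled in so that it compiles, the preferred layout might look like the
sketch below.  The bodies of <tt>StringSort</tt> and the helper (renamed
<tt>compareStrings</tt> here) are invented for illustration:</p>

```cpp
#include <cassert>
#include <cstring>

namespace {
  /// StringSort - Hypothetical wrapper that orders C strings
  /// lexicographically.
  class StringSort {
    const char *Str;
  public:
    explicit StringSort(const char *S) : Str(S) {}
    bool operator<(const char *RHS) const;
  };
} // end anonymous namespace

/// compareStrings - File-local helper.  The "static" makes its linkage
/// obvious right at the definition.
static int compareStrings(const char *LHS, const char *RHS) {
  return std::strcmp(LHS, RHS);
}

// Defined outside the anonymous namespace, even though it was declared there.
bool StringSort::operator<(const char *RHS) const {
  return compareStrings(Str, RHS) < 0;
}
```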
1363
1364 </div>
1365
1366
1367
1368 <!-- *********************************************************************** -->
1369 <div class="doc_section">
1370   <a name="seealso">See Also</a>
1371 </div>
1372 <!-- *********************************************************************** -->
1373
1374 <div class="doc_text">
1375
<p>A lot of these comments and recommendations have been culled from other
sources.  Two particularly important books for our work are:</p>
1378
1379 <ol>
1380
1381 <li><a href="http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876">Effective
1382 C++</a> by Scott Meyers.  Also
1383 interesting and useful are "More Effective C++" and "Effective STL" by the same
1384 author.</li>
1385
1386 <li>Large-Scale C++ Software Design by John Lakos</li>
1387
1388 </ol>
1389
<p>If you get some free time, and you haven't read them: do so; you might learn
something.</p>
1392
1393 </div>
1394
1395 <!-- *********************************************************************** -->
1396
1397 <hr>
1398 <address>
1399   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1400   src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
1401   <a href="http://validator.w3.org/check/referer"><img
1402   src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
1403
1404   <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1405   <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
1406   Last modified: $Date$
1407 </address>
1408
1409 </body>
1410 </html>