X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;ds=sidebyside;f=docs%2FCodingStandards.rst;h=b454e49664f093568995090b42e1d89160c85199;hb=5be77762a3aa434ee877b0a03b98b5c3a7571918;hp=de50e6eeaf5c96676585c8190214737dc3e9350e;hpb=06d9981d27acfc9d3474cfeb115e6d7da7e670d8;p=oota-llvm.git diff --git a/docs/CodingStandards.rst b/docs/CodingStandards.rst index de50e6eeaf5..b454e49664f 100644 --- a/docs/CodingStandards.rst +++ b/docs/CodingStandards.rst @@ -1,5 +1,3 @@ -.. _coding_standards: - ===================== LLVM Coding Standards ===================== @@ -146,6 +144,132 @@ useful to use C style (``/* */``) comments however: To comment out a large block of code, use ``#if 0`` and ``#endif``. These nest properly and are better behaved in general than C style comments. +Doxygen Use in Documentation Comments +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Use the ``\file`` command to turn the standard file header into a file-level +comment. + +Include descriptive ``\brief`` paragraphs for all public interfaces (public +classes, member and non-member functions). Explain API use and purpose in +``\brief`` paragraphs, don't just restate the information that can be inferred +from the API name. Put detailed discussion into separate paragraphs. + +To refer to parameter names inside a paragraph, use the ``\p name`` command. +Don't use the ``\arg name`` command since it starts a new paragraph that +contains documentation for the parameter. + +Wrap non-inline code examples in ``\code ... \endcode``. + +To document a function parameter, start a new paragraph with the +``\param name`` command. If the parameter is used as an out or an in/out +parameter, use the ``\param [out] name`` or ``\param [in,out] name`` command, +respectively. + +To describe function return value, start a new paragraph with the ``\returns`` +command. + +A minimal documentation comment: + +.. code-block:: c++ + + /// \brief Does foo and bar. + void fooBar(bool Baz); + +A documentation comment that uses all Doxygen features in a preferred way: + +.. code-block:: c++ + + /// \brief Does foo and bar. + /// + /// Does not do foo the usual way if \p Baz is true. + /// + /// Typical usage: + /// \code + /// fooBar(false, "quux", Res); + /// \endcode + /// + /// \param Quux kind of foo to do. + /// \param [out] Result filled with bar sequence on foo success. + /// + /// \returns true on success. + bool fooBar(bool Baz, StringRef Quux, std::vector &Result); + +Don't duplicate the documentation comment in the header file and in the +implementation file. Put the documentation comments for public APIs into the +header file. Documentation comments for private APIs can go to the +implementation file. In any case, implementation files can include additional +comments (not necessarily in Doxygen markup) to explain implementation details +as needed. + +Don't duplicate function or class name at the beginning of the comment. +For humans it is obvious which function or class is being documented; +automatic documentation processing tools are smart enough to bind the comment +to the correct declaration. + +Wrong: + +.. code-block:: c++ + + // In Something.h: + + /// Something - An abstraction for some complicated thing. + class Something { + public: + /// fooBar - Does foo and bar. + void fooBar(); + }; + + // In Something.cpp: + + /// fooBar - Does foo and bar. + void Something::fooBar() { ... } + +Correct: + +.. code-block:: c++ + + // In Something.h: + + /// \brief An abstraction for some complicated thing. + class Something { + public: + /// \brief Does foo and bar. + void fooBar(); + }; + + // In Something.cpp: + + // Builds a B-tree in order to do foo. See paper by... + void Something::fooBar() { ... } + +It is not required to use additional Doxygen features, but sometimes it might +be a good idea to do so. + +Consider: + +* adding comments to any narrow namespace containing a collection of + related functions or types; + +* using top-level groups to organize a collection of related functions at + namespace scope where the grouping is smaller than the namespace; + +* using member groups and additional comments attached to member + groups to organize within a class. + +For example: + +.. code-block:: c++ + + class Something { + /// \name Functions that do Foo. + /// @{ + void fooBar(); + void fooBaz(); + /// @} + ... + }; + ``#include`` Style ^^^^^^^^^^^^^^^^^^ @@ -158,17 +282,10 @@ listed. We prefer these ``#include``\s to be listed in this order: #. Main Module Header #. Local/Private Headers -#. ``llvm/*`` -#. ``llvm/Analysis/*`` -#. ``llvm/Assembly/*`` -#. ``llvm/Bitcode/*`` -#. ``llvm/CodeGen/*`` -#. ... -#. ``llvm/Support/*`` -#. ``llvm/Config/*`` +#. ``llvm/...`` #. System ``#include``\s -and each category should be sorted by name. +and each category should be sorted lexicographically by the full path. The `Main Module Header`_ file applies to ``.cpp`` files which implement an interface defined by a ``.h`` file. This ``#include`` should always be included @@ -283,7 +400,8 @@ code. That said, LLVM does make extensive use of a hand-rolled form of RTTI that use templates like `isa<>, cast<>, and dyn_cast<> `_. -This form of RTTI is opt-in and can be added to any class. It is also +This form of RTTI is opt-in and can be +:doc:`added to any class `. It is also substantially more efficient than ``dynamic_cast<>``. .. _static constructor: @@ -587,8 +705,8 @@ sort of thing is: .. code-block:: c++ bool FoundFoo = false; - for (unsigned i = 0, e = BarList.size(); i != e; ++i) - if (BarList[i]->isFoo()) { + for (unsigned I = 0, E = BarList.size(); I != E; ++I) + if (BarList[I]->isFoo()) { FoundFoo = true; break; } @@ -604,11 +722,10 @@ code to be structured like this: .. code-block:: c++ - /// containsFoo - Return true if the specified list has an element that is - /// a foo. + /// \returns true if the specified list has an element that is a foo. static bool containsFoo(const std::vector &List) { - for (unsigned i = 0, e = List.size(); i != e; ++i) - if (List[i]->isFoo()) + for (unsigned I = 0, E = List.size(); I != E; ++I) + if (List[I]->isFoo()) return true; return false; } @@ -679,7 +796,9 @@ In general, names should be in camel case (e.g. ``TextFileReader`` and As an exception, classes that mimic STL classes can have member names in STL's style of lower-case words separated by underscores (e.g. ``begin()``, -``push_back()``, and ``empty()``). +``push_back()``, and ``empty()``). Classes that provide multiple +iterators should add a singular prefix to ``begin()`` and ``end()`` +(e.g. ``global_begin()`` and ``use_begin()``). Here are some examples of good and bad names: @@ -695,8 +814,8 @@ Here are some examples of good and bad names: Vehicle MakeVehicle(VehicleType Type) { VehicleMaker M; // Might be OK if having a short life-span. - Tire tmp1 = M.makeTire(); // Bad -- 'tmp1' provides no information. - Light headlight = M.makeLight("head"); // Good -- descriptive. + Tire Tmp1 = M.makeTire(); // Bad -- 'Tmp1' provides no information. + Light Headlight = M.makeLight("head"); // Good -- descriptive. ... } @@ -716,16 +835,16 @@ enforced, and hopefully what to do about it. Here is one complete example: .. code-block:: c++ - inline Value *getOperand(unsigned i) { - assert(i < Operands.size() && "getOperand() out of range!"); - return Operands[i]; + inline Value *getOperand(unsigned I) { + assert(I < Operands.size() && "getOperand() out of range!"); + return Operands[I]; } Here are more examples: .. code-block:: c++ - assert(Ty->isPointerType() && "Can't allocate a non pointer type!"); + assert(Ty->isPointerType() && "Can't allocate a non-pointer type!"); assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!"); @@ -737,23 +856,28 @@ Here are more examples: You get the idea. -Please be aware that, when adding assert statements, not all compilers are aware -of the semantics of the assert. In some places, asserts are used to indicate a -piece of code that should not be reached. These are typically of the form: +In the past, asserts were used to indicate a piece of code that should not be +reached. These were typically of the form: .. code-block:: c++ - assert(0 && "Some helpful error message"); + assert(0 && "Invalid radix for integer literal"); + +This has a few issues, the main one being that some compilers might not +understand the assertion, or warn about a missing return in builds where +assertions are compiled out. -When used in a function that returns a value, they should be followed with a -return statement and a comment indicating that this line is never reached. This -will prevent a compiler which is unable to deduce that the assert statement -never returns from generating a warning. +Today, we have something much better: ``llvm_unreachable``: .. code-block:: c++ - assert(0 && "Some helpful error message"); - return 0; + llvm_unreachable("Invalid radix for integer literal"); + +When assertions are enabled, this will print the message if it's ever reached +and then exit the program. When assertions are disabled (i.e. in release +builds), ``llvm_unreachable`` becomes a hint to compilers to skip generating +code for this branch. If the compiler does not support this, it will fall back +to the "abort" implementation. Another issue is that values used only by assertions will produce an "unused value" warning when assertions are disabled. For example, this code will warn: @@ -905,7 +1029,7 @@ form has two problems. First it may be less efficient than evaluating it at the start of the loop. In this case, the cost is probably minor --- a few extra loads every time through the loop. However, if the base expression is more complex, then the cost can rise quickly. I've seen loops where the end -expression was actually something like: "``SomeMap[x]->end()``" and map lookups +expression was actually something like: "``SomeMap[X]->end()``" and map lookups really aren't cheap. By writing it in the second form consistently, you eliminate the issue entirely and don't even have to think about it. @@ -966,6 +1090,34 @@ flushes the output stream. In other words, these are equivalent: Most of the time, you probably have no reason to flush the output stream, so it's better to use a literal ``'\n'``. +Don't use ``inline`` when defining a function in a class definition +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A member function defined in a class definition is implicitly inline, so don't +put the ``inline`` keyword in this case. + +Don't: + +.. code-block:: c++ + + class Foo { + public: + inline void bar() { + // ... + } + }; + +Do: + +.. code-block:: c++ + + class Foo { + public: + void bar() { + // ... + } + }; + Microscopic Details ------------------- @@ -981,27 +1133,27 @@ macros. For example, this is good: .. code-block:: c++ - if (x) ... - for (i = 0; i != 100; ++i) ... - while (llvm_rocks) ... + if (X) ... + for (I = 0; I != 100; ++I) ... + while (LLVMRocks) ... somefunc(42); assert(3 != 4 && "laws of math are failing me"); - a = foo(42, 92) + bar(x); + A = foo(42, 92) + bar(X); and this is bad: .. code-block:: c++ - if(x) ... - for(i = 0; i != 100; ++i) ... - while(llvm_rocks) ... + if(X) ... + for(I = 0; I != 100; ++I) ... + while(LLVMRocks) ... somefunc (42); assert (3 != 4 && "laws of math are failing me"); - a = foo (42, 92) + bar (x); + A = foo (42, 92) + bar (X); The reason for doing this is not completely arbitrary. This style makes control flow operators stand out more, and makes expressions flow better. The function @@ -1009,11 +1161,11 @@ call operator binds very tightly as a postfix operator. Putting a space after a function name (as in the last example) makes it appear that the code might bind the arguments of the left-hand-side of a binary operator with the argument list of a function and the name of the right side. More specifically, it is easy to -misread the "``a``" example as: +misread the "``A``" example as: .. code-block:: c++ - a = foo ((42, 92) + bar) (x); + A = foo ((42, 92) + bar) (X); when skimming through the code. By avoiding a space in a function, we avoid this misinterpretation. @@ -1051,21 +1203,21 @@ If a namespace definition is small and *easily* fits on a screen (say, less than namespace llvm { namespace X86 { - /// RelocationType - An enum for the x86 relocation codes. Note that + /// \brief An enum for the x86 relocation codes. Note that /// the terminology here doesn't follow x86 convention - word means /// 32-bit and dword means 64-bit. enum RelocationType { - /// reloc_pcrel_word - PC relative relocation, add the relocated value to + /// \brief PC relative relocation, add the relocated value to /// the value already in memory, after we adjust it for where the PC is. reloc_pcrel_word = 0, - /// reloc_picrel_word - PIC base relative relocation, add the relocated - /// value to the value already in memory, after we adjust it for where the + /// \brief PIC base relative relocation, add the relocated value to + /// the value already in memory, after we adjust it for where the /// PIC base is. reloc_picrel_word = 1, - /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just - /// add the relocated value to the value already in memory. + /// \brief Absolute relocation, just add the relocated value to the + /// value already in memory. reloc_absolute_word = 2, reloc_absolute_dword = 3 }; @@ -1084,7 +1236,7 @@ closed. For example: namespace llvm { namespace knowledge { - /// Grokable - This class represents things that Smith can have an intimate + /// This class represents things that Smith can have an intimate /// understanding of and contains the data associated with it. class Grokable { ... @@ -1180,7 +1332,7 @@ namespace just because it was declared there. See Also ======== -A lot of these comments and recommendations have been culled for other sources. +A lot of these comments and recommendations have been culled from other sources. Two particularly important books for our work are: #. `Effective C++