X-Git-Url: http://demsky.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FSystemLibrary.html;h=a3629d9dcdaf2aeeec8f60b252925175ffcb772b;hb=dc81e5da271ed394e2029c83458773c4ae2fc5f4;hp=5e5bd8a1217f1dd355f06b3efe2b3ced903baaed;hpb=95189201d1d69f56a900fcb550efa13cfc4f90f5;p=oota-llvm.git diff --git a/docs/SystemLibrary.html b/docs/SystemLibrary.html index 5e5bd8a1217..a3629d9dcda 100644 --- a/docs/SystemLibrary.html +++ b/docs/SystemLibrary.html @@ -2,300 +2,301 @@ "http://www.w3.org/TR/html4/strict.dtd"> + System Library -
System Library
- -
-

Warning: This document is a work in progress.

-
- +

System Library

-

Written by Reid Spencer

+

Written by Reid Spencer

-
Abstract
-
-

This document describes the requirements, design, and implementation - details of LLVM's System Library. The library is composed of the header files - in llvm/include/llvm/System and the source files in - llvm/lib/System. The goal of this library is to completely shield - LLVM from the variations in operating system interfaces. By centralizing - LLVM's use of operating system interfaces, we make it possible for the LLVM - tool chain and runtime libraries to be more easily ported to new platforms. - The library also unclutters the rest of LLVM from #ifdef use and special - cases for specific operating systems.

-

The System Library was donated to LLVM by Reid Spencer who formulated the - original design as part of the eXtensible Programming System (XPS) which is - based, in part, on LLVM.

+

Abstract

+
+

This document provides some details on LLVM's System Library, located in + the source at lib/System and include/llvm/System. The + library's purpose is to shield LLVM from the differences between operating + systems for the few services LLVM needs from the operating system. Much of + LLVM is written using portability features of standard C++. However, in a few + areas, system dependent facilities are needed and the System Library is the + wrapper around those system calls.

+

By centralizing LLVM's use of operating system interfaces, we make it + possible for the LLVM tool chain and runtime libraries to be more easily + ported to new platforms since (theoretically) only lib/System needs + to be ported. This library also unclutters the rest of LLVM from #ifdef use + and special cases for specific operating systems. Such uses are replaced + with simple calls to the interfaces provided in include/llvm/System. +

+

Note that the System Library is not intended to be a complete operating + system wrapper (such as the Adaptive Communications Environment (ACE) or + Apache Portable Runtime (APR)), but only provides the functionality necessary + to support LLVM. +

The System Library was written by Reid Spencer who formulated the + design based on similar work originating from the eXtensible Programming + System (XPS). Several people helped with the effort; especially, + Jeff Cohen and Henrik Bach on the Win32 port.

-
- System Library Requirements -
-
-

The System library's requirements are aimed at shielding LLVM from the - variations in operating system interfaces. The following sections define the - requirements needed to fulfill this objective.

-
+

+ Keeping LLVM Portable +

+
+

In order to keep LLVM portable, LLVM developers should adhere to a set of + portability rules associated with the System Library. Adherence to these rules + should help the System Library achieve its goal of shielding LLVM from the + variations in operating system interfaces and doing so efficiently. The + following sections define the rules needed to fulfill this objective.

- -
-

To be written.

+

Don't Include System Headers

+
+

Except in lib/System, no LLVM source code should directly + #include a system header. Care has been taken to remove all such + #includes from LLVM while lib/System was being + developed. Specifically this means that header files like "unistd.h", + "windows.h", "stdio.h", and "string.h" are forbidden to be included by LLVM + source code outside the implementation of lib/System.

+

To obtain system-dependent functionality, existing interfaces to the system + found in include/llvm/System should be used. If an appropriate + interface is not available, it should be added to include/llvm/System + and implemented in lib/System for all supported platforms.

- -
-

To be written.

+

Don't Expose System Headers

+
+

The System Library must shield LLVM from all system headers. To + obtain system level functionality, LLVM source must + #include "llvm/System/Thing.h" and nothing else. This means that + Thing.h cannot expose any system header files. This protects LLVM + from accidentally using system specific functionality and only allows it + via the lib/System interface.

- -
-

To be written.

+

Use Standard C Headers

+
+

The standard C headers (the ones beginning with "c") are allowed + to be exposed through the lib/System interface. These headers and + the things they declare are considered to be platform agnostic. LLVM source + files may include them directly or obtain their inclusion through + lib/System interfaces.

- -
-

To be written.

+

Use Standard C++ Headers

+
+

The standard C++ headers from the standard C++ library and + standard template library may be exposed through the lib/System + interface. These headers and the things they declare are considered to be + platform agnostic. LLVM source files may include them or obtain their + inclusion through lib/System interfaces.

- -
-

To be written.

+

High Level Interface

+
+

The entry points specified in the interface of lib/System must be aimed at + completing some reasonably high level task needed by LLVM. We do not want to + simply wrap each operating system call. It would be preferable to wrap several + operating system calls that are always used in conjunction with one another by + LLVM.

+

For example, consider what is needed to execute a program, wait for it to + complete, and return its result code. On Unix, this involves the following + operating system calls: getenv, fork, execve, and wait. The + correct thing for lib/System to provide is a function, say + ExecuteProgramAndWait, that implements the functionality completely. + What we don't want are wrappers for the individual operating system calls involved.

+

There must not be a one-to-one relationship between operating + system calls and the System library's interface. Any such interface function + will be suspicious.
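To illustrate the granularity argued for above, here is a minimal Unix-only sketch of such a function. Only the name ExecuteProgramAndWait comes from the text; the signature and return convention are assumptions made for illustration, and the sketch uses execvp (which performs the PATH lookup itself) and waitpid in place of the getenv, execve, and wait calls listed above.

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical sketch (Unix only): run Program with Args, wait for it to
// finish, and return its exit code. Returns -1 if the child could not be
// created or did not exit normally.
int ExecuteProgramAndWait(const std::string &Program,
                          const std::vector<std::string> &Args) {
  // Build the null-terminated argv array that execvp expects.
  std::vector<char *> Argv;
  Argv.push_back(const_cast<char *>(Program.c_str()));
  for (const std::string &A : Args)
    Argv.push_back(const_cast<char *>(A.c_str()));
  Argv.push_back(nullptr);

  pid_t Child = fork();
  if (Child < 0)
    return -1; // fork failed

  if (Child == 0) {
    // In the child: replace the process image; execvp searches PATH.
    execvp(Program.c_str(), Argv.data());
    _exit(127); // reached only if execvp failed
  }

  // In the parent: wait for the child, retrying if interrupted by a signal.
  int Status = 0;
  while (waitpid(Child, &Status, 0) < 0) {
    if (errno != EINTR)
      return -1;
  }
  return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
}
```

A caller sees one call and one result code; none of the underlying system calls or their intermediate error states leak into the rest of the code.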

- -
-

To be written.

-
- - - -
-

In order to fulfill the requirements of the system library, strict design - objectives must be maintained in the library as it evolves. The goal here - is to provide interfaces to operating system concepts (files, memory maps, - sockets, signals, locking, etc) efficiently and in such a way that the - remainder of LLVM is completely operating system agnostic.

+

No Unused Functionality

+
+

There must be no functionality specified in the interface of lib/System + that isn't actually used by LLVM. We're not writing a general purpose + operating system wrapper here, just enough to satisfy LLVM's needs. And, LLVM + doesn't need much. This design goal aims to keep the lib/System interface + small and understandable which should foster its actual use and adoption.

- -
-

no public data

-

only primitive typed private/protected data

-

data size is "right" for platform, not max of all platforms

-

each class corresponds to O/S concept

+

No Duplicate Implementations

+
+

The implementation of a function for a given platform must be written + exactly once. This implies that it must be possible to apply a function's + implementation to multiple operating systems if those operating systems can + share the same implementation. This rule applies to the set of operating + systems supported for a given class of operating system (e.g. Unix, Win32). +

- -
-

To be written.

+

No Virtual Methods

+
+

The System Library interfaces can be called quite frequently by LLVM. In + order to make those calls as efficient as possible, we discourage the use of + virtual methods. There is no need to use inheritance for implementation + differences; it just adds complexity. The #include mechanism works + just fine.
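As a rough sketch of what this looks like in practice, the class below has a single non-virtual static method whose body is chosen at compile time. The names sketch::Process and GetPageSize are hypothetical, the two bodies are collapsed into one file only to keep the example self-contained, and the compiler-provided _WIN32 macro stands in for LLVM_ON_WIN32, which would normally come from llvm/Config/config.h.

```cpp
#include <cstddef>

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

namespace sketch {

// Hypothetical interface class: a plain static method, so every call is a
// direct call with no virtual dispatch.
class Process {
public:
  static std::size_t GetPageSize();
};

#if defined(_WIN32)
// In the layout described in this document, this body would live in a
// Win32-specific file pulled in with #include.
std::size_t Process::GetPageSize() {
  SYSTEM_INFO Info;
  GetSystemInfo(&Info);
  return static_cast<std::size_t>(Info.dwPageSize);
}
#else
// Likewise, this body would live in a Unix-specific file.
std::size_t Process::GetPageSize() {
  long Size = sysconf(_SC_PAGESIZE);
  return Size > 0 ? static_cast<std::size_t>(Size) : 0;
}
#endif

} // namespace sketch
```

Callers write sketch::Process::GetPageSize() on every platform; exactly one definition is compiled, so no base class or vtable is needed to express the platform difference.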

- -
-

To be written.

+

No Exposed Functions

+
+

Any functions defined by system libraries (i.e. not defined by lib/System) + must not be exposed through the lib/System interface, even if the header file + for that function is not exposed. This prevents inadvertent use of system + specific functionality.

+

For example, the stat system call is notorious for having + variations in the data it provides. lib/System must not declare + stat nor allow it to be declared. Instead it should provide its own + interface to discovering information about files and directories. Those + interfaces may be implemented in terms of stat but that is strictly + an implementation detail. The interface provided by the System Library must + be implemented on all platforms (even those without stat).
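A minimal Unix-only sketch of that idea follows. The names sketch::FileStatus and sketch::GetFileStatus are hypothetical and are not taken from lib/System. In a real layout the struct and the function declaration would sit in a header that includes no system headers, and only the implementation file would see sys/stat.h; both are shown in one file here for brevity.

```cpp
#include <cstdint>
#include <string>
#include <sys/stat.h> // implementation detail, never exposed to callers

namespace sketch {

// Platform-neutral result type: only primitive fields, no struct stat.
struct FileStatus {
  bool Exists;
  bool IsDirectory;
  std::uint64_t Size; // size in bytes as reported by the platform
};

// Hypothetical interface function. In this sketch any stat failure is
// reported as "does not exist".
FileStatus GetFileStatus(const std::string &Path) {
  FileStatus Result = {false, false, 0};
  struct stat Buf;
  if (::stat(Path.c_str(), &Buf) != 0)
    return Result;
  Result.Exists = true;
  Result.IsDirectory = S_ISDIR(Buf.st_mode);
  Result.Size = static_cast<std::uint64_t>(Buf.st_size);
  return Result;
}

} // namespace sketch
```

Because callers only ever see FileStatus, a platform without stat can fill in the same three fields by other means without any change to calling code.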

- -
-

To be written.

+

No Exposed Data

+
+

Any data defined by system libraries (i.e. not defined by lib/System) must + not be exposed through the lib/System interface, even if the header file for + that data is not exposed. As with functions, this prevents inadvertent use + of data that might not exist on all platforms.

- -
-

To be written.

+

Minimize Soft Errors

+
+

Operating system interfaces will generally provide error results for every + little thing that could go wrong. In almost all cases, you can divide these + error results into two groups: normal/good/soft and abnormal/bad/hard. That + is, some of the errors are simply information like "file not found", + "insufficient privileges", etc. while other errors are much harder like + "out of space", "bad disk sector", or "system call interrupted". We'll call + the first group "soft" errors and the second group "hard" + errors.

+

lib/System must always attempt to minimize soft errors. + This is a design requirement because the + minimization of soft errors can affect the granularity and the nature of the + interface. In general, if you find that you're wanting to throw soft errors, + you must review the granularity of the interface because it is likely you're + trying to implement something that is too low level. The rule of thumb is to + provide interface functions that can't fail, except when faced with + hard errors.

+

For a trivial example, suppose we wanted to add an "OpenFileForWriting" + function. For many operating systems, if the file doesn't exist, attempting + to open the file will produce an error. However, lib/System should not + simply throw that error if it occurs because it's a soft error. The problem + is that the interface function, OpenFileForWriting, is too low level. It should + be OpenOrCreateFileForWriting. In the case of the soft "doesn't exist" error, + this function would just create the file and then open it for writing.

+

This design principle needs to be maintained in lib/System because it + avoids the propagation of soft error handling throughout the rest of LLVM. + Hard errors will generally just cause a termination for an LLVM tool so don't + be bashful about throwing them.
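The OpenOrCreateFileForWriting idea can be sketched in standard C++ as follows. Only the function name comes from the text; the std::ofstream return type, the choice of append mode (so that an existing file is not truncated), and the use of std::runtime_error for the hard-error case are assumptions made for illustration.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical sketch: a missing file is not an error, it is simply
// created. Only a failure to open or create at all is reported.
std::ofstream OpenOrCreateFileForWriting(const std::string &Path) {
  // out | app creates the file if it does not exist and keeps existing
  // contents if it does.
  std::ofstream Out(Path.c_str(), std::ios::out | std::ios::app);
  if (!Out.is_open())
    throw std::runtime_error("cannot open or create file: " + Path);
  return Out;
}
```

The caller never has to check for "file not found"; the only failure it can observe is one it could not reasonably recover from anyway.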

+

Rules of thumb:

+
    +
  1. Don't throw soft errors, only hard errors.
  2. +
  3. If you're tempted to throw a soft error, re-think the interface.
  4. +
  5. Handle internally the most common normal/good/soft error conditions + so the rest of LLVM doesn't have to.
  6. +
- -
-

To be written.

-
- - - -
-

To be written.

+

No throw Specifications

+
+

None of the lib/System interface functions may be declared with C++ + throw() specifications on them. This requirement makes sure that the + compiler does not insert additional exception handling code into the interface + functions. This is a performance consideration: lib/System functions are at + the bottom of many call chains and as such can be frequently called. We + need them to be as efficient as possible. However, no routines in the + system library should actually throw exceptions.

- -
-

See bug 351 - for further details on the progress of this work.

+

Code Organization

+
+

Implementations of the System Library interface are separated by their + general class of operating system. Currently only Unix and Win32 classes are + defined but more could be added for other operating system classifications. + To distinguish which implementation to compile, the code in lib/System uses + the LLVM_ON_UNIX and LLVM_ON_WIN32 #defines provided via configure through the + llvm/Config/config.h file. Each source file in lib/System, after implementing + the generic (operating system independent) functionality needs to include the + correct implementation using a set of #if defined(LLVM_ON_XYZ) + directives. For example, if we had lib/System/File.cpp, we'd expect to see in + that file:

+

+  #if defined(LLVM_ON_UNIX)
+  #include "Unix/File.cpp"
+  #endif
+  #if defined(LLVM_ON_WIN32)
+  #include "Win32/File.cpp"
+  #endif
+  
+

The implementation in lib/System/Unix/File.cpp should handle all Unix + variants. The implementation in lib/System/Win32/File.cpp should handle all + Win32 variants. What this does is quickly differentiate the basic class of + operating system that will provide the implementation. The specific details + for a given platform must still be determined through the use of + #ifdef.

- -
-

In order to provide different implementations of the lib/System interface - for different platforms, it is necessary for the library to "sense" which - operating system is being compiled for and conditionally compile only the - applicable parts of the library. While several operating system wrapper - libraries (e.g. APR, ACE) choose to use #ifdef preprocessor statements in - combination with autoconf variables (HAVE_* family), lib/System chooses an - alternate strategy.

-

To put it succinctly, the lib/System strategy has traded "#ifdef hell" for - "#include hell". That is, a given implementation file defines one or more - functions for a particular operating system variant. The functions defined in - that file have no #ifdef's to disambiguate the platform since the file is only - compiled on one kind of platform. While this leads to the same function being - implemented differently in different files, it is our contention that this - leads to better maintenance and easier portability.

-

For example, consider a function having different implementations on a - variety of platforms. Many wrapper libraries choose to deal with the different - implementations by using #ifdef, like this:

-

-      void SomeFunction(void) {
-      #if defined __LINUX
-        // .. Linux implementation
-      #elif defined __WIN32
-        // .. Win32 implementation
-      #elif defined __SunOS
-        // .. SunOS implementation
-      #else
-      #warning "Don't know how to implement SomeFunction on this platform"
-      #endif
-      }
-  
-

The problem with this is that it's very messy to read, especially as the - number of operating systems and their variants grows. The above example is - actually tame compared to what can happen when the implementation depends on - specific flavors and versions of the operating system. In that case you end up - with multiple levels of nested #if statements. This is what we mean by "#ifdef - hell".

-

To avoid the situation above, we've chosen to locate all functions for a - given implementation file for a specific operating system into one place. This - has the following advantages:

-

    -
  • No "#ifdef hell"
  • -
  • When porting, the strategy is quite straightforward: copy the - implementation files from a similar operating system to a new directory and - re-implement them.
  • -
  • Correctness is helped during porting because the new operating system's - implementation is wholly contained in a separate directory. There's no - chance to make an error in the #if statements and affect some other - operating system's implementation.
  • -
-

So, given that we have decided to use #include instead of #if to provide - platform specific implementations, there are actually three ways we can go - about doing this. None of them are perfect, but we believe we've chosen the - lesser of the three evils. Given that there is a variable named $OS which - names the platform for which we must build, here's a summary of the three - approaches we could use to determine the correct directory:

-
    -
  1. Provide the compiler with a -I$(OS) on the command line. This could be - provided in only the lib/System makefile.
  2. -
  3. Use autoconf to transform #include statements in the implementation - files by using substitutions of @OS@. For example, if we had a file, - File.cpp.in, that contained "#include <@OS@/File.cpp>" this would get - transformed to "#include <actual/File.cpp>" where "actual" is the - actual name of the operating system
  4. -
  5. Create a link from $OBJ_DIR/platform to $SRC_DIR/$OS. This allows us to - use a generic directory name to get the correct platform, as in #include - <platform/File.cpp>
  6. -
-

Let's look at the pitfalls of each approach.

-

In approach #1, we end up with some confusion as to what gets included. - Suppose we have lib/System/File.cpp that includes just File.cpp to get the - platform specific part of the implementation. In this case, the include - directive with the <> syntax will include the right file but the include - directive with the "" syntax will recursively include the same file, - lib/System/File.cpp. In the case of #include <File.cpp>, the -I options - to the compiler are searched first so it works. But in the #include "File.cpp" - case, the current directory is searched first. Furthermore, in both cases, - neither include directive documents which File.cpp is getting included.

-

In approach #2, we have the problem of needing to reconfigure repeatedly. - Developers generally hate that and we don't want lib/System to be a thorn in - everyone's side because it will constantly need updating as operating systems - change and as new operating systems are added. The problem occurs when a new - implementation file is added to the library. First of all, you have to add a - file with the .in suffix, then you have to add that file name to the list of - configurable files in the autoconf/configure.ac file, then you have to run - AutoRegen.sh to rebuild the configure script, then you have to run the - configure script. This is deemed to be a pretty large hassle.

-

In approach #3, we have the problem that not all platforms support links. - Fortunately the autoconf macro used to create the link can compensate for - this. If a link can't be made, the configure script will copy the correct - directory from $BUILD_SRC_DIR to $BUILD_OBJ_DIR under the new name. The only - problem with this is that if a copy is made, the copy doesn't get updated if - the programmer adds or modifies files in the $BUILD_SRC_DIR. A reconfigure or - manual copying is needed to get things to compile.

-

The approach we have taken in lib/System is #3. Here's why:

-

    -
  • Approach #1 is rejected because it doesn't document what's actually - getting included and the potential for mistakes with alternate include - directive forms is high.
  • -
  • Approaches #2 and #3 are both viable and only really impact development when new - files are added to the library.
  • -
  • However, approach #2 impacts every new file on every platform all the - time. With approach #3, only those platforms not supporting links will be - affected. The number of platforms not supporting links is very small and - they are generally archaic.
  • -
  • Given the above, approach #3 seems to have the least impact.
  • -
+

Consistent Semantics

+
+

The implementation of a lib/System interface can vary drastically between + platforms. That's okay as long as the end result of the interface function + is the same. For example, a function to create a directory is pretty + straightforward on all operating systems. System V IPC on the other hand isn't even + supported on all platforms. Instead of "supporting" System V IPC, lib/System + should provide an interface to the basic concept of inter-process + communications. The implementations might use System V IPC if that was + available or named pipes, or whatever gets the job done effectively for a + given operating system. In all cases, the interface and the implementation + must be semantically consistent.

-
- Reference Implementation +

Bug 351

+
+

See bug 351 + for further details on the progress of this work.

-
-

The Linux implementation of the system library will always be the - reference implementation. This means that (a) the concepts defined by the - Linux implementation must be identically replicated in the other implementations and (b) the - Linux implementation must always be complete (provide implementations for all - concepts).

+
@@ -303,12 +304,12 @@
Valid CSS! + src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"> Valid HTML 4.01! + src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"> Reid Spencer
- LLVM Compiler Infrastructure
+ LLVM Compiler Infrastructure
Last modified: $Date$