Design


Goals

The libstdc++ debug mode replaces unsafe (but efficient) standard containers and iterators with semantically equivalent safe standard containers and iterators to aid in debugging user programs. The following goals directed the design of the libstdc++ debug mode:

Correctness: the libstdc++ debug mode must not change the semantics of the standard library for any case specified in the ANSI/ISO C++ standard. The essence of this constraint is that any valid C++ program should behave in the same manner regardless of whether it is compiled with debug mode or release mode. In particular, entities that are defined in namespace std in release mode should remain defined in namespace std in debug mode, so that legal specializations of namespace std entities will remain valid. A program that is not valid C++ (e.g., invokes undefined behavior) is not required to behave similarly, although the debug mode will abort with a diagnostic when it detects undefined behavior.
Performance: the addition of the libstdc++ debug mode must not affect the performance of the library when it is compiled in release mode. Performance of the libstdc++ debug mode is secondary (and, in fact, will be worse than the release mode).
Usability: the libstdc++ debug mode should be easy to use. It should be easily incorporated into the user's development environment (e.g., by requiring only a single new compiler switch) and should produce reasonable diagnostics when it detects a problem with the user program. Usability also involves detection of errors when using the debug mode incorrectly, e.g., by linking a release-compiled object against a debug-compiled object if in fact the resulting program will not run correctly.
Minimize recompilation: While it is expected that users recompile at least part of their program to use debug mode, the amount of recompilation affects the detect-compile-debug turnaround time. This indirectly affects the usefulness of the debug mode, because debugging some applications may require rebuilding a large amount of code, which may not be feasible when the suspect code may be very localized. There are several levels of conformance to this requirement, each with its own usability and implementation characteristics. In general, the higher-numbered conformance levels are more usable (i.e., require less recompilation) but are more complicated to implement than the lower-numbered conformance levels.
1. Full recompilation: The user must recompile his or her entire application and all C++ libraries it depends on, including the C++ standard library that ships with the compiler. This must be done even if only a small part of the program can use debugging features.
2. Full user recompilation: The user must recompile his or her entire application and all C++ libraries it depends on, but not the C++ standard library itself. This must be done even if only a small part of the program can use debugging features. This can be achieved given a full recompilation system by compiling two versions of the standard library when the compiler is installed and linking against the appropriate one, e.g., a multilibs approach.
3. Partial recompilation: The user must recompile the parts of his or her application and the C++ libraries it depends on that will use the debugging facilities directly. This means that any code that uses the debuggable standard containers would need to be recompiled, but code that does not use them (but may, for instance, use IOStreams) would not have to be recompiled.
4. Per-use recompilation: The user must recompile the parts of his or her application and the C++ libraries it depends on where debugging should occur, and any other code that interacts with those containers. This means that a set of translation units that accesses a particular standard container instance may either be compiled in release mode (no checking) or debug mode (full checking), but must all be compiled in the same way; a translation unit that does not see that standard container instance need not be recompiled. This also means that a translation unit A that contains a particular instantiation (say, std::vector<int>) compiled in release mode can be linked against a translation unit B that contains the same instantiation compiled in debug mode (a feature not present with partial recompilation). While this behavior is technically a violation of the One Definition Rule, this ability tends to be very important in practice. The libstdc++ debug mode supports this level of recompilation.
5. Per-unit recompilation: The user must only recompile the translation units where checking should occur, regardless of where debuggable standard containers are used. This has also been dubbed "-g mode", because the -g compiler switch works in this way, emitting debugging information at a per-translation-unit granularity. We believe that this level of recompilation is in fact not possible if we intend to supply safe iterators, leave the program semantics unchanged, and not regress in performance under release mode, because we cannot associate extra information with an iterator (to form a safe iterator) without either reserving that space in release mode (performance regression) or allocating extra memory associated with each iterator with new (changes the program semantics).

Methods

This section provides an overall view of the design of the libstdc++ debug mode and details the relationship between design decisions and the stated design goals.

The Wrapper Model

The libstdc++ debug mode uses a wrapper model where the debugging versions of library components (e.g., iterators and containers) form a layer on top of the release versions of the library components. The debugging components first verify that the operation is correct (aborting with a diagnostic if an error is found) and then forward to the underlying release-mode container, which performs the actual work. This design decision ensures that we cannot regress release-mode performance (because the release-mode containers are left untouched) and partially enables mixing debug and release code at link time, a topic taken up under release- and debug-mode coexistence.

Two types of wrappers are used in the implementation of the debug mode: container wrappers and iterator wrappers. The two types of wrappers interact to maintain relationships between iterators and their associated containers, which are necessary to detect certain types of standard library usage errors such as dereferencing past-the-end iterators or inserting into a container using an iterator from a different container.
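The verify-then-forward step of the wrapper model can be sketched as follows. This is only an illustration under stated assumptions: the namespace sketch, the class checked_vector, and the member index_ok are hypothetical names, not part of libstdc++, and a plain assert stands in for the library's diagnostics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the libstdc++ implementation.
namespace sketch
{
  // A checking layer on top of std::vector, which is left untouched.
  template<typename T>
    class checked_vector : public std::vector<T>
    {
      typedef std::vector<T> base_type;

    public:
      typedef typename base_type::size_type size_type;
      typedef typename base_type::reference reference;

      checked_vector() : base_type() { }
      checked_vector(size_type n, const T& value) : base_type(n, value) { }

      // The precondition of operator[]: the index must be in range.
      bool index_ok(size_type n) const { return n < this->size(); }

      reference operator[](size_type n)
      {
        assert(index_ok(n));              // 1. verify, aborting on failure
        return base_type::operator[](n);  // 2. forward to the release code
      }

      // Access to the unchecked component underneath the wrapper.
      base_type& base() { return *this; }
    };
}
```

Because the checks live entirely in the derived class, code that uses std::vector directly pays nothing for them, which is the property the performance goal asks for.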

Safe Iterators

Iterator wrappers provide a debugging layer over any iterator that is attached to a particular container, and will manage the information detailing the iterator's state (singular, dereferenceable, etc.) and tracking the container to which the iterator is attached. Because iterators have a well-defined, common interface, the iterator wrapper is implemented with the iterator adaptor class template __gnu_debug::_Safe_iterator, which takes two template parameters:

Iterator: The underlying iterator type, which must be either the iterator or const_iterator typedef from the sequence type this iterator can reference.
Sequence: The type of sequence that this iterator references. This sequence must be a safe sequence (discussed below) whose iterator or const_iterator typedef is the type of the safe iterator.
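To make the two parameters concrete, here is a minimal sketch of an iterator wrapper that records its owning sequence and can report whether it is singular, past-the-end, or dereferenceable. The names int_sequence and safe_iterator are hypothetical, and the version counter is an assumption chosen for brevity; it is not a description of how __gnu_debug::_Safe_iterator tracks invalidation.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch
{
  // Hypothetical safe sequence: a vector plus a modification counter.
  class int_sequence
  {
  public:
    int_sequence() : version_(0) { }

    // Any push_back may reallocate, so treat it as invalidating.
    void push_back(int x) { data_.push_back(x); ++version_; }
    std::size_t version() const { return version_; }
    std::vector<int>& base() { return data_; }

  private:
    std::vector<int> data_;
    std::size_t version_;
  };

  // Iterator is the wrapped iterator type, Sequence the owning sequence.
  template<typename Iterator, typename Sequence>
    class safe_iterator
    {
    public:
      safe_iterator() : current_(), sequence_(0), version_(0) { }
      safe_iterator(Iterator it, Sequence* seq)
      : current_(it), sequence_(seq), version_(seq->version()) { }

      // Singular: never attached, or invalidated since it was created.
      bool singular() const
      { return !sequence_ || version_ != sequence_->version(); }

      bool past_the_end() const
      { return !singular() && current_ == sequence_->base().end(); }

      bool dereferenceable() const
      { return !singular() && !past_the_end(); }

      bool attached_to(const Sequence* seq) const
      { return sequence_ == seq; }

      typename std::iterator_traits<Iterator>::reference operator*() const
      {
        assert(dereferenceable());  // check first, then forward
        return *current_;
      }

    private:
      Iterator current_;
      Sequence* sequence_;
      std::size_t version_;
    };
}
```

The attached_to query is what allows a container operation to reject an iterator that belongs to a different container.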

Safe Sequences (Containers)

Container wrappers provide a debugging layer over a particular container type. Because containers vary greatly in the member functions they support and the semantics of those member functions (especially in the area of iterator invalidation), container wrappers are tailored to the container they reference, e.g., the debugging version of std::list duplicates the entire interface of std::list, adding additional semantic checks and then forwarding operations to the real std::list (a public base class of the debugging version) as appropriate. However, all safe containers inherit from the class template __gnu_debug::_Safe_sequence, instantiated with the type of the safe container itself (an instance of the curiously recurring template pattern).

The iterators of a container wrapper will be safe iterators that reference sequences of this type and wrap the iterators provided by the release-mode base class. The debugging container will use only the safe iterators within its own interface (therefore requiring the user to use safe iterators, although this does not change correct user code) and will communicate with the release-mode base class with only the underlying, unsafe, release-mode iterators that the base class exports.

The debugging version of std::list will have the following basic structure:

// debug-list and release-list are placeholders for the actual class names
template<typename _Tp, typename _Allocator = allocator<_Tp> >
  class debug-list :
    public release-list<_Tp, _Allocator>,
    public __gnu_debug::_Safe_sequence<debug-list<_Tp, _Allocator> >
  {
    typedef release-list<_Tp, _Allocator> _Base;
    typedef debug-list<_Tp, _Allocator>   _Self;

  public:
    typedef __gnu_debug::_Safe_iterator<typename _Base::iterator, _Self>       iterator;
    typedef __gnu_debug::_Safe_iterator<typename _Base::const_iterator, _Self> const_iterator;

    // duplicate std::list interface with debugging semantics
  };

Precondition Checking

The debug mode operates primarily by checking the preconditions of all standard library operations that it supports. Preconditions that are always checked (regardless of whether or not we are in debug mode) are checked via the __check_xxx macros defined and documented in the source file include/debug/debug.h. Preconditions that may or may not be checked, depending on the debug-mode macro _GLIBCXX_DEBUG, are checked via the __requires_xxx macros defined and documented in the same source file. Preconditions are validated using any additional information available at run-time, e.g., the containers that are associated with a particular iterator, the position of the iterator within those containers, the distance between two iterators that may form a valid range, etc. In the absence of suitable information, e.g., an input iterator that is not a safe iterator, these precondition checks will silently succeed.

The majority of precondition checks use the aforementioned macros, which have the secondary benefit of having prewritten debug messages that use information about the current status of the objects involved (e.g., whether an iterator is singular or what sequence it is attached to) along with some static information (e.g., the names of the function parameters corresponding to the objects involved). When not using these macros, the debug mode uses either the debug-mode assertion macro _GLIBCXX_DEBUG_ASSERT, its pedantic cousin _GLIBCXX_DEBUG_PEDASSERT, or the assertion check macro that supports more advanced formulation of error messages, _GLIBCXX_DEBUG_VERIFY. These macros are documented more thoroughly in the debug mode source code.
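The rule that a check silently succeeds when no suitable information is available can be illustrated with tag dispatch on the iterator category. This is a sketch, not the library's code: valid_range, valid_range_aux, checked_sum, and reversed_list_range_passes are hypothetical names, and only the ordering of random access iterators is examined.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch
{
  // Random access iterators can be ordered, so the range can be checked.
  template<typename RandomIt>
    bool valid_range_aux(RandomIt first, RandomIt last,
                         std::random_access_iterator_tag)
    { return first <= last; }

  // Weaker categories carry no usable information: silently succeed.
  template<typename InputIt>
    bool valid_range_aux(InputIt, InputIt, std::input_iterator_tag)
    { return true; }

  // Select the strongest check the iterator category allows.
  template<typename It>
    bool valid_range(It first, It last)
    {
      typedef typename std::iterator_traits<It>::iterator_category category;
      return valid_range_aux(first, last, category());
    }

  // An algorithm that checks its precondition before doing any work.
  template<typename It>
    long checked_sum(It first, It last)
    {
      assert(valid_range(first, last));
      long total = 0;
      for (; first != last; ++first)
        total += *first;
      return total;
    }

  // A reversed std::list range goes undetected by this check, because
  // plain list iterators cannot be ordered.
  inline bool reversed_list_range_passes()
  {
    std::list<int> l(3, 0);
    return valid_range(l.end(), l.begin());
  }
}
```

A safe iterator would close the gap for std::list, since it knows which sequence it is attached to and where it sits within it.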

Release- and debug-mode coexistence

The libstdc++ debug mode is the first debug mode we know of that is able to provide the "Per-use recompilation" (4) guarantee, which allows release-compiled and debug-compiled code to be linked and executed together without causing unpredictable behavior. This guarantee minimizes the recompilation that users are required to perform, shortening the detect-compile-debug bug hunting cycle and making the debug mode easier to incorporate into development environments by minimizing dependencies.

Achieving link- and run-time coexistence is not a trivial implementation task. To achieve this goal we required a small extension to the GNU C++ compiler (described in the GCC Manual for C++ Extensions, see strong using), and a complex organization of debug- and release-modes. The end result is that we have achieved per-use recompilation but have had to give up some checking of the std::basic_string class template (namely, safe iterators).

Compile-time coexistence of release- and debug-mode components

Both the release-mode components and the debug-mode components need to exist within a single translation unit so that the debug versions can wrap the release versions. However, only one of these components should be user-visible at any particular time with the standard name, e.g., std::list.

In release mode, we define only the release-mode version of the component with its standard name and do not include the debugging component at all. The release mode version is defined within the namespace std. Minus the namespace associations, this method leaves the behavior of release mode completely unchanged from its behavior prior to the introduction of the libstdc++ debug mode. Here's an example of what this ends up looking like, in C++.

namespace std
{
  template<typename _Tp, typename _Alloc = allocator<_Tp> >
    class list
    {
      // ...
    };
} // namespace std

In debug mode we include the release-mode container (which is now defined in the namespace __norm) and also the debug-mode container. The debug-mode container is defined within the namespace __debug, which is associated with namespace std via the GNU namespace association extension. This method allows the debug and release versions of the same component to coexist at compile-time and link-time without causing an unreasonable maintenance burden, while minimizing confusion. Again, this boils down to C++ code as follows:

namespace std
{
  namespace __norm
  {
    template<typename _Tp, typename _Alloc = allocator<_Tp> >
      class list
      {
        // ...
      };
  } // namespace __norm

  namespace __debug
  {
    template<typename _Tp, typename _Alloc = allocator<_Tp> >
      class list
      : public __norm::list<_Tp, _Alloc>,
        public __gnu_debug::_Safe_sequence<list<_Tp, _Alloc> >
      {
        // ...
      };
  } // namespace __debug

  using namespace __debug __attribute__ ((strong));
}
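The same arrangement can be tried out with C++11 inline namespaces, which the GCC manual describes as equivalent in semantics to the namespace association extension. The sketch below is an illustration only: mylib, norm, debug, box, and user_type are hypothetical names, and nothing here is taken from the libstdc++ headers. It shows that the public name resolves to the debug version, that a user specialization written against the public name is accepted, and that the two implementations remain distinct types.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical library namespace standing in for std.
namespace mylib
{
  namespace norm
  {
    template<typename T>
      struct box
      {
        static bool checked() { return false; }
      };
  } // namespace norm

  inline namespace debug
  {
    template<typename T>
      struct box : norm::box<T>
      {
        static bool checked() { return true; }
      };
  } // namespace debug
} // namespace mylib

struct user_type { };

// A user specialization written against the public name mylib::box.
namespace mylib
{
  template<>
    struct box<user_type>
    {
      static bool specialized() { return true; }
    };

  // The two implementations are distinct types, hence distinct symbols.
  inline bool distinct_types()
  { return typeid(box<int>) != typeid(norm::box<int>); }
} // namespace mylib

inline void inline_namespace_example()
{
  assert(mylib::box<int>::checked());         // public name: debug version
  assert(!mylib::norm::box<int>::checked());  // release version still there
  assert(mylib::box<user_type>::specialized());
}
```

Allowing that specialization is exactly what a plain using directive or using declaration would not do, which is why an extension was needed at the time.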

Link- and run-time coexistence of release- and debug-mode components

Because each component has a distinct and separate release and debug implementation, there are no issues with link-time coexistence: the separate namespaces result in different mangled names, and thus unique linkage.

However, components that are defined and used within the C++ standard library itself face additional constraints. For instance, some of the member functions of std::moneypunct return std::basic_string. Normally, this is not a problem, but with a mixed mode standard library that could be using either debug-mode or release-mode basic_string objects, things get more complicated. As the return type of a function is not encoded into the mangled name, there is no way to specify a release-mode or a debug-mode string. In practice, this results in runtime errors. A simplified example of this problem is as follows.

Take this translation unit, compiled in debug-mode:

// -D_GLIBCXX_DEBUG
#include <string>

std::string test02();

std::string test01()
{
  return test02();
}

int main()
{
  test01();
  return 0;
}

... and linked to this translation unit, compiled in release mode:

#include <string>

std::string
test02()
{
  return std::string("toast");
}

For this reason we cannot easily provide safe iterators for the std::basic_string class template, as it is present throughout the C++ standard library. For instance, locale facets define typedefs that include basic_string: in a mixed debug/release program, should that typedef be based on the debug-mode basic_string or the release-mode basic_string? While the answer could be "both", and the difference hidden via renaming a la the debug/release containers, we must note two things about locale facets:

They exist as shared state: one can create a facet in one translation unit and access the facet via the same type name in a different translation unit. This means that we cannot have two different versions of locale facets, because the types would not be the same across debug/release-mode translation unit barriers.
They have virtual functions returning strings: these functions mangle in the same way regardless of the mangling of their return types (see above), and their precise signatures can be relied upon by users because they may be overridden in derived classes.

With the design of libstdc++ debug mode, we cannot effectively hide the differences between debug and release-mode strings from the user. Failure to hide the differences may result in unpredictable behavior, and for this reason we have opted to only perform basic_string changes that do not require ABI changes. The effect on users is expected to be minimal, as there are simple alternatives (e.g., __gnu_debug::basic_string), and the usability benefit we gain from the ability to mix debug- and release-compiled translation units is enormous.

Alternatives for Coexistence

The coexistence scheme above was chosen over many alternatives, including language-only solutions and solutions that also required extensions to the C++ front end. The following is a partial list of solutions, with justifications for our rejection of each.

Completely separate debug/release libraries: This is by far the simplest implementation option, where we do not allow any coexistence of debug- and release-compiled translation units in a program. This solution has an extreme negative effect on usability, because it is quite likely that some libraries an application depends on cannot be recompiled easily. This would not meet our usability or minimize recompilation criteria well.
Add a Debug boolean template parameter: Partial specialization could be used to select the debug implementation when Debug == true, and the state of _GLIBCXX_DEBUG could decide whether the default Debug argument is true or false. This option would break conformance with the C++ standard in both debug and release modes. This would not meet our correctness criteria.
Packaging a debug flag in the allocators: We could reuse the Allocator template parameter of containers by adding a sentinel wrapper debug<> that signals the user's intention to use debugging, and pick up the debug<> allocator wrapper in a partial specialization. However, this has two drawbacks: first, there is a conformance issue because the default allocator would not be the standard-specified std::allocator<T>. Secondly (and more importantly), users that specify allocators instead of implicitly using the default allocator would not get debugging containers. Thus this solution fails the correctness criteria.
Define debug containers in another namespace, and employ a using declaration (or directive): This is an enticing option, because it would eliminate the need for the link_name extension by aliasing the templates. However, there is no true template aliasing mechanism in C++, because both using directives and using declarations disallow specialization. This method fails the correctness criteria.
Use implementation-specific properties of anonymous namespaces: See this post. This method fails the correctness criteria.
Extension: allow reopening on namespaces: This would allow the debug mode to effectively alias the namespace std to an internal namespace, such as __gnu_std_debug, so that it is completely separate from the release-mode std namespace. While this will solve some renaming problems and ensure that debug- and release-compiled code cannot be mixed unsafely, it ensures that debug- and release-compiled code cannot be mixed at all. For instance, the program would have two std::cout objects! This solution would fail the minimize recompilation requirement, because we would only be able to support option (1) or (2).
Extension: use link name: This option involves complicated re-naming between debug-mode and release-mode components at compile time, and then a g++ extension called link name to recover the original names at link time. There are two drawbacks to this approach. One, it's very verbose, relying on macro renaming at compile time and several levels of include ordering. Two, ODR issues remained with container member functions taking no arguments in mixed-mode settings resulting in equivalent link names, vector::push_back() being one example. See link name.
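As an illustration of why one of these alternatives was rejected, the allocator-packaging idea can be sketched in a few lines. Everything here is hypothetical (the names debug, container, user_allocator, defaulted, and customized are invented for the sketch), and the default allocator argument is written the way the scheme would need it in debug mode.

```cpp
#include <cassert>
#include <memory>

namespace sketch
{
  // Hypothetical sentinel wrapper signalling "use the debugging container".
  template<typename Alloc>
    struct debug : Alloc { };

  // Primary template: the release container. In this scheme the debug-mode
  // default allocator is debug<std::allocator<T> >, which is already a
  // departure from the standard-specified std::allocator<T>.
  template<typename T, typename Alloc = debug<std::allocator<T> > >
    struct container
    {
      static bool is_debug() { return false; }
    };

  // Partial specialization picked up when the allocator is wrapped.
  template<typename T, typename Alloc>
    struct container<T, debug<Alloc> >
    {
      static bool is_debug() { return true; }
    };

  // A user-supplied allocator, named explicitly at the point of use.
  template<typename T>
    struct user_allocator : std::allocator<T> { };

  typedef container<int> defaulted;                         // checked
  typedef container<int, user_allocator<int> > customized;  // not checked

  inline void allocator_flag_example()
  {
    assert(defaulted::is_debug());
    assert(!customized::is_debug());
  }
}
```

The second typedef shows the more serious drawback: code that names its own allocator silently falls back to the unchecked container.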

Other options may exist for implementing the debug mode, many of which have probably been considered and others that may still be lurking. This list may be expanded over time to include other options that we could have implemented, but in all cases the full ramifications of the approach (as measured against the design goals for a libstdc++ debug mode) should be considered first. The DejaGNU testsuite includes some testcases that check for known problems with some solutions (e.g., the using declaration solution that breaks user specialization), and additional testcases will be added as we are able to identify other typical problem cases. These test cases will serve as a benchmark by which we can compare debug mode implementations.

Other Implementations

There are several existing implementations of debug modes for C++ standard library implementations, although none of them directly supports debugging for programs using libstdc++. The existing implementations include:

SafeSTL: SafeSTL was the original debugging version of the Standard Template Library (STL), implemented by Cay S. Horstmann on top of the Hewlett-Packard STL. Though it inspired much work in this area, it has not been kept up-to-date for use with modern compilers or C++ standard library implementations.
STLport: STLport is a free implementation of the C++ standard library derived from the SGI implementation, and ported to many other platforms. It includes a debug mode that uses a wrapper model (that in some way inspired the libstdc++ debug mode design), although at the time of this writing the debug mode is somewhat incomplete and meets only the "Full user recompilation" (2) recompilation guarantee by requiring the user to link against a different library in debug mode vs. release mode.
Metrowerks CodeWarrior: The C++ standard library that ships with Metrowerks CodeWarrior includes a debug mode. It is a full debug-mode implementation (including debugging for CodeWarrior extensions) and is easy to use, although it meets only the "Full recompilation" (1) recompilation guarantee.
