Data-Structure Genericity

The Basic Problem

The design attempts to address the following problem. When writing a function manipulating a generic container object, what is the behavior of the object? E.g., suppose one writes

template<typename Cntnr>
void
some_op_sequence(Cntnr &r_container)
{
    ...
}
then one needs to address the following questions in the body of some_op_sequence:
  1. Which types and methods does Cntnr support? Containers based on hash tables can be queries for the hash-functor type and object; this is meaningless for tree-based containers. Containers based on trees can be split, joined, or can erase iterators and return the following iterator; this cannot be done by hash-based containers.
  2. What are the guarantees of Cntnr? A container based on a probing hash-table invalidates all iterators when it is modified; this is not the case for containers based on node-based trees. Containers based on a node-based tree can be split or joined without exceptions; this is not the case for containers based on vector-based trees.
  3. How does the container maintain its elements? Tree-based and Trie-based containers store elements by key order; others, typically, do not. A container based on a splay trees or lists with update policies "cache" "frequently accessed" elements; containers based on most other underlying data structures do not.

The remainder of this section deals with these issues.

Container Hierarchy

Figure Container class hierarchy shows the container hierarchy.

no image
Container class hierarchy.
  1. container_base is an abstract base class for associative containers.
  2. Tree-Like-Based Associative-Containers:
    1. basic_tree is an abstract base class for tree-like-based associative-containers
    2. tree is a concrete base class for tree-based associative-containers
    3. trie is a concrete base class trie-based associative-containers
  3. Hash-Based Associative-Containers:
    1. basic_hash_table is an abstract base class for hash-based associative-containers
    2. cc_hash_table is a concrete collision-chaining hash-based associative-containers
    3. gp_hash_table is a concrete (general) probing hash-based associative-containers
  4. List-Based Associative-Containers:
    1. list_update - list-based update-policy associative container

The hierarchy is composed naturally so that commonality is captured by base classes. Thus operator[] is defined container_base, since all containers support it. Conversely split is defined in basic_tree, since only tree-like containers support it. Data-Structure Tags and Traits discusses how to query which types and methods each container supports.

Data-Structure Tags and Traits

Tags and traits are very useful for manipulating generic types. For example, if It is an iterator class, then typename It::iterator_category or typename std::iterator_traits<It>::iterator_category will yield its category, and typename std::iterator_traits<It>::value_type will yield its value type.

pb_ds contains a tag hierarchy corresponding to the hierarchy in Figure Class hierarchy. The tag hierarchy is shown in Figure Data-structure tag class hierarchy.

no image
Data-structure tag class hierarchy.

container_base publicly defines container_category as one of the classes in Figure Data-structure tag class hierarchy. Given any container Cntnr, the tag of the underlying data structure can be found via typename Cntnr::container_category.

Additionally, a traits mechanism can be used to query a container type for its attributes. Given any container Cntnr, then __gnu_pbds::container_traits<Cntnr> is a traits class identifying the properties of the container.

To find if a container can throw when a key is erased (which is true for vector-based trees, for example), one can use

container_traits<Cntnr>::erase_can_throw, for example.

Some of the definitions in container_traits are dependent on other definitions. E.g., if container_traits<Cntnr>::order_preserving is true (which is the case for containers based on trees and tries), then the container can be split or joined; in this case, container_traits<Cntnr>::split_join_can_throw indicates whether splits or joins can throw exceptions (which is true for vector-based trees); otherwise container_traits<Cntnr>::split_join_can_throw will yield a compilation error. (This is somewhat similar to a compile-time version of the COM model [mscom]).

Point-Type and Range-Type Methods and Iterators

Iterators in Unordered Container Types

pb_ds differentiates between two types of methods and iterators: point-type methods and iterators, and range-type methods and iterators (see Motivation::Associative Containers::Differentiating between Iterator Types and Tutorial::Associative Containers::Point-Type and Range-Type Methods and Iterators). Each associative container's interface includes the methods:

const_point_iterator
find(const_key_reference r_key) const;                

point_iterator
find(const_key_reference r_key);         
    
std::pair<point_iterator,bool>
insert(const_reference r_val);

The relationship between these iterator types varies between container types. Figure Point-type and range-type iterators-A shows the most general invariant between point-type and range-type iterators: iterator, e.g., can always be converted to point_iterator. Figure Point-type and range-type iterators-B shows invariants for order-preserving containers: point-type iterators are synonymous with range-type iterators. Orthogonally, Figure Point-type and range-type iterators-C shows invariants for "set" containers: iterators are synonymous with const iterators.

no image
Point-type and range-type iterators.

Note that point-type iterators in self-organizing containers (e.g., hash-based associative containers) lack movement operators, such as operator++ - in fact, this is the reason why pb_ds differentiates from the STL's design on this point.

Typically, one can determine an iterator's movement capabilities in the STL using std::iterator_traits<It>iterator_category, which is a struct indicating the iterator's movement capabilities. Unfortunately, none of the STL's predefined categories reflect a pointer's not having any movement capabilities whatsoever. Consequently, pb_ds adds a type trivial_iterator_tag (whose name is taken from a concept in [sgi_stl]), which is the category of iterators with no movement capabilities. All other STL tags, such as forward_iterator_tag retain their common use.

Invalidation Guarantees

Motivation::Associative Containers::Differentiating between Iterator Types::Invalidation Guarantees posed a problem. Given three different types of associative containers, a modifying operation (in that example, erase) invalidated iterators in three different ways: the iterator of one container remained completely valid - it could be de-referenced and incremented; the iterator of a different container could not even be de-referenced; the iterator of the third container could be de-referenced, but its "next" iterator changed unpredictably.

Distinguishing between find and range types allows fine-grained invalidation guarantees, because these questions correspond exactly to the question of whether point-type iterators and range-type iterators are valid. Invalidation guarantees class hierarchy shows tags corresponding to different types of invalidation guarantees.

no image
Invalidation guarantees class hierarchy.
  1. basic_invalidation_guarantee corresponds to a basic guarantee that a point-type iterator, a found pointer, or a found reference, remains valid as long as the container object is not modified.
  2. point_invalidation_guarantee corresponds to a guarantee that a point-type iterator, a found pointer, or a found reference, remains valid even if the container object is modified.
  3. range_invalidation_guarantee corresponds to a guarantee that a range-type iterator remains valid even if the container object is modified.

As shown in Tutorial::Associative Containers::Point-Type and Range-Type Methods and Iterators, to find the invalidation guarantee of a container, one can use

typename container_traits<Cntnr>::invalidation_guarantee

which is one of the classes in Figure Invalidation guarantees class hierarchy.

Note that this hierarchy corresponds to the logic it represents: if a container has range-invalidation guarantees, then it must also have find invalidation guarantees; correspondingly, its invalidation guarantee (in this case range_invalidation_guarantee) can be cast to its base class (in this case point_invalidation_guarantee). This means that this this hierarchy can be used easily using standard metaprogramming techniques, by specializing on the type of invalidation_guarantee.

(These types of problems were addressed, in a more general setting, in [meyers96more] - Item 2. In our opinion, an invalidation-guarantee hierarchy would solve these problems in all container types - not just associative containers.)