Priority-Queue Design

Overview

The priority-queue container has the following declaration:

+template<
+    typename Value_Type,
+    typename Cmp_Fn = std::less<Value_Type>,
+    typename Tag = pairing_heap_tag,
+    typename Allocator = std::allocator<char> >
+class priority_queue;
+

The parameters have the following meaning:

Value_Type is the value type.
Cmp_Fn is a value comparison functor.
Tag specifies which underlying data structure to use.
Allocator is an allocator type.

The Tag parameter specifies which underlying data structure to use. Instantiating it by pairing_heap_tag, binary_heap_tag, binomial_heap_tag, rc_binomial_heap_tag, or thin_heap_tag, specifies, respectively, an underlying pairing heap [fredman86pairing], binary heap [clrs2001], binomial heap [clrs2001], a binomial heap with a redundant binary counter [maverik_lowerbounds], or a thin heap [kt99fat_heas]. These are explained further in Implementations.
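As a minimal sketch of how the Tag parameter selects the underlying structure, the helper below (top_of_three is a hypothetical name, not part of the library) runs the same code over whichever tag it is given. It assumes GCC's libstdc++ and its ext/pb_ds headers are available.

```cpp
#include <cassert>
#include <functional>
#include <ext/pb_ds/priority_queue.hpp>

// Hypothetical helper: pushes the same three values into a priority
// queue backed by the data structure that Tag selects, and returns
// the value at the top.
template <typename Tag>
int top_of_three()
{
    __gnu_pbds::priority_queue<int, std::less<int>, Tag> q;
    q.push(3);
    q.push(7);
    q.push(5);
    return q.top();
}
```

The observable result should not depend on the tag; only the performance characteristics and guarantees differ.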

As mentioned in Tutorial::Priority Queues, __gnu_pbds::priority_queue shares most of its interface with std::priority_queue. For example, if q is a priority queue of type Q, then q.top() will return the "largest" value in the container (according to typename Q::cmp_fn). __gnu_pbds::priority_queue has a larger (and very slightly different) interface than std::priority_queue, however, since push and pop are typically deemed insufficient for manipulating priority queues.

Different settings require different priority-queue implementations, which are described in Implementations; Traits discusses ways to differentiate between the traits of different implementations.

Iterators

There are many different underlying data structures for implementing priority queues. Unfortunately, most such structures are oriented towards making push and top efficient, and consequently don't allow efficient access to other elements: for instance, they cannot support an efficient find method. In a use case where it is important to both access and "do something with" an arbitrary value, one would be out of luck. For example, many graph algorithms require modifying a value (typically increasing it in the sense of the priority queue's comparison functor).

In order to access and manipulate an arbitrary value in a priority queue, one needs to reference the internals of the priority queue from some form of an associative container; this is unavoidable. Of course, in order to maintain the encapsulation of the priority queue, this needs to be done in a way that minimizes exposure to implementation internals.

In pb_ds the priority queue's push method returns an iterator, which, if valid, can be used for subsequent modify and erase operations. This both preserves the priority queue's encapsulation and allows accessing arbitrary values (since the iterators returned by the push operation can be stored in some form of associative container).

Priority queues' iterators present a problem regarding their invalidation guarantees. One assumes that calling operator++ on an iterator will associate it with the "next" value. Priority queues are self-organizing: each operation changes what the "next" value means. Consequently, it does not make sense for push to return an iterator that can be incremented; this can have no possible use. Also, as in the case of hash-based containers, it is awkward to define whether a subsequent push operation invalidates a previously returned iterator: it invalidates it in the sense that its "next" value is not related to what it previously considered to be its "next" value. However, it might not invalidate it, in the sense that it can still be dereferenced and used for modify and erase operations.

Similarly to the case of the other unordered associative containers, pb_ds uses a distinction between point-type and range-type iterators. A priority queue's iterator can always be converted to a point_iterator, and a const_iterator can always be converted to a const_point_iterator.

The following snippet demonstrates manipulating an arbitrary value:

+// A priority queue of integers.
+priority_queue<int> p;
+
+// Insert some values into the priority queue.
+priority_queue<int>::point_iterator it = p.push(0);
+
+p.push(1);
+p.push(2);
+
+// Now modify a value.
+p.modify(it, 3);
+
+assert(p.top() == 3);
+

(Priority Queue Examples::Cross-Referencing shows a more detailed example.)
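The same stored-iterator idea applies to erase. The following is a small sketch, assuming the default pairing-heap tag (whose point-type iterators remain usable after later pushes, as Traits discusses); the function name is hypothetical.

```cpp
#include <cassert>
#include <ext/pb_ds/priority_queue.hpp>

// Hypothetical helper: remembers the iterator returned by one push,
// performs another push, then erases the remembered value and
// returns the new top.
int top_after_erasing_largest()
{
    typedef __gnu_pbds::priority_queue<int> pq_t;

    pq_t p;
    p.push(1);
    pq_t::point_iterator it = p.push(9);  // remember where 9 lives
    p.push(4);                            // a later push

    p.erase(it);                          // remove 9, not the "top" as such
    return p.top();                       // the largest remaining value
}
```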

It should be noted that an alternative design could embed an associative container in a priority queue. Could, but most probably should not. To begin with, one could always encapsulate a priority queue and an associative container mapping values to priority-queue iterators with no performance loss. One cannot, however, "un-encapsulate" a priority queue embedding an associative container, which might lead to performance loss. Assume that one needs to associate each value with some data unrelated to priority queues. Then, using pb_ds's design, one could use an associative container mapping each value to a pair consisting of this data and a priority queue's iterator. The embedded method would need two associative containers. Similar problems might arise in cases where a value can reside simultaneously in many priority queues.
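A minimal sketch of the external-index approach described above follows. The names entry, record and indexed_queue are hypothetical, std::map stands in for "some associative container", and keys are assumed to be unique.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <ext/pb_ds/priority_queue.hpp>

typedef std::pair<int, std::string> entry;         // (priority, key)
typedef __gnu_pbds::priority_queue<entry> pq_t;    // default pairing heap

// One map entry carries both the unrelated data and the queue iterator,
// so a single associative container suffices.
struct record
{
    std::string payload;         // data unrelated to the priority queue
    pq_t::point_iterator where;  // position of the key's entry in the queue
};

// Hypothetical wrapper; assumes each key is inserted at most once.
class indexed_queue
{
public:
    void insert(const std::string& key, int priority,
                const std::string& payload)
    {
        record r;
        r.payload = payload;
        r.where = q_.push(entry(priority, key));
        index_[key] = r;
    }

    // Changes the priority of an existing key; returns false if absent.
    bool reprioritize(const std::string& key, int priority)
    {
        std::map<std::string, record>::iterator found = index_.find(key);
        if (found == index_.end())
            return false;
        q_.modify(found->second.where, entry(priority, key));
        return true;
    }

    const std::string& top_key() const { return q_.top().second; }

    // Throws std::out_of_range if the key is absent.
    const std::string& payload_of(const std::string& key) const
    {
        return index_.at(key).payload;
    }

private:
    pq_t q_;
    std::map<std::string, record> index_;
};
```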

Implementations

There are three main implementations of priority queues: the first employs a binary heap, typically one which uses a sequence; the second uses a tree (or forest of trees), which is typically less structured than an associative container's tree; the third simply uses an associative container. These are shown, respectively, in Figures Underlying Priority-Queue Data-Structures A1 and A2, Figure Underlying Priority-Queue Data-Structures B, and Figure Underlying Priority-Queue Data-Structures C.

Underlying Priority-Queue Data-Structures.

Roughly speaking, any value that is both pushed and popped from a priority queue must incur a logarithmic expense (in the amortized sense). Any priority-queue implementation that avoided this would violate known bounds on comparison-based sorting (see, e.g., [clrs2001] and [brodal96priority]).
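The reduction behind this bound can be sketched as follows: pushing n values and then popping them all yields the values in sorted order, so n push/pop pairs cannot cost less than a comparison sort of n values. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <ext/pb_ds/priority_queue.hpp>

// Hypothetical helper: sorts by pushing every value and popping them
// all. With the default std::less comparison the "largest" value is
// popped first, so the output is in non-increasing order.
std::vector<int> sort_descending(const std::vector<int>& in)
{
    __gnu_pbds::priority_queue<int> q;
    for (std::size_t i = 0; i < in.size(); ++i)
        q.push(in[i]);

    std::vector<int> out;
    while (!q.empty())
    {
        out.push_back(q.top());
        q.pop();
    }
    return out;
}
```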

Most implementations do not differ in the asymptotic amortized complexity of push and pop operations, but they differ in the constants involved, in the complexity of other operations (e.g., modify), and in the worst-case complexity of single operations. In general, the more "structured" an implementation (i.e., the more internal invariants it possesses), the higher its amortized complexity of push and pop operations.

pb_ds implements different algorithms using a single class: priority_queue. Instantiating the Tag template parameter "selects" the implementation:

Instantiating Tag = binary_heap_tag creates a binary heap of the form in Figures Underlying Priority-Queue Data-Structures A1 or A2. The former is internally selected by priority_queue if Value_Type is instantiated by a primitive type (e.g., an int); the latter is internally selected for all other types (e.g., std::string). This implementation is relatively unstructured, and so has good push and pop performance; it is the "best-in-kind" for primitive types, e.g., ints. Conversely, it has poor worst-case performance, and can support only linear-time modify and erase operations; this is explained further in Traits.
Instantiating Tag = pairing_heap_tag creates a pairing heap of the form in Figure Underlying Priority-Queue Data-Structures B. This implementation too is relatively unstructured, and so has good push and pop performance; it is the "best-in-kind" for non-primitive types, e.g., std::strings. It also has very good worst-case push and join performance (O(1)), but has high worst-case pop complexity.
Instantiating Tag = binomial_heap_tag creates a binomial heap of the form in Figure Underlying Priority-Queue Data-Structures B. This implementation is more structured than a pairing heap, and so has worse push and pop performance. Conversely, it has sub-linear worst-case bounds for, e.g., pop, and so it might be preferred in cases where responsiveness is important.
Instantiating Tag = rc_binomial_heap_tag creates a binomial heap of the form in Figure Underlying Priority-Queue Data-Structures B, accompanied by a redundant counter which governs the trees. This implementation is therefore more structured than a binomial heap, and so has worse push and pop performance. Conversely, it guarantees O(1) push complexity, and so it might be preferred in cases where the responsiveness of a binomial heap is insufficient.
Instantiating Tag = thin_heap_tag creates a thin heap of the form in Figure Underlying Priority-Queue Data-Structures B. This implementation too is more structured than a pairing heap, and so has worse push and pop performance. Conversely, it has better worst-case and identical amortized complexities than a Fibonacci heap, and so might be more appropriate for some graph algorithms.

Priority-Queue Performance Tests shows some results for the above, and discusses these points further.

Of course, one can use any order-preserving associative container as a priority queue, as in Figure Underlying Priority-Queue Data-Structures C, possibly by creating an adapter class over the associative container (much as std::priority_queue can adapt std::vector). This has the advantage that no cross-referencing is necessary at all; the priority queue itself is an associative container. However, most associative containers are too structured to compete with priority queues in terms of push and pop performance.
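Such an adapter might look roughly like the sketch below. The class set_backed_queue and its member names are hypothetical, and std::multiset is used only as a familiar example of an order-preserving associative container.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical adapter: an order-preserving associative container used
// as a priority queue. No cross-referencing is needed, because the
// container itself supports lookup and erasure by value.
template <typename T>
class set_backed_queue
{
public:
    void push(const T& v) { s_.insert(v); }

    // Precondition: !empty(). The largest value is the last in order.
    const T& top() const { return *s_.rbegin(); }

    // Precondition: !empty().
    void pop()
    {
        typename std::multiset<T>::iterator last = s_.end();
        --last;
        s_.erase(last);
    }

    bool contains(const T& v) const { return s_.find(v) != s_.end(); }

    // Erases one occurrence of v; returns false if v is absent.
    bool erase_one(const T& v)
    {
        typename std::multiset<T>::iterator it = s_.find(v);
        if (it == s_.end())
            return false;
        s_.erase(it);
        return true;
    }

    std::size_t size() const { return s_.size(); }
    bool empty() const { return s_.empty(); }

private:
    std::multiset<T> s_;
};
```

Every operation here is logarithmic, including push, which is where the more structured container tends to lose to a dedicated heap.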

Traits

It would be nice if all priority queues could share exactly the same behavior regardless of implementation. Sadly, this is not possible. One example is the join operation: joining two binary heaps might throw an exception (without corrupting either of the heaps on which it operates), but joining two pairing heaps is exception free.
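As an illustrative sketch, a generic join wrapper has to allow for the throwing case even though some implementations never throw. The name try_join is hypothetical; catching everything is a simplifying assumption, since the exact exception type is not stated here.

```cpp
#include <cassert>
#include <functional>
#include <ext/pb_ds/priority_queue.hpp>

// Hypothetical helper: moves every value of src into dst.
// Returns false if the join threw; in that case, per the text above,
// neither queue is corrupted.
template <typename Q>
bool try_join(Q& dst, Q& src)
{
    try
    {
        dst.join(src);   // on success, src is left empty
        return true;
    }
    catch (...)          // assumption: the exception type is unspecified
    {
        return false;
    }
}
```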

Tags and traits are very useful for manipulating generic types. __gnu_pbds::priority_queue publicly defines container_category as one of the tags discussed in Implementations. Given any container Cntnr, the tag of the underlying data structure can be found via typename Cntnr::container_category; this is one of the types shown in Figure Data-structure tag class hierarchy.

Data-structure tag class hierarchy.

Additionally, a traits mechanism can be used to query a container type for its attributes. Given any container Cntnr, __gnu_pbds::container_traits<Cntnr> is a traits class identifying the properties of the container.

For example, to find out whether joining two objects of a container type might throw, one can use container_traits<Cntnr>::split_join_can_throw.
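A short sketch of both queries follows. The helper names and the two typedefs are hypothetical; the expected answers follow from the statements above about binary and pairing heaps.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <ext/pb_ds/priority_queue.hpp>

typedef __gnu_pbds::priority_queue<int> pairing_pq;  // default tag
typedef __gnu_pbds::priority_queue<
    int, std::less<int>, __gnu_pbds::binary_heap_tag> binary_pq;

// Hypothetical helper: true if joining two Cntnr objects might throw.
template <typename Cntnr>
bool join_can_throw()
{
    return __gnu_pbds::container_traits<Cntnr>::split_join_can_throw;
}

// Hypothetical helper: true if Cntnr's underlying structure is Tag.
template <typename Cntnr, typename Tag>
bool has_tag()
{
    return std::is_same<typename Cntnr::container_category, Tag>::value;
}
```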

Different priority-queue implementations have different invalidation guarantees. This is especially important since, as explained in Iterators, there is no way to access an arbitrary value of a priority queue except through iterators. Similarly to associative containers, one can use container_traits<Cntnr>::invalidation_guarantee to get the invalidation guarantee type of a priority queue.

It is easy to understand from Figure Underlying Priority-Queue Data-Structures what container_traits<Cntnr>::invalidation_guarantee will be for different implementations. All implementations of type Underlying Priority-Queue Data-Structures B have point_invalidation_guarantee: the container can freely internally reorganize the nodes; range-type iterators are invalidated, but point-type iterators are always valid. Implementations of type Underlying Priority-Queue Data-Structures A1 and A2 have basic_invalidation_guarantee: the container can freely internally reallocate the array; both point-type and range-type iterators might be invalidated.
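Generic code can check this guarantee at compile time. The sketch below uses a hypothetical helper name and relies on point_invalidation_guarantee being derived from basic_invalidation_guarantee in the library's tag hierarchy.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <ext/pb_ds/priority_queue.hpp>

// Hypothetical helper: true if point-type iterators of Cntnr stay
// valid while the container is modified, i.e. if its guarantee is
// point_invalidation_guarantee or something derived from it.
template <typename Cntnr>
bool point_iterators_stay_valid()
{
    typedef typename __gnu_pbds::container_traits<
        Cntnr>::invalidation_guarantee guarantee;
    return std::is_base_of<
        __gnu_pbds::point_invalidation_guarantee, guarantee>::value;
}
```

A wrapper that stores iterators across pushes could refuse to compile, or fall back to another strategy, when this returns false.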

This has major implications, and constitutes a good reason to avoid using binary heaps. A binary heap can perform modify or erase efficiently given a valid point-type iterator. However, in order to supply it with a valid point-type iterator, one needs to iterate (linearly) over all values, then supply the relevant iterator (recall that a range-type iterator can always be converted to a point-type iterator). This means that if the number of modify or erase operations is non-negligible (say, super-logarithmic in the total sequence of operations), binary heaps will perform badly.
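The linear scan described above can be sketched as follows. The function name is hypothetical, and the loop stops as soon as it calls modify, since the range-type iterator should not be used afterwards.

```cpp
#include <cassert>
#include <functional>
#include <ext/pb_ds/priority_queue.hpp>

typedef __gnu_pbds::priority_queue<
    int, std::less<int>, __gnu_pbds::binary_heap_tag> binary_pq;

// Hypothetical helper: finds old_value by iterating over every value
// (linear time), converts the range-type iterator to a point-type
// iterator, and modifies the value through it.
bool modify_by_scan(binary_pq& q, int old_value, int new_value)
{
    for (binary_pq::iterator it = q.begin(); it != q.end(); ++it)
    {
        if (*it == old_value)
        {
            binary_pq::point_iterator pit = it;  // range -> point
            q.modify(pit, new_value);
            return true;  // stop: 'it' must not be advanced now
        }
    }
    return false;
}
```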
