Jonathan Boccara's blog

Algorithms on Ranges

Published December 7, 2018 - 5 Comments

In a lot of cases, using STL algorithms in C++ code allows to make it more expressive. However, some developers reported to me they had a hard time diffusing the usage of the STL in their companies, as their co-workers weren’t always keen on putting the STL in their daily coding toolbox.

There were several reasons to this, but one that came up often is that using the STL litters the code with undesirable begins and ends:

auto fortyTwo = std::find(begin(myCollection), end(myCollection), 42);

This code shows several things we don’t want to see: a begin, an end and two occurrences of myCollection instead of just one. Beurk! (“Beurk” is the French equivalent of “Ew”. I’m not claiming to do C++ with a French touch, but I think that Ew has an overly delicate utterance compared to the disgust that unnecessary low-level details spilling over the code inspires. Try to pronounce Beurk (B-er-rk). Don’t you find this vocalises the impression better?)

Using iterators in its interface gives the STL more power if anything, because it allows to perform algorithms on sub-parts of a collection: from one iterator to another one.

That said, how often do you need to perform an algorithm on a sub-part of a collection? Not that often, I guess. In general we perform algorithms on whole collections, like in the above example. This is so common that it deserves a set of overloads on taking collections (or ranges) instead of iterators:

auto fortyTwo = ranges::find(myCollection, 42);

The STL doesn’t happen to offer them, but there is little difficulty in implementing those overloads: we can just wrap a call to the STL algorithm in an interface that accepts a collection. Such overloads will be added the standard in C++20.

Until then, libraries such as range-v3 provide them. Or if you use Boost, they are available in the headers boost/range/algorithm.hpp and boost/range/numeric.hpp, in the boost::range namespace (although not all of them wrap STL implementations).

But if you don’t have access to Boost or any other library providing them, you need to add them as an internal library in your codebase.

There is a subtlety in their implementation that we are going to discuss: how to pass the collection to the algorithm?

Using forwarding references

The most straightforward way to implement such algorithms is probably to pass the collection as a forwarding reference. For example, to wrap std::copy_if:

template<typename InputRange, typename OutputIterator, typename Predicate>
constexpr OutputIterator copy_if(InputRange && range, // <- forwarding reference
                                 OutputIterator out,
                                 Predicate pred)
{
    return std::copy_if(begin(range), end(range), out, pred);
}

Passing ranges to algorithms by forwarding reference is the approach followed by the popular range-v3 library.

This is simple and does the job. But would it make sense to take advantage of the range layer around the STL to add some consts in the interface?

How about passing a reference to const?

EDIT: the following discusses the interest of using references to const in range algorithms. The article as I wrote it initially didn’t come to a definite conclusion, and called for opinions. Like you’ll see in the EDIT at the end of the post, Reddit user tcanens kindly provided a rationale to prefer forwarding references.

When it comes to STL algorithms, stateless is stressless. For example, if you’re calling a std::copy_if by passing it a function (or function object) representing a predicate, it seems reasonable that this predicate doesn’t modify the elements of the collection:

std::copy_if(begin(myCollection), end(myCollection), shouldCopy);
// shouldCopy should not modify its parameter

But, by using iterators in its interface, the original std::copy_if doesn’t have any way to enforce that the collection is not modified by an algorithm.

However, by taking the collection as a whole, we now have the power to force it to be const for the purpose of the algorithm:

template<typename InputRange, typename OutputIterator, typename Predicate>
constexpr OutputIterator copy_if(InputRange const& range, // <- note the const
                                 OutputIterator out,
                                 Predicate pred);

This doesn’t apply to all algorithms. Some algorithms are designed to modify the collection. For example std::rotate, that performs a cyclic permutation of a collection, or even std::sort, are typical examples.

Algorithms that take an iterator

What’s more interesting is that it doesn’t even work for some algorithms that do not modify the collection, if they also take an iterator. One example in the STL is std::rotate_copy, but there could be more if you come to expand the STL algorithms.

std::rotate_copy is like std::rotate, except it doesn’t do the cyclic permutation in-place. It leaves the input collection untouched and produces its results via an output iterator (not familiar with all STL algorithms yet? Check out the World Map of STL Algorithms!)

For instance, consider the following example:

auto numbers = std::vector<int>{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
auto rotatedNumbers = std::vector<int>{};

std::rotate_copy(begin(numbers), begin(numbers) + 3, end(numbers), back_inserter(rotatedNumbers));

After executing the above code, rotatedNumbers contains {3, 4, 5, 6, 7, 8, 9, 0, 1, 2}.

std::rotate_copy takes 4 parameters:

  • the beginning of the input collection,
  • the position of the element that should end up in the first position after the cyclic permutation,
  • the end of the input collection,
  • the output iterator.

The first and third parameters are superfluous because they indicate the beginning and end of the input collection. Like with the other algorithms, we could create an overload that takes the input collection directly. It would be used like this:

auto numbers = std::vector<int>{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
auto rotatedNumbers = std::vector<int>{};

ranges::rotate_copy(numbers, begin(numbers) + 3, back_inserter(rotatedNumbers));

But if we take the input collection by reference to const:

template<typename Range, typename Iterator, typename OutputIterator>
OutputIterator rotate_copy(Range const& range, Iterator new_first, OutputIterator out)
{
    return std::rotate_copy(begin(range), new_first, end(range), out);
}

the above code does not compile. We get the following error message:

main.cpp: In instantiation of 'OutputIterator ranges::rotate_copy(const Range&, Iterator, OutputIterator) [with Range = std::vector<int>; Iterator = __gnu_cxx::__normal_iterator<const int*, std::vector<int> >; OutputIterator = std::back_insert_iterator<std::vector<int> >]':
main.cpp:29:79:   required from here
main.cpp:14:54: error: no matching function for call to 'forward<std::vector<int, std::allocator<int> > >(const std::vector<int>&)'

Why is that?

Since numbers is not a const collection, begin(numbers), and therefore begin(numbers) + 3 are of type std::vector<int>::iterator and not std::vector<int>::const_iterator. As a result, in the template instantiation of our rotate_copy, the type of Iterator is deduced as  std::vector<int>::iterator.

On the other hand, since range is of type std::vector<int> const with our explicit const in the interface, begin(range) is of type std::vector<int>::const_iterator.

And std::rotate_copy expects all of its iterator parameters to be of the same type (there is no implicit conversion in the context of template type deduction). Hence the compile error.

Boost has a way to work around that, which we will explore in a future post.

So in summary, passing by const& has the advantage of ensuring that the algorithms that are not supposed to modify collections behave accordingly, and has the drawback that it doesn’t apply to all algorithms, and for rotate_copy it requires extra machinery in the interface.

What to do then?

Should we use const& for the algorithms where we can, such as copy_if and all the others?

One way to see that is that the interfaces of all algorithms should be consistent, so if we can’t use const& for all algorithms, then maybe we shouldn’t use it for any of them.

Yet another way to see this would be to question the idea of putting const in the interface of range algorithms. Indeed, the initial goal of ranges algorithms was to add a layer of abstraction over STL algorithms, and not to change the meaning of their interface by adding consts.

What’s your opinion on this? Should we use && or const& to algorithms that should not modify the values inside the range? Please express what you think about this in the comments section below.

EDIT: as Reddit user tcanens pointed out and as was confirmed by Eric Niebler, using forwarding references is a superior solution. And this is the choice made in range-v3. Indeed, to quote tcanens, const references have two issues:

  • just because T models Range doesn’t mean const T does. In particular, things like filter_view caches begin() to ensure amortized O(1) complexity, so it can’t provide a begin() const without undue overhead.

  • Ranges are not necessarily deep const; most views aren’t. Thus, const Range& offers but an illusion of safety.

I’m very grateful to them for these observations.

You will also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed