Monday, November 5, 2012

Rethinking std::binary_function.

Update: I have expanded on this concept: http://yapb-soc.blogspot.com/2012/11/generic-function-objects-or.html

Before the word "lambda" could be found in almost any C++ coder's vocabulary, the standardizing committee had stuffed <functional> with a bunch of operator functions (plus, minus, equal_to, etc), and they were very helpful. But everything else from C++03 in that file became deprecated in C++11. Things like bind1st, pointer_to_unary_function, mem_fun (not mem_fn), etc.. Even std::unary_function and std::binary_function are now deprecated.

They used to be helpful because they defined typedefs like result_type and argument_type. When you wrote a function object, you could inherit from unary_ or binary_function and get these typedefs for free. Now that we have std::result_of and decltype to do the same thing so these aren't helpful any more; but we can also do more with functions than in the past. We can partially apply them without knowing what arguments or return types the functions will give until we call them. We can also create variadic versions. We can combine, compose, and build them from other functions.

The core idea of having your function objects inherit from a base type that offers extended functionality isn't a bad idea.


Rethinking std::binary_function.

What can we do with a binary function, f(x,y)? We can partially apply them (g(y) = f(x,y)) or chain them arbitrarily (g(x,y,z) = f(f(x,y),z)). So let's do that! To implement it, I'll use the curiously recurring template pattern.

template< class Derr > struct Binary {
    // One argument: curry.
    template< class X >
    constexpr auto operator () ( X x ) -> PartialApplication<Derr,X> {
        return closet( Derr(), std::move(x) );
    }

    template< class X, class Y >
    using Result = typename std::result_of< Derr(X,Y) >::type;

    // Three arguments: unroll.
    template< class X, class Y, class Z >
    constexpr auto operator () ( X&& x, Y&& y, Z&& z ) 
        -> Result<Result<X,Y>,Z> 
    {
        return Derr()( 
            Derr()( std::forward<X>(x), std::forward<Y>(y) ), 
            std::forward<Z>(z)
        );
    }

    template< class X, class Y, class ...Z >
    using Unroll = typename std::result_of <
        Binary<Derr>( Result<X,Y>, Z... )
    >::type;

    // Any more? recurse.
    template< class X, class Y, class Z, class H, class ...J >
    constexpr auto operator () ( X&& x, Y&& y, Z&& z, H&& h, J&& ...j ) 
        -> Unroll<X,Y,Z,H,J...> 
    {
        // Notice how (*this) always gets applied at LEAST three arguments.
        return (*this)( 
            Derr()( std::forward<X>(x), std::forward<Y>(y) ), 
            std::forward<Z>(z), std::forward<H>(h), std::forward<J>(j)... 
        );
    }
};

constexpr struct Add : public Binary<Add> {
    using Binary::operator();

    template< class X, class Y >
    constexpr auto operator () ( X&& x, Y&& y ) 
        -> decltype( std::declval<X>() + std::declval<Y>() )
    {
        return std::forward<X>(x) + std::forward<Y>(y);
    }
} add{};

Note that Add must include the line using Binary::operator() for it to inherit Binary's overloads.

Now, we can create a partially applied addition

    constexpr auto plusTwo = add(2);
    constexpr auto five = plusTwo(3);

And we can string arbitrarily long additions in one function.

    constexpr int sum = add(1,2,3,4,5); 
    std::string helloWorld = add( std::string("Hello"), ' ', "world!" );

Note how for helloWorld's add, the first argument is a string, the second a char, the third a char*. The most sensible way to do this with the standard plus would be to instantiate it as an std::plus<std::string>, which would require converting the char and char* to a string before calling. Luckily, in the future, we may be able to use the syntax std::plus<> (N3421), and its behaviour would allow for mixed-type arguments, but not currying or chaining.

Moving on, we can define Subtract, Multiply, and all of the other binary <functional> types in this way. It might not be useful for all functions. For example, if you wrote a function, find(x,xs), taking a value and a sequence and returning an iterator, then it might not be helpful to chain. find(x,xs,ys) would check ys for the iterator of the item found in xs, which is probably not what we want. But find(x) would return a function that looks for x in any container and that could easily be desirable behaviour.


A note on efficiency.

I am a big fan of constexpr. There is no optimization greater than skipping the whole computation and inlining the result. So when I checked out the assembly generated for the line,

    std::cout << add(1,2,3,4,5) << std::endl;

imagine my surprise when I found out GCC had neither computed the result at compile time nor inlined the call to add! However, when I made the following change:

    constexpr int x = add(1,2,3,4,5);
    std::cout << x << std::endl;

it basically converted it to

    std::cout << 15 << std::endl;

This is very exciting technology, but perhaps too new to work perfectly.  I expect GCC, and compilers in general, to improve over time, but for the moment, they can be fickle. Subtle changes in the code can cause drastic changes in the assembly.


Could we go further?

There are other operations that might make sense on binary function; their arguments can be swapped (or flipped), one could apply only the second argument, and we could find interesting ways of composing them. As always, the limit is our imagination. I think this is a good start. Features that extend beyond overloading () require the calling code to "know" that its function object is a Binary, and so it's less generally useful. Why write a member named Binary::flip() when one could write a function called flip that worked on function pointers, too?

Are there operations on unary functions that might make sense? What about ternary functions?

The lesson here isn't so much that one thing should be a feature over another, but when defining function objects, do not forget about using inheritance to limit code duplication!


The source: https://gist.github.com/4024658
(Oops, I forgot to post it this time!)