Vorner's random stuff

Personal collection of Rust hacks

Rust is a great language. However, I have the (probably bad) habit of pushing everything I get my hands on to its very limits. It’s like an ideal gas ‒ it expands until it hits some walls. Not that I would try on purpose, it just somehow happens. It’s no surprise I’ve hit things that looked impossible or weren’t ready yet even in Rust… but in many cases I’ve beaten on the problem long enough that it gave up and I came up with an ugly hack to work around it.

I’ve decided to share the ones I can remember off the top of my head in case someone gets into a similar situation. None of them requires enabling any nightly features, but in one case the current stable compiler is unable to crunch through it yet (beta can; it was probably some bug that got fixed).

Also note I’m typing this without compiling the examples, so you may need to fix typos or add some type hints ‒ but the idea itself should hold.

Poor Man’s try block

While I haven’t paid attention to the discussion about the final syntax, a try block is being planned. That’ll allow limiting an early return by a question mark to the scope of a block instead of the whole function, something like this:

fn load_cache(path: impl AsRef<Path>) -> Result<Cache, Box<dyn Error + Send + Sync>> {
    let data = try {
        // Question marks here exit only the try block, not the function
        let mut f = File::open(path)?;
        let mut bytes = Vec::new();
        f.read_to_end(&mut bytes)?;
        bytes
    };
    let cache = match data {
        // But this question mark exits the function
        Ok(data) => Cache::from_data(data)?,
        Err(_) => {
            warn!("Couldn't read cache, using an empty one instead");
            Cache::default()
        }
    };
    Ok(cache)
}

The try block is a matter of convenience. It doesn’t bring anything new that we couldn’t do without it, but grouping some error sites together and handling them in a bunch is handy, and doing it with .and_then or other similar methods often feels awkward.

But the try block is not there yet. So, let’s roll our own:

use std::error::Error;
use std::fs::File;
use std::io::Read;
use std::path::Path;

macro_rules! pmtb {
    ($($body:tt)*) => {
        (|| -> Result<_, Box<dyn Error + Send + Sync>> {
            Ok({
                $($body)*
            })
        })()
    }
}

fn load_cache(path: impl AsRef<Path>) -> Result<Cache, Box<dyn Error + Send + Sync>> {
    let data = pmtb! {
        // Question marks here exit only the try block, not the function
        let mut f = File::open(path)?;
        let mut bytes = Vec::new();
        f.read_to_end(&mut bytes)?;
        bytes
    };
    unimplemented!("The rest of the stuff...");
}

What does that do? The basic idea is to create a closure (which works as the barrier for ? and for whatever return happens to be inside) and execute it right away. AFAIK this differs from the proposed try block in two ways. It doesn’t work for all Try types, only for results (fixing that is left as an exercise for the reader O:-)). And a return inside a real try block would exit the outer function, not only the block (that probably can’t be made to work, but who cares about that?).

This variant is with „OK wrapping“, one without would be trivial to construct too.
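For illustration, here is a minimal sketch of the variant without the wrapping. The names pmtb_raw and parse_port are made up for this example, and so is the fallback port; the only real difference from the macro above is that the body has to produce the Result itself.

```rust
use std::error::Error;

// Hypothetical name: the same closure trick as `pmtb!`, but without the
// automatic `Ok(...)` around the body.
macro_rules! pmtb_raw {
    ($($body:tt)*) => {
        (|| -> Result<_, Box<dyn Error + Send + Sync>> {
            $($body)*
        })()
    };
}

// Hypothetical helper: parses a port number, falling back to 8080 (an
// arbitrary default picked for the example) on any failure.
fn parse_port(text: &str) -> u16 {
    let parsed = pmtb_raw! {
        // This question mark leaves only the closure.
        let port: u16 = text.trim().parse()?;
        if port == 0 {
            // So does this `return`, because we are inside a closure.
            return Err("port 0 is not usable".into());
        }
        // No wrapping, so the body spells out the `Ok` itself.
        Ok(port)
    };
    // Any error from the block ends up here, still inside `parse_port`.
    parsed.unwrap_or(8080)
}
```

Note that parse_port does not return a Result at all, so a bare question mark in its body would not even compile; the closure is what makes it legal.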

By the way, I’ve been told this is used in JavaScript and actually has a name ‒ but I forgot the name.
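Coming back to the exercise left for the reader above: covering all Try types is more than I would attempt here, but an Option flavour of the same trick is easy to sketch. The names pmtb_opt and sum_first_two_or_zero are invented for the example.

```rust
// Hypothetical Option-flavoured sibling of `pmtb!`.
macro_rules! pmtb_opt {
    ($($body:tt)*) => {
        (|| -> Option<_> {
            Some({
                $($body)*
            })
        })()
    };
}

// Hypothetical helper: adds the first two items of a slice and reports 0
// when an item is missing or the addition overflows.
fn sum_first_two_or_zero(values: &[u32]) -> u32 {
    let sum = pmtb_opt! {
        // Each question mark leaves only the closure, with `None`.
        let first = values.first()?;
        let second = values.get(1)?;
        first.checked_add(*second)?
    };
    // A `None` from the block lands here instead of leaving the function.
    sum.unwrap_or(0)
}
```

A version generic over both Result and Option would need the unstable Try trait, which is why this sketch settles for a second macro.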

Poor Man’s FnBox

We have the nice Box<dyn Fn()> and Box<dyn FnMut()>. You can create a Box<dyn FnOnce()> too, but there’s no way to call it (yet), because the call consumes the FnOnce. For that, Rust needs to move it onto the stack, and for that it needs to know how large it is ‒ which is precisely the thing we don’t want to care about when boxing trait objects. This is not a problem of FnOnce only; it applies to any method that consumes self, and a similar technique would work there too.

The standard library has a workaround called FnBox, but it is unstable. Eventually, Box<FnOnce()> will start working (out of the box?). Until then, let’s do it this way:

fn set_cback<F: FnOnce() + 'static>(&mut self, f: F) {
    let mut f = Some(f);
    // self.cback is Box<dyn FnMut()>, which is why F has to be 'static
    self.cback = Box::new(move || {
        (f.take().expect("FnOnce called more than once!"))()
    });
}

The trick here is that we wrap the FnOnce into a new FnMut. As we create a new FnMut for each F that is passed to us, this FnMut does know the proper size and can call the FnOnce. And boxing FnMut works. Needless to say, if you try to call it more than once, this FnMut will panic.

If you worry about the panic, you can wrap this into a type that hides these details ‒ it holds the Box<dyn FnMut()> privately and its .call method consumes the type, which brings the compile-time check back. But it’s a bit sad this type can’t be called directly with ().
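A minimal sketch of such a wrapper could look like this. The names OnceCallback and run_once_demo are made up for the example, and the callback takes no arguments and returns nothing, just to keep it short.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical wrapper type: the boxed FnMut stays private.
pub struct OnceCallback(Box<dyn FnMut()>);

impl OnceCallback {
    pub fn new<F: FnOnce() + 'static>(f: F) -> Self {
        let mut f = Some(f);
        OnceCallback(Box::new(move || {
            // Cannot trigger through the public API, because `call`
            // consumes the wrapper.
            (f.take().expect("FnOnce called more than once!"))()
        }))
    }

    // Takes `self` by value, so a second call is a compile error
    // instead of a panic.
    pub fn call(mut self) {
        (self.0)()
    }
}

// Hypothetical demo: returns how many times the callback has run.
fn run_once_demo() -> u32 {
    let hits = Rc::new(Cell::new(0));
    let counter = Rc::clone(&hits);
    let callback = OnceCallback::new(move || counter.set(counter.get() + 1));
    callback.call();
    // callback.call(); // would not compile: `callback` was moved above
    hits.get()
}
```

The expect is still there, but it is now unreachable from the outside, which is about as good as it gets without the real thing.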

Anonymous Associated Types

In Rust, some types are so very evil and we can’t even type their name. There are some places that let you talk about such types, as generic parameters or impl Trait. But other places don’t ‒ specifically, associated types in traits. There’s this existential type thing coming… but it’s not here yet.

Often, one needs to put such a type there because the trait somehow produces that type. And there’s one thing in Rust with superpowers that actually has the ability to have an anonymous associated type ‒ the Fn (and friends). We are going to abuse that. How? Through blanket implementations, like this:

use std::fmt::Debug;

pub trait ProduceStuff {
    type Product: Debug;
    fn produce_stuff(&self, how_much: usize) -> Self::Product;
}

struct ProduceFn<F>(F);

impl<F, P> ProduceStuff for ProduceFn<F>
where
    F: Fn(usize) -> P, // Here is where we steal the associated type
    P: Debug,
{
    type Product = P;
    fn produce_stuff(&self, how_much: usize) -> P {
        (self.0)(how_much)
    }
}

pub fn make_producer() -> impl ProduceStuff {
    ProduceFn(|how_much| unimplemented!())
}

You can get this very far… probably as far as your stomach for trait bound length and ugly error messages can keep up.
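To see that the product really can be a type nobody is able to spell out, here is a self-contained sketch that repeats the trait and the wrapper from above and plugs in a closure returning an iterator adaptor. The functions make_squares_producer and describe are invented for the example.

```rust
use std::fmt::Debug;

pub trait ProduceStuff {
    type Product: Debug;
    fn produce_stuff(&self, how_much: usize) -> Self::Product;
}

struct ProduceFn<F>(F);

impl<F, P> ProduceStuff for ProduceFn<F>
where
    F: Fn(usize) -> P, // The associated type is stolen from the closure
    P: Debug,
{
    type Product = P;
    fn produce_stuff(&self, how_much: usize) -> P {
        (self.0)(how_much)
    }
}

// Hypothetical producer: its Product is a `Map` over a range with a
// closure inside, so the full type has no name we could write down.
pub fn make_squares_producer() -> impl ProduceStuff {
    ProduceFn(|how_much: usize| (0..how_much).map(|n| n * n))
}

// Hypothetical consumer: all it knows about the product is that it is
// `Debug`, so that is all it uses.
pub fn describe(producer: &impl ProduceStuff, how_much: usize) -> String {
    format!("{:?}", producer.produce_stuff(how_much))
}
```

The caller of make_squares_producer never learns the concrete Product, only the bounds declared on the associated type, so whatever you want to do with it has to go into those bounds.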

HRTB and Associated Types

Now, this one is the ugly abomination on which rustc’s brain in stable explodes.

Let’s say we have some lifetime-parametric traits, like this:

trait Extractor<'a, Source: 'a> {
    type Fragment: 'a;
    fn extract(&self, source: &'a Source) -> Self::Fragment;
}

trait Cruncher<Input> {
    type Report;
    fn crunch(&self, fragment: Input) -> Self::Report;
}

We would like to feed the Extractor some big data type, get a smaller one out and crunch the data. Of course, we would like to borrow the data on the way, so Fragment will be something containing a reference or references ‒ but maybe not just &Vec<_>, maybe something like (&PartA, &PartB) (so we can’t just type -> &Self::Fragment). We want the Report at the end to be an owned type (i.e. 'static, no references there).
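To make the two traits less abstract, here is a sketch of one concrete pair of implementations, wired together by hand without any generic pipeline. The types Words and Count and the function report are invented for the example; the trait definitions are the ones from above.

```rust
trait Extractor<'a, Source: 'a> {
    type Fragment: 'a;
    fn extract(&self, source: &'a Source) -> Self::Fragment;
}

trait Cruncher<Input> {
    type Report;
    fn crunch(&self, fragment: Input) -> Self::Report;
}

// Hypothetical extractor: borrows the words out of a String.
struct Words;

impl<'a> Extractor<'a, String> for Words {
    // The fragment borrows from the source, so it depends on 'a.
    type Fragment = Vec<&'a str>;
    fn extract(&self, source: &'a String) -> Vec<&'a str> {
        source.split_whitespace().collect()
    }
}

// Hypothetical cruncher: counts whatever fragments it is given.
struct Count;

impl<'a> Cruncher<Vec<&'a str>> for Count {
    // The report is owned and the same for every 'a.
    type Report = usize;
    fn crunch(&self, fragment: Vec<&'a str>) -> usize {
        fragment.len()
    }
}

// With concrete types the compiler picks the one lifetime it needs.
fn report(text: &String) -> usize {
    Count.crunch(Words.extract(text))
}
```

Everything is easy as long as the types are concrete; the trouble only starts once the extractor and the cruncher become generic parameters.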

And now, we would like to build a pipeline ‒ something that every time it is fed with the big data, it produces a report. This, obviously, needs to work with all possible lifetimes of the data we feed it, so it calls for the HRTB thing.

where
    E: for<'a> Extractor<'a, Source>,
    C: for<'a> Cruncher<<E as Extractor<'a, Source>>::Fragment>,

This has one downside, though: we just asked for an infinite number of Extractor and Cruncher trait bounds. Each one has a (potentially) different associated type ‒ Extractor<'a, T>::Fragment is some other type than Extractor<'b, T>::Fragment. So, what is the Report type we want to eventually return? Well, because these types on the way differ only in lifetimes and the Report is owned, then there can be just one that is common for all the traits. We know that. But Rust doesn’t. So it complains that it can’t pick one of these infinitely many types if we ask for it.

There’s a way out. First, we need to get our hands on an arbitrary one of these infinitely many (but, as we know, equivalent) types. We do that by plugging a concrete lifetime into the type expressions above ‒ and there’s only one concrete lifetime we have, that’s 'static. So, the return type will be:

-> <C as Cruncher<<E as Extractor<'static, Source>>::Fragment>>::Report

However, if we try to write the body of such a function, we’ll be using a different lifetime and the resulting Report might (in theory) be a different type. So, how do we tell the compiler to accept only types where the Report happens to be the same for all the other lifetimes? Well, just by translating this sentence to Rust. Literally:

where
    ...
    C: for<'a> Cruncher<
        <E as Extractor<'a, Source>>::Fragment,
        // Here we add infinitely many type-equality constraints on associated
        // types of similar-but-not-the-same traits.
        Report = <C as Cruncher<<E as Extractor<'static, Source>>::Fragment>>::Report
    >,

I’m not sure I should be doing this, though. The stable compiler is able to resolve this only if the Fragment is also an owned type (it fully works with beta and newer). I’ve seen two different compiler crashes (both are reported, one is already fixed) on the way when debugging it. When something doesn’t line up in the types (well, my actual use case has a few extra trait bounds), it complains that a trait bound is not satisfied, but the message lacks the additional information needed to actually track down what doesn’t line up. I work around that by providing a companion function that does nothing at all and has a smaller subset of the bounds, so one can ask for a better error message.

Are there other hacks?

I’m pretty sure there are and that people have their favorites. I’ll be happy to see them shared too, both because one might need them and because they are interesting. Or as horror stories for the campfire.