nesting allocators
— 2023-08-09

I regularly point people to Tmandry's 2021 post "Contexts and Capabilities in Rust". While nobody is actively working on the design at the moment, it's more because of prioritization than anything else. It seems like it could make certain usage patterns a lot nicer in Rust, and one of those is probably working with custom allocators. I've been working a lot with slab allocators recently, and I'd love it if they were easier to work with in Rust. So I wanted to take a moment to talk about allocators, capabilities as a language feature, and why I believe that would work well.

Kinds of allocation

Memory, at least semantically, is best thought of as a contiguous sequence of bytes. Allocators manage these bytes and do the bookkeeping for who is using what. Allocators can take any type of object ("heterogenous types") and place them in memory. And once we're done with an object, the allocator will reclaim the memory. Because not all objects are the same size this can leave gaps in memory, so occasionally the allocator spends some time reorganizing its internals (think: defragmentation). In practical terms, every program wants to have a single allocator to manage the entire program. This root allocator is what we refer to as the global allocator.

But where global allocators are great, they are not perfect. Typically when we write programs we'll see usage patterns in our main loop. For example: if we're writing an HTTP server we will typically allocate data for the request headers and body. And once that request is handled, we de-allocate it and allocate a new request. We can use our knowledge of our usage patterns to handle allocations in a more targeted way. The way we do this is by obtaining a relatively large-ish chunk of memory from our system allocator, and then using that as the heap space for our custom allocator. These custom allocators are also known as "region-based allocators" or "arena allocators". Their key benefit is that their memory management strategies are very simple, which can make allocations so cheap they might as well be free.

There are many different types of arena allocators, but these are the ones I use the most:

slab allocators: For example the slab crate in Rust. This is ideal for when you perform many homogenous, short-lived allocations over a longer period.
bump allocators: For example the bumpalo crate in Rust. This is ideal for phase-oriented allocations, where you allocate a number of heterogenous objects, say, on every iteration of a loop and then de-allocate them at the end of it.

While not every crate lends itself equally well to it, it's even possible to combine arena allocators to further customize allocation behavior.

Nesting Allocators

The bumpalo crate comes with an optional collections feature which can be enabled to grant access to String and Vec types which are identical to the stdlib's versions other than the fact that they allocate on the bump allocator. Using it looks like this:

use bumpalo::{Bump, collections::Vec};

let b = Bump::new();                 // Create a new bump allocator
let v: Vec<i32> = Vec::new_in(&b);   // Allocate on the bump allocator

This is an example of nesting allocators. The system allocator hosts the bump allocator, and the bump allocator in turn hosts the vector. The main downside of this is that we're not using std::vec::Vec, but have to use bumpalo's custom vec implementation. That's not ideal, not just because we now have a second Vec type, but because we also don't have other collection types such as HashMap, HashSet, and every other type out there on crates.io.

Passing Allocators

The plan is to solve this limitation via the nightly allocator_api feature in the stdlib. This extends the stdlib with methods such as Vec::new_in which can take a custom Allocator argument to host the allocations in. I don't have any particular opinions about the exact Allocator trait, but I do have opinions about the way this bubbles up to collection types.

Roughly speaking, the current design adds a new allocator generic param to collection types which defaults to the global allocator (Vec<T, A = GlobalAllocator>). In order to uniformly make use of this, every constructor method for every collection type needs to provide new _in variants of their existing collections. At first that seems reasonable; but it quickly runs into issues.

The first issue is usage with macros, which we can probably overcome. Macros like vec! perform allocations, and in order to work with custom allocators they should probably take an optional third param for the allocator. The second issue is a lot harder: traits can't take extra arguments. For example, if you have an &str, and you want to convert it to String but use a bump allocator you can't keep using the existing into method. And because the into trait is part of core, we can't just add a new into_in method to the Into trait either. The only real option is probably a new trait in alloc:

let s = "hello".to_owned();         // 1. Uses the global allocator
let bump = Bump::new();
let s = "hello".to_owned_in(&bump); // 2. Uses a bump allocator

let s: String = "hello".into();     // 1. Always uses the global allocator
                                    // 2. ❌ Unclear how to create a variant
                                    //    of this with a custom allocator

While the number of collection types and constructor methods are limited, things can quickly spiral out of hand when we factor in composition. In order for things to work uniformly across the ecosystem, every type which holds a Vec internally would need a second _in variant for all of their constructors and conversions. That kind of duplication is uncompelling, meaning it's likely we're only going to see some types adopt support for custom allocation while the majority doesn't. Not unlike how #[no_std] is not uniformly supported throughout the crates ecosystem today.

Scoped Allocators

What we're really struggling with is that the global allocator acts, in a capability-sense, as an ambient authority. Which is to say: every function in Rust has access to the global allocator without functions needing to take allocators explicitly as arguments. The allocator is just always present. Except when we're in #[no_std] environments, where there is no allocator. Or when we're trying to change which allocator we're trying to take.

The _in family of methods is an attempt to remedy the issue of having an ambient allocator. But that has some drawbacks: most notably the bifurcation of APIs. But also: it seems like that would be quite annoying to use. Manually having to pass some form of alloc argument down to every function call doesn't exactly seem ideal.

Instead Tmandry's capabilities design seems like a much more pleasant direction. It acknowledges the existence of ambient capabilities, and instead just asks that we express them at the signature level. Because they effectively function as implicits, they just carry through to all argument in scope which need them:

let s = "hello".to_owned();             // 1. Uses the global allocator
with alloc = Bump::new() {              
    let s = "hello".to_owned_in(&bump); // 2. Uses a bump allocator
}

let s: String = "hello".into();  // 1. Always uses the global allocator
with alloc = Bump::new() {
    let s = "hello".into;        // 2. ✅ Uses a bump allocator       
}

Compositions with functions and types would no longer require manually passing allocators into arguments, but merely acknowledging the existence of an allocator in the first place. That would mean we don't have to provide new _in methods which take allocators, but instead we could keep using the existing new methods with some minor modifications.

// Vec's design on nightly today, providing both `new` and `new_in` methods
impl Vec<T, A = Global> {
    fn new() -> Self { ...  }
}
impl Vec<T, A> {
    fn new_in(alloc: A) -> Self { ...  }
}

// A design of vec using capabilities, using just a single `new` method.
impl Vec<T>
with alloc = Global,
{
    fn new() -> Self {}
}

Obviously questions remain about how to exactly model capabilities. Many types in the stdlib liberally make use of ambient capabilities (e.g. allocation, filesystem, networking, etc.) which we'd need to continue allowing to work as it does today. But there is a question about how we can instead weave through user-provided capabilities through the entire stdlib and crates.io ecosystem. The answer requires having a way to continue relying on ambient capabilities, but opt-out if you want to. Which is not entirely unlike const.

The final benefit to this is that this would solve one of the current blockers we have for string-literals. The obvious design for that would be to allow the s prefix on literals to create allocated strings. But because that would act as a language built-in, there is an open question on how to make that work with custom allocators. Using capabilities would provide a compelling solution for that:

// Rust today
let s = String::from("Chashu");

// With an `s` prefix, allocating on the global allocator
let s = s"Chashu";

// With an `s` prefix, allocating on a local allocator
with alloc = Bump::new() {
    let s = s"Chashu";
}

Conclusion

In this post we took a look at general allocation strategies, how allocations can be nested today, which problems that approach has, and how we could simplify that via the novel contexts/capabilities language-feature. The design of contexts/capabilities is incomplete, so we intentionally didn't go too deep into that. Instead we only touched on the top-level semantics, showing specifically how it could be leveraged to overcome some of the problems we have with allocation today.

Fundamentally capabilities don't provide anything new; they're just a fancy way of passing arguments to functions. But I also believe that is their strength. It is just passing arguments to functions. But with significantly less code to write, no method duplication, the ability to play along with trait impls, and a lot more clarity of what your function needs to function.

I don't just think that capabilities could simplify the problem space for allocators in Rust. I actually believe that it could be the canonical way in which we solve the current core/alloc/std split in Rust. The core library really is just std minus everything which relies on ambient capabilities. If we could opt-out of relying on ambient capabilities in std, the reason to have separate core, alloc, and std libraries should go away.

The other thought I want to share is that I believe capabilities such as allocators are inherently tree-structured. I've recently written about tree-structured concurrency explaining how concurrency is best thought of as a tree. But from the operational semantics side of the language we now tree-borrows as well. I feel like at a high-level thinking about program execution as a tree is accurate, and we should be thinking of (nesting) capabilities such as allocators as trees as well.