Jan 12, 2019 • Rust

Rust 2019: Solid Foundations for Unsafe Code

It is that time of the year again when our feed readers fill up with people’s visions for what Rust should become in the next year. Coming up with a vision is not exactly my strong suit, but it is probably something I should learn, so here’s my episode of the “Rust 2019” blog post series.

I think in 2019, we should make a coordinated effort to improve the foundations for writing unsafe code. Of course my particular Rust bubble is mostly about unsafe code, as you will know if you have read some of my previous posts – but safety is a core value proposition of Rust, and the entire ecosystem rests on a foundation of crates that make heavy use of unsafe code, so I believe that caring about unsafe code is an important piece of Rust’s overall success.

Foundations for unsafe code

What do I mean by “improving the foundations” of unsafe code? There are many questions that frequently come up when writing unsafe code – questions about uninitialized memory, about unions, about references and aliasing, and about what the compiler is allowed to assume.

I am sure there are more; these are just some of the questions that regularly show up in my inbox because someone @mentions me. What all of these questions have in common is that we don’t know the full answer, and even the parts of the answer that we do know are hard to figure out for someone who’s new to this.

So a key ingredient to providing better foundations for writing unsafe code is to start answering some of these questions, and to figure out which other questions need answering. Any attempt to answer one of these questions precisely runs into the same problem: we do not have a proper specification of what a Rust program does when it is executed – the semantics of Rust are basically defined by “whatever the generated LLVM IR does”, and that’s not a very solid foundation at all. As a consequence, we often even lack the terminology to make precise statements about what unsafe code can and cannot do. What we need is a specification of at least a large enough fragment of Rust to serve as the framework in which these other discussions can take place.

But that’s just the beginning: we can’t just dump a bunch of rules onto programmers and let them deal with the rest. We need to help them write code that follows the rules. To this end, we need more documentation and teaching material – things like the Rustonomicon.

But I strongly believe that we can’t leave it at that. We should also provide tooling that helps programmers check their code for rule violations. I believe that this is extremely important: it does not just help programmers increase confidence in their unsafe code, it also helps them learn what the rules even are. We all learn by making mistakes, but that only works if you know that you made a mistake – and with unsafe code, that’s very hard to figure out. And, last but not least, creating such tooling will uncover ambiguities in the rules and raise new questions about corner cases that we forgot to consider, and thus help flesh out the rules themselves.

Where are we at?

Some of the things I described above are already happening, but things are moving slowly and more hands are always needed!

The unsafe code guidelines effort (UCG) is slowly but steadily chipping away at the kind of questions I raised, determining whether we can reach consensus among the small group of people actively participating in these discussions. This is an interesting mix of descriptive and normative work: we are trying to capture patterns that existing unsafe code already follows and cast them into rules that future code will have to follow. Of course, anything we do still has to go through the RFC process before it becomes normative, and I am sure new interesting questions will come up then. Right now, we are discussing “validity invariants”, which describe the assumptions the compiler is allowed to make about the contents of a variable based on its type. This discussion just started, so now is a perfect time to join the effort!
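To give one concrete taste of what a validity invariant is (my own illustration, not official UCG output): the compiler may assume that a `bool` only ever holds the bit patterns 0 or 1, so conjuring any other pattern at type `bool` is undefined behavior.

```rust
use std::mem;

fn main() {
    // 1u8 is a valid bit pattern for `bool`, so this transmute is fine
    // (if dubious style):
    let b: bool = unsafe { mem::transmute(1u8) };
    assert!(b);

    // 3u8 is NOT a valid `bool`; the line below would be undefined
    // behavior even if `bad` were never branched on:
    // let bad: bool = unsafe { mem::transmute(3u8) };
}
```

Whether such a value is invalid the moment it exists, or only when it is used, is exactly the kind of question the UCG is debating.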

A new approach for handling uninitialized data, MaybeUninit, is available in nightly, replacing the old mem::uninitialized, for which it turned out to be impossible to design reasonable rules. At this point, this is mostly blocked on bikeshedding the API and figuring out whether we want to allow or forbid unsafe code to create references to uninitialized data (without reading from them) – the latter being one of the most interesting open questions around “validity invariants” that the UCG is discussing.
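With the shape the API eventually stabilized into, the pattern looks roughly like this – a sketch, using the post-bikeshed method names rather than the ones available at the time:

```rust
use std::mem::MaybeUninit;

// Reserve space for a u32 without initializing it, write before any
// read, and only then assert that it is initialized.
fn make_initialized() -> u32 {
    let mut x: MaybeUninit<u32> = MaybeUninit::uninit();
    // Writing is safe and creates no reference to uninitialized data.
    x.write(42);
    // SAFETY: `x` was fully initialized by the write above.
    unsafe { x.assume_init() }
}

fn main() {
    assert_eq!(make_initialized(), 42);
}
```

The key difference to `mem::uninitialized` is that the uninitialized bytes never exist at type `u32` – they only live inside the `MaybeUninit` wrapper until `assume_init` is called.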

There is an accepted RFC for resolving the open questions around the interaction of union and Drop, and a work-in-progress partial implementation of that RFC.
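As a rough sketch of the rule that RFC settles on: union fields with drop glue must be wrapped in `ManuallyDrop`, so the union never implicitly runs a destructor and the programmer states explicitly when (if ever) a field gets dropped.

```rust
use std::mem::ManuallyDrop;

// `String` has drop glue, so it must be wrapped in ManuallyDrop here.
union Slot {
    text: ManuallyDrop<String>,
    num: u32,
}

fn main() {
    let mut s = Slot { num: 7 };
    // SAFETY: reading the field that was written last.
    assert_eq!(unsafe { s.num }, 7);

    // Overwriting with the other variant is safe (no drop glue runs)...
    s.text = ManuallyDrop::new(String::from("hello"));
    // ...but now we are responsible for dropping the String ourselves.
    // SAFETY: `text` is the active field and is dropped exactly once.
    unsafe { ManuallyDrop::drop(&mut s.text) };
}
```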

There is an RFC under discussion for resolving the situation around references to fields of packed structs, which also helps with other questions around references in unsafe code. This RFC could use some pushing over the finish line, and then it needs implementing.
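The problem, in a nutshell: taking `&` to a field of a `#[repr(packed)]` struct can create an unaligned reference, which is never allowed. The solution this line of work eventually produced lets you go from a place to a raw pointer without ever creating a reference – a sketch using the later-stabilized `ptr::addr_of!` form:

```rust
use std::ptr;

#[repr(C, packed)]
struct Header {
    tag: u8,
    len: u32, // at offset 1: not aligned for u32
}

fn main() {
    let h = Header { tag: 1, len: 300 };
    // `&h.len` would be an unaligned reference -- instead, take a raw
    // pointer to the place...
    let p = ptr::addr_of!(h.len);
    // ...and do an explicitly unaligned read through it.
    // SAFETY: `p` points to a valid, initialized u32.
    let len = unsafe { p.read_unaligned() };
    assert_eq!(len, 300);
}
```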

In terms of tooling, there is Miri (also available on the Rust Playground), which could use quite some love: better error messages, tracking more information during execution for better diagnostics, making it run faster, and supporting more things that test suites often do (like panics, or seeding RNGs from the OS).
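For crates whose tests do not hit Miri's current limitations, trying it out is already a two-command affair (assuming a recent nightly that ships the component):

```shell
# Miri only works on nightly; add the component...
rustup +nightly component add miri
# ...and run the crate's test suite through the interpreter
# instead of compiling to native code.
cargo +nightly miri test
```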

Rust also, in principle, supports some of the LLVM sanitizers, namely asan, tsan, msan and lsan. However, using them still seems to run into some issues, and the rust-san project seems to lie dormant. We should aim to make it standard practice for crates to run these sanitizers as part of their CI. Even if the crate doesn’t use unsafe code itself, it might have a dependency that does, so using the sanitizer here helps improve the test coverage of that dependency. This won’t catch violations of Rust-specific rules, and to my knowledge these sanitizers all have false negatives (meaning they miss bugs in their domain), but they do find many issues and that goes a long way.
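For example, running a test suite under AddressSanitizer currently requires a nightly toolchain and an explicit target (a sketch; the flag is unstable and its spelling may change):

```shell
# The sanitizers are a nightly `-Z` feature and only available on
# some targets (e.g. x86_64 Linux).
RUSTFLAGS="-Z sanitizer=address" cargo +nightly test --target x86_64-unknown-linux-gnu
```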

Of course, it would be awesome to have a Rust-specific sanitizer as well. There is some low-hanging fruit here, like having a way to compile your program with assertions guarding unchecked slice accesses, or verifying that pointers are sufficiently aligned – and then there is a long list of interesting checks that are increasingly hard to implement.
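As a tiny illustration of the second item, here is the kind of assertion such a sanitizer could insert before every raw-pointer access (`check_aligned` is a hypothetical helper written by hand, not an existing API):

```rust
// Hypothetical sanitizer check: panic if a raw pointer is not
// sufficiently aligned for its pointee type.
fn check_aligned<T>(ptr: *const T) {
    let align = std::mem::align_of::<T>();
    assert!(
        (ptr as usize) % align == 0,
        "pointer {:p} is not aligned to {} bytes",
        ptr,
        align
    );
}

fn main() {
    let x: u32 = 5;
    check_aligned(&x as *const u32); // references are always aligned: passes

    let arr = [0u32; 2];
    // One byte past a 4-aligned address is guaranteed misaligned for
    // u32; calling check_aligned on it would panic.
    let _misaligned = unsafe { (arr.as_ptr() as *const u8).add(1) } as *const u32;
}
```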

And then there are all the things I did not think of that could help unsafe code authors do their work with confidence, and reduce the number of mistakes that people inevitably make when writing unsafe code. I am sure there is a lot of possibility here for API design to improve the overall reliability of unsafe code.

So, as you can see, many things are happening, and I think there is a real chance that we can make serious progress on this topic in 2019. Let’s make it happen, together!

Posted on Ralf's Ramblings on Jan 12, 2019.
Comments? Drop me a mail!