Sep 28, 2020 • Rust

What (not so) recently happened in Miri

A lot has happened in Miri over the last year and a half, and I figured it would be a good idea to advertise all this progress a bit more widely, so here we go. We also recently performed a breaking change that affects some CI configurations, so this post serves as an announcement for you to update your CI configuration if needed.

For the uninitiated, Miri is an interpreter that runs your Rust code and checks if it triggers any Undefined Behavior. You can think of it as a very thorough (and very slow) version of valgrind: Miri will detect when your program uses uninitialized memory incorrectly, performs out-of-bounds memory accesses or pointer arithmetic, violates key language invariants, does not ensure proper pointer alignment, or causes incorrect aliasing. As such, it is most helpful when writing unsafe code, as it aids in ensuring that you follow all the rules required for unsafe code to be correct and safe. Miri also detects memory leaks, i.e., it informs you at the end of program execution if there is any memory that was not deallocated properly.
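To make the leak check concrete, here is a minimal sketch; the function name `leak_buffer` is made up for illustration. A regular build runs this without complaint, whereas I would expect Miri to report the forgotten allocation once the program finishes.

```rust
/// Allocates a buffer on the heap and deliberately never frees it.
/// Returns the buffer length so callers have something to check.
/// (`leak_buffer` is a hypothetical name used only for this sketch.)
fn leak_buffer() -> usize {
    let buf: Box<[u8; 16]> = Box::new([0u8; 16]);
    let len = buf.len();
    // `mem::forget` skips the destructor, so the heap allocation is never
    // deallocated. This is safe code, but it is a memory leak.
    std::mem::forget(buf);
    len
}
```

Leaking is not Undefined Behavior, which is why this needs no `unsafe`; the point is only that the leak goes unnoticed in a normal test run.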

However, being an interpreter, Miri is limited in the kinds of code it can execute – everything that would usually involve interacting with C libraries or the operating system needs to be specifically supported, as C code cannot be interpreted by Miri. Miri also lacks support for some Rust features that are hard to interpret, but we are slowly closing these gaps.

Recent and past progress in Miri

During the last 1.5 years, thanks to a series of excellent contributors, we made a lot of progress towards supporting more and more Rust code to run in Miri. I am going to list some highlights below.

If you want to learn how to use Miri yourself, scroll down to the end of this post. If you are using Miri already, maybe you are still passing flags like --exclude-should-panic or disabling tests that require concurrency; you should be able to update those flags now. Also note the breaking change in how cargo miri interprets CLI arguments below!

Randomness and `HashMap`

The Rust HashMap picks a new random seed for each execution. This seed is obtained from the operating system, an operation which Miri did not support until @Aaron1011 implemented getrandom (#683). To ensure the same program behaves the same way each time it is run by Miri, Miri internally uses a deterministic RNG (seeded with 0, but that can be changed via -Zmiri-seed) to implement getrandom. This PR also enabled Miri to be used with projects that use the rand crate for randomness.

However, this also means randomness in Miri is actually not random, so do not use Miri to perform any important cryptographic operations.
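As a small illustration (the function names are hypothetical), the following sketch builds a HashMap and returns its keys in iteration order. On a normal run that order can differ between executions because of the random seed; under Miri with the default seed I would expect it to be the same every time. Code that needs a stable order should sort explicitly rather than rely on either behavior.

```rust
use std::collections::HashMap;

/// Builds a small map and returns its keys in whatever order the map
/// iterates them. This order depends on the hash seed.
fn keys_in_iteration_order() -> Vec<u32> {
    let mut map = HashMap::new();
    for k in 0..8u32 {
        map.insert(k, k * k);
    }
    map.keys().copied().collect()
}

/// Same keys, but sorted, so the result no longer depends on the seed.
fn keys_sorted() -> Vec<u32> {
    let mut keys = keys_in_iteration_order();
    keys.sort_unstable();
    keys
}
```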

Unwinding

Miri used to just abort program execution in case of a panic. To better match the behavior of real Rust programs, @Aaron1011 implemented proper unwinding support in Miri (#693). He even implemented catching panics again, which required aligning quite a few pieces across rustc, the standard library, and Miri itself. This means Miri can finally also execute #[should_panic] tests. As of recently, this is supported even for Windows targets.
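Both features can be exercised with a few lines of code. This is just a sketch with made-up names (`checked_div`, `div_panics`): one function catches a panic via `std::panic::catch_unwind`, and one test relies on `#[should_panic]`.

```rust
use std::panic;

/// Divides `a` by `b`, panicking on division by zero.
fn checked_div(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

/// Runs `checked_div` and reports whether it panicked.
/// This only works if the panic unwinds and can be caught again.
fn div_panics(a: i32, b: i32) -> bool {
    panic::catch_unwind(move || checked_div(a, b)).is_err()
}

/// A test that passes only if the body panics with the expected message.
#[test]
#[should_panic(expected = "division by zero")]
fn dividing_by_zero_panics() {
    checked_div(1, 0);
}
```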

Pointer-integer casts

Thanks to @christianpoveda, Miri now properly supports casting arbitrary pointers to integers and back (#779).

Recently, I also adjusted the alignment check to fully take this information into account, so that Miri can now run code that performs its own alignment logic (#1513). Notice however that this can lead to code that just happens to work by pure chance; to properly test such code, the test should be run at least 10 times.
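Here is a rough sketch of what such hand-written alignment logic might look like; `align_up` and `roundtrip_u32` are hypothetical helpers, not anything from Miri or the standard library. The code casts a pointer to an integer, rounds the address up to a multiple of 4, and casts the result back to a pointer.

```rust
/// Rounds `addr` up to the next multiple of `align` (a power of two).
fn align_up(addr: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}

/// Stores a `u32` at the first 4-aligned address inside a byte buffer
/// and reads it back.
fn roundtrip_u32(value: u32) -> u32 {
    let mut buf = [0u8; 16];
    // Pointer -> integer: look at the numeric address of the buffer.
    let base = buf.as_mut_ptr() as usize;
    // At most 3 bytes are skipped, so 4 bytes still fit into the 16-byte buffer.
    let aligned = align_up(base, 4);
    // Integer -> pointer: turn the adjusted address back into a pointer.
    let ptr = aligned as *mut u32;
    unsafe {
        ptr.write(value);
        ptr.read()
    }
}
```

Skipping the `align_up` step would still work whenever `buf` happens to start at a 4-aligned address, which is exactly the "works by chance" situation described above.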

File system access

@christianpoveda went on to implement file system access (this series of PRs started with #962). Later, @divergentdave improved that support with directory listing and some related operations (starting with #1152). This means programs running in Miri can now read from and write to files on the host computer. This is the first form of communication that we support between the interpreted program and the outside world. Communication needs to be explicitly requested via -Zmiri-disable-isolation; by default, Miri isolates the program to ensure that each execution is perfectly reproducible.

File system access is only supported on Linux and macOS targets, but due to cross-interpretation this is not a problem even for Windows users – see the next point.
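For illustration, this is the kind of code that becomes runnable; the helper name and the file name are made up. I would expect it to need -Zmiri-disable-isolation when run in Miri, since it touches the host file system.

```rust
use std::fs;
use std::io;

/// Writes `contents` to a file in the system temp directory, reads it
/// back, and removes the file again.
/// (`write_then_read` is a hypothetical helper for this sketch.)
fn write_then_read(file_name: &str, contents: &str) -> io::Result<String> {
    let path = std::env::temp_dir().join(file_name);
    fs::write(&path, contents)?;
    let read_back = fs::read_to_string(&path)?;
    fs::remove_file(&path)?;
    Ok(read_back)
}
```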

Cross-interpretation

Based on earlier work by @Aaron1011 who made Miri use check-only builds both for the standard library and the interpreted crate itself (#1136), I made Miri support “cross-interpretation” (#1249). This means even when you are on a Windows host, you can pass --target x86_64-unknown-linux-gnu so Miri will interpret the program as if it were running on Linux, in particular using all the Linux parts of the standard library for the interaction with the operating system. Since Miri supports the Linux APIs for file system access, it can interpret these programs even when running on a Windows host.

This is particularly useful when testing target features that differ from the host platform: for example, even on a 64-bit macOS host, you can run programs for the 32-bit Linux target (--target i686-unknown-linux-gnu), making sure your logic works for different pointer sizes. Miri also supports big-endian targets like --target mips64-unknown-linux-gnuabi64, so if your code is endianness-sensitive, you can test if it behaves correctly on big-endian systems. And finally, cross-interpretation was enormously helpful for developing Miri itself; for example, I relied on this when fixing up our panic and unwinding support for Windows targets.
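A sketch of code where this matters (all names here are hypothetical): the encoding functions pin the byte order explicitly, so they should give the same bytes on every target, while `target_summary` reports what the current target actually looks like.

```rust
/// Encodes a length prefix in little-endian byte order, independent of
/// the target's native endianness.
fn encode_len(len: u32) -> [u8; 4] {
    len.to_le_bytes()
}

/// Decodes a length prefix written by `encode_len`.
fn decode_len(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}

/// Reports the pointer width (in bits) and endianness of the target
/// this code was built for.
fn target_summary() -> (usize, &'static str) {
    let pointer_bits = std::mem::size_of::<usize>() * 8;
    let endian = if cfg!(target_endian = "big") { "big" } else { "little" };
    (pointer_bits, endian)
}
```

Had `encode_len` used `to_ne_bytes` instead, a test comparing against fixed bytes would pass on little-endian targets and fail on a big-endian one, which is the kind of bug a big-endian Miri run can surface.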

Concurrency

Earlier this year, @vakaras surprised me by suddenly showing up with a series of patches that equip Miri with support for concurrency (#1284). This is work he did during an internship with Amazon, so also thank you to Amazon for sponsoring this work! Now programs running in Miri can spawn threads and interact via locks or atomics. There are some caveats though: Miri does not detect data races, so programs with incorrect synchronization can cause Undefined Behavior through data races without Miri noticing. Also, Miri’s scheduler is rather crude, so programs can be stuck in infinite loops under some circumstances.
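As a minimal sketch of such a program (the function name is made up), the following spawns a few threads that increment one counter behind a lock and another one via an atomic. It is correctly synchronized, so the caveat about data races does not apply here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` threads that each increment two shared counters
/// `per_thread` times: one protected by a mutex, one atomic.
/// Returns the final value of both counters.
fn count_with_threads(threads: usize, per_thread: usize) -> (usize, usize) {
    let locked = Arc::new(Mutex::new(0usize));
    let atomic = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let locked = Arc::clone(&locked);
            let atomic = Arc::clone(&atomic);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *locked.lock().unwrap() += 1;
                    atomic.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining synchronizes with each thread, so all increments are
    // visible to the reads below.
    for handle in handles {
        handle.join().unwrap();
    }

    let locked_total = *locked.lock().unwrap();
    let atomic_total = atomic.load(Ordering::Relaxed);
    (locked_total, atomic_total)
}
```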

Better `cargo` compatibility (breaking change!)

Recently, I mostly re-wrote the main entry point for users to execute programs in Miri, cargo miri (#1540). It is now more compatible with cargo itself: cargo test and cargo miri test support the exact same flags, and likewise for cargo run and cargo miri run.

However, this required a breaking change: previously, the way to pass flags to Miri itself and the program when executing the test suite was cargo miri test -- <miri flags> -- <test suite flags>. Now flags are passed via cargo miri test -- <test suite flags> like they are with cargo test; if you need to pass flags to Miri, you can set the MIRIFLAGS variable which works like RUSTFLAGS. I also removed support for cargo miri without further arguments, which used to be an alias for cargo miri run. The reason is that (a) cargo miri test is actually used much more frequently and (b) disambiguating these options while also supporting arbitrary flags is tricky.

If you have set up your CI to run tests in Miri, please make sure to adjust your configuration to the new format. For now, Miri still supports the old style (and emits an appropriate warning), but the intention is to remove that support code eventually. If your project is hosted on GitHub and is affected by the change, you should have already received a notification from me, but I might have missed some projects and of course not everything is on GitHub. While at it, you can also remove cargo miri setup from your CI script; that is no longer needed as thanks to @dtolnay Miri automatically detects when it runs on CI and goes into non-interactive mode.

… and more

This list is far from exhaustive. Many small functions, from trigonometry to environment variable access to timekeeping, have been implemented over the last months, ever growing the range of programs that Miri can execute. Thank you to @Aaron1011, @christianpoveda, @divergentdave, @JOE1994, and @samrat! I hope I did not miss anyone…

Using Miri

If this post made you curious and you want to give Miri a try, here’s how to do that. Assuming you have a crate with some unsafe code, and you already have a test suite (you are testing your unsafe code, right?), you can just install Miri (rustup +nightly component add miri) and then run cargo +nightly miri test to execute all tests in Miri (except for doctests, which are not supported yet). Note that this requires the nightly toolchain as Miri is still an experimental tool.

Miri is very slow, so it is likely that some tests will take way too long to be feasible. You can adjust iteration counts in Miri without affecting non-Miri testing as follows:

let limit = if cfg!(miri) { 10 } else { 10_000 };
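Inside a test suite, this could look roughly as follows. The function and test names are made up for illustration; `#[cfg_attr(miri, ignore)]` is the attribute for skipping a test entirely when running in Miri.

```rust
/// Picks a small iteration count under Miri and a larger one otherwise.
fn iteration_limit() -> u64 {
    if cfg!(miri) { 10 } else { 10_000 }
}

/// Stand-in for an expensive computation: sums all integers below `limit`.
fn sum_below(limit: u64) -> u64 {
    (0..limit).sum()
}

/// Runs in both modes, just with fewer iterations in Miri.
#[test]
fn sum_matches_closed_form() {
    let limit = iteration_limit();
    assert_eq!(sum_below(limit), limit * (limit - 1) / 2);
}

/// Too slow to be worth interpreting, so it is skipped under Miri.
#[test]
#[cfg_attr(miri, ignore)]
fn sum_of_a_large_range() {
    assert_eq!(sum_below(1_000_000), 499_999_500_000);
}
```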

If your test suite needs to access OS facilities such as timers or the file system, set MIRIFLAGS=-Zmiri-disable-isolation to enable those. (Miri will tell you when that is necessary.) If your test suite runs into an unsupported operation, please report an issue.

If you want to add Miri to your CI to ensure your test suite keeps working in Miri, please consult our README. That document is also a great starting point for any other questions you might have.

Miri is also integrated into the Rust Playground: you can select Miri in the “Tools” menu to check the code for Undefined Behavior.

If Miri complains about your code and you do not understand why, I am happy to help! The best places to ask probably are Zulip (the #general stream seems fine), and the Miri issue tracker. Asking publicly is strongly encouraged so other people can help answer the question, and everyone can learn from the responses. Questions are much easier to answer if you manage to reproduce the problem in a small self-contained bit of example code (ideally on the playground), but feel free to ask even if you do not know how to reduce the problem.

Helping Miri

If you want to help improve Miri, that’s awesome! The issue tracker is a good place to start; the list of issues is short enough that you can just browse through it rather quickly to see if anything piques your interest. Another good starting point is to try to implement the missing bit of functionality that keeps your test suite from working. If you need any mentoring, just get in touch. :)

That’s it for now. I am totally blown away by how many people are already using Miri; this endeavor of re-shaping the way we approach correctness of unsafe code has been way more successful than I expected. I hope Miri can also help you to ensure correctness of your unsafe code, and I am excited for what the next year of Miri development will bring. :D

Posted on Ralf's Ramblings on Sep 28, 2020.
Comments? Drop me a mail or leave a note on reddit!