Tweag

Bazel remote execution with rules_nixpkgs

29 February 2024 — by Konstantinos Sideris, Guillaume Maudoux

Tweag developed rules_nixpkgs to give Bazel users access to Nix’s reproducible builds and its extensive package registry. The ruleset has proven especially useful in projects that require complex dependency management and consistent build environments.

However, rules_nixpkgs is incompatible with remote execution. This is a major limitation, given that remote execution is possibly the main reason why people switch to Bazel, and that rules_nixpkgs provides a great way to configure hermetic toolchains, which are an important ingredient for reliable remote execution. There is no trivial fix, as can be seen in the related, longstanding open issue. At Tweag we investigated a promising solution presented at Bazel eXchange 2022 (recording), but these ideas were never implemented in a public proof of concept.

In this post, we present our new remote execution infrastructure repo and walk you through the steps required to understand and replicate how it achieves remote execution with rules_nixpkgs.

The remote execution limitation

When we make use of rules_nixpkgs, we instruct Bazel to use packages from nixpkgs rather than those from the host system. This means that when we try to build a C++ project, Bazel won’t use the gcc compiler, which is typically found under /usr/bin, but instead will use the compiler specified by rules_nixpkgs and provided by Nix, typically stored under some /nix/store/<unique_hash>-gcc/bin directory.

Bazel distinguishes actions to import external dependencies from regular build actions. The former are always executed locally [1], while the latter can be distributed using remote execution. rules_nixpkgs falls into the former category and invokes Nix to download and install the required /nix/store/<unique_hash>-gcc path locally on your machine.

This scenario works fine when we’re building locally. However, when we enable remote execution, rules_nixpkgs still installs dependencies locally, while the build happens on another machine. That machine does not have those paths available, so the build will inevitably fail.
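The mismatch can be illustrated with a small check. This is a minimal sketch, not something rules_nixpkgs or Bazel provides: `check_tool_path` is a hypothetical helper that classifies a tool path and reports whether it is executable on the machine where the script runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of rules_nixpkgs or Bazel): classify a tool
# path and report whether it can be executed on the machine running this script.
check_tool_path() {
  case "$1" in
    /nix/store/*) kind="nix" ;;   # provided by Nix
    *)            kind="host" ;;  # provided by the host system
  esac
  if [ -x "$1" ]; then
    echo "$kind: available"
  else
    echo "$kind: missing"
  fi
}
```

Run on the local machine after rules_nixpkgs has fetched the toolchain, the Nix compiler path would be reported as available. Run on an executor without a /nix/store, the same path would be reported as missing, which is the situation remote actions end up in.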

Initial setup with remote execution

For our proof of concept, we decided to use Buildbarn to provide the remote execution endpoint and infrastructure. Buildbarn provides Kubernetes manifests that we can use to deploy all the necessary Buildbarn components for remote execution to work. We’ll be using the examples from the bb-deployments repository to test our setup, but also modifying it to make use of rules_nixpkgs.

To replicate our implementation you’ll need a working Buildbarn infrastructure, which in this case would be a Kubernetes cluster. You can use our guide to set up a cluster on AWS.

Test remote execution without rules_nixpkgs

To make sure that everything is working as expected, we’ll use the @abseil-hello Bazel target, which is available in the Buildbarn deployments repo. This example does not use rules_nixpkgs yet. If you want to follow along, clone the bb-deployments repository.

  • Get the service endpoint of the Buildbarn executor service (frontend). If you’re deploying on a cloud provider this would be a load-balancer.
$ kubectl get services -n buildbarn
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP                         PORT(S)                      AGE
browser     ClusterIP      172.20.22.171   <none>                              7984/TCP                     8d
frontend    LoadBalancer   172.20.126.97   xxxxx.us-east-1.elb.amazonaws.com   8980:31657/TCP               8d
scheduler   ClusterIP      172.20.83.110   <none>                              8982/TCP,8983/TCP,7982/TCP   8d
storage     ClusterIP      None            <none>                              8981/TCP                     8d
  • Update .bazelrc to use the remote executor endpoint of our environment:
...
build:remote-exec --remote_executor=grpc://[endpoint-from-previous-step]
...
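The two steps above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the service is assumed to be called `frontend` in the `buildbarn` namespace and to expose a load-balancer hostname (as in the listing above), and port 8980 is taken from that same listing.

```shell
#!/bin/sh
# Hypothetical helper: ask Kubernetes for the external hostname of the
# Buildbarn frontend service (assumes a load balancer that reports a hostname).
frontend_hostname() {
  kubectl get service frontend -n buildbarn \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Hypothetical helper: print the .bazelrc line for a given endpoint.
# The port defaults to 8980, the frontend port shown in the service listing.
remote_executor_line() {
  printf 'build:remote-exec --remote_executor=grpc://%s:%s\n' "$1" "${2:-8980}"
}

# Example usage (requires access to the cluster):
#   remote_executor_line "$(frontend_hostname)"
```

The printed line still has to be placed in .bazelrc by hand, replacing the existing `--remote_executor` entry of the `remote-exec` config.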

Now we can try building the @abseil-hello target using the remote execution infrastructure. Note that we’ll be using a custom toolchain specific to the default executors created by Buildbarn.

bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main

Test remote execution with rules_nixpkgs

Once we have validated that our setup works we can create a new target that uses rules_nixpkgs.

Update .bazelversion to use 6.4, which is a version supported by rules_nixpkgs (any other version in the 6.x series should work as well).

Update the WORKSPACE file with the following:

# http_archive must already be loaded in the WORKSPACE; if it is not, add:
# load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "io_tweag_rules_nixpkgs",
    strip_prefix = "rules_nixpkgs-244ae504d3f25534f6d3877ede4ee50e744a5234",
    urls = ["https://github.com/tweag/rules_nixpkgs/archive/244ae504d3f25534f6d3877ede4ee50e744a5234.tar.gz"],
)

load("@io_tweag_rules_nixpkgs//nixpkgs:repositories.bzl", "rules_nixpkgs_dependencies")
rules_nixpkgs_dependencies()

load("@io_tweag_rules_nixpkgs//nixpkgs:nixpkgs.bzl", "nixpkgs_git_repository", "nixpkgs_package", "nixpkgs_cc_configure")

load("@io_tweag_rules_nixpkgs//nixpkgs:toolchains/go.bzl", "nixpkgs_go_configure") # optional

# Pin the nixpkgs revision that provides the toolchain packages.
nixpkgs_git_repository(
    name = "nixpkgs",
    revision = "23.11",
)

# Define a C++ toolchain backed by clang from nixpkgs.
nixpkgs_cc_configure(
    name = "nixpkgs_config_cc",
    repository = "@nixpkgs",
    attribute_path = "clang",
)

This is the standard boilerplate to install rules_nixpkgs in our Bazel workspace. We’re also creating a reference to the nixpkgs repository, and a C++ toolchain using clang.

Next, we create a new cc_binary target in BUILD.bazel with a simple hello-world program.

$ cat BUILD.bazel
...
cc_binary(
    name = "hello-world",
    srcs = ["hello-world.cc"],
)

$ cat hello-world.cc
#include <iostream>

int main(int argc, char** argv) {
  std::cout << "Hello world!" << std::endl;
  return 0;
}

Now we need to update the custom Buildbarn toolchain used by the executors to reference @nixpkgs_config_cc. Update the file tools/remote-toolchains/BUILD.bazel and replace the instances of @remote_config_cc with @nixpkgs_config_cc.
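The replacement can be done by hand or with a short script. The following is a minimal sketch: `swap_cc_toolchain` is a hypothetical helper, and it assumes that every occurrence of `@remote_config_cc` in the file should be rewritten.

```shell
#!/bin/sh
# Hypothetical helper: rewrite every @remote_config_cc reference in a file
# to @nixpkgs_config_cc. A temporary file is used instead of `sed -i`,
# whose syntax differs between GNU and BSD sed.
swap_cc_toolchain() {
  file="$1"
  tmp="$(mktemp)" || return 1
  if sed 's/@remote_config_cc/@nixpkgs_config_cc/g' "$file" > "$tmp"; then
    # Write back through redirection so the file keeps its permissions.
    cat "$tmp" > "$file"
  fi
  rm -f "$tmp"
}

# Example usage, from the root of the bb-deployments checkout:
#   swap_cc_toolchain tools/remote-toolchains/BUILD.bazel
```

Reviewing the result with `git diff` before building is a cheap way to confirm that only the intended labels changed.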

We can try building the application using the C++ toolchain we defined with rules_nixpkgs. We expect this to fail because the executors are not Nix-aware yet.

$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main

...
ERROR: /home/user/.cache/bazel/_bazel_user/5ce2ca33a49034ed7557e24d70204ce5/external/com_google_absl/absl/base/BUILD.bazel:324:11: Compiling absl/base/internal/throw_delegate.cc failed: (Exit 34): Remote Execution Failure:
Invalid Argument: Failed to run command: Failed to start process: fork/exec /nix/store/n37gxbg343hxin3wdryx092mz2dkafy8-clang-wrapper-16.0.6/bin/cc: no such file or directory
...

Because the executors don’t have the /nix/store available, they cannot resolve the compiler path, which was generated locally on our machine when we invoked bazel build.

Now let’s see how we can solve this problem by configuring the executors to access a shared /nix/store via NFS.

NFS-based solution

Our solution involves a Nix server that bridges this gap. This server manages and synchronizes the Nix dependencies across the Bazel build environment.

Here’s how it works:

  1. During bazel build the rules_nixpkgs repository rules will build and copy any Nix derivation to the remote Nix server.

  2. The Nix server will export the /nix/store directory tree to the executors as a read-only NFS share.

  3. When a build is triggered, all necessary dependencies are already available on the executors, allowing for the build process to continue.
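To make the first step more concrete, here is a manual approximation of "copy a store path to the Nix server". It is only a sketch and not a description of how rules_nixpkgs does it internally: both function names are hypothetical, and the classic `nix-copy-closure` command stands in for whatever mechanism the ruleset uses.

```shell
#!/bin/sh
# Hypothetical helper: reduce a path inside the Nix store to its store root,
# e.g. /nix/store/<hash>-name/bin/cc -> /nix/store/<hash>-name.
# Fails for paths outside /nix/store.
store_root() {
  rest="${1#/nix/store/}"
  if [ "$rest" = "$1" ]; then
    return 1
  fi
  printf '/nix/store/%s\n' "${rest%%/*}"
}

# Hypothetical helper: copy a store path and its closure to a remote machine
# over SSH. $1 is an SSH host (or ~/.ssh/config alias), $2 a path in the store.
copy_to_nix_server() {
  root="$(store_root "$2")" || return 1
  nix-copy-closure --to "$1" "$root"
}

# Example usage (requires Nix locally and SSH access to the server):
#   copy_to_nix_server nix-server /nix/store/<unique_hash>-gcc/bin/gcc
```

Copying the whole closure matters: a compiler wrapper in the store refers to other store paths, and all of them need to exist on the server for the executors to run it.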

Figure: Workflow overview

Implementation-wise, we’ll need to make the following changes to the Buildbarn infrastructure:

  • A Nix server. This could be a VM with Nix installed that exports the /nix/store directory as a read-only NFS share over the private network. We’ll need SSH access to that server from the machine that invokes bazel build.

  • Kubernetes executors with the exported NFS share mounted.
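As a rough illustration of these two pieces, the sketch below prints an /etc/exports entry for the server and a mount command for a client. The function names, the example network range, and the choice of export options are assumptions made for illustration. On Kubernetes the mount would more likely be declared as a volume in the executor pod specification than run as a command.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/exports entry that shares the Nix store
# read-only with a client network given in CIDR notation.
nix_store_export_line() {
  printf '/nix/store %s(ro,sync,no_subtree_check)\n' "$1"
}

# Hypothetical helper: print the command a client host could use to mount
# the share read-only at the same location.
nix_store_mount_command() {
  printf 'mount -t nfs -o ro %s:/nix/store /nix/store\n' "$1"
}

# Example usage:
#   nix_store_export_line 10.0.0.0/16   # append the output to /etc/exports
#   nix_store_mount_command 10.0.1.5    # run the output as root on a client
```

Mounting the share at /nix/store on the clients is essential, because the paths Bazel sends to the executors are absolute.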

For a detailed setup guide and implementation specifics, refer to our infrastructure repository.

To instruct rules_nixpkgs to copy the Nix derivations to the server, we need to create an entry for the remote server in our SSH config (typically found under ~/.ssh/config), and then set the environment variable BAZEL_NIX_REMOTE to the name of that entry.

# SSH Configuration
$ cat ~/.ssh/config
Host nix-server
  Hostname [public-ip]
  IdentityFile [ssh-private-key]
  Port [ssh-port]
  User [ssh-user]
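Before running a build, it can help to confirm that the variable is set and that the SSH entry resolves as intended. This is a minimal sketch with hypothetical function names. It relies on `ssh -G`, which prints the configuration OpenSSH would use for a host without connecting to it.

```shell
#!/bin/sh
# Hypothetical helper: print the value of BAZEL_NIX_REMOTE, or fail with a
# message when it is unset or empty.
require_nix_remote() {
  if [ -z "${BAZEL_NIX_REMOTE:-}" ]; then
    echo "BAZEL_NIX_REMOTE is not set" >&2
    return 1
  fi
  echo "$BAZEL_NIX_REMOTE"
}

# Hypothetical helper: print the host name that ssh would connect to for a
# given config entry, as resolved from ~/.ssh/config.
resolved_hostname() {
  ssh -G "$1" | awk '$1 == "hostname" { print $2 }'
}

# Example usage:
#   export BAZEL_NIX_REMOTE=nix-server
#   resolved_hostname "$(require_nix_remote)"
```

If the printed host name is just the alias itself, the entry was probably not picked up from the SSH config.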

Testing out remote execution again

With the new setup, we can try building the project again.

$ export BAZEL_NIX_REMOTE=nix-server
$ bazel clean --expunge # To refetch the Nix derivations
$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main

You should now see lines like the following, confirming communication with the Nix server:

...
Analyzing: target @abseil-hello//:hello_main (0 packages loaded, 0 targets configured)
    Fetching repository @nixpkgs_config_cc_info; Remote-building Nix derivation 9s
...

And the build should be successful.

Conclusion

In this post, we explored the challenges of integrating rules_nixpkgs with remote execution in Bazel, and our solution to them. Of course, this solution is not perfect, and it comes with some shortcomings that end users should be aware of.

  • The first issue is about cache eviction. Caching all the Nix paths over the long term is not practical from a storage standpoint. That’s why we need a way to mark the required paths and garbage collect the others. A Nix path should remain available as long as a client may trigger a remote build that uses it. However, there’s no way to determine when a client no longer needs a specific path. A simple solution would be to evict the least used paths, but that would require a tighter integration with the Bazel APIs in order to track Nix path usage.

  • The second issue relates to NFS performance, which depends on the infrastructure and workloads in operation. At a minimum, we want to tune NFS synchronization so that the paths are available before any build begins. Slow synchronization between the NFS server and client can lead to failed builds.
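Both shortcomings can be made more tangible with a small sketch. Everything in it is an assumption for illustration: no usage log exists today, so the "timestamp path" log format is hypothetical, and evicting by last use is only one possible policy. `wait_for_path` shows the kind of guard an executor could apply while an NFS client catches up.

```shell
#!/bin/sh
# Hypothetical helper: given a usage log with lines of the form
# "<epoch-seconds> <store-path>" and a cutoff timestamp, print the paths
# whose most recent use is older than the cutoff (candidates for eviction).
stale_paths() {
  awk -v cutoff="$2" '
    { t = $1 + 0; if (!($2 in last) || t > last[$2]) last[$2] = t }
    END { for (p in last) if (last[p] < cutoff + 0) print p }
  ' "$1" | sort
}

# Hypothetical helper: wait until a path becomes visible, polling once per
# second, and give up after a timeout in seconds (default 30).
wait_for_path() {
  waited=0
  while [ ! -e "$1" ]; do
    if [ "$waited" -ge "${2:-30}" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

Neither function addresses the hard part, which is producing a reliable usage log in the first place. That is where the tighter integration with Bazel would be needed.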


[1] Bazel has an experimental feature that enables remotable repository rule actions. However, its capabilities are too limited to support the rules_nixpkgs use case.

About the authors

Konstantinos Sideris: An SRE/DevOps engineer with a keen interest in networking, infrastructure and build systems.

Guillaume Maudoux: Guillaume has a background in computer science, engineering and applied mathematics. Regarding software systems, his main concern is correctness, reliability, and trustworthiness. He is passionate about understanding complex systems and untangling intricate issues.

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.


© 2024 Modus Create, LLC