sqlelf and 20 years of Nix

Published 2023-03-19 on Farid Zakaria's Blog

If you want to skip ahead, please check out the sqlelf repository and give me your feedback.

🎉 We are celebrating 20 years of Nix 🎉

Within those 20 years, Nix has ushered in a new paradigm for building software reliably, one that is becoming increasingly ubiquitous in the software industry. It has inspired imitators such as Spack & Guix.

Given the concepts introduced by Nix, and its willingness to eschew fundamental Linux conventions such as the Filesystem Hierarchy Standard, I can't help but wonder: has Nix gone far enough in those 20 years?

If you have kept an eye on some of the work I've been doing, you'll know I have spent time thinking about how Nix can make further progress on its goals of reliability and reproducibility.

If you haven’t seen my talk on Rethinking basic primitives for store based systems, I recommend you watch it.

Nix, and more specifically NixOS, is uniquely poised to do away with much of the historic cruft that has plagued software, because its dependency closure goes all the way down to the Linux kernel!

There is no shortage of components we could re-imagine; however, I have been focused on the dynamic linker / interpreter. Concepts from the *nixes of the world that are largely historic are up for grabs.

As part of my work on Shrinkwrap, I was getting pretty frustrated working with the ELF file format.

Check out my SuperComputing 2022 paper Mapping Out the HPC Dependency Chaos.

The best tools we have to introspect binaries are readelf and objdump, which simply dump raw ASCII text to the console.

❯ readelf --demangle --dyn-syms /usr/bin/ruby | head

Symbol table '.dynsym' contains 22 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND ruby_run_node

Why are we restricted to working with such a challenging file format?

I am working on an idea which I am very excited about 🤓 and which I will write about separately. To prove to myself that the idea has merit, I wanted to explore building a tool that allows for easier introspection of ELF files.

Using the power of SQLite and the Virtual Tables concept, I wrote sqlelf: Explore ELF objects through the power of SQL.

❯ sqlelf /usr/bin/ruby /bin/ls /usr/bin/pnmarith
sqlite> SELECT elf_headers.path, COUNT(*) as num_sections
    ..> FROM elf_headers
    ..> INNER JOIN elf_sections ON elf_headers.path = elf_sections.path
    ..> WHERE elf_headers.type = 3
    ..> GROUP BY elf_headers.path;
path|num_sections
/bin/ls|31
/usr/bin/pnmarith|27
/usr/bin/ruby|28

If I can prove a clean 1:1 mapping between the two formats (ELF and the relational model), then there is tremendous potential.
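To make that mapping concrete, here is a minimal sketch in Python using pyelftools and an in-memory SQLite database. The table and column names simply mirror the example above and are not sqlelf's actual schema; sqlelf exposes the ELF structures through SQLite virtual tables rather than copying rows like this, so treat it purely as an illustration of the shape of the mapping.

import sqlite3
from elftools.elf.elffile import ELFFile

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE elf_headers (path TEXT, type TEXT)")
db.execute("CREATE TABLE elf_sections (path TEXT, name TEXT, size INTEGER)")

for path in ["/bin/ls", "/usr/bin/ruby"]:
    with open(path, "rb") as f:
        elf = ELFFile(f)
        # pyelftools reports e_type as a string, e.g. 'ET_DYN' for shared objects / PIEs
        db.execute("INSERT INTO elf_headers VALUES (?, ?)",
                   (path, elf.header["e_type"]))
        for section in elf.iter_sections():
            db.execute("INSERT INTO elf_sections VALUES (?, ?, ?)",
                       (path, section.name, section["sh_size"]))

# The same aggregation as the sqlelf session above.
for row in db.execute("""
        SELECT elf_headers.path, COUNT(*) AS num_sections
        FROM elf_headers
        INNER JOIN elf_sections ON elf_headers.path = elf_sections.path
        GROUP BY elf_headers.path"""):
    print(row)

Under that view, every readelf listing is just a SELECT over a handful of tables.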

I am still working through mapping the domain model to individual tables (contributions and help appreciated!), but I am really excited by this idea. Consider some of the possibilities:

  1. Linker specifications can be articulated in SQL to guarantee their semantics (a toy sketch follows this list).
  2. Dynamic loader specifications can likewise be defined in SQL, potentially making use of ACID constraints.
  3. Analysis of files at large becomes easy to do with SQL.
  4. We remove a custom file format.
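As a taste of what the first point could look like, here is a toy, entirely hypothetical sketch: two made-up tables describing which symbols a binary needs and which symbols each library in link order provides, with the "first definition in link order wins" rule written as a single query. None of this is sqlelf's schema; it only shows how a linker rule reads when stated declaratively.

import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical tables, not part of sqlelf.
db.execute("CREATE TABLE needed_symbols (binary TEXT, name TEXT)")
db.execute("CREATE TABLE provided_symbols (library TEXT, link_order INTEGER, name TEXT)")

db.executemany("INSERT INTO needed_symbols VALUES (?, ?)",
               [("a.out", "ruby_run_node"),
                ("a.out", "malloc"),
                ("a.out", "missing_symbol")])
db.executemany("INSERT INTO provided_symbols VALUES (?, ?, ?)",
               [("libruby.so", 1, "ruby_run_node"),
                ("libc.so", 2, "malloc"),
                ("libother.so", 3, "malloc")])

# "First definition in link order wins"; unresolved symbols show up as NULL,
# which is exactly the kind of property a constraint could guarantee up front.
query = """
    SELECT n.name AS symbol,
           (SELECT p.library FROM provided_symbols p
            WHERE p.name = n.name
            ORDER BY p.link_order LIMIT 1) AS resolved_by
    FROM needed_symbols n
    WHERE n.binary = 'a.out'
"""
for row in db.execute(query):
    print(row)

An "undefined symbol" error then becomes nothing more than a NOT NULL check.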

Unix introduced a simple concept: what if everything was a file?

🙈 🙉 🙊 What if everything was a database? 🙈 🙉 🙊