The postmodern build system (updated 2025)
This is a post about an idea. I don't think it exists, and I am unsure if I will be the one to build it. However, something like it should be made to exist.
What we want
- Trustworthy incremental builds: move the incrementalism to a distrusting layer such that the existence of incremental-build bugs requires hash collisions.
Such a goal implies sandboxing the jobs and making them into pure functions, which are rerun when the inputs change at all.
This is inherently wasteful of computation because it ignores semantic equivalence of results in favour of only binary equivalence, and we want to reduce the wasted computation. However, for production use cases, computation is cheaper than the possibility of incremental bugs.
This can be equivalently phrased in terms of lacking "build identity": is there any way that the system knows what the "previous version" of the "same build" is? A postmodern build system doesn't have build identity, because it causes problems for multitenancy among other things: who decides what the previous build is?
- Maximize reuse of computation across builds. Changing one source file should rebuild as little as absolutely necessary.
- Distributed builds: We live in a world where software can be compiled much faster by using multiple machines. Fortunately, turning the build into pure computations almost inherently allows distributing it.
Review
Build systems à la carte
This post uses language from "Build systems à la carte":
- Monadic: a build that needs to run builds to know the full targets. It's so called because of the definition of the central operation on monads:

bind :: Monad m => m a -> (a -> m b) -> m b

This means that, given a not-yet-executed action returning a and a function taking the resolved result of that action, you get a new action whose shape depends an arbitrarily large amount on the result of m a. This is a dynamic build plan, since full knowledge of the build plan requires executing m a.
- Applicative: a build for which the plan is statically known. Generally this implies a strictly two-phase build where the targets are evaluated, a build plan made, and then the build executed. This is so named because of the central operation on applicative types:

apply :: Applicative f => f (a -> b) -> f a -> f b

This means that, given a predefined pure function inside a build, the function can be executed to perform the build. But the shape of the build plan is known ahead of time, since the function cannot execute other builds.
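As a small illustration of the difference (the Target and Artifact types here are hypothetical, purely for the sake of the example): a monadic build can look at the result of one build step to decide what to build next, while an applicative plan is fixed before anything runs.

-- Hypothetical types, for illustration only.
data Target   = Target String
data Artifact = Artifact { producedBy :: Target, discoveredTargets :: [Target] }

-- Monadic: the rest of the plan depends on the *result* of building the root
-- (say, a generated list of sources), so the full plan is only known by
-- actually running builds.
monadicBuild :: Monad m => (Target -> m Artifact) -> Target -> m [Artifact]
monadicBuild build root = do
  a    <- build root
  rest <- mapM build (discoveredTargets a)   -- shape depends on `a`
  pure (a : rest)

-- Applicative: the set of targets is known up front, so the whole plan can be
-- computed (and, say, distributed) before any build runs.
applicativeBuild :: Applicative f => (Target -> f Artifact) -> [Target] -> f [Artifact]
applicativeBuild build = traverse build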
Nix
As much of a Nix shill as I am, Nix is not the postmodern build system. It has some design flaws that are very hard to rectify. Let's write about the things it does well, that are useful to adopt as concepts elsewhere.
Nix is a build system based on the idea of a "derivation". A derivation is
simply a specification of an execution of execve. Its output is then stored
in the Nix store (/nix/store/*) based on a name determined by hashing inputs
or outputs. Memoization is achieved by skipping builds for which the output
path already exists.
This mechanism lacks build identity, and is multitenant: you can dump a whole bunch of different Nix projects of various versions on the same build machine and they will not interfere with each other because of the lack of build identity; the only thing to go off of is the hash.
The store path is either:
- Named based on the hash of the contents of the derivation: input-addressed
This is the case for building software, typically.
- Named based on the hash of the output: fixed-output
This is the case for downloading things, and in practice has a relaxed sandbox allowing network access. However, the output is then hashed and verified against a hardcoded value.
- Named based on the output hash of the derivation, which is not fixed: content-addressed.
Note that ca-derivations have had a rocky deployment timeline and have been removed from Lix. See ca-derivations.
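As a toy model of what gets hashed in each case (illustrative only; this is not Nix's actual hashing scheme or store-path algorithm):

-- Toy model of the three addressing modes; not Nix's real algorithm.
type Hash = String

data Addressing
  = InputAddressed Hash     -- hash of the derivation itself: known before the build
  | FixedOutput Hash        -- expected output hash declared up front: known before the build
  | ContentAddressed Hash   -- hash of what the build actually produced: known only afterwards

storePath :: String -> Addressing -> FilePath
storePath name addressing = "/nix/store/" <> hashOf addressing <> "-" <> name
  where
    hashOf (InputAddressed h)   = h
    hashOf (FixedOutput h)      = h
    hashOf (ContentAddressed h) = h

-- e.g. storePath "hello-2.12.1" (InputAddressed "sbldylj3clbkc0aqvjjzfa6slp4zdvlj")
--   == "/nix/store/sbldylj3clbkc0aqvjjzfa6slp4zdvlj-hello-2.12.1"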
The derivation for GNU hello
{
"/nix/store/nvl9ic0pj1fpyln3zaqrf4cclbqdfn1j-hello-2.12.1.drv": {
"args": [
"-e",
"/nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh"
],
"builder": "/nix/store/q1c2flcykgr4wwg5a6h450hxbk4ch589-bash-5.2-p15/bin/bash",
"env": {
"__structuredAttrs": "",
"buildInputs": "",
"builder": "/nix/store/q1c2flcykgr4wwg5a6h450hxbk4ch589-bash-5.2-p15/bin/bash",
"cmakeFlags": "",
"configureFlags": "",
"depsBuildBuild": "",
"depsBuildBuildPropagated": "",
"depsBuildTarget": "",
"depsBuildTargetPropagated": "",
"depsHostHost": "",
"depsHostHostPropagated": "",
"depsTargetTarget": "",
"depsTargetTargetPropagated": "",
"doCheck": "1",
"doInstallCheck": "",
"mesonFlags": "",
"name": "hello-2.12.1",
"nativeBuildInputs": "",
"out": "/nix/store/sbldylj3clbkc0aqvjjzfa6slp4zdvlj-hello-2.12.1",
"outputs": "out",
"patches": "",
"pname": "hello",
"propagatedBuildInputs": "",
"propagatedNativeBuildInputs": "",
"src": "/nix/store/pa10z4ngm0g83kx9mssrqzz30s84vq7k-hello-2.12.1.tar.gz",
"stdenv": "/nix/store/wr08yanv2bjrphhi5aai12hf2qz5kvic-stdenv-linux",
"strictDeps": "",
"system": "x86_64-linux",
"version": "2.12.1"
},
"inputDrvs": {
"/nix/store/09wshq4g5mc2xjx24wmxlw018ly5mxgl-bash-5.2-p15.drv": {
"dynamicOutputs": {},
"outputs": [
"out"
]
},
"/nix/store/cx5j3jqvvz8b5i9dsrn0z9cxhfd8r73p-stdenv-linux.drv": {
"dynamicOutputs": {},
"outputs": [
"out"
]
},
"/nix/store/qxxv8jm6z12vl6dvgnn3yjfqfgc68jhc-hello-2.12.1.tar.gz.drv": {
"dynamicOutputs": {},
"outputs": [
"out"
]
}
},
"inputSrcs": [
"/nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh"
],
"name": "hello-2.12.1",
"outputs": {
"out": {
"path": "/nix/store/sbldylj3clbkc0aqvjjzfa6slp4zdvlj-hello-2.12.1"
}
},
"system": "x86_64-linux"
}
}
execve memoization and the purification of execve
Central to the implementation of Nix is the idea of making execve pure. This is a brilliant idea that allows it to be used with existing software, and probably would be necessary at some level in a postmodern build system.
The way that Nix purifies execve is through the idea of "unknown input xor
unknown output". A derivation is either input-addressed, with a fully specified
environment that the execve is run in (read: no network) or output-addressed,
where the output has a known hash (with network allowed).
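A rough sketch of the mechanism (illustrative only: the hash, the missing sandbox, and the store layout here are stand-ins, not Nix's actual implementation):

-- Illustrative sketch of memoizing an execve by its inputs; not Nix's real code.
import Data.Char (ord)
import Data.List (foldl')
import System.Directory (doesDirectoryExist)
import System.Process (callProcess)

data Derivation = Derivation
  { builder :: FilePath          -- the program handed to execve
  , args    :: [String]
  , env     :: [(String, String)]
  } deriving (Show)

-- Toy stand-in for a cryptographic hash of the serialized derivation.
hashDerivation :: Derivation -> String
hashDerivation = show . abs . foldl' (\h c -> h * 31 + ord c) (17 :: Int) . show

-- The output path is a pure function of the derivation's inputs.
outputPath :: Derivation -> FilePath
outputPath drv = "/nix/store/" <> hashDerivation drv <> "-out"

-- If the output path already exists, the build is skipped (memoization);
-- otherwise the builder runs. A real implementation would run it inside a
-- sandbox exposing only the declared inputs, with no network unless the
-- derivation is fixed-output.
realise :: Derivation -> IO FilePath
realise drv = do
  let out = outputPath drv
  alreadyBuilt <- doesDirectoryExist out
  if alreadyBuilt
    then pure out
    else callProcess (builder drv) (args drv) >> pure out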
With ca-derivations (RFC), Nix additionally
deduplicates input-addressed derivations with identical outputs, such that
changes in how something is built without changing its output avoid rebuilds.
This solves a large problem in practice, since NixOS needs a huge build cluster to
cope with rebuilding the entire known universe when one byte of glibc changes,
even if the change will not influence downstream outputs. There are current
projects to deduplicate the public binary cache, which is hundreds of terabytes
in size, by putting it on top of a content-addressed store, independently of
Nix itself doing content-addressing.
This idea of memoized pure execve is brilliant because it purifies impure
build systems by ensuring that they are run in a consistent environment, while
also providing some manner of incrementalism. Nix is a source-based build
system but in practice, most things can be fetched from the binary cache,
making it merely a question of time whether a binary or source build is used.
A postmodern build system will need something like memoized execve to be able
to work with the tools of today while it gains adoption, providing
coarse-grained incremental builds in the meantime.
Recursive Nix, import-from-derivation, dynamic derivations
Nix, the build daemon/execve memoizer, is not an obstacle to monadic builds; the problem lies entirely in the Nix language and its C++ implementation.
Nix is a monadic build system, but in many contexts, such as nixpkgs, it is in practice mostly restricted to applicative builds, because NixOS Hydra uses restrict-eval (which I don't think is documented anywhere, but the effect is that IFD is banned).
There are a few ideas to improve this situation, which are all isomorphic at some level:
- Recursive Nix: Nix builds have access to a daemon socket and can execute other Nix evaluations and Nix builds.
This has been removed from Lix due to implementation issues.
- Import from derivation: Nix evaluation can demand a build before continuing further evaluation.
Unfortunately, the Nix evaluator cannot resume other pending evaluation while waiting for a build, so in practice IFD tends to seriously blow up evaluation times through repeated blocking and loss of parallelism.
- Dynamic derivations: The build plans of derivations can be derivations themselves, which are then actually built to continue the build.
This crucially allows the Nix evaluator to avoid blocking on monadic dependencies even though the complete build plan isn't fully resolved, by putting placeholders in place of the dynamic parts of the build graph. Unlike IFD-shaped approaches, no evaluator needs to stay alive to drive the execution: the build graph can be serialized into a file and executed later, including on a different machine.
However, this is to some degree a workaround for the flaw in the Nix evaluator where IFD is serialized; alternative implementations like tvix or alternative surface languages like Zilch are unaffected by this limitation.
This has been removed from Lix due to implementation issues.
Inner build systems, the killjoys
Nix can run monadic build systems inside of a derivation, but since it is not monadic in current usage, builds of very large software wind up being executed in one giant derivation/build target, meaning that nothing is reused from previous builds.
This wastes a lot of compute building identical targets inside the derivation every time the build is run, since the outer build system memoizes at too coarse a granularity.
Puck's Zilch intends to fix this by replacing the surface language of Nix with a Scheme with native evaluation continuation, and integrating with/replacing the inner build systems such as Ninja, Make, Go, and so on such that inner targets are turned into derivations, thus immediately obtaining incremental builds and cross-machine parallelism using the Nix daemon.
Limits of execve memoization
Even if we fixed the Nix surface language to use monadic builds, making inner build systems pure and efficient, at some level the problem build systems become the compilers themselves. In many languages other than C/C++/Java, the build units are much larger (for example, the Rust compilation unit is an entire crate!), so the compilers become build systems with their own incremental build support. Since we are a trustworthy-incremental build system, we cannot reuse previous build products, and we cannot use the compilers' incrementalism, as they pretty much all expect build identity.
That is, the compilers force us into too coarse an incrementalism granularity, which we cannot do anything about as our build systems can't see into them.
For example, in Haskell, there are two ways of running a build: running ghc --make to build an entire project as one process, or running ghc -c (like gcc -c in C) to generate .o files.
- The former eats O(modules) (or maybe superlinear) overhead in terms of the size of each GHC package. This overhead is a lot smaller than the compiler startup overhead and can mostly be ignored.
In cases like Nix, with no incrementalism against old versions of the code, ghc --make mode has much better throughput but needs to build the entire project and thus has very bad latency.
- The latter eats O(modules) GHC startup overhead in total, which is problematic as that overhead is about as much as building a couple of entire small modules. ghc -c has okay latency, but the build system needs to evaluate a large build graph way faster than Nix would ever do it in order to actually invoke the compiler.
The (Glorious Glasgow) Haskell compiler can reuse previous build products and provide incremental compilation that considers semantic equivalence, but that can't be used for production builds in a trustworthy-incremental world since you would have to declare explicit dependencies on the previous build products and accept the possibility of incremental compilation bugs.
Thus, ghc --make incremental mode is at odds with trustworthy-incremental build systems: since you cannot provide a previous build product, you need to rebuild the entire project every time.
This tension between incrementalism and execution overhead is something that needs to be addressed in a postmodern build system by integrating into inner build systems.
Another solution to execve memoization: Persistent Worker
It turns out that in the original version of this post, I didn't consider a secret third option, having been too focused on not trusting the compilers at all. There is a middle ground implemented by Bazel and by Buck2: the Persistent Worker.
Persistent workers amortize the startup costs of the compilers so smaller incremental actions can be used, yielding good latency properties, without degrading hermeticity/reliability nearly as much as delegating incremental builds to the compiler.
Persistent workers are designed as a batch compilation mode without the ability to read previous build products for a given target, though potentially reusing some very carefully chosen caches.
That is, they are RPC servers that do the equivalent of gcc -c as an RPC rather than an execve.
A server design allows hermetic build systems without build identity to have small, correct actions that run fast, by eliminating the reason to need larger actions (a sketch follows the list below):
- The actions can be small and incremental due to eliminating startup overhead (an RPC, not an execve)
- Latency to finish a unit of work is much better than larger actions
- Throughput is reduced relatively little compared to running larger actions
- With a careful caching design, the concept of "pre-compiled headers" or similar can be an implementation detail of the compiler server where common dependencies' deserialization may be reused across multiple actions
- No previous action version needs to be picked as the actions themselves aren't incremental, so the need for build identity vanishes
- The compiler is not passed the previous version of the file, so it can't reuse it and is unlikely to have incremental bugs
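A minimal sketch of the worker loop (the framing here, one request per line, is a hypothetical stand-in for the real worker protocols, which exchange structured work requests over stdin/stdout):

-- Minimal persistent-worker loop; the request framing is a stand-in.
import Control.Monad (forever)
import System.IO (BufferMode (LineBuffering), hFlush, hSetBuffering, stdout)

-- Expensive state loaded once per worker process rather than once per action:
-- the compiler itself, parsed core libraries, carefully chosen caches.
data CompilerState = CompilerState

loadCompiler :: IO CompilerState
loadCompiler = pure CompilerState   -- stand-in for the costly startup work

-- Run one small, hermetic action (the equivalent of gcc -c for one target).
-- It may use the long-lived CompilerState, but it never reads the previous
-- output of the target it is building, so there is no build identity.
compileOne :: CompilerState -> String -> IO String
compileOne _ request = pure ("compiled: " <> request)

main :: IO ()
main = do
  hSetBuffering stdout LineBuffering
  st <- loadCompiler                -- startup cost paid once, amortized
  forever $ do
    request  <- getLine             -- one action request per line
    response <- compileOne st request
    putStrLn response
    hFlush stdout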
Some examples of persistent workers include:
Some related talks of interest:
- libuild for Clang/LLVM: not for Bazels, but discusses how a more clever non-batch-mode compiler could go much faster at C++
- Stop Building Dead Programs: an indictment of the practice of building batch systems in the modern era
Buck2 and Bazel
I would be remiss in the year 2025 to not mention the elephants in the room, the bazels: Rust Bazel (Facebook's Buck2), Go Bazel (a UK bank service provider's Please), and Java Bazel (Google's Bazel). Aside from the evident missing piece of the puzzle, Haskell Bazel (this is a joke), all three of these have very similar surface languages used by developers and all have relatively similar execution models.
Their evaluation/build models tend to look something like:
- Produce an unconfigured target graph out of BUILD files, which specify dependencies between targets.
- Resolve the platform to be built on/for: choose a platform to build for and resolve select() statements that implement late-bound changes (e.g. depending on libuuid on Linux and nothing (libc) on macOS).
- Run rule bodies, which amount to a visitor pattern or writer monad, to produce an action graph of commands to execute to run the build.
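A rough sketch of those three phases as data (the types and names are hypothetical, not Bazel's or Buck2's actual internals):

-- Hypothetical model of the unconfigured-target -> configured-target ->
-- action pipeline; not Bazel's or Buck2's real data structures.
type Label = String

data Platform = Linux | MacOS deriving (Eq, Show)

data UnconfiguredTarget = UnconfiguredTarget
  { label   :: Label
  , deps    :: [Label]                  -- unconditional dependencies
  , selects :: [(Platform, [Label])]    -- late-bound, platform-keyed dependencies
  }

data ConfiguredTarget = ConfiguredTarget
  { cLabel :: Label
  , cDeps  :: [Label]                   -- select()s resolved for one platform
  }

data Action = Action { outputs :: [FilePath], commandLine :: [String] }

-- Phase 2: resolve select() against the chosen platform.
configure :: Platform -> UnconfiguredTarget -> ConfiguredTarget
configure plat t =
  ConfiguredTarget (label t) (deps t ++ concat [ds | (p, ds) <- selects t, p == plat])

-- Phase 3: rule bodies emit the action graph for each configured target
-- (in the real systems this is a visitor/writer-monad step in rule code).
analyze :: ConfiguredTarget -> [Action]
analyze t = [Action ["out/" <> cLabel t <> ".o"] ["cc", "-c", cLabel t <> ".c"]]

-- e.g. configure Linux (UnconfiguredTarget "app" ["base"] [(Linux, ["libuuid"])])
--   depends on libuuid only in the Linux configuration.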
This has serious benefits to understandability and queryability: targets do have identity insofar as they have a name so they can be tracked over time.
This target graph is a coarse-grained graph that is very fast to compute and which doesn't lose much information out of the build files.
Thus there's rich support for buck query which can answer questions like reverse dependencies of a given target within milliseconds on medium-sized proprietary codebases.
Though targets have identity, the build system doesn't give any preference to any particular version of the codebase and thus has a similar caching model as Nix.
Nix, by comparison, only has an action graph: nix derivation show will show you a bunch of Bash, but if you want to find out what Nix file that Bash came from, you're largely out of luck.
There are design changes in discussion about what could be done about that in Lix, but they're far from implemented in production today.
Bazel and Buck2 (and likely others) implement the "Beta Build System" graph design mentioned in the Tup paper, rather than the "Alpha Build System" design of Make. This means that they rely on a file watcher to determine which files have changed, and can thus ignore the vast majority of the code in the repo that is cold and unrelated to what the user is doing: they traverse the graph to invalidate reverse dependencies of what the user changed, without having to look at other parts of it. This gives very fast interactive performance, especially in the case where the build files weren't touched, so the entire analysis phase can be skipped and the actions executed directly.
Another interesting concept in the Tup paper is the three rules: 1. Scalability, 2. Correctness and 3. Usability.
- The first means build systems must not grow linearly in evaluation time as projects grow (Nix fails this without being driven by an external system at evaluation time, violating Usability).
- The second means that you can check out a new version from Git and it will have a correct result (Nix gets this right, every non-hermetic build system in existence fails this with respect to system packages).
- The third means that no matter which mutations happen to the source tree, there is one build command that will do the right thing; that is, the build is not modal/stateful from the perspective of the user (Nix gets this right [by itself, at least; dev shells are a problem], every single-language build system fails this in a multi-language project).
A post-modern build system should satisfy all three.
All of these build systems and some more implement the same Remote Build Execution protocol as Bazel. This means that the exact same vendors can be used for executing builds on any of these systems, and, if they implemented Build Event Protocol, they could also use the same log analytics systems as Bazel.
Bazel is an Applicative build system with a static build graph and pretty strong restrictions on how builds can be structured. Buck2 is ... more complicated, but effectively is close to being a monadic Bazel-like build system (!).
What Buck2 has, namely dynamic dependencies, is somewhere in between Nix's dynamic derivations and import-from-derivation. Buck2 always has an evaluator driving the build (even when using remote execution), in the form of the Buck2 daemon, in a similar manner to Nix's IFD. However, it has a largely static target graph like the one produced by dynamic derivations, where the nodes that use dynamic dependencies are statically known at construction time; my understanding is that it forms placeholders that are then evaluated at some later point in the build graph.
Integrating the Bazels with other systems
As a distro packager, build systems like Bazel and Nix are an enormous nightmare to deal with because they don't compose well with "giving them dependencies from elsewhere". They all have very controlling personalities and want to control the entry point of the build as well as where dependencies come from. For a distro, that's non-negotiable: the entry point to package a given piece of software has to be the distro's build system lest packagers find themselves losing their minds with non-uniform ways to build packages (violating the Usability rule!). Integrating these hermetic build systems together is like writing a toxic yuri with characters that don't see eye to eye and all want to be the protagonist: you have to develop their enemies to lovers arc to have them want to kiss.
Buck2 is at least pragmatic and thus more amenable to yuri, assuming it is driving the build itself: it's possible to have non-hermetic rules, and even ones which don't delete the previous build before re-running it, provided that one accepts the risk from doing so. It doesn't, however, necessarily fix using Buck2 with anything else as the top-level build.
Good integrations of Buck2 and Nix are possible with Buck2 driving the build: the Nix store protocol supports importing content-addressed paths with dependencies, despite this not being exposed anywhere but in the internals. Thus, it's possible to import artifacts from such build systems into Nix and then use them in Nix builds and do things like producing NixOS or Docker images out of them. That discussion is for another time.
A 2025 update on large scale Haskell builds
In the aside above, I mention the various ultimately failed attempts to bolt incremental Haskell builds onto the side of Nix. The longer explanation of why Nix isn't workable is that it's not a good project build system and the codebase in question grew to the extent that incremental builds across machines within the project became a necessity.
The attempts at making the build faster in Nix were generally about making CI faster, but wouldn't do anything about build performance locally. Ultimately, the decision was made to switch approaches entirely and adopt Buck2. Here is a selection of talks related to the Mercury work on using Buck2 on a very large Haskell codebase:
- "Haskell Builds at Scale: Comparing Bazel and Buck2" by Andreas Herrmann: overview of the whole project from the build system perspective; why Bazel was infeasible; the use of monadic builds in Buck2 to eliminate false dependencies between the target dependency graph and the module dependency graph
- "Towards Dream Haskell Build Experience" by Ian Kim: more Haskell/GHC focused view on the Buck2 build
- "Integrating Nix and Buck2 for fun and profit" by Claudio Bley: the dependencies side of the Buck2 Nix integration
- "Building Haskell with Buck2" by Andreas Herrmann: similar as the BazelCon talk by Andreas, slightly different content, focused on the Haskell side
The Birth and Death of JavaScript
A stunningly prescient talk from 2014 predicting that all of computing will be eaten by JavaScript: operating systems become simple JavaScript runtimes, but not because anyone writes JavaScript, and JavaScript starts eating lower and lower level pieces. It predicts a future where JavaScript is the lingua franca compiler target that everything targets, eliminating "native code" as a going concern.
Of course, this didn't exactly happen, but it also didn't exactly not happen. WebAssembly is the JavaScript compiler target that is gathering nearly universal support among compiled languages, and it allows for much better sandboxing of what would have been native code in the past.
Houyhnhnms: horse computers?
There is a blog that inspired a lot of this thinking about build systems: Houyhnhnm computing, or, as I will shorten it, the horseblog. For those unfamiliar, the horseblog series is an excellent read and I highly recommend it, as an alternative vision of computing.
In short, horseblog computers have:
- Persistence by default, with the data store represented by algebraic data types with functions to migrate between versions, automatic backup/availability everywhere, and automatic versioning of all kinds of data.
- Treatment as computing systems, rather than computer systems. The system itself is much more batteries-included and well-thought-out.
- High abstraction level, and thus, the ability to transform programs to run in different contexts with different properties in terms of persistence, performance, availability, etc.
- Automatic, trivial sandboxing; running multiple versions of software concurrently is trivial.
- Hermetic build systems with determinism, reproducibility, input-addressing. They handle every size of program build, from building an OS to building a program. They deal gracefully with the golden rule of software distributions: having only one version at a time implies a globally coherent distribution.
A build system is a meta-level construct usable for all kinds of computation on the computing platform, not just compiling software.
Wait a minute, we went meta, oops! :)
Unison
Unison is a computing platform that is remarkably horseblog-brained. I don't know if there is any intentional connection, since given enough thought, these are fairly unsurprising conclusions. Unison is a programming platform containing:
- An editor
- A language that is pretty Haskell-like, which is pure by default, with integrated distributed computing support
- A version control system
- Content-addressed code handling: syntax trees are hashed, and compiled only once, correctly.
- Integrated persistence layer for any values including functions, with fully typed storage.
My take on Unison is that it is an extremely cool experiment, but it sadly requires a complete replacement of computing infrastructure at a level that doesn't seem likely. It won't solve our build system problems with other languages today.
rustc queries, salsa
Let's consider the Rust compiler as an example of a program that contains a build system that would need integration with a postmodern build system to improve incrementalism.
Note that I am not sure if this information is up to date on exactly what rustc is doing, and it is provided for illustration.
rustc contains a query system, in which computations such as "compile crate" are composed of smaller steps such as "parse sources", "convert function f to intermediate representation", and "build LLVM code for this unit". These steps are persisted to disk to provide incremental compilation.
In order to run a computation whose inputs may have changed, the subcomputations are computed recursively. Either:
- The computation has all green inputs, so it's marked green.
- The computation has different inputs than last time, so it's marked grey, evaluated, and then marked green if it's the same result, and red if it's a different result.
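A minimal sketch of that red/green revalidation idea (illustrative only; not rustc's or salsa's actual data structures or API) might look like this:

-- Illustrative red/green revalidation sketch; not rustc's or salsa's real API.
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type QueryKey = String

data Colour = Green | Red deriving (Eq, Show)

-- What the previous compilation session recorded for each query:
-- the dependencies it read and the result it produced.
data CachedEntry = CachedEntry { depsOf :: [QueryKey], resultOf :: String }

type Cache = Map QueryKey CachedEntry

-- `run` executes a query against the current inputs and returns its result
-- (recording of newly read dependencies is elided from this sketch).
revalidate :: (QueryKey -> String) -> Cache -> QueryKey -> (Colour, String)
revalidate run cache key =
  case Map.lookup key cache of
    -- Never computed before: run it; downstream consumers must re-run too.
    Nothing -> (Red, run key)
    Just entry ->
      let depColours = [fst (revalidate run cache d) | d <- depsOf entry]
      in if all (== Green) depColours
           -- All inputs are green: reuse the cached result without running.
           then (Green, resultOf entry)
           else
             -- Some input changed: re-run and compare with the old result.
             let newResult = run key
             in if newResult == resultOf entry
                  then (Green, newResult)  -- same output: downstream stays green
                  else (Red, newResult)    -- different output: downstream re-runs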
However, this is all done internally in the compiler. That renders it untrustworthy and a trustworthy incremental build system will ignore all of it and rebuild the whole thing if there is a single byte of change.
In a postmodern build system with trustworthy incremental builds, such a smaller build system inside a compiler would have to be integrated into a larger one to get incremental builds safely. We can imagine this being achieved with a generic memoization infrastructure integrated into programming languages, which makes computations purely functional and allows for their execution strategies to be changed, for example, by delegating to an external build system.
Concrete ideas
There are multiple levels on which one can implement a postmodern build system.
To reiterate, our constraints are:
- Don't completely rewrite inner build systems. If something can be jammed into the seams in the program to inject memoization, that is viable and worthwhile.
This eliminates Unison.
- Provide generic memoization at a less-than-1-process level.
- Ensure sufficient determinism to make incremental builds trustworthy.
- Make computations naturally distributed, utilizing modern parallel computing systems.
Make an existing architecture and OS deterministic
This is an extremely unhinged approach that is likely fraught with evil. However, it is not impossible. In fact, Ripple by edef and others is an experiment to implement this exact thing. There is also plenty of prior art of doing vaguely similar things:
rr has made a large set of programs on Linux deterministic on x86_64 and ARM64. There is also prior art in fuzzing, for example Orange Slice by Brandon Falk, which aims to create a fully deterministic hypervisor for x86. On Linux, CRIU can be used to send processes to other systems among other things, and might serve as a primitive to move parts of program executions into other processes in a deterministic way.
In some senses, the problem of deterministic execution of functions in native applications is isomorphic to the problem of snapshot fuzzing, in which some large system has a snapshot taken, input is injected via a harness, and then the program is run to some completion point, with instrumented memory accesses or other tooling. Once the run completes, the system is rapidly restored to its previous state, for example, by leveraging page tables to restore exactly the memory that was touched to its original state.
If it were sufficiently well developed, this approach could conceivably be used to convert existing programs to use distributed computation with minimal changes inside the programs.
However, a lot of the program state in a native process is shaped in a way that may be deterministic but is path-dependent, for example, allocator state, and other things of the sort, leading to pointers changing with each run with slightly different inputs. It seems rather unlikely that, without ensuring a known starting point, a program could usefully be memoized in such a way that it matches past executions.
WASM everything
WebAssembly is likely a more viable approach today but would possibly require more invasive changes to programs to use the build system.
Concretely, it would be possible to make an execution engine that takes the tuple of (Code, Inputs) and purely executes it while allowing for monadic dependencies. This would look something like this in pseudo-Haskell:
execute :: MonadBuild m => (Code, InputBlob) -> m Output
In fact, to some extent, Cloud Haskell did approximately this; however, it didn't necessarily implement it with an eye to being a build system, and thus doesn't do memoization or other things that would be desirable. Though one could probably use Shake to implement that.
Porting an existing compiler to this design would be a matter of abstracting it over the actual execution mechanism, doing a lot of build engineering to add the necessary entry points to make each query its own WASM executable, making sure that all the necessary state actually goes into the query, and then finally implementing a back-end for the execution mechanism that attaches to the postmodern build system.
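To flesh that signature out slightly (the MonadBuild class and the newtypes are hypothetical, sketched only to show where memoization and monadic dependencies would hook in; they are not an existing library):

-- Hypothetical elaboration of the execute signature above.
newtype Code      = Code FilePath        -- a content-addressed WASM module
newtype InputBlob = InputBlob [FilePath] -- the inputs exposed to the execution
newtype Output    = Output FilePath      -- the (content-addressed) result

class Monad m => MonadBuild m where
  -- Purely execute a WASM module against an input blob. The build system is
  -- free to satisfy this from a memoization cache keyed by hashes of both
  -- arguments, or to ship the work to another machine.
  execute :: (Code, InputBlob) -> m Output

-- Monadic dependencies fall out of the Monad constraint: the inputs of the
-- second step depend on the result of the first, so the plan is dynamic.
compileModule :: MonadBuild m => Code -> Code -> InputBlob -> m Output
compileModule parser codegen src = do
  parsed <- execute (parser, src)
  let Output parsedPath = parsed
  execute (codegen, InputBlob [parsedPath])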
Nix builds but make them monadic
What if the executor for Nix builds or similar systems memoized compiler
executions by transparently translating compiler invocations to Recursive Nix
or similar monadic builds? This could be achieved many ways, but is essentially
equivalent to using ccache or similar tools, only with better sandboxing.
Similarly, tools like Bazel could be relatively easily converted to use recursive Nix or similar trustworthy execve memoization tools in place of their existing sandboxes. This would avoid the problem of trustworthy build tools not composing: the outer tool has to be the trusted computing base, and putting another build system inside it requires that the inner one either be circumvented (turning the sandbox off, etc.) or integrated with, in order to preserve incremental builds.
This "solution", however, only solves the issue up to one execve call: it
can't improve the state of affairs within a process, since Nix only has the
granularity of one execve.
Conclusion
We waste immense amounts of compute and time on rebuilding the same build products, and hell, build systems, over and over again pointlessly. This is wasteful of the resources of the earth as well as our time on it, in multiple respects, and we can do better. We have the ideas, we even have the technology. It just has to get shipped.
Probably the most practical solution to getting a fast multi-language incremental build system in this day and age is one of the Bazels, using dependencies from Nix to not have to Become a Distro. Persistent workers, though unsatisfying from the perspective of not being 100% hermetic, are an effective way to get a substantial amount of the benefit of a hermetic build system with incremental compilation by compensating for much of the computational overhead in making targets smaller.
Many of the ideas in this post came from edef, who you should sponsor if you want to support her giving people these ideas or developing them herself.