I delivered a talk "Debugging Closure Size Graphically" at NixCon 2022. This blog post is intended as a textual version of the talk although it is not a script.

Here are links to the live version:

Closure size

Nix's smallest unit of building is a so-called "derivation". Generally there is one of these per package or per part of a NixOS system. The derivation is a concept at the layer of the Nix store.

Xe has said that Nix is a compiler from Haskell to Bash, which is quite a good, if flippant, explanation of how the Nix language relates to the build system: the "Haskell" part is the Nix language, and the "Bash" part is the full invocation of some builder (typically written in bash) to actually build the thing, produced by builtins.derivation.

You can see the content of a derivation created by a Nix expression like so:

» nix show-derivation nixpkgs#hello
{
  "/nix/store/vdqm71n5bkfydjwjwmx2rkj18jrpxlmx-hello-2.12.1.drv": {
    "outputs": {
      "out": {
        "path": "/nix/store/m2wybrc3grfn76hxcv4gyiyncng17z30-hello-2.12.1"
      }
    },
    "inputSrcs": [
      "/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh"
    ],
    "inputDrvs": {
      "/nix/store/004b0bvpjng4l23kahn6vzawlpr6dx75-bash-5.1-p16.drv": [
        "out"
      ],
      "/nix/store/gajvqygc937wsfjdkpfpqrrk3mlpyy6z-stdenv-linux.drv": [
        "out"
      ],
      "/nix/store/j80mkmlf7xv7cqn3chh7ghi1r96406m2-hello-2.12.1.tar.gz.drv": [
        "out"
      ]
    },
    "system": "x86_64-linux",
    "builder": "/nix/store/dd3713mm8wql4r2d5jxx0f58g16nfm4h-bash-5.1-p16/bin/bash",
    "args": [
      "-e",
      "/nix/store/9krlzvny65gdc8s7kpb6lkx8cd02c25b-default-builder.sh"
    ],
    "env": {
      "buildInputs": "",
      "builder": "/nix/store/dd3713mm8wql4r2d5jxx0f58g16nfm4h-bash-5.1-p16/bin/bash",
      "cmakeFlags": "",
      "configureFlags": "",
      "depsBuildBuild": "",
      "depsBuildBuildPropagated": "",
      "depsBuildTarget": "",
      "depsBuildTargetPropagated": "",
      "depsHostHost": "",
      "depsHostHostPropagated": "",
      "depsTargetTarget": "",
      "depsTargetTargetPropagated": "",
      "doCheck": "1",
      "doInstallCheck": "",
      "mesonFlags": "",
      "name": "hello-2.12.1",
      "nativeBuildInputs": "",
      "out": "/nix/store/m2wybrc3grfn76hxcv4gyiyncng17z30-hello-2.12.1",
      "outputs": "out",
      "patches": "",
      "pname": "hello",
      "propagatedBuildInputs": "",
      "propagatedNativeBuildInputs": "",
      "src": "/nix/store/pa10z4ngm0g83kx9mssrqzz30s84vq7k-hello-2.12.1.tar.gz",
      "stdenv": "/nix/store/vsw40d3nb6mpjf770iv58czisf65lzzl-stdenv-linux",
      "strictDeps": "",
      "system": "x86_64-linux",
      "version": "2.12.1"
    }
  }
}

The "closure" of some Nix store path, for instance of the "out" output of the "hello" derivation above, is that store path together with all the store paths it references, all the store paths those references refer to, and so on recursively. If you would like a mathy definition, see Eelco's PhD thesis, page 96.

Put another way, the closure is "your thing and everything it depends on, recursively".
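As a rough illustration of that definition, here is a minimal Python sketch that computes a closure as the set of everything reachable by following references. The graph is a made-up toy (the names are shortened stand-ins, not real store paths), and this is not how Nix itself is implemented.

```python
from collections import deque


def closure(root, references):
    """Return root plus everything reachable from it by following references."""
    seen = {root}
    queue = deque([root])
    while queue:
        path = queue.popleft()
        for ref in references.get(path, ()):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen


# Hypothetical reference graph: each key references the paths in its list.
references = {
    "myapp": ["libfoo", "bash"],
    "libfoo": ["glibc"],
    "bash": ["glibc"],
    "glibc": [],
    "unrelated": ["glibc"],  # not reachable from myapp, so not in its closure
}

print(sorted(closure("myapp", references)))
```

Note that "unrelated" stays out of the result even though it shares a dependency with myapp: a closure only follows references outwards from the root.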

Runtime dependencies are discovered automatically in Nix: after a build, Nix effectively greps the outputs for the hash part of the store path of each input of the derivation, and any input whose hash appears becomes a runtime dependency.

This is unlike most other systems, which require explicit specification of runtime dependencies, potentially allowing forgetting some (which will work on the maintainer's machine that has the software but not on a fresh install). Nix has the opposite problem: it's easy to accidentally create runtime dependencies, since any reference to the build inputs in the outputs can create one.
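The scanning idea can be sketched in a few lines of Python. This is a simplification under stated assumptions: the function names are invented for illustration, the output is modelled as a single byte string rather than a whole directory tree, and the regular expression is looser than the real store path alphabet. The two store paths are the bash and stdenv paths from the derivation shown earlier.

```python
import re

# A store path looks like /nix/store/<32-character hash>-<name>.
STORE_PATH = re.compile(r"/nix/store/([0-9a-z]{32})-")


def hash_part(store_path):
    """Extract the hash part of a store path."""
    match = STORE_PATH.match(store_path)
    if match is None:
        raise ValueError("not a store path: " + store_path)
    return match.group(1)


def scan_references(output_bytes, input_paths):
    """Return the inputs whose hash part occurs anywhere in the output."""
    return [p for p in input_paths if hash_part(p).encode() in output_bytes]


bash = "/nix/store/dd3713mm8wql4r2d5jxx0f58g16nfm4h-bash-5.1-p16"
stdenv = "/nix/store/vsw40d3nb6mpjf770iv58czisf65lzzl-stdenv-linux"

# Hypothetical build output: a script whose shebang points at bash.
output = b"#!" + bash.encode() + b"/bin/bash\necho hello\n"

print(scan_references(output, [bash, stdenv]))
```

Only bash is reported, since the stdenv hash never appears in the output. This also shows why accidental dependencies are so easy to create: a path mentioned in a comment or a log message counts just as much as a shebang.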

Closure sizes largely don't matter if you are building on the same system that you are running the software on, besides leaving some stuff around that the garbage collector can't delete due to pointless references. However, the entire closure needs to be shipped around whenever you are deploying Nix-built software to another machine, using up space (prod generally has less space than your laptop!) and bandwidth.

Spaghetti monsters

Let the spaghetti monster be a stand-in for your choice of large thing; perhaps a compiler, a build system, some interpreter you don't actually need at runtime, or anything else.

If the spaghetti monster is in buildInputs of your derivation:

silly drawing of mkDerivation with a build input of a drawing of a spaghetti monster

... and its store path appears in the output:

continuation of the previous image, with a buildPhase echoing spaghetti monster into $out

then the derivation will depend on the spaghetti monster at runtime, and thus you will be transporting a big spaghetti monster along with your software, anywhere it goes (such as in a Docker image or a NixOS deployment):

little container ship with a container mostly containing spaghetti monster and insignificantly containing myapp

Figuring out what your software depends on

You can use nix path-info to list the dependencies of a store path recursively (-r), with sizes (-s) and closure sizes (-S), in human-readable form (-h).

~ » nix path-info -rsSh nixpkgs#hello
/nix/store/c8aj6kzv5h7c5vympiy7l1p3aw626yqy-libobjc-11.0.0                               199.7K  199.7K
/nix/store/y5cp9q9h6p80mzbsijs00zxpi7g0lc9d-apple-framework-CoreFoundation-11.0.0        667.1K  866.8K
/nix/store/xbqj64vdr3z13nlf8cvl1lf5lxa16mha-hello-2.12.1                                 126.9K  993.7K

This flat list is not very helpful, however: the dependencies actually form a graph, and this view does not make it obvious what depends on what.

Nix can actually output GraphViz with nix-store --query --graph, but the result doesn't show sizes, so it is of little help here.
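To make the two size columns in the nix path-info output above concrete, here is a small Python sketch that recomputes the closure size column from the per-path sizes. The sizes are the ones printed above; the edges are an assumption (the flat listing does not show them), chosen so that the sums come out the same.

```python
def closure(root, references):
    """Return root plus everything reachable from it by following references."""
    seen = set()
    stack = [root]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references[path])
    return seen


# Per-path sizes in K, taken from the first column of the listing above.
nar_size_k = {"libobjc": 199.7, "CoreFoundation": 667.1, "hello": 126.9}

# Assumed edges: not visible in the flat listing, but consistent with it.
references = {
    "libobjc": [],
    "CoreFoundation": ["libobjc"],
    "hello": ["CoreFoundation"],
}


def closure_size_k(root):
    """Sum the sizes of every path in the closure of root."""
    return round(sum(nar_size_k[p] for p in closure(root, references)), 1)


for name in nar_size_k:
    print(name, nar_size_k[name], closure_size_k(name))
```

Under those assumed edges, the closure size of hello is its own 126.9K plus 667.1K and 199.7K for its dependencies, which is the 993.7K shown in the second column.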

Just render a graph then!

I wrote a program in jq (everyone's favourite language which, like Nix, is also allegedly Haskell) that generates good-quality GraphViz output. It can be invoked with nix-closure-graph nixpkgs#python3 > img.svg.
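For a sense of what such a generator has to do, here is a Python sketch of the same idea. It is not the jq program itself, and it makes assumptions: the input is a list of records with path, narSize and references fields, roughly the shape nix path-info --json has emitted in some Nix versions (others differ), and the byte sizes and edges in the sample data are made up to match the earlier listing.

```python
def short_name(store_path):
    """Drop the /nix/store/ prefix, the 32-character hash and the dash."""
    return store_path.split("/")[-1][33:]


def human(size):
    """Format a byte count with a unit suffix, like the -h flag does."""
    for unit in ("B", "K", "M", "G"):
        if size < 1024 or unit == "G":
            return "%.1f%s" % (size, unit)
        size /= 1024


def to_dot(records):
    """Render records as a GraphViz digraph with sizes in the node labels."""
    lines = ["digraph closure {"]
    for rec in records:
        path = rec["path"]
        # "\\n" puts a literal \n in the label, which GraphViz renders as a line break.
        label = short_name(path) + "\\n" + human(rec["narSize"])
        lines.append('  "%s" [label="%s"];' % (path, label))
        for ref in rec["references"]:
            if ref != path:  # store paths may reference themselves; skip those
                lines.append('  "%s" -> "%s";' % (path, ref))
    lines.append("}")
    return "\n".join(lines)


OBJC = "/nix/store/c8aj6kzv5h7c5vympiy7l1p3aw626yqy-libobjc-11.0.0"
CF = "/nix/store/y5cp9q9h6p80mzbsijs00zxpi7g0lc9d-apple-framework-CoreFoundation-11.0.0"
HELLO = "/nix/store/xbqj64vdr3z13nlf8cvl1lf5lxa16mha-hello-2.12.1"

# Illustrative records: byte sizes and edges are assumptions, not real output.
records = [
    {"path": OBJC, "narSize": 204493, "references": []},
    {"path": CF, "narSize": 683110, "references": [OBJC]},
    {"path": HELLO, "narSize": 129946, "references": [CF, HELLO]},
]

print(to_dot(records))
```

The result can be piped into dot -Tsvg to get an image. Putting the size in each label is what makes the graph useful for hunting down large dependencies.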

This works pretty well for Python (open the image in a new tab if you want to look at it more closely):

graphviz graph showing nodes for each of the packages Python depends on

But not so for NixOS closures:

incomprehensible graphviz graph absolutely covered in lines for a nixos image

Just render a graph, but not with GraphViz

Writing a better graph viewer sounds hard, but it turns out Sigma.js already did the hard part of writing a graph renderer that does well with big graphs, and I had already built a Sigma-based graph viewer as a prototype at work as part of my internship project.

I believe that it's not practical to show large graphs non-interactively. GraphViz embodies the results of several papers on graph layout, yet it is slow and still produces poor output for large graphs. It follows that the only practical option is to do something worse but faster for layout, and to make up for it with interactivity. Sigma seems to be one of the better options for building a viewer that does that.

Looking Glass, my graph viewer, consists of a simple JSON format (intended to be simple enough that you can generate it directly out of PostgreSQL) and a properties panel that lists arbitrary data attached to nodes.

It's open source under the MIT license, if you want to hack on it.

Looking glass showing an extremely busy, yet somewhat usable view of a NixOS closure

To use it, run a command like nix-closure-graph --lg nixpkgs#python3 | xsel -b (pick your store path and clipboard tool of choice), then paste the JSON into the box in the viewer.

Here is the JSON from my NixOS closure, for example

Why is GHC in my NixOS closure?

This is possibly a known issue, or possibly fixed in newer GHC, or possibly me forgetting to use pkgs.haskell.lib.justStaticExecutables. Regardless, it's not on purpose.

Practical closure size problems

I was packaging actual-budget, a budgeting program, to run it on fly.io, so I figured out how to build a Docker image for it with Nix. You can get the final version here.

However, I had a problem:

 » ls -lah $(readlink result)
-r--r--r-- 1 root root 219M Dec 31  1969 /nix/store/6r0nslg23w2sa6a2zril3g0fvd6mnp1q-actual-server.tar.gz

200MB compressed is not good. It's just some JavaScript, what happened there?

screenshot of the graph viewer showing that actual-server depends on actual-server-modules, a store path of suspiciously similar size

It appears that for some reason actual-server and actual-server-modules are both there, which is not quite right.

Also, it appears that Python got in there somehow, which is not necessary:

screenshot of the graph viewer showing that nodejs depends on python

What went wrong?

Python is in there because of an exceedingly poorly named legacy build system inherited from Chromium called "Generate Your Projects", which compiles weird Python into Makefiles.

Unfortunately Actual uses some native extensions, but after building those with "Generate Your Projects", there's no reason at all to have Python around.

It turns out, nixpkgs ships a version of NodeJS that does not include Python or npm, called nodejs-slim.

The app is there twice because of a symlink:

» nix why-depends -a --precise \
    /nix/store/qq1zblyp6ysx-actual-server \
    /nix/store/i778ag8s7jf8-actual-sync-modules-1.0.1

/nix/store/qq1zblyp6ysx96vd8havhw0wcsihcidf-actual-server
└───libexec/actual-sync/deps/actual-sync/node_modules
        -> /nix/store/i778ag8s7jf8b-actual-sync-mo>
    → /nix/store/i778ag8s7jf8-actual-sync-modules-1.0.1

However, if we look at what's actually in that symlink, we will find nothing necessary at all.

» ls -a result/libexec/actual-sync/deps/actual-sync/node_modules
.  ..  .bin
 » ls -a result/libexec/actual-sync/deps/actual-sync/node_modules/.bin
.  ..  eslint  prettier  tsc  tsserver  uuid
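Conceptually, the nix why-depends query above is a path search through the reference graph. The Python sketch below finds a shortest chain of references between two paths with a breadth-first search. It is only an illustration: the graph is a hypothetical toy loosely modelled on this case, and it leaves out what --precise adds, namely the file inside each path that holds the reference.

```python
from collections import deque


def why_depends(start, target, references):
    """Return a shortest chain of references from start to target, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        if path == target:
            # Walk the parent links back to the start to rebuild the chain.
            chain = []
            while path is not None:
                chain.append(path)
                path = parents[path]
            return chain[::-1]
        for ref in references.get(path, ()):
            if ref not in parents:
                parents[ref] = path
                queue.append(ref)
    return None


# Hypothetical toy graph; names are shortened stand-ins for store paths.
references = {
    "actual-server": ["actual-sync-modules", "nodejs"],
    "actual-sync-modules": ["nodejs"],
    "nodejs": ["python3"],
    "python3": [],
}

print(" -> ".join(why_depends("actual-server", "python3", references)))
```

A single chain is enough to explain why something is in the closure, but removing that one reference will not help if another chain still leads to the same path, which is what the -a flag is for.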

Scooby Doo 'let's see who this is anyway' meme, unmasking 'closure size' to be 'totally pointless stuff'

Fixing it

The solution to Python being there is to remove all references to nodejs and replace the relevant ones with nodejs-slim. This was easiest to do by just doing a silly find command to delete the ones in unimportant scripts, then manually patching the shebang of the entry point script using sed.

The app being included twice was fixed by deleting the symlink and replacing it with a new symlink to the right relative path (since apparently it was not only a closure size problem but also wrong).

It's a good idea to put the package you want to exclude in disallowedRequisites or disallowedReferences so that Nix will fail the build if it appears again.

All in all it looks like the following: (original source)

prev.mkYarnPackage rec {
  # ...
  # we don't need to have the full-fat nodejs with python
  # (for gyp) and so on except to build. It is undesirably
  # referenced by binaries in dependencies, and also would be
  # patchShebang'd into bin/actual-server as well if we
  # didn't disable that and do it manually.
  dontPatchShebangs = true;
  extraBuildInputs = [ final.removeReferencesTo ];
  disallowedRequisites = [ final.nodejs-16_x ];
  distPhase = ''
    # redundant symlink that introduces a 150mb runtime dep
    # on the actual-server-modules derivation
    rm $out/libexec/actual-sync/deps/actual-sync/node_modules
    # .. and replace it with a relative symlink inside the
    # package so the server can find its web files
    ln -s $out/libexec/actual-sync/node_modules $out/libexec/actual-sync/deps/actual-sync/node_modules
    # manually patch the shebang of actual-server to use nodejs-slim
    sed -i '1c #!${final.nodejs-slim-16_x}/bin/node' "$(readlink -f "$out/bin/actual-server")"
    # break unnecessary dependency binaries
    find "$out" -type f -exec remove-references-to -t ${final.nodejs-16_x} '{}' +
  '';
}

Conclusion

Don't ship around spaghetti monsters if it can be avoided: closure size matters whenever you are shipping around Nix build output.

Most often, closure size is down to accidentally depending on build tools and other packages, and it can be solved case-by-case by adding fixup scripts to remove pointless references.