Thoughts on Digital Preservation and Improving the Software Development Process with Nix

Liam Lloyd

April 17, 2023

I recently had the privilege of attending the Southern California Linux Expo, where I sat in on a variety of talks covering PostgreSQL, Kubernetes, the intersection between open source and public policy, protecting yourself from surveillance on the Internet, and more. One talk that particularly captured my interest was on a package manager called Nix. Nix takes a novel approach to package management that has some relevance to digital preservation, because it provides a relatively simple means of installing outdated software on a modern computer. It can also help avoid certain common problems in the software development process.

Bringing Software Back From the Dead

Software ages very quickly; a program only ten years old can feel ancient. Running such an aged program can be quite difficult. Suppose it relies on another piece of software that has been under continuous development for the past ten years. The modern version of that dependency likely won’t be compatible with this ten-year-old program, but if you install the version of the dependency from ten years ago, it might break every other program on your computer that relies on that dependency. Nix avoids this problem by considering the dependencies of a program to be part of that program, and irrelevant to every other program. That means if you install a decade-old program with Nix, its decade-old dependencies will only be used by that one program, and the rest of your system will be unaffected. The National Archive has taken advantage of this feature of Nix to install software from 2013 on a modern machine in order to recover data from a very old database. If in decades to come Permanent finds itself the custodian of files in formats that the browsers of the time can’t interact with, Nix could help us keep those files accessible to our users (though we also proactively mitigate the risk of this happening by storing copies of uploads in formats we expect to be long-lasting).

How Nix Works

Alt text: Freeway exit meme representing this post veering sharply into software jargon

Nix applies ideas from functional programming to package management, most prominently the ideas of pure functions and immutability. In programming, a function is pure if it always produces the same output when given the same inputs and does nothing else besides produce that output. Nix installs packages the same way.

Inputs:

  • What package to install;
  • Where the source code for that package can be found;
  • What dependencies the package requires.

Outputs:

  • A package built from these inputs, installed at a unique path including a hash based on the inputs used.
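
To make the inputs-to-output idea concrete, here is a minimal Python sketch of a pure function that turns build inputs into an install path. It is only an illustration of the principle: the function name store_path, the /toy-store prefix, the truncated SHA-256 digest, and the example.org packages are all my own assumptions, and the hashing scheme Nix really uses is more involved.

```python
import hashlib
import json


def store_path(name, version, source_url, dependencies):
    """Return a toy install path derived only from the build inputs.

    Hypothetical scheme for illustration: the /toy-store prefix and the
    12-character digest are made up and are not how Nix computes its paths.
    """
    inputs = {
        "name": name,
        "version": version,
        "source": source_url,
        # Sort so that listing the same dependencies in a different order
        # still counts as the same set of inputs.
        "dependencies": sorted(dependencies),
    }
    serialized = json.dumps(inputs, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(serialized).hexdigest()
    return f"/toy-store/{digest[:12]}-{name}-{version}"


# Two versions of the same (hypothetical) library get two different paths,
# so they can sit side by side without overwriting each other.
old_lib = store_path("libexample", "1.0", "https://example.org/libexample-1.0.tar.gz", [])
new_lib = store_path("libexample", "2.0", "https://example.org/libexample-2.0.tar.gz", [])

# Each program names the exact dependency path it was built against, so the
# decade-old program and the modern one never share a dependency by accident.
old_app = store_path("oldapp", "0.9", "https://example.org/oldapp-0.9.tar.gz", [old_lib])
new_app = store_path("newapp", "3.1", "https://example.org/newapp-3.1.tar.gz", [new_lib])
```

Calling store_path twice with the same arguments always gives the same path, and changing any input, including a dependency, gives a different one.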

This approach isolates the installation from all your other packages, allowing your packages to rely on different versions of their dependencies. Immutability comes as a consequence of pure functions; if operations can only take in values and produce new values, they can’t change existing values. In Nix, this means updating a package will actually just install a new, updated copy of that package. Since the old package will still be there, you can easily roll back if an update causes problems. Complete immutability would mean wasting a lot of disk space though, so once you’re confident you don’t need to roll back, you can run garbage collection to remove packages that you’re no longer using.
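
The update, rollback, and garbage collection behavior described above can be sketched with a toy model in Python. This is a simplification under my own assumptions, not a description of Nix internals: the ToyStore class, its method names, and the example paths are all hypothetical.

```python
class ToyStore:
    """Toy model of an immutable package store (hypothetical, not Nix's design)."""

    def __init__(self):
        self.paths = set()        # every installed path not yet garbage collected
        self.generations = [{}]   # each generation maps package name -> path
        self.current = 0          # index of the generation currently in use

    def install(self, name, path):
        # Installing or updating never touches existing paths. It adds a new
        # path and records a new generation that points at it.
        self.paths.add(path)
        new_generation = dict(self.generations[self.current])
        new_generation[name] = path
        self.generations = self.generations[: self.current + 1] + [new_generation]
        self.current += 1

    def rollback(self):
        # Rolling back just switches to the previous generation.
        if self.current > 0:
            self.current -= 1

    def active(self):
        return self.generations[self.current]

    def collect_garbage(self):
        # Forget every generation except the current one, then delete any
        # path that the current generation no longer refers to.
        self.generations = [self.generations[self.current]]
        self.current = 0
        live = set(self.generations[0].values())
        removed = self.paths - live
        self.paths = live
        return removed


store = ToyStore()
store.install("editor", "/toy-store/aaa-editor-1.0")
store.install("editor", "/toy-store/bbb-editor-2.0")  # the "update"
store.rollback()                                      # 2.0 caused problems
removed = store.collect_garbage()                     # reclaim the disk space
```

After the rollback, both copies of the editor are still in the store and version 1.0 is the active one. Only the garbage collection step actually deletes the unused 2.0 copy.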

Nix in the Software Development Process

The great benefit that Nix derives from this functional behavior is reproducibility. Developers using Nix should never find themselves saying “well, it worked on my machine”, because Nix renders the state of the machine irrelevant (outside of the hardware; if your coworker is using a Raspberry Pi you might still have to talk about things working on specific machines). Of course, developers have long known that it’s a problem if things work on your machine but not your colleague’s, so we do have other tools to solve such problems, but Nix offers some additional benefits.

One of the most common reasons code might run as intended on one developer’s machine but not another’s is that they have different versions of the programming language installed, perhaps because one developer had been working on another project that used a different version of the language. Most languages have tools to make switching between versions easy (nvm for Node, pyenv for Python, g for Go, etc.), but Nix solves the problem for all languages by itself.

Docker can claim the same thing, but here too Nix has advantages. While Docker allows you to pin versions, it doesn’t require you to, so even when you pin Node and Python your build might break because you forgot to pin npm or the container’s base image. Nix pins versions for you by its very nature. Also, bringing up a containerized environment isn’t exactly fast, whereas Nix “environments” just consist of dependencies that you’ve installed in an isolated fashion.

Finally, Nix can protect us from the rarer but more infuriating class of problems that can crop up on Linux systems, where programs interfere with each other in ways that most users, I assume, would never expect. Permanent is a bring-your-own-device organization, and when I was starting out here my installation of PHP somehow broke my installation of Lutris (a tool for running Windows games on Linux). If I’d used Nix to set up my development environment this would not have happened.

What Else?

I have not covered all the possibilities Nix offers here. You can specify Docker images with the Nix domain-specific language instead of Dockerfiles, and there is also an operating system (NixOS) built around Nix that applies its functional properties not just to your packages but also to your system configuration. It likely has capabilities I haven’t even heard about yet, which I look forward to discovering as I continue my investigations into this exciting piece of software.