The Magic Nix Cache, a GitHub Action for speeding up your Nix workflows (determinate.systems)
195 points by biggestlou 10 months ago | hide | past | favorite | 68 comments



Graham Christensen here, cofounder of DetSys. Happy to answer any questions! The Magic Nix Cache has been a huge boon to us internally, and we're really excited to share it with the world today.
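For anyone curious what adoption looks like, a minimal workflow sketch (action names as of this writing — check the repo README for current versions and pins):

```yaml
name: CI

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: DeterminateSystems/nix-installer-action@main
      - uses: DeterminateSystems/magic-nix-cache-action@main
      # Everything Nix builds below is saved to and restored from the
      # GitHub Actions cache automatically; no cache keys to manage.
      - run: nix build .
```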


As someone who has led a ~2yr adoption of Nix in an enterprise scenario, I'm really glad for more projects that help with bridging the gap between Nix and containers/k8s.

You can get fast startups if you're willing to define your containers upfront (dockerTools, nix2container) and/or adopt a dynamic container-server (nixery, flakehub). And you can get reasonably fast substitutions if your binary cache is on MinIO in the same cluster as the workers.

But I feel like there's still room for a "magic" /nix/store that skips the copying and decompression stage altogether— something that works using standard nix invocations (like Magic Nix Cache), but presents itself as a Kubernetes Volume, so that in cases where a path already exists on-node, the existing files (in the cache pod) are simply mounted/served directly into whatever container ran a nix command.

I don't feel like I really know enough about either k8s or nix to assess the practicality of such a thing, but the thought of lightning-fast substitutions for arbitrary Nix workflows is massively appealing.


It is great to hear about your work in the space!

Several years ago I looked at implementing a custom component for k8s which would exchange nix store paths instead of containers, substitute, and bind mount them in at run time. It was an interesting experiment, but was Quite Difficult to pull off for someone who wasn't already familiar with k8s.

I've seen some projects similar to what you're describing though: the lightning-fast substitutions. It was incredible! They had the benefit of a fabulously fat network connection, though, and I'm not sure the experience translates very well. We will see!


Thanks! The domain is robotics, and we gave a brief conference talk last fall about our experiences so far, if you're interested— although it's mostly directed at a Nix-unaware audience, you can get a glimpse of our workflow that's heavily oriented around hourly builds delivered as Nix flake tags: https://vimeo.com/767139940

Definitely there's an obviousness to the concept of "magic" Nix stores in various spaces, and I know the tvix project seeks to realize some of this as well— reusing OCI tools to supply the build sandbox, leveraging existing container orchestrators for job management and queuing, all those goodies. So I'm excited to keep watching the space.


> But I feel like there's still room for a "magic" /nix/store that skips the copying and decompression stage altogether— something that works using standard nix invocations (like Magic Nix Cache), but presents itself as a Kubernetes Volume, so that in cases where a path already exists on-node, the existing files (in the cache pod) are simply mounted/served directly into whatever container ran a nix command.

Mostly a digression since it's not Nix, but I've wondered a bit about this sort of thing when building AUR packages on Arch Linux. The very last step of the builder is to compress the package, which in the vast majority of my uses is followed immediately by installing it, which of course decompresses the package. I've wondered why there isn't some (non-default) option to say "I don't need to keep the package itself around; just install it as soon as it's built". I'm sure for my specific use case there's a simpler solution, but I've always wondered if there's a hacky way to get around things more generally by making a tool that mimics the expected compression API but then creates a "fake" compressed artifact that no-ops (or maybe puts in a valid header followed by non-compressed data) and then injects that implementation into the PATH. You'd be able to invoke it with something like `fakecompress --zstd makepkg -si`, and it would invoke `makepkg -si` with the no-op zstd implementation.
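To make the idea concrete, here's a rough shell sketch of such a shim (the `fakecompress` tool and its behaviour are hypothetical; note the output is not a valid zstd frame, so anything that later tries to actually decompress it would need the same shim on its PATH):

```shell
#!/bin/sh
# Hypothetical "fakecompress" sketch: put a no-op `zstd` shim first
# on PATH so that tools which shell out to zstd pass data through
# unmodified instead of compressing it.
shimdir=$(mktemp -d)

cat > "$shimdir/zstd" <<'EOF'
#!/bin/sh
# Ignore all flags; copy stdin to stdout unchanged.
exec cat
EOF
chmod +x "$shimdir/zstd"

# Any command launched with this PATH sees the shim, e.g.:
#   PATH="$shimdir:$PATH" makepkg -si
out=$(printf 'hello' | PATH="$shimdir:$PATH" zstd -19 -c)
echo "$out"
```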


I know you wanted to make a general point but in Arch you can just change PKGEXT='.pkg.tar' in /etc/makepkg.conf and the compression stage is skipped.

https://wiki.archlinux.org/title/makepkg

Tar would also be the answer to how to do fake compression (keeping files together and in order in one file) IF you can choose the "compression" library/tool.


Interesting! I wasn't aware of that, so I'll check it out


Ironically, Python has moved in the other direction, now requiring that a wheel be built before installation, even though wheel itself was originally an optional bolt-on to setuptools.


Any pointer to flakehub?


I think it's mostly just a prototype: https://github.com/elohmeier/flakehub


Does it work with nix-shell? I don't know how to use flakes yet.


Yes! The Action isn't directly aware of what Nix itself is doing or which commands are being run; it's only aware of the Nix store. So whether you're using flakes or channels it works the same.


Great question. Yes. Anything that Nix builds during your workflow will get cached. Give it a try and let me know how it goes?


Typo: “speed future up Nix workflow runs”.


One project cut their CI time from 18m to 3m: https://github.com/awkward-squad/hasql-interpolate/actions. I wonder who will see the biggest cut!

Note that when PRs merge to the default branch, their cache doesn't come with them. This is how GitHub Actions' cache works, as a security measure. However, subsequent rebuilds on the default branch will be cached, and PRs branched off the default branch will get that cache too.


Nice, really hope I'll find some sweet way to cache similarish with GitLab-CI. Also kinda been thinking about how cool it'd be to run Kubernetes with Nix natively (so instead of a docker layer registry you have nix paths mounted together to overlayfs)


I think it should be pretty straightforward to make the Magic Nix Cache work on GitLab, too. They have a similar caching API. We'll take a look!


I would be thrilled to see this ported to GitLab CI. This is an incredibly (for lack of a better word) sexy tool on top of Nix, and would perfectly fit a use case we have for using Nix for caching build artifacts.


Nice. The GitLab caching API is basically saving and restoring a folder, so if you allow pointing to a local folder instead of the GitHub API you should be good!
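As a sketch of what that could look like today with GitLab's folder-based cache plus a local `file://` binary cache (job layout, cache key, and flags are illustrative; an unsigned local cache may also need signature checks relaxed):

```yaml
build:
  image: nixos/nix
  variables:
    LOCAL_CACHE: "$CI_PROJECT_DIR/nix-cache"
  cache:
    key: "nix-$CI_COMMIT_REF_SLUG"
    paths:
      - nix-cache/
  script:
    # Substitute from the restored folder first, then the public cache:
    - nix build . --extra-substituters "file://$LOCAL_CACHE"
    # Save what was just built back into the folder for the next run:
    - nix copy --to "file://$LOCAL_CACHE" ./result
```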


Is the same true for Azure Pipelines? Given Actions forked from Pipelines I'd imagine it would be straightforward.


I'm not sure... want to open a ticket? :)


I would love to use this kind of thing with GitLab CI (and AzDO) at work!


My solution is here: https://kevincox.ca/2022/01/02/nix-in-docker-caching/

I basically expose the daemon socket into the Docker container so that it requests builds from the host. It means that everything is cached right on local disk. If you need more oomph than one machine will provide, the cache won't be shared between different machines without extra effort, but you can do a lot of building on a single machine (especially if a lot of stuff is using Nix and is therefore cached).
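The shape of that setup is roughly the following (a sketch only — the image and exact mounts are illustrative, and in practice the image's own Nix binaries must not be shadowed by the mounted store; see the linked post for the working details):

```shell
# Share the host's Nix store (read-only) and the nix-daemon socket
# with the container; NIX_REMOTE=daemon makes Nix inside the
# container ask the host daemon to do the actual building.
docker run --rm \
  -v /nix/store:/nix/store:ro \
  -v /nix/var/nix/daemon-socket:/nix/var/nix/daemon-socket \
  -e NIX_REMOTE=daemon \
  my-nix-ci-image \
  nix-build ./default.nix
```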


This is what I'm using with gitlab: https://github.com/takeda/nix-cde/blob/master/contrib/gitlab...

It caches on two levels (instance's /nix/store on EBS and then also binary cache on S3).
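The S3 half of that is just substituter configuration; for example, in nix.conf (bucket name and signing key are placeholders):

```
# /etc/nix/nix.conf
substituters = s3://my-nix-cache?region=us-east-1 https://cache.nixos.org
trusted-public-keys = my-nix-cache:<your-cache-public-key> cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
```

Uploading after a build is then something like `nix copy --to 's3://my-nix-cache?region=us-east-1' ./result`.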


I just spun up a gitlab-runner on NixOS (super easy due to how NixOS works)


Nice! Yeah, we're obviously big fans of NixOS over here :). In cases where build infrastructure is highly ephemeral, this sort of cache would make a lot of sense. We'd love to help get it working there!


Yeah, you are trading away security with those runners, though. I want the cache to be secure and not allow taking over other repos'/branches'/tags' CI jobs.


It depends what level of security you need. My runner doesn't have root access and it talks to the nix-daemon on the host to do the building, so theoretically everything is safe.

Of course the attack surface is quite large, so I wouldn't expose this to the public. But using this for my repos and trusted developers is fine with me. It is basically impossible to accidentally do harm. Also note that GitLab forks and Merge Requests from forks run in the author's repo, so they won't use your runners. So it is only people with push access that will use them.


Slightly off-topic, but how do folks around here create production packages in nix? I use nix for my dev shells and machine configuration, but haven't yet built production packages using it.

More concretely, let's say you have a python backend that uses poetry. Do you just use `poetry install` in your derivation for python-deps? Do you use something like poetry2nix or node2nix and do all of your package management in nix?


"It depends." poetry2nix is pretty good! Generally, exporting to an OCI image, or targeting a production environment that can run Nix closures natively is the way we go about it. Many of our services use buildLayeredImage[0] and target Fly.io.

0: https://grahamc.com/blog/nix-and-layered-docker-images/
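For the curious, a buildLayeredImage call is quite small. A hedged sketch (image name and package are placeholders):

```nix
# Build an OCI/Docker image whose layers follow the Nix closure of
# the service, so unchanged dependencies reuse cached layers.
{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  name = "my-service";          # image name (placeholder)
  tag = "latest";
  contents = [ pkgs.hello ];    # your application's derivation goes here
  config.Cmd = [ "${pkgs.hello}/bin/hello" ];
}
```

Building this with `nix-build` produces a tarball you can feed to `docker load < result`.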


> targeting a production environment that can run Nix closures natively

Do you have some examples for this? The ones that come to mind are things like bare metal/vms with nix, or perhaps disnix, but those are a pretty hard sell over more popular orchestration systems, and I’d like to have more alternatives.


So far I use it for the dev environment and also for the CI pipeline. The benefit is that Nix allows caching between runs; it helps with things like:

- getting the dev environment on CI to be identical to the local dev environment

- with minor changes, the project is not rebuilt, or is rebuilt minimally

- the caching works across branches, so for example when merging a feature branch to master, if nothing changed, the build on master will be very quick

I created something similar to nix-cache for gitlab[1], but I had to create a dedicated runner running NixOS.

If I could use NixOS for deployment, at that point I would just point the same binary cache at the machine and use the same derivation to build the app. Because the app was already built by CI, it would just download the compiled version. No need for Artifactory or similar. In that scenario (since you're using poetry) you would probably just use poetry2nix to generate the application.

If the OS is not NixOS, but you still want to deploy via Nix, then IMO this[2] looks interesting: basically it packages everything in a self-extracting archive that you can extract and then run.

Other alternatives are these bundlers[3], which include toArx (works in a way similar to the previous one but pretends everything is in a single file), RPM, DEB, and Docker (though you would have more control if you used the code directly instead of a bundler).

And the last option (probably the most obvious one) is that you can simply use the tool to build the package. Since you're using poetry, you can generate a wheel from it.

[1] https://github.com/takeda/nix-cde/blob/master/contrib/gitlab...

[2] https://github.com/Ninlives/relocatable.nix

[3] https://github.com/NixOS/bundlers/blob/master/flake.nix


As a complete nix noob, will this help with caching node dependencies? We have a few projects that take over 20mins for a `yarn install && yarn build`. I’ve read setting up Nix for node isn’t that straightforward, but that was a couple of years back. Has anything changed with respect to node projects?


You may find something like node2nix helpful (https://github.com/svanderburg/node2nix). This converts your package.json into a Nix expression that can then be cached. You're right that it does require some setup and a bit of Nix knowledge but could yield significant benefits and take a good chunk out of that 20 minutes.
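The basic flow is short, as a sketch (flags per node2nix's README; double-check against the version you install):

```shell
# Generate Nix expressions from package.json plus the lockfile:
node2nix -l package-lock.json

# This writes default.nix, node-env.nix and node-packages.nix;
# building the "package" attribute fetches/builds each dependency
# as its own cacheable derivation:
nix-build -A package
```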

Another option might be to use pnpm instead of Yarn and cache your pnpm dependencies. pnpm actually works a bit like Nix in that it creates a pnpm-lock.yaml file with content-based hashes for the full package.json dependency tree. This enables it to quickly determine which parts of the dependency tree it needs to build and which are already available.


Nowadays, there's buildNpmPackage [1]. It's included in Nixpkgs, actively maintained, and easier to work with IMO.

[1]: https://github.com/NixOS/nixpkgs/blob/master/doc/languages-f...
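A minimal buildNpmPackage sketch (names are placeholders; `npmDepsHash` pins the fetched dependency tree, and `lib.fakeHash` is the usual trick for bootstrapping it):

```nix
{ lib, buildNpmPackage }:

buildNpmPackage {
  pname = "my-app";       # placeholder
  version = "1.0.0";
  src = ./.;

  # First build with lib.fakeHash; Nix will fail and report the
  # correct hash, which you then paste in here. The npm dependency
  # fetch becomes a fixed-output derivation, so it caches cleanly.
  npmDepsHash = lib.fakeHash;
}
```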


I have given up installing node dependencies via Nix. I use Nix to make nodejs available but then install node deps with npm/yarn/pnpm.

Node packages sometimes pull additional files from the internet in a postinstall script, or do other funky stuff that's incompatible with Nix. So the idea that you can construct a pure derivation from a package-lock.json or yarn.lock file is a pipe dream.


In order to support resolving dependencies to multiple versions (eg A and B depend on C but at incompatible versions), what npm does is drop the version override for C inside the nested node_modules directory (node_modules/A/node_modules/C, for example).

That means that node_modules (as created by a package-lock.json) can't really be cached or be built from caches since it depends on the particular version solution found by npm for a particular project.

So there's only so much that Nix can do. It can cache it about as well as using a naive caching scheme with actions/upload-artifact or similar (create a tarball of your node_modules and just cache it across runs, update when you need to).

Basically node_modules is inherently large. If you want better performance for caching dependencies use a better programming language environment.


Or fewer dependencies!


That is supported officially:

  - uses: actions/cache@v3
    with:
      # npm cache files are stored in `~/.npm` on Linux/macOS
      path: ~/.npm
      key: ${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}


Just this past week I thought about setting up custom github runners on NixOS machines so that Nix is pre-installed (doesn't need to be installed via a github action) and so that the Nix store can be shared between runs. Though I don't really want to manage the machines, so I might give this new github action a try…



I replaced cloud autoscaled Docker runners with a fixed set of machines with a Nix store cache per machine. 10x improvement.


Yep, that works great for cases where you trust all your contributors. We do the same for some projects internally, too. However, for public projects where you don't it gets a bit ... dicier. Glad to hear that's working for you!


We have a public OSS project. I guess we just haven't been attacked yet.

PRs from forks run only the main workflows, and these run Nix, which is kind of isolated enough.

I guess one could attack with some infinite Nix store bomb.


Are you saying you replaced Docker with Nix? I am still trying to understand the use case for Nix. Can someone explain? Is it a complete replacement for Docker?


It's really hard to explain because Nix is a paradigm shift (I think that's why it is initially hard to grasp).

But you can use it for things like:

- declare your application with all dependencies explicitly, so when someone else wants to build it they can (I would argue this is the primary purpose and rest is just built on top of that)

- common dev environment (so other developers can get the same dev environment as you with all exact same build tools)

- build toolchain (for example if you do embedded environment)

- you could use it as a replacement for homebrew/mac ports

- a configuration file holding your .dot files (home manager)

- if you use NixOS (OS that was built around Nix) then you have OS with a built-in configuration management (i.e. salt/puppet/ansible/chef) that is truly declarative

I think this[1] also shows some crazy stuff you can do with it.

Regarding the question about Docker: the great thing is that if you define your application as a Nix derivation, then you can easily generate a Docker image that contains just your application and its dependencies (the Docker image in that case is just a deployment unit). The reproducible environment is what Docker promised but practically failed to deliver: instead of delivering reproducibility, it actually delivered repeatability.

[1] https://youtu.be/6Le0IbPRzOE?t=109


Thank you for the explanation. Let me see if I understood. The basic draw is that it offers a programming language-agnostic specification for dependency management, replacing language-specific specifications like requirements.txt. Then you pair it with a containerization technology like Docker or Podman. And if you use NixOS, you can skip the last part?


> Thank you for the explanation. Let me see if I understood. The basic draw is that it offers a programming language-agnostic specification for dependency management, replacing language-specific specifications like requirements.txt

Yes, actually I got interested in Nix because with requirements.txt I could only define python dependencies, and I had no control over for example installing postgresql C library that psycopg2 depends on.

> Then you pair it with a containerization technology like Docker or Podman

You don't have to, but you can; given that containers are used in most places, a lot of people use Nix that way.

> And if you use NixOS, you can skip the last part?

Yes, although keep in mind that if, for example, the requirement is to use Kubernetes, then you would still have to use Kubernetes. But if you need to create, say, an EC2 instance, you can use Ubuntu + Ansible, or you could use NixOS.

Honestly I don't have much experience with NixOS as at my workplace we are mandated to use specific distro for everything.

Edit: I forgot to add the additional benefits of using NixOS compared to, for example, Ubuntu + Ansible. All updates are atomic: you either end up with the new configuration or the old configuration; there's no in-between state, as can happen with Ansible. The second big benefit is an easy way to roll back.


> Is it a complete replacement for Docker?

Docker mainly does two things: there are Docker images, as a way of sharing a container image, and the container runtime, which runs those images for tasks or services in isolated/fresh ways.

Docker images make it easy to distribute software which runs the same everywhere.

Nix is a package manager which tackles that problem, but without using container images: Nix is for distributing software so it has the same behaviour everywhere.

Nix users are often very enthusiastic about Nix because it also enables all sorts of neat developer experiences. e.g. Nix is great for setting up development environments.

In terms of Nix-vs-Docker, Nix is also capable of building Docker/OCI images. So, you could use Nix instead of writing a Dockerfile.


Does it make sense to use nix during the image build process to precisely define dependencies I want to have in the resulting container? Or perhaps in other words, can I use nix and docker together, to have a precisely defined environment I can share with others devs as a result?


> Does it make sense to use nix during the image build process to precisely define dependencies I want to have in the resulting container?

Using a dockerfile with a step like "RUN nix ..."? You can, but it strikes me as a cumbersome way of doing things. Mitchell H describes this way of doing things here: https://mitchellh.com/writing/nix-with-dockerfiles

Whereas, I'd reckon the more idiomatic thing to do is to build the Docker image with Nix code. -- You're going to get a precisely defined environment to share across workstation/CI/etc. Some example Nix code for building Docker images is here https://github.com/NixOS/nixpkgs/blob/master/pkgs/build-supp...


Thanks for the links, that is very useful! It seems like I was thinking about this process the other way around.


Yes. Nix builds a Linux filesystem with all dependencies and puts it into an OCI image.

So locally I use process-compose to avoid waiting on Docker builds.


Hm, I skimmed the github page for process-compose and I don't understand how it is different from building the container image directly with nix.


How does the Magic Nix Cache know which parts of my build are deterministic and which aren't?

I suppose maybe it will only work if I split my build up into multiple steps such that Nix will know to skip those first steps. If Nix knows that, I suppose the Magic Nix Cache also knows?


Nix builds things in a sandbox (no network access, no filesystem access, etc.). This can make creating a derivation for an application whose build has side effects a frustrating experience, but it brings reproducibility (or at least gets us really close), so generally this is not a concern.

As for splitting the build into steps: I am not entirely sure what you are trying to do, but you should not need to. Each package in Nix has a hash, which is generated from things like the hash of the source code, the dependencies, compile flags, system architecture, etc. This means that you could have multiple versions of the same package in /nix/store with different compile options or different dependencies. When you need a given dependency, you have all the information needed to generate its hash, so you can easily tell whether you need to build it or can use a cached version. This is what makes a binary cache pretty much plug and play: you don't need to worry about which files to cache or whether you should split the build into stages.


Nix's design gives good guarantees about reproducibility. It may not be bit-perfect, but in general it is _very_ good. Maybe the Zero to Nix article on Caching, and its linked pages will help? https://zero-to-nix.com/concepts/caching

The long and short of it is a merkle tree of hashed inputs :).


You tell it; when you define a derivation, you are promising it is at least logically deterministic.


Could you possibly clarify what you mean by "skip those first steps?" Which steps are you referring to?


Presumably, the Magic Nix Cache speeds up your build by caching the output of the "first steps" (really leaves in a tree of inputs). Otherwise, how would it work? So by steps, I am referring to the steps that your Nix derivation consists of.


When Nix "realises" a derivation, it realises the entire dependency tree (here, "realises" means either building or fetching, depending on whether a dependency is already in the Nix store). For every single derivation (dependency) in the tree, Nix first calculates what the store path for that dependency would be and then uses that to determine if it's already stored in the Nix store. So it would look at, say, a glibc dependency in a derivation and determine that the Nix store path would be /nix/store/7kn2mkg0g49lfflkdip7i39q3zsck4pc-glibc. If that's already in the Nix store then it doesn't need to build that. And it applies this logic throughout the entire dependency tree. In some cases, the entire dependency tree has already been built and written to the Nix store, in which case Nix knows that it doesn't have to build anything. So Nix's caching logic doesn't apply only to the "first steps" of a Nix build (or realisation, to be more specific); it applies to all steps.
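You can watch this decision happen with a dry run, which prints which derivations would be built versus which paths would merely be fetched from a substituter (output shape is approximate):

```shell
# Ask Nix what realising this flake output would involve,
# without actually building or downloading anything:
nix build nixpkgs#hello --dry-run

# Typical output (abridged):
#   these paths will be fetched:
#     /nix/store/...-hello-2.12.1
```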


Check out https://zero-to-nix.com/concepts/closures and https://zero-to-nix.com/concepts/realisation, which I think add more color to this (and also is largely authored by biggestlou here.)


this looks awesome, can't wait to try it out.


This is awesome. I've yet to try Nix, but we are starting a remote cache service and we'd love to be compatible with Nix's caching mechanism

we're at https://less.build if you want to take a peek -- we will look at adding S3 support! :)


I'm all for faster builds...but how does your product differ from using something like https://ccache.dev/manual/4.7.html#_remote_storage_backends ? (I ask specifically because you advertise being a drop in ccache replacement, or similar)


we are drop-in compatible with ccache, in the sense that you can plug ccache right in (as an HTTP backend), and gain features like authorization policies, analytics, and global caching backed by Cloudflare.

that's where we are beginning and then we're working to add spicier features on from there. but first and foremost, we want to serve the ccache/gradle cache crowd with a fantastic protocol-agnostic backend which "just works" and pays for itself in terms of time saved.

and thanks for asking :) we are very new


Ah, cool. Not my usecase, but neat all the same.


Side note: I am happy anytime build caching is on the front page of hckrnews, for any reason



