Friday, January 14, 2011

The Nix package manager

In my previous blog post, which was about Pull Deployment of Services, I have explained what the project I'm working on as a PhD student is about. An import foundation is the Nix package manager, which serves as an important basis of my research.

There are dozens of papers and websites available describing Nix, so in this blog post I will not give another extensive description. Instead, I'm going to explain some of the details and the vision that we have about software deployment which sets the Nix approach apart from conventional approaches.

Nix is a package manager developed by Eelco Dolstra as part of his PhD research, which has similar purposes as tools such as RPM and dpkg commonly found in many Linux distributions, such as Red Hat enterprise Linux, Fedora, Debian and Ubuntu.

A package manager is basically a collection of software tools to automate the process of installing, upgrading, configuring, and removing software packages for an operating system in a consistent manner. In many Linux distributions, the package manager is one of the major factors that distinguish them from each other. Similarly, an experimental Linux distribution has been built around Nix, called NixOS using the Nix package manager as its basis.

One of the major differences of Nix compared to conventional deployment tooling, is the way Nix addresses packages (or components). Most tools use nominal dependency specifications, consisting of a name and version number, such as:

openssl-1.0.0c

Although it is a good thing to have these attributes to distinguish packages (which for instance isn't properly done for native Windows DLLs), this dependency specification mechanism has some limitations. A problem may occur when OpenSSL is compiled with a different version of GCC as an executable linking to it. A library compiled with an older version of GCC cannot always be linked to an executable compiled with a newer version of GCC, due to a different ABI. Moreover, OpenSSL also has dependencies on libraries, such as glibc. If OpenSSL is compiled against an incompatible older version of glibc as an executable, also problems may occur.

In Nix, we have different way of addressing components. We store components in a special directory called the Nix store in which every component is stored under an unique filename, such as:

/nix/store/xq2bfcqdsbrmfr8h5ibv7n1qb8xs5s79-openssl-1.0.0c

The former part of the component name: xq2bfcqdsbrmfr8h5ibv7n1qb8xs5s79 is a SHA256 hash code derived from all build time dependencies to build the component from source code. For example, if the same component is built with a different version of GCC, a different hash code is generated.

Using hash codes to address components offers us some nice advantages. Since every component has an unique filename, we can safely store multiple versions and variants next to each other, without overwriting each other. This also offers us the possibility to upgrade a system atomically. In case of an upgrade a new component is safely stored next to an existing one, and after the upgrade only a symlink pointing to the configuration by a user is changed.

The hash codes in the component names are also used to detect runtime dependencies of a component. This is done by scanning for occurrences of a hash inside a component. If a hash code is found (such as a path containing a Nix store component in an ELF header, or a shell script containing a Nix store path) Nix identifies these components as a runtime dependency. Although this technique sounds risky/scary, we have used this for many packages and it turns out to work quite well.

Another key feature is the Nix expression language, which we use to build packages, as shown in the code fragment below.

{stdenv, fetchurl}:

stdenv.mkDerivation {
  name = "hello-2.6";
  
  src = fetchurl {
    url = ftp://ftp.gnu.org/gnu/hello/hello-2.6.tar.gz;
    sha256 = "1h6fjkkwr7kxv0rl5l61ya0b49imzfaspy7jk9jas1fil31sjykl";
  };

  meta = {
    homepage = http://www.gnu.org/software/hello/manual/;
    license = "GPLv3+";
  };
}

Basically each package description is a function which takes some dependencies as arguments (in this case stdenv and fetchurl). In the body of the function we call the mkDerivation function describing a build action, which takes some arguments, such as a reference to the source code and other relevant build properties. In this expression we did not specify how to build this component. By default, Nix assumes that a packages is autotools based (i.e. ./configure; make; make install) if build instructions are omitted.

The expression described earlier cannot be used to build a package directly. The function must be called with the right arguments, i.e. a component must be composed. This is done in another expression, shown below:

rec {
  stdenv = ...;

  fetchurl = import ../build-support/fetchurl {
    inherit stdenv curl;
  };

  hello = import ../applications/misc/hello {
    inherit stdenv fetchurl;
  }

  ...
}

In this expression we call the function building the GNU Hello component (shown in the previous expression) with its required arguments. As you may notice, all the dependencies of the GNU Hello package are composed in this expression as well.

Now you may think that this dependency scheme of using hash codes to address components is a bit unhandy for end users. To solve this problem, we use Nix profiles. A Nix profile is basically an environment in the PATH of a user, containing symlinks indirectly referring to components in the Nix store. For example, by typing:

$ nix-env -i hello

The GNU Hello component is installed into the Nix profile of the current user. By typing the following instruction on the command-line:

$ hello

The GNU Hello component installed earlier is executed.

The last distinguishable feature I'd like to mention is that in Nix packages are not directly removed, but removed by a garbage collector. The Nix garbage collector will safely remove components that are no longer in use. In most conventional tooling you may accidentally remove a package which may be still in use. Packages which are still installed in a profile of a user or which are a dependency of another package, will not be removed by Nix. Moreover, running processes and open files are also garbage collector roots, which makes it safe to run the garbage collector at any time.

After reading this blog post, you may have become interested in trying the Nix package manager. The simplest way to try out Nix is by downloading and installing NixOS. Although NixOS is a Linux distribution built around Nix, you can also use Nix on conventional Linux distributions, if you find it more convenient to keep using the operating system you are used to. Moreover, the Nix package manager is quite portable and can also be used on different operating systems, such as FreeBSD, OpenSolaris, Mac OS X and Windows (through Cygwin).


Except for plain package management and the management of a Linux distribution, Nix is also used as foundation for some other tooling. Hydra is a continuous build and integration server built on top of Nix, which allows you to continuously checkout sources from repositories and to build components in several variations (e.g. using various compilers, libraries and platforms). I have developed an extension called Disnix, which extends the Nix package manager for the deployment of distributable components (or services) into a network of machines.

Another thing I want to clarify is that I don't want to claim that other package management solutions (and deployment tooling) are bad solutions. In fact, I think other package management solutions have done a great job in dealing with certain installation issues. Moreover, I find package management tools one the strong points of Linux distributions, although we support some distinct features which we care about, which other package managers don't.

We are not the only group doing research in package management. There is also a project, called Mancoosi, in which package and system management issues are investigated. They have a different view about certain principles, such as keeping maintainer scripts which imperatively modify the system configuration (whereas we generate everything as packages in the Nix store). Moreover, their solutions are based on package management tools widely used in Linux distributions, such as RPM and dpkg.


For more information about Nix, have a look at the Nix website: http://nixos.org/nix or have a look at Eelco's excellent PhD thesis titled: 'The Purely Functional Software Deployment Model', which can be found on his publications page.

No comments:

Post a Comment