Published 2023-08-18

nix

derivation

experiment

Derivation: Fixing Layer Leakage

Introduction

Picking up on the latest development efforts in the Modules Working Group: modules allow us to separate functionality, reduce complexity, and improve the reusability of our codebase. But how we approach modularity, especially in the context of Nix, can significantly determine its efficiency.

The Modules Working Group is a collective effort aimed at finding the right module system and championing best practices. Their dedication to making modules more accessible and functional is commendable, but there's an underlying principle I'd like to discuss: the approach of building from the bottom up.

Starting from the ground up, addressing the actual core (derivation) and then building upon it, is often more impactful than merely layering fixes or wrappers over flawed interfaces. Instead of sealing imperfections behind additional overhead wrappers, we should focus on creating a robust foundation that inherently addresses those concerns. This blog post will delve into this philosophy, discussing its potential merits and presenting an alternative perspective on achieving the foundation for genuine modularity in Nix.

A new approach

In one of my previous posts I discussed the type of builtins.derivation.

Expanding on this topic, I will delve into how Nix might modify its interface to streamline and simplify the derivation type.

I believe such a change could promote higher quality Nix code and improved architecture. The concept of first defining the types of a problem and then tackling the real issue with that abstraction as a foundation has always intrigued me. (type-driven-development)

With that in mind, I'll present the type signature below for a deeper discussion of the type.

If you're interested in how I came up with that type, read the post about it.

let 
    Derivation :: {
        all :: [ Derivation ];
        builder :: String;
        drvAttrs :: {
            builder :: String;
            name :: String;
            system :: String;
            outputs :: [ output :: String ];
            ${additionalArgs} :: String;
        };
        drvPath :: String;
        name :: String;
        outPath :: String;
        outputName :: String;
        outputs :: [ output :: String ];
        system :: String;
        type :: "derivation";
        ${output} :: Derivation;
        ${additionalArgs} :: String;
    };
in
    builtins.derivation :: {
        name :: String;
        builder :: String;
        system :: String;
        outputs :: [ output :: String ]?;
        ${additionalArgs} :: String;
    } -> Derivation

In my view, a major shortcoming is that every input attribute passed is invariably present in the returned attribute set. This causes unintended overlap between abstraction layers. Rather than clearly distinguishing between layers, subsequent layers must continually consider the potential impacts on those beneath them. This approach feels counterintuitive and is generally regarded as an antipattern in programming.
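The leakage can be observed with a small expression. This is a minimal sketch: the attribute name myFlag is made up for illustration, and the expression only needs to be evaluated (for example with nix-instantiate --eval --strict), not built.

```nix
let
  drv = derivation {
    name = "demo";
    builder = "/bin/sh";
    system = "x86_64-linux";
    # Hypothetical extra input, only meant as a build-time variable.
    myFlag = "hello";
  };
in {
  # The input attribute is readable on the returned derivation itself...
  leaked = drv.myFlag;            # "hello"
  # ...and a second time through the nested drvAttrs copy.
  nested = drv.drvAttrs.myFlag;   # "hello"
  present = drv ? myFlag;         # true
}
```

Any layer that wraps this derivation can now read, and start depending on, myFlag.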

If you're wondering what those layers of abstraction could be, let me give you a concrete example:

              [SomeActualPackage]   <Random Github repository>
            buildPythonPackage      <nixpkgs>
          stdenv.mkDerivation       <nixpkgs>
      builtins.derivation           <nix>
  builtins.derivationStrict         <primop>

With every new layer of abstraction, the number of returned attributes grows:

builtins.length (builtins.attrNames pkgs.bash)
=> 54

[
  "PKG_CONFIG_ALLOW_CROSS",
  "__ignoreNulls",
  "all",
  "args",
  "buildAndTestSubdir",
  "buildInputs",
  "builder",
  "cargoBuildFeatures",
  "cargoBuildNoDefaultFeatures",
  "cargoBuildType",
  "cargoCheckFeatures",
  "cargoCheckNoDefaultFeatures",
  "cargoCheckType",
  "cargoDeps",
  "cargoSha256",
  "configureFlags",
  "configurePhase",
  "depsBuildBuild",
  "depsBuildBuildPropagated",
  "depsBuildTarget",
  "depsBuildTargetPropagated",
  "depsHostHost",
  "depsHostHostPropagated",
  "depsTargetTarget",
  "depsTargetTargetPropagated",
  "doCheck",
  "doInstallCheck",
  "drvAttrs",
  "drvPath",
  "inputDerivation",
  "meta",
  "name",
  "nativeBuildInputs",
  "out",
  "outPath",
  "outputName",
  "outputs",
  "override",
  "overrideAttrs",
  "overrideDerivation",
  "passthru",
  "patchRegistryDeps",
  "patches",
  "pname",
  "postUnpack",
  "propagatedBuildInputs",
  "propagatedNativeBuildInputs",
  "src",
  "stdenv",
  "strictDeps",
  "system",
  "type",
  "userHook",
  "version"
]

Many packages return roughly 50 attributes, with the majority being superfluous. In my view, accessing attributes of a derivation, which were intended as input options, is an ill-advised approach.

A derivation should ideally be viewed as possessing only predefined attributes. Any meta information can be seamlessly integrated using an additional abstraction layer, perhaps through a wrapper. The existing method doesn't offer developers sufficient control over their attributes, potentially leading them to rely on this unintended behavior rather than utilizing a distinct layer for variable management.

One possible approach to cleaning up the implementation is to redesign the interface from the bottom up, starting with the builtins.derivation interface. Luckily, this is relatively easy to do, as derivation is not an actual primop. To change it, we can experiment with some Nix code, inspired by the current implementation, which is based on builtins.derivationStrict.

Type conflicts

From a type perspective, having two dynamic entries with different types is also a problem:

# { ...
${output} :: Derivation;
${additionalArgs} :: String;
# }...

Although Nix does not have a static type system, consistency and predictability are more than just ideals; they're fundamental to ensuring reliability and clarity in our code. Now, let's consider a scenario where we have two dynamic entries, each with a different type. At first glance, it might seem like a mere quirk or a minor inconsistency. But let's delve deeper.

When you're dealing with dynamic entries of varying types, it introduces a layer of unpredictability. Each entry behaves differently, operates under its own set of rules, and interacts uniquely with other components. This disparity can lead to increased complexity, not just in understanding the system but also in troubleshooting and extending it.

Furthermore, from a maintenance perspective, it poses a challenge. Developers must remember the nuances of each dynamic entry, its associated type, and potential side effects. This can lead to inadvertent mistakes by even seasoned developers.

In essence, while having dynamic entries with different types might appear as a flexible feature, it also brings forth challenges in consistency, predictability, and maintainability. Ensuring homogeneity in type systems, especially for dynamic entries, can significantly streamline both development and future iterations.

Thus, I suggest separating these dynamic entries from one another.
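A small sketch of how the two dynamic entries can collide in the current implementation: an additional argument that happens to share its name with an output is silently shadowed at the top level. The attribute value is made up for illustration, and the expression only needs to be evaluated, not built.

```nix
let
  drv = derivation {
    name = "demo";
    builder = "/bin/sh";
    system = "x86_64-linux";
    # Additional argument that shares its name with the default output.
    out = "my own string";
  };
in {
  # At the top level the name now refers to the output derivation.
  topLevel = builtins.typeOf drv.out;          # "set"
  # The original string only survives in the nested copy.
  nested = builtins.typeOf drv.drvAttrs.out;   # "string"
}
```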

Derivation

When conducting experiments, it's been observed that the Nix build-plane requires specific attributes for a derivation to be successfully constructed. These essential attributes and their respective types are listed below:

Note: I researched the following via reverse engineering, as I'm not deeply familiar with the nix implementation under its hood.

name

name :: String

Name of the derivation
Must be context free

outputs

outputs :: [ String@output ]

List of all outputs
Every given output will produce one derivation (recursively, lazily)
Must be context free

outputName

outputName :: String@output

Name of the output that this derivation value refers to
MUST be one of the outputs
MUST be context free

Interestingly, nix-build and nix build behave slightly differently: nix-build doesn't necessarily need the outputs list; outputName alone is enough. The newer nix build falls back to the internally defined list [ "out" ] when no outputs list is present, which may lead to unexpected behavior if the list is omitted.

outPath

outPath :: Path

Enables the nix dependency chain.
Is not strictly required for building.
Is required for "toString" or string interpolation ${}.

For example:

derivation {
  # ...
  PATH  = "${python}/bin";
  # ...
}

Note: String coercion could also be achieved with a dedicated __toString method, although this is slightly less performant than my recommended form: outPath = strict.${outputName};, as it requires creating and invoking an additional lambda.

See reference here
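Both coercion mechanisms can be tried on plain attribute sets, without any derivation involved. A minimal sketch with made-up paths:

```nix
let
  # Coerced through its outPath attribute.
  viaOutPath = { outPath = "/some/path"; };
  # Coerced by calling the __toString function with the set itself.
  viaToString = { __toString = self: "/other/path"; };
in {
  a = "${viaOutPath}/bin";    # "/some/path/bin"
  b = "${viaToString}/bin";   # "/other/path/bin"
  c = toString viaOutPath;    # "/some/path"
}
```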

drvPath

drvPath :: Path

Store path to the .drv file.
The actual build instructions.

type

type    :: "derivation"

MUST always be this exact string literal.
Otherwise nix will treat this attribute set like a generic attribute set and thus will not generate build outcomes.
String context allowed but ignored.
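The marker can be checked from Nix code with a one-line predicate. This is a sketch; the helper name isDrv is hypothetical and not a Nix builtin.

```nix
let
  # Hypothetical helper: true only for attribute sets tagged as derivations.
  isDrv = x: builtins.isAttrs x && (x.type or null) == "derivation";
in {
  tagged = isDrv { type = "derivation"; };   # true
  wrongTag = isDrv { type = "package"; };    # false
  untagged = isDrv { name = "demo"; };       # false
  notASet = isDrv "demo";                    # false
}
```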

Wrapping Up

As we've journeyed through the intricacies of the current derivation in Nix, it becomes evident that there's room for improvement, particularly in streamlining the flow of attributes and ensuring a more intuitive developer experience.

Now, with some insights gathered and understanding of the limitations of the current method, I'd like to introduce a new and simplified implementation of the derivation function. This recommendation is designed to reduce potential pitfalls, and more closely align with established programming best practices.

Alternative derivation

You can visit the original derivation for comparison.

For clarity and brevity, I'll first outline my suggested modifications before delving into the specifics. Don't worry, I'll explain the changes and why the new implementation still works:

derivation.nix

drvAttrs @ { outputs ? [ "out" ], ... }:

let
  strict = derivationStrict drvAttrs;

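  commonAttrs = (builtins.listToAttrs outputsList) //
  { 
    inherit (drvAttrs) name;
    # Taken from the function pattern rather than drvAttrs, so the
    # [ "out" ] default applies when the caller omits outputs.
    inherit outputs;
  };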

  outputToAttrListElement = outputName:
    { name = outputName;
      value = commonAttrs // {
        outPath = strict.${outputName};
        drvPath = strict.drvPath;
        type = "derivation";
        inherit outputName;
      };
    };

  outputsList = map outputToAttrListElement outputs;

in (builtins.head outputsList).value

Remove `drvAttrs`

/*L12*/ inherit drvAttrs;

I've eliminated drvAttrs as it deliberately exposes information from the underlying layer. Anything not essential for the build process should be excluded from the foundational layer.

Solution: Delete the line

Don't extend `drvAttrs`

/*L10*/ commonAttrs = drvAttrs // {

Rather than modifying the nonspecific drvAttrs, I suggest explicitly defining all output attributes where feasible.

For clarity the resulting commonAttrs currently has the following type:

commonAttrs :: { 
      name :: String;
      outputs :: [ output :: String ];
      builder :: String;
      system :: String;
      all :: [ Derivation ];
      ${output} :: Derivation;
      ...
}

Here Derivation stands for the final derivation value, although we haven't defined how to construct it yet.

Solution: Delete drvAttrs here. Inherit needed attributes explicitly into the actual value.

inherit (drvAttrs) name;
inherit outputs; # bound by the function pattern, so the [ "out" ] default applies

Use .${} instead of getAttr

/*L18*/ outPath = builtins.getAttr outputName strict;

Rather than accessing the output value by invoking a function, we should access it with a simple outPath = strict.${outputName}; this is both more intuitive and slightly more performant.

No 'all'

/*L11*/ all = map (x: x.value) outputsList;

Regarding the all attribute, I'd suggest removing it, as this is the foundational layer and everything not needed by the build plane should live in higher abstraction layers.

Note the 'all' attribute can be easily constructed anytime from outside the derivation.
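As a sketch of that reconstruction: the helper name allOutputs is hypothetical, and mock is a stand-in value shaped like the proposed Derivation type rather than a real derivation.

```nix
let
  # Hypothetical helper: look up every declared output on the derivation.
  allOutputs = drv: map (outputName: drv.${outputName}) drv.outputs;

  # Stand-in with two outputs, reduced to the attributes used here.
  mock = {
    name = "demo";
    outputs = [ "out" "dev" ];
    out = { outputName = "out"; };
    dev = { outputName = "dev"; };
  };
in
  map (d: d.outputName) (allOutputs mock)   # [ "out" "dev" ]
```

Because the helper only relies on outputs and the per-output attributes, it can live in any higher layer.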

The new type

let 
    Derivation :: {
        name :: String;
        drvPath :: String;
        outPath :: String;
        outputName :: String;
        outputs :: [ output :: String ];
        type :: "derivation";
        ${output} :: Derivation;
    };
in
    builtins.derivation :: {
        name :: String;
        builder :: String;
        system :: String;
        outputs :: [ output :: String ]?;
        ${additionalArgs} :: String;
    } -> Derivation

If required we could also add drvAttrs back in, but keep it nested:

let 
    Derivation :: {
        name :: String;
        outputs :: [ output :: String ];
        drvAttrs :: {
            name :: String;
            builder :: String;
            system :: String;
            outputs :: [ output :: String ];
            ${additionalArgs} :: String;
        };
        drvPath :: String;
        outPath :: String;
        outputName :: String;
        type :: "derivation";
        ${output} :: Derivation;
    };
in
    builtins.derivation :: {
        name :: String;
        builder :: String;
        system :: String;
        outputs :: [ output :: String ]?;
        ${additionalArgs} :: String;
    } -> Derivation

This solves my current issues with derivation, and I'm happy to try it out:

Try It

With some simple tests we can verify that the proposed changes work properly:

pkg.nix

let 
  pkgs = import <nixpkgs> {};
  drv = import ./derivation.nix; # The new derivation function
  pkg = drv {
    name = "pkg";
    system = "x86_64-linux";
    outputs = [ "output" ];
    builder = "${pkgs.bash}/bin/bash";
    args = [ "-c" "echo $PATH > $output"];
  };
in {
  inherit pkg;
}

The derivation can be built

$> nix-build ./pkg.nix -A pkg
/nix/store/ryy0bhs3l87viimrg3wzmffahn5gydxq-pkg-output

$> cat /nix/store/ryy0bhs3l87viimrg3wzmffahn5gydxq-pkg-output
/path-not-set

The derivation has only the expected attributes

$> nix repl -f pkg.nix
nix-repl> pkg.    
pkg.drvPath     pkg.outPath     pkg.outputs
pkg.name        pkg.outputName  pkg.type

Conclusion

Checks

Cross-compatibility with all advanced attributes.
Dependency behavior: should work as before.
Functional integrity.
Performance: run a performance test. We could see a marginal performance gain.
Error handling: does the current Nix error handling depend on the (old) implementation details?
Tests: migrate some packages to the newly proposed format and use them for validation.

Downsides

Backward compatibility: this is not backwards compatible with nixpkgs, because many packages already rely on attribute leakage.

Known Limitations

This new approach may confuse anyone who is used to the current behavior.
Breaking changes: it is not feasible to replace builtins.derivation. Instead, this is a new solution that needs to be maintained in parallel.

Integration Points

To begin, we could establish a test repository dedicated to package declarations in the new format. This would not only serve as a structured foundation to advocate for its integration into nixpkgs, but also as a proactive response to concerns regarding clarity and organization. We could find the best patterns first, which we could then use as an argument for nix or nixpkgs integration.

Outlook

In future posts I'll try drafting intermediate abstraction layers to achieve user experiences similar to what stdenv.mkDerivation currently offers.

Thanks for reading! 😃