Derivation: Fixing Layer Leakage
Introduction
Picking up on the latest development efforts in the Modules Working Group: modules allow us to separate functionality, reduce complexity, and improve the reusability of our codebase. But how we approach modularity, especially in the context of Nix, significantly affects how well it works in practice.
The Modules Working Group is a collective effort aimed at finding the right module system and championing best practices. Their dedication to making modules more accessible and functional is commendable, but there's an underlying principle I'd like to discuss: the approach of building from the bottom up.
Starting from the ground up, addressing the actual core (derivation) and then building upon it, is often more impactful than merely layering fixes or wrappers over flawed interfaces. Instead of sealing imperfections behind additional overhead wrappers, we should focus on creating a robust foundation that inherently addresses those concerns. This blog post will delve into this philosophy, discussing its potential merits and presenting an alternative perspective on achieving the foundation for genuine modularity in Nix.
A new approach
In one of my previous posts I discussed the type of builtins.derivation.
Expanding on this topic, I will delve into how Nix might modify its interface to streamline and simplify the derivation type.
I believe such a change could promote higher quality Nix code and improved architecture. The concept of first defining the types of a problem and then tackling the real issue with that abstraction as a foundation has always intrigued me. (type-driven-development)
With that in mind, I'll present the type signature below for a deeper discussion of the type.
If you're interested in how I came up with that type, read the post about it.
let
  Derivation :: {
    all :: [ Derivation ];
    builder :: String;
    drvAttrs :: {
      builder :: String;
      name :: String;
      system :: String;
      outputs :: [ output :: String ];
      ${additionalArgs} :: String;
    };
    drvPath :: String;
    name :: String;
    outPath :: String;
    outputName :: String;
    outputs :: [ output :: String ];
    system :: String;
    type :: "derivation";
    ${output} :: Derivation;
    ${additionalArgs} :: String;
  };
in
builtins.derivation :: {
  name :: String;
  builder :: String;
  system :: String;
  outputs :: [ output :: String ]?;
  ${additionalArgs} :: String;
} -> Derivation
In my view, a major shortcoming is that every input attribute passed is invariably present in the returned attribute set. This causes unintended overlap between abstraction layers. Rather than clearly distinguishing between layers, subsequent layers must continually consider the potential impacts on those beneath them. This approach feels counterintuitive and is generally regarded as an antipattern in programming.
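To make the leakage concrete, here is a minimal sketch. The name, the builder path, and the extra argument foo are made up for illustration. Nothing is built or instantiated, since only attribute names and one passed-through value are inspected.

```nix
let
  # Hypothetical inputs; `foo` is an arbitrary extra argument.
  drv = derivation {
    name = "leak-demo";
    builder = "/bin/sh";
    system = "x86_64-linux";
    foo = "bar";
  };
in
# The input attribute `foo` is readable on the result, one layer up.
assert drv.foo == "bar";
# The same holds for the build-related inputs.
assert drv ? builder && drv ? system;
builtins.attrNames drv
```

Evaluating this with nix-instantiate --eval --strict lists the attribute names of the result, with the inputs sitting next to drvPath, outPath and type.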
If you're wondering what those layers of abstraction could be, let me give you a concrete example:
[SomeActualPackage] <Random Github repository>
buildPythonPackage <nixpkgs>
stdenv.mkDerivation <nixpkgs>
builtins.derivation <nix>
builtins.derivationStrict <primop>
With every new layer of abstraction, the number of returned attributes grows:
length (attrNames pkgs.bash)
=> 54
[
"PKG_CONFIG_ALLOW_CROSS",
"__ignoreNulls",
"all",
"args",
"buildAndTestSubdir",
"buildInputs",
"builder",
"cargoBuildFeatures",
"cargoBuildNoDefaultFeatures",
"cargoBuildType",
"cargoCheckFeatures",
"cargoCheckNoDefaultFeatures",
"cargoCheckType",
"cargoDeps",
"cargoSha256",
"configureFlags",
"configurePhase",
"depsBuildBuild",
"depsBuildBuildPropagated",
"depsBuildTarget",
"depsBuildTargetPropagated",
"depsHostHost",
"depsHostHostPropagated",
"depsTargetTarget",
"depsTargetTargetPropagated",
"doCheck",
"doInstallCheck",
"drvAttrs",
"drvPath",
"inputDerivation",
"meta",
"name",
"nativeBuildInputs",
"out",
"outPath",
"outputName",
"outputs",
"override",
"overrideAttrs",
"overrideDerivation",
"passthru",
"patchRegistryDeps",
"patches",
"pname",
"postUnpack",
"propagatedBuildInputs",
"propagatedNativeBuildInputs",
"src",
"stdenv",
"strictDeps",
"system",
"type",
"userHook",
"version"
]
Many packages return roughly 50 attributes, with the majority being superfluous. In my view, accessing attributes of a derivation, which were intended as input options, is an ill-advised approach.
A derivation should ideally be viewed as possessing only predefined attributes. Any meta information can be seamlessly integrated using an additional abstraction layer, perhaps through a wrapper. The existing method doesn't offer developers sufficient control over their attributes, potentially leading them to rely on this unintended behavior rather than utilizing a distinct layer for variable management.
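One possible shape for such a wrapper is sketched below. mkPackage is a hypothetical helper, not something from nixpkgs, and the package values are placeholders. The point is only that meta lives next to the derivation instead of being passed through it.

```nix
let
  # Hypothetical wrapper: keeps `meta` in its own layer and forwards
  # everything else to the derivation function.
  mkPackage = { meta ? { }, ... }@args: {
    inherit meta;
    drv = derivation (builtins.removeAttrs args [ "meta" ]);
  };

  pkg = mkPackage {
    name = "example";
    builder = "/bin/sh";
    system = "x86_64-linux";
    meta = { description = "An example package"; };
  };
in
# The metadata is available on the wrapper...
assert pkg.meta.description == "An example package";
# ...but never reaches the derivation layer.
assert !(pkg.drv ? meta);
pkg.meta
```

With this split, a higher layer decides which information it exposes, and the derivation itself stays free of attributes that the build does not need.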
One possible approach to cleaning up the implementation is to redesign the interface from the bottom up, starting with the builtins.derivation interface. Luckily, this is relatively easy to do, as derivation is not an actual primop. To change it, we can just experiment with some Nix code, inspired by the current implementation, which is based on builtins.derivationStrict.
Type conflicts
From a type perspective having two dynamic entries with different types each is also a problem:
# { ...
${output} :: Derivation;
${additionalArgs} :: String;
# }...
Although Nix does not have a static type system, consistency and predictability are more than just ideals: they're fundamental to ensuring reliability and clarity in our code. Now, let's consider a scenario where we have two dynamic entries, each with a different type. At first glance, it might seem like a mere quirk or a minor inconsistency. But let's delve deeper.
When you're dealing with dynamic entries of varying types, it introduces a layer of unpredictability. Each entry behaves differently, operates under its own set of rules, and interacts uniquely with other components. This disparity can lead to increased complexity, not just in understanding the system but also in troubleshooting and extending it.
Furthermore, from a maintenance perspective, it poses a challenge. Developers must remember the nuances of each dynamic entry, its associated type, and potential side effects. This can lead to inadvertent mistakes by even seasoned developers.
In essence, while having dynamic entries with different types might appear as a flexible feature, it also brings forth challenges in consistency, predictability, and maintainability. Ensuring homogeneity in type systems, especially for dynamic entries, can significantly streamline both development and future iterations.
Thus, I suggest separating these dynamic entries from one another.
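A small sketch of how the two dynamic entries can collide, assuming the derivation.nix that Nix ships at the time of writing. The extra argument is deliberately named out, which is also the name of the default output. All values are placeholders and nothing is built.

```nix
let
  # Hypothetical: an additional argument that shares its name with an output.
  drv = derivation {
    name = "clash-demo";
    builder = "/bin/sh";
    system = "x86_64-linux";
    out = "just a string";
  };
in
# The output entry wins: `out` is a derivation, not the caller's string.
assert builtins.isAttrs drv.out;
assert drv.out.type == "derivation";
# The original string is only reachable through drvAttrs.
assert drv.drvAttrs.out == "just a string";
"ok"
```

Both entries share one namespace, so which type you get for a given key depends on whether that key happens to be an output name.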
Derivation
When conducting experiments, it's been observed that the Nix build-plane requires specific attributes for a derivation to be successfully constructed. These essential attributes and their respective types are listed below:
Note: I researched the following via reverse engineering, as I'm not deeply familiar with the nix implementation under its hood.
name
name :: String
- Name of the derivation
- Must be context free
outputs
outputs :: [ String@output ]
- List of all outputs
- Every listed output produces one derivation (recursively, lazily)
- Must be context free
outputName
outputName :: String@output
- Name of the output that this attribute set represents
- MUST be one of the outputs
- MUST be context free
Interestingly, nix-build and nix build behave slightly differently: nix-build doesn't necessarily need the outputs list; outputName alone is enough. The newer nix build falls back to the internally defined list [ "out" ] if no outputs list is present, which may lead to unexpected behavior if the list is omitted.
outPath
outPath :: Path
- Enables the nix dependency chain.
- Is not strictly required for building.
- Is required for "toString" or string interpolation ${}.
For example:
derivation {
# ...
PATH = "${python}/bin";
# ...
}
Note: String coercion could also be achieved with a dedicated __toString method, although this is slightly less performant than my recommended form, outPath = strict.${outputName};, as it requires creating and invoking an additional lambda.
See reference here
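The two coercion mechanisms can be compared side by side. This is a minimal sketch with placeholder paths rather than real store paths:

```nix
let
  # Both values are placeholders, not real store paths.
  viaOutPath = { outPath = "/example/path"; };
  viaToString = { __toString = self: "/example/path"; };
in
# Interpolation works for both forms and yields the same string.
assert "${viaOutPath}/bin" == "/example/path/bin";
assert "${viaToString}/bin" == "/example/path/bin";
assert toString viaOutPath == toString viaToString;
"ok"
```

The outPath form needs no function call during coercion, which is the small difference the note above refers to.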
drvPath
drvPath :: Path
- Store path to the .drv file.
- The actual build instructions.
type
type :: "derivation"
- MUST always be this exact string literal.
- Otherwise nix will treat this attribute set like a generic attribute set and thus will not generate build outcomes.
- String context allowed but ignored.
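Put together, the attributes above can be assembled by hand for a single output. This is only a sketch of my reading of the list: the name, builder and args are placeholders, and I have not verified it against every Nix version.

```nix
let
  # Hypothetical build instructions, passed straight to the primop.
  strict = builtins.derivationStrict {
    name = "minimal";
    system = "x86_64-linux";
    builder = "/bin/sh";
    args = [ "-c" "echo hello > $out" ];
  };
in
{
  type = "derivation";       # marks the set as a derivation
  name = "minimal";
  outputs = [ "out" ];
  outputName = "out";        # must be one of `outputs`
  outPath = strict.out;      # enables string interpolation
  drvPath = strict.drvPath;  # store path of the .drv file
}
```

Saved to a file, this can be handed to nix-build to check whether the attribute set is accepted as a derivation.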
Wrapping Up
As we've journeyed through the intricacies of the current derivation in Nix, it becomes evident that there's room for improvement, particularly in streamlining the flow of attributes and ensuring a more intuitive developer experience.
Now, with some insights gathered and understanding of the limitations of the current method, I'd like to introduce a new and simplified implementation of the derivation function. This recommendation is designed to reduce potential pitfalls, and more closely align with established programming best practices.
Alternative derivation
You can visit the original derivation for comparison.
For clarity and brevity, I'll first outline my suggested modifications before delving into the specifics. Don't worry, I'll explain the changes and why the new implementation still works:
derivation.nix
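drvAttrs @ { outputs ? [ "out" ], ... }:
let
  # Instantiation is lazy: it only happens once a path is actually needed.
  strict = derivationStrict drvAttrs;

  # Attributes shared by every output: one entry per output, plus name and outputs.
  commonAttrs = (builtins.listToAttrs outputsList) //
    {
      inherit (drvAttrs) name;
      # Taken from the function argument so the default [ "out" ] applies
      # when the caller omits outputs (drvAttrs does not contain defaults).
      inherit outputs;
    };

  outputToAttrListElement = outputName:
    { name = outputName;
      value = commonAttrs // {
        outPath = strict.${outputName};
        drvPath = strict.drvPath;
        type = "derivation";
        inherit outputName;
      };
    };

  outputsList = map outputToAttrListElement outputs;

in (builtins.head outputsList).value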
Remove drvAttrs
/*L12*/ inherit drvAttrs;
I've eliminated drvAttrs as it deliberately exposes information from the underlying layer. Anything not essential for the build process should be excluded from the foundational layer.
Solution: Delete the line
Don't extend drvAttrs
/*L10*/ commonAttrs = drvAttrs // {
Rather than modifying the nonspecific drvAttrs, I suggest explicitly defining all output attributes where feasible.
For clarity, the resulting commonAttrs currently has the following type:
commonAttrs :: {
  name :: String;
  outputs :: [ output :: String ];
  builder :: String;
  system :: String;
  all :: [ Derivation ];
  ${output} :: Derivation;
  ...
}
Here Derivation stands for the final derivation value, although we haven't defined how to construct it yet.
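Solution: Delete drvAttrs here. Inherit needed attributes explicitly into the actual value. Note that outputs should come from the function argument rather than from drvAttrs, because drvAttrs does not contain the default [ "out" ] when the caller omits it.
inherit (drvAttrs) name; inherit outputs;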
Use .${} instead of getAttr
/*L18*/ outPath = builtins.getAttr outputName strict;
Rather than accessing the output value by invoking a function, we should access it with a simple outPath = strict.${outputName}; which is both more intuitive and slightly more performant.
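Both forms select the same value. A minimal sketch, where strict is a stand-in attribute set rather than a real derivationStrict result:

```nix
let
  # Stand-in for the attribute set returned by derivationStrict.
  strict = { out = "/example/out"; dev = "/example/dev"; };
  outputName = "dev";
in
# Dynamic attribute selection is equivalent to the builtin call.
assert builtins.getAttr outputName strict == strict.${outputName};
strict.${outputName}
```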
No 'all'
/*L11*/ all = map (x: x.value) outputsList;
Regarding the all attribute: I'd suggest removing it here, as this is the foundational layer and everything not needed by the build plane should live in higher abstraction layers.
Note that the 'all' attribute can easily be constructed at any time from outside the derivation.
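As a sketch of how that could look from the outside: allOutputs is a hypothetical helper, and pkg is a stand-in for a multi-output derivation of the proposed shape.

```nix
let
  # Stand-in for a multi-output derivation of the proposed shape.
  pkg = {
    outputs = [ "out" "dev" ];
    out = { outputName = "out"; };
    dev = { outputName = "dev"; };
  };

  # Hypothetical helper: rebuild `all` from the outputs list.
  allOutputs = drv: map (o: drv.${o}) drv.outputs;
in
assert map (d: d.outputName) (allOutputs pkg) == [ "out" "dev" ];
allOutputs pkg
```

Since outputs and the per-output entries remain on the derivation, a higher layer can reconstruct the list whenever it needs it.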
The new type
let
  Derivation :: {
    name :: String;
    drvPath :: String;
    outPath :: String;
    outputName :: String;
    outputs :: [ output :: String ];
    type :: "derivation";
    ${output} :: Derivation;
  };
in
builtins.derivation :: {
  name :: String;
  builder :: String;
  system :: String;
  outputs :: [ output :: String ]?;
  ${additionalArgs} :: String;
} -> Derivation
If required we could also add drvAttrs back in, but keep it nested:
let
  Derivation :: {
    name :: String;
    outputs :: [ output :: String ];
    drvAttrs :: {
      name :: String;
      builder :: String;
      system :: String;
      outputs :: [ output :: String ];
      ${additionalArgs} :: String;
    };
    drvPath :: String;
    outPath :: String;
    outputName :: String;
    type :: "derivation";
    ${output} :: Derivation;
  };
in
builtins.derivation :: {
  name :: String;
  builder :: String;
  system :: String;
  outputs :: [ output :: String ]?;
  ${additionalArgs} :: String;
} -> Derivation
This solves my current issues with derivation, and I'm happy to try it out:
Try It
With some simple tests we can verify that the proposed changes work properly:
pkg.nix
let
  pkgs = import <nixpkgs> {};
  drv = import ./drv.nix; # The new derivation function
  pkg = drv {
    name = "pkg";
    system = "x86_64-linux";
    outputs = [ "output" ];
    builder = "${pkgs.bash}/bin/bash";
    args = [ "-c" "echo $PATH > $output" ];
  };
in {
  inherit pkg;
}
The derivation can be built
$> nix-build ./pkg.nix -A pkg
/nix/store/ryy0bhs3l87viimrg3wzmffahn5gydxq-pkg-output
$> cat /nix/store/ryy0bhs3l87viimrg3wzmffahn5gydxq-pkg-output
/path-not-set
The derivation has only the expected attributes
$> nix repl -f pkg.nix
nix-repl> pkg.
pkg.drvPath pkg.outPath pkg.outputs
pkg.name pkg.outputName pkg.type
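The absence of leaked inputs can also be checked with plain assertions. The sketch below inlines a copy of the proposed function so that it is self-contained; it takes outputs from the function argument so the default applies when the list is omitted. The package values are placeholders, and nothing is built because only attribute names and constant fields are inspected.

```nix
let
  # Inlined copy of the proposed derivation function.
  drv = drvAttrs @ { outputs ? [ "out" ], ... }:
    let
      strict = derivationStrict drvAttrs;
      commonAttrs = (builtins.listToAttrs outputsList) // {
        inherit (drvAttrs) name;
        inherit outputs;
      };
      outputToAttrListElement = outputName: {
        name = outputName;
        value = commonAttrs // {
          outPath = strict.${outputName};
          drvPath = strict.drvPath;
          type = "derivation";
          inherit outputName;
        };
      };
      outputsList = map outputToAttrListElement outputs;
    in (builtins.head outputsList).value;

  # Placeholder package; `outputs` is omitted on purpose.
  pkg = drv {
    name = "pkg";
    system = "x86_64-linux";
    builder = "/bin/sh";
    args = [ "-c" "echo hello > $out" ];
  };
in
# None of the build inputs leak into the result.
assert !(pkg ? builder) && !(pkg ? system) && !(pkg ? args);
# The attributes required by the build plane are present.
assert pkg.type == "derivation" && pkg.outputName == "out";
assert pkg.outputs == [ "out" ];
"ok"
```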
Conclusion
Checks
- Cross compatibility with all advanced attributes
- Dependency behavior: should work as before
- Functional integrity
- Performance: run a performance test. We could see a marginal performance gain.
- Error handling: does the current Nix error handling depend on the (old) implementation details?
- Tests: migrate some packages to the newly proposed format and use them for validation.
Downsides
- Backward Compatibility: This is not backwards compatible with nixpkgs, because many packages already rely on attribute leakage.
Known Limitations
- This new approach may confuse anyone who is used to the current behavior.
- Breaking changes: It is not feasible to replace builtins.derivation. Instead, this is a new solution that needs to be maintained in parallel.
Integration Points
To begin, we could establish a test repository dedicated to package declarations in the new format. This would not only serve as a structured foundation to advocate for its integration into nixpkgs, but also as a proactive response to concerns regarding clarity and organization. We could find the best patterns first, which we could then use as an argument for nix or nixpkgs integration.
Outlook
In future posts I'll try drafting intermediate abstraction layers to achieve user experiences similar to what stdenv.mkDerivation currently offers.
Thanks for reading; 😃