Commit 4e91a4a0 authored by Kierán Meinhardt's avatar Kierán Meinhardt

nixos/doc: document systemd-nspawn test containers

parent 2e5b118a
+15 −4
@@ -10,6 +10,17 @@ $ ./result/bin/nixos-test-driver
>>>
```

::: {.note}
Tests using `systemd-nspawn` container machines require root privileges to run interactively,
since the driver calls `systemd-nspawn` directly to start the containers:

```
$ sudo ./result/bin/nixos-test-driver
[...]
>>>
```
:::

::: {.note}
By executing the test driver in this way,
the VMs may gain network and Internet access via their backdoor control interface,
@@ -30,7 +41,7 @@ back into the test driver command line upon its completion. This allows
you to inspect the state of the VMs after the test (e.g. to debug the
test script).

## Shell access in interactive mode {#sec-nixos-test-shell-access}
## Shell access to VMs in interactive mode {#sec-nixos-test-shell-access}

The function `<yourmachine>.shell_interact()` grants access to a shell running
inside a virtual machine. To use it, replace `<yourmachine>` with the name of a
@@ -63,7 +74,7 @@ using:
Once the connection is established, you can enter commands in the terminal
where socat is running.

## SSH Access for test machines {#sec-nixos-test-ssh-access}
## SSH Access for test VMs {#sec-nixos-test-ssh-access}

An SSH-based backdoor to log into machines can be enabled with

@@ -149,10 +160,10 @@ must be configured to allow these connections.
## Reuse VM state {#sec-nixos-test-reuse-vm-state}

You can re-use the VM states coming from a previous run by setting the
`--keep-vm-state` flag.
`--keep-machine-state` flag.

```ShellSession
$ ./result/bin/nixos-test-driver --keep-vm-state
$ ./result/bin/nixos-test-driver --keep-machine-state
```

The machine state is stored in the `$TMPDIR/vm-state-machinename`
+35 −1
@@ -21,10 +21,44 @@ $ nix-store --read-log result

## System Requirements {#sec-running-nixos-tests-requirements}

NixOS tests require virtualization support.
NixOS tests using QEMU virtual machine [`nodes`](#test-opt-nodes) require virtualization support.
This means that the machine must have `kvm` in its [system features](https://nixos.org/manual/nix/stable/command-ref/conf-file.html?highlight=system-features#conf-system-features) list, or `apple-virt` in case of macOS.
These features are autodetected locally, but `apple-virt` is only autodetected since Nix 2.19.0.

Features of **remote builders** must additionally be configured manually on the client, e.g. on NixOS with [`nix.buildMachines.*.supportedFeatures`](https://search.nixos.org/options?show=nix.buildMachines.*.supportedFeatures&sort=alpha_asc&query=nix.buildMachines) or through general [Nix configuration](https://nixos.org/manual/nix/stable/advanced-topics/distributed-builds).

If you run the tests on a **macOS** machine, you also need a "remote" builder for Linux; possibly a VM. [nix-darwin](https://daiderd.com/nix-darwin/) users may enable [`nix.linux-builder.enable`](https://daiderd.com/nix-darwin/manual/index.html#opt-nix.linux-builder.enable) to launch such a VM.
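
With nix-darwin, such a builder VM can be enabled with a single setting, e.g.:

```nix
{
  # Launches a Linux builder VM managed by nix-darwin:
  nix.linux-builder.enable = true;
}
```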

NixOS tests using `systemd-nspawn` [`containers`](#test-opt-containers) require the Nix daemon to be
configured with the following settings:

```nix
{
  nix.settings = {
    auto-allocate-uids = true;
    extra-system-features = [ "uid-range" ];
    experimental-features = [
      "auto-allocate-uids"
      "cgroups"
    ];
  };
}
```

See the documentation of the settings
[`auto-allocate-uids`](https://nix.dev/manual/nix/stable/command-ref/conf-file#conf-auto-allocate-uids),
[`uid-range`](https://nix.dev/manual/nix/stable/command-ref/conf-file.html?highlight=uid-range#conf-system-features), and
[`cgroups`](https://nix.dev/manual/nix/stable/development/experimental-features#xp-feature-cgroups)
for more information.

If the test uses both `systemd-nspawn` [`containers`](#test-opt-containers) and QEMU virtual machine [`nodes`](#test-opt-nodes)
and requires them to share a common VLAN,
`/dev/net` must be present in the sandbox.
This allows them to be bridged over a TAP interface.
To make this path available, set the following option:

```nix
{
  nix.settings.sandbox-paths = [ "/dev/net" ];
}
```
+239 −100
@@ -4,15 +4,14 @@ A NixOS test is a module that has the following structure:

```nix
{

  # One or more machines:
  # QEMU virtual machines:
  nodes = {
    machine =
    vm1 =
      { config, pkgs, ... }:
      {
        # ...
      };
    machine2 =
    vm2 =
      { config, pkgs, ... }:
      {
        # ...
@@ -20,6 +19,20 @@ A NixOS test is a module that has the following structure:
    # …
  };

  # systemd-nspawn containers:
  containers = {
    container1 =
      { config, pkgs, ... }:
      {
        # ...
      };
    container2 =
      { config, pkgs, ... }:
      {
        # ...
      };
  };

  testScript = ''
    Python code…
  '';
@@ -27,12 +40,13 @@ A NixOS test is a module that has the following structure:
```

We refer to the whole test above as a test module, whereas the values
in [`nodes.<name>`](#test-opt-nodes) are NixOS modules themselves.
in [`nodes.<name>`](#test-opt-nodes) and [`containers.<name>`](#test-opt-containers)
are NixOS modules themselves.

The option [`testScript`](#test-opt-testScript) is a piece of Python code that executes the
test (described below). During the test, it will start one or more
virtual machines, the configuration of which is described by
the option [`nodes`](#test-opt-nodes).
test (described [below](#ssec-test-script)). During the test, it will start one or more
virtual machines and/or `systemd-nspawn` containers, the configuration of which is described by
the options [`nodes`](#test-opt-nodes) and [`containers`](#test-opt-containers), respectively.

An example of a single-node test is
[`login.nix`](https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/login.nix).
@@ -42,6 +56,12 @@ when switching between consoles, and so on. An interesting multi-node test is
[`nfs/simple.nix`](https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/nfs/simple.nix).
It uses two client nodes to test correct locking across server crashes.

A test can contain both virtual machines and containers.
If configured to share a common VLAN,
they can reach each other over the network.
See [`containers.nix`](https://github.com/applicative-systems/nixpkgs/blob/78100f9077ab50604ab9c9514e442bbc7ac7ca5b/nixos/tests/nixos-test-driver/containers.nix)
for an example of this and [](#sec-running-nixos-tests-requirements) for the system requirements for this scenario.
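
A minimal sketch of such a mixed test (hypothetical machine names; the VLAN number is arbitrary) could look like this:

```nix
{
  name = "vm-and-container";

  # A QEMU virtual machine and a systemd-nspawn container on the same VLAN:
  nodes.vm1 =
    { ... }:
    {
      virtualisation.vlans = [ 1 ];
    };

  containers.container1 =
    { ... }:
    {
      virtualisation.vlans = [ 1 ];
    };

  testScript = ''
    start_all()
    vm1.wait_for_unit("network-online.target")
    container1.wait_for_unit("network-online.target")
    vm1.succeed("ping -c 1 container1")
  '';
}
```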

## Calling a test {#sec-calling-nixos-tests}

Tests are invoked differently depending on whether the test is part of NixOS or lives in a different project.
@@ -90,13 +110,54 @@ pkgs.testers.runNixOSTest {

`runNixOSTest` returns a derivation that runs the test.

## Configuring the nodes {#sec-nixos-test-nodes}
## Test machines {#ssec-nixos-test-machines}

There are a few special NixOS options for test VMs:
A NixOS test usually consists of one or more test machines. Each machine is either a
QEMU virtual machine or a `systemd-nspawn` container.

`virtualisation.memorySize`
QEMU virtual machines are defined in the
[`nodes`](#test-opt-nodes) attribute set, whereas `systemd-nspawn` containers are defined in the
[`containers`](#test-opt-containers) attribute set.

:   The memory of the VM in MiB (1024×1024 bytes).
To set NixOS options for all machines in the test, use the attribute
[`defaults`](#test-opt-defaults). These options are applied to both virtual machines
and containers. You can set separate defaults for virtual machines and containers
using the attributes [`nodeDefaults`](#test-opt-nodeDefaults) and
[`containerDefaults`](#test-opt-containerDefaults), respectively.

### Virtual machines vs. containers {#sec-nixos-test-vms-vs-containers}

QEMU virtual machines and `systemd-nspawn` containers offer different
trade-offs which make them suitable for different use cases.

Some advantages of containers over virtual machines are:

- Containers share the kernel of the host system, so they start up
  significantly faster than virtual machines.
- Containers are more lightweight in terms of resource usage, which
  allows running more of them in parallel on a single host.
- Containers can easily be run in virtualised environments, e.g., CI systems.
- Containers allow direct bind-mounting of host device nodes, which enables
  testing of GPU code (CUDA), for example.

Some advantages of virtual machines over containers are:

- Virtual machines run a separate kernel, which allows testing kernel features
  (kernel modules, etc.).
- Virtual machines support testing graphical applications on X11.
- Virtual machines allow testing NixOS modules that use systemd's namespacing options (such as `ProtectSystem=` or `MountAPIVFS=`).
- Virtual machines allow testing [`specialisation`](options.html#opt-specialisation).
  (Switching to a specialisation requires the creation of SUID/SGID wrappers, which is disallowed in `systemd-nspawn` within the Nix sandbox.)
- Virtual machines allow the execution of `setuid` binaries.

Refer to the sections on [QEMU virtual machines](#ssec-nixos-test-qemu-vms)
and [systemd-nspawn containers](#ssec-nixos-test-nspawn-containers) below
for more details on configuring each type of machine.

### Configuring test machines {#sec-nixos-test-machines-config}

The following special NixOS option can be used to configure
machines in a NixOS test, whether they are virtual machines or containers:

`virtualisation.vlans`

@@ -104,6 +165,35 @@ There are a few special NixOS options for test VMs:
    [`nat.nix`](https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/nat.nix)
    for an example.

#### Configuring `systemd-nspawn` containers {#ssec-nixos-test-nspawn-containers}

Some options are specific to `systemd-nspawn` containers:

`virtualisation.systemd-nspawn.options`

:   A list of additional command-line options to pass to
    `systemd-nspawn` when starting the container. For example, to
    bind mount a directory from the host into the container, you could
    use: `[ "--bind=/host/dir:/container/dir" ]`.

For more options, see the module
[`nspawn-container`](https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/virtualisation/nspawn-container/default.nix).

Note that the paths used in `--bind` or `--bind-ro` options have to be accessible from within the Nix sandbox.
Use the Nix option
[`sandbox-paths`](https://nix.dev/manual/nix/stable/command-ref/conf-file#conf-sandbox-paths)
and/or the module [`programs.nix-required-mounts`](#opt-programs.nix-required-mounts.enable) on the host
to add additional paths to the sandbox.
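
For example, to bind-mount a host directory into a container (with `/var/cache/models` as a hypothetical path that has been added to the host's `sandbox-paths`):

```nix
{
  containers.machine =
    { ... }:
    {
      # /var/cache/models is a hypothetical host path; it must be listed in
      # the host's sandbox-paths so it is visible inside the Nix sandbox.
      virtualisation.systemd-nspawn.options = [
        "--bind-ro=/var/cache/models:/models"
      ];
    };
}
```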

#### Configuring QEMU virtual machines {#ssec-nixos-test-qemu-vms}

Some options are specific to QEMU virtual machines:

`virtualisation.memorySize`

:   The memory of the VM in MiB (1024×1024 bytes).


`virtualisation.writableStore`

:   By default, the Nix store in the VM is not writable. If you enable
@@ -114,13 +204,15 @@ There are a few special NixOS options for test VMs:
For more options, see the module
[`qemu-vm.nix`](https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/virtualisation/qemu-vm.nix).

## Writing the test script {#ssec-test-script}

The test script is a sequence of Python statements that perform various
actions, such as starting VMs, executing commands in the VMs, and so on.
Each virtual machine is represented as an object stored in the variable
`name` if this is also the identifier of the machine in the declarative
config. If you specified a node `nodes.machine`, the following example starts the
machine, waits until it has finished booting, then executes a command
and checks that the output is more-or-less correct:
actions, such as starting machines, executing commands in them, and so on. For
example, if you specified a virtual machine in `nodes.machine`, there will be
a Python variable `machine` available in the test script that represents that
virtual machine. The following example would start the machine, wait until it
has finished booting, and then execute a command and check that the output is
more-or-less correct:

```py
machine.start()
@@ -139,17 +231,20 @@ start_all()

Under the variable `t`, all assertions from [`unittest.TestCase`](https://docs.python.org/3/library/unittest.html) are available.
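
For illustration, the snippet below emulates the driver-provided `t` with a plain `unittest.TestCase` instance and a hypothetical command output; inside a real test script, `t` and the machine objects are already in scope:

```py
import unittest

# Stand-in for the driver's `t` variable; the real object is provided by
# the test driver, but it exposes the same TestCase assertion methods.
t = unittest.TestCase()

# Hypothetical output of e.g. machine.succeed("uname"):
output = "Linux\n"

t.assertEqual(output.strip(), "Linux")
t.assertIn("Linux", output)
t.assertRegex(output, r"^Linux")
```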

If the hostname of a node contains characters that can't be used in a
If the hostname of a machine contains characters that can't be used in a
Python variable name, those characters will be replaced with
underscores in the variable name, so `nodes.machine-a` will be exposed
to Python as `machine_a`.
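
The renaming can be sketched as follows (illustrative only, not the driver's actual implementation):

```py
import re

def to_python_identifier(hostname: str) -> str:
    # Replace every character that is not valid in a Python identifier
    # with an underscore.
    return re.sub(r"\W", "_", hostname)

print(to_python_identifier("machine-a"))  # machine_a
```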

## Machine objects {#ssec-machine-objects}
### Methods available on machine objects {#ssec-machine-objects}

The following methods are available on machine objects:
The following methods are available on machine objects (like `machine` in
the examples above):

@PYTHON_MACHINE_METHODS@

### Testing user units {#ssec-testing-user-units}

To test user units declared by `systemd.user.services` the optional
`user` argument can be used:

@@ -162,54 +257,7 @@ machine.wait_for_unit("xautolock.service", "x-session-user")
This applies to `systemctl`, `get_unit_info`, `wait_for_unit`,
`start_job` and `stop_job`.

For faster dev cycles it's also possible to disable the code-linters
(this shouldn't be committed though):

```nix
{
  skipLint = true;
  nodes.machine =
    { config, pkgs, ... }:
    {
      # configuration…
    };

  testScript = ''
    Python code…
  '';
}
```

This will produce a Nix warning at evaluation time. To fully disable the
linter, wrap the test script in comment directives to disable the Black
linter directly (again, don't commit this within the Nixpkgs
repository):

```nix
{
  testScript = ''
    # fmt: off
    Python code…
    # fmt: on
  '';
}
```

Similarly, the type checking of test scripts can be disabled in the following
way:

```nix
{
  skipTypeCheck = true;
  nodes.machine =
    { config, pkgs, ... }:
    {
      # configuration…
    };
}
```

## Failing tests early {#ssec-failing-tests-early}
### Failing tests early {#ssec-failing-tests-early}

To fail tests early when certain invariants are no longer met (instead of waiting for the build to time out), the decorator `polling_condition` is provided. For example, if we are testing a program `foo` that should not quit after being started, we might write the following:

@@ -255,7 +303,7 @@ def foo_running():
    machine.succeed("pgrep -x foo")
```

## Adding Python packages to the test script {#ssec-python-packages-in-test-script}
### Adding Python packages to the test script {#ssec-python-packages-in-test-script}

When additional Python libraries are required in the test script, they can be
added using the parameter `extraPythonPackages`. For example, you could add
@@ -279,6 +327,61 @@ added using the parameter `extraPythonPackages`. For example, you could add

In that case, `numpy` is chosen from the generic `python3Packages`.
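
A sketch of such a test module (hypothetical name and script) could look like:

```nix
{
  name = "numpy-example";

  extraPythonPackages = p: [ p.numpy ];

  nodes.machine = { ... }: { };

  testScript = ''
    import numpy as np

    machine.wait_for_unit("multi-user.target")
    assert np.zeros(3).sum() == 0
  '';
}
```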

### Linting and type checking test scripts {#ssec-test-script-checks}

Test scripts are automatically linted with
[Pyflakes](https://pypi.org/project/pyflakes/) and type-checked with
[Mypy](https://mypy.readthedocs.io/en/stable/).
If there are any linting or type checking errors, the test will
fail to evaluate.

For faster dev cycles it's also possible to disable the code-linters
(this shouldn't be committed though):

```nix
{
  skipLint = true;
  nodes.machine =
    { config, pkgs, ... }:
    {
      # configuration…
    };

  testScript = ''
    Python code…
  '';
}
```

This will produce a Nix warning at evaluation time. To fully disable the
linter, wrap the test script in comment directives to disable the Black
linter directly (again, don't commit this within the Nixpkgs
repository):

```nix
{
  testScript = ''
    # fmt: off
    Python code…
    # fmt: on
  '';
}
```

Similarly, the type checking of test scripts can be disabled in the following
way:

```nix
{
  skipTypeCheck = true;
  nodes.machine =
    { config, pkgs, ... }:
    {
      # configuration…
    };
}
```

## Overriding a test {#sec-override-nixos-test}

The NixOS test framework returns tests with multiple overriding methods.
@@ -297,7 +400,7 @@ The NixOS test framework returns tests with multiple overriding methods.
:   Evaluates the test with additional NixOS modules and/or arguments.

    `module`
    :   A NixOS module to add to all the nodes in the test. Sets test option [`extraBaseModules`](#test-opt-extraBaseModules).
    :   A NixOS module to add to all the machines in the test. Sets test option [`extraBaseModules`](#test-opt-extraBaseModules).

    `specialArgs`
    :   An attribute set of arguments to pass to all NixOS modules. These override the existing arguments, as well as any `_module.args.<name>` that the modules may define. Sets test option [`node.specialArgs`](#test-opt-node.specialArgs).
@@ -345,29 +448,20 @@ list-id: test-options-list
source: @NIXOS_TEST_OPTIONS_JSON@
```

## Accessing VMs in the sandbox with SSH {#sec-test-sandbox-breakpoint}

::: {.note}
For debugging with SSH access into the machines, it's recommended to try using
[the interactive driver](#sec-running-nixos-tests-interactively) with its
[SSH backdoor](#sec-nixos-test-ssh-access) first.

This feature is mostly intended to debug flaky test failures that aren't
reproducible elsewhere.
:::

As explained in [](#sec-nixos-test-ssh-access), it's possible to configure an
SSH backdoor based on AF_VSOCK. This can be used to SSH into a VM of a running
build in a sandbox.
## Debugging test machines {#sec-test-sandbox-breakpoint}

This can be done when something in the test fails, e.g.
You can set the [`enableDebugHook`](#test-opt-enableDebugHook) option to pause
a test on the first failure and have it print instructions on how to enter the
sandbox shell of the test. Suppose you have the following test module:

```nix
{
  name = "foo";

  nodes.machine = { };

  sshBackdoor.enable = true;
  enableDebugHook = true;
  sshBackdoor.enable = true;

  testScript = ''
    start_all()
@@ -376,31 +470,76 @@ This can be done when something in the test fails, e.g.
}
```

For the AF_VSOCK feature to work, `/dev/vhost-vsock` is needed in the sandbox
which can be done with e.g.
The test will fail with an output like this:

```
nix-build -A nixosTests.foo --option sandbox-paths /dev/vhost-vsock
vm-test-run-foo> !!! Breakpoint reached, run 'sudo /nix/store/eeeee-attach/bin/attach <PID>'
```

This will halt the test execution on a test-failure and print instructions
on how to enter the sandbox shell of the VM test. Inside, one can log into
e.g. `machine` with
You can then enter the sandbox shell:

```
ssh -F ./ssh_config vsock/3
$ sudo /nix/store/eeeee-attach/bin/attach <PID>
bash#
```

There, you can attach to a [`pdb`](https://docs.python.org/3/library/pdb.html) session
to step through the Python test script:

```
bash# telnet 127.0.0.1 4444
pdb$
```

Note that it is also possible to set breakpoints in the test script using `debug.breakpoint()`.
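
A sketch of such a breakpoint, assuming a node named `machine` and the debug hook enabled as above:

```nix
{
  enableDebugHook = true;

  nodes.machine = { };

  testScript = ''
    start_all()
    machine.wait_for_unit("multi-user.target")
    # Execution pauses here; attach to the sandbox and connect pdb
    # via `telnet 127.0.0.1 4444`:
    debug.breakpoint()
    machine.succeed("true")
  '';
}
```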

### SSH access to test VMs {#sec-test-vm-ssh-access}

::: {.note}
For debugging with SSH access into the machines, it's recommended to try using
[the interactive driver](#sec-running-nixos-tests-interactively) with its
[SSH backdoor](#sec-nixos-test-ssh-access) first.

This feature is mostly intended to debug flaky test failures that aren't
reproducible elsewhere.
:::

If you set the [`sshBackdoor.enable`](#test-opt-sshBackdoor.enable) option,
QEMU virtual machines will open an SSH backdoor based on AF_VSOCK
(see [](#sec-nixos-test-ssh-access)).
Once you are in the sandbox shell, you can access the VMs (for example, `machine`)
with SSH over vsock:

```
bash# ssh -F ./ssh_config vsock/3
```

For the AF_VSOCK feature to work, `/dev/vhost-vsock` is needed in the sandbox
which can be done with e.g.

```
nix-build -A nixosTests.foo --option sandbox-paths /dev/vhost-vsock
```

As described in [](#sec-nixos-test-ssh-access), the numbers for vsock start at
`3` instead of `1`. So the first VM in the network (sorted alphabetically) can
be accessed with `vsock/3`.

Alternatively, it's possible to explicitly set a breakpoint with
`debug.breakpoint()`. This also has the benefit, that one can step through
`testScript` with `pdb` like this:
### SSH access to test containers {#sec-test-container-ssh-access}

If you set the [`sshBackdoor.enable`](#test-opt-sshBackdoor.enable) option,
each `systemd-nspawn` container will open an SSH backdoor.
Once the container starts,
it will print instructions on how to log into the container via SSH.
If the test fails,
attach to the sandbox as described above,
and then use the provided SSH command to log into the container.
For example:

```
$ sudo /nix/store/eeeee-attach <id>
bash# telnet 127.0.0.1 4444
pdb$ …
$ sudo /nix/store/eeeee-attach/bin/attach <PID>
bash# ssh -o User=root -o ProxyCommand="socat - UNIX-CLIENT:/run/systemd/nspawn/unix-export/machine/ssh" machine
[root@machine:~]# hostname
machine
```
+52 −3
@@ -88,6 +88,9 @@
  "module-virtualisation-xen-introduction": [
    "index.html#module-virtualisation-xen-introduction"
  ],
  "sec-nixos-test-vms-vs-containers": [
    "index.html#sec-nixos-test-vms-vs-containers"
  ],
  "sec-override-nixos-test": [
    "index.html#sec-override-nixos-test"
  ],
@@ -100,6 +103,51 @@
  "sec-wireless-imperative": [
    "index.html#sec-wireless-imperative"
  ],
  "sec-test-container-ssh-access": [
    "index.html#sec-test-container-ssh-access"
  ],
  "sec-test-vm-ssh-access": [
    "index.html#sec-test-vm-ssh-access"
  ],
  "ssec-all-machine-objects": [
    "index.html#ssec-all-machine-objects"
  ],
  "ssec-nixos-test-machines": [
    "index.html#ssec-nixos-test-machines"
  ],
  "ssec-nixos-test-nspawn-containers": [
    "index.html#ssec-nixos-test-nspawn-containers"
  ],
  "ssec-nixos-test-qemu-vms": [
    "index.html#ssec-nixos-test-qemu-vms"
  ],
  "ssec-nspawn-machine-objects": [
    "index.html#ssec-nspawn-machine-objects"
  ],
  "ssec-qemu-machine-objects": [
    "index.html#ssec-qemu-machine-objects"
  ],
  "ssec-test-script": [
    "index.html#ssec-test-script"
  ],
  "ssec-test-script-checks": [
    "index.html#ssec-test-script-checks"
  ],
  "ssec-testing-user-units": [
    "index.html#ssec-testing-user-units"
  ],
  "test-opt-containerDefaults": [
    "index.html#test-opt-containerDefaults"
  ],
  "test-opt-containers": [
    "index.html#test-opt-containers"
  ],
  "test-opt-extraBaseModules": [
    "index.html#test-opt-extraBaseModules"
  ],
  "test-opt-nodeDefaults": [
    "index.html#test-opt-nodeDefaults"
  ],
  "test-opt-rawTestDerivationArg": [
    "index.html#test-opt-rawTestDerivationArg"
  ],
@@ -2037,7 +2085,8 @@
  "sec-call-nixos-test-outside-nixos": [
    "index.html#sec-call-nixos-test-outside-nixos"
  ],
  "sec-nixos-test-nodes": [
  "sec-nixos-test-machines-config": [
    "index.html#sec-nixos-test-machines-config",
    "index.html#sec-nixos-test-nodes"
  ],
  "ssec-machine-objects": [
@@ -2070,8 +2119,8 @@
  "test-opt-enableOCR": [
    "index.html#test-opt-enableOCR"
  ],
  "test-opt-extraBaseModules": [
    "index.html#test-opt-extraBaseModules"
  "test-opt-extraBaseNodeModules": [
    "index.html#test-opt-extraBaseNodeModules"
  ],
  "test-opt-extraDriverArgs": [
    "index.html#test-opt-extraDriverArgs"
+48 −14

File changed.

Preview size limit exceeded, changes collapsed.
