Commit 29ce0536 authored by Connor Baker

cudaPackages: build redists from manifests

parent 3ad342b2
+29 −28
@@ -10,39 +10,40 @@ package set by [cuda-packages.nix](../../top-level/cuda-packages.nix).

## Top-level directories

- `cuda`: CUDA redistributables! Provides extension to `cudaPackages` scope.
- `cudatoolkit`: monolithic CUDA Toolkit run-file installer. Provides extension
    to `cudaPackages` scope.
- `cudnn`: NVIDIA cuDNN library.
- `cutensor`: NVIDIA cuTENSOR library.
- `fixups`: Each file or directory (excluding `default.nix`) should contain a
    `callPackage`-able expression to be provided to the `overrideAttrs` attribute
    of a package produced by the generic manifest builder.
    These fixups are applied by `pname`, so packages with multiple versions
    (e.g., `cudnn`, `cudnn_8_9`, etc.) all share a single fixup function
    (i.e., `fixups/cudnn.nix`).
- `generic-builders`:
  - Contains a builder `manifest.nix` which operates on the `Manifest` type
      defined in `modules/generic/manifests`. Most packages are built using this
      builder.
  - Contains a builder `multiplex.nix` which leverages the Manifest builder. In
      short, the Multiplex builder adds multiple versions of a single package to
      a single instance of the CUDA Packages package set. It is used primarily for
      packages like `cudnn` and `cutensor`.
- `modules`: Nixpkgs modules to check the shape and content of CUDA
    redistributable and feature manifests. These modules additionally use shims
    provided by some CUDA packages to allow them to re-use the
    `genericManifestBuilder`, even if they don't have manifest files of their
    own. `cudnn` and `tensorrt` are examples of packages which provide such
    shims. These modules are further described in the
    [Modules](./modules/README.md) documentation.
- `_cuda`: Fixed-point used to configure, construct, and extend the CUDA package
    set. This includes NVIDIA manifests.
- `buildRedist`: Contains the logic to build packages using NVIDIA's manifests.
- `packages`: Contains packages which exist in every instance of the CUDA
    package set. These packages are built in a `by-name` fashion.
- `setup-hooks`: Nixpkgs setup hooks for CUDA.
- `tensorrt`: NVIDIA TensorRT library.
- `tests`: Contains tests which can be run against the CUDA package set.
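
The `fixups` entry above says fixups are looked up by `pname`, so several attribute names can share one fixup. A minimal sketch of that lookup in plain Nix follows. It is not the actual builder code: `fixups`, `applyFixup`, the version strings, and the ignored library name are hypothetical placeholders, and attribute-set merging stands in for `overrideAttrs`.

```nix
# Evaluate with: nix-instantiate --eval --strict sketch.nix
let
  # Hypothetical fixup table keyed by pname (stand-in for files such as fixups/cudnn.nix).
  fixups = {
    cudnn = prevAttrs: {
      # Illustrative only: the library name here is a placeholder.
      autoPatchelfIgnoreMissingDeps = prevAttrs.autoPatchelfIgnoreMissingDeps or [ ] ++ [
        "libexample.so"
      ];
    };
  };

  # Hypothetical helper: pick the fixup for this pname (or a no-op) and merge its
  # result over the original attributes, roughly as overrideAttrs would.
  applyFixup =
    attrs:
    let
      fixup = fixups.${attrs.pname} or (_: { });
    in
    attrs // fixup attrs;
in
{
  # Two attribute names, one pname: both receive the same fixup.
  cudnn = applyFixup {
    pname = "cudnn";
    version = "placeholder-new";
  };
  cudnn_8_9 = applyFixup {
    pname = "cudnn";
    version = "placeholder-old";
  };
}
```

Keying on `pname` rather than on the attribute name means a new versioned attribute needs no fixup file of its own.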

Many redistributable packages are in the `packages` directory. Their presence
ensures that, even if a CUDA package set which no longer includes a given package
is being constructed, the attribute for that package will still exist (but refer
to a broken package). This prevents missing attribute errors as the package set
evolves.
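
The effect described above can be sketched in plain Nix. This assumes the "broken package" is modelled with `meta.broken`. The names `knownPackages`, `inManifest`, and `mkPackage` are hypothetical and do not come from the package set implementation.

```nix
# Evaluate with: nix-instantiate --eval --strict sketch.nix
let
  # Hypothetical: names that every package set instance exposes.
  knownPackages = [
    "cuda_cudart"
    "cuda_nvcc"
    "cuda_compat"
  ];

  # Hypothetical: names that the manifest for one particular release provides.
  inManifest = [
    "cuda_cudart"
    "cuda_nvcc"
  ];

  # Every known name yields an attribute; names missing from the manifest are
  # marked broken rather than dropped.
  mkPackage = pname: {
    inherit pname;
    meta.broken = !(builtins.elem pname inManifest);
  };
in
builtins.listToAttrs (
  map (pname: {
    name = pname;
    value = mkPackage pname;
  }) knownPackages
)
```

In this sketch, selecting `cuda_compat` still evaluates and gives a value whose `meta.broken` is `true`, so there is no missing-attribute error.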

## Distinguished packages

Some packages are purposefully not in the `packages` directory. These are packages
which do not make sense for Nixpkgs, require further investigation, or are otherwise
not straightforward to include. These packages are:

- `cuda`:
  - `collectx_bringup`: missing `libssl.so.1.1` and `libcrypto.so.1.1`; not sure how
    to provide them or what the package does.
  - `cuda_sandbox_dev`: unclear on purpose.
  - `driver_assistant`: we don't use the drivers from the CUDA releases; irrelevant.
  - `mft_autocomplete`: unsure of purpose; contains FHS paths.
  - `mft_oem`: unsure of purpose; contains FHS paths.
  - `mft`: unsure of purpose; contains FHS paths.
  - `nvidia_driver`: we don't use the drivers from the CUDA releases; irrelevant.
  - `nvlsm`: contains FHS paths.
- `cublasmp`:
  - `libcublasmp`: `nvshmem` isn't packaged.
- `cudnn`:
  - `cudnn_samples`: requires FreeImage, which is abandoned and not packaged.

### CUDA Compatibility

[CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/),
+1 −1
@@ -55,7 +55,7 @@
    };

    # No changes from 12.8 to 12.9
    # https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#host-compiler-support-policy
    # https://docs.nvidia.com/cuda/archive/12.9.1/cuda-installation-guide-linux/index.html#host-compiler-support-policy
    "12.9" = {
      clang = {
        maxMajorVersion = "19";
+3 −5
# The _cuda attribute set is a fixed-point which contains the static functionality required to construct CUDA package
# sets. For example, `_cuda.bootstrapData` includes information about NVIDIA's redistributables (such as the names
# NVIDIA uses for different systems), `_cuda.lib` contains utility functions like `formatCapabilities` (which generate
# common arguments passed to NVCC and `cmakeFlags`), and `_cuda.fixups` contains `callPackage`-able functions which
# are provided to the corresponding package's `overrideAttrs` attribute to provide package-specific fixups
# out of scope of the generic redistributable builder.
# NVIDIA uses for different systems), and `_cuda.lib` contains utility functions like `formatCapabilities` (which generate
# common arguments passed to NVCC and `cmakeFlags`).
#
# Since this attribute set is used to construct the CUDA package sets, it must exist outside the fixed point of the
# package sets. Making these attributes available directly in the package set construction could cause confusion if
@@ -23,7 +21,7 @@ lib.fixedPoints.makeExtensible (final: {
    inherit lib;
  };
  extensions = [ ]; # Extensions applied to every CUDA package set.
  fixups = import ./fixups { inherit lib; };
  manifests = import ./manifests { inherit lib; };
  lib = import ./lib {
    _cuda = final;
    inherit lib;
+0 −12
{ flags, lib }:
prevAttrs: {
  autoPatchelfIgnoreMissingDeps = prevAttrs.autoPatchelfIgnoreMissingDeps or [ ] ++ [
    "libnvrm_gpu.so"
    "libnvrm_mem.so"
    "libnvdla_runtime.so"
  ];
  # `cuda_compat` only works on aarch64-linux, and only when building for Jetson devices.
  badPlatformsConditions = prevAttrs.badPlatformsConditions or { } // {
    "Trying to use cuda_compat on aarch64-linux targeting non-Jetson devices" = !flags.isJetsonBuild;
  };
}
+0 −37
# TODO(@connorbaker): cuda_cudart.dev depends on crt/host_config.h, which is from
# (getDev cuda_nvcc). It would be nice to be able to encode that.
{ addDriverRunpath, lib }:
prevAttrs: {
  # Remove once cuda-find-redist-features has a special case for libcuda
  outputs =
    prevAttrs.outputs or [ ]
    ++ lib.lists.optionals (!(builtins.elem "stubs" prevAttrs.outputs)) [ "stubs" ];

  allowFHSReferences = false;

  # The libcuda stub's pkg-config doesn't follow the general pattern:
  postPatch =
    prevAttrs.postPatch or ""
    + ''
      while IFS= read -r -d $'\0' path; do
        sed -i \
          -e "s|^libdir\s*=.*/lib\$|libdir=''${!outputLib}/lib/stubs|" \
          -e "s|^Libs\s*:\(.*\)\$|Libs: \1 -Wl,-rpath,${addDriverRunpath.driverLink}/lib|" \
          "$path"
      done < <(find -iname 'cuda-*.pc' -print0)
    ''
    # Namelink may not be enough, add a soname.
    # Cf. https://gitlab.kitware.com/cmake/cmake/-/issues/25536
    + ''
      if [[ -f lib/stubs/libcuda.so && ! -f lib/stubs/libcuda.so.1 ]]; then
        ln -s libcuda.so lib/stubs/libcuda.so.1
      fi
    '';

  postFixup = prevAttrs.postFixup or "" + ''
    mv "''${!outputDev}/share" "''${!outputDev}/lib"
    moveToOutput lib/stubs "$stubs"
    ln -s "$stubs"/lib/stubs/* "$stubs"/lib/
    ln -s "$stubs"/lib/stubs "''${!outputLib}/lib/stubs"
  '';
}