test(snap): add spread integration tests#1465

Open
zyga wants to merge 3 commits into NVIDIA:main from zyga:feat/spread-tests

Conversation

Contributor

@zyga zyga commented May 19, 2026

Summary

Add spread smoke test suite for the snap package.

Related Issue

N/A

Changes

  • Add spread.yaml using the image-garden ad-hoc backend (which uses qemu internally)
  • Add .image-garden.mk with one-time init logic for each system image
  • Add spread tests/ directory with one smoke test that creates a sandbox

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

Please let me know if I should update architecture docs. The next step after this is to wire this all to GitHub to gate snap publishing on the smoke test passing.

@zyga zyga requested review from a team, derekwaynecarr, maxamillion and mrunalp as code owners May 19, 2026 23:04

copy-pr-bot Bot commented May 19, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@zyga zyga force-pushed the feat/spread-tests branch from d7fb63c to 064565a Compare May 19, 2026 23:06
The integration test suite runs on typical Linux distribution images and
ensures that openshell can create a sandbox and execute a hello-world
program inside.

Tests have two components:

1) The `image-garden` program uses built-in rules, as well as
`.image-garden.mk` to prepare virtual machine images for testing. The
virtual machines are vanilla cloud images booted once with a cloud-init
profile that prepares them for testing. At runtime garden downloads images
to ~/.cache/garden/dl or $SNAP_USER_COMMON/cache/dl (when using the snap)
and then saves customized differential images in `.image-garden/` in the
project directory.

In practice the pre-created environment has snapd, the docker snap and the
"ghcr.io/nvidia/openshell-community/sandboxes/base:latest" docker image
pre-pulled for faster test iteration.

2) The `spread` program uses `spread.yaml` and a collection of `task.yaml`
files to run tests. The top-level file defines the set of test systems,
contains project-wide preparation logic where we install the snap, and
defines a single test suite which corresponds to the `tests/` directory.

An initial smoke test that creates a sandbox and ensures it can run a shell
hello world is provided. This ensures that the locally built snap really
works on the set of test environments:

- centos-cloud-10
- debian-cloud-13
- fedora-cloud-44
- ubuntu-cloud-24.04
- ubuntu-cloud-26.04

Those tests are typically used with the `image-garden` snap, which also
includes spread: https://snapcraft.io/image-garden/

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
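
The download cache rule described in the commit message above can be sketched in shell. This is only an illustration of the stated rule, not code taken from image-garden: the function name `garden_dl_dir` is hypothetical, and `SNAP_USER_COMMON` is normally set only inside a snap's own runtime environment, so the check is an assumption about where the sketch would run.

```shell
#!/bin/sh
# Hypothetical helper: print the directory where downloaded base images
# are expected to be cached, following the rule in the commit message.
garden_dl_dir() {
    if [ -n "${SNAP_USER_COMMON:-}" ]; then
        # Snap case: the cache lives under the snap's per-user common directory.
        printf '%s\n' "$SNAP_USER_COMMON/cache/dl"
    else
        # Non-snap case: the cache lives under the user's home directory.
        printf '%s\n' "$HOME/.cache/garden/dl"
    fi
}

echo "base image downloads: $(garden_dl_dir)"
```

The customized differential images are a separate matter: per the commit message they are saved in `.image-garden/` inside the project directory, so a CI job that wants to reuse prepared images would presumably need to persist that directory as well.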
@zyga zyga force-pushed the feat/spread-tests branch from 064565a to 6e77fc0 Compare May 19, 2026 23:15
@TaylorMutch
Collaborator

/ok to test 6e77fc0

zyga added 2 commits May 20, 2026 09:07
We need to give docker a moment to finish setting up. We can also add a
docker system group while we are at it. With this setting it is worth
keeping prepared images around in CI; the time advantage is worth it.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
The CentOS 10 cloud image has a truncated checksum file, causing each
download to fail validation. Let's remove it from spread.yaml while keeping
the .image-garden.mk entry for another day.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
@TaylorMutch
Collaborator

/ok to test 551d47f

@TaylorMutch TaylorMutch self-assigned this May 20, 2026
Collaborator

@drew drew left a comment

Can you explain a little more how this is intended to be used?

Comment thread .image-garden/.gitignore
Collaborator

can we instead add .image-garden/* to the root .gitignore?

Comment thread .image-garden.mk
Collaborator

how is this used?

Contributor Author

When image-garden make or image-garden allocate is used, it loads this file to compute the cloud-init profile to give to the VM. In practice this defines what is in the image that is perhaps cached by the CI stack. Locally it plays a larger role as the cached image is simply in the workspace (no separate caching step). This effectively allows iteration with one-off preparation costs paid once, regardless of how many test cycles you have.

Comment thread tests/smoke/task.yaml
Collaborator

how is this used? do you see this getting expanded for non-snap use cases? i'd prefer to keep snap-specific infra in deploy/snap for now. we have an existing install canary that we run in .github/workflows/release-canary.yml.

Contributor Author

This is one of the files that spread loads to understand test cases and execute them.

I would love to expand this to cover the non-snap case (note that in prepare we can simply install the pre-built binaries from target/release). Spread is used heavily across many of our products, as it allows realistic full-system testing on many releases of Ubuntu (with the real kernel, with the real bugs people would face) and extends this testing to other distributions (this is what garden provides: distribution images).

Here we only have one test case but in our products we aim to have full coverage of all interactions expressed as spread tests. They provide invaluable support in ensuring quality and in preventing regressions in complex products.
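
As a rough illustration of how the single smoke test fans out across systems, spread addresses each job with a selector of the form `backend:system:task`. The sketch below prints one selector per system that remains after centos-cloud-10 was dropped. The backend name `garden` is an assumption (the contents of spread.yaml are not shown in this conversation), and `spread_jobs` is a hypothetical helper, not part of spread.

```shell
#!/bin/sh
# Systems listed in the commit message, minus centos-cloud-10, which a
# later commit removed from spread.yaml.
SYSTEMS="debian-cloud-13 fedora-cloud-44 ubuntu-cloud-24.04 ubuntu-cloud-26.04"

# Assumed backend name; substitute whatever spread.yaml actually declares.
BACKEND="${BACKEND:-garden}"

# Print one spread job selector per system for the tests/smoke task.
spread_jobs() {
    for system in $SYSTEMS; do
        printf '%s:%s:tests/smoke\n' "$BACKEND" "$system"
    done
}

spread_jobs
```

Any one of the printed selectors could then be passed to `spread` as an argument to run only that job, which keeps local iteration on a single system short.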

Collaborator

drew commented May 22, 2026

Cross-linking this discussion here: #1494 (comment).


3 participants