Testing

Unit tests

Run all unit tests with race detection and coverage:

make test

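This runs gotestsum, which reruns failed tests to absorb flakiness. With --rerun-fails-max-failures=5, nothing is rerun if more than 5 tests fail in the initial run, so a genuinely broken build still fails fast: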

gotestsum --format pkgname \
  --rerun-fails --rerun-fails-max-failures=5 \
  --packages="./api/... ./cmd/... ./internal/..." \
  -- -race -timeout=10m \
  -coverpkg=./internal/... \
  -coverprofile=coverage.out \
  -covermode=atomic

View the coverage report:

go tool cover -html=coverage.out

Coverage requirements

The project requires 80%+ line coverage for internal/ packages. CI enforces this threshold and fails if coverage drops below it.
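
The threshold check can be reproduced locally from coverage.out. The following is a minimal sketch, not the project's CI script: it assumes the standard Go cover profile format (`file:start,end numStatements hitCount`), and the function name `statementCoverage` and the `example.com/attune` module path are hypothetical. Go profiles record statements rather than lines, so the figure is statement coverage. Because -coverpkg can emit the same block once per test binary, duplicate blocks are merged before counting.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// statementCoverage returns the percentage of statements covered in a Go
// cover profile. A block that appears more than once is counted once and is
// treated as covered if any of its entries has a non-zero hit count.
func statementCoverage(profile string) (float64, error) {
	type block struct {
		stmts   int
		covered bool
	}
	blocks := map[string]*block{}

	sc := bufio.NewScanner(strings.NewReader(profile))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "mode:") {
			continue
		}
		// Expected shape: file.go:startLine.startCol,endLine.endCol numStmts hitCount
		fields := strings.Fields(line)
		if len(fields) != 3 {
			return 0, fmt.Errorf("malformed profile line: %q", line)
		}
		stmts, err := strconv.Atoi(fields[1])
		if err != nil {
			return 0, fmt.Errorf("bad statement count in %q: %w", line, err)
		}
		hits, err := strconv.ParseInt(fields[2], 10, 64)
		if err != nil {
			return 0, fmt.Errorf("bad hit count in %q: %w", line, err)
		}
		b, ok := blocks[fields[0]]
		if !ok {
			b = &block{stmts: stmts}
			blocks[fields[0]] = b
		}
		if hits > 0 {
			b.covered = true
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}

	total, covered := 0, 0
	for _, b := range blocks {
		total += b.stmts
		if b.covered {
			covered += b.stmts
		}
	}
	if total == 0 {
		return 0, fmt.Errorf("profile contains no statements")
	}
	return 100 * float64(covered) / float64(total), nil
}

func main() {
	const threshold = 80.0 // the documented minimum for internal/ packages

	// Hypothetical profile contents; in practice, read coverage.out from disk.
	profile := `mode: atomic
example.com/attune/internal/a.go:10.2,12.3 3 4
example.com/attune/internal/a.go:14.2,15.3 1 0
example.com/attune/internal/a.go:10.2,12.3 3 0
example.com/attune/internal/b.go:5.1,9.2 4 1
`
	pct, err := statementCoverage(profile)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("coverage: %.1f%% (threshold %.0f%%)\n", pct, threshold)
	if pct < threshold {
		fmt.Println("coverage is below the threshold")
	}
}
```

For a quick check without custom code, `go tool cover -func=coverage.out` prints per-function figures and a total.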

Integration tests (envtest)

Integration tests use controller-runtime's envtest to run a real API server and etcd locally without a full cluster:

make test-integration

This installs the setup-envtest tool if needed, downloads the Kubernetes binaries, and runs:

KUBEBUILDER_ASSETS="$(setup-envtest use -p path)" \
  go test ./test/integration/... -race -count=1 -timeout=15m -tags=integration

Integration tests verify the full reconciliation loop: creating an AttunePolicy, injecting mock metrics, and asserting that status is updated correctly.

E2E tests (Chainsaw)

End-to-end tests run against a real Kubernetes cluster using Chainsaw. They deploy actual Deployments and AttunePolicy resources and verify the operator behaves correctly.

Prerequisites

Running E2E tests from scratch

# Recommended: k3d, because CI and nightly workflows run on k3d/K3s
make k3d-create
make k3d-deploy IMG=attune:e2e
make test-e2e
make test-e2e-go
make k3d-delete

# Alternative: Kind (supported, but local-only and not the default CI path)
make kind-create
make kind-deploy IMG=attune:e2e
make test-e2e
make test-e2e-go
make kind-delete

Before running the E2E suites, verify that your current kubeconfig context points at the cluster you just created and that the API server is reachable:

kubectl config current-context
kubectl cluster-info

If kubectl cluster-info fails or still points at an old context, switch contexts before running make test-e2e or make test-e2e-go.

Fast smoke check

Use this when you want to verify that the local end-to-end flow basically works without running the full E2E suites:

make test-local-smoke

This target provisions a disposable k3d cluster, deploys cert-manager, Prometheus, and the operator, then runs:

  • test/e2e/oneshot-resize in Chainsaw
  • TestE2E_OneShotMode_ResizesOnePod in test/e2e-go/

For a pre-provisioned cluster, the equivalent minimal smoke suite is:

make test-e2e-smoke

Test scenarios

Directory | Mode | What it verifies
test/e2e/recommend-mode/ | Recommend | Discovers workloads, reaches InsufficientData
test/e2e/observe-mode/ | Observe | Reaches InsufficientData without resizing pods
test/e2e/oneshot-resize/ | OneShot | Discovers a workload and performs a one-shot resize
test/e2e/canary-rollout/ | Canary | Performs a canary resize on a rollout-managed deployment
test/e2e/auto-mode/ | Auto | Discovers workloads and performs automatic resizes
test/e2e/bootstrap-progress/ | Recommend | Reports InsufficientData progress and ETA while metrics bootstrap
test/e2e/statefulset-target/ | StatefulSet | Discovers a StatefulSet workload
test/e2e/daemonset-target/ | DaemonSet | Discovers a DaemonSet workload
test/e2e/cronjob-target/ | CronJob | Discovers a CronJob workload (recommend-only)
test/e2e/job-target/ | Job | Discovers a standalone Job workload (recommend-only)
test/e2e/opt-out/ | (cross-cutting) | attune.io/skip annotation is respected
test/e2e/exclude-containers/ | (cross-cutting) | excludedContainers skips sidecars
test/e2e/multi-selector/ | (cross-cutting) | Label selector matches multiple deployments
test/e2e/eviction-fallback/ | (cross-cutting) | InPlaceOrRecreate is accepted and still resizes workloads (in-place path)
test/e2e/schedule-window/ | (cross-cutting) | Schedule windows block resizes outside the allowed time
test/e2e/budget-caps/ | (cross-cutting) | Budget caps are accepted and the policy still resizes workloads
test/e2e/concurrent-resize/ | (cross-cutting) | maxConcurrentResizes is accepted and workloads still resize
test/e2e/namespace-defaults/ | (cross-cutting) | AttuneNamespaceDefaults overrides cluster defaults
test/e2e/defaults-merge/ | (cross-cutting) | AttuneDefaults values are inherited by a policy that omits them
test/e2e/hpa-conflict/ | (cross-cutting) | HPA conflict is warning-only, policy still reconciles
test/e2e/vpa-conflict/ | (cross-cutting) | VPA conflict is warning-only, policy still reconciles
test/e2e/hpa-auto-tune/ | (cross-cutting) | Auto-tunes HPA CPU target utilization when annotated
test/e2e/policy-weight/ | (cross-cutting) | Higher-weight policy outranks lower-weight on the same workload
test/e2e/requests-only/ | (cross-cutting) | controlledValues: RequestsOnly is accepted and discovers workloads
test/e2e/query-parameters/ | (cross-cutting) | Prometheus query parameters are accepted without breaking queries
test/e2e/startup-boost/ | (cross-cutting) | CPU startup boost is applied to new pods
test/e2e/configmap-export/ | (cross-cutting) | Recommendations are exported to a ConfigMap
test/e2e/prometheus-unreachable/ | (cross-cutting) | Handles unreachable Prometheus gracefully without crashing
test/e2e/grafana-dashboard/ | (helm) | Dashboard ConfigMap renders with grafanaDashboard.enabled
test/e2e/health-probes/ | (infra) | Liveness and readiness probes pass
test/e2e/metrics-endpoint/ | (infra) | Prometheus metrics endpoint is exposed
test/e2e/webhook-defaulting/ | (webhook) | Mutating webhook applies defaults
test/e2e/webhook-validation/ | (webhook) | Rejects invalid overhead and negative cooldown
test/e2e/webhook-schedule-validation/ | (webhook) | Rejects invalid timezone, day, and window time
test/e2e/defaults-validation/ | (webhook) | Rejects invalid AttuneDefaults

Writing new E2E tests

Create a directory under test/e2e/<scenario-name>/ with a chainsaw-test.yaml file. Follow the existing pattern: create a namespace, deploy a workload, create a policy, assert on status.

Chainsaw configuration is in .chainsaw.yaml (timeouts, parallelism).

Warning

E2E tests modify cluster state. Always run them against a disposable local cluster (k3d or Kind), not a shared environment.

Fuzz testing

Fuzz tests exercise the recommendation engine and webhook validation with random inputs to catch panics and edge cases:

make test-fuzz

This runs each fuzz target for 30 seconds using Go's native coverage-guided fuzzing:

go test ./internal/recommendation/... -run='^$' -fuzz=FuzzPercentileEstimator -fuzztime=30s
go test ./internal/recommendation/... -run='^$' -fuzz=FuzzRecommendationEngine -fuzztime=30s
go test ./internal/webhook/...        -run='^$' -fuzz=FuzzValidateFloatFields  -fuzztime=30s

Fuzz targets are defined in internal/recommendation/fuzz_test.go (estimator and engine) and internal/webhook/validation_test.go (float-field parsing via strconv.ParseFloat).
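
For orientation, a native Go fuzz target for float-field validation might look roughly like the sketch below. It is illustrative only and not the project's FuzzValidateFloatFields: the validator `validateFloatField`, the invariant helper `checkAccepted`, and the seed list are all hypothetical. The property is that any string the validator accepts must parse to a finite, non-negative number, and that no input causes a panic. In a real repository the Fuzz function would live in a _test.go file; `main` is included here only so the sketch runs on its own.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"testing"
)

// validateFloatField is a hypothetical validator: it accepts a string only
// if it parses to a finite, non-negative float64.
func validateFloatField(s string) error {
	v, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return fmt.Errorf("not a number: %w", err)
	}
	// ParseFloat accepts "NaN" and "Inf" without error, so reject them here.
	if math.IsNaN(v) || math.IsInf(v, 0) {
		return fmt.Errorf("must be finite, got %v", v)
	}
	if v < 0 {
		return fmt.Errorf("must be non-negative, got %v", v)
	}
	return nil
}

// checkAccepted states the invariant the fuzzer enforces: rejecting an input
// is always allowed, but an accepted input must be finite and non-negative.
func checkAccepted(s string) error {
	if validateFloatField(s) != nil {
		return nil
	}
	v, err := strconv.ParseFloat(s, 64)
	if err != nil || math.IsNaN(v) || math.IsInf(v, 0) || v < 0 {
		return fmt.Errorf("validator accepted invalid input %q", s)
	}
	return nil
}

// seeds form the initial corpus; the fuzzer mutates them to find new inputs.
var seeds = []string{"0.25", "-1", "NaN", "Inf", "1e999", ""}

// FuzzValidateFloatField is the shape of the target as `go test -fuzz` runs it.
func FuzzValidateFloatField(f *testing.F) {
	for _, s := range seeds {
		f.Add(s)
	}
	f.Fuzz(func(t *testing.T, s string) {
		if err := checkAccepted(s); err != nil {
			t.Error(err)
		}
	})
}

func main() {
	// Exercise the seeds directly so the sketch runs without `go test`.
	for _, s := range seeds {
		fmt.Printf("%q accepted=%v invariantHolds=%v\n",
			s, validateFloatField(s) == nil, checkAccepted(s) == nil)
	}
}
```

Keeping the invariant in a plain helper such as `checkAccepted` means the same property can also be asserted in ordinary unit tests against known edge cases.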

Running all tests

Run everything in one command:

make test-all         # all tiers against a pre-provisioned cluster with operator + Prometheus
make test-local       # provisions k3d, deploys the stack, then runs all tiers
make test-local-smoke # provisions k3d, deploys the stack, then runs the smoke suite only

Or run each tier separately:

make test              # unit tests only
make test-integration  # integration tests (envtest)
make test-e2e          # Chainsaw E2E (requires local k3d or Kind cluster)
make test-e2e-go       # Go E2E (requires local k3d or Kind cluster with Prometheus)
make test-e2e-smoke    # one Chainsaw scenario + one Go E2E smoke test

For a full local validation including lint, helm, and CRD freshness:

make verify        # all CI checks locally

Test organization

Directory | Type | Framework
api/v1alpha1/*_test.go | Unit | Go testing
internal/**/*_test.go | Unit | Go testing + testify
internal/**/*_benchmark_test.go | Benchmark | Go testing (make test-bench)
test/integration/ | Integration | envtest
test/e2e/ | E2E (Chainsaw) | Chainsaw (make test-e2e)
test/e2e-go/ | E2E (Go) | Go testing + real cluster (make test-e2e-go)
internal/recommendation/fuzz_test.go | Fuzz | Go native fuzzing (make test-fuzz)
internal/webhook/validation_test.go (FuzzValidateFloatFields) | Fuzz | Go native fuzzing (make test-fuzz)

Full Go E2E suite

make test-e2e-go runs the full Go E2E suite, including the longer Prometheus warm-up scenarios that cover budget caps, schedule windows, bearer-token auth, eviction fallback, realistic overprovisioned workloads, secret rotation, recommendation retention without live pods, and OOM-triggered safety reverts.

Expect 5-10 minutes of total runtime for the Go E2E portion because these scenarios wait for real Prometheus samples and operator reconciles. The nightly workflow runs the same suite across the full Kubernetes version matrix.

Writing new tests

  • Place unit tests next to the code they test (foo_test.go alongside foo.go).
  • Use testify/assert and testify/require for assertions.
  • Use table-driven tests for functions with multiple input/output scenarios.
  • Mock the MetricsCollector interface for tests that need Prometheus data.
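
The last two guidelines can be combined as in the following sketch. It is an illustration under stated assumptions, not code from the repository: the `MetricsCollector` method shown (`CPUSamples`), the function under test (`peakCPU`), and both sentinel errors are hypothetical, since the real interface is not reproduced here. To stay self-contained it uses only the standard library; in the project, the comparisons inside the test would use testify/require and testify/assert instead. `main` exists only so the sketch runs outside `go test`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"testing"
)

// MetricsCollector is a hypothetical stand-in for the project's interface;
// the real one may expose different methods.
type MetricsCollector interface {
	CPUSamples(ctx context.Context, workload string) ([]float64, error)
}

var (
	errInsufficientData = errors.New("insufficient data")
	errUnreachable      = errors.New("prometheus unreachable")
)

// peakCPU is the hypothetical function under test: it returns the largest
// CPU sample the collector reports for a workload.
func peakCPU(ctx context.Context, c MetricsCollector, workload string) (float64, error) {
	samples, err := c.CPUSamples(ctx, workload)
	if err != nil {
		return 0, fmt.Errorf("collecting samples: %w", err)
	}
	if len(samples) == 0 {
		return 0, errInsufficientData
	}
	peak := samples[0]
	for _, s := range samples[1:] {
		if s > peak {
			peak = s
		}
	}
	return peak, nil
}

// fakeCollector returns canned samples or a canned error, so the test needs
// no running Prometheus.
type fakeCollector struct {
	samples map[string][]float64
	err     error
}

func (f fakeCollector) CPUSamples(_ context.Context, workload string) ([]float64, error) {
	if f.err != nil {
		return nil, f.err
	}
	return f.samples[workload], nil
}

type testCase struct {
	name      string
	collector MetricsCollector
	want      float64
	wantErr   error
}

// cases is the table: one row per input/output scenario.
var cases = []testCase{
	{"returns max sample", fakeCollector{samples: map[string][]float64{"web": {0.2, 0.9, 0.4}}}, 0.9, nil},
	{"no samples", fakeCollector{}, 0, errInsufficientData},
	{"collector failure", fakeCollector{err: errUnreachable}, 0, errUnreachable},
}

// check runs one row and describes any mismatch.
func check(tc testCase) error {
	got, err := peakCPU(context.Background(), tc.collector, "web")
	if !errors.Is(err, tc.wantErr) {
		return fmt.Errorf("error = %v, want %v", err, tc.wantErr)
	}
	if got != tc.want {
		return fmt.Errorf("peak = %v, want %v", got, tc.want)
	}
	return nil
}

// TestPeakCPU is the form the table takes in a _test.go file: one subtest per row.
func TestPeakCPU(t *testing.T) {
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if err := check(tc); err != nil {
				t.Fatal(err)
			}
		})
	}
}

func main() {
	for _, tc := range cases {
		fmt.Printf("%-20s pass=%v\n", tc.name, check(tc) == nil)
	}
}
```

A hand-written fake like this keeps each row readable; a generated mock is a reasonable alternative when a test also needs to assert on how the collector was called.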