fix: harden HelmRelease security contexts and add Kyverno validation policies #1504

Open
devantler wants to merge 10 commits into main from devantler/fix-kubescape-compliance

@devantler devantler commented May 10, 2026

Hardens security contexts across 15+ HelmReleases at the source (via chart values) to fix Kubescape compliance failures. Adds Kyverno validation policies in Enforce mode to prevent regression. Narrows Kubescape exceptions from global scope to infrastructure namespaces.

Type of change

  • 🪲 Bug fix

Approach

Fix at source, not exceptions. Kubescape scans Deployment specs, not running Pods. Kyverno's add-security-context mutation policy secures Pods at admission, but the Deployment spec itself still lacked the security context fields, so Kubescape flagged it. The fix is to set the security context explicitly in each HelmRelease's values: block.
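
As a rough illustration of what "fix at source" means, the sketch below sets the fields in a HelmRelease so they land in the rendered Deployment spec. The release name, chart name, repository, interval, UID, and the value keys (podSecurityContext, securityContext, serviceAccount) are assumptions for illustration; each chart names these keys differently, so check the chart's own values.yaml.

```yaml
# Illustrative sketch only: names and value keys are hypothetical.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: example-app          # hypothetical release
  namespace: example-app     # hypothetical namespace
spec:
  interval: 10m              # assumed reconciliation interval
  chart:
    spec:
      chart: example-app     # hypothetical chart
      sourceRef:
        kind: HelmRepository
        name: example-repo   # hypothetical repository
  values:
    # Pod-level security context (key name varies by chart).
    podSecurityContext:
      runAsNonRoot: true
      seccompProfile:
        type: RuntimeDefault
    # Container-level security context (key name varies by chart).
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      runAsUser: 65534       # assumed UID; must match the image
      runAsGroup: 65534      # set at container level, not pod level
    serviceAccount:
      automountServiceAccountToken: false
```

Because these values are rendered into the Deployment template, a scanner that reads workload specs sees them without relying on admission-time mutation.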

Changes

HelmRelease Security Context Hardening (15+ charts)

Added podSecurityContext, container securityContext, and where possible serviceAccount.automountServiceAccountToken: false:

| Chart | Key Changes |
| --- | --- |
| whoami, actual-budget, headlamp | Full pod + container SC, SA automount disabled |
| dex, oauth2-proxy | Full pod + container SC with verified UIDs |
| cert-manager | containerSecurityContext (runAsGroup at container level) |
| kyverno | podSecurityContext (chart already has container-level defaults) |
| reloader | containerSecurityContext additions |
| opencost | Exporter + UI security contexts |
| trust-manager, origin-ca-issuer | postRenderer patches (charts lack values support) |
| keda, keda-http-add-on | Full pod + container SC for all components |
| vertical-pod-autoscaler | All 3 components hardened |
| fleetdm (mysql, redis) | Bitnami pod + container SC with enabled: true gate |

Kyverno Validation Policies (Enforce mode)

| Policy | Enforces |
| --- | --- |
| validate-pod-security | allowPrivilegeEscalation: false, drop ALL caps, seccomp RuntimeDefault |
| validate-host-restrictions | No hostPath, hostNetwork, hostPID, hostIPC |

Both exclude infrastructure namespaces (kube-system, flux-system, longhorn-system, kubevirt, cdi, monitoring, velero) that genuinely require elevated access.
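
A minimal sketch of how such a policy might look, assuming the kyverno.io/v1 ClusterPolicy API. Only the host-namespace rule is shown; the rule name and message are hypothetical, and the actual policy in this PR may be structured differently. A hostPath rule would follow the same match/exclude shape.

```yaml
# Sketch under stated assumptions; not the policy file from this PR.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: validate-host-restrictions
spec:
  # Newer Kyverno releases prefer the rule-level validate.failureAction field.
  validationFailureAction: Enforce
  background: true
  rules:
    - name: deny-host-namespaces   # hypothetical rule name
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              # Infrastructure namespaces listed in the PR description.
              namespaces:
                - kube-system
                - flux-system
                - longhorn-system
                - kubevirt
                - cdi
                - monitoring
                - velero
      validate:
        message: "hostNetwork, hostPID, and hostIPC must be unset or false."
        pattern:
          spec:
            # The =() anchor checks the field only when it is present.
            =(hostNetwork): false
            =(hostPID): false
            =(hostIPC): false
```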

Exception Narrowing

  • service-account-tokens.yaml: Narrowed from global scope to 24 infrastructure namespaces
  • Removed readOnlyRootFilesystem from validation (PSS restricted doesn't require it)

Kubescape Results

| Metric | Before | After |
| --- | --- | --- |
| NSA | ~73% | 73.11% |
| MITRE | ~67% | 68.63% |
| Non-root containers (C-0013) | 51 failures | 45 failures |

Remaining failures are in infrastructure namespaces not controlled by this repo (longhorn-system, kubevirt, cdi, kube-system, velero).

Production E2E Verification

All endpoints verified after deployment:

  • ✅ Homepage (302), Dex (200), OAuth2-Proxy (200)
  • ✅ Actual-Budget (200), Headlamp (200), Whoami (200)
  • ✅ FleetDM (200) — had pre-existing redis storage issues, now recovered
  • ✅ All 160+ pods Running, all HelmReleases Ready

Technical Notes

  • Talos containerd bug: runAsGroup in podSecurityContext without runAsUser causes sandbox creation failure. Fixed by using container-level runAsGroup (safe) or adding explicit runAsUser where UID is known.
  • Chart value naming: cert-manager uses reversed naming (securityContext = pod, containerSecurityContext = container). Trust-manager and origin-ca-issuer require postRenderer patches.
  • Kyverno Enforce mode deadlock: flux-system must be excluded from validation policies to avoid chicken-and-egg reconciliation failures.
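
For charts without values support, a postRenderer patch along these lines could add the pod-level fields. This is a sketch assuming Flux's kustomize postRenderer with a strategic merge patch; the Deployment name is an assumption and may not match what the chart renders. Note that it deliberately leaves out pod-level runAsGroup, in line with the Talos note above.

```yaml
# Fragment of a HelmRelease spec; the Deployment name is assumed.
spec:
  postRenderers:
    - kustomize:
        patches:
          - target:
              kind: Deployment
              name: trust-manager   # assumed rendered name
            patch: |
              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: trust-manager
              spec:
                template:
                  spec:
                    # Pod-level fields only; no runAsGroup without runAsUser.
                    securityContext:
                      runAsNonRoot: true
                      seccompProfile:
                        type: RuntimeDefault
```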

- Add C-0034 and C-0190 to service-account-tokens exception (56 failures each)
- Extend infrastructure-privileged exception to kubevirt and cdi namespaces
- Expand controller-rbac exception with 12 additional controller namespaces
- Expand health-probes exception with 6 additional namespaces
- Create validate-pod-security ClusterPolicy (PSS restricted enforcement)
- Create validate-host-restrictions ClusterPolicy (host access enforcement)
- Wire new validation policies into cluster-policies kustomization

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR targets Kubescape compliance by expanding existing ClusterSecurityException coverage for known infrastructure needs and introducing Kyverno admission validation policies to prevent future non-compliant Pods from being created.

Changes:

  • Extended multiple Kubescape ClusterSecurityException manifests to ignore additional controls and/or include more namespaces.
  • Added two new Kyverno ClusterPolicy validation policies: Pod security context requirements and host-level restriction enforcement.
  • Wired the new Kyverno policies into the infrastructure cluster-policies kustomization.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| k8s/bases/infrastructure/security-exceptions/service-account-tokens.yaml | Adds additional Kubescape control IDs to the SA token exception. |
| k8s/bases/infrastructure/security-exceptions/infrastructure-privileged.yaml | Expands privileged/host-access exception to include KubeVirt/CDI namespaces. |
| k8s/bases/infrastructure/security-exceptions/health-probes.yaml | Expands health-probe exceptions to more namespaces. |
| k8s/bases/infrastructure/security-exceptions/controller-rbac.yaml | Expands RBAC-related exceptions to more namespaces. |
| k8s/bases/infrastructure/cluster-policies/kustomization.yaml | Includes the new Kyverno validation policies in the build. |
| k8s/bases/infrastructure/cluster-policies/best-practices/validate-pod-security.yaml | New Kyverno policy enforcing restricted-style pod/container security settings. |
| k8s/bases/infrastructure/cluster-policies/best-practices/validate-host-restrictions.yaml | New Kyverno policy blocking hostPath and host namespace sharing. |

Comment thread k8s/bases/infrastructure/security-exceptions/controller-rbac.yaml
Comment thread k8s/bases/infrastructure/security-exceptions/health-probes.yaml
devantler and others added 3 commits May 10, 2026 10:23
Add explicit security context (runAsNonRoot, allowPrivilegeEscalation,
capabilities drop ALL, seccompProfile) to HelmRelease values for:
- whoami, actual-budget, headlamp (apps)
- dex, reloader, cert-manager, oauth2-proxy, opencost (controllers)

This fixes Kubescape C-0013, C-0016, C-0017, C-0055 at the source by
setting security fields in Deployment specs rather than relying solely
on Kyverno pod mutation.

Narrow C-0034/C-0190 SA automount exception from global scope to
infrastructure namespaces only. App namespaces have automount disabled
via HelmRelease values or postRenderer patches.

Remove readOnlyRootFilesystem from validate-pod-security policy since
PSS restricted does not require it and several charts intentionally
set it to false (homepage, opencost UI, actual-budget).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-issuer

Add flux-system to validation policy exclusion lists — Flux controllers
are managed by the Flux Operator and their security context cannot be
set via HelmRelease values.

Harden kyverno (podSecurityContext for all 4 controllers), trust-manager
(postRenderer for pod seccomp + runAsNonRoot), and origin-ca-issuer
(extend existing postRenderer with full security context) instead of
adding more exceptions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add runAsGroup to all previously hardened HelmReleases (cert-manager,
dex, reloader, opencost, oauth2-proxy, headlamp, actual-budget, whoami,
kyverno) to satisfy Kubescape C-0013 non-root containers control.

Harden remaining controllers:
- keda: pod + container security context for operator, metricServer, webhooks
- keda-http-add-on: pod + container security context (flat, all components)
- vertical-pod-autoscaler: pod + container security context for all 3 components
- fleetdm mysql: explicit pod + container security context for primary/secondary
- fleetdm redis: explicit pod + container security context for master/replica

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@devantler devantler marked this pull request as ready for review May 10, 2026 09:48
…opencost, trust-manager, origin-ca-issuer

Adds runAsGroup: 65534 at the container level (not pod level) to satisfy
Kubescape C-0013 (Non-root containers) without triggering the Talos
containerd sandbox creation error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@botantler botantler Bot enabled auto-merge May 10, 2026 10:11
@devantler devantler changed the title from "fix: resolve kubescape compliance issues and add validation policies" to "fix: harden HelmRelease security contexts and add Kyverno validation policies" May 10, 2026
@devantler devantler marked this pull request as draft May 10, 2026 10:11
auto-merge was automatically disabled May 10, 2026 10:11

Pull request was converted to draft

…ation, clarify exception scope

- Fix quoted booleans in validate-pod-security and validate-host-restrictions
  policies to use native YAML booleans instead of strings
- Add privileged: false validation to container security checks (C-0057)
- Add deny-container-seccomp-unconfined rule to prevent containers from
  overriding pod-level seccomp with Unconfined
- Update controller-rbac exception description to clarify it covers
  both infrastructure controllers and platform middleware
- Update health-probes exception description to explain why app
  namespaces are included (init containers and sidecars)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
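
One way a rule like deny-container-seccomp-unconfined could be written is as an allow-list on the container-level profile type, which rejects Unconfined without naming it. This is a sketch of a single entry under spec.rules of a ClusterPolicy, not the rule from this commit; the message text is hypothetical and namespace exclusions are omitted for brevity.

```yaml
# Sketch of one rule; it would sit under spec.rules in a ClusterPolicy.
- name: deny-container-seccomp-unconfined
  match:
    any:
      - resources:
          kinds:
            - Pod
  validate:
    message: "Containers may not override seccomp with Unconfined."
    pattern:
      spec:
        # =() anchors make each check apply only when the field is set,
        # so containers that inherit the pod-level profile still pass.
        =(initContainers):
          - =(securityContext):
              =(seccompProfile):
                =(type): "RuntimeDefault | Localhost"
        containers:
          - =(securityContext):
              =(seccompProfile):
                =(type): "RuntimeDefault | Localhost"
```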
Copilot AI review requested due to automatic review settings May 10, 2026 10:14
@devantler devantler marked this pull request as ready for review May 10, 2026 10:15

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Comment thread k8s/bases/infrastructure/security-exceptions/service-account-tokens.yaml Outdated
Comment thread k8s/bases/infrastructure/controllers/reloader/helm-release.yaml
…ual scope

The namespaceSelector includes middleware namespaces (headlamp, homepage,
fleetdm, dex, oauth2-proxy) that need API access for their function.
Update the header comment and reason to accurately reflect this instead
of claiming app namespaces are not excepted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Kyverno add-security-context mutation policy now injects pod-level
(seccompProfile, runAsNonRoot) and container-level (allowPrivilegeEscalation,
readOnlyRootFilesystem, runAsNonRoot, runAsGroup: 65534, capabilities.drop,
seccompProfile) defaults via conditional anchors at admission time.

Remove values that duplicate these mutation defaults from all 15 HelmRelease
files. Retained values that:
- Override insecure chart defaults (whoami: allowPrivilegeEscalation)
- Are image-specific (runAsUser everywhere)
- Differ from mutation defaults (headlamp runAsGroup: 101, actual-budget
  runAsGroup: 1001, keda-http-add-on runAsGroup: 1000)
- Override mutation for functional reasons (readOnlyRootFilesystem: false
  for actual-budget, opencost-ui)
- Are non-SC fields (Bitnami enabled flags, fsGroup, privileged)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…cal-path helper pods

- Enhance add-security-context mutation policy with runAsNonRoot (pod +
  container), runAsGroup: 65534 (container only), and container-level
  seccompProfile. Uses conditional anchors so chart-explicit values are
  never overwritten.

- Add runAsNonRoot: true to validate-pod-security validation rules. Now
  safe because the mutation policy guarantees the field is set for charts
  that don't explicitly override it. Enforces C-0013.

- Exclude local-path-provisioner helper pods (helper-pod-create-pvc-*,
  helper-pod-delete-pvc-*) from both validate-pod-security and
  validate-host-restrictions. These transient pods run as root with
  hostPath in app namespaces to provision PVs in local/CI clusters.
  Production uses Longhorn (already excluded by namespace).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
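
The "conditional anchors" mentioned above presumably refer to Kyverno's add-if-not-present anchor, +(key), which sets a field only when the resource does not already define it. A minimal sketch of that technique follows; the rule name is hypothetical, only a subset of the fields listed in the commit messages is shown, and the real add-security-context policy may differ.

```yaml
# Sketch under stated assumptions; not the policy file from this PR.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-security-context
spec:
  rules:
    - name: add-default-security-context   # hypothetical rule name
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        patchStrategicMerge:
          spec:
            securityContext:
              # +() adds the field only if the chart did not set it.
              +(runAsNonRoot): true
              +(seccompProfile):
                type: RuntimeDefault
            containers:
              # (name): "*" applies the patch to every container.
              - (name): "*"
                securityContext:
                  +(allowPrivilegeEscalation): false
                  +(runAsNonRoot): true
                  +(capabilities):
                    drop:
                      - ALL
                  +(seccompProfile):
                    type: RuntimeDefault
```

With this shape, a chart that explicitly sets runAsNonRoot: false keeps its value after mutation and is then rejected by the validation rule, rather than being silently overridden.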
Copilot AI review requested due to automatic review settings May 10, 2026 12:14

Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Container-level runAsGroup would override pod-level runAsGroup
inheritance, potentially changing the effective GID for containers
that rely on inheriting from the pod securityContext. Since
runAsGroup is not required by PSS restricted, remove it from the
mutation and let charts set it explicitly where needed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>