Kubernetes

Runs the agent's bash/node/python inside an agent-sandbox Sandbox pod on the Broods k3s cluster — a VM-like runtime (bash, node, python3, curl on PATH). Egress follows config.network via a per-Sandbox NetworkPolicy: allow-all removes the policy, deny-all (the default when network is omitted) blocks all egress, and restricted allows the listed CIDRs with DNS (port 53) kept open. The workspace is the same shared S3 bucket the other providers use, mounted at the selected sandbox/<namespace>/ prefix for each run.
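
The three egress modes can be pictured as a small function that returns either no policy or an egress-only NetworkPolicy manifest. This is a minimal sketch, not the executor's actual code: the `NetworkConfig` shape, the `allowedCidrs` field name, and the `sandbox-name` label are assumptions made for illustration.

```typescript
// Hypothetical shape of config.network; the real field names may differ.
type NetworkConfig =
  | { mode: "allow-all" }
  | { mode: "deny-all" }
  | { mode: "restricted"; allowedCidrs: string[] };

interface EgressRule {
  to?: { ipBlock: { cidr: string } }[];
  ports?: { protocol: "UDP" | "TCP"; port: number }[];
}

interface NetworkPolicy {
  apiVersion: "networking.k8s.io/v1";
  kind: "NetworkPolicy";
  metadata: { name: string; namespace: string };
  spec: {
    podSelector: { matchLabels: Record<string, string> };
    policyTypes: ["Egress"];
    egress: EgressRule[];
  };
}

// Returns null for allow-all (no policy exists), otherwise an egress-only policy.
function egressPolicyFor(
  sandboxName: string,
  namespace: string,
  network?: NetworkConfig,
): NetworkPolicy | null {
  // Omitted network config falls back to deny-all.
  const effective: NetworkConfig = network ?? { mode: "deny-all" };
  if (effective.mode === "allow-all") return null;

  const egress: EgressRule[] = [];
  if (effective.mode === "restricted") {
    // DNS stays reachable on port 53.
    egress.push({
      ports: [
        { protocol: "UDP", port: 53 },
        { protocol: "TCP", port: 53 },
      ],
    });
    // An empty "to" list would match every destination, so only add the
    // rule when at least one CIDR is listed.
    if (effective.allowedCidrs.length > 0) {
      egress.push({
        to: effective.allowedCidrs.map((cidr) => ({ ipBlock: { cidr } })),
      });
    }
  }

  return {
    apiVersion: "networking.k8s.io/v1",
    kind: "NetworkPolicy",
    metadata: { name: `${sandboxName}-egress`, namespace },
    spec: {
      // Assumed label; the real selector depends on how the pod is labelled.
      podSelector: { matchLabels: { "sandbox-name": sandboxName } },
      policyTypes: ["Egress"],
      egress, // an empty list blocks all egress
    },
  };
}
```

An egress-only policy with an empty `egress` list is what makes deny-all work: once a pod is selected by a policy whose `policyTypes` includes `Egress`, only the listed rules are allowed.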

Infra side (controller install, runtime image, IAM, cluster identity, debugging, migration) is documented in the beeblastco/infra repo: docs/agent-sandbox.md.

Configuration

{
  "name": "kubernetes",
  "config": {
    "provider": "kubernetes",
    "permissionMode": "ask",
    "timeout": 60,
    "options": {
      "workspaceRoot": "/mnt/workspaces",
      "mountAwsS3Buckets": true
    }
  }
}

Reference the resulting sandboxId from config.sandbox or config.workspaces[].sandbox. Cluster-control settings are service-managed and cannot be set in account sandbox config. They come from deployment env/defaults:

| Option | Env fallback | Default |
| --- | --- | --- |
| kubeconfig | KUBERNETES_SANDBOX_KUBECONFIG, then KUBERNETES_SANDBOX_KUBECONFIG_SSM (SSM param name) | ambient kubeconfig (local only) |
| namespace | KUBERNETES_SANDBOX_NAMESPACE | agent-sandboxes |
| image | KUBERNETES_SANDBOX_IMAGE | ghcr.io/beeblastco/agent-sandbox-runtime:latest |
| serviceAccountName | KUBERNETES_SANDBOX_SERVICE_ACCOUNT | agent-sandbox-workspace when S3 mount is enabled, otherwise pod default SA |
| imagePullSecrets | KUBERNETES_SANDBOX_IMAGE_PULL_SECRETS (comma-sep) | none |
| workspaceBucketName | FILESYSTEM_BUCKET_NAME | |
| awsRegion | AWS_REGION / AWS_DEFAULT_REGION | |
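
As a rough illustration of the fallback order in the table, the sketch below resolves the settings from an environment map. The `ClusterSettings` interface and the function name are hypothetical, and treating an empty string as unset is an assumption; only the variable names and defaults come from the table.

```typescript
// Hypothetical container for the resolved, service-managed settings.
interface ClusterSettings {
  kubeconfig?: string;          // inline kubeconfig, if provided
  kubeconfigSsmParam?: string;  // SSM parameter name used as the second choice
  namespace: string;
  image: string;
  serviceAccountName?: string;  // undefined means "use the pod default SA"
  imagePullSecrets: string[];
  workspaceBucketName?: string;
  awsRegion?: string;
}

function resolveClusterSettings(
  env: Record<string, string | undefined>,
  mountAwsS3Buckets: boolean,
): ClusterSettings {
  // Comma-separated list; blank entries are dropped.
  const imagePullSecrets = (env.KUBERNETES_SANDBOX_IMAGE_PULL_SECRETS ?? "")
    .split(",")
    .map((secret) => secret.trim())
    .filter((secret) => secret.length > 0);

  return {
    kubeconfig: env.KUBERNETES_SANDBOX_KUBECONFIG || undefined,
    kubeconfigSsmParam: env.KUBERNETES_SANDBOX_KUBECONFIG_SSM || undefined,
    namespace: env.KUBERNETES_SANDBOX_NAMESPACE || "agent-sandboxes",
    image:
      env.KUBERNETES_SANDBOX_IMAGE ||
      "ghcr.io/beeblastco/agent-sandbox-runtime:latest",
    serviceAccountName:
      env.KUBERNETES_SANDBOX_SERVICE_ACCOUNT ||
      (mountAwsS3Buckets ? "agent-sandbox-workspace" : undefined),
    imagePullSecrets,
    workspaceBucketName: env.FILESYSTEM_BUCKET_NAME || undefined,
    awsRegion: env.AWS_REGION || env.AWS_DEFAULT_REGION || undefined,
  };
}
```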

sandbox.envVars passes extra environment variables into the pod (honored by all providers).

How it works

For each bash or file run, the executor (functions/harness-processing/sandbox/kubernetes-executor.ts):

  1. Creates a Sandbox (agents.x-k8s.io/v1alpha1) named fp-<namespace>-<rand> in the namespace, with the runtime image (command: sleep infinity).
  2. Waits for the pod to be Ready.
  3. If mountAwsS3Buckets is set, runs mount-s3 --prefix sandbox/<namespace>/ <bucket> <workspaceRoot>/<namespace> inside the pod (the container runs privileged so FUSE works).
  4. Execs the command via the kube exec API after cd <workspaceRoot>/<namespace>, streaming stdout/stderr.
  5. Deletes the Sandbox (ephemeral-per-run).
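
The five steps above can be sketched as one create/try/finally flow. This is an outline under stated assumptions rather than the contents of kubernetes-executor.ts: the `SandboxClient` interface, its method names, and the string-based `exec` are hypothetical stand-ins for the Kubernetes API calls.

```typescript
// Hypothetical client surface standing in for the Kubernetes API calls.
interface SandboxClient {
  createSandbox(name: string, image: string): Promise<void>;
  waitForReady(name: string): Promise<void>;
  exec(name: string, command: string): Promise<{ stdout: string; exitCode: number }>;
  deleteSandbox(name: string): Promise<void>;
}

interface RunOptions {
  namespace: string;       // workspace namespace used in the name and the S3 prefix
  image: string;
  bucket: string;
  workspaceRoot: string;
  mountAwsS3Buckets: boolean;
  command: string;
}

async function runEphemeral(client: SandboxClient, opts: RunOptions) {
  // Step 1: create a uniquely named Sandbox (random suffix format is assumed).
  const rand = Math.random().toString(36).slice(2, 8);
  const name = `fp-${opts.namespace}-${rand}`;
  const workdir = `${opts.workspaceRoot}/${opts.namespace}`;
  await client.createSandbox(name, opts.image);

  try {
    // Step 2: wait for the pod to be Ready.
    await client.waitForReady(name);

    // Step 3: mount the workspace prefix when the option is set.
    if (opts.mountAwsS3Buckets) {
      const mount = await client.exec(
        name,
        `mount-s3 --prefix sandbox/${opts.namespace}/ ${opts.bucket} ${workdir}`,
      );
      if (mount.exitCode !== 0) throw new Error(`mount-s3 failed in ${name}`);
    }

    // Step 4: run the command from the workspace directory.
    return await client.exec(name, `cd ${workdir} && ${opts.command}`);
  } finally {
    // Step 5: always delete the Sandbox, even when a step above throws.
    await client.deleteSandbox(name);
  }
}
```

Putting the delete in `finally` is the design point: a failed mount or a non-Ready pod should not leave a Sandbox behind.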

With persistent: true the executor instead reserves one deterministic Sandbox (fp-p-<slug>-<hash>) per workspace: it scales replicas 0↔1 to pause/resume, keeps the home directory on a PVC (persistentDiskGb, persistentHome, storageClass options), and refreshes a shutdownTime so the idle reaper only deletes truly expired sandboxes. Hooks (onCreate/onResume) and detached background jobs work as described in the sandbox overview.
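
A deterministic name means the same workspace always maps to the same reserved Sandbox. The sketch below shows one way to derive a name of the form fp-p-<slug>-<hash>; the slug rules, the 20-character cap, and the FNV-1a-style hash are assumptions chosen for illustration, not the executor's actual scheme.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units, rendered as 8 hex digits.
// The real executor may use a different hash entirely.
function fnv1a32(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Lowercase, keep [a-z0-9], collapse everything else to single dashes.
function slugify(value: string, maxLength = 20): string {
  const slug = value
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, maxLength)
    .replace(/-+$/, "");
  return slug || "ws"; // fallback when nothing usable remains
}

// Same workspace id in, same Sandbox name out.
function persistentSandboxName(workspaceId: string): string {
  return `fp-p-${slugify(workspaceId)}-${fnv1a32(workspaceId)}`;
}
```

With these caps the name stays well under the 63-character limit for Kubernetes DNS labels.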

Home storage

$HOME is runtime state; the S3 workspace mount is the user-facing shared file store.

| Mode | Config | Persists after scale-to-zero? | Use for |
| --- | --- | --- | --- |
| Durable home PVC | persistent: true + options.persistentDiskGb | yes | package caches, virtualenvs, background-job metadata |
| Ephemeral home | persistent: true + ephemeralHome: true | no | fast-start workloads that write durable files to S3 workspace |
| S3 workspace | options.mountAwsS3Buckets: true + attached workspace | yes | user/project files, read-only sharing, downloadable artifacts |

// Durable home PVC
{
  "provider": "kubernetes",
  "persistent": true,
  "options": {
    "persistentDiskGb": 10,
    "persistentHome": "/home/node",
    "storageClass": "local-path"
  }
}

// Ephemeral home: skips the home PVC entirely.
{
  "provider": "kubernetes",
  "persistent": true,
  "ephemeralHome": true
}

options.persistentDiskGb is the requested home PVC size in GiB (currently 1-10). options.persistentHome is the mounted HOME path, not a boolean. options.storageClass selects the Kubernetes StorageClass, such as local-path or hcloud-volumes; omit it to use the cluster default.
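
The constraints in the paragraph above lend themselves to a small pre-flight check. The following is a hypothetical validator, not part of the harness; requiring a whole number of GiB and an absolute path are assumptions layered on top of the documented 1-10 range and the "path, not a boolean" rule.

```typescript
// Values are typed as unknown because they arrive from user-supplied JSON.
interface PersistentOptionsInput {
  persistentDiskGb?: unknown;
  persistentHome?: unknown;
  storageClass?: unknown;
}

// Returns a list of human-readable problems; an empty list means the options look usable.
function validatePersistentOptions(options: PersistentOptionsInput): string[] {
  const errors: string[] = [];
  const { persistentDiskGb, persistentHome, storageClass } = options;

  if (persistentDiskGb !== undefined) {
    const ok =
      typeof persistentDiskGb === "number" &&
      Number.isInteger(persistentDiskGb) && // whole GiB is an assumption
      persistentDiskGb >= 1 &&
      persistentDiskGb <= 10;
    if (!ok) errors.push("persistentDiskGb must be a whole number of GiB from 1 to 10");
  }

  if (persistentHome !== undefined) {
    // persistentHome is the mounted HOME path, not a boolean flag.
    const ok = typeof persistentHome === "string" && persistentHome.startsWith("/");
    if (!ok) errors.push("persistentHome must be an absolute path such as /home/node");
  }

  if (storageClass !== undefined && typeof storageClass !== "string") {
    errors.push("storageClass must be a StorageClass name; omit it for the cluster default");
  }

  return errors;
}
```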

PVCs are per reserved Sandbox and mounted only into that Sandbox's pod. They are good for private runtime state, not cross-user file sharing. For read-only or multi-user access, write artifacts to the S3 workspace and enforce access there.

The kubeconfig (CA + token) is ~2.7 KB, which does not fit within Lambda's 4 KB limit on the combined size of all environment variables once the harness's other variables are counted. So sst.config.ts stores it in an SSM SecureString parameter (/broods/<stage>/kubernetes-sandbox-kubeconfig, value from the KubernetesSandboxKubeconfig secret) and passes only the parameter name as KUBERNETES_SANDBOX_KUBECONFIG_SSM; the executor fetches and caches it at runtime. Set the GitHub secret KUBERNETES_SANDBOX_KUBECONFIG (base64 kubeconfig) and deploy; no env-size juggling is needed.
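
The fetch-and-cache behaviour can be approximated with a loader that memoizes the in-flight request per parameter name. This is a sketch with an injected fetcher: in a deployment the fetcher would presumably wrap an SSM GetParameter call with decryption enabled, but that wiring, like the `createKubeconfigLoader` name, is an assumption and is not shown.

```typescript
// The fetcher is injected so the caching logic stays independent of AWS.
type ParameterFetcher = (parameterName: string) => Promise<string>;

function createKubeconfigLoader(
  fetchParameter: ParameterFetcher,
): (parameterName: string) => Promise<string> {
  // Cache the promise, not the value, so concurrent callers share one fetch.
  const cache = new Map<string, Promise<string>>();

  return (parameterName) => {
    let pending = cache.get(parameterName);
    if (!pending) {
      pending = fetchParameter(parameterName).catch((error) => {
        // Do not keep a failed fetch around; let the next call retry.
        cache.delete(parameterName);
        throw error;
      });
      cache.set(parameterName, pending);
    }
    return pending;
  };
}
```

Because a warm Lambda container reuses module state, a loader like this would hit the parameter store once per container rather than once per invocation.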

Cluster auth uses the service-managed kubeconfig. For mount-s3, the pod gets S3 permissions from its Kubernetes service account / cluster identity; the harness does not pass AWS credentials into the pod. When mountAwsS3Buckets is enabled and no deployment override is set, the executor uses the agent-sandbox-workspace ServiceAccount. The pod runs privileged + runAsUser: 0 (FUSE needs the device + root) and mounts with --uid 1000 --gid 1000.
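
Put together, the pod settings and the mount invocation described above look roughly like this. The constant and function names are hypothetical; the flags (`--prefix`, `--uid`, `--gid`) and the values are the ones named in this section.

```typescript
// Container security context as described: privileged and root so FUSE works.
const sandboxSecurityContext = { privileged: true, runAsUser: 0 } as const;

// Builds the mount-s3 argv: options first, then bucket, then mount directory.
function mountS3Argv(bucket: string, namespace: string, workspaceRoot: string): string[] {
  return [
    "mount-s3",
    "--prefix", `sandbox/${namespace}/`, // only this workspace's prefix is exposed
    "--uid", "1000",                     // files appear owned by the unprivileged user
    "--gid", "1000",
    bucket,
    `${workspaceRoot}/${namespace}`,
  ];
}
```

No AWS credentials appear in the argument list, which matches the note that the pod's service account supplies S3 access.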

Mountpoint-for-S3 has no append/in-place edit (>> or editing a file in place fails) — only whole-file create/overwrite. The agent should rewrite files, not append. (Same as Daytona's mount.)
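
For tools that need append-like behaviour, one workaround is to read the current content and write the whole file back. The helper below is a minimal sketch of that pattern (the `appendByRewrite` name is hypothetical); the usage lines run against a local temp directory, which stands in for the mounted workspace.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Emulates an append with a whole-file overwrite, the only write the mount supports.
function appendByRewrite(path: string, addition: string): void {
  const existing = existsSync(path) ? readFileSync(path, "utf8") : "";
  writeFileSync(path, existing + addition, "utf8"); // creates or replaces the file
}

// Usage: two "appends" to a scratch file.
const logPath = join(mkdtempSync(join(tmpdir(), "ws-")), "notes.log");
appendByRewrite(logPath, "first\n");
appendByRewrite(logPath, "second\n");
const finalContent = readFileSync(logPath, "utf8");
```

The cost is a full read and a full upload per write, so this suits small files such as logs or notes rather than large artifacts.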

Requirements

  • Harness deployed with KUBERNETES_SANDBOX_KUBECONFIG set (CI: GitHub secret of the same name; see the infra doc for how to generate it from the SA token).
  • The harness runtime must be able to reach the cluster API (https://<api>:6443) and open exec websockets. The HarnessProcessing Lambda is not VPC-attached, so it has public egress by default.
  • For the S3 mount: mountAwsS3Buckets: true; the sandbox pod service account must have S3 RW on the workspace bucket, and the runtime image must be in GHCR with a pull secret in the namespace. Without the mount, stateless runs still work but workspace-backed tools fail fast because files would not persist across calls.
  • TLS on the deployed harness. The harness Lambda is a bun-compiled binary whose fetch ignores the kubeconfig CA / insecure-skip-tls-verify, and k3s serves a self-signed cert. So sst.config.ts sets NODE_TLS_REJECT_UNAUTHORIZED=0 on the harness for non-production stages only; production keeps full verification and needs a trusted API cert before using this provider. See the infra docs/agent-sandbox.md.

Execution Notes

  • It's a real shell: bash, node <file>, and python3 <file> all run natively. Invoke python3 explicitly, because commands run exactly as written.
  • Without persistent: true, files persist across calls only with the S3 mount enabled (each call gets a fresh pod).

What the model sees

For workspace-backed runs, the model should see a normal project directory. The executor starts each bash command in the selected workspace directory:

pwd # current workspace directory
ls # files in this workspace
python3 script.py

Use relative paths in prompts and examples. If a command prints an absolute path under the configured workspaceRoot, that is expected and is only useful for debugging.

Direct executor test

Validate the executor against a cluster without a deployed harness:

KUBECONFIG=/path/to/kubeconfig.yaml \
KUBERNETES_SANDBOX_DEBUG_STREAM=1 \
bun run packages/demos/sandbox-kubernetes-direct.ts

KUBERNETES_SANDBOX_DEBUG_STREAM=1 tees the exec stream to your terminal. (Under bun, set NODE_TLS_REJECT_UNAUTHORIZED=0 if your kubeconfig CA isn't honored — a bun TLS quirk; real Node Lambda honors it.)

The full agent flow example is packages/demos/sandbox-workspace-kubernetes.ts.

Troubleshooting

| Symptom | Cause / fix |
| --- | --- |
| sandbox pod ... not ready ... (ErrImagePull / ImagePullBackOff) | The sandbox pod can't pull the private runtime image. Attach the registry pull secret (ghcr-pull-secret) to the agent-sandboxes default and agent-sandbox-workspace service accounts (the infra "Deploy Kubernetes Apps" workflow does this), set KUBERNETES_SANDBOX_IMAGE_PULL_SECRETS, or make the image package public. Verify with kubectl describe pod <name> -n agent-sandboxes. |
| Harness gets 401 from the cluster API | SA token expired or kubeconfig not base64; regenerate. |
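
For the 401 case, a quick local check can tell whether a secret value is base64 that decodes to something shaped like a kubeconfig. This is a hypothetical helper, not part of the harness, and it assumes a YAML kubeconfig containing `kind: Config` and a `clusters:` key; it cannot detect an expired token.

```typescript
// Returns the decoded kubeconfig text, or null when the value is not
// base64 or does not decode to something that looks like a YAML kubeconfig.
function decodeBase64Kubeconfig(value: string): string | null {
  // Tolerate line-wrapped base64 by stripping all whitespace first.
  const compact = value.replace(/\s+/g, "");
  const isBase64 =
    compact.length > 0 &&
    compact.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(compact);
  if (!isBase64) return null;

  const decoded = Buffer.from(compact, "base64").toString("utf8");
  const looksLikeKubeconfig =
    /^\s*kind:\s*Config\s*$/m.test(decoded) && /^\s*clusters:/m.test(decoded);
  return looksLikeKubeconfig ? decoded : null;
}
```

A null result for the value you are about to store as the secret suggests it was pasted as raw YAML or truncated and should be regenerated.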