Kubernetes
Runs the agent's bash/node/python inside an agent-sandbox
Sandbox pod on the Broods k3s cluster — a VM-like runtime (bash, node, python3, curl
on PATH). Egress follows config.network via a per-Sandbox NetworkPolicy: allow-all removes the
policy, deny-all (the default when network is omitted) blocks all egress, and restricted
allows the listed CIDRs with DNS (port 53) kept open. The workspace is the same shared S3 bucket the
other providers use, mounted at the selected sandbox/<namespace>/ prefix for each run.
Infra side (controller install, runtime image, IAM, cluster identity, debugging, migration) is
documented in the beeblastco/infra repo: docs/agent-sandbox.md.
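The three egress modes above can be pictured as a small mapping from config.network to a NetworkPolicy manifest. This is a minimal sketch, not the executor's actual code: the `NetworkConfig` shape (`mode`, `cidrs`), the `buildEgressPolicy` helper, the `-egress` name suffix and the `sandbox-name` pod label are all assumptions made for illustration, as is opening port 53 over both UDP and TCP.

```typescript
// Hypothetical shape for config.network; the real schema may differ.
type NetworkConfig =
  | { mode: "allow-all" }
  | { mode: "deny-all" }
  | { mode: "restricted"; cidrs: string[] };

interface EgressRule {
  to?: { ipBlock: { cidr: string } }[];
  ports?: { protocol: "UDP" | "TCP"; port: number }[];
}

interface EgressPolicy {
  apiVersion: "networking.k8s.io/v1";
  kind: "NetworkPolicy";
  metadata: { name: string; namespace: string };
  spec: {
    podSelector: { matchLabels: Record<string, string> };
    policyTypes: string[];
    egress: EgressRule[];
  };
}

// Returns null for allow-all, meaning "no policy should exist".
function buildEgressPolicy(
  sandboxName: string,
  namespace: string,
  network?: NetworkConfig,
): EgressPolicy | null {
  // An omitted network config falls back to deny-all.
  const effective: NetworkConfig = network ?? { mode: "deny-all" };
  if (effective.mode === "allow-all") return null;

  const egress: EgressRule[] = [];
  if (effective.mode === "restricted") {
    // Keep DNS reachable on port 53 (UDP and TCP are both assumed here).
    egress.push({
      ports: [
        { protocol: "UDP", port: 53 },
        { protocol: "TCP", port: 53 },
      ],
    });
    // An empty `to` list would match every destination, so skip it.
    if (effective.cidrs.length > 0) {
      egress.push({ to: effective.cidrs.map((cidr) => ({ ipBlock: { cidr } })) });
    }
  }

  return {
    apiVersion: "networking.k8s.io/v1",
    kind: "NetworkPolicy",
    metadata: { name: `${sandboxName}-egress`, namespace },
    spec: {
      // Hypothetical label; use whatever label the Sandbox pod really carries.
      podSelector: { matchLabels: { "sandbox-name": sandboxName } },
      policyTypes: ["Egress"],
      egress, // an empty list with the Egress policy type denies all egress
    },
  };
}
```

The guard on an empty CIDR list matters because a NetworkPolicy rule with no `to` entries matches all destinations, which would silently turn restricted into allow-all.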
Configuration
{
  "name": "kubernetes",
  "config": {
    "provider": "kubernetes",
    "permissionMode": "ask",
    "timeout": 60,
    "options": {
      "workspaceRoot": "/mnt/workspaces",
      "mountAwsS3Buckets": true
    }
  }
}
Reference the resulting sandboxId from config.sandbox or config.workspaces[].sandbox.
Cluster-control settings are service-managed and cannot be set in account sandbox config.
They come from deployment env/defaults:
| Option | Env fallback | Default |
|---|---|---|
| kubeconfig | KUBERNETES_SANDBOX_KUBECONFIG, then KUBERNETES_SANDBOX_KUBECONFIG_SSM (SSM param name) | ambient kubeconfig (local only) |
| namespace | KUBERNETES_SANDBOX_NAMESPACE | agent-sandboxes |
| image | KUBERNETES_SANDBOX_IMAGE | ghcr.io/beeblastco/agent-sandbox-runtime:latest |
| serviceAccountName | KUBERNETES_SANDBOX_SERVICE_ACCOUNT | agent-sandbox-workspace when S3 mount is enabled, otherwise pod default SA |
| imagePullSecrets | KUBERNETES_SANDBOX_IMAGE_PULL_SECRETS (comma-sep) | none |
| workspaceBucketName | FILESYSTEM_BUCKET_NAME | — |
| awsRegion | AWS_REGION / AWS_DEFAULT_REGION | — |
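The fallback order in the table can be read as a single resolution function. The sketch below is illustrative only: `resolveClusterSettings` and `ClusterSettings` are hypothetical names, kubeconfig resolution is left out, and treating an empty-string variable as "set" is an assumption.

```typescript
type Env = Record<string, string | undefined>;

interface ClusterSettings {
  namespace: string;
  image: string;
  serviceAccountName: string | undefined; // undefined = pod default service account
  imagePullSecrets: string[];
  workspaceBucketName: string | undefined;
  awsRegion: string | undefined;
}

// Resolve service-managed settings from deployment env, using the table's defaults.
function resolveClusterSettings(env: Env, mountAwsS3Buckets: boolean): ClusterSettings {
  // Comma-separated list; blank entries are dropped.
  const imagePullSecrets = (env.KUBERNETES_SANDBOX_IMAGE_PULL_SECRETS ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  return {
    namespace: env.KUBERNETES_SANDBOX_NAMESPACE ?? "agent-sandboxes",
    image: env.KUBERNETES_SANDBOX_IMAGE ?? "ghcr.io/beeblastco/agent-sandbox-runtime:latest",
    serviceAccountName:
      env.KUBERNETES_SANDBOX_SERVICE_ACCOUNT ??
      (mountAwsS3Buckets ? "agent-sandbox-workspace" : undefined),
    imagePullSecrets,
    workspaceBucketName: env.FILESYSTEM_BUCKET_NAME,
    awsRegion: env.AWS_REGION ?? env.AWS_DEFAULT_REGION,
  };
}
```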
sandbox.envVars passes extra environment variables into the pod (honored by all providers).
How it works
Per bash/file run the executor (functions/harness-processing/sandbox/kubernetes-executor.ts):
- Creates a Sandbox (agents.x-k8s.io/v1alpha1) named fp-<namespace>-<rand> in the namespace, with the runtime image (command: sleep infinity).
- Waits for the pod to be Ready.
- If mountAwsS3Buckets is set, runs mount-s3 --prefix sandbox/<namespace>/ <bucket> <workspaceRoot>/<namespace> inside the pod (the container runs privileged so FUSE works).
- Execs the command via the kube exec API after cd <workspaceRoot>/<namespace>, streaming stdout/stderr.
- Deletes the Sandbox (ephemeral-per-run).
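The per-run steps above can be sketched as a pure function that lists what would happen, in order, for one command. This is not the code in kubernetes-executor.ts: `planEphemeralRun`, `RunOptions` and `Step` are hypothetical names, and the sketch assumes the namespace, bucket and paths contain no shell metacharacters.

```typescript
interface RunOptions {
  namespace: string; // workspace namespace, used in names and paths
  workspaceRoot: string; // e.g. "/mnt/workspaces"
  bucket?: string; // set only when mountAwsS3Buckets is enabled
}

type Step =
  | { kind: "create"; sandbox: string }
  | { kind: "waitReady"; sandbox: string }
  | { kind: "exec"; sandbox: string; shell: string }
  | { kind: "delete"; sandbox: string };

// Lists the ordered steps for one ephemeral run; values are interpolated unquoted.
function planEphemeralRun(o: RunOptions, command: string, rand: string): Step[] {
  const sandbox = `fp-${o.namespace}-${rand}`;
  const workdir = `${o.workspaceRoot}/${o.namespace}`;

  const steps: Step[] = [
    { kind: "create", sandbox },
    { kind: "waitReady", sandbox },
  ];
  if (o.bucket !== undefined) {
    // Mount the workspace prefix before anything touches the working directory.
    steps.push({
      kind: "exec",
      sandbox,
      shell: `mount-s3 --prefix sandbox/${o.namespace}/ ${o.bucket} ${workdir}`,
    });
  }
  steps.push({ kind: "exec", sandbox, shell: `cd ${workdir} && ${command}` });
  steps.push({ kind: "delete", sandbox });
  return steps;
}
```

A real executor would presumably run the delete step even when an earlier step fails (for example in a `finally` block), so that failed runs do not leave pods behind; the source does not say how this is handled.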
With persistent: true the executor instead reserves one deterministic Sandbox
(fp-p-<slug>-<hash>) per workspace: it scales replicas 0↔1 to pause/resume, keeps the
home directory on a PVC (persistentDiskGb, persistentHome, storageClass options), and
refreshes a shutdownTime so the idle reaper only deletes truly expired sandboxes. Hooks
(onCreate/onResume) and detached background jobs work as described in
the sandbox overview.
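To make the "deterministic Sandbox per workspace" idea concrete, here is one way a stable fp-p-<slug>-<hash> name and a refreshed shutdownTime could be derived. Everything in it is an assumption: the source does not say how the slug or hash are computed, so this sketch uses a simple lowercase slug and a 32-bit FNV-1a style hash over UTF-16 code units, and `idleSeconds` is a hypothetical idle window.

```typescript
// Lowercase, hyphen-separated slug capped at 24 characters (assumed rules).
function slugify(input: string): string {
  const slug = input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, 24)
    .replace(/-+$/, "");
  return slug.length > 0 ? slug : "ws";
}

// 32-bit FNV-1a style hash over UTF-16 code units, as 8 hex characters.
function shortHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// The same workspace id always maps to the same Sandbox name.
function reservedSandboxName(workspaceId: string): string {
  return `fp-p-${slugify(workspaceId)}-${shortHash(workspaceId)}`;
}

// Pushes the expiry forward from "now"; called on each use to keep the sandbox alive.
function nextShutdownTime(now: Date, idleSeconds: number): string {
  return new Date(now.getTime() + idleSeconds * 1000).toISOString();
}
```

Because the name depends only on the workspace id, a later run can find the paused Sandbox again and scale it back to one replica instead of creating a new one.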
Home storage
$HOME is runtime state; the S3 workspace mount is the user-facing shared file store.
| Mode | Config | Persists after scale-to-zero? | Use for |
|---|---|---|---|
| Durable home PVC | persistent: true + options.persistentDiskGb | yes | package caches, virtualenvs, background-job metadata |
| Ephemeral home | persistent: true + ephemeralHome: true | no | fast-start workloads that write durable files to S3 workspace |
| S3 workspace | options.mountAwsS3Buckets: true + attached workspace | yes | user/project files, read-only sharing, downloadable artifacts |
// Durable home PVC
{
  "provider": "kubernetes",
  "persistent": true,
  "options": {
    "persistentDiskGb": 10,
    "persistentHome": "/home/node",
    "storageClass": "local-path"
  }
}
// Ephemeral home: skips the home PVC entirely.
{
  "provider": "kubernetes",
  "persistent": true,
  "ephemeralHome": true
}
options.persistentDiskGb is the requested home PVC size in GiB (currently 1-10).
options.persistentHome is the mounted HOME path, not a boolean.
options.storageClass selects the Kubernetes StorageClass, such as local-path or
hcloud-volumes; omit it to use the cluster default.
PVCs are per reserved Sandbox and mounted only into that Sandbox's pod. They are good for private runtime state, not cross-user file sharing. For read-only or multi-user access, write artifacts to the S3 workspace and enforce access there.
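The option rules above lend themselves to a small validation pass before a config is accepted. The following is a sketch under stated assumptions rather than the service's real validator: `homeMode` and `validateHomeOptions` are hypothetical helpers, and requiring an integer size and an absolute home path are guesses beyond what the text states.

```typescript
interface KubernetesSandboxConfig {
  provider: "kubernetes";
  persistent?: boolean;
  ephemeralHome?: boolean;
  options?: {
    persistentDiskGb?: number;
    persistentHome?: string;
    storageClass?: string;
  };
}

type HomeMode = "durable-pvc" | "ephemeral-home" | "fresh-pod-per-run";

// Classifies a config into one of the home-storage modes from the table.
function homeMode(c: KubernetesSandboxConfig): HomeMode {
  if (!c.persistent) return "fresh-pod-per-run";
  return c.ephemeralHome ? "ephemeral-home" : "durable-pvc";
}

// Returns human-readable problems; an empty list means the options look valid.
function validateHomeOptions(c: KubernetesSandboxConfig): string[] {
  const problems: string[] = [];
  const size = c.options?.persistentDiskGb;
  if (size !== undefined && (!Number.isInteger(size) || size < 1 || size > 10)) {
    problems.push("options.persistentDiskGb must be a whole number of GiB from 1 to 10");
  }
  const home: unknown = c.options?.persistentHome;
  if (home !== undefined && (typeof home !== "string" || !home.startsWith("/"))) {
    problems.push("options.persistentHome must be a path such as /home/node, not a boolean");
  }
  return problems;
}
```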
The kubeconfig (CA + token) is ~2.7 KB, too large for Lambda's 4 KB env-var limit (the limit covers all variables combined). So sst.config.ts stores it in an SSM SecureString parameter (/broods/<stage>/kubernetes-sandbox-kubeconfig, value from the KubernetesSandboxKubeconfig secret) and passes only the parameter name as KUBERNETES_SANDBOX_KUBECONFIG_SSM; the executor fetches and caches it at runtime. Set the GitHub secret KUBERNETES_SANDBOX_KUBECONFIG (base64 kubeconfig) and deploy; no env-size juggling is needed.
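The "fetch once, then cache" behaviour can be sketched without the AWS SDK by injecting the parameter fetcher. This is an illustration only: `createKubeconfigLoader` and `FetchParameter` are hypothetical names, base64 decoding is deliberately left out because the source does not say where it happens, and retrying after a failed fetch is an assumption.

```typescript
type Env = Record<string, string | undefined>;
// Hypothetical fetcher: given an SSM parameter name, resolves to its decrypted value.
type FetchParameter = (name: string) => Promise<string>;

// Returns a loader that resolves the kubeconfig at most once per process.
function createKubeconfigLoader(
  env: Env,
  fetchParameter: FetchParameter,
): () => Promise<string | undefined> {
  let cached: Promise<string | undefined> | undefined;

  async function resolve(): Promise<string | undefined> {
    // Inline value wins; the SSM parameter name is the fallback.
    const inline = env.KUBERNETES_SANDBOX_KUBECONFIG;
    if (inline) return inline;
    const paramName = env.KUBERNETES_SANDBOX_KUBECONFIG_SSM;
    if (paramName) return fetchParameter(paramName);
    return undefined; // caller falls back to the ambient kubeconfig (local only)
  }

  return () => {
    if (cached === undefined) {
      cached = resolve().catch((err) => {
        cached = undefined; // do not cache failures; the next call retries
        throw err;
      });
    }
    return cached;
  };
}
```

Caching the promise rather than the resolved string means concurrent first calls share one fetch instead of each hitting SSM.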
Cluster auth uses the service-managed kubeconfig. For mount-s3, the pod gets S3
permissions from its Kubernetes service account / cluster identity; the harness does not
pass AWS credentials into the pod. When mountAwsS3Buckets is enabled and no deployment
override is set, the executor uses the agent-sandbox-workspace ServiceAccount. The pod
runs privileged + runAsUser: 0 (FUSE needs the device + root) and mounts with
--uid 1000 --gid 1000.
Mountpoint-for-S3 has no append or in-place edit (>> or editing a file in place fails); it supports only whole-file create/overwrite. The agent should rewrite files, not append. (Same as Daytona's mount.)
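A script running in the workspace can work around the missing append by reading the current contents and writing the whole file back. The helper below is a hypothetical sketch, `appendByRewrite` is not part of the harness, and it assumes the mount permits whole-file overwrites as described above and that the file is small enough to hold in memory.

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Emulates an append with a full rewrite: read everything, then write it all back.
export function appendByRewrite(path: string, addition: string): string {
  const current = existsSync(path) ? readFileSync(path, "utf8") : "";
  const next = current + addition;
  // writeFileSync truncates and rewrites the file; it never opens it in append mode.
  writeFileSync(path, next, "utf8");
  return next;
}
```

For example, `appendByRewrite("notes.log", "step 2 done\n")` replaces `echo "step 2 done" >> notes.log`. Two writers doing this at once can lose each other's lines, so it suits single-writer files only.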
Requirements
- Harness deployed with KUBERNETES_SANDBOX_KUBECONFIG set (CI: GitHub secret of the same name; see the infra doc for how to generate it from the SA token).
- The harness runtime must be able to reach the cluster API (https://<api>:6443) and open exec websockets. The HarnessProcessing Lambda is not VPC-attached, so it has public egress by default.
- For the S3 mount: mountAwsS3Buckets: true; the sandbox pod service account must have S3 RW on the workspace bucket, and the runtime image must be in GHCR with a pull secret in the namespace. Without the mount, stateless runs still work but workspace-backed tools fail fast because files would not persist across calls.
- TLS on the deployed harness. The harness Lambda is a bun-compiled binary whose fetch ignores the kubeconfig CA / insecure-skip-tls-verify, and k3s serves a self-signed cert. So sst.config.ts sets NODE_TLS_REJECT_UNAUTHORIZED=0 on the harness for non-production stages only; production keeps full verification and needs a trusted API cert before using this provider. See the infra docs/agent-sandbox.md.
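Some of the requirements above can be checked mechanically before a deploy or a local run. The preflight below is a rough sketch: `preflight` is a hypothetical helper, the stage name "production" is assumed, and it only inspects environment variables, so it cannot confirm API reachability, IAM permissions or pull secrets.

```typescript
type Env = Record<string, string | undefined>;

interface PreflightInput {
  env: Env;
  stage: string; // deployment stage name; "production" is an assumed value
  mountAwsS3Buckets: boolean;
}

// Returns a list of problems; an empty list means the env-level checks passed.
function preflight({ env, stage, mountAwsS3Buckets }: PreflightInput): string[] {
  const problems: string[] = [];
  if (!env.KUBERNETES_SANDBOX_KUBECONFIG && !env.KUBERNETES_SANDBOX_KUBECONFIG_SSM) {
    problems.push(
      "no kubeconfig: set KUBERNETES_SANDBOX_KUBECONFIG or KUBERNETES_SANDBOX_KUBECONFIG_SSM",
    );
  }
  if (mountAwsS3Buckets && !env.FILESYSTEM_BUCKET_NAME) {
    problems.push("S3 mount enabled but FILESYSTEM_BUCKET_NAME is not set");
  }
  if (stage === "production" && env.NODE_TLS_REJECT_UNAUTHORIZED === "0") {
    problems.push("TLS verification is disabled in production");
  }
  return problems;
}
```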
Execution Notes
- It's a real shell: bash, node <file>, python3 <file> all run natively; use python3 explicitly (commands run as-is).
- Without persistent: true, files persist across calls only with the S3 mount enabled (each call gets a fresh pod).
What the model sees
For workspace-backed runs, the model should see a normal project directory. The executor starts each bash command in the selected workspace directory:
pwd # current workspace directory
ls # files in this workspace
python3 script.py
Use relative paths in prompts and examples. If a command prints an absolute path under the
configured workspaceRoot, that is expected and is only useful for debugging.
Direct executor test
Validate the executor against a cluster without a deployed harness:
KUBECONFIG=/path/to/kubeconfig.yaml \
KUBERNETES_SANDBOX_DEBUG_STREAM=1 \
bun run packages/demos/sandbox-kubernetes-direct.ts
KUBERNETES_SANDBOX_DEBUG_STREAM=1 tees the exec stream to your terminal. (Under bun, set
NODE_TLS_REJECT_UNAUTHORIZED=0 if your kubeconfig CA isn't honored — a bun TLS quirk; real Node
Lambda honors it.)
The full agent flow example is packages/demos/sandbox-workspace-kubernetes.ts.
Troubleshooting
| Symptom | Cause / fix |
|---|---|
| sandbox pod ... not ready ... (ErrImagePull / ImagePullBackOff) | The sandbox pod can't pull the private runtime image. Attach the registry pull secret (ghcr-pull-secret) to the agent-sandboxes default and agent-sandbox-workspace service accounts (the infra "Deploy Kubernetes Apps" workflow does this), set KUBERNETES_SANDBOX_IMAGE_PULL_SECRETS, or make the image package public. Verify with kubectl describe pod <name> -n agent-sandboxes. |
| Harness gets 401 from the cluster API | SA token expired or kubeconfig not base64; regenerate. |
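For the 401 row, a quick local check can tell "secret is not base64" apart from "token expired" before regenerating anything. The helper below is a hypothetical sketch: `looksLikeBase64Kubeconfig` is not part of the harness, and looking for the apiVersion and clusters keys is only a heuristic for "this decodes to a kubeconfig".

```typescript
// Heuristic check that a secret value is base64 text wrapping a kubeconfig.
function looksLikeBase64Kubeconfig(secret: string): boolean {
  // Ignore line wraps that some base64 tools insert.
  const compact = secret.replace(/\s+/g, "");
  if (compact.length === 0 || !/^[A-Za-z0-9+/]+={0,2}$/.test(compact)) return false;

  let decoded: string;
  try {
    decoded = atob(compact);
  } catch {
    return false; // right alphabet but not decodable, e.g. a bad length
  }
  // Both keys appear in YAML and JSON kubeconfigs.
  return decoded.includes("apiVersion") && decoded.includes("clusters");
}
```

If the check passes and the harness still gets 401, the service-account token inside the kubeconfig is the more likely culprit.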