We hosted the April Bay Area OWASP Meetup and presented on a topic we’ve been deep in: hardening coding agents.
Coding agents are everywhere. Teams are shipping with Codex, Claude, and Gemini, and seeing real productivity gains. But the security model hasn’t caught up. Agents typically run with full shell access, persistent credentials, and no audit trail. The attack surface is growing fast, and most orgs have no visibility into what their agents are actually doing.
Our talk covered the threat landscape, what good infrastructure looks like, and how to think about the tradeoffs.
The threat landscape is real
This isn’t theoretical. We walked through recent, real-world exploits targeting coding agents and AI tooling: prompt injection, credential exfiltration from MCP configs, supply chain attacks via npm, privilege escalation through README prompt injection, token and OAuth abuse, and agent framework RCEs. These are CVEs from 2025 and 2026 — not future risks.
The common thread: agents are untrusted code running on your infrastructure. Treat them accordingly.
Orchestration and determinism
A key theme of the talk was the role of orchestrators. Without one, the agent decides its own tool calls, runs with no allowlist and no rollback, and holds all your secrets in a single context. With an orchestrator enforcing policy, you get deterministic tool allowlists, isolated worktrees, scoped secrets that are revoked on completion, and a full audit trail.
The key insight: the model is untrusted by default. The orchestrator enforces policy — not the model.
Where to run agents
We compared three execution environments — local dev machines, isolated VMs, and ephemeral containers — and the security tradeoffs of each. The rule of thumb: treat the agent process like untrusted third-party code running on your infra.
Guardrails that actually work
We covered six concrete guardrails: git worktree isolation, tool allowlisting, hard deadlines and token budgets, secret injection (not embedding), append-only audit logs, and human-in-the-loop gates for destructive actions.
The auth gap
Auth is broken for agents today: long-lived personal access tokens (PATs) baked into environment variables, OAuth flows that assume a human with a browser, and no per-task credential scoping. We discussed short-term patches (short-lived tokens via Vault/OIDC, fine-grained PATs, revoke on task completion) and the longer-term path toward agent identity standards like SPIFFE and task-scoped OAuth.
Until tooling vendors adapt: rotate aggressively, scope narrowly, audit obsessively.
Vigilante
We demoed Vigilante, our open-source coding agent orchestrator. It’s a sandbox-first system: issue tracker as work queue, isolated worktrees per task, scoped credentials minted at provision and revoked at teardown, concurrency limits, and full audit logging. It’s free and ships as a single Go binary.