coo-agent Case Study

Designing a safety-first AI agent runtime with durable state, approval-gated actions, and role-isolated workers.

Problem

Running AI agents against real systems (codebases, APIs, external services) is high-stakes: a wrong action can create work, spend money, or send a message that can’t be unsent. The common failure mode is an agent that either does too little (asks for approval on everything) or too much (acts autonomously without guardrails). The goal was a runtime that operates autonomously within well-defined policy boundaries and surfaces only the decisions that genuinely need a human.
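The policy-boundary idea above can be sketched as a small decision function. This is a hypothetical illustration, not the runtime's actual policy engine: the `Action` fields and the allowlist rules are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"                    # run autonomously
    NEEDS_APPROVAL = "needs_approval"  # surface to a human
    DENY = "deny"                      # never run

@dataclass
class Action:
    kind: str               # e.g. "read_file", "send_email"
    cost_usd: float = 0.0   # money the action would spend
    external: bool = False  # touches systems outside the sandbox

def evaluate(action: Action) -> Decision:
    """Toy policy: read-only work runs freely; anything external or
    costly pauses for human approval; everything else is denied."""
    if action.kind in {"read_file", "search"}:
        return Decision.ALLOW
    if action.external or action.cost_usd > 0:
        return Decision.NEEDS_APPROVAL
    return Decision.DENY

print(evaluate(Action(kind="read_file")))                  # Decision.ALLOW
print(evaluate(Action(kind="send_email", external=True)))  # Decision.NEEDS_APPROVAL
```

The point is the shape of the contract: the agent proposes actions, the policy classifies them, and only the `NEEDS_APPROVAL` slice ever reaches a person.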

Approach

Built a Docker-first orchestration platform around a few core ideas:

- Durable state: runs and their decisions are persisted, so work survives restarts and leaves an auditable trail.
- Approval-gated actions: policy rules decide which actions run autonomously and which pause until a human approves them.
- Role-isolated workers: each worker handles only the job types for its role, limiting the blast radius of any single agent.

The operator-facing surface includes a dashboard for active runs and an approval inbox for reviewing and acting on gated decisions.

Outcome

Stack

Python, FastAPI, PostgreSQL, Redis, RQ, SQLAlchemy, litellm, Docker