Ottonomous
Skills for every stage of product development (spec writing, task prioritization, implementation, testing, code review, and summaries) that work in both Claude Code and OpenAI Codex.
Install
Claude Code
/plugin marketplace add brsbl/ottonomous
/plugin install ottonomous@ottonomous
Codex
codex plugin marketplace add brsbl/ottonomous
Dependencies
- Claude Code or Codex
- Node.js 18+
- Git
Philosophy
Invocation differs per provider: Claude Code uses `/spec`, Codex uses `$spec`. Throughout these docs, skills are referred to by bare name (e.g. the `spec` skill).
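As a rough illustration of the prefix rule above, the sketch below builds the invocation string for either provider. The provider keys (`claude-code`, `codex`) and the `invocation` helper are hypothetical names chosen for this example, and joining extra arguments with spaces is an assumption based on how skills are written elsewhere in this README (e.g. `review fix P0`).

```javascript
// Hypothetical helper: maps a provider to its skill prefix.
// Only the "/" and "$" prefixes come from this README; the keys are made up.
const PREFIXES = new Map([
  ["claude-code", "/"],
  ["codex", "$"],
]);

// Builds the invocation string for a skill referred to by its bare name.
function invocation(provider, skill, ...args) {
  const prefix = PREFIXES.get(provider);
  if (prefix === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  // Assumption: arguments follow the skill name, separated by spaces.
  return [prefix + skill, ...args].join(" ");
}

console.log(invocation("claude-code", "spec")); // prints "/spec"
console.log(invocation("codex", "spec")); // prints "$spec"
```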
Subagents for Context Isolation
Use subagents to isolate concerns and prevent context pollution:
- Context isolation: Each subagent gets only what it needs, nothing more. The orchestrator agent delegates to and manages subagents.
- Specialization: Different expertise per agent (frontend-developer vs backend-architect, senior-code-reviewer vs architect-reviewer, test-writer, etc.)
Skill/Subagent Separation
Skills and subagents have distinct responsibilities:
- Skills define what to hand off (file list, diff command, scope, context) and are instructions for the orchestrator agent
- Subagents define how to process what's handed off (criteria, detection rules, output format)
This keeps subagents self-contained and reusable while skills orchestrate the workflow. Skills describe delegation in tool-neutral prose so the same source runs on either provider; the runtime decides the actual model and delegation mechanics.
Swarm Orchestration
Skills coordinate multiple subagents in parallel by running them in the background: they spawn concurrent work and wait on the results.
Coordination patterns:
- Fan-out/Fan-in: Spawn N agents, wait for all, synthesize results. Used by `review`.
- Batches: Complete batch N before starting N+1 (for dependent work). Used by `review fix`.
- Pipeline: Sequential handoff between specialists. Used by `otto`.
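The three patterns can be sketched with plain promises. This is only an analogy for how the orchestrator sequences work, not the plugin's implementation: the `worker`, `synthesize`, and `stages` callbacks are hypothetical stand-ins for subagent calls, and the real delegation mechanics are decided by the provider runtime.

```javascript
// Fan-out/fan-in: start every worker at once, wait for all, then synthesize.
async function fanOutFanIn(items, worker, synthesize) {
  const results = await Promise.all(items.map((item) => worker(item)));
  return synthesize(results);
}

// Batches: items inside a batch run concurrently, but batch N must
// finish completely before batch N+1 starts.
async function runInBatches(batches, worker) {
  const completed = [];
  for (const batch of batches) {
    const results = await Promise.all(batch.map((item) => worker(item)));
    completed.push(results);
  }
  return completed;
}

// Pipeline: each stage receives the previous stage's output.
async function pipeline(input, stages) {
  let value = input;
  for (const stage of stages) {
    value = await stage(value);
  }
  return value;
}
```

Fan-out/fan-in suits independent work such as reviewing separate files, while batches trade some parallelism for ordering when later fixes depend on earlier ones.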
Scaling: 1-4 items = 1 agent, 5-10 = 2-3 agents, 11+ = 3-5 agents. Group by directory or component type.
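The scaling rule above can be written down as a small lookup. The sketch below is illustrative only: `agentRange` and `groupByDirectory` are hypothetical names, returning zero agents for an empty work list is an assumption, and grouping by parent directory is just one reading of "group by directory".

```javascript
// Returns the suggested agent range for a number of work items,
// following the thresholds stated in this README.
function agentRange(itemCount) {
  if (itemCount <= 0) return { min: 0, max: 0 }; // assumption: nothing to do
  if (itemCount <= 4) return { min: 1, max: 1 };
  if (itemCount <= 10) return { min: 2, max: 3 };
  return { min: 3, max: 5 };
}

// Groups file paths by their parent directory ("." for root-level files).
function groupByDirectory(paths) {
  const groups = new Map();
  for (const path of paths) {
    const slash = path.lastIndexOf("/");
    const dir = slash === -1 ? "." : path.slice(0, slash);
    if (!groups.has(dir)) groups.set(dir, []);
    groups.get(dir).push(path);
  }
  return groups;
}

console.log(agentRange(7)); // prints { min: 2, max: 3 }
```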
Iterative Review for Verification
Every phase has explicit verification:
- Planning: spec → spec review → user approval
- Implementation: code → code review → fix → commit
- Verification criteria: Each step defines "Done when..."
- Prioritized findings: P0-P2 across all skills (P0 = critical, P1 = important, P2 = minor)
Recommended Workflow
Invoke skills with `/x` in Claude Code or `$x` in Codex (e.g. `/spec` or `$spec`).
    spec                        # define requirements via interview
      │
      ▼
    task                        # break spec into sessions & tasks
      │
      ▼
      ┌───────────────────┐
      │                   │
      ▼                   │
    next batch            │     # implement sessions of tasks in parallel, then stage results
      │                   │
      ▼                   │
    test write staged     │     # generate tests, then lint/typecheck/run all
      │                   │
      ▼                   │
    review staged         │     # multi-agent code review
      │                   │
      ▼                   │
    review fix staged     │     # fix P0-P2 issues
      │                   │
      ▼                   │
    commit ───────────────┘     # loop if more sessions/tasks
      │
      ▼
    summary                     # generate semantic overview of changes, opened in browser
      │
      ▼
      PR
Reset context between steps (e.g. /clear in Claude Code).
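For readers who prefer a list to a diagram, the loop can be unrolled as below. This is a sketch of the ordering only: `workflowSteps` is a hypothetical helper, and the number of batches is whatever the task breakdown produces. Note that `commit` and `PR` are ordinary Git and hosting actions rather than skills.

```javascript
// Unrolls the recommended workflow for a given number of batches.
// Each batch repeats the implement/test/review/fix/commit loop.
function workflowSteps(batchCount) {
  if (!Number.isInteger(batchCount) || batchCount < 0) {
    throw new RangeError("batchCount must be a non-negative integer");
  }
  const steps = ["spec", "task"];
  for (let i = 0; i < batchCount; i++) {
    steps.push(
      "next batch",
      "test write staged",
      "review staged",
      "review fix staged",
      "commit" // a Git commit, not a skill
    );
  }
  steps.push("summary", "PR"); // opening the PR is also not a skill
  return steps;
}

console.log(workflowSteps(1).join(" -> "));
```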
Skills
The 8 skills: spec, task, next, test, review, summary, otto, reset.
Specification & Planning
| Skill | Description |
|---|---|
| `spec [idea]` | Researches best practices, interviews you to define requirements and design. `technical-product-manager` validates completeness, consistency, feasibility, and technical correctness. |
| `spec revise {spec}` | Saves a comprehensive spec and goes straight to review with codebase exploration, skipping the interview. |
| `spec list` | Lists all specs with id, name, status, and created date. |
| `task <spec-id>` | Creates atomic tasks grouped into agent sessions. `principal-engineer` reviews work breakdown, dependencies, and completeness. |
| `task list` | Lists all tasks and their spec, sessions, status, etc. |
Implementation
| Skill | Description |
|---|---|
| `next` | Returns next task id. |
| `next session` | Returns next session id. |
| `next <id>` | Launches a subagent to implement a task or session. Plans first, then implements. |
| `next batch` | Implements all highest-priority unblocked sessions in parallel. |
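One way to read "highest-priority unblocked sessions" is sketched below. The session shape (`id`, `priority`, `status`, `dependsOn`) is hypothetical and is not the schema of the `.otto/tasks/` files; the sketch also assumes that a lower number means higher priority and that a session is unblocked once all of its dependencies are done.

```javascript
// Hypothetical session shape: { id, priority, status, dependsOn: [ids] }.
// Returns the ids of the pending sessions that could run in parallel next.
function nextBatch(sessions) {
  const done = new Set(
    sessions.filter((s) => s.status === "done").map((s) => s.id)
  );
  // A session is unblocked when every dependency has finished.
  const unblocked = sessions.filter(
    (s) => s.status === "pending" && s.dependsOn.every((id) => done.has(id))
  );
  if (unblocked.length === 0) return [];
  // Assumption: lower number means higher priority.
  const top = Math.min(...unblocked.map((s) => s.priority));
  return unblocked.filter((s) => s.priority === top).map((s) => s.id);
}

const example = [
  { id: "S1", priority: 0, status: "done", dependsOn: [] },
  { id: "S2", priority: 1, status: "pending", dependsOn: ["S1"] },
  { id: "S3", priority: 1, status: "pending", dependsOn: [] },
  { id: "S4", priority: 0, status: "pending", dependsOn: ["S2"] },
];
console.log(nextBatch(example)); // prints [ 'S2', 'S3' ]
```

In this example `S4` has the best priority but stays out of the batch because `S2` has not finished yet.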
Testing
| Skill | Description |
|---|---|
| `test run` | Lint, type check, run tests. |
| `test write` | `test-writer` generates tests for pure functions with edge cases, then runs the pipeline. |
| `test browser` | Visual verification with browser automation (a mode of the test skill). |
| `test all` | Run + browser combined. |
Scope: staged, branch (default)
Code Review
| Skill | Description |
|---|---|
| `review` | Multi-agent code review. `architect-reviewer` checks system structure and boundaries; `senior-code-reviewer` checks correctness, security, performance; `false-positive-validator` filters out invalid findings. |
| `review fix` | Implements all fixes from plan in parallel batches. |
| `review fix P0` | Implements only P0 (critical) fixes. |
| `review fix P0-P1` | Implements P0 and P1 fixes. |
Scope: staged, branch (default)
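The priority argument to `review fix` can be thought of as a filter over findings. The sketch below shows one possible interpretation; the finding shape (`priority`, `title`) and both function names are hypothetical, and the actual fix plan format in `.otto/reviews/` may differ.

```javascript
const PRIORITIES = ["P0", "P1", "P2"];

// Parses "P0", "P0-P1", or no argument (meaning all priorities)
// into the set of priority labels to fix.
function parsePriorityFilter(arg) {
  if (!arg) return new Set(PRIORITIES);
  const match = /^P([0-2])(?:-P([0-2]))?$/.exec(arg);
  if (!match) throw new Error(`Unrecognized priority filter: ${arg}`);
  const low = Number(match[1]);
  const high = match[2] === undefined ? low : Number(match[2]);
  if (high < low) throw new Error(`Invalid priority range: ${arg}`);
  return new Set(PRIORITIES.slice(low, high + 1));
}

// Hypothetical finding shape: { priority: "P0" | "P1" | "P2", title }.
function selectFindings(findings, arg) {
  const wanted = parsePriorityFilter(arg);
  return findings.filter((finding) => wanted.has(finding.priority));
}

const findings = [
  { priority: "P0", title: "SQL injection in search handler" },
  { priority: "P1", title: "Missing pagination limit" },
  { priority: "P2", title: "Inconsistent naming" },
];
console.log(selectFindings(findings, "P0-P1").length); // prints 2
```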
Summary
| Skill | Description |
|---|---|
| `summary` | Synthesizes code docs into a semantic HTML summary explaining what changed and why. Primarily a resource to complement or replace code review. |
Scope: staged, branch (default)
Automation
| Skill | Description |
|---|---|
| `otto <idea>` | Autonomous spec → tasks → [next/test/review] per session → summary. Best for greenfield explorations, prototyping, scoped migrations, and simple applications. Not recommended for building complex apps end-to-end. |
| `reset [targets]` | Resets workflow data. Targets: tasks, specs, sessions, all (default). |
Architecture
    skills/                                    # Single source of truth: neutral SKILL.md + agent personas
    ├── spec/
    │   ├── SKILL.md
    │   └── agents/
    │       └── technical-product-manager.md   # Spec validation (completeness, feasibility)
    ├── task/
    │   ├── SKILL.md
    │   └── agents/
    │       └── principal-engineer.md          # Task decomposition review
    ├── next/
    │   ├── SKILL.md
    │   └── agents/                            # Implementation agents
    │       ├── frontend-developer.md
    │       └── backend-architect.md
    ├── test/
    │   ├── SKILL.md
    │   └── agents/
    │       └── test-writer.md                 # Test generation
    ├── review/
    │   ├── SKILL.md
    │   └── agents/                            # Code review agents
    │       ├── architect-reviewer.md          # Architectural issues
    │       ├── senior-code-reviewer.md        # Implementation issues
    │       └── false-positive-validator.md    # Validates and filters review findings
    ├── summary/
    │   ├── SKILL.md
    │   └── scripts/md-to-html.js
    ├── otto/
    │   └── SKILL.md
    └── reset/
        └── SKILL.md

    .otto/                                     # Workflow artifacts (git-ignored)
    ├── specs/                                 # Specification documents (.md)
    ├── tasks/                                 # Sessions and tasks (.json)
    ├── reviews/                               # Review fix plans (.json)
    ├── summaries/                             # Generated HTML summaries
    └── otto/
        └── sessions/                          # Otto session state (state.json)
Provider-agnostic layout
`skills/` is the single source of truth: each `SKILL.md` is neutral (no `model:` or `allowed-tools:`), and agent personas describe delegation in tool-neutral prose. From this one source, both providers are wired up:
- `skills/`: neutral source skills and agent personas, read directly by Claude Code.
- `scripts/build-codex-plugin.mjs` (`npm run build`): generates the Codex app package at `plugins/ottonomous/` by copying the skills and emitting a per-skill `agents/openai.yaml` Codex interface file.
- `.claude-plugin/`: Claude Code manifests (`plugin.json` points `skills` at `./skills` and lists the agent dirs; `marketplace.json`). Claude Code ignores the generated `openai.yaml` files.
- `.codex-plugin/` + `.agents/plugins/`: Codex manifests. The root `.codex-plugin/plugin.json` is a compatibility manifest, and `.agents/plugins/marketplace.json` points at `./plugins/ottonomous`.
The Codex package under `plugins/ottonomous/` is generated, never hand-edited; regenerate it with `npm run build` whenever `skills/` changes. This one-source, regenerate-the-mirror approach (modeled on the moss-skills repo) is the anti-drift mechanism.
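To make the copy-and-emit idea concrete, the sketch below lists the operations such a build step might perform for each skill. It is not the contents of `scripts/build-codex-plugin.mjs`: `planCodexPackage` is a hypothetical function, and the `skills/<name>` subdirectory under the output root is an assumed layout that the generated package may not follow.

```javascript
// Hypothetical planner: for each skill, copy the neutral source directory
// and emit a Codex interface file next to it. Nothing is written to disk.
function planCodexPackage(skillNames, outRoot = "plugins/ottonomous") {
  return skillNames.flatMap((name) => [
    // Assumption: skills land under `${outRoot}/skills/`.
    { action: "copy", from: `skills/${name}`, to: `${outRoot}/skills/${name}` },
    { action: "emit", to: `${outRoot}/skills/${name}/agents/openai.yaml` },
  ]);
}

const plan = planCodexPackage(["spec", "task"]);
console.log(plan.length); // prints 4
```

Because the plan is derived entirely from the list of source skills, rerunning it after any change to `skills/` yields the full mirror again, which is the property that keeps the two copies from drifting.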
Feedback
Found a bug or have a feature request? Open an issue.
License
MIT