Akmatori
Akmatori is an AI-powered AIOps agent that integrates with monitoring systems and Slack to provide intelligent incident response and automated remediation.
Key Features
- Multi-LLM Support: Use OpenAI, Anthropic, Google, OpenRouter, or on-premise models (GLM, Kimi, Minimax, Mistral, LLaMA)
- Multi-Source Alert Ingestion: Receive alerts from Alertmanager, PagerDuty, Grafana, Datadog, Zabbix, and Slack channels
- Messaging Integrations & Channels: Configure one or more messaging providers (Slack today, Telegram on the roadmap) under Settings → Integrations, then attach Channels with capability flags (post / listen / default) that alert sources and cron jobs reference by UUID
- Cron Jobs: Schedule recurring agent investigations that post results to a Channel — pick a 5-field cron expression, write a prompt, and attach a per-cron tool allowlist. Every tick runs as a full investigation under the
cron-agentsystem skill; platform-seeded crons (e.g.memory-curator) are markedis_system, ship disabled so you can review them before they fire, and cannot be deleted (only enabled/disabled) - AI-Powered Automation: Analyze incidents and execute remediation skills using your preferred LLM
- Agent Skills Format: Skills follow the open Agent Skills specification for portability across AI agents
- Tools Management: Configure reusable tools (SSH, Python scripts, API clients) for skills
- Web Dashboard: Manage incidents, skills, tools, and settings through a modern UI
- Context Files: Upload reference documents for the AI to use during incident analysis
- Self-Hosted: Your data never leaves your infrastructure
Supported LLM Providers
| Provider | Models |
|---|---|
| OpenAI | GPT-5.5, GPT-5.5 Pro, GPT-5.4, GPT-5.4 Mini, GPT-5.3 Codex, o4-mini |
| Anthropic | Claude Opus 4.7, Opus 4.6, Sonnet 4.6, Haiku 4.5 |
| Gemini 3 Pro Preview, Gemini 3.1 Pro Preview, Gemini 3 Flash Preview, Gemini 2.5 Pro/Flash, 2.0 Flash | |
| OpenRouter | Access to 100+ models |
| Custom/On-Prem | GLM, Kimi, Minimax, Mistral, LLaMA, etc. |
Install (end users)
The recommended install flow pulls pre-built multi-arch images from GHCR — no git clone, no local build.
Prerequisites
- Docker and Docker Compose v2+
- LLM API key (OpenAI, Anthropic, Google, or OpenRouter)
- Slack App (optional, for Slack integration)
Install
-
Download the release assets (compose file + nginx config):
mkdir akmatori && cd akmatori curl -fsSLO https://github.com/akmatori/akmatori/releases/latest/download/docker-compose.yml mkdir proxy && curl -fsSL -o proxy/nginx.conf \ https://github.com/akmatori/akmatori/releases/latest/download/nginx.conf -
(Optional) Create an
.envto pin a specific version or configure a corporate proxy. All secrets (POSTGRES_PASSWORD,JWT_SECRET, and the admin password) are auto-generated on first run, so the file is only needed for the overrides shown below:cat > .env <<'EOF' # AKMATORI_VERSION=1.2.0 # HTTP_PROXY=http://proxy.corp:3128 # HTTPS_PROXY=http://proxy.corp:3128 EOFAKMATORI_VERSIONdefaults tolatest. See the "Behind an HTTP proxy" section below for proxy details. -
Pull and start:
docker compose pull docker compose up -d -
Verify all containers are running:
docker compose psAll services should show "Up" status.
-
Access the web dashboard at
http://localhost:8080(usernameadmin). The first visit runs a one-time setup wizard that lets you set the admin password. -
Configure your LLM provider in Settings → LLM Provider
-
(Optional) Connect Slack:
- Settings → Integrations → Add Slack to register your bot/signing/app tokens.
- Settings → Channels → Add Channel to attach posting and listening destinations. Tick
is_default_poston the channel that should receive alerts when an alert source does not pin one explicitly.can_post/can_listendecide whether the channel appears in the alert-source picker or is monitored for inbound mentions.
-
(Optional) Schedule recurring runs in Cron Jobs: pick a schedule, write a prompt, attach the tools the agent is allowed to call, and point it at one of your Channels. Every tick runs as a full agent investigation under the
cron-agentsystem skill.
Upgrade
Bump AKMATORI_VERSION in .env (or leave it unset to track latest) and:
docker compose pull
docker compose up -d
Behind an HTTP proxy
There are two independent concerns: pulling images through the proxy (a Docker daemon setting) and the running services egressing through the proxy at runtime (a compose environment: setting). Both must be configured or you'll get stuck partway through.
A. Pulling images through the proxy
This is a Docker daemon-level setting — compose can't fix it from inside the file.
Linux + systemd:
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.corp:3128"
Environment="HTTPS_PROXY=http://proxy.corp:3128"
Environment="NO_PROXY=localhost,127.0.0.1,.svc,.local"
Then:
sudo systemctl daemon-reload
sudo systemctl restart docker
Docker Desktop (macOS / Windows): Settings → Resources → Proxies.
Allowlist note: your corporate proxy allowlist must include pkg-containers.githubusercontent.com alongside ghcr.io. GHCR stores image manifests on ghcr.io and the actual blob layers on pkg-containers.githubusercontent.com — the most common GHCR-through-corporate-proxy footgun is "manifest pulls but blob downloads hang."
B. Runtime egress through the proxy
Set HTTP_PROXY / HTTPS_PROXY once in your .env (or in the shell that runs docker compose up); the api, mcp-gateway, and agent containers inherit them via the compose file. The default NO_PROXY bypasses internal service-to-service traffic (api↔postgres, agent↔gateway, etc.) so internal hops never hit the corporate proxy.
# .env
HTTP_PROXY=http://proxy.corp:3128
HTTPS_PROXY=http://proxy.corp:3128
# NO_PROXY defaults to the internal service names; override only if you need to add hosts.
The runtime HTTP_PROXY covers the API server's outbound calls (Slack), the agent worker's LLM API calls, and the MCP Gateway's HTTP-connector tools and external MCP-server connections. The MCP Gateway's built-in monitoring/CMDB tools (Zabbix, Grafana, VictoriaMetrics, PagerDuty, NetBox, Kubernetes, Catchpoint, Jira) ignore the env-var proxy by design and have their own per-tool proxy toggle in Settings → Proxy — enable those if your monitoring endpoints also need to go through the corporate proxy.
Maintainer / development
If you're working on Akmatori itself and want to build from source instead of pulling published images, use the dev override which restores the per-service build: blocks:
git clone https://github.com/akmatori/akmatori.git
cd akmatori
cp .env.example .env # edit ADMIN_PASSWORD / POSTGRES_PASSWORD
make dev # docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build
make dev is the canonical maintainer entry point. The base docker-compose.yml alone has only image: references (the end-user pull flow); the docker-compose.dev.yml override adds the build: blocks back. Without the -f docker-compose.dev.yml argument, docker compose build is a no-op against a release install.
Architecture
Akmatori uses a secure multi-container architecture:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Alert Sources │────▶│ API Server │◀───▶│ PostgreSQL │
│ (Prometheus, │ │ (Go backend) │ │ (encrypted │
│ PagerDuty, │ │ │ │ credentials) │
│ Datadog...) │ └────────┬────────┘ └─────────────────┘
└─────────────────┘ │
│ WebSocket
┌─────────────────┐ ▼
│ Slack Bot │◀───▶┌─────────────────┐ ┌─────────────────┐
│ │ │ Agent Worker │◀───▶│ MCP Gateway │
└─────────────────┘ │ (pi-mono + │ │ (SSH, APIs) │
│ subagents) │ │ │
└────────┬────────┘ └─────────────────┘
│
▼
┌─────────────────┐
│ LLM Providers │
│ (OpenAI, │
│ Anthropic, │
│ Google, etc.) │
└─────────────────┘
Security by design:
- Agent Worker has NO database access
- Credentials are fetched via MCP Gateway on-demand
- Network isolation between containers
- API keys passed per-incident via WebSocket
Runbook and memory search: runbooks and cross-incident memory live as markdown files mounted into the agent container (runbooks read-only, memory read-write). The main agent reaches them through pi-mono subagents — runbook-searcher for SOP lookup, memory-searcher for prior-incident recall, and memory-writer to record durable findings at the end of an investigation. The API materializes new memory files back into Postgres at incident completion.
Documentation
API Documentation (Self-Hosted)
The API server includes built-in interactive documentation:
- Swagger UI:
http://localhost:8080/api/docs— browse and test API endpoints in your browser - OpenAPI Spec:
http://localhost:8080/api/openapi.yaml— raw OpenAPI 3.1 specification
Both endpoints are publicly accessible (no authentication required).
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.