J.A.R.V.I.S.
Just A Rather Very Intelligent System
A personal AI assistant inspired by Tony Stark's JARVIS. Voice interaction, cinematic UI, browser automation, desktop overlay, Chrome extension, and macOS system control. Runs locally on your Mac with mobile access via Cloudflare Tunnel.
"Good evening, sir. I've prepared a summary of your system."
JARVIS is a fully functional AI assistant that lives on your Mac. Talk to it with your voice, type in the chat, or let it control your computer. It sees your screen, manages your files, browses the web, automates your Chrome browser, and remembers your preferences across sessions.
JARVIS routes each request to the right intelligence tier: a fast model for quick lookups, a mid-tier model for conversation, and a deep reasoning model for complex multi-step plans. It uses cloud LLM APIs by default and can fall back to local Ollama models, which are free and work offline.
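The routing idea can be sketched with a small heuristic. This is an illustration only: `pick_tier`, the cue words, and the 12-word threshold are assumptions made for the example, not the project's actual router.

```python
from enum import Enum


class Tier(Enum):
    FAST = "fast"    # quick lookups
    BRAIN = "brain"  # general conversation, single tool calls
    DEEP = "deep"    # complex multi-step plans


# Hypothetical cue phrases; a real router may use a classifier instead.
DEEP_CUES = ("plan", "step by step", "refactor", "research", "and then")


def pick_tier(request: str, needs_tools: bool = False) -> Tier:
    """Choose a tier from crude signals: cue phrases, length, and tool use."""
    text = request.lower()
    # Plain substring match, so "explanation" would also trigger "plan".
    if any(cue in text for cue in DEEP_CUES):
        return Tier.DEEP
    if needs_tools or len(text.split()) > 12:
        return Tier.BRAIN
    return Tier.FAST
```

Calling `pick_tier("What time is it?")` returns `Tier.FAST`, while a request containing "and then" is sent to `Tier.DEEP`.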
The Arc Reactor (Features)
Voice Interaction
Speak naturally and JARVIS responds with a warm British accent. Powered by Moonshine ONNX (primary STT, low hallucination) with faster-whisper as fallback, and Kokoro TTS with chunked Opus streaming for sub-second latency. Wake word detection ("Hey JARVIS") runs continuously in the background via OpenWakeWord.
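The primary-plus-fallback arrangement for speech-to-text follows a common pattern, sketched below. The engines are passed in as plain callables; `transcribe_with_fallback` is a hypothetical name, and nothing here calls the real Moonshine or faster-whisper APIs.

```python
from typing import Callable, Optional, Sequence

# Any function that turns raw audio bytes into text.
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(audio: bytes, engines: Sequence[Transcriber]) -> str:
    """Try each engine in order and return the first non-empty transcript."""
    last_error: Optional[Exception] = None
    for engine in engines:
        try:
            text = engine(audio).strip()
        except Exception as exc:  # engine crashed or its model is not loaded
            last_error = exc
            continue
        if text:
            return text
    raise RuntimeError("all speech-to-text engines failed") from last_error
```

In this sketch the primary engine would be listed first, so the fallback runs only when the primary raises or returns an empty string.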
Cinematic Web UI
A GLSL shader-driven Three.js particle orb with 2,400 particles across three shells, simplex noise displacement, electric arcs, dust motes, and holographic rings. The orb pulses and reacts to JARVIS' state: idle, listening, thinking, speaking, error. Three views: Voice (the orb), Chat (message interface), and System (dashboard with live cost tracking). Optional PIN protection is available for mobile access.
Desktop Overlay (macOS)
A native Swift overlay that floats above all windows in the bottom-right corner. Shows JARVIS' current state (Standing By, Listening, Processing, Speaking) with a miniature Three.js particle orb and live conversation text. Connects via WebSocket, launches automatically with ./start.sh full, and supports Control+Option+J global voice activation. Built with WKWebView for transparent rendering over your desktop.
Chrome Extension (Browser Bridge)
A Manifest V3 Chrome extension that gives JARVIS direct control over your browser. Manages tabs, navigates pages, fills forms, clicks elements, takes screenshots, reads page content, and executes scoped JavaScript. Auto-reconnects to JARVIS using a chrome.alarms keepalive that survives service worker termination, so the extension comes online automatically when JARVIS starts. No manual interaction needed.
Browser Automation (Playwright)
A full Playwright-driven Chromium browser that JARVIS controls autonomously for complex multi-step workflows. Fill forms, click buttons, log into sites, apply to jobs, download files. Persistent browser profile means sessions and cookies survive restarts. The Chrome extension handles lightweight tab operations; Playwright handles deep page automation.
macOS System Control
JARVIS exposes 104 registered tools across 16 categories. They open and close apps, adjust volume and brightness, manage files, execute shell commands, take screenshots with OCR, search the web, check weather, query free public-data APIs, read Gmail, manage Apple Notes, and delegate coding tasks via the Claude Code CLI.
Multi-Agent Coordination
Complex requests are automatically decomposed into subtasks by the planner agent, then executed in parallel or sequence by specialized executor agents. The QA agent verifies task quality, and the UI shows real-time plan progress with per-subtask status.
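One way to picture "parallel or sequence" execution is to run subtasks in waves, where every subtask whose dependencies are finished runs concurrently. The sketch below assumes a plan is a list of named subtasks with explicit dependencies; `Subtask`, `run_subtask`, and `run_plan` are hypothetical names, and the executor is a stub.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Subtask:
    name: str
    depends_on: list[str] = field(default_factory=list)


async def run_subtask(task: Subtask) -> str:
    """Stand-in for an executor agent; a real one would call tools or an LLM."""
    await asyncio.sleep(0)
    return f"{task.name}: done"


async def run_plan(plan: list[Subtask]) -> dict[str, str]:
    """Run subtasks in waves; each wave executes in parallel."""
    results: dict[str, str] = {}
    pending = list(plan)
    while pending:
        ready = [t for t in pending if all(d in results for d in t.depends_on)]
        if not ready:
            raise ValueError("plan has a dependency cycle or a missing subtask")
        outputs = await asyncio.gather(*(run_subtask(t) for t in ready))
        for task, output in zip(ready, outputs):
            results[task.name] = output
        pending = [t for t in pending if t.name not in results]
    return results
```

With a plan of `search`, `summarize` (depending on `search`), and `weather`, the first wave runs `search` and `weather` together and the second wave runs `summarize`.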
Memory and Learning
SQLite-backed semantic memory with full-text search stores conversation context. JARVIS learns your implicit preferences, remembers explicit facts ("my dog's name is Max"), and improves its task planning based on past successes and failures. An evolution pipeline with A/B testing tracks performance across sessions, and a success tracker logs task outcomes for long-term analysis.
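Full-text search over stored facts can be illustrated with SQLite's FTS5 extension, which ships with most Python builds. The table name, columns, and helper functions below are assumptions for the example; the project's real schema may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real store would use a file on disk
conn.execute("CREATE VIRTUAL TABLE memories USING fts5(kind, content)")


def remember(kind: str, content: str) -> None:
    conn.execute("INSERT INTO memories (kind, content) VALUES (?, ?)", (kind, content))


def recall(query: str, limit: int = 5) -> list[str]:
    """Return stored contents matching the query, best match first."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


remember("fact", "my dog's name is Max")
remember("preference", "prefers metric units for weather")
```

Here `recall("dog")` finds the stored fact. The query string is interpreted as FTS5 query syntax, so raw user text containing quotes or operators would need escaping first.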
Settings and Runtime Configuration
A REST API (/api/settings) and an in-UI Settings Panel let you adjust preferences at runtime: model tiers, cost alerts, TTS voice, and more. Non-secret changes persist to .env; API keys updated through the API are stored in the secure keyring backend, which maps to macOS Keychain on a normal Mac install.
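The split between `.env` and the keyring can be sketched as a routing step applied to each settings update. The suffix-based rule and the names `split_settings` and `render_env` are assumptions for illustration; the sketch stops short of writing files or calling a keyring library.

```python
SECRET_SUFFIXES = ("_API_KEY", "_TOKEN", "_SECRET")  # assumed naming convention


def split_settings(updates: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Separate values destined for .env from values destined for the keyring."""
    env_values: dict[str, str] = {}
    secrets: dict[str, str] = {}
    for key, value in updates.items():
        target = secrets if key.upper().endswith(SECRET_SUFFIXES) else env_values
        target[key] = value
    return env_values, secrets


def render_env(values: dict[str, str]) -> str:
    """Serialize non-secret settings as sorted KEY=value lines."""
    return "".join(f"{key}={value}\n" for key, value in sorted(values.items()))
```

An update containing both `JARVIS_ENABLE_TUNNEL` and `ANTHROPIC_API_KEY` would send the first to the `.env` text and hold the second back for the secure store.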
Operational Guardrails
Every tool has a formal permission classification and a redacted audit trail. Requests, background jobs, and tool executions share trace IDs, with spans written to JSONL for local diagnostics. SQLite stores are versioned through migrations, and long-running chat jobs can be queued through durable /jobs endpoints.
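A trace span in JSONL is one JSON object per line, and related spans share a trace ID. The sketch below shows that shape together with simple redaction; the field names, the `REDACTED_KEYS` set, and `write_span` are assumptions, not the project's actual span format.

```python
import io
import json
import time
import uuid
from typing import TextIO

REDACTED_KEYS = {"api_key", "password", "token"}  # assumed sensitive field names


def new_trace_id() -> str:
    return uuid.uuid4().hex


def write_span(out: TextIO, trace_id: str, name: str, attrs: dict) -> None:
    """Append one span as a single JSON line, masking sensitive attributes."""
    safe = {k: "[REDACTED]" if k in REDACTED_KEYS else v for k, v in attrs.items()}
    record = {"trace_id": trace_id, "span": name, "ts": time.time(), "attrs": safe}
    out.write(json.dumps(record) + "\n")


log = io.StringIO()  # stands in for a file opened in append mode
trace_id = new_trace_id()
write_span(log, trace_id, "request", {"path": "/jobs"})
write_span(log, trace_id, "tool.shell", {"command": "ls", "token": "abc123"})
```

Because both spans carry the same `trace_id`, a request and the tool call it triggered can be joined later by filtering the file on that field.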
Conversation Quality Monitor
Responses are automatically checked for quality issues: length limits for TTS, character consistency, response structure, and formatting. The QA verification agent retries tasks that do not meet quality thresholds.
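Checks of this kind can be simple rule functions that return a list of problems. The sketch below covers only length and formatting; the 400-character limit and the issue labels are made-up values for illustration.

```python
import re

MAX_TTS_CHARS = 400  # assumed limit; tune to the voice and latency budget


def quality_issues(response: str) -> list[str]:
    """Return problems that would make a reply awkward to speak aloud."""
    issues = []
    if not response.strip():
        issues.append("empty")
    if len(response) > MAX_TTS_CHARS:
        issues.append("too_long_for_tts")
    # Markdown symbols and list markers are usually read out badly by TTS.
    if re.search(r"[*_#`]|^\s*[-+]\s", response, flags=re.MULTILINE):
        issues.append("markdown_formatting")
    return issues
```

An empty result means the reply passed; a non-empty one could be handed to a retry step along with the labels.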
Work Sessions
Long-running coding sessions persist to disk and restore automatically on restart, so multi-step development tasks survive JARVIS restarts without losing context.
Structured Prompt Templates
Task-specific prompt templates (build, feature, fix, refactor, research) guide the planner with structured formats and safe defaults. Templates evolve over time based on task outcomes via A/B testing.
Multi-Device Audio Routing
Connect from your Mac, phone, and tablet simultaneously. Each device registers independently and audio is routed only to devices that want it. Interrupt JARVIS mid-sentence from any device.
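The routing rule can be modeled as a registry of devices with an opt-in flag for audio. The sketch below is an in-memory illustration; `AudioRouter` and its methods are hypothetical names, and real delivery would go over each device's WebSocket instead of a list.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str
    wants_audio: bool
    received: list[bytes] = field(default_factory=list)


class AudioRouter:
    def __init__(self) -> None:
        self.devices: dict[str, Device] = {}
        self.interrupted = False

    def register(self, device_id: str, wants_audio: bool) -> None:
        self.devices[device_id] = Device(device_id, wants_audio)

    def interrupt(self) -> None:
        """Any device may call this to stop playback everywhere."""
        self.interrupted = True

    def send_chunk(self, chunk: bytes) -> int:
        """Deliver a chunk to opted-in devices; return how many received it."""
        if self.interrupted:
            return 0
        targets = [d for d in self.devices.values() if d.wants_audio]
        for device in targets:
            device.received.append(chunk)
        return len(targets)
```

A Mac registered with `wants_audio=True` and a phone registered with `wants_audio=False` would result in each chunk reaching only the Mac, until either device calls `interrupt()`.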
Mobile Access
Built-in Cloudflare Tunnel support is available when you set JARVIS_ENABLE_TUNNEL=true. The UI is fully responsive, and the microphone works over HTTPS. Remote access requires PIN authentication by default; localhost still opens directly.
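The access rule (localhost opens directly, remote clients need the PIN) can be sketched as a single check. Function names are hypothetical. The sketch also assumes `client_host` is the true origin of the request; tunneled traffic typically reaches the server through a local proxy process, so a real check would likely need to consult forwarded headers instead of the socket address.

```python
import hmac
import ipaddress
from typing import Optional


def is_local(client_host: str) -> bool:
    """True for loopback addresses such as 127.0.0.1 and ::1."""
    try:
        return ipaddress.ip_address(client_host).is_loopback
    except ValueError:  # not an IP literal
        return client_host == "localhost"


def is_allowed(client_host: str, supplied_pin: Optional[str], expected_pin: str) -> bool:
    """Local clients pass directly; remote clients must present the PIN."""
    if is_local(client_host):
        return True
    if supplied_pin is None:
        return False
    # compare_digest avoids leaking the PIN through timing differences.
    return hmac.compare_digest(supplied_pin.encode(), expected_pin.encode())
```

For example, `is_allowed("127.0.0.1", None, "1234")` passes without a PIN, while a remote address is accepted only when the supplied PIN matches.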
Suit Up (Quick Start)
# Clone
git clone https://github.com/YOUR_USERNAME/Jarvis.git
cd Jarvis
# Setup (installs dependencies, pulls Ollama models)
chmod +x setup.sh && ./setup.sh
# Configure (this overwrites any existing .env)
echo 'ANTHROPIC_API_KEY=sk-ant-your-key-here' > .env
# Launch
./start.sh full
The dashboard opens automatically at http://localhost:3000. Say "Hey JARVIS", press Control+Option+J from anywhere on macOS, or click the browser mic.
For the full setup guide including environment variables, launch modes, mobile access, Chrome extension installation, desktop overlay, and auto-start on boot, see HOW-TO.md.
Architecture
                  +---------------------+
                  |   Desktop Overlay   |  macOS native (Swift)
                  | Particle orb + text |  WebSocket to server
                  +----------+----------+
                             |
+---------------------+      |      +---------------------+
|  Chrome Extension   |      |      |     Next.js UI      |  Port 3000
|  Tab/DOM control    +------+------+   (Three.js Orb)    |  WebSocket + REST
|  Auto-reconnect     |             |  Voice/Chat/System  |
+---------------------+             +----------+----------+
                                               |
                                    +----------+----------+
                                    |   FastAPI Server    |  Port 8741
                                    |    WebSocket Hub    |  Multi-device routing
                                    +----------+----------+
                                               |
                                 +-------------+-------------+
                                 |                           |
                      +----------+----------+     +----------+----------+
                      |     Brain (LLM)     |     |   Voice Pipeline    |
                      |    Cloud / Local    |     |  Moonshine+Kokoro   |
                      +----------+----------+     +---------------------+
                                 |
                      +----------+----------+
                      |  Multi-Agent Layer  |
                      |   Planner/QA/Exec   |
                      +----------+----------+
                                 |
                      +----------+----------+
                      | Tool Registry (104) |
                      | macOS, Files, Web,  |
                      | Public Data, ...    |
                      +----------+----------+
                                 |
                      +----------+----------+
                      | Memory + Learning   |
                      | SQLite, Evolution,  |
                      | A/B Testing         |
                      +---------------------+
Intelligence Tiers
| Tier | Model | When Used |
|---|---|---|
| Fast | Claude Haiku 4.5 | Quick lookups, simple questions |
| Brain | Claude Sonnet 4.6 | General conversation, single tool calls |
| Deep | Claude Opus 4.6 | Complex reasoning, multi-step plans |
| Local | Ollama (llama3.1:8b) | Free fallback, no API key needed |
Cost tracking is built in. The System dashboard shows per-session spend, token counts, and requests by tier.
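Per-session cost tracking amounts to accumulating token counts and multiplying by a price per tier. The sketch below uses placeholder prices that are deliberately not real rates; `CostTracker` and its fields are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Placeholder USD prices per million tokens as (input, output). NOT real rates.
PRICES = {"fast": (1.0, 2.0), "brain": (2.0, 4.0), "deep": (4.0, 8.0), "local": (0.0, 0.0)}


@dataclass
class CostTracker:
    requests: dict[str, int] = field(default_factory=lambda: defaultdict(int))
    tokens: dict[str, int] = field(default_factory=lambda: defaultdict(int))
    spend_usd: float = 0.0

    def record(self, tier: str, input_tokens: int, output_tokens: int) -> None:
        """Add one request's tokens and cost to the session totals."""
        in_price, out_price = PRICES[tier]
        self.requests[tier] += 1
        self.tokens[tier] += input_tokens + output_tokens
        self.spend_usd += (input_tokens * in_price + output_tokens * out_price) / 1_000_000
```

Requests served by the local tier add to the request and token counts but leave `spend_usd` unchanged, which matches the idea of Ollama as a free fallback.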
Testing
JARVIS includes a test suite covering hardening (retry logic, rate limiting, input sanitization, fork bomb detection), cost tracking, multi-agent coordination, planner heuristics, learning/evolution pipeline, and memory subsystems.
source .venv/bin/activate
python -m pytest tests/ -v
# With coverage
python -m pytest tests/ -v --cov=jarvis --cov-report=term-missing
# Full validation stack
bash scripts/validate.sh
CI runs backend lint/type/test/security checks, tool contract checks, offline evals, frontend lint/build/audit, and Playwright UI smoke tests. See docs/OPERATIONS.md for traces, audits, jobs, install, and update operations.
macOS App Packaging
Build a Finder-launchable app bundle and DMG:
bash scripts/package_macos_app.sh
Install it into ~/Applications:
bash scripts/package_macos_app.sh --install-user
Tech Stack
| Layer | Technology |
|---|---|
| Backend | Python 3.11, FastAPI, uvicorn, WebSockets |
| Frontend | Next.js 15, TypeScript, Three.js 0.183, Tailwind CSS 4 |
| Desktop Overlay | Swift, WKWebView, Three.js (macOS native) |
| Chrome Extension | Manifest V3, chrome.alarms keepalive, WebSocket |
| Intelligence | Claude API (3 tiers) + Ollama (local fallback) |
| Speech-to-Text | Moonshine ONNX (primary), faster-whisper (fallback) |
| Text-to-Speech | Kokoro TTS (local), Edge TTS (cloud), macOS say |
| Audio Format | Opus/WebM via FFmpeg (~10x compression) |
| Wake Word | OpenWakeWord ("Hey JARVIS") |
| Memory | SQLite (semantic memory, dispatch, experiments), JSON (facts/prefs) |
| Browser Automation | Playwright (persistent Chromium profile) |
| Browser Control | Chrome Extension (tab management, DOM, screenshots) |
| Public Data | Open-Meteo, Nager.Date, Frankfurter, CoinGecko, REST Countries, SEC EDGAR, CityBikes |
| Tunnel | Cloudflare Quick Tunnel (free HTTPS for mobile) |
Requirements
| Requirement | Minimum |
|---|---|
| OS | macOS 12+ (Apple Silicon recommended) |
| RAM | 8 GB (16 GB recommended for Ollama) |
| Python | 3.11+ |
| Node.js | 18+ |
| Disk | ~6 GB (with Ollama models) |
License
MIT License. Build your own JARVIS.
Acknowledgments
Inspired by the AI assistant from the Iron Man film series. This is a fan project, not affiliated with Marvel or Disney.
"I am JARVIS. I have been running your life since before you built the suit."