About The-Librarian

Stop re-explaining your project every session. Persistent memory for Cowork — 100% local, zero configuration. If it was useful to you, please consider buying me a drink (link below). Enjoy!

Published by

prdicta

The Librarian

Persistent memory for Cowork. The Librarian gives Claude perfect recall across conversations — preferences, decisions, project context, and past discussions survive between sessions.

Built for Claude on Cowork (Anthropic's desktop app). Ground truth is stored locally and injected at retrieval time via a CLI. The architecture is LLM-agnostic in principle, but Cowork is the tested and supported environment.

If The Librarian was useful to you, please consider buying me a drink.

Why this exists

Every conversation with an AI starts from zero. You explain who you are, what you're working on, what you've already tried...and then the context window fills up and you start over. Three hundred sessions in, you're still introducing yourself and explaining you're not "the user". It's embarrassing.

The Librarian started as a shower thought: there had to be a better way. It began as an experiment: what if it just remembered our interactions? Not through fine-tuning or retraining, but by keeping a local record of everything and surfacing the right pieces at the moment they are needed.

I used the analogy of a well-worn book that falls open to the page you need, its spine cracked from use. And now, when I ask it to recall that analogy, it can.

What it does

Tested on: Claude Opus 4.6 (Anthropic, February 2026) via Cowork on Windows and macOS.

The Librarian sits between you and Claude in Cowork, maintaining a local knowledge base that grows over time:

Conversation memory — Every substantive exchange is indexed. Preferences, decisions, code patterns, and project facts persist across sessions automatically.
Hybrid search — Combines FTS5 keyword matching with ONNX-accelerated semantic embeddings (all-MiniLM-L6-v2) for accurate retrieval. Query expansion and multi-signal reranking surface the right context.
Three-tier storage — Active context stays lean. Frequently accessed entries are promoted to a hot cache. Everything else lives in cold storage, retrieved on demand.
User knowledge — Facts about the user (preferences, corrections, biographical context) get a permanent 3x search boost and are loaded at every boot.
Context window management — Automatically tracks token budgets and offloads content to the rolodex before context overflows.
Temporal grounding — Timestamps everything and flags stale entries, so Claude never presents outdated information as current truth.
Dual mode — Works out of the box in verbatim mode (no API key needed), or with an Anthropic API key for enhanced extraction and enrichment (tested with Claude models).
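
To make the three-tier idea concrete, here is a minimal sketch of promotion from cold storage to a hot cache. It models only the two storage levels (the active context is whatever currently sits in the prompt). The `TieredStore` class, the `promote_after` threshold, and the in-memory dicts are assumptions made for illustration; The Librarian's actual promotion rules are not described in this README.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy store with a hot cache in front of cold storage (hypothetical)."""
    promote_after: int = 3  # assumed threshold, not taken from The Librarian
    hot: dict = field(default_factory=dict)
    cold: dict = field(default_factory=dict)
    hits: dict = field(default_factory=dict)

    def put(self, key: str, text: str) -> None:
        # New entries start in cold storage with no recorded accesses.
        self.cold[key] = text
        self.hits[key] = 0

    def get(self, key: str) -> str:
        # Count the access, then promote once the entry is read often enough.
        self.hits[key] += 1
        if key in self.hot:
            return self.hot[key]
        text = self.cold[key]
        if self.hits[key] >= self.promote_after:
            self.hot[key] = self.cold.pop(key)
        return text
```

With `promote_after=2`, the second `get` on the same key moves the entry into `hot`, and later reads are served from there.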

Installation

Download the latest release for your platform from Releases:

Platform	Artifact
Windows	`TheLibrarian-windows.tar.gz`
macOS	`TheLibrarian-macos.tar.gz`
Linux	`TheLibrarian-linux.tar.gz`

Extract the archive and run the librarian binary. On first run, use the init command to set up your workspace:

librarian init /path/to/your/project

This copies the CLI, source files, and ONNX model into the target directory.

Commands

All interaction happens through the librarian CLI:

librarian boot [--compact|--full-context]   # Start or resume a session
librarian ingest <role> "<text>"            # Store a message
librarian recall "<query>"                  # Search memory
librarian remember "<fact>"                 # Store a user-knowledge fact
librarian stats                             # Session and memory health
librarian end "<summary>"                   # Close a session
librarian profile set <key> <value>         # Set a user preference
librarian profile show                      # View preferences
librarian pulse                             # Heartbeat check
librarian maintain                          # Background knowledge graph hygiene

Search options

librarian recall "<query>" --source conversation|document|user_knowledge
librarian recall "<query>" --fresh [hours]   # Prioritize recent entries
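
If you want to call recall from a script, the options above can be assembled programmatically. The sketch below is a hypothetical wrapper, not part of The Librarian: the function names are invented, it assumes the `librarian` binary is on your PATH, it assumes `--source` and `--fresh` can be combined, and it returns the command's output as raw text because the output format is not specified here.

```python
import subprocess

# Source values as listed in the search options above.
VALID_SOURCES = {"conversation", "document", "user_knowledge"}


def recall_args(query, source=None, fresh=False, fresh_hours=None):
    """Build the argument list for `librarian recall` (hypothetical helper)."""
    args = ["librarian", "recall", query]
    if source is not None:
        if source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {source}")
        args += ["--source", source]
    if fresh or fresh_hours is not None:
        args.append("--fresh")
        if fresh_hours is not None:
            args.append(str(fresh_hours))  # optional hours argument
    return args


def run_recall(query, **options):
    """Run the command and return its stdout as text."""
    result = subprocess.run(recall_args(query, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the query contains spaces or quotes.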

How it works

┌────────────────┐    ┌───────────────────────────────┐
│  Cowork (CLI)  │────│  The Librarian                │
│  boot / recall │    │                               │
│  ingest / pulse│    │  FTS5 keyword search          │
└────────────────┘    │  + ONNX semantic embeddings   │
                      │  + multi-signal reranking     │
                      │                               │
                      │  ┌─────────────────────────┐  │
                      │  │  rolodex.db (SQLite)    │  │
                      │  │  entries + embeddings   │  │
                      │  │  user knowledge + KG    │  │
                      │  └─────────────────────────┘  │
                      └───────────────────────────────┘

Storage

All data lives in a single SQLite database (rolodex.db) in your project directory. No external servers, no cloud dependencies. The database uses WAL mode for concurrent read/write safety.
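
For readers unfamiliar with WAL mode, this is how it is enabled from Python's standard `sqlite3` module. The `open_wal` helper is a name chosen for this example and is not taken from The Librarian's source.

```python
import sqlite3


def open_wal(path):
    """Open a SQLite database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode now in effect ("wal" on success).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode
```

WAL mode is persistent: once set, it stays in effect for that database file across connections. It applies to file-backed databases only, so an in-memory database reports `memory` instead.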

Entries are categorized automatically (note, preference, decision, code, etc.) and tagged with temporal metadata. Each entry gets a vector embedding for semantic search alongside FTS5 indexing for keyword search.
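
The pairing of an FTS5 index with a stored embedding might look roughly like the sketch below. The table names, column names, and the float32 packing are assumptions for illustration and do not reflect the real rolodex.db schema. It also assumes your Python build's SQLite was compiled with FTS5, which is common but not guaranteed.

```python
import sqlite3
import struct

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per entry, embedding packed as float32 bytes.
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, category TEXT, "
             "created_at TEXT, content TEXT, embedding BLOB)")
# FTS5 virtual table that indexes the entry text for keyword search.
conn.execute("CREATE VIRTUAL TABLE entries_fts USING fts5(content)")


def add_entry(category, created_at, content, vector):
    """Store an entry, its packed embedding, and its keyword index row."""
    blob = struct.pack(f"{len(vector)}f", *vector)
    cur = conn.execute(
        "INSERT INTO entries (category, created_at, content, embedding) "
        "VALUES (?, ?, ?, ?)", (category, created_at, content, blob))
    # Keep the FTS rowid equal to the entry id so hits can be joined back.
    conn.execute("INSERT INTO entries_fts (rowid, content) VALUES (?, ?)",
                 (cur.lastrowid, content))
    return cur.lastrowid


def keyword_search(term):
    """Return (id, category, content) rows matching term, best match first."""
    rows = conn.execute(
        "SELECT entries.id, entries.category, entries.content "
        "FROM entries_fts JOIN entries ON entries.id = entries_fts.rowid "
        "WHERE entries_fts MATCH ? ORDER BY rank", (term,))
    return rows.fetchall()
```

Semantic search would then compare a query vector against the unpacked `embedding` blobs, which this sketch leaves out.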

Search pipeline

Query expansion — The query is expanded into multiple search variants with entity extraction and intent classification.
Wide-net retrieval — Each variant pulls up to 15 candidates via hybrid search (keyword + semantic).
Reranking — A multi-signal reranker scores candidates on semantic relevance, entity overlap, category match, recency, and access frequency.
Context assembly — Top results are formatted into a context block with metadata, reasoning chains, and source attribution.
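
The reranking step can be pictured as a weighted sum over the five signals named above. The following is only a sketch under stated assumptions: the `Candidate` fields, the weights, the recency half-life, and the frequency curve are all invented for illustration, and the real reranker may combine its signals quite differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    semantic: float     # similarity to the query, assumed to lie in 0..1
    entities: set       # entities found in the entry
    category: str
    age_hours: float
    access_count: int


# Hypothetical weights (they sum to 1); the real weighting is not documented here.
WEIGHTS = {"semantic": 0.5, "entity": 0.2, "category": 0.1,
           "recency": 0.1, "frequency": 0.1}


def score(c, query_entities, wanted_category, half_life_hours=168.0):
    """Combine the five signals into a single score between 0 and 1."""
    entity = (len(c.entities & query_entities) / len(query_entities)
              if query_entities else 0.0)
    category = 1.0 if c.category == wanted_category else 0.0
    recency = 0.5 ** (c.age_hours / half_life_hours)   # halves every half-life
    frequency = 1.0 - 1.0 / (1.0 + math.log1p(c.access_count))  # 0 when never read
    return (WEIGHTS["semantic"] * c.semantic + WEIGHTS["entity"] * entity
            + WEIGHTS["category"] * category + WEIGHTS["recency"] * recency
            + WEIGHTS["frequency"] * frequency)


def rerank(candidates, query_entities, wanted_category, top_k=5):
    """Return the top_k candidates, highest score first."""
    ranked = sorted(candidates,
                    key=lambda c: score(c, query_entities, wanted_category),
                    reverse=True)
    return ranked[:top_k]
```

Keeping each signal in the 0 to 1 range makes the weights directly comparable, which is one reason to normalize before summing.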

Embedding

The Librarian bundles an ONNX-optimized all-MiniLM-L6-v2 model (~25 MB) for local semantic embeddings, so search works without any API calls. The embedding strategy follows a fallback chain, tried in order until one backend is available: Anthropic API → local sentence-transformers → ONNX Runtime → deterministic hash (always available).
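
A fallback chain of this kind can be sketched as a loop over providers that ends in a hash-based embedding. Everything below is an assumption for illustration: the `embed` and `hash_embedding` names, the provider interface, and the SHA-256 construction are not taken from The Librarian's source. The default of 384 dimensions matches the output size of all-MiniLM-L6-v2.

```python
import hashlib
import math


def hash_embedding(text, dim=384):
    """Deterministic fallback: derive a unit vector from SHA-256 digests."""
    values = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(b / 127.5 - 1.0 for b in digest)  # map bytes to [-1, 1]
        counter += 1
    values = values[:dim]
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]


def embed(text, providers):
    """Try each (name, function) provider in order, then fall back to hashing."""
    for name, provider in providers:
        try:
            return name, provider(text)
        except Exception:
            continue  # backend unavailable or failed; try the next one
    return "hash", hash_embedding(text)
```

A hash embedding of this sort is stable across runs but carries no semantic meaning, so it only guarantees that the pipeline keeps working; similar texts will not land near each other.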

Reasoning chains

When Claude's thinking process matters (design decisions, debugging sessions, multi-step analyses), The Librarian captures reasoning chains — ordered sequences of steps that preserve the "why" alongside the "what."

Limitations and known scope

Single model family tested. The Librarian has been developed and tested exclusively with Claude (Anthropic) via Cowork. Behavior with other LLMs is untested.
Single platform tested. Cowork is the only verified integration surface. The CLI architecture is portable, but no other platform has been validated.
English-only embeddings. The bundled ONNX model (all-MiniLM-L6-v2) is optimized for English. Recall quality for other languages will be lower.
Single-user. The rolodex is designed for one person's workflow. There is no multi-user, permissioning, or shared-memory support.
No encryption at rest. The SQLite database is stored in plaintext. Sensitive data should be managed accordingly.

Building from source

Requirements: Python 3.12+, pip

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-onnx.txt

# Export the ONNX model
pip install torch sentence-transformers onnx onnxscript
python scripts/export_onnx_model.py

# Build the standalone binary
python build.py

The build produces a PyInstaller bundle in dist/librarian/ (Windows/Linux) or dist/The Librarian.app/ (macOS).

Running tests

pip install -r requirements-dev.txt
pytest tests/

CI

The GitHub Actions workflow builds and smoke-tests on all three platforms (Windows, macOS, Linux) on every push to main. The smoke test runs 10 end-to-end checks covering binary launch, boot, ingest, recall, embedding verification, cross-topic search, and session lifecycle.

Data and backup

Your memory is a single file: rolodex.db. Standard SQLite — portable, self-contained, no server required.

While running, SQLite creates companion files (rolodex.db-wal, rolodex.db-shm) for crash safety. These are transient and fold back into the main database when the connection closes.

To back up: if The Librarian is running, copy all three files together. If it's stopped, only rolodex.db is needed.
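
As an alternative to copying files by hand, SQLite's online backup API can take a consistent snapshot of a live database. This is a suggestion based on Python's standard `sqlite3` module, not a procedure documented by The Librarian, and `backup_database` is a name chosen for this example.

```python
import sqlite3


def backup_database(source_path, dest_path):
    """Copy a SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        # Produces a consistent copy, including changes still held in the WAL.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

The result is a single self-contained file, so there are no companion files to keep track of in the backup.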

License

The Librarian is dual-licensed:

Open source — GNU Affero General Public License v3.0 (AGPL-3.0). Free to use, modify, and distribute under AGPL terms. If you modify the software and make it available over a network, you must release your source code under the AGPL-3.0.
Commercial — For OEMs, ISVs, SaaS providers, and enterprises that want to embed or distribute The Librarian without AGPL obligations. See COMMERCIAL_LICENSE.md or contact [email protected].
