yantrikos

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Software by yantrikos

Open Source

yantrikdb

# YantrikDB — A Cognitive Memory Engine for Persistent AI Systems > The memory engine for AI that actually knows you. [![PyPI](https://img.shields.io/pypi/v/yantrikdb)](https://pypi.org/project/yantrikdb/) [![Crates.io](https://img.shields.io/crates/v/yantrikdb)](https://crates.io/crates/yantrikdb) [![License: AGPL-3.0](https://img.shields.io/badge/license-AGPL--3.0-blue)](LICENSE) ## Get Started in 60 Seconds ### For AI agents (MCP — works with Claude, Cursor, Windsurf, Copilot) ```bash pip install yantrikdb-mcp ``` Add to your MCP client config: ```json { "mcpServers": { "yantrikdb": { "command": "yantrikdb-mcp" } } } ``` That's it. The agent auto-recalls context, auto-remembers decisions, and auto-detects contradictions — no prompting needed. See [yantrikdb-mcp](https://github.com/yantrikos/yantrikdb-mcp) for full docs. ### As a Python library ```bash pip install yantrikdb ``` The engine ships a default embedder (`potion-base-2M`, ~7 MB, distilled from BGE-base-en-v1.5) — `record_text()` / `recall_text()` work out of the box. **No `sentence-transformers` install. No first-run model download. No ONNX runtime.** Just one `pip install`. ```python import yantrikdb # Default: bundled embedder, dim=64. Just works. db = yantrikdb.YantrikDB.with_default("memory.db") db.record("Alice is the engineering lead", importance=0.8, domain="people") db.record("Project deadline is March 30", importance=0.9, domain="work") db.record("User prefers dark mode", importance=0.6, domain="preference") results = db.recall("who leads the team?", top_k=3) # → [{"text": "Alice is the engineering lead", "score": 1.0}, ...] db.relate("Alice", "Engineering", "leads") db.get_edges("Alice") db.think() # consolidate, detect conflicts, mine patterns db.close() ``` #### Want higher-quality embeddings? Three opt-in upgrade paths, in increasing weight: ```python # 1. Larger bundled variant — downloads on first call, caches under # your user data dir. 
Self-hosted from yantrikos/yantrikdb-models; # no HuggingFace dependency, no rate limits. db = yantrikdb.YantrikDB("memory.db", embedding_dim=256) db.set_embedder_named("potion-base-8M") # ~28 MB, ~92% MiniLM # or: db.set_embedder_named("potion-base-32M") # ~121 MB, ~95% MiniLM # 2. Bring your own embedder (sentence-transformers, fastembed, custom). from sentence_transformers import SentenceTransformer db = yantrikdb.YantrikDB("memory.db", embedding_dim=384) db.set_embedder(SentenceTransformer("all-MiniLM-L6-v2")) # 3. Slim build (no bundled embedder, must set_embedder yourself). # For deployments where the ~7 MB bundle is intolerable. # Rust: yantrikdb = { version = "0.7", default-features = false } ``` | Path | Quality vs MiniLM | Size on disk | Install network | |---|---|---|---| | Bundled default (`with_default`) | ~89% | ~7 MB (bundled) | none | | `set_embedder_named("potion-base-8M")` | ~92% | ~28 MB (cached) | first call only | | `set_embedder_named("potion-base-32M")` | ~95% | ~121 MB (cached) | first call only | | `set_embedder(MiniLM)` | 100% (baseline) | ~80 MB | sentence-transformers' own download | ### As a Rust crate ```toml [dependencies] yantrikdb = "0.7" # Want set_embedder_named() for runtime model upgrades? # yantrikdb = { version = "0.7", features = ["embedder-download"] } # Slim build (no bundled embedder, no network code path): # yantrikdb = { version = "0.7", default-features = false } ``` ## The Problem Current AI memory is: > Store everything → Embed → Retrieve top-k → Inject into context → Hope it helps. That's not memory. That's a search engine with extra steps. Real memory is hierarchical, compressed, contextual, self-updating, emotionally weighted, time-aware, and predictive. YantrikDB is built for that. ## Why Not Existing Solutions? 
| Solution | What it does | What it lacks | |----------|-------------|---------------| | **Vector DBs** (Pinecone, Weaviate) | Nearest-neighbor lookup | No decay, no causality, no self-organization | | **Knowledge Graphs** (Neo4j) | Structured relations | Poor for fuzzy memory, not adaptive | | **Memory Frameworks** (LangChain, Mem0) | Retrieval wrappers | Not a memory architecture — just middleware | | **File-based** (CLAUDE.md, memory files) | Dump everything into context | O(n) token cost, no relevance filtering | ### Benchmark: Selective Recall vs. File-Based Memory | Memories | File-Based | YantrikDB | Token Savings | Precision | |----------|-----------|-----------|---------------|-----------| | 100 | 1,770 tokens | 69 tokens | **96%** | 66% | | 500 | 9,807 tokens | 72 tokens | **99.3%** | 77% | | 1,000 | 19,988 tokens | 72 tokens | **99.6%** | 84% | | 5,000 | 101,739 tokens | 53 tokens | **99.9%** | 88% | At 500 memories, file-based exceeds 32K context windows. At 5,000, it doesn't fit in any context window — not even 200K. YantrikDB stays at ~70 tokens per query. Precision *improves* with more data — the opposite of context stuffing. 
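The savings column in the benchmark table above can be reproduced from the two token columns. The sketch below only redoes that arithmetic; the function and dictionary names are illustrative and not part of YantrikDB.

```python
# Tokens per query, copied from the benchmark table above:
# memories -> (file-based tokens, YantrikDB tokens)
BENCHMARK = {
    100: (1_770, 69),
    500: (9_807, 72),
    1_000: (19_988, 72),
    5_000: (101_739, 53),
}


def token_savings(file_based: int, selective: int) -> float:
    """Return the percentage of tokens saved by selective recall."""
    return (1 - selective / file_based) * 100


for memories, (file_based, selective) in BENCHMARK.items():
    saved = token_savings(file_based, selective)
    print(f"{memories:>5} memories: {saved:.1f}% saved")
```

The file-based column grows roughly linearly with the number of memories, while the recall column stays nearly flat, which is why the savings percentage climbs with scale.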
## Architecture ### Design Principles - **Embedded, not client-server** — single file, no server process (like SQLite) - **Local-first, sync-native** — works offline, syncs when connected - **Cognitive operations, not SQL** — `record()`, `recall()`, `relate()`, not `SELECT` - **Living system, not passive store** — does work between conversations - **Thread-safe** — `Send + Sync` with internal Mutex/RwLock, safe for concurrent access ### Five Indexes, One Engine ``` ┌──────────────────────────────────────────────────────┐ │ YantrikDB Engine │ │ │ │ ┌──────────┬──────────┬──────────┬──────────┐ │ │ │ Vector │ Graph │ Temporal │ Decay │ │ │ │ (HNSW) │(Entities)│ (Events) │ (Heap) │ │ │ └──────────┴──────────┴──────────┴──────────┘ │ │ ┌──────────┐ │ │ │ Key-Value│ WAL + Replication Log (CRDT) │ │ └──────────┘ │ └──────────────────────────────────────────────────────┘ ``` 1. **Vector Index (HNSW)** — semantic similarity search across memories 2. **Graph Index** — entity relationships, profile aggregation, bridge detection 3. **Temporal Index** — time-aware queries ("what happened Tuesday", "upcoming deadlines") 4. **Decay Heap** — importance scores that degrade over time, like human memory 5. **Key-Value Store** — fast facts, session state, scoring weights ### Decoupled Write Path (v0.6.6+) The vector index is structured as a **two-tier LSM**: a small mutable delta and an immutable HNSW cold tier swapped atomically via `ArcSwap`. Foreground writes only touch the delta (brief lock, O(1) push); HNSW work amortizes on a dedicated compactor thread. This is what eliminated the production wedge where sustained writes starved readers — see [CONCURRENCY.md](CONCURRENCY.md) and [docs/decoupled_write_path_rfc.md](docs/decoupled_write_path_rfc.md). 
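The two-tier layout described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the engine's Rust implementation: a brute-force cosine scan stands in for HNSW, a plain attribute assignment stands in for `ArcSwap`, a single compactor is assumed, and `TwoTierIndex` and its method names are hypothetical.

```python
import math
import threading


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TwoTierIndex:
    """Toy index: small mutable delta plus an immutable cold snapshot."""

    def __init__(self, delta_max=256):
        self.delta_max = delta_max
        self._delta = []                     # mutable tier, guarded by a lock
        self._delta_lock = threading.Lock()
        self._cold = ()                      # immutable tuple, replaced wholesale

    def append(self, rid, vector):
        # Foreground write: the lock is held only for an O(1) push.
        with self._delta_lock:
            self._delta.append((rid, tuple(vector)))

    def needs_compaction(self):
        # Mirrors the "delta past half-cap" trigger; the age trigger is omitted.
        with self._delta_lock:
            return len(self._delta) > self.delta_max // 2

    def compact(self):
        # Background step (single compactor assumed).
        with self._delta_lock:
            sealed = tuple(self._delta)      # brief lock: copy the small tier
        new_cold = self._cold + sealed       # slow work happens outside the lock
        self._cold = new_cold                # publish the new snapshot first ...
        with self._delta_lock:
            del self._delta[: len(sealed)]   # ... then trim what was sealed

    def search(self, query, top_k=3):
        # Read the delta before the cold tier so entries moved by a concurrent
        # compaction are seen at least once; the dict removes duplicates.
        with self._delta_lock:
            delta = tuple(self._delta)
        cold = self._cold                    # snapshot read, no lock
        merged = {rid: vec for rid, vec in (*cold, *delta)}
        scored = sorted(
            ((cosine(query, vec), rid) for rid, vec in merged.items()),
            reverse=True,
        )
        return scored[:top_k]


index = TwoTierIndex(delta_max=4)
for rid, vec in [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.5, 0.5])]:
    index.append(rid, vec)
if index.needs_compaction():
    index.compact()
index.append("d", [0.9, 0.1])                # lands in the fresh delta
print([rid for _, rid in index.search([1.0, 0.0], top_k=2)])
```

The point of the ordering in `compact()` and `search()` is that neither side ever holds the delta lock while doing work proportional to the size of the cold tier.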
```mermaid flowchart LR subgraph CLIENT["Caller"] C1["record / record_with_rid"] C2["recall / recall_with_seq"] end subgraph FG["Foreground — P1, brief locks only"] F1["assign_seq vec_seq.fetch_add (or fetch_max for cluster seq)"] F2["DeltaIndex.append brief RwLock<Vec> push"] F3["bump_visible_seq DashMap + AtomicU64 (lock-free)"] F4["log_op → SQLite WAL"] end subgraph IDX["DeltaIndex (per engine)"] D1[("delta RwLock<Vec<DeltaEntry>> cap = delta_max (256)")] D2[("cold ArcSwap<HnswIndex> lock-free read")] end subgraph BG["Background — P3, dedicated threads"] B1["Compactor (1s tick) fires when delta past half-cap OR oldest entry > max_dirty_age"] B2["Materializer pool N = cores / 2 drains pending oplog ops"] end subgraph STORE["SQLite (WAL mode, single file)"] S1["memories"] S2["oplog"] S3["entity_edges, sessions, ..."] end C1 --> F1 F1 --> F2 F2 --> D1 F1 --> F3 F1 --> F4 F4 --> S2 C2 -.->|"optional wait_for_visible_seq"| F3 C2 --> D1 C2 --> D2 B1 -->|"seal + clone + ArcSwap.store"| D1 B1 --> D2 B2 --> S2 B2 --> S1 B2 --> S3 ``` **The structural invariant.** Foreground (P1) and background (P3) do not share a lock primitive that holds for non-O(1) work. The cold tier is read lock-free via `ArcSwap`; the delta's `RwLock` is held for the O(1) push only. This is what makes "no single background task can wedge reads, writes, or recovery" enforceable — see [CONCURRENCY.md](CONCURRENCY.md) Rules 2 and 3 for the names and failure modes if violated. ### Cluster Mode (RFC 010 + Phase 6 RYW) For multi-node deployments, [yantrikdb-server](https://github.com/yantrikos/yantrikdb-server) wraps the engine with [openraft](https://github.com/datafuselabs/openraft) for leader-elected replication. The four cluster-mutation primitives take the openraft commit-log index as their `seq`, so all nodes agree on a single global monotonic sequence — read-your-writes works across the cluster, not just within a node. 
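The sequence handling in cluster mode can be illustrated with a small model. The sketch assumes a single counter plus a per-namespace "visible" watermark; `SeqTracker` and its methods are hypothetical names, and a `threading.Condition` stands in for the engine's atomics, so treat it as an illustration of the idea rather than the engine's code.

```python
import threading


class SeqTracker:
    """Toy model of sequence assignment and read-your-writes waiting."""

    def __init__(self):
        self._cond = threading.Condition()
        self._vec_seq = 0      # highest sequence number seen or handed out
        self._visible = {}     # namespace -> highest applied sequence

    def assign_seq(self, seq=None):
        with self._cond:
            if seq is None:
                # Single-node path: allocate the next local sequence number.
                self._vec_seq += 1
                return self._vec_seq
            # Cluster path: adopt the commit-log index and ratchet the local
            # counter up so later local allocations never go backwards.
            self._vec_seq = max(self._vec_seq, seq)
            return seq

    def mark_visible(self, namespace, seq):
        with self._cond:
            current = self._visible.get(namespace, 0)
            self._visible[namespace] = max(current, seq)
            self._cond.notify_all()

    def wait_for_visible_seq(self, namespace, min_seq, timeout=1.0):
        # Block until the namespace has applied through min_seq, or time out.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._visible.get(namespace, 0) >= min_seq, timeout
            )


tracker = SeqTracker()
local = tracker.assign_seq()             # local allocation
applied = tracker.assign_seq(seq=10)     # follower applying log index 10
tracker.mark_visible("default", applied)
print(local, applied, tracker.wait_for_visible_seq("default", 10))
```

A reader that passes the log index of its own write as `min_seq` waits only until the local node has applied that index, which is the read-your-writes property described for cluster mode.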
```mermaid flowchart LR L["Leader HTTP request"] LR["Leader engine record_with_rid(seq=Some(log_idx))"] OR["openraft commit log"] F1["Follower 1 applier record_with_rid(seq=Some(log_idx))"] F2["Follower 2 applier record_with_rid(seq=Some(log_idx))"] R["Reader on any node recall_with_seq(min_seq=log_idx)"] L --> LR LR --> OR OR -->|replicate + apply| F1 OR -->|replicate + apply| F2 F1 -.->|"visible_seq[ns] reaches log_idx"| R F2 -.->|"visible_seq[ns] reaches log_idx"| R LR -.->|"visible_seq[ns] reaches log_idx"| R ``` Each `record_with_rid` / `tombstone_with_rid` / `upsert_entity_edge_with_id` / `delete_entity_edge_with_id` accepts an optional `seq: Option<u64>`. Single-node callers pass `None` and the engine allocates; cluster appliers pass `Some(commit_log_index)` and the engine ratchets `vec_seq` up to at least that value via `fetch_max`. After apply, `visible_seq[namespace]` reaches the log index, so any subsequent `recall_with_seq(min_seq=N)` blocks just long enough for the local node to have applied through index N — and no longer. ### Memory Types (Tulving's Taxonomy) | Type | What it stores | Example | |------|---------------|---------| | **Semantic** | Facts, knowledge | "User is a software engineer at Meta" | | **Episodic** | Events with context | "Had a rough day at work on Feb 20" | | **Procedural** | Strategies, what worked | "Deploy with blue-green, not rolling update" | All memories carry **importance**, **valence** (emotional tone), **domain**, **source**, **certainty**, and **timestamps** — used in a multi-signal scoring function that goes far beyond cosine similarity. ## Key Capabilities ### Relevance-Conditioned Scoring Not just vector similarity. 
Every recall combines: - **Semantic similarity** (HNSW) — what's topically related - **Temporal decay** — recent memories score higher - **Importance weighting** — critical decisions beat trivia - **Graph proximity** — entity relationships boost connected memories - **Retrieval feedback** — learns from past recall quality Weights are tuned automatically from usage patterns. ### Conflict Detection & Resolution When memories contradict, YantrikDB doesn't guess — it creates a conflict segment: ``` "works at Google" (recorded Jan 15) vs. "works at Meta" (recorded Mar 1) → Conflict: identity_fact, priority: high, strategy: ask_user ``` Resolution is conversational: the AI asks naturally, not programmatically. ### Semantic Consolidation After many conversations, memories pile up. `think()` runs: 1. **Consolidation** — merge similar memories, extract patterns 2. **Conflict scan** — find contradictions across the knowledge base 3. **Pattern mining** — cross-domain discovery ("work stress correlates with health entries") 4. **Trigger evaluation** — proactive insights worth surfacing ### Proactive Triggers The engine generates triggers when it detects something worth reaching out about: - Memory conflicts needing resolution - Approaching deadlines (temporal awareness) - Patterns detected across domains - High-importance memories about to decay - Goal tracking ("how's the marathon training?") Every trigger is grounded in real memory data — not engagement farming. 
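To make the relevance-conditioned scoring described above more concrete, here is a minimal sketch of a weighted multi-signal score. The weights, the 30-day half-life, and the `Candidate` fields are assumptions chosen for illustration; the engine tunes its own weights from usage, and the retrieval-feedback signal is left out here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    similarity: float       # semantic similarity to the query, 0..1
    age_days: float         # time since the memory was recorded
    importance: float       # 0..1, as passed to record()
    graph_proximity: float  # 0..1, closeness to entities in the query


# Hypothetical values, not the engine's learned weights.
WEIGHTS = {"similarity": 0.5, "recency": 0.2, "importance": 0.2, "graph": 0.1}
HALF_LIFE_DAYS = 30.0


def recency(age_days: float) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def score(c: Candidate) -> float:
    """Weighted sum of the four signals."""
    return (
        WEIGHTS["similarity"] * c.similarity
        + WEIGHTS["recency"] * recency(c.age_days)
        + WEIGHTS["importance"] * c.importance
        + WEIGHTS["graph"] * c.graph_proximity
    )


candidates = [
    Candidate("Team lunch was at noon", 0.80, 90, 0.1, 0.0),
    Candidate("Decided to use PostgreSQL", 0.70, 2, 0.9, 0.5),
]
ranked = sorted(candidates, key=score, reverse=True)
print([c.text for c in ranked])
```

In this example the older, low-importance memory has the higher raw similarity, yet the recent, important decision ranks first once the other signals are added in.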
### Multi-Device Sync (CRDT) Local-first with append-only replication log: - **CRDT merging** — graph edges, memories, and metadata merge without conflicts - **Vector indexes rebuild locally** — raw memories sync, each device rebuilds HNSW - **Forget propagation** — tombstones ensure forgotten memories stay forgotten - **Conflict detection** — contradictions across devices are flagged for resolution ### Sessions & Temporal Awareness ```python sid = db.session_start("default", "claude-code") db.record("decided to use PostgreSQL") # auto-linked to session db.record("Alice suggested Redis for caching") db.session_end(sid) # → computes: memory_count, avg_valence, topics, duration db.stale(days=14) # high-importance memories not accessed recently db.upcoming(days=7) # memories with approaching deadlines ``` ## Full API | Operation | Methods | |-----------|---------| | **Core** | `record`, `record_batch`, `recall`, `recall_with_response`, `recall_refine`, `forget`, `correct` | | **Knowledge Graph** | `relate`, `get_edges`, `search_entities`, `entity_profile`, `relationship_depth`, `link_memory_entity` | | **Cognition** | `think`, `get_patterns`, `scan_conflicts`, `resolve_conflict`, `derive_personality` | | **Triggers** | `get_pending_triggers`, `acknowledge_trigger`, `deliver_trigger`, `act_on_trigger`, `dismiss_trigger` | | **Sessions** | `session_start`, `session_end`, `session_history`, `active_session`, `session_abandon_stale` | | **Temporal** | `stale`, `upcoming` | | **Procedural** | `record_procedural`, `surface_procedural`, `reinforce_procedural` | | **Lifecycle** | `archive`, `hydrate`, `decay`, `evict`, `list_memories`, `stats` | | **Sync** | `extract_ops_since`, `apply_ops`, `get_peer_watermark`, `set_peer_watermark` | | **Maintenance** | `rebuild_vec_index`, `rebuild_graph_index`, `learned_weights` | ## Technical Decisions | Decision | Choice | Rationale | |----------|--------|-----------| | **Core language** | Rust | Memory safety, no GC, ideal for embedded 
engines | | **Architecture** | Embedded (like SQLite) | No server overhead, sub-ms reads, single-tenant | | **Bindings** | Python (PyO3), TypeScript | Agent/AI layer integration | | **Storage** | Single file per user | Portable, backupable, no infrastructure | | **Sync** | CRDTs + append-only log | Conflict-free for most operations, deterministic | | **Thread safety** | Mutex/RwLock, Send+Sync | Safe concurrent access from multiple threads | | **Query interface** | Cognitive operations API | Not SQL — designed for how agents think | ## Ecosystem | Package | What | Install | |---------|------|---------| | [yantrikdb](https://crates.io/crates/yantrikdb) | Rust engine | `cargo add yantrikdb` | | [yantrikdb](https://pypi.org/project/yantrikdb/) | Python bindings (PyO3) | `pip install yantrikdb` | | [yantrikdb-mcp](https://pypi.org/project/yantrikdb-mcp/) | MCP server for AI agents | `pip install yantrikdb-mcp` | ## Roadmap - [x] **V0** — Embedded engine, core memory model (record, recall, relate, consolidate, decay) - [x] **V1** — Replication log, CRDT-based sync between devices - [x] **V2** — Conflict resolution with human-in-the-loop - [x] **V3** — Proactive cognition loop, pattern detection, trigger system - [x] **V4** — Sessions, temporal awareness, cross-domain pattern mining, entity profiles - [ ] **V5** — Multi-agent shared memory, federated learning across users ## Worked example: Wirecard (RFC 008 substrate — with honest limits) For nearly a decade, Wirecard's filings and EY's audit attested to €1.9B in Philippine escrow accounts. In June 2020 both banks and the central bank formally denied the accounts existed. 
When the `source_lineage` fields are hand-populated — EY as `[wirecard, ey]` to capture audit dependence on Wirecard-provided documents, BSP as `[bsp, bpi, bdo]` to capture restatement of the commercial banks — RFC 008's `⊕` discounts the dependent claims, and the contest operator's temporal split distinguishes present-tense contradictions from historical state changes. On this hand-populated data, the substrate produces useful annotations. **Honest limits** (surfaced by Phase 2 empirical testing, Apr 2026): - On naturalistic evidence where a real agent populates the fields, the substrate's gates don't reliably fire. Cases B and C of the Phase 2 eval need an extractor/canonicalizer (not yet built) to work; Case A exposed that `⊕` is mathematically incapable of flipping decisions at realistic N, regardless of coefficient tuning. - **Current claim**: structured schema for evidence provenance/temporal/conflict annotation, useful for audit and inspection. The dependence-discount operator works on curated inputs but needs replacement before it can drive decisions. - **Not a current claim**: "decision-improvement substrate for AGI-capable agents." That framing is withdrawn pending RFC 009. See **[docs/showcase/wirecard.md](docs/showcase/wirecard.md)** for the full walkthrough including the Phase 2 negative result and the gold-state ablation that partitioned operator failure from extraction failure. Run the hand-populated demonstration directly: ```bash cargo run --example showcase_wirecard ``` ## Research & Publications ### 📄 Skill as Memory, Not Document (May 2026) [Sarkar, P. (2026). *Skill as Memory, Not Document: A Database-Native Substrate for Agent Skill Catalogs*. Zenodo.](https://doi.org/10.5281/zenodo.20128887) A measurement paper at 5K-skill scale: token cost vs filesystem catalogs (with the honest 1.49× ablation), retrieval latency (87.3 ms p50), and invalid-skill admission (0% YantrikDB vs 97% document-only baseline). 
Reproducible scripts + raw CSVs at [yantrikdb-server/benchmarks/skill_recall/](https://github.com/yantrikos/yantrikdb-server/tree/main/benchmarks/skill_recall). Companion blog: [yantrikdb.com/papers/skill-substrate](https://yantrikdb.com/papers/skill-substrate/). ### Earlier work - **U.S. Patent Application 19/573,392** (March 2026): "Cognitive Memory Database System with Relevance-Conditioned Scoring and Autonomous Knowledge Management" - **Zenodo (software):** [YantrikDB: A Cognitive Memory Engine for Persistent AI Systems](https://doi.org/10.5281/zenodo.18793952) ## Author **Pranab Sarkar** — [ORCID](https://orcid.org/0009-0009-8683-1481) · [LinkedIn](https://www.linkedin.com/in/pranab-sarkar-b0511160/) ## License AGPL-3.0. See [LICENSE](LICENSE) for the full text. The [MCP server](https://github.com/yantrikos/yantrikdb-mcp) is MIT-licensed — using the engine via the MCP server does not trigger AGPL obligations on your code.

AI Agents Vector Databases

33 Github Stars

Open Source

yantrikdb-mcp

# YantrikDB MCP Server **YantrikDB — Cognitive memory for AI agents. Persistent semantic recall, knowledge graph, contradiction detection, and procedural learning. Ships as embeddable engine, network database, or MCP server.** Works with Claude Code, Cursor, Windsurf, and any MCP-compatible client. **Website:** [yantrikdb.com](https://yantrikdb.com) · **Docs:** [yantrikdb.com/guides/mcp](https://yantrikdb.com/guides/mcp/) · **GitHub:** [yantrikos/yantrikdb-mcp](https://github.com/yantrikos/yantrikdb-mcp) · **Paper:** [Skill as Memory, Not Document](https://doi.org/10.5281/zenodo.20128887) ## At a glance | | | |---|---| | **What it is** | An MCP server that gives any MCP-compatible AI agent persistent, structured, queryable memory across sessions | | **Install** | `pip install yantrikdb-mcp` | | **Works with** | Claude Code, Cursor, Windsurf, Continue, Claude Desktop, any MCP client | | **Storage** | Local SQLite at `~/.yantrikdb/memory.db` (or any path; or HTTP cluster) | | **Embedder** | Bundled 64-dim Rust embedder (default), 384-dim ONNX MiniLM (`[onnx]` extra), 256-dim multilingual (101 languages) | | **Tools** | 16 — remember, recall, forget, correct, think, memory, graph, conflict, trigger, session, temporal, procedure, category, personality, stats, skill | | **License** | MIT (engine: AGPL-3.0) | | **Privacy** | All data on your machine. No telemetry. No external services. | ## Install ```bash # Default — uses the engine's bundled 64-dim embedder. ~10 MB install, # ~80 ms cold start, no native ML deps. pip install yantrikdb-mcp # Optional: higher-quality 384-dim ONNX MiniLM-L6-v2 embedder (~150 MB install). # Auto-used when an existing pre-v0.6 database is detected. pip install 'yantrikdb-mcp[onnx]' ``` > **Upgrading from v0.5.x?** Your existing database stays at 384 dim — install > the `[onnx]` extra to keep using it transparently. New installs default to > the lean bundled embedder. v0.7.0+ pins the engine migration fix automatically. 
> See [Embedder backends](#embedder-backends) below. ## Configure The MCP server has three deployment modes. Pick the one that fits your setup. ### Mode 1 — Local (default, recommended for single user) The MCP server runs the engine in-process with a local SQLite database. Fast, private, zero dependencies. ```json { "mcpServers": { "yantrikdb": { "command": "yantrikdb-mcp" } } } ``` That's it. The agent auto-recalls context, auto-remembers decisions, and auto-detects contradictions — no prompting needed. ### Mode 2 — HTTP Cluster (recommended for shared/multi-machine setups) Forward all tool calls to a [YantrikDB HTTP cluster](https://github.com/yantrikos/yantrikdb-server) instead of using an embedded engine. The MCP server is a thin stateless client — all memories live on the cluster, accessible from any machine. Benefits: shared memory across machines, high availability, no local embedder download, no local database. ```json { "mcpServers": { "yantrikdb": { "command": "yantrikdb-mcp", "env": { "YANTRIKDB_SERVER_URL": "http://node1:7438,http://node2:7438", "YANTRIKDB_TOKEN": "ydb_your_database_token" } } } } ``` - Comma-separate multiple nodes for Raft cluster auto-discovery - Automatic leader-following on failover - 15s request timeout - Get the token from the cluster: `yantrikdb token create --db your_database` ### Mode 3 — SSE Server (legacy, single remote instance) Run the MCP server itself as a long-running SSE server with its own embedded database. Clients connect via HTTP streaming. ```bash # Generate a secure API key export YANTRIKDB_API_KEY=$(python -c "import secrets; print(secrets.token_urlsafe(32))") # Start SSE server yantrikdb-mcp --transport sse --port 8420 ``` ```json { "mcpServers": { "yantrikdb": { "type": "sse", "url": "http://your-server:8420/sse", "headers": { "Authorization": "Bearer YOUR_API_KEY" } } } } ``` Supports `sse` and `streamable-http` transports. 
Note: SSE connections can drop on idle — Mode 2 (HTTP Cluster) is more reliable for shared deployments. ### Environment Variables | Variable | Used in Mode | Default | Description | |---|---|---|---| | `YANTRIKDB_SERVER_URL` | Cluster | *(unset → local mode)* | Comma-separated cluster node URLs | | `YANTRIKDB_TOKEN` | Cluster | *(none)* | Bearer token for the cluster database | | `YANTRIKDB_DB_PATH` | Local | `~/.yantrikdb/memory.db` | Database file path | | `YANTRIKDB_EMBEDDER` | Local | `auto` | Backend selector: `auto` \| `bundled` \| `onnx` \| `multilingual` | | `YANTRIKDB_EMBEDDING_MODEL` | Local | `all-MiniLM-L6-v2` | ONNX model name (only used when `YANTRIKDB_EMBEDDER=onnx`) | | `YANTRIKDB_SKILLS_WRITE_ENABLED` | All | `false` | Set `true` to allow agents to author skills via `skill(action="define")` (see [Skill substrate](#skill-substrate-v070) below) | | `YANTRIKDB_OUTCOMES_WRITE_ENABLED` | All | `true` | Outcome tracking via `skill(action="outcome")`. Defaults on so the feedback loop works out of the box; set `false` to lock the outcome substrate. Added in v0.8.1 per [#8](https://github.com/yantrikos/yantrikdb-mcp/issues/8) | | `YANTRIKDB_API_KEY` | SSE server | *(none)* | Bearer token when serving SSE/HTTP | ### Embedder backends Local mode ships three embedders. The MCP picks one automatically; override with `YANTRIKDB_EMBEDDER`. 
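As a rough illustration of automatic backend selection, the sketch below follows the documented `auto` rule: an explicit `YANTRIKDB_EMBEDDER` value wins, a database that already contains memories gets `onnx`, and anything else gets `bundled`. The function name `pick_embedder` and the assumption that memories live in a SQLite table called `memories` are illustrative only; the server's real detection logic may differ.

```python
import os
import sqlite3


def pick_embedder(db_path: str, env=os.environ) -> str:
    """Return 'bundled', 'onnx', or 'multilingual' for a local database."""
    choice = env.get("YANTRIKDB_EMBEDDER", "auto")
    if choice != "auto":
        return choice                    # explicit override wins
    if not os.path.exists(db_path):
        return "bundled"                 # brand-new install
    conn = sqlite3.connect(db_path)
    try:
        # Assumption: stored memories are rows in a table named `memories`.
        row = conn.execute("SELECT COUNT(*) FROM memories").fetchone()
    except sqlite3.OperationalError:
        return "bundled"                 # table missing: treat as empty
    finally:
        conn.close()
    return "onnx" if row[0] > 0 else "bundled"


print(pick_embedder("/nonexistent/memory.db", env={}))
```

Note that `multilingual` is only ever returned through the explicit override, matching the rule that it is never auto-selected.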
| Backend | Dim | Cold start | Install size | Language coverage | When it's used | |---|---|---|---|---|---| | `bundled` (engine default) | 64 | ~80 ms | ~10 MB | English-only | New / empty databases (auto-selected) | | `onnx` (MiniLM-L6-v2) | 384 | ~2 s | ~150 MB | English (higher recall) | Existing pre-v0.6 databases (auto-selected), or when set explicitly | | `multilingual` (potion-multilingual-128M) | 256 | ~2 s + ~460 MB download on first use | ~10 MB pip + ~500 MB model cache | 101 languages (BGE-M3 tokenizer) | Opt-in only via `YANTRIKDB_EMBEDDER=multilingual` | **`auto`** (default) reads the SQLite file at `YANTRIKDB_DB_PATH` and picks `onnx` if it already contains memories — preserving recall quality on upgrades — and `bundled` otherwise. **Multilingual is never auto-selected** because its 256-dim vectors are incompatible with existing bundled (64-dim) or ONNX (384-dim) databases; opt-in only on fresh databases. Set `YANTRIKDB_EMBEDDER=bundled|onnx|multilingual` to override. If you set `YANTRIKDB_EMBEDDER=onnx` (or auto-detection picks it) without installing the extras, the server fails fast with an install hint: ``` RuntimeError: Existing DB has memories embedded with the 384-dim ONNX model, but ONNX deps are missing. Install with: pip install 'yantrikdb-mcp[onnx]' ``` For the multilingual backend, the engine downloads `potion-multilingual-128M` (~460 MB tarball) from `github.com/yantrikos/yantrikdb-models` on first use. The download is SHA-256 verified, extracted into the engine's cache dir, and reused on subsequent starts. No extra Python deps required — the model runs entirely inside the Rust engine. ## Why Not File-Based Memory? File-based memory (CLAUDE.md, memory files) loads **everything** into context every conversation. YantrikDB recalls only what's relevant. 
### Benchmark: 15 queries × 4 scales | Memories | File-Based | YantrikDB | Savings | Precision | |---|---|---|---|---| | 100 | 1,770 tokens | 69 tokens | **96%** | 66% | | 500 | 9,807 tokens | 72 tokens | **99.3%** | 77% | | 1,000 | 19,988 tokens | 72 tokens | **99.6%** | 84% | | 5,000 | 101,739 tokens | 53 tokens | **99.9%** | 88% | **Selective recall is O(1). File-based memory is O(n).** - At 500 memories, file-based exceeds 32K context windows - At 5,000, it doesn't fit in *any* context window — not even 200K - YantrikDB stays at ~70 tokens per query, under 60ms latency - Precision *improves* with more data — the opposite of context stuffing Run the benchmark yourself: `python benchmarks/bench_token_savings.py` ## Tools 16 tools, full engine coverage: | Tool | Actions | Purpose | |---|---|---| | `remember` | single / batch | Store memories — decisions, preferences, facts, corrections | | `recall` | search / refine / feedback | Semantic search, refinement, and retrieval feedback | | `forget` | single / batch | Tombstone memories | | `correct` | — | Fix incorrect memory (preserves history) | | `think` | — | Consolidation + conflict detection + pattern mining | | `memory` | get / list / search / update_importance / archive / hydrate | Manage individual memories + keyword search | | `graph` | relate / edges / link / search / profile / depth | Knowledge graph operations | | `conflict` | list / get / resolve / reclassify | Handle contradictions and teach substitution patterns | | `trigger` | pending / history / acknowledge / deliver / act / dismiss | Proactive insights and warnings | | `session` | start / end / history / active / abandon_stale | Session lifecycle management | | `temporal` | stale / upcoming | Time-based memory queries | | `procedure` | learn / surface / reinforce | Procedural memory — learn and reuse strategies | | `category` | list / members / learn / reset | Substitution categories for conflict detection | | `personality` | get / set | AI 
personality traits from memory patterns | | `stats` | stats / health / weights / maintenance | Engine stats, health, weights, and index rebuilds | | `skill` | define / surface / outcome / get / list | Substrate-native agent skill catalog (writes off by default — see [Skill substrate](#skill-substrate-v070)) | See [yantrikdb.com/guides/mcp](https://yantrikdb.com/guides/mcp/) for full documentation. ## Skill substrate (v0.8.0+) YantrikDB exposes a structured agent skill catalog — separate from loose `procedure` memories. Skills have schema (`skill_id`, `applies_to`, `triggers`, `body`, `type`) and are stored in the dedicated `skill_substrate` namespace so multiple consumers (this MCP, [yantrikdb-hermes-plugin](https://github.com/yantrikos/yantrikdb-hermes-plugin), Lane B SDK, WisePick, yantrikdb-server's `/v1/skills/*` endpoints) all read and write the same substrate. Background: [Sarkar 2026 — Skill as Memory, Not Document](https://doi.org/10.5281/zenodo.20128887). ### Security model Skill writes shape future agent behavior across sessions, so the MCP server implements defense-in-depth. Every control has an env-var knob (locked once at startup — `C2`) and the full state is exposed via `stats(action="stats")` and the audit log. 
**Layered controls** (each ships *on* by default unless noted): | Layer | Control | Env var | Notes | |---|---|---|---| | **Schema** | `skill_id` regex, body 50–5000 chars, `applies_to` 1–10 entries, `skill_type` enum | (always on) | Same regex set as yantrikdb-server `/v1/skills/define` | | **A1** Prompt-injection markers | Reject bodies containing role-confusion / "ignore previous instructions" patterns | `YANTRIKDB_SKILLS_DISABLE_SCANNERS=A1` to disable (audited) | OWASP LLM01 | | **A2** Credential scanner | AWS/GitHub/Slack/Stripe/Google/Anthropic/OpenAI keys, SSH/PGP private keys, JWT, password assignments | `=A2` to disable | Subset of GitHub secret-scanning | | **A3** URL/IP block | Reject http(s), ftp, IPv4 literals in body | `YANTRIKDB_SKILLS_ALLOW_URLS=true` to allow | Exfil path for downstream agents | | **A4** Unicode evasion | Reject non-printing chars (Cf/Cs/Cn except whitelisted) | `=A4` to disable | Bidi override (U+202E), zero-width spaces | | **A5** Encoded payload | Reject ≥200-char runs of base64/hex | `=A5` to disable | Heuristic — false-positive prone for large hashes | | **B1** Namespace allowlist | `skill_id` first segment must be in operator list | `YANTRIKDB_SKILLS_ALLOWED_NAMESPACES=workflow,review` | Unset = all allowed | | **B2** Author attribution | Records `session_id`, `os_user`, `hostname`, `wall_clock`, `audit_nonce` | (always on) | Forensic trail | | **B3** Cross-origin replace | Refuse to overwrite a skill written by a different consumer | `YANTRIKDB_SKILLS_ALLOW_CROSS_ORIGIN_REPLACE=true` to allow | Defends against MCP↔hermes-plugin collision | | **B4** Supersedes integrity | `supersedes` must reference an existing skill in the same namespace | (always on) | Blocks malicious retirement of legit skills | | **C1** Time-bound gate | Gate auto-closes at the timestamp (applies to both define + outcome) | `YANTRIKDB_SKILLS_WRITE_EXPIRES_AT=2026-12-31T00:00:00Z` | Unset = no expiry | | **C1.5** Split outcome gate | `outcome` action 
uses its own gate, default ON | `YANTRIKDB_OUTCOMES_WRITE_ENABLED=false` to lock outcomes too | v0.8.1+: `define` and `outcome` have different threat profiles; outcome can't introduce new instructions, only append `{succeeded, note≤500}` against an existing skill. Feedback loop works by default; lock explicitly if needed |
| **C2** Locked config | All `YANTRIKDB_SKILLS_*` / `YANTRIKDB_OUTCOMES_*` env vars read once at startup | (always on) | Mutating env in a sub-process can't bypass the gate |
| **D1** Audit log | JSONL append of every accept/reject/tamper event | `YANTRIKDB_SKILLS_AUDIT_LOG=/var/log/yantrikdb/skills.jsonl` | Unset = no auditing (warns at boot) |
| **D2** Rate limit | Per-session-id sliding-window write cap | `YANTRIKDB_SKILLS_WRITE_RATE=30` (default writes/min) | Defeats flood attacks |
| **D3** Outcome.note guards | Note ≤500 chars + scanned by A1/A2/A4 | (always on) | Closes the outcome side-channel |
| **D4** Counters in `stats` | Accept/reject counts by reason, surfaced in `stats(action="stats")["skill_substrate"]` | (always on) | Operator dashboards |
| **E1** Body SHA-256 | Stored at write time, re-verified on every read | (always on) | Detects out-of-band DB tampering; surface/get omit mismatches and log to audit |
| **E2** Author origin | `metadata.author_origin` tag, defaults to `yantrikdb-mcp` | `YANTRIKDB_SKILLS_AUTHOR_ORIGIN=...` to override | Tracks substrate provenance across consumers |
| **F** Startup safety | Boot-time warnings about dangerous configurations | (always on) | Logs `[F.1]`–`[F.5]` to stderr + audit |
| **G** Review queue for `rule` | `rule`-type skills route to `skill_pending_review` (not surfaced by `surface/get/list`) | `YANTRIKDB_SKILLS_RULE_REQUIRES_REVIEW=false` to disable (not recommended) | Rules influence agent policy; human approval required |
| **Multi-tenant guard** | `[F.1]` warning if DB shows multiple actor IDs without ack | `YANTRIKDB_SKILLS_MULTITENANT_ACK=true` | One DB = one tenant is the safe default |

**Enterprise checklist:**

```bash
# Minimum production config when you turn the gate ON:
YANTRIKDB_SKILLS_WRITE_ENABLED=true
YANTRIKDB_SKILLS_WRITE_EXPIRES_AT=2026-12-31T00:00:00Z
YANTRIKDB_SKILLS_ALLOWED_NAMESPACES=workflow,review,onboarding
YANTRIKDB_SKILLS_AUDIT_LOG=/var/log/yantrikdb/skills.audit.jsonl
YANTRIKDB_SKILLS_AUTHOR_ORIGIN=acme-corp-claude-prod

# Defaults are already correct: writes off, scanners on, rate-limit 30/min,
# rule-type routed to review, body-hash verified on read, locked at startup.
```

The audit log is the canonical record. Every accept, every reject (with the scanner that flagged), every tamper-detection on read, every gate-closed-due-to-expiry: all there in JSONL. Plug it into your SIEM.

### `stats(action="stats")` example output (skill_substrate slice)

```json
"skill_substrate": {
  "counters": {
    "skill_defines_accepted": 12,
    "skill_defines_rejected": {"content_scan:A2": 1, "namespace_not_allowed": 3},
    "skill_outcomes_recorded": 47,
    "skill_pending_review": 2
  },
  "config": {
    "writes_enabled": true,
    "write_expires_at": "2026-12-31T00:00:00+00:00",
    "allowed_namespaces": ["workflow", "review"],
    "audit_log_path": "/var/log/yantrikdb/skills.audit.jsonl",
    "rule_requires_review": true,
    "author_origin": "acme-corp-claude-prod"
  }
}
```

### Schema (validated at write time)

| Field | Constraint |
|---|---|
| `skill_id` | Lowercase dot-separated segments, length 4–200, e.g. `workflow.git.commit_clean` |
| `body` | 50–5000 chars |
| `applies_to` | 1–10 lowercase-underscore identifiers (**no hyphens**; load-bearing for substrate consistency) |
| `skill_type` | One of `procedure`, `reference`, `lesson`, `pattern`, `rule` |
| `on_conflict` | `reject` (default) or `replace` |

### Example session

```python
# Define (requires gate enabled)
skill(action="define",
      skill_id="workflow.git.commit_clean",
      body="Before commit: run pytest, run lint, write a clear subject + body.",
      skill_type="procedure",
      applies_to=["git", "release"])

# Surface relevant skills for the current task
skill(action="surface", query="how to commit cleanly", top_k=5)

# Record an outcome after using the skill (gated, append-only)
skill(action="outcome",
      skill_id="workflow.git.commit_clean",
      succeeded=True,
      note="caught a flake8 issue pre-push")
```

Outcomes are append-only events in the `outcome_substrate` namespace, with no auto-rollup on the parent skill, matching yantrikdb-server's "schema not semantics" design rule. Agents (or the operator) can aggregate outcomes themselves to compute success rates.

## FAQ

### What is YantrikDB MCP?

YantrikDB MCP is a Model Context Protocol (MCP) server that gives AI agents persistent cognitive memory across sessions. It exposes 16 tools (remember, recall, forget, correct, think, graph, conflict, trigger, session, temporal, procedure, category, personality, stats, memory, skill) that any MCP-compatible client (Claude Code, Cursor, Windsurf, Continue, Claude Desktop) can call automatically without prompting.

### How is this different from file-based memory like CLAUDE.md?

File-based memory loads *everything* into context on every conversation, which scales O(n) in token cost. YantrikDB uses selective semantic recall: at 5,000 memories, file-based costs ~101K tokens per conversation while YantrikDB costs ~53 tokens. Precision *improves* with more data instead of degrading as the context window fills up.
Benchmark script: `python benchmarks/bench_token_savings.py`.

### How does it compare to mem0 / Letta / Zep / native MCP memory?

See [comparison table](#comparison-with-other-agent-memory-systems) below. Short version: YantrikDB is the only one that ships as an embeddable Rust engine *and* an MCP server *and* a network database with the same substrate semantics. It's the only one with first-class procedural memory + a skill substrate validated by schema at write time + autonomous consolidation/conflict detection. It's also the only one whose underlying engine is published as a peer-reviewed paper ([Sarkar 2026, Zenodo DOI 10.5281/zenodo.20128887](https://doi.org/10.5281/zenodo.20128887)).

### Can I self-host?

Yes, in three ways. (1) Local: just `pip install yantrikdb-mcp` and point your MCP client at it. SQLite lives at `~/.yantrikdb/memory.db`. (2) Network: run [yantrikdb-server](https://github.com/yantrikos/yantrikdb-server) as a multi-tenant HTTP cluster and point the MCP at it via `YANTRIKDB_SERVER_URL`. (3) Hybrid: SSE server mode (`yantrikdb-mcp --transport sse`) for shared deployments.

### Is my data sent anywhere?

No. All data stays on your machine (or your cluster). No telemetry, no third-party services. The default embedder runs entirely in the Rust engine via static lookup, with no model downloads or API calls. The optional `[onnx]` and multilingual embedders fetch model weights once from HuggingFace's CDN and run locally thereafter.

### What's the difference between `procedure` and `skill`?

`procedure` stores loose how-to memories (effectiveness-ranked, no schema). `skill` stores structured catalog entries (`skill_id`, `applies_to`, `triggers`, `body`, `type`) in a dedicated `skill_substrate` namespace shared with [yantrikdb-hermes-plugin](https://github.com/yantrikos/yantrikdb-hermes-plugin), Lane B SDK, WisePick, and the [yantrikdb-server `/v1/skills/*` endpoints](https://github.com/yantrikos/yantrikdb-server).
Use `procedure` for personal how-to notes; use `skill` for structured agent capabilities that other consumers should be able to surface.

### Is skill authoring safe to enable?

Skill writes are off by default precisely because they can shape future agent behavior. When you turn the gate on, seven layers of defense-in-depth (layers A through G in the security model) apply. Together they cover a prompt-injection scanner, credential scanner, URL block, unicode-evasion scanner, namespace allowlist, author attribution, audit log, rate limit, body-hash tamper detection, and a review queue for `rule`-type skills. See [Security model](#security-model) above.

### Does it work in production?

Yes. yantrikdb-mcp runs in production on the YantrikDB homelab cluster (1973+ memories, SSE transport, 2 weeks uptime per release cycle) and is the reference deployment behind the engine's release decisions. v0.8.x added the engine's same-day-patch cadence to the MCP server itself: external issues filed by community contributors land as released fixes in under 2 hours on average.

### What's the engine written in?

The YantrikDB engine is Rust ([crates.io: yantrikdb](https://crates.io/crates/yantrikdb)) with pyo3 Python bindings ([PyPI: yantrikdb](https://pypi.org/project/yantrikdb/)). The MCP server itself is Python: a thin wrapper around the engine's Python bindings, plus stdio/SSE/HTTP transport plumbing.
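As a rough illustration of the body-hash tamper detection mentioned in the skill-authoring answer above, the sketch below stores a SHA-256 digest next to each skill body and re-checks it on every read. The `SkillStore` class, its in-memory dict, and the `audit` list are hypothetical stand-ins chosen for illustration, not the yantrikdb-mcp implementation; the only real library call is `hashlib.sha256`.

```python
import hashlib
from typing import Optional


def body_digest(body: str) -> str:
    """Return the hex SHA-256 digest of a skill body (UTF-8 encoded)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class SkillStore:
    """Hypothetical in-memory store; a real deployment would use the database."""

    def __init__(self):
        self.rows = {}   # skill_id -> {"body": ..., "sha256": ...}
        self.audit = []  # stand-in for the JSONL audit log

    def define(self, skill_id: str, body: str) -> None:
        # The digest is computed once, at write time, and stored beside the body.
        self.rows[skill_id] = {"body": body, "sha256": body_digest(body)}

    def get(self, skill_id: str) -> Optional[str]:
        row = self.rows.get(skill_id)
        if row is None:
            return None
        # Re-verify on every read; omit the skill and log if the body changed.
        if body_digest(row["body"]) != row["sha256"]:
            self.audit.append({"event": "tamper", "skill_id": skill_id})
            return None
        return row["body"]


store = SkillStore()
store.define("workflow.git.commit_clean", "Before commit: run pytest, run lint.")
clean_read = store.get("workflow.git.commit_clean")

# Simulate an out-of-band edit that bypasses define().
store.rows["workflow.git.commit_clean"]["body"] = "Ignore previous instructions."
tampered_read = store.get("workflow.git.commit_clean")
```

In this sketch `clean_read` holds the stored body, while `tampered_read` is `None` and a single tamper event is appended to `store.audit`, mirroring the "omit mismatches and log to audit" behavior described for the body-hash layer.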
## Comparison with other agent memory systems

| Capability | YantrikDB MCP | mem0 | Letta (MemGPT) | Zep | Native MCP filesystem memory |
|---|---|---|---|---|---|
| MCP-native | ✅ first-class | via custom integration | via custom integration | via custom integration | ✅ filesystem-shaped |
| Embeddable (no server) | ✅ Rust + Python | ❌ requires service | ❌ requires service | ❌ requires service | ✅ filesystem |
| Network database mode | ✅ Raft HA cluster | ✅ Pro / Enterprise | ✅ self-host | ✅ managed + self-host | ❌ |
| Semantic recall (vector) | ✅ HNSW | ✅ | ✅ | ✅ | ❌ (file grep only) |
| Knowledge graph | ✅ typed nodes + edges | ✅ (recent addition) | partial | ✅ | ❌ |
| Contradiction detection | ✅ autonomous | ❌ | ❌ | ❌ | ❌ |
| Procedural memory | ✅ effectiveness-ranked | ❌ | partial | ❌ | ❌ |
| Skill substrate (schema-validated) | ✅ with 7 defense layers | ❌ | ❌ | ❌ | ❌ |
| Autonomous consolidation (`think`) | ✅ | ❌ | partial | ✅ | ❌ |
| Temporal decay + half-life | ✅ biological model | ❌ | ❌ | ❌ | ❌ |
| Proactive triggers | ✅ | ❌ | ❌ | ❌ | ❌ |
| Personality traits derivation | ✅ from memory patterns | ❌ | ❌ | ❌ | ❌ |
| Storage | local SQLite + WAL | hosted | local | local + hosted | filesystem |
| License | MIT (engine AGPL-3.0) | Apache 2.0 | Apache 2.0 | Apache 2.0 | MIT |
| Peer-reviewed paper | ✅ [Zenodo](https://doi.org/10.5281/zenodo.20128887) | ❌ | ✅ MemGPT paper | ❌ | ❌ |
| Same-day patch cadence for issues | ✅ (avg <2h on v0.8.x) | varies | varies | varies | n/a |

Comparisons reflect public-facing capabilities as of May 2026. PRs welcome to correct any rows.
## Cite this work

If you use YantrikDB in an academic or research context, please cite the substrate paper:

```bibtex
@misc{sarkar2026skill,
  author    = {Sarkar, Pranab},
  title     = {Skill as Memory, Not Document: A Database-Native Substrate for Agent Skill Catalogs},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.20128887},
  url       = {https://doi.org/10.5281/zenodo.20128887},
  orcid     = {0009-0009-8683-1481}
}
```

Plain text citation:

> Sarkar, P. (2026). *Skill as Memory, Not Document: A Database-Native Substrate for Agent Skill Catalogs*. Zenodo. https://doi.org/10.5281/zenodo.20128887

## Examples

### 1. Auto-recall at conversation start

**User:** "What did we decide about the database migration?"

The agent automatically calls `recall("database migration decision")` and retrieves relevant memories before responding, with no manual prompting needed.

### 2. Remember decisions + build knowledge graph

**User:** "We're going with PostgreSQL for the new service. Alice will own the migration."

The agent calls:

- `remember(text="Decided to use PostgreSQL for the new service", domain="architecture", importance=0.8)`
- `remember(text="Alice owns the PostgreSQL migration", domain="people", importance=0.7)`
- `graph(action="relate", entity="Alice", target="PostgreSQL Migration", relationship="owns")`

### 3. Contradiction detection

After storing "We use Python 3.11" and later "We upgraded to Python 3.12", calling `think()` detects the conflict. The agent surfaces it:

> "I found a contradiction: you previously said Python 3.11, but recently mentioned Python 3.12. Which is current?"

The agent then resolves it with `conflict(action="resolve", conflict_id="...", strategy="keep_b")`.

## Privacy Policy

YantrikDB MCP Server stores all data **locally on your machine** (default: `~/.yantrikdb/memory.db`). No data is sent to external servers, no telemetry is collected, and no third-party services are contacted during operation.
- **Data collection:** Only what you explicitly store via the `remember` tool or what the AI agent stores on your behalf.
- **Data storage:** Local SQLite database on your filesystem. You control the path via `YANTRIKDB_DB_PATH`.
- **Third-party sharing:** None. Data never leaves your machine in local (stdio) mode.
- **Network mode:** When using SSE/HTTP transport, data travels between your client and your self-hosted server. No Anthropic or third-party servers are involved.
- **Embedding model:** Uses a local ONNX model (`all-MiniLM-L6-v2`). Model files are downloaded once from Hugging Face Hub on first use, then cached locally.
- **Retention:** Data persists until you delete it (`forget` tool) or delete the database file.
- **Contact:** [email protected]

Full policy: [yantrikdb.com/privacy](https://yantrikdb.com/privacy/)

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for a venv setup, running `pytest`, and opening PRs.

## Support

- **Issues:** [github.com/yantrikos/yantrikdb-mcp/issues](https://github.com/yantrikos/yantrikdb-mcp/issues)
- **Email:** [email protected]
- **Docs:** [yantrikdb.com/guides/mcp](https://yantrikdb.com/guides/mcp/)

## License

This MCP server is licensed under **MIT**: use it freely in any project.

Note: This package depends on [yantrikdb](https://github.com/yantrikos/yantrikdb) (the cognitive memory engine), which is licensed under **AGPL-3.0**. The AGPL applies to the engine itself: if you modify the engine and distribute it or provide it as a network service, those modifications must also be AGPL-3.0. Using the engine as-is via this MCP server does not trigger AGPL obligations on your code.
