LLM Wiki Template
An AI-managed personal knowledge base using the Karpathy LLM Wiki Pattern, in which LLMs write and maintain a structured Obsidian wiki from your raw research data.
Language: English | Tiếng Việt
Video Tutorial
Bộ Nhớ AI Kiểu Karpathy: 3 Bước Xây Wiki Cho Agent [Miễn Phí] (Karpathy-Style AI Memory: 3 Steps to Build a Wiki for Your Agent [Free]): a detailed walkthrough from setup to real-world use.
What Is This?
This is a ready-to-use template for building an AI-powered personal knowledge base, inspired by Andrej Karpathy's approach to using LLMs for knowledge management:
"Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest." - Andrej Karpathy
Instead of relying on complex RAG pipelines or vector databases, this system uses a simpler approach:
- You dump raw sources (articles, tweets, papers, videos) into `raw/`
- The LLM compiles structured wiki articles in `wiki/`
- You ask questions and get answers grounded in your personal knowledge base
- Knowledge compounds: each cycle makes the wiki richer
The result is a 100% inspectable, file-based knowledge system where you can see exactly what your AI "knows."
Key Features
- File-based architecture: Markdown files, no databases, no vendor lock-in
- 100% inspectable: every piece of knowledge is a readable `.md` file
- 10 automated workflows: `/ingest`, `/compile`, `/ask`, `/cleanup`, `/breakdown`, `/autoresearch`, `/save`, `/overview`, `/startup`, `/wrapup`
- Autonomous research: the agent searches the web, evaluates sources, and ingests automatically
- Contradiction detection: flags conflicting claims instead of silently overwriting them
- Chat-to-Wiki pipeline: save knowledge from conversations directly to the wiki
- Integrated MCP server: standard MCP (Model Context Protocol) API with FTS5 search for AI agents (binds to `127.0.0.1` only)
- Knowledge graph: automated analysis and visualization of knowledge links (god nodes, orphans)
- Self-maintaining indexes: master index, glossary, backlinks, executive overview, operations log
- Quality gates: article size guardrails, anti-cramming/thinning rules, re-read checks
- Wiki health checks: automated tone, structure, link, and contradiction auditing
- Compound knowledge loop: each cycle produces better knowledge, which produces better outputs
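To make the FTS5 search feature more concrete, here is a minimal sketch of full-text search over wiki articles using Python's built-in `sqlite3` module. It is not the code in `scripts/build_search_index.py`; the table name `wiki_fts`, its columns, and both function names are assumptions made for illustration, and it assumes your Python's SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_index(articles: dict[str, str]) -> sqlite3.Connection:
    """Index {path: markdown_text} into an in-memory FTS5 table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # `path` is stored but not searched; only `body` is tokenized.
    conn.execute("CREATE VIRTUAL TABLE wiki_fts USING fts5(path UNINDEXED, body)")
    conn.executemany(
        "INSERT INTO wiki_fts (path, body) VALUES (?, ?)", list(articles.items())
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[str]:
    """Return paths of matching articles, best match first."""
    rows = conn.execute(
        "SELECT path FROM wiki_fts WHERE wiki_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


articles = {
    "wiki/concepts/rag.md": "Retrieval augmented generation fetches chunks at query time.",
    "wiki/tools/obsidian.md": "Obsidian is a Markdown editor with backlinks.",
}
conn = build_index(articles)
hits = search(conn, "retrieval")
```

Because the index is derived entirely from the Markdown files, it can be deleted and rebuilt at any time without losing knowledge.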
Quick Start
1. Use This Template
Click "Use this template" → "Create a new repository" on GitHub.
Or clone manually:
git clone https://github.com/YOUR_USERNAME/llm-wiki-template.git my-second-brain
2. Open in Obsidian
- Download Obsidian (free)
- Open as vault: `File → Open vault → Open folder as vault`
- Select the cloned directory
- Install recommended plugins when prompted: Dataview, Marp Slides
3. Connect Your AI Agent
This template works with any LLM-powered coding agent that can read files. Tested with:
- Gemini CLI (recommended)
- Claude Code / Claude Desktop with filesystem access
- Cursor / Windsurf with workspace access
- Any agent that can read/write Markdown files
The agent reads AGENTS.md as its operating manual; no additional configuration is needed.
4. Start Building Your Knowledge Base
# Step 1: Ingest a source
/ingest https://example.com/interesting-article
# Step 2: Compile into wiki
/compile
# Step 3: Ask questions
/ask What are the key concepts from my sources?
# Step 4: Audit wiki quality
/cleanup
# Step 5: Find knowledge gaps
/breakdown
# Step 6: Auto-research a topic
/autoresearch Large Language Models
# Step 7: Save chat insights to wiki
/save
Architecture
┌─────────────────────────────────────────────────┐
│                  YOUR RESEARCH                  │
│  Articles, Tweets, Papers, Videos, Repos, etc.  │
└──────────────────────┬──────────────────────────┘
                       │ /ingest
                       ▼
┌─────────────────────────────────────────────────┐
│                      raw/                       │
│  Source documents: NEVER modified, only added   │
│  articles/ papers/ repos/ tweets/ videos/ misc/ │
└──────────┬───────────────────────┬──────────────┘
           │ /compile              │ /autoresearch
           ▼                       ▼
┌─────────────────────────────────────────────────┐
│                      wiki/                      │
│  Compiled knowledge: AI-maintained wiki         │
│  concepts/ tools/ people/ comparisons/          │
│  + _index.md, _glossary.md, overview.md         │
│  Contradiction check before every update        │
└──────┬─────────┬─────────┬──────────────────────┘
       │ /ask    │ /cleanup│ /save
       ▼         ▼         ▼
┌──────────┐ ┌──────────┐ ┌──────────────────────┐
│ answers  │ │ quality  │ │ chat → raw → wiki    │
│ + refs   │ │ fixes    │ │ knowledge extraction │
└──────────┘ └──────────┘ └──────────────────────┘
Workflows
| Command | What It Does |
|---|---|
| `/ingest` | Imports raw sources (URLs, files, PDFs) into `raw/` with proper frontmatter |
| `/compile` | Reads raw sources and creates/updates structured wiki articles (with contradiction detection) |
| `/ask` | Answers questions using wiki knowledge, with optional file-back to wiki |
| `/cleanup` | Audits wiki quality (tone, structure, links, size, contradiction backlog) and auto-fixes |
| `/breakdown` | Scans wiki for missing entities and proposes new articles |
| `/autoresearch` | Autonomous research: searches the web, evaluates sources, ingests, and synthesizes reports |
| `/save` | Chat-to-Wiki: extracts knowledge from conversations and saves directly to wiki |
| `/startup` | Project Brain Startup: AI recalls context from previous sessions |
| `/wrapup` | Project Wrapup: AI saves session archive and updates rolling context |
Each workflow is defined in .agents/workflows/ and can be customized.
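As a rough illustration of what "proper frontmatter" on an ingested source could look like, the sketch below renders a raw note with a YAML header. The field names (`title`, `source`, `type`, `ingested`) and the helper names are assumptions for this example; the actual schema is whatever `AGENTS.md` and the `/ingest` workflow define.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def render_raw_note(
    title: str,
    url: str,
    body: str,
    kind: str = "articles",
    ingested: Optional[date] = None,
) -> tuple[Path, str]:
    """Return (target path under raw/, file text) without touching the disk."""
    ingested = ingested or date.today()
    safe_title = title.replace('"', '\\"')  # keep the quoted YAML string valid
    frontmatter = "\n".join(
        [
            "---",
            f'title: "{safe_title}"',  # hypothetical field names throughout
            f"source: {url}",
            f"type: {kind}",
            f"ingested: {ingested.isoformat()}",
            "---",
        ]
    )
    path = Path("raw") / kind / f"{slugify(title)}.md"
    return path, f"{frontmatter}\n\n{body.strip()}\n"


path, text = render_raw_note(
    "Interesting Article!",
    "https://example.com/interesting-article",
    "Body of the article.",
    ingested=date(2024, 1, 2),
)
```

Returning the path and text instead of writing the file keeps the function easy to preview, in the spirit of a dry run.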
AutoResearch β Autonomous Knowledge Discovery
The /autoresearch workflow turns your wiki into an active researcher:
/autoresearch [topic]
How it works:
- Gap Analysis: scans the existing wiki to identify what's missing
- 3-Round Research Loop: broad search → gap fill → verify
- Auto-Ingest: downloads and processes sources automatically
- Synthesis Report: generates an executive summary in `outputs/reports/`
- Human Review: you approve before anything enters the wiki
Configure search constraints in raw/_research_program.md.
Contradiction Detection
When compiling new sources, the system automatically checks for conflicting claims:
- Temporal updates (v1.0 → v2.0): updated normally
- New information: integrated normally
- Actual contradictions: preserved with a `[!warning]` callout and tagged `needs-review`
The wiki never silently overwrites conflicting information. Human review is always required.
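The three outcomes above can be sketched as a small classifier. In the template this judgment is made by the LLM following `compile.md`, not by code like this; the `Claim` type, the version-based rule for temporal updates, and the callout wording are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    subject: str
    value: str
    version: Optional[str] = None  # e.g. "v1.0"; hypothetical field


def classify(existing: Optional[Claim], new: Claim) -> str:
    """Decide how a new claim relates to what the wiki already says."""
    if existing is None:
        return "new"  # nothing to conflict with: integrate normally
    if existing.value == new.value:
        return "unchanged"
    if existing.version and new.version and existing.version != new.version:
        return "temporal_update"  # a newer version supersedes the older one
    return "contradiction"  # keep both and flag for human review


def warning_callout(existing: Claim, new: Claim) -> str:
    """Render an Obsidian-style warning callout that preserves both claims."""
    return "\n".join(
        [
            "> [!warning] Conflicting claims (needs-review)",
            f"> Existing: {existing.value}",
            f"> New: {new.value}",
        ]
    )


old = Claim("context window", "8k tokens")
incoming = Claim("context window", "32k tokens")
verdict = classify(old, incoming)
callout = warning_callout(old, incoming)
```

The key design point is that the `contradiction` branch never picks a winner: both values are written out and the decision is left to the reviewer.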
Quality System
The template enforces several quality mechanisms:
- Re-read before update: the AI must read the full article before editing (non-negotiable)
- Contradiction check: compare new claims against the existing wiki before writing
- Article size guardrails: 15–120 lines; too short = stub, too long = split
- Anti-cramming: sub-topics with ≥3 paragraphs get their own article
- Anti-thinning: no article is created unless ≥3 meaningful sentences can be written
- Encyclopedia tone: neutral, attribution-based writing, no editorial voice
- Absorption log: tracks which raw sources have been compiled (no duplicates)
- Operations log: chronological record of every action taken on the vault
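The article size guardrail is simple enough to sketch directly. The 15 and 120 line thresholds come from the rule above; how lines are counted (here every line, including blank lines and frontmatter) is an assumption, and `size_verdict` is a hypothetical helper name.

```python
MIN_LINES = 15   # below this an article is treated as a stub
MAX_LINES = 120  # above this an article should be split


def size_verdict(markdown: str) -> str:
    """Classify an article as 'stub', 'ok', or 'split' by its line count."""
    line_count = len(markdown.splitlines())
    if line_count < MIN_LINES:
        return "stub"
    if line_count > MAX_LINES:
        return "split"
    return "ok"


short_article = "line\n" * 10
long_article = "line\n" * 121
```

If you change the thresholds in `AGENTS.md`, any check like this would need the same values.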
File Structure
llm-wiki-template/
├── AGENTS.md                   ← Agent operating manual (the brain)
├── README.md                   ← This file
├── update.py                   ← One-command updater script
├── sync-brain.ps1              ← Auto-sync to GitHub script
├── .gitignore
│
├── .agents/workflows/          ← 10 automated workflows
│   ├── ask.md
│   ├── autoresearch.md         ← Autonomous research
│   ├── breakdown.md
│   ├── cleanup.md              ← Updated: contradiction backlog scanning
│   ├── compile.md              ← Updated: contradiction detection (Step 4.5)
│   ├── ingest.md
│   ├── save.md                 ← Chat-to-Wiki pipeline
│   ├── startup.md              ← Session startup
│   └── wrapup.md               ← Session wrapup
│
├── integrations/mcp/           ← MCP Server Integration
│   ├── README.md               ← MCP Setup Guide
│   └── config-sample.json      ← Sample config for Claude/Cursor
│
├── scripts/                    ← Agent-Native Tooling
│   ├── brain.py                ← Central CLI Router (Search, Index, Health, MCP)
│   ├── brain_mcp.py            ← FastMCP Server (Secure 127.0.0.1 bind)
│   ├── brain_db.py             ← SQLite Database abstraction
│   ├── build_search_index.py   ← FTS5 Indexing
│   └── ...                     ← Other tools (audit, resolve_orphans, etc.)
│
├── .obsidian/                  ← Obsidian config (pre-configured)
│
├── sessions/                   ← Session logs (AI Memory)
│   ├── current-context.md      ← Rolling context (auto-updated)
│   ├── .hot-buffer.md          ← Mid-session decisions buffer
│   └── session-summary-*.md    ← Archive of each session
│
├── raw/                        ← Your source documents
│   ├── _ingest.py              ← Batch ingest script (Python)
│   ├── _research_program.md    ← AutoResearch configuration
│   ├── articles/
│   ├── papers/
│   ├── repos/
│   ├── tweets/
│   ├── videos/
│   └── misc/
│
├── wiki/                       ← AI-maintained wiki
│   ├── overview.md             ← Executive summary for cross-project access
│   ├── _index.md               ← Master catalog
│   ├── _glossary.md            ← Term definitions
│   ├── _absorb_log.json        ← Compilation tracker
│   ├── _backlinks.json         ← Reverse link index
│   ├── _build_backlinks.py     ← Backlinks builder script
│   ├── _build_graph.py         ← Knowledge Graph analysis script
│   ├── _dashboard.md           ← Dataview dashboard
│   ├── _ops_log.md             ← Operations log
│   ├── concepts/
│   ├── tools/
│   ├── people/
│   └── comparisons/
│
└── outputs/                    ← Generated content
    ├── reports/                ← AutoResearch synthesis reports
    ├── slides/
    ├── charts/
    └── summaries/
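To show what a reverse link index such as `_backlinks.json` might contain, here is a minimal sketch that scans `[[wikilinks]]` and derives backlinks, orphans, and heavily linked "god nodes". It is not the code in `_build_backlinks.py` or `_build_graph.py`; the function names, the in-memory input format, and the `min_inbound` threshold used to define a god node are assumptions.

```python
import re
from collections import defaultdict

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]]; captures only Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def build_backlinks(articles: dict[str, str]) -> dict[str, list[str]]:
    """Map each article name to the sorted list of articles that link to it."""
    inbound: dict[str, set[str]] = defaultdict(set)
    for name, text in articles.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target in articles and target != name:  # ignore dangling and self links
                inbound[target].add(name)
    return {name: sorted(inbound.get(name, ())) for name in articles}


def orphans(backlinks: dict[str, list[str]]) -> list[str]:
    """Articles that nothing links to."""
    return sorted(name for name, sources in backlinks.items() if not sources)


def god_nodes(backlinks: dict[str, list[str]], min_inbound: int = 3) -> list[str]:
    """Articles with at least `min_inbound` inbound links (assumed definition)."""
    return sorted(n for n, sources in backlinks.items() if len(sources) >= min_inbound)


articles = {
    "RAG": "See [[Embeddings]] and [[LLM|large language models]].",
    "Embeddings": "Used by [[RAG]] and [[LLM#Training]].",
    "LLM": "No outgoing links here.",
    "Notes": "Mentions [[LLM]] in passing.",
}
backlinks = build_backlinks(articles)
```

Serializing `backlinks` with `json.dumps` would give a file in the spirit of a reverse link index, although the real file's layout may differ.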
Customization
Change the Language
The template uses English by default. To switch:
- Edit `AGENTS.md` → update the Writing Tone section
- Update the headers of the wiki meta files (`_index.md`, `_glossary.md`)
- The AI will follow the language preference set in `AGENTS.md`
Add Entity Types
Edit AGENTS.md → Entity-Type Templates section to add new categories beyond concepts/tools/people/comparisons.
Modify Quality Rules
All quality rules are in AGENTS.md. Adjust thresholds (article size, quote density, etc.) to match your preferences.
Configure AutoResearch
Edit raw/_research_program.md to customize:
- Search scope and constraints
- Confidence scoring thresholds
- Source exclusion lists
- Domain-specific notes and priorities
Add Obsidian Plugins
The template includes configs for Dataview (tables/queries) and Marp Slides (presentations). Add more plugins through Obsidian's community plugin browser.
Batch Ingest Script
For bulk importing, use the included Python script:
# Single file
python raw/_ingest.py path/to/article.md
# PDF (requires PyMuPDF: pip install PyMuPDF)
python raw/_ingest.py paper.pdf
# Entire folder
python raw/_ingest.py ~/Downloads/research-notes/
# Preview without creating files
python raw/_ingest.py big-folder/ --dry-run
Updating Your Vault
Already using the template? Update to the latest workflows with a single command:
Option A: brain CLI (Recommended)
Install once:
pip install git+https://github.com/KHOAAI-HILL/llm-wiki-template.git
Then use anywhere inside your vault:
brain update # Update (asks for confirmation)
brain update --dry-run # Preview changes without writing
brain update --force # Update without asking
brain status # Check vault health
brain version # Show version
Option B: Standalone script (No install)
# Download the script
curl -o update.py https://raw.githubusercontent.com/KHOAAI-HILL/llm-wiki-template/master/update.py
# Run it
python update.py
Both methods only touch system files (workflows, AGENTS.md). Your personal data (raw/, wiki/, sessions/, outputs/) is never modified.
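The safety rule above amounts to a path filter. The sketch below is one way such a filter could look, not the logic in `update.py`; the constant names, the helper `is_updatable`, and the exact allowlist (only `AGENTS.md` and `.agents/workflows/`) are assumptions drawn from the sentence above.

```python
from pathlib import PurePosixPath

# Personal data directories named in this README: never written by an update.
PROTECTED_DIRS = ("raw/", "wiki/", "sessions/", "outputs/")
# System files an update may replace (assumed allowlist).
SYSTEM_FILES = ("AGENTS.md",)
SYSTEM_DIRS = (".agents/workflows/",)


def is_updatable(relative_path: str) -> bool:
    """Return True only for system files; personal data is always rejected."""
    path = PurePosixPath(relative_path).as_posix()  # normalizes a leading "./"
    if path.startswith(PROTECTED_DIRS):
        return False
    return path in SYSTEM_FILES or path.startswith(SYSTEM_DIRS)


candidates = [
    "AGENTS.md",
    ".agents/workflows/compile.md",
    "wiki/concepts/rag.md",
    "raw/articles/source.md",
    "sessions/current-context.md",
]
to_update = [p for p in candidates if is_updatable(p)]
```

Using an allowlist instead of only a blocklist means that any path not explicitly recognized as a system file is left alone by default.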
Philosophy
This template is built on three principles:
- Files over databases: Markdown files are portable, inspectable, and version-controllable. No vector DB, no cloud dependencies.
- Compile once, query forever: instead of retrieving raw chunks on every query (RAG), the AI pre-compiles clean wiki articles. Queries read refined knowledge, not raw data.
- Knowledge compounds: each ingest-compile-ask cycle makes the wiki richer. Better wiki → better answers → better questions → richer wiki.
Credits
- Andrej Karpathy: originated the LLM Knowledge Base concept
- Farzaa: `wiki-gen-skill` implementation that heavily influenced the quality gates
- DataChaz: community breakdown and analysis
License
MIT: use freely, modify as you wish, share with others.
