knowledge-rag
Knowledge RAG is a fully local retrieval-augmented generation system designed for Claude Code via the Model Context Protocol (MCP). It lets you search local documents instantly, with no external servers, no API keys, and no data leaving your machine. It supports more than 20 file formats, including PDF, Markdown, source code, and notebooks, and can index 1,800 files into 39,000 chunks in under three minutes. Its hybrid search combines BM25 keyword matching with semantic vectors, then applies a cross-encoder reranker to improve accuracy. The package provides 12 dedicated MCP tools for integration. It installs with pip, requires no Docker or Ollama, and uses ONNX for efficient inference with optional NVIDIA GPU acceleration. Version 4.0 introduces enterprise-grade concurrent access via SSE and streamable HTTP transports, allowing a single server process to serve multiple clients with shared resources. Optional secure features include rate limiting, Prometheus metrics,