ChunkHound

Local-first codebase intelligence

Your AI assistant searches code but doesn't understand it. ChunkHound researches your codebase—extracting architecture, patterns, and institutional knowledge at any scale. Integrates via MCP.

Features

  • cAST Algorithm - Research-backed semantic code chunking
  • Multi-Hop Semantic Search - Discovers interconnected code relationships beyond direct matches
  • Semantic search - Natural language queries like "find authentication code"
  • Regex search - Pattern matching without API keys
  • Local-first - Your code stays on your machine
  • 32 languages with structured parsing
    • Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Haskell, Swift, Bash, MATLAB, Makefile, Objective-C, PHP, Dart, Lua, Vue, Svelte, Zig
    • Configuration: JSON, YAML, TOML, HCL, Markdown
    • Text-based (custom parsers): Text files, PDF
  • MCP integration - Works with Claude, VS Code, Cursor, Windsurf, Zed, etc.
  • Real-time indexing - Automatic file watching, smart diffs, seamless branch switching, and explicit backend selection (watchdog, watchman, polling)
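
To give a feel for what syntax-aware chunking means, here is a minimal sketch in Python. It is not ChunkHound's cAST implementation: it uses the standard-library `ast` module rather than Tree-sitter, handles Python source only, and the `chunk_source` function and its `max_chars` budget are names made up for this illustration. It merges adjacent top-level statements into chunks that stay under a size budget, so a chunk boundary never falls inside a function or class.

```python
import ast
from typing import List


def chunk_source(source: str, max_chars: int = 200) -> List[str]:
    """Greedily merge adjacent top-level AST nodes into chunks.

    Simplifications (assumptions of this sketch, not ChunkHound behavior):
    - only top-level statements are considered; an oversized node is kept
      whole as its own chunk instead of being split recursively
    - comments, blank lines, and decorators that lie outside a node's
      line span are dropped
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0

    for node in tree.body:
        # lineno/end_lineno are 1-based and inclusive (end_lineno: Python 3.8+).
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        # Start a new chunk when adding this node would exceed the budget.
        if current and current_len + len(text) > max_chars:
            chunks.append("\n".join(current))
            current, current_len = [], 0
        current.append(text)
        current_len += len(text)

    if current:
        chunks.append("\n".join(current))
    return chunks
```

Because every chunk is built from whole statements, each one can be parsed on its own. A plain fixed-size text splitter does not guarantee that.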

Documentation

Visit chunkhound.ai for documentation.

Requirements

Installation

    # Install uv if needed
    curl -LsSf https://astral.sh/uv/install.sh | sh

    # Install ChunkHound
    uv tool install chunkhound

Quick Start

  1. Create .chunkhound.json in project root

    {
      "embedding": {
        "provider": "voyageai",
        "api_key": "your-voyageai-key"
      },
      "llm": {
        "provider": "claude-code-cli"
      }
    }

    Note: Use "codex-cli" instead if you prefer Codex. Both work equally well and require no API key.

  2. Index your codebase

    chunkhound index

  3. Search changed code in recent commits

    # Last N commits
    chunkhound search "authentication changes" --last-n 20

    # Changes introduced by that commit (diff against its parent; root commits use empty tree)
    chunkhound search "database migration" --commit-hash abc1234

    # Custom git range
    chunkhound search "API changes" --commit-range v2.0..HEAD

    # Deep research over recent changes
    chunkhound research "what changed in the auth module?" --last-n 50

> `--vector-source` controls scope: `diff` (default, changed code only), `both` (merges diff + DB), `db` (ignore diff).
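
If you want to drive these commands from a script, the sketch below assembles the argument list in Python. It uses only the subcommands, flags, and `--vector-source` values shown in this README. The `build_command` and `run` helpers are hypothetical names for this example, not part of ChunkHound, and the sketch makes no claim about which flag combinations the CLI accepts.

```python
import subprocess
from typing import List, Optional

# Values listed in the README note on --vector-source.
VECTOR_SOURCES = ("diff", "both", "db")


def build_command(
    query: str,
    subcommand: str = "search",
    last_n: Optional[int] = None,
    commit_hash: Optional[str] = None,
    commit_range: Optional[str] = None,
    vector_source: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `chunkhound search` or `chunkhound research`."""
    if subcommand not in ("search", "research"):
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    if vector_source is not None and vector_source not in VECTOR_SOURCES:
        raise ValueError(f"vector_source must be one of {VECTOR_SOURCES}")

    cmd = ["chunkhound", subcommand, query]
    if last_n is not None:
        cmd += ["--last-n", str(last_n)]
    if commit_hash is not None:
        cmd += ["--commit-hash", commit_hash]
    if commit_range is not None:
        cmd += ["--commit-range", commit_range]
    if vector_source is not None:
        cmd += ["--vector-source", vector_source]
    return cmd


def run(cmd: List[str]) -> str:
    """Run the command and return its stdout (needs chunkhound on PATH)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run(build_command("API changes", commit_range="v2.0..HEAD", vector_source="both"))` issues the custom-range search from step 3 with the scope widened to diff plus database. Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or question marks.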

**For configuration, IDE setup, and advanced usage, see the [documentation](https://chunkhound.ai).**
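
The `.chunkhound.json` file from step 1 of the Quick Start can also be generated by a script, for instance when bootstrapping several repositories. The following is a minimal sketch that writes only the keys shown in this README. `write_config` is a hypothetical helper, not a ChunkHound API, and other configuration options are not covered here.

```python
import json
from pathlib import Path


def write_config(
    project_root: str,
    embedding_api_key: str,
    llm_provider: str = "claude-code-cli",
) -> Path:
    """Write a .chunkhound.json with the structure shown in the Quick Start."""
    config = {
        "embedding": {
            "provider": "voyageai",
            "api_key": embedding_api_key,
        },
        # The README also mentions "codex-cli" as an alternative provider.
        "llm": {"provider": llm_provider},
    }
    path = Path(project_root) / ".chunkhound.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path
```

Calling `write_config(".", "your-voyageai-key")` from the project root reproduces the Quick Start file. Since the file contains an API key, consider keeping it out of version control.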

## Why ChunkHound?

| Approach | Capability | Scale | Maintenance |
|----------|------------|-------|-------------|
| Keyword Search | Exact matching | Fast | None |
| Traditional RAG | Semantic search | Scales | Re-index files |
| Knowledge Graphs | Relationship queries | Expensive | Continuous sync |
| **ChunkHound** | Semantic + Regex + Code Research | Automatic | Incremental + realtime |

**Ideal for:**
- Large monorepos with cross-team dependencies
- Security-sensitive codebases (local-only, no cloud)
- Multi-language projects needing consistent search
- Offline/air-gapped development environments

## License

MIT