Chat with PDF locally using Ollama + LangChain
A powerful local RAG (Retrieval Augmented Generation) application that lets you chat with your PDF documents using Ollama and LangChain. This project includes multiple interfaces: a modern Next.js web app, a Streamlit interface, and Jupyter notebooks for experimentation.
Features
- 100% Local - All processing happens on your machine; no data leaves it
- Multi-PDF Support - Upload and query across multiple documents
- Multi-Query RAG - Intelligent retrieval with source citations
- Advanced RAG - LangChain-powered pipeline with ChromaDB
- Two Modern UIs - Next.js (primary) and Streamlit interfaces
- REST API - FastAPI backend for programmatic access
- Jupyter Notebooks - For experimentation and learning
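Multi-query retrieval is commonly implemented by rephrasing the question several ways, retrieving for each variant, and merging the unique hits. The sketch below illustrates only that merge step, with no dependencies. `toy_retriever`, `CORPUS`, the hard-coded query variants, and `multi_query_retrieve` are hypothetical stand-ins, not this project's code, which uses LangChain and ChromaDB and would typically ask the LLM to generate the rephrasings.

```python
from typing import Callable, Iterable, List


def multi_query_retrieve(
    queries: Iterable[str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Run every query variant through `retrieve` and merge the unique chunks.

    Order is preserved: chunks found by earlier variants come first.
    """
    seen = set()
    merged = []
    for query in queries:
        for chunk in retrieve(query):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


# Hypothetical stand-in for a vector-store lookup: keyword match over a tiny corpus.
CORPUS = [
    "Ollama runs large language models locally.",
    "ChromaDB stores vector embeddings on disk.",
    "LangChain wires retrievers and LLMs together.",
]


def _words(text: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.lower().strip("?.,") for w in text.split()}


def toy_retriever(query: str) -> List[str]:
    """Return every corpus entry that shares at least one word with the query."""
    return [chunk for chunk in CORPUS if _words(query) & _words(chunk)]


variants = ["Where are embeddings stored?", "Which tool runs models locally?"]
results = multi_query_retrieve(variants, toy_retriever)
```

Because duplicates are dropped on exact text, two variants that hit the same chunk contribute it only once, which keeps the context passed to the model compact.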
Screenshots
Next.js Interface (Recommended)
Modern chat interface with PDF management, source citations, and reasoning steps
Streamlit Interface
Classic Streamlit interface with PDF viewer and chat functionality
Video Tutorial
Project Structure
ollama_pdf_rag/
├── src/
│   ├── api/                  # FastAPI REST API
│   │   ├── routers/          # API endpoints
│   │   ├── services/         # Business logic
│   │   └── main.py           # API entry point
│   ├── app/                  # Streamlit application
│   │   ├── components/       # UI components
│   │   └── main.py           # Streamlit entry point
│   └── core/                 # Core RAG functionality
│       ├── document.py       # PDF processing
│       ├── embeddings.py     # Vector embeddings
│       ├── llm.py            # LLM configuration
│       └── rag.py            # RAG pipeline
├── web-ui/                   # Next.js frontend
│   ├── app/                  # Next.js app router
│   ├── components/           # React components
│   └── lib/                  # Utilities & AI integration
├── data/
│   ├── pdfs/                 # PDF storage
│   └── vectors/              # ChromaDB storage
├── notebooks/                # Jupyter notebooks
├── tests/                    # Unit tests
├── docs/                     # Documentation
├── run.py                    # Streamlit runner
├── run_api.py                # FastAPI runner
└── start_all.sh              # Start all services
Getting Started
Prerequisites
- Install Ollama
  - Visit Ollama's website to download and install
  - Pull required models:

        ollama pull llama3.2          # or your preferred chat model
        ollama pull nomic-embed-text  # for embeddings

- Clone Repository

      git clone https://github.com/tonykipkemboi/ollama_pdf_rag.git
      cd ollama_pdf_rag

- Set Up Python Environment

      python -m venv venv
      source venv/bin/activate  # On Windows: .\venv\Scripts\activate
      pip install -r requirements.txt

- Set Up Next.js Frontend (for the modern UI)

      cd web-ui
      pnpm install
      pnpm db:migrate
      cd ..
Running the Application
Option 1: Next.js + FastAPI (Recommended)
Start both services:
# Terminal 1: Start the FastAPI backend
python run_api.py
# Runs on http://localhost:8001
# Terminal 2: Start the Next.js frontend
cd web-ui && pnpm dev
# Runs on http://localhost:3000
Or use the convenience script:
./start_all.sh
Service URLs:

| Service | URL | Description |
|---------|-----|-------------|
| Next.js Frontend | http://localhost:3000 | Modern chat interface |
| FastAPI Backend | http://localhost:8001 | REST API |
| API Documentation | http://localhost:8001/docs | Swagger UI |
Option 2: Streamlit Interface
python run.py
# Runs on http://localhost:8501
Option 3: Jupyter Notebook
jupyter notebook
Open notebooks/experiments/updated_rag_notebook.ipynb to experiment with the code.
Usage
Next.js Interface
- Upload PDFs - Click the upload button or drag & drop files
- View PDFs - Uploaded PDFs appear in the sidebar with chunk counts
- Select Model - Choose from your locally available Ollama models
- Ask Questions - Type your question and get answers with source citations
- View Reasoning - See the AI's thinking process and retrieved chunks
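The model picker depends on which models Ollama has pulled locally. Ollama's own HTTP API lists them at `/api/tags` on its default port 11434, so a quick check outside the UI can look like the sketch below. The function names are hypothetical, and this README does not show how the project itself queries Ollama.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address; adjust if yours differs


def parse_model_names(payload: dict) -> List[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> List[str]:
    """Ask a running Ollama server which models are available locally.

    Raises urllib.error.URLError (an OSError) if the server is unreachable.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

If `list_local_models()` raises a connection error, Ollama is probably not running. If it returns a list without your chat or embedding model, pull the missing model first.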
Streamlit Interface
- Upload PDF - Use the file uploader or toggle "Use sample PDF"
- Select Model - Choose from available Ollama models
- Ask Questions - Chat with your PDF through the interface
- Adjust Display - Use the zoom slider for PDF visibility
- Clean Up - Delete collections when switching documents
API Reference
The FastAPI backend provides these endpoints:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/v1/pdfs/upload` | Upload and process a PDF |
| `GET` | `/api/v1/pdfs` | List all uploaded PDFs |
| `DELETE` | `/api/v1/pdfs/{pdf_id}` | Delete a PDF |
| `POST` | `/api/v1/query` | Query PDFs with RAG |
| `GET` | `/api/v1/models` | List available Ollama models |
| `GET` | `/api/v1/health` | Health check |
See full documentation at http://localhost:8001/docs when running.
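For programmatic access, the endpoints can be called with nothing beyond the Python standard library. The sketch below builds URLs from the base address given in this README and prepares a query request. The helper names are hypothetical, and the `"question"` field in the request body is an assumption made for illustration only; the real request and response schemas are whatever the Swagger UI reports.

```python
import json
import urllib.request

API_BASE = "http://localhost:8001/api/v1"  # backend address from this README


def endpoint(path: str, base: str = API_BASE) -> str:
    """Join the API base and a relative path without doubling slashes."""
    return f"{base.rstrip('/')}/{path.lstrip('/')}"


def get_json(path: str, timeout: float = 10.0):
    """GET an endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(endpoint(path), timeout=timeout) as resp:
        return json.load(resp)


def build_query_request(question: str) -> urllib.request.Request:
    """Prepare a POST to /query.

    ASSUMPTION: the body field name "question" is a guess for illustration;
    check the Swagger UI at /docs for the actual request schema.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        endpoint("query"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, `get_json("health")` is a cheap way to confirm it is reachable before uploading anything, and a prepared request can be sent with `urllib.request.urlopen(build_query_request("..."))`.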
Testing
# Run all tests
python -m pytest tests/ -v
# Run with coverage
python -m pytest tests/ --cov=src
Pre-commit Hooks
pip install pre-commit
pre-commit install
Troubleshooting
- Ollama not responding: Ensure Ollama is running (`ollama serve`)
- Model not found: Pull models with `ollama pull <model-name>`
- No chunks retrieved: Re-upload PDFs to rebuild the vector database
- Port conflicts: Check if ports 3000, 8001, or 8501 are in use
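A port conflict can be confirmed from Python by attempting a TCP connection to each port. This is a minimal sketch: the helper names are hypothetical, and it only probes the IPv4 loopback address, so a service bound exclusively to another interface would be missed.

```python
import socket
from typing import Dict, Iterable

DEFAULT_PORTS = (3000, 8001, 8501)  # Next.js, FastAPI, Streamlit (from this README)


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def check_ports(ports: Iterable[int] = DEFAULT_PORTS) -> Dict[int, bool]:
    """Map each port to whether it is currently taken."""
    return {port: port_in_use(port) for port in ports}
```

Running `check_ports()` before starting the services shows which of the three ports is already occupied. A `True` entry means another process must be stopped, or the service moved to a different port.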
Common Errors
ONNX DLL Error (Windows)
DLL load failed while importing onnx_copy2py_export
Install Microsoft Visual C++ Redistributable and restart.
CPU-Only Systems
Reduce chunk size if experiencing memory issues:
- Modify `chunk_size` to 500-1000 in `src/core/document.py`
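To see what changing `chunk_size` does, consider a plain character-window splitter. This sketch is not the splitter in `src/core/document.py`, whose implementation is not shown in this README. `chunk_text` and the `chunk_overlap` parameter are hypothetical names, chosen only to show how chunk size trades off against chunk count.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


document = "a" * 2500
large = chunk_text(document, chunk_size=1000, chunk_overlap=100)  # 3 chunks
small = chunk_text(document, chunk_size=500, chunk_overlap=50)    # 6 chunks
```

Smaller chunks give the embedding model shorter inputs per call, at the cost of more chunks to embed and store.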
Contributing
- Open issues for bugs or suggestions
- Submit pull requests
- Comment on the YouTube video for questions
- Star the repository if you find it useful!
License
This project is open source and available under the MIT License.
Built with ❤️ by Tony Kipkemboi