About local_llama

This repo showcases how to run a model locally and offline, free of OpenAI dependencies.

Local Llama

This project lets you chat with your PDF, TXT, DOCX, or Markdown files entirely offline, free from OpenAI dependencies. It is an evolution of the gpt_chatwithPDF project, now using local LLMs for better privacy and offline functionality.

Features

Offline operation: Run in airplane mode
Local LLM integration: Uses Ollama for improved performance
Multiple file format support: PDF, TXT, DOCX, MD
Persistent vector database: Reusable indexed documents
Streamlit-based user interface

New Updates

Ollama integration for significant performance improvements
Uses nomic-embed-text for embeddings and llama3:8b for generation (both can be changed to your liking)
Upgraded to Haystack 2.0
Persistent Chroma vector database to enable reuse of previously uploaded docs

Installation

Install Ollama from https://ollama.ai/download
Clone this repository
Install dependencies:
```
pip install -r requirements.txt
```

Pull required Ollama models:
```
ollama pull nomic-embed-text
ollama pull llama3:8b
```

Usage

Start the Ollama server:
```
ollama serve
```
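
Before launching the app, it can help to confirm that the server is up and both models are pulled. The following is a minimal sketch, not part of this repo: it assumes Ollama is listening on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint, which lists locally available models. The helper names `missing_models` and `check_ollama` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; adjust if yours differs
REQUIRED_MODELS = ["nomic-embed-text", "llama3:8b"]


def missing_models(tags_payload, required=REQUIRED_MODELS):
    """Return the required models that are absent from an /api/tags response."""
    installed = {m["name"] for m in tags_payload.get("models", [])}
    # Ollama lists untagged pulls as "<name>:latest", so accept either form.
    return [
        name for name in required
        if name not in installed and f"{name}:latest" not in installed
    ]


def check_ollama(base_url=OLLAMA_URL):
    """Ask the local server which models it has; return those still to pull.

    Raises OSError (for example urllib.error.URLError) if the server is down.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return missing_models(json.load(resp))
```

With `ollama serve` running, `check_ollama()` should return an empty list once both models are pulled. A connection error most likely means the server is not running.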

Run the Streamlit app:
```
python -m streamlit run local_llama_v3.py
```

Upload your documents and start chatting!

How It Works

Document Indexing: Uploaded files are processed, split, and embedded using Ollama.
Vector Storage: Embeddings are stored in a local Chroma vector database.
Query Processing: User queries are embedded and relevant document chunks are retrieved.
Response Generation: Ollama generates responses based on the retrieved context and chat history.
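
The four steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's actual code: the app builds its pipeline with Haystack 2.0 and stores vectors in Chroma, whereas this sketch uses an in-memory list and brute-force cosine similarity. All function names, the chunk sizes, and the prompt layout are hypothetical. The only external call is Ollama's `/api/embeddings` endpoint on its default port.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama address


def split_text(text, chunk_words=100, overlap=20):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = max(1, chunk_words - overlap)
    chunks = []
    for i in range(0, len(words), step):
        chunks.append(" ".join(words[i:i + chunk_words]))
        if i + chunk_words >= len(words):  # this window reached the end
            break
    return chunks


def ollama_embed(text, model="nomic-embed-text"):
    """Embed text via Ollama's /api/embeddings endpoint (server must be running)."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(chunks, embed=ollama_embed):
    """Pair each chunk with its embedding; a list stands in for Chroma here."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(query, index, embed=ollama_embed, top_k=3):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks, history=()):
    """Combine retrieved context and prior (role, message) turns into one prompt."""
    context = "\n\n".join(context_chunks)
    turns = "\n".join(f"{role}: {msg}" for role, msg in history)
    return f"Context:\n{context}\n\nChat history:\n{turns}\n\nQuestion: {question}"
```

The `embed` parameter is injectable so the retrieval logic can be exercised without a running server. The resulting prompt would then be sent to the generation model (llama3:8b by default) to produce the answer.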

License

This project is licensed under the Apache 2.0 License.

Acknowledgements

Ollama team for their excellent local LLM solution
Haystack for providing the RAG framework
TheBloke for the GGUF models
