Awesome-RAG
The section outline is still entirely in draft form. Everything is in motion (in this document and in my head).
- General
- Dialogue Routing
- LLM Models
- Retrieval
- Prompts
- Generation
- Evaluation
- Performance and cost
- Privacy
- Security
- Applications of RAG
- Tools
- Vendor-specific examples
- Running RAGs in production
- Vectors corner
General
- Retrieval Augmented Generation - Intuitively and Exhaustively Explained
- GraphRAG - Microsoft Research Blog Post
Disadvantages of RAG
RAG Patterns
- Generative AI Lifecycle Patterns
- Why do RAG pipelines fail? Advanced RAG Patterns - Part1 (Ozgur Guler)
- How to improve RAG performance - Advanced RAG Patterns - Part2
- Patterns for Building LLM-based Systems & Products
- AI Engineer Summit - Building Blocks for LLM Systems & Products
- Technical Considerations for Complex RAG
Dialogue Routing
Retrieval
Vector Retrieval
- Boosting RAG: Picking the Best Embedding & Reranker models
- What We Need to Know Before Adopting a Vector Database
Chunking
- Chunking Strategies for LLM Applications
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
- How to Chunk Text Data - A Comparative Analysis
Positional chunking
Semantic chunking
Embeddings
Vector Search
RAG Fusion
Not Vector Retrieval
- Vector Search Is Not All You Need
- Build a search engine, not a vector DB
- Improving RAG (Retrieval Augmented Generation) Answer Quality with Re-ranker
- From Search to Synthesis: Enhancing RAG with BM25 and Reciprocal Rank Fusion
Generation
Prompts
- Emerging RAG & Prompt Engineering Architectures for LLMs
- How to Cut RAG Costs by 80% Using Prompt Compression
Prompting strategies
Multi-Modal RAG
Multi-index RAG
Multi-Document
FLARE
Chain-of-Verification
Chain-of-Thought
Context
Long context RAG
Knowledge and Knowledge Graphs
- Graph RAG: Unleashing the Power of Knowledge Graphs with LLM
- Embeddings + Knowledge Graphs: The Ultimate Tools for RAG Systems
- The Practical Benefits to Grounding an LLM in a Knowledge Graph (Daniel Bukowski)
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
Automated prompt optimization
Hallucination
Guardrails
LLM Models
Finetuning and Pretraining
- Fine-Tuning Llama 2.0 with Single GPU Magic
- Practitioners guide to fine-tune LLMs for domain-specific use case
- Are You Pre-training your RAG Models on Your Raw Text?
- Combine Multiple LoRA Adapters for Llama 2
- RAG vs Finetuning - Which Is the Best Tool to Boost Your LLM Application?
Evaluation of RAGs
- RAG Evaluation
- Evaluating RAG: A journey through metrics
- Exploring End-to-End Evaluation of RAG Pipelines
- Evaluation Driven Development, the Swiss Army Knife for RAG Pipelines
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
Performance and cost
Privacy
Security
Applications of RAG
Chatbots
Tools
DSPy
AutoRAG
- AutoRAG - AutoML tool for RAG. Automatically optimize a RAG pipeline with a single YAML file.
AutoGPT
LangChain
LlamaIndex
- Building Production-Ready LLM Apps with LlamaIndex: Document Metadata for Higher Accuracy Retrieval
- Building Production-Ready LLM Apps With LlamaIndex: Recursive Document Agents for Dynamic Retrieval
Vendor-specific examples
Elasticsearch + OpenAI
OpenAI and ChatGPT
Tools and functions
- Unlocking the Power of the OpenAI API: Master Function-Calling with Practical Examples
- OpenAI/Chat-GPT Function Calling: for Enhanced AI Interactions
Vespa
Qdrant
Running RAGs in production
Vectors corner
- Similarity Search, Part 2: Product Quantization
- Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval
- Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets (Nils Reimers)