gcp-redis-llm-stack

Open source · MIT · Jupyter Notebook · 38 stars · 14 forks · 1 issue · 5 watchers · last commit: 1 year ago

About gcp-redis-llm-stack

Reference architecture for LLM-based applications on Google Cloud Platform with Redis Enterprise as a high-performance data layer.

Platforms

Web, Self-hosted, Cloud

Languages

Jupyter Notebook

Scalable LLM Architectures with Redis & GCP Vertex AI

☁️ Generative AI on Google Vertex AI comes with a specialized in-console studio experience, a dedicated API for Gemini, and an easy-to-use Python SDK for deploying and managing instances of Google's powerful language models.

⚡ Redis Enterprise offers fast and scalable vector search, with an API for index creation, index management, low-latency search, and hybrid filtering. Combined with its versatile data structures, Redis Enterprise is a strong fit for building high-quality Large Language Model (LLM) apps.
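
To make "vector search with hybrid filtering" concrete, here is a minimal in-memory sketch in plain Python. It is not the Redis API: `Doc`, `cosine_similarity`, `hybrid_search`, and the sample vectors are hypothetical names and data made up for illustration. The idea it shows is the same, though: first narrow candidates by metadata, then rank the survivors by vector similarity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Doc:
    """A stored record: text, its embedding, and filterable metadata (all illustrative)."""
    text: str
    vector: list
    tags: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query_vector, filters, k=2):
    """Keep docs whose tags match every filter, then return the k most similar."""
    candidates = [
        d for d in docs
        if all(d.tags.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(d.vector, query_vector),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional "embeddings"; a real model returns much longer vectors.
DOCS = [
    Doc("Redis supports vector indexes.", [1.0, 0.0, 0.0], {"source": "redis"}),
    Doc("Vertex AI hosts Gemini models.", [0.0, 1.0, 0.0], {"source": "gcp"}),
    Doc("Redis can cache LLM responses.", [0.9, 0.1, 0.0], {"source": "redis"}),
]

top = hybrid_search(DOCS, query_vector=[0.0, 1.0, 0.0], filters={"source": "redis"}, k=1)
print(top[0].text)
```

A production setup would delegate both the filter and the nearest-neighbor ranking to the database index instead of scanning a Python list.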

This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.

Reference architecture

  1. Primary Data Sources
  2. Data Extraction and Loading
  3. Large Language Models
    • text-embedding-gecko@003 for embeddings
    • gemini-1.5-flash-001 for LLM generation and chat
  4. High-Performance Data Layer (Redis)
    • Semantic caching to reduce LLM latency and associated costs
    • Vector search for context retrieval from knowledge base
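
The semantic caching idea in the data layer can be sketched without any external service. The example below is a rough illustration under stated assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model such as text-embedding-gecko@003, `SemanticCache` keeps entries in a Python list rather than in Redis, and the `llm` callable stands in for a Gemini call. All of these names, and the 0.8 threshold, are hypothetical.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Returns a stored answer when a new prompt is similar enough to an old one."""

    def __init__(self, embed, threshold=0.8):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (prompt_vector, answer) pairs

    def lookup(self, prompt):
        query = self.embed(prompt)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def store(self, prompt, answer):
        self.entries.append((self.embed(prompt), answer))


def answer_with_cache(prompt, cache, llm):
    """Check the cache first; call the (expensive) LLM only on a miss."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached
    answer = llm(prompt)
    cache.store(prompt, answer)
    return answer
```

The threshold is the main tuning knob: set it too low and unrelated prompts receive stale answers, set it too high and near-duplicate prompts still pay for a model call.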

RAG + Semantic Caching demo

Open the code tutorial using the Colab notebook to get your hands dirty with Redis and Vertex AI on GCP.

Additional resources