MemRAG Chatbot
A multimodal AI research assistant built on the Google Agent Development Kit (ADK)
Combines RAG · long-term memory · real-time transcription · Wiki Knowledge Graph
Tech Stack
Key Features
| Feature | Description |
|---|---|
| Realtime Chat Streaming | SSE, token by token, for immediate responses |
| Multimodal PDF RAG | Upload PDF → chunk → embed → semantic search with citations |
| Long-term Memory | mem0 integration that remembers personal information across sessions |
| Realtime Transcription | Soniox STT, 60+ languages, transcripts saved automatically → RAG |
| Wiki Knowledge Graph | Synthesized automatically from documents + meetings, visualized with React Flow |
| Redis Caching | ElastiCache caches wiki/graph/session/docs → lower latency |
| Auth | JWT + Google OAuth2, refresh token rotation, CSRF protection |
| IaC | All AWS infrastructure is managed with Terraform |
System Architecture
┌───────────────────────────────────────────────────────────┐
│                      USER (Browser)                       │
│           https://d3qrt08bgfyl3d.cloudfront.net           │
└─────────────────────────────┬─────────────────────────────┘
                              │ HTTPS
               ┌──────────────▼──────────────┐
               │         CloudFront          │
               │  /api/*  →  ALB (proxy)     │
               │  /*      →  S3 Frontend     │
               └──────┬───────────────┬──────┘
                      │               │
          ┌───────────▼──┐  ┌─────────▼───────────────────┐
          │  S3 Bucket   │  │  Application Load Balancer  │
          │ (React SPA)  │  │  memrag-backend-alb-*.elb   │
          └──────────────┘  └──────────────┬──────────────┘
                                           │
  ┌────────────────────────────────────────▼──────────────┐
  │               ECS EC2 - FastAPI Backend               │
  │                       Port 8000                       │
  │                                                       │
  │  ┌──────────┐   ┌──────────┐   ┌──────────────────┐   │
  │  │ Chat API │   │ Docs API │   │ Wiki / Auth API  │   │
  │  │  (SSE)   │   │ (upload) │   │ Transcription WS │   │
  │  └────┬─────┘   └────┬─────┘   └────────┬─────────┘   │
  │       │              │                  │             │
  │  ┌────▼──────────────▼──────────────────▼──────────┐  │
  │  │       Google ADK Agent (Gemini 2.5-flash)       │  │
  │  │          9 Tools + ContextFilterPlugin          │  │
  │  └──┬──────────┬──────┬─────┬────────┬────────┬────┘  │
  │     │          │      │     │        │        │       │
  └─────┼──────────┼──────┼─────┼────────┼────────┼───────┘
        │          │      │     │        │        │
┌───────▼─────┐ ┌──▼───┐  │  ┌──▼───┐    │   ┌────▼─────┐
│ ElastiCache │ │ mem0 │  │  │ Wiki │    │   │ DynamoDB │
│ Redis Cache │ │      │  │  │ S3   │    │   │ Sessions │
└─────────────┘ └──────┘  │  └──────┘    │   └──────────┘
                     ┌────▼───┐  ┌───────▼──────┐
                     │ Qdrant │  │ RDS Postgres │
                     │ (RAG)  │  │ (Auth)       │
                     └────────┘  └──────────────┘
Data Flow
Chat Flow
User sends a message
  └─► POST /api/v1/chat/stream (SSE)
        ├─► ChatService._ensure_session()  [DynamoDB]
        ├─► ContextFilterPlugin (summarizes when > 22 msgs)
        ├─► ADK Runner → Gemini 2.5-flash
        │     ├─► read_wiki_index / read_wiki_page
        │     ├─► search_documents (RAG → Qdrant)
        │     └─► retrieve_memories (mem0)
        └─► SSE stream tokens ──► Browser
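The last step of the chat flow streams tokens to the browser over SSE. As a minimal sketch of what that framing could look like, the helper below wraps each token in an SSE `data:` frame. The JSON payload shape (`{"token": ...}`) and the closing `done` event are illustrative assumptions, not the project's actual schema.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each token in a Server-Sent Events frame.

    The payload shape and the final "done" event are hypothetical.
    """
    for token in tokens:
        # One "data:" line per event, terminated by a blank line.
        yield f"data: {json.dumps({'token': token})}\n\n"
    # Named event telling the client the stream has finished (assumption).
    yield "event: done\ndata: {}\n\n"


frames = list(sse_events(["Hel", "lo"]))
```

In a FastAPI app, a generator like this would typically be passed to `StreamingResponse` with `media_type="text/event-stream"`.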
Upload PDF Flow
Upload PDF
  └─► POST /api/v1/documents/upload
        ├─► extract text → StorageBackend.save()  [S3]
        ├─► RAGService: chunk → embed → Qdrant upsert
        ├─► Cache.delete("docs:{user_id}:list")
        └─► [background] WikiService 4-phase pipeline
              ├─► Phase 1 MAP:    extract entities/topics (parallel)
              ├─► Phase 2 REDUCE: merge + deduplicate by slug
              ├─► Phase 3 SYNTH:  LLM synthesize per page → S3
              └─► Phase 4 FINAL:  rebuild index + link_index
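Phase 2 (REDUCE) merges the entities extracted in Phase 1 and deduplicates them by slug. The sketch below shows one plausible way to do that; the `slugify` rule, the `title`/`facts` record shape, and the function names are assumptions made for illustration, not the actual WikiService code.

```python
import re


def slugify(title: str) -> str:
    """Lowercase, collapse runs of non-alphanumerics into '-', trim dashes."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def reduce_entities(extracted: list[dict]) -> dict[str, dict]:
    """Merge entities that share a slug, keeping unique facts in first-seen order."""
    merged: dict[str, dict] = {}
    for item in extracted:
        slug = slugify(item["title"])
        # The first title seen for a slug becomes the page title.
        page = merged.setdefault(slug, {"title": item["title"], "facts": []})
        for fact in item["facts"]:
            if fact not in page["facts"]:
                page["facts"].append(fact)
    return merged


# Two chunks mention the same topic with different spellings.
chunks = [
    {"title": "Vector Search", "facts": ["Uses Qdrant"]},
    {"title": "vector-search", "facts": ["Uses Qdrant", "Returns citations"]},
]
pages = reduce_entities(chunks)
```

Keying on the slug rather than the raw title means spelling variants from different chunks end up on the same wiki page before the synthesis phase runs.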
Transcription Flow
Start meeting  →  POST /transcription/start  [DynamoDB]
      │
Audio stream   →  WS /transcription/audio/{meeting_id}
      └─► SonioxService → Soniox STT → utterances → DynamoDB
      │
Stop meeting   →  POST /transcription/stop
      ├─► TranscriptRAGService.ingest()  [Qdrant]
      └─► [background] WikiService.update_wiki_from_transcript()
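Before a transcript is ingested for retrieval, the stored utterances have to be turned into text. The sketch below assumes a simple `Utterance` record with `speaker` and `text` fields (hypothetical; the real Soniox payload and DynamoDB schema may differ) and joins consecutive utterances from the same speaker into one line.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # hypothetical field name
    text: str     # hypothetical field name


def merge_utterances(utterances: list[Utterance]) -> list[str]:
    """Join consecutive utterances from the same speaker into one line."""
    turns: list[tuple[str, list[str]]] = []
    for u in utterances:
        if turns and turns[-1][0] == u.speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1][1].append(u.text)
        else:
            turns.append((u.speaker, [u.text]))
    return [f"{speaker}: {' '.join(parts)}" for speaker, parts in turns]


transcript = merge_utterances(
    [Utterance("A", "Hi"), Utterance("A", "all"), Utterance("B", "Hello")]
)
```

Lines produced this way could then be chunked and embedded in the same manner as PDF text.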
Cache Strategy
Request reaches the API
  │
  ├─► Cache.get(key) ──► HIT ──► Return cached data (< 1ms)
  │
  └─► MISS ──► Compute / DB query
                 ├─► Cache.set(key, data, ttl)
                 └─► Return data
TTL: wiki_page=10min | wiki_graph=2min | sessions=1min | docs=1min | user=5min
Quickstart: Local Dev
Requirements
- Docker & Docker Compose
- Node.js 20+
- Python 3.11+ with uv
- Google Gemini API Key
1. Clone & configure
git clone https://github.com/minh2004pd/chatbotfullpipeline.git
cd chatbotfullpipeline
# Create the .env file from the example
cp .env.example .env
# → Fill in GEMINI_API_KEY and the other required variables
2. Start the services
# Start everything: Redis, Qdrant, DynamoDB Local, Postgres, Backend
docker compose up -d
# Check the backend logs
docker logs -f memrag-backend
3. Run the frontend
cd frontend
npm install
npm run dev
# → http://localhost:5173
4. API Docs
| URL | Description |
|---|---|
| http://localhost:8000/docs | Swagger UI |
| http://localhost:6333/dashboard | Qdrant Dashboard |
| http://localhost:8001 | DynamoDB Local |
Testing
# Run the full test suite
cd backend && uv run pytest
# With a coverage report
uv run pytest --cov=app --cov-report=term-missing
# Run a specific test
uv run pytest tests/services/test_wiki_service.py -v
Test coverage:
| Module | Test file |
|--------|-----------|
| Wiki pipeline | tests/services/test_wiki_service.py |
| Auth + JWT | tests/core/test_auth.py |
| Session CRUD | tests/services/test_dynamo_session_service.py |
| Cache Service | tests/core/test_cache_service.py |
| RAG Service | tests/services/test_rag_service.py |
CI/CD Pipeline
git push origin main
│
├── backend/** changed?
│     │
│   ┌─▼────────────────┐   ┌───────────────────┐
│   │ JOB: lint        │   │ JOB: test         │  ← run in parallel
│   │ ruff format      │   │ pytest + coverage │
│   │ ruff check       │   │ upload artifact   │
│   └────────┬─────────┘   └─────────┬─────────┘
│            └───────────┬───────────┘
│                        │ both pass
│              ┌─────────▼──────────┐
│              │ JOB: build-push    │  ← push main only
│              │ docker build       │
│              │ tag: <sha>+latest  │
│              │ push → ECR         │
│              └─────────┬──────────┘
│                        │
│              ┌─────────▼──────────┐
│              │ JOB: deploy        │  ← environment: production
│              │ ECS rolling update │
│              │ wait stable        │
│              └────────────────────┘
│
└── frontend/** changed?
      │
    ┌─▼────────────────────────┐
    │ JOB: validate            │
    │ tsc --noEmit + eslint    │
    └────────────┬─────────────┘
                 │
    ┌────────────▼─────────────┐
    │ JOB: deploy              │
    │ npm build                │
    │ s3 sync (immutable)      │
    │ CloudFront invalidate    │
    └──────────────────────────┘
Infrastructure (AWS)
AWS ap-southeast-2
├── VPC (10.0.0.0/16)
│   ├── Public Subnets  → ALB, NAT Gateway
│   └── Private Subnets → ECS EC2, RDS, ElastiCache
│
├── ECS Cluster (EC2 launch type)
│   └── Backend Task (FastAPI :8000)
│
├── Qdrant (ECS Fargate + EFS volume)
│
├── ALB → ECS Backend (port 8000)
├── CloudFront → S3 (frontend) + ALB (/api/*)
│
├── RDS PostgreSQL (db.t3.micro) → Auth
├── DynamoDB → Sessions + Meetings
├── ElastiCache Redis (cache.t3.micro) → Caching
├── S3 → Uploads + Wiki pages
└── ECR → Docker images
Deployment:
cd infrastructure
terraform init
terraform plan # review the changes first
terraform apply # create the infrastructure (~10-15 minutes)
terraform output # show the endpoints
Key outputs:
| Output | Value |
|--------|---------|
| cloudfront_url | Public frontend URL |
| alb_dns_name | Backend ALB endpoint |
| redis_primary_endpoint | ElastiCache Redis |
| rds_endpoint | PostgreSQL host |
Project Structure
.
├── backend/                    # FastAPI + Google ADK
│   ├── app/
│   │   ├── agents/             # ADK Agent + 9 Tools
│   │   ├── api/v1/             # REST endpoints
│   │   ├── core/               # Config, Cache, DB, DI
│   │   ├── repositories/       # Data access layer
│   │   ├── services/           # Business logic
│   │   └── schemas/            # Pydantic models
│   ├── tests/                  # pytest (~40 test files)
│   ├── Dockerfile
│   └── pyproject.toml
│
├── frontend/                   # React 18 + Vite + TypeScript
│   └── src/
│       ├── api/                # Axios clients
│       ├── components/         # UI components
│       ├── hooks/              # React Query hooks
│       ├── store/              # Zustand state
│       └── types/              # TypeScript types
│
├── infrastructure/             # Terraform (AWS)
│   ├── ecs.tf                  # ECS cluster + task def
│   ├── rds.tf                  # PostgreSQL
│   ├── elasticache.tf          # Redis
│   ├── dynamodb.tf             # Sessions + Meetings
│   ├── s3_frontend.tf          # Frontend hosting
│   ├── alb.tf                  # Load balancer
│   └── variables.tf            # Input variables
│
├── docs/                       # Documentation
│   ├── codebase.md             # Detailed architecture
│   ├── spec.md                 # Product specification
│   ├── wiki.md                 # Wiki system design
│   └── cicd-flow.md            # CI/CD flow
│
├── .github/workflows/
│   ├── ci-cd.yml               # Backend CI/CD
│   └── deploy-frontend.yml     # Frontend deploy
│
└── docker-compose.yml          # Local dev stack
Development Rules
To keep quality high and deployments smooth:
- Run the full test suite before completing any task:
  cd backend && uv run pytest
- Add test cases for every new feature (unit + integration).
- Update the docs (docs/, CLAUDE.md, README.md) whenever the architecture changes.
- Format code before committing:
  cd backend && uv run ruff format .
Environment Variables
| Variable | Required | Description |
|---|---|---|
| GEMINI_API_KEY | Yes | Google Gemini API key |
| JWT_SECRET_KEY | Yes (prod) | JWT signing secret (32+ chars) |
| STORAGE_BACKEND | No | local or s3 (default: local) |
| S3_BUCKET | Yes (if S3) | S3 bucket name |
| S3_ACCESS_KEY_ID | Yes (if S3) | AWS Access Key |
| S3_SECRET_ACCESS_KEY | Yes (if S3) | AWS Secret Key |
| QDRANT_URL |  | Qdrant server URL |
| REDIS_URL |  | Redis connection URL |
| REDIS_ENABLED | No | Enable/disable the cache (default: true) |
| DATABASE_URL |  | PostgreSQL connection string |
| SONIOX_API_KEY |  | Soniox STT API key |
| WIKI_ENABLED | No | Enable/disable the wiki pipeline (default: true) |
| DEBUG |  | Dev mode: skips JWT (true/false) |
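As a rough sketch of how these variables might be validated at startup, the loader below applies the defaults listed in the table and checks the S3 settings only when the S3 backend is selected. The `Settings` class and `load_settings` function are hypothetical and cover only a subset of the variables; the real project may use a different configuration mechanism.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    storage_backend: str
    redis_enabled: bool
    wiki_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing ones."""
    if not env.get("GEMINI_API_KEY"):
        raise ValueError("GEMINI_API_KEY is required")
    storage = env.get("STORAGE_BACKEND", "local")
    if storage not in {"local", "s3"}:
        raise ValueError("STORAGE_BACKEND must be 'local' or 's3'")
    if storage == "s3":
        s3_vars = ("S3_BUCKET", "S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY")
        missing = [name for name in s3_vars if not env.get(name)]
        if missing:
            raise ValueError(f"Missing S3 settings: {', '.join(missing)}")
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        storage_backend=storage,
        redis_enabled=_as_bool(env.get("REDIS_ENABLED", "true")),
        wiki_enabled=_as_bool(env.get("WIKI_ENABLED", "true")),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in tests, for example `load_settings({"GEMINI_API_KEY": "test-key"})`.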
License
MIT © 2026 minh2004pd