GenAI ChatBot with Amazon Bedrock Agent
Introduction
This GenAI ChatBot application is built with Amazon Bedrock, including KnowledgeBase, Agent, and additional AWS serverless GenAI solutions. The solution showcases a Chatbot that understands EC2 instances and their pricing. It demonstrates Amazon Bedrock's capabilities to convert natural language into Amazon Athena queries and to process complex data sets.
LlamaIndex is used for data processing and text-to-SQL retrieval. The solution integrates:
- Amazon S3 for storage
- Amazon Bedrock KnowledgeBase for retrieval augmented generation (RAG)
- Amazon Bedrock Agent to execute multi-step tasks across data sources
- AWS Glue to prepare data
- Amazon Athena to execute efficient queries
- AWS Lambda for serverless compute
- Amazon ECS with Fargate for the Streamlit frontend
- Amazon OpenSearch Serverless as the vector database for the KnowledgeBase
Prerequisites
- Docker
- AWS CDK Toolkit 2.240.0+, installed and configured. For more information, see Getting started with the AWS CDK.
- Python 3.13+, installed and configured. For more information, see Python Downloads.
- An active AWS account
- An AWS account bootstrapped by using AWS CDK in us-east-1 or us-west-2
- Enable the following model access in Amazon Bedrock:
  - Claude Sonnet 4.6 (Agent foundation model and text-to-SQL)
  - Claude Sonnet 4.5 (available as alternative)
  - Claude Haiku 4.5 (available as alternative)
  - Titan Embedding Text v1 (KnowledgeBase embeddings)
Target technology stack
- Amazon Bedrock (Agent, KnowledgeBase)
- Amazon OpenSearch Serverless
- Amazon ECS (Fargate)
- AWS Glue
- AWS Lambda (Python 3.13)
- Amazon S3
- Amazon Athena
- Elastic Load Balancing (Application Load Balancer)
Architecture
The chatbot uses two data paths:
- Qualitative questions (e.g., "What is Amazon EC2?") → Bedrock Agent routes to KnowledgeBase → RAG over EC2 User Guide documentation stored in OpenSearch Serverless
- Quantitative questions (e.g., "What is the cheapest EC2 instance?") → Bedrock Agent routes to Action Group → Lambda converts natural language to SQL via LlamaIndex → Athena queries EC2 pricing data
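For the quantitative path, the action group Lambda has to return its result in the response envelope that Bedrock Agents expect for OpenAPI-based action groups. The following is a minimal sketch of that envelope, not the project's actual handler: `build_action_response` is a hypothetical helper name, and the placeholder answer stands in for the LlamaIndex text-to-SQL step.

```python
import json


def build_action_response(event: dict, body: dict, status_code: int = 200) -> dict:
    """Wrap a result in the envelope a Bedrock Agent expects from an
    action group Lambda that is defined with an OpenAPI schema."""
    return {
        "messageVersion": "1.0",
        "response": {
            # These three fields are echoed back from the incoming event.
            "actionGroup": event["actionGroup"],
            "apiPath": event["apiPath"],
            "httpMethod": event["httpMethod"],
            "httpStatusCode": status_code,
            "responseBody": {
                "application/json": {"body": json.dumps(body)}
            },
        },
    }


def lambda_handler(event, context):
    # Assumption: the real Lambda would run the text-to-SQL engine here
    # and put the Athena result into the body instead of a placeholder.
    question = event.get("inputText", "")
    answer = {"question": question, "answer": "placeholder result"}
    return build_action_response(event, answer)
```

The agent reads the JSON string under `responseBody` and uses it to compose its final reply to the user.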
The Streamlit frontend runs on ECS Fargate behind an Application Load Balancer and invokes the Bedrock Agent through a Lambda function.
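As a rough illustration of what invoking the agent involves, the sketch below calls the `bedrock-agent-runtime` `invoke_agent` API with boto3 and joins the streamed chunks into one string. It is an assumption about the general shape, not a copy of the project's invoke Lambda; `ask_agent` and `collect_completion` are hypothetical names, and the agent ID, alias ID, and Region must be supplied by the caller.

```python
import uuid


def collect_completion(event_stream) -> str:
    """Join the text chunks of an invoke_agent response stream.

    Events without a "chunk" key (for example trace events) are skipped.
    """
    parts = []
    for event in event_stream:
        chunk = event.get("chunk")
        if chunk and "bytes" in chunk:
            parts.append(chunk["bytes"])
    return b"".join(parts).decode("utf-8")


def ask_agent(question, agent_id, agent_alias_id, region, session_id=None):
    """Send one question to a Bedrock Agent and return the full answer."""
    import boto3  # imported lazily so collect_completion works without boto3

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        # Reusing a session ID keeps conversation context across turns.
        sessionId=session_id or str(uuid.uuid4()),
        inputText=question,
    )
    return collect_completion(response["completion"])
```

Passing the same `session_id` on later calls lets the agent keep track of the conversation; omitting it starts a fresh session each time.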
Deployment
Local development setup
Add a .env file to the code/streamlit-app/ folder:
ACCOUNT_ID = <Your account ID>
AWS_REGION = <Your region>
LAMBDA_FUNCTION_NAME = invokeAgentLambda
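How the Streamlit app reads this file is not shown here. As a sketch under that caveat, the following standard-library parser handles the `KEY = value` layout used above, including the spaces around `=`. The function names `parse_env` and `load_env` are hypothetical.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY = value lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path: str = "code/streamlit-app/.env") -> dict:
    """Read and parse the .env file at the given path."""
    return parse_env(Path(path).read_text(encoding="utf-8"))
```

Calling `load_env()` from the repository root would return a dictionary with the `ACCOUNT_ID`, `AWS_REGION`, and `LAMBDA_FUNCTION_NAME` keys.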
Deploy to AWS
Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Bootstrap (first time only):
cdk bootstrap
Synthesize and deploy:
cdk synth
cdk deploy
The first deployment takes approximately 30-45 minutes to build Docker images. Subsequent deployments take about 5-8 minutes.
Once deployed, access the chatbot via the Streamlit app URL in the CloudFormation stack outputs.
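The stack outputs can also be read programmatically. The sketch below uses the CloudFormation `describe_stacks` API through boto3 and turns the outputs into a dictionary. The stack name and the output key that holds the Streamlit URL are not given in this document, so both must be looked up in your own deployment; `get_stack_outputs` and `outputs_to_dict` are hypothetical helper names.

```python
def outputs_to_dict(stack: dict) -> dict:
    """Map OutputKey -> OutputValue for one stack description."""
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def get_stack_outputs(stack_name: str, region: str) -> dict:
    """Fetch the outputs of a deployed CloudFormation stack."""
    import boto3  # imported lazily so outputs_to_dict works without boto3

    cfn = boto3.client("cloudformation", region_name=region)
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return outputs_to_dict(stack)
```

Printing the returned dictionary shows every output of the stack, including the app URL.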
Cleanup
cdk destroy
You may also need to manually delete the S3 buckets generated by the CDK.
Useful CDK commands
- cdk ls: list all stacks in the app
- cdk synth: emit the synthesized CloudFormation template
- cdk deploy: deploy this stack to your default AWS account/region
- cdk diff: compare deployed stack with current state
- cdk docs: open CDK documentation
- cdk destroy: destroy one or more specified stacks
High-level Code Structure
.
├── app.py                          # CDK app entry point
├── cdk.json                        # CDK configuration and context
├── requirements.txt                # CDK Python dependencies
├── assets/
│   ├── agent_api_schema/           # Bedrock Agent OpenAPI schema
│   ├── agent_prompts/              # Agent prompt templates
│   ├── data_query_data_source/     # EC2 pricing CSV data for Athena
│   ├── knowledgebase_data_source/  # EC2 User Guide docs for RAG
│   └── diagrams/                   # Architecture diagrams
└── code/
    ├── code_stack.py               # CDK stack definition (all AWS resources)
    ├── lambdas/
    │   ├── action-lambda/          # Text-to-SQL via LlamaIndex (Docker, Python 3.13)
    │   ├── create-index-lambda/    # Creates OpenSearch Serverless index
    │   ├── invoke-lambda/          # Invokes Bedrock Agent (called by Streamlit)
    │   └── update-lambda/          # Post-deployment: Glue crawler, KB sync, agent alias
    ├── layers/
    │   ├── boto3_layer/            # Shared boto3 layer (v1.42.56)
    │   └── opensearch_layer/       # OpenSearch client layer (v3.1.0+)
    └── streamlit-app/              # Streamlit frontend (Docker, Python 3.13-slim)
Models
The application uses the following Amazon Bedrock models:
| Component | Model | Model ID / Inference Profile |
|---|---|---|
| Bedrock Agent | Claude Sonnet 4.6 | us.anthropic.claude-sonnet-4-6 |
| Text-to-SQL (Action Lambda) | Claude Sonnet 4.6 | us.anthropic.claude-sonnet-4-6 |
| KnowledgeBase Embeddings | Titan Embedding Text v1 | amazon.titan-embed-text-v1 |
Claude Sonnet 4.5 and 4.6 are available through cross-region inference only, so they require inference profile IDs (prefixed with us.) rather than raw model IDs.
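A small check can make the difference between the two ID styles explicit. The sketch below copies the IDs from the table above and tests for the `us.` prefix; the dictionary keys and the function name are hypothetical, and the check only covers the `us.` prefix mentioned here.

```python
# IDs copied from the model table; the dictionary keys are illustrative only.
MODEL_IDS = {
    "agent": "us.anthropic.claude-sonnet-4-6",
    "text_to_sql": "us.anthropic.claude-sonnet-4-6",
    "embeddings": "amazon.titan-embed-text-v1",
}


def uses_us_inference_profile(model_id: str) -> bool:
    """Return True if the ID carries the us. inference profile prefix."""
    return model_id.startswith("us.")


for component, model_id in MODEL_IDS.items():
    kind = "inference profile" if uses_us_inference_profile(model_id) else "model ID"
    print(f"{component}: {model_id} ({kind})")
```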
Available model alternatives in code/lambdas/action-lambda/connections.py:
- Sonnet46: Claude Sonnet 4.6 (default)
- Sonnet45: Claude Sonnet 4.5
- Haiku45: Claude Haiku 4.5
Customize the chatbot with your own data
For KnowledgeBase Data Integration
1. Data Preparation
- Place your dataset in the assets/knowledgebase_data_source/ directory.
2. Configuration Adjustments
- Update cdk.json → context/config/paths/knowledgebase_file_name with your filename.
- Update cdk.json → context/config/bedrock_instructions/knowledgebase_instruction to reflect your dataset.
For Structured Data Integration
1. Data Organization
- Create a subdirectory in assets/data_query_data_source/ (e.g., tabular_data).
- Place your structured dataset (CSV, JSON, ORC, or Parquet) in the subdirectory.
- To connect to an existing database, update create_sql_engine() in code/lambdas/action-lambda/build_query_engine.py.
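The actual signature and body of create_sql_engine() are not shown in this document. As one possible shape, assuming SQLAlchemy with the PyAthena dialect, an Athena engine can be built from a connection URL as sketched below; the parameter names and the `athena_connection_url` helper are hypothetical. For a different database you would return an engine built from that database's SQLAlchemy URL instead.

```python
from urllib.parse import quote_plus


def athena_connection_url(region: str, schema: str, s3_staging_dir: str) -> str:
    """Build a PyAthena SQLAlchemy URL that relies on the default AWS
    credential chain (the user and password parts are left empty)."""
    return (
        f"awsathena+rest://:@athena.{region}.amazonaws.com:443/"
        f"{schema}?s3_staging_dir={quote_plus(s3_staging_dir)}"
    )


def create_sql_engine(region: str, schema: str, s3_staging_dir: str):
    """Hypothetical replacement body: return a SQLAlchemy engine for Athena."""
    # Requires the sqlalchemy and PyAthena packages to be installed.
    from sqlalchemy import create_engine

    return create_engine(athena_connection_url(region, schema, s3_staging_dir))
```

The S3 staging directory is where Athena writes query results, so it must be a location the Lambda's role can write to.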
2. Configuration and Code Updates
- Update cdk.json → context/config/paths/athena_table_data_prefix to match your data path.
- Update code/lambdas/action-lambda/dynamic_examples.csv with text-to-SQL examples for your dataset.
- Update code/lambdas/action-lambda/prompt_templates.py to reflect your table schema.
- Update cdk.json → context/config/bedrock_instructions/action_group_description for your action lambda.
- Update assets/agent_api_schema/artifacts_schema.json to reflect your action lambda's API.
General Update
- Update cdk.json → context/config/bedrock_instructions/agent_instruction with a description of the agent's purpose for your use case.