AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction

About AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction

AI bank statement document automation with an LLM, plus personal financial analysis.

Platforms

Web Self-hosted

Languages

Jupyter Notebook

🏦 AI Bank Statement Document Automation with LLM & Personal Financial Analysis

Automated extraction, structuring, RAG-powered querying, and AI-agent financial analysis of bank statement PDFs.

This project converts unstructured bank statement PDFs into structured data using computer vision (YOLO), OCR, and Large Language Models. It supports natural language queries and generates insightful monthly/yearly financial reports.
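To make the "unstructured to structured" step concrete, here is a minimal sketch of turning OCR text into typed transaction rows. It deliberately swaps the project's YOLO + OCR + LLM extraction for a plain regular expression so that it runs with the standard library alone. The line layout, the `Transaction` type, and the function names are assumptions for illustration, not the repository's actual code.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import List, Optional

# Assumed line layout: "DD/MM/YYYY  description  signed amount".
# Real statements differ per bank; this pattern is hypothetical.
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<desc>.+?)\s+(?P<amount>-?[\d,]+\.\d{2})$"
)


@dataclass(frozen=True)
class Transaction:
    booked_on: date
    description: str
    amount: Decimal  # negative = money out, positive = money in


def parse_line(line: str) -> Optional[Transaction]:
    """Return a Transaction for a matching line, or None for headers and noise."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    return Transaction(
        booked_on=datetime.strptime(match["date"], "%d/%m/%Y").date(),
        description=match["desc"].strip(),
        amount=Decimal(match["amount"].replace(",", "")),
    )


def parse_statement(text: str) -> List[Transaction]:
    """Parse every line of OCR text, skipping lines that do not match."""
    parsed = (parse_line(line) for line in text.splitlines())
    return [tx for tx in parsed if tx is not None]


# Made-up sample text standing in for OCR output of one statement page.
sample_text = """\
Date        Description          Amount
03/01/2025  SALARY ACME LTD      25,000.00
05/01/2025  MTR OCTOPUS TOPUP    -500.00
"""
transactions = parse_statement(sample_text)
```

A fixed pattern like this breaks as soon as a bank changes its layout, which is presumably why the project relies on layout detection and an LLM for table extraction instead.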


✨ Key Features

  • Advanced Document Parsing: Custom YOLOv8 layout detection + OCR + LLM table extraction
  • RAG Pipeline: Retrieval-augmented generation over vector stores (Chroma, Faiss)
  • Autonomous AI Agents: Built with AG2 (migrated from pyautogen in Feb 2026)
  • Financial Intelligence: Income/expense categorization, trend analysis, monthly & yearly summaries
  • Multimodal & Local LLM Support: Works with Gemini and with local models through Ollama (Llama 3, Gemma 2, etc.)
  • User Interface: Streamlit web application (apps.py)
  • Evaluation Framework: DeepEval integration for RAG quality testing
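As a rough illustration of the income/expense categorization and monthly summaries listed above, the sketch below labels transactions by keyword and totals them per month. The keyword table, function names, and sample rows are all hypothetical. The project lists pandas for analysis and may categorize with an LLM; this version uses only the standard library to stay self-contained.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

# Hypothetical keyword rules, for illustration only.
CATEGORY_KEYWORDS = {
    "salary": "income",
    "supermarket": "groceries",
    "mtr": "transport",
}


def categorize(description: str) -> str:
    """Return the category of the first keyword found, else 'uncategorized'."""
    lowered = description.lower()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in lowered:
            return category
    return "uncategorized"


def monthly_summary(rows):
    """Total income and expenses per 'YYYY-MM' month.

    rows: iterable of (date, description, Decimal amount), where a negative
    amount is money out and a non-negative amount is money in.
    """
    summary = defaultdict(lambda: {"income": Decimal("0"), "expense": Decimal("0")})
    for booked_on, _description, amount in rows:
        month = booked_on.strftime("%Y-%m")
        if amount >= 0:
            summary[month]["income"] += amount
        else:
            summary[month]["expense"] += -amount
    return dict(summary)


# Made-up sample data.
rows = [
    (date(2025, 1, 3), "SALARY ACME LTD", Decimal("25000.00")),
    (date(2025, 1, 5), "MTR OCTOPUS TOPUP", Decimal("-500.00")),
    (date(2025, 2, 1), "WELLCOME SUPERMARKET", Decimal("-320.50")),
]
summary = monthly_summary(rows)
categories = [categorize(description) for _, description, _ in rows]
```

Amounts are kept as `Decimal` rather than `float` so that totals do not pick up binary rounding errors.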

🛠 Technology Stack

  • Document Processing: YOLOv8 (custom layout model), PyMuPDF, pytesseract, pymupdf4llm
  • RAG & Vector Store: LangChain, Chroma, Faiss
  • Agent Framework: AG2 (latest)
  • LLMs: Google Gemini, Local models via Ollama
  • Frontend: Streamlit
  • Analysis: pandas, Plotly

Related Repo: YOLO Base Document Layout Detection


πŸ“ Repository Structure

src/
β”œβ”€β”€ dev/                    # Jupyter notebooks for development & testing
β”‚   β”œβ”€β”€ ai_bank_statement_dev.ipynb
β”‚   β”œβ”€β”€ ai_agent_dev.ipynb
β”‚   └── RAG_algorithm_test.ipynb
β”œβ”€β”€ apps.py                 # Streamlit web application
β”œβ”€β”€ bank-statement-document/ # Core processing scripts
β”œβ”€β”€ yolo-base-layout-analysis/
β”œβ”€β”€ faiss_index/ & chroma_db/
β”œβ”€β”€ test-document/          # Sample PDFs for testing
β”œβ”€β”€ *.sh                    # Installation & setup scripts
β”œβ”€β”€ requirements.txt
└── .env.example

🚀 Quick Start

1. Clone & Setup

git clone https://github.com/johnsonhk88/AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction.git
cd AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction

# Setup virtual environment and install dependencies
./src/build-python-virual-environment.sh
./src/activate_virual_environment.sh
./src/install-requirement.sh

# Install Tesseract OCR (Ubuntu/Debian)
./src/install-pytesseract-for-linux.sh

Create a .env file (src/.env.example can serve as a template) and add your GOOGLE_API_KEY, which is needed for Gemini.
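If you want to check that the key is picked up before launching anything, a small standalone reader like the one below can help. It is a sketch under assumptions: the helper names are hypothetical, the `KEY=VALUE` format is the usual .env convention, and the README does not say how the application itself loads the file (python-dotenv is a common choice).

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def load_env_file(path) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require_google_api_key(path=".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY", "")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is missing from the environment and from .env")
    return key


# Demo against a temporary file with a placeholder value (not a real key).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_env = Path(tmp_dir) / ".env"
    demo_env.write_text(
        "# Gemini credentials\nGOOGLE_API_KEY=your-key-here\n", encoding="utf-8"
    )
    loaded = load_env_file(demo_env)
```

Keep the real .env file out of version control so that the key is never committed.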

2. Run the Application

Development Notebooks

cd src/dev
jupyter notebook

Streamlit Web UI

cd src
streamlit run apps.py

📈 Recent Major Updates

  • Feb 24, 2026: Full migration from pyautogen → AG2 agent framework
  • 2025: Added advanced RAG pipeline, multimodal support, and DeepEval evaluation
  • Ongoing: Improving financial categorization and local LLM inference

🗺 Roadmap

  • Complete production-ready end-to-end pipeline
  • Advanced time-series forecasting for cash flow prediction
  • Multi-bank statement support with automatic categorization
  • Docker + API deployment
  • Rich interactive dashboard with more visualizations

📄 License

This project is licensed under the Apache License 2.0.


Made with ❤️ for personal finance automation in Hong Kong.

⭐ Star this repo if you find it useful!