SEO Keyword Clusters (Legacy Project)
Warning: This repository is archived and represents an early prototype of what eventually became SEOCluster.ai. It is preserved for educational and historical purposes only.
Try the New Production Platform: SEOCluster.ai
A modern AI-powered SaaS for keyword clustering, content briefs, and automated landing pages.
Highlights of the new platform:
- FastAPI backend + Next.js frontend
- SentenceTransformer embeddings (no more classic KMeans)
- Local-intent detection across 325k+ global locations
- AI-generated content briefs
- Landing page generator
- Firebase authentication
- Stripe billing (Free, Pro, Agency tiers)
- Google Search Console OAuth integration
- Cloud Run multi-worker deployment
- Smart caching & optimized UX
This legacy repo does not include these features; it represents the origins of the project.
About This Legacy Version
This codebase was originally created between 2021 and 2022 for a UWA Data Science Capstone project.
It uses:
- Python + Flask
- Pandas & scikit-learn
- Traditional KMeans clustering
- TF-IDF cluster labeling
- Google Data Studio + Tableau for visualization
- CSV export from Google Search Console
- Basic SQLite storage
This version is not production-ready.
Repository Structure
.
├── app.py                      # Legacy Flask app
├── Keyword_Clustering.ipynb    # Main ML notebook
├── Queries.csv                 # Sample GSC query data
├── static/                     # Static assets
├── templates/                  # Jinja2 templates
├── keyword_clustering.sqlite   # Example database
└── README.md
License & Usage
This legacy version is open for learning and academic use only.
Allowed:
- Personal study
- Academic use
- ML experimentation
Not Allowed:
- Commercial use
- Using this code in SaaS products
- Replicating SEOCluster.ai features
- Redistributing modified versions for business use
For production use, see https://seocluster.ai
Project History
This repository represents the earliest foundation of SEOCluster.ai.
- 2021: Built as a Data Science ML project
- 2022: First UI deployed to Heroku
- 2023-2024: Rewritten using modern full-stack architecture
- 2024-2025: Became SEOCluster.ai, a full SaaS platform
The repository remains public because:
- It already has stars and forks
- It helps others learn ML-based clustering
- It documents the evolution of the project
Support the Journey
If you're interested in how this evolved into a real SaaS business, consider:
- Starring this repo
- Trying SEOCluster.ai: https://seocluster.ai
- Connecting for collaboration
Useful Links
- Current SaaS: https://seocluster.ai