Sales Data Analysis
An end-to-end sales analytics project in Python, built on synthetic data: generate realistic sales transactions, clean and validate them, compute KPIs, and visualize revenue trends by day, month, and category. Includes reproducible scripts and charts for portfolio demonstration.
Features
- Generate synthetic daily orders
- Clean and validate data (deduplicate, impute, recompute revenue)
- Compute KPIs: daily/monthly revenue, average basket, growth rate
- Visualize revenue trends with Matplotlib
- Save outputs to the outputs/ directory
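The cleaning step listed above (deduplicate, impute, recompute revenue) can be sketched as follows. This is a minimal illustration using only the standard library, not the code in src/utils.py. The function name clean_orders, the choice to deduplicate on order_id, and the median imputation of missing quantities are all assumptions made for the example.

```python
from statistics import median


def clean_orders(rows):
    """Deduplicate by order_id, impute missing quantity, recompute revenue."""
    seen = set()
    unique = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # keep only the first occurrence of each order
        seen.add(row["order_id"])
        unique.append(dict(row))  # copy so the caller's data is not mutated

    # Assumption: missing quantities are filled with the median known quantity.
    known = [r["quantity"] for r in unique if r["quantity"] is not None]
    fill = int(median(known)) if known else 1

    for r in unique:
        if r["quantity"] is None:
            r["quantity"] = fill
        # Recompute revenue instead of trusting the stored value.
        r["revenue"] = round(r["price"] * r["quantity"], 2)
    return unique


orders = [
    {"order_id": 1, "price": 10.0, "quantity": 2, "revenue": 20.0},
    {"order_id": 1, "price": 10.0, "quantity": 2, "revenue": 20.0},    # duplicate
    {"order_id": 2, "price": 5.0, "quantity": None, "revenue": None},  # missing
    {"order_id": 3, "price": 4.0, "quantity": 4, "revenue": 99.0},     # stale revenue
]
cleaned = clean_orders(orders)
```

Recomputing revenue from price and quantity, rather than repairing it only when it is missing, keeps the column consistent with its definition in the data schema.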
Project Structure
sales-data-analysis/
├─ README.md
├─ requirements.txt
├─ data/
│ └─ generate_sales.py
├─ src/
│ ├─ analyze_sales.py
│ └─ utils.py
└─ outputs/
   └─ figures & reports
Setup
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
pip install -r requirements.txt
Generate Synthetic Data
python data/generate_sales.py --start 2023-01-01 --end 2024-12-31 --seed 42 --out data/sales.csv
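To show how a seeded generator can yield reproducible transactions, here is a rough standard-library sketch. It is not the contents of data/generate_sales.py. The category names, price ranges, customer ID range, and the max_orders_per_day parameter are hypothetical values picked for illustration.

```python
import csv
import random
from datetime import date, timedelta

# Hypothetical categories with (min, max) unit prices, for illustration only.
CATEGORIES = {
    "apparel": (10.0, 120.0),
    "electronics": (50.0, 500.0),
    "grocery": (1.0, 30.0),
}


def generate_orders(start, end, seed=42, max_orders_per_day=5):
    """Return a list of order dicts for every day from start to end inclusive."""
    rng = random.Random(seed)  # local RNG: the same seed gives the same rows
    rows = []
    order_id = 1
    day = start
    while day <= end:
        for _ in range(rng.randint(1, max_orders_per_day)):
            category = rng.choice(sorted(CATEGORIES))
            low, high = CATEGORIES[category]
            price = round(rng.uniform(low, high), 2)
            quantity = rng.randint(1, 5)
            rows.append({
                "date": day.isoformat(),
                "order_id": order_id,
                "customer_id": rng.randint(1, 200),
                "category": category,
                "price": price,
                "quantity": quantity,
                "revenue": round(price * quantity, 2),
            })
            order_id += 1
        day += timedelta(days=1)
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


rows = generate_orders(date(2023, 1, 1), date(2023, 1, 7), seed=42)
```

Calling write_csv(rows, "data/sales.csv") would then produce a file with the columns described in the data schema. Using a dedicated random.Random instance instead of the module-level functions keeps the output independent of any other code that touches the global random state.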
Run Analysis
python src/analyze_sales.py --input data/sales.csv --outdir outputs
The script computes the KPIs and writes the KPI summary and charts to the directory given by --outdir.
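The KPIs named in the feature list (daily and monthly revenue, average basket, growth rate) could be computed along these lines. This is a sketch under stated assumptions, not the logic of src/analyze_sales.py: compute_kpis is a hypothetical name, "average basket" is taken to mean revenue per order, and "growth rate" is taken to mean month-over-month change in revenue.

```python
from collections import defaultdict


def compute_kpis(rows):
    """Aggregate revenue by day and month and derive basket and growth KPIs."""
    daily = defaultdict(float)
    monthly = defaultdict(float)
    for r in rows:
        daily[r["date"]] += r["revenue"]
        monthly[r["date"][:7]] += r["revenue"]  # "YYYY-MM" prefix of an ISO date

    total = sum(r["revenue"] for r in rows)
    # Assumption: one row per order, so average basket = revenue per row.
    avg_basket = total / len(rows) if rows else 0.0

    # Month-over-month growth: (current - previous) / previous.
    months = sorted(monthly)
    growth = {}
    for prev, cur in zip(months, months[1:]):
        if monthly[prev]:  # skip months whose predecessor had zero revenue
            growth[cur] = (monthly[cur] - monthly[prev]) / monthly[prev]

    return {
        "daily": dict(daily),
        "monthly": dict(monthly),
        "avg_basket": avg_basket,
        "growth": growth,
    }


sample = [
    {"date": "2023-01-05", "revenue": 100.0},
    {"date": "2023-01-20", "revenue": 100.0},
    {"date": "2023-02-03", "revenue": 250.0},
]
kpis = compute_kpis(sample)
print(kpis["monthly"])     # {'2023-01': 200.0, '2023-02': 250.0}
print(kpis["avg_basket"])  # 150.0
print(kpis["growth"])      # {'2023-02': 0.25}
```

Sorting the "YYYY-MM" keys as strings gives chronological order, which is why ISO-formatted dates are convenient here.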
Outputs
- outputs/kpis.txt: main KPIs
- outputs/fig_daily_revenue.png
- outputs/fig_monthly_revenue.png
- outputs/fig_category_revenue.png
Sample Results
Daily Revenue
Monthly Revenue
Revenue by Category
Data Schema
| column | description |
|---|---|
| date | order date |
| order_id | unique order identifier |
| customer_id | customer identifier |
| category | product category |
| price | unit price (after discount) |
| quantity | order quantity |
| revenue | price * quantity |
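As an illustration of how the schema above might be enforced when reading the CSV, the sketch below parses each row into a typed record and checks that revenue matches price * quantity. The Order dataclass, the parse_order function, the string types chosen for the identifiers, the ISO date format, and the 0.01 tolerance are assumptions for the example and are not taken from the project code.

```python
import csv
import datetime as dt
import io
from dataclasses import dataclass


@dataclass
class Order:
    date: dt.date
    order_id: str
    customer_id: str
    category: str
    price: float
    quantity: int
    revenue: float


def parse_order(raw, tolerance=0.01):
    """Convert one CSV row (a dict of strings) into a validated Order."""
    order = Order(
        date=dt.date.fromisoformat(raw["date"]),  # assumes YYYY-MM-DD dates
        order_id=raw["order_id"],
        customer_id=raw["customer_id"],
        category=raw["category"],
        price=float(raw["price"]),
        quantity=int(raw["quantity"]),
        revenue=float(raw["revenue"]),
    )
    if order.price < 0 or order.quantity <= 0:
        raise ValueError(f"order {order.order_id}: invalid price or quantity")
    # A small tolerance absorbs rounding in the stored revenue.
    if abs(order.revenue - order.price * order.quantity) > tolerance:
        raise ValueError(f"order {order.order_id}: revenue != price * quantity")
    return order


# Made-up sample row, used only to exercise the parser.
SAMPLE = """date,order_id,customer_id,category,price,quantity,revenue
2023-01-01,1001,C17,grocery,4.50,3,13.50
"""
orders = [parse_order(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
```

Failing loudly on an inconsistent row is one option; a cleaning pipeline could instead recompute revenue and continue.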