Databricks Bootcamp 2026
Welcome to the Databricks Data Lakehouse Project by Data With Baraa.
This repository contains a complete, real-world Data Lakehouse implementation built on Databricks, including datasets, notebooks, SQL examples, and exercises. Everything here is designed to help you understand how modern data teams use Databricks in practice, from data ingestion and transformation to analytics-ready data products.
⚠️ Important Note
Build this project on your own first using the Notion roadmap.
Use this repository only as a reference if you get stuck.
Before starting, watch the Databricks Bootcamp, where I explain the architecture and decisions behind this project.
- Notion Roadmap: Open guide
- ▶️ Databricks Bootcamp: Watch on YouTube
- Finished? Share it on LinkedIn. Let's celebrate.
🏗️ Architecture
This project follows the Medallion Architecture:
🥉 Bronze Layer
- Raw data ingestion
- Schema inference and storage as Delta tables
🥈 Silver Layer
- Data cleaning and standardization
- Type casting and validation
🥇 Gold Layer
- Dimensional Data Model (Business Transformation)
- Ready for BI and analysis
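To make the three layers concrete, here is a minimal, Spark-free sketch of the flow in plain Python. It is not the project's code: in Databricks each step would typically be a PySpark or Spark SQL transformation writing to a Delta table. The sample data, column names, and the functions `to_bronze`, `to_silver`, and `to_gold` are all hypothetical and exist only to illustrate what each layer is responsible for.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export; columns and values are made up for illustration.
RAW_CSV = """order_id,customer,country,amount,order_date
1, Alice ,de,19.90,2024-01-05
2,bob,DE,5.00,2024-01-06
3,Alice,de,not_a_number,2024-01-07
"""


def to_bronze(raw_text):
    """Bronze: ingest rows as-is; every value stays a string."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def to_silver(bronze_rows):
    """Silver: standardize text, cast types, drop rows that fail validation."""
    silver = []
    for row in bronze_rows:
        try:
            silver.append({
                "order_id": int(row["order_id"]),
                "customer": row["customer"].strip().title(),
                "country": row["country"].strip().upper(),
                "amount": float(row["amount"]),
                "order_date": datetime.strptime(
                    row["order_date"].strip(), "%Y-%m-%d"
                ).date(),
            })
        except ValueError:
            # A cast failed, so the row is treated as invalid and skipped.
            continue
    return silver


def to_gold(silver_rows):
    """Gold: split clean rows into a customer dimension and an order fact table."""
    customer_keys = {}
    fact_orders = []
    for row in silver_rows:
        natural_key = (row["customer"], row["country"])
        if natural_key not in customer_keys:
            # Assign a simple incrementing surrogate key.
            customer_keys[natural_key] = len(customer_keys) + 1
        fact_orders.append({
            "order_id": row["order_id"],
            "customer_key": customer_keys[natural_key],
            "amount": row["amount"],
            "order_date": row["order_date"],
        })
    dim_customer = [
        {"customer_key": key, "customer": name, "country": country}
        for (name, country), key in customer_keys.items()
    ]
    return dim_customer, fact_orders


bronze = to_bronze(RAW_CSV)
silver = to_silver(bronze)
dim_customer, fact_orders = to_gold(silver)
```

The sketch keeps the raw values untouched in Bronze so the cleaning rules in Silver can be changed and re-run later without re-ingesting the source. Dropping invalid rows is only one possible validation policy; routing them to a separate table for review is an equally reasonable choice.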
🛠️ Technologies Used
- Databricks
- Apache Spark
- PySpark
- Spark SQL
- Delta Lake
- Unity Catalog
Prerequisites
- Basic SQL and Python, plus some PySpark knowledge
- No prior Databricks experience required
Stay Connected
Connect With Me
Courses (Structured & Certified)
- SQL Full Course: Start here
- Tableau Full Course: Start here
▶️ Free YouTube Courses
- SQL Full Course: Watch on YouTube
- Python Full Course: Watch on YouTube
- Tableau Full Course: Watch on YouTube
- Real-World Data Projects: Watch on YouTube
- Data Career Roadmaps: Watch on YouTube
🛡️ License
This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.
About Me
Hi, I'm Baraa Khatib Salkini, also known as Data With Baraa. I'm a senior data professional and educator with over 17 years of industry experience, working across data engineering, analytics, and modern data platforms. I've led large-scale data projects in real companies and now focus on teaching practical, real-world data skills through my courses, YouTube content, and bootcamps. My goal is simple: help you understand how data actually works in real systems, not just how to write code.