Complete MLOps Bootcamp
A comprehensive repository containing 10+ end-to-end Machine Learning projects following the Complete MLOps Bootcamp by Krish Naik on Udemy. This repository covers the entire MLOps lifecycle from data ingestion to model deployment and monitoring.
Course Structure
This repository is organized into modules that cover different aspects of MLOps:
Core MLOps Components
- 1-Introduction - MLOps fundamentals and overview
- 2-MLFLOW - Experiment tracking and model registry
- 3-DVC - Data version control and pipeline orchestration
- 4-Dagshub - Git-based ML platform integration
- 5-Mlpipeline - Building ML pipelines
- 6-MLflow with AWS - Cloud integration for model management
Containerization & Orchestration
- 7-Dockers - Containerizing ML applications
- 8-Apache Airflow - Workflow orchestration and scheduling
- 9-ETL Pipeline - Data extraction, transformation, and loading
CI/CD & Deployment
- 10-Github Actions - Automated workflows and CI/CD
- 11-Github Actions with DockerHub - Container-based deployments
End-to-End Projects
- 12-End-End DataScience Project - Complete ML pipeline implementation
- 13-End to End Network Security Project - ML for cybersecurity applications
- 14-End to End NLP Text Summarizer - Natural Language Processing project
- 15-AWS Sagemaker - Cloud-based ML platform
- 16-Grafana - Monitoring and visualization
- 17-LLmOps - Large Language Model operations
Technologies Covered
- Experiment Tracking: MLflow, Dagshub
- Version Control: Git, DVC
- Containerization: Docker
- Orchestration: Apache Airflow
- CI/CD: GitHub Actions
- Cloud Platforms: AWS, Amazon SageMaker
- Monitoring: Grafana
- ML Frameworks: TensorFlow, Keras, Scikit-learn
- Data Processing: Pandas, NumPy
- Hyperparameter Tuning: Hyperopt
Skills Acquired
Machine Learning Operations
- ML Lifecycle Management: Complete understanding of ML model development, deployment, and monitoring
- Experiment Tracking: Systematic tracking of model experiments, parameters, and metrics
- Model Versioning: Managing different versions of models and datasets
- Automated ML Pipelines: Building end-to-end automated machine learning workflows
DevOps for ML
- Containerization: Packaging ML applications using Docker
- CI/CD for ML: Implementing continuous integration and deployment for ML models
- Infrastructure as Code: Managing ML infrastructure using code
- Workflow Orchestration: Designing and managing complex ML workflows with Airflow
Data Engineering
- Data Version Control: Tracking and managing dataset versions with DVC
- ETL Pipeline Development: Building robust data extraction, transformation, and loading processes
- Data Validation: Implementing data quality checks and schema validation
- Feature Engineering: Creating and managing ML features systematically
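To make the data validation item concrete, here is a minimal sketch of a schema check in plain Python. The schema contents, the column names, and the `validate_rows` helper are hypothetical and not taken from the course code; in the projects the schema is defined in schema.yaml rather than inline.

```python
from typing import Any, Dict, List

# Hypothetical schema: column name -> expected Python type.
SCHEMA: Dict[str, type] = {"age": int, "income": float, "label": int}


def validate_rows(rows: List[Dict[str, Any]], schema: Dict[str, type]) -> List[str]:
    """Return a list of problems found; an empty list means the data passed."""
    problems: List[str] = []
    for i, row in enumerate(rows):
        # Report columns the schema requires but the row lacks.
        missing = set(schema) - set(row)
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        # Report columns whose value has the wrong type.
        for column, expected in schema.items():
            if column in row and not isinstance(row[column], expected):
                problems.append(
                    f"row {i}: column '{column}' expected {expected.__name__}, "
                    f"got {type(row[column]).__name__}"
                )
    return problems


good = [{"age": 31, "income": 52000.0, "label": 1}]
bad = [{"age": "31", "income": 52000.0}]  # wrong type for age, label missing

print(validate_rows(good, SCHEMA))
print(validate_rows(bad, SCHEMA))
```

Returning a list of problems instead of raising on the first one lets a validation stage write a full report before deciding whether the pipeline should continue.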
Cloud & Deployment
- Cloud ML Platforms: Working with Amazon SageMaker for model training and deployment
- Scalable Deployments: Deploying ML models at scale using cloud services
- Model Monitoring: Setting up monitoring and alerting for ML models in production
- A/B Testing: Implementing model comparison and testing strategies
Advanced ML Techniques
- Hyperparameter Optimization: Automated tuning of model parameters
- Model Evaluation: Comprehensive model performance assessment
- NLP Operations: Managing and deploying natural language processing models
- LLM Operations: Working with large language models in production
- Network Security ML: Applying ML to cybersecurity use cases
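As a rough illustration of hyperparameter optimization, the sketch below runs a plain random search using only the standard library. This is not Hyperopt, which the course lists for tuning; it is a simpler stand-in that shows the same loop of sampling parameters, scoring them, and keeping the best. The `random_search` function, the `toy_loss` objective, and the parameter ranges are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, Tuple


def random_search(
    objective: Callable[[Dict[str, float]], float],
    space: Dict[str, Tuple[float, float]],
    n_trials: int = 50,
    seed: int = 0,
) -> Tuple[Dict[str, float], float]:
    """Sample each parameter uniformly from (low, high); keep the lowest loss."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    best_params: Dict[str, float] = {}
    best_loss = float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Hypothetical stand-in for a validation loss, minimized at lr=0.1, reg=0.01.
def toy_loss(params: Dict[str, float]) -> float:
    return (params["lr"] - 0.1) ** 2 + (params["reg"] - 0.01) ** 2


space = {"lr": (0.0, 1.0), "reg": (0.0, 0.1)}
best_params, best_loss = random_search(toy_loss, space, n_trials=200)
print(best_params, best_loss)
```

In a real project the objective would train a model and return a validation metric, and each trial's parameters and score would typically be logged to an experiment tracker.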
Monitoring & Observability
- Model Performance Monitoring: Tracking model drift and performance degradation
- Log Management: Collecting and analyzing ML system logs
- Metrics Visualization: Creating dashboards for ML system monitoring
- Alert Systems: Setting up automated alerts for model issues
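One common way to quantify the drift mentioned above is the Population Stability Index (PSI), which compares how a feature is distributed in a reference sample and in recent data. The sketch below is a minimal pure-Python version under some assumptions: equal-width bins over the reference range, a small floor for empty bins, and non-empty inputs. The function name and the sample data are illustrative, not part of the course material.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], n_bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the reference range."""
    low, high = min(reference), max(reference)
    width = (high - low) / n_bins or 1.0  # fall back to 1.0 for constant data

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            index = min(max(int((v - low) / width), 0), n_bins - 1)
            counts[index] += 1
        # A small floor keeps the logarithm finite for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]  # simulated drift

print(population_stability_index(reference, reference))
print(population_stability_index(reference, shifted))
```

A value like this can be exported as a metric and plotted on a dashboard, with an alert threshold chosen to suit the model and data.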
Software Engineering Best Practices
- Code Organization: Structuring ML projects for maintainability and scalability
- Configuration Management: Managing project configurations and parameters
- Testing ML Systems: Implementing unit tests and integration tests for ML pipelines
- Documentation: Creating comprehensive documentation for ML projects
Prerequisites
Before starting with the projects, ensure you have:
- Python 3.8+ installed
- Git configured
- Docker installed (for containerization projects)
- AWS account (for AWS-related projects)
- Basic understanding of Machine Learning concepts
Getting Started
1. Clone the repository:
       git clone https://github.com/Suraj-G-Rao/Complete-MLOPS.git
       cd Complete-MLOPS
2. Install dependencies:
       pip install -r requirements.txt
3. Navigate to the desired module:
       cd "12-End-End DataScience Project"
4. Follow the module-specific README for detailed instructions.
Project Structure
Each module follows a consistent structure:
- Configuration files (config.yaml, schema.yaml, params.yaml)
- Source code (src/ directory)
- Research notebooks (research/ directory)
- Templates (templates/ directory)
- Main execution scripts (main.py, app.py)
Common Workflow
For most end-to-end projects, follow these steps:
1. Update configuration files
   - config.yaml - Project configuration
   - schema.yaml - Data schema definitions
   - params.yaml - Model parameters
2. Update components
   - Entity classes
   - Configuration manager
   - Individual components (data ingestion, transformation, etc.)
3. Build the pipeline
   - Update pipeline configuration
   - Modify main execution script
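The relationship between entity classes and the configuration manager can be sketched as follows. This is an illustrative outline, not the course's actual code: the `DataIngestionConfig` fields, the `data_ingestion` section name, the example URL, and the inline `raw_config` dictionary (standing in for what a YAML loader would return from config.yaml) are all assumptions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict


@dataclass(frozen=True)
class DataIngestionConfig:
    """Entity: typed, read-only settings for one component (hypothetical fields)."""

    root_dir: Path
    source_url: str
    local_data_file: Path


class ConfigurationManager:
    """Turns raw configuration values into entity objects for each component."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_url=section["source_url"],
            local_data_file=Path(section["local_data_file"]),
        )


# Stand-in for the dictionary a YAML loader would return for config.yaml.
raw_config = {
    "data_ingestion": {
        "root_dir": "artifacts/data_ingestion",
        "source_url": "https://example.com/data.zip",
        "local_data_file": "artifacts/data_ingestion/data.zip",
    }
}

ingestion_config = ConfigurationManager(raw_config).get_data_ingestion_config()
print(ingestion_config.root_dir)
```

Keeping the entity frozen means a component receives its settings once and cannot change them by accident, so every configuration change goes through the YAML files.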
ML Pipeline Stages
A typical ML pipeline includes:
- Data Ingestion - Collect and load data
- Data Validation - Validate data quality and schema
- Data Transformation - Feature engineering and preprocessing
- Model Training - Train ML models with hyperparameter tuning
- Model Evaluation - Evaluate model performance using MLflow/Dagshub
- Model Deployment - Deploy models to production
- Monitoring - Track model performance and drift
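The stages above can be chained by a small runner that passes shared state from one stage to the next. The sketch below covers only the first four stages with toy logic; the stage functions, the `run_pipeline` helper, and the in-memory data are placeholders chosen for illustration, not the course's implementation.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Stage = Callable[[State], State]


def ingest(state: State) -> State:
    # Placeholder for loading data from a file, database, or API.
    return {**state, "rows": [1.0, 2.0, None, 4.0]}


def validate(state: State) -> State:
    # Drop missing values; a real stage would check the data against a schema.
    return {**state, "rows": [r for r in state["rows"] if r is not None]}


def transform(state: State) -> State:
    # Scale every value by the maximum so features fall in (0, 1].
    peak = max(state["rows"])
    return {**state, "features": [r / peak for r in state["rows"]]}


def train(state: State) -> State:
    # Toy "model": the mean of the features.
    features = state["features"]
    return {**state, "model": sum(features) / len(features)}


def run_pipeline(stages: List[Tuple[str, Stage]], state: Optional[State] = None) -> State:
    """Run the stages in order, feeding each stage the previous stage's state."""
    state = dict(state or {})
    for name, stage in stages:
        print(f"stage {name} started")
        state = stage(state)
        print(f"stage {name} completed")
    return state


result = run_pipeline(
    [
        ("Data Ingestion", ingest),
        ("Data Validation", validate),
        ("Data Transformation", transform),
        ("Model Training", train),
    ]
)
print(result["model"])
```

Because each stage only reads and returns the state dictionary, stages can be tested in isolation and reordered or replaced without touching the runner.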
Contributing
This repository is part of a learning journey. Feel free to:
- Fork the repository
- Create issues for bugs or improvements
- Submit pull requests
- Share your learning experience
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Krish Naik - For the excellent MLOps Bootcamp course on Udemy
- MLOps Community - For valuable resources and best practices
- Open Source Contributors - For the amazing tools and libraries used
Contact
- Course: Complete MLOps Bootcamp With 10+ End To End ML Projects
- Platform: Udemy
Note: This repository is for educational purposes. Please ensure you have the necessary permissions and licenses for any production use of the code and models.