data_engineering_with_python-track-datacamp

Open source (MIT), Jupyter Notebook. 52 stars, 34 forks, 0 issues, 1 watcher. Last commit: 5 years ago.

About data_engineering_with_python-track-datacamp

Data Engineer with Python is a comprehensive learning track offering lecture notes and course materials that cover data engineering fundamentals and advanced techniques. The track teaches how to build effective data architectures, streamline data processing, and maintain large-scale data systems using Python, Shell, SQL, and Scala. Core topics include creating data engineering pipelines, automating file system tasks, writing efficient Python code, object-oriented programming, and unit testing. Students gain hands-on experience with cloud and big data tools, including AWS Boto, PySpark, Spark SQL, and MongoDB, for database creation, querying, and data wrangling. Additional courses cover relational database design, Bash scripting, Airflow for pipeline orchestration, and Scala programming. Advanced modules address cleaning data in SQL Server, transactions, error handling, triggers, and query performance optimization. The curriculum progresses from foundational concepts through specialized skills.

Data Engineer with Python

In this track, you’ll discover how to build an effective data architecture, streamline data processing, and maintain large-scale data systems. In addition to working with Python, you’ll grow your language skills by using Shell, SQL, and Scala to create data engineering pipelines, automate common file system tasks, and build a high-performance database.

Through hands-on exercises, you’ll add cloud and big data tools such as AWS Boto, PySpark, Spark SQL, and MongoDB to your data engineering toolkit, using them to create and query databases, wrangle data, and configure schedules to run your pipelines. By the end of this track, you’ll have mastered the critical database, scripting, and process skills you need to advance your career.


Courses

  1. Data Engineering for Everyone
  2. Introduction to Data Engineering
  3. Streamlined Data Ingestion with pandas
  4. Writing Efficient Python Code
  5. Writing Functions in Python
  6. Introduction to Shell
  7. Data Processing in Shell
  8. Introduction to Bash Scripting
  9. Unit Testing for Data Science in Python
  10. Object-Oriented Programming in Python
  11. Introduction to Airflow in Python
  12. Introduction to PySpark
  13. Building Data Engineering Pipelines in Python
  14. Introduction to AWS Boto in Python
  15. Introduction to Relational Databases in SQL
  16. Database Design
  17. Introduction to Scala
  18. Big Data Fundamentals with PySpark
  19. Cleaning Data with PySpark
  20. Introduction to Spark SQL in Python
  21. Cleaning Data in SQL Server Databases
  22. Transactions and Error Handling in SQL Server
  23. Building and Optimizing Triggers in SQL Server
  24. Improving Query Performance in SQL Server
  25. Introduction to MongoDB in Python