iusztinpaul

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Software by iusztinpaul

Open Source

designing-real-world-ai-agents-workshop

# Build Your Own Deep Research Agent + Technical Writer Multi-Agent System

A hands-on workshop, presented at [AI Engineering Conference Europe](https://www.ai.engineer/europe), building a multi-agent AI system with two MCP servers: a **Deep Research Agent** and a **LinkedIn Writing Workflow**. Both connect to a harness like Claude Code or Cursor.

🎬 Full workshop available on [YouTube](https://www.youtube.com/watch?v=mYSRn6PC1mc) ↓

<a href="https://www.youtube.com/watch?v=mYSRn6PC1mc">
  <img src="https://img.youtube.com/vi/mYSRn6PC1mc/maxresdefault.jpg" alt="Watch the video" style="width:100%; max-width:600px;">
</a>

📑 Slides [here](https://drive.google.com/file/d/1RWdS5VQYjz7a9y7NzHhAnyhGtxi6e0vt/view?usp=sharing).

----

## Whenever You're Ready, Here's How to Go Deeper

<a href="https://academy.towardsai.net/courses/agent-engineering?utm_source=github&utm_medium=aieng&utm_campaign=2026_aieng_workshop&utm_id=researchwriter"><img src="media/course_clip.gif" alt="Agentic AI Engineering Course" width="800"/></a>

This workshop is a 2–4 hour taste. If you want to go from zero to shipping production-grade AI agents, check out our [**Agentic AI Engineering Course**](https://academy.towardsai.net/courses/agent-engineering?utm_source=github&utm_medium=aieng&utm_campaign=2026_aieng_workshop&utm_id=researchwriter), built with Towards AI.

**34 lessons. Three end-to-end portfolio projects. A certificate. And a Discord community with direct access to industry experts and us.**

Rated 5/5 by 300+ students. The first 6 lessons are free:

[**Start here →**](https://academy.towardsai.net/courses/agent-engineering?utm_source=github&utm_medium=aieng&utm_campaign=2026_aieng_workshop&utm_id=researchwriter)

----

## How to Use This Repo

There are three ways to use this repo. Pick the mode that fits the time you have, or work through all three in order, since each builds on the last:

1. **Watch the workshop and see the patterns end-to-end. Watch in ~2 hr.** Start with the [2-hour YouTube workshop](https://www.youtube.com/watch?v=mYSRn6PC1mc) and the [slides](https://drive.google.com/file/d/1RWdS5VQYjz7a9y7NzHhAnyhGtxi6e0vt/view?usp=sharing) above. You'll come away with a mental model of the full multi-agent system: tool-use agents, evaluator-optimizer loops, grounded search, structured LLM output, and MCP-server design.
2. **Run the finished code. See it produce real artifacts. Run in ~30 min.** Watch the system generate a research brief, draft a LinkedIn post through an evaluator-optimizer loop, and score itself with an LLM-as-judge. Follow the [Getting Started](#getting-started) and [Running the Code](#running-the-code) sections to install the project and run the MCP servers, skills, and evaluation pipeline.
3. **Implement it yourself with agentic coding. Build a 1:1 replica from scratch in ~2–4 hr.** Open [`implement_yourself/`](implement_yourself/), a stripped-down skeleton prepared with 25 pre-groomed tickets and a custom `/implement` Claude Code skill that orchestrates SWE and Tester agents in a loop, ticket by ticket, until the directory matches `src/`. See [`implement_yourself/README.md`](implement_yourself/README.md) for the kickoff guide.

> **No cheating, by design.** `implement_yourself/` is a self-contained project. Open your harness (Claude Code, Cursor, …) **directly in that folder** (not at the repo root) so its working directory is scoped to the skeleton. The agents can't see the reference implementation in `../src/`, can't grep it, can't read its files. You get a real build, not a copy-paste.

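
The evaluator-optimizer loop mentioned in the modes above can be pictured with a minimal, offline sketch. This is not the workshop's implementation: `Review`, `evaluator_optimizer`, and the three `*_stub` functions are hypothetical names, the stubs stand in for LLM calls, and the early exit on approval is an assumption (the workshop example runs a fixed number of review/edit rounds).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    """Feedback from the evaluator step (hypothetical shape)."""
    issues: list[str]

    @property
    def approved(self) -> bool:
        return not self.issues


def evaluator_optimizer(
    generate: Callable[[], str],
    review: Callable[[str], Review],
    edit: Callable[[str, Review], str],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Generate a draft, then alternate review and edit until approved or out of rounds."""
    draft = generate()
    rounds = 0
    while rounds < max_rounds:
        feedback = review(draft)
        if feedback.approved:
            break
        draft = edit(draft, feedback)
        rounds += 1
    return draft, rounds


# Stand-ins for LLM calls: plain functions so the sketch runs without an API key.
def generate_stub() -> str:
    return "We planned 12 AI agents.  We shipped 1.  Sounds crazy, right?"


def review_stub(draft: str) -> Review:
    issues = []
    if "  " in draft:
        issues.append("double spaces")
    if "Sounds crazy, right?" in draft:
        issues.append("filler question")
    return Review(issues)


def edit_stub(draft: str, feedback: Review) -> str:
    # Address one issue per round to mimic incremental edits.
    if feedback.issues[0] == "double spaces":
        return draft.replace("  ", " ")
    return draft.replace(" Sounds crazy, right?", "")


final_post, rounds_used = evaluator_optimizer(generate_stub, review_stub, edit_stub)
print(rounds_used, final_post)  # 2 We planned 12 AI agents. We shipped 1.
```

Passing the three steps in as callables keeps the loop itself independent of any model client, so the same control flow can be exercised with stubs in tests and with real LLM calls in a pipeline.
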
## What You'll Build Today

**Deep Research Agent** — An MCP server that runs deep research using Gemini with Google Search grounding and native YouTube video analysis:

```
user topic → [deep_research] × N → analyze_youtube_video (if URLs) → [deep_research gap-fill] → compile_research → research.md
```

**LinkedIn Writing Workflow** — An MCP server that generates LinkedIn posts with an evaluator-optimizer loop:

```
research.md + guideline → generate post → [review → edit] × N → post.md → generate image
```

Both servers expose tools, resources, and prompts via the [Model Context Protocol](https://modelcontextprotocol.io/), letting any MCP-compatible harness orchestrate the workflow.

<img src="media/architecture.png" alt="End-to-end workflow architecture" width="800"/>

**Patterns and concepts you'll learn:**

- **Tool-use agents** — letting the LLM decide which tools to call and when
- **Evaluator-optimizer loop** — generate, review, edit in cycles
- **Grounded search** — Gemini with Google Search grounding for factual research
- **Structured LLM output** — Pydantic schemas for type-safe model responses
- **MCP server design** — registering tools, resources, and prompts with FastMCP
- **LLM-as-judge evaluation** — automated quality scoring with Opik

<img width="1400" height="1380" alt="system_architecture" src="https://github.com/user-attachments/assets/5507d5dd-5809-4e01-bcf3-a6de980bc773" />

## Example: End-to-End Workflow

Here's a real run through the full pipeline — from a topic seed to a published-ready LinkedIn post with an AI-generated image.

### Final output

<div align="center">
<table>
<tr>
<td>
<div>
Phil Tobaloo AI Engineer | I ship AI products and teach you about the process.
</div>

---

We planned 12 AI agents and shipped 1. It worked better. Sounds crazy, right? But it's a common story. A client built an AI marketing chatbot. Their initial design had dozens of agents: orchestrator, validators, spam prevention. It failed. A single agent with tools won.
Tasks were tightly coupled. One brain maintained context. Tools were still specialized. This is the core mistake. People jump to complex multi-agent setups too fast. Think AI system design as a spectrum:

* Workflows: You control steps.
* Single Agent + Tools: Model decides flow.
* Multi-Agent: Multiple decision-makers.

...

A single agent works for most cases. But it has limits. Too many tools? You hit "context rot." Past ~10-20 tools, LLMs degrade at tool selection. They get overwhelmed. Information gets lost in the middle. So, when do you actually need multi-agent?

...

**The simplest system that reliably solves the problem is always the best system.**

Don't overengineer your AI agents. Build simple first. What's the most complex agent architecture you've simplified? Tell me below.

<a href="media/post_3.md">Read the full post</a>

<img src="media/post_image.png" width="500"/>

</td>
</tr>
</table>
</div>

<details>
<summary>Step-by-step breakdown (seed → research → guideline → drafts)</summary>

#### 1. Start with a seed

A short research brief with 2-3 questions and reference links:

```markdown
# Research Topic: AI Agent Architecture — When Less Is More

## Key Questions
1. Why do single-agent architectures with smart tools outperform multi-agent systems?
2. What are the only legitimate reasons to adopt a multi-agent architecture?

## References
- Stop Overengineering: Workflows vs AI Agents Explained (YouTube)
- From 12 Agents to 1 (DecodingAI article)
```

#### 2. Deep Research Agent produces `research.md`

The agent runs multiple Gemini-grounded search queries and analyzes YouTube videos, then compiles everything into a structured research brief with sources.

> The full research.md for this example is ~20k tokens across 2 queries and 1 video transcript.

#### 3. Write a guideline

A short brief describing the post angle, audience, and key points:

```markdown
# LinkedIn Post Guideline

## Topic
Why most AI teams should use 1 agent instead of 12.

## Angle
Open with the counterintuitive "12 agents → 1" hook. Introduce the complexity spectrum. End with a clear mental model.

## Target Audience
AI engineers and technical leads building LLM-powered applications.

## Key Points
- A team planned 12 agents but shipped 1 — it worked better.
- The spectrum: workflows → single agent + tools → multi-agent. Stay left.
- "Context rot": past ~10-20 tools, LLMs degrade at tool selection.
- Only 4 valid reasons for multi-agent.

## Tone
Direct, opinionated, engineer-to-engineer. No fluff.
```

#### 4. Writing Workflow refines the post

The evaluator-optimizer loop generates a draft, then runs 3 rounds of review + edit:

<table>
<tr>
<td width="50%">

**v0 — Initial draft**

> We planned 12 AI agents. We shipped 1.
>
> Sounds crazy, right? But it's a common story.
>
> A client wanted an AI chatbot for marketing content: emails, SMS, promos. Their initial design had dozens of specialized agents: orchestrator, analyzers, validators, spam prevention.
>
> In practice? A single agent with tools won. Tasks were tightly coupled, sequential. Splitting it created information silos and handoff errors. [...]
>
> **The simplest system that reliably solves the problem is always the best system.**

</td>
<td width="50%">

**v3 — After 3 review/edit cycles**

> We planned 12 AI agents and shipped 1. It worked better.
>
> A client built an AI marketing chatbot. Their initial design had dozens of agents: orchestrator, validators, spam prevention. It failed.
>
> A single agent with tools won. Tasks were tightly coupled. One brain maintained context. Tools were still specialized.
>
> **Stay as far left as possible.** Move right only when forced. [...]
>
> **The simplest system that reliably solves the problem is always the best system.**

</td>
</tr>
<tr>
<td align="center">Verbose, redundant phrasing, weak hook</td>
<td align="center">Tighter, punchier, stronger structure</td>
</tr>
</table>

</details>

<details>
<summary>Example 2 — Harness Engineering (click to expand)</summary>

<img src="examples/harness-engineering/post_image.png" width="300"/>

```
Harness engineering isn't just a new term for prompt engineering. It's where AI is heading. Agents got useful enough for code and tools, but they weren't reliable. They'd repeat mistakes. The bottleneck shifted from code generation to consistent, reliable behavior in real systems. Think of it this way: prompt engineering is what to ask. Context engineering is what to send the model. Harness engineering is how the whole thing operates. It's the environment around the model, beyond just tokens. Car analogy: the model is the engine. Context is the fuel. The harness is the rest of the car: steering, brakes, lane boundaries. It prevents crashes. A harness includes tools, permissions, state, tests, logs, retries, checkpoints, guardrails, and evals. Stop hoping the model improves. Engineer its environment. The burden shifts to us, the builders, to prevent repeat mistakes. I use self-reflection in my Claude Code setup. The agent learns what I liked, saving tokens and time. Real companies are already doing this. Anthropic's long-running agents externalize memory into artifacts. OpenAI built a 1M-line product with zero manual code using structured docs and agent-to-agent reviews. Stripe agents merge 1K+ PRs weekly within isolated environments. LangChain moved a coding agent from outside the top 30 to top 5 on Terminal Bench 2.0 by changing only the harness. Same model, better system. This isn't just for coding agents. This is the new way software gets built. The programmer's job is shifting: less writing code, more designing habitats for agents to work without issues. Think machine-readable docs, evals, sandboxes, permission boundaries, and structural tests. Reliability is the real work. Not just prompting. LLMs are heading into systems, workflows, harnesses. Value comes from orchestration, constraints, feedback loops—not just a single prompt. The future isn't one genius model. It's models in well-engineered environments. That's why harness engineering matters. It's what happens when you stop demoing intelligence and start shipping it. Want to learn more? I explain it all in my latest video: https://youtu.be/zYerCzIexCg What's your biggest challenge building reliable agent systems right now?
```

</details>

<details>
<summary>Example 3 — Angine de Poitrine for AI Engineers (click to expand)</summary>

<img src="examples/angine-de-poitrine-ai-engineers/post_image.png" width="300"/>

```
Forget your latest AI model. There's a new system breaking the internet: Angine de Poitrine. This masked duo from Quebec, deploys a high-resolution audio architecture. It makes everything else sound low-res. Khn's custom double-necked microtonal guitar features 2x resolution: 24 notes per octave, not 12. Fine-grained frequency modulation. Klek backs him up on drums, driving rhythms that feel like O(n^2) time signatures. Pure algorithmic complexity. Their sound: "Dada Pythagorean-Cubist mantra-rock." Eastern traditions meet Frank Zappa. Their 27-minute inference run for KEXP in Feb 2026 hit 7M+ views and broke the internet. Sold-out shows in NYC, London, Rennes followed. Their anonymity adds another layer. Polka-dot costumes and papier-mâché masks are anonymous inference endpoints. They communicate in an invented language. This ensures pure signal: raw output, no artist biases. A decoupled identity art experiment. Khn's loop pedals are recursive pipelines. They stack complex guitar and bass in real-time, building dense soundscapes. Just dropped: their new album, Vol. II, on April 3, 2026. AI music is common.
Angine de Poitrine proves human artistry is the ultimate non-deterministic function. Raw, complex, and deeply human. You need to hear this. What's the most complex system you've encountered recently?
```

</details>

> Browse more full examples (seed, research, post drafts, reviews, final post + image) in the [`examples/`](examples/) directory.

## Tech Stack

| Component        | Tool                                   |
| ---------------- | -------------------------------------- |
| LLM API          | Google Gemini (via `google-genai` SDK) |
| MCP Framework    | FastMCP                                |
| Data Validation  | Pydantic                               |
| Settings         | Pydantic Settings                      |
| Observability    | Opik                                   |
| Image Generation | Gemini Flash Image                     |
| QA               | Ruff                                   |
| Package Manager  | uv                                     |

## Getting Started

> **Assumes** working Python knowledge and basic familiarity with LLMs.

### Prerequisites

| Requirement | Check | Install |
|-------------|-------|---------|
| Python 3.12+ | `python --version` | `uv python install 3.12` or [python.org](https://www.python.org/downloads/) |
| uv 0.7+ | `uv --version` | `curl -LsSf https://astral.sh/uv/install.sh \| sh` ([docs](https://docs.astral.sh/uv/getting-started/installation/)) |
| GNU Make | `make --version` | Pre-installed on macOS/Linux. Windows: `choco install make` |
| Google API Key | — | [aistudio.google.com/apikey](https://aistudio.google.com/apikey) (required — all LLM calls use Gemini) |
| Opik account | — | [comet.com/site/products/opik](https://www.comet.com/site/products/opik/) (optional, for observability and evals) |

### Installation

1. **Clone and configure:**

   ```bash
   git clone https://github.com/iusztinpaul/designing-real-world-ai-agents-workshop.git
   cd designing-real-world-ai-agents-workshop
   cp .env.example .env  # add your GOOGLE_API_KEY (+ optional OPIK_API_KEY)
   ```

2. **Install dependencies:**

   ```bash
   uv sync
   ```

   > **Note:** If you don't have Python 3.12+, uv can install it for you: `uv python install 3.12`, then re-run `uv sync`.

3. **Verify the setup:**

   ```bash
   make test-end-to-end  # runs research + writing pipeline end-to-end
   ```

   If it completes without errors, you're good to go.

## Running the Code

There are four ways to run the workflows:

| Mode | Best for |
|------|----------|
| **MCP Servers** (recommended) | Interactive use with AI harness |
| **Skills** | Guided slash-command workflows |
| **Streamlit UI** | Visual end-to-end demo with live progress |
| **Scripts** | Verify setup, smoke tests |

### MCP Servers (recommended)

Connect the servers to an MCP-compatible harness (Claude Code, Cursor) for interactive use. This is the primary way to use the workshop.

**Setup:** The `.mcp.json` file is pre-configured. Both servers start automatically when you open the project in Claude Code or Cursor.

| Server | Tools | Prompt |
|--------|-------|--------|
| `deep-research` | `deep_research`, `analyze_youtube_video`, `compile_research` | `research_workflow` |
| `linkedin-writer` | `generate_post`, `edit_post`, `generate_image` | `linkedin_post_workflow` |

**Usage:**

1. Open the project in Claude Code or Cursor
2. Invoke an MCP prompt (e.g., `research_workflow`) to get guided through the full workflow
3. Or call individual tools directly for fine-grained control

**Manual server start (advanced):**

```bash
make run-research-server  # stdio transport
make run-writing-server   # stdio transport
```

### Skills

Pre-built slash commands that orchestrate the MCP tools with sensible defaults. All output goes to `outputs/{topic-slug}/`.

| Command | What it does |
|---------|-------------|
| `/research` | Deep research on a topic → `research.md` |
| `/write-post` | Generate LinkedIn post from existing research → `post.md` + `post_image.png` |
| `/research-and-write` | Full pipeline: research a topic, then write a post from it |

Example:

```
/research-and-write
```

The skill will ask you for a topic and guideline, then run the full pipeline end-to-end.
Check [`examples/`](examples/) to see what each step produces.

### Streamlit UI

A standalone chat UI that orchestrates both MCP servers via FastMCP — no harness required. Drop in a topic (or upload a `.md` / `.txt` seed file) and watch the pipeline run end-to-end with live per-stage progress: search counters, sources collected, evaluator-optimizer loop, and image generation.

```bash
make run-ui
```

![Streamlit UI showing live deep-research progress, the writing workflow's evaluator-optimizer loop, and the generated image](media/streamlit_ui.png)

Outputs land in `outputs/{topic-slug}/` (same layout as the skills).

<details>
<summary>Scripts (terminal-only, for smoke tests)</summary>

Run workflows directly from the terminal via `make`. Useful for verifying your setup works and running quick smoke tests. See [`examples/`](examples/) for full end-to-end output samples.

**Test workflows:**

```bash
make test-research-workflow    # Research on a sample topic → test_logic/research.md
make test-writing-workflow     # Generate post from research → test_logic/post.md
make test-end-to-end           # Both steps sequentially
```

> **Note:** `test-writing-workflow` requires `test_logic/research.md` to exist. Run `test-research-workflow` first, or use `test-end-to-end`.

**Full dataset run:**

The [`datasets/`](datasets/) directory contains a pre-built LinkedIn posts dataset with seeds, guidelines, research documents, ground truth posts, and generated outputs — used for both batch runs and evaluation.

```bash
make run-dataset-writing             # Research + write for all dataset posts (with images)
make run-dataset-writing-no-image    # Same, skip image generation (faster)
```

</details>

### Evaluation (requires Opik)

The workshop includes an LLM-as-judge evaluation pipeline. Instead of manually reviewing each generated post, an LLM scores them against quality criteria (structure, tone, accuracy).
[Opik](https://www.comet.com/site/products/opik/) tracks these scores across runs so you can measure whether prompt or pipeline changes actually improve output quality.

```bash
make eval-dev      # LLM judge on dev split
make eval-test     # LLM judge on test split
make eval-online   # Generate + judge posts on the fly
```

> Each command automatically uploads the dataset to Opik before running. To upload without evaluating (e.g., to browse in the Opik UI), use `make upload-eval-dataset`.

## Project Structure

```
├── src/
│   ├── research/           # Deep Research Agent MCP server
│   │   ├── server.py       # FastMCP entry point
│   │   ├── config/         # Settings, constants, prompt templates
│   │   ├── models/         # Pydantic schemas for structured LLM output
│   │   ├── app/            # Business logic handlers
│   │   ├── tools/          # MCP tool implementations
│   │   ├── routers/        # MCP tool, resource, and prompt registration
│   │   └── utils/          # Gemini client, file I/O, Opik, markdown helpers
│   └── writing/            # LinkedIn Writer MCP server
│       ├── server.py       # FastMCP entry point
│       ├── profiles/       # Shipped markdown profiles (structure, terminology, character, branding)
│       ├── config/         # Settings, constants, prompt templates
│       ├── models/         # Pydantic schemas (Post, Review, Profiles)
│       ├── app/            # Business logic handlers
│       ├── evals/          # LLM judge metric, dataset upload, evaluation harness
│       ├── tools/          # MCP tool implementations
│       ├── routers/        # MCP tool, resource, and prompt registration
│       └── utils/          # Gemini client, Imagen, Opik helpers
├── datasets/               # LinkedIn posts dataset with labels and splits
├── examples/               # Full end-to-end output samples (seed → research → posts → image)
├── scripts/                # Entrypoints and test scripts
├── .mcp.json               # MCP server configuration for harnesses
├── Makefile                # Command center
└── .env.example            # Environment variable template
```

## Next Steps

| Resource | Description |
| -------- | ----------- |
| [Agentic AI Engineering Course](https://academy.towardsai.net/courses/agent-engineering?utm_source=github&utm_medium=aieng&utm_campaign=2026_aieng_workshop&utm_id=researchwriter) | Our full course on shipping production-grade AI agents. 34 lessons. Three end-to-end portfolio projects. A certificate. And a Discord community. |
| [Agentic AI Engineering Guide](https://email-course.towardsai.net/) | Free 6-day email course on the mistakes that silently break AI agents in production. |
| [AI Engineering Cheatsheets](https://github.com/louisfb01/ai-engineering-cheatsheets) | Quick-reference sheets for agents, RAG, fine-tuning, and more. Ready to be plugged into Claude Code as context. |

## Contributors

<table>
  <tr>
    <td align="center"><img src="https://github.com/louisfb01.png" width="150" alt="Louis-François Bouchard"/></td>
    <td align="center"><img src="https://github.com/iusztinpaul.png" width="150" alt="Paul Iusztin"/></td>
    <td align="center"><img src="https://github.com/sam04-ops.png" width="150" alt="Samridhi Vaid"/></td>
  </tr>
  <tr>
    <td align="center"><a href="https://github.com/louisfb01">Louis-François Bouchard</a> AI Engineer</td>
    <td align="center"><a href="https://github.com/iusztinpaul">Paul Iusztin</a> AI Engineer</td>
    <td align="center"><a href="https://github.com/sam04-ops">Samridhi Vaid</a> AI Engineer</td>
  </tr>
</table>

## License

MIT License. See [LICENSE](LICENSE) for details.

Copyright (c) 2026 Paul Iusztin, Towards AI Inc

AI Agents LMS

442 Github Stars

Open Source

energy-forecasting

# The Full Stack 7-Steps MLOps Framework

`Learn MLE & MLOps for free by designing, building, deploying and monitoring an end-to-end ML batch system | source code + 2.5 hours of reading & video materials on Medium`

This repository contains a **7-lesson FREE course** to teach you how to **build a production-ready ML batch system**. Its primary focus is to engineer a scalable system using MLOps good practices. You will implement an ML system for forecasting hourly energy consumption levels across Denmark.

You will **learn how to build, train, serve, and monitor an ML system** using a batch architecture. We will show you how to integrate an experiment tracker, a model registry, a feature store, Docker, Airflow, GitHub Actions and more!

**Level:** Intermediate to Advanced | This **course targets** MLEs who want to build end-to-end ML systems and SWEs who wish to transition to MLE.

------

Following the **documentation on GitHub** and the [lessons on Medium](#lessons), you have *2.5 hours of reading & video materials*, which will help you understand every piece of the code!

**At the end of the course, you will know how to build everything from the diagram below 👇**

Don't worry if something doesn't make sense to you. We will explain everything in detail in the [Medium lessons](#lessons).

If you are unsure if this course is for you, [here is an article presenting a high-level overview](https://pub.towardsai.net/the-full-stack-7-steps-mlops-framework-6599a0c6e295) of all the components you will build during the series.

<img src="images/architecture.png">

<div align="center">
  <a href="https://youtu.be/OKk9U310qYE">
    Check out this short video to see what you will build during the course 👇
    <img src="images/screenshot_introduction_video.png" alt="Introduction Video" style="width:75%;">
  </a>
</div>

*You can safely use this code as you like, as long as you respect the terms of the MIT License.*

``<<< Using all the tools suggested in the course will be free of charge, except the ones from Lesson 7, where you will be deploying your application to GCP, which will cost you ~$20. >>>``

# Table of Contents

1. [What You Will Learn](#learn)
2. [Lessons & Tutorials](#lessons)
3. [Costs](#costs)
4. [Ask Questions](#ask-questions)
5. [Data](#data)
6. [Code Structure](#structure)
7. [Set Up Additional Tools](#tools)
8. [Usage](#usage)
9. [Installation & Usage for Development](#installation)
10. [Licensing & Contributing](#licensing)
11. [Support](#support)

--------

# 🤔 1. What You Will Learn <a name=learn></a>

**At the end of this 7-lesson course, you will know how to:**

* design a batch-serving architecture
* use Hopsworks as a feature store
* design a feature engineering pipeline that reads data from an API
* build a training pipeline with hyperparameter tuning
* use W&B as an ML Platform to track your experiments, models, and metadata
* implement a batch prediction pipeline
* use Poetry to build your own Python packages
* deploy your own private PyPi server
* orchestrate everything with Airflow
* use the predictions to code a web app using FastAPI and Streamlit
* use Docker to containerize your code
* use Great Expectations to ensure data validation and integrity
* monitor the performance of the predictions over time
* deploy everything to GCP
* build a CI/CD pipeline using GitHub Actions

If that sounds like a lot, don't worry. After you complete this course, you will understand everything listed above.
Most importantly, you will know WHY we used all these tools and how they work together as a system.

# 🤌 2. Lessons & Tutorials <a name=lessons></a>

The course consists of 7 lessons hosted on Medium's Towards Data Science publication. We also provide a bonus lesson where we openly discuss potential improvements to the current architecture and the trade-offs we had to make during the course.

The course adds up to *2.5 hours of reading and video materials*.

`We recommend running the code along the articles to get the best out of this course, as we provide detailed instructions to set everything up.`

**👇 Access the step-by-step lessons on Medium 👇**

Here is an [article presenting a high-level overview](https://pub.towardsai.net/the-full-stack-7-steps-mlops-framework-6599a0c6e295) of all the components you will build during the course.

1. [Batch Serving. Feature Stores. Feature Engineering Pipelines.](https://medium.com/towards-data-science/a-framework-for-building-a-production-ready-feature-engineering-pipeline-f0b29609b20f)
2. [Training Pipelines. ML Platforms. Hyperparameter Tuning.](https://medium.com/towards-data-science/a-guide-to-building-effective-training-pipelines-for-maximum-results-6fdaef594cee)
3. [Batch Prediction Pipeline. Package Python Modules with Poetry.](https://medium.com/towards-data-science/unlock-the-secret-to-efficient-batch-prediction-pipelines-using-python-a-feature-store-and-gcs-17a1462ca489)
4. [Private PyPi Server. Orchestrate Everything with Airflow.](https://towardsdatascience.com/unlocking-mlops-using-airflow-a-comprehensive-guide-to-ml-system-orchestration-880aa9be8cff)
5. [Data Validation for Quality and Integrity using GE. Model Performance Continuous Monitoring.](https://towardsdatascience.com/ensuring-trustworthy-ml-systems-with-data-validation-and-real-time-monitoring-89ab079f4360)
6. [Consume and Visualize your Model's Predictions using FastAPI and Streamlit. Dockerize Everything.](https://towardsdatascience.com/fastapi-and-streamlit-the-python-duo-you-must-know-about-72825def1243)
7. [Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using Github Actions.](https://towardsdatascience.com/seamless-ci-cd-pipelines-with-github-actions-on-gcp-your-tools-for-effective-mlops-96f676f72012)
8. [\[Bonus\] Behind the Scenes of an ‘Imperfect’ ML Project — Lessons and Insights.](https://towardsdatascience.com/imperfections-unveiled-the-intriguing-reality-behind-our-mlops-course-creation-6ff7d52ecb7e)

# 💵 3. Costs <a name=costs></a>

The code from GitHub is released under the MIT license. Thus, as long as you redistribute our LICENSE and give credit to our work, you can use it as you wish.

The Medium lessons are behind Medium's paywall. If you already have a Medium membership, they are free to you. Otherwise, you must pay a $5 monthly fee to read the articles.

On the tools side, I will dig deeper when covering each tool independently, but TL;DR, it will cost you around ~$20 to deploy everything to GCP.

To conclude, the course will cost you ~$25-$30 to read and run end-to-end.

# ❔ 4. Ask Questions <a name=ask-questions></a>

If you have any questions or issues during the course, please create an [issue](https://github.com/iusztinpaul/energy-forecasting/issues). I will do my best to respond. Also, you can contact me directly on [LinkedIn](https://www.linkedin.com/in/pauliusztin/).

-------

# 📊 5. Data <a name=data></a>

We used an open API that provides hourly energy consumption values for all the energy consumer types within Denmark.

They provide an intuitive interface where you can easily query and visualize the data. You can access the data [here](https://www.energidataservice.dk/tso-electricity/ConsumptionDE35Hour).

The data has 4 main attributes:

* **Hour UTC**: the UTC datetime when the data point was observed.
* **Price Area**: Denmark is divided into two price areas: DK1 and DK2 - divided by the Great Belt.
DK1 is west of the Great Belt, and DK2 is east of the Great Belt.
* **Consumer Type**: The consumer type is the Industry Code DE35, owned and maintained by Danish Energy.
* **Total Consumption**: Total electricity consumption in kWh

**Note:** The observations have a lag of 15 days! But for our demo use case, that is not a problem, as we can simulate the same steps as it would be in real time.

### IMPORTANT OBSERVATION

The API will become obsolete and unavailable during 2023. Its latest data points are from June 2023. We created a copy of the data from 2020-07-01 to 2023-06-30 to bypass this issue. Thus, there are 3 years of data to play with, which is more than enough for the purpose of this course. The file is stored in Google Drive, accessible [at this link](https://drive.google.com/file/d/1y48YeDymLurOTUO-GeFOUXVNc9MCApG5/view?usp=drive_link).

Thus, instead of querying the API, we will mock the same behavior by loading the data from the file. You don't have to download the file yourself. The code will download it and load the data from the file instead of the API, simulating 100% the same behavior.

**---> All Rights Reserved to: www.energidataservice.dk**

<img src="images/forecasting_demo_screenshot.png">

The data points have an hourly resolution. For example: "2023-04-15 21:00Z", "2023-04-15 20:00Z", "2023-04-15 19:00Z", etc.

We will model the data as multiple time series. Each unique price area and consumer type tuple represents its own time series.

Thus, we will build a model that independently forecasts the energy consumption for the next 24 hours for every time series.

[Check out this video to better understand what the data looks like.](https://youtu.be/OKk9U310qYE)

----------

# 🧬 6. Code Structure <a name=structure></a>

The code is split into two main components: the `pipeline` and the `web app`.

The **pipeline** consists of 3 modules:
- `feature-pipeline`
- `training-pipeline`
- `batch-prediction-pipeline`

The **web app** consists of 3 other modules:
- `app-api`
- `app-frontend`
- `app-monitoring`

**Also,** we have the following folders:
- `airflow` : Airflow files | Orchestration
- `.github` : GitHub Actions files | CI/CD
- `deploy` : Build & Deploy

To follow the structure in its natural flow, read the folders in the following order:
1. `feature-pipeline`
2. `training-pipeline`
3. `batch-prediction-pipeline`
4. `airflow`
5. `app-api`
6. `app-frontend` & `app-monitoring`
7. `.github`

**Read the Medium articles listed in the [Lessons & Tutorials](#lessons) section for the whole experience.**

-------

# 🔧 7. Set Up Additional Tools <a name=tools></a>

**The code is tested only on Ubuntu 20.04 and 22.04 using Python 3.9.**

We use a `.env` file to store all our credentials. Every module that needs a `.env` file has a `.env.default` in the module's main directory that acts as a template. Thus, you have to run:
```shell
cp .env.default .env
```
... and complete what is surrounded by `<...>`. For now, don't do anything. We will explain in detail in later steps what you have to do.

## Poetry

##### ``<< free usage >>``

**Note:** During the course, we used `Poetry 1.4.2`. To avoid potential issues when installing the dependencies using Poetry, we recommend you use the same version (or, if you hit errors with a different version, you can delete and regenerate the `poetry.lock` file).
Install Python system dependencies:
```shell
sudo apt-get install -y python3-distutils
```
Download and install Poetry:
```shell
curl -sSL https://install.python-poetry.org | python3 -
```
Open the `.bashrc` file to add the Poetry PATH:
```shell
nano ~/.bashrc
```
Add `export PATH=~/.local/bin:$PATH` to `~/.bashrc`

Check if Poetry is installed:
```shell
source ~/.bashrc
poetry --version
```

[If necessary, here are the official Poetry installation instructions.](https://python-poetry.org/docs/#installation)

#### macOS M1/M2 Poetry Issues

**!!!** If you have issues creating Poetry environments on macOS M1/M2 devices, [Hongnan Gao](https://github.com/gao-hongnan) implemented a script that will solve all the dependency issues. Just run the following before creating a Poetry environment:
```shell
bash scripts/install_poetry_macos_m1_chip.sh
```

## Docker

##### ``<< free usage >>``

During the course we used `Docker version 24.0.5`.

* [Install Docker on Ubuntu.](https://docs.docker.com/engine/install/ubuntu/)
* [Install Docker on Mac.](https://docs.docker.com/desktop/install/mac-install/)
* [Install Docker on Windows.](https://docs.docker.com/desktop/install/windows-install/)

## Configure Credentials for the Private PyPi Server

##### ``<< free usage >>``

**We will run the private PyPi server using Docker down the line. But it will already expect the credentials to be configured.**

Create credentials using `passlib`:
```shell
# Install dependencies.
sudo apt install -y apache2-utils
pip install passlib

# Create the credentials under the energy-forecasting name.
mkdir ~/.htpasswd
htpasswd -sc ~/.htpasswd/htpasswd.txt energy-forecasting
```

Set `poetry` to use the credentials:
```shell
poetry config repositories.my-pypi http://localhost
poetry config http-basic.my-pypi energy-forecasting <password>
```

Check that the credentials are set correctly in your poetry `auth.toml` file:
```shell
cat ~/.config/pypoetry/auth.toml
```

## Hopsworks

##### ``<< free usage >>``

You will use [Hopsworks](https://www.hopsworks.ai/) as your serverless feature store. Thus, you have to create an account and a project on Hopsworks. We will show you how to configure the code to use your Hopsworks project later.

[We explained on Medium in **Lesson 1** how to create a Hopsworks API Key.](https://medium.com/towards-data-science/a-framework-for-building-a-production-ready-feature-engineering-pipeline-f0b29609b20f) But long story short, you can go to your Hopsworks account settings and get the API Key from there. Afterward, you must create a new project (or use the default one) and add these credentials to the `.env` file under the `FS_` prefix.

**!!!** Be careful to name your project differently than **energy_consumption,** as Hopsworks requires unique names across its serverless deployment.

[Click here to start with Hopsworks](https://www.hopsworks.ai/).

**Note:** Our course will use only the Hopsworks freemium plan, making it free of charge to replicate the code within the series.

## Weights & Biases

##### ``<< free usage >>``

You will use Weights & Biases as your serverless ML platform. Thus, you must create an account and a project on Weights & Biases. We will show you how to configure the code to use your W&B project later.

[On Medium, we explained in **Lesson 2** how to create an API Key on W&B.](https://towardsdatascience.com/a-guide-to-building-effective-training-pipelines-for-maximum-results-6fdaef594cee) But long story short, you can go to your W&B account and create an entity & project.
Afterward, you must navigate to user settings and create the API Key from there. In the end, you must add these credentials to the `.env` file under the `WANDB_` prefix.

**!!!** Be careful to name your entity differently than **teaching-mlops,** as W&B requires unique names across its serverless deployment.

[Click here to start with Weights & Biases](https://wandb.ai/).

**Note:** Our course will use only the W&B freemium plan, making it free of charge to replicate the code within the series.

## GCP

First, you must install the `gcloud` GCP CLI on your machine. [Follow this tutorial to install it.](https://cloud.google.com/sdk/docs/install)

**If you only want to run the code locally, go straight to the "Storage" section.**

As before, you have to create an account and a project on GCP. Using solely the bucket as storage will be free of charge.

When we were writing this documentation, GCS was free for up to 5GB.

### Storage

##### ``<< free usage >>``

At this step, you have to do 5 things:
- create a project
- create a non-public bucket
- create a service account that has admin permissions to the newly created bucket
- create a service account that has read-only permissions to the newly created bucket
- download a JSON key for each of the newly created service accounts.

Your `bucket admin service account` should be assigned the following role: `Storage Object Admin`

Your `bucket read-only service account` should be assigned the following role: `Storage Object Viewer`

![Bucket Creation](images/gcp_gcs_screenshot.png)

* [Docs for creating a bucket on GCP.](https://cloud.google.com/storage/docs/creating-buckets)
* [Docs for creating a service account on GCP.](https://cloud.google.com/iam/docs/service-accounts-create)
* [Docs for creating a JSON key for a GCP service account.](https://cloud.google.com/iam/docs/keys-create-delete)

**NOTE:** When we were writing this documentation, GCS was free for up to 5GB.
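
Once the two JSON keys are downloaded, a quick sanity check can catch a wrong or truncated file before any pipeline tries to use it. The following is only a minimal sketch, not part of the repository: the helper name `check_service_account_key` is hypothetical, and it assumes the standard layout of a GCP service account key file (`type`, `project_id`, `private_key`, `client_email`).

```python
import json
from pathlib import Path

# Fields a GCP service account key file is expected to contain (assumed minimal set).
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: Path) -> list[str]:
    """Return a list of problems found in the key file (empty if it looks fine).

    Hypothetical helper for illustration; it only inspects the JSON structure
    and never contacts GCP.
    """
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as error:
        return [f"cannot read {path}: {error}"]

    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]

    # Report every required field that is absent or empty.
    problems = [
        f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not data.get(name)
    ]
    # A user or external credential file would carry a different "type".
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems
```

For example, `check_service_account_key(Path("/absolute/path/to/your/service-account.json"))` returns an empty list when the file looks like a service account key.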
[Check out **Lesson 3** on Medium to better understand **how we set up the GCP bucket** and its role in the batch prediction pipeline.](https://towardsdatascience.com/unlock-the-secret-to-efficient-batch-prediction-pipelines-using-python-a-feature-store-and-gcs-17a1462ca489).

**NOTE:** Don't forget to add the GCP credentials to the `.env` file under the `GOOGLE_CLOUD_` prefix:
* *GOOGLE_CLOUD_PROJECT*: your project name (e.g., "energy_consumption")
* *GOOGLE_CLOUD_BUCKET_NAME*: your bucket name (e.g., "hourly-batch-predictions")
* *GOOGLE_CLOUD_SERVICE_ACCOUNT_JSON_PATH*: absolute path to your JSON key file. (e.g., "/absolute/path/to/your/service-account.json")

### Deployment

##### ``<< ~20$ >>``

This step is required only if you want to deploy the code on GCP VMs and build the CI/CD with GitHub Actions.

Note that this step might result in a few costs on GCP. It won't be much. While developing this course, we spent only ~$20.

Also, you can get some free credits if you create a new GCP account (we created a new account and received $300 in GCP credits). Just be sure to delete the resources after you finish the course.

See [this document](/README_DEPLOY.md) for detailed instructions.

-------

# 🔎 8. Usage <a name=usage></a>

**The code is fully tested on Ubuntu 20.04 & 22.04 using Python 3.9 and Poetry 1.4.2.**

**Note:** If you are working on macOS M1/M2, be sure to check the [macOS M1/M2 Poetry Issues](https://github.com/iusztinpaul/energy-forecasting/tree/main#macos-m1m2-poetry-issues) section.

## The Pipeline

Check out [Lesson 4](https://towardsdatascience.com/unlocking-mlops-using-airflow-a-comprehensive-guide-to-ml-system-orchestration-880aa9be8cff) on Medium to better understand how everything is orchestrated using Airflow.

#### Run

You will run the pipeline using Airflow (`free usage`). Don't be scared. Docker makes everything very simple to set up.

**Note:** We also hooked the **private PyPi server** in the same docker-compose.yaml file with Airflow.
Thus, everything will start with one command.

**Important:** If you plan to run the pipeline outside Airflow, be sure to check the [🧑‍💻 9. Installation & Usage for Development](https://github.com/iusztinpaul/energy-forecasting/tree/main#-7-installation--usage-for-development-) section.

Run:
```shell
# Move to the airflow directory.
cd airflow

# Make expected directories and environment variables
mkdir -p ./logs ./plugins
sudo chmod 777 ./logs ./plugins

# It will be used by Airflow to identify your user.
echo -e "AIRFLOW_UID=$(id -u)" > .env
# This shows where our project root directory is located.
echo "ML_PIPELINE_ROOT_DIR=/opt/airflow/dags" >> .env
```

Now from the `airflow` directory move to the `dags` directory and run:
```shell
cd ./dags

# Make a copy of the env default file.
cp .env.default .env
# Open the .env file and complete the FS_API_KEY, FS_PROJECT_NAME and WANDB_API_KEY credentials

# Create the folder where the program expects its GCP credentials.
mkdir -p credentials/gcp/energy_consumption
# Copy the GCP service credentials that give you admin access to GCS.
cp -r /path/to/admin/gcs/credentials/admin-buckets.json credentials/gcp/energy_consumption
# NOTE that if you want everything to work out of the box your JSON file should be called admin-buckets.json.
# Otherwise, you have to manually configure the GOOGLE_CLOUD_SERVICE_ACCOUNT_JSON_PATH variable from the .env file.
```

Now go back to the `airflow` directory and run the following:
```shell
cd ..

# Initialize the Airflow database
docker compose up airflow-init

# Start up all services
# Note: You should set up the private PyPi server credentials before running this command.
docker compose --env-file .env up --build -d
```

[Read the official Airflow installation using Docker, but NOTE that we modified their official docker-compose.yaml file.](https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html)

Wait a while for the containers to build and run.
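
If you would rather not guess when the webserver is ready, a small polling loop can tell you. This is a minimal sketch, not part of the repository: it assumes the Airflow webserver answers on its `/health` endpoint at `127.0.0.1:8080`, and the names `http_probe` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
        return False


def wait_until_ready(
    probe: Callable[[], bool], attempts: int = 30, delay_s: float = 5.0
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay_s` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


# Example usage (assumed URL; uncomment to run against a local Airflow):
# ready = wait_until_ready(lambda: http_probe("http://127.0.0.1:8080/health"))
# print("Airflow is up" if ready else "Airflow did not answer in time")
```

Passing the probe as a callable keeps the retry logic independent of the URL, so the same loop could watch the web app containers as well.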
Once the containers are running, access `127.0.0.1:8080` to log in to Airflow. Use the following default credentials to log in:
* username: `airflow`
* password: `airflow`

<img src="images/airflow_login_screenshot.png">

Before starting the pipeline DAG, you must deploy the modules to the private PyPi server. Go back to the `root folder` of the `energy-forecasting` repository and run the following to build and deploy the pipeline modules to your private PyPi server:
```shell
# Set the experimental installer of Poetry to False. For us, it crashed when it was on True.
poetry config experimental.new-installer false
# Build & deploy the pipelines modules.
sh deploy/ml-pipeline.sh
```
Airflow will know how to install the packages from the private PyPi server.

One final step is to configure the parameters used to run the pipeline. Go to the `Admin` tab, then hit `Variables`. There you can click on the blue `+` button to add a new variable. These are the three parameters you can configure with our suggested values:
* `ml_pipeline_days_export = 30`
* `ml_pipeline_feature_group_version = 5`
* `ml_pipeline_should_run_hyperparameter_tuning = False`

<img src="images/airflow_variables_screenshot.png">

Now, go to the `DAGS/All` section and search for the `ml_pipeline` DAG. Toggle the activation button. It should automatically start in a few seconds. Also, you can manually run it by hitting the play button on the top-right side of the `ml_pipeline` window.

<img src="images/airflow_ml_pipeline_dag_overview_screenshot.png">

That is it. You can run the entire pipeline with a single button if all the credentials are set up correctly. How cool is that?
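
The three Airflow variables can also be kept in a JSON file instead of being typed into the UI. The sketch below is an illustration only and not part of the repository: it assumes you then load the file with Airflow's `airflow variables import` CLI command from inside the webserver container, and the file name `ml_pipeline_variables.json` and the helper `write_variables_file` are hypothetical. Values are kept as strings to mirror what you would type in the UI form.

```python
import json
from pathlib import Path

# Suggested values from this README, stored as strings like the UI form would store them.
ML_PIPELINE_VARIABLES = {
    "ml_pipeline_days_export": "30",
    "ml_pipeline_feature_group_version": "5",
    "ml_pipeline_should_run_hyperparameter_tuning": "False",
}


def write_variables_file(path: Path) -> Path:
    """Write the variables as a flat JSON object and return the path written."""
    path.write_text(json.dumps(ML_PIPELINE_VARIABLES, indent=2) + "\n", encoding="utf-8")
    return path


# Example usage (hypothetical file name):
# write_variables_file(Path("ml_pipeline_variables.json"))
```

Keeping the file under version control makes it easier to recreate the same configuration after a clean-up.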
Here is what the DAG should look like 👇

<img src="images/airflow_ml_pipeline_dag_screenshot.png">

#### Clean Up
```shell
docker compose down --volumes --rmi all
```

#### Backfill Using Airflow

Find your `airflow-webserver` docker container ID:
```shell
docker ps
```
Start a shell inside the `airflow-webserver` container and run `airflow dags backfill` as follows (in this example, we did a backfill between `2023/04/11 00:00:00` and `2023/04/13 23:59:59`):
```shell
docker exec -it <container-id-of-airflow-webserver> sh
airflow dags backfill --start-date "2023/04/11 00:00:00" --end-date "2023/04/13 23:59:59" ml_pipeline
```
If you want to clear the tasks and run them again, run these commands:
```shell
docker exec -it <container-id-of-airflow-webserver> sh
airflow tasks clear --start-date "2023/04/11 00:00:00" --end-date "2023/04/13 23:59:59" ml_pipeline
```

### Run Private PyPi Server Separately

The private PyPi server is already hooked to the airflow docker compose file. But if you want to run it separately for whatever reason, you can run this command instead:
```shell
docker run -p 80:8080 -v ~/.htpasswd:/data/.htpasswd pypiserver/pypiserver:v1.5.2 run -P .htpasswd/htpasswd.txt --overwrite
```

------

## The Web App

Check out [Lesson 6](https://medium.com/towards-data-science/fastapi-and-streamlit-the-python-duo-you-must-know-about-72825def1243) on Medium to better understand how the web app components work together.

Fortunately, everything is a lot simpler when setting up the web app. This time, we need to configure only a few credentials.

**Important:** If you plan to run the web app components without docker-compose, check the [🧑‍💻 9. Installation & Usage for Development](https://github.com/iusztinpaul/energy-forecasting/tree/main#-7-installation--usage-for-development-) section.

Copy the bucket read-only GCP credentials to the root directory of your `energy-forecasting` project:
```shell
# Create the folder where the program expects its GCP credentials.
mkdir -p credentials/gcp/energy_consumption
# Copy the GCP service credentials that give you read-only access to GCS.
cp -r /path/to/admin/gcs/credentials/read-buckets.json credentials/gcp/energy_consumption
# NOTE that if you want everything to work out of the box your JSON file should be called read-buckets.json.
# Otherwise, you have to manually configure the APP_API_GCP_SERVICE_ACCOUNT_JSON_PATH variable from the .env file of the API.
```
Go to the API folder and make a copy of the `.env.default` file:
```shell
cd ./app-api
cp .env.default .env
```
**NOTE:** Remember to complete the `.env` file with your own variables.

That is it!

Go back to the root directory of your `energy-forecasting` project and run the following docker command, which will build and run all the docker containers of the web app:
```shell
docker compose -f deploy/app-docker-compose.yml --project-directory . up --build
```
If you want to run it in development mode, run the following command:
```shell
docker compose -f deploy/app-docker-compose.yml -f deploy/app-docker-compose.local.yml --project-directory . up --build
```

**Now you can see the apps running at:**
* [API](http://127.0.0.1:8001/api/v1/docs)
* [Frontend](http://127.0.0.1:8501/)
* [Monitoring](http://127.0.0.1:8502/)

-----

## Deploy the Code to GCP

[Check out this section.](./README_DEPLOY.md)

## Set Up CI/CD with GitHub Actions

[Check out this section.](./README_CICD.md)

------

# 🧑‍💻 9. Installation & Usage for Development <a name=installation></a>

All the modules support Poetry. Thus the installation is straightforward.

**Note 1:** Just ensure you have installed Python 3.9, not Python 3.8 or Python 3.10.

**Note 2:** During the course, we used `Poetry 1.4.2`. To avoid potential issues when installing the dependencies using Poetry, we recommend you use the same version (or, if you hit errors with a different version, you can delete and regenerate the `poetry.lock` file).
**Note 3:** If you are working on macOS M1/M2, be sure to check the [macOS M1/M2 Poetry Issues](https://github.com/iusztinpaul/energy-forecasting/tree/main#macos-m1m2-poetry-issues) section.

## The Pipeline

**We support Docker to run the whole pipeline. Check out the [Usage](#usage) section if you only want to run it as a whole.**

If Poetry is not using Python 3.9, you can follow the next steps:
1. Install Python 3.9 on your machine.
2. `cd /path/to/project`, for example, `cd ./feature-pipeline`
3. run `which python3.9` to find where Python 3.9 is located
4. run `poetry env use /path/to/python3.9`

**Every pipeline component must load its credentials from the `.env` file. Thus, you have two options:**
1. **Recommended option:** run `cp .env.default .env` in the folder that the `ML_PIPELINE_ROOT_DIR` env var points to & fill in the credentials of the `.env` file. Check the [section below](https://github.com/iusztinpaul/energy-forecasting/tree/main#set-up-the-ml_pipeline_root_dir-variable) to see how to set it up.
2. Create a copy by running `cp .env.default .env` in every pipeline directory individually. But note that by taking this approach, you won't be able to run the system as a whole.

**See here how to install every project individually:**
- [Feature Pipeline](/feature-pipeline/README.md)
- [Training Pipeline](/training-pipeline/README.md)
- [Batch Prediction Pipeline](/batch-prediction-pipeline/README.md)

### Set Up the ML_PIPELINE_ROOT_DIR Variable

**Important:** Before installing and running every module individually, **one key step** is to set the `ML_PIPELINE_ROOT_DIR` variable to your root directory of the `energy-forecasting` project (or any other directory - just make sure to set it):

Export it to your `~/.bashrc` file:
```shell
gedit ~/.bashrc
export ML_PIPELINE_ROOT_DIR=/path/to/root/directory/repository/energy-forecasting/
```
Or run every Python script preceded by the `ML_PIPELINE_ROOT_DIR` variable.
For example:
```shell
ML_PIPELINE_ROOT_DIR=/path/to/root/directory/repository/energy-forecasting/ python -m feature_pipeline.pipeline
```

By doing so, all the 3 pipeline projects (feature, training, batch) will load and save the following files from the same location:
* `.env` configuration;
* JSON metadata files;
* logs & plots.

**NOTE:** This step is **critical** as every pipeline component needs to access the JSON metadata from other pipeline processes. By setting up the **ML_PIPELINE_ROOT_DIR** variable, all the metadata JSON files will be saved and accessed from the same location between different processes. For example, the batch prediction pipeline will read the model version it needs to use to make predictions from a JSON file generated by the training pipeline. Without setting the **ML_PIPELINE_ROOT_DIR**, the training and batch processes won't share the same output directory. Thus, they won't know how to talk to each other. When running the project inside `Airflow`, it defaults to `/opt/airflow/dags`; thus, you must set this variable only when running it outside Airflow.

## The Web App

**We support Docker to run the web app. Check out the [Usage](#usage) section if you only want to run it as a whole.**

**See here how to install every project individually:**
- [API](/app-api/README.md)
- [Frontend](/app-frontend/README.md)
- [Monitoring](/app-monitoring/README.md)

You can also run the whole web app in development mode using Docker:
```shell
docker compose -f deploy/app-docker-compose.yml -f deploy/app-docker-compose.local.yml --project-directory . up --build
```

------

# 🏆 10. Licensing & Contributing <a name=licensing></a>

The code is under the MIT License. Thus, as long as you keep distributing the License, feel free to share, clone, or change the code as you like.

Also, if you find any bugs or missing pieces in the documentation, I encourage you to add an issue on GitHub or a PR.
Based on your support, I will adapt the code and docs for future readers.

Furthermore, you can contact me directly on [LinkedIn](https://www.linkedin.com/in/pauliusztin/) if you have any questions.

I also want to thank [Kurtis Pykes](https://github.com/kurtispykes) for being an awesome copilot and helping me make this course happen.

-----

### Let's connect if you want to level up in designing and productionizing ML systems:

I post almost daily AI content on 👇🏼

[<img alt="linkedin" width="40px" src="images/linkedin.png" align="left" style="padding-right:20px;"/>](https://www.linkedin.com/in/pauliusztin)
[<img alt="medium" width="40px" src="images/medium.png" align="left" style="padding-right:20px;"/>](https://pauliusztin.medium.com/)
[<img alt="substack" width="35px" src="images/substack.png" align="left" style="padding-right:20px;"/>](https://pauliusztin.substack.com/)
[<img alt="gmail" width="40px" src="images/gmail.png" align="left" style="padding-right:20px;"/>](mailto:[email protected]?subject=[From%20GitHub]%20ML%20Collaborations)
[<img alt="twitter" width="40px" src="images/twitter.png" align="left" style="padding-right:20px;"/>](https://twitter.com/iusztinpaul)

Subscribe to my [ML engineering weekly newsletter](https://pauliusztin.substack.com/).

-----

# 🖤 11. Support <a name=support></a>

🎨 **Creating content takes me a lot of time. If you enjoyed my work, you could support me by [buying me a coffee](https://www.buymeacoffee.com/pauliusztin).**

Thank you ✌🏼 !
