ML Workflow
Requirements
- You understand the basic concepts of a machine learning pipeline.
- You know which components are minimally required in a machine learning pipeline.
- You are able to translate a Jupyter notebook into a machine learning pipeline.
- You understand what a virtual environment is and how to use it.
- You understand what Prefect and MLflow are and what they are used for.
- You are able to run a machine learning pipeline with Prefect.
- You are able to log metrics and artifacts with MLflow.
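To make the goals above concrete, here is a minimal sketch of the stages a machine learning pipeline typically contains, written with the Python standard library only. All function names, the toy dataset, and the `metrics.json` file name are illustrative assumptions, not part of the course material. In a Prefect version, the stage functions would typically be decorated with `@task` and `run_pipeline` with `@flow`. In an MLflow version, the metric would typically be recorded with `mlflow.log_metric` and the file with `mlflow.log_artifact`.

```python
import json
import random
import tempfile
from pathlib import Path


def load_data(n=200, seed=0):
    """Generate a toy dataset (assumption): y = 3x + 2 plus a little noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3 * x + 2 + rng.gauss(0, 0.1)) for x in xs]


def split_data(rows, test_fraction=0.25):
    """Hold out the last part of the rows for evaluation."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def train_model(train_rows):
    """Fit a line by ordinary least squares; return (slope, intercept)."""
    n = len(train_rows)
    mean_x = sum(x for x, _ in train_rows) / n
    mean_y = sum(y for _, y in train_rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in train_rows)
    var = sum((x - mean_x) ** 2 for x, _ in train_rows)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def evaluate_model(model, test_rows):
    """Compute the mean squared error on the held-out rows."""
    slope, intercept = model
    errors = [(slope * x + intercept - y) ** 2 for x, y in test_rows]
    return {"mse": sum(errors) / len(errors)}


def run_pipeline(output_dir):
    """Run every stage in order and store the metrics as a file artifact."""
    rows = load_data()
    train_rows, test_rows = split_data(rows)
    model = train_model(train_rows)
    metrics = evaluate_model(model, test_rows)
    artifact = Path(output_dir) / "metrics.json"
    artifact.write_text(json.dumps(metrics))
    return model, metrics, artifact


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model, metrics, artifact = run_pipeline(tmp)
        print("model:", model)
        print("metrics:", metrics)
        print("artifact written to:", artifact)
```

Keeping each stage in its own function is what makes the move from a notebook to an orchestrated pipeline straightforward: every stage has explicit inputs and outputs and can be rerun or tested on its own.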
Theory
Guest lecture
- Slides
- Recording: link will be added when available
Articles
- What is a machine learning pipeline?: https://www.ibm.com/topics/machine-learning-pipeline
- ML pipelines: https://developers.google.com/machine-learning/managing-ml-projects/pipelines
- Machine learning pipeline: What it is, Why it matters, and Guide to Building it?: https://medium.com/@datasciencewizards/machine-learning-pipeline-what-it-is-why-it-matters-and-guide-to-building-it-2940d143fd37
  - Read up to the section "Why are we on this topic?"; everything after that is commercial content.
- From Jupyter notebook experiments to ML pipelines that work: https://medium.com/factset/from-jupyter-notebook-experiments-to-ml-pipelines-that-work-2c9c3ae5a3c5
  - It's important to understand the concepts and ideas from this article, not the specific tools used.
Videos
- What Is a Machine Learning Pipeline?: https://www.youtube.com/watch?v=HWWxtVL-D9k
- Prefect 3.0: A Framework to Build Resilient Pipelines: https://www.youtube.com/watch?v=cW18kscqAaw
- An Introduction to Your First Prefect Flow: https://www.youtube.com/watch?v=4yIW34WcmBQ
- Getting Started with Prefect | Task Orchestration & Data Workflows: https://www.youtube.com/watch?v=D5DhwVNHWeU
  - A very in-depth video about Prefect; only the first 6 sections are relevant for this course.
  - Feel free to watch the entire video if you are interested in Prefect.
- MLFlow: A Quickstart Guide: https://www.youtube.com/watch?v=cjeCAoW83_U
- MLFlow Tutorial | ML Ops Tutorial: https://www.youtube.com/watch?v=6ngxBkx05Fs
  - Skip the last section about Dagshub.
Manuals
- Python virtual environments: https://docs.python.org/3/tutorial/venv.html
- Prefect: https://docs.prefect.io/
- MLflow: https://mlflow.org/docs/latest/index.html
Online courses
- Microsoft Azure Machine Learning Fundamentals, LinkedIn Learning: https://www.linkedin.com/learning/microsoft-azure-machine-learning-fundamentals/microsoft-azure-machine-learning-fundamentals-introduction
  - HOGENT students have free access to LinkedIn Learning via Academic Software.
  - While Prefect is excellent for academic use, Azure ML is widely adopted in industry for managing ML pipelines. This course is a good starting point if you want to deepen your understanding of it.