dibuaman/MLOps

 
 


MLOps - Machine Learning Operations

Goals

  1. Create a library of modular recipes (parameterized DevOps pipeline templates) that can be composed into custom end-to-end CI/CD pipelines for machine learning
  2. Learn and teach the fundamentals

Why care? To sustain the business benefits of machine learning across an organization, we need discipline, automation, and best practices. Enter MLOps.

Approach

  1. Minimalistic: the focus is on clean, understandable pipelines and code
  2. Modular: atomic recipes that can be referenced and reused (e.g. recipe: Deploy to production after approval)

Status: Project board

Technologies: Azure Machine Learning & Azure DevOps

Technical Aspects
It is fine if you do not understand this yet; these points will be discussed in the workshop (todo: add detailed notes)

  1. Fully YAML-based, multi-stage CI/CD pipeline (does not use the classic release pipelines in Azure DevOps)
  2. YAML-based variables template (no need to configure variable groups through the UI)
  3. Gated releases (manual approvals)
  4. CLI-based MLOps: the Azure ML CLI is called from the DevOps pipelines to interact with the ML platform. Simple and clean.

Get Started

  1. Understand what we are trying to do (the section below + workshop discussion)
  2. Set up the environment
  3. Run an end-to-end MLOps pipeline

Note: Automated builds based on code/asset changes have been disabled by setting trigger: none in the pipelines. This avoids triggering accidental builds during your learning phase.

MLOps Flow and Current Setup

[Figure: MLOps Flow]

The above diagram illustrates a possible end-to-end MLOps scenario. Our current Build-Release pipeline implements a subset of it: Training ➡️ Approval ➡️ Model Registration ➡️ Package ➡️ Deploy in test ➡️ Approval ➡️ Deploy to Production
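To make the ordering and the role of the two approvals concrete, here is a minimal conceptual sketch in Python. It is not how Azure DevOps implements gated releases; the names Stage, FLOW and run_flow are hypothetical and exist only for illustration. The sketch assumes that a rejected approval stops every later stage.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One step in the flow; gates represent manual approvals."""
    name: str
    is_gate: bool = False


# The subset of the flow described above, in order.
FLOW = [
    Stage("Training"),
    Stage("Approval", is_gate=True),
    Stage("Model Registration"),
    Stage("Package"),
    Stage("Deploy in test"),
    Stage("Approval", is_gate=True),
    Stage("Deploy to Production"),
]


def run_flow(stages: List[Stage], approve: Callable[[str], bool]) -> List[str]:
    """Walk the stages in order and return the names of the stages that ran.

    `approve` is called at each gate with the name of the stage that follows
    it. Returning False stops the flow at that gate.
    """
    completed = []
    for index, stage in enumerate(stages):
        if stage.is_gate:
            has_next = index + 1 < len(stages)
            next_name = stages[index + 1].name if has_next else "end"
            if not approve(next_name):
                break  # approval rejected: nothing after the gate runs
            continue  # approval granted: move on to the next stage
        completed.append(stage.name)
    return completed


# Approve everything except the production deployment.
ran = run_flow(FLOW, approve=lambda next_stage: next_stage != "Deploy to Production")
print(ran)
```

With this approver the flow stops at the second gate, so the model reaches the test deployment but not production.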

Notes on our Base scenario:

  1. Directory Structure
    1. The mlops_pipelines directory contains the DevOps pipelines
      1. EnvCreatePipeline.yml is a DevOps pipeline that provisions all the components in the cloud
      2. BasicBuildRelease.yml is a DevOps pipeline that performs the subset of steps mentioned above (Training to Deployment in Test)
    2. The code directory has the source code for training and scoring. Azure ML uses it to create the Docker images that perform training & scoring.
    3. The dataset directory contains the German Credit Card dataset
  2. Training: For training we use a simple LogisticRegression model on the German Credit Card dataset. We build an sklearn pipeline that does the feature engineering, and we export the whole pipeline as the model binary (pkl file).
  3. We use the Azure ML CLI to interact with Azure ML, for simplicity.
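The training approach in note 2 can be sketched roughly as follows. This is an illustrative example, not the repository's actual training script: it runs on synthetic data rather than the German Credit Card dataset, and the column layout, the preprocessing choices (StandardScaler, OneHotEncoder) and the helper names build_pipeline and make_synthetic_data are all assumptions. The point it shows is that the feature engineering and the classifier live in one sklearn Pipeline, so a single pkl file carries both.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(numeric_cols, categorical_cols):
    """Feature engineering + LogisticRegression as a single sklearn Pipeline."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    return Pipeline([
        ("features", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


def make_synthetic_data(n_rows=200, seed=0):
    """Hypothetical stand-in data: duration, amount, and a category code."""
    rng = np.random.default_rng(seed)
    duration = rng.integers(6, 60, n_rows)
    amount = rng.uniform(500, 10000, n_rows)
    purpose = rng.integers(0, 4, n_rows)  # categorical, stored as a code
    X = np.column_stack([duration, amount, purpose]).astype(float)
    noise = rng.normal(0, 0.2, n_rows)
    y = (amount / 10000 + duration / 60 + noise > 1.0).astype(int)
    return X, y


X, y = make_synthetic_data()
model = build_pipeline(numeric_cols=[0, 1], categorical_cols=[2])
model.fit(X, y)

# Export the whole pipeline (preprocessing + classifier) as one pkl file.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Loading the file restores a model that accepts raw, unprocessed features.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
predictions = restored.predict(X[:5])
```

Because the preprocessing travels with the model, scoring code only needs to load the pkl file and call predict on raw inputs.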

More documentation will follow.

Acknowledgements

  1. The MLOpsPython repo was one of the inspirations for this project. Thanks to its contributors.
  2. German Credit Card Dataset
    Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

About

Learn how to create and run modular and minimalistic MLOps pipelines
