Introducing MLOps
Anish Cheriyan, Vimal Das K
Inputs from Jayaraj J
9th April, 2022
Event organized jointly by BSPIN and ASQ Bengaluru LMC
Image- https://medium.com/analytics-vidhya/applications-and-types-of-machine-learning-c177a844bf38
Agenda
★ Industry 4.0
★ Agile, CI/CD, DevOps
★ DevOps and MLOps
★ Evolution of MLOps
★ MLOps Capabilities
★ AI Platform Pipelines
★ Training and Tuning AI Platform
★ Case Study
BSPIN has been active since 1992 with the support of individuals and organizations in Bangalore.
BSPIN’s Mission is to help the Indian Software industry to achieve breakthrough in software quality
and productivity by active practice enabled by collaborations, learning, sharing and innovating from
the practitioners’ level.
BSPIN (Bangalore SPIN) is currently the largest operational SPIN across the globe. More details
about BSPIN are available at www.bspin.org
For MEMBERSHIP-
https://bspin.org/?page_id=1480#!/SignUp/Up
membership@bspin.org
● ASQ is a global community of people passionate about quality, who use the tools, their ideas and expertise
to make our world work better. ASQ: The Global Voice of Quality.
● ASQ is a global organization with members in more than 130 countries. Headquartered in Milwaukee,
Wisconsin, we also operate centers in Mexico, India, and China. Our Society consists of member-led
communities that help members connect with other quality professionals and practitioners, advance their
knowledge and careers, and grow as thought leaders.
For MEMBERSHIP-
https://asq.org.in/membership/
Self Driving Car
AI usage for Cancer Detection
https://www.drugtargetreview.com/news/34555/ai-system-detects-cancer-tumours-missed-by-conventional-diagnostics/
Image- https://medium.com/analytics-vidhya/applications-and-types-of-machine-learning-c177a844bf38
2018- Self-driving Uber car that hit and killed woman did not recognize that pedestrians jaywalk
https://www.nbcnews.com/tech/tech-news/self-driving-uber-car-hit-killed-woman-did-not-recognize-n1079281
Industry 4.0 and ABCs
Image Reference- https://hrishikeshiyengar.wordpress.com/2021/01/31/components-of-industry-4-0-the-heart-and-soul/
https://www.synopsys.com/blogs/software-security/agile-cicd-devops-difference/
Agile, CI/CD, DevOps
What is MLOps?
• An approach, like DevOps, developed in the context of ML engineering
• Unifies ML system development and operations
• Standardized processes and technology capabilities for building, deploying, and operationalizing ML systems rapidly and reliably
MLOps & DevOps
References- https://en.wikipedia.org/wiki/MLOps
Evolution of MLOps
ML Capabilities
https://ml-ops.org/content/motivation
Challenges of Practical Applications of ML
• Avoiding training-serving skews that are due to inconsistencies in data.
• Handling concerns about model fairness and adversarial attacks.
• Maintaining the veracity of models by continuously retraining.
• Performing ongoing experimentation of new data sources.
• Preparing and maintaining high-quality data for training ML models.
• Tracking models in production to detect performance degradation.
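The training-serving skew challenge listed above can be illustrated with a small check. This is only one possible heuristic, not a method prescribed by the talk: it compares the mean of a feature at serving time with its mean in the training data, measured in training standard deviations. The function name `skew_report`, the 0.5 threshold, and all sample values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def skew_report(train_values, serving_values, threshold=0.5):
    """Flag a feature whose serving mean drifts away from the training mean.

    The drift is expressed in units of the training standard deviation.
    The threshold is a hypothetical default, not a recommended value.
    """
    train_mean = mean(train_values)
    # Fall back to 1.0 so a constant training feature does not divide by zero.
    train_std = pstdev(train_values) or 1.0
    shift = abs(mean(serving_values) - train_mean) / train_std
    return {"shift_in_std": shift, "skewed": shift > threshold}


# Invented sample data for one numeric feature.
train = [10.0, 12.0, 11.0, 13.0, 9.0]
serving_ok = [10.5, 11.5, 12.0, 10.0, 11.0]
serving_bad = [20.0, 22.0, 21.0, 23.0, 19.0]

ok_report = skew_report(train, serving_ok)
bad_report = skew_report(train, serving_bad)
```

A real system would likely compare full distributions per feature rather than means only; the sketch just shows where such a check sits between training data and serving traffic.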
Benefits of MLOps
• Shorter development cycles, and as a result, shorter time to market.
• Better collaboration between teams.
• Increased reliability, performance, scalability, and security of ML systems.
• Streamlined operational and governance processes.
• Increased return on investment of ML projects.
Relationship of Data Engineering, ML Engineering, and Application Engineering
• Data engineering involves ingesting, integrating, curating, and refining data to facilitate a broad spectrum of operational tasks, data analytics tasks, and ML tasks.
• ML models are built and deployed in production using curated data that is usually created by the data engineering team.
Lifecycle of MLOps
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
MLOps Detailed Workflow
Core MLOps Technical Capabilities of ML Platforms
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Experimentation
Lets data scientists and ML researchers collaboratively perform EDA, create prototype model architectures, and implement training routines.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Data Processing
Lets you prepare and transform large amounts of data for ML at scale in ML development, in continuous training pipelines, and in prediction serving.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Model training
Lets you efficiently and cost-effectively run powerful algorithms for training ML models.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Model evaluation
Lets you assess the effectiveness of your model, interactively during experimentation and automatically in production.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Model serving
Lets you deploy and serve your models in production environments. Key functionalities in model serving include support for near-real-time, low-latency prediction, logging, etc.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Online experimentation
Lets you understand how newly trained models perform in production settings compared to the current models before you release the new model to production.
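One common way to run such a comparison is to send a small, stable share of traffic to the new model. The sketch below assumes a hash-based split keyed on a user identifier; the function name `assign_variant`, the 10% default share, and the user IDs are hypothetical and not taken from the talk.

```python
import hashlib


def assign_variant(user_id: str, candidate_share: float = 0.1) -> str:
    """Deterministically route a user to the 'candidate' or 'current' model.

    Hashing the user ID keeps each user on the same variant across requests,
    so the two groups can be compared over time.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "candidate" if bucket < candidate_share else "current"


# Invented user IDs, used only to show the resulting traffic split.
assignments = [assign_variant(f"user-{i}") for i in range(10000)]
candidate_count = assignments.count("candidate")
```

Raising `candidate_share` step by step gives a gradual rollout once the candidate model performs at least as well as the current one.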
Model monitoring
Lets you track the efficiency and effectiveness of the deployed models in production to ensure predictive quality and business continuity.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
ML pipelines
Let you instrument, orchestrate, and automate complex ML training and prediction pipelines in production.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
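As a rough illustration of orchestration, the sketch below chains named steps and records which ones ran. It is a toy runner under stated assumptions, not the API of any pipeline product: `run_pipeline`, the three step functions, and the tiny dataset are all invented for this example.

```python
def run_pipeline(steps, context=None):
    """Run named steps in order, passing a shared context dict along."""
    context = dict(context or {})
    executed = []
    for name, step in steps:
        context = step(context)
        executed.append(name)  # simple instrumentation: which steps ran
    return context, executed


def ingest(ctx):
    # Invented (x, y) training pairs that follow y = 2x.
    ctx["data"] = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    return ctx


def train(ctx):
    # Least-squares fit of y = w * x through the origin.
    numerator = sum(x * y for x, y in ctx["data"])
    denominator = sum(x * x for x, _ in ctx["data"])
    ctx["weight"] = numerator / denominator
    return ctx


def evaluate(ctx):
    ctx["max_abs_error"] = max(abs(y - ctx["weight"] * x) for x, y in ctx["data"])
    return ctx


steps = [("ingest", ingest), ("train", train), ("evaluate", evaluate)]
result, executed = run_pipeline(steps)
```

Production pipeline tools add scheduling, retries, caching, and distributed execution on top of this basic idea of ordered, observable steps.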
Model registry
Lets you govern the lifecycle of the ML models in a central repository. This ensures the quality of the production models and enables model discovery.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
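To make the idea of governing a model lifecycle concrete, here is a minimal in-memory sketch. The class names, the stage names (`staging`, `production`, `archived`), the model name, and the metric values are assumptions for illustration; real registries persist this information and add access control.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    name: str
    version: int
    metrics: dict
    stage: str = "staging"  # hypothetical stages: staging, production, archived


@dataclass
class ModelRegistry:
    _versions: dict = field(default_factory=dict)

    def register(self, name, metrics):
        """Store a new version of a model and return its registry entry."""
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, metrics)
        versions.append(entry)
        return entry

    def promote(self, name, version):
        """Move one version to production and archive the previous one."""
        for entry in self._versions[name]:
            if entry.stage == "production":
                entry.stage = "archived"
        target = self._versions[name][version - 1]
        target.stage = "production"
        return target

    def production(self, name):
        """Return the version currently in production, or None."""
        candidates = self._versions.get(name, [])
        return next((e for e in candidates if e.stage == "production"), None)


# Invented model name and metric values.
registry = ModelRegistry()
v1 = registry.register("ticket-router", {"rmse": 0.42})
v2 = registry.register("ticket-router", {"rmse": 0.37})
registry.promote("ticket-router", 1)
registry.promote("ticket-router", 2)
```

Keeping exactly one production version per model name is a design choice of this sketch; it makes rollback a matter of promoting an earlier version again.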
Dataset & feature repository
Lets you unify the definition and the storage of the ML data assets. Helps data scientists and ML researchers save time on data preparation and feature engineering.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
ML metadata & artifact repository
Enables reproducibility and debugging of complex ML tasks and pipelines. Metadata about ML artifacts such as descriptive statistics, data schemas, trained models, and evaluation results are tracked in it.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
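A minimal sketch of what tracking such metadata might look like follows. It stores parameters, a data schema, evaluation results, and a content hash of the trained artifact per run. The function `record_run`, the run ID, the field names, and all values are hypothetical, and a plain dict stands in for a real metadata store.

```python
import hashlib


def record_run(store, run_id, params, data_schema, metrics, artifact_bytes):
    """Save the metadata needed to reproduce and debug one training run."""
    store[run_id] = {
        "params": params,
        "data_schema": data_schema,
        "metrics": metrics,
        # The hash ties the metadata to one exact serialized model.
        "artifact_sha256": hashlib.sha256(artifact_bytes).hexdigest(),
    }
    return store[run_id]


# Invented example values.
store = {}
first = record_run(
    store,
    "run-001",
    params={"learning_rate": 0.1},
    data_schema={"ticket_message": "string", "time_after_trip": "float"},
    metrics={"rmse": 0.42},
    artifact_bytes=b"serialized-model-v1",
)
```

Because the artifact hash is derived from the model bytes, two runs that produce identical models can be detected, and a deployed model can be traced back to the run that created it.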
Real-world Case Studies
Automated customer support routing
Problem Statement
Automatically classify customer support tickets and route them to the right support agent.
Data Selection & Exploration
- Ticket info
- Trip info
- Customer info
Feature engineering
- Ticket message
- Time after trip
- …
Model prototyping & validation
- Learning-to-rank approach
- Retrieval-based pointwise ranking
Training pipeline
- Batch jobs
- Yarn/Mesos cluster
Data pipeline
- Data transformation
- Kafka (pub/sub)
- Samza (stream processing)
- Cassandra (for training)
Model Refresh
- Model tuning
- Model evaluation
- Model validation
Service integration
- Offline mode
- Online mode
- Library mode
CI/CD pipeline
- Dynamic model loading
- Artifacts validation
- Serving validations
Online experimentation
- A/B testing
Monitoring integration
- Kafka
- Kibana (dashboards)
Model monitoring
- RMSLE
- RMSE
- R-squared
Data & feature repository
- HDFS (data)
- Cassandra (feature metadata)
Public Launch
- Gradual rollout
- Online/Batch/Embedded inference
Model repository
- HDFS (zip archive)
- Cassandra (model metadata)
Lifecycle phases: ML development, training operationalization, continuous training, model deployment, prediction serving, model monitoring
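The three monitoring metrics named in the case study (RMSLE, RMSE, and R-squared) can be computed as below. The formulas are the standard textbook definitions; the function names and the sample values are assumptions for illustration, and nothing here reflects how the case-study system actually implemented them.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (values must be greater than -1)."""
    squared = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented observations and predictions.
y_true = [3.0, 5.0, 7.0]
y_pred = [4.0, 6.0, 8.0]

scores = {
    "rmse": rmse(y_true, y_pred),
    "rmsle": rmsle(y_true, y_pred),
    "r_squared": r_squared(y_true, y_pred),
}
```

In a monitoring setup, scores like these would typically be recomputed on fresh production data at intervals and charted on a dashboard so that degradation becomes visible.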
MLOps End-to-End Workflow
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
Take Away
Delivering business value through ML is not only about building the best ML model for
the use case at hand, but also about building an integrated ML system that operates
continuously to adapt to changes in the dynamics of the business environment.
Such an ML system involves
❑ Collecting, processing, and managing ML datasets and features;
❑ Training and evaluating models at scale;
❑ Serving the model for predictions;
❑ Monitoring the model performance in production; and
❑ Tracking model metadata and artifacts.
… and MLOps enables building such an ML System.
BOOK COMING SOON
Anish Cheriyan, Rajith Raveendran, Vimal Das Kammath
With Sheena Lakshmi
https://softwareforindustrynext.blogspot.com/
References
• Khalid Salama, Jarek Kazmierczak, Donna Schut. Practitioners Guide to MLOps: A Framework for Continuous Delivery and Automation of Machine Learning. Google Cloud White Paper, 2021.
• Emmanuel Raj. Engineering MLOps: Rapidly build, test, and manage production-ready machine learning life cycles at scale.
Thank You
