24CSA524: Machine Learning
(3-0-1-4 credits)
Remya Rajesh
• Go through the following links for ML-related content:
a. https://www.analyticsvidhya.com/
b. https://www.datacamp.com/datalab/w/96e1eee8-23ac-4240-9d1a-c0bcad2ebaf1/edit
c. https://app.datacamp.com/learn
d. https://www.kaggle.com/
e. https://www.kdnuggets.com/
f. https://www.hackerrank.com/domains/ai?filters%5Bsubdomains%5D%5B%5D=machine-learning
Machine Learning in Our Daily Lives
• Spam Filtering
• Web Search
• Postal Mail Routing
• Fraud Detection
• Movie Recommendations
• Vehicle Driver Assistance
• Web Advertisements
• Social Networks
• Speech Recognition
What is Machine Learning?
• A branch of Artificial Intelligence (AI) that focuses on developing algorithms that enable computers to learn patterns from data and make predictions or decisions without being explicitly programmed.
Relevance of Machine Learning
Provide highly sought-after predictions that improve decision making and enable smart actions in real time, with minimal or no human intervention.
Situation/Context Awareness
• Find a problem to solve
• Do you need ML to solve it?
• Fake the working of the system
• Check the false positives and negatives
(Google Search vs Disease prediction)
• Adaptation of the system (feedback loop)
• Learn using the right labels
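The false positive and false negative check listed above can be sketched in a few lines of Python. The labels and predictions are made-up illustrative values, and the helper name `confusion_counts` is hypothetical. Precision and recall are standard summaries of these counts, added here only for illustration. Which error matters more depends on the application: a web search can presumably tolerate a few irrelevant results (false positives), while a disease screen usually cannot afford missed cases (false negatives).

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Made-up ground truth and model predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found
print(tp, fp, fn, tn, precision, recall)
```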
What's required to create good machine learning systems?
• Data preparation techniques
• Algorithms (basic and advanced)
• Programming knowledge
• Scalability through high-performance computing
• Ensemble modeling
Why “Learn”?
• Machine learning is programming to optimize a performance criterion using example data [past experience (past knowledge)].
• There is no need to “learn” to calculate payroll given the basic attributes.
• But what about revising payroll based on various factors?
• Learning is used when:
  • Human expertise does not exist (navigating on Mars)
  • Humans are unable to explain their own expertise (speech recognition)
  • The solution changes over time (routing on a computer network)
  • The solution needs to be adapted to particular cases (fraud)
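The contrast above between a fixed payroll formula and "optimizing a performance criterion using example data" can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names `payroll` and `fit_slope` and all numbers are hypothetical, and least squares is just one possible performance criterion.

```python
def payroll(hours, rate):
    """Explicitly programmed rule: nothing has to be learned."""
    return hours * rate

def fit_slope(xs, ys):
    """Learn w in y ≈ w * x by minimizing the sum of squared errors.

    Setting the derivative of sum((y - w*x)**2) with respect to w to zero
    gives w = sum(x*y) / sum(x*x).
    """
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Made-up example data (past experience): inputs and observed outputs.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

w = fit_slope(xs, ys)
print(payroll(40, 15))  # fixed formula, no data needed
print(w * 5)            # prediction for an unseen input, using the learned w
```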
Lecture Notes for E. Alpaydın 2010, Introduction to Machine Learning 2e © The MIT Press (V1.0)
Definition of Machine Learning
Arthur Samuel (1959): Machine Learning is the field of study that gives the computer the ability to learn without being explicitly programmed.
Photos from Wikipedia
Definition of Machine Learning
Tom Mitchell (1997): A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Task T: Playing checkers
Experience E (data): Moves made in games played by the program
Performance measure P: Winning rate
A well-defined learning task is given by <T, P, E>
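The <T, P, E> framing can be made concrete with a toy task that is much simpler than checkers. In the sketch below (all names and numbers are hypothetical), the task T is deciding whether an integer is at least 30, the experience E is a growing set of labeled examples, and the performance measure P is accuracy on a fixed test set. The midpoint rule is just one simple learner chosen for illustration.

```python
def true_label(x):
    """Hidden rule the learner must discover (task T): is x at least 30?"""
    return x >= 30

def learn_threshold(examples):
    """Estimate the threshold as the midpoint between the largest negative
    example and the smallest positive example seen so far."""
    largest_neg = max(x for x, label in examples if not label)
    smallest_pos = min(x for x, label in examples if label)
    return (largest_neg + smallest_pos) / 2

def accuracy(threshold, test_points):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)

# Experience E: labeled examples, revealed two at a time (made-up values).
stream = [0, 100, 10, 60, 20, 40]
test_points = range(100)

accuracies = []
for n in (2, 4, 6):
    examples = [(x, true_label(x)) for x in stream[:n]]
    accuracies.append(accuracy(learn_threshold(examples), test_points))

print(accuracies)  # [0.8, 0.95, 1.0]
```

Accuracy rises as more examples are seen, which is what "learning" means under this definition.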
Image from Tom Mitchell’s homepage
What We Talk About When We Talk About “Learning”
• Learning general models from data of particular examples
• Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.
• Build a model that is a good and useful approximation/representation of the data.
• Describe/summarize data in the form of a model
Learning: Knowledge iterates to improve
[Figure: learning cycle with the labels "prior knowledge", "Learning", "knowledge", and "Data / Additional Data"]
An ML Framework
Types of Data and related ML applications
Text Data:
• Sentiment Analysis: Analyzing emotions or opinions in product reviews, tweets, etc.
• Machine Translation: Translating text from one language to another (e.g., Google Translate).
• Text Summarization: Generating concise summaries of long documents.
• Chatbots and Virtual Assistants: NLP-based conversation systems (e.g., Alexa, Siri).
• Spam Detection: Classifying emails or messages as spam or not.
• Named Entity Recognition (NER): Identifying entities like names, locations, or dates in text.
• Text Classification: Categorizing articles or documents into predefined classes (e.g., news categories).
• Sentiment-Based Recommendation Systems: Recommending products or services based on user opinions.
• Plagiarism Detection: Identifying copied content using text similarity models.
• Topic Modeling: Identifying latent topics in large text corpora.
Types of Data and related ML applications
Image Data:
• Image Classification
• Object Detection
• Face Recognition
• Image Segmentation
• Optical Character Recognition (OCR)
• Image Generation
• Landmark Detection
• Medical Imaging Analysis
Types of Data and related ML applications
Video Data:
• Video Object Detection: Identifying objects in video frames (e.g., self-driving car cameras).
• Action Recognition: Detecting activities like running, jumping, or dancing in videos.
• Video Summarization: Automatically generating short summaries of long videos.
• Video Captioning: Creating text descriptions for video content.
• Gesture Recognition: Detecting and interpreting hand or body gestures.
• Video Anomaly Detection: Identifying unusual events (e.g., in security footage).
• Autonomous Vehicles: Using video data for lane detection, obstacle recognition, and navigation.
• Facial Recognition in Videos: Identifying individuals in live or recorded video streams.
• Video Frame Interpolation: Generating intermediate frames to improve video quality or create slow-motion effects.
• Virtual Backgrounds: Replacing or blurring backgrounds in video calls using ML.
Preprocessing
• Text data: cleaning, lowercasing, tokenization, stopword removal, stemming, lemmatization, vectorization, spelling correction, rephrasing
• Other data: normalization, scaling, grayscale conversion, cropping, denoising, segmentation, compression, handling missing data, encoding categorical data as numbers, outlier detection and removal, feature selection, feature engineering, dimensionality reduction, data balancing, data splitting
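A few of the preprocessing steps listed above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the stopword list is a tiny made-up sample, the function names are hypothetical, and real projects would typically rely on a dedicated library. Only lowercasing, tokenization, stopword removal, bag-of-words vectorization, min-max scaling, and one-hot encoding are shown.

```python
import re
from collections import Counter

STOPWORDS = {"the", "is", "a", "and", "of", "to"}  # tiny illustrative list

def preprocess_text(text):
    """Lowercase, tokenize into runs of letters, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def bag_of_words(tokens, vocabulary):
    """Vectorize: count how often each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def min_max_scale(values):
    """Rescale numbers to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector."""
    return [1 if value == c else 0 for c in categories]

tokens = preprocess_text("The movie is GOOD, and the plot is good!")
vector = bag_of_words(tokens, ["good", "movie", "plot", "bad"])
scaled = min_max_scale([10, 20, 30])
encoded = one_hot("red", ["red", "green", "blue"])
print(tokens, vector, scaled, encoded)
```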
Machine Learning Vocabulary
•Target: predicted category or value of the
data (column to predict)
Machine Learning Vocabulary
Machine Learning Vocabulary
• Target: predicted category or value of the data (column to predict)
• Features: properties of the data used for prediction (non-target
columns)
Machine Learning Vocabulary
Machine Learning Vocabulary
• Target: predicted category or value of the data (column to predict)
• Features: properties of the data used for prediction (non-target
columns)
• Example: a single data point within the data (one row)
Machine Learning Vocabulary
Machine Learning Vocabulary
• Target: predicted category or value of the data (column to predict)
• Features: properties of the data used for prediction (non-target
columns)
• Example: a single data point within the data (one row)
• Label: the target value for a single data point
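The four terms above map directly onto a small table of data. In the sketch below, the dataset, the column names, and the variable names `X` and `y` are made up for illustration: each dict is one example (row), `is_spam` is the target column, the remaining columns are features, and each value in `y` is the label of one example.

```python
# A tiny made-up dataset: each dict is one example (one row).
rows = [
    {"num_links": 3, "num_exclamations": 5, "is_spam": 1},
    {"num_links": 0, "num_exclamations": 0, "is_spam": 0},
    {"num_links": 1, "num_exclamations": 2, "is_spam": 0},
]

target = "is_spam"                                # column to predict
features = [c for c in rows[0] if c != target]    # non-target columns

X = [[row[c] for c in features] for row in rows]  # feature values per example
y = [row[target] for row in rows]                 # one label per example

print(features)
print(X)
print(y)
```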
Types of Machine Learning Algorithms
Types of Supervised Learning Algorithms
Supervised Learning - Overview
Regression - Overview
Classification - Overview
What is Needed for Classification?
• Model data with:
  • Features that can be quantified
  • Labels that are known
• A method to measure similarity
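The three ingredients above (quantified features, known labels, a similarity measure) are enough to build a simple classifier. The sketch below uses a 1-nearest-neighbor rule with Euclidean distance as the similarity measure. This technique is chosen here only as an illustration and is not named in the slides; the data and function names are made up.

```python
import math

def euclidean(a, b):
    """Similarity measure: smaller distance means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_1nn(train_X, train_y, query):
    """Return the label of the training example closest to the query."""
    distances = [euclidean(x, query) for x in train_X]
    nearest = distances.index(min(distances))
    return train_y[nearest]

# Made-up training data: numeric features with known labels.
train_X = [[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [6.0, 5.5]]
train_y = ["small", "small", "large", "large"]

print(predict_1nn(train_X, train_y, [1.2, 1.4]))
print(predict_1nn(train_X, train_y, [5.5, 5.0]))
```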