Lecture 3

The document provides an overview of machine learning (ML), including its definition, types, and applications, as well as a brief introduction to the Scikit-learn library. It outlines the differences between supervised, unsupervised, and reinforcement learning, and discusses practical examples and exercises using the K-Nearest Neighbors algorithm. The content is aimed at introducing ML concepts without delving into theoretical details, with a focus on hands-on practice using Python.

Uploaded by

Edoardo Maschio
Copyright: © All Rights Reserved

Machine learning with Python

ESCP-Paris 2021
Slides (or images, contents) adapted from D. Dligach, C. Müller, E.
Duchesnay, M. Defferrard, E. Eaton, S. Sankararaman and many others (who
made their course materials freely available online).

Anh-Phuong TA
Chief Data Scientist at Le Figaro CCM-Benchmark group
taanhphuong@gmail.com

Maths …

Today’s lecture
• Overview of ML
• Quick introduction to Scikit-learn
• No theory today (we will cover it in the next lecture)
Machine learning is
Wiki:
Machine learning is a subfield of computer science (more
particularly soft computing) that evolved from the study of
pattern recognition and computational learning theory in
artificial intelligence. In 1959, Arthur Samuel defined
machine learning as a “Field of study that gives computers
the ability to learn without being explicitly programmed”.
Machine learning explores the study and construction of
algorithms that can learn from and make predictions on
data.
When Do We Use Machine Learning?
ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans cannot explain their expertise (speech recognition)
• Algorithms must be customized (personalized medicine)
• Data exists to acquire expertise (genomics)
A classic example of a task that
requires machine learning:
More tasks that are best solved
by using a learning algorithm
• Recognizing patterns:
- Facial identities or facial expressions
- Handwritten or spoken words
- Medical images
• Generating patterns:
- Generating images or motion sequences
• Recognizing anomalies:
- Unusual credit card transactions
- Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
- Future stock prices or currency exchange rates

Some applications of ML
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging software

Types of ML

Types of Learning
• Supervised (inductive) learning: learn with a teacher
– Given: labeled training instances (or examples)
– Goal: learn a mapping that predicts the label of a test instance
• Unsupervised learning: learn without a teacher
– Given: unlabeled inputs
– Goal: learn some intrinsic structure in the inputs
• Reinforcement learning: learn by interacting
– Given: an agent interacting with an environment (which has a set of
states)
– Goal: learn a policy (state-to-action mapping) that maximizes the agent’s
reward

Supervised learning

• Predicting the future with supervised learning

• Classification vs. Regression

Classification
• Predict categorical class labels
based on past observations
• Class labels are discrete,
unordered values
• Email spam classification
example (binary)
• Handwritten digit classification
example (multi-class)

Regression
• Also a kind of
supervised learning
• Prediction of a
continuous outcome
• Example: predicting semester
grade scores for
students

Unsupervised learning

• Dealing with unlabeled data

• Cluster analysis
• Objects within a cluster share a degree of
similarity
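
Cluster analysis can be tried on the Iris measurements used later in this lecture. The sketch below is only an illustration: the slides do not name a clustering algorithm, so scikit-learn's KMeans is an assumption, as is the choice of three clusters. Note that fit_predict receives only X; the species labels are never shown to the model.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Load only the measurements; the species labels are deliberately ignored.
X = load_iris().data

# Assumption: 3 clusters, chosen for illustration only.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)  # one cluster index per flower

print("Cluster of first 5 flowers:", cluster_ids[:5])
print("Cluster centers shape:", kmeans.cluster_centers_.shape)
```

The cluster indices are arbitrary identifiers, so cluster 0 does not have to correspond to class 0 of the labeled dataset.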
Reinforcement Learning
• Given sequence of states and actions
with (delayed) rewards
• Learn policy that maximizes agent’s
reward
Examples:
– Game playing
– Robot in maze
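
The idea of learning a policy from delayed rewards can be sketched on a toy version of the "robot in maze" example. Everything below is an assumption made for illustration: the maze is a 5-cell corridor with the goal in the last cell, the learning method is tabular Q-learning (the slides do not name an algorithm), and the constants ALPHA, GAMMA and the episode count are arbitrary.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Move one cell; the reward is 1 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
# Q[state][action]: estimated long-term reward of taking action in state.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(200):                      # 200 episodes
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)      # explore with random actions
        next_state, reward = step(state, action)
        # Q-learning update toward reward + discounted best future value
        target = reward + GAMMA * max(Q[next_state])
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = next_state

# Greedy policy: in each non-goal cell, pick the action with the larger Q.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Although the reward only arrives at the goal cell, the discounted update propagates it backwards, so the greedy policy ends up choosing "move right" in every cell.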

The Agent-Environment Interface


Designing a Learning System
• Choose training experience
• Choose exactly what is to be learned – i.e. the
target function
• Choose how to represent the target function
• Choose learning algorithms to infer target
function from experience

Feature representations
Ex: Iris dataset
Basic terminology
Different steps for building an ML application
Practice: we will use sklearn
• Contains many state-of-the-art machine
learning algorithms
• Offers comprehensive documentation
(http://scikit-learn.org/stable/documentation) about
each algorithm
• Widely used, and a wealth of tutorials
(http://scikit-learn.org/stable/user_guide.html) and
code snippets are available
• Works well with numpy, scipy, pandas,
matplotlib, …

Building (fitting) models

All classifiers in scikit-learn have the same API:
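Example
from sklearn.naive_bayes import MultinomialNB

# X_train, y_train, X_test, y_test are assumed to come from a train/test split
model = MultinomialNB()
model.fit(X_train, y_train)  # learn from the training data

print("train score:", model.score(X_train, y_train))
print("test score:", model.score(X_test, y_test))
y_pred = model.predict(X_test)  # predicted labels for the test set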
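
The shared API means the same fit and score calls work for different classifiers. The sketch below makes that concrete under a few assumptions: the Iris dataset and the three classifiers are picked only for illustration, and default hyperparameters are used throughout.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = (MultinomialNB(), KNeighborsClassifier(),
          DecisionTreeClassifier(random_state=0))

test_scores = {}
for model in models:
    model.fit(X_train, y_train)          # same call for every classifier
    name = type(model).__name__
    # score() returns the mean accuracy on the given data
    test_scores[name] = model.score(X_test, y_test)
    print(name, "test score:", round(test_scores[name], 2))
```

Swapping one algorithm for another only changes the line that constructs the model.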

Some toy datasets available in sklearn

from sklearn.datasets import load_iris

iris_dataset = load_iris()
X = iris_dataset.data    # feature matrix, one row per flower
y = iris_dataset.target  # integer class label per flower
print("Features: {}".format(iris_dataset['feature_names']))
print("Shape of data: {}".format(iris_dataset['data'].shape))
print("First 5 rows:\n{}".format(iris_dataset['data'][:5]))
print("Target names: {}".format(iris_dataset['target_names']))
print("Targets:\n{}".format(iris_dataset['target']))

Supervised learning: first algorithm

Supervised learning with sklearn
KNN: k=1

Return the class label of the nearest training point

KNN: k>1

For k>1: take a vote among the k nearest training points and return the
majority class (or a confidence value for each class)
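
The voting rule can be written out in a few lines of plain Python. This is a minimal sketch, not how scikit-learn implements it: the function name knn_predict, the Euclidean distance, and the tiny 2-D dataset are all made up for illustration, and the "confidence" is simply the share of the k votes each class received.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=1):
    """Return (majority label, per-class vote share) among the k nearest points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    label = votes.most_common(1)[0][0]
    confidence = {cls: n / k for cls, n in votes.items()}
    return label, confidence

# Tiny made-up 2-D dataset for illustration only.
points = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.2), (3.1, 2.9), (0.9, 1.3)]
labels = ["a", "a", "b", "b", "a"]

label_k1, conf_k1 = knn_predict(points, labels, (2.9, 3.0), k=1)
label_k3, conf_k3 = knn_predict(points, labels, (2.2, 2.2), k=3)
print(label_k1, conf_k1)
print(label_k3, conf_k3)
```

With k=3 the query (2.2, 2.2) has two "b" neighbors and one "a" neighbor, so the vote returns "b" with a vote share of 2/3.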
Training and testing data
• train_test_split: splits the data randomly into 75%
training and 25% test data (the default proportions).

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

• Some questions
• Why 75%? Are there better ways to split?
• What if one random split yields different models than
another?
• What if all examples of one class end up in the
training/test set?
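
Two common answers to these questions are stratified splitting and cross-validation. The sketch below is one possible illustration rather than the lecture's prescribed method: stratify=y and cross_val_score are standard scikit-learn features, while the choice of 5 folds and of a 1-nearest-neighbor classifier is an assumption.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions similar in training and test sets,
# so no class can end up entirely on one side of the split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
test_counts = Counter(y_test.tolist())
print("Test set class counts:", test_counts)

# 5-fold cross-validation: five different train/test splits, five scores.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=5)
print("Fold scores:", scores.round(2), "mean:", round(scores.mean(), 2))
```

Looking at the spread of the fold scores gives a rough idea of how much the result depends on one particular random split.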

Testing kNN with sklearn

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

• Training a kNN model

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

• Evaluating the model

y_pred = knn.predict(X_test)
# Two equivalent ways to compute the accuracy on the test set
print("Score: {:.2f}".format(np.mean(y_pred == y_test)))
print("Score: {:.2f}".format(knn.score(X_test, y_test)))

• Predicting a new example

X_new = np.array([[5, 2.9, 1, 0.2]])
prediction = knn.predict(X_new)
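
The prediction is an integer class index, which can be mapped back to a species name, and predict_proba exposes the per-class vote share mentioned for k>1. The sketch below is self-contained and repeats the split used above; the choice of k=5 is an assumption made so that the vote shares are more informative than with k=1.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

# Assumption: k=5, chosen for illustration only.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

X_new = np.array([[5, 2.9, 1, 0.2]])
prediction = knn.predict(X_new)        # array with one class index
species = iris_dataset['target_names'][prediction]
proba = knn.predict_proba(X_new)       # one row, one column per class

print("Predicted species:", species)
print("Vote share per class:", proba)
```

A petal length of 1 cm and a petal width of 0.2 cm are typical of setosa, so all nearest neighbors of this point belong to that class.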

Exercise
Testing kNN with the Boston dataset
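
A possible starting point for the exercise, under two assumptions. First, the Boston target (house price) is continuous, so a regressor is used instead of KNeighborsClassifier. Second, load_boston was removed from scikit-learn in version 1.2, so this sketch uses the bundled diabetes dataset as a stand-in regression problem; with an older scikit-learn the same code should work on the Boston data.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Stand-in regression dataset (assumption): continuous target, 10 features.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The regressor averages the targets of the k nearest training points.
reg = KNeighborsRegressor(n_neighbors=5)
reg.fit(X_train, y_train)

y_pred = reg.predict(X_test)
r2 = reg.score(X_test, y_test)  # R^2 for regressors, not accuracy
print("R^2 on test data: {:.2f}".format(r2))
```

Trying several values of n_neighbors and comparing the test scores is a natural next step.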
