
Machine Learning

Machine Learning enables machines to learn from data and experience without explicit programming.
Instead of hand-coding rules, you provide data to an algorithm, which builds its own logic from that data.

How it works: A training dataset is used to train an ML algorithm to create a model. New input data is
then processed through this model to make predictions. If the predictions meet acceptable accuracy, the
model is deployed. Otherwise, it is retrained with enhanced data until the accuracy improves.
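
As a minimal sketch of this train / evaluate / retrain workflow, assuming scikit-learn, a toy dataset, and an illustrative 90% acceptance threshold (both are assumptions for demonstration, not part of the notes):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    # Split the data into a training set and a held-out test set.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Train a model on the training dataset, then evaluate on new data.
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    if accuracy >= 0.90:    # assumed acceptance threshold
        print("accuracy acceptable -> deploy the model")
    else:
        print("accuracy too low -> retrain with enhanced data")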

Types of Machine Learning:

1. Supervised Learning – Learning is guided by labeled data (a "teacher"). The model is trained on this
dataset and then makes predictions or decisions when new data is introduced.

2. Unsupervised Learning – The model learns by observing and finding patterns in data without labels. It
organizes data into clusters based on relationships, though it doesn't assign labels to these clusters. For
example, it can group apples, bananas, and mangoes into clusters without naming them.

3. Reinforcement Learning – An agent interacts with its environment and learns through rewards and
penalties. It refines its decisions over time by maximizing positive rewards and minimizing mistakes.
Once trained, it can make predictions based on new data.
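
A short sketch contrasting the first two types, assuming scikit-learn and its bundled iris dataset (an illustrative choice, not from the notes):

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.cluster import KMeans

    X, y = load_iris(return_X_y=True)

    # Supervised: the labels y act as the "teacher".
    clf = DecisionTreeClassifier().fit(X, y)
    print(clf.predict(X[:1]))         # predicts a known class label

    # Unsupervised: no labels; samples are grouped into unnamed clusters.
    km = KMeans(n_clusters=3, n_init=10).fit(X)
    print(km.labels_[:10])            # cluster ids, not class names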
[Figure: classification of machine learning]

Unit 2
Regression Models:

Regression predicts continuous response values, such as house prices, stock values, or cricket scores.
Common models include:

1. Simple Linear Regression – Predicts using one independent variable.


2. Multiple Linear Regression – Predicts using multiple independent variables.

Key Concepts:

• Cost Function & Gradient Descent: Methods for optimizing the model by minimizing error.

• Performance Metrics:

o Mean Absolute Error (MAE)

o Mean Squared Error (MSE)

o R-Squared & Adjusted R-Squared (indicate model fit).
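
As a quick illustration of these metrics with scikit-learn (the values below are invented; adjusted R-squared is computed by hand since scikit-learn does not provide it directly):

    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    y_true = [3.0, 5.0, 7.5, 10.0]
    y_pred = [2.8, 5.4, 7.0, 10.3]

    print("MAE:", mean_absolute_error(y_true, y_pred))
    print("MSE:", mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    print("R^2:", r2)

    # Adjusted R^2 penalizes extra predictors: n samples, p features.
    n, p = 4, 1
    print("Adjusted R^2:", 1 - (1 - r2) * (n - 1) / (n - p - 1))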

Types of Regression
1. Linear Regression
2. Logistic Regression
3. Polynomial Regression
4. Support Vector Regression
5. Decision Tree Regression

6. Random Forest Regression
7. Ridge Regression
8. Lasso Regression

Linear Regression:

Linear regression is a simple statistical method for predictive analysis that models the relationship
between continuous variables. It addresses regression problems by showing a linear relationship
between the independent variable (X) and the dependent variable (Y).

Types:

1. Simple Linear Regression – One input variable.

2. Multiple Linear Regression – Multiple input variables.

Equation:
Y = aX + b

• Y: Dependent variable (target)

• X: Independent variable

• a: Slope of the line; b: Intercept

Example: Predicting an employee's salary based on years of experience.
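
A minimal sketch of this salary example, assuming scikit-learn; the data points are invented for illustration:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    years = np.array([[1], [2], [3], [5], [7]])    # X: years of experience
    salary = np.array([35, 42, 50, 65, 80])        # Y: salary in thousands

    model = LinearRegression().fit(years, salary)
    print("a (slope):", model.coef_[0])
    print("b (intercept):", model.intercept_)
    print("predicted salary for 4 years:", model.predict([[4]])[0])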

Some popular applications of linear regression are:

• Analyzing trends and sales estimates


• Salary forecasting
• Real estate prediction
• Arriving at ETAs in traffic.
LINEAR REGRESSION
Linear regression is a statistical approach for modeling the relationship between a dependent variable and a
given set of independent variables.

Simple Linear Regression


Simple linear regression is an approach for predicting a response using a single feature.

It is assumed that the two variables are linearly related. Hence, we try to find a linear function that
predicts the response value (y) as accurately as possible as a function of the feature or independent
variable (x). Consider a dataset where we have a value of the response y for every feature x, as in the sketch below.
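
Here is a small sketch that fits the least-squares line directly from the classical formulas (the data values are invented):

    import numpy as np

    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # Classical least-squares estimates:
    #   b1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
    #   b0 = mean(y) - b1 * mean(x)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    print("fitted line: y =", b0, "+", b1, "* x")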
LOGISTIC REGRESSION

Consider an example dataset that maps the number of hours of study to the result of an exam. The
result can take only two values, namely passed (1) or failed (0);

i.e., y is a categorical target variable that can take only two possible values, "0" or "1". To
generalize the model, we assume that the probability of passing follows the logistic (sigmoid) function of a
linear combination of the input: p(y = 1 | x) = 1 / (1 + e^-(b0 + b1*x)).
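
A minimal sketch of this hours-of-study example with scikit-learn (the data points are invented):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    hours = np.array([[0.5], [1.0], [1.5], [2.0], [3.0], [4.0], [5.0], [6.0]])
    passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

    clf = LogisticRegression().fit(hours, passed)
    print(clf.predict([[2.5]]))          # predicted class: 0 (fail) or 1 (pass)
    print(clf.predict_proba([[2.5]]))    # [P(fail), P(pass)] via the sigmoid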
Differences between Linear Regression and Logistic Regression:

LINEAR REGRESSION
1. Linear Regression is a supervised regression model.
2. In Linear Regression, we predict a continuous numeric value.
3. No activation function is used.

LOGISTIC REGRESSION
1. Logistic Regression is a supervised classification model.
2. In Logistic Regression, we predict the value as 1 or 0.
3. An activation function (the sigmoid) is used to convert the linear regression output into a class probability.
Performance Metrics

1. Accuracy: the proportion of correct predictions, i.e., the sum of the values on the "main diagonal" of
the confusion matrix divided by the total number of samples.

2. Precision: the number of correct positive results divided by the number of positive results predicted
by the classifier.

3. Recall: the number of correct positive results divided by the number of all relevant samples (samples
that are actually positive).
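
A sketch of all three metrics computed from a confusion matrix with scikit-learn (labels invented):

    from sklearn.metrics import (accuracy_score, confusion_matrix,
                                 precision_score, recall_score)

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

    print(confusion_matrix(y_true, y_pred))  # correct counts on the main diagonal
    print("Accuracy:", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
    print("Recall:", recall_score(y_true, y_pred))        # TP / (TP + FN)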
Residuals and Residual Plots

Residuals: Residuals measure the vertical distance between observed data points and the regression line,
representing the error between predicted and actual values.

Residual Plots:

• Residuals (Y-axis) vs. the independent variable (X-axis) are visualized in residual plots.

• Key assumption: residuals should be independent and normally distributed.

Residual Plot Analysis

A key assumption of linear regression is that residuals (errors) are independent and normally distributed.
Since predictions are never 100% accurate, some randomness is inherent. The regression model aims to
capture all predictive information in the deterministic part, leaving residuals as completely random and
unpredictable (stochastic). Ideally, residuals should follow a normal distribution, validating this
assumption.

Characteristics of a Good Residual Plot:

1. High density of points near the origin and low density away from it.

2. Symmetry about the origin.

3. No patterns as residuals are distributed evenly along the X-axis.

4. Projected residuals on the Y-axis form a normal distribution.

A good residual plot shows random, patternless scatter, while a bad one shows systematic patterns or
deviations from normality. This validates the assumption that residual errors are stochastic and
independent.

A good residual plot satisfies key assumptions:

1. Residuals projected onto the Y-axis form a normal distribution, confirming normality.

2. Residuals are evenly distributed across the X-axis with no visible patterns, ensuring
independence.
[Figure: good residual plots; residuals projected onto the Y-axis form a normal distribution]

In contrast, a bad residual plot shows:

• High density far from the origin and low density near it.

• A non-normal distribution when projected onto the Y-axis, violating these assumptions.
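
A sketch that produces the kind of residual plot described above, assuming numpy and matplotlib with invented data:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 100)
    y = 2 * x + 1 + rng.normal(0, 1, size=x.size)  # linear signal plus noise

    b1, b0 = np.polyfit(x, y, 1)                   # fit the regression line
    residuals = y - (b0 + b1 * x)

    plt.scatter(x, residuals)                      # residuals (Y) vs. x (X)
    plt.axhline(0, color="red")                    # zero-error reference line
    plt.xlabel("x")
    plt.ylabel("residual")
    plt.show()                                     # good fit: random scatter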

Polynomial Regression

Polynomial regression models the relationship between the independent variable x and the
dependent variable y as an nth-degree polynomial. It fits a nonlinear relationship using the
least-squares method.

Types of Polynomial Regression:

• Linear: Degree = 1

• Quadratic: Degree = 2

• Cubic: Degree = 3
• Higher degrees follow similarly.

Assumptions of Polynomial Regression

For effective polynomial regression:

1. The relationship between the dependent variable and the independent variables should be linear or
curvilinear, and additive.

2. Independent variables must not correlate with each other.

3. Errors should be independent, normally distributed with a mean of zero, and have constant
variance.

Polynomial regression alters the structure from a linear equation to a quadratic or higher-degree
equation, which can be visualized through its curve.

Linear Regression vs. Polynomial Regression

Linear regression models straight-line relationships but struggles when data points follow a curve. When
linear regression underfits the data, polynomial regression captures the nonlinear patterns by fitting a
curved line.
Key Difference:

• Linear regression assumes a linear relationship between variables.

• Polynomial regression handles nonlinear relationships effectively by increasing model complexity
(e.g., quadratic curves) while keeping feature weights linear.

Polynomial regression overcomes underfitting by transforming the model structure without changing the
linear nature of the weights.
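
A sketch of that idea with scikit-learn: the features are transformed to polynomial terms, but the model stays linear in its weights (data invented, roughly y = x^2 + x):

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    x = np.array([[1], [2], [3], [4], [5], [6]])
    y = np.array([2, 6, 12, 20, 30, 42])

    # Features become [x, x^2]; the regression itself remains linear.
    model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
    model.fit(x, y)
    print(model.predict([[7]]))    # follows the curve (about 56)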

MEASURES FOR IN-SAMPLE EVALUATION

A way to numerically determine how well the model fits the data set.

Two important measures to determine the fit of a model:

• Mean squared error (MSE)


• R squared (R^2)

Mean Squared Error (MSE)

Mean Squared Error (MSE) quantifies how close a regression line is to data points by calculating the
average of squared errors.

• Smaller MSE indicates closely dispersed data with fewer errors, resulting in a better model.

• Larger MSE suggests widely scattered data points around the mean.

Goal: Minimize MSE for improved model accuracy.
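
A sketch of MSE computed directly from its definition, MSE = (1/n) * sum((y_i - y_hat_i)^2), with invented values:

    import numpy as np

    y_actual = np.array([3.0, 5.0, 7.5, 10.0])
    y_predicted = np.array([2.8, 5.4, 7.0, 10.3])

    # Average of the squared errors between actual and predicted values.
    mse = np.mean((y_actual - y_predicted) ** 2)
    print("MSE:", mse)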
