
Linear Regression in Machine Learning

• Linear regression is one of the easiest and most popular Machine Learning algorithms.
It is a statistical method that is used for predictive analysis.
• Linear regression makes predictions for continuous/real or numeric variables such as
sales, salary, age, product price, etc.
• The linear regression algorithm shows a linear relationship between a dependent variable (y) and
one or more independent variables (x), hence it is called linear regression.
• Since linear regression shows a linear relationship, it finds how the value of the dependent
variable changes according to the value of the independent variable.
• The linear regression model provides a sloped straight line representing the
relationship between the variables.

Mathematically, we can represent linear regression as: y = a0 + a1x + ε


y = dependent variable
x = independent variable
a0 = intercept of the line
a1 = linear regression coefficient (slope)
ε = random error
The values of the x and y variables form the training dataset used to fit the Linear Regression
model.
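As a small illustration, the following is a minimal Python sketch (assuming NumPy is available; the data values are made up) that fits a0 and a1 with the ordinary least-squares formulas and uses them to make a prediction:

```python
import numpy as np

# Made-up training data (x = years of experience, y = salary in $1000s)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([30, 35, 42, 48, 55], dtype=float)

# Ordinary least-squares estimates of slope (a1) and intercept (a0)
a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a0 = y.mean() - a1 * x.mean()

# Predict for a new input
x_new = 6.0
y_pred = a0 + a1 * x_new
print(f"y = {a0:.2f} + {a1:.2f} * x  ->  prediction for x = 6: {y_pred:.2f}")
```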
Types of Linear Regression
Linear regression can be further divided into two types of algorithm:
Simple Linear Regression: If a single independent variable is used to predict the value
of a numerical dependent variable, the algorithm is called Simple Linear Regression.
Multiple Linear Regression: If more than one independent variable is used to predict
the value of a numerical dependent variable, the algorithm is called Multiple Linear Regression.
Linear Regression Line
A straight line showing the relationship between the dependent and independent variables is
called a regression line. A regression line can show two types of relationship:
Positive Linear Relationship:
If the dependent variable increases on the Y-axis as the independent variable increases on the
X-axis, such a relationship is termed a positive linear relationship.
Negative Linear Relationship:
If the dependent variable decreases on the Y-axis as the independent variable increases on the
X-axis, such a relationship is called a negative linear relationship.

The goal of the algorithm is to find the best Fit Line equation that can predict the values based
on the independent variables.
Best Fit Line:
• The best Fit Line equation provides a straight line that represents the relationship between
the dependent and independent variables.
• The slope of the line indicates how much the dependent variable changes for a unit change
in the independent variable(s).
• Our primary objective while using linear regression is to locate the best-fit line, which implies
that the error between the predicted and actual values should be kept to a minimum.
• The best-fit line is the one with the least error.
Cost or loss Function:
• It measures the error between what the model predicts (Ŷ) and the actual answer (Y).
• It tells us how wrong the model is.

• The goal is to find the best values for:


o θ₁ (intercept or starting point of the line)
o θ₂ (slope or how much y changes with x)
• This gives us the best-fit line that closely matches the data.
• The model predicts values using the formula:
ŷᵢ = θ₁ + θ₂xᵢ
(Where xᵢ is the input, and ŷᵢ is the predicted output)
• MSE Formula:
MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²
where n is the number of training examples, yᵢ is the actual value, and ŷᵢ is the predicted value for the i-th example.
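To make the cost function concrete, here is a minimal Python sketch (assuming NumPy; the data and learning rate are illustrative) that computes the MSE for the line ŷᵢ = θ₁ + θ₂xᵢ and reduces it with a few steps of gradient descent:

```python
import numpy as np

# Illustrative data
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([30, 35, 42, 48, 55], dtype=float)

theta1, theta2 = 0.0, 0.0   # intercept and slope
lr = 0.01                   # learning rate (illustrative)

def mse(theta1, theta2):
    """Mean squared error of the line theta1 + theta2 * x on the data."""
    y_hat = theta1 + theta2 * x
    return np.mean((y - y_hat) ** 2)

for step in range(1000):
    y_hat = theta1 + theta2 * x
    error = y_hat - y
    # Gradients of the MSE with respect to theta1 and theta2
    grad1 = 2 * np.mean(error)
    grad2 = 2 * np.mean(error * x)
    theta1 -= lr * grad1
    theta2 -= lr * grad2

print(f"theta1 = {theta1:.2f}, theta2 = {theta2:.2f}, MSE = {mse(theta1, theta2):.3f}")
```

For simple linear regression the same minimum can also be found in closed form, as in the earlier sketch; gradient descent is shown here only to illustrate how the cost is reduced step by step.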
Linear separability

Linear separability implies that if there are two classes, then there exists a point, line, plane, or
hyperplane that splits the input features so that all points of one class lie in one half-space
and all points of the other class lie in the other half-space.

Linear separability is not an inherent property of the data; it depends on how the data is
represented or transformed. Proper feature engineering and data preprocessing can often
turn non-linearly separable data into linearly separable data.

Linear separability is a foundational concept in machine learning that makes binary
classification problems simpler. While many datasets are not linearly separable,
understanding this concept helps choose the right models and techniques for classification.

• For example, consider predicting whether a house is sold based on its area and price. We have a
number of data points with these two features, each labelled with the class Sold/Not Sold.

Importance in Supervised Learning:


• Especially important in binary classification problems.
• It helps in determining whether linear models like Logistic Regression, Perceptron, and Linear
SVM can classify the data accurately.
• If data is linearly separable, classification becomes easier and faster.

Use in Machine Learning Algorithms:


• Algorithms like Perceptron and Linear SVM are designed to find such separating hyperplanes.
• These models work best when data is perfectly separable using linear boundaries.

Linearly Non-Separable Data:

Many real-world datasets are not linearly separable (e.g., XOR problem).

In such cases, we need:


• Non-linear classifiers like Decision Trees, Neural Networks, etc.
• Or apply feature transformations or kernel tricks (in SVM) to make data linearly
separable in a higher-dimensional space.
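As a small illustration (a sketch assuming scikit-learn and NumPy are installed), a linear Perceptron cannot fit the XOR data in its original 2-D space, but adding the product feature x1*x2 makes the classes linearly separable in a higher-dimensional space:

```python
import numpy as np
from sklearn.linear_model import Perceptron

# XOR data: not linearly separable in the original 2-D space
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

linear = Perceptron(max_iter=1000, tol=None).fit(X, y)
print("Accuracy on raw XOR features:", linear.score(X, y))      # typically below 1.0

# Add the product feature x1*x2; XOR becomes linearly separable in 3-D
X_aug = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])
augmented = Perceptron(max_iter=1000, tol=None).fit(X_aug, y)
print("Accuracy with x1*x2 feature:", augmented.score(X_aug, y))  # typically 1.0
```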

Candidate Elimination and Version space


The Candidate Elimination Algorithm is a technique used in concept learning within machine
learning. It identifies all the hypotheses that correctly match the given training data. The
algorithm updates its set of possible hypotheses by using both positive and negative
examples. The group of all consistent hypotheses is known as the version space.

• The Candidate Elimination Algorithm (CEA) is used to find the correct target concept
from a hypothesis space based on given examples.

• It does this by keeping only the hypotheses that match all training examples.

• It uses two important boundaries:

o S (Specific hypothesis): The most specific hypothesis that fits all positive
examples.

o G (General hypothesis): The most general hypothesis that fits all positive
examples and rejects all negative ones.

• The area between S and G is called the version space, which contains all the possible
correct hypotheses.

Important Terms Used


1. Concept Learning

• Concept learning is the process of learning a general rule or condition (called a
hypothesis) from given examples.

• The goal is to generalize from the specific training data to correctly classify unseen
data.

2. Hypothesis

• A hypothesis is a rule that defines the concept.


For example: “If sky = sunny and humidity = normal, then play = yes.”

3. Specific Hypothesis (S)

• The most specific rule that covers only the positive examples.

• Initially, it is set to the null or most specific condition.


• Example: If attributes are [sky, temperature], an S might look like [sunny, warm].

4. General Hypothesis (G)

• The broadest rule that could still match the positive examples.

• Initially, it allows all values, like [?, ?] for two attributes.

• It becomes more specific with negative examples.

5. Version Space

• It is the space between S and G.

• It includes all hypotheses that agree with all training examples.

• As more examples are seen, the version space gets smaller, until the target concept is
found.

Steps of the Candidate Elimination Algorithm


1. Initialization:

o S = Most specific hypothesis (matches nothing).

o G = Most general hypothesis (matches everything).

2. For each training example:

o If the example is positive:

▪ Remove inconsistent hypotheses from G.

▪ Generalize S if needed to include the example.

o If the example is negative:

▪ Remove inconsistent hypotheses from S.

▪ Specialize G if needed to exclude the example.

3. Refinement:

o The algorithm updates S and G until all examples are processed.

o Final version space lies between S and G.
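The following is a minimal Python sketch of the algorithm for conjunctive hypotheses over discrete attributes (a simplifying assumption; real implementations handle more cases). Here "?" stands for "any value" and None plays the role of the most specific "no value". The example data at the end is the classic EnjoySport training set used in concept-learning texts.

```python
def consistent(h, x):
    """True if hypothesis h accepts instance x ('?' matches any value)."""
    return all(hv == '?' or hv == xv for hv, xv in zip(h, x))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [None] * n            # most specific hypothesis: matches nothing
    G = [['?'] * n]           # most general hypothesis: matches everything

    for x, positive in examples:
        if positive:
            # Remove from G hypotheses inconsistent with the positive example
            G = [g for g in G if consistent(g, x)]
            # Minimally generalize S so that it covers the example
            S = [xi if si is None else (si if si == xi else '?')
                 for si, xi in zip(S, x)]
        else:
            # Minimally specialize members of G so that they exclude the example
            new_G = []
            for g in G:
                if not consistent(g, x):
                    new_G.append(g)
                    continue
                for i in range(n):
                    if g[i] != '?':
                        continue
                    for v in domains[i]:
                        # keep only specializations still at least as general as S
                        if v != x[i] and (S[i] is None or S[i] == v):
                            h = list(g)
                            h[i] = v
                            new_G.append(h)
            G = new_G
    return S, G

# Classic EnjoySport data (Sky, AirTemp, Humidity, Wind, Water, Forecast)
domains = [['Sunny', 'Rainy'], ['Warm', 'Cold'], ['Normal', 'High'],
           ['Strong', 'Weak'], ['Warm', 'Cool'], ['Same', 'Change']]
examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True),
]
S, G = candidate_elimination(examples, domains)
print("S:", S)   # expected: ['Sunny', 'Warm', '?', 'Strong', '?', '?']
print("G:", G)   # expected: [['Sunny','?','?','?','?','?'], ['?','Warm','?','?','?','?']]
```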

Advantages of CEA

• Uses both positive and negative examples.

• Maintains all possible consistent hypotheses.

• Provides more accurate and generalized results.


• Can handle noisy data better than Find-S.

Disadvantages of CEA

• More complex and slower to compute.

• May fail if data has too much noise.

• Difficult to implement for beginners.

• Not efficient for very small datasets.

Find-S Algorithm
The Find-S Algorithm is a simple concept learning method used in machine learning. It is
designed to find the most specific hypothesis that fits all the positive training examples. The
algorithm is based on the assumption that the hypothesis space contains the target concept
and that there are no errors in the data.

What is Concept Learning?


Concept learning is the task of finding a general rule or condition (called a hypothesis) that
correctly describes a target concept based on training examples. The goal is to generalize from
specific examples to correctly classify future data.

What is the Find-S Algorithm?


• Find-S stands for "Find the Most Specific Hypothesis."

• It works by starting with the most specific hypothesis and gradually generalizing it
using only positive examples.

• It ignores negative examples, which is both a strength and a limitation depending on
the data.


Steps in Find-S Algorithm

1. Initialize h to the most specific hypothesis in H (for example, [θ, θ, θ, θ, θ, θ]).

2. For each positive training example x:

o For each attribute constraint in h, if the constraint is already satisfied by x, keep it;
otherwise replace it with the next more general constraint that is satisfied by x (a
conflicting specific value becomes "?").

3. Ignore every negative example.

4. Output the final hypothesis h.
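A minimal Python sketch of Find-S for the attribute-vector hypothesis representation described above (an illustration only; None again plays the role of the most specific "no value"):

```python
def find_s(examples, n_attributes):
    """Return the most specific hypothesis consistent with the positive examples."""
    h = [None] * n_attributes          # most specific hypothesis (theta everywhere)
    for x, positive in examples:
        if not positive:
            continue                   # Find-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] is None:           # first positive example: copy its values
                h[i] = value
            elif h[i] != value:        # conflicting value: generalize to '?'
                h[i] = '?'
    return h

# Illustrative EnjoySport-style data
examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True),
]
print(find_s(examples, 6))   # -> ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```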

Advantages of Find-S Algorithm

1. Very easy to understand and use

2. Works fast with small data

3. Finds the most exact rule for the given positive examples

4. Good when there are only correct (positive) examples and no mistakes in data

Disadvantages of Find-S Algorithm

1. Ignores negative (wrong) examples


2. Works only if the correct rule is already in the search space

3. Cannot handle wrong or missing data (noise)

4. May give a rule that is too narrow (overfits)

Machine Learning
• Machine Learning is the study of computer algorithms that allow computer programs to
automatically improve through experience.

• Machine Learning is about making computers modify or adapt their actions so that these
actions get more accurate.

• Machine learning is a subset of AI, which enables the machine to automatically learn from
data, improve performance from past experiences, and make predictions.

Types of Machine Learning:


1. Supervised Machine Learning

2. Unsupervised Machine Learning

3. Semi-Supervised Machine Learning

4. Reinforcement Learning

1. Supervised Machine Learning


Supervised learning is a machine learning technique where models are trained using labelled
data (input is already mapped to correct output). The goal is to learn a function that maps
input (X) to output (Y). Once trained, the model can predict outputs for new, unseen data.

Supervised learning is most effective when there is a large amount of accurately labeled data.
It allows the model to learn from previous examples and make future predictions. It is widely
used in applications where past data can be reliably linked to expected outcomes.

Example: Training a model with images of cats and dogs, labeled accordingly. After learning
the features like shape, size, color, and tail, the model can classify new images as cat or dog.

Types of supervised learning:


a) Classification

Classification algorithms are used to solve classification problems, in which the output
variable is categorical, such as Yes or No, Male or Female, Red or Blue, etc. Classification
algorithms predict the categories present in the dataset.
Algorithms: Decision Tree, Logistic Regression, SVM, Random Forest

Examples: Spam detection, Email filtering

b) Regression

Regression algorithms are used to solve regression problems, in which there is a relationship
between the input variables and a continuous output variable. They are used to predict
continuous output values, such as market trends, weather prediction, etc.

Algorithms: Linear Regression, Multivariate Regression, Lasso, Decision Tree

2. Unsupervised Machine Learning


Unsupervised Machine Learning: In unsupervised learning, the model is trained using
unlabeled data. The system identifies hidden patterns or groupings in the data without
guidance.

Example: Feeding images of various fruits without labels, the model will group them based on
similarities like color, shape, or size.

Types of unsupervised learning:


a) Clustering

Clustering groups similar data points into clusters based on shared features. It helps uncover
hidden structures in data. For example, it is used to group customers by their buying habits.
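For instance, here is a minimal sketch (assuming scikit-learn and NumPy; the customer data is made up) that groups points into two clusters with k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up customer data: [annual spend, number of purchases]
X = np.array([[200, 5], [220, 6], [210, 4],      # low spenders
              [900, 30], [950, 28], [880, 32]])  # high spenders

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster labels:", kmeans.labels_)
print("Cluster centers:", kmeans.cluster_centers_)
```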

b) Association

Association finds interesting relationships between variables in large datasets. It identifies
which items often occur together, such as products bought together in a market basket.

3. Semi-Supervised Learning
Semi-supervised learning combines both labeled and unlabeled data for training. It is useful
when acquiring labeled data is expensive or time-consuming.

Example: A student learns from a teacher (labeled data) but also revises independently
(unlabeled data).

Semi-supervised learning helps when you have a small amount of labeled data and a large
amount of unlabeled data. It improves learning efficiency by using both types. It is commonly
used in text classification, image recognition, and speech analysis.

4. Reinforcement Learning
Reinforcement learning is based on a feedback system where an agent learns by interacting
with the environment. Good actions are rewarded; bad actions are punished. The goal is to
maximize the cumulative reward.
Example: An AI learning to play a game by trial and error, improving its moves based on scores.

Reinforcement learning focuses on decision making through trial and error. The agent receives
rewards or penalties and adjusts its actions accordingly. It is useful in situations where actions
influence future outcomes over time.

Types of Reinforcement Learning:


1. Positive Reinforcement: Encourages behavior by giving rewards. It helps increase the
likelihood of the behavior being repeated. This is the most common type used in
learning agents.

2. Negative Reinforcement: Encourages avoiding bad behavior by removing negative
outcomes. It increases the chance of repeating the correct behavior. Often used to
guide agents away from harmful actions.

Perspectives and Issues in Machine Learning:


Perspectives

• Machine learning means finding the best solution (hypothesis) from many possibilities
that match the data and what we already know.

• These possible solutions (hypotheses) can be represented using models like decision
trees, linear functions, or neural networks.

• Different problems require different types of models for better learning results.

• Learning can be seen as a “search process,” where the algorithm looks for the right
model based on the amount of training data, the size of the solution space, and how
confident we are that the model will work on new data.

Specific Issues in Machine Learning:

1. Inadequate Training Data: Too little data makes it hard for the model to learn correctly.

2. Poor Quality Data: If the data is full of errors or noise, the model will learn the wrong
patterns.

3. Non-representative Data: If the training data doesn’t reflect real-world conditions, the
model may perform poorly.

4. Overfitting and Underfitting: Overfitting happens when the model memorizes the
training data, while underfitting happens when it fails to learn the underlying pattern.

5. Monitoring and Maintenance: Machine learning models need regular updates and
monitoring to stay accurate.
6. Bad Recommendations: Poor model training can lead to incorrect suggestions,
especially in recommendation systems.

7. Lack of Skilled Resources: Building and managing ML models requires experts, which
are sometimes hard to find.

8. Customer Segmentation: Wrong grouping of customers may affect personalized
services or marketing.

9. Process Complexity: Machine learning involves many steps (data cleaning, feature
selection, model training, testing, etc.), which can be complicated.

10. Data Bias: If the data has bias, the model will learn and repeat the same bias, leading
to unfair or incorrect results.

PERCEPTRON
In Machine Learning and AI, a Perceptron is a fundamental concept and the starting point for
understanding neural networks and deep learning. It consists of input values, weights, and a
threshold (bias) to make decisions.

It was invented by Frank Rosenblatt in the mid-20th century to perform calculations that
detect patterns in data. The Perceptron is a supervised learning algorithm used mainly for
binary classification problems. It helps the model learn from data by adjusting weights
during training.

Perceptron Model
A Perceptron is a simple supervised learning algorithm used for binary classification. It is also
called an artificial neuron and is the basic unit of a neural network.

The Perceptron model works like a single-layer neural network with four key parts:

• Inputs
• Weights and Bias
• Net sum (weighted total)
• Activation function

It helps in detecting patterns from input data and is one of the simplest forms of neural
networks used in machine learning.

Binary Classifier

A binary classifier is a machine learning algorithm that classifies input data into one of two
classes, such as yes/no, spam/not spam, or positive/negative.
It works by checking if the input (usually represented as feature vectors) belongs to a specific
class. Most binary classifiers use a linear function that combines weights and features to
make a prediction

Components of Perceptron
Frank Rosenblatt proposed the perceptron model as a binary classifier containing three main
components. These are as follows:

Input Layer
The input layer is the initial component of the perceptron that accepts the raw data. Each
input node represents a feature of the dataset and typically holds a real-valued input. These
inputs are passed forward for weighted processing.

Weights and Bias


Weights determine the influence of each input on the output. Bias acts like the intercept in a
linear equation, helping adjust the output threshold.

Activation Function
Processes the weighted sum and bias to determine the output. A step function is commonly
used to produce binary results, deciding whether the neuron should activate.
Types of Activation functions:

• Sign function
• Step function
• Sigmoid function

A data scientist uses the activation function to help the model make decisions based on the
problem.
Different activation functions like Sign, Step, and Sigmoid are used in perceptrons depending
on how the learning behaves, for example whether it is slow or affected by issues such as
vanishing or exploding gradients.

Working
In Machine Learning, a Perceptron is a single-layer neural network that includes four main
parts: input values, weights and bias, net sum, and an activation function.

• Each input is multiplied by its corresponding weight.


• All weighted values are added together along with the bias to form a net sum.
• This net sum is passed through an activation function (usually a step function), which
decides the final output.
The activation function helps convert the output to a fixed range like (0, 1) or (-1, 1).

• The weight shows how important an input is.


• The bias shifts the activation curve up or down to improve learning flexibility.

The Perceptron model works in two main steps:

Step 1: Calculate the Weighted Sum

Multiply each input value xi by its corresponding weight wi, add all the products, and include
the bias b:

net = w1x1 + w2x2 + … + wnxn + b

Step 2: Apply the Activation Function

The activation function f is applied to the net sum to produce the final output:

y = f(net)

This output can be binary (e.g., 0 or 1) or continuous, depending on the function used.
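Putting the two steps together, here is a minimal Python sketch of a perceptron with a step activation function, trained on the AND function using the classic perceptron learning rule (the learning rate and number of epochs are illustrative choices):

```python
import numpy as np

def step(net):
    """Step activation: fire (1) if the net sum is positive, else 0."""
    return 1 if net > 0 else 0

# AND-gate training data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)   # weights
b = 0.0           # bias
lr = 0.1          # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        output = step(np.dot(w, xi) + b)     # Step 1 (weighted sum) + Step 2 (activation)
        error = target - output
        w += lr * error * xi                 # perceptron weight update
        b += lr * error

print("weights:", w, "bias:", b)
for xi in X:
    print(xi, "->", step(np.dot(w, xi) + b))
```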

LINEAR DISCRIMINANT ANALYSIS (LDA)


Linear Discriminant Analysis (LDA) is a popular dimensionality reduction and classification
technique used in supervised learning, especially when dealing with more than two classes.

It is also known as:

• Normal Discriminant Analysis (NDA)


• Discriminant Function Analysis (DFA)

What LDA Does:


• Reduces high-dimensional data to a lower-dimensional space to save computation and
avoid overfitting.
• Separates two or more classes by projecting data in a way that maximizes the difference
between them.
• Often used as a preprocessing step before applying classification algorithms

Why LDA

• Logistic Regression works well for only two-class problems.


• LDA can handle multiple classes efficiently.

Whenever we need to separate two or more classes that have multiple features, Linear
Discriminant Analysis (LDA) is one of the most commonly used techniques.

For example, if we try to classify two classes using just one feature, their values might
overlap, making it hard to separate them clearly.
But LDA combines multiple features to create a new axis that helps in better class separation
with less overlap.

To overcome the overlapping issue in the classification process, we often need to increase the
number of features.

Example

Imagine we have two classes of data points plotted on a 2D plane (X and Y axes), as shown
in the image.

Black circles = Class 1

Red circles = Class 2


In this 2D space, it's hard to draw a straight line that completely separates the two classes.

Linear Discriminant Analysis (LDA) helps by projecting this 2D data into a 1D space in a way
that:

• Maximizes the distance between the classes, and


• Minimizes overlap within each class

This makes classification much easier and more accurate.

How Linear Discriminant Analysis (LDA) Works

LDA is used to reduce high-dimensional data (like 2D or 3D) into a lower-dimensional space
(like 1D) to make classification easier.

Imagine we have two classes of points plotted in a 2D space (X and Y axes), and we want to
separate them clearly.

Linear Discriminant Analysis (LDA) helps by:

• Drawing a new axis (a straight line) that best separates the two classes.
• Projecting all data points onto this new axis.

This makes it easier to classify the points, as the separation between classes becomes more
visible in this new 1D space.

Hence, we can maximize the separation between these classes and reduce the 2-D plane
into 1-D.

To do this, LDA follows two key rules:

• Maximize the distance between the means of the two classes.


• Minimize the variance within each class.

By applying these rules, LDA finds a new axis that:

• Spreads the classes farther apart, and


• Keeps the points of the same class closer together.
This makes it easier to separate the classes and improves classification accuracy.
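As a concrete illustration, here is a minimal sketch using scikit-learn's LinearDiscriminantAnalysis (the two-class 2-D data is made up) that projects the points onto a single discriminant axis and classifies them:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Made-up 2-D data for two classes
X = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],    # class 0
              [6.0, 7.0], [6.5, 6.8], [7.0, 7.5]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

lda = LinearDiscriminantAnalysis(n_components=1)
X_1d = lda.fit_transform(X, y)        # project the 2-D data onto the 1-D discriminant axis

print("1-D projections:", X_1d.ravel())
print("Predictions:", lda.predict(X))  # LDA can also classify directly
```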

Drawbacks of Linear Discriminant Analysis (LDA)

While LDA is effective for multi-class classification, it has some limitations:

• Fails when class distributions have the same mean: If two or more classes share the same
mean, LDA cannot find a clear axis to separate them, making it ineffective.

• In such cases, non-linear techniques like Non-Linear Discriminant Analysis or kernel-based
methods are used.

Real-World Applications of LDA

• Face Recognition
• Medical Diagnosis
• Customer Identification
• Predictive Modeling
• Robotics and Learning

CONCEPT LEARNING
Concept Learning is a supervised learning approach in machine learning where the goal is to
learn a general rule or concept from specific examples.
In simple terms, it’s about learning how to classify data into positive or negative examples
based on defined conditions or rules.
Concept Learning is the task of inferring a Boolean-valued function (yes/no, true/false) from
given input-output examples.
For example:
Learning the concept of a "bird" from examples like:
• Sparrow – Yes (Bird)
• Car – No (Not a Bird)
The model tries to understand what makes something a "bird" based on its features.
Key Elements:
• Instance Space (X): All possible examples.
• Hypothesis Space (H): All possible rules the model can learn.
• Target Concept (c): The actual concept we're trying to learn.
• Training Examples (D): Labelled examples used to learn.

CONCEPT LEARNING TASK


The Concept Learning Task is the problem of automatically identifying a concept or rule from
a set of labelled training examples.
The goal is to learn a target concept (a function) that maps input instances to yes/no
(true/false) decisions.

A concept learning task is defined as:


• A set of instances (X)
• A set of training examples D, where each example is labeled as positive (belongs to
the concept) or negative (does not)
• A hypothesis space (H) that contains all possible rules or concepts
The task:
Find a hypothesis h ∈ H such that h(x) = c(x) for all x in X.
Where:
• x is an input instance
• h(x) is the hypothesis prediction
• c(x) is the actual label (target concept)

Hypothesis Representation in Concept Learning


When training a machine to learn a concept (like "good day to go sailing"), we need to
represent possible rules (called hypotheses) that the machine can choose from.

A hypothesis is basically a set of conditions that describes a possible concept.

Example

Each instance has six attributes

1. Sky (e.g., Sunny, Cloudy, Rainy)


2. AirTemp (e.g., Warm, Cold)
3. Humidity (e.g., High, Normal)
4. Wind (e.g., Strong, Weak)
5. Water (e.g., Warm, Cool)
6. Forecast (e.g., Same, Change)

So, a hypothesis is a vector (or list) with six entries, one for each of these attributes.

How Can a Hypothesis Represent an Attribute?


For each attribute, the hypothesis can say one of the following:
1. "?" (Question mark)
→ Any value is acceptable for this attribute.
→ Example: "?" in the Wind position means Wind can be Strong or Weak.

2. Specific Value
→ Only one specific value is allowed.
→ Example: "Warm" in AirTemp means it must be Warm.

3. "θ" (Theta symbol)


→ No value is acceptable. It rejects all instances.
→ Used to represent the most specific hypothesis, which accepts nothing.

Ex: [Sunny, ?, High, ?, Warm, Same]

This means:

• Sky must be Sunny

• Any value is acceptable for AirTemp and Wind

• Humidity must be High

• Water must be Warm

• Forecast must be Same

Inductive Learning Hypothesis


The Inductive Learning Hypothesis is a basic idea in machine learning. It says that if a
hypothesis (rule) works well on the training data, then it is likely to work well on new,
unseen data too.

That means, when we train a machine learning model, we give it labeled examples
(training data). The model learns a hypothesis (a rule or function) from these examples.
If this rule correctly predicts most of the training data, we assume it will also predict
future data correctly.

This is what makes generalization possible in machine learning. It allows models to make
predictions on data they've never seen before.

Concept Learning as Search


• In machine learning, concept learning can be seen as a search problem where we
search for the best hypothesis (rule) that matches the training examples.
• We are searching through a space called the Hypothesis Space (H).
• This space contains all possible hypotheses (rules) that could explain the data.
• The Goal is to find the hypothesis in H that correctly classifies all positive and
negative examples from the training set.
How It Works:
• The hypothesis space is all the possible rules the model can choose from.
• The learning algorithm searches this space to find a rule that:
o Matches positive examples (yes)
o Rejects negative examples (no)
Concept learning is like a search process, where the learner explores different hypotheses
to find the one that matches the concept based on examples.
By choosing a hypothesis representation (like specific symbols or formats),
the designer decides:
• What kind of rules the model can learn
• What the limits of the learning algorithm will be
So, the structure of the hypothesis space depends entirely on how the learning problem
is defined.
Example: Concept Learning as Search
Let’s say we want to teach a machine the concept of a "fruit that is good to eat" based
on 3 attributes:

Attribute | Possible Values
Color | Red, Green, Yellow
Texture | Smooth, Rough
Taste | Sweet, Sour

Step 1: Training Data (Examples)


Color | Texture | Taste | Good to Eat?
Red | Smooth | Sweet | Yes
Green | Rough | Sour | No
Yellow | Smooth | Sweet | Yes

Step 2: Define Hypothesis Space


We define hypotheses like this: [Color, Texture, Taste]
Each position can be "?" (any value), a specific value, or θ (no value).

Step 3: Searching the Hypothesis Space


most general hypothesis: [?, ?, ?] → Accepts everything
most specific hypothesis: [θ, θ, θ] → Accepts nothing
Then, based on training examples, we search the space by adjusting the hypothesis:
• First positive example: [Red, Smooth, Sweet]
• Generalize to include the next positive: [?, Smooth, Sweet] (matches both Red and
Yellow)
This search continues until we find a hypothesis that:
• Covers all positive examples
• Excludes all negative ones
[?, Smooth, Sweet] → Any color, but texture must be Smooth and taste must be Sweet.

General-to-Specific Ordering of Hypotheses


In concept learning, many algorithms organize the search through the hypothesis space
using a structure called the general-to-specific ordering. That means
• Some hypotheses are more general than others — they apply to more examples.
• Some hypotheses are more specific — they apply to fewer examples.
This natural structure helps us search more efficiently, even in large or infinite hypothesis
spaces, without checking every possible rule.
Example:
Let’s consider two hypotheses:
• h1 = (Sunny, ?, ?, Strong, ?, ?)
• h2 = (Sunny, ?, ?, ?, ?, ?)
Now look at the difference:
• h1 says the wind must be Strong.
• h2 doesn't care about wind (more general).
So, any instance that matches h1 will also match h2, but not the other way around.
Therefore, we can say that “h2 is more general than h1”
Definition: Let hj and hk be Boolean-valued functions defined over X. Then hj is more-
general-than-or-equal-to hk (written hj >= hk) if and only if:

(∀x ∈ X) [(hk(x) = 1) → (hj(x) = 1)]
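For the attribute-vector hypotheses used in these notes, this ordering can be checked position by position. A minimal Python sketch (an illustration, not part of the original definition):

```python
def more_general_or_equal(hj, hk):
    """True if hypothesis hj is more general than or equal to hk.
    At every position, hj must either accept anything ('?') or
    require exactly the same value that hk requires."""
    return all(a == '?' or a == b for a, b in zip(hj, hk))

h1 = ('Sunny', '?', '?', 'Strong', '?', '?')
h2 = ('Sunny', '?', '?', '?', '?', '?')

print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False
```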

THE BRAIN AND THE NEURON


• In animals, learning happens in the brain, which is a powerful and complex system.
• It can handle noisy, incomplete, and high-dimensional data (like images) efficiently.
• The brain weighs around 1.5 kg and still performs well even as neurons die with age.
• The basic unit of the brain is the neuron.
How a Neuron Works:
• Dendrites receive signals.
• If enough signals are received, the neuron fires by sending a signal down the axon.
• We can mimic this process in machine learning using a function that:
o Takes weighted inputs
o Adds them up
o Fires (outputs a signal) if the sum crosses a threshold (bias)

Each neuron performs a simple task, but the brain as a whole acts as a massively parallel
computer with around 10^11 neurons.
Hebb’s Rule (Learning Rule in Neuroscience)
• Proposed by Donald Hebb.

• If two neurons fire at the same time, the connection between them becomes
stronger.

• If they don’t fire together, the connection weakens or dies.

This rule forms the basis of learning in biological and artificial neural networks.
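Hebb's rule is often formalized as the weight update Δw = η · x · y (strengthen a connection when its input and the neuron's output are active together). A tiny sketch under that assumption, with made-up activations:

```python
import numpy as np

eta = 0.1                        # learning rate
x = np.array([1.0, 0.0, 1.0])    # pre-synaptic activations (inputs)
y = 1.0                          # post-synaptic activation (the neuron fired)
w = np.array([0.2, 0.2, 0.2])    # initial connection weights

# Hebbian update: connections from active inputs to an active output get stronger
w += eta * x * y
print(w)   # -> [0.3, 0.2, 0.3]; only weights on the active inputs increased
```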

McCulloch and Pitts Neurons:


➢ A mathematical model of a neuron developed in the 1940s.
➢ The neuron receives multiple inputs xi with corresponding weights wi.
➢ These inputs are summed, and:

• If the sum is greater than a threshold, the neuron fires (outputs 1).

• If not, the neuron does not fire (outputs 0).

Components:
• Weighted Inputs – Like synapses

• Adder/Sum – Like the neuron's membrane

• Activation Function – Usually a threshold function
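A McCulloch-Pitts neuron can be written in a few lines. The sketch below implements a two-input AND gate with a manually chosen threshold (the weights and threshold are hand-set, since this model has no learning rule):

```python
def mcp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: fire (1) if the weighted sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# AND gate: both Boolean inputs must be 1 for the neuron to fire
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", mcp_neuron([a, b], weights=[1, 1], threshold=2))
```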

Limitations of the McCulloch & Pitts Model

• Can only handle Boolean inputs (0 or 1)


• Thresholds must be set manually
• No built-in weight adjustment or learning
• Cannot solve non-linearly separable problems

LEARNING SYSTEM
A Learning System is a system that automatically learns and improves from experience
without being explicitly programmed for every possible task.

A learning system is any machine or algorithm that can:

• Take input data

• Learn patterns or relationships from that data (training)

• Make predictions or decisions based on learned knowledge

• Improve its performance over time with more data

Example: Spam filter in email systems learns to classify emails as spam or not spam
based on labelled examples.

Designing a Learning System


Designing a machine learning system involves several well-defined steps or components:
1. Define the Learning Problem

• What is the goal? (e.g., classification, regression, clustering)


• What is the output? (e.g., label, value, category)

• What kind of data is available?

2. Data Collection and Preparation

• Collect data relevant to the task

• Clean, preprocess, and transform it (e.g., handle missing values, normalize features)

3. Choose a Learning Algorithm

• Supervised learning (e.g., decision trees, neural networks)

• Unsupervised learning (e.g., k-means)

• Reinforcement learning (e.g., Q-learning)

4. Train the Model

• Use training data to learn parameters

• Adjust internal model weights to minimize errors

5. Evaluate the Model

• Use a test dataset to measure performance

• Use metrics like accuracy, precision, recall, F1-score

6. Tune Hyperparameters

• Use techniques like cross-validation, grid search

• Improve generalization by avoiding overfitting/underfitting

7. Deploy and Monitor

• Deploy the model to a real-world system

• Continuously monitor and update with new data

Key Considerations While Designing

• Data quality and quantity

• Choice of features (feature engineering)

• Model complexity vs interpretability

• Scalability and computational cost

Example: Credit Card Fraud Detection System

Step 1: Define the Learning Problem


• Goal: Detect fraudulent transactions

• Type: Binary classification

• Output: 1 (Fraud) or 0 (Genuine)

Step 2: Data Collection & Preparation

• Collect a dataset of past transactions, including:

o Amount, time, location

o Merchant type

o Cardholder's profile

o Label (fraud or not)

• Clean the data:

o Remove missing or duplicate records

o Normalize transaction amounts

o Encode categorical features (e.g., one-hot encoding for location/merchant)

Step 3: Choose Learning Algorithm

Use a Supervised Learning algorithm like:

• Logistic Regression

• Decision Trees / Random Forest

• Support Vector Machines (SVM)

• Neural Networks

Suppose we use Random Forest for its good performance on imbalanced datasets.

Step 4: Train the Model

• Split data into training and testing sets (e.g., 80/20 split)

• Train the Random Forest model on the training set

Step 5: Evaluate the Model


Use metrics suitable for imbalanced datasets:

• Accuracy (not enough alone)

• Precision: % of predicted frauds that are correct

• Recall: % of actual frauds detected

• F1-Score: Harmonic mean of precision and recall

• AUC-ROC Curve for visualization
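To tie Steps 3 to 5 together, here is a minimal sketch (assuming scikit-learn and NumPy; the feature matrix X and labels y below are random placeholders for a real, preprocessed transaction dataset, so the printed scores only demonstrate the workflow):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_auc_score

# Placeholder data: replace with real preprocessed transaction features and labels
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))              # e.g., amount, time, encoded location, ...
y = (rng.random(1000) < 0.05).astype(int)   # ~5% fraud, imbalanced as in practice

# Step 4: split 80/20 and train a Random Forest
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Step 5: evaluate with metrics suited to imbalanced data
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, y_pred, zero_division=0))  # precision, recall, F1
print("AUC-ROC:", roc_auc_score(y_test, y_prob))
```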

Step 6: Tune Hyperparameters

• Use Grid Search or Random Search for parameters like:

o Number of trees

o Max tree depth

o Minimum samples per leaf

• Perform Cross-Validation to generalize better

Step 7: Deploy and Monitor

• Integrate model into payment systems

• Monitor for:

o Model drift (data patterns changing over time)

o False positives (flagging real users as fraud)

o False negatives (missing frauds)
