
Introduction to Machine Learning

Machine Learning (ML) is a subset of artificial intelligence (AI) that enables systems to learn from data and make
predictions or decisions without being explicitly programmed. It involves feeding data into algorithms to identify
patterns and make predictions on new data. Machine learning is used in various applications, including image and
speech recognition, natural language processing, and recommender systems.

Types of Machine Learning:


1. Supervised Learning – Models learn from labeled data (techniques: classification, regression). Labeled data has already been categorized, meaning each example comes with a known output, so the model learns to map inputs to those known labels.
2. Unsupervised Learning – Models find hidden patterns in unlabeled data (techniques: clustering, dimensionality reduction). Unlabeled data is raw data that has not been categorized or assigned specific labels; it has no predefined outputs, so the model must find patterns and relationships on its own (a short code sketch contrasting the two follows this list).
3. Reinforcement Learning – Models learn by interacting with an external environment and receiving feedback (e.g.,
game playing, robotics).
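As a minimal sketch of the first two types, the example below assumes scikit-learn (an assumption; any ML toolkit works similarly) and uses its built-in Iris dataset purely for illustration:

```python
# Minimal sketch: supervised vs. unsupervised learning
# (scikit-learn assumed; the dataset is just an illustrative built-in).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)        # X = features, y = labels

# Supervised: the model sees both the features and the known labels.
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict(X[:3]))                 # predicted classes for the first three samples

# Unsupervised: the model sees only the features and must find structure itself.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print(km.labels_[:3])                     # cluster assignments it discovered
```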

Benefits of Machine Learning


• Saves Time and Effort: Machine learning (ML) can handle repetitive tasks, allowing people to focus on more
important work. It also makes processes faster and more efficient.
• Better Decisions: ML can analyze large amounts of data and find patterns that humans might not notice. This helps
in making smarter decisions based on real information.
• Personalized Experience: ML improves user experiences by customizing recommendations, ads, and content based
on individual preferences.
• Smarter Machines and Robots: ML helps robots and machines perform complex tasks more accurately, which is
transforming industries like manufacturing and logistics.

Scope and Limitations of Machine Learning:

Scope (Where Machine Learning is Used)


• Automation in Different Fields: ML is used in healthcare, finance, marketing, and more to automate tasks and
improve efficiency.
• Better Decision-Making: ML helps predict future trends, making it useful for businesses and research.
• Personalized Experience: ML powers recommendation systems like Netflix and Amazon, showing users content they
are likely to enjoy.
• Fraud Detection: ML helps banks and online platforms detect and prevent fraud.
• Smart Assistants: Virtual assistants like Siri and Google Assistant use ML to understand and respond to voice
commands.

Limitations (Challenges of Machine Learning)
• Needs a Lot of Data: ML works best with large amounts of data. Without enough data, it may not learn properly.
• Can Be Biased: If the training data is biased, ML can make unfair or incorrect decisions.
• High Cost and Time-Consuming: Training ML models requires powerful computers and a lot of time.
• Lacks Human Understanding: ML can recognize patterns but does not truly "understand" like a human does.
• Security Risks: Hackers can trick ML models, making them vulnerable to cyberattacks.

Regression in Machine Learning


Regression in machine learning refers to a supervised learning technique that establishes a relationship between
independent and dependent variables. It helps understand how changes in independent variables affect the dependent
variable.
For example, when buying a mobile phone, the price (dependent variable) depends on factors like RAM, storage, and
camera quality (independent variables). Regression helps find how much each factor influences the price.
Regression is used to predict continuous values based on input data.
Types of Regression:
1. Simple Linear Regression – Establishes a straight-line relationship between one independent variable and one
dependent variable.
2. Multiple Linear Regression – Models the relationship between two or more independent variables and a
dependent variable using a straight line.
3. Polynomial Regression – Fits a curved (polynomial) relationship between the independent and dependent
variables, useful when data is not linear.

1. The dependent variable (target) is what we are trying to predict, such as the price of a house.
2. The independent variables (features) are the factors that influence this prediction, like the locality, number of rooms,
and house size.
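Tying the house-price example above to code, here is a minimal sketch of multiple linear regression; scikit-learn is assumed and all numbers are invented purely for illustration:

```python
# Minimal sketch of multiple linear regression (illustrative data, scikit-learn assumed).
import numpy as np
from sklearn.linear_model import LinearRegression

# Independent variables (features): [number of rooms, house size in square metres]
X = np.array([[2, 60], [3, 80], [3, 95], [4, 120], [5, 150]])
# Dependent variable (target): house price in thousands (made-up values)
y = np.array([150, 200, 230, 300, 380])

model = LinearRegression().fit(X, y)
print(model.coef_)                   # how much each feature influences the price
print(model.predict([[4, 110]]))     # predicted price for a new house
```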

Advantages of Regression
• Simple to understand and explain.
• Trains quickly and works well even on fairly small datasets (though strong outliers can distort the fit).
• Can easily handle straight-line (linear) relationships between variables.
Disadvantages of Regression
• Assumes that the relationship between variables is always a straight line.
• Can give incorrect results if two or more independent variables are too similar (multicollinearity).
• Not the best choice for very complex relationships.

Data Visualization:
Data Visualization is the process of turning complex data or predictions into interactive and visually appealing graphs or
charts, making it easier to understand the results.
This is an optional feature you provide to your clients to help them better understand the output. It includes various types
of graphs, such as:
1. Bar Chart – Uses rectangular bars to compare different categories.
2. Line Chart – Shows trends over time using a continuous line.
3. Pie Chart – Represents proportions of a whole in a circular format.
4. Scatter Plot – Displays relationships between two variables using dots.
5. Histogram – Shows the distribution of data over a range.
6. Heatmap – Uses colors to represent values in a matrix or table.
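As a small illustration of two of the chart types above, the sketch below assumes matplotlib and uses toy data:

```python
# Minimal sketch: a bar chart and a scatter plot (matplotlib assumed, toy data).
import matplotlib.pyplot as plt

categories = ["A", "B", "C"]
counts = [10, 25, 17]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.bar(categories, counts)                       # bar chart: compare categories
ax1.set_title("Bar Chart")

ax2.scatter([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])     # scatter plot: relationship between two variables
ax2.set_title("Scatter Plot")

plt.tight_layout()
plt.show()
```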

Data Preprocessing:
Data Preprocessing is the process of cleaning, integrating, and transforming data before it is used as training data. As the name suggests, preprocessing means processing the data before the algorithm uses the training dataset.

For Example: Imagine you conducted a survey among students about their study habits and satisfaction levels, but the
collected data has issues. Some responses have missing values, duplicate entries, and inconsistent formats (e.g., "5 hrs"
vs. "Five hours"). There are also outliers like unrealistic study hours ("25 hours per day") and irrelevant data such as
"Email ID." Additionally, different rating scales (1-5 vs. 1-10) and typographical errors ("exelent" instead of "excellent")
can affect analysis. These issues need to be fixed through Data Preprocessing before further use.

Data Preprocessing Steps:


1. Cleaning: Converting data according to the requirements of the training dataset. For example, if there are
null/empty values but the training data should not contain them, we must handle them appropriately. Unwanted/
Irrelevant data is considered noise in machine learning. Therefore, we perform denoising (reducing noise) to
improve data quality.
2. Integration: Gathering data from different sources into a single system. Since different databases may have different schemas and formats, we first standardize them into a unified structure and then combine the data in one place. This process is known as data integration.
3. Reduction: Reducing the dimensionality of data using techniques such as Principal Component Analysis (PCA).
Additionally, numeric values can be reduced for storage efficiency, and compression techniques can be applied to
save space. However, data quality may slightly decrease in the process. Therefore, we need to compress data in a
way that minimizes quality loss while maintaining accuracy.
4. Transformation: Modifying data slightly to fit within a specified range. This process, called normalization, helps in
faster processing and ensures consistency in data representation.
5. Data Discretization: Grouping continuous values into discrete intervals or bins (see the sketch below).
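The sketch below ties a few of these steps to the student-survey example; pandas is assumed, and all column names and values are invented for illustration:

```python
# Minimal sketch of a few preprocessing steps on the survey example
# (pandas assumed; columns and values are hypothetical).
import pandas as pd

df = pd.DataFrame({
    "study_hours": [5, None, 25, 4, 5],        # a missing value and an unrealistic outlier
    "satisfaction": ["excellent", "exelent", "good", "good", "excellent"],
    "email_id": ["a@x.com", "b@x.com", "c@x.com", "d@x.com", "a@x.com"],
})

# Cleaning: fix typos, fill missing values, drop unrealistic outliers, drop irrelevant columns.
df["satisfaction"] = df["satisfaction"].replace({"exelent": "excellent"})
df["study_hours"] = df["study_hours"].fillna(df["study_hours"].median())
df = df[df["study_hours"] <= 24]
df = df.drop(columns=["email_id"])

# Transformation (normalization): scale study hours into the 0-1 range.
hours = df["study_hours"]
df["study_hours_norm"] = (hours - hours.min()) / (hours.max() - hours.min())

# Discretization: bucket study hours into intervals.
df["study_band"] = pd.cut(df["study_hours"], bins=[0, 2, 5, 24], labels=["low", "medium", "high"])
print(df)
```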

Data augmentation:
Data augmentation is a method used to enhance a dataset’s diversity by applying transformations to existing data
instead of collecting new samples. These transformations, such as rotation, scaling, flipping, or noise addition,
create modified versions while preserving the original labels. This technique helps machine learning models
generalize better, improving their performance and robustness.

This technique is particularly beneficial in image processing tasks.

Advantages of Data Augmentation:


• Improves Model Accuracy – Helps machine learning models learn better by providing more varied data.
• Reduces Overfitting – Prevents the model from memorizing the training data and helps it perform well on new
data.
• Saves Time & Cost – No need to collect a lot of new data, as existing data is modified to create more samples.
• Works for Different Data Types – Can be used for images, text, and audio to improve learning.

Disadvantages of Data Augmentation:


• Computational Cost – Requires extra processing power to generate and train on augmented data.
• Not Always Useful – Some types of data may not benefit from augmentation, especially if small changes affect
the meaning.
• Risk of Adding Noise – If not done properly, augmentation can create unrealistic data that confuses the model.
• Slower Training – More data means the model takes longer to train.

How Data Augmentation Works for Images


Data augmentation enhances image datasets by applying transformations to create new training examples.
1. Geometric Transformations – Modifies image shape, including rotation, flipping, scaling, translation, and shearing.
2. Color Adjustments – Alters brightness, contrast, saturation, and hue to change image appearance.
3. Kernel Filters – Applies effects like blurring, sharpening, and edge detection.
4. Random Erasing – Hides parts of an image to help models handle missing data.
5. Combining Techniques – Multiple augmentations are applied together for more diverse training data.
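A minimal sketch of these transformations, assuming the torchvision library (any augmentation library works similarly); the file name is hypothetical:

```python
# Minimal sketch of image augmentation (torchvision assumed).
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                 # geometric transformation
    transforms.RandomRotation(degrees=15),                  # geometric transformation
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # color adjustment
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.5),                        # random erasing (applied to the tensor)
])

img = Image.open("example.jpg")      # hypothetical image file
augmented = augment(img)             # a new, label-preserving training example
print(augmented.shape)
```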

Statistics in ML:
Statistics is the study of collecting, organizing, analyzing, and understanding data. It helps summarize information, make
predictions, and draw conclusions. Statistical methods also measure uncertainty, allowing researchers to make confident,
data-based decisions.

In machine learning, statistics plays a key role in collecting, organizing, analyzing, and interpreting data. It helps identify
patterns, make predictions, and measure uncertainty. Statistical methods are essential for:
• Training Models – Providing data-driven insights for learning algorithms.
• Evaluating Performance – Measuring accuracy, variance, and error rates.
• Feature Selection – Identifying the most relevant data points.
• Probability and Uncertainty – Estimating confidence levels and handling noisy data.
• Hypothesis Testing – Validating assumptions and comparing models.
Overall, statistics helps improve the accuracy, reliability, and interpretability of machine learning models.
Types of Statistics
Statistics is divided into two main types:
1. Descriptive Statistics
o Definition: Descriptive statistics help in organizing, summarizing, and presenting data in a meaningful way.
It makes large amounts of data easier to understand using numbers, tables, and graphs.
o Example: If a teacher calculates the average marks of a class from a test, it helps summarize the
performance of all students.
2. Inferential Statistics
o Definition: Inferential statistics allow us to analyze a small sample of data and make predictions or
conclusions about a larger group (population).
o Example: A survey of 100 people is conducted to predict the opinion of an entire city on a new product.
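A small sketch of both types on a toy set of test marks, assuming NumPy; the confidence interval is a rough normal approximation shown for illustration only:

```python
# Minimal sketch: descriptive and (rough) inferential statistics on toy data (NumPy assumed).
import numpy as np

marks = np.array([62, 75, 81, 68, 90, 74, 55, 88])

# Descriptive: summarize the data we have.
print("mean:", marks.mean())
print("median:", np.median(marks))
print("std dev:", marks.std(ddof=1))

# Inferential (rough idea): use this sample to estimate the population mean
# with a simple 95% confidence interval (normal approximation, illustrative only).
se = marks.std(ddof=1) / np.sqrt(len(marks))
print("95% CI:", (marks.mean() - 1.96 * se, marks.mean() + 1.96 * se))
```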

Convex Optimization:

Convex optimization is a mathematical technique used to minimize a cost or loss function, such as the difference between actual and predicted values. In this context, "optimization" refers to finding the best possible solution, while "convex" means the function being optimized is bowl-shaped, so any local minimum is also the global minimum and there is a single best solution. This technique is widely used in machine learning and mathematical modeling to improve accuracy and
efficiency.
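As a hedged sketch of the idea, the example below minimizes a convex loss (mean squared error) with plain gradient descent; the data and learning rate are invented for illustration:

```python
# Minimal sketch: minimizing a convex loss (MSE) with gradient descent (plain NumPy).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.0, 6.1, 7.9])     # roughly y = 2x

w = 0.0                                 # single parameter to learn
lr = 0.01                               # learning rate
for _ in range(1000):
    pred = w * x
    grad = 2 * np.mean((pred - y) * x)  # gradient of the MSE with respect to w
    w -= lr * grad                      # step toward the single best solution

print(w)   # approaches ~2.0, the unique minimum of this convex loss
```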

Probability:
"Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on making predictions or decisions based on data. Since these
predictions often involve uncertainty, probability plays a key role in ML. It helps in modeling uncertainty and making informed guesses about
outcomes, making probability a fundamental concept in machine learning."

Normalizing Datasets in Machine Learning

What is Normalization?

Normalization is a data preprocessing technique used to scale the values of features (columns) so that they all fall within a similar range, typically
between 0 and 1.

It's especially useful when your dataset has features with different units or scales — for example, age in years and income in thousands.
Why is Normalization Important in Machine Learning?

1. Helps the Model Work Better


Some machine learning algorithms (like k-NN, SVM, and neural networks) use distance or gradient calculations. These methods work best when all
features are on a similar scale.
2. Treats All Features Fairly
Without normalization, features with larger ranges can dominate smaller ones, leading to biased results.

3. Speeds Up Learning
Models that use gradient descent (like logistic regression and neural networks) learn faster when the data is normalized.
4. Makes Charts and Graphs Clearer
When visualizing data (like in clustering or PCA), normalized data is easier to understand and compare.

Use normalization when:

• You're using distance-based models (e.g., KNN, K-Means, SVM).


• Features are on different scales (e.g., height in cm, weight in kg).

• You're training neural networks or using PCA.

Don’t use normalization when:

• Using tree-based models (like Decision Trees, Random Forest, XGBoost) — they’re scale-invariant.

Common Normalization Techniques

1. Min-Max Normalization

Scales values between 0 and 1.

Formula:
X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}

• Best when you know the min and max values.
• Sensitive to outliers.
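A minimal sketch of min-max normalization, assuming scikit-learn's MinMaxScaler; the same result can be computed by hand with the formula above:

```python
# Minimal sketch of min-max normalization (scikit-learn assumed; plain NumPy works too).
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two features on very different scales: age (years) and income (thousands).
X = np.array([[25, 40.0],
              [32, 85.0],
              [47, 120.0],
              [51, 60.0]])

scaler = MinMaxScaler()            # scales each column to the [0, 1] range
X_norm = scaler.fit_transform(X)
print(X_norm)

# Equivalent formula applied by hand to the first column:
age = X[:, 0]
print((age - age.min()) / (age.max() - age.min()))
```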
