Study Material For Reference

The document provides an overview of artificial intelligence (AI), its subsets like machine learning (ML) and deep learning (DL), and various applications such as ChatGPT and Google Translate. It discusses different machine learning models, including supervised learning techniques like classification and regression, along with examples and algorithms like K-Nearest Neighbors and Support Vector Machines. Additionally, it covers decision tree analysis and random forests for customer analysis and prediction tasks.

Uploaded by

Muskan Sikarwar


Artificial Intelligence-An Overview
❑ Artificial intelligence (AI) is the theory and development of
computer systems capable of performing tasks that historically
required human intelligence, such as recognizing speech, making
decisions, and identifying patterns.
❑ AI is an umbrella term that encompasses a wide variety of
technologies, including machine learning, deep learning, and
natural language processing (NLP).

❑ ChatGPT: Uses large language models (LLMs) to generate text in
response to questions or comments posed to it.
❑ Google Translate: Uses deep learning algorithms to translate text from
one language to another.
❑ Netflix: Uses machine learning algorithms to create personalized
recommendation engines for users based on their previous viewing
history.
❑ Tesla: Uses computer vision to power self-driving features on their
cars.
Strong AI vs Weak AI

How does AI Work?

What is Machine Learning?
• Machine learning (ML) is a subset of artificial intelligence (AI) and computer science
that uses data and algorithms to mimic how humans learn, so that machines
gradually improve their accuracy.
• ML allows software applications to improve their prediction accuracy without
being explicitly programmed to do so. It predicts new output values by using
historical data as input.

Machine learning makes it possible to
discover patterns in supply chain data by
relying on algorithms that quickly pinpoint
the factors most influential to a supply
network's success, while constantly
learning in the process.

Relationship between AI, ML and DL

Machine Learning Models

Supervised Learning

Supervised Learning-Classification
Classification :
1. Binary Classification Problem
In binary classification, the task involves classifying instances into one of two classes
or categories.
Examples
✓ Spam email detection (classifying emails as spam or not spam).
✓ Medical diagnosis (predicting whether a patient has a particular disease or not).
✓ Customer churn prediction (determining whether a customer will churn or not).
2. Multi-class Classification Problem
In multiclass classification, the task involves classifying instances into one of three
or more classes or categories.
Examples
✓ Species classification in biology (identifying different species of plants).
✓ Sentiment analysis with multiple classes (e.g., positive, negative, neutral).
Examples of Classification Problems
Customer Segmentation:
Classes: Different segments or groups of customers (e.g., high-value customers, occasional
buyers, price-sensitive customers).
Features: Demographic information (age, gender, location), behavioral data (purchase
frequency, average order value), psychographic data (lifestyle, interests), and transaction
history.
Churn Prediction:
Classes: Churners (customers who leave) vs. Non-churners (customers who stay).
Features: Customer demographics, usage patterns (frequency of interaction with the
product/service), tenure (length of time as a customer), customer service interactions, and
recent activity (e.g., decreased usage).
Credit Scoring:
Classes: Creditworthy vs. Non-creditworthy applicants.
Features: Credit history (credit score, payment history), financial information (income, debt-to-
income ratio), employment status, length of credit history, and other relevant factors such as
outstanding debts and loan history.
Fraud Detection:
Classes: Genuine transactions vs. Fraudulent transactions.
Features: Transaction amount, location, time, frequency, deviations from typical behavior, IP
address, device information, and other contextual data.

Sentiment Analysis:
Classes: Positive sentiment, Negative sentiment, Neutral sentiment.
Features: Text data (customer reviews, social media posts), linguistic features (word frequency,
sentiment words, emoticons), metadata (time of posting, user demographics), and context
(product or service being reviewed).
Product Recommendation:
Classes: Recommended products or services for each customer.
Features: Customer behavior (purchase history, browsing history, items added to cart), product
attributes (price, category, brand), similarity measures between products (collaborative
filtering, content-based filtering), and contextual information (seasonality, trends).

Fault Diagnosis:
Classes: Types of faults or malfunctions (e.g., Mechanical fault, Electrical fault, Software error).
Features: Sensor data (temperature, pressure, vibration), performance metrics (speed, efficiency),
maintenance logs, environmental conditions, and historical failure data.

Market Segmentation:
Classes: Different market segments (e.g., Urban, Suburban, Rural).
Features: Demographic data (population density, income levels), geographic information
(location, climate), economic indicators (GDP per capita, unemployment rates), consumer
behavior (buying habits, brand preferences), and market size.

Classification Models
1) Decision Trees

2) Random Forests

3) Support Vector Machines

4) Logistic Regression

5) Neural Networks

6) K-Nearest Neighbors (KNN)

7) Naïve Bayes

Supervised Learning- Regression
Regression:
▪ Regression problems involve predicting a continuous numerical value
rather than class labels.
▪ These problems are often focused on estimating or forecasting quantities
that are essential for decision-making.
▪ It investigates the relationship between one or more independent variables
and a dependent variable. Its primary goal is to estimate the strength and
direction of the relationship between these variables.
▪ Regression analysis, in essence, helps us understand how changes in one
variable can affect another.
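
As a minimal sketch of how regression estimates the direction and strength of a relationship, the code below fits a least-squares line to a small made-up dataset. The advertising-spend and sales figures are hypothetical and are not taken from this material. The sign of the slope gives the direction of the relationship, and the Pearson correlation coefficient is one common way to express its strength.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pearson_r(x, y):
    """Pearson correlation: +1 or -1 is a perfect linear relationship, 0 is none."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)

# Hypothetical data: advertising spend (independent) and sales (dependent).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(ad_spend, sales)
strength = pearson_r(ad_spend, sales)
forecast = intercept + slope * 6.0  # predicted sales for a spend of 6.0

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={strength:.2f}, forecast={forecast:.2f}")
```

A positive slope here means sales rise as spend rises; real data would rarely fall exactly on a line as this toy dataset does.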

Examples of Regression Problems
Sales Forecasting:
Target: Future sales volume (units or revenue).
Features: Historical sales data, Advertising spend, Seasonal effects, Pricing strategies,
Economic indicators (e.g., GDP growth, inflation).
Customer Lifetime Value (CLV) Prediction:
Target: Predicted customer lifetime value (monetary value).
Features: Customer demographics (age, gender, location), Purchase history
(frequency, recency, monetary value), Customer engagement metrics (website visits,
email opens), Marketing campaign responses, Product/service preferences.
Inventory Level Estimation:
Target Variable: Number of units in inventory.
Features: Sales history, lead times, order frequency, shelf life.

Price Optimization:
Target: Optimal price point.
Features: Historical pricing and sales data, Competitor pricing, Market demand,
Production and distribution costs, Brand positioning.
Credit Scoring:
Target: Credit score or probability of default.
Features: Financial history (credit card usage, loan repayments), Income and
employment status, Credit utilization ratio, Length of credit history, Number of
recent credit inquiries, Demographic information.
Warehouse Throughput Prediction:
Target Variable: Throughput in units per hour.
Features: Warehouse layout, historical throughput data, order volume.

Regression Models
1) Linear Regression: Simple and interpretable, suitable for problems with linear
relationships.
2) Decision Trees and Random Forests: Effective for capturing non-linear relationships and
handling complex interactions.
3) Support Vector Regression (SVR): Extends SVM to regression problems, suitable for
cases with non-linear patterns.
4) Neural Networks (Deep Learning): Can capture complex relationships in data but may
require a larger dataset.
5) Time Series Models, such as Long Short-Term Memory (LSTM) networks: Especially relevant
for problems involving time-dependent data.

K-Nearest Neighbors (KNN) Algorithm
1) The K-NN algorithm assumes that similar cases belong together: it compares a new case with the available cases and puts it into
the category of the cases it most resembles.
2) The K-NN algorithm stores all the available data and classifies a new data point based on similarity, so new data can be
assigned to a well-suited category as soon as it appears.
3) The K-NN algorithm can be used for regression as well as classification, but it is mostly used for classification problems.
4) K-NN is a non-parametric algorithm, which means it makes no assumptions about the underlying data distribution.
5) It is also called a lazy learner because it does not learn from the training set immediately; instead it stores the dataset
and performs the computation at classification time.
6) In the training phase the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into the
category most similar to it.

Distance Metrics Used in KNN Algorithm
Euclidean Distance

Manhattan Distance

Minkowski Distance
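
The three metrics listed above can be sketched in a few lines. This is an illustrative implementation using the standard textbook definitions; the function names are chosen here and do not come from the material. Minkowski distance with p = 1 reduces to Manhattan distance and with p = 2 to Euclidean distance.

```python
from math import sqrt

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p):
    """Generalized distance with order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

point_a = (0.0, 0.0)
point_b = (3.0, 4.0)

print(euclidean(point_a, point_b))     # 5.0
print(manhattan(point_a, point_b))     # 7.0
print(minkowski(point_a, point_b, 3))  # lies between the max coordinate gap (4) and the Euclidean distance (5)
```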

Advantages and Disadvantages
➢ Advantages of the KNN Algorithm
Easy to implement – The complexity of the algorithm is low.
Adapts Easily – Because the KNN algorithm keeps all the data in memory, whenever a new example or data point is added the
algorithm adjusts to it, and the new example contributes to future predictions.
Few Hyperparameters – The only parameters required are the value of k and the choice of distance metric, both of which can be
selected by comparing results on an evaluation metric.

➢ Disadvantages of the KNN Algorithm

Does not scale – KNN is a lazy algorithm: it stores the whole dataset and defers all computation to prediction time, so it needs
a lot of computing power and data storage. This makes it both time-consuming and resource-intensive on large datasets.
Curse of Dimensionality – KNN is affected by the curse of dimensionality (related to the peaking phenomenon): the algorithm has
a hard time classifying data points properly when the dimensionality is too high.
Prone to Overfitting – Because the algorithm is affected by the curse of dimensionality, it is also prone to overfitting.
Feature selection and dimensionality reduction techniques are generally applied to deal with this problem.

How does KNN Work?
➢ Step-1: Select the number K of neighbors.
➢ Step-2: Calculate the Euclidean distance from the new data point to each point in the dataset.
➢ Step-3: Take the K nearest neighbors according to the calculated Euclidean distance.
➢ Step-4: Among these K neighbors, count the number of data points in each category.
➢ Step-5: Assign the new data point to the category with the largest number of neighbors.
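
The five steps above can be sketched as a short function. This is a minimal illustration and not an optimized implementation; the training points, labels, and the name `knn_classify` are hypothetical. It uses Euclidean distance via `math.dist` (Python 3.8+) and a simple majority vote.

```python
from collections import Counter
from math import dist

def knn_classify(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (features, label) pairs
    query: tuple of features for the new data point
    k:     number of neighbors to consult (Step 1)
    """
    # Step 2: distance from the query to every stored point.
    by_distance = sorted(train, key=lambda item: dist(item[0], query))
    # Step 3: keep the k closest points.
    nearest = by_distance[:k]
    # Step 4: count the labels among those neighbors.
    votes = Counter(label for _, label in nearest)
    # Step 5: return the label with the most votes.
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two categories.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 7.5), "B"), ((6.5, 8.0), "B"),
]

print(knn_classify(train, (2.0, 2.0), k=3))  # A
print(knn_classify(train, (6.0, 7.0), k=3))  # B
```

With two categories, an odd k avoids tied votes. Note that all the work happens at prediction time, which is the "lazy learner" behavior.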

Support Vector Machine
• The goal of the SVM algorithm is to find the best line or decision boundary that separates n-dimensional space into classes, so
that a new data point can be placed in the correct category in the future. This best decision boundary is called a hyperplane.
• SVM chooses the extreme points/vectors that help define the hyperplane. These extreme cases are called support vectors, which is
why the algorithm is termed a Support Vector Machine.

• SVMs can handle both linearly separable and non-linearly separable
data. They do this by using different types of kernel functions, such as
the linear kernel, polynomial kernel or radial basis function (RBF)
kernel. These kernels enable SVMs to capture complex
relationships and patterns in the data.
• The kernel function plays a critical role in SVMs, as it makes it possible
to map the data from the original feature space to the kernel space. The
choice of kernel function can have a significant effect on the
performance of the SVM algorithm, and choosing the best kernel
function for a particular problem depends on the characteristics of the
data.
❑ Linear Kernel
❑ Polynomial Kernel
❑ RBF Kernel
❑ Sigmoid Kernel
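
For reference, the four kernels listed above can be written as plain functions. These are common textbook forms; the parameter names (`gamma`, `coef0`, `degree`) and their default values are assumptions made for this sketch, and specific libraries may parameterize them differently.

```python
from math import exp, tanh

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def linear_kernel(a, b):
    return dot(a, b)

def polynomial_kernel(a, b, degree=3, coef0=1.0):
    return (dot(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=0.5):
    # Depends only on the squared distance between the points.
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-gamma * squared_distance)

def sigmoid_kernel(a, b, gamma=0.5, coef0=0.0):
    return tanh(gamma * dot(a, b) + coef0)

u = (1.0, 2.0)
v = (3.0, 4.0)

print(linear_kernel(u, v))                 # 11.0
print(polynomial_kernel(u, v, degree=2))   # (11 + 1)^2 = 144.0
print(rbf_kernel(u, u))                    # 1.0 for identical points
print(rbf_kernel(u, v))                    # shrinks toward 0 as points move apart
print(sigmoid_kernel(u, v))                # always between -1 and 1
```

Each kernel returns a similarity score for a pair of points; the SVM uses these scores in place of plain inner products, which is what allows non-linear decision boundaries.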

Decision Tree Analysis

Customer Analysis in Retail using Decision Tree
While the food industry and global brands continuously extend their offerings,
category managers in retail struggle to fit the available assortment into
their stores. The space allocated to a particular category is limited, and it can
quickly become congested if no category management is done to prevent product
proliferation.

Through category management, the retailer needs to answer the following main
questions:

1) What is the optimal space allocated for the category?

2) What is the optimal assortment (range or variety) for the category?
3) How should the category be segmented in the aisle, according to the customer flow?

4) What is the optimal space allocated to each product (facing), and where should the
products be displayed?

A retailer wants to understand how customers make purchases in each product
category. They start by interviewing their customers, and the resulting decision tree can
have many different structures:

1. Brand → Shape → Weight → Type of wheat

2. Weight → Shape → Brand → Type of wheat
3. Type of wheat → Weight → Shape → Brand

The first option means that customers are very loyal to the brand, as brand is the
first decision criterion (entry key) for the category. The demand transfer between brands
in this case is very low, so the retailer should focus on top brands and group all the
products by brand.

Well, but what if each customer has a different decision tree?

Customer behavior can be identified from the purchase history of each customer.
That’s where analytics can help!

Retailers have a secret weapon for this: the loyalty card.

It lets them track each customer's purchase history and generate a global pattern
for each category.

• A dendrogram is generated based on a similarity coefficient and is then
analyzed by the category manager to identify patterns in the generated
groups.
• The manager may see that products are grouped by brand or by another attribute, and can
then extract a decision tree for the category and use it during the category
management process.
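
As a rough sketch of how such a dendrogram could be built, the code below groups products by the overlap in the customers who buy them. Everything specific here is an assumption: the material does not say which similarity coefficient is used, so Jaccard similarity is swapped in for illustration, the clustering is simple single-linkage merging, and the products and customer ids are hypothetical loyalty-card data.

```python
from itertools import combinations

# Hypothetical loyalty-card data: product -> ids of customers who bought it.
buyers = {
    "BrandA spaghetti": {1, 2, 3, 4},
    "BrandA penne": {1, 2, 3},
    "BrandB spaghetti": {5, 6},
    "BrandB penne": {5, 6, 7, 8},
}

def jaccard(a, b):
    """Share of customers common to both products (assumed coefficient)."""
    return len(a & b) / len(a | b)

def cluster_similarity(c1, c2):
    """Single linkage: similarity of the most similar pair across clusters."""
    return max(jaccard(buyers[p], buyers[q]) for p in c1 for q in c2)

def agglomerate(products):
    """Repeatedly merge the two most similar clusters; return the merge order."""
    clusters = [frozenset([p]) for p in products]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: cluster_similarity(*pair))
        merges.append((set(c1), set(c2), cluster_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges

merges = agglomerate(sorted(buyers))
for left, right, similarity in merges:
    print(f"{sorted(left)} + {sorted(right)} at similarity {similarity:.2f}")
```

In this toy data the two BrandA products merge first and the two BrandB products second, while the brands only join at similarity 0. A category manager reading that merge order would conclude that brand is the first split in the customers' decision tree.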

Working with Decision Tree

Attribute Selection Measures
Information Gain:

1. Entropy measures the randomness or impurity in a system.

2. Information gain is the decrease in entropy. It is computed as the
difference between the entropy before the split and the weighted average entropy after
the dataset is split on the given attribute's values.
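
A small sketch of both quantities, using the standard Shannon entropy with base-2 logarithms. The churn labels and the two candidate splits are hypothetical, chosen only to show that a cleaner split yields a larger information gain.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())

def information_gain(parent, children):
    """Entropy before the split minus the size-weighted entropy after it."""
    total = len(parent)
    weighted_after = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted_after

# Hypothetical node with 5 churners and 5 non-churners.
parent = ["yes"] * 5 + ["no"] * 5

# Split 1 separates the classes only partially.
partial_split = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]
# Split 2 separates them perfectly.
perfect_split = [["yes"] * 5, ["no"] * 5]

print(round(entropy(parent), 3))                          # 1.0
print(round(information_gain(parent, partial_split), 3))  # 0.278
print(round(information_gain(parent, perfect_split), 3))  # 1.0
```

A decision tree that uses information gain would pick the attribute producing the second split, since it removes all of the impurity.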

Gini Index:

Like entropy and information gain, the Gini Index focuses on the purity and impurity of a
node. The Gini Index (or Gini Impurity) measures the probability that a randomly chosen
instance is misclassified when it is labeled at random according to the class distribution
in the node. The lower the Gini Index, the lower the likelihood of misclassification.

For a two-class problem, the Gini Index has a maximum impurity of 0.5 and a maximum purity
of 0, whereas entropy has a maximum impurity of 1 and a maximum purity of 0.
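
The Gini Index can be computed as one minus the sum of squared class proportions. The sketch below uses that standard formula on hypothetical label lists; it also shows that the 0.5 ceiling applies to two classes, since a node with more classes can score higher.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

pure_node = ["yes"] * 4
mixed_node = ["yes"] * 2 + ["no"] * 2
three_class_node = ["a", "b", "c"]

print(gini(pure_node))                   # 0.0, maximum purity
print(gini(mixed_node))                  # 0.5, maximum impurity for two classes
print(round(gini(three_class_node), 3))  # 0.667, above 0.5 because there are three classes
```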

Random Forest

Example of Random Forest
Objective: Predict whether a customer will churn based on various features.
Features: Customer Tenure, Monthly Charges, Contract Type (Month-to-Month, One
Year, Two Years), Internet Service (DSL, Fiber optic, None), Tech Support (Yes/No),
Other possible features like demographics, usage patterns, etc.

1. Tree 1: Predicts Churn: Yes
2. Tree 2: Predicts Churn: No
3. Tree 3: Predicts Churn: Yes
4. ...
5. Tree 100: Predicts Churn: No
Final Prediction: The majority of trees predict
“Yes,” so the final prediction is that the
customer will churn.

Benefits:
• Reduced overfitting
• Stability and robustness
• Accuracy
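
The voting step in this example can be sketched as follows. The exact split of votes is hypothetical (the example only states that a majority of the 100 trees predict "Yes"), and the trees themselves are not modeled; only the aggregation of their predictions is shown.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outcome: 58 of 100 trees predict churn, 42 do not.
tree_votes = ["Yes"] * 58 + ["No"] * 42

final_prediction = majority_vote(tree_votes)
churn_share = tree_votes.count("Yes") / len(tree_votes)

print(final_prediction)  # Yes
print(churn_share)       # 0.58, the share of trees voting for churn
```

Because each tree sees a different sample of the data, individual trees can be wrong in different ways, and the vote averages those errors out.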
