VI Semester B.C.A.
Examination, July/August - 2024
Machine Learning
SECTION-A
Answer any Four of the following questions.
1. Define machine learning.
Ans: Refer MQP 2 Q 1
2. What is a dataset?
Ans: Refer MQP 2 Q 2
3. Define regression. Give an example.
Ans: Refer MQP 1 Q 2
4. Define clustering. Mention one application.
Ans: Refer MQP 1 Q 6
5. Mention any two tools used for machine learning.
Ans: Refer MQP 2 Q 4
6. What is data splitting?
Ans: Refer MQP 2 Q 3
SECTION-B
Answer any Four of the following questions.
7. Explain types of machine learning with examples.
Ans: Refer MQP 1 Q 7
8. Explain exploratory data analysis and data cleaning.
Ans: Refer MQP 1 Q 11
9. Explain Bayes' theorem with an example.
Ans: Refer MQP 2 Q 7
10. Explain K-means clustering for image segmentation.
Ans: Refer MQP 1 Q 15
11. Explain how DBSCAN works.
Ans: Refer MQP 2 Q 14
12. Write and explain K-Nearest Neighbour Algorithm.
Ans: Refer MQP 1 Q 8
K-NN Algorithm:
Step-1: Choose the Number of Neighbors (K):
The first step in the K-NN algorithm is to select the number of neighbors (K) that will be considered
when making predictions for a new data point. The value of K is a hyperparameter that needs to be
specified before running the algorithm.
Step-2: Calculate Distance:
Compute the distance between the new data point and all the data points in the training set. The
distance metric, commonly Euclidean distance, measures the similarity or proximity between data
points in the feature space.
Step-3: Sort and Select Nearest Neighbors:
After calculating the distances, sort them in ascending order and select the K data points with the smallest distances to the new data point. These K data points are the nearest neighbors to the new data point in the feature space.
Step-4: For Classification:
In the classification task, assign a class label to the new data point based on the majority class among
the K nearest neighbors. The class with the highest frequency among the K neighbors is chosen as the
predicted class for the new data point.
Step-5: For Regression:
In regression tasks, compute the average (or weighted average) of the target values of the K nearest
neighbors. This average value serves as the predicted target value for the new data point in regression
analysis.
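A minimal sketch of the steps above, implemented with NumPy for the classification case; the training arrays and query point below are illustrative placeholders, not data from the question.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, K=3):
    # Step 2: Euclidean distance from the new point to every training point
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Step 3: indices of the K smallest distances (the nearest neighbours)
    nearest = np.argsort(distances)[:K]
    # Step 4: majority vote among the K neighbours (classification)
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [6.0, 7.0], [7.0, 8.0]])
y_train = np.array([0, 0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([2.5, 3.0]), K=3))  # predicts class 0

For regression (Step 5), the majority vote would be replaced by the mean of the K neighbours' target values.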
SECTION-C
Answer any Four of the following questions.
13. Explain main challenges of Machine Learning.
Ans: Refer MQP 2 Q 11
14. Explain how to prepare the data for Machine Learning Algorithms.
Ans: Data preparation is a fundamental aspect of the Machine Learning workflow, essential for
optimizing the data before model training. It involves a series of steps such as cleaning, transforming,
and structuring the data to make it well-suited for the specific algorithm being used. Each step in this
process serves a unique purpose and employs specific techniques to enhance the quality and usability
of the dataset. Proper data preparation is crucial for ensuring the accuracy and effectiveness of the
machine learning model during training and evaluation.
1. Data Cleaning: Data cleaning, also known as data cleansing, is the process of identifying and
correcting errors, inconsistencies, and missing values in a dataset to improve its quality and reliability
for analysis and modelling. Data cleaning is a crucial step in data preprocessing as it ensures that the
data is accurate, complete, and consistent.
i. Handling Missing Values: Missing values in data refer to the absence of information or data points for certain observations or attributes in a dataset. Handling missing values is crucial in data preprocessing to ensure the quality and reliability of the machine learning model.
Example: The Titanic passengers dataset has missing values in the Age and Cabin columns. The passenger information was extracted from various historical sources, and in this case the missing values could not be recovered from those sources.
ii. Handling Outliers: Outliers are data points that significantly differ from other observations in a dataset. These data points can skew statistical analyses and machine learning models, leading to
inaccurate results. Outliers can occur due to various reasons such as measurement errors, data entry
mistakes, or genuine extreme values in the data.
Example: In a dataset containing information about individuals, such as their age, it is common to
encounter outliers, such as ages above 100 years. While some individuals may indeed be over 100
years old, extreme ages can impact statistical analyses and machine learning models if not handled
appropriately.
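A short sketch of both cleaning steps with pandas; the tiny DataFrame and the median/IQR strategies below are illustrative assumptions, not the only possible choices.

import pandas as pd

df = pd.DataFrame({"Age": [22, None, 38, 150, 29], "Fare": [7.25, 71.3, 8.05, 512.3, None]})

# Missing values: fill numeric gaps with the column median (one common strategy)
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Fare"] = df["Fare"].fillna(df["Fare"].median())

# Outliers: clip values outside 1.5 * IQR to the nearest boundary
q1, q3 = df["Age"].quantile([0.25, 0.75])
iqr = q3 - q1
df["Age"] = df["Age"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
print(df)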
2. Data Transformation: Data transformation is a fundamental process in data preprocessing that
involves modifying the original data to make it more suitable for analysis or modeling. This
transformation can help improve the quality of the data, address issues like skewness or outliers, and
enhance the performance of machine learning algorithms.
i. Normalization: Normalization is a type of data transformation that scales the values of numerical
features to a standard range, typically between 0 and 1.
ii. Standardization: Standardization is another data transformation technique that centres the data
around a mean of 0 and scales it to have a standard deviation of 1. It is like converting heights and
weights to z-scores.
iii. Log Transformation: Log transformation is applied to skewed data, like converting income values
to their logarithmic form to handle extreme values.
iv. Encoding Categorical Variables: Converting categorical variables into numerical representations
through techniques like one-hot encoding or label encoding is a form of data transformation.
One-hot Encoding: Creates a new binary column for each category level.
Label Encoding: Assigns a unique integer based on the alphabetical ordering of the categories.
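A brief sketch of these four transformations using scikit-learn and pandas; the column names and values are made up for illustration.

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler, LabelEncoder

df = pd.DataFrame({"income": [30000, 45000, 1200000], "city": ["A", "B", "A"]})

df["income_norm"] = MinMaxScaler().fit_transform(df[["income"]]).ravel()   # scale to [0, 1]
df["income_std"] = StandardScaler().fit_transform(df[["income"]]).ravel()  # mean 0, std 1
df["income_log"] = np.log1p(df["income"])                                  # log transform for skewed values

one_hot = pd.get_dummies(df["city"], prefix="city")                        # one-hot encoding
df["city_label"] = LabelEncoder().fit_transform(df["city"])                # label encoding
print(pd.concat([df, one_hot], axis=1))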
3. Data Reduction: Data reduction is a critical step in preparing data for efficient analysis, especially
in contexts involving large datasets or complex models. The process of data reduction involves
diminishing the amount of data that needs to be processed and analyzed without significantly
sacrificing valuable information.
i. Feature Creation: This involves creating new variables from existing data to provide additional
insight to the models. This might involve combining features, deriving new metrics from existing
data, or aggregating data over time or space.
ii. Feature Transformation: Transforming features to enhance their predictive power or making
them more suitable for models. Common transformations include normalization, scaling,
applying mathematical functions like logarithms or exponentials, and more.
4. Feature Engineering: Feature engineering is a fundamental process in the field of machine
learning where raw data is transformed into formatted datasets that machine learning algorithms can
work with more effectively. This process involves creating new features from existing data,
transforming data into more useful formats, or enhancing the quality of data to improve the accuracy
and efficiency of predictive models.
5. Data Splitting: Data splitting in machine learning is the process of dividing the data into separate
subsets to be used at different stages of model building and evaluation. The primary goal of data
splitting is to ensure that the model trained on one set of data can generalize well to new, unseen data.
This helps avoid problems like overfitting, where a model performs well on the training data but
poorly on new data.
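A minimal splitting sketch with scikit-learn; X and y here are placeholder arrays.

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
y = np.arange(10)                  # 10 target values

# 80% for training, 20% for testing; a fixed random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)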
15. Explain confusion matrix and performance evaluation metrics in classification.
Ans: Refer MQP 2 Q 15. The main performance evaluation metrics are:
1. Accuracy: Accuracy is the most commonly used metric for evaluating classification models. It
measures the proportion of correct predictions made by the model out of the total number of
predictions. It is calculated as the ratio of the number of correct predictions to the total number of
predictions. Accuracy = (TP + TN) / (TP + FP + TN + FN)
A high accuracy score indicates that the model is making correct predictions most of the time.
However, accuracy can be misleading when the class distribution is imbalanced.
2. Precision: Precision measures the proportion of true positives among the instances that the model predicted as positive. It is calculated as the ratio of the number of true positives to the total number of
instances predicted as positive. Precision = TP / (TP + FP)
A high precision score indicates that the model is making fewer false positive predictions. It is useful
when the cost of false positives is high.
3. Recall: Recall measures the proportion of true positives among the instances that are actually
positive. It is calculated as the ratio of the number of true positives to the total number of actual
positive instances. Recall = TP / (TP + FN)
A high recall score indicates that the model is capturing a majority of the actual positive instances. It
is useful when the cost of false negatives is high.
4. F1 score: F1 score is the harmonic mean of precision and recall. It provides a balance between the
two metrics and is particularly useful when the class distribution is imbalanced.
F1 score = 2 * (precision * recall) / (precision + recall)
A high F1 score indicates that the model has both good precision and recall. It is useful when both false positives and false negatives are equally important.
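A short sketch computing the confusion matrix and all four metrics with scikit-learn; the y_true and y_pred labels below are illustrative.

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))   # [[TN FP], [FN TP]]
print(accuracy_score(y_true, y_pred))     # (TP + TN) / total
print(precision_score(y_true, y_pred))    # TP / (TP + FP)
print(recall_score(y_true, y_pred))       # TP / (TP + FN)
print(f1_score(y_true, y_pred))           # harmonic mean of precision and recall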
16. Explain any four unsupervised learning techniques.
Ans: Refer MQP 2 Q 16
17. Explain:
a) Scikit-learn and pandas.
Ans: Refer MQP 2 Q 13
b) Explain the steps to select and train a model.
Ans: Process of Selecting and Training a Machine Learning Model
Step 1: Model Selection
The choice of machine learning model is crucial and depends on the nature of the problem at hand.
Understanding the problem type, whether it involves regression, classification, clustering, or other tasks, is essential for selecting the most appropriate model that can effectively address the specific requirements and characteristics of the data. For a task like predicting student performance (numeric
score prediction), regression models are typically suitable.
Example: For predicting student performance, a regression task could start with simpler models like
linear regression but may require more complex models like Random Forest Regressors if the
relationships between features and the target are non-linear. Given the initial analysis suggesting non-
linear patterns, we opt for a Random Forest Regressor due to its ability to handle complex data
structures and provide robustness against overfitting.
Step 2: Model Training
Model training is a critical step in the machine learning pipeline where the selected model is exposed
to the prepared dataset to learn patterns and relationships between the input features and the target
variable. During this phase, the model adjusts its internal parameters based on the training data to
minimize prediction error and improve its performance.
Example: The Random Forest model is trained using features such as study hours, attendance
records, and historical grades. This model does not require setting many hyperparameters initially but
does involve decisions about the number of trees and their depth, which we initially set to default
values for a baseline model.
Step 3: Model Evaluation
The model's performance is assessed using a validation set, a portion of the data held out from training that the model has not seen before. This evaluation helps in assessing the model's learning capability and its ability to generalize to new data.
Example: Evaluate the initial Random Forest model by calculating its Root Mean Square Error
(RMSE) on the validation set. If the performance is unsatisfactory, it suggests the need for tuning
hyperparameters or possibly revisiting the feature engineering step.
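A sketch of Steps 2 and 3 together: training a Random Forest Regressor with default hyperparameters and computing RMSE on a validation split. The synthetic "student" features below (study hours, attendance, past grade) are invented for illustration.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))                       # study hours, attendance, past grade
y = 5 * X[:, 0] + 2 * X[:, 1] + 3 * X[:, 2] + rng.normal(0, 2, 200)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestRegressor(random_state=42)              # default hyperparameters as a baseline
model.fit(X_train, y_train)

rmse = np.sqrt(mean_squared_error(y_val, model.predict(X_val)))
print(f"Validation RMSE: {rmse:.2f}")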
Step 4: Hyperparameter Tuning
Hyperparameters are parameters that are set before the learning process begins. They control the
learning process and model behavior but are not learned from the data. Examples include learning
rate, regularization strength, number of hidden layers in a neural network, and kernel type in Support
Vector Machines.
Fine-tuning the model's hyperparameters is crucial to enhance its performance. Techniques like grid
search or random search can be employed to systematically explore different hyperparameter
combinations.
Example: Adjusting hyperparameters such as the number of trees or tree depth in a Random Forest
model through grid search can help minimize RMSE on the validation set, thereby improving model
accuracy.
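A minimal grid-search sketch for the same Random Forest setup; the parameter values in the grid are arbitrary choices for illustration.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))                       # same synthetic student features as above
y = 5 * X[:, 0] + 2 * X[:, 1] + 3 * X[:, 2] + rng.normal(0, 2, 200)

param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5, 10]}
search = GridSearchCV(RandomForestRegressor(random_state=42), param_grid,
                      scoring="neg_root_mean_squared_error", cv=5)
search.fit(X, y)
print(search.best_params_, "RMSE:", -search.best_score_)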
Step 5: Cross-Validation
Cross-validation ensures the model's stability across various data subsets. By repeatedly splitting the
data into training and validation sets, training on each subset, and averaging the results, the model's
robustness is assessed.
Example: Implementing 10-fold cross-validation on a Random Forest model involves dividing the
training data into 10 subsets, using each as a validation set once, and training on the remaining 9
subsets. The average RMSE across all validations provides a reliable performance estimate.
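A sketch of 10-fold cross-validation with scikit-learn, again on synthetic illustrative data.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + 3 * X[:, 2] + rng.normal(0, 2, 200)

scores = cross_val_score(RandomForestRegressor(random_state=42), X, y,
                         scoring="neg_root_mean_squared_error", cv=10)
print("Average RMSE over 10 folds:", round(-scores.mean(), 2))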
Step 6: Final Model Training
After identifying the best model and hyperparameters, the final model is trained on the entire training
dataset to leverage all available data for optimal learning.
Example: The optimized Random Forest model, with fine-tuned hyperparameters from cross-
validation, is trained on the complete student dataset to maximize its predictive capabilities.
Step 7: Model Testing
The trained model is tested on a separate dataset that was not used during training or validation to
evaluate its performance and real-world applicability.
Example: The final test for the Random Forest model involves predicting exam scores for new
students based on their study habits and past performance. The model's predictions are compared
against actual scores to calculate the final RMSE, assessing its effectiveness.
18. Write note on:
a) Entropy and information gain.
Ans: Entropy and information gain are key concepts in information theory, data science, and machine learning. Entropy is a measure of uncertainty or unpredictability, whereas information gain is the amount of uncertainty removed by a particular decision or split. In data science, entropy can be used to assess the variety or unpredictability of a dataset, while information gain helps identify the attributes that are most useful to include in a model. The key distinction is that entropy quantifies the impurity of a dataset, whereas information gain quantifies how much that impurity is reduced when the data is split on a feature.
1. Entropy: The term "entropy" comes from the study of thermodynamics, and it describes how
chaotic or unpredictable a system is. Entropy is a measurement of a data set's impurity in the context
of machine learning. In essence, it is a method of calculating the degree of uncertainty in a given
dataset.
2. Information Gain: Information gain is a statistical metric used to assess a feature's usefulness for splitting a dataset. It is an important idea in machine learning and is frequently utilized in decision tree algorithms. Information gain is estimated by comparing the dataset's entropy before and after splitting on a feature; the higher a feature's information gain, the more relevant it is to classifying the data.
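A small sketch computing entropy and the information gain of one candidate split; the class labels and the split below are illustrative.

import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def information_gain(parent, left, right):
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted          # entropy before split minus weighted entropy after

parent = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
left, right = parent[:4], parent[4:]           # a candidate split on some feature
print(entropy(parent), information_gain(parent, left, right))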
b) Partitioning clustering and hierarchical clustering.
Ans:
1. Partitioning Clustering
Definition:
Partitioning clustering divides the dataset into distinct, non-overlapping groups (clusters) based on a
specific criterion. Each data point belongs to exactly one cluster.
Key Characteristics:
• Flat Structure: The result is a flat partition of the data into clusters.
• Number of Clusters: The number of clusters (k) must be specified in advance (e.g., in k-
means clustering).
• Centroid-Based: Many partitioning methods, like k-means, use centroids to represent
clusters. The algorithm iteratively assigns points to the nearest centroid and updates the
centroids based on the assigned points.
• Efficiency: Generally faster and more efficient for large datasets compared to hierarchical
methods.
• Sensitivity to Initialization: The results can vary based on the initial placement of centroids,
especially in k-means.
Common Algorithms:
• K-Means Clustering: Partitions data into k clusters by minimizing the variance within each
cluster.
• K-Medoids (PAM): Similar to k-means but uses actual data points (medoids) as cluster
centres.
Use Cases: Market segmentation, Image compression, Document clustering
2. Hierarchical Clustering
Definition:
Hierarchical clustering creates a tree-like structure (dendrogram) that represents the nested grouping
of data points. It can be agglomerative (bottom-up) or divisive (top-down).
Key Characteristics:
• Tree Structure: Produces a hierarchy of clusters, allowing for different levels of granularity.
• No Predefined Number of Clusters: The number of clusters does not need to be specified in
advance; it can be determined by cutting the dendrogram at a desired level.
• Agglomerative vs. Divisive:
o Agglomerative: Starts with each data point as its own cluster and merges them based
on similarity until one cluster remains.
o Divisive: Starts with one cluster containing all data points and splits it into smaller
clusters.
• Distance Metrics: The choice of distance metric (e.g., Euclidean, Manhattan) and linkage
criteria (e.g., single, complete, average) significantly affect the clustering results.
Common Algorithms:
• Agglomerative Clustering: Merges clusters based on distance metrics.
• Divisive Clustering: Splits clusters based on distance metrics.
Use Cases: Gene expression analysis, Social network analysis, Document clustering.
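A brief sketch contrasting the two families on the same toy data, using scikit-learn's k-means (partitioning) and agglomerative (hierarchical) implementations; the points are made up.

import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

X = np.array([[1, 1], [1.5, 2], [1, 2], [8, 8], [8, 9], [9, 8.5]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)     # k must be chosen in advance
agg = AgglomerativeClustering(n_clusters=2, linkage="average").fit(X)  # cut the hierarchy at 2 clusters

print("k-means labels:      ", kmeans.labels_)
print("agglomerative labels:", agg.labels_)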
Machine Learning
Model Questions
Short Answer Questions
1. Applications of machine learning.
Ans: The main applications of machine learning include:
1. Image Recognition
2. Speech Recognition
3. Medical Diagnosis
4. Traffic Prediction
5. Product Recommendations
6. Online Fraud Detection
7. Self-Driving Cars
8. Email Spam Filtering
9. Automatic Language Translation
10. Virtual Personal Assistants
11. Stock Market Trading
2. Scikit-learn with its features.
Ans: Scikit-learn (sklearn) is a popular Python library used for machine learning tasks. It provides a
wide range of tools for various stages of the machine learning process, including:
Features:
• Simple and Efficient: Easy-to-use interface for building and evaluating machine learning models.
• Comprehensive: Supports a wide range of supervised and unsupervised learning algorithms, including regression, classification, clustering, and dimensionality reduction.
• Model Selection: Tools for model selection and evaluation, such as cross-validation, grid search, and performance metrics.
• Data Preprocessing: Includes tools for data preprocessing like scaling, normalization, encoding categorical variables, handling missing values, etc.
• Integration: Seamless integration with other scientific Python libraries like NumPy, SciPy, and matplotlib.
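A small sketch that exercises several of these features at once (preprocessing, a classifier, and cross-validation) on scikit-learn's built-in Iris dataset.

from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))  # preprocessing + model
print(cross_val_score(pipe, X, y, cv=5).mean())                           # mean accuracy over 5 folds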
3. Visualization of data.
Ans: Data visualization is the graphical representation of information and data. By using visual
elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and
understand trends, outliers, and patterns in data. Steps involved are:
1. Define the Purpose
2. Collect and Prepare Data
3. Choose the Right Visualization Type
4. Design the Visualization
5. Interpret and Share
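A minimal sketch of two common plots with matplotlib; the data below is synthetic.

import numpy as np
import matplotlib.pyplot as plt

values = np.random.default_rng(0).normal(50, 10, 500)

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].hist(values, bins=20)                  # distribution of a single variable
axes[0].set_title("Histogram")
axes[1].scatter(values[:-1], values[1:], s=5)  # relationship between two variables
axes[1].set_title("Scatter plot")
plt.tight_layout()
plt.show()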
4. List the different clustering techniques
Ans: Refer MQP 1 Q 9
5. Different linear models
Ans: The different types of linear models include:
1. Simple Linear Regression
2. Multiple Linear Regression
3. Polynomial Regression
4. Ridge Regression
5. Lasso Regression
6. Robust Regression
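A short sketch fitting three of the listed models on the same toy data to compare their coefficients; the data is generated for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(100, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 1, 100)

for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    model.fit(X, y)
    print(type(model).__name__, np.round(model.coef_, 2))   # regularization shrinks coefficients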
Long answer Questions
6. Decision tree algorithm with its advantages and disadvantages.
Ans: Refer MQP 1 Q 18
7. Steps in Data Preparation process
Ans: Data Preparation is a crucial step in the machine learning pipeline that involves transforming
raw data into a suitable format for analysis and modeling. The process ensures the data is clean,
consistent, and ready for use in training models.
1. Data Collection:
• Sources: Gather data from various sources such as databases, APIs, surveys, or files.
• Integration: Combine data from multiple sources, ensuring consistency and completeness.
2. Data Cleaning:
• Handling Missing Values: Address missing data using imputation methods like mean
substitution, median substitution, or filling with a default value.
• Outlier Detection and Removal: Identify and handle outliers that can skew analysis, using
statistical methods or domain knowledge.
• Removing Duplicates: Detect and eliminate duplicate records to ensure data integrity.
3. Data Transformation:
• Normalization: Scale numerical features to a standard range, such as [0, 1] or [-1, 1], to ensure consistent input for models.
• Encoding Categorical Variables: Convert categorical data into numerical format using
techniques like one-hot encoding or label encoding.
• Feature Engineering: Create new features that capture relevant information from existing
data to enhance model performance.
4. Data Reduction:
• Dimensionality Reduction: Reduce the number of features using techniques like PCA to
simplify the dataset while retaining essential information.
• Feature Selection: Identify and retain the most relevant features, removing those that do
not contribute significantly to the model.
5. Data Splitting:
• Training, Validation, and Test Sets: Split the dataset into subsets to train, validate, and test
the model, ensuring unbiased evaluation of model performance.
6. Data Augmentation:
• Synthetic Data Generation: Create additional data samples to balance classes or increase
the diversity of the dataset, especially in image and text data.
7. Data Integration and Finalization:
• Combining Data Sources: Merge data from different sources into a unified dataset.
• Final Checks: Perform final validation and ensure the data is correctly formatted and ready
for modeling.
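Relating to step 4 above (dimensionality reduction), a minimal PCA sketch on scikit-learn's built-in Iris dataset; reducing to 2 components is an arbitrary illustrative choice.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
X_reduced = PCA(n_components=2).fit_transform(X)   # keep the 2 strongest components
print(X.shape, "->", X_reduced.shape)              # (150, 4) -> (150, 2)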
8. Naïve Bayes classification algorithm with an example
Ans: Refer MQP 2 Q 7
9. K means Clustering Algorithms
Ans: Refer MQP 1 Q 15
10. DBSCAN algorithm
Ans: Refer MQP 2 Q 14
11. Differentiate between supervised and unsupervised learning algorithms, listing a few
algorithms under each type.
Ans: Refer MQP 2 Q 9