LEC-1
1. Define Pattern.
Ans: A pattern refers to a structured arrangement of data or features that can be identified,
classified, or analyzed based on certain characteristics. It can be a sequence, shape, object, or
signal that follows a recognizable structure.
2. Define Pattern Recognition.
Ans: The process of identifying patterns or regularities in data.
Applications of Pattern Recognition:
• Image & Speech Recognition – Face detection, speech processing.
• Medical Diagnosis – Disease pattern identification.
• Fraud Detection – Unusual transaction spotting.
3. How does Pattern Recognition work?
Ans: How Pattern Recognition Works:
1) Data Collection – Gather raw data (e.g., images, speech).
2) Preprocessing – Clean and normalize data.
3) Feature Extraction – Identify key characteristics (e.g., edges, frequency).
4) Model Training – Train algorithms to recognize patterns.
5) Classification – Apply the model to classify or predict outcomes.
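The five steps above can be sketched end to end on a toy problem. This is only an illustration: the signals, the two features (mean and range), and the nearest-centroid classifier are assumptions chosen to keep the example small, not a prescribed method.

```python
# Toy pattern-recognition pipeline (all data and design choices are illustrative).

# 1) Data collection: raw signals with known labels (made-up values).
raw = [
    ([1, 2, 1, 2], "low"),
    ([2, 1, 2, 1], "low"),
    ([8, 9, 8, 9], "high"),
    ([9, 8, 9, 10], "high"),
]

# 2) Preprocessing: scale every value to the 0..1 range using an assumed maximum.
def preprocess(signal, max_value=10.0):
    return [v / max_value for v in signal]

# 3) Feature extraction: summarize each signal by its mean and its range.
def extract_features(signal):
    return (sum(signal) / len(signal), max(signal) - min(signal))

# 4) Model training: store one centroid (average feature vector) per class.
def train(samples):
    grouped = {}
    for signal, label in samples:
        grouped.setdefault(label, []).append(extract_features(preprocess(signal)))
    return {
        label: tuple(sum(col) / len(col) for col in zip(*feats))
        for label, feats in grouped.items()
    }

# 5) Classification: pick the class whose centroid is closest to the new sample.
def classify(model, signal):
    x = extract_features(preprocess(signal))

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(model, key=lambda label: sq_dist(model[label]))

model = train(raw)
print(classify(model, [7, 9, 8, 8]))
```

In practice each step is usually far more involved (for example, learned features instead of hand-picked ones), but the order of the steps stays the same.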
4. Define Issues in Pattern Recognition.
Ans: Issues in Pattern Recognition:
• Data Quality & Quantity – Insufficient or noisy data affects accuracy (e.g., rare disease
images in medical diagnostics).
• Overfitting – Poor generalization due to excessive focus on training data (e.g., face
recognition fails on new images).
• Class Imbalance – Rare patterns are harder to detect (e.g., fraud detection struggles with
infrequent fraud cases).
• Computational Complexity – High resource demands (e.g., real-time speech recognition on
mobile devices).
5. Define Supervised Learning.
Ans: A machine learning approach where models learn from labeled data to make predictions.
Common Algorithms:
• Linear & Logistic Regression – Predict values & classify data.
• Decision Trees & Random Forest – Rule-based decision-making.
• SVM & k-NN – Classification using boundaries or proximity.
Real-life Examples:
• Spam Detection – Classifying emails.
• Medical Diagnosis – Predicting diseases.
• Credit Scoring – Assessing loan risks.
• Image Recognition – Identifying objects/faces.
Training Set – Labeled dataset with input features & correct outputs (e.g., images labeled as
"cat" or "dog").
6. Define Unsupervised Learning.
Ans: A machine learning approach where models find patterns in unlabeled data.
Common Algorithms:
• Clustering: K-means, Hierarchical Clustering.
• Dimensionality Reduction: PCA, t-SNE.
• Anomaly Detection: Identifying unusual data points.
Real-life Examples:
• Customer Segmentation – Grouping buyers for marketing.
• Market Basket Analysis – Finding product associations.
• Image Compression – Reducing data size efficiently.
• Fraud Detection – Spotting anomalies in transactions.
Training Set – Only input features, no labels (e.g., grouping articles by topic).
7. Define Reinforcement Learning.
Ans: An agent learns by interacting with an environment, receiving rewards or penalties to
maximize long-term gains.
Common Algorithms:
• Q-Learning & DQN – Value-based learning.
• Policy Gradient & Actor-Critic – Direct policy optimization.
Real-life Examples:
• Game AI – AlphaGo, chess bots.
• Robotics – Navigating & object manipulation.
• Autonomous Vehicles – Learning to drive.
• Recommendation Systems – Optimizing suggestions.
Training – No traditional dataset; learning happens via trial and error (e.g., a robot navigating a
maze).
8. Define Training Set and Test Set.
Ans: Training Set – Data used to fit the model (labeled in supervised learning).
Test Set – Separate data, not seen during training, used to evaluate the model.
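A minimal sketch of how a dataset might be divided into the two sets. The 25% test fraction, the fixed seed, and the helper name train_test_split are assumptions for illustration only.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples, then hold out a fraction as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Made-up labeled samples: (feature, label).
data = [(i, "even" if i % 2 == 0 else "odd") for i in range(8)]
train_set, test_set = train_test_split(data)
print(len(train_set), len(test_set))
```

Shuffling before splitting avoids a test set that contains only the last-collected samples; the two sets never share a sample.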
LEC-2
1. Define KNN.
Ans: K-Nearest Neighbors (KNN) is a non-parametric, instance-based learning algorithm used for
classification and regression. It classifies a data point based on the majority class of its nearest
neighbors. Commonly applied in pattern recognition tasks, KNN is widely used in image
recognition, speech recognition, and medical diagnosis.
2. How does KNN work?
Ans: How KNN Works
• Step 1: Compute the distance between the new data point and all training points (e.g.,
Euclidean distance).
• Step 2: Identify the K nearest neighbors.
• Step 3: Classify based on majority vote (classification) or average (regression).
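The three steps can be sketched in a few lines. The data points below are made up, and Euclidean distance is used because it is the example metric named in Step 1.

```python
import math
from collections import Counter

def k_nearest(train, query, k):
    """Steps 1-2: sort training points by Euclidean distance, keep the k closest."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]

def knn_classify(train, query, k):
    """Step 3 (classification): majority vote among the k nearest labels."""
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, query, k):
    """Step 3 (regression): average of the k nearest target values."""
    targets = [target for _, target in k_nearest(train, query, k)]
    return sum(targets) / len(targets)

# Made-up training data.
labeled = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"), ((6, 6), "B"), ((7, 6), "B")]
valued = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]

print(knn_classify(labeled, (2, 2), k=3))
print(knn_regress(valued, (2,), k=3))
```

Note that nothing is fitted ahead of time: every prediction scans the whole training set, which is why KNN is described as a lazy learner.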
3. What are the key features of KNN.
Ans: Key Features of KNN
• Simple – Easy to understand and implement.
• Lazy Learning – No explicit training; stores the entire dataset.
• Non-parametric – No assumptions about data distribution.
• Versatile – Works for both classification and regression.
4. How to choose the right K?
Ans: Choosing the Right K
• Small K (e.g., K=1) – Prone to noise and overfitting.
• Large K – May cause underfitting.
• Tip – Use cross-validation to find the optimal K.
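One way to apply the cross-validation tip is leave-one-out validation: each point is predicted from all the others, and the K with the highest accuracy is kept. The dataset (two clusters plus one noisy point) and the candidate values 1, 3, 5 are assumptions made for this sketch.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out cross-validation: predict each point from all the others."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(data)

# Made-up data: two clusters plus one noisy "B" point sitting inside cluster "A".
data = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"), ((2, 2), "A"),
        ((1.5, 1.5), "B"),
        ((6, 6), "B"), ((6, 7), "B"), ((7, 6), "B"), ((7, 7), "B")]

scores = {k: loo_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this data K=1 scores lower than K=3, because the single noisy point becomes the nearest neighbor of every point in cluster "A"; a slightly larger K outvotes it.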
5. Types of Distance Metrics in KNN.
Ans: Distance Metrics in KNN
• Euclidean Distance – Common for continuous data.
• Manhattan Distance – Ideal for grid-based data.
• Minkowski Distance – Generalizes Euclidean & Manhattan.
• Cosine Similarity – Used in text classification.
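The four measures can be written out directly. The function names below are illustrative; Minkowski distance with p=1 gives Manhattan and with p=2 gives Euclidean, which is what "generalizes" means here.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def euclidean(a, b):
    return minkowski(a, b, 2)

def manhattan(a, b):
    return minkowski(a, b, 1)

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1 = same direction, 0 = orthogonal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a, b = (0, 0), (3, 4)
print(euclidean(a, b), manhattan(a, b))
print(cosine_similarity((1, 0), (0, 1)))
```

Cosine similarity is a similarity rather than a distance (larger means closer); when a distance is needed, 1 minus the similarity is commonly used.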
6. Applications of KNN.
Ans: Applications of KNN in Pattern Recognition
• Image Recognition – Classifying objects by pixel similarities.
• Handwriting Recognition – Identifying characters via comparisons.
• Speech Recognition – Classifying speech patterns by audio similarity.
• Medical Diagnosis – Predicting diseases by comparing patient data.
7. Example: We have a dataset of fruits based on their features: Weight and Color Intensity (on a
scale from 0 to 10). We want to classify whether a fruit is an Apple or Orange based on these
features.
Solution:
8. Example:
Height (cm)   Weight (kg)   Class
167           51            Underweight
182           62            Normal
176           69            Normal
172           64            Normal
174           56            Underweight
169           58            Normal
173           57            Normal
170           55            Normal
Classify person "A", whose weight is 57 kg and height is 170 cm.
Solution:
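One way to work this out, under two assumptions that the problem does not state: K = 3 and Euclidean distance on the raw (unscaled) height and weight values.

```python
import math
from collections import Counter

# Dataset from the table above: (height in cm, weight in kg) -> class.
people = [
    ((167, 51), "Underweight"),
    ((182, 62), "Normal"),
    ((176, 69), "Normal"),
    ((172, 64), "Normal"),
    ((174, 56), "Underweight"),
    ((169, 58), "Normal"),
    ((173, 57), "Normal"),
    ((170, 55), "Normal"),
]
person_a = (170, 57)
K = 3  # assumption: the problem statement does not specify K

# Euclidean distance from A to every training point, nearest first.
ranked = sorted((math.dist(x, person_a), x, label) for x, label in people)
for d, x, label in ranked:
    print(f"{x}  {label:<12} d = {d:.3f}")

# Majority vote among the K nearest neighbors.
votes = Counter(label for _, _, label in ranked[:K])
prediction = votes.most_common(1)[0][0]
print("Prediction for A:", prediction)
```

Under these assumptions the three nearest neighbors are (169, 58), (170, 55), and (173, 57), at distances of about 1.41, 2.00, and 3.00. All three are Normal, so A is classified as Normal. The result is the same for K = 1 and K = 5 (with K = 5 the vote is 3 Normal to 2 Underweight).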
9. Define Support Vector Machine(SVM).
Ans: Support Vector Machine (SVM) is a supervised machine learning algorithm used for both
classification and regression. It classifies data by finding an optimal hyperplane that maximizes
the margin between different classes in an N-dimensional space. The key idea is to maximize the
distance between the hyperplane and the closest data points, known as support vectors, to
ensure better generalization.
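The margin idea can be made concrete by measuring how far each point lies from a given hyperplane. This sketch does not train an SVM: the hyperplane (w, b) is hand-picked and the points are made up, so it only illustrates what "margin" and "support vectors" refer to.

```python
import math

# Hypothetical separating hyperplane w.x + b = 0 in 2-D (hand-picked, not learned).
w = (1.0, 1.0)
b = -6.0

# Made-up points with labels +1 / -1.
points = [((1, 2), -1), ((2, 2), -1), ((2, 1), -1),
          ((4, 4), +1), ((5, 4), +1), ((4, 5), +1)]

def geometric_margin(x, y):
    """Distance from x to the hyperplane; positive when x is on its own class side."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w

margins = {x: geometric_margin(x, y) for x, y in points}
smallest = min(margins.values())

# The points that attain the smallest margin play the role of support vectors.
support_vectors = [x for x, m in margins.items() if math.isclose(m, smallest)]
print(smallest, support_vectors)
```

An SVM searches over all (w, b) for the hyperplane that makes this smallest margin as large as possible; only the closest points influence the result.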
10. Advantages and Disadvantages of SVM.
Ans: Advantages of SVM:
• Effective in high-dimensional spaces.
• Memory efficient, using only support vectors for decision making.
• Versatile with kernel functions for non-linear decision boundaries.
Disadvantages of SVM:
• High training time for large datasets (O(n²) to O(n³)).
• Requires careful parameter tuning for optimal performance.
• Does not provide probability estimates directly.
11. Define Kernel in SVM.
Ans: Kernel in SVM:
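The kernel trick lets an SVM act as if the data were mapped into a higher-dimensional space, where
a linear separator may exist, by computing inner products in that space without building the
mapping explicitly.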
Common Kernels:
• Linear: For linearly separable data.
• Polynomial: Handles polynomially separable data.
• RBF: Works well for non-linearly separable data.
• Sigmoid: Used in some neural network applications.
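The four kernels can be written as plain functions. The parameter names and default values (degree, gamma, coef0) are assumptions for this sketch. The last lines check the kernel trick on one pair of points: the degree-2 polynomial kernel in the original 2-D space gives the same number as an ordinary dot product after an explicit mapping to 3-D.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, degree=2, coef0=0.0):
    return (dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, y, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

def phi(x):
    """Explicit degree-2 feature map for 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

x, y = (1.0, 2.0), (3.0, 4.0)
# Same value two ways: kernel in the original space vs. dot product after mapping.
print(polynomial_kernel(x, y), dot(phi(x), phi(y)))
```

For the RBF kernel the corresponding feature space is infinite-dimensional, so the explicit mapping cannot be written out at all; the kernel value is the only practical route.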
12. Real life Applications of SVM.
Ans: Real-Life Applications of SVM:
• Text Classification: Spam detection, sentiment analysis.
• Image Classification: Facial recognition, object detection, handwriting recognition.
• Bioinformatics: Gene classification, protein structure prediction.
• Finance: Stock prediction, credit scoring.
• Speech Recognition & Anomaly Detection: Fraud detection, network security.
13. Define Regression.
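Ans: Regression is a statistical method that models the relationship between a dependent variable
and one or more independent variables, typically to predict a continuous value.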
14. Define Linear Regression.
Ans: A supervised learning method that models the relationship between a dependent variable
and one or more independent variables using a linear equation. It assumes a linear relationship
and predicts continuous values. Types: Simple (one independent variable) and Multiple (multiple
independent variables). Goal: Minimize the difference between predicted and actual values.
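For the simple (one-variable) case, the line that minimizes the squared difference between predicted and actual values has a closed form. The sketch below uses made-up data that lies exactly on y = 2x + 1, so the fit can be checked by eye; the function name is illustrative.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_simple_linear(xs, ys)
print(slope, intercept)
print(slope * 6 + intercept)  # prediction for x = 6
```

With noisy real data the points will not sit on the line, and the same formulas return the line with the smallest total squared error.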
15. Applications of Linear Regression.
Ans: Predicting housing prices, sales forecasting, medical outcomes, stock prices, marketing
impact, and engineering performance.
16. Problem :
Solution:
17. Define Logistic Regression.
Ans: Logistic Regression: A statistical method for binary classification using the sigmoid function
to model probabilities. It predicts categorical outcomes (0 or 1) and transforms log-odds into
probabilities.
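A minimal sketch of the prediction step. The weights and bias are hypothetical values, not fitted to any data; the point is only to show the log-odds being turned into a probability by the sigmoid and then into a 0/1 class by a threshold.

```python
import math

def sigmoid(z):
    """Map a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """P(y = 1 | x) for a logistic model; weights and bias are assumed given."""
    log_odds = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(log_odds)

def predict(x, weights, bias, threshold=0.5):
    return 1 if predict_proba(x, weights, bias) >= threshold else 0

# Hypothetical parameters (not fitted to any data).
weights, bias = (1.5, -0.5), -1.0
print(predict_proba((2.0, 1.0), weights, bias), predict((2.0, 1.0), weights, bias))
```

A log-odds of 0 corresponds to a probability of exactly 0.5, which is why 0.5 is the usual default threshold.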
18. Example
19. Define Decision Tree.
Ans: Decision Tree: A supervised learning algorithm for classification and regression, using a
tree-like structure where nodes represent decisions and leaves indicate outcomes. It splits data
recursively for optimal partitioning, follows a top-down approach, is interpretable, and handles
non-linear relationships.
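The core of the recursive splitting is choosing, at each node, the threshold that leaves the purest child nodes. The sketch below does this for a single numeric feature using Gini impurity, which is one common criterion (entropy is another); the notes above do not say which one is intended, so treat that choice and the data as assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger when classes are mixed."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Try each midpoint between sorted values; return (threshold, weighted Gini)."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Made-up single feature with two classes.
values = [1, 2, 3, 7, 8, 9]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(values, labels))
```

A full tree repeats this search on each child node (and across all features) until the nodes are pure or a depth limit is reached.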
20. Advantages and Disadvantages of Decision Tree.
Ans: Advantages:
• Easy to interpret and visualize.
• No need for feature scaling.
• Can handle missing values in some implementations (e.g., via surrogate splits).
• Captures non-linear relationships.
Disadvantages:
• Prone to overfitting with deep trees.
• Sensitive to small data changes.
• May favor features with more categories.
21. Define Naïve Bayes.
Ans: Naïve Bayes: A probabilistic classifier based on Bayes' Theorem with an independence
assumption.
Types:
• Gaussian: For continuous data.
• Multinomial: For text classification.
• Bernoulli: For binary features.
Applications: Spam filtering, sentiment analysis, document classification.
Advantages:
• Simple.
• Efficient.
• Handles high-dimensional data well.
Disadvantages:
• Assumes feature independence.
• Struggles with correlated features.
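A minimal sketch of the multinomial variant applied to spam filtering. The training messages are made up, and add-one (Laplace) smoothing is an assumption added so that unseen words do not produce a zero probability. The independence assumption shows up as the per-word probabilities simply being multiplied (summed in log space).

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (list_of_words, label). Returns the counts needed to predict."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # log P(class) + sum of log P(word | class), with add-one smoothing.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training messages.
docs = [
    ("win money now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("lunch meeting today".split(), "ham"),
]
model = train_nb(docs)
print(predict_nb(model, "free money".split()))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.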
22.