Match the Following (5 Questions)
1. Match the following machine learning concepts with their descriptions:
| Concept | Description |
|------------|----------------|
| Overfitting | Model performs well on training data but poorly on test data |
| Underfitting | Model is too simple to learn the patterns |
| Bias | Systematic error that skews predictions |
| Variance | Model sensitivity to fluctuations in training data |
2. Match the following learning types with their examples:
| Learning Type | Example |
|------------------|------------|
| Supervised Learning | Decision Trees |
| Unsupervised Learning | K-means Clustering |
| Reinforcement Learning | Q-learning |
| Semi-supervised Learning | Google Photos Face Grouping |
3. Match the following clustering methods with their characteristics:
| Clustering Method | Characteristic |
|----------------------|-------------------|
| K-means | Partitional clustering |
| DBSCAN | Density-based clustering |
| Hierarchical Clustering | Merges or splits clusters iteratively |
| Expectation-Maximization | Probabilistic clustering approach |
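For revision, here is a minimal pure-Python sketch of the partitional (K-means) approach listed above, run on 1-D data; the data points and parameters are made up for illustration:

```python
import random

# Minimal K-means sketch on 1-D data (illustrative data, not from the quiz).
def kmeans_1d(points, k, iters=20, seed=0):
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[i].append(p)
        # Update step: move each centroid to its cluster's mean.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
print(kmeans_1d(data, k=2))  # two centroids, one per visible group
```

The alternating assignment/update loop is exactly what distinguishes K-means from the density-based and hierarchical methods in the table.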
4. Match the following evaluation metrics with their use cases:
| Metric | Use Case |
|-----------|------------|
| Accuracy | Overall correctness of the model |
| Precision | Identifying relevant results among retrieved items |
| Recall | Identifying all relevant instances |
| ROC Curve | Evaluating classifier performance at different thresholds |
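All four metrics in the table above can be computed directly from confusion-matrix counts. A minimal Python sketch, using hypothetical counts:

```python
# Minimal sketch: classification metrics from confusion-matrix counts.
# The counts passed in below are hypothetical, for illustration only.
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 as a dict."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total   # overall correctness
    precision = tp / (tp + fp)     # relevant among retrieved items
    recall = tp / (tp + fn)        # relevant instances actually found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)  # every metric is 0.8 for these balanced counts
```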
5. Match the following ML processes with their definitions:
| Process | Definition |
|------------|-------------|
| Feature Selection | Choosing relevant features for model training |
| Feature Engineering | Creating new meaningful features |
| Data Preprocessing | Cleaning and transforming data before training |
| Model Tuning | Adjusting hyperparameters for better performance |
True/False (10 Questions)
6. K-means clustering is a supervised learning algorithm.
7. A confusion matrix helps evaluate the performance of classification models.
8. Cross-validation reduces overfitting.
9. Linear regression is used for classification tasks.
10. Hierarchical clustering cannot be used for non-numeric data.
11. The F1-score is the harmonic mean of precision and recall.
12. A larger training dataset generally improves model accuracy.
13. SVM can only be used for classification problems.
14. ROC Curve measures the trade-off between sensitivity and specificity.
15. Unsupervised learning requires labeled data for training.
Fill in the Blanks (10 Questions)
16. ______ learning involves training models using labeled data.
17. The process of dividing a dataset into training and testing sets is called ______.
18. A confusion matrix consists of True Positives, True Negatives, ______, and ______.
19. Logistic regression is mainly used for ______ problems.
20. ______ clustering works by iteratively merging or splitting clusters.
21. ROC stands for ______.
22. The main purpose of K-fold cross-validation is to reduce ______.
23. The metric used to measure how well a clustering model groups data is called ______.
24. An SVM model uses a ______ to separate data points.
25. Machine learning models improve performance by learning from ______.
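Several of the blanks above touch on splitting data and K-fold cross-validation. A minimal pure-Python sketch of how K-fold index splits can be generated (fold count and sample count are arbitrary, for illustration):

```python
# Minimal sketch: K-fold train/test index splits, pure Python.
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

for train, test in kfold_indices(n=6, k=3):
    print(test, train)  # each sample appears in exactly one test fold
```

Because every sample is held out exactly once, the averaged score is a less optimistic estimate than a single train/test split.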
Odd One Out (10 Questions)
26. Which of the following does not belong?
- A) Linear Regression
- B) KNN
- C) Logistic Regression
- D) K-means
27. Find the odd one out:
- A) Precision
- B) Recall
- C) Accuracy
- D) Clustering
28. Which is different from others?
- A) Training Data
- B) Test Data
- C) Feature Scaling
- D) Validation Data
29. Which is not a supervised learning algorithm?
- A) Decision Trees
- B) Random Forest
- C) K-means
- D) SVM
30. Find the odd one out:
- A) Mean Absolute Error
- B) Mean Squared Error
- C) F1-score
- D) Root Mean Squared Error
31. Which does not belong to classification algorithms?
- A) Decision Tree
- B) SVM
- C) K-means
- D) Naive Bayes
32. Which of the following is not a clustering algorithm?
- A) K-means
- B) DBSCAN
- C) PCA
- D) Agglomerative Clustering
33. Find the odd one out:
- A) Overfitting
- B) Bias
- C) Variance
- D) Feature Scaling
34. Which is different from others?
- A) Feature Selection
- B) Feature Engineering
- C) Model Training
- D) Data Augmentation
35. Find the odd one out in loss functions:
- A) Mean Squared Error
- B) Cross-Entropy Loss
- C) F1-score
- D) Hinge Loss
MCQs (10 Questions)
36. Which of the following is an example of supervised learning?
- A) K-means Clustering
- B) Decision Trees
- C) DBSCAN
- D) Expectation Maximization
37. Which evaluation metric is used for classification problems?
- A) RMSE
- B) Accuracy
- C) Mean Squared Error
- D) Variance
38. What is the main advantage of KNN?
- A) Fast training time
- B) Requires less memory
- C) Works well with high-dimensional data
- D) Does not require labeled data
39. What is the main challenge of hierarchical clustering?
- A) Requires labeled data
- B) Cannot handle large datasets efficiently
- C) Requires feature scaling
- D) Uses reinforcement learning
40. Which is a reasonable method for handling missing data?
- A) Removing rows with missing values
- B) Filling missing values with the mean/median/mode
- C) Predicting (imputing) missing values with a model
- D) Any of the above, depending on the dataset
41. What is the function of the activation function in a neural network?
- A) Transform input data
- B) Introduce non-linearity
- C) Reduce bias
- D) Improve accuracy
42. Which method is used for dimensionality reduction?
- A) K-means
- B) DBSCAN
- C) PCA
- D) Logistic Regression
43. What does "TP" stand for in a confusion matrix?
- A) True Performance
- B) True Prediction
- C) True Positive
- D) Total Precision
44. Which learning approach is used in spam filtering?
- A) Supervised Learning
- B) Unsupervised Learning
- C) Reinforcement Learning
- D) Clustering
45. Which machine learning model is best for predicting house prices?
- A) K-means
- B) Decision Trees
- C) SVM
- D) Linear Regression
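Question 45 points at linear regression as the price-prediction model. For revision, a minimal sketch of one-feature least-squares regression in pure Python; the house sizes and prices are made up for illustration:

```python
# Minimal sketch: simple (one-feature) linear regression by least squares.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical data: house size (sq. m) vs price (in thousands).
sizes = [50, 70, 90, 110]
prices = [150, 190, 230, 270]
slope, intercept = fit_line(sizes, prices)
print(slope, intercept)  # 2.0 50.0 for this exactly-linear data
```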