ARM 210
Introduction to
machine learning
Project report
Submitted To: Dr. Amit Choudhary
Submitted By: Prarit Arora
AIML B1
04919051623
Email: aroraprarit017.pa@gmail.com
Contact: 9999538421
Digit Recognition Using Classical Machine
Learning Models
Link to Notebook
Abstract
Handwritten digit recognition is a classical problem in machine learning and computer
vision, often used to benchmark model performance. This project utilizes the UCI Digits
dataset to evaluate various traditional machine learning classifiers on their ability to
identify handwritten digits (0–9). A range of models including Logistic Regression, K-
Nearest Neighbors, Support Vector Machine, Decision Tree, and Random Forest were
implemented. Their performance was compared using metrics such as accuracy,
precision, recall, F1-score, confusion matrices, and cross-validation scores. Preprocessing
techniques, including normalization and an attempted dimensionality reduction using
PCA, are discussed. Support Vector Machine achieved the highest performance among
all models.
Keywords
Handwritten Digit Recognition, Supervised Learning, Classification Algorithm, Model
Evaluation, Confusion Matrix
1. Introduction
Handwritten digit classification is a well-known pattern recognition problem and serves
as an ideal case study for evaluating various supervised learning algorithms. The task is
to automatically recognize digits written by hand, which is foundational to applications
like postal code recognition, bank check processing, and digit-based entry systems.
This study uses the UCI Digits dataset, which is smaller and more lightweight than MNIST, making it well suited for quick prototyping and model comparison.
2. Dataset Overview
- Dataset source: UCI Machine Learning Repository (via sklearn.datasets.load_digits)
- Shape: 1797 images of 8x8 pixels (64 features per image)
- Classes: 10 (digits 0 through 9)
- Format: each image is flattened into a 1D array of 64 pixel-intensity values

Each sample in the dataset represents a grayscale digit image. Pixel values range from 0 to 16.
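The dataset properties above can be verified directly, since the UCI Digits dataset ships with scikit-learn; a minimal sketch:

```python
# Load the UCI Optical Recognition of Handwritten Digits dataset
# bundled with scikit-learn (1797 samples, 64 features, classes 0-9).
from sklearn.datasets import load_digits

digits = load_digits()
X, y = digits.data, digits.target

print(X.shape)            # (1797, 64) -- each 8x8 image flattened to 64 features
print(X.min(), X.max())   # pixel intensities range from 0.0 to 16.0
print(len(set(y)))        # 10 classes (digits 0 through 9)
```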
3. Data Preprocessing
- Normalization: since pixel values range from 0 to 16, all values were divided by 16 to bring them into the [0, 1] range, which often improves model convergence and accuracy.
- Train-test split:
  o 80% for training (1437 samples)
  o 20% for testing (360 samples)
  o A stratified split was used to ensure the class distribution remains consistent across sets.
- Principal Component Analysis (PCA):
  o PCA was attempted to reduce dimensionality and possibly enhance performance.
  o However, applying PCA led to a slight drop in accuracy, possibly due to loss of information critical for classification. Hence, the raw normalized features were retained.
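The preprocessing steps above can be sketched as follows. The random seed and the PCA variance threshold of 95% are assumptions for illustration; the report does not state the exact settings used.

```python
# Normalize pixel values from [0, 16] to [0, 1] and perform a
# stratified 80/20 train-test split, as described above.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA

digits = load_digits()
X = digits.data / 16.0          # scale features into the [0, 1] range
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # seed is an assumption
)
print(X_train.shape, X_test.shape)  # (1437, 64) (360, 64)

# The PCA variant that was attempted; shown for reference even though
# the raw normalized features were ultimately retained.
pca = PCA(n_components=0.95)    # 95% explained variance -- an assumed threshold
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)
print(X_train_pca.shape[1], "principal components retained")
```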
4. Models Used
1. Logistic Regression
A baseline linear classifier that works well with normalized numeric data.
Surprisingly effective for this task, achieving over 93% accuracy.
2. K-Nearest Neighbors (KNN)
A non-parametric model that classifies based on the majority class of its k closest
neighbors.
It performed extremely well, achieving 98.6% accuracy, as digit images tend to cluster well in pixel space.
3. Support Vector Machine (SVM)
A powerful classifier that finds the optimal separating hyperplane using the kernel trick (an RBF kernel was used here).
This model achieved the highest accuracy of all: over 99.1%.
4. Decision Tree
A simple and interpretable model that recursively splits data based on feature
values.
Its performance was the weakest among all, with an accuracy of 83.3%.
5. Random Forest
An ensemble model of multiple decision trees, helping reduce overfitting and
improve generalization.
Achieved 96.1% accuracy — much better than a single tree.
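The five models above can be trained and compared with a few lines of scikit-learn. This is a minimal sketch assuming default hyperparameters (apart from the RBF kernel named in the report) and an assumed random seed, so exact scores may differ slightly from the results table:

```python
# Train the five classifiers compared in this report and print
# each one's test-set accuracy.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

digits = load_digits()
X, y = digits.data / 16.0, digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Support Vector Machine": SVC(kernel="rbf"),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.4f}")
```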
5. Evaluation Metrics
The following metrics were used for evaluation:
- Accuracy: ratio of correctly predicted instances to total instances.
- Precision (weighted): true positives / (true positives + false positives), weighted by class support.
- Recall (weighted): true positives / (true positives + false negatives), weighted by class support.
- F1-Score (weighted): harmonic mean of precision and recall.
- Confusion Matrix: detailed breakdown of actual vs. predicted classes.
- Cross-Validation (5-fold): measures model stability across multiple data subsets.
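These metrics can be computed with scikit-learn's metrics module; a sketch for a single classifier (the RBF-kernel SVM), with the train/test split seed assumed:

```python
# Compute the weighted evaluation metrics, the confusion matrix, and a
# 5-fold cross-validation score for one model as an illustration.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

digits = load_digits()
X, y = digits.data / 16.0, digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

clf = SVC(kernel="rbf").fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="weighted"))
print("Recall   :", recall_score(y_test, y_pred, average="weighted"))
print("F1-score :", f1_score(y_test, y_pred, average="weighted"))
print(confusion_matrix(y_test, y_pred))   # 10x10 actual-vs-predicted counts

# 5-fold cross-validation on the full dataset
scores = cross_val_score(clf, X, y, cv=5)
print(f"CV: {scores.mean():.4f} \u00b1 {scores.std():.4f}")
```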
6. Results
Metric Comparison Table:

Model                  | Accuracy | Precision | Recall | F1-Score
---------------------- | -------- | --------- | ------ | --------
Logistic Regression    | 0.9361   | 0.9366    | 0.9361 | 0.9353
K-Nearest Neighbors    | 0.9861   | 0.9867    | 0.9861 | 0.9861
Support Vector Machine | 0.9917   | 0.9920    | 0.9917 | 0.9917
Decision Tree          | 0.8333   | 0.8372    | 0.8333 | 0.8335
Random Forest          | 0.9611   | 0.9620    | 0.9611 | 0.9609
Cross-Validation Scores (5-fold):
- SVM: 0.9882 ± 0.0052
- KNN: 0.9882 ± 0.0087
- Random Forest: 0.9756 ± 0.0062
- Logistic Regression: 0.9429 ± 0.0061
- Decision Tree: 0.8427 ± 0.0233
7. Conclusion
Among the evaluated models, Support Vector Machine (SVM) performed the best with
an accuracy of 99.17% on the test set and strong cross-validation performance. KNN was
a close second, showing the strength of instance-based learning for small image datasets.
The Decision Tree, while simple and fast, underperformed, likely due to its tendency to overfit small datasets. Random Forest demonstrated a strong balance between interpretability and performance.
8. References
1. Scikit-learn Developers. (2024). Scikit-learn User Guide. https://scikit-learn.org/stable/user_guide.html
2. Pedregosa, F. et al. (2011). Scikit-learn: Machine Learning in Python. Journal of
Machine Learning Research, 12, 2825–2830.
3. Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and
TensorFlow (2nd ed.). O'Reilly Media.
4. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to
Statistical Learning (2nd ed.). Springer. https://www.statlearning.com
5. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical
Learning (2nd ed.). Springer.
6. Dua, D., & Graff, C. (2019). UCI Machine Learning Repository: Optical Recognition of Handwritten Digits Dataset. University of California, Irvine. https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits
7. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based
learning applied to document recognition. Proceedings of the IEEE, 86(11),
2278–2324. https://doi.org/10.1109/5.726791
8. Kotsiantis, S. B., Zaharakis, I., & Pintelas, P. (2007). Supervised machine
learning: A review of classification techniques. Emerging Artificial Intelligence
Applications in Computer Engineering, 160, 3–24.
9. Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and
recent developments. Philosophical Transactions of the Royal Society A:
Mathematical, Physical and Engineering Sciences, 374(2065).
https://doi.org/10.1098/rsta.2015.0202
10. Bhatele, M., Jadon, S., & Chaurasia, P. (2021). A Comparative Study of
Machine Learning Techniques for Digit Recognition. International Journal of
Computer Applications, 183(2), 6–11.