IMAGE CLASSIFICATION USING DEEP LEARNING
A project report submitted to
MALLA REDDY UNIVERSITY
in partial fulfillment of the requirements for the award of degree of
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE & ENGINEERING (AI&ML)
Submitted by
P. LAXMI PRASANNA : 2011CS020285
P. MAHATHI : 2011CS020286
P. PRAMOD REDDY : 2011CS020287
K. SWASTHIKA : 2011CS020288
Under the Guidance of
Prof. YOGESH KAKDE
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING (AI&ML)
2023
COLLEGE CERTIFICATE
This is to certify that this is the bona fide record of the application development
entitled "IMAGE CLASSIFICATION USING DEEP LEARNING",
submitted by P. LAXMI PRASANNA (2011CS020285), P. MAHATHI
(2011CS020286), P. PRAMOD REDDY (2011CS020287), and K. SWASTHIKA
(2011CS020288), B. Tech III year II semester, Department of CSE (AI&ML), during
the year 2022-23. The results embodied in the report have not been submitted to any
other university or institute for the award of any degree or diploma.
INTERNAL GUIDE: Prof. Yogesh Kakde, Asst. Professor
HEAD OF THE DEPARTMENT: Dr. Thayyaba Khatoon, Professor
External Examiner
ACKNOWLEDGEMENT
We would like to express our gratitude to all those who extended their support and suggestions in
developing this application. Special thanks to our mentor Prof. V. Sravanthi, whose help, stimulating
suggestions, and encouragement helped us throughout the course of project development.
We sincerely thank our HOD Dr. Thayyaba Khatoon for her constant support and motivation.
A special acknowledgement goes to a friend who encouraged us from backstage. Last but not
the least, our sincere appreciation goes to our family, who have been tolerant and understanding of our
moods and have extended timely support.
ABSTRACT
Image classification is used to narrow the gap between computer vision and human
vision, so that machines can recognize images in the same way humans do. It deals
with assigning the appropriate class to a given image. We therefore propose a system named
Image Classification using Deep Learning that classifies given images using classifiers such as
neural networks. The system will be developed to measure the accuracy of image classification on
GPU and CPU. It will be designed using Python as the programming language, with Keras and
TensorFlow for creating neural networks.
CONTENTS
CHAPTER NO TITLE
01. INTRODUCTION
02. LITERATURE SURVEY
03. PROPOSED METHODOLOGY
04. RESULTS
05. CONCLUSION
CHAPTER 1
INTRODUCTION
Deep learning is a machine learning subfield that involves the training of artificial neural
networks to learn from vast amounts of data. It's inspired by the human brain's structure and
function and includes multiple layers of interconnected nodes that process and transform data.
Due to its ability to automatically learn and extract relevant features from raw image data, deep
learning has emerged as a leading technique for image classification, resulting in improved
accuracy and efficiency. Convolutional neural networks (CNNs) are a popular type of deep
learning architecture for image classification, achieving state-of-the-art performance on various
tasks such as object recognition, scene classification, and facial recognition.
Deep learning has revolutionized the field of computer vision and image processing, enabling
machines to recognize and interpret images with incredible accuracy and speed. With the advent
of big data, powerful computing hardware, and the availability of deep learning frameworks, it
has become easier than ever to build and train complex deep learning models for image
classification.
Convolutional neural networks (CNNs) are the most widely used deep learning architecture for
image classification, and they have achieved impressive results on a wide range of image
recognition tasks. CNNs use a hierarchical approach to learning, where the lower layers detect
simple features such as edges and corners, and the higher layers detect more complex features
such as shapes, objects, and textures.
One of the key advantages of deep learning for image classification is its ability to learn directly
from large amounts of raw image data. In contrast to traditional machine learning techniques that rely
on hand-engineered features, deep learning can automatically learn and extract relevant features
from the images themselves, making it a highly efficient and accurate approach to image classification.
However, training deep learning models can be computationally intensive and require large
amounts of memory and computing power. To overcome these challenges, researchers and
engineers are constantly developing new algorithms and techniques for optimizing deep learning
models and making them more efficient and scalable.
In addition to image classification, deep learning has many other applications in computer vision,
including object detection, semantic segmentation, and image captioning. As deep learning
continues to advance and evolve, it is likely to have an increasingly significant impact on the
field of computer vision and image processing.
Image classification using deep learning is a potent technique in computer vision, enabling
computers to recognize and categorize images based on their content. The process typically
involves training a CNN on a large dataset of labeled images, where the network learns to identify
patterns and features relevant to the task at hand. Once trained, the network can classify new
images by feeding them into the network and obtaining the output predictions.
To build and train deep learning models, there are several popular deep learning frameworks,
including TensorFlow, PyTorch, and Keras. These frameworks provide pre-built models and
tools for developing and training deep learning models, making it easier to implement deep
learning solutions in image classification and other computer vision applications.
CHAPTER 2
LITERATURE SURVEY
Here is a brief literature survey of image classification using deep learning:
1. "ImageNet Classification with Deep Convolutional Neural Networks" by Krizhevsky et
al. (2012): This paper presented AlexNet, a deep convolutional neural network
architecture that significantly outperformed previous state-of-the-art methods on the
ImageNet dataset, which consists of over a million images belonging to 1,000 different
classes. The success of AlexNet sparked a resurgence of interest in deep learning and led to
the development of even more sophisticated deep neural network architectures for image
classification and other computer vision tasks.
2. "Very Deep Convolutional Networks for Large-Scale Image Recognition" by Simonyan
and Zisserman (2015): The VGG architecture is a deep convolutional neural network that
consists of up to 19 layers, making it deeper than the previously introduced AlexNet
architecture. The VGG network achieved state-of-the-art performance on the ImageNet
dataset and demonstrated the importance of depth in deep neural network architectures for
image classification. The VGG architecture has been widely used and adapted in various
computer vision tasks and remains a significant contribution to the field of deep learning.
3. "Deep Residual Learning for Image Recognition" by He et al. (2016): The ResNet
architecture is a deep convolutional neural network that utilizes residual connections, which
allow information to bypass some layers and flow directly to later layers, facilitating the
training of very deep neural networks. The ResNet architecture has also been adapted for
other computer vision tasks, such as object detection and semantic segmentation.
4. "Densely Connected Convolutional Networks" by Huang et al. (2017): The
DenseNet architecture is a deep convolutional neural network that introduced densely
connected layers, in which each layer receives the feature maps of all preceding layers, to
improve the flow of information between layers. The paper also compares the efficiency of
networks in terms of the number of parameters and the number of floating-point operations
(FLOPs), reporting that DenseNets match the accuracy of comparable ResNets with fewer of both.
5. "Deep Learning for Image Classification: A Comprehensive Review" by Zhu et al.
(2018): This review paper provides a detailed overview of each topic it covers, discussing the key
concepts and techniques involved, as well as the advantages and limitations of each approach.
The paper also includes a comparative analysis of different deep learning architectures and
methods, highlighting their strengths and weaknesses in various scenarios. Overall, it's a
valuable resource for researchers and practitioners working on image classification problems
using deep learning.
6. "A Survey of Deep Learning Techniques for Image Classification" by Anjum et al.
(2018): This survey paper provides an overview of deep learning techniques for image
classification. It covers various topics, including convolutional neural networks, transfer
learning, and ensemble methods, and discusses their practical applications across different
domains. The paper is a valuable resource for researchers and practitioners seeking to explore
the latest advances and trends in deep learning for image classification.
These publications and surveys offer a thorough and up-to-date understanding of the latest
developments in deep learning for image classification, and can serve as valuable references for both
researchers and practitioners operating in this area.
CHAPTER 3
PROPOSED METHODOLOGY
Existing systems:
1. Google Cloud Vision API: This is a cloud-based image recognition service developed by Google
that uses deep learning models to classify images into thousands of predefined categories with
high accuracy. It can also detect objects and faces in images, and extract text and other
information from images. The API is available as a paid service and allows developers to easily
integrate image analysis into their applications.
2. Microsoft Cognitive Services Computer Vision API: It is a powerful tool for image classification
and analysis. It uses deep learning models to classify images based on their content, and can also
provide additional insights such as object detection, image tagging, and face recognition. The API
is cloud-based, making it easily accessible for developers to integrate into their applications using
REST APIs. Overall, it can be a valuable tool for businesses and organizations looking to
incorporate image classification and analysis into their products and services.
Both Google Cloud Vision and Microsoft Cognitive Services Computer Vision are cloud-based image
recognition services that provide an easy-to-use API for developers to integrate image analysis into
their applications. These services use deep learning models to classify images and can automatically
detect and extract other features, such as objects, faces, and colors, with high accuracy. With these
services, users can simply upload an image and receive the desired output without the need for
extensive programming or deep learning expertise.
Algorithms:
Convolutional Neural Networks (CNNs):
Convolutional neural networks (CNNs) are a popular type of deep learning architecture for image
classification. They are designed to automatically extract features from raw image data, making them
particularly well-suited for image recognition tasks.
A CNN consists of several layers, including convolutional layers, pooling layers, and fully connected
layers. The convolutional layers are responsible for performing convolutions on the input image,
which involves sliding a small filter over the image and computing the dot product between the filter
and each region of the image. This process helps to extract relevant features from the image, such as
edges, lines, and textures.
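The sliding-filter computation described above can be sketched in plain Python. This is a minimal illustration, assuming a single-channel image, stride 1, and no padding; the function name `conv2d_valid`, the toy image, and the hand-picked filter are hypothetical and are not taken from the project code. In a real CNN the filter values are learned during training rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Dot product between the filter and the image region under it.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 image: dark left half (0) and bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the middle column, which is where the filter straddles the edge between the dark and bright halves of the image.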
The pooling layers are used to downsample the feature maps obtained from the convolutional layers,
reducing the spatial dimensions of the data while retaining the most important information. This helps
to reduce the computational complexity of the model and prevent overfitting.
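As a rough illustration of downsampling, the sketch below applies max pooling, which is one common pooling choice and is assumed here because the text does not name a specific variant. The helper `max_pool2d` and the sample values are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window (stride equals size)."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            # Keep only the strongest response inside each window.
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 6],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 5], [3, 7]]
```

A 4x4 feature map becomes 2x2, so the following layers process a quarter of the values while the largest activation in each region is preserved.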
Finally, the fully connected layers are responsible for classifying the input image into one of several
predefined categories. The feature maps obtained from the previous layers are flattened into a
one-dimensional vector, which is then fed through a series of fully connected layers. The output of
the final fully connected layer, typically passed through a softmax function, represents the
probabilities of the input image belonging to each of the predefined categories.
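The classification stage can be sketched as follows, assuming a single fully connected layer followed by a softmax, which is the usual way to turn scores into probabilities. All names and weight values here are hypothetical and chosen only to keep the arithmetic easy to follow; a trained network would learn its weights.

```python
import math


def flatten(feature_maps):
    """Turn a list of 2-D feature maps into one 1-D vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum of all inputs per output unit."""
    return [
        sum(x * w for x, w in zip(inputs, w_row)) + b
        for w_row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


feature_maps = [[[1.0, 0.0], [0.0, 1.0]]]   # a single 2x2 feature map
vector = flatten(feature_maps)              # [1.0, 0.0, 0.0, 1.0]
weights = [                                 # one row of weights per class (3 classes)
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
biases = [0.0, 0.0, 0.0]
scores = dense(vector, weights, biases)     # [2.0, 0.0, 1.0]
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))
print(predicted_class)  # 0
```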
Training a CNN for image classification involves feeding a large dataset of labeled images into the
model and adjusting the weights of the different layers using backpropagation to minimize the
classification error. By repeatedly adjusting the weights of the network, the model gradually learns to
identify relevant features in the images and accurately classify them into the correct categories.
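The weight-adjustment loop can be illustrated on the simplest possible case: a single sigmoid unit trained by gradient descent on a toy two-class problem. With only one layer, backpropagation reduces to a single application of the chain rule; a real CNN repeats the same step layer by layer. The data, learning rate, and number of epochs below are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, learning_rate=0.5, epochs=200):
    """Fit one sigmoid unit with stochastic gradient descent on cross-entropy loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            prediction = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            # Gradient of the loss with respect to the unit's pre-activation.
            error = prediction - y
            # Move each weight a small step against its gradient.
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if score >= 0.5 else 0


# Toy labeled data: class 0 has small feature values, class 1 has large ones.
samples = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
labels = [0, 0, 1, 1]
weights, bias = train(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

After training, `predictions` matches `labels` on this separable toy set, which shows how repeated small corrections reduce the classification error.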
Proposed System:
This project aims to implement a convolutional neural network (CNN) in Python using the Keras
framework for image classification on the CIFAR-10 dataset. The project will begin with an
exploration of the dataset, followed by the implementation of the CNN model using Keras. Finally,
the model will be trained and evaluated for accuracy.
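A small CNN of the kind described could be defined in Keras roughly as follows. This is only a sketch: the number of layers, the filter counts, the optimizer, and the function name `build_cnn` are assumptions made for illustration and are not the exact configuration used in this project.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(input_shape=(32, 32, 3), num_classes=10):
    """Illustrative CNN for 32x32 RGB images; layer sizes are assumed, not the project's."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels 0..9
        metrics=["accuracy"],
    )
    return model


model = build_cnn()
model.summary()
```

Such a model would then be trained with `model.fit` on the training images and labels, and its accuracy measured with `model.evaluate` on the held-out test images.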
The primary goal of image classification is to assign a digital image, taken as a whole, to one of
several predefined classes. This task is crucial for digital image analysis, as it enables a wide range of
applications such as object recognition, scene understanding, and image retrieval.
Image classification can be approached using both supervised and unsupervised methods. In
supervised classification, we manually select samples for each target class and use them to train our
neural network. Once trained, the model can classify new samples accurately based on their features.
In unsupervised classification, we group similar sample images into clusters based on their features.
We then classify each cluster into the intended classes. This method does not require manually
labeled data, making it a useful approach for large datasets.
The dataset used in this project is CIFAR-10, which comprises 60,000 images classified into
ten target classes, with each class consisting of 6,000 images. The 10 target classes of this
dataset are: Airplane, Car, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck.
The images are relatively small, with a resolution of 32x32 pixels. The low resolution of these images
provides an ideal platform for researchers to experiment with novel image classification algorithms.
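The figures above can be checked with a few lines of arithmetic. The sketch below also shows the standard CIFAR-10 mapping from integer label to class name and a one-hot encoding of a label. The constant and helper names are hypothetical, and the three colour channels reflect the fact that CIFAR-10 images are RGB.

```python
# Standard CIFAR-10 label order (index 1, "automobile", is the Car class).
CLASS_NAMES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]
TOTAL_IMAGES = 60_000
HEIGHT, WIDTH, CHANNELS = 32, 32, 3  # 32x32 pixels, RGB

images_per_class = TOTAL_IMAGES // len(CLASS_NAMES)
values_per_image = HEIGHT * WIDTH * CHANNELS


def one_hot(label, num_classes=len(CLASS_NAMES)):
    """Encode an integer label as a vector with a single 1 at the label's index."""
    return [1 if i == label else 0 for i in range(num_classes)]


print(images_per_class)  # 6000
print(values_per_image)  # 3072
print(CLASS_NAMES[3], one_hot(3))  # cat [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```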
The CIFAR-10 dataset is available through the datasets module of Keras. We do not need to fetch it
manually; we can import it from keras.datasets, and Keras downloads and caches the data the first
time it is loaded.
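Loading the dataset through Keras might look like the sketch below, assuming the TensorFlow-bundled Keras. The wrapper name `load_cifar10` is hypothetical; `cifar10.load_data()` is the Keras call that returns the training and test splits.

```python
from tensorflow.keras.datasets import cifar10


def load_cifar10():
    """Return the CIFAR-10 training and test splits with labels as flat arrays."""
    # The first call downloads the archive; later calls read the local cached copy.
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    # Labels arrive with shape (N, 1); flatten them to shape (N,).
    return (x_train, y_train.ravel()), (x_test, y_test.ravel())
```

Calling `load_cifar10()` yields 50,000 training images and 10,000 test images, each stored as a 32x32x3 array of 8-bit pixel values.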
Architecture of the model, consisting of training and testing of the data
During the training phase,
the model takes in images from a dataset, performs pre-processing steps such as normalization or
resizing, and then extracts relevant features using techniques such as convolutional neural networks.
The extracted features are then passed through fully connected layers to classify the image into
different classes or labels. During the validation or inference phase, the model takes in new images,
applies the same pre-processing steps as during training, and extracts features using the same
network architecture. The model then classifies the image into one of the given labels based on the
extracted features.
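The requirement that new images receive the same pre-processing as the training images is easiest to meet by routing both through one shared function. The sketch below assumes that pre-processing consists of scaling 8-bit pixel values to the range [0, 1]; the function name and sample pixel values are hypothetical.

```python
def preprocess(image, scale=255.0):
    """Scale 8-bit pixel values (0..255) to the range [0, 1]."""
    return [[pixel / scale for pixel in row] for row in image]


# Training phase: pre-process an image before it is fed to the network.
train_image = [[0, 128], [255, 64]]
train_ready = preprocess(train_image)

# Inference phase: a new image passes through the same function.
new_image = [[255, 0], [51, 204]]
new_ready = preprocess(new_image)
```

Because both phases call `preprocess`, the network sees inputs on the same scale during inference as it did during training.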
CHAPTER 4
RESULTS
CHAPTER 5
CONCLUSION
In conclusion, image classification is an essential application of deep learning, and convolutional neural
networks (CNNs) are the most popular choice for image classification tasks. In this project, we explored
the use of Keras, a popular deep learning library, to build a CNN model that can classify images from the
CIFAR-10 dataset accurately. We discussed the importance of supervised and unsupervised classification
methods in image classification tasks and explored the CIFAR-10 dataset, which is widely used in
research and development for image classification tasks.
Using Keras, we implemented a CNN model with multiple convolutional and pooling layers that achieved
high accuracy in classifying images into their respective categories. Techniques such as data
augmentation and regularization can be used to improve the performance of the model further.
Overall, this project provides a useful introduction to image classification using deep learning techniques
and demonstrates the effectiveness of using Keras to build CNN models for this purpose. As the demand
for image classification applications continues to grow in various industries, the skills and knowledge
gained from this project can be applied to solve real-world problems related to image analysis and
recognition.
REFERENCES:
[1] https://in.mathworks.com/matlabcentral/fileexchange/59133-neural-network-toolbox-tm--model-for-alexnet-network
[2] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Convolutional deep belief networks for scalable
unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International
Conference on Machine Learning, pages 609–616. ACM, 2009.
[3] Deep Learning with MATLAB – MATLAB EXPO 2018.
[4] Introducing Deep Learning with MATLAB – Deep Learning E-Book provided by MathWorks.
[5] https://www.completegate.com/2017022864/blog/deep-machine-learning-images-lenet-alexnet-cnn/all-pages
[6] A. Berg, J. Deng, and L. Fei-Fei. Large scale visual recognition challenge 2010.
www.imagenet.org/challenges. 2010.
[7] Fei-Fei Li, Justin Johnson and Serena Yeung, “Lecture 9: CNN Architectures” May 2017.
[8] L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples:
An incremental Bayesian approach tested on 101 object categories. Computer Vision and Image
Understanding, 106(1):59–70, 2007.
[9] J. Sánchez and F. Perronnin. High-dimensional signature compression for large-scale image
classification. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages
1665–1672. IEEE, 2011.
[10] https://in.mathworks.com/help/vision/examples/image-category-classification-using-deep-learning.html
[11] Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton, “ImageNet Classification
with Deep Convolutional Neural Networks” May 2015.
[12] A. Krizhevsky. Learning multiple layers of features from tiny images. Master’s thesis, Department
of Computer Science, University of Toronto, 2009.
[13] https://in.mathworks.com/help/nnet/deep-learning-imageclassification.html
[14] KISHORE, P.V.V., KISHORE, S.R.C. and PRASAD, M.V.D., 2013. Conglomeration of hand
shapes and texture information for recognizing gestures of Indian sign language using feed forward neural
networks. International Journal of Engineering and Technology, 5(5), pp. 3742-3756.
[15] RAMKIRAN, D.S., MADHAV, B.T.P., PRASANTH, A.M., HARSHA, N.S., VARDHAN, V.,
AVINASH, K., CHAITANYA, M.N. and NAGASAI, U.S., 2015. Novel compact asymmetrical fractal
aperture Notch band antenna. Leonardo Electronic Journal of Practices and Technologies, 14(27), pp. 1-12.
[16] KARTHIK, G.V.S., FATHIMA, S.Y., RAHMAN, M.Z.U., AHAMED, S.R. and LAY-EKUAKILLE,
A., 2013. Efficient signal conditioning techniques for brain activity in remote health
monitoring network. IEEE Sensors Journal, 13(9), pp. 3273-3283.
[17] KISHORE, P.V.V., PRASAD, M.V.D., PRASAD, C.R. and RAHUL, R., 2015. 4-Camera model for
sign language recognition using elliptical Fourier descriptors and ANN, International Conference on
Signal Processing and Communication Engineering Systems - Proceedings of SPACES 2015, in
Association with IEEE 2015, pp. 34-38.