
Report Skin Cancer

The mini project focuses on classifying different types of skin cancer using neural networks, specifically through image analysis of infected skin areas. Various classifiers, including Baseline Model and Fine-Tuning methods, were employed to achieve accurate results, with the project successfully classifying seven types of skin cancer. The document outlines the methodologies, software used, and results obtained, emphasizing the importance of early detection in combating skin cancer.

Uploaded by

kaishwarya978

MINI PROJECT

Submitted by
SOBANA M (REG.No.2020102136)
VIDHYA E (REG.No.2020102167)
VIGNESHWARI C (REG.No.2020102171)
YAMI K.P (REG.No.2020102175)

For the Course

19UCS603

ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING


IN

B.E COMPUTER SCIENCE AND ENGINEERING

SETHU INSTITUTE OF TECHNOLOGY


An Autonomous Institute, PULLOOR, KARIAPATTI-626 115

APRIL-2023

ABSTRACT

Today, various diseases are growing in many forms around us. Cancer is one of the deadliest
diseases and can seriously affect human life. There are different types of cancer, such as lung
cancer, kidney cancer, breast cancer, skin cancer and so on. One fifth of the total population is
affected by skin cancer every year. The best way to stop skin cancer is to identify it in its
early stages. Identifying the type of skin cancer also allows medicine specific to that type to
be used. Our project aims to classify the type of skin cancer with neural networks, taking images
of the infected area of the skin as input. We train and test several neural network classifiers
to obtain better accuracy. We use five different classifiers, namely the Baseline Model in
machine learning, and Fine_Tuning_DenseNet, Fine_Tuning_InceptionResNet,
Fine_Tuning_InceptionV3 and Retraining_DenseNet in deep learning. Our project successfully
classified 7 different types of skin cancer. The classifiers reached different accuracies. The
results and the plots obtained are given in the later sections.

TABLE OF CONTENTS
CHAPTER NO.   TITLE
              LIST OF FIGURES
              LIST OF TABLES
              LIST OF ABBREVIATIONS
1             INTRODUCTION
              1.1. LEARNING APPROACHES
                   1.1.1 Machine Learning
                   1.1.2 Deep Learning
              1.2. BASIC CLASSIFIER - CONVOLUTIONAL NEURAL NETWORK
                   1.2.1 CNN Architecture
                   1.2.2 Working of CNN
2             DATASET AND DATA AUGMENTATION
3             SOFTWARE DESCRIPTION
              3.1. GOOGLE COLABORATORY
                   3.1.1 Features
                   3.1.2 Limitations
              3.2. JUPYTER NOTEBOOK – ANACONDA 3
                   3.2.1 Features
                   3.2.2 Installations
                   3.2.3 Limitations
4             CLASSIFIERS TRIED
              4.1 BASE LINE MODEL WITH CNN CLASSIFIER
              4.2 FINE TUNING INCEPTION V3 CLASSIFIER
              4.3 FINE TUNING INCEPTION RESNET CLASSIFIER
              4.4 FINE TUNING DENSE NET CLASSIFIER
              4.5 RETRAINING DENSE NET CLASSIFIER
5             RESULTS AND PLOTS
              5.1 BASE LINE MODEL WITH CNN CLASSIFIER RESULTS
              5.2 FINE TUNING INCEPTION V3 CLASSIFIER RESULTS
              5.3 FINE TUNING INCEPTION RESNET CLASSIFIER RESULTS
              5.4 FINE TUNING DENSE NET CLASSIFIER RESULTS
              5.5 RETRAINING DENSE NET CLASSIFIER RESULTS
6             CONCLUSION

LIST OF FIGURES
FIGURE NO.   FIGURE NAME
1.1          Hierarchy of artificial intelligence
1.2          Pictorial representation of neural network
1.3          CNN Architecture
2.1          Count for each type of lesion
4.1          Base line model with CNN output
4.2          Fine Tuning Inception V3 classifier output
4.3          Fine Tuning Inception ResNet classifier output
4.4          Fine Tuning Dense Net classifier output
4.5          Retraining Dense Net classifier output
5.1          Accuracy and loss plot for base line model with CNN classifier
5.2          Accuracy and loss plot for fine tuning Inception V3 classifier
5.3          Accuracy and loss plot for fine tuning Inception ResNet classifier
5.4          Accuracy and loss plot for fine tuning Dense Net classifier
5.5          Accuracy and loss plot for retraining Dense Net classifier

LIST OF TABLES
TABLE NO.    CONTENTS
4.1          CNN layers and number of layers
4.2          Fine Tuning Inception V3 layers and number of layers
4.3          Fine Tuning Inception ResNet layers and number of layers
4.4          Fine Tuning Dense Net layers and number of layers
4.5          Retraining Dense Net layers and number of layers
6.1          Comparison of test accuracy for different classifiers

LIST OF ABBREVIATIONS
CNN : Convolutional Neural Network
ResNet : Residual Network
HAM : Human Against Machine
ReLU : Rectified Linear Unit

CHAPTER 1
INTRODUCTION

1.1. LEARNING APPROACHES


In the modern world, artificial intelligence is one of the most important emerging
technologies on which our lives depend. It involves a machine or robot understanding and
solving problems of human life. The machine is trained to solve these problems on its own,
without human interference. Knowledge acquired from human intelligence is what makes the
machine artificially intelligent. It is hard to imagine a world without artificial intelligence,
since it finds application in various fields, for example AI-based health care monitoring,
self-driving cars and chatbots. For any application, a machine must be trained and must learn
from experience. Machine learning and deep learning are parts of artificial intelligence. This
hierarchy is shown in Fig.1.1.

Fig.1.1. Hierarchy of artificial intelligence

1.1.1. Machine Learning


Machine learning algorithms predict an output by learning from structured data. They are less
complex than deep learning models. Because they operate on their own once trained, they reduce
the need for human intervention. They predict outputs from current and past events, for example
forecasting climatic conditions from previous years' records and the present situation. A
drawback of machine learning is that an error in the algorithm prevents the system from working
effectively. Machine learning can be broadly classified into four classes, namely

• Supervised

• Semi-supervised

• Unsupervised

• Reinforcement

In the supervised type of machine learning, both the input and the corresponding output are
given, so every input is labeled with its output. The model is trained on these pairs and gives
the output corresponding to a given input. Examples of supervised algorithms are linear
regression, Random Forest and KNN. The drawbacks of this type of algorithm are that labelling
the whole dataset is time intensive, that manual errors can enter the database, and that it does
not classify data by itself. A large volume of data is required for training to reach good
accuracy, so it becomes costlier.
In the semi-supervised method of machine learning, some of the data is labelled and the
remainder is left without labels. The model is trained on the labelled data and then labels the
unlabeled data on its own, using the knowledge gained from that training. Labels of this type
are called pseudo labels. Labeling of the whole dataset is not needed, unlike in supervised
learning, so it takes less time. It also classifies the data by itself and is less costly than
the supervised method. Prediction accuracy can also be high with this approach. Many real-time
applications use semi-supervised models, such as web content classification, text document
classification and speech recognition.
In the unsupervised type, there is no labeled data. The system is trained to find similarities
among the data and to group it on its own. It does not require pre-labeled training data. It is
similar to reasoning about an unknown object. Output accuracy tends to be poorer, since the
system may misclassify the data. It is cheaper because it does not involve labeling. Real-life
uses of unsupervised models include analysing customer opinions of products in supermarkets and
predicting changes in revenue.
The reinforcement model interacts with its surroundings and is asked to take actions. Each
action receives positive or negative feedback. After positive feedback the system takes the
action again, and after negative feedback it rejects the action. A simple example is a robot in
a maze. Since it does not require labeled datasets, it is the cheapest method. A drawback of
this system is that negative feedback alone does not improve the model accuracy.

1.1.2. Deep Learning

Manual extraction of features is one of the serious problems in machine learning. Deep
learning extracts features automatically, without human intervention. As a dataset grows, it
becomes difficult for machine learning to train on the whole dataset. Deep learning overcomes
this limitation.
Deep learning models automatically extract features from the input and pass them on to neural
networks for further processing. Neural networks in deep learning are analogous to neurons in
the human brain. In humans, neurons in the brain act as decision makers for every situation.
Likewise, a neural network in a deep learning system predicts the output after several
iterations of training, learning from the features of the input data. Neural networks used in
this way are also called classifiers, and they are made up of a network of mathematical
equations. A neural network is shown in Fig.1.2. It has three main types of layers:
• Input Layer

• Hidden Layer

• Output Layer

Fig.1.2. Pictorial representation of neural network

The input layer holds the features extracted from the input data. The input data may be audio
or an image, and the features given as input depend on the application. For an audio signal,
features such as the sampling rate value or mel frequency spectral coefficients can be provided
as input. For an image, the pixel values are provided as the input features.
The hidden layers carry out the further processing and training of the neural network
classifier. They find the relationship between the input and output layers. Hidden layers apply
activation functions such as the sigmoid function, the tanh function and the ReLU function. Of
these, ReLU is the most commonly used hidden layer function; it replaces negative values with
zeros. Increasing the number of hidden layers can increase the accuracy of the output, so a
classifier with more hidden layers often performs better, provided enough training data is
available. The network classifies by propagating values forward through the nodes and
propagating errors backward during training.
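
The activation functions named above can be written in a few lines. The following is a minimal sketch in plain Python, not the project's code; the sample input values are chosen only for illustration.

```python
import math

def relu(x):
    """Rectified Linear Unit: negative inputs become 0, positive inputs pass through."""
    return max(0.0, x)

def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(x)

# Illustrative inputs (assumed values, not taken from the report)
values = [-2.0, -0.5, 0.0, 1.5]
relu_out = [relu(v) for v in values]
print(relu_out)      # [0.0, 0.0, 0.0, 1.5]
print(sigmoid(0.0))  # 0.5
```

ReLU leaves positive values unchanged, which is one reason it is cheap to compute compared with sigmoid and tanh.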

1.2. BASIC CLASSIFIER – CONVOLUTIONAL NEURAL NETWORK (CNN)


The CNN forms the basis for most modern image classifiers. Its convolutional layer produces the
convolved input, also called the activation map. The pooling layer reduces the dimension of the
feature map. The fully connected layer then combines the extracted features and passes them on
for further processing. Because a plain CNN can overfit and is computationally slow to train,
it is mainly suitable for smaller datasets.
1.2.1. CNN Architecture:

Fig.1.3. CNN Architecture


1.2.2. Working of CNN:
Input layer:
Generally, the input is an image or a sequence of images. This layer holds the raw input
image, for example with width 32, height 32, and depth 3.
Convolutional layer:
This layer extracts features from the input. It applies a set of learnable filters, known as
kernels, to the input images. The kernels are small matrices, usually of shape 2×2, 3×3, or
5×5. Each kernel slides over the input image and computes the dot product between the kernel
weights and the corresponding image patch. The outputs of this layer are referred to as
feature maps.
For example, consider a 3×3 image patch (pixels) and a 3×3 filter (kernel):

Input image matrix        Filter matrix
A1 A2 A3                  I1 I2 I3
B1 B2 B3                  J1 J2 J3
C1 C2 C3                  K1 K2 K3

Formula for the output value at this position:

Output = (A1*I1) + (A2*I2) + (A3*I3) + (B1*J1) + (B2*J2) + (B3*J3) + (C1*K1) +
(C2*K2) + (C3*K3)
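
The sliding dot product described for the convolutional layer can be sketched as follows. This is an illustrative plain-Python version with no padding and a stride of 1; the pixel and kernel values are assumed for the example. Like most CNN libraries, it does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Assumed example values: a vertical-edge kernel applied to a 3x3 patch and a 4x4 image
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [0, 1, 2, 3]]

single = conv2d_valid(patch, kernel)       # one output value for a 3x3 patch
feature_map = conv2d_valid(image, kernel)  # 2x2 feature map for a 4x4 image
print(single)       # [[-6]]
print(feature_map)  # [[-6, 12], [-6, 8]]
```

A 3×3 kernel on a 3×3 patch yields a single value, while the same kernel on a 4×4 image yields a 2×2 feature map, one value per kernel position.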
Pooling layer:
Pooling reduces the size of the feature map while keeping its most prominent features.
The max pooling method takes a window of the matrix and returns the largest value in that window.
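
Max pooling can be illustrated with a short sketch. The window size of 2 and stride of 2 used here are common defaults assumed for the example, and the feature map values are made up.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Return the largest value in each size x size window of the feature map."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, stride):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Assumed 4x4 feature map
fm = [[1, 3, 2, 4],
      [5, 6, 1, 2],
      [7, 2, 9, 0],
      [1, 8, 3, 4]]
pooled = max_pool2d(fm)
print(pooled)  # [[6, 4], [8, 9]]
```

The 4×4 input shrinks to 2×2, so each later layer has a quarter as many values to process.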

Dense layer:
A dense layer is the regular fully connected neural network layer. It computes the dot product
of all its inputs with their corresponding weights and adds a bias value, which is learned to
optimize the model.

Output layer:
The output layer gives the classification results.
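
A dense layer followed by a softmax output can be sketched as below. The weights, biases and the choice of three output classes are hypothetical values chosen for illustration; the softmax function is assumed here as the usual way to turn output scores into class probabilities.

```python
import math

def dense(inputs, weights, biases):
    """One output per weight row: dot product of inputs and weights, plus a bias."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical feature vector and parameters (3 output classes)
features = [1.0, 2.0]
weights = [[0.5, -1.0],
           [1.0, 1.0],
           [0.0, 0.5]]
biases = [0.0, -1.0, 0.5]

scores = dense(features, weights, biases)
probs = softmax(scores)
predicted = probs.index(max(probs))  # index of the most probable class
print(scores)     # [-1.5, 2.0, 1.5]
print(predicted)  # 1
```

In a seven-class problem such as this project, the final dense layer would have seven outputs, one per lesion type.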

CHAPTER 2
DATASETS AND DATA AUGMENTATION
The Human Against Machine with 10000 training images (HAM10000) dataset is a popular benchmark
for machine learning and for comparing machine learning results with human experts. Its cases
form a representative collection of all important diagnostic categories in the realm of
pigmented lesions. The images were collected from different populations and were acquired and
stored by different modalities. The dataset contains 10,015 labeled images of size 450 x 600
in seven skin disease classes: melanoma, melanocytic nevus, basal cell carcinoma, actinic
keratosis, benign keratosis, dermatofibroma and vascular lesion. It also contains the sex and
age of the patients, which we do not incorporate into the model at this point. The data is
highly imbalanced: more than 60% of the images belong to the melanocytic nevus (NV) class,
and some classes are extremely rare (less than 2%).

Fig.2.1. Count for each type of lesion

The original images of size 450 x 600 are too large for training, so we resize them to 28 x 28
RGB images for the baseline model and to 192 x 256 for the fine-tuning models. The pixel values
are normalized by dividing by 255, and the dataset is split into 7210 training examples, 1803
validation examples, and 1002 test examples.
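
The normalization and split sizes can be checked with a short sketch. The split fractions below (10% held out for testing, then 20% of the remainder for validation, each rounded up) are an assumption; the report does not state them, but they reproduce the stated counts.

```python
import math

TOTAL_IMAGES = 10015  # number of labeled images in HAM10000, as stated in the text

def normalize(pixels):
    """Scale 8-bit pixel values from the range 0..255 to the range 0..1."""
    return [p / 255.0 for p in pixels]

def split_sizes(total, test_fraction, validation_fraction):
    """Hold out a test set first, then a validation set from what remains."""
    n_test = math.ceil(total * test_fraction)
    remaining = total - n_test
    n_val = math.ceil(remaining * validation_fraction)
    n_train = remaining - n_val
    return n_train, n_val, n_test

# Assumed fractions: 0.10 for test, 0.20 of the remainder for validation
train_n, val_n, test_n = split_sizes(TOTAL_IMAGES, 0.10, 0.20)
print(train_n, val_n, test_n)  # 7210 1803 1002
print(normalize([0, 255]))     # [0.0, 1.0]
```

Because more than 60% of the images belong to one class, a split that preserves the class proportions in each subset would be a sensible choice, though the report does not say whether one was used.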

CHAPTER 3
SOFTWARE DESCRIPTION

3.1. GOOGLE COLABORATORY


Colab, short for Colaboratory, is a free online notebook service hosted by Google. The
platform is especially useful for machine learning and data science project work. The
supported programming language is Python 3.

3.1.1. Features
Google Colab was tried for programming the skin cancer classification system. It has the
following features:
• Colab provides a free GPU for running program modules, which is faster than a CPU,
especially for machine learning and deep learning modules.

• It uses Google Drive to save all files, which are stored in the cloud. This removes the
need for our own servers, and the files can be opened from anywhere.

• Colab also provides some pre-installed packages such as Keras and PyTorch.

• There is no need to configure Colab. It comes with an online Python interpreter, so we
do not need to install Python separately on our system.

• We can easily upload or share a Colab file, and we can upload it directly to GitHub.

3.1.2. Limitations

Colab needs internet connectivity to run programs, since it does not use a local server. The
files can be accessed only with a Google account that has been granted access to Google Drive
for saving the program files.

3.2. JUPYTER NOTEBOOK – ANACONDA3


Jupyter Notebook is a web application that is also used as a platform for executing program
code. It is called a notebook because it integrates code and output, along with any graphs or
visualizations, into a single file. It supports many languages such as R, Python and Julia. It
is also freely available open-source software and is mainly suitable for data science project
work.
3.2.1. Features
The Jupyter Notebook was used for the full implementation of the skin cancer classification
system, because Colab caused major difficulties during model training. It has the following
features:
• One of the main features of Jupyter Notebook is that it has inbuilt libraries like
pyChips, cellmap and so on. We can also install libraries from the command line, either
in the system terminal or in the Jupyter terminal.

• It does not need internet connectivity, since it works on a local server and saves the
files on that server.

• Jupyter supports various programming languages through its kernels, which makes it
user friendly.

• Other features include automatic code completion and viewing of visualizations.
Visualizations are the graphs, plots, bar charts, images, etc. that are generated when
the program code is executed.

• Once the code is executed, the results are kept whenever we open the document, so we
need not execute the program again every time the code file is opened.


3.2.2. Installations
There are two common ways of installing Jupyter Notebook:
• Using the Anaconda distribution

• Using the pip package manager

1. By using the Anaconda distribution:


Anaconda is a freely available distribution that bundles many applications for processing
large data, including Jupyter and Spyder. It works with the R and Python languages. Conda is
the subsystem in Anaconda that manages the available packages. To install Jupyter, we first
install Anaconda Navigator, which contains Jupyter Notebook. Clicking on it downloads the
Jupyter Notebook packages. The installation is then complete and Jupyter Notebook can be
launched.
2. By using the pip package manager:

pip is the package management system that handles Python packages, libraries and modules
on our system. We can install Jupyter Notebook with the command
"python -m pip install jupyter". Jupyter Notebook is then installed and can be launched with
the command "jupyter notebook".

3.2.3. Limitations
The program runs on the local system. If the system has a GPU, execution is faster. If the
system has only a CPU, execution takes longer. Since Jupyter works on a local server, the
files are stored on that server, so we cannot open them remotely or through a cloud platform.
The installation is large, so it requires more disk space, and it requires configuration of
an appropriate Python version.

CHAPTER 4
CLASSIFIERS TRIED

We use five different classifiers based on different algorithms. They are
• Base line model with CNN classifier
• Fine Tuning Dense Net classifier
• Fine Tuning Inception ResNet classifier
• Fine tuning Inception V3 classifier
• Retraining Dense net classifier

4.1. BASE LINE MODEL WITH CNN CLASSIFIER

A baseline model is a simple machine learning approach, which may be rule based, regression
based or classification based. Here we use a classification model, i.e., a convolutional
neural network. It is made up of multiple layers, including convolutional layers, pooling
layers, fully connected layers and a softmax layer. A convolutional layer matches pixel
patterns and acts as a filter with suitable filter coefficients. A pooling layer reduces the
size of the image by keeping the needed portion of it. A fully connected layer turns the
features into a vector of useful nodes. ReLU activations turn negative coefficients into
zeros, and the softmax layer gives the final output nodes.

The following table shows the types and numbers of layers used.


Layers used             Numbers
Convolutional Layer     3
Max Pooling Layer       3
Dense Layer             2
Dropout Layer           1
Table 4.1 CNN layers and number of layers

Description:
Three convolutional layers are used. The pooling layers are max pooling layers, which return
the maximum of all values in a window. The dense layers combine the most informative features,
and the dropout layer drops a fraction of the units during training to reduce overfitting.
In total, 322,599 parameters are trained. The output shape field in the figure below gives the
dimensions of each layer's output, and each layer has a different number of parameters to
train.

Fig 4.1. Base line model with CNN output
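
The per-layer parameter counts shown in a model summary follow from simple formulas. The sketch below uses hypothetical layer sizes chosen only to illustrate the arithmetic; it is not the report's architecture and does not reproduce its total.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Each output unit has one weight per input unit plus one bias."""
    return (in_units + 1) * out_units

# Hypothetical layers (assumed sizes, for illustration only)
layers = [
    ("conv 3x3, 3 -> 32 channels", conv2d_params(3, 3, 3, 32)),
    ("conv 3x3, 32 -> 64 channels", conv2d_params(3, 3, 32, 64)),
    ("dense 64 -> 7 classes", dense_params(64, 7)),
]
for name, count in layers:
    print(name, count)

total = sum(count for _, count in layers)
print(total)  # 19847
```

Pooling and dropout layers add no trainable parameters, which is why they appear with a count of zero in such summaries.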

4.2. FINE TUNING INCEPTION V3 CLASSIFIER


To reduce overfitting, the Inception V3 model applies multiple filters of different sizes at
the same level. Rather than only going deeper, the network goes wider by running these filters
in parallel, and it is built from the same kinds of layers as a convolutional neural network.
It also uses regularization methods, which improve accuracy and efficiency.

Layers used             Numbers
Convolutional Layer     93
Batch normalization     93
Activation layer        93
Max pooling layer       2
Dense Layer             2
Dropout Layer           1
Table 4.2 Fine Tuning Inception V3 layers and number of layers

Description:
This model consists of a larger number of CNN layers. The total number of trainable parameters
is 1,069,895. The image below shows the layers used.

Fig 4.2. Fine tuning inception V3 classifier output

4.3. FINE TUNING INCEPTION RESNET CLASSIFIER


This classifier uses a residual network, which has skip connections between layers. A skip
connection bypasses some of the in-between nodes. The advantages of skip connections include
avoiding the vanishing gradient problem and keeping gradients balanced. The Inception model
helps to reduce the model size, and its characteristic feature is the 1×1 convolution. The
classifier also uses an optimizer for fine tuning.
By combining the Inception model with a residual network, we can reduce the overfitting
problem as much as possible.
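
A skip connection adds a block's input to its output. The following minimal sketch uses a toy transform in place of real convolutional layers; the transform and the input values are assumptions made for illustration.

```python
def relu_vec(values):
    """Apply ReLU to every element of a vector."""
    return [max(0.0, v) for v in values]

def residual_block(x, transform):
    """Return x + F(x): the input skips past the transform and is added back."""
    fx = transform(x)
    return [a + b for a, b in zip(x, fx)]

def toy_transform(values):
    """Stand-in for the layers inside a block (assumed, not a real convolution)."""
    return relu_vec([0.5 * v - 1.0 for v in values])

x = [4.0, 1.0, -2.0]
y = residual_block(x, toy_transform)
print(y)  # [5.0, 1.0, -2.0]
```

Where the transform outputs zero, the input still passes through unchanged, which is how skip connections keep information and gradients flowing in deep networks.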
Layers used             Numbers
Convolutional Layer     >8
Zero padding            >8
Batch normalization     >8
Activation layer        >8
Max pooling layer       2
Custom scale layer      >5
Dense Layer             2
Dropout Layer           1
Table 4.3. Fine Tuning Inception ResNet layers and number of layers

Description:
Here a custom scale layer is used along with the conventional layers. The custom scale layer
scales the values by a selected scaling value. Residual blocks are connected together, and
Inception layers are also used. The classifier is fine tuned to overcome the overfitting
problem by adjusting the number of layers used. The total number of trainable parameters is
24,352,647. The remaining parameters are not trained. The image below shows the layers used.

Fig. 4.3. Fine Tuning Inception ResNet classifier output

4.4. FINE TUNING DENSE NET CLASSIFIER


A dense network (DenseNet) is a type of convolutional neural network with dense connections
between layers, organized in dense blocks. A dense block consists of convolutional layers,
pooling layers and so on, as in a CNN. Connecting multiple dense blocks makes the DenseNet
classifier. Tuning the classifier reduces the overfitting problem. Here, the fully connected
layers of the network are replaced.
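
Dense connectivity means each layer in a block receives the concatenated outputs of all earlier layers, so the channel count grows by a fixed amount per layer. The sketch below tracks that growth; the input width of 64, growth rate of 32 and 6 layers are illustrative values, not figures taken from the report.

```python
def dense_block_channels(input_channels, growth_rate, num_layers):
    """Channel count seen after each layer when every layer's output is concatenated."""
    channels = [input_channels]
    for _ in range(num_layers):
        # each layer adds growth_rate new feature maps to everything before it
        channels.append(channels[-1] + growth_rate)
    return channels

# Illustrative values (assumed): 64 input channels, growth rate 32, 6 layers
widths = dense_block_channels(64, 32, 6)
print(widths)  # [64, 96, 128, 160, 192, 224, 256]
```

Because earlier feature maps are reused rather than recomputed, each layer can stay narrow while the block as a whole sees a wide input.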

Layers used             Numbers
Convolutional Layer     >5
Zero padding            >5
Batch normalization     >5
ReLU layer              >5
Max pooling layer       2
Dense Layer             2
Dropout Layer           1
Table 4.4 Fine Tuning Dense Net layers and number of layers

Description:
Here more than 5 convolutional layers, zero padding layers, etc. are used. A zero padding
layer pads its input with zeros to give it the correct size. Batch normalization normalizes
the values passed between layers using normalization coefficients. The Rectified Linear Unit
layer converts negative coefficients to zeros without changing the positive values. Max
pooling, dense and dropout layers are used along with these layers. The total number of
trainable parameters is 19,080,071, and some parameters are left untrainable. The selected
batch size is 32. The image below shows the output shape of the model along with the layers
used.

Fig. 4.4. Fine Tuning Dense Net classifier output
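
Zero padding and batch normalization can each be sketched in a few lines. This is an illustrative plain-Python version; the scale `gamma`, shift `beta` and the small constant `eps` follow the usual batch normalization formulation and are assumptions here, as are the input values.

```python
import math

def zero_pad(feature_map, pad=1):
    """Surround a 2D feature map with a border of zeros, pad cells wide."""
    width = len(feature_map[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in feature_map:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Shift a batch to zero mean and scale it to unit variance, then apply gamma and beta."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in values]

padded = zero_pad([[1, 2], [3, 4]])
print(padded)  # [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalized)
```

With a border of one zero on each side, a 3×3 convolution produces an output of the same height and width as its input.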

4.5. RETRAINING DENSE NET CLASSIFIER


Here the DenseNet model is retrained. Retraining refers to giving new data to a model, along
with the old data, while the model is in service. It is a lifelong learning process. If all
past and new data are combined, retraining the neural network model can easily become
intractable. Here we use the DenseNet classifier as the neural network classifier.
Layers used             Numbers
Convolutional Layer     >15
Batch normalization     >15
Zero padding            >4
Max pooling layer       2
Dense Layer             2
Dropout Layer           1
Table 4.5 Retraining Dense Net layers and number of layers
Description:
This model also consists of CNN layers. The total number of trainable parameters is 19,080,071.

Fig.4.5. Retraining Dense net classifier output

CHAPTER 5
RESULTS AND PLOTS

5.1 BASE LINE MODEL WITH CNN CLASSIFIER RESULTS


We obtained an accuracy of about 68% after training the model for 30 epochs.

Loss and Accuracy plots:

Fig 5.1. Accuracy and loss plot for baseline model with CNN classifier

5.2. FINE TUNING INCEPTION V3 CLASSIFIER RESULTS

We obtained an accuracy of about 70% with 15 epochs. The selected batch size is 64.

Accuracy and loss plots:

Fig. 5.2. Accuracy and loss plots for Fine tuning Inception V3 classifier

5.3. FINE TUNING INCEPTION RESNET CLASSIFIER RESULTS


We obtained an accuracy of about 84% with 15 epochs. The selected batch size is 64.

Accuracy and loss plots:

Fig. 5.3. Accuracy and loss plots for Fine tuning Inception ResNet classifier

5.4. FINE TUNING DENSE NET CLASSIFIER RESULTS


We obtained an accuracy of about 86% with 3 epochs. Because the DenseNet model is dense, we
reduced the number of epochs for better accuracy.

Accuracy and loss plots:

Fig. 5.4. Accuracy and loss plots for Fine tuning Dense Net classifier

5.5. RETRAINING DENSE NET CLASSIFIER RESULTS

We obtained an accuracy of about 88% with 15 epochs.

Accuracy and loss plots:

Fig. 5.5. Accuracy and loss plots for retraining Dense Net classifier

CHAPTER 6
CONCLUSION

ALGORITHM                        ACCURACY
Baseline Model                   68.10%
Fine_Tuning_InceptionV3          70.26%
Fine_Tuning_InceptionResNet      84.03%
Fine_Tuning_DenseNet             86.93%
Retraining_DenseNet              88.42%
Table 6.1 Comparison of test accuracy for different classifiers
