Brain Tumor Classification Using Neural Network Based Methods

Kailash D. Kharat (1), Pradyumna P. Kulkarni (2) and M. B. Nagori (3)
(1, 2) Govt. Engineering College, Aurangabad, Maharashtra, India
(3) Dept. of Computer Science and Engg., Aurangabad, Maharashtra, India
E-mail: kailashdkharat@yahoo.co.in, kul.gokul@gmail.com
 
Abstract - Classification of MRI (Magnetic Resonance Imaging) brain tumor images is a difficult task due to the variance and complexity of tumors. This paper presents two neural network techniques for the classification of magnetic resonance human brain images. The proposed technique consists of three stages, namely feature extraction, dimensionality reduction, and classification. In the first stage, we obtain the features of the MRI images using the discrete wavelet transform (DWT). In the second stage, the features are reduced to the more essential ones using principal component analysis (PCA). In the classification stage, two classifiers based on supervised machine learning have been developed. The first classifier is based on a feed-forward artificial neural network (FF-ANN) and the second on a back-propagation neural network. The classifiers are used to classify subjects' MRI brain images as normal or abnormal. Artificial neural networks (ANNs) have been developed for a wide range of applications such as function approximation, feature extraction, optimization, and classification. In particular, they have been developed for image enhancement, segmentation, registration, feature extraction, and object recognition and classification. Among these, object recognition and image classification are the most important, as they are a critical step for high-level processing such as brain tumor classification. Multi-Layer Perceptron (MLP), Radial Basis Function (RBF), Hopfield, Cellular, and Pulse-Coupled neural networks have been used for image segmentation. These networks can be categorized into feed-forward (associative) and feedback (auto-associative) networks.

Keywords - MRI; Feature Extraction; Feature Selection; Tumor Classification; Feed Forward Neural Network; Back-Propagation Neural Network.
 
International Journal of Computer Science and Informatics, ISSN (PRINT): 2231-5292, Vol-1, Iss-4, 2012

I. INTRODUCTION

Early detection and classification of brain tumors is very important in clinical practice. Many researchers have proposed different techniques for the classification of brain tumors based on different sources of information. In this paper we propose a process for brain tumor classification, focusing on the analysis of Magnetic Resonance (MR) images and Magnetic Resonance Spectroscopy (MRS) data collected for patients with benign and malignant tumors. Our aim is to achieve high accuracy in discriminating the two types of tumors through a combination of several techniques for image segmentation, feature extraction and classification. The proposed technique has the potential to assist clinical diagnosis.

The necessary preprocessing steps prior to characterization and analysis of regions of interest (ROIs) are segmentation and registration. Image registration is used to determine whether two subjects have ROIs in the same location. However, in this work we do not take into account the location of the tumor in the classification model, so we do not employ registration. Image segmentation is required to delineate the boundaries of the ROIs, ensuring, in our case, that tumors are outlined and labeled consistently across subjects. Segmentation can be performed manually, automatically, or semi-automatically. The manual method is time consuming and its accuracy depends heavily on the domain knowledge of the operator. Various approaches have been proposed to deal with the task of segmenting brain tumors in MR images. The performance of these approaches usually depends on the accuracy of the spatial probabilistic information collected by domain experts. In previous work, we proposed an automatic segmentation algorithm based on the fuzzy connectedness concept. The main idea is to assign to every pair of voxels x, y in the image a real number between 0 and 1 indicating their connectedness. Starting with several seed points, all the voxels are automatically assigned to the structure to which they have the highest connectedness value. The method also utilizes the statistical information accumulated during the segmentation process.
With this information, the method can provide satisfactory results even in cases where the boundaries of the ROIs cannot be easily identified.

Having segmented the ROI, one needs to extract a set of discriminative features from it in order to build a classification model. Most characterization techniques are based on global visual features that refer to the entire image rather than to the regions of interest. In medical images, however, feature extraction has to focus on specific regions and capture not only shape but also structural and internal volume properties that can be useful for building a classification model. Megalooikonomou et al. proposed a method that efficiently extracts a k-dimensional feature vector using concentric spheres in 3D (or circles in 2D) radiating out of the ROI's center of mass. The method has been applied successfully to classification and similarity searches of spatial ROIs. In this paper, we propose an approach (see Figure 1) for building a classification model from the MR images, from which a group of features is extracted. Instead of employing all of the features to build the model, a preprocessing step of feature selection is performed, aiming to remove the redundant features. Based on the statistical information, only the most informative features extracted from the MR images are utilized in the model building process. In addition, in this paper we consider features from other sources (e.g., MRS data) in the classifier training process. This leads to improved classification accuracy.

Figure 1. Proposed Work Model

II. METHODOLOGY

There are four major steps in the proposed approach for brain tumor classification: (a) ROI segmentation: delineating the boundary of the tumor (ROI) in an MR image; (b) feature extraction: getting meaningful features of the ROI identified in the previous step; (c) feature selection: removing the redundant features; (d) classification: learning a classification model using the features.

A. Segmentation

Within the segmentation process, each image region confined by a rectangular window is represented by a feature vector of length R. These vectors, computed for Q selected regions, are organized in the pattern matrix $P_{R,Q}$ and form clusters in the R-dimensional space. The Q pattern vectors in P are fed into the input NN layer, while the number C of output layer elements represents the desired number of segmentation classes. In each epoch of the network training process, the network weights $W_{C,R}$ are recalculated by minimizing the distance between each input pattern vector and the weights of the winning neuron, that is, the neuron whose coefficients are closest to the current pattern. If the process completes successfully, the network weights belonging to separate output elements represent typical class individuals. In this paper, the region segmentation process consists of training the NN on all image regions extracted by a rectangular sliding window with half overlap, and subsequently using the trained network for region classification. The algorithm consists of the following successive steps:

     1.   Feature vector computation to create the feature matrix P using the sliding window
     2.   Initialization of the learning process coefficients and the network weights matrix W
     3.   Iterative application of the competitive process and the Kohonen learning rule [10] for all feature vectors during the learning stage
     4.   NN simulation to assign class numbers to individual feature vectors
     5.   Evaluation of the region classification results

B. Feature Extraction

The proposed system uses the Discrete Wavelet Transform (DWT) coefficients as the feature vector. The wavelet is a powerful mathematical tool for feature extraction, and has been used to extract wavelet coefficients from MR images. Wavelets are localized basis functions, which are scaled and shifted versions of some fixed mother wavelet. The main advantage of wavelets is that they provide localized frequency information about a function of a signal.
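As an illustration of this localization property, the following is a minimal sketch (not the authors' implementation) of one decomposition level of a discrete wavelet transform. It assumes the Haar wavelet as the mother wavelet and NumPy for array handling; the helper name haar_dwt1 and the test signal are hypothetical. A step in the signal produces a single nonzero detail coefficient at the position of the step, while the approximation coefficients carry the smoothed signal.

```python
import numpy as np


def haar_dwt1(signal):
    """One level of the 1D Haar transform (hypothetical helper).

    Returns (approximation, detail) coefficient arrays, each half the
    length of the input.
    """
    x = np.asarray(signal, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("signal length must be even")
    # Low-pass branch: scaled sums of neighbouring sample pairs.
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    # High-pass branch: scaled differences of neighbouring sample pairs.
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail


# Hypothetical test signal: constant, with one step between samples 4 and 5.
signal = np.array([4.0, 4.0, 4.0, 4.0, 4.0, 9.0, 9.0, 9.0])
approx, detail = haar_dwt1(signal)

# Only the pair that straddles the step gives a nonzero detail coefficient.
print("approximation:", approx)
print("detail:", detail)
```

Applying the same function again to the approximation coefficients would give the next, coarser scale.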
                                                                  
This localization is particularly beneficial for classification. A review of the basic fundamentals of wavelet decomposition follows.

The continuous wavelet transform of a signal $x(t)$, a square-integrable function, relative to a real-valued wavelet $\psi(t)$ is defined as

\[ W_\psi(a,b) = \int_{-\infty}^{\infty} x(t)\,\psi^{*}_{a,b}(t)\,dt \tag{1} \]

where

\[ \psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right) \]

and the wavelet $\psi_{a,b}$ is computed from the mother wavelet $\psi$ by translation and dilation, with $a$ the dilation factor and $b$ the translation parameter (both being real positive numbers). Under some mild assumptions, the mother wavelet $\psi$ satisfies the constraint of having zero mean.

Eq. (1) can be discretized by restraining $a$ and $b$ to a discrete lattice ($a = 2^b$; $a \in R^+$; $b \in R$) to give the discrete wavelet transform (DWT). The DWT is a linear transformation that operates on a data vector whose length is an integer power of two, transforming it into a numerically different vector of the same length. It is a tool that separates data into different frequency components, and then studies each component with a resolution matched to its scale. The DWT can be expressed as

\[ DWT_{x(n)} = \begin{cases} d_{j,k} = \sum_n x(n)\,h^{*}_j(n - 2^j k) \\ a_{j,k} = \sum_n x(n)\,g^{*}_j(n - 2^j k) \end{cases} \tag{2} \]

The coefficients $d_{j,k}$ refer to the detail components in the signal $x(n)$ and correspond to the wavelet function, whereas $a_{j,k}$ refer to the approximation components in the signal. The functions $h(n)$ and $g(n)$ in the equation represent the coefficients of the high-pass and low-pass filters, respectively, whilst the parameters $j$ and $k$ refer to the wavelet scale and translation factors. The main feature of the DWT is the multiscale representation of a function. By using wavelets, a given function can be analyzed at various levels of resolution. Fig. 2 illustrates the DWT schematically. The original image is processed along the x and y directions by the $h(n)$ and $g(n)$ filters, which operate on the row representation of the original image. As a result of this transform there are 4 subband images (LL, LH, HH, HL) at each scale (Fig. 2). Subband image LL is used only for the DWT calculation at the next scale. To compute the wavelet features in the first stage, the wavelet coefficients are calculated for the LL subband using the Haar wavelet function.

Figure 2: DWT schematic

C. Feature Selection and Reduction

One of the most common forms of dimensionality reduction is principal component analysis (PCA). Given a set of data, PCA finds the linear lower-dimensional representation of the data such that the variance of the reconstructed data is preserved. Feature reduction based on principal component analysis of the feature vectors calculated from the wavelets, limiting the feature vectors to the components selected by the PCA, should lead to an efficient classification algorithm using a supervised approach. The main idea behind using PCA in our approach is therefore to reduce the dimensionality of the wavelet coefficients. This leads to a more efficient and accurate classifier.

The feature extraction process was carried out in two steps: first the wavelet coefficients were extracted by the DWT, and then the essential coefficients were selected by the PCA.

Figure 3: Schematic diagram of the feature extraction and reduction scheme used

III. MODEL LEARNING

A. Feed Forward Artificial Neural Network (FF-ANN) Based Classifier

A three-layer neural network was created with 500 nodes in the first (input) layer, 1 to 50 nodes in the hidden layer, and 1 node in the output layer. We varied the number of nodes in the hidden layer in a simulation in order to determine its optimal size.
                                                                         
The hidden-layer size was varied to avoid overfitting or underfitting the data. Due to hardware limitations, ten nodes in the hidden layer were selected to run the final simulation. Figure 4 shows the design of the feed-forward neural network used in this research.

The 500 data points extracted from each subject were then used as inputs to the neural network. The output node resulted in either a 0 or a 1, for control or patient data respectively. Since the nodes in the input layer could take values from a large range, a transfer function was used to transform the data before sending it to the hidden layer, and the result was transformed with another transfer function before being sent to the output layer. In this case, a tan-sigmoid transfer function was used between the input and hidden layers, and a log-sigmoid function was used between the hidden layer and the output layer.

Figure 4: Feed Forward Neural Network

The weights of the hidden nodes needed to be set using "training" data. Therefore, subjects were divided into training and testing datasets. Out of the 69 subjects, 2 random patients and 2 random controls were selected as "test data", while the rest of the dataset was used for training. The training data was fed into the neural network as input and, the outputs being known, the weights of the hidden nodes were calculated using the back-propagation algorithm. 120 trials were performed on the same neural network, each time randomly selecting 65 subjects for retraining and the 4 remaining subjects for testing, to find the accuracy of the neural network prediction.

B. Back Propagation Artificial Neural Network (BP-ANN) Based Classifier

The most widely used neural-network learning method is the BP algorithm. Learning in a neural network involves modifying the weights and biases of the network in order to minimize a cost function. The cost function always includes an error term, a measure of how close the network's predictions are to the class labels for the examples in the training set. Additionally, it may include a complexity term that reflects a prior distribution over the values that the parameters can take.

The activation function considered for each node in the network is the binary sigmoidal function defined (with s = 1) as $f(x) = 1/(1 + e^{-x})$, where $x$ is the sum of the weighted inputs to that particular node. This is a common function used in many BPNs. It limits the output of all nodes in the network to values between 0 and 1. Note that all neural networks are trained until the error for each training iteration stops decreasing.

Figure 5: Back Propagation Neural Network

Figure 5 shows the architecture of the specialized network for the prediction of stroke disease. The complete set of final data (20 inputs) is presented to the generic network, in which the final diagnosis corresponds to the output units.

The net inputs and outputs of the $j$ hidden-layer neurons can be calculated as follows:

\[ \mathrm{net}^h_j = \sum_{i=1}^{N+1} w_{ji}\,x_i, \qquad y_j = f(\mathrm{net}^h_j) \]

The net inputs and outputs of the $k$ output-layer neurons are calculated analogously from the hidden-layer outputs $y_j$.
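The layer computation given above can be sketched in a few lines. This is an illustrative sketch under assumptions rather than the authors' code: the extra input with index N+1 is interpreted as a constant bias input equal to 1, the activation f is the binary sigmoid, and the names layer_forward, w and v and all numeric values are hypothetical. The same function is applied first to the hidden layer and then to the output layer.

```python
import numpy as np


def sigmoid(x):
    """Binary sigmoid activation, f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def layer_forward(weights, inputs):
    """Outputs of one layer (hypothetical helper).

    The last column of `weights` multiplies an appended constant input of 1,
    which is how the (N+1)-th term of the sum is interpreted here.
    """
    augmented = np.append(inputs, 1.0)   # x_{N+1} = 1 (assumed bias input)
    net = weights @ augmented            # net_j = sum_i w_ji * x_i
    return sigmoid(net)                  # y_j = f(net_j)


# Hypothetical values: N = 2 inputs, J = 2 hidden neurons, 1 output neuron.
x = np.array([0.0, 1.0])
w = np.array([[1.0, -1.0, 0.0],          # hidden weights, N + 1 columns
              [2.0, 2.0, -2.0]])
v = np.array([[1.0, 1.0, -1.0]])         # output weights, J + 1 columns

y = layer_forward(w, x)                  # hidden outputs; the second net input is 0, so y[1] = 0.5
z = layer_forward(v, y)                  # network output, always between 0 and 1
print("hidden outputs:", y, "network output:", z)
```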
                                                                  
For the $k$ output-layer neurons, the net inputs and outputs are

\[ \mathrm{net}^o_k = \sum_{j=1}^{J+1} v_{kj}\,y_j, \qquad z_k = f(\mathrm{net}^o_k) \]

Update the weights in the output layer (for all k, j pairs):

\[ v_{kj} \leftarrow v_{kj} + c\lambda\,(d_k - z_k)\,z_k(1 - z_k)\,y_j \]

Update the weights in the hidden layer (for all i, j pairs):

\[ w_{ji} \leftarrow w_{ji} + c\lambda^2\,y_j(1 - y_j)\,x_i \sum_{k=1}^{K} (d_k - z_k)\,z_k(1 - z_k)\,v_{kj} \]

Update the error term:

\[ E \leftarrow E + \sum_{k=1}^{K} (d_k - z_k)^2 \]

and repeat from Step 1 until all input patterns have been presented (one epoch). If E is below some predefined tolerance level, then stop. Otherwise, reset E = 0 and repeat from Step 1 for another epoch.

IV. CONCLUSIONS

In this paper, we propose two approaches for brain tumor detection based on artificial neural networks. The networks were categorized into feed-forward neural networks and back-propagation neural networks.

The purpose is to develop tools for discriminating malignant tumors from benign ones, assisting decision making in clinical diagnosis. The proposed approach utilizes a combination of these two neural network techniques and is composed of several steps including segmentation, feature vector extraction and model learning. These two methods can then be used to filter out non-suspicious brain scans as well as to point out suspicious regions that have properties similar to those of the tumor regions.

ACKNOWLEDGMENT

We would like to thank Prof. M. B. Nagori for valuable guidance at every step in the making of this paper. She motivated us and boosted our confidence, and we must admit that the work would not have been accomplished without her guidance and encouragement.

REFERENCES

[1]    K.M. Iftekharuddin, On techniques in fractal analysis and their applications in brain MRI, in: T.L. Cornelius (Ed.), Medical imaging systems: technology and applications, Analysis and Computational Methods, vol. 1, World Scientific Publications, 2005, ISBN 981-256-993-6.
[2]    L.P. Clarke, R.P. Velthuizen, M.A. Camacho, J.J. Heine, M. Vaidyanathan, L.O. Hall, R.W. Thatcher, M.L. Silbiger, MRI segmentation: methods and applications, Magn. Reson. Imaging 13 (3) (1995) 343–368.
[3]    J.C. Bezdek, L.O. Hall, L.P. Clarke, Review of MR image segmentation techniques using pattern recognition, Med. Phys. 20 (4) (1993) 1033–1048.
[4]    H.S. Zadech, H.S. Windham, Optimal linear transformation for MRI feature extraction, IEEE Trans. Med. Imaging 15 (1996) 749–767.
[5]    H.S. Zadech, J.P. Windham, A comparative analysis of several transformations for enhancement and segmentation of magnetic resonance image scene sequences, IEEE Trans. Med. Imaging 11 (N3) (1992) 302–318.
[6]    D. Wang, D.M. Doddrell, A segmentation-based partial-volume-compensated method for an accurate measurement of lateral ventricular volumes on T1-weighted magnetic resonance images, Magn. Reson. Imaging 19 (2001) 267–272.
[7]    X. Zeng, L.H. Staib, R.T. Schultz, J.S. Duncan, Segmentation and measurement of the cortex from 3-D MR images using coupled surfaces propagation, IEEE Trans. Med. Imaging 18 (10) (1999) 927–937.
[8]    Khan M. Iftekharuddin et al., Fractal-based brain tumor detection in multimodal MRI, Appl. Math. Comput. (2008), doi:10.1016/j.amc.2007.10.063
[9]    N. Pal and S. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, pp. 1277-1294, 1993.
[10]   Josin G. M and Liddle P.F. Neural Network Analysis of the Pattern of Functional Connectivity between Cerebral Areas in Schizophrenia. Biological Cybernetics, 2001; 84, 117-122.
[11]   M. Sasikala and N. Kumaravel, "Comparison of Feature Selection Techniques for Detection of Malignant Tumor in Brain Images", IEEE Indicon 2005 Conference, India, Dec. 2005.
[12]   Yan Zhu and Hong Yan, "Computerized Tumor Boundary Detection Using a Hopfield Neural Network", IEEE Transactions on Medical Imaging, 16, No. 1, February 1997.
[13]   Alan Wee-Chung Liew and Hong Yan, "An Adaptive Spatial Fuzzy Clustering Algorithm for 3-D MR Image Segmentation", IEEE Transactions on Medical Imaging, 22, No. 9, September 2003.
[14]   Nicolae Duta and Milan Sonka, "Segmentation and Interpretation of MR Brain Images: An Improved Active Shape Model", IEEE Transactions on Medical Imaging, 17, No. 6, December 1998.
                                                                           
[15]   Madiha J. Jafri, Vince D. Calhoun, "Functional Classification of Schizophrenia Using Feed Forward Neural Networks", IEEE EMBS Annual International Conference, New York City, USA, Aug 30-Sept 3, 2006.
[16]   Jayashri Joshi, Mrs. A.C. Phadke, "Feature Extraction and Texture Classification in MRI", International Conference [ICCT-2010], 3rd-5th December 2010.
[17]   El-Sayed A. El-Dahshan, Abdel-Badeeh M. Salem, and Tamer H. Younis, "A Hybrid Technique for Automatic MRI Brain Images Classification", Babes-Bolyai, Informatica, Volume LIV, Number 1, 2009.
                            
                                                                       