SETIT 2007
4th International Conference: Sciences of Electronic,
                                     Technologies of Information and Telecommunications
                                                  March 25-29, 2007 – TUNISIA
             On-line diagnosis of induction motor faults
                                            LEBAROUD Abdesselam
                                                    CLERC Guy
               Université Claude Bernard, Lyon 1 – CEGELY – Bâtiment OMEGA 43, boulevard du 11
                              novembre 1918 69622 VILLEURBANNE cedex France
                                            E.mail : lebaroud@yahoo.fr
Abstract: This paper presents a new method for the automatic diagnosis of induction motor faults, based on time-frequency ambiguity plane analysis of the current waveforms. The method is composed of two sequential processes: feature extraction and classification. In the feature extraction process, the time-frequency representation (TFR) is designed to maximize the separability between the classes representing the different faults: bearing fault, stator fault and rotor fault. A new signal is then classified using the Mahalanobis distance.
Key words: automatic diagnosis, time-frequency representation, bearing fault, stator fault, rotor fault
INTRODUCTION

In many classification applications, features are traditionally extracted from standard time-frequency representations (TFRs) [1]. The quadratic class of time-frequency representations can be uniquely characterized by an underlying function called a kernel. In previous time-frequency research, kernels have been derived to fulfil properties such as minimizing quadratic interference. Although some of the resulting TFRs can offer advantages for classifying certain types of signals, sensitive detection or accurate classification is rarely an explicit goal of kernel design [2]. The few methods that do optimize the kernel for classification constrain its form to predefined parametric functions with symmetries that may not be suitable for detection or classification [3], [4].

Traditionally, the objective of time-frequency research is to create a function that describes the energy density of a signal simultaneously in time and frequency. For explicit classification, it is not necessarily desirable to represent the energy distribution of a signal accurately in time and frequency. In fact, such a representation may conflict with the goal of classification, which is to generate a TFR that maximizes the separability of TFRs from different classes. It may be advantageous to design TFRs that specifically highlight differences between classes [5], [6]. In this paper we propose a technique for designing an optimized TFR from the time-frequency ambiguity plane, applied to the precise classification of motor faults such as bearing faults, stator faults and broken bars.

1. INDUCTION MACHINE FAULTS

1.1. Bearing Faults

Bearing faults such as outer race, inner race, ball and train faults cause machine vibration. These faults have vibration frequency components, $f_v$, that are characteristic of each fault type. The mechanical vibration caused by the bearing fault results in air-gap eccentricity. Oscillations in the air-gap length induce variations in flux density, and these variations produce harmonics in the stator current. The characteristic current frequencies, $f_c$, due to the characteristic bearing vibration frequencies are calculated by [7]:

$$f_c = f_s \pm m f_v \qquad (1)$$

where $f_s$ is the fundamental frequency, $f_v$ is the characteristic vibration frequency and $m$ is a positive integer multiplier.

1.2. Rotor faults

A fault on the rotor, such as broken rotor bars, causes asymmetrical working conditions within the rotor. The rotor bar currents, which are at the frequency $sf$, can be resolved into positive and negative sequence components $\pm sf$ within the rotor, where $s$ is the slip. Consequently, the negative sequence rotor current results in stator currents at the frequency [8]:

$$-sf + f_r = -sf + f - sf = (1 - 2s) f \qquad (2)$$

where $f_r = f - sf$ is the rotor rotation frequency in electrical terms.

The interaction of the $(1 - 2s) f$ harmonic of the motor current with the fundamental air-gap flux produces a speed ripple at $2sf$ and gives rise to additional motor current harmonics at the frequencies $(1 \pm 2ks) f$, $k = 1, 2, 3, \ldots$ [9]. With $k = 1$, the frequency sidebands $(1 \pm 2s) f$ of the fundamental are very
commonly used to detect broken bar faults. The motor-load inertia also affects the magnitude of these sidebands [10].

1.3. Stator faults

In ideal conditions, the motor supply current contains only a positive-sequence component, leading to a constant modulus of the current space vector. If there is an inter-turn short circuit in the stator winding, the supply current will exhibit some unbalance. In terms of symmetrical components theory, the stator asymmetry produces a component at frequency $-f$ (i.e., a negative sequence component). This component gives rise to torque ripples at frequencies of $2sf$ and consequently produces speed ripples of different amplitudes, which are filtered differently by the machine-load inertia [11].

2. USE OF THE KERNEL AND AMBIGUITY PLANE FOR CLASSIFICATION

The discrete version of the ambiguity function [5] is:

$$A[\eta,\tau] = F_{n \to \eta}\{R[n,\tau]\} = \sum_{n=0}^{N-1} R[n,\tau]\, e^{-j \frac{2\pi}{N} n \eta} \qquad (3)$$

where $F$ represents the Fourier transform, $\eta$ the discrete frequency shift and $\tau$ the discrete time lag. The instantaneous autocorrelation function $R[n,\tau]$ is defined as:

$$R[n,\tau] = x^{*}[n]\, x[(n + \tau)_N] \qquad (4)$$

where $(\cdot)_N$ denotes an index taken modulo $N$.

This method, used to design kernels (and thus TFRs), optimizes the discrimination between predefined sets of classes. The resulting kernels are not restricted to any predefined function but are arbitrary in shape. The approach applies the smoothing needed to achieve the best classification performance. The use of the kernel and ambiguity plane includes two sequential processes: feature extraction and classification. Our goal is to design a classification-optimal representation that specifically emphasizes the differences between classes. It is not necessary for the representation to describe the time-frequency information of the signal accurately. This technique has been successfully applied to tool-wear monitoring and radar transmitter identification [5].

The kernel determines the representation and its properties. A kernel function is a generating function that operates upon the signal to produce the TFR. The characteristic function of each TFR is $A(\eta,\tau)\,\phi(\eta,\tau)$. In other words, for a given signal, a TFR can be uniquely mapped from a kernel [3]. The classification-optimal representation $TFR_i$ can be obtained by smoothing the ambiguity plane with an appropriate kernel $\phi_{opt}$, which is a classification-optimal kernel. The problem of designing $TFR_i$ thus becomes equivalent to designing the classification-optimal kernel $\phi_{opt}(\eta,\tau)$.

3. INDUCTION MACHINE FAULTS CLASSIFICATION

The optimal TFR method is applied to the classification of three kinds of induction machine faults: bearing fault, stator fault and rotor fault. Thus, four classes are considered: healthy motor, bearing fault, stator fault and broken bars. The goal of the feature extraction is to generate an N-point feature vector from the original 10000-point current signal. The feature vectors are classified with the Mahalanobis distance. The characteristic frequencies of the three faults considered are located near the fundamental frequency and, in order to preserve the relevant information, the original signal is resampled with a downsampling rate of 25. Only the range of the required frequencies is preserved. Downsampling greatly reduces the signal dimension, and hence the computational complexity. In addition, electrical noise is attenuated. After this step, a new 200-point signal that keeps the signature of the original signal is obtained. The time-frequency ambiguity plane of the signal is calculated by eq. (3). We construct the TFR that is optimal for our classification task by smoothing the ambiguity plane with a class-dependent kernel. Here, the dimension of the ambiguity plane is 200 x 200. In essence, we directly select N points from this plane as our feature vector.

[Figure: the stator currents $i_{s1}, i_{s2}, \ldots, i_{si}$ are mapped to $(TFR)_1, (TFR)_2, \ldots, (TFR)_i$ and passed to the classification stage. Level 1: stator fault, bearing fault, rotor fault. Level 2: severity degree.]
Figure 1. Procedure of fault classification

3.1. Fisher's discriminant ratio kernel (FDR)

The kernel $\phi_{opt}(\eta,\tau)$ is designed for each specific classification task. We determine N locations in the 200 x 200 ambiguity plane such that the values at these locations are very similar for signals from the same class, while they vary significantly for signals from different classes. The notation $A_{ij}[\eta,\tau]$ represents the ambiguity plane of the jth training example in the ith class. We design and use Fisher's discriminant ratio kernel to obtain those N locations.
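As an illustration of eqs. (3) and (4), the following Python sketch computes the discrete ambiguity plane of a sampled signal with NumPy. The function name `ambiguity_plane` is an assumption made for this example and does not come from the paper. The sketch also reads the subscript N in eq. (4) as a modulo-N (circular) lag, and uses the FFT along n in the role of the transform from n to the frequency shift.

```python
import numpy as np

def ambiguity_plane(x):
    """Discrete ambiguity plane A[eta, tau] of a length-N signal, eqs. (3)-(4).

    Hypothetical helper: rows index the frequency shift eta, columns the lag tau.
    """
    x = np.asarray(x, dtype=complex)
    n_samples = x.size
    n = np.arange(n_samples)
    # Circular lag table: lags[n, tau] = (n + tau) mod N  (assumed reading of eq. (4))
    lags = (n[:, None] + n[None, :]) % n_samples
    # Instantaneous autocorrelation R[n, tau] = conj(x[n]) * x[(n + tau) mod N]
    r = np.conj(x)[:, None] * x[lags]
    # DFT along n: sum_n R[n, tau] * exp(-2j*pi*n*eta/N), as in eq. (3)
    return np.fft.fft(r, axis=0)
```

For a 200-point signal this returns a 200 x 200 complex array, the plane size quoted in Section 3. The point at zero frequency shift and zero lag equals the signal energy, which gives a quick sanity check.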
The kernels are designed from I training example signals of each class, using the following equation:

$$FDR(\eta,\tau) = \frac{\left(m_{Fault}[\eta,\tau] - m_{Healthy}[\eta,\tau]\right)^2}{V^2_{Fault}[\eta,\tau] + V^2_{Healthy}[\eta,\tau]} \qquad (5)$$

where

$$m_{Fault}[\eta,\tau] = \frac{1}{I} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} A_{ij}^{Fault}[\eta,\tau] \qquad (6)$$

$$m_{Healthy}[\eta,\tau] = \frac{1}{I} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} A_{ij}^{Healthy}[\eta,\tau] \qquad (7)$$

$m_{Fault}[\eta,\tau]$ and $m_{Healthy}[\eta,\tau]$: averages of the fault and healthy classes in the ambiguity plane
$I = N_1 N_2$: number of examples per class
$N_2$: number of current examples at the same load level
$N_1$: number of load levels
$V^2_{Fault}[\eta,\tau]$ and $V^2_{Healthy}[\eta,\tau]$: variances of the fault and healthy classes in the ambiguity plane

$$V^2_{Fault}[\eta,\tau] = \frac{1}{I} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \left(A_{ij}^{Fault}[\eta,\tau] - m_{Fault}[\eta,\tau]\right)^2 \qquad (8)$$

$$V^2_{Healthy}[\eta,\tau] = \frac{1}{I} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \left(A_{ij}^{Healthy}[\eta,\tau] - m_{Healthy}[\eta,\tau]\right)^2 \qquad (9)$$

The system is trained using ten current signals at each of the 0% and 100% load levels. We take $N_1 = 2$ so that the training covers the range of load levels. Testing is then performed on current signals collected at the 25% and 70% load levels. For C classes, C − 1 kernels must be designed. As we have four classes (three fault cases and the healthy case of the machine), we must design three kernels: a bearing fault kernel, a stator fault kernel and a rotor fault kernel. Each kernel separates the healthy case from the corresponding fault case.

3.2. Feature Extraction

We transform the Fisher's discriminant ratio (FDR) into a binary kernel $\phi_{opt}$ by replacing the N largest points with 1's and the other points with 0's. Features can be extracted directly from $\phi_{opt}[\eta,\tau] \circ A[\eta,\tau]$, where $\circ$ is the element-by-element matrix product. The kernel has the same dimensions as the ambiguity plane. By multiplying the $\phi_{opt}$ kernel with the ambiguity plane of a given signal, we find k feature points for this signal. We put them into a vector in order to create the training feature vector $FV^{(c)}_{train}(k)$ of class c:

$$FV^{(c)}_{train}(k) = \phi^{(c)}_{opt}[\eta,\tau] \circ A^{(c)}[\eta,\tau] \qquad (10)$$

$\phi^{(c)}_{opt}[\eta,\tau]$: optimal kernel obtained from training
$A^{(c)}[\eta,\tau]$: mean ambiguity plane of class c

where

$$\phi^{(c)}_{opt}[\eta,\tau] \circ A^{(c)}[\eta,\tau] = \begin{cases} A^{(c)}[\eta,\tau], & \text{if } \phi^{(c)}_{opt}[\eta,\tau] = 1 \\ 0, & \text{if } \phi^{(c)}_{opt}[\eta,\tau] = 0 \end{cases} \qquad (11)$$

Feature points are the ambiguity plane points at the locations $(\eta,\tau)$ where $\phi^{(c)}_{opt}[\eta,\tau] = 1$. The process of feature extraction therefore consists of selecting, from the ambiguity plane, the points that are optimal for the classification task. The optimal number of nonzero points is determined by evaluating the classifier performance using the K best kernel points (i.e., the K points with the largest Fisher's discriminant ratio). $K_{opt}$ is selected as the number K for which the probability of correct classification is greatest. The selection of points in the Doppler-delay plane can be interpreted as masking the ambiguity function of the signal with an adapted binary function, which provides an optimal kernel. Owing to the symmetries of the ambiguity plane, only points in a quarter plane are considered.

3.3. Classification by Mahalanobis distance

After the kernels have been designed using examples from each of the C classes, the actual classification is performed. Given an unknown test signal vector (the classifier is not trained on this example), the classifier estimates the class membership of this example. The classification of the point x into one of c classes can be realized by standard classification techniques (e.g. linear or quadratic discriminant functions, distances, neural networks). We choose to classify with the Mahalanobis distance. The feature vector of signal x is given by:

$$FV_x = \phi^{(c)}_{opt} \circ A_x \qquad (12)$$

where $A_x$ is the ambiguity plane of signal x. The assignment rule is:

$$x \text{ is assigned to class } c_i \iff i = \arg\min_{i = 1, \ldots, c} \{ d_M(FV_x) \}$$

where $d_M(FV_x)$ is the Mahalanobis distance of $FV_x$ to class $c_i$, and the arg min selects the class for which $d_M(FV_x)$ is minimal:

$$d_M(FV_x) = \left( \left(FV_x - FV^{(c)}_{train}\right) \Sigma_c^{-1} \left(FV_x - FV^{(c)}_{train}\right)^T \right)^{1/2} \qquad (13)$$

where $(\cdot)^T$ denotes the matrix transpose. The covariance matrices $\Sigma_c$ are estimated from the training data. A reject decision is taken when the response x(t) to be classified is far from every class:

$$\begin{cases} x \text{ is assigned} & \text{if } d_M(FV_x) < \eta \\ x \text{ is rejected} & \text{otherwise} \end{cases} \qquad (14)$$

where $\eta$ is a given reject threshold.

The error of the badly classified points of the feature vectors $FV_x$ is calculated by:

$$e(i,j)\% = \frac{1}{I} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \frac{FV_{train\,i,j} - FV_{x\,i,j}}{FV_{train\,i,j}} \cdot 100 \qquad (15)$$

where
$I = N_1 N_2$: number of examples per class
$N_2$: number of current examples at the same load level
$N_1$: number of load levels

4. EXPERIMENTAL RESULTS
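As a compact recap of Sections 3.1 and 3.2, the kernel design of eq. (5) and the feature extraction of eqs. (10) and (11) may be sketched in Python as follows. The function names `fdr_map`, `binary_kernel` and `feature_vector` are hypothetical, and two choices are assumptions of this sketch rather than statements from the paper: the magnitudes of the complex ambiguity planes are used as real-valued inputs, and a small constant `eps` guards against a zero denominator.

```python
import numpy as np

def fdr_map(planes_fault, planes_healthy, eps=1e-12):
    """Fisher's discriminant ratio of eq. (5) at every (eta, tau) location.

    Each input has shape (I, rows, cols): I training ambiguity planes per class.
    Assumption: plane magnitudes are used, since the planes are complex.
    """
    f = np.abs(np.asarray(planes_fault))
    h = np.abs(np.asarray(planes_healthy))
    m_f, m_h = f.mean(axis=0), h.mean(axis=0)    # class means, eqs. (6)-(7)
    v_f, v_h = f.var(axis=0), h.var(axis=0)      # 1/I variances, eqs. (8)-(9)
    return (m_f - m_h) ** 2 / (v_f + v_h + eps)  # eps is an added safeguard

def binary_kernel(fdr, k):
    """Binary kernel: 1 at the k locations with the largest FDR, 0 elsewhere."""
    if k < 1:
        raise ValueError("k must be at least 1")
    kernel = np.zeros(fdr.shape, dtype=int)
    top = np.argsort(fdr, axis=None)[-k:]        # flat indices of the k largest
    kernel[np.unravel_index(top, fdr.shape)] = 1
    return kernel

def feature_vector(kernel, plane):
    """Plane magnitudes at the nonzero kernel locations, eqs. (10)-(11)."""
    return np.abs(np.asarray(plane))[kernel == 1]
```

Calling `feature_vector` with the mean plane of a class gives a training vector in the sense of eq. (10), while calling it with the plane of a new signal gives the test vector of eq. (12). Restricting the search to a quarter plane, as the text describes, would amount to slicing the FDR map before selecting the top k points.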
  An acquisition of current signals was carried out on             the locations which have values close or similar of
a test bench, which is made of a 5.5 KW induction                  points those of other classes.
motor (fig.2).
                                                                             150
                                                                                            Bearing fault
                                                                     Doppler (Hz)
                                                                             100
                                                                                    50
                                                                                                                         Rotor fault
  Fig.2. test bench of induction motor                                                                                 Stator fault
                                                                                     0
                                                                                      0                 50              100                150
                                                                                                             Delay (points)
                                                                                    Fig.5. Ambiguity plane smoothed by three kernel
                                                                      The features vector of the signal to be classified
                                                                    FVx was compared with features vectors of the
                                                                   training by using the Mahalanobis distance from
  Fig.3. Broken bars        Fig.4. Bearing faults                  eq.14. The decision rule of signal assignment is made
                                                                   by eq.15.        The threshold λ = 0.4 was tested
   The sampling rate is 10 KHz. The number of                      successfully on several signals in order to obtain a
samples per signal is N=10000. The data acquisition                correct classification. We have tested signals which do
set on the machine consists of 15 examples of stator               not belong to the training set of the three faults;
current recorded with different levels (0%, 25%, 50%,              bearing fault, stator fault and rotor fault with various
75%, 100%). Different operating conditions from the                levels of load (25%, 50%, 75%). Five signals
machine were considered; healthy, bearing fault                    examples are taken for each fault and for each load
(fig.4), stator fault (fig.2) and rotor fault (fig.3) .The         level. Thus we will have 5 X 3=15 signals test for
training set is carried out on 10 current examples. The            each fault. After extraction, the features vectors of
last five current examples are used to test the system             signal to be classified, we took 50 points at each
classification.                                                    feature vector with various levels of load. The
   The training set for the three faults and for the               calculation of the Mahalanobis distance d M (Vx ) is
healthy machine was made, each one, from 10                        done along these features vectors. The figure 8 shows
examples of no-load current and 10 other examples for full load.
Consequently, we have 20 training examples for each of the three faults
and 20 training examples for the healthy machine. The bearing fault
kernel is designed to find the point locations of maximum separation
between two classes: the bearing fault class and the healthy motor
class. The ambiguity plane initially contains 200 x 200 = 40000 points;
exploiting its symmetry with respect to the origin, we keep only one
quarter of the plane, which corresponds to N = 10000 points. The stator
fault kernel is designed to find the point locations of maximum
separation between the stator fault class and the healthy motor class.
Likewise, the rotor fault kernel is designed to find the point locations
of maximum separation between the rotor fault class and the healthy
motor class. The ambiguity plane of each of the three kernels is
computed from N = 10000 points. The classification consists of
separating the fault classes. The Fisher point locations are represented
in the Doppler-delay plane (Fig.5). We retained three point locations
per kernel, i.e. nine locations {(ξ,τ)_1, …, (ξ,τ)_9} of strongest
contrast. These locations are arranged in the training feature vector
{FV_1, …, FV_9}. This selection is made on the basis of the contrast
value and of a compact localization in the ambiguity plane. We also
removed

that the error is zero for the first eight points of the test vectors
for the bearing fault. This holds at the 25%, 50% and 75% load levels.
The bearing fault is characterized by only three points, all of which
belong to these first eight points. Consequently, the tested signals are
identified correctly.

[Plot: error (%) versus Fisher’s points, for 25%, 50% and 75% load]
           Fig.8. Classification errors for bearing fault

   For the stator fault, Figure 9 shows that the classification error is
zero for the first twelve points of the test vectors, at the 25%, 50%
and 75% load levels. The stator fault is characterized by three points
that belong to these first twelve points. Consequently, the signals are
identified correctly.
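
The point-selection step described above (ranking ambiguity-plane
locations by class contrast and keeping the three strongest per kernel)
can be sketched as follows. This is only an illustration under stated
assumptions, not the authors' implementation: the contrast is assumed
to be a Fisher-type ratio (squared difference of class means over the
sum of class variances), the data are synthetic, and the names
fisher_contrast and top_points are hypothetical.

```python
import numpy as np

def fisher_contrast(class_a, class_b, eps=1e-12):
    """Per-point Fisher-type ratio between two classes (assumed form).

    class_a, class_b: arrays of shape (n_examples, n_points), one row per
    training example, one column per ambiguity-plane location.
    """
    mean_a, mean_b = class_a.mean(axis=0), class_b.mean(axis=0)
    var_a, var_b = class_a.var(axis=0), class_b.var(axis=0)
    return (mean_a - mean_b) ** 2 / (var_a + var_b + eps)

def top_points(contrast, k=3):
    """Indices of the k locations with the strongest contrast."""
    return np.argsort(contrast)[::-1][:k]

# Quarter of the 200 x 200 ambiguity plane: N = 100 * 100 = 10000 points.
n_points = 100 * 100

# Synthetic stand-in for 20 healthy and 20 faulty training examples.
rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, size=(20, n_points))
faulty = rng.normal(0.0, 1.0, size=(20, n_points))

# Plant three discriminant locations so the selection has something to find.
planted = np.array([17, 4242, 9001])
faulty[:, planted] += 50.0

contrast = fisher_contrast(healthy, faulty)
selected = top_points(contrast, k=3)
print(sorted(selected.tolist()))
```

Running the same selection once per kernel (bearing, stator, rotor, each
against the healthy class) would give the nine locations that make up
the feature vector.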
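
The decision rule, assigning a test signal to the class at the smallest
Mahalanobis distance, can be illustrated with a short sketch. It is a
hedged example rather than the authors' code: the three-dimensional
feature vectors, the class centres and the helper names fit_class,
mahalanobis_sq and classify are all assumptions made for illustration.

```python
import numpy as np

def fit_class(features):
    """Estimate the mean and inverse covariance of one class.

    features: array of shape (n_examples, n_features).
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    # Pseudo-inverse keeps the sketch usable if the covariance is near-singular.
    return mean, np.linalg.pinv(cov)

def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance from x to a class."""
    d = x - mean
    return float(d @ inv_cov @ d)

def classify(x, models):
    """Return the name of the class with the smallest distance to x."""
    return min(models, key=lambda name: mahalanobis_sq(x, *models[name]))

# Synthetic training sets: 20 examples per class, 3 features per example.
rng = np.random.default_rng(1)
centres = {"healthy": 0.0, "bearing": 5.0, "stator": 10.0, "rotor": 15.0}
models = {
    name: fit_class(rng.normal(centre, 1.0, size=(20, 3)))
    for name, centre in centres.items()
}

# Hypothetical feature vector extracted from an unclassified signal.
test_signal = np.array([10.2, 9.7, 10.1])
label = classify(test_signal, models)
print(label)
```

Unlike the Euclidean distance, the Mahalanobis distance weights each
feature by the spread of the class, so a feature that varies widely
within a class counts for less in the decision.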
[Plot: error (%) versus Fisher’s points, for 25%, 50% and 75% load]
               Fig.9. Classification errors for stator fault

   Figure 10 shows that, for the rotor fault, the mean classification
error is zero for the first thirteen points of the test vectors at the
25%, 50% and 75% load levels. The rotor fault is characterized by three
points that belong to these first thirteen points. Consequently, the
tested signals are also identified correctly.

[Plot: error (%) versus Fisher’s points, for 25%, 50% and 75% load]
               Fig.10. Classification error for rotor fault

                                    CONCLUSION

   In this study we proposed a method based on the time-frequency
representation (TFR) for the diagnosis of induction motor faults. We
showed that classical TFRs have parametric kernels that are preset a
priori, which makes them inappropriate for classification. We therefore
work in the ambiguity plane, from which any TFR can be derived by a
suitable choice of the kernel. This gives a precise classification of
the signal. The diagnosis system classifies the different faults:
bearing fault, stator fault and rotor fault. Each fault is characterized
by a kernel. The optimized kernels allow nine discriminant locations
{(ξ,τ)_1, …, (ξ,τ)_9} to be extracted in the Doppler-delay plane. An
unclassified signal is assigned to a class using the Mahalanobis
distance. The mean misclassification error is zero. Diagnosis by TFR
takes the load level into account and provides reduced computing time
and accurate classification.

                                    REFERENCES

[1] B. Yazıcı, G. B. Kliman, "An Adaptive Statistical Time–Frequency
    Method for Detection of Broken Bars and Bearing Faults in Motors
    Using Stator Current," IEEE Trans. Industry Applications, vol. 35,
    no. 2, March/April 1999.
[2] L. Atlas, J. Droppo, and J. McLaughlin, "Optimizing time-frequency
    distributions via operator theory," in Proceedings of the 1997 SPIE,
    vol. 3162, pp. 161–171.
[3] C. Heitz, "Optimum time-frequency representations for the
    classification and detection of signals," Appl. Signal Process.,
    1995, vol. 2, no. 3, pp. 124–143.
[4] M. Davy and C. Doncarli, "Optimal kernels of time-frequency
    representations for signal classification," in Proc. IEEE-SP Int.
    Symp. Time-Freq. Time-Scale Anal., 1998, pp. 581–584.
[5] B. W. Gillespie and L. Atlas, "Optimizing Time–Frequency Kernels for
    Classification," IEEE Trans. Signal Processing, vol. 49, no. 3,
    March 2001.
[6] M. Wang, G. I. Rowe, and A. V. Mamishev, "Classification of power
    quality events using optimal time-frequency representations - Part 1:
    application," IEEE Trans. Power Delivery, 2004, vol. 19,
    pp. 1496–1503.
[7] M. J. Devaney and L. Eren, "Detecting Motor Bearing Faults," IEEE
    Instrumentation & Measurement Magazine, December 2004.
[8] D. K. Perovic, M. Arkan, and P. Unsworth, "Induction motor fault
    detection by space vector angular fluctuation," IEEE IAS, Rome 2000,
    vol. 1, pp. 388-394.
[9] F. Filippetti, G. Franceschini, C. Tassoni, and P. Vas, "AI
    techniques in induction machine diagnosis including the speed ripple
    effect," IEEE IAS, California 1996, vol. 1, pp. 655-662.
[10] A. Lebaroud, A. Bentounsi, G. Clerc, "Detailed Study of the Rotor
    Asymmetry Effects of Induction Machine Under Different Supply
    Conditions," 11th European Conference on Power Electronics and
    Applications (EPE’05), Dresden, Germany, 11-14 September 2005.
[11] A. Lebaroud, A. Bentounsi, A. Khezzar, M. Boucherma, "Effects of
    Broken Bar Induction Motor with Stator Asymmetry and Distorted
    Supply," International Conference on Electrical Machines (ICEM’04),
    Poland, September 2004.
[12] L. Cohen, "Time-Frequency Analysis," Englewood Cliffs, NJ:
    Prentice-Hall, 1995.