CPE 633
Chapter 1 - Preliminaries
                    Dr. Rhonda Kay Gaede
                                      UAH
Electrical and Computer Engineering
   UAH                                  Chapter 1   CPE 633
                                      Motivation
   • Computers are everywhere.
   • Computers are used in _______________
     and _________________ applications.
   • Computer systems (______________ and
     ____________) are incredibly __________.
   • With complexity comes a propensity
     for ____________.
   • Two approaches:
         – ________________________
         – ____________________________________________
                        1.1 Fault Classification
   • Definitions
         – A fault (or failure) can be either a
           ___________________ or a ___________________
         – An ______ is a manifestation of the ______.
   • Examples
         – Output of adder circuit ______________
         – sin(x) computation really ___________
   • Fault effects can ____________.
   • To limit this spread, designers
     incorporate _________________________.
   • These containment zones are
     __________ that reduce the chance
     that an effect can spread.
         – ____________________________________________
           _______.
   • Hardware faults can be:
         – ________________
         – _____________
         – __________________
   • Hardware faults are __________ or
     _______________.
                     1.2 Types of Redundancy
   • All of fault tolerance is an exercise
     in _____________ and ____________
     _________________ – the property of
     ___________________________ than is
     minimally necessary.
   • Four forms of redundancy: __________,
     _____________, _________, ______________
   • Hardware redundancy is provided by
     ___________________________ in the
     design to _________ or _________ errors.
         – It can be ________, _________ or __________.
   • The best-known form of _____________
     redundancy, _________________ and
     _______________ coding, is widely used in
     ___________________________.
   • ___________________ and ________________
     codes are also used to protect data
     communicated over _________ (channels
     subject to many __________ failures)
     channels. ______________ upon detection of
     an error is ________ redundancy.
   • _______________ redundancy leads to
     hardware _____________.
    1.3 Basic Measures of Fault Tolerance
   • What does it mean to make machines more
     __________________?
         – We need __________
   • Traditional Measures
         – ______________, _____, is the probability that the
           system has been ___________________ in the time
           interval [0,t]. It is suitable for applications in
           which even a ___________________________ can prove
             costly.
              • ____________________________ (MTTF)
              • _______________________________ (MTBF)
              • _______________________ (MTTR)
              • ________ = ________ + ___________
         – _______________, _____, is the average _____________
           _______ over the interval [0,t] that the system is
           _____.
                    \[ A = \lim_{t \to \infty} A(t) \]
                    \[ A = \frac{\mathrm{MTTF}}{\mathrm{MTBF}} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}} \]
         – ________________________, ______, is the probability
           that the system is up at ___________________________
           ____________.
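
As a rough illustration of the long-run availability formula A = MTTF / (MTTF + MTTR) given above, the following Python sketch computes A from the two mean times. The function name and the sample figures (999 hours to failure, 1 hour to repair) are assumptions chosen for illustration, not values from the course.

```python
def steady_state_availability(mttf: float, mttr: float) -> float:
    """Long-run availability A = MTTF / (MTTF + MTTR).

    mttf and mttr must use the same time unit (hours here, by assumption).
    """
    if mttf < 0 or mttr < 0:
        raise ValueError("MTTF and MTTR must be non-negative")
    mtbf = mttf + mttr  # MTBF = MTTF + MTTR, as in the formula above
    if mtbf == 0:
        raise ValueError("MTTF + MTTR must be positive")
    return mttf / mtbf


# Hypothetical system: fails on average every 999 h, takes 1 h to repair.
availability = steady_state_availability(mttf=999.0, mttr=1.0)
print(f"A = {availability:.4f}")
```

Note that shortening the repair time raises A just as lengthening the time to failure does, which is why both quantities appear in the formula.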
   • All this is nice as long as we know what ____ means.
         – Some cases are simple, _________________ for example.
         – Other cases are less clear. What if ______________________
           ____________________________________________?
         – Many systems have ________________ states
   • Extension of traditional measures to _____________
     ___________________________________ of a system with n
     processors.
                  \[ ACC = \sum_{i=1}^{n} c_i P_i(t) \]
               • c_i is the _______________________________ of a system with i
                 ____________________ processors
               • P_i(t) is the probability that exactly __________________ are
                 operational at time t
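
The sum defining ACC can be evaluated numerically once the c_i and P_i(t) are known. A minimal Python sketch follows. The function names are hypothetical, and two modeling choices are assumptions that do not come from the slides: each processor is operational at time t independently with the same probability p_up, and capacity grows linearly with the number of working processors (c_i = i).

```python
from math import comb


def state_probabilities(n: int, p_up: float) -> list[float]:
    """P_i for i = 1..n: probability that exactly i of n processors are up.

    Assumes independent, identical processors (binomial model).
    """
    return [comb(n, i) * p_up**i * (1 - p_up) ** (n - i) for i in range(1, n + 1)]


def average_computational_capacity(capacities, probabilities) -> float:
    """ACC = sum over i of c_i * P_i(t), with both lists indexed i = 1..n."""
    if len(capacities) != len(probabilities):
        raise ValueError("need one capacity c_i per probability P_i")
    return sum(c * p for c, p in zip(capacities, probabilities))


n = 2
p_i = state_probabilities(n, p_up=0.9)        # [P_1, P_2]
c_i = [float(i) for i in range(1, n + 1)]     # assumed linear capacity: c_i = i
acc = average_computational_capacity(c_i, p_i)
print(f"ACC = {acc:.2f}")
```

The state with zero operational processors contributes nothing to the sum, which is why the index starts at i = 1.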
 • Network Measures
       – Classical _______ and ____________________ – the minimum
         number of ___________ and __________ that have to fail
         before the network becomes ________________________.
       – Average ________________________
       – Maximum __________________ (_______________)