Mean Time Between Failures in Plant Maintenance
    KPIs in Plant Maintenance
    Mean Time Between Failures (MTBF) and Mean Time To Repair (MTTR) are two important KPIs in plant
    maintenance management and lean manufacturing.
    Mean Time Between Failures = (Total up time) / (number of breakdowns)
    Mean Time To Repair = (Total down time) / (number of breakdowns)
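    As a minimal sketch, the two formulas above can be written as small Python functions. The function
    names and the sample month (700 hours running, 20 hours under repair, 4 breakdowns) are illustrative
    assumptions, not figures from a real plant.

```python
def mean_time_between_failures(total_uptime_hours: float, breakdowns: int) -> float:
    """Total up time divided by the number of breakdowns."""
    if breakdowns <= 0:
        raise ValueError("at least one breakdown is needed to compute a mean")
    return total_uptime_hours / breakdowns


def mean_time_to_repair(total_downtime_hours: float, breakdowns: int) -> float:
    """Total down time divided by the number of breakdowns."""
    if breakdowns <= 0:
        raise ValueError("at least one breakdown is needed to compute a mean")
    return total_downtime_hours / breakdowns


# Hypothetical month: 700 h running, 20 h under repair, 4 breakdowns.
mtbf = mean_time_between_failures(700.0, 4)  # 175.0 hours
mttr = mean_time_to_repair(20.0, 4)          # 5.0 hours
print(f"MTBF = {mtbf:.1f} h, MTTR = {mttr:.1f} h")
```

    Both functions refuse a breakdown count of zero, because a mean over no failures is undefined.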
    "Mean Time" means, statistically, the average time.
    "Mean Time Between Failures" is literally the average time elapsed from one failure to the next. 
    Usually people think of it as the average time that something works until it fails and needs to be
    repaired (again). As reliable production processes are crucial in a Lean Manufacturing environment,
    MTBF is vital for all lean initiatives.
    "Mean Time To Repair" is the average time that it takes to repair something after a failure.
    For something that cannot be repaired, the correct term is "Mean Time To Failure" (MTTF).  Some
    would define MTBF, for repairable devices, as the sum of MTTF plus MTTR.  In other words, the
    mean time between failures is then the time from one failure to the next, repair time included.
    This distinction is important if the repair time is a significant fraction of MTTF.
    Here is an example.  A light bulb in a chandelier is not repairable, so MTTF is most appropriate.  (The
    light bulb will be replaced).  The MTTF might be 10,000 hours. 
    On the other hand, without oil changes, an automobile's engine may fail after 150 hours of highway
    driving; that is the MTTF.  Assuming 6 hours to remove and replace the engine (MTTR), Mean Time
    Between Failures is 150 hours by the up-time formula above, or 156 hours if it is defined as MTTF
    plus MTTR.
    Like automobiles, most manufacturing equipment will be repaired, rather than replaced after a failure,
    so Mean Time Between Failures is the more appropriate measurement.     
    What is a Failure?
    "Failure" can have multiple meanings.  Let us briefly examine one device's "failures":
    An Uninterruptible Power Supply (UPS) may have five functions under two conditions:
           While the main power is available:
                1. Allow power to flow from the main source to the machine being protected
                2. Condition the power by limiting surges or brownouts
                3. Store power in a battery, up to the battery's full charge
           When the main power is interrupted:
                4. Supply continuous power to the machine being protected
                5. Emit a signal to indicate that the main power is off
    There is no question that the UPS has failed if it prevents main power from flowing to the machine
    being protected (function 1).  Failures for functions 2, 3 or 5 may not be obvious, because the
    "protected" machine is still running on main power or on the battery supply.  Even if noticed, these
    failures may not trigger immediate corrective measures because the "protected" machine is still running
    and it may be more important to keep it running than to repair or replace the UPS.
    What is Availability?
    The "availability" of a device is, mathematically, MTBF / (MTBF + MTTR) for scheduled working time.
    The automobile in the earlier example is available for 150/156 = 96.2% of the time.  The repair is
    unscheduled down time.
    With an unscheduled half-hour oil change every 50 hours – when a dashboard indicator alerts the
    driver – availability would increase to 50/50.5 = 99%.
    If oil changes were properly scheduled as a maintenance activity, outside scheduled working time,
    they would no longer count as down time and availability would be 100%.
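    The three automobile scenarios above can be checked with a short Python sketch. The function name
    is an assumption made for illustration, and the scheduled case is modelled by assuming zero
    unscheduled repair time, since scheduled maintenance falls outside scheduled working time.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), as a fraction of scheduled working time."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Engine fails after 150 h and takes 6 h to replace.
engine_replacement = availability(150.0, 6.0)

# Unscheduled half-hour oil change every 50 h.
unscheduled_oil_change = availability(50.0, 0.5)

# Assumption: scheduled oil changes cause no unscheduled down time.
scheduled_oil_change = availability(50.0, 0.0)

for label, value in [
    ("engine replacement", engine_replacement),
    ("unscheduled oil change", unscheduled_oil_change),
    ("scheduled oil change", scheduled_oil_change),
]:
    print(f"{label}: {value * 100:.1f}%")
```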
     
    Why are these important for reliability?
    "Availability" is a key performance indicator in manufacturing; it is part of the "Overall Equipment
    Effectiveness" (OEE) metric.
    A production schedule that includes down time for preventative maintenance can accurately predict
    total production.  Schedules that ignore Mean Time Between Failures and Mean Time To Repair are
    simply future disasters awaiting remediation.
     
    How to calculate actual Mean Time Between Failures
    Actual or historic Mean Time Between Failures is calculated using observations in the real world. 
    (There is a separate discipline for equipment designers, based on the components and anticipated
    workload).
    Calculating actual Mean Time Between Failures requires a set of observations; each observation
    consists of:
           Uptime_moment: the moment at which a machine began operating (initially or after a repair)
           Downtime_moment: the moment at which a machine failed after operating since the previous
           Uptime_moment
    So each Time Between Failures (TBF) is the difference between one Uptime_moment observation and
    the subsequent Downtime_moment.
    Three quantities are required:
           n = the number of observations
           u_i = the i-th Uptime_moment
           d_i = the i-th Downtime_moment, following the i-th Uptime_moment
    So Mean Time Between Failures = Sum (d_i - u_i) / n, for all i = 1 through n observations.  More
    simply, it is the total working time divided by the number of failures.
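    A minimal Python sketch of this calculation follows. The function name, the list-of-pairs layout
    and the three sample observations are hypothetical, chosen only to illustrate the sum over
    (Downtime_moment minus Uptime_moment) divided by n.

```python
from datetime import datetime, timedelta


def mtbf_from_observations(observations):
    """Average of (downtime_moment - uptime_moment) over all observations.

    `observations` is a list of (uptime_moment, downtime_moment) datetime pairs.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    total_working_time = timedelta()
    for uptime_moment, downtime_moment in observations:
        if downtime_moment < uptime_moment:
            raise ValueError("a failure cannot precede the start of operation")
        total_working_time += downtime_moment - uptime_moment
    return total_working_time / len(observations)


# Hypothetical log: the machine ran for 48 h, 24 h and 48 h before each failure.
observations = [
    (datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 3, 8, 0)),
    (datetime(2024, 1, 3, 12, 0), datetime(2024, 1, 4, 12, 0)),
    (datetime(2024, 1, 4, 14, 0), datetime(2024, 1, 6, 14, 0)),
]

mtbf = mtbf_from_observations(observations)
mtbf_hours = mtbf.total_seconds() / 3600  # 40.0 hours
print(f"MTBF = {mtbf_hours:.1f} h")
```

    Working with timestamps rather than pre-computed durations keeps the raw log auditable; the gaps
    between each Downtime_moment and the next Uptime_moment could be averaged the same way to obtain
    Mean Time To Repair.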
    By Oskar Olofsson