Fault Tree Analysis
Introduction
→ Industrial operating system - a combination of various components clubbed
together to form a single unit that performs a specific operation.
→ It is desirable to analyze the possible failure sequences related to such
operations and to perform a probabilistic analysis while developing a
production cycle to successfully mitigate the risk associated with it
→ It is a preventive analysis - it protects the end user from unidentified and
unacceptable consequences.
→ Fault tree analysis is one of many tools available to identify potential
failures and the mechanisms associated with them.
→ A method of conducting root cause analysis
→ Purpose - diagnose a problem and find its root cause, in order to formulate
corrective measures
→ It is a diagrammatic representation of the failure path
→ 2 primary components - Events & Gates
Understanding the Model
•→ Top-down approach
•→ Deductive analysis technique - depicts the possible
sequences of event failures.
•→ Purpose: identify combinations of equipment
failures and human errors
that can result in an accident event
•→ Provides all the possible paths through which a
particular failure event can occur - in a single
hierarchical chart.
•→ Constructed using Boolean logic
- each step is divided into 2 extreme
outcomes (True or False).
•→ Events are arranged as
- series relationships through "OR"s (any one failure is enough)
- parallel relationships through "AND"s (all must fail) of Boolean
logic.
•→ Thus, a tree-like diagram is formed from logic
symbols which visualize the dependencies among
the events.
•→ Events can arise from mechanical components, software
glitches, or the electronics used in
designing the system.
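The Boolean structure described above can be sketched in a few lines of Python. This is only an illustration: the class names `Event` and `Gate`, the function `occurs`, and the cooling-system events are hypothetical and do not come from any particular FTA tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Event:
    """Basic event; its True/False state is supplied at evaluation time."""
    name: str


@dataclass
class Gate:
    """Intermediate or top event whose state is derived from its inputs."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", Event]] = field(default_factory=list)


def occurs(node: Union[Gate, Event], state: Dict[str, bool]) -> bool:
    """Return True if the event at this node occurs for the given basic-event states."""
    if isinstance(node, Event):
        return state[node.name]
    results = [occurs(child, state) for child in node.inputs]
    if node.kind == "AND":
        return all(results)   # parallel relationship: every input must fail
    if node.kind == "OR":
        return any(results)   # series relationship: one failing input is enough
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical example tree (names are illustrative only).
pump_fails = Event("pump_fails")
valve_stuck = Event("valve_stuck")
alarm_fails = Event("alarm_fails")
undetected_blockage = Gate("undetected_blockage", "AND", [valve_stuck, alarm_fails])
top = Gate("loss_of_cooling", "OR", [pump_fails, undetected_blockage])

state = {"pump_fails": False, "valve_stuck": True, "alarm_fails": True}
print(occurs(top, state))
```

Here the top event occurs through the AND branch even though the pump is healthy, which is the kind of dependency the tree diagram is meant to make visible.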
History
•→ First developed - 1961 for the U.S. Air Force
- by H. A. Watson
- at Bell Telephone Laboratories for use with the Minuteman
system
(American land-based intercontinental ballistic missile in service with the
Air Force Global Strike Command)
→ Later adopted and extensively applied by the Boeing Company
As a system safety analysis tool (1963)
Applied to the entire Minuteman system for safety (1964-99)
Commercial codes developed that work on PCs (1991-99)
•→ Recognized Software codes include:
Prepp/Kitt
SETS
FTAP
Importance and COMCAN
•→ Adopted by - robotics and software industries
- chemical industries (1981-90)
Basic Definitions
Fault:
→ Abnormal, undesirable state of a system
→ Attributed to - implementation of a wrong command / no implementation of a command
- due to failure of the system or of a component of the system.
→ If the system is interrupted by a safety device and has been shut down, it is NOT counted as a fault.
Failure:
→Loss of functioning of the system / any component of the system
→Examples :
→ A pressure vessel bursts - vessel failure.
→ A cooling coil stops functioning because of corrosion and leaks coolant - cooling system failure.
→ A thermocouple coil is broken and the relief device is unable to detect the system temperature - relief device
failure.
Primary Failure:
→ Failure caused by the system itself
- e.g. failure within the life span, leakage
→ Not caused by the system's exposure to its surroundings
Secondary Failure:
→ Failure caused by the exposure of the system to its surroundings / manufacturing errors
- e.g. improper design, wrong selection, operational error (use of a device above its rated
limit).
MOE:
→ Multiple Occurring Event: a failure event that
occurs in more than one place in the FT.
→ Also known as a redundant / repeated event.
MOB:
→ Multiple Occurring Branch:
a branch that is used in more than one place in the FT
(all basic events in an MOB will be MOEs)
Branch:
A subsection of the tree
Module:
An independent sub-tree / branch that
contains no outside MOE or MOB and is not itself an MOB
Top Event :
The failure event of a particular system under
study
Basic Event:
An event that cannot be subdivided further;
hence it is the terminating point of a branch of
the tree.
Cut Set Terms
•Cut Set (CS):
→ A set of events, traced from the basic events up to the
undesirable top event,
that together cause the top event to occur
•Min CS (MCS):
→ A CS with the minimum number of events that can still
cause the top event
•Super Set:
→ A CS that contains an MCS plus additional events; it still
causes the top undesirable event but is not minimal.
•Critical Path:
→The highest probability CS that drives the top undesired
event probability
•Cut Set Order:
→The number of elements in a CS
•Cut Set Truncation:
→Removal of cut sets from consideration during the FT
evaluation process.
→CS's are truncated when they exceed a specified order
and/or probability.
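As a rough illustration of these terms, the sketch below derives cut sets from a small tree, reduces them to minimal cut sets, and truncates by order. The nested-tuple tree encoding and the function names `cut_sets`, `minimal`, and `truncate` are assumptions made for this example; production tools use more efficient algorithms.

```python
from itertools import product

# A node is either a basic-event name (str) or a tuple (kind, inputs)
# with kind "AND" or "OR". This encoding is an assumption of the sketch.


def cut_sets(node):
    """Return all cut sets (as frozensets of basic events) for a node."""
    if isinstance(node, str):
        return [frozenset([node])]
    kind, inputs = node
    child_sets = [cut_sets(child) for child in inputs]
    if kind == "OR":
        # Any cut set of any input is a cut set of the gate.
        return [cs for sets in child_sets for cs in sets]
    if kind == "AND":
        # One cut set from every input must occur together.
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate kind: {kind}")


def minimal(sets):
    """Drop every cut set that is a proper superset of another one."""
    unique = set(sets)
    return [cs for cs in unique if not any(other < cs for other in unique)]


def truncate(sets, max_order):
    """Cut set truncation by order: keep sets with at most max_order events."""
    return [cs for cs in sets if len(cs) <= max_order]


# Hypothetical tree in which basic event "A" is an MOE (it appears twice).
top = ("AND", [("OR", ["A", "B"]), ("OR", ["A", "C"])])

all_cs = cut_sets(top)   # includes super sets such as {A, B} and {A, C}
mcs = minimal(all_cs)    # {A} and {B, C}
for cs in sorted(mcs, key=len):
    print(sorted(cs), "order", len(cs))
```

In this example {A, B} and {A, C} are super sets of the minimal cut set {A}, and truncating at order 1 would leave only {A}.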
FTA is best applied to cases with
→ Large, perceived threats of loss,
i.e. high risk
→ Numerous potential contributors to a mishap
→ Complex or multi-element systems or processes.
→ Already-identified undesirable events (a must!)
→ Indiscernible mishap causes,
i.e. autopsies
→ Limitations: requires the use of software; is not intuitive and requires training;
is not useful when temporal aspects are important
→ Nature of results: qualitative, with quantitative potential
- can be evaluated quantitatively when probabilistic data are available
Applying Fault Tree Analysis
•Postulate top event (fault)
•Branch down listing faults in the system that must
occur for the top event to occur
•Consider sequential and parallel or combinations of faults
•Use Boolean algebra to quantify fault tree with
event probabilities
•Determine probability of top event
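The quantification step can be sketched as follows, assuming that all basic events are statistically independent and that no basic event is repeated in the tree (with repeated events, the cut sets should be used instead). The helper names `p_and` and `p_or` and the probabilities are hypothetical.

```python
from math import prod


def p_and(probabilities):
    """AND gate: all independent inputs must occur."""
    return prod(probabilities)


def p_or(probabilities):
    """OR gate: at least one independent input occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities (illustrative values only).
p_pump_fails = 0.01
p_valve_stuck = 0.05
p_alarm_fails = 0.10

# Top event = pump_fails OR (valve_stuck AND alarm_fails)
p_undetected_blockage = p_and([p_valve_stuck, p_alarm_fails])
p_top = p_or([p_pump_fails, p_undetected_blockage])

print(f"P(undetected blockage) = {p_undetected_blockage:.5f}")
print(f"P(top event)           = {p_top:.5f}")
```

With these numbers the AND branch contributes 0.005 and the top event probability is 1 - (0.99)(0.995) = 0.01495.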
Fault Tree Logic
•Use logic gates
•Higher gates are the outputs from lower gates in the tree
•Top event - output of all the input faults or events that occur
Fault Tree Analysis and other Analytical Models
→ FTA compared with the Reliability Block Diagram (RBD) and the Failure Mode Effects Analysis (FMEA)
Fault tree analysis vs. Reliability Block Diagram (RBD)
→ FTA depicts a system by using gates; an RBD depicts a system by using paths.
→ FTA focuses on the failure; an RBD focuses on the success part.
→ FTA is used for analyzing fixed probabilities of the occurrence of each event; an RBD may cover
time-varying factors during the analysis process.
Fault tree analysis vs. Failure Mode Effects Analysis (FMEA)
→ FTA is a top-down tree; an FMEA is a matrix structure with all the key measurements (severity rating,
occurrence rating, process controls, detection rating, risk priority number, etc.) as column headings in the top row.
→ FTA is used to show single or multiple initiating faults; an FMEA is not good at exploring
combinations of multiple faults.
→ It is hard to find all possible faults by using FTA; an FMEA does well in exhaustively cataloging
initiating faults and identifying their effects.
→ FTA and FMEA can be used at the same time for a better system development
(e.g. the analysis of civil aerospace).
→ Fault Tree Analysis Diagram Symbols
• Events
• Gates
• Transfer
Events
This sub-category includes the following shapes:
→ Primary/basic event (circle)
A failure or error in a system component or element.
→ External event (house shape)
An event that is normally expected to occur.
→ Undeveloped event (diamond)
An event that is not investigated further, for example because of limited information.
→ Conditioning event (oval)
A restriction on a logic gate.
→ Intermediate event (rectangle)
Usually placed above a primary event in order to show more event description details.
Gates
These symbols show the relationship between the input events and the output event.
→ OR gate
The output occurs as long as at least one of the input events occurs.
→ AND gate
The output occurs only if all (at least two) input events occur.
→ Exclusive OR gate
The output occurs only if exactly one of the input conditions is met, not if all conditions are met.
→ Priority AND gate
The output occurs only if the input events occur in a specific order.
→ Inhibit gate
The output occurs only if all input events take place and the condition defined in a conditioning event is also met.
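The gate definitions above can be written as small truth functions. This is a sketch under stated assumptions: the function names are invented for illustration, and the Priority AND gate is modelled with occurrence times (`None` meaning the event has not occurred), which is one possible way to represent ordering.

```python
def or_output(inputs):
    """OR gate: at least one input event occurs."""
    return any(inputs)


def and_output(inputs):
    """AND gate: every input event occurs."""
    return all(inputs)


def xor_output(inputs):
    """Exclusive OR gate: exactly one input event occurs."""
    return sum(bool(x) for x in inputs) == 1


def priority_and_output(occurrence_times):
    """Priority AND gate: all inputs occurred, in the listed order.

    occurrence_times holds one time per input, in the required order;
    None means that input has not occurred.
    """
    if any(t is None for t in occurrence_times):
        return False
    return all(earlier < later
               for earlier, later in zip(occurrence_times, occurrence_times[1:]))


def inhibit_output(inputs, condition):
    """Inhibit gate: all inputs occur and the conditioning event holds."""
    return all(inputs) and bool(condition)


print(or_output([False, True]))
print(and_output([True, False]))
print(xor_output([True, True]))
print(priority_and_output([2.0, 5.0]))
print(inhibit_output([True], condition=False))
```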
Transfer
→ A pointer to a tree branch
→ Indicates a sub-tree branch that is used elsewhere in the tree.
→ It is represented by a triangle symbol.
→ Can be used for several purposes:
To start a new page of plots
To indicate where a branch is used in numerous places in the same tree, but is not repeatedly
drawn (internal transfer)
To indicate an input module from a separate analysis (external transfer)
→ Two symbols: Transfer in and Transfer out
How to Perform Fault Tree Analysis (FTA)
The 5 basic steps to perform a Fault Tree Analysis are
as follows:
→ Identify the Hazard
→ Obtain Understanding of the System Being Analyzed
→ Create the Fault Tree
→ Identify the Cut Sets
→ Mitigate the Risk
Performing FTA
→ Step 1: Define the Undesired Event (Top Event)
• Usually several different, but equivalent, fault trees can be constructed for a given
system.
• It should be defined as precisely as possible:
How much impact does the top event pose to system?
What will be the duration of the top event?
What is the consequence of its occurrence, i.e. the safety impact?
What is the environmental impact?
What is the regulatory impact?
• Identify all the Immediate, Necessary and Sufficient events to the Top Event
Immediate Event
Collection of past events, previous experiences (always include in FT)
Necessary Event
Always try to include only those events which are actually necessary.
Inclusion of small faults (events) leads to a complicated visualization.
Sufficient Event
Do not include more than minimum necessary
→ Step 2: Obtain Understanding of the System
• Create or acquire appropriate support information:
List of components involved in the system
Boundary Diagram
Schematic
Code Requirements (Safety Codes associated to the system)
Engineering Noises and Environments
Examples of similar products or failures
Previous FTA data (If Available)
• Starting from the top event, list the potential causes of the hazard level by level
(Top Event, Level 1, Level 2... Basic Event). Development of the FT should focus on
completing a single level first and then proceed to the next level.
• Always try to include the past experience of the design engineers, who have good
physical, thermodynamic and chemical knowledge of the system.
• This knowledge is very important for cause selection.
• It is a teamwork process, hence assign skilled persons who
can assist in developing the relationships of causes to a failure
or fault.
• The procedure is continued until all the basic failures are
identified
• Identify each causing event as one of the following path types:
Primary Fault
Secondary Fault
Command Fault
• Estimate probability of the causes at the Base-level event
• Label all causes with codes
• Prioritize or sequence causes in the order of occurrence or
probability
→ Step 3: Construct the Fault Tree
• The set of events that are all required to produce an event of interest are
connected to AND gates
• The set of events that can individually produce an event of interest are
connected to OR gates
• It is a complete analysis of system including mechanical, software as well as
the electronics used in the system.
• The risks may be prevented through engineering choices or controlled
through Quality Control.
• The Basic event (depicted as a circle or oval) is the point at which the team
can address the risk. It is typically color coded as follows:
Red: Critical Risk
Orange: High Risk
Yellow: Minor Risk
Green: Acceptable / Very Low Risk
→ Step 4: Identify the Cut Sets
• Risk is estimated for each event
When available, the failure rate data can be used to
calculate the risk of a single chain or the many chains
If there is no data, an estimate is established based on
subjective guidelines similar to those used in FMEA
development
• The Cut Sets with risk greater than the system can tolerate
(i.e. safety or inoperative conditions) are selected for
mitigation
• Actions are required for Critical (red) and High (orange) Risks
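One way to picture this step is to rank the minimal cut sets by probability and flag those above a tolerance. The sketch below assumes independent basic events, and every number in it (probabilities and tolerance) is hypothetical. The top-event figure is the rare-event approximation (sum of cut set probabilities), which is an upper-bound estimate rather than an exact value.

```python
from math import prod

# Hypothetical failure probabilities for the basic events.
basic_event_probability = {"A": 1e-3, "B": 2e-2, "C": 1e-2}

# Minimal cut sets assumed to be already identified for the top event.
minimal_cut_sets = [{"B", "C"}, {"A"}]

# Hypothetical tolerance above which a cut set is selected for mitigation.
tolerance = 5e-4


def cut_set_probability(cut_set, probabilities):
    """Probability that every (independent) event in the cut set occurs."""
    return prod(probabilities[event] for event in cut_set)


# Rank cut sets from most to least likely.
ranked = sorted(
    minimal_cut_sets,
    key=lambda cs: cut_set_probability(cs, basic_event_probability),
    reverse=True,
)

needs_mitigation = [
    cs for cs in ranked
    if cut_set_probability(cs, basic_event_probability) > tolerance
]

# Rare-event approximation of the top event probability.
top_event_estimate = sum(
    cut_set_probability(cs, basic_event_probability) for cs in minimal_cut_sets
)

for cs in ranked:
    print(sorted(cs), cut_set_probability(cs, basic_event_probability))
print("selected for mitigation:", [sorted(cs) for cs in needs_mitigation])
```

When no failure rate data exist, the same ranking could be driven by subjective scores instead of probabilities, as the slide notes.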
→ Step 5: Control the Identified Hazard
• Base your plan of action on the FTA.
• Any risk that is not corrected to the desired
limit retains the potential to cause system failure and
hence can be treated as a topic of mistake proofing
and quality control.
• Controlling such issues leads towards the
satisfaction and protection of the consumer from
the risk.
STAFF REQUIREMENTS
One analyst should be responsible for a single fault tree, with frequent
consultation with the engineers, operators, and other personnel who
have experience with the systems/equipment that are included in the
analysis
A team approach is desirable if multiple fault trees are needed, with
each team member concentrating on one individual fault tree.
Interactions between team members and other experienced personnel
are necessary for completeness in the analysis process
OTHER REQUIREMENTS
Data Requirements:
A complete understanding of how the plant/system functions
Knowledge of the plant/system equipment failure modes and their
effects on the plant/system
Time and Cost Requirements:
Highly dependent on the complexity of systems involved. Modeling a
small process unit could require a day or less with an experienced
team
Large problems, with many potential accident events and complex
systems, could require several weeks even with an experienced
analysis team
Examples of Mitigation Strategies
When a risk is unacceptable the team may have several options
available. The following are a few examples of the options available:
o Design change
o Selection of a component with a higher reliability to replace the Base-
level event component
o This is often expensive unless identified early in Product
Development
Physical Redundancy of the Component
o This option places the redundant component in parallel to the other.
Both must fail simultaneously for the hazard to be experienced. If a
safety issue exists, this option may require non-identical components.
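The effect of physical redundancy can be estimated with a short calculation. The sketch below assumes two identical components; the function name `redundant_pair_failure` and the numbers are hypothetical. The optional `beta` parameter is a simplified beta-factor-style approximation of common-cause failures, added here as an assumption to show why non-identical components may be preferred; it is not taken from the slides.

```python
def redundant_pair_failure(p: float, beta: float = 0.0) -> float:
    """Approximate probability that both components of a redundant pair fail.

    p    -- failure probability of a single component
    beta -- assumed fraction of failures that share a common cause
            (0.0 means the two components fail independently)
    """
    if not 0.0 <= p <= 1.0 or not 0.0 <= beta <= 1.0:
        raise ValueError("p and beta must lie between 0 and 1")
    independent_part = (1.0 - beta) * p   # failures affecting one component only
    common_part = beta * p                # failures that take out both at once
    return independent_part ** 2 + common_part


p_single = 0.01  # hypothetical single-component failure probability

print(redundant_pair_failure(p_single))            # fully independent pair
print(redundant_pair_failure(p_single, beta=0.1))  # 10% common-cause fraction
```

With independent components the pair fails with probability 0.0001, but a 10% common-cause fraction raises this to roughly 0.0011, so most of the benefit of redundancy is lost.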
Software Redundancy
o The addition of a sensing circuit, which can change the state of
the product, often reduces the severity of the event by protecting
components through duty cycle changes and reducing input
stresses when identified.
Warning System
o The circuit may just warn of an event. This requires action by an
operator or analyst. It is important to note that if this course of
action is taken, Human Factors Reliability must also enter the
evaluation.
Quality Control
o This may include removal of the potential failure through testing or
inspection. The inspection effectiveness must match the level of
severity that the hazard may impose on the consumer.
Things to Remember while doing FTA
Always try to omit inputs with small probabilities
Always remember the difference between active and
passive components
Ask Yourself: Does quantified tree make sense?
Don't fault tree everything
Careful with Boolean expressions
Ensure top event is high priority
ADVANTAGES
• Deals well with parallel, redundant or alternative fault paths.
• Searches for possible causes of an end effect which may not have been foreseen.
•The cut sets derived in FTA can give enormous insight into various ways top event
occurs.
•Very useful tool for focused analysis where analysis is required for one or two major
outcomes.
• Used to determine the minimal cut sets.
• The user can select the top event to be specific to the failure of interest.
• Software is available to construct fault trees, to determine cut sets and to calculate
the failure probabilities.
DISADVANTAGES
• Complicated process.
• Requires a considerable amount of time to complete.
• Needs experienced engineers
• Requires a separate fault tree for each top event and makes it difficult to analyze
complex systems.
• Fault trees developed by different individuals are usually different in structure,
producing different cut set elements and results.
• The same event may appear in different parts of the tree, leading to some initial
confusion.
• It is a very time-consuming analysis and requires large effort
for a complex system.
APPLICATION
• Used in the field of safety engineering and reliability engineering to
determine the probability of a safety accident or particular system level
failure.
• Aerospace engineering
• To monitor the performance of the system
• To assist in designing a system as per the safety and regulatory concern
• To analyze the effect of a modification to the present system.
• Used as a Diagnostic tool to identify and correct causes of the top event
• Used to understand the impact of a changing environment or a change in duty
cycle for the same design
CONCLUSION
• FTA identifies all the possible causes of a specified undesired event
(TOP event)
• FTA is a structured top-down deductive analysis.
• FTA leads to improved understanding of system characteristics.
• Design flaws and insufficient operational and maintenance procedures
may be revealed and corrected during the fault tree construction.
• FTA is not (fully) suitable for modeling dynamic scenarios
• FTA is binary (fail–success) and may therefore fail to address some
problems