Article info

Article history:
Received 5 September 2021
Received in revised form 24 March 2022
Accepted 18 June 2022
Available online 30 June 2022

Keywords:
Physics-informed deep learning
Reliability assessment
Generative adversarial networks
Uncertainty quantification

Abstract

Deep learning-based models for system prognostics and health management have received significant attention in the reliability and safety fields. However, limited progress has been achieved in the usage of deep learning for system reliability assessment. This paper aims to bridge this gap and explore the interface between deep learning and system reliability assessment by expanding and adapting recent advances in physics-informed deep neural networks. Particularly, we present a novel deep learning-based system reliability assessment and develop a physics-informed generative adversarial network-based approach to facilitate uncertainty quantification and propagation as well as enable measurement data fusion and incorporation into system reliability assessment. Three numerical examples employing a dual-processor computer system are used to demonstrate the proposed approach. Results show that the proposed approach has comparable performance to the widely used Runge–Kutta method and Monte Carlo simulation in handling deterministic scenarios. When dealing with probabilistic scenarios, the proposed approach is 16.5 times more computationally efficient than Monte Carlo simulation in uncertainty quantification and is effective in fusing measurement data for the system’s reliability assessment. The proposed approach offers a novel perspective and builds a link between deep learning and system reliability assessment for computational alleviation and data assimilation challenges.
© 2022 Published by Elsevier B.V.
https://doi.org/10.1016/j.asoc.2022.109217
1568-4946/© 2022 Published by Elsevier B.V.
T. Zhou, E.L. Droguett and A. Mosleh Applied Soft Computing 126 (2022) 109217
neural networks as universal approximators of a desired solution and then constrain the training process by designing the loss function according to domain-specific knowledge, such as the physics model described by partial differential equations. Most recent research focuses on enhancing PINNs by using different deep learning architectures [20–22] with applications to biophysics, geophysics, and engineering sciences [23–27]. Interested readers can find a comprehensive review of the progress of physics-informed deep learning and its diverse applications in Karniadakis et al. [28].

Physics-informed deep learning is still in an early stage of development and needs to be carefully configured for the specific problem. One of the main concerns is to improve PINNs for uncertainty quantification to achieve robust and reliable predictions. Most notably, physics-informed generative adversarial networks (PIGANs) have been proposed and are still under development to probabilistically leverage observations and the underlying physics model in solving both forward and inverse problems. Studies of PIGANs mainly use a similar framework and differ in two respects: whether domain knowledge is integrated into the generator or the discriminator [29], and which type of GAN is adopted to improve training stability [20,21,30–32]. The choice of PIGANs depends on the scale of the specific problem and the computational resources available. It is worth noting that the above research also shows great value in addressing the scarcity of measurements in system reliability assessment. Particularly, the following applications adopt the PIGANs proposed by Yang and Perdikaris [20], which integrate domain knowledge into the generator and follow the training scheme developed by Li et al. [33].

This paper presents a novel perspective of system reliability assessment by leveraging the advances in physics-informed deep learning. The main objective is to make a connection between deep learning and system reliability assessment and to further demonstrate the value of deep learning to system reliability assessment. Our contributions are twofold:

(1) We present an approach to frame system reliability assessment as a deep learning problem, which encodes the system property into the network configuration and training based on the mathematical model governing system reliability evolution. Particularly, it approximates the solution to reliability assessment with a neural network and induces another neural network to obtain the derivatives of the system state probability by using automatic differentiation techniques. The outputs of the two neural networks are used to construct a composite loss function, and gradient-based optimization algorithms are then employed to learn the system reliability assessment solution. It is worth noting that this provides a continuous solution to the system reliability assessment because of the universal approximation theorem [34]. This enables one to assess system reliability at any given time instant.
(2) To highlight the potential value of physics-informed deep learning, we put forward a PIGANs-based approach for uncertainty quantification and measurement data incorporation into system reliability assessment. This is accomplished by formulating a deep probabilistic setting as an adversarial game between the data constraints and the mathematical model describing the system reliability evolution. This provides a new perspective on combining measurement data with the underlying mathematical model, which is particularly valuable for safety-critical applications with a small number of measurements, such as passive structures in nuclear power plants. Moreover, as will be demonstrated, the proposed approach has superior efficiency, so it alleviates the computational challenges in uncertainty quantification, which is important for the reliability and safety community.

The proposed approach is demonstrated using a dual-processor computing system with performance degradation, which encompasses the following three examples: (i) the system starts in a perfect condition and degrades over mission time, which is validated by comparison with the Runge–Kutta method and the Monte Carlo simulation; (ii) the system starts in either a perfect condition or a degraded state. The uncertainty is modeled by the Bernoulli distribution, the epistemic uncertainty of which is modeled by the Beta distribution. The results are validated by comparison with the Monte Carlo simulation; (iii) the system starts in a perfect condition, and synthetic measurement data are available to reflect the system’s condition during a service life span. This example is heuristically demonstrated by two simulated systems with either better or worse performance as compared to the baseline case in the first example. A heuristic demonstration is presented because measurement data cannot be properly incorporated using the current methods for system reliability assessment. Overall, the results validate the effectiveness of the proposed approach for system reliability assessment and show the superiority of the proposed approach in terms of computational efficiency.

The remainder of this paper is structured as follows. Section 2 summarizes the problem formulation and background of system reliability assessment. Section 3 presents the deep learning-based approach for system reliability assessment. Section 4 demonstrates the proposed model using three numerical examples involving a dual-processor computing system. Section 5 discusses the conclusions and future directions.

2. Problem formulation and background

System reliability assessment aims to model the reliability of systems with several components. The general strategy is to analyze component reliability first, then aggregate the individual components’ reliability based on the applicable system structure to quantify system reliability [35]. Component reliability [36] has traditionally been estimated using either a physical approach (e.g., stress–strength model) or an actuarial approach (e.g., Weibull analysis). These approaches are predicated on the implicit premise that the component and system states are binary, i.e., either complete failure or fully functioning. However, this is rarely the case, since a transitional condition usually exists between fully functioning and complete failure.

It is critical to characterize the system state in more than just binary terms to achieve a more realistic insight into a system’s reliability [37]. In general, there are three approaches for achieving this goal: (1) consider the system state as a continuous variable and model its evolution with a continuous stochastic process [38,39]; (2) consider the system state as a discrete variable and model its evolution with a discrete stochastic process, also known as the multi-state model [40,41]; (3) consider the system state as a variable with a combination of continuous and discrete features [42]. Notably, the multi-state model has been widely adopted due to its natural fit to represent the state of engineering systems [43] by a range of discrete levels according to their functional modes, failure modes, or degrading performance. For instance, pump states can be defined based on their failure modes: fail to start, fail to run, fail to stop, and external leakage [44]; the states of a power generating unit can be defined based on the various generating capacity levels [45]; and the states of a transmission pipeline can be determined based on the degree of corrosion over the service span [46]. Typically, the multi-state model is mathematically represented by a Markov or semi-Markov process in reliability applications
[47,48]. The system states are characterized by a finite number of discrete levels, and the system reliability evolution is characterized using the time spent in each state and the transitions between states. Usually, there are four categories of models, depending on whether the state transitions are time-independent or time-dependent: (1) the homogeneous Markov process assumes time-independent transitions; for instance, modeling the thermal reliability of high-density electronic systems [49]; (2) the non-homogeneous Markov process is the most studied and treats system transition rates as a function of the system operational time; for instance, modeling machining tool degradation to capture the aging effects of the wear process [50]; (3) the homogeneous semi-Markov process accounts for the effects of time spent in a state; for example, modeling the crack growth rate considering the length of time the component spent in crack initiation [51]; (4) the non-homogeneous semi-Markov process treats the system rates as a function of both system operational time and the time spent in a state, as in the modeling of system reliability of downhole optical monitoring systems under complex test and maintenance strategies [52,53].

In this study, we assume that the system state is continuously observed and the state transitions can occur at any time. Transition rates are solely determined by the amount of time the system has been operating owing to performance degradation and/or maintenance interventions. Suppose system performance is characterized by a finite number of states S = {0, 1, ..., j, ..., M}. The system dynamics are reflected by the transitions across states at each time instant t, which are parameterized by the transition rate λ_{i,j}(t) from state i to state j. Hence, denote the corresponding transition rate matrix Q(t) at a time instant t as below, where λ_i(t) = Σ_{j∈S, j≠i} λ_{i,j}(t).

\[
Q(t) =
\begin{bmatrix}
-\lambda_0(t) & \lambda_{0,1}(t) & \cdots & \lambda_{0,j}(t) & \cdots & \lambda_{0,M}(t) \\
\lambda_{1,0}(t) & -\lambda_1(t) & \cdots & \lambda_{1,j}(t) & \cdots & \lambda_{1,M}(t) \\
\vdots & \vdots & \ddots & \vdots & & \vdots \\
\lambda_{j,0}(t) & \lambda_{j,1}(t) & \cdots & -\lambda_j(t) & \cdots & \lambda_{j,M}(t) \\
\vdots & \vdots & & \vdots & \ddots & \vdots \\
\lambda_{M,0}(t) & \lambda_{M,1}(t) & \cdots & \lambda_{M,j}(t) & \cdots & -\lambda_M(t)
\end{bmatrix}
\tag{1}
\]

The system state at each time instant t is represented by a probability vector, that is p(t) = {p_0(t), p_1(t), ..., p_j(t), ..., p_M(t)}, where p_j(t) is the probability that the system is in state j at time instant t, and Σ_{j=0}^{M} p_j(t) = 1. The system state probability can be derived according to the forward Kolmogorov equations, which consist of a set of differential equations parameterized by the transition rate matrix and state probability vector in Eq. (2). Then, the system reliability can be determined by aggregating the state probabilities of the states in which the system is considered functioning.

\[
p'(t) = p(t)\, Q(t) \tag{2}
\]

approach for uncertainty quantification and measurement data incorporation into the system’s reliability assessment.

3.1. Frame system reliability assessment in a deep learning context

This section focuses on the connection between deep learning and system reliability assessment. The essential objective is to learn a continuous latent function as the solution to system reliability considering the possible state transitions and the initial condition. In particular, a neural network is utilized to approximate p(t), which acts as a prior on the unknown reliability solution. According to the universal approximation theorem, this leads to a continuous solution that enables one to assess the system reliability at any time instant up to mission time. As illustrated in Fig. 1, the system property is encoded into the network configuration and training, as discussed in the following.

In the network configuration, there are two neural networks with shared parameters that approximate the system state probability and obtain its derivative with respect to the system’s operational time. In other words, the time dependency is encoded by the state probability and its derivative at any time instant. These two networks are configured as follows:

• Utilize a neural network N_θ(t) as the surrogate for the reliability estimates p(t), where N_θ(t) denotes a neural network parameterized by θ and the network input is global time t. The number of neurons in the output layer needs to match the number of system states. Using the SoftMax activation function in the output layer then provides the probability of each system state. This implicitly satisfies the constraint that the probability of each state lies in the range [0,1].
• Establish an induced neural network N′_θ(t) to obtain the derivative of the system state probability p′(t). Particularly, N′_θ(t) is an induced neural network based on N_θ(t) using automatic differentiation.

The network training process needs to be constrained to satisfy the system initial condition and system state transition model as reflected in Eqs. (2) and (3). Therefore, a composite loss function can be constructed in Eq. (4) by combining two residual terms:

\[
L(\theta) = \left( N_\theta(t=0) - s_0 \right)^2
+ \lambda \left[ \frac{1}{N_r} \sum_{i=1}^{N_r} \left( N_\theta(t_i) \cdot Q(t_i) - \frac{dN_\theta(t_i)}{dt_i} \right)^2 \right]
\tag{4}
\]

where the first term enforces agreement between the neural network and the system’s initial condition given by Eq. (3); the second term enforces consistency of the training process with the system state transition expressed in Eq. (2) by penalizing the residual at N_r collocation points; λ is a weighting factor to balance the loss terms. This is then formulated as a minimization problem to train the neural networks via gradient-based optimization algorithms:
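To make the structure of Eq. (4) concrete, the following is a minimal sketch in plain Python rather than the TensorFlow implementation used in the paper. It makes several assumptions that are not from the source: the surrogate is any callable returning a probability vector (the hypothetical `exact_solution` below stands in for a trained network), the time derivative is taken by central finite differences instead of automatic differentiation, the squared vector terms are read as sums of squared components, and the two-state matrix with constant rate `a` is an illustrative toy system.

```python
import math


def transition_matrix(t, a=0.5):
    """Hypothetical two-state system: state 0 -> state 1 at constant rate a."""
    return [[-a, a], [0.0, 0.0]]


def exact_solution(t, a=0.5):
    """Closed-form p(t) of the toy system for the initial condition [1, 0]."""
    return [math.exp(-a * t), 1.0 - math.exp(-a * t)]


def vec_mat(p, q):
    """Row vector times matrix, i.e. the product p(t) Q(t) of Eq. (2)."""
    n = len(p)
    return [sum(p[i] * q[i][j] for i in range(n)) for j in range(n)]


def composite_loss(surrogate, q_of_t, s0, collocation, lam=1.0, h=1e-5):
    """Evaluate the two residual terms of Eq. (4) for a candidate surrogate.

    Assumption: central finite differences replace automatic differentiation.
    """
    # First term: mismatch with the initial condition s0 at t = 0.
    initial = sum((a - b) ** 2 for a, b in zip(surrogate(0.0), s0))
    # Second term: mean squared residual of p(t) Q(t) - dp/dt at the
    # collocation points.
    physics = 0.0
    for t in collocation:
        p = surrogate(t)
        dp = [(u - v) / (2.0 * h)
              for u, v in zip(surrogate(t + h), surrogate(t - h))]
        rhs = vec_mat(p, q_of_t(t))
        physics += sum((r - d) ** 2 for r, d in zip(rhs, dp))
    return initial + lam * physics / len(collocation)


# 40 linearly spaced collocation points on [0, 30].
collocation = [30.0 * i / 39 for i in range(40)]
loss_exact = composite_loss(exact_solution, transition_matrix,
                            [1.0, 0.0], collocation)
loss_frozen = composite_loss(lambda t: [1.0, 0.0], transition_matrix,
                             [1.0, 0.0], collocation)
```

The exact solution drives both terms to (numerically) zero, whereas a surrogate frozen at the initial condition satisfies the first term but is penalized by the physics residual. Training amounts to adjusting θ so that the loss moves from the second situation toward the first.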
3.2. Physics-informed generative adversarial networks (GANs) for system reliability assessment

This section further discusses how deep learning can benefit system reliability assessment. Particularly, we discuss a PIGANs-based approach to integrate data constraints in system reliability assessment. The idea is to formulate a probabilistic setting to learn the probabilistic distributions of the system reliability that, in turn, results in a generative model capable of producing synthetic

\[
\begin{aligned}
L_G(\theta_G) ={}& \frac{1}{N_d + 1} \sum_{k=0}^{N_d} \log\left( 1 - N_{\theta_D}\left( t_k, N_{\theta_G}(t_k, z_k) \right) \right) \\
&+ \frac{1}{N_d + 1} \sum_{k=0}^{N_d} \left( y(t_k) - N_{\theta_G}(t_k, z_k) \right)^2 \\
&+ \lambda \left[ \frac{1}{N_r} \sum_{i=1}^{N_r} \left( N_{\theta_G}(t_i, z_i) \cdot Q(t_i) - \frac{dN_{\theta_G}(t_i, z_i)}{dt_i} \right)^2 \right]
\end{aligned}
\]
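The three terms of the generator loss above can be evaluated as in the following minimal sketch. It is not the authors' TensorFlow code: the function name `generator_loss` and its arguments are hypothetical, the discriminator outputs, generator predictions, and physics residuals are assumed to be precomputed and passed in as plain lists, and squared vector terms are read as sums of squared components.

```python
import math


def generator_loss(d_fake, y_obs, y_gen, physics_residuals, lam=1.0):
    """Sum the adversarial, data-fit, and physics terms of the generator loss.

    d_fake            : discriminator outputs in (0, 1) for generated samples,
                        one per measurement point k = 0..N_d
    y_obs, y_gen      : measured and generated state probability vectors at
                        the same measurement points
    physics_residuals : vectors N(t_i, z_i) Q(t_i) - dN/dt_i at the N_r
                        collocation points
    lam               : weighting factor lambda
    """
    n_meas = len(d_fake)  # equals N_d + 1
    adversarial = sum(math.log(1.0 - d) for d in d_fake) / n_meas
    data_fit = sum(
        sum((y - g) ** 2 for y, g in zip(obs, gen))
        for obs, gen in zip(y_obs, y_gen)
    ) / n_meas
    physics = sum(
        sum(r ** 2 for r in res) for res in physics_residuals
    ) / len(physics_residuals)
    return adversarial + data_fit + lam * physics


# Two measurement points that the generator reproduces exactly, an undecided
# discriminator, and zero physics residual: only the adversarial term remains.
loss_matched = generator_loss(
    d_fake=[0.5, 0.5],
    y_obs=[[1.0, 0.0], [0.5, 0.5]],
    y_gen=[[1.0, 0.0], [0.5, 0.5]],
    physics_residuals=[[0.0, 0.0]],
)
```

Separating the terms this way makes the role of λ visible: it scales only the physics residual, so a larger value pushes the generator toward the state transition model at the expense of the adversarial and data-fit terms.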
Fig. 2. A physics-informed generative adversarial network (GANs) based approach for system reliability assessment considering measurement data.

Once the PIGANs model is successfully trained, the generator N_{θ_G}(t, z) can be utilized to simulate the system reliability considering measurement data and uncertainty. Particularly, the system reliability assessment is accomplished by drawing N_s samples through stochastic forward passes of the generator. A point estimate of the jth state probability is determined by computing the mean of the predictions over the samples in Eq. (12). The uncertainty of the jth state probability is characterized by two-standard-deviation intervals based on Eq. (13).

\[
p_j(t) \approx \frac{1}{N_s} \sum_{n=1}^{N_s} N_{\theta_G}(t, z_n)[j] \tag{12}
\]

\[
\sigma_j(t) \approx \sqrt{ \frac{1}{N_s - 1} \sum_{n=1}^{N_s} \left[ N_{\theta_G}(t, z_n)[j] - p_j(t) \right]^2 } \tag{13}
\]

Assume the system works reliably in a set of state indexes U. It is straightforward to compute the point estimate and uncertainty of system reliability by aggregating the state probability within the set U as follows:

\[
R(t) \approx \frac{1}{N_s} \sum_{n=1}^{N_s} \sum_{j \in U} N_{\theta_G}(t, z_n)[j] \tag{14}
\]

\[
\sigma_R(t) \approx \sqrt{ \frac{1}{N_s - 1} \sum_{n=1}^{N_s} \left[ \sum_{j \in U} N_{\theta_G}(t, z_n)[j] - R(t) \right]^2 } \tag{15}
\]

The PIGANs-based approach has two advantages. First, the generator can be used as a surrogate model for conventional Monte Carlo simulation. This is more efficient for a highly reliable system, for which conventional Monte Carlo simulation is often computationally expensive and requires a large number of samples. Second, the proposed approach accounts for both measurement data and the mathematical model, so the system reliability evolution simulated by the generator is also informed by the measurement data. This offers a unique advantage over the current methods, which cannot consider measurement data. Note that the time complexity of the proposed approach varies depending on the network architecture (e.g., the number of layers, the number of hidden units in each layer, and the number of outputs) and the training algorithm (e.g., the number of iterations and the variant of the backpropagation algorithm) [54]. These two advantages are experimentally demonstrated in the following numerical examples.

4. Numerical examples

This section demonstrates the deep learning approach for system reliability assessment using three numerical examples involving a four-state system. Section 4.1 provides a brief description of the problem formulation. Section 4.2 discusses the results and the model performance assessment. The proposed approach was developed based on Python v3.8 [55], TensorFlow v2.4.0 [56], and NumPy v1.19.2 [57] using a laptop with Intel Core i7-6700
CPU and 32 GB DDR4 RAM. The differential equation solver, specifically the Runge–Kutta method, was implemented in Matlab [58]. The Monte Carlo simulation was also implemented in Python v3.8 [55].

4.1. Problem description

Consider a safety model of a dual-processor computing system that degrades through 4 possible states {0, 1, 2, 3}, as taken from Rindos et al. [59]. Fig. 3 shows the state transition diagram describing the possible transitions across states. With the system state increasing from 0 to 3, the system continuously degrades until a safe or unsafe failure. The system is considered reliable in states 0 and 1, so the system reliability can be calculated by summing the probabilities of these two states. The definition of each state is as follows:

• State 0: the system functions at full capacity with two processors.
• State 1: the system works in a degraded mode given that either of the two processors fails and the failure is successfully detected (i.e., probability c2).
• State 2: the system is operated in a degraded state, and the failure of the other processor leads to a safe shutdown (i.e., probability c1).
• State 3: the system fails unsafely due to two scenarios: either of the two processors fails but the failure is not detected (i.e., probability 1 − c2); the failure of the other processor leads the system to an unsafe state (i.e., probability 1 − c1) when the system is operated in a degraded state.

Fig. 3. A state transition diagram to describe the performance deterioration of a dual-processor computing system.

The transitions across states follow the Weibull distribution, and the transition rates are denoted by λ(t) = λ_0 α t^{α−1}, where t is the system’s operational time. This results in a non-homogeneous continuous-time Markov process in which the transition rates depend on the system’s operational time. In this paper, we set the parameters to the same values as Rindos et al. [59], that is c2 = 0.9, c1 = 0.9, λ_0 = 0.01 and α = 2.0. The corresponding transition rate matrix is shown in Eq. (16):

\[
Q(t) =
\begin{bmatrix}
-0.04\,t & 0.036\,t & 0 & 0.004\,t \\
0 & -0.02\,t & 0.018\,t & 0.002\,t \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\tag{16}
\]

To demonstrate our proposed approach, three example problems are formulated with the detailed setup below:

(1) Suppose the system starts in a perfect working condition (i.e., state 0) and degrades over a service life span. The results are validated by a comparative study with the differential equation solver and the Monte Carlo simulation, respectively.
(2) Suppose the system starts in a perfect working condition (i.e., state 0) or a degraded state (i.e., state 1), which follows the Bernoulli distribution. This scenario is subject to large uncertainty caused by manufacturing defects and installation in the field. Hence, we use the Beta distribution to model the epistemic uncertainty of the Bernoulli distribution. The results are validated by comparison with the Monte Carlo simulation.
(3) The third example intends to demonstrate how the deep learning approach can incorporate the measurement data into the mathematical model describing the underlying state transitions. Particularly, we follow the previous assumption that the system starts in a perfect working condition; generate synthetic measurement data by using the results in the first example as a baseline; and heuristically validate the results by discussing the impacts of measurements on the system behavior.

4.2. Results and discussions

This section discusses the results of the three examples. For validation purposes, the following discussion is based on the system reliability evaluated with a fixed time step (i.e., 1 time unit) up to mission time 30.

4.2.1. Example 1

The system reliability is assessed using the proposed approach in Section 3.1. The neural network consists of 2 hidden layers, each with 50 neurons, and uses the Tanh activation function. There are four neurons in the output layer with the SoftMax activation function. The output of each neuron corresponds to the probability of one system state. There are 40 collocation points, which are linearly spaced within the range [0, 30]. The network is trained using the Adam optimization algorithm and the number of iterations is 2 × 10^4. An exponentially decaying learning rate is applied with a starting learning rate of 1 × 10^−3, a decay rate of 0.9 and a decay step of 1000. The weighting factor λ is set equal to 1. On the other hand, baseline results are obtained by using the differential equation solver and the Monte Carlo simulation with 1 × 10^5 iterations. Fig. 4 summarizes the mean value of each system state probability. The results of the proposed approach are close to the baseline results, which indicates the good performance of the proposed approach.

To further evaluate the consistency of the results between the proposed approach and the differential equation solver, we
run the former for 60 replications considering the random effects through training and testing. The consistency between the results is measured using the root mean square error (RMSE) in Eq. (17), where p*_j(t) is the probability of state j at time t given by the differential equation solver, p^i_j(t) is the probability of state j at time t obtained in the ith replication of the proposed approach, N is the total number of replications, and RMSE_j(t) is the RMSE of the jth state probability between the proposed approach and the differential equation solver at time t.

\[
\mathrm{RMSE}_j(t) = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left[ p_j^{i}(t) - p_j^{*}(t) \right]^2 } \tag{17}
\]

Fig. 4. The results of system state probability using the proposed approach, the differential equation solver, and the Monte Carlo simulation.

Fig. 5. The root mean square error (RMSE) of the results between the proposed approach and the differential equation solver.

The distributions of RMSE_j(t) up to mission time are summarized in Fig. 5. The overall variability of the RMSE is further illustrated by its 5% quantile, median, mean, and 95% quantile. The RMSE remains relatively small, which indicates the effectiveness of the proposed method in achieving a satisfactory assessment of system reliability.
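As a small worked instance of Eq. (17), the sketch below computes the per-state RMSE at one time instant. The function name `rmse_per_state` and the two replication vectors are illustrative assumptions, not values from the paper.

```python
import math


def rmse_per_state(replications, reference):
    """Eq. (17) at a fixed time t.

    replications : list of N state probability vectors, one per replication
                   of the proposed approach
    reference    : state probability vector p*(t) from the differential
                   equation solver
    Returns one RMSE value per state j.
    """
    n = len(replications)
    return [
        math.sqrt(sum((rep[j] - reference[j]) ** 2 for rep in replications) / n)
        for j in range(len(reference))
    ]


# Hypothetical two-state illustration: two replications that miss the
# reference by 0.1 in opposite directions.
example = rmse_per_state([[0.5, 0.5], [0.7, 0.3]], [0.6, 0.4])
```

Because the errors are squared before averaging, replications that scatter symmetrically around the reference do not cancel out, which is why the RMSE reflects both bias and replication-to-replication variability.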
Fig. 6. The absolute difference of the results between the proposed approach and the Monte Carlo simulation.

The proposed approach is also validated by comparison with the Monte Carlo simulation. Considering the random effects of both methods, both the proposed approach and the Monte Carlo simulation are run for 60 replications. The predictive uncertainty at each time instant is characterized by the mean and standard deviation of the corresponding realizations. The prediction of the jth state probability at time t is denoted by mean p_j(t) and standard deviation σ_j(t) for the proposed approach, and by mean p′_j(t) and standard deviation σ′_j(t) for the Monte Carlo simulation. Then, the consistency of results between the proposed approach and the Monte Carlo simulation is evaluated by measuring their absolute difference and composite standard deviation, represented by Δp_j(t) and Δσ_j(t) in Eqs. (18) and (19), respectively:

\[
\Delta p_j(t) = \left| p_j(t) - p'_j(t) \right| \tag{18}
\]

\[
\Delta \sigma_j(t) = \sqrt{ \sigma_j(t)^2 + \sigma'_j(t)^2 } \tag{19}
\]

Figs. 6 and 7 show the distribution of Δp_j(t) and Δσ_j(t) up to mission time. Both the absolute difference and composite standard deviation remain relatively small (i.e., the overall median values are 0.0011 and 0.0012, respectively). This implies that the performance of the proposed approach is comparable to the Monte Carlo simulation. Furthermore, note that the proposed approach takes only 16.0 s per replication, while the Monte Carlo simulation takes 266.9 s. This shows the superiority of the proposed approach in terms of computational efficiency.

So far, we have demonstrated the performance of the proposed approach in assessing each system state probability. Then, we can calculate the system reliability by summing the probabilities of states 0 and 1. The results of system reliability are displayed in Fig. 8 and show a good match among all three methods. Therefore, we can conclude that the proposed approach is valid for assessing system reliability.

4.2.2. Example 2

The proposed approach is of particular advantage for uncertainty quantification, which is important in reliability and safety applications. This example shows how the proposed approach can propagate and quantify the uncertainty of the system’s initial condition. Denote the probability vector of the system’s initial condition by [ρ_0, 1 − ρ_0, 0, 0], where ρ_0 follows the Beta distribution parameterized by two shape parameters, α = 5 and β = 1.5. To integrate such uncertainty in the proposed approach, 50 samples are generated and are treated as a type of boundary condition, which needs to be satisfied by the neural network training process. In the generator, the neural network consists of 4 hidden layers of 50 neurons with the Tanh activation function. There are four units in the output layer with the SoftMax activation function. The number of collocation points is 40, which are linearly spaced within the range [0, 30]. In the discriminator, the neural network has 2 hidden layers of 50 neurons with the Tanh activation function and 1 neuron in the output layer. An exponentially decaying learning rate is applied with a starting learning rate of 1 × 10^−2, a decay rate of 0.9 and a decay step of 1000. The number of iterations is 1 × 10^5 using the Adam optimization algorithm. The weighting factor λ is set equal to 1. Once the model is well trained, the generator is used to generate 5 × 10^3 samples to estimate the system state probability with uncertainty.

The Monte Carlo simulation consists of 50 replications and each replication includes 1 × 10^5 iterations. Note that each sample from the Monte Carlo simulation represents the actual state number, while the samples in the proposed approach represent the system state probability vector. Note that the Monte Carlo simulation is computationally expensive, taking 332.5 s per replication and around 4.6 h in total. In contrast, the whole training and sampling process of the proposed approach takes only 1,005.5 s (less than 17 min). The proposed approach is therefore 16.5 times more computationally efficient than the Monte Carlo simulation.

Figs. 9 and 10 display the predictions with uncertainty quantification for system state probability and system reliability, respectively. The results indicate the consistency between the proposed approach and the Monte Carlo simulation. Indeed, some
deviations are observed in both the mean prediction and uncertainty bound for each state probability. Such deviations can be attributed to the sources of uncertainty in the Monte Carlo simulation and the neural network configuration. This inconsistency could be further reduced by increasing the number of replications and iterations for the Monte Carlo simulation and by enhancing the network configuration and training process. Note that the difficulty of training GANs has been well recognized, and the training of PIGANs becomes even more challenging due to the integration of a more complicated composite generator loss. Improving the network architecture configuration and the training of PIGANs is still an open topic of research, which is discussed in Section 5 and will be considered in the authors’ future work.

Fig. 7. The composite standard deviation of the results between the proposed approach and the Monte Carlo simulation.

Fig. 8. The results of system reliability using the proposed approach, the differential equation solver, and the Monte Carlo simulation.

We also present a comparison between the exact initial condition of the system and the corresponding prediction by the proposed approach. As shown in Fig. 11, the proposed approach performs well in quantifying the uncertainty of the system’s initial condition. Then, we follow the same process as in Section 4.2.1 to evaluate the consistency of the results between the proposed approach and the Monte Carlo simulation. Figs. 12 and 13 summarize the distribution of the absolute difference and composite standard deviation between the proposed approach and the Monte Carlo simulation. As can be observed, both measures remain relatively small (i.e., the overall median values are 0.0052 and 0.017, respectively). This implies that the proposed approach provides a satisfactory result when compared with the Monte Carlo simulation.
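For readers who want an independent reference for the setup of Example 2, the sketch below propagates the Beta-distributed initial condition through the forward Kolmogorov equations with the matrix of Eq. (16). It is neither the PIGANs model nor the state-sampling Monte Carlo simulation used in the paper: it assumes a hand-written classical fourth-order Runge–Kutta integrator, samples ρ_0 with Python's `random.betavariate`, and exploits the linearity of Eq. (2) so that only two ODE solves are needed. All function names are hypothetical.

```python
import random
import statistics


def q_matrix(t):
    """Transition rate matrix of Eq. (16)."""
    return [
        [-0.04 * t, 0.036 * t, 0.0, 0.004 * t],
        [0.0, -0.02 * t, 0.018 * t, 0.002 * t],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]


def rhs(t, p):
    """Right-hand side p(t) Q(t) of the forward Kolmogorov equations."""
    q = q_matrix(t)
    return [sum(p[i] * q[i][j] for i in range(4)) for j in range(4)]


def rk4(p0, t_end, steps=300):
    """Classical fourth-order Runge-Kutta integration from 0 to t_end."""
    h = t_end / steps
    t, p = 0.0, list(p0)
    for _ in range(steps):
        k1 = rhs(t, p)
        k2 = rhs(t + h / 2, [x + h / 2 * k for x, k in zip(p, k1)])
        k3 = rhs(t + h / 2, [x + h / 2 * k for x, k in zip(p, k2)])
        k4 = rhs(t + h, [x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
        t += h
    return p


def reliability_samples(t_end, n_samples=50, alpha=5.0, beta=1.5, seed=0):
    """Reliability R(t_end) for initial conditions [rho, 1 - rho, 0, 0]."""
    rng = random.Random(seed)
    # Eq. (2) is linear in p, so the solution for a mixed initial condition
    # is the same mixture of the two pure-state solutions.
    from_state0 = rk4([1.0, 0.0, 0.0, 0.0], t_end)
    from_state1 = rk4([0.0, 1.0, 0.0, 0.0], t_end)
    r0 = from_state0[0] + from_state0[1]  # reliable states are 0 and 1
    r1 = from_state1[0] + from_state1[1]
    samples = []
    for _ in range(n_samples):
        rho = rng.betavariate(alpha, beta)
        samples.append(rho * r0 + (1.0 - rho) * r1)
    return samples


samples = reliability_samples(t_end=3.0)
mean_r = statistics.mean(samples)
band = 2.0 * statistics.stdev(samples)  # two-standard-deviation half-width
```

The mean and two-standard-deviation band computed here play the same role as the point estimate and uncertainty interval of Eqs. (14) and (15), so a sketch of this kind can serve as a quick sanity check on the spread that the initial-condition uncertainty alone induces.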
Fig. 9. The results of system state probability using the proposed approach and the Monte Carlo simulation considering the measurement data of the system's initial condition.
Fig. 10. The results of system reliability using the proposed approach and the Monte Carlo simulation considering the measurement data of the system's initial condition.
Fig. 11. A comparison of the distribution of state probability initially in states 0 and 1 using the proposed approach.
Fig. 12. The absolute difference of the results between the proposed approach and the Monte Carlo simulation considering the measurement data of the system's initial condition.
proposed approach can be considered effective in incorporating the measurement data collected during the system's service life span.

Suppose a simulated system is inspected at time $t$, and the corresponding measurement data is $[t, p_t]$, where $t$ is the simulated system's operational time, and $p_t$ is the state probability vector according to inspection and expert judgment. The same notation applies to the baseline case, and the baseline system state probability is denoted by $[t^*, p^*_{t^*}]$. Then, the synthetic measurement data can be generated as equal to the state probability vector $p^*_{t^*}$ of the baseline case at a time instant $t^* = t \mp \Delta t$, which shifts the inspection time $t$ forward or backward. Particularly, a
Fig. 13. The composite standard deviation of the results between the proposed approach and the Monte Carlo simulation considering the measurement data of the system's initial condition.
Fig. 14. Updated reliability evolution for a simulated system with worse performance as compared to the baseline case.
Table 1
The synthetic measurement data generated to simulate a system with better or worse performance.

System                             Inspection time t   Time shift ∆t   Synthetic measurement data                   System reliability
A system with better performance   5                   2               [8.35E−01, 1.42E−01, 6.20E−03, 1.72E−02]     0.977
                                   10                  2               [2.79E−01, 4.47E−01, 1.81E−01, 9.23E−02]     0.726
                                   15                  2               [3.43E−02, 2.71E−01, 5.38E−01, 1.56E−01]     0.3053
A system with worse performance    2                   3               [8.35E−01, 1.42E−01, 6.20E−03, 1.72E−02]     0.977
                                   5                   2               [3.75E−01, 4.27E−01, 1.22E−01, 7.60E−02]     0.802
                                   9                   4               [3.43E−02, 2.71E−01, 5.38E−01, 1.56E−01]     0.3053
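As a reading aid for Table 1, the following minimal sketch encodes the tabulated rows, applies the time shift (backward for the better-performing system, forward for the worse-performing one), and recomputes the reliability column. It assumes that the system reliability equals the sum of the first two state probabilities, which is an inference from the tabulated values and is not stated in this section. The names `TABLE_1`, `baseline_time`, and `reliability` are hypothetical.

```python
# Rows of Table 1:
# (case, inspection time t, time shift dt, state probability vector, reported reliability)
TABLE_1 = [
    ("better", 5, 2, [8.35e-01, 1.42e-01, 6.20e-03, 1.72e-02], 0.977),
    ("better", 10, 2, [2.79e-01, 4.47e-01, 1.81e-01, 9.23e-02], 0.726),
    ("better", 15, 2, [3.43e-02, 2.71e-01, 5.38e-01, 1.56e-01], 0.3053),
    ("worse", 2, 3, [8.35e-01, 1.42e-01, 6.20e-03, 1.72e-02], 0.977),
    ("worse", 5, 2, [3.75e-01, 4.27e-01, 1.22e-01, 7.60e-02], 0.802),
    ("worse", 9, 4, [3.43e-02, 2.71e-01, 5.38e-01, 1.56e-01], 0.3053),
]


def baseline_time(t, dt, case):
    """Baseline time t* whose state probabilities serve as synthetic data.

    A backward shift (t* = t - dt) mimics a better-performing system,
    a forward shift (t* = t + dt) a worse-performing one.
    """
    if case == "better":
        return t - dt
    if case == "worse":
        return t + dt
    raise ValueError(f"unknown case: {case!r}")


def reliability(state_probabilities, n_operational=2):
    """Sum of the probabilities of the operational states.

    ASSUMPTION: the first n_operational entries are the operational states;
    this is inferred from the values in Table 1.
    """
    return sum(state_probabilities[:n_operational])


for case, t, dt, p, reported in TABLE_1:
    print(case, t, baseline_time(t, dt, case), round(reliability(p), 4), reported)
```

Under the stated assumption, the recomputed values agree with the reliability column to the reported precision.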
Fig. 15. Updated reliability evolution for a simulated system with better performance as compared to the baseline case.
backward shift, that is, $t^* = t - \Delta t$, leads to a system with better performance; a forward shift, that is, $t^* = t + \Delta t$, leads to a system with worse performance.

For demonstration purposes, Table 1 shows the synthetic measurement data generated to simulate two systems with either better or worse performance. The measurement data is sequentially used to update the system reliability evolution. The network architecture used is the same as the one used in Section 4.2.2. The Adam optimization algorithm is employed for training with $2 \times 10^4$ iterations. The weighting factor $\lambda$ is set equal to 1. An exponentially decaying learning rate is applied, with a starting learning rate of $1 \times 10^{-3}$, a decay rate of 0.9, and a decay step of 1000.

Figs. 14 and 15 show the reliability evolution of the simulated system with worse and better performance, respectively. The validity of the results can be justified based on the following insights:

• The model captures the trend that the reliability of the system is generally lower or higher than the baseline case, as in Figs. 14 and 15, respectively.
• The uncertainty of the reliability evolution can be effectively quantified to consider the measurement data, since the synthetic measurement data are bounded by the two-standard-deviation intervals of the updated system reliability.
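The exponentially decaying learning rate described in the training setup above can be sketched as follows. The section gives only the starting rate, the decay rate, and the decay step; whether the decay is applied continuously or in discrete stairs is not stated, so the sketch assumes the continuous form and exposes the staircase variant as an option. The function name `learning_rate` and its parameters are hypothetical.

```python
def learning_rate(step, initial_rate=1e-3, decay_rate=0.9,
                  decay_steps=1000, staircase=False):
    """Exponentially decayed learning rate at a given training step.

    lr(step) = initial_rate * decay_rate ** (step / decay_steps)

    ASSUMPTION: continuous decay by default; with staircase=True the
    exponent is truncated to an integer, so the rate drops every
    decay_steps iterations instead.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_rate * decay_rate ** exponent


# Learning rate at a few points of the 2 x 10^4 training iterations.
for step in (0, 1000, 10000, 20000):
    print(step, learning_rate(step))
```

With these settings the learning rate falls by a factor of 0.9 every 1000 iterations, ending at roughly 12% of its starting value after the final iteration.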
13
T. Zhou, E.L. Droguett and A. Mosleh Applied Soft Computing 126 (2022) 109217
14
T. Zhou, E.L. Droguett and A. Mosleh Applied Soft Computing 126 (2022) 109217
[27] S. Cofre-Martel, E.L. Droguett, M. Modarres, Remaining useful life es- [44] Nuclear Energy Agency, ICDE Project Report: Collection and Analysis of
timation through deep learning partial differential equation models: A Common-Cause Failures of Centrifugal Pumps, in: NEA/CSNI/R(2013)2,
framework for degradation dynamics interpretation using latent variables, 2013.
Shock Vib. (2021) 9937846. [45] A. Lisnianski, D. Elmakias, D. Laredo, H.B. Haim, A multi-state Markov
[28] G.E. Karniadakis, I.G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, model for a short-term reliability analysis of a power generating unit,
Physics-informed machine learning, Nat. Rev. Phys. (2021) 1–19. Reliab. Eng. Syst. Saf. 98 (1) (2012) 1–6.
[29] A. Daw, M. Maruf, A. Karpatne, PID-GAN: A GAN framework based on a [46] F. Caleyo, J.C. Velázquez, A. Valor, J.M. Hallen, Markov chain modelling
physics-informed discriminator for uncertainty quantification with physics, of pitting corrosion in underground pipelines, Corros. Sci. 51 (9) (2009)
2021, arXiv preprint arXiv:2106.02993. 2197–2207.
[30] J.E. Warner, J. Cuevas, G.F. Bomarito, P.E. Leser, W.P. Leser, Inverse esti- [47] K.S. Trivedi, A. Bobbio, Reliability and Availability Engineering: Modeling,
mation of elastic modulus using physics-informed generative adversarial Analysis, and Applications, Cambridge University Press, 2017.
networks, 2020, arXiv preprint arXiv:2006.05791. [48] R.G. Gallager, Stochastic Processes: Theory for Applications, Cambridge
[31] P. Jacquier, A. Abdedou, V. Delmas, A. Soulaïmani, Non-intrusive reduced- University Press, 2013.
order modeling using uncertainty-aware deep neural networks and proper [49] Y. Wan, H. Huang, D. Das, M. Pecht, Thermal reliability prediction and
orthogonal decomposition: Application to flood modeling, J. Comput. Phys. analysis for high-density electronic systems based on the Markov process,
424 (2021) 109854. Microelectron. Reliab. 56 (2016) 182–188.
[32] B. Lütjens, B. Leshchinskiy, C. Requena-Mesa, F. Chishtie, N. Díaz-Rodríguez, [50] M.H. Shu, B.M. Hsu, K.C. Kapur, Dynamic performance measures for tools
O. Boulais, A. Sankaranarayanan, A. Pina, Y. Gal, C. Raissi, A. Lavin, D. with multi-state wear processes and their applications for tool design and
Newman, Physically-consistent generative adversarial networks for coastal selection, Int. J. Prod. Res. 48 (16) (2010) 4725–4744.
flood visualization, 2021, arXiv preprint arXiv:2104.04785. [51] S.D. Unwin, P.P. Lowry, R.F. Layton, P.G. Heasler, M.B. Toloczko, Multi-state
[33] C. Li, J. Li, G. Wang, L. Carin, Learning to sample with adversarially learned physics models of aging passive components in probabilistic risk assess-
likelihood-ratio, 2018. ment, in: Proceedings of ANS PSA 2011 International Topical Meeting on
[34] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are Probabilistic Safety Assessment and Analysis, Wilmington, North Carolina,
universal approximators, Neural Netw. 2 (5) (1989) 359–366. USA, 2011.
[35] M. Modarres, M.P. Kaminskiy, V. Krivtsov, Reliability Engineering and Risk [52] Moura M. das Chagas, E.L. Droguett, Mathematical formulation and nu-
Analysis: A Practical Guide, third ed., CRC Press, 2016. merical treatment based on transition frequency densities and quadrature
[36] M. Raus, A. Barros, A. Hoyland, System Reliability Theory: Models, methods for non-homogeneous semi-Markov processes, Reliab. Eng. Syst.
Statistical Methods, and Applications, third ed., John Wiley & Sons, 2020. Saf. 94 (2) (2009) 342–349.
[37] A. Lisnianski, I. Frenkel, L. Khvatskin, Modern dynamic reliability analysis [53] M.D.C. Moura, E.L. Droguett, Numerical approach for assessing system
for multi-state systems, in: Springer Series in Reliability Engineering, dynamic availability via continuous time homogeneous semi-Markov
Springer, 2021. processes, Methodol. Comput. Appl. Probab. 12 (3) (2010) 431–449.
[38] Z. Zhang, X. Si, C. Hu, Y. Lei, Degradation data analysis and remaining useful [54] X. Hu, L. Chu, J. Pei, W. Liu, J. Bian, Model complexity of deep learning: A
life estimation: A review on Wiener-process-based methods, European J. survey, 2021, arXiv preprint arXiv:2103.05127.
Oper. Res. 271 (3) (2018) 775–796. [55] G. Van Rossum, F.L. Drake, Python 3 Reference Manual, CreateSpace, Scotts
[39] J.M. Van Noortwijk, M.D. Pandey, A Stochastic Deterioration Process for Valley, CA, 2009.
Time-Dependent Reliability Analysis, in Reliability and Optimization of [56] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S.
Structural Systems, CRC Press, 2020, pp. 259–265. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore,
[40] Y.F. Li, H.Z. Huang, J. Mi, W. Peng, X. Han, Reliability analysis of multi-state D.G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y.
systems with common cause failures based on Bayesian network and fuzzy Yu, X. Zheng, Tensorflow: A system for large-scale machine learning, in:
probability, Ann. Oper. Res. (2019) 1–15. Proceedings of the 12th USENIX Symposium on Operating Systems Design
[41] M. Bao, Y. Ding, C. Singh, C. Shao, A multi-state model for reliabil- and Implementation (OSDI ’16), Savannah, GA, USA, 2016.
ity assessment of integrated gas and power systems utilizing universal [57] C.R. Harris, K.J. Millman, S.J. van der Walt, R. Gommers, P. Virtanen,
generating function techniques, IEEE Trans. Smart Grid 10 (6) (2019) D. Cournapeau, E. Wieser, J. Taylor, S. Berg, N.J. Smith, R. Kern, Array
6271–6283. programming with numpy, Nature 585 (7825) (2020) 357–362.
[42] R. Arismendi, A. Barros, A. Grall, Piecewise deterministic Markov process [58] L.F. Shampine, I. Gladwell, S. Thompson, Solving ODEs with MATLAB,
for condition-based maintenance models—Application to critical infrastruc- Cambridge University Press, Cambridge U.K, 2003.
tures with discrete-state deterioration, Reliab. Eng. Syst. Saf. 212 (2021) [59] A. Rindos, S. Woolet, I. Viniotis, K. Trivedi, Exact methods for the
107540. transient analysis of nonhomogeneous continuous time Markov chains,
[43] D.W. Coit, E. Zio, The evolution of system reliability optimization, Reliab. in: Computations with Markov Chains, Springer, Boston, MA, 1995, pp.
Eng. Syst. Saf. 192 (2019) 106259. 121–133.
15