
Accepted Manuscript

Bayesian network modelling and analysis of accident severity in waterborne transportation: a case study in China

Likun Wang, Zaili Yang

PII: S0951-8320(18)30135-2
DOI: 10.1016/j.ress.2018.07.021
Reference: RESS 6220

To appear in: Reliability Engineering and System Safety

Received date: 4 February 2018
Revised date: 28 June 2018
Accepted date: 18 July 2018

Please cite this article as: Likun Wang, Zaili Yang, Bayesian network modelling and analysis of
accident severity in waterborne transportation: a case study in China, Reliability Engineering and
System Safety (2018), doi: 10.1016/j.ress.2018.07.021

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.

Highlights

• Analysis of accident severity in waterborne transportation.
• Collection and statistical analysis of maritime accident data over 30 years.
• Identification of key factors influencing maritime accident severity using data-driven BN and expert verification.
• Dynamic prediction of maritime accident severity under high uncertainties.


Bayesian network modelling and analysis of accident severity in waterborne transportation: a case study in China

Likun Wang (1), Zaili Yang (2)

1. College of Transport and Communication, Shanghai Maritime University, Shanghai, China
2. Liverpool Logistics, Offshore and Marine Research Institute, Liverpool John Moores University, Liverpool, UK

Abstract

The rapid development of the shipping industry requires the use of large vessels carrying
high-volume cargoes. Accidents incurred by these vessels can lead to a heavy loss of life and
damage to the environment and property. As a leading country in international trade, China has
developed its waterway transport systems, including inland waterways and coastal shipping, in the
past decades. A few catastrophic shipping accidents have occurred during this period. This paper
aims to develop a new risk analysis approach based on Bayesian networks (BNs) to enable the
analysis of accident severity in waterborne transportation. Although the risk data are derived from
accidents that occurred in China's waters, the risk factors influencing accident severity and the
risk modelling methodology are generic and capable of generating useful insights on waterway
risk analysis in a broad sense.

To develop the BN-based risk model, waterway accident data are first collected from all
accident investigation reports by China's Maritime Safety Administration (MSA) from 1979 to
2015. Based on the derived quantitative data, we identify the factors related to the severity of
waterway accidents and use them as nodes of the risk model. Second, based on a receiver
operating characteristic (ROC) curve, an augmented naïve BN (ABN) model is selected through a
comparative study with a naïve BN (NBN) model to analyse the key risk factors influencing
waterway accident severity. The results show that the key factors influencing waterway safety
include the type and location of the accident and the type and age of the ship. Moreover, a novel
scenario analysis is conducted to predict accident severity in various situations by combining
different states (e.g., high risk) of the key factors to generate useful insights for accident
prevention. More specifically, the findings can aid transport authorities, ship owners and other
stakeholders in improving waterborne transportation safety under uncertainty.

1. Introduction
Waterborne transportation is vital for sustaining national economic development given its
capability of providing cheaper and greener solutions compared to other transport modes. For
instance, over the past several years, China's waterway (including both coastal and inland)
shipping has been developing rapidly. By 2016, its waterway freight volume reached 6.382 billion
tonnes, which was a 480% increase from 2001, and its water transport ship load capacity reached
266.22 million tonnes, 211.73 million tonnes more than that in 2001. Due to increasing shipping
demand, waterway traffic density increases, and the navigational environment becomes complex,
leading to a high level of risk. It has been reported that in 2016 alone, 196 accidents occurred and
203 people died in China's waterways (Statistical Bulletin of Transportation Industry
Development, 2017). Maritime accidents cause casualties, economic loss, environmental
degradation and waterway congestion (Zhang et al. 2013). Compared to ocean transportation,
ships used in inland and coastal waterways are smaller, and their ability to tackle emergencies is
lower. Hence, the probability and severity of the accidents that involve these ships are probably
higher. It has been found that larger Danish-flagged cargo ships often suffer fewer accidents
(Danish Maritime Authorities, 2010). Hansen, Jepsen, and Hermansen (2012) concluded that
vessels smaller than 3000 GT put their crews at great risk of being in a maritime accident, which
requires the crews of such vessels to abandon ship more frequently than those of large vessels.
Statistics from the Marine Accident Investigation Branch (MAIB) in 2010 disclosed that the risk
of the total loss of a small ship is much higher than that of a large vessel. This paper therefore
aims to analyse the characteristics of shipping accidents involving small vessels in inland and
coastal waterways and, based on a data-driven Bayesian network (BN), to identify the important
risk factors influencing accident severity for risk prediction and accident prevention. The accident
data used in this study are obtained from the accident investigation reports by China's Maritime
Safety Administration (MSA) over the past 30 years. Accidents that occurred in China's inland
and coastal waterways are selected, and those associated with deep-sea transportation are
excluded. The term "accident" in this paper refers to both accidents and casualties, as defined by
the International Maritime Organization (IMO).
This paper is organised as follows. Section 2 describes the current literature relating to
maritime accidents, with a focus on maritime risk assessment using BNs. In Section 3, the method
and results of accident data mining are presented. Section 4 presents the methodology of
developing a BN-based risk model for the analysis of waterway accidents. In contrast to previous
relevant studies that rely, more or less, on expert judgments to interpret subjective probabilities,
the novelty of the methodology lies in the use of data-driven approaches to identify key risk
factors and quantify their interdependencies. There are no studies in the literature that analyse all
accident investigation reports of a particular country/region (e.g., China) over a long time span
(e.g., 30 years). In Section 5, the BN model verification is conducted by comparing two models
based on different BN calculations. Section 6 describes the scenario analysis for drawing useful
findings in terms of risk prediction and accident prevention. Conclusions and future work are
discussed in Section 7.

2. Literature review

2.1 Studies of maritime accidents

Previous studies on maritime accidents involve a wide variety of geographical locations,
including the Gulf of Finland (Mazaheri, Montewka, and Kujala 2014), Istanbul Strait (Aydogdu,
2014), the UK (Chauvin et al. 2013; Bhattacharya, 2012), Greece (Tzannatos and Kokotos, 2009),
Sweden (Mullai and Paulsson 2011), the heritage regions of Tubbataha and Banc d'Arguin (Heij et
al. 2013), the Arctic (Kum and Sahin 2015), the North Atlantic and Arctic regions (Knapp,
Bijwaard, and Heij 2011), and China's Yangtze River (Zhang et al. 2013; Zhang et al. 2014).
Ship accidents are caused by equipment failure, human error, environmental effects,
excessive loads, or a combination of these factors (Guedes Soares and Teixeira, 2001; Antao and
Guedes Soares, 2008). The integration of established and novel techniques to assess risks is a
current goal within many maritime organisations. The application of probabilistic methods to
model some of these high risks is a current practice because it has potential to help in the process
of decision making, which would allow regulatory changes to be proposed (Guedes Soares and
Teixeira, 2001).
Some papers set natural weather conditions as one of various variables that influence ship
accidents (Zhang et al., 2013; Mullai and Paulsson, 2011; Balmat et al., 2009), whereas Knapp et
al. (2011) focus on oceanographic conditions. Their study uses econometric models to measure the
effect of significant wave height and wind strength on the probability of vessel casualty, and the
results show that the probability of vessel casualty is influenced by seasonality, wind strength and
wave height.
It is commonly stated that 80% of all accidents are associated with human factors (Antao and
Guedes Soares, 2008). Several human factor analysis models have been introduced and widely
used, such as the Human Factors Analysis and Classification System (HFACS), the Technique for
Retrospective and Predictive Analysis of Cognitive Errors (TRACEr), the Cognitive Reliability
and Error Analysis Method (CREAM) and Accident Analyse Mapping (AcciMap). Chen et al.
(2013) established a maritime incident analysis framework using HFACS-MA. Akyuz (2015)
assessed human factors in ship grounding accidents with AcciMap. Sotiralis et al. (2016)
calculated the collision accident probability due to human error with TRACEr and BN. Yang et al.
(2013), Wu et al. (2017) and Xi et al. (2017) proposed different modified CREAM approaches based on
evidential reasoning to estimate the human error probability in maritime accidents.
Reviewing these studies reveals that although diverse causes of accidents are presented in
different routes/locations, common risk factors influencing the occurrence probability or
consequence severity of accidents exist; these are presented in Table 1. The analysis of such
causes and factors informs the selection of the initial set of risk variables in this study.

Table 1
Variables from the relevant literature

Variable | Literature sources
Ship type | Weng and Yang (2015); Heij and Knapp (2012); Cariou, Mejia, and Wolff (2008); Balmat et al. (2011); Li, Yin, and Fan (2014); Knapp et al. (2011)
Hull type | Balmat et al. (2009)
Ship age | Knapp et al. (2011); Balmat et al. (2009); Zhang et al. (2013); Li, Yin, and Fan (2014); Wu et al. (2015)
Ship flag or registry | Knapp et al. (2011); Balmat et al. (2009); Li, Yin, and Fan (2014)
Gross tonnage | Zhang et al. (2013); Knapp et al. (2011); Balmat et al. (2009); Hansen, Jepsen, and Hermansen (2012); Li, Yin, and Fan (2014); Knapp, Bijwaard, and Heij (2011)
Ship speed | Balmat et al. (2011); Talley, Yip, and Jin (2012)
Ship defects | Hänninen and Kujala (2014); Knapp, Bijwaard, and Heij (2011)
Loading | Guedes Soares and Teixeira (2001); Akyuz and Celik (2014)
Crew | Akhtar and Utne (2014); Mullai and Paulsson (2011); Yang et al. (2013); Hänninen and Kujala (2012); Weng and Yang (2015); Prabhu Gaonkar, Xie, and Fu (2013)
Location | Weng and Yang (2015); Mullai and Paulsson (2011); Sun et al. (2013)
Rain, fog | Mullai and Paulsson (2011); Balmat et al. (2009); Weng and Yang (2015)
Visibility | Balmat et al. (2009); Zhang et al. (2013)
Wind | Zhang et al. (2013); Heij et al. (2013)
Season | Knapp et al. (2011); Zhang et al. (2013); Li, Yin, and Fan (2014)
Human factors | Antao and Guedes Soares (2008); Chen et al. (2013); Akyuz (2015); Sotiralis et al. (2016); Wu et al. (2017)

Maritime accident risk models often involve quantitative analysis. The IMO proposed a
formal safety assessment (FSA) method for risk management in maritime accident analysis. The
FSA method is a systematic approach to ship accident analysis that considers ship condition,
organisational management, human operation and hardware (Guedes Soares and Teixeira, 2001).
To further compute the causal relationships among the above factors, some quantitative risk
assessments are provided in maritime accident research. For example, fault-tree analysis (FTA)
has been used to analyse the causes of maritime accidents (Ronza et al. 2003; Kum and Sahin
2015). Antao and Guedes Soares (2006) used the FSA method to identify basic events that could
lead to a Ro-Ro vessel accident, and they built an FTA model to analyse the relation between the
relevant events. Recently, Zhang et al. (2013) used the FSA method to analyse ship accident
consequences in the Yangtze River and then used the BN tool for quantitative analysis. In addition,
Fabiano et al. (2010) proposed summarised statistics for evaluating accident frequency over time
or certain risk control levels. Balmat et al. (2011) evaluated maritime risk assessment based on a
fuzzy-logic approach.
Maritime accident data are available from established datasets and accident investigation
reports. Among the most often used historical datasets are Lloyd's Register Fairplay, Lloyd's
Maritime Intelligence Unit, and the IMO. The contained statistics consist of ship names, ship
registries, accident dates and times, types of casualties, consequences, locations, ship types, gross
tonnages, classification societies, dead weights, and injured or dead people. Lloyd's data normally
cover ships larger than 100 gross registered tons and thus omit a large percentage of fishing
vessels (Guedes Soares and Teixeira, 2001).
Heij et al. (2013) used the accident statistics of Lloyd's Register Fairplay, Lloyd's Maritime
Intelligence Unit and the IMO. Wu et al. (2015) used accident statistics of the Yangtze River MSA,
and Weng and Yang (2015) used shipping accident statistics managed by Lloyd's List Intelligence
Company.
Accident investigation reports are a useful way to obtain more complete accident data. The
investigation reports of maritime accidents are often available from maritime authorities, such as
the MAIB of the UK, the MSA of China, and the Transport Safety Board of Canada. These reports
provide much more detailed information than existing databases and contain details of what
occurred, subsequent actions taken and recommendations. In the present literature, few studies use
accident investigation reports to conduct accident analysis, and even fewer use them to conduct
quantitative risk analysis simply due to the large workload required to aggregate the data from
each report for a dataset of meaningful critical mass. For instance, from cases of high-speed craft
accidents, Antao and Guedes Soares (2008) found the chain of events that led to the accidents and
their associated contributory factors and causes. With the taxonomy of the TRACEr method,
Graziano et al. (2016) coded and analysed grounding and collision accident investigation reports
to identify human and organisational factors (HOFs). Chauvin et al. (2013), Akhtar and Utne
(2014), and Chen et al. (2013) analysed accident reports of selected cases to investigate maritime
accident HOFs.
The data collection and analysis method based on accident reports, though time consuming,
is expected to bring new findings and rich information that cannot be easily obtained from existing
databases and to facilitate the use of primary data in maritime accident analysis. Its novelty is also
highlighted by feeding such information into an advanced BN model for enabling risk prediction
and accident prevention, instead of discussing the importance of individual factors based on basic
statistical analysis. It will therefore help the authors generate new findings in this study region, in
contrast to previous relevant studies, which rely on data derived from the same/similar sources.

2.2 Use of BN in maritime accident analysis


Despite such efforts, the uncertainty (e.g., incompleteness and randomness) in historical
failure data in the maritime industry stimulates the use of advanced techniques in risk assessment.
For instance, the IMO considered the incorporation of BN modelling in risk quantification in FSA
studies (Yang et al., 2013a). BNs have been used in waterway transport accident research due to
their advantages, including their usefulness in conducting backward risk diagnosis and forward
risk prediction and in accommodating new evidence to update an analysis without the need to
significantly change the original network (Yang et al., 2009; 2013b).


In applying a BN in maritime accident analysis, the first challenge is to construct the BN
structure. Normally, a BN is constructed with aid from data correlation, expert knowledge, a
literature review, or a combination of the above (Zhang, 2016). Trucco et al. (2008) presented a
BN of maritime accidents with human and organisational factors, which was an extension of the
fault tree. With the aid of experts in the maritime and petroleum industries, a BN was obtained to
predict the risk of maritime piracy against offshore oil fields (Bouejla et al. 2014). Based on the
available data from the Maritime Authority (DGAM), expert knowledge was consulted for the
construction and validation of the BBN model (Antao and Guedes Soares, 2008). With the
suggestions of six selected experts who were consulted to identify major factors influencing the
likelihood of a successful hijacking of a ship, a BN model was developed to estimate the
likelihood of a ship being hijacked in the Western Indian or Eastern African regions (Pristrom et al.
2016). With a combination of statistics and expert knowledge, Zhang et al. (2016) built a Bayesian
belief network to express the dependencies between the indicator variables and the consequences
of the Tianjin port accident.
When BN structures are developed from data using a machine learning algorithm, there is a
possibility that the generated causal relationships are unreasonable and ambiguous. Previous
studies have used expert knowledge or a taxonomy model to optimise such structures. Zhang et al.
(2013) estimated navigational risks on the Yangtze River using a BN technique; the preliminary
structure of the BN was obtained from data via the necessary path condition algorithm, and
additional domain knowledge was referenced to further consolidate the structure. Ma et al. (2016)
presented a BN-based target-extraction method to extract moving vessels from numerous blips
captured in frame-by-frame radar images; at the beginning, an initial BN structure was established
based on expert judgment and was then improved with the help of a K2 scoring algorithm. Akhtar
and Utne (2014) developed a Bayesian causal network to analyse maritime accidents using the
qualitative model (HFACS) and its taxonomy for structuring fatigue-related factors into levels,
which reduced the number of links (correlations) required.
To avoid the subjectivity associated with expert input in BN modelling, two plain machine
learning algorithms, the naïve BN (NBN) and the augmented NBN (ABN) (Friedman et al., 1997),
are applied in this study because of their demonstrated efficiency and capability. With the core
idea of classification, the NBN and ABN models can be built to simplify BN structures without
sacrificing the accuracy of the model.
Previous studies in which a BN was applied to maritime risk tended to focus on the
probabilities of shipping accidents rather than their severity, and accident data were frequently
obtained directly from existing databases rather than compiled from investigation reports. The
novelty of this study is its attempt to construct a BN from primary data directly derived from
accident investigation reports containing rich information that fits the specific requirement of this
study.
Furthermore, we extract influencing factors and the nature of waterway transportation
accidents from accident investigation reports using text mining techniques. The text mining
method has a wealth of applications in other disciplines, such as enterprise management and
sociology (Glaser, 1992). However, the use of this text coding approach in maritime risk data
elicitation, such as in Mullai and Paulsson (2011), is scant. To develop a rational BN structure, we
use and compare the NBN and ABN algorithms to select the best-fit BN structure with specific
evaluation indicators.

3. Data mining

3.1 Data acquisition

We collected 229 investigation reports on accidents in Chinese coastal waterways and inland
rivers from China's MSA and its fourteen subordinates. As many as 350 vessels were involved in
these reported accidents from 1979 to 2015. Each report includes a description of the ship(s), crew,
ship companies, accident location, navigational environment, accident process, losses, and an
analysis of the cause.


With regard to severity, a maritime accident can be classified as a catastrophic accident, a
critical accident, a major accident or a minor accident (MoT, 2002). Their explanations are
detailed in Table 2.
Table 2
Classification of the consequences of accidents

Ship size | Minor | Major | Critical | Catastrophic
Ships over 3,000 gross tonnage | Below minor accident | Serious injury, or economic loss between 500k and 3,000k RMB | 1-2 fatalities, or economic loss between 3,000k and 5,000k RMB | Over 2 fatalities, or economic loss over 5,000k RMB
Ships between 500 and 3,000 gross tonnage | Below minor accident | Serious injury, or economic loss between 200k and 500k RMB | 1-2 fatalities, or economic loss between 500k and 3,000k RMB | Over 2 fatalities, or economic loss over 3,000k RMB
Ships below 500 gross tonnage | Below minor accident | Serious injury, or economic loss between 100k and 200k RMB | 1-2 fatalities, or economic loss between 200k and 500k RMB | Over 2 fatalities, or economic loss over 500k RMB

Source: Regulation of Water Transportation Accident Statistics, Ministry of Transport (MoT),
China, 2002.
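
The classification in Table 2 can be sketched as a small function. This is an illustrative reading, not the regulation's official procedure: the name `classify_severity`, its arguments, and the handling of boundary values (a loss exactly on a threshold, or a ship of exactly 500 or 3,000 gross tonnage) are assumptions. The catastrophic loss thresholds for the two larger ship classes are read as 5,000k and 3,000k RMB, as the adjacent critical bands imply, and "minor" is taken to mean anything below the major criteria.

```python
# Loss thresholds in thousand RMB per tonnage band, transcribed from Table 2:
# (lower bound of major, lower bound of critical, lower bound of catastrophic).
LOSS_THRESHOLDS_K_RMB = {
    "over_3000_gt": (500, 3000, 5000),
    "500_to_3000_gt": (200, 500, 3000),
    "below_500_gt": (100, 200, 500),
}


def tonnage_band(gross_tonnage: float) -> str:
    """Map gross tonnage to a Table 2 row (boundary handling is an assumption)."""
    if gross_tonnage > 3000:
        return "over_3000_gt"
    if gross_tonnage >= 500:
        return "500_to_3000_gt"
    return "below_500_gt"


def classify_severity(gross_tonnage: float, fatalities: int = 0,
                      serious_injury: bool = False,
                      loss_k_rmb: float = 0.0) -> str:
    """Return 'minor', 'major', 'critical' or 'catastrophic'."""
    major, critical, catastrophic = LOSS_THRESHOLDS_K_RMB[tonnage_band(gross_tonnage)]
    # Check the most severe class first so that any single criterion suffices.
    if fatalities > 2 or loss_k_rmb > catastrophic:
        return "catastrophic"
    if fatalities >= 1 or loss_k_rmb >= critical:
        return "critical"
    if serious_injury or loss_k_rmb >= major:
        return "major"
    return "minor"


print(classify_severity(gross_tonnage=400, loss_k_rmb=150))  # major
```

Checking the classes from most to least severe means a record is assigned the highest class for which any one criterion (fatalities, injury or economic loss) is met.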

3.2 Coding

The grounded theory (GT) method is a systematic methodology involving the discovery of
theory through the analysis of data (Glaser and Strauss, 1967). Selective reduction is the kernel of
GT. Using the GT method, we process the 229 text cases collected from the MSA step by step through
coding, conceptual formulation, categorisation and repeated comparisons to extract the influencing risk
factors and their effects on waterway transport accidents. The specific process is described as
follows.
(1) Coding: We read the original case reports word by word and sentence by sentence to
encode the data. We mark the sentences in the original text that involve the consequences of each
waterway accident and the related risk factors.
(2) Conceptual formulation: We borrow concepts from the existing literature or use the
report analyst's language, and we group the marked sentences into similar concepts of the related
risk factors and the consequences of the waterway accidents.
(3) Categorisation: We study all cases repeatedly, and we make constant comparisons
between the reported cases. We unify concepts of the same meaning to form the categories at
higher levels of abstraction. Then, we retrieve the ultimate accident-related categories, which are
presented in Table 3. In this process, the attributes and their categorisation in previous studies (in
Table 1) are used as a reference.
(4) Attribution: According to the categories in Table 3, we obtain the related attribute values
of each case. Finally, we obtain a database of 350 records with 21 columns.
(5) Relationship recognition: When contrasting explanations (i.e. causal relationships)
appear, one solution is to use domain expert evaluations with reference to the mainstream
explanations in the literature. If the experts argue for explanations opposite to the mainstream
ones, extra justifications are needed. The other solution is to mark the contrasting explanations in
order to test the sensitivity in the model validation process.
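
The attribution step (4) amounts to mapping each extracted attribute to the integer state listed in Table 3. A minimal sketch of such an encoder follows. The field names, the helper names `encode_value` and `encode_record`, and the treatment of values that fall between two listed ranges (for example a ship age of 5.5 years) are assumptions made for illustration; the original coding was done manually from the report text.

```python
from bisect import bisect_left

# Inclusive upper bounds of every state except the last one, read from Table 3.
NUMERIC_BINS = {
    "ship_age_years": [5, 10, 15, 20],   # states 1..5
    "length_m": [100],                   # states 1..2
    "gross_tonnage": [300, 10000],       # states 1..3
    "ship_speed_knots": [5, 10],         # states 1..3
    "visibility_km": [2, 10],            # states 1..3
    "wind_beaufort": [4, 7],             # states 1..3
}

# Ordered state labels for two of the categorical variables in Table 3.
CATEGORICAL_STATES = {
    "ship_type": [
        "container ship", "dry bulk cargo ship", "fishing ship",
        "tanker or chemical ship", "barge or tug", "ro/ro ship",
        "passenger ship", "other",
    ],
    "accident_severity": ["minor", "major", "critical", "catastrophic"],
}


def encode_value(variable, value):
    """Return the 1-based state index of a value for the given variable."""
    if variable in NUMERIC_BINS:
        # bisect_left counts the bounds strictly below the value, so a value
        # equal to a bound stays in the lower state ("5 or less" -> state 1).
        return bisect_left(NUMERIC_BINS[variable], value) + 1
    return CATEGORICAL_STATES[variable].index(value) + 1


def encode_record(raw):
    """Encode one accident record (a dict of raw attributes) into state indices."""
    return {name: encode_value(name, value) for name, value in raw.items()}


record = {"ship_type": "fishing ship", "ship_age_years": 12,
          "wind_beaufort": 8, "accident_severity": "critical"}
print(encode_record(record))
```

Each encoded record then becomes one row of the database used for BN learning.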
Fig. 1 presents the ships involved in the accidents. It shows that accidents involving bulk
cargo ships occur most frequently, as found by Zhang et al. (2016), followed by container ships,
and more than 50% of the catastrophic accidents involve bulk cargo ships. The occurrence
frequencies of the other variables are given as percentages for each state in Table 3.
Table 3
Categories

Category | Descriptions | Notation values | Occurrence frequencies (%)
Ship type | Container ship, dry bulk cargo ship, fishing ship, tanker or chemical ship, barge or tug, ro/ro ship, passenger ship, other | 1, 2, 3, 4, 5, 6, 7, 8 | 13.7, 51.4, 7.7, 9.4, 4.3, 3.7, 5.2, 4.6
Hull type | Steel, wood, aluminium alloy | 1, 2, 3 | 94.6, 2.9, 2.5
Ship age (years) | 0 to 5, 6 to 10, 11 to 15, 16 to 20, more than 20 | 1, 2, 3, 4, 5 | 44.6, 23.1, 11.7, 7.5, 13.1
Ship flag* | China, FOC, other | 1, 2, 3 | 80.2, 10.9, 8.9
Length (metres) | 100 or less, more than 100 | 1, 2 | 63.3, 36.7
Gross tonnage (GT) | 300 or less, 300 to 10000, greater than 10000 | 1, 2, 3 | 15.4, 62, 22.6
Ship speed (knots) | 5 or less, 5 to 10, greater than 10 | 1, 2, 3 | 42.5, 36.1, 21.4
Ship defects | No defect, defect was unrelated to the accident, or defect was corrected in a recent PSC check before sailing; relevant defect, or no recent PSC check | 1, 2 | 77.2, 22.8
Loading | Normally loaded, ballast, overloaded | 1, 2, 3 | 79, 13.3, 7.7
Crew | Sufficient crew with valid certificates; insufficient crew, lack of a certificate, or invalid certificate | 1, 2 | 79.1, 20.9
Location | Quay, port channel, anchorage, inland waterway, coastal waterway | 1, 2, 3, 4, 5 | 13.1, 8.5, 10.6, 24.9, 42.9
Rain | No rain or unmentioned, rain | 1, 2 | 80.9, 19.1
Fog | No fog or unmentioned, fog | 1, 2 | 79.2, 20.8
Visibility (km) | 2 or less, 2 to 10, greater than 10 | 1, 2, 3 | 18, 29.1, 52.9
Wind (Beaufort scale) | 4 or less, 5 to 7, greater than 7 | 1, 2, 3 | 30.8, 58.6, 10.6
Season | Non-dry season, dry season (November to the following March) | 1, 2 | 57.9, 42.1
Time of day | 07:00 to 19:00, other | 1, 2 | 35.5, 64.5
Navigational environment | Good, poor (complex geographic environment or dense traffic) | 1, 2 | 25.3, 74.7
Human factors | No human factors, human factors | 1, 2 | 7.1, 92.9
Accident type** | Collision, contact, stranding, grounding, fire/explosion, sinking, wind strike, other | 1, 2, 3, 4, 5, 6, 7, 8 | 60.3, 7.4, 1.4, 3.7, 4.6, 10.3, 6, 6.3
Accident severity | Minor, major, critical, catastrophic | 1, 2, 3, 4 | 20.6, 14.1, 20.6, 44.7

* The flag of convenience (FOC) refers to vessel registry in the following locations: Panama, Limassol, Kingston,
Valletta, Belize, Majuro, Cyprus, Phnom Penh, Cambodia, and Willemstad.
** The maritime accidents were divided into eight types, i.e., collision, contact, stranding, grounding,
fire/explosion, sinking, wind strike and other, according to the Regulation of Water Transportation Accidents
Statistics provided by the MoT (2002).

In the subsequent quantitative analysis, accident severity is defined as a dependent variable,
the other categories in Table 3 are treated as influencing variables, and the attributes correspond to
the variable states (see footnote 1).
[Figure 1: bar chart. Horizontal axis: ship type (container ship, dry bulk ship, fishing ship, tanker, barge/tug, ro/ro ship, passenger ship, other). Vertical axis: accident severity, 0% to 60%. Series: minor, major, critical, catastrophic.]

Fig. 1. Percentage of each ship type involved in minor, major, critical, and catastrophic
accidents.

4. Use of BN modelling to analyse the severity of maritime accidents


BN theory was introduced by Pearl (1988), and a BN can be expressed as the following pair: S =
<G, P>. G is a directed acyclic graph in which the nodes correspond to the variables and the
directed arcs represent the causal relationships between variables, and P is the set of conditional
probability distributions attached to the nodes. A directed arc from node X to node Y indicates
that X has a direct causal effect on Y, and the conditional probability P(Y|X) represents the
intensity of the causal effect.

$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)} \qquad (1)$$

P(Y) is the prior probability of the hypothesis, i.e., the likelihood that Y will be in a certain
state, prior to consideration of any other relevant information (evidence), which is X. P(X|Y) is the
conditional probability (the likelihood of evidence given the hypothesis to be tested), and P(Y|X) is
the posterior probability of the hypothesis (the likelihood of Y being in a certain state, conditional
on the evidence provided) (Akhtar and Utne 2014).
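
As a worked illustration of Eq. (1), the sketch below updates the probability of a binary hypothesis Y after observing evidence X. The prior 0.447 is the share of catastrophic accidents in Table 3; the two likelihood values and the function name `posterior` are hypothetical and are not taken from the study's data.

```python
def posterior(prior_y: float, p_x_given_y: float, p_x_given_not_y: float) -> float:
    """Return P(Y|X) for a binary hypothesis Y using Eq. (1)."""
    # The evidence term P(X) follows from the law of total probability.
    p_x = p_x_given_y * prior_y + p_x_given_not_y * (1.0 - prior_y)
    return p_x_given_y * prior_y / p_x


# Prior: P(catastrophic) = 0.447 (Table 3).
# Hypothetical likelihoods: P(X | catastrophic) = 0.30, P(X | not catastrophic) = 0.15.
updated = posterior(0.447, 0.30, 0.15)
print(f"P(catastrophic | X) = {updated:.3f}")
```

When the two likelihoods are equal, the evidence carries no information and the posterior equals the prior, which is a quick sanity check on any implementation of Eq. (1).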
The development of a BN model includes the following steps: BN structure learning, BN
monitoring and analysis and model validation, sensitivity analysis, estimation and evaluation
(Zhang et al. 2013).

Footnote 1: The entire process was conducted in Chinese given that the accident investigation reports were all in this
language. The identified categories were later translated to English for this paper.


Two methods are available for developing the structure of a BN. One method is the use of
expert knowledge. Its disadvantage is that some causal relationships are not easily analysed or are
expressed subjectively. An alternative method for BN construction is to develop the network
structure and parameter estimation by data-driven machine learning methods and use experts to
validate the final structure to assure the meaningfulness of the relationships. Considering the
availability of the historical data derived from the accident reports, data-driven BN approaches are
applied in this study. A significant drawback of the data-driven approach is that the number of
possible structures for a given problem increases super-exponentially with the number of variables
in the problem domain (Yang et al., 2018).
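
To give a sense of this growth, the number of labelled directed acyclic graphs on n nodes can be counted with Robinson's recurrence. The recurrence is brought in here purely for illustration and is not part of the original study; the function name `count_dags` is an assumption.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes, computed with Robinson's recurrence."""
    if n == 0:
        return 1
    # Inclusion-exclusion over the k nodes that have no incoming arcs.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in (2, 3, 4, 5, 10, 21):
    print(n, count_dags(n))
```

For the 21 variables of this study the count exceeds 2^210, since every subset of the 210 forward arcs under one fixed node ordering is already a valid DAG, so an exhaustive search over structures is not feasible.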

The NBN and the ABN can reduce the complexity given that the partial structure of the network
is fixed. An NBN is a simple structure that has an independent node as the parent node of all the
other nodes, and no other connections are allowed in the structure. However, strong assumptions
are required in most NBN cases. To make the model more realistic, we adopted an ABN model,
whose architecture consists of a naïve architecture that is made richer by basing the ties between
the child nodes on the value of the target node. Since ABN is based on NBN, the latter is
introduced first.
AN

4.1 NBN learning

An NBN is a network structure in which the target node is directly connected to all other nodes and each child node is assumed to be independent of the other child nodes given the target node. The NBN structure is generated by specification. The NBN model is most commonly applied to classification problems (Friedman et al., 1997).
a) Let ‘accident severity’ be the class variable ($S$), with one state for each severity level, and let $R = \{R_{ST}, R_{HT}, \ldots, R_{AT}\}$ be the set of risk variables ($R_k$) (i.e., ship type, hull type, ship age, ship flag, length, gross tonnage, ship speed, ship defects, loading, crew, location, rain, fog, visibility, wind, season, time of day, navigational environment, human factors, and accident type, respectively), where each variable represents a property that we observe and include in our model.
b) Given the simplicity and the strong assumption of pairwise independence of the attributes, two types of structures can describe the relationships between $S$ and $R_k$.
Fig. 2(a) shows the first structure, in which each $R_k$ is a parent node of ‘accident severity’, ‘accident severity’ is the only child of each risk factor node, and there are no other links. In our study, ‘accident severity’ ($S$) has four states and 20 influencing variables, each of which has more than one state, as listed in Table 3. For any set of observations $\{R_{ST}, R_{HT}, \ldots, R_{AT}\}$, the cost of computing the conditional probability distribution $P(S \mid R_{ST}, R_{HT}, \ldots, R_{AT})$ is non-linear, and more than 2E+09 conditional probability distributions may need to be computed (the size of the conditional probability table increases exponentially with the number of parents).
Fig. 2(b) shows the second structure, in which ‘accident severity’ has no parents and $S$ is the only parent of each feature variable. This structure consists of the prior distribution $P(S)$ and 65 conditional probability distributions $P(R_k \mid S)$. This classifier is much simpler, and it can still express the relationship between the variables. In this paper, we adopt the structure in Fig. 2(b) as the NBN structure.
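The contrast between the two structures can be illustrated with a short sketch. The state counts used here are hypothetical placeholders (the actual counts are those of Table 3, which is not reproduced in this section); only the growth pattern is of interest.

```python
from math import prod

def converging_parent_configs(parent_states):
    """Parent configurations when S is the child of every risk variable (Fig. 2(a))."""
    return prod(parent_states)

def diverging_distributions(parent_states, n_severity_states):
    """Conditional distributions when S is the only parent of every risk
    variable (Fig. 2(b)): one distribution per child per state of S."""
    return n_severity_states * len(parent_states)

# Hypothetical example: 20 risk variables with 3 states each, 4 severity states.
states = [3] * 20
print(converging_parent_configs(states))   # grows as a product of state counts
print(diverging_distributions(states, 4))  # grows linearly in the number of children
```

With these placeholder counts the converging structure already needs billions of parent configurations, whereas the diverging structure needs only a few dozen small distributions.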

Fig. 2(a). BN converging structure with ‘accident severity’ as a child node.
Fig. 2(b). BN diverging structure with ‘accident severity’ as the parent node.

c) Estimate the conditional probability distribution as follows:
$$P(S \mid R_{ST}, R_{HT}, \ldots, R_{AT}) = P(S \mid R_k) = \frac{P(S)\prod_{k=1}^{n} P(R_k \mid S)}{\prod_{k=1}^{n} P(R_k)} \qquad (2)$$
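As a minimal sketch of how a naïve Bayes posterior over the severity states can be evaluated, the function below multiplies the prior by the likelihood of each observed risk variable and then normalises over the states of S. Normalising by the sum over S, rather than by the product of the marginals, is an assumption made here so that the result is always a proper distribution; the function name and inputs are hypothetical.

```python
def naive_bayes_posterior(prior, likelihoods):
    """prior[j] = P(S = j); likelihoods[k][j] = P(R_k = observed state | S = j).
    Returns P(S = j | observations) under the naive Bayes assumption."""
    unnormalised = list(prior)
    for likelihood in likelihoods:
        unnormalised = [u * l for u, l in zip(unnormalised, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Hypothetical two-state example with two observed risk variables.
posterior = naive_bayes_posterior([0.5, 0.5], [[0.8, 0.2], [0.5, 0.5]])
print(posterior)
```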

4.2 ABN learning

The ABN consists of an NBN enriched by the relationships between the child nodes and the value of the target node (the common parent). The ABN modelling technique is implemented as follows.
a) Generate a Naïve Bayes structure with the target node (e.g., ‘accident severity’) directly connected to all other nodes ($R_k$).


b) Create different ABN structures by changing the structural coefficient (α). Set 0 < α ≤ 1 and let N’ = N/α, where N is the number of samples in the dataset. For each value of α, the links from the target node ($S$) to the children ($R_{ST}, R_{HT}, \ldots, R_{AT}$) are fixed, and relationships between the child nodes ($R_k$) are allowed. A greedy search among the children, aimed at minimising the minimum description length (MDL) score (Lam and Bacchus, 1993), is used to obtain the augmented part of the ABN for each value of α:
$$MDL(B, D) = \alpha\, DL(B) + DL(D \mid B) \qquad (3)$$
where $B$ represents the ABN, and $D$ represents the dataset given to ABN $B$.
c) Evaluate the structure/data ratios of the different ABN structures, where the structure/data ratio is $DL(B)/DL(D \mid B)$, $DL(B)$ is the description length of the ABN, and $DL(D \mid B)$ is the description length of the data given the ABN.
The structure/data ratio allows us to weigh the structural complexity of the network against its predictive performance. The lower the value of α is, the higher the value of $DL(B)/DL(D \mid B)$ will be. In other words, the ABN structure with α = 0.1 in Fig. 3(a) is more complex than the structure with α = 1 in Fig. 3(b), and the predictive precision for the target node of the ABN (α = 0.1) is higher than that of the ABN (α = 1). After comparing the structure/data ratios under different values of α, an ABN structure ($B$) is selected that offers a satisfactory trade-off between predictive performance and network complexity.
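The effect of the structural coefficient on the MDL score can be sketched as follows. The description lengths are hypothetical numbers chosen only for illustration, and the two candidate networks are not taken from the paper; the sketch simply shows why a smaller α favours a richer structure.

```python
def mdl_score(alpha, dl_structure, dl_data_given_structure):
    """Weighted MDL score: alpha * DL(B) + DL(D|B)."""
    return alpha * dl_structure + dl_data_given_structure

def structure_data_ratio(dl_structure, dl_data_given_structure):
    """Structure/data ratio DL(B) / DL(D|B)."""
    return dl_structure / dl_data_given_structure

# Hypothetical description lengths (in bits) for two candidate networks.
CANDIDATES = {
    "simple": {"dl_b": 100.0, "dl_d_given_b": 1000.0},
    "rich": {"dl_b": 400.0, "dl_d_given_b": 850.0},
}

def preferred(alpha):
    """Name of the candidate with the lowest weighted MDL score."""
    scores = {
        name: mdl_score(alpha, c["dl_b"], c["dl_d_given_b"])
        for name, c in CANDIDATES.items()
    }
    return min(scores, key=scores.get)

print(preferred(1.0))  # structure cost counted in full
print(preferred(0.1))  # structure cost discounted
```

In this toy setting the richer network fits the data better but costs more to describe, so it is preferred only once α discounts the structural term.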

Fig. 3(a). ABN structure with α = 0.1
Fig. 3(b). ABN structure with α = 1

d) Estimate the conditional probability distribution as follows:
$$P(S \mid R_{ST}, R_{HT}, \ldots, R_{AT}) = \frac{P(S)\prod_{k=1}^{n} P(R_k \mid \pi(R_k))}{\prod_{k=1}^{n} P(R_k)} \qquad (4)$$
where $\pi(R_k)$ denotes the parents of node $R_k$.

4.3 Model verification

The types of construction validity tests for BN models include nomological, face, content, concurrent and convergent validity, qualitative features and the sensitivity test (Pitchforth and Mengersen, 2013; Mazaheri, 2016; Sotiralis et al., 2016). In this paper, the NBN structure is fixed by nature, so its structure and parameters need not be checked. In contrast, model content verification is carried out during ABN structure learning: as described in Section 5.2, three domain experts were interviewed to verify the parameters and their relationships in the ABN model. The sensitivity test is described in detail in Sections 4.4 and 6.2.
In addition, this paper uses the receiver operating characteristic (ROC) curve to verify the model from a statistical point of view. The ROC curve is a plot of the true-positive rate (Y-axis) against the false-positive rate (X-axis), and the ROC index is the area under the ROC curve divided by the total area. We use the ROC index to evaluate the fit of the NBN and ABN models in this paper.
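A minimal sketch of how such an ROC index could be computed is given below. It assumes that the index is the usual area under the ROC curve, evaluated separately for each severity state in a one-vs-rest fashion from the model's posterior probabilities; the function names and the data layout are hypothetical and are not taken from the software used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = positive),
    computed as the share of positive/negative pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))

def one_vs_rest_auc(posteriors, true_states, state):
    """posteriors: one dict {state: probability} per case;
    true_states: the observed severity state of each case."""
    scores = [p[state] for p in posteriors]
    labels = [1 if t == state else 0 for t in true_states]
    return roc_auc(scores, labels)
```

The pairwise formulation is quadratic in the number of cases; it is used here for clarity rather than speed.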
4.4 Sensitivity analysis

Sensitivity analysis is a common method of uncertainty analysis used to quantify the uncertainties associated with relevant variables. It is useful for increasing our understanding of the relationships between input and output variables (Wu et al. 2015). Because the nodes in this study are the categorical variables listed in Table 3, mutual information (MI) is used to compute the strengths of the relationships between the target node (i.e., severity) and the influencing nodes (i.e., risk variables). The value of the target node is then computed under different states of those influencing nodes that share a large amount of mutual information with the target node.
One of the key advantages of mutual information $I(Y, X)$ is that it can be computed between categorical variables. It measures how much (on average) the observation of a random variable $y$ tells us about the uncertainty of $x$, i.e., by how much the entropy of $x$ is reduced if we have information on $y$. Mutual information is non-negative: the larger $I(Y, X)$ is, the stronger the association between $y$ and $x$; if $I(Y, X) \approx 0$, the association is weak, and $y$ and $x$ occur simultaneously only by chance. The mutual information between ‘accident severity’ and the other risk variables can be defined as
$$I(S, R_k) = \sum_{r_k} P(r_k) \sum_{s} P(s \mid r_k) \log \frac{P(s \mid r_k)}{P(s)} \qquad (5)$$
where $s$ represents each state of ‘accident severity’, $r_k$ represents each state of the risk variables, and $I(S, R_k)$ represents the mutual information shared between ‘accident severity’ and the waterway accident risk variables in this paper.
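As a minimal sketch, the mutual information between two categorical variables can be computed directly from their joint probability table. The logarithm base (2, giving bits) is an assumption, since the base is not stated in the text, and the function name is hypothetical.

```python
from math import log2

def mutual_information(joint):
    """joint[i][j] = P(R_k = i, S = j). Returns the mutual information in bits."""
    p_r = [sum(row) for row in joint]        # marginal of the risk variable
    p_s = [sum(col) for col in zip(*joint)]  # marginal of the severity
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * log2(p / (p_r[i] * p_s[j]))
    return mi

print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # independent variables
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # fully dependent variables
```

Independent variables give a value of zero, and the value grows as knowing one variable removes more of the uncertainty about the other.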


For a risk variable that has a strong relationship with ‘accident severity’, a sensitivity analysis to determine how the risk variable affects ‘accident severity’ is performed as follows.
The value of the target node (e.g., ‘accident severity’) is computed when one child node (e.g., a risk variable) is set to each of its states in turn while the states of the other child nodes are locked. In other words, for a specific $k$, where $R_k$ has a strong relationship with $S$, we set $R_k$ to a different state $i$, then compute the joint probability $P(S = j, R_k = i)$ and the mean value $E(R_k = i)$.
$$P(S = j, R_k = i) = P(S = j) \times P(R_k = i \mid S = j) \qquad (6)$$
$$E(R_k = i) = \sum_{j} j \times P(S = j, R_k = i) \qquad (7)$$
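The calculation in Eqs. (6) and (7) can be sketched as below. The probabilities reported later in Section 6.2.2 sum to one, which suggests that they are normalised over the severity states; the sketch therefore normalises the products of Eq. (6) before taking the mean, and this normalisation is an assumption rather than something stated in the text. Function names are hypothetical.

```python
def severity_profile(prior_s, likelihood_r_given_s):
    """prior_s[j] = P(S = j + 1); likelihood_r_given_s[j] = P(R_k = i | S = j + 1)
    for one fixed state i. Returns the normalised probabilities and their mean."""
    joint = [p * l for p, l in zip(prior_s, likelihood_r_given_s)]  # Eq. (6)
    total = sum(joint)
    normalised = [q / total for q in joint]  # assumption: normalise over j
    return normalised, mean_severity(normalised)

def mean_severity(probabilities):
    """Eq. (7): severity states are scored 1, 2, 3, ... in order."""
    return sum(j * p for j, p in enumerate(probabilities, start=1))

# Probabilities reported in Section 6.2.2 for 'accident type' = 'collision'.
print(round(mean_severity([0.1733, 0.1465, 0.2090, 0.4713]), 4))
```

Applied to the four probabilities reported for collisions, the mean reproduces the value of 2.9785 given in Section 6.2.2.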

5. BN structure learning

5.1 NBN structure learning

Assuming that all the child nodes are independent, we can construct an NBN as shown in Fig. 4.

Fig. 4. NBN structure

5.2 ABN structure learning

(1) Structural coefficient

We set the structural coefficient (α) equal to {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}. The relationship between α and the structure/data ratio is presented in Fig. 5; the X-axis represents α, and the Y-axis represents the structure/data ratio.

Fig. 5. Variation of the structure/data ratio for different values of the structural coefficient (α) (X-axis: structural coefficient, 0.1 to 1.0; Y-axis: structure/data ratio, 0 to 1).

The complexity of the structure increases with a decreasing structural coefficient. This complexity becomes problematic when it increases more rapidly than the predictive precision. Visual inspection suggests that there could be a trade-off (e.g., the sharp bend of the curve) between the predictive performance and the network complexity when α is equal to 0.3.
(2) Content validity
In addition, when learning the ABN structure, common sense, accident reports and expert knowledge should be used to ensure rational causal relationships between the BN nodes. Hänninen and Kujala (2014) used prior knowledge to forbid some arcs while using a hill-climbing algorithm to learn a BN structure for ship accidents. Following their work, when learning the ABN structure in this study, we use the following prior knowledge to rule out causal relationships that are not in harmony with the current understanding learnt from the literature.
• ‘Ship type’ should have no relationships with ‘Ship age’ and ‘Ship flag’; ‘Ship age’ should have no relationships with ‘Ship flag’, ‘Length’ and ‘Gross tonnage’; ‘Ship flag’ should have no relationships with ‘Length’ and ‘Gross tonnage’;
• Ship static characteristics (e.g., ‘Ship type’, ‘Hull type’, ‘Ship age’, ‘Ship flag’, ‘Length’, ‘Gross tonnage’) should have no relationships with accident environment states (e.g., ‘Rain’, ‘Fog’, ‘Visibility’, ‘Wind’), the ship loading state (e.g., ‘Loading’), the crew seaworthiness (e.g., ‘Crew’), and accident time and location (e.g., ‘Location’, ‘Season’, ‘Time of day’);
• ‘Ship defect’ or ‘Crew’ should have no relationships with ‘Rain’, ‘Fog’, ‘Visibility’, ‘Wind’, ‘Season’, ‘Time of day’, ‘Ship speed’ and ‘Navigational environment’.
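The prior-knowledge rules listed above can be encoded as a set of forbidden node pairs that a structure-learning search consults before adding an arc. The sketch below is illustrative only: it assumes that "no relationships" rules out an arc in either direction, and the helper names are hypothetical rather than part of any BN software.

```python
from itertools import product

STATIC = ["Ship type", "Hull type", "Ship age", "Ship flag", "Length", "Gross tonnage"]
CONTEXT = ["Rain", "Fog", "Visibility", "Wind", "Loading", "Crew",
           "Location", "Season", "Time of day"]
OPERATIONAL = ["Rain", "Fog", "Visibility", "Wind", "Season", "Time of day",
               "Ship speed", "Navigational environment"]

def pair(a, b):
    """Unordered pair, so that a forbidden link blocks both arc directions."""
    return frozenset((a, b))

forbidden = {
    # First rule: links among selected static ship characteristics.
    pair("Ship type", "Ship age"), pair("Ship type", "Ship flag"),
    pair("Ship age", "Ship flag"), pair("Ship age", "Length"),
    pair("Ship age", "Gross tonnage"), pair("Ship flag", "Length"),
    pair("Ship flag", "Gross tonnage"),
}
# Second rule: static characteristics vs. environment, loading, crew, time and place.
forbidden |= {pair(a, b) for a, b in product(STATIC, CONTEXT)}
# Third rule: 'Ship defect' and 'Crew' vs. weather, time, speed and traffic.
forbidden |= {pair(a, b) for a, b in product(["Ship defect", "Crew"], OPERATIONAL)}

def is_allowed(a, b):
    """True if a greedy search may consider an arc between nodes a and b."""
    return a != b and pair(a, b) not in forbidden
```

A search procedure would call `is_allowed` for each candidate arc among the child nodes and skip the forbidden ones; arcs found implausible after inspection can be added to the same set before relearning.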
Using the above prior knowledge to prohibit the relevant arcs, we obtain a BN structure from the data with the ABN algorithm (α = 0.3), as shown in Fig. 6(a). The arc orientation is the opposite of the causal relation, which keeps the computational complexity low (see Section 4.1). The arcs ‘Ship speed → Time of day’, ‘Loading → Navigational environment’ and ‘Length → Human factors’ are not consistent with the actual situation: the time of the accident does not influence ship speed, the navigational environment has little influence on the ship loading situation, and human factors have little influence on ship length.

Fig. 6(a). ABN model structure with initial forbidden arcs.
Fig. 6(b). ABN model structure with additional forbidden arcs ‘Ship speed → Time of day’, ‘Loading → Navigational environment’, ‘Length → Human factors’.

Therefore, we adjust the ABN structure by additionally forbidding the arcs ‘Ship speed → Time of day’, ‘Loading → Navigational environment’ and ‘Length → Human factors’. The revised structure is shown in Fig. 6(b). The arc ‘Accident type → Time of day’ in Fig. 6(b) is also not consistent with the actual situation, so we forbid this arc and adjust the ABN structure again; the updated structure is shown in Fig. 6(c).

Fig. 6(c). Final ABN structure with additional forbidden arcs ‘Accident type → Time of day’.

5.3 ROC curve

We use the ROC index to evaluate the fit of the different BN models. The results are shown in Table 4.

Table 4
Degrees of fit of the models (ROC index, %)

Model              S=1      S=2      S=3      S=4
NBN (Fig. 4)       84.48    77.06    77.82    79.07
ABN (Fig. 6(c))    93.06    90.01    86.39    88.45

The ROC value of the ABN model is higher for every severity state, indicating that this model yields a better result. Thus, we select the ABN model (Fig. 6(c)) for the subsequent data analysis.

6. Model results

6.1 Prior probability distribution of the BN model

Fig. 7 shows the initial prior probability distributions of the factors involved in the adjusted ABN.

Fig. 7. The prior probability distribution of the adjusted BN model


The classical statistical analysis of these data provides some initial findings, including the following.
In the shipping accidents occurring in the coastal and inland waters of China, collision was the type of accident with the highest probability: 60.27%. Dry bulk cargo vessels accounted for the largest percentage (i.e., 51.42%) of ships involved in accidents. Ships younger than 5 years were involved in the largest percentage (i.e., 44.63%) of accidents. The majority of vessels involved in accidents, 63.70%, were less than 100 m long. Gross tonnages in the range of 300-10000 accounted for 61.46% of the ships involved in accidents.
With respect to the safety of the ships, 22.70% of the vessels involved in accidents had deficiencies or failed to conduct safety inspections, 7.92% were overloaded, and 20.87% had an insufficient number of crew members or a crew member with an incomplete or invalid certificate.
In terms of navigational environment, rain was present in 21.69% of the accidents, fog in 19.96%, and poor visibility in 17.30%. In addition, 42.14% of the accidents occurred between November and the following March, 64.38% occurred at night time, and 74.74% occurred in waterways with shipping congestion and other poor navigational environments.

6.2 Sensitivity analysis (SA)

6.2.1 Most relevant variables of the target node

Fig. 8 shows the amount of mutual information shared between ‘accident severity’ and the other risk variables. The sizes of the nodes are proportional to the amount of mutual information shared with the target node given the available evidence.
Given that ‘accident severity’ is the target node, the variable ‘accident type’ has the strongest effect on the accident severity: the corresponding amount of mutual information is 0.1542. Four variables yield values of $I(S, R_k)$ exceeding 0.05 and thus have a significant effect on ‘accident severity’: ‘accident type’, ‘location’, ‘ship type’ and ‘ship age’. Additional variables that yield values of $I(S, R_k)$ greater than 0.02 but less than 0.05, i.e., ‘ship flag’, ‘ship speed’, ‘time (of day)’, ‘visibility’, ‘gross tonnage’, ‘environment’, ‘crew’, ‘season’ and ‘wind’, also have a significant effect on ‘accident severity’.
The variables ‘ship defects’, ‘loading’, ‘fog’, ‘hull type’, ‘human factors’, ‘rain’ and ‘length’ have a relatively weak effect on ‘accident severity’.

Fig. 8. Mutual information shared with the target node (the size of each node is proportional to its MI value)

6.2.2 Sensitivity analysis with respect to the target node

The variables ‘accident type’, ‘location’, ‘ship type’ and ‘ship age’ have MI values higher than 0.05 with ‘accident severity’; thus, we compute the effect of these four nodes on the target node.
In accordance with the ABN model described in Fig. 9(a), we assign ‘accident type’ a state of 1 (‘collision’). Consequently, the joint probabilities are $P(S = 1, R_{AT} = 1) = 0.1733$, $P(S = 2, R_{AT} = 1) = 0.1465$, $P(S = 3, R_{AT} = 1) = 0.2090$, and $P(S = 4, R_{AT} = 1) = 0.4713$; then $E(R_{AT} = 1) = 2.9785$. The joint probabilities and the mean value show that when the accident is a collision, the severity of the waterway accident tends to be critical.


Similarly, the mean values and joint probabilities are obtained and presented in Figs. 9(a) to 9(d): the X-axis represents the state of a risk variable, the left Y-axis represents the mean value of ‘accident severity’, and the right Y-axis represents the joint probabilities of the jth ‘accident severity’ state with the ith state of each risk variable (e.g., accident type, location, ship type, ship age).

Fig. 9(a). Mean values of ‘accident severity’ against different accident types, and the posterior probability of each state of ‘accident severity’ with respect to different accident types.

Fig. 9(a) shows that when the accident is a sinking, the mean value of ‘accident severity’ is the highest, at 3.237. When a grounding occurs, the probability of ‘minor accident’ is the highest, and the mean value of ‘accident severity’ is the lowest, which means that the waterway ‘accident severity’ tends to be minor.

Fig. 9(b). Mean values of ‘accident severity’ in different locations, and the posterior probability of each state of ‘accident severity’ in different locations.

Fig. 9(b) shows that when the accident occurs at a quay or a port channel, the mean value of accident severity is low. When the location is an anchorage or an inland or coastal waterway, the severity of the accident tends to be catastrophic.

Fig. 9(c). Mean values of ‘accident severity’ against different ship types, and the posterior probability of each state of ‘accident severity’ with respect to different ship types.

The severity of waterway accidents involving fishing vessels tends to be the highest, with a mean value of 3.456. When the ship is a passenger ship, a dry bulk cargo ship or a barge/tug, the mean values of ‘accident severity’ are 2.801, 3.007 and 3.310, respectively. If the ship is a tanker, container ship, or ro/ro vessel, the mean values of ‘accident severity’ are lower: 2.705, 2.657, and 2.501, respectively.

Fig. 9(d). Mean values of ‘accident severity’ against different ship ages, and the posterior probability of each state of ‘accident severity’ with respect to different ship ages.

With increasing ship age, the severity of accidents generally rises. However, ships aged 6-10 years tend to be safer than those aged 0-5 years, probably because a new ship goes through a run-in period. Ships aged 16-20 years tend to be slightly less safe than those older than 20 years.

6.3 Implications: Scenario analysis

The model enables analysis of the severity of waterway accidents based on various scenarios involving different natural and navigational environments and vessel managerial conditions. Two scenarios, focusing on the environment and on vessel management, are examined to demonstrate the possible research implications of the BN model.

6.3.1 Scenario one: Hypotheses of natural and navigational environment aspects

In scenario one, waterway risk under specific environmental conditions is estimated. Here, the environmental factors ‘season’, ‘wind’, ‘rain’, ‘fog’, ‘time of day (TD)’, and ‘navigational environment (NE)’ are chosen. The variables are assigned the following states: ‘season’ = ‘winter’, ‘wind’ = ‘greater than 7 on the Beaufort scale’, ‘rain’ = ‘rain’, ‘fog’ = ‘fog’, ‘time’ = ‘night time’, and ‘environment’ = ‘high traffic density or other poor navigational environment’. Assuming that one or more of the above natural and navigational environment situations occur, and considering that strong wind and heavy fog, or heavy rain and heavy fog, do not usually happen simultaneously, we obtain 40 combinations of different environmental conditions. If we hold the other variables in the ABN structure constant (i.e., lock the evidence), we can compute the percentage increase in the posterior probability of ‘catastrophic’ accident severity under each of the 40 combined conditions.
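The set of admissible combinations can be enumerated with a short sketch. It assumes that each of the six factors is either in its adverse state or not, and that fog is excluded whenever strong wind or rain is present; both readings are assumptions about how the combinations were formed.

```python
from itertools import product

FACTORS = ["season", "wind", "rain", "fog", "time of day", "navigational environment"]

def feasible(active):
    """Reject combinations where fog coincides with strong wind or with rain."""
    state = dict(zip(FACTORS, active))
    return not (state["fog"] and (state["wind"] or state["rain"]))

combos = [c for c in product([False, True], repeat=len(FACTORS)) if feasible(c)]
with_adverse_condition = [c for c in combos if any(c)]

print(len(combos))                  # counts the baseline with no adverse condition
print(len(with_adverse_condition))  # at least one adverse condition present
```

Under these assumptions the enumeration yields 40 combinations when the baseline with no adverse condition is counted and 39 otherwise, so the figure of 40 quoted above is reproduced only if the baseline is included.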


Any combination leading to an increase of over 30% is shown in Fig. 10. A remarkable increase in accident severity compared to the initial state is observed when the wind is strong. Rain by itself does not lead to serious accident severity, but when rain occurs with wind, the ship accident severity increases significantly. The highest marginal contribution to ‘AS = 4’ comes from ‘wind = greater than 7, time of day = night, navigational environment = poor’.
All stakeholders should therefore pay great attention when encountering poor navigational environments, especially severe weather conditions, given their significant effect on the accident severity.

Fig. 10. Posterior probabilities increasing by more than 30% for catastrophic accidents in scenario one (X-axis: combinations of environment conditions across Season, Wind, Rain, Fog, TD and NE; Y-axis: increased percent of the posterior probability of catastrophic accident).

6.3.2 Scenario two: Hypotheses of vessel managerial aspects

In scenario two, the critical safety characteristics of vessels are identified to demonstrate how better monitoring and management can be undertaken to reduce risks. The variables representing aspects of vessel management in this scenario are assigned the following states: ‘crew’ = ‘sufficiently staffed with valid certificates’, ‘ship defects’ = ‘no defects’ and ‘ship speed’ = ‘smaller than 5 knots’. The percentage decreases in the mean value of the node ‘accident severity’ in scenario two are shown in Fig. 11. When the crew is sufficiently staffed, the ship has no defects, and the ship speed is slow, the mean value of accident severity decreases by 10% compared to the initial state. These values indicate a significant decrease in risk. The results indicate that the reduction of hidden managerial dangers can significantly reduce the severity of the accidents.

Fig. 11. Decreased percent of mean values of ‘accident severity’ in scenario two arranged in decreasing order (X-axis: combinations of managerial conditions across Crew, Ship defects and Ship speed; Y-axis: decreased percent of the mean value of ‘AS’).

7. Conclusions
In this study, we extracted useful data from maritime accident investigation reports for risk
analysis using the GT method and BNs. We analysed the reports of waterway accidents held by
MSA in the past 30 years and then identified and analysed the causal factors influencing waterway
transport accidents. A novel BN model was constructed to analyse waterway risks using ABN
modelling.
Based on the mutual information contained in the ABN model, the risk variables are grouped and ranked according to their degrees of closeness to the node of accident severity in the following order: Group I (i.e. mutual information higher than 0.05) includes accident type, location, ship type and ship age; Group II (i.e. mutual information greater than 0.02 but less than 0.05) includes ship flag (registry), ship speed, time of day, visibility, gross tonnage, environment, crew, season and wind; and Group III (i.e. mutual information less than 0.02) includes ship defects, loading, fog, hull type, human factors, rain and length.
From the analysis, useful insights are obtained as follows:
(i) When the type of accident is a sinking, the severity of the accident is the highest, and when the type is a grounding, the accident severity is the lowest.
(ii) When the accident occurs at a quay, the risk of serious consequences is the lowest. When the location is an inland or coastal waterway, the average severity is the highest, and the severity of the waterway accident tends to be catastrophic.
(iii) When the ship is a fishing vessel, the severity of the accident is the highest among all vessel types.
(iv) With increasing ship age, the accident severity generally increases.

The ABN model and the scenario analysis help investigate whether oceanographic conditions influence risk and if these effects change over time. The relevant findings will provide useful guidance to stakeholders, including ship operators, who can take better safety control options (with respect to the most influential risk factors) to eliminate or reduce accident consequences, and policymakers (e.g. classification societies), who can set new safety standards (e.g. design for safety with respect to the most influential ship-related risk factors).

The analysis of two scenarios reveals that navigational environments and ship management have significant effects on accident severity. Heavy winds clearly lead to serious accident severity. At night, heavy wind combined with a poor navigational environment causes a significant increase in the probability of a catastrophic accident. When the crew is sufficiently staffed, the ship has no defects, and the ship speed is slow, the mean value of accident severity decreases by 10%. Such results suggest appropriate ways of developing countermeasures for accident prevention.
Despite the above contributions and findings, the paper has some limitations, the most significant of which are:
1) The completeness of the data mined from the text of the accident reports is arguable. More sources should be used to compensate for the missing data.
2) In BN modelling, we used the Expectation Maximization (EM) algorithm to process missing data. This method is time-consuming. In future, advanced methods could be developed to improve its efficiency.
3) The study focuses more on objective variables and pays little attention to human factors. Data relating to human factors and their impact on maritime accident severity should be derived from the accident reports to address this concern in future.

Acknowledgements

This research is sponsored by the Shanghai Pujiang Program (Grant No. 5PJC060), the National Science Foundation of China (Grant nos. 71573172 and 71402093), and the EU H2020 MC RISE programme (Grant No. GOLF-777742).

References
Akhtar, M. J., & Utne, I. B. (2014). Human fatigue’s effect on the risk of maritime groundings – A Bayesian Network modeling approach. Safety Science, 62, 427–440.
Akyuz, E., & Celik, M. (2014). A hybrid decision-making approach to measure effectiveness of
safety management system implementations on-board ships. Safety Science, 68, 169–179.
Aydogdu, Y. V. (2014). A comparison of maritime risk perception and accident statistics in the
Istanbul Straight. The Journal of Navigation 67(1): 129-144.
Akyuz, E. (2015). A hybrid accident analysis method to assess potential navigational contingencies: The case of ship grounding. Safety Science, 79, 268-276.


Antao, P. & Soares, G. (2006). Fault-tree models of accident scenarios of RoPax vessels.
International Journal of Automation and Computing, 3(2), 107-116.
Antao, P. & Soares, G. (2008). Causal factors in accidents of high speed craft and conventional
ocean going vessels. Reliability Engineering & System Safety, 93(9), 1292-1304.
Antao, P., Grande, O., Trucco, P., & Soares, G. (2009). Analysis of maritime accident data with BBN models. European Safety and Reliability Annual Conference. September 7-10, Prague, Czech Republic.
Balmat, J.-F., Lafont, F., Maifret, R., & Pessel, N. (2009). Maritime risk assessment (MARISA), a fuzzy approach to define an individual ship risk factor. Ocean Engineering, 36, 1278–1286.
Balmat, J.-F., Lafont, F., Maifret, R., & Pessel, N. (2011). A decision-making system to maritime risk assessment. Ocean Engineering, 38(1), 171–176.


Bhattacharya, Syamantak. (2012). The effectiveness of the ISM Code: A qualitative enquiry.
Marine Policy 36(2), 528-535.
Bouejla, A., Chaze, X., Guarnieri, F., & Napoli, A. (2014). A Bayesian network to manage risks
of maritime piracy against offshore oil fields. Safety Science, 68, 222–230.
Cariou, P., Mejia, M. Q., & Wolff, F.-C. (2008). On the effectiveness of port state control
inspections. Transportation Research Part E: Logistics and Transportation Review, 44(3),
491–503.
Chauvin, C., Lardjane, S., Morel, G., Clostermann, J.-P., & Langard, B. (2013). Human and organisational factors in maritime accidents: Analysis of collisions at sea using the HFACS. Accident Analysis & Prevention, 59, 26–37.


Chen, S.-T., Wall, A., Davies, P., Yang, Z., Wang, J., & Chou, Y.-H. (2013). A Human and
Organisational Factors (HOFs) analysis method for marine casualties using
HFACS-Maritime Accidents (HFACS-MA). Ocean Engineering, 60, 105–114.
Fabiano, B., Currò, F., Reverberi, A. P., & Pastorino, R. (2010). Port safety and the container
revolution: A statistical study on human factor and occupational accidents over the long
period. Safety Science, 48(8), 980–990.
Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29: 131-161.
Goerlandt, F., & Montewka, J. (2015). A framework for risk analysis of maritime transportation systems: a case study for oil spill from tankers in a ship–ship collision. Safety Science, 76, 42-66.
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory. New York: Aldine.

Glaser, B.G. (1992). Emergence vs Forcing: Basics of Grounded Theory Analysis. Sociology
Press, Mill Valley, CA
Hänninen, M., & Kujala, P. (2012). Influences of variables on ship collision probability in a
Bayesian belief network model. Reliability Engineering & System Safety, 102, 27–40.

Hänninen, M., & Kujala, P. (2014). Bayesian network modeling of Port State Control inspection findings and ship accident involvement. Expert Systems with Applications, 41(4), 1632–1646.
Hansen, H. L., Jepsen, J. R., & Hermansen, K. (2012). Factors influencing survival in case of
shipwreck and other maritime disasters in the Danish merchant fleet since 1970. Safety
Science, 50(7), 1589–1593.
Heij, C., & Knapp, S. (2012). Evaluation of safety and environmental risk at individual ship and
company level. Transportation Research Part D: Transport and Environment, 17(3),
228–236.
Heij, C., Knapp, S., Henderson, R., & Kleverlaan, E. (2013). Ship incident risk around the
heritage areas of Tubbataha and Banc d‘Arguin. Transportation Research Part D:
Transport and Environment, 25, 77–83.
Knapp, S., Bijwaard, G., & Heij, C. (2011). Estimated incident cost savings in shipping due to
inspections. Accident Analysis & Prevention, 43(4), 1532–1539.
Knapp, S., Kumar, S., Sakurada, Y., & Shen, J. (2011). Econometric analysis of the changing
effects in wind strength and significant wave height on the probability of casualty in
shipping. Accident Analysis & Prevention, 43(3), 1252–1266.
Kum, S., & Sahin, B. (2015). A root cause analysis for Arctic Marine accidents from 1993 to 2011.
Safety Science, 74, 206–220.
Li, K. X., Yin, J., & Fan, L. (2014). Ship safety index. Transportation Research Part A: Policy
and Practice, 66, 75–87.
Lam, W., & Bacchus, F. (1993). Using causal information and local measures to learn Bayesian
networks. Uncertainty in Artificial Intelligence, 243-250.
Ma, F., Chen, Y., Yan, X., Chu, X., & Wang, J. (2016). A novel marine radar targets extraction
approach based on sequential images and Bayesian Network. Ocean Engineering, 120,
64–77.
Mazaheri, A., Montewka, J., & Kujala, P. (2014). Modeling the risk of ship grounding—a literature review from a risk management perspective. WMU Journal of Maritime Affairs, 13(2), 269–297.
Mazaheri, A., Montewka, J., Kujala, P. (2016). Towards an evidence-based probabilistic risk
model for ship-grounding accidents. Safety Science, 86, 195-210.
Mullai, A., & Paulsson, U. (2011). A grounded theory model for analysis of marine accidents.
Accident Analysis & Prevention, 43(4), 1590–1603.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo,
CA.

Graziano, A., Teixeira, A., & Soares, G. (2016). Classification of human errors in grounding and
collision accidents using the TRACEr taxonomy. Safety Science, 86, 245–257.
Prabhu Gaonkar, R. S., Xie, M., & Fu, X. (2013). Reliability estimation of maritime transportation:
A study of two fuzzy reliability models. Ocean Engineering, 72, 1–10.
Pristrom, S., Yang, Z., Wang, J., & Yan, X. (2016). A novel flexible model for piracy and robbery
assessment of merchant ship operations. Reliability Engineering & System Safety, 155,
196–211.

Ronza, A., Félez, S., Darbra, R. M., Carol, S., Vílchez, J. A., & Casal, J. (2003). Predicting the
frequency of accidents in port areas by developing event trees from historical analysis.
Journal of Loss Prevention in the Process Industries, 16(6), 551–560.
Soares, G., & Teixeira, A. (2001). Risk assessment in maritime transportation. Reliability
Engineering & System Safety, 74, 299–309.
Sotiralis, P., Ventikos, N., Hamann, R., Golyshev, P., & Teixeira, A. (2016). Incorporation of
human factors into ship collision risk models focusing on human centred design aspects.
Reliability Engineering & System Safety, 156, 210–227.
Sun, X., Yan, X., Wu, B., & Song, X. (2013). Analysis of the operational energy efficiency for
inland river ships. Transportation Research Part D: Transport and Environment, 22, 34–39.
Talley, W. K., Yip, T. L., & Jin, D. (2012). Determinants of vessel-accident bunker spills.
Transportation Research Part D: Transport and Environment, 17(8), 605–609.
Trucco, P., Cagno, E., Ruggeri, F., & Grande, O. (2008). A Bayesian Belief Network modelling of
organisational factors in risk analysis: A case study in maritime transportation. Reliability
Engineering & System Safety, 93(6), 845–856.
Tzannatos, E., & Kokotos, D. (2009). Analysis of accidents in Greek shipping during the pre- and
post-ISM period. Marine Policy, 33(4), 679–684.
Weng, J., & Yang, D. (2015). Investigation of shipping accident injury severity and mortality.
Accident Analysis & Prevention, 76, 92–101.
Wu, B., Wang, Y., Zhang, J., Savan, E. E., & Yan, X. (2015). Effectiveness of maritime safety
control in different navigation zones using a spatial sequential DEA model: Yangtze
River case. Accident Analysis & Prevention, 81, 232–242.
Wu, B., Yan, X., Wang, Y., & Guedes Soares, C. (2017). An evidential reasoning-based CREAM
to human reliability analysis in maritime accident process. Risk Analysis, 37(10), 1936–1957.
Xi, Y. T., Yang, Z., Fang, Q. G., Chen, W. J., & Wang, J. (2017). A modified CREAM for human error
probability quantification. Ocean Engineering, 138, 45–54.
Yang, Z., Wang, J., & Li, K. (2013a). Maritime safety analysis in retrospect. Maritime Policy and
Management, 40, 261–277.
Yang, Z., Bonsall, S., Wall, A., Wang, J., & Usman, M. (2013). A modified CREAM to human
reliability quantification in marine engineering. Ocean Engineering, 58, 293–303.
Yang, Z., Yang, Z., & Yin, J. (2018). Realising advanced risk-based Port State Control
inspection using data-driven Bayesian networks. Transportation Research Part A:
Policy and Practice, 110, 38–56.
Zhang, J., Teixeira, A., Soares, G., Yan, X., & Liu, K. (2016). Maritime transportation risk
assessment of Tianjin Port with Bayesian Belief Networks. Risk Analysis, 36(6), 1171–1187.
Zhang, D., Yan, X., Yang, Z., Wall, A., & Wang, J. (2013). Incorporation of formal safety
assessment and Bayesian network in navigational risk estimation of the Yangtze River.
Reliability Engineering & System Safety, 118, 93–105.
Zhang, D., Yan, X., Yang, Z., & Wang, J. (2014). An accident data-based approach for congestion
risk assessment of inland waterways: A Yangtze River case. Proceedings of the Institution of
Mechanical Engineers, Part O: Journal of Risk and Reliability, 228(2), 176–188.
