
Measurement 188 (2022) 110569

Contents lists available at ScienceDirect

Measurement

journal homepage: www.elsevier.com/locate/measurement

Defect detection in welding radiographic images based on semantic segmentation methods

H. Xu a,b,c, Z.H. Yan a,b,*, B.W. Ji a,b, P.F. Huang a,b, J.P. Cheng a,b, X.D. Wu a,b

a Faculty of Materials and Manufacturing, Beijing University of Technology, Beijing 100124, China
b Engineering Research Center of Advanced Manufacturing Technology for Automotive Structural Parts, Ministry of Education, Beijing University of Technology, Beijing 100124, China
c Beijing Laboratory of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

ARTICLE INFO

Keywords:
Non-destructive testing
Radiographic images
Welding defects
Deep learning
Semantic segmentation

ABSTRACT

In order to remove the limitations of human interpretation, many computer-aided algorithms have been developed to automatically detect defects in radiographic images. Compared with traditional detection algorithms, deep learning algorithms have the advantages of strong generalization ability and automatic feature extraction, and have been applied in welding defect detection. However, these algorithms still need further research in the acquisition and cleaning of welding radiographic image data, the selection and optimization of deep neural networks, and the generalization and interpretation of network models. Therefore, this paper proposes an automatic welding defect detection system based on semantic segmentation methods. Firstly, a dataset of radiographic images of welding defects, called RIWD, is set up, and the corresponding data preprocessing and annotation methods are designed for training and evaluating the algorithm. Secondly, an end-to-end defect detection algorithm based on the FPN-ResNet-34 semantic segmentation network is implemented, and the network architecture is experimentally demonstrated to be suitable for defect feature extraction and fusion. Thirdly, to improve the detection performance of the algorithm, an optimization strategy for the network is designed according to the data characteristics of defects, which includes data augmentation based on combined image transformations and class balancing using a hybrid loss function with dice loss and focal loss. Finally, to ensure the reliability of the algorithm, its generalization ability is tested using external validation, and the defect features learned by the network are visualized by a post-interpretation technique. The experimental results show that our method can correctly discriminate defect types and accurately describe defect boundaries, achieving 0.90 mPA, 0.86 mR, 0.77 mF1 and 0.73 mIoU, and can be applied to automatically interpret radiographic images.

1. Introduction

Welding is a widely used joining process in the manufacture of various metal structural parts, such as automobiles, ships, aircraft, pressure vessels and pipelines. In the welding process, due to the complex conditions, it is inevitable that various types of welding defects will be produced, which seriously damage the quality of welding. Therefore, it is necessary to check the quality of welded joints using non-destructive testing (NDT) techniques. The radiographic test (RT), used to inspect the internal defects of the weld, is the critical NDT technique for welding. An X-ray or gamma source produces the weld radiographic image by penetrating the weld structure and exposing the photographic film. A weld defect appears as a change in intensity in the radiographic image. These films must be inspected by certified inspectors to assess and interpret the quality of the weld, called human interpretation. However, human interpretation suffers from many drawbacks, such as lack of objectivity and consistency, low detection efficiency, and the possibility of missing small defects, which has motivated the development of computer-aided defect detection systems based on radiographic images [1,2].

A considerable amount of literature has been published on automatic welding defect detection algorithms. Traditionally, defects segmented using image processing algorithms are characterized by human-defined feature vectors and recognized by machine learning algorithms.

* Corresponding author at: Faculty of Materials and Manufacturing, Beijing University of Technology, Beijing 100124, China.
E-mail addresses: xuhao@emails.bjut.edu.cn (H. Xu), zhihongyan@163.com (Z.H. Yan), miyayuki@emails.bjut.edu.cn (B.W. Ji), huangpf@bjut.edu.cn (P.F. Huang), chengjianpeng@emails.bjut.edu.cn (J.P. Cheng), wxd184@163.com (X.D. Wu).

https://doi.org/10.1016/j.measurement.2021.110569
Received 22 September 2021; Received in revised form 1 December 2021; Accepted 3 December 2021
Available online 8 December 2021
0263-2241/© 2021 Elsevier Ltd. All rights reserved.

Fig. 1. Different levels of defect detection tasks. (a): Classification. (b): Object detection. (c): Segmentation. (d): Semantic segmentation.

The steps of image processing can usually be summarized as: (1) Image preprocessing: improve the quality of radiographic images by noise filtering [3-5] and contrast enhancement [6,7]. (2) Weld zone extraction: the weld zone is usually reserved using the threshold method [8,9], which can eliminate the interference of non-weld zones on the film and improve detection efficiency. (3) Defect segmentation: this is the key step, and many algorithms have been applied to extract defect regions, such as thresholding [10], background subtraction [11], edge detection [12], morphological processing [13], etc. These customized algorithms can achieve good detection results under the corresponding conditions, but their versatility and robustness are limited, because the diversity of weld zones, the uncertainty of radiographic inspection conditions, and the complexity of weld defects all produce different visual characteristics in the images. In our previous work, a multi-scale multi-intensity (MSMI) defect detection algorithm was proposed; analyzing the radiographic images in an MSMI parameter space ensures the robustness and versatility of this algorithm [14].

After that, based on human definition, various types of feature vectors were proposed to characterize defects, including shape geometric features [15], texture features [16], Mel-frequency cepstral coefficients and polynomial features [17], generic Fourier features [18], and so on. Yang et al. [19] proposed features based on the intensity contrast between a weld defect and its background. In our previous work, 3D depth image features and 2D multi-angle illumination gray image features were extracted and combined to detect weld defects [20]. Based on these features, many machine learning algorithms have been applied to recognize defects, classified into two types: (1) supervised algorithms: ANN [21], SVM [22], MLP [23], and AdaBoost [24]; (2) unsupervised algorithms: multivariate generalized Gaussian mixture model [15], dictionary learning [25], ant colony optimization, and k-nearest neighbor classifier [1]. A large number of experiments have demonstrated that these machine learning algorithms have good detection performance. However, machine learning algorithms rely on accurate image segmentation results and suitable hand-crafted feature vectors. In the complex welding environment and inspection conditions, designing and selecting valid image features for the classification model is still a challenging task that requires the prior knowledge of professionals.

Therefore, traditional defect detection algorithms have limitations: (1) poor adaptability to images with different visual characteristics; (2) the requirement for hand-crafted defect feature vectors; (3) step-by-step operations that reduce detection efficiency. These limit the application of traditional methods to realistic detection.

Fortunately, in recent years, with the improvement of big data and computing power, deep learning algorithms have developed rapidly, providing new solutions for computer-aided defect detection. Deep neural networks (DNN) can directly process images and output the results to achieve end-to-end detection. Moreover, a DNN model can be trained to discover and extract high-level features automatically, without the need for hand-crafted features. In addition, compared with shallow machine learning models, DNN models are more generalizable and transferable.

Nowadays, researchers have introduced a number of DNN models into defect detection [26,27], especially convolutional neural networks (CNN) [28-30] for processing images, such as ResNet [31], SqueezeNet [32], R-FCN [33], etc. Wang et al. [34] used a pre-trained RetinaNet to output the type and location of defects. Oh et al. [35] detected defect objects with Faster R-CNN. Yang et al. [36] proposed an automatic welding defect localization method based on an improved U-net. Guo et al. [37] used the Xception model as the base learner to build the target network for defect detection.

Furthermore, many improvements have been proposed to obtain better detection performance, mainly reflected in two aspects:

(1). Data: Training networks with excellent performance requires a large amount of balanced labeled data, which is difficult to obtain. Therefore, some data optimization strategies have been proposed, such as combining image processing and generative adversarial networks (GAN) to generate training samples [38] and balancing samples using resampling methods [39,40]. Le et al. [41] proposed a learning-based approach built on small image datasets; with the help of Wasserstein GANs, feature-extraction-based transfer learning techniques and a multi-model ensemble framework, this approach was able to deal successfully with imbalanced and severely rare defect images. Dong et al. [42] developed unsupervised local deep feature learning to address the problems of insufficient data and lack of detailed annotations. Guo et al. [37] proposed a contrast enhancement conditional GAN, which solved the data imbalance problem well and also addressed the impact of low-contrast defect images on detection.

(2). Algorithm: Design and improve the detection algorithm according to the features of defects. To address the complexity and similarity of defects, Dong et al. [43] proposed a pyramid feature fusion and global context attention network. Du et al. [44] improved the detection accuracy of Faster R-CNN using FPN and RoIAlign. The method proposed by Jiang et al. [45] includes an improved pooling strategy and an enhanced feature selection method. Gong et al. [46] proposed a deep transfer learning model to extract defect features.


Fig. 2. The framework of the defect detection system. The key steps are marked in red. (Recovered flowchart content: radiographic film digitization, image preprocessing, image annotation and learning samples; ResNet-34 backbone, feature extraction and fusion, FPN architecture; data augmentation, hybrid loss function and class balance; external validation, generalization ability, interpretation technique and feature visualization; detection results: defect category and defect boundary.)

Fig. 3. Some example images (after extracting the weld zone) of the dataset. (a): R0001. (b): R0002. (c): W0004.

Although DNNs have shown some advantages in welding defect detection tasks, the following problems still exist in their application:

(1). The levels of defect detection tasks: Algorithms detect defects at different levels; as shown in Fig. 1, from top to bottom, the algorithms provide finer-grained detection and richer defect information. However, limited by the annotation granularity of the data [47], a majority of studies have only implemented defect classification [39,41,45,46,48-50] (Fig. 1 (a)) or object detection [29,35,44] (Fig. 1 (b)). A few studies have implemented defect segmentation, but have not recognized the types of defects [36,42] (Fig. 1 (c) [51]). The information provided by these tasks is not sufficient for radiographic film interpretation, because welding quality rating criteria specify the type, number, spacing and size of acceptable defects, which requires the detection algorithm not only to recognize defect types but also to describe defect boundaries, i.e., the semantic segmentation task of defects (Fig. 1 (d)).

(2). Generalization and interpretation of detection algorithms: Most research has only verified the effectiveness of algorithms in a specific project rather than in different scenarios, which ignores the advantage of the transferability of deep learning algorithms. In addition, DNN models are able to learn features automatically, but this hidden learning process also reduces the interpretability of the models, raising concerns about the reliability of applying black-box models in the field of quality inspection.

Therefore, to achieve high-level defect detection tasks and to solve the problems of difficult labeling and imbalance of data, as well as the poor interpretation and unverified generalization of algorithms, this paper proposes an automatic welding defect detection system for radiographic images based on semantic segmentation, as shown in Fig. 2. The main contributions are as follows:

(1). Based on the collected data, a dataset of radiographic images of welding defects, called RIWD, is constructed as the basis for the algorithm study. Moreover, the corresponding data processing method is designed, including image preprocessing, annotation, and sample set setting, for training and evaluation of the semantic segmentation network.

(2). An end-to-end welding defect detection algorithm based on the FPN-ResNet-34 semantic segmentation network is implemented. The model can output complete information, including the defect category, boundary and location, by automatically learning and aggregating the semantic features of the defect, which can be directly used for rating the weld quality.

(3). The experiments verified that the architecture of the FPN-ResNet-34 semantic segmentation network is suitable for defect feature extraction and fusion. Furthermore, according to the data characteristics of defects, an optimization strategy for the network is designed, including data augmentation based on image transformations and class balancing based on the loss function, which effectively improves the detection performance of the network.

(4). The algorithm is tested on an external validation set to verify its generalization ability to different scenarios. In addition, the reliability of the algorithm is analyzed by visualizing the features learned by the network.


Fig. 4. Comparison of cropping strategies. (a): Resize and crop the image. (b): Keep the original resolution of the image.

Fig. 5. Image annotation process. (a): The complex defect boundaries in the image are difficult to be manually labeled. (b): The defect regions are automatically
labeled by the MSMI algorithm. (c): The types of defect regions are manually labeled.

Table 1
The statistical information about defects in the sample set.

Types of defects | Number of defects | Pixel number of defects (kpixel) | Pixel ratio of defects (%) | Training set (kpixel) | Valid set (kpixel) | Test set (kpixel)
CR | 623 | 1689.72 | 0.49 | 1351.78 | 236.56 | 101.38
PO | 3707 | 2460.78 | 0.72 | 2042.45 | 123.04 | 295.29
SL | 897 | 777.03 | 0.23 | 696.02 | 16.31 | 85.47
LPF | 625 | 1444.64 | 0.42 | 1227.94 | 130.02 | 75.12

Fig. 6. The framework of data acquisition and processing. (Recovered flowchart content: film digitization (scan/photograph); weld bead extraction, depth rescale and image cropping; defect region and defect category annotation; radiographic patch image samples with ground truth labels; RIWD (R0001, R0002) and GDXray (W0004) feed the training, validation and test sets, with R0002 reserved as the external validation set.)


Fig. 7. FPN-based semantic segmentation network architecture.

The remaining part of this paper is organized as follows: we present the data-related work in Section 2. After that, Section 3 briefly describes network architectures and optimization methods. Section 4 designs experiments to validate our method, and analyzes and discusses the experimental results in detail. Finally, we draw conclusions in Section 5.

2. Data acquisition and processing

In the field of weld RT, the lack of defect data has restricted the development of algorithms. In this section, we acquire weld radiographic images, construct the dataset RIWD, and preprocess and annotate the images as learning samples for DNNs.

2.1. Datasets

2.1.1. RIWD
Based on digitized weld radiographic films, we constructed a dataset of radiographic images of welding defects (RIWD), which consists of two subsets:

(1). R0001: We photographed 57 illustrative plates of typical welding defects, as shown in Fig. 3 (a), on which the defects are obvious and numerous, suitable for training the algorithm to learn distinguishing defect features.
(2). R0002: We scanned 59 weld radiographic films from realistic tests, as shown in Fig. 3 (b), on which the defects are rare, small and unobvious; this subset can be used to verify the algorithm's performance in realistic detection.

2.1.2. GDXray
GDXray [51] is a public dataset that includes a group of X-ray welding images (Welds). The group Welds contains 88 images arranged in 3 series and has been used to evaluate the performance of detection algorithms [36,39,49]. We selected 38 images from the group Welds, called Series W0004 (Fig. 3 (c)), and extended this subset with multi-class ground truth annotations to evaluate the performance of the semantic segmentation algorithm.

2.2. Image preprocessing

The images in our dataset are preprocessed as follows:

(1). Image bit depth rescaling: The original 12-bit data depth is rescaled to 8 bits, using the linear color look-up table method to ensure that all necessary defect information remains in the 8-bit image.
(2). Weld zone extraction: The weld zone is adaptively segmented using the maximum between-cluster variance method, which prevents other zones from interfering with the detection.
(3). Image cropping: The original radiographic image is normally a high-resolution long strip, which needs to be cropped into low-resolution small patches to fit the model input and reduce computational resources. We propose a cropping strategy for the welds that resizes the image to the target resolution and then crops small patches in a tiled pattern. In contrast, many studies use a cropping strategy that keeps the original resolution [39,41,49]. Fig. 4 compares these two strategies, where both original images are converted to 224 × 224 resolution patches. Compared with the strategy that keeps the original resolution, our strategy obtains a smaller number of patches but retains more weld context information and provides a larger receptive field, which is helpful for training the model.

2.3. Image annotation

Training the semantic segmentation network requires pixel-level ground truth annotations, while manually annotating complex defect regions (Fig. 5 (a)) is difficult and expensive. So, we annotate the images as follows:

(1). Defect region annotation: The MSMI algorithm we proposed can extract defect regions adaptively [14], so this algorithm is applied to the cropped images and accurately describes defect boundaries, as shown in Fig. 5 (b), which greatly simplifies the annotation operation. We also manually exclude defect regions that are incorrectly segmented by this algorithm.
(2). Defect type annotation: According to ISO 6520 and ISO 5817, we annotate these regions as four common defect types: crack (CR), porosity (PO), slag inclusion (SL), and lack of penetration or lack of fusion (LPF), as shown in Fig. 5 (c).

Using the above method, we generated accurate ground truth labels at low cost.
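As a concrete illustration of the preprocessing in Section 2.2 (steps (1) and (3)), the following minimal Python sketch rescales a 12-bit image to 8 bits with a linear look-up and tiles a resized weld strip into 224 × 224 patches. The paper does not publish its implementation, so the function names and the resize-height-then-tile interpretation here are illustrative assumptions.

    import cv2
    import numpy as np

    def rescale_12bit_to_8bit(img12):
        # Linear look-up table: map the 12-bit range [0, 4095] onto [0, 255].
        img = np.clip(img12, 0, 4095).astype(np.float32)
        return (img * (255.0 / 4095.0)).astype(np.uint8)

    def crop_tiles(weld_img, tile=224):
        # Resize the long-strip weld image so its height equals the tile size,
        # then cut non-overlapping tiles along the width (the tiled pattern
        # described in step (3)).
        h, w = weld_img.shape[:2]
        new_w = max(tile, int(round(w * tile / h)))
        resized = cv2.resize(weld_img, (new_w, tile), interpolation=cv2.INTER_AREA)
        return [resized[:, x:x + tile] for x in range(0, new_w - tile + 1, tile)]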


Fig. 8. Other semantic segmentation network architectures. (a): U-net. (b): PSPNet. (c): Linknet.

2.4. Sample set

We use the preprocessed and annotated images as learning samples for the algorithm. Among the three subsets, the images in R0001 and W0004 have more similar features, so we combine them as the sample set, which ensures the diversity of sample sources. The data distribution of R0002 is more varied, so this subset is used separately to construct an external validation set for evaluating the generalization ability of the algorithm.

The sample set contains 95 radiographic images (average height of 831.2 pixels and average width of 4333.9 pixels), cropped into 740 patches of 224 × 224 resolution, which contain about 6000 defects of various types. These 740 sample images are manually divided, of which 638 are used as training samples, 51 as validation samples, and 51 as test samples, ensuring that they are independent of each other and that each validation or test sample contains defects.

Table 1 presents the statistical information of defects in the sample set. As seen, besides the image features of defects (e.g., low contrast, weak texture, etc.), the data characteristics of defects also bring challenges for detection:


(1). The data size is relatively small: Although we have expanded the data size as much as possible, it is still small for training DNNs. However, this small number of images contains a large number of defect objects (approximately 6000), reflecting the high annotation complexity of our RIWD dataset.
(2). Defect objects are too small: All types of defects together occupy less than 2% of the pixels in the images. It is a challenge to detect such small defects against the weld background, which can also be regarded as an extremely imbalanced problem between the foreground and background categories.
(3). Category imbalance of defects: Both the number and the pixel count of defects prove this, which may cause poor prediction of defect categories with few samples.

The above challenges need to be taken into account when designing our algorithm. In summary, the data-related framework of this paper is shown in Fig. 6.

Table 2
Combination of image transformations.

Types of transformations | Methods of transformations | Probabilities
Geometric transformation | Horizontal flip | 0.5
Noise addition | Gaussian noise | 0.2
Gray-level transformation | CLAHE; Random brightness; Random gamma | 0.9
Grayscale transformation | Random contrast; Random HSV | 0.9
Fuzzy mapping | Sharpening; Random blur; Motion blur | 0.9

Table 3
Weights setting.

Types | CR | PO | SL | LPF | Background
Weights λ_Dice | 2 | 1 | 3 | 2 | 0.5

Table 4
Hyperparameters configuration.

Hyperparameters | Values
Classification categories | 5
Batch size | 8
Learning rate | 0.0001
Training iterations | 30
Activation function | Softmax
Gradient optimum algorithm | Adam
Backbone | ResNet-34
Initialization parameters | Pre-training weights on ImageNet
Loss function | Cross-entropy
Data augmentation | No

Table 5
Comparisons of performances and efficiencies of semantic segmentation networks.

Models | mIoU | mF1 | mPA | mR | Training time (s) | Inference time (s)
FPN | 0.68 ± 0.00 | 0.68 ± 0.00 | 0.80 ± 0.01 | 0.86 ± 0.01 | 296.8 ± 3.6 | 0.040 ± 0.001
U-net | 0.47 ± 0.01 | 0.50 ± 0.01 | 0.64 ± 0.01 | 0.79 ± 0.01 | 192.9 ± 4.5 | 0.038 ± 0.002
PSPNet | 0.62 ± 0.01 | 0.63 ± 0.01 | 0.80 ± 0.01 | 0.76 ± 0.00 | 390.0 ± 9.4 | 0.040 ± 0.001
Linknet | 0.55 ± 0.06 | 0.57 ± 0.06 | 0.73 ± 0.07 | 0.79 ± 0.01 | 199.8 ± 7.2 | 0.039 ± 0.001

3. Methodology

In this section, we briefly explain the deep learning methods, including network architectures and optimization strategies.

3.1. FPN-based semantic segmentation network

Feature Pyramid Network (FPN) is a feature extraction network [52], originally proposed for multi-scale object detection, which is constructed from three parts:

(1). Bottom-up pathway: The bottom-up pathway computes a feature hierarchy consisting of feature maps at several scales. This pathway consists of four down-sampling stages (also known as pyramid levels), where the convolutional layers in the same stage produce output feature maps of the same size. CNNs can be used to construct this pathway.
(2). Top-down pathway: The top-down pathway produces higher-resolution features by up-sampling (nearest interpolation) spatially coarser, but semantically stronger, feature maps from higher pyramid levels.
(3). Lateral connections: The features from the top-down pathway are enhanced by the features from the corresponding bottom-up pathway via lateral connections. The bottom-up feature map is of lower semantics, but its activations are more accurately localized as it is subsampled fewer times, so it can be used to enhance the top-down feature. The feature maps are laterally connected by element-wise addition.

Fig. 9. Example of data augmentation by combined image transformations. (a): Original samples. (b): Augmented samples.


Fig. 10. Comparison of defect segmentation results. (a): Origin image. (b): Ground truth. (c): FPN. (d): U-net. (e): PSPNet. (f): Linknet.

Table 6
Comparisons of performances and efficiencies of ResNet backbones with different depths.

ResNets | mIoU | mF1 | mPA | mR | Para. | Training time (s) | Inference time (s)
ResNet-18 | 0.62 | 0.63 | 0.81 | 0.77 | 13,821,135 | 263.1 | 0.037
ResNet-34 | 0.66 | 0.67 | 0.88 | 0.76 | 23,936,719 | 296.8 | 0.040
ResNet-50 | 0.64 | 0.66 | 0.84 | 0.77 | 26,917,583 | 355.1 | 0.043
ResNet-101 | 0.64 | 0.66 | 0.84 | 0.77 | 45,961,935 | 463.2 | 0.055
ResNet-152 | 0.65 | 0.66 | 0.85 | 0.75 | 61,651,663 | 550.8 | 0.059

Table 7
Comparisons of performances and efficiencies of some advanced backbones.

Backbones | mIoU | mF1 | mPA | mR | Para. | Training time (s) | Inference time (s)
ResNet-34 | 0.66 | 0.67 | 0.88 | 0.76 | 23,936,719 | 296.8 | 0.040
ResNeXt-101 | 0.62 | 0.64 | 0.83 | 0.77 | 45,497,929 | 1444.5 | 0.154
Se-ResNeXt-101 | 0.63 | 0.63 | 0.84 | 0.72 | 50,275,638 | 1569.6 | 0.165
Inception-v3 | 0.66 | 0.67 | 0.86 | 0.77 | 24,998,310 | 381.8 | 0.047
DenseNet-169 | 0.63 | 0.65 | 0.84 | 0.76 | 15,558,790 | 456.3 | 0.056

A branch is then added to generate the semantic segmentation output from the features of FPN [53]. The features of each level after the connection are up-sampled to feature maps of the same resolution by bilinear interpolation, then fused into one feature map by channel concatenation; finally, the size and channel of this feature map are resized by convolution to generate the semantic segmentation result with the same size as the input image.

The stages and connections of the FPN-based semantic segmentation network (later referred to as the FPN semantic segmentation network, or FPN) are shown in Fig. 7.
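The paper builds its networks on open-source libraries [62-64] that are not named in this excerpt. Assuming a Keras-based package such as segmentation_models, an FPN-ResNet-34 model matching the Table 4 configuration could be assembled roughly as follows (a sketch, not the authors' code):

    import segmentation_models as sm
    import tensorflow as tf

    sm.set_framework("tf.keras")  # select the tf.keras backend

    # FPN decoder on a ResNet-34 backbone; 5 classes = 4 defect types + background.
    model = sm.FPN(
        backbone_name="resnet34",
        input_shape=(224, 224, 3),
        classes=5,
        activation="softmax",
        encoder_weights="imagenet",  # Table 4: ImageNet pre-training weights
    )

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # Table 4
        loss="categorical_crossentropy",
        metrics=[sm.metrics.IOUScore(), sm.metrics.FScore()],
    )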


Table 8
Comparison of hyperparameter settings. (The first row is the baseline configuration of Table 4; each subsequent row changes one hyperparameter.)

Hyperparameters | Values | mIoU | mF1 | mPA | mR
Table 4 (baseline) | Table 4 | 0.66 | 0.67 | 0.88 | 0.76
Initialization parameters | None (no ImageNet weights) | 0.62 | 0.62 | 0.84 | 0.74
Batch size | 2 | 0.66 | 0.67 | 0.88 | 0.75
Batch size | 16 | 0.63 | 0.65 | 0.83 | 0.76
Training iterations | 20 | 0.64 | 0.65 | 0.83 | 0.77
Training iterations | 50 | 0.66 | 0.67 | 0.86 | 0.76
Gradient optimum algorithm | Nadam | 0.59 | 0.60 | 0.80 | 0.75

Table 9
Data augmentation improves model performance.

Data augmentation | mIoU | mF1 | mPA | mR
no | 0.62 ± 0.00 | 0.63 ± 0.00 | 0.83 ± 0.00 | 0.76 ± 0.01
yes | 0.65 ± 0.02 | 0.68 ± 0.02 | 0.80 ± 0.01 | 0.86 ± 0.01

Table 10
Performance comparison of models trained with different loss functions.

Loss functions | mIoU | mF1 | mPA | mR
Cross-entropy | 0.62 ± 0.00 | 0.63 ± 0.00 | 0.83 ± 0.00 | 0.76 ± 0.01
Dice | 0.64 ± 0.01 | 0.67 ± 0.01 | 0.79 ± 0.02 | 0.82 ± 0.01
Focal | 0.68 ± 0.00 | 0.69 ± 0.00 | 0.90 ± 0.00 | 0.75 ± 0.00
Dice + λ·Focal | 0.61 ± 0.01 | 0.64 ± 0.01 | 0.78 ± 0.01 | 0.81 ± 0.00
λ_Dice·Dice + λ·Focal | 0.73 ± 0.00 | 0.77 ± 0.00 | 0.88 ± 0.01 | 0.82 ± 0.01

3.2. Other semantic segmentation networks

To better demonstrate the detection performance of FPN, we also used some other typical semantic segmentation networks for comparison:

(1). U-net: U-net was initially proposed for biomedical image segmentation [54], with a symmetric encoder-decoder architecture, where the input image is down-sampled four times to produce high-level feature maps and then up-sampled four times to restore the feature maps to their original resolution. The four up-sampling stages are concatenated to the corresponding down-sampling stages in order to fuse the low-level features into the high-level semantic features. U-net has been applied in welding defect detection [36,42].
(2). PSPNet: PSPNet was initially proposed for scene parsing [55]. To integrate the context information under different receptive fields, this network uses a pyramid pooling module to pool the feature maps obtained by down-sampling into different sizes, which are then up-sampled to the same size using bilinear interpolation and fused into one feature map by channel concatenation; finally, these feature maps are resized using convolution to obtain the prediction results.
(3). Linknet: Linknet focuses on the efficiency of semantic segmentation implementation [56]. Like U-net, this network is also an encoder-decoder architecture. However, it does not up-sample the high-level feature maps obtained from down-sampling, but directly passes these feature maps to the corresponding up-sampling stage, which greatly reduces the parameter computation of the decoder and better recovers the spatial information lost in down-sampling.

The architectures of these networks are shown in Fig. 8, and their architectural variations essentially reflect their different ideas on semantic feature fusion.

3.3. ResNet-based feature extraction network

Deep residual networks (ResNet) solve the degradation problem of DNNs by residual learning [57]. Specifically, increasing the network layers theoretically allows DNNs to learn more complex features and improve their performance, but experiments have found that increasing the network depth may even lead to accuracy saturation or degradation, i.e., the degradation problem of DNNs. ResNet avoids degradation by learning the residual function, which at least retains the identity mapping while learning deeper residual features. Therefore, the deep ResNet-101, ResNet-152, etc. can be trained effectively.

ResNets contain five convolutional stages, which consist of a varying number of building blocks with shortcut connections. There are two types of building blocks: the shallower ResNet-18 and ResNet-34 use two-layer building blocks, while the deeper ResNet-50, ResNet-101 and ResNet-152 use three-layer building blocks. In this paper, we use ResNet to construct the backbone (the down-sampling stages) of the semantic segmentation network for extracting semantic features. Specifically, the second to fifth convolutional stages of ResNet correspond to the four down-sampling stages of the semantic segmentation network.

3.4. Data augmentation

Data augmentation is a common network improvement method for tasks where it is difficult to acquire many samples. Image transformation is a common approach to data augmentation: by transforming the original image into an augmented image, it makes the network believe that there are more samples to learn.

Image transformations can be divided into many types according to their processing effects, such as geometric transformation, color transformation, etc. For radiographic images, some transformations are feasible, such as adding noise or blurring (simulating transformations that may occur during image acquisition), which reflect objective variations and help in the prediction of test sets or other realistic tests. Some transformations are meaningless, such as rotating the image by 90 degrees (the test set does not have vertical welds).

Therefore, to obtain diverse augmented images, we combine various types of feasible transformations according to certain probabilities, which are listed in Table 2. For a certain type of transformation, there may be several feasible methods, and we choose one of them with equal probability when transforming the image. Moreover, to avoid the overfitting that may be caused by over-augmenting the training data, we only apply the transformations once to the training samples, which means that the training set is expanded to twice its size; Fig. 9 shows a data augmentation example. In addition, if needed, we also transform the annotated images to ensure that the labels corresponding to the augmented samples are correct.

3.5. Loss function

The loss function is the key element in training a reliable model, ensuring that the convergence procedure is rapid and stable. The commonly used cross-entropy loss function [36,44,58] can be formulated as:

L_{Cross-entropy} = - \sum_{c=1}^{C} gt_c \cdot \log(pr_c)    (1)

where gt_c and pr_c represent the one-hot ground truth and predicted probability of category c, respectively, and C is the number of categories, set to 5 (4 defect categories plus 1 background category).


Fig. 11. Training curves. (a): Loss function curves. (b): mIoU curves. (c): mF1 curves.

Considering the training difficulties due to imbalanced data distributions, several new loss functions have been proposed to alleviate this problem:

(1). Dice loss function [59]: the negative of the dice coefficient, which can partly solve the data imbalance problem by turning the pixel-wise labeling problem into minimizing a class-level distribution distance. It is calculated by the following equation:

L_{Dice} = C - \sum_{c=1}^{C} \frac{2 TP_c}{2 TP_c + FN_c + FP_c}    (2)

where TP_c, FN_c and FP_c are the true positives, false negatives and false positives of category c calculated from the prediction probability, respectively. However, this loss function may cause instability in the training process.

(2). Focal loss function [60]: the focal loss forces the model to learn poorly classified pixels better. It can be formulated as:

L_{Focal} = - \sum_{c=1}^{C} gt_c \cdot \alpha \cdot (1 - pr_c)^{\gamma} \cdot \log(pr_c)    (3)

where α is the weighting factor for balancing positive and negative samples, and γ is the moderating factor for hard and easy samples; they are set to 0.25 and 2 here, respectively. In addition, the focal loss also stabilizes the training process.
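For reference, Eqs. (2) and (3) translate into TensorFlow roughly as follows, assuming one-hot labels gt and softmax outputs pr of shape [batch, H, W, C]. This is an illustrative sketch, not the authors' implementation; the pixel averaging in the focal term is a common normalization choice.

    import tensorflow as tf

    def dice_loss(gt, pr, eps=1e-6):
        # Eq. (2): C minus the sum of per-class dice coefficients.
        axes = [0, 1, 2]  # aggregate over batch and spatial dimensions
        tp = tf.reduce_sum(gt * pr, axis=axes)
        fn = tf.reduce_sum(gt * (1.0 - pr), axis=axes)
        fp = tf.reduce_sum((1.0 - gt) * pr, axis=axes)
        dice = (2.0 * tp + eps) / (2.0 * tp + fn + fp + eps)
        return tf.cast(tf.shape(gt)[-1], tf.float32) - tf.reduce_sum(dice)

    def focal_loss(gt, pr, alpha=0.25, gamma=2.0):
        # Eq. (3): down-weight the loss of well-classified pixels.
        pr = tf.clip_by_value(pr, 1e-7, 1.0)
        per_pixel = -tf.reduce_sum(
            gt * alpha * (1.0 - pr) ** gamma * tf.math.log(pr), axis=-1)
        return tf.reduce_mean(per_pixel)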


Fig. 12. Obvious defects. (The top figure is original, and the middle figure is annotation, while the bottom figure is prediction result, same below).


Furthermore, to address the extremely imbalanced data distribution problem, Zhu et al. [61] used a hybrid loss function including dice loss and focal loss; the total loss can be formulated as:

L_{Hybrid} = L_{Dice} + \lambda \cdot L_{Focal}    (4)

where λ is the tradeoff between the dice loss L_{Dice} and the focal loss L_{Focal}, and is set to 1.

However, we noticed that in our defect detection task the imbalance problem exists not only between the foreground and the background, but also between the various categories of defects. Therefore, we add a weighting factor λ_Dice for balancing the categories to the dice loss in Eq. (4), and the new loss function can be written as:

L_{Hybrid} = \lambda_{Dice} \cdot L_{Dice} + \lambda \cdot L_{Focal}
           = C - \sum_{c=1}^{C} \frac{\lambda_{Dice_c}}{\sum_{c=1}^{C} \lambda_{Dice_c}} \cdot \frac{2 TP_c}{2 TP_c + FN_c + FP_c} - \lambda \cdot \sum_{c=1}^{C} gt_c \cdot \alpha \cdot (1 - pr_c)^{\gamma} \cdot \log(pr_c)    (5)

where λ_{Dice_c} represents the weight of category c, which is set according to the data distribution in the sample set (Table 1), as shown in Table 3.

4. Experimental results and discussion

In this section, we test and discuss the detection performance of the above methods on weld radiographic images through detailed experiments and comparisons.

4.1. Experimental setup

Based on the above sample set, we compare the detection performance of several networks and evaluate the improvement brought by the optimization strategies. All experiments are conducted in the TensorFlow 2.4 framework on a PC with 32 GB RAM, an Intel i7 processor, an NVIDIA RTX 3070 GPU, and a 64-bit Windows 10 operating system, using Python 3.8 and CUDA 11.2. The experiments share a common hyperparameter configuration, listed in Table 4. We refer to open-source libraries [62-64] in implementing the algorithms.

4.2. Evaluation indicators

4.2.1. Performance evaluation indicators
We introduce some evaluation indicators to verify the algorithm's defect detection performance:

(1). Mean class pixel accuracy (mPA): accuracy reflects the proportion of correctly predicted pixels in the object class and can be used to measure false alarms:

mPA = \frac{1}{C-1} \sum_{c=1}^{C-1} \frac{TP_c}{TP_c + FP_c}    (6)

where the background class is not considered, so there are C-1 classes.

(2). Mean recall (mR): this indicator represents how many pixels of the object class in the sample are correctly predicted, and it can be used to measure missed detections:

mR = \frac{1}{C-1} \sum_{c=1}^{C-1} \frac{TP_c}{TP_c + FN_c}    (7)

(3). Macro F1 score (mF1): it balances accuracy and recall:

mF1 = \frac{1}{C-1} \sum_{c=1}^{C-1} \frac{2 PA_c \cdot R_c}{PA_c + R_c}    (8)

where PA_c and R_c represent the accuracy and recall of class c, respectively.

(4). Mean intersection over union (mIoU): the ratio of the intersection to the union of the predicted and true values of the object class, used to indicate the similarity between prediction and ground truth:

mIoU = \frac{1}{C-1} \sum_{c=1}^{C-1} \frac{TP_c}{TP_c + FN_c + FP_c}    (9)

4.2.2. Efficiency evaluation indicators
To meet the efficiency requirements of detection, we also evaluate the algorithm's efficiency using the following indicators:


Fig. 13. Weak defects. (a): Crack. (b): Lack of penetration. (c): Lack of fusion.


Fig. 14. Small defects. (a): Porosity. (b): Inclusion.

(1). Training time: the training time of the model reflects its computational cost to some extent.
(2). Inference time: the execution speed directly reflects the efficiency of the algorithm, usually expressed by the inference time of the model. The time the model takes to predict one sample image is recorded.
(3). Trainable parameters (Para.): the number of trainable parameters is also provided as a reference to quantify the size of the model.

4.3. Network architecture

Appropriate feature extraction and aggregation are crucial for defect classification [53], and the network architecture is the key factor influencing defect feature extraction and fusion. Therefore, in this part, we select an appropriate network architecture for the welding defect detection task and demonstrate the superiority of the selected model through comparative experiments.

4.3.1. FPN semantic segmentation network for defect feature fusion
Based on the above sample set and training parameters, we train four semantic segmentation networks: FPN, U-net, PSPNet and Linknet. Their evaluation results on the validation set are presented in Table 5 (averaged over 5 runs; the same applies to Tables 9 and 10). The FPN semantic segmentation network outperforms the other networks in all evaluation indicators, especially mIoU and mF1. In terms of computational cost and prediction efficiency, U-net and Linknet perform better because they are structured with fewer computational parameters.

Fig. 10 shows some prediction results, from which we can observe the following. First, when predicting defect regions, except that PSPNet misses many weak defects (columns 2 and 3 of Fig. 10), the remaining networks can segment defect regions. Second, when predicting defect types, FPN predicts more accurately, while the other networks have semantic confusion problems, specifically:

(1). PO have a larger sample size and greater feature differentiation, so the models performed well in both their region segmentation and category judgement, but also missed some too-small PO (column 3 of Fig. 10).


Fig. 15. Various types of defects.

(2). The similarity of features between CR and LPF has led to confusion when the models predict their types, with some models predicting the same defect region as both defect types (columns 1, 5 and 6 of Fig. 10).
(3). SL have the fewest samples, so they are often misclassified as PO with similar features (column 4 of Fig. 10).

The difference in the architecture of the networks is undoubtedly an important reason for the above problems. From the perspective of feature learning, features at different levels of a CNN have different sensitivities to objects. Low-level features have higher resolution, so they generate clear and detailed boundaries and are sensitive to positional deviations, but carry less contextual semantic information. High-level features have more abstract semantic information, which is the main basis for classification, but weaker shape and location information [65].

Thus, the accurate prediction of defect boundaries and types by FPN proves that it has learned both low-level and high-level features of defects. However, the defect prediction results of the other networks are not as good as FPN's, even though they extract the same defect features from the same backbone (ResNet-34), which illustrates that the key to the problem lies in the differences between the feature fusion modules.

In the feature fusion architecture, the biggest difference between FPN and the other networks is that FPN concatenates the multi-layer feature maps obtained from the pyramid structure to predict the semantic segmentation results. In this way, the extracted high-level semantic features are directly used for prediction, cleverly avoiding the loss of these deep features during transfer between the models' different layers, and the contribution of these semantic features to the prediction results guarantees the correct classification of defects. U-net and Linknet also concatenate high-level semantic features, but these features contribute less to the prediction results, which leads to misclassification of defect types. PSPNet pools the extracted feature maps to too small a size, which leads to missed detection of weak defects.

In summary, the experiments and result analysis demonstrate that the FPN semantic segmentation network is very suitable for defect detection tasks due to its feature fusion architecture, and it shows good detection performance on the sample set.

4.3.2. ResNet backbone for defect feature extraction
Without degradation, deeper networks extract semantic features at deeper levels, which are also better characterized. To explore the impact of defect feature extraction on the detection results, we train FPNs with ResNets of different depths as backbones; their performance on the validation set is presented in Table 6.

We can see from the experimental results that FPN-ResNet-34 has the best detection performance, which means that the features extracted by the ResNet-34 backbone are sufficient to characterize the defects. Since defect detection is a lower-semantic-level task (the weld structures are fixed and the defect features are simple in the sample images), it is not necessary to extract the defect features with a deeper network. Moreover, training a DNN with many parameters on few samples may also lead to overfitting, reducing detection performance (and efficiency). Therefore, the feature extraction network with the best detection performance is not the deepest one, but the one best suited to the semantic feature level of the defects.

In addition, the number of parameters and the time to predict a sample for ResNets of different depths are also given in Table 6. Assuming that a weld radiograph is cropped into 10 sample images (usually fewer), it takes only about 0.4 s for FPN-ResNet-34 to interpret it. Therefore, this end-to-end model can basically meet the standard of real-time detection.

The computational overhead of the network is largely controllable. As listed in Table 5, the training time and inference time of the model are relatively stable, because the size of the input images and the number of trainable network parameters are both fixed. Moreover, the original images are converted to 224 × 224 resolution patches by the preprocessing methods, and the shallower ResNet-34 is selected as the backbone network, all of which reduce the computational burden to some extent.

In summary, we select the FPN-ResNet-34 semantic segmentation network to build our defect detection system.

4.3.3. Other advanced backbones and hyperparameter settings
In addition to ResNet, we applied some other advanced networks as backbones of FPN, including ResNeXt [66], Se-ResNeXt [67], Inception-v3 [68], and DenseNet [69]. As given in Table 7, replacing these backbone networks did not significantly improve the detection performance, but rather increased the computational cost. We believe the reason is that these advanced models are designed for generic tasks rather than for the specific challenges of defect detection.

When initializing the model parameters, we utilize the pre-training weights of the backbone on ImageNet. Specifically, the encoder parameters pre-trained on ImageNet are frozen and copied to the network, after which the unfrozen encoder and the randomly initialized decoder are trained simultaneously.
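Under the same segmentation_models assumption as earlier, this scheme might be expressed with the library's encoder_freeze flag as follows (the two-phase split shown here is an illustrative reading of the procedure, not the authors' published code):

    import segmentation_models as sm

    # Phase 1: build the model with the ImageNet-pretrained encoder frozen.
    model = sm.FPN("resnet34", classes=5, activation="softmax",
                   encoder_weights="imagenet", encoder_freeze=True)
    model.compile("adam", "categorical_crossentropy")
    # model.fit(train_data, ...)

    # Phase 2: unfreeze the encoder and train the whole network together.
    for layer in model.layers:
        layer.trainable = True
    model.compile("adam", "categorical_crossentropy")  # recompile after unfreezing
    # model.fit(train_data, ...)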


Fig. 16. Various forms of welds.

With this fine-tuning method, the shallow layers of the network encoder can pre-learn some general features, which helps the deep layers learn specific features. The effectiveness of this method is demonstrated experimentally, as given in Table 8, where the detection performance of the model is reduced when the network parameters are initialized without ImageNet pre-training weights.

In addition, we also compare the other hyperparameters in Table 4, and the experimental results are given in Table 8. We set the batch size, training iterations and gradient optimization algorithm that are suitable for our detection task.

Fig. 17. External verification results. (Original images on top, test results on bottom.)

The above experimental results demonstrate that it is difficult to further improve the network detection performance by replacing other advanced backbones or adjusting hyperparameters. Therefore, we design a targeted network optimization method in the next section.

4.4. Network optimization

According to the data characteristics of the defects, we optimize the selected FPN-ResNet-34 semantic segmentation network to meet the detection challenges.

4.4.1. Image transformation-based data augmentation
As discussed above, training the DNN with our small sample set is a challenge. Therefore, we select several image transformations (Table 2) for the radiographic images to augment the training samples; a sketch of such a pipeline is given below. The improvement of this optimization strategy on the model is presented in Table 9.

We can observe that this data augmentation method can improve the detection performance, but the improvement is limited. Because this augmentation method, based on transforming the original images, has difficulty simulating more realistic data, it does not really expand the sample capacity and cannot fundamentally improve the performance of the model.
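The Table 2 combination could be scripted with, for example, the albumentations library (an assumption; the paper lists only the transformation groups and probabilities). Each OneOf picks one method of its group with equal probability, mirroring the selection rule of Section 3.4:

    import albumentations as A
    import numpy as np

    augment = A.Compose([
        A.HorizontalFlip(p=0.5),                                   # geometric
        A.GaussNoise(p=0.2),                                       # noise addition
        A.OneOf([A.CLAHE(), A.RandomBrightnessContrast(),
                 A.RandomGamma()], p=0.9),                         # gray-level
        A.OneOf([A.RandomBrightnessContrast(),
                 A.HueSaturationValue()], p=0.9),                  # grayscale
        A.OneOf([A.Sharpen(), A.Blur(), A.MotionBlur()], p=0.9),   # fuzzy mapping
    ])

    image = np.random.randint(0, 256, (224, 224, 3), np.uint8)  # stand-in patch
    mask = np.zeros((224, 224), np.uint8)                       # stand-in label
    out = augment(image=image, mask=mask)  # the mask is warped with the image
    image_aug, mask_aug = out["image"], out["mask"]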


Fig. 18. Visualization of the middle activation layers. From left to right: input image, layer from stage 1, stage 2, stage 3 of the network. (a): Porosity. (b): Lack of
penetration. (c): Multiple defects.

4.4.2. Loss function-based category balancing
Another challenge of our defect detection task is caused by the imbalanced data distribution, which is reflected in two aspects:

(1). Imbalance between foreground and background: since the loss function of image semantic segmentation is based on pixel-wise labeling, small defects with few pixels contribute little to the loss, which may cause a model trained to minimize the loss to actually have poor detection performance.
(2). Imbalance among defect categories: the difference in the contribution to the loss by the various defect types may lead to poor prediction performance on defect types with few samples.

This problem is also reflected in the detection results in Fig. 10, so we solve it using the loss function of Eq. (5). To validate the effect of different loss functions on model performance, we train FPN-ResNet-34 using five loss functions: cross-entropy loss, dice loss, focal loss, the hybrid loss of dice loss and focal loss, and our proposed hybrid loss with added category weights.

The performance of the model trained with the five loss functions is shown in Table 10, and the training curves are shown in Fig. 11.

We make some observations from this experiment. First, dice loss and focal loss outperform cross-entropy loss on our model, which shows that balancing the categories through the loss function can improve the performance of the model (see Fig. 11 (b), (c)). Second, the plain hybrid loss does not improve the model performance, while it significantly improves the network after adding the weights for balancing defect categories, which proves the superiority of the loss function we designed for the defect data distribution. Third, if the model is not trained with focal loss, the prediction results may contain no defects (all background) and the model needs to be retrained, which demonstrates the role of the focal loss in stabilizing training.

4.4.3. Visualization of test results
It is also worth noting that although our method is not outstanding in the evaluation indicators, this does not mean that it has poor detection performance, because these indicators are evaluated on defect pixels rather than on individual defect objects. This leads the evaluation indicators to underestimate the detection performance of the model: just a few pixels of error between the defect boundaries predicted by the model and the labeled boundaries can cause a significant degradation of the indicators (especially IoU, which is very sensitive to object position deviations), even though these defects can be considered detected.

Therefore, in this part, we show some defect detection results of the optimized FPN-ResNet-34 model on the test set (Figs. 12-16; from top to bottom: original images, ground truth annotations, and prediction results) to reflect its detection performance more intuitively.


Fig. 19. Defect detection method framework.

Firstly, the model can accurately predict the boundaries of obvious defects (Fig. 12). Secondly, for images with detection challenges, whether CR and LPF with low contrast (Fig. 13) or small objects such as PO and SL (Fig. 14), the model is able to detect the defects. Finally, due to the excellent feature learning capability of the DNN, the model can also be adapted to detect various types of defects (Fig. 15) in multiple welding scenarios (Fig. 16) of the sample set. These visualization results demonstrate the good defect detection performance of our method.

4.5. Network verification

4.5.1. External validation-based network generalization
Although the optimized model performs well on the sample set, this does not mean that it is also reliable in detecting weld defects in other scenarios. Therefore, to evaluate the generalization ability of the network in other detection scenarios, we use the optimized FPN-ResNet-34 to predict the images in the external validation set (R0002), achieving 0.63 mIoU and 0.64 mF1.

Fig. 17 shows some of the predicted results; although these films from realistic projects are more difficult to detect, the network detects their defects relatively accurately. Based on the good generalization performance of the DNN, our method shows potential for application in practical inspection.

4.5.2. Feature visualization-based network interpretation
The reason why DNNs have strong generalization ability is that they learn generic semantic features. To better understand these features, we visualized the middle activation layers of FPN-ResNet-34 using a post-interpretation technique; Fig. 18 shows the transformation patterns of these different layers for the input image.
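Intermediate activations like those in Fig. 18 can be read out of a trained Keras model with a probe sub-model (a generic sketch; the layer names to pass depend on the concrete backbone implementation and are hypothetical here):

    import tensorflow as tf

    def activation_maps(model, image, layer_names):
        # Sub-model that exposes the requested intermediate layer outputs.
        outputs = [model.get_layer(n).output for n in layer_names]
        probe = tf.keras.Model(inputs=model.input, outputs=outputs)
        feats = probe(image[None, ...])  # add the batch dimension
        if not isinstance(feats, (list, tuple)):
            feats = [feats]
        # Each map has shape (1, h, w, channels); channels can be shown as images.
        return [f.numpy()[0] for f in feats]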


Fig. 20. False detection on external validation set.

are only simple transformations of input images, which almost contain study. Furthermore, in view of the characteristics of weld radio­
the complete original information, i.e., the low-level features retain graphic images, an image preprocessing and annotation method
more defect location and boundary information. While as the network is designed, which preserves the original information of the data
layers get deeper, the feature maps become more abstract, representing while enabling the images to be used for training and evaluation
higher-level features. In the last layer, it is obvious that some feature of the semantic segmentation networks.
maps that only activate the defect regions, i.e., high-level semantic (2). Based on the FPN-ResNet-34 semantic segmentation network, an
features, which are the main basis for defect prediction. end-to-end welding defect detection is implemented. The
The visualized middle activation layer intuitively shows the defect network receives weld radiographic images as input, automati­
features learned by the network. While FPN-ResNet-34 extracts the cally extracts and fuses semantic features of defects, and outputs
defect high-level features in its third stage, consistent with the viewpoint complete information, including defect category, boundary, and
about the defect feature extraction and fusion that we obtained when location, which can be directly used for weld quality rating.
discussing the network architecture earlier. Compared with other networks, FPN-ResNet-34 exhibits better
4.6. Limitations and future work

In summary, we propose an automatic detection method for welding defects in radiographic images based on semantic segmentation; the overall framework of the method is shown in Fig. 19. Although our method shows good detection performance, it still has some limitations that need to be addressed in future work:

(1). Image transformation provides only limited augmentation of our data, so we will probably try some new data augmentation methods, such as manual defect simulation, GAN-based defect generation, or other reconstruction methods [70].
(2). Although the network exhibits a certain generalization ability, it does not perform excellently on the external validation set, producing the false alarms shown in Fig. 20. As a next step, we will design a transfer learning strategy for the algorithm, so that the trained model can be fine-tuned on external data to improve its transfer performance and give the algorithm a cumulative learning ability when facing new data (a minimal fine-tuning sketch follows this list).
(3). In this paper, we use the MSMI algorithm to automatically label the defect regions, which greatly reduces the workload of data annotation; however, the labeling of defect types is still done manually. Compared with supervised learning, which requires labeled data, unsupervised learning is a promising way to remove this remaining bottleneck, so in the future we may build a better defect detection system based on unsupervised learning.
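As a concrete illustration of the fine-tuning idea in item (2), the sketch below freezes a trained encoder and retrains the remaining (decoder) parameters on external data. It is a minimal sketch under assumed names ("model.encoder" as in typical segmentation libraries, and a hypothetical "external_loader"), not the transfer learning strategy of this study, which remains future work.

```python
# Minimal sketch: fine-tuning a trained segmentation model on external data.
# Names (model.encoder, external_loader) are assumptions for illustration;
# the paper's transfer learning strategy is left to future work.
import torch

def fine_tune(model, external_loader, epochs=10, lr=1e-4, device="cuda"):
    model.to(device)
    # Freeze the pretrained encoder; adapt only the decoder/head.
    for p in model.encoder.parameters():
        p.requires_grad = False
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(params, lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, masks in external_loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), masks)
            loss.backward()
            optimizer.step()
    return model
```

A smaller learning rate than in the original training, and gradually unfreezing the deepest encoder stage once the decoder has converged, are common refinements of this scheme.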
5. Conclusions

In this paper, we construct an automatic welding defect detection system for radiographic images based on semantic segmentation methods. The main research results of this paper are summarized as follows:

(1). Based on the collected data, a weld radiographic image dataset, RIWD, is constructed as the basis for the detection algorithm study. Furthermore, in view of the characteristics of weld radiographic images, an image preprocessing and annotation method is designed, which preserves the original information of the data while enabling the images to be used for training and evaluating the semantic segmentation networks.
(2). Based on the FPN-ResNet-34 semantic segmentation network, end-to-end welding defect detection is implemented. The network receives weld radiographic images as input, automatically extracts and fuses semantic features of defects, and outputs complete defect information, including category, boundary, and location, which can be used directly for weld quality rating. Compared with the other networks, FPN-ResNet-34 exhibits better detection performance due to its architectural features, achieving 0.66 mIoU, 0.67 mF1, 0.88 mPA, and 0.76 mR on the validation set. Moreover, the network predicts a single sample image in only 0.04 s, which meets the efficiency requirement of real-time detection.
(3). According to the data characteristics of welding defects, an optimization strategy for FPN-ResNet-34 is proposed, combining image transformation-based data augmentation with hybrid loss function-based data distribution balancing (a minimal illustration of such a hybrid loss is sketched at the end of this section). The improvement in network performance brought by the optimization strategy is demonstrated experimentally, with the optimized network achieving 0.77 mIoU, 0.73 mF1, 0.90 mPA, and 0.86 mR.
(4). To verify our method's generalization ability to different detection scenarios, the network is tested on an external validation set and achieves 0.63 mIoU and 0.64 mF1, indicating its potential for application in practical welding inspection engineering. The defect features learned by the network are shown by visualizing the middle activation layers, which illustrates the robustness and generalizability of our method from the perspective of feature learning.

In addition, the idea behind the construction of the system in this paper can also be extended to other NDT fields.
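To make the hybrid-loss idea in item (3) concrete, the sketch below combines a Dice term (in the spirit of [59]) with a focal term (in the spirit of [60]) under a weighting coefficient. It is our own minimal illustration; "alpha", "gamma", and the smoothing constant are placeholders rather than the exact formulation used in this study.

```python
# Minimal sketch: hybrid Dice + Focal loss for class-imbalanced segmentation,
# in the spirit of [59] and [60]. The weighting (alpha) and smoothing values
# are placeholders, not the exact formulation used in this paper.
import torch
import torch.nn.functional as F

def hybrid_loss(logits, target, num_classes, alpha=0.5, gamma=2.0, eps=1e-6):
    """logits: (N, C, H, W); target: (N, H, W) integer class labels."""
    probs = torch.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()

    # Dice term: penalizes poor overlap, robust to class imbalance.
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    union = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2 * inter + eps) / (union + eps)).mean()

    # Focal term: down-weights easy pixels (mostly background).
    ce = F.cross_entropy(logits, target, reduction="none")
    pt = torch.exp(-ce)
    focal = ((1 - pt) ** gamma * ce).mean()

    return alpha * dice + (1 - alpha) * focal
```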
CRediT authorship contribution statement

H. Xu: Conceptualization, Methodology, Software, Formal analysis, Data curation, Writing – original draft, Writing – review & editing, Visualization. Z.H. Yan: Conceptualization, Methodology, Software, Validation, Data curation, Writing – review & editing, Funding acquisition, Supervision. B.W. Ji: Software, Investigation, Resources, Data curation. P.F. Huang: Investigation, Resources. J.P. Cheng: Investigation. X.D. Wu: Resources.
Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors wish to thank the editor and the reviewers for their helpful suggestions.

Funding

The work in this research is financially supported by the National Natural Science Foundation of China (Grant No. 51975015).

References
[1] T.W. Liao, Improving the accuracy of computer-aided radiographic weld inspection by feature selection, NDT & E Int. 42 (4) (2009) 229–239.
[2] W. Hou, D. Zhang, Y. Wei, J. Guo, X. Zhang, Review on Computer Aided Weld Defect Detection from Radiography Images, Appl. Sci. 10 (5) (2020) 1878, https://doi.org/10.3390/app10051878.
[3] Y. Zou, D. Du, B. Chang, L. Ji, J. Pan, Automatic weld defect detection method based on Kalman filtering for real-time radiographic inspection of spiral pipe, NDT & E Int. 72 (2015) 1–9.
[4] M. Malarvel, G. Sethumadhavan, P.C. Rao Bhagi, S. Kar, T. Saravanan, A. Krishnan, Anisotropic diffusion based denoising on X-radiography images to detect weld defects, Digital Signal Process. 68 (2017) 112–126.
[5] X. Wang, B.S. Wong, Radiographic Image Segmentation for Weld Inspection Using a Robust Algorithm, Res. Nondestr. Eval. 16 (3) (2005) 131–142.
[6] Z. Lin, Z. Yingjie, D. Bochao, C. Bo, L. Yangfan, Welding defect detection based on local image enhancement, IET Image Process. 13 (2019) 2647–2658.
[7] O. Zahran, H. Kasban, M. El-Kordy, F.E.A. El-Samie, Automatic weld defect identification from radiographic images, NDT & E Int. 57 (2013) 26–35.
[8] Y. Wang, Y. Sun, P. Lv, H. Wang, Detection of line weld defects based on multiple thresholds and support vector machine, NDT & E Int. 41 (7) (2008) 517–524.
[9] R. Vilar, J. Zapata, R. Ruiz, An automatic system of classification of weld defects in radiographic images, NDT & E Int. 42 (5) (2009) 467–476.
[10] G. Wang, T.W. Liao, Automatic identification of different types of welding defects in radiographic images, NDT & E Int. 35 (2002) 519–528.
[11] J. Shao, D. Du, B. Chang, H. Shi, Automatic weld defect detection based on potential defect tracking in real-time radiographic image sequence, NDT & E Int. 46 (2012) 14–21.
[12] Alaknanda, R.S. Anand, P. Kumar, Flaw detection in radiographic weldment images using morphological watershed segmentation technique, NDT & E Int. 42 (1) (2009) 2–8.
[13] Alaknanda, R.S. Anand, P. Kumar, Flaw detection in radiographic weld images using morphological approach, NDT & E Int. 39 (1) (2006) 29–33.
[14] Z.H. Yan, H. Xu, P.F. Huang, Multi-scale multi-intensity defect detection in ray image of weld bead, NDT & E Int. 116 (2020) 102342, https://doi.org/10.1016/j.ndteint.2020.102342.
[15] N. Nacereddine, A.B. Goumeidane, D. Ziou, Unsupervised weld defect classification in radiographic images using multivariate generalized Gaussian mixture model with exact computation of mean and shape parameters, Comput. Ind. 108 (2019) 132–149.
[16] I. Valavanis, D. Kosmopoulos, Multiclass defect detection and classification in weld radiographic images using geometric and texture features, Expert Syst. Appl. 37 (12) (2010) 7606–7614.
[17] H. Kasban, O. Zahran, H. Arafa, M. El-Kordy, S.M.S. Elaraby, F.E. Abd El-Samie, Welding defect detection from radiography images with a cepstral approach, NDT & E Int. 44 (2) (2011) 226–231.
[18] N. Nacereddine, D. Ziou, L. Hamami, Fusion-based shape descriptor for weld defect radiographic image retrieval, Int. J. Adv. Manuf. Technol. 68 (9-12) (2013) 2815–2832.
[19] L. Yang, H. Jiang, Weld defect classification in radiographic images using unified deep neural network with multi-level features, J. Intell. Manuf. 32 (2) (2021) 459–469.
[20] Z. Yan, B. Shi, L. Sun, J. Xiao, Surface defect detection of aluminum alloy welds with 3D depth image and 2D gray image, Int. J. Adv. Manuf. Technol. 110 (3-4) (2020) 741–752.
[21] R.R. da Silva, L.P. Calôba, M.H.S. Siqueira, J.M.A. Rebello, Pattern recognition of weld defects detected by radiographic test, NDT & E Int. 37 (6) (2004) 461–470.
[22] J. Sun, C. Li, X.-J. Wu, V. Palade, W. Fang, An Effective Method of Weld Defect Detection and Classification Based on Machine Vision, IEEE Trans. Ind. Inf. 15 (12) (2019) 6322–6333.
[23] N. Boaretto, T.M. Centeno, Automated detection of welding defects in pipelines from radiographic images DWDI, NDT & E Int. 86 (2017) 7–13.
[24] F. Duan, S. Yin, P. Song, W. Zhang, C. Zhu, H. Yokoi, Automatic Welding Defect Detection of X-Ray Images by Using Cascade AdaBoost With Penalty Term, IEEE Access 7 (2019) 125929–125938.
[25] B. Chen, Z. Fang, Y. Xia, L. Zhang, Y. Huang, L. Wang, Accurate defect detection via sparsity reconstruction for weld radiographs, NDT & E Int. 94 (2018) 62–69.
[26] F.M. Suyama, M.R. Delgado, R. Dutra da Silva, T.M. Centeno, Deep neural networks based approach for welded joint detection of oil pipelines in radiographic images with Double Wall Double Image exposure, NDT & E Int. 105 (2019) 46–55.
[27] P. Sassi, P. Tripicchio, C.A. Avizzano, A Smart Monitoring System for Automatic Welding Defect Detection, IEEE Trans. Ind. Electron. 66 (12) (2019) 9641–9650.
[28] J.P. Yun, W.C. Shin, G. Koo, M.S. Kim, C. Lee, S.J. Lee, Automated defect inspection system for metal surfaces based on deep learning and data augmentation, J. Manuf. Syst. 55 (2020) 317–324.
[29] J.-K. Park, W.-H. An, D.-J. Kang, Convolutional Neural Network Based Surface Inspection System for Non-patterned Welding Defects, Int. J. Precis. Eng. Manuf. 20 (3) (2019) 363–374.
[30] J. Lin, Y. Yao, L. Ma, Y. Wang, Detection of a casting defect tracked by deep convolution neural network, Int. J. Adv. Manuf. Technol. 97 (1-4) (2018) 573–581.
[31] K. Zhang, H. Shen, Solder Joint Defect Detection in the Connectors Using Improved Faster-RCNN Algorithm, Appl. Sci. 11 (2) (2021) 576, https://doi.org/10.3390/app11020576.
[32] Y. Yang, R. Yang, L. Pan, J. Ma, Y. Zhu, T. Diao, L. Zhang, A lightweight deep learning algorithm for inspection of laser welding defects on safety vent of power battery, Comput. Ind. 123 (2020) 103306, https://doi.org/10.1016/j.compind.2020.103306.
[33] X. Zhang, Y. Hao, H. Shangguan, P. Zhang, A. Wang, Detection of surface defects on solar cells by fusing Multi-channel convolution neural networks, Infrared Phys. Technol. 108 (2020) 103334, https://doi.org/10.1016/j.infrared.2020.103334.
[34] Y. Wang, F. Shi, X. Tong, A Welding Defect Identification Approach in X-ray Images Based on Deep Convolutional Neural Networks, Springer International Publishing, Cham, 2019, pp. 53–64.
[35] S.-J. Oh, M.-J. Jung, C. Lim, S.-C. Shin, Automatic Detection of Welding Defects Using Faster R-CNN, Appl. Sci. 10 (2020).
[36] L. Yang, H. Wang, B. Huo, F. Li, Y. Liu, An automatic welding defect location algorithm based on deep learning, NDT & E Int. 120 (2021) 102435, https://doi.org/10.1016/j.ndteint.2021.102435.
[37] R. Guo, H. Liu, G. Xie, Y. Zhang, Weld Defect Detection From Imbalanced Radiographic Images Based on Contrast Enhancement Conditional Generative Adversarial Network and Transfer Learning, IEEE Sens. J. 21 (9) (2021) 10844–10853.
[38] L. Yang, Y. Liu, J. Peng, An Automatic Detection and Identification Method of Welded Joints Based on Deep Neural Network, IEEE Access 7 (2019) 164952–164961.
[39] W. Hou, Y. Wei, Y. Jin, C. Zhu, Deep features based on a DCNN model for classifying imbalanced weld flaw types, Measurement 131 (2019) 482–489.
[40] T.W. Liao, Classification of weld flaws with imbalanced class data, Expert Syst. Appl. 35 (3) (2008) 1041–1052.
[41] X. Le, J. Mei, H. Zhang, B. Zhou, J. Xi, A learning-based approach for surface defect detection using small image datasets, Neurocomputing 408 (2020) 112–120.
[42] X. Dong, C.J. Taylor, T.F. Cootes, Automatic aerospace weld inspection using unsupervised local deep feature learning, Knowl.-Based Syst. 221 (2021) 106892, https://doi.org/10.1016/j.knosys.2021.106892.
[43] H. Dong, K. Song, Y. He, J. Xu, Y. Yan, Q. Meng, PGA-Net: Pyramid Feature Fusion and Global Context Attention Network for Automated Surface Defect Detection, IEEE Trans. Ind. Inf. 16 (12) (2020) 7448–7458.
[44] W. Du, H. Shen, J. Fu, G. Zhang, Q. He, Approaches for improvement of the X-ray image defect detection of automobile casting aluminum parts based on deep learning, NDT & E Int. 107 (2019) 102144, https://doi.org/10.1016/j.ndteint.2019.102144.
[45] H. Jiang, Q. Hu, Z. Zhi, J. Gao, Z. Gao, R. Wang, S. He, H. Li, Convolution neural network model with improved pooling strategy and feature selection for weld defect recognition, Weld. World 65 (4) (2021) 731–744.
[46] Y. Gong, H. Shao, J. Luo, Z. Li, A deep transfer learning model for inclusion defect detection of aeronautics composite materials, Compos. Struct. 252 (2020) 112681, https://doi.org/10.1016/j.compstruct.2020.112681.
[47] Q. Sun, Z. Ge, A Survey on Deep Learning for Data-Driven Soft Sensors, IEEE Trans. Ind. Inf. 17 (9) (2021) 5853–5866.
[48] W. Hou, Y. Wei, J. Guo, Y. Jin, C. Zhu, Automatic Detection of Welding Defects using Deep Neural Network, J. Phys. Conf. Ser. 933 (2018) 012006, https://doi.org/10.1088/1742-6596/933/1/012006.
[49] C. Ajmi, J. Zapata, S. Elferchichi, A. Zaafouri, K. Laabidi, Deep Learning Technology for Weld Defects Classification Based on Transfer Learning and Activation Features, Adv. Mater. Sci. Eng. 2020 (2020) 1–16.
[50] H. Jiang, R. Wang, Z. Gao, J. Gao, H. Wang, Classification of weld defects based on the analytical hierarchy process and Dempster–Shafer evidence theory, J. Intell. Manuf. 30 (4) (2019) 2013–2024.
[51] D. Mery, V. Riffo, U. Zscherpel, G. Mondragón, I. Lillo, I. Zuccar, H. Lobel, M. Carrasco, GDXray: The Database of X-ray Images for Nondestructive Testing, J. Nondestruct. Eval. 34 (4) (2015), https://doi.org/10.1007/s10921-015-0315-7.
[52] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, S. Belongie, Feature pyramid networks for object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117–2125.
[53] A. Kirillov, R. Girshick, K. He, P. Dollár, Panoptic feature pyramid networks, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 6399–6408.
[54] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, pp. 234–241.
[55] H. Zhao, J. Shi, X. Qi, X. Wang, J. Jia, Pyramid scene parsing network, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2881–2890.
[56] A. Chaurasia, E. Culurciello, LinkNet: Exploiting encoder representations for efficient semantic segmentation, in: 2017 IEEE Visual Communications and Image Processing (VCIP), IEEE, 2017, pp. 1–4.
[57] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[58] H. Yu, X. Li, K. Song, E. Shang, H. Liu, Y. Yan, Adaptive depth and receptive field selection network for defect semantic segmentation on castings X-rays, NDT & E Int. 116 (2020) 102345, https://doi.org/10.1016/j.ndteint.2020.102345.
[59] C.H. Sudre, W. Li, T. Vercauteren, S. Ourselin, M.J. Cardoso, Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations, in: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer, 2017, pp. 240–248.
[60] T.-Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollár, Focal loss for dense object detection, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2980–2988.
[61] W. Zhu, Y. Huang, L. Zeng, X. Chen, Y. Liu, Z. Qian, N. Du, W. Fan, X. Xie, AnatomyNet: Deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy, Med. Phys. 46 (2) (2019) 576–589.
[62] P. Yakubovskiy, Segmentation Models, GitHub repository, 2019.
[63] A. Buslaev, V.I. Iglovikov, E. Khvedchenya, A. Parinov, M. Druzhinin, A.A. Kalinin, Albumentations: fast and flexible image augmentations, Information 11 (2) (2020) 125, https://doi.org/10.3390/info11020125.
[64] F. Chollet, Deep Learning with Python, Simon and Schuster, 2017.
[65] A. Mahendran, A. Vedaldi, Understanding deep image representations by inverting them, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 5188–5196.
[66] A.E. Orhan, Robustness properties of Facebook's ResNeXt WSL models, arXiv preprint arXiv:1907.07640, 2019.
[67] J. Hu, L. Shen, G. Sun, Squeeze-and-excitation networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141.
[68] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.
[69] G. Huang, Z. Liu, L. van der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
[70] X. Liu, S. Chen, L. Song, M. Woźniak, S. Liu, Self-attention negative feedback network for real-time image super-resolution, J. King Saud Univ. Comput. Inf. Sci. (2021).
