Oil and gas pipeline failure prediction system using long range ultrasonic
transducers and Euclidean-Support Vector Machines classification approach
Lam Hong Lee a, Rajprasad Rajkumar a, Lai Hung Lo b, Chin Heng Wan b, Dino Isa a
a Intelligent Systems Research Group, Faculty of Engineering, The University of Nottingham, Malaysia Campus, Jalan Broga, 43500 Semenyih, Selangor, Malaysia
b Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, 31900 Kampar, Perak, Malaysia
Article info
Keywords:
Oil and gas pipeline defects
Long range ultrasonic transducer
Support Vector Machines
Euclidean distance function
Kernel function
Soft margin parameter
Abstract
This paper presents an intelligent failure prediction system for oil and gas pipelines using long range ultrasonic transducers and a Euclidean-Support Vector Machines classification approach. Over the past decade, incidents of oil and gas pipeline leaks and failures around the world have become more frequent and have caused loss of life, property and irreversible environmental damage. This situation is due to the lack of a foolproof method of inspecting the condition of oil and gas pipelines. The onset of corrosion and other defects goes undetected, causing unplanned shutdowns and disruption of energy supplies to consumers. Existing pipeline failure prediction systems which use non-destructive testing (NDT) methods are accurate, but they are deployed at pre-determined intervals which can be several months apart. Hence, a foolproof and reliable inspection method is required to continuously monitor the condition of oil and gas pipelines in order to provide sufficient information and time for oil and gas operators to plan and organize shutdowns before failures occur. Permanently installed long range ultrasonic transducers (LRUTs) offer a solution to this problem by providing an inspection platform that continuously monitors critical pipeline sections. Data are acquired in real time and processed to make decisions based on the condition of the pipe. The continuous nature of the data requires automatic decision-making software rather than manual inspection by operators. The Support Vector Machines (SVMs) classification approach has been increasingly used in a multitude of domains, including LRUT, and has shown better performance than other classification algorithms. SVM is heavily dependent on the choice of kernel function as well as fine tuning of the kernel and soft margin parameters, and is therefore unsuitable for continuous monitoring of pipeline data, where constant modification of kernels and parameters is not realistic. This paper proposes a novel classification technique, namely Euclidean-Support Vector Machines (Euclidean-SVM), to make decisions on the integrity of the pipeline in a continuous monitoring environment. The results show that the classification accuracy of the Euclidean-SVM approach does not depend on the choice of kernel function and parameters when classifying data from pipes with simulated defects. Irrespective of the kernel function and parameters chosen, the classification accuracy of the Euclidean-SVM is comparable to, and in some cases higher than, that of the conventional SVM. Hence, the Euclidean-SVM approach is ideally suited to classifying data from oil and gas pipelines which are continuously monitored using LRUT.
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
This paper presents a novel oil and gas pipeline failure prediction system which utilizes a non-destructive testing (NDT) method based on long range ultrasonic transducers (LRUTs), in conjunction with an advanced signal processing technique and a new classification framework, the Euclidean-Support Vector Machines (Euclidean-SVM) approach. The system provides continuous monitoring of pipelines using an NDT method, and also makes decisions free of human error and misinterpretation using an artificial intelligence classification approach. In recent years, oil and gas pipeline condition monitoring and failure prediction systems have become of great importance due to the incidents of oil and gas pipeline leaks and failures around the world. These incidents are becoming more frequent and have caused loss of life, property and irreversible environmental damage. The major cause of these incidents is the lack of a foolproof method of inspecting the condition of oil and gas pipelines. Corrosion has been reported as one of the major problems in oil and gas pipelines, resulting in catastrophic pollution and wastage of raw materials (Lozev, Smith, & Grimmett, 2003). These undetected pipeline defects cause unplanned shutdowns and disruptions of energy supply to consumers. Hence, frequent leaks of gas and oil from ruptured pipes call for better and more efficient methods to monitor the condition and predict the failures of oil and gas pipelines.
For some decades now, techniques such as pigging (Lebsack, 2002) have been used for pipeline inspection at predetermined intervals. The pigging technique uses devices called smart pigs, which travel within the pipeline to record critical information such as corrosion levels, cracks and structural defects using numerous types of sensors. Smart pigs are able to pinpoint the location of defects using techniques such as magnetic flux leakage and ultrasonic detection (Bickerstaff, Vaughn, Stoker, Hassard, & Garrett, 2002). However, implementing a pigging system for pipeline inspection can be very costly, and the pipeline condition is measured only at the instant the pig is deployed, providing no continuous measurements over time. Recently, other NDT techniques have been introduced to monitor the condition of pipelines in order to reduce the cost of pigging-based inspection. However, these NDT methods have also been implemented at predetermined intervals, where operators need to be physically present to perform measurements, collect data and make judgments on the integrity of the pipeline. These processes may take up to several months to generate a result regarding the condition of the pipeline. During this period, the condition of the pipeline can go unmonitored, and failures and leaks may occur, as the defects which lead to them can appear suddenly.
In order to overcome the problems mentioned above, a foolproof and reliable inspection method is required to continuously monitor the condition of oil and gas pipelines, providing sufficient real-time information for oil and gas operators to plan and organize shutdowns of the pipeline before failures occur. A permanently installed NDT system is needed for real-time pipeline condition monitoring and failure prediction, providing an inspection platform that continuously monitors critical pipeline sections such as insulated pipes, risers, pipes on hill slopes, pipe bends, pipes under road crossings and offshore pipes. Such a system would ensure that pipes are continuously monitored and hence prevent the occurrence of leaks and failures. LRUT, which utilizes guided waves to inspect long distances from a single location (Demma, Cawley, Lowe, Roosenbrand, & Pavlakovic, 2004), was specifically designed for the inspection of corrosion under insulation (CUI). Compared to other NDT techniques, LRUT is reported to be more efficient and cost-saving, since it is also able to detect both internal and external corrosion. This gives the LRUT technique many advantages over other NDT techniques and has led to its widespread use in many other applications. With recent developments in oil and gas pipelines based on a permanent mounting system using a special compound, a real-time continuous monitoring system is destined to be the future trend of NDT systems. Data from a permanently installed LRUT system will be continuous and hence impractical for human operators to analyze. In our proposed system, data are acquired in real time and processed to make decisions based on the condition of the pipe. The continuous nature of the data requires automatic decision-making software rather than manual inspection by human operators. Hence, automatic, intelligence-based software must be deployed in the system to process the continuous streams of data and make decisions on the integrity of the monitored pipeline.
The Support Vector Machines (SVMs) approach has been increasingly used in a multitude of domains, including LRUT, and has shown better performance than other classification algorithms. This paper therefore proposes the Euclidean-SVM approach for real-time condition monitoring and failure prediction for oil and gas pipelines. The Euclidean-SVM approach replaces the optimal separating hyper-plane of the conventional SVM with a Euclidean distance measurement as the classification decision-making function. It uses SVM in the training phase to identify the set of support vectors (SVs) for each category, and uses the Euclidean distance formula in the classification phase to compute the average distances between the testing data point and each of the sets of SVs from the different categories. The classification decision is made in favour of the category which has the lowest average distance between its set of SVs and the new data point, irrespective of the efficacy of the hyper-plane formed by the particular kernel function and parameters applied. The conventional SVM classification model requires an appropriate combination of kernel function and parameters to make correct decisions. In our proposed Euclidean-SVM classification approach, the impact of the kernel function and parameters on the accuracy of the classifier is minimized. As a result, the Euclidean-SVM approach contributes a classification framework for the real-time pipeline condition monitoring and failure prediction system that is independent of the kernel function and parameters, removing the need to prepare a validation dataset for the kernel function and parameter optimization process and reducing the convoluted computations in the training phase.
2. Euclidean-Support Vector Machines classification approach
Support Vector Machines (SVMs) are increasingly being used for classification problems due to their promising empirical performance and excellent generalization ability. The good generalization characteristic of SVM is due to the implementation of the Structural Risk Minimization (SRM) principle, which entails finding an optimal separating hyper-plane, thus guaranteeing a highly accurate classifier in most applications. Eq. (1) represents the equation of a hyper-plane which can be used to partition data points in SVM.

w \cdot x + b = 0    (1)
\min \{ \tfrac{1}{2} \|w\|^2 \}

subject to  y_i (w \cdot x_i + b) \geq 1,  \forall i
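As a concrete illustration of this optimization, the hard-margin formulation above can be approximated by fitting a linear SVM with a very large soft margin parameter C and checking the margin constraint on a toy dataset. This sketch uses scikit-learn, which is an assumption on our part (the paper does not name its SVM implementation), and the data points are purely illustrative:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes in 2-D (illustrative toy data).
X = np.array([[0.0, 0.0], [0.5, 0.2], [2.0, 2.0], [2.5, 1.8]])
y = np.array([-1, -1, 1, 1])

# A very large C approximates the hard-margin problem min 1/2 ||w||^2.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]       # normal vector w of the separating hyper-plane
b = clf.intercept_[0]  # bias term b

# Every training point should satisfy y_i (w . x_i + b) >= 1 (up to tolerance).
margins = y * (X @ w + b)
print(margins >= 1 - 1e-3)  # all True
```

The two points closest to the decision boundary end up with margins of exactly 1; these are the support vectors that the Euclidean-SVM approach retains for its distance-based decision rule.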
D = \sqrt{ \sum_{i=1}^{n} (p_i - q_i)^2 }

Fig. 2. Vector space of the conventional SVM classifier with optimal separating hyper-plane.

D_{avg} = \left( \sum_{I=1}^{N} \sqrt{ \sum_{i=1}^{n} (p_i - q_i)^2 } \right) / N

Fig. 3. Vector space of the Euclidean-SVM classifier with the Euclidean distance function as the classification decision-making algorithm.
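The two formulas above translate directly into code: the pairwise Euclidean distance D between two n-dimensional points, and the average distance D_avg between a test point and the N support vectors of one category. A minimal sketch (the sample points are illustrative, not from the paper):

```python
import numpy as np

def euclidean(p, q):
    # D = sqrt( sum_i (p_i - q_i)^2 )
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sqrt(np.sum((p - q) ** 2))

def avg_distance(x, svs):
    # D_avg = mean distance from x to the N support vectors of one category
    return np.mean([euclidean(x, sv) for sv in svs])

svs_a = [[0.0, 0.0], [0.0, 2.0]]  # support vectors of category A (toy data)
svs_b = [[5.0, 0.0], [5.0, 2.0]]  # support vectors of category B (toy data)
x = [1.0, 1.0]

print(avg_distance(x, svs_a))  # ~1.4142: lowest average distance,
print(avg_distance(x, svs_b))  # ~4.1231: so x is assigned to category A
```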
After computing the average distance of the new data point to the set of SVs of each of the categories, the classification decision is made in favour of the category which has the lowest average distance between its set of SVs and the new data point. In other words, the new input data point is labeled with the category whose SVs have the lowest average distance to the new data point. Table 1 illustrates the algorithm of the Euclidean-SVM classification approach.
By combining the SVM training algorithm with the Euclidean distance function for making the classification decision, the impact of the kernel function and parameters on the classification accuracy can be minimized. This is due to the fact that the optimal separating hyper-plane, whose construction is highly dependent on the kernel function and parameters, is replaced by the Euclidean distance function. Since the Euclidean distance function can perform its classification decision-making task sufficiently well as long as both the training data points (support vectors) and the new data points to be classified are mapped into the same vector space, the transformation of the existing vector space into a higher-dimensional feature space by a kernel function is not needed during the classification phase, and hence does not have a great impact on the classification performance. In other words, the problem of selecting the right combination of kernel function and parameters for the classifier does not exist once the optimal separating hyper-plane is replaced by the Euclidean distance function. As shown by the experimental results obtained in this paper, the classification performance of the Euclidean-SVM is comparable to that of the conventional SVM, without needing the selection and implementation of an appropriate combination of kernel function and parameters.
Table 1
Algorithm of the Euclidean-SVM classification approach.

Training stage:
1. Map all the training data points into the vector space of a SVM.
2. Determine and capture the set of support vectors for each of the categories using the SVM training algorithm, and eliminate the rest of the training data points which are not identified as support vectors.
3. Map all the support vectors into the original vector space.

Testing stage:
1. Map the new unlabeled data point into the same original vector space as all the support vectors.
2. Adopt the Euclidean distance function to compute the average distances between the new data point and each of the sets of support vectors from the different categories.
3. Determine the category which has the lowest average distance between its set of support vectors and the newly inserted data point.
4. Generate the classification result for the new data point based on the identified category.
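The two stages of Table 1 can be sketched as a small classifier. This is our own illustrative implementation, not the authors' code: it assumes scikit-learn's SVC for the training stage (only to identify the support vectors) and then applies the average-distance rule in the original input space, so the kernel and C influence only which points become support vectors:

```python
import numpy as np
from sklearn.svm import SVC

class EuclideanSVM:
    """Sketch of the two-stage Euclidean-SVM procedure of Table 1."""

    def __init__(self, kernel="rbf", C=1.0):
        # Kernel and C affect only which points become support vectors,
        # not the distance-based decision rule used at testing time.
        self.svm = SVC(kernel=kernel, C=C)

    def fit(self, X, y):
        # Training stage: run an ordinary SVM and keep only the support
        # vectors of each category, expressed in the original vector space.
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.svm.fit(X, y)
        sv, sv_y = self.svm.support_vectors_, y[self.svm.support_]
        self.sv_by_class = {c: sv[sv_y == c] for c in np.unique(y)}
        return self

    def predict(self, X):
        # Testing stage: label each point with the category whose support
        # vectors have the lowest average Euclidean distance to it.
        labels = []
        for x in np.atleast_2d(np.asarray(X, dtype=float)):
            avg = {c: np.mean(np.linalg.norm(svs - x, axis=1))
                   for c, svs in self.sv_by_class.items()}
            labels.append(min(avg, key=avg.get))
        return np.array(labels)
```

Because the decision rule never evaluates the kernel at testing time, swapping the kernel argument changes only the set of retained support vectors, which is the mechanism behind the kernel-insensitivity reported in Section 4.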
Fig. 5. Full circumferential corrosion defect: (i) position; (ii) actual picture.
Table 2
Arrangement of data points for simulation of a continuous signal.

Depth of defect (mm)    Sample number
                        Start    End
0                       1        500
1                       501      1000
2                       1001     1500
3                       1501     2000
4                       2001     2500
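The noisy variants of the dataset used in the experiments (Tables 5-9) can be produced by corrupting a clean signal with additive white Gaussian noise scaled to a target signal-to-noise ratio. The scaling formula below is the standard SNR definition, not a procedure taken from the paper, and the sine wave is only a stand-in for a real LRUT trace:

```python
import numpy as np

def add_noise(signal, snr_db, rng=None):
    """Corrupt a clean signal with white Gaussian noise at a target SNR (dB)."""
    if rng is None:
        rng = np.random.default_rng()
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)               # average signal power
    p_noise = p_signal / (10 ** (snr_db / 10))    # noise power implied by SNR
    noise = rng.normal(0.0, np.sqrt(p_noise), signal.shape)
    return signal + noise

t = np.linspace(0.0, 1.0, 500)
clean = np.sin(2 * np.pi * 30 * t)   # illustrative stand-in for one trace
noisy = add_noise(clean, snr_db=5)   # a 5 dB SNR version, as in Table 5
```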
tion rate. In conclusion, by implementing the Euclidean-SVM approach to construct a classification framework for pipeline failure prediction, we obtain a pipeline failure prediction system with better performance, in which the accuracy is comparable with, or even better than, that of the conventional pipeline failure prediction system using the SVM approach, while being immune to the problem of determining the appropriate kernel function and parameters for the classifier.
4. Experiments and evaluations
Table 3
List of datasets used in the experiments.

1. Original dataset (0 dB SNR)
2. Dataset with 5 dB SNR
3. Dataset with 2 dB SNR
4. Dataset with 0.01 dB SNR
5. Dataset with 1 dB SNR
6. Dataset with 10 dB SNR
Table 4
Classification accuracies (%) of the SVM classifier and the Euclidean-SVM classifier with different types of kernel function and different values of parameter C on the original dataset (0 dB SNR).

Classification approach (kernel function)      –       10      100     1000    10000   Variance var(x)
SVM (Linear)                                 99.30    99.30    99.30    99.30    99.30    0
SVM (Polynomial)                             99.25    99.25    99.25    99.25    99.25    0
SVM (RBF)                                    22.10    23.70    23.27    23.70    23.70    0.4802
SVM (Sigmoid)                                28.75    28.85    29.60    29.50    30.60    0.5493
Euclidean-SVM (Linear)                       92.30    92.30    92.30    92.30    92.30    0
Euclidean-SVM (Polynomial)                   97.30    97.30    97.30    97.30    97.30    0
Euclidean-SVM (RBF)                          98.95    98.95    98.95    98.95    98.95    0
Euclidean-SVM (Sigmoid)                      99.10    99.10    99.10    99.10    99.10    0
Table 5
Classification accuracies (%) of the SVM classifier and the Euclidean-SVM classifier with different types of kernel function and different values of parameter C on the dataset with 5 dB SNR.

Classification approach (kernel function)      –       10      100     1000    10000   Variance var(x)
SVM (Linear)                                 44.00    44.50    44.35    44.30    44.55    0.0467
SVM (Polynomial)                             48.05    48.05    48.05    48.05    48.05    0
SVM (RBF)                                    20.00    20.00    20.00    20.00    20.00    0
SVM (Sigmoid)                                35.10    37.50    36.25    36.15    37.00    0.8362
Euclidean-SVM (Linear)                       50.70    50.55    50.65    50.65    50.65    0.0030
Euclidean-SVM (Polynomial)                   51.65    51.65    51.65    51.65    51.65    0
Euclidean-SVM (RBF)                          51.55    51.55    51.55    51.55    51.55    0
Euclidean-SVM (Sigmoid)                      51.40    50.60    50.75    51.15    50.70    0.1157

The consistency of each classifier across the values of parameter C is measured by the sample variance of its accuracies:

s^2 = \frac{ \sum_{i=1}^{n} (X_i - \bar{X})^2 }{ n - 1 }

4.1. Experiment on Original Dataset (0 dB SNR)

As illustrated in Table 4, the choice of kernel function greatly affects the performance of the conventional SVM classifier on the original dataset. The SVM classifier with the linear kernel and the SVM classifier with the polynomial kernel recorded high accuracies of 99.30% and 99.25% respectively. However, the conventional SVM classifier performs badly with the RBF and sigmoid kernels.
In this experiment, the SVM classifier with the RBF kernel achieved its best accuracy at 23.70%, while the highest accuracy of the SVM classifier with the sigmoid kernel was recorded at 30.60%. The great difference in the accuracy of the SVM classifier with different kernel functions shows that the conventional SVM classifier requires the implementation of an appropriate kernel function in order to obtain high classification performance and to guarantee good generalization ability. An inappropriate choice of kernel function leads to low classifier performance. The experimental results here show that the accuracy of the conventional SVM classifier is highly dependent on the kernel function implemented.
The experiments on the Euclidean-SVM classifier recorded a more consistent performance across the different kernel functions, as compared to the conventional SVM. The lowest classification accuracy of the Euclidean-SVM, 92.30%, was recorded with the linear kernel function, while the Euclidean-SVM classifier with the sigmoid kernel achieved the highest accuracy of 99.10% in this experiment. Hence, it can be seen that the choice of kernel function has minimal impact on the performance of the Euclidean-SVM classifier.
In this experiment, the performance of both the conventional SVM classifier and the Euclidean-SVM classifier is almost immune to the soft margin parameter, C. According to Table 4, the variances of the classification accuracies across the values of parameter C for the SVM classifier with the RBF kernel and the SVM classifier with the sigmoid kernel were recorded at 0.4802 and 0.5493 respectively. On the other hand, the soft margin parameter has no impact on the classification performance of the Euclidean-SVM classifiers. These results again prove that the Euclidean-SVM approach has a lower dependence on the kernel function and soft margin parameter than the conventional SVM approach.
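The variance figures in the last column of Tables 4-9 are the sample variance (with n - 1 in the denominator) of each classifier's accuracies across the tested values of C. Recomputing them for the two kernel-sensitive SVM rows of Table 4 reproduces the tabulated figures; NumPy is assumed here as the computation tool:

```python
import numpy as np

# Accuracies (%) of the conventional SVM across the tested values of C (Table 4).
rbf     = [22.10, 23.70, 23.27, 23.70, 23.70]
sigmoid = [28.75, 28.85, 29.60, 29.50, 30.60]

# Sample variance s^2 = sum (x_i - mean)^2 / (n - 1), i.e. ddof=1.
print(np.var(rbf, ddof=1))      # ~0.4802, matching Table 4
print(np.var(sigmoid, ddof=1))  # ~0.5493, matching Table 4
```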
Table 6
Classification accuracies (%) of the SVM classifier and the Euclidean-SVM classifier with different types of kernel function and different values of parameter C on the dataset with 2 dB SNR.

Classification approach (kernel function)      –       10      100     1000    10000   Variance var(x)
SVM (Linear)                                 59.40    59.50    59.45    59.50    59.50    0.0020
SVM (Polynomial)                             60.10    60.10    60.10    60.10    60.10    0
SVM (RBF)                                    20.00    20.00    20.00    20.00    20.00    0
SVM (Sigmoid)                                37.70    47.55    36.60    36.95    36.90    22.2667
Euclidean-SVM (Linear)                       62.35    61.90    61.55    61.55    62.05    0.1170
Euclidean-SVM (Polynomial)                   66.45    66.45    66.45    66.45    66.45    0
Euclidean-SVM (RBF)                          66.95    66.95    66.95    66.95    66.95    0
Euclidean-SVM (Sigmoid)                      65.55    66.05    65.95    66.00    66.10    0.0482
Table 7
Classification accuracies (%) of the SVM classifier and the Euclidean-SVM classifier with different types of kernel function and different values of parameter C on the dataset with 0.01 dB SNR.

Classification approach (kernel function)      –       10      100     1000    10000   Variance var(x)
SVM (Linear)                                 65.00    63.75    63.65    63.50    63.50    0.4033
SVM (Polynomial)                             68.90    68.90    68.90    68.90    68.90    0
SVM (RBF)                                    20.00    20.00    20.00    20.00    20.00    0
SVM (Sigmoid)                                29.00    27.65    27.55    28.55    27.65    0.4295
Euclidean-SVM (Linear)                       71.85    72.85    72.60    72.55    72.55    0.1395
Euclidean-SVM (Polynomial)                   73.55    73.55    73.55    73.55    73.55    0
Euclidean-SVM (RBF)                          74.65    74.65    74.65    74.65    74.65    0
Euclidean-SVM (Sigmoid)                      74.70    74.65    75.10    75.15    75.15    0.0637
Table 8
Classification accuracies (%) of the SVM classifier and the Euclidean-SVM classifier with different types of kernel function and different values of parameter C on the dataset with 1 dB SNR.

Classification approach (kernel function)      –       10      100     1000    10000   Variance var(x)
SVM (Linear)                                 75.05    72.85    72.65    72.80    72.80    1.0407
SVM (Polynomial)                             72.95    72.95    72.95    72.95    72.95    0
SVM (RBF)                                    20.00    20.00    20.00    20.00    20.00    0
SVM (Sigmoid)                                28.70    29.75    29.80    29.75    29.80    0.2318
Euclidean-SVM (Linear)                       77.95    75.45    75.30    75.30    75.30    1.3693
Euclidean-SVM (Polynomial)                   75.60    75.60    75.60    75.60    75.60    0
Euclidean-SVM (RBF)                          75.40    75.40    75.40    75.40    75.40    0
Euclidean-SVM (Sigmoid)                      77.10    76.65    76.65    76.65    76.65    0.0405
Table 9
Classification accuracies (%) of the SVM classifier and the Euclidean-SVM classifier with different types of kernel function and different values of parameter C on the dataset with 10 dB SNR.

Classification approach (kernel function)      –       10      100     1000    10000   Variance var(x)
SVM (Linear)                                 98.25    98.25    98.25    98.25    98.25    0
SVM (Polynomial)                             98.55    98.55    98.55    98.55    98.55    0
SVM (RBF)                                    20.80    22.80    22.80    22.80    22.80    0.8000
SVM (Sigmoid)                                23.10    24.10    23.90    24.00    23.75    0.1570
Euclidean-SVM (Linear)                       95.60    95.60    95.60    95.60    95.60    0
Euclidean-SVM (Polynomial)                   97.55    97.55    97.55    97.55    97.55    0
Euclidean-SVM (RBF)                          96.25    96.25    96.25    96.25    96.25    0
Euclidean-SVM (Sigmoid)                      98.00    98.40    98.40    98.40    98.40    0.0320
kernel, the accuracies were only recorded in the range from 20.80% to 24.10%.
On the other hand, the implementation of different kernel functions and different values of parameter C does not have a high impact on the Euclidean-SVM classification approach, which achieved classification accuracies between 95.60% and 98.40% across the different kernels and values of parameter C. The results in this experiment further justify that the Euclidean-SVM has a lower dependency on the kernel function and the value of parameter C than the conventional SVM.
4.7. Discussion on experimental results
Based on the results obtained from the series of experiments on the datasets with different SNR levels, it can be observed that the Euclidean-SVM classification approach has a lower dependency on the type of kernel and the value of the soft margin parameter than the conventional SVM classification approach. In most cases, the Euclidean-SVM approach outperforms the conventional SVM approach in terms of classification accuracy, as well as performance consistency across different combinations of kernel and parameter C. By performing the classification tasks using the Euclidean-SVM approach, high accuracies can be obtained without transforming the original vector space into a high-dimensional feature space using kernel functions. This is due to the fact that the Euclidean-SVM approach uses Euclidean distance as the decision-making function of the classification framework. As the Euclidean-SVM approach does not use the optimal separating hyper-plane as the decision surface, the implementation of kernel functions to transform the original input space into a high-dimensional feature space has only minimal impact on the performance of the Euclidean-SVM classification framework. The Euclidean distance function used in the Euclidean-SVM approach can perform effective classification decision making as long as all the training data points (the SVs) and the input data points are mapped into the same vector space. The Euclidean-SVM