International Journal of Advanced Network, Monitoring and Controls      Volume 05, No.01, 2020 

DOI: 10.21307/ijanmc-2020-007

Research on the Tunnel Geological Radar Image Flaw Detection Based on CNN

 
Li He 

School of Computer Science and Engineering 

Xi’an Technological University 

Xi’an, China  

E-mail: 1003294436@qq.com 

 
Wang Yubian 

Department of Railway Transportation Control 

Belarusian State University of Transport 

34, Kirova street, Gomel, 246653 

Republic of Belarus 

E-mail: alika_wang@mail.ru 

 
Abstract-Tunnel geological radar imaging is widely used in tunnel engineering quality inspection because it is fast, nondestructive and continuous, and offers real-time imaging, intuitive data processing and high detection accuracy. However, the traditional defect detection method, in which surveyors judge the images visually, is labor-intensive. To inspect tunnel engineering quality accurately and quickly, this paper proposes an improved void defect detection method based on the deep learning model Faster RCNN (Regional Convolutional Neural Network). Tunnel geological radar images are collected and annotated, which fills the gap left by the lack of a defect data set in tunnel engineering. The proposed method optimizes feature extraction to improve the performance of the detection model, and the detection accuracy of the model is verified against expert knowledge.

Keywords-Tunnel Geological; Radar Image; Flaw Detection; 

CNN 

I. INTRODUCTION 

Tunnel projects make up a large share of construction work. Quality defects in a tunnel may delay the construction schedule, increase engineering cost, damage mechanical equipment and even endanger the lives of construction workers. The geological radar detection method[1-2] is currently the mainstream method of tunnel lining inspection, and performs well in detecting reinforcement and arch spacing, plain concrete structure, etc.[3]. Traditionally, however, on-site surveyors examine the images generated by the radar equipment one by one, relying on their expert knowledge. This approach involves a heavy workload, depends strongly on human judgment, and suffers from a certain rate of omissions and errors.

In recent years, with the continuous improvement of GPUs, the field of deep learning has boomed. In 2006, Hinton et al.[4] proposed the concept of deep learning, using deep belief networks to learn features from data. In the 2012 ImageNet image classification competition, Alex Krizhevsky's team proposed the deep convolutional neural network AlexNet[5]. AlexNet won the competition with a top-5 error rate of 15.3%, more than 10 percentage points lower than that of the runner-up, which deepened the understanding of how convolutional neural networks can be applied to visual tasks. Girshick R proposed the regional convolutional neural network (RCNN) model, which uses the Selective Search method to select candidate regions and uses multiple



support vector machines to classify features, thus 

achieving target detection. 

In 2015, Girshick R proposed Fast RCNN[6], an improved version of RCNN that adopts RoI pooling to share the most computationally expensive parts and thus improve the efficiency of the whole model. Later, to address the slow extraction of candidate regions, Ren et al. introduced the RPN layer on top of this network and designed the Faster RCNN model[7], which achieved good results. Compared with hand-crafted features, features extracted by convolutional neural networks are more robust and carry richer semantic information, and great achievements have been made in fields such as face recognition[8-9], target detection, and speech recognition[10-11].

This paper selects Faster RCNN network as the 

basic algorithm framework for tunnel GPR (Ground 

Penetrating Radar) detection. The framework of Faster 

RCNN network is introduced. However, if the original 

Faster RCNN model is directly applied to the tunnel 

GPR detection in the actual scene, there may be two 

disadvantages: 

1) The data sets collected on site for training are relatively small, which may lead to incomplete learning and make the model prone to overfitting[12].

2) There are many interference factors in the tunnel, resulting in complex image features of the defects. In addition, the radar images are all acquired manually by field surveyors relying on experience, without a uniform standard, so image sharpness varies considerably. This causes the RPN to produce more negative samples and makes it difficult for the network model to converge[13].

For these reasons, this paper improves the Faster RCNN model in three ways: the original data set is expanded by data augmentation[14]; GA-RPN[15] is adopted to improve candidate region generation; and GIoU[16] is used as the evaluation index to optimize bounding box regression. These changes are intended to overcome the above shortcomings and further improve the accuracy of tunnel geological radar image detection.

II. KEY TECHNOLOGIES AND EQUIPMENT 

As a relatively mature geophysical prospecting method, the geological radar method offers high resolution, fast detection, nondestructive operation and visual radar images, and has become the most important method for tunnel lining quality inspection. Defects of the tunnel lining, such as local un-compactness, voids, insufficient thickness and lack of reinforcement, produce obvious anomalous responses in the radar image.

A. Technical principle 

Different defects have different reflections in radar images. The electromagnetic waves emitted by GPR are reflected and refracted at the interface between media with different dielectric constants[17]. The dielectric constants of common materials are shown in Table I below.

TABLE I. DIELECTRIC CONSTANTS OF COMMON MATERIALS

Material           Dielectric constant    Velocity (mm/ns)
Air                1                      300
Water              81                     30
Concrete           5-8                    55-120
Sand (dry)         3-6                    120-170
Sand (wet)         25-30                  55-60
Asphalt (pitch)    3-5                    134-173

 

The reflection and refraction of the waves obey the laws of reflection and refraction. The energies of the reflected and refracted waves depend on the reflection coefficient R and the refraction coefficient T, where R is given by:

 
$$R = \frac{\sqrt{\varepsilon_1} - \sqrt{\varepsilon_2}}{\sqrt{\varepsilon_1} + \sqrt{\varepsilon_2}} \qquad (1)$$

In the above equation, ε1 and ε2 are the relative permittivities of the upper and lower media at the interface, respectively. According to Formula (1), when the electromagnetic wave reaches an interface across which the dielectric constant changes, the reflected energy changes, which appears in the radar image as positive and negative peak anomalies. Lining defects such as local un-compactness, voids, insufficient thickness and lack of reinforcement differ markedly from concrete in dielectric properties, which provides a good geophysical foundation for the application of GPR.
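As a small worked example of Formula (1), the sketch below evaluates the reflection coefficient at the two interfaces of an air-filled void. It assumes the usual normal-incidence form R = (√ε1 - √ε2)/(√ε1 + √ε2); the function name and the value ε = 6 for concrete (taken from the middle of the 5-8 range in Table I) are illustrative choices, not values given by the paper.

```python
import math


def reflection_coefficient(eps_upper: float, eps_lower: float) -> float:
    """Formula (1), assumed form: R = (sqrt(e1) - sqrt(e2)) / (sqrt(e1) + sqrt(e2))."""
    s1, s2 = math.sqrt(eps_upper), math.sqrt(eps_lower)
    return (s1 - s2) / (s1 + s2)


EPS_CONCRETE = 6.0  # assumption: mid-range value for concrete (Table I lists 5-8)
EPS_AIR = 1.0       # Table I

# Upper interface of a void: the wave travels from concrete into air.
r_top = reflection_coefficient(EPS_CONCRETE, EPS_AIR)
# Lower interface of a void: the wave travels from air back into concrete.
r_bottom = reflection_coefficient(EPS_AIR, EPS_CONCRETE)

print(f"concrete -> air: R = {r_top:+.3f}")
print(f"air -> concrete: R = {r_bottom:+.3f}")
```

The two coefficients have equal magnitude and opposite sign, which is why a void is expected to show up as a pair of reflections of opposite polarity.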

B. Equipment 

The GPR detection equipment used in the project is shown in Figure 1.

 
Figure 1. GPR system detection equipment 

The geological radar method uses a transmitting antenna to emit high-frequency electromagnetic waves from the concrete surface into the concrete in the form of pulses. The waves are reflected at concrete interfaces and defects and return to the surface, where the receiving antenna picks up the echo signal. The radar system processes the signal into a radar image, and the image is then analyzed and interpreted to achieve nondestructive quality testing of the lining. The detection principle is shown in Figure 2.

 
Figure 2. Principles of geological radar detection 



C. Radar imaging profile

Profiles of GPR images are usually recorded in the 

form of pulse reflected waves, or in the form of gray or 

color profiles. In this way, the in-phase axis or the 

iso-gray line can be used to represent the underground 

reflector or the target body. On the waveform recording 

chart, waveforms are recorded in the vertical direction 

of the survey line at all measurement points, forming 

the radar imaging profile. 

In tunnel lining inspection, voids mainly appear between the inner lining, the second lining and the surface layer. Because of the large dielectric difference between a void and the surrounding media, the electromagnetic wave generates two strong reflections, one at the upper interface (concrete to air) and one at the lower interface (air to surrounding rock). Because electromagnetic waves attenuate little in air and the dielectric difference between concrete and air is large, strong multiple reflections are produced inside the void, and the waves propagating in air retain relatively high frequencies.

At the upper interface of a void, the electromagnetic wave passes from concrete into air. According to the law of reflection (Formula 1), the reflection coefficient is positive, showing a negative wave peak. At the lower interface of the void, the wave passes from air into concrete, and the reflection coefficient is negative, showing a positive wave peak. The resulting image is shown in Figure 3.

In the image, white is the strongest color of positive reflection and black is the strongest color of negative reflection. In theory, a void defect produces two sets of reflected signals in the radar image. In the actual survey, the second reflection signal may be lost due to signal interference.

 
Figure 3. GPR image of a tunnel void

III. CONVOLUTIONAL NEURAL NETWORKS  

Convolutional Neural Networks (CNN) use local operations to abstract image representations hierarchically in image recognition[18]. Two key design ideas have driven the success of convolutional architectures in computer vision. First, a CNN exploits the 2D structure of the image, in which pixels in adjacent areas are usually highly correlated. Therefore, instead of connecting every pixel unit to every other unit (as fully connected neural networks do), a CNN uses grouped local connections. Second, the CNN architecture relies on weight sharing, so each channel (that is, each output feature map) is generated by convolution with the same filter at all locations.
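The two ideas above, local connections and a filter shared across all locations, can be illustrated with a minimal sketch in plain Python. The toy image, the 1x2 difference filter and the function name are assumptions made for illustration; as in most deep learning frameworks, the operation is implemented as cross-correlation without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over a 2D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends only on a small local window.
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
kernel = [[-1, 1]]  # horizontal difference filter, reused at every location

feature_map = conv2d_valid(image, kernel)

# Weight sharing: the filter has 2 weights, while a fully connected layer
# mapping the same 16 inputs to the same 12 outputs would need 16 * 12.
shared_weights = len(kernel) * len(kernel[0])
dense_weights = (4 * 4) * (len(feature_map) * len(feature_map[0]))
print(feature_map, shared_weights, dense_weights)
```

The feature map responds only where the edge is, and the same two weights produce that response in every row.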

Deep learning detection methods based on convolutional neural networks are mainly divided into single-stage (e.g., SSD, YOLO) and two-stage (e.g., the RCNN series). Single-stage methods generate detections directly on the picture through



calculation. Two-stage methods extract proposals first, and then refine them in a second step. By comparison, single-stage methods are faster but less precise. This paper uses the two-stage Faster RCNN because of its higher accuracy[19].

This paper adopts Faster RCNN as the basic framework, as shown in Figure 4. Faster RCNN can be divided into four main parts:

 
[Figure 4 labels: tunnel geological radar image (input), conv layers, feature maps, Region Proposal Network, proposals, RoI pooling, classifier]

Figure 4. Structure chart of Faster RCNN

 
1) Convolutional layers. They extract the features of defects on the GPR image. The input is the whole image and the output is the extracted features, called feature maps.

2) Region Proposal Network. It recommends candidate regions and replaces the earlier Selective Search. The input is an image (because the RPN and Fast RCNN share the same CNN, the input can also be regarded as the feature maps), and the output is multiple candidate regions, which screen out likely void features and undergo a preliminary bounding box regression. The RPN shares the convolutional features of the whole image with the detection network, removes the speed bottleneck of the original Selective Search method, and greatly improves the speed of target detection.

3) RoI pooling. It converts inputs of different sizes into outputs of fixed length; its input and output are the same as those of RoI pooling in Fast RCNN.

4) Classification and regression. The fully connected layers output the class of each candidate region and its exact location in the image.
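The RoI pooling step in item 3 can be sketched as follows. This is a simplified single-channel version in plain Python, assuming integer RoI coordinates given as inclusive feature-map cell indices and a floor/ceil bin split; real implementations work on multi-channel tensors and scaled image coordinates. The function names are illustrative.

```python
def _bins(start, length, n_bins):
    """Split [start, start + length) into n_bins index ranges.

    Each bin starts at the floor and ends at the ceiling of its exact
    boundary, so every bin contains at least one cell.
    """
    return [
        (start + (i * length) // n_bins,
         start + -(-((i + 1) * length) // n_bins))
        for i in range(n_bins)
    ]


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region roi = (x1, y1, x2, y2) into an out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    row_bins = _bins(y1, y2 - y1 + 1, out_h)
    col_bins = _bins(x1, x2 - x1 + 1, out_w)
    return [
        [max(feature_map[y][x] for y in range(ys, ye) for x in range(xs, xe))
         for xs, xe in col_bins]
        for ys, ye in row_bins
    ]


fmap = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

pooled_a = roi_max_pool(fmap, (0, 0, 3, 3))  # 4x4 region
pooled_b = roi_max_pool(fmap, (1, 0, 5, 2))  # 5x3 region
print(pooled_a)
print(pooled_b)
```

Both regions, although of different sizes, are reduced to the same 2x2 output, which is what allows the fully connected layers that follow to accept proposals of any shape.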

The whole process is shown in Figure 5.  



 
Figure 5. Structure chart of RPN  

IV. GUIDED ANCHORING RPN (GA-RPN)

The anchor is an important cornerstone of current target detection methods. Most good target detection algorithms rely on the anchor mechanism to sample locations in space evenly with predefined sizes. The traditional two-stage anchor-based method has two major problems when generating anchors:

1) Low efficiency. The existing method slides windows over the feature map and generates thousands of anchors. However, a picture contains only a few objects, which leads to too many negative samples.

2) Unreasonable a priori assumptions. When generating anchors, the scale and aspect ratio are assumed to take a few fixed values, but suitable values tend to change with the dataset.
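To give a sense of the numbers behind the first problem, the sketch below enumerates sliding-window anchors on a feature map. The feature map size (38x50), stride (16), three scales and three aspect ratios are assumed values typical of Faster RCNN configurations, not settings reported in this paper.

```python
import math
from itertools import product


def sliding_window_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return (cx, cy, w, h) anchors for every feature-map cell.

    scale is sqrt(w * h) in image pixels and ratio is h / w.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Center of the cell, mapped back to image coordinates.
        cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            anchors.append((cx, cy, w, h))
    return anchors


# Assumed configuration: roughly a 600x800 image at stride 16.
anchors = sliding_window_anchors(
    feat_h=38, feat_w=50, stride=16,
    scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0),
)
print(len(anchors))  # 38 * 50 * 9 anchors
```

With only a handful of voids per radar image, nearly all of these anchors end up as negative samples, and every one of them is restricted to the nine predefined shapes.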

To address these problems, Wang et al. proposed guided anchoring in 2019[15]. The guided anchor mechanism works as follows: the position and shape of a target can be represented by (x, y, w, h), where (x, y) is the coordinate of the object's position in space and w and h are its width and height.

If the box of the object is drawn on a given input image I, the following distribution can be obtained:

$$p(x, y, w, h \mid I) = p(x, y \mid I)\, p(w, h \mid x, y, I) \qquad (2)$$

Two important pieces of information can be obtained from the above equation: (1) objects occupy specific regions of the image; (2) the size and aspect ratio of an object are closely related to its location. The anchor generation model is therefore designed with two branches: one for location prediction and one for shape prediction.

A. Anchor generation module

This module includes location prediction. The goal is to predict which regions should serve as center points for generating anchors, which is a binary classification problem: predicting whether or not a location is the center of an object. Two branches are added to predict the confidence of each pixel (with its corresponding receptive field) on the feature map, as well as the corresponding width and height.

A location is treated as an anchor center if its confidence exceeds a given threshold. Obviously, this way of obtaining proposals differs from the sliding window and greatly reduces the number of negative samples (at most one proposal is generated for each pixel of the feature map).

In addition, since the width and height are also regressed by the CNN, no prior assumptions are made about the scale or aspect ratio of the object.
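The anchor generation described above can be sketched as a threshold on the location branch followed by a decoding of the shape branch. The inputs here are hand-written toy values standing in for network outputs, and the exponential decoding w = sigma * stride * exp(dw) with sigma = 8 is an assumption in the spirit of [15]; this paper does not state which mapping it uses.

```python
import math


def guided_anchors(loc_prob, shape_pred, stride=16, sigma=8.0, threshold=0.5):
    """Keep at most one anchor per feature-map cell.

    loc_prob[r][c]   : predicted probability that cell (r, c) is an object center
    shape_pred[r][c] : predicted (dw, dh) for that cell
    The mapping from (dw, dh) to (w, h) is an assumed decoding, see lead-in.
    """
    anchors = []
    for r, row in enumerate(loc_prob):
        for c, p in enumerate(row):
            if p <= threshold:
                continue  # cell is not a likely center: no anchor generated
            dw, dh = shape_pred[r][c]
            w = sigma * stride * math.exp(dw)
            h = sigma * stride * math.exp(dh)
            anchors.append(((c + 0.5) * stride, (r + 0.5) * stride, w, h))
    return anchors


# Toy 2x3 outputs of the two branches (hypothetical values).
loc_prob = [
    [0.10, 0.90, 0.20],
    [0.05, 0.30, 0.70],
]
shape_pred = [
    [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (0.0, 0.0), (0.5, -0.5)],
]

anchors = guided_anchors(loc_prob, shape_pred)
print(anchors)
```

Only two of the six cells produce an anchor, and the second anchor is wider than it is tall because its shape is predicted rather than drawn from a fixed list.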

B. Feature adaption module

This module borrows the idea of deformable convolution. First, the width and height at each point are obtained from the anchor generation module and represented as a 2-channel feature map. The offset field is then obtained by applying a further convolution to this 2-channel feature map. Finally, the offset field is applied to the feature map.

V. IMPROVED IOU ALGORITHM

In target detection, accurate localization depends on the regression of bounding box coordinates. IoU (Intersection-over-Union) is an important concept in target detection. In anchor-based methods, it is used not only to separate positive samples from negative samples, but also to evaluate the distance between the predicted box and the ground truth, that is, the accuracy of the predicted box.

One attractive property of IoU is that it is insensitive to scale. In the regression task, IoU is the most direct indicator of the distance between the predicted box and the ground truth, but the regression loss normally used is not well matched to it: the loss value does not directly reflect the quality of the regression, whereas IoU takes different values in different situations and reflects it directly. However, there are two problems with using IoU directly as the loss function:

A. If the two boxes do not intersect, then by definition IoU=0, which does not reflect the distance between them. At the same time, the gradient of the loss is zero, so nothing is propagated back and no learning takes place.

B. IoU cannot accurately reflect how the two boxes overlap. As shown in Figure 6 below, IoU is equal in all three cases, but the degree of coincidence differs. The case on the left has the best regression effect, while the case on the right has the worst.

 
Figure 6. Border regressions with the same IoU

To solve the above problems, Rezatofighi et al. proposed GIoU in 2019[16]. The algorithm is as follows:

Algorithm 1: Generalized Intersection over Union

Input: Two arbitrary shapes $A, B \subseteq S \in \mathbb{R}^n$

Output: GIoU

1. Find the smallest enclosing shape $C$ that surrounds $A$ and $B$, where $C \subseteq S \in \mathbb{R}^n$

2. $IoU = \dfrac{|A \cap B|}{|A \cup B|}$

3. $GIoU = IoU - \dfrac{|C \setminus (A \cup B)|}{|C|}$


C. Characteristics of GIoU  

1) Similar to IoU, GIoU is also a distance measure. As a loss function, L_{GIoU}=1-GIoU meets the basic requirements of a loss function.

2) GIoU is insensitive to scale.

3) GIoU is a lower bound of IoU, and as the two boxes approach complete overlap, GIoU converges to IoU.

4) The value range of IoU is [0,1], while that of GIoU is the symmetric interval [-1,1]. According to the above formula, GIoU is always less than or equal to IoU. When the two shapes coincide completely, GIoU=IoU=1. When the two shapes do not overlap, IoU=0, and GIoU approaches -1 as they move farther apart.

5) Unlike IoU, which only considers the overlapping area, GIoU also takes the non-overlapping areas into account, so it better reflects the degree of coincidence of the two shapes. Since GIoU introduces the shape C that encloses both A and B, it can still be optimized when A and B do not overlap.
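The properties listed above can be checked numerically with a minimal sketch of IoU and GIoU for axis-aligned boxes. Boxes are assumed to be given as (x1, y1, x2, y2) with positive area, and the smallest enclosing shape C is taken to be the smallest enclosing axis-aligned box; the function names are illustrative.

```python
def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); zero if the box is empty."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou_and_giou(a, b):
    """Return (IoU, GIoU) of two boxes with positive area."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    # Smallest axis-aligned box C enclosing both a and b.
    enclosing = area((min(a[0], b[0]), min(a[1], b[1]),
                      max(a[2], b[2]), max(a[3], b[3])))
    iou = inter / union
    giou = iou - (enclosing - union) / enclosing
    return iou, giou


def giou_loss(a, b):
    """L_GIoU = 1 - GIoU."""
    return 1.0 - iou_and_giou(a, b)[1]


gt = (0.0, 0.0, 2.0, 2.0)
same = iou_and_giou(gt, gt)
partial = iou_and_giou(gt, (1.0, 1.0, 3.0, 3.0))
near = iou_and_giou(gt, (3.0, 0.0, 5.0, 2.0))    # disjoint, close
far = iou_and_giou(gt, (10.0, 0.0, 12.0, 2.0))   # disjoint, far away

for name, (iou, giou) in [("same", same), ("partial", partial),
                          ("near", near), ("far", far)]:
    print(f"{name:8s} IoU={iou:.3f} GIoU={giou:.3f}")
```

For the two disjoint boxes IoU is 0 in both cases, while GIoU is lower for the more distant one, so a loss based on GIoU still distinguishes them.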

VI. EXPERIMENTAL METHODS 

A. Data augmentation

Compared with conventional images, the number of GPR images available for deep learning training is generally small, and the lack of data leads to overfitting of the model. To ensure the generalization ability and recognition performance of the trained model, it is important to improve training by means of data augmentation. The small manually labeled data set was taken as the initial sample set, rotation and compound rotation were applied to it, and all the resulting images were taken as the training data set after augmentation.
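A minimal sketch of rotation-based augmentation is given below. The paper does not state which angles were used, so the example is restricted to multiples of 90 degrees, for which both the image and its annotation boxes can be transformed exactly. Images are nested lists of pixel rows and boxes are (x1, y1, x2, y2) in pixel-edge coordinates; these conventions and the function names are assumptions.

```python
def rotate90_ccw(image, boxes):
    """Rotate an image and its boxes by 90 degrees counterclockwise."""
    width = len(image[0])
    # Transpose, then reverse the row order: new[i][j] = old[j][width - 1 - i].
    rotated = [list(col) for col in zip(*image)][::-1]
    # A point (x, y) maps to (y, width - x); the box corners are re-sorted.
    new_boxes = [(y1, width - x2, y2, width - x1) for x1, y1, x2, y2 in boxes]
    return rotated, new_boxes


def augment_by_rotation(image, boxes):
    """Return the original sample plus its 90, 180 and 270 degree rotations."""
    samples = [(image, boxes)]
    for _ in range(3):
        image, boxes = rotate90_ccw(image, boxes)
        samples.append((image, boxes))
    return samples


# Toy 2x3 image with one bright pixel and a box drawn tightly around it.
image = [
    [0, 0, 9],
    [0, 0, 0],
]
boxes = [(2, 0, 3, 1)]

samples = augment_by_rotation(image, boxes)
for img, bxs in samples:
    print(img, bxs)
```

Each labeled image yields four training samples whose boxes still enclose the same pixels; rotation by an arbitrary angle would additionally require interpolation and re-fitting of the boxes.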

B. Training methods 

After data augmentation, the images of plain concrete and reinforced concrete were fed into the neural network to train the model. The Faster RCNN model is trained with the alternating optimization method, which is divided into four steps.

1) Stage1_rpn_train.pt

The RPN is trained separately. The model is initialized with an ImageNet-pretrained model, and the parameters are adjusted in an end-to-end manner. backbone+rpn+fast rcnn --> backbone1+rpn1+fast rcnn; the backbone and rpn parameters are updated.

2) Stage1_fast_rcnn_train.pt

The Fast RCNN detection network is trained separately. The proposals used for training come from the RPN of step 1, and the model is initialized with the ImageNet-pretrained model. backbone+rpn1+fast rcnn --> backbone2+rpn1+fast rcnn1; the backbone and fast rcnn parameters are updated.

3) Stage2_rpn_train.pt

The RPN model is initialized with the parameters of the Fast RCNN from step 2, but the convolutional layers are fixed during training and only the parameters belonging to the RPN are adjusted. backbone2+rpn1+fast rcnn1 --> backbone2+rpn2+fast rcnn1; only the rpn parameters are updated.

4) Stage2_fast_rcnn_train.pt

The shared convolutional layers are kept fixed, and the remaining parameters of Fast RCNN are fine-tuned with the proposals output by the RPN adjusted in step 3 as input. backbone2+rpn2+fast rcnn1 --> backbone2+rpn2+fast rcnn2; only the fast rcnn parameters are updated.

The above four steps are shown in Figure 7.
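The bookkeeping of the four steps, that is, which modules are updated in each stage, can be sketched as a simple schedule. No actual training is performed: the version counters only mirror the labels used above (backbone1, rpn1, fast rcnn1, and so on), and the stage names and the dictionary layout are illustrative.

```python
# Each stage lists the modules whose parameters are updated; all others are frozen.
STAGES = [
    ("stage1_rpn_train", ("backbone", "rpn")),
    ("stage1_fast_rcnn_train", ("backbone", "fast_rcnn")),
    ("stage2_rpn_train", ("rpn",)),
    ("stage2_fast_rcnn_train", ("fast_rcnn",)),
]


def run_schedule(stages=STAGES):
    """Simulate alternating optimization and record the module versions."""
    versions = {"backbone": 0, "rpn": 0, "fast_rcnn": 0}
    history = []
    for name, trainable in stages:
        for module in trainable:
            versions[module] += 1  # stands in for a real optimization pass
        history.append((name, dict(versions)))
    return history


history = run_schedule()
for name, versions in history:
    print(f"{name:24s} {versions}")
```

After the second stage the backbone is no longer updated, so the last two stages fine-tune the RPN and the Fast RCNN head on shared convolutional features.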



 
Figure 7. Network model training process 

C. Evaluation index 

The general formula for accuracy is: number of correctly classified samples / total number of samples ×100% [20]. This paper uses the traditional evaluation criteria:

Accuracy: ACC = (TP+TN)/(TP+TN+FP+FN)

Precision: P = TP/(TP+FP), the proportion of true positives among the samples classified as positive.

Recall: RECALL = TP/(TP+FN), the proportion of all positive samples that are correctly identified.

TP means that the positive sample is correctly 

identified as a positive sample. 

TN means that negative samples are correctly 

identified as negative samples. 

FP indicates that a negative sample is incorrectly 

identified as a positive sample. 

FN means that a positive sample is incorrectly 

identified as a negative sample. 
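The three criteria can be computed directly from the four counts, as in the sketch below. The counts in the example are made-up numbers used only to exercise the formulas; they are not results from this paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision and recall from the confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall are reported separately because a detector that flags few voids can reach high precision while still missing many real defects.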

The criteria for recognizing a void in the tunnel lining in an image are the type and location of the void. Whether a void is recognized is determined by the predicted probability and by the degree of coincidence between the predicted box and the annotated box. Based on analysis of the samples, a void is considered recognized if both the predicted defect probability for the geological radar image and the GIoU exceed 50%.

VII. CONCLUSION 

In tunnel construction, geological radar is used to inspect highway tunnel engineering; the radar scan images can be used for advance geological forecasting and for finding hidden geological defects in the tunnel. Based on an analysis of the principle of void detection, a method based on Faster RCNN is proposed in this paper to locate voids in the second lining. Compared with the traditional identification method, this method is faster and more accurate. Since the current data set contains only plain concrete and reinforced concrete, the scale of annotated data will be further expanded to enrich sample diversity and improve the performance of the model.

 
REFERENCES  

[1] Li Wendi. Analysis of GPR image features of tunnel lining defects
[J]. Fujian Building Materials, 2019(01):22-24.

[2] Zhang Chi. Research on detection of lining voids of reinforced 
concrete structures based on geological radar method [J]. Railway 
survey, 2018, 44 (03):35-38. 

[3] Liu Jinlong, Tan Hailiang. Application of geological radar in
detecting soil defects [J]. Engineering quality, 2018, 36 (01):73-75.

[4] Hinton G E, Osindero S, Teh Y. A fast learning algorithm for deep 
belief nets [J]. Neural Computation, 2006, 18: 1527-1554. 



[5] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification
with deep convolutional neural networks [J]. Communications of the
ACM, 2012, 60(2): 1097-1105.

[6] Girshick R. Fast R-CNN[C]//2015 IEEE International Conference on 
Computer Vision (ICCV), December 7-13, 2015, Santiago, Chile. 
New York: IEEE, 2015: 1440-1448.  

[7] Ren S, He K, Girshick R, et al. Faster R-CNN: towards real time 
object detection with region proposal networks[J]. IEEE 
Transactions on Pattern Analysis and Machine Intelligence, 2017, 
39(6): 1137-1149. 

[8] Sun Y, Liang D, Wang X G, et al. DeepID3: face recognition with 
very deep neural networks [J]. Computer Science, 2015, 2(3): 1-5.  

[9] Luiz G H, Robert S, Luiz S O. Written dependent feature learning
for offline signature verification using deep convolutional neural
networks [J]. Pattern Recognition, 2017(70): 163-176.

[10] Abdelhamid O, Mohamed A R, Jiang H, et al. Convolutional neural 
networks for speech recognition[J]. IEEE/ACM Transactions on 
Audio, Speech, and Language Processing, 2014, 22(10): 1533-1545.  

[11] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, 
real-time object detection[C]//Proceedings of the IEEE Conference 
on Computer Vision and Pattern Recognition, June 27-30, 2016, 
Seattle, WA, USA. New York: IEEE, 779-788. 

[12] Du Yuhong, Dong Chao-qun, et al. Application of improved Faster
RCNN model in cotton fiber identification [J/OL]. Advances in laser
and optoelectronics:1-14[2019-11-25].

[13] Xu Shoukun, Wang Yaru, Gu Yuwan, et al. Research on the detection of
helmet wearing based on improved FasterRCNN [J/OL]. Computer
application research: 1-6[2019-11-25].

[14] Song Shang-ling, Yang Yang, et al. Pulmonary nodules detection
algorithm based on Faster-RCNN [J/OL]. Chinese journal of
biomedical engineering: 1-8[2019-11-25].

[15] Wang J, Chen K, Yang S, et al. Region Proposal by Guided
Anchoring [J]. 2019.

[16] Rezatofighi H, Tsoi N, Gwak J Y, et al. Generalized Intersection over
Union: A Metric and A Loss for Bounding Box Regression [J]. 2019.

[17] Zheng Lifei, Xiao lito, Li Xiaoqing. Forward modeling and 
application of geological radar advance prediction [J]. 
Communications science and technology, 2018(2):76-81. 

[18] Wu Zhengwen. Application of convolution neural network in image 
classification. Chengdu: University of Electronic Science and 
technology of China, 2015 

[19] Lin Gang, Wang Bo, et al. Multi-target detection and positioning of
power line inspection image based on improved faster-RCNN [J].
Power automation equipment, 2019, 39(05):213-218.

[20] Sainath TN, Kingsbury B, Saon G, et al. Deep convolutional neural 
networks for large-scale speech tasks [J]. Neural Networks, 2015, 
64:39-48.