
CN115601397A - Ship track tracking and predicting method based on monocular camera - Google Patents


Info

Publication number
CN115601397A
CN115601397A (application CN202211317790.7A)
Authority
CN
China
Prior art keywords
ship
model
target
prediction
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211317790.7A
Other languages
Chinese (zh)
Inventor
高邈
陈帅
陈鹏旭
廖子豪
康振
陈翔宇
贾佳策
曾希
张安民
周春辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202211317790.7A
Publication of CN115601397A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of marine surveying and mapping, and specifically relates to a ship trajectory tracking and prediction method based on a monocular camera, comprising the following steps. Step 1: construct an image data set of the target ship. Step 2: label the data set and train on it. Step 3: evaluate and analyze the model training results. Step 4: run a target ship tracking experiment based on the YOLOv5 and Deep Sort algorithms. Step 5: perform visual positioning of the target ship and preprocess the data. Step 6: build an LSTM trajectory prediction model. Step 7: select and evaluate the time step of the prediction model. Step 8: predict the target ship trajectory and verify and analyze the results. The method enables fast and accurate detection and tracking of marine ship targets, which is of great significance for identifying dangerous ship behavior and improving the efficiency of maritime supervision.

Description

Ship trajectory tracking and prediction method based on a monocular camera

Technical Field

The invention belongs to the technical field of marine surveying and mapping, and in particular relates to a ship trajectory tracking and prediction method based on a monocular camera.

Background Art

Ship detection, identification and trajectory prediction provide early warning of dangerous ship behavior and are of great significance for identifying potential navigation hazards. In addition, monocular-camera-based object recognition and tracking technology can be applied in the field of military surveillance, with broad applications in safeguarding maritime rights and interests, supervising special areas, and detecting illegal activities at sea. At present, in civil navigation, the identification of target ships and the assessment of collision risk still rely on manually plotting images from various sensors and computing the collision risk. This approach not only consumes additional human resources; manual assessment of ship collision risk is also easily influenced by subjective factors.

Summary of the Invention

The object of the present invention is to provide a ship trajectory tracking and prediction method based on a monocular camera, so as to solve the problems described in the background art.

To achieve the above object, the present invention provides the following technical solution: a ship trajectory tracking and prediction method based on a monocular camera, comprising the following steps:

Step 1: construct the target ship image data set.

Data augmentation is applied, using common augmentation operations to generate additional target ship images.

Step 2: label the data set and train on it.

After all image data have been collected, every ship image in the training and validation sets must be labeled.

Step 3: evaluate and analyze the model training results.

The mean average precision (mAP) is used: the precision and recall of every class are computed, and the average of the per-class AP values is taken.

Step 4: run a target ship tracking experiment based on the YOLOv5 and Deep Sort algorithms.

The YOLOv5-based target detector is fused with the Deep Sort algorithm to continuously track a custom target ship, and the pixel coordinates of the target in the tracking video are output for the subsequent target ship position calculation.

Step 5: perform visual positioning of the target ship and preprocess the data.

Based on the principle of homogeneous coordinate transformation and the pinhole imaging model, a conversion model between the pixel coordinates of the target ship and world coordinates is established. Through this process, the pixel coordinates of the target ship tracked in step 4 can be converted into relative coordinates with the optical center of the camera as the origin.

Step 6: build the LSTM trajectory prediction model.

An LSTM network is selected for target ship trajectory prediction. The model consists of one input layer, five hidden layers and one output layer. The input dimension of the input sequence is 2, i.e. the relative coordinates converted from the pixel coordinates in step 5, and the output dimension of the output sequence is likewise 2. The hidden layers comprise two LSTM layers, two Dropout layers and one Dense layer. The dropout probability is set to 0.3 based on experience, to prevent overfitting, and the Dense layer is given 2 units, so the final output is the predicted relative position of the target ship. To introduce some nonlinearity, the ReLU activation function is chosen for an Activation layer placed between the hidden layers and the output layer.

Step 7: select and evaluate the time step of the prediction model.

The first 80% of the relative coordinates computed in step 5 are taken as the training set and the remaining 20% as the validation set for model training.

Step 8: predict the target ship trajectory and verify and analyze the results.

The best prediction model selected in step 7 is used to predict the relative trajectory of the target ship, and the prediction is verified against the original visual positioning trajectory to analyze the prediction error. If the desired prediction accuracy is not reached, the network model should be re-tuned and retrained to improve the accuracy of the target ship trajectory prediction.

Preferably, in step 2, the ships in the images are labeled using the www.makesense.ai website. The training parameters are then set: the number of epochs is set to 300; the initial learning rate lr0 is set to 0.01 and, starting from this preset value, is decayed following a cosine schedule down to 0.002 at the end of training; the batch size of each gradient update is set to 16, i.e. the weights are updated once after every 16 images.

Preferably, in step 3, precision and recall are calculated as follows:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

Here TP (true positive) means the sample in the detection box is a ship and the model also predicts ship; TN (true negative) means the sample is not a ship and the model correctly predicts that it is not a ship; FP (false positive) means the sample is not a ship but the model predicts ship; FN (false negative) means the sample is a ship but the model fails to predict ship. The training results indicate whether the model's recognition performance is good; if the mean average precision mAP is low, the image data set should be supplemented and re-labeled, and the network model re-tuned and retrained to obtain better recognition performance.

Preferably, in step 5, the computed relative coordinates are preprocessed with a mean filter to make the trajectory smoother.

Preferably, in step 7, multiple experiments are run on the choice of time step during model training: the time step is set to n, and the position at time t+1 is predicted from the positions at times t-n+1 through t. The mean square error (MSE), the expected squared difference between the predicted and actual ship coordinates, is used to evaluate the trajectory prediction model; the smaller the MSE, the more accurate the LSTM model's predictions on the experimental data. It is calculated as:

MSE = (1/T) Σ_{t=1}^{T} [ (x̂^(t) - x^(t))² + (ŷ^(t) - y^(t))² ]

where (x̂^(t), ŷ^(t)) is the position predicted by the LSTM model at time t, (x^(t), y^(t)) is the actual position at time t, and T is the number of samples.

The beneficial effects of the present invention are:

1. The present invention first combines the YOLOv5 and Deep Sort algorithms to identify and track the target ship. A custom target ship data set is built by selecting ship images under different sea and sky conditions and applying data augmentation; the network model is then trained and, based on an analysis of the training results, a well-performing model is selected and applied to ship identification and tracking. This method enables fast and accurate detection and tracking of ships at sea, which is of great significance for identifying dangerous ship behavior and improving the efficiency of maritime supervision.

2. On the basis of identifying the target ship, the present invention performs visual positioning and uses an LSTM model to predict the target ship's trajectory. The LSTM network model is built and its parameters are tuned repeatedly to obtain the desired prediction performance. This ship navigation safety assessment method is accurate and intuitive, and is of great significance for ensuring the safety of maritime navigation.

Brief Description of the Drawings

Fig. 1 is a flow chart of the target ship tracking algorithm based on the Deep Sort algorithm in the present invention;

Fig. 2 is a schematic diagram of the relative coordinates after mean filtering in the present invention;

Fig. 3 is an internal schematic diagram of the LSTM unit in the present invention;

Fig. 4 is a comparison of the LSTM-predicted trajectory and the original visual positioning trajectory in the present invention;

Figs. 5a-5d show the detection results of the YOLOv5 algorithm on target ships in different attitudes in the present invention;

Figs. 6a-6d show the tracking results of the YOLOv5 and Deep Sort algorithms on target ships in different attitudes in the present invention;

Fig. 7 is an overall flow chart of the present invention.

Detailed Description

Specific embodiments of the present invention are described in detail below in conjunction with the accompanying drawings and preferred embodiments.

In the description of the present invention, it should be understood that the orientations or positional relationships indicated by terms such as "center", "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings, are only for the convenience of describing the invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the invention. In addition, the terms "first", "second" and so on are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first", "second", etc. may explicitly or implicitly include one or more of that feature. In the description of the present invention, unless otherwise specified, "plurality" means two or more.

In the description of the present invention, it should also be noted that, unless otherwise expressly specified and limited, the terms "install", "connect", "couple", "fix" and "fasten" should be understood broadly: the connection may be fixed, detachable or integral; mechanical or electrical; direct, or indirect through an intermediary, or an internal communication between two elements. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific situation.

The present invention is further described below in conjunction with the accompanying drawings.

1. Target ship identification and tracking method based on the YOLOv5 and Deep Sort algorithms

In this patent, the YOLOv5 algorithm is selected to identify the target ship. YOLOv5 is a single-stage deep learning algorithm that uses a convolutional neural network for target detection in a single forward pass, and it offers higher accuracy and faster detection than many other target detection algorithms, which has made it popular among deep learning methods. The YOLOv5 family provides four structures, S, M, L and X, to meet the needs of different applications; YOLOv5s is the smallest and fastest, with the lowest average precision, and at only 14 MB it is well suited to embedded devices. The target tracking part uses the Deep Sort algorithm, which takes the YOLOv5 detection results as input, predicts the future position of each detected target with a Kalman filter to improve the association between targets, and then uses the Hungarian algorithm to decide whether an object in the current frame has the same association and ID as an object in the previous frame, so as to match the same target across adjacent frames. The predecessor of Deep Sort, the Sort algorithm, performs matching between detections and tracks using the intersection-over-union (IoU), but it only considers the distance between bounding boxes, not the matching of appearance features, which easily leads to ID switches; moreover, when a target is lost and cannot be recovered, Sort can only assign a new ID on re-detection, which does not satisfy the basic needs of target tracking. In contrast, Deep Sort evaluates similarity with two indicators, the target's motion state and its appearance, and uses their fused result as the final matching criterion, improving tracking failures after occlusion and the problem of frequent ID switches. The identification and tracking method for the target ship is as follows:

(1) Construct the target ship image data set

Before identifying the target ship, an image data set of the target ship must be constructed and used to train the YOLOv5 network model. To improve the robustness of recognition, the image data set should cover the sea and sky conditions encountered during the target ship's actual voyages; weather such as rain, snow, and strong wind and waves should all be considered. To improve the generalization performance of the framework, the system applies data augmentation, using common augmentation operations to generate more target ship images. Specifically, through augmentation operations (translation, rotation, color transformation, scale transformation, etc.), 20 different ship images can be obtained for each input training image; from these 20 generated images, ship variant samples whose visual features (edges, contours, colors, etc.) differ markedly from the original sample are selected by hand. 70% of the images are randomly selected as the training set and the remainder as the validation set.
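The augmentation stage can be sketched as follows. This is an illustrative numpy example, not the patent's exact pipeline; it covers only flipping, translation and brightness scaling out of the operations mentioned above, and the function name `augment` is a hypothetical helper:

```python
import numpy as np

def augment(image, rng):
    """Generate simple variants of one training image:
    horizontal flip, random translation, and brightness scaling.
    (Illustrative subset; rotation and scale changes are omitted.)"""
    variants = []
    variants.append(image[:, ::-1])                          # horizontal flip
    dx, dy = rng.integers(-20, 21, size=2)
    variants.append(np.roll(image, (dy, dx), axis=(0, 1)))   # translation
    scale = rng.uniform(0.7, 1.3)
    variants.append(np.clip(image * scale, 0, 255).astype(image.dtype))  # brightness
    return variants

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)
# seven augmentation rounds of three variants each, roughly the
# "20 images per input image" budget described above
aug = [v for _ in range(7) for v in augment(img, rng)]
```

A human reviewer would then keep only the variants whose appearance differs markedly from the original, as the text describes.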

(2) Data set labeling and training

After all image data have been collected, every ship image in the training and validation sets must be labeled; the www.makesense.ai website is used to label the ships in the images. The training parameters are then set: the number of epochs is set to 300; the initial learning rate lr0 is set to 0.01 and, starting from this preset value, is decayed following a cosine schedule down to 0.002 at the end of training; the batch size of each gradient update is set to 16, i.e. the weights are updated once after every 16 images, which gives the model a better recognition effect. The results are shown in Figs. 5a-5d.
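The cosine decay of the learning rate from lr0 = 0.01 down to 0.002 over 300 epochs can be written as follows, assuming a standard cosine annealing schedule (the exact schedule used by the patent's training code is not specified beyond its endpoints):

```python
import math

def cosine_lr(epoch, epochs=300, lr0=0.01, lr_final=0.002):
    """Cosine-annealed learning rate: lr0 at epoch 0,
    decaying smoothly to lr_final at the last epoch."""
    return lr_final + 0.5 * (lr0 - lr_final) * (1 + math.cos(math.pi * epoch / epochs))
```

At epoch 0 this returns 0.01 and at epoch 300 it returns 0.002, matching the values stated above.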

(3) Evaluation and analysis of model training results

The performance of a trained network model is commonly evaluated with the mean average precision (mAP), which computes the precision and recall of every class and averages the per-class AP values.

The precision and recall are calculated as follows:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

Here TP (true positive) means the sample in the detection box is a ship and the model also predicts ship; TN (true negative) means the sample is not a ship and the model correctly predicts that it is not a ship; FP (false positive) means the sample is not a ship but the model predicts ship; FN (false negative) means the sample is a ship but the model fails to predict ship. The training results indicate whether the model's recognition performance is good; if the mean average precision mAP is low, the image data set should be supplemented and re-labeled, and the network model re-tuned and retrained to obtain better recognition performance.
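A minimal sketch of the precision, recall and mAP computations from the counts defined above (the helper names are illustrative, not from the patent):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN).
    Returns 0.0 for an empty denominator."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def mean_ap(ap_per_class):
    """mAP: the average of the per-class average precisions."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, 8 correct ship detections with 2 false alarms and 2 misses gives precision 0.8 and recall 0.8; with a single class (ship), mAP reduces to that class's AP.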

(4) Target ship tracking experiment based on the YOLOv5 and Deep Sort algorithms

The main design idea of the ship detection and tracking proposed by this system is to fuse the YOLOv5-based target detector with the Deep Sort algorithm to continuously track the custom target ship, and to output the pixel coordinates of the target in the tracking video for the subsequent target ship position calculation. The design flow is shown in Fig. 1 and Figs. 6a-6d.
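The detection-to-track matching described in section 1, which Sort performs on IoU alone and which Deep Sort augments with appearance features, can be sketched as follows. The brute-force optimal assignment stands in for the Hungarian algorithm and is only practical for small numbers of boxes; the function names are illustrative, not from the patent:

```python
from itertools import permutations

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def match_tracks(tracks, detections, iou_threshold=0.3):
    """Assign detections to existing tracks so that total IoU is maximal
    (exhaustive search, equivalent to the Hungarian result for small N),
    then drop pairs that overlap less than the threshold."""
    n, m = len(tracks), len(detections)
    best, best_score = [], -1.0
    for perm in permutations(range(m), min(n, m)):
        pairs = list(zip(range(n), perm))
        score = sum(iou(tracks[t], detections[d]) for t, d in pairs)
        if score > best_score:
            best_score, best = score, pairs
    return [(t, d) for t, d in best
            if iou(tracks[t], detections[d]) >= iou_threshold]
```

Deep Sort additionally fuses an appearance-distance term into the cost before solving the assignment, which is what makes it robust to occlusion and ID switches.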

2. Target ship trajectory prediction method based on the LSTM algorithm

(1) Target ship visual positioning and data preprocessing

Based on the principle of homogeneous coordinate transformation and the pinhole imaging model, a conversion model between the pixel coordinates of the target ship and world coordinates is established (the origin of the world coordinates can be chosen freely; here the optical center of the camera is taken as the origin). Through this process, the pixel coordinates of the target ship obtained by the tracking above can be converted into relative coordinates with the camera's optical center as the origin.
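A minimal sketch of the pinhole back-projection underlying this conversion model. The intrinsic matrix K and the depth Z below are illustrative assumptions; in the monocular setting the depth cannot be read from a single pixel and must come from an additional constraint (for example the known camera height above the water plane), which the patent's homogeneous-coordinate model supplies:

```python
import numpy as np

def pixel_to_camera(u, v, Z, K):
    """Back-project pixel (u, v) at assumed depth Z to camera-frame
    coordinates (origin at the optical center), via the pinhole model:
    x = (u - cx) * Z / fx,  y = (v - cy) * Z / fy."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    return np.array([(u - cx) * Z / fx, (v - cy) * Z / fy, Z])

# illustrative intrinsics: focal length 1000 px, principal point (640, 360)
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
p = pixel_to_camera(840.0, 360.0, 50.0, K)  # 200 px right of center, 50 m away
```

In practice K is obtained by camera calibration, and Z is resolved from the geometry of the installed camera rather than assumed.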

Due to camera shake, box-size errors in manual labeling, and interference from the water surface in the background, a small positional deviation may arise between the detection box and the target ship's outline, causing erroneous estimates of the tracked ship's image coordinates and in turn reducing the accuracy of the computed relative coordinates. A mean filter is therefore applied to preprocess the computed relative coordinates, smoothing the trajectory and reducing the influence of external factors on the accuracy of the relative coordinate solution, as shown in Fig. 2.
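The mean-filter preprocessing can be sketched as a centered moving average over the (N, 2) track of relative coordinates. The window size is an illustrative choice, and the samples near the ends are biased by the zero padding of `mode="same"`:

```python
import numpy as np

def mean_filter(track, window=5):
    """Smooth an (N, 2) coordinate track with a centered moving average,
    filtering the x and y components independently."""
    track = np.asarray(track, dtype=float)
    kernel = np.ones(window) / window
    return np.column_stack([
        np.convolve(track[:, i], kernel, mode="same") for i in range(2)
    ])
```

Interior points of a constant track pass through unchanged; only the first and last window//2 samples are attenuated by the padding.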

(2) Construction of the LSTM trajectory prediction model

Because ships under way are easily affected by external environmental factors, their trajectories are nonlinear, and traditional mathematical equations struggle to account for the influence of wind and currents on a ship's motion; a neural network is therefore chosen to predict the ship's trajectory. Recurrent neural networks (RNNs) are known to suffer from vanishing and exploding gradients and difficulty in learning long-term patterns, and they are generally hard to train; the LSTM model solves these problems well and is easier to train. The system therefore selects an LSTM network for target ship trajectory prediction.

The model consists of one input layer, five hidden layers (LSTM, Dropout and Dense layers) and one output layer. The input dimension of the input sequence is 2 (the relative coordinates converted from the pixel coordinates above), and the output dimension of the output sequence is likewise 2. The hidden layers comprise two LSTM layers, two Dropout layers and one Dense layer. The dropout probability is set to 0.3 based on experience, to prevent overfitting, and the Dense layer is given 2 units, so the final output is the predicted relative position of the target ship. To introduce some nonlinearity, the ReLU activation function is chosen for an Activation layer placed between the hidden layers and the output layer. The internal principle of the LSTM unit is shown in Fig. 3.
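The LSTM unit of Fig. 3 can be sketched in plain numpy as a single forward step. The weight shapes and the input/forget/output/candidate gate ordering are conventional assumptions, not taken from the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step. x: input of size D; h, c: hidden and cell
    states of size H; W: (4H, D), U: (4H, H), b: (4H,) stacking the
    input gate, forget gate, output gate and candidate blocks."""
    z = W @ x + U @ h + b
    H = h.shape[0]
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2 * H])    # forget gate
    o = sigmoid(z[2 * H:3 * H])  # output gate
    g = np.tanh(z[3 * H:4 * H])  # candidate cell state
    c_new = f * c + i * g      # forget old memory, admit new
    h_new = o * np.tanh(c_new)  # gated output
    return h_new, c_new
```

The stacked architecture described above simply chains such cells over the input window, with dropout between the two LSTM layers and a 2-unit dense head producing the predicted position.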

(3) Selection and evaluation of the prediction model time step

The first 80% of the relative coordinate information solved above is taken as the training set and the remaining 20% as the validation set for model training. In general, the time step of the input layer affects the prediction accuracy of the algorithm: the larger the input dimension, the greater the chance of over-fitting. During model training, multiple experiments are run over candidate time steps: with the time step set to n, the position coordinates at times t-n+1 through t are input and the coordinates at time t+1 are predicted. The mean squared error (MSE) is used to evaluate the target ship's trajectory prediction model. MSE is the expected value of the squared difference between the predicted and actual ship coordinates; a smaller MSE indicates higher prediction accuracy of the LSTM model on the data used in this experiment. The calculation formula is:

$$\mathrm{MSE}=\frac{1}{T}\sum_{t=1}^{T}\left[\left(\hat{x}^{(t)}-x^{(t)}\right)^{2}+\left(\hat{y}^{(t)}-y^{(t)}\right)^{2}\right]$$

where $(\hat{x}^{(t)},\hat{y}^{(t)})$ is the position of the coordinate point predicted by the LSTM model at time $t$, $(x^{(t)},y^{(t)})$ is the position of the actual coordinate point at time $t$, and $T$ is the number of samples.
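The sliding-window construction and the MSE evaluation described above can be sketched as follows. The straight-line trajectory, the window length n = 3, and the persistence "model" (which simply repeats the last observed position) are made-up stand-ins for the solved relative coordinates and the trained LSTM.

```python
import numpy as np

def make_windows(track, n):
    """Slice a (T, 2) trajectory into inputs of n consecutive positions,
    with the position at the next time step as the prediction target."""
    X = np.array([track[t - n:t] for t in range(n, len(track))])
    y = np.array([track[t] for t in range(n, len(track))])
    return X, y

def mse(pred, true):
    """Mean squared error between predicted and actual (x, y) positions."""
    return float(np.mean(np.sum((pred - true) ** 2, axis=1)))

# Synthetic straight-line track standing in for the solved relative coordinates
track = np.stack([np.linspace(0, 9, 10), np.linspace(0, 18, 10)], axis=1)
split = int(len(track) * 0.8)        # first 80% for training, rest for validation
X_train, y_train = make_windows(track[:split], n=3)

# A persistence baseline: predict the last position seen in each window
pred = X_train[:, -1, :]
err = mse(pred, y_train)
```

With a step of (1, 2) per sample, the persistence baseline is always one step behind, so its MSE is exactly 1² + 2² = 5; a trained LSTM would be expected to drive this error well below the baseline.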

(4) Target ship trajectory prediction and verification analysis

The best prediction model selected in the previous step is used to predict the target ship's relative trajectory, which is then verified against the original visual-positioning trajectory to analyze the prediction error. If the desired prediction accuracy is not reached, the network model's hyper-parameters are tuned and the model retrained to improve the accuracy of the target ship trajectory prediction, as shown in Figure 4.
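The verification step above can be sketched as a simple accuracy gate: compare the predicted relative trajectory against the original visual-positioning trajectory and decide whether another round of tuning is needed. The tolerance value and the toy trajectories are illustrative assumptions, not values from the patent.

```python
import numpy as np

def validate(pred_track, true_track, tol):
    """Compare the predicted relative trajectory against the original
    visual-positioning trajectory; return the MSE and whether it meets
    the target accuracy (tune and retrain otherwise)."""
    err = float(np.mean(np.sum((pred_track - true_track) ** 2, axis=1)))
    return err, err <= tol

true_track = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
pred_track = true_track + 0.1        # small uniform offset as a fake prediction
err, ok = validate(pred_track, true_track, tol=0.01)
# ok is False here, which would trigger another round of hyper-parameter tuning
```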

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A ship track tracking and predicting method based on a monocular camera, characterized by comprising the following steps:
Step 1: constructing a target vessel image data set,
generating additional target ship images by applying common data-enhancement operations;
Step 2: labeling and training on the data set,
after sorting all the image data, labeling all ship images in the training set and the validation set;
Step 3: evaluating and analyzing the model training result,
calculating the Precision and Recall of all categories and the mean average precision mAP, i.e. the average of the APs of all categories;
Step 4: tracking the target ship based on the YOLOv5 and Deep Sort algorithms,
fusing a YOLOv5-based target detector with the Deep Sort algorithm to continuously track a user-defined target ship, and outputting the pixel coordinates of the target in the tracking video for the subsequent target ship azimuth calculation;
Step 5: visual positioning and data preprocessing of the target ship,
establishing a conversion model between the pixel coordinates and world coordinates of the target ship based on the homogeneous coordinate transformation principle and the pinhole imaging model, and thereby converting the pixel coordinates of the target ship obtained by tracking in step 4 into relative coordinates with the camera's optical center as the origin;
Step 6: building an LSTM track prediction model,
selecting an LSTM network to predict the track of the target ship, the model consisting of 1 input layer, five hidden layers, and one output layer; the input dimension of the input-layer sequence is 2, namely the relative coordinates converted from the pixel coordinates in step 5; the output dimension of the output-layer sequence is 2; the hidden layers consist of 2 LSTM layers, 2 Dropout layers, and 1 Dense layer; the Dropout probability is empirically set to 0.3 to prevent over-fitting; the Dense layer is set to 2 units so that the final output is the predicted relative position of the target ship; and a ReLU Activation function, placed between the hidden layers and the output layer, is selected to introduce nonlinearity;
Step 7: selecting and evaluating the time step of the prediction model,
taking the first 80% of the relative coordinate information calculated in step 6 as the training set and the remaining 20% as the validation set, and training the model;
Step 8: predicting, verifying, and analyzing the target ship track,
predicting the relative track information of the target ship with the optimal prediction model selected in step 7, verifying it against the original visual-positioning track, and analyzing the model prediction error; if the desired prediction accuracy is not reached, continuing to tune the network model's parameters and retrain so as to improve the accuracy of the target ship track prediction.
2. The ship track tracking and predicting method based on a monocular camera according to claim 1, wherein in step 2 the ships in the images are labeled by using www. The relevant parameters of the training model are then set: the number of iterations (epochs) over the data set during training is set to 300; the initial learning rate lr0 is set to 0.01 and, once the learning rate reaches this preset initial value, it is decayed with a cosine schedule, reaching 0.002 at the end of training; and the batch size of each gradient update is set to 16, i.e. the weights are updated once after every 16 pictures.
3. The ship track tracking and predicting method based on a monocular camera according to claim 1, wherein in step 3 the precision and recall are calculated as:
$$\mathrm{Precision}=\frac{TP}{TP+FP}\qquad \mathrm{Recall}=\frac{TP}{TP+FN}$$
wherein TP indicates that the true class of the sample in the detection box is ship and the model also predicts ship; TN indicates that the true class is not ship and the model also predicts not ship; FP indicates that the true class is not ship but the model predicts ship; and FN indicates that the true class is ship but the model predicts not ship. The training results are used to judge whether the recognition performance of the model is adequate: if the mean average precision mAP is low, the image data set is supplemented and relabeled, the network model's parameters are adjusted, and the model is retrained to obtain good recognition performance.
4. The ship track tracking and predicting method based on a monocular camera according to claim 1, wherein in step 5 the calculated relative coordinates are preprocessed with mean filtering to smooth the track.
5. The ship track tracking and predicting method based on a monocular camera according to claim 1, wherein in step 7 multiple experiments are performed to select the time step during model training: with the time step set to n, the position coordinates at times t-n+1 through t are input and the coordinates at time t+1 are predicted; the trajectory prediction model of the target ship is evaluated using the mean squared error (MSE), which is the expected value of the squared difference between the predicted and actual ship coordinates, a smaller MSE indicating higher accuracy of the LSTM model on the data used in the experiment, with the calculation formula:
$$\mathrm{MSE}=\frac{1}{T}\sum_{t=1}^{T}\left[\left(\hat{x}^{(t)}-x^{(t)}\right)^{2}+\left(\hat{y}^{(t)}-y^{(t)}\right)^{2}\right]$$

wherein $(\hat{x}^{(t)},\hat{y}^{(t)})$ is the position of the coordinate point predicted by the LSTM model at time $t$, $(x^{(t)},y^{(t)})$ is the position of the actual coordinate point at time $t$, and $T$ is the number of samples.
CN202211317790.7A 2022-10-26 2022-10-26 Ship track tracking and predicting method based on monocular camera Pending CN115601397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211317790.7A CN115601397A (en) 2022-10-26 2022-10-26 Ship track tracking and predicting method based on monocular camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211317790.7A CN115601397A (en) 2022-10-26 2022-10-26 Ship track tracking and predicting method based on monocular camera

Publications (1)

Publication Number Publication Date
CN115601397A true CN115601397A (en) 2023-01-13

Family

ID=84851427

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211317790.7A Pending CN115601397A (en) 2022-10-26 2022-10-26 Ship track tracking and predicting method based on monocular camera

Country Status (1)

Country Link
CN (1) CN115601397A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992349A (en) * 2023-08-15 2023-11-03 中国兵器工业计算机应用技术研究所 Analysis and optimization method of civilian ship trajectory behavior based on deep learning
CN118691551A (en) * 2024-05-31 2024-09-24 鄂州三江东顺船业有限公司 New energy ship parts detection method and system based on deep learning

Similar Documents

Publication Publication Date Title
US10943352B2 (en) Object shape regression using wasserstein distance
Liu et al. Detection and pose estimation for short-range vision-based underwater docking
CN104200495B (en) A kind of multi-object tracking method in video monitoring
CN109635685A (en) Target object 3D detection method, device, medium and equipment
CN110781836A (en) Human body recognition method and device, computer equipment and storage medium
CN106780631B (en) Robot closed-loop detection method based on deep learning
Shan et al. LiDAR-based stable navigable region detection for unmanned surface vehicles
WO2020046213A1 (en) A method and apparatus for training a neural network to identify cracks
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
Peretroukhin et al. Reducing drift in visual odometry by inferring sun direction using a bayesian convolutional neural network
CN115601397A (en) Ship track tracking and predicting method based on monocular camera
CN114463628B (en) A ship target recognition method based on deep learning in remote sensing images based on threshold constraints
Li et al. Vision-based target detection and positioning approach for underwater robots
CN112417948A (en) Method for accurately guiding lead-in ring of underwater vehicle based on monocular vision
CN105321188A (en) Foreground probability based target tracking method
Alla et al. Vision-based deep learning algorithm for underwater object detection and tracking
CN110826575A (en) Underwater target identification method based on machine learning
CN114821358A (en) Optical remote sensing image marine ship target extraction and identification method
CN118640939A (en) A method and system for correcting target trajectory in marine geographic surveying and mapping
CN116758421A (en) Remote sensing image directed target detection method based on weak supervised learning
CN118115896A (en) Unmanned aerial vehicle detection method and system based on improvement YOLOv3
Wang et al. 3D-LIDAR based branch estimation and intersection location for autonomous vehicles
Yu et al. Improved deformable convolution method for aircraft object detection in flight based on feature separation in remote sensing images
Jin et al. An occlusion-aware tracker with local-global features modeling in UAV videos
CN117523428B (en) Ground target detection method and device based on aircraft platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination