Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Target tracking can be classified into single-target tracking and multi-target tracking. Single-target tracking refers to tracking a single specified target, while tracking of multiple targets is multi-target tracking. The target tracking method provided by the application is mainly applied to the tracking of single targets. Thus, the embodiments of the present application also revolve primarily around single-target tracking.
Single-target tracking continuously identifies and locates an object of interest in a video sequence based on its characteristics. It follows that tracking a single target requires first determining the target to be tracked in the image, and then predicting the size and position of the target in subsequent frames. To determine the object to be tracked in the image, the image of the relevant scene first needs to be acquired by a detector. For example, target recognition and tracking may be performed on a visible light image obtained by a visible light detector. But the visible light detector relies on solar/starlight radiation and images in a passive manner. In particular, the quality of the obtained image is poor under conditions of poor illuminance, bad weather, smoke shielding and the like, and the image characteristics have limitations. Thus, the accuracy of target recognition is low, which is unfavorable for target tracking. The infrared detector detects and identifies the target by receiving infrared rays from the target and utilizing the temperature difference between the target and the background and the difference in radiant energy of each part of the target. Thus, the infrared detector can observe the object under "full black" conditions without relying on any light source. From this, it is clear that there is great complementarity between the visible image and the infrared image. Identifying and tracking the target according to both the visible light image and the infrared image yields a target with richer characteristics, which facilitates subsequent tracking.
It should be noted that recognizing and tracking the target jointly through the visible light image and the infrared image often requires a huge amount of computation, which directly affects the efficiency of target tracking. In order to reduce the amount of computation in the target tracking process, an embodiment of the application provides a target tracking method. Specifically, referring to fig. 1 and fig. 2, the target tracking method provided by the embodiment of the application includes the following steps:
S100, acquiring three-color channel imaging data of a visible light image and imaging data of an infrared image.
The three-color channel imaging data of the visible light image can be understood as three-color channel imaging data of the photographed object in the visible light band, and can be acquired by the visible light detector. The resolution of the visible light detector can be regarded as a first resolution, denoted m0×n0, and the "three-color channel imaging data" can be understood as the brightness values corresponding to each of the red (R), green (G) and blue (B) channels of the visible light image in the RGB color mode. In the field of computer vision, the "three-color channel imaging data" herein may also be referred to as "three-dimensional imaging data". That is, each color channel corresponds to a certain color dimension.
The imaging data of the infrared image can be understood as imaging data of the photographed object in the infrared band, and can be acquired by an infrared detector. The resolution of the infrared detector can be regarded as a second resolution, denoted m1×n1, and the imaging data of the infrared image can be understood as monochromatic channel (one-dimensional) imaging data, relative to the three-color channel (three-dimensional) imaging data of the visible light image.
S200, performing data dimension-increasing processing on the obtained visible light imaging data and infrared imaging data through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data containing visible light imaging data information and infrared imaging data information.
It can be understood that the infrared imaging data carries infrared radiation energy information of each region of the photographed object, and the visible light imaging data carries imaging information of the photographed object in the visible light band. However, the infrared imaging data contains less target characteristic information, which is quite unfavorable for target identification and tracking, while the visible light imaging data is greatly influenced by environmental factors, so that target characteristics are easily lost. Performing target tracking on visible light imaging data or infrared imaging data alone is therefore not conducive to effective identification of the target. Therefore, the infrared imaging data and the visible light imaging data are spliced, that is, dimension-increasing fitting of the data is performed. In this way, the infrared imaging data makes up for the information loss of the visible light imaging data caused by environmental factors such as poor illuminance, bad weather and smoke shielding, and the obtained four-dimensional data carries more comprehensive characteristic information. That is, the visible light-infrared four-dimensional imaging data contains the target characteristic information of both the visible light band and the infrared band, so that the subsequently obtained target detection result is more comprehensive, effectively overcoming the defect that a single visible light band loses the target under the influence of a rain and fog environment, or that a single infrared band carries little target characteristic information.
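As an illustrative sketch only (the function name and nested-list array layout are assumptions, not the patented implementation), the dimension-increasing fitting can be pictured as pixel-wise stacking of the three visible channels with the registered, same-resolution infrared channel:

```python
def upfit(rgb, ir):
    """Stack the R, G, B channels of the visible image with the
    (registered, same-resolution) infrared channel, giving one
    four-dimensional sample per pixel."""
    h, w = len(ir), len(ir[0])
    return [[(rgb[i][j][0], rgb[i][j][1], rgb[i][j][2], ir[i][j])
             for j in range(w)] for i in range(h)]
```

Each pixel of the result then carries both the visible-band and the infrared-band information described above.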
Further, in a preferred embodiment provided in the present application, performing data dimension-increasing processing on the obtained visible light imaging data and infrared imaging data through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data containing visible light imaging data information and infrared imaging data information specifically includes: scaling the imaging data of the infrared image through a linear scaling model to obtain infrared image data with a resolution consistent with that of the visible light image; and performing the data dimension-increasing processing on the obtained visible light imaging data and the infrared image data obtained through the scaling processing through the data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data containing visible light imaging data information and infrared imaging data information.
It will be appreciated that the visible light imaging data may be acquired by a visible light detector at a first resolution and the infrared imaging data may be acquired by an infrared detector at a second resolution. But the resolution of the infrared detector tends to be lower than that of the visible light detector. That is, the second resolution is not consistent with the first resolution. Therefore, after the data in the visible light band and the data in the infrared band are spliced, the data length may change, which is unfavorable for subsequent target detection and identification. That is, the target feature information of the visible light band and that of the infrared band may become superimposed and disordered in the four-dimensional data obtained by dimension-increasing fitting, and the reliability of the obtained four-dimensional data becomes low. Thus, the infrared band imaging data at the second resolution may be converted into infrared band data at the first resolution prior to performing the data dimension-increasing fitting.
Specifically, the infrared band imaging data at the second resolution is converted into infrared band data at the first resolution by means of linear scaling. Assuming that [X, Y] is the coordinate of a point in the infrared band data before transformation and [x, y] is the coordinate of the same point after transformation, the linear scaling of the infrared band imaging data at the second resolution can be expressed as follows:
X = x·m1/m0
Y = y·n1/n0
Where m0, n0 is the resolution of the visible light detector (first resolution), m1, n1 is the resolution of the infrared detector (second resolution), and the image pixel f(X, Y) corresponding to the [X, Y] coordinate can be determined by bilinear interpolation.
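A minimal sketch of the scaling step, assuming a corner-aligned variant of the linear mapping so that edge pixels map exactly (the function name and nested-list layout are illustrative, not from the application):

```python
def resize_bilinear(ir, m0, n0):
    """Resample the infrared image (second resolution m1 x n1, given
    as a list of rows) to the first resolution m0 x n0; the value at
    each mapped coordinate is found by bilinear interpolation."""
    m1, n1 = len(ir), len(ir[0])
    out = [[0.0] * n0 for _ in range(m0)]
    for x in range(m0):
        for y in range(n0):
            # linear scaling of the target coordinate back to the source grid
            X = x * (m1 - 1) / (m0 - 1) if m0 > 1 else 0.0
            Y = y * (n1 - 1) / (n0 - 1) if n0 > 1 else 0.0
            i0, j0 = int(X), int(Y)
            i1, j1 = min(i0 + 1, m1 - 1), min(j0 + 1, n1 - 1)
            dx, dy = X - i0, Y - j0
            # bilinear interpolation f(X, Y) from the four neighbours
            out[x][y] = (ir[i0][j0] * (1 - dx) * (1 - dy)
                         + ir[i1][j0] * dx * (1 - dy)
                         + ir[i0][j1] * (1 - dx) * dy
                         + ir[i1][j1] * dx * dy)
    return out
```

Upscaling a 2×2 infrared patch to 3×3, for instance, preserves the four corner values and interpolates the centre.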
The infrared band imaging data and the visible light band imaging data can be in the same resolution dimension through conversion of the infrared band imaging data under different resolutions. At this time, the dimension-increasing fitting is performed on the infrared band imaging data and the visible light band imaging data, namely, the dimension-increasing fitting is performed on the infrared band imaging data under the first resolution and the visible light band imaging data under the first resolution, so that the accuracy of target feature information in the visible light-infrared four-dimensional imaging data is increased.
Further, in a preferred embodiment of the present application, the data dimension-increasing processing is performed on the obtained visible light imaging data and the infrared image data obtained by the scaling processing through the data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data containing visible light imaging data information and infrared imaging data information, which specifically includes converting the infrared image data obtained by the scaling processing into infrared image data registered with the visible light image through the image registration model; and performing data dimension-increasing processing on the obtained visible light imaging data and the infrared image data obtained through registration through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data containing visible light imaging data information and infrared imaging data information.
Image registration here can be understood as registration based on different detector imaging coordinates. That is, a mapping between imaging in the visible light band and imaging in the infrared band is established so that images taken by different sensors are spatially aligned. Thus, the feature information in the visible light-infrared four-dimensional imaging data obtained by dimension-increasing fitting has higher accuracy.
The infrared image data obtained by the scaling process is converted into infrared image data registered with the visible light image through the image registration model, and first, the image registration parameters need to be determined. In a specific embodiment provided by the application, enough homonymous points (imaging points of the same target) can be found in a visible light image corresponding to the visible light imaging data and an infrared image corresponding to the infrared imaging data respectively, and then fitting registration is carried out by using a polynomial model. Assuming that the positions of the same-name points corresponding to the infrared wave band data of any point [ X, Y ] of the visible light image are [ X 0,Y0 ], establishing the following polynomial model:
Wherein a0, a1, a2, a3, a4, a5 and b0, b1, b2, b3, b4, b5 are polynomial parameters. In practical application, the values of the parameters a0-a5 and b0-b5 can be determined by performing spatial coordinate calibration on the visible light detector and the infrared detector and then fitting the data.
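Once the twelve parameters have been fitted, applying the registration model is a direct evaluation of the two polynomials. This small sketch (names are illustrative) assumes the coefficients are passed in as lists a0..a5 and b0..b5:

```python
def map_point(X, Y, a, b):
    """Map a visible-image coordinate [X, Y] to the corresponding
    same-name point [X0, Y0] in the infrared band data using the
    second-order polynomial registration model."""
    X0 = a[0] + a[1] * X + a[2] * Y + a[3] * X * Y + a[4] * X * X + a[5] * Y * Y
    Y0 = b[0] + b[1] * X + b[2] * Y + b[3] * X * Y + b[4] * X * X + b[5] * Y * Y
    return X0, Y0
```

With a = [0, 1, 0, 0, 0, 0] and b = [0, 0, 1, 0, 0, 0] the mapping is the identity, as expected when the two detectors are already perfectly aligned.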
According to the determined image registration parameters, the infrared band image at the first resolution can be registered with the visible light image at the first resolution. That is, the infrared image data obtained by the scaling processing is converted into infrared image data registered with the visible light image. At this time, the input data of the dimension-increasing fitting model are the visible light imaging data at the first resolution and the infrared image data, obtained through image registration, that is registered with the visible light image. In this way, each point of the infrared band image and the visible light band image is kept consistent, and the accuracy of the target characteristic information in the visible light-infrared four-dimensional imaging data obtained by dimension-increasing fitting is improved, thereby improving the accuracy of target identification.
S300, determining a target to be tracked according to the obtained four-dimensional data and through a target detection model.
It can be understood that the four-dimensional data contains imaging information of both infrared wave bands and visible wave bands, so that the characteristic information of the target area is greatly enriched, and the detected target is more accurate. The target detection model is mainly used for determining a target to be tracked in a four-dimensional image obtained by fitting.
Further, in a preferred embodiment of the present application, the target to be tracked is determined according to the obtained four-dimensional data and through a target detection model, specifically including determining the target to be tracked according to the obtained four-dimensional data and through a yolo_v5s target detection model.
The yolo_v5s target detection model is a detection model obtained through neural network training. In a specific embodiment provided by the application, the training optimization of the yolo_v5s target detection model can be performed in a negative feedback training mode. Compared with detection models of other versions, the yolo_v5 detection model offers great improvements in data processing speed and precision, and among the yolo_v5 variants the yolo_v5s model has the smallest network depth and the smallest feature map width. Therefore, to balance detection speed and accuracy, the yolo_v5s model is used here for target detection.
In a specific embodiment provided by the application, to determine the target to be tracked through the yolo_v5s target detection model, a plurality of targets of interest (suspected targets to be tracked) in the fitted image corresponding to the visible light-infrared four-dimensional imaging data need to be determined first, and then the target to be tracked is determined from the determined set of targets of interest. Here, the plurality of targets of interest (suspected targets to be tracked) in the fitted image corresponding to the four-dimensional imaging data is represented by a set, denoted as:
ROIobj={obj0,obj1,obj2...obji};
Where obji represents the ith target of interest (suspected target to be tracked). It can be understood that each target detected by the yolo_v5s target detection model may be marked by a selected target frame. Therefore, each target (suspected target to be tracked) in the fitted image corresponding to the four-dimensional imaging data has a corresponding target frame, which includes at least five pieces of basic information (x, y, w, h, p), where x represents the upper-left-corner x coordinate of the target, y represents the upper-left-corner y coordinate of the target, w represents the image width of the target, h represents the image height of the target, and p represents the detection accuracy of the target. It should be noted that x and y here are used to represent the position information of the target frame corresponding to the target of interest, and should uniformly correspond to the same position of the same target frame. It should be understood that the upper left corner of the target of interest is used merely for convenience of understanding, and the specific reference position in practical application obviously does not limit the scope of the present application.
To determine a single target to be tracked in the suspected target set, besides referring to the detection accuracy of each suspected target, the position of each suspected target in the fitted image corresponding to the visible light-infrared four-dimensional imaging data should also be fully considered.
In a specific embodiment provided by the application, a detection accuracy threshold is preset, and the center point of the fitted image corresponding to the four-dimensional data is used as a reference point to determine a single target to be tracked in the set of targets of interest ROIobj. Specifically, selecting from the set of targets of interest ROIobj the suspected target whose detection accuracy is greater than the set threshold and which is closest to the set reference point as the target to be tracked may be expressed as:
objtrack = min{ sqrt((obji.x + obji.w/2 - Cx)^2 + (obji.y + obji.h/2 - Cy)^2) }, subject to obji.p > thr
Where min() represents taking the minimum value, obji.p represents the detection accuracy of the ith target of interest, thr represents the preset detection accuracy threshold, obji.x represents the upper-left-corner x coordinate of the ith target of interest, obji.y represents the upper-left-corner y coordinate of the ith target of interest, obji.w represents the image width of the ith target of interest, obji.h represents the image height of the ith target of interest, Cx represents the x coordinate of the center point of the fitted image corresponding to the four-dimensional data, and Cy represents the y coordinate of the center point of the fitted image corresponding to the four-dimensional data. It will be appreciated that the specific value of the preset target detection accuracy threshold obviously does not limit the scope of the present application.
It can be understood that all target characteristic information of the visible light wave band and the infrared wave band is contained in the visible light-infrared four-dimensional data obtained by dimension-increasing fitting. Therefore, the visible light-infrared four-dimensional data is used as the input of the target detection model, the target to be tracked with comprehensive characteristic information can be obtained, and the defect that a single visible light wave band is influenced by a rain and fog environment to lose the target or the characteristic information of the single infrared wave band target is less is effectively overcome.
S400, performing data dimension-reduction processing on the obtained visible light imaging data and infrared imaging data through a data dimension-reduction fitting model to obtain visible light-infrared imaging data containing visible light imaging data information and infrared imaging data information.
The data dimension-reduction fitting can be understood as performing data fusion processing on the visible light band imaging data and the infrared band imaging data. That is, the visible light image and the infrared image are compressed, which can be understood as "blurring" the images. The visible light-infrared imaging data obtained through dimension reduction is then taken as the input of the subsequent tracking algorithm.
It can be understood that the determined target to be tracked carries both its characteristic information in the visible band and its characteristic information in the infrared band. When tracking of the determined target is carried out, if tracking is performed only on the visible light image corresponding to the visible light detector and the target is blocked in the visible light image, the target cannot be accurately identified in subsequent image frames, causing target tracking to fail. Similarly, if tracking is performed only on the infrared image corresponding to the infrared detector, the low resolution and poor detail resolution of the infrared image mean that the target cannot be accurately identified in subsequent images, also causing target tracking to fail. Therefore, visible light-infrared imaging data containing both visible light imaging data information and infrared imaging data information is used as the reference image for subsequent target tracking, which increases the accuracy of target tracking.
On the basis of the determined characteristic information of the target to be tracked, in order to reduce the computational load of the subsequent target tracking model, dimension-reduction fitting is performed on the imaging data of the visible light detector and the imaging data of the infrared detector, which effectively reduces the amount of data to be processed by the tracking algorithm. Moreover, the data obtained by dimension-reduction fitting contains both the characteristic information in the visible light band and the characteristic information in the infrared band, so the accuracy of target tracking can be ensured.
In one specific embodiment provided by the application, the three-color channel (three-dimensional) imaging data of the obtained visible light image and the imaging data of the infrared image are subjected to dimension reduction fitting to be visible light-infrared one-dimensional (monochromatic channel) imaging data, and the imaging data are used as the input of a follow-up target tracking model. Therefore, the dimension of input data of the target tracking model is effectively reduced, the operand is reduced, and the target tracking efficiency is improved.
Specifically, the three-color channel imaging data of the obtained visible light image and the imaging data of the infrared image are subjected to data fusion and dimension-reduction processing using a weighting operation, expressed as follows:
F(i,j)=0.6(0.299·R(i,j)+0.587·G(i,j)+0.114·B(i,j))+0.4·I(i,j)
Wherein F(i, j) represents the one-dimensional (monochromatic channel) data after image fusion at the (i, j) coordinate, R(i, j), G(i, j), B(i, j) respectively represent the three color channels of the three-color channel imaging data of the visible light image at the (i, j) coordinate, and I(i, j) represents the monochromatic channel data of the imaging data of the infrared image at the (i, j) coordinate.
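The weighted fusion formula translates directly into code. As a minimal sketch (the nested-list array layout is an assumption, not specified by the application):

```python
def downfit(rgb, ir):
    """Fuse the three visible channels and the registered infrared
    channel into one channel per pixel: 0.6 x the standard RGB
    luminance plus 0.4 x the infrared value."""
    h, w = len(ir), len(ir[0])
    return [[0.6 * (0.299 * rgb[i][j][0] + 0.587 * rgb[i][j][1]
                    + 0.114 * rgb[i][j][2]) + 0.4 * ir[i][j]
             for j in range(w)] for i in range(h)]
```

For a grey pixel (100, 100, 100) with infrared value 50, the luminance weights sum to 1, so the fused value is 0.6·100 + 0.4·50 = 80.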
Further, in a preferred embodiment provided by the application, performing data dimension-reduction processing on the acquired visible light imaging data and infrared imaging data through a data dimension-reduction fitting model to obtain visible light-infrared imaging data containing visible light imaging data information and infrared imaging data information specifically includes: scaling the imaging data of the infrared image through a linear scaling model to obtain infrared image data with a resolution consistent with that of the visible light image; and performing data dimension-reduction processing on the obtained visible light imaging data and the infrared image data obtained through the scaling processing through the data dimension-reduction fitting model to obtain visible light-infrared imaging data containing visible light imaging data information and infrared imaging data information.
It will be appreciated that the visible light imaging data may be acquired by a visible light detector at a first resolution and the infrared imaging data may be acquired by an infrared detector at a second resolution. But the resolution of the infrared detector tends to be lower than that of the visible light detector. That is, the second resolution is not consistent with the first resolution. Therefore, after the data in the visible light band and the data in the infrared band are spliced, the data length may change, which is unfavorable for subsequent target detection and identification. That is, the target feature information of the visible light band and that of the infrared band may become overlapped and disordered in the dimension-reduction fitting process, and the reliability of the obtained data becomes low. Thus, the infrared band imaging data at the second resolution may be converted into infrared band data at the first resolution prior to performing the data dimension-reduction fitting.
Specifically, the infrared band imaging data at the second resolution is converted into the infrared band data at the first resolution by means of linear scaling. The detailed linear scaling procedure is described in the previous data dimension-increasing fitting procedure, and will not be described here.
The infrared band imaging data and the visible light band imaging data can be in the same resolution dimension through conversion of the infrared band imaging data under different resolutions. At this time, the dimension reduction fitting is performed on the infrared band imaging data and the visible light band imaging data, that is, the dimension reduction fitting is performed on the infrared band imaging data under the first resolution and the visible light band imaging data under the first resolution, so that the accuracy of target feature information in the visible light-infrared imaging data obtained through dimension reduction is increased.
Further, in a preferred embodiment of the present application, the data dimension-reducing processing is performed on the obtained visible light imaging data and the infrared image data obtained by the scaling processing through the data dimension-reducing fitting model to obtain visible light-infrared imaging data including visible light imaging data information and infrared imaging data information, which specifically includes converting the infrared image data obtained by the scaling processing into infrared image data registered with the visible light image through the image registration model; and performing data dimension reduction processing on the acquired visible light imaging data and the infrared image data obtained through registration through a data dimension reduction fitting model to obtain visible light-infrared imaging data containing visible light imaging data information and infrared imaging data information.
Image registration here can be understood as registration based on different detector imaging coordinates. That is, a mapping between imaging in the visible light band and imaging in the infrared band is established so that images taken by different sensors are spatially aligned. Therefore, the feature information of each region in the visible light-infrared imaging data obtained by dimension reduction fitting has higher accuracy.
The infrared image data obtained by the scaling process is converted into infrared image data registered with the visible light image through the image registration model, and the specific process is already described in the previous data dimension-increasing fitting process, and is not repeated here. The input data of the dimension-reduction fitting model is visible light imaging data under the first resolution and infrared image data which can be registered with a visible light image and is obtained through image registration. Therefore, each point of the images in the infrared band and the visible light band can be kept consistent, and the accuracy of the characteristic information of each region in the visible light-infrared imaging data obtained by dimension reduction fitting is improved, so that the accuracy of target tracking is improved.
S500, tracking the determined target to be tracked through a target tracking model according to the visible light-infrared imaging data obtained by dimension-reduction fitting.
It can be understood that the visible light-infrared imaging data obtained by dimension reduction fitting is used as the input of the target tracking model, so that the operation data quantity is reduced, the target tracking speed is improved, and the image characteristics of the infrared wave band and the visible light wave band are properly reserved. Notably, the imaging angle can directly affect the accuracy of target tracking. For example, an imaging perspective shift may cause a change or occlusion in the appearance of the object in the image, resulting in a drift or loss of the object. In the process of target tracking, conditions such as target shielding, deformation, size transformation, exceeding the field of view and the like often exist, and the accuracy of target tracking is directly affected. Therefore, training of the target tracking model needs to comprehensively consider the situations of target shielding, deformation, size transformation, out-of-view and the like which occur in the target tracking process, so that the target can be accurately tracked.
Further, in a preferred embodiment of the application, tracking the determined target to be tracked through the target tracking model by using the visible light-infrared imaging data obtained through dimension-reduction fitting specifically includes: obtaining a first position parameter and a first size parameter of the determined target to be tracked in a first fitted image corresponding to the visible light-infrared four-dimensional imaging data; screening tracking points of the determined target to be tracked according to the first position parameter and the first size parameter through the target tracking model, to obtain a target tracking point set composed of a plurality of tracking points for tracking the determined target; calculating a second position parameter of the target to be tracked in the next frame image according to the target tracking point set through the target tracking model; judging, according to the second position parameter and through the target tracking model, whether the determined target to be tracked exceeds the range of a second fitted image corresponding to the visible light-infrared imaging data obtained through dimension reduction; and re-determining the target to be tracked through the target tracking model when the determined target to be tracked exceeds the second fitted image range.
It will be appreciated that in the fitting image (first fitting image) of the visible light-infrared four-dimensional imaging data, there will be a target frame corresponding to the determined target, and at least 5 pieces of basic information are included (x, y, w, h, p). Wherein (x, y) is used to characterize the position of the target frame and (w, h) is used to characterize the size of the target frame. The first position parameter and the first size parameter can be understood as position parameters (x, y) and size parameters (w, h) of a target frame corresponding to the target to be tracked in a fitting image corresponding to the visible light-infrared four-dimensional imaging data, and can be directly acquired after the target to be tracked is determined.
A tracking point is understood here as a locating point representing the same position of the target to be tracked in each frame of image. Tracking point screening can be realized by uniformly generating points within the determined target size range, using a Lucas-Kanade tracker to track the points forward to frame t+1 and backward to frame t, calculating the forward-backward error of each tracking point between the two frames, and keeping the half of the points with the smallest error as the optimal tracking points. According to the change of the coordinates and mutual distances of the optimal tracking points, the position and size of the target to be tracked in the next frame image can be calculated. The position of the target to be tracked in the next frame image is calculated first, to judge whether it exceeds the range of the reference image. If it exceeds the reference image, the tracker has failed; in that case, the size of the target to be tracked in the next frame image need not be calculated, which increases the processing speed of the target tracking model. The position parameter corresponding to the position of the target to be tracked in the next frame image is referred to as the second position parameter.
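The forward-backward screening described above can be sketched as follows; the two tracker callbacks stand in for the Lucas-Kanade forward and backward passes (everything here is an illustrative assumption, not the patented tracker):

```python
def screen_tracking_points(points, track_fwd, track_bwd):
    """Median-flow style screening: track each point forward to frame
    t+1 and back to frame t, measure the forward-backward error, and
    keep the half of the points with the smallest error."""
    fwd = [track_fwd(p) for p in points]
    back = [track_bwd(q) for q in fwd]
    err = [((p[0] - b[0]) ** 2 + (p[1] - b[1]) ** 2) ** 0.5
           for p, b in zip(points, back)]
    order = sorted(range(len(points)), key=lambda i: err[i])
    keep = sorted(order[:len(points) // 2])
    return [points[i] for i in keep]
```

Points whose backward track lands far from where they started (occluded or drifting points) are discarded; the survivors form the target tracking point set.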
Specifically, the fitting image (second fitting image) corresponding to the visible light-infrared data obtained by dimension-reduction fitting is used as the reference image. The second position parameter is compared with the reference image; when the second position parameter exceeds the range of the reference image, the target to be tracked is indicated to be lost. At this time, the target to be tracked needs to be newly determined. The originally determined target to be tracked is referred to herein as the first target, and the re-detected target is referred to as the second target. The second target may be re-determined by capturing an image of a range around the first target and performing target detection with the target detection model. After the second target is determined, the position parameter and the size parameter corresponding to the second target become the relevant parameters of the subsequent target to be tracked. That is, the updating of the target to be tracked is completed, and continuous tracking is carried out according to the data obtained by dimension-reduction fitting of the next frame of image.
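The range check that decides whether the tracker has failed can be sketched as below; the helper name and the (height, width) representation of the reference image are illustrative assumptions.

```python
def tracker_valid(pos, size, ref_shape):
    """Return True while the predicted target frame still lies entirely inside
    the reference (second fitting) image; False indicates tracker failure,
    which triggers re-detection of the target."""
    x, y = pos
    w, h = size
    H, W = ref_shape  # reference image height and width
    return 0 <= x and 0 <= y and x + w <= W and y + h <= H
```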
By comparing the second position parameter with the reference image, there is also a case where the reference image range is not exceeded. At this time, it is also necessary to refer to the corresponding size parameter of the target to be tracked in the next frame image, so as to prevent the target tracking failure caused by the target size transformation due to factors such as the change of the shooting angle of view.
Further, in a preferred embodiment of the application, performing extended tracking on the determined target to be tracked according to the visible light-infrared imaging data obtained through dimension-reduction fitting and through a target tracking model comprises the steps of calculating a second size parameter of the target to be tracked in the second fitting image according to the target tracking point set and through the target tracking model when the determined target to be tracked does not exceed the range of the second fitting image, judging whether the first size parameter is consistent with the second size parameter according to the second size parameter and through the target tracking model, and re-determining the target to be tracked according to the visible light-infrared four-dimensional imaging data and through the target detection model when the first size parameter is inconsistent with the second size parameter.
The second size parameter may be understood as the size parameter of the target to be tracked in the next frame of image, calculated according to the target tracking points. Although the target to be tracked does not exceed the range of the reference image (the second fitting image), the second size parameter may be inconsistent with the first size parameter, which directly affects the accuracy of target tracking. The second size parameter being inconsistent with the first size parameter can be understood as the sizes of the target frames corresponding to the target to be tracked being different. When the second size parameter is inconsistent with the first size parameter, the size of the target to be tracked has changed. At this time, the target to be tracked needs to be newly determined. Likewise, the originally determined target to be tracked is regarded as the first target, and the re-detected target is regarded as the second target. The second target may be re-determined by capturing an image of a range around the first target and performing target detection with the target detection model. After the second target is determined, it has its own corresponding position parameter and size parameter.
It should be noted that the second target does not directly replace the first target as the new target to be tracked; rather, it is first determined, through the target tracking model, whether the size parameter corresponding to the second target and the second size parameter corresponding to the original first target meet the preset target tracking parameter replacement condition. That is, through the target tracking model, it is determined whether the size corresponding to the second target and the second size corresponding to the first target have a spatial intersection and satisfy a specific judgment criterion.
Specifically, it is assumed that the second size of the first target (the originally determined target to be tracked) is represented by a rectangular box [x1, y1, x2, y2], where (x1, y1) represents the upper left corner of the first target and (x2, y2) represents the lower right corner of the first target, and the size of the second target (the re-detected target) is represented by a rectangular box [x3, y3, x4, y4], where (x3, y3) represents the upper left corner of the second target and (x4, y4) represents the lower right corner of the second target. The judgment criterion of the target tracking model is based on the following quantities:
area_inter = (min(x2, x4) − max(x1, x3)) · (min(y2, y4) − max(y1, y3))
area_t = (x2 − x1) · (y2 − y1)
A spatial intersection exists only when both factors of area_inter are positive, and the replacement condition requires the ratio area_inter/area_t to reach a preset threshold.
where area_inter represents the area of the overlapping portion between the second size of the first target and the size of the second target, area_t represents the area of the second size, min() represents taking the minimum value, and max() represents taking the maximum value.
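Under the assumption that the replacement condition is an overlap-ratio test on area_inter and area_t, the judgment can be sketched as follows; the 0.5 threshold is an illustrative assumption, not a value given by the application.

```python
def replacement_condition(first_box, second_box, ratio_thr=0.5):
    """first_box is the second size of the first target as [x1, y1, x2, y2];
    second_box is the size of the second target as [x3, y3, x4, y4].
    Returns True when the boxes spatially intersect and the overlap covers at
    least ratio_thr of area_t (ratio_thr=0.5 is an assumed value)."""
    x1, y1, x2, y2 = first_box
    x3, y3, x4, y4 = second_box
    iw = min(x2, x4) - max(x1, x3)
    ih = min(y2, y4) - max(y1, y3)
    if iw <= 0 or ih <= 0:
        return False          # no spatial intersection
    area_inter = iw * ih
    area_t = (x2 - x1) * (y2 - y1)
    return area_inter / area_t >= ratio_thr
```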
Through target tracking model processing, when the size corresponding to the second target and the second size corresponding to the first target spatially intersect and meet the judgment criterion, the size of the first target is replaced with the size parameter corresponding to the second target, while the position parameter of the first target is not changed and is still the second position parameter. That is, the size parameter of the target to be tracked is determined as the size parameter corresponding to the re-detected target (second target), and the position parameter is the position parameter (second position parameter) of the originally tracked target calculated according to the tracking points. Then, the size parameter and the position parameter input to the target tracking model are updated to the size parameter corresponding to the re-detected target and the second position parameter respectively, and extended tracking is carried out according to the data obtained by dimension-reduction fitting of the next frame of image. Thus, tracking failure caused by deformation and size transformation of the target is effectively avoided.
Further, in a preferred embodiment of the present application, the method includes performing extended tracking on the determined target to be tracked according to the visible light-infrared imaging data obtained by dimension reduction fitting and through a target tracking model, specifically including performing extended tracking on the determined target to be tracked according to the visible light-infrared imaging data obtained by dimension reduction fitting and through a Median-Flow target tracking model.
The Median-Flow target tracking model is a target tracking model established based on the Median-Flow (median flow) tracking algorithm. The algorithm can track the size change of the target frame, so that the influence of target frame size change on tracking accuracy can be reduced.
Referring to fig. 3, the present application further provides a target tracking apparatus 100 for tracking a target according to a target tracking method. Specifically, the object tracking device 100 includes:
an acquisition module 11, configured to acquire three-color channel imaging data of a visible light image and imaging data of an infrared image;
The computing module 12 is used for carrying out data dimension-increasing processing on the obtained visible light imaging data and infrared imaging data through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data containing visible light imaging data information and infrared imaging data information;
The detection module 13 is used for determining a target to be tracked according to the obtained four-dimensional data and through a target detection model;
And the tracking module 14 is used for performing extended tracking on the determined target to be tracked according to the visible light-infrared imaging data obtained by dimension-reduction fitting and through a target tracking model.
The acquisition module 11 is used for acquiring three-color channel imaging data of the visible light image and imaging data of the infrared image. The three-color channel imaging data of the visible light image can be understood as three-color channel imaging data of the photographed object in the visible light wave band, and can be acquired by the visible light detector. The resolution of the visible light detector can be regarded as a first resolution, denoted as m0×n0, and the "three-color channel imaging data" can be understood as brightness values corresponding to each channel of red (R), green (G) and blue (B) of the visible light image in the RGB color mode. In the field of computer vision, the "three-color channel imaging data" herein may also be referred to as "three-dimensional imaging data". That is, each color channel corresponds to a certain color dimension.
The imaging data of the infrared image is understood to mean imaging data of the subject in the infrared band, which can be acquired by an infrared detector. The resolution of the infrared detector can be regarded as a second resolution, denoted as m1×n1, and the infrared image imaging data can be understood as monochromatic channel (one-dimensional) imaging data of the infrared image relative to the three-color channel (three-dimensional) imaging data of the visible light image.
The calculation module 12 is configured to perform data dimension-increasing processing on the obtained visible light imaging data and infrared imaging data through a data dimension-increasing fitting model, so as to obtain visible light-infrared four-dimensional imaging data including visible light imaging data information and infrared imaging data information. It can be understood that the infrared imaging data carries infrared radiation energy information of each region of the photographed object, and the visible light imaging data carries imaging information of the photographed object under the visible light wave band. However, the infrared imaging data has less target characteristic information, which is quite unfavorable for target identification and tracking, while the visible light imaging data is greatly influenced by environmental factors, so that target characteristics are easily lost. If target tracking is performed on visible light imaging data or infrared imaging data alone, effective identification of the target is not facilitated. Therefore, the infrared imaging data and the visible light imaging data are spliced, that is, dimension-increasing fitting of the data is performed. In this way, the infrared imaging data makes up for the information loss of the visible light imaging data caused by environmental factors such as poor illuminance, bad weather and smoke shielding, and the obtained four-dimensional data carries more comprehensive characteristic information. That is, all target characteristic information of the visible light band and the infrared band is contained in the visible light-infrared four-dimensional imaging data, so that the target detection result obtained subsequently is most comprehensive, effectively overcoming the defect that a single visible light band loses the target under the influence of rain and fog environments or that a single infrared band carries less target characteristic information.
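The dimension-increasing fitting can be sketched minimally as stacking the three visible channels and the one infrared channel into four-channel data, assuming the resolutions have already been matched; the function name is illustrative.

```python
import numpy as np

def updim_fit(rgb, ir):
    """Stack the three-color channel (three-dimensional) visible data and the
    monochromatic (one-dimensional) infrared data into visible light-infrared
    four-dimensional imaging data. Resolutions must already agree (the scaling
    and registration steps handle that)."""
    assert rgb.shape[:2] == ir.shape, "scale/register the IR data first"
    # np.dstack treats the 2-D IR array as a single extra channel
    return np.dstack([rgb, ir])
```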
Further, in a preferred embodiment provided in the present application, the calculation module 12 is configured to perform data dimension-increasing processing on the obtained visible light imaging data and infrared imaging data through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data including visible light imaging data information and infrared imaging data information, and specifically configured to perform scaling processing on the imaging data of the infrared image through a linear scaling model to obtain infrared imaging data with a resolution consistent with that of the visible light image, and perform data dimension-increasing processing on the obtained visible light imaging data and the infrared imaging data obtained through the scaling processing through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data including visible light imaging data information and infrared imaging data information.
It will be appreciated that the visible imaging data may be acquired by a visible light detector at a first resolution and the infrared imaging data may be acquired by an infrared detector at a second resolution. But the resolution of the infrared detector tends to be lower than that of the visible light detector. That is, the second resolution is not consistent with the first resolution. Therefore, after the data in the visible light band and the data in the infrared band are spliced, the data length may change, which is unfavorable for subsequent target detection and identification. That is, the target feature information in the visible light band and that in the infrared band may be superimposed and disordered in the four-dimensional data obtained by dimension-increasing fitting, and the reliability of the obtained four-dimensional data becomes low. Thus, the infrared band imaging data at the second resolution may be converted into infrared band data at the first resolution before performing the data dimension-increasing fitting.
Specifically, the infrared band imaging data at the second resolution is converted into infrared band data at the first resolution by means of linear scaling. Assuming that [x, y] is the coordinate of a certain point in the infrared band data before transformation and [X, Y] is the coordinate of the same point after transformation, the linear scaling process of the infrared band imaging data at the second resolution can be expressed as:
X = (m0/m1)·x, Y = (n0/n1)·y
where m0, n0 are the resolution of the visible light detector (first resolution), m1, n1 are the resolution of the infrared detector (second resolution), and the image pixel f(X, Y) corresponding to the [X, Y] coordinate can be determined by bilinear interpolation.
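A minimal sketch of this linear scaling with bilinear interpolation, in pure NumPy (the function name is illustrative; in practice a library resize routine would serve the same purpose):

```python
import numpy as np

def scale_ir(ir, m0, n0):
    """Linearly scale IR data from its native resolution m1 x n1 to the
    visible-light resolution m0 x n0; each target coordinate [X, Y] is mapped
    back to source coordinates and f(X, Y) is found by bilinear interpolation."""
    m1, n1 = ir.shape
    ys = np.linspace(0, m1 - 1, m0)        # source row coordinate for each X
    xs = np.linspace(0, n1 - 1, n0)        # source column coordinate for each Y
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, m1 - 1)
    x1 = np.minimum(x0 + 1, n1 - 1)
    wy = (ys - y0)[:, None]                # fractional weights
    wx = (xs - x0)[None, :]
    top = ir[y0][:, x0] * (1 - wx) + ir[y0][:, x1] * wx
    bot = ir[y1][:, x0] * (1 - wx) + ir[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy
```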
The infrared band imaging data and the visible light band imaging data can be in the same resolution dimension through conversion of the infrared band imaging data under different resolutions. At this time, the dimension-increasing fitting is performed on the infrared band imaging data and the visible light band imaging data, namely, the dimension-increasing fitting is performed on the infrared band imaging data under the first resolution and the visible light band imaging data under the first resolution, so that the accuracy of target feature information in the visible light-infrared four-dimensional imaging data is increased.
Further, in a preferred embodiment of the present application, the calculation module 12 is configured to perform data dimension-increasing processing on the obtained visible light imaging data and the infrared image data obtained by scaling processing through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data including visible light imaging data information and infrared imaging data information, and specifically is configured to convert the infrared image data obtained by scaling processing into infrared image data registered with the visible light image through an image registration model, and perform data dimension-increasing processing on the obtained visible light imaging data and the infrared image data obtained by registration through a data dimension-increasing fitting model to obtain visible light-infrared four-dimensional imaging data including visible light imaging data information and infrared imaging data information.
Image registration here can be understood as registration based on different detector imaging coordinates. That is, a mapping between imaging in the visible light band and imaging in the infrared band is established so that images taken by different sensors are spatially aligned. Thus, the feature information in the visible light-infrared four-dimensional imaging data obtained by dimension-increasing fitting has higher accuracy.
To convert the infrared image data obtained by the scaling process into infrared image data registered with the visible light image through the image registration model, the image registration parameters need to be determined first. In a specific embodiment provided by the application, enough homonymous points (imaging points of the same target) can be found in the visible light image corresponding to the visible light imaging data and the infrared image corresponding to the infrared imaging data respectively, and fitting registration is then carried out with a polynomial model. Assuming that the position of the homonymous point in the infrared band data corresponding to any point [X, Y] of the visible light image is [X0, Y0], the following polynomial model is established:
X0 = a0 + a1·X + a2·Y + a3·X^2 + a4·X·Y + a5·Y^2
Y0 = b0 + b1·X + b2·Y + b3·X^2 + b4·X·Y + b5·Y^2
Wherein a0, a1, a2, a3, a4, a5 and b0, b1, b2, b3, b4, b5 are polynomial parameters. In practical application, the values of parameters a0-a5, b0-b5 and the like can be determined by carrying out space coordinate calibration on the visible light detector and the infrared detector and carrying out data fitting.
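Assuming a quadratic polynomial registration model (X0 = a0 + a1·X + a2·Y + a3·X^2 + a4·X·Y + a5·Y^2, and similarly Y0 with b0-b5), the parameters can be fitted from matched homonymous points by least squares; this is a sketch of the data-fitting step, with illustrative names:

```python
import numpy as np

def fit_poly_params(src_pts, dst_pts):
    """Fit the quadratic polynomial registration parameters a0-a5 and b0-b5
    from matched homonymous points. src_pts holds [X, Y] in the visible image,
    dst_pts the corresponding [X0, Y0] in the infrared data; both are (N, 2)."""
    X, Y = src_pts[:, 0], src_pts[:, 1]
    # design matrix: one column per polynomial term
    A = np.column_stack([np.ones_like(X), X, Y, X**2, X * Y, Y**2])
    a, *_ = np.linalg.lstsq(A, dst_pts[:, 0], rcond=None)  # a0..a5
    b, *_ = np.linalg.lstsq(A, dst_pts[:, 1], rcond=None)  # b0..b5
    return a, b
```

At least six well-spread homonymous point pairs are needed for the twelve parameters to be determined.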
And according to the determined image registration parameters, registering the infrared band image with the first resolution and the visible light image with the first resolution can be performed. That is, the infrared image data obtained by the scaling processing is converted into infrared image data registered with the visible light image. At this time, the input data of the dimension-increasing fitting model is visible light imaging data at the first resolution and infrared image data which is obtained through image registration and can be registered with a visible light image. Therefore, each point of the images of the infrared band and the visible light band can be kept consistent, and the accuracy of target characteristic information in the visible light-infrared four-dimensional imaging data obtained by dimension-increasing fitting is improved, so that the accuracy of target identification is improved.
And the detection module 13 is used for determining the target to be tracked according to the obtained four-dimensional data and through a target detection model. It can be understood that the four-dimensional data contains imaging information of both infrared wave bands and visible wave bands, so that the characteristic information of the target area is greatly enriched, and the detected target is more accurate. The target detection model is mainly used for determining a target to be tracked in a four-dimensional image obtained by fitting.
Further, in a preferred embodiment of the present application, the detection module 13 is configured to determine the target to be tracked according to the obtained four-dimensional data and through a target detection model, and specifically configured to determine the target to be tracked according to the obtained four-dimensional data and through a yolo_v5s target detection model.
The yolo_v5s target detection model is a detection model obtained through neural network training. In a specific embodiment provided by the application, training optimization of the yolo_v5s target detection model can be performed in a negative feedback training mode. Compared with other versions of detection models, the yolo_v5 series has greatly improved data processing speed and precision, and the yolo_v5s version has the smallest network depth and the smallest feature map width. Therefore, to balance detection speed and accuracy, the yolo_v5s model is used here for target detection.
In a specific embodiment provided by the application, a target to be tracked is determined through a yolo_v5s target detection model, a plurality of targets of interest (suspected targets to be tracked) in a fitting image corresponding to visible light-infrared four-dimensional imaging data are required to be determined first, and then the targets to be tracked are determined from the determined target set of interest. Here, a plurality of interested targets (suspected targets to be tracked) in the fitting image corresponding to the four-dimensional imaging data are represented by a set, and are marked as follows:
ROIobj = {obj0, obj1, obj2, ..., obji};
where obji represents the i-th target of interest (suspected target to be tracked). It can be understood that each target detected by the yolo_v5s target detection model is marked by a target frame. Therefore, each target (suspected target to be tracked) in the fitting image corresponding to the four-dimensional imaging data has a corresponding target frame, which includes at least five pieces of basic information (x, y, w, h, p), where x represents the upper-left-corner x coordinate of the target, y represents the upper-left-corner y coordinate of the target, w represents the image width of the target, h represents the image height of the target, and p represents the detection accuracy of the target. It should be noted that x and y are used here to represent the position information of the target frame corresponding to the target of interest, and they should uniformly correspond to the same position of the same target frame. It should be understood that taking the upper left corner of the target of interest is merely for convenience of understanding, and the specific reference position in practical application obviously does not limit the scope of the present application.
To determine a single target to be tracked in the suspected target set, besides referring to the detection accuracy of each suspected target, the position of each suspected target in the fitting image corresponding to the visible light-infrared four-dimensional imaging data should also be fully considered.
In a specific embodiment provided by the application, a detection accuracy threshold is preset, and the center point of the fitting image corresponding to the four-dimensional data is used as the reference point to determine the single target to be tracked in the target set of interest ROIobj. Specifically, selecting from the target data set of interest ROIobj the suspected target whose detection accuracy is greater than the set threshold and which is closest to the set reference point as the target to be tracked may be expressed as:
min( sqrt((obji.x + obji.w/2 − Cx)^2 + (obji.y + obji.h/2 − Cy)^2) ), obji.p > thr
where min() represents taking the minimum value, obji.p represents the detection accuracy of the i-th target of interest, thr represents the preset detection accuracy threshold, obji.x represents the upper-left-corner x coordinate of the i-th target of interest, obji.y represents the upper-left-corner y coordinate of the i-th target of interest, obji.w represents the image width of the i-th target of interest, obji.h represents the image height of the i-th target of interest, Cx represents the x coordinate of the center point of the fitting image corresponding to the four-dimensional data, and Cy represents the y coordinate of the center point of the fitting image corresponding to the four-dimensional data. It will be appreciated that the specific value of the preset target detection accuracy threshold obviously does not limit the scope of the present application.
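The selection rule can be sketched as follows; the dict-based candidate representation and the 0.5 threshold value are illustrative assumptions.

```python
def select_target(rois, cx, cy, thr=0.5):
    """From the interest set ROIobj keep the candidates whose detection
    accuracy p exceeds thr, then pick the one whose frame centre is nearest
    the reference point (Cx, Cy). Each roi is a dict with keys x, y, w, h, p."""
    candidates = [r for r in rois if r["p"] > thr]
    if not candidates:
        return None  # no suspected target passes the accuracy threshold
    return min(candidates,
               key=lambda r: ((r["x"] + r["w"] / 2 - cx) ** 2 +
                              (r["y"] + r["h"] / 2 - cy) ** 2))
```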
It can be understood that all target characteristic information of the visible light wave band and the infrared wave band is contained in the visible light-infrared four-dimensional data obtained by dimension-increasing fitting. Therefore, the visible light-infrared four-dimensional data is used as the input of the target detection model, the target to be tracked with comprehensive characteristic information can be obtained, and the defect that a single visible light wave band is influenced by a rain and fog environment to lose the target or the characteristic information of the single infrared wave band target is less is effectively overcome. Therefore, the calculation module 12 is further configured to perform data dimension reduction processing on the obtained visible light imaging data and infrared imaging data through a data dimension reduction fitting model, so as to obtain visible light-infrared imaging data including visible light imaging data information and infrared imaging data information.
The data dimension-reduction fitting can be understood as performing data fusion processing on the visible light band imaging data and the infrared band imaging data. That is, the visible light image and the infrared image are compressed, which can be understood as "blurring" the images. The visible light-infrared imaging data obtained through dimension reduction is then taken as the input of the subsequent tracking algorithm.
It can be understood that the determined object to be tracked carries both its characteristic information in the visible band and its characteristic information in the infrared band. When the determined target to be tracked is unfolded for target tracking, tracking is only performed on the visible light image corresponding to the visible light detector, if the condition that the target is blocked exists in the visible light image, the target cannot be accurately identified in the subsequent image frame, and thus target tracking failure is caused. Similarly, tracking is only performed on an infrared image corresponding to the infrared detector, and due to low resolution and poor detail resolution of the infrared image, a target cannot be accurately identified in a subsequent image, and target tracking failure is also caused. Therefore, the visible light-infrared imaging data including the visible light imaging data information and the infrared imaging data information is used as a reference image for subsequent object tracking, so that the accuracy of object tracking is increased.
Based on the determined characteristic information of the target to be tracked, in order to reduce the operation amount of a follow-up target tracking model, dimension reduction fitting is performed on imaging data of the visible light detector and imaging data of the infrared detector, so that the operation of the data amount in a tracking algorithm is effectively reduced. And the data obtained by dimension reduction fitting not only contains the characteristic information under the visible light wave band, but also contains the characteristic information under the infrared wave band, so that the accuracy of target tracking can be ensured.
In one specific embodiment provided by the application, the three-color channel (three-dimensional) imaging data of the obtained visible light image and the imaging data of the infrared image are subjected to dimension reduction fitting to be visible light-infrared one-dimensional (monochromatic channel) imaging data, and the imaging data are used as the input of a follow-up target tracking model. Therefore, the dimension of input data of the target tracking model is effectively reduced, the operand is reduced, and the target tracking efficiency is improved.
Specifically, the three-color channel imaging data of the obtained visible light image and the imaging data of the infrared image are subjected to data fusion and dimension reduction processing, and the weighting operation is adopted, which is expressed as follows:
F(i,j)=0.6(0.299·R(i,j)+0.587·G(i,j)+0.114·B(i,j))+0.4·I(i,j)
wherein F(i, j) represents the one-dimensional (monochromatic channel) data after image fusion at the (i, j) coordinate, R(i, j), G(i, j) and B(i, j) respectively represent the three color channel data of the visible light image at the (i, j) coordinate, and I(i, j) represents the monochromatic channel data of the infrared image at the (i, j) coordinate.
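The weighted fusion formula above maps directly to array operations; a minimal sketch (function name illustrative):

```python
import numpy as np

def downdim_fuse(rgb, ir):
    """Weighted dimension-reduction fusion
    F = 0.6*(0.299*R + 0.587*G + 0.114*B) + 0.4*I, collapsing three visible
    channels plus one infrared channel into a single monochromatic channel."""
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.6 * (0.299 * R + 0.587 * G + 0.114 * B) + 0.4 * ir
```

Since 0.299 + 0.587 + 0.114 = 1 and 0.6 + 0.4 = 1, the fused value stays within the dynamic range of the inputs.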
Further, in a preferred embodiment of the present application, the calculation module 12 is configured to perform data dimension reduction processing on the obtained visible light imaging data and infrared imaging data through a data dimension reduction fitting model to obtain visible light-infrared imaging data including visible light imaging data information and infrared imaging data information, and specifically configured to perform scaling processing on the imaging data of the infrared image through a linear scaling model to obtain infrared imaging data with resolution consistent with that of the visible light image, and perform data dimension reduction processing on the obtained visible light imaging data and the infrared imaging data obtained through the scaling processing through a data dimension reduction fitting model to obtain visible light-infrared imaging data including visible light imaging data information and infrared imaging data information.
It will be appreciated that the visible imaging data may be acquired by a visible light detector at a first resolution and the infrared imaging data may be acquired by an infrared detector at a second resolution. But the resolution of the infrared detector tends to be lower than that of the visible light detector. That is, the second resolution is not consistent with the first resolution. Therefore, after the data in the visible light band and the data in the infrared band are fused, the data length may change, which is unfavorable for subsequent target detection and identification. That is, the target feature information in the visible light band and that in the infrared band may be superimposed and disordered in the dimension-reduction fitting process, and the reliability of the obtained data becomes low. Thus, the infrared band imaging data at the second resolution may be converted into infrared band data at the first resolution before performing the data dimension-reduction fitting.
Specifically, the infrared band imaging data at the second resolution is converted into the infrared band data at the first resolution by means of linear scaling. The detailed linear scaling procedure is described in the previous data dimension-increasing fitting procedure, and will not be described here.
The infrared band imaging data and the visible light band imaging data can be in the same resolution dimension through conversion of the infrared band imaging data under different resolutions. At this time, the dimension reduction fitting is performed on the infrared band imaging data and the visible light band imaging data, that is, the dimension reduction fitting is performed on the infrared band imaging data under the first resolution and the visible light band imaging data under the first resolution, so that the accuracy of target feature information in the visible light-infrared imaging data obtained through dimension reduction is increased.
Further, in a preferred embodiment of the present application, the calculation module 12 is configured to perform data dimension reduction processing on the obtained visible light imaging data and the infrared image data obtained by scaling processing through a data dimension reduction fitting model to obtain visible light-infrared imaging data including visible light imaging data information and infrared imaging data information, and specifically is configured to convert the infrared image data obtained by scaling processing into infrared image data registered with the visible light image through an image registration model, and perform data dimension reduction processing on the obtained visible light imaging data and the infrared image data obtained by registration through a data dimension reduction fitting model to obtain visible light-infrared imaging data including visible light imaging data information and infrared imaging data information.
Image registration here can be understood as registration based on the imaging coordinates of the different detectors. That is, a mapping between imaging in the visible light band and imaging in the infrared band is established so that images captured by different sensors are spatially aligned. As a result, the feature information of each region in the visible light-infrared imaging data obtained by the dimension-reduction fitting has higher accuracy.
The specific process of converting the infrared image data obtained by the scaling processing into infrared image data registered with the visible light image through the image registration model has already been described in the preceding data dimension-increasing fitting process and is not repeated here. The input data of the dimension-reduction fitting model are the visible light imaging data at the first resolution and the infrared image data registered with the visible light image through image registration. In this way, each point of the infrared band image remains consistent with the corresponding point of the visible light band image, which improves the accuracy of the feature information of each region in the visible light-infrared imaging data obtained by the dimension-reduction fitting, and thereby improves the accuracy of target tracking.
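The embodiment does not disclose the internal form of the dimension-reduction fitting model. As a purely illustrative sketch, one reading consistent with the "visible light-infrared four-dimensional imaging data" mentioned below is that each pixel of the registered pair is fitted into a single four-component sample (R, G, B, IR); the function below assumes that reading:

```python
def fit_four_dimensional(visible, infrared):
    """Combine a registered visible frame (rows of (R, G, B) tuples) and
    an infrared frame (rows of intensities) of the same resolution into
    one (R, G, B, IR) sample per pixel position. This per-pixel channel
    concatenation is an assumption, not the disclosed fitting model."""
    fused = []
    for vis_row, ir_row in zip(visible, infrared):
        fused.append([(r, g, b, ir) for (r, g, b), ir in zip(vis_row, ir_row)])
    return fused
```

Because the two inputs share the first resolution and are registered, corresponding list positions refer to the same scene point.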
The tracking module 14 is used for carrying out tracking of the determined target to be tracked according to the visible light-infrared imaging data obtained by the dimension-reduction fitting and through a target tracking model. It can be understood that using the visible light-infrared imaging data obtained by the dimension-reduction fitting as the input of the target tracking model reduces the amount of operation data and improves the target tracking speed, while properly retaining the image characteristics of both the infrared band and the visible light band. Notably, the imaging angle of view can directly affect the accuracy of target tracking. For example, a shift in the imaging angle of view may change or occlude the appearance of the target in the image, resulting in drift or loss of the target. In the target tracking process, conditions such as target occlusion, deformation, size change and exceeding the field of view often occur and directly affect the accuracy of target tracking. Therefore, the training of the target tracking model needs to comprehensively consider the target occlusion, deformation, size change, out-of-view and similar situations occurring in the target tracking process, so that the target can be tracked accurately.
Further, in a preferred embodiment of the present application, the tracking module 14, which is configured to carry out tracking of the determined target to be tracked according to the visible light-infrared imaging data obtained by the dimension-reduction fitting and through the target tracking model, is specifically configured to: obtain a first position parameter and a first size parameter of the determined target to be tracked in a first fitting image corresponding to the visible light-infrared four-dimensional imaging data; screen tracking points of the determined target to be tracked according to the first position parameter and the first size parameter and through the target tracking model, to obtain a target tracking point set composed of a plurality of tracking points for tracking the determined target; calculate a second position parameter of the target to be tracked in the next frame image according to the target tracking point set and through the target tracking model; determine, according to the second position parameter and through the target tracking model, whether the determined target to be tracked exceeds, in the next frame image, the range of a second fitting image corresponding to the visible light-infrared imaging data obtained by the dimension-reduction fitting; and, when it does, re-determine the target to be tracked according to the visible light-infrared four-dimensional imaging data and through the target detection model.
It will be appreciated that in the fitting image (first fitting image) of the visible light-infrared four-dimensional imaging data, there will be a target frame corresponding to the determined target, containing at least five pieces of basic information (x, y, w, h, p), wherein (x, y) characterizes the position of the target frame and (w, h) characterizes its size. The first position parameter and the first size parameter can be understood as the position parameters (x, y) and size parameters (w, h) of the target frame corresponding to the target to be tracked in the fitting image corresponding to the visible light-infrared four-dimensional imaging data, and can be acquired directly once the target to be tracked is determined.
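The target frame described above may be represented, for illustration, as a simple named tuple. Note that the embodiment does not state what p denotes; treating it as a detection confidence score is an assumption of this sketch:

```python
from collections import namedtuple

# Target frame carrying the five basic pieces of information (x, y, w, h, p).
# (x, y) characterizes the position, (w, h) the size; interpreting p as a
# detection confidence is an assumption, since the text only names the field.
TargetFrame = namedtuple("TargetFrame", ["x", "y", "w", "h", "p"])

box = TargetFrame(x=120, y=80, w=32, h=48, p=0.97)
```

The first position parameter is then (box.x, box.y) and the first size parameter is (box.w, box.h).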
A tracking point is understood here as a locating point representing the same position of the target to be tracked in each frame image. Tracking point screening can be realized by uniformly generating a number of points within the determined target size range, tracking these points forward from frame t to frame t+1 and backward from frame t+1 to frame t with a Lucas-Kanade tracker, calculating the error of each tracking point between the two frames, and retaining the half of the points with the smallest error as the optimal tracking points. From the changes in the coordinates of, and distances between, the optimal tracking points, the position and size of the target to be tracked in the next frame image can be calculated. The position of the target to be tracked in the next frame image is calculated first, in order to judge whether it exceeds the range of the reference image. If it exceeds the reference image, the tracker has failed; in this case the size of the target to be tracked in the next frame image does not need to be calculated, which increases the processing speed of the target tracking model. The position parameter corresponding to the position of the target to be tracked in the next frame image is referred to as a second position parameter.
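The forward-backward screening step described above may be sketched as follows. The Lucas-Kanade tracking itself (typically an optical-flow routine) is outside this sketch; the function assumes that the forward-tracked (frame t+1) and backward-tracked (back to frame t) positions of each point have already been computed, and its name is illustrative:

```python
import math

def screen_tracking_points(points, forward, backward):
    """Keep the half of the tracking points with the smallest
    forward-backward error, i.e. the distance between a point's original
    position in frame t and its position after tracking forward to
    frame t+1 and then backward to frame t."""
    errors = [math.dist(p, b) for p, b in zip(points, backward)]
    # Sort point indices by ascending error and keep the better half.
    order = sorted(range(len(points)), key=lambda i: errors[i])
    keep = order[:len(points) // 2]
    return [points[i] for i in keep], [forward[i] for i in keep]
```

The retained pairs of frame-t and frame-t+1 positions are the optimal tracking points from which the second position parameter is computed.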
Specifically, the fitting image (second fitting image) corresponding to the visible light-infrared data obtained by the dimension-reduction fitting is used as the reference image. The second position parameter is compared with the reference image; when the second position parameter exceeds the range of the reference image, the target to be tracked has been lost and needs to be re-determined. The originally determined target to be tracked is referred to herein as the first target, and the re-detected target as the second target. The second target may be re-determined by capturing an image of a range around the first target and performing target detection with the target detection model. After the second target is determined, the position parameter and size parameter corresponding to the second target become the relevant parameters of the subsequent target to be tracked. That is, the update of the target to be tracked is completed, and continuous tracking is carried out according to the data obtained by the dimension-reduction fitting of the next frame image.
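The out-of-range judgment may be sketched as follows. The sketch assumes the position parameter (x, y) denotes the upper left corner of the target frame, which is consistent with the rectangular-box convention used later, though the text does not state it explicitly:

```python
def exceeds_reference(x, y, w, h, ref_w, ref_h):
    """Return True when the predicted target frame falls (partly)
    outside the reference image of size ref_w x ref_h, meaning the
    tracker has failed and the target must be re-detected."""
    return x < 0 or y < 0 or x + w > ref_w or y + h > ref_h
```

When this returns True, the size calculation for the next frame is skipped and re-detection around the first target is triggered instead.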
When the second position parameter is compared with the reference image, there is also the case where the range of the reference image is not exceeded. In this case it is still necessary to examine the corresponding size parameter of the target to be tracked in the next frame image, so as to prevent target tracking failure caused by a change in target size due to factors such as a change in the shooting angle of view.
Further, in a preferred embodiment of the present application, the tracking module 14, which is configured to carry out tracking of the determined target to be tracked according to the visible light-infrared imaging data obtained by the dimension-reduction fitting and through the target tracking model, is further configured to: calculate, when the determined target to be tracked does not exceed the second fitting image range, a second size parameter of the target to be tracked in the second fitting image according to the target tracking point set and through the target tracking model; determine, according to the second size parameter and through the target tracking model, whether the first size parameter is consistent with the second size parameter; and, when the first size parameter is inconsistent with the second size parameter, re-determine the target to be tracked according to the visible light-infrared four-dimensional imaging data and through the target detection model.
The second size parameter may be understood as the size parameter corresponding to the target to be tracked in the next frame image, calculated from the tracking points. Even though the target to be tracked does not exceed the range of the reference image (the second fitting image), the second size parameter may be inconsistent with the first size parameter, which directly affects the accuracy of target tracking. The second size parameter being inconsistent with the first size parameter can be understood as the sizes of the target frames corresponding to the target to be tracked being different; in that case the size of the target to be tracked has changed, and the target to be tracked needs to be re-determined. Likewise, the originally determined target to be tracked is regarded as the first target, and the re-detected target as the second target. The second target may be re-determined by capturing an image of a range around the first target and performing target detection with the target detection model. After the second target is determined, the position parameter and the size parameter corresponding to the second target are obtained.
It should be noted that the second target does not directly replace the target to be tracked; rather, it is first determined through the target tracking model whether the size parameter corresponding to the second target and the second size parameter corresponding to the original first target meet a preset target tracking parameter replacement condition. That is, the target tracking model determines whether the size corresponding to the second target and the second size corresponding to the first target intersect spatially and satisfy a specific determination criterion.
Specifically, assume that the second size of the first target (the originally determined target to be tracked) is represented by a rectangular box [x1, y1, x2, y2], where (x1, y1) is the upper left corner and (x2, y2) is the lower right corner of the first target, and that the size of the second target (the re-detected target) is represented by a rectangular box [x3, y3, x4, y4], where (x3, y3) is the upper left corner and (x4, y4) is the lower right corner of the second target. The target tracking model determination criterion is as follows:

    area_inter = max(0, min(x2, x4) − max(x1, x3)) × max(0, min(y2, y4) − max(y1, y3))
    area_t = (x2 − x1) × (y2 − y1)
    area_inter / area_t > T

where area_inter represents the area of the overlapping portion of the first target, at the second size, and the second target; area_t represents the area of the second size; min() represents the minimum value; max() represents the maximum value; and T is a preset threshold.
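The replacement condition may be sketched as follows. The threshold value is not disclosed in the text; 0.5 below is an assumed placeholder, and the function name is illustrative:

```python
def meets_replacement_condition(first, second, threshold=0.5):
    """Check whether the re-detected (second) target's box spatially
    intersects the first target's second-size box and the overlap ratio
    area_inter / area_t exceeds a threshold (0.5 is an assumed value)."""
    x1, y1, x2, y2 = first    # first target, second size [x1, y1, x2, y2]
    x3, y3, x4, y4 = second   # re-detected second target [x3, y3, x4, y4]
    inter_w = min(x2, x4) - max(x1, x3)
    inter_h = min(y2, y4) - max(y1, y3)
    if inter_w <= 0 or inter_h <= 0:
        return False          # no spatial intersection at all
    area_inter = inter_w * inter_h
    area_t = (x2 - x1) * (y2 - y1)
    return area_inter / area_t > threshold
```

Only when this condition holds is the size parameter of the target to be tracked replaced by that of the second target.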
Through the target tracking model processing, when the size corresponding to the second target and the second size corresponding to the first target intersect spatially and meet the determination criterion, the size of the first target is replaced with the size parameter corresponding to the second target, while the position parameter of the first target is unchanged and remains the second position parameter. That is, the size parameter of the target to be tracked is determined to be the size parameter corresponding to the re-detected target (the second target), and the position parameter is the position parameter of the originally tracked target calculated from the tracking points (the second position parameter). The size parameter and position parameter input to the target tracking model are then updated to the size parameter corresponding to the re-detected target and the second position parameter respectively, and tracking is carried out according to the data obtained by the dimension-reduction fitting of the next frame image. In this way, target tracking failure caused by target deformation and size change is effectively avoided.
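The parameter-update rule described above amounts to taking the size from one source and the position from another; a minimal sketch, with an illustrative function name, is:

```python
def update_target_parameters(second_position, second_target_size,
                             first_size, condition_met):
    """When the replacement condition is met, adopt the size of the
    re-detected (second) target but keep the position computed from the
    tracking points (the second position parameter); otherwise keep the
    original size."""
    size = second_target_size if condition_met else first_size
    return second_position, size
```

The returned pair is fed back into the target tracking model as the position and size parameters for the next frame.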
Further, in a preferred embodiment of the present application, the tracking module 14, which is configured to carry out tracking of the determined target to be tracked according to the visible light-infrared imaging data obtained by the dimension-reduction fitting and through the target tracking model, is specifically configured to carry out tracking of the determined target to be tracked according to the visible light-infrared imaging data obtained by the dimension-reduction fitting and through a Median-Flow target tracking model.
The Median-Flow target tracking model is a target tracking model established based on the Median-Flow (median flow) tracking algorithm. The algorithm can track changes in the size of the target frame, thereby reducing the influence of target frame size changes on tracking accuracy.
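The core of the median-flow idea is that the box displacement is estimated as the median of the tracking-point displacements, and the scale change as the median of the ratios of pairwise point distances between frames. A minimal sketch, assuming the screened optimal tracking points of frames t and t+1 as input:

```python
import math
from statistics import median

def median_flow_update(old_pts, new_pts, box_w, box_h):
    """Median-Flow style box update: displacement is the median of the
    per-point shifts; the scale factor is the median of the ratios of
    pairwise distances in frame t+1 to those in frame t."""
    dx = median(nx - ox for (ox, oy), (nx, ny) in zip(old_pts, new_pts))
    dy = median(ny - oy for (ox, oy), (nx, ny) in zip(old_pts, new_pts))
    ratios = []
    n = len(old_pts)
    for i in range(n):
        for j in range(i + 1, n):
            d_old = math.dist(old_pts[i], old_pts[j])
            d_new = math.dist(new_pts[i], new_pts[j])
            if d_old > 0:
                ratios.append(d_new / d_old)
    scale = median(ratios) if ratios else 1.0
    return dx, dy, box_w * scale, box_h * scale
```

Using medians makes the update robust to the minority of tracking points that drift onto the background or are occluded.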
It should be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element preceded by "comprises a" or "comprising" does not exclude the presence of additional identical elements in the process, method, article or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.