CN116468753A - Target tracking method, apparatus, device, storage medium, and program product
- Publication number: CN116468753A
- Application number: CN202310279713.5A
- Authority: CN (China)
- Prior art keywords: target, tracking, feature, initial, determining
- Prior art date
- Legal status: Pending
Classifications
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
- G06V10/56—Extraction of image or video features relating to colour
- G06V10/761—Proximity, similarity or dissimilarity measures
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06T2207/10016—Video; Image sequence
Abstract
The present application relates to a target tracking method, apparatus, device, storage medium, and program product. The method comprises the following steps: acquiring a tracking condition corresponding to a target at the current moment during target tracking; determining the tracking state of the target according to the tracking condition, the tracking state being either tracking success or tracking failure; and, if the tracking state is tracking failure, performing failure recovery processing on the target based on the current frame image and determining a tracking result corresponding to the target. With this method, targets such as pets can be tracked accurately.
Description
Technical Field
The present invention relates to the field of target tracking technology, and in particular, to a target tracking method, apparatus, device, storage medium, and program product.
Background
Target tracking refers to locating a particular target across consecutive video frames; by tracking the target, it can be better displayed, monitored, and so on.
At present, in the pet field, a dedicated device with a communication module such as Bluetooth, for example hardware such as a collar, is generally fitted to the pet; a positioning instruction is then sent to a positioning module through the Bluetooth module, and the positioning module acquires positioning data of the pet according to the instruction, so that positioning of the pet is finally completed and the pet is considered tracked.
However, this technique cannot accurately track targets such as pets in video.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a target tracking method, apparatus, device, storage medium, and program product that can accurately track a target such as a pet.
In a first aspect, the present application provides a target tracking method, the method comprising:
acquiring a tracking condition corresponding to the target at the current moment in the target tracking process;
determining the tracking state of the target according to the tracking condition; the tracking state includes tracking success or tracking failure;
if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
In one embodiment, the tracking condition includes: whether the target has been tracked to a key frame image and/or whether tracking of the target has failed for a preset number of consecutive frames; the key frame image is an image whose time sequence is after the initial frame image.
In one embodiment, the tracking condition is that the target is tracked to the key frame image, and the determining the tracking state of the target according to the tracking condition includes:
acquiring an initial feature corresponding to the target in the initial frame image;
acquiring a re-identification feature corresponding to the target in the key frame image;
and determining the tracking state of the target according to the initial feature and the re-identification feature.
In one embodiment, determining the tracking state of the target according to the initial feature and the re-identification feature includes:
performing matching processing on the initial feature and the re-identification feature to obtain a first matching result;
and determining the tracking state of the target according to the first matching result.
In one embodiment, the matching processing of the initial feature and the re-identification feature to obtain a first matching result includes:
calculating the similarity between the initial feature and the re-identification feature;
and determining a first matching result according to the similarity.
In one embodiment, determining the tracking state of the target according to the first matching result includes:
if the first matching result is a matching success, determining the tracking state of the target as a tracking success;
if the first matching result is a matching failure, determining the tracking state of the target as a tracking failure.
In one embodiment, the initial feature and the re-identification feature are both color features, and the calculating the similarity between the initial feature and the re-identification feature includes:
dividing the value ranges corresponding to the three channels of the initial feature into segments to obtain a plurality of initial segments; each initial segment includes a set number of pixels;
dividing the value ranges corresponding to the three channels of the re-identification feature into segments to obtain a plurality of re-identification segments; each re-identification segment includes a set number of pixels;
and determining the similarity between the initial feature and the re-identification feature according to each initial segment and each re-identification segment.
In one embodiment, the determining the similarity between the initial feature and the re-identification feature according to each initial segment and each re-identification segment includes:
removing the segments with a zero pixel count from the initial segments to obtain a first color feature vector;
removing the segments with a zero pixel count from the re-identification segments to obtain a second color feature vector;
and calculating the similarity between the first color feature vector and the second color feature vector.
In one embodiment, the performing failure recovery processing on the target based on the current frame image and determining the tracking result corresponding to the target includes:
executing a failure recovery operation, wherein the failure recovery operation includes: identifying each candidate object in the current frame image and extracting its features to obtain candidate features of each candidate object; and determining a tracking result corresponding to the target according to the initial feature and each candidate feature.
In one embodiment, determining the tracking result corresponding to the target according to the initial feature and each candidate feature includes:
matching the initial feature with each candidate feature to obtain each second matching result;
and determining a tracking result corresponding to the target according to each second matching result.
In one embodiment, determining the tracking result corresponding to the target according to each second matching result includes:
and if the second matching results are all matching failures, taking the next frame image after the current frame image as the new current frame image, and returning to execute the failure recovery operation until a preset cycle cut-off condition is reached, so as to obtain the tracking result corresponding to the target.
In one embodiment, the preset cycle cut-off condition includes at least one of the following: one of the second matching results is a matching success; a preset recovery duration is reached; a preset number of image frames is reached.
In one embodiment, the target is a pet, and the initial feature corresponding to the target includes at least one of a color feature, a histogram of oriented gradients (HOG) feature, a deep feature, and a category attribute.
In one embodiment, the method further comprises:
and displaying the target frame where the target is located in each frame image during the target tracking process.
In one embodiment, the method further comprises:
and adjusting the position of the pan-tilt in real time according to the position of the target frame, so that the target frame and the target in the target frame are displayed at a set position of the display interface.
In one embodiment, the method further comprises:
and performing slow-motion processing on the target in the target frame and displaying it.
In a second aspect, the present application also provides a target tracking device, the device comprising:
the condition acquisition module is used for acquiring the tracking condition corresponding to the target at the current moment in the target tracking process;
the state determining module is used for determining the tracking state of the target according to the tracking condition; the tracking state includes tracking success or tracking failure;
and the tracking module is used for carrying out failure recovery processing on the target based on the current frame image if the tracking state is the tracking failure, and determining a tracking result corresponding to the target.
In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a tracking condition corresponding to the target at the current moment in the target tracking process;
determining the tracking state of the target according to the tracking condition; the tracking state includes tracking success or tracking failure;
if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
In a fourth aspect, the present application also provides a target tracking system, comprising:
a pan-tilt;
and a photographing device connected to the pan-tilt, the photographing device comprising the computer device of the third aspect.
In a fifth aspect, the present application further provides a pan-tilt, comprising:
a photographing device;
and the computer device of the above third aspect, connected to the photographing device.
In a sixth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a tracking condition corresponding to the target at the current moment in the target tracking process;
determining the tracking state of the target according to the tracking condition; the tracking state includes tracking success or tracking failure;
if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
In a seventh aspect, the present application also provides a computer program product comprising a computer program which when executed by a processor performs the steps of:
acquiring a tracking condition corresponding to the target at the current moment in the target tracking process;
determining the tracking state of the target according to the tracking condition; the tracking state includes tracking success or tracking failure;
if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
According to the above target tracking method, apparatus, device, storage medium, and program product, the tracking condition corresponding to the target at the current moment is acquired during target tracking, and the tracking state of the target is determined according to the tracking condition; if the tracking state is tracking failure, failure recovery processing is performed on the target based on the current frame image, and the tracking result corresponding to the target is determined; the tracking state includes tracking success or tracking failure. In this method, the tracking state of the target can be determined from its tracking condition, and the target can be recovered when the tracking state is tracking failure, so that an accurate tracking result is obtained. This avoids tracking errors caused by losing the tracked target, so the target can be tracked accurately and stably.
Drawings
FIG. 1 is a diagram of an application environment for a target tracking method in one embodiment;
FIG. 2 is a flow chart of a target tracking method according to an embodiment;
FIG. 3 is a flowchart of a target tracking method according to another embodiment;
FIG. 4 is a flowchart of a target tracking method according to another embodiment;
FIG. 5 is a flowchart of a target tracking method according to another embodiment;
FIG. 6 is a flowchart of a target tracking method according to another embodiment;
FIG. 7 is a block diagram of a target tracking device in one embodiment;
FIG. 8 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The target tracking method provided by the embodiments of the present application can be applied to the application environment shown in fig. 1. The photographing device 104 may be built into the pan-tilt 102 (this case is not shown), or the pan-tilt 102 may be externally connected to a separate photographing device 104 (the case mainly shown in the drawings of this embodiment). The photographing device 104 is mainly used for photographing objects or the environment in a scene to obtain corresponding single-frame images or video streams; it may be a camera, a webcam, a radar, a mobile phone, or the like. The pan-tilt 102 is mainly used to prevent the photographing device 104 from shaking while it captures images, i.e., to enable the photographing device 104 to shoot stably. In addition, a computer device may be built into the photographing device 104, or the photographing device 104 may be connected to an external computer device. The computer device mainly performs image processing on the single-frame images or video stream captured by the photographing device 104 and sends control instructions to the pan-tilt 102 based on the processing results, so that the pan-tilt 102 rotates accordingly and drives the photographing device 104 to rotate, keeping the photographing device 104 aimed at the target and thereby tracking the target.
In one embodiment, as shown in fig. 2, a target tracking method is provided. Taking its application to the computer device associated with the photographing device in fig. 1 as an example, the method may include the following steps:
s202, acquiring tracking conditions corresponding to the target at the current moment in the target tracking process.
In the target tracking process, the first frame image of the video stream captured by the photographing device can first be acquired and recorded as the initial frame image, and the objects of various categories in it identified to obtain the identified objects, their categories, and their detection frames. For this identification of objects in the initial frame image, a multi-class detector may be trained in advance using convolutional neural network techniques to locate the positions of objects in the frame, for example using a mature deep learning network architecture such as YOLO or SSD. After the multi-class detector is trained, the initial frame image can be input into it for identification, yielding the identified objects, the categories of the objects, and the detection frames of the objects.
Assuming that a plurality of objects are identified in the initial frame image as described above, a target can be selected according to the categories of the respective objects. For example, if the category of the target is dog and the identified categories include cat, dog, pedestrian, and so on, the object whose category is dog can be selected as the target, as in the sketch below.
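A minimal sketch of this detection-and-selection step follows. The Detection record and the commented detect() call are hypothetical stand-ins for a trained YOLO/SSD-style multi-class detector; the patent does not name a concrete API.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Detection:
    category: str        # e.g. "dog", "cat", "pedestrian"
    box: tuple           # detection frame as (x, y, w, h)
    score: float         # detector confidence

def select_target(detections: List[Detection], target_category: str) -> Optional[Detection]:
    """Select the object whose category matches the desired target,
    e.g. the object of category "dog" among cats, dogs and pedestrians."""
    candidates = [d for d in detections if d.category == target_category]
    # Assumption: with several objects of the same category, take the
    # highest-confidence one; the text does not specify a tie-break.
    return max(candidates, key=lambda d: d.score) if candidates else None

# Usage sketch: `detector` stands for the trained multi-class detector
# and is assumed to return List[Detection].
# detections = detector.detect(initial_frame)
# target = select_target(detections, "dog")
```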
The target tracking process adopts tracking algorithms including, but not limited to, frame-by-frame or frame-skipping detection algorithms combined with matching algorithms. The detection algorithm is mainly implemented with the trained multi-class detector; by detecting the target in each frame image, the target's position can be accurately identified, and the target can be accurately recovered after it moves out of frame or is occluded, ensuring the continuity and stability of target tracking. The matching algorithm mainly includes the following. First, center-distance matching: a matching threshold Thr is set, and the distance between the center coordinates of the tracking frame Box and each detection frame is computed, using the Euclidean distance in two-dimensional space as the metric, to obtain the detection frame Det_Box closest to the tracking frame, with center distance D. If D < Thr, the two frames are considered successfully matched, and the current detection frame is adopted as the new tracking frame. If D >= Thr, the two frames are considered unassociated, and the matching fails.
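The center-distance matching above can be sketched as follows; the center-based (cx, cy, w, h) box convention is an assumption made for illustration.

```python
import math
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h), center-based

def match_by_center_distance(track_box: Box, det_boxes: List[Box],
                             thr: float) -> Optional[Box]:
    """Find the detection frame Det_Box nearest to the tracking frame by
    Euclidean center distance D; succeed only if D < Thr, otherwise the
    frames are considered unassociated and matching fails (returns None)."""
    if not det_boxes:
        return None
    def center_dist(b: Box) -> float:
        return math.hypot(b[0] - track_box[0], b[1] - track_box[1])
    nearest = min(det_boxes, key=center_dist)
    return nearest if center_dist(nearest) < thr else None
```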
Second, intersection-over-union (IOU) matching: a matching threshold Thr is set, and the IOU is calculated as follows. Let the intersection of the two frames be Intersection and their union be Union. Then Intersection = Area(Box & Det_Box), i.e., the intersection is the area where the tracking frame and the detection frame overlap; Union = Area(Box) + Area(Det_Box) - Intersection, i.e., the union is the sum of the areas of the tracking frame and the detection frame minus the area of the intersection; and IOU = Intersection / Union, i.e., the intersection-over-union is the ratio of the intersection area to the union area. If IOU >= Thr, the two frames are considered successfully matched, and the current detection frame is adopted as the new tracking frame. If IOU < Thr, the two frames are considered unassociated, and the matching fails.
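A sketch of the IOU computation and matching rule, using corner-form (x1, y1, x2, y2) boxes for simplicity:

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corner form

def iou(box: Rect, det_box: Rect) -> float:
    """Intersection = Area(Box & Det_Box); Union = Area(Box) +
    Area(Det_Box) - Intersection; IOU = Intersection / Union."""
    ix1, iy1 = max(box[0], det_box[0]), max(box[1], det_box[1])
    ix2, iy2 = min(box[2], det_box[2]), min(box[3], det_box[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (det_box[2] - det_box[0]) * (det_box[3] - det_box[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def match_by_iou(track_box: Rect, det_box: Rect, thr: float) -> bool:
    """Success when IOU >= Thr; otherwise the frames are unassociated."""
    return iou(track_box, det_box) >= thr
```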
If the matching fails, a new tracking frame position can be predicted from the motion trend of the tracking frame using Kalman prediction, or the tracking frame can hover in place (i.e., the position of the tracking frame in the previous frame image is used as its position in the next frame image).
It should be noted that the above matching process matches only detection frames of the same category as the tracking frame.
In short, the target tracking algorithm tracks the target in every frame image and updates the position of the target's tracking frame (or target frame) in the current frame image in real time.
Of course, during continuous tracking of the target, the tracking condition of the target can be obtained; the tracking condition represents the current tracking situation of the target. As an alternative embodiment, the tracking condition includes: whether the target has been tracked to a key frame image and/or whether tracking of the target has failed for a preset number of consecutive frames; the key frame image is an image whose time sequence is after the initial frame image.
Tracking the target to a key frame image means that a key frame image, temporally located after the initial frame image, has been reached; that is, the key frame image belongs to the images following the initial frame image in the target tracking process. During tracking, the frame images after the initial frame image can be tracked continuously until a key frame image is reached. There may be one or more key frame images after the initial frame image. Specifically, key frame images may be determined according to a preset interval, which may be fixed or not. The interval may be a frame-count interval, e.g., every 10th or 15th frame image is a key frame image, or a time interval, e.g., a key frame image every 500 ms or 1000 ms. In addition, whether tracking has failed for a preset number of consecutive frames refers to whether, when the tracking algorithm is used over multiple consecutive frames, detection or matching of the target fails in each of them.
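A minimal sketch of how key frame images might be selected under the fixed intervals mentioned above; the function names and the 10-frame / 500 ms defaults are illustrative, not prescribed by the text.

```python
def is_key_frame(frame_index: int, frame_interval: int = 10) -> bool:
    """Fixed frame-count interval: e.g. every 10th frame after the
    initial frame (index 0) is treated as a key frame image."""
    return frame_index > 0 and frame_index % frame_interval == 0

def is_key_frame_by_time(timestamp_ms: float, last_key_ms: float,
                         interval_ms: float = 500.0) -> bool:
    """Time-interval variant: a key frame every 500 ms (or 1000 ms)."""
    return timestamp_ms - last_key_ms >= interval_ms
```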
By setting different tracking conditions, the subsequent target recovery process can be triggered in a variety of scenarios, so that loss of the tracked target can be avoided in each of them, improving the accuracy, continuity, and stability of target tracking.
S204, determining the tracking state of the target according to the tracking conditions; the tracking status includes tracking success or tracking failure.
In this step, the tracking condition represents the current tracking situation of the target, so the tracking state of the target can be inferred from the different states of the tracking condition.
Taking as an example the tracking condition of whether tracking has failed for a preset number of consecutive frames: if the tracking algorithm fails to detect or match the target over several consecutive frames, the current tracking state of the target can be considered a tracking failure; if the result for one or more of those frame images is not a detection or matching failure, the tracking state of the target is considered a tracking success.
Taking as an example the tracking condition of tracking the target to a key frame image: when a key frame image is reached, the target in the key frame image can be tracked by the target tracking algorithm, or other methods, such as threshold segmentation or feature matching, can be used to obtain the tracking situation of the target in the key frame image. If tracking there succeeds, the current tracking state of the target can be considered a tracking success; otherwise, it can be considered a tracking failure.
S206, if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
Generally, if the tracking state of the target is a tracking failure, the target can be considered lost, which affects target tracking in the subsequent frame images. To avoid this problem and track the target more stably, the target whose tracking failed can be recovered so that it is tracked again.
In this step, when the target is lost at the current moment, the current frame image can be acquired, and detection and matching performed on it with the tracking algorithm, so as to re-identify the target in it, i.e., recover the target, and continue tracking. Alternatively, frame images after the current frame image can be acquired, detection and matching performed on them with the tracking algorithm, and the target re-identified from them, i.e., recovered, before tracking continues.
In the above target tracking method, the tracking condition corresponding to the target at the current moment is acquired during target tracking, and the tracking state of the target is determined according to the tracking condition; if the tracking state is a tracking failure, failure recovery processing is performed on the target based on the current frame image, and the tracking result corresponding to the target is determined; the tracking state includes tracking success or tracking failure. In this method, the tracking state of the target can be determined from its tracking condition, and the target can be recovered when the tracking state is a tracking failure, so that an accurate tracking result is obtained. This avoids tracking errors caused by losing the tracked target, so the target can be tracked accurately and stably.
The above embodiments mention that the tracking state of the target can be determined through different tracking conditions. In the following embodiments, a detailed description is given of how to determine the tracking state when the tracking condition is that the target is tracked to the key frame image.
In another embodiment, another target tracking method is provided. On the basis of the above embodiment, as shown in fig. 3, the above S204 may include the following steps:
s302, obtaining initial characteristics corresponding to targets in the initial frame image.
In this step, the target in the initial frame image may be identified in the manner described in S202. After the target in the initial frame image is identified, feature extraction may be performed on the target, specifically on the image region of the target frame corresponding to the target, to obtain the initial feature of the target; the initial feature is stored for use in subsequent re-identification.
Regarding the target and the initial feature: as an alternative embodiment, the target may be a pet, and the initial feature corresponding to the target may include at least one of a color feature, a histogram of oriented gradients (HOG) feature, a deep feature, and a category attribute. The color feature may be an HSL (hue, saturation, lightness) feature or an HSV (hue, saturation, value) feature.
S304, re-identification features corresponding to the targets in the key frame images are obtained.
The key frame image is an image whose time sequence is after the initial frame image. For the explanation of the key frame image, reference may be made to the related explanation in S202, which is not repeated here; in any case, the tracked key frame image can be determined during the tracking process.
After the key frame image is determined, the tracking algorithm continues, following the same process used for tracking other frame images, to search the key frame image for the region most similar to the target in the initial frame image. Obtaining the most similar region yields the target and the target frame where the target is located, where the target frame includes its position information, such as the coordinates of its center point and its width and height.
Taking a pet as an example: assume the target is a pet whose coordinates in the current frame image are (x0, y0, w1, h1). When the pet moves to the left in the image, the region closest to the pet is searched or matched in the new frame based on the center point of the previous frame, yielding the detection frame where the pet is located, with coordinates (x1, y1, w1, h1). The position of the target frame where the target is located is continuously updated in this way; this process is the target tracking process.
After it is determined that a key frame image has been tracked, the target in the key frame image can be tracked as described above, and feature extraction can then be performed on the target in the key frame image, specifically on the image region of the target frame corresponding to the target, to obtain the re-identification feature of the target. The specific types of the re-identification feature may be the same as those of the initial feature and are not repeated here.
S306, determining the tracking state of the target according to the initial characteristic and the re-identification characteristic.
In this step, after the initial feature of the target and the re-identification feature of the target are obtained, whether the target in the key frame image is the original target, i.e., the tracking state corresponding to the target, can be determined by comparing whether the two features are close.
In this embodiment, when the target is tracked to a key frame image, the initial feature corresponding to the target in the initial frame image and the key frame image in the target tracking process can be acquired, the re-identification feature of the target in the key frame image determined, and the tracking state of the target then determined according to the initial feature of the target and the re-identification feature of the target; the key frame image is an image whose time sequence is after the initial frame image. In this way, only the specific target is re-identified, and only on the key frames, to determine the tracking state, rather than re-identifying all the objects, so the computation required is small and the time consumed in processing a single frame image is short, i.e., the processing efficiency of a single frame image is high, which improves the real-time performance of image processing. In addition, because target re-identification is performed only on the key frames rather than on every frame image, the computation can be reduced further, further improving the real-time performance of image processing.
The above embodiments mention that the tracking state of the target can be obtained from the initial feature and the re-identification feature of the target; the following embodiments describe in detail how the tracking state is obtained.
In another embodiment, as shown in fig. 4, the step S306 may include the following steps:
s402, performing matching processing on the initial feature and the re-identification feature to obtain a first matching result.
In this step, when the initial feature and the re-identification feature of the target are matched, one may directly compare whether the two features are identical, calculate the feature similarity between the two features, or use other matching methods.
Taking the example of calculating the feature similarity between the two, as an alternative embodiment, the matching process in this step may include: calculating the similarity between the initial feature and the re-identification feature; and determining a first matching result according to the similarity.
Here, the similarity between the initial feature and the re-identification feature may be calculated by a preset similarity calculation method, and the calculated similarity then compared with a preset threshold Thr to obtain the first matching result. If the calculated similarity is greater than or equal to the preset threshold, the initial feature and the re-identification feature of the target are similar, and the first matching result is a matching success; if the calculated similarity is smaller than the preset threshold, the initial feature and the re-identification feature of the target are dissimilar, and the first matching result is a matching failure.
S404, determining the tracking state of the target according to the first matching result.
In this step, after the first matching result is obtained, the corresponding tracking state of the target can be obtained. As an alternative embodiment, if the first matching result is a matching success, the tracking state of the target can be determined to be a tracking success; if the first matching result is a matching failure, the tracking state of the target is determined to be a tracking failure.
That is, a tracking success means the target has neither drifted nor been lost, and the target frame where the target is located and its position can continue to be updated to obtain the new target frame position. A tracking failure means the tracked region in the key frame image differs greatly from the features of the initial frame image, i.e., tracking of the target has drifted or the target has been lost; at this point the target must be re-tracked, i.e., recovered, so that it is tracked again.
In this embodiment, the initial feature and the re-identification feature of the target are matched, and the tracking state of the target is determined according to the obtained matching result, so the tracking state can be obtained quickly through the matching of the two features, improving the efficiency and real-time performance of obtaining the tracking state of the target. Meanwhile, feature matching can avoid target-frame drift or tracking of the wrong target during tracking, which improves the anti-interference capability of the tracking process and the accuracy of target tracking. In addition, when the initial feature and the re-identification feature are matched, the matching result can be obtained by calculating the similarity between them, so a more accurate feature matching result is obtained through the similarity, improving the accuracy of the obtained target tracking state. Further, a tracking success is determined when the two features match successfully, and a tracking failure when the matching fails, which refines the process of obtaining the tracking state and makes the obtained tracking state more accurate.
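Steps S402-S404 can be condensed into the small sketch below; determine_tracking_state and its string return values are illustrative names, and the similarity measure is left pluggable since the text allows several feature types (e.g., the color-histogram similarity sketched further below).

```python
def determine_tracking_state(initial_feature, reid_feature,
                             similarity_fn, thr: float) -> str:
    """Match the initial feature against the re-identification feature;
    similarity >= Thr is a matching success (tracking success), anything
    below is a matching failure (tracking failure)."""
    similarity = similarity_fn(initial_feature, reid_feature)
    return "tracking_success" if similarity >= thr else "tracking_failure"
```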
The above embodiments mention that the similarity between the initial feature and the re-identification feature of the target can be calculated. When both features are color features, the process of calculating the similarity between the color features is described below.
In another embodiment, as shown in fig. 5, the process of calculating the similarity in S402 may include the following steps:
s502, respectively dividing the value ranges corresponding to the three channels of the initial characteristics into sections to obtain a plurality of initial sections; each initial segment includes a set number of pixels.
S504, respectively dividing the value ranges corresponding to the three channels of the re-identification feature into sections to obtain a plurality of re-identification sections; each re-identification section includes a set number of pixels therein.
In S502-S504, the initial frame image and the current frame image may first be converted from RGB or BGR images into the HSV color space, or alternatively into the HSL color space; HSV denotes hue, saturation, and value, and HSL denotes hue, saturation, and lightness. Taking HSV as an example, each of the three corresponding channels H, S, and V may be divided into a number of bin containers (e.g., values 1-100 divided into 10 bins, so that each bin container covers 10 values). That is, the value range corresponding to each of the three channels of the initial feature and of the re-identification feature may be divided into a plurality of segments, each segment acting as a container holding a number of pixels; the resulting segments are called initial segments or re-identification segments. The range of hue H is 0-360 degrees, the range of saturation S is 0.0-1.0, and the range of value V is 0.0 (black) to 1.0 (white). The number of segments divided on the three channels for the initial feature and the re-identification feature may be equal or unequal, and the number of pixels contained in each segment may likewise be equal or unequal; that is, the two set numbers may be equal or unequal. In addition, the extreme values of the S and V channels may be saved as separate bins: when the value of S lies in the (0-30) interval the color is close to pure white, and when the value of V lies in the (0-30) interval the brightness is close to pure black; in both cases the hue H is no longer meaningful, so extracting these values separately makes the calculated color similarity more credible. The segment division of the HSL feature is the same as that of the HSV feature and is not repeated here.
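A sketch of this binning step using OpenCV and NumPy is given below, under the assumption of 8-bit HSV images; the separate extra bins for extreme S and V values are omitted for brevity, and the bin counts are illustrative.

```python
import cv2
import numpy as np

def hsv_histogram(image_bgr: np.ndarray,
                  bins: tuple = (18, 10, 10)) -> np.ndarray:
    """Convert a BGR crop (e.g. the target-frame region) to HSV and count
    pixels per bin on each of the three channels, then concatenate the
    per-channel histograms. Note that OpenCV stores 8-bit H in [0, 180)
    and S, V in [0, 256), not the 0-360 / 0.0-1.0 ranges quoted above."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    ranges = ((0, 180), (0, 256), (0, 256))
    hists = []
    for channel, (n_bins, rng) in enumerate(zip(bins, ranges)):
        hist, _ = np.histogram(hsv[..., channel], bins=n_bins, range=rng)
        hists.append(hist)
    return np.concatenate(hists).astype(np.float32)
```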
The above steps S502 and S504 may be performed in either order (S502 first or S504 first), or simultaneously.
S506, determining the similarity between the initial characteristic and the re-identification characteristic according to each initial section and each re-identification section.
In this step, in order to improve the accuracy of the color-similarity calculation, as an alternative embodiment, the segments with a zero pixel count can be removed from the initial segments to obtain a first color feature vector; the segments with a zero pixel count removed from the re-identification segments to obtain a second color feature vector; and the similarity between the first color feature vector and the second color feature vector calculated.
That is, after the segment division is completed, the bins (i.e., initial segments or re-identification segments) whose pixel count is 0 can be removed. Each bin carries a certain weight when the histogram similarity is calculated, so if all bins were used, the result would tend to be flat, which makes it harder to highlight differences between the histogram features; removing the bins with zero pixels makes the feature differences more pronounced and the calculated similarity more accurate.
After the segments with a pixel count of 0 are removed from the initial segments and the re-identification segments, the remaining segments form color feature vectors, namely the first color feature vector and the second color feature vector. Since category consistency is ensured when images are matched frame by frame during target tracking, category information need not be considered when matching the initial frame image and the key frame image, and the similarity between the two images can be calculated directly from the color histogram features. Therefore, the similarity between the first color feature vector and the second color feature vector can be calculated directly and compared with a preset threshold: if the similarity is greater than or equal to the preset threshold, the two color feature vectors are close, i.e., the initial feature and the re-identification feature are close, so the targets corresponding to the two features are the same target and the target matching succeeds; otherwise, the matching fails.
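A sketch of the zero-bin removal and similarity calculation, continuing from the histogram sketch above. To keep the two vectors aligned, bins that are empty in both histograms are discarded (one practical reading of removing the zero-pixel segments from each feature), and cosine similarity is an assumed measure; the text does not fix the exact formula.

```python
import numpy as np

def color_similarity(hist_a: np.ndarray, hist_b: np.ndarray) -> float:
    """Drop bins empty in BOTH histograms, then compare the remaining
    color feature vectors with cosine similarity (assumed measure)."""
    keep = (hist_a > 0) | (hist_b > 0)     # remove bins empty in both
    a, b = hist_a[keep], hist_b[keep]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0
```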
In this embodiment, after the initial feature and the re-identification feature are divided into color-channel segments, the similarity is calculated between the segments, which refines the similarity calculation and makes its result more accurate. In addition, the similarity is calculated after the segments with a zero pixel count are removed from the divided segments, which makes the difference between the two features more pronounced and the calculated similarity more accurate.
The above embodiments mention that when tracking of the target fails, target tracking in subsequent frame images is affected; to avoid this problem and track the target more stably, recovery processing can be performed on the target whose tracking failed. This is described in detail in the following embodiments.
In another embodiment, another target tracking method is provided. On the basis of the above embodiments, if target tracking fails, the method may further include the following steps:
and step A, executing a failure recovery operation, wherein the failure recovery operation comprises the following steps: identifying and extracting the characteristics of each candidate object in the current frame image to obtain candidate characteristics of each candidate object; and determining a tracking result corresponding to the target according to the initial characteristic and each candidate characteristic.
In this step, when tracking of the target fails, that is, when the tracking has drifted or the target has been lost, failure recovery can be performed on the target so that it is tracked again. Specifically, each candidate object in the current frame image can be re-identified: all objects in the current frame image are identified using the detector trained in S202, which enlarges the detection range, and the identified objects and the detection frames where they are located are recorded as candidate objects. Then, feature extraction is performed on the image region of the detection frame of each candidate object to obtain the feature corresponding to each candidate object, recorded as a candidate feature.
After each candidate object in the current frame image is identified and its candidate features are obtained, failure recovery processing can be performed in combination with the initial feature of the target in the initial frame image. As an alternative embodiment, as shown in fig. 6, the recovery can specifically be performed through the following steps:
s602, carrying out matching processing on the initial feature and each candidate feature to obtain each second matching result.
S604, determining a tracking result corresponding to the target according to each second matching result.
When the candidate features are matched with the initial feature of the target, the similarity between each candidate feature and the initial feature can be calculated to obtain the similarity corresponding to each candidate feature, and each similarity compared with a preset threshold to obtain the second matching result between that candidate feature and the initial feature. If the similarity between a candidate feature and the initial feature is greater than or equal to the preset threshold, the candidate feature is similar to the initial feature, and the second matching result corresponding to that candidate feature is a matching success; if the similarity between a candidate feature and the initial feature is smaller than the preset threshold, the candidate feature is dissimilar to the initial feature, and the second matching result corresponding to that candidate feature is a matching failure.
After the second matching result of each candidate object (or, equivalently, each candidate feature) is obtained, as an alternative embodiment, if the second matching results are all matching failures, the next frame image after the current frame image is taken as the new current frame image and the failure recovery operation is executed again, until the preset cycle cut-off condition is reached, so as to obtain the tracking result corresponding to the target.
That is, the candidate features of the candidate objects in the current frame image can be feature-matched with the initial feature. If any candidate feature matches successfully, the recovery result of the target is a successful recovery, and the target can continue to be tracked, i.e., its real-time position continues to be updated. If all candidate features fail to match, the target has not been successfully recovered in the current frame image, and target recovery can continue on the frame images after the current frame image: if the current frame image is the nth frame, the object identification, feature extraction, and matching processes can be performed successively on frames n+1, n+2, n+3, and so on, until the preset cycle cut-off condition is reached.
The preset cycle cut-off condition includes at least one of the following: one of the second matching results is a matching success; a preset recovery duration is reached; a preset number of image frames is reached. Specifically, for the first condition, if one of the second matching results is a matching success, the target has been recovered successfully and tracking can continue, so the target recovery process can be stopped. For the second condition, a recovery duration is preset: when object identification, feature extraction, and matching are performed repeatedly on the frame images after the current frame image, if a certain detection duration is reached, e.g., 10 s, the target is considered completely lost and must be tracked anew, and the target recovery process can be stopped. For the third condition, a number of image frames is preset: when object identification, feature extraction, and matching are performed repeatedly on the frame images after the current frame image, if a certain number of frames is reached, e.g., the 10th frame after the current frame, the target is considered completely lost and must be tracked anew, and the target recovery process can be stopped.
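The recovery loop and its cut-off conditions can be sketched as follows; all interfaces (detector, extract_feature) and the 10-frame / 10 s defaults are illustrative assumptions rather than a prescribed implementation.

```python
import time
from typing import Iterable, Optional

def failure_recovery(frames: Iterable, detector, extract_feature,
                     initial_feature, similarity_fn, thr: float,
                     max_frames: int = 10,
                     max_seconds: float = 10.0) -> Optional[object]:
    """Failure-recovery loop with the three cut-off conditions above:
    a successful second matching result, a preset recovery duration, or
    a preset number of image frames. `frames` iterates from the current
    frame onward; `detector` and `extract_feature` are hypothetical
    stand-ins for the trained multi-class detector and the feature
    extractor."""
    start = time.monotonic()
    for index, frame in enumerate(frames):
        if index >= max_frames or time.monotonic() - start > max_seconds:
            break                           # target considered fully lost
        for det in detector.detect(frame):  # widened search: all objects
            candidate = extract_feature(frame, det.box)
            if similarity_fn(initial_feature, candidate) >= thr:
                return det                  # recovery success: resume tracking
    return None                             # recovery failed within cut-offs
```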
It should be noted that, when all the objects in the current frame image are identified and feature extraction and feature matching are performed, the work can be spread over several frames. Suppose identification is run three times, on the current frame image and the two following frame images, and that 6 objects, objects 1-6, are identified in each frame image. Then objects 1 and 2 may be selected as the candidate objects for feature extraction and matching on the current frame image, objects 3 and 4 on the first following frame image, and objects 5 and 6 on the second following frame image. By selecting different objects on each frame image, fewer than the total number of objects, the computing-power requirement of the tracking process is reduced on the one hand, while on the other hand feature extraction and feature matching over all the objects is still guaranteed, ensuring effective recovery of the target.
In this embodiment, if target tracking fails, a failure recovery operation is executed, in which each candidate object in the current frame image is re-identified and its features extracted, and the recovery result of the target is determined from the candidate features of each candidate object and the initial feature. The detection range is thus enlarged, re-detection and re-tracking become possible, and the efficiency and accuracy of target recovery are improved. Further, the target recovery result is determined from the matching results obtained by matching the candidate features against the target's initial feature, and can be obtained quickly through feature matching, improving the efficiency and real-time performance of target recovery. Furthermore, when the target is not recovered successfully, identification and feature matching can continue in a loop over the subsequent frame images, so that the tracked target is recovered as soon as possible and the tracking process proceeds smoothly. Meanwhile, setting different cycle conditions satisfies the requirements of various scenarios and widens the application range of the recovery process.
In the above target tracking process, target tracking is generally performed on every frame image: the objects in each frame image are identified and matched to obtain and track the target. In the identification process, a multi-class detector is generally used to identify the objects in the frame image, yielding the identified objects, the categories of the objects, and the detection frames of the objects and their positions; for the target, the category of the target and the target frame where the target is located and its position are obtained. On this basis, in another embodiment, the target frame where the target is located can be displayed in each frame image during target tracking. The user can manually adjust the photographing device, such as a mobile phone, to keep the target at the center of the picture, so that the target frame (also called the tracking frame) is displayed in real time, highlighting the position of the target, strengthening the user's attention to the target, and improving the user experience.
In another embodiment, after the target, the target frame where it is located, and the position of the target frame are obtained, the position of the pan-tilt can be adjusted in real time according to the position of the target frame, so that the target frame and the target in the target frame are displayed at a set position of the display interface. The set position may be, for example, the center or the upper-right corner of the display interface. Adjusting the position of the pan-tilt in real time according to the position of the target frame means controlling the rotation of the pan-tilt's rotating assembly; the pan-tilt is connected to and supports the photographing device, and its rotation drives the photographing device to rotate so that the photographing device always faces the target, achieving continuous tracking of the target.
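A minimal sketch of the geometric part of this adjustment, i.e., how far the target frame is from the set display position; the actual pan-tilt rotation commands are hardware-specific and not described by the text, so only the error computation is shown.

```python
from typing import Tuple

def gimbal_error(target_box: Tuple[float, float, float, float],
                 frame_size: Tuple[int, int],
                 set_point: Tuple[float, float] = (0.5, 0.5)
                 ) -> Tuple[float, float]:
    """Normalized offset between the target-frame center and the desired
    display position (default: picture center). target_box is (x, y, w, h)
    with (x, y) the top-left corner; frame_size is (width, height). A real
    controller would feed this error into the pan-tilt rotation commands."""
    x, y, w, h = target_box
    cx, cy = x + w / 2.0, y + h / 2.0
    err_x = cx / frame_size[0] - set_point[0]  # > 0: target right of set point
    err_y = cy / frame_size[1] - set_point[1]  # > 0: target below set point
    return err_x, err_y
```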
In another embodiment, after the target, the target frame where it is located, and the position of the target frame are obtained, the target in the target frame can further be processed and displayed in slow motion. Slow-motion processing may, for example, embed the visual target tracking algorithm into slow-motion tracking, slowing down the display of the target to create a video with a better visual effect and improve the user experience.
In order to facilitate the detailed description of the technical solution of the present application, a detailed embodiment is given below to describe the technical solution of the present application, and on the basis of the above embodiment, taking the pet as an example, the method may include the following steps:
S1, acquiring the tracking condition corresponding to the pet at the current moment in the pet tracking process; the tracking condition includes: the pet being tracked to a key frame image and/or whether tracking of the pet has failed for a consecutive preset number of frames;
S2, if the tracking condition is that the pet is tracked to a key frame image, acquiring the initial feature corresponding to the pet in the initial frame image;
S3, acquiring the re-identification feature corresponding to the pet in the key frame image; the key frame image is an image whose time sequence is after the initial frame image;
S4, dividing the value range corresponding to each of the three channels of the initial feature into sections to obtain a plurality of initial sections, and dividing the value range corresponding to each of the three channels of the re-identification feature into sections to obtain a plurality of re-identification sections; each initial section and each re-identification section includes a set number of pixels;
S5, removing the sections whose pixel count is zero from the initial sections to obtain a first color feature vector, and removing the sections whose pixel count is zero from the re-identification sections to obtain a second color feature vector;
S6, calculating a first similarity between the first color feature vector and the second color feature vector, and comparing the first similarity with a preset threshold;
S7, if the first similarity is greater than the preset threshold, the first matching result between the initial feature and the re-identification feature is a successful match; if the first similarity is not greater than the preset threshold, the first matching result is a failed match;
S8, if the first matching result is a successful match, determining that the tracking state of the pet is tracking success; if the first matching result is a failed match, determining that the tracking state of the pet is tracking failure;
S9, if the tracking state is tracking failure, identifying each candidate object in the current frame image and extracting its features to obtain the candidate feature of each candidate object;
S10, matching the initial feature against each candidate feature to obtain the second similarities;
S11, comparing each second similarity with the preset threshold;
S12, if a second similarity is greater than the threshold, the second matching result between the initial feature and the corresponding candidate feature is a successful match; if it is not greater than the threshold, the second matching result is a failed match;
S13, if all the second matching results are failed matches, taking the next frame image after the current frame image as the new current frame image and returning to execute S9 to S13 until a preset cycle cut-off condition is reached, thereby obtaining the tracking result corresponding to the pet; the preset cycle cut-off condition includes at least one of the following: one of the second matching results is a successful match; a preset recovery duration is reached; a preset number of image frames is reached;
S14, displaying the target frame in which the pet is located in each frame image during the pet tracking process;
S15, adjusting the position of the cradle head in real time according to the position of the target frame, so that the target frame and the pet within it are displayed at the set position of the display interface;
S16, performing slow-motion processing on the pet in the target frame and displaying it.
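The sketch below illustrates steps S4 to S8 under stated assumptions: the color feature is taken to be a per-channel histogram of the target crop, the zero-pixel sections of S5 are removed with a shared mask so that the two color feature vectors stay aligned (one reasonable reading of that step), and the first similarity is computed as a cosine similarity. The section count and threshold are illustrative, not values disclosed in this application.

```python
import numpy as np

def color_feature_vector(crop, sections_per_channel=16):
    """S4: divide the value range of each of the three channels of an
    HxWx3 uint8 crop into sections; each section holds its pixel count."""
    hists = [np.histogram(crop[..., c], bins=sections_per_channel,
                          range=(0, 256))[0] for c in range(3)]
    return np.concatenate(hists).astype(np.float64)

def first_similarity(initial_vec, reid_vec):
    # S5: remove sections whose pixel count is zero. A shared mask is used
    # here so the first and second color feature vectors keep equal length.
    keep = (initial_vec > 0) | (reid_vec > 0)
    a, b = initial_vec[keep], reid_vec[keep]
    # S6: cosine similarity between the two color feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def tracking_state(initial_crop, keyframe_crop, threshold=0.8):
    # S7-S8: the first match, and hence the tracking state, succeeds only
    # when the first similarity exceeds the preset threshold.
    sim = first_similarity(color_feature_vector(initial_crop),
                           color_feature_vector(keyframe_crop))
    return "tracking success" if sim > threshold else "tracking failure"
```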
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict restriction on the execution order, and the steps may be executed in other orders. Moreover, at least some of the steps in these flowcharts may include several sub-steps or stages, which are not necessarily executed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with at least part of the other steps, sub-steps, or stages.
Based on the same inventive concept, an embodiment of the present application further provides an object tracking device for implementing the above target tracking method. The implementation scheme provided by the device is similar to that described in the above method embodiments, so for the specific limitations in one or more embodiments of the object tracking device below, reference may be made to the limitations on the target tracking method above; details are not repeated here.
In one embodiment, as shown in fig. 7, an object tracking device is provided, including a condition acquisition module, a state determination module, and a tracking module, wherein:
the condition acquisition module is used for acquiring tracking conditions corresponding to the target at the current moment in the target tracking process;
the state determining module is used for determining the tracking state of the target according to the tracking conditions; the tracking state includes tracking success or tracking failure;
and the tracking module is used for carrying out failure recovery processing on the target based on the current frame image if the tracking state is the tracking failure, and determining a tracking result corresponding to the target.
Optionally, the tracking condition includes: the target being tracked to a key frame image and/or whether tracking of the target has failed for a consecutive preset number of frames.
In another embodiment, another object tracking device is provided, on the basis of the above embodiment, the tracking condition is that the object is tracked to a key frame image, and the state determining module may include:
the initial feature acquisition unit is used for acquiring initial features corresponding to targets in the initial frame image;
the re-identification feature acquisition unit is used for acquiring re-identification features corresponding to the targets in the key frame images; the key frame image is an image of which the time sequence is behind the initial frame image;
And the state determining unit is used for determining the tracking state of the target according to the initial characteristic and the re-identification characteristic.
Optionally, the target is a pet, and the initial feature corresponding to the target includes at least one of a color feature, a histogram of oriented gradients (HOG) feature, a depth feature (Deep Feature), and a category attribute.
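As an illustration of gathering more than one of these feature types for a target crop, the sketch below combines a color histogram with an HOG descriptor. It assumes NumPy and scikit-image are available; the dictionary layout and all parameters are our own, not the application's.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog

def initial_features(crop):
    """crop: HxWx3 uint8 target region; returns a dict of feature types."""
    color = np.concatenate([np.histogram(crop[..., c], bins=16,
                                         range=(0, 256))[0]
                            for c in range(3)])
    # HOG is computed on the grayscale crop; parameters are illustrative.
    hog_vec = hog(rgb2gray(crop), orientations=9,
                  pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return {"color": color, "hog": hog_vec}
```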
Alternatively, the above-described state determination unit may include:
the first matching subunit is used for carrying out matching processing on the initial characteristic and the re-identification characteristic to obtain a first matching result;
and the state determining subunit is used for determining the tracking state of the target according to the first matching result.
Optionally, the first matching subunit is specifically configured to calculate a similarity between the initial feature and the re-identification feature; and determining a first matching result according to the similarity.
Optionally, the state determining subunit is specifically configured to determine that the tracking state of the target is successful tracking if the first matching result is successful matching; if the first matching result is the matching failure, determining the tracking state of the target as the tracking failure.
In another embodiment, another object tracking device is provided. On the basis of the above embodiment, the initial feature and the re-identification feature are both color features, and the first matching subunit is specifically configured to: divide the value range corresponding to each of the three channels of the initial feature into sections to obtain a plurality of initial sections; divide the value range corresponding to each of the three channels of the re-identification feature into sections to obtain a plurality of re-identification sections; and determine the similarity between the initial feature and the re-identification feature according to each initial section and each re-identification section. Each initial section and each re-identification section includes a set number of pixels.
Optionally, the first matching subunit is specifically configured to reject a segment with zero number of pixels in each initial segment, so as to obtain a first color feature vector; removing the sections with zero pixel number in each re-identification section to obtain a second color feature vector; a similarity between the first color feature vector and the second color feature vector is calculated.
In another embodiment, another object tracking device is provided, where, based on the foregoing embodiment, the tracking module may include a failure recovery unit, where the failure recovery unit is configured to perform a failure recovery operation, where the failure recovery operation includes: identifying and extracting the characteristics of each candidate object in the current frame image to obtain candidate characteristics of each candidate object; and determining a tracking result corresponding to the target according to the initial characteristic and each candidate characteristic.
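A minimal sketch of this failure recovery operation follows. The detect_candidates argument is a hypothetical placeholder for the candidate-object identification step, color_feature_vector and first_similarity are reused from the sketch after the step list above, and the cut-off values are illustrative.

```python
def recover_target(frame_stream, initial_vec, detect_candidates,
                   threshold=0.8, max_frames=300):
    """frame_stream: iterable of frames starting at the current frame image.
    Returns the recovered target box, or None once the cut-off is reached."""
    for frame_index, frame in enumerate(frame_stream):
        if frame_index >= max_frames:       # preset number of image frames
            return None
        for crop, box in detect_candidates(frame):
            second_sim = first_similarity(initial_vec,
                                          color_feature_vector(crop))
            if second_sim > threshold:      # a second matching result succeeds
                return box                  # tracking result for the target
    return None
```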
Optionally, the failure recovery unit may include:
the second matching subunit is used for carrying out matching processing on the initial feature and each candidate feature to obtain each second matching result;
and the tracking subunit is used for determining a tracking result corresponding to the target according to each second matching result.
Optionally, the tracking subunit is specifically configured to, if each second matching result is a matching failure, take a next frame image after the current frame image as a new current frame image, and return to perform the failure recovery operation until a preset cycle cut-off condition is reached, thereby obtaining a tracking result corresponding to the target.
Optionally, the preset cycle cut-off condition includes at least one of the following: one second matching result exists in the second matching results, and the second matching result is successful in matching; a preset recovery time length; the number of image frames is preset.
In another embodiment, another object tracking device is provided, and on the basis of the above embodiment, the device further includes:
and the display module is used for displaying a target frame where the target is in each frame of image in the target tracking process.
Optionally, the method further comprises: and the adjusting module is used for adjusting the position of the cradle head in real time according to the position of the target frame so as to display the target frame and the target in the target frame at the set position of the display interface.
Optionally, the method further comprises: and the processing module is used for carrying out slow motion processing and displaying on the targets in the target frame.
The modules in the above object tracking device may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in hardware in, or independent of, a processor in the computer device, or may be stored as software in a memory in the computer device, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; wireless communication may be realized through WIFI, a mobile cellular network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements a target tracking method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a key, a trackball, or a touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
acquiring tracking conditions corresponding to the target at the current moment in the target tracking process; determining the tracking state of the target according to the tracking conditions; the tracking state includes tracking success or tracking failure; if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
In one embodiment, the tracking condition includes: the target being tracked to a key frame image and/or whether tracking of the target has failed for a consecutive preset number of frames.
In one embodiment, the processor when executing the computer program further performs the steps of:
Acquiring initial characteristics corresponding to a target in an initial frame image; acquiring re-identification features corresponding to targets in the key frame images; the key frame image is an image of which the time sequence is behind the initial frame image; and determining the tracking state of the target according to the initial characteristic and the re-identification characteristic.
In one embodiment, the processor when executing the computer program further performs the steps of:
matching the initial characteristic and the re-identification characteristic to obtain a first matching result; and determining the tracking state of the target according to the first matching result.
In one embodiment, the processor when executing the computer program further performs the steps of:
calculating the similarity between the initial feature and the re-identification feature; and determining a first matching result according to the similarity.
In one embodiment, the processor when executing the computer program further performs the steps of:
if the first matching result is successful, determining that the tracking state of the target is successful; if the first matching result is the matching failure, determining the tracking state of the target as the tracking failure.
In one embodiment, the processor when executing the computer program further performs the steps of:
respectively carrying out section division on the value ranges corresponding to the three channels of the initial characteristics to obtain a plurality of initial sections; respectively carrying out section division on the value ranges corresponding to the three channels of the re-identification characteristic to obtain a plurality of re-identification sections; determining the similarity between the initial features and the re-identification features according to the initial sections and the re-identification sections; each initial section includes a set number of pixels therein; each re-identification section includes a set number of pixels therein.
In one embodiment, the processor when executing the computer program further performs the steps of:
removing the sections with zero pixel quantity in each initial section to obtain a first color feature vector; removing the sections with zero pixel number in each re-identification section to obtain a second color feature vector; a similarity between the first color feature vector and the second color feature vector is calculated.
In one embodiment, the processor when executing the computer program further performs the steps of:
and executing a failure recovery operation, wherein the failure recovery operation comprises the following steps: identifying and extracting the characteristics of each candidate object in the current frame image to obtain candidate characteristics of each candidate object; and determining a tracking result corresponding to the target according to the initial characteristic and each candidate characteristic.
In one embodiment, the processor when executing the computer program further performs the steps of:
matching the initial feature with each candidate feature to obtain each second matching result; and determining a tracking result corresponding to the target according to each second matching result.
In one embodiment, the processor when executing the computer program further performs the steps of:
and if the second matching results are all matching failure, taking the next frame image after the current frame image as a new current frame image, and returning to execute the failure recovery operation until a preset loop cut-off condition is reached, so as to obtain a tracking result corresponding to the target.
In one embodiment, the preset cycle cut-off condition includes at least one of: one second matching result exists in the second matching results, and the second matching result is successful in matching; a preset recovery time length; the number of image frames is preset.
In one embodiment, the target is a pet, and the initial feature corresponding to the target includes at least one of a color feature, a histogram of oriented gradients (HOG) feature, a depth feature (Deep Feature), and a category attribute.
In one embodiment, the processor when executing the computer program further performs the steps of:
and displaying a target frame where the target is in each frame of image in the target tracking process.
In one embodiment, the processor when executing the computer program further performs the steps of:
and adjusting the position of the holder in real time according to the position of the target frame so as to display the target frame and the targets in the target frame at the set position of the display interface.
In one embodiment, the processor when executing the computer program further performs the steps of:
and carrying out slow motion processing on the targets in the target frame and displaying the targets.
In one embodiment, a target tracking system is provided. Referring to the exemplary diagram shown in FIG. 1, the system may include a cradle head and a photographing device connected to the cradle head, where the photographing device includes the above computer device.
In one embodiment, a cradle head is provided, comprising a photographing device and the above-mentioned computer device connected to the photographing device.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring tracking conditions corresponding to the target at the current moment in the target tracking process; determining the tracking state of the target according to the tracking conditions; the tracking state includes tracking success or tracking failure; if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
In one embodiment, the tracking condition includes: the target being tracked to a key frame image and/or whether tracking of the target has failed for a consecutive preset number of frames.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring initial characteristics corresponding to a target in an initial frame image; acquiring re-identification features corresponding to targets in the key frame images; the key frame image is an image of which the time sequence is behind the initial frame image; and determining the tracking state of the target according to the initial characteristic and the re-identification characteristic.
In one embodiment, the computer program when executed by the processor further performs the steps of:
matching the initial characteristic and the re-identification characteristic to obtain a first matching result; and determining the tracking state of the target according to the first matching result.
In one embodiment, the computer program when executed by the processor further performs the steps of:
calculating the similarity between the initial feature and the re-identification feature; and determining a first matching result according to the similarity.
In one embodiment, the computer program when executed by the processor further performs the steps of:
if the first matching result is successful, determining that the tracking state of the target is successful; if the first matching result is the matching failure, determining the tracking state of the target as the tracking failure.
In one embodiment, the computer program when executed by the processor further performs the steps of:
respectively carrying out section division on the value ranges corresponding to the three channels of the initial characteristics to obtain a plurality of initial sections; respectively carrying out section division on the value ranges corresponding to the three channels of the re-identification characteristic to obtain a plurality of re-identification sections; determining the similarity between the initial features and the re-identification features according to the initial sections and the re-identification sections; each initial section includes a set number of pixels therein; each re-identification section includes a set number of pixels therein.
In one embodiment, the computer program when executed by the processor further performs the steps of:
removing the sections with zero pixel quantity in each initial section to obtain a first color feature vector; removing the sections with zero pixel number in each re-identification section to obtain a second color feature vector; a similarity between the first color feature vector and the second color feature vector is calculated.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and executing a failure recovery operation, wherein the failure recovery operation comprises the following steps: identifying and extracting the characteristics of each candidate object in the current frame image to obtain candidate characteristics of each candidate object; and determining a tracking result corresponding to the target according to the initial characteristic and each candidate characteristic.
In one embodiment, the computer program when executed by the processor further performs the steps of:
matching the initial feature with each candidate feature to obtain each second matching result; and determining a tracking result corresponding to the target according to each second matching result.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and if the second matching results are all matching failure, taking the next frame image after the current frame image as a new current frame image, and returning to execute the failure recovery operation until a preset loop cut-off condition is reached, so as to obtain a tracking result corresponding to the target.
In one embodiment, the preset cycle cut-off condition includes at least one of: one second matching result exists in the second matching results, and the second matching result is successful in matching; a preset recovery time length; the number of image frames is preset.
In one embodiment, the target is a pet, and the initial feature corresponding to the target includes at least one of a color feature, a histogram of oriented gradients (HOG) feature, a depth feature (Deep Feature), and a category attribute.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and displaying a target frame where the target is in each frame of image in the target tracking process.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and adjusting the position of the holder in real time according to the position of the target frame so as to display the target frame and the targets in the target frame at the set position of the display interface.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and carrying out slow motion processing on the targets in the target frame and displaying the targets.
In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of:
Acquiring tracking conditions corresponding to the target at the current moment in the target tracking process; determining the tracking state of the target according to the tracking conditions; the tracking state includes tracking success or tracking failure; if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
In one embodiment, the tracking condition includes: the target being tracked to a key frame image and/or whether tracking of the target has failed for a consecutive preset number of frames.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring initial characteristics corresponding to a target in an initial frame image; acquiring re-identification features corresponding to targets in the key frame images; the key frame image is an image of which the time sequence is behind the initial frame image; and determining the tracking state of the target according to the initial characteristic and the re-identification characteristic.
In one embodiment, the computer program when executed by the processor further performs the steps of:
matching the initial characteristic and the re-identification characteristic to obtain a first matching result; and determining the tracking state of the target according to the first matching result.
In one embodiment, the computer program when executed by the processor further performs the steps of:
Calculating the similarity between the initial feature and the re-identification feature; and determining a first matching result according to the similarity.
In one embodiment, the computer program when executed by the processor further performs the steps of:
if the first matching result is successful, determining that the tracking state of the target is successful; if the first matching result is the matching failure, determining the tracking state of the target as the tracking failure.
In one embodiment, the computer program when executed by the processor further performs the steps of:
respectively carrying out section division on the value ranges corresponding to the three channels of the initial characteristics to obtain a plurality of initial sections; respectively carrying out section division on the value ranges corresponding to the three channels of the re-identification characteristic to obtain a plurality of re-identification sections; determining the similarity between the initial features and the re-identification features according to the initial sections and the re-identification sections; each initial section includes a set number of pixels therein; each re-identification section includes a set number of pixels therein.
In one embodiment, the computer program when executed by the processor further performs the steps of:
removing the sections with zero pixel quantity in each initial section to obtain a first color feature vector; removing the sections with zero pixel number in each re-identification section to obtain a second color feature vector; a similarity between the first color feature vector and the second color feature vector is calculated.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and executing a failure recovery operation, wherein the failure recovery operation comprises the following steps: identifying and extracting the characteristics of each candidate object in the current frame image to obtain candidate characteristics of each candidate object; and determining a tracking result corresponding to the target according to the initial characteristic and each candidate characteristic.
In one embodiment, the computer program when executed by the processor further performs the steps of:
matching the initial feature with each candidate feature to obtain each second matching result; and determining a tracking result corresponding to the target according to each second matching result.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and if the second matching results are all matching failure, taking the next frame image after the current frame image as a new current frame image, and returning to execute the failure recovery operation until a preset loop cut-off condition is reached, so as to obtain a tracking result corresponding to the target.
In one embodiment, the preset cycle cut-off condition includes at least one of: one second matching result exists in the second matching results, and the second matching result is successful in matching; a preset recovery time length; the number of image frames is preset.
In one embodiment, the target is a pet, and the initial feature corresponding to the target includes at least one of a color feature, a histogram of oriented gradients (HOG) feature, a depth feature (Deep Feature), and a category attribute.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and displaying a target frame where the target is in each frame of image in the target tracking process.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and adjusting the position of the holder in real time according to the position of the target frame so as to display the target frame and the targets in the target frame at the set position of the display interface.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and carrying out slow motion processing on the targets in the target frame and displaying the targets.
The data involved in this application (including, but not limited to, data for analysis, stored data, displayed data, etc.) are all data fully authorized by each party concerned.
Those skilled in the art will appreciate that all or part of the procedures of the above method embodiments may be implemented by a computer program stored on a non-volatile computer-readable storage medium, which, when executed, may include the procedures of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) and dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be, but are not limited to, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, or data processing logic units based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to be within the scope of this specification.
The above examples represent only a few embodiments of the present application; their description is relatively specific and detailed, but is not therefore to be construed as limiting the scope of the patent application. It should be noted that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Accordingly, the protection scope of the present application shall be subject to the appended claims.
Claims (22)
1. A method of target tracking, the method comprising:
acquiring tracking conditions corresponding to the target at the current moment in the target tracking process;
determining the tracking state of the target according to the tracking condition; the tracking state comprises tracking success or tracking failure;
if the tracking state is the tracking failure, performing failure recovery processing on the target based on the current frame image, and determining a tracking result corresponding to the target.
2. The method of claim 1, wherein the tracking condition comprises: the target being tracked to a key frame image and/or whether tracking of the target has failed for a consecutive preset number of frames; the key frame image is an image whose time sequence is after the initial frame image.
3. The method of claim 2, wherein the tracking condition is that the target is tracked to a key frame image, and the determining the tracking state of the target according to the tracking condition comprises:
acquiring initial characteristics corresponding to the target in an initial frame image;
acquiring re-identification features corresponding to the targets in the key frame images;
and determining the tracking state of the target according to the initial characteristic and the re-identification characteristic.
4. A method according to claim 3, wherein said determining the tracking state of the target from the initial feature and the re-identification feature comprises:
matching the initial feature and the re-identification feature to obtain a first matching result;
and determining the tracking state of the target according to the first matching result.
5. The method of claim 4, wherein the matching the initial feature and the re-identified feature to obtain a first matching result comprises:
Calculating the similarity between the initial feature and the re-identification feature;
and determining the first matching result according to the similarity.
6. The method according to claim 4 or 5, wherein said determining the tracking state of the target from the first matching result comprises:
if the first matching result is successful, determining that the tracking state of the target is successful;
and if the first matching result is the matching failure, determining that the tracking state of the target is the tracking failure.
7. The method of claim 5, wherein the initial feature and the re-identification feature are both color features, and wherein the calculating the similarity between the initial feature and the re-identification feature comprises:
respectively carrying out section division on the value ranges corresponding to the three channels of the initial characteristic to obtain a plurality of initial sections; each initial section comprises a set number of pixels;
dividing the value ranges corresponding to the three channels of the re-identification feature into sections respectively to obtain a plurality of re-identification sections; each re-identification section comprises a set number of pixels;
And determining the similarity between the initial feature and the re-identification feature according to each initial section and each re-identification section.
8. The method of claim 7, wherein said determining a similarity between said initial feature and said re-identified feature based on each of said initial segment and each of said re-identified segments comprises:
removing the sections with zero pixel numbers in the initial sections to obtain a first color feature vector;
removing the sections with zero pixel number in each re-identification section to obtain a second color feature vector;
and calculating the similarity between the first color feature vector and the second color feature vector.
9. The method according to any one of claims 1-5, wherein the performing a failure recovery process on the target based on the current frame image, and determining a tracking result corresponding to the target, includes:
performing a failure recovery operation, the failure recovery operation comprising: identifying and extracting the characteristics of each candidate object in the current frame image to obtain candidate characteristics of each candidate object; and determining a tracking result corresponding to the target according to the initial characteristic and each candidate characteristic.
10. The method of claim 9, wherein determining the tracking result corresponding to the target based on the initial feature and each of the candidate features comprises:
matching the initial feature with each candidate feature to obtain each second matching result;
and determining a tracking result corresponding to the target according to each second matching result.
11. The method of claim 10, wherein determining the tracking result corresponding to the target according to each of the second matching results comprises:
and if the second matching results are all matching failure, taking the next frame image after the current frame image as a new current frame image, and returning to execute the failure recovery operation until a preset loop cut-off condition is reached, so as to obtain a tracking result corresponding to the target.
12. The method of claim 11, wherein the predetermined cycle cut-off condition comprises at least one of:
one second matching result among the second matching results being a successful match;
a preset recovery time length;
the number of image frames is preset.
13. The method of any of claims 3-5, wherein the target is a pet and the initial feature corresponding to the target includes at least one of a color feature, a histogram of oriented gradients (HOG) feature, a depth feature (Deep Feature), and a category attribute.
14. The method according to any one of claims 1-5, further comprising:
and displaying a target frame where the target is in each frame of image in the target tracking process.
15. The method of claim 14, wherein the method further comprises:
and adjusting the position of the cradle head in real time according to the position of the target frame so as to display the target frame and the targets in the target frame at the set position of the display interface.
16. The method of claim 14, wherein the method further comprises:
and carrying out slow motion processing on the targets in the target frame and displaying the targets.
17. A target tracking device, the device comprising:
the condition acquisition module is used for acquiring tracking conditions corresponding to the target at the current moment in the target tracking process;
the state determining module is used for determining the tracking state of the target according to the tracking condition; the tracking state comprises tracking success or tracking failure;
and the tracking module is used for carrying out failure recovery processing on the target based on the current frame image if the tracking state is the tracking failure, and determining a tracking result corresponding to the target.
18. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 16 when the computer program is executed.
19. A target tracking system, comprising:
a cradle head;
a photographing device connected to the pan-tilt, wherein the photographing device comprises the computer device of claim 18.
20. A cradle head, comprising: a camera device, the computer device of claim 18 connected to the camera device.
21. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 16.
22. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 16.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310279713.5A CN116468753A (en) | 2023-03-20 | 2023-03-20 | Target tracking method, apparatus, device, storage medium, and program product |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116468753A true CN116468753A (en) | 2023-07-21 |
Family
ID=87183342
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310279713.5A Pending CN116468753A (en) | 2023-03-20 | 2023-03-20 | Target tracking method, apparatus, device, storage medium, and program product |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116468753A (en) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117058714A (en) * | 2023-08-15 | 2023-11-14 | 深圳思谋信息科技有限公司 | Hand lifting statistical method, device, equipment and storage medium |
| CN118018865A (en) * | 2024-03-14 | 2024-05-10 | 四川省寰宇众恒科技有限公司 | All-weather scene object cross-lens tracking method and system |
| CN118018865B (en) * | 2024-03-14 | 2024-06-07 | 四川省寰宇众恒科技有限公司 | All-weather scene object cross-lens tracking method and system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |