
CN114565777B - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number: CN114565777B
Application number: CN202210192315.5A
Authority: CN (China)
Prior art keywords: feature point; feature; points; map; descriptor
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN114565777A
Inventor: 陈一平
Current assignee: Vivo Mobile Communication Co Ltd
Original assignee: Vivo Mobile Communication Co Ltd
Application filed by Vivo Mobile Communication Co Ltd; application granted; published as CN114565777A and CN114565777B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/269 - Analysis of motion using gradient-based methods
    • G06T 7/50 - Depth or shape recovery
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10016 - Video; Image sequence
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30244 - Camera pose


Abstract

The application discloses a data processing method and device, belonging to the field of simultaneous localization and mapping. The method comprises: extracting a plurality of first feature point sets from video data comprising multiple frames of images, wherein each first feature point set comprises a plurality of feature points corresponding to the same feature object, and each feature point is extracted from a corresponding frame of image; acquiring the descriptor of each feature point, and merging the plurality of first feature point sets according to the acquired descriptors to obtain a plurality of second feature point sets; generating a corresponding map point according to each second feature point set, and determining at least one pair of map points to be judged among the generated map points, wherein the distance between the two map points in each pair is smaller than a first distance threshold; and merging each pair of map points to be judged according to the reprojection error corresponding to that pair, to obtain a corresponding merged map point.

Description

Data processing method and device
Technical Field
The application belongs to the field of simultaneous localization and map construction, and particularly relates to a data processing method and device.
Background
SLAM (Simultaneous Localization and Mapping) refers to a robot performing its own localization while moving in an unknown environment and building an incremental map from the acquired environment information. SLAM that collects the environment information through a camera is visual SLAM, which has the advantages of low cost, rich image information and the like, and has attracted increasing attention.
However, as the camera moves, a feature object in the environment may disappear from the camera's view and later reappear. The feature object that reappears after disappearing is treated as a brand-new feature object, so the environment information collected by the camera contains a large amount of redundant data. The redundant data occupies a large amount of memory, which increases the data processing time and reduces the data processing efficiency.
Disclosure of Invention
The embodiments of the application aim to provide a data processing method and device, which can reduce the redundant data in the collected environment information.
In a first aspect, an embodiment of the present application provides a data processing method, including:
Extracting a plurality of first feature point sets from video data comprising a plurality of frame images, wherein each first feature point set comprises a plurality of feature points corresponding to the same feature object;
Acquiring descriptors of each feature point, and combining the plurality of first feature point sets according to the acquired descriptors to obtain a plurality of second feature point sets;
generating a corresponding map point according to each second characteristic point set, and determining at least one pair of map points to be judged in the generated plurality of map points, wherein the distance between each pair of map points to be judged is smaller than a first distance threshold value;
And combining each pair of map points to be judged according to the reprojection error corresponding to the map points to be judged, so as to obtain corresponding combined map points.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
an extraction module, configured to extract a plurality of first feature point sets from video data comprising multiple frames of images, wherein each first feature point set comprises a plurality of feature points corresponding to the same feature object, and each feature point is extracted from a corresponding frame of image;
the first merging module is used for acquiring descriptors of each feature point, and merging the plurality of first feature point sets according to the acquired descriptors to obtain a plurality of second feature point sets;
The determining module is used for generating a corresponding map point according to each second characteristic point set, and determining at least one pair of map points to be judged in the generated plurality of map points;
and the second merging module is used for merging each pair of map points to be judged according to the re-projection errors corresponding to the map points to be judged, so as to obtain corresponding merged map points.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instructions stored on the memory and executable on the processor, the program or instructions implementing the steps of the data processing method according to the first aspect when executed by the processor.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the data processing method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement a data processing method according to the first aspect.
In the embodiment of the application, a plurality of first feature point sets are extracted from video data comprising multiple frames of images, wherein each first feature point set comprises a plurality of feature points corresponding to the same feature object and each feature point is extracted from a corresponding frame of image; the descriptor of each feature point is acquired, and the first feature point sets are merged according to the acquired descriptors to obtain a plurality of second feature point sets; a corresponding map point is generated according to each second feature point set, and at least one pair of map points to be judged is determined among the generated map points, the distance between the two map points in each pair being smaller than a first distance threshold; and each pair of map points to be judged is merged according to the corresponding reprojection error, to obtain a corresponding merged map point. In this way, when a feature object is observed many times, the first feature point sets can be merged into second feature point sets, which reduces the redundant data caused by identifying the same feature object as corresponding to multiple first feature point sets; map points corresponding to the second feature point sets are then generated, and map points to be judged that are close to each other are merged, which reduces the redundant data caused by identifying the same feature object as corresponding to multiple map points, thereby improving the data processing efficiency when processing the environment information.
Drawings
FIG. 1 is a flowchart of a first data processing method according to an embodiment of the present application;
FIG. 2 is a flowchart of a second data processing method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of merging a first feature point set according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a reprojection error according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of merging map points according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present application;
FIG. 7 is a schematic block diagram of an electronic device provided by an embodiment of the present application;
fig. 8 is a schematic hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which are obtained by a person skilled in the art based on the embodiments of the present application, fall within the scope of protection of the present application.
The terms "first", "second" and the like in the description and the claims are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments of the application may be implemented in sequences other than those illustrated or described herein. Objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, in the description and the claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The data processing method and device provided by the embodiment of the application are described in detail through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Fig. 1 is a flowchart of a first data processing method according to an embodiment of the present application.
Step 102, extracting a plurality of first feature point sets from video data comprising a plurality of frame images, wherein each first feature point set comprises a plurality of feature points corresponding to the same feature object, and each feature point is extracted from a corresponding frame image.
The video data may be obtained by photographing with a camera, which may be a monocular camera, a binocular camera, or a depth camera. The embodiment of the application does not limit the types of cameras in particular.
The camera may change its pose during the capture of the acquired video data. The pose of the camera may include position information of the camera and a photographing angle of the camera.
The feature object may be an object that appears in a plurality of frames of images included in the video data, for example, a tree, a stool, a person. When a feature object appears in a plurality of frames of images, it can be considered that the feature object is observed by a camera capturing video data a plurality of times, and in a piece of video data, the number of frames of images containing the same feature object can be considered as the observed number of times of the feature object.
Each first feature point set includes a plurality of feature points corresponding to the same feature object, and each feature point is extracted from a corresponding frame of image. For example, a piece of video data includes 60 frames of images, and each of the 10 frames from the 11th to the 20th contains a stool, whose position differs from frame to frame because the pose of the camera changes over time. One feature point corresponding to the stool is extracted from each of the 11th through 20th frame images; the 10 extracted feature points form one first feature point set, the 10 feature points correspond to the same stool, and each feature point is extracted from a corresponding frame of image.
To extract the first feature point sets from the video data, at least one feature point is extracted from each frame of image of the video data, and feature matching or feature tracking is performed on the feature points between adjacent frames of images, so as to determine the first feature point sets corresponding to the same feature object.
The feature points extracted from each frame of image of the video data may be ORB (Oriented FAST and Rotated BRIEF) feature points, SIFT (Scale-Invariant Feature Transform) feature points, SURF (Speeded-Up Robust Features) feature points, or the like.
The feature matches may be descriptor matches. The feature tracking may be optical flow tracking.
A descriptor may be understood as follows: for each feature point, the feature information of a circle of pixel points around the feature point can be described by a set of binary numbers; the feature information includes, but is not limited to, luminance information and color information.
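Binary descriptors of this kind are commonly compared with the Hamming distance, i.e., the number of differing bits. The following is a minimal sketch with illustrative names and toy 8-bit descriptors; it is not code from the patent:

```python
def hamming_distance(d1, d2):
    """Number of differing bits between two equal-length binary descriptors."""
    assert len(d1) == len(d2)
    return sum(b1 != b2 for b1, b2 in zip(d1, d2))

# Two toy 8-bit descriptors differing in 3 bit positions.
desc_a = [1, 0, 1, 1, 0, 0, 1, 0]
desc_b = [1, 1, 1, 0, 0, 0, 1, 1]
print(hamming_distance(desc_a, desc_b))  # 3
```

A smaller distance means the two circles of surrounding pixels carry more similar feature information.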
In particular, each frame of image in the video data may be pre-processed prior to execution of step 102. The image preprocessing mode can be to correct distortion of each frame of image according to the calibration parameters of the camera. Calibration parameters include, but are not limited to, camera focal length, camera offset, and camera distortion parameters. The image preprocessing mode may be an adjustment operation for brightness of an image, a motion blur processing operation for an image, or the like. The image with better quality is obtained by preprocessing the image, so that the feature extraction error is reduced in the process of extracting the feature points.
Step 104, acquiring descriptors of each feature point, and combining the plurality of first feature point sets according to the acquired descriptors to obtain a plurality of second feature point sets.
After feature extraction and feature matching are completed, a similarity determination may be made for the first feature point sets. The embodiment of the application judges the degree of similarity between first feature point sets by using the descriptor distance.
Optionally, merging the plurality of first feature point sets according to the acquired descriptors to obtain the plurality of second feature point sets includes: calculating, according to the descriptor of each feature point, the descriptor distance between any two feature points in each first feature point set; counting, according to the calculated descriptor distances, the descriptor distance set corresponding to each feature point, wherein the descriptor distance set comprises the descriptor distances between the corresponding feature point and the other feature points belonging to the same first feature point set; determining, according to the descriptor distance set corresponding to each feature point, the target descriptor corresponding to each first feature point set; and merging the plurality of first feature point sets according to the target descriptor corresponding to each first feature point set, to obtain the plurality of second feature point sets.
Specifically, the descriptor distance between any two feature points in each first feature point set is calculated according to the descriptor of each feature point, and the descriptor distance set corresponding to each feature point is collected according to the calculated descriptor distances, wherein the descriptor distance set comprises the descriptor distances between the corresponding feature point and the other feature points belonging to the same first feature point set.
For example, the first feature point set 1 includes 3 feature points: feature point 1, feature point 2, and feature point 3. For feature point 1, feature point 2 and feature point 3 are the other feature points belonging to the same first feature point set as feature point 1. For feature point 2, feature point 1 and feature point 3 are the other feature points belonging to the same first feature point set as feature point 2. For feature point 3, feature point 1 and feature point 2 are the other feature points belonging to the same first feature point set as feature point 3.
The descriptor distance of any two feature points in the first feature point set 1 can be calculated, namely, the descriptor distance r1 of the descriptor of the feature point 1 and the descriptor of the feature point 2, the descriptor distance r2 of the descriptor of the feature point 1 and the descriptor of the feature point 3, and the descriptor distance r3 of the descriptor of the feature point 2 and the descriptor of the feature point 3.
The descriptor distance set corresponding to the feature point 1 comprises r1 and r2, the descriptor distance set corresponding to the feature point 2 comprises r1 and r3, and the descriptor distance set corresponding to the feature point 3 comprises r2 and r3.
Optionally, determining the target descriptor corresponding to each first feature point set according to the descriptor distance set corresponding to each feature point includes: averaging the descriptor distances in the descriptor distance set corresponding to each feature point to obtain the average descriptor distance corresponding to each feature point; and determining, in each first feature point set, the descriptor of the feature point with the smallest average descriptor distance as the target descriptor corresponding to that first feature point set.
The descriptor distances in the descriptor distance set corresponding to each feature point are averaged to obtain the average descriptor distance corresponding to that feature point. For example, the first feature point set 1 includes 3 feature points: feature point 1, feature point 2, and feature point 3. The descriptor distance set corresponding to feature point 1 comprises r1 and r2, that corresponding to feature point 2 comprises r1 and r3, and that corresponding to feature point 3 comprises r2 and r3. Averaging each set yields the average descriptor distances (r1+r2)/2 for feature point 1, (r1+r3)/2 for feature point 2, and (r2+r3)/2 for feature point 3.
In each first feature point set, the descriptor of the feature point with the smallest average descriptor distance is determined as the target descriptor corresponding to that set. For example, in the first feature point set 1, suppose the average descriptor distance (r1+r2)/2 corresponding to feature point 1 is the smallest, (r1+r3)/2 corresponding to feature point 2 is the largest, and (r2+r3)/2 corresponding to feature point 3 lies between the two; then the descriptor of feature point 1 is determined as the target descriptor corresponding to the first feature point set 1.
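The target-descriptor selection above can be sketched as follows. This is a hypothetical illustration, not code from the patent: the function names and the toy 4-bit descriptors are assumptions, and the Hamming distance stands in for the descriptor distance, as is common for binary descriptors:

```python
def hamming(d1, d2):
    """Descriptor distance between two equal-length binary descriptors."""
    return sum(b1 != b2 for b1, b2 in zip(d1, d2))

def target_descriptor(descriptors):
    """Return the descriptor whose average distance to the others is smallest."""
    best, best_avg = None, float("inf")
    for i, d in enumerate(descriptors):
        dists = [hamming(d, other) for j, other in enumerate(descriptors) if j != i]
        avg = sum(dists) / len(dists)
        if avg < best_avg:
            best, best_avg = d, avg
    return best

# Mirrors the r1/r2/r3 example: the first descriptor is the most "central" one.
set1 = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0]]
print(target_descriptor(set1))  # [0, 0, 0, 0]
```

Here the pairwise distances are r1 = 1, r2 = 2, r3 = 3, so the first feature point has the smallest average distance, (1+2)/2, and its descriptor is chosen as the target descriptor.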
Optionally, merging the plurality of first feature point sets according to the target descriptor corresponding to each first feature point set includes: determining at least one pair of feature point sets to be judged among the plurality of first feature point sets; calculating, according to the target descriptor corresponding to each first feature point set, the descriptor distance of each pair of feature point sets to be judged, to obtain the target descriptor distance; and merging each pair of feature point sets to be judged whose target descriptor distance is smaller than a second distance threshold, to obtain a corresponding merged feature point set.
The second feature point set may be a corresponding one of the merged feature point sets obtained by merging at least two of the first feature point sets, or may be a non-merged feature point set corresponding to the first feature point set. Among the plurality of first feature point sets, one part of the first feature point sets can be subjected to combination processing, and the other part of the first feature point sets cannot be subjected to combination processing.
The two first feature point sets that can be subjected to the merging process can be regarded as two first feature point sets corresponding to the same feature object. For example, in a 60 second video data, a first feature point set 1 corresponding to the first stone is extracted from the 10 th frame to the 20 th frame, a first feature point set 2 corresponding to the second stone is extracted from the 45 th frame to the 52 th frame, and if the first feature point set 1 and the first feature point set 2 can be combined, it can be considered that the first stone and the second stone are the same stone.
At least one feature point set to be judged is determined in the plurality of first feature point sets, for example, the plurality of first feature point sets include a first feature point set 1 corresponding to the stone A and a first feature point set 2 corresponding to the stone B, and the first feature point set 1 and the first feature point set 2 are determined as a pair of feature point sets to be judged in the case that the similarity of the stone A and the stone B is greater than a preset similarity threshold.
And calculating the descriptor distance of each feature point set to be judged according to the target descriptor corresponding to each first feature point set, and obtaining the target descriptor distance. For example, the pair of feature point sets to be judged includes a first feature point set 1 and a first feature point set 2. The first feature point set 1 comprises a feature point 1, a feature point 2 and a feature point 3, the first feature point set 2 comprises a feature point 4, a feature point 5, a feature point 6 and a feature point 7, the target descriptor corresponding to the first feature point set 1 is a descriptor of the feature point 1, and the target descriptor corresponding to the first feature point set 2 is a descriptor of the feature point 6. The descriptor distance of the descriptor of the feature point 1 and the descriptor of the feature point 6 is calculated as the descriptor distance of the feature point set to be judged, i.e., the target descriptor distance.
Each pair of feature point sets to be judged whose target descriptor distance is smaller than the second distance threshold is merged, to obtain a corresponding merged feature point set.
The second distance threshold may be a preset descriptor distance threshold.
When the target descriptor distance of any pair of feature point sets to be judged is smaller than the second distance threshold, the pair of feature point sets is merged to obtain one merged feature point set corresponding to that pair.
When the target descriptor distance of any pair of feature point sets to be judged is larger than the second distance threshold, it is determined that the pair of feature point sets does not need to be merged.
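The merge decision above can be sketched as follows. This is a hypothetical illustration (the function name, the bit-vector descriptors, and the Hamming distance are assumptions, not from the patent):

```python
def should_merge(target_desc_1, target_desc_2, second_distance_threshold):
    """Merge two candidate feature point sets only if their target descriptors are close."""
    dist = sum(b1 != b2 for b1, b2 in zip(target_desc_1, target_desc_2))
    return dist < second_distance_threshold

print(should_merge([1, 0, 1, 1], [1, 0, 0, 1], 2))  # True  (distance 1)
print(should_merge([1, 0, 1, 1], [0, 1, 0, 0], 2))  # False (distance 4)
```

Two sets that pass this check are treated as observations of the same feature object and collapsed into one second feature point set.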
Step 104 may be described in detail below in conjunction with fig. 2. Fig. 2 is a schematic flow chart of merging a first feature point set according to an embodiment of the present application.
In the implementation, the first feature point set 1 and the first feature point set 2 to be compared may be determined from a plurality of first feature point sets according to the similarity determination result of the feature object corresponding to the first feature point set, or any two first feature point sets in the plurality of first feature point sets may be used as the first feature point set 1 and the first feature point set 2 to be compared.
As shown in fig. 2, in step 202, a descriptor distance of the first feature point set 1 is calculated.
In step 204, the descriptor distance of the first feature point set 2 is calculated.
Step 202 and step 204 may be performed simultaneously, or step 202 may be performed first, then step 204 may be performed, or vice versa.
In step 206, a target descriptor of the first feature point set 1 is obtained.
After the execution of step 202, step 206 is executed.
Step 208, obtaining the target descriptors of the first feature point set 2.
After execution of step 204, step 208 is executed.
Step 210, calculating a descriptor distance between the target descriptor of the first feature point set 1 and the target descriptor of the first feature point set 2.
Step 212, determining whether the descriptor distance is less than a second distance threshold.
If yes, go to step 214.
In step 214, the first feature point set 1 and the first feature point set 2 are combined to obtain a second feature point set 1.
Step 106, generating a corresponding map point according to each second feature point set, and determining at least one pair of map points to be judged among the generated map points, wherein the distance between the two map points in each pair is smaller than a first distance threshold.
The map points are used to reflect the specific position information, in three-dimensional space, of the feature points on each frame of image. For a monocular camera, triangulation is generally used to obtain the depth information corresponding to the feature points; for a binocular camera, the depth information corresponding to the feature points can be obtained by calculating the left-right disparity; and for a depth camera, the depth information corresponding to the feature points can be obtained directly.
Determining at least one pair of map points to be judged among the generated map points means calculating the distance between two adjacent map points and judging whether that distance is smaller than the first distance threshold; if so, the two adjacent map points are determined to be a pair of map points to be judged.
The first distance threshold may be a map point distance threshold set in advance.
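This pairwise check can be sketched as follows. The sketch assumes Euclidean distance between 3D map points; the function name and sample coordinates are illustrative, not from the patent:

```python
import math

def pairs_to_judge(map_points, first_distance_threshold):
    """Return index pairs of map points closer together than the threshold."""
    pairs = []
    for i in range(len(map_points)):
        for j in range(i + 1, len(map_points)):
            if math.dist(map_points[i], map_points[j]) < first_distance_threshold:
                pairs.append((i, j))
    return pairs

points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(pairs_to_judge(points, 0.5))  # [(0, 1)]
```

Only the first two points lie within the threshold, so only they form a pair of map points to be judged.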
Optionally, before a corresponding map point is generated according to each second feature point set, the data processing method further comprises: acquiring the target camera pose of the camera that collects the video data. Generating a corresponding map point according to each second feature point set then includes: determining the coordinate information and target depth information of the initial map point corresponding to each second feature point set according to the coordinate information and depth information of each feature point in that set; and correcting the coordinate information of the initial map point according to the target camera pose and the target depth information, to obtain the target map point corresponding to each second feature point set.
The target camera pose may be position information and pose information of a camera that collects video data.
A visual SLAM system can use the reprojection error to construct a least squares problem to optimize the results of the system, i.e., the pose, the depth, and so on.
Fig. 3 is a schematic diagram of a reprojection error according to an embodiment of the present application. As shown in fig. 3, assume that the coordinates of a certain spatial point are P = [X, Y, Z]^T and that its projected pixel coordinates are u = [u_x, u_y]^T. The correspondence between the pixel position and the spatial position is as follows:

$$s \begin{bmatrix} u_x \\ u_y \\ 1 \end{bmatrix} = K T \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \tag{1}$$
Written in matrix form:
$$s u = K T P \tag{2}$$
where s is the depth information corresponding to the feature point in the camera coordinate system, K is the intrinsic matrix of the camera, and T is the pose of the camera.
As shown in fig. 3, P may be a spatial point, p1 may be the projection of P when the camera is in pose 1, and p2 may be the projection of P when the camera is in pose 2. In practice, however, the projection at pose 2 of the spatial point P computed from the coordinate information of p1, the depth information, and camera pose 1 is inaccurate, landing instead at a point such as p2'.
That is, due to the existence of system noise in the visual SLAM system, the above formula has errors. The errors over all observations are summed to construct a least squares problem, and the target camera pose and depth are determined so that the value of the following formula (3) is minimized.
T*, P* = argmin_{T,P} (1/2) Σ_i ||u_i − (1/s_i) KTP_i||² (3)
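As an illustrative sketch (not the patent's implementation), the projection relation su = KTP of formula (2) and the summed least-squares cost of formula (3) can be written as follows; the matrix conventions and function names are assumptions:

```python
import numpy as np

def reproject(K, T, P_world):
    """Project a 3D point to pixel coordinates via s*u = K*T*P (formula (2))."""
    P_h = np.append(P_world, 1.0)       # homogeneous world coordinates
    P_cam = (T @ P_h)[:3]               # point in the camera frame
    s = P_cam[2]                        # depth s of the feature point
    u = (K @ P_cam) / s                 # homogeneous pixel coordinates
    return u[:2], s

def reprojection_cost(K, T, points, observations):
    """Sum of squared re-projection errors, the quantity minimized in formula (3)."""
    cost = 0.0
    for P, u_obs in zip(points, observations):
        u_proj, _ = reproject(K, T, P)
        cost += 0.5 * float(np.sum((np.asarray(u_obs) - u_proj) ** 2))
    return cost
```

For example, with intrinsics K = [[500, 0, 320], [0, 500, 240], [0, 0, 1]] and T the identity pose, the point (0, 0, 2) projects to the principal point (320, 240) at depth 2, and an observation at exactly that pixel contributes zero cost.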
The coordinate information of the initial map points is corrected according to the pose of the target camera and the target depth information, so that the target map points corresponding to each second feature point set are obtained.
After a feature point has been observed by the camera many times, constraints need to be constructed to participate in repairing the coordinates of the map points, so as to obtain the optimal values for the system. When feature point sets are combined, the map points corresponding to the feature points have already obtained depth values, so in addition to merging the attributes of the feature points, the map points corresponding to the feature points are also merged, and the depth is updated using the number of feature points as a weight. If the first feature point set 1 has w1 feature points with depth d1, and the first feature point set 2 has w2 feature points with depth d2, the depth d after combination is shown in the following formula (4).
d = (w1·d1 + w2·d2) / (w1 + w2) (4)
It may be understood that, when the first feature point sets are merged, the greater the number of feature points included in a first feature point set, the higher the weight its depth information carries in the merged target depth information.
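The weighted depth update of formula (4) is straightforward; a minimal sketch (the function name is illustrative):

```python
def merge_depth(w1: int, d1: float, w2: int, d2: float) -> float:
    """Depth after merging two feature point sets, weighted by the number
    of feature points in each set (formula (4))."""
    return (w1 * d1 + w2 * d2) / (w1 + w2)
```

A set of 3 feature points at depth 2.0 merged with a set of 1 feature point at depth 4.0 yields a depth of 2.5, pulled toward the better-observed set.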
In schemes where feature points are observed by a monocular camera or a binocular camera, a large amount of relatively accurate depth information and a small amount of less accurate depth information may be acquired. In this case, merging the depths of the feature point sets in the above manner improves the reliability of the depth information, so it is unnecessary to acquire depth information again by other means. In addition, in the subsequent step 108, the depth information of the feature points is needed to merge the map points; since merging reduces the amount of depth information, the calculation amount in the map point merging process can be effectively reduced, and the influence of sensor noise on system precision can be avoided.
And step 108, combining the map points to be judged according to the re-projection errors corresponding to the map points to be judged, and obtaining the corresponding combined map points.
The method comprises the steps of calculating the re-projection error of each feature point in each map point to be judged according to the depth information and the coordinate information of each feature point in a second feature point set corresponding to each map point to be judged to obtain a first re-projection error of each feature point, determining a target feature point corresponding to each map point to be judged according to the median value of the first re-projection error of each feature point, calculating the re-projection error of each target feature point in a crossing mode according to the depth information and the coordinate information of the target feature point corresponding to each map point to be judged to obtain a second re-projection error of each target feature point, and combining the two map points to be judged, wherein the sum of the two second re-projection errors is smaller than a preset error threshold value, to obtain a corresponding combined map point.
In each map point to be judged, the re-projection error of each feature point is calculated according to the depth information and the coordinate information of each feature point in the second feature point set corresponding to the map point, so as to obtain a first re-projection error R1 of each feature point. For example, a pair of map points to be judged includes map point 1 and map point 2. The calculation formula of the re-projection error can refer to formula (3) above, namely
R1 = ||u − (1/s) KTP||² (5)
The second characteristic point set corresponding to the map point 1 comprises a characteristic point 1, a characteristic point 2 and a characteristic point 3, and the second characteristic point set corresponding to the map point 2 comprises a characteristic point 4, a characteristic point 5 and a characteristic point 6. The re-projection errors of the feature points 1 to 6 can be calculated, respectively, to obtain six first re-projection errors R1.
The target feature point corresponding to each map point to be judged is determined according to the median value of the first re-projection errors of its feature points. For example, for map point 1, if the R1 value of feature point 1 is the smallest, the R1 value of feature point 2 is in the middle, and the R1 value of feature point 3 is the largest, then the target feature point corresponding to map point 1 is feature point 2.
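The median-based choice of target feature point can be sketched as follows (taking the upper median for an even count is an assumption):

```python
def target_feature_point(feature_points, first_errors):
    """Return the feature point whose first re-projection error R1 is the
    median of the errors in the map point's second feature point set."""
    order = sorted(range(len(first_errors)), key=lambda i: first_errors[i])
    return feature_points[order[len(order) // 2]]
```

With errors (0.1, 0.9, 0.5) for feature points 1 to 3, the point with error 0.5 (the median) is selected, avoiding both the luckiest and the worst observation.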
And according to the depth information and the coordinate information of the target feature points corresponding to each map point to be judged, calculating the re-projection error of each target feature point in a crossing manner, and obtaining a second re-projection error of each target feature point.
For example, the re-projection error of the target feature point of map point 1 with respect to map point 2 is calculated as the second re-projection error R2 of the target feature point of map point 1, and the re-projection error of the target feature point of map point 2 with respect to map point 1 is calculated as the second re-projection error R2' of the target feature point of map point 2.
Each pair of map points to be judged, of which the sum of the two second re-projection errors is smaller than a preset error threshold, is combined to obtain a corresponding combined map point.
The preset error threshold may be a preset re-projection error threshold r. For example, when R2 + R2' is smaller than r, map point 1 and map point 2 may be combined to obtain a corresponding combined map point, and when R2 + R2' is greater than or equal to r, it may be determined that map point 1 and map point 2 do not need to be combined.
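A sketch of the cross-check and merge decision; the merged coordinate (a simple average) and the helper names are assumptions, not taken from the patent:

```python
import numpy as np

def pixel_error(K, T, P, u_obs):
    """Squared re-projection error of an observed pixel against a 3D point (cf. formula (5))."""
    P_cam = (T @ np.append(P, 1.0))[:3]
    u = (K @ P_cam / P_cam[2])[:2]
    return float(np.sum((np.asarray(u_obs) - u) ** 2))

def try_merge(K, T, P1, u1, P2, u2, r):
    """Cross-calculate R2 and R2': each map point's target observation is
    re-projected against the other map point; merge when R2 + R2' < r."""
    r2 = pixel_error(K, T, P2, u1)      # target point of map point 1 vs. map point 2
    r2p = pixel_error(K, T, P1, u2)     # target point of map point 2 vs. map point 1
    if r2 + r2p < r:
        return (np.asarray(P1) + np.asarray(P2)) / 2.0   # merged map point
    return None
```

Two candidate map points that re-project onto each other's target observations within the threshold are fused into one point; otherwise both are kept.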
Fig. 4 is a schematic flow chart of merging map points according to an embodiment of the present application.
As shown in fig. 4, in step 402, a map point 1 associated feature point re-projection error is calculated.
Step 404, calculating the re-projection error of the associated feature points of the map point 2.
In step 406, the target feature point corresponding to the map point 1 is obtained.
In step 408, the target feature point corresponding to the map point 2 is obtained.
In step 410, the re-projection error with map point 2 is calculated in a cross manner to obtain r_1.
In step 412, the re-projection error with map point 1 is cross-calculated to obtain r_2.
In step 414, it is determined whether the sum of r_1 and r_2 is less than a predetermined error threshold.
If yes, go to step 416.
In step 416, map point 1 is merged with map point 2.
Optionally, after combining the map points to be judged according to the re-projection errors corresponding to the map points to be judged to obtain the corresponding combined map points, determining whether the number of the feature points corresponding to each combined map point is larger than a preset number threshold, if the number of the feature points corresponding to each combined map point is larger than the preset number threshold, determining the feature points corresponding to each combined map point as feature points to be screened, calculating corresponding association scores according to the first re-projection errors, the average descriptor distance and the observed times of the corresponding feature objects of each feature point to be screened, and screening the feature points to be screened corresponding to each combined map point according to the association scores corresponding to each feature point to be screened to obtain at least one representative feature point corresponding to the combined map point.
After the step 108 is executed, the number of feature points associated with the map points may be analyzed, and whether the number of feature points is greater than a set number threshold may be determined, so as to screen out an optimal feature point set. The number of feature points associated with the map points may be the number of feature points included in the second feature point set corresponding to the map points.
In this scheme, the association score of each feature point associated with a map point is calculated through the following formula (6), the feature points are sorted according to their association scores, and at least one representative feature point corresponding to the combined map point is determined according to the sorting result.
s = a·E_rep + b·D_avg + c·N_ob (6)
Wherein s is the association score, E_rep is the first re-projection error, a is the re-projection error coefficient, D_avg is the average descriptor distance, b is the average descriptor distance coefficient, N_ob is the number of times the feature point is observed, and c is the observation count coefficient.
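Assuming formula (6) is the linear combination suggested by the coefficient list above (the sign convention and the ranking direction are assumptions), the scoring and screening can be sketched as:

```python
def association_score(e_rep, d_avg, n_ob, a=1.0, b=1.0, c=1.0):
    """Association score of formula (6): a combination of the first
    re-projection error E_rep, the average descriptor distance D_avg,
    and the observation count N_ob; a, b, c are tuning coefficients
    (the unit values here are placeholders)."""
    return a * e_rep + b * d_avg + c * n_ob

def representative_points(points, scores, keep):
    """Sort feature points by association score and keep the `keep`
    best-ranked ones as representatives of the combined map point."""
    ranked = sorted(zip(points, scores), key=lambda t: t[1])
    return [p for p, _ in ranked[:keep]]
```

In practice the coefficients would be tuned so that accurate, well-matched, frequently observed feature points rank first.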
In the embodiment shown in fig. 1, a plurality of first feature point sets are extracted from video data comprising multiple frames of images, each first feature point set comprising a plurality of feature points corresponding to the same feature object, each feature point being extracted from a corresponding frame of image. Descriptors of the feature points are obtained, and the first feature point sets are combined according to the obtained descriptors to obtain a plurality of second feature point sets. A corresponding map point is generated according to each second feature point set, and at least one pair of map points to be judged is determined among the generated map points, wherein the distance between the map points in each pair is smaller than a first distance threshold. Each pair of map points to be judged is then combined according to its corresponding re-projection error to obtain a corresponding combined map point. According to the embodiment of the application, when a feature object is observed many times, the first feature point sets can be combined into second feature point sets, which reduces the redundant data caused by identifying the same feature object as multiple first feature point sets; map points corresponding to the second feature point sets can be generated, and map points to be judged that are close to each other can be combined, which reduces the redundant data caused by identifying the same feature object as multiple map points. Data processing efficiency is thereby improved when processing data using environment information.
Fig. 5 is a flowchart of a second data processing method according to an embodiment of the present application.
Step 502, input image data preprocessing.
Step 504, image feature extraction and matching.
Step 506, fusing similar feature points.
And 508, map points are generated.
Step 510, map point correction and updating.
At step 512, map points are fused.
Since the above-described embodiment is substantially similar to the data processing method embodiments described above, its description is relatively simple; for relevant details, reference is made to the foregoing description of the data processing method embodiments.
It should be noted that, in the data processing method provided in the embodiment of the present application, the execution body may be a data processing apparatus, or a control module in the data processing apparatus for executing the data processing method. In the embodiment of the present application, a data processing device is described by taking a data processing method performed by the data processing device as an example.
Fig. 6 is a schematic block diagram of a data processing apparatus according to an embodiment of the present application.
As shown in fig. 6, the data processing apparatus includes:
An extracting module 601, configured to extract a plurality of first feature point sets from video data including a plurality of frame images, where each first feature point set includes a plurality of feature points corresponding to a same feature object;
The first merging module 602 is configured to obtain a descriptor of each feature point, and merge the plurality of first feature point sets according to the obtained descriptors to obtain a plurality of second feature point sets;
the first determining module 603 is configured to generate a corresponding map point according to each second feature point set, and determine at least one pair of map points to be judged from the generated plurality of map points, wherein the distance between the map points in each pair is smaller than a first distance threshold;
The second merging module 604 is configured to merge each map point to be determined according to the re-projection error corresponding to the map point to be determined, so as to obtain a corresponding merged map point.
Optionally, the first merging module 602 includes:
A calculating unit, configured to calculate a descriptor distance of any two feature points in each first feature point set according to the descriptor of each feature point;
The statistics unit is used for counting a descriptor distance set corresponding to each feature point according to the calculated descriptor distances of any two feature points, wherein the descriptor distance set comprises descriptor distances of other feature points belonging to the same first feature point set with the corresponding feature points;
The determining unit is used for determining a target descriptor corresponding to each first characteristic point set according to the descriptor distance set corresponding to each characteristic point;
And the merging unit is used for merging the plurality of first feature point sets according to the target descriptors corresponding to each first feature point set to obtain a plurality of second feature point sets.
Optionally, the determining unit is specifically configured to:
Carrying out average processing on the descriptor distances in the descriptor distance set corresponding to each feature point to obtain an average descriptor distance corresponding to each feature point;
and in each first characteristic point set, determining the descriptor of the characteristic point with the smallest corresponding average descriptor distance as the target descriptor corresponding to each first characteristic point set.
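The two steps above (averaging the descriptor distances, then taking the feature point with the smallest average) can be sketched as follows; using Euclidean distance is an assumption, since binary (e.g. ORB-style) descriptors would use Hamming distance instead:

```python
import numpy as np

def target_descriptor(descriptors):
    """For one first feature point set, return the descriptor whose average
    distance to all other descriptors in the set is the smallest; it serves
    as the target descriptor of the set."""
    descs = np.asarray(descriptors, dtype=float)
    n = len(descs)
    avg = [
        np.mean([np.linalg.norm(descs[i] - descs[j]) for j in range(n) if j != i])
        for i in range(n)
    ]
    return descs[int(np.argmin(avg))]
```

The selected descriptor is the most "central" one in the set, so comparing target descriptors of two sets approximates comparing the sets themselves at a fraction of the cost.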
Optionally, the plurality of second feature point sets includes at least one merged feature point set and at least one non-merged feature point set, and the merging unit is specifically configured to:
Determining at least one feature point set to be judged in the plurality of first feature point sets;
Calculating the descriptor distance of each feature point set to be judged according to the target descriptor corresponding to each first feature point set to obtain the target descriptor distance;
And combining each feature point set to be judged, of which the target descriptor distance is smaller than the second distance threshold value, so as to obtain a corresponding combined feature point set.
Optionally, the data processing apparatus further comprises:
the acquisition module is used for acquiring the pose of a target camera of the camera for acquiring video data;
the first determining module 603 is specifically configured to:
according to the coordinate information and the depth information of each feature point in each second feature point set, determining the coordinate information and the target depth information of an initial map point corresponding to each second feature point set;
And correcting the coordinate information of the initial map points according to the pose of the target camera and the target depth information to obtain target map points corresponding to each second characteristic point set.
Optionally, the second merging module 604 is specifically configured to:
In each map point to be judged, calculating a re-projection error of each feature point according to the depth information and the coordinate information of each feature point in the second feature point set corresponding to each map point to be judged, and obtaining a first re-projection error of each feature point;
Determining target feature points corresponding to each map point to be judged according to the median value of the first re-projection errors of each feature point;
according to the depth information and the coordinate information of the target feature points corresponding to each map point to be judged, calculating the re-projection error of each target feature point in a crossing manner to obtain a second re-projection error of each target feature point;
And combining each map point to be judged, of which the sum of the two second re-projection errors is smaller than a preset error threshold value, so as to obtain a corresponding combined map point.
Optionally, the data processing apparatus further comprises:
The second determining module is used for determining whether the number of the feature points corresponding to each combined map point is larger than a preset number threshold value;
If the number of feature points corresponding to a combined map point is larger than the preset number threshold, a calculation module is run, the calculation module being configured to determine the feature points corresponding to the combined map point as feature points to be screened, and to calculate a corresponding association score according to the first re-projection error, the average descriptor distance, and the number of times the corresponding feature object of each feature point to be screened is observed;
And the screening module is used for screening the feature points to be screened corresponding to the combined map points according to the association scores corresponding to the feature points to be screened, so as to obtain at least one representative feature point corresponding to the combined map points.
In the embodiment of the application, a plurality of first feature point sets are extracted from video data comprising multiple frames of images, each first feature point set comprising a plurality of feature points corresponding to the same feature object, each feature point being extracted from a corresponding frame of image. Descriptors of the feature points are obtained, and the first feature point sets are combined according to the obtained descriptors to obtain a plurality of second feature point sets. A corresponding map point is generated according to each second feature point set, and at least one pair of map points to be judged is determined among the generated map points, wherein the distance between the map points in each pair is smaller than a first distance threshold. Each pair of map points to be judged is then combined according to its corresponding re-projection error to obtain a corresponding combined map point. According to the embodiment of the application, when a feature object is observed many times, the first feature point sets can be combined into second feature point sets, which reduces the redundant data caused by identifying the same feature object as multiple first feature point sets; map points corresponding to the second feature point sets can be generated, and map points to be judged that are close to each other can be combined, which reduces the redundant data caused by identifying the same feature object as multiple map points. Data processing efficiency is thereby improved when processing data using environment information.
The data processing device in the embodiment of the application can be a device, or can be a component, an integrated circuit, or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), etc., and the non-mobile electronic device may be a server, a network attached storage (Network Attached Storage, NAS), a personal computer (personal computer, PC), a Television (TV), a teller machine, a self-service machine, etc., and the embodiments of the present application are not limited in particular.
The data processing device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, and the embodiment of the present application is not limited specifically.
The data processing device provided in the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 1 to 5, and in order to avoid repetition, a description is omitted here.
Fig. 7 is a schematic block diagram of an electronic device according to an embodiment of the present application. Optionally, as shown in fig. 7, the embodiment of the present application further provides an electronic device 700, including a processor 701, a memory 702, and a program or an instruction stored in the memory 702 and capable of running on the processor 701, where the program or the instruction implements each process of the above-mentioned embodiment of the data processing method when executed by the processor 701, and the process can achieve the same technical effects, and for avoiding repetition, a description is omitted herein.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
Fig. 8 is a schematic hardware structure of an electronic device according to an embodiment of the present application.
The electronic device 800 includes, but is not limited to, a radio frequency unit 801, a network module 802, an audio output unit 803, an input unit 804, a sensor 805, a display unit 806, a user input unit 807, an interface unit 808, a memory 809, and a processor 810.
Those skilled in the art will appreciate that the electronic device 800 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 810 by a power management system to perform functions such as managing charge, discharge, and power consumption by the power management system. The electronic device structure shown in fig. 8 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than shown, or may combine certain components, or may be arranged in different components, which are not described in detail herein.
The processor 810 is configured to extract a plurality of first feature point sets from video data including a plurality of frame images, wherein each of the first feature point sets includes a plurality of feature points corresponding to a same feature object, and each of the feature points is extracted from a corresponding frame image;
Acquiring descriptors of each feature point, and combining the plurality of first feature point sets according to the acquired descriptors to obtain a plurality of second feature point sets;
generating a corresponding map point according to each second characteristic point set, and determining at least one pair of map points to be judged in the generated plurality of map points, wherein the distance between each pair of map points to be judged is smaller than a first distance threshold value;
And combining each pair of map points to be judged according to the reprojection error corresponding to the map points to be judged, so as to obtain corresponding combined map points.
In the embodiment of the application, a plurality of first feature point sets are extracted from video data comprising multiple frames of images, each first feature point set comprising a plurality of feature points corresponding to the same feature object, each feature point being extracted from a corresponding frame of image. Descriptors of the feature points are obtained, and the first feature point sets are combined according to the obtained descriptors to obtain a plurality of second feature point sets. A corresponding map point is generated according to each second feature point set, and at least one pair of map points to be judged is determined among the generated map points, wherein the distance between the map points in each pair is smaller than a first distance threshold. Each pair of map points to be judged is then combined according to its corresponding re-projection error to obtain a corresponding combined map point. According to the embodiment of the application, when a feature object is observed many times, the first feature point sets can be combined into second feature point sets, which reduces the redundant data caused by identifying the same feature object as multiple first feature point sets; map points corresponding to the second feature point sets can be generated, and map points to be judged that are close to each other can be combined, which reduces the redundant data caused by identifying the same feature object as multiple map points. Data processing efficiency is thereby improved when processing data using environment information.
Optionally, the processor 810 is further configured to combine the plurality of first feature point sets according to the obtained descriptors to obtain a plurality of second feature point sets, which includes:
Calculating the descriptor distance of any two feature points in each first feature point set according to the descriptors of each feature point;
Counting a descriptor distance set corresponding to each feature point according to the calculated descriptor distances of any two feature points, wherein the descriptor distance set comprises descriptor distances of other feature points belonging to the same first feature point set with the corresponding feature points;
determining a target descriptor corresponding to each first feature point set according to the descriptor distance set corresponding to each feature point;
and combining the plurality of first feature point sets according to the target descriptors corresponding to the first feature point sets to obtain a plurality of second feature point sets.
Optionally, the processor 810 is further configured to determine, according to the descriptor distance set corresponding to each feature point, a target descriptor corresponding to each first feature point set, where the determining includes:
Carrying out average processing on the descriptor distances in the descriptor distance set corresponding to each feature point to obtain an average descriptor distance corresponding to each feature point;
and in each first characteristic point set, determining the descriptor of the characteristic point with the smallest corresponding average descriptor distance as the target descriptor corresponding to each first characteristic point set.
Optionally, the plurality of second feature point sets includes at least one merged feature point set and at least one non-merged feature point set, and the processor 810 is further configured to merge the plurality of first feature point sets according to the target descriptor corresponding to each of the first feature point sets to obtain a plurality of second feature point sets, where the method includes:
Determining at least one feature point set to be judged in the plurality of first feature point sets;
Calculating the descriptor distance of each feature point set to be judged according to the target descriptor corresponding to each first feature point set to obtain the target descriptor distance;
And combining each feature point set to be judged, of which the target descriptor distance is smaller than the second distance threshold value, so as to obtain a corresponding combined feature point set.
Optionally, the processor 810 is further configured to:
Acquiring the target camera pose of the camera for acquiring video data before generating a corresponding map point according to each second characteristic point set;
Generating a corresponding map point according to each second characteristic point set, including:
according to the coordinate information and the depth information of each feature point in each second feature point set, determining the coordinate information and the target depth information of an initial map point corresponding to each second feature point set;
And correcting the coordinate information of the initial map points according to the pose of the target camera and the target depth information to obtain target map points corresponding to each second characteristic point set.
Optionally, the processor 810 is further configured to combine each map point to be judged according to the re-projection error corresponding to the map point to be judged, to obtain a corresponding combined map point, where the combining includes:
In each map point to be judged, calculating a re-projection error of each feature point according to the depth information and the coordinate information of each feature point in the second feature point set corresponding to each map point to be judged, and obtaining a first re-projection error of each feature point;
Determining target feature points corresponding to each map point to be judged according to the median value of the first re-projection errors of each feature point;
according to the depth information and the coordinate information of the target feature points corresponding to each map point to be judged, calculating the re-projection error of each target feature point in a crossing manner to obtain a second re-projection error of each target feature point;
And combining each map point to be judged, of which the sum of the two second re-projection errors is smaller than a preset error threshold value, so as to obtain a corresponding combined map point.
Optionally, the processor 810 is further configured to combine each map point to be judged according to the re-projection error corresponding to the map point to be judged, where the method further includes:
determining whether the number of the feature points corresponding to each combined map point is larger than a preset number threshold;
If the number of the feature points corresponding to each combined map point is larger than a preset number threshold, determining the feature points corresponding to the combined map points as feature points to be screened, and calculating corresponding association scores according to the first re-projection error, the average descriptor distance and the observed times of the corresponding feature objects of each feature point to be screened;
And screening the feature points to be screened corresponding to the combined map points according to the association scores corresponding to the feature points to be screened, so as to obtain at least one representative feature point corresponding to the combined map points.
According to the embodiment of the application, the first characteristic point sets can be combined, the map points to be judged are combined, redundant data is reduced from the environment information acquired by the camera, the data processing efficiency is improved, and the number of the characteristic points corresponding to each map point can be reduced.
It should be appreciated that in embodiments of the present application, the input unit 804 may include a graphics processor (Graphics Processing Unit, GPU) 8041 and a microphone 8042, with the graphics processor 8041 processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 806 may include a display panel 8061, and the display panel 8061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 807 includes a touch panel 8071 and other input devices 8072. Touch panel 8071, also referred to as a touch screen. The touch panel 8071 may include two parts, a touch detection device and a touch controller. Other input devices 8072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein. The memory 809 may be used to store software programs as well as various data including, but not limited to, application programs and an operating system. The processor 810 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 810.
The embodiments of the present application also provide a readable storage medium on which a program or instructions are stored. When executed by a processor, the program or instructions implement the processes of the data processing method embodiments described above and can achieve the same technical effects; to avoid repetition, the details are not described here again.
The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media such as a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
The embodiments of the present application further provide a chip, which includes a processor and a communication interface, the communication interface being coupled to the processor. The processor is configured to run programs or instructions to implement the processes of the data processing method embodiments described above and can achieve the same technical effects; to avoid repetition, the details are not described here again.
It should be understood that the chip referred to in the embodiments of the present application may also be called a system-level chip, a chip system, or a system-on-chip, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises that element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in the reverse order, depending on the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or alternatively by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a computer software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and including instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Those of ordinary skill in the art, under the teaching of the present application, may make many other forms without departing from the spirit of the present application and the scope of protection of the claims, all of which fall within the protection of the present application.
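To make the map-point generation step of the embodiments concrete, the following sketch back-projects each feature point's pixel coordinates and depth into 3D and averages the results. The pinhole intrinsics (fx, fy, cx, cy) are invented defaults, and the pose-based correction described in the embodiments is omitted; treat this as an illustration of the underlying geometry only, not the claimed method.

```python
# Hypothetical sketch: form an initial map point from a second feature
# point set using pixel coordinates and depth. Intrinsics are assumptions.
def back_project(u, v, depth, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with the given depth into camera
    coordinates under a simple pinhole model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def initial_map_point(features, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Average the back-projected positions of the feature points in the
    set to form the initial map point; the patent additionally corrects
    this estimate with the target camera pose and target depth."""
    pts = [back_project(f["u"], f["v"], f["depth"], fx, fy, cx, cy)
           for f in features]
    n = len(pts)
    return tuple(sum(c) / n for c in zip(*pts))
```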

Claims (12)

1. A data processing method, comprising:
extracting a plurality of first feature point sets from video data comprising a plurality of frames of images, wherein each of the first feature point sets comprises a plurality of feature points corresponding to a same feature object, and each of the feature points is extracted from a corresponding frame of image;
obtaining a descriptor of each of the feature points, and merging the plurality of first feature point sets according to the obtained descriptors to obtain a plurality of second feature point sets;
generating a corresponding map point according to each of the second feature point sets, and determining at least one pair of map points to be determined among the generated map points, wherein a distance between each pair of the map points to be determined is less than a first distance threshold; and
merging each pair of the map points to be determined according to reprojection errors corresponding to each pair of the map points to be determined, to obtain a corresponding merged map point;
wherein the merging the plurality of first feature point sets according to the obtained descriptors to obtain a plurality of second feature point sets comprises:
calculating, according to the descriptor of each of the feature points, a descriptor distance between any two feature points in each of the first feature point sets;
counting, according to the calculated descriptor distances between any two feature points, a descriptor distance set corresponding to each of the feature points, wherein the descriptor distance set comprises descriptor distances of other feature points belonging to the same first feature point set as the corresponding feature point;
determining, according to the descriptor distance set corresponding to each of the feature points, a target descriptor corresponding to each of the first feature point sets; and
merging the plurality of first feature point sets according to the target descriptor corresponding to each of the first feature point sets, to obtain the plurality of second feature point sets.

2. The method according to claim 1, wherein the determining, according to the descriptor distance set corresponding to each of the feature points, a target descriptor corresponding to each of the first feature point sets comprises:
averaging the descriptor distances in the descriptor distance set corresponding to each of the feature points, to obtain an average descriptor distance corresponding to each of the feature points; and
in each of the first feature point sets, determining the descriptor of the feature point with the smallest average descriptor distance as the target descriptor corresponding to the first feature point set.

3. The method according to claim 1, wherein the plurality of second feature point sets comprise at least one merged feature point set and at least one non-merged feature point set, and the merging the plurality of first feature point sets according to the target descriptor corresponding to each of the first feature point sets to obtain the plurality of second feature point sets comprises:
determining at least one pair of feature point sets to be determined among the plurality of first feature point sets;
calculating, according to the target descriptor corresponding to each of the first feature point sets, a descriptor distance of each pair of feature point sets to be determined, to obtain a target descriptor distance; and
merging each pair of feature point sets to be determined whose target descriptor distance is less than a second distance threshold, to obtain a corresponding merged feature point set.

4. The method according to claim 1, wherein before the generating a corresponding map point according to each of the second feature point sets, the method further comprises:
obtaining a target camera pose of a camera that collects the video data;
and the generating a corresponding map point according to each of the second feature point sets comprises:
determining, according to coordinate information and depth information of each of the feature points in each of the second feature point sets, coordinate information and target depth information of an initial map point corresponding to each of the second feature point sets; and
correcting the coordinate information of the initial map point according to the target camera pose and the target depth information, to obtain a target map point corresponding to each of the second feature point sets.

5. The method according to claim 1, wherein the merging each pair of the map points to be determined according to the reprojection errors corresponding to each pair of the map points to be determined, to obtain a corresponding merged map point, comprises:
in each pair of the map points to be determined, calculating a reprojection error of each of the feature points according to depth information and coordinate information of each of the feature points in the second feature point set corresponding to each of the map points to be determined, to obtain a first reprojection error of each of the feature points;
determining, according to a median of the first reprojection errors of the feature points, a target feature point corresponding to each of the map points to be determined;
cross-calculating a reprojection error of each of the target feature points according to depth information and coordinate information of the target feature point corresponding to each of the map points to be determined, to obtain a second reprojection error of each of the target feature points; and
merging each pair of the map points to be determined for which a sum of the two second reprojection errors is less than a preset error threshold, to obtain a corresponding merged map point.

6. The method according to claim 1, wherein after the merging each pair of the map points to be determined according to the reprojection errors corresponding to each pair of the map points to be determined, to obtain a corresponding merged map point, the method further comprises:
determining whether a number of feature points corresponding to each of the merged map points is greater than a preset number threshold;
if the number of feature points corresponding to a merged map point is greater than the preset number threshold, determining the feature points corresponding to the merged map point as feature points to be screened, and calculating a corresponding association score according to the first reprojection error and the average descriptor distance of each of the feature points to be screened and a number of times the corresponding feature object has been observed; and
screening the feature points to be screened corresponding to the merged map point according to the association score corresponding to each of the feature points to be screened, to obtain at least one representative feature point corresponding to the merged map point.

7. A data processing apparatus, comprising:
an extraction module, configured to extract a plurality of first feature point sets from video data comprising a plurality of frames of images, wherein each of the first feature point sets comprises a plurality of feature points corresponding to a same feature object, and each of the feature points is extracted from a corresponding frame of image;
a first merging module, configured to obtain a descriptor of each of the feature points, and merge the plurality of first feature point sets according to the obtained descriptors to obtain a plurality of second feature point sets;
a first determination module, configured to generate a corresponding map point according to each of the second feature point sets, and determine at least one pair of map points to be determined among the generated map points, wherein a distance between each pair of the map points to be determined is less than a first distance threshold; and
a second merging module, configured to merge each pair of the map points to be determined according to reprojection errors corresponding to each pair of the map points to be determined, to obtain a corresponding merged map point;
wherein the first merging module comprises:
a calculation unit, configured to calculate, according to the descriptor of each of the feature points, a descriptor distance between any two feature points in each of the first feature point sets;
a statistical unit, configured to count, according to the calculated descriptor distances between any two feature points, a descriptor distance set corresponding to each of the feature points, wherein the descriptor distance set comprises descriptor distances of other feature points belonging to the same first feature point set as the corresponding feature point;
a determination unit, configured to determine, according to the descriptor distance set corresponding to each of the feature points, a target descriptor corresponding to each of the first feature point sets; and
a merging unit, configured to merge the plurality of first feature point sets according to the target descriptor corresponding to each of the first feature point sets, to obtain the plurality of second feature point sets.

8. The apparatus according to claim 7, wherein the determination unit is specifically configured to:
average the descriptor distances in the descriptor distance set corresponding to each of the feature points, to obtain an average descriptor distance corresponding to each of the feature points; and
in each of the first feature point sets, determine the descriptor of the feature point with the smallest average descriptor distance as the target descriptor corresponding to the first feature point set.

9. The apparatus according to claim 7, wherein the plurality of second feature point sets comprise at least one merged feature point set and at least one non-merged feature point set, and the merging unit is specifically configured to:
determine at least one pair of feature point sets to be determined among the plurality of first feature point sets;
calculate, according to the target descriptor corresponding to each of the first feature point sets, a descriptor distance of each pair of feature point sets to be determined, to obtain a target descriptor distance; and
merge each pair of feature point sets to be determined whose target descriptor distance is less than a second distance threshold, to obtain a corresponding merged feature point set.

10. The apparatus according to claim 7, further comprising:
an obtaining module, configured to obtain a target camera pose of a camera that collects the video data;
wherein the first determination module is specifically configured to:
determine, according to coordinate information and depth information of each of the feature points in each of the second feature point sets, coordinate information and target depth information of an initial map point corresponding to each of the second feature point sets; and
correct the coordinate information of the initial map point according to the target camera pose and the target depth information, to obtain a target map point corresponding to each of the second feature point sets.

11. The apparatus according to claim 7, wherein the second merging module is specifically configured to:
in each pair of the map points to be determined, calculate a reprojection error of each of the feature points according to depth information and coordinate information of each of the feature points in the second feature point set corresponding to each of the map points to be determined, to obtain a first reprojection error of each of the feature points;
determine, according to a median of the first reprojection errors of the feature points, a target feature point corresponding to each of the map points to be determined;
cross-calculate a reprojection error of each of the target feature points according to depth information and coordinate information of the target feature point corresponding to each of the map points to be determined, to obtain a second reprojection error of each of the target feature points; and
merge each pair of the map points to be determined for which a sum of the two second reprojection errors is less than a preset error threshold, to obtain a corresponding merged map point.

12. The apparatus according to claim 7, further comprising:
a second determination module, configured to determine whether a number of feature points corresponding to each of the merged map points is greater than a preset number threshold;
a calculation module, run if the number of feature points corresponding to a merged map point is greater than the preset number threshold, and configured to determine the feature points corresponding to the merged map point as feature points to be screened, and calculate a corresponding association score according to the first reprojection error and the average descriptor distance of each of the feature points to be screened and a number of times the corresponding feature object has been observed; and
a screening module, configured to screen the feature points to be screened corresponding to the merged map point according to the association score corresponding to each of the feature points to be screened, to obtain at least one representative feature point corresponding to the merged map point.
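A compact sketch of the descriptor-based merging in claims 1-3 might look as follows. The Euclidean distance between descriptors, tuple-based descriptor representation, and the greedy pairing order are assumptions made for illustration; the claims do not fix a particular distance metric or pairing strategy.

```python
import math

def target_descriptor(descriptors):
    # Claims 1-2: the target descriptor of a set is the descriptor with
    # the smallest average distance to the others in the same set.
    best_i, best_avg = 0, float("inf")
    for i, a in enumerate(descriptors):
        others = [math.dist(a, b) for j, b in enumerate(descriptors) if j != i]
        avg = sum(others) / max(len(others), 1)
        if avg < best_avg:
            best_i, best_avg = i, avg
    return descriptors[best_i]

def merge_sets(sets, dist_thresh):
    # Claim 3 (greedy variant): fuse two first feature point sets whenever
    # their target descriptors are closer than the second distance threshold.
    targets = [target_descriptor(s) for s in sets]
    merged, used = [], set()
    for i in range(len(sets)):
        if i in used:
            continue
        group = list(sets[i])
        for j in range(i + 1, len(sets)):
            if j not in used and math.dist(targets[i], targets[j]) < dist_thresh:
                group += sets[j]
                used.add(j)
        merged.append(group)
    return merged
```

Choosing the minimum-average-distance descriptor (a medoid) keeps the representative inside the actual data, which matters for binary descriptors where an arithmetic mean would be meaningless.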
CN202210192315.5A 2022-02-28 2022-02-28 Data processing method and device Active CN114565777B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210192315.5A CN114565777B (en) 2022-02-28 2022-02-28 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210192315.5A CN114565777B (en) 2022-02-28 2022-02-28 Data processing method and device

Publications (2)

Publication Number Publication Date
CN114565777A (en) 2022-05-31
CN114565777B (en) 2025-06-24

Family

ID=81715475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210192315.5A Active CN114565777B (en) 2022-02-28 2022-02-28 Data processing method and device

Country Status (1)

Country Link
CN (1) CN114565777B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576494A (en) * 2022-08-08 2024-02-20 腾讯科技(深圳)有限公司 Feature map generation method, device, storage medium and computer equipment

Citations (2)

Publication number Priority date Publication date Assignee Title
CN104036494A (en) * 2014-05-21 2014-09-10 浙江大学 Fast matching computation method used for fruit picture
CN106529538A (en) * 2016-11-24 2017-03-22 腾讯科技(深圳)有限公司 Method and device for positioning aircraft

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN111260779B (en) * 2018-11-30 2022-12-27 华为技术有限公司 Map construction method, device and system and storage medium
CN111415387B (en) * 2019-01-04 2023-12-29 南京人工智能高等研究院有限公司 Camera pose determining method and device, electronic equipment and storage medium
CN111795704B (en) * 2020-06-30 2022-06-03 杭州海康机器人技术有限公司 Method and device for constructing visual point cloud map

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN104036494A (en) * 2014-05-21 2014-09-10 浙江大学 Fast matching computation method used for fruit picture
CN106529538A (en) * 2016-11-24 2017-03-22 腾讯科技(深圳)有限公司 Method and device for positioning aircraft

Also Published As

Publication number Publication date
CN114565777A (en) 2022-05-31

Similar Documents

Publication Publication Date Title
US11605214B2 (en) Method, device and storage medium for determining camera posture information
CN112150551B (en) Method, device and electronic device for acquiring object pose
US20220383535A1 (en) Object Tracking Method and Device, Electronic Device, and Computer-Readable Storage Medium
US10645364B2 (en) Dynamic calibration of multi-camera systems using multiple multi-view image frames
CN112528831A (en) Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
WO2021136386A1 (en) Data processing method, terminal, and server
CN112561973A (en) Method and device for training image registration model and electronic equipment
CN113838151B (en) Camera calibration method, device, equipment and medium
CN111462179B (en) Three-dimensional object tracking method and device and electronic equipment
CN112291473B (en) Focusing method, device and electronic device
US11682227B2 (en) Body and hand association method and apparatus, device, and storage medium
CN113628259B (en) Image registration processing method and device
CN112818874B (en) Image processing method, device, equipment and storage medium
CN114565777B (en) Data processing method and device
CN113592922B (en) Image registration processing method and device
CN119048675A (en) Point cloud construction method and device, electronic equipment and readable storage medium
CN112532884B (en) Identification method and device and electronic equipment
CN114241127A (en) Panoramic image generation method and device, electronic equipment and medium
CN113660420B (en) Video frame processing method and video frame processing device
CN116433767A (en) Target object detection method, device, electronic device and storage medium
CN119832166B (en) Panorama reconstruction method based on 3DGS, electronic equipment and storage medium
CN113706553A (en) Image processing method and device and electronic equipment
CN113516684B (en) Image processing method, device, equipment and storage medium
CN116342992B (en) Image processing method and electronic device
CN120259445B (en) Coarse-to-fine camera and laser radar space calibration method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant