Disclosure of Invention
      In order to solve the problems in the prior art, the present application provides a three-dimensional terrain modeling method by an embodiment, the method comprising the steps of:
       s1, acquiring aerial photos of a plurality of untagged ground control points in a region to be modeled; 
       S2, generating a first sparse point cloud set and a rough three-dimensional terrain model of a region to be modeled based on aerial photographs of unlabeled ground control points; 
       S3, automatically marking and associating ground control points on each aerial photo based on the rough three-dimensional terrain model; 
       s4, generating a second sparse point cloud set of the region to be modeled based on aerial photographs of all marked ground control points; 
       And S5, generating a dense point cloud set and a fine three-dimensional terrain model of the region to be modeled based on the second sparse point cloud set. 
      Compared with the prior art, the three-dimensional terrain modeling method provided by the application has the advantages that the operation of generating sparse point clouds once by using aerial photos of unlabeled ground control points is added in the conventional three-dimensional terrain modeling step based on aerial photogrammetry, a rough three-dimensional terrain model is built by using the operation, then the automatic labeling and association of the ground control points are performed by using the rough three-dimensional terrain model, through the steps, the association relation of the same ground control points on different aerial photos can be automatically identified without manual operation, the marking precision of the ground control points is improved, and meanwhile, the marking speed is effectively improved, so that the efficiency of high-precision three-dimensional terrain modeling is remarkably improved as a whole.
      Further, step S2 includes the steps of:
       S21, identifying a plurality of characteristic points in each aerial photo; 
       s22, matching characteristic points in different aerial photos; 
       S23, acquiring a first estimated value of a camera azimuth element of each aerial photo and a first estimated value of a three-dimensional coordinate of each feature point based on unconstrained beam method adjustment; 
       S24, constructing a first sparse point cloud set based on first estimated values of three-dimensional coordinates of each feature point; 
       s25, establishing a rough three-dimensional terrain model of the region to be modeled based on the first sparse point cloud set. 
      Further, step S3 includes the steps of:
       s31, selecting a ground control point as a target ground control point; 
       s32, selecting a reference picture and a plurality of evaluation pictures corresponding to the target ground control point; 
       s33, determining at least one marking point from an image corresponding to the target ground control point on the reference picture; 
       s34, projecting the marking points to the surface of the rough three-dimensional terrain model and intercepting a first projection surface element; 
       s35, searching to obtain the optimal position and the optimal orientation of the first projection surface element; 
       S36, projecting the image in a reference window on the reference picture to a first projection surface element with the optimal position and the optimal orientation to obtain a second projection surface element; 
       and S37, searching and determining the corresponding imaging position of the target ground control point on each picture to be marked based on the expected image obtained by re-projecting the second projection surface element to each picture to be marked. 
      Preferably, the similarity between the image formed by the target ground control point on the reference picture and the forward projection image is larger than a preset first similarity threshold value, or the deviation is smaller than a preset first deviation threshold value.
      Preferably, the similarity between the image formed by the target ground control point on the evaluation picture and the forward projection image or the image formed by the target ground control point on the reference picture is larger than a preset second similarity threshold value, or the deviation is smaller than a preset second deviation threshold value, wherein the first similarity threshold value is larger than or equal to the second similarity threshold value, and the first deviation threshold value is smaller than or equal to the second deviation threshold value.
      Further, step S34 includes the steps of:
       Projecting a mark point to the surface of the rough three-dimensional terrain model along a reference projection line corresponding to the reference picture, and obtaining a projection point of the mark point on the surface of the rough three-dimensional terrain, wherein the reference projection line corresponding to the reference picture is determined based on a camera azimuth element of the reference picture. 
      Further, step S35 includes the steps of:
       s351, re-projecting the first projection surface element to each evaluation picture to obtain a re-projection window; 
       S352, determining cross-correlation characteristics of images in the re-projection windows of all the evaluation pictures; 
       s353, judging whether the cross-correlation feature meets the correlation criterion, if not, executing step S354 and returning to step S351, and if so, executing step S355; 
       s354, translating or rotating the first projection surface element along a reference projection line; 
       S355, the current position and orientation of the first projection surface element are used as the optimal position and optimal orientation of the first projection surface element. 
      Further, step S37 includes cyclically performing the following steps until the processing of all aerial photos that need to be marked is completed:
       S371, selecting one aerial photo of the region to be modeled as a picture to be marked; 
       s372, re-projecting the second projection surface element to the picture to be marked to obtain a desired image; 
       s373, extracting edges of the expected image to construct a matching window; 
       And S374, searching the image to be marked for the optimal position of the matching window, so that the target ground control point is marked on the image to be marked. 
      Preferably, in the process of searching for the optimal position of the matching window on the image to be marked, the matching window only performs translation operation on the image to be marked.
      Further, step S4 includes the steps of:
       s41, identifying a plurality of characteristic points in aerial pictures of all marked ground control points and performing characteristic point matching; 
       S42, obtaining a second estimated value of a camera azimuth element of each aerial photo and a second estimated value of a three-dimensional coordinate of each characteristic point through beam method adjustment with constraint conditions, wherein the constraint conditions comprise position constraint conditions of marked ground control points; 
       s43, constructing a second sparse point cloud set based on second estimated values of three-dimensional coordinates of the feature points. 
      The application also provides a three-dimensional terrain modeling system, which comprises a processor and a readable storage medium, wherein the readable storage medium stores an executable program, and the executable program can realize the three-dimensional terrain modeling method when being executed by the processor.
    
    
      Detailed Description
      The present application will be further described below based on preferred embodiments with reference to the accompanying drawings.
      In addition, various components on the drawings are enlarged or reduced for ease of understanding, but this is not intended to limit the scope of the application.
      In the description of the embodiments of the present application, it should be noted that, if the terms "upper," "lower," "inner," "outer," and the like indicate an azimuth or a positional relationship based on that shown in the drawings, or an azimuth or a positional relationship that a product of the embodiments of the present application conventionally put in use, it is merely for convenience of describing the present application and simplifying the description, and does not indicate or imply that the device or element to be referred to must have a specific azimuth, be configured and operated in a specific azimuth, and thus should not be construed as limiting the present application. Furthermore, in the description of the present application, terms first, second, etc. are used herein for distinguishing between different elements, but not limited to the order of manufacture, and should not be construed as indicating or implying any relative importance, as such may be different in terms of its detailed description and claims.
      The terminology used in the description presented herein is for the purpose of describing embodiments of the application and is not intended to be limiting of the application. It should also be noted that unless explicitly stated or limited otherwise, the terms "disposed," "connected," and "connected" should be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected, mechanically connected, directly connected, indirectly connected via an intermediate medium, or communicating between two elements. The specific meaning of the above terms in the present application will be specifically understood by those skilled in the art.
      Fig. 1 shows a conventional three-dimensional terrain modeling method based on aerial photogrammetry, which comprises the steps of firstly obtaining an aerial photo set formed by a plurality of aerial photos meeting a certain overlapping rate in a region to be modeled in an aerial photo mode, then marking photographed Ground Control Points (GCP) on the aerial photo, further obtaining sparse point clouds of the region to be modeled by matching the aerial photos and performing aerial triangulation on the aerial photo, finally obtaining dense point clouds of the region to be modeled by a densification algorithm, and generating a three-dimensional terrain digital model of the region to be modeled by utilizing the dense point clouds. In the three-dimensional terrain modeling process, ground control points are used as basic data, geometric constraint is provided for iterative optimization of point cloud coordinates in an aerial triangulation stage, and obviously, whether the positions of all the ground control points can be accurately calibrated on aerial photos or not can be obviously judged, and the same ground control points in different photos are accurately associated, so that measurement and modeling results are obviously influenced.
      At present, the mainstream mode of manually marking ground control points needs to manually click each ground control point on each aerial photo and establish the association of the same ground control point among different aerial photos, and the operation not only consumes a great deal of time, but also obviously reduces the modeling speed.
      In addition, there are also algorithms for automatic recognition of ground control points by texture information on aerial pictures, however, the applicant has found that the above-mentioned automatic recognition algorithm based on ground control point image texture features has the following problems:
       Firstly, the same ground control point may have extremely large shape deformation and imaging area difference in aerial photos with different view angles, for example, an image of a ground control point which is square on one Zhang Hang photo (forward shooting) and may be compressed into an extremely fine rectangle on the other Zhang Hang photo (the shooting view angle is almost perpendicular to the normal direction of the ground control point), and the situation that the difference is extremely large cannot be solved by calculating the correlation or relying on a depth network due to the fact that the texture information of the two is extremely large; 
       secondly, when a ground control point is imaged on the edge of a certain aerial photo, the situation that the photo only comprises part of the image of the ground control point may occur, and the conventional image recognition mode is used at this time, because of lack of complete image information, the image is likely to be correctly recognized; 
       More importantly, the image of one ground control point on a certain aerial photograph may be mistakenly associated with the image of another ground control point in other aerial photographs for some reasons, and this problem is caused by the fact that when the ground control points are marked by adopting the existing modeling method shown in fig. 1, camera azimuth information of each aerial photograph is not limited, so that matching can only be performed based on texture information, in which case, image distortion caused by perspective deformation or due to the topography factors of the photographed region may cause matching errors, for example, there are some buildings which are relatively similar and present periodic arrangement, so that a recognition algorithm recognizes that spatial positions are different, but two different ground control points which may present similar texture features on aerial photographs photographed at two different positions are recognized as the same. 
      Therefore, if the matching relationship between different aerial photos is not established when the ground feature points are automatically identified, even if the images of the ground feature points can be extracted from each aerial photo through texture features, the same ground feature points on different aerial photos cannot be associated, and the association between the ground control points on each aerial photo is generally required to be manually specified, otherwise, different ground control points on two or more aerial photos may be erroneously identified as the same ground control point, so that adverse effects on the subsequent aerial three measurement results are caused.
      The root of the problem is that the existing automatic recognition algorithm lacks necessary pose information when recognizing the ground control points, but the problem can not be solved fundamentally obviously only by optimizing the feature extraction algorithm or the deep learning network structure or increasing the training degree of the deep learning network.
      In order to solve the problems in the prior art, the present application improves the labeling and association manners of the ground control points in the existing modeling flow, so as to provide a new three-dimensional terrain modeling method, and fig. 2 is a flowchart of the three-dimensional terrain modeling method provided according to an embodiment of the present application, as shown in fig. 2, and the method includes the following steps:
       s1, acquiring aerial photos of a plurality of untagged ground control points in a region to be modeled; 
       S2, generating a first sparse point cloud set and a rough three-dimensional terrain model of a region to be modeled based on aerial photographs of unlabeled ground control points; 
       s3, automatically marking and associating each ground control point on each aerial photo based on the rough three-dimensional terrain model; 
       s4, generating a second sparse point cloud set of the region to be modeled based on aerial photographs of all marked ground control points; 
       And S5, generating a dense point cloud set and a fine three-dimensional terrain model of the region to be modeled based on the second sparse point cloud set. 
      Compared with the prior art, as can be seen from comparing fig. 1 and fig. 2, in the three-dimensional terrain modeling method provided by the application, firstly, in step S2, the operation of generating sparse point cloud is performed once by using aerial photographs of unlabeled Ground Control Points (GCPs), and a rough three-dimensional terrain model is established by using the operation, then, in step S3, automatic marking and association of ground control points are performed by using the rough three-dimensional terrain model, through the step, automatic identification can be performed without manual marking and association one by one, and association relations of the same ground control points on different aerial photographs are accurately established, so that accurate constraint is provided for subsequent aerial photograph registration and second sparse point cloud generation operation by air three-measurement.
      The following describes the above steps in detail with reference to the drawings.
      < Acquisition of aerial photograph of region to be modeled >
      Step S1 is used for acquiring an initial aerial photograph of the region to be modeled, namely, an aerial photograph which is not marked by a ground control point.
      Fig. 3A illustrates, in a specific embodiment, several original aerial photographs of an area to be modeled from an aerial photograph collection of a building as a designated area disclosed by International Society for Photogrammetry and Remote Sensing (ISPRS) (Dort-ZECHE dataset).
      Fig. 3B shows one of the aerial photographs in an enlarged manner, as shown by the red circles in fig. 3B, and there may be 0,1, 2 or more images of ground control points in the aerial photographs.
      In the process of aerial photogrammetry and three-dimensional terrain modeling, the ground control point can provide effective geometric constraint by utilizing accurate position information, and in general, the ground control point can be manufactured by using a plate or soft cloth, the shape and the size of the ground control point are determined according to aerial photographing height, photographing resolution and the like (for example, a square with the specification of 50cm multiplied by 50cm or 1m multiplied by 1m and the like) and patterns with obvious texture characteristics and high contrast are printed or drawn on the surface.
      Furthermore, before aerial photographing of the region to be modeled is performed, a plurality of ground control points can be placed at the ground, a building or a natural object and the like of the region to be modeled according to a formulated aerial photographing plan, and the spatial position of each ground control point can be measured with high precision by using a total station, a GPS and the like.
      After the setting is finished, aerial photographing of the region to be modeled can be carried out according to a preset aerial photographing plan, the aerial photographing obtained image data can be a plurality of aerial photographing pictures or aerial photographing videos which are continuously photographed, the aerial photographing images generally need to be converted according to the requirement of subsequent data processing, for example, the size of the unified pictures is unified, or the continuous videos are converted into the form of a plurality of aerial photographing pictures, the number of the aerial photographing pictures is generally determined according to factors such as the scale of the region to be modeled, the area covered by each aerial photographing picture, the overlapping degree of adjacent aerial photographing pictures and the like, and the overlapping degree of each continuous aerial photographing picture is generally set to be 60-80%, so that the reliability of subsequent feature matching is ensured.
      < Creation of a rough three-dimensional terrain model of the region to be modeled >
      Step S2 is used for establishing a rough three-dimensional terrain model of the region to be modeled, and the rough three-dimensional terrain model is used for providing indexes and constraints for imaging positions, shapes and texture features of all ground control points on different aerial photos in step S2, so that the same ground control points can be identified and associated on different aerial photos without manually associating one by one.
      Fig. 4 shows a specific implementation of step S2, and as shown in fig. 4, the rough three-dimensional terrain model may be established by:
       Step S21, identifying a plurality of characteristic points in each aerial photo; 
       step S22, matching characteristic points in different aerial photos; 
       Step S23, obtaining a first estimated value of a camera azimuth element of each aerial photo and a first estimated value of a three-dimensional coordinate of each feature point based on unconstrained beam method adjustment; 
       step S24, constructing a first sparse point cloud set based on first estimated values of three-dimensional coordinates of each feature point; 
       and step S25, establishing a rough three-dimensional terrain model of the region to be modeled based on the first sparse point cloud set. 
      Specifically, feature points may be first extracted from each aerial image in step S21 by a feature detection algorithm (such as SIFT, SURF, ORB, etc.), and then feature point matching of different aerial images may be performed in step S22, which may match the same feature points in aerial images of different perspectives using a method such as nearest neighbor search.
      After the feature points of the aerial photo are matched, in step S23, a beam method adjustment operation may be performed to estimate the spatial positions of the feature points, which is important to note that, in the method provided by the present application, since the ground control points are not marked and associated on the aerial photo before executing step S23, the beam method free net adjustment that is not constrained by using the position coordinates is adopted in the step.
      Unconstrained beam method adjustment refers to a method for estimating three-dimensional coordinates of camera azimuth elements (including internal azimuth elements such as focal length, principal point, distortion coefficient, and the like of a camera, and external azimuth elements such as position, orientation, and the like of the camera) and each feature point of an aerial photo without depending on externally known control points.
      Fig. 5 and 6 illustrate the principle and specific flow of iterative estimation of three-dimensional coordinates of camera orientation elements and feature points using unconstrained beam-leveling adjustment, in some embodiments, respectively. As shown in fig. 6, in step S231, the internal azimuth element and the external azimuth element of the camera may be initially estimated through the description document of the camera parameter specification and the flight position and attitude record of the aerial vehicle during aerial photography, and then the three-dimensional coordinates of each feature point may be calculated according to the camera azimuth element and the feature point matching result of each aerial photograph through the triangulation method (step S232).
      Next, in step S233, a first objective function value is calculated using the three-dimensional coordinates of each feature point and the camera orientation element of each aerial photo (generally, the objective function of the optical adjustment is a sum-of-square function with a robust mapping error, that is, the error is minimized after each feature point is mapped to the aerial photo), and in step S234, it is determined whether the function value converges or the iteration number reaches a preset upper limit, if the determination result is no, the camera orientation element is re-estimated and the three-dimensional coordinates of each feature point are re-calculated in step S235, and then the first objective function value is re-calculated, and if the determination result is yes, the first estimated value of the camera orientation element of each aerial photo and the first estimated value of the three-dimensional coordinates of each feature point are determined based on the current iteration result (step S236).
      After the first estimated value of the three-dimensional coordinates of each feature point is obtained, each feature point can be used as a point cloud point in step S24, the position of each feature point in the three-dimensional space is set according to the first estimated value, so as to obtain a first sparse point cloud set, then in step S25, each feature point in the sparse point cloud set is used as a grid point, a coarse three-dimensional terrain model of the region to be modeled is constructed by using an algorithm such as Delaunay triangular segmentation, and generally, the coarse three-dimensional terrain model constructed in the step is composed of a plurality of small patches, and the vertex of each patch is the feature point.
      In some preferred embodiments, the camera azimuth elements of each aerial photo obtained through the steps, the sparse point cloud set of the region to be modeled and the rough three-dimensional terrain model are all stored in a database for later steps to be used and invoked in the implementation process.
      < Automatic marking and associating individual ground control points >
      It should be noted that, in the conventional three-dimensional terrain modeling method, since the spatial resolution of the sparse point cloud formed by the feature points is too low, the three-dimensional model generated by the sparse point cloud is rough, so that the three-dimensional grid model is generally not generated directly by the sparse point cloud, but the dense point cloud is used for three-dimensional terrain reconstruction after the sparse point cloud is dense, however, in the embodiment of the application, the rough three-dimensional terrain model has a surface reflecting the rough three-dimensional form of the terrain of the area to be modeled, that is, the pose of each patch on the surface can approximately represent the orientation and the spatial coordinate of the area to be modeled at the place, so that the three-dimensional terrain model can be regarded as a reference for searching for the ground control point on each aerial image and establishing the relevance of the same ground control point on each aerial image, and thus the automatic marking and relevance of the ground control point are performed by means of the rough three-dimensional terrain model in step S3.
      Fig. 7 shows the implementation of step S3 in some embodiments, and as shown in fig. 7, step S3 further includes the following steps:
       Step S31, selecting a ground control point as a target ground control point; 
       step S32, selecting a reference picture and a plurality of evaluation pictures corresponding to the target ground control point; 
       Step S33, determining at least one marking point from the image corresponding to the target ground control point on the reference picture; 
       step S34, projecting the marking points to the surface of the rough three-dimensional terrain model and intercepting a first projection surface element; 
       step S35, searching to obtain the optimal position and the optimal orientation of the first projection surface element; 
       step S36, projecting the image in a reference window on the reference picture to a first projection surface element with the best position and the best orientation to obtain a second projection surface element; 
       And step S37, searching and determining the corresponding imaging position of the target ground control point on each picture to be marked based on the expected image obtained by re-projecting the second projection surface element to each picture to be marked. 
      In some specific embodiments, when there are a plurality of ground control points to be automatically marked and associated, as shown in fig. 7, steps S31 to S37 may be cyclically performed until the operation of automatically identifying all the ground control points to be automatically identified and associated is completed.
      An embodiment of step S3 will be described in detail below with reference to the drawings.
      Firstly, in step S31, a ground control point is selected as a target ground control point to be automatically marked and associated with each picture to be marked, then a reference picture is selected according to the imaging condition of the target ground control point on each aerial photo, and a plurality of evaluation pictures are selected around the reference picture.
      In order to ensure the accurate shape of the subsequent search results, the reference picture is selected so that the target ground control point is shot at an orthographic angle as much as possible, namely, an aerial photo of which the image corresponding to the target ground control point is close to the forward projection shape of the target ground control point is selected as the reference picture as much as possible, for example, when the actual shape of the target ground control point is square, the aerial photo of which the target ground control point is shot and the imaging shape is basically square is selected from the aerial photo set.
      In some preferred embodiments, the deviation or similarity between the shape of an image formed by a ground control point on an aerial image and the shape that it should have when taken at a forward viewing angle can be evaluated using, for example, the geometric difference between the imaged image and an ideal square can be measured using, for example, hausdorff distance, average distance or similarity ratio, and the like, and, for example, the difference between the sides (or perimeter) of an image formed by GCP on an aerial image and the sides (or perimeter) of a standard square having the same area can be used as a deviation indicator:
      
        
      
       Wherein D is a deviation index, l i is the side length of each image formed by one GCP on the aerial photo, and l square is the side length of a square with the same area as the image. 
      Obviously, a preset similarity threshold (first similarity threshold) or deviation threshold (first deviation threshold) may be set, and when the similarity between the shape of an image formed by a ground control point on an aerial image and the shape of a forward-projection image of the aerial image is greater than the first similarity threshold or the deviation is less than the first deviation threshold, the aerial image is taken as a reference image taking the ground control point as a target ground control point.
      After the reference picture of the target ground control point is determined, a plurality of aerial pictures can be selected as evaluation pictures from the periphery of the reference picture, wherein the periphery refers to the fact that the evaluation pictures have camera positions and visual angles close to those of the reference picture, and the aerial pictures are generally numbered continuously according to a shooting sequence, so that the evaluation pictures are generally a plurality of pictures adjacent to the reference picture in the aerial picture set.
      Fig. 8A and 8B respectively show schematic diagrams of a reference picture and its surrounding evaluation pictures selected from a plurality of aerial pictures after determining a target ground control point to be marked in a specific embodiment.
      As shown in fig. 8A, the evaluation picture may include a picture that captures the target ground control point at a near forward projection view angle, or may include a picture that captures the target ground control point at a more deviated view angle than the reference picture, and similarly, as shown in fig. 8B, the evaluation picture may also be screened according to the similarity or deviation between the shape of the image formed by the target ground control point on the evaluation ground control point and the forward projection image shape of the ground control point (or the shape of the image formed by the target ground control point on the reference picture), that is, the second similarity threshold or the second deviation threshold may be preset, and only aerial pictures that have the similarity with the shape of the forward projection image of the target ground control point (or the shape of the image formed by the reference picture) larger than the second similarity threshold or the deviation smaller than the second deviation threshold may be selected as the evaluation picture.
      Preferably, the image of the target ground control point on the reference picture is closer to the image of the forward projection than the image of the surrounding evaluation pictures, so that in some preferred embodiments the first similarity threshold is greater than or equal to the second similarity threshold, and in other preferred embodiments the first deviation threshold is less than or equal to the second deviation threshold.
      In some preferred embodiments, the number of evaluation pictures is not less than 6, for example, 3 aerial pictures can be sequentially selected as 6 evaluation pictures on both sides of the reference picture, respectively, and in other preferred embodiments, the number of evaluation pictures is not more than 10, for example, 5 aerial pictures can be sequentially selected as 10 evaluation pictures on both sides of the reference picture, respectively.
      The step S33 is used for marking the marking point corresponding to the target ground control point on the reference picture, and the marking in the step may be performed by manual operation, for example, a certain pixel (preferably, a pixel corresponding to the center of the target ground control point) in the image formed by the target ground control point is manually selected on the reference picture, and the pixel is used as the marking point of the target ground control point on the reference picture, or may be performed in a semi-automatic manner of man-machine interaction, for example, a rough range is firstly defined by manual operation, an image of the ground control point is identified by an automatic identification algorithm, one pixel is marked as the marking point, and then the position of the marking point is manually adjusted.
      It should be noted that, although the marking of the target ground control point in step S33 occurs after the selection of the reference picture in step S32, a preferred principle of selecting the reference picture is whether a certain ground control point that is desired to be automatically marked can be imaged in a forward direction or a direction close to the forward direction on one aerial photo picture, so as to facilitate the accurate marking of the ground control point on the aerial photo picture, and thus, for different target ground control points, different aerial photos may need to be selected as the reference pictures thereof.
      After the accurate marking of the target ground control point on the reference picture is completed, in step S34, as shown in fig. 9, first, the marking point of the target ground control point is projected onto the rough three-dimensional terrain model of the region to be modeled along the projection line corresponding to the reference picture (in the embodiment of the present application, the projection line corresponding to the marking point on the reference picture is referred to as the reference projection line), where the reference projection line of the marking point may be determined by the camera orientation element of the reference picture obtained in step S2 that is already stored in the database.
      Camera orientation elements play an important role in the fields of photogrammetry, computer vision, image processing and the like, and comprise an inner orientation element and an outer orientation element of the camera, wherein the inner orientation element describes internal parameters of the camera, the parameters relate to the geometric structure and optical characteristics of the camera, mainly comprise Focal Length (Focal Length), principal point (PRINCIPAL POINT), radial distortion coefficient (Radial Distortion Coefficients), tangential distortion coefficient (TANGENTIAL DISTORTION COEFFICIENTS) and the like and are used for representing how to project from three-dimensional space points to a two-dimensional image plane, and the outer orientation element describes the spatial position and orientation of the camera when shooting, mainly comprises three-dimensional position coordinates and rotation angles (Roll, pitch, yaw or corresponding rotation matrix/quaternion) of the camera and is used for representing the position and the posture of the camera relative to a world coordinate system.
      Referring to fig. 5, once the camera azimuth element of an aerial photo is determined by means of aerial triangulation, for a three-dimensional spatial point, a straight line from the center of the camera to the spatial point is a projection line of the spatial point corresponding to the aerial photo, and an intersection point of the projection line and the aerial photo is a position where the intersection point is imaged on the aerial photo, so that the projection line shows how the spatial point is mapped onto the aerial photo through the internal and external azimuth elements of the camera.
      In order to clearly indicate the direction of the projection operation, in the embodiment of the present application, the operation of projecting a certain or several elements (such as pixels, geometric features, or texture features) along their corresponding projection lines from an aerial image onto a three-dimensional terrain model in the form of a grid patch of the region to be modeled is referred to as projection, and correspondingly, the operation of projecting an element on the three-dimensional terrain model along its projection lines onto an aerial image is referred to as re-projection.
      As previously analyzed, the rough three-dimensional model obtained by unconstrained beam adjustment can generally reflect the three-dimensional topography of the region to be modeled, and the camera azimuth element provides the projection direction of each pixel on the reference picture to the corresponding space point, so that after the labeling point of the target ground control point is projected onto the surface of the rough three-dimensional model according to the corresponding reference projection line on the reference picture, a projection point is obtained, and the three-dimensional coordinates of the projection point and the orientation (i.e. the normal direction) of the plane on the rough three-dimensional model represent the approximate space position and orientation of the target ground control point when the target ground control point is arranged on the region to be modeled.
      Next, as shown in fig. 9, a surface element is continuously cut around the projection point on the plane where the projection point is located on the rough groove three-dimensional model in step S34, and this surface element is referred to as a first projection surface element in the present application.
      The first projection surface element can be preferably intercepted according to the projection point as the center, or can be correspondingly adjusted according to the actual intersection condition of each plane forming the rough three-dimensional model (if the surface element intercepted by taking the projection point as the center just spans two patches forming the rough three-dimensional model, the first projection surface element can be properly deviated to be in the same plane, and only the projection point is required to be included in the intercepted surface element.
      The shape and size of the first projection surface element may be selected according to the size of the region to be modeled, the resolution parameter of the aerial photo, etc., in some preferred embodiments, the first projection surface element may be square, and the projection point is taken as the center of the square, in other embodiments, the shape of the first projection surface element may be set correspondingly according to the actual shape of the target ground control point, such as a rectangle, a circle, etc.
      In the embodiment of the present application, the first projection bin corresponds to a Mask, and constructs a "window" defining the shape and size (number of pixels) based on the rough position and orientation information of the target ground control point reflected by the projection point on the rough three-dimensional model, and further, the first projection bin can be used as a search starting point to obtain the best estimation result of the position and orientation of the target ground control point by iterative search in step S35.
      Referring to fig. 10 and 11, in some preferred embodiments of the present application, the search for the optimal position and orientation of the target ground control point is performed by:
       step S351, re-projecting the first projection surface element to each evaluation picture to obtain a re-projection window; 
       step S352, determining cross-correlation characteristics of images in the re-projection windows of the respective evaluation pictures; 
       Step S353, judging whether the cross-correlation feature satisfies a correlation criterion, if not, executing step S354 and returning to step S351, and if so, executing step S355; 
       step S354, translating or rotating the first projection surface element along a reference projection line; 
       Step S355, the current position and orientation of the first projection bin are used as the optimal position and orientation of the first projection bin. 
      Specifically, in step S351, the first projection bin is projected onto each evaluation picture according to its current position and orientation, the projection line on which each point is projected onto each evaluation picture is constructed according to the camera azimuth element information (determined in step S2), the projection line corresponding to each evaluation picture is used for re-projecting the first projection bin onto each evaluation picture, after the first projection bin is re-projected onto the evaluation picture, the region covered on the evaluation picture is shown as a red part in fig. 11, and the edge of the region forms a re-projection window.
      Obviously, during the first reprojection operation, the intersection point of the first projection surface element and the reference projection line is the projection point on the rough three-dimensional model, and the position and the orientation of the first projection surface element are the position and the orientation of the surface element intercepted in the step S34.
      Referring to fig. 11, when the target ground control point is in a certain alternative pose, after the target ground control point is photographed by the cameras at each position, the size and the shape (i.e. the position where the reprojection window is located, the number of covered pixels and the formed shape) of the target ground control point are expected to be located on the corresponding aerial photo, obviously, only the first projection bin consistent with the actual pose of the target ground control point is present, and the texture features of the image in the reprojection window obtained by reprojecting the target ground control point to each evaluation photo have the greatest correlation with the target ground control point.
      The rough three-dimensional model is utilized to determine the initial pose of the first projection surface element, so that the calculated amount of searching can be obviously reduced, only fine translation or rotation is needed around the initial pose, the real pose of the target ground control point can be accurately estimated, meanwhile, for any one of the alternative poses of the first projection surface element, the position of the reprojection window on an estimated picture is determined, the orientation of the position of the reprojection window determines the shape and the size of the reprojection window on the estimated picture, the texture features of the image in the reprojection window are simultaneously limited and restrained by factors such as the position, the pose, the shape and the area, and the like, the reprojection window generated according to any pose with the deviation from the actual pose exceeding a certain degree can generate obvious deviation between the covered image and the real imaging result of the target ground control point on the estimated picture on one or more of the positions, the shapes and the areas, and further cause obvious reduction of the correlation of the image textures in the reprojection window, and therefore, the ground control point can be greatly estimated accurately.
      In the step S352, in the case of evaluating the Correlation of the images in the respective reprojection windows, an image Correlation evaluation index known to those skilled in the art may be selected, and the Correlation of the images in the reprojection windows on any two evaluation pictures may be calculated, for example, an optional Correlation evaluation index is a Normalized Cross-Correlation (NCC), or other index for image Correlation may be selected.
      In some preferred embodiments, the reference picture also serves as an evaluation picture, i.e. the first projection bin is also re-projected to the reference picture and participates in the evaluation of the correlation with the image in its re-projection window.
      After the correlation calculation of the images in each re-projection window on each evaluation picture is completed, in step S353, it is determined whether the calculation result satisfies the correlation criterion, and step S354 or step S355 is selectively executed according to the determination result.
      In some alternative embodiments, the correlation of the images in each re-projection window may be counted first, any one or more of statistical characteristics such as a mean value, a median value, a maximum value or a minimum value may be obtained, then whether the statistical characteristics are equal to or greater than a preset correlation threshold value is determined, when the determination result is negative, the first projection surface element is translated or rotated through step S354 (refer to fig. 11, the translation or rotation of the first projection surface element is always performed with the intersection point of the first projection surface element and the reference projection line as a motion base point, that is, the projection result of the labeling point of the target ground control point to the spatial point is always "bound" on the reference projection line, the translation and rotation are always performed along the reference projection line, then step S351 is returned to re-perform re-projection and correlation calculation, and when the determination result is positive, step S355 is performed, the position and orientation (normal direction) of the first projection surface element at this time are taken as the optimal position and orientation (refer to fig. 11), the position and orientation of the first projection surface element at this time are always performed with the intersection point of the first projection surface element and the reference projection line as the motion base point, that the labeling point is arranged at the actual position and orientation of the region to be modeled, and the possible position and orientation of the target control point are provided on the target point to be effectively associated with the target point at the next estimated position and the constraint point is provided.
      Fig. 12 shows the case where the NCC coefficient of the image in each reprojection window varies with the size of the first projection bin, which is a square bin, and the abscissa represents the number of pixels included in each side of the bin, in a specific embodiment.
      As shown in fig. 12, in the region where the first projected bin size is smaller, the NCC value generally shows a gradual upward trend as the bin size increases, indicating that as the bin size increases, the accuracy of matching also increases, however, as the bin size increases further, the NCC value does not increase significantly any more, and even fluctuates (as occurs between 11×11 pixels and 17×17 pixels).
      By analyzing the texture features of the pictures on each evaluation picture, the correlation index is not obviously improved after the size of the bin is increased to a certain degree, even the phenomenon of decline occurs, which indicates that other image noises can be additionally introduced along with the increase of the number of pixels for correlation calculation, so that the contribution degree of the bin size to pose optimization is reduced and even negatively influenced. Therefore, the size of the first projection bin may not be too small, otherwise there will be insufficient information for evaluating the cross-correlation during the subsequent search, preferably the lower size limit of the first projection bin is 3 x 3 pixels, and at the same time the size of the first projection bin need not be too large, since further increases in size will not continue to promote the cross-correlation. In some preferred embodiments, the upper size limit of the first projection bin may be determined by calculating the maximum value of the correlation statistics of the images within each re-projection window.
      After the actual pose of the target ground control point is determined, the actual pose can be used as a geometric constraint condition, obviously, the constraint condition can be used for effectively limiting the search of the target ground control point on the picture to be marked in a specific area, so that the automatic searching efficiency is remarkably improved, more importantly, the image of the ground control point automatically searched and marked on the picture to be marked by using the geometric constraint is necessarily the image of the target ground control point, namely, the association of the same ground characteristic point on different aerial pictures is completed while the automatic searching is carried out, and the problem that the images of different ground control points on two aerial pictures are recognized as the same ground control point is fundamentally avoided.
      As shown in fig. 7, the searching for the target ground control point on the picture to be marked is performed through step S36 and step S37.
      Step S36 is used for generating a second projection surface element, wherein an image edge corresponding to a target ground control point can be generated on a reference picture through manual operation or manual and automatic algorithm interaction operation and the like, so that a reference window is obtained, and an image in the reference window is projected onto a first projection surface element with the optimal position and the optimal orientation along a reference projection line, so that the second projection surface element is obtained.
      Step S37 is used to automatically search for the target ground control point on each picture to be marked, and as analyzed above, since this step searches for the designated target ground control point on each picture to be marked, it also establishes an association between the same ground control points on different aerial photos.
      In some specific embodiments, as shown in fig. 13, step S37 includes performing the following steps in a loop until processing of all aerial pictures that need to be marked is completed:
       S371, selecting one aerial photo of the region to be modeled as a picture to be marked; 
       s372, re-projecting the second projection surface element to the picture to be marked to obtain a desired image; 
       s373, extracting edges of the expected image to construct a matching window; 
       And S374, searching the image to be marked for the optimal position of the matching window, so that the target ground control point is marked on the image to be marked. 
      Specifically, in step S372, the second projection bin may be re-projected onto the picture to be marked along a projection line corresponding to the picture to be marked, so as to obtain a desired image thereof on the picture to be marked. Obviously, the position and the orientation of the second projection surface element are consistent with those of the first projection surface element with the optimal position and orientation, and the difference between the two projection surface elements is that the shape and texture characteristic information expected to be possessed when the corresponding image of the target ground control point on the reference picture is projected to the real pose of the target ground control point is increased in the second projection surface element, so that when multiple information with the specific position, orientation, shape and texture characteristic is further projected onto the picture to be marked, the obtained expected image can approximately limit the expected imaging position of the target ground control point on the picture to be marked and the shape, size and texture of the image expected to be formed, the search of the target ground control point is carried out by utilizing the information with strong constraint, the possibility that different ground control points in two aerial pictures are identified as the same ground control point is eliminated, and meanwhile, the matching window is only required to be horizontally slid in a small area around the expected image, and the large-scale search and the rotation operation in the conventional image matching search is not required, so that the search speed is greatly improved.
      Preferably, the searching of the best position of the matching window in step S374 may be implemented by using a least square image matching algorithm, as shown in fig. 14, the matching window may be slid horizontally on the image to be marked according to a set sliding step, the image in the matching window is extracted, the texture feature of the matching window and the texture feature of the expected image are calculated in a least error square manner, the above-mentioned process is repeated until the position with the minimum error is found, the image surrounded by the matching window at the position represents the imaged position of the target ground control point, and thus the automatic marking of the target ground control point is completed.
      The embodiment shown in fig. 15 further illustrates the case of performing an optimal position search of the matching window on the picture to be marked that only partially displays the target ground control point, when the target ground control point is located at the edge of one aerial image, only a portion may be imaged on the aerial image, and it is apparent that the texture features for searching and comparing are missing for the conventional automatic marking method based on texture information alone, thus causing difficulty in recognition, as shown in fig. 14, when the desired image is entirely inside the picture to be marked. However, for the method of the application, since the expected image is imaged on the edge of the aerial image according to the position where the expected image is expected to be located, and only a part of the expected image is located on the aerial image, the matching window can be determined by combining the edge of the expected image with the edge of the aerial image itself, and the optimal position of the expected image can be searched on the image to be marked, namely, by the method of the application, the identification and marking of the target ground control point can be completed only through the partial information of the target ground control point on the image to be marked.
      Returning to fig. 7, once the operations of steps S31 to S37 are completed, the automatic identification and association of one target ground control point is completed, after which one ground control point may be reselected as the target ground control point, and then steps S31 to S37 are re-performed. By repeatedly performing the above operations, automatic marking and association of all or a specific number of ground control points can be accomplished.
      < High-precision modeling of three-dimensional terrain >
      Returning to fig. 2, after the operation of step S3 is completed, through step S4, the generation of the sparse point cloud set is performed again (i.e. the constrained beam method adjustment operation is performed under the constraint provided by the ground control point, so as to obtain the second sparse point cloud) by using the ground control point mark and the associated aerial photo.
      In some specific embodiments, step S4 comprises the steps of:
       s41, identifying a plurality of characteristic points in aerial pictures of all marked ground control points and performing characteristic point matching; 
       S42, obtaining a second estimated value of a camera azimuth element of each aerial photo and a second estimated value of a three-dimensional coordinate of each characteristic point through beam method adjustment with constraint conditions, wherein the constraint conditions comprise position constraint conditions of marked ground control points; 
       s43, constructing a second sparse point cloud set based on second estimated values of three-dimensional coordinates of the feature points. 
      The main difference between the step S4 and the step S2 is that the beam adjustment operation at this time uses the position information of the marked and associated ground control points as a constraint condition, specifically, since the true three-dimensional coordinates of each ground control point are obtained through accurate measurement, and the positions of the ground control points imaged on each aerial photo are marked and associated through the step S3, the position information of the ground control points can be used as a constraint condition, and the estimation result of the three-dimensional coordinates of each feature point can be continuously optimized in the iterative estimation process of the beam adjustment, so as to finally obtain the second estimation value of the three-dimensional coordinates of each feature point and generate the second sparse point cloud.
      After the second sparse point cloud is obtained, the dense point cloud can be generated through the step S5, so that a fine three-dimensional terrain model of the region to be modeled is obtained. The specific embodiments of the steps S4 and S5 are known to those skilled in the art, and are not described herein.
      < Detailed description of the invention >
      In this embodiment, the aerial photo set shown in fig. 3A and 3B is adopted, and the method shown in fig. 2 is used to perform high-precision modeling on the building.
      Fig. 16 and 17 show schematic diagrams of a three-dimensional sparse point cloud and a coarse grid model of a building generated in an implementation process, wherein in the process of automatically marking and associating each ground control point, the size of a first projection surface element is 11×11 pixels, and statistics of marking results of 10 ground control points are listed in the following table 1.
      TABLE 1 automatic marking results statistics for ground control points
      
        
      
      As can be seen from Table 1, when the reference window size is 15×15, there are a total of 482 points successfully marked with a success rate of about 67.3%, the root mean square error between the coordinates of the 482 marked points and the true coordinates is 0.564 pixels, and when the window size is 25×25, there are a total of 683 points successfully marked with a success rate of about 95.5%, the root mean square error between the coordinates of the 683 marked points and the true coordinates is 0.549 pixels. The statistics result shows that through the steps of automatic marking and association, more than 700 ground control points to be marked can be automatically marked within 1 minute, the marking success rate is more than 95% along with the increase of the size of the reference window to 25×25 pixels, meanwhile, the comprehensive precision of marking all the ground control points reaches the sub-pixel level, and the comprehensive improvement of the marking speed and the marking precision compared with the manual marking mode is proved.
      Fig. 18 and 19 show a dense point cloud and a fine three-dimensional terrain model of the building, and fig. 20 shows a final three-dimensional model obtained by mapping the fine model, and it can be seen from fig. 18 to 20 that three-dimensional terrain modeling based on aerial pictures can be efficiently and accurately completed by using the three-dimensional terrain modeling method provided by the application.
      Some embodiments of the present application also provide a three-dimensional terrain modeling system, as shown in fig. 21, which includes a processor and a readable storage medium, wherein the readable storage medium stores an executable program, and the executable program can implement the three-dimensional terrain modeling method when executed by the processor.
      While the foregoing is directed to embodiments of the present application, other and further embodiments of the application may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.