Disclosure of Invention
Based on this, it is necessary to provide an unsupervised video SAR image registration method, apparatus, device and medium capable of overcoming global rigid deformation and local elastic deformation at the same time.
An unsupervised video SAR image registration method, the method comprising:
acquiring a training data set, wherein the training data set comprises multi-frame SAR training images arranged in time sequence;
Selecting any two adjacent SAR training images in the training data set as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into a global registration network to obtain related parameters of a global transformation matrix;
Performing global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a preliminary registration image;
Inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a predicted registration image;
Constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and constructing a local registration loss function according to the local correlation coefficient and the deformation field regularization term between the reference image and the predicted registration image;
Training the global registration network and the local registration network according to the global registration loss function and the local registration loss function to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network;
And acquiring a video SAR image to be registered, and performing image registration on the video SAR image by using the interframe registration network.
In one embodiment, when the global registration network generates relevant parameters of a global transformation matrix according to the reference image and the image to be registered, the global registration network:
splicing the reference image and the image to be registered along the channel dimension and then taking the spliced reference image and the image to be registered as the input of the global registration network;
In the global registration network, five parameters are generated from the spliced image using six convolution layers and one fully connected layer, wherein the five parameters comprise an $x$-axis translation amount, a $y$-axis translation amount, an $x$-axis scaling ratio, a $y$-axis scaling ratio and a rotation angle;
Wherein the output channels of the six convolution layers are 32,64,256,256,256 and 256 in sequence.
In one embodiment, the global registration loss function is expressed as:

$\mathcal{L}_{global} = -\dfrac{\sum_{p\in\Omega}\left(I_r(p)-\bar I_r\right)\left(I_g(p)-\bar I_g\right)}{\sqrt{\sum_{p\in\Omega}\left(I_r(p)-\bar I_r\right)^2}\,\sqrt{\sum_{p\in\Omega}\left(I_g(p)-\bar I_g\right)^2}}$;

in the above, $I_r(p)$ represents the gray value of the reference image at pixel $p$, $I_g(p)$ represents the gray value of the preliminary registration image at pixel $p$, $\Omega$ represents the whole image domain, $\bar I_r$ is the mean of the gray values of $I_r$ over the whole image domain, and $\bar I_g$ is the mean of the gray values of $I_g$ over the whole image domain.
In one embodiment, the local registration network includes a convolution layer, a pooling layer, bottleneck units, and a deconvolution layer;
After the reference image and the preliminary registration image are spliced along the channel dimension, downsampling is carried out by using the convolution layer and the pooling layer;
the Bottleneck unit comprises a plurality of groups, each containing a different number of Bottleneck blocks, which sequentially perform feature extraction and dimension-reduction processing on the downsampled data at different levels to obtain multi-level features;
And the features of each level are fused with the features of the level above through jump connection; the deconvolution layer then restores the dimension through upsampling, and the result continues to propagate to the level above until the highest-level features have been fused, after which the predicted registration image is obtained through the deconvolution layer.
In one embodiment, the convolution kernel size of the convolution layer is 7×7;
the Bottleneck units comprise 4 groups, which contain 3, 4, 6 and 3 Bottleneck blocks respectively;
The convolution kernel size of the deconvolution layer is 3×3.
In one embodiment, the Bottleneck includes a main path and a jump connection path;
three convolution layers with different convolution kernel sizes are sequentially arranged on the main path, and one convolution layer is arranged on the jump connection path;
And after the main path and the jump connection path respectively process the input data, the processed data are fused through addition operation to obtain the Bottleneck output data.
In one embodiment, the local registration loss function is expressed as:

$\mathcal{L}_{local} = -\mathrm{LCC}(I_r, I_l) + \lambda R(\phi)$, with $\mathrm{LCC}(I_r, I_l) = \sum_{p\in\Omega}\dfrac{\left(\sum_{q\in w(p)}\left(I_l(q)-\bar I_l(p)\right)\left(I_r(q)-\bar I_r(p)\right)\right)^2}{\sum_{q\in w(p)}\left(I_l(q)-\bar I_l(p)\right)^2\,\sum_{q\in w(p)}\left(I_r(q)-\bar I_r(p)\right)^2}$;

in the above, $\mathrm{LCC}$ represents the local correlation coefficient, $R(\phi)$ represents the deformation-field regularization term, $w(p)$ represents a local region centered at pixel $p$, $I_l(q)$ represents the gray value of the predicted registration image at pixel $q$, $\bar I_l(p)$ represents its mean over the region $w(p)$, $I_r(q)$ represents the gray value of the reference image at pixel $q$, $\bar I_r(p)$ represents its mean over the region $w(p)$, and $\lambda$ is a weighting coefficient; the regularization term sums the squares of the first derivatives of the deformation fields $\phi_x$ and $\phi_y$ in the $x$ and $y$ directions.
The application provides an unsupervised video SAR image registration device, which comprises:
the training data set acquisition module is used for acquiring a training data set, wherein the training data set comprises multi-frame SAR training images arranged in time sequence;
The global registration parameter obtaining module is used for selecting any two adjacent SAR training images in the training data set to serve as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into the global registration network to obtain related parameters of a global transformation matrix;
The primary registration module is used for carrying out global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a primary registration image;
The local registration module is used for inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a prediction registration image;
The loss function construction module is used for constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and for constructing a local registration loss function according to the local correlation coefficient and the deformation field regularization term between the reference image and the predicted registration image;
The network training module is used for training the global registration network and the local registration network according to the global registration loss function and the local registration loss function respectively to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network;
The image registration module is used for acquiring a video SAR image to be registered, and carrying out image registration on the video SAR image by utilizing the interframe registration network.
A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:
acquiring a training data set, wherein the training data set comprises multi-frame SAR training images arranged in time sequence;
Selecting any two adjacent SAR training images in the training data set as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into a global registration network to obtain related parameters of a global transformation matrix;
Performing global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a preliminary registration image;
Inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a predicted registration image;
Constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and constructing a local registration loss function according to the local correlation coefficient and the deformation field regularization term between the reference image and the predicted registration image;
Training the global registration network and the local registration network according to the global registration loss function and the local registration loss function to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network;
And acquiring a video SAR image to be registered, and performing image registration on the video SAR image by using the interframe registration network.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a training data set, wherein the training data set comprises multi-frame SAR training images arranged in time sequence;
Selecting any two adjacent SAR training images in the training data set as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into a global registration network to obtain related parameters of a global transformation matrix;
Performing global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a preliminary registration image;
Inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a predicted registration image;
Constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and constructing a local registration loss function according to the local correlation coefficient and the deformation field regularization term between the reference image and the predicted registration image;
Training the global registration network and the local registration network according to the global registration loss function and the local registration loss function to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network;
And acquiring a video SAR image to be registered, and performing image registration on the video SAR image by using the interframe registration network.
According to the unsupervised video SAR image registration method, apparatus, device and medium, the preprocessed reference image and image to be registered are input into the global registration network to obtain the relevant parameters of the global transformation matrix, from which the preliminary registration image is obtained. The preliminary registration image and the reference image are then input into the local registration network for local correction, yielding the predicted registration image. A global registration loss function is built by maximizing the similarity between the reference image and the preliminary registration image, and a local registration loss function is built according to the local correlation coefficient and the deformation field regularization term between the reference image and the predicted registration image. The global registration network and the local registration network are trained with these two loss functions to obtain the trained registration networks, an inter-frame registration network is built from the trained global registration network and local registration network, and the inter-frame registration network is used to register the video SAR images. The SAR image registration method can thus overcome global rigid deformation and local elastic deformation at the same time.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
In the prior art, SAR image registration methods capable of handling both global rigid deformation and local elastic deformation are rare. As shown in fig. 1, an unsupervised video SAR image registration method is therefore provided, which specifically comprises the following steps:
Step S100, a training data set is acquired, wherein the training data set comprises multiple frames of SAR training images arranged in time sequence.
Step S110, selecting any two adjacent SAR training images in the training data set as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into the global registration network to obtain the related parameters of the global transformation matrix.
Step S120, performing global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a preliminary registration image.
Step S130, inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a prediction registration image.
Step S140, constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and constructing a local registration loss function according to the local correlation coefficient and the deformation field regularization term between the reference image and the predicted registration image.
Step S150, training the global registration network and the local registration network according to the global registration loss function and the local registration loss function to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network.
Step S160, obtaining a video SAR image to be registered, and performing image registration on the video SAR image by using an inter-frame registration network.
Before the method is explained, the cause of shadow formation by moving targets in video SAR is introduced, and the influence of the moving-target speed and the equivalent backscattering coefficient on moving-target detection is analyzed.
In video SAR, a moving target blocks the ground reflection echoes and forms a shadow region with lower gray values in the image; the longer the blocking time, the smaller the shadow gray values in the image. The shadow length $L_{shadow}$ when the target is moving can be expressed as:

$L_{shadow} = L_t + \Delta L + L_0$ (1)

In formula (1), $L_t$ is the target length, $\Delta L$ is the motion displacement of the target within the imaging time, and $L_0$ is the length of the shadow formed by a stationary target of height $h$ at beam incidence angle $\eta$. As shown in fig. 2, when the target moves, the shadow length is related to factors such as the length, height and moving speed of the target and the incidence angle of the on-board video SAR.
When the effects of the moving-target height and the on-board video SAR incidence angle are ignored, the shadow length is related only to the moving-target length, speed and imaging time, and can be expressed as:

$L_{shadow} = L_t + vT$ (2)

In formula (2), $v$ is the moving speed of the target and $T$ is the video SAR imaging time.
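The simplified shadow-length relation of formula (2) can be sketched numerically. In the minimal sketch below, the function and parameter names are illustrative and not taken from the source:

```python
def shadow_length(target_length_m: float, speed_mps: float, imaging_time_s: float) -> float:
    """Shadow length of a moving target per formula (2): L_shadow = L_t + v * T,
    ignoring the target height and the video SAR incidence angle."""
    return target_length_m + speed_mps * imaging_time_s

# A 5 m target moving at 10 m/s during a 2 s imaging interval
# casts a 5 + 10 * 2 = 25 m shadow.
print(shadow_length(5.0, 10.0, 2.0))  # → 25.0
```

This also makes the qualitative claim in the text concrete: the shadow length grows linearly with both the target speed and the imaging time.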
Further, the background noise consists of two parts, additive noise and multiplicative noise. The additive noise depends on the thermal noise of the receiver, while the multiplicative noise is linearly related to the average equivalent backscattering coefficient of the received signal. The noise equivalent backscattering coefficient can be expressed as:

$\sigma_N = \sigma_{NA} + \sigma_{NM} = \sigma_{NA} + \mathrm{MNR}\cdot\bar\sigma_0$ (3)

In formula (3), $\sigma_N$ is the noise equivalent backscattering coefficient, $\sigma_{NA}$ is the additive-noise equivalent backscattering coefficient, $\sigma_{NM}$ is the multiplicative-noise equivalent backscattering coefficient, $\bar\sigma_0$ is the average equivalent backscattering coefficient of the received signal, and $\mathrm{MNR}$ is the multiplicative noise ratio.
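As a minimal numeric sketch of formula (3) in linear (not dB) units; the function and parameter names are illustrative, not from the source:

```python
def noise_equivalent_sigma(sigma_additive: float, mnr: float, sigma0_mean: float) -> float:
    """Noise equivalent backscattering coefficient per formula (3):
    sigma_N = sigma_NA + MNR * mean(sigma_0), all in linear units."""
    return sigma_additive + mnr * sigma0_mean
```

Because the multiplicative part scales with the received signal, a brighter average scene raises the noise floor even when the receiver thermal noise is unchanged.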
Shadows can be classified into two types according to the moving speed of the target. When the motion displacement $\Delta L$ is smaller than the target length $L_t$, the equivalent backscattering coefficient of the middle region of the shadow is $\sigma_N$, because the ground echo of this region is totally blocked by the moving target, as shown in fig. 3(a). When $\Delta L$ is greater than $L_t$, the equivalent backscattering coefficient of the middle region of the shadow is larger than $\sigma_N$ but smaller than the background equivalent backscattering coefficient $\sigma_B$, because the ground echo of this region is only partially blocked by the moving target, as shown in fig. 3(b). In both types, the gray value of the shadow region stays constant from the middle outward and then gradually increases to the background level at the two ends. The length of the shadow region is determined by the length and moving speed of the target: the faster the target moves, the longer the shadow and the larger the equivalent backscattering coefficient of the shadow region.
When the moving speed of the target is small, the shadow region in the image is short, its gray value is low, and its contrast with the surrounding stationary background is large, so the shadow position can be effectively detected. As the target speed increases, the shadow region becomes longer, its gray value increases, and its contrast with the surrounding stationary background decreases; as shown in fig. 3(b), the shadow may be submerged in the background noise, increasing the risk of missed detections. Shadow detection is therefore affected by many factors, and many researchers continue to improve detection performance on the existing data sets, raising the shadow detection probability and reducing the false alarm probability.
After in-depth analysis of the causes of moving-target shadows and the challenges of detecting them, the present application proposes a two-stage unsupervised inter-frame registration network architecture based on deep learning. The network is composed of a global registration network and a local registration network, which together form the inter-frame registration network; the whole network framework is shown in fig. 4.
In the method, steps S100 to S150 are all processes of training the global registration network and the local registration network, and step S160 is a process of performing image registration by using the inter-frame registration network.
In step S100, since the method is an unsupervised network training method, a truth value tag is not needed when preparing the training data set, which reduces the difficulty of network training to some extent.
In this embodiment, before the training data set is used to train the network, the training data set is further preprocessed: speckle noise is filtered out by Lee filtering, and the number of training images in the training data set is then expanded by rotation, clipping and the like.
In step S110, in the training dataset, two adjacent frames of SAR images are used as a set of training pairs, one frame is used as a reference image, the other frame is used as an image to be registered, and the images are input into the global registration network to predict relevant parameters of the global transformation matrix.
In this embodiment, when the global registration network generates the relevant parameters of the global transformation matrix from the reference image and the image to be registered, the reference image and the image to be registered are spliced along the channel dimension and used as the input of the global registration network. In the global registration network, five parameters are generated from the spliced image using six convolution layers and one fully connected layer: the $x$-axis translation amount, the $y$-axis translation amount, the $x$-axis scaling ratio, the $y$-axis scaling ratio and the rotation angle. The output channels of the six convolution layers are 32, 64, 256, 256, 256 and 256 in sequence.
Specifically, the global registration network realizes the preliminary registration of the input images through three global transformations: translation, rotation and scaling. The reference image $I_r$ and the unregistered image $I_m$ are stitched along the channel dimension as the input of the global registration network, so the input has 2 channels. Six convolution layers, with output channels 32, 64, 256, 256, 256 and 256 in sequence, are followed by a fully connected layer with 5 output channels, which outputs the five parameters of the global transformation matrix. The globally registered image $I_g$ is obtained from $I_m$ through the global transformation, which can be expressed as:

$x_g = s_x\,(x_m\cos\theta - y_m\sin\theta) + t_x,\qquad y_g = s_y\,(x_m\sin\theta + y_m\cos\theta) + t_y$ (4)

In formula (4), $t_x$ is the $x$-axis translation amount, $t_y$ is the $y$-axis translation amount, $\theta$ is the rotation angle, $s_x$ is the $x$-axis scaling ratio, $s_y$ is the $y$-axis scaling ratio, $(x_m, y_m)$ is a pixel of $I_m$, and $(x_g, y_g)$ is the corresponding pixel of $I_g$.
In step S120, the image to be registered is subjected to global transformation according to the global transformation matrix formula (4) and the parameters outputted by the global registration network, so as to obtain a preliminary registration image.
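The per-pixel mapping of step S120 can be sketched as follows; a minimal sketch of a five-parameter translation-rotation-scaling transform per formula (4), with illustrative names not taken from the source:

```python
import math

def global_transform(x_m: float, y_m: float,
                     t_x: float, t_y: float,
                     theta: float, s_x: float, s_y: float):
    """Map a pixel (x_m, y_m) of the image to be registered to the
    globally registered image using the five predicted parameters."""
    x_g = s_x * (x_m * math.cos(theta) - y_m * math.sin(theta)) + t_x
    y_g = s_y * (x_m * math.sin(theta) + y_m * math.cos(theta)) + t_y
    return x_g, y_g
```

With zero translation, zero rotation and unit scaling the transform is the identity, which is the natural initialization for an untrained registration network.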
Further, in step S130, the preliminary registration image and the reference image are input into a local registration network, and the reference image is used to locally correct the preliminary registration image, so as to obtain a predicted registration image.
Due to the motion of the video SAR platform, local elastic deformation exists between adjacent frame images. Factors such as terrain relief, target motion and micro-vibration of the platform cause elastic deformation in local regions of the image, which must be compensated by local registration. The local registration network compensates the local deformation using multi-scale fusion and deformable convolution to achieve pixel-level alignment. The predicted registration image $I_l$ is obtained from the preliminary registration image $I_g$ through local compensation, which can be expressed as:

$I_l(x, y) = I_g\big(x + \phi_x(x, y),\; y + \phi_y(x, y)\big)$ (5)

In formula (5), $\phi_x$ represents the deformation field in the $x$ direction and $\phi_y$ represents the deformation field in the $y$ direction. Ideally, excluding the edge regions, the output of the local registration network $I_l$ is approximately equal to $I_r$.
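Formula (5) amounts to resampling the preliminary registration image at displaced coordinates. Below is a minimal pure-Python sketch with nearest-neighbour sampling (names are illustrative; a trainable implementation would use bilinear interpolation so the warp stays differentiable):

```python
def warp_local(image, phi_x, phi_y):
    """Apply per-pixel deformation fields per formula (5):
    out[y][x] = image[y + phi_y[y][x]][x + phi_x[y][x]],
    with nearest-neighbour rounding; out-of-bounds samples stay 0.
    `image`, `phi_x`, `phi_y` are equally sized 2-D lists (rows = y)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xs = round(x + phi_x[y][x])
            ys = round(y + phi_y[y][x])
            if 0 <= xs < w and 0 <= ys < h:
                out[y][x] = image[ys][xs]
    return out
```

A zero deformation field returns the image unchanged, matching the ideal case described above where $I_l$ approximately equals $I_r$ once registration succeeds.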
In this embodiment, the local registration network includes a convolution layer, a pooling layer, Bottleneck units and deconvolution layers. After the reference image and the preliminary registration image are spliced along the channel dimension, downsampling is performed with the convolution layer and the pooling layer. The Bottleneck unit comprises several groups, each with a different number of Bottleneck blocks, which sequentially perform feature extraction and dimension reduction at different levels to obtain multi-level features. The features of each level are fused through jump connection with the features of the level above, the dimension is restored through upsampling by a deconvolution layer, and the result continues to propagate upward until the highest-level features have been fused, after which the predicted registration image is obtained through a final deconvolution layer.
As shown in fig. 5, the reference image $I_r$ and the preliminary registration image $I_g$ are stitched along the channel dimension as the input of the local registration network, so the input has 2 channels. A convolution layer with stride 2 and a max-pooling layer produce 32 output channels. The data output by the max-pooling layer are then processed sequentially by several groups with different numbers of Bottleneck blocks, outputting features of different levels with 64, 128, 256 and 512 channels in sequence. Five consecutive deconvolution layers then produce 256, 128, 64, 32 and 2 output channels in sequence. For better fusion of image features, a U-Net jump connection structure is adopted, and the network outputs, for each pixel, the local elastic deformation field in the $x$ and $y$ directions.
In one embodiment, the convolution kernel size of the convolution layer is 7×7, the Bottleneck unit includes 4 groups containing 3, 4, 6 and 3 Bottleneck blocks respectively, and the convolution kernel size of the deconvolution layer is 3×3.
Further, as shown in fig. 6, bottleneck includes a main path and a jump connection path, three convolution layers with different convolution kernel sizes are sequentially disposed on the main path, one convolution layer is disposed on the jump connection path, and after the main path and the jump connection path process the input data respectively, the processed data are fused through an addition operation to obtain Bottleneck output data.
In step S140, a global registration loss function is constructed by maximizing the similarity between the reference image and the preliminary registration image, and is used to train the global registration network. It is expressed as:

$\mathcal{L}_{global} = -\dfrac{\sum_{p\in\Omega}\left(I_r(p)-\bar I_r\right)\left(I_g(p)-\bar I_g\right)}{\sqrt{\sum_{p\in\Omega}\left(I_r(p)-\bar I_r\right)^2}\,\sqrt{\sum_{p\in\Omega}\left(I_g(p)-\bar I_g\right)^2}}$ (6)

In formula (6), $I_r(p)$ represents the gray value of the reference image at pixel $p$, $I_g(p)$ represents the gray value of the preliminary registration image at pixel $p$, $\Omega$ represents the whole image domain, $\bar I_r$ is the mean of the gray values of $I_r$ over the whole image domain, and $\bar I_g$ is the mean of the gray values of $I_g$ over the whole image domain.
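A minimal sketch of the loss in formula (6), computed as the negative normalized cross-correlation of two flattened gray-value lists (names are illustrative, not from the source):

```python
def global_registration_loss(ref, mov):
    """Negative normalized cross-correlation between two equally long
    flattened gray-value lists; -1 means perfect positive correlation."""
    n = len(ref)
    mean_ref = sum(ref) / n
    mean_mov = sum(mov) / n
    num = sum((a - mean_ref) * (b - mean_mov) for a, b in zip(ref, mov))
    den = (sum((a - mean_ref) ** 2 for a in ref) ** 0.5) * \
          (sum((b - mean_mov) ** 2 for b in mov) ** 0.5)
    return -num / den
```

Minimizing this loss drives the correlation toward +1, i.e. the preliminary registration image toward a linear gray-level match with the reference image, which is why the loss needs no ground-truth deformation labels.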
Further, a local registration loss function is constructed from the local correlation coefficient between the reference image and the predicted registration image together with a deformation-field regularization term, expressed as:

$\mathcal{L}_{local} = -\mathrm{LCC}(I_r, I_l) + \lambda R(\phi)$ (7)

with

$\mathrm{LCC}(I_r, I_l) = \sum_{p\in\Omega}\dfrac{\left(\sum_{q\in w(p)}\left(I_l(q)-\bar I_l(p)\right)\left(I_r(q)-\bar I_r(p)\right)\right)^2}{\sum_{q\in w(p)}\left(I_l(q)-\bar I_l(p)\right)^2\,\sum_{q\in w(p)}\left(I_r(q)-\bar I_r(p)\right)^2}$

$R(\phi) = \sum_{p\in\Omega}\left(\Big(\dfrac{\partial\phi_x}{\partial x}\Big)^2 + \Big(\dfrac{\partial\phi_x}{\partial y}\Big)^2 + \Big(\dfrac{\partial\phi_y}{\partial x}\Big)^2 + \Big(\dfrac{\partial\phi_y}{\partial y}\Big)^2\right)$

In formula (7), $\mathrm{LCC}$ represents the local correlation coefficient, $R(\phi)$ represents the deformation-field regularization term, $w(p)$ represents a local region centered at pixel $p$, $I_l(q)$ represents the gray value of the predicted registration image at pixel $q$, $\bar I_l(p)$ represents its mean over $w(p)$, $I_r(q)$ represents the gray value of the reference image at pixel $q$, $\bar I_r(p)$ represents its mean over $w(p)$, and $\lambda$ is a weighting coefficient; the regularization term sums the squares of the first derivatives of the deformation fields $\phi_x$ and $\phi_y$ in the $x$ and $y$ directions.
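The smoothness term $R(\phi)$ of formula (7) can be sketched with forward finite differences; names are illustrative, and the window-based correlation term is omitted for brevity:

```python
def deformation_regularizer(phi_x, phi_y):
    """Sum of squared forward differences of both deformation fields in
    the x and y directions (discrete form of the R(phi) smoothness term).
    `phi_x` and `phi_y` are equally sized 2-D lists (rows = y)."""
    h, w = len(phi_x), len(phi_x[0])
    total = 0.0
    for phi in (phi_x, phi_y):
        for y in range(h):
            for x in range(w):
                if x + 1 < w:  # derivative along x
                    total += (phi[y][x + 1] - phi[y][x]) ** 2
                if y + 1 < h:  # derivative along y
                    total += (phi[y + 1][x] - phi[y][x]) ** 2
    return total
```

A constant field costs nothing, so the penalty discourages only abrupt spatial changes in the deformation, keeping the predicted warp physically plausible.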
In step S150, the global registration network and the local registration network are trained according to the two loss functions until convergence, and then the inter-frame registration network can be constructed according to the trained global registration network and local registration network.
The inter-frame registration network also comprises a global transformation unit between the global registration network and the local registration network, the unit adopts a transformation matrix to carry out global transformation on the image to be registered according to related parameters generated by the global registration network to obtain a preliminary registration image, and the local registration network carries out local correction on the preliminary registration image to obtain a registration image.
In step S160, the video SAR images to be registered are sequentially input into an inter-frame registration network to realize registration. The method comprises the steps of sequentially taking the previous frame as a reference image of the next frame in a video SAR image, registering the next frame, taking the registered image as the reference image of the next frame of the image, registering the registered image, and the like.
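The frame-by-frame scheme of step S160 can be sketched as follows; `register_pair` is a hypothetical placeholder standing in for the trained inter-frame registration network, not an API from the source:

```python
def register_sequence(frames, register_pair):
    """Register each frame to the already-registered previous frame:
    the first frame is the initial reference, and every registered
    output becomes the reference for the next frame."""
    if not frames:
        return []
    registered = [frames[0]]
    for frame in frames[1:]:
        registered.append(register_pair(registered[-1], frame))
    return registered
```

Chaining references this way keeps each registration step a small inter-frame correction rather than a registration against a distant first frame.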
In the method, the effectiveness of the method is also demonstrated by experimental results. The data set used in the experiments is real video SAR data: 900 frames of images are cut out and speckle noise is filtered by Lee filtering. Of these, 700 images are expanded into 1000 images of 512×512 by rotation, clipping and the like for training, and 200 images cut to 512×512 are used to test network performance. The batch size of the network is set to 32, and the learning rates of the global registration network and the local registration network are set separately.
Two indicators, Structural Similarity (SSIM) and Mutual Information (MI), are used to evaluate image registration performance. Structural similarity measures registration performance by comparing the errors between the reference image and the registered image, and mutual information measures the statistical dependency between them, so the two indicators together evaluate the similarity between the reference image and the registered image well. The method presented herein was compared with three image matching methods, SAR-SIFT, LPM and LLT, and the comparison results are shown in Table 1. The method achieves large improvements in structural similarity and mutual information over the three methods, and the registered images retain greater structural similarity, verifying the practicability and superiority of the method.
Table 1 comparison of the method with other methods
Fig. 7 shows that the 2 nd frame image in the video SAR image is used as a reference image, the 3 rd frame image is used as an unregistered image, two images are used as inputs of an interframe registration network, the reference image and the registration image output by the network are utilized to extract the static background of the moving object, and then the detection result of the shadow is finally obtained through background difference, morphological processing and connected domain screening.
In the above unsupervised video SAR image registration method, a two-stage unsupervised inter-frame registration network is proposed for the global and local deformation problems that exist between adjacent frames of a video SAR sequence. The network realizes a global-to-local registration process through a cascade of the global registration network and the local registration network: the global registration network completes preliminary registration using translation, rotation, scaling and similar transformations, while the local registration network compensates for local elastic deformation through multi-scale fusion and deformable convolution, further improving registration accuracy. Experimental results show that the method outperforms SAR-SIFT, LPM and LLT on both structural similarity (SSIM) and mutual information (MI), verifying its effectiveness and superiority.
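The two warping operations that the cascade composes, a global similarity transform followed by a dense local displacement field, can be sketched without any learning machinery. In the actual networks the affine parameters and the displacement field are predicted by the global and local registration networks; here they are supplied by hand as an illustration.

```python
import numpy as np
from scipy import ndimage

def warp_global_then_local(img, theta=0.0, scale=1.0, shift=(0.0, 0.0), flow=None):
    """Apply a similarity transform (global stage) followed by a dense
    displacement field (local stage); backward warping throughout (sketch)."""
    c, s = np.cos(theta), np.sin(theta)
    A = scale * np.array([[c, -s], [s, c]])      # maps output coords -> input coords
    # Rotate/scale about the image centre, then translate.
    centre = (np.asarray(img.shape) - 1) / 2.0
    offset = centre - A @ centre + np.asarray(shift)
    warped = ndimage.affine_transform(img, A, offset=offset, order=1)
    if flow is None:
        return warped
    # flow: (2, H, W) per-pixel displacements; resample by backward mapping.
    grid = np.indices(img.shape).astype(np.float64)
    return ndimage.map_coordinates(warped, grid + flow, order=1)
```

The composition order mirrors the cascade: the coarse rigid motion is removed first, so the local field only needs to encode small elastic residuals, which is what makes the smoothness regularizer on the field effective.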
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily completed at the same moment but may be performed at different moments, and these sub-steps or stages are not necessarily performed in sequence but may be performed in turn or in alternation with at least a portion of other steps, or of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 8, an unsupervised video SAR image registration apparatus is provided, which includes a training data set acquisition module 200, a global registration parameter obtaining module 210, a preliminary registration module 220, a local registration module 230, a loss function construction module 240, a network training module 250, and an image registration module 260, wherein:
A training data set acquisition module 200, configured to acquire a training data set, where the training data set includes multiple frames of SAR training images arranged in time sequence;
the global registration parameter obtaining module 210 is configured to select any two adjacent frames of SAR training images in the training data set as a reference image and an image to be registered respectively, preprocess the reference image and the image to be registered, and input the preprocessed reference image and the preprocessed image to be registered into a global registration network to obtain relevant parameters of a global transformation matrix;
The preliminary registration module 220 is configured to perform global transformation on the image to be registered according to the relevant parameters of the global transformation matrix, so as to obtain a preliminary registration image;
the local registration module 230 is configured to input the preliminary registration image and the reference image into a local registration network, and locally correct the preliminary registration image by using the reference image to obtain a predicted registration image;
a loss function construction module 240, configured to construct a global registration loss function according to maximizing similarity between the reference image and the preliminary registration image, and construct a local registration loss function according to a local correlation coefficient and a deformation field regularization term between the reference image and the predicted registration image;
the network training module 250 is configured to train the global registration network and the local registration network according to the global registration loss function and the local registration loss function, obtain a trained registration network, and construct an inter-frame registration network according to the trained global registration network and the local registration network;
The image registration module 260 is configured to acquire a video SAR image to be registered, and perform image registration on the video SAR image by using the inter-frame registration network.
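The two loss functions constructed by the loss function construction module 240 can be sketched as follows. This is an illustrative interpretation under stated assumptions: normalized cross-correlation stands in for the global similarity measure, the local correlation coefficient is computed over fixed square windows, and the deformation-field regularizer is an L2 penalty on the field's spatial gradients; the window size and weight lam are assumed, not taken from the patent.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ncc(a, b, eps=1e-8):
    """Global normalized cross-correlation in [-1, 1]."""
    a = (a - a.mean()) / (a.std() + eps)
    b = (b - b.mean()) / (b.std() + eps)
    return float((a * b).mean())

def local_cc(a, b, win=9, eps=1e-8):
    """Windowed (local) correlation coefficient, averaged over the image."""
    ma, mb = uniform_filter(a, win), uniform_filter(b, win)
    va = uniform_filter(a * a, win) - ma * ma
    vb = uniform_filter(b * b, win) - mb * mb
    cov = uniform_filter(a * b, win) - ma * mb
    return float(np.mean(cov / np.sqrt(np.maximum(va * vb, eps))))

def smoothness(flow):
    """L2 penalty on spatial gradients of a (2, H, W) deformation field."""
    gy = np.diff(flow, axis=1)
    gx = np.diff(flow, axis=2)
    return float((gy ** 2).mean() + (gx ** 2).mean())

def global_loss(ref, warped):
    return -ncc(ref, warped)  # maximizing similarity = minimizing -NCC

def local_loss(ref, warped, flow, lam=0.1):
    return -local_cc(ref, warped) + lam * smoothness(flow)
```

Both losses are computed directly from the image pair, which is what makes the training unsupervised: no ground-truth deformation is required, and the regularizer discourages physically implausible, non-smooth local fields.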
For specific limitations on the unsupervised video SAR image registration apparatus, reference may be made to the limitations on the unsupervised video SAR image registration method above, which are not repeated here. Each module in the above apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor of the computer device, or may be stored in software form in a memory of the computer device, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal whose internal structure may be as shown in fig. 9. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an unsupervised video SAR image registration method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device of the computer device may be a touch layer covering the display screen, may be keys, a trackball or a touch pad arranged on the housing of the computer device, or may be an external keyboard, touch pad, mouse, or the like.
It will be appreciated by persons skilled in the art that the architecture shown in fig. 9 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applicable, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
acquiring a training data set, wherein the training data set comprises multi-frame SAR training images arranged in time sequence;
selecting any two adjacent SAR training images in the training data set as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into a global registration network to obtain relevant parameters of a global transformation matrix;
Performing global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a preliminary registration image;
Inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a predicted registration image;
constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and constructing a local registration loss function from the local correlation coefficient between the reference image and the predicted registration image and a deformation field regularization term;
Training the global registration network and the local registration network according to the global registration loss function and the local registration loss function to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network;
And acquiring a video SAR image to be registered, and performing image registration on the video SAR image by using the interframe registration network.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a training data set, wherein the training data set comprises multi-frame SAR training images arranged in time sequence;
selecting any two adjacent SAR training images in the training data set as a reference image and an image to be registered respectively, preprocessing the reference image and the image to be registered, and inputting the preprocessed reference image and the preprocessed image to be registered into a global registration network to obtain relevant parameters of a global transformation matrix;
Performing global transformation on the image to be registered according to the related parameters of the global transformation matrix to obtain a preliminary registration image;
Inputting the preliminary registration image and the reference image into a local registration network, and carrying out local correction on the preliminary registration image by using the reference image to obtain a predicted registration image;
constructing a global registration loss function by maximizing the similarity between the reference image and the preliminary registration image, and constructing a local registration loss function from the local correlation coefficient between the reference image and the predicted registration image and a deformation field regularization term;
Training the global registration network and the local registration network according to the global registration loss function and the local registration loss function to obtain a trained registration network, and constructing an interframe registration network according to the trained global registration network and the local registration network;
And acquiring a video SAR image to be registered, and performing image registration on the video SAR image by using the interframe registration network.
Those skilled in the art will appreciate that all or part of the processes in the above method embodiments may be implemented by a computer program instructing relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. The volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of these technical features have been described; however, as long as there is no contradiction in a combination of the technical features, it should be considered within the scope of this specification.
The above examples merely illustrate several embodiments of the application and are described in relative detail, but they are not thereby to be construed as limiting the scope of the application. It should be noted that several variations and improvements can be made by those skilled in the art without departing from the concept of the application, all of which fall within the protection scope of the application. Accordingly, the protection scope of the present application shall be subject to the appended claims.