
CN113905147B - Method and device for removing tremble of marine monitoring video picture and storage medium - Google Patents


Info

Publication number
CN113905147B
CN113905147B (application CN202111156440.2A)
Authority
CN
China
Prior art keywords
video frame
frame image
offset
matrix
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111156440.2A
Other languages
Chinese (zh)
Other versions
CN113905147A (en)
Inventor
顾骏
王珅
彭梦兰
常健杰
唐红梅
黄永珍
罗勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin Changhai Development Co ltd
Original Assignee
Guilin Changhai Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin Changhai Development Co ltd filed Critical Guilin Changhai Development Co ltd
Priority to CN202111156440.2A priority Critical patent/CN113905147B/en
Publication of CN113905147A publication Critical patent/CN113905147A/en
Application granted granted Critical
Publication of CN113905147B publication Critical patent/CN113905147B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/14Picture signal circuitry for video frequency region
    • H04N5/21Circuitry for suppressing or minimising disturbance, e.g. moiré or halo

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method and a device for removing jitter from a marine surveillance video picture, and a storage medium. The method comprises the following steps: preprocessing a plurality of video frame images to be processed to obtain a reference frame image and a plurality of video frame images; analysing the offsets between the reference frame image and the plurality of video frame images to obtain offset vectors; constructing a perspective projection matrix from each offset vector; and performing rendering transformation compensation on each video frame image according to its perspective projection matrix to obtain a de-jittered video picture. The application addresses the bottlenecks of current software de-jittering technology, namely slow computation and an unsatisfactory stabilisation effect, so that real-time video can be de-jittered on a computer of relatively low performance while the stabilisation effect is improved and the clarity of the processed video is enhanced.

Description

Method and device for removing tremble of marine monitoring video picture and storage medium
Technical Field
The application mainly relates to the technical field of image processing, in particular to a method and a device for removing tremble of a marine monitoring video picture and a storage medium.
Background
In a ship video monitoring system, a camera fixedly installed on a ship vibrates because of factors such as ship resonance and wave impact, so the captured video picture shakes and blurs and the viewing effect is degraded. The problem can be addressed in hardware or in software. The hardware approach adds one of the various stabiliser devices on the market; these are costly, less safe and inconvenient to install, and are therefore unsuitable for a marine video monitoring system. The software approach applies computer image processing to the jittering picture, automatically transforming and compensating it so that the video appears visually de-jittered. However, the existing software de-jittering technology has two problems: first, the amount of computation is large, the demands on computer hardware are too high, and real-time video processing is difficult to achieve; second, the de-jittering effect is not ideal, and the processed video remains blurred.
Disclosure of Invention
The application aims to solve the technical problem of providing a method and a device for removing jitter of a marine monitoring video picture and a storage medium aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a method for removing tremble of a marine monitoring video picture comprises the following steps:
importing video data, wherein the video data comprises a plurality of video frame images to be processed, and preprocessing the plurality of video frame images to be processed to obtain a reference frame image and a plurality of video frame images;
analyzing offset vectors of the reference frame image and the plurality of video frame images to obtain offset vectors corresponding to the video frame images;
constructing a perspective projection matrix corresponding to each video frame image through each offset vector;
and respectively performing rendering transformation compensation on each video frame image according to each perspective projection matrix to obtain a tremble-free video picture corresponding to each video frame image.
The other technical scheme for solving the technical problems is as follows: a marine monitoring video picture tremble removing device comprises:
the image preprocessing module is used for importing video data, wherein the video data comprises a plurality of video frame images to be processed, and preprocessing the video frame images to be processed to obtain a reference frame image and a plurality of video frame images;
the offset vector analysis module is used for analyzing the offset vectors of the reference frame image and the video frame images to obtain offset vectors corresponding to the video frame images;
the matrix construction module is used for constructing a perspective projection matrix corresponding to each video frame image through each offset vector;
and the anti-shake video image acquisition module is used for respectively carrying out rendering transformation compensation on each video frame image according to each perspective projection matrix to obtain an anti-shake video image corresponding to each video frame image.
The other technical scheme for solving the technical problems is as follows: a marine monitoring video picture jitter removing device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the marine monitoring video picture jitter removing method described above when executing the computer program.
The other technical scheme for solving the technical problems is as follows: a computer readable storage medium storing a computer program which, when executed by a processor, implements a marine surveillance video picture debounce method as described above.
The beneficial effects of the application are as follows: a plurality of video frame images to be processed are preprocessed to obtain a reference frame image and a plurality of video frame images; the offsets between the reference frame image and the plurality of video frame images are analysed to obtain offset vectors; a perspective projection matrix is constructed from each offset vector; and rendering transformation compensation is performed on each video frame image according to its perspective projection matrix to obtain a de-jittered video picture. This addresses the bottlenecks of current software de-jittering technology, namely slow computation and an unsatisfactory stabilisation effect, so that real-time video can be de-jittered on a computer of relatively low performance while the stabilisation effect is improved and the clarity of the processed video is enhanced.
Drawings
Fig. 1 is a flow chart of a method for removing tremble of a marine surveillance video frame according to an embodiment of the present application;
fig. 2 is a block diagram of a marine monitoring video image debouncer according to an embodiment of the present application.
Detailed Description
The principles and features of the present application are described below with reference to the drawings. The examples are provided only to illustrate the application and are not to be construed as limiting its scope.
Fig. 1 is a flowchart of a method for removing jitter of a marine surveillance video frame according to an embodiment of the present application.
As shown in fig. 1, a method for removing tremble of a marine surveillance video frame includes the following steps:
importing video data, wherein the video data comprises a plurality of video frame images to be processed, and preprocessing the plurality of video frame images to be processed to obtain a reference frame image and a plurality of video frame images;
analyzing offset vectors of the reference frame image and the plurality of video frame images to obtain offset vectors corresponding to the video frame images;
constructing a perspective projection matrix corresponding to each video frame image through each offset vector;
and respectively performing rendering transformation compensation on each video frame image according to each perspective projection matrix to obtain a tremble-free video picture corresponding to each video frame image.
In the above embodiment, the plurality of video frame images to be processed are preprocessed to obtain the reference frame image and the plurality of video frame images; the offsets between the reference frame image and the plurality of video frame images are analysed to obtain the offset vectors; perspective projection matrices are constructed from the offset vectors; and the video frame images are rendered and transformation-compensated according to the perspective projection matrices to obtain the de-jittered video frames. This addresses the bottlenecks of the existing software de-jittering technology, namely slow computation and an unsatisfactory stabilisation effect: real-time video de-jittering can be achieved on a computer of relatively low performance, the stabilisation effect is improved, and the clarity of the processed video is enhanced.
Optionally, as an embodiment of the present application, the plurality of video frame images to be processed are sequentially arranged, and the process of preprocessing the plurality of video frame images to be processed to obtain the reference frame image and the plurality of video frame images includes:
respectively carrying out image format conversion on each video frame image to be processed in a set format to obtain converted images corresponding to each video frame image to be processed;
scaling each of the converted images to obtain a scaled image corresponding to each video frame image to be processed;
and performing gray level conversion on each of the scaled images to obtain the video frame images, and taking the first video frame image as the reference frame image.
It should be understood that the reference frame image is used to locate feature points and serves as the reference against which the motion of each subsequent frame is calculated.
It should be understood that the set format is the BGR24 format; that is, the YUV format of the video frame images to be processed is converted into an RGB format for subsequent image processing, BGR24 being one layout of the RGB family.
It should be understood that after the real-time video image data (i.e. the video frame images to be processed) are collected, each collected frame (i.e. each video frame image to be processed) is converted into the BGR24 format and subjected to 1/4 scaling and gray level conversion to obtain a plurality of image data (i.e. the video frame images); the 1st frame of image data (i.e. the first video frame image) is placed into memory as the de-jitter reference frame (i.e. the reference frame image).
In the above embodiment, each video frame image to be processed is converted into the set format to obtain the converted images, the converted images are scaled to obtain the scaled images, the scaled images are gray-level converted to obtain the video frame images, and the first video frame image is used as the reference frame image. This provides accurate data for the subsequent processing, improves the de-jittering effect, enhances the clarity of the processed video, and allows real-time video de-jittering on a computer of relatively low performance.
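The preprocessing step above can be sketched as follows. This is a minimal illustration under stated assumptions: the frames are taken to be already decoded as BGR24 arrays, the 1/4 scaling is done by simple subsampling (a stand-in for a proper resize), and the BT.601 luma weights for the grayscale conversion are an assumption, since the patent does not specify the conversion coefficients.

```python
import numpy as np

def preprocess(frames_bgr24):
    """Preprocess BGR24 frames: 1/4 spatial scaling, then grayscale.
    Returns (reference_frame, frames): the first processed frame is
    kept as the de-jitter reference, as described above."""
    processed = []
    for frame in frames_bgr24:
        small = frame[::4, ::4, :].astype(float)   # crude 1/4 "scaling" by subsampling
        # BT.601 luma weights (an assumption); channel order is B, G, R.
        gray = 0.114 * small[..., 0] + 0.587 * small[..., 1] + 0.299 * small[..., 2]
        processed.append(np.clip(np.round(gray), 0, 255).astype(np.uint8))
    return processed[0], processed

# Two uniform synthetic 64x64 BGR frames.
frames = [np.full((64, 64, 3), v, dtype=np.uint8) for v in (100, 120)]
ref, all_frames = preprocess(frames)
print(ref.shape, len(all_frames))   # (16, 16) 2
```

A production pipeline would decode YUV to BGR24 first and use an area-averaging resize rather than subsampling, but the data flow (convert, shrink, gray, keep frame 1 as reference) is the same.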
Optionally, as an embodiment of the present application, the process of analyzing offset vectors of the reference frame image and the plurality of video frame images to obtain offset vectors corresponding to each of the video frame images includes:
positioning the feature points of the reference frame image to obtain a plurality of reference feature points, and integrating all the reference feature points to obtain global feature points;
and performing feature point iterative computation on the plurality of video frame images and the global feature points by a pyramid optical flow method together with the reference frame image, to obtain a plurality of offsets corresponding to each video frame image, and collecting the plurality of offsets corresponding to each video frame image to obtain the offset vector corresponding to each video frame image.
It should be appreciated that the global feature points are subsequently used to estimate and track each feature point (i.e. each reference feature point).
It should be appreciated that the pyramid optical flow method, i.e. the Lucas-Kanade (LK) pyramid optical flow method, is a widely used differential method of optical flow estimation. It assumes that the flow is constant within a neighbourhood of each pixel, and then solves the basic optical flow equations for all pixels in that neighbourhood by the least squares method.
Specifically, 100-200 goodFeaturesToTrack feature points are located on the reference frame (i.e. the reference frame image), and each subsequent frame image tracks these feature points (i.e. the reference feature points); the feature points are acquired according to a specified rule, and all of them are counted into the global feature points. An LK optical flow pyramid is built according to the pyramid optical flow method, and the global feature points of the previous frame are placed at the bottom of the pyramid (i.e. only the resulting set of feature points, the global feature points, is sent to the bottom of the optical flow pyramid as input data).
It should be appreciated that locating feature points on the reference frame image enables the tracking of each subsequent frame. The specified rule is, specifically, to randomly acquire a number of points and then screen out at least 100 and at most 200 points whose gray values differ from those of their surrounding neighbourhoods by more than a certain range, as feature points (i.e. the reference feature points) meeting the requirements.
It should be understood that each pair of preceding and following frame images (i.e. the reference frame image and/or the video frame images) with located feature points is sent into the LK optical flow pyramid for feature point iterative computation; the 4th layer is the top of the pyramid, and after 4 iterations the PointTrack offset vector of the adjacent frames is obtained as the global motion amount (i.e. the offset vector).
It should be understood that, using the global feature points, 4 iterative computations are performed by the pyramid optical flow method, and the offsets of all feature points between the two adjacent frame images (i.e. the offset vector) are obtained by exploiting the property that adjacent frames change little in brightness and in content displacement. The PointTrack offset vector is the set of offset vectors of all feature points of the reference frame after displacement, and is used to obtain the projective transformation matrix in the next step.
In the above embodiment, the global feature points are obtained by locating the feature points of the reference frame image, and the offset vectors are obtained by applying the pyramid optical flow method, with the reference frame image, to the plurality of video frame images and the global feature points in iterative feature point computations. Each subsequent frame can thus be tracked, the clarity of each frame image is preserved, and real-time video de-jittering can be achieved on a computer of relatively low performance.
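The core of the tracking step above is the Lucas-Kanade least-squares solve that the pyramid repeats at each level. A minimal single-level, whole-window sketch is shown below; the method described in the patent would instead detect 100-200 corners and run a 4-level LK pyramid (as in OpenCV's goodFeaturesToTrack and calcOpticalFlowPyrLK), so this is a simplification for illustration only.

```python
import numpy as np

def lk_flow(f1, f2):
    """One Lucas-Kanade least-squares solve over a single window:
    [sum(Ix^2) sum(IxIy); sum(IxIy) sum(Iy^2)] @ (u, v) = -(sum(IxIt), sum(IyIt))."""
    Iy, Ix = np.gradient(f1)              # np.gradient returns d/drow, d/dcol
    It = f2 - f1                          # temporal derivative between frames
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)          # (u, v): displacement in x and y

# Two synthetic frames: a Gaussian blob that moves 1 pixel to the right.
y, x = np.mgrid[0:64, 0:64].astype(float)
blob = lambda cx, cy: np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * 4.0 ** 2))
f1, f2 = blob(32.0, 32.0), blob(33.0, 32.0)
u, v = lk_flow(f1, f2)
print(u, v)   # u should be close to 1.0, v close to 0.0
```

The pyramid variant applies this solve coarse-to-fine so that displacements larger than the linearization allows can still be recovered, which is why 4 iterations over 4 levels suffice for adjacent surveillance frames.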
Optionally, as an embodiment of the present application, the process of constructing a perspective projection matrix corresponding to each video frame image by using each offset vector includes:
constructing a matrix from each offset vector by using a Matrix3D matrix tool, to obtain a first homotype matrix corresponding to each video frame image and a second homotype matrix corresponding to each video frame image;
respectively restoring each first homotype matrix and each second homotype matrix according to a preset initial scaling ratio to obtain a restored first homotype matrix corresponding to each video frame image and a restored second homotype matrix corresponding to each video frame image;
respectively carrying out continuous cyclic subtraction conversion on each restored first homotype matrix and each restored second homotype matrix to obtain a relative offset matrix corresponding to each video frame image;
and importing an initial scaling, and analyzing offset values of the relative offset matrixes through the initial scaling to obtain perspective projection matrixes corresponding to the video frame images.
Preferably, the initial scale may be 90% of the original scale.
It should be appreciated that the Matrix3D matrix tool, i.e. the Matrix3D tool class, represents a transformation matrix that determines the position and orientation of a three-dimensional (3D) display object. The matrix can perform transformation functions including translation (repositioning along the x, y and z axes), rotation, and scaling (resizing). The Matrix3D class can also perform perspective projection, which maps points in a 3D coordinate space onto a two-dimensional (2D) view.
It should be appreciated that an image scaling matrix Zoomer is calculated from the offset vectors of adjacent frames, and the zoom ratios of the two scaling matrices are built into homotype matrices. Finally, after compensating for the original 1/4 scaling, the angle, zoom and xyz-axis displacement difference parameters generated by the image transformation are obtained and converted into a perspective projection matrix.
Specifically, the global motion amount (i.e., the offset vector) is put into a Matrix3D Matrix tool, a homotype Matrix H1 (i.e., the first homotype Matrix) and a homotype Matrix H2 (i.e., the second homotype Matrix) are constructed, the matrices (i.e., the first homotype Matrix and the second homotype Matrix) are restored according to the original 1/4 scaling, the homotype Matrix H1 (i.e., the restored first homotype Matrix) and the homotype Matrix H2 (i.e., the restored second homotype Matrix) are subjected to cyclic subtraction and conversion into a relative offset Matrix H, a default scaling Zoom (i.e., the initial scaling) is set to 90% of the original scaling, and an angle offset value and an xyz axis offset value of the current frame can be obtained through calculation.
In the above embodiment, the perspective projection matrix corresponding to each video frame image is constructed from each offset vector, which addresses the bottlenecks of current software de-jittering technology, namely slow computation and an unsatisfactory stabilisation effect: real-time video de-jittering can be achieved on a computer of relatively low performance, the stabilisation effect is improved, and the clarity of the processed video is enhanced.
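The matrix-construction step above is only loosely specified; in particular, "homotype matrix" and "continuous cyclic subtraction" admit more than one reading. The sketch below shows one plausible interpretation under stated assumptions: the restoration of a transform estimated at 1/4 scale uses the standard conjugation by a scaling matrix, and the "subtraction" is taken element-wise, with the identity added back so the result stays a usable transform. All matrix values are hypothetical.

```python
import numpy as np

def restore_scale(H, s=4.0):
    """Map a transform estimated on 1/s-scaled images back to full
    resolution: H_full = S @ H @ S^-1 with S = diag(s, s, 1)."""
    S = np.diag([s, s, 1.0])
    return S @ H @ np.linalg.inv(S)

def relative_offset(H1, H2):
    # One reading of the "continuous cyclic subtraction": the element-wise
    # difference of the restored matrices, added to the identity so the
    # result remains a usable transform.
    return np.eye(3) + (restore_scale(H2) - restore_scale(H1))

# Quarter-scale translations for two adjacent frames (hypothetical values).
H1 = np.array([[1.0, 0, 2.0], [0, 1.0, 1.0], [0, 0, 1.0]])
H2 = np.array([[1.0, 0, 3.0], [0, 1.0, 1.5], [0, 0, 1.0]])
H = relative_offset(H1, H2)
print(H[0, 2], H[1, 2])   # 4.0 2.0 : full-scale relative shift
```

Note how the conjugation multiplies the translation entries by 4, which is exactly the compensation for the 1/4 preprocessing scale that the text describes.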
Optionally, as an embodiment of the present application, the process of analyzing the offset value of each of the relative offset matrices by using the initial scaling to obtain a perspective projection matrix corresponding to each of the video frame images includes:
calculating the angle offset value of each relative offset matrix through a first formula respectively to obtain the angle offset value corresponding to each video frame image, wherein the first formula is as follows:
Δθ = arcsin(H(0,1)) × 180/π,
wherein Δθ is the angle offset value in degrees, π is pi, and H(0,1) is the element in row 0, column 1 of the relative offset matrix;
respectively calculating x-axis offset values of each relative offset matrix through a second formula and the initial scaling to obtain x-axis offset values corresponding to each video frame image, wherein the second formula is as follows:
Δx = -H(0,2) × Zoom,
wherein Δx is the x-axis offset value, Zoom is the initial scaling, and H(0,2) is the element in row 0, column 2 of the relative offset matrix;
calculating y-axis offset values of the relative offset matrices respectively through a third formula and the initial scaling to obtain y-axis offset values corresponding to the video frame images, wherein the third formula is as follows:
Δy = -H(1,2) × Zoom,
wherein Δy is the y-axis offset value, Zoom is the initial scaling, and H(1,2) is the element in row 1, column 2 of the relative offset matrix;
calculating the z-axis offset value of each relative offset matrix through a fourth formula and the initial scaling respectively to obtain the z-axis offset value corresponding to each video frame image, wherein the fourth formula is as follows:
Δz = -H(2,2) × Zoom,
wherein Δz is the z-axis offset value, Zoom is the initial scaling, and H(2,2) is the element in row 2, column 2 of the relative offset matrix;
judging whether a condition is satisfied, the condition being that the difference between the angle offset value corresponding to the current video frame image and the angle offset value corresponding to the previous video frame image is greater than or equal to a preset angle offset threshold, the difference between the x-axis offset values of the current and previous video frame images is greater than or equal to a preset x-axis offset threshold, the difference between the y-axis offset values of the current and previous video frame images is greater than or equal to a preset y-axis offset threshold, and the difference between the z-axis offset values of the current and previous video frame images is greater than or equal to a preset z-axis offset threshold,
if the conditions are met at the same time, reversely modifying the initial scaling according to a preset first modification value to obtain a first scaling, and taking the first scaling as a modified scaling;
if any condition is not met, modifying the initial scaling according to a preset second modification value to obtain a second scaling, and taking the second scaling as the modified scaling;
and performing matrix conversion on the angle offset value, the x-axis offset value, the y-axis offset value, the z-axis offset value and the modified scaling to obtain perspective projection matrixes corresponding to the video frame images.
It will be appreciated that the image is scaled in order to reduce the black borders that appear around the frame when the transformation compensates for the motion. By default the scaling ratio is set to 90%; when the calculated angle and xyz-axis displacement differences suddenly increase or decrease, the scaling ratio is modified, and when the change flattens out it is brought back to 90% frame by frame.
It will be appreciated that when the calculated angle offset value and xyz-axis offset values change suddenly compared with the previous frame, the scaling Zoom value (i.e. the initial scaling) is modified inversely by the preset modification value (i.e. the preset first modification value); as the change flattens out, the scaling ratio is modified back towards 90% by +1 or -1 percentage point per frame; and finally the angle offset value, the xyz-axis offset values and the Zoom value (i.e. the modified scaling) are converted into the perspective projection matrix corresponding to each frame according to the rules of perspective projection.
In the above embodiment, the perspective projection matrix is obtained by analysing the offset values of the relative offset matrices through the initial scaling, which reduces the black borders generated around the image during the compensating transformation and addresses the bottlenecks of current software de-jittering technology, namely slow computation and an unsatisfactory stabilisation effect: real-time video de-jittering can be achieved on a computer of relatively low performance, the stabilisation effect is improved, and the clarity of the processed video is enhanced.
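The first to fourth formulas, together with the zoom adjustment, can be sketched as follows. The formulas are taken directly from the text; the 1-percentage-point step size and the shape of the threshold handling are assumptions, since the patent only speaks of preset modification values, and the matrix entries are hypothetical.

```python
import numpy as np

def offsets_from_matrix(H, zoom):
    """Apply the first to fourth formulas to a relative offset matrix H."""
    d_theta = np.degrees(np.arcsin(np.clip(H[0, 1], -1.0, 1.0)))  # arcsin(H(0,1)) * 180 / pi
    dx = -H[0, 2] * zoom
    dy = -H[1, 2] * zoom
    dz = -H[2, 2] * zoom
    return d_theta, dx, dy, dz

def adjust_zoom(zoom, sudden_change, step=0.01, default=0.90):
    """Back the zoom off on a sudden offset jump, otherwise step it back
    toward the 90% default one step per frame (step size is assumed)."""
    if sudden_change:
        return zoom - step
    return min(zoom + step, default) if zoom < default else max(zoom - step, default)

H = np.array([[1.0, 0.05, 3.0],
              [-0.05, 1.0, -2.0],
              [0.0, 0.0, 1.0]])
theta, dx, dy, dz = offsets_from_matrix(H, zoom=0.90)
print(round(theta, 2), round(dx, 2), round(dy, 2), round(dz, 2))  # 2.87 -2.7 1.8 -0.9
```

The `np.clip` guards `arcsin` against numerical noise pushing H(0,1) slightly outside [-1, 1]; the patent's formula assumes the value is already a valid sine of the rotation angle.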
Optionally, as an embodiment of the present application, the process of performing rendering transformation compensation on each video frame image according to each perspective projection matrix to obtain a de-jittering video frame corresponding to each video frame image includes:
performing perspective transformation on each video frame image according to each perspective projection matrix to obtain a transformed image corresponding to the video frame image;
rendering each transformed image respectively to obtain a de-jittering video picture corresponding to each video frame image.
It should be understood that a vertex shader is used as the rendering tool: after the perspective projection matrix is acquired, the original image (i.e. the video frame image) is perspective-transformed according to the perspective projection matrix, and the transformed image data (i.e. the transformed image) is then rendered to obtain the compensated, de-jittered video picture.
In the above embodiment, the perspective projection matrixes are used for respectively performing perspective transformation on each video frame image to obtain transformed images, and rendering each transformed image to obtain the debouncing video picture, so that the automatic debouncing processing of the real-time video with the jittering blur collected by the camera in the marine video monitoring system is realized, the frame image is rapidly processed within 40ms, and meanwhile, the picture output is clear and stable.
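A minimal sketch of the perspective transformation in the compensation step is shown below, as nearest-neighbour inverse mapping in plain NumPy. The patent renders with a vertex shader, and a production pipeline would use a GPU or an OpenCV-style warp; the translation matrix and the single-pixel test image here are purely illustrative.

```python
import numpy as np

def warp_perspective(img, M, out_shape):
    """Nearest-neighbour perspective warp via inverse mapping: every
    output pixel is sampled from M^-1 applied to its coordinates."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)]).astype(float)
    src = np.linalg.inv(M) @ pts          # destination -> source coordinates
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros(out_shape, dtype=img.dtype)
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out

# Compensate a frame by a perspective projection matrix; here a pure
# translation of (2, 1) pixels for illustration.
M = np.array([[1.0, 0, 2.0], [0, 1.0, 1.0], [0, 0, 1.0]])
frame = np.zeros((8, 8), dtype=np.uint8)
frame[3, 3] = 255                         # one bright pixel
stabilized = warp_perspective(frame, M, (8, 8))
print(int(stabilized[4, 5]))              # 255: the pixel moved by (+2, +1)
```

Inverse mapping is the standard choice here because it guarantees every output pixel is assigned exactly once, which is also how a fragment stage behind a vertex shader samples the source texture.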
Fig. 2 is a block diagram of a marine monitoring video image debouncer according to an embodiment of the present application.
Alternatively, as another embodiment of the present application, as shown in fig. 2, a marine surveillance video picture debounce device includes:
the image preprocessing module is used for importing video data, wherein the video data comprises a plurality of video frame images to be processed, and preprocessing the video frame images to be processed to obtain a reference frame image and a plurality of video frame images;
the offset vector analysis module is used for analyzing the offset vectors of the reference frame image and the video frame images to obtain offset vectors corresponding to the video frame images;
the matrix construction module is used for constructing a perspective projection matrix corresponding to each video frame image through each offset vector;
and the anti-shake video image acquisition module is used for respectively carrying out rendering transformation compensation on each video frame image according to each perspective projection matrix to obtain an anti-shake video image corresponding to each video frame image.
Optionally, as an embodiment of the present application, a plurality of the video frame images to be processed are sequentially arranged, and the image preprocessing module is specifically configured to:
respectively carrying out image format conversion on each video frame image to be processed in a set format to obtain converted images corresponding to each video frame image to be processed;
scaling each of the converted images to obtain a scaled image corresponding to each video frame image to be processed;
and performing gray level conversion on each of the scaled images to obtain the video frame images, and taking the first video frame image as the reference frame image.
Alternatively, another embodiment of the present application provides a marine surveillance video picture de-jittering apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the marine surveillance video picture de-jittering method described above when executing the computer program. The device may be a computer or the like.
Alternatively, another embodiment of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the marine surveillance video picture debounce method as described above.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus and units described above may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present application.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The foregoing description is only of preferred embodiments of the application and is not intended to limit the application to the precise form disclosed; any modifications, equivalent replacements, and improvements made within the spirit and scope of the application are intended to be included within the scope of the application.

Claims (8)

1. A marine surveillance video picture de-jittering method, characterized by comprising the following steps:
importing video data, wherein the video data comprises a plurality of video frame images to be processed, and preprocessing the plurality of video frame images to be processed to obtain a reference frame image and a plurality of video frame images;
analyzing offset vectors of the reference frame image and the plurality of video frame images to obtain offset vectors corresponding to the video frame images;
constructing a perspective projection matrix corresponding to each video frame image through each offset vector;
performing rendering transformation compensation on each video frame image according to each perspective projection matrix, to obtain a de-jittered video picture corresponding to each video frame image;
the process of constructing a perspective projection matrix corresponding to each video frame image through each offset vector comprises the following steps:
respectively performing matrix construction on each offset vector using a Matrix3D matrix tool, to obtain a first homotype matrix corresponding to each video frame image and a second homotype matrix corresponding to each video frame image;
respectively restoring each first homotype matrix and each second homotype matrix according to a preset initial scaling ratio to obtain a restored first homotype matrix corresponding to each video frame image and a restored second homotype matrix corresponding to each video frame image;
respectively carrying out cyclic subtraction conversion on each restored first homotype matrix and each restored second homotype matrix to obtain a relative offset matrix corresponding to each video frame image;
and importing an initial scaling, and analyzing offset values of the relative offset matrices through the initial scaling, to obtain the perspective projection matrices corresponding to the video frame images.
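Claim 1 does not define "restoring" or "cyclic subtraction conversion" in closed form. The Python sketch below is one hedged reading, not the claimed implementation: it assumes the initial scaling multiplies the translation terms of each 3x3 homotype matrix, and takes the relative offset matrix as the element-wise difference of the two restored matrices; the function names are illustrative.

```python
def restore(M, zoom):
    """Undo the preset initial scaling on a 3x3 homotype matrix.
    Assumption (illustrative): the scaling multiplies the translation
    terms M[0][2] and M[1][2]."""
    R = [row[:] for row in M]
    R[0][2] /= zoom
    R[1][2] /= zoom
    return R

def relative_offset(first, second):
    """One reading of 'cyclic subtraction conversion': the element-wise
    difference of the two restored homotype matrices."""
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(first, second)]
```

With an identity second matrix, the relative offset matrix directly exposes the restored translation terms, which the later offset-value formulas then read out.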
2. The marine surveillance video picture de-jittering method according to claim 1, wherein the plurality of video frame images to be processed are sequentially arranged, and preprocessing the plurality of video frame images to be processed to obtain the reference frame image and the plurality of video frame images comprises:
respectively carrying out image format conversion on each video frame image to be processed in a set format to obtain converted images corresponding to each video frame image to be processed;
scaling the converted images respectively to obtain scaled images corresponding to the video frame images to be processed;
and respectively carrying out gray level conversion on the scaled images to obtain video frame images corresponding to the video frame images to be processed, and taking the first video frame image as a reference frame image.
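The gray level conversion step of claim 2 does not fix a formula; BT.601 luma weighting is one common choice, sketched here in Python (the function name and the nested-list frame representation are illustrative):

```python
def to_gray(rgb_frame):
    """Gray level conversion of an RGB frame, represented as nested lists
    of (r, g, b) tuples, using BT.601 luma weights - one common choice;
    the claim does not prescribe the formula."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_frame]
```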
3. The marine surveillance video picture de-jittering method according to claim 1, wherein analyzing offset vectors of the reference frame image and the plurality of video frame images to obtain the offset vectors corresponding to the respective video frame images comprises:
positioning the feature points of the reference frame image to obtain a plurality of reference feature points, and integrating all the reference feature points to obtain global feature points;
and performing feature point iterative computation on the plurality of video frame images and the global feature points through a pyramid optical flow method and the reference frame image, to obtain a plurality of offsets corresponding to each video frame image, and respectively collecting the plurality of offsets corresponding to each video frame image, to obtain the offset vector corresponding to each video frame image.
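The pyramid optical flow computation itself (e.g. pyramidal Lucas-Kanade) is involved; as a simplified stand-in, the sketch below estimates a single translational offset between a reference frame and a current frame by brute-force search. This is not the claimed pyramid method, only an illustration of what a per-frame offset means; the function name and the nested-list frame representation are assumptions.

```python
def estimate_offset(ref, cur, max_shift=2):
    """Estimate a translational offset (dx, dy) between a reference frame
    and a current frame (grayscale nested lists) by brute-force search,
    minimizing the mean squared difference over the overlapping region.
    Simplified stand-in for the claimed pyramid optical flow step."""
    h, w = len(ref), len(ref[0])
    best_score, best_shift = None, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            ssd, n = 0, 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        d = ref[y][x] - cur[yy][xx]
                        ssd += d * d
                        n += 1
            score = ssd / n
            if best_score is None or score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

A pyramid method refines such an estimate coarse-to-fine over downscaled copies of the frames, which is why it tolerates larger motions than a single-level search.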
4. The marine surveillance video picture de-jittering method according to claim 1, wherein analyzing the offset value of each relative offset matrix through the initial scaling to obtain the perspective projection matrix corresponding to each video frame image includes:
calculating an angle offset value for each relative offset matrix through a first formula, to obtain the angle offset value corresponding to each video frame image, wherein the first formula is:
Δθ = arcsin(H(0,1)) * 180.f / pi,
wherein Δθ is the angle offset value, 180.f denotes the floating-point constant 180, pi is the circular constant π, and H(0,1) is the element in row 0, column 1 of the relative offset matrix;
calculating an x-axis offset value for each relative offset matrix through a second formula and the initial scaling, to obtain the x-axis offset value corresponding to each video frame image, wherein the second formula is:
Δx = -H(0,2) * Zoom,
wherein Δx is the x-axis offset value, Zoom is the initial scaling, and H(0,2) is the element in row 0, column 2 of the relative offset matrix;
calculating a y-axis offset value for each relative offset matrix through a third formula and the initial scaling, to obtain the y-axis offset value corresponding to each video frame image, wherein the third formula is:
Δy = -H(1,2) * Zoom,
wherein Δy is the y-axis offset value, Zoom is the initial scaling, and H(1,2) is the element in row 1, column 2 of the relative offset matrix;
calculating a z-axis offset value for each relative offset matrix through a fourth formula and the initial scaling, to obtain the z-axis offset value corresponding to each video frame image, wherein the fourth formula is:
Δz = -H(2,2) * Zoom,
wherein Δz is the z-axis offset value, Zoom is the initial scaling, and H(2,2) is the element in row 2, column 2 of the relative offset matrix;
judging whether the following conditions are satisfied: the difference between the angle offset value corresponding to the current video frame image and the angle offset value corresponding to the previous video frame image is greater than or equal to a preset angle offset threshold; the difference between the x-axis offset value corresponding to the current video frame image and the x-axis offset value corresponding to the previous video frame image is greater than or equal to a preset x-axis offset threshold; the difference between the y-axis offset value corresponding to the current video frame image and the y-axis offset value corresponding to the previous video frame image is greater than or equal to a preset y-axis offset threshold; and the difference between the z-axis offset value corresponding to the current video frame image and the z-axis offset value corresponding to the previous video frame image is greater than or equal to a preset z-axis offset threshold;
if all of the conditions are met simultaneously, reversely modifying the initial scaling according to a preset first modification value to obtain a first scaling, and taking the first scaling as the modified scaling;
if any condition is not met, modifying the initial scaling according to a preset second modification value to obtain a second scaling, and taking the second scaling as the modified scaling;
and performing matrix conversion on the angle offset value, the x-axis offset value, the y-axis offset value, the z-axis offset value, and the modified scaling, to obtain the perspective projection matrix corresponding to each video frame image.
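The four formulas of claim 4 and the threshold test can be written down directly. In this Python sketch the function names, the (angle, x, y, z) tuple ordering, the use of absolute differences, and the exact directions of the "reverse" and forward scaling modifications are illustrative assumptions, since the claim does not fix them:

```python
import math

def offset_values(H, zoom):
    """The four offset values of claim 4, from a 3x3 relative offset
    matrix H (H[r][c] = row r, column c) and the initial scaling."""
    d_theta = math.asin(H[0][1]) * 180.0 / math.pi  # angle offset, degrees
    d_x = -H[0][2] * zoom                           # x-axis offset
    d_y = -H[1][2] * zoom                           # y-axis offset
    d_z = -H[2][2] * zoom                           # z-axis offset
    return d_theta, d_x, d_y, d_z

def adjust_zoom(prev, curr, zoom, thresholds, first_mod, second_mod):
    """Threshold test of claim 4: if every offset difference meets or
    exceeds its threshold, modify the scaling 'reversely' (here assumed
    to mean subtracting first_mod); otherwise add second_mod.
    prev, curr, and thresholds are (angle, x, y, z) tuples."""
    all_met = all(abs(c - p) >= t for p, c, t in zip(prev, curr, thresholds))
    return zoom - first_mod if all_met else zoom + second_mod
```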
5. The marine surveillance video picture de-jittering method according to claim 1, wherein performing rendering transformation compensation on each video frame image according to each perspective projection matrix to obtain the de-jittered video picture corresponding to each video frame image comprises:
performing perspective transformation on each video frame image according to each perspective projection matrix, to obtain a transformed image corresponding to each video frame image;
and rendering each transformed image, to obtain the de-jittered video picture corresponding to each video frame image.
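Perspective transformation applies the 3x3 perspective projection matrix to image coordinates with a homogeneous division; a minimal per-point Python sketch (the function name is illustrative, and real renderers apply this per pixel with interpolation) is:

```python
def warp_point(H, x, y):
    """Apply a 3x3 perspective projection matrix H to the image point
    (x, y), including the homogeneous division that makes the mapping
    perspective rather than merely affine."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    wh = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / wh, yh / wh
```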
6. A marine surveillance video picture de-jittering apparatus, characterized by comprising:
the image preprocessing module is used for importing video data, wherein the video data comprises a plurality of video frame images to be processed, and preprocessing the plurality of video frame images to be processed to obtain a reference frame image and a plurality of video frame images;
the offset vector analysis module is used for analyzing the offset vectors of the reference frame image and the video frame images to obtain offset vectors corresponding to the video frame images;
the matrix construction module is used for constructing a perspective projection matrix corresponding to each video frame image through each offset vector;
the de-jittered video picture obtaining module, used for respectively performing rendering transformation compensation on each video frame image according to each perspective projection matrix, to obtain a de-jittered video picture corresponding to each video frame image;
the matrix construction module being specifically configured for:
respectively performing matrix construction on each offset vector using a Matrix3D matrix tool, to obtain a first homotype matrix corresponding to each video frame image and a second homotype matrix corresponding to each video frame image;
respectively restoring each first homotype matrix and each second homotype matrix according to a preset initial scaling ratio to obtain a restored first homotype matrix corresponding to each video frame image and a restored second homotype matrix corresponding to each video frame image;
respectively carrying out cyclic subtraction conversion on each restored first homotype matrix and each restored second homotype matrix to obtain a relative offset matrix corresponding to each video frame image;
and importing an initial scaling, and analyzing offset values of the relative offset matrices through the initial scaling, to obtain the perspective projection matrices corresponding to the video frame images.
7. The marine surveillance video picture de-jittering apparatus according to claim 6, wherein the plurality of video frame images to be processed are sequentially arranged, and the image preprocessing module is specifically configured to:
respectively carrying out image format conversion on each video frame image to be processed in a set format to obtain converted images corresponding to each video frame image to be processed;
scaling the converted images respectively to obtain scaled images corresponding to the video frame images to be processed;
and respectively carrying out gray level conversion on the scaled images to obtain video frame images corresponding to the video frame images to be processed, and taking the first video frame image as a reference frame image.
8. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the marine surveillance video picture de-jittering method according to any one of claims 1 to 5.
CN202111156440.2A 2021-09-30 2021-09-30 Method and device for removing tremble of marine monitoring video picture and storage medium Active CN113905147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111156440.2A CN113905147B (en) 2021-09-30 2021-09-30 Method and device for removing tremble of marine monitoring video picture and storage medium

Publications (2)

Publication Number Publication Date
CN113905147A CN113905147A (en) 2022-01-07
CN113905147B true CN113905147B (en) 2023-10-03

Family

ID=79189440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111156440.2A Active CN113905147B (en) 2021-09-30 2021-09-30 Method and device for removing tremble of marine monitoring video picture and storage medium

Country Status (1)

Country Link
CN (1) CN113905147B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546043B (en) * 2022-03-31 2023-08-18 荣耀终端有限公司 Video processing method and related equipment thereof
CN114531549B (en) * 2022-04-22 2022-08-09 浙江大华技术股份有限公司 Image acquisition method, electronic device, and computer-readable storage medium
CN115567658B (en) * 2022-12-05 2023-02-28 泉州艾奇科技有限公司 Method and device for keeping image not deflecting and visual earpick

Citations (6)

Publication number Priority date Publication date Assignee Title
CN102395043A (en) * 2011-11-11 2012-03-28 北京声迅电子股份有限公司 Video quality diagnosing method
CN102598683A (en) * 2010-09-17 2012-07-18 松下电器产业株式会社 Stereoscopic video creation device and stereoscopic video creation method
CN105976330A (en) * 2016-04-27 2016-09-28 大连理工大学 Embedded foggy-weather real-time video image stabilization method
CN109327712A (en) * 2018-09-18 2019-02-12 中国科学院自动化研究所 Video de-shake method for fixed scenes
CN112929562A (en) * 2021-01-20 2021-06-08 北京百度网讯科技有限公司 Video jitter processing method, device, equipment and storage medium
CN113286194A (en) * 2020-02-20 2021-08-20 北京三星通信技术研究有限公司 Video processing method and device, electronic equipment and readable storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US9294676B2 (en) * 2012-03-06 2016-03-22 Apple Inc. Choosing optimal correction in video stabilization
US10580148B2 (en) * 2017-12-21 2020-03-03 Microsoft Technology Licensing, Llc Graphical coordinate system transform for video frames


Non-Patent Citations (1)

Title
Research on video image stabilization algorithm based on SIFT feature matching; Wang Feng; Cui Jianzhu; Li Zhipeng; Information Security and Technology (Issue 10); full text *

Similar Documents

Publication Publication Date Title
CN113905147B (en) Method and device for removing tremble of marine monitoring video picture and storage medium
CN111275626B (en) Video deblurring method, device and equipment based on ambiguity
EP2164040B1 (en) System and method for high quality image and video upscaling
CN108694705B (en) A method for multi-frame image registration and fusion denoising
Zhu et al. Removing atmospheric turbulence via space-invariant deconvolution
US9615039B2 (en) Systems and methods for reducing noise in video streams
US8194184B2 (en) Method and apparatus for increasing the frame rate of a video signal
US9202258B2 (en) Video retargeting using content-dependent scaling vectors
CN103136734B (en) The suppressing method of edge Halo effect during a kind of convex set projection super-resolution image reconstruction
CN102073993A (en) A method and device for deblurring shaken video based on camera self-calibration
Zeng et al. A generalized DAMRF image modeling for superresolution of license plates
CN109688327B (en) Method and device for preventing panoramic video from shaking and portable terminal
JP7253621B2 (en) Image stabilization method for panorama video and portable terminal
CN108765317A (en) A kind of combined optimization method that space-time consistency is stablized with eigencenter EMD adaptive videos
WO2009091081A1 (en) Systems and methods for video processing based on motion-aligned spatio-temporal steering kernel regression
CN109743495A (en) Video image electronic stability augmentation method and device
Deguchi et al. E2gs: Event enhanced gaussian splatting
Zhao et al. Motion-blurred image restoration framework based on parameter estimation and fuzzy radial basis function neural networks
CN106846250A (en) A kind of super resolution ratio reconstruction method based on multi-scale filtering
Niu et al. Warp propagation for video resizing
US20130120461A1 (en) Image processor and image processing method
CN113497886A (en) Video processing method, terminal device and computer-readable storage medium
Mohan Adaptive super-resolution image reconstruction with lorentzian error norm
Ihm et al. Low-cost depth camera pose tracking for mobile platforms
CN116156075A (en) Method and device for predicting relative motion of target in video and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant