CN119850409A

CN119850409A - Panorama stitching method, camera and storage medium

Info

Publication number: CN119850409A
Application number: CN202411959991.6A
Authority: CN
Inventors: 张红兵; 张武超; 崔星星
Original assignee: Shenzhen Zhengcheng Technology Co ltd
Current assignee: Shenzhen Zhengcheng Technology Co ltd
Priority date: 2024-12-30
Filing date: 2024-12-30
Publication date: 2025-04-18

Abstract

The present invention discloses a panoramic stitching method camera and a storage medium, which is implemented based on a deep learning algorithm and includes the following steps: S1, using two spherical fisheye lenses to obtain two fisheye images and expand them into Equirectangular panoramas to obtain p1 and p2; S2, expanding the panorama of the effective visualization area mask of the two fisheye images in S1 into m1 and m2; S3, roughly fusion panorama, using p1 to fuse p2 to obtain p12, using p2 to fuse p1 to obtain p21; S4, using a deep learning algorithm to estimate the optical flow information of p12 and p21 in S3 respectively; S5, designing a transition zone area fusion coefficient; S6, using the optical flow information in S4 combined with the fusion coefficient in S5 to weight each coefficient of the transition zone area, and then introducing a ghost coefficient ρ to calculate the specific color value of each pixel; S7, fine fusion panorama; S8, panoramic image output. The present invention has the advantages of high stitching accuracy and stability, good real-time stitching, good consistency of dynamic scene pictures, good stitching line smoothing effect, etc.

Description

Panorama stitching method, camera and storage medium

Technical Field

The invention belongs to the technical field of image processing computer vision algorithms, and particularly relates to a panoramic stitching method, a camera and a storage medium.

Background

In recent years, a 360-degree motion camera is becoming an important device in panoramic image production due to its wide field of view and immersive experience. The panoramic video is captured through multiple lenses, such as two spherical fisheye lenses, and images of multiple fields of view are spliced into a complete panoramic picture, so that an omnibearing visual effect is provided for a user, and the panoramic camera is a core function of the camera. However, implementation of panorama stitching typically involves complex image processing techniques, the quality of which directly affects the user experience. At present, the panoramic stitching technology is mainly based on stitching of feature point matching and stitching based on optical flow, wherein the feature point matching method calculates a geometric transformation matrix between two images by detecting feature points in the images, such as SIFT, ORB and the like, and completes image stitching, and the image stitching is estimated by analyzing motion vectors among pixels based on an optical flow algorithm. However, the optical flow calculation has high requirements on precision and calculation amount, and the real-time performance and quality of splicing are directly affected.

Although existing optical flow algorithms represent a certain advantage in panorama stitching, the following problems and disadvantages still remain:

In a scene with light change, rapid movement or less texture, optical flow estimation is easy to be misaligned, so that obvious dislocation or edge discontinuity appears near a splicing line, and the picture quality is influenced;

The optical flow algorithm has high requirements on computing resources, and particularly under the condition of high resolution or multi-lens input, the optical flow algorithm is difficult to realize high-quality and low-delay splicing under the condition of limited hardware, and cannot meet the real-time processing requirements;

For scenes containing fast moving objects or complex dynamic contents, the existing optical flow algorithm is difficult to accurately estimate the motion trail of the objects, so that the dynamic objects in the spliced picture are ghost or distorted;

in multi-shot stitching, the optical flow algorithm requires simultaneous processing of geometric distortion between shots and suture smoothing of the field boundaries. Existing methods often have difficulty in achieving both local detail and overall picture consistency, resulting in possible blurring or artifacts in the transition region.

Disclosure of Invention

The invention provides a panoramic stitching method, a camera and a storage medium for solving the defects in the prior art.

In order to solve the technical problems, the invention adopts the following technical scheme:

the panoramic stitching method is realized based on a deep learning algorithm and comprises the following steps:

s1, obtaining two fisheye images by using two spherical fisheye lenses, and expanding the fisheye images into Equirectangular panoramic pictures to obtain p1 and p2;

s2, developing panoramic images of the visual area mask effective for the two fisheye images in the S1 into m1 and m2;

S3, roughly fusing the panoramic image, fusing p2 by using p1 to obtain p12, and fusing p1 by using p2 to obtain p21;

S4, performing optical flow information estimation on p12 and p21 in the S3 by using a deep learning algorithm;

s5, designing a transition zone region fusion coefficient, wherein the calculation formula is as follows:

Wherein, blend _ij is a fusion coefficient, and varies between [0,1], fmin _ij and Bmin _ij respectively represent the nearest distances from m1 and m2 between the (i, j) pixel points corresponding to the two spherical fisheye lenses;

S6, weighting each coefficient of the transition zone region by utilizing the optical flow information in the S4 and combining the fusion coefficient in the S5, and then introducing a ghost coefficient rho to calculate a specific color value of each pixel;

s7, finely fusing the panoramic images;

S8, outputting the panoramic image.

Preferably, the fusion of p12 in S3 is calculated as:

p12=p1 (m 1) +p2 (| m 1), where (m 1) represents the effective visualization area mask of the fisheye image in S2 expanded panorama m1, and (| m1 represents the effective visualization area mask of the fisheye image in S2 inverted in expanded panorama m 1.

Preferably, the fusion of p21 in S3 is calculated as:

p21=p2 (m 2) +p1 (| m 2), where (m 2) represents the effective visualization area mask of the fisheye image in S2 expanded panorama m2, and (| m 2) represents the effective visualization area mask of the fisheye image in S2 inverted in expanded panorama m 2.

Preferably, the deep learning algorithm is any one of fastFlowNet algorithm, RAFT algorithm or memFlowNet algorithm.

Preferably, in the fine fusion of the panoramic image in S7, for each pixel (x, y) in the panoramic image, an optical flow vector fLR (x, y) from a first fisheye image to a second fisheye image of the two fisheye images and an optical flow vector fRL (x, y) from the second fisheye image to the first fisheye image are used, and fusion weights wF (x, y) and wB (x, y) of the two images are calculated according to a fusion mask m (x, y), where a calculation formula of wF (x, y) is:

wF(x,y)=b(x,y);

And the calculation formula of wB (x, y) is as follows:

wB(x,y)=1-b(x,y);

and then calculating the color value C (x, y) of the pixel (x, y) in the fused panoramic image according to the fusion weights wF (x, y) and wB (x, y) of the two images and the ghost coefficient rho.

Preferably, the calculation formula of the color value C (x, y) of the pixel (x, y) in the panoramic image is as follows:

C(x,y)=CF(x,y)*wF(x,y)*rho(x,y)+CB(x,y)*wB(x,y)*rho(x,y)。

The invention also discloses a camera, which comprises a processor and a memory, wherein the processor is used for executing the steps of the panorama stitching method by calling the programs or instructions stored in the memory.

The invention also discloses a storage medium, and the computer readable storage medium stores a program or instructions for executing the steps of the panorama stitching method.

By adopting the technical scheme, the invention has the following beneficial effects:

(1) According to the invention, an optical flow estimation model can be utilized to realize a splicing effect with higher accuracy and high robustness, and in the prior art, in a complex scene with less illumination change, rapid motion and texture, a motion vector cannot be effectively estimated, so that the problems of dislocation, cracks or edge discontinuity of the boundary of a spliced picture exist;

(2) In the prior art, the calculation amount is large in a high-resolution and multi-lens scene, the time cost is high, and the requirement of real-time video stitching is difficult to meet;

(3) The invention designs a special optical flow splicing strategy, which can enhance the processing capability of object boundaries and details in the dynamic scene, realize smooth transition of pictures and optimize the picture consistency of the dynamic scene;

(4) In a stitching region spliced by multiple lenses, consistency of global pictures and fidelity of local details are difficult to be considered in the traditional method, so that stitching lines are fuzzy or artifact is caused, processing of stitching lines is optimized, visual artifact is reduced, stitching line smoothing effect and geometric distortion correction are improved by integrating an optical flow estimation and geometric correction module, two frames of images requiring adjacent moments are input in the prior art, different spherical fisheye lenses with common vision regions are input in the invention, two spherical fisheye lenses with common vision regions are simultaneously input to obtain an unfolding Equirectangular panorama of the two fisheye images, displacement of each pixel is output, and better technical effect can be achieved;

in conclusion, the method has the advantages of high splicing accuracy and stability, good splicing instantaneity, good consistency of dynamic scene pictures, good suture line smoothing effect and the like.

Drawings

Fig. 1 is a schematic flow chart of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made apparent and fully in view of the accompanying drawings, in which some, but not all embodiments of the invention are shown.

The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.

All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

In the description of the present invention, unless explicitly stated or limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected, mechanically connected, electrically connected, directly connected, indirectly connected via an intervening medium, or in communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.

Example 1

In this embodiment, the invention provides a panorama stitching method, a camera and a storage medium, in particular to a panorama stitching method, a camera and a storage medium of a 360-degree motion camera based on optical flow estimation.

As shown in fig. 1, in one embodiment of the present invention, the panorama stitching method is implemented based on a deep learning algorithm, where the deep learning algorithm is any one of fastFlowNet algorithm, RAFT algorithm or memFlowNet algorithm, and the existing deep learning algorithm inputs two frames of images at adjacent time to perform optical flow estimation, in the present invention, equirectangular panorama images of two fisheye images of two spherical fisheye lenses at the same time are input to perform optical flow estimation, and finally displacement of each pixel is output, in the present invention, two fisheye images are obtained by shooting with two fisheye lenses, where the two lenses of the two fisheye lenses are two lenses disposed opposite to each other, and a sum of angles of view of the two lenses is greater than or equal to 360 degrees, and specifically includes the following steps:

s1, two spherical fisheye lenses are utilized to obtain two fisheye images and are unfolded to obtain Equirectangular panoramic images, p1 and p2 are obtained, the two fisheye images are usually images obtained through the two fisheye lenses, the fisheye lenses are special ultra-wide angle lenses which have extremely large visual angles and can usually reach 180 degrees or even exceed 180 degrees, the imaging characteristics are that serious barrel-shaped distortion is generated on the images, the photographed images are in round or approximately round shapes, very wide scenes can be covered in one image, in order to achieve more comprehensive scene coverage, particularly in some scenes needing 360-degree panoramic photographing, two fisheye lenses are adopted to photograph, the two fisheye lenses can be installed on the same equipment, the images are obtained from different angles or different directions, the two lenses are respectively oriented in different directions, the visual angle ranges of the two lenses can be combined to cover the omni-directional scene as much as possible, namely, the photographed images are in back-arranged two lenses, the specific panoramic view is also called a two-dimensional panoramic plane view map, and the panoramic view is a specific panoramic image is shown as a two-dimensional panoramic plane map 360-dimensional panoramic image. The vertical direction shows 180 degrees of visual angles, the height of the corresponding image is generally 2:1, the aspect ratio is generally 2:1, the appearance form of a rectangular image is integrally presented, when equipment which can capture wide visual angles such as a double-fisheye lens is used for shooting, the obtained image is often highly distorted and deformed, and can be divided into different parts (such as circular images which are respectively shot by the two fisheye lenses), the circular images are inconvenient to directly use for displaying or are subsequently used in application scenes requiring a conventional image format, the original image with special visual angles and the distorted shape is required to be unfolded into a equirectgular panoramic image, the original image with special visual angles and the distorted shape is converted into the panoramic image conforming to the rule of equidistant columnar projection through a series of mathematical transformation, image splicing, correction and other processing technologies, in particular, pixel points in the original image are required to be repositioned and adjusted according to an imaging model (such as equidistant projection, stereoscopic projection and other different optical models of the fisheye lens), the circular images shot by the double-fisheye lens are firstly carried out on the circular images shot by the two fisheye lenses respectively, the two images are required to be subjected to the actual image with the distortion, and the rectangular images can be independently and completely fused in the horizontal direction according to the requirements of the rule of the rectangular image with the rectangular projection, the distortion can be completely fused in the rectangular image with the rectangular image which is required to be completely-shaped, and the rectangular image with the distortion is required to be completely-shaped, and the rectangular image can be completely fused with the rectangular image is required to be completely and completely-fused;

S2, developing a panorama image into m1 and m2 for the effective visual area mask of the two fisheye images in S1, wherein in the field of image processing, the mask generally refers to a binary image, is used for marking out the interested area or a specific part needing to be focused and processed, the visual area mask is a part which is specially defined to be effective in the subsequent visual operation, other areas are excluded, in the image shot by using the fisheye lens, due to the characteristics of the lens, the image edge may have a dark angle, serious distortion and other areas with poor imaging effect, or during panoramic image acquisition, some parts which do not need to be displayed due to equipment shielding and other reasons exist, the pixels corresponding to the ineffective areas can be marked as 0 (which means not focused and not processed), and the pixels of the effective area are marked as 1 (or other proper important values which mean subsequent processing and displaying), so as to define the visual area;

S3, roughly fusing the panoramic image, fusing p2 by using p1 to obtain p12, fusing p1 by using p2 to obtain p21, and fusing p21 in S3 is calculated as follows:

p21=p2 (m 2) +p1 (|m2), where (m 2) represents the effective visualization area mask of the fisheye image in S2 expanded panorama m2, and (|m2) represents the effective visualization area mask of the fisheye image in S2 inverted in expanded panorama m 2;

the fusion calculation of p21 in S3 is:

p21=p2 (m 2) +p1 (| m 2), where (m 2) represents that the effective visual area mask of the fisheye image in S2 is unfolded and (| m 2) represents that the effective visual area mask of the fisheye image in S2 is unfolded and inverted, when the panorama is manufactured, for example, a plurality of images are obtained by shooting from different directions through double fisheye lenses, the rough fusion stage is to splice the images together approximately, taking the images shot by the two fisheye lenses as an example, firstly, a simple positional alignment is performed on the circular images shot by each of the two fisheye lenses, then a corresponding positional relationship is found according to some characteristic points (such as obvious markers, edges and the like in the picture) in the images, and then the overlapped parts of the two images are subjected to preliminary superposition and splicing, and at this time, the overlapped parts may be simply processed in a relatively simple and direct manner according to average pixel values and the like, so that the overlapped parts may form a complete image, but the overlapped images may have a rough and inconstant color and rough fusion effect in the overlapped areas. Fine fusion operations such as fine color correction, light and shadow adjustment, splicing trace elimination and the like are needed to be carried out later;

S4, optical flow estimation, namely, carrying out optical flow information estimation on p12 and p21 in the S3 by utilizing a deep learning algorithm, and more particularly, improving the accuracy and the robustness of the optical flow estimation under a complex scene by introducing a multi-scale feature extraction mechanism and a context sensing module;

S6, weighting each coefficient of the transition zone region by combining the optical flow information in S4 with the fusion coefficient in S5, and then introducing a ghost coefficient rho to calculate a specific color value of each pixel, so that visual artifacts of the stitching region are obviously reduced;

S7, finely fusing the panoramic image, wherein when the panoramic image is finely fused in S7, for each pixel (x, y) in the panoramic image, an optical flow vector fLR (x, y) from a first fisheye image to a second fisheye image in the two fisheye images and an optical flow vector fRL (x, y) from the second fisheye image to the first fisheye image are utilized, and fusion weights wF (x, y) and wB (x, y) of the two images are calculated according to a fusion mask m (x, y), wherein a calculation formula of wF (x, y) is as follows:

wF(x,y)=b(x,y);

And the calculation formula of wB (x, y) is as follows:

wB(x,y)=1-b(x,y);

More specifically, the calculation formula of the color value C (x, y) of the pixel (x, y) in the panoramic image is as follows:

C(x,y)=CF(x,y)*wF(x,y)*rho(x,y)+CB(x,y)*wB(x,y)*rho(x,y);

S8, outputting the panoramic image.

The invention can be divided into two parts, wherein the first part is to make optical flow estimation on the panoramic image unfolded by the two double-fisheye images, and the second part is to make local fusion on the overlapping area, namely geometric distortion correction.

The invention has the following advantages:

The method has the advantages that the splicing precision and stability are improved, in complex scenes with less illumination change, rapid motion and texture, the motion vector cannot be effectively estimated by the existing optical flow algorithm, so that the problems of dislocation, cracks or edge discontinuity exist on the boundary of a spliced picture, and the optical flow estimation model can be efficiently utilized to realize the splicing effect with higher precision and robustness;

The real-time performance of a splicing algorithm is improved, the traditional optical flow algorithm has large calculated amount and high time cost in high-resolution and multi-lens scenes, and the requirements of real-time video splicing are difficult to meet; according to the invention, by adopting a light optical flow network model, the calculation complexity is greatly reduced while the precision is ensured, and the real-time processing performance is improved;

Optimizing the picture consistency of the dynamic scene, wherein the moving objects in the dynamic scene are easy to introduce splicing errors (such as double images or blurring), the invention designs a special optical flow splicing strategy, enhances the processing capacity of object boundaries and details in the dynamic scene, and realizes smooth transition of pictures;

The method has the advantages that the suture line smoothing effect and geometric distortion correction are improved, in a suture region spliced by multiple lenses, consistency of global pictures and fidelity of local details are difficult to be considered in the traditional method, so that the splice line is fuzzy or artifact is caused;

The invention mainly solves various defects existing in the traditional panoramic stitching technology, has the advantages of high-precision optical flow stitching, dynamic scene optimization and panoramic picture quality improvement compared with the prior art, utilizes light-weight and high-efficiency optical flow estimation, improves the precision and continuity near a stitching line, reduces edge dislocation and cracks, remarkably reduces ghost and blurring phenomena aiming at a self-adaptive stitching strategy of a dynamic object, enhances the picture fluency, reduces visual artifacts and blurring transition through rough and fine fusion strategy and suture line optimization, realizes seamless panoramic effect and improves user experience;

The invention also discloses a storage medium, and the computer readable storage medium stores a program or instructions for executing the steps of the panorama stitching method. .

The present embodiment is not limited in any way by the shape, material, structure, etc. of the present invention, and any simple modification, equivalent variation and modification made to the above embodiments according to the technical substance of the present invention are all included in the scope of protection of the technical solution of the present invention.

Claims

1. The panoramic stitching method is realized based on a deep learning algorithm and is characterized by comprising the following steps of:

s7, finely fusing the panoramic images;

S8, outputting the panoramic image.

2. The panorama stitching method according to claim 1, wherein the fusion of p12 in S3 is calculated as:

p12=p1 (m 1) +p2 (| m 1), where (m 1) represents the effective visualization area mask of the fisheye image in S2 expanded panorama m1, and (| m 1) represents the effective visualization area mask of the fisheye image in S2 inverted in expanded panorama m 1.

3. The panorama stitching method according to claim 1, wherein the fusion of p21 in S3 is calculated as:

4. The panorama stitching method according to claim 1, wherein the deep learning algorithm is any one of fastFlowNet algorithm, RAFT algorithm or memFlowNet algorithm.

5. The panorama stitching method according to claim 1, wherein when the panorama is fused in S7, for each pixel (x, y) in the panorama, using the optical flow vector fLR (x, y) from the first to the second of the two fisheye images and the optical flow vector fRL (x, y) from the second to the first fisheye image, the fusion weights wF (x, y) and wB (x, y) of the two images are calculated according to the fusion mask m (x, y), wherein the calculation formula of wF (x, y) is:

wF(x,y)=b(x,y);

And the calculation formula of wB (x, y) is as follows:

wB(x,y)=1-b(x,y);

6. The panorama stitching method according to claim 5, wherein the color value C (x, y) of the pixel (x, y) in the panoramic image is calculated as:

C(x,y)=CF(x,y)*wF(x,y)*rho(x,y)+CB(x,y)*wB(x,y)*rho(x,y)。

7. Camera, characterized in that it comprises a processor and a memory, said processor being adapted to perform the steps of the panorama stitching method according to any one of claims 1-6 by invoking a program or instructions stored in said memory.

8. A storage medium characterized in that the computer readable storage medium stores a program or instructions for performing the steps of the panorama stitching method according to any one of claims 1-6.