CN102098527B - Method and device for transforming two dimensions into three dimensions based on motion analysis

Info

Publication number: CN102098527B
Application number: CN 201110032204
Authority: CN
Inventors: 戴琼海; 张佳宏; 曹汛; 王好谦
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2011-01-28
Filing date: 2011-01-28
Publication date: 2013-04-10
Anticipated expiration: 2031-01-28
Also published as: CN102098527A

Abstract

The invention discloses a plane-to-stereo method based on motion analysis, comprising the following steps: inputting a single-channel image and preprocessing it; reading each frame of the single-channel image as the current frame, and selecting n frames adjacent to it as frames to be matched through dissimilarity detection; computing motion vectors from the frames to be matched and the current frame, and using the motion vectors for motion analysis and judgment to obtain a final matching frame; performing image transformation on the current frame and the final matching frame to obtain a first stereoscopic input image and a second stereoscopic input image; and performing stereoscopic synthesis on the first and second stereoscopic input images to obtain a stereoscopic video, and outputting the stereoscopic video. The method of the embodiment of the invention has a wide application range, fast computation, high image and video quality, and a good stereoscopic effect. The invention also discloses a plane-to-stereo device based on motion analysis.

Description

A method and device for converting plane to stereo based on motion analysis

Technical field

The invention relates to the field of computer vision processing, and in particular to a plane-to-stereo method and device based on motion analysis.

Background

Stereoscopic video, a development direction of future multimedia technology, is a new type of video technology that can convey a sense of depth. Compared with single-channel video, stereoscopic video generally has two video channels and a much larger amount of data, so efficient compression of stereoscopic video is particularly important. Specifically, stereoscopic video contains not only the surface information about the scene found in traditional two-dimensional video, but also three-dimensional information related to specific positions in the scene. Compared with traditional two-dimensional video, stereoscopic video is a more effective and realistic form of expression. It overcomes the one-sidedness and passivity of two-dimensional video, satisfies people's visual needs more fully, and has broad application prospects in interactive free viewpoint video (FVV), virtual reality, 3DTV, 3D games, live sports, advertising media, and many other fields.

In stereoscopic video technology, stereoscopic perception is achieved through binocular stereo vision. Specifically, binocular stereo vision applies the principle of binocular imaging: by simulating binocular perception, left and right images or videos are supplied as the inputs to the left and right eyes, and the brain fuses them to reconstruct the stereoscopic scene of the two input images or videos, producing stereoscopic perception. The binocular imaging principle follows from how the two eyes perceive the world: when the left and right eyes view a specific scene, two images with a certain parallax are formed on the left and right retinas, and the brain fuses them to reconstruct the actual scene with a sense of front-to-back depth.

While stereoscopic display technology is developing rapidly, the lack of stereoscopic content seriously restricts the development of the whole stereoscopic video industry. In addition, existing stereoscopic video techniques involve complex processing, yield low imaging quality, and give a poor overall stereoscopic display effect.

Summary of the invention

The purpose of the present invention is to solve at least one of the above technical defects. In particular, it proposes a plane-to-stereo method and device based on motion analysis that have a wide application range, fast computation, high image quality, and a good stereoscopic effect.

To achieve the above purpose, an embodiment of the first aspect of the present invention proposes a plane-to-stereo method based on motion analysis, including the following steps:

inputting a single-channel image, and preprocessing the single-channel image;

reading each frame of the single-channel image as a current frame, and selecting n frames adjacent to the current frame as frames to be matched through dissimilarity detection;

computing motion vectors from the frames to be matched and each frame of the single-channel image, and using the motion vectors to perform motion analysis and judgment to obtain a final matching frame;

performing image transformation on the current frame and the matching frame to obtain a first stereoscopic input image and a second stereoscopic input image; and

performing stereoscopic synthesis on the first stereoscopic input image and the second stereoscopic input image to obtain a stereoscopic video, and outputting the stereoscopic video.

The plane-to-stereo method based on motion analysis according to the embodiment of the present invention has the following advantages:

(1) Wide application range: the method provides different solutions for different image or video input conditions, and therefore applies to a wide range of inputs.

(2) Fast computation: the adjacent-frame selection method and the block sampling method reduce the total amount of computation, and the motion analysis and judgment algorithm adopted by the invention is simple, practical, and efficient.

(3) High image and video quality: both the current frame and the matching frame come from the image sequence or the video itself, and the image transformation of the current frame and the matching frame is performed on a preset region or on the whole image, so none of the quality of the image or video itself is lost.

(4) Good stereoscopic effect: the linear, nonlinear, and preset-region-based image transformation methods give the synthesized stereoscopic video a clear stereoscopic effect, without obvious depth perception errors or inter-frame jitter.

An embodiment of the second aspect of the present invention proposes an image stereoscopic conversion device, including: a single-channel input and preprocessing module, configured to input a single-channel image and preprocess it; a to-be-matched frame selection module, configured to select, for each current frame of the single-channel image from the single-channel input and preprocessing module, n adjacent frames as frames to be matched through dissimilarity detection; a motion analysis and judgment module, configured to compute motion vectors from the frames to be matched supplied by the to-be-matched frame selection module and each frame of the single-channel image, and to use the motion vectors for motion analysis and judgment to obtain a final matching frame; an image transformation module, configured to perform image transformation on the current frame and the matching frame from the motion analysis and judgment module to obtain a first stereoscopic input image and a second stereoscopic input image; and a stereoscopic synthesis module, configured to perform stereoscopic synthesis on the first and second stereoscopic input images from the image transformation module to obtain a stereoscopic video, and to output the stereoscopic video.

The plane-to-stereo device based on motion analysis according to the embodiment of the present invention provides different solutions for different image or video input conditions and therefore has a wide application range. The selection of adjacent frames and block sampling reduce the total amount of computation, and the motion analysis and judgment algorithm used by the motion analysis and judgment module is simple, practical, and efficient. Because both the current frame and the matching frame come from the image sequence or the video itself, and the image transformation of the current frame and the matching frame is performed on a preset region or on the whole image, none of the quality of the image or video itself is lost, so image and video quality is high. In addition, the linear, nonlinear, and preset-region-based image transformations applied by the invention give the synthesized stereoscopic video a clear stereoscopic effect, without obvious depth perception errors or inter-frame jitter.

Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.

Description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:

FIG. 1 is a flow diagram of the plane-to-stereo method based on motion analysis according to an embodiment of the present invention; and

FIG. 2 is a structural block diagram of the plane-to-stereo device based on motion analysis according to an embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and should not be construed as limiting it.

As shown in FIG. 1, the plane-to-stereo method based on motion analysis according to the embodiment of the present invention includes the following steps:

S101: Input a single-channel image, and preprocess the single-channel image;

First, a single-channel image is input. The single-channel image can be one of the following: a single-channel planar image, a single-channel image sequence, or a single-channel video. The image format of a single-channel planar image or image sequence can be RAW, BMP, PCX, TIFF, GIF, JPEG, TGA, EXIF, etc.; the video format of a single-channel video can be MPEG, AVI, MOV, ASF, WMV, MKV, DIVX, Real Video, etc. Those skilled in the art will understand that the planar image or image sequence can also be in other image formats, and the video in other video formats. The input single-channel image is then preprocessed.

For a single-channel video, preprocessing uses video coding technology to decode the video. In one embodiment of the present invention, a decoder decodes the input single-channel video and outputs a stream in a YUV-family or RGB-family format.

Preprocessing also includes extracting image pixel values from the input single-channel image. Specifically, each frame of the single-channel planar image, image sequence, or decoded video stream is read, and the pixel values in the corresponding color space are extracted.

For the RGB space, the extracted image pixel value can be one of the following three: the triple of the three RGB channel values; the average of the three RGB channel values; or a weighted value obtained by summing the three RGB channels with given proportional coefficients. In the last case, the weighted value is calculated by the following formula,

Gray = R*0.3 + G*0.59 + B*0.11,

where Gray is the weighted value. Those skilled in the art will understand that weighted values obtained with other coefficients also fall within the protection scope of the present invention.
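The channel-averaging and weighted-sum options can be sketched as follows. This is a minimal illustration for a single pixel; the function names `to_gray` and `channel_mean` are hypothetical and do not appear in the patent, and the default weights are the ones given in the formula above.

```python
def to_gray(r, g, b, weights=(0.3, 0.59, 0.11)):
    """Weighted sum of the R, G, B values of one pixel (weights from the text)."""
    wr, wg, wb = weights
    return r * wr + g * wg + b * wb


def channel_mean(r, g, b):
    """Plain average of the three channel values of one pixel."""
    return (r + g + b) / 3.0
```

Passing a different `weights` tuple covers the "other coefficients" case mentioned in the text.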

For the YUV space, the extracted image pixel value can be one of the following three: the Y component; the average of the three YUV channels; or a weighted value obtained by summing the three YUV channels with given proportional coefficients.

S102: For each frame of the single-channel image, select frames adjacent to the current frame as frames to be matched;

Because binocular stereo vision relies on the principle of binocular imaging, left and right images are required as the initial input. Therefore, in addition to the input current frame, a matching frame corresponding to the current frame is needed, and the two serve as the two input channels.

When the single-channel image is a single-channel planar image, the matching frame is the current frame itself.

When the single-channel image is a single-channel image sequence or a single-channel video, frames to be matched must be obtained before the matching frame is determined. Specifically, for each frame of the single-channel image, an initial sequence of frames to be matched is selected from the frames adjacent to the current frame. The initial sequence contains multiple initial frames to be matched and consists of the h frames before and the h frames after the current frame on the time axis. In one embodiment of the present invention, h may be 5. Those skilled in the art will understand that h can also take other values.
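A minimal sketch of building the initial candidate window, assuming frames are addressed by integer index. The function name `initial_candidates` is hypothetical, and clipping the window at the start and end of the sequence is an assumption, since the text does not say how boundary frames are handled.

```python
def initial_candidates(current, total_frames, h=5):
    """Indices of up to h frames before and h frames after `current`.

    Assumption: near the sequence boundaries the window is simply clipped.
    """
    start = max(0, current - h)
    stop = min(total_frames - 1, current + h)
    return [i for i in range(start, stop + 1) if i != current]
```

For a frame in the middle of a long sequence with h = 5, this yields ten initial frames to be matched.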

To determine how dissimilar the current frame is from the initial sequence of frames to be matched, dissimilarity detection is performed on every initial frame in the sequence once the sequence has been obtained, which yields the sequence of frames to be matched. This includes the following steps:

A1: The current frame and one initial frame to be matched are each split along the horizontal midline and the vertical midline, so that each is divided evenly into four image blocks whose height and width are 1/2 of the height and width of the original frame;

A2: For each of the four pairs of blocks at corresponding positions in the current frame and the initial frame to be matched, the pixel values are subtracted and the absolute value of the difference is taken as the new pixel value, giving a new image block; the pixel mean and variance are then computed for each of the four new image blocks;

A3: Threshold tests are applied to the pixel mean and the variance of each of the four new image blocks. If the number of blocks whose mean exceeds its preset threshold and the number of blocks whose variance exceeds its preset threshold are both less than or equal to the dissimilarity count threshold, the initial frame is accepted as a frame to be matched for the current frame. If the number of blocks whose mean or variance exceeds its preset threshold is greater than the dissimilarity count threshold, the initial frame is rejected. The preset thresholds comprise a pixel mean threshold and a variance threshold. In one embodiment of the present invention, the pixel mean threshold and the variance threshold may each take any value within [20, 30], and the dissimilarity count threshold may be 3.

Steps A1 to A3 are repeated so that every initial frame in the sequence undergoes one dissimilarity detection against the current frame. When all initial frames have been tested, n frames to be matched are obtained, where n is a positive integer.
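Steps A1 to A3 can be sketched as below for frames stored as 2-D lists of scalar pixel values. The function names are hypothetical. The thresholds of 25 are arbitrary picks from the [20, 30] range given in the text, and the use of population variance is an assumption.

```python
def quadrants(frame):
    """A1: split a 2-D list of pixel values along its horizontal and vertical midlines."""
    rows, cols = len(frame), len(frame[0])
    mr, mc = rows // 2, cols // 2
    spans = [(0, mr, 0, mc), (0, mr, mc, cols), (mr, rows, 0, mc), (mr, rows, mc, cols)]
    return [[row[c0:c1] for row in frame[r0:r1]] for r0, r1, c0, c1 in spans]


def mean_and_variance(values):
    """Mean and population variance of a flat list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return mean, sum((v - mean) ** 2 for v in values) / n


def is_candidate(current, other, mean_thr=25.0, var_thr=25.0, count_thr=3):
    """A2-A3: True if `other` is kept as a frame to be matched for `current`."""
    mean_hits = var_hits = 0
    for blk_a, blk_b in zip(quadrants(current), quadrants(other)):
        # A2: absolute difference block, flattened to one list
        diff = [abs(a - b) for ra, rb in zip(blk_a, blk_b) for a, b in zip(ra, rb)]
        mean, var = mean_and_variance(diff)
        mean_hits += mean > mean_thr
        var_hits += var > var_thr
    # A3: both counts must stay at or below the dissimilarity count threshold
    return mean_hits <= count_thr and var_hits <= count_thr
```

With a count threshold of 3 and only four blocks, a frame is rejected only when all four quadrants exceed the mean threshold or all four exceed the variance threshold.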

S103: Compute motion vectors, and use the motion vectors to perform motion analysis and judgment to obtain the matching frame;

Because the matching frame of a single-channel planar image is the current frame itself, no motion analysis or judgment is needed in that case.

In this step, when the input is a single-channel image sequence or a single-channel video, motion detection is used to determine the type and magnitude of motion in the scene. Specifically, motion vectors are computed between the frames to be matched and the current frame of the image sequence or video, and the motion vectors are used for motion analysis and judgment to obtain the matching frame. This includes the following steps:

In the first step, the image data of the current frame is divided into blocks of a predetermined size, and the blocks are uniformly sampled according to a predetermined sampling count to obtain multiple sampling blocks;

First, the image data is divided into blocks of the predetermined size according to the size of the current frame and the block size. The size of each image block is preset. In one embodiment of the present invention, for a 640*480 image, each image block is 4*4 pixels. The blocks are then uniformly sampled to obtain multiple sampling blocks. The number of sampling points is preset. In one embodiment of the present invention, the number of sampling points is about 5% of the total number of image pixels.
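One way to realise the block division and uniform sampling is a regular grid over the blocks, sketched below. The function name `sample_blocks` is hypothetical. Reading "about 5% of the pixels" as the share of pixels covered by sampled blocks, and rounding the grid step to an integer, are both assumptions; with the defaults the sampled blocks cover 6.25% of a 640*480 frame.

```python
def sample_blocks(width, height, block=4, ratio=0.05):
    """Top-left pixel corners of a uniform grid of block x block sampling blocks.

    Assumption: one block is kept in every step x step group of blocks, with
    step chosen so that 1 / step**2 is close to `ratio`.
    """
    step = max(1, round(ratio ** -0.5))
    cols, rows = width // block, height // block
    return [(bx * block, by * block)
            for by in range(0, rows, step)
            for bx in range(0, cols, step)]
```

A finer or coarser grid is obtained by changing `ratio`; the text leaves the exact sampling pattern open.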

In the second step, the motion offset between each sampling block in the current frame and its matching block in each frame to be matched is taken as the motion vector of that sampling block with respect to that frame to be matched;

For each sampling block in the current frame, the most similar block is searched for in a frame to be matched. The search method may be full search, three-step search, logarithmic search, diamond search, and so on. After the search, the motion offset between each sampling block in the current frame and its matching block in the frame to be matched is taken as the motion vector of that sampling block with respect to that frame. In this way n motion vector maps are obtained, where n is a positive integer.
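The full-search option can be sketched as follows for frames stored as 2-D lists. The sum of absolute differences (SAD) as the similarity measure, the search radius of 4 pixels, and the sign convention (offset from the block position in the current frame to the matching block in the candidate frame) are assumptions; the text names the search strategies but not these details.

```python
def sad(cur, ref, x, y, dx, dy, block):
    """Sum of absolute differences between a block in cur and a shifted block in ref."""
    return sum(abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
               for j in range(block) for i in range(block))


def full_search(cur, ref, x, y, block=4, radius=4):
    """Motion vector (dx, dy) of the block at (x, y), found by exhaustive search."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, x, y, 0, 0, block)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # skip shifts that would move the block outside the reference frame
            if not (0 <= x + dx <= w - block and 0 <= y + dy <= h - block):
                continue
            cost = sad(cur, ref, x, y, dx, dy, block)
            if cost < best_cost:  # strict comparison keeps the zero vector on ties
                best, best_cost = (dx, dy), cost
    return best


def motion_vector_map(cur, ref, samples, block=4, radius=4):
    """One motion vector per sampling block, keyed by the block's top-left corner."""
    return {(x, y): full_search(cur, ref, x, y, block, radius) for x, y in samples}
```

Calling `motion_vector_map` once per frame to be matched gives the n motion vector maps described above.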

In the third step, motion analysis and judgment are performed on the n motion vector maps to obtain the number of zero-motion vector maps and the number of non-zero motion vector maps;

It is determined whether any of the n motion vector maps is a zero-motion vector map. In one embodiment of the present invention, when the sampling blocks with a zero vector account for 90% or more of all sampling blocks, the motion vector map is judged to be a zero-motion vector map. The number of zero-motion vector maps is m. Those skilled in the art will understand that other reasonable values for this proportion also fall within the protection scope of the present invention.
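The zero-motion test reduces to a ratio check, sketched below for a map given as a list of (dx, dy) tuples. The function name is hypothetical, and treating the 90% boundary as inclusive is an assumption.

```python
def is_zero_motion_map(vectors, zero_ratio=0.9):
    """True if zero vectors make up at least `zero_ratio` of all sampled blocks."""
    zeros = sum(1 for v in vectors if v == (0, 0))
    return zeros / len(vectors) >= zero_ratio
```

Counting the maps for which this returns True gives m.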

Given the m zero-motion vector maps obtained above, the number of non-zero motion vector maps is n-m. Horizontal motion vector analysis is then performed on the non-zero motion vector maps. The non-zero motion vector maps comprise maps corresponding to foreground horizontal motion, maps corresponding to horizontal camera movement, and maps corresponding to the remaining cases, that is, all maps other than those corresponding to foreground horizontal motion or horizontal camera movement.

When, among the sampling blocks of a non-zero motion vector map, the blocks whose motion vector is horizontal account for 70% to 90% of the blocks with non-zero vectors, the map is judged to correspond to foreground horizontal motion. The number of such maps is p. Those skilled in the art will understand that other reasonable values for this proportion also fall within the protection scope of the present invention.

When, among the sampling blocks of a non-zero motion vector map, the blocks whose motion vector is horizontal account for 90% or more of the blocks with non-zero vectors, the map is judged to correspond to horizontal camera movement. The number of such maps is q. Those skilled in the art will understand that other reasonable values for this proportion also fall within the protection scope of the present invention.

It follows that the number of motion vector maps corresponding to the remaining cases is n-m-p-q.
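The three-way split of non-zero maps can be sketched as below. The function name and the string labels are hypothetical. Two assumptions are made: a vector counts as horizontal only when its vertical component is exactly zero, and a share of exactly 90% is assigned to camera movement, since the two ranges in the text meet at that value.

```python
def classify_nonzero_map(vectors, fg_low=0.7, cam_low=0.9):
    """Label a non-zero motion vector map as 'camera', 'foreground', or 'other'."""
    nonzero = [v for v in vectors if v != (0, 0)]
    if not nonzero:
        return "other"
    # share of non-zero vectors that point purely left or right
    share = sum(1 for dx, dy in nonzero if dy == 0) / len(nonzero)
    if share >= cam_low:
        return "camera"
    if share >= fg_low:
        return "foreground"
    return "other"
```

Counting the labels over the n-m non-zero maps gives p ("foreground"), q ("camera"), and n-m-p-q ("other").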

Next, motion analysis and judgment are performed on the n-m non-zero motion vector maps. First, the non-zero motion vectors of each non-zero motion vector map are averaged to obtain the motion vector average A(i) of that map, where A is the average and i is the index of the non-zero motion vector map. If the average of a map lies within the mean threshold range, the map is retained; otherwise the map is discarded together with its corresponding frame to be matched.

In one example of the present invention, the mean threshold range is [5, 30] in pixel distance. Those skilled in the art will understand that other reasonable ranges for the mean threshold also fall within the protection scope of the present invention.
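A minimal sketch of the averaging and range check. Interpreting the average A(i) as the mean Euclidean length of the non-zero vectors is an assumption suggested by the threshold being stated as a pixel distance; the function names are hypothetical.

```python
import math


def mean_magnitude(vectors):
    """Mean Euclidean length of the non-zero vectors in a map (0.0 if there are none)."""
    nonzero = [v for v in vectors if v != (0, 0)]
    if not nonzero:
        return 0.0
    return sum(math.hypot(dx, dy) for dx, dy in nonzero) / len(nonzero)


def keep_map(vectors, low=5.0, high=30.0):
    """True if the map's mean magnitude lies inside the [low, high] pixel range."""
    return low <= mean_magnitude(vectors) <= high
```

Maps for which `keep_map` is False would be dropped along with their frames to be matched.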

After the above motion analysis and judgment, the numbers of motion vector maps that meet the requirements are obtained: v1 zero-motion vector maps, v2 maps corresponding to foreground horizontal motion, v3 maps corresponding to horizontal camera movement, and v4 maps corresponding to the remaining cases.

In the fourth step, the matching frame is obtained according to the number of zero-motion vector maps and the number of non-zero motion vector maps.

(1) When the number v1 of zero-motion vector maps is greater than the first threshold R(v1), the scene corresponding to the current frame of the single-channel video or image sequence is judged to be a static scene, and among the frames to be matched that correspond to zero-motion vector maps, the frame farthest from the current frame on the time axis is selected as the matching frame.

In one embodiment of the present invention, the first threshold R(v1) may be set to max(1, n-2), that is, v1 > max(1, n-2). Those skilled in the art will understand that other reasonable values for the first threshold also fall within the protection scope of the present invention.

(2) When the number v2 of motion vector maps corresponding to foreground horizontal motion lies within the second threshold range R(v2), the scene corresponding to the current frame of the single-channel video or image sequence is judged to be a foreground motion scene. Among the v2 maps corresponding to foreground horizontal motion, the map whose average A(v2) differs least from the foreground standard preset value R(a2) is taken as the final foreground horizontal motion map, and the frame to be matched that corresponds to this map is selected as the matching frame.

In one embodiment of the present invention, R(v2) = [n/3, n] and R(a2) = 15. Those skilled in the art will understand that other reasonable values for the second threshold R(v2) also fall within the protection scope of the present invention.

(3)当摄像机水平移动对应的运动矢量图的数量v3大于第三阈值R(v3)时，则判断单路视频或单路图像序列当前帧对应场景为摄像机移动场景，选取v3个摄像机水平移动对应的运动矢量图的平均值A(v3)与摄像机标准预设阈值R(a3)的差值最小的前景水平运动对应的运动矢量图为最终摄像机水平移动对应的运动矢量图，选取最终摄像机水平移动对应的运动矢量图对应的待匹配帧为匹配帧。(3) When the number v3 of the motion vector diagram corresponding to the horizontal movement of the camera is greater than the third threshold R(v3), it is judged that the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a camera movement scene, and v3 cameras are selected to move horizontally The motion vector diagram corresponding to the foreground horizontal movement with the smallest difference between the average value A(v3) of the corresponding motion vector diagram and the camera standard preset threshold R(a3) is the motion vector diagram corresponding to the final camera horizontal movement, and the final camera horizontal movement is selected as The frame to be matched corresponding to the motion vector diagram corresponding to the movement is the matching frame.

In one embodiment of the present invention, R(v3) = [n/2, n] and R(a3) = 10. Those skilled in the art will understand that setting the third threshold R(v3) to other reasonable values also falls within the protection scope of the present invention.

(4) When the number v4 of motion vector maps corresponding to the remaining cases satisfies the fourth threshold R(v4), the scene corresponding to the current frame of the single-channel video or single-channel image sequence is judged to be a scene other than foreground horizontal motion and horizontal camera movement. Among the v4 motion vector maps corresponding to the remaining cases, the map whose average value A(v4) differs least from the remaining-case standard preset threshold R(a4) is taken as the final remaining-case motion vector map, and the frame to be matched that corresponds to this map is selected as the matching frame.

In one embodiment of the present invention, R(v4) = [n-2, n] and R(a4) = 5. Those skilled in the art will understand that setting the fourth threshold R(v4) to other reasonable values also falls within the protection scope of the present invention.

In one embodiment of the present invention, when none of the above four cases is satisfied, for example when a single-channel planar image is input, the current frame is selected as the frame to be matched.
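The selection rule shared by cases (2) to (4) can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the dictionary layout (frame offset mapped to the map average A(i)), and the sample values are assumptions made for the example. Only the threshold values R(v2) = [n/3, n] and R(a2) = 15 come from the embodiment above.

```python
def in_range(count, bounds):
    """True when count lies inside the closed interval bounds = (low, high)."""
    low, high = bounds
    return low <= count <= high


def pick_matching_frame(candidates, count_range, target_avg):
    """Return the frame whose motion vector map average is closest to target_avg.

    candidates maps a frame offset to the average A(i) of its motion vector map.
    Returns None when the number of candidates falls outside count_range,
    so the caller can fall through to the next case.
    """
    if not candidates or not in_range(len(candidates), count_range):
        return None
    return min(candidates, key=lambda idx: abs(candidates[idx] - target_avg))


n = 9  # number of frames to be matched (assumed for this example)
foreground = {-3: 9.0, -1: 14.0, 2: 21.5, 4: 17.0}  # frame offset -> A(i), made-up values
chosen = pick_matching_frame(foreground, (n / 3, n), 15)  # R(v2) and R(a2)
print(chosen)
```

Returning None when the count test fails lets the four cases be tried in order, with the current frame used as the fallback when every case returns None.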

S104: For a foreground motion scene, obtain a disparity map from the matching frame and the current frame, and update the matching frame according to the disparity map and the current frame.

When the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene, a disparity map is first obtained from the matching frame and the current frame, and a new matching frame is then obtained from the disparity map and the current frame; that is, the matching frame is updated. The disparity map is obtained by one of the following algorithms: an optical flow method operating on pixel blocks, a block matching method, a depth matching method operating on individual pixels, and the like.
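Of the algorithms listed, block matching is the simplest to illustrate. The sketch below is an assumption-laden example rather than the method of the invention: it searches only along the horizontal direction, uses the sum of absolute differences (SAD) as the matching cost, prefers the smallest shift when costs tie, and represents grayscale images as lists of rows. The block size, search range, and all names are hypothetical.

```python
def block_sad(img_a, img_b, row, col_a, col_b, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(img_a[row + dy][col_a + dx] - img_b[row + dy][col_b + dx])
    return total


def block_disparity(current, matching, size=2, max_shift=2):
    """Per-block horizontal disparity from current to matching, one value per block."""
    height, width = len(current), len(current[0])
    # Try small shifts first so that ties resolve towards zero disparity.
    shifts = sorted(range(-max_shift, max_shift + 1), key=abs)
    disparity = []
    for row in range(0, height - size + 1, size):
        line = []
        for col in range(0, width - size + 1, size):
            best_shift, best_cost = 0, None
            for shift in shifts:
                target = col + shift
                if target < 0 or target + size > width:
                    continue  # candidate block would leave the image
                cost = block_sad(current, matching, row, col, target, size)
                if best_cost is None or cost < best_cost:
                    best_shift, best_cost = shift, cost
            line.append(best_shift)
        disparity.append(line)
    return disparity


# A bright 2 x 2 patch moves two pixels to the right between the two frames.
current = [[0, 0, 9, 9, 0, 0, 0, 0], [0, 0, 9, 9, 0, 0, 0, 0]]
matching = [[0, 0, 0, 0, 9, 9, 0, 0], [0, 0, 0, 0, 9, 9, 0, 0]]
disp = block_disparity(current, matching)
print(disp[0][1])  # disparity of the block that contains the bright patch
```

Blocks of uniform background have no texture to match against, so their disparity is unreliable; practical systems usually add a confidence test or smoothing step.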

In one embodiment of the present invention, the new matching frame is obtained from the current frame and the disparity map by the DIBR (Depth Image Based Rendering) method.
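The idea behind this update can be sketched as a horizontal forward warp. This is a deliberately simplified illustration: full DIBR normally derives the pixel shift from depth and camera parameters and resolves overlaps by depth ordering, whereas the hypothetical function below takes a per-pixel integer disparity directly, lets later pixels overwrite earlier ones, and fills holes from the nearest pixel on the left.

```python
def warp_with_disparity(frame, disparity):
    """Forward-warp each pixel horizontally by its disparity, then fill holes."""
    height, width = len(frame), len(frame[0])
    warped = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            target = x + disparity[y][x]
            if 0 <= target < width:
                warped[y][target] = frame[y][x]  # later pixels overwrite earlier ones
        # Fill holes left to right from the neighbouring pixel; at the left
        # border fall back to the source pixel.
        for x in range(width):
            if warped[y][x] is None:
                warped[y][x] = warped[y][x - 1] if x > 0 else frame[y][x]
    return warped


frame = [[1, 2, 3, 4]]
disparity = [[0, 1, 1, 0]]
new_matching = warp_with_disparity(frame, disparity)
print(new_matching)
```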

S105: Perform image transformation on the current frame and the matching frame to obtain a first stereoscopic input image and a second stereoscopic input image.

When the single-channel image is a single-channel planar image, or is a single-channel video or single-channel image sequence whose current frame corresponds to a static scene, a camera movement scene, or a remaining case, image transformation is performed on the current frame and the matching frame to obtain the first stereoscopic input image and the second stereoscopic input image.

When the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene, image transformation is performed on the current frame and the updated matching frame to obtain the first stereoscopic input image and the second stereoscopic input image.

Image transformation includes the following modes: linear transformation, nonlinear transformation, and preset-region transformation.

Linear transformation means applying a linear image transformation to the whole image of only the current frame or only the matching frame. In one embodiment of the present invention, the linear transformation may be a translation, stretching, compression, or perspective transformation, among others.

Nonlinear transformation includes image rotation about the X, Y, and Z axes, as well as preset transformation operations applied to each row, each column, or each pixel of the image.

Preset-region transformation means that a region of interest to the user is set in advance and the linear or nonlinear transformation is applied mainly to that preset region. The area outside the preset region is either left untransformed or transformed with a smaller amplitude than the preset region.

In one embodiment of the present invention, the image transformation of the current frame and the matching frame may be a single transformation or a combination of transformations, and it is possible to transform only one of the two frames while leaving the other unchanged.
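Two of the transformation modes described above can be sketched on small grayscale images stored as lists of rows. The example is illustrative only: it covers a horizontal translation (one kind of linear transformation) and a preset-region variant that leaves everything outside the region untouched. The function names, the zero padding, and the region format (top, left, bottom, right) with exclusive end indices are assumptions.

```python
def shift_row(row, offset, fill=0):
    """Translate one row horizontally by offset pixels, padding with fill."""
    width = len(row)
    out = [fill] * width
    for x, value in enumerate(row):
        if 0 <= x + offset < width:
            out[x + offset] = value
    return out


def translate(image, offset, fill=0):
    """Whole-image horizontal translation."""
    return [shift_row(row, offset, fill) for row in image]


def translate_region(image, offset, region, fill=0):
    """Translate only the pixels inside region = (top, left, bottom, right)."""
    top, left, bottom, right = region
    out = [list(row) for row in image]  # copy so the input frame is not modified
    for y in range(top, bottom):
        out[y][left:right] = shift_row(image[y][left:right], offset, fill)
    return out


image = [[1, 2, 3, 4], [5, 6, 7, 8]]
whole = translate(image, 1)
partial = translate_region(image, 1, (0, 1, 1, 3))
print(whole)
print(partial)
```

Applying `translate` to only one of the two frames corresponds to the option of transforming a single frame and leaving the other unchanged.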

S106: Perform stereoscopic synthesis on the first stereoscopic input image and the second stereoscopic input image to obtain a stereoscopic video, and output the stereoscopic video.

In one embodiment of the present invention, the first stereoscopic input image is used as the initial input of the left view, that is, the current frame is used as the initial input of the left view, corresponding to left-eye viewing; the second stereoscopic input image is used as the initial input of the right view, that is, the matching frame is used as the initial input of the right view, corresponding to right-eye viewing. Those skilled in the art will understand that using the current frame as the initial input of the right view and the matching frame as the initial input of the left view also falls within the protection scope of the present invention.

For different display terminals, different stereoscopic video synthesis formats and stereoscopic video output modes are used to synthesize the two input images. Synthesizing the two input images according to a stereoscopic video synthesis format means combining the left and right input images in a suitable way to obtain a single-channel stereoscopic video output that meets the requirements of different stereoscopic displays. The stereoscopic video synthesis formats include checkerboard synthesis, horizontal interlaced synthesis, and the like.

The stereoscopic video output modes include arranging the left and right input images side by side horizontally or vertically; the left and right input images may also be output as two separate channels. In one embodiment of the present invention, the synthesized stereoscopic video may be displayed as a single-channel anaglyph output such as red-green, red-cyan, or red-blue.
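The formats named above can be sketched on images stored as lists of rows. This is an illustrative example under assumptions, not a definitive implementation: "horizontal interlaced" is interpreted here as alternating rows from the left and right views, the anaglyph takes the red channel from the left view and green and blue from the right, and all function names are hypothetical.

```python
def side_by_side(left, right):
    """Horizontal arrangement: each output row is a left row followed by a right row."""
    return [list(l_row) + list(r_row) for l_row, r_row in zip(left, right)]


def top_bottom(left, right):
    """Vertical arrangement: the left view stacked above the right view."""
    return [list(row) for row in left] + [list(row) for row in right]


def row_interleave(left, right):
    """Even rows from the left view, odd rows from the right view."""
    return [list(left[y]) if y % 2 == 0 else list(right[y]) for y in range(len(left))]


def checkerboard(left, right):
    """Pixels alternate between the two views in a checkerboard pattern."""
    return [
        [left[y][x] if (x + y) % 2 == 0 else right[y][x] for x in range(len(left[0]))]
        for y in range(len(left))
    ]


def red_cyan(left_rgb, right_rgb):
    """Anaglyph: red from the left view, green and blue from the right view."""
    return [
        [(l[0], r[1], r[2]) for l, r in zip(l_row, r_row)]
        for l_row, r_row in zip(left_rgb, right_rgb)
    ]


left = [["L", "L"], ["L", "L"]]
right = [["R", "R"], ["R", "R"]]
print(side_by_side(left, right))
print(row_interleave(left, right))
print(checkerboard(left, right))
```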

(4) Good stereoscopic effect: the linear, nonlinear, and preset-region-based image transformations of the embodiments of the present invention give the synthesized stereoscopic video a pronounced stereoscopic effect, without obvious depth perception errors or inter-frame jitter.

A motion-analysis-based plane-to-stereo device 200 according to an embodiment of the present invention is described below with reference to FIG. 2.

As shown in FIG. 2, the motion-analysis-based plane-to-stereo device 200 according to an embodiment of the present invention includes: a single-channel input and preprocessing module 210, which inputs a single-channel image and preprocesses it; a frame-to-be-matched selection module 220, which, for each current frame of the single-channel image from the single-channel input and preprocessing module 210, selects n adjacent frames as frames to be matched by means of dissimilarity detection; a motion analysis and judgment module 230, which obtains motion vectors from the current frame and the frames to be matched supplied by the frame-to-be-matched selection module 220, and uses the motion vectors to perform motion analysis and judgment to obtain a matching frame; an image transformation module 250, which performs image transformation on the current frame and the matching frame from the motion analysis and judgment module 230 to obtain a first stereoscopic input image and a second stereoscopic input image; and a stereoscopic synthesis module 260, which performs stereoscopic synthesis on the first and second stereoscopic input images from the image transformation module 250 to obtain a stereoscopic video, and outputs the stereoscopic video.

In one embodiment of the present invention, the single-channel image input by the single-channel input and preprocessing module 210 may be one of the following: a single-channel planar image, a single-channel image sequence, or a single-channel video. The image format of the single-channel planar image or single-channel image sequence may be RAW, BMP, PCX, TIFF, GIF, JPEG, TGA, EXIF, and so on; the video format of the single-channel video may be MPEG, AVI, MOV, ASF, WMV, MKV, DIVX, Real Video, and so on. Those skilled in the art will understand that the single-channel planar image or image sequence may also be in other image formats, and the single-channel video may also be in other video formats. The input single-channel image is then preprocessed.

For the preprocessing of a single-channel video, the single-channel input and preprocessing module 210 decodes the video using image coding technology. In one embodiment of the present invention, the single-channel input and preprocessing module 210 decodes the input single-channel video with a decoder and outputs a bitstream in a YUV-family or RGB-family format.

The preprocessing performed by the single-channel input and preprocessing module 210 also includes extracting image pixel values from the input single-channel image. Specifically, the module reads each frame of the single-channel planar image, single-channel image sequence, or decoded single-channel video stream, and then extracts the pixel values in the corresponding color space.

For the RGB space, the image pixel value extracted by the single-channel input and preprocessing module 210 may be one of the following three: the triple of the three RGB channel values; the average of the three RGB channel values; or a weighted value obtained by summing the three RGB channels with given proportional coefficients. When the extracted pixel value is the weighted sum of the three RGB channels, it is calculated by the following formula:

Gray = R*0.3 + G*0.59 + B*0.11
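The three RGB extraction options can be sketched as follows. The weights are the ones given in the formula above; the function names and the `mode` parameter are hypothetical and chosen only for this illustration.

```python
def to_gray(r, g, b):
    """Weighted sum of the RGB channels using the coefficients from the formula."""
    return r * 0.3 + g * 0.59 + b * 0.11


def extract_pixel_value(r, g, b, mode="weighted"):
    """Return the pixel value in one of the three forms described in the text."""
    if mode == "triple":
        return (r, g, b)  # the three channel values as they are
    if mode == "mean":
        return (r + g + b) / 3  # plain average of the three channels
    if mode == "weighted":
        return to_gray(r, g, b)
    raise ValueError("unknown mode: " + mode)


print(extract_pixel_value(100, 150, 200, mode="mean"))
print(extract_pixel_value(100, 150, 200, mode="weighted"))
```

The three coefficients sum to 1, so a neutral gray pixel keeps its value after conversion.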

For the YUV space, the image pixel value extracted by the single-channel input and preprocessing module 210 may be one of the following three: the Y component; the average of the three YUV channels; or a weighted value obtained by summing the three YUV channels with given proportional coefficients.

When the single-channel image is a single-channel image sequence or a single-channel video, the frame-to-be-matched selection module 220 must obtain the frames to be matched before the matching frame can be determined. Specifically, for each frame of the single-channel image, the module selects an initial sequence of frames to be matched from the frames adjacent to the current frame. The initial sequence contains multiple initial frames to be matched and consists of the h frames before and the h frames after the current frame on the time axis. In one embodiment of the present invention, h may be 5. Those skilled in the art will understand that h may also take other values.

To determine the dissimilarity between the current frame and the initial sequence of frames to be matched, the frame-to-be-matched selection module 220, after obtaining the initial sequence, performs dissimilarity detection on each initial frame to be matched in order to obtain the sequence of frames to be matched. This includes the following steps:

A1: The frame-to-be-matched selection module 220 first splits both the current frame and one initial frame to be matched along the horizontal midline and the vertical midline, so that each of the two frames is divided evenly into four image blocks whose length and width are 1/2 of the length and width of the original frame.

A2: The frame-to-be-matched selection module 220 subtracts the pixel values of the four image blocks at corresponding positions in the current frame and the initial frame to be matched, takes the absolute value of each difference as the new pixel value to obtain four new image blocks, and computes the pixel mean and variance of each of the four new image blocks.

A3: The frame-to-be-matched selection module 220 compares the pixel mean and variance of each of the four new image blocks with preset thresholds. When the number of blocks whose mean exceeds its threshold and the number of blocks whose variance exceeds its threshold are both less than or equal to the dissimilarity count threshold, the initial frame to be matched is judged to be a frame to be matched for the current frame. When the number of blocks whose mean or variance exceeds its threshold is greater than the dissimilarity count threshold, the initial frame to be matched is judged not to be a frame to be matched for the current frame. The preset thresholds comprise a pixel mean threshold and a variance threshold. In one embodiment of the present invention, the pixel mean threshold and the variance threshold may be any value within [20, 30], and the dissimilarity count threshold may be 3. Steps A1 to A3 are repeated so that every initial frame to be matched in the initial sequence undergoes one dissimilarity detection against the current frame; when all initial frames have been processed, n frames to be matched are obtained, where n is a positive integer.
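Steps A1 to A3 can be sketched as follows for grayscale frames stored as lists of rows. This is a minimal illustration under assumptions: the thresholds are set to 25 (a value inside the stated range [20, 30]) and 3, the variance is the population variance, frames are assumed to have even height and width, and all function names are hypothetical.

```python
def quadrants(image):
    """A1: split an image into four equal blocks along both midlines."""
    half_h, half_w = len(image) // 2, len(image[0]) // 2
    blocks = []
    for rows in (slice(0, half_h), slice(half_h, 2 * half_h)):
        for cols in (slice(0, half_w), slice(half_w, 2 * half_w)):
            blocks.append([row[cols] for row in image[rows]])
    return blocks


def mean_and_variance(values):
    """Mean and population variance of a flat list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def is_frame_to_be_matched(current, other, mean_thr=25, var_thr=25, count_thr=3):
    """A2 and A3: accept other unless too many blocks differ from current."""
    over_mean = over_var = 0
    for cur_block, oth_block in zip(quadrants(current), quadrants(other)):
        # A2: absolute difference of corresponding pixels forms the new block.
        diffs = [
            abs(a - b)
            for cur_row, oth_row in zip(cur_block, oth_block)
            for a, b in zip(cur_row, oth_row)
        ]
        mean, variance = mean_and_variance(diffs)
        over_mean += mean > mean_thr
        over_var += variance > var_thr
    # A3: both counts must stay at or below the dissimilarity count threshold.
    return over_mean <= count_thr and over_var <= count_thr


current = [[10] * 4 for _ in range(4)]
similar = [[12] * 4 for _ in range(4)]
different = [[200] * 4 for _ in range(4)]
print(is_frame_to_be_matched(current, similar))
print(is_frame_to_be_matched(current, different))
```

With a count threshold of 3 and four blocks, a candidate is rejected only when all four blocks exceed the mean threshold or all four exceed the variance threshold, which in practice filters out frames from a different shot.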

Because the matching frame of a single-channel planar image is the current frame itself, the motion analysis and judgment module 230 does not need to perform motion analysis and judgment in that case.

When the input is a single-channel image sequence or a single-channel video, the motion analysis and judgment module 230 uses motion detection to determine the form and magnitude of the motion in the scene. Specifically, the module obtains motion vectors from the current frame and the frames to be matched of the single-channel image sequence or video, and uses the motion vectors to perform motion analysis and judgment to obtain the matching frame. This includes the following steps:

In the first step, the motion analysis and judgment module 230 divides the image data of the current frame into blocks of a predetermined size, and uniformly samples the resulting blocks according to a predetermined sample count to obtain multiple sample blocks.

In the second step, the motion analysis and judgment module 230 takes the motion offset between each sample block in the current frame and its matching block in each frame to be matched as the motion vector of that sample block with respect to that frame to be matched. The motion vectors of all sample blocks with respect to one frame to be matched form one motion vector map, giving n motion vector maps in total.

In the third step, the motion analysis and judgment module 230 performs motion analysis and judgment on the n motion vector maps to obtain the number of zero motion vector maps and the number of non-zero motion vector maps.

The motion analysis and judgment module 230 determines whether any of the n motion vector maps is a zero motion vector map. In one embodiment of the present invention, when the sample blocks with a zero vector account for 90% or more of the total number of sample blocks, the module judges the motion vector map to be a zero motion vector map. The number of zero motion vector maps is m. Those skilled in the art will understand that using another reasonable ratio of zero-vector sample blocks to total sample blocks also falls within the protection scope of the present invention.

Given the m zero motion vector maps obtained above, the number of non-zero motion vector maps is n-m. The motion analysis and judgment module 230 then performs horizontal motion vector analysis on the non-zero motion vector maps. The non-zero motion vector maps comprise motion vector maps corresponding to foreground horizontal motion, motion vector maps corresponding to horizontal camera movement, and motion vector maps corresponding to the remaining cases, the last being all motion vector maps that belong to neither of the first two kinds.

When, among the sample blocks of a non-zero motion vector map, the blocks whose motion vector is horizontal account for 70% to 90% of the non-zero-vector sample blocks, the motion analysis and judgment module 230 judges the map to correspond to foreground horizontal motion. The number of motion vector maps corresponding to foreground horizontal motion is p. Those skilled in the art will understand that using another reasonable ratio of horizontal-vector sample blocks to non-zero-vector sample blocks also falls within the protection scope of the present invention.

When, among the sample blocks of a non-zero motion vector map, the blocks whose motion vector is horizontal account for more than 90% of the non-zero-vector sample blocks, the motion analysis and judgment module 230 judges the map to correspond to horizontal camera movement. The number of motion vector maps corresponding to horizontal camera movement is q. Those skilled in the art will understand that using another reasonable ratio of horizontal-vector sample blocks to non-zero-vector sample blocks also falls within the protection scope of the present invention.

The motion analysis and judgment module 230 then performs motion analysis and judgment on the n-m non-zero motion vector maps. First, it averages the non-zero motion vectors of each non-zero motion vector map to obtain the motion vector average A(i) of that map, where A denotes the average and i is the index of the non-zero motion vector map. If the average of a non-zero motion vector map lies within the mean threshold, the module keeps the map; otherwise it discards the map together with the corresponding frame to be matched.

After the motion analysis and judgment performed by module 230, the counts of maps that meet the requirements are obtained: v1 zero motion vector maps, v2 motion vector maps corresponding to foreground horizontal motion, v3 motion vector maps corresponding to horizontal camera movement, and v4 motion vector maps corresponding to the remaining cases.

In the fourth step, the motion analysis and judgment module 230 obtains the matching frame according to the number of zero motion vector maps and the number of non-zero motion vector maps.

(1) When the number v1 of zero motion vector maps is greater than the first threshold R(v1), the motion analysis and judgment module 230 judges the scene corresponding to the current frame of the single-channel video or single-channel image sequence to be a static scene, and, among the frames to be matched that correspond to the zero motion vector maps, selects the frame farthest from the current frame on the time axis as the matching frame.

(2) When the number v2 of motion vector maps corresponding to foreground horizontal motion lies within the second threshold range R(v2), the motion analysis and judgment module 230 judges the scene corresponding to the current frame of the single-channel video or single-channel image sequence to be a foreground motion scene. Among the v2 motion vector maps corresponding to foreground horizontal motion, the map whose average value A(v2) differs least from the foreground standard preset threshold R(a2) is taken as the final foreground-horizontal-motion motion vector map, and the frame to be matched that corresponds to this map is selected as the matching frame.

(3) When the number v3 of motion vector maps corresponding to horizontal camera movement satisfies the third threshold R(v3), the motion analysis and judgment module 230 judges the scene corresponding to the current frame of the single-channel video or single-channel image sequence to be a camera movement scene. Among the v3 motion vector maps corresponding to horizontal camera movement, the map whose average value A(v3) differs least from the camera standard preset threshold R(a3) is taken as the final horizontal-camera-movement motion vector map, and the frame to be matched that corresponds to this map is selected as the matching frame.

(4) When the number v4 of motion vector maps corresponding to the remaining cases satisfies the fourth threshold R(v4), the motion analysis and judgment module 230 judges the scene corresponding to the current frame of the single-channel video or single-channel image sequence to be a scene other than foreground horizontal motion and horizontal camera movement. Among the v4 motion vector maps corresponding to the remaining cases, the map whose average value A(v4) differs least from the remaining-case standard preset threshold R(a4) is taken as the final remaining-case motion vector map, and the frame to be matched that corresponds to this map is selected as the matching frame.
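The classification of individual motion vector maps that feeds the four cases above can be sketched as follows. The 90% and 70% to 90% ratios come from the embodiment; everything else is an assumption made for illustration: a map is a list of (dx, dy) offsets with one entry per sample block, a vector counts as horizontal when dy is 0, a ratio of exactly 90% horizontal is assigned to the foreground class, and the mean-threshold filtering of each map is omitted.

```python
from collections import Counter


def classify_map(vectors, zero_ratio=0.9, fg_low=0.7, cam_low=0.9):
    """Label one motion vector map as 'zero', 'foreground', 'camera' or 'remaining'."""
    zeros = sum(1 for dx, dy in vectors if dx == 0 and dy == 0)
    if zeros >= zero_ratio * len(vectors):
        return "zero"
    nonzero = [(dx, dy) for dx, dy in vectors if (dx, dy) != (0, 0)]
    horizontal = sum(1 for dx, dy in nonzero if dy == 0)
    ratio = horizontal / len(nonzero)  # nonzero is never empty on this path
    if ratio > cam_low:
        return "camera"
    if ratio >= fg_low:
        return "foreground"
    return "remaining"


def count_map_types(maps):
    """Tally the labels, giving the counts that play the role of v1 to v4."""
    return Counter(classify_map(vectors) for vectors in maps)


static = [(0, 0)] * 10
panning = [(3, 0)] * 10
walking = [(0, 0)] * 5 + [(4, 0)] * 4 + [(0, 2)]
chaotic = [(1, 1)] * 10
counts = count_map_types([static, panning, walking, chaotic, static])
print(counts["zero"], counts["foreground"], counts["camera"], counts["remaining"])
```

The resulting counts would then be compared against R(v1) to R(v4) in the order of the four cases to decide the scene type.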

As shown in FIG. 2, the motion-analysis-based plane-to-stereo device 200 further includes a matching frame update module 240. When the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene, the matching frame update module 240 obtains a disparity map from the matching frame and the current frame, and then obtains a new matching frame from the disparity map and the current frame; that is, it updates the matching frame. The matching frame update module 240 obtains the disparity map by one of the following algorithms: an optical flow method operating on pixel blocks, a block matching method, a depth matching method operating on individual pixels, and the like.

In one embodiment of the present invention, the matching frame update module 240 obtains the new matching frame from the current frame and the disparity map by the DIBR (Depth Image Based Rendering) method.

When the single-channel image is a single-channel planar image, or is a single-channel video or single-channel image sequence whose current frame corresponds to a static scene, a camera movement scene, or a remaining case, the image transformation module 250 performs image transformation on the current frame and the matching frame to obtain the first stereoscopic input image and the second stereoscopic input image.

When the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene, the image transformation module 250 performs image transformation on the current frame and the updated matching frame to obtain the first stereoscopic input image and the second stereoscopic input image.

The image transformation performed by the image transformation module 250 includes the following modes: linear transformation, nonlinear transformation, and preset-region transformation.

In one embodiment of the present invention, the image transformation applied by the image transformation module 250 to the current frame and the matching frame may be a single transformation or a combination of transformations, and it is possible to transform only one of the two frames while leaving the other unchanged.

For different display terminals, the stereoscopic synthesis module 260 uses different stereoscopic video synthesis formats and stereoscopic video output modes to synthesize the two input images. The stereoscopic synthesis module 260 synthesizes the two input images according to a stereoscopic video synthesis format, which means combining the left and right input images in a suitable way to obtain a single-channel stereoscopic video output that meets the requirements of different stereoscopic displays. The stereoscopic video synthesis formats include checkerboard synthesis, horizontal interlaced synthesis, and the like.

The stereoscopic video output modes include arranging the left and right input images side by side horizontally or stacked vertically; the left and right input images may also be output as two separate channels. In one embodiment of the present invention, the stereoscopic video output by the stereo synthesis module 260 may be a single-channel anaglyph display such as red-green, red-cyan, or red-blue.
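The output modes described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: images are equally sized lists of rows, anaglyph pixels are `(r, g, b)` tuples, and the red-cyan output takes its red channel from the left view and its green and blue channels from the right view (a common convention that the text itself does not specify). The function names are hypothetical.

```python
def side_by_side(left, right):
    """Horizontal arrangement: each output row is the left row followed by the right row."""
    return [list(lrow) + list(rrow) for lrow, rrow in zip(left, right)]


def top_bottom(left, right):
    """Vertical arrangement: all left rows followed by all right rows."""
    return [list(row) for row in left] + [list(row) for row in right]


def red_cyan(left, right):
    """Anaglyph: red from the left view, green and blue from the right view (assumed mapping)."""
    return [
        [(lp[0], rp[1], rp[2]) for lp, rp in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


if __name__ == "__main__":
    print(side_by_side([[1, 2]], [[3, 4]]))
    print(top_bottom([[1, 2]], [[3, 4]]))
    print(red_cyan([[(1, 2, 3)]], [[(4, 5, 6)]]))
```

A red-green or red-blue variant would differ only in which channels are copied from the right view.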

The plane-to-stereo device based on motion analysis according to the embodiments of the present invention provides different solutions for different image or video input conditions and therefore has a wide range of application. It reduces the total amount of computation through the selection of adjacent frames and through block sampling, and the motion analysis and judgment algorithm used by the motion analysis and judgment module is simple and practical, with high processing efficiency. Because both the current frame and the matching frame come from the image sequence or video itself, and the image transformation of the current frame and the matching frame is performed on a preset area or on the whole image, the quality of the image or video itself is not degraded, so image and video quality remain high. In addition, the linear, nonlinear, and preset-area-based image transformations of the embodiments give the synthesized stereoscopic video a pronounced stereoscopic effect, without obvious depth perception errors or inter-frame jitter.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the invention is defined by the appended claims and their equivalents.

Claims

1. A plane-to-stereo method based on motion analysis, characterized in that it comprises the following steps:

Inputting a single-channel image, and performing preprocessing on the single-channel image, including: reading each frame in the single-channel image, and extracting a pixel value in a corresponding color space;

Reading each frame of the single-channel image as the current frame; when the single-channel image is a single-channel planar image, the final matching frame is the current frame; when the single-channel image is a single-channel image sequence or a single-channel video, selecting, by dissimilarity detection, n frames adjacent to the current frame as frames to be matched, where n is a positive integer, wherein selecting the n frames adjacent to the current frame as frames to be matched includes: selecting an initial sequence of frames to be matched from the frames adjacent to the current frame, wherein the initial sequence includes a plurality of initial frames to be matched and consists of the h frames preceding and the h frames following the current frame on the time axis; and performing dissimilarity detection on each initial frame to be matched to obtain the n frames to be matched, wherein the dissimilarity detection on each initial frame to be matched includes the following steps:

A1: Applying one horizontal midline split and one vertical midline split to both the current frame and an initial frame to be matched, so that each is divided into four image blocks;

A2: Subtracting the pixel values of the four image blocks of the current frame from those of the corresponding four image blocks of the initial frame to be matched and taking absolute values to obtain four new image blocks, and calculating the pixel mean and variance of each of the four new image blocks;

A3: Performing a threshold judgment on the pixel means and variances of the four new image blocks; when the number of means and variances of the four new image blocks that exceed the preset threshold is less than or equal to the dissimilarity threshold, judging that the initial frame to be matched is a frame to be matched for the current frame;

Repeating steps A1 to A3 until every initial frame to be matched in the initial sequence of frames to be matched has undergone dissimilarity detection, thereby obtaining the n frames to be matched;

Obtaining motion vectors from the frames to be matched and the current frame, and performing motion analysis and judgment using the motion vectors to obtain a final matching frame, including: dividing the image data of the current frame into blocks of a predetermined size, and uniformly sampling the blocked image according to a predetermined number of samples to obtain a plurality of sampling blocks; taking the motion offset between each sampling block in the current frame and its matching block in each frame to be matched as the motion vector of that sampling block with respect to that frame to be matched, to obtain u motion vector diagrams, where u is a positive integer; performing motion analysis and judgment on the u motion vector diagrams to obtain the number of zero motion vector diagrams and the number of non-zero motion vector diagrams, wherein the non-zero motion vector diagrams include motion vector diagrams corresponding to horizontal foreground motion, motion vector diagrams corresponding to horizontal camera movement, and motion vector diagrams corresponding to the remaining situation; and obtaining the final matching frame according to the number of zero motion vector diagrams and the number of non-zero motion vector diagrams;

performing image transformation on the current frame and the final matching frame to obtain a first stereoscopic input image and a second stereoscopic input image; and

performing stereoscopic synthesis on the first stereoscopic input image and the second stereoscopic input image to obtain a stereoscopic video, and outputting the stereoscopic video.
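Steps A1 to A3 of the dissimilarity detection can be illustrated with a short Python sketch. It is a sketch under assumptions, not the claimed implementation: frames are grayscale images stored as lists of rows, the mean and the variance have separate preset thresholds (`mean_thr`, `var_thr`), and a block counts as exceeding the threshold when either statistic exceeds its own threshold. All function and parameter names are hypothetical.

```python
def quadrants(frame):
    """A1: split a frame along its horizontal and vertical midlines into four blocks."""
    h, w = len(frame), len(frame[0])
    mh, mw = h // 2, w // 2
    return [
        [row[c0:c1] for row in frame[r0:r1]]
        for r0, r1 in ((0, mh), (mh, h))
        for c0, c1 in ((0, mw), (mw, w))
    ]


def mean_var(block):
    """Population mean and variance of all pixel values in a block."""
    vals = [v for row in block for v in row]
    m = sum(vals) / len(vals)
    return m, sum((v - m) ** 2 for v in vals) / len(vals)


def is_frame_to_match(current, candidate, mean_thr, var_thr, dissim_thr):
    """A2 + A3: accept the candidate when few enough difference blocks exceed the thresholds."""
    exceed = 0
    for cb, kb in zip(quadrants(current), quadrants(candidate)):
        # A2: absolute difference block between corresponding quadrants.
        diff = [[abs(a - b) for a, b in zip(ra, rb)] for ra, rb in zip(cb, kb)]
        m, v = mean_var(diff)
        # Assumption: a block exceeds when its mean OR its variance is too large.
        if m > mean_thr or v > var_thr:
            exceed += 1
    return exceed <= dissim_thr


def select_frames_to_match(current, candidates, mean_thr, var_thr, dissim_thr):
    """Repeat A1 to A3 over all initial candidates; return indices of accepted frames."""
    return [
        i for i, cand in enumerate(candidates)
        if is_frame_to_match(current, cand, mean_thr, var_thr, dissim_thr)
    ]
```

With `dissim_thr = 0` a candidate is accepted only if none of its four difference blocks exceeds the thresholds; larger values tolerate localized changes such as a moving foreground object.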

2. The plane-to-stereo method based on motion analysis as claimed in claim 1, characterized in that obtaining the final matching frame according to the number of zero motion vector diagrams and the number of non-zero motion vector diagrams comprises the following steps:

When the number of zero motion vector diagrams is greater than the first threshold, judging that the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a static scene, and selecting, from the frames to be matched that correspond to the zero motion vector diagrams, the frame farthest from the current frame on the time axis as the final matching frame;

When the number of motion vector diagrams corresponding to horizontal foreground motion is within the second threshold, judging that the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene; selecting, among the motion vector diagrams corresponding to horizontal foreground motion, the one whose average non-zero motion vector differs least from the foreground standard preset threshold as the final motion vector diagram corresponding to horizontal foreground motion; and selecting the frame to be matched that corresponds to this final motion vector diagram as the final matching frame;

When the number of motion vector diagrams corresponding to horizontal camera movement is greater than the third threshold, judging that the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a camera movement scene; selecting, among the motion vector diagrams corresponding to horizontal camera movement, the one whose average non-zero motion vector differs least from the camera standard preset threshold as the final motion vector diagram corresponding to horizontal camera movement; and selecting the frame to be matched that corresponds to this final motion vector diagram as the final matching frame;

When the number of motion vector diagrams corresponding to the remaining situation is greater than the fourth threshold, selecting, among the motion vector diagrams corresponding to the remaining situation, the one whose average non-zero motion vector differs least from the standard preset threshold of the remaining situation as the final motion vector diagram corresponding to the remaining situation, and selecting the frame to be matched that corresponds to this final motion vector diagram as the final matching frame.
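The four-way decision of this claim can be sketched as follows. The sketch makes several assumptions that the claim does not settle: each motion vector diagram has already been classified and is represented by a dictionary with the hypothetical keys `kind`, `frame` (time index of its frame to be matched), and `mean` (average non-zero motion vector magnitude); the conditions are tested in the order in which the claim lists them; and "within the second threshold" is read as "at least one and at most the second threshold".

```python
def select_final_match(diagrams, current_index, thr, std):
    """Return (scene_type, frame_index) for the final matching frame, or (None, None)."""
    by_kind = {"zero": [], "foreground": [], "camera": [], "other": []}
    for d in diagrams:
        by_kind[d["kind"]].append(d)

    # Static scene: enough zero diagrams; pick the temporally farthest frame.
    if len(by_kind["zero"]) > thr["zero"]:
        best = max(by_kind["zero"], key=lambda d: abs(d["frame"] - current_index))
        return "static", best["frame"]

    checks = [
        # Assumed reading of "within the second threshold".
        ("foreground", 1 <= len(by_kind["foreground"]) <= thr["foreground"]),
        ("camera", len(by_kind["camera"]) > thr["camera"]),
        ("other", len(by_kind["other"]) > thr["other"]),
    ]
    for kind, satisfied in checks:
        if satisfied and by_kind[kind]:
            # Pick the diagram whose mean non-zero vector is closest to the standard value.
            best = min(by_kind[kind], key=lambda d: abs(d["mean"] - std[kind]))
            return kind, best["frame"]
    return None, None
```

For example, with the current frame at index 10 and three zero diagrams for frames 8, 11, and 13, a first threshold of 2 yields a static scene with frame 13 as the final matching frame, since it lies farthest from the current frame on the time axis.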

3. The plane-to-stereo method based on motion analysis as claimed in claim 2, wherein, when the scene corresponding to the single-channel video or single-channel image sequence is a foreground motion scene, a disparity map is obtained from the matching frame and the current frame, and the matching frame is updated according to the disparity map and the current frame.

4. The plane-to-stereo method based on motion analysis as claimed in claim 3, wherein said performing image transformation on the current frame and the matching frame to obtain the first stereo input image and the second stereo input image comprises:

When the single-channel image is a single-channel planar image, or is a single-channel video or single-channel image sequence whose corresponding scene is a static scene, a camera movement scene, or the remaining situation, performing image transformation on the current frame and the matching frame to obtain the first stereoscopic input image and the second stereoscopic input image;

When the scene corresponding to the single-channel video or the single-channel image sequence is a foreground moving scene, image transformation is performed on the current frame and the updated matching frame to obtain the first stereoscopic input image and the second stereoscopic input image.

5. The method for converting plane to stereo based on motion analysis according to claim 4, wherein the image transformation includes the following methods: linear transformation, nonlinear transformation and preset area transformation.

6. A plane-to-stereo device based on motion analysis, characterized in that it comprises:

A single-channel input and preprocessing module, configured to input a single-channel image and preprocess the single-channel image, wherein the single-channel input and preprocessing module further reads each frame of the single-channel image and extracts pixel values in the corresponding color space;

A frame-to-be-matched selection module, configured to read each current frame of the single-channel image from the single-channel input and preprocessing module, wherein, when the single-channel image is a single-channel planar image, the frame-to-be-matched selection module selects the current frame as the final matching frame; when the single-channel image is a single-channel image sequence or a single-channel video, for each frame of the single-channel image the frame-to-be-matched selection module selects, by dissimilarity detection, n frames adjacent to the current frame as frames to be matched, where n is a positive integer, by selecting an initial sequence of frames to be matched from the frames adjacent to the current frame, wherein the initial sequence includes a plurality of initial frames to be matched and consists of the h frames preceding and the h frames following the current frame on the time axis, and performing dissimilarity detection on each initial frame to be matched to obtain the frames to be matched, wherein the frame-to-be-matched selection module applies one horizontal midline split and one vertical midline split to both the current frame and an initial frame to be matched so that each is divided into four image blocks, subtracts the pixel values of corresponding image blocks and takes absolute values to obtain four new image blocks, and calculates the pixel mean and variance of each of the four new image blocks; performs a threshold judgment on the pixel means and variances of the four new image blocks and, when the number of blocks whose mean and variance exceed the preset threshold is less than or equal to the dissimilarity threshold, judges that the initial frame to be matched is a frame to be matched for the current frame; and repeats the above steps until every initial frame to be matched in the initial sequence of frames to be matched has undergone dissimilarity detection, thereby obtaining the n frames to be matched;

A motion analysis and judgment module, configured to obtain motion vectors from the current frame and the frames to be matched supplied by the frame-to-be-matched selection module, and to perform motion analysis and judgment using the motion vectors to obtain a final matching frame, wherein the motion analysis and judgment module divides the image data of the current frame into blocks of a predetermined size and uniformly samples the blocks according to a predetermined number of samples to obtain a plurality of sampling blocks; takes the motion offset between each sampling block in the current frame and its matching block in each frame to be matched as the motion vector of that sampling block with respect to that frame to be matched, so as to obtain u motion vector diagrams, where u is a positive integer; performs motion analysis and judgment on the u motion vector diagrams to obtain the number of zero motion vector diagrams and the number of non-zero motion vector diagrams, wherein the non-zero motion vector diagrams include motion vector diagrams corresponding to horizontal foreground motion, motion vector diagrams corresponding to horizontal camera movement, and motion vector diagrams corresponding to the remaining situation; and obtains the final matching frame according to the number of zero motion vector diagrams and the number of non-zero motion vector diagrams;

An image transformation module, configured to perform image transformation on the current frame and the final matching frame from the motion analysis and judgment module to obtain a first stereoscopic input image and a second stereoscopic input image; and

A stereo synthesis module, configured to perform stereo synthesis on the first stereoscopic input image and the second stereoscopic input image from the image transformation module to obtain a stereoscopic video, and to output the stereoscopic video.
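The block-sampled motion estimation performed by the motion analysis and judgment module can be illustrated with an exhaustive block-matching sketch in Python. The claim does not name a matching criterion or search strategy, so the following choices are assumptions: the sum of absolute differences (SAD) as the matching cost, a full search within `radius` pixels, ties resolved in favour of the smaller displacement, and uniform sampling realized by keeping every `step`-th block in each direction. A diagram is treated as a zero motion vector diagram only if every sampled vector is zero, which is likewise an assumption. All names are hypothetical.

```python
def sad(cur, ref, r0, c0, r1, c1, size):
    """Sum of absolute differences between a block in `cur` and a block in `ref`."""
    return sum(
        abs(cur[r0 + i][c0 + j] - ref[r1 + i][c1 + j])
        for i in range(size) for j in range(size)
    )


def sampled_block_origins(height, width, size, step):
    """Top-left corners of every `step`-th block in each direction (uniform sampling)."""
    return [
        (r, c)
        for r in range(0, height - size + 1, size * step)
        for c in range(0, width - size + 1, size * step)
    ]


def motion_vector_diagram(cur, ref, size=2, step=1, radius=2):
    """Map each sampled block origin in `cur` to its (d_row, d_col) offset in `ref`."""
    h, w = len(cur), len(cur[0])
    vectors = {}
    for r0, c0 in sampled_block_origins(h, w, size, step):
        best = None
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                r1, c1 = r0 + dr, c0 + dc
                if 0 <= r1 <= h - size and 0 <= c1 <= w - size:
                    # Lower cost wins; on equal cost prefer the smaller displacement.
                    key = (sad(cur, ref, r0, c0, r1, c1, size), abs(dr) + abs(dc))
                    if best is None or key < best[0]:
                        best = (key, (dr, dc))
        vectors[(r0, c0)] = best[1]
    return vectors


def is_zero_diagram(vectors):
    """Assumed criterion: a zero motion vector diagram has only zero vectors."""
    return all(v == (0, 0) for v in vectors.values())
```

Calling `motion_vector_diagram` once per frame to be matched yields the u diagrams referred to in the claim; raising `step` lowers the number of sampled blocks and hence the computation.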

7. The plane-to-stereo device based on motion analysis as claimed in claim 6, wherein said obtaining the final matching frame according to the quantity of zero motion vector diagram and the quantity of non-zero motion vector diagram comprises the steps of:

8. The plane-to-stereo device based on motion analysis as claimed in claim 7, wherein the plane-to-stereo device based on motion analysis further comprises a matching frame update module,

When the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene, the matching frame update module is used to obtain a disparity map from the current frame and the matching frame supplied by the motion analysis and judgment module, and to update the matching frame according to the disparity map and the current frame.

9. The plane-to-stereo device based on motion analysis as claimed in claim 8, characterized in that,

When the single-channel image is a single-channel planar image, or is a single-channel video or single-channel image sequence whose current frame corresponds to a static scene, a camera movement scene, or the remaining situation, the image transformation module performs image transformation on the current frame and the matching frame from the motion analysis and judgment module to obtain the first stereoscopic input image and the second stereoscopic input image;

When the scene corresponding to the current frame of the single-channel video or single-channel image sequence is a foreground motion scene, the image transformation module performs image transformation on the current frame and the updated matching frame from the matching frame update module to obtain the first stereoscopic input image and the second stereoscopic input image.

10. The plane-to-stereo device based on motion analysis as claimed in claim 9, wherein the image transformation performed by the image transformation module includes the following modes: linear transformation, nonlinear transformation and preset area transformation.