
CN103248911B - A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video - Google Patents

A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video

Info

Publication number
CN103248911B
CN103248911B CN201310188898.5A
Authority
CN
China
Prior art keywords
image
virtual
virtual viewpoint
depth
viewpoint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310188898.5A
Other languages
Chinese (zh)
Other versions
CN103248911A (en)
Inventor
刘琚
成聪
杨晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN201310188898.5A
Publication of CN103248911A
Application granted
Publication of CN103248911B

Landscapes

  • Processing Or Creating Images (AREA)
  • Image Generation (AREA)

Abstract

The invention proposes a new depth-map-based view synthesis method. First, the virtual-viewpoint color image and depth image are obtained through 3D image warping, and small holes and erroneously mapped pixels are removed. Then the holes in the virtual-viewpoint depth image are filled, while the pixel coordinates of the holes are recorded. Next, a reverse 3D image warping is performed to locate the target region (the occluding region that caused the holes) in the target frame of the reference viewpoint, and the background of the target region is restored from the preceding and following frames. Finally, an exemplar-based image inpainting algorithm repairs the remaining holes. By exploiting the image information of the preceding and following frames, the method fills holes with combined spatial and temporal information, which yields more accurate results and higher virtual-view image quality than purely spatial hole filling. In addition, by locating the target region through reverse mapping, background restoration is performed in a targeted manner; compared with large-scale background restoration, it achieves similar results while filling holes more efficiently.

Description

A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video

Technical Field

The invention relates to a virtual viewpoint synthesis method based on depth images, and belongs to the technical field of video and multimedia signal processing.

Background Art

With good user interactivity and a vivid, lifelike visual experience, three-dimensional television has become the front-runner of the new generation of multimedia technology. By reconstructing the depth information of a real scene, stereoscopic television gives viewers the feeling that objects extend out of the screen within arm's reach. Among related technologies, multi-viewpoint video is considered to have extremely broad application prospects. Its main goal is that, at the playback end, users can choose different viewpoints of the same scene and watch it from different angles according to their own needs, obtaining a strong sense of presence and realism. However, limits on transmission bandwidth and transmission rate make multi-viewpoint video difficult to realize. Virtual view synthesis technology emerged to address this.

There are many methods for virtual view synthesis; among them, view synthesis based on depth images is a typical one. It uses the color image and depth image of a single video stream to synthesize the video of any other viewpoint. During view synthesis, background regions occluded in the original viewpoint may become exposed in the virtual viewpoint; this is the so-called hole problem. Because no reference information for these newly exposed regions exists in the original viewpoint, the hole problem is the most challenging issue in view synthesis, and the question of how to synthesize views efficiently and with high quality reduces to the question of how to fill the holes accurately.

Traditional depth-image-based view synthesis consists of three basic steps: smoothing the reference-view depth image, obtaining the virtual-view depth and color images through 3D image warping, and hole filling. The virtual-view color images produced by such methods contain small cracks and large hole regions. For the latter in particular, whether the holes are filled by neighborhood assignment, linear interpolation, or a computationally expensive image inpainting algorithm, the process only extrapolates from the spatial surroundings of the hole; its credibility is therefore limited, and the resulting virtual-view image can hardly reflect the real scene accurately.

Summary of the Invention

To solve the inaccuracy of virtual view synthesis caused by purely spatial image post-processing in existing depth-image-based methods, the present invention proposes a new depth-map-based view synthesis method. In the reference viewpoint, the occluded background regions (i.e., the regions that cause holes in the virtual viewpoint) become exposed as time passes, which means that accurate information for the holes to be repaired can be found in the frames before and after the target frame of the reference viewpoint. Exploiting this, the invention combines the spatial and temporal domains for virtual view synthesis: part of the holes can be filled more accurately with temporal information, while the remaining holes, for which no temporal information is available, are still filled spatially. Because part of the holes are thus filled with real background information, the overall performance is better than speculative filling within a single image.

In the present invention, first, the virtual-view color image and depth image are obtained through 3D image warping, and small cracks and erroneously mapped pixels are removed. Then the holes in the virtual-view depth image are filled to obtain a complete virtual-view depth image, while the pixel coordinates of the holes are recorded. Next, a reverse 3D image warping is performed, the target region (the occluding region that caused the holes) is located in the target frame of the reference viewpoint, and the background of the target region in the target frame is restored from the preceding and following frames; this restored background is used to fill part of the holes in the virtual-view color image. Finally, the remaining holes are repaired with an exemplar-based image inpainting algorithm. By using the image information of the frames before and after the reference-view target frame, the method combines the spatial and temporal domains for hole filling, giving more accurate results and higher virtual-view image quality than purely spatial hole filling. In addition, locating the target region through reverse 3D warping allows targeted background restoration; compared with large-scale background restoration it achieves similar performance while filling holes more efficiently.

The technical scheme adopted by the present invention is as follows:

A depth-map-based view synthesis method, characterized in that the spatial and temporal domains are combined: the image information of the frames before and after the target frame in the reference viewpoint is used, and, based on the localization of the occluding region that causes the holes, targeted background restoration is performed to fill part of the holes in the virtual-view image. The specific steps are as follows:

(1) 3D image warping: according to the camera projection principle, project the image information of the reference viewpoint onto the virtual viewpoint through 3D image warping to obtain the virtual-view color image and depth image, and process the small cracks and erroneously mapped pixels in them;

(2) Coordinate recording and filling of the holes in the virtual-view depth image: record the coordinates of the hole regions in the depth image, then fill the holes in the depth image by neighborhood assignment;

(3) Reverse 3D image warping and target-region localization: starting from the virtual viewpoint, perform the reverse 3D image warping according to the camera projection principle, and locate the target region in the reference-view target frame using the coordinates recorded in step (2);

(4) Background restoration of the target region in the temporal domain: restore the background of the target region using the frames before and after the target frame in the reference viewpoint, and use it to fill part of the holes in the virtual-view color image;

(5) Repair of the remaining holes: fill the remaining holes in the virtual-view color image with an exemplar-based image inpainting technique.

The specific sub-steps of step (1) are:

a. According to the depth information of the reference viewpoint, perform the coordinate transformation from the reference-view plane into three-dimensional space, and then from three-dimensional space onto the virtual-view plane;

b. According to the result of the coordinate transformation, project the pixels of the reference viewpoint onto the virtual-view plane to obtain the virtual-view color image and depth image;

c. Detect the erroneously mapped pixels in the virtual-view color image and depth image and correct them by neighborhood assignment; small cracks are likewise filled by neighborhood assignment.

The specific sub-steps of step (2) are:

a. Record the coordinates of the large hole pixels in the virtual-view depth image and color image;

b. Fill the holes of the virtual-view depth image row by row using the background pixels adjacent to the hole region, obtaining a complete virtual-view depth image.

The specific sub-steps of step (3) are:

a. According to the virtual-view depth image obtained in step (2), perform the coordinate transformation from the virtual-view plane into three-dimensional space, and then from three-dimensional space onto the reference-view plane;

b. According to the coordinate transformation result and the coordinates recorded in step (2), locate the target region in the reference-view target frame.

The specific sub-steps of step (4) are:

a. In the temporal domain, restore the background of the target region in the target frame using the preceding and following frames;

b. According to the coordinate transformation result of step (3), fill part of the holes in the virtual-view color image.

The specific sub-steps of step (5) are:

a. Detect the boundary of the hole region, compute the priority of each boundary pixel, and determine the repair order;

b. Centered on a boundary pixel, take a sample patch of a given size and, using the color information it contains, search the source image for the best matching patch;

c. After the best matching patch is found, copy its pixels into the hole pixels of the sample patch to complete the filling.

Brief Description of the Drawings

Figure 1: Flowchart of the present invention.

Figure 2: Color image and depth image of the reference viewpoint.

Figure 3: Color image and depth image of the virtual viewpoint.

Figure 4: Color image of the virtual viewpoint after removal of small holes and erroneously mapped pixels.

Figure 5: Virtual-view depth image after hole filling.

Figure 6: Target region in the reference-view target frame.

Figure 7: Foreground/background classification result of the following frame.

Figure 8: Color image of the target frame after background restoration.

Figure 9: Virtual-view color image after partial hole repair.

Figure 10: Virtual-view color image after repair of the remaining holes.

Detailed Description

The experiments use the "Mobile" video sequence, which captures a scene from nine viewpoints and provides the corresponding depth information as well as the intrinsic and extrinsic parameters of each camera. In the experiments, viewpoint 4 serves as the reference viewpoint and viewpoint 5 as the virtual viewpoint.

Figure 1 shows the flowchart of the present invention; the specific embodiment is described following this flowchart.

(1) 3D image warping. 3D image warping projects the pixels of the reference viewpoint onto the virtual-view plane according to the camera projection principle. The process has two parts: the pixels of the reference viewpoint are first projected into three-dimensional space and then projected from three-dimensional space onto the virtual-view plane. Figure 2 shows the color image and depth image of the reference viewpoint. Let A_i be the intrinsic matrix of camera i, and let R_i and t_i be the rotation matrix and translation vector of its extrinsic parameters; the mapping equation can then be expressed as:

s·m = A_i [R_i | t_i] M     (1)

where s is a scalar, m = [u, v, 1]^T is the homogeneous coordinate of an image pixel in the image coordinate system, and M = [X, Y, Z, 1]^T is the homogeneous coordinate of the corresponding space point in the world coordinate system.

Equation (1) realizes both the mapping from the reference viewpoint into three-dimensional space and the mapping from three-dimensional space onto the virtual viewpoint. Figure 3 shows the resulting virtual-view color image and depth image.
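To make the warping concrete, the following is a minimal NumPy sketch of equation (1) applied in both directions — back-projecting each reference pixel to a world point and re-projecting it into the virtual view — and not code from the patent. It assumes the depth map stores metric Z values in the reference camera frame and that A, R, t are the 3×3 intrinsics, 3×3 rotation, and 3-vector translation of each camera; all names are illustrative. A simple z-buffer keeps the nearest point when several pixels land on the same target position.

```python
import numpy as np

def warp_to_virtual_view(color, depth, A_ref, R_ref, t_ref, A_vir, R_vir, t_vir):
    """Forward 3D warping of equation (1): reference view -> world -> virtual view.
    Assumes `depth` holds metric Z values; unwritten output pixels stay holes (-1)."""
    h, w = depth.shape
    vir_color = np.zeros_like(color)
    vir_depth = np.full((h, w), -1.0)

    # Back-project every reference pixel: M = R^-1 (Z * A^-1 * m - t)
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    m = np.stack([us.ravel(), vs.ravel(), np.ones(h * w)])      # 3 x N homogeneous pixels
    cam_pts = (np.linalg.inv(A_ref) @ m) * depth.ravel()        # reference camera coordinates
    world = np.linalg.inv(R_ref) @ (cam_pts - t_ref.reshape(3, 1))

    # Re-project into the virtual view: s * m' = A [R | t] M
    proj = A_vir @ (R_vir @ world + t_vir.reshape(3, 1))
    z = proj[2]
    z_safe = np.where(z > 0, z, 1.0)                            # guard degenerate points
    u2 = np.round(proj[0] / z_safe).astype(int)
    v2 = np.round(proj[1] / z_safe).astype(int)

    flat_color = color.reshape(-1, color.shape[-1])
    for k in np.argsort(-z):        # far points first, so nearer points overwrite (z-buffer)
        x, y = u2[k], v2[k]
        if z[k] > 0 and 0 <= x < w and 0 <= y < h:
            vir_depth[y, x] = z[k]
            vir_color[y, x] = flat_color[k]
    return vir_color, vir_depth
```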

As Figure 3 shows, the virtual-view color image and depth image contain some erroneously mapped pixels, which fall into two types: foreground pixels mapped onto the background and background pixels mapped onto the foreground. Since such erroneous pixels are usually only one or two pixels wide, they are detected from their depth relation to the surrounding pixels and then corrected by neighborhood assignment. Figure 3 also shows some small cracks in the virtual-view color and depth images; these are likewise removed by simple neighborhood assignment. Figure 4 shows the virtual-view color image after this processing.

(2) Hole filling and coordinate recording for the virtual-view depth image. To be able to locate the target region in the reference-view target frame, the coordinates of the hole regions in the virtual-view depth image are recorded. Then, exploiting the smooth, gradual variation of the depth map, the holes are filled row by row using the background pixels adjacent to the hole region, yielding the complete virtual-view depth image shown in Figure 5.
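A minimal sketch of this step, under two stated assumptions: hole pixels are marked with −1, and the depth map follows the common "larger value = closer" convention, so the background neighbour of a hole run is the horizontal neighbour with the smaller depth value. The patent's exact marking and depth convention may differ.

```python
import numpy as np

def fill_depth_holes(vir_depth, hole_value=-1.0):
    """Step (2): record hole coordinates, then fill the virtual-view depth map
    row by row from the adjacent background (smaller-valued) neighbour."""
    depth = vir_depth.copy()
    hole_coords = np.argwhere(depth == hole_value)      # (row, col) of every hole pixel

    h, w = depth.shape
    for y in range(h):
        x = 0
        while x < w:
            if depth[y, x] != hole_value:
                x += 1
                continue
            x_end = x                                   # contiguous hole run [x, x_end)
            while x_end < w and depth[y, x_end] == hole_value:
                x_end += 1
            left = depth[y, x - 1] if x > 0 else None
            right = depth[y, x_end] if x_end < w else None
            candidates = [d for d in (left, right) if d is not None]
            if candidates:
                depth[y, x:x_end] = min(candidates)     # smaller depth = background
            x = x_end
    return depth, hole_coords
```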

(3) Reverse 3D image warping and target-region localization. Reverse 3D image warping is the inverse of the conventional 3D image warping: the pixels of the virtual viewpoint are first projected into three-dimensional space and then mapped from three-dimensional space onto the reference-view plane, establishing the correspondence between the virtual viewpoint and the reference viewpoint. Combined with the coordinates recorded in step (2), this localizes the target region. Figure 6 shows the final localized target region.
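A sketch of the reverse warping restricted to the recorded hole pixels (names and conventions as in the earlier sketches, again illustrative rather than the patent's code). Because each hole pixel now carries the background depth filled in step (2), its back-projection lands on the occluding object in the reference frame, which is exactly the target region:

```python
import numpy as np

def locate_target_region(hole_coords, vir_depth_filled, A_vir, R_vir, t_vir,
                         A_ref, R_ref, t_ref, ref_shape):
    """Step (3): project recorded hole pixels back into the reference frame;
    the hit pixels form a boolean mask of the target (occluding) region."""
    mask = np.zeros(ref_shape, dtype=bool)
    A_vir_inv, R_vir_inv = np.linalg.inv(A_vir), np.linalg.inv(R_vir)
    for y, x in hole_coords:
        z = vir_depth_filled[y, x]                        # background depth from step (2)
        cam_pt = z * (A_vir_inv @ np.array([x, y, 1.0]))  # virtual camera coordinates
        world = R_vir_inv @ (cam_pt - t_vir)
        proj = A_ref @ (R_ref @ world + t_ref)
        if proj[2] <= 0:
            continue
        u, v = int(round(proj[0] / proj[2])), int(round(proj[1] / proj[2]))
        if 0 <= v < ref_shape[0] and 0 <= u < ref_shape[1]:
            mask[v, u] = True
    return mask
```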

(4) Background restoration of the target region in the temporal domain. Since the regions exposed in the virtual-view image should be background, what actually matters is the background part of the target region that is occluded by the foreground. Because the motion of foreground objects exposes the background to varying degrees over time, the information in the frames before and after the target frame is especially important for restoring the background of the target region. To use this information accurately and efficiently, the K-means clustering algorithm is applied to the depth images of the preceding and following frames to separate foreground from background; only image information classified as background may serve as candidate information for restoring the background of the target frame. A clustering algorithm organizes the members of a data set into classes by some measure of similarity. K-means clustering first fixes K cluster centers, manually or automatically according to some principle, and then assigns the data points one by one to the K centers according to a defined distance function. Whenever a data point joins a class, the class center is updated as the mean of the data in that class; clustering then continues with the updated centers until all data are classified. In this invention, a histogram analysis of the depth image to be classified first determines the gray-level distributions of foreground and background over the whole image, from which two (K = 2) cluster centers are chosen; the gray-level difference then serves as the distance function, and foreground and background are separated following the steps above. Figure 7 shows the foreground/background classification of the frame following the target frame, with black representing background and white representing foreground.
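A compact sketch of the K = 2 depth clustering. It seeds the two centres from the extremes of the depth distribution as a stand-in for the histogram analysis described above, and uses batch centre updates (standard Lloyd iterations) rather than the per-point updates in the description:

```python
import numpy as np

def split_fore_back(depth_img, iters=20):
    """K-means with K = 2 on depth values (step (4)). Returns a boolean mask,
    True for foreground, assuming 'larger value = closer'."""
    vals = depth_img.astype(np.float64).ravel()
    c_back, c_fore = np.percentile(vals, 5), np.percentile(vals, 95)  # far / near seeds
    for _ in range(iters):
        fore = np.abs(vals - c_fore) < np.abs(vals - c_back)   # assign by grey-level distance
        if not fore.any() or fore.all():
            break
        c_fore, c_back = vals[fore].mean(), vals[~fore].mean() # update the two centres
    return fore.reshape(depth_img.shape)
```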

After the candidate information is determined, and considering that the influence of lighting and other environmental factors grows with the frame distance, the background of the target frame is restored progressively, starting from the nearest neighboring frames. Figure 8 shows the target frame after restoration. The background information in the target region can then fill part of the holes in the virtual-view color image; Figure 9 shows the virtual-view color image after this partial hole filling.
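A sketch of the progressive restoration, assuming the neighbouring frames are already ordered by temporal distance and that co-located pixels correspond across frames (i.e., no camera motion between them needs compensating):

```python
import numpy as np

def restore_background(target_color, target_mask, neighbour_frames):
    """Step (4): fill the target region from temporally neighbouring frames,
    nearest first. `neighbour_frames` is a list of (color, background_mask)
    pairs; only pixels the K-means step classified as background are copied."""
    restored = target_color.copy()
    todo = target_mask.copy()                   # pixels still awaiting restoration
    for frame_color, bg_mask in neighbour_frames:
        take = todo & bg_mask                   # background revealed in this frame
        restored[take] = frame_color[take]
        todo &= ~take
        if not todo.any():                      # everything restored; stop early
            break
    return restored, todo                       # `todo`: holes left for step (5)
```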

(5) Repair of the remaining holes. According to their position, the holes produced in the virtual viewpoint fall into two classes: holes at the boundary between foreground and background, and holes at the scene border. The former result from background exposed by occlusion; the latter from new scene content exposed by the viewpoint shift. The former can be filled with the restored background information through the steps above; for the latter, however, no reference information can be found in any frame of the reference viewpoint. These holes are repaired with an exemplar-based image inpainting algorithm, as follows:

First, the boundary of the holes in the image to be repaired is detected. After all boundary pixels are found, their priorities are computed to determine the repair order. The priority is computed as in equation (2):

P(p) = C(p) · D(p)     (2)

where C(p) is the confidence term and D(p) is the data term, defined as follows:

C(p) = ( Σ_{q ∈ Ψ_p ∩ Φ} C(q) ) / |Ψ_p| ,        D(p) = |∇I_p^⊥ · n_p| / α        (3)

The numerator of C(p) accumulates the confidence of the non-hole pixels in the sample patch Ψ_p, and the denominator |Ψ_p| is the total number of pixels in the patch. In D(p), ∇I_p^⊥ and n_p denote, respectively, the isophote direction and the normal vector at point p on the hole boundary, and α is a normalization factor with value 255. Through C(p) and D(p), exemplar-based inpainting fills texture while preserving the linear structures of the image.
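The priority computation might be sketched as follows (illustrative only: confidence C is assumed initialized to 1 outside the hole and 0 inside, the isophote is taken from central differences of a grayscale image, and the boundary normal from the gradient of the hole mask):

```python
import numpy as np

def priorities(confidence, fill_mask, image_gray, patch=9, alpha=255.0):
    """Priority P(p) = C(p) * D(p) of equations (2)-(3) for hole-boundary pixels.
    `fill_mask` is True inside the hole."""
    half = patch // 2
    # boundary = hole pixels with at least one known 4-neighbour
    pad = np.pad(fill_mask, 1, constant_values=False)
    known_nb = (~pad[:-2, 1:-1]) | (~pad[2:, 1:-1]) | (~pad[1:-1, :-2]) | (~pad[1:-1, 2:])
    boundary = fill_mask & known_nb

    gy, gx = np.gradient(image_gray.astype(float))   # image gradient
    ny, nx = np.gradient(fill_mask.astype(float))    # normal from the mask edge

    out = {}
    for y, x in np.argwhere(boundary):
        y0, y1 = max(0, y - half), min(fill_mask.shape[0], y + half + 1)
        x0, x1 = max(0, x - half), min(fill_mask.shape[1], x + half + 1)
        known = ~fill_mask[y0:y1, x0:x1]
        c_patch = confidence[y0:y1, x0:x1]
        C = c_patch[known].sum() / c_patch.size      # confidence term, equation (3)
        isophote = np.array([-gy[y, x], gx[y, x]])   # gradient rotated 90 degrees
        normal = np.array([nx[y, x], ny[y, x]])
        n = np.linalg.norm(normal)
        D = abs(isophote @ (normal / n)) / alpha if n > 0 else 0.0
        out[(y, x)] = C * D                          # equation (2)
    return out
```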

Once the repair order is determined, the patch with the highest priority is taken as the sample patch, and the best matching patch is searched for in the source image. The matching criterion is given by equations (4) and (5):

SSD(Ψ_p̂, Ψ_q) = Σ_{p ∈ Ψ_p̂ ∩ Φ} [ (R_p̂(p) − R_q(p))² + (G_p̂(p) − G_q(p))² + (B_p̂(p) − B_q(p))² ]     (4)

Ψ_q_best = argmin_{Ψ_q ∈ Φ} SSD(Ψ_p̂, Ψ_q)     (5)

Equation (4) defines the sum of squared differences (SSD): for each corresponding pair of pixels in the two patches, the squared differences of their R, G, and B components are summed, and these contributions are accumulated over all pixel pairs to give the SSD of the two patches.

During matching, the SSD measures the difference in color information between the non-hole pixels of the sample patch and the pixels at the corresponding positions of a candidate patch; the smaller the difference, the better the match. The best matching patch Ψ_q_best is the one with the minimum SSD over all candidates, as in equation (5).
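A sketch of the exhaustive match search of equations (4)–(5); `stride` subsamples candidate positions to keep the sketch fast, and patches are assumed to lie fully inside the image:

```python
import numpy as np

def best_match(image, fill_mask, py, px, half, stride=2):
    """Equations (4)-(5): scan the source region Φ for the patch minimising the
    SSD against the known pixels of the sample patch centred at (py, px)."""
    h, w = fill_mask.shape
    tgt = image[py - half:py + half + 1, px - half:px + half + 1].astype(float)
    known = ~fill_mask[py - half:py + half + 1, px - half:px + half + 1]

    best, best_ssd = None, np.inf
    for y in range(half, h - half, stride):
        for x in range(half, w - half, stride):
            cand_mask = fill_mask[y - half:y + half + 1, x - half:x + half + 1]
            if cand_mask.any():                 # candidate must lie entirely in the source
                continue
            cand = image[y - half:y + half + 1, x - half:x + half + 1].astype(float)
            diff = (tgt - cand)[known]          # compare known pixels only (eq. 4)
            ssd = np.sum(diff * diff)           # summed over R, G, B components
            if ssd < best_ssd:
                best, best_ssd = (y, x), ssd    # eq. (5): keep the minimum
    return best
```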

After the best matching patch is found, its pixels fill the hole pixels of the sample patch, and the confidence is then updated as follows:

C(p) = C(p̂) ,    ∀ p ∈ Ψ_p̂ ∩ Ω     (6)

The above process is repeated until the repair is complete. Figure 10 shows the repaired virtual-view color image.

Claims (5)

1. A depth-map-based view synthesis method, characterized in that the spatial and temporal domains are combined: the image information of the frames before and after the target frame in the reference viewpoint is used, and, based on the localization of the occluding region that causes the holes, targeted background restoration is performed to fill part of the holes in the virtual-view image, with the following specific steps:
(1) 3D image warping: according to the camera projection principle, project the image information of the reference viewpoint onto the virtual viewpoint through 3D image warping to obtain the virtual-view color image and depth image, and process the small cracks and erroneously mapped pixels in them;
(2) coordinate recording and filling of the holes in the virtual-view depth image: record the coordinates of the hole regions in the depth image, then fill the holes in the depth image by neighborhood assignment;
(3) reverse 3D image warping and target-region localization: starting from the virtual viewpoint, perform the reverse 3D image warping according to the camera projection principle, and locate the target region in the reference-view target frame using the coordinates recorded in step (2);
(4) background restoration of the target region in the temporal domain: restore the background of the target region using the frames before and after the target frame in the reference viewpoint, and use it to fill part of the holes in the virtual-view color image;
(5) repair of the remaining holes: fill the remaining holes in the virtual-view color image with an exemplar-based image inpainting technique, implemented as follows:
a. detect the boundary of the hole region, compute the priority of each boundary pixel, and determine the repair order;
b. centered on a boundary pixel, take a sample patch of a given size and, using the color information it contains, search the source image for the best matching patch;
c. after the best matching patch is found, copy its pixels into the hole pixels of the sample patch to complete the filling.
2. The depth-map-based view synthesis method according to claim 1, characterized in that the specific sub-steps of step (1) are:
a. according to the depth information of the reference viewpoint, perform the coordinate transformation from the reference-view plane into three-dimensional space, and then from three-dimensional space onto the virtual-view plane;
b. according to the result of the coordinate transformation, project the pixels of the reference viewpoint onto the virtual-view plane to obtain the virtual-view color image and depth image;
c. detect the erroneously mapped pixels in the virtual-view color image and depth image and correct them by neighborhood assignment; small cracks are likewise filled by neighborhood assignment.
3. The depth-map-based view synthesis method according to claim 1, characterized in that the specific sub-steps of step (2) are:
a. record the coordinates of the large hole pixels in the virtual-view depth image and color image;
b. fill the holes of the virtual-view depth image row by row using the background pixels adjacent to the hole region, obtaining a complete virtual-view depth image.
4. The depth-map-based view synthesis method according to claim 1, characterized in that the specific sub-steps of step (3) are:
a. according to the virtual-view depth image obtained in step (2), perform the coordinate transformation from the virtual-view plane into three-dimensional space, and then from three-dimensional space onto the reference-view plane;
b. according to the coordinate transformation result and the coordinates recorded in step (2), locate the target region in the reference-view target frame.
5. The depth-map-based view synthesis method according to claim 1, characterized in that the specific sub-steps of step (4) are:
a. in the temporal domain, restore the background of the target region in the target frame using the preceding and following frames;
b. according to the coordinate transformation result of step (3), fill part of the holes in the virtual-view color image.
CN201310188898.5A 2013-05-20 2013-05-20 A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video Expired - Fee Related CN103248911B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310188898.5A CN103248911B (en) 2013-05-20 2013-05-20 A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video


Publications (2)

Publication Number Publication Date
CN103248911A CN103248911A (en) 2013-08-14
CN103248911B true CN103248911B (en) 2015-11-04

Family

ID=48928099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310188898.5A Expired - Fee Related CN103248911B (en) 2013-05-20 2013-05-20 A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video

Country Status (1)

Country Link
CN (1) CN103248911B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103581648B (en) * 2013-10-18 2015-08-26 清华大学深圳研究生院 Draw the hole-filling method in new viewpoint
CN103747229B (en) * 2014-01-27 2015-08-19 电子科技大学 Dimensional video virtual viewpoint synthesis color contamination of prizing is folded and the processing method of dash area
CN104092937B (en) * 2014-06-16 2018-03-27 联想(北京)有限公司 A kind of method and device for generating image
CN104038753B (en) * 2014-06-17 2016-01-13 四川虹微技术有限公司 3D Image Hole Filling Method
CN104159093B (en) * 2014-08-29 2016-06-15 杭州道玄影视科技有限公司 The time domain consistence hole region method for repairing and mending of the static scene video of moving camera shooting
CN105069751B (en) * 2015-07-17 2017-12-22 江西欧酷智能科技有限公司 A kind of interpolation method of depth image missing data
CN105374019B (en) * 2015-09-30 2018-06-19 华为技术有限公司 A kind of more depth map fusion methods and device
CN107018401B (en) * 2017-05-03 2019-01-22 曲阜师范大学 Inverse Mapping-Based Void Filling Method for Virtual Viewpoints
CN108769662B (en) 2018-07-03 2020-01-07 京东方科技集团股份有限公司 A method, device and electronic device for filling holes in a multi-view naked-eye 3D image
CN109712067B (en) * 2018-12-03 2021-05-28 北京航空航天大学 A virtual viewpoint rendering method based on depth image
CN110660131B (en) * 2019-09-24 2022-12-27 宁波大学 Virtual viewpoint hole filling method based on deep background modeling
CN113421315B (en) * 2021-06-24 2022-11-11 河海大学 A View Zoom-Based Hole Filling Method for Panoramic Images
CN115439543B (en) * 2022-09-02 2023-11-10 北京百度网讯科技有限公司 Method for determining hole position and method for generating three-dimensional model in meta universe
CN115908162B (en) * 2022-10-28 2023-07-04 中山职业技术学院 A method and system for generating a virtual viewpoint based on background texture recognition

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7675540B2 (en) * 2003-08-19 2010-03-09 Kddi Corporation Concealed regions complementing system of free viewpoint video images
CN102307304A (en) * 2011-09-16 2012-01-04 北京航空航天大学 Image segmentation based error concealment method for entire right frame loss in stereoscopic video
CN103024421A (en) * 2013-01-18 2013-04-03 山东大学 Method for synthesizing virtual viewpoints in free viewpoint television

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5337218B2 (en) * 2011-09-22 2013-11-06 株式会社東芝 Stereo image conversion device, stereo image output device, and stereo image conversion method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7675540B2 (en) * 2003-08-19 2010-03-09 Kddi Corporation Concealed regions complementing system of free viewpoint video images
CN102307304A (en) * 2011-09-16 2012-01-04 北京航空航天大学 Image segmentation based error concealment method for entire right frame loss in stereoscopic video
CN103024421A (en) * 2013-01-18 2013-04-03 山东大学 Method for synthesizing virtual viewpoints in free viewpoint television

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AVS流媒体传输系统的实现 (Implementation of an AVS streaming media transmission system); 魏于涛 et al.; 《计算机工程与应用》 (Computer Engineering and Applications); 2009-04-01 (No. 10); full text *

Also Published As

Publication number Publication date
CN103248911A (en) 2013-08-14

Similar Documents

Publication Publication Date Title
CN103248911B (en) A Virtual Viewpoint Rendering Method Based on Space-Time Combination in Multi-viewpoint Video
CN103024421B (en) Method for synthesizing virtual viewpoints in free viewpoint television
CN111325693B (en) A Large-scale Panoramic Viewpoint Synthesis Method Based on Single Viewpoint RGB-D Image
US9237330B2 (en) Forming a stereoscopic video
EP3367334B1 (en) Depth estimation method and depth estimation apparatus of multi-view images
JP4896230B2 (en) System and method of object model fitting and registration for transforming from 2D to 3D
US9041819B2 (en) Method for stabilizing a digital video
CN104616286B (en) Quick semi-automatic multi views depth restorative procedure
TW201915944A (en) Image processing method, apparatus, and storage medium
US20130127988A1 (en) Modifying the viewpoint of a digital image
KR102156402B1 (en) Method and apparatus for image processing
CN102074020B (en) Method for performing multi-body depth recovery and segmentation on video
US8611642B2 (en) Forming a steroscopic image using range map
CN108364344A (en) A kind of monocular real-time three-dimensional method for reconstructing based on loopback test
US20130129192A1 (en) Range map determination for a video frame
CN111047709B (en) Binocular vision naked eye 3D image generation method
CN116977596A (en) Three-dimensional modeling system and method based on multi-view images
CN101287142A (en) Method of Converting Plane Video to Stereo Video Based on Two-way Tracking and Feature Point Correction
CN103247065B (en) A kind of bore hole 3D video generation method
JP7479729B2 (en) Three-dimensional representation method and device
CN116863101A (en) Reconstruction model geometry and texture optimization method based on adaptive mesh subdivision
CN102609950A (en) Two-dimensional video depth map generation process
Oliveira et al. Selective hole-filling for depth-image based rendering
CN109685879A (en) Determination method, apparatus, equipment and the storage medium of multi-view images grain distribution
CN119169288A (en) A scene object classification method and system based on multi-view depth images

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20151104

Termination date: 20180520