
CN100448271C - Video Editing Method Based on Panorama Stitching - Google Patents


Info

Publication number
CN100448271C
Authority
CN
China
Prior art keywords
video
image
panorama
editing
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2007100707436A
Other languages
Chinese (zh)
Other versions
CN101119442A (en)
Inventor
杜歆
朱云芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CNB2007100707436A
Publication of CN101119442A
Application granted
Publication of CN100448271C
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Studio Circuits (AREA)

Abstract

The invention discloses a video editing method based on panorama stitching, used to edit or repair a moving video sequence. The method comprises the following steps: matching feature points between the frames of the video sequence, calculating the projection matrices between frames, and stitching the images to obtain a video panorama; editing the video panorama through human interaction and computer processing according to the specific editing requirements; and recovering the video sequence from the edited panorama according to the projection relationships. Using the video panorama as an intermediary, the invention converts video editing into image editing of the video panorama, which not only presents the whole scene directly to the user but also reduces the amount of computation and human interaction.

Description

Video Editing Method Based on Panorama Stitching

Technical Field

The invention relates to a method for processing video sequences, and in particular to a new method that synthesizes a video panorama and replaces traditional video editing with image editing of that panorama.

Background

In traditional video editing, shooting and editing usually use videotape as the storage medium. Because material is stored sequentially on the tape, completing an edit requires repeatedly searching for and copying material and rearranging it on another tape. This approach is called linear editing. With the development of digital technology, dedicated non-linear editing machines appeared, which allow material to be processed more conveniently, independent of its linear position on the tape. In fact, a PC can also serve as a non-linear editing machine: because all material is captured to disk, content at any position on the timeline can be processed at any time.

Non-linear editing makes video editing much more convenient, but it is still tedious, because existing non-linear editing expands the video file frame by frame and edits at frame granularity. Because the volume of video data is very large, frame-by-frame editing requires a great deal of human interaction and computation.

An object in a video usually appears in many frames, so modifying the video directly has to be done frame by frame and involves a great deal of repetitive work. If the whole video segment is instead represented by a single image, that image is edited as required, and the video is then regenerated from the edited image, the manual workload is greatly reduced (the process can even be completed fully automatically by computer), computation time is saved, and efficiency improves. The invention therefore proposes a new video editing method based on panorama stitching.

Summary of the Invention

The object of the invention is to provide a video editing method based on panorama stitching that overcomes the drawbacks of existing methods, which edit video frame by frame and are therefore unintuitive, computationally expensive, and time-consuming, and to provide a way of editing video content quickly and intuitively.

To achieve this object, the invention adopts the following technical solution:

1. The invention provides a video content editing method based on panorama stitching, used to edit a moving video sequence. The method comprises:

1) generating, from multiple video frames, a video panorama that describes the whole scene of the moving video;

2) editing the image content of the resulting video panorama;

3) back-projecting the edited video panorama into the coordinate system of each video frame to generate the edited video sequence.

2. Generating the video panorama comprises the following steps:

1) performing global motion estimation on the relative global motion between the video frames to obtain the planar projection relationship between the frame images;

2) if the moving video sequence contains moving objects, removing them first;

3) according to the planar projection relationships between the frame images, taking the first frame as the reference frame, establishing the panorama coordinate system, projecting every frame image into it, and estimating the size of the panorama;

4) according to the planar projection relationships between the frames, computing, for each pixel of the panorama, its corresponding points in the video frame images, sorting the values at these corresponding points, and taking the median as the panorama value, thereby forming the video panorama.

3. The global motion estimation comprises:

1) a matching step: extracting corner points from each frame image and matching them by correlation to obtain an initial set of matches;

2) a parameter estimation step: using RANSAC to reject false matches from the initial set, and estimating the transformation parameters under perspective projection by least squares.

4. The moving object removal comprises:

1) using frame differencing to determine the approximate extent of the moving object;

2) using color-based region segmentation to divide the image into regions of different colors;

3) combining the two with graph cuts, and solving the optimization with the segmentation result of the previous frame as a constraint.

5. The method further comprises correcting the color and brightness of each frame image to eliminate color differences caused by differing exposure and white balance during shooting.

6. The panorama content editing comprises:

1) image transplantation: a region is selected manually and its content is placed into the region to be filled; the colors of the transplanted content are then adjusted according to the information outside the filled region so that the fill looks natural;

2) image editing based on information propagation: using the known information around the boundary of the region to be edited, the gray-level information on the boundary is propagated into the region along the direction of minimum gradient;

3) semi-automatic filling of textured images: textured regions are filled automatically.

7. Generating the edited video sequence comprises the following steps:

1) computing the back-projection matrix from the video panorama to the coordinate system of each video frame;

2) generating each video frame image from the video panorama according to the back-projection matrix, which completes the video editing process.

The beneficial effects of the invention are:

1. The invention converts the frame-by-frame editing of traditional video editing into a single edit of the synthesized video panorama, which greatly reduces the human interaction and computation required for editing.

2. Because the user edits on the resulting video panorama, editing is more intuitive and accurate.

Description of the Drawings

Fig. 1 is a flowchart of the method of the invention.

Fig. 2 is a schematic diagram of image transplantation.

Fig. 3 is a schematic diagram of semi-automatic filling of textured images.

Fig. 4 is an example of using the invention to edit video scene content, where (a) shows the frames of the original video sequence, (b) the generated video panorama, (c) the result of editing the video panorama, and (d) the result of back-projecting the edited video panorama into the coordinate system of each video frame, i.e. the final result.

Detailed Description

The invention is described in further detail below with reference to the drawings and specific embodiments.

Fig. 1 shows the flowchart of the video editing method according to the invention.

When a moving video is shot, camera motion causes the background of the video image to move; this form of motion is called global motion. Its counterpart is local motion, the foreground motion caused by moving objects. Because a moving object may be rigid or non-rigid, time-domain frame differencing is used for the initial segmentation. Frame differencing assumes that the background is static or has a uniform global motion, and that moving objects do not follow this global motion. When the background has a uniform global motion, it suffices to obtain the global motion parameters in order to find the moving-object regions that do not obey them.

As shown in Fig. 1, in step 101 global motion estimation is performed on the relative global motion between the frames of the video sequence to obtain the global motion parameters. As required for video editing, the global motion between every two adjacent frames is estimated, so that the global motion of each frame relative to the reference frame can be computed recursively. This gives the transformation parameters of each frame's coordinate system relative to the reference frame's coordinate system. Usually the first frame of the scene is chosen as the reference frame.

The global motion between adjacent frames can be characterized by global motion parameters, and different parametric models can be chosen depending on the scene. In the invention, in order to reflect changes in scene depth, a perspective transformation model with eight degrees of freedom is chosen.

The perspective transformation can be written in matrix form as:

$$\begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$

This is the general form of a non-singular linear transformation in homogeneous coordinates. The perspective transformation matrix has 9 entries, but in homogeneous coordinates only their ratios are meaningful, so the transformation actually has 8 parameters. For every two adjacent frames of the same video sequence, four pairs of corresponding points are enough to solve for these parameters.
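As a minimal sketch of how this matrix acts on image points, the Python snippet below maps points through a 3×3 matrix and divides by the third homogeneous coordinate. The function name `apply_homography` and the sample matrix are illustrative assumptions, not part of the patent, and NumPy is assumed to be available.

```python
import numpy as np

def apply_homography(H, pts):
    """Map N x 2 image points through a 3 x 3 perspective matrix H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # rows are (x1, x2, 1)
    mapped = homog @ H.T                               # rows are (x1', x2', x3')
    return mapped[:, :2] / mapped[:, 2:3]              # divide by x3'

# Hypothetical matrix used only for illustration.
H = np.array([[1.0, 0.1, 5.0],
              [0.0, 1.2, -3.0],
              [0.001, 0.0, 1.0]])
pts = np.array([[0.0, 0.0], [10.0, 20.0]])
out = apply_homography(H, pts)
# Scaling H by a non-zero factor gives the same image points, which is
# why only 8 of the 9 entries are independent.
out_scaled = apply_homography(2.5 * H, pts)
```

The division by the third coordinate is what makes the overall scale of the matrix irrelevant.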

To solve for the transformation parameters, the invention uses a feature point matching algorithm, which comprises the following steps:

① Corner points are extracted from each of the adjacent frame images, for example Harris corners or SUSAN corners.

② A coarse match is obtained by correlating the neighborhoods of the extracted corners.

For a feature point x in image n of the sequence, the correlation window is set to (2n+1)×(2m+1). The search region in image n+1 is set to $(2d_u+1)\times(2d_v+1)$. The correlation coefficient ρ(x, x′) between x and each point x′ in the search region is computed:

$$\rho(x, x') = \frac{\mathrm{Cov}(x, x')}{\sigma(x)\cdot\sigma(x')}$$

where Cov(x, x′) is the covariance of x′ and x:

$$\mathrm{Cov}(x, x') = \frac{\sum_{i=-n}^{n}\sum_{j=-m}^{m}\left[I(u+i, v+j) - E(x)\right]\left[I(u'+i, v'+j) - E(x')\right]}{(2n+1)(2m+1)}$$

σ(x) is the standard deviation of the correlation window at point x = (u, v):

$$\sigma(x) = \sqrt{\frac{\sum_{i=-n}^{n}\sum_{j=-m}^{m}\left[I(u+i, v+j) - E(x)\right]^2}{(2n+1)(2m+1)}}$$

E(x) is the mean of the correlation window at point x = (u, v):

$$E(x) = \frac{\sum_{i=-n}^{n}\sum_{j=-m}^{m} I(u+i, v+j)}{(2n+1)(2m+1)}$$

The candidate with the largest correlation coefficient ρ(x, x′) is chosen as the best match. To ensure that the match is correct, a threshold T should also be set, and the correlation coefficient of the best match should exceed it.
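The correlation search in step ② can be sketched as follows. This is a minimal illustration under stated assumptions: grayscale images are NumPy arrays indexed as `img[u, v]`, and the function names, default window sizes, and the threshold value 0.8 are hypothetical choices that the patent does not specify.

```python
import numpy as np

def correlation(img_a, x, img_b, x_prime, n=3, m=3):
    """rho(x, x') over (2n+1) x (2m+1) windows; x = (u, v) indexes img[u, v]."""
    (u, v), (u2, v2) = x, x_prime
    wa = img_a[u - n:u + n + 1, v - m:v + m + 1].astype(float)
    wb = img_b[u2 - n:u2 + n + 1, v2 - m:v2 + m + 1].astype(float)
    da, db = wa - wa.mean(), wb - wb.mean()                 # I - E(x), I - E(x')
    cov = (da * db).mean()                                  # Cov(x, x')
    sigma = np.sqrt((da ** 2).mean() * (db ** 2).mean())    # sigma(x) * sigma(x')
    return cov / sigma if sigma > 0 else 0.0

def best_match(img_a, x, img_b, du=5, dv=5, n=3, m=3, T=0.8):
    """Search the (2du+1) x (2dv+1) region of img_b around x. Returns
    (x', rho) for the largest rho, or None if it does not exceed T."""
    h, w = img_b.shape
    best, best_rho = None, -1.0
    for u2 in range(x[0] - du, x[0] + du + 1):
        for v2 in range(x[1] - dv, x[1] + dv + 1):
            # Skip candidates whose window would leave the image.
            if u2 - n < 0 or v2 - m < 0 or u2 + n >= h or v2 + m >= w:
                continue
            rho = correlation(img_a, x, img_b, (u2, v2), n, m)
            if rho > best_rho:
                best, best_rho = (u2, v2), rho
    return (best, best_rho) if best_rho > T else None

# Synthetic check: frame_b is frame_a with its content moved by (+2, -1).
rng = np.random.default_rng(0)
frame_a = rng.random((40, 40))
frame_b = np.roll(frame_a, shift=(2, -1), axis=(0, 1))
match = best_match(frame_a, (20, 20), frame_b)
```

In practice the search would be run only at detected corner positions rather than at an arbitrary pixel as in this synthetic example.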

③ The matches from ② may include false matches, and even a correct match yields wrong camera motion parameters if the matched points happen to lie on a moving object. A means of verifying the matches is therefore needed to make the result robust. The invention uses the perspective transformation matrix as the constraint and votes with RANSAC to remove the point pairs in the coarse matches that do not conform to the camera's global motion parameters.

For the RANSAC algorithm, see reference 1: Fischler M.A. and Bolles R.C. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 1981, Vol. 24: 381-395.

A brief description of the RANSAC algorithm follows.

Suppose the homogeneous coordinates of a point x are $(x_1, x_2, 1)^T$. Its coordinates x′ after projection by the perspective matrix are given by formula image C20071007074300081, and the Euclidean distance between the two corresponding points after projection is

$$d = \sqrt{\left(\frac{x_1'}{x_3'} - x_1\right)^2 + \left(\frac{x_2'}{x_3'} - x_2\right)^2}$$

where:

$$x_1' = h_{11}x_1 + h_{12}x_2 + h_{13}$$

$$x_2' = h_{21}x_1 + h_{22}x_2 + h_{23}$$

$$x_3' = h_{31}x_1 + h_{32}x_2 + h_{33}$$

Let P be the number of corresponding point pairs obtained by corner matching. The maximum number of consistent pairs $P_{max}$ is initialized to 0, and the number of iterations N is initialized to 200.

a) Randomly select 4 point pairs from the P pairs and compute the perspective matrix $H_{Pi}$;

b) Compute the distance from each matched pair to the model. If the distance is less than the threshold d, mark the pair as true, otherwise mark it as false. Record the number $P_G$ of pairs marked true under the current model;

c) If $P_{max} < P_G$, set $P_{max} = P_G$, save all pairs currently marked true, and go to step d; otherwise return to step a;

d) Compute the number of iterations $k_P = \dfrac{\log(1 - T)}{\log\left(1 - (P_G/P)^4\right)}$, where T is the prior probability of the proportion of correct matches among the predicted coarse matches;

e) If $N > k_P$, set $N = k_P$. Compare N with the current total iteration count k: if k < N, set k = k+1 and return to step a; otherwise go to step f;

f) If $P_{max} \ge 4$, go to ④ and use least squares to solve for the optimal perspective matrix $H_P$ over all pairs that satisfy the current model. The matched pairs recorded earlier are what remains after RANSAC has removed all gross-error pairs.

Using the perspective transformation as the RANSAC constraint not only eliminates false matches but also adds a parametric-equation constraint to the matching.
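Steps a) to f) can be sketched in Python as below. Several points are assumptions rather than the patent's own specification: the 4-point solver is a direct linear transform (the text does not say how the matrix is computed from four pairs), the distance is measured between each projected point and its matched point, T is treated as the usual RANSAC confidence level, and the names and the 2-pixel threshold are hypothetical.

```python
import math
import numpy as np

def homography_from_pairs(src, dst):
    """Direct linear transform (an assumed solver): H up to scale from >= 4 pairs."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y, -xp])
        rows.append([0, 0, 0, x, y, 1, -yp * x, -yp * y, -yp])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)           # null-space vector holds the entries of H

def transfer_distance(H, src, dst):
    """Distance d between each src point projected by H and its match in dst."""
    homog = np.hstack([src, np.ones((len(src), 1))]) @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        proj = homog[:, :2] / homog[:, 2:3]
        d = np.linalg.norm(proj - dst, axis=1)
    return np.nan_to_num(d, nan=np.inf)

def ransac_inliers(src, dst, d_thresh=2.0, T=0.99, N=200, seed=0):
    """Boolean mask of the pairs kept by RANSAC (the pairs marked true)."""
    rng = np.random.default_rng(seed)
    P = len(src)
    best = np.zeros(P, dtype=bool)
    k = 0
    while k < N:
        k += 1
        idx = rng.choice(P, size=4, replace=False)             # step a
        H = homography_from_pairs(src[idx], dst[idx])
        inliers = transfer_distance(H, src, dst) < d_thresh    # step b
        if inliers.sum() > best.sum():                         # step c
            best = inliers
            w4 = (inliers.sum() / P) ** 4
            # step d: k_P = log(1 - T) / log(1 - (P_G / P)^4)
            k_P = 0.0 if w4 >= 1.0 else math.log(1 - T) / math.log1p(-w4)
            N = min(N, math.ceil(k_P))                         # step e
    return best

# Synthetic data: 40 consistent pairs and 10 deliberate mismatches.
rng = np.random.default_rng(1)
H_true = np.array([[1.02, 0.01, 3.0], [-0.02, 0.98, -2.0], [1e-4, 2e-4, 1.0]])
src = rng.uniform(0, 100, size=(50, 2))
proj = np.hstack([src, np.ones((50, 1))]) @ H_true.T
dst = proj[:, :2] / proj[:, 2:3]
dst[40:] += 30.0
mask = ransac_inliers(src, dst)
```

The adaptive update of N means the loop stops early once a large consensus set has been found, which is the purpose of step d).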

④ Usually the number of matched pairs exceeds the 4 pairs required for the 8-degree-of-freedom perspective transformation matrix. The invention solves this overdetermined linear system by least squares, as follows.

Suppose there are n pairs of matched points x, x′ with perspective matrix $H_P$. The least-squares solution $H_P$ should minimize:

$$E = \sum_{i=1}^{n} \left\| H_P x_i - x_i' \right\|^2$$

Set $\dfrac{\partial E}{\partial H_P} = 0$. The derivative with respect to $H_P$ can be obtained by differentiating with respect to each of its elements.

For example, taking the partial derivative of

$$\sum \left\| \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} - \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} \right\|^2$$

with respect to $h_{11}$ and expanding gives:

$$\sum 2\,(h_{11}x_1 + h_{12}x_2 + h_{13}x_3 - x_1')\,x_1 = 0$$

Expanding in the same way for every element of $H_P$ and combining the results gives:

$$2\sum_i (H_P x_i - x_i') \cdot x_i^T = 0$$

that is,

$$H_P \sum_i x_i x_i^T = \sum_i x_i' x_i^T$$

Let the matrices obtained by multiplying the 3×1 column vectors and 1×3 row vectors on the left and right sides be $A = \sum_i x_i x_i^T$ and $B = \sum_i x_i' x_i^T$ respectively. Then $H_P A = B$, and the solution for $H_P$ is:

$$H_P = BA^{-1}$$
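The closed form above translates almost directly into NumPy. The sketch below is illustrative only: the function name is hypothetical, and it assumes both point sets are written with third homogeneous coordinate 1. Under that assumption the fit is exact when the true motion is affine, and only approximate when the bottom row of the matrix departs noticeably from (0, 0, 1).

```python
import numpy as np

def least_squares_matrix(src, dst):
    """Closed form H_P = B A^{-1}, with A = sum x x^T and B = sum x' x^T."""
    X = np.hstack([src, np.ones((len(src), 1))])    # rows are x_i^T = (x1, x2, 1)
    Xp = np.hstack([dst, np.ones((len(dst), 1))])   # rows are x_i'^T
    A = X.T @ X
    B = Xp.T @ X
    # Solve H_P A = B without forming A^{-1} explicitly (A is symmetric).
    return np.linalg.solve(A, B.T).T

# Synthetic check with a hypothetical affine motion M.
rng = np.random.default_rng(2)
M = np.array([[1.1, 0.2, 3.0], [-0.1, 0.9, 4.0], [0.0, 0.0, 1.0]])
src = rng.uniform(0, 100, size=(20, 2))
dst = (np.hstack([src, np.ones((20, 1))]) @ M.T)[:, :2]
H_P = least_squares_matrix(src, dst)
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical design choice and gives the same matrix when A is well conditioned.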

As shown in Fig. 1, in step 102 any moving objects present in the video sequence are removed. The invention uses frame differencing to determine the approximate extent of the moving object and takes this as the initial value. Combining the intra-frame and inter-frame correlations between pixels, it defines an energy function and minimizes it with graph cuts to obtain the final segmentation.

Step 101 described how to obtain the camera's global motion parameters. With these parameters, a frame $I_{i+1}$ can be projected into the coordinate system of the adjacent frame $I_i$ to obtain a new image $I_{i+1}'$, whose background is static relative to $I_i$. Suppose images $I_{i+1}$ and $I_i$ satisfy the projection relationship P, i.e. $x_i = P x_{i+1}$, where $x_i$ and $x_{i+1}$ are the coordinates of corresponding points in $I_i$ and $I_{i+1}$. The new image $I_{i+1}'$ is then computed as follows:

① Create a blank image $I_{i+1}'$ of the same size as the original image;

② For every $x \in I_{i+1}'$, compute the coordinates of its corresponding point in the original image, $x' = P^{-1}x$ ($P^{-1}$ is the inverse projection of P);

③ The corresponding coordinates x′ are usually not integers, so the pixel value is obtained by bilinear interpolation.

Note that because the two frames do not cover exactly the same area, projecting one frame into the other's coordinate system produces some "blind zones" (where the computed coordinates fall outside the image boundary). Pixels in blind zones are set to 0.
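The three warping steps and the blind-zone rule can be sketched as follows. The sketch assumes a single-channel image stored as a NumPy array indexed `img[y, x]` and the convention that P maps coordinates of $I_{i+1}$ into those of $I_i$; the function names are hypothetical.

```python
import numpy as np

def bilinear_sample(img, xs, ys, eps=1e-9):
    """Bilinear interpolation of a 2-D image at float coordinates (xs, ys).
    Coordinates outside the image (the blind zone) yield 0."""
    h, w = img.shape
    inside = (xs >= -eps) & (xs <= w - 1 + eps) & (ys >= -eps) & (ys <= h - 1 + eps)
    xc = np.clip(np.nan_to_num(xs), 0, w - 1)
    yc = np.clip(np.nan_to_num(ys), 0, h - 1)
    x0, y0 = np.floor(xc).astype(int), np.floor(yc).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = xc - x0, yc - y0
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bottom = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    return np.where(inside, top * (1 - fy) + bottom * fy, 0.0)

def warp_into_neighbour(img_next, P):
    """Build I'_{i+1}: resample I_{i+1} on the pixel grid of I_i,
    looking up each destination pixel x at x' = P^{-1} x."""
    h, w = img_next.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)                   # step 1: blank grid
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])   # 3 x (h*w)
    src = np.linalg.inv(P) @ dest                               # step 2
    with np.errstate(divide="ignore", invalid="ignore"):
        sx, sy = src[0] / src[2], src[1] / src[2]
    return bilinear_sample(img_next.astype(float), sx, sy).reshape(h, w)  # step 3

# Hypothetical example: content shifts 2 pixels to the right.
img = np.arange(30, dtype=float).reshape(5, 6)
shift = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
warped = warp_into_neighbour(img, shift)
```

Iterating over destination pixels and sampling the source, rather than pushing source pixels forward, guarantees that every destination pixel receives exactly one value.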

Ideally, when the sequence contains no moving objects, $I_{i+1}'$ and $I_i$ are identical. If the sequence contains moving objects, $I_{i+1}'$ and $I_i$ differ in the regions where there is a moving object. The moving-object region in $I_i$ can therefore be defined as:

$$\{x_i \mid x_i \in I_i,\ |f(x_i) - f(x_{i+1}')| > T\}$$

Here $f(x_i)$ is the pixel value at position $x_i$ in $I_i$, $f(x_{i+1}')$ is the pixel value at coordinates $x_{i+1}'$ in the new image $I_{i+1}'$ generated by projecting $I_{i+1}$ onto $I_i$, and T is a preset threshold. When the difference between corresponding points exceeds T, the point is considered to lie on a moving object; otherwise it is treated as a static background point.

To retain as much of the static background in each frame as possible, a third frame can be added, so that both the previous and the next frame are used to estimate the moving object in the current frame. The formula is therefore modified to:

$$\{x_i \mid x_i \in I_i,\ |f(x_i) - f(x_{i+1}')| > T \ \&\ |f(x_i) - f(x_{i-1}')| > T\}$$

Because of noise and other factors, the result obtained directly from this formula often contains spurious isolated points or small regions, which can be cleaned up with simple morphological operators.
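The three-frame difference can be written in a few lines. This is a minimal sketch that assumes the previous and next frames have already been warped into the current frame's coordinates; the function name and the threshold value are hypothetical, and the morphological clean-up is left out.

```python
import numpy as np

def moving_object_mask(prev_warped, cur, next_warped, T=25.0):
    """True where the current frame differs from BOTH warped neighbours by more than T."""
    cur = cur.astype(float)
    differs_from_next = np.abs(cur - next_warped) > T
    differs_from_prev = np.abs(cur - prev_warped) > T
    return differs_from_next & differs_from_prev

# Synthetic frames: a bright object moves one pixel to the right per frame.
background = np.full((5, 7), 100.0)
prev_f, cur_f, next_f = background.copy(), background.copy(), background.copy()
prev_f[2, 2] = 255.0
cur_f[2, 3] = 255.0
next_f[2, 4] = 255.0
mask = moving_object_mask(prev_f, cur_f, next_f)
```

Requiring both differences to exceed the threshold removes the "ghost" left at the object's position in the neighbouring frames, which a two-frame difference would also flag.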

If graph cuts use individual pixels as nodes, the computation is heavy and hurts the efficiency of the algorithm. Before applying graph cuts, the invention therefore pre-segments the image with mean shift and uses the resulting regions as the nodes of the graph. This reduces the computation and, because mean shift locates the edges of color regions fairly accurately, also preserves the accuracy of the segmentation.

For the mean shift algorithm, see reference 2: Fukunaga K. and Hostetler L.D. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Transactions on Information Theory, 1975, 21(1): 32-40.

Mean-shift color segmentation has two main steps. First, mean shift filtering is applied to the image in the joint domain. The filter is discontinuity-preserving: it assigns each pixel to the nearest mode in the joint domain and replaces the pixel's original value with the three color components of that mode. Then the basins of attraction of modes lying within $h_r/2$ of each other in color space are merged iteratively until convergence, which yields the segmented image.

Mean-shift color segmentation divides the image into small regions using color information and spatial position. The segmentation carries no semantic knowledge, however, so it cannot by itself distinguish moving objects from background. Graph cuts are therefore used to relate the moving-object region to the mean-shift result, combining temporal and spatial information to obtain a more accurate segmentation.

For the graph cut algorithm, see reference 3: Yuri Boykov, Olga Veksler, Ramin Zabih. Efficient Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 20(12): 1222-1239.

As shown in Fig. 1, in step 103 the panorama is stitched by planar stitching: the plane of one frame is chosen as the reference, and all other frames are projected onto that plane using the planar projection model to build the panorama. Step 101 described in detail how global motion parameters are estimated by matching. Let $x_i$ denote the coordinates of a point in frame i and $P_{i,j}$ the perspective projection matrix between frames i and j. The inter-frame projection relationship is:

$$x_i = P_{i,i+1}\,x_{i+1}$$

By the transitivity between adjacent frames, the projection relationship between each frame and the first frame can be obtained, namely:

$$\begin{aligned} x_1 &= P_{1,2}\,x_2 \\ x_2 &= P_{2,3}\,x_3 \\ &\ \,\vdots \\ x_{n-1} &= P_{n-1,n}\,x_n \end{aligned}$$

From this, the projection relationship between each frame and the first frame can be computed:

$$\begin{aligned} x_1 &= P_{1,2}\,x_2 = P_{1,2}P_{2,3}\,x_3 = P_{1,3}\,x_3 \quad (\text{where } P_{1,3} = P_{1,2}P_{2,3}) \\ &\ \,\vdots \\ x_1 &= P_{1,2}P_{2,3}\cdots P_{n-1,n}\,x_n = P_{1,n}\,x_n \quad (\text{where } P_{1,n} = P_{1,2}P_{2,3}\cdots P_{n-1,n}) \end{aligned}$$

Taking the first frame as the reference frame gives the projection relationship between each frame and the panorama.
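The chaining of pairwise matrices amounts to a running matrix product. The sketch below assumes each pairwise matrix maps frame i+1 into frame i, as in the relation $x_i = P_{i,i+1}x_{i+1}$; the helper names and the translation-only test motion are hypothetical.

```python
import numpy as np

def chain_to_first(pairwise):
    """pairwise = [P_{1,2}, P_{2,3}, ..., P_{n-1,n}], where P_{i,i+1} maps
    frame i+1 into frame i. Returns [P_{1,1}, P_{1,2}, ..., P_{1,n}]."""
    total = np.eye(3)
    chained = [total]
    for P in pairwise:
        total = total @ P            # P_{1,k+1} = P_{1,k} P_{k,k+1}
        chained.append(total)
    return chained

def translation(tx, ty):
    """Hypothetical helper: a pure-translation perspective matrix."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

to_first = chain_to_first([translation(5, 0), translation(3, 1)])
x3 = np.array([1.0, 1.0, 1.0])       # a point in frame 3
x1 = to_first[2] @ x3                # the same point in frame 1
```

Because errors in the pairwise estimates accumulate along the chain, long sequences may drift; the patent text does not address this, so any correction would be an addition to the method.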

Once the projection relationship between each frame and the panorama coordinate system is known, the size of the panorama has to be computed. The four corners of each frame are projected into the panorama coordinate system, giving coordinates (x, y). Comparing these coordinates yields $x_{max}$, $x_{min}$ and $y_{max}$, $y_{min}$, and the size of the panorama is then W×H, where $W = x_{max} - x_{min}$ and $H = y_{max} - y_{min}$.
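A short sketch of the size computation follows. It assumes every frame has the same width and height, takes the frame corners at (0, 0), (w, 0), (w, h), and (0, h), and rounds the extent up to whole pixels; the function name and the returned offset are illustrative additions.

```python
import numpy as np

def panorama_size(frame_w, frame_h, to_first):
    """Project the four corners of every frame with its P_{1,k} and
    return (W, H, (x_min, y_min)) for the bounding panorama."""
    corners = np.array([[0, 0, 1], [frame_w, 0, 1],
                        [frame_w, frame_h, 1], [0, frame_h, 1]], dtype=float).T
    xs, ys = [], []
    for P in to_first:
        c = P @ corners
        xs.extend(c[0] / c[2])
        ys.extend(c[1] / c[2])
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    W = int(np.ceil(x_max - x_min))      # W = x_max - x_min
    H = int(np.ceil(y_max - y_min))      # H = y_max - y_min
    return W, H, (x_min, y_min)

# Hypothetical two-frame example: the second frame is shifted by (5, -2).
second = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]])
W, H, offset = panorama_size(10, 8, [np.eye(3), second])
```

The offset is useful because projected coordinates can be negative, so panorama pixel indices need to be shifted by $(x_{min}, y_{min})$ before being used as array indices.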

Because a perspective transformation is an invertible linear transformation (in homogeneous coordinates), the projection from a point on the panorama to each frame can be written as $x_n = P_{1,n}^{-1} x$, where $P_{1,n}^{-1}$ is the inverse of $P_{1,n}$. Under this relationship, the coordinates $(x_n, y_n)$ in the $n$-th frame of a panorama pixel $(x, y)$ can be found. Since $(x_n, y_n)$ is generally not an integer position, the pixel value at that point is computed by bilinear interpolation.
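
The inverse mapping can be sketched as below, assuming $P_{1,n}$ is a non-singular 3×3 NumPy array; the function name `panorama_to_frame` is hypothetical. The division by the third homogeneous coordinate is what makes the mapping projective rather than affine.

```python
import numpy as np

def panorama_to_frame(P_1n, x, y):
    """Map a panorama point (x, y) to frame n using x_n = P_{1,n}^{-1} x."""
    q = np.linalg.inv(np.asarray(P_1n, dtype=float)) @ np.array([x, y, 1.0])
    return q[0] / q[2], q[1] / q[2]
```

The returned coordinates are real-valued, which is why an interpolation step is needed before a pixel value can be read from the frame.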

When the camera motion between adjacent frames of the video sequence is small, a point on the panorama usually corresponds to points in several video frames. If $M$ frames contain a corresponding point, the median of these $M$ pixel values is taken as the pixel value of the panorama point.
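
The median rule can be illustrated with the sketch below. It assumes the frames have already been warped into the panorama coordinate system and stacked into one array, with NaN marking pixels a frame does not cover; this storage layout and the name `median_blend` are assumptions made for the example.

```python
import numpy as np

def median_blend(samples):
    """Per-pixel median over the frames that cover each panorama pixel.

    samples: array of shape (M, H, W); NaN where a frame has no data.
    Pixels covered by no frame are set to 0.
    """
    samples = np.asarray(samples, dtype=float)
    covered = ~np.all(np.isnan(samples), axis=0)
    pano = np.zeros(samples.shape[1:])
    # nanmedian ignores the NaN entries, so M may differ from pixel to pixel.
    pano[covered] = np.nanmedian(samples[:, covered], axis=0)
    return pano
```

A median is less sensitive than a mean to a single outlying sample, for example a residual moving object visible in only one of the frames.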

As shown in Figure 1, in step 104 the image content of the panorama generated in step 103 is edited according to the specific requirements of the video edit. Many image editing methods exist; the present invention uses the following three:

① Image transplantation. Figure 2 illustrates image transplantation: the image $g$ of the selected region on the left is transplanted into the region $\Omega$ on the right, and $\Omega'$ denotes the inner boundary of $\Omega$. Let $u$ be the composite image sought after transplantation, $g$ the transplanted image, and $f$ the original image of the destination region $\Omega$.

To bring in the information of the transplanted image, the first-order derivatives of the composite image are required to equal those of the transplanted image, $\nabla u = \nabla g$, that is:

$$\frac{\partial u}{\partial x} = \frac{\partial g}{\partial x} \ \text{ for } u \in \Omega, \qquad \frac{\partial u}{\partial y} = \frac{\partial g}{\partial y} \ \text{ for } u \in \Omega$$

The composite image is also constrained on the boundary of the transplanted region. To keep the image continuous, it should normally satisfy $u = f$ for $u \in \Omega'$.

Both constraints can be expressed by a single energy function $J(u)$:

$$J(u) = \int_{\Omega} \left( \left( \frac{\partial u}{\partial x} - \frac{\partial g}{\partial x} \right)^2 + \left( \frac{\partial u}{\partial y} - \frac{\partial g}{\partial y} \right)^2 \right) d(x,y) + \lambda \int_{\Omega'} (u - f)^2 \, d(x,y)$$

The $u$ that minimizes the energy function $J(u)$ is the composite image. Here $\lambda$ ($\lambda > 0$) is the Lagrange multiplier, which sets the relative weight of the two conditions in the overall constraint.
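
One way to approximate the minimizer numerically is sketched below. It is a simplification, not the method as stated: the boundary term is enforced exactly ($u = f$ outside $\Omega$, roughly the limit of a very large $\lambda$) instead of being weighted by $\lambda$, and the gradient term is minimized by solving the discrete Poisson equation $\Delta u = \Delta g$ inside $\Omega$ with Jacobi iterations. The function name `transplant`, the iteration count, and the requirement that the mask not touch the image border are assumptions of the sketch.

```python
import numpy as np

def transplant(f, g, mask, iterations=2000):
    """Blend source g into destination f over the region mask (True inside Omega).

    f, g: 2-D arrays of equal shape; mask must not touch the image border.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    u = f.copy()

    def neighbour_sum(a):
        # Sum of the four direct neighbours (wrap-around is harmless because
        # the mask stays away from the border).
        return (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
                np.roll(a, 1, 1) + np.roll(a, -1, 1))

    lap_g = neighbour_sum(g) - 4.0 * g        # discrete Laplacian of g
    for _ in range(iterations):
        # Jacobi update of Laplacian(u) = Laplacian(g), applied only inside Omega.
        u_new = (neighbour_sum(u) - lap_g) / 4.0
        u[mask] = u_new[mask]
    return u
```

Pixels outside the mask are never modified, so the result joins the destination image continuously while its gradients inside $\Omega$ follow those of $g$. Jacobi iteration is chosen only for brevity; it converges slowly on large regions.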

② Image editing based on information propagation. Using the known information surrounding the boundary of the region to be edited, the gray-level information on the boundary is propagated into the region along the direction of smallest gradient.

Let $I_0(i,j): [0,M] \times [0,N] \to \mathbb{R}$ denote an image of size $M \times N$. The image inpainting algorithm iteratively produces a sequence of images $I(i,j,n): [0,M] \times [0,N] \times \mathbb{N} \to \mathbb{R}$ satisfying $I(i,j,0) = I_0(i,j)$ and $\lim_{n \to \infty} I(i,j,n) = I_R(i,j)$, where $I_R(i,j)$ is the output image. The iteration is written as:

$$I^{n+1}(i,j) = I^n(i,j) + \Delta t \, I_t^n(i,j), \quad \forall (i,j) \in \Omega$$

In this equation, $n$ is the inpainting time, i.e. the iteration index, $(i,j)$ are the pixel coordinates, $\Delta t$ is the step size of each iteration, $I_t^n(i,j)$ is the update applied to the image $I^n(i,j)$, and $I^{n+1}(i,j)$ is the result of one iteration of $I^n(i,j)$ under the constraint of $I_t^n(i,j)$. The equation applies only inside the manually specified region $\Omega$ to be modified. After $n$ iterations the repaired image is obtained.

The key to the algorithm is finding a suitable $I_t^n(i,j)$. In manual restoration, information outside the damaged region is gradually extended into the damaged region across its outer boundary. A computer imitation of manual restoration can borrow this idea and smoothly extend the information outside $\Omega$ into its interior. Let $L^n(i,j)$ be the information to be propagated and $\vec{N}^n(i,j)$ the direction of propagation; then $I_t^n(i,j)$ is given by:

$$I_t^n(i,j) = \overrightarrow{\delta L^n}(i,j) \cdot \vec{N}^n(i,j)$$

where $\overrightarrow{\delta L^n}(i,j)$ is the change in the information $L^n(i,j)$. With this equation the image information $L^n(i,j)$ can be estimated and its change along the direction $\vec{N}^n(i,j)$ computed. At steady state, that is, when the algorithm has converged, $I^{n+1}(i,j) = I^n(i,j)$, so $\overrightarrow{\delta L^n}(i,j) \cdot \vec{N}^n(i,j) = 0$, which means the information $L$ has been fully propagated along the direction $\vec{N}$.

Because the information should diffuse smoothly into the image, $L^n(i,j)$ is a smoothness operator. The Laplacian can be used: $L^n(i,j) = I_{xx}^n(i,j) + I_{yy}^n(i,j)$. Other smoothness operators are also applicable.

Because the isophotes continue along the normal direction of the boundary, the normal direction of the boundary of $\Omega$ is chosen as the direction $\vec{N}$ along which the smoothness information changes. For each point $(i,j)$ in $\Omega$, this direction is perpendicular to the boundary on which the point lies. The repair region is arbitrary, so this direction is independent of the original image itself. If the isophote direction coincides with $\vec{N}$, then the best choice of $\vec{N}$ is the isophote direction. At any point $(i,j)$, the gradient $\nabla I^n(i,j)$ is the direction of greatest change, so the direction perpendicular to the gradient, $\nabla^{\perp} I^n(i,j)$, is the direction of least change. Defining $\nabla^{\perp} I^n(i,j)$ as the isophote direction, the direction vector $\vec{N}$ is:

$$\vec{N}(i,j,n) = \nabla^{\perp} I^n(i,j)$$
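
A single iteration of this scheme might look like the sketch below. It follows the formulas as written, with $L^n$ the Laplacian, $\overrightarrow{\delta L^n}$ taken as the finite-difference gradient of $L^n$, and $\vec{N} = \nabla^{\perp} I^n = (-I_y, I_x)$. The function name `inpaint_step`, the default step size, and the use of `numpy.gradient` for all derivatives are assumptions; the sketch omits any normalization or stabilizing steps that a practical implementation would likely need.

```python
import numpy as np

def inpaint_step(I, mask, dt=0.1):
    """One update I^{n+1} = I^n + dt * I_t^n, applied only where mask is True."""
    I = np.asarray(I, dtype=float)
    Iy, Ix = np.gradient(I)              # derivatives along rows (y) and columns (x)
    Ixx = np.gradient(Ix, axis=1)
    Iyy = np.gradient(Iy, axis=0)
    L = Ixx + Iyy                        # smoothness estimate L^n (Laplacian)
    dLy, dLx = np.gradient(L)            # change of L, i.e. delta L^n
    Nx, Ny = -Iy, Ix                     # isophote direction, perpendicular to the gradient
    It = dLx * Nx + dLy * Ny             # I_t^n = delta L^n . N^n
    out = I.copy()
    out[mask] += dt * It[mask]
    return out
```

Calling the function repeatedly, feeding each output back in as the next input, realizes the iteration over $n$; a fixed point is reached when the update term vanishes inside $\Omega$.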

③ Semi-automatic filling of textured images. The two methods above mainly target smooth image regions; regions rich in texture are handled by semi-automatic texture filling. The undamaged area of the image serves as the sample source, and the image is repaired block by block. Each block has size $w \times w$; the damaged region is divided on this basis into $n$ blocks $\{B_1, B_2, \ldots, B_n\}$, which are then repaired one after another.

As shown in Figure 3, for the damaged block $B_k$ currently being repaired, a band $E_{B_k}$ of width $w_B$ is taken in the known area around it (the shaded part in Figure 3). For the block $B_k$ shown in the figure, the image to its right is still damaged, so the band on the right is unknown and only the left, upper and lower bands are used. Likewise, for every other damaged block only the bands on sides whose information is known are considered. For any sample block $B_{(x,y)}$ in the sampling area (here the undamaged part of the image), where $B_{(x,y)}$ denotes the block whose lower-left corner is $(x,y)$, the band $E_{B_{(x,y)}}$ of the same position and size is taken, shown as the shaded part inside the undamaged area in Figure 3. Computing the distance between the two bands gives the set $\psi_B$ of blocks whose distance to the current damaged block $B_k$ is below a given threshold. The set $\psi_B$ is defined as:

$$\psi_B = \{\, B_{(x,y)} \mid d(E_{B_{(x,y)}}, E_{B_k}) < d_{\max} \,\}$$

where $d_{\max}$ is the given threshold. One block is chosen at random from $\psi_B$, and the gray value of each of its points is copied into the current damaged block $B_k$. The remaining damaged blocks are processed in the same way (an area that has already been repaired may supply boundary constraints for the next block). Once the last block has been filled, the whole image is repaired.
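
Filling one block could be sketched as follows. Several details are assumptions made for the example, since the text does not fix them: the distance $d$ is taken to be the mean squared difference over the known band pixels, blocks are addressed by their top-left array index rather than the lower-left corner, the block plus its band is assumed to lie inside the image, and the name `fill_block` is hypothetical.

```python
import numpy as np

def fill_block(image, known, top, left, w, wb, d_max, rng=None):
    """Fill the w x w block at (top, left) from a randomly chosen similar sample.

    image: 2-D float array, modified in place; known: bool array, True where valid.
    Returns the list of candidate window positions (the set psi_B).
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W = image.shape
    r0, c0, size = top - wb, left - wb, w + 2 * wb     # block plus surrounding band
    target = image[r0:r0 + size, c0:c0 + size]
    band = known[r0:r0 + size, c0:c0 + size].copy()
    band[wb:wb + w, wb:wb + w] = False                 # the block itself is not in the band
    candidates = []
    for y in range(H - size + 1):
        for x in range(W - size + 1):
            if not known[y:y + size, x:x + size].all():
                continue                               # sample only undamaged windows
            window = image[y:y + size, x:x + size]
            d = np.mean((window[band] - target[band]) ** 2)
            if d < d_max:
                candidates.append((y, x))
    if not candidates:
        raise ValueError("no sample block within d_max")
    y, x = candidates[rng.integers(len(candidates))]
    image[top:top + w, left:left + w] = image[y + wb:y + wb + w, x + wb:x + wb + w]
    known[top:top + w, left:left + w] = True           # repaired pixels constrain later blocks
    return candidates
```

Restricting the comparison to `band` pixels that are known reproduces the rule that only sides with available information are used, and marking the filled block as known lets it contribute to the band of the next block.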

As shown in Figure 1, in step 105 the video sequence is recovered from the panorama edited in step 104, according to the specific requirements of the video edit.

The projection relationship between each video frame and the panorama is:

$$x = P_{1,n}\, x_n$$

Because a perspective transformation is an invertible linear transformation, it follows that:

$$x_n = P_{1,n}^{-1}\, x$$
where $P_{1,n}^{-1}$ is the inverse of $P_{1,n}$.

Under this projection relationship, the panorama coordinates of every pixel of each video frame can be obtained. If those coordinates are not integers, the pixel value is computed by bilinear interpolation.
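
Recovering one frame from the edited panorama can be sketched as below. The sketch assumes a single-channel panorama of at least 2×2 pixels stored as a NumPy array, an `origin` parameter holding the panorama coordinates of array index $(0, 0)$, and clamping of coordinates that fall outside the panorama; these choices and the name `recover_frame` are illustrative assumptions.

```python
import numpy as np

def recover_frame(pano, P_1n, width, height, origin=(0.0, 0.0)):
    """Resample frame n from the panorama using x = P_{1,n} x_n."""
    pano = np.asarray(pano, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])   # 3 x N homogeneous
    q = np.asarray(P_1n, dtype=float) @ pts
    px = np.clip(q[0] / q[2] - origin[0], 0, pano.shape[1] - 1)  # panorama column
    py = np.clip(q[1] / q[2] - origin[1], 0, pano.shape[0] - 1)  # panorama row
    # Bilinear interpolation between the four surrounding panorama pixels.
    x0 = np.minimum(np.floor(px).astype(int), pano.shape[1] - 2)
    y0 = np.minimum(np.floor(py).astype(int), pano.shape[0] - 2)
    ax, ay = px - x0, py - y0
    top = (1 - ax) * pano[y0, x0] + ax * pano[y0, x0 + 1]
    bottom = (1 - ax) * pano[y0 + 1, x0] + ax * pano[y0 + 1, x0 + 1]
    return ((1 - ay) * top + ay * bottom).reshape(height, width)
```

Applying the function once per frame, with that frame's $P_{1,n}$, yields the edited video sequence; a color panorama could be handled by processing each channel separately.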

Figure 4 shows an example of using the invention to edit video scene content: (a) the frames of the original video sequence; (b) the generated video panorama; (c) the result of editing the video panorama; (d) the result of back-projecting the edited video panorama into the coordinate system of each video frame, i.e. the final result.

Claims (4)

1. A video editing method based on panorama stitching, characterized in that the method comprises the following steps:
1) generating, from a plurality of video frames, a video panorama describing the full view of the motion video;
2) editing the image content of the resulting video panorama;
3) back-projecting the edited video panorama into the coordinate system of each video frame to generate the edited video sequence;
wherein generating the video panorama comprises the following steps:
(1) performing global motion estimation on the relative global motion among the video frames to obtain the planar projection relationship between the video frame images;
(2) if the motion video sequence contains moving objects, removing them first;
(3) according to the planar projection relationship between the video frame images, taking the first frame as the reference frame, establishing the panorama coordinate system, projecting each video frame image into this coordinate system, and estimating the size of the panorama;
(4) according to the planar projection relationship between the video frames, computing for each pixel of the panorama its corresponding points in the video frame images, sorting these corresponding points, and taking the median as the value on the panorama, thereby forming the video panorama;
wherein the panorama content editing method comprises:
(1) image transplantation: manually selecting a region, placing the information of that region into the region to be filled, and changing the color of the original region information according to the information outside the filled region so that the fill looks natural;
(2) image editing based on information propagation: using the known information surrounding the boundary of the region to be edited, propagating the gray-level information on the boundary into the region along the direction of smallest gradient;
(3) semi-automatic filling of textured images: automatically filling textured regions;
wherein generating the edited video sequence comprises the following steps:
(1) computing the back-projection matrix from the video panorama to the coordinate system of each video frame;
(2) generating each video frame image from the video panorama according to the back-projection matrix, completing the video editing process.

2. The video editing method based on panorama stitching according to claim 1, characterized in that the global motion estimation comprises:
1) a matching step: extracting the corner points of each video frame image and performing correlation matching to obtain an initial set of matched points;
2) a parameter estimation step: using RANSAC to reject incorrect matches from the initial set of matched points, and estimating the transformation parameters under perspective projection by least squares.

3. The video editing method based on panorama stitching according to claim 1, characterized in that the moving object removal method comprises:
1) using frame differencing to determine the approximate extent of the moving object;
2) using color-based region segmentation to divide the image into regions of different colors;
3) combining the two with graph cut, and using the segmentation result of the previous frame as a constraint in the optimization.

4. The video editing method based on panorama stitching according to claim 1, characterized in that it further comprises correcting the color and brightness of each video frame image to eliminate color differences caused by differing exposure and white balance at capture.
CNB2007100707436A 2007-08-10 2007-08-10 Video Editing Method Based on Panorama Stitching Expired - Fee Related CN100448271C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2007100707436A CN100448271C (en) 2007-08-10 2007-08-10 Video Editing Method Based on Panorama Stitching

Publications (2)

Publication Number Publication Date
CN101119442A CN101119442A (en) 2008-02-06
CN100448271C true CN100448271C (en) 2008-12-31

Family

ID=39055353

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007100707436A Expired - Fee Related CN100448271C (en) 2007-08-10 2007-08-10 Video Editing Method Based on Panorama Stitching

Country Status (1)

Country Link
CN (1) CN100448271C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20081231

Termination date: 20100810