
CN101710987B - Configuration method of layered B forecasting structure with high compression performance - Google Patents


Info

Publication number
CN101710987B
Authority
CN
China
Prior art keywords
image
configuration method
layered
picture
compression performance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200910155883
Other languages
Chinese (zh)
Other versions
CN101710987A (en
Inventor
朱政
李东晓
张明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wanwei Display Technology Shenzhen Co ltd
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN 200910155883
Publication of CN101710987A
Application granted
Publication of CN101710987B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a method for configuring a hierarchical B prediction structure with high compression performance. The structure is configured by recursive bisection: a first-level B picture divides the group of pictures (GOP) into two sub-GOPs; each sub-GOP whose length is greater than 1 is then divided into two smaller sub-GOPs by a lower-level B picture, and the recursion continues until no further division is possible. The B picture used for each division is selected by a simple expression. The method applies to any GOP length, and its configuration difficulty and division regularity are comparable to those of the traditional configuration methods. Unlike those methods, it yields the optimal prediction structure, found by dynamic programming search on the basis of a mathematical model of the compression performance of hierarchical B prediction structures. Experimental results indicate that, across various GOP lengths, the method achieves better average rate-distortion performance.

Description

A Method for Configuring a Hierarchical B Prediction Structure with High Compression Performance

Technical Field

The invention relates to the field of video coding, and in particular to a method for configuring a hierarchical B prediction structure with high compression performance.

Background

Compared with analog video, digital video offers high quality, easy processing, easy correction, large capacity, and a greater number of programs. As demand for video quality keeps rising and technology advances, digital video has come into ever wider use.

However, digital video carries a very large amount of information and places high demands on storage capacity and on the bandwidth of the transmission network. To use digital video effectively, compression coding is the first problem to solve. Academia and industry have therefore carried out broad and deep research on video compression coding. At present, the main approaches fall into two classes: waveform-based coding and content-based coding. Since the 1980s, the International Organization for Standardization (ISO) and the ITU Telecommunication Standardization Sector (ITU-T) have released a series of international standards for digital video compression coding, which have greatly advanced video communication and digital television broadcasting. Most of these standards adopt a hybrid coding framework built on predictive coding, transform coding, and entropy coding, and belong to the waveform-based class.

In March 2003, ITU-T/ISO officially published the video coding standard H.264/AVC. H.264/AVC not only improves compression performance significantly but is also network-friendly, and is regarded as a new-generation video coding standard. Like earlier video coding standards, H.264/AVC uses a hybrid coding framework, but it adds many features. Among them, flexibility at the picture level is one of the reasons for its high efficiency.

A digital video signal is a succession of pictures sampled at discrete times. Because the scenes in neighboring pictures are correlated, inter-frame prediction can be used for coding. Prediction is a simple and practical compression method. Inter-frame prediction divides the picture being coded into blocks or macroblocks and searches for the corresponding position of each block or macroblock in a neighboring, already coded picture to obtain a predicted value. Only the difference between the predicted value and the actual value then needs to be transmitted, which achieves compression. A picture coded without inter-frame prediction is called an I picture, a picture coded with only one reference picture is called a P picture, and a picture coded with multiple reference pictures is called a B picture. Unlike earlier video coding standards, H.264/AVC allows the coding order and display order of pictures to be configured arbitrarily, and any picture, including a bidirectionally predicted B picture, can serve as a reference. H.264/AVC also provides powerful reference picture management, and multi-reference-frame prediction can be used in motion estimation to improve prediction accuracy. These flexible features allow the prediction structure to be chosen freely during video coding, including the hierarchical B prediction structure, which has excellent compression performance.

A hierarchical B prediction structure uses several levels of B pictures within one group of pictures. These B pictures are bidirectionally predicted, and higher-level B pictures can serve as references for lower-level B pictures. The first picture of the video sequence is coded as an I picture, and key pictures follow at equal intervals. A key picture can be coded as an I picture or a P picture; I-picture coding is generally used. For key pictures, a picture that is earlier in coding order is necessarily earlier in display order. The pictures located between key pictures are coded as hierarchical B pictures. Each B picture uses as references the nearest preceding picture and the nearest following picture whose level is higher than its own. If key pictures are regarded as the highest level, a first-level B picture can only use the preceding and following key pictures as references, and B pictures at the lowest level are not used as references. A key picture, together with all B pictures that precede it and follow the previous key picture, forms a group of pictures (GOP). A GOP of length L contains L-1 B pictures.

Compared with other temporal prediction structures, the hierarchical B structure significantly improves compression performance. It is important not only in traditional video applications but also in several frontier areas of video application and research, such as scalable video coding and multi-view video coding. Scalable video coding addresses the fact that, in growing Internet services, different users require video services of different kinds and content, so video must be encoded once yet be deliverable at different bit rates and video qualities to meet different application requirements. The picture levels of a hierarchical B prediction structure naturally provide different temporal resolutions. Multi-view video coding is a key technology in three-dimensional stereoscopic video research. A multi-view video signal is captured by an array of cameras shooting the scene simultaneously from different viewpoints; the information of one or more viewpoints can be used to synthesize a virtual viewpoint, providing stereoscopic perception and free switching between arbitrary viewpoints. The massive data volume of multi-view video signals places higher demands on compression. Because of its superior compression performance, the hierarchical B structure is used as the temporal prediction structure in the multi-view video coding standard.

Researchers at HHI first proposed the hierarchical B prediction structure and analyzed its compression performance, coding delay, and memory requirements. Two predefined hierarchical B prediction structures are currently supported by JM, the H.264/AVC reference software; they are selected by setting the parameter HierarchicalCoding to 1 or 2 in the JM configuration file.

With HierarchicalCoding equal to 1, the B pictures in the GOP are classified by the parity of their display order. B pictures at an odd distance from the key picture preceding the GOP are not used as references, and each uses its two immediately adjacent pictures as references. B pictures at an even distance from the key picture preceding the GOP are all used as references, and their levels descend from front to back in display order.

With HierarchicalCoding equal to 2, if the GOP length is L, the number of B-picture levels is lv = [formula image G2009101558832D00031]. All B pictures are first marked as the lowest level. B pictures whose distance from the key picture preceding the GOP is a positive integer multiple of $2^1$ then become the second-to-last level, and so on: B pictures whose distance is a positive integer multiple of $2^{lv-2}$ become the second level, and finally B pictures whose distance is a positive integer multiple of $2^{lv-1}$ become the first level.

Both configuration methods are bisection methods: a B picture first divides the GOP into two parts, using the pictures at the two ends of the GOP as references; each of the two parts is then divided into two parts by a B picture that uses the pictures at the two ends of its own part as references; and so on until no further division is possible.
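The level assignment described for HierarchicalCoding equal to 2 can be sketched in a few lines. This is only an illustration, not the JM implementation: the function name `hc2_levels` is hypothetical, and because the formula image for lv is missing from the source, the sketch assumes lv = ceil(log2(L)).

```python
def hc2_levels(gop_length):
    """Map each B-picture distance (1..L-1) to its level, 1 being the first level.

    ASSUMPTION: lv = ceil(log2(L)); the source gives lv only as an image.
    """
    lv = max(1, (gop_length - 1).bit_length())  # ceil(log2(L)) for L >= 2
    levels = {}
    for distance in range(1, gop_length):
        # Largest k such that 2**k divides the distance from the previous key picture.
        k = (distance & -distance).bit_length() - 1
        # Multiples of 2**(lv-1) are level 1, ..., odd distances are level lv.
        levels[distance] = lv - k
    return levels


print(hc2_levels(8))
```

Under this assumption, a GOP of length 8 has three B-picture levels: distance 4 at level 1, distances 2 and 6 at level 2, and the odd distances at level 3.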

Summary of the Invention

The purpose of the invention is to overcome the deficiencies of the prior art by providing a method for configuring a hierarchical B prediction structure with high compression performance.

The method for configuring a hierarchical B prediction structure with high compression performance is as follows:

The hierarchical B prediction structure is configured by recursive bisection: a B picture first divides the GOP into two sub-GOPs, using the pictures at the two ends of the GOP as references; each of the two sub-GOPs is then divided into two smaller sub-GOPs by a B picture that uses the pictures at the two ends of its own part as references; and so on until no further division is possible;

If the length of a GOP or sub-GOP is L, with L > 1, then the B picture that divides it into two smaller sub-GOPs lies at a distance $D_1 = 2^m$, an integer power of 2, from one of the first and last pictures of that GOP or sub-GOP. Here m is a positive integer determined as follows: let [formula image G2009101558832D00032], where [symbol image G2009101558832D00033] denotes rounding down; if $L \ge 2^n \times 3$, then $m = n + 1$; otherwise $m = n$. The distance from this B picture to the other of the first and last pictures is $D_2 = L - D_1$.
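The selection rule can be written as a short function. The definition of n survives in the source only as an image, so the sketch below assumes n = floor(log2(L)) - 1; this choice is an assumption, made because it reproduces the 4/6 first-level split that the description gives for a GOP of length 10. The function name `split_distances` is hypothetical.

```python
from typing import Tuple


def split_distances(length: int) -> Tuple[int, int]:
    """Return (D1, D2) for a GOP or sub-GOP of the given length (length > 1)."""
    if length < 2:
        raise ValueError("only a GOP or sub-GOP with length > 1 can be divided")
    # ASSUMPTION: n = floor(log2(length)) - 1 (the formula image is missing).
    n = length.bit_length() - 2
    # m = n + 1 if L >= 2**n * 3, otherwise m = n.
    m = n + 1 if length >= (2 ** n) * 3 else n
    d1 = 2 ** m
    return d1, length - d1


for gop_length in (2, 3, 7, 8, 10, 13):
    print(gop_length, split_distances(gop_length))
```

Under this assumption a length of 2 yields m = 0 and the only possible split (1, 1), even though the text calls m a positive integer; for every longer GOP m is at least 1.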

Like the two existing configuration methods, the proposed method divides the GOP recursively by bisection, and its configuration difficulty and division regularity are comparable to theirs. The proposed method, however, yields the optimal prediction structure, found by dynamic programming search on the basis of a mathematical model of the compression performance of hierarchical B prediction structures. Experimental results show that, across various GOP lengths, the proposed method has better average rate-distortion performance.

Brief Description of the Drawings

Figure 1 is a schematic diagram of a typical hierarchical B prediction structure (GOP length 8, with 4 picture levels);

Figure 2(a) compares the actual normalized bit rate (NR) of the news sequence with the values estimated by the linear model;

Figure 2(b) compares the actual normalized bit rate (NR) of the basket sequence with the values estimated by the linear model;

Figure 3 is a step-by-step schematic diagram of mapping one hierarchical B prediction structure with GOP length 8 to a binary tree;

Figure 4 is a schematic diagram of mapping another hierarchical B prediction structure with GOP length 8 to a binary tree;

Figure 5 shows the recursive structure of the optimal binary tree for GOP length 8;

Figure 6 is a schematic diagram of the optimal prediction structure configuration for GOP length 10.

Detailed Description

The method for configuring a hierarchical B prediction structure with high compression performance is as follows:

The hierarchical B prediction structure is configured by recursive bisection: a B picture first divides the GOP into two sub-GOPs, using the pictures at the two ends of the GOP as references; each of the two sub-GOPs is then divided into two smaller sub-GOPs by a B picture that uses the pictures at the two ends of its own part as references; and so on until no further division is possible;

If the length of a GOP or sub-GOP is L, with L > 1, then the B picture that divides it into two smaller sub-GOPs lies at a distance $D_1 = 2^m$, an integer power of 2, from one of the first and last pictures of that GOP or sub-GOP. Here m is a positive integer determined as follows: let [formula image G2009101558832D00041], where [symbol image G2009101558832D00042] denotes rounding down; if $L \ge 2^n \times 3$, then $m = n + 1$; otherwise $m = n$. The distance from this B picture to the other of the first and last pictures is $D_2 = L - D_1$.

To analyze and compare the performance of different hierarchical B prediction structures, and to find a prediction structure with high compression performance, an effective first step is to model the structures mathematically. The effect of a prediction structure on compression performance depends mainly on how well inter-frame prediction exploits the correlation between neighboring pictures. In general, when a video sequence contains no abrupt scene change and no periodically repeating scene, the correlation between neighboring pictures depends on their temporal distance: the shorter the distance, the higher the correlation, and the better the performance of inter-frame predictive coding. Expressing the compressed bit rate of an inter-frame predicted picture as a function of its reference intervals is therefore valid in a statistical sense. For a prediction structure that repeats periodically in the form of a group of pictures (GOP), the overall compression performance can be expressed as the sum of the bit rates of all pictures in one GOP.

The invention uses the following two-parameter model to express the compressed bit rate of a B picture with two reference pictures:

$$R_B = R_I \times \left(\theta_1 + \theta_2 \log(D_1 \cdot D_2)\right) \tag{1}$$

$D_1$ and $D_2$ are the two reference intervals of the B picture, and $\theta_1$ and $\theta_2$ are the parameters to be estimated. $R_B$ and $R_I$ are the output bit rates obtained by coding the same picture as a B picture and as an I picture, respectively, under the same coding conditions. After data for different reference intervals were collected with an H.264/AVC encoder, the parameters were estimated by the least squares method. The experimental results show that the model fits the variation of the B-picture bit rate very well, as shown in Figure 2. In the figure, the solid line labeled NR is the actual normalized B-picture bit rate $R_B / R_I$, and the dashed line labeled LSE is the value produced by the linear model obtained by least squares estimation. The sequences used are news, at a resolution of 352x288, and basket, at a resolution of 720x576. The abscissa enumerates the combinations of the two reference intervals $D_1$ and $D_2$ for GOP lengths L = 2, 3, ..., 8.
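Because model (1) is linear in $\log(D_1 \cdot D_2)$, the least squares estimate has a closed form. The sketch below illustrates this on synthetic, noise-free samples generated from hypothetical parameter values; none of these numbers come from the patent's experiments, the function name is hypothetical, and the natural logarithm is an assumption since the source does not state the base.

```python
import math


def fit_two_parameter_model(samples):
    """Least squares fit of NR = theta1 + theta2 * log(D1 * D2).

    samples: list of (d1, d2, nr), where nr is the normalized rate R_B / R_I.
    """
    xs = [math.log(d1 * d2) for d1, d2, _ in samples]
    ys = [nr for _, _, nr in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta2 = sxy / sxx
    theta1 = mean_y - theta2 * mean_x
    return theta1, theta2


# HYPOTHETICAL parameters, used only to generate synthetic test data.
TRUE_THETA1, TRUE_THETA2 = 0.05, 0.02

# All (D1, D2) combinations for GOP lengths L = 2..8, as on the abscissa of Figure 2.
samples = [
    (d1, L - d1, TRUE_THETA1 + TRUE_THETA2 * math.log(d1 * (L - d1)))
    for L in range(2, 9)
    for d1 in range(1, L)
]

theta1, theta2 = fit_two_parameter_model(samples)
print(f"theta1 = {theta1:.4f}, theta2 = {theta2:.4f}")
```

With real encoder measurements, the `samples` list would hold measured $R_B / R_I$ values in place of the synthetic ones.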

For a hierarchical B prediction structure, the overall compression performance can be expressed as the sum of the bit rates of all B pictures in one group of pictures (GOP) plus the bit rate of the I picture. For a GOP of length L, it is given by:

$$R_{GOP} = R_I \times \left(1 + \theta_1 (L-1) + \theta_2 \log\left(\prod_{i=1}^{L-1} D_{i1} \cdot D_{i2}\right)\right) \tag{2}$$

where $D_{i1}$ and $D_{i2}$ are the two reference intervals of the i-th B picture (i = 1, ..., L-1).
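The step from model (1) to expression (2) is a direct summation over the L-1 B pictures plus one I picture, using the fact that a sum of logarithms is the logarithm of the product:

```latex
\begin{aligned}
R_{GOP} &= R_I + \sum_{i=1}^{L-1} R_I \left(\theta_1 + \theta_2 \log(D_{i1} \cdot D_{i2})\right) \\
        &= R_I \left(1 + \theta_1 (L-1) + \theta_2 \sum_{i=1}^{L-1} \log(D_{i1} \cdot D_{i2})\right) \\
        &= R_I \left(1 + \theta_1 (L-1) + \theta_2 \log\left(\prod_{i=1}^{L-1} D_{i1} \cdot D_{i2}\right)\right)
\end{aligned}
```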

If only the prediction structure itself is of interest, and different hierarchical B prediction structures are compared at the same GOP length, the parameters in (2) can be dropped and only the last term kept, namely the product of the reference intervals of all B pictures:

$$R_{GOP} = \prod_{i=1}^{L-1} D_{i1} \cdot D_{i2} \tag{3}$$

With (3), the compression performance of any hierarchical B prediction structure can be analyzed and compared very conveniently. Used as a cost, it also allows dynamic programming to find the optimal prediction structure for any GOP length.

Because the compressed bit rate of a B picture is a monotonically increasing function of its reference intervals, configuring the hierarchical B prediction structure by bisection gives better compression performance. That is, a B picture first divides the GOP into two parts, using the pictures at the two ends of the GOP as references; each of the two parts is then divided into two parts by a B picture that uses the pictures at the two ends of its own part as references; and so on until no further division is possible. Suppose bisection is not used and, at some step, two B pictures divide a part into three parts, both using the pictures at the two ends of that part as references. This division can be converted into a bisection in which one of the two B pictures lies in a smaller part and has shorter reference intervals, giving better compression performance. The hierarchical B prediction structure with the best compression performance therefore only needs to be sought among bisection-configured structures.
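A small worked case makes the argument concrete. Take a part of length 3 with end pictures at positions 0 and 3 and B pictures at positions 1 and 2 (the positions are chosen here purely for illustration), and evaluate cost (3) for both divisions:

```latex
\begin{aligned}
\text{three-way division (both B pictures reference 0 and 3):}\quad
  & (1 \cdot 2) \cdot (2 \cdot 1) = 4 \\
\text{bisection (picture 1 references 0 and 3, picture 2 references 1 and 3):}\quad
  & (1 \cdot 2) \cdot (1 \cdot 1) = 2
\end{aligned}
```

The bisection has the smaller product of reference intervals, and therefore the lower modeled bit rate.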

A hierarchical B prediction structure configured by bisection maps one-to-one onto a binary tree. The GOP length is the root node of the tree. After a B picture divides the GOP into two parts, the two children of the root hold the lengths of the two parts. As the division continues, the children of each child node hold the lengths of the resulting parts. Figure 3 shows, step by step, how one hierarchical B prediction structure with GOP length 8 maps to a binary tree. Figure 4 shows the binary tree for another hierarchical B prediction structure with GOP length 8.
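The mapping can be illustrated with nested tuples, where a leaf `1` is a part of length 1 and a pair `(left, right)` is a part divided by one B picture. The representation, the function names, and the two example trees are hypothetical and are not taken from Figures 3 and 4. Since the two children of a node are exactly the two reference intervals of the dividing B picture, cost (3) is the product of all node values except the root.

```python
def part_length(node):
    """Length of the part represented by a node."""
    return 1 if node == 1 else part_length(node[0]) + part_length(node[1])


def tree_cost(node):
    """Cost (3): product of the values of all nodes below `node`."""
    if node == 1:
        return 1  # a part of length 1 contains no B picture
    left, right = node
    return part_length(left) * part_length(right) * tree_cost(left) * tree_cost(right)


# Hypothetical example 1: GOP length 8, always divided in the middle.
pair = (1, 1)
balanced = ((pair, pair), (pair, pair))

# Hypothetical example 2: GOP length 8, always dividing off a part of length 1.
chain = (1, (1, (1, (1, (1, (1, (1, 1)))))))

print(part_length(balanced), tree_cost(balanced))  # 8 256
print(part_length(chain), tree_cost(chain))        # 8 5040
```

The balanced tree has the far smaller cost, which matches the intuition that shorter reference intervals give lower bit rates.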

In the binary tree, the value of each node is both the length of a sub-part and a reference interval of a B picture. By (3), the compression performance of a hierarchical B prediction structure can be expressed as the logarithm of the product of the values of all nodes except the root. On this basis, dynamic programming readily finds the optimal binary tree, and thus the optimal prediction structure, for each GOP length. Every optimal binary tree is necessarily composed recursively of optimal binary subtrees. For example, for GOP length L = 8 the root of the binary tree is 8, and its pair of children has 4 possibilities: (1, 7), (2, 6), (3, 5), (4, 4), as shown in Figure 5. In each case, the tree consists of the optimal subtrees rooted at the two children. Therefore, once the optimal structures for GOP lengths 2 to L-1 are known, the optimal structure for GOP length L follows easily. The total cost of a given split is the cost of the one or two optimal subtrees rooted at the children of the root (a child of value 1 has no subtree) combined with the cost of the pair of children themselves. Choosing the best of these [formula image G2009101558832D00061] splits gives the optimal binary tree and the optimal prediction structure for GOP length L. Because a sum of logarithms equals the logarithm of the product, multiplication can be used directly when computing the cost. Let $C_i$ be the cost of the optimal tree for GOP length i. The optimal tree for GOP length L is computed as follows:

(1) Start the recursion at GOP length 2, where there is only one B picture and only one prediction structure: $C_2 = 1 \cdot 1$;

(2) Once $C_i$ (i = 2, ..., L-1) have been obtained, $C_L$ is the minimum of the following values: $1 \cdot (L-1) \cdot C_{L-1}$, $2 \cdot (L-2) \cdot C_2 \cdot C_{L-2}$, ..., ending with $(L/2) \cdot (L/2) \cdot C_{L/2} \cdot C_{L/2}$ when L is even, or with $((L-1)/2) \cdot ((L+1)/2) \cdot C_{(L-1)/2} \cdot C_{(L+1)/2}$ when L is odd.
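The two recursion steps can be sketched directly. The function and variable names are hypothetical, and setting $C_1 = 1$ is a convention adopted here so that a child of length 1, which has no subtree, contributes nothing to the product.

```python
def optimal_costs(max_length):
    """Return (cost, split): the optimal cost C_L and first-level split for L = 2..max_length."""
    cost = {1: 1}  # convention: a part of length 1 has no B picture
    split = {}
    for length in range(2, max_length + 1):
        best = None
        # Candidate splits (1, L-1), (2, L-2), ..., up to the middle of the GOP.
        for left in range(1, length // 2 + 1):
            right = length - left
            candidate = left * right * cost[left] * cost[right]
            if best is None or candidate < best:
                best = candidate
                split[length] = (left, right)
        cost[length] = best
    return cost, split


cost, split = optimal_costs(16)
for length in range(2, 17):
    print(length, split[length], cost[length])
```

For L = 8 the four candidates evaluate to 672, 384, 360, and 256, so the sketch selects the split (4, 4); for L = 10 it selects (4, 6), the first-level split the description gives for that length.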

Using this dynamic programming method, the optimal prediction structures for GOP lengths 2 to 16 were obtained. The roots of the left and right subtrees after the first-level division (that is, the lengths of the left and right parts) are listed in the following table:

Table 1. Optimal tree divisions for GOP lengths 2 to 16

[Table 1 appears only as images in the source: Figures G2009101558832D00062 and G2009101558832D00071]

The lengths of the left and right subtrees in this table are interchangeable. Because the hierarchical B prediction structure is configured by recursive bisection, the optimal structure for any GOP length from 2 to 16 can be configured by table lookup. For example, for GOP length 10, the table shows that the first-level division is into 4 and 6; looking up how GOPs of length 4 and 6 should be divided, and continuing recursively, yields the complete prediction structure, as shown in Figure 6. For certain GOP lengths (for example, integer powers of 2), the prediction structure found through the optimal tree is the same as the existing structure with HierarchicalCoding=2, although the division criteria differ.

Continuing the dynamic programming process yields optimal hierarchical B prediction structures for larger GOP lengths. Summarizing the division pattern of the optimal hierarchical B prediction structures derived from the model above gives the method for configuring a hierarchical B prediction structure with high compression performance.
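The recursive lookup can be sketched as follows. Because Table 1 is not available in this text, the sketch recomputes the first-level splits with the dynamic programming recursion given earlier instead of reading them from the table. The function names, the output tuple layout, and the choice to place the shorter part first are all hypothetical; the text states only that the two parts are interchangeable.

```python
def optimal_splits(max_length):
    """First-level split (left, right) of the optimal tree for each GOP length."""
    cost = {1: 1}
    split = {}
    for length in range(2, max_length + 1):
        candidates = [
            (left * (length - left) * cost[left] * cost[length - left], left)
            for left in range(1, length // 2 + 1)
        ]
        cost[length], best_left = min(candidates)
        split[length] = (best_left, length - best_left)
    return split


def configure(start, end, split, level=1):
    """List (position, level, ref_before, ref_after) for every B picture between start and end."""
    length = end - start
    if length < 2:
        return []
    left, _ = split[length]
    b = start + left  # hypothetical choice: the shorter part comes first
    return ([(b, level, start, end)]
            + configure(start, b, split, level + 1)
            + configure(b, end, split, level + 1))


GOP_LENGTH = 10
structure = configure(0, GOP_LENGTH, optimal_splits(GOP_LENGTH))
for position, level, ref_before, ref_after in structure:
    print(f"B at {position}: level {level}, references {ref_before} and {ref_after}")
```

Position 0 is the key picture preceding the GOP and position 10 is the key picture that closes it; the sketch places the first-level B picture at position 4, so the two parts have lengths 4 and 6.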

本发明的配置方法可以在H.264/AVC中添加C语言代码,作为一个附加的HierarchicalCoding选项实现;也可以通过将HierarchicalCoding设置为3,并在ExplicitHierarchyFormat参数中手动配置B图的层级来实现。就配置的难易度以及划分的规则性来看,本配置方法也与现有配置方法相当。使用H.264/AVC参考软件版本JM11.0进行编码,并与HierarchicalCoding设置为1和2的两种现有分层B预测结构进行比较,实验结果表明本发明的配置方法在各个GOP长度下比现有预测结构具有更好的平均编码率失真性能。实现结果比较The configuration method of the present invention can be realized by adding C language code in H.264/AVC as an additional HierarchicalCoding option; it can also be realized by setting HierarchicalCoding to 3 and manually configuring the level of B-picture in the ExplicitHierarchyFormat parameter. In terms of the difficulty of configuration and the regularity of division, this configuration method is also equivalent to the existing configuration method. Use H.264/AVC reference software version JM11.0 to encode, and compare with two kinds of existing hierarchical B prediction structures that HierarchicalCoding is set to 1 and 2, the experimental results show that the configuration method of the present invention is better than each GOP length Existing prediction structures have better average rate-distortion performance. Realize result comparison

表2新配置方法与原有配置方法实验结果比较Table 2 Comparison of experimental results between the new configuration method and the original configuration method

Figure G2009101558832D00072

Figure G2009101558832D00081

The table above lists the experimental results for 8 sequences of different resolutions, encoded with GOP lengths of 7, 11, and 15. In the table, HC1 denotes the hierarchical B prediction structure with HierarchicalCoding set to 1, and HC2 denotes the one with HierarchicalCoding set to 2. At a GOP length of 7, the structure produced by the configuration method of the present invention is close to the HC1 structure but differs considerably from the HC2 structure, over which it has an average gain of 0.1 dB. At GOP lengths of 11 and 15, it is closer to the HC2 structure but still shows a gain on average; the difference from the HC1 structure grows with GOP length, reaching a gain of 0.1 to 0.4 dB at a GOP length of 15. Overall, the configuration method of the present invention has better compression performance.

Claims (1)

1. A configuration method for a hierarchical B prediction structure with high compression performance, characterized by comprising:
The optimal prediction structure is obtained by dynamic programming search, on the basis of a mathematical model of the compression performance of hierarchical B prediction structures;
The hierarchical B prediction structure is configured by recursive bisection: first, the group of pictures is divided into two sub-groups by a B-picture, which uses the pictures at the two ends of the group as references; each of the two sub-groups is then divided into two smaller sub-groups by a B-picture that uses the pictures at the two ends of its own sub-group as references; and so on, until no further division is possible;
If the length of a group of pictures or sub-group is L, L > 1, then the distance from the B-picture that divides this group or sub-group into two smaller sub-groups to one of the head and tail pictures of the group or sub-group shall be an integer power of 2, D1 = 2^m, where m is a positive integer determined as follows: let
Figure FSB00000432964300011
where ⌊ ⌋ denotes rounding down; if L ≥ 2^n × 3, then m = n + 1, otherwise m = n; the distance from this B-picture to the other of the head and tail pictures is D2 = L − D1.
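As a worked illustration of the division rule in the claim, the following evaluates it for L = 10 and L = 6. The formula defining n is available only as an image placeholder, so the example assumes n = ⌊log2 L⌋ − 1. That assumption is not confirmed by the text, but it is consistent with the description's example of a GOP length of 10 being divided into 4 and 6.

```latex
% Assumption (not confirmed by the text): n = \lfloor \log_2 L \rfloor - 1
\begin{align*}
L = 10:&\quad n = \lfloor \log_2 10 \rfloor - 1 = 2,\quad 2^n \times 3 = 12 > 10
  \;\Rightarrow\; m = n = 2,\quad D_1 = 2^2 = 4,\quad D_2 = 10 - 4 = 6 \\
L = 6:&\quad n = \lfloor \log_2 6 \rfloor - 1 = 1,\quad 2^n \times 3 = 6 \le 6
  \;\Rightarrow\; m = n + 1 = 2,\quad D_1 = 2^2 = 4,\quad D_2 = 6 - 4 = 2
\end{align*}
```

Under the same assumption, L = 2 gives m = 0 and D1 = D2 = 1, so the single B-picture sits midway between its two references.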

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200910155883 CN101710987B (en) 2009-12-29 2009-12-29 Configuration method of layered B forecasting structure with high compression performance

Publications (2)

Publication Number Publication Date
CN101710987A CN101710987A (en) 2010-05-19
CN101710987B true CN101710987B (en) 2011-06-15

Family

ID=42403744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200910155883 Expired - Fee Related CN101710987B (en) 2009-12-29 2009-12-29 Configuration method of layered B forecasting structure with high compression performance

Country Status (1)

Country Link
CN (1) CN101710987B (en)

Also Published As

Publication number Publication date
CN101710987A (en) 2010-05-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160614

Address after: 518000 new energy building, Nanhai Road, Shenzhen, Guangdong, Nanshan District A838

Patentee after: Meng Qi media (Shenzhen) Co.,Ltd.

Address before: 310027 Hangzhou, Zhejiang Province, Zhejiang Road, No. 38

Patentee before: Zhejiang University

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160901

Address after: 518000, 101, 2, Fengyun technology building, Fifth Industrial Zone, North Ring Road, Shenzhen, Guangdong, Nanshan District

Patentee after: World wide technology (Shenzhen) Ltd.

Address before: 518000 new energy building, Nanhai Road, Shenzhen, Guangdong, Nanshan District A838

Patentee before: Meng Qi media (Shenzhen) Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20100519

Assignee: MCLOUD (SHANGHAI) DIGITAL TECHNOLOGY CO.,LTD.

Assignor: World wide technology (Shenzhen) Ltd.

Contract record no.: 2018440020049

Denomination of invention: Configuration method of layered B forecasting structure with high compression performance

Granted publication date: 20110615

License type: Exclusive License

Record date: 20180428

TR01 Transfer of patent right

Effective date of registration: 20180903

Address after: 518000 B unit 101, Fengyun mansion 5, Xili street, Nanshan District, Shenzhen, Guangdong.

Patentee after: WANWEI DISPLAY TECHNOLOGY (SHENZHEN) Co.,Ltd.

Address before: 518000 2 of Fengyun tower, Fifth Industrial Zone, Nanshan District North Ring Road, Shenzhen, Guangdong, 101

Patentee before: World wide technology (Shenzhen) Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110615