
CN113393377B - Single-frame image super-resolution method based on video coding - Google Patents

Single-frame image super-resolution method based on video coding

Info

Publication number
CN113393377B
CN113393377B (application CN202110541900.7A, published as CN113393377A)
Authority
CN
China
Prior art keywords
network
image
sub
resolution
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110541900.7A
Other languages
Chinese (zh)
Other versions
CN113393377A (en)
Inventor
吴庆波
李鹏飞
李宏亮
孟凡满
许林峰
潘力立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202110541900.7A
Publication of CN113393377A
Application granted
Publication of CN113393377B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/40: Analysis of texture
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00: Image coding
    • G06T9/002: Image coding using neural networks
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a single-frame image super-resolution method based on video coding. Prior information obtained directly from the video coding is used to process the sub-blocks in different parts of an image in a targeted way: a complex network handles sub-blocks with more complex textures, and an adaptive convolution module is designed to process sub-blocks with different coding modes. This makes the network more targeted, so that different detail information is restored for different textures, improving the accuracy of the super-resolution result. The invention also shares the parameters of the narrow-channel network with the deep-channel network, i.e., the super-resolution of a whole picture is realized with different layers of a single backbone network; a relatively simple, shallow, narrow-channel network processes the relatively large sub-blocks with smoother textures, reducing the time the super-resolution process requires.

Description

A single-frame image super-resolution method based on video coding

Technical Field

The invention relates to the technical field of image processing, in particular to a single-frame image super-resolution method based on video coding.

Background

Image super-resolution is the process of converting an input low-resolution image into a high-resolution image. An important focus of recent super-resolution work is to propose networks that accelerate the inference process. One branch achieves efficient super-resolution with fewer parameters and higher speed. For example, the early FSRCNN extracts features directly from the input image, and the feature map then passes through an upsampling network to construct the super-resolution image. Another example is the recent CARN, which uses grouped convolutions to design a residual network for fast processing of the input picture. The other branch increases the complexity of the network model and the number of model branches, training separately on different kinds of inputs, as in ClassSR.

ClassSR trains and infers with neural networks of different complexity for low-resolution input images of different complexity. Since most regions of an image only need to pass through a network with relatively little computation, this improves the running speed of the inference stage to a certain extent. Specifically, the method divides the picture into small blocks of 32×32 pixels. A pre-trained classification network sorts the blocks into three categories according to their texture complexity: simple, medium, and difficult. Blocks of different categories are routed to backbone networks with different numbers of channels.
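The routing scheme can be sketched as follows; the tiny classifier and the three backbones here are illustrative stand-ins (not ClassSR's published networks), and the channel widths are assumptions:

```python
import torch
import torch.nn as nn

class ClassRouter(nn.Module):
    """Minimal sketch of ClassSR-style patch routing (illustrative only)."""
    def __init__(self, patch=32, scale=2):
        super().__init__()
        self.patch, self.scale = patch, scale
        # stand-in for the pre-trained texture-complexity classifier
        self.classifier = nn.Sequential(
            nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 3))
        # three backbones of increasing channel width: simple / medium / hard
        self.backbones = nn.ModuleList(self._sr(c) for c in (16, 32, 64))

    def _sr(self, ch):
        return nn.Sequential(
            nn.Conv2d(3, ch, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(ch, 3 * self.scale ** 2, 3, 1, 1),
            nn.PixelShuffle(self.scale))

    def forward(self, patches):                 # patches: (N, 3, 32, 32)
        cls = self.classifier(patches).argmax(1)
        out = patches.new_zeros(patches.size(0), 3,
                                self.patch * self.scale, self.patch * self.scale)
        for k, net in enumerate(self.backbones):
            idx = (cls == k).nonzero(as_tuple=True)[0]
            if idx.numel():                      # each class goes to its own backbone
                out[idx] = net(patches[idx])
        return out
```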

Traditional super-resolution networks extract a feature map directly from the whole picture. With such a structure the network cannot learn the distinct features of each region well, and applying the same convolution kernel to different regions makes the texture details of the recovered image disagree with the real image. Moreover, because the complexity of texture details differs across regions, complex processing of low-detail regions often increases the computation of the network unnecessarily. On the other hand, a design like ClassSR, which first classifies and then passes blocks through three neural networks that do not share parameters, spends a great deal of time and computing power during training and increases the complexity of the network. Beyond these shortcomings, most current super-resolution methods ignore the help that the prior information already carried by the image can give to the super-resolution process. A super-resolution method is therefore urgently needed whose computation is small and whose recovered texture details agree with the real image with improved accuracy.

Summary of the Invention

To solve the problems existing in the prior art, the present invention provides a single-frame image super-resolution method based on video coding, which solves the problems mentioned in the background above.

To achieve the above object, the present invention provides the following technical solution: a single-frame image super-resolution method based on video coding, comprising the following steps:

S1. Using the prior information of each video-coded frame, divide the low-resolution image I_LR in the video into the corresponding 4×4, 8×8, 16×16 and 32×32 pixel sub-blocks according to the H.265 video coding information. For the 4×4 and 8×8 pixel sub-blocks, the corresponding coding prediction mode M_pre can be obtained, and the corresponding Gaussian model G_m is generated according to the coding mode.

S2. Train the channel-adaptive backbone network CAB with the 16×16 and 32×32 pixel sub-blocks. Each convolution block in the CAB is divided into two channel groups, conv1 and conv2. In each iteration, only the conv1 parameters are used for forward and backward propagation; the conv2 parameters are not used. The final super-resolution output I_SR is obtained by minimizing the perceptual loss L_per and the MSE loss L_mse.

S3. Train the channel-adaptive backbone network CAB with the 4×4 and 8×8 pixel sub-blocks. Now the parameters of both conv1 and conv2 are used for forward propagation; conv1 has already learned a feature-extraction scheme for smooth information during the training of step S2, so during backpropagation the conv1 parameters are fixed and only the conv2 parameters are updated. The final super-resolution output I_SR is again obtained by minimizing the perceptual loss L_per and the MSE loss L_mse.

S4. After the training of steps S2 and S3 is complete, train the entire network. During this training the parameters of the channel-adaptive backbone network CAB are fixed, the perceptual loss L_per and the MSE loss L_mse are minimized, and the remaining network parameters are updated; the feature-extraction module of the branch corresponding to the 4×4 and 8×8 sub-blocks is trained to extract their preliminary features.

S5. When training the branch network CAB corresponding to the 4×4 and 8×8 pixel sub-blocks, the sub-blocks are input into the network in the relative order of their index i (i = 0, 1, 2, ..., 15); each sub-block is recorded together with its four adjacent sub-blocks of the same size, where i represents the value of the numeric index.

S6. Sample the Gaussian model generated in step S1 at equal intervals in width and height, centered at (0, 0), to obtain a matrix with the same width and height as the convolution kernel. Point-multiply this matrix with the kernel of the convolutional layer Conv in the adaptive convolution module ACB to weight the kernel. The weighted kernel is then applied to the input sub-block in an ordinary convolution; after the ACB module, a feature map that focuses more on the image texture is obtained.

S7. After every four adjacent sub-blocks have passed through the adaptive texture processing module, stitch the four sub-blocks according to their positions in the original picture and pass the result to the backbone network, obtaining a feature map whose width and height are twice those of a single sub-block; in matrix form, the four sub-block features occupy a 2×2 block layout matching their positions in the original image.

S8. Fine-tune the network further by minimizing L_total, completing the super-resolution of the picture.

Preferably, the coding prediction mode M_pre of step S1 includes the DC prediction mode, the planar prediction mode and the angular prediction modes.

Preferably, the covariance matrix C of G_m is controlled by the coding prediction mode M_pre:

G_m = Gauss(C, θ | M_pre)

By adjusting the covariance matrix, the maximum of the generated Gaussian model is made to coincide with the texture angle of the mode, so that it adaptively focuses on the image texture. When M_pre is the DC or planar mode, a Gaussian model with a unit covariance matrix is used. For a sub-block whose M_pre is an angular mode with angle θ, an initial covariance matrix C is set and rotated by the angle θ, giving:

G_m = A(θ) C A(θ)^T

where A(θ) is the two-dimensional rotation matrix [[cos θ, -sin θ], [sin θ, cos θ]] and A(θ)^T denotes the transpose of A(θ).

Preferably, the fine adjustment in step S8 specifically includes:

Using the MSE loss to minimize the gap between the network output and the real high-resolution image:

L_mse = (1/N) Σ (I_SR - I_HR)²

summed over all N pixels, where I_SR denotes the outputs of the different branches, each compared with the real image I_HR of the corresponding branch. A perceptual loss term is added to the loss function so that the L2 distance between the features of the generated picture and those of the target picture, both extracted by a CNN f, is as small as possible; this makes the generated picture semantically more similar to the target picture:

L_per = || f(I_SR) - f(I_HR) ||²

where f denotes the CNN, specifically a VGG-16 network.

The larger loss weight ω_2 is used for the 4×4 and 8×8 sub-blocks, and the smaller weight ω_1 for the larger, smoother 16×16 and 32×32 sub-blocks. The loss function L_total is expressed as:

L_total = ω_1 (L_mse + L_per)_smooth + ω_2 (L_mse + L_per)_texture

where the subscripts denote the smooth (16×16, 32×32) and textured (4×4, 8×8) branches, ω_1 is 0.5 and ω_2 is 1.

The beneficial effects of the present invention are:

1) The invention uses prior information that can be obtained directly from video coding to process the sub-blocks of different parts of an image in a targeted way: complex networks handle sub-blocks with more complex textures, and an adaptive convolution module is designed to process sub-blocks of different coding modes. This makes the network more targeted, recovering different detail information for different textures and thereby improving the accuracy of the super-resolution results.

2) The invention shares the parameters of the narrow-channel network with the deep-channel network, so that the super-resolution of a whole picture is realized with different numbers of layers of a single backbone network. A relatively simple, shallow, narrow-channel network processes the relatively large sub-blocks with smoother textures, reducing the time the super-resolution process requires.

Description of the Drawings

Fig. 1 is a schematic diagram of the network structure of an embodiment of the present invention;

Fig. 2 is a schematic diagram of the network's adaptive texture processing module of an embodiment of the present invention;

Fig. 3 is a schematic diagram of the training input order of the 4×4 and 8×8 pixel sub-blocks of an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

Referring to Figs. 1-3, the present invention provides a technical solution: a single-frame image super-resolution method based on video coding, whose network structure is shown in Fig. 1, comprising the following steps:

S1. Using the prior information of each video-coded frame, divide the low-resolution image I_LR in the video into the corresponding 4×4, 8×8, 16×16 and 32×32 pixel sub-blocks according to the H.265 video coding information. For the 4×4 and 8×8 pixel sub-blocks, the corresponding coding prediction mode M_pre can be obtained; M_pre includes the DC prediction mode, the planar prediction mode and the angular prediction modes. A corresponding Gaussian model G_m is generated according to the coding mode.

The covariance matrix C of G_m is controlled by the coding prediction mode M_pre:

G_m = Gauss(C, θ | M_pre)

By adjusting the covariance matrix, the maximum of the generated Gaussian model is made to coincide with the texture angle of the mode, so that it adaptively focuses on the image texture. When M_pre is the DC or planar mode, a Gaussian model with a unit covariance matrix is used. For a sub-block whose M_pre is an angular mode with angle θ, an initial covariance matrix C is set and rotated by the angle θ, giving:

G_m = A(θ) C A(θ)^T

where A(θ) is the two-dimensional rotation matrix [[cos θ, -sin θ], [sin θ, cos θ]] and A(θ)^T denotes the transpose of A(θ).
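The mode-conditioned Gaussian can be sketched as follows. The anisotropic base covariance and the peak normalization are assumptions made for illustration; the patent fixes only the unit covariance for the DC/planar modes and the rotation G_m = A(θ) C A(θ)^T for angular modes:

```python
import numpy as np

def gaussian_model(mode, theta=0.0, base_cov=np.diag([4.0, 0.25])):
    """Covariance of G_m conditioned on the coding prediction mode M_pre."""
    if mode in ("DC", "planar"):
        return np.eye(2)                            # unit covariance for smooth modes
    A = np.array([[np.cos(theta), -np.sin(theta)],  # 2-D rotation matrix A(theta)
                  [np.sin(theta),  np.cos(theta)]])
    return A @ base_cov @ A.T                       # G_m = A(theta) C A(theta)^T

def sample_gaussian(cov, k=3):
    """Sample the zero-mean Gaussian density on a k-by-k grid centred at (0, 0)."""
    xs = np.arange(k) - (k - 1) / 2.0
    X, Y = np.meshgrid(xs, xs)
    P = np.stack([X, Y], -1)                        # (k, k, 2) grid coordinates
    m = np.einsum("...i,ij,...j->...", P, np.linalg.inv(cov), P)
    G = np.exp(-0.5 * m)
    return G / G.max()                              # peak normalised to 1 (a choice, not from the patent)

# e.g. a 3x3 weight map for an angular mode at 45 degrees:
weights = sample_gaussian(gaussian_model("angular", theta=np.pi / 4), k=3)
```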

S2. Train the channel-adaptive backbone network (Channel Adaptive Backbone, CAB) of Fig. 1 with the 16×16 and 32×32 pixel sub-blocks. Note that, in order to process the different types of input efficiently, each convolution block in the CAB is divided into two channel groups, conv1 and conv2. In each iteration only the conv1 parameters are used for forward and backward propagation; the conv2 parameters are not used. The final super-resolution output I_SR is obtained by minimizing the perceptual loss L_per and the MSE loss L_mse.
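One such channel-adaptive convolution block can be sketched as below; the channel counts and the concatenation of the two groups are assumptions, the patent fixing only that each block is split into a conv1 group and a conv2 group:

```python
import torch
import torch.nn as nn

class CABlock(nn.Module):
    """Sketch of one CAB convolution block with a conv1/conv2 channel split."""
    def __init__(self, in_ch=32, ch1=16, ch2=16):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, ch1, 3, 1, 1)  # shallow group, always trained
        self.conv2 = nn.Conv2d(in_ch, ch2, 3, 1, 1)  # extra channels for complex sub-blocks
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, use_conv2=False):
        y = self.conv1(x)
        if use_conv2:
            # deep path: both channel groups are concatenated, so a downstream
            # block must accept ch1 channels (shallow) or ch1 + ch2 (deep)
            y = torch.cat([y, self.conv2(x)], dim=1)
        return self.act(y)
```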

S3. Train the channel-adaptive backbone network CAB of Fig. 1 with the 4×4 and 8×8 pixel sub-blocks. Note that complex texture information needs more network parameters, so now the parameters of both conv1 and conv2 are used for forward propagation. Because conv1 has already learned a feature-extraction scheme for smooth information during the training of step S2, the conv1 parameters are fixed during backpropagation and only the conv2 parameters are updated. The final super-resolution output I_SR is again obtained by minimizing the perceptual loss L_per and the MSE loss L_mse.
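The two training phases of steps S2 and S3 can be sketched as follows, assuming a model whose forward accepts the use_conv2 switch of the block sketch above; the optimizer choice and learning rate are assumptions:

```python
import torch

def conv1_params(model):
    return [p for n, p in model.named_parameters() if "conv1" in n]

def conv2_params(model):
    return [p for n, p in model.named_parameters() if "conv2" in n]

def train_phase(model, loader, params, use_conv2, loss_fn, lr=1e-4, steps=1000):
    # only the given parameter group is registered, so the other group stays
    # fixed even though its gradients may be computed during backward()
    opt = torch.optim.Adam(params, lr=lr)
    for step, (lr_img, hr_img) in zip(range(steps), loader):
        sr = model(lr_img, use_conv2=use_conv2)
        loss = loss_fn(sr, hr_img)        # L_mse + L_per in the patent
        opt.zero_grad()
        loss.backward()
        opt.step()

# phase 1 (S2): smooth 16x16/32x32 sub-blocks, conv1 only
#   train_phase(model, smooth_loader, conv1_params(model), False, criterion)
# phase 2 (S3): textured 4x4/8x8 sub-blocks, conv1 frozen, conv2 updated
#   train_phase(model, texture_loader, conv2_params(model), True, criterion)
```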

S4. After the training of steps S2 and S3 is complete, train the entire network. During this training the parameters of the channel-adaptive backbone network CAB are fixed, only the perceptual loss L_per and the MSE loss L_mse are minimized, and the remaining network parameters are updated. First, the feature-extraction module of the branch corresponding to the 4×4 and 8×8 sub-blocks is trained to extract their preliminary features.

S5. When training the branch network CAB corresponding to the 4×4 and 8×8 pixel sub-blocks, the sub-blocks are input into the network in the relative order of their index i (i = 0, 1, 2, ..., 15); each sub-block is recorded together with its four adjacent sub-blocks of the same size, where i represents the value of the numeric index (as shown in Fig. 3, when the sub-block with i = 5 is input, its adjacent sub-blocks are those with i = 6, 7, 8).

S6. Sample the Gaussian model generated in step S1 at equal intervals in width and height, centered at (0, 0), to obtain a matrix with the same width and height as the convolution kernel. Point-multiply this matrix with the kernel of the convolutional layer Conv in the adaptive convolution module ACB (shown in Fig. 2) to weight the kernel. The weighted kernel is then applied to the input sub-block in an ordinary convolution; after the ACB module, a feature map that focuses more on the image texture is obtained.
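A minimal sketch of the ACB weighting, assuming the same sampled Gaussian map is broadcast over every filter of the layer:

```python
import torch
import torch.nn.functional as F

def adaptive_conv(x, weight, bias, gauss_map):
    """
    x:         (N, C_in, H, W) input sub-block features
    weight:    (C_out, C_in, k, k) kernel of the layer Conv
    gauss_map: (k, k) Gaussian samples, same width/height as the kernel
    """
    g = torch.as_tensor(gauss_map, dtype=weight.dtype)
    w = weight * g                                   # point-wise (Hadamard) weighting
    return F.conv2d(x, w, bias, padding=weight.shape[-1] // 2)  # ordinary convolution
```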

S7. After every four adjacent sub-blocks have passed through the adaptive texture processing module, stitch the four sub-blocks according to their positions in the original picture and pass the result to the backbone network, obtaining a feature map whose width and height are twice those of a single sub-block; in matrix form, the four sub-block features occupy a 2×2 block layout matching their positions in the original image.
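The stitching itself is a 2×2 placement of feature maps, for example:

```python
import torch

def stitch(f1, f2, f3, f4):
    """Place four neighbouring sub-block feature maps back at their original
    positions, doubling width and height (f1 top-left ... f4 bottom-right)."""
    top = torch.cat([f1, f2], dim=-1)        # left-right along width
    bottom = torch.cat([f3, f4], dim=-1)
    return torch.cat([top, bottom], dim=-2)  # top-bottom along height
```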

S8. To focus more on detail information, fine-tune the network further by minimizing L_total, completing the super-resolution of the picture.

In the above training process, the MSE loss is used to minimize the gap between the network output and the real high-resolution image:

L_mse = (1/N) Σ (I_SR - I_HR)²

summed over all N pixels, where I_SR denotes the outputs of the different branches, each compared with the real image I_HR of the corresponding branch. However, because a per-pixel MSE loss often deviates from the real visual impression, a perceptual loss term is added to the loss function so that the L2 distance between the features of the generated picture and those of the target picture, both extracted by a CNN f, is as small as possible; this makes the generated picture semantically more similar to the target picture (relative to a pixel-level loss):

L_per = || f(I_SR) - f(I_HR) ||²

Here, the CNN represented by f is chosen to be VGG-16.

In addition, since the super-resolution quality of an image shows most in its details, more attention is paid to the texture-complex parts, i.e., the reconstruction of the 4×4 and 8×8 sub-blocks. These two kinds of sub-blocks are therefore given the larger loss weight ω_2, while the larger, smoother 16×16 and 32×32 sub-blocks use the smaller weight ω_1. The loss function L_total is therefore expressed as:

L_total = ω_1 (L_mse + L_per)_smooth + ω_2 (L_mse + L_per)_texture

where the subscripts denote the smooth (16×16, 32×32) and textured (4×4, 8×8) branches, ω_1 is 0.5 and ω_2 is 1.
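A sketch of this objective follows; the VGG-16 layer cut (through relu2_2) and the absence of input normalization are assumptions, the patent fixing only f = VGG-16, the MSE term, and the weights ω_1 = 0.5, ω_2 = 1:

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class SRLoss(nn.Module):
    """Per-branch loss: pixel MSE plus VGG-16 perceptual term."""
    def __init__(self):
        super().__init__()
        self.feat = vgg16(weights="DEFAULT").features[:9].eval()  # up to relu2_2
        for p in self.feat.parameters():
            p.requires_grad_(False)          # frozen feature extractor f
        self.mse = nn.MSELoss()

    def forward(self, sr, hr):
        l_mse = self.mse(sr, hr)                        # (1/N) sum of squared pixel errors
        l_per = self.mse(self.feat(sr), self.feat(hr))  # L2 distance of VGG features
        return l_mse + l_per

def total_loss(loss_smooth, loss_texture, w1=0.5, w2=1.0):
    # larger weight on the textured 4x4/8x8 branch, smaller on 16x16/32x32
    return w1 * loss_smooth + w2 * loss_texture
```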

The invention uses prior information that can be obtained directly from video coding to process the sub-blocks of different parts of an image in a targeted way: complex networks handle sub-blocks with more complex textures, and an adaptive convolution module processes sub-blocks of different coding modes, making the network more targeted and recovering different detail information for different textures, thereby improving the accuracy of the super-resolution results. The invention shares the parameters of the narrow-channel network with the deep-channel network, so that the super-resolution of a whole picture is realized with different numbers of layers of a single backbone network; a relatively simple, shallow, narrow-channel network processes the relatively large sub-blocks with smoother textures, reducing the time the super-resolution process requires.

Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions recorded in the foregoing embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (4)

1. A single-frame image super-resolution method based on video coding, characterized in that it comprises the following steps:

S1. Using the prior information of each video-coded frame, divide the low-resolution image I_LR in the video into the corresponding 4×4, 8×8, 16×16 and 32×32 pixel sub-blocks according to the H.265 video coding information; for the 4×4 and 8×8 pixel sub-blocks, obtain the corresponding coding prediction mode M_pre and generate the corresponding Gaussian model G_m according to the coding mode;

S2. Train the channel-adaptive backbone network CAB with the 16×16 and 32×32 pixel sub-blocks, each convolution block in the CAB being divided into two channel groups, conv1 and conv2; in each iteration, use only the conv1 parameters for forward and backward propagation, not the conv2 parameters, and obtain the final super-resolution output I_SR by minimizing the perceptual loss L_per and the MSE loss L_mse;

S3. Train the channel-adaptive backbone network CAB with the 4×4 and 8×8 pixel sub-blocks, now using the parameters of both conv1 and conv2 for forward propagation; conv1 has already learned a feature-extraction scheme for smooth information during the training of step S2, so during backpropagation fix the conv1 parameters and update only the conv2 parameters, obtaining the final super-resolution output I_SR by minimizing the perceptual loss L_per and the MSE loss L_mse;

S4. After the training of steps S2 and S3 is complete, train the entire network, fixing the parameters of the channel-adaptive backbone network CAB, minimizing the perceptual loss L_per and the MSE loss L_mse, and updating the remaining network parameters; train the feature-extraction module of the branch corresponding to the 4×4 and 8×8 sub-blocks, preliminarily extracting their features;

S5. When training the branch network CAB corresponding to the 4×4 and 8×8 pixel sub-blocks, input the sub-blocks into the network in the relative order of their index i, i = 0, 1, 2, ..., 15; record each sub-block together with its four adjacent sub-blocks of the same size, where i represents the value of the numeric index;

S6. Sample the Gaussian model generated in step S1 at equal intervals in width and height, centered at (0, 0), obtaining a matrix with the same width and height as the convolution kernel; point-multiply this matrix with the kernel of the convolutional layer Conv in the adaptive convolution module ACB to weight the kernel, then apply the weighted kernel to the input sub-block in an ordinary convolution, obtaining after the ACB module a feature map that focuses more on the image texture;

S7. After every four adjacent sub-blocks have passed through the adaptive texture processing module, stitch the four sub-blocks according to their positions in the original picture and pass the result to the backbone network, obtaining a feature map whose width and height are twice those of a single sub-block, the four sub-block features occupying, in matrix form, a 2×2 block layout that matches their positions in the original image;

S8. Fine-tune the network further by minimizing L_total, completing the super-resolution of the picture.

2. The single-frame image super-resolution method based on video coding according to claim 1, characterized in that the coding prediction mode M_pre of step S1 includes the DC prediction mode, the planar prediction mode and the angular prediction modes.

3. The single-frame image super-resolution method based on video coding according to claim 1, characterized in that the covariance matrix C of G_m is controlled by the coding prediction mode M_pre:

G_m = Gauss(C, θ | M_pre)

By adjusting the covariance matrix, the maximum of the generated Gaussian model is made to coincide with the texture angle of the mode, adaptively focusing on the image texture; when M_pre is the DC or planar mode, a Gaussian model with a unit covariance matrix is used; for a sub-block whose M_pre is an angular mode with angle θ, an initial covariance matrix C is set and rotated by the angle θ, giving:

G_m = A(θ) C A(θ)^T

where A(θ) is the two-dimensional rotation matrix [[cos θ, -sin θ], [sin θ, cos θ]] and A(θ)^T denotes the transpose of A(θ).

4. The single-frame image super-resolution method based on video coding according to claim 1, characterized in that the fine adjustment in step S8 specifically includes:

using the MSE loss L_mse = (1/N) Σ (I_SR - I_HR)², summed over all N pixels, to minimize the gap between the network output and the real high-resolution image, where I_SR denotes the outputs of the different branches, each compared with the real image I_HR of the corresponding branch; adding a perceptual loss term L_per = || f(I_SR) - f(I_HR) ||² to the loss function so that the L2 distance between the features of the generated picture and those of the target picture, both extracted by the CNN f, is as small as possible, making the generated picture semantically more similar to the target picture, f being specifically a VGG-16 network;

using the larger loss weight ω_2 for the 4×4 and 8×8 sub-blocks and the smaller weight ω_1 for the larger, smoother 16×16 and 32×32 sub-blocks;

the loss function L_total being expressed as:

L_total = ω_1 (L_mse + L_per)_smooth + ω_2 (L_mse + L_per)_texture

where the subscripts denote the smooth (16×16, 32×32) and textured (4×4, 8×8) branches, ω_1 is 0.5 and ω_2 is 1.
CN202110541900.7A 2021-05-18 2021-05-18 Single-frame image super-resolution method based on video coding Active CN113393377B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110541900.7A | 2021-05-18 | 2021-05-18 | Single-frame image super-resolution method based on video coding (CN113393377B)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110541900.7A | 2021-05-18 | 2021-05-18 | Single-frame image super-resolution method based on video coding (CN113393377B)

Publications (2)

Publication Number Publication Date
CN113393377A CN113393377A (en) 2021-09-14
CN113393377B true CN113393377B (en) 2022-02-01

Family

ID=77617993

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110541900.7A | Single-frame image super-resolution method based on video coding (CN113393377B, Active) | 2021-05-18 | 2021-05-18

Country Status (1)

Country Link
CN (1) CN113393377B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115115512B (en) * 2022-06-13 2023-10-03 Honor Device Co., Ltd. A training method and device for image super-resolution network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102835105A (en) * 2010-02-19 2012-12-19 Skype Data compression for video
CN110956671A (en) * 2019-12-12 2020-04-03 University of Electronic Science and Technology of China An image compression method based on multi-scale feature coding
CN112449140A (en) * 2019-08-29 2021-03-05 Huawei Technologies Co., Ltd. Video super-resolution processing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110969577B (en) * 2019-11-29 2022-03-11 Beijing Jiaotong University A video super-resolution reconstruction method based on deep dual attention network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102835105A (en) * 2010-02-19 2012-12-19 Skype Data compression for video
CN112449140A (en) * 2019-08-29 2021-03-05 Huawei Technologies Co., Ltd. Video super-resolution processing method and device
CN110956671A (en) * 2019-12-12 2020-04-03 University of Electronic Science and Technology of China An image compression method based on multi-scale feature coding

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A2RMNet: Adaptively Aspect Ratio Multi-Scale Network for Object Detection in Remote Sensing Images; Heqian Qiu et al.; Remote Sensing; 2019-07-04; pp. 2-23 *
Region Adaptive Two-Shot Network for Single Image Dehazing; Hui Li et al.; IEEE Xplore; 2020-06-19; pp. 1-6 *
Sparse representation moving-target tracking based on compressed features; Zhang Hongmei et al.; Journal of Zhengzhou University (Engineering Science); 2016-06-03; No. 03; pp. 24-29 *
A fast coding unit partition algorithm for intra prediction in high efficiency video coding; Qi Meibin et al.; Journal of Electronics & Information Technology; 2014-07; Vol. 36, No. 7; pp. 1699-1704 *

Also Published As

Publication number Publication date
CN113393377A (en) 2021-09-14

Similar Documents

Publication Publication Date Title
CN111861945B (en) A text-guided image restoration method and system
CN108596841B (en) A parallel method for image super-resolution and deblurring
CN116309232B (en) Underwater image enhancement method combining physical priori with deep learning
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
CN116433516A (en) Low-illumination image denoising and enhancing method based on attention mechanism
CN114898284B (en) Crowd counting method based on feature pyramid local difference attention mechanism
CN116912114B (en) Non-reference low-illumination image enhancement method based on high-order curve iteration
CN116958534A (en) An image processing method, image processing model training method and related devices
CN110148138A (en) A kind of video object dividing method based on dual modulation
CN116109510A (en) A Face Image Inpainting Method Based on Dual Generation of Structure and Texture
CN118230131B (en) Image recognition and target detection method
CN119863544B (en) Lightweight face image generation method based on improved StarGAN and knowledge distillation
CN117152600A (en) An underwater image processing method based on lightweight diffusion model
CN115937704A (en) Remote sensing image road segmentation method based on topology perception neural network
CN117689592A (en) An underwater image enhancement method based on cascade adaptive network
CN113393377B (en) Single-frame image super-resolution method based on video coding
CN115526779A (en) Infrared image super-resolution reconstruction method based on dynamic attention mechanism
Shen et al. Deeper super-resolution generative adversarial network with gradient penalty for sonar image enhancement
CN118799230A (en) A high dynamic range imaging method based on multi-scale progressive reconstruction network
CN116681621A (en) Face image restoration method based on feature fusion and multiplexing
Yang et al. PAFPT: Progressive aggregator with feature prompted transformer for underwater image enhancement
CN119888656A (en) Self-supervision blind image decomposition method oriented to automatic driving scene
CN118154440A (en) Image enhancement method and system based on multi-prior feature fusion
CN118333865A (en) Multi-scale mixed self-attention-based light-weight image super-resolution method
CN117409204A (en) Real-time semantic segmentation method based on feature multiplexing and two-stage self-attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant