
CN120238649A - A lightweight semantic communication method and system for image transmission - Google Patents

A lightweight semantic communication method and system for image transmission

Info

Publication number
CN120238649A
Authority
CN
China
Prior art keywords
convolution
shift
downsampling
layer
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202510713473.4A
Other languages
Chinese (zh)
Other versions
CN120238649B (en)
Inventor
涂杰楠
何若欣
吴志豪
刘晓东
徐子晨
周福辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang University
Original Assignee
Nanchang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang University
Priority to CN202510713473.4A
Publication of CN120238649A
Application granted
Publication of CN120238649B
Legal status: Active
Anticipated expiration

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to the technical field of communication and provides a lightweight semantic communication method and system for image transmission. Shift convolution replaces conventional convolution in downsampling, and outer shift convolution is applied to the input target image; the resulting first feature map is stitched into the downsampling input data, so that two shift convolution operations are formed and semantic features are extracted more efficiently. Downsampling is performed several times in succession and the outer shift convolution is reused across these stages, which further reduces the demand on computing resources and makes the system more lightweight. Because the first feature map carries original information obtained from the target image, stitching it into each downsampling reduces the loss of original information caused by repeated shift convolution and improves the quality of the reconstructed image. By combining inner and outer shift convolution, the method and system meet the lightweight requirement while effectively improving image reconstruction quality.

Description

Lightweight semantic communication method and system for image transmission
Technical Field
The invention relates to the technical field of communication, in particular to a lightweight semantic communication method and system for image transmission.
Background
With the rapid development of bandwidth-intensive applications such as the internet of things and virtual reality, the demand of wireless communication networks for data transmission has increased exponentially, which makes efficient wireless communication system design a research hotspot. Among them, semantic communication has become an important research direction of 6G communication technology by virtue of its potential in improving transmission efficiency, reducing redundancy, adapting to complex channel environments, and the like.
Currently, research on improving the performance of semantic communication systems has made much progress, but efficient semantic extraction remains a key challenge. The traditional semantic extraction method based on a convolutional neural network (Convolutional Neural Network, CNN) can achieve high precision, but consumes a large amount of computing resources. Deep learning (Deep Learning, DL) has been applied in semantic communication systems because it can extract semantic features automatically, but its demands on power consumption and computing resources remain high, and the network models employed have a large number of parameters. Because the power and computing resources of Internet of Things devices are limited, prior-art semantic communication systems are difficult to apply efficiently to such devices, which limits the development of Internet of Things technology.
Disclosure of Invention
Based on the above, the invention aims to provide an image transmission-oriented lightweight semantic communication method and system, so as to solve the problems that the semantic communication system in the prior art is high in resource consumption and difficult to be applied to Internet of things equipment efficiently, and the development of Internet of things technology is limited.
The invention provides a lightweight semantic communication method for image transmission, which comprises the following steps:
Sequentially performing first convolution processing, downsampling and first atrous spatial pyramid pooling on a target image to obtain encoded data;
Sequentially performing upsampling, second convolution processing and second atrous spatial pyramid pooling on the received encoded data to obtain a reconstructed image of the target image;
And performing outer shift convolution on the target image to obtain a first feature map, wherein the feature size of the first feature map is consistent with the feature size of the downsampling input data;
The downsampling comprises feature stitching, inner shift convolution and nonlinear activation performed in sequence, wherein the feature stitching is used for stitching the first feature map into the downsampling input data;
the first feature map is also superimposed on the output data of the inner shift convolution through a residual connection;
the outer shift convolution and the inner shift convolution each comprise shift, batch normalization and pointwise convolution performed in sequence;
The downsampling is performed a plurality of times in succession.
Optionally, the downsampling further comprises:
Adjusting the channel weight of the down-sampled input data according to a channel attention mechanism to obtain first intermediate data;
the feature stitching is used for stitching the first feature map to the first intermediate data to obtain second intermediate data, so that inner shift convolution is performed according to the second intermediate data.
Optionally, the step of adjusting the channel weights of the input data according to the channel attention mechanism to obtain the first intermediate data further comprises:
third convolution processing, first global average pooling, fully connected layer dimension reduction, activation, fully connected layer dimension increase, second global average pooling and channel weighting performed in sequence;
Wherein the output of the third convolution processing is also superimposed on the output data of the second global average pooling through a residual connection.
Optionally, the step of obtaining the encoded data operates in a sequential processing mode and further comprises obtaining a scaling factor from the difference between the size of the current downsampling input data and the size of the target image, and obtaining the shift operation step size of the outer shift convolution from the scaling factor, so that the size of the resulting first feature map is consistent with the size of the current downsampling input data.
Optionally, the method further comprises stitching the first feature map into the input data of the first atrous spatial pyramid pooling.
Another aspect of the present invention provides an image transmission oriented lightweight semantic communication system, comprising:
The encoder comprises a first convolution module, a downsampling module and a first atrous spatial pyramid pooling module connected in sequence, and is used for sequentially performing first convolution processing, downsampling and first atrous spatial pyramid pooling on a target image to obtain encoded data;
the decoder comprises an upsampling module, a second convolution module and a second atrous spatial pyramid pooling module connected in sequence, and is used for sequentially performing upsampling, second convolution processing and second atrous spatial pyramid pooling on the received encoded data to obtain a reconstructed image of the target image;
The encoder further comprises an outer shift convolution unit, wherein the outer shift convolution unit is used for performing outer shift convolution on the target image to obtain a first feature map, and the feature size of the first feature map is consistent with the feature size of the downsampling;
The downsampling module comprises a feature stitching unit, an inner shift convolution unit and a nonlinear activation unit connected in sequence, wherein the feature stitching unit is used for stitching the first feature map into the input data of the downsampling module;
The outer shift convolution unit is further connected to the output end of the inner shift convolution unit, so that the first feature map is superimposed on the output data of the inner shift convolution unit through a residual connection;
the outer shift convolution unit and the inner shift convolution unit each comprise a shift layer, a batch normalization layer and a pointwise convolution layer connected in sequence;
a plurality of the downsampling modules are arranged in succession.
Optionally, the downsampling module further comprises:
the squeeze-and-excitation unit is used for adjusting the channel weights of the downsampling input data according to a channel attention mechanism to obtain first intermediate data;
the feature stitching unit is used for stitching the first feature map to the first intermediate data to obtain second intermediate data, and the second intermediate data is used as the input data of the inner shift convolution unit.
Optionally, the squeeze-and-excitation unit includes:
a third convolution layer, a first global average pooling layer, a fully connected dimension-reduction layer, an activation layer, a fully connected dimension-increase layer, a second global average pooling layer and a channel weighting layer connected in sequence;
Wherein the output of the third convolution layer is also superimposed on the output data of the second global average pooling layer through a residual connection.
Optionally, the encoder operates in a sequential processing mode, and the outer shift convolution unit is further configured to obtain a scaling factor from the difference between the size of the input data of the current downsampling module and the size of the target image, and to obtain the shift operation step size of the outer shift convolution unit from the scaling factor, so that the size of the resulting first feature map is consistent with the size of the input data of the current downsampling module.
Optionally, the first feature map is further stitched into the input data of the first atrous spatial pyramid pooling module.
In the lightweight semantic communication method for image transmission provided by the invention, shift convolution replaces conventional convolution in downsampling, which effectively reduces computational complexity. Outer shift convolution is also applied to the input target image, and the resulting first feature map is stitched into the downsampling input data, so that two shift convolution operations are formed and semantic features are extracted more efficiently. Downsampling is performed several times in succession and the outer shift convolution can be reused, which further reduces the demand on computing resources and makes the method more lightweight. The first feature map is original information obtained from the target image; stitching it into each downsampling reduces the loss of original information caused by repeated shift convolution and improves the quality of the reconstructed image. By combining the outer shift convolution with the inner shift convolution of multiple downsampling stages, the method reduces the computing resources required at the encoding end, limits the loss of original information, and improves the quality of the reconstructed image obtained by decoding.
Drawings
FIG. 1 is a main flow chart of a lightweight semantic communication method for image transmission in an embodiment of the present invention;
FIG. 2 is a flow chart of downsampling correlation of a lightweight semantic communication method for image transmission in an embodiment of the present invention;
Fig. 3 is a test result of a lightweight semantic communication method for image transmission in an embodiment of the present invention.
The invention will be further described in the following detailed description in conjunction with the above-described figures.
Detailed Description
In order that the invention may be readily understood, a more complete description of the invention will be rendered by reference to the appended drawings. Several embodiments of the invention are presented in the figures. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
It will be understood that when an element is referred to as being "mounted" on another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like are used herein for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
The application addresses the problems that prior-art semantic communication systems consume substantial resources, are difficult to apply efficiently to Internet of Things devices, and thereby limit the development of Internet of Things technology. It provides a lightweight semantic communication method for image transmission in which shift convolution replaces conventional convolution in downsampling, which effectively reduces computational complexity. Outer shift convolution is applied to the input target image, and the resulting first feature map is stitched into the inner shift convolution of the downsampling, so that two shift convolution operations are formed and semantic features are extracted more efficiently. Downsampling is performed several times in succession and the outer shift convolution can be reused, which further reduces the demand on computing resources and makes the method more lightweight. Because the first feature map is original information obtained from the target image, the loss of original information caused by repeated shift convolution is reduced and the quality of the reconstructed image is improved.
Referring to fig. 1 and fig. 2, flowcharts of a lightweight semantic communication method for image transmission according to an embodiment of the invention are shown.
At the encoder end, the encoding step comprises sequentially performing first convolution processing, downsampling and first atrous spatial pyramid pooling on the target image to obtain encoded data, and the encoded data is sent to the receiving end through a physical channel.
At the decoder end, the decoding step comprises sequentially performing upsampling, second convolution processing and second atrous spatial pyramid pooling on the received encoded data to obtain a reconstructed image of the target image.
The first convolution processing and the second convolution processing can be implemented by a convolutional neural network (Convolutional Neural Network, CNN). The first convolution processing increases the number of feature channels of the data, for example from the three R, G and B color channels of an RGB image to 32 channels. The second convolution processing performs tasks such as feature refinement, fusion, noise reduction and dimension adjustment to recover the original information of the target image.
The downsampling adopts shift convolution (ShiftConv). Compared with conventional CNN convolution, shift convolution effectively reduces the demand on computing resources and enables a lightweight design.
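To make the saving concrete, the following sketch counts learnable weights and multiply-accumulate operations for a standard k×k convolution and for a shift followed by a 1×1 pointwise convolution. The function names and the 32-to-64-channel, 64×64 example are illustrative assumptions that do not come from the patent; biases and batch normalization parameters are ignored.

```python
def conv_cost(c_in, c_out, k, h_out, w_out):
    """Weights and multiply-accumulates (MACs) of a standard k x k convolution."""
    params = c_in * c_out * k * k
    macs = params * h_out * w_out
    return params, macs


def shift_conv_cost(c_in, c_out, h_out, w_out):
    """Cost of shift + 1x1 pointwise convolution.

    The shift only moves data, so it contributes no weights and no MACs.
    """
    params = c_in * c_out
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer: 32 -> 64 channels on a 64 x 64 output map.
std_params, std_macs = conv_cost(32, 64, 3, 64, 64)
shift_params, shift_macs = shift_conv_cost(32, 64, 64, 64)
print(std_params, shift_params)  # 18432 2048
print(std_macs // shift_macs)    # 9
```

Under these assumptions a 3×3 convolution costs nine times as many weights and MACs as the shift-based replacement, which is the kind of reduction the paragraph above refers to.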
As shown in fig. 2, the downsampling mainly comprises an inner shift convolution operation. The inner shift convolution and the outer shift convolution each comprise shift (Shift), batch normalization (BatchNorm) and pointwise convolution (1×1 convolution) performed in sequence. The shift enhances feature diversity through spatial translation, the batch normalization ensures stable training, and the pointwise convolution optimizes channel interaction; together they improve the training efficiency and robustness of the shift convolution.
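The three stages can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the cyclic offset pattern in `OFFSETS`, the per-sample normalization used in place of trained batch normalization statistics, and all function names are assumptions, and the shift step size (stride) is left out for brevity. Feature maps are nested lists indexed as `x[channel][row][col]`.

```python
import math

# Assumed shift pattern: channels cycle through these (dy, dx) offsets.
OFFSETS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]


def shift(x):
    """Translate each channel by its assigned offset, zero-filling the border."""
    out = []
    for c, chan in enumerate(x):
        dy, dx = OFFSETS[c % len(OFFSETS)]
        h, w = len(chan), len(chan[0])
        out.append([[chan[i - dy][j - dx]
                     if 0 <= i - dy < h and 0 <= j - dx < w else 0.0
                     for j in range(w)] for i in range(h)])
    return out


def normalize(x, eps=1e-5):
    """Per-channel zero-mean, unit-variance scaling (stand-in for BatchNorm)."""
    out = []
    for chan in x:
        vals = [v for row in chan for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        out.append([[(v - mean) / math.sqrt(var + eps) for v in row]
                    for row in chan])
    return out


def pointwise_conv(x, weights):
    """1x1 convolution: weights[c_out][c_in] mixes channels at every pixel."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wk * x[k][i][j] for k, wk in enumerate(w_row))
              for j in range(w)] for i in range(h)] for w_row in weights]


def shift_conv(x, weights):
    """Shift -> normalization -> pointwise convolution, in that order."""
    return pointwise_conv(normalize(shift(x)), weights)


x = [[[1, 2], [3, 4]], [[1, 2], [3, 4]]]   # two identical 2 x 2 channels
print(shift(x)[1])                          # [[0.0, 1], [0.0, 3]]
print(pointwise_conv(shift(x), [[1, 1]]))   # [[[1.0, 3], [3.0, 7]]]
```

All spatial mixing comes from the weight-free shift, so the only learnable weights in this sketch are those of the 1×1 convolution.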
To improve the extraction efficiency of semantic features, in this embodiment downsampling is performed several times in succession, and the encoder end also performs outer shift convolution. The outer shift convolution processes the target image to obtain a first feature map, which is stitched into the input data of the inner shift convolution of each downsampling through a feature stitching operation, so that the loss of original information of the target image over the repeated shift convolutions is reduced.
The outer shift convolution and the inner shift convolution have the same operations and architecture, so that the feature size of the first feature map obtained by the outer shift convolution is consistent with the feature size at the inner shift convolution, and the first feature map can be stitched into the input data of the inner shift convolution through feature stitching.
The characteristic splicing operation can be specifically splicing according to Channel dimensions (Channel-wise Concatenation).
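A channel-wise concatenation can be sketched as below, assuming feature maps stored as nested lists indexed `[channel][row][col]`; the helper name `concat_channels` is hypothetical. The size check reflects the requirement stated above that the two feature maps share the same spatial size.

```python
def concat_channels(a, b):
    """Concatenate feature maps a[C1][H][W] and b[C2][H][W] along the channel axis."""
    size_a = (len(a[0]), len(a[0][0]))
    size_b = (len(b[0]), len(b[0][0]))
    if size_a != size_b:
        raise ValueError("spatial sizes must match before channel-wise concatenation")
    return a + b  # result has C1 + C2 channels


features = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # 2 channels, 2 x 2
first_map = [[[9, 9], [9, 9]]]                   # 1 channel, 2 x 2
print(len(concat_channels(features, first_map)))  # 3
```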
After the inner shift convolution, nonlinear activation is performed by a PReLU activation function. Introducing nonlinearity breaks the linear constraint, improves gradient propagation and enhances feature expression, which effectively improves the efficiency of semantic feature extraction.
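For reference, PReLU passes positive inputs unchanged and scales negative inputs by a slope that is learned during training. In the sketch below the slope `alpha=0.25` is only an illustrative value, not one taken from the patent.

```python
def prelu(x, alpha=0.25):
    """PReLU: identity for positive inputs, slope alpha for the rest."""
    return x if x > 0 else alpha * x


print([prelu(v) for v in (-2.0, 0.0, 2.0)])  # [-0.5, 0.0, 2.0]
```

Unlike ReLU, negative inputs keep a non-zero gradient, which is one way the activation can help gradient propagation.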
In order to improve the efficiency of the downsampling training, in this embodiment, the first feature map is further superimposed into the output data of the inner shift convolution through a residual connection.
To improve model performance, the downsampling further comprises a squeeze-and-excitation (Squeeze-and-Excitation, SE) operation, which adjusts the channel weights of the downsampling input data according to a channel attention mechanism to obtain first intermediate data. The squeeze (Squeeze) operation is mainly global average pooling, and the excitation (Excitation) operation is mainly fully connected layer (Fully Connected Layer) processing. Together, the global average pooling and the fully connected layers make the model more sensitive to channels, improving network performance while the resolution of the image features remains unchanged.
The feature stitching is used for stitching the first feature map to the first intermediate data output by the squeeze-and-excitation operation to obtain second intermediate data, and the inner shift convolution is performed on the second intermediate data.
As shown in fig. 2, the squeeze-and-excitation operation specifically comprises third convolution processing (implemented by a CNN), first global average pooling, fully connected layer dimension reduction, activation (by a ReLU activation function), fully connected layer dimension increase, second global average pooling and channel weighting (Scale) performed in sequence, wherein the output of the third convolution processing is also superimposed on the output data of the second global average pooling through a residual connection.
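The channel attention idea can be sketched as follows. This follows the commonly used squeeze-and-excitation form with a sigmoid gate, which is an assumption: it is not the exact layer ordering of fig. 2, and the third convolution, the second global average pooling and the residual connection are omitted. The weight matrices and the function name are hypothetical.

```python
import math


def squeeze_excite(x, w_reduce, w_expand):
    """Reweight channels of x[C][H][W].

    w_reduce[C_r][C] is the dimension-reduction layer and
    w_expand[C][C_r] is the dimension-increase layer.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(v for row in chan for v in row) / (len(chan) * len(chan[0]))
         for chan in x]
    # Excitation: FC reduce -> ReLU -> FC expand -> sigmoid gate (assumed).
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, hidden))))
             for row in w_expand]
    # Scale: weight each channel by its gate; spatial resolution is unchanged.
    return [[[g * v for v in row] for row in chan]
            for g, chan in zip(gates, x)]


x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
y = squeeze_excite(x, w_reduce=[[0.0, 0.0]], w_expand=[[0.0], [0.0]])
print(y[0][0][0], y[1][0][0])  # 0.5 1.0
```

With all-zero weights every gate equals 0.5, so each channel is simply halved; trained weights would instead emphasize some channels over others.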
After downsampling, the output feature map differs in size from the input feature map by a scaling. To ensure that the first feature map produced by each of the repeated outer shift convolutions matches the size of the feature map it is stitched with, the step of obtaining the encoded data operates in a sequential processing mode (the next image is encoded only after encoding of the current image is complete). The step further comprises obtaining a scaling factor from the difference between the size of the current downsampling input data and the size of the target image, and obtaining the shift operation step size of the outer shift convolution from the scaling factor, so that the size of the resulting first feature map is consistent with the size of the current downsampling input data.
When an Internet of Things device is used in a scenario with low requirements on communication speed, the sequential processing mode is adopted. Dynamically adjusting the step size of the outer shift convolution satisfies the size-matching requirement of the two feature maps being stitched, which ensures that the stitching is valid, allows the outer shift convolution to be reused, and keeps the system lightweight.
Specifically, an image is generally two-dimensional data, and after shift convolution the two-dimensional size of the resulting feature map differs from that of the original image. A scaling factor (scale factor) can be obtained for each of the two dimensions, and the average of the two is used as the scaling factor of the outer shift convolution. The size transformation formula of the shift convolution is $o = \lfloor (i + 2p - k)/S \rfloor + 1$, where $o$ is the size of the feature map after shift convolution, $\lfloor \cdot \rfloor$ denotes rounding down to an integer, $i$ is the input image size, $p$ is the padding, $k$ is the size of the convolution kernel, and $S$ is the scaling factor, which is consistent with the shift step size of the shift convolution.
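The step-size selection can be illustrated with a short sketch. It assumes the standard convolution size relation, output = floor((i + 2p − k) / S) + 1, and assumes that the averaged scaling factor is rounded to the nearest integer to give the step size; the rounding rule, the function names and the 256×256 example are illustrative and not taken from the patent.

```python
def output_size(i, k, p, s):
    """Standard convolution size relation: floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def outer_stride(target_hw, current_hw):
    """Average the per-dimension scaling factors and round to an integer step size."""
    scale_h = target_hw[0] / current_hw[0]
    scale_w = target_hw[1] / current_hw[1]
    return max(1, round((scale_h + scale_w) / 2))


# Hypothetical case: 256 x 256 target image, current downsampling input is 64 x 64.
s = outer_stride((256, 256), (64, 64))
print(s)                                # 4
print(output_size(256, k=1, p=0, s=s))  # 64
print(output_size(256, k=3, p=1, s=s))  # 64
```

In both kernel settings the first feature map comes out at 64, matching the current downsampling input, so the two maps can be stitched along the channel dimension.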
To further enhance the feature extraction effect, as shown in fig. 1, this embodiment further comprises stitching the first feature map into the input data of the first atrous spatial pyramid pooling (Atrous Spatial Pyramid Pooling, ASPP). Stitching the first feature map, which contains the original information of the target image, into the first atrous spatial pyramid pooling reduces the loss of details (such as edges and textures) in deep features and improves the multi-scale feature extraction effect.
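As a small illustration of why atrous (dilated) convolution captures multi-scale context, the sketch below computes the effective extent k + (k − 1)(r − 1) of a k×k kernel with dilation rate r. The rates (1, 6, 12, 18) are an assumption borrowed from common ASPP configurations; the patent does not specify them.

```python
def effective_kernel(k, rate):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (rate - 1)


ASSUMED_RATES = (1, 6, 12, 18)  # illustrative only; not given in the patent
extents = [effective_kernel(3, r) for r in ASSUMED_RATES]
print(extents)  # [3, 13, 25, 37]
```

Parallel branches with different rates therefore see neighborhoods of different sizes at the same parameter cost, which is what makes the pooling multi-scale.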
The invention also provides a lightweight semantic communication system facing image transmission, which comprises:
The encoder comprises a first convolution module, a downsampling module and a first atrous spatial pyramid pooling module connected in sequence, and is used for sequentially performing first convolution processing, downsampling and first atrous spatial pyramid pooling on a target image to obtain encoded data;
The decoder comprises an upsampling module, a second convolution module and a second atrous spatial pyramid pooling module connected in sequence, and is used for sequentially performing upsampling, second convolution processing and second atrous spatial pyramid pooling on the received encoded data to obtain a reconstructed image of the target image;
the encoder further comprises an outer shift convolution unit, wherein the outer shift convolution unit is used for performing outer shift convolution on the target image to obtain a first feature map, and the feature size of the first feature map is consistent with the feature size of the downsampling;
The downsampling module comprises a feature stitching unit, an inner shift convolution unit and a nonlinear activation unit connected in sequence, wherein the feature stitching unit is used for stitching the first feature map into the input data of the downsampling module;
The outer shift convolution unit is also connected to the output end of the inner shift convolution unit, so that the first feature map is superimposed on the output data of the inner shift convolution unit through a residual connection;
The outer shift convolution unit and the inner shift convolution unit each comprise a shift layer, a batch normalization layer and a pointwise convolution layer connected in sequence;
a plurality of the downsampling modules are arranged in succession.
The downsampling module further comprises a squeeze-and-excitation unit for adjusting the channel weights of the downsampling input data according to a channel attention mechanism to obtain first intermediate data, wherein the feature stitching unit is used for stitching the first feature map to the first intermediate data to obtain second intermediate data, and the second intermediate data is used as the input data of the inner shift convolution unit.
The squeeze-and-excitation unit specifically comprises a third convolution layer, a first global average pooling layer, a fully connected dimension-reduction layer, an activation layer, a fully connected dimension-increase layer, a second global average pooling layer and a channel weighting layer connected in sequence, wherein the output of the third convolution layer is also superimposed on the output data of the second global average pooling layer through a residual connection.
This embodiment is mainly used in scenarios with low requirements on communication speed. Accordingly, the encoder operates in a sequential processing mode, and the outer shift convolution unit is further used for obtaining a scaling factor from the difference between the size of the input data of the current downsampling module and the size of the target image, and for obtaining the shift operation step size of the outer shift convolution unit from the scaling factor, so that the size of the resulting first feature map is consistent with the size of the input data of the current downsampling module.
Fig. 3 shows the test result of the lightweight semantic communication method for image transmission in this embodiment, with peak signal-to-noise ratio (PSNR) used as the quality index of the reconstructed image. Under low signal-to-noise ratio (SNR) conditions, the PSNR of the method rises steadily as the SNR increases, with no steep drop, which indicates that little original information is lost and that the image reconstruction quality is maintained.
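For readers unfamiliar with the metric, PSNR can be computed from the mean squared error between the original and reconstructed pixels, as in the sketch below. It assumes 8-bit pixel values (peak value 255) supplied as flat sequences; the function name and the sample values are illustrative and are not test data from the patent.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("images must contain the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


print(round(psnr([10, 20, 30], [11, 21, 31]), 2))  # 48.13
```

A higher PSNR means the reconstructed image is closer to the original.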
In summary, in the lightweight semantic communication method for image transmission provided by the invention, shift convolution replaces conventional convolution in downsampling, which effectively reduces computational complexity. Outer shift convolution is applied to the input target image, and the resulting first feature map is stitched into the downsampling input data, so that two shift convolution operations are formed and semantic features are extracted more efficiently. Downsampling is performed several times in succession, which further reduces the demand on computing resources and makes the method more lightweight. Because the first feature map is original information obtained from the target image, the loss of original information caused by repeated shift convolution is reduced and the quality of the reconstructed image is improved.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above examples merely represent a few specific embodiments of the present invention, which are described in more detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims (10)

1. A lightweight semantic communication method for image transmission, comprising:
sequentially performing a first convolution process, downsampling, and a first atrous spatial pyramid pooling on a target image to obtain encoded data;
sequentially performing upsampling, a second convolution process, and a second atrous spatial pyramid pooling on the received encoded data to obtain a reconstructed image of the target image;
and performing an outer shift convolution on the target image to obtain a first feature map, wherein a feature size of the first feature map is consistent with a feature size of the input data of the downsampling;
wherein the downsampling comprises feature splicing, inner shift convolution, and nonlinear activation performed in sequence, and the feature splicing is used to splice the first feature map into the input data of the downsampling;
the first feature map is further superimposed on the output data of the inner shift convolution through a residual connection;
the outer shift convolution and the inner shift convolution each comprise shifting, batch normalization, and pointwise convolution performed in sequence;
the downsampling is performed multiple times in succession.

2. The lightweight semantic communication method for image transmission according to claim 1, wherein the downsampling further comprises:
adjusting channel weights of the input data of the downsampling according to a channel attention mechanism to obtain first intermediate data;
wherein the feature splicing is used to splice the first feature map to the first intermediate data to obtain second intermediate data, so that the inner shift convolution is performed on the second intermediate data.

3. The lightweight semantic communication method for image transmission according to claim 2, wherein the step of adjusting the channel weights of the input data according to the channel attention mechanism to obtain the first intermediate data further comprises:
a third convolution process, a first global average pooling, fully connected layer dimensionality reduction, activation, fully connected layer dimensionality increase, a second global average pooling, and channel weighting performed in sequence;
wherein the output of the third convolution process is further superimposed on the output data of the second global average pooling through a residual connection.

4. The lightweight semantic communication method for image transmission according to claim 1, wherein the step of obtaining the encoded data operates in a sequential processing mode and further comprises: obtaining a scaling factor according to the difference between the size of the input data of the current downsampling and the size of the target image, and obtaining a shift operation step size of the outer shift convolution according to the scaling factor, so that the size of the obtained first feature map is consistent with the size of the input data of the current downsampling.

5. The lightweight semantic communication method for image transmission according to claim 1, further comprising: splicing the first feature map into the input data of the first atrous spatial pyramid pooling.

6. A lightweight semantic communication system for image transmission, comprising:
an encoder, comprising a first convolution module, a downsampling module, and a first atrous spatial pyramid pooling module connected in sequence, configured to sequentially perform a first convolution process, downsampling, and a first atrous spatial pyramid pooling on a target image to obtain encoded data;
a decoder, comprising an upsampling module, a second convolution module, and a second atrous spatial pyramid pooling module connected in sequence, configured to sequentially perform upsampling, a second convolution process, and a second atrous spatial pyramid pooling on the received encoded data to obtain a reconstructed image of the target image;
wherein the encoder further comprises an outer shift convolution unit, the outer shift convolution unit is configured to perform an outer shift convolution on the target image to obtain a first feature map, and a feature size of the first feature map is consistent with the feature size of the downsampling;
the downsampling module comprises a feature splicing unit, an inner shift convolution unit, and a nonlinear activation unit connected in sequence, wherein the feature splicing is used to splice the first feature map into the input data of the downsampling module;
the outer shift convolution unit is further connected to the output end of the inner shift convolution unit, so as to superimpose the first feature map on the output data of the inner shift convolution unit through a residual connection;
the outer shift convolution unit and the inner shift convolution unit each comprise a shift layer, a batch normalization layer, and a pointwise convolution layer connected in sequence;
a plurality of the downsampling modules are arranged in succession.

7. The lightweight semantic communication system for image transmission according to claim 6, wherein the downsampling module further comprises:
a squeeze-and-excitation unit, configured to adjust channel weights of the input data of the downsampling according to a channel attention mechanism to obtain first intermediate data;
wherein the feature splicing unit is configured to splice the first feature map to the first intermediate data to obtain second intermediate data, and to use the second intermediate data as the input data of the inner shift convolution unit.

8. The lightweight semantic communication system for image transmission according to claim 7, wherein the squeeze-and-excitation unit comprises:
a third convolution layer, a first global average pooling layer, a fully connected dimensionality reduction layer, an activation layer, a fully connected dimensionality increase layer, a second global average pooling layer, and a channel weighting layer connected in sequence;
wherein the output of the third convolution layer is further superimposed on the output data of the second global average pooling layer through a residual connection.

9. The lightweight semantic communication system for image transmission according to claim 6, wherein the encoder operates in a sequential processing mode, and the outer shift convolution unit is further configured to: obtain a scaling factor according to the difference between the size of the input data of the current downsampling module and the size of the target image, and obtain a shift operation step size of the outer shift convolution unit according to the scaling factor, so that the size of the obtained first feature map is consistent with the size of the input data of the current downsampling module.

10. The lightweight semantic communication system for image transmission according to claim 6, wherein the first feature map is further spliced into the input data of the first atrous spatial pyramid pooling module.
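The shift, batch normalization, and pointwise convolution sequence recited in claims 1 and 6, together with the scaling-factor step of claims 4 and 9, can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. The function names, the per-channel (dy, dx) offsets, the zero fill at image borders, the reading of the scaling factor as an integer size ratio, the normalization without learned parameters, and the reading of "splicing" as channel concatenation are all assumptions made for illustration; the claims do not fix any of them. The residual connection and the nonlinear activation are omitted.

```python
import math

def shift(x, offsets, stride=1):
    """Shift each channel by its (dy, dx) offset with zero fill, sampling every `stride` pixels."""
    out = []
    for ch, (dy, dx) in zip(x, offsets):
        h, w = len(ch), len(ch[0])
        rows = []
        for i in range(0, h, stride):
            row = []
            for j in range(0, w, stride):
                si, sj = i - dy, j - dx  # source pixel for this output position
                row.append(ch[si][sj] if 0 <= si < h and 0 <= sj < w else 0.0)
            rows.append(row)
        out.append(rows)
    return out

def batch_norm(x, eps=1e-5):
    """Normalize each channel to zero mean and unit variance (assumption: no learned scale or bias)."""
    out = []
    for ch in x:
        vals = [v for row in ch for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        scale = 1.0 / math.sqrt(var + eps)
        out.append([[(v - mean) * scale for v in row] for row in ch])
    return out

def pointwise_conv(x, weights):
    """1x1 convolution: weights[o][c] mixes input channel c into output channel o."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wo[c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
             for i in range(h)] for wo in weights]

def shift_conv(x, offsets, weights, stride=1):
    """Shift, then batch normalization, then pointwise convolution, in that order."""
    return pointwise_conv(batch_norm(shift(x, offsets, stride)), weights)

def outer_stride(image_size, stage_input_size):
    """Assumed scaling factor: integer ratio of image size to the current stage's input size."""
    return image_size // stage_input_size

# Hypothetical example: a 3-channel 8x8 image and a downsampling stage whose input is 4x4.
image = [[[float(i * 8 + j + c) for j in range(8)] for i in range(8)] for c in range(3)]
offsets = [(0, 1), (1, 0), (0, -1)]            # one shift direction per input channel
weights = [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1]]  # 3 input channels -> 2 output channels
stride = outer_stride(8, 4)
first_map = shift_conv(image, offsets, weights, stride)  # outer shift convolution

# Feature splicing, read here as channel concatenation with the stage input.
stage_in = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]  # 4 channels of 4x4
spliced = stage_in + first_map
```

In this sketch the stride of the outer shift is what keeps the first feature map the same spatial size as the input of the current downsampling stage, so the same outer unit could in principle be reused for each successive stage with a different stride.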
CN202510713473.4A 2025-05-30 A lightweight semantic communication method and system for image transmission Active CN120238649B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510713473.4A CN120238649B (en) 2025-05-30 A lightweight semantic communication method and system for image transmission

Publications (2)

Publication Number Publication Date
CN120238649A true CN120238649A (en) 2025-07-01
CN120238649B CN120238649B (en) 2025-10-10

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020169735A1 (en) * 2001-03-07 2002-11-14 David Kil Automatic mapping from data to preprocessing algorithms
CN116324811A (en) * 2020-10-08 2023-06-23 华为技术有限公司 Multi-bandwidth separation feature extraction convolutional layer of convolutional neural network
CN114463840A (en) * 2021-12-31 2022-05-10 北京工业大学 Skeleton-based Shift Graph Convolutional Networks for Human Action Recognition
US20240135511A1 (en) * 2022-10-06 2024-04-25 Adobe Inc. Generating a modified digital image utilizing a human inpainting model
WO2024130776A1 (en) * 2022-12-22 2024-06-27 之江实验室 Three-dimensional lidar point cloud semantic segmentation method and apparatus based on deep learning
CN117115435A (en) * 2023-06-30 2023-11-24 重庆理工大学 Attention and multi-scale feature extraction-based real-time semantic segmentation method
CN117011527A (en) * 2023-08-06 2023-11-07 西南石油大学 Lightweight image semantic segmentation method based on spatial shift and convolution
CN117314753A (en) * 2023-10-31 2023-12-29 大连大学 Global-local cooperative lightweight image super-resolution method based on semantic guidance
CN117593716A (en) * 2023-12-07 2024-02-23 山东大学 Lane line identification method and system based on unmanned aerial vehicle inspection image
CN119399457A (en) * 2024-09-18 2025-02-07 广州大学 A real-time semantic segmentation method and system for multi-shape pyramids in traffic scenes
CN119600602A (en) * 2024-11-15 2025-03-11 西南交通大学 A lightweight semantic segmentation network and method for power grid inspection
CN119649051A (en) * 2024-12-11 2025-03-18 西安建筑科技大学 SAR image change detection method and system based on dynamic bilinear fusion network
CN119360028A (en) * 2024-12-20 2025-01-24 长春大学 A method for image semantic segmentation based on TransDeep model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GANG WU ET AL: "Fully 1 × 1 Convolutional Network for Lightweight Image Super-resolution", 《SPRINGER NATURE LINK》, 20 August 2024 (2024-08-20), pages 1 - 3 *
JIENAN TU ET AL: "Use of L-band SAR data for Monitoring Glacier Surging next to Aru Lake", 《ELSEVIER》, 31 January 2021 (2021-01-31) *

Similar Documents

Publication Publication Date Title
Luo et al. Lattice network for lightweight image restoration
CN108921786B (en) Image super-resolution reconstruction method based on residual convolutional neural network
CN112767253B (en) Multi-scale feature fusion binocular image super-resolution reconstruction method
CN110197468A (en) A kind of single image Super-resolution Reconstruction algorithm based on multiple dimensioned residual error learning network
CN115496658B (en) Lightweight image super-resolution reconstruction method based on dual attention mechanism
CN117173024B (en) Mine image super-resolution reconstruction system and method based on overall attention
CN113724134B (en) Aerial image blind super-resolution reconstruction method based on residual distillation network
CN111179167A (en) An image super-resolution method based on multi-stage attention enhancement network
CN113516601A (en) Image Restoration Technology Based on Deep Convolutional Neural Network and Compressed Sensing
CN117114994B (en) Mine image super-resolution reconstruction method and system based on hierarchical feature fusion
CN111583107A (en) Image super-resolution reconstruction method and system based on attention mechanism
CN110930308B (en) A Structure Search Method for Image Super-Resolution Generative Networks
CN118351538A (en) A remote sensing image road segmentation method combining channel attention mechanism and multi-layer axial Transformer feature fusion structure
CN116596764B (en) Lightweight image super-resolution method based on transform and convolution interaction
CN116958534A (en) An image processing method, image processing model training method and related devices
CN116645598A (en) Remote sensing image semantic segmentation method based on channel attention feature fusion
CN117745541A (en) Image super-resolution reconstruction method based on lightweight mixed attention network
CN111951203A (en) Viewpoint synthesis method, apparatus, device, and computer-readable storage medium
CN112991169B (en) Image compression method and system based on image pyramid and generation countermeasure network
CN117635478A (en) Low-light image enhancement method based on spatial channel attention
CN119559049A (en) Image super-resolution reconstruction method and device based on multi-domain information enhancement
CN117132472A (en) Forward-backward separable self-attention-based image super-resolution reconstruction method
CN120298723A (en) A remote sensing image stereo matching method and system based on Mamba model interpretation cost volume
KR20220039368A (en) A real-time super-resolution implementation method and apparatus based on artificial intelligence
CN120238649B (en) A lightweight semantic communication method and system for image transmission

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant