CN118741055B - High-resolution image transmission method and system based on optical communication - Google Patents
- Publication number
- CN118741055B CN118741055B CN202411233580.9A CN202411233580A CN118741055B CN 118741055 B CN118741055 B CN 118741055B CN 202411233580 A CN202411233580 A CN 202411233580A CN 118741055 B CN118741055 B CN 118741055B
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/22—Adaptations for optical transmission
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
Abstract
The application provides a high-resolution image transmission method and system based on optical communication, relating to the field of image transmission. The method includes: dividing a high-resolution image to be transmitted into a plurality of image blocks; determining, for each image block, a target compression strategy matched with its content complexity, and compressing the image block according to the target compression strategy; generating a corresponding image encoding mode according to the target compression strategy of each image block; and transmitting each compressed image block and the corresponding image encoding mode to an optical communication receiver, so that the receiver processes each received compressed image block and its image encoding mode by invoking a conditional generative adversarial network to generate a reconstructed high-resolution image, where the condition vector of the conditional generative adversarial network is determined according to the image encoding mode. In this way, compression rate and image quality are effectively balanced during transmission, and high-quality transmission of high-resolution images is realized.
Description
Technical Field
The present application relates to the field of image transmission technologies, and in particular, to a high-resolution image transmission method and system based on optical communication.
Background
Conventional image compression techniques, such as JPEG and PNG, compress image data by reducing redundant information and exploiting the characteristics of the human visual system. These methods perform well on lower-resolution images, but their limitations become increasingly apparent as image resolution grows.
First, conventional compression techniques struggle to balance compression rate and image quality for high-resolution images, and are particularly weak at lossless compression. In addition, real-time transmission requirements further expose their shortcomings: data loss and compression artifacts introduced during compression degrade the visual quality of the image and can destroy key details, reducing the transmission quality and application value of high-resolution images.
In view of the above problems, no preferred technical solution has yet been proposed.
Disclosure of Invention
The application provides a high-resolution image transmission method, system, storage medium, computer program product, and electronic device based on optical communication, to at least solve the current problem that compression rate and image quality cannot be balanced when compressing and transmitting high-resolution images.
In a first aspect, an embodiment of the present application provides a high-resolution image transmission method based on optical communication, applied to an optical communication transmitter, including: dividing a high-resolution image to be transmitted into a plurality of image blocks, each image block having a corresponding content complexity; determining, for each image block, a target compression policy matching the content complexity of the image block, and compressing the image block according to the target compression policy to obtain a corresponding compressed image block; generating a corresponding image encoding mode according to the target compression policy of each image block; and transmitting each compressed image block and the corresponding image encoding mode to an optical communication receiver, so that the optical communication receiver processes each received compressed image block and the corresponding image encoding mode by invoking a conditional generative adversarial network to generate a reconstructed high-resolution image, wherein a condition vector of the conditional generative adversarial network is determined according to the image encoding mode.
In a second aspect, an embodiment of the present application provides an optical-communication-based high-resolution image transmission system deployed in an optical communication transmitter, where the system includes: an image segmentation unit configured to segment a high-resolution image to be transmitted into a plurality of image blocks, each image block having a corresponding content complexity; a policy matching unit configured to determine, for each image block, a target compression policy matching the content complexity of the image block, and compress the image block according to the target compression policy to obtain a corresponding compressed image block; an encoding mode generating unit configured to generate a corresponding image encoding mode according to the target compression policy of each image block; and a data transmission unit configured to transmit each compressed image block and the corresponding image encoding mode to an optical communication receiver, so that the optical communication receiver processes each received compressed image block and the corresponding image encoding mode by invoking a conditional generative adversarial network to generate a reconstructed high-resolution image, where a condition vector of the conditional generative adversarial network is determined according to the image encoding mode.
In a third aspect, an electronic device is provided that includes at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the high-resolution image transmission method based on optical communication of any of the embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides a storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the steps of the high-resolution image transmission method based on optical communication of any of the embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the high resolution image transmission method based on optical communication of any of the embodiments of the present application.
The high-resolution image transmission method based on optical communication provided by the application can at least produce the following technical effects:
(1) By dividing the high-resolution image into a plurality of image blocks, determining a target compression strategy according to the content complexity of each image block, and dynamically adjusting the compression strategy according to the complexity of the image blocks, the complexity difference of different areas in the image can be flexibly dealt with, the image details of the complex areas are ensured to be reserved in the compression process, and the relatively simple areas are compressed more efficiently. Therefore, redundant data is further reduced on the basis of ensuring key content details, and the overall compression rate is improved, so that the data volume is effectively reduced in the transmission process of the high-resolution image.
(2) At the optical communication receiver side, the image is reconstructed by a Conditional Generative Adversarial Network (CGAN), and the condition vector is defined according to the image encoding mode to specify the specific conditions of the generated image. By learning a reconstruction strategy for each specific condition vector, the CGAN can intelligently reconstruct the compressed image blocks and more accurately recover the details of the original image, effectively repairing compression artifacts and detail loss. Compression rate and image quality are thereby balanced during transmission, realizing high-quality transmission of high-resolution images.
(3) By combining the content-complexity-based compression strategy with the conditional generative adversarial network, the image compression and transmission process gains higher adaptability and scalability. No matter how complex the image content is, the compression strategy can be dynamically adjusted and the image reconstructed according to specific conditions, ensuring good transmission of high-resolution images across different image types and application scenarios.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 shows a flowchart of an example of a high-resolution image transmission method based on optical communication according to an embodiment of the present application;
FIG. 2 illustrates an example operational flow diagram for segmenting a high resolution image into a plurality of image blocks according to an embodiment of the present application;
FIG. 3 illustrates an example operational flow diagram for compressing image blocks according to a target compression policy, according to an embodiment of the application;
FIG. 4 illustrates an example operational flow diagram for compressing image blocks by a context adaptive entropy compression strategy, according to an embodiment of the present application;
FIG. 5 illustrates a schematic diagram of the structural connections of an example of a CGAN generator according to an embodiment of the present application;
fig. 6 shows a block diagram of a high-resolution image transmission system based on optical communication according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an embodiment of an electronic device of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the technical scheme of the application, the related processes such as collection, storage, use, processing, transmission, provision, disclosure and the like of the personal information of the user accord with the regulations of related laws and regulations, and the public order is not violated.
Fig. 1 shows a flowchart of an example of a high-resolution image transmission method based on optical communication according to an embodiment of the present application.
The execution subject of the method of the embodiment of the application can be any controller or processor with computing or processing capability. By introducing a compression strategy matched with content complexity and a conditional generative adversarial network into the compression and transmission of high-resolution images, the quality of image compression and reconstruction is significantly improved, compression artifacts are reduced, real-time transmission performance is enhanced, and the adaptability and robustness of the system are improved.
In some examples, it may be integrally configured in an electronic device or terminal by means of software, hardware or a combination of software and hardware, and the type of terminal or electronic device may be diversified, such as an optical communication transmitter, a mobile phone, a tablet computer or a desktop computer, etc.
As shown in fig. 1, in step S110, a high resolution image to be transmitted is divided into a plurality of image blocks, each having a corresponding content complexity.
In some embodiments, the high-resolution image is divided into a plurality of image blocks at a predetermined resolution or block size, e.g., 16 x 16 or 32 x 32 blocks, according to the system setup, so that each block can be processed independently. After segmentation, a content complexity analysis is performed for each image block. Here, the content complexity may be calculated by various image complexity sensing algorithms using parameters such as gradient change, texture complexity, and color difference of the image block; the complexity of each image block is then quantized through these indexes to form a complexity vector. Illustratively, the content complexity of an image block is extracted by means of an edge detection algorithm (such as Canny edge detection) and a texture analysis method (such as Gabor filters). Thus, by dividing the image into small blocks and analyzing the content complexity of each block independently, the variability of the local features of the image can be understood more finely.
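As an illustrative sketch of the segmentation and complexity analysis described above (the function names and the gradient-based complexity proxy are assumptions of this sketch, not the patent's prescribed method):

```python
import numpy as np

def split_into_blocks(image: np.ndarray, block_size: int = 16) -> list:
    """Split a grayscale image into non-overlapping block_size x block_size tiles,
    edge-replicating the borders so every tile has a uniform shape."""
    h, w = image.shape
    padded = np.pad(image, ((0, (-h) % block_size), (0, (-w) % block_size)), mode="edge")
    return [padded[y:y + block_size, x:x + block_size]
            for y in range(0, padded.shape[0], block_size)
            for x in range(0, padded.shape[1], block_size)]

def gradient_complexity(block: np.ndarray) -> float:
    """A simple complexity proxy: mean magnitude of finite-difference gradients."""
    gy, gx = np.gradient(block.astype(float))
    return float(np.mean(np.hypot(gx, gy)))
```

In a fuller implementation, Canny edge maps or Gabor filter responses (as the text suggests) could replace the gradient proxy.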
In step S120, for each image block, a target compression policy matching the content complexity of the image block is determined, and the image block is compressed according to the target compression policy, so as to obtain a corresponding compressed image block.
In some embodiments, the best compression strategy is dynamically selected based on the content complexity of each image block. For example, for higher-complexity image blocks, a lower compression ratio (high-fidelity compression strategy) is selected to preserve more detail; for lower-complexity image blocks, a higher compression ratio (high compression strategy) is selected to reduce the amount of data. Further, after the target compression policy is selected, a corresponding compression algorithm (such as JPEG, WebP, or HEVC) is applied to compress each image block, so as to obtain the corresponding compressed image block.
In this way, by selecting different compression strategies according to content complexity, compression rate of image data is maximized while avoiding loss of important details as much as possible. The details of the high-complexity region are reserved in the compression process, the reconstruction quality of the final image is improved, and the transmitted image is ensured to be capable of retaining enough visual information.
In step S130, a corresponding image encoding mode is generated according to the target compression policy of each image block.
In some embodiments, the encoding mode may include information such as an algorithm type, a compression ratio, a block size, a quantization parameter, and a parameter setting of a compression policy, and the generated image encoding mode is packaged together with the compressed image block, so that the optical communication receiver can correctly identify the compression policy and apply a corresponding decoding reconstruction policy, thereby effectively improving the reconstruction quality of the image.
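For illustration only, the encoding mode might be serialized as a small record travelling with each compressed block; the field names below are assumptions of this sketch:

```python
import json

def build_encoding_mode(block_id: int, strategy: str, **params) -> str:
    """Package the per-block encoding mode (strategy plus its parameters)
    so the receiver can pick the matching decoding/reconstruction path."""
    return json.dumps({"block_id": block_id, "strategy": strategy, "params": params})

def parse_encoding_mode(payload: str) -> dict:
    """Receiver side: recover the encoding mode from the transmitted record."""
    return json.loads(payload)
```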
In step S140, each compressed image block and the corresponding image encoding mode are transmitted to the optical communication receiver, so that the optical communication receiver processes each received compressed image block and the corresponding image encoding mode by invoking a conditional generative adversarial network to generate a reconstructed high-resolution image, where the condition vector of the conditional generative adversarial network is determined according to the image encoding mode.
Here, all the compressed image blocks and image encoding modes are transmitted at high speed through an optical communication channel; the high bandwidth and low delay of optical communication ensure fast and stable transmission of the data. In addition, the CGAN includes a generator that attempts to generate realistic reconstruction results and a discriminator that distinguishes the generated results from the real ones; the discriminator typically functions only in the training phase and plays no role in the inference phase of the network.
It should be noted that a CGAN is a variant of the generative adversarial network (GAN) that introduces additional condition information when generating data. Compared with an ordinary GAN, a CGAN allows the user to specify the specific conditions of the data to be generated, so that it can learn to generate more structured and diverse data under given conditions, enabling finer and more controllable generation tasks.
Through the above embodiment, the CGAN has already learned during training how to recover image details under different compression strategies. Therefore, when processing a compressed image block, the condition vector is defined according to the image encoding mode, and the learned reconstruction strategy for that specific condition vector can effectively eliminate compression artifacts and reconstruct a high-quality image close to the original.
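A minimal sketch of turning an image encoding mode into a CGAN condition vector, assuming a two-entry strategy vocabulary and a normalized quality parameter (both hypothetical choices of this sketch):

```python
import numpy as np

STRATEGIES = ["lossless", "dct"]  # assumed strategy vocabulary

def condition_vector(strategy: str, quality: float, max_quality: float = 100.0) -> np.ndarray:
    """One-hot encode the compression strategy and append the normalized
    quality parameter; the result conditions the CGAN generator."""
    one_hot = np.zeros(len(STRATEGIES))
    one_hot[STRATEGIES.index(strategy)] = 1.0
    return np.concatenate([one_hot, [quality / max_quality]])
```

In training, this vector would be concatenated with the generator's input (and with the discriminator's input), which is the usual way a CGAN injects condition information.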
Fig. 2 illustrates an example operational flow diagram for segmenting a high resolution image into a plurality of image blocks according to an embodiment of the present application.
As shown in fig. 2, in step S210, a high resolution image to be transmitted is divided into a plurality of image units according to a preset pixel size.
In some embodiments, the image is uniformly segmented according to a preset pixel size using a grid-based segmentation algorithm, e.g., each image element is a rectangular region, to ensure that the segmented image elements have uniform sizes. Furthermore, the pixel size of the image unit may be adjusted according to the requirements of the application scenario, for example, in medical images, the unit size may be smaller to preserve more details, while in general applications, the unit size may be slightly larger to reduce the amount of computation.
In step S220, image complexity corresponding to each image unit is calculated.
Here, the corresponding image complexity may be calculated separately for each image unit.
More specifically, the image complexity is calculated by a multi-scale edge density method:
performing Sobel edge detection on the image unit under a plurality of scales to obtain a corresponding multi-scale edge intensity matrix:
Formula (1):

$$I_{\sigma}(x, y) = \sum_{i=-r}^{r}\sum_{j=-r}^{r} w^{(\sigma)}_{i,j}\, I(x+i,\, y+j)$$

Formula (2):

$$E_{\sigma}(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$$

In the formulas, $\sigma$ is the blur scale, which determines the intensity of the Gaussian blur; $I_{\sigma}(x, y)$ is the pixel value at position $(x, y)$ of the image unit after Gaussian blur at scale $\sigma$; $r$ represents the radius of the Gaussian kernel; $w^{(\sigma)}_{i,j}$ is the weight at position $(i, j)$ in the Gaussian kernel matrix at scale $\sigma$; $I(x, y)$ is the pixel value of the image unit at position $(x, y)$; $E_{\sigma}(x, y)$ is the edge strength at position $(x, y)$ at scale $\sigma$; $G_x$ is the gradient of $I_{\sigma}$ in the $x$ direction, calculated using the horizontal convolution kernel of the Sobel operator; and $G_y$ is the gradient of $I_{\sigma}$ in the $y$ direction, calculated using the vertical convolution kernel of the Sobel operator.
Here, firstly, the image is subjected to Gaussian blur processing to smooth noise and details in the image, so that the image can retain details and edge information of different degrees under different scales. The larger the scale, the stronger the degree of blurring and the more details are lost; the smaller the scale, the more details remain. In multi-scale edge detection, different image versions are obtained by changing the scale. Then, by the Sobel edge detection method, the edge intensity of each pixel point in the image block is calculated to identify areas (i.e., edges) in the image where the change is severe. In multi-scale edge detection, images of different blur degrees are processed through the Sobel operator, and gradients of each pixel point in the horizontal direction (x direction) and vertical direction (y direction) are calculated so as to determine the intensity of an edge and capture the edge information in the image.
According to the multi-scale edge intensity matrix, calculating the edge density corresponding to each fuzzy scale:
Formula (3):

$$D_{\sigma} = \frac{1}{W \cdot H}\sum_{x=1}^{W}\sum_{y=1}^{H} E_{\sigma}(x, y)$$

In the formula, $D_{\sigma}$ represents the edge density at scale $\sigma$, and $W$ and $H$ respectively represent the pixel length and pixel width of the image unit;
It should be noted that the edge density is an average value of edge intensities in an image block, which represents the overall complexity of the image block at a certain scale. By calculating the edge density, the complexity of the image block at different scales can be evaluated. If the edge density is higher at a certain scale, it is stated that the image block contains more detail or texture information at that scale.
And carrying out weighted fusion on the edge densities of all scales to obtain multi-scale edge densities corresponding to the image units:
Formula (4):

$$C = \sum_{k=1}^{S} \alpha_k\, D_{\sigma_k}$$

In the formula, $C$ is the final multi-scale edge density, representing the image complexity of the image unit; $S$ is the total number of blur scales; and $\alpha_k$ is the weight of each blur scale.
Here, the edge densities at different scales are weighted and averaged to obtain the final edge complexity evaluation result. Therefore, the complexity of the image block can be more comprehensively evaluated by fusing the information of multiple scales, and not only is the global structure under the large scale considered, but also the detail characteristics under the small scale are focused, so that the selection of the compression strategy is more accurately guided.
According to the embodiment, the image is first smoothed by Gaussian blur; then a Sobel operator is used to detect edges at different scales; next, the edge density is calculated; finally, the information of the different scales is fused to obtain the comprehensive complexity of the image block, so that high-resolution images can be compressed and transmitted efficiently while balancing compression rate and image quality.
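The pipeline of formulas (1)-(4) can be sketched in NumPy as follows; edge padding, a 3σ kernel radius, and uniform scale weights are assumptions of this sketch, not values fixed by the text:

```python
import numpy as np

def gaussian_kernel(sigma: float, radius: int) -> np.ndarray:
    """Normalized 2-D Gaussian kernel of the given blur scale and radius."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def conv2d(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """'Same'-size 2-D convolution with edge padding (no external dependencies)."""
    r = kernel.shape[0] // 2
    padded = np.pad(img, r, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def multiscale_edge_density(block, sigmas=(1.0, 2.0, 4.0), weights=None) -> float:
    """Blur at each scale, take the Sobel edge strength, average it into a
    per-scale edge density, then fuse the densities with scale weights."""
    block = np.asarray(block, dtype=float)
    if weights is None:
        weights = np.full(len(sigmas), 1.0 / len(sigmas))  # uniform fusion weights
    densities = []
    for sigma in sigmas:
        blurred = conv2d(block, gaussian_kernel(sigma, radius=int(3 * sigma)))  # formula (1)
        gx = conv2d(blurred, SOBEL_X)
        gy = conv2d(blurred, SOBEL_Y)
        strength = np.hypot(gx, gy)                                             # formula (2)
        densities.append(strength.mean())                                       # formula (3)
    return float(np.dot(weights, densities))                                    # formula (4)
```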
In step S230, each image unit is clustered according to the image complexity to obtain a plurality of image blocks.
Here, various non-limiting clustering algorithms may be used to perform clustering on the individual image units, such as K-means clustering, hierarchical clustering, DBSCAN clustering, or the like. Taking a K-means clustering algorithm as an example, a clustering center can be initialized through the K-means++ algorithm, so that the convergence speed of clustering and the stability of a result are improved. And further, each image unit is distributed into corresponding clusters according to the complexity value of the image unit to form a plurality of image blocks, each clustering result corresponds to one image block, and the size and the shape of the image blocks are determined according to the result of a clustering algorithm. Therefore, the image units are classified according to the complexity of the image units through a clustering algorithm, the image units with similar contents are effectively aggregated together to form an image block, and different processing strategies can be adopted for areas with different content complexity in the subsequent compression and transmission processes.
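As an illustrative sketch of clustering units by complexity, a minimal one-dimensional K-means (random initialization here for brevity; K-means++ as mentioned above would improve stability):

```python
import numpy as np

def kmeans_1d(values, k=2, iters=50, seed=0):
    """Cluster scalar complexity scores into k groups; returns (labels, centers)."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)  # random init (not k-means++)
    labels = np.zeros(len(values), dtype=int)
    for _ in range(iters):
        # assign each unit to the nearest center, then recompute the centers
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = values[labels == c].mean()
    return labels, centers
```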
According to the embodiment of the application, a multi-scale edge density method is adopted, the image is subjected to edge detection on a plurality of fuzzy scales, and the global structure (such as a large-scale outline and shape) of the image block can be captured by fusing the edge densities on different scales, and local fine details (such as textures and edges) can be accurately identified, so that the complexity of the image block can be comprehensively estimated, and the complexity of the image can be estimated more comprehensively and accurately.
FIG. 3 illustrates an example operational flow diagram for compressing image blocks according to a target compression policy, according to an embodiment of the application.
As shown in fig. 3, in step S310, it is detected whether the content complexity of each image block exceeds a preset complexity threshold.
In some embodiments, a complexity threshold may be preset according to the requirements of the application scene and the characteristics of the image. It should be noted that, the complexity threshold may be a fixed value or may be adjusted according to historical data or dynamic calculation results, and the complexity threshold should be set so as to effectively distinguish between image blocks with high complexity and image blocks with low complexity. Image blocks that exceed the complexity threshold may then be marked as high complexity image blocks, and image blocks that do not exceed the threshold may then be marked as low complexity image blocks.
In step S321, under the condition that the content complexity of the first image block is detected to exceed the preset complexity threshold, determining that the target compression strategy matched with the first image block is a lossless compression strategy, and compressing the first image block according to the lossless compression strategy to obtain a corresponding first compressed image block.
In some embodiments, a suitable lossless compression policy is selected for high complexity image blocks. Lossless compression strategies typically include PNG, JPEG2000, etc. methods that do not lose any image information during compression, and can fully preserve the details of the original image. In addition, the compression parameters can be adjusted according to specific characteristics (such as edge density, texture complexity, etc.) of the image block to optimize the compression effect, for example, in JPEG2000, the wavelet transform progression and quantization step size can be adjusted to achieve higher compression efficiency.
In step S323, if it is detected that the content complexity of the second image block does not exceed the complexity threshold, the target compression policy matched with the second image block is determined to be a lossy compression policy, and the second image block is compressed according to the lossy compression policy, so as to obtain a corresponding second compressed image block.
In some implementations, an appropriate lossy compression strategy is selected for low-complexity image blocks. Lossy compression strategies include JPEG, HEVC, and the like, which discard relatively unimportant image information to significantly reduce the amount of data. In addition, the parameters of the lossy compression strategy are adjusted according to the complexity and content characteristics of the image block; for example, in JPEG compression the quality factor may be adjusted to reduce the data volume as much as possible while preserving the visual quality of the image.
In some examples of embodiments of the present application, the lossy compression strategy is a discrete cosine transform (Discrete Cosine Transform, DCT) compression strategy, and the image coding mode corresponding to the second image block includes a location identifier of the second image block, a quantization matrix, and a quantized DCT coefficient matrix. Here, the identifier uniquely marks the position of the image block in the whole image, so that each image block can be correctly spliced during decoding and the whole image restored. The quantization matrix, which may be a standard quantization matrix, is used to quantize the DCT-transformed coefficients; at the receiving end, the DCT coefficients can be inverse-quantized to help recover an image block close to the original. The quantized DCT coefficients are the primary compressed data, from which the receiver reconstructs the image block through an inverse DCT transform, enabling efficient decoding. In this way, the receiving end can decode and reconstruct the image correctly while the compression processing remains simple and effective.
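The DCT path above can be sketched in pure Python as follows. The 8x8 block size and the JPEG luminance quantization matrix are conventional choices assumed for illustration, not mandated by the embodiment:

```python
import math

N = 8  # assumed block size

# Standard JPEG luminance quantization matrix (one conventional choice).
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def _c(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block."""
    return [[_c(u) * _c(v) * sum(
        block[x][y]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for x in range(N) for y in range(N))
        for v in range(N)] for u in range(N)]

def idct2(coef):
    """Inverse of dct2 (orthonormal 2-D DCT-III)."""
    return [[sum(
        _c(u) * _c(v) * coef[u][v]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for u in range(N) for v in range(N))
        for y in range(N)] for x in range(N)]

def compress_block(block):
    """Quantized DCT coefficients: the primary data of the lossy payload."""
    coef = dct2(block)
    return [[round(coef[u][v] / Q[u][v]) for v in range(N)] for u in range(N)]

def decompress_block(qcoef):
    """Receiver side: inverse-quantize, then inverse DCT."""
    coef = [[qcoef[u][v] * Q[u][v] for v in range(N)] for u in range(N)]
    return idct2(coef)

# A smooth (low-complexity) gradient block survives quantization well.
block = [[100 + 4 * x + 2 * y for y in range(N)] for x in range(N)]
restored = decompress_block(compress_block(block))
max_err = max(abs(block[x][y] - restored[x][y])
              for x in range(N) for y in range(N))
```

For smooth, low-complexity blocks the reconstruction error stays within a few gray levels, which is exactly why the embodiment routes such blocks to the lossy path.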
By the embodiment, the application of the intelligent and differential compression strategy to the high-resolution image is realized. And a lossless compression strategy is used in a region with higher complexity to ensure the reservation of key details, and a lossy compression strategy is used in a region with lower complexity to effectively reduce the data volume. Therefore, the method not only optimizes the image quality and the transmission efficiency, but also can better adapt to the requirements of different application scenes, and provides powerful support for efficient transmission and storage of high-resolution images.
In some examples of embodiments of the application, the lossless compression policy is a context adaptive entropy compression policy.
FIG. 4 illustrates an operational flow diagram of an example of compressing image blocks by a context adaptive entropy compression policy, according to an embodiment of the application.
As shown in fig. 4, in step S410, a context range matched to the image complexity of the first image block is determined, where the size of the context range is positively correlated with the image complexity.
Here, an adaptive context modeling mechanism is employed. Specifically, the system dynamically adjusts the range and weights of the context based on the actual content characteristics of the image block (i.e., its image complexity). High-complexity regions use a wider context for modeling, while smooth regions use a narrower context to reduce computational complexity.
In step S420, for each target pixel in the first image block, neighboring pixels of the target pixel within the context are extracted, and a weighted average of the neighboring pixels is calculated as a predicted pixel value of the target pixel.
More specifically, the predicted value of each pixel is calculated by the following formula:

$\hat{x}(i,j) = \sum_{(m,n) \in C(i,j)} w(m,n)\, x(m,n)$   formula (5);

In the formula, $\hat{x}(i,j)$ represents the predicted value of the pixel at position $(i,j)$; $x(m,n)$ represents the pixel value at position $(m,n)$ in the first image block; $w(m,n)$ represents the context pixel weight of the pixel at position $(m,n)$ in the context, reflecting that pixel's contribution to the prediction; $C(i,j)$ represents the set of all pixels in the context of the pixel at position $(i,j)$.

Here, the predicted value $\hat{x}(i,j)$ of the current pixel is calculated as a weighted average of the pixel values in its context, with each context pixel $(m,n)$ exerting its influence through the weight $w(m,n)$: the larger the weight, the greater the influence on the predicted value of the current pixel. Furthermore, the weights $w(m,n)$ may be adaptively adjusted according to the spatial distance or content correlation of the neighboring pixels to improve prediction accuracy.
In step S430, a prediction error between the actual pixel value and the predicted pixel value of each target pixel is calculated.
More specifically, the prediction error is calculated by:

$e(i,j) = x(i,j) - \hat{x}(i,j)$   formula (6);

In the formula, $e(i,j)$ represents the prediction error of the pixel at position $(i,j)$; $x(i,j)$ represents the actual value of the pixel at position $(i,j)$ in the first image block; $\hat{x}(i,j)$ represents the predicted value of that pixel.

Here, the difference between the actual pixel value $x(i,j)$ and the predicted value $\hat{x}(i,j)$, i.e. the prediction error $e(i,j)$, represents the deviation between the predicted and actual values when the current pixel is predicted from its context.
In step S440, each target pixel in the first image block is entropy encoded based on the probability distribution of the prediction error.
$c(i,j) = \mathrm{EntropyEncode}\big(e(i,j),\, P(e)\big)$   formula (7);

$B_1 = \{\, c(i,j) \mid (i,j) \in B \,\}$   formula (8);

In the formula, $c(i,j)$ is the entropy-coded value of the pixel at position $(i,j)$; $P(e)$ represents the probability distribution of the prediction errors $e$; $B$ represents the first image block and $B_1$ represents the first compressed image block; $\mathrm{EntropyEncode}(\cdot)$ assigns each error a code whose length is calculated from the probability distribution.

Here, the prediction error $e(i,j)$ is mapped to its entropy-coded value $c(i,j)$. Entropy coding is based on the probability distribution $P(e)$ of the errors: the higher the probability of an error value, the shorter the corresponding code length, thereby reducing data redundancy.

Although $c(i,j)$ is the code for a single pixel position $(i,j)$, the pixel information in the context $C(i,j)$ is used to predict the pixel value and thus indirectly shapes the probability distribution of the prediction error. So, although entropy encoding operates on individual pixels, the coding efficiency is closely tied to the information of the entire context region.
In embodiments of the present application, a larger context range is employed for high-complexity image blocks to capture more neighboring pixel information and thus predict the current pixel more accurately. The pixel values within the context range are extracted, and their weighted average is used as the prediction for the current pixel. Entropy coding is then performed based on the probability distribution of the errors between actual and predicted pixel values. In this way, the shortest codes are assigned to the most frequent errors, achieving efficient lossless compression.
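The predict / error / entropy chain can be sketched as follows. The causal three-neighbour context and its fixed weights are illustrative assumptions only; the embodiment leaves both the context range and the weights adaptive:

```python
import math
from collections import Counter

def predict(block, i, j, weights=(0.6, 0.3, 0.1)):
    """Weighted average of causal neighbours: left, top, top-left.

    The three weights are an assumed fixed context; the described
    scheme adapts both the range and the weights to block complexity.
    """
    wl, wt, wtl = weights
    neighbours = []
    if j > 0:
        neighbours.append((wl, block[i][j - 1]))
    if i > 0:
        neighbours.append((wt, block[i - 1][j]))
    if i > 0 and j > 0:
        neighbours.append((wtl, block[i - 1][j - 1]))
    if not neighbours:  # the first pixel has no causal context
        return block[i][j]
    total = sum(w for w, _ in neighbours)
    return sum(w * v for w, v in neighbours) / total

def entropy_bits(block):
    """Estimated code length: sum of -log2 p(e) over prediction errors."""
    h, w = len(block), len(block[0])
    errors = [round(block[i][j] - predict(block, i, j))
              for i in range(h) for j in range(w)]
    counts = Counter(errors)
    n = len(errors)
    return sum(-math.log2(counts[e] / n) for e in errors)

# A smooth ramp: errors cluster around a few values, so the entropy
# estimate lands well below the raw 8 bits per pixel.
block = [[i * 3 + j * 2 for j in range(8)] for i in range(8)]
bits = entropy_bits(block)
```

Because the errors take only a handful of values, their empirical distribution is sharply peaked and the estimated code length is a small fraction of the uncompressed size, which is the effect the text describes.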
According to the embodiment of the application, the context range is dynamically adjusted by utilizing the context adaptive entropy coding, and each pixel is predicted and coded according to the complexity of the image block. Because the prediction error is smaller and concentrated, the entropy coding can generate shorter codes, thereby obviously improving the compression rate and effectively reducing the transmitted data quantity. Context adaptive entropy coding can more effectively capture and exploit spatial correlation inside image blocks than conventional lossless compression methods, thereby reducing data redundancy. In addition, the context adaptive entropy coding maintains the characteristics of lossless compression, ensuring that no image information is lost during compression and decompression. For image blocks with high complexity (such as areas with abundant details of edges, textures and the like), all original details can be reserved, high quality of images can be ensured, and common problems of compression artifacts and the like are avoided.
In some examples of embodiments of the present application, the image coding mode transmitted in one pass together with the compressed data packet of a high-complexity image block further comprises the location identifier of the first image block, the context range, the probability distribution of the prediction errors, and the context pixel weights.
Here, the location identifier of the image block uniquely identifies the position of the image block in the entire image. Since the image is divided into multiple blocks for processing during compression, the identifier allows each block to be accurately positioned and ensures that every image block can be correctly spliced back into the original image during decoding. The context range information describes, for each pixel, the size of the neighborhood used for prediction during compression; it determines how many neighboring pixels are referenced and how they are distributed, so that the receiving end can exactly reproduce the prediction process used in encoding and thus accurately reconstruct the original value of each pixel. The probability distribution of the prediction errors describes their statistical distribution within the image block and guides the entropy coding process: common error values are assigned shorter code lengths and rare error values longer ones, which improves decoding efficiency and compression ratio while preserving the lossless character of the compression. The context pixel weights describe the influence of each context pixel on the current pixel during prediction; the larger the weight, the greater that pixel's contribution to the predicted value. Transmitting the weights allows the decoder to reproduce the context prediction used in encoding exactly, so that the decoded image block is identical to the original.
Fig. 5 shows a schematic diagram of the structural connections of an example of a generator CGAN according to an embodiment of the application.
As shown in fig. 5, the generator 500 of the condition generation countermeasure network includes an image decoding layer 510, a condition feature embedding module 520, a condition feature fusion module 530, a multi-scale feature extraction module 540, and an output layer 550.
The image decoding layer 510 is configured to receive each compressed image block and a corresponding image coding mode, and combine each compressed image block to obtain a corresponding image block feature matrix.
In some embodiments, the image decoding layer first decodes the compressed image blocks and then reassembles the decoded blocks according to their image encoding modes, forming an image block feature matrix that preserves both the compressed image information and its spatial structure.
The conditional feature embedding module 520 is configured to embed each image encoding mode into the high-dimensional feature space to determine a corresponding conditional feature matrix.
In some embodiments, the encoding mode is taken as input and passed through several fully connected layers or other embedding operations, finally generating a conditional feature matrix with the same spatial dimensions as the image block feature matrix, aligned with it for subsequent fusion.
The conditional feature fusion module 530 is configured to fuse the image block feature matrix and the conditional feature matrix to obtain a corresponding target fusion feature matrix.
Here, the conditional feature fusion module enables the generator to better utilize the content information and the coding mode of the image by fusing information of different sources. Specifically, through fusion methods such as bilinear pooling, the conditional feature fusion module carries out deep interaction on two input matrixes to generate a fusion feature matrix with stronger expressive force.
The multi-scale feature extraction module 540 is configured to extract, in parallel, multi-scale convolution features corresponding to the target fusion feature matrix through a plurality of convolution kernels, where each convolution kernel has a corresponding convolution scale.
Here, through the multi-scale feature extraction module, the generator is enabled to capture global structure and local details of the image. Therefore, a plurality of convolution kernels with different scales are simultaneously applied to the target fusion feature matrix, each convolution kernel extracts a corresponding scale feature image, feature information of the image under different resolutions is captured, and different layers of information of the image can be reserved.
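The parallel multi-scale extraction described above can be sketched with plain-Python convolutions; the 1x1 / 3x3 / 5x5 averaging kernels are illustrative stand-ins for learned convolution kernels:

```python
def conv2d_same(img, kernel):
    """'Same'-padded 2-D convolution of a single-channel image (list of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:  # zero padding at borders
                        s += kernel[a][b] * img[y][x]
            out[i][j] = s
    return out

def multi_scale_features(img, kernel_sizes=(1, 3, 5)):
    """One feature map per scale, all computed on the same input in parallel."""
    feats = []
    for k in kernel_sizes:
        kern = [[1.0 / (k * k)] * k for _ in range(k)]  # averaging kernel
        feats.append(conv2d_same(img, kern))
    return feats

img = [[float((i + j) % 7) for j in range(6)] for i in range(6)]
feats = multi_scale_features(img)
```

Each map in `feats` captures the input at one receptive-field size: the 1x1 map keeps local detail, while the 5x5 map smooths toward global structure, mirroring the local-vs-global trade-off the module exploits.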
The output layer 550 is used to process the multi-scale convolution features through a deconvolution operation to generate a reconstructed high resolution image.
Specifically, the previously extracted multi-scale convolution features are gradually restored to the resolution of the original image through transposed convolution (deconvolution) operations, and the reconstructed high-resolution image is finally output, ensuring the detail and quality of the reconstructed image and restoring the input original image to the greatest extent.
Through the embodiment of the application, the generator can better keep the spatial structure information of the image through the preliminary decoding and combination of the image decoding layer. By combining the conditional feature embedding and fusing module, the generator can fully utilize the content information and the coding mode of the image, so that the reconstructed high-resolution image is more similar to the original image in detail and global structure.
In some examples of embodiments of the present application, the conditional feature fusion module 530 employs a multimodal feature fusion mechanism. The multi-mode feature fusion can deeply mine the relation between the compressed image block and the conditional feature vector, and the interaction between different information sources is richer by introducing a multi-mode fusion network (such as a bilinear pooling layer), so that the accuracy of feature expression is improved, the restoration capability of a generator on complex image content in the reconstruction process can be remarkably improved, and the restoration capability is particularly outstanding when processing image data with high diversity and complexity.
Specifically, the conditional feature fusion module 530 may determine the fusion features by:
Based on a bilinear pooling method, the conditional feature matrix $Y$ and the image block feature matrix $X$ are fused to obtain an initial fusion feature matrix $F$.

In some embodiments, the image block feature matrix $X$ is output by the image decoding layer and has shape $(H, W, C_x)$, where $H$ and $W$ are the height and width of the image block and $C_x$ is the number of feature channels. The conditional feature matrix $Y$ is generated by the conditional feature embedding module and has shape $(H, W, C_y)$, where $H$ and $W$ are likewise the height and width of the image block and $C_y$ is the number of channels of the conditional features.

Further, the image block feature matrix is fused with the conditional feature matrix by bilinear pooling:

$F(i,j,k) = \sum_{c=1}^{C_y} X(i,j,k)\, Y(i,j,c)$   formula (9);

In the formula, $F(i,j,k)$ represents the feature value of the initial fusion feature matrix $F$ at position $(i,j)$ and channel $k$; $X(i,j,k)$ represents the feature value of the image block feature matrix at position $(i,j)$ and channel $k$; $Y(i,j,c)$ represents the feature value of the expanded conditional feature matrix at position $(i,j)$ and channel $c$; $c$ is the channel index of $Y$ and $C_y$ is its total number of channels.
It should be noted that the bilinear pooling can capture higher-order interaction information between the two matrices, thereby generating an initial fusion feature matrix with better expressive force, and enabling the generator to better recover image details and structures when generating a high-resolution image.
The initial fusion feature matrix $F$ is normalized to obtain the target fusion feature matrix $\tilde{F}$:

$\tilde{F}(i,j,k) = \dfrac{F(i,j,k)}{\sqrt{\sum_{i,j,k} F(i,j,k)^2} + \epsilon}$   formula (10);

In the formula, $\tilde{F}(i,j,k)$ represents the feature value of the target fusion feature matrix $\tilde{F}$ at position $(i,j)$ and channel $k$; $\epsilon$ represents a preset constant; $\sum_{i,j,k} F(i,j,k)^2$ represents the sum of squares of all elements of the initial fusion feature matrix, used to calculate the normalization factor; $i$ and $j$ respectively represent the transverse and longitudinal spatial-dimension indices of the initial fusion feature matrix.
Therefore, by normalizing the initial fusion feature matrix, the values of the fused features are kept within a suitable range, which ensures the stability of the fused features in subsequent processing, facilitates further feature extraction and image generation, and improves the overall computational efficiency of the generator.
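The fusion and normalization steps can be sketched directly from the two formulas; the toy shapes and values below are illustrative only. Note that the per-position sum over the conditional channels factors out, so the inner sum reduces to `sum(Y[i][j])`:

```python
import math

def bilinear_fuse(X, Y):
    """F(i,j,k) = sum_c X(i,j,k) * Y(i,j,c)  (X does not depend on c,
    so the sum over the conditional channels factors out)."""
    H, W, Cx = len(X), len(X[0]), len(X[0][0])
    return [[[X[i][j][k] * sum(Y[i][j]) for k in range(Cx)]
             for j in range(W)] for i in range(H)]

def normalize(F, eps=1e-8):
    """Divide every element by the global L2 norm plus a preset constant."""
    sq = sum(v * v for plane in F for row in plane for v in row)
    denom = math.sqrt(sq) + eps
    return [[[v / denom for v in row] for row in plane] for plane in F]

# Toy shapes: a 2x2 block with 3 image-feature channels (X) fused with
# 2 conditional channels (Y); names and sizes are assumptions.
X = [[[1.0, 2.0, 3.0], [0.5, 1.0, 1.5]],
     [[2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]]
Y = [[[0.5, 0.5], [1.0, 0.5]],
     [[0.2, 0.8], [0.3, 0.9]]]
F = normalize(bilinear_fuse(X, Y))
```

After normalization the fused matrix has (near-)unit global L2 norm, which is the "suitable range" property the text attributes to this step.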
According to the embodiment of the application, the multi-mode feature fusion mechanism realizes the depth interaction between the compressed image block features and the condition features through the advanced feature fusion method such as bilinear pooling, captures the complex relationship between the image content and the coding condition, and enables the generator to reproduce the details and the structure of the high-resolution original image more accurately. In addition, through a feature fusion mechanism, information in the condition vector is fully utilized, and the generator can more accurately recover the edge and detail part of the image in the image reconstruction process, so that the artifact and detail loss phenomenon is effectively reduced, and the output image is clearer and more natural.
It should be noted that, for simplicity of description, the foregoing method embodiments are all illustrated as a series of acts combined, but it should be understood and appreciated by those skilled in the art that the present application is not limited by the order of acts, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application. In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
Fig. 6 shows a block diagram of an example of an optical communication-based high-resolution image transmission system that may be deployed in an optical communication transmitter of an optical communication network, in accordance with an embodiment of the present application.
As shown in fig. 6, the high-resolution image transmission system 600 based on optical communication includes an image segmentation unit 610, a policy matching unit 620, an encoding pattern generation unit 630, and a data transmission unit 640.
The image segmentation unit 610 is configured to segment a high resolution image to be transmitted into a plurality of image blocks, where each image block has a corresponding content complexity.
The policy matching unit 620 is configured to determine, for each of the image blocks, a target compression policy that matches the content complexity of the image block, and compress the image block according to the target compression policy, so as to obtain a corresponding compressed image block.
The coding mode generating unit 630 is configured to generate a corresponding image coding mode according to the target compression policy of each image block.
The data transmission unit 640 is configured to transmit each of the compressed image blocks and the corresponding image encoding modes to an optical communication receiver, so that the optical communication receiver processes each of the received compressed image blocks and corresponding image encoding modes by calling a condition generation countermeasure network, which is determined according to the image encoding modes, to generate a reconstructed high resolution image.
In some embodiments, embodiments of the present application provide a non-transitory computer-readable storage medium having stored therein one or more programs including execution instructions that are readable and executable by an electronic device (including, but not limited to, a computer, a server, or a network device, etc.) for performing the steps of any of the above-described optical communication-based high-resolution image transmission methods of the present application.
In some embodiments, embodiments of the present application also provide a computer program product comprising a computer program stored on a non-volatile computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the steps of any of the above-described optical communication-based high resolution image transmission methods.
In some embodiments, the present application also provides an electronic device comprising at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the optical communication-based high resolution image transmission method.
Fig. 7 is a schematic hardware structure of an electronic device for performing a high-resolution image transmission method based on optical communication according to another embodiment of the present application, as shown in fig. 7, the device includes:
One or more processors 710, and a memory 720, one processor 710 being illustrated in fig. 7.
The apparatus for performing the optical communication-based high resolution image transmission method may further include an input device 730 and an output device 740.
Processor 710, memory 720, input device 730, and output device 740 may be connected by a bus or other means, for example in fig. 7.
The memory 720 is used as a non-volatile computer readable storage medium for storing a non-volatile software program, a non-volatile computer executable program, and modules, such as program instructions/modules corresponding to the optical communication-based high-resolution image transmission method in the embodiment of the present application. The processor 710 executes various functional applications of the server and data processing, i.e., implements the high-resolution image transmission method based on optical communication of the above-described method embodiment, by running nonvolatile software programs, instructions, and modules stored in the memory 720.
The memory 720 may include a storage program area that may store an operating system, application programs required for at least one function, and a storage data area that may store data created according to the use of the electronic device, etc. In addition, memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 720 may optionally include memory located remotely from processor 710, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 730 may receive input digital or character information and generate signals related to user settings and function control of the electronic device. The output device 740 may include a display device such as a display screen.
The one or more modules are stored in the memory 720 that, when executed by the one or more processors 710, perform the high-resolution image transmission method based on optical communication in any of the method embodiments described above.
The product can execute the method provided by the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. Technical details not described in detail in this embodiment may be found in the methods provided in the embodiments of the present application.
The electronic device of the embodiments of the present application exists in a variety of forms including, but not limited to:
(1) Mobile communication devices, which are characterized by mobile communication functionality and are aimed at providing voice, data communication. Such terminals include smart phones, multimedia phones, functional phones, low-end phones, and the like.
(2) Ultra mobile personal computer equipment, which belongs to the category of personal computers, has the functions of calculation and processing and generally has the characteristic of mobile internet surfing. Such terminals include PDA, MID, and UMPC devices, etc.
(3) Portable entertainment devices such devices can display and play multimedia content. The device comprises an audio player, a video player, a palm game machine, an electronic book, an intelligent toy and a portable vehicle navigation device.
(4) Other on-board electronic devices with data interaction functions, such as on-board devices mounted on vehicles.
The apparatus embodiments described above are merely illustrative: components described as separate units may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a general purpose hardware platform, or may be implemented by hardware. Based on such understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the related art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the respective embodiments or some parts of the embodiments.
It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present application, and not for limiting the same, and although the present application has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present application.
Claims (6)
1. A high resolution image transmission method based on optical communication, applied to an optical communication transmitter, comprising:
Dividing a high-resolution image to be transmitted into a plurality of image blocks, wherein each image block has corresponding content complexity;
determining a target compression strategy matched with the content complexity of the image blocks for each image block, and compressing the image blocks according to the target compression strategy to obtain corresponding compressed image blocks;
generating a corresponding image coding mode according to the target compression strategy of each image block;
Transmitting each of the compressed image blocks and the corresponding image encoding mode to an optical communication receiver such that the optical communication receiver processes each of the received compressed image blocks and corresponding image encoding modes by invoking a condition generation countermeasure network to generate a reconstructed high resolution image;
the dividing the high resolution image to be transmitted into a plurality of image blocks includes:
Dividing a high-resolution image to be transmitted into a plurality of image units according to a preset pixel size;
calculating the image complexity corresponding to each image unit;
clustering each image unit according to the image complexity to obtain a plurality of corresponding image blocks;
Wherein the image complexity is calculated by a multi-scale edge density method:
performing Sobel edge detection on the image unit under a plurality of scales to obtain a corresponding multi-scale edge intensity matrix:

$I_\sigma(x,y) = \sum_{u=-r}^{r} \sum_{v=-r}^{r} G_\sigma(u,v)\, I(x+u,\, y+v)$

$E_\sigma(x,y) = \sqrt{S_x(x,y)^2 + S_y(x,y)^2}$

In the formula, $\sigma$ is the blur scale, which determines the intensity of the Gaussian blur; $I_\sigma(x,y)$ is the pixel value at position $(x,y)$ of the image unit after Gaussian blur processing at scale $\sigma$; $r$ represents the radius of the Gaussian kernel; $G_\sigma(u,v)$ is the weight of the Gaussian kernel function at position $(u,v)$ in the Gaussian kernel matrix at scale $\sigma$; $I(x+u, y+v)$ is the pixel value of the image unit at position $(x+u, y+v)$; $E_\sigma(x,y)$ is the edge strength at position $(x,y)$ under scale $\sigma$; $S_x(x,y)$ is the gradient of $I_\sigma$ in the $x$ direction, calculated using the horizontal convolution kernel of the Sobel operator; $S_y(x,y)$ is the gradient in the $y$ direction, calculated using the vertical convolution kernel of the Sobel operator;

according to the multi-scale edge intensity matrix, calculating the edge density corresponding to each blur scale:

$D_\sigma = \dfrac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} E_\sigma(x,y)$

In the formula, $D_\sigma$ represents the edge density under scale $\sigma$, and $W$ and $H$ respectively represent the pixel length and pixel width of the image unit;

and carrying out weighted fusion on the edge densities of all scales to obtain the multi-scale edge density corresponding to the image unit:

$D = \sum_{s=1}^{S} \alpha_s D_{\sigma_s}$

where $D$ is the final multi-scale edge density, representing the image complexity of the image unit; $S$ is the total number of blur scales, and $\alpha_s$ is the weight of each blur scale;
The determining a target compression strategy matched with the content complexity of the image block, and compressing the image block according to the target compression strategy to obtain a corresponding compressed image block, including:
Under the condition that the content complexity of the first image block exceeds a preset complexity threshold value, determining a target compression strategy matched with the first image block as a lossless compression strategy, and compressing the first image block according to the lossless compression strategy to obtain a corresponding first compressed image block;
Under the condition that the complexity of the content of the second image block is detected not to exceed the complexity threshold, determining that a target compression strategy matched with the second image block is a lossy compression strategy, and compressing the second image block according to the lossy compression strategy to obtain a corresponding second compressed image block;
The lossless compression strategy is a context adaptive entropy compression strategy;
The compressing the first image block according to the lossless compression policy to obtain a corresponding first compressed image block includes:
determining a context range matched with the image complexity of the first image block, wherein the size of the context range and the image complexity are in positive correlation;
for each target pixel in the first image block, extracting neighboring pixels of the target pixel within a context range, and taking a weighted average of the neighboring pixels as a predicted pixel value of the target pixel by calculating:
$$\hat{I}(x,y)=\sum_{(i,j)\in C(x,y)}\lambda_{i,j}\,I_1(i,j)$$
In the formula, $\hat{I}(x,y)$ represents the predicted value of the pixel at position $(x,y)$; $I_1(i,j)$ represents the pixel value at position $(i,j)$ in the first image block $I_1$; $\lambda_{i,j}$ represents the context pixel weight of the pixel at position $(i,j)$ in the context, reflecting the contribution of that pixel to the prediction; $C(x,y)$ represents the set of all pixels in the context of the pixel at position $(x,y)$;
calculating a prediction error between an actual pixel value and a predicted pixel value of each target pixel:
$$e(x,y)=I_1(x,y)-\hat{I}(x,y)$$
In the formula, $e(x,y)$ represents the prediction error of the pixel at position $(x,y)$; $I_1(x,y)$ represents the actual value of the pixel at position $(x,y)$ in the first image block; $\hat{I}(x,y)$ represents the predicted value of the pixel at position $(x,y)$;
Entropy encoding each target pixel in the first image block based on a probability distribution of prediction errors:
$$B(x,y)=-\log_2 P\big(e(x,y)\big),\qquad I_1'=\{B(x,y)\}$$
In the formula, $B(x,y)$ is the entropy coding value of the pixel at position $(x,y)$; $P(e(x,y))$ represents the probability distribution of $e(x,y)$; $I_1$ represents the image block and $I_1'$ represents the first compressed image block; $-\log_2 P(e(x,y))$ represents the entropy coding length calculated from the probability distribution.
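The context-adaptive entropy step can be illustrated as follows, under two stated assumptions: the context set C(x, y) is taken as a causal neighborhood of already-decoded pixels, and the weights are uniform (λ = 1/|C|). The claims leave both the context shape and the weights open.

```python
import numpy as np
from collections import Counter

def predict_and_residuals(block, context_radius=1):
    """e(x,y) = I(x,y) - sum_{(i,j) in C(x,y)} lambda_ij * I(i,j).
    Assumption: C(x,y) holds only pixels preceding (x,y) in raster order."""
    H, W = block.shape
    residuals = np.zeros_like(block, dtype=float)
    for x in range(H):
        for y in range(W):
            ctx = []
            for i in range(max(0, x - context_radius), x + 1):
                for j in range(max(0, y - context_radius),
                               min(W, y + context_radius + 1)):
                    if (i, j) < (x, y):          # causal: already decoded
                        ctx.append(block[i, j])
            pred = np.mean(ctx) if ctx else 0.0  # uniform lambda_ij = 1/|C|
            residuals[x, y] = block[x, y] - pred
    return residuals

def entropy_code_lengths(residuals):
    """B = -log2 P(e) for each residual symbol, with P taken as the
    empirical distribution of the rounded residuals."""
    symbols = np.round(residuals).astype(int).ravel()
    counts = Counter(symbols)
    total = len(symbols)
    return {s: -np.log2(c / total) for s, c in counts.items()}
```

On smooth content the residuals concentrate near zero, so the frequent symbol 0 receives a short code length, which is the source of the lossless compression gain.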
2. The method of claim 1, wherein the image coding mode corresponding to the first image block comprises a position identifier of the first image block, the context range, the probability distribution of the prediction errors, and the context pixel weights.
3. The method of claim 1, wherein the lossy compression strategy is a discrete cosine transform compression strategy, and the image coding mode corresponding to the second image block comprises a position identifier of the second image block, a quantization matrix, and a quantized discrete cosine transform coefficient matrix.
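The discrete cosine transform strategy of claim 3 can be sketched as follows. The orthonormal DCT-II construction is standard; the uniform quantization step `q_step` is an illustrative assumption, since the claim only requires that a quantization matrix and the quantized coefficient matrix be carried in the coding mode.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so coefficients = C @ block @ C.T."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] *= 1.0 / np.sqrt(2)                  # scale the DC row
    return c * np.sqrt(2.0 / n)

def lossy_compress(block, q_step=10.0):
    """Return the quantized DCT coefficient matrix (the data that would be
    entropy-coded and transmitted for a low-complexity block)."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T                     # 2-D DCT
    return np.round(coeffs / q_step)             # uniform quantization

def lossy_decompress(q_coeffs, q_step=10.0):
    """Dequantize and invert the DCT at the receiver."""
    C = dct_matrix(q_coeffs.shape[0])
    return C.T @ (q_coeffs * q_step) @ C
```

Because the basis is orthonormal, the per-pixel reconstruction error is bounded by the quantization error of the coefficients, which is why this path is reserved for blocks below the complexity threshold.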
4. A method according to any of claims 1-3, wherein the generator of the conditional generative adversarial network comprises an image decoding layer, a conditional feature embedding module, a conditional feature fusion module, a multi-scale feature extraction module, and an output layer;
The image decoding layer is used for receiving each compressed image block and a corresponding image coding mode and combining each compressed image block to obtain a corresponding image block characteristic matrix;
The conditional feature embedding module is used for embedding each image coding mode into a high-dimensional feature space so as to determine a corresponding conditional feature matrix;
the conditional feature fusion module is used for fusing the image block feature matrix and the conditional feature matrix to obtain a corresponding target fusion feature matrix;
The multi-scale feature extraction module is used for extracting multi-scale convolution features corresponding to the target fusion feature matrix in parallel through a plurality of convolution kernels, wherein each convolution kernel has a corresponding convolution scale;
The output layer is configured to process the multi-scale convolution features through a deconvolution operation to generate a reconstructed high resolution image.
5. The method of claim 4, wherein the conditional feature fusion module employs a multimodal feature fusion mechanism and is configured to determine fusion features by:
Fusing, based on a bilinear pooling method, the conditional feature matrix $T$ and the image block feature matrix $B$ to obtain an initial fusion feature matrix $F$:
$$F(x,y)=\sum_{c=1}^{C}B(x,y,c)\,\tilde{T}(x,y,c)$$
In the formula, $F(x,y)$ represents the feature value at position $(x,y)$ of the initial fusion feature matrix $F$; $B(x,y,c)$ represents the feature value at position $(x,y)$ and channel $c$ of the image block feature matrix; $\tilde{T}(x,y,c)$ represents the feature value at position $(x,y)$ and channel $c$ of the expanded conditional feature matrix $\tilde{T}$; $c$ is the channel index of $B$, and $C$ is the total number of channels of $B$;
Performing normalization processing on the initial fusion feature matrix to obtain a target fusion feature matrix $\tilde{F}$:
$$\tilde{F}(x,y)=\frac{F(x,y)}{\sqrt{\sum_{x'}\sum_{y'}F(x',y')^2+\epsilon}}$$
In the formula, $\tilde{F}(x,y)$ represents the feature value at position $(x,y)$ of the target fusion feature matrix $\tilde{F}$; $\epsilon$ represents a preset constant; $\sum_{x'}\sum_{y'}F(x',y')^2$ represents the sum of squares of all elements in the initial fusion feature matrix, used to calculate the normalization factor; $x'$ and $y'$ respectively represent the horizontal and vertical spatial dimension indices of the initial fusion feature matrix.
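A minimal sketch of the fusion mechanism of claim 5, assuming the conditional features have already been expanded to the spatial shape of the image-block features; the tensor shapes and the constant `eps` are illustrative assumptions.

```python
import numpy as np

def fuse_features(B, T, eps=1e-8):
    """F(x,y) = sum_c B(x,y,c) * T(x,y,c), then F~ = F / sqrt(sum F^2 + eps).
    B: image-block feature matrix (H, W, C); T: expanded condition matrix (H, W, C)."""
    assert B.shape == T.shape, "condition features must be expanded to B's shape"
    F = np.sum(B * T, axis=-1)          # channel-wise bilinear pooling
    norm = np.sqrt(np.sum(F**2) + eps)  # global normalization factor
    return F / norm
```

The global L2 normalization puts every fused map on a common scale before the multi-scale convolution stage, regardless of the magnitude of the incoming features.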
6. A high resolution image transmission system based on optical communication deployed at an optical communication transmitter, the system comprising:
The image segmentation unit is used for segmenting the high-resolution image to be transmitted into a plurality of image blocks, wherein each image block has corresponding content complexity;
The strategy matching unit is used for determining a target compression strategy matched with the content complexity of the image blocks aiming at each image block, and compressing the image blocks according to the target compression strategy to obtain corresponding compressed image blocks;
The coding mode generating unit is used for generating a corresponding image coding mode according to the target compression strategy of each image block;
A data transmission unit for transmitting each of the compressed image blocks and the corresponding image encoding modes to an optical communication receiver, so that the optical communication receiver processes each of the received compressed image blocks and the corresponding image encoding modes by calling a conditional generative adversarial network to generate a reconstructed high-resolution image;
the dividing the high resolution image to be transmitted into a plurality of image blocks includes:
Dividing a high-resolution image to be transmitted into a plurality of image units according to a preset pixel size;
calculating the image complexity corresponding to each image unit;
clustering each image unit according to the image complexity to obtain a plurality of corresponding image blocks;
Wherein the image complexity is calculated by a multi-scale edge density method:
performing Sobel edge detection on the image unit under a plurality of scales to obtain a corresponding multi-scale edge intensity matrix:
$$I_{\sigma_s}(x,y)=\sum_{m=-r}^{r}\sum_{n=-r}^{r}G_{\sigma_s}(m,n)\,I(x+m,\,y+n),\qquad E_{\sigma_s}(x,y)=\sqrt{G_x^2+G_y^2}$$
In the formula, $\sigma_s$ is the blur scale, which determines the intensity of the Gaussian blur; $I_{\sigma_s}(x,y)$ is the pixel value at position $(x,y)$ in the image unit after Gaussian blur processing at scale $\sigma_s$; $r$ represents the radius of the Gaussian kernel; $G_{\sigma_s}(m,n)$ is the weight of the Gaussian kernel function at scale $\sigma_s$, representing the weight value at position $(m,n)$ in the Gaussian kernel matrix; $I(x,y)$ is the pixel value at position $(x,y)$ of the image unit; $E_{\sigma_s}(x,y)$ is the edge strength at position $(x,y)$ under scale $\sigma_s$; $G_x$ is the gradient of $I_{\sigma_s}$ at $(x,y)$ in the horizontal direction, calculated using the horizontal convolution kernel of the Sobel operator; $G_y$ is the gradient of $I_{\sigma_s}$ at $(x,y)$ in the vertical direction, calculated using the vertical convolution kernel of the Sobel operator;
According to the multi-scale edge intensity matrix, calculating the edge density corresponding to each fuzzy scale:
$$D_{\sigma_s}=\frac{1}{W\,H}\sum_{x=1}^{W}\sum_{y=1}^{H}E_{\sigma_s}(x,y)$$
In the formula, $D_{\sigma_s}$ represents the edge density at scale $\sigma_s$; $W$ and $H$ respectively represent the pixel length and the pixel width of the image unit;
And carrying out weighted fusion on the edge densities of all scales to obtain multi-scale edge densities corresponding to the image units:
$$D=\sum_{s=1}^{S}w_s\,D_{\sigma_s}$$
In the formula, $D$ is the final multi-scale edge density, representing the image complexity of the image unit; $S$ is the total number of blur scales; $w_s$ is the weight of each blur scale;
The determining a target compression strategy matched with the content complexity of the image block, and compressing the image block according to the target compression strategy to obtain a corresponding compressed image block, including:
Under the condition that the content complexity of the first image block exceeds a preset complexity threshold value, determining a target compression strategy matched with the first image block as a lossless compression strategy, and compressing the first image block according to the lossless compression strategy to obtain a corresponding first compressed image block;
Under the condition that the complexity of the content of the second image block is detected not to exceed the complexity threshold, determining that a target compression strategy matched with the second image block is a lossy compression strategy, and compressing the second image block according to the lossy compression strategy to obtain a corresponding second compressed image block;
The lossless compression strategy is a context adaptive entropy compression strategy;
The compressing the first image block according to the lossless compression policy to obtain a corresponding first compressed image block includes:
determining a context range matched with the image complexity of the first image block, wherein the size of the context range and the image complexity are in positive correlation;
for each target pixel in the first image block, extracting neighboring pixels of the target pixel within a context range, and taking a weighted average of the neighboring pixels as a predicted pixel value of the target pixel by calculating:
$$\hat{I}(x,y)=\sum_{(i,j)\in C(x,y)}\lambda_{i,j}\,I_1(i,j)$$
In the formula, $\hat{I}(x,y)$ represents the predicted value of the pixel at position $(x,y)$; $I_1(i,j)$ represents the pixel value at position $(i,j)$ in the first image block $I_1$; $\lambda_{i,j}$ represents the context pixel weight of the pixel at position $(i,j)$ in the context, reflecting the contribution of that pixel to the prediction; $C(x,y)$ represents the set of all pixels in the context of the pixel at position $(x,y)$;
calculating a prediction error between an actual pixel value and a predicted pixel value of each target pixel:
$$e(x,y)=I_1(x,y)-\hat{I}(x,y)$$
In the formula, $e(x,y)$ represents the prediction error of the pixel at position $(x,y)$; $I_1(x,y)$ represents the actual value of the pixel at position $(x,y)$ in the first image block; $\hat{I}(x,y)$ represents the predicted value of the pixel at position $(x,y)$;
Entropy encoding each target pixel in the first image block based on a probability distribution of prediction errors:
$$B(x,y)=-\log_2 P\big(e(x,y)\big),\qquad I_1'=\{B(x,y)\}$$
In the formula, $B(x,y)$ is the entropy coding value of the pixel at position $(x,y)$; $P(e(x,y))$ represents the probability distribution of $e(x,y)$; $I_1$ represents the image block and $I_1'$ represents the first compressed image block; $-\log_2 P(e(x,y))$ represents the entropy coding length calculated from the probability distribution.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411233580.9A CN118741055B (en) | 2024-09-04 | 2024-09-04 | High-resolution image transmission method and system based on optical communication |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118741055A (en) | 2024-10-01 |
| CN118741055B (en) | 2024-12-31 |
Family
ID=92867744
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411233580.9A Active CN118741055B (en) | 2024-09-04 | 2024-09-04 | High-resolution image transmission method and system based on optical communication |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118741055B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111798400A (en) * | 2020-07-20 | 2020-10-20 | 福州大学 | Reference-free low-light image enhancement method and system based on generative adversarial network |
| CN118469819A (en) * | 2024-05-31 | 2024-08-09 | 云南大学 | Image super-resolution reconstruction optimization method and device based on variation self-coding |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022067656A1 (en) * | 2020-09-30 | 2022-04-07 | 华为技术有限公司 | Image processing method and apparatus |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||