Disclosure of Invention
Aiming at the technical problem of low storage density in the existing DNA information storage coding and decoding algorithm, the application provides an image compression method, a DNA coding method and a related device thereof.
In order to achieve the above purpose, the application is realized by adopting the following technical scheme:
in a first aspect, the present application proposes an image compression method, including:
The following steps are respectively executed for each channel of the image to be compressed:
dividing a pixel matrix of a channel into a plurality of divided matrices to obtain the minimum value of each divided matrix, and marking a matrix formed by the minimum value of each divided matrix as the minimum matrix;
subtracting the minimum value of the corresponding division matrix from all values in the division matrix to obtain a replacement matrix of the pixel matrix;
if the value in the replacement matrix is larger than the allowable maximum value of the corresponding binary system and the bit number after conversion, after the replacement matrix of the pixel matrix is obtained, the value larger than the allowable maximum value is replaced by the allowable maximum value;
Flattening the replacement matrix converted by all channels into a one-dimensional array, and then sequentially splicing the flattened one-dimensional arrays to obtain a first array;
And flattening the minimum matrix converted by all the channels into a one-dimensional array, and then sequentially splicing the flattened one-dimensional arrays to obtain a second array, wherein the first array and the second array are used as image compression results.
Further, the size of the segmentation matrix is determined according to the storage density and the restored image precision.
Further, the converting the values in the replacement matrix and the minimum matrix into numbers with different digits in the same system respectively includes:
And converting the values in the replacement matrix into four-bit quaternary numbers, and converting the values in the minimum matrix into two-bit quaternary numbers.
Further, after the obtaining the replacement matrix of the pixel matrix, the method further includes:
If the value in the replacement matrix is equal to or greater than 16, the value is replaced by 15.
Further, the flattening into a one-dimensional array includes:
transversely into a one-dimensional array or longitudinally into a one-dimensional array.
In a second aspect, the present application proposes an image compression system comprising:
The following modules are adopted to respectively execute the corresponding steps for each channel of the image:
the segmentation module is used for segmenting the pixel matrix of one channel into a plurality of segmentation matrices to obtain the minimum value of each segmentation matrix, and the matrix formed by the minimum value of each segmentation matrix is recorded as the minimum matrix;
The replacing module is used for respectively subtracting the minimum value of the corresponding dividing matrix from all the values in the dividing matrix to obtain a replacing matrix of the pixel matrix;
The conversion module is used for respectively converting the values in the replacement matrix and the minimum matrix into digits with different digits in the same system, correspondingly obtaining a converted replacement matrix and a converted minimum matrix, and if the values in the replacement matrix are larger than the allowed maximum value of the converted corresponding system and digits, replacing the values larger than the allowed maximum value with the allowed maximum value after obtaining the replacement matrix of the pixel matrix;
The first splicing module is used for flattening the replacement matrix converted by all the channels into a one-dimensional array, and then splicing the flattened one-dimensional arrays in sequence to obtain a first array;
And the second splicing module is used for flattening the minimum matrix converted by all the channels into a one-dimensional array, and then splicing the flattened one-dimensional arrays in sequence to obtain a second array, wherein the first array and the second array are used as image compression results.
In a third aspect, the present application provides a DNA encoding method comprising:
Compressing an image using the image compression method of any one of claims 1 to 5;
the values in the first array and the second array are encoded into DNA sequences according to base mapping with the DNA sequences, respectively.
In a fourth aspect, the present application provides a DNA encoding system comprising:
the compression module is used for compressing the image by the image compression method;
And the coding module is used for respectively coding the values in the first array and the second array into DNA sequences according to the base mapping with the DNA sequences.
In a fifth aspect, the application features an electronic device comprising a memory, one or more processors, the memory coupled to the processors, wherein the memory has stored therein computer program code comprising computer instructions that, when executed by the processors, perform the steps of the image compression method described above, or the electronic device performs the steps of the DNA encoding method described above.
In a sixth aspect, the present application proposes a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of the above-described image compression method or which, when executed by a processor, implements the steps of the above-described DNA encoding method.
Compared with the prior art, the application has the following beneficial effects:
The application provides an image compression method, which comprises the steps of firstly dividing a pixel matrix of each channel of an image to be compressed, obtaining a minimum value of each divided matrix to form a minimum matrix, respectively subtracting the minimum value of the corresponding divided matrix from all values in the divided matrix to obtain a replacement matrix, respectively converting the replacement matrix and the minimum matrix into numbers with the same system and different digits, flattening, and then splicing to finish compression. The application utilizes the characteristic that the image information has certain redundancy, properly discards some information, reduces the storage information quantity and maintains the integrity of the image on the visual sense. Meanwhile, the pixel matrix is divided into n division matrices, and for each division matrix, a storage space of 4n 2 bases is required before processing, and after segmentation and conversion, the required storage space is 4+2n 2 bases. Therefore, the application can effectively reduce the required storage space and increase the storage density.
The application also provides an image compression system, a DNA coding method, a DNA coding system, electronic equipment and a computer storage medium, which have all the advantages of the image compression method.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In the description of the embodiments of the present application, it should be noted that, if the terms "upper," "lower," "horizontal," "inner," and the like indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, or the azimuth or the positional relationship in which the inventive product is conventionally put in use, it is merely for convenience of describing the present application and simplifying the description, and does not indicate or imply that the apparatus or element to be referred to must have a specific azimuth, be configured and operated in a specific azimuth, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like, are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
With the development of the internet age, data generated by humans has shown explosive growth. Currently, data storage mainly depends on silicon-based storage media such as hard disks, flash memories and the like. However, silicon-based storage media are increasingly reduced and are gradually depleted with the increase of data, so that development of new storage media is a very urgent need.
DNA information storage is becoming a leading edge technology, which is attracting attention due to the characteristics of high density, long-term stability, and amplifiability of DNA. In DNA information storage systems, the codec algorithm is the core component responsible for converting conventional digital information into DNA sequences and recovering the original information from these DNA sequences. In the leading field of DNA information storage, the design of compression algorithms is critical, as compression algorithms are directly related to how much information can be stored in a limited DNA molecule and the efficiency of reading the information. Although the storage density of DNA is extremely high, the cost and complexity of synthesizing long strings of DNA sequences is also relatively high. By the compression algorithm, the length of the required DNA sequence can be reduced on the premise of not sacrificing the data integrity, thereby reducing the storage cost and improving the reading and writing speed. Whether words in text, pixels in images, samples in audio or frames in video, exhibit a commonality in that there is a high degree of correlation between adjacent elements. In addition, a large amount of redundant information is often contained in multimedia data (such as images, audio and video), so that a potential space is provided for improving the storage efficiency, and the redundancy can be properly removed through an intelligent compression strategy, so that the encoded information density is remarkably improved.
Traditional compression methods have focused on coding optimization of data that has been converted to DNA sequences, such as by reducing repetitive fragments in the sequence, using specific DNA sequence patterns to represent more common blocks of data, and so forth. However, this approach often does not adequately take into account the inherent nature and distribution of the original information, i.e., the text, image, audio or video data prior to conversion to a DNA sequence.
In most naturally occurring data, there is often a strong correlation between adjacent elements. For example, in text, certain combinations of words occur much more frequently than others, and in images, the color values of adjacent pixels are typically very close, especially in smooth or graded areas. This correlation means that a lot of predictability is included in the data, thus providing the possibility for compression. In addition, multimedia data typically contains a large amount of redundant information. For example, an image may contain a large number of areas of the same color, audio may have a long silence or repeated background audio, and there is often only a small change between adjacent frames in the video. The redundant information can be reduced by a compression algorithm without affecting the overall perceived quality of the data.
Based on the above, the present application proposes an image compression method, a DNA encoding method and related devices, and the present application will be described in detail with reference to the embodiments and the accompanying drawings.
As a first embodiment of an image compression method of the present application, it may include:
as shown in fig. 1, a flow chart of steps performed on each channel of an image to be compressed in this embodiment is shown, specifically:
S101, dividing a pixel matrix of a channel into a plurality of divided matrices, obtaining the minimum value of each divided matrix, and marking a matrix formed by the minimum value of each divided matrix as the minimum matrix.
For each channel in the image, such as the red, green and blue channels in an RGB image, each channel is represented as a two-dimensional matrix of pixels, where each element of the matrix represents the color intensity of the image at that location. Dividing the pixel matrix of each channel into a plurality of smaller divided matrices, the size of the divided pixels can affect the compression ratio and image quality of the image. For each of the split matrices, a minimum value is found, which represents the minimum level of color intensity in the split matrix. And these minima are formed into a new matrix, called the minimum matrix.
S102, subtracting the minimum value of the corresponding division matrix from all values in the division matrix to obtain a replacement matrix of the pixel matrix.
For each division matrix, subtracting the minimum value of the division matrix from each value of the division matrix to obtain a replacement matrix can reduce the dynamic range of data, namely the difference between the maximum value and the minimum value of the data, so that the data is easier to process and compress. Compression algorithms are generally more efficient for data with a smaller dynamic range. By preprocessing the data to reduce its range, the performance of the compression algorithm can be improved, thereby achieving a higher compression ratio. In addition, since the processing is performed independently for each of the division matrices, local information of the image can be retained.
S103, respectively converting the values in the replacement matrix and the minimum matrix into digits with different digits in the same system, correspondingly obtaining the converted replacement matrix and the converted minimum matrix, and if the values in the replacement matrix are larger than the allowed maximum value of the converted corresponding system and digits, replacing the values larger than the allowed maximum value with the allowed maximum value after obtaining the replacement matrix of the pixel matrix.
By replacing the value in the replacement matrix that is greater than the allowed maximum value of the corresponding binary sum number after conversion with the allowed maximum value, it is possible to ensure that all values are within the range allowed at the time of conversion.
Converting the values in the replacement matrix and the minimum matrix to numbers of the same system but different numbers of bits can reduce the representation length of the data and can optimize the compression effect by selecting the appropriate system. The system may be determined based on different data characteristics and the number of bits may be determined based on the range of values and the required accuracy. By selecting the appropriate number of bits and bins, the data can be more efficiently represented while reducing the number of bits required for storage and transmission.
S104, flattening the replacement matrix converted by all channels into a one-dimensional array, and then sequentially splicing the flattened one-dimensional arrays to obtain a first array.
S105, flattening the minimum matrix converted by all channels into a one-dimensional array, and then sequentially splicing the flattened one-dimensional arrays to obtain a second array, wherein the first array and the second array are used as image compression results.
Flattening a matrix means converting a two-dimensional matrix into a one-dimensional array, which is typically done in a row-first or column-first order. Taking row preference as an example, each row of the matrix is connected in turn to form a one-dimensional array. After flattening the matrix of each channel into one-dimensional arrays, the next step is to stitch the one-dimensional arrays together. The order of stitching is typically consistent with the order of processing the channels, e.g., red channel, then green channel, and finally blue channel. If one-dimensional array is obtained for each channel, the first array after stitching will be a sequential arrangement of the three one-dimensional arrays.
It is noted that existing DNA information storage compression codec algorithms are mostly lossless type, aiming at fully retaining all information into DNA vectors. The prior art does not distinguish between other types of files when encoding an image, which results in an inability to personalize with some characteristics specific to the image. However, for certain sources of information, such as image data, the inherent structure implies redundant space that can be utilized. Meanwhile, the characteristic that small differences exist between adjacent pixels in the image is found through research, so that the possibility is provided for improving the information storage efficiency.
The application also designs an image lossy compression coding and decoding algorithm specially for DNA information storage based on the discovery. The algorithm considers the distribution characteristics of the pixels of the image, and achieves the aim of obviously improving the coding density while guaranteeing the visual effect of the image through selective information rejection. The application fully utilizes the adjacent similarity of the pixel values in the image, and successfully realizes the remarkable improvement of the coding density by carrying out fine resolution and reasonable elimination of redundant information on the pixel matrix, thereby remarkably improving the DNA storage efficiency.
It should be noted that, the above-mentioned image compression method may be used for DNA storage, but also has a wide application field, for example, in the fields of monitoring, artificial intelligence training and learning, image processing software, server, network transmission, etc., where the image compression method may be designed, and the above-mentioned image compression method of the present application may also be used.
The image compression method of the present application is described in one step as follows, taking application to DNA storage as an example:
As shown in fig. 2, a schematic flow chart of a DNA encoding method of the present application may include:
s201, extracting the picture to be compressed according to the channels to obtain a pixel matrix A of each channel.
In some embodiments of the present application, the image to be compressed is an RGB image, and the number of channels is 3, so that the subsequent operations are performed on the 3 channels, respectively. The application can be suitable for the compression of various images and the corresponding subsequent coding and storage, such as gray level images, the channel number is 1, therefore, the type of images do not need to be extracted by channels, and the pixel points of the original image are pixel matrixes. Such as a four-channel color image.
S202, dividing the pixel matrix A.
For each pixel matrix a extracted from the image to be compressed, it needs to be segmented. Let the size of the pixel matrix A beThe size of the division matrix obtained by division is. As shown in fig. 3, a schematic diagram of the division of the pixel matrix a is shown. In this step of the process, the process is carried out,Is any positive integer value less than a and less than b.The size of (c) affects the storage density and the accuracy of the decoded image.The smaller the value of (c), the more values need to be stored, the lower the storage density, and the higher the accuracy of the restored image.The larger the value of (c), the fewer the values that need to be stored, the higher the storage density, and the lower the accuracy of the restored image. From the point of view of the storage density,The smaller the value of the matrix is, the more the number of the matrices is after segmentation, each matrix contains fewer pixels, and the local characteristics of the image can be recorded more carefully, but more matrix information needs to be stored, so that the storage density is low; the larger the value of (c) the fewer the number of matrices segmented, each containing more pixels, so generally less information needs to be stored and the storage density is high, but larger matrices may not be able to represent the local features of the image finely, resulting in information loss. As shown in fig. 4, compression ratios for different block lengths are schematically shown.
In practical applications, there may beOr (b)Cannot be covered byUnder-sizing occurs in the case of integer divisionThe matrix of (2) is not required to be fully sized, and the actual size is maintained.
S203, obtaining a minimum matrix B.
For all the split matrices resulting from the splitting of a pixel matrix a, a minimum value V min is selected, which represents the lowest level of pixel values within the split matrix, and recorded. All V min together form a minimum matrix B and are located in the same position as their corresponding split matrix in matrix a, which means that if matrix a is split in row and column order, minimum matrix B should also be arranged with these minimum values in the same row and column order.
S204, updating the value in the pixel matrix A to obtain a replacement matrix of the pixel matrix A.
For each minimum matrix B, subtracting the minimum value V min in the minimum matrix B from each value in the minimum matrix B, and updating the obtained difference value at the corresponding position of the pixel matrix A to obtain a replacement matrix. In this embodiment, to ensure that all values in the updated replacement matrix can be represented using two base bits, each value is required to be less than 16. When the obtained difference is 16 or more, it may be replaced with 15. It should be noted that, in this embodiment, each value is required to be smaller than 16, because in this example, all values are subsequently converted into a quaternary number. In other embodiments of the present application, if the conversion to other numbers is needed later, the corresponding requirements in this step may also be adaptively adjusted. As shown in fig. 5, a schematic diagram of the replacement matrix and the minimum matrix is shown, where the minimum value V min =189.
S205, converting the numerical value into an equal-length quaternary number.
For pixel matrix a, each value has been controlled to within 16 by the foregoing steps, so each value can be converted into a two-bit quaternary number. For the minimum matrix B, the elements in the minimum matrix B are original picture pixel values, so that the maximum value is 255, and each element can be converted into a four-bit quaternary number. Note that, the quaternary numbers use 0, 1,2, and 3 as the numbers, so that the two-bit quaternary numbers can represent a range of 00 to 33 (equivalent to 0 to 15 in decimal). A four-bit quaternary may represent a maximum value of 3333 (255 in decimal), which corresponds exactly to the maximum value of the 8-bit image pixel value.
In practice, to convert a decimal number to quaternary, the number may be divided by 4 and the remainder recorded. But for the values in the minimum matrix B, it is necessary to ensure that the converted quaternary has four bits, since they may reach 255.
S206, expanding the converted replacement matrix and the minimum matrix into a one-dimensional array.
The converted replacement matrix and the minimum matrix are respectively unfolded into one-dimensional arrays, and can be unfolded transversely, unfolded longitudinally or by adopting other rules.
In practical applications, when the width of the image is fixed and the subsequent processing steps require accessing the pixels in rows, a lateral expansion may be used to arrange the elements in the matrix in a one-dimensional array in row order, i.e. all elements of the first row are read first, then all elements of the second row are read, and so on. When the height of the image is fixed and the subsequent processing steps require accessing the pixels by column, a longitudinal expansion may be employed, reading all elements of the first column first, then all elements of the second column, and so on. Besides transverse expansion and longitudinal expansion, zigzag expansion, block expansion, even custom rules and the like can be adopted.
S207, encoding the unfolded one-dimensional array into a DNA sequence.
Because the one-dimensional array is composed of quaternary numbers, the array can be converted into a DNA sequence according to the specified mapping relation of 0,1,2,3 and A, T, C and G.
The four numbers 0, 1,2, and 3 are used to represent the numerical values, and four bases in the DNA are adenine (a), thymine (T), cytosine (C), and guanine (G), respectively, and the four numbers 0, 1,2, and 3 correspond to the DNA base A, T, C, G according to the mapping relationship, which is established based on the analogy between the biological DNA sequence and the digital system.
The application provides a novel image lossy processing compression algorithm for the DNA information storage field, which can effectively reduce the information quantity required to be stored, but has no obvious difference on the final effect. In addition, the pixel matrix is subjected to modularized processing, and the pixel values are split, so that information to be recorded is reduced.
Accordingly, as shown in fig. 6, which is a schematic diagram of an image compression system according to the present application, the image compression system may include:
The following modules are adopted to respectively execute the corresponding steps for each channel of the image:
the segmentation module is used for segmenting the pixel matrix of one channel into a plurality of segmentation matrices to obtain the minimum value of each segmentation matrix, and the matrix formed by the minimum value of each segmentation matrix is recorded as the minimum matrix;
The replacing module is used for respectively subtracting the minimum value of the corresponding dividing matrix from all the values in the dividing matrix to obtain a replacing matrix of the pixel matrix;
The conversion module is used for respectively converting the values in the replacement matrix and the minimum matrix into digits with different digits in the same system, correspondingly obtaining a converted replacement matrix and a converted minimum matrix, and if the values in the replacement matrix are larger than the allowed maximum value of the converted corresponding system and digits, replacing the values larger than the allowed maximum value with the allowed maximum value after obtaining the replacement matrix of the pixel matrix;
The first splicing module is used for flattening the replacement matrix converted by all the channels into a one-dimensional array, and then splicing the flattened one-dimensional arrays in sequence to obtain a first array;
And the second splicing module is used for flattening the minimum matrix converted by all the channels into a one-dimensional array, and then splicing the flattened one-dimensional arrays in sequence to obtain a second array, wherein the first array and the second array are used as image compression results.
In some embodiments of the image compression system of the present application, the size of the segmentation matrix may be determined based on the storage density and the recovered image accuracy.
In some embodiments of the image compression system of the present application, when the values in the replacement matrix and the minimum matrix are respectively converted into numbers with different digits in the same system, the values in the replacement matrix may be converted into four-digit quaternary numbers, and the values in the minimum matrix may be converted into two-digit quaternary numbers.
In some embodiments of the image compression system of the present application, after obtaining the replacement matrix of the pixel matrix, the method may further include replacing with 15 if a value in the replacement matrix is greater than or equal to 16.
In some embodiments of the image compression system of the present application, the image compression system may be expanded into a one-dimensional array in a lateral direction or into a one-dimensional array in a longitudinal direction when flattened into a one-dimensional array.
In addition, as shown in FIG. 7, which is a schematic diagram of the DNA encoding system of the present application, it may include:
the compression module is used for compressing the image by adopting the image compression method;
And the coding module is used for respectively coding the values in the first array and the second array into DNA sequences according to the base mapping with the DNA sequences.
It should be noted that, in several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the system embodiments described above are merely illustrative, e.g., the division of modules is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple modules may be combined or integrated into another device, or some features may be omitted or not performed. The modules described as separate components may or may not be physically separate, and components shown as modules may be one physical unit or multiple physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each module in the embodiments of the present invention may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The embodiment of the application also provides an electronic device, which can comprise one or more processors, a memory and a communication interface.
Wherein the memory, the communication interface, and the processor are coupled. For example, the memory, communication interface, and processor may be coupled together by a bus.
The communication interface is used for carrying out data transmission with other devices. The memory has stored therein computer program code. The computer program code comprises computer instructions which, when executed by a processor, cause the electronic device to perform the steps of the above-described image compression method or cause the electronic device to perform the steps of the above-described DNA encoding method. .
The Processor may be a Processor or controller, such as a central processing unit (Central Processing Unit, CPU), general purpose Processor, digital signal Processor (DIGITAL SIGNAL Processor, DSP), application-specific integrated Circuit (ASIC), field programmable gate array (Field Programmable GATE ARRAY, FPGA) or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination that performs the function of a computation, e.g., a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, and the like. The processor may be adapted to support an electronic device for performing the method steps provided in the above embodiments.
The bus may be a peripheral component interconnect standard (PERIPHERAL COMPONENT INTERCONNECT, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The buses may be classified into address buses, data buses, control buses, and the like.
The embodiment of the application provides a computer readable storage medium, wherein a computer program is stored in the computer readable storage medium, and the steps of the image compression method are realized when the computer program is executed by a processor, or the steps of the DNA coding method are realized when the computer program is executed by the processor.
The computer readable storage medium to which the present application relates includes Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The description of the related parts in the image compression system, the DNA encoding method, the DNA encoding system, the electronic device and the computer readable storage medium provided in the embodiments of the present application is referred to in the detailed description of the corresponding parts in the image compression method provided in the embodiments of the present application, and will not be repeated here. In addition, the parts of the above technical solutions provided in the embodiments of the present application, which are consistent with the implementation principles of the corresponding technical solutions in the prior art, are not described in detail, so that redundant descriptions are avoided.
The above is only a preferred embodiment of the present application, and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.