HK1068200B

HK1068200B - Dct compression using golomb-rice coding

Info

Publication number: HK1068200B
Application number: HK05100266.2A
Authority: HK
Inventors: K．蒂亚加拉简
Original assignee: 高通股份有限公司
Priority date: 2001-06-29
Filing date: 2002-06-17
Publication date: 2010-04-09

Description

DCT compression using Golomb-Rice coding

Technical Field

The present invention relates to graphics processing and compression, and more particularly to a DCT coefficient coding method using Golomb-Rice.

Background

Within the general specifications of digital signal processing, digital picture processing has a significant position. The importance of human vision is of great interest in digital picture processing techniques and science. In the field of video signal transceiving, such as projection movies, various improvements are being made to image compression techniques. Many existing and proposed video systems utilize digital coding techniques including graphics coding, image restoration, and image feature selection. Image coding means attempting to transmit pictures of a digital communication channel in an efficient manner, using as few bits as possible to reduce the required bandwidth, while keeping distortion within certain limits. The image restoration is to restore the real image of the target. An encoded image transmitted via a communication channel may be distorted by various factors, the source of degradation being caused by the formation of an image from a target. Feature selection refers to the selection of certain picture attributes that are required for a wider range of recognition, classification and judgment.

Digital encoding of video, such as digital cinema, is an area that benefits from improvements in image compression techniques. Digital image compression is generally divided into two categories: non-destructive and destructive methods. Lossless image restoration does not lose any information. Lossy methods involve the irreparable loss of some information, depending on the compression ratio, the quality of the compression algorithm, and the implementation of the algorithm. In general, lossy compression methods are considered to achieve the compression ratios desired for economical digital cinema methods. To achieve a digital cinema quality level, the compression method should provide a visually lossless level of performance. Therefore, although the compression process suffers from mathematical information loss, under normal viewing conditions, the image distortion caused by this loss should not be visible to the viewer.

Existing digital image compression techniques have been developed for other applications, namely television systems, which, while making design compromises with the original application, do not meet the quality requirements required for motion picture presentation.

Digital cinema compression techniques should provide the viewing quality that a moviegoer has previously experienced. The viewing quality of digital cinema should preferably exceed the quality of high-quality release cinema prints, while the compression technique should have a practically high coding efficiency. As defined herein, coding efficiency refers to the bit rate required to compress image quality to a certain level. Moreover, the system and encoding techniques should have inherent flexibility to accommodate different formats and be economical, i.e., small scale and efficient decoder or encoder processing.

Many existing compression techniques have high compression levels but cause degradation in the quality of the video signal. In general, techniques for delivering compressed information require that the compressed information be delivered at a constant bit rate.

One compression technique that provides both a high level of compression and a desired level of quality for video signals employs adaptively sized blocks and partitions of encoded "discrete cosine transform" (DCT) coefficients, hereinafter referred to as the "Adaptive Block Size Discrete Cosine Transform (ABSDCT) method. U.S. Pat. No.5,021,891 entitled "Adaptive Block size Image Compression Method and System" discloses this technique. Which is assigned to the assignee of the present invention and is incorporated herein by reference. DCT technology is also disclosed in U.S. Pat. No.5,107,345 entitled "Adaptive Block Size Image Compression Method and System," which is assigned to the assignee of the present invention and is incorporated herein by reference. In addition, U.S. patent No.5,452,104 entitled "Adaptive Block Size Image Compression method System," also discusses the use of ABSDCT techniques in combination with "differential quadrant tree transformation" techniques, which is also assigned to the assignee of the present invention and is incorporated herein by reference. The systems disclosed in these patents apply so-called "intra-frame" encoding methods, where each frame of image data is encoded independently of the content of any other frame. With the ABSDCT technique, the achievable data rates can be reduced from 15 to 0.5 megabits per second without noticeable image quality degradation.

ABSDCT technology can compress black and white or color images or signals representing images. The color input signal may be in YIQ format, Y luminance or brightness samples, I and Q bit chrominance or color samples, in 4:4:4 or another format. Other known formats, such as YUV, YC, may also be used_b C_rOr RGB format. Due to the low spatial sensitivity of the eye to color, most studies have shown that the horizontal and vertical directions are perpendicular to each otherIt is appropriate to reduce the sub-samples of the color components to 1/4 so that the video signal can be represented by four luminance components and two chrominance components.

Using ABSDCT, the video signal is typically processed by dividing it into blocks of pixels. For each block, the luminance and chrominance components are passed to a block interleaver, which may, for example, pass 16 x 16 (pixel) blocks to a block interleaver, which arranges or organizes the image samples within each 16 x 16 block to produce blocks of data and composite blocks for Discrete Cosine Transform (DCT) analysis. The DCT operator is a method of converting space-time sampled signals into frequency representation. After conversion to a frequency representation, DCT techniques have proven suitable for extremely high compression levels because the designed quantizer can take advantage of the frequency distribution characteristics of the image. In an embodiment, one 16 × 16DCT is used for the first ordering, four 8 × 8 DCTs are used for the second ordering, 164 × 4 DCTs are used for the third ordering, and 62 2 × 2 DCTs are used for the fourth ordering.

The DCT operation reduces the spatial redundancy inherent to the video source. After DCT, most of the video signal energy tends to concentrate on a few DCT coefficients. To reduce redundancy between DCT coefficients, an additional transform, namely a "Differential Quadtree Transform (DQT)" may be applied.

For 16 x 16 blocks and each block, the DCT coefficient values and DQT values (if DQT is used only) are analyzed to determine the number of bits required for block or block encoding, and then the block or combination of blocks requiring the least number of bits to encode is selected to represent the image segment. For example, two 8 × 8 partitions, six 4 × 4 partitions, and eight 2 × 2 partitions may be selected to represent the image segment.

Then, the selected blocks or combinations of blocks are correctly ordered into 16 × 16 blocks. The DCT/DQT coefficient values are then subjected to frequency weighting, quantization and coding (e.g., variable length coding), in preparation for transmission. The ABSDCT technique described above is easy to execute, but the amount of computation is large, and it is difficult to make the hardware for implementing the technique dense.

Variable length coding has been implemented in continuous length and size. Other compression methods, such as "joint photographic experts group" (JPEG) or "moving picture experts group" (MPEG-2), all apply a common zig-zag scanning method to the entire processed block size. However, applying ABSDCT, different block sizes are generated depending on the variation within the data block. Some encoding methods, such as Huffman coding, include a sequence of zeros followed by a non-zero coefficient. However, Huffman coding is more optimal when the source symbol probability is negative to a power of 2. However, in the case of consecutive length/size pairs, the symbol probability is rarely negative to a power of 2.

Furthermore, Huffman coding requires the storage of a codebook of pre-calculated codewords, which is very large in size and the longest codeword is very long, so Huffman coding of symbols for consecutive lengths/sizes is not efficient.

Disclosure of Invention

An apparatus and method for coding the continuous length and amplitude of DCT coefficients in a lossless manner to achieve compression is described. Specifically, after quantization, Golomb-Rice coding is used to encode both zero continuous and non-zero amplitudes of the DCT coefficients. It has been found that a method using exponential distribution of data, such as the Golomb-Rice coding method, can achieve higher coding efficiency than other methods.

The invention relates to a quality-based image compression system and method, which apply discrete cosine transform coefficient data adaptive sizing blocks and sub-blocks and quality-based quantization stability coefficients. The block of pixel data is input to an encoder which includes a Block Size Assignment (BSA) element that divides the input block of pixels for processing. The block size assignment is based on the variation of the input block and the subdivided block. Typically, regions of greater variance are subdivided into smaller blocks, and regions of lesser variance are not subdivided, such that the average of blocks and partitions fall within different predetermined ranges. Thus, based on its average, the block's variance threshold is first modified by its nominal value, then the variance of the block is compared to the threshold, and if the variance is greater than the threshold, the block is subdivided.

The block size assignment is supplied to a transform element which transforms the pixel data into frequency domain data. Only the selected block and block transform is specified by block size and then scaled for transform data via quantization and serialization. The quantization of the transform data is based on certain pixel intensities, such as scaling coefficients that adjust contrast, coefficient count, rate distortion, block size specified density, and/or past scaling coefficients. Zigzag serialization is based on establishing the longest possible continuous length for the same value and then encoding the data stream for transmission with a variable length encoder. An exponential distribution based coding method, such as Golomb-Rice coding, is applied. Specifically, a zero run length is determined for data representing zeros. A function of a Golomb parameter, and the remainder is encoded as a function of zero run length, the Golomb parameter, and a window. The encoded windows are concatenated with the remainder. For data representing non-zero, non-zero data is encoded as a function of the non-zero data value and its sign. The encoded data is transmitted to a decoder through a transmission channel, and pixel data is reconstructed and displayed.

Thus, an aspect of an embodiment does not require a priori code generation.

Another aspect of an embodiment does not require the storage of a widely used codebook.

Yet another aspect of an embodiment reduces the size required for hardware implementation.

Yet another aspect of an embodiment achieves high coding efficiency.

A further aspect of an embodiment utilizes an exponential distribution of DCT data.

Drawings

The features and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference characters identify correspondingly throughout and wherein:

FIG. 1 is a block diagram of an encoder portion of an image compression and processing system;

FIG. 2 is a block diagram of a decoder portion of an image compression and processing system;

FIG. 3 is a flow chart of the processing steps involved in specifying based on varying block sizes;

FIG. 4a shows the exponential distribution of zero consecutive length Y components in a DCT coefficient matrix;

FIG. 4b shows a zero run length C in the DCT coefficient matrix_bAn exponential distribution of the components;

FIG. 4C shows a zero run length C in the DCT coefficient matrix_rAn exponential distribution of the components;

FIG. 5a shows an exponential distribution of the amplitude size Y component in a matrix of DCT coefficients;

FIG. 5b shows the magnitude C of the amplitude in the DCT coefficient matrix_bAn exponential distribution of the components;

FIG. 5C shows the magnitude C of the amplitude in the DCT coefficient matrix_rAn exponential distribution of the components;

FIG. 6 illustrates a Golomb-Rice encoding process; and

fig. 7 illustrates a Golomb-Rice encoding apparatus.

Detailed Description

To make digital signals pure and share the corresponding benefits, some form of signal compression must generally be applied. It is also important to maintain a high quality of the image in order to achieve a high degree of compression of the resulting image. Furthermore, in many cases, an intensive hardware implementation is also expected to be computationally efficient.

Before one embodiment of the invention is explained in detail, it is to be understood that the invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the following drawings, which have the function of other embodiments and are constructed in various ways. It is also to be understood that the terminology used herein is for the purpose of description and should not be regarded as limiting.

One aspect of an embodiment applies Image Compression Based on Discrete Cosine Transform (DCT) techniques, such as that disclosed in co-pending U.S. patent application Serial No. 09/436,085 entitled "Contrast Sensitive variable basic adaptive Block Size DCT Image Compression," filed on 8.11.1999, assigned to the assignee of the present application and incorporated herein by reference. Co-pending us patent application 09/494,192 entitled "Quality Based Image Compression", filed on 28.1.2000, assigned to the assignee of the present application and incorporated herein by reference, describes an Image Compression and decompression system that employs DCT. In general, a graphic is prepared for processing in the digital domain, the pixel data of which is divided into an array of non-overlapping blocks of size N × N, and a two-dimensional DCT can be performed on each block defined by:

in the formula (I), the compound is shown in the specification,and is

X (M, N) is the pixel at position (M, N) within the NxM block, and

x (k, l) is the corresponding DCT coefficient.

Since the pixel values are not negative, the DCT component X (0, 0) is always a positive number, usually with the largest energy. In fact, for a typical image, most of the transform energy is concentrated around the component X (0, 0), and this energy compression property makes the DCT technique a compression method that is so interesting.

The image compression technique employs contrast adaptive coding to further reduce the bit rate. It has been observed that most natural images consist of flat areas that change relatively slowly and busy areas such as high contrast structures at the border of the object. Contrast adaptive coding exploits this factor to allocate more bits to busy bits and fewer bits to non-busy bits.

Contrast adaptive methods apply intra-frame coding (spatial processing) and no inter-frame coding (spatio-temporal processing). Inter-frame coding requires multiple frame buffers in addition to more complex processing circuitry. In many instances, practical implementations require reduced complexity. Intra-frame coding is also suitable where spatio-temporal coding methods are inefficient, such as films with 24 frames per second can be classified as such, because the integration time caused by a mechanical shutter is relatively short, causing a higher degree of temporal aliasing. For unstable fast movements, the assumed inter-frame correlation is destroyed. Intra-coding is also easier to standardize when both 50HZ and 60HZ power line frequencies are involved. Currently television signals are transmitted at 50HZ or 60HZ, digitally applied intraframe methods are capable of both 50HZ and 60HZ operations, and are even suitable for 24 frames per second movies by balancing frame rate and spatial resolution.

For image processing purposes, the pixel data divided into the non-overlapping block array is subjected to a DCT operation. Note that although block sizes discussed herein are N × N, it is contemplated that various block sizes may be used, for example, N × M block sizes may be used, N and M are both integers, and M is greater than or less than N. Another point is that the block can be divided into at least one layer of blocks, such as N/ixN/i, N/ixN/j, N/ixM/j, etc., where i and j are integers. Again, the exemplary block sizes discussed herein are 16 x 16 blocks of pixels, blocks and partitions with corresponding DCT coefficients. Various other integers are also contemplated, such as using parity integer values, e.g., 9 x 9.

Fig. 1 and 2 illustrate an image processing system 100 incorporating the concept of a configurable serializer. The system 100 includes an encoder 104 that compresses a received video signal. The compressed signal is transmitted over a transmission channel or physical medium 108 and received by a decoder 112, which decodes the received encoded data into image samples for display.

The image is typically divided into blocks of pixels for processing. RGB/YC for color signal₁C₂Converter 116 converts from RGB space to YC₁C₂Space, Y is a luminance or brightness component, C₁And C₂Is a chrominance or color component. Many systems steer C in both horizontal and vertical directions due to the low spatial sensitivity of the eye to color₁And C₂The sub-sampling of the components is reduced to 1/4. But subsampling is not required. In some applications, such as those involving "digital cinema", the full resolution image in 4:4:4 format is extremely largeUseful or necessary. Two possibilities Y C₁C₂The representations are YIQ and YUV representations, both of which are well known in the art. A variation of the YUV representation, namely Y C, may also be applied_bC_rThis can be further divided into parity components. Thus, in one embodiment, the notation Y-even, Y-odd, C is used_b-even, C_b-odd, C_r-even, C_r-odd.

In a preferred embodiment, each parity y, C is processed_bAnd C_rThe components, not sub-sampled, are input to the encoder 104 as six components of a 16 x 16 block of pixels. For the purpose of illustration, Y is shown_DollEncoder 104 of component, for Y_{Magic card}Component and parity C₆And C_rThe components also use encoders. The encoder 104 includes a block size specification component 120 that performs block size specification for video compression and decides on a block decomposition for a 16 x 16 block based on perceptual features of the image in the block. The block size specifies the subdivision of each 16 x 16 block into smaller blocks, such as 8 x 8, 4 x 4 and 2 x 2, in a quadtree fashion, depending on the activity within the 16 x 16 block. The block size assignment element 120 generates quadrant tree data called PQR data, which is 1-21 bits in length. Thus, if the block size designation determines that a 16 × 16 block is to be divided, the R bits of the PQR data are set, followed by four additional bits of Q data corresponding to the four divided 8 × 8 blocks. If the block size specifies any of the 8 x 8 blocks of the block subdivision, then four additional bits of P data are added to each subdivided 8 x 8 block.

Referring now to fig. 3, a flowchart details the operation of the block size designation element 120. The block variation is used as a measure of the decided subdivided block. Beginning at step 202, a 16 x 16 block of pixels is read out. At step 204, the change v16 for the block is calculated as follows:

where N is 16, Xi, j is the pixel in the ith row and jth column in the nxn block. In step 206, if the average value of the blocks is between two predetermined values, the change threshold T16 is first modified. A new threshold T '16 is derived and the block change is compared to the new threshold T' 16.

If the change v16 is not greater than the threshold T16, then the start address of the 16 x 16 block is written to the register and the R position 0 of the PQR data is written to the register, indicating that the 16 x 16 block is not subdivided, at step 208. The algorithm then reads the next 16 x 16 block of pixels. If the change v16 is greater than the threshold T16, then the R position 1 of the PQR data is represented at step 210 as a subdivision of the 16 × 16 block into four 8 × 8 blocks.

Four 8 x 8 blocks i are considered for subdivision consecutively 1: 4 as shown in step 212. In step 214, the change v8i for each 8 × 8 block is calculated. If the average value of the block is between the two predetermined values, the change threshold T8 is modified to obtain a new threshold T' 8, and the block change is compared to the new threshold at step 216.

If the change v8i is not greater than the threshold T8, then the start address of the 8 x 8 block is written to the register in step 218, corresponding to Q bit Q_iAnd setting 0. Then, the next 8 × 8 block is processed. If the change v8i is greater than the threshold T8, then the corresponding Q bit Q is set at step 220_iSet 1, indicating that the 8 x 8 block is to be subdivided into four 4 x 4 blocks.

Consider four 4 x 4 blocks j, as shown in step 222_iSubdividing the powder 1: 4. At step 224, the change v4 of each 4 × 4 block is calculated_ij. In step 226, if the average value of the block is between two predetermined values, the change threshold T4 is modified to obtain a new threshold T' 4, and the block change is compared with the new threshold.

If it changes V4_ijNot greater than the threshold T4, the address of the 4 x 4 block is written at step 228, with the corresponding P bit P_ijSet to 0 and process the next 4 x 4 block. If the change v4_ijGreater than the threshold T4, the corresponding P bit P is set in step 230_ijSet to 1, indicates that the 4 x 4 block is to be subdivided into four 2 x 2 blocks. In addition, addresses of four 2 × 2 blocks are written into the register.

The thresholds T16, T8, and T4 are predetermined constants. This is a hard decision. In addition, adaptive or soft decision may be carried out, for example, soft decision changes the varying thresholds according to the average pixel value of a 2N × 2N block, where N may be 8, 4 or 2, thereby using a function of the average pixel value as the threshold.

For purposes of illustration, see the following examples. For 16 × 16, 8 × 8, and 4 × 4 blocks, the predetermined variation thresholds for the Y component are set to 50, 1100, and 880, respectively, in other words, T16 ═ 50, T8 ═ 1100, and T4 ═ 880. The average values were set to 80 and 100. Assume that the calculated variance for a 16 x 16 block is 60. Since 60 is greater than T16, and the average value 90 is between 80 and 100, the 16 x 16 block is subdivided into four 8 x 8 partitions. Assuming that the calculated changes for the 8 x 8 blocks are 1180, 935, 980 and 1210, since the changes for two 8 x 8 blocks exceed 78, the two blocks are subdivided to give a total of eight 4 x 4 partitions. Finally, assuming that the variance of eight 4 × 4 blocks is 620, 630, 670, 610, 590, 525, 930 and 690, the corresponding average values are 90, 120, 110, 115, since the average value of the first 4 × 4 block is in the range of (80, 100), the threshold should be reduced to T' 4 ═ 200, which is less than 880, so the 4 × 4 block is subdivided in addition to the seventh 4 × 4 block.

Note that for the luminance component Y_{Magic card}And a chrominance component C_b-couple、C_b-odd、C_r-coupleAnd C_r-oddBlock sizes may be specified in a similar manner. The chroma component is one-10 times of the chroma component in a horizontal method and a vertical method.

It is further noted that while the block size designation is described as a top-down approach, where the largest block is evaluated first (16 x 16 in this example), the bottom-up approach may be traded for evaluating the smallest block first (2 x 2 in this example).

Referring back to fig. 1, the PQR data is sent to the DCT element 124 along with the address of the selected block, which performs an appropriately sized discrete cosine transform on the selected block using the PQR data, except that the selected block is DCT processed.

Image processing system 100 also includes DQT element 128 that reduces redundancy between DC coefficients of the DCT. The DC coefficients are located in the upper left corner of each DCT block and are typically larger than the AC coefficients, and the size difference makes it difficult to design an efficient variable length encoder, so it is advantageous to reduce the redundancy between DC coefficients.

The DQT element performs a two-dimensional DCT on the DC coefficients, taking 2 × 2 at a time. Starting from a 2 x 2 block within a 4 x 4 block, a two-dimensional DCT is performed on the four DC coefficients, such a 2 x 2DCT being referred to as a differential quadtree transform, or DQT, of the four DC coefficients. Next, the next level DQT is computed using the DQT coefficients along with three adjacent DC coefficients within an 8 x 8 block. Finally, the DQT is calculated using the DC coefficients of four 8 × 8 blocks within a 16 × 16 block. Thus within a 16 x 16 block there is one true DC coefficient, the remainder being AC coefficients corresponding to the DCT and DQT.

The quantizer is provided with transform coefficients (DCT and DQT) for quantization. In one embodiment, the DCT coefficients are quantized using a Frequency Weighted Mask (FWM) and quantized scaling coefficients. FWM is a table of frequency weights of the same size as the block of incoming DCT coefficients, the frequency weights weighting different DCT coefficients with different weights. The weighting is designed to emphasize input samples for which the human visual or optical system is more sensitive to its frequency components, and not to emphasize samples for which the visual or optical system is less sensitive to its frequency components. The weighting method can also be designed according to factors such as the visual range.

The weights are selected according to empirical data. Published by the International standards organization in 1994 entitled "digital compression and encoding of connecting-tone still image-part 1: the method of designing a weighted mask for 8 × 8DCT coefficients is disclosed in ISO/IEC ITC1CD10918 by Requirements and guidelines ", which is incorporated herein by reference. Two FWMs are typically designed, one for the luminance component and one for the chrominance component. The scaling coefficients control the quality and bit rate of the quantized coefficients by taking 1/10 and interpolating 16 x 16 for an 8 x 8 block to obtain a FWM table of block sizes 2 x 2, 4 x 4.

Thus, each DCT coefficient is quantized as follows:

where DCT (i, j) is the input DCT coefficient, fwm (i, j) is the frequency weighting mask, q is the scaling coefficient, and DCTq (i, j) is the quantized coefficient. Note that the first term in parenthesis is rounded off according to the sign of the DCT coefficient. The DQT coefficients are also quantized with appropriate weighted masks, although multiple tables or masks may be applied to each Y, Cb and Cr component.

The pixel data and frequency weighted mask blocks are then scaled with quantizer 130 or scaling coefficient elements. DCT coefficient quantization reduces most of it to zero resulting in compression. In one embodiment, there are 32 scaling coefficients corresponding to the average bit rate. Unlike other compression methods such as MPEG2, the control of the average bit rate is based on the quality of the processed image rather than the target bit rate and buffer status.

To further improve compression, the quantized coefficients are sent to a scan serializer 134, which scans through blocks of quantized coefficients to produce a serialized stream of quantized coefficients. Zigzag scanning, column scanning or row scanning may be applied. Several different zigzag scanning patterns and patterns thereof may also be selected. A preferred technique applies 8 x 8 block sizes for zig-zag scanning. Zig-zag scanning of quantized coefficients improves the chance of encountering large succession of zero values, which themselves have a decreasing probability of being efficiently encoded with Huffman codes.

The serialized, quantized coefficient stream is sent to a variable length encoder 138. The Z-encoder separates the quantized coefficients between zero and non-zero coefficients, as described in detail with reference to fig. 6. In one embodiment, Golomb-Rice coding is applied to facilitate coding non-negative integers of the exponential distribution. The application of Golomb codes provides shorter length codes for exponential distribution-like variables, better for compression.

In Golomb coding run lengths, GThe Golomb code is parameterized by a non-negative integer m, e.g. given a parameter m, Golomb coding of a positive integer n is represented as a quotient n/m of a unary code followed by a remainder represented by a modified binary code, if the remainder is less thanIt is [ log ]₂m]Bit length, otherwise [ log ]₂m]The bit length. Golomb-Rice coding is a special case of Golomb coding, where the parameter m is expressed as m-2^K. In this case, the quotient n/m is obtained by right-shifting the binary representation of the parameter n by K bits, and the remainder of n/m is represented by the lowest K bits of n. Thus, the Golomb-Rice code is a concatenation of the two. The Golomb-Rice coding method can code positive and negative integers with geometrical shape (exponential) distribution on two sides, and the formula is as follows:

P_α(x)＝cα^|x| (1)

in the formula (1), α is a parameter representing the probability fading of X, and C is a normalization constant. Due to p_α(X) is monotonic, so it can be seen that a sequence of integer values should suffice

p_α(x_i＝0)≥p_α(x_i＝-1)≥p_α(x_i＝+1)≥p_α(x_i＝-2)≥… (2)

As shown in fig. 4 a-4 c and 5 a-5 c, the zero-continuity and amplitude in the quantized DCT coefficient matrix have an exponential distribution, the distribution shown in the figure being based on data from the actual image. Fig. 4a shows a distribution 400 of Y components of zero run length and relative frequency, and similarly fig. 4b and 4c show distributions of Cb and Cr components of zero run length and relative frequency 410 and 420. Fig. 5a shows the distribution of the Y component of amplitude magnitude versus relative frequency, and likewise fig. 5b and 5c show the distribution of the Cb and Cr components of amplitude magnitude versus relative frequency 510 and 520, respectively. Note that in fig. 5a to 5c, the graph shows the distribution of DCT coefficient sizes. Each size represents a range of coefficient values, e.g., a range of size values of four { -15, -14, … … -8, 8, … … 14, 15}, for a total of 16 values. Similarly, the + size value ranges from 1023, -1022, … …, -512, 512, … …, 1022, 1023, and has 1024 values. As can be seen from fig. 4a to 4c and 5a to 5c, both the continuous length and the amplitude size have an exponential distribution. The actual distribution of the amplitudes fits in equation (3):

in the formula (3), X_K，lRepresenting the DCT coefficients corresponding to frequencies K and l along the vertical and horizontal dimensions, respectively, and the mean valueVariations inThus, when processing data in DCT, Golomb-Rice is applied in the manner describedThe coding method is more preferred.

Although image data compression is described below, the embodiments are equally applicable to embodiments that compress audio data. When compressing image data, the image or video signal may be, for example, RGB or YIQ or YUV or YC_bC_rComponents, having linearly or logarithmically encoded pixel values.

Fig. 6 illustrates an encoding process 600 of zero and non-zero coefficients. When scanning the DCT matrix, the zero and non-zero coefficients are processed separately and separated (604). For zero data, a zero run length is determined (608). Note that the continuous length is a positive integer. For example, if the continuous length is found to be n, the Golomb parameter m is determined (612). In one embodiment, the Golomb parameter is defined as a function of the continuous length. In another embodiment, the Golomb parameter (m) is determined by equation (4):

m＝[log₂ n] (4)

another approach is to count the length of the run length and the associated Golomb parameter with a counter or register (616). To encode the zero run length n, the quotient is encoded (620). In one embodiment, the quotient is defined as a function of the zero run length and the Golomb parameter. In another embodiment, the quotient (Q) is determined by equation (5):

Q＝[n/2^m] (5)

in one embodiment, the quotient Q is encoded in a unary code, requiring Q +1 bits. The remainder is then encoded 624. In one embodiment, the remainder is encoded as a function of the run length and the quotient. In another embodiment, the remainder (R) is determined using equation (6):

R＝n-2^mQ (6)

in one embodiment, the remainder R is encoded in an m-bit binary code. The quotient Q and remainder R are then determined and the codes for Q and R are concatenated 628 to represent a total code of zero consecutive length n.

The non-zero function is also encoded in Golomb-Rice. Since the coefficient amplitude can be positive or negative, it is necessary to use the sign bit and encode the absolute value of a given amplitude. The non-zero function is given with an amplitude x, which can be expressed as a function of the absolute value of the amplitude and the sign. Therefore, the amplitude can be expressed as y by equation (7):

accordingly, the non-zero coefficient values are selectively counted with a counter or register (632). Then, determining whether the amplitude is greater than or equal to zero (636), if so, encoding the value as twice the given value (640); if not, the value is encoded to a value less than twice the absolute value (644). Other mapping methods may also be applied, the key being that no additional bits of the value symbol are required to be distinguished.

The amplitude represented by equation (7) is encoded, resulting in positive values of x being even integers and negative values becoming odd integers. The mapping method retains the probability assignment of x in equation (2). The coding method shown in formula (7) has the advantage of avoiding representing positive and negative numbers by sign bits. After mapping, y is encoded in the same way as zero runs, and the operation continues until all coefficients are scanned in the current block.

Importantly, although the embodiments of the present invention define the coefficient values and the continuous lengths as functions of equations (1) to (7), precise equations (1) to (7) are not necessarily applied. By using Golomb-Rice coding and the exponential distribution of DCT coefficients, audio-visual data can be more efficiently compressed.

Since the encoded zero-run and non-zero amplitudes are indistinguishable, a fixed-length dedicated prefix code must be applied to mark the occurrence of the first zero-run. After encountering a non-zero magnitude, all zeros are typically encountered in the block, where it is more efficient to use an end of block (EOB) code rather than a Golomb-Rice code. The EOB code may still be selected as a dedicated fixed length code.

The probability distribution of amplitude or continuous length in the DCT coefficient matrix is parameterized by alpha or lambda, according to equations (1) or (3), which means that if a range occurs that is not a particular DCT coefficient block,the coding efficiency can be increased and the relevant quantities can then be coded using a suitable Golomb-Rice parameter. In one embodiment, a counter or register is used for each successive length and amplitude magnitude value to calculate the respective accumulated value and the corresponding number of occurrences of that value. For example, if the registers storing the accumulated value and the number of accumulated elements are R_r1And N_r1Then the following equation (6) can be used as the Rice-Golomb parameter for coding the run length:

a similar method can be applied for amplitude.

Referring back to fig. 1, the compressed image signal generated by the encoder 104 may be buffered in a buffer 142 and then transmitted to the decoder 112 via the transmission channel 108. The transmission channel 108 may be a physical medium such as a magnetic or optical storage device, or a wired or wireless transmission process or apparatus. The PQR data with block size designation information is also provided to decoder 112 (fig. 2), which includes buffer 164 and variable length decoder 168. decoder 168 decodes the run-length values and non-zero values in a manner similar to but opposite to the decoder described in fig. 6.

The output of the variable length decoder 168 is fed to an inverse serializer 172 which arranges the coefficients according to the scanning method employed. For example, if a mixed zig-zag scan, vertical scan, and horizontal scan is applied, the deserializer 172 rearranges the coefficients correctly according to the type of scan applied. The inverse serializer 172 receives the PQR data and helps to properly align the coefficients into a block of synthesized coefficients.

Because of the use of quantizer scaling coefficients and frequency weighted masks, the resultant block is fed to an inverse quantizer 174 in order to restore the process.

If a differential quadtree transform is applied, the block of coefficients is sent to the IPQT element 186, followed by the IDCT element 190, otherwise directly to the IDCT element 190. IDQT and IDCT elements 186 and 190 inverse transform the coefficients to produce blocks of pixel data. The pixel data must then be interpolated to RGB form and stored for future display.

Fig. 7 illustrates a Golomb-Rice encoding apparatus 700 that preferably performs the process described in fig. 6. The determiner 704 determines the continuous length (n) and the Golomb parameter (m). Or for each successive length and amplitude magnitude value, a counter or register 708 may be used to calculate the respective accumulated value and the corresponding number of occurrences of that value. Encoder 712 encodes the quotient (Q) as a function of the continuous length and the Golomb parameter. In another embodiment, encoder 712 also encodes non-zero data as a function of the non-zero data values and their signs. The Q and R values are concatenated by a concatenator 716.

By way of example, the illustrative logical blocks, flowcharts, and steps described in connection with the embodiments disclosed herein may be implemented as hardware and software including an Application Specific Integrated Circuit (ASIC), a programmable logic device, discrete gate or transistor logic, discrete hardware elements such as registers and FIFDs, a processor executing a set of firmware instructions, any of the usual programmable software and processors, or any combination thereof. The processor is advantageously a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. The software may reside in the RM, flash memory, ROM, registers, hard disk, removable disk, CD-ROM, DVD-ROM, or any other form of storage medium known in the art.

The foregoing description of the preferred embodiments will provide those skilled in the art with a clear understanding that many modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Other features and advantages of the invention are set forth in the following claims.

Claims

1. A method of encoding data representing quantized frequencies, the data comprising data representing zeros and non-zeros, the method comprising:

determining a zero run length n;

defining the Golomb parameter m as a function of zero run length, wherein the formula m ═ log₂n]To determine a Golomb parameter m;

encoding the quotient Q as a function of zero run length and a Golomb parameter;

encoding the remainder R as a function of zero run length, Golomb parameter and quotient; and

concatenating the encoded quotient and the encoded remainder; and also

For data representing non-zero, non-zero data is encoded as a function of non-zero data values and non-zero data value symbols.

2. The method of claim 1, wherein the formula Q ═ n/2 is used^m]To determine the quotient Q.

3. A method as claimed in claim 1, characterized by using the formula R-n-2^mQ to determine the remainder R.

4. The method of claim 1 wherein said non-zero data is encoded as y-values using the equation

Where x is the amplitude of the non-zero data to be encoded.

5. An apparatus for encoding data representing quantized frequencies, the data comprising data representing zeros and non-zeros, the apparatus comprising:

for the data representing the zero or zero values,

means for determining the zero run length n, wherein the formula m ═ log₂n]To determine a Golomb parameter m;

means for defining a Golomb parameter m as a function of zero run length;

means for encoding the quotient Q as a function of zero run length and a Golomb parameter;

means for encoding the remainder R as a function of zero run length, Golomb parameter and quotient; and

means for concatenating the encoded quotient and the encoded remainder; while

Means for encoding non-zero data as a function of non-zero data values and non-zero data value signs for data representing non-zero.

6. The apparatus of claim 5, wherein [ n/2 ] is given by the formula Q ═ n^m]To determine the quotient Q.

7. The apparatus of claim 5, wherein the formula R-n-2^mQ to determine the remainder R.

8. The apparatus of claim 5, wherein the encoding of non-zero data is determined as a y-value using the equation

Where x is the amplitude of the non-zero data to be encoded.

9. The apparatus of claim 5, further comprising means for counting zero run lengths and non-zero amplitudes and corresponding occurrences of the zero run lengths and non-zero amplitudes.

10. An apparatus for encoding data representing quantized frequencies, the data comprising data representing zeros and non-zeros, the apparatus comprising:

for the data representing the zero or zero values,

a first determiner configured to determine a zero consecutive length n;

a second determiner configured to determine a Golomb parameter m as a function of zero consecutive length, wherein [ log ] is given by the formula m₂n]To determine a Golomb parameter m;

an encoder configured to encode a quotient Q as a function of zero run length and a Golomb parameter, and configured to encode a remainder R as a function of zero run length, the Golomb parameter, and the quotient, and for data representing non-zero, encode non-zero data as a function of a non-zero data value and a non-zero data value sign; and

a concatenator configured to concatenate the encoded quotient and the encoded remainder.

11. The apparatus of claim 10, wherein the formula Q ═ n/2 is used^m]To determine the quotient Q.

12. The apparatus of claim 10, wherein the formula R-n-2^mQ to determine the remainder R.

13. The apparatus of claim 10, wherein the non-zero data is encoded as a y-value using the equation:

where x is the amplitude of the non-zero data to be encoded.

14. The apparatus of claim 10, further comprising a counter configured to count zero run lengths and non-zero magnitudes and the number of times the zero run lengths and non-zero magnitudes occur, respectively.