CN114205586A - Video processing method for carrying out rate distortion optimization based on multi-color space and application - Google Patents
- Publication number: CN114205586A
- Application number: CN202111486783.5A
- Authority: CN (China)
- Legal status: Pending (as listed; not a legal conclusion)
Classifications
- H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria
- H04N19/172: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
- H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H04N19/186: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Color Television Systems (AREA)
Abstract
The invention discloses a video processing method for carrying out rate distortion optimization based on a multicolor space and application thereof, relating to the technical field of digital image processing. The method comprises the following steps: predefining N color spaces YUiVi in the candidate color space list; dividing an input video into a plurality of units, respectively converting each unit into the N color spaces for coding, and recording the bit number Bi of the unit after coding in each color space; carrying out dequantization and inverse transformation on the coding unit to obtain a distortion decoding unit, uniformly converting the distortion decoding unit into an initial standard color space to obtain a distortion unit, and comparing the distortion unit with an original input unit to calculate the distortion Di of the unit in each color space; and calculating the rate distortion cost Ji = Di + lambda Bi of each color space, and determining the color space with the minimum Ji as the coding color space of the current unit. The invention can flexibly and adaptively select the most suitable color space for coding aiming at the video image content.
    Description
Technical Field
      The invention relates to the technical field of digital image processing, in particular to a video processing method for carrying out rate distortion optimization based on a multi-color space and application thereof.
    Background
      Video compression reduces the cost of storing and transmitting video information by converting it into a lower bit rate form, and decompressing (also called decoding) reconstructs a version of the original information from the compressed form. With the development of IT technology, video applications have penetrated various areas of society. The emerging video applications put higher demands on video compression efficiency.
      Video compression performance must be evaluated jointly in terms of the bit rate of the encoded output and the distortion introduced by encoding. Bit rate and distortion constrain each other: reducing the bit rate necessarily increases the distortion, and conversely, obtaining better video quality increases the bit rate after coding. A typical rate-distortion curve is shown in Fig. 1. The core goal of video coding is therefore to reduce the coding bit rate as much as possible while guaranteeing a certain video quality. To deal with different video scenes, an encoder offers a plurality of selectable encoding modes under a relatively fixed framework, and one core task of the encoder is to select the optimal encoding parameters using a certain strategy so as to achieve optimal video compression performance. The process of selecting coding parameters based on rate-distortion theory is called rate-distortion optimization (RDO).
      The conventional video encoding process has many encoding parameters, that is, many processes on which rate-distortion optimization can be performed, including, for example, intra prediction mode selection, inter motion estimation, and quantization. In the current mainstream video coding frameworks, the processes on which rate-distortion optimization can be performed are relatively fixed; the rate-distortion optimization of the intra prediction mode is described here as an example. Intra prediction refers to predicting a block to be coded from already coded pixels of the current image. The mainstream H.265 standard provides 35 selectable intra prediction modes. Each coding block is coded by traversing all prediction modes, and the prediction mode with the minimum distortion that satisfies the bit rate limit is the optimal intra prediction mode.
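      The mode traversal described above can be sketched as follows. This is a minimal illustration, not an H.265 implementation: `trial_encode` is a hypothetical callback standing in for a real trial encode, and the per-mode distortion and bit figures are made-up numbers.

```python
from typing import Callable, Iterable, Optional, Tuple


def select_intra_mode(
    modes: Iterable[int],
    trial_encode: Callable[[int], Tuple[float, int]],
    max_bits: int,
) -> Optional[int]:
    """Return the mode with the least distortion among modes within max_bits.

    trial_encode(mode) is assumed to return (distortion, bits) for one block.
    Returns None when no mode satisfies the bit limit.
    """
    best_mode, best_distortion = None, float("inf")
    for mode in modes:
        distortion, bits = trial_encode(mode)
        if bits <= max_bits and distortion < best_distortion:
            best_mode, best_distortion = mode, distortion
    return best_mode


# Made-up (distortion, bits) results for three modes of one block.
toy_results = {0: (40.0, 90), 1: (25.0, 120), 2: (10.0, 200)}
chosen = select_intra_mode(range(3), lambda m: toy_results[m], max_bits=150)
```

      With these toy numbers, mode 2 has the lowest distortion but exceeds the bit limit, so the search settles on the best mode that still fits.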
      On the other hand, a video source (such as a camera) typically provides video in a particular color space, in which the color components of the video are sub-sampled according to a particular color sampling rate. In general, a color space (also called a color model) is a model that represents a color as n values per physical location, where n ≥ 1 and each of the n values provides one color component value for that location. A color is typically described by a triplet (n = 3) or a quadruplet (n = 4) of numbers, as in the RGB and CMYK color spaces. In the RGB color space, the red (R) component value represents the intensity of red at a location, the green (G) component value represents the intensity of green at that location, and the blue (B) component value represents the intensity of blue at that location. In the CMYK color space, the four standard colors are C = cyan, M = magenta, Y = yellow and K = black, and each component value represents the intensity of the corresponding color at the location. Different color spaces have advantages for different applications, and a color can usually be converted from one color space to another by a color space conversion operation. The conversion between color spaces is mostly a simple linear mapping, e.g. the conversion of the RGB color space to the YCbCr color space:
      
      or conversion of RGB color space to YCgCo color space:
      
      Currently, most mainstream video and image coding systems (video encoders) convert the original image into the YCbCr color space for coding, based on a characteristic of the human visual system: when judging the perceptual quality of a video or image, the human eye places more importance on brightness information than on color information. However, because video content is rich and varied, coding in a single color space has difficulty adapting to varied video scenes. The color distribution often differs between video scenes, and the corresponding data statistics differ between color spaces, so a video scene may be encoded with higher compression efficiency in one color space than in another. As shown in Fig. 2, compared with the commonly used YCbCr color space, the representation of the image in the YCgCo color space appears flatter (the data fluctuates less and the chroma texture is less pronounced), so the image can be expected to be better suited to coding in the YCgCo color space, where higher compression efficiency can be obtained.
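      As an illustration of such a linear color space mapping, the sketch below uses the commonly cited floating-point form of the RGB-to-YCgCo transform and its inverse. It is assumed, not confirmed, to match the conversion intended in this document; the function names are illustrative only.

```python
def rgb_to_ycgco(r, g, b):
    """Forward transform: one luma (Y) and two chroma (Cg, Co) components."""
    y = 0.25 * r + 0.5 * g + 0.25 * b
    cg = -0.25 * r + 0.5 * g - 0.25 * b
    co = 0.5 * r - 0.5 * b
    return y, cg, co


def ycgco_to_rgb(y, cg, co):
    """Inverse transform: recovers R, G, B with additions and subtractions only."""
    tmp = y - cg          # equals (R + B) / 2
    return tmp + co, y + cg, tmp - co


pixel = (200, 100, 40)
converted = rgb_to_ycgco(*pixel)
restored = ycgco_to_rgb(*converted)
gray = rgb_to_ycgco(128, 128, 128)
```

      A neutral gray pixel maps to zero chroma in this form, which is one reason flat or low-saturation content can look "flatter" in the chroma planes.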
      Accordingly, the prior art has also proposed picture coding strategies using adaptive color space transformation. For example, Chinese patent ZL201310101249.7 discloses an encoder with a color space transformer that transforms the color space representation of a picture block from a "native" color space representation to a secondary color space representation when a transform decision indicates that the transform is required. The transform decider estimates, block by block, the expected quality of the coded picture representation when each block is coded separately in either the original or the secondary color space representation. In particular, it can decide for each block whether a transform is necessary or appropriate based on the maximum bit rate required, so that the best possible coding quality at a given bit rate is always selected. However, that coding scheme sets up its rate-distortion optimization around a specified maximum bit rate of the bit stream, or optimizes the bit rate for a fixed quality, and it requires color spaces predefined by international standards; it therefore does not provide sufficient flexibility for video scenes with rich and varied content.
      In summary, from the perspective of color space, how to provide a video processing method with better flexibility and wider applicability, which can further mine the compression efficiency of video and image, is a technical problem that needs to be solved in the art.
    Disclosure of Invention
      The object of the invention is to overcome the defects of the prior art and to provide a video processing method, and applications thereof, for rate-distortion optimization based on a multi-color space. The technical scheme provided by the invention adds rate-distortion optimization of the color space selection to the video and image coding process. A user can configure the color spaces used in this rate-distortion optimization as needed, and the most appropriate color space representation can be selected flexibly and adaptively for coding different video image contents, thereby further improving the compression efficiency of video and image coding.
      In order to achieve the above object, the present invention provides the following technical solutions:
      a video processing method based on multi-color space rate distortion optimization comprises the following encoding steps:
      according to a value of N set by a user, predefining a candidate color space list of size N, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, where i = 0, 1, ..., N-1 and N is an integer greater than or equal to 2;
      receiving a video input using the standard color space, and dividing the input video into a plurality of units;
      for each unit, converting it to each of the N color spaces YUiVi of the candidate color space list; coding the unit in each color space, and recording the number of bits Bi obtained by coding the unit in each color space YUiVi; performing dequantization and inverse transformation on the unit coded in each color space to obtain a distorted decoded unit; converting the distorted decoded units of the different color spaces uniformly back into the initial standard color space, and comparing each with the originally input unit data in the standard color space to calculate the distortion Di of the unit for each color space YUiVi; calculating the rate-distortion cost Ji = Di + λ·Bi from the number of bits Bi and the distortion Di of each color space YUiVi, where λ is the Lagrangian factor; determining the color space with the minimum rate-distortion cost as the coding color space of the current unit; and adding information representing the color space of each unit to the coded data stream.
      Further, in decoding, the information indicating the color space of each unit is extracted from the data stream, and, based on the color space information recorded for each unit, the video image data is converted from the corresponding color space back to the standard color space for output and display.
      Further, the unit is at block level: an image frame of the input video is divided into coding blocks of a certain size, and a color space is selected for each coding block.
      Or, the unit is at frame level, and color space selection is performed for each image frame.
      Or, the unit is at GOP level, and color space selection is performed for each group of pictures.
      Or, the unit is at a video sequence level, and color space selection is performed for each video sequence.
      Further, the standard color space is any one of color spaces predefined by international standards;
      the color spaces YUiVi in the candidate color space list include one or more color spaces predefined by international standards and one or more non-standard (user-defined) color spaces; the conversion coefficients for the color spaces predefined by international standards are set by system default, and the conversion coefficients between a non-standard color space and the standard color space are set by the user.
      Further, the conversion between the non-standard color space and the standard color space is a linear mapping. In this case, the conversion coefficients include a first conversion matrix L1 and a second conversion matrix L2, both set by the user, such that: components of the color space YUiVi = L1 · components of the standard color space + L2.
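      The mapping with L1 and L2, and the inverse needed to return to the standard color space, can be sketched as below. This is only an illustration under assumptions: L1 is taken to be an invertible 3x3 matrix, L2 a 3-element offset, and the example coefficients (a YCgCo-like matrix with a chroma offset of 128) are chosen for the demonstration and are not taken from this document. Exact fractions are used so that the round trip is exact.

```python
from fractions import Fraction as F


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-element vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate; raises if the matrix is singular."""
    (a, b, c), (d, e, f), (g, h, k) = m
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("L1 must be invertible")
    adj = [
        [e * k - f * h, c * h - b * k, b * f - c * e],
        [f * g - d * k, a * k - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


def to_custom(std, L1, L2):
    """YUiVi components = L1 . standard components + L2."""
    return [y + o for y, o in zip(mat_vec(L1, std), L2)]


def to_standard(yuv, L1, L2):
    """Standard components = inverse(L1) . (YUiVi components - L2)."""
    return mat_vec(inverse_3x3(L1), [y - o for y, o in zip(yuv, L2)])


# Illustrative coefficients only (assumed, not from the source).
L1 = [[F(1, 4), F(1, 2), F(1, 4)],
      [F(-1, 4), F(1, 2), F(-1, 4)],
      [F(1, 2), F(0), F(-1, 2)]]
L2 = [F(0), F(128), F(128)]

pixel = [F(200), F(100), F(40)]
converted = to_custom(pixel, L1, L2)
restored = to_standard(converted, L1, L2)
```

      A user-defined space is only usable for decoding if L1 is invertible, which is why the sketch rejects singular matrices.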
      The present invention also provides a video encoder for performing rate-distortion optimization based on a multi-color space, comprising:
      an initialization module, configured to predefine a candidate color space list of size N according to a value of N set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, where i = 0, 1, ..., N-1 and N is an integer greater than or equal to 2;
      an encoding module, configured to: receive a video input using the standard color space and divide the input video into a plurality of units; for each unit, convert it to each of the N color spaces YUiVi of the candidate color space list; code the unit in each color space, and record the number of bits Bi obtained by coding the unit in each color space YUiVi; perform dequantization and inverse transformation on the unit coded in each color space to obtain a distorted decoded unit; convert the distorted decoded units of the different color spaces uniformly back into the initial standard color space, and compare each with the originally input unit data in the standard color space to calculate the distortion Di of the unit for each color space YUiVi; calculate the rate-distortion cost Ji = Di + λ·Bi from the number of bits Bi and the distortion Di of each color space YUiVi, where λ is the Lagrangian factor; determine the color space with the minimum rate-distortion cost as the coding color space of the current unit; and add information representing the color space of each unit to the coded data stream.
      The invention also provides a video codec based on multi-color space for rate distortion optimization, comprising:
      an encoder, configured to: predefine a candidate color space list of size N according to a value of N set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, where i = 0, 1, ..., N-1 and N is an integer greater than or equal to 2; receive a video input using the standard color space and divide the input video into a plurality of units; for each unit, convert it to each of the N color spaces YUiVi of the candidate color space list; code the unit in each color space, and record the number of bits Bi obtained by coding the unit in each color space YUiVi; perform dequantization and inverse transformation on the unit coded in each color space to obtain a distorted decoded unit; convert the distorted decoded units of the different color spaces uniformly back into the initial standard color space, and compare each with the originally input unit data in the standard color space to calculate the distortion Di of the unit for each color space YUiVi; calculate the rate-distortion cost Ji = Di + λ·Bi from the number of bits Bi and the distortion Di of each color space YUiVi, where λ is the Lagrangian factor; determine the color space with the minimum rate-distortion cost as the coding color space of the current unit; and add information representing the color space of each unit to the coded data stream;
      and the decoder is used for extracting the information which represents the color space of each unit from the data stream, and converting the video image data back to the standard color space according to the corresponding color space for output and display according to the color space information recorded by each unit.
      By adopting the above technical scheme, the invention has, by way of example, the following advantages and positive effects compared with the prior art: the user can apply rate-distortion optimization to the color space selection in the video and image coding process as needed, and the most appropriate color space representation can be selected flexibly and adaptively for coding different video image contents, thereby further improving the compression efficiency of video and image coding.
      Furthermore, the invention can select the color space at the block level, the frame level, the GOP level and the video sequence level according to the hardware cost, the video scene and other actual requirements, can improve the coding efficiency in many scenes, and has strong flexibility and wide applicability.
      Furthermore, the color space used by the invention is not necessarily the color space predefined in the international standard, and the user can self-define the color space according to the video content and the coding requirement, so that the potential of video compression can be effectively mined through more free color space selection; meanwhile, the size of the candidate color space list can be freely regulated according to specific equipment performance, and the method has strong flexibility and compatibility.
    Drawings
      Fig. 1 is a typical rate-distortion curve in the prior art.
      Fig. 2 is a diagram showing the effect of converting an RGB image into a YCbCr color space and a YCgCo color space, respectively.
      FIG. 3 is an exemplary diagram of a color space included in a candidate color space list according to an embodiment of the present invention.
      Fig. 4 is a diagram illustrating an exemplary process for performing rate-distortion optimization using multi-color space according to an embodiment of the present invention.
    Detailed Description
      The video processing method and application based on multi-color space rate-distortion optimization disclosed by the invention are further described in detail with reference to the accompanying drawings and specific embodiments. It should be noted that technical features or combinations of technical features described in the following embodiments should not be considered as being isolated, and they may be combined with each other to achieve better technical effects. In the drawings of the embodiments described below, the same reference numerals appearing in the respective drawings denote the same features or components, and may be applied to different embodiments. Thus, once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
      It should be noted that the structures, proportions, sizes, and other dimensions shown in the drawings and described in the specification are only for the purpose of understanding and reading the present disclosure, and are not intended to limit the scope of the invention, which is defined by the claims, and any modifications of the structures, changes in the proportions and adjustments of the sizes and other dimensions, should be construed as falling within the scope of the invention unless the function and objectives of the invention are affected. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of order from that described or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention.
      Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
      Examples
      The embodiment provides a video processing method for rate distortion optimization based on a multi-color space, which comprises the following steps of encoding video:
      S100, predefining a candidate color space list of size N according to a value of N set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, where i = 0, 1, ..., N-1 and N is an integer greater than or equal to 2.
      The specific value of N may be determined by a user according to actual requirements such as hardware cost and video content.
      In the color space YUiVi, Y denotes the luminance (luma) component, and Ui and Vi denote the two chrominance components.
      In this embodiment, the standard color space may be any one of the color spaces predefined by international standards, and the user may set it according to the video to be processed. As a typical preferred option, the standard color space may be set to the standard RGB color space.
      The color space YUiVi in the candidate color space list comprises one or more of color spaces predefined by international standards, such as YCbCr color space and YPbPr color space; one or more of non-standard predefined color spaces, i.e., custom color spaces, such as those customized by a user based on actual scene characteristics, may also be included.
      The conversion coefficient between the international standard predefined color space is set by default of the system, and the conversion coefficient between the non-standard predefined color space and the standard color space can be set by the user in a self-defined way.
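      One possible way to hold such a candidate list in software is sketched below. The class name, field names and the `user_defined` flag are hypothetical, and the two example entries (a YCgCo-like matrix as the system-default entry and a simple green-difference matrix as the user-defined entry) are illustrative choices, not coefficients given in this document.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class CandidateColorSpace:
    name: str
    L1: Sequence[Sequence[float]]  # 3x3 conversion matrix relative to the standard space
    L2: Sequence[float]            # 3-element offset
    user_defined: bool             # False: system default coefficients; True: set by the user


def build_candidate_list(n: int, spaces: List[CandidateColorSpace]) -> List[CandidateColorSpace]:
    """Check the list against the user-chosen size N (N >= 2) and return a copy."""
    if not isinstance(n, int) or n < 2:
        raise ValueError("N must be an integer greater than or equal to 2")
    if len(spaces) != n:
        raise ValueError(f"expected {n} color spaces, got {len(spaces)}")
    return list(spaces)


# Illustrative entries (assumed coefficients).
YCGCO = CandidateColorSpace(
    "YCgCo",
    ((0.25, 0.5, 0.25), (-0.25, 0.5, -0.25), (0.5, 0.0, -0.5)),
    (0.0, 0.0, 0.0),
    user_defined=False,
)
CUSTOM = CandidateColorSpace(
    "custom-0",  # Y = G, U = B - G, V = R - G
    ((0.0, 1.0, 0.0), (0.0, -1.0, 1.0), (1.0, -1.0, 0.0)),
    (0.0, 0.0, 0.0),
    user_defined=True,
)

candidates = build_candidate_list(2, [YCGCO, CUSTOM])
```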
      In this embodiment, the conversion between the non-standard color space and the standard color space also adopts a linear mapping. Preferably, the conversion coefficients include a first conversion matrix L1 and a second conversion matrix L2, both set by the user, such that: components of the color space YUiVi = L1 · components of the standard color space + L2.
      By way of example and not limitation, suppose the standard color space is set to the RGB color space and the user sets the size of the candidate color space list to N = 3 (i = 0, 1, 2). The candidate color space list then contains 3 color spaces, in sequence YU0V0, YU1V1 and YU2V2, and the conversion relations between the color spaces YU0V0, YU1V1, YU2V2 and the standard RGB color space may be defined as:
      
      
      
      
      
      
      
      
      
      S200, receiving a video input using the standard color space, and dividing the input video into a plurality of units.
      In this example, the input video is therefore RGB color space data.
      In one implementation of this embodiment, the unit is at block level: an image frame of the input video is divided into coding blocks of a certain size, and color space selection is performed for each coding block.
      In another implementation of this embodiment, the unit is at frame level, and color space selection is performed for each image frame.
      In another implementation of this embodiment, the unit is at GOP (group of pictures) level, and color space selection is performed for each group of pictures.
      In another implementation of this embodiment, the unit is at video sequence level, and color space selection is performed for each video sequence.
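      For the block-level case, the division of a frame into units might look like the sketch below. The block size is not specified in this document, so it is left as a parameter, and the handling of frame edges (smaller blocks at the right and bottom borders) is an assumption made for the illustration.

```python
def split_into_blocks(frame, block_size):
    """Yield (top, left, block) for a frame given as a list of pixel rows.

    Blocks at the right and bottom edges may be smaller than block_size.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in frame[top:top + block_size]]
            yield top, left, block


# Toy 6x4 frame whose pixels are (x, y, 0) tuples.
frame = [[(x, y, 0) for x in range(6)] for y in range(4)]
blocks = list(split_into_blocks(frame, 4))
```

      Frame, GOP and sequence-level units follow the same pattern with a coarser grouping: one color space decision then covers a whole frame, group of pictures or sequence.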
      S300, for each unit, converting it to each of the N color spaces YUiVi of the candidate color space list; coding the unit in each color space, and recording the number of bits Bi obtained by coding the unit in each color space YUiVi; performing dequantization and inverse transformation on the unit coded in each color space to obtain a distorted decoded unit; converting the distorted decoded units of the different color spaces uniformly back into the initial standard color space, and comparing each with the originally input unit data in the standard color space to calculate the distortion Di of the unit for each color space YUiVi; calculating the rate-distortion cost Ji = Di + λ·Bi from the number of bits Bi and the distortion Di of each color space YUiVi, where λ is the Lagrangian factor; determining the color space with the minimum rate-distortion cost as the coding color space of the current unit; and adding information representing the color space of each unit to the coded data stream.
      Taking coding blocks as an example, in step S300 each coding block is converted into its YU0V0, YU1V1 and YU2V2 color space representations. Following the conventional video coding scheme, the coding blocks in the different color spaces undergo prediction, transformation, quantization and entropy coding, and the numbers of bits B0, B1 and B2 obtained after coding in the different color spaces are recorded. The coded blocks of the different color spaces are then dequantized and inverse transformed to obtain decoded blocks with distortion, called distorted decoded blocks. The distorted decoded blocks of the different color spaces are converted uniformly back into the initial standard RGB color space and compared there with the original input block data, and the respective distortions D0, D1 and D2 are calculated from this comparison. The rate-distortion cost is then calculated by combining the numbers of bits B0, B1, B2 with the distortions D0, D1 and D2.
      In this embodiment, a Lagrangian-based rate-distortion optimization method is preferably adopted:
      $$J = D + \lambda B$$
      where D and B respectively denote the distortion degree and the number of bits obtained when coding with a given color space representation, and λ is the Lagrangian factor. The optimal color space mode is the one with the minimum rate-distortion cost.
      In this case, the rate-distortion costs corresponding to the YU0V0, YU1V1, and YU2V2 color spaces are:
      $$J_0 = D_0 + \lambda B_0$$
      $$J_1 = D_1 + \lambda B_1$$
      $$J_2 = D_2 + \lambda B_2$$
      The color space with the minimum rate-distortion cost is selected and determined as the coding color space (optimal color space) of the current coding block, and information representing the coding color space of each coding block is added into the coded data stream. Specifically, the coding color space information of each coding block may be stored in the portion of the coded data stream corresponding to that coding block.
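      The selection procedure of step S300 can be sketched as below. Everything concrete in this sketch is an assumption made for illustration: the two candidate spaces (offset-free BT.601-style YCbCr and YCoCg) stand in for the patent's YUiVi list, `toy_encode` replaces real prediction, transform, quantization, and entropy coding with plain uniform quantization plus a crude bit estimate, and distortion is measured as the sum of squared errors in RGB. Only the structure (convert, encode, reconstruct, convert back, compute J = D + λB, pick the minimum) follows the text.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[float, float, float]
Convert = Callable[[Pixel], Pixel]


def ycbcr_fwd(p: Pixel) -> Pixel:
    r, g, b = p
    return (0.299 * r + 0.587 * g + 0.114 * b,
            -0.168736 * r - 0.331264 * g + 0.5 * b,
            0.5 * r - 0.418688 * g - 0.081312 * b)


def ycbcr_inv(p: Pixel) -> Pixel:
    y, cb, cr = p
    return (y + 1.402 * cr, y - 0.344136 * cb - 0.714136 * cr, y + 1.772 * cb)


def ycocg_fwd(p: Pixel) -> Pixel:
    r, g, b = p
    return (r / 4 + g / 2 + b / 4, r / 2 - b / 2, -r / 4 + g / 2 - b / 4)


def ycocg_inv(p: Pixel) -> Pixel:
    y, co, cg = p
    tmp = y - cg
    return (tmp + co, y + cg, tmp - co)


# Hypothetical candidate list (name, forward conversion, inverse conversion).
CANDIDATES: List[Tuple[str, Convert, Convert]] = [
    ("YCbCr", ycbcr_fwd, ycbcr_inv),
    ("YCoCg", ycocg_fwd, ycocg_inv),
]


def toy_encode(block: List[Pixel], step: float):
    """Stand-in for a real encoder: uniform quantization and a rough bit count."""
    levels = [tuple(round(v / step) for v in px) for px in block]
    bits = sum(1 + 2 * abs(lv).bit_length() for px in levels for lv in px)
    return levels, bits


def toy_decode(levels, step: float) -> List[Pixel]:
    """Dequantize the levels back to sample values (with distortion)."""
    return [tuple(lv * step for lv in px) for px in levels]


def select_color_space(block_rgb: List[Pixel], lam: float, step: float = 8.0):
    """Return (best_index, [(J_i, D_i, B_i), ...]) for one unit."""
    costs = []
    for _name, fwd, inv in CANDIDATES:
        levels, bits = toy_encode([fwd(px) for px in block_rgb], step)
        recon_rgb = [inv(px) for px in toy_decode(levels, step)]
        # Distortion D_i: squared error measured in the standard RGB space.
        dist = sum((a - b) ** 2
                   for orig, rec in zip(block_rgb, recon_rgb)
                   for a, b in zip(orig, rec))
        costs.append((dist + lam * bits, dist, bits))
    best = min(range(len(costs)), key=lambda i: costs[i][0])
    return best, costs


block = [(200.0, 120.0, 40.0), (198.0, 122.0, 44.0),
         (60.0, 60.0, 60.0), (62.0, 58.0, 61.0)]
best, costs = select_color_space(block, lam=10.0)
print(CANDIDATES[best][0], costs)
```

      Measuring distortion after converting back to RGB, rather than inside each candidate space, is what makes the costs Ji comparable across color spaces.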
      In this embodiment, when the decoder performs decoding, it first extracts the information indicating the color space of each unit from the data stream and then restores the video image data according to a conventional decoding process. The video image data of each unit is converted from its recorded color space back to the standard RGB color space, and is then output and displayed.
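      A minimal sketch of this decoder-side step follows. The stream layout used here, a list of `(color_space_index, samples)` pairs, is purely hypothetical, since the source does not define a bitstream syntax; the index-to-conversion table and the two inverse conversions (YCoCg and offset-free BT.601-style YCbCr) are likewise illustrative assumptions, and the samples are assumed to be already reconstructed by a conventional decoding process.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, float, float]
Rgb = Tuple[int, int, int]


def ycocg_to_rgb(p: Sample) -> Sample:
    y, co, cg = p
    tmp = y - cg
    return (tmp + co, y + cg, tmp - co)


def ycbcr_to_rgb(p: Sample) -> Sample:
    y, cb, cr = p
    return (y + 1.402 * cr, y - 0.344136 * cb - 0.714136 * cr, y + 1.772 * cb)


# Hypothetical mapping from the signalled index to the inverse conversion.
INVERSE_CONVERSIONS: Dict[int, Callable[[Sample], Sample]] = {
    0: ycocg_to_rgb,
    1: ycbcr_to_rgb,
}


def clip8(v: float) -> int:
    """Round and clamp a component to the 8-bit display range."""
    return max(0, min(255, round(v)))


def decode_units(stream: List[Tuple[int, List[Sample]]]) -> List[List[Rgb]]:
    """Convert every unit back to RGB using its own signalled color space."""
    output = []
    for space_index, samples in stream:
        to_rgb = INVERSE_CONVERSIONS[space_index]
        output.append([tuple(clip8(c) for c in to_rgb(px)) for px in samples])
    return output


# Two one-pixel units, each coded in a different color space.
stream = [(0, [(120.0, 80.0, 0.0)]), (1, [(128.0, 0.0, 0.0)])]
decoded = decode_units(stream)
```

      Because the index travels with each unit, neighbouring units may use different color spaces and still be restored to the same standard space for display.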
      The technical scheme provided by the invention can flexibly set the related information of the color space in the rate distortion optimization process of video and image coding, and can adaptively select the most appropriate color space representation for coding according to different video image contents. Meanwhile, color space selection can be carried out according to the picture block level, the frame level, the GOP level and the video sequence level as required, the color space is customized, and the potential of video compression can be effectively mined; meanwhile, the size of the candidate color space list can be freely regulated according to specific equipment performance, and the method has strong flexibility and compatibility.
      In another embodiment of the present invention, a video encoder for performing rate-distortion optimization based on multi-color space is also provided.
      The video encoder includes an initialization module and an encoding module.
      The initialization module is used for predefining a candidate color space list of size N according to an N value set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, wherein i = 0, 1, ..., N-1, and N is an integer greater than or equal to 2.
      The encoding module is used for receiving video input in the standard color space and dividing the input video into a plurality of units; for each unit, converting it into each of the N color spaces YUiVi of the candidate color space list; encoding the unit in each of the different color spaces, and recording the number of bits Bi obtained after encoding the unit in each color space YUiVi; dequantizing and inversely transforming the units encoded in the different color spaces to obtain distortion decoding units, performing color space conversion on the distortion decoding units of the different color spaces to uniformly convert them into the initial standard color space, and comparing them with the originally input unit data in the standard color space to calculate the distortion degree Di of the unit in each color space YUiVi; and calculating the rate-distortion cost Ji = Di + λBi according to the number of bits Bi and the distortion degree Di of each color space YUiVi, wherein λ is the Lagrangian factor, determining the color space with the minimum rate-distortion cost as the coding color space of the current unit, and adding information representing the color space of each unit into the coded data stream.
      In this embodiment, the specific value of N may be determined by a user according to actual requirements such as hardware cost and video content.
      In the color space YUiVi, Y denotes the luminance component, and Ui and Vi denote the two chrominance components.
      The standard color space may be any one of the color spaces predefined by international standards; in a specific configuration, a user may set the standard color space according to the video to be processed. As a typical preferred mode, the standard color space may be set to the standard RGB color space.
      The color space YUiVi in the candidate color space list comprises one or more of color spaces predefined by international standards, such as YCbCr color space and YPbPr color space; one or more of non-standard predefined color spaces, i.e., custom color spaces, such as those customized by a user based on actual scene characteristics, may also be included.
      The conversion coefficients between a color space predefined by international standards and the standard color space are set by system default, while the conversion coefficients between a non-standard predefined color space and the standard color space can be set by the user in a self-defined way.
      In this embodiment, the conversion between a non-standard predefined color space and the standard color space is also a linear mapping. Preferably, the conversion coefficients may include a first conversion matrix L1 and a second conversion matrix L2, both set by user definition, such that each component of the color space YUiVi = L1 · each component of the standard color space + L2.
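      The linear mapping with L1 and L2 can be illustrated with a short sketch. It assumes that L1 is a 3x3 matrix and L2 a 3-element offset vector, which the text suggests but does not state explicitly, and that L1 is invertible so that decoded data can be mapped back to the standard color space. The numeric values are the well-known full-range BT.601 (JFIF-style) RGB to YCbCr coefficients, chosen here only as an example; they are not taken from the patent.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def forward(l1: Matrix, l2: Sequence[float], rgb: Sequence[float]) -> List[float]:
    """Candidate-space components = L1 . standard-space components + L2."""
    return [sum(l1[r][c] * rgb[c] for c in range(3)) + l2[r] for r in range(3)]


def invert3(m: Matrix) -> Matrix:
    """Invert a 3x3 matrix via its adjugate; raises if the matrix is singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("L1 must be invertible to map back to the standard space")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def inverse(l1: Matrix, l2: Sequence[float], yuv: Sequence[float]) -> List[float]:
    """Standard-space components = L1^-1 . (candidate components - L2)."""
    inv = invert3(l1)
    shifted = [yuv[k] - l2[k] for k in range(3)]
    return [sum(inv[r][c] * shifted[c] for c in range(3)) for r in range(3)]


# Example coefficients: full-range BT.601 YCbCr (illustrative, not from the patent).
L1 = [[0.299, 0.587, 0.114],
      [-0.168736, -0.331264, 0.5],
      [0.5, -0.418688, -0.081312]]
L2 = [0.0, 128.0, 128.0]

rgb = [200.0, 120.0, 40.0]
yuv = forward(L1, L2, rgb)
back = inverse(L1, L2, yuv)
white = forward(L1, L2, [255.0, 255.0, 255.0])
```

      A user-defined color space would supply its own L1 and L2 in place of the example values; the round trip only works when L1 has a nonzero determinant.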
      In this embodiment, the unit may be a block-level picture, and at this time, an image frame of an input video is divided into coding blocks of a certain size, and color space selection is performed for each coding block.
      Or, the unit is a frame level picture, and color space selection is performed for each image frame.
      Or, the unit is at GOP level, and color space selection is performed for each group of pictures.
      Or, the unit is at a video sequence level, and color space selection is performed for each video sequence.
      Other technical features are referred to in the previous embodiments and are not described herein.
      Another embodiment of the present invention further provides a video codec.
      The video codec includes an encoder and a decoder.
      The encoder is used for predefining a candidate color space list of size N according to an N value set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, wherein i = 0, 1, ..., N-1, and N is an integer greater than or equal to 2; receiving video input in the standard color space and dividing the input video into a plurality of units; for each unit, converting it into each of the N color spaces YUiVi of the candidate color space list; encoding the unit in each of the different color spaces, and recording the number of bits Bi obtained after encoding the unit in each color space YUiVi; dequantizing and inversely transforming the units encoded in the different color spaces to obtain distortion decoding units, performing color space conversion on the distortion decoding units of the different color spaces to uniformly convert them into the initial standard color space, and comparing them with the originally input unit data in the standard color space to calculate the distortion degree Di of the unit in each color space YUiVi; and calculating the rate-distortion cost Ji = Di + λBi according to the number of bits Bi and the distortion degree Di of each color space YUiVi, wherein λ is the Lagrangian factor, determining the color space with the minimum rate-distortion cost as the coding color space of the current unit, and adding information representing the color space of each unit into the coded data stream.
      The decoder is used for extracting information representing the color space of each unit from the data stream, and converting the video image data back to the standard color space according to the corresponding color space for output and display according to the color space information recorded by each unit.
      Other technical features of the encoder and the decoder are referred to in the previous embodiments and will not be described herein.
      In the foregoing description, the disclosure of the present invention is not intended to limit itself to these aspects. Rather, the various components may be selectively and operatively combined in any number within the intended scope of the present disclosure. In addition, terms like "comprising," "including," and "having" should be interpreted as inclusive or open-ended, rather than exclusive or closed-ended, by default, unless explicitly defined to the contrary. All technical, scientific, or other terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. Common terms found in dictionaries should not be interpreted too ideally or too realistically in the context of related art documents unless the present disclosure expressly limits them to that. Any changes and modifications of the present invention based on the above disclosure will be within the scope of the appended claims.
    Claims (10)
1. A video processing method for rate-distortion optimization based on multi-color space is characterized by comprising the following encoding steps:
      according to an N value set by a user, predefining a candidate color space list of size N, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, wherein i = 0, 1, ..., N-1, and N is an integer greater than or equal to 2;
      receiving a video input in the standard color space, and dividing the input video into a plurality of units;
      for each unit, converting it into each of the N color spaces YUiVi of the candidate color space list; encoding the unit in each of the different color spaces, and recording the number of bits Bi obtained after encoding the unit in each color space YUiVi; dequantizing and inversely transforming the units encoded in the different color spaces to obtain distortion decoding units, performing color space conversion on the distortion decoding units of the different color spaces to uniformly convert them into the initial standard color space, and comparing them with the originally input unit data in the standard color space to calculate the distortion degree Di of the unit in each color space YUiVi; and calculating a rate-distortion cost Ji = Di + λBi according to the number of bits Bi and the distortion degree Di of each color space YUiVi, determining the color space with the minimum rate-distortion cost as the coding color space of the current unit, and adding information representing the color space of each unit into the coded data stream.
    2. The video processing method of claim 1, wherein: when decoding, the information representing the color space of each unit is extracted from the data stream, and the video image data of each unit is converted from its recorded color space back to the standard color space according to that color space information, and is output and displayed.
    3. The video processing method of claim 1, wherein: the unit is a block-level picture, an image frame of the input video is divided into coding blocks of a certain size, and color space selection is performed for each coding block.
    4. The video processing method of claim 1, wherein: the unit is a frame-level picture, and color space selection is performed for each image frame.
    5. The video processing method of claim 1, wherein: the unit is at the GOP level, and color space selection is performed for each group of pictures.
    6. The video processing method of claim 1, wherein: the unit is at the video sequence level, and color space selection is performed for each video sequence.
    7. The video processing method of claim 1, wherein: the standard color space is any one of color spaces predefined by international standards;
      the color spaces YUiVi in the candidate color space list comprise one or more color spaces predefined by international standards and one or more non-standard predefined color spaces; the conversion coefficients between a color space predefined by international standards and the standard color space are set by system default, and the conversion coefficients between a non-standard predefined color space and the standard color space are set by user definition.
    8. The video processing method according to claim 7, wherein: the conversion between the non-standard predefined color space and the standard color space is a linear mapping, in which case the conversion coefficients include a first conversion matrix L1 and a second conversion matrix L2, the first conversion matrix L1 and the second conversion matrix L2 being set by user definition, and each component of the color space YUiVi = L1 · each component of the standard color space + L2.
    9. A video encoder for rate-distortion optimization based on multi-color space, comprising:
      an initialization module for predefining a candidate color space list of size N according to an N value set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, wherein i = 0, 1, ..., N-1, and N is an integer greater than or equal to 2;
      an encoding module for receiving video input in the standard color space and dividing the input video into a plurality of units; for each unit, converting it into each of the N color spaces YUiVi of the candidate color space list; encoding the unit in each of the different color spaces, and recording the number of bits Bi obtained after encoding the unit in each color space YUiVi; dequantizing and inversely transforming the units encoded in the different color spaces to obtain distortion decoding units, performing color space conversion on the distortion decoding units of the different color spaces to uniformly convert them into the initial standard color space, and comparing them with the originally input unit data in the standard color space to calculate the distortion degree Di of the unit in each color space YUiVi; and calculating a rate-distortion cost Ji = Di + λBi according to the number of bits Bi and the distortion degree Di of each color space YUiVi, determining the color space with the minimum rate-distortion cost as the coding color space of the current unit, and adding information representing the color space of each unit into the coded data stream.
    10. A video codec for rate-distortion optimization based on multi-color space, comprising:
      an encoder for predefining a candidate color space list of size N according to an N value set by a user, wherein the candidate color space list comprises N color spaces YUiVi, and conversion coefficient information is set between each color space YUiVi and a preset standard color space, wherein i = 0, 1, ..., N-1, and N is an integer greater than or equal to 2; receiving a video input in the standard color space and dividing the input video into a plurality of units; for each unit, converting it into each of the N color spaces YUiVi of the candidate color space list; encoding the unit in each of the different color spaces, and recording the number of bits Bi obtained after encoding the unit in each color space YUiVi; dequantizing and inversely transforming the units encoded in the different color spaces to obtain distortion decoding units, performing color space conversion on the distortion decoding units of the different color spaces to uniformly convert them into the initial standard color space, and comparing them with the originally input unit data in the standard color space to calculate the distortion degree Di of the unit in each color space YUiVi; and calculating a rate-distortion cost Ji = Di + λBi according to the number of bits Bi and the distortion degree Di of each color space YUiVi, determining the color space with the minimum rate-distortion cost as the coding color space of the current unit, and adding information representing the color space of each unit into the coded data stream, wherein λ is the Lagrangian factor;
      and the decoder is used for extracting the information which represents the color space of each unit from the data stream, and converting the video image data back to the standard color space according to the corresponding color space for output and display according to the color space information recorded by each unit.
    Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202111486783.5A CN114205586A (en) | 2021-12-07 | 2021-12-07 | Video processing method for carrying out rate distortion optimization based on multi-color space and application | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| CN114205586A true CN114205586A (en) | 2022-03-18 | 
Family
ID=80651177
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN202111486783.5A Pending CN114205586A (en) | 2021-12-07 | 2021-12-07 | Video processing method for carrying out rate distortion optimization based on multi-color space and application | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN114205586A (en) | 
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN116347088A (en) * | 2023-02-20 | 2023-06-27 | 北京达佳互联信息技术有限公司 | Method, device, electronic equipment and storage medium for determining encoding mode | 
| CN117579839A (en) * | 2024-01-15 | 2024-02-20 | 电子科技大学 | An image compression method based on rate-distortion optimized color space conversion matrix | 
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN101356825A (en) * | 2006-01-13 | 2009-01-28 | 弗劳恩霍夫应用研究促进协会 | Picture coding using adaptive color space transform | 
| CN105264888A (en) * | 2014-03-04 | 2016-01-20 | 微软技术许可有限责任公司 | Coding strategy for adaptive switching of color space, color sampling rate and/or bit depth | 
- 
        2021
        - 2021-12-07 CN CN202111486783.5A patent/CN114205586A/en active Pending
 
Non-Patent Citations (2)
| Title | 
|---|
| WEI DAI等: "RCE1: Adaptive Color Transforms For Range Extensions", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG16 WP3 AND ISO/IEC JTC1/SC29/WG11 12TH MEETING: INCHEON, KR, 18-28 APRIL, 2013,JCTVC-M0048-R1, pages 1 - 4 * | 
| WEI DAI等: "RCE1: Adaptive Color Transforms For Range Extensions", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG16 WP3 AND ISO/IEC JTC1/SC29/WG11 12TH MEETING: INCHEON, KR, 18-28 APRIL, 2013,JCTVC-M0048-V4, pages 1 - 4 * | 
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN116347088A (en) * | 2023-02-20 | 2023-06-27 | 北京达佳互联信息技术有限公司 | Method, device, electronic equipment and storage medium for determining encoding mode | 
| CN116347088B (en) * | 2023-02-20 | 2025-08-26 | 北京达佳互联信息技术有限公司 | Method, device, electronic device and storage medium for determining encoding mode | 
| CN117579839A (en) * | 2024-01-15 | 2024-02-20 | 电子科技大学 | An image compression method based on rate-distortion optimized color space conversion matrix | 
| CN117579839B (en) * | 2024-01-15 | 2024-03-22 | 电子科技大学 | An image compression method based on rate-distortion optimized color space conversion matrix | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||