CN106454354A - AVS2 parallel encoding processing system and method - Google Patents

Info

Publication number
CN106454354A
Authority
CN
China
Prior art keywords
thread
stripes
avs2
slice
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610808832.5A
Other languages
Chinese (zh)
Other versions
CN106454354B (en)
Inventor
梁凡
曾昊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN201610808832.5A
Publication of CN106454354A
Application granted
Publication of CN106454354B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N19/174: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a slice, e.g. a line of blocks or a group of blocks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation, using parallelised computational arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an AVS2 parallel encoding processing system and method. The system comprises an encoding unit that partitions a frame image into slices, with the slice as the basic encoding unit, and then encodes the resulting slices in parallel. The method comprises the step of partitioning a frame image into slices, with the slice as the basic encoding unit, and then encoding the resulting slices in parallel. Using the AVS2 encoding process of the invention greatly improves AVS2 encoding efficiency and meets the needs of AVS2 real-time encoding. As an AVS2 parallel encoding processing system and method, the invention can be widely applied in the field of audio and video encoding.

Description

An AVS2 parallel encoding processing system and method

Technical field

The invention relates to audio and video encoding and decoding technology, and in particular to an AVS2 parallel encoding processing system and method.

Background

Explanation of technical terms:

TLS: short for Thread Local Storage.

AVS2 is the new-generation audio and video coding standard developed independently in China as the successor to AVS, and is the short name of the "Information Technology - High Efficiency Multimedia Coding" standard. Its goal is that, under conditions achievable with mainstream technology and at equal subjective quality of the reconstructed video, AVS2 should be at least twice as efficient as the best performance of AVS1 when encoding high-definition or higher-resolution video, and more efficient than the latest international standard HEVC/H.265 under mainstream encoding configurations. For conventional video, the coding efficiency of AVS2 is comparable to that of HEVC/H.265 and nearly double that of the international standard H.264/AVC and the first-generation national standard AVS1; for scene video such as surveillance footage, the compression efficiency of AVS2 is four times that of H.264/AVC. As China's second-generation audio and video codec standard with independent intellectual property rights, AVS2 therefore directly affects China's core competitiveness in the international video field, bears on China's future strategic deployment in the information field, and is of great significance for the rapid development of China's information industry. However, because AVS2 introduces many of the latest video coding techniques, its encoding complexity is also markedly higher, which poses new challenges for real-time encoding.

According to the AVS2 standard, an AVS2 encoder encodes a frame image serially, with the LCU (largest coding unit) as the basic unit. For each LCU, the encoder first initializes the variables related to the current LCU and then checks whether the LCU belongs to a new slice. If it does, the encoder initializes the variables related to the current slice and writes the slice header information into the bitstream, and only then encodes the current LCU. Entropy coding finally produces the bitstream to be transmitted, which includes slice header data, residual information, partition information, and so on. While the bitstream is being stored, it is appended continuously to a single total bitstream memory; once all LCUs have been encoded, the total bitstream is output and the frame image is complete. The traditional AVS2 encoding process therefore suffers from low processing efficiency and cannot meet the requirements of AVS2 real-time encoding.
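The serial flow described above can be sketched as follows. This is a minimal illustration in Python, not the reference encoder: the function name `encode_frame_serial`, the list of per-LCU slice ids, and the string tokens that stand in for encoded data are all assumptions made for the example.

```python
def encode_frame_serial(lcu_slice_ids):
    """Serial baseline: walk all LCUs of a frame in order.

    lcu_slice_ids[i] is the (hypothetical) id of the slice that LCU i belongs to.
    Returns the single total bitstream and the number of "new slice?" checks made.
    """
    total_bitstream = []   # one shared store, appended to continuously
    current_slice = None
    slice_checks = 0
    for lcu_index, slice_id in enumerate(lcu_slice_ids):
        slice_checks += 1                  # the check is repeated for every LCU
        if slice_id != current_slice:      # LCU starts a new slice
            current_slice = slice_id
            total_bitstream.append(f"slice_header({slice_id})")
        total_bitstream.append(f"lcu({lcu_index})")
    return total_bitstream, slice_checks
```

Every LCU pays for the slice check, and every write goes to the same store, which is what keeps this loop serial.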

Summary of the invention

To solve the above technical problem, an object of the present invention is to provide an AVS2 parallel encoding processing system with high encoding efficiency.

Another object of the present invention is to provide an AVS2 parallel encoding processing method with high encoding efficiency.

The technical solution adopted by the present invention is an AVS2 parallel encoding processing system, comprising:

An encoding unit, configured to partition a frame image into slices, with the slice as the basic encoding unit, and then encode the resulting slices in parallel.

Further, the encoding unit comprises:

A partitioning module, configured to partition the frame image into slices, with the slice as the basic encoding unit, so as to obtain the slices of the frame image;

An encoding control module, configured to place all slices of the current frame image into a task queue in order, and then use a thread pool to encode the slices in parallel.

Further, using the thread pool to encode the slices in parallel specifically comprises:

Each time a slice is placed in the task queue, an idle worker thread in the thread pool is woken up and encodes the slice that has just been placed in the task queue;

When the task queue holds an unencoded slice and all worker threads are busy, either a new worker thread is created to encode that slice, or the slice waits in the task queue until a worker thread finishes its task and returns to the thread pool, and that worker thread then encodes it;

When a worker thread finishes encoding a slice, it returns to the thread pool and increments the count of encoded slices by 1. It then checks whether the task queue holds an unencoded slice: if so, the worker thread encodes that slice; otherwise, the worker thread is put into the blocked state and waits for the next task;

When the number of encoded slices equals the total number of slices in the frame image, the main thread is woken up to carry out the serial data processing.
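The scheduling steps above can be sketched with Python's standard `queue` and `threading` modules. This is a sketch under stated assumptions rather than the patented implementation: it uses a fixed number of workers (the "wait in the queue" option, not the "create a new worker" option), and `encode_slices_parallel`, `encode_slice`, and `num_workers` are hypothetical names. In CPython the global interpreter lock generally keeps pure-Python CPU-bound work from running in parallel, so the example shows the control flow only.

```python
import queue
import threading

def encode_slices_parallel(slices, encode_slice, num_workers=4):
    """Encode every slice with a fixed pool of worker threads."""
    tasks = queue.Queue()               # task queue of (index, slice) pairs
    results = [None] * len(slices)
    done_count = 0                      # number of slices already encoded
    lock = threading.Lock()
    all_done = threading.Event()        # used to wake the main thread

    def worker():
        nonlocal done_count
        while True:
            item = tasks.get()          # blocks while the queue is empty
            if item is None:            # sentinel: shut this worker down
                return
            index, data = item
            results[index] = encode_slice(data)
            with lock:
                done_count += 1
                if done_count == len(slices):
                    all_done.set()      # every slice is encoded

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for index, data in enumerate(slices):
        tasks.put((index, data))        # each put wakes one blocked worker
    if slices:
        all_done.wait()                 # main thread sleeps until the count matches
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return results                      # in slice order, ready for serial processing
```

For example, `encode_slices_parallel([1, 2, 3], lambda s: s * 2, num_workers=2)` returns the per-slice results in slice order regardless of which worker finished first.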

Further, encoding a slice specifically comprises:

Storing the slice header information of the slice, and then encoding the LCUs in the slice one by one until all LCUs in the slice have been encoded.

Further, using the thread pool to encode the slices in parallel also comprises:

When a worker thread needs to access a global variable, it operates on its own copy of that global variable, and the global variable is accessed in this way.
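The per-thread copy of global variables corresponds to thread local storage (TLS). A minimal sketch using Python's `threading.local` is shown below; the contents of `GLOBAL_STATE` and the helper names `thread_state` and `encode_lcus` are hypothetical and only illustrate the idea.

```python
import copy
import threading

# Hypothetical encoder-wide variables that the serial code treated as globals.
GLOBAL_STATE = {"qp": 32, "lcu_count": 0}

_tls = threading.local()   # each thread sees its own attributes on this object

def thread_state():
    """Return the calling thread's private copy of the globals, creating it on first use."""
    if not hasattr(_tls, "state"):
        _tls.state = copy.deepcopy(GLOBAL_STATE)
    return _tls.state

def encode_lcus(num_lcus, out, key):
    """Stand-in for slice encoding: updates only the thread-private copy."""
    state = thread_state()
    for _ in range(num_lcus):
        state["lcu_count"] += 1    # no lock needed, the copy is not shared
    out[key] = state["lcu_count"]
```

Because each worker mutates only its own copy, the workers need no locking around these variables and the original globals stay unchanged.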

Further, the system also comprises a bitstream buffer unit, which comprises one total bitstream memory and a plurality of sub-bitstream memories;

The total bitstream memory is configured to store the picture header information of the frame image;

Each sub-bitstream memory is configured to store the slice header information of a slice and the encoded information of all LCUs in that slice.

Another technical solution adopted by the present invention is an AVS2 parallel encoding processing method, comprising:

Partitioning a frame image into slices, with the slice as the basic encoding unit, and then encoding the resulting slices in parallel.

Further, the step of partitioning the frame image into slices, with the slice as the basic encoding unit, and then encoding the resulting slices in parallel specifically comprises:

Partitioning the frame image into slices, with the slice as the basic encoding unit, so as to obtain the slices of the frame image;

Placing all slices of the current frame image into a task queue in order, and then using a thread pool to encode the slices in parallel.
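The text does not say how a frame is divided into slices. As one possible illustration, the sketch below splits the LCU indices of a frame into contiguous, nearly equal slices; the function name `partition_into_slices` and the even-split policy are assumptions, and a real encoder would have to follow the slice layout permitted by the AVS2 standard.

```python
def partition_into_slices(num_lcus, num_slices):
    """Split LCU indices 0..num_lcus-1 into num_slices contiguous ranges.

    The first (num_lcus % num_slices) slices receive one extra LCU.
    """
    base, extra = divmod(num_lcus, num_slices)
    slices = []
    start = 0
    for i in range(num_slices):
        size = base + (1 if i < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices
```

Each returned range can then be placed in the task queue as one unit of work.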

Further, the step of using the thread pool to encode the slices in parallel specifically comprises:

Each time a slice is placed in the task queue, an idle worker thread in the thread pool is woken up and encodes the slice that has just been placed in the task queue;

When the task queue holds an unencoded slice and all worker threads are busy, either a new worker thread is created to encode that slice, or the slice waits in the task queue until a worker thread finishes its task and returns to the thread pool, and that worker thread then encodes it;

When a worker thread finishes encoding a slice, it returns to the thread pool and increments the count of encoded slices by 1. It then checks whether the task queue holds an unencoded slice: if so, the worker thread encodes that slice; otherwise, the worker thread is put into the blocked state and waits for the next task;

When the number of encoded slices equals the total number of slices in the frame image, the main thread is woken up to carry out the serial data processing.

Further, the step of using the thread pool to encode the slices in parallel also comprises:

When a worker thread needs to access a global variable, it operates on its own copy of that global variable, and the global variable is accessed in this way.

The beneficial effect of the present invention is as follows. The system partitions a frame image into slices, with the slice as the basic encoding unit, and then encodes the resulting slices in parallel to encode the frame image. Compared with traditional AVS2 encoding, the system therefore does not need to check, for every LCU during encoding, whether that LCU belongs to a new slice, which eliminates a large number of repeated redundant steps and greatly improves the efficiency of AVS2 encoding. In addition, the system encodes multiple slices in a data-parallel manner, so compared with the traditional serial approach its data processing efficiency is higher and it is better able to meet the needs of AVS2 real-time encoding.

Another beneficial effect of the present invention is as follows. The method partitions a frame image into slices, with the slice as the basic encoding unit, and then encodes the resulting slices in parallel to encode the frame image. Compared with traditional AVS2 encoding, the method therefore does not need to check, for every LCU during encoding, whether that LCU belongs to a new slice, which eliminates a large number of repeated redundant steps. Because the method also encodes the slices in parallel, using it greatly improves AVS2 encoding efficiency and meets the needs of AVS2 real-time encoding.

Brief description of the drawings

Specific embodiments of the present invention are further described below with reference to the accompanying drawings:

Fig. 1 is a structural block diagram of an AVS2 parallel encoding processing system of the present invention;

Fig. 2 is a flow chart of the steps of an AVS2 parallel encoding processing method of the present invention.

Detailed description

As shown in Fig. 1, an AVS2 parallel encoding processing system comprises:

An encoding unit, configured to partition a frame image into slices, with the slice as the basic encoding unit, and then encode the resulting slices in parallel. The encoding unit is preferably applied in an AVS2 encoder.

As a preferred implementation of the system of this embodiment, the encoding unit comprises:

A partitioning module, configured to partition the frame image into slices, with the slice as the basic encoding unit, so as to obtain the slices of the frame image;

An encoding control module, configured to place all slices of the current frame image into a task queue in order, and then use a thread pool to encode the slices in parallel.

As a preferred implementation of the system of this embodiment, using the thread pool to encode the slices in parallel specifically comprises:

Each time a slice is placed in the task queue, an idle worker thread in the thread pool is woken up and encodes the slice that has just been placed in the task queue;

When the task queue holds an unencoded slice and all worker threads are busy, either a new worker thread is created to encode that slice, or the slice waits in the task queue until a worker thread finishes its task and returns to the thread pool, and that worker thread then encodes it;

When a worker thread finishes encoding a slice, it returns to the thread pool and increments the count of encoded slices by 1. It then checks whether the task queue holds an unencoded slice: if so, the worker thread encodes that slice; otherwise, the worker thread is put into the blocked state and waits for the next task;

When the number of encoded slices equals the total number of slices in the frame image, the main thread is woken up to carry out the serial data processing.

As a preferred implementation of the system of this embodiment, encoding a slice specifically comprises:

Storing the slice header information of the slice, and then encoding the LCUs in the slice one by one until all LCUs in the slice have been encoded.

As a preferred implementation of the system of this embodiment, using the thread pool to encode the slices in parallel also comprises:

When a worker thread needs to access a global variable, it operates on its own copy of that global variable, and the global variable is accessed in this way.

As a preferred implementation of the system of this embodiment, using the thread pool to encode the slices in parallel also comprises:

Creating a thread pool and creating a plurality of worker threads in it, where the number of worker threads usually defaults to the number of cores of the current processor; and then placing the worker threads in an idle thread queue.

Preferably, a worker thread is put into the blocked state when it has no task to execute and into the schedulable state when it has a task to execute. A worker thread with no work item therefore consumes no processor resources and only a small amount of memory, which greatly improves operational flexibility.
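A pool sized to the processor's core count can be approximated with Python's standard `concurrent.futures.ThreadPoolExecutor`, as in the sketch below. The helper names `default_worker_count` and `encode_frame` are hypothetical. One difference from the text should be noted: `ThreadPoolExecutor` starts its threads on demand, up to `max_workers`, rather than creating them all up front, although its idle workers likewise block on an internal queue until work arrives.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def default_worker_count():
    """Default pool size: the number of processor cores (1 if it cannot be determined)."""
    return os.cpu_count() or 1

def encode_frame(slices, encode_slice, pool):
    """Submit every slice of one frame to the pool and collect results in slice order."""
    return list(pool.map(encode_slice, slices))

# Usage: create the pool once and reuse it for every frame, so that no
# thread is created or destroyed per slice.
if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=default_worker_count()) as pool:
        print(encode_frame([b"slice0", b"slice1"], len, pool))
```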

As a preferred implementation of the system of this embodiment, the system also comprises a bitstream buffer unit, which comprises one total bitstream memory and a plurality of sub-bitstream memories;

The total bitstream memory is configured to store the picture header information of the frame image;

Each sub-bitstream memory is configured to store the slice header information of a slice and the encoded information of all LCUs in that slice. After all slices have been encoded, the bitstreams stored in the sub-bitstream memories are merged into the total bitstream in slice order, and the resulting bitstream is the bitstream of the encoded frame image.

Preferably, the number of sub-bitstream memories is N, and the memory allocated to each sub-bitstream memory is 1/N of the memory allocated to the total bitstream memory.
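The buffer layout above can be sketched as a small class. This is an illustration only: the class name `BitstreamBuffers` and its methods are hypothetical, and because Python byte arrays grow dynamically, the 1/N allocation is modelled as a capacity limit rather than a real preallocation.

```python
class BitstreamBuffers:
    """One total bitstream store plus N per-slice sub-stores."""

    def __init__(self, total_capacity, num_slices):
        self.total_stream = bytearray()
        # Each sub-store is allowed 1/N of the total store's capacity.
        self.sub_capacity = total_capacity // num_slices
        self.sub_streams = [bytearray() for _ in range(num_slices)]

    def write_picture_header(self, data):
        """The picture header goes straight into the total store."""
        self.total_stream += data

    def write_slice(self, slice_index, data):
        """Slice header and LCU data go into that slice's own sub-store."""
        sub = self.sub_streams[slice_index]
        if len(sub) + len(data) > self.sub_capacity:
            raise OverflowError("sub-bitstream store is full")
        sub += data

    def merge(self):
        """After all slices are encoded, append the sub-stores in slice order."""
        for sub in self.sub_streams:
            self.total_stream += sub
        return bytes(self.total_stream)
```

Since every worker writes only to the sub-store of its own slice, the order in which slices finish does not affect the merged result.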

As shown in Fig. 2, an AVS2 parallel encoding processing method comprises:

Partitioning a frame image into slices, with the slice as the basic encoding unit, and then encoding the resulting slices in parallel.

As a preferred implementation of the method of this embodiment, the step of partitioning the frame image into slices, with the slice as the basic encoding unit, and then encoding the resulting slices in parallel specifically comprises:

Partitioning the frame image into slices, with the slice as the basic encoding unit, so as to obtain the slices of the frame image;

Placing all slices of the current frame image into a task queue in order, and then using a thread pool to encode the slices in parallel.

As a preferred implementation of the method of this embodiment, the step of using the thread pool to encode the slices in parallel specifically comprises:

Each time a slice is placed in the task queue, an idle worker thread in the thread pool is woken up and encodes the slice that has just been placed in the task queue;

When the task queue holds an unencoded slice and all worker threads are busy, either a new worker thread is created to encode that slice, or the slice waits in the task queue until a worker thread finishes its task and returns to the thread pool, and that worker thread then encodes it;

When a worker thread finishes encoding a slice, it returns to the thread pool and increments the count of encoded slices by 1. It then checks whether the task queue holds an unencoded slice: if so, the worker thread encodes that slice; otherwise, the worker thread is put into the blocked state and waits for the next task;

When the number of encoded slices equals the total number of slices in the frame image, the main thread is woken up to carry out the serial data processing.

As a preferred implementation of the method of this embodiment, encoding a slice specifically comprises:

Storing the slice header information of the slice, and then encoding the LCUs in the slice one by one until all LCUs in the slice have been encoded.

As a preferred implementation of the method of this embodiment, the step of using the thread pool to encode the slices in parallel also comprises:

When a worker thread needs to access a global variable, it operates on its own copy of that global variable, and the global variable is accessed in this way.

As a preferred implementation of the method of this embodiment, the step of using the thread pool to encode the slices in parallel also comprises:

Creating a thread pool and creating a plurality of worker threads in it, where the number of worker threads usually defaults to the number of cores of the current processor; and then placing the worker threads in an idle thread queue.

A specific embodiment of the present invention

In the AVS2 standard, when a frame image is encoded, each slice initializes its binary symbol model and entropy encoder before its encoding starts, and each LCU in a slice refers only to other LCUs in the same slice during encoding and does not use data from other slices in the frame image. The encoding of the slices in a frame image is therefore mutually independent, so parallel encoding with the slice as the basic encoding unit can quickly increase the encoding speed.

The AVS2 video parallel encoding optimization algorithm based on the slice as the basic encoding unit comprises the following four parts:

I. Architecture design of the parallel encoder

To implement slice-based AVS2 parallel encoding, the encoder framework must first encode with the slice as the basic encoding unit, which requires adjusting the original encoder framework that uses the LCU as the basic unit. The adjusted encoder encodes a frame image with the slice as the basic encoding unit: after all LCUs in one slice have been encoded, it encodes the next slice, until all slices in the frame image have been encoded. When encoding a slice, the encoder first initializes the slice and stores its slice header information, so it no longer needs to check for every LCU whether that LCU belongs to a new slice. After the slice has been initialized, the LCUs in the slice are encoded one by one until all LCUs in the slice have been encoded.

As can be seen from the above, the encoding system in this embodiment may specifically be an AVS2 encoder. The encoder contains an encoding unit, which is configured to partition a frame image into slices, with the slice as the basic encoding unit, and then encode the resulting slices in parallel;

Here, encoding a slice specifically comprises: initializing the slice and storing its slice header information, and then encoding the LCUs in the slice one by one until all LCUs in the slice have been encoded.
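The adjusted, slice-based flow of this part can be sketched as follows. The function names `encode_slice` and `encode_frame_by_slices` and the string tokens that stand in for encoded data are assumptions made for the example; the loop over slices is written serially here to show the structure, and it is the per-slice function that becomes the unit of parallel work.

```python
def encode_slice(slice_id, lcu_indices):
    """Encode one slice: store its header once, then encode its LCUs in order."""
    sub_stream = [f"slice_header({slice_id})"]   # slice initialization and header
    for lcu_index in lcu_indices:                # no per-LCU "new slice?" check
        sub_stream.append(f"lcu({lcu_index})")
    return sub_stream

def encode_frame_by_slices(slices):
    """Encode a frame slice by slice; slices[i] lists the LCU indices of slice i."""
    frame_stream = []
    for slice_id, lcu_indices in enumerate(slices):
        frame_stream.extend(encode_slice(slice_id, lcu_indices))
    return frame_stream
```

Because `encode_slice` touches only its own arguments and its own output list, separate calls are independent of one another.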

二、码流缓冲区设计2. Stream buffer design

在调整后的并行编码框架基础上,需要对码流缓冲区进行调整,在编码一帧含有N个条带的图像时,编码器在码流缓冲区建立1个总码流存储器(GlobalBitStream)和N个子码流存储器(SliceBitStream[i]),并为各个存储器分配一定的内存空间,其中子码流存储器被分配到的内存大小为总码流存储器被分配到的内存大小的1/N。在编码过程中,首先把图像头信息存储到总码流存储器中,然后在编码每一个条带时,把各条带的条带头信息和条带内所有LCU的编码信息存储在码流缓冲区内对应的子码流存储器中,即相当于一子码流存储中存储一条带的条带头信息及该条带内所有LCU的编码信息。待所有条带编码完成后,再把各个子码流存储器中的码流按照条带顺序依次合并到总码流中,则最终得到的码流就是编码一帧图像的码流。On the basis of the adjusted parallel encoding framework, the code stream buffer needs to be adjusted. When encoding a frame of images containing N strips, the encoder establishes a total code stream memory (GlobalBitStream) and N sub-bitstream memories (SliceBitStream[i]), and allocate a certain memory space for each memory, wherein the memory size allocated to the sub-bitstream memory is 1/N of the memory size allocated to the total bitstream memory. In the encoding process, the image header information is first stored in the total code stream memory, and then when encoding each slice, the slice header information of each slice and the encoding information of all LCUs in the slice are stored in the code stream buffer The slice header information of a slice and the encoding information of all LCUs in the slice are stored in the corresponding sub-stream memory, which is equivalent to a sub-stream storage. After all the strips are encoded, the code streams in each sub-stream memory are merged into the total code stream according to the order of the stripes, and the finally obtained code stream is the code stream for encoding one frame of image.

As can be seen from the above, the encoder in this embodiment further includes a bitstream buffer unit, which comprises one global bitstream memory and multiple sub-bitstream memories;

The global bitstream memory is used to store the picture header information of the frame image;

Each sub-bitstream memory is used to store the slice header information of a slice and the encoded information of all LCUs in that slice.

Here, each sub-bitstream memory is allocated 1/N of the memory allocated to the global bitstream memory, where N is the total number of sub-bitstream memories.
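The buffer layout described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the `BitStream` struct and the helper names `bs_init`, `bs_append`, `bs_free` and `merge_slices` are hypothetical, since the text names the buffers (GlobalBitStream, SliceBitStream[i]) but not their layout or API. The sketch shows the key property: slices may finish in any order, yet the merge walks the sub-bitstreams in slice order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte buffer standing in for GlobalBitStream / SliceBitStream[i]. */
typedef struct {
    unsigned char *data;
    size_t len;   /* bytes written so far */
    size_t cap;   /* bytes allocated */
} BitStream;

static int bs_init(BitStream *bs, size_t cap) {
    bs->data = malloc(cap);
    bs->len = 0;
    bs->cap = bs->data ? cap : 0;
    return bs->data ? 0 : -1;
}

/* Append n bytes; returns -1 instead of writing past the allocated size. */
static int bs_append(BitStream *bs, const void *src, size_t n) {
    if (n > bs->cap - bs->len) return -1;
    memcpy(bs->data + bs->len, src, n);
    bs->len += n;
    return 0;
}

static void bs_free(BitStream *bs) {
    free(bs->data);
    bs->data = NULL;
    bs->len = bs->cap = 0;
}

/* Serial step run by the main thread after all slices are encoded:
 * copy each sub-bitstream into the global one in slice order. */
static int merge_slices(BitStream *global, const BitStream *slices, size_t n) {
    assert(global != NULL && slices != NULL);
    for (size_t i = 0; i < n; i++)
        if (bs_append(global, slices[i].data, slices[i].len) != 0)
            return -1;
    return 0;
}
```

Because every worker writes only to its own sub-bitstream, no locking is needed while slices are being encoded; only the final merge is serial.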

3. Thread pool design

The thread pool in this embodiment is a form of multithreaded processing in which threads are created in advance. Using a thread pool to encode multiple slices in parallel saves the CPU time that would otherwise be spent repeatedly creating and destroying a thread for every slice. Specifically, the steps of using the thread pool to encode multiple slices in parallel are:

1. After the AVS2 encoder starts, a thread pool is created first, and a certain number of worker threads are created in it (by default, as many as the current processor has cores). These worker threads are placed in an idle-thread queue and are kept in the blocked state while there are no work items. A worker thread with nothing to do therefore uses no processor time and only a small amount of memory; it becomes schedulable only when a work item arrives;

2. When the main program reaches the point of encoding a frame, it first places all slices of the current frame into a task queue one after another, QueueWork(encode_one_slice, (void *)&CurrentMbNumber). Once all slices are in the task queue, it creates an event object whose initial state is nonsignaled, WorkFinished = CreateEvent(NULL, TRUE, FALSE, NULL). The main thread then blocks, WaitForSingleObject(WorkFinished, INFINITE), until the event becomes signaled;

3. Meanwhile, each time a slice is placed in the task queue, one idle worker thread in the pool is woken up and starts encoding that slice. A worker thread invokes its task through a function pointer, with the related data passed as the argument; the task function is declared as void ThreadFun(void *context). If the task queue still holds slices that have not been encoded while every worker thread is busy, that is, no idle thread is available to encode them, the thread pool manager can either create a number of additional worker threads to take on more tasks, or leave the slices waiting in the task queue until a worker thread finishes its task and returns to the pool;

4. Each time a worker thread finishes encoding a slice, it returns to the thread pool and increments the count of completed slices by 1. A worker thread that returns to the pool does not exit: if the task queue still contains slices that have not been encoded, the worker thread goes on to encode one of them; otherwise it is put into the blocked state to wait for the next task. When the count of completed slices equals the total number of slices in the current frame, the event object WorkFinished is set to the signaled state, SetEvent(WorkFinished), which wakes the main thread. The main thread then proceeds to the serial part that follows, including data synchronization and bitstream merging;

5. After one frame has been encoded, if the video sequence has not ended, return to step 2 and continue encoding. If the video sequence has ended, destroy the threads and then the thread pool.
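The five steps above can be sketched as a small fixed-size pool. This is a minimal illustration, not the encoder's implementation: it swaps the Win32 event calls named in the text (CreateEvent, WaitForSingleObject, SetEvent) for a POSIX mutex and condition variables so that it runs outside Windows, it omits the optional growth of the pool when all workers are busy, and the names `Pool`, `pool_create`, `pool_run_frame`, `pool_destroy`, `MAX_TASKS` and `demo_encode_slice` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_TASKS 64                        /* assumed upper bound on slices per frame */

typedef void (*TaskFn)(void *context);      /* same shape as void ThreadFun(void *context) */
typedef struct { TaskFn fn; void *ctx; } Task;

typedef struct {
    pthread_t *threads;
    int nthreads;
    Task queue[MAX_TASKS];                  /* ring buffer of slices waiting to be encoded */
    int head, count;
    int done, total;                        /* finished slices / slices in current frame */
    int shutdown;
    pthread_mutex_t lock;
    pthread_cond_t has_work;                /* wakes blocked (idle) workers */
    pthread_cond_t frame_done;              /* stands in for the WorkFinished event */
} Pool;

static void *worker(void *arg) {
    Pool *p = arg;
    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (p->count == 0 && !p->shutdown)
            pthread_cond_wait(&p->has_work, &p->lock);   /* step 1: idle workers block */
        if (p->count == 0) break;                         /* shutting down, queue empty */
        Task t = p->queue[p->head];
        p->head = (p->head + 1) % MAX_TASKS;
        p->count--;
        pthread_mutex_unlock(&p->lock);
        t.fn(t.ctx);                                      /* step 3: encode one slice */
        pthread_mutex_lock(&p->lock);
        if (++p->done == p->total)                        /* step 4: count completed slices */
            pthread_cond_signal(&p->frame_done);          /* like SetEvent(WorkFinished) */
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Step 1: create the workers up front. On failure, still call pool_destroy. */
static int pool_create(Pool *p, int nthreads) {
    assert(nthreads > 0);
    p->nthreads = p->head = p->count = p->done = p->total = p->shutdown = 0;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->has_work, NULL);
    pthread_cond_init(&p->frame_done, NULL);
    p->threads = malloc(sizeof *p->threads * (size_t)nthreads);
    while (p->threads && p->nthreads < nthreads &&
           pthread_create(&p->threads[p->nthreads], NULL, worker, p) == 0)
        p->nthreads++;
    return p->nthreads > 0 ? 0 : -1;
}

/* Step 2: queue every slice of one frame, then block until all are encoded. */
static int pool_run_frame(Pool *p, TaskFn fn, void **ctxs, int n) {
    if (n > MAX_TASKS) return -1;
    pthread_mutex_lock(&p->lock);
    assert(p->count == 0);                  /* previous frame was fully drained */
    p->done = 0;
    p->total = n;
    for (int i = 0; i < n; i++) {
        p->queue[(p->head + p->count) % MAX_TASKS] = (Task){ fn, ctxs[i] };
        p->count++;
        pthread_cond_signal(&p->has_work);  /* wake one idle worker per queued slice */
    }
    while (p->done < p->total)
        pthread_cond_wait(&p->frame_done, &p->lock);
    pthread_mutex_unlock(&p->lock);
    return 0;
}

/* Step 5: at the end of the sequence, stop the workers and release the pool. */
static void pool_destroy(Pool *p) {
    pthread_mutex_lock(&p->lock);
    p->shutdown = 1;
    pthread_cond_broadcast(&p->has_work);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nthreads; i++) pthread_join(p->threads[i], NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->has_work);
    pthread_cond_destroy(&p->frame_done);
    free(p->threads);
}

/* Stand-in for encode_one_slice: doubles the int that the context points to. */
static void demo_encode_slice(void *context) {
    int *value = context;
    *value *= 2;
}
```

A caller would run `pool_create` once, call `pool_run_frame` for each frame with one context per slice, perform the serial merge after each call returns, and call `pool_destroy` when the sequence ends. On Linux this typically builds with `-pthread`.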

As can be seen from the above, the encoding control processing module contained in the encoding unit in this embodiment places all slices of the current frame image into the task queue one after another and then uses the thread pool to encode the slices in parallel.

Here, using the thread pool to encode multiple slices in parallel specifically includes:

Creating a thread pool and creating multiple worker threads in it, where the number of worker threads preferably defaults to the number of cores of the current processor; then placing the worker threads in the idle-thread queue; a worker thread is put into the blocked state when it has no task to execute and into the schedulable state when it has one;

When a frame image is to be encoded, placing all slices of the current frame image into a task queue one after another; once all slices are in the task queue, creating an event object with its state set to nonsignaled, then putting the main thread into the blocked state to wait for the event to become signaled;

Each time a slice is placed in the task queue, waking one idle worker thread in the thread pool so that the awakened worker thread encodes the slice that has just been placed in the task queue;

When the task queue holds slices that have not been encoded and all worker threads are busy, either creating new worker threads to encode those slices, or leaving the slices waiting in the task queue until a worker thread finishes its task, returns to the thread pool and encodes them;

When a worker thread finishes encoding a slice, returning it to the thread pool and incrementing the count of completed slices by 1, then checking whether the task queue holds a slice that has not been encoded; if so, having the worker thread encode that slice; otherwise, putting the worker thread into the blocked state to wait for the next task;

When the count of completed slices equals the total number of slices in the frame image, waking the main thread to carry out the subsequent serial processing, such as data synchronization and bitstream merging;

When one frame image has been encoded, checking whether all frame images in the video sequence have been encoded; if so, encoding of the video sequence has finished, and the threads and the thread pool are destroyed; otherwise, placing all slices of the next frame image that has not been encoded into the task queue one after another, so that the thread pool encodes them.

4. Implementation of thread-local data storage

In this embodiment, the AVS2 encoder uses thread-local storage (TLS) so that different worker threads can access shared resources at the same time. With TLS, the worker threads that encode in parallel can access global variables simultaneously while each having the variables to itself: every worker thread that uses TLS owns a copy of each global variable and operates only on that copy when it accesses the variable. The threads are therefore independent and do not interfere with one another, which further improves the efficiency of parallel encoding. Implicit TLS is implemented jointly by the compiler, the loader and the linker; in practice, the modifier __declspec(thread) is placed before each variable that needs thread-local storage. This approach is fairly simple and suits the AVS2 encoder, which has a large number of such variables and accesses them frequently.

Preferably, after studying the key modules of AVS2 encoding in depth and analysing the encoder's code in detail, this algorithm places 155 global variables in thread-local storage. A TlsData data structure is created first, and the 155 variables that need thread-local storage are put into it. The declaration __declspec(thread) TlsData *TlsInfo then produces a thread-local pointer. Whenever a variable such as current_mb_nr is needed, it is accessed as TlsInfo->current_mb_nr, which refers to the current_mb_nr belonging to the current worker thread, so worker threads do not affect one another. In addition, after the main thread has placed the encoding task for each slice in the task queue, a worker thread that receives a task first allocates a block of memory of size TlsData for the TlsInfo pointer and then uses the tlsmalloc() function to allocate the memory required by all thread-local variables. Finally, once a slice has been encoded, the tlsfree() function releases the memory that was allocated in the worker threads.
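The per-thread pointer pattern described above can be illustrated as follows. This is a sketch under stated assumptions rather than the encoder's code: it uses the C11 `_Thread_local` specifier in place of the MSVC-specific `__declspec(thread)`, plain `calloc`/`free` in place of the encoder's tlsmalloc()/tlsfree(), one POSIX thread per slice instead of the pool, and a `TlsData` that holds only `current_mb_nr`, the single field the text names. `SliceJob` and `encode_slices_in_parallel` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical subset of the 155 former globals; only current_mb_nr is named in the text. */
typedef struct {
    int current_mb_nr;
} TlsData;

/* One pointer per thread; with MSVC this would be __declspec(thread) TlsData *TlsInfo. */
static _Thread_local TlsData *TlsInfo;

/* Hypothetical description of one slice's work, plus a slot for its result. */
typedef struct {
    int first_mb;
    int mb_count;
    int last_mb_seen;
} SliceJob;

static void *encode_one_slice(void *context) {
    SliceJob *job = context;
    TlsInfo = calloc(1, sizeof *TlsInfo);       /* per-slice allocation on task receipt */
    if (!TlsInfo) return NULL;
    for (int i = 0; i < job->mb_count; i++)
        TlsInfo->current_mb_nr = job->first_mb + i;   /* touches this thread's copy only */
    job->last_mb_seen = TlsInfo->current_mb_nr;
    free(TlsInfo);                               /* released once the slice is encoded */
    TlsInfo = NULL;
    return job;
}

/* Runs each job on its own thread to show that the copies do not interfere. */
static int encode_slices_in_parallel(SliceJob *jobs, int n) {
    pthread_t tid[16];
    int started = 0;
    assert(n >= 0 && n <= 16);
    while (started < n &&
           pthread_create(&tid[started], NULL, encode_one_slice, &jobs[started]) == 0)
        started++;
    for (int i = 0; i < started; i++) pthread_join(tid[i], NULL);
    return started == n ? 0 : -1;
}
```

Each thread writes `current_mb_nr` through its own `TlsInfo`, so no lock is needed even though every thread uses the same variable name.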

As can be seen from the above, using the thread pool to encode multiple slices in parallel further includes:

When a worker thread needs to access a global variable, operating on the worker thread's own copy of that variable, thereby accessing the global variable;

This step specifically includes:

Setting up thread-local storage for the relevant global variables: first creating the TlsData data structure and placing the global variables in it, then declaring __declspec(thread) TlsData *TlsInfo to produce a thread-local pointer; the global variables are preceded by the modifier __declspec(thread);

When a worker thread receives a slice encoding task, first allocating a block of memory of size TlsData for the TlsInfo pointer, then using the tlsmalloc() function to allocate the memory required by all thread-local variables;

When a worker thread accesses a global variable, accessing it as TlsInfo->variable;

When a worker thread has finished encoding a slice, using the tlsfree() function to release the memory that was allocated in the worker threads.

The preferred implementation of the present invention has been described in detail above, but the invention is not limited to the described embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (10)

1. An AVS2 parallel encoding processing system, characterized in that the system comprises:
an encoding unit, configured to divide a frame image into slices, using the slice as the basic encoding unit, and then to encode the resulting slices in parallel.

2. The AVS2 parallel encoding processing system according to claim 1, characterized in that the encoding unit comprises:
a division module, configured to divide the frame image into slices, using the slice as the basic encoding unit, so as to obtain multiple slices of the frame image;
an encoding control processing module, configured to place all slices of the current frame image into a task queue one after another and then to use a thread pool to encode the slices in parallel.

3. The AVS2 parallel encoding processing system according to claim 2, characterized in that using the thread pool to encode multiple slices in parallel specifically includes:
each time a slice is placed in the task queue, waking one idle worker thread in the thread pool so that the awakened worker thread encodes the slice that has just been placed in the task queue;
when the task queue holds slices that have not been encoded and all worker threads are busy, either creating new worker threads to encode those slices, or leaving the slices waiting in the task queue until a worker thread finishes its task, returns to the thread pool and encodes them;
when a worker thread finishes encoding a slice, returning it to the thread pool and incrementing the count of completed slices by 1, then checking whether the task queue holds a slice that has not been encoded; if so, having the worker thread encode that slice; otherwise, putting the worker thread into the blocked state to wait for the next task;
when the count of completed slices equals the total number of slices in the frame image, waking the main thread to carry out serial data processing.

4. The AVS2 parallel encoding processing system according to claim 3, characterized in that encoding a slice specifically means:
storing the slice header information of the slice, then encoding the LCUs in the slice one after another until all LCUs in the slice have been encoded.

5. The AVS2 parallel encoding processing system according to claim 3, characterized in that using the thread pool to encode multiple slices in parallel further includes:
when a worker thread needs to access a global variable, operating on the worker thread's own copy of that variable, thereby accessing the global variable.

6. The AVS2 parallel encoding processing system according to any one of claims 1 to 5, characterized in that it further comprises a bitstream buffer unit, the bitstream buffer unit comprising one global bitstream memory and multiple sub-bitstream memories;
the global bitstream memory is used to store the picture header information of the frame image;
each sub-bitstream memory is used to store the slice header information of a slice and the encoded information of all LCUs in that slice.

7. An AVS2 parallel encoding processing method, characterized in that the method comprises:
dividing a frame image into slices, using the slice as the basic encoding unit, and then encoding the resulting slices in parallel.

8. The AVS2 parallel encoding processing method according to claim 7, characterized in that the step of dividing a frame image into slices, using the slice as the basic encoding unit, and then encoding the resulting slices in parallel specifically includes:
dividing the frame image into slices, using the slice as the basic encoding unit, so as to obtain multiple slices of the frame image;
placing all slices of the current frame image into a task queue one after another, then using a thread pool to encode the slices in parallel.

9. The AVS2 parallel encoding processing method according to claim 8, characterized in that the step of using the thread pool to encode multiple slices in parallel specifically includes:
each time a slice is placed in the task queue, waking one idle worker thread in the thread pool so that the awakened worker thread encodes the slice that has just been placed in the task queue;
when the task queue holds slices that have not been encoded and all worker threads are busy, either creating new worker threads to encode those slices, or leaving the slices waiting in the task queue until a worker thread finishes its task, returns to the thread pool and encodes them;
when a worker thread finishes encoding a slice, returning it to the thread pool and incrementing the count of completed slices by 1, then checking whether the task queue holds a slice that has not been encoded; if so, having the worker thread encode that slice; otherwise, putting the worker thread into the blocked state to wait for the next task;
when the count of completed slices equals the total number of slices in the frame image, waking the main thread to carry out serial data processing.

10. The AVS2 parallel encoding processing method according to claim 9, characterized in that the step of using the thread pool to encode multiple slices in parallel further includes:
when a worker thread needs to access a global variable, operating on the worker thread's own copy of that variable, thereby accessing the global variable.
CN201610808832.5A 2016-09-07 2016-09-07 A kind of AVS2 parallel code processing system and method Expired - Fee Related CN106454354B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610808832.5A CN106454354B (en) 2016-09-07 2016-09-07 A kind of AVS2 parallel code processing system and method

Publications (2)

Publication Number Publication Date
CN106454354A true CN106454354A (en) 2017-02-22
CN106454354B CN106454354B (en) 2019-10-18

Family

ID=58164151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610808832.5A Expired - Fee Related CN106454354B (en) 2016-09-07 2016-09-07 A kind of AVS2 parallel code processing system and method

Country Status (1)

Country Link
CN (1) CN106454354B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7630565B2 (en) * 2004-11-30 2009-12-08 Lsi Corporation Parallel video encoder with whole picture deblocking and/or whole picture compressed as a single slice
JP4674496B2 (en) * 2005-06-08 2011-04-20 ソニー株式会社 Encoding device, decoding device, encoding method, decoding method, and program thereof
CN101150719A (en) * 2006-09-20 2008-03-26 华为技术有限公司 Method and device for parallel video coding
CN101222646A (en) * 2008-01-30 2008-07-16 上海广电(集团)有限公司中央研究院 An intra-frame prediction device and prediction method suitable for AVS coding
CN104038766A (en) * 2014-05-14 2014-09-10 三星电子(中国)研发中心 Device used for using image frames as basis to execute parallel video coding and method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
蒋骁辰,李国平,王国中,赵海武,藤国伟: "基于AVS+实时编码的多核并行视频编码算法", 《电子与信息学报》 *
陈嗣文: "面向AVS2的并行优化设计方法", 《福建电脑》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018166535A1 (en) * 2017-03-17 2018-09-20 山东科技大学 Method for load balancing based on encoding time prediction model
CN107102582A (en) * 2017-04-07 2017-08-29 深圳怡化电脑股份有限公司 The synchronous method and device of a kind of subsystem command
CN107454406A (en) * 2017-08-18 2017-12-08 深圳市佳创视讯技术股份有限公司 The live high-speed decoding method of VR panoramic videos and system based on AVS+
CN107943602A (en) * 2017-12-15 2018-04-20 北京数码视讯科技股份有限公司 Hardware abstraction plateform system and equipment based on AVS2 codings
CN109862357A (en) * 2019-01-09 2019-06-07 深圳威尔视觉传媒有限公司 Cloud game image encoding method, device, equipment and the storage medium of low latency
CN110727520A (en) * 2019-10-23 2020-01-24 四川长虹电器股份有限公司 Implementation method for optimizing Android frame animation
CN113590376A (en) * 2021-07-14 2021-11-02 华中科技大学 Multithreading parallel coding/decoding method, coder/decoder and user side
CN113590376B (en) * 2021-07-14 2024-07-02 华中科技大学 Multithread parallel encoding/decoding method, encoder/decoder and user side
CN115811614A (en) * 2021-09-13 2023-03-17 华为技术有限公司 Video data processing method, chip, electronic device and readable storage medium

Also Published As

Publication number Publication date
CN106454354B (en) 2019-10-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191018

CF01 Termination of patent right due to non-payment of annual fee