CN115049963A - Video classification method and device, processor and electronic equipment - Google Patents
- Publication number
- CN115049963A (application CN202210720251.1A)
- Authority
- CN
- China
- Prior art keywords
- frames
- video
- frame
- classified
- sequence
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
  - G06—COMPUTING OR CALCULATING; COUNTING
    - G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
      - G06V10/00—Arrangements for image or video recognition or understanding
        - G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
          - G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
            - G06V10/763—Non-hierarchical techniques, e.g. based on statistics of modelling distributions
          - G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
          - G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
      - G06V20/00—Scenes; Scene-specific elements
        - G06V20/40—Scenes; Scene-specific elements in video content
          - G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
          - G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The present application discloses a video classification method and apparatus, a processor, and an electronic device, and relates to the field of artificial intelligence. The method includes: acquiring a video to be classified; determining key frames and sampled frames of the video to be classified; merging the key frames and the sampled frames according to their temporal order in the video to be classified to obtain an image frame sequence; and determining a classification result of the video to be classified based on the image frame sequence. The present application solves the problem of poor video classification performance in the related art.
Description
Technical field
The present application relates to the field of artificial intelligence and, in particular, to a video classification method and apparatus, a processor, and an electronic device.
Background
For video classification with deep neural networks, the current approach first decomposes the video into individual frames, then uses a deep neural network to classify the frames one by one (or a subset obtained by random frame sampling, for example keeping one frame out of every 5), and finally computes the final classification result as a weighted average.
This approach is time-consuming and resource-intensive. A video typically runs at 30 frames per second, so a 10-minute video yields 18,000 frames, that is, 18,000 images. A V100 GPU currently reaches roughly 60 FPS for real-time ResNet inference, so fully processing a 10-minute video takes about 5 minutes. Multiple GPUs are therefore often needed to accelerate inference through parallel computation. Random frame sampling reduces the GPU cost, but because the selection is random, it is hard to guarantee that the selected frames contain the key information, and heavy interference often leads to poor results. Even when all frames are used, however, excessive irrelevant information in the video may distort the final weighted calculation and thus the final result.
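The figures quoted in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers given there (30 frames per second, a 10-minute video, roughly 60 FPS of inference throughput); the variable names are illustrative.

```python
# Back-of-the-envelope check of the processing cost described above.
video_fps = 30        # frames per second of the source video
video_minutes = 10    # video length in minutes
inference_fps = 60    # approximate ResNet throughput quoted for a V100

total_frames = video_fps * video_minutes * 60
processing_minutes = total_frames / inference_fps / 60

print(total_frames)        # 18000
print(processing_minutes)  # 5.0
```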
No effective solution has yet been proposed for the problem of poor video classification performance in the related art.
Summary of the invention
The main purpose of the present application is to provide a video classification method and apparatus, a processor, and an electronic device, so as to solve the problem of poor video classification performance in the related art.
To achieve the above purpose, according to one aspect of the present application, a video classification method is provided. The method includes: acquiring a video to be classified; determining key frames and sampled frames of the video to be classified; merging the key frames and the sampled frames according to their temporal order in the video to be classified to obtain an image frame sequence; and determining a classification result of the video to be classified based on the image frame sequence.
Optionally, determining the key frames and sampled frames of the video to be classified includes: determining a plurality of image frames of the video to be classified; extracting frames from the plurality of image frames to obtain the sampled frames; and clustering the plurality of image frames to determine the key frames.
Optionally, extracting frames from the plurality of image frames to obtain the sampled frames includes: extracting, at a predetermined frame-count interval, the image frames corresponding to specified frame numbers from the plurality of image frames as the sampled frames; or extracting, at a predetermined time interval, the image frames corresponding to specified times from the plurality of image frames as the sampled frames.
Optionally, merging the key frames and the sampled frames according to their temporal order in the video to be classified to obtain the image frame sequence includes: determining the playback time of each key frame in the video to be classified as a first playback time; determining the playback time of each sampled frame in the video to be classified as a second playback time; determining the order of the key frames and the sampled frames according to the temporal order of the first playback times and the second playback times; and determining the image frame sequence according to that order.
Optionally, determining the classification result of the video to be classified based on the image frame sequence includes: inputting a plurality of predetermined image frames in the image frame sequence into a preset convolutional neural network model to determine a feature matrix and a probability for each predetermined image frame, where the predetermined image frames include the key frames and the sampled frames, and the preset convolutional neural network model is trained on sample images with labeled features; and determining the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence.
Optionally, determining the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence includes: determining the product of the feature matrix and the probability of each predetermined image frame to obtain a feature result; determining a feature result sequence according to the order of each predetermined image frame in the image frame sequence, where the feature result sequence includes the feature results of the plurality of predetermined image frames in the image frame sequence; and inputting the feature result sequence into a preset recurrent neural network model to determine the classification result of the video to be classified, where the preset recurrent neural network model is trained on sample videos with labeled classification results and the feature result sequences corresponding to those sample videos.
Optionally, after inputting the plurality of predetermined image frames in the image frame sequence into the preset convolutional neural network model and determining the feature matrix and probability of each predetermined image frame, the method further includes: adjusting the probability of each key frame to a preset value; and deleting from the image frame sequence the sampled frames whose probability is lower than a preset threshold.
To achieve the above purpose, according to another aspect of the present application, a video classification apparatus is provided, including: an acquisition unit, configured to acquire a video to be classified; a first determining unit, configured to determine key frames and sampled frames of the video to be classified; a merging unit, configured to merge the key frames and the sampled frames according to their temporal order in the video to be classified to obtain an image frame sequence; and a second determining unit, configured to determine a classification result of the video to be classified based on the image frame sequence.
To achieve the above purpose, according to another aspect of the present application, a processor is provided. The processor is configured to run a program, where the program, when run, executes the above video classification method.
To achieve the above purpose, according to another aspect of the present application, an electronic device is provided. The electronic device includes one or more processors and a memory, the memory being configured to store one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the above video classification method.
The present application adopts the following steps: acquiring a video to be classified; determining key frames and sampled frames of the video to be classified; merging the key frames and the sampled frames according to their temporal order in the video to be classified to obtain an image frame sequence; and determining a classification result of the video to be classified based on the image frame sequence. This solves the problem of poor video classification performance in the related art and thereby achieves the effect of accurately determining the video classification result.
Description of the drawings
The accompanying drawings, which form a part of the present application, are provided for further understanding of the present application. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of a video classification method provided according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a predetermined convolutional neural network model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a preset recurrent neural network according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a video classification apparatus provided according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an electronic device provided according to an embodiment of the present application.
Detailed description
It should be noted that the embodiments in the present application and the features of the embodiments may be combined with each other where no conflict arises. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings of those embodiments. The described embodiments are evidently only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in the present application, without creative effort, shall fall within the scope of protection of the present application.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, for the purposes of the embodiments of the present application described herein. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.
The present invention is described below with reference to preferred implementation steps. FIG. 1 is a flowchart of a video classification method provided according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
Step S101: acquire a video to be classified;
Step S102: determine key frames and sampled frames of the video to be classified;
Step S103: merge the key frames and the sampled frames according to their temporal order in the video to be classified to obtain an image frame sequence;
Step S104: determine a classification result of the video to be classified based on the image frame sequence.
It should be noted that a video is composed of still pictures, called frames; the video to be classified therefore includes a plurality of image frames.
In step S101, the video to be classified may use any known unencrypted video format (MP4, MAV, MOV, etc.), and its duration is not limited.
In step S102, the sampled frames may be extracted at random from the image frames of the video to be classified, and a key frame may be an image frame of the video to be classified that has distinctive image characteristics.
Optionally, a key frame differs distinctly from the other image frames in image characteristics such as color, contrast or brightness.
In step S103, an image frame represents the picture shown at a specific playback time of the video to be classified, so every image frame has a time attribute, and the key frames and sampled frames determined from the image frames have one as well. Once the key frames and sampled frames have been determined, the playback time of each of them in the video to be classified can therefore be determined, and the key frames and sampled frames can be merged in the temporal order of their playback times to obtain the image frame sequence.
In step S103, the image frame sequence includes at least one key frame and at least one sampled frame.
In the above embodiment of the present invention, because the image frame sequence includes both sampled frames and key frames, it retains the key information of the video to be classified to a large extent, so the video to be classified can be classified accurately on the basis of the image frame sequence.
In step S104, the image frame sequence may be analyzed by machine learning to determine the classification result of the video to be classified.
Optionally, in the video classification method provided in the embodiments of the present application, determining the key frames and sampled frames of the video to be classified includes: determining a plurality of image frames of the video to be classified; extracting frames from the plurality of image frames to obtain the sampled frames; and clustering the plurality of image frames to determine the key frames.
In the above embodiment of the present invention, the sampled frames are extracted from the plurality of image frames of the video to be classified by simple frame extraction, which means discarding some images according to a fixed rule; the key frames are determined from the plurality of image frames of the video to be classified by clustering.
Optionally, the video to be classified is processed frame by frame to generate images, yielding the original frame image set (that is, the plurality of image frames) P = {F1, F2, F3, ..., Fn}.
Optionally, simple frame extraction is applied to the original frame image set (that is, the plurality of image frames) P = {F1, F2, F3, ..., Fn} to obtain the image set after simple frame extraction (that is, the sampled frames) F = {F1, F6, F11, ..., Fn}.
It should be noted that simple frame extraction means discarding some images according to a fixed rule. Common approaches are keeping one frame out of every 5, or keeping 1/3 of the frames in each second (10 frames if the video runs at 30 FPS).
Optionally, key frame calculation is performed on the original frame image set (that is, the plurality of image frames) P = {F1, F2, F3, ..., Fn}. A key frame differs from other frames in that its image characteristics, such as color, contrast or brightness, are distinctive. The K-means clustering algorithm is therefore applied to the original frame image set P = {F1, F2, F3, ..., Fn} to obtain different clusters. Histogram analysis is then performed on the images in each cluster, and the images whose histogram mean deviates too far from the average are selected as key frames, giving the key frame set K.
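The key frame selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: frames are represented as flat lists of grayscale values, clustering runs on 8-bin normalised histograms, "deviates too far" is modelled by the hypothetical parameter `max_deviation`, and all function names are invented for the example. The source does not specify the number of clusters, the features used for clustering, or the deviation threshold.

```python
from statistics import mean

def histogram(frame, bins=8):
    """Normalised grayscale histogram of a frame given as a flat list of 0-255 values."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]

def histogram_mean(hist):
    """Mean bin position of a histogram: a coarse proxy for brightness."""
    return sum(i * p for i, p in enumerate(hist))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain K-means with deterministic, evenly spaced initial centroids."""
    step = max(len(vectors) // k, 1)
    centroids = [list(vectors[min(i * step, len(vectors) - 1)]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: squared_distance(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels

def select_key_frames(frames, k=2, max_deviation=1.0):
    """Return indices of frames whose histogram mean strays from their cluster's average."""
    hists = [histogram(f) for f in frames]
    labels = kmeans(hists, k)
    key_indices = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue
        cluster_avg = mean(histogram_mean(hists[i]) for i in members)
        key_indices += [i for i in members
                        if abs(histogram_mean(hists[i]) - cluster_avg) > max_deviation]
    return sorted(key_indices)

dark, bright = [10] * 16, [250] * 16
mixed = [100] * 8 + [250] * 8  # partly darker than the bright frames around it
frames = [dark, dark, dark, mixed, bright, bright, bright]
print(select_key_frames(frames))  # [3]
```

In the toy input, the mixed frame lands in the same cluster as the bright frames but its histogram mean sits well below that cluster's average, so it is the only frame flagged. A practical version would more likely cluster colour histograms or learned features with an established K-means implementation.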
Optionally, in the video classification method provided in the embodiments of the present application, extracting frames from the plurality of image frames to obtain the sampled frames includes: extracting, at a predetermined frame-count interval, the image frames corresponding to specified frame numbers from the plurality of image frames as the sampled frames; or extracting, at a predetermined time interval, the image frames corresponding to specified times from the plurality of image frames as the sampled frames.
In the above embodiment of the present invention, the sampled frames may be extracted at a predetermined frame-count interval; for example, after the video to be classified has been split into a plurality of image frames, one frame out of every 5 may be kept as a sampled frame. The sampled frames may also be extracted at a predetermined time interval; for example, 1/3 of the image frames corresponding to each second of the video to be classified may be kept as sampled frames.
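Both sampling rules can be expressed over frame indices alone. The sketch below is illustrative: the function names and the choice to space the kept frames evenly within each second are assumptions, since the source only states how many frames are kept.

```python
def sample_by_frame_interval(num_frames, interval=5):
    """Keep one frame out of every `interval` frames (indices 0, interval, 2*interval, ...)."""
    return list(range(0, num_frames, interval))

def sample_by_time(num_frames, fps=30, keep_fraction=1 / 3):
    """Within each second of video, keep roughly `keep_fraction` of the frames, evenly spaced."""
    keep_per_second = max(round(fps * keep_fraction), 1)
    kept = []
    for second_start in range(0, num_frames, fps):
        frames_in_second = min(fps, num_frames - second_start)
        for j in range(keep_per_second):
            offset = j * fps // keep_per_second
            if offset < frames_in_second:
                kept.append(second_start + offset)
    return kept

print(sample_by_frame_interval(12))  # [0, 5, 10]
print(len(sample_by_time(60)))       # 20
```

With zero-based indices, `[0, 5, 10]` corresponds to the frames F1, F6 and F11 of the set F given earlier, and two seconds of 30 FPS video yield 10 kept frames per second.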
Optionally, in the video classification method provided in the embodiments of the present application, merging the key frames and the sampled frames according to their temporal order in the video to be classified to obtain the image frame sequence includes: determining the playback time of each key frame in the video to be classified as a first playback time; determining the playback time of each sampled frame in the video to be classified as a second playback time; determining the order of the key frames and the sampled frames according to the temporal order of the first playback times and the second playback times; and determining the image frame sequence according to that order.
In the above embodiment of the present invention, the key frames and the sampled frames are all image frames of the video to be classified, so each of them has a corresponding playback time in that video. The order of the key frames and sampled frames is determined from their playback times in the video to be classified, which yields an image frame sequence that retains the key information of the video to be classified.
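The merge step amounts to sorting the union of both frame sets by playback time. The following sketch assumes each frame is a `(playback_time_seconds, frame_id)` pair, which is a hypothetical representation; it also assumes that a frame selected both as a key frame and as a sampled frame should appear only once, a detail the source does not spell out.

```python
def merge_by_playback_time(key_frames, sampled_frames):
    """Merge two lists of (playback_time_seconds, frame_id) pairs into one time-ordered sequence.

    A frame that is both a key frame and a sampled frame appears once and keeps the "key" tag.
    """
    merged = {}
    for time_s, frame_id in sampled_frames:
        merged[frame_id] = (time_s, frame_id, "sampled")
    for time_s, frame_id in key_frames:
        merged[frame_id] = (time_s, frame_id, "key")  # key status takes precedence
    return sorted(merged.values())

key_frames = [(0.4, 12), (1.0, 30)]
sampled_frames = [(0.0, 0), (0.5, 15), (1.0, 30)]
sequence = merge_by_playback_time(key_frames, sampled_frames)
# [(0.0, 0, 'sampled'), (0.4, 12, 'key'), (0.5, 15, 'sampled'), (1.0, 30, 'key')]
```

Keeping the "key" or "sampled" tag on each entry is convenient later, when key frames and sampled frames are treated differently.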
Optionally, in the video classification method provided in the embodiments of the present application, determining the classification result of the video to be classified based on the image frame sequence includes: inputting a plurality of predetermined image frames in the image frame sequence into a preset convolutional neural network model to determine a feature matrix and a probability for each predetermined image frame, where the predetermined image frames include the key frames and the sampled frames, and the preset convolutional neural network model is trained on sample images with labeled features; and determining the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence.
In the above embodiment of the present invention, analyzing the plurality of predetermined image frames in the image frame sequence with the preset convolutional neural network model makes it possible to determine the feature matrix and probability of each predetermined image frame quickly, and then to classify the video to be classified according to the determined feature matrices and probabilities.
Optionally, in the video classification method provided in the embodiments of the present application, determining the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence includes: determining the feature result of each predetermined image frame as the product of its feature matrix and its probability; sorting the feature results of the plurality of predetermined image frames according to the order of the predetermined image frames in the image frame sequence to obtain a feature result sequence; and inputting the feature result sequence into a preset recurrent neural network model to determine the classification result of the video to be classified, where the preset recurrent neural network model is trained on sample videos with labeled classification results and the feature result sequences corresponding to those sample videos.
In the above embodiment of the present invention, the feature result of each predetermined image frame is determined as the product of its feature matrix and its probability. The preset recurrent neural network model then analyzes these feature results, so the video to be classified is classified according to the feature results of the plurality of predetermined image frames in the image frame sequence.
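The feature result computation is a scalar scaling of each frame's feature matrix followed by ordering. The sketch below assumes nested lists for matrices and dicts with the hypothetical keys `"position"`, `"features"` and `"probability"` for frames; none of these names come from the source.

```python
def feature_result(feature_matrix, probability):
    """Scale every entry of a frame's feature matrix by that frame's probability."""
    return [[value * probability for value in row] for row in feature_matrix]

def feature_result_sequence(frames):
    """Build the feature result sequence, ordered by position in the image frame sequence.

    `frames` is a list of dicts with the hypothetical keys "position", "features"
    and "probability".
    """
    ordered = sorted(frames, key=lambda f: f["position"])
    return [feature_result(f["features"], f["probability"]) for f in ordered]

frames = [
    {"position": 1, "features": [[2.0, 4.0]], "probability": 0.5},
    {"position": 0, "features": [[1.0, 3.0]], "probability": 1.0},
]
results = feature_result_sequence(frames)
# [[[1.0, 3.0]], [[1.0, 2.0]]]
```

Multiplying by the probability means that frames the convolutional model is less confident about contribute smaller values to the sequence seen by the recurrent model.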
Optionally, in the video classification method provided in the embodiments of the present application, after inputting the plurality of predetermined image frames in the image frame sequence into the preset convolutional neural network model and determining the feature matrix and probability of each predetermined image frame, the method further includes: adjusting the probability of each key frame to a preset value; and deleting from the image frame sequence the sampled frames whose probability is lower than a preset threshold.
In the above embodiment of the present invention, adjusting the probability of each key frame to a preset value emphasizes the key information of the video to be classified, and deleting the sampled frames whose probability is lower than the preset threshold reduces the interfering information in the video to be classified. The classification result of the video to be classified can then be determined accurately on the basis of the adjusted image frame sequence.
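This adjustment can be sketched as a single pass over the sequence. The preset value `1.0` and the threshold `0.5` below are placeholders chosen for illustration, since the source does not give concrete values, and the dict keys `"is_key"` and `"probability"` are hypothetical.

```python
def adjust_sequence(frames, key_probability=1.0, threshold=0.5):
    """Pin key-frame probabilities to a preset value and drop low-probability sampled frames.

    Each frame is a dict with the hypothetical keys "is_key" and "probability".
    """
    adjusted = []
    for frame in frames:
        if frame["is_key"]:
            adjusted.append({**frame, "probability": key_probability})
        elif frame["probability"] >= threshold:
            adjusted.append(frame)
    return adjusted

frames = [
    {"id": 0, "is_key": False, "probability": 0.9},
    {"id": 12, "is_key": True, "probability": 0.3},
    {"id": 15, "is_key": False, "probability": 0.2},
]
adjusted = adjust_sequence(frames)
# frame 12 is kept with probability 1.0; frame 15 is dropped
```

Key frames are never dropped here, even when their original probability is below the threshold, which matches the stated intent of emphasizing key information.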
The present invention also provides a preferred embodiment, which provides a key-frame-based video classification method.
The present invention pre-extracts the key frames of the video so that key information is not missed because of random frame sampling, and dynamically adjusts the final weighting factors while using an RNN model (that is, the preset recurrent neural network model) to limit the influence of irrelevant information in the video on the overall result.
As an optional example, the flow of the key-frame-based video classification method provided by the present invention includes the following steps:
S201: Input the video to be classified.
Optionally, the video to be classified may be in any known unencrypted video format (MP4, MAV, MOV, etc.), and its duration is not limited.
S202: Perform frame extraction on the video to be classified, then input the extracted frames into the preset convolutional neural network model to obtain image features and probabilities.
Optionally, the video to be classified is decomposed frame by frame into a plurality of image frames, and simple frame sampling is then performed. In addition, key frames are computed over all of the image frames; any qualifying key frame that was filtered out by the simple frame sampling must be added back in chronological order.
Optionally, the preset convolutional neural network model may use a ResNet or VGG architecture, and it must output the feature matrix of each image frame together with the classification probability corresponding to that image frame.
S203: Input the feature matrices and probabilities of the image frames into the preset recurrent neural network model to obtain the classification result.
The feature matrix of each image frame output in S202 is multiplied by its probability to generate a new matrix (i.e., the feature result). The new matrices are stacked in chronological order and input into the preset recurrent neural network model, and the final classification result is obtained through softmax.
As an optional example, the detailed flow of the key frame-based video classification method of the present invention includes the following steps:
S301: Input the video to be classified.
S302: Decompose the video to be classified frame by frame into a plurality of image frames, and compute the key frame set (i.e., the plurality of key frames) K. Keep one frame out of every N frames to obtain the sampled frame set (i.e., the plurality of sampled frames) F. Take the union of the key frame set K and the sampled frame set F in chronological order to obtain the image frame sequence FN.
Optionally, the video to be classified is processed frame by frame into pictures to obtain the original frame picture set (i.e., the plurality of image frames) P = {F1, F2, F3, ..., Fn}, and simple frame sampling is then performed. Simple frame sampling means removing some pictures according to a fixed rule; common approaches are to keep one frame out of every 5 frames, or to keep 1/3 of the frames in each second (for a 30 FPS video, 10 frames are kept per second). This yields the sampled frame set (i.e., the plurality of sampled frames) F = {F1, F6, F11, ..., Fn}.
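The two sampling rules described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `sample_every_n` and `sample_fraction_per_second` and their parameters are hypothetical, and frames are represented only by their 1-based indices.

```python
def sample_every_n(num_frames, step=5):
    """Keep one frame out of every `step` frames; returns 1-based frame indices."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(range(1, num_frames + 1, step))


def sample_fraction_per_second(num_frames, fps=30, keep_per_second=10):
    """Keep about `keep_per_second` evenly spaced frames out of every `fps` frames."""
    if not 0 < keep_per_second <= fps:
        raise ValueError("keep_per_second must be in the range 1..fps")
    step = fps // keep_per_second  # e.g. 30 FPS, keep 10 -> every 3rd frame
    return list(range(1, num_frames + 1, step))


if __name__ == "__main__":
    print(sample_every_n(12, step=5))               # [1, 6, 11]
    print(len(sample_fraction_per_second(30)))      # 10
```

With `step=5` the kept indices follow the pattern F1, F6, F11 used in the text.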
Optionally, the key frame computation is performed on the original frame picture set (i.e., the plurality of image frames) P. A key frame differs from other frames in that its picture characteristics, such as color, contrast, or brightness, are distinctive. Therefore, the K-means clustering algorithm is applied to the original frame picture set P to obtain different clusters, histogram analysis is performed on the pictures within each cluster, and the pictures whose histogram mean deviates too far from the average are selected as key frames, yielding the key frame set (i.e., the plurality of key frames) K.
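A rough sketch of this clustering-then-histogram selection is given below. Several choices here are assumptions that the text does not specify: frames are clustered on a single feature (the histogram mean), "deviates too far" is taken to mean more than `z` standard deviations from the cluster average, and the helper names `histogram_mean`, `kmeans_1d`, and `select_key_frames` are hypothetical.

```python
from statistics import mean, pstdev


def histogram_mean(hist):
    """Mean bin index of a histogram given as a list of per-bin counts."""
    total = sum(hist)
    return sum(i * c for i, c in enumerate(hist)) / total if total else 0.0


def kmeans_1d(values, k, iters=20):
    """Tiny deterministic 1-D K-means; returns one cluster label per value."""
    ordered = sorted(values)
    # Assumption: spread the initial centroids evenly over the sorted values.
    centroids = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels


def select_key_frames(histograms, k=2, z=1.5):
    """Return 1-based indices of frames whose histogram mean is an outlier in its cluster."""
    means = [histogram_mean(h) for h in histograms]
    labels = kmeans_1d(means, k)
    key = []
    for j in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        if len(idx) < 2:
            continue  # a cluster of one frame has no meaningful average
        cluster = [means[i] for i in idx]
        mu, sigma = mean(cluster), pstdev(cluster)
        key += [i + 1 for i in idx if sigma > 0 and abs(means[i] - mu) > z * sigma]
    return sorted(key)
```

In practice the clustering would likely run on richer features (color, contrast, brightness) and a library K-means; the single-feature version is used here only to keep the sketch self-contained.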
Optionally, the sampled frame set (i.e., the plurality of sampled frames) F and the key frame set (i.e., the plurality of key frames) K are merged by taking their union in chronological order. That is, if a frame in K is already in F, no operation is performed; if a frame in K is not in F, the frame is inserted into F at its chronological position. The result is the image frame sequence FN, which contains the key frames.
For example, if F = {F1, F6, F11} and K = {F1, F3, F9}, then FN = {F1, F3, F6, F9, F11}.
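The merge in this example amounts to a sorted set union. A minimal sketch, assuming frames are identified by their integer position in the video (the function name `merge_frames` is hypothetical):

```python
def merge_frames(sampled, key):
    """Union of sampled-frame and key-frame indices, in chronological order."""
    return sorted(set(sampled) | set(key))


if __name__ == "__main__":
    F = [1, 6, 11]
    K = [1, 3, 9]
    print(merge_frames(F, K))  # [1, 3, 6, 9, 11]
```

Using a set makes the "already in F, do nothing" case automatic, and sorting restores the chronological order.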
S303: Input the image frame sequence FN into the preset convolutional neural network model to obtain the feature matrix of each predetermined image frame and its probability of belonging to a given class.
FIG. 2 is a schematic diagram of a preset convolutional neural network model according to an embodiment of the present application. As shown in FIG. 2, the predetermined image frames contained in the image frame sequence FN are input one by one into the pre-trained preset convolutional neural network model. The output feature matrix is the output taken before the fully connected layer, and the probability is the result output by softmax.
Optionally, the preset convolutional neural network model is trained in the conventional way, that is, the CNN and the fully connected layer are trained together.
S304: Multiply the feature matrix of each predetermined image frame by its probability, stack the results in their original chronological order, and input them into the preset recurrent neural network model.
Optionally, both the probabilities and the feature matrices come from the output of S303. Note, however, that the probability of each key frame is manually set to 1, and frames with a probability below 0.5 are discarded, because a frame whose probability is too low acts as interference and degrades the final recognition result.
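The probability adjustment and weighting of S304 might look like the sketch below. The frame representation (a dict with `index`, `features`, `prob`, and `is_key` fields) and the function name `weight_features` are assumptions made for illustration; the values 1 and 0.5 are the ones given in the text.

```python
def weight_features(frames, key_prob=1.0, threshold=0.5):
    """Scale each frame's feature matrix by its probability.

    Key frames get `key_prob`; other frames below `threshold` are dropped.
    Returns (index, weighted_features) pairs in chronological order.
    """
    result = []
    for frame in sorted(frames, key=lambda f: f["index"]):
        prob = key_prob if frame["is_key"] else frame["prob"]
        if prob < threshold:
            continue  # low-probability sampled frame: treated as interference
        weighted = [[value * prob for value in row] for row in frame["features"]]
        result.append((frame["index"], weighted))
    return result


if __name__ == "__main__":
    frames = [
        {"index": 11, "features": [[2.0, 4.0]], "prob": 0.75, "is_key": False},
        {"index": 1, "features": [[1.0, 2.0]], "prob": 0.3, "is_key": True},
        {"index": 6, "features": [[5.0, 5.0]], "prob": 0.4, "is_key": False},
    ]
    for index, weighted in weight_features(frames):
        print(index, weighted)
```

Frame 1 is kept at full weight despite its low raw probability because it is a key frame, while frame 6 is dropped.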
FIG. 3 is a schematic diagram of a preset recurrent neural network according to an embodiment of the present application. As shown in FIG. 3, the preset recurrent neural network model consists of an RNN followed by a fully connected layer.
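To make the RNN plus fully connected layer structure concrete, the following is a minimal forward-pass sketch in plain Python. It assumes a simple Elman-style recurrence with tanh activation, no bias terms, and feature matrices already flattened into vectors; the text does not specify the RNN variant, and the weight matrices `w_in`, `w_rec`, and `w_out` are hypothetical placeholders for trained parameters.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_classify(sequence, w_in, w_rec, w_out):
    """Run the recurrence over the sequence, then apply the output layer and softmax.

    sequence: weighted feature vectors in chronological order.
    w_in: hidden x input, w_rec: hidden x hidden, w_out: classes x hidden.
    """
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(v) for v in pre]
    return softmax(matvec(w_out, hidden))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(rnn_classify([[1.5, 0.0]], identity, zeros, identity))
```

A production system would more likely use a deep learning framework and a gated variant such as LSTM or GRU; the sketch only shows how the time-ordered feature results flow through the recurrence into a class distribution.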
S305: Output the classification result.
It should be noted that earlier video recognition approaches sample frames at fixed positions, which easily misses key information and thereby distorts the result; the result may also be biased because the extracted frames are relatively random. Moreover, even with frame sampling, a large amount of computing power is still required.
The present invention introduces key frame pre-extraction, so fewer and sparser frames can be sampled, which reduces computation time and improves the accuracy of the result. Introducing an RNN after the simple image classification to analyze the video globally is also more accurate than a traditional weighted average.
The present invention adopts key frame-based frame sampling, which largely preserves the key information of the video even under sparser sampling, and the introduced RNN can better understand the video stream as a whole.
The video classification method provided in the embodiment of the present application acquires a video to be classified; determines the key frames and sampled frames of the video to be classified; merges the key frames and sampled frames according to their chronological order in the video to be classified to obtain an image frame sequence; and determines the classification result of the video to be classified based on the image frame sequence. This solves the problem of poor video classification performance in the related art and achieves the effect of accurately determining the video classification result.
It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
The embodiment of the present application further provides a video classification apparatus. It should be noted that the video classification apparatus of the embodiment of the present application may be used to execute the video classification method provided by the embodiment of the present application. The video classification apparatus provided by the embodiment of the present application is described below.
FIG. 4 is a schematic diagram of a video classification apparatus according to an embodiment of the present application. As shown in FIG. 4, the apparatus includes: an acquisition unit 41, configured to acquire a video to be classified; a first determination unit 42, configured to determine the key frames and sampled frames of the video to be classified; a merging unit 43, configured to merge the key frames and sampled frames according to their chronological order in the video to be classified to obtain an image frame sequence; and a second determination unit 44, configured to determine the classification result of the video to be classified based on the image frame sequence.
It should be noted that the acquisition unit 41 in this embodiment may be used to execute step S101 of the embodiment of the present application, the first determination unit 42 may be used to execute step S102, the merging unit 43 may be used to execute step S103, and the second determination unit 44 may be used to execute step S104. The above units implement the same examples and application scenarios as their corresponding steps, but are not limited to the content disclosed in the above embodiments.
Optionally, in the video classification apparatus provided in the embodiment of the present application, the first determination unit includes: a first determination module, configured to determine a plurality of image frames of the video to be classified; an extraction module, configured to sample the plurality of image frames to obtain the sampled frames; and a clustering module, configured to cluster the plurality of image frames to determine the key frames.
Optionally, in the video classification apparatus provided in the embodiment of the present application, the extraction module includes: a first extraction module, configured to extract, at a predetermined frame-count interval, the image frames corresponding to specified frame numbers from the plurality of image frames as the sampled frames; or a second extraction module, configured to extract, at a predetermined time interval, the image frames corresponding to specified times from the plurality of image frames as the sampled frames.
Optionally, in the video classification apparatus provided in the embodiment of the present application, the merging unit includes: a third determination module, configured to determine the playback time of a key frame in the video to be classified as a first playback time; a fourth determination module, configured to determine the playback time of a sampled frame in the video to be classified as a second playback time; a fifth determination module, configured to determine the arrangement order of the key frames and sampled frames according to the chronological order of the first playback time and the second playback time; and a sixth determination module, configured to determine the image frame sequence according to the arrangement order of the key frames and sampled frames.
Optionally, in the video classification apparatus provided in the embodiment of the present application, the second determination unit includes: a sixth determination module, configured to input the plurality of predetermined image frames in the image frame sequence into the preset convolutional neural network model and determine the feature matrix and probability of each predetermined image frame, wherein the predetermined image frames include the key frames and the sampled frames, and the preset convolutional neural network model is trained on sample images with labeled features; and a seventh determination module, configured to determine the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence.
Optionally, in the video classification apparatus provided in the embodiment of the present application, the sixth determination module includes: a seventh determination module, configured to determine the feature result of each predetermined image frame according to the product of its feature matrix and its probability; an eighth determination module, configured to sort the feature results of the plurality of predetermined image frames according to the order of the predetermined image frames in the image frame sequence to obtain a feature result sequence; and a ninth determination module, configured to input the feature result sequence into the preset recurrent neural network model and determine the classification result of the video to be classified, wherein the preset recurrent neural network model is trained on sample videos with labeled classification results and the feature result sequences corresponding to those sample videos.
Optionally, in the video classification apparatus provided in the embodiment of the present application, the apparatus further includes: an adjustment unit, configured to adjust the probability of each key frame to a preset value after the plurality of predetermined image frames in the image frame sequence have been input into the preset convolutional neural network model and the feature matrix and probability of each predetermined image frame have been determined; and a deletion unit, configured to delete from the image frame sequence any sampled frame whose probability is lower than a preset threshold.
The video classification apparatus provided in the embodiment of the present application acquires a video to be classified; determines the key frames and sampled frames of the video to be classified; merges the key frames and sampled frames according to their chronological order in the video to be classified to obtain an image frame sequence; and determines the classification result of the video to be classified based on the image frame sequence. This solves the problem of poor video classification performance in the related art and achieves the effect of accurately determining the video classification result.
The video classification apparatus includes a processor and a memory. The above units are stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the video classification result is determined accurately by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
An embodiment of the present invention provides a computer-readable storage medium on which a program is stored; when executed by a processor, the program implements the video classification method.
An embodiment of the present invention provides a processor configured to run a program, wherein the program executes the video classification method when run.
FIG. 5 is a schematic diagram of an electronic device provided according to an embodiment of the present application. As shown in FIG. 5, an embodiment of the present invention provides an electronic device 50. The device includes a processor 501, a memory 502, and a program stored in the memory and executable on the processor. When executing the program, the processor implements the following steps: acquiring a video to be classified; determining the key frames and sampled frames of the video to be classified; merging the key frames and sampled frames according to their chronological order in the video to be classified to obtain an image frame sequence; and determining the classification result of the video to be classified based on the image frame sequence.
Optionally, when executing the program, the processor implements the following steps: determining a plurality of image frames of the video to be classified; sampling the plurality of image frames to obtain the sampled frames; and clustering the plurality of image frames to determine the key frames.
Optionally, when executing the program, the processor implements the following steps: extracting, at a predetermined frame-count interval, the image frames corresponding to specified frame numbers from the plurality of image frames as the sampled frames; or extracting, at a predetermined time interval, the image frames corresponding to specified times from the plurality of image frames as the sampled frames.
Optionally, when executing the program, the processor implements the following steps: determining the playback time of a key frame in the video to be classified as a first playback time; determining the playback time of a sampled frame in the video to be classified as a second playback time; determining the arrangement order of the key frames and sampled frames according to the chronological order of the first playback time and the second playback time; and determining the image frame sequence according to the arrangement order of the key frames and sampled frames.
Optionally, when executing the program, the processor implements the following steps: inputting the plurality of predetermined image frames in the image frame sequence into the preset convolutional neural network model and determining the feature matrix and probability of each predetermined image frame, wherein the predetermined image frames include the key frames and the sampled frames, and the preset convolutional neural network model is trained on sample images with labeled features; and determining the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence.
Optionally, when executing the program, the processor implements the following steps: determining the feature result of each predetermined image frame according to the product of its feature matrix and its probability; sorting the feature results of the plurality of predetermined image frames according to the order of the predetermined image frames in the image frame sequence to obtain a feature result sequence; and inputting the feature result sequence into the preset recurrent neural network model to determine the classification result of the video to be classified, wherein the preset recurrent neural network model is trained on sample videos with labeled classification results and the feature result sequences corresponding to those sample videos.
Optionally, when executing the program, the processor implements the following steps: after the plurality of predetermined image frames in the image frame sequence have been input into the preset convolutional neural network model and the feature matrix and probability of each predetermined image frame have been determined, adjusting the probability of each key frame to a preset value; and deleting from the image frame sequence any sampled frame whose probability is lower than a preset threshold.
Optionally, the electronic device herein may be a server, a PC, a PAD, a mobile phone, or the like.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: acquiring a video to be classified; determining the key frames and sampled frames of the video to be classified; merging the key frames and sampled frames according to their chronological order in the video to be classified to obtain an image frame sequence; and determining the classification result of the video to be classified based on the image frame sequence.
Optionally, when executed on a data processing device, the product is adapted to execute a program initialized with the following method steps: determining a plurality of image frames of the video to be classified; sampling the plurality of image frames to obtain the sampled frames; and clustering the plurality of image frames to determine the key frames.
Optionally, when executed on a data processing device, the product is adapted to execute a program initialized with the following method steps: extracting, at a predetermined frame-count interval, the image frames corresponding to specified frame numbers from the plurality of image frames as the sampled frames; or extracting, at a predetermined time interval, the image frames corresponding to specified times from the plurality of image frames as the sampled frames.
Optionally, when executed on a data processing device, the product is adapted to execute a program initialized with the following method steps: determining the playback time of a key frame in the video to be classified as a first playback time; determining the playback time of a sampled frame in the video to be classified as a second playback time; determining the arrangement order of the key frames and sampled frames according to the chronological order of the first playback time and the second playback time; and determining the image frame sequence according to the arrangement order of the key frames and sampled frames.
Optionally, when executed on a data processing device, the product is adapted to execute a program initialized with the following method steps: inputting the plurality of predetermined image frames in the image frame sequence into the preset convolutional neural network model and determining the feature matrix and probability of each predetermined image frame, wherein the predetermined image frames include the key frames and the sampled frames, and the preset convolutional neural network model is trained on sample images with labeled features; and determining the classification result of the video to be classified according to the feature matrices and probabilities of the plurality of predetermined image frames in the image frame sequence.
Optionally, when executed on a data processing device, the product is adapted to execute a program initialized with the following method steps: determining the feature result of each predetermined image frame according to the product of its feature matrix and its probability; sorting the feature results of the plurality of predetermined image frames according to the order of the predetermined image frames in the image frame sequence to obtain a feature result sequence; and inputting the feature result sequence into the preset recurrent neural network model to determine the classification result of the video to be classified, wherein the preset recurrent neural network model is trained on sample videos with labeled classification results and the feature result sequences corresponding to those sample videos.
Optionally, when executed on a data processing device, the product is adapted to execute a program initialized with the following method steps: after the plurality of predetermined image frames in the image frame sequence have been input into the preset convolutional neural network model and the feature matrix and probability of each predetermined image frame have been determined, adjusting the probability of each key frame to a preset value; and deleting from the image frame sequence any sampled frame whose probability is lower than a preset threshold.
本领域内的技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing device to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing device produce Means for implementing the functions specified in a flow or flow of a flowchart and/or a block or blocks of a block diagram.
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture comprising instruction means, the instructions The apparatus implements the functions specified in the flow or flow of the flowcharts and/or the block or blocks of the block diagrams.
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded on a computer or other programmable data processing device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process such that The instructions provide steps for implementing the functions specified in the flow or blocks of the flowcharts and/or the block or blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include forms of computer-readable media such as non-permanent memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202210720251.1A CN115049963A (en) | 2022-06-23 | 2022-06-23 | Video classification method and device, processor and electronic equipment | 
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202210720251.1A CN115049963A (en) | 2022-06-23 | 2022-06-23 | Video classification method and device, processor and electronic equipment | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| CN115049963A (en) | 2022-09-13 |
Family
ID=83163856
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210720251.1A Pending CN115049963A (en) | 2022-06-23 | 2022-06-23 | Video classification method and device, processor and electronic equipment | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN115049963A (en) | 
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN115205768A (en) * | 2022-09-16 | 2022-10-18 | 山东百盟信息技术有限公司 | A Video Classification Method Based on Resolution Adaptive Network | 
| CN116580370A (en) * | 2023-04-18 | 2023-08-11 | 南斗六星系统集成有限公司 | A frame extraction method and system for road test vehicle image information | 
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN110866510A (en) * | 2019-11-21 | 2020-03-06 | 山东浪潮人工智能研究院有限公司 | Video description system and method based on key frame detection | 
| CN111368656A (en) * | 2020-02-21 | 2020-07-03 | 华为技术有限公司 | Video content description method and video content description device | 
| CN112163120A (en) * | 2020-09-04 | 2021-01-01 | Oppo(重庆)智能科技有限公司 | Classification method, terminal and computer storage medium | 
| US20210232825A1 (en) * | 2019-03-06 | 2021-07-29 | Tencent Technology (Shenzhen) Company Limited | Video classification method, model training method, device, and storage medium | 
| US20210357652A1 (en) * | 2020-11-17 | 2021-11-18 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, electronic device and readable storage medium for classifying video | 
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US20210232825A1 (en) * | 2019-03-06 | 2021-07-29 | Tencent Technology (Shenzhen) Company Limited | Video classification method, model training method, device, and storage medium | 
| CN110866510A (en) * | 2019-11-21 | 2020-03-06 | 山东浪潮人工智能研究院有限公司 | Video description system and method based on key frame detection | 
| CN111368656A (en) * | 2020-02-21 | 2020-07-03 | 华为技术有限公司 | Video content description method and video content description device | 
| CN112163120A (en) * | 2020-09-04 | 2021-01-01 | Oppo(重庆)智能科技有限公司 | Classification method, terminal and computer storage medium | 
| US20210357652A1 (en) * | 2020-11-17 | 2021-11-18 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, electronic device and readable storage medium for classifying video | 
Non-Patent Citations (1)
| Title | 
|---|
| 肖廷汉: "Video Event Classification Based on CNN-RNN", China Master's Theses Full-text Database (Electronic Journal), 15 January 2019 (2019-01-15), pages 40-43 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104050247B (en) | | The method for realizing massive video quick-searching |
| CN111462183A (en) | | Behavior identification method and system based on attention mechanism double-current network |
| CN107729809B (en) | | A method and device for adaptively generating video abstract and its readable storage medium |
| CN111277910B (en) | | Bullet screen display method, device, electronic device and storage medium |
| CN113762138A (en) | | Method and device for identifying forged face picture, computer equipment and storage medium |
| WO2021082589A1 (en) | | Content check model training method and apparatus, video content check method and apparatus, computer device, and storage medium |
| CN111061898A (en) | | Image processing method, device, computer equipment and storage medium |
| CN111429341B (en) | | A video processing method, device and computer-readable storage medium |
| CN111209897A (en) | | Video processing method, device and storage medium |
| CN115049963A (en) | | Video classification method and device, processor and electronic equipment |
| CN111046969A (en) | | Data screening method and device, storage medium and electronic equipment |
| CN114329050B (en) | | Visual media data deduplication processing method, device, equipment and storage medium |
| CN112528058A (en) | | Fine-grained image classification method based on image attribute active learning |
| US20250245986A1 (en) | | Scene recognition |
| CN116543261B (en) | | Model training method for image recognition, image recognition method device and medium |
| Edwards et al. | | A review of deepfake techniques: architecture, detection and datasets |
| Oraibi et al. | | Enhancement digital forensic approach for inter-frame video forgery detection using a deep learning technique |
| US8165387B2 (en) | | Information processing apparatus and method, program, and recording medium for selecting data for learning |
| WO2024087358A1 (en) | | Target detection method and apparatus, and device and non-volatile readable storage medium |
| CN111275166A (en) | | Image processing apparatus, equipment and readable storage medium based on convolutional neural network |
| CN110428006A (en) | | The detection method of computer generated image, system, device |
| CN114092819A (en) | | Image classification method and device |
| CN116563170B (en) | | Image data processing method, system and electronic device |
| CN118015388A (en) | | Small target detection method, device and storage medium |
| WO2022141092A1 (en) | | Model generation method and apparatus, image processing method and apparatus, and readable storage medium |
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |