CN105825511B - A kind of picture background clarity detection method based on deep learning - Google Patents
- Publication number
- CN105825511B (application CN201610155947.9A)
- Authority
- CN
- China
- Prior art keywords
- picture
- layer
- pixel
- background
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
Description
Technical field
The invention relates to a deep-learning-based method for detecting the clarity of a picture's background. It mainly concerns the application of deep learning within machine learning and belongs to the technical field of artificial-intelligence picture recognition.
Background art
The concept of deep learning originates from research on artificial neural networks; a multilayer perceptron with several hidden layers is one kind of deep learning structure. Deep learning combines low-level features into more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of data. The back-propagation (BP) algorithm is the classic algorithm for training multilayer networks, yet in practice it already performs poorly on networks with only a few layers. The local minima that are pervasive in the non-convex objective cost functions of deep structures (which involve multiple layers of nonlinear processing units) are the main source of the training difficulty.
Most current learning methods for classification, regression and similar tasks are shallow-structure algorithms. With limited samples and computing units, their capacity to represent complex functions is limited, which constrains their generalization on complex classification problems. By learning a deep nonlinear network structure, deep learning can approximate complex functions, characterize distributed representations of the input data, and show a strong ability to learn the essential features of a data set from a small number of samples. Deep learning is a feature learning method: it transforms raw data, through simple but nonlinear models, into higher-level and more abstract representations. With enough such transformations composed, very complex functions can be learned. The core aspect of deep learning is that the features at each of these layers are not designed by human engineers but are learned from data using a general-purpose learning procedure.
The convolutional neural network (CNN) proposed by LeCun et al. was the first truly multilayer structure learning algorithm, and it is the core technique used in the present invention. It exploits spatial relationships to reduce the number of parameters and thereby improve BP training performance. CNNs have achieved good results in image recognition, including handwritten character recognition. However, the network structure strongly affects both the accuracy and the efficiency of image recognition. To improve recognition performance, a new CNN structure that repeatedly uses smaller convolution kernels has been designed and implemented; it effectively reduces the number of training parameters and can improve recognition accuracy. Comparative experiments against algorithms that performed well in the ILSVRC challenge, which represents the current state of the art in image recognition, verify the effectiveness of this structure.
Training a CNN requires a large number of labeled samples; if there are too few labeled samples, the system is prone to overfitting. Jeff Donahue et al. built the DeCAF framework, whose idea is to first pre-train on a picture set containing a large number of labeled samples in order to adjust the parameters of the CNN, and then use transfer learning to carry the parameters of the whole system over to the picture set to be trained. In this way only a small number of labeled samples are needed to obtain accurate classification.
Deep learning is currently applied to many kinds of picture recognition, such as handwritten characters and license plate numbers, but the uses of CNNs have not been fully explored. There is as yet no good artificial-intelligence method for recognizing how visible the environment in a picture is, that is, the visibility of the picture background, or how blurred the things in the background are. Most current picture recognition identifies the objects in the picture and tends to ignore useful information in the background environment. The present invention is mainly intended to solve this problem. Recognizing the visibility of a picture background is of great practical use; for example, this patent can be used to recognize the haze level from real-world pictures, so its application prospects are broad.
Summary of the invention
The technical problem to be solved by the invention is to provide a deep-learning-based method for detecting picture background clarity, which detects how clear the background of a picture is (that is, how blurred the things in the background are), extracts useful information from the background environment, and provides a reference for picture recognition.
To solve the above technical problem, the invention adopts the following technical solution:
A deep-learning-based method for detecting picture background clarity, comprising the following steps:
Step 1: convert the labeled pictures in the ImageNet picture library, and the sample pictures that are not in ImageNet but whose background clarity values are known, into grayscale pictures of 256*256 pixels;
Step 2: pre-train on the converted grayscale pictures from ImageNet: use a convolutional neural network to extract features from all the grayscale pictures and classify them, compute the loss function, and adjust the convolution parameters by stochastic gradient descent until the loss falls within a predetermined range, which yields the preliminarily adjusted convolution parameters;
Step 3: for the converted grayscale pictures of the sample pictures that are not in ImageNet but whose background clarity values are known, start from the preliminarily adjusted convolution parameters of step 2, use the convolutional neural network to extract features and classify them to obtain their clarity values, compare these with the actual clarity values, compute the loss function, and continue adjusting the convolution parameters by stochastic gradient descent until the loss falls within the predetermined range, which yields the final adjusted convolution parameters;
Step 4: convert the picture whose clarity is to be detected into a grayscale picture of 256*256 pixels and, using the final adjusted convolution parameters of step 3, use the convolutional neural network to extract features and classify them to obtain the clarity value of the picture.
As a preferred solution of the invention, the convolutional neural network comprises, in order from input to output, an input layer, a first convolutional layer, a first down-sampling layer, a second convolutional layer, a second down-sampling layer, a fully connected layer and an output layer. Not counting the input and output layers, the first convolutional layer, first down-sampling layer, second convolutional layer, second down-sampling layer and fully connected layer are layers 1, 2, 3, 4 and 5 of the network, respectively.
As a preferred solution of the invention, the convolution formula of the first convolutional layer is $x^{l} = \sum_{i}\sum_{j} w_{ij}\, x_{ij}^{l-1} + b$, where $l=1$, $x^{l}$ denotes the value of a pixel output by the first convolutional layer, $x_{ij}^{l-1}$ denotes the value of the pixel in row $i$, column $j$ of the input layer, $w_{ij}$ is the corresponding convolution parameter, and $b$ is the offset.
As a preferred solution of the invention, the down-sampling formula of the first down-sampling layer is $x^{l} = \beta \sum_{i}\sum_{j} x_{ij}^{l-1} + b$, where $l=2$, $x^{l}$ denotes the value of a pixel output by the first down-sampling layer, $x_{ij}^{l-1}$ denotes the value of the pixel in row $i$, column $j$ of the first convolutional layer, $\beta$ is the down-sampling parameter, and $b$ is the offset.
As a preferred solution of the invention, the fully connected layer comprises two full-connection steps, and the full-connection formula is $x^{l} = \sum_{k} w_{k}^{l}\, x_{k}^{l-1}$, where $l=5$ for the first full connection and $l=6$ for the second, $x^{l}$ denotes the value of a pixel output by the full connection, $k$ is the pixel index ($k=1,\dots,576$ for the first full connection and $k=1,\dots,50$ for the second), and $w_{k}^{l}$ is the weight.
Compared with the prior art, the invention, by adopting the above technical solution, has the following technical effects:
1. When sample pictures are scarce, the method applies the idea of transfer learning: it first pre-trains on the ImageNet picture set, which contains a large number of labeled pictures, to obtain CNN parameters, and then further adjusts these parameters to fit the set of pictures to be detected, which makes detection on that set more accurate.
2. The method solves the problem of detecting how clear the background of a picture is, which is useful in practical applications such as recognizing haze levels and air quality.
Description of the drawings
Fig. 1 is the overall architecture diagram of the deep-learning-based picture background clarity detection method of the invention.
Fig. 2 is the flow chart of the deep-learning-based picture background clarity detection method of the invention.
Fig. 3 is the internal structure diagram of the convolutional neural network used in the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
The pixel dimensions of a given picture are not fixed, whereas the CNN requires input pictures of a fixed size, so pictures must first be preprocessed into pictures of the same size. Because the pictures were all converted to 256*256 pixels when training on ImageNet, the input size is likewise fixed at 256*256 pixels; a picture to be processed whose size is not 256*256 is first converted to 256*256 pixels. Because the clarity of a picture has little correlation with its color, all pictures are first converted to grayscale. According to the clarity value, the invention divides the clarity of a picture background into five grades (excellent, good, medium, poor and extremely poor) to facilitate analysis and testing.
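The preprocessing described above can be sketched in plain Python as follows. This is a minimal illustration rather than the patent's implementation: the BT.601 luma weights, the nearest-neighbour resizing, and the function names `to_grayscale`, `resize_nearest` and `preprocess` are all assumptions, since the text only specifies that the output is a 256*256 grayscale picture.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of gray values.

    The 0.299/0.587/0.114 weights are an assumption; the patent does not
    say how the grayscale conversion is done.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def resize_nearest(gray, size=256):
    """Resize a 2-D list to size x size by nearest-neighbour sampling (assumed)."""
    height, width = len(gray), len(gray[0])
    return [[gray[(y * height) // size][(x * width) // size]
             for x in range(size)]
            for y in range(size)]


def preprocess(rgb_image, size=256):
    """Grayscale first, then bring the picture to the fixed input size."""
    return resize_nearest(to_grayscale(rgb_image), size)
```

A picture of any size, given as nested lists of RGB tuples, comes out as a 256-row, 256-column list of gray values that matches the fixed input size of the network.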
The invention mainly comprises three processes: pre-training, adjustment and actual detection. Pre-training uses the labeled picture set ImageNet, with the aim of obtaining the initial CNN parameters. Adjustment uses a small number of sample pictures with known background clarity values to adjust the CNN parameters so that they fit the set of pictures to be detected. Once the adjusted parameters are available, the pictures to be detected can be processed.
The convolutional neural network (CNN) is the core technology of the invention. A CNN is a feed-forward neural network whose artificial neurons respond to surrounding units within a limited receptive field, and it performs very well on large-scale image processing. It consists of convolutional layers and down-sampling (pooling) layers, which make up the first few stages of the network. The units of a convolutional layer are organized into feature maps; within a feature map, each unit is connected to a local patch of the feature maps of the previous layer through a set of weights called a filter, and the resulting local weighted sum is passed through a nonlinear function such as ReLU. All units in a feature map share the same filter, while different feature maps use different filters. There are two reasons for this structure. First, in array data such as images, neighbouring values are often highly correlated and form distinctive local features that are easy to detect. Second, the local statistics of images are largely independent of location: a feature that appears in one place may also appear elsewhere, so units at different locations can share weights and detect the same pattern. Mathematically, the filtering operation performed by a feature map is a discrete convolution, which is where the convolutional neural network gets its name.
CNNs are effective feature extractors, and the features they extract allow target objects to be classified well.
In the traditional machine learning framework, the learning task is to learn a classification model from sufficient training data and then use the learned model to classify and predict test documents. However, machine learning algorithms face a key problem in current Web mining research: large amounts of training data are very hard to obtain in some newly emerging domains.
Traditional machine learning needs a large amount of labeled training data for every domain, which consumes a great deal of manpower and material resources; without large amounts of labeled data, much learning-related research and many applications cannot proceed. In addition, traditional machine learning assumes that the training data and the test data follow the same distribution. In many cases this assumption does not hold, for example when the training data has become outdated. We then have to relabel large amounts of training data to meet our training needs, but labeling new data is very expensive. From another point of view, if we have a large amount of training data drawn from different distributions, discarding it entirely is also very wasteful. How to make reasonable use of such data is the main problem that transfer learning addresses. Transfer learning is a newer machine learning approach that applies existing knowledge to solve problems in different but related domains.
Our work on transfer learning can currently be divided into three parts: instance-based transfer learning in a homogeneous space, feature-based transfer learning in a homogeneous space, and transfer learning in a heterogeneous space. The transfer learning used in the invention belongs to the second part, feature-based transfer learning in a homogeneous space. Because ImageNet and the target domain share parameters, only the parameters of the CNN system need to be transferred.
As shown in Fig. 1 and Fig. 2, the specific operating procedure of the invention is as follows:
1. Picture preprocessing: the pictures in ImageNet and the labeled sample pictures are all preprocessed and converted into grayscale pictures of 256*256 pixels.
2. Pre-training stage: pre-train on the ImageNet picture library. Grayscale pictures of 256*256 pixels are input, the CNN extracts features and classifies them, the loss function is computed, and the CNN parameters are adjusted by stochastic gradient descent.
3. Adjustment stage: pictures whose background clarity values are known are used as input. The CNN extracts features and classifies them to obtain clarity values, which are compared with the actual clarity values of the pictures; the loss function is computed and the system parameters are adjusted by stochastic gradient descent.
4. Actual detection stage: a picture of unknown clarity is first converted into a 256*256-pixel grayscale picture as input; the CNN extracts features and classifies them, and the picture's clarity label is obtained.
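The pre-training and adjustment stages share one mechanism: stochastic gradient descent on a loss until the loss falls within a predetermined range, with the adjustment stage starting from the pre-trained parameters. The sketch below illustrates that mechanism on a tiny linear classifier in plain Python. It is a simplified stand-in, not the patent's network: the softmax cross-entropy loss, the learning rate, the toy data sets and the names `sgd_step` and `train_until` are assumptions, because the text does not specify the loss function or hyperparameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(weights, features, label, lr=0.1):
    """Update weights[c][k] in place on one sample; return the loss before the update."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(scores)
    loss = -math.log(probs[label])  # cross-entropy loss (an assumed choice)
    for c, row in enumerate(weights):
        grad = probs[c] - (1.0 if c == label else 0.0)
        for k in range(len(row)):
            row[k] -= lr * grad * features[k]
    return loss


def train_until(weights, samples, threshold, lr=0.1, max_epochs=1000):
    """Run SGD epochs until the mean loss drops below the predetermined threshold."""
    mean_loss = float("inf")
    for _ in range(max_epochs):
        mean_loss = sum(sgd_step(weights, f, y, lr) for f, y in samples) / len(samples)
        if mean_loss < threshold:
            break
    return mean_loss


# Toy data: both sets are hypothetical stand-ins for the real picture sets.
weights = [[0.0, 0.0], [0.0, 0.0]]
pretrain_set = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]   # plays the role of ImageNet
finetune_set = [([0.9, 0.1], 0), ([0.2, 0.8], 1)]   # plays the role of clarity samples
pretrain_loss = train_until(weights, pretrain_set, threshold=0.1)
# The adjustment stage continues from the pre-trained weights instead of restarting.
finetune_loss = train_until(weights, finetune_set, threshold=0.1)
```

Reusing the same `weights` object across both calls is what carries the pre-trained parameters into the adjustment stage, which is why only a few target-domain samples are needed in the second call.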
The specific process of the convolutional neural network (the number of parameters can be adjusted to the actual situation):
The CNN in this algorithm has 5 layers, not counting the input and output layers, and every layer contains trainable parameters (connection weights). The input image is 256*256 pixels.
1. Going from the input layer to the Convolution 1 layer is a convolution. Four 9*9 filters are used; each filter is multiplied element-wise with a 9*9 patch of pixels of the input picture and the products are summed, that is, a weighted sum is taken over each 9*9 patch of the input picture and an offset is added. The patches overlap: the filter is shifted by one pixel after each computation. The convolution formula is $x^{l} = \sum_{i=1}^{9}\sum_{j=1}^{9} w_{ij}\, x_{ij}^{l-1} + b$,
where $l$ is the layer number ($l=1$ for this layer), $x$ is the value of a pixel, $i$ and $j$ are the row and column of the pixel (both run from 1 to 9 in this layer), $w$ is the convolution parameter, and $b$ is the offset.
See the second block of Fig. 3 for details, in which each square is one pixel: every 9*9 patch of pixels in the input layer is converted by the convolution into one pixel of the Convolution 1 layer, and the filter moves one pixel at a time. The input layer is 256*256 pixels with 1 feature map; the Convolution 1 layer is 248*248 pixels with 4 feature maps.
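The convolution step can be illustrated with a small pure-Python sketch. It follows the description (weighted sum over each patch plus an offset, filter shifted one pixel at a time, no padding); the function names `conv2d_valid` and `conv_output_size` are hypothetical, and no activation function is applied because this passage does not mention one for this step.

```python
def conv_output_size(input_size, filter_size):
    """Side length after a stride-1 convolution without padding."""
    return input_size - filter_size + 1


def conv2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image one pixel at a time.

    Each output pixel is the weighted sum of the patch under the kernel
    plus the offset `bias`.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = conv_output_size(len(image), kh)
    out_w = conv_output_size(len(image[0]), kw)
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw)) + bias
             for x in range(out_w)]
            for y in range(out_h)]
```

With a 256-pixel side and a 9-pixel filter, `conv_output_size(256, 9)` gives 248, which matches the 248*248 size stated for the Convolution 1 layer. Applying four different kernels to the same input would give the four feature maps.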
2. Going from the Convolution 1 layer to the Subsampling 2 layer is a down-sampling: the pixels of each 4*4 block of that layer are summed, the sum is weighted, and an offset is added. The blocks do not overlap. The down-sampling formula is $x^{l} = \beta \sum_{i=1}^{4}\sum_{j=1}^{4} x_{ij}^{l-1} + b$,
where $l$ is the layer number ($l=2$ for this layer), $x$ is the value of a pixel, $i$ and $j$ are the row and column of the pixel (both run from 1 to 4 in this layer), $\beta$ is the down-sampling parameter, and $b$ is the offset.
See the third block of Fig. 3 for details, in which each square is one pixel: every 4*4 block of pixels in the Convolution 1 layer is down-sampled into 1 pixel of the Subsampling 2 layer. The Convolution 1 layer is 248*248 with 4 feature maps; the Subsampling 2 layer is 62*62 with 4 feature maps.
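A minimal sketch of this down-sampling step in plain Python is shown below. It follows the description (sum each non-overlapping block, multiply by the down-sampling parameter, add the offset); the function name `subsample` and the default parameter values are assumptions made for illustration.

```python
def subsample(feature_map, block, beta=1.0, bias=0.0):
    """Down-sample by summing non-overlapping block x block regions.

    Each output pixel is beta * (sum of the block) + bias.
    """
    out_h = len(feature_map) // block
    out_w = len(feature_map[0]) // block
    return [[beta * sum(feature_map[y * block + i][x * block + j]
                        for i in range(block) for j in range(block)) + bias
             for x in range(out_w)]
            for y in range(out_h)]
```

Because the blocks do not overlap, each side shrinks by the block size: 248 // 4 is 62, matching the 62*62 size stated for the Subsampling 2 layer.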
3. Going from the Subsampling 2 layer to the Convolution 3 layer works the same way as the first convolution. The filter size is again 9*9, and 16 filters are applied at the same time. The Subsampling 2 layer is 62*62 with 4 feature maps; the Convolution 3 layer is 54*54 with 16 feature maps.
4. Going from the Convolution 3 layer to the Subsampling 4 layer is similar to the first down-sampling, except that the pixels of each 9*9 block of the Convolution 3 layer are summed and weighted and an offset is added. The blocks do not overlap.
Here $i$ and $j$ both run from 1 to 9. The Subsampling 4 layer is 6*6 with 16 feature maps.
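The layer sizes quoted in steps 1 to 4 can be checked with a few lines of arithmetic. The sketch below only traces side lengths and feature-map counts; the function name `trace_shapes` and the tuple layout are hypothetical, while the window sizes and map counts are the ones given in the description.

```python
def trace_shapes(input_size=256):
    """Return (layer name, side length, feature maps) for each stage."""
    # (name, kind, window size, number of feature maps) as stated in the text
    layers = [
        ("Convolution1", "conv", 9, 4),
        ("Subsampling2", "pool", 4, 4),
        ("Convolution3", "conv", 9, 16),
        ("Subsampling4", "pool", 9, 16),
    ]
    size = input_size
    shapes = [("Input", size, 1)]
    for name, kind, window, maps in layers:
        # A stride-1 convolution without padding shrinks each side by window - 1;
        # non-overlapping down-sampling divides each side by the window size.
        size = size - window + 1 if kind == "conv" else size // window
        shapes.append((name, size, maps))
    return shapes


# Number of units fed to the fully connected stage: 16 maps of 6*6 pixels.
_, final_size, final_maps = trace_shapes()[-1]
flattened_units = final_maps * final_size * final_size
```

The trace goes 256, 248, 62, 54, 6, and the flattened unit count is 16*6*6, which is the 576 used as the upper bound of k in the first full connection.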
5. The 16 feature maps of 6*6 contain 16*6*6 = 576 feature points in total, which pass through two fully connected transformations. Fully connected means that every output unit is a weighted sum of all input units. The 16*6*6 units of the Subsampling 4 layer are converted by the first full connection into the 50 units of layer 5, and these 50 units are converted by the second full connection into the final 5 grades. The full-connection formula is $x^{l} = \sum_{k} w_{k}^{l}\, x_{k}^{l-1}$,
where $l$ is the layer number ($l=5$ for the first full connection and $l=6$ for the second), $x$ is the value of a pixel, $k$ is the pixel index ($k=1,\dots,576$ for the first full connection and $k=1,\dots,50$ for the second), and $w_{k}^{l}$ is the weight.
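The two full-connection steps and the final grade can be sketched as follows. The weighted sums follow the description; picking the grade with the largest score, the order of the grade names in `GRADES`, and the function names are assumptions, since the text does not say how the 5 output units are turned into a label.

```python
# Grade names from the description; their index order is an assumption.
GRADES = ("excellent", "good", "medium", "poor", "extremely poor")


def fully_connected(inputs, weights):
    """Each output unit is the weighted sum of all input units.

    weights[o][k] is the weight from input unit k to output unit o.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def classify(features, w5, w6):
    """Map flattened features to a grade: 576 -> 50 -> 5 in the patent's sizes."""
    hidden = fully_connected(features, w5)
    scores = fully_connected(hidden, w6)
    # Choosing the highest-scoring unit is an assumed decision rule.
    best = max(range(len(scores)), key=lambda g: scores[g])
    return GRADES[best]
```

For the sizes in the description, `w5` would hold 50 rows of 576 weights and `w6` would hold 5 rows of 50 weights.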
The above embodiments merely illustrate the technical idea of the invention and do not limit its scope of protection; any change made on the basis of the technical solution in accordance with the technical idea proposed by the invention falls within the scope of protection of the invention.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610155947.9A CN105825511B (en) | 2016-03-18 | 2016-03-18 | A kind of picture background clarity detection method based on deep learning |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610155947.9A CN105825511B (en) | 2016-03-18 | 2016-03-18 | A kind of picture background clarity detection method based on deep learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN105825511A (en) | 2016-08-03 |
| CN105825511B (en) | 2018-11-02 |
Family
ID=56523997
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201610155947.9A Expired - Fee Related CN105825511B (en) | 2016-03-18 | 2016-03-18 | A kind of picture background clarity detection method based on deep learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN105825511B (en) |
Families Citing this family (23)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106372402B (en) * | 2016-08-30 | 2019-04-30 | 中国石油大学(华东) | Parallelization method of fuzzy area convolution neural network in big data environment |
| CN106372656B (en) * | 2016-08-30 | 2019-05-10 | 同观科技(深圳)有限公司 | Obtain method, image-recognizing method and the device of the disposable learning model of depth |
| CN106504233B (en) * | 2016-10-18 | 2019-04-09 | 国网山东省电力公司电力科学研究院 | Recognition method and system for power widgets in UAV inspection images based on Faster R-CNN |
| CN106780448B (en) * | 2016-12-05 | 2018-07-17 | 清华大学 | A kind of pernicious categorizing system of ultrasonic Benign Thyroid Nodules based on transfer learning and Fusion Features |
| CN106777986B (en) * | 2016-12-19 | 2019-05-21 | 南京邮电大学 | Based on the ligand molecular fingerprint generation method of depth Hash in drug screening |
| CN108510071B (en) * | 2017-05-10 | 2020-01-10 | 腾讯科技(深圳)有限公司 | Data feature extraction method and device and computer readable storage medium |
| EP3631751A4 (en) * | 2017-05-24 | 2021-02-24 | HRL Laboratories, LLC | LEARNING TO TRANSFER CONVOLUTIONAL NEURAL NETWORKS FROM VISIBLE COLOR SPACE (RBG) TO INFRARED (IR) AREA |
| CN107463937A (en) * | 2017-06-20 | 2017-12-12 | 大连交通大学 | A kind of tomato pest and disease damage automatic testing method based on transfer learning |
| CN107239803A (en) * | 2017-07-21 | 2017-10-10 | 国家海洋局第海洋研究所 | Utilize the sediment automatic classification method of deep learning neutral net |
| CN107506740B (en) * | 2017-09-04 | 2020-03-17 | 北京航空航天大学 | Human body behavior identification method based on three-dimensional convolutional neural network and transfer learning model |
| CN108021936A (en) * | 2017-11-28 | 2018-05-11 | 天津大学 | A kind of tumor of breast sorting algorithm based on convolutional neural networks VGG16 |
| CN108363961A (en) * | 2018-01-24 | 2018-08-03 | 东南大学 | Bridge pad disease recognition method based on transfer learning between convolutional neural networks |
| CN108647588A (en) * | 2018-04-24 | 2018-10-12 | 广州绿怡信息科技有限公司 | Goods categories recognition methods, device, computer equipment and storage medium |
| CN108875794B (en) * | 2018-05-25 | 2020-12-04 | 中国人民解放军国防科技大学 | An Image Visibility Detection Method Based on Transfer Learning |
| CN109003601A (en) * | 2018-08-31 | 2018-12-14 | 北京工商大学 | A cross-lingual end-to-end speech recognition method for the low-resource Tujia language |
| CN109460699B (en) * | 2018-09-03 | 2020-09-25 | 厦门瑞为信息技术有限公司 | Driver safety belt wearing identification method based on deep learning |
| CN109410169B (en) * | 2018-09-11 | 2020-06-05 | 广东智媒云图科技股份有限公司 | Image background interference degree identification method and device |
| CN109472284A (en) * | 2018-09-18 | 2019-03-15 | 浙江大学 | A Cell Defect Classification Method Based on Unbiased Embedding Zero-Shot Learning |
| CN109740495A (en) * | 2018-12-28 | 2019-05-10 | 成都思晗科技股份有限公司 | Outdoor weather image classification method based on transfer learning technology |
| CN111191054B (en) * | 2019-12-18 | 2024-02-13 | 腾讯科技(深圳)有限公司 | Media data recommendation method and device |
| CN111259957A (en) * | 2020-01-15 | 2020-06-09 | 上海眼控科技股份有限公司 | Visibility monitoring and model training method, device, terminal and medium based on deep learning |
| CN111432206B (en) * | 2020-04-24 | 2024-11-26 | 腾讯科技(北京)有限公司 | Video clarity processing method, device and electronic equipment based on artificial intelligence |
| CN115331210B (en) * | 2022-08-23 | 2025-09-30 | 哈尔滨工业大学(威海) | An automatic license plate recognition method for hazy weather environment |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103544705A (en) * | 2013-10-25 | 2014-01-29 | 华南理工大学 | Image quality testing method based on deep convolutional neural network |
| CN105160678A (en) * | 2015-09-02 | 2015-12-16 | 山东大学 | Convolutional-neural-network-based reference-free three-dimensional image quality evaluation method |
| CN105205504A (en) * | 2015-10-04 | 2015-12-30 | 北京航空航天大学 | Image interest region quality evaluation index learning method based on data driving |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9953425B2 (en) * | 2014-07-30 | 2018-04-24 | Adobe Systems Incorporated | Learning image categorization using related attributes |
- 2016-03-18: CN201610155947.9A (CN), patent CN105825511B, status: Expired - Fee Related
Non-Patent Citations (2)
| Title |
|---|
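| Convolutional Neural Networks for No-Reference Image Quality Assessment; Le Kang et al.; CVPR 2014; 20140628; pp. 1-8 * |
| A camera coverage quality evaluation algorithm based on deep convolutional neural networks; Zhu Tao et al.; Journal of Jiangxi Normal University (Natural Science Edition); 20150515; Vol. 39, No. 3; pp. 309-314 * |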
Also Published As
| Publication number | Publication date |
|---|---|
| CN105825511A (en) | 2016-08-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN105825511B (en) | | A kind of picture background clarity detection method based on deep learning |
| Mascarenhas et al. | | A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for Image Classification |
| Chen et al. | | Pavement crack detection and recognition using the architecture of segNet |
| CN104992223B (en) | | Dense crowd estimation method based on deep learning |
| CN104281853B (en) | | An activity recognition method based on 3D convolutional neural networks |
| CN105447473B (en) | | An arbitrary-pose facial expression recognition method based on PCANet-CNN |
| CN105488534B (en) | | Traffic scene deep analysis method, apparatus and system |
| Lim et al. | | Transformed representations for convolutional neural networks in diabetic retinopathy screening. |
| CN116883933A (en) | | A security inspection contraband detection method based on multi-scale attention and data enhancement |
| CN107886133A (en) | | An underground pipeline defect detection method based on deep learning |
| CN107977661B (en) | | Region-of-interest detection method based on FCN and low-rank sparse decomposition |
| CN107316307A (en) | | An automatic tongue image segmentation method for traditional Chinese medicine based on deep convolutional neural networks |
| CN104809443A (en) | | Convolutional neural network-based license plate detection method and system |
| CN109558806A (en) | | Change detection method and system for high-resolution remote sensing imagery |
| CN108229523A (en) | | Image detection, neural network training method, device and electronic equipment |
| CN105005774A (en) | | Face relative relation recognition method based on convolutional neural network and device thereof |
| CN107730473A (en) | | An underground coal mine image processing method based on a deep neural network |
| CN107230205A (en) | | A power transmission line bolt detection method based on convolutional neural networks |
| Yusof et al. | | Automated asphalt pavement crack detection and classification using deep convolution neural network |
| CN108596818A (en) | | An image steganalysis method based on multi-task learning convolutional neural networks |
| CN108596256B (en) | | Object recognition classifier construction method based on RGB-D |
| CN109766823A (en) | | A high-resolution remote sensing ship detection method based on deep convolutional neural network |
| Saleh et al. | | End-to-end tire defect detection model based on transfer learning techniques |
| Zhang et al. | | A deep extractor for visual rail surface inspection |
| CN116893162A (en) | | Rare anti-nuclear antibody karyotype detection method based on YOLO and attention neural network |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181102 |