CN112149538A - A Pedestrian Re-identification Method Based on Multi-task Learning - Google Patents
- Publication number
- CN112149538A CN112149538A CN202010960694.9A CN202010960694A CN112149538A CN 112149538 A CN112149538 A CN 112149538A CN 202010960694 A CN202010960694 A CN 202010960694A CN 112149538 A CN112149538 A CN 112149538A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- attribute
- identity
- loss
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/292—Multi-camera tracking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention proposes a pedestrian re-identification method based on multi-task learning, comprising: acquiring cross-camera pedestrian images and constructing a pedestrian re-identification training data set; and constructing a multi-task learning network that combines the attribute task with the identity recognition task, so that all parameters and losses can be optimized jointly to improve the accuracy of pedestrian re-identification. The attribute task and the identity task each contain classification and verification steps: the classification and verification losses optimize the distances between samples, the identification loss constructs a large class space, and the verification loss optimizes that space by minimizing the distance between similar images and maximizing the distance between different images, thereby improving recognition accuracy.
Description
Technical Field
The invention relates to a pedestrian re-identification method, in particular to a pedestrian re-identification method based on multi-task learning, and belongs to the technical field of deep learning.
Background
Pedestrian re-identification recognizes pedestrian images captured by multiple non-overlapping cameras: a target pedestrian captured by one camera is used as the query, and the same pedestrian must be accurately identified in the images captured by the other cameras. Recognition consists of two processes, extracting robust representation features and applying an effective metric model. Most research work focuses on these two aspects, appearance representation features and similarity metric models; the main task of pedestrian re-identification is to find an effective representation of pedestrian images.
Before the advent of deep learning, early re-identification research focused on hand-crafting better visual features and learning better similarity metrics. With the development of deep learning in recent years, deep learning techniques have been widely applied to the re-identification task. Unlike traditional methods, deep learning methods can automatically extract better pedestrian image features while simultaneously learning a better similarity metric.
Because it is affected by many varying conditions, the recognition accuracy of pedestrian re-identification is still not high, and new, more robust techniques and algorithms are needed. Current research mainly revolves around image feature learning and similarity metrics. Single, single-level image representation features perform differently in different scenes, and subspace techniques that learn discriminative projections from global or local image structure cannot reliably identify images from different cameras.
The specific reasons are analyzed as follows:
1) Representation features of different levels and different parts carry different semantics: images can be described by visual color features, structural features, or depth features from different views, levels, and local regions, each reflecting different semantics. These descriptors are produced in different ways, carry different information, differ in their sensitivity to environmental changes, and are strongly complementary. Research results show that a single type or a single level of representation feature does not match the recognition performance of fusing multiple features at multiple levels. However, fusing many features indiscriminately is not robust either. Feature-fusion techniques with strong environmental adaptability are needed, which in a given scene (i.e., under a given set of complex variations) amplify the contribution of features that are robust to that scene and reduce the contribution of features that are not. For this reason, representation features suited to different variations must be studied.
2) Deep convolutional neural networks show excellent performance in image classification, and using them to learn deep pedestrian representation features yields good discriminative ability and good robustness to noise. However, network training requires a large amount of labeled data, and labeling is time-consuming and labor-intensive; it is therefore necessary to study how to increase the training data of deep networks and improve the robustness of deep features.
3) Intra-class and inter-class differences between samples: for re-identification data sets, intra-class variation and inter-class confusion are common problems. Intra-class variation refers to the diversity of the same pedestrian under different cameras, such as pose, angle, and appearance; inter-class confusion refers to the similar appearance of different pedestrians under the cameras. These sample differences arise from the complexity introduced by crossing cameras, which makes such features difficult for a model to learn and match.
Summary of the Invention
The purpose of the present invention is to provide a pedestrian re-identification method based on multi-task learning, which addresses the fact that the recognition task identifies only a pedestrian's overall silhouette without considering local information, and that the attribute task does not consider attribute verification, thereby improving recognition accuracy.
The object of the present invention is achieved as follows: a pedestrian re-identification method based on multi-task learning, comprising the following steps:
Step 1) Acquire cross-camera pedestrian images and construct a pedestrian re-identification training data set containing a preset number of pedestrian images;
Step 2) Use ResNet-50 as the backbone network and pre-train an ImageNet pre-trained model on the pedestrian re-identification data to obtain a pedestrian re-identification pre-trained network model; fine-tune the pre-trained network with identity labels and attribute labels respectively to produce an identity recognition network and an attribute recognition network;
Step 3) Divide the identity recognition network into an identity classification task and an identity verification task, and the attribute recognition network into an attribute classification task and an attribute verification task; in the identity classification task, use a loss function for identity prediction and compute the classification loss on the feature vector output for each sample image;
Step 4) In the identity verification task, given two images, input them into the pedestrian re-identification pre-trained network model, obtain the global feature vectors of the two images, and compute the loss for the image pair; select N image pairs and compute the total verification loss;
Step 5) In the attribute classification task, use M loss functions for attribute prediction, then compute the losses of the m attributes of the M networks;
Step 6) The attribute verification task judges whether the extracted attribute features match the pedestrian's characteristics; compute the total loss of the attribute verification task with a contrastive loss function;
Step 7) The total loss produced by the four tasks is adjusted with balancing parameters α and β, where α balances the contributions of identity and attributes and β balances the contributions of classification and verification.
As a further improvement of the present invention, step 1) specifically includes:
1.1) Using a Gaussian mixture model to detect the foreground of pedestrian images;
1.2) According to the detection result of step 1.1, if a video frame contains a moving foreground, using a pre-trained pedestrian detector to detect pedestrians, accurately locate their positions, and crop the corresponding region from the video frame as the pedestrian image;
1.3) According to the detection result of step 1.1, if the Gaussian mixture model detects no moving foreground, the pedestrian detector is not executed;
1.4) Manually labeling the same pedestrian extracted from different cameras as the same class and assigning it a number, with no fewer than a preset number of samples per pedestrian class; different pedestrian classes are given different numbers, and the attribute labels of each pedestrian are annotated by hand. The above sample-collection process is iterated, and data collection can stop once the training data set contains the preset number of pedestrian images.
As a further improvement of the present invention, step 2) specifically includes:
2.1) Obtaining a deep convolutional network model pre-trained on the ImageNet data set and training it on the pedestrian re-identification data;
2.2) When pre-training the deep convolutional neural network model on the re-identification data, fine-tuning the model separately with the sample identity information and the attribute information: to the ResNet-50 network model pre-trained on ImageNet, an average pooling layer is added, and a dropout layer is inserted after the average pooling layer.
As a further improvement of the present invention, step 3) specifically includes:
3.1) Obtaining image groups from the data set, each image group containing identity labels and attribute labels;
3.2) Using a loss function for identity prediction and obtaining the predicted probability of the global pedestrian image from the identity classification network;
3.3) Defining the loss function of the identity classification network from the predicted probability.
As a further improvement of the present invention, step 4) specifically includes:
4.1) Given two images I and J, extracting the corresponding high-level features f_a and f_b through the identity recognition network;
4.2) Estimating similarity with a contrastive loss function from the features extracted in step 4.1;
4.3) From the loss function for a pair of images, computing the verification loss over the N image pairs in each batch.
As a further improvement of the present invention, step 5) specifically includes:
5.1) Given a sample image I, assuming it carries attributes in m categories;
5.2) Using loss functions for attribute prediction and obtaining the predicted probabilities of the pedestrian image's attributes from the attribute classification network;
5.3) Defining the loss function of the attribute classification network from the predicted probabilities.
As a further improvement of the present invention, step 7) specifically includes:
7.1) Combining the four losses produced in steps 3), 4), 5), and 6) to generate the final multi-task loss;
7.2) Using the parameter α to balance the losses of identity and attributes;
7.3) Using the parameter β to balance the losses of classification and verification.
Compared with the prior art, the above technical solution of the present invention has the following technical effects. The existing end-to-end deep learning framework is improved by combining the attribute task with the identity recognition task in a multi-task learning network. Pedestrian re-identification is the task of finding a queried person across non-overlapping cameras, and the goal is to find the queried person; attribute recognition aims to predict the presence of a set of attributes in an image. Attributes describe a person's details, including gender, accessories, clothing color, and so on, while the recognition task judges whether an image shows the person being queried. With attribute labels, the model can learn to classify pedestrians by explicitly attending to local semantic descriptions, which greatly simplifies training and exploits the complementary cues in attribute labels to improve large-scale person re-identification. Combining the pedestrian attribute task with the pedestrian recognition task integrates multi-context information and supplies complementary information from different perspectives: attributes focus on a person's local information, while identity attends more to the overall silhouette and appearance. The recognition loss constructs a large class space, and the verification loss optimizes that space so that the distance between similar images becomes smaller and the distance between different images becomes larger. By combining classification and verification in both the identity and attribute tasks, and by embedding verification into the attribute task, the method compensates to some extent for the shortcomings of attribute recognition models.
Description of Drawings
FIG. 1 is an overview diagram of the multi-task learning method according to an embodiment of the present invention.
FIG. 2 is the architecture of the multi-task-learning-based network according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, a pedestrian re-identification method based on multi-task learning includes the following steps.
Step 1) Acquire cross-camera pedestrian images and construct a pedestrian re-identification training data set containing a preset number of pedestrian images;
In this embodiment, a Gaussian mixture model is used to detect the foreground of pedestrian images. According to the detection result, if a video frame contains a moving foreground, a pre-trained pedestrian detector detects the pedestrians and accurately locates their positions, and the corresponding region is cropped from the video frame as the pedestrian image; if the Gaussian mixture model detects no moving foreground, the pedestrian detector is not executed. The same pedestrian extracted from different cameras is manually labeled as the same class and assigned a number, with no fewer than a preset number of samples per class; different pedestrian classes receive different numbers, and each pedestrian's attribute labels are annotated by hand. The above sample-collection process is iterated, and data collection stops once the training data set contains the preset number of pedestrian images.
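The control flow of steps 1.1–1.4 can be sketched as follows. `detect_foreground` and `detect_pedestrians` are hypothetical stand-ins for the Gaussian mixture model and the pre-trained pedestrian detector, which the patent does not specify further; the frame data is illustrative.

```python
def detect_foreground(frame):
    """Stub for the Gaussian-mixture-model foreground detector (step 1.1)."""
    return frame.get("has_motion", False)

def detect_pedestrians(frame):
    """Stub for the pre-trained pedestrian detector (step 1.2):
    returns cropped pedestrian regions from the frame."""
    return frame.get("pedestrian_crops", [])

def collect_dataset(frames, target_size):
    """Iterate frames until the data set reaches the preset size (step 1.4)."""
    dataset = []
    for frame in frames:
        if not detect_foreground(frame):   # step 1.3: skip the detector
            continue
        for crop in detect_pedestrians(frame):
            # identity number and attribute labels are assigned manually
            dataset.append({"image": crop,
                            "identity": frame["identity"],
                            "attributes": frame["attributes"]})
        if len(dataset) >= target_size:
            break
    return dataset

frames = [
    {"has_motion": False},
    {"has_motion": True, "pedestrian_crops": ["crop_a"],
     "identity": 1, "attributes": {"gender": "female", "bag": True}},
    {"has_motion": True, "pedestrian_crops": ["crop_b"],
     "identity": 1, "attributes": {"gender": "female", "bag": True}},
]
data = collect_dataset(frames, target_size=2)
```

Frames with no moving foreground are skipped entirely, matching step 1.3, and collection halts as soon as the preset data-set size is reached.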
Step 2) Use ResNet-50 as the backbone network and pre-train an ImageNet pre-trained model on the pedestrian re-identification data to obtain a pedestrian re-identification pre-trained network model; fine-tune the pre-trained network with identity labels and attribute labels respectively to produce an identity recognition network and an attribute recognition network;
In this embodiment, the networks are all convolutional neural networks composed of several convolutional units and pooling layers, where each convolutional unit consists of a batch normalization layer, a convolutional layer, and a nonlinear activation layer. In recent years, convolutional networks in deep learning have shown very good results in extracting high-level image features, but the information extracted by convolution kernels lacks sufficient prior information about the target; the attribute task is therefore combined with the recognition task, using attributes to supplement identity recognition and improve the accuracy of pedestrian recognition;
A deep convolutional network model pre-trained on the ImageNet data set is obtained and trained on the pedestrian re-identification data. When pre-training on the re-identification data, the model is fine-tuned separately with the sample identity information and the attribute information: to the ResNet-50 model pre-trained on ImageNet, an average pooling layer is added and a dropout layer is inserted after it. The identity sub-network and the attribute sub-network have the same structure and share parameters.
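The two head operations added to the backbone, average pooling followed by dropout, can be illustrated in isolation. This is a minimal plain-Python sketch of the two operations on a toy feature map, not the actual ResNet-50 implementation; dimensions are illustrative.

```python
import random

def global_average_pool(feature_map):
    """Average pooling over spatial positions: feature_map is a list of
    per-position vectors, each with the same channel dimension."""
    channels = len(feature_map[0])
    n = len(feature_map)
    return [sum(pos[c] for pos in feature_map) / n for c in range(channels)]

def dropout(vector, rate, training, rng):
    """Inverted dropout: zeroes entries with probability `rate` at training
    time (scaling survivors by 1/keep); identity at test time."""
    if not training:
        return vector
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]

rng = random.Random(0)
fmap = [[1.0, 2.0], [3.0, 4.0]]        # 2 spatial positions x 2 channels
pooled = global_average_pool(fmap)      # per-channel mean over positions
feat = dropout(pooled, rate=0.5, training=False, rng=rng)
```

At inference time (`training=False`) the dropout layer passes the pooled feature through unchanged, which is why the layer only regularizes fine-tuning.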
Step 3) The identity recognition network is divided into an identity classification task and an identity verification task; in the identity classification task, a loss function is used for identity prediction, and the classification loss is computed on the feature vector output for each sample image;
In this embodiment, the multi-task learning network takes a single image as input and outputs the image's global features. Identity features are extracted from the global features through the identity network layer FC_IC, and a loss function is used for identity prediction, which expresses the similarity between samples well. Let I = {x_n, k_n, a_n} denote the image group, where k_n is the identity label and a_n the attribute label; each image x_n carries a subset of the identity labels k ∈ {1, …, K} and a subset of the attribute labels. For a training image x the network outputs z = [z_1, z_2, …, z_K], and the probability of each identity label k ∈ {1, 2, …, K} is predicted as p̂ by a softmax over z. The identity classification loss is defined as the cross-entropy L_id = −Σ_{i=1}^{K} p_i · log(p̂_i), where p̂_i is the predicted probability and p_i the target probability; all p_i = 0 except p_{i=k} = 1, and f is the 2048-dimensional feature.
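A minimal sketch of this identity classification loss, a softmax over the K logits followed by cross-entropy against the one-hot target; the logits below are illustrative values, not taken from the patent.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def identity_classification_loss(logits, target_class):
    """Cross-entropy with a one-hot target: p_i = 1 only for i = k,
    so the sum reduces to -log of the predicted probability of class k."""
    p_hat = softmax(logits)
    return -math.log(p_hat[target_class])

loss = identity_classification_loss([2.0, 1.0, 0.1], target_class=0)
```

Because the target distribution is one-hot, the full sum −Σ p_i·log(p̂_i) collapses to −log(p̂_k), which is what the function computes.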
Step 4) In the identity verification task, given two images, input them into the pedestrian re-identification pre-trained network model, obtain the global feature vectors of the two images, and compute the loss for the image pair; select N image pairs and compute the total verification loss;
In this embodiment, the objective function is evaluated with respect to the parameters of each network layer by forward propagation through the network, and the parameters of each layer are updated by stochastic gradient descent;
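The per-layer stochastic-gradient-descent update described above amounts to subtracting the learning rate times the gradient from each parameter; a one-function sketch with illustrative values:

```python
def sgd_step(params, grads, lr):
    """Vanilla stochastic-gradient-descent update for one layer's parameters:
    p <- p - lr * dL/dp, applied element-wise."""
    return [p - lr * g for p, g in zip(params, grads)]

params = sgd_step([1.0, -2.0], [0.5, -0.5], lr=0.1)
```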
In this embodiment, given two images I and J, the corresponding high-level features f_a and f_b are extracted through the FC_I layer of the identity recognition network, and y is the label of the image pair. Similarity is estimated from the extracted features f_a and f_b with a contrastive loss function; the image-pair loss function is defined as:
L_v(f_a, f_b, y) = ½(1 − y)‖f_a − f_b‖² + ½ y · max(θ − ‖f_a − f_b‖², 0) + λ(‖|f_a| − 1‖_1 + ‖|f_b| − 1‖_1), where distance is measured with the Euclidean distance, ‖·‖_1 denotes the L1 norm of a vector, |·| denotes the element-wise absolute value, λ is a weighting parameter that controls the strength of the regularizer, and θ > 0 is a margin threshold parameter. In the present invention the parameters are set to λ = 0.01 and θ = 1024. If the two images are similar then y = 0; otherwise y = 1. The first penalty term penalizes similar images that are mapped to different codes; the second penalizes different images that are mapped to similar codes when their Euclidean distance falls below the threshold θ; the last term pushes the outputs toward the required discrete values (+1/−1), ensuring the convergence of the network. Assuming N pairs are selected at random from a mini-batch, the final objective is to minimize the total loss, defined as the sum of L_v(f_a, f_b, y) over the N pairs.
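A sketch of this pair loss assembled from the three penalty terms described in the text; since the original formula image is not reproduced in this record, the exact shape of each term below is a reconstruction, and the toy ±1 feature vectors are illustrative.

```python
def pair_loss(fa, fb, y, lam=0.01, theta=1024.0):
    """Contrastive verification loss for one image pair: y = 0 for a similar
    pair, y = 1 for a dissimilar pair; theta is the margin threshold and lam
    weights the regularizer that pushes outputs toward +/-1."""
    d2 = sum((a - b) ** 2 for a, b in zip(fa, fb))    # squared Euclidean distance
    similar_term = 0.5 * (1 - y) * d2                 # penalizes distant similar pairs
    dissimilar_term = 0.5 * y * max(theta - d2, 0.0)  # penalizes close dissimilar pairs
    reg = lam * (sum(abs(abs(a) - 1.0) for a in fa) +
                 sum(abs(abs(b) - 1.0) for b in fb))  # drives entries toward +/-1
    return similar_term + dissimilar_term + reg

def verification_loss(pairs, lam=0.01, theta=1024.0):
    """Total verification loss over the N pairs of a mini-batch."""
    return sum(pair_loss(fa, fb, y, lam, theta) for fa, fb, y in pairs)

fa, fb = [1.0, -1.0], [1.0, -1.0]
same = pair_loss(fa, fb, y=0)    # identical +/-1 codes incur zero loss
```

A similar pair with identical ±1 codes contributes nothing, while a dissimilar pair whose squared distance is below θ is penalized by the margin term.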
Step 5) In the attribute classification task, use M loss functions for attribute prediction, then compute the losses of the m attributes of the M networks;
In this embodiment, similar to step 3), given I = {x_n, k_n, a_n} and the subset of attribute labels, M loss functions are used for attribute prediction. Let p_{i,j} denote the probability that attribute i is assigned to attribute category j ∈ {1, 2, …, m}; the cross-entropy cost function for attribute classification can then be defined as L_attr = −Σ_{i=1}^{M} Σ_{j=1}^{m} p_{i,j} · log(p̂_{i,j}); that is, the total attribute classification loss is the sum of the per-attribute classification losses. Let y_i be the true label of attribute i; then p_{i,j} = 1 when j = y_i and p_{i,j} = 0 otherwise.
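A minimal sketch of the total attribute classification loss as a sum of per-attribute cross-entropies. The two attributes below (a 2-category and a 3-category one) and their uniform logits are hypothetical, chosen only so the result is easy to check.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [x / s for x in e]

def attribute_classification_loss(attr_logits, attr_targets):
    """Total attribute classification loss: the sum over the M attributes of
    a one-hot cross-entropy, each attribute i having its own category set."""
    total = 0.0
    for logits, target in zip(attr_logits, attr_targets):
        total += -math.log(softmax(logits)[target])
    return total

# two hypothetical attributes: e.g. gender (2 categories), clothing color (3)
logits = [[0.0, 0.0], [0.0, 0.0, 0.0]]
loss = attribute_classification_loss(logits, [1, 2])
```

With uniform logits the two terms are log 2 and log 3, so the total is log 6, matching the "sum of per-attribute losses" definition.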
Step 6) The attribute verification task judges whether the extracted attribute features match the pedestrian's characteristics; the total loss of the attribute verification task is computed with a contrastive loss function;
In this embodiment, the attribute verification task is realized by combining several different contrastive losses. Similar to step 4), given I = {x_n, k_n, a_n} and the subset of attribute labels, and assuming attribute i has categories j ∈ {1, 2, …, m}, the attribute verification loss can be defined as the sum over the M attributes of the pairwise contrastive loss of step 4) applied to the corresponding attribute features: L_av = Σ_{i=1}^{M} L_v(f_a^(i), f_b^(i), y).
Step 7) The total loss produced by the four tasks is adjusted with balancing parameters α and β, where α balances the contributions of identity and attributes and β balances the contributions of classification and verification;
In this embodiment, the four losses produced in steps 3), 4), 5), and 6) are combined to generate the final multi-task loss; the multi-task learning network is trained to predict identity and attributes simultaneously. The final loss L is a weighted sum of the four loss values, in which the parameter α balances the losses of identity and attributes and β balances the losses of classification and verification.
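One plausible way to combine the four losses with α and β; since the patent states only that L is a weighted sum with these two balancing roles, the exact placement of the parameters below is an assumption, and the four loss values are illustrative.

```python
def multi_task_loss(l_id_cls, l_id_ver, l_attr_cls, l_attr_ver, alpha, beta):
    """Hypothetical weighted combination of the four losses: beta balances
    classification against verification within each branch, and alpha
    balances the identity branch against the attribute branch."""
    identity = l_id_cls + beta * l_id_ver      # identity classification + verification
    attribute = l_attr_cls + beta * l_attr_ver # attribute classification + verification
    return identity + alpha * attribute

L = multi_task_loss(1.0, 2.0, 3.0, 4.0, alpha=0.5, beta=0.1)
```

Setting α = 0 recovers a pure identity model, and β = 0 recovers pure classification, which is a convenient sanity check when tuning the two parameters.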
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any transformation or substitution that a person familiar with the technology could conceive within the technical scope disclosed by the present invention shall be covered by the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010960694.9A CN112149538A (en) | 2020-09-14 | 2020-09-14 | A Pedestrian Re-identification Method Based on Multi-task Learning |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN112149538A true CN112149538A (en) | 2020-12-29 |
Family
ID=73892207
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010960694.9A Withdrawn CN112149538A (en) | 2020-09-14 | 2020-09-14 | A Pedestrian Re-identification Method Based on Multi-task Learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112149538A (en) |
- 2020-09-14 — CN application CN202010960694.9A filed; patent CN112149538A/en, not active (Withdrawn)
Cited By (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112613474B (en) * | 2020-12-30 | 2022-01-18 | 珠海大横琴科技发展有限公司 | Pedestrian re-identification method and device |
| CN112613474A (en) * | 2020-12-30 | 2021-04-06 | 珠海大横琴科技发展有限公司 | Pedestrian re-identification method and device |
| CN112784772A (en) * | 2021-01-27 | 2021-05-11 | 浙江大学 | In-camera supervised cross-camera pedestrian re-identification method based on contrastive learning |
| CN112784772B (en) * | 2021-01-27 | 2022-05-27 | 浙江大学 | An in-camera supervised cross-camera pedestrian re-identification method based on contrastive learning |
| CN112801051A (en) * | 2021-03-29 | 2021-05-14 | 哈尔滨理工大学 | Method for re-identifying blocked pedestrians based on multitask learning |
| CN113033428A (en) * | 2021-03-30 | 2021-06-25 | 电子科技大学 | Pedestrian attribute identification method based on instance segmentation |
| CN113128441A (en) * | 2021-04-28 | 2021-07-16 | 安徽大学 | System and method for identifying vehicle weight by embedding structure of attribute and state guidance |
| CN113128441B (en) * | 2021-04-28 | 2022-10-14 | 安徽大学 | System and method for identifying vehicle weight by embedding structure of attribute and state guidance |
| CN113807200B (en) * | 2021-08-26 | 2024-04-19 | 青岛文达通科技股份有限公司 | Multi-pedestrian identification method and system based on a dynamic-fitting multi-task reasoning network |
| CN113807200A (en) * | 2021-08-26 | 2021-12-17 | 青岛文达通科技股份有限公司 | Multi-pedestrian identification method and system based on a dynamic-fitting multi-task reasoning network |
| CN114155554A (en) * | 2021-12-02 | 2022-03-08 | 东南大学 | Transformer-based camera-domain-adaptive pedestrian re-identification method |
| CN114155554B (en) * | 2021-12-02 | 2025-07-22 | 东南大学 | Transformer-based camera-domain-adaptive pedestrian re-identification method |
| CN114241380A (en) * | 2021-12-16 | 2022-03-25 | 之江实验室 | Multi-task attribute scene recognition method based on category label and attribute annotation |
| CN115187816A (en) * | 2022-08-04 | 2022-10-14 | 南京工业大学 | Traditional Chinese medicine decoction piece identification and classification method based on multi-attribute auxiliary task learning |
| CN115909464B (en) * | 2022-12-26 | 2024-03-26 | 淮阴工学院 | Adaptive weakly supervised label annotation method for pedestrian re-identification |
| CN115909464A (en) * | 2022-12-26 | 2023-04-04 | 淮阴工学院 | Adaptive weakly supervised label annotation method for pedestrian re-identification |
| CN120032420A (en) * | 2024-12-30 | 2025-05-23 | 北京信息科技大学 | A person re-identification method based on attribute information constraints |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112149538A (en) | A Pedestrian Re-identification Method Based on Multi-task Learning | |
| CN111709311B (en) | Pedestrian re-identification method based on multi-scale convolution feature fusion | |
| CN110807434B (en) | Pedestrian re-recognition system and method based on human body analysis coarse-fine granularity combination | |
| CN110070066B (en) | A video pedestrian re-identification method and system based on attitude key frame | |
| CN111126482B (en) | Remote sensing image automatic classification method based on multi-classifier cascade model | |
| CN110580460A (en) | Pedestrian re-identification method based on joint identification and verification of pedestrian identity and attribute features | |
| CN109255289B (en) | Cross-aging face recognition method based on unified generation model | |
| CN110781829A (en) | A lightweight deep learning face recognition method for smart business halls | |
| CN106529499A (en) | Fourier descriptor and gait energy image fusion feature-based gait identification method | |
| CN104504366A (en) | System and method for smiling face recognition based on optical flow features | |
| CN116778277B (en) | Cross-domain model training method based on progressive information decoupling | |
| CN110033007A (en) | Attribute recognition approach is worn clothes based on the pedestrian of depth attitude prediction and multiple features fusion | |
| TWI525574B (en) | Collaborative face annotation method and collaborative face annotation system | |
| CN110728216A (en) | Unsupervised pedestrian re-identification method based on pedestrian attribute adaptive learning | |
| CN103150546A (en) | Video face identification method and device | |
| CN111339849A (en) | A Pedestrian Re-identification Method Based on Pedestrian Attributes | |
| CN110163117A (en) | A kind of pedestrian's recognition methods again based on autoexcitation identification feature learning | |
| CN114764869A (en) | Multi-object detection with single detection per object | |
| CN111815582A (en) | A two-dimensional code region detection method with improved background prior and foreground prior | |
| CN111008575A (en) | A robust face recognition method based on multi-scale context information fusion | |
| Andiani et al. | Face recognition for work attendance using multitask convolutional neural network (MTCNN) and pre-trained facenet | |
| CN110458064B (en) | Low-altitude target detection and recognition method combining data-driven and knowledge-driven approaches | |
| CN110781817A (en) | Pedestrian re-identification method for solving component misalignment | |
| CN112613474B (en) | Pedestrian re-identification method and device | |
| CN112446305B (en) | A person re-identification method based on classification weighted equidistant distribution loss model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WW01 | Invention patent application withdrawn after publication | Application publication date: 20201229 |