
CN112434178B - Image classification method, device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112434178B
CN112434178B (application CN202011325685.9A)
Authority
CN
China
Prior art keywords
sample
image
target
model
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011325685.9A
Other languages
Chinese (zh)
Other versions
CN112434178A (en)
Inventor
申世伟
李家宏
李思则
李岩
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202011325685.9A
Publication of CN112434178A
Priority to PCT/CN2021/114146 (WO2022105336A1)
Application granted
Publication of CN112434178B


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55Clustering; Classification
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image classification method and apparatus, an electronic device, and a storage medium. The method addresses the low recall rate and accuracy that result when a single model is used to detect images in the related art. In an embodiment of the application, a target image is first acquired, and feature extraction is performed on it to obtain first image features of the target image. A first probability that the target image belongs to the target category is then determined from the first image features. When the first probability is higher than a first probability threshold, second image features are extracted from the target image, and user features of a target object associated with the target image are acquired; a decision tree fuses the second image features and the user features to obtain a second probability that the target image belongs to the target category. When the second probability is higher than a second probability threshold, the target object is determined to belong to the target category.

Description

Image classification method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image detection, and in particular, to an image classification method, apparatus, electronic device, and storage medium.
Background
With the development of computer vision technology, image content understanding and analysis is becoming increasingly intelligent. Classification based on image information is an important application of computer vision.
As the volume of image information grows, efficient image classification becomes particularly important in special scenarios such as security auditing and abnormal-behavior detection. In these scenarios, the natural occurrence rate of a certain class of images may be very low (parts per million); for example, only a few target images may appear among tens of thousands of pictures.
In scenarios requiring a high recall rate for the target image, image classification techniques in the related art are difficult to apply: their recall rate and classification accuracy can hardly meet the service requirement.
Disclosure of Invention
The application aims to provide an image classification method, apparatus, electronic device, and storage medium for solving the problem of low recall rate and accuracy when a single model is used to classify pictures.
In a first aspect, an embodiment of the present application provides an image classification method, including:
acquiring a target image;
extracting features of the target image to obtain first image features of the target image;
determining, using the first image features of the target image, a first probability that the target image belongs to a target category;
when the first probability is higher than a first probability threshold, extracting second image features from the target image, and acquiring user features of a target object associated with the target image; and fusing, using a decision tree, the second image features and the user features to obtain a second probability that the target image belongs to the target category;
and when the second probability is higher than a second probability threshold, determining that the target object belongs to the target category.
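The two-stage cascade above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names, thresholds, and the decision to route borderline results to `None` (standing in for a review queue) are all assumptions; the real first model is a trained classifier and the real second model a decision-tree fuser.

```python
from typing import Callable, Optional

def classify(target_image,
             user_features,
             first_model: Callable,       # image -> first probability
             second_model: Callable,      # (image feats, user feats) -> second probability
             extract_features: Callable,  # image -> second image features
             t1: float = 0.5,             # first probability threshold (assumed)
             t2: float = 0.9) -> Optional[bool]:
    """True: judged to belong to the target class; False: judged not to;
    None: borderline result left for further handling."""
    p1 = first_model(target_image)
    if p1 <= t1:                          # stage 1: high-recall filter
        return False
    feats = extract_features(target_image)
    p2 = second_model(feats, user_features)  # stage 2: high-precision fusion
    if p2 > t2:
        return True
    return None                           # borderline: e.g. designated task set
```

A dummy run shows the three outcomes: a low first probability short-circuits to `False` without ever invoking the second model.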
In some embodiments, after determining the first probability that the target image belongs to the target category, the method further includes:
when the first probability is less than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, after determining the second probability that the target image belongs to the target class, the method further includes:
when the second probability is higher than a preset probability threshold and lower than the second probability threshold, assigning the target object to a designated task set;
and when the second probability is lower than the preset probability threshold, determining that the target image does not belong to the target category.
In some embodiments, a precision-recall curve is pre-stored, where the curve describes the association between a recall parameter characterizing the recall rate, an accuracy parameter characterizing the judgment accuracy for the target class, and the preset probability threshold;
the preset probability threshold is set according to preset recall-rate and accuracy-rate indices.
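Threshold selection from such a pre-stored curve can be sketched as a simple lookup. The curve values below are made up for illustration; the disclosure does not specify the curve's data points or the selection rule.

```python
# Pre-stored (threshold, precision, recall) points — illustrative values only.
curve = [
    (0.1, 0.40, 0.99),
    (0.3, 0.60, 0.95),
    (0.5, 0.75, 0.90),
    (0.7, 0.90, 0.80),
    (0.9, 0.98, 0.60),
]

def pick_threshold(curve, min_precision, min_recall):
    """Return the first threshold whose stored precision and recall both
    meet the given indices, or None if no point on the curve qualifies."""
    for t, p, r in curve:
        if p >= min_precision and r >= min_recall:
            return t
    return None

print(pick_threshold(curve, min_precision=0.7, min_recall=0.85))  # -> 0.5
```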
In some embodiments, feature extraction is performed on the target image using a pre-trained first model, which also determines the first probability that the target image belongs to the target class; the first model is trained according to the following method:
acquiring a first sample set, where the first sample set includes a plurality of sample images, each associated with a pre-labeled category;
filtering out, from the first sample set, sample images that do not belong to the target category but whose similarity to the target category is higher than a specified similarity, to obtain a target sample set;
taking the sample images in the target sample set as input to the first model and the categories of the sample images as its expected output, and training the first model until training converges.
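The train-until-convergence loop can be sketched as below. The stand-in model is a single logistic unit on a scalar feature purely for illustration; the patent's first model would be a neural network, and the learning rate, tolerance, and data are assumptions.

```python
import math

def train_first_model(samples, lr=0.5, epochs=500, tol=1e-6):
    """samples: list of (feature, label), label 1 for the target class.
    Gradient descent on binary cross-entropy until the loss converges."""
    w, b = 0.0, 0.0
    prev_loss = float("inf")
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # model output
            total += -(y * math.log(p + 1e-9) + (1 - y) * math.log(1 - p + 1e-9))
            grad = p - y                               # dLoss/dlogit
            w -= lr * grad * x
            b -= lr * grad
        loss = total / len(samples)
        if abs(prev_loss - loss) < tol:                # convergence check
            break
        prev_loss = loss
    return lambda x: 1.0 / (1.0 + math.exp(-(w * x + b)))

first_model = train_first_model([(0.0, 0), (0.2, 0), (0.8, 1), (1.0, 1)])
```

After training on this separable toy set, the returned model scores target-like inputs above 0.5 and others below it.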
In some embodiments, a second model based on a decision tree fuses the second image features and the user features to obtain the second probability that the target image belongs to the target category; the second model is trained according to the following method:
acquiring sample images whose probability of belonging to the target category is greater than a specified probability threshold, to construct a second sample set;
for any sample image in the second sample set, extracting image features from the sample image and acquiring user features of a target object associated with the sample image;
and training the second model using the image features and the user features until the training of the second model converges.
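The fusion step can be sketched by concatenating image and user features and fitting a decision tree. For brevity the sketch fits only a depth-1 stump; the patent's second model would be a full decision-tree learner, and the toy features and labels are assumptions.

```python
def fuse(image_feats, user_feats):
    return list(image_feats) + list(user_feats)   # simple concatenation

def fit_stump(rows, labels):
    """Exhaustively pick the (feature, threshold) split with the fewest
    misclassifications; each leaf predicts its positive-class fraction."""
    best = None  # (errors, feature index, threshold, p_left, p_right)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            p_l, p_r = sum(left) / len(left), sum(right) / len(right)
            errors = sum(((p_l if r[j] <= t else p_r) >= 0.5) != bool(y)
                         for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, j, t, p_l, p_r)
    _, j, t, p_l, p_r = best
    return lambda r: p_l if r[j] <= t else p_r    # second probability

# toy fused samples: [image feature, user feature] -> label (assumed data)
rows = [fuse([0.9], [0.8]), fuse([0.2], [0.1]),
        fuse([0.8], [0.7]), fuse([0.1], [0.3])]
second_model = fit_stump(rows, [1, 0, 1, 0])
```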
In some embodiments, samples of the target class are positive samples and samples not belonging to the target class are negative samples;
when the positive-sample recall rate needs to be improved, the loss weight of the first model is set to a value greater than 1;
and when the negative-sample recall rate needs to be improved, the loss weight of the first model is set to a value less than 1.
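The loss-weight mechanism can be sketched with a weighted binary cross-entropy: a positive-sample weight above 1 makes missing a positive costlier (pushing recall on positives up), while a weight below 1 has the opposite effect. The weight value 3.0 below is illustrative, not from the disclosure.

```python
import math

def weighted_cross_entropy(p, y, pos_weight=1.0):
    """Binary cross-entropy with the positive-sample term scaled by pos_weight."""
    eps = 1e-9
    return -(pos_weight * y * math.log(p + eps)
             + (1 - y) * math.log(1 - p + eps))

# With pos_weight > 1, a missed positive costs more than a missed negative
# that is misclassified with the same confidence:
loss_missed_pos = weighted_cross_entropy(0.1, 1, pos_weight=3.0)  # true 1, predicted 0.1
loss_missed_neg = weighted_cross_entropy(0.9, 0, pos_weight=3.0)  # true 0, predicted 0.9
```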
In some embodiments, the filtering out of sample images in the first sample set that do not belong to the target class and whose similarity to the target class is higher than the specified similarity includes:
before the first model is trained on the filtered set, training the first model with the sample images in the first sample set as input and the categories of the sample images as expected output, until training converges;
inputting each sample image in the first sample set into this first model to obtain the probability that the sample image belongs to the target category, used as its similarity to the target category;
and if a sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity, filtering out that sample image.
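The filtering pass can be sketched as follows. The 0.6 cut-off stands in for the "specified similarity", and the toy scores are assumptions; the model here is any callable scoring an image's probability of belonging to the target class.

```python
def filter_confusing_negatives(samples, model, specified_similarity=0.6):
    """samples: list of (image, label); label 1 means target class.
    Keeps all positives, and keeps negatives only when their model score
    (used as the similarity to the target class) does not exceed the cut-off."""
    kept = []
    for image, label in samples:
        similarity = model(image)          # probability of the target class
        if label == 0 and similarity > specified_similarity:
            continue                       # confusing negative: filter out
        kept.append((image, label))
    return kept

# toy scores from a hypothetical pre-trained first model:
scores = {"cat": 0.95, "dog": 0.70, "car": 0.05}
samples = [("cat", 1), ("dog", 0), ("car", 0)]
target_set = filter_confusing_negatives(samples, scores.get)
```

Here the "dog" negative scores 0.70 against the target class, above the cut-off, so it is removed from the target sample set.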
In some embodiments, acquiring the sample images whose probability of belonging to the target class is greater than the specified probability threshold to construct the second sample set includes:
cropping each sample image in the second sample set multiple times to obtain a plurality of cropped sample images;
and acquiring the categories of the sample images in the second sample set and of the cropped sample images, and constructing a third sample set composed of these sample images and their corresponding categories.
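The multi-crop expansion can be sketched as below, with each crop inheriting the original image's category. The string-tagged "crops" and the crop count of 3 are stand-ins; a real version would return actual sub-regions of the image.

```python
def multi_crop(image, n_crops=3):
    """Stand-in for real image cropping: a real version would return
    n_crops sub-regions; here each 'crop' is just tagged with its index."""
    return [f"{image}#crop{i}" for i in range(n_crops)]

def expand_sample_set(second_set):
    """second_set: list of (image, category). Returns the third sample set:
    the originals plus all crops, each paired with the original category."""
    third_set = []
    for image, category in second_set:
        third_set.append((image, category))
        for crop in multi_crop(image):
            third_set.append((crop, category))
    return third_set

third = expand_sample_set([("img_a", 1), ("img_b", 0)])
```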
In some embodiments, the extracting image features from the sample image and acquiring user features of a target object associated with the sample image includes:
performing feature recognition on an object of interest in the sample image, and acquiring feature information of the object of interest from the sample image;
acquiring an object identifier of a target object associated with the sample image;
and acquiring the user characteristics of the target object according to the object identifier.
In some embodiments, performing feature recognition on an object of interest in the sample image and acquiring feature information of the object of interest from the sample image includes:
when the sample image includes a plurality of objects of interest, sequentially acquiring feature information of at least one object of interest from the sample image in order of object size.
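Selecting objects of interest in size order can be sketched as a sort over detected regions. The region format (`w`, `h`, `feature` keys) and the `top_k` limit are assumptions; detection itself is presumed to come from an upstream detector.

```python
def features_by_size(regions, top_k=2):
    """regions: list of dicts with 'w', 'h' and 'feature' keys.
    Returns the features of the top_k largest objects of interest,
    largest first."""
    ordered = sorted(regions, key=lambda r: r["w"] * r["h"], reverse=True)
    return [r["feature"] for r in ordered[:top_k]]

regions = [
    {"w": 10, "h": 10, "feature": "small"},
    {"w": 50, "h": 40, "feature": "large"},
    {"w": 30, "h": 20, "feature": "medium"},
]
top_features = features_by_size(regions)
```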
In a second aspect, an embodiment of the present application provides a training method for an image classification model, including:
acquiring a first sample set, wherein the first sample set comprises a plurality of sample images, and each sample image is associated with a pre-labeled category;
Filtering out, from the first sample set, sample images that do not belong to the target category but whose similarity to the target category is higher than a specified similarity, to obtain a target sample set;
Training the first model with the sample images in the target sample set as input to the first model and the categories of the sample images as expected output of the first model until the training of the first model converges;
Classifying and identifying each sample image in the first sample set by adopting the trained first model to obtain the probability of each sample image belonging to the target class;
Acquiring sample images whose probability of belonging to the target category is greater than a specified probability threshold, to construct a second sample set;
For any sample image in the second sample set, extracting image features from the sample image and acquiring user features of a target object associated with the sample image;
And training the second model by adopting the image characteristics and the user characteristics until the training of the second model converges.
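The second-aspect training flow above can be sketched end to end with placeholder callables. Every function argument below is an assumption standing in for the real training, filtering, and feature-extraction steps, and the toy scores are made up.

```python
def train_pipeline(first_set, train_model, filter_set, extract, user_feats,
                   spec_prob=0.5):
    pretrained = train_model(first_set)              # pre-train for filtering
    target_set = filter_set(first_set, pretrained)   # drop confusing negatives
    first_model = train_model(target_set)            # high-recall first model
    second_set = [(img, y) for img, y in first_set
                  if first_model(img) > spec_prob]   # likely target images
    fused = [(extract(img) + user_feats(img), y)     # features for model 2
             for img, y in second_set]
    return first_model, fused                        # fused trains the second model

# toy stand-ins (all hypothetical):
scores = {"a": 0.9, "b": 0.2}
first_model, fused = train_pipeline(
    first_set=[("a", 1), ("b", 0)],
    train_model=lambda samples: scores.get,          # "training" returns a scorer
    filter_set=lambda samples, model: samples,       # no-op filter
    extract=lambda img: [scores[img]],               # second image features
    user_feats=lambda img: [0.5],                    # user features
)
```

Only image "a" survives the first-stage threshold, so the fused training data for the second model contains its concatenated image and user features.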
In some embodiments, samples of the target class are positive samples and samples not belonging to the target class are negative samples;
when the positive-sample recall rate needs to be improved, the loss weight of the first model is set to a value greater than 1;
and when the negative-sample recall rate needs to be improved, the loss weight of the first model is set to a value less than 1.
In some embodiments, the filtering out of sample images in the first sample set that do not belong to the target class and whose similarity to the target class is higher than the specified similarity includes:
before the first model is trained on the filtered set, training the first model with the sample images in the first sample set as input and the categories of the sample images as expected output, until training converges;
inputting each sample image in the first sample set into this first model to obtain the probability that the sample image belongs to the target category, used as its similarity to the target category;
and if a sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity, filtering out that sample image.
In some embodiments, acquiring the sample images whose probability of belonging to the target class is greater than the specified probability threshold to construct the second sample set includes:
cropping each sample image in the second sample set multiple times to obtain a plurality of cropped sample images;
and acquiring the categories of the sample images in the second sample set and of the cropped sample images, and constructing a third sample set composed of these sample images and their corresponding categories.
In some embodiments, the extracting image features from the sample image and acquiring user features of a target object associated with the sample image includes:
performing feature recognition on an object of interest in the sample image, and acquiring feature information of the object of interest from the sample image;
acquiring an object identifier of a target object associated with the sample image;
and acquiring the user characteristics of the target object according to the object identifier.
In some embodiments, performing feature recognition on an object of interest in the sample image and acquiring feature information of the object of interest from the sample image includes:
when the sample image includes a plurality of objects of interest, sequentially acquiring feature information of at least one object of interest from the sample image in order of object size.
In a third aspect, the present application also provides an image classification apparatus, the apparatus comprising:
an image acquisition module configured to acquire a target image;
The first feature extraction module is configured to perform feature extraction on the target image to obtain first image features of the target image;
a first probability determination module configured to determine a first probability that the target image belongs to a target category using a first image feature of the target image;
the first probability judging module is configured to extract second image features from the target image and acquire user features of a target object associated with the target image when the first probability is higher than a first probability threshold; adopting a decision tree to perform fusion processing on the second image features and the user features to obtain a second probability that the target image belongs to the target category;
and the second probability judging module is configured to determine that the target object belongs to the target category when the second probability is higher than a second probability threshold.
In some embodiments, the first probability determination module is further configured to:
And when the first probability is smaller than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, the first probability determination module is further configured to:
when the second probability is higher than a preset probability threshold and lower than the second probability threshold, assign the target object to a designated task set;
and when the second probability is lower than the preset probability threshold, determine that the target image does not belong to the target category.
In some embodiments, a precision-recall curve is pre-stored, where the curve describes the association between a recall parameter characterizing the recall rate, an accuracy parameter characterizing the judgment accuracy for the target class, and the preset probability threshold;
the preset probability threshold is set according to preset recall-rate and accuracy-rate indices.
In some embodiments, feature extraction is performed on the target image using a pre-trained first model, and a first probability that the target image belongs to the target class is determined, wherein the first model is trained according to the following modules:
A first sample set acquisition module configured to acquire a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
the filtering module is configured to filter out sample images in the first sample set that do not belong to the target category and whose similarity to the target category is higher than the specified similarity, to obtain a target sample set;
A first model training module configured to train the first model with the sample images in the target sample set as inputs to the first model and the categories of the sample images as desired outputs of the first model until the first model training converges.
In some embodiments, a second model based on a decision tree is used to perform fusion processing on the second image feature and the user feature, so as to obtain a second probability that the target image belongs to the target category, where the second model is trained according to the following modules:
a second sample set acquisition module configured to acquire sample images having probabilities of belonging to the target class greater than a specified probability threshold to construct a second sample set;
a feature extraction module configured to extract image features from the sample images for any of the sample images in the second sample set, and to obtain user features of a target object associated with the sample images;
A second model training module configured to train the second model using the image features and the user features until the second model training converges.
In some embodiments, the samples of the target class are positive samples and the samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, setting the loss weight of the first model to be a value larger than 1;
And when the negative sample recall rate needs to be improved, setting the loss weight of the first model to be a value smaller than 1.
In some embodiments, the filtration module comprises:
A first training unit configured to train the first model with the sample images in the first sample set as inputs to the first model and the categories of the sample images as expected outputs of the first model until the first model training converges, before the first model is trained;
A similarity obtaining unit configured to input each sample image in the first sample set to the first model, and obtain a probability that the sample image belongs to the target class as a similarity with the target class;
And a filtering unit configured to filter out the sample image if the sample image does not belong to a target class and the similarity with the target class is higher than a specified similarity.
In some embodiments, the second sample set acquisition module comprises:
the clipping unit is configured to clip each sample image in the second sample set for multiple times respectively to obtain multiple clipped sample images;
a third sample set obtaining unit configured to obtain respective categories of the sample image in the second sample set and the plurality of clipped sample images, and construct the third sample set composed of sample images and corresponding categories.
In some embodiments, the feature extraction module comprises:
A feature information acquisition unit configured to perform feature recognition on an object of interest in the sample image, and acquire feature information of the object of interest from the sample image;
An object identification acquisition unit configured to acquire an object identification of a target object associated with the sample image;
and the user characteristic acquisition unit is configured to acquire the user characteristic of the target object according to the object identifier.
In some embodiments, the feature information acquisition unit includes:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
In a fourth aspect, the present application further provides a training device for an image classification model, where the device includes:
A first acquisition module configured to acquire a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
The target sample set acquisition module is configured to filter out sample images which do not belong to a target category in the first sample set and have a similarity higher than a specified similarity with the target category to obtain a target sample set;
A first training module configured to train the first model with the sample images in the target sample set as inputs to the first model and the categories of the sample images as desired outputs of the first model until the first model training converges;
the probability acquisition module is configured to respectively classify and identify each sample image in the first sample set by adopting the trained first model to obtain the probability of each sample image belonging to the target class;
A second acquisition module configured to acquire sample images belonging to the target category having a probability greater than a specified probability threshold to construct a second sample set;
a feature extraction module configured to extract image features from the sample images for any of the sample images in the second sample set, and to obtain user features of a target object associated with the sample images;
And a second training module configured to train the second model using the image features and the user features until the second model training converges.
In some embodiments, the samples of the target class are positive samples and the samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, setting the loss weight of the first model to be a value larger than 1;
And when the negative sample recall rate needs to be improved, setting the loss weight of the first model to be a value smaller than 1.
In some embodiments, the target sample set acquisition module comprises:
A first training unit configured to train the first model with the sample images in the first sample set as inputs of the first model and the categories of the sample images as desired outputs of the first model until the first model training converges, before the first model is trained;
A similarity obtaining unit configured to input each sample image in the first sample set to the first model, and obtain a probability that the sample image belongs to the target class as a similarity with the target class;
And a filtering unit configured to filter out the sample image if the sample image does not belong to a target class and the similarity with the target class is higher than a specified similarity.
In some embodiments, the second acquisition module includes:
The clipping processing unit is configured to respectively clip each sample image in the second sample set for a plurality of times to acquire a plurality of clipped sample images;
a third sample set obtaining unit configured to obtain respective categories of the sample image in the second sample set and the plurality of clipped sample images, and construct the third sample set composed of sample images and corresponding categories.
In some embodiments, the feature extraction module comprises:
a feature recognition unit configured to perform feature recognition on an object of interest in the sample image, and acquire feature information of the object of interest from the sample image;
an acquisition unit configured to acquire an object identification of a target object associated with the sample image;
And the portrait acquisition unit is configured to acquire the user characteristics of the target object according to the object identification.
In some embodiments, feature recognition is performed on an object of interest in the sample image, and feature information of the object of interest is obtained from the sample image, including:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
In a fifth aspect, another embodiment of the present application also provides an electronic device, including at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the image classification method provided by the embodiment of the application.
In a sixth aspect, another embodiment of the present application further provides a computer storage medium, where the computer storage medium stores a computer program for causing a computer to execute the image classification method in the embodiment of the present application.
In the embodiments of the application, the target image is classified by combining a first model and a second model: the first model improves the recall rate and the second model improves the accuracy rate. Each plays its part, improving the overall performance of the image classification method.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an application scenario diagram of an image classification method according to an embodiment of the present application;
FIG. 2 is an overall training flowchart of the image classification method according to an embodiment of the present application;
FIG. 3 is a flowchart of training the first model for image classification according to an embodiment of the present application;
FIG. 4 is a flowchart of filtering out sample images whose similarity to the target class is higher than the specified similarity in an image classification method according to an embodiment of the present application;
FIG. 5 is a flowchart of training the second model in an image classification method according to an embodiment of the present application;
FIG. 6 is a diagram of the steps for constructing the second sample set according to an embodiment of the present application;
FIG. 7 is a diagram of the implementation steps for extracting image features and acquiring user features in an image classification method according to an embodiment of the present application;
FIG. 8A is a diagram of the steps performed when using the image classification method according to an embodiment of the present application;
FIG. 8B is a schematic diagram of an image classification model according to an embodiment of the present application;
FIG. 9A is a device diagram for an image classification method according to an embodiment of the present application;
FIG. 9B is a diagram of a training device for an image classification model according to an embodiment of the present application;
FIG. 10 is an electronic device diagram for an image classification method according to an embodiment of the present application.
Detailed Description
In order to enable a person skilled in the art to better understand the technical solutions of the present application, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and in the claims are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The inventor has found through research that, with the development of computer vision technology, image content understanding and analysis are becoming increasingly intelligent. Classification tasks based on image information are an important application of computer vision. As the amount of image information grows, efficient classification becomes particularly important in special scenarios such as content-security review of images and detection of abnormal behavior. In these special scenarios, the natural occurrence rate of a certain category of images is extremely low (on the order of parts per million): among tens of thousands of pictures, only a few target images may appear.
In some scenarios requiring a high recall rate for target images, the image classification techniques in the related art are difficult to apply, as their recall rate and classification accuracy can hardly meet service requirements.
In view of the above, the present application proposes an image classification method, apparatus, electronic device, and storage medium to solve the above-mentioned problems. The inventive concept of the present application can be summarized as follows: image classification is accomplished in two stages. The first stage mainly guarantees the recall rate, while the second stage further analyzes the output of the first stage to guarantee classification accuracy.
In the embodiments of the present application, a first model is used to extract features of the target image in the first stage; images that the first model cannot classify with sufficient accuracy are further analyzed by the model in the second stage. The second-stage model can adopt multi-feature fusion and use a decision-tree-based model to analyze features of multiple dimensions, thereby ensuring the accuracy of the classification result. With the two models each playing their own role, the overall recognition effect of the model is improved, and so is its overall performance.
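As an illustrative sketch of this two-stage cascade (the function and model names below are hypothetical stand-ins, not the patent's implementation), the inference flow can be expressed as:

```python
def classify(image, first_model, second_model, t1=0.5, t2=0.9):
    """Two-stage cascade: stage one guards recall, stage two guards accuracy."""
    # Stage one: the high-recall first model screens every image.
    p1 = first_model(image)
    if p1 <= t1:
        return "not_target"          # clearly not the target category
    # Stage two: the decision-tree-based second model re-checks suspects.
    p2 = second_model(image)
    return "target" if p2 > t2 else "not_target"

# Toy stand-ins for the two trained models (scores carried in a dict).
stage1 = lambda img: img["p1"]
stage2 = lambda img: img["p2"]

print(classify({"p1": 0.95, "p2": 0.97}, stage1, stage2))  # "target"
print(classify({"p1": 0.20, "p2": 0.99}, stage1, stage2))  # "not_target"
```

Because the first stage rejects the overwhelming majority of images, the costlier feature-fusion stage only runs on the rare suspects.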
The processing of images of different target categories is the same; for ease of understanding, the embodiments of the present application take detecting whether an image is yellow-related (i.e., pornography-related) as an example.
The image classification method in the embodiment of the application is described in detail below with reference to the accompanying drawings.
For easy understanding, the technical scheme provided by the embodiment of the application is described below by taking classification and identification of yellow-related images as an example.
It should be understood that other classification tasks are also applicable to embodiments of the present application.
In one embodiment, as shown in fig. 1, an application scenario diagram for determining whether a picture is yellow in an embodiment of the present application is shown. The application scene comprises the following steps: terminal equipment 101, server 102, network 103, and memory 104;
The terminal device 101 uploads pictures, which are stored in the memory 104 through the server 102; the trained model is installed on the server 102. In a specific application, pictures are retrieved from the memory 104 and classified by the model installed on the server.
In some embodiments, the server may obtain the target image not only from images uploaded by the terminal but also in other ways, such as from short videos in light applications, which is not limited in the present application.
In the image classification method provided by the embodiments of the present application, the trained first model is first used to extract features of the target image and determine a first probability that the target image belongs to the target category; a decision-tree-based second model is then used to fuse the second image features with the user features, obtaining a second probability that the target image belongs to the target category.
For ease of understanding, the embodiments of the present application describe the image classification method in two parts: training of the model and use of the model.
1. Training of image classification models
In some embodiments, samples of the target category are positive samples and samples not belonging to the target category are negative samples. When the application scenario is detecting whether sample images are yellow-related and the yellow-related images are expected to be detected, the yellow-related images are set as positive samples and the loss weight is set to a value greater than 1; the model is then trained with the labeled sample images. The specific training method is as follows:
as shown in fig. 2, a flowchart of overall training of an image classification model provided in an embodiment of the present application includes the following specific steps:
In step 201: acquiring a first sample set, wherein the first sample set comprises a plurality of sample images, and each sample image is associated with a category marked in advance;
in step 202: filtering out sample images which do not belong to the target category in the first sample set and have higher similarity with the target category than the designated similarity to obtain a target sample set;
In step 203: taking a sample image in a target sample set as input of a first model, taking a category of the sample image as expected output of the first model, and training the first model until training of the first model converges;
In step 204: respectively classifying and identifying each sample image in the first sample set by adopting a trained first model to obtain the probability of each sample image belonging to the target class;
In step 205: acquiring sample images with probability of belonging to the target category larger than a specified probability threshold value to construct a second sample set;
in step 206: extracting image features from the sample images for any one of the sample images in the second sample set, and acquiring user features of a target object associated with the sample images;
In step 207: and training the second model by adopting the image characteristics and the user characteristics until the training of the second model converges.
In the embodiments provided by the present application, sample images whose similarity to the target category is higher than the specified similarity are filtered out, so that the first model also recalls images that are not in the target category but are highly similar to it, which greatly improves the recall rate.
For ease of understanding, the training process of the two models is described in detail below.
1. Training a first model
As shown in fig. 3, a flowchart of training a first model according to an embodiment of the present application includes the following specific steps:
In step 301: acquiring a first sample set, wherein the first sample set comprises a plurality of sample images, and each sample image is associated with a category marked in advance;
the category of the pre-label associated with the image may be manually labeled or may be labeled by a machine, and the present application is not limited thereto.
In step 302: filtering out sample images which do not belong to the target category in the first sample set and have higher similarity with the target category than the designated similarity to obtain a target sample set;
In one embodiment, as shown in fig. 4, in order to better train the first model on purer training data, filtering out the sample images in the first sample set that do not belong to the target category but whose similarity to the target category is higher than the specified similarity is implemented as follows:
in step 401: before the first model is trained, taking a sample image in the first sample set as an input of the first model, taking a category of the sample image as an expected output of the first model, and training the first model until the training of the first model converges;
In step 402: inputting each sample image in the first sample set into a first model to obtain the probability that the sample image belongs to the target class as the similarity with the target class;
Besides using the probability that a sample image belongs to the target category as its similarity to the target category, the similarity between the sample image and the target category can also be calculated directly through a similarity formula; other methods of calculating similarity are equally applicable, and the present application is not limited thereto.
In step 403: and if the sample image does not belong to the target category and the similarity with the target category is higher than the designated similarity, filtering the sample image.
For example: if the similarity between a sample image and the target category is 90% and the specified similarity is 50%, the sample image is filtered out. In this way, sample images that are highly similar to the target category but do not belong to it are removed; training the first model on the remaining samples effectively improves the first model's recall rate for images of the target category.
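A minimal sketch of this filtering step (assuming `scorer` returns the first model's predicted probability for the target category, used here as the similarity per step 402):

```python
def filter_hard_negatives(samples, scorer, specified_similarity=0.5):
    """Remove samples that do NOT belong to the target category yet whose
    similarity to it exceeds the specified similarity."""
    kept = []
    for image, is_target in samples:
        if not is_target and scorer(image) > specified_similarity:
            continue  # confusing negative: drop it from the training set
        kept.append((image, is_target))
    return kept

# Toy data: img_b is a negative that scores 0.90, so it is filtered out.
samples = [("img_a", True), ("img_b", False), ("img_c", False)]
scores = {"img_a": 0.97, "img_b": 0.90, "img_c": 0.10}
print(filter_hard_negatives(samples, scores.get))
```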
In one embodiment, the samples of the target class are positive samples and the samples not belonging to the target class are negative samples; when the loss weight of the first model is set, the loss weight is specifically set according to specific requirements, and the specific setting mode is as follows:
When the positive sample recall rate needs to be improved, setting the loss weight of the first model to be a value larger than 1;
And when the negative sample recall rate needs to be improved, setting the loss weight of the first model to be a value smaller than 1.
For example, a loss weight of 0.5 places more focus on the accuracy of negative samples, which are required to be sufficiently reliable: if the prediction probability of a negative sample is not high enough, the sample is judged positive, which guarantees the recall of positive samples. Conversely, a loss weight of 2 focuses on the accuracy of positive samples.
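One common way to realize such a loss weight, shown here as a hedged sketch rather than the patent's exact formulation, is to scale the positive-sample term of the binary cross-entropy loss:

```python
import math

def weighted_bce(p, y, loss_weight=1.0):
    """Binary cross-entropy with the positive-sample term scaled by loss_weight.

    loss_weight > 1 makes errors on positive samples more expensive;
    loss_weight < 1 shifts the emphasis to the negative samples instead."""
    eps = 1e-12  # numerical guard against log(0)
    pos = -loss_weight * y * math.log(p + eps)
    neg = -(1 - y) * math.log(1 - p + eps)
    return pos + neg

# A positive sample predicted at 0.6 costs twice as much with loss_weight=2:
print(weighted_bce(0.6, 1, loss_weight=2.0) / weighted_bce(0.6, 1, loss_weight=1.0))
```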
In step 303: taking a sample image in the target sample set as an input of the first model, taking a category of the sample image as an expected output of the first model, and training the first model until the training of the first model converges.
Training the first model until convergence may mean that the loss of the first model no longer decreases, or that the number of training iterations reaches a specified count. Training the second model until convergence, introduced later, has the same meaning and will not be described again.
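The convergence test just described can be sketched as an early-stopping check (the `patience` window size is an assumed detail, not specified by the text):

```python
def has_converged(losses, patience=3, max_epochs=100):
    """Training converges when the loss has not improved for `patience`
    consecutive epochs, or when the epoch budget is exhausted."""
    if len(losses) >= max_epochs:
        return True          # reached the specified training count
    if len(losses) <= patience:
        return False         # not enough history to judge stagnation
    best_earlier = min(losses[:-patience])
    return min(losses[-patience:]) >= best_earlier

print(has_converged([1.0, 0.9, 0.8, 0.7, 0.6]))          # still improving
print(has_converged([1.0, 0.9, 0.8, 0.80, 0.81, 0.8]))   # stalled for 3 epochs
```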
2. Training a second model
As shown in fig. 5, a flowchart of training a second model according to an embodiment of the present application includes the following specific steps:
in step 501: acquiring sample images with probability of belonging to the target category larger than a specified probability threshold value to construct a second sample set;
In one embodiment, as shown in fig. 6, the steps performed in constructing the second sample set are as follows:
In step 601: respectively cutting each sample image in the second sample set for multiple times to obtain a plurality of cut sample images;
The sample images are cropped in order to expand the number of sample images in the second sample set, which also improves the generalization ability of the second model.
In step 602: acquiring the respective categories of the sample images in the second sample set and of the plurality of cropped sample images, and constructing a third sample set composed of these sample images and their corresponding categories.
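A toy sketch of the cropping step in 601-602 (a real system would crop pixel arrays; here a nested list stands in for the image, and every crop inherits the parent image's category):

```python
def multi_crop(image, category, crop_size, stride):
    """Cut one image (a list of pixel rows) into several labelled crops."""
    h, w = len(image), len(image[0])
    crops = []
    for top in range(0, h - crop_size + 1, stride):
        for left in range(0, w - crop_size + 1, stride):
            crop = [row[left:left + crop_size]
                    for row in image[top:top + crop_size]]
            crops.append((crop, category))  # crop keeps the parent's category
    return crops

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
print(len(multi_crop(img, "target", crop_size=2, stride=2)))  # 4 labelled crops
```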
In step 502: extracting image features from the sample images for any one of the sample images in the second sample set, and acquiring user features of a target object associated with the sample images;
In one embodiment, as shown in fig. 7, the steps for extracting image features to obtain user features are as follows:
In step 701: performing feature recognition on an object of interest in a sample image, and acquiring feature information of the object of interest from the sample image;
When a plurality of objects of interest are included in the sample image, feature information of at least one object of interest is sequentially acquired from the sample image according to the size sequence of the objects of interest;
For example: when the object of interest is a human face, extracting feature information of the human face from the sample image according to the size sequence of the human face, wherein the feature information can be age, gender and the like.
In step 702: acquiring an object identifier of a target object associated with the sample image;
In step 703: and acquiring the user characteristics of the target object according to the object identification.
In one embodiment, the user characteristics of the target object include at least one of: the user's age, gender, city, and historical browsing records. The content specifically included in the user characteristics is determined by the specific application scenario, which is not limited by the present application.
In step 503: and training the second model by adopting the image characteristics and the user characteristics until the training of the second model converges.
The trained second model can accurately extract information in the image and classify the information to obtain a sample image belonging to the target class.
The trained first model is combined with the second model: the first model improves the recall rate and the second model improves the accuracy rate, so that with each model playing its own role, the overall performance of the image classification model provided by the embodiments of the present application is improved.
When the model is used, the second probability threshold is set according to a precision-recall curve, which describes the association among the recall parameter characterizing the recall rate, the accuracy parameter characterizing the accuracy of target-category judgment, and the preset probability threshold. That is, the precision-recall curve is a three-way correspondence whose three dimensions are recall rate, accuracy, and the preset probability threshold; when there is an explicit recall-rate and accuracy requirement, the corresponding preset probability threshold can be determined from the curve, so that different preset probability thresholds can be selected for different service requirements.
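Choosing the preset probability threshold from such a curve can be sketched as follows (the sampled curve points are illustrative, not taken from the text):

```python
def pick_threshold(curve, min_precision, min_recall):
    """Given (threshold, precision, recall) points sampled from the
    precision-recall curve, return the smallest preset probability threshold
    meeting both service requirements, or None if no point qualifies."""
    candidates = [t for t, p, r in curve
                  if p >= min_precision and r >= min_recall]
    return min(candidates) if candidates else None

curve = [(0.5, 0.70, 0.98), (0.7, 0.85, 0.95), (0.9, 0.97, 0.80)]
print(pick_threshold(curve, min_precision=0.80, min_recall=0.90))  # 0.7
print(pick_threshold(curve, min_precision=0.99, min_recall=0.99))  # None
```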
2. Use of image classification models
As shown in fig. 8A, the specific implementation steps of the image classification model provided in the application when in use are as follows:
in step 801: acquiring a target image;
In step 802: extracting features of the target image to obtain first image features of the target image;
In step 803: determining a first probability that the target image belongs to the target category by adopting a first image characteristic of the target image;
In step 804: when the first probability is higher than a first probability threshold, extracting second image features from the target image, and acquiring user features of a target object associated with the target image; adopting a decision tree to fuse the second image features and the user features to obtain a second probability that the target image belongs to the target class;
and when the first probability is smaller than or equal to the first probability threshold value, determining that the target image does not belong to the target category.
In step 805: and when the second probability is higher than the second probability threshold, determining that the target object belongs to the target category.
When the second probability is higher than a preset probability threshold value and smaller than the second probability threshold value, distributing the target object into the designated task set;
For example: if the second probability is 80%, the second probability threshold is 90%, and the preset probability threshold is 70%, then the second probability is smaller than the second probability threshold, so the second model cannot judge the target image as the target category; however, the second probability is still greater than the preset probability threshold, i.e., the target image remains highly similar to the target category. Therefore, to improve judgment accuracy, the target object is assigned to the designated task set. The designated task set can be a queue of tasks to be handled in a manual processing step, so that difficult images are screened out for manual verification.
And when the second probability is smaller than the preset probability threshold value, determining that the target image does not belong to the target category.
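The three-way decision described in the last few paragraphs can be sketched as follows (the threshold values match the 80%/90%/70% example above):

```python
def route(second_probability, preset_threshold=0.7, second_threshold=0.9):
    """Confident hits are judged target-category, borderline images go to the
    designated (manual-review) task set, and the rest are cleared."""
    if second_probability > second_threshold:
        return "target"
    if second_probability > preset_threshold:
        return "manual_review"   # the designated task set
    return "not_target"

print(route(0.80))  # the 80% example above -> "manual_review"
print(route(0.95))  # "target"
print(route(0.30))  # "not_target"
```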
With the image classification model provided by the embodiments of the present application, when images need to be classified, one only needs to input the image to be detected into the model and set the second probability threshold as required, and the images can be classified effectively.
To facilitate understanding, the structure of the image classification model according to the embodiments of the present application is described below. As shown in fig. 8B, the target image is input into a first model 810, which performs feature extraction on the target image to extract a first image feature and determines a first probability that the target image belongs to the target category. The first model is oriented toward recall, i.e., it classifies even suspected target images into the target category; therefore, when the first probability predicted by the first model is less than or equal to the first probability threshold, the target image does not belong to the target category. If the first probability is higher than the first probability threshold, the first model predicts the target image as the target category; however, because the first model is oriented toward recall, its accuracy in judging the target category may not meet service requirements, so in order to accurately judge whether the target image belongs to the target category, the second model performs classification and recognition again to ensure accuracy. That is, as shown in fig. 8B, when the second probability that the target image belongs to the target category is higher than the second probability threshold, it is determined that the target object belongs to the target category.
As shown in fig. 9A, based on the same inventive concept, an image classification apparatus 900 is proposed, comprising:
an image acquisition module 9001 for acquiring a target image;
a first feature extraction module 9002, configured to perform feature extraction on the target image, so as to obtain a first image feature of the target image;
a first probability determining module 9003, configured to determine, using first image features of the target image, a first probability that the target image belongs to a target class;
A first probability judging module 9004, configured to extract a second image feature from the target image and obtain a user feature of a target object associated with the target image when the first probability is higher than a first probability threshold; adopting a decision tree to perform fusion processing on the second image features and the user features to obtain a second probability that the target image belongs to the target category;
And a second probability judging module 9005, configured to determine that the target object belongs to the target category when the second probability is higher than a second probability threshold.
In some embodiments, the first probability determination module is further configured to:
And when the first probability is smaller than or equal to the first probability threshold, determining that the target image does not belong to the target category.
In some embodiments, the first probability determination module is further configured to:
When the second probability is higher than a preset probability threshold and smaller than the second probability threshold, the target object is distributed to a designated task set;
And when the second probability is smaller than the preset probability threshold, determining that the target image does not belong to the target category.
In some embodiments, a precision-recall curve is pre-stored, wherein the precision-recall curve describes the association among the recall parameter characterizing the recall rate, the accuracy parameter characterizing the accuracy of target-category judgment, and the preset probability threshold;
the preset probability threshold is set according to a preset recall rate index and an accuracy rate index.
In some embodiments, feature extraction is performed on the target image using a pre-trained first model, and a first probability that the target image belongs to the target class is determined, wherein the first model is trained according to the following modules:
A first sample set acquisition module configured to acquire a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
the filtering module is configured to filter out the sample images in the first sample set which do not belong to the target category but whose similarity to the target category is higher than the specified similarity, to obtain a target sample set;
A first model training module configured to train the first model with the sample images in the target sample set as inputs to the first model and the categories of the sample images as desired outputs of the first model until the first model training converges.
In some embodiments, a second model based on a decision tree is used to perform fusion processing on the second image feature and the user feature, so as to obtain a second probability that the target image belongs to the target category, where the second model is trained according to the following modules:
a second sample set acquisition module configured to acquire sample images having probabilities of belonging to the target class greater than a specified probability threshold to construct a second sample set;
a feature extraction module configured to extract image features from the sample images for any of the sample images in the second sample set, and to obtain user features of a target object associated with the sample images;
A second model training module configured to train the second model using the image features and the user features until the second model training converges.
In some embodiments, the samples of the target class are positive samples and the samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, setting the loss weight of the first model to be a value larger than 1;
And when the negative sample recall rate needs to be improved, setting the loss weight of the first model to be a value smaller than 1.
In some embodiments, the filtration module comprises:
A first training unit configured to train the first model with the sample images in the first sample set as inputs to the first model and the categories of the sample images as expected outputs of the first model until the first model training converges, before the first model is trained;
A similarity obtaining unit configured to input each sample image in the first sample set to the first model, and obtain a probability that the sample image belongs to the target class as a similarity with the target class;
And a filtering unit configured to filter out the sample image if the sample image does not belong to a target class and the similarity with the target class is higher than a specified similarity.
In some embodiments, the second sample set acquisition module comprises:
the clipping unit is configured to clip each sample image in the second sample set for multiple times respectively to obtain multiple clipped sample images;
a third sample set obtaining unit configured to obtain respective categories of the sample image in the second sample set and the plurality of clipped sample images, and construct the third sample set composed of sample images and corresponding categories.
In some embodiments, the feature extraction module comprises:
A feature information acquisition unit configured to perform feature recognition on an object of interest in the sample image, and acquire feature information of the object of interest from the sample image;
An object identification acquisition unit configured to acquire an object identification of a target object associated with the sample image;
and the user characteristic acquisition unit is configured to acquire the user characteristic of the target object according to the object identifier.
In some embodiments, the feature information acquisition unit includes:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
As shown in fig. 9B, based on the same inventive concept, a training apparatus 1000 of an image classification model is provided, including:
a first obtaining module 10001 configured to obtain a first sample set, where the first sample set includes a plurality of sample images, and each sample image is associated with a pre-labeled category;
The target sample set acquisition module 10002 is configured to filter out sample images which do not belong to a target category in the first sample set and have a similarity higher than a specified similarity with the target category, so as to obtain a target sample set;
A first training module 10003 configured to train the first model with the sample image in the target sample set as an input to the first model and a category of the sample image as a desired output of the first model until the first model training converges;
The probability obtaining module 10004 is configured to respectively classify and identify each sample image in the first sample set by adopting the trained first model, so as to obtain the probability of each sample image belonging to the target class;
a second obtaining module 10005 configured to obtain sample images with probability of belonging to the target category greater than a specified probability threshold to construct a second sample set;
a feature extraction module 10006 configured to extract, for any sample image in the second sample set, image features from the sample image, and acquire user features of a target object associated with the sample image;
a second training module 10007 configured to train the second model using the image features and the user features until the second model training converges.
In some embodiments, the samples of the target class are positive samples and the samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, setting the loss weight of the first model to be a value larger than 1;
And when the negative sample recall rate needs to be improved, setting the loss weight of the first model to be a value smaller than 1.
In some embodiments, the target sample set acquisition module comprises:
A first training unit configured to train the first model with the sample images in the first sample set as inputs of the first model and the categories of the sample images as desired outputs of the first model until the first model training converges, before the first model is trained;
A similarity obtaining unit configured to input each sample image in the first sample set to the first model, and obtain a probability that the sample image belongs to the target class as a similarity with the target class;
And a filtering unit configured to filter out the sample image if the sample image does not belong to a target class and the similarity with the target class is higher than a specified similarity.
In some embodiments, the second acquisition module includes:
The clipping processing unit is configured to respectively clip each sample image in the second sample set for a plurality of times to acquire a plurality of clipped sample images;
a third sample set obtaining unit configured to obtain respective categories of the sample image in the second sample set and the plurality of clipped sample images, and construct the third sample set composed of sample images and corresponding categories.
In some embodiments, the feature extraction module comprises:
a feature recognition unit configured to perform feature recognition on an object of interest in the sample image, and acquire feature information of the object of interest from the sample image;
an acquisition unit configured to acquire an object identification of a target object associated with the sample image;
And the user profile acquisition unit is configured to acquire the user features of the target object according to the object identification.
In some embodiments, the performing feature recognition on an object of interest in the sample image and acquiring feature information of the object of interest from the sample image comprises:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
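The size-ordered extraction above can be sketched as follows; the bounding-box format `(x0, y0, x1, y1)` and the `extract` callback are assumptions for illustration — the patent does not specify how object size is measured.

```python
# Hypothetical sketch: detected objects of interest are sorted by
# bounding-box area, largest first, and feature information is then
# acquired sequentially in that order, optionally capped at a maximum
# number of objects ("at least one object of interest").

def box_area(box):
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)

def ordered_object_features(boxes, extract, max_objects=None):
    ordered = sorted(boxes, key=box_area, reverse=True)
    if max_objects is not None:
        ordered = ordered[:max_objects]
    return [extract(box) for box in ordered]

boxes = [(0, 0, 2, 2), (0, 0, 5, 5), (0, 0, 1, 3)]  # areas 4, 25, 3
features = ordered_object_features(boxes, box_area)
```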
Having described the image classification method and the image classification model training method and apparatus according to the exemplary embodiment of the present application, next, an electronic device according to another exemplary embodiment of the present application is described.
Those skilled in the art will appreciate that the various aspects of the application may be implemented as a system, method, or program product. Accordingly, aspects of the application may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "system."
In some possible embodiments, an electronic device according to the application may comprise at least one processor and at least one memory. Wherein the memory stores program code which, when executed by the processor, causes the processor to perform the steps in the image classification method and the training method of the image classification model according to various exemplary embodiments of the application described in the present specification.
An electronic device 130 according to this embodiment of the application is described below with reference to fig. 10. The electronic device 130 shown in fig. 10 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present application.
As shown in fig. 10, the electronic device 130 is embodied in the form of a general-purpose electronic device. Components of electronic device 130 may include, but are not limited to: the at least one processor 131, the at least one memory 132, and a bus 133 connecting the various system components, including the memory 132 and the processor 131.
Bus 133 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
Memory 132 may include readable media in the form of volatile memory such as Random Access Memory (RAM) 1321 and/or cache memory 1322, and may further include Read Only Memory (ROM) 1323.
Memory 132 may also include a program/utility 1325 having a set (at least one) of program modules 1324, such program modules 1324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
The electronic device 130 may also communicate with one or more external devices 134 (e.g., keyboard, pointing device, etc.), one or more devices that enable a user to interact with the electronic device 130, and/or any device (e.g., router, modem, etc.) that enables the electronic device 130 to communicate with one or more other electronic devices. Such communication may occur through an input/output (I/O) interface 135. Also, electronic device 130 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 136. As shown, network adapter 136 communicates with the other modules of electronic device 130 over bus 133. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 130, including, but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In some possible embodiments, aspects of an image classification method and a training method for an image classification model provided by the present application may also be implemented in the form of a program product comprising program code for causing a computer device to perform the steps of the image classification method and the training method for an image classification model according to the various exemplary embodiments of the application as described herein above when the program product is run on a computer device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for image classification according to embodiments of the present application may employ a portable compact disc read-only memory (CD-ROM) and comprise program code and may run on an electronic device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the consumer electronic device, partly on the consumer electronic device, as a stand-alone software package, partly on the consumer electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server. In the case of remote electronic devices, the remote electronic device may be connected to the consumer electronic device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external electronic device (e.g., connected through the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the elements described above may be embodied in one element in accordance with embodiments of the present application. Conversely, the features and functions of one unit described above may be further divided into a plurality of units to be embodied.
Furthermore, although the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or suggest that these operations must be performed in that particular order or that all of the illustrated operations must be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (36)

1. A method of classifying images, the method comprising:
acquiring a target image;
extracting features of the target image to obtain first image features of the target image;
Determining a first probability that the target image belongs to a target category by adopting a first image feature of the target image; extracting features of the target image by adopting a pre-trained first model to obtain the first image feature of the target image; determining the first probability that the target image belongs to the target category by adopting the first image feature of the target image and the pre-trained first model;
When the first probability is higher than a first probability threshold, extracting second image features from the target image, and acquiring user features of a target object associated with the target image; adopting a second model based on a decision tree to perform fusion processing on the second image features and the user features to obtain a second probability that the target image belongs to the target category; a second sample set adopted in training the second model is constructed by acquiring sample images of which the first probability belonging to the target class is higher than the first probability threshold;
And when the second probability is higher than a second probability threshold, determining that the target object belongs to the target category.
2. The method of claim 1, wherein after the determining the first probability that the target image belongs to a target category, the method further comprises:
And when the first probability is smaller than or equal to the first probability threshold, determining that the target image does not belong to the target category.
3. The method of claim 1, wherein after the determining the second probability that the target image belongs to the target category, the method further comprises:
When the second probability is higher than a preset probability threshold and smaller than the second probability threshold, the target object is distributed to a designated task set;
And when the second probability is smaller than the preset probability threshold, determining that the target image does not belong to the target category.
4. The method of claim 3, wherein a precision-recall change curve is pre-stored, the precision-recall change curve being used for describing the correlation among a recall parameter for describing the recall of the target class, an accuracy parameter for describing the determination accuracy of the target class, and the preset probability threshold;
the preset probability threshold is set according to a preset recall rate index and an accuracy rate index.
5. The method of claim 1, wherein the first model is trained according to the following method:
acquiring a first sample set, wherein the first sample set comprises a plurality of sample images, and each sample image is associated with a pre-labeled category;
Filtering out sample images which do not belong to the target category in the first sample set and have the similarity with the target category higher than the designated similarity to obtain a target sample set;
taking the sample image in the target sample set as an input of the first model, taking the category of the sample image as an expected output of the first model, and training the first model until the training of the first model converges.
6. The method of claim 1, wherein the second model is trained according to the following method:
Acquiring sample images with probability of belonging to the target category larger than a specified probability threshold to construct a second sample set;
Extracting image features from the sample image for any sample image in the second sample set, and acquiring user features of a target object associated with the sample image;
And training the second model by adopting the image characteristics and the user characteristics until the training of the second model converges.
7. The method of claim 5, wherein samples of the target class are positive samples and samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, the loss weight of the first model is set to a value greater than 1;
And when the negative sample recall rate needs to be improved, the loss weight of the first model is set to a value less than 1.
8. The method of claim 5, wherein filtering out sample images in the first set of samples that do not belong to a target class and that have a similarity to the target class that is higher than a specified similarity comprises:
Before the first model is trained on the target sample set, training the first model with the sample images in the first sample set as inputs to the first model and the categories of the sample images as expected outputs of the first model until the first model training converges;
Inputting each sample image in the first sample set into the first model to obtain the probability that the sample image belongs to the target category as the similarity with the target category;
And if the sample image does not belong to the target category and the similarity with the target category is higher than the designated similarity, filtering the sample image.
9. The method of claim 6, wherein the obtaining sample images having probabilities of belonging to the target class greater than a specified probability threshold constitutes a second sample set, comprising:
Cropping each sample image in the second sample set a plurality of times to obtain a plurality of cropped sample images;
And acquiring the respective categories of the sample images in the second sample set and of the plurality of cropped sample images, and constructing a third sample set composed of the sample images and the corresponding categories.
10. The method of claim 6, wherein the extracting image features from the sample image and obtaining user features of a target object associated with the sample image comprises:
performing feature recognition on an object of interest in the sample image, and acquiring feature information of the object of interest from the sample image;
acquiring an object identifier of a target object associated with the sample image;
and acquiring the user characteristics of the target object according to the object identifier.
11. The method of claim 10, wherein the feature recognition of the object of interest in the sample image, and the obtaining of the feature information of the object of interest from the sample image, comprises:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
12. A method of training an image classification model, the classification model comprising a first model and a decision tree-based second model, the method comprising:
acquiring a first sample set, wherein the first sample set comprises a plurality of sample images, and each sample image is associated with a pre-labeled category;
Filtering out sample images which do not belong to the target category in the first sample set and have the similarity with the target category higher than the designated similarity to obtain a target sample set;
Training the first model with the sample images in the target sample set as input to the first model and the categories of the sample images as expected output of the first model until the training of the first model converges;
Classifying and identifying each sample image in the first sample set by adopting the trained first model to obtain the probability of each sample image belonging to the target class;
Acquiring sample images with probability of belonging to the target category larger than a specified probability threshold to construct a second sample set;
Extracting image features from the sample image for any sample image in the second sample set, and acquiring user features of a target object associated with the sample image;
And training the second model by adopting the image characteristics and the user characteristics until the training of the second model converges.
13. The method of claim 12, wherein samples of the target class are positive samples and samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, the loss weight of the first model is set to a value greater than 1;
And when the negative sample recall rate needs to be improved, the loss weight of the first model is set to a value less than 1.
14. The method of claim 12, wherein filtering out sample images in the first set of samples that do not belong to a target class and that have a similarity to the target class that is higher than a specified similarity comprises:
Before the first model is trained on the target sample set, training the first model with the sample images in the first sample set as inputs to the first model and the categories of the sample images as expected outputs of the first model until the first model training converges;
Inputting each sample image in the first sample set into the first model to obtain the probability that the sample image belongs to the target category as the similarity with the target category;
And if the sample image does not belong to the target category and the similarity with the target category is higher than the designated similarity, filtering the sample image.
15. The method of claim 12, wherein the obtaining sample images having probabilities of belonging to the target class greater than a specified probability threshold constitutes a second sample set, comprising:
Cropping each sample image in the second sample set a plurality of times to obtain a plurality of cropped sample images;
And acquiring the respective categories of the sample images in the second sample set and of the plurality of cropped sample images, and constructing a third sample set composed of the sample images and the corresponding categories.
16. The method of claim 12, wherein the extracting image features from the sample image and obtaining user features of a target object associated with the sample image comprises:
performing feature recognition on an object of interest in the sample image, and acquiring feature information of the object of interest from the sample image;
acquiring an object identifier of a target object associated with the sample image;
and acquiring the user characteristics of the target object according to the object identifier.
17. The method of claim 16, wherein the feature recognition of the object of interest in the sample image, and the obtaining of the feature information of the object of interest from the sample image, comprises:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
18. An image classification apparatus, the apparatus comprising:
an image acquisition module configured to acquire a target image;
The first feature extraction module is configured to perform feature extraction on the target image to obtain first image features of the target image; the method comprises the steps of specifically extracting features of a target image by adopting a pre-trained first model to obtain first image features of the target image;
a first probability determination module configured to determine a first probability that the target image belongs to a target category using a first image feature of the target image; the method comprises the steps of determining a first probability that a target image belongs to a target category by adopting a pre-trained first model and first image features of the target image;
The first probability judging module is configured to extract second image features from the target image and acquire user features of a target object associated with the target image when the first probability is higher than a first probability threshold; adopting a second model based on a decision tree to perform fusion processing on the second image features and the user features to obtain a second probability that the target image belongs to the target category; a second sample set adopted in training the second model is constructed by acquiring sample images of which the first probability belonging to the target class is higher than the first probability threshold;
and the second probability judging module is configured to determine that the target object belongs to the target category when the second probability is higher than a second probability threshold.
19. The apparatus of claim 18, wherein the first probability determination module is further configured to:
And when the first probability is smaller than or equal to the first probability threshold, determining that the target image does not belong to the target category.
20. The apparatus of claim 18, wherein the first probability determination module is further configured to:
When the second probability is higher than a preset probability threshold and smaller than the second probability threshold, the target object is distributed to a designated task set;
And when the second probability is smaller than the preset probability threshold, determining that the target image does not belong to the target category.
21. The apparatus of claim 20, wherein a precision-recall change curve is pre-stored, the precision-recall change curve being used to describe the correlation among a recall parameter for describing the recall of the target class, an accuracy parameter for describing the determination accuracy of the target class, and the preset probability threshold;
the preset probability threshold is set according to a preset recall rate index and an accuracy rate index.
22. The apparatus of claim 18, wherein the first model is trained according to the following modules:
A first sample set acquisition module configured to acquire a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
the filtering module is configured to filter out the sample images which do not belong to the target category in the first sample set and have the similarity with the target category higher than the appointed similarity to obtain a target sample set;
A first model training module configured to train the first model with the sample images in the target sample set as inputs to the first model and the categories of the sample images as desired outputs of the first model until the first model training converges.
23. The apparatus of claim 18, wherein the second model is trained according to the following modules:
a second sample set acquisition module configured to acquire sample images having probabilities of belonging to the target class greater than a specified probability threshold to construct a second sample set;
a feature extraction module configured to extract image features from the sample images for any of the sample images in the second sample set, and to obtain user features of a target object associated with the sample images;
A second model training module configured to train the second model using the image features and the user features until the second model training converges.
24. The apparatus of claim 22, wherein samples of the target class are positive samples and samples not belonging to the target class are negative samples;
When the positive sample recall rate needs to be improved, the loss weight of the first model is set to a value greater than 1;
And when the negative sample recall rate needs to be improved, the loss weight of the first model is set to a value less than 1.
25. The apparatus of claim 22, wherein the filter module comprises:
A first training unit configured to, before the first model is trained on the target sample set, train the first model with the sample images in the first sample set as inputs to the first model and the categories of the sample images as expected outputs of the first model until the first model training converges;
A similarity obtaining unit configured to input each sample image in the first sample set to the first model, and obtain a probability that the sample image belongs to the target class as a similarity with the target class;
And a filtering unit configured to filter out the sample image if the sample image does not belong to a target class and the similarity with the target class is higher than a specified similarity.
26. The apparatus of claim 23, wherein the second sample set acquisition module comprises:
the cropping unit is configured to crop each sample image in the second sample set a plurality of times to obtain a plurality of cropped sample images;
A third sample set obtaining unit configured to obtain the respective categories of the sample images in the second sample set and of the plurality of cropped sample images, and construct a third sample set composed of the sample images and the corresponding categories.
27. The apparatus of claim 23, wherein the feature extraction module comprises:
A feature information acquisition unit configured to perform feature recognition on an object of interest in the sample image, and acquire feature information of the object of interest from the sample image;
An object identification acquisition unit configured to acquire an object identification of a target object associated with the sample image;
and the user characteristic acquisition unit is configured to acquire the user characteristic of the target object according to the object identifier.
28. The apparatus according to claim 27, wherein the characteristic information acquisition unit includes:
And when the sample image comprises a plurality of objects of interest, sequentially acquiring characteristic information of at least one object of interest from the sample image according to the size sequence of the objects of interest.
29. An apparatus for training an image classification model, the classification model comprising a first model and a decision tree-based second model, the apparatus comprising:
A first acquisition module configured to acquire a first sample set, the first sample set including a plurality of sample images, each of the sample images being associated with a pre-labeled category;
The target sample set acquisition module is configured to filter out sample images which do not belong to a target category in the first sample set and have a similarity higher than a specified similarity with the target category to obtain a target sample set;
A first training module configured to train the first model with the sample images in the target sample set as inputs to the first model and the categories of the sample images as desired outputs of the first model until the first model training converges;
the probability acquisition module is configured to respectively classify and identify each sample image in the first sample set by adopting the trained first model to obtain the probability of each sample image belonging to the target class;
A second acquisition module configured to acquire sample images belonging to the target category having a probability greater than a specified probability threshold to construct a second sample set;
a feature extraction module configured to extract image features from the sample images for any of the sample images in the second sample set, and to obtain user features of a target object associated with the sample images;
And a second training module configured to train the second model using the image features and the user features until the second model training converges.
30. The apparatus of claim 29, wherein samples of the target category are positive samples and samples not belonging to the target category are negative samples;
when the positive-sample recall rate is to be improved, the loss weight of the first model is set to a value greater than 1; and
when the negative-sample recall rate is to be improved, the loss weight of the first model is set to a value smaller than 1.
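The recall trade-off in claim 30 corresponds to a class-weighted loss. Below is a minimal numpy sketch of a weighted binary cross-entropy; the function name and the exact weighting form are illustrative assumptions, since the claim does not specify which loss function the first model uses.

```python
import numpy as np

def weighted_bce(y_true, p_pred, pos_weight=1.0, eps=1e-7):
    """Binary cross-entropy with a loss weight on the positive class.
    pos_weight > 1 penalises missed positives more heavily, pushing the
    model toward higher positive-sample recall; pos_weight < 1 shifts
    the penalty toward the negative class instead."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    loss = -(pos_weight * y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float(loss.mean())
```

With a missed positive (true label 1, predicted probability 0.2), doubling `pos_weight` doubles that sample's contribution to the loss, which is exactly the lever the claim describes.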
31. The apparatus of claim 29, wherein the target sample set acquisition module comprises:
a first training unit configured to, before the first model is trained with the target sample set, train the first model with the sample images in the first sample set as inputs of the first model and the categories of the sample images as desired outputs of the first model, until training of the first model converges;
a similarity acquisition unit configured to input each sample image in the first sample set to the first model and obtain the probability that the sample image belongs to the target category as its similarity to the target category; and
a filtering unit configured to filter out a sample image if the sample image does not belong to the target category and its similarity to the target category is higher than the specified similarity.
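The filtering rule of claim 31 drops exactly the confusable negatives: images that are not of the target category yet score a high similarity to it. A minimal sketch, with a hypothetical `(image_id, category, similarity)` sample layout:

```python
def filter_target_sample_set(samples, target_category, sim_threshold):
    """samples: iterable of (image_id, category, similarity_to_target).
    Remove samples that do NOT belong to the target category but whose
    similarity to it exceeds the threshold; everything else is kept."""
    return [s for s in samples
            if not (s[1] != target_category and s[2] > sim_threshold)]
```

Removing these near-boundary negatives before retraining gives the first model a cleaner decision boundary for the target category.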
32. The apparatus of claim 29, wherein the second acquisition module comprises:
a cropping unit configured to crop each sample image in the second sample set a plurality of times to acquire a plurality of cropped sample images; and
a third sample set acquisition unit configured to acquire the respective categories of each sample image in the second sample set and of the plurality of cropped sample images, and to construct a third sample set composed of the sample images and their corresponding categories.
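The augmentation step of claim 32 can be sketched as below. The claim only recites cropping each image a plurality of times; the random crop positions, the fixed crop size, and the function names here are assumptions made for the sketch.

```python
import numpy as np

def random_crops(image, crop_size, n_crops, rng):
    """Crop an H x W image n_crops times at random offsets; each crop
    inherits the category of the source image."""
    h, w = image.shape[:2]
    ch, cw = crop_size
    crops = []
    for _ in range(n_crops):
        top = int(rng.integers(0, h - ch + 1))
        left = int(rng.integers(0, w - cw + 1))
        crops.append(image[top:top + ch, left:left + cw])
    return crops

def build_third_sample_set(second_set, crop_size, n_crops, seed=0):
    """second_set: iterable of (image, category). The third sample set
    holds each original image plus its crops, all sharing one label."""
    rng = np.random.default_rng(seed)
    third = []
    for image, category in second_set:
        third.append((image, category))
        for crop in random_crops(image, crop_size, n_crops, rng):
            third.append((crop, category))
    return third
```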
33. The apparatus of claim 29, wherein the feature extraction module comprises:
a feature recognition unit configured to perform feature recognition on an object of interest in the sample image and acquire feature information of the object of interest from the sample image;
an acquisition unit configured to acquire an object identifier of the target object associated with the sample image; and
a user profile acquisition unit configured to acquire the user features of the target object according to the object identifier.
34. The apparatus of claim 33, wherein performing feature recognition on the object of interest in the sample image and acquiring the feature information of the object of interest from the sample image comprises:
when the sample image comprises a plurality of objects of interest, sequentially acquiring feature information of at least one object of interest from the sample image in order of the sizes of the objects of interest.
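The size-ordered selection of claim 34 can be sketched as a sort over detected objects. The claim only recites an order of size; descending bounding-box area, the `(bbox, feature)` layout, and keeping the first `k` objects are assumptions of this sketch.

```python
def features_by_size(detections, k=1):
    """detections: list of (bbox, feature) with bbox = (x, y, w, h).
    Sort the detected objects of interest by bounding-box area, largest
    first, and return the feature info of the top k objects."""
    ordered = sorted(detections,
                     key=lambda d: d[0][2] * d[0][3],  # area = w * h
                     reverse=True)
    return [feature for _, feature in ordered[:k]]
```

Processing the largest object first is a common heuristic when one dominant object per image is expected to carry most of the signal.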
35. An electronic device comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-17.
36. A computer storage medium, characterized in that the computer storage medium stores a computer program for causing a computer to perform the method of any one of claims 1-17.
CN202011325685.9A 2020-11-23 2020-11-23 Image classification method, device, electronic equipment and storage medium Active CN112434178B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011325685.9A CN112434178B (en) 2020-11-23 2020-11-23 Image classification method, device, electronic equipment and storage medium
PCT/CN2021/114146 WO2022105336A1 (en) 2020-11-23 2021-08-23 Image classification method and electronic device

Publications (2)

Publication Number Publication Date
CN112434178A CN112434178A (en) 2021-03-02
CN112434178B CN112434178B (en) 2024-07-12

Family

ID=74693868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011325685.9A Active CN112434178B (en) 2020-11-23 2020-11-23 Image classification method, device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112434178B (en)
WO (1) WO2022105336A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434178B (en) * 2020-11-23 2024-07-12 北京达佳互联信息技术有限公司 Image classification method, device, electronic equipment and storage medium
CN114255389B (en) * 2021-11-15 2024-12-27 浙江时空道宇科技有限公司 A method, device, equipment and storage medium for detecting a target object
CN114241241B (en) * 2021-12-16 2024-12-06 北京奇艺世纪科技有限公司 Image classification method, device, electronic device and storage medium
CN114428872B (en) * 2022-01-20 2025-09-02 北京有竹居网络技术有限公司 Image retrieval method, device, storage medium and electronic device
CN114842261B (en) * 2022-05-10 2025-07-25 西华师范大学 Image processing method, device, electronic equipment and storage medium
CN115019212B (en) * 2022-06-30 2024-11-26 中国电子科技集团公司第五十四研究所 A vehicle recognition method for infrared remote sensing images based on semi-supervised learning
CN115115911B (en) * 2022-07-13 2025-03-21 维沃移动通信(杭州)有限公司 Image model training method, device and electronic device
CN116206371A (en) * 2022-12-27 2023-06-02 支付宝(杭州)信息技术有限公司 Living body detection model training method, living body detection method and system
CN116205601B (en) * 2023-02-27 2024-04-05 开元数智工程咨询集团有限公司 Internet-based engineering list rechecking and data statistics method and system
CN117292174B (en) * 2023-09-06 2024-04-19 中化现代农业有限公司 Apple disease identification method, apple disease identification device, electronic equipment and storage medium
CN118968157B (en) * 2024-07-29 2025-07-18 南方科技大学 A classification method, device, terminal and storage medium based on cross-modal feature projection learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102449661A (en) * 2009-06-01 2012-05-09 Hewlett-Packard Development Company, L.P. Determining detection certainty in a cascade classifier
CN107844785A (en) * 2017-12-08 2018-03-27 浙江捷尚视觉科技股份有限公司 Method for detecting human faces based on size estimation

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2015308717B2 (en) * 2014-08-28 2021-02-18 Retailmenot, Inc. Reducing the search space for recognition of objects in an image based on wireless signals
CN107145862B (en) * 2017-05-05 2020-06-05 山东大学 A multi-feature matching multi-target tracking method based on Hough forest
CN108446723B (en) * 2018-03-08 2021-06-15 哈尔滨工业大学 A Multi-scale Space-Spectrum Collaborative Classification Method for Hyperspectral Images
CN111444334B (en) * 2019-01-16 2023-04-25 阿里巴巴集团控股有限公司 Data processing method, text recognition device and computer equipment
CN110009623B (en) * 2019-04-10 2021-05-11 腾讯医疗健康(深圳)有限公司 Image recognition model training and image recognition method, device and system
CN110619350B (en) * 2019-08-12 2021-06-18 北京达佳互联信息技术有限公司 Image detection method, device and storage medium
CN111125422B (en) * 2019-12-13 2024-04-02 北京达佳互联信息技术有限公司 Image classification method, device, electronic equipment and storage medium
CN112434178B (en) * 2020-11-23 2024-07-12 北京达佳互联信息技术有限公司 Image classification method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2022105336A1 (en) 2022-05-27
CN112434178A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
CN112434178B (en) Image classification method, device, electronic equipment and storage medium
CN110147726B (en) Service quality inspection method and device, storage medium and electronic device
CN110781960B (en) Training method, classification method, device and equipment of video classification model
US11294754B2 (en) System and method for contextual event sequence analysis
CN112860943A (en) Teaching video auditing method, device, equipment and medium
CN111858973B (en) Method, device, server and storage medium for detecting multimedia event information
CN110751224A (en) Training method of video classification model, video classification method, device and equipment
CN109992484B (en) A network alarm correlation analysis method, device and medium
CN111046956A (en) Occlusion image detection method and device, electronic equipment and storage medium
CN112925905B (en) Method, device, electronic equipment and storage medium for extracting video subtitles
CN103455546B (en) For setting up the method and system of profile for activity and behavior
CN114245232B (en) Video abstract generation method and device, storage medium and electronic equipment
CN118135480A (en) Visual image processing method and system for electromechanical construction of tunnel
CN112990350B (en) Target detection network training method and target detection network-based coal and gangue identification method
CN114241363A (en) Process identification method, device, electronic device and storage medium
CN114418111A (en) Label prediction model training and sample screening method, device and storage medium
CN116775437A (en) A method, device, equipment and medium for model generation and disk failure prediction
CN114842295B (en) Method, device and electronic equipment for obtaining insulator fault detection model
CN111882034A (en) Neural network processing and face recognition method, device, equipment and storage medium
US20240404281A1 (en) Abnormality analysis apparatus, abnormality analysis method, and non-transitory computer-readable medium
CN112989869B (en) Optimization method, device, equipment and storage medium of face quality detection model
CN118295842A (en) Data processing method, device and server for transaction system abnormal event
CN115238805B (en) Training method of abnormal data recognition model and related equipment
CN113807445B (en) File rechecking method and device, electronic device and readable storage medium
CN114022698B (en) A multi-label behavior recognition method and device based on binary tree structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant