The present invention relates to the fields of computer vision and pattern recognition, and more particularly to an object detection system and method based on a deformable component model (DPM, Deformable Part Model).
Background Art
Object detection is an important technology in computer vision, with significant applications in intelligent video surveillance, content-based image/video retrieval, image/video annotation, and assisted human-computer interaction. Because objects of different classes differ greatly in shape, object detection is very difficult.
Many object detection schemes have been proposed so far, for example Boosting methods and DPM. A Boosting method uses features to train a plurality of simple weak classifiers, builds these weak classifiers into a cascade classifier, and classifies each sliding window in the image. Boosting methods can successfully detect relatively simple objects such as human faces and eyes, but they still cannot satisfy the detection of general objects (for example, automobiles). A DPM is parameterized by the appearance of each part in the image and by a geometric model that captures the spatial relationships between the parts. Learning the DPM parameters can be formulated as a classification problem with latent variables, which can be solved using a latent support vector machine (SVM). DPM represents the state of the art in this field and took first place in object detection at PASCAL VOC 2009. DPM is efficient compared with other methods: it can process one image in a few seconds for one class. However, this speed is still insufficient for applications with strict real-time requirements. In addition, DPM still has difficulty detecting multiple overlapping object instances in an image.
A DPM generally includes a data term defined on the root and the parts of the object in the image, and a deformation term that measures the deformation cost of each part relative to its anchor position. The score of an object instance in the DPM can be expressed as follows:
$$\mathrm{score}(p_0, \ldots, p_n) = \sum_{i=0}^{n} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b \tag{1}$$
where $\sum_{i=0}^{n} F_i \cdot \phi(H, p_i)$ is the data term defined on the root and the parts of the object, and $\sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i)$ is the deformation term.
Here, $p_0$ denotes the root of the object; $p_1, p_2, \ldots, p_n$ denote the $n$ parts of the object, where $n$ is the number of parts and is a positive integer; $F_i$ is the convolution filter corresponding to the root feature vector (when $i = 0$) or to a part feature vector (when $i \neq 0$); $H$ is the image feature pyramid of the input image; $\phi(H, p_i)$ denotes the feature at $p_i$ obtained from the image feature pyramid; $\phi_d(dx_i, dy_i) = (dx_i, dy_i, (dx_i)^2, (dy_i)^2)$; $dx_i$ and $dy_i$ denote the displacements of the $i$-th part in the horizontal and vertical directions; $d_i$ is the parameter of the deformation term; and $b$ is the bias of equation (1) as a scoring function, which depends on the specific DPM used.
In addition, the score in a DPM-based classifier can also be expressed as:
$$f(z) = \beta \cdot \Psi(H, z) \tag{2}$$
where $z = (p_0, \ldots, p_n)$ and $\beta$ denotes the model parameters of the DPM; $z$ is the latent vector, which contains the positions and scaling ratios of the object root and parts and/or the index of the specific root template used.
Equations (1) and (2) are consistent when
$$\beta = (F_0, \ldots, F_n, d_1, \ldots, d_n, b),$$
$$\Psi(H, z) = (\phi(H, p_0), \ldots, \phi(H, p_n), -\phi_d(dx_1, dy_1), \ldots, -\phi_d(dx_n, dy_n), 1).$$
In the above model, as with a conventional SVM, $\beta$ (that is, the parameters $F_i$, $d_i$ and $b$) can be obtained by training the model with positive and negative samples.
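The consistency of equations (1) and (2) can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`dot`, `phi_d`, `score_eq1`, `score_eq2`) and the toy numbers are illustrative assumptions, not part of the described system, and real features would come from an image feature pyramid.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    assert len(a) == len(b)
    return sum(x * y for x, y in zip(a, b))


def phi_d(dx: float, dy: float) -> List[float]:
    """Deformation feature phi_d(dx, dy) = (dx, dy, dx^2, dy^2)."""
    return [dx, dy, dx * dx, dy * dy]


def score_eq1(filters: Sequence[Vector], feats: Sequence[Vector],
              deform_params: Sequence[Vector],
              offsets: Sequence[Tuple[float, float]], b: float) -> float:
    """Equation (1): data terms minus deformation costs plus the bias.

    filters[i], feats[i] hold F_i and phi(H, p_i) for i = 0..n (index 0 is the root);
    deform_params[k], offsets[k] hold d_i and (dx_i, dy_i) for the parts i = 1..n.
    """
    data = sum(dot(F, f) for F, f in zip(filters, feats))
    deform = sum(dot(d, phi_d(dx, dy)) for d, (dx, dy) in zip(deform_params, offsets))
    return data - deform + b


def score_eq2(filters: Sequence[Vector], feats: Sequence[Vector],
              deform_params: Sequence[Vector],
              offsets: Sequence[Tuple[float, float]], b: float) -> float:
    """Equation (2): beta . Psi(H, z), using the concatenations given in the text."""
    beta = [w for F in filters for w in F]
    beta += [w for d in deform_params for w in d]
    beta.append(b)
    psi = [v for f in feats for v in f]
    psi += [-v for dx, dy in offsets for v in phi_d(dx, dy)]
    psi.append(1.0)
    return dot(beta, psi)


# Toy model: one root and one part, 2-dimensional features (made-up numbers).
filters = [[1.0, 2.0], [0.5, -1.0]]
feats = [[0.5, 0.25], [2.0, 1.0]]
deform_params = [[0.0, 0.0, 0.5, 0.5]]
offsets = [(1.0, 2.0)]
b = -0.5

s1 = score_eq1(filters, feats, deform_params, offsets, b)
s2 = score_eq2(filters, feats, deform_params, offsets, b)
```

Both functions return the same value for the toy input, which illustrates why the parameters can be learned as a single weight vector $\beta$ in an SVM-style formulation.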
When detecting an object in an image, the image is scanned with a root template (that is, a detection window) to extract a plurality of window areas (that is, roots), and the image features of the window areas are used as the input of the DPM. The window in which an object exists is then determined according to the score of each window area.
However, in a conventional DPM, each root template or part model matches objects using a fixed aspect ratio, so an object whose aspect ratio differs slightly from that of the template is easily missed in detection.
Summary of the invention
An object of the present invention is to solve the technical problems described above.
One aspect of the present invention provides an object detection system, comprising: an image receiving unit that receives an image to be detected; a feature extraction unit that scans the image to be detected with a root template to extract image features of a plurality of window areas; a deformable component model detecting unit that inputs the extracted image features of the plurality of window areas into a deformable component model to obtain the confidences of the plurality of window areas, wherein the deformable component model adjusts the size of each window area so that the confidence of that window area is maximized; and an object determining unit that determines the window area in which an object exists according to the confidences of the window areas.
Optionally, the range of each extracted window area is adjusted so that the dot product of the convolution filter corresponding to the adjusted window area and the image features on the adjusted window area is maximized, whereby the confidence of each window area is maximized.
Optionally, the deformable component model is obtained by training, wherein, when the deformable component model is trained, the size of each window area used as a sample is adjusted so that the confidence of that window area is maximized.
Optionally, when the deformable component model is trained, the range of each window area used as a sample is adjusted so that the dot product of the convolution filter corresponding to the adjusted sample window area and the image features on the adjusted sample window area is maximized, whereby the confidence of the sample window area is maximized.
Optionally, the deformable component model is a mixture deformable component model.
Optionally, the object detection system further comprises a redundancy suppression unit that removes pseudo window areas from the plurality of window areas for which confidences have been obtained, according to the interaction relation between the plurality of window areas, wherein the redundancy suppression unit comprises: a characteristic information extraction unit that extracts characteristic information from each window area; and a redundancy removal unit that uses the extracted characteristic information to determine the interaction relation, so as to remove pseudo window areas from the plurality of window areas.
Optionally, the characteristic information includes at least one of: the confidence of the window area, position information and/or scale information of the root, position information and/or scale information of the parts, the confidence of the root, and the confidences of the parts.
Optionally, the redundancy removal unit judges and removes pseudo window areas by maximizing the following equation:
[Equation]
where $M$ denotes the number of the plurality of window areas; $\phi(x_i, y_i) = y_i \cdot x_i$; $x_i = (v_i(s), Z)$; $v_i(s)$ denotes the confidence of the $i$-th window area; $Z$ denotes a $K$-dimensional vector, where $K$ is the number of deformable component models included in the mixture deformable component model, the $v_i(c)$-th element of $Z$ is 1, and the other elements of $Z$ are zero; $v_i(c)$ denotes the index of the deformable component model used to detect the $i$-th window area; $y_i$ denotes the binary score of the $i$-th window area, indicating whether it is a pseudo window area; $y_j$ denotes the binary score of the $j$-th window area, indicating whether it is a pseudo window area; the equation further contains the model parameters; and $d_{ij}$ denotes the interaction relation between the $i$-th window area and the $j$-th window area,
wherein, when the equation is maximized, a window area whose binary score indicates a pseudo window area is judged to be a pseudo window area.
Optionally, the model parameters are obtained by training with a predetermined structured classification method, using $\phi(x_i, y_i)$ known in advance.
Optionally, the interaction relation between window areas includes at least one of root-root interaction, root-part interaction, and part-part interaction.
Optionally, the object detection system further comprises a context deformable component model detecting unit that inputs the context features of the window areas for which confidences have been obtained into a context classifier to obtain new confidences of the window areas, wherein the context classifier is a classifier trained using context features as samples.
Optionally, the context features include a shape-position feature, a neighborhood feature, and a co-occurrence feature.
Optionally, the shape-position feature represents the size and position of the window area in the image to be detected and the sizes and relative positions of the parts in the window area; the neighborhood feature represents the image difference between the window area and its neighborhood; and the co-occurrence feature represents the relation between the window area and the window area having the maximum confidence.
Optionally, the context features of a window area are represented by a vector $f$:
$$f = (\sigma(sc), r, p, q, \sigma(s_m), r_m)$$
where $\sigma(sc) = 1/(1 + \exp(2sc))$,
and where $sc$ is the confidence of the window area, $r$ denotes the position and size of the window area, $p$ denotes the position of each part in the window area relative to the center of the root region, $q$ denotes the mean image gray-level difference between a specific region in the window area and the adjacent region of the window area, $s_m$ is the maximum confidence among the confidences of the plurality of window areas, and $r_m$ is the position and size of the window area having the maximum confidence.
Optionally, the object detection system further comprises a context deformable component model detecting unit that inputs the context features of the window areas remaining after the pseudo window areas have been removed from the plurality of window areas into a context classifier to obtain new confidences of the window areas, wherein the context classifier is a classifier trained using context features as samples.
According to another aspect of the present invention, an object detection method is provided, comprising: receiving an image to be detected; scanning the image to be detected with a root template to extract image features of a plurality of window areas; inputting the extracted image features of the plurality of window areas into a deformable component model to obtain the confidences of the plurality of window areas, wherein the size of each window area is adjusted so that the confidence of that window area is maximized; and determining the window area in which an object exists according to the confidences of the window areas.
Optionally, the range of each extracted window area is adjusted so that the dot product of the convolution filter corresponding to the adjusted window area and the image features on the adjusted window area is maximized, whereby the confidence of each window area is maximized.
Optionally, the deformable component model is obtained by training, wherein, when the deformable component model is trained, the size of each window area used as a sample is adjusted so that the confidence of that window area is maximized.
Optionally, when the deformable component model is trained, the range of each window area used as a sample is adjusted so that the dot product of the convolution filter corresponding to the adjusted sample window area and the image features on the adjusted sample window area is maximized, whereby the confidence of the sample window area is maximized.
Optionally, the deformable component model is a mixture deformable component model.
Optionally, the method further comprises: removing pseudo window areas from the plurality of window areas for which confidences have been obtained, according to the interaction relation between the plurality of window areas.
Optionally, the step of removing pseudo window areas comprises: extracting characteristic information from each window area; and using the extracted characteristic information to determine the interaction relation, so as to remove pseudo window areas from the plurality of window areas.
Optionally, the characteristic information includes at least one of: the confidence of the window area, position information and/or scale information of the root, position information and/or scale information of the parts, the confidence of the root, and the confidences of the parts.
Optionally, pseudo window areas are judged and removed by maximizing the following equation:
[Equation]
where $M$ denotes the number of the plurality of window areas; $\phi(x_i, y_i) = y_i \cdot x_i$; $x_i = (v_i(s), Z)$; $v_i(s)$ denotes the confidence of the $i$-th window area; $Z$ denotes a $K$-dimensional vector, where $K$ is the number of deformable component models included in the mixture deformable component model, the $v_i(c)$-th element of $Z$ is 1, and the other elements of $Z$ are zero; $v_i(c)$ denotes the index of the deformable component model used to detect the $i$-th window area; $y_i$ denotes the binary score of the $i$-th window area, indicating whether it is a pseudo window area; $y_j$ denotes the binary score of the $j$-th window area, indicating whether it is a pseudo window area; the equation further contains the model parameters; and $d_{ij}$ denotes the interaction relation between the $i$-th window area and the $j$-th window area, wherein, when the equation is maximized, a window area whose binary score indicates a pseudo window area is judged to be a pseudo window area.
Optionally, the model parameters are obtained by training with a predetermined structured classification method, using $\phi(x_i, y_i)$ known in advance.
Optionally, the interaction relation between window areas includes at least one of root-root interaction, root-part interaction, and part-part interaction.
Optionally, the method further comprises: inputting the context features of the window areas for which confidences have been obtained into a context classifier to obtain new confidences of the window areas, wherein the context classifier is a classifier trained using context features as samples.
Optionally, the context features include a shape-position feature, a neighborhood feature, and a co-occurrence feature.
Optionally, the shape-position feature represents the size and position of the window area in the image to be detected and the sizes and relative positions of the parts in the window area; the neighborhood feature represents the image difference between the window area and its neighborhood; and the co-occurrence feature represents the relation between the window area and the window area having the maximum confidence.
Optionally, the context features of a window area are represented by a vector $f$:
$$f = (\sigma(sc), r, p, q, \sigma(s_m), r_m)$$
where $\sigma(sc) = 1/(1 + \exp(2sc))$,
and where $sc$ is the confidence of the window area, $r$ denotes the position and size of the window area, $p$ denotes the position of each part in the window area relative to the center of the root region, $q$ denotes the mean image gray-level difference between a specific region in the window area and the adjacent region of the window area, $s_m$ is the maximum confidence among the confidences of the plurality of window areas, and $r_m$ is the position and size of the window area having the maximum confidence.
Optionally, the method further comprises: inputting the context features of the window areas remaining after the pseudo window areas have been removed from the plurality of window areas into a context classifier to obtain new confidences of the window areas, wherein the context classifier is a classifier trained using context features as samples.
Another aspect of the present invention provides an object detection system, comprising: an image receiving unit that receives an image to be detected; a feature extraction unit that extracts image features of a plurality of window areas; a deformable component model detecting unit that inputs the extracted image features of the plurality of window areas into a deformable component model to obtain the confidences of the plurality of window areas; and an object determining unit that determines the window area in which an object exists according to the confidences of the window areas.
Optionally, the object detection system further comprises a redundancy suppression unit that removes pseudo window areas from the plurality of window areas for which confidences have been obtained, according to the interaction relation between the plurality of window areas, wherein the redundancy suppression unit comprises: a characteristic information extraction unit that extracts characteristic information from each window area; and a redundancy removal unit that uses the extracted characteristic information to determine the interaction relation, so as to remove pseudo window areas from the plurality of window areas.
Optionally, the object detection system further comprises a context deformable component model detecting unit that inputs the context features of the window areas remaining after the pseudo window areas have been removed from the plurality of window areas into a context classifier to obtain new confidences of the window areas, wherein the context classifier is a classifier trained using context features as samples.
Another aspect of the present invention provides an object detection method, comprising: receiving an image to be detected; extracting image features of a plurality of window areas; inputting the extracted image features of the plurality of window areas into a deformable component model to obtain the confidences of the plurality of window areas; and determining the window area in which an object exists according to the confidences of the window areas.
Optionally, the object detection method further comprises: removing pseudo window areas from the plurality of window areas for which confidences have been obtained, according to the interaction relation between the plurality of window areas, wherein removing pseudo window areas comprises: extracting characteristic information from each window area; and using the extracted characteristic information to determine the interaction relation, so as to remove pseudo window areas from the plurality of window areas.
Optionally, the object detection method further comprises: inputting the context features of the window areas remaining after the pseudo window areas have been removed from the plurality of window areas into a context classifier to obtain new confidences of the window areas, wherein the context classifier is a classifier trained using context features as samples.
According to the technical solution of the present invention, improving the data term defined on the root in the existing DPM effectively overcomes the problem that an object whose aspect ratio differs slightly from that of the template is easily missed in detection.
In addition, the redundancy suppression technique of the present invention addresses the problem that the scores of window areas detected by the DPM may be inaccurate because objects are occluded, joined with other objects, or arranged and overlapped in complex spatial layouts, and it effectively eliminates pseudo window areas.
In addition, according to the present invention, correcting the classification scores of window areas with context features further improves detection accuracy, in particular the accuracy of detecting objects in medical images.
Additional aspects and/or advantages of the present invention will be set forth in part in the description that follows and, in part, will be apparent from the description or may be learned by practice of the invention.
Embodiments
Exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings, are described more fully below with reference to the drawings. Throughout the description of the drawings, like reference numerals denote like elements.
Fig. 1 is a block diagram illustrating an object detection system 100 for detecting an object in an image according to an embodiment of the present invention.
The detection system 100 includes an image receiving unit 110, a feature extraction unit 120, a DPM detecting unit 130, and an object determining unit 140.
The image receiving unit 110 receives an image to be detected.
The feature extraction unit 120 extracts image features from the image to be detected.
The extracted image features may be any of various image features, for example histogram of oriented gradients (HOG) features, local binary pattern (LBP) features, grid depth features (GDF), or scale-invariant feature transform (SIFT) features.
The feature extraction unit 120 may scan the image to be detected with a root template, thereby obtaining a plurality of window areas (that is, a plurality of roots) and their image features.
The DPM detecting unit 130 may use a pre-trained DPM to detect objects. The DPM detecting unit 130 inputs the image features of each window area extracted by the feature extraction unit 120 into the DPM, thereby obtaining the classification score (that is, the confidence) of each window area.
The DPM according to the present invention improves on the existing DPM.
In the existing DPM, according to equation (1), the data term defined on the root (that is, the dot product of the convolution filter $F_0$ corresponding to the root, namely the extracted window area, and the feature extracted at the root $p_0$) can be expressed as $F_0 \cdot \phi(H, p_0)$.
In the DPM according to the present invention, the data term defined on the root is expressed as follows:
$$\max_{\tau \subseteq P_0} F_0(\tau) \cdot \phi(H, p_0, \tau) \tag{4}$$
where $P_0$ denotes the region of the root, $F_0(\tau)$ denotes the portion on region $\tau$ of the convolution filter $F_0$ corresponding to the root, and $\phi(H, p_0, \tau)$ denotes the portion on region $\tau$ of the root feature at $P_0$ obtained from the image feature pyramid $H$.
Equation (4) expresses that the range of the original root region is adjusted to $\tau$ so that the dot product of the convolution filter $F_0(\tau)$ applied on region $\tau$ and the feature vector extracted on region $\tau$ (that is, the data term defined on the adjusted root) is maximized. Preferably, in view of computational complexity and how representative the regions are, the range of the variable $\tau$ is restricted to several rectangular regions equal to or slightly smaller than the region $P_0$.
Equation (1) can then be rewritten as:
$$\mathrm{score}(p_0, \ldots, p_n) = \max_{\tau \subseteq P_0} F_0(\tau) \cdot \phi(H, p_0, \tau) + \sum_{i=1}^{n} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b \tag{5}$$
Specifically, when the DPM is trained, the range of the region of the root $P_0$ used as a sample is adjusted so that the dot product of the convolution filter $F_0$ corresponding to the adjusted root $P_0$ and the feature extracted at the adjusted root is maximized. When the DPM is used for detection, the range of the region of the root $P_0$ used as input is adjusted in the same way. According to equation (5), when the dot product of the convolution filter $F_0$ corresponding to the root $P_0$ and the feature extracted at the root is maximized, the final classification score is also maximized.
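The adjusted root data term of equation (4) can be illustrated with a short sketch. It assumes that the root filter and the root feature are stored as grids of per-cell vectors and that the candidate regions are the full root region plus rectangles trimmed by at most `shrink` cells on each side; the names `restricted_dot`, `candidate_regions`, `adjusted_root_term` and the parameter `shrink` are hypothetical, and the text does not prescribe how the candidate rectangles are chosen.

```python
from typing import List, Sequence, Tuple

Grid = List[List[Sequence[float]]]   # rows x cols x feature dimension
Rect = Tuple[int, int, int, int]     # (top, left, bottom, right), end-exclusive


def restricted_dot(filt: Grid, feat: Grid, tau: Rect) -> float:
    """Dot product of the filter and the feature restricted to region tau."""
    top, left, bottom, right = tau
    total = 0.0
    for r in range(top, bottom):
        for c in range(left, right):
            total += sum(w * x for w, x in zip(filt[r][c], feat[r][c]))
    return total


def candidate_regions(rows: int, cols: int, shrink: int = 1) -> List[Rect]:
    """Full root region plus rectangles trimmed by up to `shrink` cells per side."""
    regions = []
    for t in range(shrink + 1):
        for l in range(shrink + 1):
            for b in range(shrink + 1):
                for r in range(shrink + 1):
                    if t + b < rows and l + r < cols:   # keep the region non-empty
                        regions.append((t, l, rows - b, cols - r))
    return regions


def adjusted_root_term(filt: Grid, feat: Grid, shrink: int = 1) -> Tuple[float, Rect]:
    """Maximum restricted dot product over the candidate regions, as in equation (4)."""
    rows, cols = len(filt), len(filt[0])
    best = max(candidate_regions(rows, cols, shrink),
               key=lambda tau: restricted_dot(filt, feat, tau))
    return restricted_dot(filt, feat, best), best


# Toy 3 x 3 root with 1-dimensional cell features (made-up numbers):
# the left column matches the filter badly, so trimming it raises the score.
filt = [[[1.0] for _ in range(3)] for _ in range(3)]
feat = [[[-1.0], [2.0], [2.0]] for _ in range(3)]

full_score = restricted_dot(filt, feat, (0, 0, 3, 3))
best_score, best_tau = adjusted_root_term(filt, feat)
```

In the toy input the best region drops the poorly matching left column, which mirrors how a slightly narrower object can still obtain a high root score under the adjusted data term.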
The object determining unit 140 determines the window in which an object exists according to the classification scores (that is, the confidences) of the window areas. It should be understood that, once the classification scores of the window areas have been obtained, techniques for determining the window area in which an object exists are known to those skilled in the art. For example, the window area with the maximum classification score may be taken as the region where the object is located; alternatively, when the classification score of a window area is less than a predetermined threshold, it is determined that no object exists in that window area, and when the classification score is greater than or equal to the predetermined threshold, it is determined that an object exists in that window area.
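The two decision rules mentioned above can be written down directly. The sketch below uses made-up scores and hypothetical function names (`windows_above_threshold`, `best_window`); the threshold value is an assumption chosen only for the example.

```python
from typing import List, Sequence


def windows_above_threshold(scores: Sequence[float], threshold: float) -> List[int]:
    """Indices of window areas whose score is greater than or equal to the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def best_window(scores: Sequence[float]) -> int:
    """Index of the window area with the maximum classification score."""
    return max(range(len(scores)), key=lambda i: scores[i])


scores = [-1.2, 0.3, 0.9, 0.1]          # made-up classification scores
detected = windows_above_threshold(scores, threshold=0.2)
top = best_window(scores)
```

The threshold rule can report several objects per image, whereas the maximum rule reports exactly one window area.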
As described above, improving the data term defined on the root effectively overcomes the problem that an object whose aspect ratio differs slightly from that of the template is easily missed in detection.
In a further embodiment, the DPM detecting unit 130 uses a mixture DPM. The mixture DPM consists of a plurality of DPMs whose root templates differ from one another. In this case, the feature extraction unit 120 scans the image to be detected with the root template of each DPM, thereby obtaining a plurality of window areas and their image features.
The image features of each of the plurality of window areas are input into the mixture DPM to obtain the classification score of each window area.
In some situations, because objects are occluded, joined with other objects, or arranged and overlapped in complex spatial layouts, the scores of the window areas detected by the DPM may be inaccurate, and some pseudo window areas (that is, window areas in which no object exists) may receive high scores, which greatly reduces detection accuracy. To address this problem, the present invention proposes a redundancy suppression technique that excludes these pseudo window areas, so that the window area in which an object exists can be determined according to the classification scores of the window areas remaining after the pseudo window areas have been excluded from the plurality of window areas.
According to another embodiment of the present invention, the object detection system 100 further includes a redundancy suppression unit (not shown). The redundancy suppression unit can remove pseudo window areas from the window areas output by the DPM detecting unit 130. In other words, the redundancy suppression unit can remove the classification scores of pseudo window areas from the classification scores of the plurality of window areas output by the DPM detecting unit 130.
Fig. 2 is a block diagram illustrating the redundancy suppression unit according to an embodiment of the present invention.
The redundancy suppression unit includes a characteristic information extraction unit 141 and a redundancy removal unit 142.
The characteristic information extraction unit 141 extracts characteristic information from each window area (that is, the window area corresponding to each output classification score). Specifically, the characteristic information may include at least one of: the deformation costs of the parts of the object, the total score of the window area, the bias introduced by the model used, the position information of the root and parts of the object in the window area, the scale information of the root and parts, and the score information of the root and parts.
For example, the characteristic information from the $i$-th window area can be expressed as $v_i$:
$$v_i = (b, s, s_0, s_1, \ldots, s_D, dd_1, \ldots, dd_D, l_0, l_1, \ldots, l_D, c) \tag{6}$$
where $l_0$ is the position and scale of the root of the object; $l_1, \ldots, l_D$ are the positions and scales of the parts of the object; $D$ denotes the number of parts of the object; $s_0$ is the score of the root; $s_1, \ldots, s_D$ are the scores of the parts; $dd_1, \ldots, dd_D$ are the deformation costs of the parts; $b$ is the bias introduced by the DPM used within the mixture DPM; $s$ is the total score of the window area; and $c$ is the component index (that is, the $i$-th window area was detected by the $c$-th DPM in the mixture DPM), with $1 \le c \le K$, where $K$ denotes the number of DPMs in the mixture DPM.
It should be understood that although $v_i$ above contains several kinds of information, only part of that information may be extracted as required.
The redundancy removal unit 142 uses the extracted characteristic information to determine the interaction relation between the window areas output by the DPM detecting unit 130, so as to remove pseudo window areas from those window areas. This interaction relation reflects the overlap characteristics between the window areas.
Specifically, let $x_i$ be the characteristic information extracted from window area $i$; the entire image can then be represented by the extracted characteristic information $X = \{x_i : i = 1, \ldots, M\}$, where $M$ denotes the number of window areas. If each window area is given a binary score indicating whether it is a correct instance, the score of the $i$-th window area is $y_i \in \{0, 1\}$ (it should be understood that the binary values of the present invention are not limited to 0 and 1; other values may also be used). Let $Y = \{y_i : i = 1, \ldots, M\}$. The score $S(X, Y)$ of labeling image $X$ with the vector $Y$ is:
[Equation (7)]
where $\phi(x_i, y_i) = y_i \cdot x_i$; equation (7) further contains the model parameters; $x_i = (v_i(s), Z)$; $Z$ denotes a $K$-dimensional vector whose $v_i(c)$-th element is 1 and whose other elements are zero; and $d_{ij}$ denotes the interaction relation (that is, the overlap relation) between the $i$-th window area and the $j$-th window area.
The model parameters can be obtained by training with an existing structured classification method (for example, a structured SVM algorithm or a Boost algorithm), using $\phi(x_i, y_i)$ known in advance.
Preferably, a structured SVM algorithm is used. Since obtaining the model parameters with a structured classification method is an existing technique, it is not described in detail here.
According to embodiments of the present invention, the interaction relation between different window areas includes at least one of root-root interaction, root-part interaction, and part-part interaction.
The root-root interaction reflects the overlap characteristics between the roots of different window areas (for example, between the root of one window area and the root of another window area).
The root-part interaction reflects the overlap characteristics between the roots and the parts of different window areas (for example, between the root of one window area and the parts of another window area).
The part-part interaction reflects the overlap characteristics between the parts of different window areas (for example, between the parts of one window area and the parts of another window area).
Root-root is mutual, root-parts are mutual, parts-parts can be represented as respectively alternately
Know
For example, when using all above-mentioned three kinds when mutual,
Represent the interactive relation between the root between i window area and the j window area.Be appreciated that
It is the matrix of K * K.In one example, the arbitrary element in this matrix
(m, n represent the index of the element in this matrix, for example, the capable n row of m) can be expressed as following equation (8):
Here, ol(v_i(l_0), v_j(l_0)) denotes the overlap ratio between the root of the i-th window area and the root of the j-th window area.
The root-part term represents the interaction relationship between the root of the i-th window area and the parts of the j-th window area. It is a K × (K × D) matrix, and an arbitrary element of this matrix can be expressed as the following equation (9):
Here, g ∈ [1, D], and ol(v_i(l_0), v_j(l_g)) denotes the overlap ratio between the root of the i-th window area and the g-th part of the j-th window area.
The part-part term represents the interaction relationship between the parts of the i-th window area and the parts of the j-th window area. It is a (K × D) × (K × D) matrix. In one example, an arbitrary element of this matrix can be expressed as the following equation (10):
....(10)
Here, e ∈ [1, D], g ∈ [1, D], and ol(v_i(l_e), v_j(l_g)) denotes the overlap ratio between the e-th part of the i-th window area and the g-th part of the j-th window area.
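The overlap ratios that feed the three interaction terms can be sketched as follows. This is only an illustration under stated assumptions: the text does not define ol(·,·) precisely, so intersection-over-union is assumed here, and the box layout (x1, y1, x2, y2) and the names `overlap_ratio` and `interaction_overlaps` are hypothetical. How the values are placed into the K × K, K × (K × D), and (K × D) × (K × D) matrices is given by equations (8) to (10) and is not reproduced in the sketch.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def overlap_ratio(a: Box, b: Box) -> float:
    """Assumed form of ol(a, b): intersection area divided by union area."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def interaction_overlaps(
    root_i: Box, parts_i: List[Box], root_j: Box, parts_j: List[Box]
) -> Tuple[float, List[float], List[List[float]]]:
    """Overlap ratios between window area i and window area j.

    Returns the root-root overlap, the root-part overlaps (root of i
    against each part of j), and the part-part overlaps (each part of i
    against each part of j).
    """
    root_root = overlap_ratio(root_i, root_j)
    root_part = [overlap_ratio(root_i, pg) for pg in parts_j]
    part_part = [[overlap_ratio(pe, pg) for pg in parts_j] for pe in parts_i]
    return root_root, root_part, part_part
```

With D parts per window area, `root_part` has D entries and `part_part` has D × D entries, matching the index ranges e ∈ [1, D] and g ∈ [1, D] used above.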
The Y that maximizes equation (7) is then computed; this computation can be expressed as argmax_Y S(X, Y). A window area labeled 1 is regarded as a finally detected instance, and a window area labeled 0 is regarded as a pseudo-window area.
The Y that maximizes equation (7) can be computed in various ways, for example, by enumeration.
In another embodiment of the present invention, a greedy algorithm is used to compute the Y that maximizes equation (7).
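The two search strategies mentioned above can be sketched as follows. This is a minimal illustration, not the scoring function of equation (7) itself: it assumes that the score decomposes into a per-window term plus a symmetric pairwise term between selected windows, and the names `unary`, `pairwise`, `total_score`, `enumerate_labels`, and `greedy_labels` are hypothetical.

```python
from itertools import product
from typing import List, Sequence


def total_score(unary: Sequence[float],
                pairwise: Sequence[Sequence[float]],
                labels: Sequence[int]) -> float:
    """Assumed score: sum of unary terms of selected windows plus the
    pairwise terms of every selected pair (i < j)."""
    n = len(unary)
    score = sum(unary[i] for i in range(n) if labels[i])
    score += sum(pairwise[i][j]
                 for i in range(n) for j in range(i + 1, n)
                 if labels[i] and labels[j])
    return score


def enumerate_labels(unary, pairwise) -> List[int]:
    """Enumeration: try all 2^n labelings (only feasible for small n)."""
    n = len(unary)
    best = max(product((0, 1), repeat=n),
               key=lambda y: total_score(unary, pairwise, y))
    return list(best)


def greedy_labels(unary, pairwise) -> List[int]:
    """Greedy search: start with every window labeled 0 and repeatedly
    switch on the window that raises the score most, stopping when no
    remaining window gives a positive gain."""
    n = len(unary)
    labels = [0] * n
    while True:
        best_gain, best_i = 0.0, -1
        for i in range(n):
            if labels[i]:
                continue
            # Gain of selecting i given the windows already selected;
            # pairwise is assumed symmetric.
            gain = unary[i] + sum(pairwise[i][j] for j in range(n) if labels[j])
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i < 0:
            return labels
        labels[best_i] = 1
```

Enumeration is exact but its cost grows as 2^n in the number of window areas, while the greedy search needs at most n passes over the windows and is not guaranteed to reach the maximum in general.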
According to another embodiment of the present invention, the detection system 100 further comprises a context DPM detecting unit (not shown). The context DPM detecting unit corrects the classification score of each window area output by the DPM detecting unit 130 or by the redundancy suppression unit, according to the contextual features of the window area.
Specifically, the context DPM detecting unit inputs the contextual features of each window area whose classification score is output by the DPM detecting unit 130 or the redundancy suppression unit into a context classifier, and thereby obtains a new score for the window area. In this case, the object determining unit 140 determines the window areas in which an object exists according to the new scores of the window areas.
The context classifier is a classifier trained with contextual features as samples. Preferably, the contextual features used as samples are extracted from the samples used to train the DPM or the mixture DPM, which can improve training speed and accuracy.
The contextual features comprise shape-position features, neighborhood features, and co-occurrence features. The shape-position features represent the size and position of the window area in the image to be detected, and the size and relative position of each part within the window area. The neighborhood features represent the image difference between the window area and the neighborhood of the window area. Preferably, the area of the window area equals the area of its neighborhood. For example, the image difference can be the difference in mean gray level or in gray-level variance, or a statistic such as the gray-level covariance, between the root area or a part area and the neighborhood. The co-occurrence features represent the relationship between the window area and the window area that has the maximum score among all detected window areas. For example, if the current window area is not the window area with the maximum score, the co-occurrence features generally have a suppressing effect on the current window area in the context classifier.
In one example, the contextual features of a window area can be represented by the following vector f:
f = (σ(sc), r, p, q, σ(s_m), r_m)
where σ(sc) = 1/(1 + exp(-2·sc)), sc is the score of the window area, r denotes the position and size of the window area, p denotes the position of each part in the window area relative to the center of the root area, q denotes the mean image gray-level difference between a specific region in the window area and the adjacent region of the window area, s_m is the maximum score among the scores of the window areas, and r_m is the position and size of the window area having the maximum score.
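Assembling the vector f for one window area can be sketched as follows. The sketch rests on several assumptions that are not in the text: σ is taken to be the increasing logistic function 1/(1 + exp(-2·sc)), r is encoded as (x, y, width, height), p as one (dx, dy) offset per part, and q as one gray-level difference per region; the names `Window`, `sigma`, and `context_features` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


def sigma(score: float) -> float:
    """Assumed logistic squashing 1 / (1 + exp(-2 * score)), written in a
    form that avoids overflow for large negative scores."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-2.0 * score))
    e = math.exp(2.0 * score)
    return e / (1.0 + e)


@dataclass
class Window:
    score: float                                # sc
    rect: Tuple[float, float, float, float]     # r: (x, y, width, height)
    part_offsets: List[Tuple[float, float]]     # p: offsets from root center
    gray_diffs: List[float]                     # q: mean gray-level differences


def context_features(window: Window, all_windows: List[Window]) -> List[float]:
    """Flatten f = (sigma(sc), r, p, q, sigma(s_m), r_m) into one list."""
    best = max(all_windows, key=lambda w: w.score)  # window with maximum score
    f = [sigma(window.score), *window.rect]
    for dx, dy in window.part_offsets:
        f.extend((dx, dy))
    f.extend(window.gray_diffs)
    f.append(sigma(best.score))
    f.extend(best.rect)
    return f
```

The resulting list is the kind of input a context classifier would be trained on and later evaluated with; the choice of classifier is left open here, as it is in the text.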
When detecting tumor lesions in medical images and the like, tumors differ from other detection objects in that they vary greatly in shape, have poor contrast, and are accompanied by pronounced noise. In particular, the image features of some tumor regions are ambiguous: a nearly identical image block may be regarded as a tumor in one image but not in another, or may even be regarded as a tumor at one position of an image but not at another position of the same image. Correcting the classification score with the contextual features of the present invention can effectively improve the detection accuracy for tumor lesions.
In addition, in a further embodiment, the redundancy suppression unit and the context DPM detecting unit according to the present invention can be applied, separately or together, to an existing DPM-based object detection system (for example, an object detection system that uses the DPM of equation 1).
Fig. 3 is a flowchart of an object detection method for detecting an object in an image according to an embodiment of the present invention.
In step 301, an image to be detected is received.
In step 302, image features are extracted from the image to be detected. Specifically, the image to be detected is scanned with the root template, thereby obtaining a plurality of window areas and their image features.
In step 303, the image features extracted in step 302 are input into the pre-trained DPM, thereby obtaining the classification score of each window area.
In step 304, the window areas in which an object exists are determined according to the classification scores of the window areas obtained in step 303.
Fig. 4 is a flowchart of an object detection method for detecting an object in an image according to another embodiment of the present invention.
In step 401, an image to be detected is received.
In step 402, image features are extracted from the image to be detected. Specifically, the image to be detected is scanned with the root template, thereby obtaining a plurality of window areas and their image features.
In step 403, the image features extracted in step 402 are input into the pre-trained DPM, thereby obtaining the classification score of each window area.
In step 404, the window areas in which an object exists are determined according to the classification scores of the window areas obtained in step 403.
In step 405, feature information is extracted from each window area. The feature information can comprise at least one of the following: the total score of the window area, the position information and/or scale information of the root, the position information and/or scale information of the parts, the score of the root, and the scores of the parts.
In step 406, pseudo-window areas are removed from the plurality of window areas using the extracted feature information. Specifically, the result of formula (7) is maximized, and each window area whose binary label marks it as a pseudo-window area is judged to be a pseudo-window area.
In a further embodiment, the method may further comprise, after step 303, 403, or 406, a step of extracting the contextual features of the window areas and inputting the contextual features into the context classifier to obtain new scores for the window areas. It should be understood that the window areas from which contextual features are extracted do not include the pseudo-window areas determined in step 406.
According to the present invention, improving the way the data term defined at the root is computed in the existing DPM effectively overcomes the problem that objects whose aspect ratio differs slightly from that of the template are easily missed in detection. In addition, the redundancy suppression technique of the present invention solves the problem that the scores of window areas detected by the DPM may be inaccurate because objects are occluded, are merged with other objects, or have a complex spatial layout and overlap, and thereby effectively eliminates pseudo-window areas. Furthermore, according to the present invention, correcting the classification scores of window areas with contextual features further improves detection accuracy, in particular for objects in medical images.
Although the present invention has been particularly shown and described with reference to exemplary embodiments thereof, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the present invention as defined by the claims.