CN113052083B

CN113052083B - Action behavior segmentation method for constraint matrix decomposition of multi-neighbor graph

Info

Publication number: CN113052083B
Application number: CN202110326354.5A
Authority: CN
Inventors: 张妍
Original assignee: Shaanxi Stride Industrial Co ltd
Current assignee: Shaanxi Stride Industrial Co ltd
Priority date: 2021-03-26
Filing date: 2021-03-26
Publication date: 2024-06-11
Anticipated expiration: 2041-03-26
Also published as: CN113052083A

Abstract

The invention relates to the technical field of human behavior segmentation, and discloses an action behavior segmentation method for multi-neighbor graph constraint matrix decomposition, which comprises the following steps: s1, designing constraint regular terms of a multi-neighbor graph constraint structure through behavior action sequence data; s2, constructing a regular term constraint semi-NMF matrix decomposition model; s3, solving a semi-NMF matrix decomposition model to obtain a low-dimensional representation of the behavior sequence; s4, calculating sequence point similarity weight by using low-dimensional representation, and generating a similarity relation matrix; s5, according to the similarity relation matrix, the behavior segmentation is obtained by using a graph segmentation method, and the behavior segmentation method has the advantage that the phenomenon of error segmentation in the action sequence is reduced, so that similar actions in the human body behavior can be accurately distinguished.

Description

Action behavior segmentation method for constraint matrix decomposition of multi-neighbor graph

Technical Field

The invention relates to the technical field of human behavior segmentation, in particular to an action behavior segmentation method for multi-neighbor graph constraint matrix decomposition.

Background

Human behavior recognition is a multi-disciplinary crossing research direction, relates to a plurality of disciplines such as image processing, computer vision, pattern recognition, machine learning, artificial intelligence and the like, and is an important research subject in the field of computer vision. With the rapid development of image processing technology and intelligent hardware manufacturing technology, human behavior data are more and more abundant, and the variety is more and more abundant. Generally, it can be classified into a sensor and a vision. Sensor-based behavior recognition utilizes the corresponding sensors worn to obtain behavior data information. The behavior recognition based on the vision can be divided into recognition based on single-frame images and videos, the behavior recognition based on the single-frame images cannot effectively acquire the coherent information between the behavior images, misjudgment is usually generated, the behavior recognition based on the videos can well acquire the space and time information in the videos, and the accuracy is greatly improved. The behavior analysis based on the video is high in expansibility and flexibility, so that the behavior analysis based on the video is widely researched and applied. In practice, the human behavior data is a sequence data set, whether it is sensor or vision collected data.

The human behavior segmentation is to accurately segment the original human motion capture data, is the basis of structural analysis, understanding and application of the human motion data, and is a focus of attention of researchers at home and abroad. Extensive research has conducted extensive discussion of this problem, with a series of achievements, roughly forming three types of methods.

Aiming at the difficult problems of high dimension and large processing calculation amount of original human motion capture data, some researchers propose a segmentation method based on data dimension reduction. Barbic et al suggest that different human locomotor behaviors can be represented by different intrinsic dimensions, and methods such as PCA, PPCA and GMM have been proposed. These methods reduce the dimension of the human motion data by principal component analysis PCA (principal component analysis). Assume that: the intrinsic dimension of a motion sequence containing a single behavior should be smaller than the intrinsic dimension of a motion sequence containing a plurality of behaviors, and therefore the positions of behavior segmentation points can be detected by projection errors in the subspace. Yang Tao et al propose a method for extracting key frames based on motion capture data with hierarchical curve simplification. The PCA-based method has low requirements on hardware, is relatively simple to realize, and can be easily realized for some applications conforming to the assumption.

Deep learning-based methods are new hot spots studied in recent years. Like the TS-CNN method, spatiotemporal CNN (ST-CNN) of low-level coded visual information is used for fine-grained tasks of coding object states, positions, and relationships between objects. A semi-markov model and Conditional Random Field (CRF) co-segmentation and classification action is introduced that captures high-level time information. The TCN method introduces ED-TCN and DILATED TCN coding and decoding networks. Misra I et al use the temporal coherence information to train a depth model for behavior recognition and pose estimation. In summary, the deep learning-based method generally uses a combination of CNN (or an automatic encoder) and other machine learning methods to realize behavior segmentation, which has better effect than other methods, but has higher requirement on hardware configuration, relies on a large amount of data, and is difficult to realize.

Compared with the two ideas, the clustering method has the characteristics of relative simplicity and good effect, and is also an important point for researching human behavior segmentation. Based on two extended kmeans clusters, such as ACA (Aligned Cluster Analysis), it is ensured that the clusters can contain a variable number of features. In order to overcome the deficiency of ACA, zhou et al have also proposed the HACA method. As an important class of approaches, subspace clustering (subspace clustering) is often used to solve the motion segmentation problem. Xia and the like propose a SSC (sparse subspace clustering ) based method, subspace clustering is carried out through SSC, and then triangle constraint is used for solving the problem that similar frames cannot be divided into the same cluster in different time periods, so that the time continuity is ensured. Considering that a human behavior sequence is a kind of time sequence data, similarity information is contained between adjacent sequence points, and some methods design clustering algorithms for the sequence data and apply to human behavior segmentation. Such as OSC (Ordered Subspace Clustering), embeds the neighbor structure information into the optimization objective of the subspace representation. Li et al propose TSC (temporal subspace clustering) algorithm, adopt different constraint regular terms, and embed similarity constraint between sequence data into subspace projection process, so as to obtain better action representation.

However, the difficulty of human behavior segmentation is that: 1) It is difficult to design a reasonable and easy to calculate behavioral sequence point similarity metric. Since the behavior movement is continuous sequence data over a period of time, the similarity of the behavior data needs to consider the similarity problem between behavior actions besides the similarity between sequence points in the behavior actions, and the problems of alignment, clipping and the like need to be considered for comparing the similarity between the behavior actions; 2) In general, the behavior sequence data is high-dimensional sequence data, and preprocessing such as similarity measurement, data alignment, data dimension reduction and the like needs to be performed on the high-dimensional data, so that practical implementation is difficult.

Disclosure of Invention

The invention provides a motion behavior segmentation method for multi-neighbor graph constraint matrix decomposition, which has the phenomenon of error division in a motion sequence and can accurately distinguish similar motions in human behaviors.

The invention provides an action behavior segmentation method for multi-neighbor graph constraint matrix decomposition, which comprises the following steps:

S1, designing constraint regular terms of a multi-neighbor graph constraint structure through behavior action high-dimensional sequence data;

s2, constructing a regular term constraint non-semi-negative matrix semi-NMF decomposition model;

S3, solving a non-semi-negative matrix semi-NMF decomposition model to obtain a low-dimensional representation matrix H of the behavior sequence;

S4, generating a relation graph G of sequence points by using a low-dimensional representation matrix H;

s5, dividing the relation graph G of the sequence points by using a graph cutting method to obtain the division of the action behaviors.

The specific step of designing the constraint regularization term of the constraint structure of the multi-neighbor graph in the step S1 comprises the following steps:

s11, calculating neighbor constraint graph R ₁

Let the input X ε R ^d×n, its low-dimensional representation is denoted as: h ε R ^p×n, where d represents the original dimension of each data point in the sequence, p represents the dimension of the data points in the low-dimensional space, and n represents the total number of data points in the sequence; because the human body behavior is formed by continuous sequences, the adjacent sequence points have higher similarity; the closer the sequence points are, the higher the similarity is; the farther the sequence points are, the lower the similarity is;

let the sum of the number of neighbors before and after the current data point i is included as q, q is a positive even number, and the error between i and the neighbors is as small as possible, namely: Tending to 0, constructing a structural constraint by using the current data point and its several neighbors, defining a strip matrix R ₁∈R^n×n;

When i is less than or equal to q/2, the number of neighbors before the current sequence point is less than q/2, and q neighbors after the sequence point are sequentially complemented; similarly, when i is larger than n-q/2, the number of neighbors after the current sequence point is smaller than q/2, and q neighbors before the sequence point are sequentially complemented;

S12, calculating a similarity graph R ₂

For the input X epsilon R ^d×n, the similarity of each node to all other nodes is calculated, and the first q nodes with the highest similarity form a similarity graph R ₂∈R^n×n used for representing a plurality of node sets similar to the nodes in the data.

The specific method for constructing the regular term constraint non-semi-negative matrix semi-NMF decomposition model in the step S2 is as follows:

aiming at the characteristic that X epsilon R ^d×n is high-dimensional data, after multi-graph constraint is obtained, performing dimension reduction on the original data by using semi-NMF; let Z epsilon R ^d×p represent the feature space, H epsilon R ^p×R represent the representation coefficient matrix of the input data in the feature space, the matrix element is non-negative;

the build optimization objective is as follows:

Wherein, Representing the Frobenius norm, where α and β are weights in the optimization objective for the weight parameters adjustment regularization constraint term, τ is the weight vector for the different constraint graphs, where L ₁＝D₁-R₁,D₁＝diag(∑R₁) and L ₂＝D₂-R₂,D₂＝diag(∑R₂).

The step S3 is to solve a non-semi-negative matrix semi-NMF decomposition model to obtain a low-dimensional representation of a behavior sequence as H, and the specific method is as follows:

the solution semi-NMF matrix decomposition model is a non-convex optimization problem, no global optimal solution exists, the local optimal solution of the optimization model is obtained by utilizing the concept of alternate iteration, the iteration times are set, other variables are respectively fixed each time, Z, H and τ are sequentially calculated until the iteration times reach a set value, and the obtained H is the final output;

in each iteration, Z, H and τ are calculated as follows:

The iterative formula of Z: according to the matrix operation rule, the closed solution is obtained as follows:

Z=XH^T(HH^T)^-1

h, iterative formula: according to the rule of nonnegative matrix factorization, the iterative formula of H needs to solve the following optimization targets:

Let [ ] ⁺ denote non-negative elements containing only the original matrix, [ ] ^- denote non-positive elements containing only the original matrix, then the iterative formula for H is:

Wherein the method comprises the steps of

Iterative rule of τ: the calculation of τ requires solving the following optimization objectives:

Let γ=α/β, and by the above calculation, a low-dimensional representation matrix H is obtained.

The specific method for generating the relationship graph G of the sequence points by using the low-dimensional representation matrix H in the step S4 is as follows:

According to a spectrogram theory (SPECTRAL GRAPH), generating a relation graph G epsilon R ^n×n of sequence points by using H, wherein the vertexes of the graph represent the sequence points, the edges of the graph are obtained by calculating the similarity between the sequence points and other points on the H, the k points with the highest similarity form a k-neighbor set N (i) of the points, and the edges G (i, j) are defined as follows:

Obviously, as the i+1th point is very close to the point i, and the similarity between the i+1th point and the point i is higher due to structural constraint, edges can be generated, so that the points in the action sequence are ensured to be divided into the same class as much as possible; if the similarity between i and i+1 is still not high even if the structural constraint is utilized, i is at the end position of the action sequence, and meanwhile, for points far away from i, if the similarity value between i and i is also high, it is indicated that the points belong to repeated action sequences, so that the repeated action sequences can be divided into the same class.

The graph-cutting method in the step S5 is to divide G by using the feature vector corresponding to the second smallest laplace feature value of G.

Compared with the prior art, the invention has the beneficial effects that:

The invention designs a reasonable and easy-to-calculate behavior sequence point similarity measurement criterion, carries out preprocessing such as similarity measurement, data alignment, data dimension reduction and the like on the high-dimensional data according to the similarity measurement criterion, has higher accuracy of dividing actions, has the phenomenon of error division in the action sequence, and can accurately distinguish the actions similar to each other in human behaviors.

Drawings

Fig. 1 is a flow chart diagram of an action behavior segmentation method of multi-neighbor graph constraint matrix decomposition provided by the invention.

Fig. 2 is a schematic diagram of a data action 1 segmentation result according to an embodiment of the present invention.

Fig. 3 is a schematic diagram of a data action division result of number 2 according to an embodiment of the present invention.

Fig. 4 is a schematic diagram of a division result of a data operation No. 3 according to an embodiment of the present invention.

Detailed Description

One embodiment of the present invention will be described in detail below with reference to fig. 1-4, but it should be understood that the scope of the present invention is not limited by the embodiment.

As shown in fig. 1, the motion behavior segmentation method of multi-neighbor graph constraint matrix decomposition provided by the embodiment of the invention provides a multi-graph constraint semi-non-negative matrix decomposition (semi-NMF) method for behavior motion sequence data representation, and clustering is performed by using the graph segmentation method, so that human body behaviors are segmented. The main work includes: 1. designing constraint regular terms of a constraint structure of the multi-neighbor graph; 2. constructing a regular term constraint semi-NMF matrix decomposition model; 3. solving the model to obtain a low-dimensional representation of the behavior sequence; 4. calculating sequence point similarity weight by using the low-dimensional representation, and generating a similarity relation matrix; 5. behavior segmentation is obtained by using a graph segmentation method.

1. Designing constraint regularization term of multi-neighbor graph

First,: calculating neighbor constraint graph R ₁

Let the input X ε R ^d×n, its low dimension is denoted as: h ε R ^p×n, where d represents the original dimension of each data point in the sequence, P represents the dimension of the data points in the low dimensional space, and n represents the total number of data points in the sequence. Because the human body behavior is composed of continuous sequences, the adjacent sequence points have higher similarity. In general, the closer the sequence points are, the higher the similarity is; the farther the sequence points are, the lower the similarity. It is desirable to construct the sequence data representation as closely as possible between neighboring points to ensure that the data representation within the behavioural action is as compact as possible. Let the sum of the number of neighbors before and after the inclusion comparison be q (q is a positive even number) for the current data point i, we want the error between i and neighbor to be as small as possible, namely: As much as 0. Accordingly, the current data point and its several neighbors can be used to construct a structural constraint, for which a special strip matrix R ₁∈R^n×n is defined:

When i is less than or equal to q/2, the number of neighbors before the current sequence point is less than q/2, and q neighbors after the sequence point can be sequentially complemented; similarly, when i is larger than n-q/2, the number of neighbors after the current sequence point is smaller than q/2, and q can be sequentially complemented in the neighbors before the sequence point. For example, when n=10, q=2, where the 1 st sequence point is not preceded by a neighbor, the third sequence point after it can be included in the structural constraint.

Secondly: calculating a similarity graph R ₂

For the input X e R ^d×n, the similarity between each node and all the rest nodes can be calculated, and the first q nodes with the highest similarity can form a similarity graph R ₂∈R^n×n, which is used for representing a plurality of node sets similar to the nodes in the data. The specific calculation of R ₂ refers to the method and code provided in (Cai D,He X,Han J,et al.Graph Regularized Nonnegative Matrix Factorization for Data Representation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2011,33(8):1548-1560.), where the measure of distance may be selected in binary mode or in thermonuclear mode.

Through the above calculations, two different graphs can be utilized to characterize the node's neighbor similarity information.

2. Constructing a regular term constraint semi-NMF matrix decomposition model

Aiming at the characteristic that X epsilon R ^d×n is high-dimensional data, after multi-graph constraint is obtained, the original data is subjected to dimension reduction processing by using semi-NMF. Let Z ε R ^d×p denote the feature space, H denote the matrix of coefficients of the input data in the feature space, the elements of the matrix are non-negative.

Optimization objectives can be constructed as follows:

Wherein, Representing the Frobenius norms, α and β as weight parameters to adjust the weights of the canonical constraint term in the optimization objective, τ is the weight vector of the different constraint graphs, where L ₁＝D₁-R₁,D₁＝diag(∑R₁) and L ₂＝D₂-R₂,D₂＝diag(∑R₂).

3. Solving the model to obtain a low-dimensional representation H of the behavior sequence

The solution model is a non-convex optimization problem, has no global optimal solution, and can obtain a local optimal solution of the optimization model by utilizing an alternating iterative thought. The iteration number can be set to 100, other variables are fixed each time, Z, H and τ are calculated sequentially until after 100 iterations, the obtained H is the final output.

In each iteration, Z, H and τ are calculated as follows:

The iterative formula of Z: since Z may be negative, its closed-form solution may be obtained according to the matrix operation rule.

Z＝XH^T(HH^T)^-1

H, iterative formula: according to the rule of nonnegative matrix factorization, the iterative formula of H needs to solve the following optimization targets

Wherein the method comprises the steps of

Let γ=α/β, then the optimization objective is a typical constrained quadratic programming problem, which can be solved using any existing method, reference Jie Chen,Hua Mao,Yongsheng Sang,Zhang Yi.Subspace clustering using a symmetric low-rank representation[J].Knowledge-Based Systems,2017,127:46-57.

Through the above calculation, a low-dimensional representation matrix H can be obtained.

4. Generating a relationship graph G using H

A relationship graph G ε R ^n×n of sequence points is generated by H according to spectrogram theory (SPECTRAL GRAPH) literature Belkin M.Laplacian eigenmaps and spectral techniques for embedding and clustering[A].Advances in neural information processing systems[C],Cambridge,MA:2001.585-591,. The vertex of the graph represents a sequence point, the edge of the graph is obtained by calculating the similarity between the sequence point and other points on H, the top k points with the highest similarity form a k adjacent set N (i) of the point, and the edge G (i, j) is defined as:

Obviously, as the i+1th point is very close to the point i, and the similarity between the i+1th point and the point i is generally higher due to structural constraint, edges can be generated, so that the points in the action sequence are ensured to be divided into the same class as much as possible; if the similarity of i and i+1 is still not high, even with structural constraints, it is stated that i may be at the end of the action sequence. Meanwhile, for points far away from i, if the similarity value between the points is high, the points possibly belong to repeated action sequences, so that the repeated action sequences can be divided into the same class.

5. Classifying G by means of graph Cut (Normalized Cut: ncut)

G was classified by the method of graph Cut (Normalized Cut: ncut). Ncut is a commonly used graph segmentation method that segments G using the feature vector corresponding to the second smallest of its Laplace (laplacian) feature values. Detailed description of the preferred embodiments (Shi J,Malik J.Normalized cuts and image segmentation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2000,22(8):888-905.).

Specific cases: we use numbers 1, 2, 3 of CMU Mocap, 86 (experiments were performed with reference Chu W S,et al.,Video Co-summarization:Video Summarization by Visual Co-occurrence[A].2015IEEE Conference on Computer Vision and Pattern Recognition[C].Boston,MA,USA,IEEE Computer Society:2015.3584-3592). Mocap dataset in which the human motion in each frame is composed of 42-dimensional vectors. As shown in FIGS. 2, 3, and 4, the first row in dataset is the true division point and the second row is the division point for motion decomposition of the present invention. These two sequences of numbers need to be divided more than the number of classes, as shown in FIG. 4, into 9 parts, with 7 motion behaviors (classes), i.e., repeated motion behaviors in the sequence, and these two sequences contain negative data.

Parameter setting: the number of iterations is chosen to be 100, the dimension p of the H matrix is chosen to be greater than the number of classes, we choose p=15, q to be a positive even number, such as 2,4,6,8, 10, etc. k is a positive integer, and is generally greater than 3. Alpha and beta are generally chosen to be between 0.001 and 1 as weight parameters. The segmentation results are shown in fig. 2,3 and 4, respectively.

The invention designs a more reasonable and easy-to-calculate behavior sequence point similarity measurement criterion, carries out preprocessing such as similarity measurement, data alignment, data dimension reduction and the like on the high-dimensional data according to the similarity measurement criterion, has higher accuracy of dividing actions (the difference between the starting point and the ending point of the division is not large), has the phenomenon of error division in the action sequence, and can accurately distinguish the actions similar to the human body behavior.

The foregoing disclosure is merely illustrative of some embodiments of the invention, but the embodiments are not limited thereto, and any variations that may be contemplated by one skilled in the art should fall within the scope of the invention.

Claims

1. The action behavior segmentation method for multi-neighbor graph constraint matrix decomposition is characterized by comprising the following steps of:

S5, segmenting a relation graph G of sequence points by using a graph segmentation method to obtain segmentation of action behaviors;

s11, calculating neighbor constraint graph R ₁

S12, calculating a similarity graph R ₂

2. The method for dividing action behaviors of multi-neighbor graph constraint matrix decomposition according to claim 1, wherein the specific method for constructing a regular term constraint non-semi-negative matrix semi-NMF decomposition model in step S2 is as follows:

Aiming at the characteristic that X epsilon R ^d×n is high-dimensional data, after multi-graph constraint is obtained, performing dimension reduction on the original data by using semi-NMF; let Z epsilon R ^d×p represent the feature space, H epsilon R ^p×n represent the representation coefficient matrix of the input data in the feature space, the matrix element is non-negative;

the build optimization objective is as follows:

3. The method for partitioning action behavior by multi-neighbor graph constraint matrix decomposition according to claim 2, wherein the step S3 of solving the non-half-negative matrix semi-NMF decomposition model obtains a low-dimensional representation of the action sequence as H, and the specific method is as follows:

in each iteration, Z, H and τ are calculated as follows:

Z＝XH^T(HH^T)^-1

Wherein the method comprises the steps of

4. The method for dividing action behavior of constraint matrix decomposition of multi-neighbor graph according to claim 1, wherein the specific method for generating the relationship graph G of sequence points by using the low-dimensional representation matrix H in step S4 is as follows:

5. The method for motion behavior segmentation of multi-neighbor graph constraint matrix decomposition according to claim 1, wherein said graph cut method in step S5 is to segment G by using a feature vector corresponding to a second smallest laplacian feature value of G.