[go: up one dir, main page]

CN109063089B - A method and device for subgraph matching based on community structure - Google Patents

A method and device for subgraph matching based on community structure Download PDF

Info

Publication number
CN109063089B
CN109063089B CN201810836811.3A CN201810836811A CN109063089B CN 109063089 B CN109063089 B CN 109063089B CN 201810836811 A CN201810836811 A CN 201810836811A CN 109063089 B CN109063089 B CN 109063089B
Authority
CN
China
Prior art keywords
community
matching
target pattern
subgraph
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810836811.3A
Other languages
Chinese (zh)
Other versions
CN109063089A (en
Inventor
王朝坤
楼昀恺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201810836811.3A priority Critical patent/CN109063089B/en
Publication of CN109063089A publication Critical patent/CN109063089A/en
Application granted granted Critical
Publication of CN109063089B publication Critical patent/CN109063089B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明公开一种基于社区结构的子图匹配方法及装置,方法包括:导入包含目标模式的文件,分析目标模式结构,找出目标模式中互相匹配等价的子图;根据网络图数据生成以社区作为结点的超图,计算每个社区中各结点与本社区的邻接社区间的边数;在网络图各社区内部利用预设子图匹配算法找出各社区的与目标模式结构匹配的子图,获得第一匹配结果;在网络图中,基于网络图每个社区中各结点与本社区的各邻接社区间的边数和找出的目标模式中互相匹配等价的子图找出跨社区的与目标模式匹配的子图,获得第二匹配结果;将第一、二匹配结果汇总获得最终子图匹配结果。可提高子图匹配速度,减少时间开销。

Figure 201810836811

The invention discloses a subgraph matching method and device based on a community structure. The method includes: importing a file containing a target pattern, analyzing the target pattern structure, and finding out subgraphs matching and equivalent to each other in the target pattern; The community is used as a hypergraph of nodes, and the number of edges between each node in each community and the adjacent communities of this community is calculated; within each community of the network graph, the preset subgraph matching algorithm is used to find out the structure matching of each community with the target pattern. In the network graph, based on the number of edges between each node in each community in the network graph and each adjacent community of the community and the subgraphs that match each other and are equivalent to each other in the target pattern found Find the subgraphs that match the target pattern across the community, and obtain the second matching result; summarize the first and second matching results to obtain the final subgraph matching result. It can improve the subgraph matching speed and reduce the time overhead.

Figure 201810836811

Description

Subgraph matching method and device based on community structure
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a subgraph matching method and device based on a community structure.
Background
With the wide application of graph data in various fields, the demand for a correlation algorithm for graph data processing is increasing. The subgraph matching algorithm is taken as a classic algorithm on the graph, and how to carry out efficient subgraph matching becomes an important problem. The subgraph matching refers to a process of specifying a target mode by a user and finding all subgraphs isomorphic with the target mode in the graph through a subgraph matching algorithm. The application of subgraph matching includes finding a single node satisfying a specific condition, finding several nodes with a specific relationship, and the like, such as finding a structure composed of all three people known to each other in a social network.
The Ullmann algorithm and the vf2 algorithm are classical subgraph matching methods. Neo4j is a commonly used graph database system that has no specific implementation for sub-graph matching, but achieves the effect of sub-graph matching by performing path matching.
The speed of the sub-graph matching method used in the Ullmann algorithm, the vf2 algorithm and the Neo4j is low in actual calculation, and when the data size is large, the time overhead of the algorithm is large, so that the requirements of actual application cannot be met. Moreover, the algorithms do not fully utilize the community structure of the graph, and have a large improvement space.
Therefore, how to perform sub-graph matching to increase the computation speed of sub-graph matching becomes a technical problem to be solved at present.
Disclosure of Invention
Because the existing method has the problems, the embodiment of the invention provides a subgraph matching method and device based on a community structure, which are applied to a social network.
In a first aspect, an embodiment of the present invention provides a subgraph matching method based on a community structure, which is applied to a social network, and includes:
importing a file containing a target mode representing three mutually recognized people, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode;
generating a hypergraph with communities as nodes according to a social network, and calculating the number of edges between each node in each community and the node in the adjacent community of the community;
in each community in the social network, respectively finding out subgraphs formed by all three people known to each other in each community by using a preset subgraph matching algorithm to obtain a subgraph matching result in the community;
performing node repeatable subgraph matching on a target pattern on the hypergraph among all communities in the social network by utilizing a preset subgraph matching algorithm, storing an obtained distribution scheme in a preset table, processing each distribution scheme s in the preset table based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraph which is equivalent to each other in the target pattern when the number of elements in the preset table is larger than a preset threshold value, judging whether the number of nodes of the target pattern distributed to each community in the social network in s is larger than the number of nodes in the community, judging whether s can be converted into a distribution scheme with a smaller index value through matching equivalence relation, and calculating the common one-hop number of the nodes in the target pattern in other communities except the community in pairs for the nodes distributed to the same community in s, finding out a subgraph formed by all three mutually known people across communities to obtain a subgraph matching result among the communities;
and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the social network, namely obtaining all three people known mutually in the social network.
Optionally, the importing a file including a target pattern, analyzing a structure of the target pattern based on the imported file, and finding out mutually matched equivalent subgraphs in the target pattern includes:
importing a file containing a target mode, and analyzing nodes, edges and related attributes in the target mode;
and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
Optionally, inside each community in the social network, respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm, and obtaining subgraph matching results inside the community, including:
and in parallel, in each community in the social network, respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm, and obtaining a subgraph matching result in the community.
Optionally, the method includes performing node-repeatable subgraph matching on a target pattern on the hypergraph by using a preset subgraph matching algorithm among all communities in the social network, storing the obtained distribution schemes in a preset table, processing each distribution scheme s in the preset table based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraph equivalent to each other in the target pattern when the number of elements in the preset table is greater than a preset threshold, judging whether the number of nodes of the target pattern distributed to each community in the social network in s is greater than the number of nodes in the community, judging whether s can be converted into a distribution scheme with a smaller index value through matching equivalence, and calculating the number of common one-hop neighbors of the nodes in the target pattern except the community in pairs among the nodes distributed to the same community in s in the target pattern except the community Pruning is carried out, a subgraph formed by three mutually known people across communities is found, and subgraph matching results among communities are obtained, and the method comprises the following steps:
performing node repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm, and storing the obtained distribution scheme in a preset table;
when the number of elements in the preset table is greater than a preset threshold, processing each allocation scheme s in the preset table, including:
judging whether the number of the nodes of the target mode distributed to each community in the social network in the step s is larger than that of the nodes in the community or not;
if the number of the nodes of the target mode distributed to each community in the social network in s is less than or equal to the number of the nodes in the community, taking the nodes of the target mode distributed to each community in s as the nodes to be matched according to the structure of the target mode, calculating the number of edges between the nodes to be matched and the nodes distributed to other communities except the community in the target mode, pruning the nodes in the community and the edges of other communities except the community in the social network to reduce the number of candidate matching nodes corresponding to each node in the target mode, and if the number of the candidate matching nodes corresponding to a certain node in the target mode after pruning is reduced to 0, finishing the processing of s;
if the number of candidate matching nodes corresponding to all nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found target pattern matched with equivalent subgraphs, and if not, recording all distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file;
for the nodes in the target mode which is distributed to the same community in the s, calculating the number of the common one-hop neighbors of the nodes in the target mode in other communities except the community two by two;
taking a derived subgraph of a target pattern formed by nodes in the target pattern distributed to the same community in the step s as a target pattern subgraph, carrying out subgraph matching on the target pattern subgraph and a corresponding community in the social network by using a preset subgraph matching algorithm, carrying out matching attempt on each node in the target pattern subgraph by using the corresponding candidate matching node in the matching process, and pruning by using the calculated public one-hop neighbor number to obtain a matching result of carrying out subgraph matching on each target pattern subgraph in the step s and the corresponding community in the social network;
carrying out further inspection on matching results obtained by carrying out subgraph matching on each target pattern subgraph in s and the corresponding community in the social network, and for the matching results of any two target pattern subgraphs in s in two communities, if two nodes of the two target pattern subgraphs are connected by edges, the nodes matched with the two nodes are also connected by edges; if a matching result set containing a matching result of each target pattern subgraph in s passes the check in pairs, combining the matching results in the matching result set to generate a matching result of the target pattern and the social network;
and calculating a final matching result for all distribution schemes recorded in the preset file by matching equivalent corresponding relations according to the generated matching results of all target patterns and the social network.
Optionally, after determining whether the number of nodes of the target pattern assigned to each community in the social network is greater than the number of nodes in the community in s, the method further includes:
and if the number of the nodes in the target mode distributed to a certain community in the social network in the s is judged and known to be larger than the number of the nodes in the community, finishing the processing of the s.
Optionally, after determining whether the current s can be converted into an allocation scheme with a smaller index value through a matching equivalence relation according to mutually matching equivalent subgraphs in the found target pattern, the method further includes:
and judging to acquire that the current s can be converted into a distribution scheme with a smaller index value through matching the equivalence relation, and finishing the processing of the s.
Optionally, when the number of elements in the preset table is greater than a preset threshold, each matching s in the preset table is processed in parallel.
In a second aspect, an embodiment of the present invention further provides a subgraph matching device applied to a social network and based on a community structure, including:
the import module is used for importing a file containing a target mode representing three mutually recognized people and analyzing nodes, edges and related attributes in the target mode;
the generating module is used for generating a hypergraph with communities as nodes according to the social network and calculating the number of edges between each node in each community and each adjacent community interval of the community;
the analysis module is used for analyzing the structure of the target pattern based on the analyzed nodes, edges and related attributes in the target pattern and finding out mutually matched equivalent subgraphs in the target pattern;
the first acquisition module is used for respectively finding out subgraphs formed by all three people known mutually in the subgraphs, matched with the target mode, of each community by utilizing a preset subgraph matching algorithm in each community in the social network to obtain a subgraph matching result in the community;
a second obtaining module, configured to perform node-repeatable subgraph matching on the hypergraph on the target pattern by using a preset subgraph matching algorithm among all communities in the social network, store the obtained distribution schemes in a preset table, process, for each distribution scheme s in the preset table, based on the calculated number of edges between each node in each community and each neighboring community of the community and the found subgraph equivalent to each other in the target pattern, when the number of elements in the preset table is greater than a preset threshold, determine whether the number of nodes of the target pattern allocated to each community in the social network is greater than the number of nodes in the community, determine whether s can be converted into a distribution scheme with a smaller index value through matching equivalence relations, and for nodes in the target pattern allocated to the same community in s, in the target mode, the number of public one-hop neighbors of the target mode in other communities except the community is calculated pairwise to prune, a subgraph formed by all three people mutually known across communities is found out, and a subgraph matching result between communities is obtained;
and the summarizing module is used for summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and the subgraph matching result of the social network, namely obtaining all three people known mutually in the social network.
In a third aspect, an embodiment of the present invention further provides an electronic device, including: a processor, a memory, a bus, and a computer program stored on the memory and executable on the processor;
the processor and the memory complete mutual communication through the bus;
the processor, when executing the computer program, implements the method described above.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above method.
According to the technical scheme, the subgraph matching method and device based on the community structure, provided by the embodiment of the invention, find out the subgraphs, which are matched with each other and equivalent, in the target pattern by a method of analyzing the structure of the target pattern based on the imported file; generating a hypergraph with communities as nodes according to a network graph, calculating the number of edges between each node in each community and each adjacent community of the community, respectively finding out subgraphs, matched with a target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining a subgraph matching result in each community; and between all communities in the network graph, finding out subgraphs which cross the communities and are matched with the target pattern based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs which are matched with each other and are equivalent in the target pattern, obtaining subgraph matching results between the communities, summarizing the subgraph matching results in the communities and the subgraph matching results between the communities, and obtaining the subgraph matching results between the target pattern and the network graph, so that the calculation speed of subgraph matching can be effectively improved, and the time overhead can be reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained based on these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a subgraph matching method based on a community structure according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a subgraph matching apparatus based on a community structure according to an embodiment of the present invention;
fig. 3 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention;
FIG. 4a is a diagram illustrating an example of a directed pattern graph according to an embodiment of the present invention;
FIG. 4b is an exemplary diagram of another directed pattern graph according to an embodiment of the present invention;
fig. 4c is an exemplary diagram of a undirected network diagram according to an embodiment of the invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Fig. 1 shows a flowchart of a subgraph matching method based on a community structure according to an embodiment of the present invention, and as shown in fig. 1, the subgraph matching method based on a community structure according to the embodiment includes:
and S1, importing a file containing the target pattern, analyzing the structure of the target pattern based on the imported file, and finding out the sub-images which are matched and equivalent in the target pattern.
And S2, generating a hypergraph with communities as nodes according to the data of the network graph, and calculating the number of edges between each node in each community and each adjacent community of the community.
For convenience of description and understanding, in the present embodiment, the total number of edges between one node and all nodes in the corresponding community in the network graph may be referred to as the number of edges between the node and the community.
It can be understood that, in this embodiment, each community may be regarded as a super node according to the data of the network graph, and if there is an edge connection between nodes in two communities, the super nodes corresponding to the two communities are also connected, and the two communities are referred to as being connected; and if two nodes in a community are connected by edges, adding a self-loop for the super-junction point corresponding to the community. The graph formed by these super junction points is called a hypergraph.
It is understood that in this step, which communities are connected to each community may be calculated and stored.
It is understood that the method of the present embodiment is based on a community-based graph database system, and it is assumed that the following information has been obtained: each node belongs to which community, each community contains which nodes, and which nodes are connected to an edge for a given node.
And S3, respectively finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities.
In a specific application, the preset subgraph matching algorithm of the embodiment may include: the basic subgraph matching algorithms such as the vf2 algorithm and the like are not limited in the embodiment, and other basic subgraph matching algorithms can be selected as the preset subgraph matching algorithm according to actual conditions.
S4, in all communities in the network graph, based on the calculated number of the edges of each node in each community and each adjacent community of the community and the found subgraph which is equivalent to each other in the target pattern, finding out the subgraph which crosses the communities and is matched with the target pattern, and obtaining the subgraph matching result of the community.
S5, summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
The subgraph matching method based on the community structure comprises the steps of dividing subgraph matching into two stages, respectively calculating through two processes of subgraph matching in communities and subgraph matching between communities, finding out subgraphs which are matched with each other and are equivalent in a target mode through a method of analyzing the structure of the target mode based on an imported file, generating a hypergraph with communities as nodes according to a network graph, and calculating the number of edges between each node in each community and each adjacent community of the community; respectively finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities; and between all communities in the network graph, finding out subgraphs which cross the communities and are matched with the target pattern based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs which are matched with each other and are equivalent in the target pattern, obtaining subgraph matching results between the communities, summarizing the subgraph matching results in the communities and the subgraph matching results between the communities, and obtaining the subgraph matching results between the target pattern and the network graph, so that the calculation speed of subgraph matching can be effectively improved, and the time overhead can be reduced.
It should be noted that, in this embodiment, the matching equivalence definitions and related concepts are defined as follows:
definition 1 (symmetry point): given pattern diagram P ═ Vp,Ep),vi,vj∈VpWherein V ispSet of vertices in the pattern P, EpIs the set of edges in the pattern P, if viAnd vjThe topological structure of (1) is symmetrical, namely viAnd vjAre symmetrical points with each other. Node viAnd vjThe definition of topological symmetry is: if there is a non-trivial autoregressive on the graph, the corresponding bijective g of the autoregressive satisfies g (v)i)=vj,g(vj)=viThen called node viAnd vjThe topological structure is symmetrical.
For example, referring to FIG. 4a, the schema diagram in FIG. 4a exists for a self-isomorphic mapping g, satisfying g (v)1)=v1,g(v2)=v3,g(v3)=v2,g(v4)=v5,g(v5)=v4,g(v6)=v6. Node v is thus2And v3Symmetrical topological structure, node v4And v5The topology is symmetrical, so v2And v3,v4And v5Are mutually symmetrical points respectively.
Definition 2 (symmetry diagram): given pattern diagram P ═ Vp,Ep) Let Psub1=(Vp1,Ep1) And Psub2=(Vp2,Ep2) For two subgraphs existing in P, if bijective f: V representing symmetric relation exists for the two subgraphsp1→Vp2(abbreviated as symmetric bijective), wherein for each mapping f (u) v, the node u and the node v are symmetric points, and then P is calledsub1And Psub2Are symmetrical subgraphs of each other. Let the bijection f be Psub1In respect of Psub2Symmetric bijections of (2).
Obviously, if there is Psub1In respect of Psub2The symmetric bijective f of (a) defines a bijective g satisfying for each mapping f (u) v for f, respectively g (v) u, then g is Psub2In respect of Psub1Symmetric bijections of (2).
Define 3 (extended symmetry mapping): given pattern diagram P ═ Vp,Ep) And there are two subgraphs P in Psub1=(Vp1,Ep1) And Psub2=(Vp2,Ep2) They are symmetric subgraphs of each other. Let Psub1In respect of Psub2The symmetric bijection of (b) is f, the mapping h is called Vp→VpIs Psub1In respect of Psub2If h satisfies the extended symmetric mapping of (1)
Figure GDA0002971082330000101
For example, Psub1={v2And Psub2={v3Are two subgraphs of the pattern diagram shown in FIG. 4a, and v2And v3Points of mutual symmetry, present Psub1In respect of Psub2Of (c) symmetrical bijective f: { v2}→{v3Satisfy f (v)2)=v3Thus P issub1And Psub2Are symmetrical subgraphs of each other. Psub1In respect of Psub2Satisfies h (v) by the extended symmetric mapping1)=v1,h(v2)=v3,h(v3)=v2,h(v4)=v4,h(v5)=v5,h(v6)=v6
Definition 4 (congruent): given undirected pattern diagram P ═ Vp,Ep),v1,v2∈VpNeighbours (v) represents a set of neighbor nodes for node v. Balance v1And v2Are congruent to each other, if: (1) | Neighbours (v)1)|=|Neighbours(v2)|;(2)Neighbours(v1)-{v1,v2}=Neighbours(v2)-{v1,v2}. For directed pattern graphs, OutNeighbours (v) represents the set of out-neighbor nodes for node v, InNeighbours (v) represents the set of in-neighbor nodes for node v, denoted v1And v2Are congruent to each other, if: (1) | OutNeighbours (v)1)|=|OutNeighbours(v2) I and OutNeighbours (v)1)-{v1,v2}=OutNeighbours(v2)-{v1,v2};(2)|InNeighbours(v1)|=|InNeighbours(v2) I and InNeighbours (v)1)-{v1,v2}=InNeighbours(v2)-{v1,v2};(3)v1∈OutNeighbours(v2) And v is2∈OutNeighbours(v1) Or v or1And v2There is no edge in between.
For example, referring to FIG. 4b, node v in FIG. 4b1,v2,v3Two by two are congruent points.
Define 5 (neighborhood range): given undirected pattern diagram P ═ Vp,Ep) For node V ∈ VpAnd its neighborhood range n (v) { t | t ═ vort ∈ neighbours (v) }, where neighbours (v) represents a neighbor node of node v. For the directed pattern graph, the neighborhood range No is respectively definedutAnd an in-neighborhood region Nin:Nout(v)={t|t=vort∈OutNeighbours(v)},Nin(v)={t|t=vort∈InNeighbours(v)}。
For example, there is N in FIG. 4aout(v1)={v1,v2,v3},Nin(v1)={v1},Nout(v2)={v2,v4},Nin(v2)={v1,v2},Nout(v3)={v3,v5},Nin(v3)={v1,v3}。
Define 6 (neighborhood range rule): given undirected pattern diagram P ═ Vp,Ep),Psub1And Psub2Are two subgraphs of P, and they are symmetric subgraphs to each other, let P besub1In respect of Psub2Symmetric bijection of (d) is f, Psub1In respect of Psub2The extended symmetric bijection of (c) is h. For Psub1And Psub2The neighborhood range rule refers to the following two rules: rule 1, Psub1In the neighborhood of each node P, N (P) and P are in Psub2A neighborhood n (q) of the symmetry point q ═ f (p) satisfies h (n (p)) n (q); rule 2, Psub1With each node P in Psub2The symmetric points q ═ f (p) in (a) are congruent points with each other. If Psub1And Psub2If at least one of the two rules is satisfied, P is calledsub1And Psub2Satisfying the neighborhood range rule. For a directed pattern graph P ═ Vp,Ep) About P being symmetric subgraphs of each othersub1And Psub2The neighborhood range rule of (1) is: rule 1, Psub1Of each node p in the treeout(p) neighborhood of entries Nin(P) and P are in Psub2The symmetric point q in (f) (p) is the out-neighborhood region Nout(q) neighborhood of entries Nin(q) satisfy the conditions h (No) respectivelyut(p))=Nout(q) and h (N)in(p))=Nin(q); rule 2, Psub1With each node P in Psub2The symmetric points q ═ f (p) in (a) are congruent points with each other. Wherein f is Psub1In respect of Psub2Is symmetric bijective, h is Psub1In respect of Psub2Extended symmetric bijective of. If Psub1And Psub2If at least one of the two rules is satisfied, P is calledsub1And Psub2Satisfying the neighborhood range rule.
Definition 7 (matching equivalence): given an undirected or directed pattern graph P ═ Vp,Ep),Psub1And Psub2Is two subgraphs of P, and they are mutually symmetrical subgraphs, note Psub1In respect of Psub2Symmetric bijection of (d) is f, Psub1In respect of Psub2Is symmetrically mapped as h. If Psub1And Psub2Satisfies the neighborhood range rule, and Psub1And Psub2Isomorphism, then called Psub1And Psub2And matching is equivalent.
Further, on the basis of the above embodiment, the step S1 may include:
importing a file containing a target mode, and analyzing nodes, edges and related attributes in the target mode;
and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
Therefore, the structure of the target pattern can be analyzed, and the equivalent subgraphs matched with each other in the target pattern can be found.
Further, on the basis of the above embodiment, the step S3 may include:
and parallelly finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities.
It can be understood that the computing speed can be accelerated by realizing the parallelization of the subgraph matching process in the community.
Further, on the basis of the above embodiment, the step S4 may include:
performing repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm (namely two nodes in the target pattern can be matched with the same node in the hypergraph), and storing an obtained distribution scheme in a preset table (list) (the distribution scheme refers to a distribution mode for distributing the nodes in the target pattern to various communities, wherein for repeatable subgraph matching of the nodes, if a certain node in the hypergraph is matched with a certain node in the target pattern, the target pattern node is distributed to a community corresponding to the node in the hypergraph);
when the number of elements in the preset table is greater than a preset threshold, the processing may specifically include steps P1-P7, not shown in the figure, for each allocation scheme s in the preset table:
and P1, judging whether the number of the nodes of the target mode distributed to each community in the network graph in s is larger than that of the nodes in the community.
In step P1, if it is determined that the number of nodes in s assigned to the target pattern of a community in the hypergraph is greater than the number of nodes in the community, the process for s is terminated.
And P2, if the number of the nodes of the target pattern allocated to each community in the network graph in s is less than or equal to the number of the nodes in the community, taking the nodes of the target pattern allocated to each community in s as the nodes to be matched according to the structure of the target pattern, calculating the number of edges between the nodes to be matched in the target pattern and the nodes allocated to other communities except the community, pruning the nodes in the network graph and the edges of other communities except the community in the network graph in combination to reduce the number of the candidate matching nodes corresponding to each node in the target pattern, and if the number of the candidate matching nodes corresponding to a certain node in the target pattern after pruning is reduced to 0, ending the processing on s.
Wherein, the candidate matching node of one node in the target pattern refers to the network graph node which is matched with the candidate matching node.
P3, if the number of candidate matching nodes corresponding to all the nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found mutually matched equivalent subgraphs in the target pattern, and if not, recording all the distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file (newallmaps).
In step P3, if it is determined that s can be converted into an allocation scheme with a smaller index value through matching equivalence relation, the process for s is ended.
P4, for the nodes in the target pattern which are distributed to the same community in s, calculating the number of common one-hop neighbors of the nodes in the target pattern in other communities outside the community in pairs.
And P5, taking the derived subgraph of the target pattern corresponding to the nodes in the target pattern distributed to the same community in s as a target pattern subgraph, performing subgraph matching on the target pattern subgraph and the corresponding community in the network graph by using a preset subgraph matching algorithm, performing matching attempt on each node in the target pattern subgraph by using the corresponding candidate matching node in the matching process, and performing pruning by using the calculated public one-hop neighbor number to obtain a matching result of performing subgraph matching on each target pattern subgraph in s and the corresponding community in the network graph.
P6, carrying out further inspection on matching results obtained by carrying out subgraph matching on each target mode subgraph in s and the corresponding community in the network graph, and for the matching results of any two target mode subgraphs in s in two communities, if two nodes of the two target mode subgraphs are connected by edges, the nodes matched with the two nodes are also connected by edges; and if the matching result set comprises a matching result of each target pattern subgraph in the s and the matching results in the matching result set pass the inspection pairwise, combining the matching results in the matching result set to generate a matching result of the target pattern and the network graph.
And P7, calculating a final matching result for all distribution schemes recorded in the preset files (newallmaps) by matching equivalent corresponding relations according to the generated matching results of all target patterns and the network graph.
It can be understood that inter-community subgraph matching in the embodiment fully utilizes the structure of the target pattern and the structure information of the community, and calculates the matching equivalent subgraph in the pattern graph, thereby reducing the number of matching calculation; the number of the nodes in the target mode in the matching process of the nodes is reduced by calculating the number of the edges of the nodes and each community in the network graph and the number of the edges of the nodes and the nodes distributed to other communities different from the nodes in the target mode through pairwise calculation of the number of the common one-hop neighbors of the nodes in the target mode in each other community except the target mode in the target mode, so that the matching calculation speed can be further increased.
Further, when the number of elements in the preset table is greater than the preset threshold, the present embodiment may process each allocation scheme s in the preset table in parallel.
Therefore, the computation speed of the subgraph matching among communities can be increased, and the computation speed of the whole subgraph matching method is further increased.
The subgraph matching method based on the community structure fully utilizes the structure of the target mode and the information of the community structure, has a good pruning effect, can effectively improve the computation speed of subgraph matching, and reduces time overhead.
Fig. 2 shows a schematic structural diagram of a subgraph matching device based on a community structure according to an embodiment of the present invention, and as shown in fig. 2, the subgraph matching device based on a community structure according to the embodiment includes: the system comprises an importing module 21, a generating module 22, a first matching module 23, a second matching module 24 and a summarizing module 25; wherein:
the import module 21 is configured to import a file including a target pattern, analyze a structure of the target pattern based on the imported file, and find out mutually matched equivalent subgraphs in the target pattern;
the generating module 22 is configured to generate a hypergraph using communities as nodes according to the network graph, and calculate the number of edges between each node in each community and each adjacent community of the community;
the first matching module 23 is configured to find sub-graphs, which are matched with the target pattern, of each community in the network graph by using a preset sub-graph matching algorithm, and obtain sub-graph matching results in the communities;
the second matching module 24 is configured to find out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the subgraphs, which are matched with each other and are equivalent in the found target pattern, and obtain a subgraph matching result between communities;
and the summarizing module 25 is configured to summarize the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target pattern and a subgraph matching result of the network graph.
Specifically, the importing module 21 imports a file containing a target pattern, analyzes the structure of the target pattern based on the imported file, and finds out mutually matched equivalent subgraphs in the target pattern; the generation module 22 generates a hypergraph with communities as nodes according to the network graph, and calculates the number of edges between each node in each community and each adjacent community of the community; the first matching module 23 finds sub-graphs, which are matched with the target pattern, of each community in the network graph by using a preset sub-graph matching algorithm, and obtains sub-graph matching results in the communities; the second matching module 24 finds out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs, which are matched with each other and are equivalent in the target pattern, and obtains a subgraph matching result between communities; the summarizing module 25 summarizes the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
For convenience of description and understanding, in the present embodiment, the total number of edges between one node and all nodes in the corresponding community in the network graph may be referred to as the number of edges between the node and the community.
It can be understood that, in this embodiment, each community may be regarded as a supernode according to the data of the network graph, and if there is an edge between nodes of two communities connected, the supernodes corresponding to the two communities are also connected; and if two nodes in a community are connected by edges, adding a self-loop for the super junction point corresponding to the community. The graph formed by these community nodes is called hypergraph.
It will be appreciated that in the generation module 22, which communities are connected to each community may be calculated and stored.
It is understood that the apparatus of the present embodiment is based on a community-based graph database system, and it is assumed that the following information is obtained: each node belongs to which community, each community contains which nodes, and which nodes are connected to an edge for a given node.
In a specific application, the preset subgraph matching algorithm of the embodiment may include: the basic subgraph matching algorithms such as the vf2 algorithm and the like are not limited in the embodiment, and other basic subgraph matching algorithms can be selected as the preset subgraph matching algorithm according to actual conditions.
It should be noted that, in the present embodiment, reference may be made to the detailed description of the foregoing method embodiments for matching equivalent definitions and related concept definitions.
According to the subgraph matching device based on the community structure, subgraph matching is divided into two stages, and calculation is performed through two processes of subgraph matching in communities and subgraph matching between communities respectively, so that the calculation speed of subgraph matching can be effectively improved, and time overhead is reduced.
Further, on the basis of the above embodiment, the importing module 21 may parse nodes and edges and related attributes in the target schema by importing a file including the target schema; and analyzing the structure of the target pattern based on the analyzed nodes and edges and related attributes in the target pattern, and finding out mutually matched equivalent subgraphs in the target pattern.
Therefore, the structure of the analysis target pattern can be found, and the subgraphs which are matched with each other and equivalent in the target pattern can be found.
Further, on the basis of the above embodiment, the first matching module 23 may be specifically used for
And parallelly finding out subgraphs, matched with the target mode, of each community in the network graph by using a preset subgraph matching algorithm, and obtaining subgraph matching results in the communities.
It can be understood that the computing speed can be accelerated by realizing the parallelization of the subgraph matching process in the community.
Further, on the basis of the above embodiment, the second matching module 24 can be specifically used for
Performing repeatable subgraph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm (namely two nodes in the target pattern can be matched with the same node in the hypergraph), and storing the obtained distribution scheme in a preset table (list) (the distribution scheme refers to a distribution mode for distributing the nodes in the target pattern to various communities);
when the number of elements in the preset table is greater than a preset threshold, processing each allocation scheme s in the preset table may specifically include:
judging whether the number of the nodes of the target mode distributed to each community in the network graph in the step s is larger than that of the nodes in the community or not;
if the number of the nodes of the target pattern distributed to each community in the network graph in s is less than or equal to the number of the nodes in the community, taking the nodes of the target pattern distributed to each community in s as the nodes to be matched according to the structure of the target pattern, calculating the number of edges between the nodes to be matched and the nodes distributed to other communities except the community in the target pattern, pruning the nodes in the community and the edges of other communities except the community in the network graph to reduce the number of candidate matching nodes corresponding to the nodes in the target pattern, and if the number of the candidate matching nodes corresponding to a certain node in the target pattern after pruning is 0, finishing the processing of s;
if the number of candidate matching nodes corresponding to all nodes in the pruned target pattern is not 0, judging whether the current s can be converted into a distribution scheme with a smaller index value through matching equivalence relation according to the found target pattern matched with equivalent subgraphs, and if not, recording all distribution schemes which can be obtained by converting the current s through matching equivalence relation in a preset file (newallmaps);
for the nodes in the target mode which is distributed to the same community in the s, calculating the number of the common one-hop neighbors of the nodes in the target mode in each other community except the community pairwise;
taking a derived subgraph of a target pattern corresponding to a node in the target pattern distributed to the same community in s as a target pattern subgraph, carrying out subgraph matching on the target pattern subgraph and a corresponding community in a network graph by using a preset subgraph matching algorithm, carrying out matching attempt on each node in the target pattern subgraph by using a corresponding candidate matching node in the matching process, and pruning by using the calculated public one-hop neighbor number to obtain a matching result of carrying out subgraph matching on each target pattern subgraph in s and the corresponding community in the network graph;
further checking the matching result of each target mode subgraph in s and the corresponding community in the network graph, and if the matching result of any two target mode subgraphs in s in two communities has an edge between two nodes of the two target mode subgraphs, the edge between the nodes matched with the two nodes is also connected; if a matching result set containing a matching result of each target pattern subgraph in s passes the check in pairs, combining the matching results in the matching result set to generate a matching result of the target pattern and the network graph;
and according to the generated matching results of all target patterns and the network diagram, calculating the final matching result for all distribution schemes recorded in the preset files (newallmaps) through the matching equivalence relation.
It can be understood that, after determining whether the number of nodes of the target pattern allocated to each community in the network graph in s is greater than the number of nodes in the community, if it is determined that the number of nodes of the target pattern allocated to a certain community in s is greater than the number of nodes in the community, the second matching module 24 ends the processing on s.
It can be understood that, after the second matching module 24 determines whether the current s can be converted into the allocation scheme with the smaller index value through the matching equivalence relation according to the mutually matching equivalent subgraphs in the found target pattern, if it is determined that the current s can be converted into the allocation scheme with the smaller index value through the matching equivalence relation, the processing on s is ended.
It can be understood that inter-community subgraph matching in the embodiment fully utilizes the structure of the target pattern and the structure information of the community, and calculates the matching equivalent subgraph in the pattern graph, thereby reducing the number of matching calculation; the number of the nodes in the target pattern which are distributed to the same community in the s is calculated pairwise to carry out pruning by calculating the number of the common one-hop neighbors of the nodes in the target pattern in each other community except the community, and the number of the edges between the nodes in the network graph and each community and the number of the edges between the nodes in the target pattern and the nodes distributed to other communities different from the target pattern are calculated to reduce the number of candidate matching nodes matched with the nodes in the target pattern in the network graph, so that the matching calculation speed can be further increased.
Further, the second matching module 24 may perform processing on each allocation scheme s in the preset table in parallel when the number of elements in the preset table is greater than a preset threshold.
Therefore, the computation speed of the subgraph matching among communities can be increased, and the computation speed of the whole subgraph matching method is further increased.
The subgraph matching device based on the community structure fully utilizes the structure of the target mode and the information of the community structure, has a better pruning effect, can effectively improve the computation speed of subgraph matching, and reduces the time overhead.
The subgraph matching device based on the community structure in this embodiment may be used to implement the technical solutions of the foregoing method embodiments, and the implementation principles and technical effects thereof are similar and will not be described herein again.
Fig. 3 is a schematic entity structure diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 3, the electronic device may include: a processor 31, a memory 32, a bus 33, and computer programs stored on the memory 32 and executable on the processor 31;
the processor 31 and the memory 32 complete mutual communication through the bus 33;
when the processor 31 executes the computer program, the method provided by the foregoing method embodiments is implemented, for example, including: importing a file containing a target mode, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode; generating a hypergraph with communities as nodes according to the data of the network graph, and calculating the number of edges between each node in each community and each adjacent community of the community; respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm in each community in the network graph to obtain subgraph matching results in the communities; finding out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs, which are matched with each other and are equivalent in the target pattern, and obtaining a subgraph matching result between communities; and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
An embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method provided by the foregoing method embodiments, and for example, the method includes: importing a file containing a target mode, analyzing the structure of the target mode based on the imported file, and finding out mutually matched equivalent subgraphs in the target mode; generating a hypergraph with communities as nodes according to the data of the network graph, and calculating the number of edges between each node in each community and each adjacent community of the community; respectively finding out subgraphs, matched with the target mode, of each community by using a preset subgraph matching algorithm in each community in the network graph to obtain subgraph matching results in the communities; finding out subgraphs, which cross communities and are matched with the target pattern, among all communities in the network graph based on the calculated number of edges between each node in each community and each adjacent community of the community and the found subgraphs, which are matched with each other and are equivalent in the target pattern, and obtaining a subgraph matching result between communities; and summarizing the intra-community subgraph matching result and the inter-community subgraph matching result to obtain a target mode and a subgraph matching result of the network graph.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means/systems for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element. The terms "upper", "lower", and the like, indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience in describing the present invention and simplifying the description, but do not indicate or imply that the referred devices or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention. Unless expressly stated or limited otherwise, the terms "mounted," "connected," and "connected" are intended to be inclusive and mean, for example, that they may be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the description of the present invention, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description. Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention is not limited to any single aspect, nor is it limited to any single embodiment, nor is it limited to any combination and/or permutation of these aspects and/or embodiments. Moreover, each aspect and/or embodiment of the present invention may be utilized alone or in combination with one or more other aspects and/or embodiments thereof.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention, and they should be construed as being included in the following claims and description.

Claims (10)

1.一种应用于社交网络基于社区结构的子图匹配方法,其特征在于,包括:1. a subgraph matching method applied to social network based on community structure, is characterized in that, comprises: 导入包含代表三个互相认识的人的目标模式的文件,基于所导入的文件,分析目标模式的结构,找出目标模式中互相匹配等价的子图;Import a file containing target patterns representing three people who know each other, analyze the structure of the target pattern based on the imported file, and find out matching and equivalent subgraphs in the target pattern; 根据社交网络的数据生成以社区作为结点的超图,计算每个社区中各结点与本社区的各邻接社区间的边数;Generate a hypergraph with the community as a node according to the data of the social network, and calculate the number of edges between each node in each community and each adjacent community of the community; 在所述社交网络中各社区内部,利用预设子图匹配算法,分别找出各社区内所有互相认识的三个人构成的子图,获得社区内子图匹配结果;Within each community in the social network, a preset subgraph matching algorithm is used to find subgraphs composed of all three people who know each other in each community, and obtain subgraph matching results within the community; 在所述社交网络中所有社区之间,利用预设子图匹配算法,在所述超图上对目标模式进行结点可重复的子图匹配,将得到的分配方案存储在预设表中,当所述预设表内的元素数量大于预设阈值时,对于所述预设表中的每个分配方案s基于所计算的每个社区中各结点与本社区的各邻接社区间的边数和所找出的目标模式中互相匹配等价的子图进行处理,判断s中被分配到所述社交网络中各社区的目标模式的结点数量是否大于该社区中的结点数量,判断s是否可以通过匹配等价关系转换成索引值更小的分配方案,以及对于s中被分配到相同社区的目标模式中的结点,两两计算目标模式中它们在除本社区之外的其它各社区中的公共一跳邻居数来进行剪枝,找出跨社区的所有互相认识的三个人构成的子图,获得社区间子图匹配结果;Among all the communities in the social network, using a preset subgraph matching algorithm, the target pattern is subjected to node-repeatable subgraph matching on the hypergraph, and the obtained allocation scheme is stored in a preset table, When the number of elements in the preset table is greater than the preset threshold, for each allocation scheme s in the preset table, based on the calculated edge between each node in each community and each adjacent community of the community The number of matching and equivalent subgraphs in the found target pattern is processed, and it is judged whether the number of nodes in the target pattern assigned to each community in the social network in s is greater than the number of nodes in the community. Whether s can be converted into an allocation scheme with a smaller index value by matching the equivalence relationship, and for the nodes in s that are allocated to the target pattern of the same community, calculate them in pairs other than this community in the target pattern. The number of public one-hop neighbors in each community is used for pruning, and the subgraph composed of all three people who know each other across the community is found, and the subgraph matching result between communities is obtained; 将所述社区内子图匹配结果和社区间子图匹配结果进行汇总,获得目标模式与所述网络图的子图匹配结果,即获得所述社交网络中所有互相认识的三个人。The intra-community sub-graph matching results and the inter-community sub-graph matching results are summarized to obtain the sub-graph matching results between the target pattern and the network graph, that is, all three people who know each other in the social network are obtained. 2.根据权利要求1所述的方法,其特征在于,所述导入包含目标模式的文件,基于所导入的文件,分析目标模式的结构,找出目标模式中互相匹配等价的子图,包括:2. method according to claim 1, is characterized in that, described import comprises the file of target pattern, based on the imported file, analyze the structure of target pattern, find out the subgraphs that match each other equivalent in target pattern, comprise : 导入包含目标模式的文件,解析所述目标模式中的结点和边及相关属性;importing a file containing a target schema, parsing the nodes and edges and related attributes in the target schema; 基于解析出的所述目标模式中的结点和边及相关属性,分析目标模式的结构,找出目标模式中互相匹配等价的子图。Based on the parsed nodes, edges and related attributes in the target pattern, the structure of the target pattern is analyzed, and subgraphs matching and equivalent to each other in the target pattern are found. 3.根据权利要求1所述的方法,其特征在于,在所述社交网络中各社区内部,利用预设子图匹配算法,分别找出各社区的与目标模式相匹配的子图,获得社区内子图匹配结果,包括:3. method according to claim 1, is characterized in that, in each community in described social network, utilize preset subgraph matching algorithm, find out the subgraph matching with target pattern of each community respectively, obtain community Intra-subgraph matching results, including: 并行地在所述社交网络中各社区内部,利用预设子图匹配算法,分别找出各社区的与目标模式相匹配的子图,获得社区内子图匹配结果。In parallel within each community in the social network, a preset subgraph matching algorithm is used to find subgraphs of each community that match the target pattern, and obtain subgraph matching results within the community. 4.根据权利要求1所述的方法,其特征在于,在所述社交网络中所有社区之间,利用预设子图匹配算法,在所述超图上对目标模式进行结点可重复的子图匹配,将得到的分配方案存储在预设表中,当所述预设表内的元素数量大于预设阈值时,对于所述预设表中的每个分配方案s基于所计算的每个社区中各结点与本社区的各邻接社区间的边数和所找出的目标模式中互相匹配等价的子图进行处理,判断s中被分配到所述社交网络中各社区的目标模式的结点数量是否大于该社区中的结点数量,判断s是否可以通过匹配等价关系转换成索引值更小的分配方案,以及对于s中被分配到相同社区的目标模式中的结点,两两计算目标模式中它们在除本社区之外的其它各社区中的公共一跳邻居数来进行剪枝,找出跨社区的所有互相认识的三个人构成的子图,获得社区间子图匹配结果,包括:4. The method according to claim 1, characterized in that, among all the communities in the social network, using a preset subgraph matching algorithm to perform node-repeatable subgraphs on the target pattern on the hypergraph map matching, store the obtained allocation scheme in a preset table, and when the number of elements in the preset table is greater than a preset threshold, for each allocation scheme s in the preset table based on each calculated The number of edges between each node in the community and each adjacent community in this community and the subgraphs that match and are equivalent to each other in the found target pattern are processed, and the target pattern assigned to each community in the social network in s is judged. Whether the number of nodes in s is greater than the number of nodes in the community, determine whether s can be converted into an allocation scheme with a smaller index value by matching the equivalence relationship, and for nodes in s that are allocated to the target pattern of the same community, In the target mode, calculate the number of common one-hop neighbors in other communities except this community for pruning, find out the subgraph composed of all three people who know each other across the community, and obtain the inter-community subgraph Match results, including: 利用预设子图匹配算法,在所述超图上对目标模式进行结点可重复的子图匹配,将得到的分配方案存储在预设表中;Using a preset subgraph matching algorithm, perform node-repeatable subgraph matching on the target pattern on the hypergraph, and store the obtained allocation scheme in a preset table; 当所述预设表内的元素数量大于预设阈值时,对于所述预设表中的每个分配方案s进行处理,包括:When the number of elements in the preset table is greater than the preset threshold, processing each allocation scheme s in the preset table, including: 判断s中被分配到所述社交网络中各社区的目标模式的结点数量是否大于该社区中的结点数量;Judging whether the number of nodes in s assigned to the target pattern of each community in the social network is greater than the number of nodes in the community; 若s中被分配到所述社交网络中各社区的目标模式的结点数量小于等于该社区中的结点数量,则根据目标模式的结构,将s中被分配到各社区的目标模式中的结点作为待匹配结点,计算所述待匹配结点在所述目标模式中与被分配给除了本社区之外的其它各社区的结点间的边数,并结合社交网络中本社区中结点与除了本社区之外的其它社区的边数进行剪枝,以减少目标模式中各结点对应的候选匹配结点的数量,若剪枝后目标模式中的某个结点对应的候选匹配结点的数量减少到了0,则结束对s的处理;If the number of nodes in s assigned to the target pattern of each community in the social network is less than or equal to the number of nodes in the community, then according to the structure of the target pattern, the number of nodes in s assigned to the target pattern of each community is The node is used as the node to be matched, and the number of edges between the node to be matched in the target pattern and the nodes assigned to other communities except this community is calculated, and the number of edges in this community in the social network is combined. The number of edges between nodes and other communities other than this community is pruned to reduce the number of candidate matching nodes corresponding to each node in the target pattern. When the number of matching nodes is reduced to 0, the processing of s ends; 若剪枝后目标模式中的所有结点对应的候选匹配结点的数量均不为0,则根据所找出的目标模式中互相匹配等价的子图,判断当前s是否可以通过匹配等价关系转换成索引值更小的分配方案,若不可以,则将当前s能通过匹配等价关系转换得到的所有分配方案记录在预设文件中;If the number of candidate matching nodes corresponding to all nodes in the target pattern after pruning is not 0, then according to the found subgraphs in the target pattern that match and are equivalent to each other, it is judged whether the current s can pass the matching equivalent The relationship is converted into an allocation plan with a smaller index value. If it is not possible, all allocation plans that can be converted by matching the equivalence relationship in the current s are recorded in the preset file; 对于s中被分配到相同社区的目标模式中的结点,两两计算目标模式中它们在除本社区之外的其它各社区中的公共一跳邻居数;For the nodes in s that are assigned to the target pattern of the same community, calculate the number of common one-hop neighbors in the target pattern in other communities except this community; 将s中被分配到相同社区的目标模式中的结点组成的目标模式的导出子图作为目标模式子图,利用预设子图匹配算法,将所述目标模式子图与社交网络中的对应社区进行子图匹配,在匹配过程中,对目标模式子图中的各结点使用其对应的所述候选匹配结点进行匹配尝试,并使用计算得到的公共一跳邻居数进行剪枝,得到s中每个目标模式子图与社交网络中的对应社区进行子图匹配的匹配结果;The derived subgraph of the target pattern composed of nodes in the target pattern assigned to the same community in s is taken as the target pattern subgraph, and a preset subgraph matching algorithm is used to match the target pattern subgraph with the corresponding one in the social network. The community performs subgraph matching. During the matching process, each node in the target pattern subgraph uses its corresponding candidate matching node to perform a matching attempt, and uses the calculated number of public one-hop neighbors to prune, and obtain The matching result of subgraph matching between each target pattern subgraph in s and the corresponding community in the social network; 对s中每个目标模式子图与社交网络中的对应社区进行子图匹配得到的匹配结果进行进一步检验,对于s中的任意两个目标模式子图在两个社区中的匹配结果,若两个目标模式子图的两个结点间有边相连,则与这两个结点匹配的结点间也应有边相连;若对于包含s中的各目标模式子图的一种匹配结果的一个匹配结果集合,且这个匹配结果集合中的匹配结果两两通过了检验,则将匹配结果集合中的匹配结果合并产生一个目标模式与社交网络的匹配结果;The matching results obtained by sub-graph matching between each target pattern subgraph in s and the corresponding community in the social network are further tested. For the matching results of any two target pattern subgraphs in s in the two communities, if the two The two nodes of each target pattern subgraph are connected by an edge, then the nodes matching these two nodes should also be connected by an edge; if a matching result containing each target pattern subgraph in s A matching result set, and the matching results in the matching result set pass the test in pairs, then the matching results in the matching result set are combined to generate a matching result between the target pattern and the social network; 根据所产生的所有目标模式与社交网络的匹配结果,对所述预设文件中记录的所有分配方案,通过匹配等价的对应关系,计算出最终的匹配结果。According to the generated matching results between all target patterns and social networks, for all allocation schemes recorded in the preset file, the final matching results are calculated by matching equivalent correspondences. 5.根据权利要求4所述的方法,其特征在于,在判断s中被分配到所述社交网络中各社区的目标模式的结点数量是否大于该社区中的结点数量之后,所述方法还包括:5. The method according to claim 4, characterized in that after judging whether the number of nodes in s allocated to the target pattern of each community in the social network is greater than the number of nodes in the community, the method Also includes: 若判断获知s中被分配到所述社交网络中某社区的目标模式中的结点数量大于该社区中的结点数量,则结束对s的处理。If it is determined that the number of nodes in the target pattern assigned to a community in the social network in s is greater than the number of nodes in the community, the processing of s is ended. 6.根据权利要求4所述的方法,其特征在于,在根据所找出的目标模式中互相匹配等价的子图,判断当前s是否可以通过匹配等价关系转换成索引值更小的分配方案之后,所述方法还包括:6. method according to claim 4, is characterized in that, in according to the target pattern that finds out, match the subgraph that is equivalent to each other, judge whether current s can be converted into the assignment that index value is smaller by matching equivalence relation After the solution, the method further includes: 判断获知当前s可以通过匹配等价关系转换成索引值更小的分配方案,则结束对s的处理。It is judged that the current s can be converted into an allocation scheme with a smaller index value by matching the equivalence relationship, and the processing of s is ended. 7.根据权利要求4所述的方法,其特征在于,当所述预设表内的元素数量大于预设阈值时,并行地对于所述预设表中的每个分配方案s进行处理。7. The method according to claim 4, wherein when the number of elements in the preset table is greater than a preset threshold, processing each allocation scheme s in the preset table in parallel. 8.一种应用于社交网络基于社区结构的子图匹配装置,其特征在于,包括:8. A sub-graph matching device applied to social network based on community structure is characterized in that, comprising: 导入模块,用于导入包含代表三个互相认识的人的目标模式的文件,基于所导入的文件,分析目标模式的结构,找出目标模式中互相匹配等价的子图;The import module is used to import the file containing the target pattern representing the three people who know each other, analyze the structure of the target pattern based on the imported file, and find out the matching and equivalent subgraphs in the target pattern; 生成模块,用于根据社交网络生成以社区作为结点的超图,计算每个社区中各结点与本社区的各邻接社区间的边数;The generation module is used to generate a hypergraph with a community as a node according to the social network, and calculate the number of edges between each node in each community and each adjacent community of the community; 第一匹配模块,用于在所述社交网络中各社区内部,利用预设子图匹配算法,分别找出各社区的与目标模式相匹配的子图内所有互相认识的三个人构成的子图,获得社区内子图匹配结果;The first matching module is used to, within each community in the social network, use a preset subgraph matching algorithm to respectively find subgraphs composed of three people who know each other in the subgraphs of each community that match the target pattern. , to obtain the subgraph matching results within the community; 第二匹配模块,用于在所述社交网络中所有社区之间,利用预设子图匹配算法,在所述超图上对目标模式进行结点可重复的子图匹配,将得到的分配方案存储在预设表中,当所述预设表内的元素数量大于预设阈值时,对于所述预设表中的每个分配方案s基于所计算的每个社区中各结点与本社区的各邻接社区间的边数和所找出的目标模式中互相匹配等价的子图进行处理,判断s中被分配到所述社交网络中各社区的目标模式的结点数量是否大于该社区中的结点数量,判断s是否可以通过匹配等价关系转换成索引值更小的分配方案,以及对于s中被分配到相同社区的目标模式中的结点,两两计算目标模式中它们在除本社区之外的其它各社区中的公共一跳邻居数来进行剪枝,找出跨社区的所有互相认识的三个人构成的子图,获得社区间子图匹配结果;The second matching module is used to perform node-repeatable sub-graph matching on the target pattern on the hypergraph by using a preset subgraph matching algorithm among all the communities in the social network. Stored in the preset table, when the number of elements in the preset table is greater than the preset threshold, for each allocation scheme s in the preset table is calculated based on each node in each community and the community The number of edges between the adjacent communities and the subgraphs that match each other in the found target pattern are processed, and it is judged whether the number of nodes in the target pattern assigned to each community in the social network in s is greater than that of the community. The number of nodes in s, determine whether s can be converted into an allocation scheme with a smaller index value by matching the equivalence relationship, and for the nodes in s that are allocated to the target pattern of the same community, calculate them in pairs in the target pattern. In addition to this community, the number of public one-hop neighbors in other communities is used for pruning, and the subgraph composed of all three people who know each other across the community is found, and the subgraph matching results between communities are obtained; 汇总模块,用于将所述社区内子图匹配结果和社区间子图匹配结果进行汇总,获得目标模式与所述社交网络的子图匹配结果,即获得所述社交网络中所有互相认识的三个人。The aggregation module is used to summarize the sub-graph matching results within the community and the sub-graph matching results between the communities, and obtain the sub-graph matching results between the target pattern and the social network, that is, obtain all three people who know each other in the social network. . 9.一种电子设备,其特征在于,包括:处理器、存储器、总线及存储在存储器上并可在处理器上运行的计算机程序;9. An electronic device, comprising: a processor, a memory, a bus, and a computer program stored in the memory and running on the processor; 其中,所述处理器,存储器通过所述总线完成相互间的通信;Wherein, the processor and the memory communicate with each other through the bus; 所述处理器执行所述计算机程序时实现如权利要求1-7中任一项所述的方法。The processor implements the method of any one of claims 1-7 when executing the computer program. 10.一种非暂态计算机可读存储介质,其特征在于,所述非暂态计算机可读存储介质上存储有计算机程序,该计算机程序被处理器执行时实现如权利要求1-7中任一项所述的方法。10. A non-transitory computer-readable storage medium, wherein a computer program is stored on the non-transitory computer-readable storage medium, and when the computer program is executed by a processor, any one of claims 1-7 is implemented. one of the methods described.
CN201810836811.3A 2018-07-26 2018-07-26 A method and device for subgraph matching based on community structure Active CN109063089B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810836811.3A CN109063089B (en) 2018-07-26 2018-07-26 A method and device for subgraph matching based on community structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810836811.3A CN109063089B (en) 2018-07-26 2018-07-26 A method and device for subgraph matching based on community structure

Publications (2)

Publication Number Publication Date
CN109063089A CN109063089A (en) 2018-12-21
CN109063089B true CN109063089B (en) 2021-04-23

Family

ID=64836646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810836811.3A Active CN109063089B (en) 2018-07-26 2018-07-26 A method and device for subgraph matching based on community structure

Country Status (1)

Country Link
CN (1) CN109063089B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382315B (en) * 2018-12-29 2024-04-05 阿里巴巴集团控股有限公司 Merging method of sub-graph isomorphic matching results, electronic equipment and storage medium
WO2022041023A1 (en) * 2020-08-27 2022-03-03 清华大学 Sub-graph matching policy determination method, sub-graph matching method, sub-graph counting method and calculation device
CN114491100A (en) * 2021-12-31 2022-05-13 清华大学 Graph structure-based multi-modal media data clustering method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8676565B2 (en) * 2010-03-26 2014-03-18 Virtuoz Sa Semantic clustering and conversational agents
CN102662974B (en) * 2012-03-12 2014-02-26 浙江大学 A Network Graph Indexing Method Based on Adjacent Node Tree
CN105956114A (en) * 2016-05-05 2016-09-21 南京邮电大学 Method for searching pattern matching subgraphs based on tag graph
CN106384050B (en) * 2016-09-13 2019-01-15 哈尔滨工程大学 A kind of dynamic stain analysis method excavated based on Maximum Frequent subgraph

Also Published As

Publication number Publication date
CN109063089A (en) 2018-12-21

Similar Documents

Publication Publication Date Title
Ni et al. Inside the atoms: ranking on a network of networks
CN112579797B (en) Service processing method and device for knowledge graph
CN109063089B (en) A method and device for subgraph matching based on community structure
Weller et al. Exact approaches for scaffolding
Rao et al. New approximation techniques for some linear ordering problems
CN102456062A (en) Community similarity calculation method and social network cooperation mode discovery method
CN114116785B (en) Distributed SPARQL query optimization method based on minimum attribute cut
CN109981326B (en) Method and device for positioning household broadband sensing fault
CN115240048A (en) Image classification-oriented deep learning operator positioning fusion method and device
Agarwal et al. Distributed weighted all pairs shortest paths through pipelining
da Silva et al. Model-based operator placement for data processing in iot environments
CN108614932B (en) Edge graph-based linear flow overlapping community discovery method, system and storage medium
CN109656898A (en) Distributed large-scale complex community detection method and device based on node degree
CN117668004A (en) Method for acquiring shortest path between nodes
Mowshowitz et al. Entropy, orbits, and spectra of graphs
KR20150136734A (en) Data parallel inference method and apparatus thereof
CN115277124A (en) Online system and server for searching and matching attack mode based on system tracing graph
Chen et al. PBSM: an efficient top-K subgraph matching algorithm
CN117172633B (en) Manufacturing service subgraph simulation method and system for industrial Internet platform
CN108491505A (en) A kind of DSATUR figure vertex coloring methods based on the sequence of Topology Potential value
Hoffmann et al. The parameterized complexity of centrality improvement in networks
CN115150152B (en) Network user actual authority quick reasoning method based on authority dependency graph reduction
Maniu et al. ProbTree: A query-efficient representation of probabilistic graphs
Xiang et al. TKDA: An improved method for k-degree anonymity in social graphs
CN114579826B (en) Task processing method and device based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181221

Assignee: Galaxy Aerospace (Xi'an) Technology Co.,Ltd.

Assignor: TSINGHUA University

Contract record no.: X2023110000007

Denomination of invention: A subgraph matching method and device based on community structure

Granted publication date: 20210423

License type: Common License

Record date: 20230112

EE01 Entry into force of recordation of patent licensing contract