CN112150581B

CN112150581B - Distributed self-adaptive graph vertex coloring method and system

Info

Publication number: CN112150581B
Application number: CN202010837566.5A
Authority: CN
Inventors: 王志刚; 王宁; 杨洋; 魏志强; 黄磊; 刘昊; 盛艳秀
Original assignee: Ocean University of China
Current assignee: Ocean University of China
Priority date: 2020-08-19
Filing date: 2020-08-19
Publication date: 2022-09-30
Anticipated expiration: 2040-08-19
Also published as: CN112150581A

Abstract

The invention discloses a distributed _adaptive graph vertex coloring method and system, comprising: respectively acquiring the coloring information set C ^t-1 ( vi); Coloring conflict judgment is performed on each vertex _vi ; wherein, for any vertex vi, if the value rs of the current probability generator is satisfied that the value _rs of the current probability generator is less than or equal to the first threshold α and the coloring information c _i of the _vertex _vi ∈C ^t‑1 (v _i ) and c _i ∈B ^t (v _i ), then determine that the vertex v _i and its adjacent vertices produce coloring conflicts; determine the available color set U(v _i ) corresponding to each vertex v _i , A color is randomly selected from the available color set U(vi ₎ corresponding to each vertex v _i according to the second threshold β _, and the coloring information c _i of each vertex v _i is determined; All adjacent vertices v _j send their coloring information c _i ; if the coloring information of each vertex v _i is different from the color of all its adjacent vertices, the coloring information of each vertex in the undirected graph is determined.

Description

A Distributed Adaptive Graph Vertex Shading Method and System

技术领域technical field

本发明涉及数据处理技术领域，并且更具体地，涉及一种分布式的自适应图顶点着色方法及系统。The present invention relates to the technical field of data processing, and more particularly, to a distributed adaptive graph vertex coloring method and system.

背景技术Background technique

NP-C问题是世界七大数学难题之一，即便是在信息技术高速发展的今天，各领域仍存在大量NP-C问题有待解决。而图顶点着色问题(GVCP,graph vertex coloringproblem)则是最著名的NP-C问题之一。GVCP研究如何给一个无向连通图的顶点着色，使得有公共边的相邻顶点的颜色不同。具体地，GVCP判定主要考虑能否用给定的k种颜色完成着色任务，而GVCP质量优化则致力于最小化k值，它们被广泛应用于任务调度、寄存器分配和时间表安排等实际场景。因此，研究GVCP的判定和质量优化，不仅对NP-C类问题的探索与解决具有重要的理论意义，同时也有广阔的商业应用价值。The NP-C problem is one of the seven major mathematical problems in the world. Even today with the rapid development of information technology, there are still a large number of NP-C problems to be solved in various fields. The graph vertex coloring problem (GVCP, graph vertex coloring problem) is one of the most famous NP-C problems. GVCP studies how to color the vertices of an undirected connected graph such that adjacent vertices with common edges have different colors. Specifically, GVCP determination mainly considers whether the coloring task can be completed with a given k colors, while GVCP quality optimization focuses on minimizing the k value, and they are widely used in practical scenarios such as task scheduling, register allocation, and scheduling. Therefore, studying the determination and quality optimization of GVCP not only has important theoretical significance for the exploration and solution of NP-C problems, but also has broad commercial application value.

由于直接计算GVCP的复杂度较高，目前普遍采用的方案是启发式求解，即通过迭代逼近的方式逐步寻求最优解。即便如此，在大数据时代，图数据规模呈爆发式增长，致使传统的集中式(单机)迭代方式仍难以在合理时间内完成计算任务。近年来，基于云计算的新型分布式处理系统在计算和存储能力方面具有良好的扩展性，可为GVCP的迭代求解提供有效支撑。其中，尤以Google公司提出的分布式图计算系统Pregel及其开源实现与扩展如GraphLab等最为适宜。它们提供了以顶点为中心的编程接口以及性能优化，使应用层面的用户仅需关心业务逻辑，无需了解分布式处理细节。Due to the high complexity of directly calculating GVCP, the currently commonly used solution is heuristic solution, that is, the optimal solution is gradually sought through iterative approximation. Even so, in the era of big data, the scale of graph data has grown explosively, making it difficult for traditional centralized (single-machine) iterative methods to complete computing tasks within a reasonable time. In recent years, the new distributed processing systems based on cloud computing have good scalability in terms of computing and storage capabilities, and can provide effective support for the iterative solution of GVCP. Among them, the distributed graph computing system Pregel proposed by Google and its open source implementation and extension such as GraphLab are the most suitable. They provide a vertex-centric programming interface and performance optimization, so that users at the application level only need to care about business logic and do not need to understand the details of distributed processing.

在分布式图计算系统中，图顶点数据被分配到不同任务(如物理机、进程或线程等)上以便增加着色处理的并行度。某次迭代过程中，各任务上的顶点并行着色，并将各自选择的颜色以消息的形式广播给有边连接的邻接顶点，而系统则通过全局同步路障来协调各任务的处理进度(同步计算)并确保所有顶点在计算时能够收到所有应该收到的消息数据。需要注意的是，本步迭代发送的消息只能在路障之后的下一步迭代中才可被使用。然而，具体到GVCP计算，逻辑上拓扑结构相邻的顶点在物理上可能归属于不同的任务，其并行着色过程中极有可能选择相同颜色而产生着色冲突。另一方面，同步路障的存在导致着色信息的传播具有滞后性，即实时性差。较差的实时性导致只有在下一步迭代才会意识到着色冲突进而继续进行颜色选择。但再选择过程仍可能产生着色冲突，并最终陷入无限振荡，导致算法无法收敛。为解决该问题，已有技术或者通过逐步计算独立集然后以独立集为单位分批顺序着色，以避免相邻顶点同时着色，或者通过破除全局同步路障的方式对着色引入随机扰动以跳出振荡循环。然而，前者引入额外计算开销和独立集结果的维护开销，后者则因扰动的不可控性(难以复现)极大增加了程序调试与容错控制的难度。In a distributed graph computing system, graph vertex data is distributed to different tasks (such as physical machines, processes or threads, etc.) in order to increase the parallelism of shading processing. In an iteration process, the vertices on each task are colored in parallel, and the selected color is broadcast to the adjacent vertices connected by edges in the form of messages, and the system coordinates the processing progress of each task through the global synchronization roadblock (synchronous calculation). ) and make sure that all vertices receive all the message data they should receive at the time of computation. It should be noted that the messages sent in this iteration can only be used in the next iteration after the roadblock. However, specific to GVCP calculation, logically topologically adjacent vertices may belong to different tasks physically, and it is very likely that the same color will be selected during the parallel coloring process, resulting in coloring conflicts. On the other hand, the existence of synchronization roadblocks leads to a lag in the propagation of coloring information, that is, poor real-time performance. Poor real-time performance results in only the next iteration being aware of shading conflicts and proceeding with color selection. However, the re-selection process may still produce coloring conflicts, and eventually fall into infinite oscillation, resulting in the failure of the algorithm to converge. In order to solve this problem, the existing technology either calculates independent sets step by step and then colorizes sequentially in batches in units of independent sets to avoid simultaneous coloring of adjacent vertices, or introduces random disturbances to coloring by breaking the global synchronization roadblock to jump out of the oscillation cycle. . However, the former introduces additional computational overhead and maintenance overhead of independent set results, while the latter greatly increases the difficulty of program debugging and fault-tolerant control due to the uncontrollable perturbation (difficult to reproduce).

发明内容SUMMARY OF THE INVENTION

本发明提出一种分布式的自适应图顶点着色方法及系统，以解决如何高效、准确对无向图顶点进行着色的问题。The present invention proposes a distributed adaptive graph vertex coloring method and system to solve the problem of how to efficiently and accurately color undirected graph vertices.

为了解决上述问题，根据本发明的一个方面，提供了一种分布式的自适应图顶点着色方法，所述方法包括：In order to solve the above problems, according to an aspect of the present invention, a distributed adaptive graph vertex coloring method is provided, and the method includes:

步骤1，分别获取无向图中每个顶点v_i的所有邻接顶点在上一次迭代过程中的着色信息集合C^t-1(v_i)；其中，t为当前的迭代次数；Step 1, respectively obtain the coloring information set C ^t-1 (vi ₎ of all adjacent vertices of each vertex v _i in the undirected graph in the previous iteration process; wherein, t is the current number of iterations;

步骤2，对每个顶点v_i进行着色冲突判断；其中，对于任一个顶点vi，若满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点产生着色冲突，更新该顶点vi对应的状态累积监控变量s_i为初始阈值；其中，s_i用于记录该顶点v_i的着色信息连续未发生变化的迭代次数；B^t(v_i)为该顶点v_i在更新时刻已经收到的来自于当前迭代步的且位于该顶点v_i所在任务的部分邻接顶点的着色信息；Step 2: Perform coloring conflict judgment on each vertex vi; wherein, for any vertex vi, if the current probability generator value _rs is less than or equal to the first threshold α and the coloring information c _i _∈ of the vertex vi is _satisfied C ^t-1 (v _i ) and c _i ∈ B ^t (v _i ), then determine that the vertex vi and its adjacent vertices have a color conflict, and update the state accumulation monitoring variable _si corresponding to the vertex _vi to be the initial threshold; where, s _i is used to record the number of iterations that the coloring information of the vertex v _i has not changed continuously; B ^t (vi ₎ is the current iteration step that the vertex v _i has received at the update time and is located at the vertex v _i The coloring information of some adjacent vertices of the task;

步骤3，确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中根据第二阈值β随机选择一个颜色，确定每个顶点v_i的着色信息c_i；Step 3: Determine the available color set U(v _i ) corresponding to each vertex v _i , and randomly select a color from the available color set U(vi ₎ corresponding to each vertex v _i according to the second threshold β, and determine each color. coloring information c _{i of each vertex v i} _;

步骤4，对于每个顶点v_i，沿边向其所有邻接顶点v_j发送其着色信息c_i；Step 4, for each vertex v _i , send its coloring information c _i to all its adjacent vertices v _j along the edge;

步骤5，判断是否满足每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同；其中，若满足，则直接确定所述无向图中每个顶点的着色信息；若不满足，则根据运行信息调整第一阈值α和第二阈值β，并返回步骤1重新进行迭代，直至每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同时停止，并确定所述无向图中每个顶点的着色信息。Step 5, determine whether the coloring information of each vertex v _i is different from the color of all its adjacent vertices; wherein, if it is satisfied, then directly determine the coloring information of each vertex in the undirected graph; if not, then Adjust the first threshold α and the second threshold β according to the running information, and return to step 1 to re-iterate until the coloring information of each vertex v _i and the colors of all its adjacent vertices are different, stop, and determine the undirected graph Shading information for each vertex in .

优选地，其中所述方法还包括：Preferably, wherein the method further comprises:

在对每个顶点v_i进行着色冲突判断时，若不满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点v_j未产生着色冲突，更新s_i＝s_i+1，并直接进入步骤4。When judging coloring conflict for each vertex v _i , if the current value rs of the probability generator is not satisfied, the value _rs is less than or equal to the first threshold α and the coloring information of the vertex v _i is c _i ∈ C ^t-1 (vi ₎ And c _i ∈ B ^t (vi ), then it is determined that the vertex v _i and its adjacent vertex v _j have no coloring conflict, update s _i _{=s i} ₊₁ , and go to step 4 directly.

在进行下一次迭代前，利用如下公式动态地调整所述第一阈值α，包括：Before the next iteration, the first threshold α is dynamically adjusted using the following formula, including:

α＝p+(p-dv/maxd)×γ，α=p+(p-dv/maxd)×γ,

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

其中，p为预设的基本阈值，取值范围为0到1；d_v为当前的顶点v_i的度数，maxd为所述无向图中所有顶点的最大度数，γ为取值为0到1的收敛调控参数；|V|为所述无向图中的全部顶点数量，|V^x|为在预设的连续x步迭代计算中颜色均未发生变化的顶点数量。Among them, p is the preset basic threshold, the value range is 0 to 1; d _v is the degree of the current vertex v _i , maxd is the maximum degree of all the vertices in the undirected graph, γ is 0 to 1 The convergence control parameter of 1; |V| is the number of all vertices in the undirected graph, and |V ^x | is the number of vertices whose color has not changed in the preset continuous x-step iterative calculation.

优选地，其中所述确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中随机选择一个颜色，确定每个顶点v_i的着色信息c_i，包括：Preferably, wherein the determining the available color set U(v _i ) corresponding to each vertex v _i , randomly select a color from the available color set U(vi ₎ corresponding to each vertex v _i , and determine each vertex The coloring information _ci of _vi , including:

确定每个顶点v_i对应的可用颜色集合U(v_i)Determine the available color set U(vi ₎ corresponding to each vertex v _i

对于任一个顶点v_i,若|U(v_i)|小于第二阈值β，则进行颜色扩充，将已用的最大颜色值以β为单位步长自增，并加入到所述可用颜色集合U(v_i)中；For any vertex v _i , if |U(vi ₎ | is less than the second threshold β, then color expansion is performed, the maximum used color value is incremented by β as a unit step, and added to the available color set in U( _vi );

将每个顶点v_i对应的当前的可用颜色集合U(v_i)中的颜色按照颜色编号升序排列，选取top-β个颜色放入数组中，并根据随机着色器的值r_c模β取余后作为数组下标进行颜色选择，确定每个顶点v_i的着色信息为c_{i_}＝arrayOfTop-β[r_c modβ]。Arrange the colors in the current available color set U(v _i ) corresponding to each vertex v _i in ascending order of color numbers, select the top-β colors and put them into the array, and take the value r _c modulo β according to the value of the random shader. The remainder is used as an array subscript for color selection, and the coloring information of each vertex v _i is determined as c _{i_} =arrayOfTop-β[r _c modβ].

在进行下一次迭代前，利用如下公式动态地调整所述第二阈值β，包括：Before the next iteration, the second threshold β is dynamically adjusted using the following formula, including:

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

其中，

为向上取整数运算符，γ为取值为0到1的收敛调控参数；|V|为所述无向图中的全部顶点数量，|V^x|为在预设的连续x步迭代计算中颜色均未发生变化的顶点数量。in,

is an upward integer operator, γ is a convergence control parameter with a value from 0 to 1; |V| is the total number of vertices in the undirected graph, |V ^x | is in the preset continuous x-step iterative calculation The number of vertices for which none of the colors have changed.

根据本发明的另一个方面，提供了一种分布式的自适应图顶点着色系统，所述系统包括：According to another aspect of the present invention, a distributed adaptive graph vertex shader system is provided, the system comprising:

着色信息集合获取单元，用于分别获取无向图中每个顶点v_i的所有邻接顶点在上一次迭代过程中的着色信息集合C^t-1(v_i)；其中，t为当前的迭代次数；The coloring information set acquisition unit is used to obtain the coloring information set C ^t-1 (vi ₎ of all adjacent vertices of each vertex v _i in the undirected graph in the previous iteration process; wherein, t is the current number of iterations ;

着色冲突判断单元，用于对每个顶点v_i进行着色冲突判断；其中，对于任一个顶点vi，若满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点产生着色冲突，更新该顶点vi对应的状态累积监控变量s_i为初始阈值；其中，s_i用于记录该顶点v_i的着色信息连续未发生变化的迭代次数；B^t(v_i)为该顶点v_i在更新时刻已经收到的来自于当前迭代步的且位于该顶点v_i所在任务的部分邻接顶点的着色信息；The coloring conflict judgment unit is used to judge the coloring conflict for each vertex _vi ; wherein, for any vertex vi, if the value rs of the current probability generator is satisfied that the value _rs of the current probability generator is less than or equal to the first threshold α and the coloring of the vertex _vi is Information c _i ∈ C ^t-1 (vi ) and c _i _∈ B ^t (vi ₎ , then determine that the vertex vi and its adjacent vertices have coloring conflicts, and update the state accumulation monitoring variable _si corresponding to the vertex _vi to be the initial Threshold; wherein, s _i is used to record the number of iterations that the coloring information of the vertex v _i has not changed continuously; B ^t (vi _{) is the current iteration step that the vertex v i} _has received at the update time and is located in The coloring information of some adjacent vertices of the task where the vertex v _i is located;

着色单元，用于确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中根据第二阈值β随机选择一个颜色，确定每个顶点v_i的着色信息c_i；The coloring unit is used to determine the available color set U(v _i ) corresponding to each vertex v _i , and randomly select a color from the available color set U(vi ₎ corresponding to each vertex v _i according to the second threshold β, Determine the coloring information c _{i of each vertex v i} _;

消息广播单元，用于对于每个顶点v_i，沿边向其所有邻接顶点v_j发送其着色信息c_i；a message broadcasting unit, for each vertex v _i , sending its coloring information c _i to all its adjacent vertices v _j along the edge;

着色信息确定单元，用于判断是否满足每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同；其中，若满足，则直接确定所述无向图中每个顶点的着色信息；若不满足，则根据运行信息调整第一阈值α和第二阈值β，并进入着色信息集合获取单元重新进行迭代，直至每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同时停止，并确定所述无向图中每个顶点的着色信息。A coloring information determination unit, used to determine whether the coloring information of each vertex v _i is different from the colors of all its adjacent vertices; wherein, if it is satisfied, the coloring information of each vertex in the undirected graph is directly determined; if If not satisfied, then adjust the first threshold α and the second threshold β according to the running information, and enter the coloring information set acquisition unit to iterate again, until the coloring information of each vertex v _i and the color of all its adjacent vertices are different, stop, And determine the coloring information of each vertex in the undirected graph.

优选地，其中所述着色冲突判断单元，还包括：Preferably, the coloring conflict judgment unit further includes:

在对每个顶点v_i进行着色冲突判断时，若不满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点v_j未产生着色冲突，更新s_i＝s_i+1，并直接进入消息广播单元。When judging coloring conflict for each vertex v _i , if the current value rs of the probability generator is not satisfied, the value _rs is less than or equal to the first threshold α and the coloring information of the vertex v _i is c _i ∈ C ^t-1 (vi ₎ And c _i ∈ B ^t (vi ), it is determined that the vertex v _i and its adjacent vertex v _j have no coloring conflict, update s _i _{=s i} ₊₁ , and directly enter the message broadcasting unit.

优选地，其中所述系统还包括：Preferably, wherein the system further comprises:

第一阈值调整单元，用于在进行下一次迭代前，利用如下公式动态地调整所述第一阈值α，包括：A first threshold adjustment unit, configured to dynamically adjust the first threshold α by using the following formula before performing the next iteration, including:

α＝p+(p-dv/maxd)×γ，α=p+(p-dv/maxd)×γ,

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

优选地，其中所述着色单元，确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中随机选择一个颜色，确定每个顶点v_i的着色信息c_i，包括：Preferably, the coloring unit determines the available color set U(v _i ) corresponding to each vertex v _i , and randomly selects a color from the available color set U(vi ₎ corresponding to each vertex v _i , respectively, and determines The shading information c _i of each vertex v _i , including:

第二阈值调整单元，用于在进行下一次迭代前，利用如下公式动态地调整所述第二阈值β，包括：The second threshold adjustment unit is configured to dynamically adjust the second threshold β by using the following formula before the next iteration, including:

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

其中，

本发明提供了一种分布式的自适应图顶点着色方法及系统，针对现有技术在解决着色信息传播延迟和并行着色冲突问题时的不足，在分布式同步计算环境下通过顶点之间的协同处理直接进行着色。其中，针对着色信息传播延迟，考虑到跨步计算技术的适用性和跨步处理远程任务消息的随机性，提出局部消息流式可见技术，允许顶点在着色时，对着同一个任务内、已经完成着色的本地顶点的着色信息(即B^t(v_i))实时可见，以加快信息传播速度且保证正确性，并通过全局同步和记录本地顶点更新顺序来保证计算过程的可复现性；针对并行着色冲突问题，从着色时机和颜色选择两个角度注入可复现的随机扰动(即第一阈值和第二阈值)，以缓解相邻顶点的着色冲突、加速收敛，相应技术可直接对顶点进行着色而无需计算独立集，提高了计算效率，可将基于独立集的图拓扑结构删减转化为顶点颜色状态的多次松弛调整，简化了数据管理操作；此外，在迭代计算过程中，通过实时侦测顶点的着色更改情况，可动态自适应调整随机扰动强度(即第一阈值和第二阈值的大小)，从而在保证算法收敛的前提下进一步提高计算并行度和着色质量。The invention provides a distributed adaptive graph vertex coloring method and system, aiming at the deficiencies of the prior art in solving the problems of coloring information propagation delay and parallel coloring conflict, in a distributed synchronous computing environment, through the coordination between vertices Processing goes directly to the shading. Among them, in view of the propagation delay of coloring information, considering the applicability of stride computing technology and the randomness of stride processing remote task messages, a local message streaming visibility technology is proposed, which allows vertices to be in the same task and have The shading information (ie B ^t (v _i )) of the local vertices that have completed shading is visible in real time to speed up information propagation and ensure correctness, and to ensure the reproducibility of the calculation process by globally synchronizing and recording the update order of local vertices; For the problem of parallel coloring conflicts, reproducible random disturbances (ie, the first threshold and the second threshold) are injected from the perspectives of coloring timing and color selection to alleviate the coloring conflicts of adjacent vertices and accelerate the convergence. The corresponding technology can directly Vertices are colored without calculating independent sets, which improves computational efficiency, and can transform graph topology reduction based on independent sets into multiple relaxation adjustments of vertex color states, which simplifies data management operations; in addition, in the iterative calculation process, By detecting the coloring changes of vertices in real time, the random disturbance intensity (that is, the size of the first threshold and the second threshold) can be dynamically and adaptively adjusted, thereby further improving the computational parallelism and coloring quality while ensuring the convergence of the algorithm.

附图说明Description of drawings

通过参考下面的附图，可以更为完整地理解本发明的示例性实施方式：Exemplary embodiments of the present invention may be more fully understood by reference to the following drawings:

图1为着色振荡的示意图。Figure 1 is a schematic diagram of colored oscillations.

图2为全局跨步消息处理的框架图。Figure 2 is a frame diagram of global stride message processing.

图3为根据本发明实施方式的分布式的自适应图顶点着色方法300的流程图；3 is a flowchart of a distributed adaptive graph vertex shader method 300 according to an embodiment of the present invention;

图4为根据本发明实施方式的局部消息流式可见技术框图；FIG. 4 is a block diagram of a local message streaming visible technology according to an embodiment of the present invention;

图5为根据本发明实施方式的分布式的自适应图顶点着色的流程图；5 is a flowchart of distributed adaptive graph vertex shading according to an embodiment of the present invention;

图6为根据本发明实施方式的分布式的自适应图顶点着色系统600的结构示意图。FIG. 6 is a schematic structural diagram of a distributed adaptive graph vertex shader system 600 according to an embodiment of the present invention.

具体实施方式Detailed ways

现在参考附图介绍本发明的示例性实施方式，然而，本发明可以用许多不同的形式来实施，并且不局限于此处描述的实施例，提供这些实施例是为了详尽地且完全地公开本发明，并且向所属技术领域的技术人员充分传达本发明的范围。对于表示在附图中的示例性实施方式中的术语并不是对本发明的限定。在附图中，相同的单元/元件使用相同的附图标记。Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, however, the present invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided for the purpose of this thorough and complete disclosure invention, and fully convey the scope of the invention to those skilled in the art. The terms used in the exemplary embodiments shown in the drawings are not intended to limit the invention. In the drawings, the same elements/elements are given the same reference numerals.

除非另有说明，此处使用的术语(包括科技术语)对所属技术领域的技术人员具有通常的理解含义。另外，可以理解的是，以通常使用的词典限定的术语，应当被理解为与其相关领域的语境具有一致的含义，而不应该被理解为理想化的或过于正式的意义。Unless otherwise defined, terms (including scientific and technical terms) used herein have the commonly understood meanings to those skilled in the art. In addition, it is to be understood that terms defined in commonly used dictionaries should be construed as having meanings consistent with the context in the related art, and should not be construed as idealized or overly formal meanings.

由于图顶点着色(GVCP,graph vertex coloring problem)要求有公共边相邻的顶点颜色不能相同，一个顶点的着色需要考虑其邻接顶点的颜色选择并相互影响。这意味着所有顶点的一次着色并不能解决GVCP；相反地，需要根据邻接顶点的着色多次调整，即多次迭代、反复改色，从而使顶点着色逐渐稳定，即算法收敛。而为了设计单次迭代的计算逻辑，首先需要分析每个顶点的行为。显然，为了能让顶点完成适当的颜色选择，需要获得其邻接顶点当前的颜色属性值，之后根据邻接顶点的颜色集合选择合适的颜色并实行着色，最后将所修改的结果发送给邻接顶点，以便在下一轮迭代中其邻接顶点进行颜色选择。综上所述，给定一个无向图G(V,E)，在分布式图顶点着色过程中，各顶点v_i的颜色首先被初始化为c_i＝0，之后v_i迭代地更新颜色，直至所有顶点的颜色不再更新。在现有技术中，每步迭代计算中，顶点v_i的行为如下：Since the graph vertex coloring problem (GVCP, graph vertex coloring problem) requires that the colors of adjacent vertices with a common edge cannot be the same, the coloring of a vertex needs to consider the color selection of its adjacent vertices and affect each other. This means that one coloring of all vertices cannot solve GVCP; on the contrary, it is necessary to adjust multiple times according to the coloring of adjacent vertices, that is, multiple iterations and repeated color changes, so that the vertex coloring is gradually stabilized, that is, the algorithm converges. In order to design the calculation logic of a single iteration, the behavior of each vertex needs to be analyzed first. Obviously, in order for a vertex to complete the appropriate color selection, it is necessary to obtain the current color attribute value of its adjacent vertices, then select the appropriate color according to the color set of the adjacent vertices and execute the coloring, and finally send the modified result to the adjacent vertices, so that Its adjacent vertices undergo color selection in the next iteration. To sum up, given an undirected graph G(V,E), in the distributed graph vertex coloring process, the color of each vertex v _i is first initialized to c _i =0, and then v _i iteratively updates the color, Until the colors of all vertices are no longer updated. In the prior art, in each step of iterative calculation, the behavior of vertex v _i is as follows:

Step1，获得邻接顶点在上一步迭代的着色信息集合，即C(v_i)＝{c₁,c₂,…,c_j},v_j∈N(v_i),其中N(v_i)为v_i的邻接顶点集合；Step1, obtain the coloring information set of adjacent vertices in the previous iteration, namely C(v _i )={c ₁ ,c ₂ ,...,c _j },v _j ∈N(v _i ), where N(v _i ) is The set of adjacent vertices of v _i ;

Step2，如果c_i∈C(v_i)，则v_i与邻接顶点产生着色冲突，判定结果为True，需跳转到Step3修改颜色；Step2, if c _i _∈ C(vi ₎ , then vi and adjacent vertices have a coloring conflict, the result is True, and you need to jump to Step 3 to modify the color;

Step3，对顶点v_i,令U(v_i)为可用颜色集合，应有

通常可用最小选择策略，即

使v_i着上最小的可用颜色，进而优化整个算法的色数k以提高着色质量；Step3, for vertex v _i , let U(vi ₎ be the set of available colors, there should be

Usually a minimal selection strategy is available, i.e.

Make v _i the smallest available color, and then optimize the chromatic number k of the whole algorithm to improve the coloring quality;

Step4，以消息(m_ij)的形式沿边向所有邻接顶点v_j发送颜色信息c_i，即m_ij＝<v_j,c_i>,以便邻接顶点在下一轮迭代中继续进行着色冲突的判定(即Step2)和可能的颜色选择(即Step3)。Step4, send color information c _i to all adjacent vertices v _j along the edge in the form of message (m _ij ), that is, m _ij =<v _j , _ci >, so that the adjacent vertices continue to judge the coloring conflict in the next iteration ( i.e. Step2) and possible color choices (i.e. Step3).

在分布式同步计算环境下，上述过程极易产生“着色振荡”现象，即两个或多个散布于不同任务上的相邻顶点并行执行着色冲突判定(Step2)且结果为True时，如果它们的邻接顶点着色信息C完全相同，那么在最小选色策略下，这些顶点将会改成相同颜色；且由于同步控制的约束，这些顶点互不了解对方的着色选择，致使下一轮迭代因产生着色冲突而需要继续改色。这种情形可能会持续数轮迭代，甚至因颜色的重复变换产生着色振荡，使算法陷入死循环、无法收敛。图1展示了这种着色振荡现象。左图中，位于不同任务上的相邻顶点v₁与v₃同时被设置为红色(c₁＝c₃＝1),因此需要改色。由于它们的邻接顶点v₂与v₄已经使用了蓝色(c₂＝c₄＝2),在最小选色策略下，值为3的黄色自然成为当前可选的最优颜色(3＝min{3,4,5,…}).故如右图所示，在并行着色过程中，v₁与v₃将再次同时设置为黄色。但此时它们仍有着色冲突，因此在下一次迭代中又会改为最优选择的红色，从而陷入无限振荡，无法收敛。In the distributed synchronous computing environment, the above process is very prone to the phenomenon of "shading oscillation", that is, when two or more adjacent vertices scattered on different tasks execute the shading conflict judgment (Step 2) in parallel and the result is True, if they The coloring information C of adjacent vertices is exactly the same, then under the minimum color selection strategy, these vertices will be changed to the same color; and due to the constraints of synchronous control, these vertices do not know each other's coloring choices, resulting in the next iteration due to the generation of Coloring conflict and need to continue to change the color. This situation may continue for several rounds of iterations, and even shading oscillations occur due to repeated color transformations, making the algorithm fall into an infinite loop and fail to converge. Figure 1 illustrates this coloring oscillation phenomenon. In the left figure, the adjacent vertices v ₁ and v ₃ on different tasks are set to red at the same time (c ₁ =c ₃ =1), so the color needs to be changed. Since their adjacent vertices v ₂ and v ₄ have already used blue (c ₂ =c ₄ =2), under the minimum color selection strategy, yellow with a value of 3 naturally becomes the currently selected optimal color (3 = min {3,4,5,…}). So as shown on the right, during parallel shading, v ₁ and v ₃ will be set to yellow at the same time again. However, they still have coloring conflicts at this time, so in the next iteration, they will change to the optimal choice of red, and they will fall into infinite oscillation and cannot converge.

GVCP的本质是在满足“相邻顶点着色不同”的约束条件下，对不同顶点的颜色选择所组成的庞大解空间进行搜索，以筛选出符合要求的解。在单机集中式环境下，所有顶点串行着色，且着色顶点在选择颜色过程中可实时获取其邻接顶点的最新颜色信息，从而避免相邻顶点的着色冲突，杜绝着色振荡。然而，由于解空间过于庞大，已有的穷举法、回溯法、贪心法等效率不高，研究者因此提出了蚁群算法和遗传算法等优化方案，以便减少搜索过程的盲目性并尽可能避免陷入局部最优解。The essence of GVCP is to search the huge solution space formed by the color selection of different vertices under the constraint of "different coloring of adjacent vertices" to filter out the solutions that meet the requirements. In a single-machine centralized environment, all vertices are shaded serially, and the shading vertices can obtain the latest color information of their adjacent vertices in real time during the process of color selection, so as to avoid shading conflicts between adjacent vertices and prevent shading oscillations. However, because the solution space is too large, the existing exhaustive methods, backtracking methods, greedy methods, etc. are inefficient. Therefore, researchers have proposed optimization schemes such as ant colony algorithm and genetic algorithm, in order to reduce the blindness of the search process and maximize the efficiency of the search process. Avoid getting stuck in a local optimum.

蚁群算法模拟蚂蚁在寻找食物的过程中通过信息素寻找从蚁穴至食物的最短路径。具体应用时，可将多只蚂蚁放在同一顶点或随机的多个顶点上进行图遍历。当蚂蚁给顶点着色时会释放特定的信息素，而其它蚂蚁会根据信息素强度选择遍历路线。在此基础上，最大最小蚁群搜索算法将信息素强度初值设为最大，并对搜索过程中的强度取值范围做了限定，以减少陷入局部最小的概率。Ant colony algorithm simulates ants searching for the shortest path from ant nest to food through pheromone in the process of searching for food. In specific applications, multiple ants can be placed on the same vertex or multiple random vertices for graph traversal. Specific pheromones are released when ants color vertices, and other ants choose traversal routes based on pheromone strength. On this basis, the maximum and minimum ant colony search algorithm sets the initial value of pheromone strength to the maximum, and limits the range of strength values during the search process to reduce the probability of falling into a local minimum.

遗传搜索基于适者生存、优胜劣汰的原则模拟自然进化过程，通过产生初始种群、选择、杂交与变异以逐渐收敛到最优个体，具有良好的全局搜索能力，不易陷入局部最优解，且具有一定的并行性。后续的改进版本通过重新设计杂交算子、局部搜索算子和适应度函数等手段优化计算效率和解的最优性。Genetic search simulates the natural evolution process based on the principle of survival of the fittest and survival of the fittest. It gradually converges to the optimal individual by generating initial populations, selection, hybridization and mutation. It has good global search ability and is not easy to fall into local optimal solutions. of parallelism. The subsequent improved version optimizes the computational efficiency and the optimality of the solution by redesigning the hybridization operator, the local search operator and the fitness function.

在分布式着色时，分布式着色方案致力于通过强化算力配置来提高着色效率。然而，根据上述分析，相邻顶点极有可能发生着色冲突并导致颜色反复改变，产生振荡现象，致使算法无法收敛。为解决该问题，GraphLab系统的分布式异步计算框架解除了全局同步控制，可随机化不同任务上顶点的着色时机，从而降低相邻顶点同时进行着色选择以及选择相同颜色的概率；而分布式同步计算框架方面，当前主流思想是逐步寻找独立集然后以独立集为单位进行着色以避免着色冲突，之后将独立集中顶点与边删除，重复上述过程，直到所有顶点与边被删除即可，进一步地，可按照顶点度大小对各独立集进行优先级着色，以提高着色质量。In distributed shading, the distributed shading scheme is dedicated to improving shading efficiency by strengthening the configuration of computing power. However, according to the above analysis, it is very likely that adjacent vertices have coloring conflicts and cause the colors to change repeatedly, resulting in oscillation, which makes the algorithm unable to converge. In order to solve this problem, the distributed asynchronous computing framework of the GraphLab system removes the global synchronization control, which can randomize the coloring timing of vertices on different tasks, thereby reducing the probability of adjacent vertices coloring and selecting the same color at the same time; while distributed synchronization In terms of computing framework, the current mainstream idea is to gradually find independent sets and then use independent sets as a unit to color to avoid coloring conflicts, then delete vertices and edges from the independent set, and repeat the above process until all vertices and edges are deleted. , which can prioritize shading for each independent set according to the vertex degree size to improve shading quality.

关于着色信息的实时性方面，GraphLab的异步计算框架允许顶点随时启动着色操作并利用当前所有收到的、最新的邻接顶点着色消息，实时性最佳。同步计算方面，由于全局路障的存在，当前迭代步发送的消息只能在下一个迭代步才可以使用，但受异步计算机制启发，已有研究人员提出全局跨步消息处理和以块为中心的局部异步计算技术，以便对单源最短路径等支持消息累加计算的算法进行消息规模剪枝或加快消息传播速度。不同之处在于，前者允许更新顶点时除了处理上一个迭代步收到的消息，还可以处理当前迭代步已经收到的、原本用于下一步迭代的、来自于顶点所在任务和其它任务的全部消息数据；后者仅允许同一个任务上的顶点之间直接进行通信计算而不受同步路障的制约。Regarding the real-time aspect of shading information, GraphLab's asynchronous computing framework allows vertices to start shading operations at any time and utilize all currently received, latest adjacent vertex shading messages, with the best real-time performance. In terms of synchronous computing, due to the existence of global roadblocks, the messages sent by the current iteration step can only be used in the next iteration step. However, inspired by the asynchronous computing mechanism, some researchers have proposed global stride message processing and block-centered local Asynchronous computing technology to prune message scale or speed up message propagation for algorithms that support message accumulation computing, such as single-source shortest path. The difference is that the former allows updating vertices, in addition to processing the messages received in the previous iteration, but also processing all messages that have been received in the current iteration, originally used for the next iteration, from the task where the vertex is located and other tasks. Message data; the latter only allows direct communication computations between vertices on the same task without the constraints of synchronization roadblocks.

图2展示了全局跨步消息处理流程。假设有3个分布式任务，则按照正常的同步迭代约束机制，在第l步迭代计算中，各任务在更新本地顶点时，仅可以使用第(l-1)步收到的消息C^l-1。然而，各任务内部的顶点更新计算是串行的，已经完成更新的顶点会立即将新消息广播给邻接顶点，而这些消息有可能在其邻接顶点被调度执行第l步迭代计算之前被接收，记为B^l.全局跨步处理即允许顶点在执行更新计算时，同时利用C^l-1和B^l,如果后者已经被接收、可用。同时，因为B^l中的消息已被使用，将被彻底清除。这意味着在第(l+1)步计算中，这些消息将不可见，所以需要从C^l中去除B^l中的消息，即C^l-B^l。Figure 2 shows the global stride message processing flow. Assuming there are 3 distributed tasks, according to the normal synchronous iterative constraint mechanism, in the iterative calculation in step l, each task can only use the message C ^{l- 1} . However, the vertex update calculation within each task is serial, and the updated vertex will immediately broadcast new messages to the adjacent vertices, and these messages may be received before the adjacent vertices are scheduled to perform the first iteration calculation. Denoted as B ^l . The global stride process allows a vertex to use C ^l-1 and B ^l at the same time when performing update calculations, if the latter has been received and available. At the same time, because the message in ^Bl has been used, it will be completely cleared. This means that in the calculation of step (l+1), these messages will not be visible, so the messages in B ^l need to be removed from C ^l , ie C ^l -B ^l .

目前主流的集中式着色算法已在处理效率和着色质量方面进行了深度优化，但在大数据时代，面对动辄上亿顶点、几十亿条边且规模仍在快速增长的大图，受限于单机的硬件配置，其计算效率远远不能满足经济社会的发展需求；分布式计算为提升计算效率提供了有效的解决途径，但为解决相邻顶点并行着色导致的振荡问题，需采用异步计算或基于独立集的分批着色策略，前者因破坏了全局同步且顶点的更新时机完全随机，导致计算过程难以复现，对程序调试和容错控制构成极大挑战；而后者在独立集计算阶段需要引入额外迭代处理并维护其计算结果以及因此导致的图结构的动态删减，这使得处理过程复杂而耗时。特别地，现有的跨步计算技术(包括以块为中心的计算)仅支持消息可累加合并的算法，如单源最短路径、连通域计算和增量PageRank计算等，也即，如果一个顶点的值在相邻两次迭代计算中未发生变化，则它不必重复发送顶点值给邻接顶点。然而，在顶点着色算法中，即使一个顶点的颜色在相邻两次迭代计算中没有发生变化，但其邻接顶点的其它邻居的颜色可能发生变化，所以它仍需发送自己的颜色值给邻接顶点，以便后者进行正确地的着色选择。因此，现有跨步计算并不适用于顶点着色问题。此外，现有跨步技术下，所有已经收到的消息均参与计算，但来自于其它任务的远程消息由于网络原因，其到达时间具有随机性，导致计算过程难以复现，极大增加了程序调试和容错控制的难度。At present, the mainstream centralized shading algorithms have been deeply optimized in terms of processing efficiency and shading quality. However, in the era of big data, in the face of large graphs with hundreds of millions of vertices, billions of edges, and the scale is still growing rapidly, limited Due to the hardware configuration of a single machine, its computing efficiency is far from meeting the needs of economic and social development; distributed computing provides an effective solution for improving computing efficiency, but to solve the oscillation problem caused by the parallel coloring of adjacent vertices, asynchronous computing is required. Or the batch shading strategy based on independent sets. The former destroys the global synchronization and the update timing of the vertices is completely random, which makes the calculation process difficult to reproduce, which poses a great challenge to program debugging and fault-tolerant control; while the latter requires the independent set calculation stage. The introduction of additional iterative processing and maintenance of its computational results and the resulting dynamic pruning of the graph structure makes the processing complex and time-consuming. In particular, existing stride computation techniques (including block-centric computations) only support algorithms for accumulatively combining messages, such as single-source shortest paths, connected domain computations, and incremental PageRank computations, that is, if a vertex The value of is unchanged between two adjacent iterations, so it does not have to repeatedly send vertex values to adjacent vertices. However, in the vertex coloring algorithm, even if the color of a vertex does not change in two adjacent iterations, the colors of other neighbors of its adjacent vertices may change, so it still needs to send its own color value to the adjacent vertices. , so that the latter makes the correct shading selection. Therefore, existing stride calculations are not suitable for vertex shading problems. In addition, under the existing stride technology, all received messages are involved in the calculation, but the arrival time of remote messages from other tasks is random due to network reasons, which makes the calculation process difficult to reproduce and greatly increases the number of programs. Difficulty of debugging and fault tolerance control.

针对现有技术在解决着色信息传播延迟和并行着色冲突问题时的不足，本发明在分布式同步计算环境下通过顶点之间的协同处理直接进行着色。其中，针对着色信息传播延迟，考虑到跨步计算技术的适用性和跨步处理远程任务消息的随机性，提出局部消息流式可见技术，允许顶点在着色时，对着同一个任务内、已经完成着色的本地顶点的着色信息(即B^t(v_i))实时可见，以加快信息传播速度且保证正确性，并通过全局同步和记录本地顶点更新顺序来保证计算过程的可复现性；针对并行着色冲突问题，提出两种不同的着色技术，从着色时机和颜色选择两个角度注入可复现的随机扰动(即第一阈值和第二阈值)，以缓解相邻顶点的着色冲突、加速收敛，相应技术可直接对顶点进行着色而无需计算独立集，提高了计算效率，可将基于独立集的图拓扑结构删减转化为顶点颜色状态的多次松弛调整，简化了数据管理操作；此外，在迭代计算过程中，通过实时侦测顶点的着色更改情况，可动态自适应调整随机扰动强度(即第一阈值和第二阈值的大小)，从而在保证算法收敛的前提下进一步提高计算并行度和着色质量。Aiming at the deficiencies of the prior art in solving the problems of coloring information propagation delay and parallel coloring conflict, the present invention directly performs coloring through cooperative processing between vertices in a distributed synchronous computing environment. Among them, in view of the propagation delay of coloring information, considering the applicability of stride computing technology and the randomness of stride processing remote task messages, a local message streaming visibility technology is proposed, which allows vertices to be in the same task and have The shading information (ie B ^t (v _i )) of the local vertices that have completed shading is visible in real time to speed up information propagation and ensure correctness, and to ensure the reproducibility of the calculation process by globally synchronizing and recording the update order of local vertices; Aiming at the problem of parallel coloring conflicts, two different coloring techniques are proposed, which inject reproducible random disturbances (ie, the first threshold and the second threshold) from the perspectives of coloring timing and color selection to alleviate the coloring conflicts of adjacent vertices, Accelerates convergence. Corresponding technology can directly color vertices without calculating independent sets, which improves computing efficiency, and can convert graph topology deletion based on independent sets into multiple relaxation adjustments of vertex color states, simplifying data management operations; In addition, in the iterative calculation process, by detecting the coloring changes of vertices in real time, the random disturbance intensity (that is, the size of the first threshold and the second threshold) can be dynamically and adaptively adjusted, so as to further improve the calculation on the premise of ensuring the convergence of the algorithm. Parallelism and shading quality.

图3为根据本发明实施方式的分布式的自适应图顶点着色方法300的流程图。如图3所示，本发明实施方式提供的分布式的自适应图顶点着色方法300，从步骤301处开始，在步骤301分别获取无向图中每个顶点v_i的所有邻接顶点在上一次迭代过程中的着色信息集合C^t-1(v_i)；其中，t为当前的迭代次数。FIG. 3 is a flowchart of a distributed adaptive graph vertex shading method 300 according to an embodiment of the present invention. As shown in FIG. 3 , the distributed adaptive graph vertex coloring method 300 provided by the embodiment of the present invention starts from step 301, and in step 301, respectively obtains all adjacent vertices of each vertex v _i The coloring information set C ^t-1 ( _vi ) in the iterative process; wherein, t is the current number of iterations.

在步骤302，对每个顶点v_i进行着色冲突判断；其中，对于任一个顶点vi，若满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点产生着色冲突，更新该顶点vi对应的状态累积监控变量s_i为初始阈值；其中，s_i用于记录该顶点v_i的着色信息连续未发生变化的迭代次数；B^t(v_i)为该顶点v_i在更新时刻已经收到的来自于当前迭代步的且位于该顶点v_i所在任务的部分邻接顶点的着色信息。In step 302, coloring conflict judgment is performed on each vertex vi; wherein, for any vertex vi, if the current value _rs of the probability generator is less than or equal to the first threshold α and the coloring information c _i of the vertex _vi is _satisfied ∈C ^t-1 (v _i ) and c _i ∈ B ^t (v _i ), then it is determined that the vertex vi and its adjacent vertices have a color conflict _{, and the state accumulation monitoring variable si} _{corresponding} to the vertex vi is updated to be the initial threshold; , s _i is used to record the number of iterations that the coloring information of the vertex v _i has not changed continuously; B ^t (vi _{) is the current iteration step that the vertex v i} _has received at the update time and is located at the vertex v The shading information of the part of the adjacent vertices of the task where _i is located.

在对每个顶点v_i进行着色冲突判断时，若不满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点v_j未产生着色冲突，更新s_i＝s_i+1，并直接进入步骤304。When judging coloring conflict for each vertex v _i , if the current value rs of the probability generator is not satisfied, the value _rs is less than or equal to the first threshold α and the coloring information of the vertex v _i is c _i ∈ C ^t-1 (vi ₎ And c _i ∈ B ^t (vi ), it is determined that the vertex v _i and its adjacent vertex v _j have no coloring conflict, update s _i _{=s i} ₊₁ , and go to step 304 directly.

在本发明的实施方式中，基于局部消息流式可见的着色传播加速技术实现着色信息的实时传播。本技术的核心要点是解除现有同步迭代模型对消息可见性的严格约束，通过提前窥测相同任务内邻接顶点的着色信息来帮助本顶点选择更为恰当的颜色，以此达到加快着色信息传播速度的效果。In the embodiment of the present invention, the real-time propagation of shading information is realized based on the shading propagation acceleration technology with visible local message flow. The core point of this technology is to remove the strict constraints on message visibility imposed by the existing synchronous iterative model, and to help this vertex select a more appropriate color by detecting the coloring information of adjacent vertices in the same task in advance, so as to speed up the speed of coloring information dissemination Effect.

在关于分布式图顶点着色的实现描述中，在现有全局同步模型的约束下，第l步迭代计算中，顶点v_i仅根据其邻接顶点在第(l-1)步发送过来的着色信息集合C^l-1(v_i)进行颜色选择，然后广播给其邻接顶点；与此同时，v_i的部分邻接顶点可能先于v_i完成了本步的着色选择且其广播的消息已经被成功接收并放入C^l(v_i),以备在第(l+1)步迭代中用来更新v_i的着色。然而，根据C^l-1(v_i),在第l步迭代计算中，v_i可能与其邻接顶点选择相同的颜色，即c_i∈C^l(v_i),产生着色冲突。显然，如果v_i在更新着色时，能够同时参考C^l-1(v_i)(来自于全部邻接顶点的完整着色信息)和B^l(v_i)(在v_i更新时刻、已经收到的、来自于当前迭代步的且位于该顶点vi所在任务的部分邻接顶点的着色信息)中的信息，可以避免此类着色冲突。换言之，隶属于同一个任务的、先于v_i更新的本地顶点所发送的消息，将流式地传送到v_i且在其更新着色时可予以参考。设A为全部可用颜色集合，则颜色选择方式可由公式(1)给出。In the implementation description of distributed graph vertex coloring, under the constraints of the existing global synchronization model, in the iterative calculation in the lth step, the vertex v _i is only based on the coloring information sent by its adjacent vertices in the (l-1)th step. Set C ^l-1 (vi ) for color selection, and then broadcast to its adjacent vertices; at the same time _, some adjacent vertices of v _i may have completed the coloring selection of this step before _vi and its broadcast message has been successfully Receive and put in C ^l (vi ) to be used to update the coloring of v _i in the iteration of step (l+1 ₎ . However, according to C ^l-1 (vi ), in the iterative calculation in step l, v _i may choose the same color as its adjacent vertices, that is, c _i _∈ C ^l (vi ₎ , resulting in a coloring conflict. Obviously, if v _i is updating the shading, it can simultaneously refer to C ^l-1 (vi ) (complete shading information from all adjacent vertices) and B ^l (vi ₎ _{(at the time of v i} _update , the received , information from the current iteration step and located in the shading information of some adjacent vertices of the task where the vertex vi is located), such shading conflicts can be avoided. In other words, messages sent by local vertices belonging to the same task that are updated prior to v _i are streamed to v _i and can be referenced as they update shading. Let A be the set of all available colors, then the color selection method can be given by formula (1).

特别地，B^l(v_i)中的消息是只读的，不可被删除，即仅允许v_i可见地参考着色信息。否则，在第(l+1)步迭代中，会缺失对应邻接顶点的着色信息，导致v_i进行颜色选择时，U(v_i)被错误地放大，出现着色错误，因此，还需要根据v_i更新时刻、已经收到的、来自于当前迭代步的且位于该顶点vi所在任务的部分邻接顶点的着色信息B^l(v_i)进行着色冲突的判断。此外，消息流式可见的范围被限定为局部，而非全局，即只有来自于同一个任务的本地顶点的消息可以参与v_i的更新计算。这是因为分布式环境下，每个任务负责处理的顶点是固定的，且处理方式为串行，易于控制本地更新顺序，进而复现计算过程。比如，可以按照本地任务所负责处理的顶点编号大小、度大小等进行排序，然后在每步迭代中均按照同一顺序进行顶点计算。由于顶点之间更新的先后顺序是固定的，因此对每个顶点而言，其B^l(v_i)中可用于第l步迭代的消息也是固定的，是可以严格复现的。相反地，如果允许B^l(v_i)中来自于其它任务的消息也参与本步更新，则由于不同任务上顶点更新顺序的不确定性以及网络延迟的不可控性，在两次不同的迭代计算过程中，几乎不可能在计算v_i时严格复现B^l(v_i)中的消息。In particular, the messages in B ^l (vi _{) are read-only and cannot be deleted, ie only v i} _is allowed to visibly refer to the shading information. Otherwise, in the iteration of step (l+1), the coloring information corresponding to the adjacent vertices will be missing, resulting in U( _vi ) being wrongly enlarged when vi _selects the color, resulting in a coloring error. At the time of _i update, the coloring information B ^l (vi ₎ that has been received from the current iteration step and is located at some adjacent vertices of the task where the vertex vi is located is used to judge the coloring conflict. In addition, the visible range of message flow is limited to local, not global, that is, only messages from local vertices of the same task can participate in the update calculation of v _i . This is because in a distributed environment, the vertices that each task is responsible for processing is fixed, and the processing method is serial, which is easy to control the local update order, and then reproduce the calculation process. For example, it can be sorted according to the vertex number size, degree size, etc. that the local task is responsible for processing, and then the vertex calculation is performed in the same order in each iteration. Since the update sequence between vertices is fixed, for each vertex, the messages available for the first iteration in its B ^l ( _vi ) are also fixed and can be strictly reproduced. Conversely, if messages from other tasks in B ^l (v _i ) are allowed to also participate in the update of this step, then due to the uncertainty of the vertex update order on different tasks and the uncontrollability of network delay, in two different iterations During the calculation, it is almost impossible to strictly reproduce the message in B ^l (vi ₎ when calculating v _i .

图4给出了局部消息流式可见技术的实现框架。对比图2中的全局消息跨步技术，主要有两点不同：(1)第l步接收的消息B^l作为只读消息，对相关目的顶点而言是可见的，仅可用于指导颜色选择，而不会被彻底处理，因此在第(l+1)迭代，它们仍有效，无需从C^l中去除，这种设计可以保证第(l+1)步迭代中顶点可以获得完整的邻接顶点着色信息，否则无法正确进行颜色选择；(2)跨步可见的消息范围仅局限于本地任务生成的消息，而不包括其它远程任务，比如任务1的跨步可见消息不包括来自任务2和任务3的远程消息，以确保计算过程中不会引入不可控随机性因素，从而保证计算的可复现性。Figure 4 shows the implementation framework of the local message streaming visibility technology. Compared with the global message striding technology in Figure 2, there are two main differences: (1) The message B1 received in step ¹ is a read-only message, which is visible to the relevant destination vertex and can only be used to guide color selection, are not completely processed, so at iteration (l+1), they are still valid and do not need to be removed from C ^l , this design ensures that vertices can get full adjacency vertex coloring at iteration (l+1) information, otherwise the color selection cannot be performed correctly; (2) the range of visible messages by stride is limited to the messages generated by the local task, excluding other remote tasks, such as the visible messages of task 1 do not include the messages from task 2 and task 3 To ensure that no uncontrollable random factors are introduced in the calculation process, so as to ensure the reproducibility of the calculation.

本发明实施方式采用基于弹性冲突判定与颜色选择的并行着色技术。弹性着色技术的核心思想是在着色判定和颜色选择两个阶段(即Step2和Step3)分别注入弹性的随机扰动，以随机化相邻顶点的着色时机和颜色选择，从而降低着色冲突概率、缓解“着色振荡”。The embodiment of the present invention adopts a parallel coloring technology based on elastic conflict determination and color selection. The core idea of the elastic coloring technology is to inject elastic random disturbances in the two stages of coloring determination and color selection (ie, Step2 and Step3), to randomize the coloring timing and color selection of adjacent vertices, thereby reducing the probability of coloring conflict and alleviating " Shading Oscillation".

对于弹性冲突判定，在一次迭代计算过程中，顶点将以一定概率跳过Step2和Step3，即v_i不对收到的邻接顶点着色信息C^l-1(v_i)进行冲突判定以及发生冲突后的颜色重新选择。如是，假设随机跳过(random skipping)概率发生器为r_s(0≤r_s≤1),跳过阈值为α，则Step2的执行逻辑改为：For elastic conflict determination, in an iterative calculation process, the vertex will skip Step2 and Step3 with a certain probability, that is, _vi does not perform conflict determination on the received adjacent vertex coloring information C ^l-1 ( _vi ) and after the conflict occurs Color re-selection. If so, assuming that the random skipping probability generator is _rs ( _0≤rs ≤1) and the skipping threshold is α, the execution logic of Step2 is changed to:

Step2，对于任一个顶点v_i，如果满足r_s≤α∧c_i∈C(v_i)∧c_i∈B^t(v_i)，则确定该顶点v_i与邻接顶点产生着色冲突，判定结果为True，需跳转到Step3修改颜色。Step2, for any vertex v _i , if r _s ≤α∧c _i ∈C(v _i )∧c _i ∈B ^t (v _i ), it is determined that the vertex v _i has a color conflict with the adjacent vertex, and the result is determined If it is True, you need to jump to Step3 to modify the color.

添加跳过概率条件后，位于不同任务上的相邻顶点会以较低的概率同时进行着色，进而避免二者选择相同颜色。需要注意的是，Step4仍正常执行，以便v_i的邻接顶点能够在下一步迭代中进行正常计算。这也意味着，即使v_i在第l步跳过了Step2和Step3，但在第(l+1)步，它仍可以收到完整的邻接顶点着色信息C^l+1(v_i)，以便在r_s≤α时能够正确执行着色冲突判定。After adding the skip probability condition, adjacent vertices on different tasks will be colored at the same time with a low probability, thus preventing them from choosing the same color. It should be noted that Step4 is still executed normally, so that the adjacent vertices of _vi can be calculated normally in the next iteration. This also means that even if vi skips Step2 and _Step3 at step l, at step (l+1), it can still receive the complete adjacent vertex coloring information C ^l+1 ( _vi ), so that Coloring conflict determination can be correctly performed when r _s ≤ α.

因此，在本发明的实施方式中，首先，所有任务分别获取无向图中每个顶点v_i的所有邻接顶点在上一次(l-1次)迭代过程中的着色信息集合C^l-1(v_i)；其中，l为当前的迭代次数。然后，对每个顶点v_i进行着色冲突判断，其中，对于任一个顶点vi，若满足当前的概率发生器的值r_s≤α∧c_i∈C^t-1(v_i)∧c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点产生着色冲突，更新该顶点vi对应的状态累积监控变量s_i为初始阈值，并依次进行着色和消息广播的步骤；其中，s_i用于记录该顶点v_i的着色信息连续未发生变化的迭代次数；B^t(v_i)为该顶点v_i在更新时刻已经收到的来自于当前迭代步的且位于该顶点vi所在任务的部分邻接顶点的着色信息；α为第一阈值；若不满足，则更新s_i＝s_i+1,并直接进入消息广播的步骤。Therefore, in the embodiment of the present invention, first, all tasks respectively obtain the _coloring information set C ^1-1 ( v _i ); where, l is the current number of iterations. Then, perform coloring conflict judgment on each vertex vi, wherein, for any vertex vi, if the current value of the probability generator _rs _≤α∧ci _∈C ^t-1 (vi ) _∧ci _∈ is satisfied B ^t (v _i ), then determine that the vertex v _i and its adjacent vertices have a coloring conflict, update the state accumulation monitoring variable _si corresponding to the vertex vi to be the initial threshold, and perform the steps of coloring and message broadcasting in turn; wherein, s _i It is used to record the number of iterations that the coloring information of the vertex _vi has not changed continuously; B ^t ( _vi ) is the number of iterations that the vertex _vi has received at the update time from the current iteration step and is located in the task where the vertex vi is located. Coloring information of some adjacent vertices; α is the first threshold; if not satisfied, update s _i =s _i +1, and directly enter the step of message broadcasting.

在步骤303，确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中根据第二阈值β随机选择一个颜色，确定每个顶点v_i的着色信息c_i。In step 303, determine the available color set U(vi) corresponding to each vertex v _i _, select a color randomly from the available color set U(vi ₎ corresponding to each vertex v _i according to the second threshold β, and determine Shading information c _{i for each vertex v i} _.

在本发明的实施方式中，在弹性着色选择时，不再使用统一的最小选择策略

而是从可用颜色集合U(v_i)中随机选取一种颜色，即

具体地，可先将U(v_i)中颜色按照颜色编号升序排序，然后将选择范围限定到top-β个颜色并进行随机选择。一种简单有效的随机选择方式为取余法，即将top-β个颜色放入数组中，然后将随机着色器r_c(random coloring)的值模β取余后作为数组下标进行选择，即c_i＝arrayOfTop-β[r_c modβ].特别地，当|U(v_i)|<β时，需进行颜色扩充，即将已用的最大颜色值以β为单位步长自增，然后加入集合U(v_i)中，以确保颜色选择的随机性。因此，本发明实施方式对应的Step3操作如下：In an embodiment of the present invention, a unified minimum selection strategy is no longer used when selecting elastically colored

Instead, a color is randomly selected from the set of available colors U(v _i ), namely

Specifically, the colors in U(v _i ) can be sorted in ascending order of color numbers, and then the selection range is limited to the top-β colors and selected randomly. A simple and effective random selection method is the remainder method, that is, put the top-β colors into an array, and then take the remainder of the value modulo β of the random shader _rc (random coloring) and select it as an array subscript, that is, c _i =arrayOfTop-β[r _c _modβ ]. In particular, when |U(vi )|<β, color expansion needs to be performed, that is, the maximum color value used is incremented in steps of β, and then added set U( _vi ) to ensure the randomness of color selection. Therefore, the operation of Step 3 corresponding to the embodiment of the present invention is as follows:

Step3，对顶点v_i,令U(v_i)为可用颜色集合，应有

若|U(v_i)|<β,则进行颜色扩充，之后按照升序选择top-β个颜色进行随机选色c_i＝arrayOfTop-β[r_c modβ]。Step3, for vertex v _i , let U(vi ₎ be the set of available colors, there should be

If |U(v _i )|<β, color expansion is performed, and then top-β colors are selected in ascending order for random color selection _c _i =arrayOfTop-β[rc modβ].

显然，在弹性着色的冲突判定和颜色选择过程中，均注入了随机扰动，即r_s和r_c,这对计算的可复现性提出了挑战。为解决该问题，可以利用计算机编程语言所提供的随机数发生器的伪随机性，即如果给定相同的初始化种子，则随机数发生器在多次作业执行过程中，产生的随机数序列是相同的。据此，可将各个任务上的随机数生成器种子设定为任务编号，以确保各个任务上的随机数生成序列不尽相同，从而实现着色时机或颜色选择的差异性。另一方面，如需进行程序调试或容错，仅需将随机数生成器的种子进行相同设定并执行指定次数的随机发生操作，即可复现计算过程。需要强调的是，弹性冲突判定与弹性着色选择可同时使用，以进一步降低着色冲突的可能性。Obviously, random disturbances, namely _rs and _rc , are injected into the conflict determination and color selection process of elastic coloring, which challenges the reproducibility of the calculation. To solve this problem, the pseudo-randomness of the random number generator provided by the computer programming language can be used, that is, if the same initialization seed is given, the random number sequence generated by the random number generator during the execution of multiple jobs is identical. Accordingly, the random number generator seed on each task can be set as the task number to ensure that the random number generation sequence on each task is not the same, so as to realize the difference of coloring timing or color selection. On the other hand, for program debugging or fault tolerance, the calculation process can be reproduced only by setting the seed of the random number generator to the same setting and performing a specified number of random operations. It should be emphasized that elastic conflict determination and elastic coloring selection can be used at the same time to further reduce the possibility of coloring conflicts.

在本发明的实施方式中，在确定顶点v_i着色冲突时，确定每个顶点v_i对应的可用颜色集合U(v_i)，对于任一个顶点v_i,若|U(v_i)|小于第二阈值β，则进行颜色扩充，将已用的最大颜色值以β为单位步长自增，并加入到所述可用颜色集合U(v_i)中；将每个顶点v_i对应的当前的可用颜色集合U(v_i)中的颜色按照颜色编号升序排列，选取top-β个颜色放入数组中，并根据随机着色器的值r_c模β取余后作为数组下标进行颜色选择，确定每个顶点v_i的着色信息为c_{i_}＝arrayOfTop-β[r_c modβ]。In the embodiment of the present invention, when determining the coloring conflict of the vertex v _i , the available color set U(vi ₎ corresponding to each vertex v _i is determined. For any vertex v _i , if |U(vi ₎ | is less than The second threshold β, then color expansion is performed, and the maximum used color value is automatically incremented with β as the unit step, and added to the available color set U(vi ); the current corresponding to each vertex v _i is added _. The colors in the available color set U(v _i ) are arranged in ascending order of color numbers, and the top-β colors are selected and put into the array, and the remainder is taken as the subscript of the array according to the value of the random shader r _c modulo β. , determine the coloring information of each vertex v _i as _c _{i_} =arrayOfTop-β[rc modβ].

α＝p+(p-dv/maxd)×γ，α=p+(p-dv/maxd)×γ,

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

其中，

在本发明的实施方式中，在弹性并行着色技术中，冲突判定与颜色选择均需注入随机扰动，而扰动程度则由参数α和β决定。两种参数的设定会影响着色并行度和着色质量。因此，需要根据迭代过程中的收敛程度进行动态调整优化。In the embodiment of the present invention, in the elastic parallel coloring technology, both conflict determination and color selection need to inject random disturbance, and the degree of disturbance is determined by parameters α and β. The settings of the two parameters will affect the shading parallelism and shading quality. Therefore, it is necessary to dynamically adjust and optimize according to the degree of convergence in the iterative process.

具体地，对于弹性冲突判定，如果跳过阈值α过低，则大量顶点会跳过冲突判定，使其颜色收敛推迟到后续迭代步中，并行计算的程度降低，延长了迭代收敛所需时间；而阈值过高，则大部分顶点均参与冲突判定，难以实现差异化着色时机的目标。因此，寻找最佳跳过概率(即α)至关重要。由于不同图数据的规模和拓扑结构不同，其最佳跳过概率也不尽相同，难以统一精确测量。特别地，对于真实图数据，考虑到顶点度的分布具有幂律偏斜特点，即大部分顶点的度数较低而少部分顶点的度数极高。显然，高度顶点因邻居众多，发生着色冲突的概率较大。因此可设置区别化跳过策略，即对低度顶点设置较高的α以便大多数顶点快速着色以使算法快速收敛；而对顶点度数较高的顶点则设置较低的α以防止频繁振荡的发生。具体地，经实验测试，可选择以50％作为基本阈值p，根据不同顶点的度数进行区别化设置，其计算方式可由公式(2)给出。其中，d_v为当前顶点度数，maxd为给定图中所有顶点的最大度数，γ是取值为0到1的收敛调控参数。Specifically, for elastic conflict judgment, if the skip threshold α is too low, a large number of vertices will skip the conflict judgment, so that the color convergence is delayed to the subsequent iteration steps, the degree of parallel calculation is reduced, and the time required for iterative convergence is prolonged; If the threshold is too high, most of the vertices are involved in conflict determination, making it difficult to achieve the goal of differentiated coloring timing. Therefore, finding the optimal skip probability (i.e. α) is crucial. Due to the different scales and topological structures of different graph data, the optimal skip probabilities are also different, so it is difficult to measure them uniformly and accurately. In particular, for real graph data, it is considered that the distribution of vertex degrees has a power-law skew characteristic, that is, most vertices have a low degree and a few vertices have a very high degree. Obviously, because of the large number of neighbors, the height vertex has a high probability of coloring conflict. Therefore, a differentiated skip strategy can be set, that is, a higher α is set for low-degree vertices so that most vertices are quickly colored to make the algorithm converge quickly; and a lower α is set for vertices with higher degrees to prevent frequent oscillations. occur. Specifically, through experimental tests, 50% can be selected as the basic threshold p, and differentiated settings are made according to the degrees of different vertices, and the calculation method can be given by formula (2). where d _v is the current vertex degree, maxd is the maximum degree of all vertices in a given graph, and γ is a convergence control parameter ranging from 0 to 1.

α＝0.5+(0.5-d_v/maxd)×γ (2)α=0.5+(0.5-d _v /maxd)×γ(2)

显然，对于低度数顶点，α趋于1，降低了跳过概率；而对高度数顶点，α趋于0，增大了跳过概率以避免着色冲突，但这种跳过操作显然会降低并行计算度，使算法所需总迭代次数增加。特别地，对于γ,其设定应由迭代收敛程度来确定。直观上，当大部分顶点的颜色已经收敛稳定时，发生着色冲突的可能性会降低，可将概率阈值调高以降低跳过概率。而收敛程度可通过连续监测顶点着色状态是否发生变更来度量，具体计算方式如公式(3)所示：Obviously, for low-degree vertices, α tends to 1, reducing the skip probability; while for high-degree vertices, α tends to 0, increasing the skip probability to avoid coloring conflicts, but this skip operation obviously reduces parallelism The degree of computation increases the total number of iterations required by the algorithm. In particular, for γ, its setting should be determined by the degree of iterative convergence. Intuitively, when the colors of most vertices have converged and stabilized, the possibility of coloring conflicts will decrease, and the probability threshold can be adjusted higher to reduce the skip probability. The degree of convergence can be measured by continuously monitoring whether the vertex shading state has changed. The specific calculation method is shown in formula (3):

γ＝|V^x|/|V| (3)γ=|V ^x |/|V| (3)

其中，|V|为给定图上的全部顶点数目，而|V^x|为连续x步迭代计算中颜色均未发生变化的顶点数目。根据实验测试，一般取x＝5即可获得较好的调控效果。where |V| is the total number of vertices on a given graph, and |V ^x | is the number of vertices whose color has not changed in successive x iterations. According to the experimental test, generally taking x=5 can obtain a better control effect.

针对弹性着色选择，随机选色的范围(即参数β)至关重要。若随机范围太小，则选择相同颜色的概率增加，容易发生振荡现象，导致算法的收敛速度降低；而若随机范围过大，顶点可能会着上较大的颜色，降低了算法结果的着色质量(即所用的总颜色数目k较大)。为均衡着色质量和收敛速度，类比于α的动态调整，可根据算法收敛程度动态设定β值。直观上，当大部分顶点的着色状态稳定后，可适当收窄随机选色范围，以提高着色质量。根据大量真实图上的实测结果，当选取10作为范围基数并进行动态调整时，可以取得较好的均衡效果。故β的计算方式为

其中

为向上取整数运算符，而γ的具体值由公式(3)给出。For elastic coloring selection, the range of random color selection (ie, parameter β) is critical. If the random range is too small, the probability of selecting the same color increases, and oscillation is prone to occur, resulting in a decrease in the convergence speed of the algorithm; if the random range is too large, the vertices may be colored with larger colors, reducing the coloring quality of the algorithm results. (ie the total number of colors k used is larger). In order to balance the shading quality and convergence speed, analogous to the dynamic adjustment of α, the β value can be dynamically set according to the degree of algorithm convergence. Intuitively, when the shading state of most vertices is stable, the random color selection range can be appropriately narrowed to improve the shading quality. According to the measured results on a large number of real maps, when 10 is selected as the range base and dynamically adjusted, a better balanced effect can be achieved. So β is calculated as

in

is an upward integer operator, and the specific value of γ is given by formula (3).

在步骤304，对于每个顶点v_i，沿边向其所有邻接顶点v_j发送其着色信息c_i。At step 304, for each vertex v _i , its shading information c _i is sent to all its adjacent vertices v _j along the edge.

在步骤305，判断是否满足每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同；其中，若满足，则直接确定所述无向图中每个顶点的着色信息；若不满足，则根据运行信息调整第一阈值α和第二阈值β，并返回步骤1重新进行迭代，直至每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同时停止，并确定所述无向图中每个顶点的着色信息。In step 305, it is judged whether the coloring information of each vertex v _i is different from the color of all its adjacent vertices; wherein, if it is satisfied, the coloring information of each vertex in the undirected graph is directly determined; if not, the coloring information of each vertex in the undirected graph is directly determined; Then adjust the first threshold α and the second threshold β according to the running information, and go back to step 1 to re-iterate until the coloring information of each vertex v _i and the colors of all its adjacent vertices are different. Stop, and determine that the undirected Shading information for each vertex in the graph.

如图5所示，为包含m个任务的分布式自适应图顶点着色技术的整体实现框架。迭代计算开始前，各任务首先并行加载图数据并进行初始参数的设置，后者主要包括：将随机数发生器r_s和r_c的种子设定为任务编号，给定初始选择阈值α与β.之后在迭代计算中，按照准备消息、冲突判定、颜色选择和新消息广播四个步骤进行循环处理。其中，在冲突判定和颜色选择阶段注入可复现的随机扰动，以缓解着色振荡现象；而在冲突判定阶段，可对下一步的本地消息B^l(v_i)流式可见，以加快消息传播速度。特别地，为每个顶点v_i设置状态累积监控变量s_i,以记录该顶点连续不进行颜色选择(即认定为颜色状态未发生变化)的迭代步数，以便在每步迭代计算结束后计算收敛调控参数γ,进而调整阈值参数α与β,实现随机扰动力度的自适应学习与调整，在解决着色振荡和提高计算并行度与着色质量方面进行动态优化。As shown in Figure 5, it is the overall implementation framework of the distributed adaptive graph vertex shading technology including m tasks. Before the iterative calculation starts, each task first loads the graph data in parallel and sets the initial parameters. The latter mainly includes: setting the seeds of the random number generator _rs and _rc as the task number, and giving the initial selection thresholds α and β. .After that, in the iterative calculation, cyclic processing is carried out according to the four steps of preparing message, conflict determination, color selection and new message broadcasting. Among them, a reproducible random disturbance is injected in the conflict determination and color selection stages to alleviate the phenomenon of coloring oscillation; and in the conflict determination stage, the next local message B ^l ( _vi ) can be streamed and visible to speed up message dissemination speed. In particular, a state accumulation monitoring variable s _i is set for each vertex v _i to record the number of iteration steps in which the vertex does not continuously perform color selection (that is, it is determined that the color state has not changed), so as to calculate after the end of each step of iterative calculation. Convergence control parameter γ, and then adjust threshold parameters α and β to realize adaptive learning and adjustment of random disturbance speed, and perform dynamic optimization in solving shading oscillation and improving calculation parallelism and shading quality.

图6为根据本发明实施方式的分布式的自适应图顶点着色系统600的结构示意图。如图6所示，本发明实施方式提供的分布式的自适应图顶点着色系统600，包括：着色信息集合获取单元601、着色冲突判断单元602、着色单元603、消息广播单元604和着色信息确定单元605。FIG. 6 is a schematic structural diagram of a distributed adaptive graph vertex shader system 600 according to an embodiment of the present invention. As shown in FIG. 6 , the distributed adaptive graph vertex shading system 600 provided by the embodiment of the present invention includes: a shading information set acquisition unit 601 , a shading conflict judgment unit 602 , a shading unit 603 , a message broadcasting unit 604 , and a shading information determination unit 601 unit 605.

优选地，所述着色信息集合获取单元601，用于分别获取无向图中每个顶点v_i的所有邻接顶点在上一次迭代过程中的着色信息集合C^t-1(v_i)；其中，t为当前的迭代次数。Preferably, the coloring information set obtaining unit 601 is configured to obtain the coloring information set C ^t-1 (vi ₎ of all adjacent vertices of each vertex v _i in the undirected graph in the previous iteration process; wherein, t is the current iteration number.

优选地，所述着色冲突判断单元602，用于对每个顶点v_i进行着色冲突判断；其中，对于任一个顶点vi，若满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点产生着色冲突，更新该顶点vi对应的状态累积监控变量s_i为初始阈值；其中，s_i用于记录该顶点v_i的着色信息连续未发生变化的迭代次数；B^t(v_i)为该顶点v_i在更新时刻已经收到的来自于当前迭代步的且位于该顶点vi所在任务的部分邻接顶点的着色信息。Preferably, the coloring conflict judgment unit 602 is configured to perform coloring conflict judgment on each vertex _vi ; wherein, for any vertex vi, if the current value rs of the probability generator is satisfied, the value _rs is less than or equal to the first threshold α and The coloring information c _i ∈ C ^t-1 (vi ) and c _i _∈ B ^t (vi ₎ of the vertex v _i are determined to have a coloring conflict between the vertex v _i and its adjacent vertices, and the state accumulation corresponding to the vertex vi is updated The monitoring variable _si is the initial threshold; among them, _si is used to record the number of iterations that the coloring information of the vertex _vi has not changed continuously; B ^t (vi ₎ is the received information from the vertex _vi at the update time Shading information of some adjacent vertices at the current iteration step and located in the task where the vertex vi is located.

优选地，其中所述着色冲突判断单元602，还包括：Preferably, the coloring conflict judgment unit 602 further includes:

在对每个顶点v_i进行着色冲突判断时，若不满足当前的概率发生器的值r_s小于等于第一阈值α并且该顶点v_i的着色信息c_i∈C^t-1(v_i)并且c_i∈B^t(v_i)，则确定该顶点v_i与其邻接顶点v_j未产生着色冲突，更新s_i＝s_i+1，并直接进入消息广播单元604。When judging coloring conflict for each vertex v _i , if the current value rs of the probability generator is not satisfied, the value _rs is less than or equal to the first threshold α and the coloring information of the vertex v _i is c _i ∈ C ^t-1 (vi ₎ And c _i ∈ B ^t (vi ), it is determined that the vertex v _i and its adjacent vertex v _j have no coloring conflict, update s _i _{=s i} ₊₁ , and directly enter the message broadcasting unit 604 .

α＝p+(p-dv/maxd)×γ，α=p+(p-dv/maxd)×γ,

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

优选地，所述着色单元603，用于确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中随机选择一个颜色，确定每个顶点v_i的着色信息c_i。Preferably, the coloring unit 603 is configured to determine the available color set U(v _i ) corresponding to each vertex v _i , and randomly select a color from the available color set U(vi ₎ corresponding to each vertex v _i respectively , determine the coloring information c _{i of each vertex v i} _.

优选地，其中所述着色单元603，确定每个顶点v_i对应的可用颜色集合U(v_i)，分别从每个顶点v_i对应的可用颜色集合U(v_i)中随机选择一个颜色，确定每个顶点v_i的着色信息c_i，包括：Preferably, the coloring unit 603 determines the available color set U(v _i ) corresponding to each vertex v _i , and randomly selects a color from the available color set U(vi ₎ corresponding to each vertex v _i , respectively, Determine the shading information _ci for each vertex _vi , including:

γ＝|V^x|/|V|，γ=|V ^x |/|V|,

其中，

优选地，所述消息广播单元604，用于对于每个顶点v_i，沿边向其所有邻接顶点v_j发送其着色信息c_i。Preferably, the message broadcasting unit 604 is configured to, for each vertex v _i , send its coloring information c _i to all its adjacent vertices v _j along the edge.

优选地，所述着色信息确定单元605，用于判断是否满足每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同；其中，若满足，则直接确定所述无向图中每个顶点的着色信息；若不满足，则根据运行信息调整第一阈值α和第二阈值β，并进入着色信息集合获取单元重新进行迭代，直至每个顶点v_i的着色信息和其所有邻接顶点的颜色均不同时停止，并确定所述无向图中每个顶点的着色信息。Preferably, the coloring information determining unit 605 is configured to determine whether the coloring information of each vertex v _i is different from the colors of all its adjacent vertices; wherein, if it is satisfied, directly determine each vertex in the undirected graph The coloring information of the vertex; if not satisfied, adjust the first threshold α and the second threshold β according to the running information, and enter the coloring information set acquisition unit to iterate again, until the coloring information of each vertex v _i and all its adjacent vertices. The colors are not stopped at the same time, and the coloring information of each vertex in the undirected graph is determined.

本发明的实施例的分布式的自适应图顶点着色系统600与本发明的另一个实施例的分布式的自适应图顶点着色方法300相对应，在此不再赘述。The distributed adaptive graph vertex shading system 600 of the embodiment of the present invention corresponds to the distributed adaptive graph vertex shading method 300 of another embodiment of the present invention, and details are not described herein again.

已经通过参考少量实施方式描述了本发明。然而，本领域技术人员所公知的，正如附带的专利权利要求所限定的，除了本发明以上公开的其他的实施例等同地落在本发明的范围内。The present invention has been described with reference to a few embodiments. However, as is known to those skilled in the art, other embodiments than the above disclosed invention are equally within the scope of the invention, as defined by the appended patent claims.

通常地，在权利要求中使用的所有术语都根据他们在技术领域的通常含义被解释，除非在其中被另外明确地定义。所有的参考“一个/所述/该[装置、组件等]”都被开放地解释为所述装置、组件等中的至少一个实例，除非另外明确地说明。这里公开的任何方法的步骤都没必要以公开的准确的顺序运行，除非明确地说明。Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/the/the [means, component, etc.]" are open to interpretation as at least one instance of said means, component, etc., unless expressly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

本领域内的技术人员应明白，本申请的实施例可提供为方法、系统、或计算机程序产品。因此，本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且，本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing device to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing device produce Means for implementing the functions specified in a flow or flow of a flowchart and/or a block or blocks of a block diagram.

这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中，使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品，该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture comprising instruction means, the instructions The apparatus implements the functions specified in the flow or flow of the flowcharts and/or the block or blocks of the block diagrams.

这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上，使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理，从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded on a computer or other programmable data processing device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process such that The instructions provide steps for implementing the functions specified in the flow or blocks of the flowcharts and/or the block or blocks of the block diagrams.

最后应当说明的是：以上实施例仅用以说明本发明的技术方案而非对其限制，尽管参照上述实施例对本发明进行了详细的说明，所属领域的普通技术人员应当理解：依然可以对本发明的具体实施方式进行修改或者等同替换，而未脱离本发明精神和范围的任何修改或者等同替换，其均应涵盖在本发明的权利要求保护范围之内。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: the present invention can still be Modifications or equivalent replacements are made to the specific embodiments of the present invention, and any modifications or equivalent replacements that do not depart from the spirit and scope of the present invention shall be included within the protection scope of the claims of the present invention.

Claims

1. A distributed adaptive graph vertex coloring method, wherein the method comprises:

Step 1, respectively obtain the coloring information set C ^t-1 (vi ₎ of all adjacent vertices of each vertex v _i in the undirected graph in the previous iteration process; wherein, t is the current number of iterations;

Step 2: Perform coloring conflict judgment on each vertex vi; wherein, for any vertex vi, if the current probability generator value _rs is less than or equal to the first threshold α and the coloring information c _i _∈ of the vertex vi is _satisfied C ^t-1 (v _i ) and c _i ∈ B ^t (v _i ), then determine that the vertex vi and its adjacent vertices have a color conflict, and update the state accumulation monitoring variable _si corresponding to the vertex _vi to be the initial threshold; where, s _i is used to record the number of iterations that the coloring information of the vertex v _i has not changed continuously; B ^t (vi ₎ is the current iteration step that the vertex v _i has received at the update time and is located at the vertex v _i The coloring information of some adjacent vertices of the task;

Step 3: Determine the available color set U(v _i ) corresponding to each vertex v _i , and randomly select a color from the available color set U(vi ₎ corresponding to each vertex v _i according to the second threshold β, and determine each color. coloring information c _{i of each vertex v i} _;

Step 4, for each vertex v _i , send its coloring information c _i to all its adjacent vertices v _j along the edge;

Step 5, determine whether the coloring information of each vertex v _i is different from the color of all its adjacent vertices; wherein, if it is satisfied, then directly determine the coloring information of each vertex in the undirected graph; if not, then Adjust the first threshold α and the second threshold β according to the running information, and return to step 1 to re-iterate until the coloring information of each vertex v _i and the colors of all its adjacent vertices are different, stop, and determine the undirected graph Shading information for each vertex in .

2. The method according to claim 1, wherein the method further comprises:

When judging coloring conflict for each vertex v _i , if the current value rs of the probability generator is not satisfied, the value _rs is less than or equal to the first threshold α and the coloring information of the vertex v _i is c _i ∈ C ^t-1 (vi ₎ And c _i ∈ B ^t (vi ), then it is determined that the vertex v _i and its adjacent vertex v _j have no coloring conflict, update s _i _{=s i} ₊₁ , and go to step 4 directly.

3. The method according to claim 1, wherein the method further comprises:

Before the next iteration, the first threshold α is dynamically adjusted using the following formula, including:

α=p+(p-dv/maxd)×γ,

γ=|V ^x |/|V|,

Among them, p is the preset basic threshold, the value range is 0 to 1; d _v is the degree of the current vertex v _i , maxd is the maximum degree of all the vertices in the undirected graph, γ is 0 to 1 The convergence control parameter of 1; |V| is the number of all vertices in the undirected graph, and |V ^x | is the number of vertices whose color has not changed in the preset continuous x-step iterative calculation.

4. The method according to claim 1, characterized in that, said determining the available color set U(vi) corresponding to each vertex v _i , respectively from the available color set U(vi ₎ corresponding to each vertex v _i _. ) and randomly select a color to determine the coloring information c _i of each vertex v _i , including:

Determine the available color set U(vi ₎ corresponding to each vertex v _i

For any vertex v _i , if |U(vi ₎ | is less than the second threshold β, then color expansion is performed, the maximum used color value is incremented by β as a unit step, and added to the available color set in U( _vi );

Arrange the colors in the current available color set U(v _i ) corresponding to each vertex v _i in ascending order of the color numbers, select the top-β colors and put them into the array, and get them according to the value r _c modulo β of the random shader. The remainder is used as an array subscript for color selection, and the coloring information of each vertex v _i is determined as c _i _=arrayOfTop-β[rc _modβ ].

5. The method according to claim 4, wherein the method further comprises:

Before the next iteration, the second threshold β is dynamically adjusted using the following formula, including:

γ=|V ^x |/|V|,

in,

6. A distributed adaptive graph vertex shader system, wherein the system comprises:

The coloring information set acquisition unit is used to obtain the coloring information set C ^t-1 (vi ₎ of all adjacent vertices of each vertex v _i in the undirected graph in the previous iteration process; wherein, t is the current number of iterations ;

The coloring conflict judgment unit is used to judge the coloring conflict for each vertex _vi ; wherein, for any vertex vi, if the value rs of the current probability generator is satisfied that the value _rs of the current probability generator is less than or equal to the first threshold α and the coloring of the vertex _vi is Information c _i ∈ C ^t-1 (vi ) and c _i _∈ B ^t (vi ₎ , then determine that the vertex vi and its adjacent vertices have coloring conflicts, and update the state accumulation monitoring variable _si corresponding to the vertex _vi to be the initial Threshold; wherein, s _i is used to record the number of iterations that the coloring information of the vertex v _i has not changed continuously; B ^t (vi _{) is the current iteration step that the vertex v i} _has received at the update time and is located in The coloring information of some adjacent vertices of the task where the vertex v _i is located;

The coloring unit is used to determine the available color set U(v _i ) corresponding to each vertex v _i , and randomly select a color from the available color set U(vi ₎ corresponding to each vertex v _i according to the second threshold β, Determine the coloring information c _{i of each vertex v i} _;

a message broadcasting unit, for each vertex v _i , sending its coloring information c _i to all its adjacent vertices v _j along the edge;

A coloring information determination unit, used to determine whether the coloring information of each vertex v _i is different from the color of all its adjacent vertices; wherein, if it is satisfied, the coloring information of each vertex in the undirected graph is directly determined; if If not satisfied, then adjust the first threshold α and the second threshold β according to the running information, and enter the coloring information set acquisition unit to iterate again, until the coloring information of each vertex v _i and the colors of all its adjacent vertices are different, then stop, And determine the coloring information of each vertex in the undirected graph.

7. The system according to claim 6, wherein the coloring conflict judgment unit further comprises:

When judging coloring conflict for each vertex v _i , if the current value rs of the probability generator is not satisfied, the value _rs is less than or equal to the first threshold α and the coloring information of the vertex v _i is c _i ∈ C ^t-1 (vi ₎ And c _i ∈ B ^t (vi ), it is determined that the vertex v _i and its adjacent vertex v _j have no coloring conflict, update s _i _{=s i} ₊₁ , and directly enter the message broadcasting unit.

8. The system of claim 6, wherein the system further comprises:

A first threshold adjustment unit, configured to dynamically adjust the first threshold α by using the following formula before performing the next iteration, including:

α=p+(p-dv/maxd)×γ,

γ=|V ^x |/|V|,

Among them, p is the preset basic threshold, the value range is 0 to 1; d _v is the degree of the current vertex v _i , maxd is the maximum degree of all vertices in the undirected graph, γ is 0 to 1 The convergence control parameter of 1; |V| is the number of all vertices in the undirected graph, and |V ^x | is the number of vertices whose color has not changed in the preset continuous x-step iterative calculation.

9. The system according to claim 6, wherein the coloring unit determines the available color set U(v _i ) corresponding to each vertex v _i , respectively, from the available color set U corresponding to each vertex v _i (v _i ) randomly select a color to determine the coloring information c _i of each vertex v _i , including:

Determine the available color set U(vi ₎ corresponding to each vertex v _i

10. The system of claim 9, wherein the system further comprises:

The second threshold adjustment unit is configured to dynamically adjust the second threshold β by using the following formula before the next iteration, including:

γ=|V ^x |/|V|,

in,