Disclosure of Invention
The invention aims to solve the problems and provides a segmentation parallel expansion method for keyword optimal path query.
The invention adopts the following technical scheme that the segmentation parallel expansion method for the keyword optimal path query comprises the following steps.
S100, calculating a minimum cost value b (V i,vj) between any two vertexes in the query graph and a target value o σ(vi,vj corresponding to the minimum cost value, and a minimum target value o (V i,vj) and a cost value b τ(vi,vj corresponding to the minimum target value, and simultaneously recording the minimum cost value b min and the minimum target value o min in the query graph G= (V, E).
And S200, constructing a keyword inverted list, namely combining all keyword information { v.w 1,v.w2, & gt } of the vertexes into a non-coincident keyword set, constructing a keyword inverted list in the shape of { w i:vj }, and recording the vertexes corresponding to the keywords.
S300, skipping the vertex irrelevant to the query keyword, constructing the keyword vertex path with the minimum target value asThe keyword vertex path with the minimum cost value is marked as
S400, dividing the keyword vertex paths, and expanding the divided paths in a segmented parallel manner.
The step S300 includes the steps of,
S310, giving a queryvs、vt、And delta respectively represents a query starting point, a query end point, a keyword set which needs to be covered by a query path and a cost threshold value, and is hidden and not containedThe method comprises the steps of (1) reserving a starting point, an ending point and keyword vertexes, constructing a simplified keyword vertex query graph G ', directly acquiring the keyword vertexes according to a keyword inverted list in S200 in the implementation process, and then adding the query starting point and the query ending point to form the query graph G ', wherein the query graph G ' contains information including (1) weights b(vi,vj)、oσ(vi,vj)、o(vi,vj)、bτ(vi,vj) between any two vertexes acquired in a preprocessing stage, a minimum cost value b min and a minimum target value o min in the query graph, and (2) querying the starting point, the ending point and the keyword vertexes.
S320, in the keyword vertex query graph G', keyword vertices are arranged and combined, traversed through breadth-first search, and finally returned to meet the keyword full coverage and cost threshold valueWherein the method comprises the steps ofThe vertex path of the keyword comprising the smallest target value is recorded asThe keyword vertex path with the minimum cost value is marked as
Step S320 includes the steps of giving a queryWherein the method comprises the steps ofwi={vk,vm},wj={vx,vy}。
S321, first construct an initial tag at v s Placing it in a priority queue.
S322, expanding from the starting point to all keyword vertexes, and constructing labels from v s to each keyword vertexAnd calculating a global priority enqueue, wherein the smaller the global priority, the higher the global priority.
S323, selecting the label with highest global priorityDequeuing at this timeHas included the keyword w j, directionVertex v i、vj related to keyword w i which is not contained is expanded, and new label is constructedAnd judging whether the new label meets the cost threshold and the approximate dominant condition, if so, calculating the global priority and then enqueuing, otherwise, discarding.
S324, continuing to select the label with the highest priority level for expansion, repeating the process S323 until the label is expanded to a terminal v t, and selecting a path corresponding to the label meeting the full coverage of the keyword and having the minimum cost threshold and the target value as a mark
S325, continuing to select the label expansion with the highest priority level, repeating the process S323 until the label expansion reaches the terminal v t, and selecting a path corresponding to the label meeting the full coverage of the keyword, the cost threshold and the minimum cost value as the path
In step S322, the tagIs expressed as global priority ofThe minimum target value from v s to v t is o (v s,vt), the priority factor β is set, 1< β <2, and the smaller the global priority of the tag is, the higher the global priority of the tag is.
Before step S400 is performed, the keyword vertex path with the smallest target value constructed in step S300 is selectedKeyword vertex path with minimum cost valueThe selection is carried out, and the specific selection process is as follows:
in construction of In the process of (2), the returned target value is the smallestIs marked asSimultaneously recording the corresponding target value O min and cost value B max, and returning the minimum cost valueIs marked asSimultaneous recording of B min and O max, availability of [ B min,Bmax ] and [ O min,Omax ], for assuranceThe effective path is returned for the path expansion, and O max is taken as the upper bound of the target value, namely, the path target value returned in the worst case is O max, namelyTo expand the path of the basis, byAndThe key word vertex path boundaries are shared by the following cases;
1) Is not the air-conditioner and is not the air-conditioner, Not null, and B min≤Bmax、Omin≤Omax, in the path-segment parallel expansion, toTaking O max as the upper boundary of the target value,The local cost threshold in (a) constrains the expansion of each path segment.
2)Is in the form of a hollow body,Not empty, path segment expansionTo be accurate.
3)Is in the form of a hollow body,For null, it is indicated that the cost threshold is too low, and the keyword vertex path with the smallest reconstruction target value for improving the cost threshold is recorded asThe keyword vertex path with the minimum cost value is marked as
Step S400 specifically includes the following steps,
And S410, setting parallelism, dividing the keyword vertex path into a plurality of sections, wherein all vertices except for the head vertex and the tail vertex in the keyword vertex path are keyword vertices, and dividing the path by using the keyword vertices.
And S420, carrying out parallel expansion on path segments, judging whether cost values of all paths from the initial point vertexes to the target vertexes of each path through the non-keyword vertexes are smaller than or equal to a local cost threshold value at the target vertexes when the path segments are expanded, if yes, carrying out feasible solution target value pruning strategy comparison applicable to parallel expansion, if yes, calculating global priority, and then queuing, otherwise, discarding.
In step S410, the principle of parallelism setting is to make the keyword vertexes in each path segment as uniform as possible so that the number of vertexes in each path segment is similar when expanding, thereby achieving the purpose of the close expansion time of each path segment.
In step S420, the judgment inequality of comparison of the feasible target value pruning strategy is:
Wherein, the AndRepresenting the vertices of the keywords,Is fromExpansion to v k,vk is fromTo the point ofA label of one vertex passing in the middle, if the constraint is satisfiedEnqueue continues expanding, otherwise discard
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a thought for converting a multi-constraint KOR problem into a limited shortest path problem, namely, firstly constructing a keyword vertex path meeting the full coverage and cost threshold of keywords, and then considering the minimum constraint of a target value in detailed expansion to segment the problem.
(2) Aiming at the problem of long execution time of long path search, the invention provides a segmentation parallel expansion method (PSE-KOR) for keyword optimal path query, which is characterized in that a keyword vertex path is constructed, then the keyword vertex is taken as a boundary to divide the path into a plurality of sections, the search scale (depth) during long path search is reduced, and then the execution time of the long path KOR is further shortened by adopting a parallel expansion mode.
(3) And for path expansion of each segment, local cost threshold pruning is provided to restrict expansion direction and search depth, and meanwhile, the association degree between the segmented paths is reduced. And meanwhile, the practical solution target value pruning strategy of approximate dominant pruning, global priority expansion and applicable parallel expansion is utilized to accelerate the expansion.
Detailed Description
Aiming at the problems, the invention provides a segmentation parallel expansion method for keyword optimal path query, which comprises the steps of firstly skipping vertexes irrelevant to query keywords, expanding the vertexes of the keywords only, constructing keyword vertex paths, dividing the paths into multiple sections by taking the vertexes of the keywords as boundaries, expanding the paths in parallel, and finally splicing the paths into a complete path. In order to reduce the association degree among all road sections and the search scale, the invention also provides a local cost threshold and feasible solution target value pruning suitable for parallel expansion, and the local cost threshold and the feasible solution target value pruning are matched with the parallel expansion.
Before further explanation, the KOR definition will be explained first:
KOR is based on a road network, which can be abstracted into a query graph as shown in fig. 1.
Definition 1 query graph g= (V, E) is made up of vertex set V and edge set E. Any V epsilon V represents the interest point, the geographic position of V is represented by longitude and latitude, and the keyword set marked as v.loc and V is represented asThe path (v i,vj) connecting the neighbors v i and v j is called a edge, denoted as E, E. Each edge has two attributes, cost value b (v i,vj) and target value o (v i,vj).
As shown in FIG. 1, the typical cost value is represented by the value outside the brackets, and the target value is represented by the time required to pass the edge, represented by the value inside the brackets. Table 1 lists the keyword sets corresponding to the vertices in fig. 1.
Table 1 vertex keyword information
Definition 2 Path p= (v 0,v1,....vn) represents a path from v 0 through v 1,…vn-1 to v n, and p-overlaid keywords are expressed as
According to definition 2, the cost value and the target value of path p are respectively:
definition of 3KOR A KOR query Q is a four-tuple vs、vt、And delta represents the query start point, end point, the covered keyword set, and cost threshold, respectively. Let p s,t denote the set of paths from v s to v t, and p cand denote the set of possible paths p that meet the keyword full coverage and cost threshold, i.eOptimal path of KOR
The technical scheme of the invention is further described as follows:
S100, calculating a minimum cost value b (V i,vj) between any two vertexes in the query graph and a target value o σ(vi,vj corresponding to the minimum cost value, a minimum target value o (V i,vj) and a cost value b τ(vi,vj corresponding to the minimum target value by using a Floyd algorithm, recording the minimum cost value b min and the minimum target value o min in G= (V, E), and storing the minimum cost value b min and the minimum target value o min in a memory.
And 200, constructing a keyword inverted index, namely combining all keyword information { v.w 1,v.w2..degree } of the interest points into a non-coincident keyword set, constructing a keyword inverted index table, and recording the interest points corresponding to the keywords, wherein the table 2 is shown.
TABLE 2 keyword inverted index Table
S300, skipping the vertex irrelevant to the query keyword, constructing the keyword vertex path with the minimum target value asThe keyword vertex path with the minimum cost value is marked as
Definition 4 keyword vertices givenFor keywordsIf a certain vertexCovering keywords w i, i.eThen call forIs a vertex covering the keyword w i.
Definition 5 keyword vertex Path givenContinuously checking uncovered keywords from v s, expanding the keywords to corresponding vertexes to obtain paths which are expanded from v s to v t and meet the full coverage and cost threshold of the keywords, and marking the paths asWeighing scaleIs a keyword vertex path.
Step S300 includes the steps of:
S310, giving a query vs、vt、And delta respectively represents a query starting point, a query end point, a keyword set which needs to be covered by a query path and a cost threshold value, and is hidden and not containedThe method comprises the steps of (1) reserving a starting point, an ending point and keyword vertexes, constructing a simplified keyword vertex query graph G ', directly acquiring the keyword vertexes according to a keyword inverted list in S200 in the implementation process, and then adding the query starting point and the query ending point to form the query graph G ', wherein the query graph G ' contains information including (1) weights b(vi,vj)、oσ(vi,vj)、o(vi,vj)、bτ(vi,vj) between any two vertexes acquired in a preprocessing stage, a minimum cost value b min and a minimum target value o min in the query graph, and (2) querying the starting point, the ending point and the keyword vertexes.
A simplified keyword vertex query graph G' is shown in fig. 2. The broken lines connect the keyword vertices to form a keyword vertex query graph.
S320, in the keyword vertex query graph G', keyword vertices are arranged and combined, traversed through breadth-first search, and finally returned to meet the keyword full coverage and cost threshold valueWherein the method comprises the steps ofThe vertex path of the keyword comprising the smallest target value is recorded asThe keyword vertex path with the minimum cost value is marked asAs shown in fig. 3, the keyword vertex path with the smallest target value may be returned.
S320, in the keyword vertex query graph G', keyword vertices are arranged and combined, traversed through breadth-first search, and finally returned to meet the keyword full coverage and cost threshold valueWherein the method comprises the steps ofThe vertex path of the keyword comprising the smallest target value is recorded asThe keyword vertex path with the minimum cost value is marked as
Step S320 includes the steps of giving a queryWherein the method comprises the steps ofwi={vk,vm},wj={vx,vy}。
S321, first construct an initial tag at v s Placing it in a priority queue.
S322, expanding from the starting point to all keyword vertexes, and constructing labels from v s to each keyword vertexAnd calculating a global priority enqueue, wherein the smaller the global priority, the higher the global priority.
Label (Label)Is expressed as global priority ofThe minimum target value from v s to v t is o (v s,vt), the priority factor β is set, 1< β <2, and the smaller the global priority of the tag is, the higher the global priority of the tag is.
S323, selecting the label with highest global priorityDequeuing at this timeHas included the keyword w j, directionVertex v i、vj related to keyword w i which is not contained is expanded, and new label is constructedAnd judging whether the new label meets the cost threshold and the approximate dominant condition, if so, calculating the global priority and then enqueuing, otherwise, discarding.
Cost threshold-the path is meeting the guest's travel budget.
The approximation governs that given a parameter a, a slightly greater than 1 (e.g., a=1.1), if the start and end points of path p i、pj are the same,B (p i)≤b(pj) andPath p i approximately dominates that p j,pj is pruned.
Target correction given a query graph g= (V, E) and a queryKnown tagsMinimum cost value b min, minimum target value o min, which is corrected to target value of Is referred to as a correction ratio, where 0< epsilon <1.
S324, continuing to select the label with the highest priority level for expansion, repeating the process S323 until the label is expanded to a terminal v t, and selecting a path corresponding to the label meeting the full coverage of the keyword and having the minimum cost threshold and the target value as a mark
S325, continuing to select the label expansion with the highest priority level, repeating the process S323 until the label expansion reaches the end point v t, and selecting a path corresponding to the label meeting the full coverage of the keyword, the cost threshold and the minimum cost value as a mark
The expansion result is shown in G' of FIG. 4, v 2、v4 is in constructionKeyword vertices that were discarded.
The keyword vertex path with the smallest target value constructed in step S300Keyword vertex path with minimum cost valueThe selection is carried out, and the specific selection process is as follows:
in construction of In the process of (2), the returned target value is the smallestIs marked asSimultaneously recording the corresponding target value O min and cost value B max, and returning the minimum cost valueIs marked asB min and O max were recorded simultaneously. [ B min,Bmax ] and [ O min,Omax ] can be obtained. To ensure toReturns the effective path for the underlying path extension, as represented by O max (i.e) As an upper bound to the target value, the worst case return path target value is O max, i.e.To expand the path of the basis. In the expansion stage, byAndThe key vertex path boundaries that make up share the following:
(1) Is not the air-conditioner and is not the air-conditioner, Not empty, and B min≤Bmax、Omin≤Omax, in path segment expansion, toTaking O max as the upper boundary of the target value,The local cost threshold in (a) constrains the expansion of each path segment.
(2)Is in the form of a hollow body,Not empty, path segment expansionTo be accurate.
(3)Is in the form of a hollow body,For null, it is indicated that the cost threshold is too low, and the keyword vertex path with the smallest reconstruction target value for improving the cost threshold is recorded asThe keyword vertex path with the minimum cost value is marked as
The keyword vertex path with the smallest target value, therefore, has the smallest target value O min, since the solution is being performedWhen the target value is minimum, the path cost value is required to meet the cost threshold value, so that the contractThe path cost value of (3) is B max.
The keyword vertex path with the smallest cost value has the smallest target value B min because the solution is being carried outWhen the target value cost value is minimum, there is no constraint on the target value, so the contractThe path cost value of (3) is O max.
And (3) the minimum target value of the whole road network diagram and the minimum cost value of the whole road network diagram are respectively represented by o min and b min.
From the above, it can be seen that in constructionAnd obtaining two keyword vertex paths with the minimum target value and the minimum cost value, and selecting a specific expansion basis according to the three conditions in the actual expansion process.
S400, dividing the keyword vertex paths, and expanding the divided paths in a segmented parallel manner. The method comprises the following specific steps:
And S410, setting parallelism, dividing the keyword vertex path into a plurality of sections, wherein all vertices except for the head vertex and the tail vertex in the keyword vertex path are keyword vertices, and dividing the path by using the keyword vertices.
All vertices in the keyword vertex path except the head and tail vertices are keyword vertices, so the path can be segmented by using the keyword vertices. As shown in fig. 4, the pathCan be divided into three sections v s→v1、v1→v3、v3→vt or two sections v s→v1→v3、v3→vt.
The principle of setting the parallelism is that the keyword vertexes in each path segment are uniform as much as possible so that the number of vertexes in each path segment is similar when expanding, thereby achieving the purpose of the approaching expanding time of each path segment.
And S420, carrying out parallel expansion on path segments, judging whether cost values of all paths from the initial point vertexes to the target vertexes of each path through the non-keyword vertexes are smaller than or equal to a local cost threshold value at the target vertexes when the path segments are expanded, if yes, carrying out feasible solution target value pruning strategy comparison applicable to parallel expansion, if yes, calculating global priority, and then queuing, otherwise, discarding.
Feasible solution target value pruning when obtaining a feasible path p i(pi∈pcand), taking the target value o (p i) as pruning condition, if the target value of the path p j(pj∈pcand) is larger than o (p i), namely o (p i)<o(pj), then p j is pruned.
Local cost threshold value-known keyword vertex pathTotal cost value for the pathVertex for any two adjacent keywordsAndWhen the path is expanded, the methodAs a slaveTo the point ofThe local cost threshold of the path expansion is calculated.
When the keyword vertex path is constructed, two constraint conditions of keyword full coverage and cost threshold are satisfied, and the local cost threshold of each path segment is recorded, so that the local cost threshold can be used for replacing global cost threshold constraint when each path segment is expanded in parallel. The local cost threshold value only constrains each path and is independent from other paths, so that the association degree between paths can be reduced. The judgment inequality of comparison of the viable solution target value pruning strategy is as follows:
Wherein, the AndRepresenting the vertices of the keywords,Is fromExpansion to v k,vk is fromTo the point ofA label of one vertex passing in the middle, if the constraint is satisfiedEnqueue continues expanding, otherwise discard