CN109544998B

CN109544998B - A Multi-objective Optimization Method for Flight Slot Allocation Based on Distribution Estimation Algorithm

Info

Publication number: CN109544998B
Application number: CN201811608206.7A
Authority: CN
Inventors: 吴振锋; 田靖; 黄吉波; 严勇杰; 聂建强; 徐善娥; 王冠
Original assignee: CETC 28 Research Institute
Current assignee: CETC 28 Research Institute
Priority date: 2018-12-27
Filing date: 2018-12-27
Publication date: 2020-08-04
Anticipated expiration: 2038-12-27
Also published as: CN109544998A

Abstract

The invention discloses a flight time slot allocation multi-objective optimization method based on a distribution estimation algorithm, and provides a multi-objective flight time slot allocation heuristic optimization method based on the distribution estimation algorithm aiming at the current situations that the existing method is insufficient in research on the time slot allocation multi-objective optimization algorithm and few in research on flight relevance under resource constraint. The invention considers the time slot distribution problem in the ground waiting strategy, establishes a multi-objective optimization problem under the resource constraint including delay cost and fairness indexes, and also adds the capacity of a takeoff airport in the constraint condition to prevent the capacity flow imbalance of other busy airports. On the basis of a distribution estimation algorithm, a Markov network is introduced to model a problem decision variable, the correlation among multiple variables is described and mined, and the search optimization efficiency is increased; meanwhile, a fitness function with an exponential weight parameter is provided for a multi-objective optimization process, and the distribution of the pareto solution is improved under the conditions of low calculation complexity and flexible parameter configuration.

Description

A Multi-objective Optimization Method for Flight Slot Allocation Based on Distribution Estimation Algorithm

技术领域technical field

本发明属于空中交通流量管理领域，尤其涉及一种基于分布估计算法的航班时隙分配多目标优化方法。The invention belongs to the field of air traffic flow management, and in particular relates to a multi-objective optimization method for flight time slot allocation based on a distribution estimation algorithm.

背景技术Background technique

在流量管理中，地面等待策略是一种主动实施、非常重要的流量管理方法。实质是将昂贵且高风险的空中等待转化为安全经济的地面等待。流量管理部门预先了解到航空器所经过的航路、扇区或到达的目的地机场即将容量下降导致空中交通流量拥堵，则要求航空器在起飞机场地面等待来调节空中交通网络的流量，实现容需平衡，从而达到优化航班延误成本、提高机场、空域利用率和降低飞行安全风险的目的。In traffic management, the ground waiting strategy is an active and very important traffic management method. The essence is to convert expensive and high-risk air waiting into safe and economical ground waiting. The traffic management department knows in advance that the air route, sector or destination airport that the aircraft passes through is about to reduce the capacity of the air traffic flow, so the aircraft is required to wait on the ground at the departure airport to adjust the flow of the air traffic network to achieve a balance between capacity and demand. , so as to optimize flight delay costs, improve airport and airspace utilization and reduce flight safety risks.

在地面等待策略中，一项核心技术就是航班时隙分配。即机场由于容量受限产生多个时隙资源(可看做是进场时隙)，同时存在多架次进场航班，要求每个时隙只能至多分配给一个航班，并且每个航班必须配置一个比航班计划进场时间晚的时隙。一般来说，延误成本是必须要考虑的目标，除此以外，公平性指标也是需要考量的重要因素。因此，时隙分配问题往往是一种(时隙)资源约束下的多目标优化问题。In the ground holding strategy, a core technology is flight slot allocation. That is, the airport generates multiple time slot resources (which can be regarded as arrival time slots) due to limited capacity, and there are multiple arrival flights at the same time. It is required that each time slot can only be assigned to at most one flight, and each flight must be configured with A time slot later than the scheduled arrival time of the flight. Generally speaking, the cost of delay is the target that must be considered. In addition, the fairness index is also an important factor to be considered. Therefore, the time slot allocation problem is often a multi-objective optimization problem under (time slot) resource constraints.

目前，时隙分配问题的研究一方面集中在问题建模上，包含确定性、不确定性、动态、多目标等问题；另一方面在优化方法上进行研究，例如整数规划、随机规划、鲁棒优化等。At present, the research of time slot allocation problem focuses on problem modeling, including deterministic, uncertain, dynamic, multi-objective and other problems; on the other hand, research on optimization methods, such as integer programming, stochastic programming, robust Stick optimization, etc.

针对时隙分配问题，现有方法对多目标优化算法研究不足，同时对于资源约束下航班关联性研究甚少。针对上述不足，本发明提出利用马尔科夫网络对航班关联性进行建模，通过基于概率统计的分布估计算法进行搜索优化，同时对多目标优化给出新的适应度函数，以此建立一种针对航班时隙分配的多目标优化方法。For the time slot allocation problem, the existing methods are insufficient for multi-objective optimization algorithms, and there are few studies on flight correlation under resource constraints. In view of the above deficiencies, the present invention proposes to use the Markov network to model the flight correlation, to perform search optimization through a distribution estimation algorithm based on probability statistics, and to provide a new fitness function for multi-objective optimization, so as to establish a A multi-objective optimization method for flight slot assignment.

发明内容SUMMARY OF THE INVENTION

发明目的：在确保机场容量约束前提下，充分利用时隙，实现延误成本和公平性指标的多目标优化，并满足计算时间复杂度要求。在优化过程中，给出时隙分配解决方案的同时，还对航班之间关联性进行挖掘，为辅助决策提供数据支持。Purpose of the invention: Under the premise of ensuring airport capacity constraints, the time slot is fully utilized to achieve multi-objective optimization of delay cost and fairness indicators, and to meet the computational time complexity requirements. In the optimization process, the time slot allocation solution is given, and the correlation between flights is also mined to provide data support for auxiliary decision-making.

技术方案：一种基于分布估计算法的航班时隙分配多目标优化方法，包括如下步骤：Technical solution: a multi-objective optimization method for flight time slot allocation based on a distribution estimation algorithm, comprising the following steps:

步骤1，基于效率性和公平性建立时隙分配多目标模型；Step 1, establish a multi-objective model of time slot allocation based on efficiency and fairness;

步骤2，基于时隙分配多目标模型，建立对应马尔科夫网络；Step 2, based on the time slot allocation multi-objective model, establish a corresponding Markov network;

步骤3，建立概率矩阵；Step 3, establish a probability matrix;

步骤4，采用新的适应度函数，计算并找出当代帕累托解；Step 4, using the new fitness function, calculate and find the contemporary Pareto solution;

步骤5，根据当代帕累托解，更新信任解集存档；Step 5, according to the contemporary Pareto solution, update the trust solution set archive;

步骤6，学习马尔科夫网络，更新概率矩阵；Step 6, learn the Markov network and update the probability matrix;

步骤7，根据概率矩阵和马尔科夫网络进行采样，生成新解；Step 7: Sampling according to the probability matrix and the Markov network to generate a new solution;

步骤8，达到终止条件，获得最终结果。Step 8, the termination condition is reached, and the final result is obtained.

本发明中，步骤1包括：In the present invention, step 1 includes:

步骤1-1，建立如下多目标函数：Step 1-1, establish the following multi-objective function:

其中，N表示航班总数，M表示时隙总数，c_ij表示航班i被分配给时隙j时的延误成本，x_ij表示决策变量，F(x_ij)表示航班i被分配给时隙j时的公平性指标；公式(1)表示最小化延误成本和公平性差异指标的多目标函数。where N is the total number of flights, M is the total number of time slots, c _ij is the delay cost when flight i is assigned to time slot j, x _ij is a decision variable, and F(x _ij ) is when flight i is assigned to time slot j Equation (1) represents the multi-objective function that minimizes delay cost and fairness difference index.

步骤1-2，设定约束条件：Step 1-2, set constraints:

c_ij＝ω_i×(t_j-SLDT_i) (5)c _ij =ω _i ×(t _j -SLDT _i ) (5)

t_j≥SLDT_i,当x_ij＝1时 (6)t _j ≥SLDT _i , when x _ij =1 (6)

其中，ω_i表示航班i在延误时间方面的重要性权重，取值范围0.0～1.0；Q表示航空公司总数，N_q表示航空公司q的航班总数，t_j表示时隙j的开始时间；x_ij为决策变量，代表航班i是否被分配给时隙j，仅能取值0或者1；SLDT_i表示航班i的时刻表计划着陆时间，c_i ^*表示航班i最优情况的延误，d(x_ij)表示航班i被分配时隙j时和最优情况的差值，nd(x_ij)表示正则化后的d(x_ij)的差值，f_q(x_ij)表示航空公司q的平均正则化后的差值；公式(2)为问题的决策变量，公式(3)和(4)分别表示航班分配的约束条件和时隙分配的约束条件，公式(5)用于计算航班延误成本，公式(6)为航班时刻表约束，公式(7)和(8)计算实际时隙分配与最优分配下的差值并进行正则化，公式(9)和(10)计算所有参与时隙分配公司的方差，作为航空公司公平性指标。Among them, ω _i represents the importance weight of flight i in terms of delay time, ranging from 0.0 to 1.0; Q represents the total number of airlines, N _q represents the total number of flights of airline q, t _j represents the start time of time slot j; x _ij is a decision variable, representing whether flight i is allocated to time slot j, and can only take a value of 0 or 1; SLDT _i represents the scheduled landing time of flight i, c _i ^* represents the optimal delay of flight i, d( x _ij ) represents the difference between when flight i is assigned time slot j and the optimal situation, nd(x _ij ) represents the difference between the regularized d(x _ij ), and f _q (x _ij ) represents the difference of airline q Average regularized difference; formula (2) is the decision variable of the problem, formulas (3) and (4) represent the constraints of flight allocation and time slot allocation, respectively, and formula (5) is used to calculate flight delays Cost, formula (6) is the flight schedule constraint, formulas (7) and (8) calculate the difference between the actual time slot allocation and the optimal allocation and regularize it, and formulas (9) and (10) calculate all participating time slots. The variance of the gap-allocated companies is used as an indicator of airline fairness.

本发明中，步骤2包括：建立对应时隙分配多目标模型的马尔科夫网络，在建立的马尔科夫网络中，每个节点代表一个决策变量，即航班的时隙分配选择；节点X_i表示航班i的时隙选择，其取值范围D(X_i)＝{s₁,s₂…,s_M}，其中M为可用时隙总数，s_M表示第M个可用的时隙；在建立的马尔科夫网络中，两个节点之间的连线表示两个航班之间存在较强耦合性和航班时隙争夺。In the present invention, step 2 includes: establishing a Markov network corresponding to a multi-objective model of time slot allocation, and in the established Markov network, each node represents a decision variable, that is, the time slot allocation selection of flights; node X _i represents the time slot selection of flight i, and its value range D(X _i )={s ₁ , s ₂ ..., s _M }, where M is the total number of available time slots, and s _M represents the M-th available time slot; In the established Markov network, the connection between two nodes indicates that there is strong coupling between the two flights and competition for flight time slots.

本发明中，步骤3包括：建立如下概率矩阵：In the present invention, step 3 includes: establishing the following probability matrix:

P(t)表示迭代计算过程中第t代的概率矩阵；P(t) represents the probability matrix of the t-th generation in the iterative calculation process;

P_NM表示矩阵第N行第M列的值；P _NM represents the value of the Nth row and Mth column of the matrix;

矩阵的行i表示航班i分配到每个时隙j上的概率，合计为100％；Row i of the matrix represents the probability that flight i is assigned to each time slot j, which adds up to 100%;

矩阵的列j表示针对时隙j，每个航班i被分配到该时隙的概率，合计为100％；Column j of the matrix represents, for time slot j, the probability that each flight i is assigned to this time slot, which adds up to 100%;

初始化矩阵时，采用等概率进行赋值，针对时隙分配具体问题，存在公式(6)中的时间约束，制定如下规则：对每一行i，在M个时隙中，将早于SLDT_i的时隙j的概率p初始值设置为0，将晚于SLDT_i的时隙j的概率p初始值设置为1/z，其中z为晚于SLDT_i的时隙数量。When initializing the matrix, equal probability is used for assignment. For the specific problem of time slot allocation, there is a time constraint in formula (6), and the following rules are formulated: For each row i, in M time slots, the time slot will be earlier than the time of SLDT _i . The initial value of probability p of slot j is set to 0, and the initial value of probability p of time slot j later than SLDT _i is set to 1/z, where z is the number of time slots later than SLDT _i .

本发明中，步骤4包括：In the present invention, step 4 includes:

步骤4-1，帕累托解归纳为两部分：边界部分和中间部分，边界部分是单目标优化部分，中间部分是多目标优化部分，针对边界部分，采用VEGA算法中的边界适应度函数来计算：Step 4-1, the Pareto solution is divided into two parts: the boundary part and the middle part. The boundary part is the single-objective optimization part, and the middle part is the multi-objective optimization part. For the boundary part, the boundary fitness function in the VEGA algorithm is used to calculate:

EdgeFitness_f(X)＝obj_value_f(X) (13)EdgeFitness _f (X)=obj_value _f (X) (13)

其中X表示任意解，EdgeFitness_f(X)表示解X在目标f的适应度函数，obj_value_f(X)表示解X在目标f上函数值；where X represents an arbitrary solution, EdgeFitness _f (X) represents the fitness function of the solution X on the target f, and obj_value _f (X) represents the function value of the solution X on the target f;

针对中间部分，采用如下中间适应度函数进行计算：For the middle part, the following middle fitness function is used for calculation:

其中X表示任意解，CentralFitness(X)表示解X在中间部分的适应度函数，p(X)和q(X)分别表示被X支配(dominate)的数量以及支配X的数量；where X represents an arbitrary solution, CentralFitness(X) represents the fitness function of the solution X in the middle part, p(X) and q(X) represent the quantity dominated by X and the quantity dominated by X, respectively;

所有帕累托解的中间部分适应度函数值都是大于等于1，同时，所有非帕累托解都小于1大于0；The fitness function value of the middle part of all Pareto solutions is greater than or equal to 1, and at the same time, all non-Pareto solutions are less than 1 and greater than 0;

根据公式(13)和(14)，分别计算出边界适应度和中间适应度；According to formulas (13) and (14), the boundary fitness and the intermediate fitness are calculated respectively;

步骤4-2，采用如下带有指数化权重参数的复合适应度函数来结合边界适应度和中间适应度函数：In step 4-2, the following composite fitness function with exponential weight parameters is used to combine the boundary fitness and the intermediate fitness function:

其中D-Fitness(X)是解X考虑复合中间部分适应度和边界部分适应度的综合适应度，m是目标总数，N_POP是人口总数，Ranking_EdgeFitnessi(X)是基于第i个目标的解X的排名，Ranking_{CentralFitness}(X)是针对X在中间适应度上的排名，ω_i和ω_m+1是指数化权重参数(例如双目标优化问题，m＝2，所以指数化权重参数有3个：ω₁，ω₂，ω₃)，，用来区分不同排名的得分，同时能够实现对一个目标的偏好。where D-Fitness(X) is the comprehensive fitness of the solution X considering the composite middle part fitness and boundary part fitness, m is the total number of targets, N_POP is the total population, and Ranking _EdgeFitnessi (X) is the solution X based on the ith target The ranking of , Ranking _{CentralFitness} (X) is the ranking of X on the intermediate fitness, ω _i and ω _m+1 are the indexed weight parameters (for example, dual-objective optimization problem, m=2, so there are 3 indexed weight parameters : ω ₁ , ω ₂ , ω ₃ ), which are used to distinguish the scores of different rankings, and at the same time can realize the preference for a target.

本发明中，步骤5包括：在数据库中建立一个存档用于存放信任解，作为信任解集存档，容量和每一次迭代中的人口数量N_POP相同，或者小于N_POP；例如取值N_POP×1/2。In the present invention, step 5 includes: establishing an archive in the database for storing trust solutions, as a trust solution set archive, the capacity is the same as the population number N_POP in each iteration, or less than N_POP; for example, the value is N_POP×1/2 .

通过公式(15)计算获得本代的所有解的适应度函数值，并根据适应度函数值由高到低选择α％的解存入信任解集存档，从上一代信任解集存档中根据适应度函数值由高到低选择1-α％的信任解存入本代的信任解集存档。Calculate the fitness function value of all solutions of the current generation by formula (15), and select α% solutions from high to low according to the fitness function value and store them in the trust solution set archive. The degree function value from high to low selects 1-α% of the trust solutions and stores them in the current generation's trust solution set archive.

本发明中，步骤6包括：In the present invention, step 6 includes:

步骤6-1，学习马尔科夫网络：Step 6-1, learn Markov network:

通过公式(16)，计算变量X_i和Y_j的互信息量MI(X_i；Y_j)：By formula (16), the mutual information MI(X _i ; Y _j ) of the variables X _i and Y _j is calculated:

其中，X_i和Y_j表示两个随机变量，时隙分配问题中，变量具体代表的是一个航班的时隙分配；p(x_i|D)和p(y_j|D)分别表示基于可信解集合D的变量X_i＝x_i的概率和Y_j＝y_j的概率，p(x_i,y_j|D)是X_i＝x_i和Y_j＝y_j的联合概率；Among them, X _i and Y _j represent two random variables. In the time slot allocation problem, the variables specifically represent the time slot allocation of a flight; p( _xi |D) and p(y _j |D) represent The probability of variables X _i = _xi and Y _j =y _j of the set D of belief solutions, p(x _i ,y _j |D) is the joint probability of X _i = _xi and Y _j =y _j ;

如果两个变量的互信息量大于给定阈值threshold(例如可以取值为0.5)，则判定两变量之间存在较强关联，在马尔科夫网络中建立连接线并互称为邻居；阈值threshold由用户设定，或者在计算过程中，根据已知的互信息量来动态更新阈值，如公式(17)所示：If the mutual information of the two variables is greater than a given threshold threshold (for example, the value can be 0.5), it is determined that there is a strong correlation between the two variables, and a connection line is established in the Markov network and called neighbors to each other; the threshold threshold The threshold is set by the user, or dynamically updated according to the known mutual information during the calculation process, as shown in formula (17):

其中，MI表示两个变量X_i和Y_j的互信息量，NDV表示变量的总量，参数β用于控制结构复杂度(通常可设置为0.5-1.5之间)；Among them, MI represents the mutual information of the two variables X _i and Y _j , NDV represents the total amount of variables, and the parameter β is used to control the structural complexity (usually can be set between 0.5-1.5);

步骤6-2，更新概率矩阵：Step 6-2, update the probability matrix:

变量X_i在其邻居N_i的概率矩阵，如公式(18)所示：The probability matrix of variable X _i in its neighbor N _i , as shown in equation (18):

其中P(X_i|N_i)表示变量X_i在所有其邻居N_i的条件概率矩阵，{x_i,1，…，x_i,j，…，x_i,Yi}分别表示变量X_i能够被赋予的值，在时隙分配中即对应每一个待分配的时隙编号，x_i,Yi表示变量X_i第Y_i个能够被赋予的值，Y_i表示变X_i取值的最大编号，x_i,Yi即为X_i能够被取值的最大值，时隙分配问题中，即为最晚的一个时隙；p(x_i,Valuej|n_i,Neighbork)表示变量X_i取值Value_j时，在其所有邻居联合取值场景为Neighbor_k下的条件概率；Z_i表示邻居集合N_i的联合取值场景的最大编号；where P(X _i |N _i ) represents the conditional probability matrix of variable X _i in all its neighbors N _i , {x _i,1 ,..., _xi,j ,..., _xi,Yi }respectively represent that variable X _i can The assigned value corresponds to the number of each time slot to be assigned in the time slot allocation, x _{i, Yi} represent the value that can be assigned to the variable X _i , and Y _i _{represents the maximum number of the variable X i} _value . , x _{i, Yi} is the maximum value that X _i can take, and in the time slot allocation problem, it is the latest time slot; p( _xi,Valuej |n _i,Neighbork ) represents the value of the variable X _i When Value _j , the conditional probability when all its neighbors jointly take the value scene is Neighbor _k ; Z _i represents the maximum number of the joint value scene of the neighbor set _Ni ;

对于公式(12)和(18)中的概率，首先计算当前代的边界概率P_t ^*(X＝x)：For the probabilities in equations (12) and (18), first calculate the boundary probability P _t ^* (X=x) of the current generation:

其中，N(X＝x)表示在可信任解中变量X取值x的数量，N(D)表示可信任解的总量；通过公式(20)更新概率模型：Among them, N(X=x) represents the number of values x of the variable X in the trusted solution, and N(D) represents the total number of trusted solutions; the probability model is updated by formula (20):

P_t(X＝x)＝(1-λ)×P_t-1(X＝x)+λ×P_t ^*(X＝x) (20)P _t (X=x)=(1-λ)×P _t-1 (X=x)+λ×P _t ^* (X=x) (20)

其中P_t(X＝x)和P_t-1(X＝x)分别表示在迭代计算过程中第t代时变量X取x值的概率和第t-1代时变量X取x值的概率，λ是从当前代的学习率，如果λ＝1，则表示概率模型将完全从当前代信任解集存档中学习；Among them, P _t (X=x) and P _t-1 (X=x) represent the probability that the variable X takes the value of x in the t-th generation and the probability that the variable X takes the value of x in the t-1 generation in the iterative calculation process, respectively. , λ is the learning rate from the current generation, if λ=1, it means that the probability model will be completely learned from the current generation trust solution set archive;

步骤6-3，在通过公式(20)更新概率模型之后，通过变异概率(例如5％)执行公式(21)的变异操作：Step 6-3, after the probability model is updated by the formula (20), the mutation operation of the formula (21) is performed by the mutation probability (for example, 5%):

P_t(X＝x)＝min(P_t(X＝x)+θ,1-ε) (21)P _t (X=x)=min(P _t (X=x)+θ,1-ε) (21)

其中θ表示变异值(例如取0.2)，ε是一个很小的正数(例如取0.01)。where θ represents the variation value (eg 0.2) and ε is a small positive number (eg 0.01).

本发明中，步骤7包括：In the present invention, step 7 includes:

步骤7-1，随机生成一个可行解s＝(s₁,s₂,…,s_n)，s_n为最后一个变量X_n的取值，时隙分配问题中s₁,s₂,…,s_n代表每个航班的时隙选择，s_n表示第n架航班的时隙选择，可行解s生成方法如下：将所有航班根据航班时刻表排列，从第一个航班x₁开始，从所有剩余时隙中，随机选择一个时隙s₁分配给x₁，要求s₁不早于x₁的计划起飞时间，并且时隙s₁没有被其他航班选择，删除时隙s₁，移动到下一个航班x₂，用同样的方法随机选择，直至所有航班选择结束；Step 7-1, randomly generate a feasible solution s=(s ₁ ,s ₂ ,…,s _n ), where s _n is the value of the last variable X _n , in the time slot allocation problem s ₁ ,s ₂ ,…, s _n represents the time slot selection of each flight, and s _n represents the time slot selection of the nth flight. The feasible solution s is generated as follows: Arrange all flights according to the flight schedule, starting from the first flight x ₁ , starting from all flights In the remaining time slots, randomly select a time slot s ₁ and assign it to x ₁ . It is required that s ₁ is not earlier than the planned departure time of x ₁ , and the time slot s ₁ is not selected by other flights. Delete the time slot s ₁ and move to the next One flight x ₂ , randomly selected in the same way, until all flight selection ends;

步骤7-2，将解s＝(s₁,s₂,…,s_n)顺序随机打乱，得到新的顺序s₁ ^*,s₂ ^*,…,s_n ^*，其中s_n ^*表示新的航班排序中最后一个航班的时隙选择；需要说明的是：此时的航班以及时隙对应关系并没有改变，只是排列的次序发生变化；Step 7-2, randomly shuffle the order of the solution s=(s ₁ ,s ₂ ,...,s _n ) to obtain a new order s ₁ ^* ,s ₂ ^* ,...,s _n ^* , where _sn ^* represents a new order The time slot selection of the last flight in the flight sequence of ; it should be noted that: the corresponding relationship between the flight and the time slot at this time has not changed, but the order of the arrangement has changed;

步骤7-3，从s₁ ^*开始，设定s₁ ^*对应航班h，根据建立的马尔科夫网络找出航班h变量的所有邻居，并根据此时解s＝(s₁,s₂,…,s_n)中邻居的取值以及对应的条件概率p(x_h|N_h)来决定航班h选择各个时隙的概率，再按照概率分布，采用轮盘赌选择算法(Roulette WheelSelection，即根据各种取值的概率随机选择，某个值的概率越高，则被选中的可能性越大)生成s₁ ^*的新值；然后移动到s₂ ^*，采用同样方法再计算s₂ ^*的新值，直到完成所有变量，则完成了一个解的采样；Step 7-3, starting from s ₁ ^* , set s ₁ ^* corresponding to flight h, find out all the neighbors of flight h variable according to the established Markov network, and solve s=(s ₁ ,s ₂ , ...,s _n ) and the corresponding conditional probability p(x _h |N _h ) to determine the probability of selecting each time slot for flight h, and then according to the probability distribution, the roulette wheel selection algorithm (Roulette WheelSelection) is used, namely Randomly select according to the probability of various values, the higher the probability of a certain value, the greater the possibility of being selected) to generate a new value of s ₁ ^* ; then move to s ₂ ^* , and use the same method to calculate s ₂ ^* The new value of , until all variables are completed, the sampling of a solution is completed;

步骤7-4，重复步骤7-1至步骤7-3的方法，完成一个群体N_POP个解的采样，最终形成当前代的解集。Step 7-4, repeat the method from step 7-1 to step 7-3, complete the sampling of N_POP solutions of a population, and finally form the solution set of the current generation.

有益效果：与现有方法相比，本发明的优点在于：Beneficial effect: Compared with the existing method, the advantages of the present invention are:

采用分布估计算法，相比遗传算法等启发式算法，全局搜索能力和收敛速度有所提高；Using distribution estimation algorithm, compared with heuristic algorithms such as genetic algorithm, the global search ability and convergence speed are improved;

采用马尔科夫网络结构，更好描述和发掘变量关系，提供更多辅助决策能力；The Markov network structure is adopted to better describe and explore the relationship of variables and provide more auxiliary decision-making capabilities;

采用基于指数化权重参数适应度函数，实现低计算复杂度和搜索灵活性。A fitness function based on exponential weight parameters is adopted to achieve low computational complexity and search flexibility.

本发明方法在确保机场容量约束前提下，充分利用时隙，实现延误成本和公平性指标的多目标优化，并满足计算时间复杂度要求。在优化过程中，给出时隙分配解决方案的同时，还对航班之间关联性进行挖掘，为辅助决策提供数据支持。Under the premise of ensuring airport capacity constraints, the method of the invention makes full use of time slots, realizes multi-objective optimization of delay cost and fairness index, and meets the requirements of computational time complexity. In the optimization process, the time slot allocation solution is given, and the correlation between flights is also mined to provide data support for auxiliary decision-making.

附图说明Description of drawings

下面结合附图和具体实施方式对本发明做更进一步的具体说明，本发明的上述或其他方面的优点将会变得更加清楚。The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments, and the advantages of the above or other aspects of the present invention will become clearer.

图1是基于马尔科夫网络的多目标分布估计算法流程。Figure 1 is the flow of the multi-objective distribution estimation algorithm based on Markov network.

图2是分布估计算法流程。Figure 2 is the flow of the distribution estimation algorithm.

图3是马尔科夫网络结构示例。Figure 3 is an example of a Markov network structure.

图4是包含边界部分和中间部分的帕累托解。Figure 4 is a Pareto solution containing a boundary part and an intermediate part.

图5是针对中间部分的适应度函数。Figure 5 is the fitness function for the middle part.

图6a是最小化问题的示例。Figure 6a is an example of a minimization problem.

图6b是带有指数化权重参数的函数。Figure 6b is a function with an exponential weight parameter.

图7是更新存档候选解集。Figure 7 is an update archive candidate solution set.

图8是实验验证结果(部分)。FIG. 8 is the experimental verification result (part).

图9是基于分布估计算法的航班时隙分配多目标优化方法的流程图。FIG. 9 is a flowchart of a multi-objective optimization method for flight slot allocation based on a distribution estimation algorithm.

具体实施方式Detailed ways

下面结合附图及实施例对本发明做进一步说明。The present invention will be further described below with reference to the accompanying drawings and embodiments.

如图1所示，是基于马尔科夫网络的多目标分布估计算法流程示意图。算法是基于种群进化理论(Population-to-Population)，每一代产生新的候选解，并联合上一代选择出的信任解，通过适应度函数进行多目标帕累托选择，以此更新信任解。然后基于新得到的信任解，通过分布估计算法中概率模型估计和采样，得到下一代的候选解，一直循环直到终止条件出现。其中需要说明的是，马尔科夫网络引入是和分布估计算法做结合，增强概率模型的估计和采样过程。As shown in Figure 1, it is a schematic flowchart of the multi-objective distribution estimation algorithm based on Markov network. The algorithm is based on the Population-to-Population theory. Each generation generates new candidate solutions, and combines the trust solutions selected by the previous generation to perform multi-objective Pareto selection through the fitness function to update the trust solutions. Then, based on the newly obtained trust solution, the next generation of candidate solutions is obtained through probability model estimation and sampling in the distribution estimation algorithm, and the cycle continues until the termination condition occurs. It should be noted that the introduction of the Markov network is combined with the distribution estimation algorithm to enhance the estimation and sampling process of the probability model.

图2是采用的分布估计算法流程示意图。其中重点是采样生成新解，和估算概率模型两个方面。FIG. 2 is a schematic flow chart of the distribution estimation algorithm adopted. The focus is on sampling to generate new solutions and estimating probabilistic models.

如图9所示，本发明具体过程如下：As shown in Figure 9, the concrete process of the present invention is as follows:

步骤1：基于效率性和公平性建立时隙分配多目标模型。Step 1: Establish a multi-objective model of slot allocation based on efficiency and fairness.

(1)参数(1) Parameters

参数名称参数说明Parameter name Parameter description

N 航班总数N total number of flights

M 时隙总数M total number of slots

Q 航空公司总数Q Total number of airlines

i 航班i，i＝1,…,Ni flight i, i=1,...,N

j 时隙j，j＝1,…,Mj time slot j, j=1,...,M

q 航空公司q，q＝1,…,Qq airline q, q=1,...,Q

N_q 航空公司q的航班总数Total number of flights from N _q airline q

c_ij 航班i被分配给时隙j时的延误成本c _ij delay cost when flight i is assigned to slot j

x_ij 航班i是否被分配给时隙jIs x _ij flight i assigned to slot j?

F(x_ij) 航班i被分配给时隙j时的公平性指标F(x _ij ) fairness index when flight i is assigned to slot j

t_j 时隙j的开始时间t _j start time of slot j

SLDT_i 航班i的时刻表计划着陆时间Schedule of SLDT _i flight i planned landing time

c_i ^* 航班i最优情况的延误c _i ^* flight i optimal-case delay

d(x_ij) 航班i被分配时隙j时，和最优情况的差值d(x _ij ) when flight i is assigned time slot j, the difference from the optimal case

nd(x_ij) 正则化后的d(x_ij)差值nd(x _ij ) normalized d(x _ij ) difference

f_q(x_ij) 航空公司q的平均正则化后的差值f _q (x _ij ) average regularized difference of airline q

ω_i 航班i在延误时间方面的重要性权重ω _i The importance weight of flight i in terms of delay time

(2)多目标函数(2) Multi-objective function

(3)约束条件(3) Constraints

c_ij＝ω_i×(t_j-SLDT_i) (5)c _ij =ω _i ×(t _j -SLDT _i ) (5)

t_j≥SLDT_i,当x_ij＝1时 (6)t _j ≥SLDT _i , when x _ij =1 (6)

其中，公式(1)表示最小化延误成本和公平性差异指标的多目标函数。公式(2)中x_ij为决策变量，代表是否航班i被分配给时隙j，仅能取值0或者1。公式(3)和(4)确保对于任何一个待分配的航班，有且仅有一个时隙分配给它，而对于任何一个待分配的时隙，最多仅可以被分配给一个航班。公式(5)表示根据分配的时隙的开始时间和时刻表时间的差值，计算延误成本。公式(6)保证对于任何一个航班，分配的时隙必须要等于或者晚于时刻表时间。公式(7)和(8)计算分配方案下延误成本与最优情况的延误成本的差值并正则化。公式(9)和(10)计算所有参与时隙分配公司的方差，作为航空公司公平性指标。Among them, formula (1) represents the multi-objective function that minimizes the delay cost and fairness difference index. In formula (2), x _ij is a decision variable, representing whether flight i is allocated to time slot j, and can only take a value of 0 or 1. Equations (3) and (4) ensure that for any flight to be allocated, there is one and only one time slot allocated to it, and for any time slot to be allocated, at most one flight can be allocated. Equation (5) indicates that the delay cost is calculated according to the difference between the start time of the allocated time slot and the time of the schedule. Equation (6) guarantees that for any flight, the allocated time slot must be equal to or later than the schedule time. Equations (7) and (8) calculate and regularize the difference between the delay cost under the allocation scheme and the delay cost in the optimal case. Equations (9) and (10) calculate the variance of all participating companies in slot allocation as an airline fairness indicator.

步骤2：基于时隙分配问题，建立对应马尔科夫网络。Step 2: Based on the time slot allocation problem, establish the corresponding Markov network.

马尔科夫网络为一网状无向图(或称为双向图)，由结构G和参数Ψ组成。结构G中包含节点Node和连接边E，因此一个马尔科夫网络可以表示为：MN＝{G,Ψ}＝{Node,E,Ψ}。如图3所示，马尔科夫网络中含有5个变量(x₁-x₅)，变量间的连线代表这两个变量间存在某种强关联，互相之间被称为邻居。例如变量X₁含有2个邻居X₂和X₃；X₂含有3个邻居，X₁，X₃和X₄。A Markov network is a meshed undirected graph (or bidirectional graph), which consists of a structure G and a parameter Ψ. Structure G contains node Node and connecting edge E, so a Markov network can be expressed as: MN={G,Ψ}={Node,E,Ψ}. As shown in Figure 3, the Markov network contains 5 variables (x ₁ -x ₅ ), and the connection between the variables indicates that there is a strong correlation between these two variables, and they are called neighbors. For example, variable X ₁ contains 2 neighbors X ₂ and X ₃ ; X ₂ contains 3 neighbors, X ₁ , X ₃ and X ₄ .

变量X_i(D(X_i)＝{x_i ¹,…,x_i ⁿⁱ})的取值，(x_i ⁿⁱ表示X_i能取的最大值)是由其所有邻居的条件概率计算得来的，如公式(11)所示。The value of variable X _i (D(X _i )={x _i ¹ ,...,x _i ⁿⁱ }), (x _i ⁿⁱ represents the maximum value that X _i can take) is calculated from the conditional probability of all its neighbors , as shown in formula (11).

p(x_i|x-{x_i})＝p(x_i|N_i) (11)p(x _i |x-{x _i })=p(x _i |N _i ) (11)

其中，N_i是变量X_i的邻居的集合，p(x_i|x-{x_i})表示变量X_i在其他所有变量联合取值下的条件概率，p(x_i|N_i)表示变量X_i在它所有邻居联合取值下的条件概率。Among them, _{Ni is the set of neighbors of variable Xi, p(x i} _| x-{x _i }) _represents the conditional probability of variable Xi under the joint value of all other variables, p( _xi |N _i ₎ represents Conditional probability of variable X _i under the joint value of all its neighbors.

针对时隙分配优化问题，此步骤负责建立对应马尔科夫网络，包含网络中各节点的定义和取值范围。For the optimization problem of time slot allocation, this step is responsible for establishing the corresponding Markov network, including the definition and value range of each node in the network.

在马尔科夫网络中，每个节点代表一个决策变量，具体来说就是航班的时隙分配选择。节点X_i表示航班i的时隙选择，其取值范围D(X_i)＝{s₁,s₂…,s_M}，即所有的待分配时隙。两个节点之间的连线表示两个航班之间存在较强耦合性和资源(航班时隙)争夺，作用是一方面为进化过程提供邻居信息，计算概率分布；另一方面为后续例如时隙交换等操作提供一定依据。In a Markov network, each node represents a decision variable, specifically the choice of time slot allocation for flights. Node X _i represents the time slot selection of flight i, and its value range is D(X _i )={s ₁ , s ₂ . . . , s _M }, that is, all time slots to be allocated. The connection between the two nodes indicates that there is a strong coupling and competition for resources (flight time slots) between the two flights. On the one hand, the function is to provide neighbor information for the evolution process and calculate the probability distribution; Gap exchange and other operations provide a certain basis.

步骤3：建立概率矩阵。Step 3: Build the probability matrix.

此时概率矩阵是基于上步骤中的马尔科夫网络的。概率模型如公式(12)所示，P(t)表示迭代计算过程中第t代的概率矩阵名称，为一N×M的矩阵。行i表示航班i分配到每个时隙j上的概率，合计为100％。以列j来看，表示针对时隙j，每个航班i被分配到该时隙的概率，合计也为100％。At this time, the probability matrix is based on the Markov network in the previous step. The probability model is shown in formula (12), where P(t) represents the name of the probability matrix of the t-th generation in the iterative calculation process, which is an N×M matrix. Row i represents the probability of flight i being assigned to each time slot j, which adds up to 100%. Viewed from column j, it represents the probability that for time slot j, each flight i is assigned to this time slot, and the total is also 100%.

初始化时，一般采用等概率进行赋值。针对时隙分配具体问题，存在公式(6)中的时间约束，同时为了加快采样和搜索的效率，制定如下规则：对每一行i来说，在M个时隙中，将早于SLDT_i的时隙j的概率p初始值设置为0，将晚于SLDT_i的时隙j的概率p初始值设置为1/Q，其中Q为晚于SLDT_i的时隙数量。During initialization, the assignment is generally performed with equal probability. For the specific problem of time slot allocation, there is a time constraint in formula (6). At the same time, in order to speed up the efficiency of sampling and searching, the following rules are formulated: For each row _i , in M time slots, the The initial value of probability p of time slot j is set to 0, and the initial value of probability p of time slot j later than SLDT _i is set to 1/Q, where Q is the number of time slots later than SLDT _i .

步骤4：采用新的适应度函数，计算并找出当代帕累托解。Step 4: Using the new fitness function, compute and find the contemporary Pareto solution.

数学模型中包含两个目标需要同时进行优化，设计出一套高效率的适应度函数来完成解的搜索是十分重要的。在不同的多目标优化算法中，适应度函数对于搜索效率和效果起到相当大的作用。本发明提出一种带有指数化权重参数的适应度函数，能够提高搜索效率的同时，带来更高的灵活性。The mathematical model contains two objectives that need to be optimized at the same time, and it is very important to design a set of efficient fitness functions to complete the solution search. In different multi-objective optimization algorithms, the fitness function plays a considerable role in the search efficiency and effect. The invention proposes a fitness function with an exponential weight parameter, which can improve the search efficiency and bring about higher flexibility.

如图4所示，以最小化多目标问题为例，帕累托解归纳为两部分：边界部分(单目标优化部分)和中间部分(多目标优化部分)，针对边界部分，采用VEGA算法中的边界适应度函数来计算：As shown in Figure 4, taking the minimization multi-objective problem as an example, the Pareto solution is divided into two parts: the boundary part (single-objective optimization part) and the middle part (multi-objective optimization part). The boundary fitness function to calculate:

EdgeFitness_f(X)＝obj_value_f(X) (13)EdgeFitness _f (X)=obj_value _f (X) (13)

其中X表示任意解，EdgeFitness_f(X)表示解X在目标f的适应度函数，obj_value_f(X)表示解X在目标f上函数值。Where X represents an arbitrary solution, EdgeFitness _f (X) represents the fitness function of the solution X on the target f, and obj_value _f (X) represents the function value of the solution X on the target f.

从图5中可以看出，公式(14)给出的适应度函数能够很好的区分出是否属于帕累托集合。所有支配解(帕累托解)的CentralFitness值都是大于等于1，而且支配其他解越多，数值越高。同时，所有非支配解都小于1大于0。因此，数值1可以清楚地分辨一个解是否为支配解。It can be seen from Figure 5 that the fitness function given by formula (14) can well distinguish whether it belongs to a Pareto set. The CentralFitness values of all dominant solutions (Pareto solutions) are greater than or equal to 1, and the more dominant other solutions, the higher the value. Meanwhile, all non-dominated solutions are less than 1 and greater than 0. Therefore, a value of 1 can clearly distinguish whether a solution is the dominant solution.

根据公式(13)和(14)，可以分别计算出边界适应度和中间适应度，而且计算复杂度非常低。According to formulas (13) and (14), the boundary fitness and the intermediate fitness can be calculated respectively, and the computational complexity is very low.

下一步为了能够获得最终解，需要综合考虑上述两个适应度函数。有两种方案，一种是采用多岛(Multi-island)优化，即部分种群依照边界适应度函数优化，另外一部分种群依照中间适应度优化。这种方法执行效率高，但带来的问题是很难最终形成统一解以及帕累托解的最优性不高。In the next step, in order to obtain the final solution, it is necessary to comprehensively consider the above two fitness functions. There are two schemes, one is to use multi-island optimization, that is, some populations are optimized according to the boundary fitness function, and the other part of the population is optimized according to the intermediate fitness. This method has high execution efficiency, but the problem is that it is difficult to finally form a unified solution and the optimality of the Pareto solution is not high.

算法中采用带有指数化权重参数的函数来结合边界适应度和中间适应度函数，如公式(15)所示：In the algorithm, a function with an exponential weight parameter is used to combine the boundary fitness and the intermediate fitness function, as shown in formula (15):

其中D-Fitness(X)是解X考虑复合中间部分适应度和边界部分适应度的综合适应度，m是目标总数(双目标问题，m＝2)，N_POP是人口总数，Ranking_EdgeFitnessi(X)是基于第i个目标的X的排名(EdgeFitness越高，排名越靠前)，Ranking_{CentralFitness}(X)是针对X在中间适应度上的排名(CentralFitness越高，排名越靠前)，ω_i和ω_m+1是指数化权重参数(例如双目标优化问题，m＝2，所以指数化权重参数有3个：ω₁，ω₂，ω₃)，，用来区分不同排名的得分，同时可以实现对某一个目标的偏好。where D-Fitness(X) is the comprehensive fitness of the solution X considering the fitness of the composite middle part and the fitness of the boundary part, m is the total number of targets (dual target problem, m=2), N_POP is the total population, Ranking _EdgeFitnessi (X) is the ranking of X based on the ith target (the higher the EdgeFitness, the higher the ranking), the Ranking _{CentralFitness} (X) is the ranking of X in the middle fitness (the higher the CentralFitness, the higher the ranking), ω _i and ω _m+1 is an indexed weight parameter (for example, a dual-objective optimization problem, m=2, so there are 3 indexed weight parameters: ω ₁ , ω ₂ , ω ₃ ), which is used to distinguish the scores of different rankings. To achieve a preference for a certain goal.

以图6a和图6b为例说明指数化权重参数。设定问题为最小化问题，在图6a中，考量黑色解和白色解，黑色解为帕累托解，而白色解不是。黑色解的特征是在某一个维度上非常优秀(边界，f₂)，白色解虽然在边界f₂这个维度上比黑色解略微差一些，但是其他维度上都要比黑色解好。帕累托解只需在某一个维度中非常“杰出”的，而不是在每个维度都“不错”的。因为为了能够明显区分并且快速找寻这样的帕累托解，图6b是带有指数化权重参数的函数，即和常见的线性关系相比，在某一维度一个解的排名越靠前，它的得分(适应度函数值)比其他解高得越多。The exponentiation weight parameter is illustrated by taking Fig. 6a and Fig. 6b as an example. The problem is assumed to be a minimization problem. In Figure 6a, the black solution and the white solution are considered. The black solution is a Pareto solution, while the white solution is not. The characteristic of the black solution is that it is very good in one dimension (boundary, f ₂ ). Although the white solution is slightly worse than the black solution in the dimension of the boundary f ₂ , it is better than the black solution in other dimensions. A Pareto solution only needs to be "excellent" in one dimension, not "good" in every dimension. Because in order to clearly distinguish and quickly find such a Pareto solution, Figure 6b is a function with an exponential weight parameter, that is, compared with the common linear relationship, the higher the ranking of a solution in a certain dimension, its The score (fitness function value) is much higher than the other solutions.

同时，采用指数化权重参数还有一个非常重要的作用：实现搜索的灵活性。类似于权重的概念，如果某一个维度出现在帕累托解中的数量不足，则需要加大对该维度的发掘。通过增大对应的指数化权重参数，对该维度从最终适应度函数值的计算上侧重，“鼓励”该维度能够出现更多的优秀解出现，也提高了帕累托解的分布性。At the same time, the use of the exponential weight parameter also has a very important role: to achieve the flexibility of the search. Similar to the concept of weight, if the number of a certain dimension in the Pareto solution is insufficient, it is necessary to increase the exploration of this dimension. By increasing the corresponding exponential weight parameter, the dimension is focused on the calculation of the final fitness function value, "encouraging" this dimension to appear more excellent solutions, and improving the distribution of Pareto solutions.

步骤5：根据当代帕累托解，更新信任解集存档。Step 5: Update the trust solution set archive according to the contemporary Pareto solution.

如图7所示，基于本代的候选解(Candidate Solutions)和上代的信任解(Promising Data)，通过上步骤中的复合适应度函数，两部分中都有一部分解成为帕累托解。信任解集存档(Archive)用于存放更新后的信任解。本算法中采用固定大小的方式，其的优势是有更多的解用于之后的概率估算，防止出现过早成熟、陷入局部最优的问题。对于选取哪些候选解作为非帕累托解进入存档中，为了保证搜索的灵活性、解的多样性和存档的稳定性，通过公式(15)计算获得本代的所有解的适应度函数值，并根据适应度函数值由高到低选择α％的解存入信任解集存档，从上一代信任解集存档中根据适应度函数值由高到低选择(1-α％)的信任解存入本代的信任解集存档。As shown in Figure 7, based on the candidate solutions of the current generation (Candidate Solutions) and the trust solutions of the previous generation (Promising Data), through the composite fitness function in the previous step, some of the solutions in both parts become Pareto solutions. The trust solution archive (Archive) is used to store the updated trust solution. The fixed-size method is adopted in this algorithm, and its advantage is that there are more solutions for the subsequent probability estimation, which prevents the problem of premature maturity and falling into local optimum. For which candidate solutions are selected as non-Pareto solutions to enter the archive, in order to ensure the flexibility of search, the diversity of solutions and the stability of archives, the fitness function value of all solutions of the current generation is calculated by formula (15), And according to the fitness function value from high to low, select α% solution and save it into the trust solution set archive, and select (1-α%) trust solution from the previous generation trust solution set archive according to the fitness function value from high to low. Enter the trust resolution archive of this generation.

步骤6：学习马尔科夫网络，更新概率矩阵。Step 6: Learn the Markov network and update the probability matrix.

本算法和传统的分布估计算法不同的是引入了马尔科夫网络，因此在分布估计时，除了概率矩阵需要更新，马尔科夫网络的结构也需要重新学习。下面从两个方面分别说明：The difference between this algorithm and the traditional distribution estimation algorithm is that the Markov network is introduced. Therefore, in the distribution estimation, in addition to the probability matrix needs to be updated, the structure of the Markov network also needs to be re-learned. The following is an explanation from two aspects:

(1)学习马尔科夫网络(1) Learning Markov Networks

马尔科夫网络中两个节点之间的联系表示两个变量之间具有较强的关联性(非因果关系)，并且变量的赋值也是基于和它存在关联的“邻居”的，因此马尔科夫网络的学习过程对于本算法是很重要的一个环节。The connection between two nodes in the Markov network indicates that there is a strong correlation (non-causal relationship) between the two variables, and the assignment of the variable is also based on the "neighbors" associated with it, so Markov The learning process of the network is an important part of this algorithm.

一种建立网络结构的方法是由问题专家负责，但是对于大部分问题很难找到专家能够对变量之间的关联性给出可信意见。本算法采用一种条件独立测试(ConditionalIndependence Test)——互信息(Mutual Information，MI)来估算马尔科夫网络结构。互信息的优势在于使用简单并且计算快速。One way to build the network structure is for problem experts to be responsible, but for most problems it is difficult to find experts who can give credible opinions on the correlations between variables. This algorithm uses a conditional independence test (ConditionalIndependence Test)-Mutual Information (MI) to estimate the structure of Markov network. The advantage of mutual information is that it is simple to use and fast to compute.

其中，X_i和Y_j表示两个随机变量(时隙分配问题中，变量具体代表的是某个航班的时隙分配)，p(x_i|D)和p(y_j|D)是基于可信解集合D的变量X_i＝x_i的概率和Y_j＝y_j的概率，p(x_i,y_j|D)是X_i＝x_i和Y_j＝y_j的联合概率；Among them, X _i and Y _j represent two random variables (in the time slot allocation problem, the variables specifically represent the time slot allocation of a flight), and p(x _i |D) and p(y _j |D) are based on The probability of variables X _i = _xi and Y _j =y _j of the trusted solution set D, p(x _i ,y _j |D) is the joint probability of X _i = _xi and Y _j =y _j ;

如果两个变量的互信息量大于给定阈值threshold(例如取值0.5)，则判定两变量之间存在较强关联，在马尔科夫网络中建立连接线并互称为邻居；阈值threshold由用户设定，或者在计算过程中，根据已知的互信息量来动态更新阈值，如公式(17)所示：If the mutual information of the two variables is greater than the given threshold threshold (for example, the value is 0.5), it is determined that there is a strong correlation between the two variables, and a connection line is established in the Markov network and called each other as neighbors; the threshold threshold is determined by the user Set, or dynamically update the threshold value according to the known mutual information during the calculation process, as shown in formula (17):

其中，X_i和Y_j表示两个随机变量，MI表示两个变量X_i和Y_j的互信息量，NDV表示变量的总量，参数β用于控制结构复杂度(通常可设置为0.5-1.5之间)。如果β被设置为一较高值，则阈值较高，意味着马尔科夫网络中拥有较少的连线和较短的计算时间，但会损失一部分关联信息。如果β被设置为一较低值，则阈值变低，意味着马尔科夫网络中拥有较多的连线和更多的关联信息被保留，有助于提高搜索的质量，但是需要较长的计算时间。因此，参数β可以用来控制搜索质量和效率。Among them, X _i and Y _j represent two random variables, MI represents the mutual information of the two variables X _i and Y _j , NDV represents the total amount of variables, and the parameter β is used to control the structural complexity (usually can be set to 0.5- 1.5). If β is set to a higher value, the threshold is higher, which means that there are fewer connections and shorter computation time in the Markov network, but some associated information is lost. If β is set to a lower value, the threshold becomes lower, which means that there are more connections in the Markov network and more related information is preserved, which helps to improve the quality of the search, but requires a longer time calculating time. Therefore, the parameter β can be used to control the search quality and efficiency.

结构学习往往需要很长时间，因此为了搜索效率，并不需要每一代都进行学习。例如可以设置为每10代进行结构学习一次(概率矩阵每一代都需要更新)。Structural learning tends to take a long time, so for search efficiency, learning does not need to be done every generation. For example, it can be set to perform structure learning every 10 generations (probability matrix needs to be updated every generation).

(2)更新概率矩阵(2) Update the probability matrix

在公式(12)中，概率矩阵表示为每个变量的独立概率。同时由于在马尔科夫网络中存在关联性，还需要建立和更新基于邻居的条件概率矩阵。变量X_i在其邻居N_i的概率矩阵，如公式(18)所示，In Equation (12), the probability matrix is expressed as the independent probability of each variable. At the same time, due to the existence of correlation in the Markov network, it is also necessary to establish and update the conditional probability matrix based on neighbors. The probability matrix of variable X _i in its neighbor N _i , as shown in formula (18),

其中P(X_i|N_i)表示变量X_i在所有其邻居N_i的条件概率矩阵，{x_i,1，…，x_i,j，…，x_i,Yi}分别表示变量X_i能够被赋予的值，在时隙分配中即对应每一个待分配的时隙编号，Y_i表示变X_i取值的最大编号，x_i,Yi表示X_i能够被取值的最大值(时隙分配问题中，即为最晚的一个时隙)；

表示变量X_i的某取值Value_j时，在其所有邻居联合取值场景为Neighbor_k下的条件概率，Z_i表示邻居集合N_i的联合取值场景的最大编号。where P(X _i |N _i ) represents the conditional probability matrix of variable X _i in all its neighbors N _i , {x _i,1 ,..., _xi,j ,..., _xi,Yi }respectively represent that variable X _i can The _assigned _value _corresponds to the number of each to-be-allocated time _slot in the time slot allocation. In the allocation problem, it is the latest time slot);

Represents the conditional probability of a value _j of the variable X _i when the joint value scene of all its neighbors is Neighbor _k , and Z _i represents the maximum number of the joint value scene of the neighbor set _Ni .

对于公式(12)和(18)中的概率，都是根据计算获得的可信任解或者可信任解集存档中的内容进行更新。由于是迭代搜索的，首先计算当前代的边界概率：For the probabilities in formulas (12) and (18), both are updated according to the trusted solution obtained by calculation or the content in the archive of the trusted solution set. Since it is an iterative search, first calculate the boundary probability of the current generation:

其中，N(X＝x)表示在可信任解中变量X取值x的数量，N(D)表示可信任解的总量，1/|X|的设置是为了防止出现0的可能，对概率计算不产生实质影响。Among them, N(X=x) represents the number of values x of the variable X in the trusted solution, N(D) represents the total number of trusted solutions, and the setting of 1/|X| is to prevent the possibility of 0 from appearing. Probability calculations have no real impact.

计算完当前代的概率，需要更新概率模型，为了保证模型的稳定性，通过公式(20)进行计算：After calculating the probability of the current generation, the probability model needs to be updated. In order to ensure the stability of the model, the calculation is performed by formula (20):

其中P_t(X＝x)和P_t-1(X＝x)分别表示在迭代计算过程中第t代时变量X取x值的概率和第t-1代时变量X取x值的概率，λ是从当前代的学习率，如果λ＝1，则表示概率模型将完全从当前代得到的可信解中学习。考虑到算法中已经设计了归档机制，可以将λ的值设置大于0.5。Among them, P _t (X=x) and P _t-1 (X=x) represent the probability that the variable X takes the value of x in the t-th generation and the probability that the variable X takes the value of x in the t-1 generation in the iterative calculation process, respectively. , λ is the learning rate from the current generation, if λ = 1, it means that the probability model will be completely learned from the trusted solutions obtained from the current generation. Considering that the filing mechanism has been designed in the algorithm, the value of λ can be set larger than 0.5.

为了能够进一步增加解的多样性，避免陷入局部最优，类似GA的变异操作也被加入到概率更新的计算过程中。在公式(20)更新概率之后，通过变异概率(例如5％)执行公式(21)的变异操作：In order to further increase the diversity of solutions and avoid falling into local optimum, a mutation operation similar to GA is also added to the calculation process of probability update. After Equation (20) updates the probability, perform the mutation operation of Equation (21) with a mutation probability (eg, 5%):

其中θ表示变异值(例如取0.2)，ε是一个很小的正数(例如取0.01)，为了保证所有的概率都要小于100％。Where θ represents the variation value (eg 0.2), ε is a small positive number (eg 0.01), in order to ensure that all probabilities are less than 100%.

步骤7：根据概率矩阵和马尔科夫网络进行采样，生成新解。Step 7: Sampling according to the probability matrix and Markov network to generate a new solution.

通过上个步骤已经获得概率模型和网络模型，下一步需要生成下一代的解。马尔科夫网络和贝叶斯网络(Bayesian network，BN)有很大的不同，并不存在因果关系，因此在采样生成新解过程中，并没有明确的顺序。为了得到新解，算法采用一种基于马尔科夫链的蒙特卡洛方法——吉布斯采样(Gibbs Sampler)。具体过程如下：The probabilistic model and network model have been obtained in the previous step, and the next step is to generate the next generation of solutions. Markov network is very different from Bayesian network (BN), and there is no causal relationship, so there is no clear order in the process of sampling to generate new solutions. In order to obtain the new solution, the algorithm adopts a Monte Carlo method based on Markov chain - Gibbs Sampler. The specific process is as follows:

(1)随机生成一个可行解s＝(s₁,s₂,…,s_n)，s_n为最后一个变量X_n的取值(时隙分配问题中代表每个航班的时隙选择，s_n表示第n架航班的时隙选择)。生成方法如下：将所有航班根据航班时刻表排列，从第一个航班x₁开始，从所有剩余时隙中，随机选择一个时隙s₁(要求s₁不早于x₁的计划起飞时间，并且时隙s₁没有被其他航班选择)分配给x₁，删除时隙s₁，移动到下一个航班x₂，用同样的方法随机选择，直至所有航班选择结束。(1) Randomly generate a feasible solution s=(s ₁ , s ₂ ,...,s _n ), where s _n is the value of the last variable X _n (in the time slot assignment problem, it represents the time slot selection of each flight, s _n represents the time slot selection of the nth flight). The generation method is as follows: Arrange all flights according to the flight schedule, starting from the first flight x ₁ , from all remaining time slots, randomly select a time slot s ₁ (requires that s ₁ is not earlier than the planned departure time of x ₁ , and time slot s ₁ is not selected by other flights) assigned to x ₁ , delete time slot s ₁ , move to the next flight x ₂ , select randomly in the same way, until all flight selection ends.

(2)将解s＝(s₁,s₂,…,s_n)顺序随机打乱，得到新的顺序s₁ ^*,s₂ ^*,…,s_n ^*，其中s_n ^*表示新的航班排序中最后一个航班的时隙选择。(需要说明的是：此时的航班以及时隙对应关系并没有改变，只是排列的次序发生变化)。(2) Randomly shuffle the order of the solution s=(s ₁ , s ₂ ,...,s _n ) to obtain a new order s ₁ ^* ,s ₂ ^* ,...,s _n ^* , where s _n ^* represents a new flight Slot selection for the last flight in the sequence. (It should be noted that the correspondence between flights and time slots at this time has not changed, but the order of arrangement has changed).

(3)从s₁ ^*(假设对应航班h)开始，根据建立的马尔科夫网络找出航班h变量的所有邻居，并根据此时解s＝(s₁,s₂,…,s_n)中邻居的取值以及对应的条件概率p(x_h|N_h)来决定航班h选择各个时隙的概率，再按照概率分布，采用轮盘赌选择算法(Roulette WheelSelection，即根据各种取值的概率随机选择，某个值的概率越高，则被选中的可能性越大)生成s₁ ^*的新值；然后移动到s₂ ^*，采用同样方法再计算s₂ ^*的新值，直到完成所有变量，则完成了一个解的采样。(3) Starting from s ₁ ^* (assuming the corresponding flight h), find out all the neighbors of the flight h variable according to the established Markov network, and solve s=(s ₁ ,s ₂ ,...,s _n ) according to this time. The value of the neighbors and the corresponding conditional probability p(x _h |N _h ) determine the probability of selecting each time slot for flight h, and then according to the probability distribution, the roulette wheel selection algorithm (Roulette WheelSelection) is used. The probability of s is randomly selected, the higher the probability of a value, the greater the possibility of being selected) to generate a new value of s ₁ ^* ; then move to s ₂ ^* , and use the same method to calculate the new value of s ₂ ^* until Completion of all variables completes the sampling of a solution.

重复上述步骤(1)至(3)，完成一个群体N_POP个解的采样，最终形成当前代的解集。Repeat the above steps (1) to (3) to complete the sampling of N_POP solutions in a population, and finally form the solution set of the current generation.

步骤8：达到终止条件，获得最终结果。Step 8: The termination condition is reached and the final result is obtained.

重复以上步骤，直至满足终止条件。终止条件可以设置为迭代次数，例如1000代。当达到终止条件时，此时获得的帕累托解即为最终的结果。Repeat the above steps until the termination condition is met. The termination condition can be set to the number of iterations, such as 1000 generations. When the termination condition is reached, the Pareto solution obtained at this time is the final result.

为了验证本发明方法中模型和算法的正确性，基于华北地区流量管理环境，选取了18：00-24：00共计6个小时的模拟航班数据用作验证，并假设22：00-23：00之间北京首都机场(ZBAA)出现容量下降和超容情况。In order to verify the correctness of the model and algorithm in the method of the present invention, based on the traffic management environment in North China, the simulated flight data for a total of 6 hours from 18:00-24:00 is selected for verification, and it is assumed that 22:00-23:00 The Beijing Capital Airport (ZBAA) experienced a capacity drop and overcapacity.

航班时隙分配的结果如图8所示(部分结果)。图中结果给出了受影响航班的预计起飞时间和计算起飞时间，并给出了地面延误时间的建议。从结果可以比较直观地看出，一方面有较多航班的公司的部分航班是不需要延误的，说明符合多目标优化中的效率性(延误时间)指标，另一方面也可以发现平均延误时间比较类似，并没有出现某一家航空公司的航班出现过多延误，也证明了公平性指标得到保证。通过实验仿真验证，说明本方法可以有效地解决时隙分配问题，并按照问题建模的多目标优化计算获得满意的结果。The results of flight slot allocation are shown in Figure 8 (partial results). The results in the graph give the estimated and calculated departure times of the affected flights, as well as suggestions for ground delay times. It can be seen intuitively from the results that, on the one hand, some flights of companies with more flights do not need to be delayed, indicating that they meet the efficiency (delay time) index in multi-objective optimization. On the other hand, the average delay time can also be found. Similarly, there is no excessive delay of a certain airline's flight, which also proves that the fairness index is guaranteed. The experimental simulation and verification show that the method can effectively solve the time slot allocation problem, and obtain satisfactory results according to the multi-objective optimization calculation of problem modeling.

本发明提供了一种基于分布估计算法的航班时隙分配多目标优化方法，具体实现该技术方案的方法和途径很多，以上所述仅是本发明的优选实施方式，应当指出，对于本技术领域的普通技术人员来说，在不脱离本发明原理的前提下，还可以做出若干改进和润饰，这些改进和润饰也应视为本发明的保护范围。本实施例中未明确的各组成部分均可用现有技术加以实现。The present invention provides a multi-objective optimization method for flight time slot allocation based on a distribution estimation algorithm. There are many specific methods and approaches for realizing the technical solution. The above are only the preferred embodiments of the present invention. For those of ordinary skill in the art, without departing from the principle of the present invention, several improvements and modifications can also be made, and these improvements and modifications should also be regarded as the protection scope of the present invention. All components not specified in this embodiment can be implemented by existing technologies.

Claims

1. A flight time slot allocation multi-objective optimization method based on a distribution estimation algorithm is characterized by comprising the following steps:

step 1, establishing a time slot allocation multi-target model based on efficiency and fairness;

step 2, establishing a corresponding Markov network based on a time slot distribution multi-target model;

step 3, establishing a probability matrix;

step 4, calculating and finding out a contemporary pareto solution by adopting a new fitness function;

step 5, updating a trust solution set for filing according to the current pareto solution;

step 6, learning a Markov network and updating a probability matrix;

step 7, sampling according to the probability matrix and the Markov network to generate a new solution;

step 8, reaching a termination condition to obtain a final result;

the step 1 comprises the following steps:

step 1-1, establishing the following multi-objective function:

where N denotes the total number of flights, M denotes the total number of time slots, c_ijRepresenting the delay cost, x, when flight i is assigned to slot j_ijRepresenting a decision variable, F (x)_ij) Representing a fairness index when flight i is assigned to slot j; formula (1) represents a multi-objective function of minimizing delay cost and fairness difference index;

step 1-2, setting constraint conditions:

c_ij＝ω_i×(t_j-SLDT_i) (5)

t_j≥SLDT_iwhen x is_ijWhen becoming 1 (6)

Wherein, ω is_iRepresenting the importance weight of the flight i in the aspect of delay time, wherein the value range is 0.0-1.0; q denotes the total number of airlines, N_qRepresenting the total number of flights, t, for the airline q_jRepresents the start time of slot j; x is the number of_ijFor decision variables, representing whether flight i is assigned to slot j, only the value 0 or 1, S L DT_iSchedule plan landing time, c, representing flight i_i ^*Delay, d (x), representing the optimum of flight i_ij) The difference, nd (x), between when flight i is assigned slot j and the optimum_ij) Representing d (x) after regularization_ij) Difference of (f)_q(x_ij) Representing the average normalized difference of the airline q; formula (2) is a decision variable of a problem, formulas (3) and (4) respectively represent a constraint condition of flight allocation and a constraint condition of time slot allocation, formula (5) is used for calculating flight delay cost, formula (6) is constraint of a flight schedule, formulas (7) and (8) calculate a difference value between actual time slot allocation and optimal allocation and perform regularization, and formulas (9) and (10) calculate variance of all participating time slot allocation companies as fairness indexes of airlines.

2. The method of claim 1, wherein step 2 comprises: establishing a Markov network corresponding to a time slot distribution multi-target model, wherein in the established Markov network, each node represents a decision variable, namely time slot distribution selection of flights; node X_iSlot selection representing flight i, which takes on a range D (X)_i)＝{s₁,s₂…,s_MWhere M is the total number of available slots, s_MIndicating the mth available slot; in the established Markov network, the connection line between two nodes indicates that stronger coupling exists between two flights and flight time slot contention exists.

3. The method of claim 2, wherein step 3 comprises: the following probability matrix is established:

p (t) represents the probability matrix of the t generation in the iterative computation process;

P_NMa value representing the Nth row and the Mth column of the matrix;

row i of the matrix represents the probability of flight i being assigned to each time slot j, totaling 100%;

column j of the matrix represents the probability for slot j that each flight i is assigned to that slot, totaling 100%;

when initializing the matrix, assignment is carried out by adopting equal probability, and aiming at the specific problem of time slot allocation, time constraint in a formula (6) exists, and a rule is formulated that for each row i, in M time slots, the time constraint is earlier than S L DT_iIs set to 0, will be later than S L DT_iIs set to 1/z, where z is later than S L DT_iThe number of time slots.

4. The method of claim 3, wherein step 4 comprises:

in the step 4-1, the pareto solution is summarized into two parts: the method comprises a boundary part and a middle part, wherein the boundary part is a single-target optimization part, the middle part is a multi-target optimization part, and a boundary fitness function in a VEGA algorithm is adopted to calculate aiming at the boundary part:

EdgeFitness_f(X)＝obj_value_f(X) (13)

wherein X represents an arbitrary solution, edgeFitness_f(X) represents the fitness function of solution X at target f, obj _ value_f(X) represents the value of the function of solution X on target f;

for the middle part, the following intermediate fitness function is used for calculation:

wherein X represents an arbitrary solution, CentralFitness (X) represents the fitness function of solution X in the middle part, p (X) and q (X) represent the number dominated by X and the number dominated by X, respectively;

the fitness function values of the middle parts of all pareto solutions are more than or equal to 1, and meanwhile, all non-pareto solutions are less than 1 and more than 0;

respectively calculating the boundary fitness and the intermediate fitness according to formulas (13) and (14);

step 4-2, combining the boundary fitness and the intermediate fitness function by adopting the following composite fitness function with exponential weight parameters:

wherein D-Fitness (X) is the comprehensive fitness of X considering the composite middle part fitness and the boundary part fitness, m is the total number of targets, N _ POP is the total number of population, and Ranking_EdgeFitnessi(X) Ranking of solution X based on ith target, Ranking_{CentralFitness}(X) is the ranking of X on the intermediate fitness, ω_iAnd ω_m+1The weight parameters are indexed to distinguish scores of different ranks and simultaneously realize the preference of one target.

5. The method of claim 4, wherein step 5 comprises: establishing a file in a database for storing the trust solution as a trust solution set, wherein the capacity of the file is the same as or less than the N _ POP of the population number in each iteration;

and calculating by using a formula (15) to obtain fitness function values of all solutions of the current generation, selecting α% of solutions from high to low according to the fitness function values, storing the solutions into a trust solution set archive, and selecting 1- α% of trust solutions from the previous generation trust solution set archive according to the fitness function values from high to low, and storing the solutions into the trust solution set archive of the current generation.

6. The method of claim 5, wherein step 6 comprises:

step 6-1, learning the Markov network:

by the formula (16), the variable X is calculated_iAnd Y_jMutual information amount MI (X) of_i；Y_j)：

Wherein, X_iAnd Y_jRepresenting two random variables, wherein in the time slot allocation problem, the variable specifically represents the time slot allocation of one flight; p (x)_iI D) and p (y)_j| D) respectively represent variables X based on a set of trusted solutions D_i＝x_iProbability sum Y of_j＝y_jProbability of p (x)_i,y_jI D) is X_i＝x_iAnd Y_j＝y_jA joint probability of (a);

if the mutual information quantity of the two variables is greater than a given threshold, judging that the two variables have strong correlation, and establishing a connecting line in the Markov network and mutually calling the connecting line as a neighbor; the threshold is set by the user, or dynamically updated according to the known mutual information amount in the calculation process, as shown in formula (17):

where MI represents two variables X_iAnd Y_jNDV represents the total amount of variables, and parameter β is used to control the structural complexity;

step 6-2, updating the probability matrix:

variable X_iIn its neighbor N_iAs shown in equation (18):

wherein P (X)_i|N_i) Represents variable X_iIn all its neighbors N_iConditional probability matrix of { x }_i,1，…，x_i,j，…，x_i,YiRespectively represent variables X_iA value that can be assigned in the slot assignment, i.e. for each slot number to be assigned, x_i,YiRepresents variable X_iThe Y th_iA value that can be assigned, Y_iRepresents change X_iMaximum number of values, x_i,YiIs namely X_iThe maximum value which can be taken as a value is the latest time slot in the time slot allocation problem;

represents variable X_iValue_jIn time, the combined value scene of all the neighbors is Neighbor_kA conditional probability of; z_iRepresenting a set of neighbors N_iThe maximum number of the combined value-taking scene;

for the probabilities in equations (12) and (18), the boundary probability P of the current generation is first calculated_t ^*(X＝x)：

Where N (X ═ X) denotes the number of variables X that take on value X in the trusted solution, and N (d) denotes the total number of trusted solutions; the probabilistic model is updated by equation (20):

P_t(X＝x)＝(1-λ)×P_t-1(X＝x)+λ×P_t ^*(X＝x) (20)

wherein P is_t(X ═ X) and P_t-1(X ═ X) respectively represents the probability that the time variable X of the t-th generation takes the X value and the probability that the time variable X of the t-1 th generation takes the X value in the iterative computation process, wherein lambda is the learning rate from the current generation, and if lambda is 1, the probability model is completely learned from the current generation trust solution set archive;

step 6-3, after updating the probability model by formula (20), performing mutation operation of formula (21) by mutation probability:

P_t(X＝x)＝min(P_t(X＝x)+θ,1-) (21)

where θ represents the variance and is a small positive number.

7. The method of claim 6, wherein step 7 comprises:

step 7-1, randomly generating a feasible solution s ═ s(s)₁,s₂,…,s_n)，s_nIs the last variable X_nValue of (1), s in the slot allocation problem₁,s₂,…,s_nTime slot selection, s, on behalf of each flight_nAnd (3) representing the time slot selection of the nth flight, wherein the feasible solution s is generated by the following method: arrange all flights according to flight schedule, from the first flight x₁Initially, a time slot s is randomly selected from all the remaining time slots₁Is assigned to x₁Requires s₁Not earlier than x₁And time slot s₁Delete slot s without selection by other flights₁Move to the next flight x₂Randomly selecting by the same method until all flights are selected;

step 7-2, converting the solution s to(s)₁,s₂,…,s_n) The sequence is randomly disordered to obtain a new sequence s₁ ^*,s₂ ^*,…,s_n ^*Wherein s is_n ^*A slot selection indicating the last flight in the new flight sequence;

step 7-3, from s₁ ^*At first, set s₁ ^*Corresponding to the flight h, finding out all neighbors of the flight h variable according to the established Markov network, and solving s-s(s) according to the moment₁,s₂,…,s_n) Value of middle neighbor and corresponding conditional probability p (x)_h|N_h) Determining the probability of selecting each time slot for flight h, and generating s by adopting a roulette selection algorithm according to the probability distribution₁ ^*A new value of (d); then move to s₂ ^*Recalculating s by the same method₂ ^*Until all variables are completed, the sampling of a solution is completed;

and 7-4, repeating the methods from the step 7-1 to the step 7-3 to complete the sampling of N _ POP solutions of a group, and finally forming a solution set of the current generation.