
CN115408943B - Combustion optimization method for thermal power generating unit - Google Patents

Combustion optimization method for thermal power generating unit

Info

Publication number
CN115408943B
Authority
CN
China
Prior art keywords
algorithm
virtual environment
reward
design
transfer function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211046653.4A
Other languages
Chinese (zh)
Other versions
CN115408943A (en)
Inventor
黄晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Innavitt Automation Technology Co ltd
Jinggangshan Power Plant of Huaneng Power International Inc
Original Assignee
Nanjing Innavitt Automation Technology Co ltd
Jinggangshan Power Plant of Huaneng Power International Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Innavitt Automation Technology Co ltd, Jinggangshan Power Plant of Huaneng Power International Inc
Priority to CN202211046653.4A
Publication of CN115408943A
Application granted
Publication of CN115408943B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/04 Constraint-based CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Regulation And Control Of Combustion (AREA)
  • Control Of Steam Boilers And Waste-Gas Boilers (AREA)

Abstract

The present invention provides a combustion optimization method for thermal power generating units. The method takes optimal boiler efficiency and pollutant emissions as the optimization control targets, uses an improved asynchronous advantage actor-critic (A3C) algorithm as the control algorithm, and trains it on a state design and reward design tailored to the combustion system of the thermal power generating unit. The improved algorithm converges quickly and has a stronger learning ability, so it can effectively find the best control strategy for boiler combustion optimization, a complex problem with multivariable constraints and multiple objectives. It also takes wall temperature control into account, which can effectively improve the operating safety of the unit.

Description

Combustion optimization method for thermal power generating unit
Technical Field
The invention relates to the field of automatic control of thermal power generating units, in particular to a combustion optimization method of a thermal power generating unit.
Background
With the continued growth of the new energy industry, wind and solar power are gradually changing the structure of the power grid. Because new energy output is unstable, flexibility retrofitting of thermal units has become one of the main development directions. The main technical difficulty of flexibility retrofitting is enabling large-capacity coal-fired units to perform deep peak shaving down to ultra-low load, which places higher demands on their combustion stability. Boiler combustion optimization reduces NOx formation through staged air distribution, coordinated coal blending, and similar measures, while also taking into account economic indexes such as CO emission concentration and boiler efficiency and safety indexes such as preventing overtemperature of high-temperature steam tube walls. It is a complex problem of multi-field coupling, multivariable constraints, and multi-objective optimization. Under deep peak shaving, load reduction can cause unstable boiler combustion, ineffective operation of the denitration system, and overtemperature of the tube walls in the boiler steam-water system. How to maintain boiler efficiency and control NOx emissions is therefore an important research problem for combustion optimization under deep peak shaving.
Boiler combustion conditions are complex and subject to many constraints, and current swarm intelligence optimization algorithms cannot find the optimal solution. Deep reinforcement learning can use the state description of the environment and the evaluation of actions directly to adjust its policy dynamically and meet the final control target, but learning only from direct feedback of the environment can make learning inefficient.
To address these problems, the invention improves the asynchronous advantage actor-critic algorithm: it adds a Dyna structure to increase training efficiency, dynamically adjusts the proportion of learning done in the virtual environment, and designs the algorithm's state and reward for the boiler combustion system, thereby reducing the training time while accounting for the various constraints.
Disclosure of Invention
The invention aims to overcome the shortcomings described in the background and provides a combustion optimization method for a thermal power generating unit, implemented by the following scheme:
The invention provides a combustion optimization method for a thermal power generating unit, comprising the following steps:
Step 1: design the state and reward of the algorithm according to the condition of the boiler equipment;
Step 2: design the network structure and basic parameters of the asynchronous advantage actor-critic algorithm;
Step 3: establish a multivariable transfer function model of the boiler combustion system;
Step 4: based on the transfer function model of Step 3, establish an asynchronous advantage actor-critic algorithm based on a virtual environment model, and train it.
The state design in Step 1 comprises the set values, actual values, adjustment quantities, and deviations required by the algorithm. For the boiler combustion system, the set values are the carbon monoxide set value $CO_{sp}$, the exhaust gas temperature set value $T_{e,sp}$, and the NOx concentration set value $NO_{x,sp}$. The adjustment quantities are the total air flow $D_{AIR}$, the coal feed of the n-th burner layer $D_{B,n}$, the primary air damper opening of the i-th layer $V_{f,i}$, the secondary air damper opening of the j-th layer $V_{s,j}$, the burnout air damper opening of the k-th layer $V_{c,k}$, and the burner swing angle $A_f$. The actual values are the carbon monoxide amount $CO$, the exhaust gas temperature $T_e$, and the NOx concentration $NO_x$. $e_{CO}$, $e_{Te}$, and $e_{NOx}$ are the deviations of the carbon monoxide amount, the exhaust gas temperature, and the NOx concentration, respectively, and $\Delta T_P$ is the safety margin between the maximum actual water wall temperature and the overtemperature value. The state $S_t$ of the coordinated control system at time t is:
$S_t = \{CO_{sp}, T_{e,sp}, NO_{x,sp}, D_{AIR}, D_{B,n}, V_{f,i}, V_{s,j}, V_{c,k}, A_f, CO, T_e, NO_x, e_{CO}, e_{Te}, e_{NOx}, \Delta T_P\}$.
The reward design in Step 1 is divided into a continuous reward term and a control quantity rate-of-change limiting term. The continuous reward term is:
where $e_t$ is the weighted deviation at time t and $K_1$ is the weight of the continuous reward term;
the control quantity rate-of-change limiting term is:
where $K_2$ is the weight of the control quantity rate-of-change limiting term.
The asynchronous advantage actor-critic algorithm based on the virtual environment model in Step 4 uses the transfer function model of Step 3 to add a Dyna structure to each thread, so that the algorithm trains in the transfer function model while learning in the real environment, which improves learning efficiency. A dynamic weight $\mu$ is added for learning in the virtual environment, and the parameters are then updated as:
$\theta'_a \leftarrow \theta'_a + \mu \varepsilon_a \, d\theta'_a, \quad \theta'_v \leftarrow \theta'_v + \mu \varepsilon_v \, d\theta'_v$
where
In the above formula, $R_m$ is the cumulative reward in the virtual environment model, and $R_m$ is set to zero when $R_m < 0$; $r_m$ is the maximum reward given by the virtual environment model at each step; j is the number of repeated executions in the virtual environment model; and $\mu$ decreases as the number of global parameter updates increases and as the cumulative reward in each thread increases. $\theta'_a$ and $\theta'_v$ are the global shared parameters in the virtual environment, and T is the global shared count.
Advantageous effects
The method of the invention applies the asynchronous advantage actor-critic algorithm to optimize the boiler combustion system and can handle its multi-objective, multi-coupling nature. Adding the Dyna structure improves the learning efficiency of the algorithm, so it converges faster; training is simple, and the method is practical for engineering use.
Drawings
Fig. 1 is a schematic diagram of a combustion system of a thermal power generating unit.
FIG. 2 is a diagram of the boiler combustion optimization training of the present invention.
Detailed description of the preferred embodiments
The invention will be further described with reference to the drawings and the detailed description.
Fig. 1 is a schematic diagram of the boiler combustion system of a thermal power generating unit. It is a six-input, three-output system: the inputs are the total air flow, the fuel quantity of each layer (or the total fuel quantity), the primary air damper opening of each layer, the secondary air damper opening of each layer, the burnout air damper opening of each layer, and the burner swing angle; the outputs are the carbon monoxide concentration, the NOx concentration, and the exhaust gas temperature.
FIG. 2 shows the training of the boiler combustion system with the deep reinforcement learning algorithm, including the algorithm's reward design, state design, network parameter settings, and virtual environment settings. The state and reward of the algorithm are designed first, and the state and reward values are then fed into the algorithm; the state is fed simultaneously into the virtual environment in Dyna and into the actual algorithm. The action output by the algorithm is integrated to obtain the adjustment quantity at the current moment, which is applied to the actual boiler combustion system to obtain the state and reward at the next moment. Learning repeats in this way until the algorithm converges.
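The integration step in this loop, where the action output by the algorithm is accumulated into the actual adjustment quantity, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Actuator` class, its operating limits, and the clamping behaviour are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Actuator:
    """One adjustment quantity with hypothetical operating limits."""
    value: float
    low: float
    high: float

    def integrate(self, increment: float) -> float:
        """Add the action increment to the current value and clamp to the limits."""
        self.value = min(self.high, max(self.low, self.value + increment))
        return self.value


# Hypothetical secondary air damper opening, in percent.
damper = Actuator(value=50.0, low=0.0, high=100.0)
damper.integrate(2.5)    # small upward adjustment
damper.integrate(-60.0)  # large downward adjustment, clamped at the lower limit
```

Treating the network output as an increment keeps each step small, which fits the rate-of-change limiting term in the reward design.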
The method mainly comprises the following steps:
Step 1: design the state and reward of the algorithm according to the condition of the boiler equipment;
Step 2: design the network structure and basic parameters of the asynchronous advantage actor-critic algorithm;
Step 3: establish a multivariable transfer function model of the boiler combustion system;
Step 4: based on the transfer function model of Step 3, establish an asynchronous advantage actor-critic algorithm based on a virtual environment model, and train it.
The state design comprises the set values, actual values, adjustment quantities, and deviations required by the algorithm. For the boiler combustion system, the set values are the carbon monoxide set value $CO_{sp}$, the exhaust gas temperature set value $T_{e,sp}$, and the NOx concentration set value $NO_{x,sp}$. The adjustment quantities are the total air flow $D_{AIR}$, the coal feed of the n-th burner layer $D_{B,n}$, the primary air damper opening of the i-th layer $V_{f,i}$, the secondary air damper opening of the j-th layer $V_{s,j}$, the burnout air damper opening of the k-th layer $V_{c,k}$, and the burner swing angle $A_f$. The actual values are the carbon monoxide amount $CO$, the exhaust gas temperature $T_e$, and the NOx concentration $NO_x$. $e_{CO}$, $e_{Te}$, and $e_{NOx}$ are the deviations of the carbon monoxide amount, the exhaust gas temperature, and the NOx concentration, respectively, and $\Delta T_P$ is the safety margin between the maximum actual water wall temperature and the overtemperature value. The state $S_t$ of the coordinated control system at time t is:
$S_t = \{CO_{sp}, T_{e,sp}, NO_{x,sp}, D_{AIR}, D_{B,n}, V_{f,i}, V_{s,j}, V_{c,k}, A_f, CO, T_e, NO_x, e_{CO}, e_{Te}, e_{NOx}, \Delta T_P\}$
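As an illustration of how such a state vector might be assembled, the following sketch concatenates the quantities in the order listed for $S_t$. The function name, the argument names, the sign convention for the deviations (set value minus actual value), and the sample numbers are assumptions made here and are not taken from the patent.

```python
from typing import List, Sequence


def build_state(co_sp: float, te_sp: float, nox_sp: float,
                d_air: float, coal: Sequence[float],
                primary: Sequence[float], secondary: Sequence[float],
                burnout: Sequence[float], swing_angle: float,
                co: float, te: float, nox: float,
                wall_temp_max: float, overtemp_value: float) -> List[float]:
    """Assemble S_t in the order given in the text."""
    # Deviations: assumed here to be set value minus actual value.
    e_co = co_sp - co
    e_te = te_sp - te
    e_nox = nox_sp - nox
    # Safety margin between the overtemperature value and the hottest wall reading.
    delta_tp = overtemp_value - wall_temp_max
    return [co_sp, te_sp, nox_sp, d_air, *coal, *primary, *secondary,
            *burnout, swing_angle, co, te, nox, e_co, e_te, e_nox, delta_tp]


# Hypothetical operating point with two coal layers and two secondary air layers.
state = build_state(100.0, 120.0, 50.0, 1000.0, [30.0, 30.0], [40.0],
                    [50.0, 50.0], [20.0], 5.0, 110.0, 125.0, 45.0,
                    wall_temp_max=540.0, overtemp_value=560.0)
```

The length of the vector depends on how many burner and damper layers the boiler has, so it would need to match the input width chosen for the networks.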
The reward design in Step 1 is divided into a continuous reward term and a control quantity rate-of-change limiting term. The continuous reward term is:
where $e_t$ is the weighted deviation at time t, $[\lambda_1, \lambda_2, \lambda_3]^T$ is the deviation weight matrix that sets the relative weighting of the three control targets, and $K_1$ is the weight of the continuous reward term;
the control quantity rate-of-change limiting term is:
where $K_2$ is the weight of the control quantity rate-of-change limiting term;
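The two reward formulas are not reproduced in this text, so the sketch below only illustrates the general shape of a reward with these two terms. The functional forms are assumptions chosen for illustration: a weighted sum of absolute deviations for $e_t$, a penalty of $K_1 e_t$ for the continuous term, and a penalty of $K_2$ times the summed absolute change in the control quantities for the rate-of-change term. None of these forms should be read as the patent's actual expressions.

```python
from typing import Sequence


def weighted_deviation(deviations: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Assumed form of e_t: lambda-weighted sum of absolute deviations."""
    return sum(w * abs(e) for w, e in zip(weights, deviations))


def reward(deviations: Sequence[float], weights: Sequence[float],
           u_now: Sequence[float], u_prev: Sequence[float],
           k1: float, k2: float) -> float:
    """Assumed reward: continuous term plus rate-of-change limiting term."""
    continuous = -k1 * weighted_deviation(deviations, weights)
    rate_limit = -k2 * sum(abs(a - b) for a, b in zip(u_now, u_prev))
    return continuous + rate_limit


# Hypothetical deviations (CO, exhaust temperature, NOx) and two control quantities.
r = reward([-10.0, -5.0, 5.0], [0.5, 0.25, 0.25],
           u_now=[52.0, 10.0], u_prev=[50.0, 10.0], k1=1.0, k2=0.5)
```

Under these assumed forms the reward is zero only when all three deviations vanish and the control quantities are held constant, and it becomes more negative as tracking error or actuator movement grows.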
The network structure of the asynchronous advantage actor-critic algorithm in Step 2 consists of four fully connected layers. The input layer of the critic network has 20 nodes, covering the state information and reward information; its two hidden layers have 80 nodes each; and its output layer has 1 node. The input layer of the actor network has 20 nodes, covering the state information and the output of the critic network; its hidden layers are the same as those of the critic network; and its output layer has 3 nodes. The basic parameters are the global maximum number of updates $T_{max}$, the per-thread maximum number of updates $t_{max}$, the actor network learning rate $\varepsilon_a$, the critic network learning rate $\varepsilon_v$, the number of virtual environment model repetitions n, the thread update discount factor $\gamma$, and the global update frequency $F_{update}$. The specific settings are as follows:
In general, the convergence speed of the algorithm can be assessed in light of the boiler structure and design parameters, and the parameters adjusted accordingly.
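The layer sizes described above (20-80-80-1 for the critic and 20-80-80-3 for the actor) can be illustrated with a small forward-pass sketch. The text does not state the activation function, the weight initialisation, or whether biases are used, so the tanh activation, the uniform initialisation, and the bias-free layers below are assumptions made for brevity.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def build_mlp(sizes: List[int], seed: int) -> List[Matrix]:
    """Create one weight matrix per pair of consecutive layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        layers.append([[rng.uniform(-scale, scale) for _ in range(n_in)]
                       for _ in range(n_out)])
    return layers


def forward(layers: List[Matrix], x: List[float]) -> List[float]:
    """Propagate x through the network; tanh on hidden layers, linear output."""
    for idx, weights in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) for row in weights]
        if idx < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


critic = build_mlp([20, 80, 80, 1], seed=0)  # state value estimate
actor = build_mlp([20, 80, 80, 3], seed=1)   # three action outputs

inputs = [0.1] * 20  # placeholder 20-element input vector
value = forward(critic, inputs)
action = forward(actor, inputs)
```

Four layers of nodes give three weight matrices per network, which is why `build_mlp` returns a list of length three for each.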
The multivariable transfer function model of the boiler combustion system in Step 3 serves as a virtual environment for auxiliary training, so its accuracy requirements are modest. A general disturbance experiment at a given load point is sufficient to establish the 6-input, 3-output transfer function model of the boiler combustion system directly. The model is used only as the virtual environment in Dyna: it improves the initial convergence rate of the algorithm and does not affect the final accuracy.
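A low-accuracy virtual environment of this kind could be sketched as a matrix of first-order lags, one per input-output pair. The first-order form $G(s) = K/(\tau s + 1)$, the forward-Euler discretisation, and all gains and time constants below are assumptions for illustration; in practice the model structure and parameters would come from the disturbance experiment.

```python
from typing import List


class FirstOrderMimo:
    """y_m = sum_n G_mn(s) u_n with G_mn(s) = K_mn / (tau_mn * s + 1)."""

    def __init__(self, gains: List[List[float]],
                 taus: List[List[float]], dt: float) -> None:
        self.gains = gains
        self.taus = taus
        self.dt = dt
        # One internal state per input-output channel.
        self.x = [[0.0 for _ in row] for row in gains]

    def step(self, u: List[float]) -> List[float]:
        """Advance every channel by one time step and return the outputs."""
        outputs = []
        for m, row in enumerate(self.gains):
            for n, gain in enumerate(row):
                # Forward-Euler update of the first-order lag.
                self.x[m][n] += self.dt / self.taus[m][n] * (gain * u[n] - self.x[m][n])
            outputs.append(sum(self.x[m]))
        return outputs


# Hypothetical 3-output, 6-input model with unit gains and 10 s time constants.
gains = [[1.0] * 6 for _ in range(3)]
taus = [[10.0] * 6 for _ in range(3)]
env = FirstOrderMimo(gains, taus, dt=1.0)
for _ in range(500):
    y = env.step([1.0] * 6)  # unit step on every input
```

Inputs and outputs here are deviations from the operating point at which the model was identified, so a zero input leaves the outputs at zero.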
The asynchronous advantage actor-critic algorithm based on the virtual environment model in Step 4 uses the transfer function model of Step 3 to add a Dyna structure to each thread, so that the algorithm trains in the transfer function model while learning in the real environment, which improves learning efficiency. A dynamic weight $\mu$ is added for learning in the virtual environment, and the parameters are then updated as:
$\theta'_a \leftarrow \theta'_a + \mu \varepsilon_a \, d\theta'_a, \quad \theta'_v \leftarrow \theta'_v + \mu \varepsilon_v \, d\theta'_v$
where
In the above formula, $R_m$ is the cumulative reward in the virtual environment model, and $R_m$ is set to zero when $R_m < 0$; $r_m$ is the maximum reward given by the virtual environment model at each step; j is the number of repeated executions in the virtual environment model; and $\mu$ decreases as the number of global parameter updates increases and as the cumulative reward in each thread increases. $\theta'_a$ and $\theta'_v$ are the global shared parameters in the virtual environment, and T is the global shared count.
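The expression for $\mu$ is not reproduced in this text. The sketch below therefore uses a hypothetical schedule that only matches the stated properties: $\mu$ shrinks as the global count T approaches its maximum and as the cumulative virtual reward $R_m$ approaches its upper bound $j \cdot r_m$, and negative $R_m$ is set to zero. The function names and the product form of the schedule are assumptions, not the patent's formula; the update itself follows $\theta' \leftarrow \theta' + \mu \varepsilon \, d\theta'$.

```python
from typing import List


def dynamic_weight(r_cum: float, r_max_step: float, j: int,
                   t_global: int, t_max: int) -> float:
    """Hypothetical mu in [0, 1], decreasing in both R_m and T."""
    r_cum = max(r_cum, 0.0)  # R_m is set to zero when negative
    reward_ratio = min(r_cum / (j * r_max_step), 1.0)
    progress = min(t_global / t_max, 1.0)
    return (1.0 - reward_ratio) * (1.0 - progress)


def apply_virtual_update(theta: List[float], grad: List[float],
                         lr: float, mu: float) -> List[float]:
    """theta' <- theta' + mu * epsilon * d(theta') for virtual-environment learning."""
    return [p + mu * lr * g for p, g in zip(theta, grad)]


# Early in training with a poor virtual reward, the virtual update gets full weight.
mu_early = dynamic_weight(r_cum=-5.0, r_max_step=1.0, j=10, t_global=0, t_max=1000)
# Halfway through training with half the maximum cumulative reward.
mu_mid = dynamic_weight(r_cum=5.0, r_max_step=1.0, j=10, t_global=500, t_max=1000)
theta_new = apply_virtual_update([1.0, 2.0], [0.5, -1.0], lr=0.1, mu=mu_mid)
```

With a schedule of this kind the virtual environment dominates early learning, when real experience is scarce, and fades out later so that the approximate model does not limit the final policy.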

Claims (1)

1. A combustion optimization method for a thermal power generating unit, characterized by the following specific steps:
Step 1: design the state and reward of the algorithm according to the condition of the boiler equipment;
Step 2: design the network structure and basic parameters of the asynchronous advantage actor-critic algorithm;
Step 3: establish a multivariable transfer function model of the boiler combustion system;
Step 4: based on the transfer function model of Step 3, establish an asynchronous advantage actor-critic algorithm based on a virtual environment model, and train it;
the state design in Step 1 is divided into set values, actual values, adjustment quantities, and deviations, wherein the set values are the carbon monoxide set value $CO_{sp}$, the exhaust gas temperature set value $T_{e,sp}$, and the NOx concentration set value $NO_{x,sp}$; the adjustment quantities are the total air flow $D_{AIR}$, the coal feed of the n-th burner layer $D_{B,n}$, the primary air damper opening of the i-th layer $V_{f,i}$, the secondary air damper opening of the j-th layer $V_{s,j}$, the burnout air damper opening of the k-th layer $V_{c,k}$, and the burner swing angle $A_f$; the actual values are the carbon monoxide amount $CO$, the exhaust gas temperature $T_e$, and the NOx concentration $NO_x$; $e_{CO}$, $e_{Te}$, and $e_{NOx}$ are the deviations of the carbon monoxide amount, the exhaust gas temperature, and the NOx concentration, respectively, and $\Delta T_P$ is the safety margin between the maximum actual water wall temperature and the overtemperature value; the state $S_t$ of the coordinated control system at time t is:
$S_t = \{CO_{sp}, T_{e,sp}, NO_{x,sp}, D_{AIR}, D_{B,n}, V_{f,i}, V_{s,j}, V_{c,k}, A_f, CO, T_e, NO_x, e_{CO}, e_{Te}, e_{NOx}, \Delta T_P\}$;
the reward design in Step 1 is divided into a continuous reward term and a control quantity rate-of-change limiting term; the continuous reward term is:
where $e_t$ is the weighted deviation at time t and $K_1$ is the weight of the continuous reward term;
the control quantity rate-of-change limiting term is:
where $K_2$ is the weight of the control quantity rate-of-change limiting term;
the asynchronous advantage actor-critic algorithm based on the virtual environment model in Step 4 uses the transfer function model of Step 3 to add a Dyna structure to each thread, so that the algorithm trains in the transfer function model while learning in the real environment to improve learning efficiency, and a dynamic weight $\mu$ is added for learning in the virtual environment, so that the parameters are updated as:
$\theta'_a \leftarrow \theta'_a + \mu \varepsilon_a \, d\theta'_a, \quad \theta'_v \leftarrow \theta'_v + \mu \varepsilon_v \, d\theta'_v$
where
in the above formula, $R_m$ is the cumulative reward in the virtual environment model, and $R_m$ is set to zero when $R_m < 0$; $r_m$ is the maximum reward given by the virtual environment model at each step; j is the number of repeated executions in the virtual environment model; $\mu$ decreases as the number of global parameter updates increases and as the cumulative reward in each thread increases; $\theta'_a$ and $\theta'_v$ are the global shared parameters in the virtual environment, and T is the global shared count.
CN202211046653.4A 2022-08-30 2022-08-30 Combustion optimization method for thermal power generating unit Active CN115408943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211046653.4A CN115408943B (en) 2022-08-30 2022-08-30 Combustion optimization method for thermal power generating unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211046653.4A CN115408943B (en) 2022-08-30 2022-08-30 Combustion optimization method for thermal power generating unit

Publications (2)

Publication Number Publication Date
CN115408943A CN115408943A (en) 2022-11-29
CN115408943B true CN115408943B (en) 2025-02-18

Family

ID=84161510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211046653.4A Active CN115408943B (en) 2022-08-30 2022-08-30 Combustion optimization method for thermal power generating unit

Country Status (1)

Country Link
CN (1) CN115408943B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109089307A (en) * 2018-07-19 2018-12-25 浙江工业大学 A kind of energy-collecting type wireless relay network througput maximization approach based on asynchronous advantage actor reviewer algorithm

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7660639B2 (en) * 2006-03-27 2010-02-09 Hitachi, Ltd. Control system for control subject having combustion unit and control system for plant having boiler
US20200241542A1 (en) * 2019-01-25 2020-07-30 Bayerische Motoren Werke Aktiengesellschaft Vehicle Equipped with Accelerated Actor-Critic Reinforcement Learning and Method for Accelerating Actor-Critic Reinforcement Learning
CN111731303B (en) * 2020-07-09 2021-04-23 重庆大学 A HEV energy management method based on deep reinforcement learning A3C algorithm

Also Published As

Publication number Publication date
CN115408943A (en) 2022-11-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant