A Value Iteration Algorithm for Stochastic Linear Quadratic Regulator
Wang et al., 2025
- Document ID
- 1992699072167216112
- Author
- Wang H
- Liu Y
- Liu X
- Publication year
- 2025
- Publication venue
- Journal of Optimization Theory and Applications
Snippet
In this paper, we propose a novel value iteration algorithm for online adaptive optimal control of discrete-time stochastic linear quadratic regulator (LQR) problems. The algorithm iteratively solves the algebraic Riccati equation (ARE) using online information of states and …
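As a rough illustration of what "iteratively solving the ARE" means, the following is a minimal model-based sketch of Riccati value iteration for a discrete-time LQR problem. It is not the paper's online, data-driven algorithm: it assumes the system matrices `A` and `B` are known, and it assumes purely additive noise, in which case the noise does not change the Riccati equation or the optimal gain. The function name `riccati_value_iteration`, the tolerance, and the example matrices are all hypothetical choices made for this sketch.

```python
import numpy as np


def riccati_value_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Textbook value iteration on the discrete-time Riccati map.

    Repeats P <- Q + A'PA - A'PB (R + B'PB)^{-1} B'PA starting from P = 0
    and returns the converged P with the gain K of the policy u = -K x.
    Assumes known (A, B); this is an illustrative, model-based sketch.
    """
    def gain(P):
        # K = (R + B'PB)^{-1} B'PA, computed with a linear solve.
        BtP = B.T @ P
        return np.linalg.solve(R + BtP @ B, BtP @ A)

    P = np.zeros_like(Q, dtype=float)
    for _ in range(max_iter):
        # Q + A'P(A - BK) equals Q + A'PA - A'PB K.
        P_next = Q + A.T @ P @ (A - B @ gain(P))
        P_next = 0.5 * (P_next + P_next.T)  # keep P numerically symmetric
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, gain(P_next)
        P = P_next
    raise RuntimeError("value iteration did not converge")


# Hypothetical example: a discretised double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = riccati_value_iteration(A, B, Q, R)
print("P =\n", P)
print("K =", K)
```

Starting from `P = 0` mirrors the usual value-iteration convention of not requiring an initial stabilising policy; a data-driven variant would replace the terms involving `A` and `B` with quantities estimated from measured states and inputs.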
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0205—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric not using a model or a simulator of the controlled system
- G05B13/024—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric not using a model or a simulator of the controlled system in which a parameter or coefficient is automatically adjusted to optimise the performance
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/048—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators using a predictor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B17/00—Systems involving the use of models or simulators of said systems
- G05B17/02—Systems involving the use of models or simulators of said systems electric
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B11/00—Automatic controllers
- G05B11/01—Automatic controllers electric
- G05B11/32—Automatic controllers electric with inputs from more than one sensing element; with outputs to more than one correcting element
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Pham et al. | Neural networks-based backward scheme for fully nonlinear PDEs | |
| Zhao et al. | Adaptive neural decentralised control for switched interconnected nonlinear systems with backlash-like hysteresis and output constraints | |
| Yaghmaie et al. | Linear quadratic control using model-free reinforcement learning | |
| Shalaby et al. | Fractional order modeling and control for under-actuated inverted pendulum | |
| Jiang et al. | Robust adaptive dynamic programming for linear and nonlinear systems: An overview | |
| Han et al. | Recurrent neural networks for stochastic control problems with delay | |
| Cao et al. | Gaussian process model predictive control of unknown non‐linear systems | |
| Wu et al. | Multiobjective $H_2/H_\infty$ Control Design of the Nonlinear Mean-Field Stochastic Jump-Diffusion Systems via Fuzzy Approach | |
| Yin et al. | Joint stochastic distribution tracking control for multivariate descriptor systems with non-Gaussian variables | |
| Talebi et al. | On regularizability and its application to online control of unstable LTI systems | |
| Pang et al. | Robust reinforcement learning for stochastic linear quadratic control with multiplicative noise | |
| Qian et al. | Hybrid identification method for fractional-order nonlinear systems based on the multi-innovation principle | |
| Liu et al. | Neural network state learning based adaptive terminal ILC for tracking iteration-varying target points | |
| Jiang et al. | Adaptive linear quadratic control for stochastic discrete-time linear systems with unmeasurable multiplicative and additive noises | |
| Wang et al. | A Value Iteration Algorithm for Stochastic Linear Quadratic Regulator | |
| Zhu et al. | Cooperative game-theoretic optimization of adaptive robust constraint-following control for fuzzy mechanical systems under inequality constraints | |
| Zhuang et al. | Robust control design for zero-sum differential games problem based on off-policy reinforcement learning technique | |
| Zhao | Discovering phase field models from image data with the pseudo-spectral physics informed neural networks | |
| Zhang et al. | Adaptive iterative learning control of non-uniform trajectory tracking for strict feedback nonlinear time-varying systems | |
| Dai | A fast equivalent scheme for robust constrained-input min–max finite horizon model predictive control via model-free critic-only Q-learning | |
| Buldaev | Operator forms of the maximum principle and iterative algorithms in optimal control problems | |
| Chen et al. | Optimal tracking control of mechatronic servo system using integral reinforcement learning | |
| Janakiraman et al. | A Lyapunov based stable online learning algorithm for nonlinear dynamical systems using extreme learning machines | |
| Islam et al. | Robust functional observer for stabilising uncertain fuzzy systems with time-delay | |
| Li et al. | Disentangling linear quadratic control with untrusted ML predictions |