Identifiers
Title variants
Publication languages
Abstracts
In this paper, the fixed-final-time adaptive optimal regulation of discrete-time linear systems with unknown system dynamics is addressed. First, by transforming the linear system into an input/output form, the adaptive optimal control design depends only on the measured outputs and past inputs instead of state measurements. Next, to handle the time-varying nature of the finite-horizon problem, a novel online adaptive estimator based on an online approximator is proposed, which relaxes the requirement of known system dynamics. An additional error term corresponding to the terminal constraint is defined and minimized over time. The novel parameter update law, which is updated once per sampling interval, requires no policy or value iteration. The proposed control design yields an online, forward-in-time solution, which offers significant practical advantages. Stability of the system is demonstrated by Lyapunov analysis, while simulation results verify the effectiveness of the proposed approach.
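For context, below is a minimal sketch (not taken from the paper) of the classical, model-based finite-horizon LQR baseline that such an adaptive scheme replaces: a backward-in-time Riccati recursion with a terminal weight, which produces time-varying gains and is why a finite-horizon estimator must be time-dependent, unlike the infinite-horizon case. The system matrices, cost weights, and horizon here are illustrative placeholders; the paper's contribution is to obtain the equivalent time-varying control online and forward in time from measured outputs and past inputs, without knowing A and B.

```python
import numpy as np

# Classical finite-horizon LQR baseline (model-based, solved backward in time).
# All matrices below are illustrative placeholders, not values from the paper.

def finite_horizon_lqr(A, B, Q, R, Qf, N):
    """Backward Riccati recursion over a fixed final time N."""
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = Qf                          # terminal constraint weight on x_N
    for k in range(N - 1, -1, -1):     # sweep backward from the final time
        S = R + B.T @ P[k + 1] @ B
        K[k] = np.linalg.solve(S, B.T @ P[k + 1] @ A)   # time-varying gain
        P[k] = Q + A.T @ P[k + 1] @ (A - B @ K[k])
    return K, P

if __name__ == "__main__":
    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # placeholder dynamics
    B = np.array([[0.0], [0.1]])
    Q, R, Qf = np.eye(2), np.eye(1), 10 * np.eye(2)
    N = 50
    K, P = finite_horizon_lqr(A, B, Q, R, Qf, N)

    x = np.array([[1.0], [0.0]])
    for k in range(N):
        u = -K[k] @ x                  # gains depend on time-to-go
        x = A @ x + B @ u
    print("terminal state:", x.ravel())
```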
Publisher
Year
Volume
Pages
175--187
Physical description
Bibliography: 25 items, figures
Contributors
author
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, Missouri, USA
author
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, Missouri, USA
author
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, Missouri, USA
Bibliography
- [1] F. L. Lewis and V. L. Syrmos, Optimal Control, 2nd edition. New York: Wiley, 1995.
- [2] D. Kirk, Optimal Control Theory: An Introduction, New Jersey: Prentice-Hall, 1970.
- [3] Z. Chen and S. Jagannathan, “Generalized Hamilton-Jacobi-Bellman formulation based neural network control of affine nonlinear discrete-time systems”, IEEE Trans. Neural Networks, vol. 7, pp. 90–106, 2008.
- [4] S. J. Bradtke and B. E. Ydstie, “Adaptive linear quadratic control using policy iteration”, in Proc. Amer. Control Conf., Baltimore, pp. 3475–3479, 1994.
- [5] Q. Zhao, H. Xu and S. Jagannathan, “Finite-horizon optimal control design for uncertain linear discrete-time systems”, Proceedings of IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL), Singapore, 2013.
- [6] H. Xu, S. Jagannathan and F. L. Lewis, “Stochastic optimal control of unknown networked control systems in the presence of random delays and packet losses,” Automatica, vol. 48, pp. 1017–1030, 2012.
- [7] T. Dierks and S. Jagannathan, “Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update,” IEEE Trans. Neural Networks and Learning Systems, vol. 23, pp. 1118–1129, 2012.
- [8] R. Beard, “Improving the closed-loop performance of nonlinear systems,” Ph.D. dissertation, Rensselaer Polytechnic Institute, USA, 1995.
- [9] T. Cheng, F. L. Lewis, and M. Abu-Khalaf, “A neural network solution for fixed-final-time optimal control of nonlinear systems,” Automatica, vol. 43, pp. 482–490, 2007.
- [10] A. Heydari and S. N. Balakrishnan, “Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics,” IEEE Trans. Neural Networks and Learning Systems, vol. 24, pp. 145–157, 2013.
- [11] P. J. Werbos, “A menu of designs for reinforcement learning over time,” J. Neural Network Contr., vol. 3, pp. 835–846, 1983.
- [12] J. Si, A. G. Barto, W. B. Powell and D. Wunsch, Handbook of Learning and Approximate Dynamic Programming. New York: Wiley, 2004.
- [13] A. Al-Tamimi and F. L. Lewis, “Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof,” IEEE Trans. Systems, Man and Cybernetics, Part B: Cybernetics, vol. 38, pp. 943–949, 2008.
- [14] H. Xu and S. Jagannathan, “Stochastic optimal controller design for uncertain nonlinear networked control system via neuro dynamic programming,” IEEE Trans. Neural Netw. and Learning Syst., vol. 24, pp. 471–484, 2013.
- [15] C. Watkins, “Learning from delayed rewards,” Ph.D. dissertation, Cambridge University, England, 1989.
- [16] W. Aangenent, D. Kostic, B. de Jager, R. van de Molengraft and M. Steinbuch, “Data-based optimal control,” in Proc. Amer. Control Conf., Portland, OR, 2005, pp. 1460–1465.
- [17] R. K. Lim, M. O. Phan, and R. W. Longman, “State-space system identification with identified Hankel matrix,” Dept. Mech. Aerosp. Eng., Princeton Univ., NJ, Tech. Rep. 3045, Sep. 1998.
- [18] M. O. Phan, R. K. Lim and R. W. Longman, “Unifying input-output and state-space perspectives of predictive control,” Dept. Mech. Aerosp. Eng., Princeton Univ., NJ, Tech. Rep. 3044, Sep. 1998.
- [19] F. L. Lewis and K. G. Vamvoudakis, “Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data,” IEEE Trans. Systems, Man, and Cybernetics – Part B, vol. 41, pp. 14–25, 2011.
- [20] S. Jagannathan, Neural Network Control of Nonlinear Discrete-Time Systems, Boca Raton, FL: CRC Press, 2006.
- [21] M. Green and J. B. Moore, “Persistency of excitation in linear systems,” Syst. Control Lett., vol. 7, pp. 351–360, 1986.
- [22] K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, New Jersey: Prentice-Hall, 1989.
- [23] F. L. Lewis, S. Jagannathan, and A. Yesildirek, Neural Network Control of Robot Manipulators and Nonlinear Systems, New York: Taylor & Francis, 1999.
- [24] H. K. Khalil, Nonlinear Systems, 3rd edition, Prentice-Hall, Upper Saddle River, NJ, 2002.
- [25] R. W. Brockett, R. S. Millman, and H. J. Sussmann, Differential Geometric Control Theory, Birkhäuser, USA, 1983.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-2be2caf5-1939-412f-b053-7d57d624c8e3