
A linear programming methodology for approximate dynamic programming

Open Access | Jul 2020

References

  1. Allgower, F. and Zheng, A. (2012). Nonlinear Model Predictive Control, Springer, New York, NY.
  2. Ariño, C., Querol, A. and Sala, A. (2017). Shape-independent model predictive control for Takagi–Sugeno fuzzy systems, Engineering Applications of Artificial Intelligence 65(1): 493–505, DOI: 10.1016/j.engappai.2017.07.011.
  3. Armesto, L., Girbés, V., Sala, A., Zima, M. and Šmídl, V. (2015). Duality-based nonlinear quadratic control: Application to mobile robot trajectory-following, IEEE Transactions on Control Systems Technology 23(4): 1494–1504, DOI: 10.1109/TCST.2014.2377631.
  4. Armesto, L., Moura, J., Ivan, V., Erden, M.S., Sala, A. and Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration, International Journal of Robotics Research 37(13–14): 1673–1689, DOI: 10.1177/0278364918784354.
  5. Bertsekas, D.P. (2017). Dynamic Programming and Optimal Control, Vol. 1, 4th Edn, Athena Scientific, Belmont, MA.
  6. Bertsekas, D.P. (2019). Reinforcement Learning and Optimal Control, Athena Scientific, Belmont, MA.
  7. Busoniu, L., Babuska, R., De Schutter, B. and Ernst, D. (2010). Reinforcement Learning and Dynamic Programming Using Function Approximators, CRC Press, Boca Raton, FL.
  8. Cervellera, C., Wen, A. and Chen, V.C. (2007). Neural network and regression spline value function approximations for stochastic dynamic programming, Computers & Operations Research 34(1): 70–90, DOI: 10.1016/j.cor.2005.02.043.
  9. De Farias, D.P. and Van Roy, B. (2003). The linear programming approach to approximate dynamic programming, Operations Research 51(6): 850–865, DOI: 10.1287/opre.51.6.850.24925.
  10. Deisenroth, M.P., Neumann, G. and Peters, J. (2013). A survey on policy search for robotics, Foundations and Trends in Robotics 2(1–2): 1–142, DOI: 10.1561/2300000021.
  11. Díaz, H., Armesto, L. and Sala, A. (2019). Approximate dynamic programming methodology for data-based optimal control (Metodología de programación dinámica aproximada para control óptimo basada en datos), Revista Iberoamericana de Automática e Informática Industrial 16(3): 273–283, DOI: 10.4995/riai.2019.10379.
  12. Díaz, H., Armesto, L. and Sala, A. (2020). Fitted Q-function control methodology based on Takagi–Sugeno systems, IEEE Transactions on Control Systems Technology 28(2): 477–488, DOI: 10.1109/TCST.2018.2885689.
  13. Lagoudakis, M.G. and Parr, R. (2003). Least-squares policy iteration, Journal of Machine Learning Research 4(Dec): 1107–1149.
  14. Lewis, F.L. and Liu, D. (2013). Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Wiley, Hoboken, NJ, DOI: 10.1002/9781118453988.
  15. Lewis, F.L. and Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control, IEEE Circuits and Systems Magazine 9(3): 32–50, DOI: 10.1109/MCAS.2009.933854.
  16. Lewis, F., Vrabie, D. and Syrmos, V. (2012). Optimal Control, 3rd Edn, John Wiley & Sons, Hoboken, NJ, DOI: 10.1002/9781118122631.
  17. Liu, D., Wei, Q., Wang, D., Yang, X. and Li, H. (2017). Adaptive Dynamic Programming with Applications in Optimal Control, Springer, Berlin, DOI: 10.1007/978-3-319-50815-3.
  18. Manne, A.S. (1960). Linear programming and sequential decisions, Management Science 6(3): 259–267, DOI: 10.1287/mnsc.6.3.259.
  19. Marsh, L.C. and Cormier, D.R. (2001). Spline Regression Models, Number 137, Sage, Thousand Oaks, CA, DOI: 10.4135/9781412985901.
  20. Munos, R., Baird, L.C. and Moore, A.W. (1999). Gradient descent approaches to neural-net-based solutions of the Hamilton–Jacobi–Bellman equation, International Joint Conference on Neural Networks, Washington, DC, USA, Vol. 3, pp. 2152–2157.
  21. Munos, R. and Szepesvári, C. (2008). Finite-time bounds for fitted value iteration, Journal of Machine Learning Research 9(May): 815–857.
  22. Powell, W.B. (2011). Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd Edn, Wiley, Hoboken, NJ, DOI: 10.1002/9781118029176.
  23. Preitl, S., Precup, R.-E., Preitl, Z., Vaivoda, S., Kilyeni, S. and Tar, J.K. (2007). Iterative feedback and learning control. Servo systems applications, IFAC Proceedings Volumes 40(8): 16–27.
  24. Rantzer, A. (2006). Relaxed dynamic programming in switching systems, IEE Proceedings: Control Theory and Applications 153(5): 567–574, DOI: 10.1049/ip-cta:20050094.
  25. Robles, R., Sala, A. and Bernal, M. (2019). Performance-oriented quasi-LPV modeling of nonlinear systems, International Journal of Robust and Nonlinear Control 29(5): 1230–1248, DOI: 10.1002/rnc.4444.
  26. Sutton, R.S. and Barto, A.G. (2018). Reinforcement Learning: An Introduction, 2nd Edn, MIT Press, Cambridge, MA.
  27. Tan, K., Zhao, S. and Xu, J. (2007). Online automatic tuning of a proportional integral derivative controller based on an iterative learning control approach, IET Control Theory & Applications 1(1): 90–96, DOI: 10.1049/iet-cta:20050004.
  28. Zajdel, R. (2013). Epoch-incremental reinforcement learning algorithms, International Journal of Applied Mathematics and Computer Science 23(3): 623–635, DOI: 10.2478/amcs-2013-0047.
  29. Zhao, D., Liu, J., Wu, R., Cheng, D. and Tang, X. (2019). An active exploration method for data efficient reinforcement learning, International Journal of Applied Mathematics and Computer Science 29(2): 351–362, DOI: 10.2478/amcs-2019-0026.
DOI: https://doi.org/10.34768/amcs-2020-0028 | Journal eISSN: 2083-8492 | Journal ISSN: 1641-876X
Language: English
Page range: 363–375
Submitted on: Oct 26, 2019
Accepted on: Mar 2, 2020
Published on: Jul 4, 2020
Published by: University of Zielona Góra
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2020 Henry Díaz, Antonio Sala, Leopoldo Armesto, published by University of Zielona Góra
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.