Ariño, C., Querol, A. and Sala, A. (2017). Shape-independent model predictive control for Takagi–Sugeno fuzzy systems, Engineering Applications of Artificial Intelligence 65(1): 493–505, DOI: 10.1016/j.engappai.2017.07.011.
Armesto, L., Girbés, V., Sala, A., Zima, M. and Šmídl, V. (2015). Duality-based nonlinear quadratic control: Application to mobile robot trajectory-following, IEEE Transactions on Control Systems Technology 23(4): 1494–1504, DOI: 10.1109/TCST.2014.2377631.
Armesto, L., Moura, J., Ivan, V., Erden, M.S., Sala, A. and Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration, International Journal of Robotics Research 37(13–14): 1673–1689, DOI: 10.1177/0278364918784354.
Busoniu, L., Babuska, R., De Schutter, B. and Ernst, D. (2010). Reinforcement Learning and Dynamic Programming Using Function Approximators, CRC Press, Boca Raton, FL.
Cervellera, C., Wen, A. and Chen, V.C. (2007). Neural network and regression spline value function approximations for stochastic dynamic programming, Computers & Operations Research 34(1): 70–90, DOI: 10.1016/j.cor.2005.02.043.
De Farias, D.P. and Van Roy, B. (2003). The linear programming approach to approximate dynamic programming, Operations Research 51(6): 850–865, DOI: 10.1287/opre.51.6.850.24925.
Deisenroth, M.P., Neumann, G. and Peters, J. (2013). A survey on policy search for robotics, Foundations and Trends in Robotics 2(1–2): 1–142, DOI: 10.1561/2300000021.
Díaz, H., Armesto, L. and Sala, A. (2019). Metodología de programación dinámica aproximada para control óptimo basada en datos, Revista Iberoamericana de Automática e Informática Industrial 16(3): 273–283, DOI: 10.4995/riai.2019.10379.
Díaz, H., Armesto, L. and Sala, A. (2020). Fitted Q-function control methodology based on Takagi–Sugeno systems, IEEE Transactions on Control Systems Technology 28(2): 477–488, DOI: 10.1109/TCST.2018.2885689.
Lewis, F.L. and Liu, D. (2013). Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Wiley, Hoboken, NJ, DOI: 10.1002/9781118453988.
Lewis, F.L. and Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control, IEEE Circuits and Systems Magazine 9(3): 32–50, DOI: 10.1109/MCAS.2009.933854.
Liu, D., Wei, Q., Wang, D., Yang, X. and Li, H. (2017). Adaptive Dynamic Programming with Applications in Optimal Control, Springer, Berlin, DOI: 10.1007/978-3-319-50815-3.
Munos, R., Baird, L.C. and Moore, A.W. (1999). Gradient descent approaches to neural-net-based solutions of the Hamilton–Jacobi–Bellman equation, International Joint Conference on Neural Networks, Washington, DC, USA, Vol. 3, pp. 2152–2157.
Rantzer, A. (2006). Relaxed dynamic programming in switching systems, IEE Proceedings: Control Theory and Applications 153(5): 567–574, DOI: 10.1049/ip-cta:20050094.
Robles, R., Sala, A. and Bernal, M. (2019). Performance-oriented quasi-LPV modeling of nonlinear systems, International Journal of Robust and Nonlinear Control 29(5): 1230–1248, DOI: 10.1002/rnc.4444.
Tan, K., Zhao, S. and Xu, J. (2007). Online automatic tuning of a proportional integral derivative controller based on an iterative learning control approach, IET Control Theory & Applications 1(1): 90–96, DOI: 10.1049/iet-cta:20050004.
Zajdel, R. (2013). Epoch-incremental reinforcement learning algorithms, International Journal of Applied Mathematics and Computer Science 23(3): 623–635, DOI: 10.2478/amcs-2013-0047.
Zhao, D., Liu, J., Wu, R., Cheng, D. and Tang, X. (2019). An active exploration method for data efficient reinforcement learning, International Journal of Applied Mathematics and Computer Science 29(2): 351–362, DOI: 10.2478/amcs-2019-0026.