
Solving Finite-Horizon Discounted Non-Stationary MDPS

Open Access | Jun 2023

References

  1. Allamigeon, X., Boyet, M., Gaubert, S. (2021). Piecewise Affine Dynamical Models of Petri Nets–Application to Emergency Call Centers. Fundamenta Informaticae, 183(3–4), 169–201. DOI: 10.3233/FI-2021-2086.
  2. Asadi, A., Pinkley, S.N., Mes, M. (2022). A Markov decision process approach for managing medical drone deliveries. Expert Systems With Applications, 204, 117490. DOI: 10.1016/j.eswa.2022.117490.
  3. Bellman, R. (1958). Dynamic programming and stochastic control processes. Information and Control, 1(3), 228–239. DOI: 10.1016/S0019-9958(58)80003-0.
  4. Bertsekas, D. (2012). Dynamic programming and optimal control: Volume I (Vol. 1). Athena Scientific.
  5. Bertsimas, D., Mišić, V.V. (2016). Decomposable Markov decision processes: A fluid optimization approach. Operations Research, 64(6), 1537–1555. DOI: 10.1287/opre.2016.1531.
  6. Dulac-Arnold, G., Levine, N., Mankowitz, D.J., Li, J., Paduraru, C., Gowal, S., Hester, T. (2021). Challenges of real-world reinforcement learning: Definitions, benchmarks and analysis. Machine Learning, 110(9), 2419–2468. DOI: 10.1007/s10994-021-05961-4.
  7. El Akraoui, B., Daoui, C., Larach, A. (2022). Decomposition Methods for Solving Finite-Horizon Large MDPs. Journal of Mathematics, 2022. DOI: 10.1155/2022/8404716.
  8. Emadi, H., Atkins, E., Rastgoftar, H. (2022). A Finite-State Fixed-Corridor Model for UAS Traffic Management. ArXiv Preprint ArXiv:2204.05517.
  9. Feinberg, E.A. (2016). Optimality conditions for inventory control. In Optimization Challenges in Complex, Networked and Risky Systems (pp. 14–45). INFORMS. DOI: 10.1287/educ.2016.0145.
  10. Hordijk, A., Kallenberg, L.C.M. (1984). Transient policies in discrete dynamic programming: Linear programming including suboptimality tests and additional constraints. Mathematical Programming, 30(1), 46–70. DOI: 10.1007/BF02591798.
  11. Howard, R.A. (1960). Dynamic programming and Markov processes. MIT Press, Cambridge, MA. https://books.google.co.ma/books?id=fXJEAAAAIAAJ.
  12. Kallenberg, L.C.M. (1983). Linear programming and finite Markovian control problems. Math. Centre Tracts, 148, 1–245.
  13. Larach, A., Chafik, S., Daoui, C. (2017). Accelerated decomposition techniques for large discounted Markov decision processes. Journal of Industrial Engineering International, 13(4), 417–426. DOI: 10.1007/s40092-017-0197-7.
  14. Mao, W., Zheng, Z., Wu, F., Chen, G. (2018). Online Pricing for Revenue Maximization with Unknown Time Discounting Valuations. IJCAI, 440–446. DOI: 10.24963/ijcai.2018/61.
  15. Pavitsos, A., Kyriakidis, E.G. (2009). Markov decision models for the optimal maintenance of a production unit with an upstream buffer. Computers & Operations Research, 36(6), 1993–2006. DOI: 10.1016/j.cor.2008.06.014.
  16. Peng, H., Cheng, Y., Li, X. (2023). Real-Time Pricing Method for Spot Cloud Services with Non-Stationary Excess Capacity. Sustainability, 15(4), 3363.
  17. Puterman, M.L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons Inc. DOI: 10.1002/9780470316887.
  18. Rimélé, A., Grangier, P., Gamache, M., Gendreau, M., Rousseau, L.-M. (2021). E-commerce warehousing: Learning a storage policy. ArXiv Preprint ArXiv:2101.08828. DOI: 10.48550/arXiv.2101.08828.
  19. Spieksma, F., Nunez-Queija, R. (2015). Markov Decision Processes. Adaptation of the Text by R. Nunez-Queija, 55.
  20. Sutton, R.S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9–44. DOI: 10.1007/BF00115009.
  21. White III, C.C., White, D.J. (1989). Markov decision processes. European Journal of Operational Research, 39(1), 1–16. DOI: 10.1016/0377-2217(89)90348-2.
  22. Wu, Y., Zhang, J., Ravey, A., Chrenko, D., Miraoui, A. (2020). Real-time energy management of photovoltaic-assisted electric vehicle charging station by Markov decision process. Journal of Power Sources, 476, 228504.
  23. Ye, G., Lin, Q., Juang, T.-H., Liu, H. (2020). Collision-free Navigation of Human-centered Robots via Markov Games. 2020 IEEE International Conference on Robotics and Automation (ICRA), 11338–11344. DOI: 10.1109/ICRA40945.2020.9196810.
  24. Ye, Y. (2011). The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Mathematics of Operations Research, 36(4), 593–603. DOI: 10.1287/moor.1110.0516.
  25. Zhang, Y., Kim, C.-W., Tee, K.F. (2017). Maintenance management of offshore structures using Markov process model with random transition probabilities. Structure and Infrastructure Engineering, 13(8), 1068–1080. DOI: 10.1080/15732479.2016.1236393.
DOI: https://doi.org/10.2478/foli-2023-0001 | Journal eISSN: 1898-0198 | Journal ISSN: 1730-4237
Language: English
Page range: 1 - 15
Accepted on: Feb 26, 2023
Published on: Jun 9, 2023
Published by: University of Szczecin
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 El Akraoui Bouchra, Cherki Daoui, published by University of Szczecin
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 License.