Feature Reinforcement Learning: Part II. Structured MDPs

Open Access | Jun 2021

References

  1. Bertsekas, D. P., and Tsitsiklis, J. N. 1996. Neuro-Dynamic Programming. Belmont, MA: Athena Scientific.
  2. Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Springer.
  3. Boutilier, C.; Dean, T.; and Hanks, S. 1999. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage. Journal of Artificial Intelligence Research 11:1–94. doi:10.1613/jair.575
  4. Chow, C. K., and Liu, C. N. 1968. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory IT-14(3):462–467. doi:10.1109/TIT.1968.1054142
  5. Dean, T., and Kanazawa, K. 1989. A Model for Reasoning about Persistence and Causation. Computational Intelligence 5(3):142–150. doi:10.1111/j.1467-8640.1989.tb00324.x
  6. Friedman, N.; Geiger, D.; and Goldszmidt, M. 1997. Bayesian Network Classifiers. Machine Learning 29(2):131–163. doi:10.1023/A:1007465528199
  7. Gagliolo, M. 2007. Universal Search. Scholarpedia 2(11):2575. doi:10.4249/scholarpedia.2575
  8. Goertzel, B., and Pennachin, C., eds. 2007. Artificial General Intelligence. Springer. doi:10.1007/978-3-540-68677-4
  9. Grünwald, P. D. 2007. The Minimum Description Length Principle. Cambridge, MA: MIT Press. doi:10.7551/mitpress/4643.001.0001
  10. Guestrin, C.; Koller, D.; Parr, R.; and Venkataraman, S. 2003. Efficient Solution Algorithms for Factored MDPs. Journal of Artificial Intelligence Research 19:399–468. doi:10.1613/jair.1000
  11. Hutter, M. 2003. Optimality of Universal Bayesian Prediction for General Loss and Alphabet. Journal of Machine Learning Research 4:971–1000.
  12. Hutter, M. 2005. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Berlin: Springer.
  13. Hutter, M. 2009a. Feature Dynamic Bayesian Networks. In Proc. 2nd Conf. on Artificial General Intelligence (AGI’09), volume 8, 67–73. Atlantis Press. doi:10.2991/agi.2009.6
  14. Hutter, M. 2009b. Feature Reinforcement Learning: Part I. Unstructured MDPs. Journal of Artificial General Intelligence 1:3–24. doi:10.2478/v10229-011-0002-8
  15. Kaelbling, L. P.; Littman, M. L.; and Cassandra, A. R. 1998. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence 101:99–134. doi:10.1016/S0004-3702(98)00023-X
  16. Kearns, M., and Koller, D. 1999. Efficient Reinforcement Learning in Factored MDPs. In Proc. 16th International Joint Conference on Artificial Intelligence (IJCAI-99), 740–747. San Francisco: Morgan Kaufmann.
  17. Koller, D., and Parr, R. 1999. Computing Factored Value Functions for Policies in Structured MDPs. In Proc. 16th International Joint Conference on Artificial Intelligence (IJCAI’99), 1332–1339.
  18. Koller, D., and Parr, R. 2000. Policy Iteration for Factored MDPs. In Proc. 16th Conference on Uncertainty in Artificial Intelligence (UAI-00), 326–334. San Francisco, CA: Morgan Kaufmann.
  19. Legg, S., and Hutter, M. 2007. Universal Intelligence: A Definition of Machine Intelligence. Minds & Machines 17(4):391–444. doi:10.1007/s11023-007-9079-x
  20. Lewis, D. D. 1998. Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval. In Proc. 10th European Conference on Machine Learning (ECML’98), 4–15. Chemnitz, DE: Springer. doi:10.1007/BFb0026666
  21. Littman, M. L.; Sutton, R. S.; and Singh, S. P. 2001. Predictive Representations of State. In Advances in Neural Information Processing Systems, volume 14, 1555–1561. MIT Press.
  22. McCallum, A. K. 1996. Reinforcement Learning with Selective Perception and Hidden State. Ph.D. Dissertation, Department of Computer Science, University of Rochester.
  23. Puterman, M. L. 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming. New York, NY: Wiley. doi:10.1002/9780470316887
  24. Ross, S.; Pineau, J.; Paquet, S.; and Chaib-draa, B. 2008. Online Planning Algorithms for POMDPs. Journal of Artificial Intelligence Research 32:663–704. doi:10.1613/jair.2567
  25. Russell, S. J., and Norvig, P. 2003. Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall, 2nd edition.
  26. Singh, S.; Littman, M.; Jong, N.; Pardoe, D.; and Stone, P. 2003. Learning Predictive State Representations. In Proc. 20th International Conference on Machine Learning (ICML’03), 712–719.
  27. Singh, S. P.; James, M. R.; and Rudary, M. R. 2004. Predictive State Representations: A New Theory for Modeling Dynamical Systems. In Proc. 20th Conference on Uncertainty in Artificial Intelligence (UAI’04), 512–518. Banff, Canada: AUAI Press.
  28. Strehl, A. L.; Diuk, C.; and Littman, M. L. 2007. Efficient Structure Learning in Factored-State MDPs. In Proc. 22nd AAAI Conference on Artificial Intelligence (AAAI’07), 645–650. Vancouver, BC: AAAI Press.
  29. Sutton, R. S., and Barto, A. G. 2018. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 2nd edition.
  30. Szita, I., and Lőrincz, A. 2008. The Many Faces of Optimism: A Unifying Approach. In Proc. 25th International Conference on Machine Learning (ICML 2008), volume 307, 1048–1055.
Language: English
Page range: 71–86
Submitted on: Oct 21, 2020
Accepted on: Apr 6, 2021
Published on: Jun 14, 2021
Published by: Artificial General Intelligence Society
In partnership with: Paradigm Publishing Services
Publication frequency: 2 times per year

© 2021 Marcus Hutter, published by Artificial General Intelligence Society
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.