
Learning causal theories with non-reversible MCMC methods

Open Access | Jun 2022

References

  1. Akaike, H. (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6): 716–723. doi: 10.1109/TAC.1974.1100705.
  2. Angelino, E., Johnson, M. J. and Adams, R. P. (2016) Patterns of scalable Bayesian inference. Foundations and Trends in Machine Learning, 9: 119–247. doi: 10.1561/2200000052.
  3. Arabas, P. (2019) Energy aware data centers and networks: a survey. Journal of Telecommunications and Information Technology, 4: 26–36. doi: 10.26636/jtit.2018.129818.
  4. Arabas, P. (2021) Modeling and simulation of hierarchical task allocation system for energy-aware HPC clouds. Simulation Modelling Practice and Theory, 107: 102221. doi: 10.1016/j.simpat.2020.102221.
  5. Bierkens, J. (2016) Non-reversible Metropolis-Hastings. Statistics and Computing, 26: 1213–1228. doi: 10.1007/s11222-015-9598-x.
  6. Carey, S. (1985) Conceptual Change in Childhood. MIT Press.
  7. Chickering, D. (2002) Optimal structure identification with greedy search. Journal of Machine Learning Research, 3: 507–554. doi: 10.1162/153244303321897717.
  8. Contaldi, C., Vafaee, F. and Nelson, P. C. (2019) Bayesian network hybrid learning using an elite-guided genetic algorithm. Artificial Intelligence Review, 52(1): 245–272. doi: 10.1007/s10462-018-9615-5.
  9. Cooper, G. F. and Herskovits, E. (1992) A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9(4): 309–347. doi: 10.1023/A:1022649401552.
  10. Corander, J., Ekdahl, M. and Koski, T. (2008) Parallel interacting MCMC for learning of topologies of graphical models. Data Mining and Knowledge Discovery, 17: 431–456. doi: 10.1007/s10618-008-0099-9.
  11. Corander, J., Gyllenberg, M. and Koski, T. (2006) Bayesian model learning based on a parallel MCMC strategy. Statistics and Computing, 16: 355–362. doi: 10.1007/s11222-006-9391-y.
  12. Dai, J., Ren, J. and Du, W. (2020) Decomposition-based Bayesian network structure learning algorithm using local topology information. Knowledge-Based Systems, 105602. doi: 10.1016/j.knosys.2020.105602.
  13. Dolan, E. and Moré, J. (2001) Benchmarking optimization software with performance profiles. Mathematical Programming, 91. doi: 10.1007/s101070100263.
  14. Flores, M. J., Nicholson, A. E., Brunskill, A., Korb, K. B. and Mascaro, S. (2011) Incorporating expert knowledge when learning Bayesian network structure: A medical case study. Artificial Intelligence in Medicine, 53(3): 181–204. doi: 10.1016/j.artmed.2011.08.004.
  15. Friedman, N. and Koller, D. (2001) Being Bayesian about network structure: A Bayesian approach to structure discovery in Bayesian networks. Machine Learning, 50. doi: 10.1023/A:1020249912095.
  16. Friedman, N., Nachman, I. and Pe’er, D. (1999) Learning Bayesian network structure from massive datasets: The sparse candidate algorithm. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers, 206–215. doi: 10.13140/2.1.1125.2169.
  17. Gao, F. and Huang, D. (2020) A node sorting method for K2 algorithm in Bayesian network structure learning. In: 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 106–110. doi: 10.1109/ICAICA50127.2020.9182465.
  18. Goudie, R. J. B. and Mukherjee, S. (2016) A Gibbs sampler for learning DAGs. Journal of Machine Learning Research, 17(2): 1–39.
  19. Griffiths, T. L., Chater, N., Kemp, C., Perfors, A., Tenenbaum, J. B. and Goodman, N. D. (2010) Probabilistic models of cognition: exploring representation and inductive biases. Trends in Cognitive Sciences, 14: 357–364. doi: 10.1016/j.tics.2010.05.004.
  20. Hansen, N., Auger, A., Mersmann, O., Tušar, T. and Brockhoff, D. (2016) COCO: A platform for comparing continuous optimizers in a black-box setting. Optimization Methods and Software, 36: 114–144. doi: 10.1080/10556788.2020.1808977.
  21. Hastings, W. K. (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57: 97–109. doi: 10.1093/biomet/57.1.97.
  22. Jones, M. and Love, B. C. (2011) Bayesian fundamentalism or enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition. Behavioral and Brain Sciences, 34: 169–231. doi: 10.1017/S0140525X10003134.
  23. Kemp, C., Griffiths, T. L. and Tenenbaum, J. B. (2004) Discovering latent classes in relational data. Technical report, MIT CSAIL.
  24. Kemp, C., Tenenbaum, J. B., Niyogi, S. and Griffiths, T. (2010) A probabilistic model of theory formation. Cognition, 114: 165–196. doi: 10.1016/j.cognition.2009.09.003.
  25. Koller, D., Friedman, N., Getoor, L. and Taskar, B. (2007) Graphical models in a nutshell. In: L. Getoor and B. Taskar, eds., Introduction to Statistical Relational Learning. The MIT Press, 13–55.
  26. Koski, T. and Noble, J. M. (2012) A review of Bayesian networks and structure learning. Mathematica Applicanda, 40: 51–103. doi: 10.14708/ma.v40i1.278.
  27. Koski, T. J. T. and Noble, J. M. (2009) Bayesian Networks: An Introduction. Wiley. doi: 10.1002/9780470684023.
  28. Lee, C. and van Beek, P. (2017) Metaheuristics for score-and-search Bayesian network structure learning. In: Canadian Conference on Artificial Intelligence, 129–141. Springer. doi: 10.1007/978-3-319-57351-9_17.
  29. Madigan, D. and York, J. (1995) Bayesian graphical models for discrete data. International Statistical Review, 63: 215–232. doi: 10.2307/1403615.
  30. Madigan, D., Andersson, S., Perlman, M. and Volinsky, C. (1996) Bayesian model averaging and model selection for Markov equivalence classes of acyclic digraphs. Communications in Statistics: Theory and Methods, 25. doi: 10.1080/03610929608831853.
  31. Madsen, A. L., Jensen, F., Salmerón, A., Langseth, H. and Nielsen, T. D. (2017) A parallel algorithm for Bayesian network structure learning from large data sets. Knowledge-Based Systems, 117: 46–55. doi: 10.1016/j.knosys.2016.07.031.
  32. Mansinghka, V., Kemp, C., Tenenbaum, J. B. and Griffiths, T. L. (2006) Structured priors for structure learning. In: Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI). AUAI Press, 324–331.
  33. McClelland, J., Botvinick, M. M., Noelle, D. C., Plaut, D. C., Rogers, T. T., Seidenberg, M. S. and Smith, L. B. (2010) Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends in Cognitive Sciences, 14: 348–356. doi: 10.1016/j.tics.2010.06.002.
  34. Moore, A. and Wong, W.-K. (2004) Optimal reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning. In: Proceedings of the Twentieth International Conference on Machine Learning (ICML’03). AAAI Press, 552–559.
  35. Moré, J. and Wild, S. (2009) Benchmarking derivative-free optimization algorithms. SIAM Journal on Optimization, 20: 172–191. doi: 10.1137/080724083.
  36. Murphy, G. and Medin, D. (1985) The role of theories in conceptual coherence. Psychological Review, 92(3): 289–316. doi: 10.1037/0033-295X.92.3.289.
  37. Peters, G. (2008) Markov chain Monte Carlo: stochastic simulation for Bayesian inference. Statistics in Medicine, 27(16): 3213–3214. doi: 10.1002/sim.3240.
  38. van Ravenzwaaij, D., Cassey, P. and Brown, S. D. (2018) A simple introduction to Markov chain Monte-Carlo. Psychonomic Bulletin & Review, 25: 143–154. doi: 10.3758/s13423-016-1015-8.
  39. Rios, F., Noble, J. and Koski, T. (2015) A prior distribution over directed acyclic graphs for sparse Bayesian networks. arXiv:1504.06701.
  40. Rissanen, J. (1978) Modeling by shortest data description. Automatica, 14(5): 465–471. doi: 10.1016/0005-1098(78)90005-5.
  41. Robinson, R. (1973) Counting labeled acyclic digraphs. In: F. Harary, ed., New Directions in the Theory of Graphs, 239–273. Academic Press, New York, NY.
  42. Scanagatta, M., Salmerón, A. and Stella, F. (2019) A survey on Bayesian network structure learning from data. Progress in Artificial Intelligence, 8: 425–439. doi: 10.1007/s13748-019-00194-y.
  43. Schwarz, G. (1978) Estimating the dimension of a model. Annals of Statistics, 6(2): 461–464. doi: 10.1214/aos/1176344136.
  44. Silander, T., Leppä-Aho, J., Jääsaari, E. and Roos, T. (2018) Quotient normalized maximum likelihood criterion for learning Bayesian network structures. In: International Conference on Artificial Intelligence and Statistics (AISTATS), 948–957. PMLR.
  45. Szynkiewicz, P. (2018) Comparative study of PSO and CMA-ES algorithms on black-box optimization benchmarks. Journal of Telecommunications and Information Technology, 4: 5–17. doi: 10.26636/jtit.2018.127418.
  46. Tenenbaum, J. B., Kemp, C., Griffiths, T. L. and Goodman, N. D. (2011) Statistics, structure, and abstraction. Science, 331: 1279–1285. doi: 10.1126/science.1192788.
  47. van der Vaart, A. W. (1998) Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press. doi: 10.1017/CBO9780511802256.
  48. Wellman, H. M. and Gelman, S. A. (1992) Cognitive development: Foundational theories of core domains. Annual Review of Psychology, 43(1): 337–375. doi: 10.1146/annurev.ps.43.020192.002005. PMID: 1539946.
DOI: https://doi.org/10.2478/candc-2021-0021 | Journal eISSN: 2720-4278 | Journal ISSN: 0324-8569
Language: English
Page range: 323 - 361
Submitted on: May 1, 2021
Accepted on: Jul 1, 2021
Published on: Jun 27, 2022
Published by: Systems Research Institute, Polish Academy of Sciences
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2022 Antonina Krajewska, published by Systems Research Institute, Polish Academy of Sciences
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.