Control Systems and Computers, N5, 2020, Article 5

https://doi.org/10.15407/csc.2020.05.052

Control Systems and Computers, 2020, Issue 5 (289), pp. 52-63.

UDC 004.021

Bespala Olha M., PhD student, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, 37, Prosp. Peremohy, Kyiv, Ukraine, 03056, Olya327@ukr.net, ORCID 0000-0003-3285-2585

 TOOLS OF CAUSAL INFERENCE: REVIEW AND PROSPECTS

Introduction. The need to establish causality arises across a wide range of fields, each with its own specifics and approaches. Solving causal tasks therefore calls for a variety of methods, and the choice among the many available tools depends on the task at hand.

Purpose. This work gives a brief overview and analysis of modern methods, algorithms and technologies for detecting causation, and of the range of tasks in which these tools are used.

Methods. The current state, advantages and disadvantages of the tools are described, starting from the gold standards of causal identification and moving on to algorithms that are more accurate but applicable only under a limited range of conditions.

Result. The current state of existing methods, algorithms and technologies for establishing causality is analyzed, and the prospects for further development and improvement of causal detection tools are examined.

Conclusions. Although a large number of methods, algorithms and technologies are already known, there remain a number of problems that require more accurate detection of causality. The paper shows that most tools for establishing causality give good results for acyclic structures but can give false positive conclusions for cyclic structures. Well-known scientific institutions and leading computer technology corporations around the world are actively developing and implementing ever more advanced tools for establishing causality, with the aim of building automated software systems that come close to human thinking.

Keywords: establishment of causality; causal inference tools; causal relationship; causal conclusion; causal analysis.

Received 10.10.2020