Structure Learning for Directed Trees

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Structure Learning for Directed Trees. / Jakobsen, Martin Emil; Shah, Rajen D.; Bühlmann, Peter; Peters, Jonas.

In: Journal of Machine Learning Research, Vol. 23, (159), 2022, p. 1-97.

Research output: Contribution to journal › Journal article › Research › peer-review

Harvard

Jakobsen, ME, Shah, RD, Bühlmann, P & Peters, J 2022, 'Structure Learning for Directed Trees', Journal of Machine Learning Research, vol. 23, (159), pp. 1-97.

APA

Jakobsen, M. E., Shah, R. D., Bühlmann, P., & Peters, J. (2022). Structure Learning for Directed Trees. Journal of Machine Learning Research, 23, 1-97. [159].

Vancouver

Jakobsen ME, Shah RD, Bühlmann P, Peters J. Structure Learning for Directed Trees. Journal of Machine Learning Research. 2022;23:1-97. (159).

Author

Jakobsen, Martin Emil ; Shah, Rajen D. ; Bühlmann, Peter ; Peters, Jonas. / Structure Learning for Directed Trees. In: Journal of Machine Learning Research. 2022 ; Vol. 23. pp. 1-97.

Bibtex

@article{481153e9f5a34b25ab1aea9cbc4db7b7,
title = "Structure Learning for Directed Trees",
abstract = "Knowing the causal structure of a system is of fundamental interest in many areas of science and can aid the design of prediction algorithms that work well under manipulations to the system. The causal structure becomes identifiable from the observational distribution under certain restrictions. To learn the structure from data, score-based methods evaluate different graphs according to the quality of their fits. However, for large, continuous, and nonlinear models, these rely on heuristic optimization approaches with no general guarantees of recovering the true causal structure. In this paper, we consider structure learning of directed trees. We propose a fast and scalable method based on Chu–Liu–Edmonds{\textquoteright} algorithm we call causal additive trees (CAT). For the case of Gaussian errors, we prove consistency in an asymptotic regime with a vanishing identifiability gap. We also introduce two methods for testing substructure hypotheses with asymptotic family-wise error rate control that is valid post-selection and in unidentified settings. Furthermore, we study the identifiability gap, which quantifies how much better the true causal model fits the observational distribution, and prove that it is lower bounded by local properties of the causal model. Simulation studies demonstrate the favorable performance of CAT compared to competing structure learning methods.",
keywords = "Causality, directed trees, hypothesis testing, restricted causal models, structure learning",
author = "Jakobsen, {Martin Emil} and Shah, {Rajen D.} and Peter B{\"u}hlmann and Jonas Peters",
note = "Publisher Copyright: {\textcopyright} 2022 Martin Emil Jakobsen, Rajen Shah, Peter B{\"u}hlmann and Jonas Peters.",
year = "2022",
language = "English",
volume = "23",
pages = "1--97",
journal = "Journal of Machine Learning Research",
issn = "1533-7928",
publisher = "MIT Press",

}

RIS

TY - JOUR

T1 - Structure Learning for Directed Trees

AU - Jakobsen, Martin Emil

AU - Shah, Rajen D.

AU - Bühlmann, Peter

AU - Peters, Jonas

N1 - Publisher Copyright: © 2022 Martin Emil Jakobsen, Rajen Shah, Peter Bühlmann and Jonas Peters.

PY - 2022

Y1 - 2022

N2 - Knowing the causal structure of a system is of fundamental interest in many areas of science and can aid the design of prediction algorithms that work well under manipulations to the system. The causal structure becomes identifiable from the observational distribution under certain restrictions. To learn the structure from data, score-based methods evaluate different graphs according to the quality of their fits. However, for large, continuous, and nonlinear models, these rely on heuristic optimization approaches with no general guarantees of recovering the true causal structure. In this paper, we consider structure learning of directed trees. We propose a fast and scalable method based on Chu–Liu–Edmonds’ algorithm we call causal additive trees (CAT). For the case of Gaussian errors, we prove consistency in an asymptotic regime with a vanishing identifiability gap. We also introduce two methods for testing substructure hypotheses with asymptotic family-wise error rate control that is valid post-selection and in unidentified settings. Furthermore, we study the identifiability gap, which quantifies how much better the true causal model fits the observational distribution, and prove that it is lower bounded by local properties of the causal model. Simulation studies demonstrate the favorable performance of CAT compared to competing structure learning methods.

AB - Knowing the causal structure of a system is of fundamental interest in many areas of science and can aid the design of prediction algorithms that work well under manipulations to the system. The causal structure becomes identifiable from the observational distribution under certain restrictions. To learn the structure from data, score-based methods evaluate different graphs according to the quality of their fits. However, for large, continuous, and nonlinear models, these rely on heuristic optimization approaches with no general guarantees of recovering the true causal structure. In this paper, we consider structure learning of directed trees. We propose a fast and scalable method based on Chu–Liu–Edmonds’ algorithm we call causal additive trees (CAT). For the case of Gaussian errors, we prove consistency in an asymptotic regime with a vanishing identifiability gap. We also introduce two methods for testing substructure hypotheses with asymptotic family-wise error rate control that is valid post-selection and in unidentified settings. Furthermore, we study the identifiability gap, which quantifies how much better the true causal model fits the observational distribution, and prove that it is lower bounded by local properties of the causal model. Simulation studies demonstrate the favorable performance of CAT compared to competing structure learning methods.

KW - Causality

KW - directed trees

KW - hypothesis testing

KW - restricted causal models

KW - structure learning

UR - http://www.scopus.com/inward/record.url?scp=85131841168&partnerID=8YFLogxK

M3 - Journal article

AN - SCOPUS:85131841168

VL - 23

SP - 1

EP - 97

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

SN - 1533-7928

M1 - 159

ER -
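
The central computational step named in the abstract, scoring every candidate edge and then extracting an optimal directed tree with Chu–Liu–Edmonds' algorithm, can be illustrated in a few lines of Python. The sketch below is not the paper's implementation: the edge score (residual variance from a gradient-boosted regression), the helper names, and the toy data are hypothetical stand-ins for the Gaussian score used in CAT, and networkx's built-in Edmonds routine substitutes for the authors' own optimization code.

# Hedged sketch of a CAT-style pipeline: score all directed edges, then
# recover the best-scoring spanning tree with Chu-Liu-Edmonds' algorithm.
# The edge score below (residual variance of a nonparametric regression)
# is a placeholder, not the paper's exact score, which compares fits via
# score gains relative to the marginal model.
import networkx as nx
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def edge_weight(X, parent, child):
    """Hypothetical score for the edge parent -> child: residual variance
    after regressing the child on the parent (lower is better)."""
    reg = GradientBoostingRegressor().fit(X[:, [parent]], X[:, child])
    residuals = X[:, child] - reg.predict(X[:, [parent]])
    return float(np.var(residuals))


def causal_tree_sketch(X):
    """Return a minimum-weight spanning arborescence over all variables."""
    n_vars = X.shape[1]
    G = nx.DiGraph()
    for p in range(n_vars):
        for c in range(n_vars):
            if p != c:
                G.add_edge(p, c, weight=edge_weight(X, p, c))
    # networkx implements Chu-Liu-Edmonds' algorithm for this problem.
    return nx.minimum_spanning_arborescence(G)


# Toy example with the causal chain x0 -> x1 -> x2.
rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
x1 = np.sin(x0) + 0.1 * rng.normal(size=500)
x2 = np.exp(x1) + 0.1 * rng.normal(size=500)
tree = causal_tree_sketch(np.column_stack([x0, x1, x2]))
print(sorted(tree.edges()))  # ideally recovers [(0, 1), (1, 2)]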
