Kernel-based tests for joint independence
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Kernel-based tests for joint independence. / Pfister, Niklas; Bühlmann, Peter; Schölkopf, Bernhard; Peters, Jonas.
In: Journal of the Royal Statistical Society, Series B (Statistical Methodology), Vol. 80, No. 1, 01.01.2018, p. 5-31.
RIS
TY - JOUR
T1 - Kernel-based tests for joint independence
AU - Pfister, Niklas
AU - Bühlmann, Peter
AU - Schölkopf, Bernhard
AU - Peters, Jonas
PY - 2018/1/1
Y1 - 2018/1/1
N2 - We investigate the problem of testing whether $d$ random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two variable Hilbert-Schmidt independence criterion (HSIC) but allows for an arbitrary number of variables. We embed the $d$-dimensional joint distribution and the product of the marginals into a reproducing kernel Hilbert space and define the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) as the squared distance between the embeddings. In the population case, the value of dHSIC is zero if and only if the $d$ variables are jointly independent, as long as the kernel is characteristic. Based on an empirical estimate of dHSIC, we define three different non-parametric hypothesis tests: a permutation test, a bootstrap test and a test based on a Gamma approximation. We prove that the permutation test achieves the significance level and that the bootstrap test achieves pointwise asymptotic significance level as well as pointwise asymptotic consistency (i.e., it is able to detect any type of fixed dependence in the large sample limit). The Gamma approximation does not come with these guarantees; however, it is computationally very fast and for small $d$, it performs well in practice. Finally, we apply the test to a problem in causal discovery.
AB - We investigate the problem of testing whether $d$ random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two variable Hilbert-Schmidt independence criterion (HSIC) but allows for an arbitrary number of variables. We embed the $d$-dimensional joint distribution and the product of the marginals into a reproducing kernel Hilbert space and define the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) as the squared distance between the embeddings. In the population case, the value of dHSIC is zero if and only if the $d$ variables are jointly independent, as long as the kernel is characteristic. Based on an empirical estimate of dHSIC, we define three different non-parametric hypothesis tests: a permutation test, a bootstrap test and a test based on a Gamma approximation. We prove that the permutation test achieves the significance level and that the bootstrap test achieves pointwise asymptotic significance level as well as pointwise asymptotic consistency (i.e., it is able to detect any type of fixed dependence in the large sample limit). The Gamma approximation does not come with these guarantees; however, it is computationally very fast and for small $d$, it performs well in practice. Finally, we apply the test to a problem in causal discovery.
KW - Causal inference
KW - Independence test
KW - Kernel methods
KW - V-statistics
U2 - 10.1111/rssb.12235
DO - 10.1111/rssb.12235
M3 - Journal article
VL - 80
SP - 5
EP - 31
JO - Journal of the Royal Statistical Society, Series B (Statistical Methodology)
JF - Journal of the Royal Statistical Society, Series B (Statistical Methodology)
SN - 1369-7412
IS - 1
ER -
ID: 188871088
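The RIS abstract above defines the empirical dHSIC as the squared RKHS distance between the embedding of the joint empirical distribution and the product of the empirical marginal embeddings, and mentions a permutation test built on it. As a rough illustration only (not the authors' reference implementation; the CRAN package `dHSIC` provides a tested one), here is a minimal NumPy sketch of the dHSIC V-statistic with Gaussian kernels and a fixed, arbitrarily chosen bandwidth, plus a simple Monte Carlo permutation test:

```python
import numpy as np

def gaussian_gram(x, bandwidth=1.0):
    """Gram matrix of a Gaussian (RBF) kernel for a 1-D sample x of length n."""
    sq = (x[:, None] - x[None, :]) ** 2
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def dhsic(samples, bandwidth=1.0):
    """Empirical dHSIC (V-statistic) for d samples of equal length n.

    Expands the squared RKHS distance between the empirical joint
    embedding and the product of the empirical marginal embeddings:
        mean_ij prod_k K_k[i, j]
      + prod_k mean(K_k)
      - 2 * mean_i prod_k mean_j K_k[i, j]
    """
    grams = np.stack([gaussian_gram(np.asarray(x, float), bandwidth)
                      for x in samples])                       # shape (d, n, n)
    joint = np.prod(grams, axis=0).mean()                      # joint-embedding term
    marg = np.prod([g.mean() for g in grams])                  # product-of-marginals term
    cross = 2.0 * np.prod(grams.mean(axis=2), axis=0).mean()   # cross term
    return joint + marg - cross

def dhsic_permutation_test(samples, n_perm=200, bandwidth=1.0, seed=0):
    """Monte Carlo permutation test: permute each variable (except the first)
    independently to mimic joint independence; returns a p-value."""
    rng = np.random.default_rng(seed)
    stat = dhsic(samples, bandwidth)
    count = 0
    for _ in range(n_perm):
        permuted = [samples[0]] + [rng.permutation(x) for x in samples[1:]]
        if dhsic(permuted, bandwidth) >= stat:
            count += 1
    return (1 + count) / (1 + n_perm)
```

Being a squared RKHS norm, the V-statistic is nonnegative, and under strong dependence it should exceed its value under independence; with jointly independent samples the permutation p-value is roughly uniform, while dependence pushes it toward its minimum of 1/(1 + n_perm). Function names and the bandwidth choice here are illustrative assumptions, not part of the paper.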