Statistical Inference in High-Dimensional Regression Models with Applications in Genome-Wide Association Studies

Master's thesis defense by Rune Christiansen

Title: Statistical Inference in High-Dimensional Regression Models with Applications in Genome-Wide Association Studies

 

Abstract: In this project, we discuss high-dimensional linear regression. With a particular focus on post-selection inference, we illustrate the invalidity of conventional statistics in data-driven submodels and propose a new method (PoSI) for valid and exact inference conditional on the lasso selection event. Motivated by the undesirable width of confidence intervals, we provide theory and implementations for an extension of PoSI and illustrate the resulting benefits under different conditions. To place conditional inference within the general spectrum of methodology for high-dimensional regression, we conduct a broad simulation study comparing PoSI with marginal estimation, Multi-Sample Splitting and the Desparsified Lasso. We conclude with a case study from the field of genetic epidemiology, investigating the methods' suitability for application in genome-wide association studies.

 

Supervisors: Niels Richard Hansen
             Line Skotte, Statens Serum Institut
External examiner: Jens Ledet Jensen, Aarhus Universitet