Extremal Random Forests

Institut for Matematiske Fag

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Extremal Random Forests. / Gnecco, Nicola; Terefe, Edossa Merga; Engelke, Sebastian.

In: Journal of the American Statistical Association, 2024.

Harvard

Gnecco, N, Terefe, EM & Engelke, S 2024, 'Extremal Random Forests', Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2023.2300522

APA

Gnecco, N., Terefe, E. M., & Engelke, S. (2024). Extremal Random Forests. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2023.2300522

Vancouver

Gnecco N, Terefe EM, Engelke S. Extremal Random Forests. Journal of the American Statistical Association. 2024. https://doi.org/10.1080/01621459.2023.2300522

Author

Gnecco, Nicola ; Terefe, Edossa Merga ; Engelke, Sebastian. / Extremal Random Forests. In: Journal of the American Statistical Association. 2024.

Bibtex

@article{3e637627324146818dea9d04c6c706c8,

title = "Extremal Random Forests",

abstract = "Classical methods for quantile regression fail in cases where the quantile of interest is extreme and only few or no training data points exceed it. Asymptotic results from extreme value theory can be used to extrapolate beyond the range of the data, and several approaches exist that use linear regression, kernel methods or generalized additive models. Most of these methods break down if the predictor space has more than a few dimensions or if the regression function of extreme quantiles is complex. We propose a method for extreme quantile regression that combines the flexibility of random forests with the theory of extrapolation. Our extremal random forest (ERF) estimates the parameters of a generalized Pareto distribution, conditional on the predictor vector, by maximizing a local likelihood with weights extracted from a quantile random forest. We penalize the shape parameter in this likelihood to regularize its variability in the predictor space. Under general domain of attraction conditions, we show consistency of the estimated parameters in both the unpenalized and penalized case. Simulation studies show that our ERF outperforms both classical quantile regression methods and existing regression approaches from extreme value theory. We apply our methodology to extreme quantile prediction for U.S. wage data. Supplementary materials for this article are available online.",

keywords = "Extreme quantiles, Local likelihood estimation, Quantile regression, Random forests, Threshold exceedances",

author = "Nicola Gnecco and Terefe, {Edossa Merga} and Sebastian Engelke",

note = "Publisher Copyright: {\textcopyright} 2024 American Statistical Association.",

year = "2024",

doi = "10.1080/01621459.2023.2300522",

language = "English",

journal = "Journal of the American Statistical Association",

issn = "0162-1459",

publisher = "Taylor \& Francis",

}

RIS

TY - JOUR

T1 - Extremal Random Forests

AU - Gnecco, Nicola

AU - Terefe, Edossa Merga

AU - Engelke, Sebastian

PY - 2024

Y1 - 2024

N2 - Classical methods for quantile regression fail in cases where the quantile of interest is extreme and only few or no training data points exceed it. Asymptotic results from extreme value theory can be used to extrapolate beyond the range of the data, and several approaches exist that use linear regression, kernel methods or generalized additive models. Most of these methods break down if the predictor space has more than a few dimensions or if the regression function of extreme quantiles is complex. We propose a method for extreme quantile regression that combines the flexibility of random forests with the theory of extrapolation. Our extremal random forest (ERF) estimates the parameters of a generalized Pareto distribution, conditional on the predictor vector, by maximizing a local likelihood with weights extracted from a quantile random forest. We penalize the shape parameter in this likelihood to regularize its variability in the predictor space. Under general domain of attraction conditions, we show consistency of the estimated parameters in both the unpenalized and penalized case. Simulation studies show that our ERF outperforms both classical quantile regression methods and existing regression approaches from extreme value theory. We apply our methodology to extreme quantile prediction for U.S. wage data. Supplementary materials for this article are available online.

AB - Classical methods for quantile regression fail in cases where the quantile of interest is extreme and only few or no training data points exceed it. Asymptotic results from extreme value theory can be used to extrapolate beyond the range of the data, and several approaches exist that use linear regression, kernel methods or generalized additive models. Most of these methods break down if the predictor space has more than a few dimensions or if the regression function of extreme quantiles is complex. We propose a method for extreme quantile regression that combines the flexibility of random forests with the theory of extrapolation. Our extremal random forest (ERF) estimates the parameters of a generalized Pareto distribution, conditional on the predictor vector, by maximizing a local likelihood with weights extracted from a quantile random forest. We penalize the shape parameter in this likelihood to regularize its variability in the predictor space. Under general domain of attraction conditions, we show consistency of the estimated parameters in both the unpenalized and penalized case. Simulation studies show that our ERF outperforms both classical quantile regression methods and existing regression approaches from extreme value theory. We apply our methodology to extreme quantile prediction for U.S. wage data. Supplementary materials for this article are available online.

KW - Extreme quantiles

KW - Local likelihood estimation

KW - Quantile regression

KW - Random forests

KW - Threshold exceedances

UR - http://www.scopus.com/inward/record.url?scp=85185692133&partnerID=8YFLogxK

U2 - 10.1080/01621459.2023.2300522

DO - 10.1080/01621459.2023.2300522

M3 - Journal article

AN - SCOPUS:85185692133

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

ER -

ID: 384878166