Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
Research output: Working paper › Preprint
Standard
Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics. / Negrea, Jeffrey; Yang, Jun; Feng, Haoyue; Roy, Daniel M.; Huggins, Jonathan H.
arXiv preprint, 2023.
RIS
TY - UNPB
T1 - Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
AU - Negrea, Jeffrey
AU - Yang, Jun
AU - Feng, Haoyue
AU - Roy, Daniel M.
AU - Huggins, Jonathan H.
N1 - 42 pgs
PY - 2023
Y1 - 2023
N2 - The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory–practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size–sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein–von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results and recommendations in realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic gradient Markov chain Monte Carlo algorithms for a wide range of models.
KW - stat.CO
KW - cs.LG
KW - stat.ME
KW - stat.ML
M3 - Preprint
BT - Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
PB - arXiv preprint
ER -
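The abstract's central recommendation, iterate averaging (Polyak–Ruppert averaging) with a large fixed step size, can be illustrated with a minimal sketch. The toy linear-regression model, the step size, and all variable names below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = X @ theta_true + noise
n, d = 5000, 3
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ theta_true + rng.normal(size=n)

# Constant-step-size SGD with Polyak-Ruppert iterate averaging.
step = 0.05            # "large" fixed step size (illustrative value)
theta = np.zeros(d)    # raw SGD iterate
theta_bar = np.zeros(d)  # running average of iterates
for t in range(n):
    i = rng.integers(n)
    grad = (X[i] @ theta - y[i]) * X[i]  # per-sample squared-error gradient
    theta = theta - step * grad
    theta_bar += (theta - theta_bar) / (t + 1)  # incremental mean update

print(theta_bar)  # near theta_true; averaging smooths the noisy raw iterates
```

The raw iterate `theta` keeps bouncing in a noise ball around the optimum because the step size never decays; the averaged iterate `theta_bar` converges much more tightly, which is the behavior the paper's asymptotic analysis characterizes.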