Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
Research output: Working paper › Preprint
Standard
Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics. / Negrea, Jeffrey; Yang, Jun; Feng, Haoyue; Roy, Daniel M.; Huggins, Jonathan H.
arXiv preprint, 2023.
RIS
TY - UNPB
T1 - Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
AU - Negrea, Jeffrey
AU - Yang, Jun
AU - Feng, Haoyue
AU - Roy, Daniel M.
AU - Huggins, Jonathan H.
N1 - 42 pgs
PY - 2023
Y1 - 2023
N2 - The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory–practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size–sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein–von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results and recommendations in realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic gradient Markov chain Monte Carlo algorithms for a wide range of models.
KW - stat.CO
KW - cs.LG
KW - stat.ME
KW - stat.ML
M3 - Preprint
BT - Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
PB - arXiv preprint
ER -
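The abstract's central recommendation, iterate averaging (Polyak–Ruppert averaging) with a large fixed step size, can be illustrated with a minimal sketch. The toy linear-regression model, the step size, and all variable names below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = X @ theta_true + noise
n, d = 5000, 3
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ theta_true + rng.normal(size=n)

# Constant-step-size SGD with Polyak-Ruppert iterate averaging.
step = 0.05            # "large" fixed step size (illustrative value)
theta = np.zeros(d)    # raw SGD iterate
theta_bar = np.zeros(d)  # running average of iterates
for t in range(n):
    i = rng.integers(n)
    grad = (X[i] @ theta - y[i]) * X[i]  # per-sample squared-error gradient
    theta = theta - step * grad
    theta_bar += (theta - theta_bar) / (t + 1)  # incremental mean update

print(theta_bar)  # near theta_true; averaging smooths the noisy raw iterates
```

The raw iterate `theta` keeps bouncing in a noise ball around the optimum because the step size never decays; the averaged iterate `theta_bar` converges much more tightly, which is the behavior the paper's asymptotic analysis characterizes.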