Optimal learning with Bernstein online aggregation
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Optimal learning with Bernstein online aggregation. / Wintenberger, Olivier.
In: Machine Learning, Vol. 106, No. 1, 01.01.2017, p. 119-141.
Bibtex
@article{Wintenberger2017,
  title   = {Optimal learning with {B}ernstein online aggregation},
  author  = {Wintenberger, Olivier},
  journal = {Machine Learning},
  volume  = {106},
  number  = {1},
  pages   = {119--141},
  year    = {2017},
  issn    = {0885-6125},
  doi     = {10.1007/s10994-016-5592-6},
}
RIS
TY - JOUR
T1 - Optimal learning with Bernstein online aggregation
AU - Wintenberger, Olivier
PY - 2017/1/1
Y1 - 2017/1/1
N2 - We introduce a new recursive aggregation procedure called Bernstein Online Aggregation (BOA). Its exponential weights include a second-order refinement. The procedure is optimal for the model selection aggregation problem in the bounded iid setting for the square loss: the excess risk of its batch version achieves the fast rate of convergence log(M)/n in deviation. BOA is the first online algorithm to satisfy this optimal fast rate. The second-order refinement is required to achieve optimality in deviation, since classical exponential weights cannot be optimal; see Audibert (Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA, 2007). The refinement rests on a new stochastic conversion that estimates the cumulative predictive risk in any stochastic environment with observable second-order terms. The observable second-order term is shown to be small enough to yield the fast rate in the iid setting when the loss is Lipschitz and strongly convex. We also introduce a multiple-learning-rates version of BOA. This fully adaptive procedure is also optimal, up to a log log(n) factor.
KW - Exponential weighted averages
KW - Individual sequences
KW - Learning theory
UR - http://www.scopus.com/inward/record.url?scp=84990841038&partnerID=8YFLogxK
U2 - 10.1007/s10994-016-5592-6
DO - 10.1007/s10994-016-5592-6
M3 - Journal article
AN - SCOPUS:84990841038
VL - 106
SP - 119
EP - 141
JO - Machine Learning
JF - Machine Learning
SN - 0885-6125
IS - 1
ER -
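The abstract above is, in effect, a description of an aggregation algorithm, so a small sketch may help fix ideas. Below is a minimal NumPy sketch of the kind of exponentially weighted update with a second-order (Bernstein-type) refinement that BOA is built on: instead of the classical weight update proportional to exp(-eta * r), the exponent picks up an extra observable second-order term eta^2 * r^2. The function name boa_weights, the centering of losses by the mixture loss, and the fixed learning rate are our assumptions for illustration; the paper's exact linearization and its adaptive multiple-learning-rates calibration differ, so treat this as a sketch rather than the authoritative procedure.

import numpy as np

def boa_weights(losses, eta):
    """Sketch of a Bernstein-type online aggregation over M experts.

    losses : (T, M) array, losses[t, j] is expert j's loss in round t.
    eta    : fixed learning rate (the paper also tunes this adaptively).

    Returns the (T, M) sequence of mixture weights, one row per round.
    """
    T, M = losses.shape
    w = np.full(M, 1.0 / M)            # uniform prior over the M experts
    weights = np.empty((T, M))
    for t in range(T):
        weights[t] = w                 # weights used to predict at round t
        # centered losses: expert loss minus the mixture's own loss,
        # so experts beating the mixture get negative values
        r = losses[t] - w @ losses[t]
        # second-order refinement: exponent is -eta*r - eta^2*r^2,
        # not the classical -eta*r of plain exponential weights
        w = w * np.exp(-eta * r * (1.0 + eta * r))
        w /= w.sum()                   # renormalize to a distribution
    return weights

For example, with rng = np.random.default_rng(0), calling boa_weights(rng.random((100, 5)), eta=0.5) returns the weight path over 100 rounds for 5 experts, and (weights * losses).sum(axis=1) gives the mixture's per-round loss. The eta^2 * r^2 term is what the paper's deviation bounds hinge on; with it removed, the update collapses to classical exponential weighting, which Audibert (2007) showed cannot be deviation-optimal.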