Optimal learning with Bernstein online aggregation

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Optimal learning with Bernstein online aggregation. / Wintenberger, Olivier.

In: Machine Learning, Vol. 106, No. 1, 01.01.2017, pp. 119-141.


Harvard

Wintenberger, O 2017, 'Optimal learning with Bernstein online aggregation', Machine Learning, vol. 106, no. 1, pp. 119-141. https://doi.org/10.1007/s10994-016-5592-6

APA

Wintenberger, O. (2017). Optimal learning with Bernstein online aggregation. Machine Learning, 106(1), 119-141. https://doi.org/10.1007/s10994-016-5592-6

Vancouver

Wintenberger O. Optimal learning with Bernstein online aggregation. Machine Learning. 2017 Jan 1;106(1):119-141. https://doi.org/10.1007/s10994-016-5592-6

Author

Wintenberger, Olivier. / Optimal learning with Bernstein online aggregation. In: Machine Learning. 2017; Vol. 106, No. 1, pp. 119-141.

Bibtex

@article{60d7379893bb4a20beb3efec7a19a7f3,
title = "Optimal learning with Bernstein online aggregation",
abstract = "We introduce a new recursive aggregation procedure called Bernstein Online Aggregation (BOA). Its exponential weights include a second order refinement. The procedure is optimal for the model selection aggregation problem in the bounded iid setting for the square loss: the excess of risk of its batch version achieves the fast rate of convergence log (M) / n in deviation. The BOA procedure is the first online algorithm that satisfies this optimal fast rate. The second order refinement is required to achieve the optimality in deviation as the classical exponential weights cannot be optimal, see Audibert (Advances in neural information processing systems. MIT Press, Cambridge, MA, 2007). This refinement is settled thanks to a new stochastic conversion that estimates the cumulative predictive risk in any stochastic environment with observable second order terms. The observable second order term is shown to be sufficiently small to assert the fast rate in the iid setting when the loss is Lipschitz and strongly convex. We also introduce a multiple learning rates version of BOA. This fully adaptive BOA procedure is also optimal, up to a log log (n) factor.",
keywords = "Exponential weighted averages, Individual sequences, Learning theory",
author = "Olivier Wintenberger",
year = "2017",
month = jan,
day = "1",
doi = "10.1007/s10994-016-5592-6",
language = "English",
volume = "106",
pages = "119--141",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer",
number = "1",

}
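The abstract above describes exponential weights with a second-order refinement: each expert is downweighted not only by its loss but also by an observable squared-loss term. Below is a minimal sketch of that idea in Python, assuming the square loss, a single learning rate eta, and the penalized update w_i ∝ w_i · exp(-eta·(loss_i + eta·loss_i²)). This is an illustration of the second-order idea only, not the paper's exact BOA recursion, and it omits the multiple-learning-rates variant mentioned in the abstract.

import numpy as np

def second_order_ew_update(weights, losses, eta):
    # Classical exponential weights would use exp(-eta * losses);
    # the second-order refinement adds the observable term eta * losses**2
    # (assumed form, based on the abstract's description).
    penalized = losses + eta * losses ** 2
    new_weights = weights * np.exp(-eta * penalized)
    return new_weights / new_weights.sum()  # renormalize to a probability vector

# Toy usage: aggregate M experts over n rounds under the square loss.
rng = np.random.default_rng(0)
M, n, eta = 5, 200, 0.5
weights = np.full(M, 1.0 / M)
for _ in range(n):
    y = rng.normal()
    preds = y + rng.normal(scale=np.linspace(0.1, 1.0, M))  # experts of varying accuracy
    y_hat = weights @ preds            # aggregated prediction, formed before the update
    losses = (preds - y) ** 2          # square loss of each expert
    weights = second_order_ew_update(weights, losses, eta)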

RIS

TY - JOUR

T1 - Optimal learning with Bernstein online aggregation

AU - Wintenberger, Olivier

PY - 2017/1/1

Y1 - 2017/1/1

N2 - We introduce a new recursive aggregation procedure called Bernstein Online Aggregation (BOA). Its exponential weights include a second order refinement. The procedure is optimal for the model selection aggregation problem in the bounded iid setting for the square loss: the excess of risk of its batch version achieves the fast rate of convergence log (M) / n in deviation. The BOA procedure is the first online algorithm that satisfies this optimal fast rate. The second order refinement is required to achieve the optimality in deviation as the classical exponential weights cannot be optimal, see Audibert (Advances in neural information processing systems. MIT Press, Cambridge, MA, 2007). This refinement is settled thanks to a new stochastic conversion that estimates the cumulative predictive risk in any stochastic environment with observable second order terms. The observable second order term is shown to be sufficiently small to assert the fast rate in the iid setting when the loss is Lipschitz and strongly convex. We also introduce a multiple learning rates version of BOA. This fully adaptive BOA procedure is also optimal, up to a log log (n) factor.

AB - We introduce a new recursive aggregation procedure called Bernstein Online Aggregation (BOA). Its exponential weights include a second order refinement. The procedure is optimal for the model selection aggregation problem in the bounded iid setting for the square loss: the excess of risk of its batch version achieves the fast rate of convergence log (M) / n in deviation. The BOA procedure is the first online algorithm that satisfies this optimal fast rate. The second order refinement is required to achieve the optimality in deviation as the classical exponential weights cannot be optimal, see Audibert (Advances in neural information processing systems. MIT Press, Cambridge, MA, 2007). This refinement is settled thanks to a new stochastic conversion that estimates the cumulative predictive risk in any stochastic environment with observable second order terms. The observable second order term is shown to be sufficiently small to assert the fast rate in the iid setting when the loss is Lipschitz and strongly convex. We also introduce a multiple learning rates version of BOA. This fully adaptive BOA procedure is also optimal, up to a log log (n) factor.

KW - Exponential weighted averages

KW - Individual sequences

KW - Learning theory

UR - http://www.scopus.com/inward/record.url?scp=84990841038&partnerID=8YFLogxK

U2 - 10.1007/s10994-016-5592-6

DO - 10.1007/s10994-016-5592-6

M3 - Journal article

AN - SCOPUS:84990841038

VL - 106

SP - 119

EP - 141

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

IS - 1

ER -

ID: 196943682