Regularizing towards Causal Invariance: Linear Models with Proxies

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

Regularizing towards Causal Invariance: Linear Models with Proxies. / Oberst, Michael; Thams, Nikolaj Theodor Birkmose; Peters, Jonas Martin; Sontag, David.

Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR, 2021. p. 1-11 (Proceedings of Machine Learning Research, Vol. 139).


Harvard

Oberst, M, Thams, NTB, Peters, JM & Sontag, D 2021, Regularizing towards Causal Invariance: Linear Models with Proxies. in Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR, Proceedings of Machine Learning Research, vol. 139, pp. 1-11, 38th International Conference on Machine Learning (ICML), Virtual, 18/07/2021. <https://proceedings.mlr.press/v139/>

APA

Oberst, M., Thams, N. T. B., Peters, J. M., & Sontag, D. (2021). Regularizing towards Causal Invariance: Linear Models with Proxies. In Proceedings of the 38th International Conference on Machine Learning (ICML) (pp. 1-11). PMLR. (Proceedings of Machine Learning Research; Vol. 139). https://proceedings.mlr.press/v139/

Vancouver

Oberst M, Thams NTB, Peters JM, Sontag D. Regularizing towards Causal Invariance: Linear Models with Proxies. In Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR. 2021. p. 1-11. (Proceedings of Machine Learning Research, Vol. 139).

Author

Oberst, Michael; Thams, Nikolaj Theodor Birkmose; Peters, Jonas Martin; Sontag, David. / Regularizing towards Causal Invariance: Linear Models with Proxies. Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR, 2021. pp. 1-11 (Proceedings of Machine Learning Research, Vol. 139).

Bibtex

@inproceedings{4e4821d415304369a90f036bde0f2439,
title = "Regularizing towards Causal Invariance: Linear Models with Proxies",
author = "Oberst, Michael and Thams, {Nikolaj Theodor Birkmose} and Peters, {Jonas Martin} and Sontag, David",
year = "2021",
language = "English",
series = "Proceedings of Machine Learning Research",
pages = "1--11",
booktitle = "Proceedings of the 38th International Conference on Machine Learning (ICML)",
publisher = "PMLR",
note = "38th International Conference on Machine Learning (ICML) ; Conference date: 18-07-2021 Through 24-07-2021",

}
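
A minimal LaTeX sketch of how the entry above could be cited, assuming it is saved in a file named references.bib; the file name and the natbib/plainnat setup are illustrative and not part of the original record:

\documentclass{article}
\usepackage{natbib} % provides \citet and \citep for author-year citations
\begin{document}
Proxy-based regularization towards causal invariance is proposed by
\citet{4e4821d415304369a90f036bde0f2439}.
\bibliographystyle{plainnat}
\bibliography{references} % expects the BibTeX entry above in references.bib
\end{document}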

RIS

TY - GEN

T1 - Regularizing towards Causal Invariance: Linear Models with Proxies

T2 - 38th International Conference on Machine Learning (ICML)

AU - Oberst, Michael

AU - Thams, Nikolaj Theodor Birkmose

AU - Peters, Jonas Martin

AU - Sontag, David

PY - 2021

Y1 - 2021


M3 - Article in proceedings

T3 - Proceedings of Machine Learning Research

SP - 1

EP - 11

BT - Proceedings of the 38th International Conference on Machine Learning (ICML)

PB - PMLR

Y2 - 18 July 2021 through 24 July 2021

ER -

ID: 305176750