Daniel Russo – Research

Preprints

Posterior Sampling via Autoregressive Generation
Kelly Zhang, Tiffany Cai, Hongseok Namkoong, and Daniel Russo
Working paper - Submitted.

Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification
Chao Qin and Daniel Russo
Working paper - Submitted.
Second place, INFORMS Jeff McGill Student Paper Award.

Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization
Matias Alvo, Daniel Russo, and Yash Kanoria
Working paper - Submitted.

Impatient Bandits: Optimizing Recommendations for the Long-Term Without Delay
Thomas M. McDonald, Lucas Maystre, Mounia Lalmas, Daniel Russo, and Kamil Ciosek
Preliminary version appeared at KDD 2023. Journal version in progress.
Spotify blog post

Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective
Lucas Maystre, Daniel Russo, and Yu Zhao
Working paper - Major revision at Management Science.

An Information-Theoretic Analysis of Nonstationary Bandit Learning
Seungki Min and Daniel Russo
Immediate revision requested at Operations Research. Preliminary version appeared at ICML 2023.

Adaptive Experimentation in the Presence of Exogenous Nonstationary Variation
Chao Qin and Daniel Russo
Working paper - Major revision at Management Science.

Published Papers

Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari and Daniel Russo
Operations Research, 2024
Talk link

On the Statistical Benefits of Temporal Difference Learning
David Cheikhi and Daniel Russo
ICML 2023
Full oral presentation at ICML (top 2.2% of submissions). Finalist in INFORMS APS student paper competition.

Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
Management Science, 2023

Temporally-Consistent Survival Analysis
Lucas Maystre and Daniel Russo
NeurIPS 2022
Spotify blog post

Satisficing in Time-Sensitive Bandit Learning
Daniel Russo and Benjamin Van Roy
Mathematics of Operations Research, 2022

Learning to Stop with Surprisingly Few Samples
Tianyi Zhang, Daniel Russo, and Assaf Zeevi
Conference on Learning Theory (COLT) 2022

On the Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari and Daniel Russo
Conference on Artificial Intelligence and Statistics (AISTATS), 2021

On the Futility of Dynamics in Robust Mechanism Design
Santiago Balseiro, Anthony Kim, and Daniel Russo
Operations Research, 2021

A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Daniel Russo
Operations Research, 2021

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Jalaj Bhandari, Daniel Russo, and Raghav Singal
Operations Research, 2021
Preliminary version appeared at COLT 2018
Short talk link

Worst-Case Regret Bounds For Exploration Via Randomized Value Functions
Daniel Russo
NeurIPS 2019

A Tutorial on Thompson Sampling
Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, and Zheng Wen
Foundations and Trends in Machine Learning, Vol. 11, No. 1, pp. 1-96, 2018. (code)

Deep Exploration via Randomized Value Functions
Ian Osband, Daniel Russo, Zheng Wen, and Benjamin Van Roy
Journal of Machine Learning Research, 2019

Improving the Expected Improvement Algorithm
Chao Qin, Diego Klabjan, and Daniel Russo
NeurIPS 2017

Simple Bayesian Algorithms for Best Arm Identification
Daniel Russo
Operations Research, 2020
Preliminary version appeared at COLT 2016
First place, INFORMS JFIG paper competition.

Controlling Bias in Adaptive Data Analysis Using Information Theory
Daniel Russo and James Zou
IEEE Transactions on Information Theory, 2020
Preliminary version appeared at AISTATS 2016 (full oral presentation; top 7% of submissions).

Learning to Optimize Via Information Directed Sampling
Daniel Russo and Benjamin Van Roy
Operations Research, 2018
Preliminary version appeared at NeurIPS 2014
First place, INFORMS George Nicholson 2014 student paper competition.

An Information-Theoretic Analysis of Thompson Sampling
Daniel Russo and Benjamin Van Roy
Journal of Machine Learning Research, 2016

Learning to Optimize Via Posterior Sampling
Daniel Russo and Benjamin Van Roy
Mathematics of Operations Research, Vol. 39, No. 4, pp. 1221-1243, 2014

Eluder Dimension and the Sample Complexity of Optimistic Exploration
Daniel Russo and Benjamin Van Roy
NeurIPS 2013 (full oral presentation; top 1.4% of submissions).

(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband, Daniel Russo, and Benjamin Van Roy
NeurIPS 2013

Welfare-Improving Cascades and the Effect of Noisy Reviews
Nick Arnosti and Daniel Russo
WINE 2013