Continuous-in-time Limit for Multi-armed Bandit

Yuhua Zhu

Assistant Professor, Halicioğlu Data Science Institute (HDSI) & Department of Mathematics, University of California, San Diego

Seminar Information

Seminar Series

Dynamic Systems & Controls

Seminar Date - Time

October 27, 2023, 3:00 pm - 4:00 pm

Seminar Location

EBU II 479, Von Karman-Penner Seminar Room

Important Links

Zoom Meeting ID: 971 5388 3048

Abstract

In this talk, I will establish a connection between Hamilton-Jacobi-Bellman (HJB) equations and multi-armed bandit (MAB) problems. The HJB equation is central to solving stochastic optimal control problems. MAB is a widely used paradigm for studying the exploration-exploitation trade-off in sequential decision making under uncertainty. This is the first work that establishes this connection in a general setting. I will present an efficient algorithm for solving MAB problems based on this connection and demonstrate its practical applications. This is joint work with Lexing Ying and Zach Izzo from Stanford University.

Speaker Bio

Yuhua Zhu is an assistant professor at UC San Diego, where she holds a joint appointment in the Halicioğlu Data Science Institute (HDSI) and the Department of Mathematics. Previously, she was a postdoctoral fellow at Stanford University; she received her Ph.D. from UW-Madison. Her work bridges differential equations and machine learning, spanning reinforcement learning, stochastic optimization, sequential decision-making, and uncertainty quantification.