IMS-Microsoft Research Workshop: Foundations of Data Science – Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
- Alekh Agarwal | Microsoft
We present a new algorithm for the contextual bandit learning problem, where the learner repeatedly takes one of K actions in response to the observed context, and observes the reward only for that chosen action. Our method assumes access to an oracle for solving fully supervised cost-sensitive classification problems and achieves the statistically optimal regret guarantee with only ˜O(√KT) oracle calls across all T rounds. By doing so, we obtain the most practical contextual bandit learning algorithm amongst approaches that work for general policy classes. We further conduct a proof-of-concept experiment which demonstrates the excellent computational and prediction performance of (an online variant of) our algorithm relative to several baselines.
[Joint work with Daniel Hsu, Satyen Kale, John Langford, Lihong Li and Rob Schapire]
Speaker Details
I am currently a researcher in the New York lab of Microsoft Research, where I have also spent two wonderful years as a postdoc. Prior to that, I obtained my PhD in Computer Science from UC Berkeley, working with Peter Bartlett and Martin Wainwright.
-
-
Alekh Agarwal
Principal Research Manager
-
Jeff Running
-
-
Watch Next
-
-
-
Multimodal & Embodied Intelligence (Pt 1), Panel on Multimodal AI: Progress, Pitfalls, Possibilities
- Madhava Krishna,
- Sriram Ganapathy,
- Somak Aditya
-
Session on Compute & Trust (Security)
- Krishna Pillutla,
- Danish Pruthi
-
-
Session on Reasoning
- Hongxiang Fan,
- Nagarajan Natarajan
-
-
Session on Retrieval
- Lokesh Nagalapatti,
- Soumen Chakrabarti
-
-