COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
AT YALE UNIVERSITY

Box 208281
New Haven, CT 06520-8281

COWLES FOUNDATION DISCUSSION PAPER NO. 1551

BANDIT PROBLEMS

Dirk Bergemann and Juuso Välimäki

January 2006

We survey the literature on multi-armed bandit models and their applications in economics. The multi-armed bandit problem is a statistical decision model of an agent who tries to optimize his decisions while simultaneously improving his information. This classic problem has received much attention in economics because it concisely models the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff).

Keywords: One-Armed Bandit, Multi-Armed Bandit, Bayesian Learning, Experimentation, Index Policy, Matching, Experience Goods

JEL Classification: C72, C73, D43, D83