Schedules and Prioritization: A Behavioral Foundation for Multi-Armed Bandits and Stopping Problems

Bandit models typically begin with arms, states, rewards, and transition rules. This paper instead begins with preferences over stopped local contingent schedules: possible unfoldings of a responsibility, project, experiment, or opportunity in its own local time. Behavioral axioms on single schedules characterize a generalized stopping representation with current utility, local discounting, and a broad continuation aggregator. A common-tail comp...
