Multi-armed bandit

Covered by 3 sources including DEV Community, jacquescorbytuech.com

In probability theory and machine learning, the multi-armed bandit problem (sometimes called the K-[1] or N-armed bandit problem[2]) takes its name from the image of a gambler at a row of slot machines (sometimes known as "one-armed bandits") who has to decide which machines to play, how many times to play each machine, in which order to play them, and whether to continue with the current machine or try a different one.[3]
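
The gambler's dilemma can be made concrete with a small simulation. The following is a minimal sketch, not a definitive implementation: it assumes each machine pays out 1 with a fixed but hidden probability (a Bernoulli arm) and uses epsilon-greedy, one common heuristic for choosing between trying a different machine and staying with the best one seen so far. The function name `epsilon_greedy`, its parameters, and the payout probabilities are all hypothetical and chosen for illustration.

```python
import random


def epsilon_greedy(probs, steps=1000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy gambler on Bernoulli slot machines.

    probs: assumed payout probability of each machine (hidden from the gambler).
    Returns (counts, estimates, total_reward).
    """
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k        # how many times each machine was played
    estimates = [0.0] * k   # running mean reward observed per machine
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a machine chosen uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: play the machine with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(k), key=lambda i: estimates[i])

        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean for the chosen machine.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    # Hypothetical payout probabilities for three machines.
    counts, estimates, total = epsilon_greedy([0.2, 0.5, 0.7])
    print(counts, [round(e, 2) for e in estimates], total)
```

With `epsilon=0.0` the gambler never explores and can stay on a poor machine indefinitely. A small positive `epsilon` trades some immediate reward for information about the other machines.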

Covered in 3 articles

DEV Community

AutoML for Agent Fleets, Without the Vendor Bill

Discussed on DEV

jacquescorbytuech.com

Be This Tall to Ride

Discussed on Hacker News

dwarkesh.com

Eric Jang – Building AlphaGo from scratch

Discussed on Hacker News