Made an interactive explainer about speculative decoding/MTP
Speculative decoding: when LLMs predict their own predictions