Semantic DLM+: Improving Diffusion Language Models through Bias-variance Trade-off in Transition Kernel Design

Diffusion Language Models (DLMs) have demonstrated strong scaling capacity as alternatives to autoregressive language models. However, their performance is highly sensitive to the choice of transition kernels, and poorly designed kernels can lead to issues like training instability, slow convergence, and biased sampling. In this paper, we study this sensitivity through a principled analysis of generalization error and identify three critical fac...
