Utility Boundary of Dataset Distillation: Scaling and Configuration-Coverage Laws
arxiv.org·1h
🧠Machine Learning

Abstract: Dataset distillation (DD) aims to construct compact synthetic datasets that allow models to achieve comparable performance to full-data training while substantially reducing storage and computation. Despite rapid empirical progress, its theoretical foundations remain limited: existing methods (gradient, distribution, trajectory matching) are built on heterogeneous surrogate objectives and optimization assumptions, which makes it difficult to analyze their common principles or provide general guarantees. Moreover, it is still unclear under what conditions distilled data can retain the effectiveness of full datasets when the training configuration, such as optimizer, ar…
