Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation
arxiv.org·2d

Abstract: Reasoning distillation has attracted increasing attention. It typically uses a large teacher model to generate reasoning paths, which are then used to fine-tune a student model so that it mimics the teacher’s behavior in training contexts. However, previous approaches have not analyzed in detail where the distilled model’s capabilities originate. It remains unclear whether the student maintains behavior consistent with the teacher’s in novel test-time contexts or regresses to its original output patterns, which raises concerns about how well distilled models generalize. To investigate this question, we introduce a cross-model Reasoning Distillation …
