The Zero-Step Thinking: An Empirical Study of Mode Selection as Harder Early Exit in Reasoning Models

View PDF HTML (experimental)

Abstract:Reasoning models have demonstrated exceptional performance in tasks such as mathematics and logical reasoning, primarily due to their ability to engage in step-by-step thinking during the reasoning process. However, this often leads to overthinking, resulting in unnecessary computational overhead. To address this issue, Mode Selection aims to automatically decide between Long-CoT (Chain-of-Thought) or Short-CoT by utilizing either a Thinking or NoThinking mode. Simultaneously, Early Exit determines the optimal stopping point during the iterative reasoning process. Both methods seek to reduce the computational burden. In this paper, we first identify Mode Selection as…

View PDF HTML (experimental)

Abstract:Reasoning models have demonstrated exceptional performance in tasks such as mathematics and logical reasoning, primarily due to their ability to engage in step-by-step thinking during the reasoning process. However, this often leads to overthinking, resulting in unnecessary computational overhead. To address this issue, Mode Selection aims to automatically decide between Long-CoT (Chain-of-Thought) or Short-CoT by utilizing either a Thinking or NoThinking mode. Simultaneously, Early Exit determines the optimal stopping point during the iterative reasoning process. Both methods seek to reduce the computational burden. In this paper, we first identify Mode Selection as a more challenging variant of the Early Exit problem, as they share similar objectives but differ in decision timing. While Early Exit focuses on determining the best stopping point for concise reasoning at inference time, Mode Selection must make this decision at the beginning of the reasoning process, relying on pre-defined fake thoughts without engaging in an explicit reasoning process, referred to as zero-step thinking. Through empirical studies on nine baselines, we observe that prompt-based approaches often fail due to their limited classification capabilities when provided with minimal hand-crafted information. In contrast, approaches that leverage internal information generally perform better across most scenarios but still exhibit issues with stability. Our findings indicate that existing methods relying solely on the information provided by models are insufficient for effectively addressing Mode Selection in scenarios with limited information, highlighting the ongoing challenges of this task. Our code is available at this https URL.


Comments:	Accepted by NeurIPS’25 Efficient Reasoning Workshop
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2510.19176 [cs.AI]
	(or arXiv:2510.19176v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2510.19176 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yuqiao Tan [view email] [v1] Wed, 22 Oct 2025 02:28:10 UTC (2,358 KB)

Submission history

Similar Posts