Capturing Classic Authorial Style in Long-Form Story Generation with GRPO Fine-Tuning

Abstract: Recent advances in large language models (LLMs) show impressive performance in open-ended story generation, but fine-grained stylistic control remains limited. Existing methods often rely on shallow cues (e.g., names or topics) to simulate authorial style, without robust evaluation. In this work, we present a training framework for style-conditioned story generation using Group Relative Policy Optimization (GRPO) and a custom multi-reward setup. The style reward is derived from a fine-tuned sentence transformer using authorship verification (AV) signals, combined with content and completeness scores to stabilize long-form narrative generation. We conduct experiments using fiction by Mark Twain, a prominent 19th-century American author, with The Adventures of Huckleberry Finn serving as the reference style exemplar. Our 8B model outperforms larger baselines such as GPT-4o and Claude Sonnet 4 in AV-style metrics, achieving a style score of 0.628 and competitive content quality. Results demonstrate the feasibility of agentic stylistic generation with moderate model size and task-specific training. While the output is clearly style-aligned, narrative completeness remains a challenge, indicating future work is needed to better model global coherence and story resolution.
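The multi-reward GRPO setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal default weights, the weighted-average combination, and the use of the population standard deviation are all assumptions made for illustration, since the abstract does not state how the style, content, and completeness scores are combined. The sketch only shows the general idea of scoring a group of sampled stories and normalizing each reward against its group.

```python
from statistics import mean, pstdev


def combine_rewards(style, content, completeness, weights=(1.0, 1.0, 1.0)):
    """Weighted average of three per-story scores.

    The weights are hypothetical; the abstract does not specify them.
    """
    w_style, w_content, w_complete = weights
    total = w_style + w_content + w_complete
    return (w_style * style + w_content * content + w_complete * completeness) / total


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    This is the group-relative step that gives GRPO its name: every sampled
    completion for the same prompt is compared with its siblings instead of
    with a learned value baseline. Population std is an assumption here.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical (style, content, completeness) scores for four sampled stories.
group_scores = [
    (0.62, 0.70, 0.40),
    (0.35, 0.80, 0.90),
    (0.58, 0.55, 0.20),
    (0.10, 0.60, 0.75),
]
rewards = [combine_rewards(*scores) for scores in group_scores]
advantages = group_relative_advantages(rewards)
```

Stories whose combined reward is above the group mean receive a positive advantage and are reinforced; those below the mean are pushed down. Raising the style weight in this sketch would favor style-aligned stories at the expense of the other two signals.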


Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2512.05747 [cs.CL]
	(or arXiv:2512.05747v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2512.05747 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Jinlong Liu [v1] Fri, 5 Dec 2025 14:29:27 UTC (843 KB)
