Identifying the Periodicity of Information in Natural Language

Abstract: Recent theoretical advances in the study of information density in natural language have raised the following question: to what degree does natural language exhibit periodic patterns in its encoded information? We address this question by introducing a new method called AutoPeriod of Surprisal (APS). APS adopts a canonical periodicity detection algorithm and can identify any significant periods that exist in the surprisal sequence of a single document. Applying the algorithm to a set of corpora yields two main results. First, a considerable proportion of human language demonstrates a strong pattern of periodicity in information. Second, new periods that fall outside the distributions of typical structural units in text (e.g., sentence boundaries, elementary discourse units) are found and further confirmed via harmonic regression modeling. We conclude that the periodicity of information in language is a joint outcome of both structured factors and other driving factors that take effect at longer distances. The advantages of our periodicity detection method and its potential for detecting LLM-generated text are further discussed.
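The abstract does not spell out the detection algorithm, so the following is only a minimal sketch of one plausible reading: periodogram peaks are taken as candidate periods, a permutation test sets the significance threshold, and each candidate is kept only if it lands on a peak of the autocorrelation function. The function names, the parameters (`n_perm`, `quantile`), and the synthetic "surprisal" sequence are all assumptions made for illustration, not the authors' implementation or data.

```python
import numpy as np


def periodogram_power(x):
    """Power at each non-negative FFT frequency of the mean-centred sequence."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.abs(np.fft.rfft(x)) ** 2 / len(x)


def autocorrelation(x):
    """Normalised autocorrelation for lags 0..n-1, via a zero-padded FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    spectrum = np.fft.rfft(x, 2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    return acf / acf[0]


def detect_periods(x, n_perm=100, quantile=0.99, seed=0):
    """Return candidate periods that pass both checks (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    power = periodogram_power(x)

    # Shuffling destroys temporal structure, so the largest power seen in
    # shuffled copies approximates what noise alone can produce.
    max_powers = [periodogram_power(rng.permutation(x))[1:].max()
                  for _ in range(n_perm)]
    threshold = np.quantile(max_powers, quantile)

    acf = autocorrelation(x)
    periods = set()
    for k in np.flatnonzero(power[1:] > threshold) + 1:
        # FFT bin k covers periods between n/(k+1) and n/(k-1).
        lo = int(np.ceil(n / (k + 1)))
        hi = int(np.floor(n / (k - 1))) if k > 1 else n - 2
        lo, hi = max(lo, 2), min(hi, n // 2)
        if lo > hi:
            continue
        best = lo + int(np.argmax(acf[lo:hi + 1]))
        # Keep the candidate only if it sits on a positive local ACF peak.
        if acf[best] > 0 and acf[best] >= acf[best - 1] and acf[best] >= acf[best + 1]:
            periods.add(best)
    return sorted(periods)


def synthetic_surprisal(n=400, period=20, noise=0.3, seed=1):
    """Toy stand-in for a per-token surprisal sequence with one planted period."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    return 5.0 + 2.0 * np.sin(2 * np.pi * t / period) + rng.normal(0.0, noise, n)


if __name__ == "__main__":
    print(detect_periods(synthetic_surprisal()))
```

In practice the input would be a sequence of per-token surprisal values from a language model rather than a synthetic sine wave. The two-stage design is a common choice because the periodogram alone tends to produce false positives and has coarse resolution at long periods, while the autocorrelation check refines and filters the candidates.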


Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2510.27241 [cs.CL]
	(or arXiv:2510.27241v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2510.27241 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yulin Ou
[v1] Fri, 31 Oct 2025 07:10:30 UTC (531 KB)
