Abstract:Advances in artificial intelligence (AI) and deep learning have improved diagnostic capabilities in healthcare, yet limited interpretability continues to hinder clinical adoption. Schizophrenia, a complex disorder with diverse symptoms including disorganized speech and social withdrawal, demands tools that capture symptom severity and provide clinically meaningful insights beyond binary diagnosis. Here, we present an interpretable framework that leverages articulatory speech features through eigenspectra difference plots and a weighted sum with exponential decay (WSED) to quantify vocal tract coordination. Eigenspectra plots effectively distinguished complex from sim…
Abstract:Advances in artificial intelligence (AI) and deep learning have improved diagnostic capabilities in healthcare, yet limited interpretability continues to hinder clinical adoption. Schizophrenia, a complex disorder with diverse symptoms including disorganized speech and social withdrawal, demands tools that capture symptom severity and provide clinically meaningful insights beyond binary diagnosis. Here, we present an interpretable framework that leverages articulatory speech features through eigenspectra difference plots and a weighted sum with exponential decay (WSED) to quantify vocal tract coordination. Eigenspectra plots effectively distinguished complex from simpler coordination patterns, and WSED scores reliably separated these groups, with ambiguity confined to a narrow range near zero. Importantly, WSED scores correlated not only with overall BPRS severity but also with the balance between positive and negative symptoms, reflecting more complex coordination in subjects with pronounced positive symptoms and the opposite trend for stronger negative symptoms. This approach offers a transparent, severity-sensitive biomarker for schizophrenia, advancing the potential for clinically interpretable speech-based assessment tools.
| Comments: | Submitted to ICASSP 2026 |
| Subjects: | Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Signal Processing (eess.SP) |
| Cite as: | arXiv:2511.03084 [eess.AS] |
| (or arXiv:2511.03084v1 [eess.AS] for this version) | |
| https://doi.org/10.48550/arXiv.2511.03084 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Gowtham Premananth [view email] [v1] Wed, 5 Nov 2025 00:19:05 UTC (145 KB)