Abstract: We introduce O-EENC-SD, an end-to-end online speaker diarization system based on EEND-EDA that features a novel RNN-based stitching mechanism for online prediction. In particular, we develop a novel centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. Our system offers two key advantages over existing methods: it is hyperparameter-free, unlike unsupervised clustering approaches, and it is more efficient than current online end-to-end methods, which are computationally costly. We demonstrate that O-EENC-SD is competitive with the state of the art in the two-speaker conversational telephone speech domain, as tested on the CallHome dataset. Our results show that O-EENC-SD provides a favorable trade-off between diarization error rate (DER) and complexity, even when working on independent chunks with no overlap, which makes the system extremely efficient.
| Subjects: | Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP) |
| Cite as: | arXiv:2512.15229 [cs.LG] |
| | (or arXiv:2512.15229v1 [cs.LG] for this version) |
| | https://doi.org/10.48550/arXiv.2512.15229 (arXiv-issued DOI via DataCite, pending registration) |
| Journal reference: | IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr 2025, Hyderabad, India |
Submission history
From: Elio Gruttadauria [via CCSD proxy]
[v1] Wed, 17 Dec 2025 09:27:23 UTC (4,963 KB)