arXiv:2601.12153v1 Announce Type: cross Abstract: Automatic Singing Assessment and Singing Information Processing have evolved over the past three decades to support singing pedagogy, performance analysis, and vocal training. While the first approach objectively evaluates a singer’s performance through computational metrics ranging from real-time visual feedback and acoustical biofeedback to sophisticated pitch tracking and spectral analysis, the latter method compares a predictor vocal signal with a target reference to capture nuanced data embedded in the singing voice. Notable advancements include the development of interactive systems that have significantly improved real-time visual feedback, and the integration of machine learning and deep neural network architectures that enhance the…
arXiv:2601.12153v1 Announce Type: cross Abstract: Automatic Singing Assessment and Singing Information Processing have evolved over the past three decades to support singing pedagogy, performance analysis, and vocal training. While the first approach objectively evaluates a singer’s performance through computational metrics ranging from real-time visual feedback and acoustical biofeedback to sophisticated pitch tracking and spectral analysis, the latter method compares a predictor vocal signal with a target reference to capture nuanced data embedded in the singing voice. Notable advancements include the development of interactive systems that have significantly improved real-time visual feedback, and the integration of machine learning and deep neural network architectures that enhance the precision of vocal signal processing. This survey critically examines the literature to map the historical evolution of these technologies, while identifying and discussing key gaps. The analysis reveals persistent challenges, such as the lack of standardized evaluation frameworks, difficulties in reliably separating vocal signals from various noise sources, and the underutilization of advanced digital signal processing and artificial intelligence methodologies for capturing artistic expressivity. By detailing these limitations and the corresponding technological advances, this review demonstrates how addressing these issues can bridge the gap between objective computational assessments and subjective human-like evaluations of singing performance, ultimately enhancing both the technical accuracy and pedagogical relevance of automated singing evaluation systems.