Abstract: Subjective video quality assessment is crucial for optimizing streaming and compression, yet traditional protocols are limited in their ability to capture nuanced perceptual differences and to ensure reliable user input. We propose an integrated framework that enhances rater training, enforces attention through real-time scoring, and streamlines pairwise comparisons to recover quality scores with fewer comparisons. Participants first complete an automated training quiz to learn key video quality indicators (e.g., compression artifacts) and verify their readiness. During the test, a real-time attention scoring mechanism, using "golden" video pairs, monitors and reinforces rater focus by applying penalties for lapses. An efficient chain-based pairwise comparison procedure is then employed, yielding quality scores in Just-Objectionable-Differences (JOD) units. Experiments with 80 participants comparing three groups (no training, training without feedback, and training with feedback) demonstrate that the training quiz significantly improves data quality, raising golden unit accuracy and reducing the tie rate, while real-time feedback further improves data quality and yields the most monotonic quality ratings. The new three-phase approach (training, quiz, and testing with feedback) can significantly reduce non-monotonic cases in the high-quality part of the rate-quality (R-Q) curve, where typical viewers tend to prefer slightly compressed, less grainy content, and can help train better objective video quality metrics.
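To make the idea of chain-based scoring in JOD units concrete, the following is a minimal sketch and not the authors' procedure, whose details are not given in the abstract. It assumes a Thurstone Case V model, the common convention that a difference of 1 JOD corresponds to 75% of raters preferring one condition, ties split evenly between the two options, and a simple chain in which only adjacent conditions are compared. The function names `jod_difference` and `chain_scores` are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()
# Assumed convention: 1 JOD corresponds to a 75% preference probability.
_JOD_UNIT = _STD_NORMAL.inv_cdf(0.75)


def jod_difference(wins_b: int, wins_a: int, ties: int = 0) -> float:
    """Estimate how many JODs condition B sits above condition A.

    Ties are split evenly between A and B (an assumption of this sketch).
    """
    total = wins_a + wins_b + ties
    if total == 0:
        raise ValueError("at least one comparison is required")
    p_b = (wins_b + 0.5 * ties) / total
    # Clamp away from 0 and 1 so the inverse CDF stays finite.
    eps = 0.5 / total
    p_b = min(max(p_b, eps), 1.0 - eps)
    return _STD_NORMAL.inv_cdf(p_b) / _JOD_UNIT


def chain_scores(adjacent_counts):
    """Accumulate JOD scores along a chain of adjacent comparisons.

    adjacent_counts[i] is (wins_next, wins_prev, ties) for the pair
    (condition i, condition i + 1). Condition 0 is anchored at 0 JOD.
    """
    scores = [0.0]
    for wins_next, wins_prev, ties in adjacent_counts:
        scores.append(scores[-1] + jod_difference(wins_next, wins_prev, ties))
    return scores
```

With n conditions, such a chain needs only n - 1 compared pairs instead of the n(n - 1)/2 pairs of a full design, which is one way a protocol can recover a scale with fewer comparisons. The trade-off is that estimation errors accumulate along the chain.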
| Comments: | Accepted at 5th Workshop on Image/Video/Audio Quality Assessment in Computer Vision, VLM and Diffusion Model (WVAQ), at IEEE/CVF WACV 2026 |
| Subjects: | Multimedia (cs.MM) |
| Cite as: | arXiv:2601.04184 [cs.MM] |
| | (or arXiv:2601.04184v1 [cs.MM] for this version) |
| | https://doi.org/10.48550/arXiv.2601.04184 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Kumar Rahul
[v1] Wed, 7 Jan 2026 18:51:23 UTC (2,163 KB)