EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration
arxiv.org·18h

Abstract: Visual Emotion Comprehension (VEC) aims to infer sentiment polarities or emotion categories from affective cues embedded in images. In recent years, Multimodal Large Language Models (MLLMs) have established a popular paradigm in VEC, leveraging their generalizability to unify VEC tasks defined under diverse emotion taxonomies. While this paradigm achieves notable success, it typically formulates VEC as a deterministic task, requiring the model to output a single, definitive emotion label for each image. Such a formulation insufficiently accounts for the inherent subjectivity of emotion perception, overlooking alternative interpretations that may be equally plausible …
