Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild
arxiv.org·2d

Abstract: Generative psychological analysis of in-the-wild conversations faces two fundamental challenges: (1) existing Vision-Language Models (VLMs) fail to resolve Articulatory-Affective Ambiguity, where visual patterns of speech mimic emotional expressions; and (2) progress is stifled by a lack of verifiable evaluation metrics capable of assessing visual grounding and reasoning depth. We propose a complete ecosystem to address these twin challenges. First, we introduce Multilevel Insight Network for Disentanglement (MIND), a novel hierarchical visual encoder that introduces a Status Judgment module to algorithmically suppress ambiguous lip features based on their temporal…
