UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning
arxiv.org·1h
🧮Vector Embeddings
VideoLucy: Deep Memory Backtracking for Long Video Understanding
arxiv.org·1d
🧠Learned Codecs
Towards Region-aware Bias Evaluation Metrics
arxiv.org·1h
🎛️Feed Filtering
OmniGaze: Reward-inspired Generalizable Gaze Estimation In The Wild
arxiv.org·1h
🏛Digital humanities
Information-Theoretic Criteria for Knowledge Distillation in Multimodal Learning
arxiv.org·1h
🧠Machine Learning
Unlocking Musical DNA: Seeing Music Through Movement by Arvind Sundararajan
🎼Computational Musicology
MAPS: Masked Attribution-based Probing of Strategies- A computational framework to align human and model explanations
arxiv.org·1d
🧠Machine Learning
Progressive multi-fidelity learning for physical system predictions
arxiv.org·1h
📊Quantization
Multi-Head Latent Attention
💻Local LLMs
Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos
arxiv.org·2d
🧠Learned Codecs
CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models
arxiv.org·1h
💻Programming languages
FedGTEA: Federated Class-Incremental Learning with Gaussian Task Embedding and Alignment
arxiv.org·1h
🧠Machine Learning
Reasoning as Representation: Rethinking Visual Reinforcement Learning in Image Quality Assessment
arxiv.org·2d
🧠Machine Learning
State Space Prompting via Gathering and Spreading Spatio-Temporal Information for Video Understanding
arxiv.org·1d
🌀Differential Geometry
Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests
arxiv.org·1h
🧭Content Discovery