TAR-TVG: Enhancing VLMs with Timestamp Anchor-Constrained Reasoning for Temporal Video Grounding
arxiv.org·3d
CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery Localization
arxiv.org·3d
Natural Language-Driven Viewpoint Navigation for Volume Exploration via Semantic Block Representation
arxiv.org·3d
LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation
arxiv.org·3d
Inclusive Employment Pathways: Career Success Factors for Autistic Individuals in Software Engineering
arxiv.org·1d
Dynamic Pattern Alignment Learning for Pretraining Lightweight Human-Centric Vision Models
arxiv.org·3d
MIND: A Noise-Adaptive Denoising Framework for Medical Images Integrating Multi-Scale Transformer
arxiv.org·3d