Neural Recognition, Document AI, Layout Analysis, Multi-modal Processing
Identifying Signatures of Image Phenotypes to Track Treatment Response in Liver Disease
arxiv.org · 13h
NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation Models
arxiv.org · 1d
MoSAiC: Multi-Modal Multi-Label Supervision-Aware Contrastive Learning for Remote Sensing
arxiv.org · 3d
Spatial ModernBERT: Spatial-Aware Transformer for Table and Key-Value Extraction in Financial Documents at Scale
arxiv.org · 2d
Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities
arxiv.org · 1d