Learned Compression, Deep Learning, Rate-Distortion, Entropy Models
OwlCap: Harmonizing Motion-Detail for Video Captioning via HMD-270K and Caption Set Equivalence Reward
arxiv.org·2d
Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning
arxiv.org·2d