Incremental OCR, Real-time Processing, Character Recognition Pipelines, Text Flows
Cozette
github.com·1d
Constrained Prompt Enhancement for Improving Zero-Shot Generalization of Vision-Language Models
arxiv.org·15m
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
arxiv.org·15m
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
arxiv.org·15m
Benchmarking Class Activation Map Methods for Explainable Brain Hemorrhage Classification on Hemorica Dataset
arxiv.org·15m
ISALux: Illumination and Segmentation Aware Transformer Employing Mixture of Experts for Low Light Image Enhancement
arxiv.org·15m