Neural Recognition, Document AI, Layout Analysis, Multi-modal Processing
Multi-modal Representations for Fine-grained Multi-label Critical View of Safety Recognition
arxiv.org·22h
Taming the Tri-Space Tension: ARC-Guided Hallucination Modeling and Control for Text-to-Image Generation
arxiv.org·22h
An HTR-LLM Workflow for High-Accuracy Transcription and Analysis of Abbreviated Latin Court Hand
arxiv.org·22h
Mimesis, Poiesis, and Imagination: Exploring Text-to-Image Generation of Biblical Narratives
arxiv.org·22h
Distilling High Diagnostic Value Patches for Whole Slide Image Classification Using Attention Mechanism
arxiv.org·22h
ReCAP: Recursive Cross Attention Network for Pseudo-Label Generation in Robotic Surgical Skill Assessment
arxiv.org·22h