Do We Still Need OCR?
🤖Advanced OCR
CT-CLIP: A Multi-modal Fusion Framework for Robust Apple Leaf Disease Recognition in Complex Environments
arxiv.org·1d
🤖Advanced OCR
Scripts That Don’t Fit: The Hidden Bias of NLP in South Asian Languages
digitalorientalist.com·7h
🏛Digital humanities
DeepSeek-OCR: Images Simplify Text for Large Language Models
heise.de·4d
🤖Advanced OCR
DeepOCR – Permanently free, multi-scenario OCR for receipts, docs, handwriting
🤖Advanced OCR
Precise classification of low quality G-banded Chromosome Images by reliability metrics and data pruning classifier
arxiv.org·16h
🕳️Persistent Homology
Long-tailed Species Recognition in the NACTI Wildlife Dataset
arxiv.org·1d
🤖Advanced OCR
Morphology-Aware KOA Classification: Integrating Graph Priors with Vision Models
arxiv.org·16h
🌀Riemannian Computing
CURVETE: Curriculum Learning and Progressive Self-supervised Training for Medical Image Classification
arxiv.org·16h
🌀Differential Geometry
Accurate and Scalable Multimodal Pathology Retrieval via Attentive Vision-Language Alignment
arxiv.org·16h
🧮Vector Embeddings
"the densest and longest lasting human readable information storage media"
🌡️Preservation Physics
MedXplain-VQA: Multi-Component Explainable Medical Visual Question Answering
arxiv.org·16h
🏛Digital humanities
Surface Reading LLMs: Synthetic Text and its Styles
arxiv.org·16h
🔤Font Archaeology
Scanner-Agnostic MRI Harmonization via SSIM-Guided Disentanglement
arxiv.org·16h
🌀Riemannian Computing
Mitigating Coordinate Prediction Bias from Positional Encoding Failures
arxiv.org·16h
🚀SIMD Text Processing
A Multimodal, Multitask System for Generating E Commerce Text Listings from Images
arxiv.org·16h
🤖Advanced OCR