Neural Recognition, Document AI, Layout Analysis, Multi-modal Processing
AprilRobotics/apriltag
github.com·3d
What is a large language model?
proton.me·1d
TTF-VLA: Temporal Token Fusion via Pixel-Attention Integration for Vision-Language-Action Models
arxiv.org·3d
UTA-Sign: Unsupervised Thermal Video Augmentation via Event-Assisted Traffic Signage Sketching
arxiv.org·2d