A Picture is Worth a Thousand (Correct) Captions: A Vision-Guided Judge-Corrector System for Multimodal Machine Translation
arxiv.org·23m
🤖AI
Argus: Quality-Aware High-Throughput Text-to-Image Inference Serving System
arxiv.org·23m
🤖AI
Multi-Scale Feature Fusion and Graph Neural Network Integration for Text Classification with Large Language Models
arxiv.org·23m
🤖AI
LG-NuSegHop: A Local-to-Global Self-Supervised Pipeline For Nuclei Instance Segmentation
arxiv.org·1d
🤖AI
Hierarchical Spatial-Frequency Aggregation for Spectral Deconvolution Imaging
arxiv.org·23m
🤖AI
Detecting Logo Similarity: Combining AI Embeddings with Fourier Descriptors
dev.to·1d·Discuss: DEV
🤖AI
Machine learning automates material analysis and design using X-ray spectroscopy data
phys.org·19h
🤖AI
Automatic segmentation of colorectal liver metastases for ultrasound-based navigated resection
arxiv.org·1d
⚡HTMX
CSGaze: Context-aware Social Gaze Prediction
arxiv.org·23m
🤖AI
ROAR: Robust Accident Recognition and Anticipation for Autonomous Driving
arxiv.org·23m
🤖AI
Fuzzy Label: From Concept to Its Application in Label Learning
arxiv.org·23m
🤖AI
Real-Time Bundle Adjustment for Ultra-High-Resolution UAV Imagery Using Adaptive Patch-Based Feature Tracking
arxiv.org·23m
🤖AI
Validating Vision Transformers for Otoscopy: Performance and Data-Leakage Effects
arxiv.org·1d
🤖AI
SurgiATM: A Physics-Guided Plug-and-Play Model for Deep Learning-Based Smoke Removal in Laparoscopic Surgery
arxiv.org·1d
🤖AI
NeuroBridge: Bio-Inspired Self-Supervised EEG-to-Image Decoding via Cognitive Priors and Bidirectional Semantic Alignment
arxiv.org·23m
🤖AI
Predicting Pedestrian Intent with Spatiotemporal Graph Neural Networks for Enhanced AEB Systems
dev.to·1d·Discuss: DEV
🤖AI
CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
arxiv.org·23m
🔷.NET