Image Processing, Computer Vision Libraries, Real-time Processing, Object Detection

OSMGen: Highly Controllable Satellite Image Synthesis using OpenStreetMap Data
arxiv.org · 8h
OpenAI
Deep Neural Watermarking for Robust Copyright Protection in 3D Point Clouds
arxiv.org · 1d
Point Cloud Processing
Performance evaluation of image convolution with gradient filters in OpenCL
milania.de · 6d · Discuss: Hacker News
NumPy
FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model
paperium.net · 1d · Discuss: DEV
Grad-CAM
Disciplined Biconvex Programming
arxiv.org · 8h
NumPy
NeuraSnip: A Local Semantic Image Search Engine
github.com · 1d · Discuss: r/opensource
OpenAI
VLM6D: VLM based 6Dof Pose Estimation based on RGB-D Images
arxiv.org · 8h
Geometric Learning
Deep Generative Models for Enhanced Vitreous OCT Imaging
arxiv.org · 8h
Vision Transformers
A Hybrid Deep Learning and Forensic Approach for Robust Deepfake Detection
arxiv.org · 1d
Grad-CAM
flowengineR: A Modular and Extensible Framework for Fair and Reproducible Workflow Design in R
arxiv.org · 8h
OpenAI
FedMGP: Personalized Federated Learning with Multi-Group Text-Visual Prompts
arxiv.org · 8h
OpenAI
Bridging Vision, Language, and Mathematics: Pictographic Character Reconstruction with Bézier Curves
arxiv.org · 8h
Geometric Learning
GeneFlow: Translation of Single-cell Gene Expression to Histopathological Images via Rectified Flow
arxiv.org · 8h
Altair
Investigating Label Bias and Representational Sources of Age-Related Disparities in Medical Segmentation
arxiv.org · 8h
Grad-CAM
Efficient Curvature-aware Graph Network
arxiv.org · 8h
Geometric Learning
FLoC: Facility Location-Based Efficient Visual Token Compression for Long Video Understanding
arxiv.org · 8h
Bokeh
Building a Multimodal RAG That Responds with Text, Images, and Tables from Sources
towardsdatascience.com · 17h
OpenAI