Vision-Language Models
Specific
VLM, CLIP, image-text, grounding
Scoured 298 posts in 11.7 ms
An Effective Router for Vision-Language Model Selection
🎭 Multimodal AI · Content type: Academic · arxiv.org · 2 days ago
openpilot 0.11.1
🎭 Multimodal AI · Content type: Blog · blog.comma.ai · 6 days ago
Less-relevant results
dimitrisdimitrov5-blip/Phantomix: The open-source AI browser agent. Free alternative to OpenAI Operator.
🎭 Multimodal AI · Content type: Code · github.com · 4 hours ago · Hacker News
A generalist biomedical vision-language model via multi-CLIP knowledge distillation
🎭 Multimodal AI · Content type: Academic · nature.com · 1 day ago
Disquiet Junto Project 0754: The Blip
🎭 Multimodal AI · llllllll.co · 13 hours ago
OpenCV 5.0 Released With Rewritten DNN Engine, Built-In LLM & VLM Support
🎭 Multimodal AI · phoronix.com · 4 days ago · Hacker News
Can robots read the room?
🎭 Multimodal AI · Content type: News, Academic · news.cornell.edu · 1 day ago
RoboHack AI CTF (Robotic Hacking Community at DEFCON 34)
🎭 Multimodal AI · ctftime.org · 20 hours ago
OpenCV 5 release - New DNN engine with enhanced ONNX and LLM/VLM support, Intel, Arm, and RISC-V hardware optimizations - CNX Software
🎭 Multimodal AI · Content type: News · cnx-software.com · 1 day ago
AVIS: Adaptive Test-Time Scaling for Vision-Language Models
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
local llm on laptop 780M GPU using llama + gemma 4 qat
🖥️ wgpu · Content type: Blog · alper.bearblog.dev · 5 days ago
OpenCV 5.0 Computer Vision Library Released with Rewritten DNN Engine
🎭 Multimodal AI · linuxiac.com · 2 days ago
OpenCV 5 Debuts with Improved ONNX Support and Native AI Upgrades
🎭 Multimodal AI · Content type: News · hackster.io · 20 hours ago
NVlabs/Eagle: Eagle: Frontier Vision-Language Models with Data-Centric Strategies
🎭 Multimodal AI · Content type: Code · github.com · 5 days ago
4DP-QA: Scalable QA for 4D Perception in Vision Language Models
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
Sale Sharks: Blip or decline after season of strife for Prem club?
🎭 Multimodal AI · Content type: News · bbc.com · 1 day ago
OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision
🎭 Multimodal AI · opencv.org · 5 days ago · Hacker News
Adapting Vision-Language Models from Iconic to Inclusive for Multi-Label Recognition Without Labels
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
ApertureLab · Synthetic Aperture Sonar Simulator
🎭 Multimodal AI · gergltd.com · 14 hours ago · Hacker News
Are Reasoning Vision-Language Models Robust to Semantic Visual Distractions?
🎭 Multimodal AI · Content type: Academic · arxiv.org · 2 days ago