Scour
🖼️ Multimodal AI
vision-language models, VLM, image-text, multimodal learning
Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models
👁️ Computer Vision · Content type: Academic · arxiv.org · 2 days ago
NVlabs/Eagle: Eagle: Frontier Vision-Language Models with Data-Centric Strategies
🔍 Fine-Grained Classification · Content type: Code · github.com · 5 days ago
Mi50 32GB / GFX906 - vLLM Qwen 3.5 Configuration for Qwen 3.5:9B AWQ-4bit
🧠 LLMs · huggingface.co · 54 minutes ago · r/LocalLLaMA
A generalist biomedical vision-language model via multi-CLIP knowledge distillation
🏷️ Label Noise · Content type: Academic · nature.com · 1 day ago
SpaceX IPO hype is massive — and especially dangerous for investors over 50
🛰️ Geospatial AI · marketwatch.com · 11 hours ago
Vibe Coding Specificity Foundation Models
🧠 LLMs · Content type: Academic · biorxiv.org · 6 days ago
A new chapter of efficient foundation models for medical imaging
⚙️ MLOps · techcommunity.microsoft.com · 1 day ago
OpenCV 5.0 Released With Rewritten DNN Engine, Built-In LLM & VLM Support
👁️ Computer Vision · phoronix.com · 5 days ago · Hacker News
Can robots read the room?
🧠 LLMs · Content type: News, Academic · news.cornell.edu · 2 days ago
OpenCV 5 release - New DNN engine with enhanced ONNX and LLM/VLM support, Intel, Arm, and RISC-V hardware optimizations - CNX Software
👁️ Computer Vision · Content type: News · cnx-software.com · 1 day ago
openpilot 0.11.1
✍️ Prompt Engineering · Content type: Blog · blog.comma.ai · 6 days ago
NVIDIA's Cosmos 3: The World's First Fully Open AI Omnimodel
🤖 AI Agents · Content type: News · aimagazine.com · 2 days ago
Adapting Vision-Language Models from Iconic to Inclusive for Multi-Label Recognition Without Labels
🏷️ Label Noise · Content type: Academic · arxiv.org · 19 hours ago
ApertureLab · Synthetic Aperture Sonar Simulator
👁️ Computer Vision · gergltd.com · 1 day ago · Hacker News
OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision
👁️ Computer Vision · opencv.org · 6 days ago · Hacker News
Disquiet Junto Project 0754: The Blip
✍️ Prompt Engineering · llllllll.co · 1 day ago
Apple Reveals New AI Architecture Built Around Google Gemini Models
🛰️ Geospatial AI · Content type: News · macrumors.com · 3 days ago · Hacker News
dimitrisdimitrov5-blip/Phantomix: The open-source AI browser agent. Free alternative to OpenAI Operator.
🔌 MCP · Content type: Code · github.com · 16 hours ago · Hacker News
Mbodi AI (YC P25) Is Hiring Founding Machine Learning Engineer (Robotics)
⚙️ MLOps · ycombinator.com · 5 days ago · Hacker News
MSUE: Multi-Modal Soccer Understanding Expert
🔍 Fine-Grained Classification · Content type: Academic · arxiv.org · 19 hours ago