Vision-Language Models
Specific
VLM, CLIP, image-text, grounding
Scoured 298 posts in 11.4 ms
From Prompts to Tokens: Internalizing Causal Supervision in Vision-Language Model for Multi-Image Causal Reasoning
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
Less-relevant results
Dollar exchange is gradually gaining ground. Will continue to rise?
🎭 Multimodal AI · qcostarica.com · 6 days ago
mtmd : add video input support by ngxson · Pull Request #24269 · ggml-org/llama.cpp
🖥️ wgpu · Content type: Code · github.com · 2 days ago · r/LocalLLaMA
The Last Visible Pixel: Probing Fine-Scale Perception in Vision-Language Models
🎭 Multimodal AI · Content type: Academic · arxiv.org · 2 days ago
Vibe Coding Specificity Foundation Models
🎭 Multimodal AI · Content type: Academic · biorxiv.org · 6 days ago
World Model Self-Distillation: Training World Models to Solve General Tasks
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
Cadillac's updated Lyriq costs $200 more but unlocks Tesla charging and keeps CarPlay alive
🎭 Multimodal AI · thecooldown.com · 2 days ago
NVIDIA's Cosmos 3: The World's First Fully Open AI Omnimodel
🎭 Multimodal AI · Content type: News · aimagazine.com · 1 day ago
MSUE: Multi-Modal Soccer Understanding Expert
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
Why SanDisk Stock Is Sinking Today
🎭 Multimodal AI · Content type: News · fool.com · 5 days ago
Vision Language Model Helps Private Information De-Identification in Vision Data
🎭 Multimodal AI · Content type: Academic · arxiv.org · 2 days ago
Green growth claims are overstated – our study shows three reasons why
🎭 Multimodal AI · theconversation.com · 5 days ago · r/Economics
Harnessing Streaming Video in the Wild
🎭 Multimodal AI · Content type: Academic · arxiv.org · 2 days ago
A new chapter of efficient foundation models for medical imaging
🔌 Semiconductors · techcommunity.microsoft.com · 21 hours ago
Costa Rica watches the dollar climb after four years of a rising colón
🎭 Multimodal AI · ticotimes.net · 1 day ago
AutoMine Solution for AV2 2026 Scenario Mining Challenge
🎭 Multimodal AI · Content type: Academic · arxiv.org · 7 hours ago
After Layoffs, Pinterest Plans to Invest $4 Billion in Amazon’s AI chips
🎭 Multimodal AI · wwd.com · 6 days ago
AgenticNav: Zero-Shot Vision-and-Language Navigation as a Tool-Calling Harness
🎭 Multimodal AI · Content type: Academic · arxiv.org · 1 day ago
AI-Based Medication Monitoring, Subreddit Spam, Chipotle Chatbots, More: ResearchBuzz AI Update, June 6, 2026
🎭 Multimodal AI · researchbuzz.me · 4 days ago
Do VLMs See What Sensors Feel? A Scalable Expert-Guided Design for Wheelchair Accessibility Assessment from Street View
🎭 Multimodal AI · Content type: Academic · arxiv.org · 2 days ago