Multimodal AI

BioCoach uses AI and biomechanics to give real-time exercise feedback at home

 💻AI Coding

mtmd : add video input support by ngxson · Pull Request #24269 · ggml-org/llama.cpp

 🗄️Vector Databases  Content type: Code
github.com · r/LocalLLaMA

Florida's lawsuit against OpenAI and CEO Altman treats ChatGPT as a defective product and public nuisance

 ⚖️AI Governance
the-decoder.com·

VL-DINO: Leveraging CLIP Vision-Language Knowledge for Open-Vocabulary Object Detection

 🗄️Vector Databases  Content type: Academic
arxiv.org·

Scott Bessent says America's in a 'manufacturing renaissance' and Wall Street largely agrees. So where are the jobs?

 💎Token Economics
fortune.com·

Are Reasoning Vision-Language Models Robust to Semantic Visual Distractions?

 🧠Reasoning Models  Content type: Academic
arxiv.org·

A Blip on a Telescope in a Colorado Parking Lot Bolstered a Space Mission That Has Found Thousands of Planets … and Counting

 🔌MCP
smithsonianmag.com·

A primordial black hole nicknamed ‘Phoebe’ may help solve the mystery of dark matter

 Inference

OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

 💾Agent Memory  Content type: Academic
arxiv.org·

World Model Self-Distillation: Training World Models to Solve General Tasks

 🎛️Fine-tuning  Content type: Academic
arxiv.org·

fix(sessions): preserve user model override across daily/idle rollove… · openclaw/openclaw@8e81bf7

 🔌MCP  Content type: Code
github.com·

MLingualFC: Evaluating Jailbreak Vulnerabilities in Multilingual Vision-Language Models

 🧠LLMs  Content type: Academic
arxiv.org·

Two Bridges, One Pathway: From VLMs to Generalizable VLAs with Embodied Trajectory-Coupled Data

 🎯Reinforcement Learning  Content type: Academic
arxiv.org·

fix: preserve Foundry Responses reasoning replay ids · openclaw/openclaw@248dfb2

 ✍️Prompt Engineering  Content type: Code
github.com·

MSUE: Multi-Modal Soccer Understanding Expert

 📊Model Evaluation  Content type: Academic
arxiv.org·

The Last Visible Pixel: Probing Fine-Scale Perception in Vision-Language Models

 Inference  Content type: Academic
arxiv.org·

Vision Language Model Helps Private Information De-Identification in Vision Data

 🎛️Fine-tuning  Content type: Academic
arxiv.org·

Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models

 🌐World Models  Content type: Academic
arxiv.org·

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs

 🗄️Vector Databases  Content type: Academic
arxiv.org·

OpenMedReason: Scientific Reasoning Supervision for Medical Vision-Language Models

 🧠Reasoning Models  Content type: Academic
arxiv.org·