I Built an AI-Powered Screenshot Search Engine
igorstechnoclub.com·13h
SKATE, a Scalable Tournament Eval: Weaker LLMs differentiate between stronger ones using verifiable challenges
arxiv.org·17h
Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model
arxiv.org·17h
I tested GPT-5's coding skills, and it was so bad that I'm sticking with GPT-4o (for now)
zdnet.com·8h
Let's talk about the GLM 4.5 models.
threadreaderapp.com·6h