ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding
dev.to·1d·

How AI Learns to Build Web Pages by Seeing Them

Ever wondered how a computer could look at a web page and fix its own code? ReLook makes that possible. Imagine an artist who paints a picture, steps back, looks at the canvas, and then adds the brushstroke it needs. In the same way, this AI system writes a snippet of front-end code, takes a screenshot of the rendered result, and lets a visual critic point out what looks off. The critic is a multimodal language model that understands both text and images, so it can say, “The button is missing” or “The layout is crooked,” and the AI then rewrites the code to address that feedback. By rewarding only screenshots that actually render correctly, the system avoids cheating and keeps getting better, just like a student who only moves on…
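
The write, screenshot, critique, rewrite loop described above can be sketched roughly as follows. This is a minimal illustration, not ReLook's actual implementation: the names `Critique`, `reward`, and `refine`, the 0-to-1 score scale, and the three callables (`generate`, `render`, `critic`) are all hypothetical stand-ins for the code-writing model, a headless browser, and the multimodal critic.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Critique:
    """Hypothetical critic output; the score scale is an assumption."""
    score: float    # visual quality in [0, 1]
    feedback: str   # e.g. "The button is missing"


def reward(screenshot: Optional[bytes], critique: Optional[Critique]) -> float:
    """Give no reward unless the page actually rendered."""
    if screenshot is None or critique is None:
        return 0.0
    return critique.score


def refine(
    task: str,
    generate: Callable[[str, str], str],        # (task, feedback) -> code
    render: Callable[[str], Optional[bytes]],   # code -> screenshot, or None on failure
    critic: Callable[[str, bytes], Critique],   # (task, screenshot) -> critique
    max_rounds: int = 3,
) -> Tuple[str, float]:
    """Generate code, look at the result, and revise using the critic's feedback."""
    feedback = ""
    best_code, best_reward = "", 0.0
    for _ in range(max_rounds):
        code = generate(task, feedback)
        shot = render(code)
        critique = critic(task, shot) if shot is not None else None
        r = reward(shot, critique)
        if r > best_reward:
            best_code, best_reward = code, r
        if r >= 1.0:
            break  # the critic has nothing left to fix
        # Pass the critic's note (or the render failure) into the next attempt.
        feedback = critique.feedback if critique else "The page failed to render."
    return best_code, best_reward
```

In a training setup, the value returned by `reward` would be the signal fed to the reinforcement learning update; gating it on a successful render is what keeps the model from earning credit for code that never produces a visible page.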
