I Watched Gemini Gaslight Itself in Real Time (opens in new tab)
TL;DR: I asked Google's flagship LLM one yes/no question. It said yes, then no, then yes, then no, then yes, then admitted in writing it had been "confidently hallucinating" and had "talked itself out of the correct answer." The pattern has a name — sycophantic capitulation — and it's the failure mode you have to stress-test for before any LLM enters your product. Three prompts at the bottom that will catch it on any model. The question I wanted to know if Gemini had a native Windows app. It ...
Read the original article