Prompt injection in Google Translate reveals base model behaviors behind task-specific fine-tuning

Discussed on Hacker News

Published on February 7, 2026 1:56 PM GMT

tl;dr: Argumate on Tumblr found you can sometimes access the base model behind Google Translate via prompt injection. The result replicates for me, and specific responses indicate that (1) Google Translate is running an instruction-following LLM that self-identifies as such, (2) task-specific fine-tuning (or whatever Google did instead) does not create robust boundaries between "content to process" and "instructions to foll...
