We Tested an AI Agent With Gemini 3 Flash — 67% of Commands Were Unsafe (opens in new tab)

Discussed on Hacker News

We gave Gemini 3 Flash Preview three autonomous agent tasks and let it generate whatever commands it wanted. 67% targeted unsafe endpoints. Check caught all of them.

Read the original article