We Tested an AI Agent With Gemini 3 Flash — 67% of Commands Were Unsafe (opens in new tab)
We gave Gemini 3 Flash Preview three autonomous agent tasks and let it generate whatever commands it wanted. 67% targeted unsafe endpoints. Check caught all of them.
Read the original article