AI research agents would rather make up facts than say "I don't know"
the-decoder.com·14h
🤖AI Curation
Preview
Report Post

Content

A new study from Oppo’s AI team reveals systematic flaws in "deep research" systems designed to automate complex reporting. Nearly 20 percent of errors stem from systems inventing plausible-sounding but entirely fake content.

The researchers analyzed around 1,000 reports using two new evaluation tools: FINDER, a benchmark for deep research tasks, and DEFT, a taxonomy for classifying failures.

To feign competence, one system claimed an investment fund achieved an exact 30.2 percent annual return over 20 years. Since such specific data isn’t public, the AI likely fabricated the figure.

In another test involving scientific papers, a system listed 24 references. A check revealed several links were dead, while others pointed to reviews rather than original research—yet …

Similar Posts

Loading similar posts...