SurveyBench: How Well Can LLM(-Agents) Write Academic Surveys?
arxiv.org

Abstract: Academic survey writing, which distills vast literature into a coherent and insightful narrative, remains a labor-intensive and intellectually demanding task. Recent approaches, such as general DeepResearch agents and survey-specialized methods, can generate surveys automatically (a.k.a. LLM4Survey), but their outputs often fall short of human standards, and no rigorous, reader-aligned benchmark exists for thoroughly revealing their deficiencies. To fill this gap, we propose SurveyBench, a fine-grained, quiz-driven evaluation framework featuring (1) typical survey topics sourced from rece…
