SoMe: A Realistic Benchmark for LLM-based Social Media Agents
arxiv.org·1d

Abstract: Intelligent agents powered by large language models (LLMs) have recently demonstrated impressive capabilities and gained increasing popularity on social media platforms. While LLM agents are reshaping the ecology of social media, their ability to comprehend media content, understand user behaviors, and make intricate decisions has not yet been comprehensively evaluated. To address this gap, we introduce SoMe, a pioneering benchmark designed to evaluate social media agents equipped with various agent tools for accessing and analyzing social media data. SoMe comprises a diverse collection of 8 social media agent tasks, 9,164,284 posts, 6…