foss
The Impact of Critique on LLM-Based Model Generation from Natural Language: The Case of Activity Diagrams
arxiv.org·6d
OpenAI's HealthBench in Action: Evaluating an LLM-Based Medical Assistant on Realistic Clinical Queries
arxiv.org·6d
Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?
arxiv.org·6d