Uncovering Competency Gaps in Large Language Models and Their Benchmarks
arxiv.org·2d

Abstract: The evaluation of large language models (LLMs) relies heavily on standardized benchmarks. These benchmarks provide useful aggregate metrics for a given capability, but aggregation can obscure (i) particular sub-areas where the LLMs are weak ("model gaps") and (ii) imbalanced coverage in the benchmarks themselves ("benchmark gaps"). We propose a new method that uses sparse autoencoders (SAEs) to automatically uncover both types of gaps. By extracting SAE concept activations and computing saliency-weighted performance scores across benchmark data, the method grounds evaluation in the model’s internal representations and enables comparison acro…
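
To make the two kinds of gap concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not give the scoring formula, so the activation-weighted accuracy and the coverage share below are assumptions, and the names `concept_scores` and `concept_coverage` and the toy data are hypothetical. Each row of `activations` stands in for one benchmark example's SAE concept activations, and `correct` marks whether the model answered that example correctly.

```python
from typing import List, Optional


def concept_scores(activations: List[List[float]],
                   correct: List[int]) -> List[Optional[float]]:
    """Activation-weighted accuracy per concept.

    ASSUMED formula (not taken from the paper):
        score_c = sum_i a[i][c] * y[i] / sum_i a[i][c]
    Returns None for a concept that never activates on the benchmark.
    """
    n_concepts = len(activations[0])
    scores: List[Optional[float]] = []
    for c in range(n_concepts):
        weight = sum(row[c] for row in activations)
        if weight == 0:
            scores.append(None)  # concept absent from the benchmark
            continue
        hit = sum(row[c] * y for row, y in zip(activations, correct))
        scores.append(hit / weight)
    return scores


def concept_coverage(activations: List[List[float]]) -> List[float]:
    """Share of total activation mass that each concept receives.

    ASSUMED proxy for benchmark coverage (not taken from the paper).
    """
    n_concepts = len(activations[0])
    total = sum(sum(row) for row in activations)
    if total == 0:
        return [0.0] * n_concepts
    return [sum(row[c] for row in activations) / total
            for c in range(n_concepts)]


# Hypothetical toy data: 4 benchmark examples, 3 SAE concepts.
activations = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
]
correct = [1, 1, 0, 1]  # 1 = model answered the example correctly

scores = concept_scores(activations, correct)
coverage = concept_coverage(activations)
print(scores)
print(coverage)
```

Under these assumptions, a concept with a low score relative to the others would be a candidate model gap, while a concept with zero or near-zero coverage would be a candidate benchmark gap, since the benchmark barely exercises it.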