Partial Identification from LLM Prompts

Large language models are increasingly used as binary classifiers when the true label is latent. We study partial identification of the prevalence $\theta = P(X^* = 1)$ from panels of LLM reports whose errors may be arbitrarily dependent given the truth. The design of replication determines the observable, and hence the identifying content: repeated prompts to one model yield a count, several named models a response vector, and both a response m...
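To make the link between replication design and observable concrete, the following minimal sketch builds the two observables named above from a small panel of binary reports. The data, the model names `model_a` and `model_b`, and all function names are hypothetical illustrations, not taken from the paper. The sketch assumes that the "count" is the number of positive reports among the repeated prompts, and it only tabulates the empirical distribution of each observable; it does not compute any identified set for $\theta$.

```python
from collections import Counter


def count_observable(reports):
    """Repeated prompts to one model: keep only the number of 1-reports.

    Assumption for this sketch: the order of the replicates is ignored.
    """
    return sum(reports)


def vector_observable(reports_by_model, model_order):
    """Several named models: keep the full response vector, in a fixed model order."""
    return tuple(reports_by_model[name] for name in model_order)


def empirical_distribution(observables):
    """Relative frequency of each distinct observable value across items."""
    n = len(observables)
    return {value: freq / n for value, freq in Counter(observables).items()}


# Hypothetical panel: 4 items, each prompted 3 times with the same model.
repeated = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [1, 1, 1]]
count_dist = empirical_distribution([count_observable(r) for r in repeated])

# Hypothetical panel: 4 items, each labeled once by two named models.
named = [
    {"model_a": 1, "model_b": 0},
    {"model_a": 0, "model_b": 1},
    {"model_a": 1, "model_b": 0},
    {"model_a": 1, "model_b": 1},
]
model_order = ["model_a", "model_b"]
vector_dist = empirical_distribution(
    [vector_observable(r, model_order) for r in named]
)

print(count_dist)   # distribution over counts 0..3
print(vector_dist)  # distribution over response vectors in {0,1}^2
```

Under the count design the items `[1, 1, 0]` and `[1, 0, 1]` map to the same observable, whereas under the named-model design the vectors `(1, 0)` and `(0, 1)` stay distinct. This is one way to see why the two designs can carry different identifying content.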
