PLSEMANTICSBENCH: Large Language Models As Programming Language Interpreters
arxiv.org·9h

Abstract: As large language models (LLMs) excel at code reasoning, a natural question arises: can an LLM execute programs (i.e., act as an interpreter) purely based on a programming language’s formal semantics? If so, it will enable rapid prototyping of new programming languages and language features. We study this question using the imperative language IMP (a subset of C), formalized via small-step operational semantics (SOS) and rewriting-based operational semantics (K-semantics). We introduce three evaluation sets (Human-Written, LLM-Translated, and Fuzzer-Generated) whose difficult…
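For readers unfamiliar with small-step operational semantics, the following is a minimal sketch of what "executing a program from its semantics" could look like for a tiny IMP-like language. It is not the paper's formalization or benchmark code: the AST classes, the `step` and `run` functions, and the choice of operators are all hypothetical names introduced here for illustration, and expressions are evaluated in a single step for brevity rather than reduced step by step.

```python
from dataclasses import dataclass
import operator

# --- Hypothetical AST for a tiny IMP-like language (illustration only) ---
@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Skip:
    pass

@dataclass(frozen=True)
class Assign:
    name: str
    expr: object

@dataclass(frozen=True)
class Seq:
    first: object
    second: object

@dataclass(frozen=True)
class If:
    cond: object
    then: object
    orelse: object

@dataclass(frozen=True)
class While:
    cond: object
    body: object

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<": operator.lt, "==": operator.eq}

def evaluate(expr, store):
    """Evaluate an expression against the store (one big step, for brevity)."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Var):
        return store[expr.name]
    if isinstance(expr, BinOp):
        return OPS[expr.op](evaluate(expr.left, store), evaluate(expr.right, store))
    raise TypeError(f"unknown expression: {expr!r}")

def step(stmt, store):
    """Apply one small-step rule: <stmt, store> -> <stmt', store'>."""
    if isinstance(stmt, Assign):
        # Assignment reduces to skip and updates the store.
        return Skip(), {**store, stmt.name: evaluate(stmt.expr, store)}
    if isinstance(stmt, Seq):
        if isinstance(stmt.first, Skip):
            return stmt.second, store          # skip; s2 -> s2
        first, new_store = step(stmt.first, store)
        return Seq(first, stmt.second), new_store
    if isinstance(stmt, If):
        branch = stmt.then if evaluate(stmt.cond, store) else stmt.orelse
        return branch, store
    if isinstance(stmt, While):
        # Unfold the loop once: while c do b -> if c then (b; while c do b) else skip
        return If(stmt.cond, Seq(stmt.body, stmt), Skip()), store
    raise ValueError("skip is a final configuration; no rule applies")

def run(stmt, store, max_steps=10_000):
    """Repeat `step` until the program reduces to skip; return (store, steps)."""
    steps = 0
    while not isinstance(stmt, Skip):
        if steps >= max_steps:
            raise RuntimeError("step budget exceeded")
        stmt, store = step(stmt, store)
        steps += 1
    return store, steps

# i = 1; s = 0; while (i < 6) { s = s + i; i = i + 1 }
program = Seq(
    Assign("i", Num(1)),
    Seq(
        Assign("s", Num(0)),
        While(
            BinOp("<", Var("i"), Num(6)),
            Seq(Assign("s", BinOp("+", Var("s"), Var("i"))),
                Assign("i", BinOp("+", Var("i"), Num(1)))),
        ),
    ),
)

final_store, step_count = run(program, {})
print(final_store)
```

Each call to `step` corresponds to applying exactly one rewrite rule to the current configuration, which is the kind of rule-following an LLM would have to reproduce if asked to act as an interpreter from a semantics alone.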
