arXiv:2601.12274v1 Announce Type: new Abstract: Concolic testing, a powerful hybrid software testing technique, has historically been plagued by fundamental limitations such as path explosion and the high cost of constraint solving, which hinder its practical application in large-scale, real-world software systems. This paper introduces a novel algorithmic framework that synergistically integrates concolic execution with Large Language Models (LLMs) to overcome these challenges. Our hybrid approach leverages the semantic reasoning capabilities of LLMs to guide path exploration, prioritize interesting execution paths, and assist in constraint solving. We formally define the system architecture and algorithms that constitute this new paradigm. Through a series of experiments on both synthet…
arXiv:2601.12274v1 Announce Type: new Abstract: Concolic testing, a powerful hybrid software testing technique, has historically been plagued by fundamental limitations such as path explosion and the high cost of constraint solving, which hinder its practical application in large-scale, real-world software systems. This paper introduces a novel algorithmic framework that synergistically integrates concolic execution with Large Language Models (LLMs) to overcome these challenges. Our hybrid approach leverages the semantic reasoning capabilities of LLMs to guide path exploration, prioritize interesting execution paths, and assist in constraint solving. We formally define the system architecture and algorithms that constitute this new paradigm. Through a series of experiments on both synthetic and real-world Fintech applications, we demonstrate that our approach significantly outperforms traditional concolic testing, random testing, and genetic algorithm-based methods in terms of branch coverage, path coverage, and time-to-coverage. The results indicate that by combining the strengths of both concolic execution and LLMs, our method achieves a more efficient and effective exploration of the program state space, leading to improved bug detection capabilities.