Title: REMODEL-LLM: Transforming C code to Java using LLMs
Abstract: The automated translation of C code to Java is a notoriously difficult task, fraught with challenges stemming from fundamental paradigm shifts (procedural vs. object-oriented), different memory models (manual pointer management vs. garbage collection), and incompatible data types. This paper investigates the efficacy of 19 small, quantized LLMs (under 20 billion parameters) on the C-to-Java translation task. We use a novel hybrid pipeline that leverages Abstract Syntax Trees (ASTs) for semantic decomposition and employs a highly constrained, rule-based prompting strategy. The results are stark: a clear multi-tiered performance divide emerged. The vast majority of models (Tier 3, e.g., llama3.1, gemma3, starcoder2) failed 100% of the tests, proving incapable of generating even basic, runnable Java boilerplate. A small middle tier (Tier 2, e.g., mistral-nemo and mistral) produced runnable code that was plagued by dangerous semantic failures and incorrect translations. Only three models (Tier 1: phi4, deepseek-coder-v2, codeqwen) proved viable, passing more than 50% of the test suite. Even these top models failed on the most complex C concepts, such as function pointers, sizeof, and enum logic, revealing a hard ceiling for the reasoning capabilities of current quantized models.
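To make the three hard cases named in the abstract concrete, here is a minimal hand-written sketch of how a C function-pointer table, an integer-valued enum, and a `sizeof(int)` expression might be expressed in Java. It is not taken from the paper's pipeline or test suite: the class name `FunctionPointerSketch`, the `Op` enum, and every method name are hypothetical, and the C fragments in the comments are assumed inputs chosen only for illustration.

```java
import java.util.function.IntBinaryOperator;

public class FunctionPointerSketch {
    // Assumed C input: enum op { OP_ADD, OP_SUB, OP_MUL };
    enum Op { ADD, SUB, MUL }

    // Assumed C input: int (*table[])(int, int) = { add, sub, mul };
    // Java has no function pointers; a functional interface stands in for one.
    static IntBinaryOperator lookup(Op op) {
        switch (op) {
            case ADD: return (a, b) -> a + b;
            case SUB: return (a, b) -> a - b;
            case MUL: return (a, b) -> a * b;
            default: throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    // Assumed C input: int apply(int (*fn)(int, int), int a, int b) { return fn(a, b); }
    static int apply(IntBinaryOperator fn, int a, int b) {
        return fn.applyAsInt(a, b);
    }

    // C enums are plain ints, while Java enums are objects, so an integer
    // coming from C-style logic has to be mapped explicitly (here by ordinal).
    static Op fromCValue(int value) {
        return Op.values()[value];
    }

    // Java has no sizeof operator. sizeof(int) is platform-dependent in C,
    // whereas Integer.BYTES is always 4.
    static int intSizeInBytes() {
        return Integer.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(apply(lookup(Op.ADD), 6, 7));        // prints 13
        System.out.println(apply(lookup(fromCValue(2)), 6, 7)); // prints 42
        System.out.println(intSizeInBytes());                   // prints 4
    }
}
```

Each mapping requires a design decision rather than a token-for-token rewrite (which functional interface to use, how to map enum integers, which constant replaces `sizeof`), which may be part of why these constructs are hard for small models.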
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Cite as: arXiv:2512.11402 [cs.SE] (or arXiv:2512.11402v1 [cs.SE] for this version)
DOI: https://doi.org/10.48550/arXiv.2512.11402 (arXiv-issued DOI via DataCite, pending registration)
Related DOI: https://doi.org/10.13140/RG.2.2.15123.54564
Submission history
From: Aryan Gupta
[v1] Fri, 12 Dec 2025 09:25:10 UTC (705 KB)