Abstract: Large Language Models (LLMs) deliver exceptional performance across natural language tasks but demand substantial computational resources, limiting their deployment on resource-constrained edge devices. Existing compression techniques, such as quantization and pruning, often degrade critical linguistic properties and lack formal guarantees for preserving model behavior. We propose Temporal Logic-Guided Large Language Model Compression (TOGGLE), a novel framework that leverages Signal Temporal Logic (STL) to formally specify and enforce linguistic properties during compression. TOGGLE employs STL robustness-guided Bayesian optimization to systematically explore layer-wise quantization and pruning configurations, generating compressed models that formally satisfy specified linguistic constraints without retraining or fine-tuning. Evaluating TOGGLE on four LLM architectures (GPT-2, DeepSeek-V2 7B, LLaMA 3 8B, and Mistral 7B), we achieve up to a 3.3x reduction in computational cost (FLOPs) and up to a 68.8% reduction in model size while satisfying all linguistic properties. TOGGLE represents the first integration of formal methods into LLM compression, enabling efficient, verifiable deployment of LLMs on edge hardware.
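To make the notion of STL robustness concrete, the following minimal sketch computes the standard quantitative semantics of an "always" formula, G (signal >= threshold), over a finite trace: the robustness is the smallest margin by which the signal clears the threshold, and a non-negative value means the property holds. This is an illustration only, not the TOGGLE implementation; the function names, the metric trace, and the threshold are hypothetical and do not come from the paper.

```python
from typing import Sequence


def robustness_always_geq(signal: Sequence[float], threshold: float) -> float:
    """Robustness of the STL formula G (signal >= threshold) on a finite trace.

    Standard quantitative semantics: the minimum over time of
    (signal[t] - threshold). Positive means satisfied with margin,
    negative means violated.
    """
    if not signal:
        raise ValueError("signal must be non-empty")
    return min(value - threshold for value in signal)


def satisfies_all(robustness_values: Sequence[float]) -> bool:
    """A configuration is acceptable only if every property is satisfied."""
    return all(rho >= 0.0 for rho in robustness_values)


# Hypothetical per-step scores of a compressed model on some linguistic metric
# (made-up values for illustration, not results from the paper).
metric_trace = [0.92, 0.88, 0.95, 0.90]
rho = robustness_always_geq(metric_trace, threshold=0.85)
print(f"robustness = {rho:.2f}, satisfied = {satisfies_all([rho])}")
```

In a robustness-guided search, a score of this kind could serve as the constraint signal: a candidate layer-wise quantization and pruning configuration with negative robustness on any property would be rejected, while configurations with non-negative robustness compete on cost.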
| Comments: | Published in the IEEE ICCAD 2025 conference |
| Subjects: | Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO) |
| Cite as: | arXiv:2512.16855 [cs.AI] |
| | (or arXiv:2512.16855v1 [cs.AI] for this version) |
| | https://doi.org/10.48550/arXiv.2512.16855 (arXiv-issued DOI via DataCite, pending registration) |
| Related DOI: | https://doi.org/10.1109/ICCAD66269.2025.11240962 |
Submission history
From: Khaza Anuarul Hoque [v1] Thu, 18 Dec 2025 18:27:42 UTC (514 KB)