UniRank: Unified Rank Allocation for Low-Rank LLM Compression
Low-rank decomposition is a promising compression paradigm for large language models. However, rank allocation remains challenging: manual rules lack generalizability, and learning-based approaches incur heavy computational overhead. To address these issues, we formulate global low-rank allocation as a sorting-and-truncation pipeline and score each singular component via dual criteria: Local singular energy ratio that quantifi...
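The sorting-and-truncation idea above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract is cut off before the second criterion is stated, so only a local score is used here, and that score is assumed to be each component's share of its layer's squared singular values. The names `local_energy_ratio`, `allocate_ranks`, and `keep_fraction` are hypothetical.

```python
import numpy as np


def local_energy_ratio(singular_values):
    """Assumed local score: share of the layer's squared singular values."""
    energy = singular_values ** 2
    total = energy.sum()
    if total == 0:
        # An all-zero matrix has no energy to distribute.
        return np.zeros_like(energy)
    return energy / total


def allocate_ranks(weights, keep_fraction):
    """Score every singular component, sort globally, truncate, count per layer.

    weights: dict mapping a layer name to a 2-D weight matrix.
    keep_fraction: fraction of all singular components to keep (hypothetical knob).
    """
    scores, owners = [], []
    for name, w in weights.items():
        s = np.linalg.svd(w, compute_uv=False)  # singular values, descending
        scores.append(local_energy_ratio(s))
        owners.extend([name] * len(s))
    scores = np.concatenate(scores)

    keep = int(round(keep_fraction * len(scores)))
    # Global sort by descending score, then truncate to the top `keep` entries.
    order = np.argsort(-scores, kind="stable")[:keep]

    ranks = {name: 0 for name in weights}
    for idx in order:
        ranks[owners[idx]] += 1
    return ranks


layers = {"a": np.diag([3.0, 1.0]), "b": np.diag([1.0, 1.0])}
print(allocate_ranks(layers, keep_fraction=0.5))
```

Because the assumed score is non-increasing within a layer, the components kept for each layer always form a prefix of its spectrum, so the per-layer count can be used directly as a truncation rank. A combined score, such as the dual criteria the abstract mentions, would need an explicit check that this still holds.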