Abstract
Cell type annotation is an essential step in single-cell RNA-sequencing analysis, and numerous annotation methods are available. Most require a combination of computational and domain-specific expertise, and they frequently yield inconsistent results that can be challenging to interpret. Large language models have the potential to expand accessibility while reducing manual input and improving accuracy, but existing approaches suffer from hyperconfidence, hallucinations, and lack of reasoning. To address these limitations, we developed CASSIA for automated, accurate, and interpretable cell annotation of single-cell RNA-sequencing data. As demonstrated in analyses of 970 cell types, CASSIA improves annotation accuracy in benchmark datasets as well as complex and rare cell populations, and also provides users with reasoning and quality assessment to ensure interpretability, guard against hallucinations, and calibrate confidence.
Data availability
The single-cell RNA-sequencing data generated in this study have been deposited in the NCBI Gene Expression Omnibus (GEO) under accession code GSE307976. The processed data are also available under the same accession. Additional publicly available datasets used in this study are listed in the Supplementary Information file. Source data are provided with this paper.
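For readers who want to retrieve the deposited data programmatically, the short Python sketch below shows one possible way to download and inspect the GEO series named above. It is a minimal illustration only: the use of the third-party GEOparse package and the local destination directory are assumptions for demonstration, not part of the data deposition itself.

```python
# Hedged sketch: download and inspect the deposited GEO series GSE307976.
# GEOparse is a third-party package (pip install GEOparse); its use here is an
# illustrative assumption, not a requirement of the deposition.
import GEOparse

# Downloads the series SOFT file into ./geo_data and parses it.
gse = GEOparse.get_GEO(geo="GSE307976", destdir="./geo_data")

# Print the series title and the accession/title of each sample.
print(gse.metadata.get("title"))
for gsm_name, gsm in gse.gsms.items():
    print(gsm_name, gsm.metadata.get("title"))
```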
Code availability
The code used to develop the model, perform the analyses, and generate results in this study is publicly available and has been deposited in the CASSIA repository at https://github.com/ElliotXie/CASSIA under the MIT license. The specific version of the code associated with this publication is archived in Zenodo and is accessible via https://doi.org/10.5281/zenodo.17261689 (ref. 34).
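As an illustration of how the archived release can be retrieved programmatically, the minimal Python sketch below queries Zenodo's public records API for the record behind the DOI above. The record ID is assumed to be the numeric suffix of the DOI, and the use of the requests package is an assumption for illustration rather than a documented CASSIA workflow.

```python
# Hedged sketch: list metadata and file download links for the archived CASSIA
# release via Zenodo's public REST API. The record ID is assumed from the DOI
# suffix (10.5281/zenodo.17261689); requests is a third-party dependency
# (pip install requests), not something the CASSIA repository mandates.
import requests

RECORD_ID = "17261689"  # assumed from the DOI suffix
url = f"https://zenodo.org/api/records/{RECORD_ID}"

response = requests.get(url, timeout=30)
response.raise_for_status()
record = response.json()

print("Title:", record["metadata"]["title"])
for f in record.get("files", []):
    # Each entry lists a file name and a direct download link.
    print(f["key"], f["links"]["self"])
```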
References
Tan, Y. & Cahan, P. SingleCellNet: a computational tool to classify single cell RNA-seq data across platforms and across species. Cell Syst. 9, 207–213.e2 (2019).
Alquicira-Hernandez, J., Sathe, A., Ji, H. P., Nguyen, Q. & Powell, J. E. scPred: accurate supervised method for cell-type classification from single-cell RNA-seq data. Genome Biol. 20, 264 (2019).
Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852–866 (2022).
Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).
Kiselev, V. Y., Yiu, A. & Hemberg, M. scmap: projection of single-cell RNA-seq data across data sets. Nat. Methods 15, 359–362 (2018).
Aran, D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172 (2019).
de Kanter, J. K., Lijnzaad, P., Candelli, T., Margaritis, T. & Holstege, F. C. P. CHETAH: a selective, hierarchical cell type identification method for single-cell RNA sequencing. Nucleic Acids Res. 47, e95 (2019).
Fu, Q. et al. A comparison of scRNA-seq annotation methods based on experimentally labeled immune cell subtype dataset. Brief. Bioinform. 25, https://doi.org/10.1093/bib/bbae392 (2024).
Abdelaal, T. et al. A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol. 20, 194 (2019).
Shao, X. et al. scCATCH: automatic annotation on cell types of clusters from single-cell RNA sequencing data. iScience 23, 100882 (2020).
Ianevski, A., Giri, A. K. & Aittokallio, T. Fully-automated and ultra-fast cell-type identification using specific marker combinations from single-cell transcriptomic data. Nat. Commun. 13, 1246 (2022).
Zhang, Z. et al. SCINA: a semi-supervised subtyping algorithm of single cells and bulk samples. Genes (Basel) 10, https://doi.org/10.3390/genes10070531 (2019).
Clarke, Z. A. et al. Tutorial: guidelines for annotating single-cell transcriptomic maps using automated and manual methods. Nat. Protoc. 16, 2749–2764 (2021).
Toufiq, M. et al. Harnessing large language models (LLMs) for candidate gene prioritization and selection. J. Transl. Med. 21, 728 (2023).
Kim, J., Wang, K., Weng, C. & Liu, C. Assessing the utility of large language models for phenotype-driven gene prioritization in the diagnosis of rare genetic disease. Am. J. Hum. Genet. 111, 2190–2202 (2024).
Gill, J. K., Chetty, M., Lim, S. & Hallinan, J. Large language model based framework for automated extraction of genetic interactions from unstructured data. PLoS ONE 19, e0303231 (2024).
Hou, W. & Ji, Z. Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis. Nat. Methods 21, 1462–1465 (2024).
Zhang, X. et al. CellMarker: a manually curated resource of cell markers in human and mouse. Nucleic Acids Res. 47, D721–D728 (2019).
Diehl, A. D. et al. The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability. J. Biomed. Semant. 7, 44 (2016).
Farquhar, S., Kossen, J., Kuhn, L. & Gal, Y. Detecting hallucinations in large language models using semantic entropy. Nature 630, 625–630 (2024).
Huang, L. et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Trans. Inf. Syst. 43, 1–55 (2025).
Gallifant, J. et al. The TRIPOD-LLM reporting guideline for studies using large language models. Nat. Med. 31, 60–69 (2025).
Huttenlocher, D., Ozdaglar, A. & Goldston, D. Technical Report. MIT Schwarzman College of Computing https://computing.mit.edu/wp-content/uploads/2023/11/AIPolicyBrief.pdf (2023).
ChatGPT is a black box: how AI research can break it open. Nature 619, 671–672 (2023).
Kojima, T., Gu, S. S., Reid, M., Matsuo, Y. & Iwasawa, Y. Large language models are zero-shot reasoners. Adv. Neural Inf. Process. Syst. 35, 22199–22213 (2022).
Wang, L. et al. Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models. In Proc. 61st Annual Meeting of the Association for Computational Linguistics, 2609–2634 (ACL, 2023).
Shanahan, M., McDonell, K. & Reynolds, L. Role play with large language models. Nature 623, 493–498 (2023).
Bsharat, S. M., Myrzakhan, A. & Shen, Z. Principled instructions are all you need for questioning LLaMA-1/2, GPT-3.5/4. Preprint at https://arxiv.org/abs/2312.16171 (2023).
Weng, Y. et al. Large Language Models are Better Reasoners with Self-Verification. In Findings of the Association for Computational Linguistics: EMNLP 2023, 2550–2575 (ACL, 2023).
Li, J., Zhang, Q., Yu, Y., Fu, Q. & Ye, D. More Agents Is All You Need. Transactions on Machine Learning Research (2024).
Wang, X. et al. Self-Consistency Improves Chain-of-Thought Reasoning in Language Models. In Proc. Int. Conf. Learn. Represent. (ICLR, 2023).
Dominguez Conde, C. et al. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 376, eabl5197 (2022).
Shireman, J. M. et al. Genomic analysis of human brain metastases treated with stereotactic radiosurgery reveals unique signature based on treatment failure. iScience 27, 109601 (2024).
Xie, E., Cai, Y., Liu, J., Cheng, L. & Shireman, J. ElliotXie/CASSIA: CASSIA 1.2.0 (v1.2.0). Zenodo https://doi.org/10.5281/zenodo.17261689 (2025).
Acknowledgements
This work was supported by NIH GM102756 (C.K.), NIH K08NS092895 (M.D.), and the IU Value Research grant (M.D.).
Author information
Authors and Affiliations
Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
Elliot Xie, Lingxin Cheng, Yujia Cai, Jihua Liu, Chitrasen Mohanty & Christina Kendziorski
Department of Neurological Surgery, University of Wisconsin-Madison, Madison, WI, USA
Jack Shireman & Mahua Dey
Authors
- Elliot Xie
- Lingxin Cheng
- Jack Shireman
- Yujia Cai
- Jihua Liu
- Chitrasen Mohanty
- Mahua Dey
- Christina Kendziorski
Contributions
E.X. conceived the study and led the method development, implementation, validation, and case study analyses. C.K. co-led the project. L.C. contributed to the method development and developed the RAG agent. L.C., Y.C. and J.L. contributed to case study analyses. M.D. collected the brain tumor samples and assisted with interpretation of results; J.S. processed the brain tumor samples for scRNA-seq profiling, contributed to method development, and assisted with interpretation of results from all case study analyses. C.M. contributed to figure preparation and visualization. C.K., E.X., L.C. and J.S. wrote the manuscript.
Corresponding author
Correspondence to Christina Kendziorski.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Jesper Tegner and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xie, E., Cheng, L., Shireman, J. et al. CASSIA: a multi-agent large language model for automated and interpretable cell annotation. Nat Commun (2025). https://doi.org/10.1038/s41467-025-67084-x
Received: 07 March 2025
Accepted: 17 November 2025
Published: 07 December 2025
DOI: https://doi.org/10.1038/s41467-025-67084-x