New Results
Abhinav Adduri, Dhruv Gautam, Christopher Carpenter, Rohan Shah, Chiara Ricci-Tam, Yuval Kluger, Dave P. Burke, Yusuf Husein Roohani
doi: https://doi.org/10.64898/2026.01.09.698608
Abstract
Single-cell transcriptomics offers the promise of measuring the diversity of cellular phenotypes across species, diseases, and other biological conditions. Recently, foundation models have emerged to identify this variation, yet most methods represent each cell independently, despite technical limitations that reduce measurement precision at the single-cell level. Here, we present Stack, a foundation model trained on 149 million uniformly preprocessed human single cells that leverages tabular attention to generate representations for each cell informed by the cells in its context. Stack offers substantial improvements for downstream tasks in the zero-shot setting compared to baselines, whether they are zero-shot, fine-tuned, or trained from scratch on the target dataset. Stack can perform in-context learning from unlabeled cells representing arbitrary conditions, such as a chemical perturbation or a different donor, and predict the effect of those conditions on a target cell population without requiring data-specific fine-tuning. We apply Stack to generate Perturb Sapiens, the first human whole-organism atlas of perturbed cells, spanning 28 tissues, 40 cell classes, and 201 perturbations. We validated subsets of Perturb Sapiens using in vitro stimulation profiles. Overall, Stack presents a new modeling framework where cells themselves act as guiding examples at inference time, unlocking general-purpose in-context learning capabilities for single-cell biology.
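The abstract does not describe the architecture in detail, so the following is only a minimal sketch of one common reading of "tabular attention": attention applied along both axes of a cells-by-genes table, so that each cell's representation is informed by the other cells in its context. Everything here is an assumption made for illustration, not Stack's implementation: the function names, the tensor layout `(n_cells, n_genes, d)`, the identity projections (no learned weights), and the mean-pooling step are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    """Scaled dot-product self-attention over the second-to-last axis.

    tokens has shape (..., n, d). Queries, keys, and values are the tokens
    themselves (identity projections), an assumption made for brevity.
    """
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)  # (..., n, n)
    return softmax(scores, axis=-1) @ tokens

def tabular_attention_block(x):
    """Hypothetical block over a table x of shape (n_cells, n_genes, d)."""
    # Step 1: within each cell, attend across its gene tokens.
    x = x + self_attention(x)
    # Step 2: for each gene, attend across the cells in the context.
    xt = np.swapaxes(x, 0, 1)          # (n_genes, n_cells, d)
    xt = xt + self_attention(xt)
    return np.swapaxes(xt, 0, 1)       # back to (n_cells, n_genes, d)

# Toy context: 8 cells, 5 gene tokens per cell, 4-dimensional token embeddings.
rng = np.random.default_rng(0)
context = rng.normal(size=(8, 5, 4))
out = tabular_attention_block(context)

# One embedding per cell, pooled over gene tokens (pooling is an assumption).
cell_embeddings = out.mean(axis=1)     # (8, 4)
```

In this sketch, the second attention step is what makes a cell's embedding depend on its context: changing the other cells in `context` changes the output for cell 0, whereas attention across genes alone would leave it unchanged.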
Competing Interest Statement
D.G. acknowledges outside interest as part of the founding team of the Autoscience Institute. D.P.B. acknowledges outside interest as a Google Advisor. Y.H.R. is a scientific advisory board member at QureXR. All other authors declare no competing interests.
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.