Arc Institute unveils a new foundation model with in-context learning of single-cell biology, and applies it to generate Perturb Sapiens, an atlas of simulated human cells

Building a computational model that can predict how cells behave across diseases, drugs, and biological contexts requires solving multiple challenges. With our first virtual cell model State, we showed that working with sets of cells can improve our ability to predict perturbation response. But one of the most fundamental challenges remains: can we build a generalist model that predicts cellular responses across entirely new contexts without requiring perturbational data from that specific context? Today, we’re releasing Stack, an open-source foundation model that makes progress on this question while extending the "sets of cells" concept through two key innovations:
The first is architectural. Stack uses a tabular transformer block that processes single-cell data as a two-dimensional table comprising cells and genes, with information flowing both within individual cells (how genes relate to each other) and between cells (how cells with similar patterns relate). This design allows the model to better capture biological context: a T cell in inflamed tissue behaves differently not just because of its own genes, but because of its cellular environment. Stack also introduces trainable "gene module tokens" that represent cell state using biological components derived from multiple genes instead of modeling each gene independently, making the model both more interpretable and more efficient to train.
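As a rough illustration of the idea, here is a minimal PyTorch sketch of a tabular block as axial attention, with one pass mixing information across tokens within each cell and a second mixing information across cells. The class, its dimensions, and the ordering of the two passes are our assumptions, not Stack's actual implementation; see the preprint for the real architecture.

```python
import torch
import torch.nn as nn

class TabularBlock(nn.Module):
    """Axial attention over a (cells x tokens) table. Tokens could be
    per-gene embeddings or, as in Stack, learned gene module tokens."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.gene_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cell_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_cells, n_tokens, dim)
        b, c, t, d = x.shape
        # Within-cell pass: attend across tokens, cells folded into batch.
        h = x.reshape(b * c, t, d)
        hn = self.norm1(h)
        h = h + self.gene_attn(hn, hn, hn)[0]
        # Between-cell pass: attend across cells, tokens folded into batch.
        h = h.reshape(b, c, t, d).permute(0, 2, 1, 3).reshape(b * t, c, d)
        hn = self.norm2(h)
        h = h + self.cell_attn(hn, hn, hn)[0]
        return h.reshape(b, t, c, d).permute(0, 2, 1, 3)
```

Interleaving the two passes is what lets context shape each cell's representation: a cell's embedding reflects not only its own genes but also the other cells sharing the table.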
The second innovation is the training strategy. We pre-trained Stack on 149 million cells from scBaseCount, spanning hundreds of tissues, diseases, donors, and states, to internalize the biological dependencies that define cellular context. Then, through post-training on 55 million cells from the public databases CellxGene and Parse PBMC, Stack learned to use one set of cells as a "prompt" that instructs what to predict in another set. Just as a text prompt guides how a language model generates responses, cells themselves serve as prompts in Stack, defining the biological condition that shapes its predictions. Stack can see drug-treated immune cells and predict how epithelial cells would respond to that same drug, a task it was never explicitly trained for. Stack is the first single-cell foundation model we're aware of that can learn new tasks at inference time, with no retraining required.
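To make the prompting analogy concrete, here is a hypothetical inference sketch. The `predict_in_context` helper and its argument names are placeholders of ours, not Stack's actual API; see the GitHub repository for the real interface.

```python
import torch

def predict_in_context(model, prompt_cells, query_cells):
    """Condition on one set of cells and predict another set's response.

    prompt_cells: e.g., drug-treated immune cells defining the condition
    query_cells:  e.g., untreated epithelial cells to predict under it
    """
    # No gradient updates or fine-tuning: the prompt set alone specifies
    # the task, analogous to a text prompt for a language model.
    with torch.no_grad():
        return model(prompt=prompt_cells, query=query_cells)
```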
This capability translates into gains on standard benchmarks. We evaluated Stack using cell-eval, a rigorous framework for perturbation prediction, alongside standard tasks for disease classification and cell type integration. Stack consistently outperformed other methods, including both foundation models and popular approaches such as scVI (single-cell Variational Inference) and PCA (Principal Component Analysis) trained from scratch on each evaluation dataset, demonstrating that a zero-shot foundation model can compete with specialized approaches.
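For intuition about what such benchmarks measure, a common perturbation-prediction metric is the correlation between predicted and observed expression shifts relative to control. The sketch below illustrates that kind of metric; it is not cell-eval's actual API.

```python
import numpy as np

def perturbation_delta_correlation(ctrl, pred, obs):
    """Pearson r between predicted and observed mean expression shifts
    (perturbed minus control). Inputs are (n_cells, n_genes) arrays."""
    delta_pred = pred.mean(axis=0) - ctrl.mean(axis=0)
    delta_obs = obs.mean(axis=0) - ctrl.mean(axis=0)
    # Correlate the two per-gene delta vectors.
    dp = delta_pred - delta_pred.mean()
    do = delta_obs - delta_obs.mean()
    return float((dp @ do) / (np.linalg.norm(dp) * np.linalg.norm(do)))
```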
Perturb Sapiens: an atlas of predicted cellular responses

To demonstrate Stack’s capabilities and create a new resource for the field, we built Perturb Sapiens, an atlas of predicted cellular responses derived from Tabula Sapiens.
Perturb Sapiens addresses a fundamental experimental gap: the vast majority of cell type-tissue-perturbation combinations have never been measured. Comprehensively testing even a fraction of these combinations would require millions of dollars and years of experimental work. To create the atlas, we used the model's in-context learning to "translate" perturbation responses measured in immune cells across the human body: for each perturbation, Stack observed the immune cell response and predicted how every cell type in every tissue from Tabula Sapiens would respond. The result is approximately 20,000 predicted cell type-tissue-perturbation combinations.
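Schematically, the construction amounts to a loop over perturbations and Tabula Sapiens cell-type/tissue groups, reusing the hypothetical `predict_in_context` helper from the earlier sketch. All names here are illustrative assumptions of ours, not the actual pipeline code.

```python
# immune_perturbation_data: {perturbation: treated immune cell matrix}
# tabula_sapiens_groups:    {(tissue, cell_type): baseline cell matrix}
atlas = {}
for perturbation, treated_immune_cells in immune_perturbation_data.items():
    for (tissue, cell_type), baseline_cells in tabula_sapiens_groups.items():
        # Prompt with the measured immune response; predict how this
        # cell type in this tissue would respond to the same perturbation.
        atlas[(perturbation, tissue, cell_type)] = predict_in_context(
            model,
            prompt_cells=treated_immune_cells,
            query_cells=baseline_cells,
        )
# ~20,000 (perturbation, tissue, cell type) entries in total.
```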
What can we use Perturb Sapiens for? A drug that strongly affects immune cells might barely touch epithelial or stromal cells in the same tissue. Interferon signaling produces different transcriptomic signatures in lung epithelium versus intestinal epithelium. Some drugs and cytokines activate similar response programs across surprisingly diverse cell types, suggesting shared vulnerabilities.
We validated a subset of these predictions with real experiments, focusing on epithelial cell responses to cytokines, and confirmed that Stack's predictions capture biologically meaningful, cell-type-specific effects. The model's predictions aren't as accurate as real experiments, but they are a strong starting point for exploration when those experiments aren't feasible.
Perturb Sapiens is available on Hugging Face.
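Assuming the atlas is published as a standard Hugging Face dataset repository, downloading it might look like the following; the repo id here is a placeholder, so check the Hugging Face page for the actual name and file layout.

```python
from huggingface_hub import snapshot_download

# Placeholder repo id -- see the Hugging Face page for the real one.
local_dir = snapshot_download(
    repo_id="arcinstitute/Perturb-Sapiens",
    repo_type="dataset",
)
print(local_dir)  # local path containing the atlas files
```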
Stack and State: complementary approaches
Stack and State approach the same goal from different directions, and together they expand what virtual cell modeling can do.
Use Stack when you're working with observational data, such as patient samples, disease tissues, or scenarios where perturbation experiments aren't feasible. Stack works immediately across a broad range of biological conditions (different donors, disease states, and tissue environments) and does not require perturbational data for each condition being studied.
Use State when you have access to large-scale perturbation data and want to expand the information obtained from running those experiments. State can predict perturbation effects for a biological context of interest with state-of-the-art performance, learning deeply from perturbation data generated in other contexts to predict new drugs, doses, and combinations. The more data you generate, the better State becomes within that experimental setting.
Each model supports a different phase of discovery: Stack helps you develop hypotheses across a broad range of contexts, while State helps you expand those hypotheses after performing perturbation experiments. State 2, which we're developing now, will build on learnings from both models.
###
Dong, M., Adduri, A., Gautam, D., Carpenter, C., Shah, R., Ricci-Tam, C., Kluger, Y., Burke, D. P., & Roohani, Y. H. (2026). Stack: In-context learning of single-cell biology. bioRxiv. DOI: 10.64898/2026.01.09.698608
Stack is open-source and available now:
- Preprint: bioRxiv
- Code: GitHub
- Perturb Sapiens: Hugging Face