Title: Personalization of Large Foundation Models for Health Interventions
Abstract: Large foundation models (LFMs) are transforming healthcare AI in prevention, diagnostics, and treatment. However, whether LFMs can provide truly personalized treatment recommendations remains an open question. Recent research has revealed multiple challenges for personalization, including the fundamental generalizability paradox: models that achieve high accuracy in one clinical study perform at chance level in others, demonstrating that personalization and external validity exist in tension. This exemplifies broader contradictions in AI-driven healthcare: the privacy-performance paradox, the scale-specificity paradox, and the automation-empathy paradox. A further challenge is that the degree of causal understanding required for personalized recommendations, as opposed to the mere predictive capacity of LFMs, remains an open question. N-of-1 trials, crossover self-experiments that are the gold standard for individual causal inference in personalized medicine, resolve these tensions by providing within-person causal evidence while preserving privacy through local experimentation. This paper argues that LFMs, despite their impressive capabilities, cannot replace N-of-1 trials. Instead, the two are complementary: LFMs excel at rapid hypothesis generation from population patterns using multimodal data, while N-of-1 trials excel at causal validation for a given individual. We propose a hybrid framework that combines the strengths of both to enable personalization and navigate the identified paradoxes: LFMs generate ranked intervention candidates with uncertainty estimates, which trigger subsequent N-of-1 trials. Clarifying the boundary between prediction and causation and explicitly addressing the paradoxical tensions are essential for responsible AI integration in personalized medicine.
Comments: Accepted to the AAAI 2026 Workshop on Personalization in the Era of Large Foundation Models (PerFM)
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Applications (stat.AP)
Cite as: arXiv:2601.03482 [cs.AI] (or arXiv:2601.03482v1 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2601.03482 (arXiv-issued DOI via DataCite, pending registration)
Submission history
From: Stefan Konigorski [v1] Wed, 7 Jan 2026 00:24:01 UTC (297 KB)