Abstract:The PRO-CTCAE is an NCI-developed patient-reported outcome system for capturing symptomatic adverse events in oncology trials. It comprises a large library drawn from the CTCAE vocabulary, and item selection for a given trial is typically guided by expected toxicity profiles from prior data. Selecting too many PRO-CTCAE items can burden patients and reduce compliance, while too few may miss important safety signals. We present an automated method to select a minimal yet comprehensive PRO-CTCAE subset based on historical safety data. Each candidate PRO-CTCAE symptom term is first mapped to its corresponding MedDRA Preferred Terms (PTs), which are then encoded into Safe…
Abstract:The PRO-CTCAE is an NCI-developed patient-reported outcome system for capturing symptomatic adverse events in oncology trials. It comprises a large library drawn from the CTCAE vocabulary, and item selection for a given trial is typically guided by expected toxicity profiles from prior data. Selecting too many PRO-CTCAE items can burden patients and reduce compliance, while too few may miss important safety signals. We present an automated method to select a minimal yet comprehensive PRO-CTCAE subset based on historical safety data. Each candidate PRO-CTCAE symptom term is first mapped to its corresponding MedDRA Preferred Terms (PTs), which are then encoded into Safeterm, a high-dimensional semantic space capturing clinical and contextual diversity in MedDRA terminology. We score each candidate PRO item for relevance to the historical list of adverse event PTs and combine relevance and incidence into a utility function. Spectral analysis is then applied to the combined utility and diversity matrix to identify an orthogonal set of medical concepts that balances relevance and diversity. Symptoms are rank-ordered by importance, and a cut-off is suggested based on the explained information. The tool is implemented as part of the Safeterm trial-safety app. We evaluate its performance using simulations and oncology case studies in which PRO-CTCAE was employed. This automated approach can streamline PRO-CTCAE design by leveraging MedDRA semantics and historical data, providing an objective and reproducible method to balance signal coverage against patient burden.
| Comments: | 13 pages, 2 figures |
| Subjects: | Computation and Language (cs.CL) |
| Cite as: | arXiv:2512.06919 [cs.CL] |
| (or arXiv:2512.06919v1 [cs.CL] for this version) | |
| https://doi.org/10.48550/arXiv.2512.06919 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Francois Vandenhende [view email] [v1] Sun, 7 Dec 2025 16:56:27 UTC (246 KB)