Abstract:Prolonged length of stay (pLoS) is a significant factor associated with the risk of adverse in-hospital events. We develop and explain a predictive model for pLos using admission-level patient and hospital administrative data. The approach includes a feature selection method by selecting non-correlated features with the highest information value. The method uses features weights of evidence to select a representative within cliques from graph theory. The prognosis study analyzed the records from 120,354 hospital admissions at the Hospital Alma Mater de Antioquia between January 2017 and March 2022. After a cleaning process the dataset was split into training (67%),…
Abstract:Prolonged length of stay (pLoS) is a significant factor associated with the risk of adverse in-hospital events. We develop and explain a predictive model for pLos using admission-level patient and hospital administrative data. The approach includes a feature selection method by selecting non-correlated features with the highest information value. The method uses features weights of evidence to select a representative within cliques from graph theory. The prognosis study analyzed the records from 120,354 hospital admissions at the Hospital Alma Mater de Antioquia between January 2017 and March 2022. After a cleaning process the dataset was split into training (67%), test (22%), and validation (11%) cohorts. A logistic regression model was trained to predict the pLoS in two classes: less than or greater than 7 days. The performance of the model was evaluated using accuracy, precision, sensitivity, specificity, and AUC-ROC metrics. The feature selection method returns nine interpretable variables, enhancing the models’ transparency. In the validation cohort, the pLoS model achieved a specificity of 0.83 (95% CI, 0.82-0.84), sensitivity of 0.64 (95% CI, 0.62-0.65), accuracy of 0.76 (95% CI, 0.76-0.77), precision of 0.67 (95% CI, 0.66-0.69), and AUC-ROC of 0.82 (95% CI, 0.81-0.83). The model exhibits strong predictive performance and offers insights into the factors that influence prolonged hospital stays. This makes it a valuable tool for hospital management and for developing future intervention studies aimed at reducing pLoS.
| Comments: | 23 pages, 6 figures |
| Subjects: | Machine Learning (cs.LG) |
| MSC classes: | 68T05 |
| ACM classes: | I.2 |
| Cite as: | arXiv:2601.04449 [cs.LG] |
| (or arXiv:2601.04449v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2601.04449 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Daniel Sierra [view email] [v1] Wed, 7 Jan 2026 23:35:24 UTC (664 KB)