Oversampling techniques for predicting COVID-19 patient length of stay

Computer Science > Machine Learning

arXiv:2511.15048 (cs)

COVID-19 e-print

Important: e-prints posted on arXiv are not peer-reviewed by arXiv; they should not be relied upon without context to guide clinical practice or health-related behavior and should not be reported in news media as established information without consulting multiple experts in the field.

Abstract: COVID-19 is a respiratory disease that emerged in 2019 and caused a global pandemic. It is highly infectious, and its symptoms include fever or chills, cough, shortness of breath, fatigue, muscle or body aches, headache, new loss of taste or smell, sore throat, congestion or runny nose, nausea or vomiting, and diarrhea. These symptoms vary in severity; some people with many risk factors have lengthy hospital stays or die from the disease. In this paper, we analyze patients’ electronic health records (EHR) to predict the severity of their COVID-19 infection, using length of stay (LOS) as our measure of severity. This is an imbalanced classification problem, because far more patients have a short LOS than a long one. To address the imbalance, we synthetically create alternative oversampled training data sets. We then train an Artificial Neural Network (ANN) on the oversampled data, tuning its hyperparameters with Bayesian optimization. We select the model with the best F1 score, then evaluate and discuss it.
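
The abstract does not say which oversampling technique is used, so the following is only an illustrative sketch, not the authors' method. It assumes a SMOTE-style approach, in which synthetic minority-class rows are created by interpolating between a row and one of its nearest same-class neighbours, and it adds a plain F1 computation of the kind used for model selection. The function names `smote_like_oversample` and `f1_score`, the parameter `k`, and the toy data are all hypothetical; the ANN and the Bayesian hyperparameter search are omitted.

```python
import math
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Balance classes by interpolating between same-class neighbours.

    X is a list of numeric feature lists and y the matching class labels.
    Returns (X_out, y_out), where every class has as many rows as the
    largest class. This is an assumed, simplified SMOTE-style scheme.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)

    for label, n in counts.items():
        rows = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            base = rng.choice(rows)
            others = [r for r in rows if r is not base]
            if not others:
                # A single-row class can only be duplicated.
                X_out.append(list(base))
                y_out.append(label)
                continue
            # Pick one of the k nearest same-class neighbours of `base`.
            neighbours = sorted(others, key=lambda r: math.dist(r, base))[:k]
            mate = rng.choice(neighbours)
            t = rng.random()  # position on the segment from base to mate
            X_out.append([b + t * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: label 0 stands for a short LOS (majority), 1 for a long LOS.
X = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0], [10.0, 12.0], [11.0, 13.0]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = smote_like_oversample(X, y)
print(Counter(y_bal))  # both classes now have 4 rows
```

In a setup like this, only the training split would be oversampled; the F1 score would be computed on untouched held-out data so that synthetic rows do not inflate the estimate.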


Comments:	10 pages, 2022 IEEE International Conference on Big Data (Big Data)
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2511.15048 [cs.LG]
	(or arXiv:2511.15048v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2511.15048 arXiv-issued DOI via DataCite (pending registration)
Journal reference:	2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17-20 December 2022
Related DOI:	https://doi.org/10.1109/BigData55660.2022.10020253 DOI(s) linking to related resources

Submission history

From: Zachariah Farahany B
[v1] Wed, 19 Nov 2025 02:38:10 UTC (192 KB)
