A Comparative Analysis of Machine Learning Models for the Detection of Undiagnosed Diabetes Patients

Publication: Contribution to journal, Journal article, Research, Peer-reviewed

Abstract

Introduction: Early detection of type 2 diabetes is essential for preventing long-term complications. However, screening the entire population for diabetes is not cost-effective, so identifying individuals at high risk for this disease is crucial. The aim of this study was to compare the performance of five diverse machine learning (ML) models in classifying undiagnosed diabetes using large heterogeneous datasets. Methods: We used data from several years of the National Health and Nutrition Examination Survey (NHANES), from 2005 to 2018, to identify people with undiagnosed diabetes. The dataset included 45,431 participants, and biochemical confirmation of glucose control (HbA1c) was used to identify undiagnosed diabetes. The predictors were based on simple and clinically obtainable variables, which could be feasible for prescreening for diabetes. We included five ML models for comparison: random forest, AdaBoost, RUSBoost, LogitBoost, and a neural network. Results: The prevalence of undiagnosed diabetes was 4%. For the classification of undiagnosed diabetes, the area under the ROC curve (AUC) values were between 0.776 and 0.806. The positive predictive values (PPVs) were between 0.083 and 0.091, the negative predictive values (NPVs) were between 0.984 and 0.99, and the sensitivities were between 0.742 and 0.871. Conclusion: We have demonstrated that several types of classification models can accurately classify undiagnosed diabetes from simple and clinically obtainable variables. These results suggest that the use of machine learning for prescreening for undiagnosed diabetes could be a useful tool in clinical practice.
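
The low PPVs and high NPVs reported above are what Bayes' rule predicts for any screening test at a prevalence of about 4%. The following is a minimal sketch of that arithmetic, not the authors' code. The function name `predictive_values` is hypothetical, and the specificity of 0.65 is an assumed illustrative value, since the abstract does not report specificity. Only the 4% prevalence comes from the abstract, and the sensitivity of 0.80 is picked from within the reported range.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a binary screening test via Bayes' rule.

    All three inputs are proportions in [0, 1].
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    # Expected share of the screened population in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)

    # Guard against empty denominators (e.g. a test that never flags anyone).
    flagged = true_pos + false_pos
    cleared = true_neg + false_neg
    ppv = true_pos / flagged if flagged > 0 else float("nan")
    npv = true_neg / cleared if cleared > 0 else float("nan")
    return ppv, npv


if __name__ == "__main__":
    # Prevalence is from the abstract; sensitivity lies within the reported
    # range; specificity is an ASSUMED value used purely for illustration.
    ppv, npv = predictive_values(sensitivity=0.80, specificity=0.65,
                                 prevalence=0.04)
    print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these assumed inputs, most positive flags are false positives because people without the disease greatly outnumber those with it, while a negative result is almost always correct. That is why a prescreening model of this kind suits ruling out low-risk individuals before confirmatory HbA1c testing.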
Original language: English
Journal: Diabetology
Volume: 5
Issue number: 1
Pages (from-to): 1-11
Number of pages: 11
ISSN: 2673-4540
DOI
Status: Published - 3 Jan 2024
