Stratified analysis of the age-related waist circumference cut-off model for the screening of dysglycemia at zero-cost
Introduction
Developing screening models that identify subjects with early dysglycemia is an important task: it helps to prevent the development of manifest diabetes (Siu and U.S. Preventive Services Task Force, 2015; American Diabetes Association, 2020), to reduce the incidence of several fatal and nonfatal diabetes complications (Lindstrom and Tuomilehto, 2003), and to save the enormous costs that full-blown diabetic disease imposes on the health system (Centers for Disease Control and Prevention, 2020).
Our recently published dysglycemia model (Buccheri et al., 2021) fits into this framework. The model, developed with the support of the machine learning (ML) software Brain Project (BP) (Russo, 2016, 2020), predicts dysglycemia status in a sample of the American adult population using only two zero-cost variables that require no laboratory tests: the waist circumference and the age of an individual. As pointed out in Buccheri et al. (2021), waist circumference is usually related to an individual's visceral fat, which points to a strong correlation between obesity and dysglycemia. This result agrees with other important evidence previously published in the literature on the genetic predisposition to developing diabetes in patients affected by obesity (Sheikhpour et al., 2020) and on the major risk of diabetes and cardiovascular disease in the various obesity types (e.g. central obesity, visceral obesity) (Miklishanskaya et al., 2021).
Despite its simplicity, the model performs similarly to more complex models developed previously. It is therefore well suited to large-scale screening of dysglycemia in the adult US population (Buccheri et al., 2021). The original model, which consists of an age-related waist circumference cut-off, has so far been validated exclusively on a dataset representative of the general US population. However, a detailed analysis in different sex and ethnic groups, which can be considered intrinsic factors in the genesis of dysglycemia (American Diabetes Association, 2020), is an important step towards fully validating the model. For example, it is known that the risk of developing dysglycemia increases more with age in female than in male individuals (Huebschmann et al., 2019), and that certain ethnic groups have a greater genetic predisposition to altered blood sugar than others (Gurka et al., 2013).
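As a rough illustration of what an age-related waist circumference cut-off looks like in practice, the sketch below flags an individual when the measured waist reaches an age-dependent threshold. The linear form of the threshold, the class name `AgeRelatedCutoff`, and both coefficients are assumptions made purely for this example; they are not the published model of Buccheri et al. (2021), whose functional form and parameters are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeRelatedCutoff:
    """Hypothetical screening rule: flag when waist >= cutoff(age).

    The linear dependence on age is an assumption for illustration only.
    """
    intercept_cm: float
    slope_cm_per_year: float

    def cutoff(self, age_years: float) -> float:
        # Age-dependent waist threshold, in centimetres.
        return self.intercept_cm + self.slope_cm_per_year * age_years

    def score(self, age_years: float, waist_cm: float) -> float:
        # Continuous score: how far the waist lies above the cut-off.
        return waist_cm - self.cutoff(age_years)

    def flags(self, age_years: float, waist_cm: float) -> bool:
        # Binary screening outcome obtained from the two zero-cost inputs.
        return self.score(age_years, waist_cm) >= 0.0


# Placeholder coefficients, NOT the values of the published model.
rule = AgeRelatedCutoff(intercept_cm=120.0, slope_cm_per_year=-0.5)
```

Exposing a continuous `score` alongside the binary outcome is a design choice of this sketch: a continuous score is what makes a ROC analysis of such a rule possible.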
This paper investigates the stratified performance of the model in different sex and ethnic groups of the US population, an essential step in establishing a zero-cost and easy-to-use model. We also evaluate the predictive performance separately on prediabetes and diabetes and discuss the calibration of the model. The present work thus complements the findings of our previous work (Buccheri et al., 2021) through a stratified analysis of the model accuracy.
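A stratified accuracy analysis of this kind can be sketched as computing the area under the ROC curve (AUC) separately within each subgroup. The example below is a minimal sketch under stated assumptions: the record layout `(group, score, label)` and all function names are hypothetical, and the percentile bootstrap is only one common way to obtain a 95% confidence interval, since the text does not state which method was used.

```python
import random
from collections import defaultdict


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Pairwise comparison is O(n^2): fine for a sketch, slow for large samples.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    auc(scores, labels)  # fail early if the sample contains a single class
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


def stratified_auc(records, n_boot=500, seed=0):
    """records: iterable of (group, score, label) tuples with label in {0, 1}."""
    by_group = defaultdict(lambda: ([], []))
    for group, score, label in records:
        by_group[group][0].append(score)
        by_group[group][1].append(label)
    return {
        group: (auc(s, y), bootstrap_ci(s, y, n_boot=n_boot, seed=seed))
        for group, (s, y) in by_group.items()
    }
```

Here `group` could hold a sex or ethnicity category and `label` the dysglycemia status; with survey data such as NHANES, sampling weights would also need to be considered, which this sketch ignores.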
Study population (NHANES, 2007–2016)
The population used to test the model was derived from data collected in successive NHANES cycles over 10 years (2007–2016). NHANES is a continuous collection of cross-sectional data conducted by the National Center for Health Statistics of the Centers for Disease Control and Prevention (CDC). Through its surveys, NHANES provides a representation of the US population, including its typical ethnic diversity. NHANES reports the health and nutritional status of adults and children in the US, thus
Results
We have evaluated the performance of the model (Buccheri et al., 2021) separately for female and male individuals. Results of this sex-stratified analysis are described by the ROC curves shown in Fig. 1. We obtained AUC = 0.69–0.71 (95% C.I.) for male individuals and AUC = 0.75–0.78 (95% C.I.) for female individuals. The optimal trade-off between sensitivity and specificity, obtained with the procedure described in (Buccheri et al., 2021), turned out to be fairly similar for both sex groups,
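The trade-off between sensitivity and specificity mentioned here can be illustrated by scanning candidate thresholds on a continuous score. The sketch below selects the threshold that maximizes Youden's J (sensitivity + specificity − 1); this criterion is an assumption swapped in for illustration, since the actual procedure is the one described in Buccheri et al. (2021) and is not reproduced here. The function names are hypothetical.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when 'score >= threshold' means a positive test."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    Youden's J is an assumed criterion, not necessarily the published one.
    """
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate
        sens, spec = sens_spec(scores, labels, t)
        j = sens + spec - 1.0
        if best is None or j > best[0]:  # ties keep the lowest threshold
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]
```

Running `best_threshold` separately on the female and male subsets would give one operating point per sex group, which is how the similarity of the two trade-offs could be checked.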
Discussion
The sub-analysis of the model highlights several aspects of scientific interest.
The accuracy of our model, even in the lower-accuracy sex group, is still comparable to the overall accuracy achieved by state-of-the-art models (American Diabetes Association, 2020; Lindstrom and Tuomilehto, 2003). One could speculate that the slightly different accuracy in the two sex groups is possibly due to the weaker correlation of non-laboratory
Conclusions
In conclusion, the early identification of individuals with dysglycemia is key to preventing type 2 diabetes as well as several related fatal and nonfatal complications. To help prevent type 2 diabetes in the US population, we published a new, very simple model that is well suited to large-scale zero-cost screening of dysglycemia. The model had previously been validated exclusively on the general US population, which did not give a complete picture of its performance.
CRediT authorship contribution statement
Enrico Buccheri: Original idea, literature search, Study design, Data collection, Data interpretation, Writing – original draft. Daniele Dell’Aquila: Original idea, literature search, Study design, Data collection, Formal analysis, Data interpretation, Writing – original draft. Marco Russo: Original idea, Study design, Data collection, Formal analysis, Data interpretation, Writing - critical review.
Declarations of competing interest
None.
Acknowledgments
Data used in this study were collected by the National Health and Nutrition Examination Survey (NHANES) and are freely and publicly available on the website of the National Center for Health Statistics of the Centers for Disease Control and Prevention (CDC). D.D. acknowledges funding support from the Italian Ministry of Education, University and Research (MIUR) through the “PON Ricerca e Innovazione 2014–2020, Azione I.2 A.I.M., D.D. 407/2018”.
References (23)
- et al. Artificial intelligence in health data analysis: the Darwinian evolution theory suggests an extremely simple and zero-cost large-scale screening tool for prediabetes and type 2 diabetes. Diabetes Res. Clin. Pract. (2021)
- et al. Neuro-genetic programming for multigenre classification of music content. Appl. Soft Comput. (2020)
- et al. Age-related and disease-related muscle loss: the effect of diabetes, obesity, and other diseases. Lancet Diabetes Endocrinol (2014)
- et al. Types of obesity and their prognostic value. Obesity Med. (2021)
- A distributed neuro-genetic programming tool. Swarm. Evol. Comput. (2016)
- et al. Genetic programming for photovoltaic plant output forecasting. Sol. Energy (2014)
- et al. The Interaction between gene profile and obesity in type 2 diabetes: a review. Obesity Med. (2020)
- 2. Classification and diagnosis of diabetes: standards of medical care in diabetes–2020. Diabetes Care (2020)
- National Diabetes Statistics Report (2020)
- Centers for Disease Control and Prevention. National Health and Nutrition Examination Survey....
- A combined strategy of feature selection and machine learning to identify predictors of prediabetes. J. Am. Med. Inf. Assoc.