Type 2 diabetes mellitus accounts for 90% of the incidence of diabetes and is considered by many as the 21st century epidemic disease1 because of its increasing prevalence, inherent complications, associated deaths, and exorbitant costs to healthcare systems2 such as the Portuguese National Health Service. There is a growing demand for prevention, early diagnosis, and effective treatment of this illness3.
Diabetes affects an estimated 9.3% (463 million) of the worldwide adult population and contributes to almost 5 million deaths per year. Its prevalence has been increasing over the last three decades, and by 2040 the disease is expected to affect about 700 million people3. Additionally, in 2010, diabetes caused 680 disability-adjusted life years (DALYs) per 100 000 inhabitants worldwide, one of the highest DALY values registered4.
The Azores, officially the Autonomous Region of the Azores, is an autonomous region of Portugal consisting of nine volcanic islands in the North Atlantic5. According to a 2014 Azores regional health inquiry, the prevalence of diabetes is 9.9% among the population between 20 and 74 years old residing in the Azores (9.8% of women and 10% of men)6. Furthermore, 66% of individuals with diabetes also have hypertension, while 22% and 34% are considered obese and pre-obese, respectively. In addition, a 2009 PREVADIAB (Prevalência da Diabetes em Portugal) study reported a prevalence of 14.3% of diabetes in the Azorean population: 9.2% with an established diagnosis and 5.1% with undiagnosed diabetes6.
It is crucial to identify both individuals with undiagnosed diabetes (who are already counted in the reported prevalence of diabetes) and those at risk of developing the disease, who will contribute to an incremental increase in its prevalence in the near future. This justifies the need for new prevention and early diagnosis strategies7.
Type 2 diabetes diagnosis was, for many years, subjective and not systematized8. Currently, the diagnostic criteria for type 2 diabetes are based on fasting plasma glucose levels (≥126 mg/dL), 2-hour plasma glucose levels (≥200 mg/dL 2 hours after the ingestion of 75 g anhydrous glucose in an oral glucose tolerance test), glycated hemoglobin (≥6.5%), and random plasma glucose levels (≥200 mg/dL if associated with diabetes symptoms)9.
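The four laboratory criteria above can be expressed as a simple rule check. The following is a minimal illustrative sketch, not a clinical tool; the function name and parameter names are assumptions introduced here, and a missing measurement is simply treated as "criterion not evaluated".

```python
from typing import Optional


def meets_diagnostic_criteria(
    fasting_glucose_mg_dl: Optional[float] = None,
    ogtt_2h_glucose_mg_dl: Optional[float] = None,
    hba1c_percent: Optional[float] = None,
    random_glucose_mg_dl: Optional[float] = None,
    has_symptoms: bool = False,
) -> bool:
    """Return True if at least one of the four criteria in the text is met.

    Thresholds are the ones quoted in the paragraph above; a value of None
    means the test was not performed (hypothetical convention).
    """
    checks = [
        # Fasting plasma glucose >= 126 mg/dL
        fasting_glucose_mg_dl is not None and fasting_glucose_mg_dl >= 126,
        # 2-hour plasma glucose >= 200 mg/dL after a 75 g oral glucose load
        ogtt_2h_glucose_mg_dl is not None and ogtt_2h_glucose_mg_dl >= 200,
        # Glycated hemoglobin >= 6.5%
        hba1c_percent is not None and hba1c_percent >= 6.5,
        # Random plasma glucose >= 200 mg/dL, only if symptoms are present
        random_glucose_mg_dl is not None
        and random_glucose_mg_dl >= 200
        and has_symptoms,
    ]
    return any(checks)
```

For example, a random plasma glucose of 250 mg/dL satisfies the rule only when `has_symptoms` is true, mirroring the condition attached to that criterion.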
Identifying risk factors associated with a greater likelihood of developing type 2 diabetes is extremely important for three reasons: some asymptomatic patients already fulfil the laboratory criteria, epidemiological studies show that type 2 diabetes may be present up to 10 years before diagnosis, and many patients already have diabetes-specific complications when the diagnosis is established10.
Several risk factors for type 2 diabetes are already known: family history (parents or siblings) of type 2 diabetes, obesity, sedentary lifestyle, specific ethnicities, history of fasting plasma glucose levels >100 mg/dL, decreased glucose tolerance or glycated hemoglobin of 5.7–6.4%, history of gestational diabetes or delivery of a newborn weighing >4 kg, hypertension, high-density lipoprotein cholesterol levels <35 mg/dL or triacylglycerol (the main constituent of body fat in humans) levels >250 mg/dL, history of polycystic ovary syndrome or of acanthosis nigricans, and history of cardiovascular disease7,9,11,12. Other risk factors should also be considered: age, abdominal circumference (measured below the ribs), a diet rich in carbohydrates and saturated fats and poor in fruits and vegetables, smoking, low birth weight, low socioeconomic level, and regular medication with corticosteroids or antipsychotics.
It is estimated that the combined effect of these risk factors contributes to an 80% increase in type 2 diabetes risk7. All these risk factors can be used to establish risk predictor models for type 2 diabetes. There are various models designed for this purpose: the Finnish Diabetes Risk Score (FINDRISC)13, which is recommended by the American Diabetes Association and is the most used model in Portugal; the Framingham Offspring Study (FOS)14, which was once the most widely used model in the world; the Leicester Practice Risk Score (LPRS)15, a European model used mainly in the UK and recommended by the National Institute for Health and Care Excellence; and the German Diabetes Risk Score (GDRS)16, a model mainly used in Germany but also recommended by the American Diabetes Association. Nevertheless, most of the risk factors differ between models (see Discussion). For instance, only age, anthropometric factors and history of hypertension are common to all four international models.
It is crucial to evaluate all models to understand which are useful, applicable to the population, reliable, and effective. Usually, the evaluation of risk predictor models is based on their objectives, covariates, methodology, sample size, selection criteria, and model performance (evaluated by internal and/or external validation). A good risk predictor model should have well-defined objectives, proven validation, and good performance, reliability, and clinical utility7.
Current guidelines do not indicate the need for a risk predictor model. Nevertheless, it seems that new models will be developed that enable individuals to estimate their own risk of developing type 2 diabetes and that allow high-risk areas (‘hotspots’) to be delineated, which will be of interest in the field of public health17.
This study aimed to develop AzoresDiab, a type 2 diabetes risk predictor model specific to the Azorean population, which is highly affected by insularity and consanguineous marriages and has developed unique genetic and health characteristics, as would be expected according to Rudan18.
To develop the AzoresDiab model, the authors used anonymous patient-level data from 2013 and 2014 that related to primary care services in all the Azores islands provided by the entity that manages the health system in the Azores (Saudaçor SA). In total, data from 272 705 patients (total population) were supplied. A patient was defined as having type 2 diabetes if they had an International Classification of Primary Care (ICPC-2) code of T90 in the data or had at least one prescription of an antidiabetic drug listed in the Portuguese Guidelines for Diabetes Clinical Management12.
As there is no consensus regarding the best methodology for imputing missing values in health data19, and because pediatric populations have specific and different clinical patterns and assumptions related to type 2 diabetes, two exclusion criteria were established: patients with missing covariate values and patients aged <18 years. After these criteria were applied, 6834 individuals were eligible for inclusion in the development of the risk model.
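The two exclusion criteria amount to a complete-case filter restricted to adults. A minimal sketch is given below; the record layout and the covariate names in `REQUIRED_COVARIATES` are hypothetical placeholders, not the actual fields of the Saudaçor SA dataset.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]

# Hypothetical covariate names, used only for illustration.
REQUIRED_COVARIATES = ("age", "bmi", "glucose", "triacylglycerol",
                       "hypertension", "cvd")


def is_eligible(record: Record) -> bool:
    """Apply both exclusion criteria to one patient record."""
    # Criterion 1: exclude patients with any missing covariate value.
    if any(record.get(name) is None for name in REQUIRED_COVARIATES):
        return False
    # Criterion 2: exclude patients younger than 18 years.
    return record["age"] >= 18


def apply_exclusions(records: List[Record]) -> List[Record]:
    """Return only the records that pass both criteria."""
    return [r for r in records if is_eligible(r)]
```

Because no imputation is attempted, a record missing even one covariate is dropped entirely, which is why the eligible sample is much smaller than the total population supplied.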
The AzoresDiab model is based on binary logistic regression20 (the dependent variable is 0 if the person does not have type 2 diabetes and 1 if the person has type 2 diabetes) to describe the individual probabilistic risk of type 2 diabetes development, as suggested by Iezzoni19. To avoid confounding effects, a significance level (α) of 0.05 was used as the reference for the independent variables21, and these variables were also included in an interaction model20. The independent variables included in the AzoresDiab model were those available and collected by Saudaçor SA (Table 1), and their connection with type 2 diabetes was supported by the study by Kasper et al9 and by two review articles7,22. The model was internally validated by the bootstrapping method as described by Efron and Tibshirani23, and the ability of the model to discriminate between the two outcomes was calculated using the area under the curve (AUC)24. All statistical analyses were conducted using SPSS v23 (www.spss.com).
The databases used were approved by the National Data Protection Commission, and the authors worked only with anonymous data.
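The study itself was run in SPSS. Purely as an illustration of the two building blocks named above, the sketch below fits a binary logistic regression by plain gradient ascent and computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names, the learning-rate settings, and the toy data are assumptions made for this example and do not reproduce the authors' procedure.

```python
import math
from typing import List, Sequence


def logistic(a: float) -> float:
    """Map a linear predictor a to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Fit weights [intercept, w1, ...] by gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            a = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - logistic(a)  # residual on the probability scale
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Predicted probability of the positive class for one observation."""
    return logistic(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: one covariate, outcome loosely increasing with it (hypothetical).
X_toy = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_toy = [0, 0, 1, 0, 1, 1]
w_toy = fit_logistic(X_toy, y_toy)
auc_toy = auc(y_toy, [predict(w_toy, x) for x in X_toy])
```

The pairwise definition of the AUC is quadratic in the sample size; it is used here for clarity rather than speed.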
The background characteristics of the study population are presented in Table 2.
The probability of an inhabitant of the Azores developing type 2 diabetes is based on the following expression:
where a is given by the following equation:
where BMI is the body mass index, CVD is the history of cardiovascular disease, Gli is the glucose level, HTA is the history of hypertension, and Tri is the triacylglycerol level.
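As a hedged illustration of the structure these expressions take, a binary logistic model has the general form shown below. The coefficients $\beta_0, \dots, \beta_5$ are symbolic placeholders, not the fitted AzoresDiab values, and the actual linear predictor may contain further terms (for example, other covariates or interaction terms) that are not listed in the legend above.

```latex
P(\text{type 2 diabetes}) = \frac{1}{1 + e^{-a}} = \frac{e^{a}}{1 + e^{a}},
\qquad
a = \beta_0 + \beta_1\,\mathrm{BMI} + \beta_2\,\mathrm{CVD}
  + \beta_3\,\mathrm{Gli} + \beta_4\,\mathrm{HTA} + \beta_5\,\mathrm{Tri} + \dots
```

In this form, $a = 0$ corresponds to a probability of 0.5, and each unit increase in a covariate multiplies the odds of type 2 diabetes by $e^{\beta_k}$.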
Following internal validation, the model described above had an AUC of 0.863, with a 95% confidence interval of 0.854–0.872; this shows excellent discrimination of individuals with and without type 2 diabetes20.
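The text does not state how the 95% confidence interval was obtained. One common approach, shown here only as an assumption-laden sketch, is a percentile bootstrap: resample patients with replacement, recompute the AUC each time, and take the 2.5th and 97.5th percentiles. The function names, the number of resamples, and the seed are illustrative choices.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels: Sequence[int], scores: Sequence[float],
                     n_resamples: int = 1000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (illustrative sketch)."""
    if len(set(labels)) < 2:
        raise ValueError("both outcome classes must be present")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue  # AUC is undefined without both classes; draw again
        stats.append(auc(resampled_labels, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

With several thousand patients the interval is narrow, which is consistent with the width of about 0.02 reported above; on small samples the same procedure gives much wider intervals.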
Table 3 shows the significance of the variables included in the model, and Figure 1 presents the receiver operating characteristic curve.
Early diagnosis of type 2 diabetes is crucial to prevent complications and create personalized diagnosis strategies for each individual, particularly in remote and rural populations such as those in the Azores islands. With this in mind, the AzoresDiab model was developed to discriminate individuals with type 2 diabetes from those without type 2 diabetes; the model had an excellent AUC of 0.863.
The AzoresDiab model has similarities with and differences from four other international models used to predict type 2 diabetes: FINDRISC, FOS, LPRS and GDRS (Table 4).
All four previous models include age, anthropometric factors, and history of hypertension. Nevertheless, AzoresDiab is the most inclusive model because it includes all individuals aged 18 years or older. In contrast, FINDRISC13, FOS14 (45–64 years), GDRS16 (35–65 years), and LPRS15 (40–75 years) are more restricted models.
Changes in blood glucose level or oral glucose tolerance may be present without a definitive diagnosis of type 2 diabetes, indicating a prediabetic state that can be predicted by the LPRS model15. For this reason, the AzoresDiab model includes blood glucose levels, as do the FINDRISC13 and FOS14 models.
A history of CVD is a leading predictor of disability and death among patients with type 2 diabetes27. Of the five models compared, AzoresDiab is the only one to use a history of CVD as a covariate. Using this covariate is advantageous, particularly in cases where type 2 diabetes has not yet been diagnosed, because it increases the probability of identifying type 2 diabetes.
Larger samples are essential for constructing more robust models. The AzoresDiab model had the second-largest sample (6834), behind the GDRS16 study (25 167) and ahead of the LPRS15 (6186), FINDRISC13 (4746), and FOS14 (3140) studies.
All four previous models are based on prospective cohort studies, whereas the AzoresDiab model uses binary logistic regression with retrospective information. The different objectives of the models justify this discrepancy in methodologies. The AzoresDiab model aims to predict the probability of an individual having type 2 diabetes in a population with specific characteristics (remote and rural, with double insularity and consanguineous marriages), while the others are more general and can be used for the population of the whole world.
Regarding the sample selection criteria, all the international models exclude individuals with a previous diagnosis of type 2 diabetes. However, these individuals were not excluded from the AzoresDiab model because one of the purposes of this model was that undiagnosed individuals could use it to confirm their risk of having type 2 diabetes.
The four international models and the AzoresDiab model differ less in how type 2 diabetes is diagnosed than in their sample selection criteria: all five rely on a previously registered diagnosis of type 2 diabetes, laboratory criteria for the diagnosis of type 2 diabetes, or ongoing antidiabetic medication. Laboratory evidence is the most objective criterion. However, AzoresDiab uses the registered diagnosis of type 2 diabetes or the prescription of at least one of the antidiabetic drugs listed in the Portuguese Guidelines for Diabetes Clinical Management12. Most of these models are based on logistic regression, but it would also be interesting to study the possibility of calculating individual risk using Bayes' theorem and machine learning28–30.
The internally validated AUC in the present study of 0.863 was slightly above that of the other models (namely the FINDRISC13 and the FOS14 models), which obtained an internally validated AUC of 0.85. However, using the AzoresDiab model in the Azores still requires external validation before a proper comparison can be made with the other international models.
Besides the need for external validation, the AzoresDiab model has other limitations, notably that it is restricted to the variables collected by Saudaçor SA. Risk factors such as physical activity, dietary habits (FINDRISC and GDRS), cholesterol levels (FOS) and family history of type 2 diabetes (FINDRISC, FOS and LPRS) were not available in the dataset used. In addition, individuals with missing data were excluded, so the model may not represent the entire Azorean population. Nevertheless, as mentioned above, the AUC indicates that the model performed excellently in predicting type 2 diabetes, even without other relevant risk factors reported in the literature and used in the other four models.
Despite the extensive creation of predictive risk models for type 2 diabetes, there are still some limitations in studies determining the impact of these models on type 2 diabetes. The premise of the AzoresDiab model is to meet the need for a model that can be used by any individual, even a layperson, to measure their risk of developing type 2 diabetes.
The use of predictive risk models will enable the early implementation of disease prevention programs in medium- and high-risk individuals. However, a high percentage of people are not interested in initiating prevention strategies despite having a high risk of developing the disease. Predictive risk models will also allow the delineation of groups of the population (local communities) with a medium and a high risk for type 2 diabetes; this will allow public health policies to be implemented to prevent the onset of the disease in these populations.
The objective of AzoresDiab was to predict type 2 diabetes in any individual while considering the epidemiological and geographical context in which it was developed: the remote and rural Azores islands. AzoresDiab also considers the most relevant risk factors for the development of type 2 diabetes. Using diagnostic data from primary health care and records of prescriptions of antidiabetic drugs listed in the Portuguese Guidelines for Diabetes Clinical Management, the model achieved an internally validated AUC of 0.863, which shows its excellent discriminatory power. The next step will be the external validation of the AzoresDiab model to confirm this discriminatory power.
The authors would like to thank Ana Raquel Santos and Luís Parreira from Saudaçor SA, and Helena R. Pereira.