Effects of Hypertension, Diabetes, and Smoking on Age and Sex Prediction from Retinal Fundus Images

Retinal fundus images are used to detect organ damage from vascular diseases (e.g. diabetes mellitus and hypertension) and to screen for ocular diseases. We aimed to assess convolutional neural network (CNN) models that predict age and sex from retinal fundus images in normal participants and in participants with underlying systemic vascular-altered status. In addition, we sought clues regarding the differences between normal ageing and pathologic vascular changes using the CNN models. We developed CNN age and sex prediction models using 219,302 fundus images from normal participants without hypertension, diabetes mellitus (DM), or any smoking history. The trained models were assessed in four test-sets with 24,366 images from normal participants, 40,659 images from participants with hypertension, 14,189 images from participants with DM, and 113,510 images from smokers. The CNN model accurately predicted age in normal participants; the coefficient of determination between predicted age and chronologic age was R² = 0.92, and the mean absolute error (MAE) was 3.06 years. MAEs in the test-sets with hypertension (3.46 years), DM (3.55 years), and smoking (2.65 years) were similar to that of normal participants; however, R² values were relatively low (hypertension, R² = 0.74; DM, R² = 0.75; smoking, R² = 0.86). In subgroups of participants over 60 years, the MAEs increased to above 4.0 years and the accuracies declined in all test-sets. Fundus-predicted sex demonstrated acceptable accuracy (area under curve > 0.96) in all test-sets. Retinal fundus images from participants with underlying vascular-altered conditions (hypertension, DM, or smoking) showed similar MAEs but lower coefficients of determination (R²) between the predicted age and chronologic age, suggesting that the ageing process and pathologic vascular changes exhibit different features.
Our models demonstrated the most improved performance yet and provided clues to the relationship and difference between ageing and pathologic changes from underlying systemic vascular conditions. In the process of fundus change, systemic vascular diseases are thought to have a different effect from ageing.

Research in context
Evidence before this study. The human retina and optic disc change continuously with ageing, and they share physiologic or pathologic characteristics with the brain and systemic vascular status. As retinal fundus images provide high-resolution in-vivo images of retinal vessels and parenchyma without any invasive procedure, they have been used to screen for ocular diseases and have attracted significant attention as a predictive biomarker for cerebral and systemic vascular diseases. Recently, deep neural networks have revolutionised the field of medical image analysis, including retinal fundus images, and have shown reliable results in predicting age, sex, and the presence of cardiovascular diseases.
Added value of this study. This is the first study demonstrating how a convolutional neural network (CNN) trained using retinal fundus images from normal participants measures the age of participants with underlying vascular conditions such as hypertension, diabetes mellitus (DM), or a history of smoking, using a large database, SBRIA, which contains 412,026 retinal fundus images from 155,449 participants. Our results indicated that the model accurately predicted age in normal participants, while the coefficients of determination (R²) in the test-sets with hypertension, DM, and smoking were relatively low. Additionally, a subgroup analysis indicated that mean absolute errors (MAEs) increased and accuracies declined significantly in subgroups of participants over 60 years of age, both in normal participants and in participants with vascular-altered conditions. These results suggest that the pathologic retinal vascular changes occurring in systemic vascular diseases are different from the changes in the spontaneous ageing process, and that the ageing process observed in retinal fundus images may saturate at about 60 years of age.
Implications of all available evidence. Based on this study and previous reports, a CNN can accurately and reliably predict age and sex using retinal fundus images. The fact that retinal changes caused by ageing and by systemic vascular diseases occur differently motivates a deeper understanding of the retina. After further development, deep learning-based fundus image reading may become a more useful and beneficial tool for screening and diagnosing systemic and ocular diseases.


Materials and Methods
Dataset Organisation. We used retinal fundus images from the Seoul National University Bundang Hospital Retina Image Archive (SBRIA) after de-identification of all information except age, sex, and underlying diseases at the study date; details are described in our previous studies 26,30. We included 412,026 retinal fundus images from 155,449 participants, obtained at the health promotion centre of Seoul National University Bundang Hospital (SNUBH) between June 1st, 2003, and June 30th, 2016, for which detailed information regarding the presence of hypertension, DM, and smoking status was available. Among them, there were 243,668 images from normal participants without hypertension, DM, or smoking history, 40,659 images from participants with hypertension, 14,189 images from participants with DM, and 113,510 images from participants with a smoking history. The number of participants and images in each group, and the number of participants and images with overlapping diseases, are shown in Supplementary Table S1. In addition, the number and percentage of retinal fundus images according to the age and sex of each test-set are shown in Supplementary Table S2. As the participants underwent fundus photography at intervals of at least 1 year, fundus photographs of the same participant captured on different dates were considered distinct from each other for age prediction. Retinal fundus images were acquired using various fundus cameras (Kowa VX-10, Kowa VX-10a, Kowa nonmyd7, Canon CF60Uvi, Canon CR6-45NM). This study was approved by the Institutional Review Board (IRB) of the SNUBH (IRB no. B-1703-386-103), and the requirement for informed consent was waived by the IRB. The study complied with the guidelines of the Declaration of Helsinki.
The fundus images were normalised using the z-score 31. This makes the classification results invariant to the intensities and colour contrasts of the images, thereby enabling the model to make predictions based solely on the shape configurations of the fundi. The black background of the fundus images was excluded from the normalisation.
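The normalisation step can be illustrated with a minimal sketch. It assumes that background pixels are identified by a simple intensity threshold and are set to zero after normalisation; the function name `zscore_normalise` and the threshold rule are hypothetical, since the source does not state how the background was detected or handled.

```python
from statistics import fmean, pstdev


def zscore_normalise(pixels, background_threshold=0.0):
    """Z-score a flat list of intensities while ignoring background pixels.

    Assumption: pixels at or below `background_threshold` belong to the
    black background. They are excluded from the mean and standard
    deviation and are mapped to 0 in the output.
    """
    foreground = [p for p in pixels if p > background_threshold]
    if not foreground:
        return [0.0] * len(pixels)
    mean = fmean(foreground)
    std = pstdev(foreground) or 1.0  # guard against a constant image
    return [
        (p - mean) / std if p > background_threshold else 0.0
        for p in pixels
    ]


# One colour channel flattened to a list; zeros stand for the black border.
channel = [0, 0, 50, 100, 150, 0]
normalised = zscore_normalise(channel)
print(normalised)
```

After this step the foreground pixels have zero mean and unit variance regardless of the overall brightness or contrast of the photograph, which is the invariance the text describes.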
The images from normal participants were randomly distributed by uniform sampling into a training-set, validation-set, and test-set at ratios of 89% (216,866), 1% (2,436), and 10% (24,366), respectively. We developed and trained the CNNs for the prediction of age and sex using the training-set from normal participants and verified the models using the validation-set during training. We assessed the performance of the models using the test-set of 24,366 normal images (normal test-set) and the groups of images from participants with hypertension, DM, and smoking history, separately. By performing the same number of every epoch time, we obtained the sample mean of the resulting prediction probabilities. Figure 1 shows a flow chart of the experimental setup.
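The split sizes quoted above can be reproduced with a short sketch. It assumes an image-level uniform shuffle in which the validation and test sizes are rounded down and the remainder goes to training; the function name `split_indices` and the seed are hypothetical and not taken from the source.

```python
import random


def split_indices(n_images, val_frac=0.01, test_frac=0.10, seed=0):
    """Shuffle image indices uniformly and cut them into three subsets."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_val = int(n_images * val_frac)    # rounded down
    n_test = int(n_images * test_frac)  # rounded down
    n_train = n_images - n_val - n_test
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 243,668 images from normal participants, as reported in the dataset section.
train, val, test = split_indices(243_668)
print(len(train), len(val), len(test))  # 216866 2436 24366
```

Under these rounding assumptions the three subset sizes match the reported counts and sum to the 243,668 normal images.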
Convolutional neural network (CNN) as regressor and classifier. Separate CNNs were trained for age and sex prediction. We used PyTorch (ver. 0.4.1, https://pytorch.org/) as the deep-learning library to implement the software to train, validate, and test the CNNs. The network adopted the structure of the residual network (ResNet), which had previously demonstrated high performance in a classification task 32. ResNet has five versions (ResNet-18, 34, 50, 101, and 152) that differ in depth. We chose ResNet-152 because deeper networks generally perform better. ResNet-152 has a total of 152 layers, configured as follows: 151 layers consist of sets of a convolution layer, batch normalization, and ReLU (activation function), followed by a fully connected last layer. Only the first convolution layer uses a 7×7 kernel with stride 2 and padding 3, and the kernel size of all subsequent convolution layers is 3×3. In addition, by setting stride 2 in four convolution layers, multiple scales can be considered without an additional pooling layer. As the backbone network, we used ResNet-152 pre-trained on general image classification with the ImageNet database. We used the parameters of the convolution layers of the pre-trained network as our initial values, except for the fully connected last layer. The pre-trained network had learned various features from the huge number of images in the ImageNet database. Therefore, transfer learning from the pre-trained network is faster than training from scratch and shows better performance 33. While the overall structure is similar, some details were modified to tailor the CNN to our data and desired output: the kernel size of the average pooling layer was changed from 7 to 16, and the output dimension of the fully connected layer was changed from 1000 to 1 (Fig. 2).
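The change of the average pooling kernel from 7 to 16 can be related to the input size with a small calculation. The sketch below assumes the common ResNet layout (a stride-2 7×7 first convolution, a stride-2 3×3 max-pool, and three later stages that each halve the resolution), which the source does not spell out in full. The 512-pixel input is a hypothetical value chosen because it yields a 16×16 final feature map; the source does not state the input resolution.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet_final_map(size):
    """Side length of the last feature map under the assumed layout."""
    size = conv_out(size, kernel=7, stride=2, padding=3)  # first 7x7 conv
    size = conv_out(size, kernel=3, stride=2, padding=1)  # stem max-pool
    for _ in range(3):  # three later stages, each with one stride-2 step
        size = conv_out(size, kernel=3, stride=2, padding=1)
    return size


print(resnet_final_map(224))  # 7, the ImageNet default pooling kernel
print(resnet_final_map(512))  # 16, matching the modified pooling kernel
```

A pooling kernel equal to the final feature-map size reduces each channel to a single value before the fully connected layer, which is why the kernel has to grow when the input image is larger than the ImageNet default.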
We define the unprocessed numerical output of the CNN as the predicted age, making the CNN for age prediction a direct regressor, while the output of the sigmoid function applied to the numerical output of the CNN is defined as the probability of the predicted sex, making the CNN for sex prediction a classifier. As the loss function for training and error calculation, the smooth L1 loss was adopted in the age prediction model because it combines the desirable properties of the L1 and L2 losses 34-36, and the binary cross-entropy was adopted in the sex prediction model for binary classification. In addition, Adam was used as the optimization scheme (learning rate = 1e-5, beta1 = 0.9, and beta2 = 0.999) 37. One epoch is defined as performing backpropagation once for all images in the training-set, and both CNNs for age and sex prediction were trained for 10 epochs. The CNNs were verified at each epoch using the validation-set; finally, the model with the best validation result was used as the prediction model.
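The two loss functions can be written out per sample as a minimal sketch. It assumes the usual smooth L1 definition with the transition at an absolute error of 1 (the parameter `beta`), which the source does not state, and a small `eps` added for numerical safety; the function names are illustrative only.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Quadratic (L2-like) for small errors, linear (L1-like) for large ones."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def binary_cross_entropy(logit, label, eps=1e-12):
    """Cross-entropy between a 0/1 label and the sigmoid of the raw output."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(prob + eps)
             + (1 - label) * math.log(1.0 - prob + eps))


print(smooth_l1(50.5, 50.0))         # 0.125 (quadratic branch)
print(smooth_l1(55.0, 50.0))         # 4.5   (linear branch)
print(binary_cross_entropy(0.0, 1))  # about 0.693, i.e. -log(0.5)
```

The linear branch limits the influence of images with large age errors, while the quadratic branch keeps the gradient smooth near zero error, which is the combination of L1 and L2 behaviour referred to in the text.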

Class activation map analysis.
To identify the regions used by the CNN, we applied the class activation map (CAM) technique to highlight the core regions on which the network focuses for age and sex prediction, via Bayesian approximation for the estimation of uncertainty in predictions 31,38,39.

Prediction of age and sex in retinal fundus images after inpainting retinal blood vessels.
To determine the effect of retinal blood vessels in retinal fundus images on the prediction of age and sex, we created vessel-erased images. First, the blood vessel region was extracted using the scale-space approximated CNN (SSANet), which was previously reported by our group and demonstrated state-of-the-art performance in retinal vessel segmentation 40. Subsequently, we used an inpainting technique, which naturally fills holes or selected regions in an image by extrapolation from the surrounding background. We applied the method previously reported by Telea et al. 41 to inpaint the vascular regions, that is, to essentially erase the vessels from the retinal image. All images of the training-set, validation-set, and test-set were first pre-processed identically to the original training-set and then reconstructed as stated above; our age and sex prediction models were then trained, validated, and tested again in the same manner as described above.

Statistical analysis.
To analyse the relationship between predicted age and chronologic age, the absolute errors and their distributions were obtained for each test-set (normal, hypertension, DM, and smoking). To further assess the performance of our age prediction model, we used linear regression to obtain the coefficients of determination (R²) and their 95% confidence intervals (CI). The mean absolute error (MAE) from the chronologic age was calculated for the test-sets and for age subsets divided by decade. Additionally, to obtain the accuracy of age prediction, each result was interpreted as correct if the predicted age was concordant with the chronologic age within error margins of ±1, ±3, and ±5 years. Analysis of variance (ANOVA) was conducted to compare the squared errors according to the underlying vascular conditions. Tukey's post-test was used for pairwise group comparisons. The area under the receiver operating characteristic curve (AUC) and its 95% CI were used to report the performance of sex prediction. All statistical analyses were performed using Python; in particular, 95% CIs were estimated by the bootstrap technique 42.

Results
Figure 3 presents a scatter plot of the linear relationship between chronologic age and predicted age in each test-set. Supplementary Fig. S1 presents a box plot of the squared errors ([predicted age - chronologic age]²) of the four groups. ANOVA showed a significant difference at the P < 0.05 level among the squared errors of the four underlying vascular conditions, and post hoc comparisons also indicated that the mean squared errors differed between vascular conditions (P < 0.05, Supplementary Table S3). The normal test-set demonstrated a fairly linear relationship (R² = 0.92; MAE, 3.06 years), whereas the R² values were lower (0.74, 0.75, and 0.86) and the MAEs were 3.46, 3.55, and 2.65 years in the hypertension, DM, and smoking test-sets, respectively.
Based on the assumption that age prediction was correct when the predicted and chronologic ages differed by no more than ±5 years, we observed 82.8% correct predictions in the normal test-set. The accuracies of prediction were 77.6%, 77.0%, and 85.6% in the hypertension, DM, and smoking test-sets, respectively. The accuracy of age prediction within ±1 year was less than 30% in all groups, and that within ±3 years was more than 55% in all groups (Table 2).
When each test-set was divided into age subgroups by decade, the MAEs in the age groups between 20 and 59 years were below 3.0 years in all test-sets. Table 3 shows a fairly consistent MAE and good accuracy under 60 years of age; however, the MAE increased and the accuracy declined in the groups aged above 60 years. As shown in Table 4, the accuracies of age prediction within ±5 years were more than 80% in categories 1 and 2, but less than 70% in category 3. Figure 4 shows the changes in the accuracy of age prediction according to chronologic age subgroups (divided by 10 years) for error margins of ±1, ±3, and ±5 years in the four test-sets. The accuracy was highest at 30-40 years of age in the normal and DM test-sets, and at 20-30 years in the hypertension and smoking test-sets. The accuracies of age prediction declined gradually with increasing age and deteriorated after 60 years of age. The accuracies of age prediction did not differ according to sex in any group. Representative images of the CAM heat-map of our age prediction model are shown in Fig. 5. The heat-map shows the regions with higher influence on the prediction results in red and the regions with lower influence in blue. The CAM of our age prediction model indicated activation primarily in the vascular region.
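The agreement measures used in the age analysis (MAE, R², accuracy within an error margin, MAE by decade, and a bootstrap 95% CI) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: R² is taken as the squared Pearson correlation from a simple linear regression, the bootstrap resamples individual images and uses the percentile method, and the six ages in the example are made up.

```python
import random
from statistics import fmean


def mae(pred, true):
    return fmean([abs(p - t) for p, t in zip(pred, true)])


def r_squared(pred, true):
    """R^2 of a simple linear regression (squared Pearson correlation)."""
    mp, mt = fmean(pred), fmean(true)
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in true)
    return cov * cov / (var_p * var_t)


def accuracy_within(pred, true, margin):
    """Fraction of predictions within +/- margin years of the true age."""
    hits = sum(1 for p, t in zip(pred, true) if abs(p - t) <= margin)
    return hits / len(true)


def mae_by_decade(pred, true):
    groups = {}
    for p, t in zip(pred, true):
        groups.setdefault(int(t) // 10 * 10, []).append(abs(p - t))
    return {decade: fmean(errs) for decade, errs in sorted(groups.items())}


def bootstrap_ci(pred, true, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling images with replacement."""
    rng = random.Random(seed)
    n = len(true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([pred[i] for i in idx], [true[i] for i in idx]))
    stats.sort()
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]


# Made-up ages for illustration only.
true_age = [25, 34, 47, 52, 63, 71]
pred_age = [26.0, 33.0, 49.5, 50.0, 58.0, 66.0]
print(mae(pred_age, true_age))                 # 2.75
print(accuracy_within(pred_age, true_age, 3))  # 4 of 6 within 3 years
print(mae_by_decade(pred_age, true_age))
print(bootstrap_ci(pred_age, true_age, mae))
```

Reporting MAE and R² together is useful because they measure different things: MAE summarises the typical size of the error, whereas R² reflects how much of the spread in chronologic age the prediction explains, so a test-set with a narrower age range can show a similar MAE but a lower R².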
The age prediction model trained on the vessel-erased images was assessed to obtain the MAE and accuracy on the vessel-erased images in all test-sets. The predicted age from the vessel-erased images showed a similar MAE of 3.19 years (Table 5) and similar changes with age (Fig. 4). In the retinal vessel-erased images, attention was still focused on the area where blood vessels had been present (Fig. 6).
Our sex prediction model was trained and validated using the same training-set and validation-set and was tested in the four test-sets; it demonstrated excellent accuracy, with an AUC of 0.97 in the normal test-set and similar AUCs of 0.96 or above in the other test-sets with underlying vascular conditions (hypertension, 0.96; DM, 0.96; smoking, 0.98). To confirm the significance of the fovea and retinal vessels in the prediction of sex, we generated inpainted images with the fovea and the retinal vessels erased, separately. The AUC was 0.881 (95% CI, 0.877-0.885) in the fovea-erased images and 0.682 (95% CI, 0.676-0.688) in the retinal vessel-erased images (Fig. 7). Representative images of the CAM heat-map of our sex prediction model are shown in Fig. 8. The CAM of our sex prediction model indicated various activations in the fovea, optic disc, and retinal vessels; in particular, the proximal vascular region was prominently activated in females.
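The AUC reported for sex prediction can be computed directly from the predicted probabilities. The sketch below uses the rank-based definition (the probability that a randomly chosen positive image scores higher than a randomly chosen negative one); which sex is coded as the positive class and the example scores are assumptions made for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half. The pairwise loop is quadratic in the number of
    images and is meant for clarity, not for large test-sets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical sigmoid outputs and labels (1 = positive class).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))  # 8 of 9 pairs are ordered correctly
```

Because the AUC depends only on the ranking of the scores, it does not require choosing a probability threshold for the final classification.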

Discussion
Historically, researchers have sought a relevant biomarker among medical images that can reflect chronologic ageing or biological ageing. Retinal fundus images are easy to capture, inexpensive, and non-invasive, and they provide high-resolution images of retinal blood vessels as well as of the retina and optic nerve, which change according to age and vary with sex. In this respect, we developed algorithms predicting chronologic age from retinal fundus images using deep neural networks, and the algorithms demonstrated fair performance in predicting age and sex. Interestingly, the performance was best in participants aged 20-40 years, decreased as age increased, and was poor in participants aged 60 years or over. In addition, the performance deteriorated with increasing age in participants with hypertension or DM. The sex prediction model demonstrated high accuracy and showed no inferiority in the test-sets with underlying disease.
CNNs have led to a series of breakthroughs in computer vision, including image classification, semantic segmentation, object detection, and bounding box regression 27,43-45. Network models such as VGG net, GoogLeNet, and ResNet have been widely used to date 46. Among them, ResNet introduced residual blocks to counter the performance degradation seen in very deep networks and resolved obstacles to convergence such as vanishing/exploding gradients 47-50. As a result, ResNet showed better performance than VGG net and GoogLeNet in deep network learning. ResNet has also demonstrated high performance in medical image recognition tasks such as retinal vessel segmentation 40. Therefore, we used ResNet to develop an age and sex prediction model using retinal fundus images.
Deep-learning-based age and sex estimation using retinal fundus images has been studied previously by another group. Poplin et al. reported the prediction of age, gender, and cardiovascular risk factors using fundus photograph data from UK Biobank and EyePACS 25. In the algorithm validation using the UK Biobank validation dataset (n = 12,026 patients), the MAE was 3.26 (95% CI, 3.22-3.31) and the R² was 0.74 (95% CI, 0.73-0.75). In the algorithm validation using the EyePACS-2K validation dataset (n = 999 patients), the MAE was 3.42 (95% CI, 3.23-3.61) and the R² was 0.82 (95% CI, 0.79-0.84). The MAE of the age prediction validation in our study was 3.06 (95% CI, 3.03-3.09) and the R² was 0.92, which is better than or comparable to the performance in the previous study. Although the datasets differ, the present study showed comparable results using a large number of fundus images from Korean participants. Cross-validation on each other's datasets may be required to confirm any superiority in accuracy.
Vascular risk-scoring methods use equations based on large cohort studies, such as the US Framingham Heart and Offspring Studies 51, the recent European Systematic Coronary Risk Evaluation project 52, and the German Prospective Cardiovascular Münster study 53. The Framingham score calculation, a well-known algorithm, considers age, sex, total and HDL cholesterol, systolic blood pressure, and smoking 51. Additionally, various novel biomarkers, such as intravascular ultrasound in left main coronary arteries 54, have been studied to predict vascular age and risk. Retinal images can also be used as a tool for such predictions, and this application has been demonstrated in our study using fundus photographs and the CNN algorithm. In addition, retinal vessels and cerebral vessels are similar, and direct in vivo observations of retinal vessels provide information about cerebral vessels 56,57. Therefore, efforts to understand cerebral and systemic blood vessels through retinal fundus images are worthwhile. Furthermore, retinal images will be even more interesting and valuable in systemic vascular diseases such as hypertension and DM. Our age prediction model demonstrated good accuracy in ages under 60 years; however, in ages 60 or over, the performance deteriorated significantly. This is because the ageing changes observed in the retinal fundus occur continuously until the age of 60 years and may saturate at approximately 60 years of age; after the age of 60 years, the ageing changes may no longer be obvious. The CAM images of our age prediction model demonstrated activation primarily in the vascular region, which may be further evidence of where the deep-learning CNN focuses; however, the similar results on the vessel-erased images showed that blood vessels were not the only feature used for prediction.
In other words, the optic disc, papillary vessels, and retinal parenchyma can also be used to predict age. Our research was inspired by the effects of ageing on the human fundus, beyond reconfirmation of the possibility or capability of age prediction by deep learning. Interestingly, our age prediction model demonstrated similar MAEs in each test-set categorised by underlying disease, implying that the pathologic changes occurring in systemic vascular diseases are different from the changes in the ageing process. In other words, our results demonstrated that systemic vascular conditions such as hypertension, DM, and smoking resulted in various inconsistent changes in the retinal blood vessels and in loosened relationships (decreased coefficient of determination, R²). In addition, statistical analysis using ANOVA and post hoc tests for the errors in each test-set showed significant differences in age prediction when an underlying disease was present.
The ageing process causes many changes in the retina and optic nerve, which consist of several layers including the RNFL, ILM, ONL, PRL, and RPE 8,58. Retinal vascular changes include a decrease in the cellularity of peripheral capillaries and a diminution of the number of capillaries of the fovea 8,59. Thickening and hyalinisation of the vessel wall and arteriosclerotic changes may also develop in retinal vessels with ageing 8. In addition, the density of the choriocapillaris, which provides nutritional support for the RPE and outer retina, decreases with ageing 60-62. In particular, the increase in flow deficit of the choriocapillaris is prominent in the central 1-mm circle of the macula 62. The fundus autofluorescence from lipofuscin of the RPE correlates with age and indicates age-dependent changes in the fovea 63. The reason for such changes is that the human body is constantly maintained through metabolism, and in this process, cells and tissues are damaged while by-products accumulate 2,64. Hence, changes due to normal ageing and pathological changes share similar features and cannot be completely distinguished. In particular, the ageing process and hypertension are known to be associated with vascular network changes such as changes in retinal vascular junctional bifurcation angles 65. Therefore, the predicted age is expected to be higher than the chronologic age in the presence of hypertension. However, the MAE and the difference between the predicted age and chronologic age in the hypertension test-set were similar to those of the normal test-set in our study. This suggests that changes in retinal vessels due to hypertension share some features with ageing but do not entirely mimic normal ageing. DM is another chronic systemic vascular disease that causes changes in all blood vessels, including retinal vessels. The prevalence of any diabetic retinopathy in 35 population-based studies of 22,896 patients was 34.6% overall among diabetic patients 66.
Additionally, DM altered all blood vessels and affected the diameter of the retinal blood vessels; however, the branching angle was not significantly different from that of normal participants 67. Several studies have reported that wider retinal venular diameters and narrower arteriolar diameters were associated with the presence of diabetic nephropathy and severe levels of diabetic retinopathy 68-70. In a type-1 DM cohort study, age (odds ratio [OR] per 10 years, 2.43 and 2.02) and retinopathy severity (OR per level, 1.14 and 1.21) were associated with focal retinal arteriolar narrowing and A/V nicking, respectively 68. Another type-1 DM cohort study indicated that both wider venular diameters and smaller arteriolar diameters were predictors of the 16-year development of nephropathy, neuropathy, and proliferative retinopathy 71. In a type-2 DM cohort study, smaller retinal arteriolar calibers were associated with increasing age and mean arterial BP, and a larger retinal venular caliber was associated with increasing severity of retinopathy and cigarette smoking 69. Because the level of diabetic retinopathy and tissue oxygen demand affect the changes in blood vessels 69,72, retinal vessels in diabetic patients show various changes, resulting in a smaller R² (coefficient of determination) in diabetic participants than in normal participants. In other words, pathologic retinal changes result from various factors; our study suggests that ageing and pathological changes are not exactly the same.
Table 4. Mean predicted age, mean absolute error (MAE) and its 95% confidence interval (CI); accuracy in predicting age within maximum differences of ±1, ±3, and ±5 years, divided into three categories in each test-set as follows: (1)
Unlike the other groups, the smoking group showed better performance in age prediction. Several studies have examined the effects of smoking on retinal vessels 73,74, but they do not explain why the model performed better in smokers than in normal participants. Moreover, spectral domain optical coherence tomography studies have shown that the thickness of the retinal layers in healthy chronic smokers was not significantly different from that in healthy individuals 75,76. Sex, age, and confounding factors were likely to have improved the predictions; in other words, a large number of men over 30 years of age were included in our study, resulting in uniform results and higher R² values.
Our prediction model predicted sex with good accuracy in all test-sets, unlike the results of age prediction. A previous study indicated that the AUC in UK Biobank was 0.97 (95% CI, 0.966-0.971) and that in EyePACS-2K was 0.97 (95% CI, 0.96-0.98), which are comparable to our study 25. Considering that the prediction accuracy was reduced in the fovea-erased images (AUC, 0.881) and more markedly in the vessel-erased images (AUC, 0.682), both the fovea and the blood vessels are used for sex prediction, with the blood vessels being the core region. Interestingly, the presence of underlying vascular conditions did not have a significant effect on sex prediction. This indicates that the alterations in retinal blood vessels due to underlying vascular conditions did not exceed the differences between the sexes. Unexpectedly, our sex prediction model showed the highest AUC in the smoking group. This is probably because the sex ratio of smokers is biased towards men (Supplementary Table S2). Sex prediction has typically been studied in forensic science, using bones or bone fragments to produce estimating formulas 77-79. However, these human structures are expected to exhibit significant differences between the sexes, and they can be determined or measured by human examiners without a computer. Considering that sex identification from retinal fundus images has proved almost impossible even for experienced ophthalmologists, our results suggest that deep learning is superior to human perception in image discrimination and identification.
Some limitations were present in this study. First, the type and duration of the underlying disease were not considered, and images showing any ocular disease were not included. Second, changes in retinal fundus images caused by lens yellowing, cataract, and cataract surgery might affect age prediction 80; however, lens status was not considered in this study. Third, myopia and axial length could cause significant changes in the fundus and optic nerve 81,82.
It was reported that an increased axial length was associated with arteriolar and venular narrowing; however, the arteriovenous diameter ratio and vessel junctions were not affected significantly by the axial length 81. It is unclear which components of the retinal vasculature were used to predict age and sex; at the least, the arteriovenous diameter ratio and junctional exponents are considered to be less disturbed by the axial length 81. Next, confounding factors may appear depending on the fundus camera model or its manufacturer. Differences in image size and telecentricity have been reported depending on the fundus imaging system 83. Although the absolute size of the targets may differ between fundus imaging systems, the differences in the ratios between target sizes may not be significant. Finally, our study has not been validated in other databases, and only Koreans were included. Differences in the retina according to ethnicity have been reported 84,85; therefore, further validation studies involving other databases, especially those including other ethnicities, are warranted.
Our model provides accurate and highly reliable age estimates, especially in normal participants under 60 years of age. Retinal fundus images from participants with underlying conditions (hypertension, DM, or smoking) showed relatively low coefficients of determination (R²) between the predicted age and chronologic age, suggesting that the ageing process and pathologic vascular changes exhibit different features. Fundus-predicted sex achieved an AUC of 0.96 or above in all groups. Our CNN-based age and sex prediction model has demonstrated the most improved performance to date. Our research suggests that ageing and systemic vascular diseases have different effects on the retina. Further research on the clinical significance of our model and its application to other population groups is needed.

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.