Original Article

International Journal of Impotence Research (2008) 20, 79–84; doi:10.1038/sj.ijir.3901593; published online 23 August 2007

Computational models for detection of endocrinopathy in subfertile males

C R Powell1, R A Desai1, A A Makhlouf2, M Sigman3, J P Jarow4, L S Ross1 and C S Niederberger1

  1. Department of Urology, University of Illinois Chicago, Chicago, IL, USA
  2. Department of Urology, University of Minnesota, Minneapolis, MN, USA
  3. Division of Urology, Brown University, Providence, RI, USA
  4. Department of Urology, Johns Hopkins University, Baltimore, MD, USA

Correspondence: Dr CR Powell, Department of Urology, University of Illinois at Chicago, 820 S Wood St M/C 955, Chicago, IL 60612-7316, USA. E-mail: cpowell@uic.edu

Received 21 February 2007; Accepted 23 March 2007; Published online 23 August 2007.

Abstract

The observation that men with sperm density greater than 10 million/ml had low probability of endocrinopathy led to a refinement in the evaluation of subfertility. Using statistical methods, we sought to provide a more accurate prediction of which patients have an endocrinopathy, and to report the outcome as the odds of having disease. In addition, by examining the parameters that influenced the model significantly, the underlying pathophysiology might be better understood. Records of 1035 men containing variables including testis volume, sperm density, motility as well as the presence of endocrinopathy were randomized into 'training' and 'test' data sets. We modeled the data set using linear and quadratic discriminant function analysis, logistic regression (LR) and a neural network. Wilk's regression analysis was performed to determine which variables influenced the model significantly. Of the four models investigated, LR and a neural network performed the best with receiver operating characteristic areas under the curve of 0.93 and 0.95, respectively, correlating to a sensitivity of 28% and a specificity of 99% for the LR model, and a sensitivity and specificity of 56 and 97% for the neural network model. Reverse regression yielded P-values for the testis volume and sperm density of <0.0001. The neural network and LR models accurately predicted the probability of an endocrinopathy from testis volume, sperm density and motility without serum assays. These models may be accessed via the Internet, allowing urologists to select patients for endocrinologic evaluation at http://www.urocomp.org.

Keywords:

endocrinopathy, infertility, semen analysis, testis volume

Abbreviations:

AUC, area under the curve; ED, erectile dysfunction; FSH, follicle stimulating hormone; GLRT, generalized likelihood ratio test; HCG, human chorionic gonadotropin; LDFA, linear discriminant function analysis; LH, luteinizing hormone; LR, logistic regression; mL, milliliter; QDFA, quadratic discriminant function analysis; ROC, receiver operating characteristic

Introduction and objectives

It is estimated that approximately 15% of couples cannot conceive after 1 year of unprotected intercourse. Of those cases, 20% are solely related to the male factor and in another 30–40%, the male factor is contributory.1 Some of these cases are due to an endocrinopathy that may be treatable. The discovery of low testosterone with elevated prolactin, for instance, is likely due to a prolactinoma, and it may be possible to restore fertility with bromocriptine or cabergoline therapy. Hypogonadotropic hypogonadism due to Kallmann syndrome is also important to identify, as it may be treatable with human chorionic gonadotropin and recombinant follicle-stimulating hormone (FSH), and as the Kallmann gene has been cloned, such a diagnosis has implications for the patient's future offspring. In addition to identifying reversible causes of endocrinopathy, examining the parameters that best predict an endocrinopathy may lead to better understanding of the underlying pathophysiology of infertility.

The conventional evaluation for men with infertility has evolved over the years. Recently, Sharlip et al.1 published best practice policies on the evaluation of subfertile men. The authors recommended endocrine screening with FSH and testosterone levels only in men with sperm counts less than 10 million/ml or with other evidence of endocrinopathy, such as physical findings or complaints of decreased libido. This recommendation was based on our observation of a low probability of endocrinopathy in men with sperm count greater than 10 million/ml.2

Since detection of endocrinopathies affecting male fertility may provide significant prognostic information to infertile couples as well as uncover significant underlying medical conditions, we investigated whether we could improve upon the current clinical guidelines using more advanced statistical models. Neural networks are a form of statistical modeling that provides a better curve fit for nonlinear data than traditional statistical techniques, and investigators have previously used them to analyze infertility data. To our knowledge, these methods have not yet been utilized to predict the presence of endocrinopathy based solely on clinical or semen analysis data.3, 4

Materials and methods

In this paper, we use the term 'endocrinopathy' to mean the presence of an abnormality in the serum hormonal panel (testosterone, FSH, luteinizing hormone (LH) or prolactin), without necessarily implying a primary endocrine cause of infertility.

A retrospective analysis of 1525 patients attending two fertility centers for infertility evaluations was performed. A complete medical history and physical examination were obtained for all patients. Testicular volume was determined using either a Seager or Prader orchidometer. All patients underwent endocrine testing including FSH and testosterone. At one center, follow-up studies including LH and prolactin were obtained only if the initial endocrine studies were abnormal. At the other center, prolactin and LH were obtained for all patients in the initial panel. A minimum of two semen analyses were obtained from all patients after a 2- to 3-day period of abstinence and evaluated by trained laboratory technicians. Seminal volume, sperm density, percent motility, forward progression and percentage of normal morphologic forms were recorded. Patients who had undergone a prior fertility workup or vasectomy were excluded. The final study population comprised 1035 men in whom semen analysis and hormone levels were known, 385 from the first center and 650 from the second.

The four models investigated were linear and quadratic discriminant function analysis (LDFA and QDFA), logistic regression (LR) and a neural network. Neural networks were implemented using 'neUROn' (Neural computational environment for UROlogical numericals), a suite of C++ programs developed by the investigators and cross-compiled using Microsoft Visual C++ version 6 (Microsoft Corp., Redmond, WA, USA) and GNU C++ (Cygwin port) version 2.95 (Red Hat Corp., Raleigh, NC, USA). The training method was canonical off-line back propagation with weight decay, with the weight decay term lambda chosen to be 5 × 10⁻⁵. All transfer functions were sigmoidal, allowing the odds to be easily computed at the output node, and the error function was chosen to be cross-entropy for feature extraction using Wilk's generalized likelihood ratio test (GLRT). Decision boundaries of the various models were visualized with software developed by us with Microsoft Visual C++ version 6. The chosen algorithm calculated the model's output for each pixel in the plot, and then identified the boundary pixels as those having an adjacent pixel with a different outcome.

The data set was randomly divided into a 'training' set of 777 exemplars and a 'test' set of 258 exemplars. The proportion of exemplars with endocrinopathies was kept similar in both sets using a randomization algorithm, which preserved initial outcome frequencies in each. The test set was excluded from training and used only for cross-validation ('n1/n2' method). Multiple random sets of initial conditions (connection weights) were derived, and the training set was iteratively applied to the neural computational system. When overlearning was observed by divergence of training and test set errors, hidden nodes were removed to reduce network topology. A single hidden layer with four hidden nodes was determined to represent an optimal topology which maintained acceptable goodness-of-fit without overlearning. We considered the network to be trained to completion when the error was observed to be oscillating at a local error minimum, and the error gradient was less than 1 × 10⁻⁶.
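The boundary-detection step described in this section (evaluate the model at every pixel, then mark pixels that have a neighbor with a different outcome) can be sketched as follows. This is a minimal illustration in Python, not the authors' Visual C++ software: the `toy_model` function, its coefficients, the 0.5 threshold, the axis ranges and the use of 4-connected neighbors are all assumptions made for the example.

```python
import math

def toy_model(volume_ml, density_million_per_ml):
    """Hypothetical stand-in for a trained model; returns a probability in (0, 1).

    The coefficients are invented for illustration and are NOT from the study.
    """
    logit = 3.0 - 0.15 * volume_ml - 0.10 * density_million_per_ml
    return 1.0 / (1.0 + math.exp(-logit))

def classify_grid(model, x_values, y_values, threshold=0.5):
    """Evaluate the model at every pixel and store the binary outcome."""
    return [[model(x, y) >= threshold for x in x_values] for y in y_values]

def boundary_pixels(grid):
    """Return (row, col) of pixels that have a 4-connected neighbor
    whose outcome differs from their own."""
    rows, cols = len(grid), len(grid[0])
    boundary = set()
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and grid[rr][cc] != grid[r][c]:
                    boundary.add((r, c))
                    break
    return boundary

# Assumed plotting ranges: testis volume 0-30 ml, sperm density 0-40 million/ml.
volumes = [float(v) for v in range(0, 31)]
densities = [float(d) for d in range(0, 41)]
grid = classify_grid(toy_model, volumes, densities)
edge = boundary_pixels(grid)
```

With a real model, `toy_model` would be replaced by the trained network's output with the third input (motility) held fixed, as in the decision boundary plots.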

We employed Wilk's GLRT to determine which input features were significant to the model's outcome in a reverse regression analysis. We also modeled the data set using LR, LDFA and QDFA to compare the nonlinear computational method of neural computation with traditional statistical modeling tools. Receiver operating characteristic (ROC) curves were generated by plotting sensitivity vs (1-specificity) for all possible thresholds. We computed ROC area under the curve (AUC) statistically where possible using the method described by Wickens and compared threshold-independent true and false-positive and -negative rates statistically according to the method described by DeLong.5, 6
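To make the ROC construction concrete, the sketch below sweeps every distinct threshold, records sensitivity against (1-specificity), and integrates the curve with the trapezoid rule. The trapezoid rule is a common substitute used here for illustration; it is not claimed to be the procedure of Wickens or DeLong. The scores and labels are made-up values, and the code assumes both classes are present.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs for all thresholds.

    labels are 1 for disease present and 0 for disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Lower the threshold step by step, from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Cases with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Invented model outputs and true outcomes, for illustration only.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
example_labels = [1, 1, 0, 1, 0, 0, 0, 0]
example_auc = auc_trapezoid(roc_points(example_scores, example_labels))
```

An AUC of 1.0 corresponds to perfect separation of the two groups, and 0.5 to a model that ranks cases no better than chance.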

To use Wilk's GLRT for feature extraction, the full network was trained to a strict local error minimum, and the cross-entropy error was calculated. Subtracting each input node sequentially created feature-deficient networks. These subnetworks were retrained to a strict local error minimum, and the cross-entropy error for each subnetwork was recorded. The probability P for rejecting the null hypothesis (that the full network and feature-deficient network are equivalent) is obtained from a χ² probability distribution with degrees of freedom equal to the number of nodal connections removed by generating the feature-deficient subnetwork.7
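A likelihood ratio test of this kind can be sketched in a few lines. The example assumes that each cross-entropy error is the summed negative log-likelihood (natural logarithm) over the training set, so that the test statistic is twice the difference between the reduced and full network errors. The error values are invented, and the choice of four degrees of freedom assumes that removing one input node deletes its connections to four hidden nodes. `chi2_sf` is a helper written for this example using only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum with half-integer gamma values.
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total

def glrt(error_full, error_reduced, connections_removed):
    """Return (statistic, P-value) for full vs feature-deficient network.

    Assumes both errors are summed cross-entropy in natural-log units.
    """
    statistic = 2.0 * (error_reduced - error_full)
    return statistic, chi2_sf(statistic, connections_removed)

# Hypothetical errors: the reduced network fits worse by 10 units.
stat, p_value = glrt(error_full=180.0, error_reduced=190.0,
                     connections_removed=4)
```

A small P-value indicates that removing the input feature degraded the fit more than chance would explain, so the feature is considered significant to the model.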

Results

Average and median testicular volumes, semen parameters and hormonal panels for the study population are listed in Table 1. To describe the data set further, Table 2 lists the incidence of endocrinopathy in the study population. Over 90% of the population had a normal endocrine (eugonadotropic) evaluation. The most common endocrinopathy was germ cell failure (7.8%), followed by complete testicular failure (0.8%), hyperprolactinemia (0.4%), hypogonadotropic hypogonadism (not from hyperprolactinemia) (0.2%), Leydig cell tumor (0.2%) and androgen resistance (0.1%). These terms are defined in Table 3.




Testis volume, sperm density and motility percentage data were compiled. To illustrate how these variables affect the likelihood of endocrinopathy, we generated decision boundary plots based on the neural network model (Figure 1). Figure 1a shows how the likelihood of endocrinopathy increases with decreasing sperm count and decreasing testis volume, while Figure 1b shows that endocrinopathy is less likely with higher motility values.

Four models were constructed, and ROC areas were calculated. ROC AUC for each of the four models in the test set is shown in Table 4. The neural network produced the highest ROC AUC of 0.948. LR produced an ROC AUC close to that of the neural network at 0.925. Wilk's regression was performed on each variable as described above to determine the significance of each, and the P-values for each are shown in Table 5. With a P-value of 7.13 × 10⁻¹³, testis volume added significant predictive information to the model. Removal of testis volume during reverse regression analysis decreased the ROC AUC from 0.948 to 0.908, a value similar to that obtained by Sigman with an LR model employing only sperm count.2 Both QDFA and LDFA demonstrated less accuracy, with ROC AUC of 0.854 and 0.637, respectively. To determine whether these differences in ROC AUC were statistically significant, DeLong's test was employed.6 The P-value between LR and a four-hidden-node neural network was 0.23, not significant at the P<0.05 level. By contrast, the P-values for LDFA and QDFA, each compared with the neural network, were both <0.05, suggesting that the neural network was significantly superior to these discriminant models.

Figure 1.

(a) Effect of testicular volume and sperm count on likelihood of endocrinopathy, with motility kept constant at its mean of 45. Lighter gray indicates higher likelihood of endocrinopathy. The decision boundary, where the odds of endocrinopathy and nonendocrinopathy are equal, is indicated by the white line. The boundary would be slightly different at various values of motility. (b) Effect of motility and testis volume with the count kept constant at 10 million/ml, a value suggested as a cutoff for endocrine evaluation.2

Discussion

Endocrinopathy is a relatively uncommon cause of male infertility, accounting for less than 3% of infertile men.2 Yet despite its rarity it remains an important diagnosis to consider, as it can be reversible and may signify a life-threatening condition. Reproductive endocrine abnormalities may be etiologic, that is, endocrine disorders that cause infertility, or they may be reflective of testicular dysfunction. Etiologic endocrine disorders, such as hyperprolactinemia due to a pituitary adenoma, may be treatable, allowing for an improvement in fertility. In contrast, endocrine abnormalities reflective of impaired testicular function, such as an isolated elevation of FSH, may not be directly treatable but have prognostic and diagnostic value. Thus, both types of endocrine disorders should be diagnosed during the evaluation of the infertile male.

In our prior investigation, endocrinopathy was rarely noted in men with a sperm density greater than 10 million/ml. For that reason, we suggested using a threshold of less than 10 million/ml as an indication for endocrine evaluation. Sperm count was the most significant single factor in predicting the presence of endocrinopathy. We reported an ROC AUC of 0.902 using sperm count alone.2 In the four models investigated in this study, the addition of testis volume and sperm motility percentage significantly improved the model's ability to predict an endocrinopathy, as evidenced by the higher ROC AUC shown in Table 4. The assumption made at the beginning of the investigation was that all three of these variables would be important to the model. However, given the P-value of sperm motility seen in Table 5, it is likely that motility is not a significant predictor of endocrinopathy in subfertile males.

We posit the following formula, based on our analysis of the data using an LR model with continuous variables, to calculate the probability of endocrinopathy:

[Equation image not available]

where x is testis volume in ml, y is sperm count in million/ml, and z is motility percentage.

Finally, the probability of endocrinopathy can be computed as follows:

[Equation image not available]
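For readers unfamiliar with logistic regression, a model of this kind generally combines the inputs linearly and passes the result through a sigmoid to obtain a probability, from which the odds follow directly. The sketch below shows that general form only. The coefficient names `b0` to `b3` and their numeric values are hypothetical placeholders chosen for illustration; they are not the fitted coefficients of this study.

```python
import math

def lr_probability(x, y, z, coefficients):
    """Probability from a generic logistic regression model.

    x: testis volume (ml), y: sperm count (million/ml), z: motility (%).
    coefficients: (b0, b1, b2, b3), intercept first.
    """
    b0, b1, b2, b3 = coefficients
    linear_term = b0 + b1 * x + b2 * y + b3 * z
    return 1.0 / (1.0 + math.exp(-linear_term))

def odds_from_probability(p):
    """Convert a probability to odds, p / (1 - p)."""
    return p / (1.0 - p)

# Placeholder coefficients, NOT the published model.
HYPOTHETICAL_COEFFICIENTS = (2.0, -0.10, -0.10, -0.01)

p_small_testis = lr_probability(8.0, 5.0, 30.0, HYPOTHETICAL_COEFFICIENTS)
p_large_testis = lr_probability(25.0, 5.0, 30.0, HYPOTHETICAL_COEFFICIENTS)
```

With negative coefficients on volume and count, as assumed here, the computed probability falls as either input rises, which matches the direction of effect described in the Results. A clinician-defined threshold on the probability or the odds would then determine whether to order an endocrine evaluation.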

A user-friendly version of the four-hidden-node neural network model can be accessed by clinicians at www.urocomp.org, and a screenshot of the model in an Internet browser on the Windows XP operating system can be seen in Figure 2. We envision clinicians using this model to tailor the diagnostic algorithm to each individual patient. The model reports the odds of endocrinopathy over a wide range of testicular volumes and sperm counts, allowing each clinician to customize his or her own threshold values for ordering an endocrine evaluation.

Figure 2.

This model can be made available online via the Internet as well as on palm and pocket PC handheld PDA devices.

Analyzing individual variables for significance may provide new insights into the pathophysiology of endocrinopathy. The reverse regression analysis revealed that testis volume appears to be predictive of an endocrinopathy (Table 5). This easily measured clinical variable may be an effective surrogate for sperm count in predicting endocrinopathy in the future, but further study is needed.

It is also interesting to note that sperm motility was not shown to be predictive of an endocrinopathy. With further investigation, motility might even be shown to be unrelated to the endocrine parameters investigated in this study. The method by which a technician determines motility in the laboratory may also play a role in its significance.

Conclusion

We developed a highly accurate neural network model to predict the presence of endocrinopathy in men undergoing an infertility evaluation. Sperm count and testis volume were both found to be highly predictive of an endocrinopathy in the subfertile male. This model is available for clinical use on the World Wide Web at the following address: http://www.urocomp.org.

Notes

The authors have no conflict of interest to declare.

References

  1. Sharlip ID, Jarow JP, Belker AM, Lipshultz LI, Sigman M, Thomas AJ et al. Best practice policies for male infertility. Fertil Steril 2002; 77: 873–882.
  2. Sigman M, Jarow JP. Endocrine evaluation of infertile men. Urology 1997; 50: 659–664.
  3. Niederberger CS, Lipshultz LI, Lamb DJ. A neural network to analyze fertility data. Fertil Steril 1993; 60: 324–330.
  4. Lamb DJ, Niederberger CS. Artificial intelligence in medicine and male infertility. World J Urol 1993; 11: 129–136.
  5. Wickens TD. Elementary Signal Detection Theory. Oxford University Press: New York, 2002.
  6. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44: 837–845.
  7. Golden RM. Mathematical Methods for Neural Network Analysis and Design. MIT Press: Cambridge, 1996.