Identification of potentially relevant metals for the etiology of autism by using a Bayesian multivariate approach for partially censored values

Heavy metals are known to be able to cross the placental and blood brain barriers to affect critical neurodevelopmental processes in the fetus. We measured metal levels (Al, Cd, Hg, Li, Pb and Zn) in the cord blood of newborns and in the serum of the same children at 5 years of age, and compared between individuals with or without (controls) autism spectrum disorder (ASD) diagnosis. The samples were from a biobank associated with the All Babies in Southeast Sweden (ABIS) registry. We proposed a Bayesian multivariate log-normal model for partially censored values to identify potentially relevant metals for the etiology of ASD. Our results in cord blood suggest prenatal Al levels could be indicative of later ASD incidence, which could also be related to an increased possibility of a high, potentially toxic, exposure to Al and Li during pregnancy. In addition, a larger possibility of a high, potentially beneficial, exposure to Zn could occur during pregnancy in controls. Finally, we found decisive evidence for an average increase of Hg in 5-year-old ASD children compared to only weak evidence for controls. This is concordant with previous research showing an impaired ability for eliminating Hg in the ASD group.


Material and methods
All methods were performed in accordance with the relevant guidelines and regulations.
Origin of samples. The samples are part of the biobank associated with the ABIS (All Babies in Southeast Sweden) registry. The samples are stored at the Division of Pediatrics of the Department of Biomedical and Clinical Sciences, Linköping University. The original goal of the ABIS study was to investigate the emergence of Type 1 diabetes and other immune-mediated diseases in Swedish children [14]. Informed consent was obtained from the legal guardians of all participating children. In this context, cord blood, breast milk and hair of the mother were collected at birth, while blood, urine, stool, and hair were obtained from children at the ages of 1, 3 and 5 years. The ABIS cohort comprised male and female children born between Oct 1997 and Oct 1999 in the Swedish counties of Östergötland, Öland, Blekinge, and Småland, who are followed prospectively. Using this cohort, we have recently reported associations between blood heavy metals in children and Type 1 diabetes [15] and autoimmunity [16].
Database employed for diagnosis. The categorization of the samples as coming from children with or without ASD diagnosis was performed by different medical doctors across Sweden and incorporated into the Swedish National Patient Register. This population-based register was launched in 1964 and is currently maintained by the Swedish National Board of Health and Welfare (http://www.socialstyrelsen.se/english). Over 99% of all somatic and psychiatric hospital discharges, as well as outpatient visits from both private and public caregivers, are recorded in this register. The recorded items are based on the International Classification of Diseases (ICD) codes and are associated with the personal identity number (a unique 10-digit number) assigned to all Swedish residents.
Analysis of metal levels. The samples were analyzed for the concentration of aluminum (Al), cadmium (Cd), mercury (Hg), lithium (Li), lead (Pb) and zinc (Zn). For cord blood, 20 samples were randomly selected and analyzed from ASD-diagnosed children and 40 from the control group. For the serum from 5-year-old children, 11 samples were analyzed from ASD-diagnosed children and 24 from the control group. The analysis of metal levels was performed by ALS Scandinavia AB (Luleå, Sweden) using the 'ultrasensitive inductively coupled plasma sector field mass spectrometry method' (ICP-SFMS) [17] after acid digestion with HNO3, according to the standards ISO 17294-1 [18] and ISO 17294-2 [19], and the EPA Method 200.8 [20]. Most metal levels were lower than the detection level of the respective metal. Such values are censored from above, meaning that they are known only to lie below the detection level. A few metal values were not recorded (missing data). In the next section, we incorporate these features of our data into our Bayesian modeling of metal values.

Bayesian modeling of metal values
The Bayesian approach formulates a prior distribution for all model parameters, and then updates this prior distribution with observed data through the likelihood function to a posterior distribution. The goal of a Bayesian analysis is to make inference on the posterior distribution of all model parameters.

Likelihood function. Let $x = (x_1, x_2, \ldots, x_J)$ be the vector of values for the $J$ different metals of any individual in one of the groups. Assume for each of the groups that

$$\ln x \sim N(\mu, \Sigma), \qquad (1)$$

where $\mu$ is the mean vector and $\Sigma$ is the covariance matrix for that group. This implies that $x$ follows a multivariate log-normal distribution with mean for each metal $j$ given by

$$E[x_j] = \exp\left(\mu_j + \tfrac{1}{2}\Sigma_{jj}\right),$$

and covariance matrix for elements $j$ and $k$ as

$$\mathrm{Cov}[x_j, x_k] = \exp\left(\mu_j + \mu_k + \tfrac{1}{2}(\Sigma_{jj} + \Sigma_{kk})\right)\left(\exp(\Sigma_{jk}) - 1\right),$$

for any types of metals $j$ and $k$.

Let $y = \ln x$ and partition $y$ as $y = (y_1, y_2)$, where $y_1$ is the vector of observed values and $y_2$ is the vector of censored values for any individual $i$. Similarly, partition the mean vector $\mu = (\mu_1, \mu_2)$ and covariance matrix

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$

where $\Sigma_{12} = \Sigma_{21}^{\top}$ by symmetry. Then, it follows that the marginal distribution of $y_1$ is given by

$$y_1 \sim N(\mu_1, \Sigma_{11}),$$

and the conditional distribution of $y_2 | y_1$ becomes [21]

$$y_2 | y_1 \sim N\left(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(y_1 - \mu_1),\; \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right).$$

Let $f_{y_1}(y_1)$ be the probability density function of $y_1$ and let $F_{y_2|y_1}(c_2)$ denote the cumulative distribution function of $y_2 | y_1$ evaluated at $c_2$. The likelihood function $L_i$ for any individual $i$ becomes

$$L_i(\mu, \Sigma | y_{i1}, y_{i2}) = f_{y_1}(y_{i1})\, F_{y_2|y_1}(c_{i2}),$$

where the elements of $y_2$ are censored at $c_{i2}$ from above. Then, the likelihood function for $n$ individuals in a group becomes

$$L(\mu, \Sigma | y_1, y_2) = \prod_{i=1}^{n} L_i(\mu, \Sigma | y_{i1}, y_{i2}). \qquad (2)$$

Bayesian inference. The posterior distribution of $(\mu, \Sigma)$ is given through Bayes' theorem by

$$p(\mu, \Sigma | y_1, y_2) \propto L(\mu, \Sigma | y_1, y_2)\, p(\mu, \Sigma),$$

where $L(\mu, \Sigma | y_1, y_2)$ is the likelihood function in Eq. (2) above and $p(\mu, \Sigma)$ is the prior distribution of $(\mu, \Sigma)$. Following [22], we estimate the posterior distribution of $(\mu, \Sigma)$ using the default Markov chain Monte Carlo (MCMC) algorithm for K = 1 mixture component in the R package mixAK. This R package allows for censored values $y_2$, which is an inherent feature of our data (see Section Analysis of metal levels), where these values are lower than the detection level of the measuring device. We also use the feature of interval-censored values in mixAK for the non-recorded values (missing data), which are specified on reasonable intervals. As a standard of comparison, we use the weakly informative prior distributions in mixAK. Weakly informative priors are often considered in Bayesian inference for two reasons: they give stable estimation of the posterior and, at the same time, carry so little prior information that they are essentially non-informative. Convergence of the MCMC algorithm in mixAK is monitored using the convergence measures $\hat{R}$ and $n_{\mathrm{eff}}$ in [23]. The convergence measure $\hat{R}$ is approximately equal to the square root of the variance of the mixture of all the chains divided by the average within-chain variance, and $n_{\mathrm{eff}}$ is the number of efficient draws, an estimate of the number of independent posterior draws. We estimated all models such that $\hat{R} < 1.1$ and $n_{\mathrm{eff}} > 100$. This suggested good convergence to the posterior with well-mixed chains.
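As a minimal sketch of the censored likelihood in Eq. (2), the following Python code evaluates the log-likelihood contribution of one individual in the simplest case of J = 2 metals, where one log-value is observed and the other is known only to lie below its detection level. The function names, the input layout and the restriction to two metals are illustrative assumptions. The study itself relies on the MCMC implementation in the R package mixAK, not on this code.

```python
import math
from statistics import NormalDist


def censored_loglik(y1, c2, mu, cov):
    """Log-likelihood of one individual with y1 observed and y2 censored below c2.

    All quantities are on the log scale (y = ln x). `mu` is (mu1, mu2) and
    `cov` is ((s11, s12), (s21, s22)); both are hypothetical inputs.
    """
    mu1, mu2 = mu
    (s11, s12), (s21, s22) = cov
    # Marginal density of the observed component: y1 ~ N(mu1, s11).
    log_f = math.log(NormalDist(mu1, math.sqrt(s11)).pdf(y1))
    # Conditional distribution of the censored component given y1.
    cond_mean = mu2 + s21 / s11 * (y1 - mu1)
    cond_var = s22 - s21 * s12 / s11
    # Probability that y2 lies below the detection level c2.
    log_F = math.log(NormalDist(cond_mean, math.sqrt(cond_var)).cdf(c2))
    return log_f + log_F


def group_loglik(data, mu, cov):
    """Sum the individual contributions over (y1, c2) pairs of one group."""
    return sum(censored_loglik(y1, c2, mu, cov) for y1, c2 in data)
```

With more than one censored metal per individual, the conditional term becomes a multivariate normal distribution function, which requires numerical integration and is not covered by this sketch.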
Classification. We obtain a posterior distribution for the probability that each observation $y_i$ for individual $i$ belongs to one of the two groups using a Leave-One-Out Cross-Validation (LOOCV) procedure. Given each observation $y_i$, we estimate the model for each group using all observations $y_{-i}$ other than $y_i$ as training data. The probability of being in the autism group (here class $C = c_1$) instead of the control group (here class $C = c_2$) for an individual $i$ can be written by applying Bayes' theorem as (similarly to naive Bayes classification, see e.g. [24])

$$p(C_i = c_1 | y_i, y_{-i}) = \frac{p(y_i | C_i = c_1, y_{-i})\, p(C_i = c_1 | y_{-i})}{\sum_{k=1}^{2} p(y_i | C_i = c_k, y_{-i})\, p(C_i = c_k | y_{-i})}.$$

We assume equiprobable classes a priori such that $p(C_i = c_1 | y_{-i}) = p(C_i = c_2 | y_{-i}) = \tfrac{1}{2}$ for any observation $i$, which simplifies the probability expression to

$$p(C_i = c_1 | y_i, y_{-i}) = \frac{p(y_i | C_i = c_1, y_{-i})}{p(y_i | C_i = c_1, y_{-i}) + p(y_i | C_i = c_2, y_{-i})}.$$

In addition, the odds for an individual $i$ being in the control group also simplifies to

$$O(C_i = c_2 | y_i, y_{-i}) = \frac{p(C_i = c_2 | y_i, y_{-i})}{p(C_i = c_1 | y_i, y_{-i})} = \frac{p(y_i | C_i = c_2, y_{-i})}{p(y_i | C_i = c_1, y_{-i})}.$$
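The equal-prior classification step can be sketched in a few lines of Python. The sketch assumes that the two log densities of the held-out observation, one under each group's model fitted without that observation, are already available. The function and argument names are hypothetical.

```python
import math


def asd_probability(log_dens_asd, log_dens_control):
    """P(C_i = ASD | y_i, y_-i) under equal prior class probabilities."""
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_dens_asd, log_dens_control)
    w_asd = math.exp(log_dens_asd - m)
    w_control = math.exp(log_dens_control - m)
    return w_asd / (w_asd + w_control)


def control_odds(log_dens_asd, log_dens_control):
    """Odds of the control group: ratio of the two densities of y_i."""
    return math.exp(log_dens_control - log_dens_asd)
```

Working with log densities is a design choice of this sketch: multivariate densities can be very small, and the ratio is numerically safer when formed on the log scale.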

Results
The results of the model parameters $\mu$ and $\Sigma$ in Eq. (1) are shown in the first subsection for the data on metal values in the cord blood. Differences in $\mu$ and $\Sigma$ between the ASD and control groups for 5-year-olds are shown in the second subsection. Group comparisons between metal values in the cord blood and in 5-year-olds are shown in the third subsection. Finally, classification results are presented in the fourth subsection. Throughout this section, let $\sigma_A$ and $\sigma_C$ be the standard deviations of $x$ for the ASD and control groups, respectively.
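For reference, the mean, standard deviation and covariance of the metal values x on the original scale follow from the log-scale parameters in Eq. (1) through the standard log-normal moment formulas. The Python sketch below shows this conversion for a single posterior draw. The function names are hypothetical, and whether the reported summaries were computed in exactly this way is an assumption.

```python
import math


def lognormal_mean_sd(mu_j, var_j):
    """Mean and standard deviation of x_j when ln x_j ~ N(mu_j, var_j)."""
    mean = math.exp(mu_j + 0.5 * var_j)
    # Var[x_j] = (exp(var_j) - 1) * exp(2 * mu_j + var_j).
    sd = math.sqrt(math.expm1(var_j)) * mean
    return mean, sd


def lognormal_cov(mu_j, mu_k, var_j, var_k, cov_jk):
    """Covariance of x_j and x_k under a bivariate log-normal model."""
    return math.exp(mu_j + mu_k + 0.5 * (var_j + var_k)) * math.expm1(cov_jk)
```

Applying these functions to every posterior draw of the log-scale parameters gives posterior draws of the original-scale means and standard deviations for each group.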

Metal values in the cord blood.
Figures 1, 2, 3, 4, 5 and 6 show the posterior distributions of $\mu_A$, $\mu_C$, $\sigma_A$ and $\sigma_C$, and the posterior probabilities $P(\mu_A > \mu_C)$ and $P(\sigma_A > \sigma_C)$ for each of the metals Al, Cd, Hg, Li, Pb and Zn. To a large extent, the expected values $\mu$ and standard deviations $\sigma$ for metals Al, Hg, and Li are higher in the ASD than in the control group. The opposite holds for metal Pb, where expected values and standard deviations tend to be higher for the control group, and for metal Zn, where standard deviations tend to be higher for the control group. No differences between the groups in expected values and standard deviations are apparent for metal Cd. However, visual impressions from the figures can be more precisely quantified and interpreted using the calculated probabilities $P(\mu_A > \mu_C)$ and $P(\sigma_A > \sigma_C)$ (shown in the figures). The posterior odds in favor of $\mu_A$ is given by

$$K = \frac{P(\mu_A > \mu_C)}{P(\mu_A \le \mu_C)} = \frac{P(\mu_A > \mu_C)}{1 - P(\mu_A > \mu_C)}.$$

Analogously to the interpretation of the Bayes factor in [25], we provide similar interpretations of K in Table 1.
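The step from posterior draws to a strength-of-evidence label can be sketched as follows. The probability is estimated as the share of paired posterior draws in which the ASD value exceeds the control value, and the odds follow as p / (1 - p). Only two cut-offs are used here, 0.909 for strong and 0.990 for decisive evidence, because these are the thresholds quoted in the classification results. The remaining categories of Table 1 are not reproduced, and all function names are hypothetical.

```python
import math


def prob_greater(draws_a, draws_c):
    """Monte Carlo estimate of P(a > c) from paired posterior draws."""
    pairs = list(zip(draws_a, draws_c))
    return sum(a > c for a, c in pairs) / len(pairs)


def posterior_odds(p):
    """Posterior odds K = p / (1 - p)."""
    return math.inf if p >= 1.0 else p / (1.0 - p)


def evidence_label(p, strong=0.909, decisive=0.990):
    """Label the evidence for whichever group the probability favors."""
    q = max(p, 1.0 - p)
    if q > decisive:
        return "decisive"
    if q > strong:
        return "strong"
    return "below strong"
```

A probability close to 0 is as informative as one close to 1, since it favors the control group instead. This is why the label is based on the larger of p and 1 - p.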
The probabilities for P(µ A > µ C ) and P(σ A > σ C ) from Figs. 1, 2, 3, 4, 5 and 6 can be extracted and combined with the interpretations of the posterior odds K from Table 1.Differences between the autism and control groups in µ and σ for each metal are given in Table 2.There is strong evidence that the mean and standard deviation of Al are higher in the ASD group than in the control group.There is also strong evidence that the standard deviation of Pb is higher in the control group and that the standard deviation of Li is higher in the ASD group.There is decisive evidence that the standard deviation of Zn is higher in the control group compared to the ASD group.
In order to identify whether linear associations between the metals are relevant for the etiology of ASD, we performed inference on pairwise correlations of the metals. These were extracted from the posterior inference of the elements in $\Sigma$ for each group. Table 3 shows that especially Pb tends to be associated with other metals. There is decisive evidence that a linear association between Pb and Al is larger in the control group than in the ASD group. Likewise, there is strong evidence of a higher linear association between Pb and Hg in the autism group, and of a higher linear association between Pb and Li in the control group. In addition, there is strong evidence of a higher linear association between Al and Hg, Al and Zn, and Zn and Cd in the control group. Figure 7 confirms this view, as the majority of the histograms of correlations for the control group are shifted more to the right compared to the corresponding histograms of the ASD group.

Group comparison between metal values in the cord blood and for 5-year olds. Table 5 shows decisive evidence for the means and standard deviations of almost all metal values being higher in 5-year-olds. One exception was Zn, where the evidence is decisive for both the mean and standard deviation being higher in the cord blood than for 5-year-olds. Another exception was Hg, where the evidence is only weak for both the mean and standard deviation being higher in 5-year-olds in the control group and where no evidence exists for the standard deviation in the ASD group being higher in either the cord blood or in 5-year-olds.

Metal values for 5-year olds.
Classification results. In order to evaluate the ability of our cord blood data to predict future ASD incidence, we performed an analysis of the strength of evidence for an observation to belong to either the ASD or the control group (as described in Section Classification). Figure 8 shows the posterior distributions (using 100000 posterior draws) of $p(C_i = c_1 | y_i, y_{-i})$ for each individual $i$. Of the classified individuals, all but one were correctly classified to one of the groups. However, for this individual the average posterior probability, $\mathrm{mean}_{\mu,\sigma}\, p(C_i = c_1 | y_i, y_{-i}) = 0.938$, is actually the lowest average probability for all the 11 individuals in the figure.

Table 5. Probabilities and strength of evidence for the mean and standard deviation of the ratios $R_{ASD}$ and $R_{Control}$ being higher for 5-year-olds compared to metal values in the cord blood, where the ratios relate the metal values for 5-year-olds in the ASD and control groups, $X_{ASD}$ and $X_{Control}$, to the metal values in the cord blood.

Figure 8. Histograms for the posterior probability that each actual ASD or control individual belongs to either the ASD or control group using 10000 posterior draws of $(\mu, \sigma)$.
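The individuals in Fig. 8 show at least strong favor to one of the groups according to our following definition:

$$\mathrm{mean}_{\mu,\sigma}\, p(C_i = c_j | y_i, y_{-i}) > 0.909 \;\Leftrightarrow\; \text{at least strong favor to group } C_j.$$

The remaining 49 individuals cannot be classified according to this definition (Table 6).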
When employing a more conservative rule, by changing the threshold above from 0.909 to 0.990, all 5 individuals were correctly classified with at least decisive favor to one of the groups, see Table 7.
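The decision rule described above, with a threshold of 0.909 for strong favor or 0.990 for decisive favor, can be sketched as follows. The sketch assumes that the posterior draws of the ASD probability for one individual are available as a list. The function name and the return values are hypothetical.

```python
from statistics import fmean


def classify(asd_prob_draws, threshold=0.909):
    """Assign a group only if the average posterior probability is clear enough.

    Returns "ASD", "control", or None when neither group is favored strongly
    enough, in which case the individual is left unclassified.
    """
    mean_asd = fmean(asd_prob_draws)
    if mean_asd > threshold:
        return "ASD"
    # With two classes, the control probability is one minus the ASD probability.
    if 1.0 - mean_asd > threshold:
        return "control"
    return None
```

For example, `classify(draws, threshold=0.990)` applies the more conservative rule, under which fewer individuals are classified.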
Posterior predictive distributions and posterior odds for each type of metal values. In order to predict possible metal values for a new individual, we obtained the posterior predictive distribution, $p(\tilde{x}|x)$, for future values $\tilde{x}$ given observed values $x$ for each type of metal in each of the ASD and control groups. Investigating group differences of the posterior predictive distributions further, we generalized the odds, defined in Section Classification for an individual $i$, to the following odds for any future value $\tilde{x}$ of a metal belonging to the control group:

$$O(\mathrm{Control} | \tilde{x}, x) = \frac{p(\tilde{x} | x, \mathrm{Control})}{p(\tilde{x} | x, \mathrm{ASD})}.$$

Hence, the odds for any metal value $\tilde{x}$ is given by the ratio of the posterior predictive densities. Figure 9 shows the posterior predictive distribution $p(\tilde{x}|x)$ for each type of metal and group. The distributions of the groups for each metal overlap to a certain degree. However, for all metals but Pb and Zn the distribution for the autism group attains, in general, larger values in long and thick tails of the distribution. Hence, an extreme value for each type of metal can possibly be used as a biomarker for an individual belonging to either group. By marking out thresholds of metal values in the tails of $p(\tilde{x}|x)$ in Fig. 9 that correspond to considerably low or high values of the odds $O(\mathrm{Control} | \tilde{x}, x)$, the more extreme metal values beyond each threshold can be used as a biomarker for an individual with such a value to belong to either group. Table 8 shows the values of the marked thresholds in Fig. 9.
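A minimal sketch of the predictive odds for a single metal is given below. It approximates each posterior predictive density by averaging the log-normal density over posterior draws of the log-scale mean and standard deviation of that metal, and then takes the ratio between the groups. The function names and the draw format are hypothetical, and the sketch ignores censoring of the new value.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of x when ln x ~ N(mu, sigma^2)."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def predictive_density(x_new, draws):
    """Monte Carlo estimate: average density over posterior draws (mu, sigma)."""
    return sum(lognormal_pdf(x_new, mu, s) for mu, s in draws) / len(draws)


def predictive_control_odds(x_new, draws_control, draws_asd):
    """Odds that a new metal value x_new belongs to the control group."""
    return predictive_density(x_new, draws_control) / predictive_density(x_new, draws_asd)
```

Scanning `predictive_control_odds` over a grid of metal values is one way to locate the values at which the odds cross a chosen evidence threshold.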
This complements the previous classification results in Tables 6 and 7, where the whole vector of metal values for an individual was used to classify each individual to either group. However, tail distributions are known to be sensitive to outliers, particularly in small datasets. Therefore, we need to interpret such a biomarker with caution in our relatively small dataset and be aware of the fact that more data can change tail distributions dramatically. Figure 10 shows the posterior odds as a function of values for each type of metal. For large values of Al, Cd, Hg and Li, the odds for being in the control group decline towards 0. This is in contrast to large values of Pb and Zn, where the odds for being in the autism group decline towards 0 instead.

Discussion
In this study, we investigated the presence of heavy metals in the cord blood of newborns and in the serum of the same children at five years of age. To identify the relevance of specific metals (in these two stages of development) for the etiology of ASD, we used a Bayesian multivariate log-normal model that accommodates censored values. In cord blood, we found strong evidence for higher Al levels in the ASD group compared to the control group. The role of Al as a neurotoxic and neuroinflammatory agent, and its synergistic interaction with Hg, has previously been reported [26]. In addition, Al has been shown to be strongly linked to ASD, with the severity of behavioral symptoms being correlated to Al levels in the hair [27]. To date, however, no study has suggested that higher levels of Al in the cord blood can be indicative of future ASD diagnosis, as our data indicate. The potential of prenatal Al levels to indicate later ASD incidence needs to be further investigated. We also found strong evidence for larger standard deviations of Al and Li in the ASD group, as well as for a larger standard deviation of Zn in the control group. Figure 11 summarizes our findings by using the posterior modes as Bayesian point estimates of the mean and standard deviation for each metal. For Al and Li this may indicate higher possibilities of a high, potentially toxic, exposure to these metals during pregnancy. To the best of our knowledge, no study has investigated the role of prenatal exposure to Li in relation to ASD. For Zn, in turn, our finding of decisive evidence for a higher standard deviation in the controls compared to ASD in cord blood samples suggests that the ASD population is exposed to fewer nutritional options providing Zn as a micronutrient than the control population. Zn is an important micronutrient whose deficiency during critical fetal periods is thought to contribute to the development of ASD [28]. Concordant with this idea, low hair Zn levels in infants are indeed associated with ASD [28]. Additionally, we found decisive evidence for higher average Zn levels in the cord blood compared to 5-year-olds in both experimental groups. Our combined evidence on Zn suggests the importance of Zn as a micronutrient during pregnancy and may indicate higher possibilities of a high, potentially beneficial, exposure to Zn during pregnancy in controls. Concordantly, fewer possibilities of higher Zn exposure during pregnancy in ASD might be indicative of a lack of this nutrient in crucial early developmental stages. This hypothesis needs to be further examined.
We found decisive evidence for higher average levels of Al and Li in 5-year-olds compared to the cord blood, both in the control and ASD groups. This is expected because, as the children grew, they increasingly obtained these metals via nutrition. The exception was Hg, for which we found decisive evidence of an average increase in 5-year-old ASD children, but only weak evidence for an average increase in 5-year-old control children. This is concordant with previous research showing an impaired ability for eliminating Hg in relation to ASD. For example, newborns later diagnosed with ASD present a 7.7-fold reduction in Hg levels in hair, which is one of the main pathways of metal excretion [29]. Such impaired ability to excrete Hg would lead to an increased accumulation in the body, which is concordant with a 1.9-fold increase in blood Hg observed in ASD-diagnosed subjects [30]. Another interesting observation in relation to Hg is that no evidence exists for a higher standard deviation in 5-year-olds in the ASD group, whereas weak evidence exists for controls. This provides further support for the idea that the increase in Hg levels in 5-year-old ASD children depends only on their impaired metabolic ability to detoxify Hg and not on environmental availability.
Moreover, we investigated the ability of our cord blood data to predict future ASD incidence by using classification. Although only 11 individuals could be classified and 49 could not, all but one (belonging to the control group) were correctly classified. The predictive power from classifying specific cord blood metal levels can certainly be higher for a larger dataset, and we believe this approach has good potential for early identification of susceptibility to develop ASD. This would be important for developing early interventions, e.g. dietary ones, to minimize exposure to certain metals in individuals with a higher risk of developing ASD later on.
Finally, by using posterior predictive distributions of the metal values, we identified threshold levels for specific metals that predict future ASD incidence. Threshold levels could be identified for all metals, except for Cd. For Al, Hg, Li and Pb threshold levels were identified for the right tail of the posterior predictive distribution, where large values for these metals are indicative of a future incidence of ASD. Although these levels are only based on our dataset, a similar approach can be employed in future studies that incorporate a larger sample of children to examine whether the levels reported here are reliable and how they can be refined. For Zn, we identified a potential non-monotonic effect, with values below and above thresholds in both the left and right tail of the predictive distribution being related to controls, rather than to ASD incidence. Although it is natural that higher cord blood levels of this important micronutrient would relate to the prevention of ASD, it is hard to understand how very low levels could do the same. Non-monotonic effects are a contentious topic in environmental toxicology, for which the mechanism and real implications are yet to be understood [31]. However, it is important to point out that in vitro non-monotonic effects have been reported for another metal, copper, in Caco-2 cells [32]. In any case, our study points towards the feasibility of using levels of specific metals in the cord blood as a warning signal about future ASD incidence and, as previously mentioned, to prompt early interventions aimed at preventing or minimizing the detrimental effects of ASD in children. Such interventions could, for example, involve limiting exposure of susceptible individuals to metals such as Hg. There is accumulating evidence of Hg involvement in ASD, both from previous reports and in the present paper. Hg involvement in ASD seems to be via accumulation in the body [30], which would in turn lead to disrupted neurodevelopment [9]. Thus, limiting Hg exposure/intake by susceptible individuals could in theory help to promote normal neurodevelopment during key developmental times, notably between birth and the age of six. It is during this period that the human brain experiences dramatic changes in size, a 4-fold increase, and in connectivity, involving processes such as neuronal arborization, synaptogenesis, glycogenesis and myelination [33].

Figure 1. Posterior distributions of µ A − µ C (top left), µ A and µ C (top right), σ A σ C (bottom left), and σ A and σ C (bottom right) for metal Al.

Figure 2. Posterior distributions of µ A − µ C (top left), µ A and µ C (top right), σ A σ C (bottom left), and σ A and σ C (bottom right) for metal Cd.

Figure 3. Posterior distributions of µ A − µ C (top left), µ A and µ C (top right), σ A σ C (bottom left), and σ A and σ C (bottom right) for metal Hg.

Figure 4. Posterior distributions of µ A − µ C (top left), µ A and µ C (top right), σ A σ C (bottom left), and σ A and σ C (bottom right) for metal Li.

Figure 5. Posterior distributions of µ A − µ C (top left), µ A and µ C (top right), σ A σ C (bottom left), and σ A and σ C (bottom right) for metal Pb.

Figure 6. Posterior distributions of µ A − µ C (top left), µ A and µ C (top right), σ A σ C (bottom left), and σ A and σ C (bottom right) for metal Zn.

Figure 7. Histograms of posterior correlations between the metals for the control group (black) and ASD group (blue).

Figure 9. Posterior predictive distributions for each type of metal value in the control (black) and autism group (blue). The metal values x are simulated by taking the exponential of the multivariate normal realisations ln x, conditional on the posterior draws of µ and Σ. The 1% most extreme metal values are not shown for each metal type, because the characteristic thick and long tails of the log-normal distribution would give a poor visualization of the whole distribution. The solid vertical lines show thresholds of strong evidence that an individual with more extreme values belongs to the ASD group for metals Al, Cd, Hg, Li, and to the control group for metals Pb and Zn. The dashed lines show corresponding thresholds of decisive evidence.

Figure 10. The posterior odds for each metal value belonging to either the control (black) or the autism (blue) group.

Figure 11. The interval μx ± σx for each cord blood metal value in each of the ASD and control groups, where μx and σx are the posterior modes as Bayesian point estimates from the posterior distributions of µ x and σ x , respectively.
Table 4 shows strong evidence for higher average metal values of Hg in the ASD group compared to the control group. Strong evidence also exists for the standard deviation being higher

Table 2. Probabilities and strength of evidence for the mean and standard deviation of the metal values in the cord blood, respectively, being higher in one of the groups.

Table 3. Probabilities and strength of evidence for the pairwise correlations of metals being higher in one of the groups.

Table 4. Probabilities and strength of evidence for the mean and standard deviation of metal values for 5-year-olds, respectively, being higher in one of the groups for each metal.

Table 6. Confusion matrix for the classified 11 individuals with at least strong favor to one of the groups.

Table 7. Confusion matrix for the classified 5 individuals with at least decisive favor to one of the groups.

Table 8. Threshold values for strong and decisive evidence of a metal value x belonging to the ASD (in bold) and control (in italics) groups.