A systematic review and meta-analysis of the association between fluoride exposure and neurological disorders

Several studies have suggested that fluoride is related to neurological disorders in children and adolescents, but the clinical evidence on which neurological parameters are associated with fluoride exposure remains controversial. This systematic review and meta-analysis therefore aimed to determine whether there is an association between fluoride exposure, from different sources and at different doses, and neurological disorders. Terms related to “Humans”; “Central nervous system”; “Fluorides”; and “Neurologic manifestations” were searched systematically on PubMed, Scopus, Web of Science, Lilacs, Cochrane and Google Scholar. All studies performed on humans exposed to fluoride were included in the final assessment. A meta-analysis was then performed and the level of evidence was assessed using the GRADE approach. Our search retrieved 4,024 studies, among which 27 fulfilled the eligibility criteria. The main source of fluoride was naturally fluoridated water. Twenty-six studies showed alterations related to Intelligence Quotient (IQ), while only one evaluated headache, insomnia, lethargy, polydipsia and polyuria. Ten studies were included in the meta-analysis, which showed IQ impairment only for individuals under high fluoride exposure according to the World Health Organization criteria, without evidence of an association between low levels and any neurological disorder. However, the high heterogeneity observed compromises the final conclusions obtained from the quantitative analyses regarding such high levels. Furthermore, this association was classified as very low-level evidence. At this time, the current evidence does not allow us to state that fluoride is associated with neurological damage, indicating the need for new epidemiological studies that could provide further evidence regarding this possible association.


Scientific Reports | (2021) 11:22659 | https://doi.org/10.1038/s41598-021-99688-w | www.nature.com/scientificreports/

To assess the methodological quality and risk of bias, the checklist of Fowkes and Fulton 12 was applied. This checklist has domains that relate to study design and sample; control group characteristics; quality of measurements and outcomes; completeness; and distorting influences.
After evaluating each criterion, a (++) sign was assigned for major problems or (+) for minor problems, to assess whether the methods were adequate to produce consistent and valid information and whether the results offered the expected effects. Items that were not applicable to the type of study were marked NA (not applicable), and items with no problem were assigned the sign (0). The evaluation for each domain was standardized by the examiners and is described in Table A.2. After detailed evaluation of the methods and results, the studies were analyzed for the possibility of bias ("skewed results"), confounding ("confusions") and chance ("random occurrence"). To determine the value of each study, three summary questions were answered with "YES" or "NO": "Were the results biased?"; "Are factors of confusion or distortion present?"; and "Is there a possibility that the results came about by chance?" If the answer was NO to all three questions, the article was considered reliable, with low risk of bias.
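The three-question rule can be written as a small decision function. This is only an illustrative sketch: the function name is hypothetical, and labelling every other combination of answers as "high" risk is an assumption, since the text states only the condition for low risk.

```python
def overall_risk_of_bias(biased: bool, confounded: bool, chance: bool) -> str:
    """Summarize the three Fowkes and Fulton summary questions.

    Each argument is True when the corresponding question was answered YES.
    A study is 'low' risk only when all three answers are NO; treating every
    other combination as 'high' is an assumption made for this sketch.
    """
    return "low" if not (biased or confounded or chance) else "high"


if __name__ == "__main__":
    # Hypothetical study with NO answered to all three questions.
    print(overall_risk_of_bias(False, False, False))
```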
Quantitative synthesis (meta-analysis). The study data were analyzed using Review Manager software (Review Manager v. 5.3, The Cochrane Collaboration; Copenhagen, Denmark) to evaluate whether chronic exposure to F is associated with neurological deficit. In all analyses, only studies with low risk of bias were included.
A meta-analysis was performed to compare the percentage of low IQ between high and low chronic exposure to F. Because each study had classified F levels as low or high using heterogeneous concentrations, for the meta-analysis we reclassified the studies according to the WHO guidelines, which consider 0.5-1.0 mg/L as optimal (low levels) and > 2 mg/L as high levels for water fluoridation 13,14 . The number of people with low IQ and the total number of participants in each case group (high fluoride) and control group (low fluoride) were used to calculate the odds ratio with a 95% confidence interval (CI).
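For a single study, the odds ratio and its 95% CI can be obtained from the two event counts and two group totals described above. The sketch below uses the standard log odds ratio with a Wald interval; the function name, the 0.5 continuity correction for zero cells and the example counts are assumptions for illustration, not values or settings taken from the included studies or from Review Manager.

```python
import math


def odds_ratio_ci(events_high, total_high, events_low, total_low, z=1.96):
    """Odds ratio of low IQ (high-F vs low-F group) with a Wald 95% CI."""
    a = events_high                 # low IQ, high fluoride
    b = total_high - events_high    # normal IQ, high fluoride
    c = events_low                  # low IQ, low fluoride
    d = total_low - events_low      # normal IQ, low fluoride
    if min(a, b, c, d) == 0:
        # Assumed continuity correction so the log odds ratio stays finite.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 30/100 low IQ under high F, 10/100 under low F.
    or_, lo, hi = odds_ratio_ci(30, 100, 10, 100)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the 0.05 level for that study.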
The heterogeneity among studies was tested using the I² index (a p-value < 0.05 was considered statistically significant). Both fixed and random effects models were used in the analyses, and the final choice of effects model was based on the I² index 16 . Forest plots were generated for each analysis and an alpha of 0.05 was adopted as the cut-off point for significance.
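The relationship between Cochran's Q, the I² index and the two effects models can be sketched as follows. This is a generic inverse-variance implementation with the DerSimonian-Laird estimate of between-study variance, offered as an assumption about how such quantities are typically computed; it is not a reproduction of the Review Manager settings, and the input values in the example are hypothetical.

```python
import math


def pool_log_odds_ratios(log_ors, variances):
    """Pool study log odds ratios under fixed and random effects models.

    Returns the pooled odds ratios, Cochran's Q, I^2 (percent) and tau^2.
    """
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (truncated at zero).
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (v + tau2) for v in variances]
    random = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)

    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(random),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances for three studies.
    result = pool_log_odds_ratios([0.2, 1.0, 1.8], [0.04, 0.04, 0.04])
    print({k: round(v, 3) for k, v in result.items()})
```

When I² is high, the random effects estimate gives relatively more weight to smaller studies than the fixed effects estimate, which is why the choice of model is tied to the heterogeneity index.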
Publication bias was assessed through Egger's test and visual interpretation of the funnel plot 17 . A p-value < 0.05 indicated likely publication bias across the studies. The Jamovi statistical software (version 1.6, Sydney, Australia) was used to generate the figures and to run the test.
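Egger's test is commonly formulated as a linear regression of the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error), where an intercept far from zero suggests funnel plot asymmetry. The sketch below computes only the intercept and its t statistic with the standard library; it is an assumed textbook formulation rather than the Jamovi implementation, and the p-value would still need to be read from a t distribution with n - 2 degrees of freedom.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger regression: standardized effect on precision.

    Returns (intercept, standard error of the intercept, t statistic).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the usual OLS standard error of the intercept.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors for five studies.
    effects = [1.9, 1.4, 1.2, 0.9, 0.8]
    std_errors = [0.60, 0.45, 0.35, 0.25, 0.20]
    print(egger_intercept(effects, std_errors))
```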
A sensitivity analysis was used to explore the influence of each study on the pooled meta-analysis or publication bias results. This analysis was adopted in case of substantial or considerable heterogeneity (50 to 100%) or significant publication bias (p < 0.05). It was performed by manually omitting one study at a time and verifying the impact on the final results 15 .
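The leave-one-out procedure can be sketched as a loop that re-pools the remaining studies after each omission. The pooling here is a simple fixed-effect inverse-variance estimate chosen for brevity, which is an assumption; the function names and the example data are hypothetical.

```python
import math


def pooled_or_and_i2(log_ors, variances):
    """Fixed-effect pooled odds ratio and I^2 (percent) for a set of studies."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled), i2


def leave_one_out(labels, log_ors, variances):
    """Re-run the pooling with each study omitted in turn.

    Inputs are lists of equal length. Returns a dict mapping the omitted
    study's label to the (pooled OR, I^2) of the remaining studies.
    """
    results = {}
    for i, label in enumerate(labels):
        rest_y = log_ors[:i] + log_ors[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results[label] = pooled_or_and_i2(rest_y, rest_v)
    return results


if __name__ == "__main__":
    # Hypothetical data in which study "D" is an outlier.
    labels = ["A", "B", "C", "D"]
    log_ors = [1.0, 1.0, 1.0, 3.0]
    variances = [0.1, 0.1, 0.1, 0.1]
    for omitted, (or_, i2) in leave_one_out(labels, log_ors, variances).items():
        print(f"without {omitted}: OR = {or_:.2f}, I2 = {i2:.1f}%")
```

A study whose omission produces a large drop in I² or a marked shift in the pooled estimate is a candidate source of heterogeneity or asymmetry.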
Level of evidence assessment-GRADE. The level of evidence was determined using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach. This tool provides a structured process for developing and presenting evidence summaries that measure the quality of evidence to confirm or reject hypotheses in systematic reviews 18 .
GRADE has four levels of evidence (high, moderate, low and very low), assigned depending on whether issues such as risk of bias, inconsistency, imprecision and publication bias are rated as not serious, serious or very serious. Although observational studies begin as low-quality evidence, the level can be upgraded if the magnitude of the effect is large or very large 19 .

Consent for publication. All the authors agree with the publication.

Results
Search results. Based on the database searches, 4,024 studies were found. Three studies were included after manually searching the reference lists 20-22 . After the removal of duplicate studies (714), 3,310 articles remained and were analyzed by title and abstract according to the eligibility criteria. A total of 3,260 were excluded, and 50 studies remained for full-text reading. Fifteen studies were excluded because, when assessing IQ, they did not compare high and low F concentrations, four involved co-exposure to F and other concomitant elements, and four used the same sample as studies already included in this systematic review (Table A.3). Thus, 27 studies were selected and underwent risk of bias assessment. The summary of the selection process is shown in Fig. 1.
Characteristics of the studies. The 27 included studies were observational and cross-sectional, among which 26 were analytical studies and one was descriptive 23 . The age group investigated included individuals from 6 to 18 years of age. Most of the articles evaluated F exposure due to ingestion of naturally fluoridated water. Only one study analyzed populations exposed to F by burning coal 24 .
The F concentrations in drinking water categorized as low exposure in the selected studies ranged from 0.19 ppm 25 to 2.01 ppm 26 , while high doses ranged from 1.5 ppm 23,27 to 8.3 ppm 28 . Some studies considered a third, intermediate category 23,29-33 , which ranged from 0.5 ppm 30 to 3.1 ppm 33 . One study classified exposed groups according to four concentration levels, ranging from < 0.7 ppm to > 4.0 ppm 21 . One study did not provide high and low dose reference concentrations 20 , and the study on F exposure from coal burning 24 reported only the F content related to high exposure (0.0298 mg/m³).
Regarding the source of sample used for the estimation of F exposure, most of the studies evaluated the drinking water alone 20 20,22,[26][27][28][37][38][39][40][41] , and in the air 24 . Some studies did not quantify the F levels but instead determined the concentration from data available in national databases or on websites 32,42,43 . Three studies reported neither the process used nor the source consulted to establish F exposure 23,44,45 , and one mentioned only the use of conventional chemical tests, without specifying the method for F 46 .
In the analysis of results, 23 studies showed a statistically significant difference between exposure to high and low doses of F. In three studies, the comparison of intellectual skill between the groups exposed to high and low F concentrations was not statistically significant 30,34,46 . The descriptive study 23 reported alterations related to neurological manifestations in some groups with high-dose exposure (1.5-6.4 ppm). Table 1 shows details of all the characteristics of the included studies.
Risk of bias analysis. The quality of the studies was assessed based on risk of bias, confounding factors, and random occurrence. Eight studies were considered of low methodological quality and were classified as high risk of bias 20 [...] they were not serious enough to be classified as high risk of bias. In the "Study sample representative" domain, the problem items were "Sampling method", "Sample size" and "Entry criteria/exclusion". Under "Sampling method", nine studies presented major problems (++), mainly related to convenience sampling. Under "Sample size", two articles presented major problems because they did not perform a sample size calculation and the sample was smaller than 50 participants. Under "Entry criteria/exclusion", only two studies presented a minor problem, due to co-exposure to arsenic and iodine. For "Control group acceptable", the item "Definition of controls" presented two articles with minor problems (+) because they did not report the F concentration of the control group. Regarding "Matching/Randomization", nine studies did not mention randomization but did perform matching, which was considered a minor problem (+). However, two articles mentioned neither randomization nor matching, which was considered a major problem (++).
In the domain "Quality of measurements and outcomes", the item with the most serious issues was "Blindness", as 18 studies did not adopt any kind of blinding, followed by "Quality control", with eight studies that did not describe the measurement method. Table 2 presents the risk of bias assessment of the 27 eligible articles.
Level of evidence. The assessment of the level of certainty of the evidence was conducted through a narrative synthesis following the GRADE parameters for systematic reviews. The level of evidence was very low, both for the studies evaluating IQ impairment and for the only study assessing other neurological manifestations, due to the observational nature of the study designs as well as methodological inaccuracy. For the studies that evaluated IQ impairment, a serious risk of bias was observed. The study evaluating neurological manifestations other than IQ impairment also presented strongly suspected publication bias, given that these manifestations were measured with a questionnaire whose validation was not reported and which was not described in enough detail to be reproduced.
Although a narrative synthesis provides neither precise estimates nor measures of effect, it was concluded that the level of evidence of the studies taken together is not strong enough to affirm that high F exposure may produce neurological damage in children. Results are presented in Table 3.
Quantitative analysis. Ten studies 21,25,28,30,31,34,35,37,38,40 that provided sufficient data for the analysis were included in the meta-analysis. From the studies selected, it was only possible to run the meta-analysis for IQ, due to the scarcity of investigations on other neurological aspects. A total of 1,383 individuals were exposed to high F levels and 1,556 to low levels. The results showed an association between high F exposure and decreased IQ (OR 3.88; 95% CI 2.41-6.23; p < 0.00001; I² = 77%), demonstrating a deleterious effect of high levels of F on IQ (Fig. 2). This evidence was rated as very low (Table 3). Considerable heterogeneity (I² = 77%, p < 0.00001, Fig. 2) and significant publication bias (p < 0.00001) (Fig. 3) were observed.
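The reported pooled estimate can be checked for internal consistency by recovering the standard error of the log odds ratio from the confidence limits and deriving the corresponding z statistic. The sketch below assumes a symmetric Wald interval on the log scale and a two-sided normal test; the function name is hypothetical, and the inputs are the rounded values reported above, so the result is approximate.

```python
import math


def z_and_p_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Recover SE(log OR), z and a two-sided p-value from an OR and its 95% CI."""
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se_log_or, z, p


if __name__ == "__main__":
    se, z, p = z_and_p_from_ci(3.88, 2.41, 6.23)
    print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

With these inputs the z statistic is roughly 5.6, which is compatible with the reported p < 0.00001.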
After performing the sensitivity analysis, three studies were identified as a possible cause of publication bias 25,30,31 , with the detection of a low risk of publication bias (p = 0.25; Figure A, Supplementary material 5) after the exclusion of these studies. However, considerable heterogeneity was still observed after the sensitivity analysis. When the three studies previously identified as a possible reason for publication bias were removed from the meta-analysis, the I² index decreased from 77 to 62% (Table B, Supplementary material 5). Therefore, the interpretation of the meta-analysis results after sensitivity analysis is still limited by the considerable heterogeneity across the studies.

Discussion
This systematic review and meta-analysis gathered evidence showing that, following the WHO classification of low and high levels in drinking water, exposure to low/adequate water F levels is not associated with any neurological damage, while exposure to high levels is. The level of evidence for this association, however, was considered very low. Furthermore, IQ deficit was reported in the majority of the primary studies identified, and only one article reported other neurological manifestations.
Systematic reviews aim to gather all the available evidence in the literature to answer a guiding question according to predefined eligibility criteria. They use a well-designed, explicit and systematic methodology to minimize bias, generating reliable results, answers to the questions raised and conclusions about certain problems, thus helping in decision making 47,48 . According to the Cochrane systematic reviews manual, this type of study has the following main characteristics: clear and well-defined objectives that follow the pre-established eligibility criteria; a methodology that is easily reproducible, well designed and transparent; a comprehensive search that identifies all studies meeting the eligibility criteria; an assessment of the validity of the results of the included studies, including their risk of bias; and a presentation of all characteristics of the studies, including their results.
Combined with the qualitative synthesis, the meta-analysis pools the quantitative data of the selected studies, making it possible to estimate the size of the effect and to verify whether it confirms the individual results of the studies selected for the systematic review 15 . After these qualitative and quantitative analyses, the GRADE tool helps to compile all the results obtained in the systematic review in order to assess the evidence and its recommendations for evidence-based practice. This assessment has four levels: very low, low, moderate and high.
Despite some variation in the literature on the F concentrations in drinking water regarded as both effective and safe, it has often been reported that 1 mg/L is the "optimum level" 13,14 and, as previously mentioned, the concentrations may be adjusted to 0.7-1.2 mg/L, depending on climate, local environment and other sources of F 6 . In line with this observation, the 2017 updated edition of the WHO guidelines for drinking-water quality suggested that F levels must be within the 0.5-1.0 mg/L range in order to promote maximum caries-preventive benefits with minimum risk of dental fluorosis 13,14 . This justifies the threshold set in the present study to dichotomize F exposure into "low" and "high" categories. It is also more relevant from a public health standpoint, given that artificially fluoridated water facilities must comply with the aforementioned levels, whereas higher concentrations are usually related to focal points in areas in which F is naturally present in the water.
The mechanisms by which F can interfere with child neurodevelopment are associated with damage to nervous cells. Evidence suggests that chronic exposure to F in the prenatal and neonatal periods is potentially toxic to the metabolism and physiology of neuronal and glial cells, which leads to changes in processes related to memory and learning 9,49-51 . This is due to the ability of F to cross the placental and blood-brain barriers, especially in developing individuals, who are more susceptible to changes caused by F because these barriers are more permeable and their defense mechanisms are still immature 49,52-54 . In addition, F can influence membrane ion channels through interaction with the Ras protein, leading to changes in ion flow and nerve cell volume, which can cause metabolic disturbances, changes in cell function and altered transmission of nerve impulses 49,55 .
According to the WHO, neurological disorders are multifactorial clinical conditions that may be characterized by signs and symptoms of different kinds, such as physical functioning limitations, behavioral problems, psychosocial limitations, and communicative and cognitive impairments 56 . Among these features, our study focused on cognitive functions because of the approach taken by the selected studies. In this sense, it is important to highlight [...] 57 , since this central function may be characterized as a complex set of processes that aim to classify, recognize and comprehend information through reasoning, learning and execution 58 .
In this context, to evaluate the cognitive functions of people exposed to F, the researchers in the selected studies used a variety of IQ tests, as previously mentioned. As a result, different cognitive abilities were evaluated, without standardized and homogeneous parameters among the tests. Matzel and Sauce 59 suggested a hierarchical model of intelligence in which general ability, i.e., intelligence, results from several domains of ability, such as reasoning, processing speed, memory and comprehension, which are evaluated by different methodologies. The Stanford-Binet IQ method, for example, includes tests of different abilities and estimates intelligence from the aggregate performance across the tests, whereas Raven's Standard Progressive Matrices is based on a single ability, perceptual reasoning, and its main feature is a progressive increase in difficulty 60 .
The studies included individuals ranging from 6 to 18 years of age. From an epidemiological point of view, this is not ideal, because intelligence tests were applied to participants with very different degrees of neurodevelopment. Data extraction indicates that all eligible studies were concentrated in the Asian continent. These data reflect the remarkable influence of geography on the epidemiology of clinical manifestations resulting from F exposure. The availability of naturally occurring, high-concentration fluoride compounds in the drinking water used by rural communities increases their susceptibility to the adverse effects of F. Considering this aspect, a systematic review set out to evaluate the neurotoxic effects of F using studies conducted specifically in Chinese territory 9 , given the high number of publications on this subject, whose dissemination is sometimes restricted by the language barrier.
The methodological quality analyses of the studies detected serious problems related to the quality of sample, measurements and outcomes. There were also problems related to the absence of randomization, sample size calculation and blinding, which increase the risk of bias and limit the inference capacity of studies on the neurotoxic effects of F.
Most studies did not assess the individual level of exposure to F, e.g., through urinary F samples. The F concentration in drinking water in regions with high and low F levels was the most reported method. However, there were also studies that used secondary data or did not report the F content in water, which significantly compromises the findings of these investigations. Furthermore, it should be considered that some studies used creatinine-adjusted urinary F concentrations to account for urinary dilution, which may cause an additional bias 61 , since renal dysfunction in children may be associated with neurocognitive impairments 62 .
Another point worth mentioning is the increased risk of water contamination by other substances in the areas of naturally occurring F. Although some authors consider it unlikely that the effects attributed to F neurotoxicity can be triggered by other contaminants 9 , it is possible that the absence of control in relation to these parameters generates confounding factors. To ensure the balance of electrical charges, water with higher concentrations of endemically occurring F must contain higher concentrations of positive ions to balance out the F. This may affect the pH of the water or result in greater contamination by electropositive water contaminants, for example aluminum, zinc, arsenic, lead, mercury, and other metals and metalloids 61 .
Following the parameters of GRADE, the level of evidence was considered very low even for individuals exposed to high doses of F, due to imprecision problems (Table 3). This result is related to the types of studies included in this systematic review, as the level of evidence in observational studies starts at a low level, which can only increase if the study meets the other criteria of this evaluation. Despite the large number of participants in the analysis, the imprecision detected may be explained by methodological disparities among the studies that might interfere with the analysis of intelligence quotient (IQ) and neurological manifestations. Another important limitation to be considered is the predominance of cross-sectional studies in this systematic review. Cross-sectional and ecological studies do not allow the establishment of cause-and-effect relationships. They are useful for investigating the effect of environmental exposures related to acute processes, as the time interval between exposure and measurement of physiological parameters is short. Therefore, cross-sectional studies are not the ideal model to assess the effect of chronic F exposure on a parameter such as human intelligence 61 . Longitudinal studies, on the other hand, are considered the most appropriate to assess chronic conditions because, by allowing the long-term follow-up of individuals, they make it possible to infer causality 63 .
To sum up, although the selected studies showed an association between F exposure and IQ deficit, this association was only observed for individuals exposed to levels above those regarded as safe, and the certainty of the evidence for this association is very low. Within the above-mentioned limitations, the results of the present systematic review demonstrated that exposure to fluoridated water at levels recommended by the WHO can be considered safe, as it is not associated with IQ impairment.

Conclusion
Although the findings of this meta-analysis indicated that IQ damage can be triggered only by exposure to F at levels that exceed those recommended as a public health measure, the high heterogeneity observed compromises the final conclusions obtained from the quantitative analyses. Thus, based on the evidence available on the topic, it is not possible to state either an association or the lack of an association between F exposure and any neurological disorder.