Health status among greenhouse workers exposed to different levels of pesticides: A genetic matching analysis

(1) Objective: Greenhouse workers are considered a special occupational group who are exposed to more toxic and harmful substances than ordinary farmers, and the health of this group is a public health problem that warrants attention. Taking greenhouse workers in Ningxia, China, as the research sample, this study analyzed the health risk posed to practitioners by the greenhouse working environment. (2) Method: To analyze the relationship between pesticide exposure and the health of greenhouse workers, the genetic matching method was used to exclude the influence of covariates on the results. (3) Results: There were statistically significant differences in the prevalence of cardiovascular diseases (CVD), skeletal muscle system diseases (SMSD) and digestive diseases between the different exposure groups. The analysis of disease symptoms showed that different levels of exposure to pesticides in greenhouses could cause multisystem and multisymptom discomfort. In addition to some irritant symptoms such as eye itching, itching, and sneezing, the different exposure groups also differed in the frequency of discomfort such as back pain, a decline in sleep quality, memory loss, joint pain, swelling and weakness, upper abdominal pain and flatulence. (4) Conclusion: Different levels of exposure to pesticides in greenhouses may be one of the risk factors for practitioners to suffer from various systemic diseases, affecting their health and work efficiency. This hazard is manifested not only in some acute irritant symptoms but also in chronic diseases due to long-term exposure.

Dos Santos et al. reported that chronic exposure to malathion could lead to memory loss and a decline in spatial discrimination 8 . Analyzing human oral cells in Malaysia, Zariyantey found that, compared with the control group (office workers), the frequency of manganese in farmers exposed to pesticides increased significantly 16 . Many studies have also shown that different levels of pesticide exposure have a certain cytotoxicity and genotoxicity 17,18 , which may lead to the occurrence and worsening of multisystem diseases.
In a previous cohort study, we analyzed health problems related to multisystem diseases among greenhouse workers 19,20 . At the same time, pesticide exposure in the greenhouse microenvironment was also analyzed 21 . In this study, the genetic matching method is used. Potential confounding factors are controlled as covariates to minimize the impact of bias on the results, better reflecting the true impact of pesticide exposure on the health of practitioners and providing a theoretical basis for identifying the health risk factors of greenhouse workers.

Materials and Methods
Data source. The data come from health surveys of greenhouse workers randomly sampled from four greenhouse planting villages (Wudu, Lingtian, Maosheng, Yinghe) in Yinchuan city, Ningxia, China, in 2015, 2016 and 2017. The data were collected through face-to-face questionnaires with informed consent. The questionnaire was compiled by the research group, and reliability and validity testing showed its quality to be good. The research plan was approved and filed by the Medical Ethics Committee of Ningxia Medical University (approval number 2014-090).
Sample collection. According to the formula for sample size in simple random sampling, with p = 30% and δ controlled at 5%, the sample size needed for simple random sampling is 225. We expanded the sample size 1.5 times to account for cluster sampling and, allowing for a 20% attrition rate, finally determined a sample size of not less than 403 per year. The actual sample sizes in the three years are 448, 460, and 460, respectively. The inclusion criteria are as follows: (1) The respondents are local residents (i.e., living in the local area for more than five years); (2) The respondents are part of the main labor force population aged 18-70; (3) The respondents have been engaged in greenhouse planting work for no less than one year.
The exclusion criteria are as follows: (1) People who refused to participate in the investigation after communication; (2) People with cognitive impairment or who cannot communicate effectively; (3) People who cannot work continuously in a greenhouse.
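The sample-size arithmetic in the Sample collection paragraph can be retraced with a short sketch. The source does not print the formula, so the sketch assumes the usual simple-random-sampling expression n = z² p(1 − p) / δ² with z = 1.96, and it assumes that the cluster-sampling expansion (1.5) and the 20% attrition allowance (taken here as a factor of 1.2) are applied multiplicatively. The function name `srs_sample_size` is hypothetical. Under these assumptions, the stated values of 225 and 403 are reproduced when δ is 0.06, whereas δ = 0.05 gives 323; which margin the authors actually used cannot be confirmed from the text.

```python
import math


def srs_sample_size(p, delta, z=1.96):
    """Sample size for estimating a proportion p with absolute margin delta.

    Assumed formula (not printed in the source): n = z^2 * p * (1 - p) / delta^2.
    """
    return z ** 2 * p * (1 - p) / delta ** 2


# Expected prevalence of 30%, as stated in the text.
n_delta_5 = srs_sample_size(0.30, 0.05)   # margin of 5 percentage points
n_delta_6 = srs_sample_size(0.30, 0.06)   # margin of 6 percentage points

# Assumed adjustments: x1.5 for cluster sampling, x1.2 for 20% attrition.
planned_per_year = n_delta_6 * 1.5 * 1.2

print(math.ceil(n_delta_5), math.ceil(n_delta_6), int(planned_per_year))
```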
Matching. As a method of causal inference, matching is widely used in many fields, such as statistics, medicine, public health, economics, and sociology. Matching can control for potential confounding factors as covariates to minimize the impact of bias on the results. When matching is used for causal inference, two common approaches are propensity score matching 22 and multivariate matching based on the Mahalanobis distance 23,24 . Both methods rely on the property of "equal percent bias reduction" (EPBR) 25,26 , but this property seldom holds in real data. If it does not hold, matching will increase the bias of some linear functions of the covariates even if all univariate means are closer in the matched data than in the unmatched data 27 . These two methods can therefore worsen the balance of potential confounding factors. To address this, in 2005 and 2011, Sekhon and his colleagues proposed the genetic matching algorithm, which maximizes the balance of the observed covariates between the different exposure groups.
Genetic Matching. Genetic matching is a nonparametric matching algorithm that was proposed by Sekhon and his colleagues 28,29 . Like other matching methods, its core motivation is Rubin's causal model 30 . It does not depend on the understanding and estimation of propensity scores; rather, it is a generalization of propensity score matching and Mahalanobis distance matching. The greatest advantage of genetic matching is that it can quickly find an appropriate weight by machine learning, so that the covariates involved in matching reach distributional balance between the different exposure groups as soon as possible. Genetic matching does not need a model to predict the propensity score in advance; rather, it weights the variables according to their importance to improve the accuracy and efficiency of the matching. Genetic matching generalizes the Mahalanobis metric by introducing a weight matrix W that is optimized during the matching process. The distance between the covariate vectors $X_i$ and $X_j$ of two individuals is:

$$d(X_i, X_j) = \sqrt{(X_i - X_j)^{T} \left(S^{-1/2}\right)^{T} W \, S^{-1/2} \, (X_i - X_j)}$$

where W is a k × k positive definite weight matrix, S is the sample covariance matrix of the covariates, and $S^{1/2}$ is the Cholesky decomposition of S, i.e., $S = S^{1/2}\left(S^{1/2}\right)^{T}$, with $S^{-1/2}$ denoting its inverse. The weights assigned to the covariates are 0 when the information of all covariates (confounding factors) is already contained in the propensity score or when the model can match better through the Mahalanobis distance.
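The generalized Mahalanobis distance described above can be illustrated with a minimal, dependency-free sketch. It is not the implementation in the Matching package; the function names are hypothetical, and the weight matrix W is assumed to be diagonal and passed as a list `w` of its diagonal entries. The sketch factors S as L Lᵀ (Cholesky), solves L z = X_i − X_j, and returns the square root of the weighted sum of squares of z.

```python
import math


def cholesky(S):
    """Return lower-triangular L with S = L L^T (S symmetric positive definite)."""
    k = len(S)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L z = b for lower-triangular L by forward substitution."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[m] * z[m] for m in range(i))) / row[i])
    return z


def generalized_mahalanobis(x_i, x_j, S, w):
    """sqrt((x_i - x_j)^T (L^-1)^T W L^-1 (x_i - x_j)) with W = diag(w)."""
    diff = [a - b for a, b in zip(x_i, x_j)]
    z = forward_solve(cholesky(S), diff)          # z = L^-1 (x_i - x_j)
    return math.sqrt(sum(wk * zk * zk for wk, zk in zip(w, z)))


# Illustrative covariance matrix and covariate vectors (made-up numbers).
S = [[4.0, 2.0], [2.0, 3.0]]
d_equal = generalized_mahalanobis([1.0, 2.0], [0.0, 0.0], S, [1.0, 1.0])
d_first_only = generalized_mahalanobis([1.0, 2.0], [0.0, 0.0], S, [1.0, 0.0])
print(d_equal, d_first_only)
```

With all weights equal to 1 the function reduces to the ordinary Mahalanobis distance, which is the sense in which genetic matching generalizes that metric; the genetic search itself (choosing `w` to maximize covariate balance) is not shown here.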
www.nature.com/scientificreports
Variable selection. The grouping variable, i.e., the high-exposure group versus the low-exposure group, was obtained by latent cluster analysis (LCA) performed by the members of the previous project team. In the latent cluster analysis, the latent variables we chose mainly cover the following three parts: (1) Basic characteristics of greenhouse workers: the number of years working in a greenhouse, the per capita planting area, and the working time in a greenhouse per year. (2) Direct contact with pesticide spraying: the personal spraying of pesticides, the mixing of pesticides, the spraying mode, the spraying interval and the spraying duration of each pesticide. (3) Pesticide spraying protection and protection awareness: behavioral factors during pesticide spraying, personal protective equipment scores, personal hygiene, and inspection before and during pesticide spraying.
The matching variables cover three aspects: general demographic characteristics (sex, age, ethnicity, education level, etc.), habits and customs (smoking, drinking, exercise, etc.), and dietary habits (number of meals, category of meals, fruit, salt, etc.).
Statistical analysis method. R 3.5.2 was used to analyze the data. The main package was the Matching package developed by Sekhon and his colleagues. The significance level is defined as P < 0.05.

Ethics declarations.
During the investigation, the investigators carefully read the relevant contents of the Declaration of Helsinki and strictly abided by them. The privacy of the respondents was protected and informed consent was obtained. This survey is in line with the relevant content of the Declaration of Helsinki. Combined with the characteristics and implementation of this study, the specific Helsinki principles are summarized as follows: 1. This study adheres to ethical standards, respects all groups, and protects their health and rights; 2. In this survey, the investigators have the duty to protect the lives and health of the subjects and to maintain their privacy and dignity; 3. Before the implementation of this investigation, we submitted the design and implementation of the field investigation to the ethics committee of Ningxia Medical University for examination, comment, guidance and approval; 4. The investigation was approved and filed by the Medical Ethics Committee of Ningxia Medical University (Approval No. 2014-090). (See the annex below for specific proof.); 5. The survey was conducted on the premise that the respondents could benefit from the results of the study; 6. All the respondents volunteered to participate in this survey, had a full understanding of the research project, and signed the informed consent form; 7. The survey promises to respect the rights of the respondents, to protect their privacy and to minimize any impact on their lives; 8. The purpose, method, source of funding, affiliated units of the researchers, expected benefits and potential risks of the survey were explained to each respondent; 9. This survey promises that the respondents can withdraw from the project at any time for any reason.

Results
General demographic characteristics of greenhouse workers. A total of 1368 individuals were selected for this study, including 448 in 2015, 460 in 2016 and 460 in 2017. After LCA, 392 people were included in the high-exposure group, and 976 were included in the low-exposure group, accounting for 28.65% and 71.35% of the total population, respectively. The working environment, pesticide use and personal protection of the greenhouse workers are described in Table 1.
Genetic matching. In this study, 392 pairs were successfully matched, with each pair having a certain weight. Based on the weights, the matched data were processed and analyzed. The bubble diagram in Fig. 1 shows the basic information after genetic matching of the sample data. The x-axis is the ID of the high-exposure group, the y-axis is the ID of the low-exposure group, and the z-axis is the weight. Notably, in the matching process, some high-exposure subjects were matched to multiple low-exposure subjects. Table 2 shows the balance test results before and after covariate matching. When all covariates were included in the matching, there was no significant difference between the high-exposure group and the low-exposure group. This result indicates that genetic matching eliminates the bias of covariates (confounding factors) in this data analysis to a certain extent.
Differential analysis of diseases. Before matching, there was no significant difference in the prevalence of cardiovascular diseases (CVD) between the different exposure groups (p = 0.059), but after matching, the prevalence of CVD differed between the different exposure groups (p = 0.009). The results suggest that pesticide exposure in a vegetable greenhouse environment had an effect on CVD after excluding interference by covariates. Furthermore, the analysis shows that after matching, the prevalence of CVD in the high-exposure group (7.91%) was higher than that in the low-exposure group (3.57%), from which it can be concluded that the occupational environment of the vegetable greenhouse was one of the causes of CVD. In the analysis of skeletal muscle system diseases (SMSD), the same conclusion is drawn: before matching, there was no difference in prevalence between the different exposure groups (p = 0.059), but after matching, there was a significant difference in prevalence between the different exposure groups (p = 0.013). After matching, the prevalence rates in the high- and low-exposure groups were 15.56% and 9.69%, respectively. This result indicates that after eliminating the interference of other covariates, the working environment of the vegetable greenhouse is one of the causes of SMSD. Both before and after matching, there was a statistically significant difference in the prevalence of digestive system diseases between the two groups; however, for respiratory system diseases and immune-endocrine system diseases, there was no significant difference either before or after matching. The results are shown in Table 3.
Difference analysis of symptoms. Comparing the frequency of irritant symptoms, we found that matching did not change the results. There were significant differences in symptoms of eye discomfort (itching, pain, dry eyes, etc.) and unexplained sneezing and runny nose between the different exposure groups, but there was no significant difference in asthma or skin discomfort. The specific results are shown in Table 4.
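The post-matching prevalence comparisons for CVD and SMSD reported in the disease analysis above can be checked roughly with a 2 × 2 chi-square sketch. Several things here are assumptions rather than statements from the source: case counts are back-calculated from the reported percentages with 392 subjects per group, the test is taken to be an uncorrected Pearson chi-square test, and the matching weights are ignored. The function names are hypothetical. Under these assumptions the p-values come out close to the reported 0.009 (CVD) and 0.013 (SMSD), but the source does not state which test was actually used.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic and p-value (1 df)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def cases_from_prevalence(prevalence_pct, group_size):
    """Back-calculate a case count from a reported percentage (assumption)."""
    return round(prevalence_pct / 100 * group_size)


GROUP_SIZE = 392  # matched pairs reported in the text
results = {}
for label, high_pct, low_pct in [("CVD", 7.91, 3.57), ("SMSD", 15.56, 9.69)]:
    high = cases_from_prevalence(high_pct, GROUP_SIZE)
    low = cases_from_prevalence(low_pct, GROUP_SIZE)
    chi2, p = pearson_chi2_2x2(high, GROUP_SIZE - high, low, GROUP_SIZE - low)
    results[label] = (high, low, chi2, p)
    print(f"{label}: {high} vs {low} cases, chi2 = {chi2:.2f}, p = {p:.3f}")
```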
Comparing the differences in cardiovascular symptoms, we found that before matching, the symptom of right back pain did not differ between the different exposure groups (p = 0.065); however, after matching, with the confounding of other covariates excluded, there was a significant difference between the different exposure groups (p = 0.047). The frequency in the high-exposure group was higher than that in the low-exposure group. The results are shown in Table 5.
Comparing the differences in nervous system symptoms, we found that before matching, there was no significant difference in the distribution of sleep quality between the different exposure groups (p = 0.228), but after matching, the difference was statistically significant. The frequency of poor sleep quality in the high-exposure group was significantly higher than that in the low-exposure group. The distribution of sleep time and memory impairment showed the opposite results. Before matching, the distribution of sleep time in the high-exposure group differed significantly from that in the low-exposure group. At the same time, the symptoms of difficulty falling asleep, nightmares and sleep pain, which were not affected by covariates, differed significantly between the different exposure groups both before and after matching. Table 6 suggests that after matching, the frequency of these three symptoms, i.e., difficulty falling asleep, nightmares and sleep pain, was significantly higher in the high-exposure group than in the low-exposure group.
Comparing the differences in skeletal muscle system symptoms, we found that before matching, there was no significant difference in the distribution of pain, swelling and weakness symptoms between the different exposure groups; however, after matching, the incidence of symptoms in the high-exposure group was higher than that in the low-exposure group (p = 0.024). The results are shown in Table 7.
In the analysis of digestive system symptoms, only after matching were the symptoms of upper abdominal pain and flatulence significantly different between the different exposure groups (p = 0.038). The results are shown in Table 8.
In the comparison of respiratory symptoms, only after matching was the incidence of unexplained hemoptysis significantly different between the different exposure groups (p = 0.001). The results are shown in Table 9.
There was no significant difference in other variables, including the immune and endocrine systems before and after matching. The results are shown in Table 10.

Discussion
As a special working environment, vegetable greenhouses are characterized by a closed environment in which pesticides and other toxic substances do not readily dissipate. Therefore, studying the health problems of vegetable greenhouse practitioners holds a certain level of scientific and practical value 31,32 . Health is the result of a combination of factors related to genetics, the environment, living habits and other factors 33 . If a single-factor method is used to study how health problems arise and develop, such a method will have certain limitations, and it will not be possible to exclude the interference of confounding factors in real causality. In this study, genetic matching was used to eliminate the covariate bias between disease and environmental exposure in greenhouses, and to explore the relationship between different diseases (symptoms) and environmental exposure in greenhouses.
Genetic matching is a newer matching method that followed propensity score matching. It can quickly find an appropriate weight by machine learning so that the covariates involved in matching reach distributional balance among groups as soon as possible 30 . Its matching speed and quality exceed those of previous matching methods. In this study, the balance test of genetic matching also showed that there was no significant difference in any covariate between the high- and low-exposure groups after matching, indicating that the matching effect was good.
After genetic matching, the difference in CVD prevalence among the different exposure groups was statistically significant, and the prevalence of CVD in the high-exposure group was higher than that in the low-exposure group, showing that the degree of exposure to greenhouse pesticides had a causal relationship with CVD in workers after excluding the interference of covariates. The Global Burden of Disease (GBD) study indicated that CVD have been a major cause of global mortality since 1980 34 . In 2015, CVD accounted for nearly one-third of all deaths worldwide, while such diseases accounted for more than 40% of all deaths in China 35,36 . Previous studies by our group have also suggested that CVD, as one of the high-risk diseases of vegetable greenhouse practitioners, are still a problem that cannot be ignored 30 . Comparing the symptoms of CVD in the different exposure groups, we found that after matching, the frequency of right back pain differed between the different exposure groups and was consistently higher in the high-exposure group than in the low-exposure group. There were no differences between the different exposure groups regarding other symptoms. This result may be because CVD have an acute onset, the early stage is marked mostly by temporary discomfort 37 , and the body itself has a certain tolerance, coupled with differences in individual cognition. Therefore, such symptoms are easily overlooked.
People engaged in agricultural labor usually maintain a fixed, forced posture during work, which in the long run causes skeletal muscle fatigue and induces disease. In studying SMSD among greenhouse workers, we also found that the prevalence of SMSD differed among greenhouse workers with different exposure levels, and the prevalence of such diseases in the high-exposure group was always higher than that in the low-exposure group. This result is consistent with Zhang's research 33 . With the increase in exposure intensity, the working intensity of workers also increases, keeping some of their skeletal muscles in a long-term state of tension, and the risk of disease continues to increase. Some studies have also shown that the prevalence of osteoarthritis in greenhouse workers was 41.9% 38 , which was much higher than that of other workers. It can be seen that SMSD are one of the important factors affecting the health status of greenhouse workers. In addition, the study of disease-related symptoms indicated that the prevalence of the symptom "pain, swelling and weakness of hand and foot joints" was significantly higher in the high-exposure group than in the low-exposure group, which also indicates that joints such as those in the hands and feet of greenhouse workers are prone to injury and need particular protection 39 .
Many pesticides can harm the digestive system 40 . In this research, the prevalence of digestive system diseases in the high-exposure group was higher than that in the low-exposure group before and after matching. After matching, the prevalence of such diseases was 21.43% and 15.06% in the high- and low-exposure groups, respectively, which was higher than other researchers' survey results on digestive system diseases in rural residents 41 . This result further indicates that the vegetable greenhouse environment has a promoting effect on digestive system diseases compared with the ordinary rural working environment. In the digestive symptom analysis, upper abdominal pain and flatulence differed between the different exposure groups, and their frequency in the high-exposure group was higher than that in the low-exposure group. On the one hand, this result may be due to improper protective measures and other factors. Pesticides are more likely to enter the body through the mouth and nose when greenhouse workers spray pesticides, and pesticides entering the body bind to the serine at the active center of pancreatic cholinesterase. This binding inhibits the enzyme's activity and thus results in a large accumulation of acetylcholine in nerve synapses, affecting nerve conduction and impairing gastrointestinal function 42 . On the other hand, greenhouse workers need to carry pesticide spraying cans of a certain volume when spraying pesticides, which places their abdomen under long-term compression; therefore, these workers are more likely than the general population to exhibit abdominal distension, abdominal pain and other discomfort symptoms. Some scholars have also suggested that pesticide exposure was associated with the occurrence of digestive system diseases 43 . As a result of the effects and high frequency of pesticide spraying and exposure to the special microenvironment of greenhouses, pesticide exposure in greenhouses is one of the risk factors for digestive system diseases.
Table 10. Difference test of symptom frequency in the immune and endocrine systems before and after matching.
Studying neurological symptoms, we found that with the increase in the frequency of pesticide exposure in greenhouses, workers suffer from different degrees of neurological discomfort, such as a decline in sleep quality, difficulty falling asleep, sleep pain, and memory loss. Research has shown that the nervous system is another important target of pesticide exposure. People exposed to pesticides may have symptoms of neurological discomfort of varying degrees, and the symptoms may be aggravated with the increase in exposure intensity 44,45 . Because of the special working environment, greenhouses aggravate damage to the nervous system of practitioners and make them more prone to symptoms of discomfort. Therefore, as one of the risk factors for nervous system diseases, greenhouse pesticide exposure should be given sufficient attention by researchers. In addition, different levels of exposure to pesticides can cause different degrees of eye itching, sneezing and other irritant symptoms 46,47 . Effective protection can alleviate pesticide irritant symptoms in greenhouses, and reducing the frequency and intensity of exposure can effectively alleviate these symptoms 47 .
In this research, there was no difference in symptoms related to the respiratory system and the endocrine system between the different exposure groups, which is inconsistent with some researchers' findings 14,48 . This result may be due to the short period of time the workers in the selected sample areas spend working in a greenhouse; additionally, the sample population is not engaged in this kind of work all year round. They enter greenhouses to do related work only during busy farming periods. At other times, they go out for other work without exposure to pesticides. This period serves as a washout period for pesticide exposure toxicity. Therefore, the impact of the greenhouse environment on the disease incidence of different systems will be weakened, but follow-up research should be conducted to verify this hypothesis.

Conclusion
It was found that different degrees of exposure to greenhouse pesticides can not only lead to multisystem diseases in humans but also cause many uncomfortable physical symptoms for greenhouse practitioners, affecting their health and work efficiency. This hazard was manifested not only in some acute irritant symptoms but also in chronic diseases due to long-term exposure 18 .