Environmental and structural factors associated with bacterial diversity in household dust across the Arizona-Sonora border

We previously reported that asthma prevalence was higher in the United States (US) compared to Mexico (MX) (25.8% vs. 8.4%). This investigation assessed differences in microbial dust composition in relation to demographic and housing characteristics on both sides of the US–MX Border. Forty homes were recruited in the US and MX. Home visits collected floor dust and documented occupants’ demographics, asthma prevalence, housing structure, and use characteristics. US households were more likely to have inhabitants who reported asthma when compared with MX households (30% vs. 5%) and had significantly different flooring types. The percentage of households on paved roads, with flushing toilets, with piped water and with air conditioning was higher in the US, while dust load was higher in MX. Significant differences exist between countries in the microbial composition of the floor dust. Dust from Mexican homes was enriched with Alishewanella, Paracoccus, Rheinheimera genera and Intrasporangiaceae family. A predictive metagenomics analysis identified 68 significantly differentially abundant functional pathways between US and MX. This study documented multiple structural, environmental, and demographic differences between homes in the US and MX that may contribute to significantly different microbial composition of dust observed in these two countries.


Study population
Overall, 20 homes from Tucson, Arizona, US (TUS), 20 homes from Nogales, Arizona, US (NUS), and 40 homes from Nogales, Sonora, MX (NMX) (20 high socioeconomic status (high SES) and 20 low socioeconomic status (low SES)) were recruited into the study. The household and environmental characteristics were analyzed and compared between the countries (US vs MX) as well as between neighborhoods (TUS, NUS, high SES NMX, and low SES NMX). There was no significant difference in age (mean years of age ± SD: US = 29 ± 6, MX = 32 ± 11, p = 0.58) or gender (female gender: US = 93.0%, MX = 90%, p = 0.50) of the respondents between the US and MX, but the respondents in the US had significantly more years of education (mean years of education ± SD: US = 14 ± 4, MX = 11 ± 4, p < 0.01) (Supplemental Table S1). US households had higher incomes (p < 0.01) and cleaned their floors less frequently (p < 0.01) (Table 1). The source of drinking water was also significantly different between homes in MX and homes in the US, with more US homes utilizing public tap water (p < 0.01) (Table 1). US families were significantly more likely to have inhabitants with asthma than those in MX (US = 30.0%, MX = 5.0%, p < 0.01) (Table 1). There were no significant differences between the US and MX in the percentage of homes that had smokers (US = 29%, MX = 28%, p = 1.00) or in the number of inhabitants per household (US = 6.2 ± 2.2, MX = 6.4 ± 2.0, p = 0.52) (Table 1). There was a significant difference in whether the households had any pets between US and MX (p < 0.01) as well as between TUS vs NUS vs high SES NMX vs low SES NMX (p = 0.01) (Table 1).

Household characteristics
Although all cities were close to the US-MX Border and within 100 km of each other, there were striking differences between the homes by country. All the homes in high SES NMX had the same floor plan, as it was a planned subdivision. The percentage of homes with paved roads (p < 0.01), flushing toilets (p < 0.01), and piped water (p < 0.01) was significantly lower in low SES NMX compared with the other neighborhoods (Table 2). The homes in the US differed from those in MX in type of structure, with 33% of homes in the US being apartments or trailers, while 100% of the homes in MX were detached or duplex houses (p < 0.01) (Table 2). Homes in the US had more rooms and more bathrooms than homes in MX (p < 0.01). All homes in the US had either air conditioning, evaporative cooling, or both, whereas only a few of the homes in MX (10%) had either type of cooling.

In-home floor dust loading
As presented in Table 3, we found significantly greater dust loading in MX homes than in US homes (US = 107.4 ± 2.2 mg/m², MX = 172.4 ± 2.9 mg/m², p = 0.04). In addition, the homes located on a dirt road had significantly higher dust load than the homes located on an asphalt road (dirt = 285.9 ± 3.0 mg/m², asphalt/tile = 106.3 ± 2.2 mg/m², p = 0.02). Significantly more dust was retrieved in homes with at least one pet (homes with a pet = 176.9 ± 3.27 mg/m², homes with no pet = 115.3 ± 2.15 mg/m², p = 0.01), more than two children (more than 2 children = 192.1 ± 3.2 mg/m², 2 or fewer children = 115.3 ± 2.3 mg/m², p = 0.01), and more than 4 residents (more than 4 residents = 176.4 ± 2.9 mg/m², 4 or fewer residents = 105.1 ± 2.2 mg/m², p = 0.01) (Table 3). There were no significant differences in dust loading based on the number of rooms or the number of adults living in the home. Finally, although we did not find any significant differences in dust load by floor type between homes located in the US and MX, within US homes there was a significantly higher dust load (p = 0.03) in homes that had some sort of carpet than in homes with smooth flooring (Supplemental Fig. S1). There were no differences in dust loading in relation to whether there were asthmatic inhabitants (p = 0.89).

Dust microbiome
The microbiome compositions, as measured by unweighted UniFrac, of the dust in the US samples and MX samples were significantly different (PERMANOVA pseudo-F = 2.59, p < 0.001, q < 0.001; Fig. 1). We trained a random forest classifier to differentiate the neighborhoods using the dust microbiome compositions, and the classifier achieved an overall accuracy of 0.875. A standard approach for assessing accuracy is to compare it to the accuracy that would be achieved with a "dummy classifier". Typically, this means "classifying" by randomly assigning a label to each sample, or by assigning the most frequent label to each sample, and computing the resulting accuracy. When we compared the accuracy of our random forest classifier to the baseline accuracy achieved with a "most frequent label" dummy classifier, we found that the random forest classifier's accuracy was 1.75 times better than baseline accuracy (Fig. 1B). This supports our PERMANOVA results by suggesting that our dust microbiome feature data contain information that is specific to the different neighborhoods. Samples predicted to be from the United States were accurately labeled 87.5% of the time, and samples from Mexico were accurately labeled 87.5% of the time (Fig. 1B). Our random forest model and ANCOM each reported many different genera, but both methods identified the same three genera and one family as enriched in the MX dust (Alishewanella, Paracoccus, Rheinheimera genera and Intrasporangiaceae family) (Fig. 2).
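The comparison with a "most frequent label" dummy classifier can be made concrete with a short calculation. The sketch below is illustrative only: the function name `baseline_accuracy` is hypothetical, and the label counts assume that all 40 US and 40 MX homes contributed a usable sample.

```python
from collections import Counter


def baseline_accuracy(labels):
    """Accuracy of a dummy classifier that always predicts the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Assumed label counts: 40 US homes and 40 MX homes.
labels = ["US"] * 40 + ["MX"] * 40
baseline = baseline_accuracy(labels)

# Overall accuracy reported for the random forest classifier.
rf_accuracy = 0.875
ratio = rf_accuracy / baseline

print(f"baseline = {baseline:.2f}, ratio = {ratio:.2f}")
```

With balanced classes the baseline is 0.5, so an accuracy of 0.875 corresponds to the reported 1.75-fold improvement.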
The dust microbial composition between the four sites (two US cities and two MX neighborhoods) was significantly different (PERMANOVA pseudo-F = 2.06, p < 0.001; because all tests reported in this paragraph were of the same type, we report the results as F, p, and q-values for readability). The microbial composition in the dust from the high SES neighborhood in NMX was significantly different from that of the US cities (NUS versus high SES NMX: F = 1.74, p = 0.011, q = 0.0132; TUS versus high SES NMX: F = 2.50, p = 0.003, q = 0.0030) and significantly different from that of low SES NMX (F = 2.35, p = 0.003, q = 0.0045). The low SES NMX neighborhood dust microbiome composition was also significantly different from that of the US cities (NUS versus low SES NMX: F = 2.18, p < 0.001, q = 0.003; TUS versus low SES NMX: F = 2.227, p < 0.001, q = 0.0030) (Fig. 1). The US cities were not significantly different from each other (TUS versus NUS: F = 1.20, p = 0.133, q = 0.133) (Fig. 1). A sub-analysis showed that the house dust microbial composition differed when comparing the homes by road type (F = 2.21, p < 0.001, q < 0.001), presence of air conditioning (air conditioning vs none: F = 1.86, p = 0.006, q = 0.036), all rug vs all smooth flooring present in the home (F = 1.79, p = 0.009, q = 0.027), household income of < $6,000 (the lowest income bracket) vs ≥ $26,000 (the highest income bracket) (F = 1.87, p = 0.003, q = 0.045), having piped water at the house (F = 1.75, p = 0.008), having flushing toilets in the home (F = 1.52, p = 0.031), and main drinking water source for the house (hauled in vs indoor public water: F = 1.56, p = 0.005, q = 0.015; hauled in vs vending machine water: F = 1.62, p = 0.005, q = 0.015) (Supplemental Fig. S2 and Supplemental Table S1). The sub-analysis also showed that the house dust microbial composition was significantly different between houses with and without an asthmatic occupant (F = 1.14, p = 0.018, q = 0.019) (Supplemental Fig. S2). We trained a random forest classifier to differentiate the neighborhoods using the dust microbiome compositions, and the classifier achieved an overall accuracy of 0.81, which was 3.25 times better than baseline accuracy (assigning the most common category to all samples). The model's NUS predictions were accurate 100% of the time; the TUS predictions were accurate 50% of the time, with the remaining 50% of the TUS samples labeled NUS; the high SES NMX predictions were correct 75% of the time, with the remaining 25% of the high SES NMX samples predicted to be from low SES NMX; and the low SES NMX samples were all accurately classified (Fig. 1). When the model predicted the wrong neighborhood, it still predicted the correct country. Our random forest model and ANCOM for comparing neighborhoods both found one genus (Georgenia) that was enriched in the dust microbiome of the low SES NMX group (Supplemental Fig. S3).
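The q-values in this paragraph are FDR-corrected p-values. The text does not name the correction procedure, so the sketch below assumes the Benjamini-Hochberg method. The function name is hypothetical, and the input p-values are illustrative, loosely modeled on the six pairwise neighborhood comparisons with "p < 0.001" entered as 0.001.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values


# Illustrative inputs (assumption: "p < 0.001" is entered as 0.001).
p = [0.011, 0.003, 0.003, 0.001, 0.001, 0.133]
q = benjamini_hochberg(p)

for p_i, q_i in zip(p, q):
    print(f"p = {p_i:.3f} -> q = {q_i:.4f}")
```

Under these assumptions, a raw p-value of 0.011 is adjusted to 0.0132 and the largest p-value is left unchanged, which illustrates why a q-value can equal or exceed its p-value.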
The PICRUSt analysis predicted that there were 68 significantly differentially abundant functional pathways in the house dust microbial communities when comparing the US to MX (Supplemental Fig. S4). Twenty-two functional pathways of the microbial communities in the dust were significantly different between high SES NMX vs low SES NMX vs NUS, as compared to TUS (Supplemental Fig. S5). The PICRUSt analysis predicted that the microbes present in the house dust in low SES NMX had superpathway of polyamine biosynthesis III capabilities. When comparing low SES NMX to high SES NMX, the microbes present in the house dust in the low SES NMX neighborhood had greater predicted capabilities for pyrimidine deoxyribonucleotides biosynthesis, benzoyl-CoA anaerobic degradation, nitrate reduction, and spirilloxanthin and 2,2'-diketo-spirilloxanthin biosynthesis (Supplemental Fig. S5).

Discussion
This study examined differences in household characteristics, environmental factors, socio-economic factors, and microbial composition of dust from households in neighborhoods across the US-MX Border, in which we previously documented differential rates of asthma prevalence 29. We identified multiple structural, environmental, and human factor differences between the US and MX homes. There were differences in dust loading and microbial diversity across the samples collected from the homes. Some of the key structural differences between the homes in MX vs the US may have led to the differences in microbial compositions. The microbial composition was significantly different when comparing household dust from homes with different road types, different drinking water sources, presence of air conditioning vs no air conditioning, all rug vs all smooth flooring, piped water at the house, and flushing toilets in the home. The bacterial genera Alishewanella, Paracoccus, and Rheinheimera and the Intrasporangiaceae family were found to be enriched in the dust from homes within MX. In the low SES NMX (low socioeconomic status Nogales, Mexico) group, the genus Georgenia was found to be enriched in the house dust.
This study found multiple housing characteristics that differed between MX and the US, differences that may contribute to the higher prevalence of asthma in children of Mexican descent living in the US compared to MX 29. Homes in the US were more likely to be on paved roads and to have flushing toilets, piped/municipal water, more rooms and bathrooms, and evaporative cooling and/or air conditioning compared to MX homes. Although only 26.3% of individuals in the US drank their tap water, this was significantly higher than in MX (2.6%). Floors were more likely to be carpeted in the rooms where the dust was collected in the US. Carpets in the home may be an important risk factor for asthma development, as they can be a reservoir for mildew, mold, allergens, and chemical hazards (e.g., pesticides, metals, flame retardants, and per- and polyfluoroalkyl substances (PFAS)), and can contribute to poor indoor air quality 30,31. Although carpets typically have greater dust loading than hard floors, the homes in MX had significantly greater dust loading than the ones in the US, despite the lack of carpet (Tables 2, 3). This suggests that differences in household density (particularly the number of children), the presence of pets, and other household structural factors between these communities might account for the greater dust loading in MX. Microbial populations in indoor environments, where we live and eat, play an important role in human health. Environmental dust exposure early in life appears to influence which bacteria colonize the gut, skin, and nasal microbiome 32–35. It has been proposed that part of the reason for increasing asthma and allergy prevalence worldwide is a shift in our lifestyles towards more Western or modernized ways of living, which has led to a decrease in microbial exposure during a critical period of immune development 36. Individuals with exposure to more diverse bacteria have lower rates of allergic diseases 37,38. We found significant differences in the microbiome composition of dust collected from homes in MX compared to the US (Fig. 1). Previous studies have shown a link between house dust microbial composition and risk of allergic asthma development. Children raised in Amish communities have lower rates of allergic sensitization and asthma than those from Hutterite communities. Stein et al. demonstrated that house dust from an Amish community had a different microbial composition than the house dust from a Hutterite community. Further, mice that received the dust from the Amish houses intranasally had decreased airway reactivity and eosinophilia, which are markers of allergic asthma 22. In this study we found a significant difference in microbial dust composition between homes with and without an occupant who reported having asthma (Supplemental Fig. S2).
House dust microbial composition is affected by outdoor and indoor environments, including structural characteristics, as well as household occupants and their activities in the home. Any combination of the significant differences between the homes in MX and the US may have led to the differences observed in the microbial composition of the house dust. A sub-analysis showed that there was a difference in beta diversity (unweighted UniFrac) of the dust microbiome when comparing the homes by road type, presence of air conditioning, presence of piped/municipal water, presence of flushing toilets, drinking water sources, and having all rug vs all smooth flooring. Using an air conditioner changes the indoor environment by changing the temperature and humidity, which would lead to differences in microbial growth indoors 39, as well as by filtering the air. Homes with air conditioners are less likely to have windows open, and therefore less dust is likely to blow into the home. Furthermore, it has been shown that as regions become more industrialized and homes are constructed to be more tightly sealed, microbiome diversity decreases 40. The water sources present in the homes were associated with variation in the house dust microbial composition. It has been shown that municipal water has fewer commensal microbes present and that drinking municipal water is associated with higher prevalence of asthma and allergies 21,41.
We have previously shown that MX and the US have different prevalence of childhood asthma, and this study also found differences in house dust composition between the two regions 29. The dust from Mexican homes was more enriched with the Alishewanella, Paracoccus, and Rheinheimera genera and the Intrasporangiaceae family. Although the Alishewanella and Rheinheimera genera and the Intrasporangiaceae family have not previously been identified as related to asthma prevalence, they have been linked to less industrialized and outdoor environments as well as to other allergic diseases. Alishewanella, Rheinheimera and Paracoccus are gram-negative bacteria that have endotoxin present in their cell walls. Previous studies have shown that higher levels of endotoxin in a child's environment are related to lower incidence of atopic asthma and allergic diseases 22,42,43. Alishewanella, Paracoccus, Intrasporangiaceae, and Rheinheimera can be found in soil. Intrasporangiaceae has specifically been found in higher concentrations in soil in unindustrialized areas 44,45. In the southwest of the US and in Mexico City, Rheinheimera has been found to be more frequently present in the air during dust storms and in the air of semirural areas compared to urban areas 46,47. Given that the households in MX were more likely to be on dirt roads, to have no air-cooling system, and to have greater air exchange rates with the outdoors, it is reasonable that these bacteria are more commonly present in MX homes. Intrasporangiaceae has been shown to be depleted in household dust in more industrialized regions 45,48. Depletion of Intrasporangiaceae and Rheinheimera in a person's environment and/or on their skin is linked to development of atopic dermatitis 49,50. Paracoccus has been postulated to be protective against atopic dermatitis in that it is more prevalent in the skin microbiome of healthy adults than in that of adults with skin affected by atopic dermatitis 51. However, unlike the other identified genera and family, it has previously been linked to asthma prevalence. A study examining the effects of indoor microorganisms on asthma and allergic disease in children found that healthy children had house dust enriched with Paracoccus 52. Alishewanella was established as a distinct genus only in 2000, and its diversity is still being characterized. Alishewanella is also in the phylum Proteobacteria, which, much like Actinobacteria, has been shown to be both negatively and positively associated with allergic disease in different studies 53–56. The inconsistencies in the direction of these relationships may be explained by the need to classify beyond the phylum level, which is very broad, or by the presence of critical time points at which exposure protects against onset of asthma but may be harmful once asthma has developed. In any case, the differences found in microbial exposure between the sites suggest promising areas for future research related to asthma in the Border region.
Although some microbes in the dust are unlikely to be metabolically active, others are living and metabolically active, and so may play a critical role in altering the host microbiome. They can colonize the host, where they can become metabolically active and play a key role in the risk of asthma development. A PICRUSt analysis was done to identify potentially relevant metabolic pathways present in the house dust from homes located in MX vs the US. There were 23 biosynthetic pathways, 38 degradation/utilization/assimilation pathways, and 7 pathways involved in generation of precursor metabolites that were higher in the house dust from MX. Some of the pathways found to be more prevalent in the house dust from MX are involved in biosynthesis of molecules that are known to be protective against asthma (Supplemental Fig. S4), such as short chain fatty acids and pyrimidine 57,58. Multiple studies have shown the importance of metabolites produced by the microbiota that colonize humans in protecting against asthma via alterations in epithelial barrier function and immune system regulation. Many of the possibly up-regulated functional pathways present in the house dust from MX have been examined in other studies and have been shown to be protective against asthma or helpful in asthma control. For example, the UDP-N-acetyl-D-glucosamine and the mycolate biosynthetic pathways that were higher in the house dust from MX have been shown to down-regulate allergic airway inflammation 58–62 (Supplemental Fig. S4).
Limitations of this study include that collection of the house dust microbiome samples occurred at a single time point and during a single season. There were significant differences in the ambient temperatures during sample collection across the border, and this was likely related to more of the MX samples being collected in the spring/summer months and more of the US samples being collected in the winter. Future studies would benefit from looking at multiple time points to assess the variability of household microbiomes. The high SES NMX homes were all part of a subdivision with identical floor plans, which could drive some of the differences observed between the low and high SES neighborhoods in NMX. We used 16S rRNA amplicon sequencing to characterize the house dust, which generally supports taxonomic resolution only to approximately the genus level, although species-level differences may affect human health. Shallow or deep shotgun metagenomics, although more expensive, would improve microbial identification, and deep shotgun metagenomics would provide more accurate functional profiles of the samples. PICRUSt uses 16S rRNA data to extrapolate metagenome composition and provides relatively lower-confidence functional pathway profiles of samples. Given that asthma is a common reason for presentation to outpatient clinics and household recruitment was through outpatient clinics, there may have been selection bias because families that had an asthmatic child in the home may have been more likely to enroll in the study. However, this bias could have been present on both sides of the Border. We did not investigate differences in pollutants in the collected dust, such as metals or pesticides, which could affect the dust composition and microbial diversity, along with the growth of opportunistic bacterial pathogens 63. The structural differences and occupant behavioral differences between the US and MX likely contributed to the differences in microbial diversity in the household dust. However, many of these structural factors are correlated with each other, and larger sample sizes would be required to disentangle their effects. Although there were differences in microbial dust composition between homes with and without an occupant who reported having asthma, this study did not account for how long those individuals with asthma had lived in that home.
In conclusion, despite TUS, NUS, and NMX being geographically close (< 100 km) and having similar climates, homes across this Border region differ in ways that lead to significantly different indoor environments. Mexican and US households differed in years of education; household income; the percentage of homes that had paved roads, flushing toilets, and piped water; the number of rooms and bathrooms present in the home; and presence and type of cooling and flooring. Some of these household differences may have led to the significant differences we observed in the microbial composition of the house dust collected from MX or US homes. The dust from Mexican homes was enriched with the Alishewanella, Paracoccus, and Rheinheimera genera and the Intrasporangiaceae family. Future research should assess whether exposure to these bacteria during critical windows in early life may offer protection from development of asthma or allergic disease.

Study population
Patients at each of three clinics were approached from September to December of 2016 to participate in this study: El Rio Community Health Center in TUS (n = 20), Mariposa Community Health Center in NUS (n = 20), and the Secretaría de Salud de Sonora in NMX (n = 40). In NMX, 20 households were recruited from a traditionally high SES neighborhood (high SES) and 20 from a traditionally low SES neighborhood (low SES). All of the households that were recruited from the high SES neighborhood in NMX were in a subdivision where the houses were constructed with identical layouts. Families were eligible to participate in this study if at least one parent was of Mexican descent and had at least one child younger than 5 years old. The University of Arizona Human Subjects Protection Program approved all study materials (IRB approval number: 1607687201), in addition to all necessary permissions and reviews from the US/Mexico Border Commission, the Secretaría de Salud de Sonora, Mariposa Community Health Center and El Rio Community Health Center.

Questionnaire
During the home visit, a questionnaire was administered orally in English or Spanish by a trained research assistant to obtain information on household demographics, asthma prevalence, sanitation measures, drinking water sources, and pets in the home. Multiple household characteristics were assessed by the research assistant during the same home visit (e.g., mildew, water damage, structural characteristics, and type and number of bathrooms).

Dust sample collection
House dust was collected using a Hoover CH3000 vacuum cleaner equipped with a pre-weighed sterilized X-Cell 100 dust collection sock (Midwestern Filtration, Cincinnati, OH) inserted in the crevice tool. To collect floor dust, a one-meter square template was laid on the floor in the child's room and vacuumed for five minutes. If a child did not have their own room, then the sample was collected in the room where the child regularly spent most of their time. The sock filter holding the collected dust was placed in a plastic bag sterilized under UV light in a hood. The vacuum and its accessories were cleaned with disinfectant wipes and sprayed with isopropyl alcohol between sampling each house. Collected dust samples were transported in a cooler with ice packs to the laboratory in the Medical Research Building at the University of Arizona in Tucson, AZ.
In the laboratory, each dust sample was transferred from the filter sock to a pre-weighed 50 ml sterilized centrifuge tube. The centrifuge tube with the dust sample was weighed three times using a Mettler Toledo AB54 Precision Balance Weight Scale (Mettler Toledo International, Inc., Columbus, OH). The average of the three measurements was then recorded. All environmental samples were stored in a −80 °C freezer until analyzed. Frozen dust samples were shipped to the Pathogen and Microbiome Institute at Northern Arizona University for amplicon library preparation and sequencing.
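The weighing procedure above implies a simple calculation of dust loading per unit floor area. The sketch below shows one plausible version, assuming that the net dust mass is the mean of the three weighings minus the tare mass of the pre-weighed tube, and that the sampled area is the one-square-meter template. The function name and all numeric values are hypothetical and are not data from the study.

```python
from statistics import mean


def dust_loading_mg_per_m2(tube_with_dust_g, tare_g, area_m2=1.0):
    """Net dust mass in milligrams per square meter of sampled floor.

    tube_with_dust_g: repeated weighings (grams) of the tube containing dust.
    tare_g: mass (grams) of the empty, pre-weighed tube (assumed tare).
    area_m2: sampled floor area; the template described above is 1 m^2.
    """
    net_g = mean(tube_with_dust_g) - tare_g
    return net_g * 1000.0 / area_m2


# Hypothetical weighings in grams; not measurements from the study.
readings = [13.2051, 13.2049, 13.2050]
tare = 13.0975
loading = dust_loading_mg_per_m2(readings, tare)

print(f"dust loading = {loading:.1f} mg/m^2")
```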

DNA extraction
DNA was extracted using the MoBio Powersoil DNA isolation kit (Qiagen) with an additional mechanical lysis. Briefly, samples were placed in a lysing matrix E tube (MP Biomedical) with 600 µl of Buffer RLT Plus and lysed in 30 s increments for a total of 6 min at 10,000 × g. Samples were rested for 30 s between each bead beating to prevent heating. Extraction continued following the manufacturer's protocol. DNA was quantified using a NanoDrop 2000. Extraction blanks, which did not contain any sample during the extraction, were carried throughout the entire extraction and 16S rRNA gene sequencing.

16S rRNA gene sequencing
Sample processing and sequencing were performed using the Earth Microbiome Project (www.earthmicrobiome.org) protocols. The barcoded primers 515F/806R were used to target the V4 region of the 16S rRNA gene, as previously described 64. Each PCR reaction contained 2.5 µl of PCR buffer (TaKaRa, 10× concentration, 1× final), 1 µl of the Golay barcode tagged forward primer (10 µM concentration, 0.4 µM final), 1 µl of bovine serum albumin (ThermoFisher, 20 mg/mL concentration, 0.56 mg/µl final), 2 µl of dNTP mix (TaKaRa, 2.5 mM concentration, 200 µM final), 0.125 µl of HotStart ExTaq (TaKaRa, 5 U/µl, 0.625 U per reaction), and 1 µl of reverse primer (10 µM concentration, 0.4 µM final). All PCR reactions were filled to a total of 25 µl with PCR grade water (Sigma-Aldrich) and then placed on a ThermalCycler. ThermalCycler conditions were as follows: a 98 °C denaturing step for 2 min; 30 cycles of 98 °C for 20 s, 50 °C for 30 s, and 72 °C for 45 s; and a final step of 72 °C for 10 min. PCR was performed in triplicate for each sample, and an additional negative control was included for each barcoded primer. A post-PCR quality control step was performed using a 2% agarose gel (ThermoFisher). Extraction blank controls were processed through the 16S PCR with the same methods as samples. No-template controls (NTCs) for each barcoded primer were carried through the agarose gel step. If amplification was present for negative controls, the PCR was repeated with a new barcoded 806R primer. Following the agarose gel, PCR product was quantified using the Qubit dsDNA High Sensitivity Kit (ThermoFisher) and the Qubit 4 fluorometer. PCR products were pooled at equimolar concentrations of 50 ng. Quality of the pool was assessed with the Bioanalyzer DNA 1000 chip (Agilent Technologies), and the pool was combined with 1% PhiX and sequenced on the Illumina MiSeq using the 600-cycle MiSeq Reagent Kit VX (Illumina).
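The final concentrations listed for the PCR buffer, primers, and dNTP mix follow from the standard dilution rule C1 × V1 = C2 × V2 with a 25 µl reaction volume. The short sketch below checks those three components only; the function name is hypothetical.

```python
def final_concentration(stock, volume_ul, total_ul=25.0):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2 (same unit as the stock)."""
    return stock * volume_ul / total_ul


# 2.5 ul of 10x buffer in a 25 ul reaction.
buffer_x = final_concentration(10.0, 2.5)
# 1 ul of 10 uM primer in a 25 ul reaction.
primer_um = final_concentration(10.0, 1.0)
# 2 ul of 2.5 mM (2500 uM) dNTP mix in a 25 ul reaction.
dntp_um = final_concentration(2500.0, 2.0)

print(f"buffer = {buffer_x:g}x, primer = {primer_um:g} uM, dNTP = {dntp_um:g} uM")
```

These values agree with the 1×, 0.4 µM, and 200 µM final concentrations stated in the protocol.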

Demographic and housing characteristic analysis
The house and family characteristics were analyzed by country (US, MX) and by neighborhood (TUS, NUS, high SES NMX, low SES NMX). Data analysis was conducted with Stata v16 (StataCorp, College Station, TX). Comparisons between the two countries were assessed with Fisher's exact and Mann-Whitney U tests. Comparisons between the neighborhoods were made using Fisher's exact and Kruskal-Wallis tests. Non-parametric tests were used, as the data had a skewed distribution as tested by the Shapiro-Wilk test. An alpha level of 0.05 was considered statistically significant.
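As an illustration of the categorical comparisons described above, the following sketch computes a two-sided Fisher's exact test in plain Python rather than Stata. The counts are an assumption: they are back-calculated from the reported asthma percentages (30% of 40 US homes and 5% of 40 MX homes), and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (a small tolerance guards against floating-point ties).
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Assumed counts: 12 of 40 US homes and 2 of 40 MX homes with an asthmatic occupant.
p_value = fisher_exact_two_sided(12, 28, 2, 38)

print(f"two-sided p = {p_value:.4f}")
```

Under these assumed counts the p-value falls below 0.01, in line with the reported result for asthma.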

Microbial data analysis
The microbiome sequencing data were analyzed using QIIME 2 2021.2 65. The emp-paired (Hamady & Knight 2009) action in the q2-demux plugin was used to demultiplex the data. We used the denoise-paired action in the q2-dada2 66 plugin to perform sequence quality control and define amplicon sequence variants (ASVs) with the following parameter settings: trim-left-f = 0; trim-left-r = 0; trunc-len-f = 200; trunc-len-r = 230. Replicate samples were grouped, and their abundances were averaged using the median-ceiling mode. A phylogenetic tree was created using align-to-tree-mafft-fasttree 67,68 in q2-phylogeny, for use with phylogenetic alpha and beta diversity metrics. The core-metrics-phylogenetic action in q2-diversity was used to compute Faith's Phylogenetic Diversity Index 69, unweighted UniFrac 70 and weighted UniFrac 71 at an even sampling depth of 24,090. The alpha-rarefaction 72 action in the q2-diversity plugin was used to generate rarefaction curves based on Faith's Phylogenetic Diversity Index 69 to confirm that the richness of the samples was stable around the chosen sampling depth. The beta-group-significance action in the q2-diversity plugin was used to run PERMANOVA pseudo-F tests across sample groupings of interest. False-discovery-rate (FDR) correction was applied to correct for multiple comparisons. These FDR-corrected p-values are presented as "q-values" in this text. Taxonomic annotation of ASVs was performed using the qiime feature-classifier 73,74 classify-sklearn method with the SILVA 138 classifier 75. A taxonomic bar plot was generated for the data based on the SILVA-based taxonomy. An ANCOM analysis was applied to identify differentially abundant taxa at the genus level across cities and countries using qiime composition ancom 76. Finally, a random forest model was built using the qiime sample-classifier 73,74 classify-samples method to predict the country and neighborhood of origin of each microbiome sample. This model was trained on the ASV table combined with the phylum and genus tables that were generated by collapsing ASVs into taxa using the collapse action in the q2-taxa plugin. The classifier used fivefold cross-validation, in which 80% of the microbial data was used for training and the remaining 20% was used for testing in each of 5 iterations, such that all samples were used in both training and testing and the overall performance was averaged across the iterations. To predict the abundance of gene families and related functional pathways of the microbial communities present in the house dust, a Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) analysis was completed, which predicts metabolic pathways based on 16S rRNA results. The significant differences between the functional pathways present in the US vs MX or the different neighborhoods' dust was

The first p-value column compares the low socioeconomic neighborhood of Nogales, Mexico vs the high socioeconomic neighborhood of Nogales, Mexico vs Tucson, United States of America vs Nogales, United States of America (value for cities). The second (bold) p-value column compares homes in Mexico vs homes in the United States of America. Abbreviations: LSES NMX = low socioeconomic status Nogales, Sonora, Mexico; HSES NMX = high socioeconomic status Nogales, Sonora, Mexico; MX = Mexico; NUS = Nogales, Arizona, United States of America; TUS = Tucson, Arizona, United States of America; US = United States of America. *One of the Mexico homes did not have information on frequency of floor cleaning. **One of the Mexico homes and two of the Arizona homes did not have information on water source.

https://doi.org/10.1038/s41598-024-63356-6
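The fivefold cross-validation scheme described for the random forest classifier in the Microbial data analysis can be illustrated with a minimal Python sketch. This is not the q2-sample-classifier implementation: the sample IDs, the labels, and the `fit_majority`/`predict_majority` placeholder model are hypothetical, and the fold assignment is a simple shuffled partition assumed here for illustration only. It shows only how each sample is held out for testing exactly once while the remaining 80% is used for training, with accuracy averaged across the five folds.

```python
import random
from collections import Counter


def k_fold_splits(sample_ids, k=5, seed=0):
    """Yield (train_ids, test_ids) so that each sample is tested exactly once."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled IDs into k folds of (nearly) equal size.
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


def cross_validated_accuracy(sample_ids, labels, fit, predict, k=5, seed=0):
    """Average test accuracy over the k folds.

    `fit` and `predict` are caller-supplied stand-ins for a real classifier.
    """
    accuracies = []
    for train, test in k_fold_splits(sample_ids, k, seed):
        model = fit(train, [labels[s] for s in train])
        predicted = predict(model, test)
        correct = sum(p == labels[s] for p, s in zip(predicted, test))
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Hypothetical placeholder model: always predicts the most common training label.
def fit_majority(train_ids, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, test_ids):
    return [model] * len(test_ids)


if __name__ == "__main__":
    # Hypothetical data: 80 homes, half labelled US and half MX.
    ids = [f"home_{i}" for i in range(80)]
    labels = {s: ("US" if i < 40 else "MX") for i, s in enumerate(ids)}
    print(cross_validated_accuracy(ids, labels, fit_majority, predict_majority))
```

With 80 samples, each fold trains on 64 samples and tests on 16. In practice, a real classifier such as a random forest would replace the majority-label placeholder.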

Figure 1. (A) Unweighted UniFrac beta diversity plot comparing the United States of America (US) and Mexico (MX) house dust (PERMANOVA pseudo-F = 3.02, p < 0.001, q < 0.0015). (B) Random Forest classifier trained to differentiate the neighborhoods using the dust microbiome compositions. The classifier achieved an overall accuracy of 0.87594. US samples were accurately labeled 87.5100% of the time, and Mexican samples were accurately labeled 89% of the time. (C) Unweighted UniFrac beta diversity plot comparing high SES NMX vs low SES NMX vs NUS vs TUS. (D) Random Forest classifier trained to differentiate the neighborhoods using the dust microbiome compositions. The model predicted NUS accurately 10,075% of the time; TUS 560% of the time; high SES NMX 75% of the time; and low SES NMX 100% of the time. LSES = low socioeconomic status Nogales, Sonora, Mexico; HSES = high socioeconomic status Nogales, Sonora, Mexico; MX = Mexico; USA = United States of America.

Figure 2. ANCOM and Sample Classifier comparing the house dust from homes in the United States versus Mexico. Top is Mexico and bottom is United States of America. The x-axes in these figures represent the relative abundance of the taxon that is highlighted in the panel, while the y-axes represent the categorizations of the samples. Each point therefore represents the relative abundance of a taxon in a single sample, and the box plots and the histograms show the distribution of the relative abundances in each sample group.

Table 1. Characteristics of the occupants and households.

Table 2. Physical characteristics of the homes in Mexico vs in the United States of America, and the low socioeconomic neighborhood of Nogales, Mexico vs the high socioeconomic neighborhood of Nogales,

Table 3. Floor dust loading (mg/m²) in relation to household characteristics. Statistical significance was determined using the Mann-Whitney U test. MX = Mexico; US = United States of America.