Metagenomic analysis of human-biting cat fleas in urban northeastern United States of America reveals an emerging zoonotic pathogen

An infestation of cat fleas in a research center led to the detection of two genotypes of Ctenocephalides felis biting humans in New Jersey, USA. The rarer flea genotype had an 83% incidence of Rickettsia asembonensis, a recently described bacterium closely related to R. felis, a known human pathogen. A metagenomics analysis developed in under a week recovered the entire R. asembonensis genome at high coverage and matched it to identical or almost identical (> 99% similarity) strains reported worldwide. Our study exposes the potential of cat fleas as vectors of human pathogens in crowded northeastern U.S, cities and suburbs where free-ranging cats are abundant. Furthermore, it demonstrates the power of metagenomics to glean large amounts of comparative data regarding both emerging vectors and their pathogens.


Results
Cat flea infestation. On July 18, 2019 a Rutgers University maintenance worker discovered large numbers of small insects on his clothing and skin after exiting a nearby work site and rushed to the Center for Vector Biology seeking assistance. The following day, while doing research in one of the center's laboratories, the first author (FCF) sustained multiple bites with subsequent lesions on legs (Fig. 1a), arms (Fig. 1b) and body, detected the first fleas in one of the center's laboratories and collected five fleas actively feeding on himself. Subsequently, a large infestation of Ctenocephalides sp. fleas was identified at the (index) work site associated with the presence of free-ranging cats, and a secondary infestation of adult fleas (introduced with the maintenance worker) at the research center. The research center was evacuated and extensive meetings among entomologists, facilities, janitorial, occupational health and safety and contract pest exterminator personnel were required to develop a control plan. To limit impact on live insect colonies maintained at the research center but outside the infested quarters, all surfaces in infested areas were treated twice with a short-residual pyrethrin followed each time by extensive washing and vacuuming of floors and carpets. To assess control efficacy, the exterminators deployed glue boards in every room and light traps in the corridors. The infestation was declared over after 2 weeks when no new fleas were captured for three consecutive days.
In addition to the five fleas that were collected feeding, we obtained 102 fleas from glue traps deployed in both infested sites, which were all morphologically identified as cat fleas (Ctenocephalides felis, Fig. 1c) based on three diagnostic characters: slightly convex shape of the head; first spine of the genal ctenidia exceeding at least half of the second in length; and presence of six spiniform setae in the dorsoposterior margin of the hind tibia. We extracted DNA individually from all 107 fleas and selected a sub-sample of 37 for molecular barcoding at the cytochrome c oxidase subunit I (coxI) locus. This revealed two coxI haplotypes of C. felis that matched those from a recent worldwide assessment of genetic diversity in Ctenocephalides fleas 7 . We found 31 fleas with coxI haplotype h1 (Accession no. MT901293) and 6 fleas with coxI haplotype h6, Accession no. MT901294), which differs from h1 by 2.1% at the nucleotide level. Haplotype h1 is distributed across the temperate zone in all continents except Antarctica and was also recently found in a Long Island, NY (cat flea sequenced as part of bioproject PRJNA184075, Accession no. MK481256). Haplotype h6 is known to occur in the Central African Republic and the Seychelles 7 and has been detected in Georgia, USA 12 . Four of the five fleas collected while feeding on humans had the h1 haplotype and one had the h6 haplotype. Fleas with different haplotypes were morphologically similar.
Although the bites experienced indicate that fleas were biting humans, we amplified and sequenced a 741 bp fragment of vertebrate cytb locus (Ngo & Kramer 2003) from a h6 haplotype flea with a visible blood meal to confirm that they were successfully obtaining human blood and not just probing. We obtained a clean sequence of Homo sapiens cytb (Accession no. MT901295) confirming the blood in the flea was from a human.
Rickettsia detection in cat fleas. We tested all 107 fleas for Rickettsia sp. in pools of DNA with up to eight specimens using a qPCR assay targeting the conserved 17kD antigen locus 13 . Fleas from six positive pools were individually tested with the same assay, revealing six positive individuals (5.6%) with a single positive per pool. We then amplified and sequenced fragments of the gltA 6 (Accession no. MT901296) and ompB 14 (Accession no. MT901297) genes from all six positive fleas, which generated a single genotype for each gene. These sequences produced a 100% match with isolates previously described as Rickettsia asembonensis in California, US 15 , Peru 16 and Brazil 17 . We found that five of the six infected fleas had haplotype h6 (83% of h6 fleas were positive for R. asembonensis) and one had haplotype h1 (3% of h1 fleas were positive for R. asembonensis). These infection rates are significantly different (Fisher's Exact Test; P < 0.001).
Metagenomic analyses: Rickettsia asembonensis genome and cat flea mitochondrial genome. We selected a h6 Rickettsia-positive cat flea that was trapped in a glue trap for a metagenomic analysis using next-generation Illumina sequencing and generated 12.19 M paired-end reads (24.38 M total reads) of 305 bp that assembled to 399,287 scaffolds with an N50 of 1.4Kbp and summed to 430 Mbp. Of these  We used the Illumina metagenome assembly data to compare the R. asembonensis we detected with R. asembonensis isolates from around the world by deriving full-length gene sequences for five commonly sequenced rickettsial marker genes, which include conserved (gltA and 17-kDa) and variable (ompB, ompA and sca4) loci. Gene sequences identical or almost identical (> 99.9% identity) to those from our study were detected in cat fleas from California, Texas and Georgia within the U.S. and in cat fleas and other vectors in countries in Central and South America, Europe, Africa and in Asia ( Table 1). The R. asembonensis detected here is similar (> 99%) to genotypes obtained from human blood in Peru 19 and Malaysia 20 , from Long-tailed macaques (Macaca fascicularis) in Malaysia 21 and from cats in Thailand 22 .
In addition to the R. asembonensis genome, we identified a scaffold encoding a near full-length C. felis mitochondrial genome. This 17.54 kbp scaffold (NODE_28 within the assembly) encodes 13 protein coding genes, 22 tRNAs and two rRNAs and had a mean mapping coverage of > 526×. We were unable to circularize the genome due to a high number of tandem repeats in the control region that could not be spanned by short-read sequencing. Annotation and alignment with four existing Siphonapteran mitogenomes currently available [23][24][25] (Fig. 2) indicate that gene order and sequence within the coding region remain relatively conserved among the three infraorders (Ceratophyllomorpha, Pulicomorpha, Hystrichopsyllomorpha) and four families sampled (Ceratophyllidae, Vermipsyllidae, Hystrichopsyllidae, Pulicidae).

Discussion
Our unanticipated study uncovered two different mitochondrial lineages of human-biting cat fleas infected with Rickettsia asembonensis. This flea-borne spotted fever Rickettsia species falls within the transitional group 26 and has been associated with human pathogenicity in Peru 19 and in Malaysia 20 . Rickettsia asembonensis was detected in the blood of healthy cynomolgus monkeys (Macaca fascicularis) in Malaysia 21 and in blood samples from cats in Thailand 22 , showing that this bacterium infects wild and domestic mammals, which in turn can act as reservoirs for human infections. To date, R. asembonensis has been detected in three tick species and in 10 flea species from 17 countries across four continents; the global distribution of positive cat fleas suggest they are likely to be vectors of this Rickettsia species. More importantly, regions with confirmed cases of R. asembonensis infection in humans and other mammals also have reports of positive cat fleas ( Table 1). The cat flea is the only known vector of R. felis 27 , and currently stands as the main vector of typhus fever in the U.S. 4,28 , reinforcing its importance as a vector of zoonotic pathogens.
We found that fleas with both h1 and h6 coxI haplotypes bit humans and we identified human blood in a haplotype h6 flea infected with R. asembonensis. Of note, although the dozens of skin lesions resulted in a few days of significant discomfort no further symptoms have surfaced. While flea-borne rickettsioses in humans have been linked to the presence of Rickettsia-positive cat fleas 29 , direct evidence of positive fleas biting humans is still scarce. The significant difference in infection rate between fleas with different coxI haplotypes raises questions about differences in vector competence or likelihood of vertical transmission of R. asembonensis in different genetic lineages of cat fleas.
In the U.S., human cases of flea-borne diseases have been reported in Hawaii, Texas and California 30 and our study brings attention to the possible emergence of flea-borne pathogens in the northeastern region where millions of cats and dogs thrive alongside humans 31 . In fact, the cat flea infestation we report led to the discovery of a medium sized colony of free-ranging cats in the basement of the primary infestation site. Our results warrant increasing the public's awareness that free-ranging cats can harbor fleas infected with zoonotic pathogens. Of note, serological assays for other rickettsial agents such as Rickettsia typhi (murine typhus) and R. rickettsii (the agent of Rocky Mountain spotted fever) cross-react with R. asembonensis 32 and the latter should therefore be considered when diagnosing fevers of unknown origin in the northeastern U.S.
The ability to sequence and assemble an entire rickettsial genome from a single infected flea illustrates the potential that modern metagenomic analyses hold for vector biology and public health. We demonstrated that the R. asembonensis detected here is highly similar to those distributed in many countries across the Americas, Europe, Africa and Asia. These results are currently tempered, however, by a lack of comparative data in public databases e.g., GenBank, with a large majority of data from closely related Rickettsia species existing as single genes, partial gene fragments, or missing entirely. Recovery of the complete mitochondrial genome of C. felis concomitant with the pathogen genome enables expanded genotype by genotype evolutionary analyses and provides data useful for population-level analyses using mitochondrial markers. We envision that as metagenome sequencing becomes cheaper and commonplace in situations such as this, comparative analyses within and between global vector populations and their pathogens (inclusive of bacteria, viruses and eukaryote parasites) will be possible, and will quickly expand our knowledge about potential emerging pathogens to which human populations are exposed.

Methods
Flea sampling, identification and DNA extraction. One of the authors of this study (FCF) caught five fleas that were feeding on himself and put these specimens in microtubes containing 95% ethanol. We also obtained fleas that were trapped in glue boards (Catchmaster Mouse and Insects glue boards, Catchmaster, NJ, Fleas were identified to the species level with the aid of a taxonomic key (Centers for Disease Control and Prevention 33 ). We then extracted DNA from all 107 fleas individually using a phenol-chloroform protocol 34 .

Genetic identification of fleas and blood meal analysis.
To confirm the flea species we used primers LCO1490 and Cff-R 35 to amplify and sequence the mitochondrial cytochrome oxidase 1 (coxI) barcode locus from 37 fleas (six Rickettsia-positive fleas), which included one collected feeding on a human (FCF), four other fleas collected on this same person and 27 additional fleas chosen randomly from those available. We amplified and sequenced a 741 bp fragment of vertebrate cytb locus using primers MammalianF and MammalianR 36 from a flea with a visible blood meal.
Rickettsia detection. We tested all 107 fleas for Rickettsia sp. in pools of up to eight specimens using a qPCR assay targeting the conserved 17kD locus 13 . Fleas from positive pools were individually tested with the same qPCR assay to pinpoint which fleas in each pool were positive. To identify the Rickettsia to species level we amplified and sequenced a 608 bp fragment of the rickettsia gltA gene using primers PgltA-2F 5′-TTC TCA  TCC TAT GGC TAT TATGC-3′ and PgltA-2R 5′-TTC AAG TTC TAT TGC TAT TTG-3′ 6 and a 820 bp fragment of the rickettsia ompB gene using primers 120-M59 and 120-807 14 . Positive and negative controls produced the expected results in all tests performed.

Metagenomic analysis.
We used approximately 50 ng of DNA from an un-engorged Rickettsia-positive flea (haplotype h6) to construct an Illumina shotgun sequencing library using the Nextera FLEX sample prep kit (Illumina Inc, San Diego, CA) and a 600-cycle version 3 sequencing kit in 305 bp × 305 bp paired-end mode. Raw reads were adapter and quality-trimmed using BBduk (sourceforge.net/projects/bbmap/) and assembled using SPAdes 3.14.0 37 . Each scaffold was queried against the NCBI 'nt' nucleotide database with the addition of the C. felis draft genome sequence (NCBI accession GCF_003426905; Driscoll et al. [unpublished]) using BLASTN 2.9.0+ 38 on the Rutgers amarel high-performance computing cluster, and hits were annotated with corresponding NCBI taxonomy using the Taxonomizr R module (https ://cran.r-proje ct.org/web/packa ges/taxon omizr /index .html). Using BLAST homology to Rickettsia CDS sequences within NCBI, we identified full-length homologs of the gltA, ompB, ompA, sca4 and the 17 kDa outer membrane antigen htrA within this assembly. Raw reads were additionally mapped to the R. asembonensis strain NMRCii genome scaffolds and plasmid pRAS01 18 [GenBank accession GCA_000828125.2]) using BBmap. Reads with > 1 best alignment were discarded. Of the 23.24 M trimmed reads (totaling 4.51Gbp), 600 k (2.58%) were mapped to the reference. We used this mapping with the Geneious Prime v.2020.1.2 variant caller (Biomatters, Ltd., Auckland, NZ) to assess the number of single-nucleotide variants (SNVs) between the two R. asembonensis genomes. The scaffold encoding the C. felis mitochondrial genome was annotated using MITOS 39 .

ethical statements
All methods were carried out in accordance with relevant guidelines and regulations. The research reported does not involve experiments on humans or animals and therefore neither IRB nor IACUC protocols were necessary. Instead it reports the outcome of an inadvertent, unexpected and undesired ectoparasite infestation in our research center. Fleas were collected by the first author from his own person upon infestation of the research center.

Data availability
The datasets generated during and/or analysed during the current study are available in GenBank under Accession Numbers MT901293-MT901297 for the sequences obtained via Sanger sequencing. The assembled metagenome scaffolds and raw sequence reads are available via NCBI BioProject PRJNA659057.