Single cell transcriptomics of primate sensory neurons identifies cell types associated with human chronic pain

Distinct types of dorsal root ganglion sensory neurons may have unique contributions to chronic pain. Identification of primate sensory neuron types is critical for understanding the cellular origin and heritability of chronic pain. However, molecular insights into the primate sensory neurons are missing. Here we classify non-human primate dorsal root ganglion sensory neurons based on their transcriptome and map human pain heritability to neuronal types. First, we identified cell correlates between two major datasets for mouse sensory neuron types. Machine learning exposes an overall cross-species conservation of somatosensory neurons between primate and mouse, although with differences at individual gene level, highlighting the importance of primate data for clinical translation. We map genomic loci associated with chronic pain in human onto primate sensory neuron types to identify the cellular origin of human chronic pain. Genome-wide associations for chronic pain converge on two different neuronal types distributed between pain disorders that display different genetic susceptibilities, suggesting both unique and shared mechanisms between different pain conditions.


Introduction
The dorsal root ganglion (DRG) consists of a variety of neuron types, each tuned to detect and transduce different physical stimuli. These neuron types can broadly be divided into low-threshold mechanosensitive neurons responsible for sensing touch and high-threshold nociceptors, which are involved in pain, temperature and itch [1][2][3][4] . However, a comprehensive classification of DRG neurons is critical for understanding exactly how somatosensation works and for providing insights into the cellular basis for acute and chronic pain. Rodents represent the main species for studies on the cellular and molecular basis of nociception and the greatest insights with respect to molecular classification of neuronal types have been obtained from mouse, where single-cell RNA-sequencing (scRNA-seq) has led to a molecular taxonomy of existing types of sensory neurons [5][6][7][8][9] .
This has enabled the identification of molecular types representing richly myelinated A-fiber low threshold mechanoreceptors (LTMRs) and limb proprioceptors. The remaining neuronal types in the scRNA-seq are assigned as weakly myelinated or unmyelinated neurons. One of these is a C-fiber LTMR (C-LTMR) neuron type that expresses Vglut3 (Slc17a8) and tyrosine hydroxylase (Th) that likely is not involved in pain sensation [1][2][3][4][5] . Nociception is largely conferred through unmyelinated peptidergic C-fiber neuron types and a few lightly myelinated Aδ-nociceptors, a Trpm8 expressing cluster of neurons, as well as cell types marked by expression of Mrgprd, Mrgpra3 or Sst (named NP1, NP2 and NP3 types of neurons, respectively 8 ). This molecular classification agrees remarkably well with previous studies based on myelination and conduction velocity, neurochemical features and termination patterns peripherally in the skin and centrally in the spinal cord and is also consistent with the known ontogeny of DRG neuron types 5 . As a result, there have been significant advances in understanding the cellular and molecular characteristics of sensory neurons found in mouse DRG.
Much less is known about characteristics of human DRG. Apart from information on size of the ganglia along the rostro-caudal axis, micro-anatomy including neuron size [10][11][12] and electrophysiological characteristics [13][14][15][16] , the molecular characterization of human DRG is still limited to bulk RNA-sequencing [17][18][19] and neurochemical analyses of gene products in a handful of studies 20 . Hence, the concordance of markers used in different studies and their relation to actual neuron types remain largely unknown. Nevertheless, by examining individual gene products, these studies suggest important species differences between human and mouse where for example Nav1.8, Nav1.9, P2X3 receptor and TRPV1 are present in both small and large neurons in human, but only small neurons in mouse, suggesting fundamental differences in molecular characteristics and principles of initiation and transduction of somatosensory stimuli between human and rodent 20 .
In humans, rare and drastic mutations that explain different types of congenital insensitivity to pain and erythromelalgia have been identified, such as for example SCN9A (Nav1.7), NTRK1 (TRKA) and SCN11A (Nav1.9) [21][22][23][24][25] . In addition to these rare causing mutations, it is known that the genetic risk for chronic pain is due to common variations with small effect size 26 . Close to half of the risk of developing chronic pain are attributable to genetic factors [27][28][29] , including musculoskeletal pain conditions 28 . For musculoskeletal pain there is statistical evidence for a diverse set of genes involved, with a marked overrepresentation of genes expressed in neurons and functionally associated to neurotransmission, indicating a strong heritable component caused by altered functions of  After clustering, we used canonical mouse DRG neuron markers to assign tentative identities for the clusters based on their likely mouse counterparts (Fig. 1e, f): NP1 (cluster 8) and NP2 (cluster 9) were named based on the combination of GFRA1 and GFRA2 expression; C-LTMRs (6) were assigned using GFRA2 and ZNF521 (Zpf521 in the mouse); NTRK1 and GAL were used to identify PEP1 (4); SCN10A and TRPM8 suggested the identity of the TrpM8high (2) cluster (negative for SCN10A); IL31RA expression was used for naming NP3 (7); CPNE6 together with NTRK2 was used to assign putative A-LTMRs (1), and CPNE6 together with NTRK1 and SCN10A were used to assign the PEP2 (3) cluster. The final cluster (5) also expressed CPNE6, NTRK1 and SCN10A, and was named PEP3.
A second dataset was prepared from 5 female macaques using Smart-seq2 technology. After clustering and cleaning steps (Extended Data Fig.2a-d) these data included 1038 neurons showing >480,000 counts and >13,500 detected genes per cell on average (Extended Data Fig.2e, f). Tentative neuron identities for this dataset was assigned in a supervised manner using the STRT-2i-seq data as a reference (Fig. 1f, g; Extended Data Fig.2g). Interrogation of marker gene expression used for tentative cell type assignment showed identical patterns between the two macaque datasets (Extended Data Fig.2h), thus representing an independent identification of the cell types identified in the STRT-2i-seq data. For full marker lists from both scRNA-seq datasets, see Extended Data Tables 1 and 2. An interactive web resource for browsing the datasets is available at: https://ernforsgroup.shinyapps.io/macaquedrg/.

A consensus on mouse DRG neuron types across datasets
We wanted to further leverage the vast knowledgebase of mouse DRG neuron types in our investigation of primate DRG neurons. For this, we used two major scRNA-seq datasets from previously published mouse studies, referred hereafter as the Zeisel and Sharma datasets 7,9 . These studies identify similar number of cell types, but it is not known if the same kinds of neurons were identified and furthermore, the studies use different nomenclature. We therefore first identified the corresponding cell types between the datasets. Using the label transfer method implemented in Seurat, we transferred labels from Sharma over to the Zeisel data (Fig. 2a, b; Extended Data Fig. 3a) and then also named the Zeisel types using Usoskin nomenclature 8 (Fig. 2c). We then repeated the label transfer from Zeisel to Sharma data and named the Sharma types also using Usoskin nomenclature (Fig. 2 d-f, Extended Data Fig. 3b). Finally, we repeated the label transfer from Zeisel to Sharma both ways using Usoskin labels for the Zeisel data, and observed identical results (Extended Data Fig. 3c, d).
As an independent method to establish the similarity between cell types identified in the different studies, we employed neural-network based probabilistic scoring modules for learning cell-type features between datasets. We trained the modules using both mouse datasets and using all three nomenclature versions (Zeisel and Usoskin nomenclatures for Zeisel data; Sharma nomenclature for Sharma data) (Extended Data Fig. 4a-i). We then tested the performance of these modules in finding the corresponding cell types between the datasets and found that the modules detect with high probability cell types across the different datasets and that the cell-type assignments concur with the results obtained by label transfer. For the rest of our analyses we used the Usoskin type nomenclature as the default naming. Taken together, the results confirmed a one-to-one relationship between nociceptor types identified in 7,9 although the Aδ-nociceptors of Zeisel, were split into two subtypes in Sharma ( Fig. 2a-i). Because Usoskin nomenclature performed with the least noise and greatest prediction scores, we used this nomenclature for the rest of our analyses.

Overall cross-species conserved strategy for somatosensation
We proceeded to use the probabilistic neural-network machine learning approach to evaluate whether the neuronal basis of somatosensation is conserved between macaque and mouse, and to validate our tentative assignment of cross-species neuron correlations. For this purpose, we generated the probability score for each macaque cluster to the mouse neuron types using both the macaque STRT-2i-seq and Smart-seq2 datasets. Each of the macaque clusters showed similarity to the previously assigned mouse neuron types ( Fig. 3a, b, Extended Data Fig. 5). These results also indicated that the macaque A-LTMR cluster consists of cells corresponding to the lightly myelinated Aδ-LTMR type. Our original annotation of corresponding macaque-mouse neuron types was consistent with the expression of PRDM12 in all nociceptors and the mechanosensory channel PIEZO2 in A-LTMRs, C-LTMRs and NP1 (Fig. 3c). Interestingly, the mouse proprioceptor marker PVALB was expressed in the macaque PEP2 neurons. PEP3, the other macaque neuron type showing 5,7-9 . similarity to mouse PEP2, expressed lower levels of TRPM8, moderate PIEZO2 and differed from the macaque PEP2 by expressing KIT, SCGN and lower levels of the heat sensitive channels TRPV1 and TRPA1 (Fig. 3d). When compared to the mouse PEP2 subtypes in Sharma et al., PEP3 showed higher probability to CGRP-eta over GCRP-zeta and PEP2 to GCRP-zeta over CGRP-eta (Extended Data Fig.  5a, b), indicating two types of Aδ-nociceptors.

Fig. 3. Identification of primate correlates of mouse neuron types. (a, b) Violin plots showing prediction percent probability scores for each macaque cluster against established mouse DRG cell types trained with Usoskin nomenclature. (a) STRT-2iseq, (b) Smart-seq2. (c) Violin plots of markers expressed in macaque clusters. Y-axis, raw UMI counts for STRT-2i-seq and counts for Smart-seq2. (d) Violin plots showing genes differentially expressed between macaque PEP2 and PEP3. (e-g) LargeViz plot showing cross-species clustering of macaque STRT-2i-seq and mouse Zeisel et al. datasets on a shared plot. (e) macaque, (d) mouse, (g) merged plot (legend shows origin of samples, WG = macaque, Zeisel = mouse). (h) Probability plot for each macaque neuron type against mouse neuron types. (i) Synopsis of the corresponding DRG neuron types between mouse and macaque datasets and nomenclatures
To gain further confidence in our cross-species analyses we performed co-integration of our macaque STRT-2i-seq data with mouse data from Zeisel et al. 9 using Conos. Here, the previously assigned macaque clusters showed close positioning to homologous mouse clusters on a joined cross-species clustering graph ( Fig. 3e-g). This strong cross-species association was also apparent on the probability profiles after label propagation from mouse clusters to individual macaque neurons.
All macaque clusters showed close to one-to-one correspondence to individual mouse clusters with PEP2 and PEP3 having the strongest association to the mouse PEP2 cluster (Fig. 3h). Combined, these results evidence a strong cross-species association of sensory neuron types, indicating that the overall cellular basis for somatosensation is conserved between mouse and macaque (Fig. 3i).
The macaque neuronal types were validated in vivo by triple in situ hybridization (Fig. 4a, Extended Data Fig. 6). SCN10A was used as a general marker for nociceptors together with cluster-specific markers or a combination of markers defining only one cluster. For some clusters, negative markers were used to rule out types other than the one under analysis (see Materials and Methods). In addition, we analyzed the soma size distribution and percentage contribution of each cell type in the DRG from the in vivo data (Fig. 4b, c). Finally, we interrogated each of the neuron-types for unique expression patterns for transcription factors, ion channels, G-protein couple receptors (GPCRs), catalytic receptors and endogenous ligands, including neuropeptides (Fig. 4d). This revealed for example the expression of multiple GPCR's related to exogenous defense and cholestatic itch in NP1 and NP2 (MRGPRX1-4), and to histaminergic itch and inflammatory lipids in NP3 (HRH1, S1PR1) and NP1 (LPAR3).

Species differences and similarities in gene expression
Our above results confirm an overall existence of neuronal correlates between the mouse and a primate, nevertheless, important divergences could still exist when examining expression of individual genes within each of the different mouse-macaque correlates. In order to provide a more comprehensive map of the molecular conserved and divergent features of somatic sensation and pain between the mouse and the macaque, we compared gene expression patterns between the species (Fig. 5a-c and Extended Data Table 2). Because strategies to identify new molecular targets for development of analgesic drugs often are focused on genes expressed uniquely in the neuron type(s) causative of pain, we examined the presence of genes within the different neuron types to identify conserved transcriptional programs between species as well as sets of genes that are expressed in highly species-specific manner between corresponding cell types. In such analyses, false negatives can confound the results. We therefore first examined the reliability of the individual STRT-2i-seq and Smart-seq2 datasets in side-by-side analyses of cell type specific expression patterns of mouse-macaque shared genes, macaque-specific genes and mouse specific genes, which showed high reproducibility across different platforms (Extended Data Fig. 7a-c). We thereafter combined the datasets for mouse 7,9 and macaque (STRT-2i-seq and Smart-seq2) to obtain integrated datasets for mouse and macaque. Analysis of this dataset revealed the existence of robust conserved molecular features between mouse and non-human primate (Fig. 5a). For example, neuronal type specific mouse-macaque conserved features of NP3 neurons include SST, JAK1, IL31RA, OSMR and S1PR1 (N = 36 genes) and C-LTMRs P2RY1, EXOC1L, KCND3, IQSEC2, OSBPL1A and FXYD6 (N = 60 genes). The largest cell-type specific shared gene program was found between mouse and macaque Aδ-LTMRs (N = 126 genes) whereas NP1 and NP2 both had conserved features of less than 30 genes (N = 26 and 27, respectively). However, we also identified cell type specific gene expression that were species specific and these existed both in mouse and macaque (Fig. 5b, c; Extended Data Table 3). As examples, species differences included in NP3 the specific expression of TDRD1, EDN3 and GRIA1 in macaque and NPPB, HTR1F and NTS in mouse. For C-LTMRs, TH and RARRES1 are specific for mouse, whereas HSD17B13 and CCKBR are specific for macaque. These species-specific expression patterns need to be considered when translating results obtained in rodents to primates.
We further performed supervised computational screens to find gene families whose differential expression could reliably distinguish similar cell types within and across species 42 . A set of over 1,500 gene families from HGNC (HUGO Gene Nomenclature Committee) was used for these screens (Extended Data Table 4). Within each species the top performing families included ion channels, Gprotein coupled receptor families, cell adhesion molecules and others (Extended Data Fig. 8a, Extended Data Tables 5 and 6). Across species the top performing families included voltage-gated ion channels, G-protein-coupled receptors and neuropeptides (Extended Data Fig. 8b, Extended Data Table 7), suggesting that sensory neuron identities culminate mostly on genes of these families even across species. On cell type level (Extended Data Fig. 8c, Extended Data Table 7), A-LTMR/NF1 neurons were identified nearly perfectly by many gene families, for example by ion channels and cell adhesion molecules (CAMs). For the peptidergic C-fibers (PEP1), neuropeptides/receptor ligands were the highest performing family, as expected. On the other hand, NP1 and NP2 scored poorly in comparison to all other types, suggesting that these neuron types have diverged the most in their gene expression signatures between mouse and macaque. As a final comparison between the species, we used SCENIC 43 to identify gene regulatory networks formed from master transcription factors and their gene targets (i.e. regulons) in all DRG cell types of both macaque and mouse (Fig 5de). We found for example the known fate determining regulons driven by SHOX2, RUNX1 and FOXP2 in mouse 9,44-46 and predicted several new species unique as well as cross-species conserved regulons determining cell-type identities (see Extended Data Tables 8 and 9 for genes).

Genetic association of neuronal types contributing to human pain states
The identification of the cellular and molecular basis for somatosensation and pain in non-human primate enables to determine the contribution of different primate neuron types in human chronic pain states. We therefore used human genetic data to explore how each cell type in the macaque DRG connects to painful phenotypes in humans by employing genome-wide association studies (GWAS). To do so, we used a large cohort made available by the UK Biobank project 47,48 , where we assessed chronic pain from self-report at eight body sites (Fig. 6a). Among the nine neuron types identified in the macaque DRG, we found that common variants associated with chronic pain sites mapped to sets of genes that were specifically expressed in two neuron types in the STRT-2i-seq dataset. We found enrichment of headaches, facial, neck and shoulder, stomach, and hip chronic pains partitioned heritability in PEP1 neurons (PFDR=16%, each), while NP2 neurons were associated most

Fig. 6. Contribution of macaque DRG cell types to partitioned heritability of human chronic pain sites. (a) UK Biobank chronic pain sites mapped to the human body. Number of chronic case participants shown in parentheses. (b) Heatmap of FDR-corrected p-values for enrichment in partitioned heritability of each macaque DRG cell type to each human chronic pain site. (c) Forest plot of partitioned heritability estimates for each macaque DRG cell type contributing to chronic pain. Shown are heritability coefficient estimates (circles) and their 95% confidence intervals (bars) for each pain site. Fixed-effect, standard error weighted, meta-analyzed heritability estimates (triangles) also shown, colored red when significant at Bonferroni-corrected level (corrected for nine cell types). (d,e) Top genes in type-specific cells contributing to specific chronic pain sites. Top genes highlighted in a scatter plot (pink, left). Scatter plots show human GWAS enrichment of gene (Y axis) as a function of macaque cluster enrichment (X axis). (f) Bar plot of top pathways for PEP1 and NP2 cell types in the metaanalysis of the 8 chronic pain sites. (g, h) Schematic illustrations of pathway-related genes in (g) PEP1 and (h) NP2 neuron types.
significantly with the heritability of chronic back pain and hip pain (PFDR=5%, 8%, respectively) (Fig.  6b, Extended Data Table 6). Thus, heritability of hip pain was significantly enriched in both PEP1 and NP2 neuron types while heritability of all other pain sites was significantly enriched in only one neuron type of the macaque DRG. Since the epidemiological prevalence of chronic pain patients to report more than one body site is high, the signal attributed to, for example, subjects with hip pain may also have back pain, so it is not clear where the association signal is deriving from. To control for this co-morbidities, we performed new full GWASes for each of the pain site (row in Extended Data Fig. 9b, c), then removed one by one all comorbid pain sites (column in Extended Data Fig. 9b, c) in all GWASes for PEP1 and NP2 neurons. Although some statistical power is lost in this analysis due to reduced size of chronic case participants, partitioned heritability in PEP1 was confirmed in most GWASes (facial, neck/shoulder, stomach/abdomen and hip pain), while association with back pain remained negative (row in Extended Fig. 9b). However, significance of PEP1 for headaches was lost for all GWASes. Furthermore, back pain and hip pain remained significant for NP2 neurons in all GWASes when excluding other pain sites (row in Extended Fig. 9c). Our results show that seven of the nine neuron types were not associated with any chronic pain sites, and hence the two neuron types, PEP1 and NP2, together represent the main enrichment of musculoskeletal pain heritability. A meta-analysis across all pain sites for each cell type (Fig. 6c) consistently found that both PEP1 (P=5.4x10 -3 ) and NP2 (P=3.9x10 -3 ) displayed significant enrichment across all pain sites.
We next tested if these results were reproducible in the independent scRNA-sequencing dataset obtained by the Smart-seq2 protocol. Consistent with the STRT-2i-seq dataset, we found enrichment of stomach, hip, and neck and shoulder chronic pain partitioned heritability in PEP1 neurons (PFDR=10%, 10%, 13%, respectively), while NP2 neurons were associated most significantly with the heritability of chronic back pain, hip pain and knee pain (PFDR=10%, each) (Extended Fig. 9d). Thus, heritability to pain was consistently assigned to the same neuronal types using the STRT-2i-seq and Smart-seq2 datasets, although significance of PEP1 to headaches and facial pain was lost in the latter.
In order to identify functional pathways and genes whereby human heritability contributes to chronic pain in the different neuronal types, genes were ranked by exclusivity of neuronal cell type expression to establish enrichment scores. Human GWAS enrichment scores were mapped to the macaque single cell expression enrichment to identify top genes in type-specific cells contributing to chronic pain sites in PEP1 and NP2 neurons, revealing hundreds of genes (Fig. 6d, e, Extended Data Table 10). These were thereafter used to identify cellular pathways that confer vulnerability to chronic pain (Fig. 6f). Combined, these results show that heritability of chronic pain at different sites is associated with each of the two major pain neuron types through different biological pathways (see Extended Data Table 10). Pathways contributing to chronic pain in PEP1 neurons included "clathrin-dependent endocytosis", "central nervous system development" and "axon development" with an enrichment of proteins involved in the process of endocytosis, cell adhesion as well as a few transcription factors (Fig. 6f, g). In contrast, pathways in NP2 neurons included "synapse organization", "chemical synaptic transmission" and "cell projection morphogenesis" and were dominated by genes associated with organization of the synaptic membrane and its vesicles, ion channels participating in excitability, G-protein signaling and cell adhesion (Fig. 6f, h). Thus, the two neuron types contribute to chronic pain via distinct pathways.

Discussion
The human DRG like the mouse contain neurons with different histochemical and electrophysiological features 13,14,[14][15][16]49,50 . The identification of the molecular types of primate somatosensory neurons addresses the longstanding question whether cell types involved in somatosensation is conserved between rodents and primates. We conclude that the mouse 7-9 and the Rhesus macaque largely share molecular neuron types which using mouse genetics have been functionally identified as A-LTMR involved in touch and proprioceptive sensation 1 , C-LTMRs involved in the affective aspect of pleasant touch 5 , C-cold thermoreceptors (TrpM8 high ), Aδ fast mechanical nociceptors involved in pinprick pain (PEP2) 51-54 and mechano-heat C-nociceptors (PEP1), as well as "non-peptidergic" neuronal types (NP1, NP2, NP3) known in mouse to be involved sensing noxious mechanical threshold and itch sensation [55][56][57] . Although the neural network predictions score was low, macaque PEP3 appears to be an Aδ−fiber fast nociceptor which correspond to mouse CGRP-eta type in Sharma nomenclature, which is a subtype of the mouse PEP2 type in the Usoskin nomenclature.
Overt differences in the overall cellular basis for nociception between mouse and macaque largely relates to NP1 and NP2 neurons. In the mouse, NP1 is involved in detecting pricking mechanical stimuli and β-alanine induced itch through the Mrgprd receptor, but not thermal sensation. In the macaque, the expression of the heat-activated channel TRPV1 in NP1 neurons, in addition to TRPA1, suggests a broader function than in the mouse. The mouse NP2 neurons express histamine and chloroquine receptors and ablation of these neurons (Mrgpra3 + neurons) specifically affects histamine-dependent and histamine-independent itch, but not acute noxious heat, cold, or mechanical pain 56 . NP2 neurons have therefore been considered to be dedicated itch neurons in rodents 58 . However, this neuronal type was recently found to code for both itch and pain, with itch behavior induced by metabotropic Gq-linked stimulation and pain behavior through fast ionotropic stimulation 59 , suggesting that the same neuronal populations can drive distinct sensations in a stimulus-dependent manner. In addition, based on the molecular profiles, macaque NP2 neurons appear partly different from mouse as histamine receptor HRH1 is low in macaque NP2 neurons. Other known functional stimuli detectors found in these cells are MRGPRX1-4, which are also expressed in NP1 neurons. MRGPRX1-4 are promiscuous low-affinity receptors involved in nonhistaminergic itch, for example chloroquine and pruritogenic peptides 58 . Primate NP1 and NP2 neurons may thus have at least partly different functions than in the mouse.
Even though the exact neuronal basis for human chronic pain is unknown, insights have been obtained through the identification of genes causing congenital insensitivity to pain 22,60 . While most of the genes causing painless phenotype are abundantly expressed in all DRG neuron types, some display restricted expression patterns, thus opening for linking neuronal types to phenotype. These includes congenital insensitivity to pain by mutations in SCN9A (Nav1.7), SCN11A (Nav1.9) and NTRK1 (TRKA) [21][22][23][24][25]61 and PRDM12 62 . In contrast to mouse which display an enriched expression in nociceptors, macaque SCN9A is broadly expressed at similar levels in all neuronal types, while SCN11A expression is more similar to mouse with expression at varying levels in all unmyelinated neuronal types (C-LTMRs, PEP1, NP1-3,) with very low levels in TRPM8 high , myelinated nociceptors and A-LTMRs (see https://ernforsgroup.shinyapps.io/macaquedrg/ for interrogation of gene expression). NTRK1 is largely confined to macaque TRPM8 high , PEP1, PEP2 and PEP3 neuronal types, with lower levels in the other nociceptors. PRDM12 expression is consistent with mouse, appearing in all macaque neurons except Aδ-LTMRs. Thus, although it is not possible to pinpoint the exact neuronal types, it seems based on expression of these causative genes for human monogenic pain insensitivity disorders that PEP1, PEP2, PEP3, TRPM8 high represent important neuronal types for nociception. However, it cannot be excluded that neuronal types sufficient for driving chronic pain might partly involve neuron types other than those required for nociception.
Previous GWAS studies have uncovered genome-wide significant genes that contributes to the heritable risk of chronic pain. Recent methods 32,33 allow the integration of GWAS and scRNA-seq data to map cell types contributing to disease through the testing the enrichment for cell type-specific expression of genes with nearby risk SNPs. Such analyses consider the heritability carried by all common SNPs, linking them to nearby genes, rather than focusing only on genome-wide significant genes. Using this methodology, we linked GWAS results of several human chronic pain sites to specific neuronal types in the primates.
Significance for cell type specific contributions reported in previous studies 32,63 showed stronger association than that reported in this work (FDR in the range of 5-20%). However, the estimation for the heritability of pain ranges from 2-10% 64-66 with 7.6% for chronic back pain 67 . In a comparative study of heritability between different classes of diseases, it was shown that mental health disorders show higher heritability estimates than self-reported pain phenotypes 66 . Furthermore, all chronic pain GWAS in the UK Biobank displayed inflated genomic control parameters ( GC >> 1) while at the same time an LD regression score intercept close to 1, indicating that the inflation's origin is highly polygenic in nature. Because of these two effects combined (lower heritability and higher polygenicity), a tissue-or cell-type specific contribution to chronic pain would be smaller in comparison with other diseases like schizophrenia. On the contrary, it is perhaps remarkable that some specific cell types show significance, particularly because the burden also distributes to cell types other than sensory neurons. However, the strength of our analysis is that it considers the accumulated contribution of all small effect sizes in sensory neurons distributed across the human genome. Thus, we believe that the significance observed is in the range of expectation. Furthermore, we show that the association to the identified neuronal types emerge from multiple pain GWAS and in addition was reproduced in an independent dataset using a different sequencing platform.
The results show that musculoskeletal pain genetics is well represented in the DRG. Our findings reveal a connection to two of nine types of neurons to multiple chronic pain sites, namely PEP1 and NP2. However, there is an overlap of individuals being included in some of these pain groups and if a cell type is enriched in multiple pain GWAS, there is a risk of reporting the false association that is driven by phenotypic comorbidity but not a true genetic association. Iterative removal of one pain site at a time in the GWAS and examining if the significance to the other pain site remains, indicate that most of the GWAS signals observed is attributed to the annotated pain sites themselves. These results point towards a common underlying pain vulnerability regardless of the body site where pain is manifested. PEP1 for headache could not be confirmed in this analysis. Thus, assignment of contribution of PEP1 cell type to headaches might be due to comorbidity of other pain sites with headaches. This suggest that musculoskeletal pain and headache are unique experiences caused by different neuronal mechanisms, which is in line with reports that in congenital insensitivity to pain phenotypes, there are painless cases with the only pain felt consisting in tension headaches 68 and furthermore, is consistent with findings that there is a shared genetic factors in conditions manifesting chronic pain except for migraine 69 .
Apart from headache, the different pain sites suggest the involvement of one of two main neuron types PEP1 or NP2 in all cases but hip pain, which is associated with both. Thus, these results suggest that different neuron types are associated with different chronic pain sites. This opens for the question whether the different types of pain could be location dependent, thus depending on segmental levels of DRGs or trigeminal ganglion neurons in the facial region rather than neuronal type dependency. However, analysis of somatosensory neurons across the rostro-caudal axis of the mouse including DRG 7-9 , jugular ganglion 70 and trigeminal ganglion 7 reveals that the neuronal strategy of somatosensation is shared regardless of body location. Thus, similar types of somatosensory neurons exist in the DRG as in jugular and trigeminal ganglion. Because of this, we assume that rostro-caudal differences in pain sites should not affect the identification of involved cell-types. This motivated us to use macaque DRG to predict the heritable risk genes at different pain locations in the human, including trigeminal area pain. While mouse models have not specifically addressed the contribution of NP2 neurons to chronic pain, the contribution of PEP1 neurons to pain is well established. For example, ablation of CGRP + neurons, which include the PEP1 neurons, results in the loss of noxious heat sensation as well as inflammatory and neuropathic heat hyperalgesia in the mouse 71 .
Largely different genes contribute to chronic pain in PEP1 and NP2 neurons. In previous human association studies for musculoskeletal chronic pain there is a marked enrichment of genes involved in neurotransmission but also for example immune function, metabolic processing, skeletal tissue differentiation and hormone signaling pathways 26 . In this study, we expected to capture the signal associated with the heritable risk in the different somatosensory neurons. The results reveal unique vulnerability pathways at play during pain chronification in the different neuronal types, suggesting that the causal mechanisms might be different between chronic pain conditions. While there is an enrichment of risk genes in PEP1 neurons that belongs to clathrin-dependent endocytosis and axon and nervous system development, NP2 neurons display an enrichment in synaptic organization and transmission and cell projection morphogenesis. This does not inform that these pathways are the only ones present in the different sensory neuron of a particular population. Instead it shows that expression of, for example, specific members of cell adhesion/repulsion genes carrying nearby variations with significant heritable risk to chronic pain are enriched in PEP1 neurons. Analysis of the underlying genes with enriched heritability reveal a common pattern related to a susceptibility of neuronal connectivity, although with different functional classes of genes in different neurons. Thus, we conclude that the results support the notion that the major genetic risks expressed in somatosensory neurons are carried in genes involved in structural and functional connections of the neurons and thus, impacts on neurotransmission. The most direct effects may be contributed by synaptic adhesion molecules, which are known to be involved in synaptic plasticity of sensory neurons.
We have mapped heritability to two specific types of primary sensory neurons. However, as previously mentioned, we do not explain the full heritability to musculoskeletal chronic pain in this study, since cell types other than those analyzed may also contribute to chronic pain. Furthermore, the enhancing statistical power and functional human genomic data may add resolution. Thus, more genes could contribute to chronic pain within sensory neurons as well as in cells other than primary sensory neurons, such as immune and vascular cells for headaches. Nevertheless, the finding that two neuronal types among the variety of sensory neurons carry a significant enrichment for the heritable risk to musculoskeletal pain indicates a contribution of these cell types to the pathophysiology of musculoskeletal chronic pain. This provides a rational for a deeper investigation of their participation with regards to human chronic pain. Furthermore, many drugs fall into the translational chasm between mice and humans 72 . Our atlas of somatosensory neuron gene expression in non-human primate should be important to verify putative drug targets and can also be of help in informed strategies for developing conceptually new analgesic drugs.