Intrinsic transcriptomic sex differences in human endothelial cells at birth and in adults are associated with coronary artery disease targets

Sex differences in endothelial cell (EC) biology may reflect intrinsic differences driven by chromosomes or sex steroid exposure and gender differences accumulated over life. We analysed EC gene expression data from boy–girl twins at birth and in non-twin adults to detect sex differences at different stages of life, and show that 14–25% of the EC transcriptome is sex-biased. By combining data from both stages of life, we identified sex differences that are present at birth and maintained throughout life, and those that are acquired over life. Promisingly, we found that genes that present with an acquired sex difference in ECs are more likely to be targets of sex steroids. Annotating both gene sets with data from multiple genome-wide association studies (GWAS) revealed that genes with an intrinsic sex difference in ECs are enriched for coronary artery disease GWAS hits. This study underscores the need for treating sex as a biological variable.

Sex differences in the endothelial transcriptome at birth. The first step in our analysis was to analyse sex differences in the HUVEC transcriptome at birth. By using both boy-girl twins and boy-boy/girl-girl twins we could validate our own model, as it allowed us to correct for the microenvironment (mother) the boygirl twins developed in. Boy-girl twin differences were more abundant as compared to the boy-boy twin and girl-girl twin comparison, as exemplified by the differential skewing of p-values in the groups of twins ( Fig. 2A). In the comparison within the boy-girl twins, we identified 2,528 DEGs (25% of the interrogated transcriptome) between the sexes, of which 114 were on chromosome X or Y (4.5%, Fig. 2B). We detected 79 DEGs (FDR < 0.1) in the comparison boy-boy versus girl-girl twins of which 27 were located on the sex chromosomes (34.1% of the total number of DEGs, Fig. 2B). Gene hallmark analysis on DEGs at birth of the boy-girl twins higher expressed in girls showed enrichment of endothelial activation pathways such as endothelial to mesenchymal transition, hypoxia and NFkB-signaling (Fig. 2C, Suppl. Fig. 2). The genes that are significantly higher expressed  shown for both the boy-boy/girl-girl twins (green) and the boy-girl twins in which can be corrected for mother (red). (B) The number of differential genes per chromosome is shown in percentages of all genes interrogated on that chromosome (top for boy-boy/girl-girl twins, bottom for boy-girl twins). Lightgrey indicates higher expression in males (notice the Y-chromosome), darkgrey indicates higher expression in females. (C) Hallmark gene enrichments are shown in a radar plot. Blue highlights the significance of the gene enrichment of genes higher expressed in males, whereas pink shows this for genes higher expressed in females. The length of the radius measures significance (− log10 adjusted p-value).
Intrinsic and acquired sex differences. In total, 268 genes were concordantly differentially expressed at birth and in adults and thus showed an intrinsic sex difference, of which 124 were higher in females (p overlap = 0.007), and 144 in males (p overlap = 2.0e−16, Fig. 4A). Of these 268 genes, 34 were sex-chromosomal (12.7%). An example, KDM5C (X-inactivation escapee), can be found in Fig. 4B, which is higher expressed in females at birth and during adulthood. Interestingly, 234 autosomal genes are intrinsically different between the sexes in the endothelium. Gene Ontology enrichments on genes that are higher expressed in females at birth and adult showed similarity, as well as those that are higher in males at birth and adult (Fig. 4C). However, the overlap between the genes contributing to these similar Gene Ontology enrichments at birth and in adults is low. E.g. for cell-substrate adhesion, 7 genes overlap of the 39 genes higher in females at birth and the 48 higher in adult female cells (Suppl. Fig. 4). We found 207 genes that fulfilled the criteria of our definition of an acquired is shown for all probes that map to non-duplicate autosomal genes. A complete one with sex chromosomal genes is shown in Suppl. Fig. 3A. Y-axis: -log2 p-value; x-axis: log fold change. (B) The amount of differential genes per chromosome between the sexes in the adult stage is shown in percentages of all genes interrogated on that chromosome. Lightgrey indicates higher expression in males (notice the Y-chromosome), darkgrey indicates higher expression in females. Only genes with an absolute log-fold change of higher than 0.2 are shown. (C) Hallmark gene enrichments are shown in a radar plot. Blue highlights the significance of the gene enrichment of genes higher expressed in males, whereas pink shows this for genes higher expressed in females. The length of the radius measures significance (− log10 adjusted p-value).  Comparing the sex differences in adults and at birth. (A) Venn-diagrams are shown for genes higher in males at birth and at the adult stage (blue circles), and for genes higher in females at birth and at the adult stage (pink circles). (B) KDM5C expression is plotted in the boy-girl twins (top) between males and females, and at the adult stage (bottom). (C) A dotplot is shown for Gene Ontology enrichments in the four different gene sets; genes higher in females in the adult stage, genes higher in females at birth, genes higher in males in the adult stage, and genes higher in males at birth. Terms are allocated to the rows, color indicates significance, and size of the dot indicates the ratio of genes from the set present. The number indicates the number of genes that could be found in any of the sets tested. The numbers of genes in each Gene Ontology term and overlaps are shown in Suppl. www.nature.com/scientificreports/ Comparing intrinsic and acquired sex differences. As compared to the intrinsic sex differences, the acquired DEGs were less likely to be sex chromosomal (12 out of 207, 5.8%, only X-chromosomal). Previously, it has been shown that genes that present with sex differences in expression are more likely to be less evolutionary conserved in their gene bodies, while their promoter, and hence their regulation, is more likely to be evolutionary conserved [8][9][10] . We determined the evolutionary conservation of promoters and genes by calculating the Phast-Con scores of the acquired and the intrinsic gene set. PhastCon scores tended to be higher for the promoters of genes with either an intrinsic or an acquired sex difference in expression (p permutation over mean = 0.013-0.079), while the sequence of the gene itself had lower PhastCon scores (p permutation over mean = 4e −04 -0.013, Suppl. Fig. 5). High and low classification for PhastCon scores showed that promoters of genes with an acquired sex difference were more often in the class with high conservation as compared to random genes (p permutation = 0.0228), while the sequence of genes with an intrinsic sex difference were more often in 'low' class (p permutation = 0.0036, Suppl. Fig. 6).
Gene hallmark enrichment. Even though the intrinsic and acquired sex differences gene lists are nonoverlapping, both were enriched for hypoxia response and oxidative phosphorylation (Fig. 5B). The acquired DEGs, but not the intrinsic DEGs, were enriched for the sex-hormonal terms, such as estrogen and androgen response. The acquired DEGs are more likely to contain sex-hormonal targets than those with an intrinsic sex difference (Fisher's Exact Test; p = 0.0002). The hallmarks for the genes with an intrinsic sex difference are "UV response" and "apical junction".
Protein associations to intrinsic and acquired regulatory sex differences. To determine potential associated regulators that would be missed by using transcriptomic data alone, we performed a String network analysis on gene expression regulators (transcription factors and epigenetic modifiers) in both the acquired and intrinsic sex difference gene-sets. The network for the regulatory genes with an acquired sex difference had significantly more interactions than expected (20 nodes, 7 edges, PPI-enrichment p = 0.029), which increased after expanding the network with the top 5 most connected associations to our 20 nodes (25 nodes, 22 edges, PPI-enrichment p = 0.000949). There were more transcription factors in the acquired gene set as compared to the intrinsic gene set (7.2% versus 5.2%), whereas epigenetic modifiers were more common in the intrinsic gene set (3.4% versus 6.3%) (Suppl. Fig. 7). The top 5 most connected associated proteins to the regulators of the acquired gene set were the androgen receptor (AR), ATM, HAT1, ING5, and MTA2 (Fig. 5C). The top 5 most connected associated proteins to the regulators of the intrinsic gene set all grouped together with only one of the entries (CDC5L, Suppl. Fig. 8), however, the network itself was already more connected without adding extra nodes (nodes: 25 versus 30, edges: 17 versus 34), indicating that it is a more complete network as compared to the network build from the acquired gene set.

Associations of GWAS traits to intrinsic and acquired sex differences.
To further describe the intrinsic and acquired gene sets, we determined whether or not genes that present with either an intrinsic or acquired sex difference in endothelial cells were targets in major genome-wide association studies (GWAS). We analysed GWAS for coronary artery disease (CAD) 11 , inflammatory bowel disease 12 , Crohn's disease 12 , ulcerative colitis 12 , Alzheimer's disease 13 , body mass index 14 , heel bone mineral density 15 , height 16 , hypertension 17 , multiple sclerosis 18 , rheumatoid arthritis 19 , and type 2 diabetes mellitus 20 . Specifics of the used GWAS studies can be found in Suppl. Data File. There was overlap between GWAS targets and either the acquired or intrinsic gene sets for all GWAS mentioned, except Alzheimer's disease (see Fig. 5D). The acquired gene set was not significantly enriched for any GWAS as compared to random genes. However, the intrinsic gene set contained more genes from the CAD GWAS than randomly expected (p permutation for more genes than expected = 0.0085, Suppl. Data File). These genes are PLPP3, CELSR2, ATP2B1, and SMG6. The intrinsic gene set also contained more genes that are targets in multiple GWAS, as compared to the acquired gene set (Fig. 5D). We also determined whether or not intrinsic or acquired DEGs had an association with any trait in the GWAS catalog. Genes in the acquired set were more likely to have been mapped to by a common variant associated to a trait in the GWAS catalog than random genes (p permutation = 0.01), while the intrinsic set is less likely mapped to as compared to random genes (p permutation = 0.02, Suppl. Fig. 9). A data file with all the genes that show an intrinsic or an acquired sex difference and their associated information can be found in the Suppl. Data File.

Discussion
In the present study, we show that, by using a unique twin model, sex differences in the EC transcriptome are profuse. We showed that a boy-girl twin model is superior to a boy-boy girl-girl twin model for detecting sex differences, since the boy-girl twins developed in the same microenvironment, whereas this is not the case for the comparison between boy-boy and girl-girl twins. Furthermore, by combining data from both stages of life, we identified sex differences that are present at birth and maintained throughout life, and those that are acquired over life. As we expected, acquired sex differences were enriched for hormonal responses, such as genes influenced by estrogens or androgens. Sex differences in transcriptomes of other cell-types and tissues have recently been under increasing scrutiny. A large study on sex differences in the transcriptome of peripheral blood in 5,241 samples showed that 3.1-13.7% of the tested genes show a sex bias in expression 21 . Two studies on sex differences in the liver transcriptome reported 844 and 1,249 sex-biased genes, respectively 22,23 . These numbers are in the same order of magnitude as what we report here for endothelial cells, highlighting that sex influences the transcriptome over variety of tissues, as underlined by work in the Genotype-Tissue Expression project 6,7 . Furthermore, sex differences in expression might also underlie differences between the sexes in response to stimuli or disease. For example, a differential transcriptomic response between the sexes has been shown in Scientific RepoRtS | (2020) 10:12367 | https://doi.org/10.1038/s41598-020-69451-8 www.nature.com/scientificreports/ www.nature.com/scientificreports/ ECs undergoing shear stress 4 . In addition, a recent study on sex differences in the transcriptomic response in the myocardium to ischemia showed many interactions for gene expression between sex and ischemia 24 . Differences at baseline between the sexes should be followed up on by studies investigating sex-differential responses. Sex steroidal influences on the endothelium have been described before 25,26 , and are associated with sex differences in disease bias. Interestingly, gene expression regulators in the list of genes with an acquired sex difference are predicted to be tightly associated with the androgen receptor (AR) (Fig. 5C), a gene expression regulator subject to sex steroids. A strong link to the AR according to our network analysis is NR3C2, the mineralocorticoid receptor. Recently, it has been shown that multiple transcription factors are sex-biasedly conserved over evolution 27 , among which NR3C2, and the AR might play a role in this bias. AR is not differentially expressed in adult HAECs, and neither are the estrogen receptors ESR1, ESR2 and GPER1, indicating that differential RNA expression of sex steroids receptors is not directly implicated in mediating sex differences, but perhaps their interactions/interactors, location in the regulatory gene networks and/or protein levels are. The genes that have acquired a difference over time between sexes are less likely to be sex chromosomal, than those that are intrinsically different. The majority of the acquired sex differences are highly likely to be cell-type specific, since the effects of sex steroids are different per cell-type 28 . Even though sex hormonal responses were enriched in our acquired sex differences gene list, these are undoubtedly not the only cause. Gender differences are likely to play a role in the acquired differences. The majority of the intrinsically differentially expressed genes between males and females in ECs are sex chromosomal (24 out of 48), for which sex-stratified research is lacking. Unexpectedly, we found a non-random enrichment of CAD GWAS targets in the intrinsic sex difference gene set, pointing towards potential EC targets for sex differences in CAD, such as SMG6, CELSR2, PLPP3, and ATP2B1. As an example, lower ATP2B1 expression by gene silencing has recently been shown to increase nitric oxide production and eNOS activity in HUVECs 29 . Since ATP2B1 is expressed on lower levels in our female ECs, this might play a role in keeping ECs healthier in females by higher basal NO production, which has already been shown in rat aortas 30 , and lower ROS production, as shown in HUVECs 3 . A higher enrichment for CAD GWAS targets in the intrinsic gene set as compared to the acquired gene set also suggests that disease bias between the sexes might already be present from birth onwards, and that this is not only driven by sex steroids (changes over lifespan) and gender differences. It is important to note however, that efforts to link genetic variation on the X chromosome to CAD showed no assocations 31 , even though most of the intrinsic sex differences that we found are sex chromosomal. Intrinsic sex differences in transcriptional output of the autosome can be driven by the sex-specific effect of genetic variation 32 , sex-specific epigenetic influences, or differences in the transcriptional machinery, such as transcription factors or other cofactors. While intrinsic differences are probably not detrimental to development of males and females, in fact probably selected for, they might predispose to a variety of diseases. For example, the X-chromosomal KDM6A, a gene with intrinsic sex differences, protects against bladder cancer in females 33 . A recent study discovered that an XX-chromosomal complement promotes atherosclerosis in mice, by increasing the bioavailability of dietary fat 34 . It has also been shown that the Y-chromosome, possibly through the KDM6A Y-chromosomal counterpart UTY , plays a role in the proatherosclerotic reprogramming of the transcriptome 35 . The sex chromosomal intrinsic genes are probably differentially expressed between the sexes in most cell-types, as also seen in tissues 36 , but the genes are thought to have tissue-specific functions.
Even though 144 genes overlapped from the 1,135 genes higher expressed in males at birth and 845 higher expressed in males in adulthood, a strong pattern for similar Gene Ontology enrichment can be appreciated (Fig. 4C). This also holds true for the genes higher expressed in females. The same processes seem to be enriched in differential genes at the two stages, but with mostly different genes (Suppl. Fig. 4). These differential, but consistent Gene Ontology enrichments potentially underlie the functional sex differences in ECs that have been described for migration and proliferation before 3 . A strong enrichment for ribosomal protein transcripts and other translational machinery was present in genes higher expressed in males in both stages, perhaps indicating a higher protein turnover, cell size and growth or proliferation rate in male cells, since these are scenarios where more ribosomes would be needed 37 . This was coincident with a very significant signal from MYC as upstream regulator. MYC is differentially expressed as well in the boy-girl twin comparison (adjusted p = 0.001, log2 fold change = 0.5, higher expressed in girls). Sex differences in the levels of ribosomal proteins have been recently described in ECs 38 , but potential different mechanisms in core output from the DNA have not. The signal observed might be caused by differential responses to components of cell medium, since ribosomal biogenesis is linked to environmental and intracellular conditions 39 . However, cell culture leads to homogenization of the behaviour of cells, and therefore possibly diminishes the differences between male and female cells. Liver sinusoidal endothelial cells lose their phenotype within 3 days of monolayer culturing 40 , hence, in vivo sex differences might be more pronounced. Nevertheless, we show that a boy-girl twin model is superior for detecting sex differences as compared to a comparison between boy-boy twins and girl-girl twins, in an in vitro setting. mTOR signalling was also one of the consistent terms in differences between the male and female endothelium. mTOR signalling has been shown to be sex-specific and tissue-dependent in multiple rodent models 41,42 , and this might also be the case in the endothelium. Gene expression differences between males and females have been studied before in HUVEC pools 4 , however, we only saw overlap of the intrinsic sex chromosomal genes between our study and the one previously published. The majority of sex differences might be masked in the previously published study by not being able to correct for the effect of the environment, explaining why we found a larger number of sex-differential genes.
Concluding remarks. In conclusion, we described and annotated sex differences in the human endothelial transcriptome at birth and in an adult setting. We showed that a boy-girl twin model is superior in detecting sex differences to a boy-boy/girl-girl twin model. In addition, endothelial gene expression is significantly different between males and females at different life stages. Some of these are consistent over life, while others are Scientific RepoRtS | (2020) 10:12367 | https://doi.org/10.1038/s41598-020-69451-8 www.nature.com/scientificreports/ acquired, highlighting that EC studies should take sex and its interaction with lifespan into account. Annotating both gene sets with data from multiple genome-wide association studies (GWAS) revealed that genes with an intrinsic sex difference in ECs are enriched for coronary artery disease GWAS hits. This study underscores the need for treating sex as a biological variable.
Limitations of the study. The main limitation of our study is that we used ECs from different sources (aortic and umbilical vein) for comparisons, since ECs from different sources over the human body have distinct transcriptomes 43 . Also, the sample size of the analysis involving the twins may appear as low or unbalanced as compared to the numbers used for the adult analyses. However, as the comparisons were made between twins who grew in identical environments, we hypothesized that this better powered comparison would allow us to use a small set of twins as compared to the adults in whom the variation was much larger. Indeed, the greatest statistical power is achieved in RNA-sequencing experiments in paired designs as compared to unpaired designs, since confounding at the level of all subjects can be estimated out 44 . To study this further, we performed our own study in comparing the twins to the boy-boy versus girl-girl twin design, in which the first was better powered to detect sex differences ( Fig. 2A). Nevertheless, future studies with more twins might find sex differences that are even less pronounced. Since the HAECs are from male and female individuals, and therefore lack a paired design, we miss out on power to detect sex differences in adults. We tried to address this caveat by using a larger sample size in our unpaired adult design. Within the adults, we might have missed some of the sex differences due to overrepresentation of male samples. Nevertheless, in both the twins and adults we were able to identify sex differences that we feel are robust and prevalent. Lastly, we cannot distinguish sex differential effects of cell medium components (e.g. phenol-red) on the genes under study, or the sex differential gene expression in vivo from in vitro.

Materials and methods
Endothelial cells twins. Isolation of ECs from 28 umbilical cords of 14 twin pregnancies was performed in the UMC Utrecht. ECs from 7 boy-girl twins, 3 boy-boy twins and 4 girl-girl twins were isolated (isolation protocol and characterization in Suppl. Fig. 1). Umbilical cords were biobanked (TCBio 18-234) under the residual material ruling of the hospital (https ://www.umcut recht .nl/nl/zieke nhuis /gebru ik-restm ateri aal-medis che-gegev ens), informed consent was waived due to the fact that the material was derived anonymously. Patients or individuals can opt-out if they do not want their material to be used, which was not the case for the current study. Ethical approval of the study was given by Biobank Committee under responsibility of the Medical Ethical Committee of the UMC Utrecht. The study was performed in accordance with the Declaration of Helsinki.
EC isolation. Umbilical cords were cut from fresh placentas and washed in 1 × PBS in petridishes, disinfected with 70% ethanol, and washed with 1 × PBS again. The vein was located, opened up with tweezers, and cannulated with a tie-wrap tied three-way tap. Next, the vein was gently flushed with 50 ml sterile 1 × PBS to remove blood clots. Then, the air from a full 50 ml syringe was pushed through the vein. To detach the ECs, the three-way tap was shut and the other end of the vein was closed with a forceps, after filling it with sterile Accutase (Innovative Cell Technologies #AT-104) until it was under tension. The vein was incubated for 5 min in a 37 °C sterile 1 × PBS filled zip lock bag and subsequently massaged for 1 min. Next, the umbilical cord was emptied over a 50 ml tube by flushing 20 ml of EGM2 (Lonza, #CC-3162) + PenStrep (Gibco Life Technologies, #15140-122) through the vein. The 50 ml tube was centrifuged for 5 min at 330×g RT. Pellet was resuspended in medium, diluted 1:1 with freezing medium (20% DMSO (Sigma Life Sciences #D2650) in EGM2 medium + PenStrep) and frozen down. To remove remaining blood, cells were thawed when needed, plated and cultured at 37 °C 5% CO 2 in EGM2 + PenStrep until confluent prior to RNA isolation. www.nature.com/scientificreports/ this was divided into portions of 1 ml. This was followed by another centrifuging step (330×g for 5 min at RT), and the resulting pellet was re-suspended in 50 µL FACS buffer. Next, 100 µL of antibody mix was added to the sample.  49 . Genes were called differentially expressed if p-value < 0.05. We used a less stringent cut-off as compared to the HUVECs, as this was an unpaired analysis. Since the microarray was limited to genes that were probed, we only used the genes that were present on the microarray for the RNA-seq analyses as well. The p-value for overlap between the sets was calculated with a hypergeometric test in R. The definition for the intrinsic gene set was: adjusted p-value < 0.1 at birth and p-value < 0.05 in adults. The definition for the acquired gene set was: p-value > 0.5 at birth and p-value < 0.05 in adults, as well as an absolute log-fold change more extreme than the median log-fold change of the genes with p-value < 0.05 for stringency.

Endothelial cells adults.
Gene enrichments. Gene-sets from the Hallmark lists were used to calculate gene enrichments within R 50 using hypergeometric tests, with a population size of 20,000. Gene ratios and p-values for all tests can be found in the Suppl. Data File. Radar plots were generated with the fmsb package. Gene ontology analysis for Fig. 4C was performed and visualized using clusterProfiler (v3.10.0) 51 . Upstream regulator analysis was performed by using Ingenuity Pathway Analysis (Qiagen, Germany). Sex-hormonal terms for estrogens (Hallmarks), androgens (Hallmarks), and progestin (M2191 MSigDB) were combined into one major set (Suppl. Data File) to determine enrichment for sex-hormonal targets.
Evolutionary conservation. PhastCon scores were calculated with the GenomicScores package 52 and the phast-Cons100way.UCSC.hg19 package 53 in R, and summarized by taking the mean of the PhastCon scores of all the bases of either the promoter (200 bp upstream of the transcription start site to 100 bp downstream of the transcription start site) or the entire gene, as determined by the TxDb.Hsapiens.UCSC.hg19.knownGene package.

StringDB analyses.
To determine potential associated regulators that would be missed by using transcriptomic data alone, a list of transcription factors and epigenetic modifiers was constructed using EpiFactors 54 and a recent review on the human transcription factors 55 . The transcription factors and epigenetic modifiers of the intrinsic and the acquired gene set were then entered into String 56 to visualize and analyze associations, as well as adding the top 5 most connected associations to our input with the "More"-button. Three Y-chromosomal genes (intrinsic) could not be found in StringDB (ZFY, UTY , KDM5D).
Individual GWAS. GWAS targets for coronary artery disease (CAD) 11 , inflammatory bowel disease 12 , Crohn's disease 12 , ulcerative colitis 12 , Alzheimer's disease 13 , body mass index 14 , heel bone mineral density 15 , height 16 , hypertension 17 , multiple sclerosis 18 , rheumatoid arthritis 19 , and type 2 diabetes mellitus 20 were acquired from the GWAS catalog under the MAPPED_GENE variable. The contents of the MAPPED_GENE variable were split on commas and hyphens in R using strsplit and subsequently used as the GWAS targets. GWAS catalog IDs and associated information can be found in the Suppl. Data File.
GWAS catalog. The entire GWAS catalog was downloaded on 15th June 2019. The unique list of genes within the GWAS catalog was extracted to calculate overlap with the acquired or intrinsic gene list or with random genes.
Permutations. All permutations (evolutionary conservation, regulator content, GWAS targets) were performed 10,000 times in R by sampling from random genes and taking the medians of their associated values. The observed comparator value was calculated as the median of a gene set, such as of the acquired or the intrinsic