Introduction

Autism spectrum disorders (ASD) constitute a common but heterogeneous group of neurodevelopmental disorders characterized by impairment of social interactions and communication, stereotyped behaviors and restricted interests. Although they are probably the most heritable of psychiatric conditions, with a concordance rate of 80–90% in monozygotic twins versus 10–20% in dizygotic twins, few genes have been reliably associated with ASD.1, 2, 3 Recent studies have highlighted the vast heterogeneity and complexity of the genetics of these disorders. All mutations or copy number variants (CNVs) associated so far with ASD have been rare, with minor allele frequencies <1%. A few de novo or inherited CNVs, some of which are recurrent, such as duplications of the 15q11-q13 or 7q11.23 and deletions of 16p11.2 regions, were shown to confer a highly penetrant risk of autism.4, 5, 6, 7 More recently, de novo mutations in various highly interconnected genes were shown to contribute to ASD, suggesting that abnormalities in different genes could converge to alter common pathways.8, 9, 10 Abnormalities in at least two pathways were repeatedly related to ASD: the first includes mutations in TSC1/TSC2, NF1 or PTEN in the mTOR (mammalian target of rapamycin) pathway; the second is illustrated by mutations in NLGN3/4, SHANK1/3 and NRXN1, all of which encode synaptic proteins.11, 12, 13

A striking feature of ASD is the excess of affected males, with a sex-ratio disequilibrium of 4:1 that reaches 10:1 in patients with normal cognitive abilities.14 This suggests that genes located on sex chromosomes contribute to the etiology of ASD or that the penetrance of autistic traits depends on sex determinants such as hormones. In favor of the first hypothesis, mutations in NLGN4X and NLGN3 on chromosome X have been identified in a few families with ASD.15 Additionally, the analysis of all or selected genes located on the X chromosome successfully identified new candidate genes for intellectual disability (ID),16 ASD and schizophrenia.17 Interestingly, the risk of recurrence of ASD is significantly increased in families with two affected sibs, reaching 32% and more when both affected subjects are males.18 This suggests that highly penetrant forms of ASD with autosomal recessive or X-linked inheritance have been overlooked.

To test the hypothesis that yet undiscovered X-linked genes are associated with highly penetrant forms of ASD, we selected 12 unrelated families with at least two affected males compatible with X-linked inheritance and analyzed all of the coding regions on the X chromosome.

Material and methods

Patients

The entire exome of chromosome X was sequenced in 12 families with two affected males with ASD or ID compatible with X-linked inheritance, recruited from the ‘Centre de Référence Déficiences Intellectuelles de causes rares’ (Pitié-Salpêtrière Hospital; Supplementary Figure S1). Index cases were evaluated by specialized geneticists and pediatric neurologists and/or child psychiatrists. Patients were assessed with the Autism Diagnostic Interview-Revised. Nine index cases had autism with ID and three had Asperger syndrome or high-functioning autism based on DSM IV-TR (Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision) criteria. Clinical features of the index cases and the affected relatives are detailed in Supplementary Clinical Table. Normal results were previously obtained by karyotyping, searches for fragile-X syndrome, microarray analysis (CytoSNP-12, Illumina, San Diego, CA, USA) and sequencing of NLGN3/4X and SHANK3 when appropriate, as well as metabolic screening (including at least creatine and guanidinoacetate analysis).

For TMLHE screening, a cohort of 161 patients (134 patients with autism and ID and 27 patients with Asperger syndrome) recruited at the Pitié-Salpêtrière Hospital (Centre de référence Déficiences Intellectuelles de causes rares or Centre référent diagnostic autisme, Paris, France) and 340 patients from the PARIS (Paris Autism Research International sib pair) study (including 194 patients with autism and ID and 59 patients with Asperger syndrome) were included, for a total of 501 unrelated male patients with ASD. In addition, 765 healthy male controls from North Africa (n=320), Europe (n=350) and Lebanon (n=95) were included to test the new variants.

The analysis of microrearrangements in TMLHE included 178 additional patients with ASD from the PARIS studies previously included in Autism Genome Project (AGP, http://www.autismgenome.org/)7 and 896 healthy male individuals. The control groups included 371 European male subjects from La Pitié-Salpêtrière hospital, 142 from other European laboratories and 383 control individuals from the Study on Addiction Genetics and Environment (n=371) and HapMap CEPH Utah (n=12) series.7 Raw intensities and genotypes were obtained from NHGRI-dbGaP (http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000092.v1.p1). The sub-set of control data set used in the specific CNV analyses in this paper is composed of control samples that passed all quality control filters (Log R ratios s.d.=0.27; B allele frequency s.d.=0.13; Call Rate >0.99). Informed written consent was obtained from each individual or his/her parents before blood sampling. All experiments were performed in accordance with French guidelines and rules.

Next-generation sequencing

Next-generation sequencing was performed at Integragen SA (Evry, France). Regions of the X chromosome corresponding to coding and 5′ and 3′ untranslated region (UTR) sequences were captured from genomic DNA using a custom Agilent SureSelect Target Enrichment methodology (Agilent, Santa Clara, CA, USA) with the biotinylated oligonucleotide probe library, followed by paired-end 75-base massively parallel sequencing on Illumina GAIIx (Illumina). For detailed explanations of the process, see Gnirke et al.19 Sequence capture, enrichment and elution were performed precisely according to the manufacturer’s instructions and protocols. Briefly, 3 μg of each genomic DNA was fragmented by sonication to yield fragments of 150–200 bp and then purified. Paired-end adapter oligonucleotides from Illumina were ligated on repaired A-tailed fragments, then purified and enriched by six PCR cycles. Then 500 ng of these purified libraries were hybridized to the SureSelect oligo probe capture library for 24 h. After hybridization, washing and elution, the eluted fraction was PCR-amplified for 10–12 cycles, purified and quantified by quantitative PCR to obtain sufficient DNA template for downstream applications. Each eluted-enriched DNA sample was sequenced on an Illumina GAIIx as paired-end 75-base reads. Image analysis and base calling were performed using Real Time Analysis Pipeline version 1.9 (Illumina) with default parameters.

Bioinformatics analysis

Sequencing data were analyzed according to the Illumina pipeline (CASAVA1.7) and aligned with the human reference genome (hg19) using the ELANDv2 algorithm. Variant annotation (RefSeq gene annotation), identification of known polymorphisms (referenced in dbSNP or 1000 Genomes) and analysis of the position and consequences of the variants (for example, exonic, intronic, silent and nonsense) were performed with an in-house pipeline from the positions included in the bait coordinates. The frequencies (in the homozygous or heterozygous state) were determined from all exomes sequenced at Integragen and from exome results provided by HapMap. Results per sample were obtained in tabulated text files, and coverage/depth statistical analyses were performed for each bait. The 13,464 single nucleotide polymorphisms (SNPs) and 1532 indels in the 12 male index cases included 1467 SNPs (10.9%) and 590 indels (38.5%) predicted to be in the heterozygous state. Eleven of these variants were tested by Sanger sequencing and proved to be false-positives. Further analysis therefore focused on variants predicted to be hemizygous. A total of 171 hemizygous variants were tested and confirmed by Sanger sequencing. The strategy used for selecting potentially pathogenic variants is detailed in Figure 1. Variants were retained if they: (i) were located in chromosome X regions common to the two affected sibs (4331 SNPs and 299 indels in 11 families); (ii) had a minor allele frequency <1% in dbSNP135 (http://www.ncbi.nlm.nih.gov/projects/SNP/), Exome Variant Server (http://evs.gs.washington.edu/EVS/) and 29 other exomes; (iii) were found in genes expressed in brain according to the Unigene (http://www.ncbi.nlm.nih.gov/unigene) or Uniprot (http://www.uniprot.org/) databases; and (iv) were predicted to have an impact on the gene or the protein (nonsense variants; missense variants predicted at least once in silico to be deleterious; and synonymous, intronic or 5′/3′ UTR variants with possible effects on splice sites or promoters using Alamutv2.1/AlamutHT). For variants present in at least two index cases, only those segregating in all the affected members of all families were retained. Mutation interpretation and amino-acid conservation in orthologs and paralogs were assessed using the Alamutv2.1/AlamutHT software (Interactive Biosoftware, Rouen, France). Prediction of pathogenicity was assessed using PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), SIFT (Sorting Intolerant From Tolerant) (http://sift.bii.a-star.edu.sg/), Mutpred (http://mutpred.mutdb.org/) and SNPs&GO (http://snps-and-go.biocomp.unibo.it/snps-and-go/). Frequencies were compared with Fisher’s exact test.
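As an illustration only, the four selection criteria can be written as a simple filter. The sketch below is not the in-house pipeline used in this study: the `Variant` record, its field names and the example values are hypothetical, and coding indels are left out because the listed criteria do not state how they were scored.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record for one hemizygous chromosome X variant."""
    gene: str
    position: int                       # hg19 coordinate on chromosome X
    max_maf: float                      # highest frequency seen in any reference set
    brain_expressed: bool               # gene listed as expressed in brain
    consequence: str                    # "nonsense", "missense", "synonymous", "intronic", "utr"
    predicted_deleterious: bool         # at least one in silico tool predicts a deleterious effect
    affects_splicing_or_promoter: bool  # predicted effect on a splice site or promoter


def in_shared_region(position, shared_regions):
    """True if the position lies in any (start, end) region shared by the affected sibs."""
    return any(start <= position <= end for start, end in shared_regions)


def passes_filters(variant, shared_regions, maf_cutoff=0.01):
    """Apply criteria (i) to (iv) in order and stop at the first failure."""
    if not in_shared_region(variant.position, shared_regions):   # (i)
        return False
    if variant.max_maf >= maf_cutoff:                            # (ii) MAF must be <1%
        return False
    if not variant.brain_expressed:                              # (iii)
        return False
    if variant.consequence == "nonsense":                        # (iv)
        return True
    if variant.consequence == "missense":
        return variant.predicted_deleterious
    if variant.consequence in {"synonymous", "intronic", "utr"}:
        return variant.affects_splicing_or_promoter
    return False  # other consequence types are outside this sketch


# Hypothetical input: one shared region and five made-up variants.
shared = [(1_000_000, 5_000_000)]
variants = [
    Variant("GENE_A", 1_500_000, 0.0, True, "nonsense", False, False),
    Variant("GENE_B", 2_000_000, 0.02, True, "missense", True, False),    # too frequent
    Variant("GENE_C", 8_000_000, 0.001, True, "missense", True, False),   # not shared
    Variant("GENE_D", 3_000_000, 0.0, True, "intronic", False, True),
    Variant("GENE_E", 4_000_000, 0.0, True, "missense", False, False),    # predicted benign
]
retained = [v for v in variants if passes_filters(v, shared)]
print([v.gene for v in retained])
```

Applying the cheapest tests first (position and frequency) mirrors the order in which the criteria are listed; the segregation check for variants seen in several index cases would be a separate step.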

Figure 1

Strategy used for the selection of rare and possibly deleterious variants. Data from NGS and single nucleotide polymorphism (SNP) arrays were combined to retain only variants located in X regions shared by the affected sibs (families 1–11). Further filters included a minor allele frequency (MAF) <1%, expression of the corresponding genes in brain and in silico predictions compatible with an effect of the variant on the gene or the protein (nonsense variants; missense variants predicted to be deleterious by at least one of SIFT (Sorting Intolerant From Tolerant) or Polyphen-2; and synonymous, intronic or 5′/3′ UTR variants with possible effects on splice sites or promoters using Alamutv2.1/AlamutHT). For variants present in at least two index cases, only those that segregated in all affected members of all families were retained. For one family (family 12), microarray data were unavailable for the affected uncle; segregation of variants found in the index case was assessed at a later time.

High-density SNP arrays

Index cases and affected relatives were screened using Illumina cytoSNP-12 arrays to search for CNVs and identify regions on the X chromosome shared by the affected sibs. Illumina microarray experiments were automated and performed at the P3S platform (Pitié-Salpêtrière Hospital), according to the manufacturer’s specifications (Illumina, San Diego, CA, USA). Image acquisition was performed using a BeadArray Reader (Illumina). Image data analysis and automated genotype calling was performed using GenomeStudiov2011.1 (Illumina). Genomic positions were based on the UCSC and Ensembl Genome Browsers. Genotypes on chromosome X were analyzed in affected relatives of each family to identify shared X regions (with the exception of family 12 for which SNP microarray data were unavailable for the affected uncle). Shared regions were defined as identical genotypes spanning at least 2 Mb.
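The definition of a shared region (identical genotypes spanning at least 2 Mb) can be sketched as a scan for uninterrupted runs of matching SNP genotypes. This is a minimal illustration, not the procedure run in GenomeStudio: the function name, the input layout and the example genotypes are assumptions, and a real analysis would also need to handle no-calls and genotyping errors.

```python
def shared_regions(snps, min_span=2_000_000):
    """Return (start, end) runs where two males carry identical X genotypes.

    snps: list of (position, genotype_a, genotype_b) tuples sorted by position.
    A run is kept only if it spans at least min_span base pairs.
    """
    regions = []
    run_start = run_end = None
    for position, genotype_a, genotype_b in snps:
        if genotype_a == genotype_b:
            if run_start is None:
                run_start = position      # open a new run
            run_end = position            # extend the current run
        else:
            # A discordant SNP closes the current run, if any.
            if run_start is not None and run_end - run_start >= min_span:
                regions.append((run_start, run_end))
            run_start = run_end = None
    # Close a run that reaches the last SNP.
    if run_start is not None and run_end - run_start >= min_span:
        regions.append((run_start, run_end))
    return regions


# Hypothetical genotypes for two affected brothers (one allele each on X).
example = [
    (1_000_000, "A", "A"),
    (2_000_000, "G", "G"),
    (3_500_000, "C", "C"),
    (4_000_000, "T", "C"),   # discordant: ends the first run
    (5_000_000, "A", "A"),
    (5_500_000, "G", "G"),   # second run spans only 0.5 Mb and is dropped
    (6_000_000, "C", "T"),
]
print(shared_regions(example))
```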

For the analysis of TMLHE microrearrangements, control individuals (n=525) and ASD cases (n=356) were genotyped using Illumina Human 1M-single BeadChip arrays. Samples were processed using the manufacturer’s recommended protocol, and BeadChips were scanned on the Illumina BeadArray Reader using default settings. Analysis and intra-chip normalization were performed using Illumina’s BeadStudio software v.3.3.7, with a GenCall cutoff of 0.1. The following quality-control criteria were applied: array call rate >95%; standard deviation for log R ratio values in the autosomes <0.35; and standard deviation of the B allele frequency values (that is, allelic ratios within the 0.25–0.75 range) >0.13. For the samples that passed the above SNP and intensity quality-control filters, CNVs were called with the QuantiSNP20 algorithm and inspected with the SnipPeep visualization tool. The required data for CNV analysis, that is, within-sample normalized fluorescence (that is, X and Y normalized values), between-sample normalized fluorescence (that is, Log R ratios and B allele frequency values) and genotypes for each sample, were exported directly from Illumina’s BeadStudio software. We excluded CNVs when they failed stringent quality control criteria: <5 consecutive probes covering 1 kb of sequence were merged using outside probe boundaries (that is, union of the CNVs) and low confidence score log Bayes factor <15.

Sanger sequencing

Specific primer pairs were designed to amplify 182 variants detected by next-generation sequencing. In addition, eight primer pairs were designed to amplify the coding exons and adjacent intron–exon boundaries of the TMLHE gene. Primer sequences are provided in supplemental data. Forward and reverse sequence reactions were performed with the Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Life Technologies Corporation, Carlsbad, CA, USA) using the same primers. G50-purified sequence products were run on an ABI 3730 automated sequencer (Applied Biosystems) and data were analyzed with Seqscape v2.6 software (Applied Biosystems). The mutation nomenclature is based on the TMLHE cDNA reference sequence (NM_018196.3).

Quantitative multiplex PCR and long-range PCR

The presence of a deletion of exon 2 in TMLHE was tested in the 501 patients with ASD and 371 male control individuals by quantitative multiplex PCR. One hundred and seventy-eight patients with ASD were also screened using Illumina Human 1M-single BeadChip arrays with the same results. Two primer pairs were used in the quantitative multiplex assay: one specific to exon 2 of TMLHE (final concentration: 0.5 μM) and one amplifying exon 2 of GPR128 (final concentration: 0.08 μM). PCR conditions were as follows: 96 °C 5 min, 20 cycles: 94 °C 30 s, 60 °C−0.5 °C/cycle 30 s, 72 °C 40 s, and 15 cycles: 94 °C 30 s, 50 °C 30 s, 72 °C 40 s, followed by 7 min at 72 °C. PCR products were quantified on a Caliper LabChip system (Caliper Life Sciences, Hopkinton, MA, USA). In addition, the presence of the deletion of exon 2 in TMLHE was confirmed by long-range PCR using the SequalPrep Long PCR kit (Invitrogen, Life Technologies Corporation, Carlsbad, CA, USA) according to the manufacturer’s recommendations. The PCR conditions were as follows: 2 min at 94 °C, 10 cycles: 94 °C 10 s, 58 °C 30 s, 68 °C 18 min (1 min kb−1), 25 cycles: 94 °C 10 s, 58 °C 30 s, 68 °C 18 min (1 min kb−1)+20 s/cycle, followed by 5 min at 72 °C. Frequencies were compared with Fisher’s exact test.

Cell culture and mRNA experiments

Lymphoblasts from the two affected brothers and the mother of family 9 were isolated from peripheral blood cells using standard procedures. Fibroblasts were taken from skin cells of the affected brothers. Lymphoblastic cells and fibroblasts were pretreated, or not, overnight with 10 μg ml−1 emetine, an inhibitor of nonsense-mediated decay.

Total RNA from lymphoblasts and fibroblasts was isolated using the Qiagen RNeasy Mini kit (Invitrogen). cDNAs were synthesized from 1 μg of total RNA using the SuperScript III First-Strand Kit (Invitrogen). The reverse-transcribed TMLHE cDNA was amplified and sequenced using specific primers located in exons 2 (forward) and 4 (reverse). The PCR products were run on 2% agarose gels. TMLHE mRNA was quantified using the Qiagen QuantiTect primer assays for TMLHE (forward and reverse primers located in exons 7 and 8 of TMLHE). PPIA was used as the control gene. Each sample was run in triplicate on a Lightcycler 480 (Roche, Applied Sciences, Penzberg, Germany). Forty-five two-step cycles (15 s at 95 °C and 30 s at 60 °C) were performed and analyzed using Lightcycler 480 software release 1.5.0. Relative abundance was calculated using the formula $r = 2^{-\Delta\Delta Ct}$, where $\Delta\Delta Ct = (Ct_{\text{gene tested}} - Ct_{\text{control genes}})_{TMLHE} - (Ct_{\text{gene tested}} - Ct_{\text{control genes}})_{PPIA}$.
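For readers unfamiliar with the comparative Ct calculation, a minimal sketch follows. It assumes the usual convention in which ΔCt is the Ct of the target gene (TMLHE) minus the Ct of the reference gene (PPIA) within one sample, and ΔΔCt is the patient ΔCt minus the control ΔCt. The function name and all Ct values are hypothetical and are not measurements from this study.

```python
from statistics import mean


def relative_abundance(target_sample, reference_sample, target_control, reference_control):
    """Return r = 2^(-ddCt) from lists of replicate Ct values.

    Each argument is a list of replicate Ct values (for example, a triplicate).
    dCt  = mean Ct(target) - mean Ct(reference), computed per sample.
    ddCt = dCt(sample of interest) - dCt(control sample).
    """
    delta_ct_sample = mean(target_sample) - mean(reference_sample)
    delta_ct_control = mean(target_control) - mean(reference_control)
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicates: the target crosses threshold 3 cycles later
# (relative to the reference gene) in the patient than in the control.
r = relative_abundance(
    target_sample=[27.5, 28.0, 28.5],
    reference_sample=[19.5, 20.0, 20.5],
    target_control=[24.5, 25.0, 25.5],
    reference_control=[20.0, 20.0, 20.0],
)
print(r)
```

Because each cycle corresponds to roughly a doubling of product, a ΔΔCt of 3 corresponds to an eight-fold lower relative abundance under this convention.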

Chromatography and mass spectrometry

A mixture of internal standard solution was prepared by dissolving 30 μM of carnitine-(N-methyl-d3) (Cambridge Isotopes Lab, Andover, MA, USA) and 15 μM of ɛ-N-trimethyl-(13C3)-L-Lysine (Sigma-Aldrich, St Louis, MO, USA) in methanol. In all, 20 μl of internal standard mixture was added to 30 μl of plasma or urine; after mixing, 100 μl of methanol was added and the sample was vortexed for protein precipitation. The mixture was incubated on ice for 15 min and centrifuged at 15,000 g for 10 min at +4 °C. The supernatant was transferred into vials and 5 μl was injected into the tandem LCMS/MS system. Calibration curves were generated by serial dilution of a stock solution containing 46.8 μM of l-carnitine and 20 μM of ɛ-N-trimethyllysine (Sigma-Aldrich) in methanol. An Acquity UPLC (ultra performance liquid chromatography) chromatographic system equipped with a BEH C18 RP column (1.7 μm, 50 mm × 2) maintained at 45 °C was coupled to a TQD (tandem quadrupole detector) MS/MS system (Waters, Guyancourt, France) and used as an LCMS/MS (liquid chromatography–mass spectrometry/mass spectrometry) system for trimethyllysine (TML) and carnitine measurement. The mobile phases were: eluent A, ultrapure water; eluent B, acetonitrile. The elution gradient was as follows: flow rate 0.8 ml min−1; 0–1 min, 0% A; 1–1.2 min, 0–100% A; 1.2–2 min, 100% A; 2–2.2 min, 100–0% A; 2.2–5 min, column equilibration with 100% B. The detector was used in multiple-reaction monitoring mode to detect the transitions from specific precursor to daughter ions, 189.1/84.1 and 162.2/103.1 for N6-trimethyllysine and carnitine, respectively.

Dietary assessment

To determine prospective and retrospective dietary intakes, the patient and their parents had an interview with a dietician. A dietary questionnaire listing all the food and beverages consumed by the patient over a period of 3 days was completed by the parents. Nutritional intakes were estimated with the DSMS software. Carnitine intakes were calculated based on the reference table of the Linus Pauling Institute (Oregon State University).

Results

Chromosome X exome sequencing

The coding regions of all genes on chromosome X, including 5′–3′ UTRs, were sequenced in 12 index cases using next-generation sequencing (pedigrees are shown in Supplementary Figure S1). A mean number of 1000 SNPs (829–1459) and 78 indels (56–113; Table 1 and Supplementary Table SA and SB) were identified per patient. In parallel, chromosome X regions common to the affected sibs, ranging from 15 to 109 Mb, were identified in 11 families using Illumina cytoSNP-12 arrays (Supplementary Figure S2). The mean numbers of SNPs and indels located in shared X regions were 394 (77–828) and 27 (7–77) per patient, respectively. Further analysis focused on variants that were (i) absent or rare in databases (minor allele frequency <1%), (ii) located in genes expressed in brain and (iii) predicted to be deleterious (Figure 1). Thirty-eight possibly deleterious variants (mean number per family: 3.2; range: 0–9), all confirmed by Sanger sequencing, were detected (Table 2). Analysis of matched control populations excluded two variants that had a frequency ≥1%. The variants were present in 15/22 asymptomatic male relatives. Altogether, these results identified 36 rare, possibly deleterious variants in 33 different genes in 9 families.

Table 1 Summary of all SNPs and indels detected by exome analysis of chromosome X in the index cases of families 1–12 and those located in chromosome X regions shared by families 1–11
Table 2 Summary of rare, possibly deleterious variants located in shared regions on chromosome X

In two families, the variants were in genes previously implicated in ID.21, 22, 23, 24 Both variants, c.2904_2906del/p.Ser969del in PHF8 and c.2849T>A/p.Val950Asp in HUWE1, affect amino acids that were highly conserved during evolution and were not found in a large control population or reported in HapMap, 1000 Genomes and the Exome Variant Server (Figure 2). Although mutations in PHF8 causing a loss-of-function were previously identified in patients with ID and cleft lip/palate,21, 22, 23 the p.Ser969del variant segregated with high-functioning autism without other clinical features in family 8. In the index case of family 4, p.Val950Asp in HUWE1 was predicted to be deleterious by SIFT, Polyphen-2 and Mutpred algorithms. Surprisingly, this variant was not found in the proband’s brother, who was less severely affected, and turned out to have occurred de novo in the proband. We hypothesized that Val950Asp in HUWE1 contributed to a genetically complex disorder by increasing the severity of the phenotype. Alternatively, the phenotype of the brother could have had a different etiology. These results illustrate the complexity of inheritance in ASD, in which a combination of rare inherited and de novo events can contribute to the disorder.6, 7, 8, 9, 10

Figure 2

Identification of variants in PHF8 and HUWE1 in families 8 and 4. (a) Pedigree of family 8 and segregation analysis of the p.Ser969del variant in PHF8. The arrow indicates the index case. (b) Sequence electropherograms showing the presence of the p.Ser969del variant at the hemizygous state in the two affected brothers and at the heterozygous state in their mother. (c) Alignment of the region flanking the variant in orthologous proteins, showing the high conservation of Serine 969. (d) Pedigree of family 4 and haplotypes reconstructed from eight informative single nucleotide polymorphisms (SNPs) adjacent to HUWE1 (genotypes of these SNPs were obtained from Illumina cytoSNP-12 arrays analysis), showing that the same maternal haplotype was transmitted to the affected brothers with and without the p.Val950Asp mutation. The arrow indicates the index case. (e) Sequence electropherograms showing the presence of p.Val950Asp in the index case and its absence in the affected brother and in the mother. These results are consistent with the de novo occurrence of p.Val950Asp in the index case. (f) Alignment of the region flanking the variant in orthologous proteins, showing the high conservation of valine 950.

Among the remaining variants, a nonsense mutation (c.229C>T/p.Arg77X) in TMLHE, encoding ɛ-N-trimethyllysine hydroxylase, the enzyme catalyzing the first step of carnitine biosynthesis from TML, segregated with autism and moderate ID in family 9 (Figures 3a and b). The p.Arg77X mutation was absent from 508 healthy male controls and databases.

Figure 3

Identification of TMLHE mutations in three families. (a) Pedigrees and segregation analysis of the TMLHE mutations in families 9, PED-804 and AU-205. The arrows indicate the index cases. (b) Sequence electropherograms of the mutations at the hemizygous state in the index cases (835–03 in family 9, 804–03 and 205–03) and the affected brother of family 9 (835–04), and at the heterozygous state in the mothers (835–02, 804–02 and 205–02). (c) Analysis of TMLHE mRNA in lymphoblasts from members of family 9 and schematic representations of the splicing isoforms detected in subjects with the p.Arg77X mutation in exon 3. Reverse transcriptase–PCR products using primer pairs in exons 2 and 4, run on 2% agarose gels, showed two mRNA isoforms in the index case (835–03), his affected brother (835–04) and his mother (835–02) and a single isoform in a control subject (c). Sequence analysis confirmed that the long isoform contains the premature termination codon in exon 3 and that exon 3 was skipped in the short isoform, probably as a consequence of nonsense-associated alternative splicing. (d) Alignment of the region flanking the two missense variants in orthologous proteins showing the conservation of the altered amino acids. (e) Quantification of TMLHE mRNA expression in fibroblasts (F) and lymphoblasts (L) from members of family 9 by quantitative real-time PCR, using primer pairs in exons 7 and 8. TMLHE mRNA expression was 10 times lower in patients than in healthy controls (green bars). Overnight treatment with 10 μg ml−1 emetine (blue bars), an inhibitor of nonsense-mediated decay, restored the expression of the TMLHE mRNA. (f) Assay of free carnitine by UPLC (ultra performance liquid chromatography) and TQD (tandem quadrupole detector) mass spectrometry in the plasma of patients. (g) Assay of trimethyllysine (TML) by UPLC and TQD mass spectrometry showing a 2–3-fold increase in the plasma of patients.

This study identified several additional rare variants that might contribute to ASD, such as c.521C>A/p.Ala174Asp (family 1) in ODZ1, which encodes teneurin-1, a transmembrane protein expressed in the developing central nervous system that might have a role in neuronal connectivity.25 In addition, 11 variants located in introns or UTRs of genes expressed in the brain and predicted to have a possible effect on gene expression were found in five families. Five of these genes are involved in axon guidance (PLXNA3, PLXNB3, KAL1) or neurotransmission (SYN1, GABRE); four regulate transcription, splicing or translation (TXLNG, TSPYL2, AFF2, involved in brain development, and RBM3, involved in RNA processing, regulation of translation and production of miRNA);26 the remaining genes are involved in ubiquitination (KLHL13) or in protein transport (BCAP31).

Finally, no rare variants meeting the criteria defined in Material and methods were found in three families (families 3, 7 and 10), suggesting that genetic factors in these families are possibly located in unexplored regions of the X chromosome or on autosomes, or were present with a frequency ≥1%.

Screening of TMLHE and functional consequences of mutations

To investigate the effect of the p.Arg77X mutation at the mRNA level, we performed quantitative reverse transcriptase–PCR analysis in lymphoblasts and fibroblasts from the two affected brothers of family 9 and their mother. The mutated mRNA was significantly downregulated in the cells of the affected sibs. Pretreatment of the cells with emetine restored the mRNA levels, indicating that this downregulation corresponded to the degradation of the mutated mRNA by nonsense-mediated decay (Figure 3e). Interestingly, two mRNA isoforms were detectable in the patients and their heterozygous mother, one with a premature termination codon in exon 3 and one missing exon 3, which restored the reading frame (Figure 3c). Exon 3 skipping could have resulted from nonsense-associated altered splicing, a mechanism alternative to nonsense-mediated decay.27

To confirm that mutations in TMLHE are associated with ASD, we screened its coding sequence in 501 male probands with ASD. Two missense substitutions (c.730G>C/p.Asp244His and c.1107G>T/p.Glu369Asp) were found in two additional unrelated patients with ASD but not in 330 controls and not reported in databases. Aspartic acid 244 and glutamic acid 369 are highly conserved in other species (Figure 3d). Aspartic acid 244 is predicted to bind the 2-oxoglutarate cofactor and constitutes one of the three key residues of the catalytic core,28 suggesting a complete loss-of-function of the protein with this substitution.

To further analyze the consequences of the TMLHE mutations, we assayed carnitine and TML, the precursor of carnitine biosynthesis, in plasma and urine from the brothers with the p.Arg77X mutation and the patient with the p.Asp244His variant for whom biological samples were available. Carnitine was slightly but not significantly decreased in the plasma of the patients; the values remained in the normal range (Figure 3f). By contrast, mass spectrometry revealed a significant 2–3-fold increase in the TML precursor in the plasma of all three patients with the Arg77X and Asp244His mutations compared with controls (Figure 3g). Carnitine intake, estimated from a 3-day dietary recall questionnaire, was normal in all patients (estimated at 55, 80 and 96 mg day−1, respectively).

Very few variants in TMLHE predicted to have deleterious effects are present in genetic databases (HapMap, 1000 Genomes); in particular, no nonsense mutations have been identified and only 18 non-synonymous variants have been reported on >8700 X chromosomes in the Exome Variant Server. Furthermore, non-synonymous variants are far more frequent in females than in males (n=22/3381 versus n=3/1998, if we exclude the Asn235Thr variant specific to the African population), indicating that variations in TMLHE are not well tolerated and are liable to be pathogenic. Mutations in TMLHE were found in patients with autism and ID but not in patients with Asperger syndrome. If we exclude patients with Asperger syndrome, the difference in the frequency of point mutations in TMLHE between male patients with autism and mental retardation and male individuals from the Exome Variant Server is significant (P=0.05).

Interestingly, Celestino-Soper et al.29 have recently reported that the deletion of exon 2 of TMLHE is a CNV present in 1/350 males and associated with ASD with a low penetrance. To compare the frequency of this CNV in healthy individuals and patients with ASD, we specifically assessed its presence in 896 unrelated healthy male controls and 691 patients with ASD by quantitative multiplex assay or Illumina Human 1M-single BeadChip arrays (see Material and methods). This study revealed the presence of one deletion of exon 2 of TMLHE in a single subject out of 896 male controls, whereas it was present in 3 out of 691 male patients with ASD (P=0.3). Interestingly, one of the three patients with the deletion of exon 2 had an affected brother who did not carry the deletion. In addition, a duplication encompassing exon 1 of TMLHE, absent from 525 healthy controls, was detected by Illumina Human 1M-single BeadChip arrays in an additional patient with ASD (P=0.4). Altogether, these results suggest that deficits in TMLHE could be rarer than previously reported. These deficits seem to be more frequent in patients with ASD, suggesting that they constitute susceptibility factors for autism, although the difference did not reach significance for microrearrangements in TMLHE.
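The two frequency comparisons in this paragraph can be checked with a short, dependency-free implementation of the two-sided Fisher’s exact test. This is an illustrative sketch rather than the software used in the study. The function name is hypothetical, and the denominator of 356 array-genotyped ASD cases for the duplication is an assumption taken from the Material and methods; only the 525 controls are stated in this paragraph.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (for example, patients and controls); columns are
    carriers and non-carriers. The P-value sums the probabilities of all
    tables with the same margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    carriers = a + c
    denominator = comb(row1 + row2, carriers)

    def probability(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, carriers - x) / denominator

    observed = probability(a)
    lowest = max(0, carriers - row2)
    highest = min(row1, carriers)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Exon 2 deletion: 3 of 691 patients versus 1 of 896 controls.
p_deletion = fisher_exact_two_sided(3, 691 - 3, 1, 896 - 1)
# Exon 1 duplication: 1 of 356 patients (assumed denominator) versus 0 of 525 controls.
p_duplication = fisher_exact_two_sided(1, 356 - 1, 0, 525)
print(round(p_deletion, 2), round(p_duplication, 2))
```

Under these assumptions both values round to the reported P=0.3 and P=0.4, which shows how little power a case-control comparison has when only a handful of carriers are observed.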

Discussion

This study focused on rare variants on chromosome X in multiplex families with ASD compatible with X-linked inheritance. Altogether, this study identified 36 possibly deleterious variants in 33 genes, including PHF8, HUWE1 and TMLHE. Variants in genes common to at least two families were exceptional, confirming that ASD is genetically highly heterogeneous.

The X chromosome contains the largest number of genes expressed in the brain.30 For this reason, mutations causing monogenic forms of ID have been identified in numerous genes on chromosome X.31 Interestingly, almost all genes involved in ASD, such as NLGN3/4X on chromosome X or SHANK3 on chromosome 22qter, are also mutated in patients with ID without autistic features.11, 31 In this study, variants were found in two genes previously implicated in ID: PHF8, which encodes a histone lysine demethylase that regulates rRNA synthesis32 and retinoic acid-induced neuronal differentiation,33 and HUWE1, which encodes an E3 ubiquitin-protein ligase that controls neural differentiation and proliferation by catalyzing the polyubiquitination and degradation of the N-Myc oncoprotein.34, 35 Missense mutations and microduplications encompassing HUWE1 were identified in a few large families with moderate-to-severe ID,24 whereas nonsense, truncating and one missense mutation in PHF8 were previously reported in patients with ID and cleft lip/palate.21, 22, 23 Interestingly, in a previous report, two brothers with a deletion encompassing PHF8 and two nearby genes (FAM120C and WNK3) also had autistic features.36 In our study, both variants alter highly conserved amino acids in the proteins and are predicted to be deleterious; the p.Ser969del variant in PHF8 segregated in the two affected brothers from family 8 and the p.Val950Asp in HUWE1 occurred de novo in the index case of family 4, who is more severely affected than his brother. The p.Glu441Lys in PPP1R3F was also present in the two affected brothers of family 4 but was inherited from their unaffected maternal grandfather. Interestingly, another missense variant (p.Phe245Leu) in PPP1R3F was reported in a patient with Asperger syndrome.17 This observation suggests that the variants identified in family 4 act as risk factors for ASD when associated with other deleterious variants on the X chromosome or autosomes. Further studies are needed to confirm the roles of these genes in ASD.

Among the variants detected in this study, there was a single nonsense mutation in TMLHE that segregated with ASD in the two affected brothers of family 9. TMLHE is located at the far end of the long arm of chromosome X (Xq28) and encodes ɛ-N-trimethyllysine hydroxylase, the enzyme that catalyzes the first of the four steps of endogenous carnitine biosynthesis. Screening of additional male patients with ASD identified two missense variants predicted to be deleterious in unrelated sporadic patients. The p.Arg77X and p.Asp244His mutations were associated with a significant increase in TML, the substrate of TMLHE and precursor of carnitine biosynthesis, in the plasma of the patients, confirming that they lead to a loss of TMLHE function and a deficit of endogenous carnitine biosynthesis. However, although carnitine was mildly decreased, it remained in the normal range in the plasma and urine of the patients. This result is not surprising as, in humans, the carnitine pool mainly comes from food intake. A small pool (25%) of carnitine is also synthesized in liver, kidney and brain, but the precise role of endogenous synthesis in these tissues remains unknown.28 Carnitine is an essential metabolite in all animal species as well as in numerous microorganisms and plants. In mammalian cells, carnitine is present as free carnitine and acylcarnitines, including acetylcarnitine.37, 38 The main role of carnitine is to transport activated long-chain fatty acids across the inner mitochondrial membrane for β-oxidation. However, additional neuroprotective, neuromodulatory and neurotrophic roles have also been suggested.39, 40, 41, 42 In particular, carnitine is an antioxidant that might protect mitochondria from oxidative stress.

Another recent study supports the observation that the loss-of-function of TMLHE is associated with ASD. The deletion of exon 2 of TMLHE, originally identified in ASD male patients,43 was shown to be a CNV that is present in male controls at a frequency of 1/350. The first coding exon of TMLHE, exon 2, encodes the signal peptide necessary to target the protein to mitochondria. Deletion of this exon causes enzyme deficiency and impairs endogenous carnitine biosynthesis, as observed for the p.Arg77X and p.Asp244His mutations. Remarkably, 6 out of 7 sib pairs with ASD were concordant for the deletion of exon 2, whereas the frequency of this CNV was only slightly, and not significantly, increased in sporadic ASD patients. The authors concluded that the deletion of exon 2 of TMLHE is a risk factor for nondysmorphic ASD, with a penetrance estimated at 2–4%.29 In the present study, the deletion of exon 2 of TMLHE was present in three male patients with ASD out of 691 and in only one male control out of 896, indicating that this CNV could be even rarer than 1/350. Although CNVs altering TMLHE tend to be more frequent in patients with ASD than in controls, the difference in frequencies between patients and controls was not significant. Yet, this finding is concordant with the results of the previous study, in which significance between patients and controls was not achieved either.29 Interestingly, in our case, the only sib pair with ASD was discordant for the deletion of exon 2 of TMLHE and all the other patients were sporadic cases. However, association studies are not appropriate for rare variants, especially when genetic heterogeneity is high, as in ASD.
The arguments suggesting that a loss-of-function of TMLHE could contribute to autism are: (i) the demonstration that mutations have a functional biological impact; (ii) the segregation of the mutations with the disorder in the families; and (iii) the identification of different rare functional mutations in the same gene in unrelated individuals.44 Taking these arguments into account, our results confirm that the deficiency of TMLHE likely contributes to ASD, probably in association with other genetic or non-genetic abnormalities, and reinforce the view that point mutations could also be identified in patients with ASD. The penetrance of the point mutations remains to be determined but might be higher than previously anticipated for deletion of exon 2.
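
As an illustration of why the case-control comparison of the exon 2 deletion lacks power, the counts given above (3 of 691 male patients versus 1 of 896 male controls) can be placed in a 2x2 table and tested. The text does not state which test was used; the sketch below assumes a two-sided Fisher's exact test, and the function name fisher_exact_two_sided is hypothetical, introduced only for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (illustrative, not taken from the source):
        a = patients with the deletion,  b = patients without it
        c = controls with the deletion,  d = controls without it
    Returns (sample odds ratio, two-sided p-value).
    """
    row1 = a + b            # number of patients
    col1 = a + c            # number of deletion carriers
    total = a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability that k of the carriers are patients,
        # with all table margins held fixed.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_obs * (1 + 1e-9)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Counts quoted in the text: 3/691 male patients, 1/896 male controls.
odds_ratio, p_value = fisher_exact_two_sided(3, 691 - 3, 1, 896 - 1)
print(f"odds ratio = {odds_ratio:.2f}, two-sided p = {p_value:.3f}")
```

With only four carriers in total, the p-value stays above 0.05 even though the sample odds ratio is well above 1, which is consistent with the non-significant difference reported here and with the remark that association tests are poorly suited to rare variants.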

The mechanism by which TMLHE deficiency leads to ASD remains unclear. The increase in TML could be toxic at some stage during brain development or interfere with the establishment of normal neuronal networks. Alternatively, the deficiency of carnitine itself or one of the three intermediates of endogenous carnitine biosynthesis (HTML, TMABA or γ-BB, Supplementary Figure S4) might be deleterious for brain development, alone or combined with a deficit in carnitine intake. To test the hypothesis that patients with p.Arg77X and p.Asp244His had a deficiency of carnitine intake in addition to the TMLHE mutation, we assessed their dietary intake over several days. The calculated carnitine intake was normal in all three patients but reflects only current carnitine intake; a lack of carnitine during specific antenatal or neonatal periods cannot be ruled out. A retrospective questionnaire revealed that patient PED-804–03 (with the p.Asp244His mutation) refused to eat meat around 12 months of age. However, young children frequently refuse meat, particularly those with ASD. Interestingly, low levels of carnitine and acetylcarnitines and altered brain fatty acid metabolism were reported in subjects with ASD.45, 46, 47 However, if a systemic carnitine deficiency constituted a risk factor for ASD, infants with low meat intake would have a higher risk of developing ASD. This is unlikely, as the prevalence of autism is not notably increased in vegetarians or populations with low meat intake. In addition, carnitine and related metabolites are often abnormal in patients without ASD. These arguments suggest that low levels of carnitine and acetylcarnitines in ASD patients could reflect a more complex deficit. Further studies are, therefore, needed to decipher the precise role of carnitine and TMLHE deficiency in ASD.

Recent studies have emphasized the role of carnitine in promoting social interactions in animal models. Desert (Schistocerca gregaria) and migratory locusts (Locusta migratoria) reversibly change between two phenotypes (solitarious and gregarious) that differ in bodily appearance, physiology, brain size and organization, and behavior. At low population density, locusts in the solitarious phase avoid their congeners; when the population increases, locusts become gregarious and aggregate in migratory swarms. The genomes of the two forms of locust are equivalent. The transformation is driven only by epigenetic regulation.48 Interestingly, carnitine was recently shown to constitute a key regulatory metabolite in the phase transition in the migratory locust.49 Remarkably, the only other metabolite that is known to regulate phase transition in desert locusts is serotonin.50 The link between carnitine and serotonin is unclear, but carnitine has been proposed to act as a neuromodulator in the animal central nervous system; in addition, acylcarnitines could promote the biosynthesis and release of neurotransmitters, including dopamine and melatonin.51 Although these results were obtained in species very distant from humans, they support the hypothesis that carnitine has a conserved role in socialization during evolution and offer potentially novel insights into the complex role of carnitine and its derivatives acetylcarnitine and acylcarnitines in the brain.

Finally, TMLHE could also have other, as yet unknown, functions. An isoform has been reported in which exon 2 is spliced out and an alternative start codon in exon 3 is used.42 This isoform lacks the mitochondrial targeting signal and probably localizes in another cellular compartment. Contrary to the deletion of exon 2, which affects only the mitochondrial isoform,29 the mutations reported in this study are predicted to affect other isoforms as well. Interestingly, the TMLHE protein was reported to interact with the nuclear complex p130/RBL2, which regulates gene expression, supporting the hypothesis that TMLHE has another cellular localization and function.52 The mechanism by which the TMLHE deficiency leads to ASD remains to be further characterized.

Altogether, our results confirm that a TMLHE deficiency is associated with ASD and support the hypothesis that rare variants on the X chromosome are involved in the etiology of ASD and contribute to the sex-ratio disequilibrium characteristic of these disorders.