
Systematic analysis of short tandem repeats in 38,095 exomes provides an additional diagnostic yield



Expansions of a subset of short tandem repeats (STRs) have been implicated in approximately 30 different human genetic disorders. Despite extensive application of exome sequencing (ES) in routine diagnostic genetic testing, STRs are not routinely identified from these data.


We assessed the diagnostic utility of STR analysis in exome sequencing by applying ExpansionHunter to 2,867 exomes from movement disorder patients and 35,228 other clinical exomes.
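As a rough illustration of this kind of screening step, the sketch below compares per-sample repeat-size estimates against locus-specific cutoffs and reports samples with a possibly aberrant allele. It is not the authors' pipeline and does not call ExpansionHunter. The names `LocusThresholds`, `classify_allele`, and `flag_samples`, the cutoff values, and the sample data are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusThresholds:
    """Hypothetical repeat-count cutoffs for one locus (not the study's values)."""
    premutation_min: int
    pathogenic_min: int


def classify_allele(repeat_count: int, thresholds: LocusThresholds) -> str:
    """Assign one allele to a size range using inclusive lower bounds."""
    if repeat_count >= thresholds.pathogenic_min:
        return "pathogenic"
    if repeat_count >= thresholds.premutation_min:
        return "premutation"
    return "normal"


def flag_samples(calls: dict, thresholds: LocusThresholds) -> dict:
    """Return {sample_id: worst_range} for samples with a non-normal allele.

    `calls` maps a sample ID to a tuple of estimated repeat counts,
    one per allele.
    """
    severity = {"normal": 0, "premutation": 1, "pathogenic": 2}
    flagged = {}
    for sample_id, alleles in calls.items():
        worst = max(
            (classify_allele(a, thresholds) for a in alleles),
            key=severity.__getitem__,
            default="normal",  # a sample without calls is never flagged
        )
        if worst != "normal":
            flagged[sample_id] = worst
    return flagged


# Made-up cutoffs and repeat-size estimates for a single example locus.
EXAMPLE_LOCUS = LocusThresholds(premutation_min=30, pathogenic_min=40)
example_calls = {
    "sample_A": (18, 21),
    "sample_B": (19, 33),
    "sample_C": (17, 52),
}
flagged = flag_samples(example_calls, EXAMPLE_LOCUS)
print(flagged)
```

Flagged samples in such a scheme are only candidates. The confirmation rates reported in the study show that many calls exceeding a threshold in exome data are not confirmed by PCR, so orthogonal validation remains necessary.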


We identified 38 movement disorder patients with a possible aberrant STR length. Validation by polymerase chain reaction (PCR) and/or repeat-primed PCR technologies confirmed the presence of aberrant expansion alleles for 13 (34%). For seven of these patients the genotype was compatible with the phenotypic description, resulting in a molecular diagnosis. We subsequently tested the remainder of our diagnostic ES cohort, including over 30 clinically and genetically heterogeneous disorders. Optimized manual curation yielded 167 samples with a likely aberrant STR length. Validations confirmed 93/167 (56%) aberrant expansion alleles, of which 48 were in the pathogenic range and 45 in the premutation range.
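The percentages above follow directly from the reported counts. The short check below recomputes them as a sanity test. It uses only numbers stated in the text, and the function name `confirmation_rate` is introduced here purely for illustration.

```python
def confirmation_rate(confirmed: int, flagged: int) -> float:
    """Percentage of flagged samples whose expansion was confirmed."""
    if flagged <= 0:
        raise ValueError("flagged must be a positive count")
    return 100.0 * confirmed / flagged


# Counts as reported in the text.
movement_rate = confirmation_rate(13, 38)    # movement disorder cohort
curated_rate = confirmation_rate(93, 167)    # samples flagged after manual curation

# The confirmed alleles split into pathogenic and premutation ranges,
# and the two cohorts add up to the total in the title.
pathogenic, premutation = 48, 45
total_exomes = 2867 + 35228

print(f"movement disorder cohort: {movement_rate:.0f}% confirmed")
print(f"curated samples: {curated_rate:.0f}% confirmed")
print(f"confirmed alleles: {pathogenic + premutation}, exomes: {total_exomes}")
```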


Our work provides guidance for the implementation of STR analysis in clinical ES. Our results show that systematic STR evaluation may increase diagnostic ES yield by 0.2%, and we recommend making STR evaluation a routine part of ES interpretation in genetic testing laboratories.


Fig. 1: Results of short tandem repeat (STR) size detection in two cohorts of exome sequencing (ES) samples and confirmations of alleles that exceeded the thresholds.
Fig. 2: Three examples of clinical compatibility between genotype and the phenotypic description of two patients from the movement disorders cohort that received a genetic diagnosis based on the short tandem repeat (STR) detection workflow, and one validation sample to compare to.

Data availability

Data and materials are available upon request.

Code availability

ExpansionHunter script is available at: GangSTR script is available at: STRetch script is available at:




We thank Michael Eberle and Egor Dolzhenko for kindly providing the Python code for the swimlane plots. We thank Ingrid Siegelaer and Monique Gerrits for helping with the molecular confirmations of the HTT allele sizes. This project was financially supported by an Aspasia grant of the Dutch Research Council (015.014.066 to L.E.L.M.V.), a VIDI grant (917-17-353 to C.G.), and the NWO X-omics project (184.034.019 to C.G.). The aims of this study contribute to the Solve-RD project (to C.G. and L.E.L.M.V.), which has received funding from the European Union’s Horizon 2020 research and innovation program (number 779257).

Author information




B.P.G.H.v.d.S.: Methodology, Project administration, Writing—original draft, Writing—review & editing. J.C.: Methodology, Software, Formal analysis, Investigation, Data curation. M.d.G.: Methodology, Software, Formal analysis, Investigation, Data curation. M.P.: Validation, Formal analysis. R.P.P.M.: Validation, Formal analysis. N.V.: Resources, Visualization. B.v.d.W.: Resources, Visualization. M.S.: Resources, Visualization. H.G.Y.: Writing—review & editing. L.E.L.M.V.: Writing—review & editing, Funding acquisition. E.-J.K.: Conceptualization, Methodology, Project administration, Writing—original draft, Writing—review & editing, Supervision. C.G.: Conceptualization, Methodology, Project administration, Writing—original draft, Writing—review & editing, Supervision, Funding acquisition.

Corresponding authors

Correspondence to Erik-Jan Kamsteeg or Christian Gilissen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics Declaration

Patient samples, together with a basic phenotype description, were anonymized. The study was approved by the institutional review board “Commissie Mensgebonden Onderzoek Regio Arnhem-Nijmegen” under number 2011/188. We received and archived consent for participation/publication from every individual whose data are included in this manuscript.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

van der Sanden, B.P.G.H., Corominas, J., de Groot, M. et al. Systematic analysis of short tandem repeats in 38,095 exomes provides an additional diagnostic yield. Genet Med (2021).
