Fig. 1 | Nature Communications

Fig. 1

From: A reference haplotype panel for genome-wide imputation of short tandem repeats

A deep catalog of STR variation in the SSC cohort. a Number of STRs called per sample. Dashed line represents the mean of 1.14 million STRs per sample. b Call rate per locus. Dashed line represents the mean call rate of 90%. c Mendelian inheritance rate at filtered vs. unfiltered STRs. The x-axis gives the posterior genotype score (Q) returned by HipSTR. The y-axis gives the average Mendelian inheritance rate for each bin across all calls on chromosome 21. STRs that were homozygous for the reference allele in all members of a family were removed. Colors represent different motif lengths. d Per-STR expected heterozygosity in SSC vs. 1000 Genomes. Only STRs with expected heterozygosity >0.095 in SSC are included. Color scale gives the log10 number of STRs represented in each bin. e Allele frequency distributions at pathogenic STRs obtained in SSC samples vs. previously reported normal alleles. Blue = SSC, gold = previously reported. Boxes span the interquartile range and horizontal lines give the medians. Whiskers extend to the minimum and maximum data points. The y-axis gives the number of repeat units. Sources of previously reported allele frequencies are described in detail in Methods. HD Huntington’s disease, SCA spinocerebellar ataxia, DRPLA dentatorubral-pallidoluysian atrophy, DM1 myotonic dystrophy type 1, HDL Huntington’s disease-like 2.
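As an illustration of the quantity plotted in panel d, expected heterozygosity at a locus is commonly defined as one minus the sum of squared allele frequencies. The sketch below uses that standard definition. It is an assumption that the figure was computed this way, and the function name and the example allele calls are hypothetical, not taken from the paper.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return 1 - sum(p_i^2) over the observed allele frequencies.

    `alleles` is a flat list of allele calls at one STR locus
    (two entries per diploid sample), e.g. repeat-unit counts.
    This is the plain estimator without a small-sample correction.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls at this locus")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical repeat-unit counts for four diploid samples at one locus.
calls = [12, 12, 13, 12, 14, 12, 13, 12]
h = expected_heterozygosity(calls)

# Threshold quoted in the caption for inclusion in panel d.
passes_filter = h > 0.095
```

A locus where every call is the same allele gives a value of 0, so it would fall below the 0.095 cutoff mentioned in the caption.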