Table 2 Cohort information and technical features of whole-exome sequencing (WES)

From: Flexible and scalable diagnostic filtering of genomic variants using G2P with Ensembl VEP

| | DDD (n = 7357)^a | CRC (n = 517) | GS (n = 315) |
| --- | --- | --- | --- |
| Capture kit | Agilent Human All-Exon V3 or V5 Plus with custom ELID C0338371 | Illumina TruSeq Exome Enrichment kit | Illumina TruSeq Exome Enrichment kit |
| Sequencing platform | Illumina HiSeq | Illumina HiSeq 2000 and 2500 | Illumina HiSeq 2000 and 2500 |
| Alignment | bwa (0.5.9) | bwa (0.5.9) | bwa (0.5.9) |
| Variant calling | GATK (3.1.1) | GATK (3.4) | GATK (3.4) |
| | Indel realignment, BQSR | Indel realignment, BQSR | Indel realignment, BQSR |
| | HaplotypeCaller (run in multisample calling mode using the complete dataset) | HaplotypeCaller (per sample) | HaplotypeCaller (per sample) |
| | | GenotypeGVCFs (joint genotyping across all samples on TruSeq regions + 50 bp padding) | GenotypeGVCFs (joint genotyping across all samples on TruSeq regions + 50 bp padding) |
| Relatedness | After excluding poor-quality samples, one affected proband per family was selected at random (using the PED file) | Unrelated | First-degree relatives excluded (based on computed relationship coefficients) |
| Male:female ratio | 1.36 | 1.11 | 0.73 |
| Median age | 7.9 years | 63 years | 52 years |
^a DDD details are based on information in the Methods section of ref. 13
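
The male:female ratios in Table 2 can be turned into approximate male fractions and per-sex counts, which some readers may find easier to compare across cohorts. The sketch below is illustrative only: the names `COHORTS`, `male_fraction` and `approx_counts` are hypothetical, and it assumes each reported ratio applies to the full cohort size n given in the column header. Because the ratios are rounded to two decimals, the resulting counts are estimates rather than reported values.

```python
# Cohort sizes and male:female ratios copied from Table 2.
# Assumption: each ratio applies to the full cohort size n in the header.
COHORTS = {
    "DDD": {"n": 7357, "male_female_ratio": 1.36},
    "CRC": {"n": 517, "male_female_ratio": 1.11},
    "GS": {"n": 315, "male_female_ratio": 0.73},
}


def male_fraction(ratio: float) -> float:
    """Convert a male:female ratio r into the male fraction r / (1 + r)."""
    return ratio / (1.0 + ratio)


def approx_counts(n: int, ratio: float) -> tuple:
    """Return estimated (male, female) counts for a cohort of size n.

    The published ratios are rounded, so these counts are approximations.
    """
    males = round(n * male_fraction(ratio))
    return males, n - males


if __name__ == "__main__":
    for name, info in COHORTS.items():
        ratio = info["male_female_ratio"]
        males, females = approx_counts(info["n"], ratio)
        print(f"{name}: ~{males} male, ~{females} female "
              f"({male_fraction(ratio):.1%} male)")
```

A ratio above 1 (DDD, CRC) corresponds to a male fraction above one half, and a ratio below 1 (GS) to a cohort with more female than male participants.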