Extended Data Fig. 5: Post hoc evaluation of our cross-assembly strategy. | Nature Microbiology

From: Long-term stability and Red Queen-like strain dynamics in marine viruses

Cross-assembly merged contig sequences from different months when overlapping regions longer than 1,000 bp had an overall identity above 95%. We evaluated how often these merged overlaps occurred at different percent identities to assess how much variation was combined, and also examined other statistics describing the merges. a) Distribution of percent identity of all alignments used to merge contigs during our cross-assembly step. Dotted lines mark the percentage of alignments that fall to the right of each line. Note that 92% of merges had >98% sequence identity. b) Distribution of lengths of all alignments used to merge contigs during our cross-assembly step. Note that most merged regions were 5,000–10,000 bp in length. c) Distribution of the fraction of each contig used during our merging step (that is, the length of the alignment divided by the contig length). Note that the vast majority of merges occurred over almost the entire length (90–100%) of the contigs. d) Distribution of the number of contigs that were merged into a single contig during cross-assembly. Note that the vast majority of merged contigs came from two or three individual contigs. Taken together, the panels show that although merging occurred, the vast majority (86%) was between almost completely overlapping (including nested) contigs with >98% sequence identity, rather than bridging between long contigs with short overlaps.
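The merge criterion and the per-merge statistics summarized in the panels can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Overlap` record, the function names, and the input layout are hypothetical. Only the thresholds (overlaps longer than 1,000 bp, identity above 95%) and the definition of the fraction used (alignment length divided by contig length) come from the legend.

```python
from dataclasses import dataclass
from typing import Iterable

# Thresholds stated in the legend.
MIN_OVERLAP_BP = 1000   # overlapping region must be longer than this
MIN_IDENTITY = 95.0     # overall percent identity must exceed this


@dataclass
class Overlap:
    """Hypothetical record for one alignment between two contigs."""
    alignment_length: int    # aligned bases (bp)
    percent_identity: float  # 0-100
    contig_length: int       # length of the contig being merged (bp)


def is_mergeable(o: Overlap) -> bool:
    """Apply the legend's merge criterion to one overlap."""
    return o.alignment_length > MIN_OVERLAP_BP and o.percent_identity > MIN_IDENTITY


def fraction_used(o: Overlap) -> float:
    """Panel c statistic: alignment length divided by contig length."""
    return o.alignment_length / o.contig_length


def percent_of_merges_above(overlaps: Iterable[Overlap], identity_cutoff: float) -> float:
    """Percentage of mergeable overlaps whose identity exceeds the cutoff
    (the quantity the dotted lines in panel a report)."""
    merged = [o for o in overlaps if is_mergeable(o)]
    if not merged:
        return 0.0
    above = sum(o.percent_identity > identity_cutoff for o in merged)
    return 100.0 * above / len(merged)
```

Given a list of such records, `percent_of_merges_above(records, 98.0)` would yield the kind of figure reported for panel a, and `fraction_used` applied to each mergeable record would give the values binned in panel c.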