From: Mutation identification by direct comparison of whole-genome sequencing data from mutant and wild-type individuals using k-mers

Whole-genome sequencing data from two related genomes are analyzed for the frequency of all k-mers. K-mer frequency histograms make it possible to distinguish native k-mers (area highlighted in gray) from k-mers that overlap sequencing errors. Comparing the k-mer sets of two highly related genomes reveals sample-specific, overlapping k-mers that result from subtle differences between the genomes. These sample-specific k-mers are merged into a seed if they can be paired with a homologous, but not identical, seed in the other sample. Read pairs that share at least one k-mer with a seed pair are used for local assemblies, which yield contigs centered on the mutated sites.
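The first steps of this workflow can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences, the fixed count cutoff `min_count` (standing in for a threshold read off the k-mer frequency histogram) and the single-end reads are all assumptions made for illustration. Seed merging, seed pairing and local assembly are left out.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_count):
    """Keep k-mers seen at least min_count times.

    min_count is a hypothetical cutoff; rare k-mers below it are
    treated as likely products of sequencing errors.
    """
    return {kmer for kmer, n in counts.items() if n >= min_count}


def select_reads(reads, kmers, k):
    """Return reads that share at least one k-mer with the given set."""
    return [
        r for r in reads
        if any(r[i:i + k] in kmers for i in range(len(r) - k + 1))
    ]


def tile(genome, read_len):
    """Toy read generator: every substring of length read_len."""
    return [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]


# Toy data (made up): the mutant differs from the wild type by one base.
K = 5
wild_type = "ACGTTGCAGGTCATGCCTAA"
mutant = "ACGTTGCAGGACATGCCTAA"

wt_reads = tile(wild_type, 10)
# One extra mutant read carries a simulated sequencing error.
mut_reads = tile(mutant, 10) + ["TTCCAGGACA"]

wt_solid = solid_kmers(count_kmers(wt_reads, K), min_count=2)
mut_solid = solid_kmers(count_kmers(mut_reads, K), min_count=2)

# Sample-specific k-mers: present in one sample but not in the other.
mut_only = mut_solid - wt_solid
wt_only = wt_solid - mut_solid

# Reads that would feed a local assembly around the mutated site.
selected = select_reads(mut_reads, mut_only, K)

print(sorted(mut_only))
print(sorted(wt_only))
print(len(selected), "of", len(mut_reads), "mutant reads selected")
```

In this toy case the k-mers created by the simulated sequencing error occur only once and fall below the cutoff, so the sample-specific sets contain only the overlapping k-mers that span the mutated base. On real data the cutoff would have to be chosen from the observed k-mer frequency histogram, and a dedicated k-mer counter would be needed instead of an in-memory `Counter`.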

