a, Copy number distribution for core and variable ORFs. Variable ORFs have a greater frequency of both hemizygous and multiallelic genes. b, Distribution, on a logarithmic scale, of the number of isolates carrying loss-of-function mutations for core (n = 4,931) and variable (n = 1,111) ORFs. Core ORFs carry far fewer loss-of-function mutations than variable ORFs (P = 6.45 × 10⁻⁷⁸, two-sided Mann–Whitney–Wilcoxon test). Centre lines, median; boxes, IQR; whiskers, 1.5 × IQR. Data points beyond the whiskers are outliers. c, The distribution differs markedly among the different types of variable ORFs.