We agree with Becker and Knapp in noting the promise of novel haplotype estimation methods that combine segregation information with the EM approach. Indeed, we concluded our paper in anticipation of such advances:

‘Methods which incorporate both familial transmission and genotype frequencies, thereby combining the benefits of error detection, segregation patterns and underlying haplotype frequencies, but which are as yet undeveloped, could favour trios and yield more efficient and accurate methods for cataloguing LD patterns.’

The points raised by Becker and Knapp seem to us more a reiteration of this statement than a novel perspective. Moreover, it is difficult to see our results as ‘discrepant’ with those of Schaid and others (citations in Becker and Knapp), as our comparisons and conclusions relate to genotyping error, while those of the others relate to design efficiency and different methods of analysis.

By focusing on characteristics of the L-G method, Becker and Knapp have missed the main points of our paper: (i) that genotyping error can dramatically influence haplotype frequency estimation in any study; (ii) that detection of errors via incompatible segregation in families (‘Mendelian errors’) may not sufficiently protect against such effects; and (iii) that unrelated and trio designs suffer badly from genotyping errors when LD is low or absent and alleles are common. In this last situation – which showed the greatest effects of genotyping error in our study – Becker and Knapp are incorrect in claiming that L-G is inappropriate and that adding family members offers no further information. A decade of research in genetic map construction using linked markers with common alleles (advancing to microsatellites) in large CEPH families counters this claim [1,2,3,4].

We also fail to see the basis of Becker and Knapp's assertion that our baseline (no error) data do not detect the benefit of child information, as our data actually support their view. In all of the examples we considered (Figure 1), the baseline accuracy rate in trios was about equal to or better than that of unrelated individuals. More importantly, many of these examples involved situations in which the accuracy of baseline haplotype estimation approached 100%. Whether or not ‘benefits of family data’ are appreciated seems largely irrelevant when haplotype frequencies are estimated almost perfectly.

We expressly sought to avoid these types of methodological arguments by comparing the effects of genotyping error across designs after correcting for their respective baseline performance (Figure 3). These data may have been overlooked by Becker and Knapp. In addition, we noted in our paper that, ‘It is important to distinguish these inherent methodological/study design differences from those relating to the effects of genotyping error’. This explicit contextual guideline may have been missed as well.

While we do not see the rationale or novelty of Becker and Knapp's concerns in the context of genotyping error, we fully agree that family-based designs can offer substantial efficiencies for haplotype estimation in the study of specific LD patterns. We also share their view, as expressed in our paper, that exploration of the effects of genotyping error using combined segregation/EM approaches will be exciting and relevant to the design of large-scale association studies.