Introduction

Pedigrees can be divided into three categories: acyclic rooted graphs, cyclic but perfectly drawable graphs, and nondrawable cyclic graphs. The drawability of a pedigree depends on how well a given set of aesthetic goals can be met by a two-dimensional presentation.

The first category is the easiest, and efficient visualisation methods have been available since the late 1970s. Among the first were Wetherell and Shannon,¹ who compared two naive algorithms and a more sophisticated solution and discussed the related aesthetic criteria. Soon after, Reingold and Tilford² responded by presenting a node-positioning algorithm that produced perfect layouts except for a minor defect in subtree positioning. A decade later, Walker II³ finally introduced an efficient algorithm that also adjusted the subtrees correctly.

Large families are likely to contain cycles (multiple matings, parents with common ancestors) and therefore belong to one of the remaining categories. The second category was covered by Tores and Barillot,⁴ who presented an interval graph interpretation of perfectly drawable pedigrees. For the third category, no exact solution exists and time-consuming combinatorial optimisation is required. By contrast, our program transforms any pedigree into an acyclic graph before drawing, so only the node positioning for the first category needs to be solved.

Aesthetic criteria

For an optimal visualisation of ordered trees, the basic rules have been defined in the literature:

  1) Nodes should not overlap each other.

  2) Straight lines from children to their respective parents should not cross.

  3) Nodes in the same generation should be placed on a straight line and the lines should be parallel.

  4) The parents should be centred over their children.

  5) (a) A subtree should be drawn the same way regardless of its position. (b) After reversing the node order, the new drawing should be a reflection of the original.

Aesthetics 4 and 5 are important for viewing quality, but in certain cases they prevent a maximally compact drawing given the other criteria, as demonstrated by Reingold and Tilford.² In fact, a drawing satisfying all five criteria may not be as narrow as possible since the overall width is affected, to some extent, by the traversal order of the tree. Fortunately, the phenomenon has little practical relevance.
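Aesthetics 1, 2 and 4 can be verified mechanically once a layout exists. The following is a minimal sketch, assuming a hypothetical `Node` record that stores a horizontal centre `x`, a generation index `depth` and a `width`; none of these names come from the program described here. Aesthetic 3 holds by construction because the vertical position is derived from `depth` alone, and every child is assumed to lie exactly one generation below its parent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    x: float                 # horizontal centre of the node
    depth: int               # generation index, 0 for the root
    width: float = 1.0
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node and all of its descendants in preorder."""
    yield node
    for child in node.children:
        yield from walk(child)


def no_overlap(root, eps=1e-9):
    """Aesthetic 1: neighbours in one generation do not overlap."""
    rows = {}
    for n in walk(root):
        rows.setdefault(n.depth, []).append(n)
    for row in rows.values():
        row.sort(key=lambda n: n.x)
        for left, right in zip(row, row[1:]):
            if right.x - left.x + eps < (left.width + right.width) / 2:
                return False
    return True


def no_crossings(root):
    """Aesthetic 2: parent-child lines between the same two rows do not cross."""
    edges = [(p, c) for p in walk(root) for c in p.children]
    for i, (p1, c1) in enumerate(edges):
        for p2, c2 in edges[i + 1:]:
            # Two edges cross when their end points swap horizontal order.
            if p1.depth == p2.depth and (p1.x - p2.x) * (c1.x - c2.x) < 0:
                return False
    return True


def parents_centred(root, eps=1e-9):
    """Aesthetic 4: a parent sits midway between its first and last child."""
    for n in walk(root):
        if n.children:
            mid = (n.children[0].x + n.children[-1].x) / 2
            if abs(n.x - mid) > eps:
                return False
    return True
```

The pairwise crossing test is quadratic in the number of links, which is adequate for a check but not for layout itself.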

Current pedigree drawing programs are motivated by the need to display every link and to project the result on a flat surface. For example, PED 4.2⁵ achieves compact drawings, but violates Aesthetics 2 and 4. Another solution titled CoPE⁶ allows batch processing of pedigrees, but cannot handle nondrawable graphs. A third example named Pedigraph⁷ displays a vertical flow chart, a different approach from the traditional drawings. In this case, failure to fulfil Aesthetics 2 and 4 leads to impaired readability of large and complex pedigrees. On the other hand, populations that have many founders and bottlenecks may be better visualised by this type of diagram.

Pedigree transformation

Figure 1 depicts a planar but nondrawable family graph. If the parents and children are placed on parallel lines according to Aesthetic 3, intersecting lines cannot be avoided. Interestingly, every crossing link involves a father–mother connection, so Aesthetic 2 can be fulfilled by omitting such links. One way to recover the lost information is to draw the problematic spouses more than once. As a side effect, the parent nodes of a child now constitute a mating unit with an important condition: every child is linked to exactly one mating unit, and any pedigree is reduced to a forest of rooted trees.

Figure 1

A family with a complex coupling structure (left). The labelled nodes are ancestors and the black dots represent their subtrees. A fragmented representation of the same family (right). Duplicates are indicated by dashed lines.
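The reduction to a forest of mating units can be sketched in a few lines. The code below is an illustration under stated assumptions, not necessarily the procedure implemented in the program: the pedigree is given as a hypothetical mapping from each child to a `(father, mother)` pair with both parents known, founders are absent from the keys, and a mating unit is attached through the first spouse who has parents on record. The other spouse is then drawn a second time as a plain child of his or her own parents.

```python
from collections import defaultdict


def duplicate_transform(pedigree):
    """Reduce a pedigree to a forest of mating units.

    pedigree maps child -> (father, mother); founders are not keys.
    Returns (parent_of, duplicates): parent_of maps every mating unit to
    the unit it hangs under (None for a root), and duplicates maps each
    individual drawn more than once to the number of drawn copies.
    """
    units = defaultdict(list)            # (father, mother) -> children
    for child, parents in pedigree.items():
        units[parents].append(child)

    parent_of = {}                       # the forest over mating units
    links = defaultdict(int)             # units attached through a person
    for unit in units:
        # Attach through the first spouse whose own parents are known
        # (an arbitrary choice made for this sketch).
        link = next((p for p in unit if p in pedigree), None)
        parent_of[unit] = pedigree[link] if link is not None else None
        if link is not None:
            links[link] += 1

    copies = defaultdict(int)
    for unit in units:
        for spouse in unit:
            copies[spouse] += 1          # drawn once in every mating unit
    for person in pedigree:
        if links[person] == 0:
            copies[person] += 1          # also drawn as a plain child

    duplicates = {p: n for p, n in copies.items() if n > 1}
    return parent_of, duplicates
```

Because each unit is attached through at most one spouse, and that spouse's parents are strictly older, every unit has at most one parent unit and no cycle can form. A person with several matings is counted once per mating, which matches the idea of drawing problematic spouses more than once.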

The drawings produced by the duplicate transformation and Walker II's algorithm³ satisfy all five aesthetics, as illustrated in Figure 2 for a simulated pedigree. Furthermore, the node-positioning algorithm is able to take into account the individual node widths, a useful feature if you wish to print additional textual information such as medical history below the nodes. The symbols and line art in the figure follow the recommendations by Bennet et al,⁸ although the unconventional visualisation algorithm together with the five aesthetics makes some of the rules inapplicable.

Figure 2

A highly inbred random pedigree. The node legend is overlaid for compactness.
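To show how individual node widths enter the positioning step, here is a deliberately simplified stand-in for Walker II's algorithm. It packs the bounding boxes of sibling subtrees side by side and centres each parent over its first and last child. Unlike the contour-based method it does not slide subtrees into each other's free space, so the result is wider than necessary. The `Tree` record, the `gap` parameter and the requirement of unique names are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    name: str                # assumed unique within the tree
    width: float = 1.0       # room needed for the symbol and its text
    children: list = field(default_factory=list)


def layout(node, gap=0.5):
    """Return (offsets, left, right) with the node itself at x = 0.

    offsets maps each name to (x, depth); left and right are the edges
    of the bounding box of the whole subtree.
    """
    offsets = {node.name: (0.0, 0)}
    left, right = -node.width / 2, node.width / 2
    if not node.children:
        return offsets, left, right

    placed = []              # (offsets of a child subtree, its shift)
    cursor = 0.0             # left edge available to the next child box
    for child in node.children:
        sub, lo, hi = layout(child, gap)
        shift = cursor - lo  # move the box so that it starts at cursor
        placed.append((sub, shift))
        cursor = shift + hi + gap

    # Centre the parent over its first and last child (Aesthetic 4).
    mid = (placed[0][1] + placed[-1][1]) / 2
    for sub, shift in placed:
        for name, (x, depth) in sub.items():
            offsets[name] = (x + shift - mid, depth + 1)

    left = min(left, -mid)
    right = max(right, cursor - gap - mid)
    return offsets, left, right
```

For a root with two leaf children of unit width, the children end up at x = -0.75 and x = 0.75. Widening one child to 3 units pushes the pair 2.5 units apart, which is half of each width plus the gap.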

Discussion

Duplicate transformation is, to our knowledge, the easiest way of making full-size drawings of very large and complex pedigrees. The aesthetic quality is also competitive for simpler graphs and, owing to the improvement by Buchheim et al,⁹ the node positioning can be done in linear time, clearly outperforming other methods. For these reasons, the approach is ideal for situations such as interactive applications that may have strict restrictions on response time.

The program CraneFoot is currently being used in active research by the Finnish Diabetic Nephropathy Study (FinnDiane), and as the pedigree visualisation module of the data management system BCOS by Biocomputing Platforms Ltd. It is designed for large sets of pedigrees, with special attention to usability and reliability. The program also gives detailed reports on erroneous topology and provides an intuitive way of extracting visualisation information from pedigree files. To facilitate easy viewing, the results are collected in a single PostScript document with a table of contents. Further development includes a graphical user interface, more choices over the node positioning algorithm and new visual features.