debates 17 December 1998

DANIEL FISHER

In week three of this debate, Forey presents an argument that purports to be a reductio ad absurdum for methods that would use stratigraphic data in the principal phase of hypothesis selection in phylogenetic analysis. In week four, Pearson responds to elements of this argument, but I will take a different tack. I believe Forey's argument has several problems, but, with respect to stratocladistics and likelihood methods in particular, it mischaracterizes how these methods interact with phylogenetic hypotheses.

Forey's thought experiment ... duplicated

Forey asks us to consider a sequence of single occurrences at successive stratigraphic levels: in upward succession, an irregular urchin, three levels with regular urchins only, and a final level with an irregular urchin. Forey says, "Literal reading would put these together in a simple ancestor-descendant lineage (or perhaps a hypothetical lineage with five cladogenetic events) and explain the reinvention of a regular echinoid as iterative evolution ...."

I would have been uncertain whether Forey meant stratocladistics to be included as one of the methods with a propensity toward "literal reading", because I do not think of it this way at all. However, Forey clarifies this in a review [1] that refers to stratocladistics [2], where he suggests a similar "thought experiment." This time we are to consider "a series of successively younger strata 1-4 containing in 1, a trilobite, 2, a brachiopod, 3, a foraminiferan, 4, a fish." In Forey's opinion, "Stratigraphic congruence would lead to the conclusion (trilobite(brachiopod(foram, fish))) ...." This reinforces the point that a cladistic issue is involved, whether or not ancestor-descendant hypotheses are being entertained.

Is this an accurate characterization of the operation of either stratocladistics or likelihood methods involving stratigraphic data? Unfortunately, no.

I have previously distinguished between "propositional" and "evaluative" approaches to phylogenetic inference [3], the former consisting of methods geared simply toward articulating those interpretations considered best, and the latter consisting of methods that look at competing hypotheses and measure the fit of data to any or all of them. Stratophenetics, as currently implemented, may be accurately characterized as a propositional method, but stratocladistics and likelihood methods are explicitly evaluative. In this, they of course resemble cladistics. Evaluative methods rank competing phylogenetic hypotheses and thus are 'interested in' the entire array of alternatives. It may thus be misleading to identify methods too closely with any single hypothesis, even one that ranks first according to that method.

Beyond this, the particular hypothesis that Forey identifies is not accurately attributable to stratocladistics. Stratocladistics involves a step in which the fit of stratigraphic data to competing hypotheses is considered, but it does not stop there. It goes on to evaluate the support (or refutation) provided to hypotheses by both stratigraphic and morphologic data. In the echinoid example, the only morphologic data Forey mentions pertain to symmetry – regular versus irregular. In the trilobite-to-fish example, no characters are given, but our common knowledge of comparative anatomy provides many. It is this background knowledge that makes Forey's hypothesis seem so wrong, and stratocladistics would concur. Likelihood methods likewise need not exclude morphology [4,5].
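
The distinction between fit to stratigraphic data alone and fit to all the data can be illustrated with a minimal sketch. It assumes, for illustration only, that each competing hypothesis carries a count of morphologic debt and a count of stratigraphic debt, and that the two are combined by a simple unweighted sum. The hypothesis labels and every number below are hypothetical and are not drawn from Forey's examples or from any real analysis.

```python
from typing import Callable, List, NamedTuple


class Hypothesis(NamedTuple):
    """A competing phylogenetic hypothesis with hypothetical debt counts."""
    name: str
    morph_debt: int  # hypothetical units of morphologic debt
    strat_debt: int  # hypothetical units of stratigraphic debt


def total_debt(h: Hypothesis) -> int:
    # Assumption: both data sources contribute to one unweighted total.
    return h.morph_debt + h.strat_debt


def rank(hyps: List[Hypothesis],
         key: Callable[[Hypothesis], int]) -> List[Hypothesis]:
    # An evaluative method orders the whole array of alternatives
    # (lowest debt first) rather than proposing a single answer.
    return sorted(hyps, key=key)


# Illustrative values only.
HYPOTHESES = [
    Hypothesis("A: follows order of occurrence", morph_debt=12, strat_debt=0),
    Hypothesis("B: follows comparative anatomy", morph_debt=0, strat_debt=3),
    Hypothesis("C: intermediate arrangement", morph_debt=5, strat_debt=2),
]

by_strat_only = rank(HYPOTHESES, key=lambda h: h.strat_debt)
by_all_data = rank(HYPOTHESES, key=total_debt)

if __name__ == "__main__":
    print("Best fit, stratigraphy alone:", by_strat_only[0].name)
    print("Best fit, all data:", by_all_data[0].name)
```

With these made-up values, the hypothesis that ranks first on stratigraphic data alone ranks last once morphologic debt is added, which is the sense in which an intermediate step need not dictate the final resolution.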

Perhaps Forey only means that an intermediate stage in the operation of the stratigraphically informed methods appears to recommend a hypothesis that differs from the final resolution, but what is wrong with this? Cladistics accepts this behaviour from morphologic characters. Moreover, if we take seriously the "thought experiment" Forey has proposed, we must be clear on what his stratigraphic data mean. In stratocladistics, before coding this pattern, I would have to be satisfied that appropriate preservational settings for discovery of each group were available within each time interval, or be prepared to forgive units of stratigraphic debt that could be attributed to violations of this requirement. Coding the data according to Forey's description would imply that each interval had been investigated, and that no trace of any relevant group had turned up beyond those noted, in their successive positions.

Interpreted in this context, we begin to see that stratigraphic data summarize a substantive pattern that is not so easily set aside as might at first appear. If our experience matched Forey's model, we would wonder why. Even so, stratocladistics would not insist on "literal reading," by which Forey seems to mean reading stratigraphy and ignoring morphology. It only insists on reading all the data in hand – that is, not indulging in literal reading of morphology while ignoring stratigraphy. Again, likelihood methods share this openness.

Return to reality

Although I am pleased to deal with Forey's example on its own terms, it is instructive to note how it departs from the real nature of the fossil record, where multiple forms typically coexist within each interval. In such circumstances, considering stratigraphic data alone yields multiple best-fit phylogenetic hypotheses, making "literal reading" a much more diffuse proposition. This multiplicity persists among the less-than-best-fit hypotheses, without in any sense neutralizing the differences between better and worse fit to data.

As before, we need not stop at evaluating only one source of data. Not uncommonly, incorporation of additional characters in a conventional cladistic analysis yields a more highly resolved solution at the intersection of the solution sets of characters considered separately. A comparable result often emerges when treating character data in conjunction with stratigraphic data. Whether the best-fit hypothesis or hypotheses considering all the data are among the best-fit hypotheses considering only part of the data is an issue of congruence, and cladists usually disapprove of congruence contrived through selective attention to data.

The test of time

Forey is appropriately concerned about how we will test a hypothesis founded (presumably even in part) on stratigraphic data. He is quick to suggest "[c]ladistic analysis of morphological and/or molecular data" as "the obvious way." Stratocladistics welcomes such tests, but extends the invitation likewise to testing by recovery of new samples from intervals of time intermediate between ones already analyzed. Stratocladistics' stance on this matter leaves it open to correction from either data source, while conventional cladistics depends solely on character state distributions.

Forey's position guards tenaciously against giving undue credence to patterns of occurrence that might reflect only biases in preservation within certain stratigraphic intervals. However, taphonomic studies can test for such biases without recourse to phylogenetic hypotheses. Moreover, intermediate intervals at an appropriate scale typically represent their own stratigraphic and taphonomic setting, giving them a degree of independence from occurrences above or below.

The troublesome possibility for conventional cladistics is that patterns of character state change may be subject to some common influence that gives them less information content than we normally attribute to them. If we were really dealing with a case of iterative evolution, would conventional cladistics recognize that the parity and/or independence of characters might be compromised? Its capacity for self-correction is limited to finding new characters or new taxa (without treating their stratigraphic position), and is ultimately dependent on the same process assumptions that drive all of conventional cladistics [2].

Forey asserts that "[i]f stratigraphic data ... is used at the outset to construct the phylogeny, ... then it cannot be used to 'test' the phylogeny." But a "test" is not a test unless its outcome has the capacity to change the ranking of hypotheses. Cladists recognize new characters as having this capacity, yet they do not exclude these characters from subsequent analyses. The new information these characters contribute retains its value in the context of a refined hypothesis chosen with reference to all the data. If the test value of characters is not compromised by letting them participate in the analysis, neither is the test value of stratigraphic data.

Yes, time is an "arbiter", but if we limit it to calibrating hypotheses based entirely on other data, or selecting only among hypotheses judged most parsimonious under other data, we arbitrarily restrict its domain.

Daniel C. Fisher
Museum of Paleontology and Department of Geological Sciences, University of Michigan, USA


References

  1. Forey, P. Review of Interpreting the hierarchy of nature. Journal of Vertebrate Paleontology 15, 861-863 (1995).
  2. Fisher, D.C. Stratocladistics: morphological and temporal patterns and their relation to phylogenetic process. in Interpreting the hierarchy of nature (Grande, L. & Rieppel, O. eds) 133-171 (Academic Press, San Diego, 1994).
  3. Fisher, D.C. Phylogenetic analysis and its application in evolutionary paleobiology. in Analytical paleobiology (Gilinsky, N.I. & Signor, P.W. eds) 103-121 (Paleontological Society Short Courses in Paleontology No. 4, University of Tennessee, Knoxville, 1991).
  4. Huelsenbeck, J.P. & Rannala, B. Maximum likelihood estimation of phylogeny using stratigraphic data. Paleobiology 23, 174-180 (1997).
  5. Wagner, P.J. A likelihood approach for estimating phylogenetic relationships among fossil taxa. Paleobiology 24, 430-449 (1998).

I thank T.K. Baumiller, D.L. Fox, P.D. Gingerich, and L.R. Leighton for comments.



Nature © Macmillan Publishers Ltd 1998. Registered No. 785998 England.