Abstract
The current approach in numerical taxonomy is directed towards the so-called “minimum-variance” solution, for which it is argued that a population should be partitioned into cluster subsets by minimizing the total within-group variation. Several classification methods have been compared [1] and shown to possess related variance constraints, and a case has been made [1–3] for suggesting that such methods are not ideally suited to the taxonomic problem of resolving “natural” classes. Implicit in the minimum-variance approach is the concept that a cluster should have no significant overall variance or spread, and this implies that in the case of a unimodal swarm the distribution should be split into an arbitrary number of compact sections. By contrast, Forgey has argued [2, 3] that for a “natural” classification, clusters should correspond to data modes, and there can only be as many classes as there are distinct modes. No variance constraint is implied, or should be induced, for when a mode is elongated rather than spherical the distribution merely reflects some internal factor of variation for the corresponding class. Such factors will be present to some extent, depending on data transformations and the quality of the selected character set, and therefore a subsequent variable search is necessary to discover the hidden constant characteristics of the class. Furthermore, those characters which are non-constant for a cluster mode may be inter-correlated, suggesting that the original character choice was poor, and in such cases the consideration of correlations, ratio variables and regression coefficients is indicated. Forgey interprets [2, 3] a data mode as a continuous dense swarm of points, separated from other such modes by either empty space or a scattering of “noise” data.
It has been suggested that “noise” data usually result from sampling errors, and while this is true, they can also be interpreted as those natural phenomena associated with the intersecting tails of disjoint continuous distributions. We can therefore expect a “natural” cluster to exhibit a dense centre (of any shape) which is surrounded by a haze or cloud of points, and the problem is to isolate the dense centres irrespective of this interference.
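The idea of isolating dense centres from a surrounding haze can be illustrated with a small sketch. It is not the procedure of this article, whose details are not given in the text above; it is one simple density rule chosen for illustration. The function names and the parameters `radius` and `min_neighbours` are hypothetical: a point is treated as dense when at least `min_neighbours` other points lie within `radius` of it, and dense points within `radius` of one another are linked into a group.

```python
import math

def dense_points(points, radius, min_neighbours):
    """Indices of points having at least min_neighbours other points within radius."""
    dense = []
    for i, p in enumerate(points):
        count = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if count >= min_neighbours:
            dense.append(i)
    return dense

def group_dense(points, dense, radius):
    """Link dense points lying within radius of one another into connected groups."""
    groups = []
    unvisited = set(dense)
    while unvisited:
        seed = unvisited.pop()
        group = {seed}
        frontier = [seed]
        while frontier:
            current = frontier.pop()
            near = {
                j for j in unvisited
                if math.dist(points[current], points[j]) <= radius
            }
            unvisited -= near
            group |= near
            frontier.extend(near)
        groups.append(sorted(group))
    return sorted(groups)

# Toy data (invented for illustration): two dense centres plus two stray points.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense centre A
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # dense centre B
    (2.5, 2.5), (8.0, 0.0),                          # scattered "noise"
]

dense = dense_points(points, radius=0.5, min_neighbours=3)
groups = group_dense(points, dense, radius=0.5)
print(groups)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

In this toy case the two stray points are excluded because they have no close neighbours, and the number of groups is set by the data rather than fixed in advance. No constraint is placed on the shape or spread of a group, so an elongated swarm would remain a single group provided it stays continuous at the chosen `radius`.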
References
1. Wishart, D., Proc. St Andrews Coll. in Numerical Taxonomy (1968).
2. Forgey, E. W., Amer. Psychol. Assoc. Meetings, Los Angeles (1964).
3. Forgey, E. W., AAAS-Biometric Soc. Meetings, Calif. (1965).
4. Williams, W. T., Lambert, J. M., and Lance, G. N., J. Ecol., 54, 427 (1966).
5. Lance, G. N., and Williams, W. T., Comp. J., 9, 373 (1967).
6. Sneath, P. H. A., Comp. J., 8, 383 (1966).
7. Wishart, D., A Fortran II Programme for Numerical Classification (St Andrews, 1968).
Cite this article
WISHART, D. Numerical Classification Method for deriving Natural Classes. Nature 221, 97–98 (1969). https://doi.org/10.1038/221097a0