Skip to main content

Thank you for visiting You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

Predicting nutrient content of ray-finned fishes using phylogenetic information


Human food and nutrition security is dependent on marine ecosystems threatened by overfishing, climate change, and other processes. The consequences on human nutritional status are uncertain, in part because current methods of analyzing fish nutrient content are expensive. Here, we evaluate the possibility of predicting nutrient content of ray-finned fishes using existing phylogenetic and life history information. We focus on nutrients for which fish are important sources: protein, total fat, omega-3 and omega-6 fatty acids, iron, zinc, vitamin A, vitamin B12, and vitamin D. Our results show that life history traits are weak predictors of species nutrient content, but phylogenetic relatedness is associated with similar nutrient profiles. Further, we develop a method for predicting the nutrient content of 7500+ species based on phylogenetic relationships to species with known nutrient content. Our approach is a cost-effective means for estimating potential changes in human nutrient intake associated with altered access to ray-finned fishes.


Human activity is rapidly transforming marine ecosystems worldwide. Total catch appears to be declining in fisheries around the world, largely due to intense industrial pressure1. Climate change-driven increases in sea temperature threaten to alter the abundance and distribution of hundreds of economically and nutritionally important species2,3,4,5, and are leading to mass coral bleaching6. Ocean acidification, caused by greater absorption of atmospheric carbon dioxide, is inflicting severe damage on biologically rich coral reef habitats7. Coastal development, pollution, and other anthropogenic forces are impacting nearly every marine ecosystem in the world8.

The consequences of these changes for human health, and especially for nutrient intake, are likely to be severe because fish provide critical nutrients essential to human nutrition, including iron, zinc, vitamin A, vitamin B12, omega-3 and omega-6 fatty acids, and others9,10,11. In many societies, seafood is the foundation for healthy diets, and its decline presents a significant risk in destabilizing food and nutrition security12. The consumption of fish is associated with a wide range of health benefits, including the prevention of various non-communicable diseases and the promotion of cognitive development13,14. Ray-finned fish—the class Actinopterygii—are particularly important to low-income populations, comprising 80.6% of the total global tonnage of subsistence marine capture fish and seafood in 201015.

A lack of information about the nutrient composition of fish species, however, hampers quantification of the nutritional threat to human populations of reduced consumption of wild-harvested fish. Measuring nutrient content is expensive and, as a result, nutrient analyses rarely capture the full breadth of vitamins, minerals, and macronutrients relevant to nutrition. This is exacerbated by the need to collect multiple samples of each species to assess variability across individuals, as well across sub-populations of species. This information gap prevents the design of rational fisheries management strategies and nutritional interventions to optimize public health outcomes in the face of rapidly changing marine conditions.

In this paper, we investigate the possibility that phylogenetic relatedness and life history information explain variation in the nutrient content of key fish species in the most commercially and nutritionally important class, Actinopterygii (ray-finned fishes). To our knowledge, this is the first time that such an approach has been explored. We then use the results of this analysis to develop a method for using shared phylogenetic history as a means of predicting the nutrient content of fish whose nutrient information has not yet been assessed. Such predictions are especially critical in regions of the world with known nutrient deficiencies—which are often the same areas where laboratory capacity is often limited. For fish specifically, recent years have seen an emerging desire among policymakers to design fisheries management and aquaculture development interventions with the specific goal of enhancing nutritional security. We note that these methods, if successful, could be used for a broad range of terrestrial animal and plant species as well—wild and domestic—as well as subspecies, varieties, and breeds.

Our results indicate that life history traits predict species nutrient content only weakly; larger datasets and/or use of missing-data imputation techniques may strengthen these models. We find, however, that phylogenetic relatedness is associated with similar nutrient profiles. This finding allows us to create a method for predicting the nutrient content of 7500+ Actinopterygii species based on phylogenetic relationships to species with known nutrient content.


Interspecific variability in nutrient content and life history

We sourced nutrient content information for 371 species in the class Actinopterygii (see Supplementary Table 1 for full list of species, and see Methods section and refs.16,17,18,19,20,21,22,23,24,25,26,27,28 for data sources). Supplementary Table 2 summarizes the number of species with data, as well as the observed range by nutrient. The dataset includes 26 orders, 126 families, and 279 genera; about 42% of species are in the order Perciformes (perch-like fishes). The set of 371 species represents over half of all global capture fisheries by weight, with all 22 of the world’s most harvested marine fish species included29.

We used life history traits and phylogenetic relatedness to predict protein, total fat, omega-3 and omega-6 fatty acids, iron, zinc, vitamin A, vitamin B12, and vitamin D content in Actinopterygii. These nutrients are critical for human nutrition and are generally present in relatively high concentrations in seafood—although, as we describe below, not all species are rich sources of the all the above nutrients. As predictors of nutrient content, we initially considered six continuous and two categorical life history traits describing body size, trophic level, habitat characteristics, and geographic range (Table 1).

Table 1 Nutrient and life history traits used in the analysis*

Summaries of nutrient content across species indicate that ray-finned fishes make substantial contributions to human nutrition, particularly for low-income coastal populations (Supplementary Tables 2, 3). Of species with available information for a given nutrient, 98% are sources or rich sources of protein, 94% rich sources of vitamin D, and 81% rich sources of vitamin B12. In addition, 13% of species are either sources or rich sources of iron, 14% sources or rich sources of zinc, and 10% sources or rich sources of Vitamin A (see Supplementary Table 1 for data by species and threshold designations of source and rich source by nutrient).

We see a great deal of variation across ray-finned fish species for which we have nutrient information (Supplementary Table 3). In particular, the ranges of minimum and maximum depth, habitat preferences, and latitudinal range are especially wide. Nearly 40% of fish are demersal, but significant fractions are benthopelagic, pelagic, or reef-associated. The chosen set is divided roughly equally into tropical, subtropical, and temperate species. As calculated by trophic level, the mean Actinopterygii species in this dataset is a carnivore, and some are apex predators, but many species are also herbivores, zooplanktivores, detrivores, or omnivores.

Correlations among nutrients and life history variables

We first examined the relationship between nutrient content and life history variables by estimating evolutionary correlations (i.e., the correlations between evolutionary changes inferred using observations among species and their phylogenetic relationships; see Methods section). For this analysis (and all regression models described below), we used the ray-finned fish phylogeny reconstructed in Rabosky et al.30. Due to sample size restrictions, we exclude omega-3 and omega-6 fatty acids from the correlation matrix below; however, correlations with fatty acids included are given in Supplementary Table 4.

We found the expected associations between pairs of life history variables (top left corner of Fig. 1, lightly shaded). Maximum depth, maximum fish length, and trophic level are moderately associated with each other, suggesting that larger fish tend to live at greater depths and consume a diet higher on the food chain. The a and b parameters describing species’ mass-length scaling relationships are well-correlated to each other, but not to other life history traits.

Fig. 1

Evolutionary correlation matrix of key variables, log-transformed. Colored cells indicate correlations significant at p < 0.1 (Pearson’s r). Lightly shaded top left corner shows correlations between life history variables; darkly shaded bottom right corner shows correlations between nutrient variables. Unshaded top right corner shows correlations between life history and nutrient variables. The variables 'min depth' and 'max depth' refer to habitat preferences. 'Max length' refers to maximum reported length of the species. The variables 'a_lw' and 'b_lw' refer to the scalar and exponential parameters in the length-weight relationship equation W = aLb, wherein W is weight and L is length. 'Trophic level' refers to the weighted mean of the trophic level of the diet of the species. See Table 1 for more details

Patterns are also apparent within the set of nutrients (bottom right corner of Fig. 1, darkly shaded). Fish with more protein tend to contain less Vitamin A and total fat. Fish that are high in total fat also tend to be good sources of Vitamins A, B12, and D. Fish high in iron also tend to be good sources of zinc and all included vitamins. Vitamin A and Vitamin D are positively associated, and the latter is correlated with Vitamin B12. Overall, this set of correlations suggests that specific sets of nutrients tend to cluster together. Given that nutritional analyses, due to cost considerations, tend to focus on a small set of nutrients, such observed correlations may be important for roughly inferring levels of unmeasured nutrients.

Finally, we see that life history parameters, controlling for phylogeny, vary in their association to key nutrients (top right of Fig. 1, unshaded). Fish living at greater oceanic depths contain more total fat and Vitamin A, but less zinc. Longer fish contain proportionally less zinc and Vitamin B12, but more Vitamin A; smaller fish, often overlooked in conservation, livelihoods, and nutrition programs, may be richer sources of key micronutrients31.

These bivariate associations were tested in multiple regression models fit by phylogenetic least squares (PGLS) in a manner that accounts for phylogenetic signal in species residuals16,32,33,34.

These phylogenetic regressions find that life history parameters are either insignificant predictors of nutrient content or have small effects (Table 2). Longer fish at higher trophic levels but closer to the surface contain more protein, although the coefficient magnitudes are relatively small: 100 cm greater length is associated with 0.69 g more protein; a one-unit higher trophic level is associated with 1.29 g more protein per 100 g of fish weight; and 100 meters greater (i.e., deeper) maximum depth is associated with 0.14 g less protein. Maximum depth is also a significant predictor for total fat, omega-3 and omega-6 fatty acids, and vitamin A, but regression coefficients are small in these cases as well. Other relationships are insignificant. Overall, regression models suggest that life history variables are generally poor predictors of fish nutrient content when accounting for phylogenetic dependence in the data.

Table 2 Phylogenetic generalized least squares models predicting nutrient content

Predicting nutrient content of unmeasured species

All nutrients exhibit significant phylogenetic signal; that is, covariance among species in nutrient content tends to be moderately to strongly associated with phylogenetic relatedness. Estimates of Pagel’s λ vary17,18 are highest (indicating strong phylogenetic signal) in zinc, vitamin D, and vitamin A, intermediate in total fat, omega-3 and omega-6 fatty acids, and vitamin B12, and weaker in protein and iron (Table 3).

Table 3 Unconditional phylogenetic signal (Pagel’s λ) of each nutrient across subsets of Actinopterygii

The phylogenetic structuring of nutrient content can be seen in the heatmap of Fig. 2, which shows some clustering of nutrient values among closely related species, although the location of these clusters varies depending on nutrient. For example, various species of the genus Thunnus contain high levels of protein, as indicated by the dark red region near the top of the phylogeny in the first column of the heatmap. We also observe such clustering at close phylogenetic distance but across genera; for example, Engraulis encrasicolus, Tribolodon hakoensis, Misgurnus anguillicaudatus, and Anguilla japonicus all fall within the top 10% of zinc levels in the database, as indicated by the dark red region near the bottom of the phylogeny in the fourth column of the heatmap; other nearby species also contain relatively large amounts of zinc.

Fig. 2

Phylogenetic relationships and nutrient content, expressed in percentiles. Eighty-four of the most commercially and nutritionally important of the 371 species in the database described above are shown. The time-scale is shown as millions of years ago (mya). The color scales indicate percentiles of nutrient content; see Supplementary Table 1 for ranges of each variable. Common names and pictorial representations of select groups of fishes are shown. All images are taken from Wikipedia, and are Public Domain images

Given our findings that nutrient content is weakly predicted by life history but tends to be phylogenetically structured, we developed a procedure for predicting nutrient content in unmeasured species based on their phylogenetic relationships to measured species and empirical estimates of phylogenetic signal (see Methods for details). We used a jackknifing procedure to determine that this method has acceptable accuracy. Observed nutrient values fell within the prediction 95% confidence intervals for at least 89.7% of cases across all variables (see Fig. 3, as well as Supplementary Fig. 1 for residual plots) and median deviations between predicted and observed values were within 40% of the among-species standard deviation. We note that the method can lead to wide confidence intervals when long phylogenetic branches separate unmeasured from measured species. This prediction method also compares favorably to predictions based on life history regression models; average accuracy and median differences are lower in the phylogenetic signal-only method (Table 4). Moreover, the phylogenetic prediction method can be readily applied to any species whose phylogenetic relationship to measured species has been estimated. Regression-based prediction requires this information in addition to information on species life history, although imputation methods may help to predict missing observations from a combination of other life history predictors and phylogenetic information19.

Fig. 3

Predicted versus observed values for selected nutrients using phylogenetic signal only. Points in green fall within the 95% prediction interval; points in red are outside the interval. The diagonal represents points where predictions are equal to observed values

Table 4 Confidence interval coverage and median deviation of nutrient prediction using phylogenetic signal only and phylogenetic signal plus life history information

Because our phylogenetic method performed well in validation tests, we used it to estimate missing nutrient values for all commercially valuable species in our original 371 species, as well as for all 7500+ unmeasured species in the Actinopterygii phylogeny (see Supplementary Data Files 210)14. Moreover, we used this approach to identify ray-finned fish families representing promising sources of key nutrients (Table 5). Some of the families we identify contain no or relatively few measured species, but our phylogenetic method predicts their potential to contain species rich in specific nutrients.

Table 5 Ray-finned fish families identified as potentially nutrient-rich based on phylogenetic prediction and observation of nutrient content


At least one-tenth of the global population is vulnerable to future deficiencies in micronutrient intake linked to the degradation of marine ecosystems7. Unsustainable fishing practices, the effects of climate change on sea temperature and dissolved oxygen content, and pollution are all major threats to a wide range of commonly consumed fish stocks. Quantifying the exact magnitude of these risks depends on knowing the nutrient content of key species—information that is extremely expensive to obtain, given the cost of directly evaluating nutrient composition data in a laboratory setting. In this study, we develop a modeling framework for using phylogenetic and life history information to predict nutrient content in fish species of the economically and nutritionally important class Actinopterygii (ray-finned fishes). We focus on nine nutrients—protein, total fat, omega-3 and omega-6 fatty acids, iron, zinc, vitamin A, vitamin D, and vitamin B12. Fish can be important sources of each of these, especially among coastal populations in the developing world.

We find that most nutrients exhibit substantial phylogenetic signal—i.e., covariance among species in nutrient values is proportional to their shared evolutionary history—and a model based solely on phylogenetic relationships and empirical estimates of phylogenetic signal provide reasonable predictions of species nutrient content. This phylogenetic signal-based model provides better predictive ability than the multiple regression model (Table 4), although the latter accounts for both phylogenetic signal and life history trait values. This seemingly counterintuitive result can be explained by two factors. First, the sample size for estimating parameters of phylogenetic signal-based model is greater than that of phylogenetic regression model for all nutrients (see Tables 2, 3). Some of the increased predictive capacity of the phylogenetic signal model may be a consequence of better parameter estimation associated with a larger sample of species. Second, even though some life history variables (such as maximum depth) are significant predictors of some nutrients, the magnitude of effect is relatively small (Table 2). Therefore, inclusion of life history variables in nutrient predictions, as in the phylogenetic regression models, does not seem to offset the loss of sample size imposed by the additional requirement of having data on species life history. Note that we do not intend for our results to suggest that life history is unimportant in determining fish nutrient content, but rather that based on available data, incorporation of life history information does not improve predictions of species nutrient values. With the addition of more life history data and/or use of missing-data imputation techniques, phylogenetic regression models may ultimately yield better predictions than the phylogenetic signal model.

In addition, the method developed in this paper provides the basis for targeting new potential sources of key nutrients. The tendency for variation in nutrients to be phylogenetically structured means that groups of species that are closely related to nutrient-rich species are also reasonably likely to be nutrient-rich. As noted above, we explored this possibility by looking across predicted and observed nutrient values for more than 7000 ray-finned fish species, and identified families that are likely to contain species rich in key nutrients. The nutritional quality of the members of some of these families is well known; scombrids, carangids, salmonids, and clupeids make large contributions to existing fisheries. However, we also identify as potentially nutrient-rich several fish families for which species nutrient content of their species is mostly unknown (Table 5). For example, our predictions suggest that exocoetids (flying fish), sphyraenids (barracuda), and centropomids (snook) may be high in protein even though protein content has been measured for no more than one species in any of these families. In addition, even though no species of muraenid, ophichthid, congrid, or channichthyid has been assayed for total fat content, their close phylogenetic relationships to species high in fats lead us to suggest that they may be good sources of this nutrient. To be clear, we recommend that our predictions be used to inform the selection of potentially nutrient-rich species for nutrient content measurement, and we caution against use of our predictions as strong statements on which species should be exploited for nutrition.

Moreover, our identification of nutrient-rich families is limited in some important ways. First, we could only predict nutrient content in species represented in the Rabosky et al. phylogeny, and so some families may be over- or under-represented in our predictions14. For example, some nutrient-rich families may not appear in Table 5 if they have relatively few species in the phylogeny. Conversely, some families may appear to contain a large number of nutrient-rich species because they are well sampled in the phylogeny and because they contain one or more species known to be high in particular nutrients or are closely related to species of high nutritional quality. In fact, our phylogenetic prediction method is unable to make distinctions between unmeasured species of the same phylogenetic distance from measured relatives, though new nutritional data would likely help resolve this issue. Finally, we acknowledge that our predictions are contingent on Rabosky et al.’s estimation of phylogenetic relationships among these ray-finned fishes14, whose relation to the true phylogeny is unknown. Future work will refine these predictions as increasing phylogenetic information becomes available. Despite these limitations, we view our approach as a first step toward more precise estimates of the consequences of marine ecosystem transformation.

We identify several priorities in taking the conclusions of this research forward. First, improving the phylogenies of other classes important for human nutrition, including other marine seafood species as well as freshwater fish, is essential. Second, increasing the size of the nutrient validation sample—that is, measuring the nutrient content of more species directly—would enable a better utilization of the large amount of life history information available. It is possible, perhaps even likely, that the weak associations between life history variables and nutrient content in this study is due to an overly restrictive validation sample. We recommend expanding the sample by prioritizing nutrient assays in species that belong to families identified as potentially nutrient-rich, as presented in Table 5. Third—approaching the small sample problem from the opposite direction—use of techniques like that of Thorson et al. (2017) to predict life history parameters would allow more existing nutrient information to be utilized19.

We utilize several methods in this paper: evolutionary correlations, Brownian-motion models of trait evolution, and phylogenetic least squares regression. We believe that the availability of data is the key determinant of which of these methods is preferable in predicting fish nutrient content. In the absence of life history information, evolutionary correlations and phylogenetic signal-based models provide a rough sense of which nutrients are likely to be found together in Actinopterygii. As phylogenies expand and nutrient and life history datasets grow, life history regression models will likely improve on the predictive power of these methods. Overall, we believe that the combination of phylogenetic and life history information holds great potential for inexpensively inferring the nutrient content of a wide range of wild foods, and thereby quantifying the impacts of ecosystem transformation on human food and nutrition security.


Data sources

We utilized fish nutrient information from food composition tables from Argentina, Bangladesh, Cambodia, Canada, Chile, Denmark, Japan, Finland, Gambia, Greece, Italy, Japan, Malaysia, South Korea, Peru, Turkey, and the United States; 89% of the species we examined have entries in either the Korea, Japan, Bangladesh, or Argentina tables20,21,22,23,24,25,26,27,28,35,36,37,38. The life history data come from FishBase, a publicly accessible database containing taxonomic, biological, ecological, life history, and human use information on finfishes39. We utilized the species-level, time-calibrated, multi-gene phylogeny for Actinopterygii assembled by Rabosky et al.14. Although other large-scale species-level phylogenetic trees are also available for this group40,41, we chose the Rabosky et al. phylogeny because it maximizes overlap with nutrient content data. Rabosky et al.’s reconstruction of the phylogeny was based on maximum likelihood phylogenetic analysis of 13 genes with subsequent smoothing of among-lineage substitution rate heterogeneity and divergence date estimation using fossil calibrations. The original phylogeny includes over 7500 ray-finned fish species. As described in more detail further below, we used this phylogeny in combination with life history predictor variables and estimates of phylogenetic signal (i.e., the tendency for phenotypic similarity among species to be proportional to their time of shared evolution) to predict nutrient content in species lacking such information.

Evolutionary correlations

We first explored bivariate evolutionary correlations between all pairwise combinations of life history and nutrient variables. Here, evolutionary correlations are the Pearson correlation coefficients describing associations between evolutionary changes in pairs of variables42,43. We estimated these correlations for log-transformed species values given the Rabosky et al. phylogeny and a Brownian motion model with Pagel’s λ correction for the degree of phylogenetic signal using the ratematrix function of the geiger package44 for R45.

Multivariable regression models

To evaluate the capacity to predict nutrient content from life history information, we fit multiple regression models using phylogenetic least squares (PGLS) as implemented in the R package phylolm46. Because species with a recent common ancestor are expected to have more similar trait values, the assumptions of ordinary least squares (OLS) are violated47. PGLS accounts for phylogenetic non-independence using shared ancestry as inverse weights on the elements of the residual variance-covariance matrix used in the model15,29,30,31. Thus, in matrix notation, a coefficient β is estimated as follows:

$$\beta = \left( {{\mathbf{X}}^\prime V^{ - 1}{\mathbf{X}}} \right)^{ - 1}{\mathbf{X}}^\prime V^{ - 1}{\mathbf{y}}$$

where X is a matrix of n species and m+1 life history trait values (with an intercept column); X′ is the transpose of X; y is a vector of values for a given nutrient; and V is the residual variance-covariance matrix. Under OLS assumptions, the diagonal elements of V are the variance of the residuals; the residuals are expected to be normally distributed with mean zero. Under PGLS, the residual covariances are computed using branch lengths from each member of a species pair to their common ancestor. Instead of assuming covariance among species to be proportional to their time of shared evolution, we used a maximum likelihood procedure to identify a scalar of V−1, called Pagel’s λ, that best fits the observed data; λ = 0, the OLS approximation, would indicate trait evolution completely independent of phylogeny, λ = 1 would indicate that shared evolutionary history (i.e., shared phylogenetic branch length) predicts phenotypic similarity among species, and intermediate λ (0 < λ < 1) discounts the phylogenetic dependence of trait values among species30,32,33. The best-fit inverse matrix V−1 is then used to estimate the predictor coefficient β30,31,48.

Phylogenetic prediction

The objective of the PGLS modeling exercise is to advance towards methods for predicting the nutrient content of species for which information is not available. Although our analysis shows that life history parameters are generally weak predictors of all nutrients, phylogeny is more promising. We thus predict nutrient content of unmeasured Actinopterygii fish species using nutrient data for measured species and the Rabosky et al. phylogenetic tree relating both measured and unmeasured species14, assuming a λ-corrected Brownian motion model of evolution. Under this model, character change along any branch of the phylogeny is a normally distributed random variate with expected value equal to zero and variance proportional to the length of the branch24. Because the character has no tendency to increase or decrease under the model, the predicted value for an unmeasured species is equal to the estimated state for the most recent common ancestor (MRCA) between that species and the measured species most closely related to it. The state of this ancestor is evaluated as the branch length-weighted mean of the estimated character states in the next node deeper and the next more recent node that connect measured species (though, in some cases, the more recent node is a tip species). We estimated the states of these nodes using the phytools function fastAnc49.

Although the expectation under the Brownian model is that the character will remain unchanged in the unmeasured species from the time of its split from the measured species most closely related to it, the uncertainty in the predicted value increases with time since this split. Therefore, the variance around each predicted value incorporates the branch length (t) subtending the unmeasured species, and is evaluated as

$$t \times \sigma ^2$$

where σ2 is the time-independent variance of the Brownian motion process24. The parameter σ2 was estimated from the data and phylogeny using methods described in the next paragraph. We constructed 95% confidence intervals around each predicted value as \(\pm 1.96 \times \sqrt {t \times \sigma ^2} .\)

Our predictions also incorporate empirical estimates of the degree of phylogenetic signal in measured species nutrient values. We fit Pagel’s λ separately for each nutrient using the phylosig function of the phytools package for R22,32,33,44, which returns the maximum likelihood estimates for both σ2 and λ. We then transformed branch lengths of the phylogeny by the empirical estimates for λ using the rescale function of the R package geiger;21 λ = 1 recovers the original phylogeny, λ < 1 compresses internal branches relative to terminals, and λ = 0 is a star phylogeny. We then estimated ancestral character states, predicted states for unmeasured species, and their confidence intervals on this branch length-transformed tree.


We used a jackknifing approach to validate the method for phylogenetic prediction of nutrient content. For each nutrient, we removed one measured species from the dataset and applied the method to predict the species’ nutrient value and calculate its prediction interval. We then determined whether the prediction interval contained the measured value. If it does not, we label that trial as an error, and then calculated the error rate for each nutrient over all measured species. We also calculated the median percent deviation between predicted and measured values as a proportion of the standard deviation of the sample of all species nutrient values. We then compared error rates and accuracy of phylogeny-only predictions to predictions based on the best-fit multiple regression models. These latter predictions were obtained following the method of Garland and Ives50, and were restricted to the sample of species for which life history data were available.

Code availability

All code used in the analysis is made available as Supplementary Data 1115. SD11 is the script for evolutionary correlations. SD12 is the script for phylogenetic least squares. SD13 is the script for estimating the phylogenetic signal of nutrient variables. SD14 is the script for validating the predications under the lambda model. SD15 is the script for validating predictions under the lambda plus phylogenetic regression model.

Data availability

We declare all data used in the above analysis, tables, and figures to be available within the paper and in the supplementary information files. Supplementary Data 1 is the fish life history and nutrient content database. Phylogenetically predicted nutrient values are labeled as Supplementary Data 2–10.


  1. 1.

    Pauly, D. & Zeller, D. Catch reconstructions reveal that global marine fisheries catches are higher than reported and declining. Nat. Commun. 7, 10244 (2016).

    ADS  Article  PubMed  PubMed Central  CAS  Google Scholar 

  2. 2.

    Cheung, W. W. L. et al. Large-scale redistribution of maximum fisheries catch potential in the global ocean under climate change. Glob. Change Biol. 16, 24–35 (2009).

    ADS  Article  Google Scholar 

  3. 3.

    Cheung, W. W. L., Watson, R. & Pauly, D. Signature of ocean warming in global fisheries catch. Nature 497, 365–368 (2013).

    ADS  Article  PubMed  CAS  Google Scholar 

  4. 4.

    Perry, A. L., Low, P. J., Ellis, J. R. & Reynolds, J. D. Climate change and distribution shifts in marine fishes. Science 308, 1912–1915 (2005).

    ADS  Article  PubMed  CAS  Google Scholar 

  5. 5.

    Pinsky, M. L., Worm, B., Fogarty, M. J., Sarmiento, J. L. & Levin, S. A. Marine taxa track local climate velocities. Science 341, 1239–1242 (2013).

    ADS  Article  PubMed  CAS  Google Scholar 

  6. 6.

    Donner, S. D., Rickbeil, G. J. & Heron, S. F. A new, high-resolution global mass coral bleaching database. PLoS ONE 12, e0175490 (2017).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  7. 7.

    Hoegh-Guldberg, O. et al. Coral reefs under rapidclimate change and ocean acidification. Science 318, 1737–1742 (2007).

    ADS  Article  PubMed  CAS  Google Scholar 

  8. 8.

    Halpern, B. S. et al. A global map of human impact on marine ecosystems. Science 319, 948–952 (2008).

    ADS  Article  PubMed  CAS  Google Scholar 

  9. 9.

    Golden, C. D. et al. Fall in fish catch threatens human health. Nature 534, 317–320 (2016).

    ADS  Article  PubMed  Google Scholar 

  10. 10.

    Thilsted, S. H. et al. Sustaining healthy diets: The role of capture fisheries and aquaculture for improving nutrition in the post-2015 era. Food Policy 61, 126–131 (2016).

    Article  Google Scholar 

  11. 11.

    Béné, C. et al. Contribution of fisheries and aquaculture to food security and poverty reduction: assessing the current evidence. World Dev. 79, 177–196 (2016).

    Article  Google Scholar 

  12. 12.

    Cisneros-Montemayor, A. M., Pauly, D., Weatherdon, L. V. & Ota, Y. A global estimate of seafood consumption by Coastal Indigenous Peoples. PLoS ONE 11, e0166681 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  13. 13.

    Mozaffarian, D. & Rimm, E. B. Fish intake, contaminants, and human health: evaluating the risks and benefits. JAMA 296, 1885–1899 (2006).

    Article  PubMed  CAS  Google Scholar 

  14. 14.

    Lund, E. K. Health benefits of seafood; is it just the fatty acids? Food Chem. 140, 413–420 (2013).

    Article  PubMed  CAS  Google Scholar 

  15. 15.

    Pauly D. & Zeller D. (eds). Sea Around Us Concepts, Design And Data (2018) Accessed 10 April 2018.

  16. 16.

    Revell, L. J. Phylogenetic signal and linear regression on species data. Methods Ecol. Evol. 1, 319–329 (2010).

    Article  Google Scholar 

  17. 17.

    Pagel, M. Inferring evolutionary processes from phylogenies. Zool. Scr. 26, 331–348 (1997).

    Article  Google Scholar 

  18. 18.

    Pagel, M. Inferring the historical patterns of biological evolution. Nature 401, 877–884 (1999).

    ADS  Article  PubMed  CAS  Google Scholar 

  19. 19.

    Thorson, J. T., Munch, S. B., Cope, J. M. & Gao, J. Predicting life history parameters for all fishes worldwide. Ecol. Appl. 27, 2262–2276 (2017).

    Article  PubMed  Google Scholar 

  20. 20.

    Universidad Nacional de Luján. Tabla de composicion de alimentos (2010) Accessed 25 July 2017.

  21. 21.

    Shaheen, N. et al. Food Composition Tables for Bangladesh. (University of Dhaka, Dhaka, 2013).

  22. 22.

    Seyha, S. & Kuong, K. Food Composition Table for Cambodia (Ministry of Agriculture, Forestry and Fisheries of Cambodia, Cambodia, 2013).

  23. 23.

    Centre for Indigenous Peoples’ Nutrition and Environment (McGill University). Traditional Food Composition Nutribase (2005) Accessed 25 July 2017.

  24. 24.

    Rittenschober, D., Stadlmayr, B., Nowak, V., Du, J. & Charrondiere, U. R. Report on the development of the FAO/INFOODS user database for fish and shellfish (uFiSh)—challenges and possible solutions. Food Chem. 193, 112–120 (2016).

    Article  PubMed  CAS  Google Scholar 

  25. 25.

    Prynne, C. J. & Paul, A. A. Food Composition Table for Use in The Gambia (MRC Human Nutrition Research, Cambridge, UK, 2011).

    Google Scholar 

  26. 26.

    Office for Resources, Policy Division Science and Technology Policy Bureau, Japan. Standard Tables of Food Composition in Japan, Seventh Revised edn. (2015) Accessed 25 July 2017.

  27. 27.

    Kim, J. C. et al. 8th Revision Standard Food Composition Table. (Department of Agrofood Resources, NAAS, RDA, Korea, 2013).

    Google Scholar 

  28. 28.

    Korkalo, L., Hauta-Alus, H. & Mutanen, M. Food Composition Tables for Mozambique, Version 2 (Department of Food and Environmental Sciences, University of Helsinki, Finland, 2011).

    Google Scholar 

  29. 29.

    Food and Agricultural Organization (FAO) of the United Nations. FishStatJ: Software for Fishery Statistical Time Series. In: FAO Fisheries and Aquaculture Department [online] (2017) Accessed 13 July 2017.

  30. 30.

    Rabosky, D. L. et al. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat. Commun. 4, 1958 (2013).

    Article  PubMed  CAS  Google Scholar 

  31. 31.

    Bogard, J. R. et al. Nutrient composition of important fish species in Bangladesh and potential contribution to recommended nutrient intakes. J. Food Compos. Anal. 42, 120–133 (2015).

    Article  CAS  Google Scholar 

  32. 32.

    Grafen, A. The phylogenetic regression. Philos. Trans. R. Soc. Lond. B Biol. Sci. 326, 119–157 (1989).

    ADS  Article  PubMed  CAS  Google Scholar 

  33. 33.

    Martins, E. P. & Hansen, T. F. Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data. Am. Nat. 149, 646–667 (1997).

    Article  Google Scholar 

  34. 34.

    Freckleton, R. P., Harvey, P. H. & Pagel, M. Phylogenetic analysis and comparative data: a test and review of evidence. Am. Nat. 160, 712–726 (2002).

    Article  PubMed  CAS  Google Scholar 

  35. 35.

    Sivakumaran, S., Martell, S. and Huffman, L. The Concise New Zealand Food Composition Tables, 9th edn. (The New Zealand Institute for Plant & Food Research Limited and Ministry of Health, NewZealand, 2012).

  36. 36.

    US Department of Agriculture, Agricultural Research Service, Nutrient Data Laboratory. USDA National Nutrient Database for Standard Reference, Release 28 (Slightly revised) (2016) Accessed 25 July 2017.

  37. 37.

    Stadlmayr, B. et al. Table de composition des aliments d’Afrique de l’Ouest. Food and Agriculture Organization of the United Nations (2012) Accessed 25 July 2017.

  38. 38.

    Dignan, C., Burlingame, B., Kumar, S., and Aalbersberg, W. The Pacific Islands Food Composition Tables, Second edition. Food and Agriculture Organization of the United Nations (2004) Accessed 25 July 2017.

  39. 39.

    Froese, R. & Pauly, D., eds. FishBase (2016) www.fishbase. org. Accessed 13 July 2017.

  40. 40.

    Near, T. J. et al. Resolution of ray-finned fish phylogeny and timing of diversification. Proc. Natl Acad. Sci. 109, 13698–13703 (2013).

    ADS  Article  Google Scholar 

  41. 41.

    Betancur-R, R. et al. The tree of life and a new classification of bony fishes. PLoS Curr. 5 (2013).

  42. 42.

    Revell, L. J. & Harmon, L. J. Testing quantitative genetic hypotheses about the evolutionary rate matrix for continuous characters. Evolut. Ecol. Res. 10, 311–321 (2008).

    Google Scholar 

  43. 43.

    Revell, L. J. & Collar, D. C. Phylogenetic analysis of the evolutionary correlation using likelihood. Evolution 63, 1090–1100 (2009).

    Article  PubMed  Google Scholar 

  44. 44.

    Pennell, M. W. et al. Geiger v2. 0: an expanded suite of methods for fitting macroevolutionary models to phylogenetic trees. Bioinformatics 30, 2216–2218 (2014).

  45. 45.

    R Core Team. R: A language and environment for statistical computing. 2018 (R Foundation for Statistical Computing, Vienna, Austria, 2017)

    Google Scholar 

  46. 46.

    Ho, L. S. T. & Ane, C. A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Syst. Biol. 63, 397–408 (2014).

    Article  PubMed  Google Scholar 

  47. 47.

    Felsenstein, J. Phylogenies and the comparative method. Am. Nat. 125, 1–15 (1985).

    Article  Google Scholar 

  48. 48.

    Symonds, M.R.E & S. P. Blomberg. A primer on phylogenetic generalised least squares.In Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology 105–130 (Springer, Berlin, Heidelberg, 2014).

  49. 49.

    Revell, L. J. Phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).

    Article  Google Scholar 

  50. 50.

    Garland, T. Jr. & Ives, A. R. Using the past to predict the present: confidence intervals for regression equations in phylogenetic comparative methods. Am. Nat. 155, 346–364 (2000).

    Article  PubMed  Google Scholar 

  51. 51.

    United States Department of Agriculture (USDA) Center for Nutrition Policy and Promotion. Dietary Guidelines for Americans. (National Academy Press, Washington, DC, 2010).

    Google Scholar 

  52. 52.

    United States National Library of Medicine (USNLM). MedlinePlus. (2017) Accessed 17 July 2017.

  53. 53.

    Oregon State University (OSU) Linus Pauling Institute. Micronutrient Information Center (2017) Accessed 17 July 2017.

  54. 54.

    Froese, R., Palomares, M. L. D., & Pauly, D. Estimation of Life-History Key Facts. FishBase. (2005) Accessed 12 June 2016.

  55. 55.

    National Academies. Dietary Reference Intakes Tables and Application. National Academies. (2017) Accessed 1 Mar 2018.

Download references


We would like to thank the Wellcome Trust for their financial support through the Our Planet, Our Health program [Grant number: 106864MA].

Author information




B.V. led writing of the manuscript and contributed to the analysis. D.C. led analysis. M.R.S. and S.S.M. created the underlying fish nutrient database. B.L.R. contributed to draft revisions and designed Fig. 2. C.D.G. conceptualized the research idea and overall study design, and all authors contributed to writing and draft revision.

Corresponding author

Correspondence to Bapu Vaitla.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Vaitla, B., Collar, D., Smith, M.R. et al. Predicting nutrient content of ray-finned fishes using phylogenetic information. Nat Commun 9, 3742 (2018).

Download citation

Further reading


By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.


Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing