Validation of Food Compass with a healthy diet, cardiometabolic health, and mortality among U.S. adults, 1999–2018

The Food Compass is a nutrient profiling system (NPS) to characterize the healthfulness of diverse foods, beverages and meals. In a nationally representative cohort of 47,999 U.S. adults, we validated a person’s individual Food Compass Score (i.FCS), ranging from 1 (least healthful) to 100 (most healthful) based on cumulative scores of items consumed, against: (a) the Healthy Eating Index (HEI) 2015; (b) clinical risk factors and health conditions; and (c) all-cause mortality. Nationally, the mean (SD) of i.FCS was 35.5 (10.9). i.FCS correlated highly with HEI-2015 (R = 0.81). After multivariable adjustment, each one SD (10.9 point) higher i.FCS associated with more favorable BMI (−0.60 kg/m2 [−0.70,−0.51]), systolic blood pressure (−0.69 mmHg [−0.91,−0.48]), diastolic blood pressure (−0.49 mmHg [−0.66,−0.32]), LDL-C (−2.01 mg/dl [−2.63,−1.40]), HDL-C (1.65 mg/dl [1.44,1.85]), HbA1c (−0.02% [−0.03,−0.01]), and fasting plasma glucose (−0.44 mg/dl [−0.74,−0.15]); lower prevalence of metabolic syndrome (OR = 0.85 [0.82,0.88]), CVD (0.92 [0.88,0.96]), cancer (0.95 [0.91,0.99]), and lung disease (0.92 [0.88,0.96]); and higher prevalence of optimal cardiometabolic health (1.24 [1.16,1.32]). i.FCS also associated with lower all-cause mortality (HR = 0.93 [0.89,0.96]). Findings were similar by age, sex, race/ethnicity, education, income, and BMI. These findings support validity of Food Compass as a tool to guide public health and private sector strategies to identify and encourage healthier eating.


Introduction
Line 51: Please include references after the first sentence.
Line 55: "industry reformulation" or is meant "food reformulation by the industry"?

Methods
Line 93: Please include the upper age range.
Line 109: Which criteria were applied to assess completeness and validity of the diet record data?
Line 116: Why are alcoholic beverages excluded from the score?
Line 149: The manuscript mentions 54 attributes, but Table S1 indicates that certain attributes are not used in the score due to unavailability of data in the various composition tables. Trans fats are marked as not included, but appear in the various tables. Text S1 may also need checking for similar reasons.
Line 158: It would be good to know more about the motivation for these cut-off values (<30, 30-70, >70), what the reasoning behind these has been. Also, when taking this into a public health setting, it would be good to see how the frequency distribution of these three categories is in relation to the i.FCS consumed. In other words, is it about consuming more >70 foods, less <30 foods, or the relative proportion between them?
Line 166: Please clarify that the individual domain scores also range from 0-100.
Line 203: Considering the HEI and the FCS have the same range, apart from showing the scatterplot and correlation between the two scores (figure 2), it would be nice to see the agreement in the form of a Bland-Altman plot.

Results
Line 285: It would be good to have Table S7 in the main text, since it shows the associations for the separate domains and emphasizes their relative importance for the presented risk factors/conditions. Did the results in S7 change much after adjustment for daily energy intake? Considering the domain "additives", are the observed associations mainly driven by the attribute "added sugar"? Also, protein and fiber are positively associated with prevalent diabetes, may this be due to reverse causality?
Line 274/294/313: Considering the wide age range (20-85+) in the sample, do the authors consider the associations between the i.FCS and various outcomes to be the same across the whole sample?
Table S1: Could an additional column (as first column) be added which indicates 'included' or 'excluded' from the FCS, thereby shifting any attributes listed but not included in the i.FCS into the row for 'excluded'.
Table S2: What is meant by 'total' behind some of the nutrients? Is this food and supplement sources combined? Or might this be e.g. a-TE instead of mg tocopherol. The footnote describes 'original score', what is meant by this?
Table S6: Adjustment for daily energy intake and survey cycle minimally changed the associations observed, apart from triglycerides and LDL-C. Just curious whether there is an explanation.

Discussion
Line 406: I agree this is a big advantage. However, the ability of the FCS to capture the various settings in which people eat (home, take-away, restaurant etc) requires a dietary assessment method (and extensive food list) which equally needs to be able to capture and differentiate on these aspects.
Line 417: Even though the foods in the food composition table can be precisely categorized, the i.FCS may still suffer from systematic error due to misreporting and omission of food items by participants. Could the authors refer to the implications for this in the section on limitations?
Line 420: A very good point to make!
*Food composition tables may not contain all the details required to calculate the FCS/i.FCS. How do the authors see the extension of their work to other countries/settings? What would be their advice if certain attributes are missing?
*For the nutrient scores, 25% of the RDAs for men 19-50 years were used, overestimating the domains for certain age-sex groups (e.g. iron, calcium). For intention of use as FoP labelling, this systematic error seems preferable to detail and clutter, but as i.FCS, would a more 'individualized' score taking age and sex into account when it comes to RDA be preferred perhaps? Have alternative algorithms been tried?
Boston, MA 02111
June 10, 2022
Response to Reviewer Comments
Thank you to the reviewers for their thoughtful feedback on our original submission, "Validation of the Food Compass Score with a healthy diet, cardiometabolic health, and mortality among U.S. adults, 1999-2018." We are pleased to submit our revised manuscript, with careful changes in response to these comments. Our detailed responses to each comment are summarized below.
In addition, we have made the following analytical updates: (1) inclusion of National Death Index mortality data through January 1, 2019 in the prospective survival analyses which increases the number of deaths in the analysis from 4953 to 7481 [data was recently released by NDI]; (2) modification from gram-weighted to energy-weighted scoring of a product's constituent ingredients for NOVA classification -one of the attributes of the Food Compass algorithm.
We believe the manuscript is strengthened as a result of these suggested revisions and analytical updates, and hope this is now suitable for publication.
Thank you for these positive comments. We agree this manuscript is well suited for the Nature family of journals, and believe Nature Communications will make an excellent home.
2. In reviewing that previous paper I said that: 'The authors say [page 21 line 474] 'important next steps include testing and validation against health outcomes' There are now to my knowledge at least four nutrient profile models that have been validated against health outcomes to varying extents (the UK FSA/Ofcom model, Nutri-Score, HSR and ONQI). I would strongly suggest that the Food Compass Score should be validated against health outcomes in comparison with at least one of these other models. This should probably be done before publication of this paper and the results published in parallel with it'.
The present paper by O'Hearn et al does present the results of testing the Food Compass nutrient profile model against health outcomes. This is significant progress. But the authors have yet to do this in comparison with any other nutrient profile model that has been validated against health outcomes and has been shown to be valid (in particular the Nutri-Score algorithm). I recommend that the authors state this clearly in this paper.
Until then I do not think the authors are justified in claiming superiority for their model over other nutrient profile models and indeed other food classification systems. Their claim that 'the Food Compass could be a possible unified standard NPS' [Line 396] for efforts to develop unified front-of-pack (FOP) label in the EU and elsewhere is both unjustified and unrealistic and should be qualified.
Shown previously superiority for identifying refined starches, healthy fats -Mozaffarian 2021 "could be a possible"
In the first paper on Food Compass, we assessed three major domains of validity: content validity, by assessing nutrients, food ingredients, and other characteristics of public health concern; face validity, by assessing FCS for 8,032 foods and beverages reported in NHANES/FNDDS 2015-16; and convergent and discriminant validity, from comparisons to NOVA food processing classification, Health Star Rating, and Nutri-Score. Here, we present detailed findings on the fourth domain of validity: construct validity, evaluating Food Compass against population diet quality indices and health outcomes. 13,14 With these aims, we first applied specific product scores to a person's diet, deriving an energy-weighted individual Food Compass Score (i.FCS). The present investigation then performed several important analyses, including assessment against the Healthy Eating Index, assessment against health risk factors and prevalent disease conditions, and assessment against risk of total mortality. We also performed analyses of these complex health endpoints for each of the nine domains of the Food Compass separately, providing novel information on how the different domains relate to disease risk.
We agree that separate, future analyses could compare different nutrient profiling systems in relation to health risk. These are not simple analyses, for example requiring formal consideration of potential differences in discrimination (e.g., area under the ROC) and calibration (e.g., risk reclassification), and methods for testing statistical significance of these differences. Such analyses are beyond the scope of the present manuscript, which aims to validate the Food Compass construct validity, against a healthy eating pattern, against health risk factors and prevalent disease conditions, and against risk of total mortality.
We also note that requiring formal comparisons against other NPS, a complex separate undertaking, in an initial validation paper is inconsistent with any prior published validation paper on NPS, including several on which this Reviewer is a co-author. In the Table below, we present the 15 prior published validation studies of NPS against health outcomes. None of these compared the association of multiple NPS against health outcomes, but focused on careful validation of one NPS of interest against health outcomes.
As recommended by the Reviewer, we now state in the Discussion that neither we nor others have compared the validity of different NPS against health outcomes, and that this is an important area for future work, including assessing potential comparative differences of different NPS in different nations and subpopulations.
We do not state in this paper that the FCS is "superior" to other systems. We do believe this NPS, with its characterization of refined starch, healthy fats, processing, phenolics, additives, nutrient ratios, and multiple food ingredients, "could be" a possible unified standard NPS. It has many strengths, and this should be considered for this purpose.
Table. Prior published investigations assessing the association between individual dietary scores derived from a nutrient profiling system and health outcomes. None of these compared the NPS of interest against another NPS. That is an area for potential future work.
The authors show, for example, that 'the i.FCS for individuals was highly correlated with HEI-2015 (R=0.82) (Figure 2)' but that: 'Individually, the 9 domains of the FCS were not as highly correlated, ranging from 0.16 for i.Specific Lipids to 0.71 for i.Nutrient Ratios (Figure S2)' [Line 268]. So doesn't this mean that using the i.Nutrient Ratios domain for the i.FCS gives you almost all you need to give you a 'good enough' ranking of diets at least compared with the HEI 2015? What then is the added benefit of including, for example, the i.Specific Lipids domain in the total i.FCS and by extrapolation the Food Compass Score for foods and can this be quantified?
This matters because the authors continue, in my view, to fail to justify the inclusion of 54 components in the FCS when most other nutrient profile models manage with much fewer and also because the information needed to score many of the components of the Food Compass nutrient profile model is unavailable in many food composition databases other than that used by the authors in this paper. E.g. trans fatty acid information is commonly unavailable in many food composition databases and so too is the information necessary to score foods for degree of processing (here according to the NOVA classification system).
Thank you for your suggestion to include a more thorough discussion around the i.Domain Scores. We agree that the ability to disaggregate Food Compass into these component domains, and assess their respective association with health outcomes, is an important feature of this analysis and NPS.
Importantly, the FCS domains were not selected or constructed based on the associations of these domains with healthy diet patterns or with health outcomes: that would represent a circular argument, where an NPS is over-fitted based on findings on health associations in the same dataset, and then termed "valid." As described in our original paper, relevant attributes and scoring principles were developed based on assessment of more than 100 reported NPS, including 7 widely-used NPS of diverse origins; a systematic review of national and international dietary guidelines; nutrient requirements for health claims; and assessment of nutrients, ingredients, and other food characteristics linked to health outcomes. The domain system of Food Compass was also purposefully designed so that no one domain or attribute drives the overall product score-a mechanism to prevent industry reformulators from "gaming" the system.
Of the i.Domain Scores, i.Nutrient Ratios has the highest correlation with the HEI 2015. However, the correlation with HEI 2015 is just one important parameter we assessed -the extent to which each i.Domain Score tracks with health risk factors, prevalent disease conditions, and mortality is also relevant.
The finding that the strongest association with HEI, and also the strongest associations with health outcomes overall, is for i.FCS, rather than any one domain, corroborates our development methods that assessing multiple domain scores in aggregate provides the strongest overall metric.
Finally, it's important to note that the primary purpose of Food Compass (and thus its associated domains) is not to derive an individual person's energy-weighted dietary score, and associate that with health outcomes. Rather, Food Compass was designed as a measure of the healthfulness of specific individual food and beverage products, and also mixed meals. When a consumer, business, or policy maker is making a decision about an individual product (e.g., to purchase, to reformulate, or to apply a positive or negative policy), a high degree of discrimination is required among different products. In our prior report, we assessed the face/content validity, convergent validity, and discriminatory validity of Food Compass across 8032 products representative of the US food supply. Food Compass demonstrated excellent content validity -the distribution of scores for major and sub-food categories as well as individual product scores were consistent with what would be expected for more healthful vs. less healthful foods. In addition, Food Compass was able to discriminate healthfulness within food categories, as well as within the NOVA classification for food processing. When compared to other commonly accepted NPS (Nutri-Score and Health Star Rating), Food Compass generally scored more consistently with the latest nutrition science -i.e., scoring refined grains products less favorably (lower scores) and scoring products with healthy fats (plant oils, seafood, etc.) more favorably than by Nutri-Score or Health Star Rating. Food Compass also does so using a single, consistent algorithm across all food and beverage types, rather than having differing, subjective scoring principles and thresholds, selected in a posthoc fashion, across different food and beverage categories. 
Thus, this present new paper is not aiming to reconstruct the Food Compass, which has shown previous validity as a measure of product healthfulness, based on associations of the energy-weighted i.FCS for persons with health outcomes, but to provide further validation. We have done so. The previous validation findings, coupled with the present predictive validation findings of significant associations with a healthy dietary pattern, a range of health risk factors and prevalent diseases, and total mortality support the validity of Food Compass as a measure of product healthfulness.
We have added further comments on these important points in the Discussion: "The ability to score food and beverage items across 9 component domains is an important feature of the FCS and current investigation. Among the 9 domains, applied to individual items and then energy-weighted to persons' diets, none by itself had as strong an association with HEI, nor with the full range of health risk factors, prevalent diseases, and total mortality, as the overall i.FCS. At the same time, each i.Domain Score had some associations with HEI as well as varying strengths of associations with different health endpoints. These findings support the complementary nature of the different domains, each providing supportive and somewhat distinct information, as components of the Food Compass. The growing evidence for complex, heterogeneous effects of diet on health (e.g., via the gut microbiota, epigenetics, etc.) 45 further supports the utility of a more holistic, multi-domain measure of healthfulness of foods and beverages. In addition, the domain structure ensures that no one domain or attribute can drive the overall product score-a mechanism which prevents mis-scoring based on extreme values of a single or few nutrients as well as industry "gaming" the system by fortifying food products with isolated vitamins. This holistic, domain-based scoring also permits a single, consistent scoring algorithm across all food and beverage products, including mixed meals -in contrast to all other major current NPS, which require subjective grouping of foods and beverages into multiple categories that use differing algorithms and/or scoring thresholds and have trouble scoring mixed meals that contain ingredients across two or more of these categories. 15"

Specific comments
Line 74. 'The Food Compass is a novel NPS that incorporates a range of 54 protective and risk factor nutrients, ingredients, bioactives, additives and processing attributes, grouped across 9 domains, and selected and weighted based on the latest evidence about their relative healthfulness. The Food Compass Score (FCS), ranging from 1 (least healthful) to 100 (most healthful), enables a more consistent and universal algorithm across all food and beverage categories; permits similar scoring for mixed dishes and meals; and improves convergent and discriminatory validity of product scoring compared to other major NPS.'
The comparative novelty of the Food Compass system does indeed mainly lie in the number and type of food attributes that are encompassed but in my view provides no more consistent scoring or convergent validity than other nutrient profile models. Furthermore I continue to fail to see why having more components than other nutrient profile models 'enables a more consistent and universal algorithm across all food and beverage categories' as the authors claim. But I have said this before in my review of the previous paper by Mozaffarian et al and the criticism seems to have gone unheeded.
The combination of its attributes and domains, together with scoring per 100 kcal, does allow Food Compass to have a consistent and universal algorithm, per definition, as the same algorithm is applied across all food, beverage, and meal product categories. Other NPS such as Nutri-Score and HSR require multiple separate scoring thresholds and algorithms for different food categories, to make the scores "make sense" based on what would be expected to be a healthy choice. We did not develop Nutri-Score or HSR, so we cannot say which aspects of these scoring systems required the subjective categorization of different foods and beverages into differing scoring algorithms and cutpoints. We can say that Food Compass, with its design, uses a single algorithm for all products as well as mixed dishes and meals.
Line 162 'To extend FCS for specific products to an individual's overall diet, the scores for each item reported in a person's diet were summed, weighted by its percent contribution to that person's total energy intake and then used to calculate an individual's Food Compass Score (i.FCS).' There are many different ways of converting nutrient profile scores for foods into diet quality scores for individuals [e.g. see https://discovery.ucl.ac.uk/id/eprint/1369569/1/Thesis-GMASSET-UCL-2012-FINAL.pdf]. It would be useful if the authors could explain their choice at greater length (e.g. as they say [Line 430] 'Use of energy-weighting to calculate i.FCS provides lower weighting to certain foods with lower calories per servings such as fruits and vegetables'). It seems to me to be likely that the particular way nutrient profile model scores should be converted to diet quality scores depends on the use to which the nutrient profile model is put (e.g. it will be different, say, if the model is for front-of-pack labelling purposes as opposed to restrictions on the marketing of foods to children).
We acknowledge that NPS can be converted to dietary scores for individuals in different ways. Energy (kcal) is the most natural unit to combine foods in a comparable way, e.g. comparing 100 kcal of one product to another, for assessing health outcomes. Use of other metrics, like weight, can be highly problematic, due to bias from water weight, fiber, and fat. For example, for NPS that use grams, 100g of pork fat contains 638 kcal, while 100g of apple contains 52 kcal. Comparing the nutrients provided by these two foods using 100g, pork fat provides far more of many nutrients, leading to absurd claims that pork fat is the 7th healthiest food in the world (https://www.bbc.com/future/article/20180126-the-100-most-nutritious-foods, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118697).
Because the original Food Compass Score for products is based on kcal, combining these product scores based on kcal is most appropriate for generating a person's dietary score.
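As a minimal sketch of the energy weighting described above, the Python below combines product scores into a single dietary score, weighting each item by its share of total energy intake. The function names and the three-item diet are hypothetical and chosen only for illustration; the sketch is not the analysis code used for the manuscript. The second helper restates the pork fat versus apple comparison as grams per 100 kcal, using the 638 and 52 kcal per 100g figures quoted above.

```python
def energy_weighted_score(items):
    """Combine product scores into one dietary score.

    items: list of (product_score, kcal_consumed) pairs for one person.
    Each product score is weighted by that item's share of total energy.
    """
    total_kcal = sum(kcal for _, kcal in items)
    if total_kcal <= 0:
        raise ValueError("total energy intake must be positive")
    return sum(score * kcal / total_kcal for score, kcal in items)


def grams_per_100_kcal(kcal_per_100g):
    """Weight of food (in grams) that supplies 100 kcal."""
    return 100.0 * 100.0 / kcal_per_100g


# Hypothetical one-day diet: (product score, kcal consumed). Values are invented.
diet = [(90.0, 100.0), (40.0, 300.0), (10.0, 600.0)]
ifcs = energy_weighted_score(diet)

# Energy densities quoted in the response: pork fat 638 kcal/100g, apple 52 kcal/100g.
pork_fat_grams = grams_per_100_kcal(638)
apple_grams = grams_per_100_kcal(52)

print(ifcs, round(pork_fat_grams, 1), round(apple_grams, 1))
```

In this illustration the low-scoring item supplies 60% of energy and so dominates the result, and 100 kcal corresponds to roughly 16 g of pork fat but roughly 192 g of apple, which is why a per-100g comparison favors the energy-dense food.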
We have added a sentence to the Discussion: "Both the FCS and i.FCS were weighted by energy (kcal), rather than weight or portion size, which prevents bias from differences in water weight, fiber, or fat content. For example, in a recent NPS that evaluated foods per 100g, pork fat was identified as the 7th most nutritious food in the world, 56,57 ignoring the obvious problem that 100g of pork fat (and its nutrients) contains 638 kcal, compared for example to 100g of apple which contains only 52 kcal (due to higher water content), creating a flawed comparison of the total nutrients provided by each 100g portion."
Line 416 'in contrast, several validation studies of other NPS (40-43) utilized food frequency questionnaires that only allow scoring of major food categories.' This is a selective comparison. There are many other validation studies of nutrient profile models that have used dietary data from multiple 24-hour recalls (e.g. validation studies of the NutriScore algorithm involving the NutriNet Sante cohort and the SUVIMAX study; e.g. https://academic.oup.com/eurpub/article/30/Supplement_5/ckaa166.1285/5914738?login=true).
To be clear, we were not claiming that all prior NPS validation studies utilized FFQs, but that "several" did. We have clarified the statement: "While some prior NPS validation studies utilized detailed product information from 24 hour recalls, 46-51 several utilized food frequency questionnaires 52-55 …"
Reviewer #2 (Remarks to the Author):
This paper follows on from a publication by the same group in Nature Food in 2021 which described a new nutrient profiling score (Food Compass). This tool (like other NPSs) is aimed at assessing the healthfulness of individual foods. This perspective of individual foods links this NPS (and others) to the issue of front of pack labelling.
The purpose of this new paper is different as it aims to take the Food Compass score for each food eaten by an individual and then to summate it to provide a summary overall score. This is intended to be an indication of the healthfulness of the diet rather than of individual foods. What the authors show in an analysis of NHANES is that this score relates to various risk factors and prevalent disease in a way that one might expect and that it is predictive of overall mortality in a follow up analysis over nearly 17 years in which there were nearly 5000 deaths in the cohort of 38K individuals included in the prospective analysis.
This analysis does suggest that the individual Food Compass Score does have predictive validity for total mortality in a US population.
There are some minor points as below about the analysis but the major question is not about whether the iFCS has predictive validity or not but rather whether that has any utility either for public health, for clinical practice or for the people themselves.
The authors state that "NPS like Food Compass can be one important tool in such interventions"... (to improve diet quality)... "allowing consumers, industry including (food manufacturers, retailers, and restaurants,), investors, schools, hospitals, and worksites, and policy makers to identify and shift toward healthier food and beverage options". Whilst one can understand how front of pack labelling using some form of NPS might influence individual food choices, it is rather difficult to see how the computation of an overall dietary score using the iFCS would have an impact. The argument for how computation of the iFCS might actually be used, by whom and for what purpose, needs to be much clearer. It is not reasonable to merely state the argument for the individual food NPS because that is not what this paper is about. A clearer explanation of the usefulness of an overall diet NPS is required.
Thank you for these comments. We agree with these points. The primary purpose of the i.FCS was not as a new metric for public health or clinical practice, but as a validation tool of the FCS for assessing individual products.
Previous analyses demonstrated that Food Compass has good face validity, convergent validity, and discriminatory validity for individual food and beverage products (Mozaffarian et al. 2021), which could be used for consumer guidance, policymaking, and product reformulation. To further confirm the validity of FCS, we developed i.FCS for predictive validation of construct validity -the association of the scoring system with health outcomes, when applied to a consumer's choices for diet.
We are not proposing that i.FCS be used for other purposes.
The basis for the calculation of the iFCS is both somewhat arbitrary and very complicated. Why the different domains are equally weighted with the exception of three which receive 0. weights is not explained. Why only the top 5 vitamins and minerals and the top 3 specific lipids are included is not self-evident.
The iFCS is calculated directly from the product scores, which are then weighted by the energy content of each product to derive the iFCS. The i.Domain scores of individuals are not separately used to derive the iFCS. Rather, they were separately calculated to explore their associations with the endpoints of this study.
We have clarified in the Methods: "The i.Domain scores were not used to derive the i.FCS, but separately calculated from the product-level domain scores to explore their associations with the endpoints of this investigation."
The correlation with the Healthy Eating Index is 0.82. This does beg the question about whether the iFCS has any advantage over the HEI. It is certainly not easier to compute. Some discussion of their relative merits is warranted.
Please see the comments above. We agree. We are not aiming to replace HEI, but using HEI as a "gold standard" dietary pattern to validate the utility of the FCS for scoring individual food and beverage products. (HEI, of course, cannot score specific food and beverage products.)
The iFCS quintiles appear to be strongly inversely related to total energy intake and there is a weighting for each foodstuff on the basis of their contribution to total energy and it appears at least in some of the analyses that TEI was also included as a covariate. A clear statement about what role TEI plays in this score would be helpful.
This is an excellent point. TEI is not included as a component of either the FCS or the iFCS. Thus, adjusting for TEI as a potential confounder is important. We have clarified this in the Methods text: "Average total energy intake was not included as a component of either the FCS or the i.FCS. Thus, we derived total energy intake from the 24-hour recall data available and adjusted for it continuously as a potential confounder."
There is a lot of missing data in this study. Some variables e.g. fasting glucose, LDL-cholesterol and triglycerides are missing in more than 50% of people. In such a situation of extreme missingness, imputation with whatever method is problematic. There are also some challenges with physical activity which was assessed differently in different time periods. In this case one can understand how the use of the different measures of PA can be used to impute missing values. It is less clear how nutrients that are not assessed at all (as far as I understand it) e.g. total flavonoids, vitamin D and choline can be estimated by imputation.
In NHANES, certain fasting laboratory measures are missing in a significant proportion of participants because the participants were not examined in the fasting state. We included complementary biomarkers of long-term glucose homeostasis (HbA1c) and blood lipid profiles (HDL-C, total cholesterol: HDL ratio) with less missingness; and these findings were generally consistent. We have clarified these points in the Discussion limitations: "Certain fasting biomarker levels (e.g., triglycerides, fasting plasma glucose) were missing for a significant proportion of the NHANES sample, which may have weakened the accuracy of the estimates for these health indicators. However, assessment of complementary biomarkers with less missingness for long-term glucose homeostasis (HbA1c) and blood lipid profiles (HDL-C, total cholesterol: HDL ratio) yielded generally consistent findings."
The methods to assess physical activity changed over time in NHANES from 1999 to 2018. As detailed in the Supplemental Materials, we utilized imputation so that the older PA measures could be more "standardized" to the newer methods, allowing PA to be used as a covariate in the analyses.
Nutritional composition for all foods in NHANES is reported in the USDA's Food and Nutrient Database for Dietary Studies (FNDDS). Because we utilized multiple NHANES cycles, a few nutrients were missing in certain cycles. For example, FNDDS reporting on vitamin D and choline started in 2007-2008 and 2005-2006, respectively. To account for missingness in these attributes for the earlier survey cycles, we first carried values backwards for the same products reported in different cycles; and then used multivariable imputation to estimate the remaining missing values. Similarly, the USDA has a database of total flavonoid values per 100g for reported food and beverage products from 2007-2010. We used total flavonoid values available from this database first, carried forward and backward for the same products reported in different cycles; and then used multivariable imputation to estimate the remaining missing values. Any errors in nutrient imputation would unlikely be systematic with respect to disease outcomes or total mortality, and so this imputation is likely to attenuate findings toward the null. We have added this to the Discussion limitations: "Some nutrients such as Vitamin D, choline, and flavonoids were only available in certain NHANES cycles, requiring imputation in other cycles. Because errors in nutrient imputation would unlikely be systematic with respect to disease outcomes or total mortality, this imputation may have attenuated findings toward the null."
There is an important typo in the description of the means which states that the mean HbA1c in the population is 6%. I think this should read 5.6%.
We have corrected this to 5.6%, thank you Reviewer #3 (Remarks to the Author): This manuscript describes an elaborate food scoring system, which has been extended to assessment of an individual's dietary intake. Such scoring ensures like-with-like comparisons and may therefore be used to estimate food quality at point of purchasing as well as diet as a whole. Here, the authors validate the score by cross-sectional as well as prospective indicators of (ill) health. These associations were observed to be largely significant though relatively small in effect size and have plausible biological mechanisms with the domains of the food compass score. The algorithm behind the food compass is extensive and "not for the faint-hearted". It captures a large range of dietary exposures which otherwise are covered in separate scores. An extensive and elaborate work which emphasizes the complexity and inter-relationships of foods which make up our diet.
Thank you for your interest in our analysis and detailed critical feedback on both the manuscript and supplement. We appreciate your attention to detail and important perspectives.
My comments are divided in two parts. The first part relates to methodology/interpretation, the second part are just suggestions for the text.

Introduction
Line 51: Please include references after the first sentence.
We have added four references now to support this sentence.
Line 55: "industry reformulation" or is meant "food reformulation by the industry"?
Good point. We have clarified in the text that it is "food reformulation by industry".

Methods
Line 93: Please include the upper age range.
The upper age limit is 85 years. We've updated the methods text accordingly.
Line 109: Which criteria were applied to assess completeness and validity of the diet record data?
Line 116: Why are alcoholic beverages excluded from the score?
While alcoholic beverages are sources of calories and nutrients for many individuals, these products affect health through alternative, non-nutritive pathways, outside the scope of Food Compass. Mozaffarian et al. 2021 also describes the exclusion of alcoholic beverages, among other products, from Food Compass scoring. Given the known associations between excess alcohol consumption and poor health, we included alcohol intake (as a percentage of energy) as a covariate in our analyses.
Line 149: The manuscript mentions 54 attributes, but Table S1 indicates that certain attributes are not used in the score due to unavailability of data in the various composition tables. Trans fats are marked as not included, but appear in the various tables. Text S1 may also need checking for similar reasons.
The Food Compass algorithm ideally incorporates 54 attributes. However, due to lack of data in FNDDS, certain attributes were not available. We have updated Text S1 to mention the exclusion of trans fats. We have also added the following sentence to the methods text: "In addition, due to data availability in FNDDS, 7 attributes were excluded from Food Compass scoring (Table S1)."
Line 158: It would be good to know more about the motivation for these cut-off values (<30, 30-70, >70), what the reasoning behind these has been. Also, when taking this into a public health setting, it would be good to see how the frequency distribution of these three categories is in relation to the i.FCS consumed. In other words, is it about consuming more >70 foods, less <30 foods, or the relative proportion between them?
As described in our prior report, Food Compass was applied to 8032 unique foods and beverages from the USDA's Food and Nutrient Database for Dietary Studies. The observed distribution of Food Compass Scores was assessed overall and across 12 major and 44 minor food categories. The 25th and 75th percentiles of the overall food scores were about 30 and 70, respectively. On the basis of the observed ranges overall and within food categories, FCS >= 70 was selected as a reasonable cut-point for foods and beverages to be encouraged; FCS = 31-69, to be consumed in moderation; and FCS <= 30, to be minimized. We have updated the text to clarify that these cut-off values are "based on the observed ranges of the FCS distribution of scored products[cite our prior paper]".
Your question about the frequency distribution of these 3 categories in relation to i.FCS is excellent, and one that we had not previously considered. We have performed additional analyses of the counts of different products with FCS >= 70, FCS 31-69, and FCS <= 30 in each of the three categories of the i.FCS score (<= 30, 31-69, and >= 70). We have also created a visualization of 12 sample NHANES participants with i.FCS scores between 40-48, and the respective %E and number of foods from each FCS category (FCS >= 70, FCS 31-69, and FCS <= 30), to convey the heterogeneity in dietary composition that leads to a given i.FCS diet score. These methods and findings have been added to the manuscript; see Table S5, Figure 2, and the additional manuscript text below.
The distribution (count and contribution to total energy intake [%]) of consumed food and beverage products, based on previously defined product healthfulness thresholds (i.e., FCS≤30 as products to minimize; FCS 31-69 as products to be consumed in moderation; and FCS ≥70 as products to be encouraged) was assessed for all NHANES participants. Examples are shown for 10 participants with mean i.FCS at the U.S. median (35.5 ±1) and consuming at least 10 different products over 2 days of diet recalls. The numbers within each stacked bar graph indicate the count of food and beverage products consumed from that category of FCS score across two days of reported intake; and the color bars represent the percentage energy contribution of food and beverage products from that category of FCS score across two days of reported intake.
[2.1, 10.4]). At the same time, however, for any i.FCS score, there was also substantial heterogeneity in the counts and energy contribution from products of different FCS scores (Figure 2), indicating that different people could arrive at similar overall i.FCS scores in different ways."

Discussion text added: "Notably, different people could arrive at similar i.FCS scores with very different combinations of healthier or less healthy foods, and yet the overall i.FCS was still predictive of a healthy diet pattern, health risk factors, prevalent disease conditions, and total mortality."
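As an illustration of how two diets can reach the same i.FCS by different routes, the following is a minimal sketch, not the authors' code. It assumes the energy-weighted sum described for Line 162 (each item's FCS weighted by its share of total energy, then summed) and the product bands quoted above (FCS <= 30, 31-69, >= 70). The function names, band labels, and the two example diets are hypothetical; non-integer scores between 30 and 31 or between 69 and 70 are treated here as "moderate", which is an assumption.

```python
from collections import defaultdict


def fcs_category(fcs):
    """Map a product's FCS to one of the three bands quoted in the text."""
    if fcs >= 70:
        return "encourage"
    if fcs <= 30:
        return "minimize"
    return "moderate"


def ifcs_and_breakdown(items):
    """Return (i.FCS, item counts per band, % energy per band).

    items: list of (fcs, kcal) pairs for one person's scored items.
    Energy from alcohol is assumed to have been excluded already.
    """
    total_kcal = sum(kcal for _, kcal in items)
    if total_kcal <= 0:
        raise ValueError("total energy must be positive")
    # Energy-weighted sum of the item-level scores.
    ifcs = sum(fcs * kcal / total_kcal for fcs, kcal in items)
    counts = defaultdict(int)
    energy_pct = defaultdict(float)
    for fcs, kcal in items:
        band = fcs_category(fcs)
        counts[band] += 1
        energy_pct[band] += 100.0 * kcal / total_kcal
    return ifcs, dict(counts), dict(energy_pct)


# Hypothetical diets (made-up scores and energies, not NHANES data).
person_a = [(40, 1000), (40, 1000)]  # two mid-scoring items
person_b = [(10, 1000), (70, 1000)]  # one low- and one high-scoring item

for name, diet in (("A", person_a), ("B", person_b)):
    print(name, ifcs_and_breakdown(diet))
```

Both hypothetical diets yield the same i.FCS, although one draws all of its energy from the "moderate" band and the other splits it evenly between "minimize" and "encourage".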
Line 166: Please clarify that the individual domain scores also range from 0-100.
We have updated the text to specify that the individual domain scores also were standardized to a potential range from 1-100 (similar to FCS and i.FCS).
Line 203: Considering the HEI and the FCS have the same range, apart from showing the scatterplot and correlation between the two scores (figure 2), it would be nice to see the agreement in the form of a Bland-Altman plot.
This is an interesting suggestion. Bland-Altman plots are designed to identify systematic differences between two instruments measuring the same thing, e.g., repeated blood pressure measures using two different cuffs. In our analysis, we are not comparing two assays designed to measure the same thing, but comparing i.FCS to a gold standard dietary pattern measure for validation purposes. Thus, we don't think it is appropriate to include this in the analysis.

Results
Line 285: It would be good to have Table S7 in the main text, since it shows the associations for the separate domains and emphasizes their relative importance for the presented risk factors/conditions. Did the results in S7 change much after adjustment for daily energy intake? Considering the domain "additives", are the observed associations mainly driven by the attribute "added sugar"? Also, protein and fiber are positively associated with prevalent diabetes, may this be due to reverse causality?
This is a good suggestion. Given Nature Communications Article instructions allowing for up to 10 display items, we have moved Table S7 to the main manuscript, and it is now Table 3. We found adjusting for energy intake and survey cycle did not appreciably alter the observed associations with cross-sectional health biomarkers and prevalent conditions. Regarding the i.Additives domain: "added sugar" and "nitrites" attributes contribute equally to the i.Additives Domain score. However, given the US food supply has more foods with non-zero added sugar values than non-zero nitrite values, it is reasonable to assume that i.Additives domain scores for many Americans are driven more by intake of foods with added sugar.
To further explore the association between the i.Fiber and Protein domain and prevalent diabetes, we conducted sub-analyses disaggregating the Fiber and Protein attributes and found that Fiber was associated with lower odds of prevalent diabetes, while Protein was associated with higher odds of prevalent diabetes. In meta-analyses of diverse long-term prospective cohort studies, higher protein intake is associated with higher incidence of diabetes; and in randomized trials, higher protein intake drives hepatic de novo lipogenesis (the conversion of excess dietary carbohydrate and protein energy into fat for long-term storage), which is one of the driving pathways for fatty liver, visceral fat accumulation, and insulin resistance. Further interventional studies are required to assess the causal mechanisms driving the association between foods and diets higher in protein intake and increased diabetes risk. We have added these exploratory analyses to the Results, and highlighted these points in the Discussion.

Added results text: "In further post-hoc exploratory analyses disaggregating the i.Fiber and Protein domain into its individual attributes (i.Fiber and i.Protein), i.Fiber was associated with lower prevalence of diabetes, while i.Protein was associated with higher prevalence of diabetes (data not shown)."
Added discussion text: "The observed harmful association between the i.Protein attribute and prevalent diabetes is consistent with meta-analyses of prospective cohort studies which identified a positive association between higher protein intake and higher incidence of diabetes 34; as well as randomized trials where higher protein intake was associated with hepatic de novo lipogenesis 35 - a driving pathway for fatty liver, visceral fat accumulation, and insulin resistance."
Line 274/294/313: Considering the wide age range (20-85+) in the sample, do the authors consider the associations between the i.FCS and various outcomes to be the same across the whole sample?
This is an excellent point. We have now explored the potential for heterogeneity in the association between i.FCS and total mortality by age, as well as by sex, education level, race/ethnicity, and income level. We have included these new analyses in a new Table S10, as well as the Methods and Results. We added the following sentence to the method text: "In exploratory analyses, we investigated the relationship between i.FCS and total mortality in subgroups by age, sex, race/ethnicity, education, and income to assess potential variation (interaction) in the association according to these key sociodemographic factors." We also added the following paragraph to the results text: "Findings were similar across subgroups, with no significant differences in the observed protective associations between i.FCS and mortality (Table S10)."

Figures
Figure 1: What is the reason for the 'unevenness' in the distribution for additives?
The distribution of the i.Additives domain indicates that the energy contribution of the majority of Americans' diets was not from foods with high amounts of nitrites or added sugar. Nitrite-containing foods (i.e. processed meats) are a relatively small proportion of the total US food supply; and there is also little overlap between foods with both high nitrite and high added sugar content, so getting a very low score would not be common.
Figure 2: The Y-axis ends on 00, but this is presumably 100.
Thank you, we have now fixed this figure.
Figure S3: This figure is not referred to in the text. It models the domains individually in their association to all-cause mortality (mutually adjusted for). How are these positive and negative associations interpreted/explained?
In these exploratory analyses, the associations can be interpreted as the risk of mortality given a 1 standard deviation change in each i.Domain Score, after mutually adjusting for other i.Domain Scores, demographics, and lifestyle factors. They suggest that each domain alone is generally insufficient to predict mortality risk once the contributions of the other domains are considered. These domains were neither designed nor intended to be adjusted for each other, but to be added together synergistically. We have removed this exploratory analysis from the paper.

Tables
On various occasions in the S-tables, the zero (?) behind the decimal point is missing.
We have manually updated this output error.
Yes, but as you know, total cholesterol alone is just a screening biomarker; and the clinically predictive and utilized metrics in practice are LDL-C, HDL-C, triglycerides, and the total cholesterol: HDL ratio. Thus, we have elected not to add TC as another blood lipid profile marker.
Table S1: Could an additional column (as first column) be added which indicates 'included' or 'excluded' from the FCS, thereby shifting any attributes listed but not included in the i.FCS into the row for 'excluded'?
This is a helpful comment. We have now included a footnote (§) which specifies which attributes were excluded from this analysis due to data unavailability in FNDDS.
Table S2: What is meant by 'total' behind some of the nutrients? Is this food and supplement sources combined? Or might this be e.g. a-TE instead of mg tocopherol. The footnote describes 'original score', what is meant by this?
'Total' in Table S2 refers to all food sources of that particular vitamin. In the FNDDS database we used for this analysis, total folate (DFE) is reported as well as the disaggregated food folate and folic acid. For vitamin B12 and vitamin E, added vitamin B12 and alpha-tocopherol are also reported. We have now removed the "(total)" behind these micronutrients in Table S2 to avoid confusion.
The first footnote refers to "original score" in the calculation of the final FCS. We've since clarified this to be the "unscaled score" that is calculated from the algorithm, before scaling from 1-100.
Table S6: adjustment for daily energy intake and survey cycle minimally changed the associations observed, apart from triglycerides and LDL-C. Just curious whether there is an explanation.
An interesting point. While the observed associations for triglycerides and LDL-C changed more when the models were adjusted for energy intake and survey cycle, the change was still within the confidence interval of the primary analysis, indicating that the difference between these two models was not statistically significant. We cannot think of a biological rationale for this change, so this (nonsignificant) change in the estimate is likely just a chance finding.

Discussion
Line 406: I agree this is a big advantage. However, the ability of the FCS to capture the various settings in which people eat (home, take-away, restaurant etc) requires a dietary assessment method (and extensive food list) which equally needs to be able to capture and differentiate on these aspects.
For assessing an individual's diet, this is true: 24-hour recalls will be needed. However, the purpose of FCS is not to assess an individual's diet, but to assess the healthfulness of products being sold to consumers. Thus, FCS can be used to score meals in cafeteria menus, restaurants, etc., which other major NPS cannot do. We have edited our discussion to reflect this: "In this national dataset, Food Compass was able to score not only manufactured products but a person's entire diet, including complex home-cooked, cafeteria, and restaurant mixed meals which other major NPS generally cannot do."
Line 417: Even though the foods in the food composition table can be precisely categorized, the i.FCS may still suffer from systematic error due to misreporting and omission of food items by participants. Could the authors refer to the implications for this in the section on limitations?
This is a good point. If such error is random with respect to the outcomes, this will attenuate findings toward the null. If systematic, it will bias results in unpredictable directions. We have added these points to the limitations section: "Misreporting and omission of food items by dietary recall participants was possible. If such error is random with respect to the outcomes, this would attenuate findings toward the null. If systematic, it could bias results in unpredictable directions."
Line 420: A very good point to make!
Thank you. We believe NPS validation studies that used FFQs for their dietary exposure data are limited in being able to validate the association between that NPS and health outcomes.
*Food composition tables may not contain all the details required to calculate the FCS/i.FCS. How do the authors see the extension of their work to other countries/settings? What would be their advice if certain attributes are missing?
We believe the primary advantages of the FCS are not the number of attributes, but other, more fundamental novel design features. These include: (a) the use of domains, which provides a more holistic assessment of foods and beverages while also preventing excess weight from any single attribute; (b) the integration of cutting edge-science in selection of the basic attributes, including the use of nutrient ratios (unsaturated:saturated fat, potassium:sodium, fiber:carbohydrate) which more accurately capture fat, mineral, and carbohydrate quality, as well as the omission of outdated attributes which are major components of other NPS, in particular total fat and total calories; (c) the scoring on the basis of 100 kcal, rather than 100 g as in many other NPS; and (d) the incorporation of some features of processing, beyond nutrients alone.
For example, among the various domains, the most predictive was often the nutrient ratio domain. In addition, other analyses show that scoring by weight or serving size, rather than the natural unit of kcal, can be highly problematic, due to bias from water weight, fiber, and fat. For example, in a recently published NPS that scored foods based on 100 g, pork fat was declared the 7th most nutritious food in the world. This is entirely due to the bias of scoring by weight: 100 g of pork fat contains 638 kcal, while 100 g of apple contains 52 kcal. Comparing the nutrients provided by weight of these two foods, pork fat of course provides more of many nutrients per 100 g given that it contains no water, leading to absurd claims that pork fat is the 7th healthiest food in the world (https://www.bbc.com/future/article/20180126-the-100-most-nutritious-foods, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118697).
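The arithmetic behind the weight-versus-energy point can be sketched as follows. This is a minimal illustration, not part of the Food Compass algorithm: only the two energy densities (638 and 52 kcal per 100 g) come from the text, while the function name and the nutrient amount are hypothetical and chosen purely to show the unit conversion.

```python
def per_100_kcal(amount_per_100_g, kcal_per_100_g):
    """Convert a nutrient amount per 100 g into an amount per 100 kcal."""
    return amount_per_100_g * 100.0 / kcal_per_100_g


PORK_FAT_KCAL_PER_100_G = 638.0  # from the text
APPLE_KCAL_PER_100_G = 52.0      # from the text

# Hypothetical: suppose both foods supplied the same amount of some
# nutrient per 100 g (made-up value, arbitrary units).
hypothetical_amount_per_100_g = 1.0

pork = per_100_kcal(hypothetical_amount_per_100_g, PORK_FAT_KCAL_PER_100_G)
apple = per_100_kcal(hypothetical_amount_per_100_g, APPLE_KCAL_PER_100_G)

# On an energy basis the apple supplies 638 / 52 times as much of the
# nutrient as the pork fat, even though they tie on a weight basis.
ratio = apple / pork
print(f"pork fat: {pork:.3f}, apple: {apple:.3f}, ratio: {ratio:.1f}")
```

Under this assumption, two foods that look identical per 100 g differ roughly twelvefold per 100 kcal, which is the kind of reordering the choice of reference unit can produce.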
Thus, a more limited FCS, retaining these basic design strengths but employing fewer nutrients, will likely still be a reasonable predictor of the healthfulness of products. We plan to test this question in future manuscripts. In addition, imputation can be used to deal with missingness of certain variables -we are currently working on such imputation to apply FCS to hundreds of thousands of branded food products.
We have updated the discussion text with these important points: "The Food Compass could be a possible unified standard NPS for these efforts, given its association with a healthy diet pattern and multiple health endpoints and its use of a single, consistent algorithm for all foods, beverages, mixed ingredients, and mixed meals. Such a standard may also provide a more accurate assessment of healthfulness for consumers, industry, and procurement decisions, in comparison to strategies using isolated nutrients or ingredients, such as the FOP warning labels recently implemented in Chile, Mexico, Uruguay, and Israel. 40 While FCS has more attributes than existing NPS, its core design strengths are the use of domains, which provides a holistic assessment of foods and beverages while also preventing excess weight from any single attribute; integration of cutting edge science in selection of predictive attributes (i.e., nutrient ratios which more accurately capture fat, mineral and carbohydrate quality) and omission of outdated attributes such as total fat or total calories; scoring on a basis of 100 kcal rather than 100 g, which greatly reduces bias from water weight; and incorporation of novel relevant features such as processing and additives. In future work, we plan to test FCS versions that retain these core design strengths while employing a more limited set of attributes; as well as leverage multivariable imputation to handle missing values in certain attributes in large datasets of branded products. Extension to datasets in low and middle income countries will also be important."
*For the nutrient scores, 25% of the RDAs for men 19-50 years were used, overestimating the domains for certain age-sex groups (e.g. iron, calcium). For intention of use as FoP labelling, this systematic error seems preferable to detail and clutter, but as i.FCS, would a more 'individualized' score taking age and sex into account when it comes to RDA be preferred perhaps? Have alternative algorithms been tried?
This is an excellent point. For practical goals of creating a single score, some assumptions and simplifications were needed. In the future, more "individualized" NPS could be crafted based on specific characteristics of the individual, such as age, sex, disease status, and more. We have added this point to the discussion: "For practical goals of creating a single NPS score, some simplifications were needed, such as using the RDA for 19-50 year-old men as target high scores for several nutrients in the Food Compass algorithm. In the future, more "personalized" NPS could be crafted based on specific characteristics of the individual, such as age, sex, disease status, and more."
*The impact of food production and consumption on the environment is becoming increasingly important. A process has been started for food guidelines to include aspects of sustainability. Could this be the 10th domain? Or would sustainability (like alcohol) be best modelled 'outside' the FCS?
We fully agree. Indeed, this was the original vision of the Food Compass, in that we anticipated each direction of the compass scoring a different feature of food: e.g., healthfulness, sustainability, social justice, animal welfare. We plan in future work to add such additional independent dimensions to the Food Compass. We have added these points to the Discussion: "Finally, the long-term vision of Food Compass is to score additional features of foods and beverages, such as environmental sustainability, social justice, and animal welfare - one for each direction of the compass. Future work is required to explore, add, and validate these additional dimensions."

References
Please include an 'accessed date' for the web-based sources.
Done.
Ref 44: Could the authors include the chapter or possibly even refer to the original publication(s)?
Yes, we've updated this reference to include the chapter number and title from Willett's Nutritional Epidemiology.

Textual
Line 25: Suggest removing "consistent", since this is implied by "algorithm".
We use the word consistent to refer to the fact that we use the same algorithm across all food and beverage products, rather than different algorithms for different categories of foods, as done with many other nutrient profiling systems like Nutri-Score, Health Star Rating, etc. We've updated the text to say "same" instead of "consistent" for greater clarity.
Line 34: Suggest adding "In cross-sectional analysis, [after multivariable …]"
Done.
Line 35: Suggest to remove "levels of" in front of BMI.
Done.
Line 38: For better readability add "as well as, [lower prevalence]"
Done.
Line 73: Suggest rephrasing to "health risk related nutrients, …".
We have rephrased it to: "54 potentially protective and harmful nutrients," as some attributes are protective.
Line 80: Suggest rephrasing to "…and the corresponding validity for the association between a person's FCS and health outcomes, …".
We updated it to read: "and the corresponding validity for the association between a person's FCS with a validated healthy dietary pattern and major health outcomes, have not been established."
Line 98: Suggest to include "…(NDI) up to 2015 we …".
We've updated this now, as NDI data is available through 2018.
Line 117: Suggest including "Any missing attributes (i.e. nutrient quantities) required …"
Done.
Consider to move the FCS paragraph after the dietary assessment, followed by the paragraph on sociodemographics.
Good suggestion, we have moved the FCS paragraph after dietary assessment, followed by sociodemographic factors and covariates.
Line 162: Suggest to write the sentence in the order of actions "…, the FCS score for each food item reported in a person's diet was weighted by its percent …intake and then summed."
We've updated the text per your suggestion to be clearer about the order of actions: "To extend FCS for specific products to an individual's overall diet, the FCS score for each item reported in a person's diet was weighted by its percent contribution to that person's total energy intake, and then summed, to calculate an individual's Food Compass Score (i.FCS)."
Line 164: Consider to remove duplicated sentence on alcohol.
We believe it is important to mention exclusion of energy contribution from alcohol in the i.FCS calculation, and the inclusion of alcohol intake (as percentage of average total energy intake) as a covariate, as two separate points. We've updated the text in the i.FCS paragraph to read: "Because alcohol is not scored by FCS, energy from alcohol intake was excluded from the i.FCS calculation and included as a covariate in all models."
Line 210: Suggest to include: "In cross-sectional analysis, all models were adjusted …"
We've updated this part of the text with slightly different language.
Line 232: typo casual.
Good find. We've updated the text to say "causal".
Line 260: Suggest "With increasing i.FCS, the percentage of participants with Asian or other ethnicity increased; whereas the reverse was observed for Non-Hispanic Black participants."
Thanks for the suggestion to make this statement more consistent with how we report on other sociodemographic variables. We've modified your suggested text slightly in the findings: "In crude (unadjusted) analyses, adults with a higher i.FCS were more likely to be of Asian or other racial descent and less likely to be of Non-Hispanic Black race/ethnicity;"
Line 270: Suggest to add "…highly correlated with the HEI-2015, ranging from …"
Noted. We've added in the range of Spearman correlations for population sub-groups: "ranging from 0.76 to 0.83".
Line 274: Suggest to change to: "In multi …, higher i.FCS were significantly associated with more favorable risk factors (Table 2)."
We have removed "levels" in the first half of the sentence. We believe it is important to clarify the directionality of the association: that higher i.FCS is associated with lower levels of risk factors (BMI, HbA1c, LDL, TC, BP, etc.) but higher levels of protective factors (i.e., HDL).
Line 297: To make the difference with the survival analysis clearer, suggest to change to: "…was associated with 15% lower odds of having the metabolic syndrome, …etc"
We have updated the text to use "prevalence" instead of "risk", to highlight the cross-sectional nature of these analyses.
Line 343: 'this NPS' refers to the FCS.
We have changed the text to say "Food Compass" now instead.
Line 387: Check location of brackets.
Good find. We've removed the extra comma inside the bracket.
Line 401: Would be good to explain more, to avoid misinterpretation of black box.
We have removed the words "black box" to avoid confusion. We believe the text is clear without that descriptor: "in comparison to strategies using isolated nutrients or ingredients, such as the FOP warning labels recently implemented in Chile, Mexico, Uruguay, and Israel."
Line 416: Suggest to add "using 1-2 24-hour diet recalls per individual …".
We've modified it to "up to two 24-hour recalls per individual".
We believe the manuscript is greatly strengthened as a result of these suggested revisions and analytical updates, and believe this work is now suitable for publication. We look forward to your review.

Thank you for sharing this paper. Egnell et al. 2021 compared different variations of the same NPS, not different nutrient profiling algorithms. Also, we note that even this paper is a later publication, after other validation papers had been published assessing this sole NPS against health outcomes. All other identified studies similarly only evaluated a single NPS in relation to health outcomes: see Table 1 from our previous Response for examples.
While we believe it was still accurate, we have deleted the statement that "neither we nor others have compared the validity of different NPS against health outcomes." As previously described, we hope to conduct such complex, complementary analyses in future papers, both assessing variations of the Food Compass and comparing other, distinct NPS. We have updated the discussion to further clarify this: "Additionally, we have not compared the validity of different NPS against health outcomes, and in different nations and subpopulations - an important area for future work." "Finally, we did not compare the findings to other NPS, which can be done in future work." (new to Limitation section)
Re 'The combination of its attributes and domains, together with scoring per 100 kcal, does allow Food Compass to have a consistent and universal algorithm, per definition, as the same algorithm is applied across all food, beverage, and meal product categories.' (p 8) I agree that Food Compass algorithm is different from that of Nutri-Score, etc. in being applied across all foods rather than by category, but that is not what I meant by the Food Compass algorithm providing no more consistent scoring or convergent validity than other nutrient profile models. But the authors and I are clearly not going to agree on this point so I am willing to let the matter drop.
Thank you for your input. We agree there are different scientific perspectives on similar issues.
Re 'We have added a sentence to the Discussion' (p9) Thank you for doing this but the sentence does not respond to my original request to the authors to explain their choice of method for converting nutrient profile scores for foods into diet quality scores for individuals. Instead they repeat some of their argument for why a NPS itself should have a 100 kcal reference amount rather than a 100 g reference amount. Their argument is already explored in some depth in the original article describing the Food Compass NPS published in Nature Food (p 816) and need not be repeated here (and anyway the argument as summarised here is contentious). So I recommend deleting these new sentences beginning, 'Both the FCS and i.FCS were weighted by energy…' and ending, 'creating a flawed comparison of the total nutrients provided by each 100g portion.'
Sorry if we did not fully understand your previous comment: we did aim to be fully and accurately responsive. See below for further details on how we have modified this section, and removed the pork fat example as you have suggested.
Re 'The iFCS quintiles appear to be strongly inversely related to total energy intake and there is a weighting for each food stuff on the basis of their contribution to total energy and it appears at least in some of the analyses that TEI was also included as a covariate. A clear statement about what role TEI plays in this score would be helpful.'(p 11) I agree with Reviewer 2 here but there is an important distinction to be made between the TEI of the diet and the TEI of the food. I am not sure that the authors have understood the point Reviewer 2 is making.
Thank you for this point. It is relevant to note that the inverse association between i.FCS quintiles and total energy intake is crude (i.e., not adjusted for other critical sociodemographic variables) and thus should be interpreted with caution. We tested and found, for example, that this association was attenuated when accounting for age and sex: "However, after adjusting these crude associations by age and sex, the differences by i.FCS quintiles in total energy intake and physical activity were attenuated." Total energy intake plays no role in the i.FCS score, and also does not appear to confound or mediate the association when adjusted for as a covariate. We have clarified this further in the methods text: "The relative energy contribution of each food item was accounted for in the dietary i.FCS calculation. Total energy intake was not included as a component of either the FCS (food level) or the i.FCS (dietary level). Thus, we adjusted for total energy intake derived from the 24-hour recall data as a potential confounder in the models."
Re 'Line 417: Even though the foods in the food composition table can be precisely categorized, the i.FCS may still suffer from systematic error due to misreporting and omission of food items by participants. Could the authors refer to the implications for this in the section on limitations? ... "could bias results in unpredictable directions."' (p21) Surely under-reporting of unhealthy items is more likely than over-reporting of healthy items, and there is then some predictability in the way reporting will affect the results?
This is an interesting thought exercise. We are not aware of clear evidence for a meaningful difference in misreporting "healthful" versus "less healthful" food items that is also differential by risk of poor health outcomes. (Both are necessary: i.e., systematic error only results when the error in the exposure varies according to risk of the outcome.)
Without such under- or over-reporting also varying according to a person's risk of disease, the bias would be in unpredictable directions. The most plausible scenario for systematic error would be one in which individuals at higher health risk are more likely to under-report unhealthy foods. This would artificially inflate these individuals' Food Compass Scores (i.FCS) - a bias which would weaken the reported findings, causing i.FCS to appear less protective than it actually is. Thus, correction for such bias would strengthen our findings. We have updated our Limitations section accordingly: "If systematic with respect to the outcomes, for instance if individuals at higher health risk were more likely to underreport unhealthy foods and beverages, this would artificially inflate these individuals' i.FCS - a bias which would weaken the reported findings and make the i.FCS appear less protective than it actually is."
Re 'Comparing the nutrients provided by weight of these two foods, pork fat of course provides more of many nutrients per 100g given that it contains no water, leading to absurd claims that pork fat is the 7th healthiest food in the world' (p 22). This statement makes its way into the revised text at line 366 (omitting the word "absurd").
Of course, the claim seems absurd, but any reference amount (100g, 100kJ or serving) will generate apparent absurdities. E.g. NPSs with a 100kJ reference amount generally classify lettuce as an unhealthy food, which also seems absurd. I do not think picking out particular foods helps in discussion about the optimal reference amount for an NPS, and I would recommend removing the newly added sentences beginning, 'Both the FCS and i.FCS were weighted by energy…' and ending, 'creating a flawed comparison of the total nutrients provided by each 100g portion.' (lines 364-369) for this reason in addition to the one I give above.
Your example of lettuce is an excellent illustration.
First, a major reason that other NPS create absurd scores is because of the 100g scale. This is a major reason why NPS like Nutri-Score and Health Star Rating must select and use different algorithms and scoring cut-points for different food categories.
Second, your example of lettuce supports our argument, as using a 100kcal (or 100kJ) reference correctly classifies lettuce as a healthy food. In fact, almost all green leafy vegetables (romaine lettuce, collards, cress, chard, kale, spinach, etc.) score 100 when using Food Compass's (per 100 kcal) algorithm - the highest possible rank in our system. Please see the supplementary materials of our recent Nature Food publication (Table S7, pg 18) for the scoring of all 8032 foods. https://www.nature.com/articles/s43016-021-00381-y#Sec14
Nonetheless, your point is well taken about the danger of picking out particular food scores in any NPS, which is not crucial for our point, and we have thus removed the lines you have suggested.
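For readers following the reference-amount discussion above, the arithmetic behind a per-100 g versus per-100 kcal comparison can be sketched as follows. This is a minimal illustration only, not the Food Compass algorithm: the function name `per_100_kcal`, the two food labels, and all numeric values are hypothetical and were chosen purely to show how the ranking of a water-rich food and a water-free food can reverse when the reference amount changes.

```python
def per_100_kcal(amount_per_100g: float, kcal_per_100g: float) -> float:
    """Rescale a nutrient amount from a 100 g basis to a 100 kcal basis."""
    if kcal_per_100g <= 0:
        raise ValueError("energy density must be positive")
    return amount_per_100g * 100.0 / kcal_per_100g


# Hypothetical values for illustration only (not taken from any food database).
foods = {
    "watery_leafy_food": {"kcal_per_100g": 20.0, "nutrient_mg_per_100g": 40.0},
    "water_free_fat": {"kcal_per_100g": 900.0, "nutrient_mg_per_100g": 90.0},
}

# Same nutrient, re-expressed per 100 kcal of each food.
rescaled = {
    name: per_100_kcal(v["nutrient_mg_per_100g"], v["kcal_per_100g"])
    for name, v in foods.items()
}

# Which food "wins" under each reference amount.
top_by_weight = max(foods, key=lambda n: foods[n]["nutrient_mg_per_100g"])
top_by_energy = max(rescaled, key=rescaled.get)

for name in foods:
    print(name, foods[name]["nutrient_mg_per_100g"], "mg/100 g ->",
          rescaled[name], "mg/100 kcal")
```

With these made-up inputs, the water-free food provides more of the nutrient per 100 g, while the water-rich food provides more per 100 kcal, which is the kind of reversal both the reviewer and the authors describe.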