The role of nutritional programming in infancy on later growth has been widely debated in recent years. Recently, a group of expert committees systematically reviewed the levels of evidence underlying guidelines for a wide variety of infant feeding practices and related health outcomes within the Pregnancy and Birth to 24 Months Project. For almost all of the research questions, the committees concluded that the level of evidence supporting single outcomes was low to modest, at best.1 One important issue in infant feeding concerns the optimal level of dietary protein that should be given to infants. Protein intake in the first year of life is, indeed, considered one of the main determinants of growth later in life.2 In this period, an insufficient protein intake may lead to detrimental consequences, such as growth failure and altered body composition and metabolism.3 On the other hand, previous literature suggests that a high-protein intake earlier in life is associated with a higher weight gain later on.4,5,6 Indeed, breastfed infants, with a limited protein intake and a balanced energy-to-protein ratio supplied by human milk, show a lower risk of developing fatness later in life. However, the evidence is still controversial, and the underlying mechanisms are unclear.7,8,9,10 In more recent papers, the suggestion of an association between earlier protein intakes and later fatness has been supported by several authors on the basis of heterogeneous studies, including randomized clinical trials (RCTs).11,12 Finally, a quantitative synthesis of the evidence via meta-analysis is currently not available on the effects of different amounts of protein intake within the first year of life.

The current paper is aimed at investigating the effects on growth outcomes of different amounts of protein intake in healthy full-term infants within the first year of life through: (1) a systematic review of the literature that includes any interventions with different protein content (e.g., infant formulas, follow-on formulas, or complementary feeding); and (2) a quantitative synthesis of the evidence via a meta-analysis that considers the effect of different formula-based interventions on growth outcomes consistently measured at any timepoints in similarly designed studies.

Materials and methods

Systematic review

A systematic review of the literature was initially performed in June 2018 and then updated on October 15, 2021. The search was carried out via PubMed, Embase, Web of Science, and the Cochrane Library, following PRISMA guidelines.13 The following search terms were used: (first year OR first year of life OR first months OR first months of life) & (dietary protein) & (growth OR length OR weight OR body mass index OR skinfold thickness). Only human studies reported in English were considered. Letters to the editor, abstracts, and proceedings from scientific meetings were excluded from the analysis.

Included studies

We included in this systematic review any RCT on healthy term infants that: (1) compared at least two arms presenting different amounts of dietary protein in the first year of life and (2) evaluated their short- (i.e., ≤1 year from the beginning of the intervention) or long-term (i.e., >1 year from the beginning of the intervention) effects on growth. Possible interventions included infant formulas, follow-on formulas, or complementary foods. Among possible short- or long-term growth outcomes, we considered body length, weight, body mass index (BMI), waist circumference, and body fat content. We did not consider studies including mixed-type interventions, where protein intake was targeted together with additional dietary components (e.g., amino acid supplementation). Studies that did not allow us to separate the effects of protein intake in the first year of life from those of later periods were excluded. In addition, we excluded studies that did not allow us to assess arm-specific dietary protein intake. Studies targeting selected populations following specific research questions (e.g., studies selecting a priori infants born from overweight mothers only) were also discarded. Two authors (V.D.C., A.M.) independently selected the articles, and retrieved and assessed the potentially relevant ones. Discrepancies in the selection of articles or disagreements on the interpretation of methods or results were resolved by face-to-face discussion; if the discrepancy persisted, a third researcher was consulted (C.A.).

Data collection, extraction, and analysis

Pairs of review authors (G.P.M., V.E., A.M., S.B.) independently extracted the data. Disagreements were resolved through discussion with the senior author (C.A.). From each included article, the following information was extracted and entered in a structured database: (1) general characteristics of the studies; (2) study design, inclusion criteria, data collection occasions, and characteristics of the intervention; (3) outcome; (4) main results; and (5) strengths and limitations of the included studies.

When present, we also collected the main results on high- or low-protein arm vs breastfeeding.

Study quality

All included articles were independently rated for quality by two researchers (G.P.M. and V.E.), using the Quality Assessment Tool for Clinical Trials from the NIH National Heart, Lung, and Blood Institute.14 If the ratings were different, a third author (C.A.) was consulted for quality adjudication. Each study was judged as being of “good,” “fair,” or “poor” quality. We did not identify a cut-off for the total score (calculated by summing up the 1s corresponding to the “yes” marks), but we carefully evaluated the “no” items to assess the overall risk of bias of the examined study.


Meta-analysis

We focused our meta-analysis on the comparison between growth outcomes derived from formulas with different amounts of protein. Restricting the analysis to formulas allows us to better control treatment-related heterogeneity, which could otherwise arise from a combination of formula and complementary feeding. Among the studies included in the systematic review, we selected those that enrolled all participants and started their intervention within the first month of life, to reduce as much as possible the effect of prolonged breastfeeding before switching to the formula intervention. In case of relevant missing information, or to verify the reliability of the information, the corresponding author of the article was contacted by email. If the author did not respond to our query, a second attempt by email was made at least 20 days later. If the second email went unanswered, missing data were imputed from those available in the report. Additional details are provided below.

Included studies

We identified five studies to be included in the meta-analysis.15,16,17,18,19 These studies consistently measured growth outcomes at 120 days of life of the infants and were comparable in terms of treatment and potential treatment effects over the reference period. The study by Oropeza-Ceja et al.16 was included after contact with the corresponding author, who confirmed that all infants assigned to the three formula groups were exclusively or mixed breastfed for no more than 30 days. Although three papers within the European ChildHood Obesity Project (CHOP)20,21,22 and the studies by Ziegler et al.,23 Åkeson et al.,24 and Larnkjær et al.25 considered formula-based interventions, the corresponding results were excluded from the meta-analysis because infants were enrolled up to approximately 2 (refs. 20–22), 3 (refs. 23, 24), or 9 (ref. 25) months of life. In addition, we excluded the Picone et al. article because the outcome measure was not available at 4 months.26


Treatments under comparison included a high- and low-protein content infant formula, but their exact specification varied across studies (Table 1). After inspecting study-specific cut-offs, we assumed that subjects receiving an amount of protein ≤2.0 g/100 kcal belong to the low-protein content formula group, whereas those receiving >2.0 g/100 kcal belong to the high-protein content formula group. One paper16 provided a four-arm design, where low-, medium-, and high-protein arms were considered, together with breastfeeding. In accordance with our definition, we included the original low- (1.4 g/100 kcal) and medium-protein (1.8 g/100 kcal) content groups into one low/medium-protein-content group of 35 participants. The mean and standard deviation of the growth outcomes were calculated using the weighted mean and the decomposition of the total variance in between- and within-groups variance (Supplementary Materials and methods—text).
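The merging of two arms into a single group can be sketched as follows. This is a minimal illustration, assuming the usual decomposition of the total sum of squares into within- and between-arm components; the function name `combine_arms` and the numbers in the example are hypothetical and are not taken from the included studies.

```python
import math

def combine_arms(n1, mean1, sd1, n2, mean2, sd2):
    """Merge two arms into one group: weighted mean plus total-variance SD."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Within-arm sum of squares, recovered from the arm-specific SDs.
    within_ss = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    # Between-arm sum of squares around the combined mean.
    between_ss = n1 * (mean1 - mean) ** 2 + n2 * (mean2 - mean) ** 2
    sd = math.sqrt((within_ss + between_ss) / (n - 1))
    return n, mean, sd

# Hypothetical low- and medium-protein arms (weight gain, g/day).
n, mean, sd = combine_arms(12, 28.0, 5.0, 14, 30.0, 6.0)
print(n, round(mean, 2), round(sd, 2))
```

The combined SD obtained this way equals the SD that would be computed from the individual-level data of the two arms taken together, which is why no raw data are needed.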

Table 1 Characteristics of the studies presented in the papers included in the systematic review.

Outcome measures

We conducted two parallel meta-analyses considering as the primary outcomes weight gain at 120 days of life and length gain at 120 days of life, respectively. After contacting the corresponding authors, available information was still insufficient (two studies only) to carry out a meta-analysis on BMI gain at 120 days. When the standard deviation of weight or length gain was not reported and it was not possible to calculate it from different data in the paper (for weight gain in one study17 and for length gain in two studies17,19), we imputed it as the mean of the arm-specific standard deviation provided by the remaining studies included in the meta-analysis. Sensitivity analyses were later conducted to assess the effect of this strategy, by varying the imputed standard deviations from the minimum to the maximum values available in each arm.
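The imputation rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `impute_sd` and the example values are hypothetical, and the rule is applied to one arm at a time.

```python
from statistics import mean

def impute_sd(reported_sds):
    """Main-analysis imputation and sensitivity bounds for one arm.

    `reported_sds` holds the arm-specific SDs of the studies in the
    meta-analysis, with None for a study that did not report one.
    """
    available = [sd for sd in reported_sds if sd is not None]
    if not available:
        raise ValueError("no reported standard deviations to impute from")
    return {"main": mean(available), "min": min(available), "max": max(available)}

# Hypothetical SDs of weight gain (g/day) in one arm; None marks the missing study.
high_arm_sds = [5.1, 6.3, None, 4.8, 5.6]
print(impute_sd(high_arm_sds))
```

The "main" value would be used in the primary analysis, while "min" and "max" give the extremes explored in the sensitivity analyses.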

Statistical analysis

We used the mean difference (MD) in each growth outcome as the measure of treatment effect. When adjusted estimates of the intervention effect, together with their standard errors, were derived from an analysis of covariance model in the original studies,16 they were combined with the arm-specific change-from-baseline scores on the MD scale, as they give the most precise and least biased estimates of intervention effects. We calculated the summary estimates of the weighted MD using both fixed-effects (based on the inverse variance method) and random-effects (based on the DerSimonian and Laird estimation method) models.27,28 The forest plot presented study-specific and combined estimates of the MDs from random-effects models, to provide more conservative estimates.29
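The two pooling models can be sketched as below. This is a simplified illustration, not the STATA routine used for the analysis: the function names, the direction of the difference (high minus low protein), and the study-level numbers are assumptions made for the example.

```python
import math

def mean_difference(n_hi, mean_hi, sd_hi, n_lo, mean_lo, sd_lo):
    """Unadjusted MD (assumed direction: high minus low protein) and its SE."""
    md = mean_hi - mean_lo
    se = math.sqrt(sd_hi ** 2 / n_hi + sd_lo ** 2 / n_lo)
    return md, se

def pool(effects, std_errors):
    """Fixed-effect (inverse variance) and DerSimonian-Laird random-effects pooling."""
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "fixed": (fixed, math.sqrt(1.0 / sum(w))),
        "random": (random, math.sqrt(1.0 / sum(w_re))),
        "tau2": tau2,
    }

def ci95(estimate, se):
    """Normal-approximation 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical study-level MDs in weight gain (g/day) and their standard errors.
result = pool([1.2, -0.8, 0.5, -1.5, 0.9], [0.9, 1.1, 0.7, 1.3, 1.0])
print(result["random"], ci95(*result["random"]))
```

When the between-study variance is estimated as zero, the two models coincide; otherwise the random-effects interval is wider, which is the sense in which it is the more conservative choice.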

Statistical heterogeneity among studies was assessed via the χ² test on the Q statistic (a corresponding p value < 0.10 suggests the presence of heterogeneity);28 the presence of potential inconsistencies was quantified through the I² statistic30 (I² < 25%: low heterogeneity; 25% < I² < 75%: moderate heterogeneity; I² > 75%: high heterogeneity). For each trial, we plotted the treatment effect against its standard error in the funnel plot. Besides visual inspection, the symmetry of the funnel plot was assessed with Egger’s and Begg’s tests, to assess whether the effect decreased as sample size increased.31,32
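A minimal sketch of the heterogeneity statistics follows. It assumes the usual definitions of Cochran's Q and of I² as the share of Q in excess of its degrees of freedom; the function name is hypothetical, and assigning the boundary values (exactly 25% or 75%) to the "moderate" class is an assumption, since the thresholds above leave them open.

```python
def heterogeneity(effects, std_errors):
    """Cochran's Q, its degrees of freedom, I^2 (in %) and a verbal label."""
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # I^2 is truncated at zero when Q falls below its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    if i2 < 25.0:
        label = "low"
    elif i2 <= 75.0:
        label = "moderate"  # boundary values placed here by assumption
    else:
        label = "high"
    return q, df, i2, label

# Hypothetical study-level MDs and standard errors.
print(heterogeneity([1.2, -0.8, 0.5, -1.5, 0.9], [0.9, 1.1, 0.7, 1.3, 1.0]))
```

The p value of the test is obtained by referring Q to a χ² distribution with k − 1 degrees of freedom, which is left out here to keep the sketch free of external dependencies.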

A cumulative meta-analysis was performed to assess how the overall estimate changes as each study is added to the pool of the previously published studies; an influence analysis excluded one study at a time from the meta-analysis. The limited number of studies included in the meta-analysis did not allow us to carry out the subgroup analyses by grade and type of sponsorship originally planned in the protocol.
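Both procedures amount to re-running the pooling on subsets of studies, as in the sketch below. It assumes a DerSimonian-Laird random-effects model and studies ordered by publication date; all function names and numbers are hypothetical.

```python
import math

def dl_pool(effects, std_errors):
    """DerSimonian-Laird random-effects estimate with a 95% CI."""
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # With a single study there is no between-study variance to estimate.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return est, est - 1.96 * se, est + 1.96 * se

def leave_one_out(effects, std_errors):
    """Influence analysis: pooled estimate after omitting each study in turn."""
    return [
        dl_pool(effects[:i] + effects[i + 1:], std_errors[:i] + std_errors[i + 1:])
        for i in range(len(effects))
    ]

def cumulative(effects, std_errors):
    """Cumulative analysis: pooled estimate as studies are added in order."""
    return [dl_pool(effects[:i], std_errors[:i]) for i in range(1, len(effects) + 1)]

# Hypothetical MDs and standard errors, listed in order of publication.
effects = [1.2, -0.8, 0.5, -1.5, 0.9]
std_errors = [0.9, 1.1, 0.7, 1.3, 1.0]
for est, low, high in leave_one_out(effects, std_errors):
    print(f"{est:.2f} ({low:.2f}, {high:.2f})")
```

A stable result is one where the omitted-study and cumulative intervals lead to the same conclusion as the interval based on all studies.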

We also calculated the post hoc power of each meta-analysis under fixed- or random-effects models by assuming the expected effect size (Cohen’s d), sample size, number of studies, and degree of heterogeneity to be those identified in our meta-analyses. In detail, we used the sum of the two arm-specific medians as the (total) sample size; we re-expressed the absolute (unstandardized) MD between the two arms in terms of effect size, as measured by Cohen’s d. To provide a fair comparison, we also calculated the post hoc study-specific power from a two-sided independent-samples t test with the single study-specific sample sizes and effect sizes (Cohen’s d), as well as an alpha level equal to 0.05. All the statistical analyses were performed using STATA software (version 13; StataCorp, College Station, TX).
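The study-specific calculation can be approximated as below. This sketch replaces the t distribution with a normal approximation, which is a simplification introduced here and not the method used in STATA; the function names and input values are hypothetical.

```python
import math
from statistics import NormalDist

def cohens_d(mean_difference, pooled_sd):
    """Re-express an unstandardized MD as a standardized effect size."""
    return mean_difference / pooled_sd

def power_two_sample(d, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample test with equal arms.

    Uses the normal approximation to the t test (an assumption of this sketch).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(d) * math.sqrt(n_per_arm / 2.0)
    # Probability of rejecting in either tail under the alternative.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical study: MD of 0.5 g/day, pooled SD of 5 g/day, 50 infants per arm.
d = cohens_d(0.5, 5.0)
print(round(power_two_sample(d, 50), 3))
```

With an effect size of zero the function returns the alpha level, which is a quick sanity check on the formula; small effect sizes therefore give power only slightly above 5% unless the arms are very large.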

Risk of bias assessment

Risk of bias was assessed by two authors (G.P.M. and V.E.) for all studies included in the meta-analysis according to the criteria presented in the Cochrane Handbook for Systematic Reviews of Interventions (Version 5.1.0).33


Results

Systematic review

The search process is shown in Fig. 1. A total of 12 papers15,16,17,18,19,20,21,22,23,24,25,26 corresponding to 10 RCTs (2275 participants), mostly from Europe and the United States, were retained for the systematic review (Table 1). Two papers21,22 (723 participants) provided a follow-up of the previously published CHOP study.20 Five were double-blind trials and two16,19 were single-blind; in one study,26 blinding was present but not described in detail. Eligible infants were those recruited at birth,26 or aged <7 days (refs. 15, 17), ≤21 (ref. 19) or 28 (ref. 18) days, ≤40 days (ref. 16), or <56 days of life;20,21,22 the remaining studies enrolled infants of 3 (refs. 23, 24) and 9 (ref. 25) months of life.

Fig. 1: Flowchart of the study selection process.

Flowchart of the study selection process for the systematic review and meta-analysis following the guidelines from the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) group.

Most studies compared a high-protein formula, a low-protein formula, and breastfeeding. Oropeza-Ceja et al.16 and Picone et al.26 considered high- (formula 15, or 2.14 g/100 kcal, in ref. 26), medium- (1.8 g/100 kcal in ref. 16; formula 13, or 1.85 g/100 kcal, in ref. 26), and low-protein (1.4 g/100 kcal in ref. 16; formula 11, or 1.57 g/100 kcal, in ref. 26) formulas, as well as breastfeeding. Owing to its focus on total protein intake (including weaning foods), the study by Åkeson et al.24 enrolled breastfed infants of 3 months and assigned them to high- [formula 18 (i.e., 2.57 g/100 kcal) or 20 (i.e., 2.85 g/100 kcal), depending on the group], medium- (formula 15, or 2.14 g/100 kcal), and low-protein (formula 13, or 1.85 g/100 kcal) formulas, which, however, were higher in protein concentration than those in the study by Picone et al.26 In one article,25 infants fed with any infant formula available on the market were included in the formula group. The number of subjects was similar across arms (i.e., maximum difference between pairs of arms ≤20%) in most of the papers.15,16,17,18,22,24,25,26

The minimal set of anthropometric variables collected in all studies included body weight and length. The time schedule of data collection differed across papers. Although all studies except one17 reported anthropometric measurements at enrollment, some studies treated infants and checked anthropometric measures over the same time span15,16,18,19,21,24,25,26 and others monitored anthropometric measures over a longer follow-up (e.g., several years after the intervention).17,20,22 In either case, the most common intervention strategy considered treatment from enrollment to 120 days of life of the infant. However, heterogeneity was present even when considering age at enrollment within the first month of life.15,16,17,18,19 This implied that, while some studies were likely to actually treat infants for about 4 months in total,15,17 one study18 treated infants for 3 months, and another two16,19 were likely to include subsets of older infants, treated for less than 4 months. In addition, within the same period of 120 days of an infant’s life, the number and schedule of data collection occasions differed across studies, with measurements taken 2 (ref. 19), 3 (refs. 16, 17), 5 (ref. 18), and 6 (ref. 15) times.

Among selected articles, 8 (refs. 15, 17, 18, 20–23, 26; 67%) were of “good”, 3 (refs. 16, 19, 25; 25%) of “fair”, and one (ref. 24; 8.3%) of “poor” quality.

Two studies found a similar weight gain in the first 120 days for high- and low-protein content formulas.15,19 By contrast, one study found a lower weight gain among infants in the low-protein formula group than in the high-protein formula group,18 and another study16 observed a lower weight gain in the low-protein formula group than in the medium- or high-protein content groups in the first 120 days of life. In addition, gain in weight was similar between Swedish and Italian infants receiving formulas with high-, medium-, and low-protein content over a 3–6-month and a 6–12-month period.24

Weight was similar between high- and low-protein content formulas in four papers: at 120 days of life,19 at 1 year,25 at 2 years,20 and at 5 years;17 however, it was lower among infants in the low-protein formula group at 120 days of life in one paper17 and at 6 months in another.21 Likewise, no significant differences in weight were observed at 2, 4, 8, and 12 weeks in the comparison between low-, medium-, and high-protein content formulas in another article.26 Finally, at 6 years of life, weight was slightly higher (3%) in the high-protein than in the low-protein formula group.22

In two studies, a similar length gain in the first 120 days of life was detected.15,16 Another study found a higher (10%) length gain in the first 120 days of life in the low-protein content group.18 Finally, gain in length was similar between Swedish and Italian infants receiving formulas with high-, medium-, and low-protein content over a 3–6-month and a 6–12-month period.24

Length was similar between high- and low-protein content groups at 3 (ref. 20) and 6 months,20,21 and at 1 (refs. 20, 25), 2 (ref. 20), 5 (ref. 17), and 6 (ref. 22) years of life. Similarly, no differences were observed at 2, 4, 8, and 12 weeks in the comparison between low-, medium-, and high-protein content formulas in another article.26 Weight for length was not different across the different protein content formulas at 120 days16,20 in one paper, but did differ in another one, being higher in the high-protein content formula group at 6 and 12 months.20

The BMI gain in the first 120 days was similar in two studies for low- vs. high-15,16 and medium-protein content formulas.16 One trial found that BMI was higher by 8%, 2% and 3% in high- vs. low-protein formula group at 6 months, 2 and 6 years of life, respectively.20,21,22 One study did not detect any difference in BMI between high- and low-protein content formula groups at 5 years of age.17

Among studies investigating differences in body composition in the high- and low-protein content formula groups, two observed that fat-free mass was similar at 120 days19 and at 5 years,17 and another one confirmed this lack of difference in fat-free mass, as well as in total fat mass, for infants of 6 months of age.21


Meta-analysis

Figures 2 and 3 show the study-specific and pooled estimates of the effect on weight and length gain of high- vs. low-protein intakes in infant formulas. The pooled MD of weight gain was 0.02 g/day (95% CI: –1.41, 1.45); the p value from the χ² test on the Q statistic was equal to 0.05, and I² suggested that 58% of the total variation was due to heterogeneity between studies (Fig. 2). In addition, the meta-analysis on length gain showed a pooled MD estimate of 0.004 cm/month (95% CI: –0.26, 0.27); the χ² test on the Q statistic suggested the presence of heterogeneity between studies (p value < 0.001), with I² of about 85% (Fig. 3). For weight gain, the funnel plot did not show meaningful asymmetry of the studies (Supplementary Fig. 1). However, for length gain, two trials fell outside the funnel (Supplementary Fig. 2). Egger’s (p = 0.47) and Begg’s (p = 0.22) tests confirmed that there was no evidence of publication bias for weight gain. For length gain, the preferred Egger’s test (intercept: 3.09, 95% CI: 0.89–5.28, p = 0.02) may suggest the presence of publication bias, whereas the less reliable Begg’s test provided inconsistent results. However, the small number of studies included in the meta-analysis likely made the power of Egger’s and Begg’s tests too low to distinguish chance from real asymmetry.33 In addition, the cumulative meta-analyses showed that the overall estimates were stable as more recent studies were added to the pool of the previously published ones; the corresponding CIs always included 0.
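As a rough consistency check on the heterogeneity figures reported here, and assuming that the inconsistency statistic was computed from Q in the usual way with k − 1 = 4 degrees of freedom (five studies), the reported percentages can be converted back into approximate Q statistics. Because the percentages are rounded, the values are only indicative.

```latex
I^2 = \frac{Q - (k-1)}{Q}
\quad\Longrightarrow\quad
Q = \frac{k-1}{1 - I^2}

\text{Weight gain: } Q \approx \frac{4}{1 - 0.58} \approx 9.5,
\qquad
\text{Length gain: } Q \approx \frac{4}{1 - 0.85} \approx 26.7
```

For a χ² distribution with 4 degrees of freedom, the 5% and 0.1% critical values are about 9.49 and 18.47, so these approximate Q values agree with the reported p values of 0.05 and <0.001, respectively.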

Fig. 2: Forest plot showing the study-specific and pooled estimates of the effect on weight gain of high- vs. low-protein intakes in infant formulas.

Results were derived from a random-effects meta-analysis where the measure of treatment effect was the mean difference in each growth outcome and displayed in the following way: (1) a square represented the study-specific point estimate of the intervention effect; (2) a horizontal line represented the precision of the point estimate in the form of the confidence interval. The area of the square reflected the contribution of each study to the meta-analysis in the form of weight. The pooled effect estimate and its confidence interval were represented by a diamond.

Fig. 3: Forest plot showing the study-specific and pooled estimates of the effect on length gain of high- vs. low-protein intakes in infant formulas.

See Fig. 2 for additional details.

For both growth outcomes, the influence analysis provided reassuring results. Indeed, when omitting one study at a time (1) the CIs of the five combined estimates always included 0, thus providing the same conclusion of the main analysis and (2) the five combined estimates obtained omitting each study at a time were included in the CI of the combined estimate based on all the studies (data not shown).

For each outcome measure, the four sensitivity analyses assessing a potential role of the imputed missing standard deviations provided reassuring results: the overall point estimates and 95% CIs were similar to those from the main analysis, with the CIs still including 0.

Post hoc power was extremely low for both meta-analyses. Under the random-effects model, a calculation with number of studies = 5, total sample size = 100 (50 per arm), alpha = 0.05, a two-sided test, I² = 58% for weight gain and I² = 85% for length gain, and effect size (Cohen’s d) = 0.0863 for weight gain and 0.0674 for length gain gave a power to detect the observed MD of 15% for weight gain and 13% for length gain. This is in line with results from the study-specific power calculations, where power ranged from 6% (ref. 19) to 72% (ref. 17) for weight gain, with a median of 7%, and from 5% (ref. 19) to 100% (ref. 18), with a median of 6%, for length gain (data not shown).

Most of the included studies showed a low risk of bias (Table 2).

Table 2 Risk of bias assessment for the studies included in the meta-analysis.


Discussion

We conducted a literature analysis to assess the hypothesis that low- and high-protein content diets are associated with different growth outcomes, limiting the intervention timeframe to the first year of life. Moreover, we carried out a meta-analysis that concentrated on studies as similar as possible, to reach sound conclusions. The systematic review shows that there is no clear-cut effect on growth of different amounts of protein intake from formulas or complementary feeding during the first year of life of full-term infants. In addition, our meta-analysis indicates that a low amount (≤2.0 g/100 kcal) of protein in formula milk has no significant effect on weight or length gain at 120 days, as compared to a high-protein content (>2.0 g/100 kcal) formula.

Previous literature found a potential beneficial effect, sometimes modest, of low-protein intake on growth.34,35 In 2013, Hörnell et al.34 conducted a systematic review of studies performed on a broad range of age (0–18 years of life). In 2019, Pimpin et al.35 performed a meta-analysis on the effect of protein intake on growth outcomes, including weight and length gain, but pooled data at any time point from 0 to 77 months of age. In 2016, Patro-Golab et al.36 included in their systematic review interventions up to 3 years of life and provided separate meta-analyses on growth outcomes at 3 months, between 3 and 6 months of age, and between 6 and 12 months, with a range of included articles from 2 to 4 for each meta-analysis. The authors observed a lower mean length at 3 months of age among infants receiving low- vs high-protein formulas.36 However, the remaining analyses on several other outcomes, including weight gain, provided inconclusive results; in addition, the more recent literature included in our current analysis could not be considered by these authors.36 Finally, a recent publication by Stokes et al.37 observed similar results to those of Patro-Golab et al.36 This study included cohort studies only and the corresponding meta-analysis pooled data on outcomes collected during the first 2 years of life.

Unlike the previous systematic reviews and meta-analyses,35,36,37 a systematic review published in 201538 reached conclusions in line with ours; however, the heterogeneity of the considered studies was likely higher than in our analysis, because it also included studies on infants who had previously been breastfed for >30 days or infants of obese mothers. Taken together, these analyses suggest that the effect of protein intake from formulas during the first year of life of full-term infants, if any, is likely limited, although the hypothesis of a stronger effect later in life cannot be discarded. This hypothesis is also supported by a recent very large epidemiological study, which suggested 2–6 years of life to be a possible critical age range for developing sustained obesity.39

The distinctive feature of our analysis is its strict inclusion and exclusion criteria. We limited the meta-analysis to studies providing formula-based interventions with different protein content. Among them, we excluded those enrolling participants after 30 days of life. This choice was meant to reduce the possible confounding effect of previous prolonged breastfeeding. Exclusive breastfeeding has, indeed, an important impact on body growth. Breastfed infants present a higher body weight in the first 6–8 months of life, as compared to exclusively formula-fed ones,40 reaching values of ingested milk close to their plateaux as early as 1 month of age.41 Despite a further decrease in body weight through the first year, breastfed infants maintain a higher percentage of fat mass up to the third trimester of life.42,43

Following the previous argument, the results of the three papers based on the CHOP trial, which also recruited infants between 1 and 2 months of life (25% of infants were older than 30 days of life at enrollment), were not considered in the current meta-analysis. In addition, as no information was available at 4 months, we could not include the papers by Koletzko et al.20 and Escribano et al.21 in the meta-analysis.

Most studies provided a modified casein/whey ratio in formulas. It is assumed that breastmilk is whey-predominant with an estimated mean ratio between whey and casein of 60:40. However, recent data obtained from individual measurements of casein-subunits’ concentrations suggest that this ratio changes throughout the lactation period.44

This study has strengths. No subject included in the meta-analysis was weaned, thus reducing the effect of possible additional dietary factors. Almost all studies were of good or fair quality and those included in the meta-analysis showed a low risk of bias. However, our analysis has limitations. The systematic review included only 12 articles. The corresponding studies were mainly conducted in Europe and the United States; this could be due to our choice of restricting the systematic review to papers in English, which has likely limited our ability to describe early nutrition elsewhere. In addition, the included papers had different objectives, with consequences in terms of eligible infants, sample size, type of protein-based intervention, and growth outcomes. The presence of many growth outcomes and the different timepoints for measurements have made the overall picture scattered and have prevented us from drawing conclusions even on single growth outcomes, although the identified period of investigation was reasonably short. Among the 12 selected studies, we focused our meta-analyses on the effect at 120 days. As our meta-analysis included only five studies, the derived evidence is limited. The evidence is also weakened by the moderate-to-high heterogeneity of included studies, partly due to a possible effect of previous breastfeeding among infants later enrolled in the formula arms. This heterogeneity, the small effects and related limited power found at the single-study level, as well as the few studies included, are reflected in the low power of our meta-analyses to detect a difference between low- and high-protein content formulas. With few studies, it was also impossible to assess the presence of small-study effects, including publication bias. However, based on a survey of meta-analyses published in the Cochrane Database of Systematic Reviews, tests for funnel plot asymmetry should be used in only a minority of meta-analyses.33,45 Finally, with so few studies, we could not perform any subgroup analyses to assess the potential effect of relevant covariates, including study quality or sponsorship.

In conclusion, our systematic review does not allow us to conclude whether different amounts of protein intake from either infant formulas or follow-on formulas during the first year of life affect growth outcomes in full-term infants; the available evidence from our meta-analysis does not support the assumption that a differential protein content in formulas during exclusive milk-feeding leads to differences in growth outcomes at 4 months of life. Further properly powered studies, including infants from birth and comparable outcomes measured at the same timepoints, would allow us to definitively confirm the results of the present analysis. In the meantime, it is crucial to follow up formula-fed infants’ growth and provide evidence-based nutritional advice, especially after weaning.