Introduction

Fetal growth is an intricate process, depending on multiple factors, genetic as well as environmental. Fetal growth restriction is closely related to mortality and morbidity, with increased risks of short and long term complications1. A correct assessment of fetal size and growth is important to identify fetuses at risk of unfavourable outcomes2. In order to correctly assess fetal size, a reliable standard is of paramount importance.

An appropriate study design is vital when new standards for fetal size and growth are constructed, as fetal growth should not be assessed using charts that originate from cross-sectional data. Standards for fetal growth are instead ideally created from longitudinal data with repeated measurements over time in pregnancies with accurate pregnancy dating3,4. During the last decade, several international growth standards have been created5,6,7,8. However, doubts have been raised regarding the applicability of these international standards when evaluated in different populations7,9,10,11. It has therefore been advised that the international standards should be evaluated in each separate population before taken into general practice5.

The current Scandinavian growth charts, created in 1996 by Maršál et al., lack data on gestational ages below 25 weeks and are based on data from a rather small number of women, only 86, of whom 24% were smokers12. Since then, demographics of the Swedish pregnant population have changed, including e.g. higher maternal BMI, fewer smokers and higher birthweights13. Further, obstetric interventions are nowadays performed even in the second trimester of pregnancy, and children born preterm can be saved as early as gestational week 22. Hence, there is a pressing need for updated Swedish charts for fetal size and growth.

The aim of this study was to create updated national reference ranges for fetal size and growth from gestational week 12 to 42, by applying modern statistical methods to longitudinally collected data on ultrasonically derived intrauterine biometric measurements in a large Swedish cohort of low-risk pregnancies.

Methods

Study design and population

In this prospective longitudinal multicenter study, 684 women were recruited in early pregnancy between September 2015 and September 2018 in five sites in central Sweden; Uppsala, Falun, Katrineholm, Västerås and Örebro. At first antenatal visit, all women who received antenatal care at 18 representative primary care units were invited to participate. The participants were recruited from urban as well as rural areas. Healthy, non-smoking women were eligible if they had regular menstrual periods (28 ± 4 days) and no previous pregnancy complications, such as preterm birth, pre-eclampsia, eclampsia, gestational diabetes, hypertension and stillbirth. Women with chronic hypertension, systemic lupus erythematosus, kidney disease, diabetes, previous gastric bypass surgery or inflammatory bowel disease were not eligible, as these conditions are known to affect fetal growth.

At first study visit, between pregnancy week 12 + 3 and 13 + 6 according to the last menstrual period, gestational age (GA) was assessed with ultrasound. Only women with spontaneously conceived singleton pregnancies were recruited, and the discrepancy between GA according to the biparietal diameter (BPD) and GA according to last menstrual period was not allowed to exceed seven days. Seven experienced sonographers performed the study scans. Before first inclusion, all sonographers were given detailed instructions regarding how the biometric measurements should be performed.

At inclusion, the women were randomized to nine study protocols, according to the timing of the follow-up visits, in which each subject was assigned four follow-up ultrasound scans between gestational week 14 and 41. The study protocols were kept in closed envelopes and were randomly assigned to each study subject. The purpose was to receive evenly distributed measurements among the GAs. The vast majority of Swedish pregnant women undergo a routine second trimester ultrasound scan, usually taking place between week 17 and 20. In cases where the study subject was not randomized to a scan in week 17 to 20 (five protocols), biometric measurements were recorded in the study database if the scan was performed by a study sonographer. In order to compensate for the expected decline in the number of women with ongoing pregnancies, a larger number of study subjects were assigned to each group for the scans in week 37–41.

Women were excluded from the study if the pregnancy was complicated by gestational hypertension, pre-eclampsia or eclampsia, gestational diabetes, placenta previa, fetal malformations or chromosomal aberrations, stillbirth, fetal growth restriction with abnormal fetal Doppler or birth before 37 + 0 gestational weeks (259 gestational days).

Procedures

The ultrasound machines used were GE Voluson E10, GE Voluson E8 and GE Voluson E6 with abdominal transducers 2–6 MHz RM6C, 2–8 MHz C4-8-D, RAB 4-8-D and 2–9 MHz C2-9-D. BPD was used to calculate the GA, using the modified Selbing and Kjessler formula, 58.65 + 1.07*BPD + 0.0138*BPD2, as recommended by the Swedish Society of Obstetrics and Gynecology14. Only fetuses with BPD at least 21 mm at first study visit were included. At each ultrasound scan, five biometric measurements were each measured three times; BPD, head circumference (HC), mean abdominal diameter (MAD), abdominal circumference (AC) and femur length (FL). All data was manually registered in a web-based study database.

BPD and HC were measured in an axial section, at the level of the thalami, with the midline echo in a central position broken anteriorly by cavum septum pellucidum. Orbitae and cerebellum were non-visible. The callipers for BPD were placed on the outer margin of the proximal parietal bone, and the inner margin of the distal parietal bone. HC was measured by placing the callipers on the outer borders of the frontal and occipital edges of the bone, and the ellipse facility was used to follow the contour of the skull. MAD and AC were measured in cross-section (circular view of the abdomen), with the stomach visible, the umbilical vein in the anterior third of the abdomen and the aorta and inferior vena cava anteriorly of the spine. Further, the greater part of a rib should be seen but not the heart or kidneys. The callipers for MAD were placed on the outer skin borders both anterioposteriorly and perpendicular transversely. AC was measured using the ellipse facility to follow the outer contours of the skin. Lastly, FL was measured in a longitudinal section of the femur in 45° to 90° angle of insonation, with the callipers placed on the outer borders of the femoral diaphysis. All measurements followed the national recommendations for biometric assessment and the practice guidelines from The International Society of Ultrasound in Obstetrics and Gynecology14,15.

Data management

Each biometric measurement was estimated three times and registered in the study database for all GAs, totalling to 38,601 repeated measurements. Data was first examined graphically using scatter-plots of each biometric parameter for GA in order to identify deviant records and inspect some data assumptions. Outliers were identified and each outlier was inspected regarding GA and the value of the individual biometric parameters. GA was evaluated and corrected against wrong data entry in the database. Incorrect GA records were adjusted during the examination process according to estimated date of delivery and date of examination. Next, extreme or unreasonable measurements (such as HC equal to or smaller than BPD) were deleted or otherwise corrected, if original measurements were available in the woman’s medical records (often available for women examined in Uppsala, unlike other study sites). Where original measurements were considered unreasonable or contradictory, the corresponding data was deleted.

In 22 out of total 33 measurement records with incorrect GA, there was no information on GA. In the remaining eleven cases, GA was incorrectly calculated. A total of 267 measurements (0.68%) were outliers, with suspected incorrect entry of measurement values. Of the incorrect values, 166 (62%) were deleted. The remaining 101 incorrect values were corrected based on original measurement data. BPD was the measurement with lowest rate of incorrect values, 0.35%, followed by MAD with 0.41% incorrect values. AC was the measurement with the highest rate of incorrect values, 1.19%, followed by HC with a rate of 0.84%. FL was incorrectly entered in 0.64% of the measurements.

Repeatability and reproducibility

A repeatability and reproducibility study was performed, where five study subjects were examined with repeated ultrasound scans during the same day. Six out of seven sonographers participated. Each study subject was examined two or three times by different sonographers, who assessed all five biometric measurements three times during each scan.

A linear mixed effect model was applied to estimate inter-observer variation. The chosen model accounts for the repeated measures and the differences in biometric measurements due to differences in GA. The model included fixed and random effects for each biometric measurement (BPD, HC, MAD, AC and FL), with a statistical marginal error of 5%.

In addition, the intraclass correlation coefficient (ICC) was assessed by applying a two-way mixed effects model to estimate intra-observer variation. We estimated absolute agreement, which includes systematic and random residual errors, for average measures.

Statistical analysis

Descriptive statistics were used for maternal characteristics at baseline as well as for delivery and neonatal characteristics. An independent samples t-test was performed to evaluate if dating discrepancy was different for girls and boys. The t-test was used after confirming that the data does not violate the test assumptions. A one-way ANOVA was employed to evaluate if dating discrepancy varied between the study sites. An independent samples Mann–Whitney U test was performed to compare median birthweights in subgroups of the cohort.

The biometric measurements were used to create reference ranges for the individual variables (BPD, HC, MAD, HC and FL). The log transformed fetal growth measurements were modelled using a multilevel approach, with fixed and random effects. First, a fractional polynomial regression was performed on the log transformed fetal measurements to identify the best fitting combination of fractional polynomials for the GA. For instance for the fetal BPD, a combination of 0.5 and 3 as the best fitting fractional polynomial powers was identified. The derived parameters were then included in the regression model as fixed effects in a multilevel model to account for repeated measurements for each fetus. We followed the approach used by Ohuma and Altman4 and Johnsen et al.16,17—a two-level hierarchical model was used, considering the measurements (level 1) for each fetus (level 2) at each visit with a random effect for the effect of the identified fractional polynomial of GA and the intercept, similar to the study by Johnsen et al.17. We used the models mentioned above to estimate the expected fetal measurements at each GA in weeks. Thereafter, similar equations as in17 were used to compute the standard deviation (SD) and the percentiles while adjusting for maternal body mass index (BMI), height, parity, county of birth (Nordic or non-Nordic) and fetal sex.

In a sensitivity analysis, where women with abnormal BMI were excluded, we applied the same adjusted statistical models to a subset of the study cohort with BMI 18.5 to 29.9 kg/m2 to estimate the expected fetal biometric measurements at each GA in weeks, and to compute the SD and percentiles. The reference ranges before and after exclusion of women with abnormal BMI were compared using an independent samples t-test for each biometric measurement, for all subjects as well as stratified according to offspring sex.

Statistical analyses were performed using IBM SPSS Statistics version 2.5 and STATA 15.0.

Results

Out of the 684 recruited women, 650 were eligible for the study. During pregnancy, 14 women (2.2%) developed hypertension or pre-eclampsia, and 11 (1.7%) developed diabetes and were hence excluded. Fetoplacental complications, such as placenta previa, placental abruption, single umbilical artery and preterm fetal growth restriction led to exclusion in six cases. One woman had a late miscarriage, one child was stillborn and 26 children were born preterm. Eight women were excluded due to fetal malformation or chromosomal aberration. Thus, the final cohort consisted of 583 women; 275 from Uppsala, 66 from Falun, 98 from Katrineholm, 50 from Västerås and 94 from Örebro.

In total 2590 ultrasound scans were performed during the study. The majority, 526 of 583 included women, were scanned at least four times. In 187 women, all five planned ultrasound scans were performed. The ultrasound examinations following the dating scan were fairly equally distributed, see Fig. 1. There was a peak at week 18–19, corresponding to the routine ultrasound scan, and week 37–39. The dating discrepancy, i.e. the difference between estimated date of delivery (EDD) according to BPD at first study visit and EDD according to last menstrual period, was within ± 7 days, and thereby fulfilled the inclusion criteria. The mean discrepancy was − 0.1 days (SD 2.8 days) and the median discrepancy was 0 days. The dating discrepancy was slightly, but not statistically significantly, larger for girls than boys (p = 0.174); mean − 0.5 days for girls (SD 2.7 days) and 0.2 days for boys (SD 2.7 days), respectively. Further, there was a difference in dating discrepancy between the study sites (p < 0.001), with the lowest discrepancy in Katrineholm (mean 0.1 days, SD 2.4 days) and the largest in Västerås (mean − 1.2 days, SD 2.8 days).

Figure 1
figure 1

Distribution of ultrasound examinations by gestational age.

The median age of the participating women was 29 years. BMI covered a range of 16.7–44.8 kg/m2, with a median BMI of 23.5 kg/m2. The majority of the study population, 92%, were born in a Nordic country (Sweden, Norway, Denmark, Finland or Iceland), and 5.5% were of non-European origin. Nearly 43% of the women were nulliparous. The median pregnancy duration was 281 days. Data on neonatal characteristics, including sex, was available for 574 children (98.5%). The median birthweight was 3625 g and median birth length 51 cm. For children with a mother born in a Nordic country, the median birthweight was 3628 g, compared with 3600 g for children with a mother born in a non-Nordic country; a difference that was not statistically significant (p = 0.258). Likewise, the median birthweight was comparable in children with younger and older mothers; 3660 g for maternal age less than 35 years and 3624.5 g for maternal age 35 years or older, p = 0.908. Nulliparous women gave birth to children with lower median birthweight compared with parous women; 3540 g for nulliparous and 3714 g for parous women, p = 0.008. Maternal and neonatal characteristics are summarized in Table 1.

Table 1 Maternal and neonatal characteristics.

The median and variance of the different biometric measurements for each gestational week are shown in Tables 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11. The variance is expressed in standard deviations (+ 3 SD, + 2 SD, + 1 SD, median, − 1 SD, − 2 SD and − 3 SD) and in percentiles (2.5th, 5th, 10th, 25th, median, 75th, 90th, 95th and 97.5th).

Table 2 shows the median and variance for estimated BPD by GA in SD, and Table 3 BPD by GA in percentiles. Figure 2a shows the raw data with fitted percentiles for estimated BPD by GA.

Table 2 Estimated biparietal diameter (BPD) in mm by gestational age (GA) for males and females, standard deviations (SD).
Table 3 Estimated biparietal diameter (BPD) in mm by gestational age (GA) for males and females, percentiles.
Figure 2
figure 2

Raw data with fitted percentiles (10, 50, 90) for each estimated biometrical measurement in mm by gestational age (GA). (a) Shows biparietal diameter (BPD) by GA (n = 2586), (b) shows head circumference (HC) by GA (n = 2571), (c) shows mean abdominal diameter (MAD) by GA (n = 2585), (d) shows abdominal circumference (AC) by GA (n = 2561) and (e) shows femur length (FL) by GA (n = 2584).

Mean and variance equation for BPD in males and females:

$$E\left( {Z_{i} } \right) = \, - {2}.{53 } + \, \left[ {{2}.{\text{47 log}}\left( {{\text{GA}}_{{\text{i}}} } \right)} \right] \, + \, \left[ { - 0.0{\text{5 GA}}_{{\text{i}}}^{{1}} } \right]$$
$$Var\left( {Zi} \right) = \, 0.0{8 } + \, \left[ {0.0{\text{2 log}}\left( {{\text{GA}}_{{\text{i}}} } \right)^{{2}} } \right] \, + \, \left[ { - 0.0{\text{8 log}}\left( {{\text{GA}}_{{\text{i}}} } \right)} \right] \, + \, \left[ {0.00{\text{3 GA}}_{{\text{i}}}^{{1}} } \right] \, + \, \left[ {-0.000{\text{2 log}}\left( {{\text{GA}}_{{\text{i}}} } \right){\text{GA}}_{{\text{i}}}^{{1}} } \right] \, + \, \left[ {0.0000{\text{4 GA}}_{{\text{i}}}^{{2}} } \right]$$

Table 4 shows the median and variance for estimated HC by GA in SD, and Table 5 HC by GA in percentiles. Figure 2b shows the raw data with fitted percentiles for estimated HC by GA.

Table 4 Estimated head circumference (HC) in mm by gestational age (GA) for males and females, standard deviations (SD).
Table 5 Estimated head circumference (HC) in mm by gestational age (GA) for males and females, percentiles.

Mean and variance equation for HC in males and females:

$$E\left( {Zi} \right) = { 8}.{48 } + \, \left[ { - {14}.{\text{36 GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right] \, + \, \left[ { - 0.000{\text{2 GA}}_{{\text{i}}}^{{2}} } \right]$$
$$Var\left( {Zi} \right) = \, 0.0{2 } + \, \left[ {0.{\text{25 GA}}_{{\text{i}}}^{{ - {1}}} } \right] \, + \, \left[ { - 0.{\text{13 GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right] \, + \, \left[ { - {8}.{\text{24e}} - 0{\text{6 GA}}_{{\text{i}}}^{{2}} } \right] \, + \, \left[ {0.0000{\text{3 GA}}_{{\text{i}}}^{{ - 0.{5}}} {\text{GA}}_{{\text{i}}}^{{2}} } \right] \, + \, \left[ {{1}.{\text{43e}} - 0{\text{9 GA}}_{{\text{i}}}^{{4}} } \right]$$

Table 6 shows the median and variance for estimated MAD by GA in SD, and Table 7 MAD by GA in percentiles. Figure 2c shows the raw data with fitted percentiles for estimated MAD by GA.

Table 6 Estimated mean abdominal diameter (MAD) in mm by gestational age (GA) for males and females, standard deviations (SD).
Table 7 Estimated mean abdominal diameter (MAD) in mm by gestational age (GA) for males and females, percentiles.

Mean and variance equation for MAD in males and females:

$$E\left( {Zi} \right) = { 6}.{71 } + \, \left[ { - {43}.{\text{56GA}}_{{\text{i}}}^{{ - {2}}} } \right] + \left[ { - {12}.{\text{17GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right]$$
$$Var\left( {Zi} \right) = \, 0.0{4 } + \, \left[ {{627}.{\text{53GA}}_{{\text{i}}}^{{ - {4}}} } \right]{ + }\left[ {{9}.0{\text{6GA}}_{{\text{i}}}^{{ - {2}}} } \right] + \left[ { - 0.{5}0{\text{GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right] + \left[ { - {56}.{\text{88GA}}_{{\text{i}}}^{{ - {2}}} {\text{GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right]{ + }\left[ {{1}.{\text{48GA}}_{{\text{i}}}^{{ - {1}}} } \right]$$

Table 8 shows the median and variance for estimated AC by GA in SD, and Table 9 AC by GA in percentiles. Figure 2d shows the raw data with fitted percentiles for estimated AC by GA.

Table 8 Estimated abdominal circumference (AC) in mm by gestational age (GA) for males and females, standard deviations (SD).
Table 9 Estimated abdominal circumference (AC) in mm by gestational age (GA) for males and females, percentiles.

Mean and variance equation for AC in males and females:

$$E(Z_{{\text{i}}} ) \, = { 7}.{8}0 \, + \, \left[ { - {5}0.0{\text{5 GA}}_{{\text{i}}}^{{ - {2}}} } \right] \, + \, \left[ { - {11}.{8}0{\text{ GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right]$$
$$Var(Z_{{\text{i}}} ) \, = \, 0.0{4 } + \, \left[ {{47}0.{\text{55 GA}}_{{\text{i}}}^{{ - {4}}} } \right] \, + \, \left[ {{6}.{\text{86 GA}}_{{\text{i}}}^{{ - {2}}} } \right] \, + \, \left[ { - 0.{\text{39 GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right] \, + \, \left[ { - {42}.{8}0{\text{ GA}}_{{\text{i}}}^{{ - {2}}} {\text{GA}}_{{\text{i}}}^{{ - 0.{5}}} } \right] \, + \, [{1}.{\text{15 GA}}_{{\text{i}}}^{{ - {1}}}]$$

Table 10 shows the median and variance for estimated FL by GA in SD, and Table 11 FL by GA in percentiles. Figure 2e shows the raw data with fitted percentiles for estimated FL by GA.

Table 10 Estimated femur length (FL) in mm by gestational age (GA) for males and females, standard deviations (SD).
Table 11 Estimated femur length (FL) in mm by gestational age (GA) for males and females, percentiles.

Mean and variance equation for FL in males and females:

$$E(Z_{{\text{i}}} ) \, = { 4}.{11 } + \, \left[ { - {344}.{\text{24 GA}}_{{\text{i}}}^{{ - {2}}} } \right] \, + \, \left[ {0.0{\text{1 GA}}_{{\text{i}}}^{{1}} } \right]$$
$$Var(Z_{{\text{i}}} ) \, = \, 0.0{1 } + \, \left[ {{584}.{\text{71 GA}}_{{\text{i}}}^{{ - {4}}} } \right] \, + \, \left[ { - {4}.{\text{46 GA}}_{{\text{i}}}^{{ - {2}}} } \right] \, + \, \left[ { - 0.000{\text{4 GA}}_{{\text{i}}}^{{1}} } \right] \, + \, \left[ {0.{\text{10 GA}}_{{\text{i}}}^{{ - {2}}} {\text{GA}}_{{\text{i}}}^{{1}} } \right] \, + \, \left[ {{4}.{\text{83e}} - 0{\text{6 GA}}_{{\text{i}}}^{{2}} } \right]$$

Supplementary Tables 15 show the median and variance of the different biometric measurements (BPD, HC, MAD, AC and FL) for each gestational week for males and females separately. Supplementary Tables 6–10 show the median and variance of the different biometric measurements for each gestational day. The variance is expressed in standard deviations (+ 3 SD, + 2 SD, + 1 SD, median, − 1 SD, − 2 SD and − 3 SD) and in percentiles (2.5th, 5th, 10th, 25th, median, 75th, 90th, 95th and 97.5th). Moreover, all supplementary tables enclose the full equations of mean and variance for each biometric measurement.

The sensitivity analysis of women with BMI 18.5 to 29.9 kg/m2 showed no statistically significant differences between the reference ranges in the complete study population and the subset of women where underweight and obese women were excluded (p = 0.9906 to 0.999). Supplementary Tables 1115 show the median and variance of the different biometric measurements for each gestational week in the subset of women with BMI 18.5 to 29.9 kg/m2. The variance is expressed in standard deviations (+ 3 SD, + 2 SD, + 1 SD, median, − 1 SD, − 2 SD and − 3 SD) and in percentiles (2.5th, 5th, 10th, 25th, median, 75th, 90th, 95th and 97.5th). Moreover, the supplementary tables enclose the full equations of mean and variance for each biometric measurement.

The GA at examination of the five study subjects included in the reproducibility study varied from 13 + 3 to 41 + 1 gestational weeks. The linear mixed effects model showed overall non-significant inter-observer variation for all five biometric measurements, with p = 0.162 for BPD, p = 0.124 for HC, p = 0.213 for FL, p = 0.087 for MAD and p = 0.166 for AC. The two-way mixed effects model of average measures and absolute agreement showed a very high ICC, with highly consistent measurements for all five biometric measurements; ICC for BPD = 1.000, for HC = 1.000, for FL = 1.000, for MAD = 0.999 and for AC = 0.979.

Discussion

In this cohort of prospectively enrolled, healthy women with low risk of aberrant fetal growth, we have constructed new Swedish reference ranges for normal size and growth of the fetal head, abdomen and femur. We have provided charts for five biometric measurements; BPD, HC, MAD, AC and FL, from gestational week 12 to 42.

Over the years, a large number of studies have presented regional and international charts for fetal size and growth. There is a large variability in study design and statistical modelling methods, as well as in reported percentiles3,4,18. The aim of a fetal growth chart is to describe how fetuses should grow under optimal conditions3. Hence, in concordance with large international studies of fetal size and growth, the present study has only included women with low risk of aberrant fetal growth5,6.

Reliable and population-representative size and growth charts are important in order to correctly evaluate both fetal size and growth, the latter as serial measurements. Altman and Chitty highlight differences in estimating size and growth, and how this affects the choice of appropriate study design18. A cross-sectional design is recommended for evaluating size, with a single measurement on each study subject. Longitudinal studies, on the other hand, comprise repeated measurements of each study subject. Compared with cross-sectional studies, longitudinal studies often use smaller study samples with measurements that are not independent of each other. Unless the repeated measurements are properly addressed, the variation may be underestimated using a longitudinal design. Since the publication of the intrauterine growth charts constructed by Maršál et al. in12, which are presently used in Sweden, statistical methods have been developed and used that take both repeated measurements and increased variation with GA into account. These methods permit the use of a longitudinal design to produce growth charts of size as well as growth intended for clinical practice4,18. A strength of our study is the prospective longitudinal design and the use of modern statistical modelling methods. Hence, the growth charts can be used to evaluate ultrasonically derived fetal biometry, both regarding size and growth. However, these growth charts are not intended for dating of pregnancies, as dating standards require different statistical analyses19. We recommend the use of dating charts that are designed solely for that purpose.

Another strength of our study is the relatively large cohort of healthy women with low-risk pregnancies recruited specifically for this study with an even spread of the examinations across the included GAs. In order to increase the reproducibility and decrease the measurement error, a limited number of experienced sonographers conducted the ultrasound scans following the biometric measurement recommendations that are in use in Sweden. The use of triplicate measurements of each biometric assessment at each ultrasound scan further reduces measurement error. The reproducibility study showed a low grade of inter-observer variability. However, the low number of study subjects is a limitation. Accordingly, the reliability of the reproducibility study cannot be assessed as high. Lastly, strict criteria for exclusion due to increased risk of aberrant growth have been applied throughout the study.

A valid estimation of GA is considered crucial for developing reliable growth reference ranges3. The used method with regular menstrual cycles where estimated date of delivery (EDD) according to last menstrual period is consistent with first trimester ultrasound dating provides a reliable dating method3,4,20,21,22. The median discrepancy of 0 days in our study indicates concordant dating between EDD according to BPD and EDD according to last menstrual period. The mean dating discrepancy was larger for girls than for boys. The dating discrepancy is in line with the findings of earlier studies that have examined discrepancy in dating using last menstrual period and ultrasound23,24. We used BPD in gestational week 12 + 3 – 13 + 6 to date the pregnancies. Swedish as well as international guidelines recommend dating with ultrasound during the first trimester, as this appears to be the most reliable method for pregnancy dating14,25. The Swedish guidelines recommend the use of crown rump length (CRL) in early pregnancy, and BPD from 21 mm (corresponding to week 12 + 3). Adherance to the recommendations of using CRL for dating in early pregnancy is however low in Sweden26. Since many Swedish sonographers are not experienced in measuring CRL, we chose to date all pregnancies with BPD in order to avoid different dating methods. Further, the equation used for dating with BPD was derived with CRL as reference for “true” GA, and later the equation was validated as well performing with low systematic and random error27,28. However, first trimester dating with BPD predicts the GA and duration of pregnancy equally well as CRL, and the choice of future dating method should therefore not affect the applicability of our growth charts27,29. Variations in early growth might have an impact on the estimated GA when first trimester ultrasound dating is used rather than last menstrual period. This implies that there is a risk that a systematic bias caused by measurement error is introduced. The potential effect of inaccurate GA assessment due to natural variation in fetal growth should however be small, as the dating discrepancy was very small.

A limitation of the study is the predominance (92.1%) of women born in Sweden or another Nordic country. This figure is high compared with the Swedish pregnant population, where 69.5% of all women giving birth in Sweden in 2018 were born in a Nordic country13. Some selection bias was unavoidable, as the written information to potential study subjects that was handed out during the recruitment process was solely available in Swedish, English and Arabic. Efforts were made to recruit women of various ethnicities and social backgrounds, by recruiting women in primary care units with a high rate of immigrants as well as in units with mainly Swedish born women.

In order to achieve a study population representative to the Swedish pregnant population, women of low as well as high BMI were included in the cohort, despite the potential effect of abnormal BMI on intrauterine growth. Since only healthy women were included, the risk of poor intrauterine growth due to malnutrition should be low. Even though the median BMI was normal, the upper interquartile range included women with overweight, indicating that a significant part of the study population were overweight. Women with obesity were not only screened with repeated random plasma glucose, but also with oral glucose tolerance test for gestational diabetes. All women who developed gestational diabetes were excluded, and solely women with normal plasma glucose and glucose tolerance fulfilled the study. It cannot be ruled out that increased fetal growth due to other factors than gestational diabetes in obese women might affect the results towards an overestimation of normal fetal size. In the sensitivity analysis, where women with BMI 18.5 to 29.9 kg/m2 were compared with the complete study population, only small differences were observed between the groups for all biometric measurements. These differences should not be of any clinical significance, and neither were there any statistically significant differences between the reference ranges if all women were included or not. Hence, including the subjects with extreme BMI values should not bias the results. Moreover, the aim of the study was to provide reference ranges in a study population of healthy women representative to the Swedish pregnant population. Maternal age and BMI in the study population were similar to the mean age (30.4 years) and BMI (25.2 kg/m2) of pregnant women in Sweden 201713. Hence, the results of the complete study population can be regarded as generalizable for estimation of fetal size and growth in the current Swedish pregnant population.

Compared with the growth charts presently used in Sweden, our new reference ranges are derived from an almost seven times larger study population12. Moreover, the study population in Maršál’s study comprises 24% smokers. Considering the potential growth retarding effect of maternal smoking, their study population does not represent a low-risk population with expected normal fetal growth30,31. Methodological considerations, such as the nowadays outdated cross-sectional analytic methods of a longitudinal study in the Maršál study, and changes in the Swedish pregnant population, motivates a change into updated reference ranges for fetal size and growth. Moreover, the corresponding Norwegian growth charts, which are based on a methodology similar to ours, are not entirely applicable to the Swedish setting, partly due to differences in demographics and birthweights, but most importantly due to differences in recommendations for how to perform the ultrasonic BPD measurements. The Norwegian reference ranges for biometry are calculated using the calipers placed on the outer margins of both the proximal and the distal parietal bone16.

During the last few years, large international projects have produced growth standards intended for universal use, with the assumption that differences in fetal growth and birthweights are caused by suboptimal environment rather than inherent differences in the populations5,6,32. Others have found evidence supporting that physiological differences rather than pathology explain the differences in size and growth between populations8,33,34,35. Applying international standards would in such a case possibly misclassify a large proportion of fetuses as either SGA, AGA or LGA10. There is an ongoing debate concerning the need of national standards for fetal size and growth. It is interesting to note that even though the INTERGROWTH-21st project showed high degree of likeness between study sites, the WHO Multicentre Growth Reference Study reported significant differences in fetal growth in different settings5,6. Even though both studies are of high quality with large study populations, recent studies have presented evidence that questions the use of a single international standard that represents ideal growth in all populations8,9,10,35. Similar conclusions were drawn by the authors of the WHO study, who recommend that if international charts are used, their performance should be tested in the local setting to assess if adjustments are needed5. Bearing these concerns in mind, we believe that there is a need for updated national reference ranges of fetal size and growth for everyday clinical practice. Moreover, further studies are needed to evaluate proper cut-offs for the updated reference ranges in order to identify fetuses with increased risk of adverse perinatal outcome.

In conclusion, as regional differences in fetal size and growth seem to be of clinical importance, this prospective longitudinal study of normal fetal growth in a healthy Swedish population provides new national reference ranges for fetal size and growth from gestational week 12 to 42.

Research involving human participants

The study was approved by the Regional Ethical Review Board in Uppsala (no. 2014/209 and 2014/209/2). All procedures involving human subjects were carried out in accordance with the ethical standards of the 1964 Helsinki declaration. All women participated voluntarily and gave their informed consent. Any pregnancy complication recognized during the study was reported to the routine obstetric care at each study site and managed according to clinical practice.