## Abstract

Gene expression is inherently noisy, posing a challenge to understanding how precise and reproducible patterns of gene expression emerge in mammals. Here we investigate this phenomenon using gastruloids, a three-dimensional in vitro model for early mammalian development. Our study reveals intrinsic reproducibility in the self-organization of gastruloids, encompassing growth dynamics and gene expression patterns. We observe a remarkable degree of control over gene expression along the main body axis, with pattern boundaries positioned with single-cell precision. Furthermore, as gastruloids grow, both their physical proportions and gene expression patterns scale proportionally with system size. Notably, these properties emerge spontaneously in self-organizing cell aggregates, distinct from many in vivo systems constrained by fixed boundary conditions. Our findings shed light on the intricacies of developmental precision, reproducibility and size scaling within a mammalian system, suggesting that these phenomena might constitute fundamental features of multicellularity.

## Data availability

Processed immunofluorescence staining data are available as maximum projection images for individual gastruloids, organized by figure number. All images have been deposited in the Zenodo repository at https://doi.org/10.5281/zenodo.8108188. Raw images are available upon request.

## Code availability

Custom Python-based analysis code for data processing is available at the GitLab repository (https://gitlab.pasteur.fr/tglab/gastruloids_precisionandscaling).

## References

1. Conklin, E. Organ-forming substances in the eggs of ascidians. *Biol. Bull.* **8**, 205–230 (1905).
2. Kirschner, M. & Gerhart, J. *Cells, Embryos and Evolution* (Blackwell Science, 1997).
3. Houchmandzadeh, B., Wieschaus, E. & Leibler, S. Establishment of developmental precision and proportions in the early *Drosophila* embryo. *Nature* **415**, 798–802 (2002).
4. Arias, A. M. & Hayward, P. Filtering transcriptional noise during development: concepts and mechanisms. *Nat. Rev. Genet.* **7**, 34–44 (2006).
5. Briscoe, J. & Small, S. Morphogen rules: design principles of gradient-mediated embryo patterning. *Development* **142**, 3996–4009 (2015).
6. Sulston, J. E., Schierenberg, E., White, J. G. & Thomson, J. N. The embryonic cell lineage of the nematode *Caenorhabditis elegans*. *Dev. Biol.* **100**, 64–119 (1983).
7. Bollenbach, T. et al. Precision of the Dpp gradient. *Development* **135**, 1137–1146 (2008).
8. Bier, E. & De Robertis, E. M. Embryo development. BMP gradients: a paradigm for morphogen-mediated developmental patterning. *Science* **348**, aaa5838 (2015).
9. Bentovim, L., Harden, T. T. & DePace, A. H. Transcriptional precision and accuracy in development: from measurements to models and mechanisms. *Development* https://doi.org/10.1242/dev.146563 (2017).
10. Zagorski, M. et al. Decoding of position in the developing neural tube from antiparallel morphogen gradients. *Science* **356**, 1379–1383 (2017).
11. Guignard, L. et al. Contact area-dependent cell communication and the morphological invariance of ascidian embryogenesis. *Science* **369**, eaar5663 (2020).
12. Waddington, C. H. Canalization of development and the inheritance of acquired characters. *Nature* **150**, 563–565 (1942).
13. Kicheva, A. et al. Coordination of progenitor specification and growth in mouse and chick spinal cord. *Science* **345**, 1254927 (2014).
14. Tsai, T. Y.-C. et al. An adhesion code ensures robust pattern formation during tissue morphogenesis. *Science* **370**, 113–116 (2020).
15. Petkova, M. D., Little, S. C., Liu, F. & Gregor, T. Maternal origins of developmental reproducibility. *Curr. Biol.* **24**, 1283–1288 (2014).
16. Driever, W. & Nüsslein-Volhard, C. The bicoid protein determines position in the *Drosophila* embryo in a concentration-dependent manner. *Cell* **54**, 138–143 (1988).
17. Petkova, M. D., Tkacik, G., Bialek, W., Wieschaus, E. F. & Gregor, T. Optimal decoding of cellular identities in a genetic network. *Cell* **176**, 844–855.e15 (2019).
18. Gregor, T., Tank, D. W., Wieschaus, E. F. & Bialek, W. Probing the limits to positional information. *Cell* **130**, 153–164 (2007).
19. Dubuis, J. O., Tkacik, G., Wieschaus, E. F., Gregor, T. & Bialek, W. Positional information, in bits. *Proc. Natl Acad. Sci. USA* **110**, 16301–16308 (2013).
20. Lacalli, T. C. Patterning, from conifers to consciousness: Turing’s theory and order from fluctuations. *Front. Cell Dev. Biol.* **10**, 871950 (2022).
21. Nikolić, M. et al. Scale invariance in early embryonic development. Preprint at https://doi.org/10.48550/arXiv.2312.17684 (2023).
22. Ishimatsu, K. et al. Size-reduced embryos reveal a gradient scaling based mechanism for zebrafish somite formation. *Development* https://doi.org/10.1242/dev.161257 (2018).
23. Uygur, A. et al. Scaling pattern to variations in size during development of the vertebrate neural tube. *Dev. Cell* **37**, 127–135 (2016).
24. Almuedo-Castillo, M. et al. Scale-invariant patterning by size-dependent inhibition of nodal signalling. *Nat. Cell Biol.* **20**, 1032–1042 (2018).
25. Leibovich, A., Edri, T., Klein, S. L., Moody, S. A. & Fainsod, A. Natural size variation among embryos leads to the corresponding scaling in gene expression. *Dev. Biol.* **462**, 165–179 (2020).
26. Al Asafen, H. et al. Robustness of the dorsal morphogen gradient with respect to morphogen dosage. *PLoS Comput. Biol.* **16**, e1007750 (2020).
27. Cheung, D., Miles, C., Kreitman, M. & Ma, J. Scaling of the Bicoid morphogen gradient by a volume-dependent production rate. *Development* **138**, 2741–2749 (2011).
28. Ben-Zvi, D., Shilo, B.-Z. & Barkai, N. Scaling of morphogen gradients. *Curr. Opin. Genet. Dev.* **21**, 704–710 (2011).
29. Huang, A., Rupprecht, J.-F. & Saunders, T. E. Embryonic geometry underlies phenotypic variation in decanalized conditions. *eLife* **9**, e47380 (2020).
30. Romanova-Michaelides, M. et al. Morphogen gradient scaling by recycling of intracellular Dpp. *Nature* **602**, 287–293 (2022).
31. Saiz, N. & Hadjantonakis, A.-K. Coordination between patterning and morphogenesis ensures robustness during mouse development. *Philos. Trans. R. Soc. B* **375**, 20190562 (2020).
32. Stückemann, T. et al. Antagonistic self-organizing patterning systems control maintenance and regeneration of the anteroposterior axis in planarians. *Dev. Cell* **40**, 248–263.e4 (2017).
33. Gritti, N., Oriola, D. & Trivedi, V. Rethinking embryology in vitro: a synergy between engineering, data science and theory. *Dev. Biol.* **474**, 48–61 (2021).
34. Rosado-Olivieri, E. A. & Brivanlou, A. H. Synthetic by design: exploiting tissue self-organization to explore early human embryology. *Dev. Biol.* **474**, 16–21 (2021).
35. van den Brink, S. C. et al. Symmetry breaking, germ layer specification and axial organisation in aggregates of mouse embryonic stem cells. *Development* https://doi.org/10.1242/dev.113001 (2014).
36. Beccari, L. et al. Multi-axial self-organization properties of mouse embryonic stem cells into gastruloids. *Nature* **562**, 272–276 (2018).
37. Hashmi, A. et al. Cell-state transitions and collective cell movement generate an endoderm-like region in gastruloids. *eLife* **11**, e59371 (2022).
38. Underhill, E. J. & Toettcher, J. E. Control of gastruloid patterning and morphogenesis by the Erk and Akt signaling pathways. *Development* **150**, dev201663 (2023).
39. Fu, J., Warmflash, A. & Lutolf, M. P. Stem-cell-based embryo models for fundamental research and translation. *Nat. Mater.* **20**, 132–144 (2021).
40. Beccari, L. et al. Generating gastruloids from mouse embryonic stem cells. *Protoc. Exch.* https://doi.org/10.1038/protex.2018.094 (2018).
41. Snow, M. H. & Tam, P. P. Is compensatory growth a complicating factor in mouse teratology? *Nature* **279**, 555–557 (1979).
42. Lewis, N. E. & Rossant, J. Mechanism of size regulation in mouse embryo aggregates. *J. Embryol. Exp. Morphol.* **72**, 169–181 (1982).
43. Rands, G. F. Size regulation in the mouse embryo. II. The development of half embryos. *J. Embryol. Exp. Morphol.* **98**, 209–217 (1986).
44. Mittnenzweig, M. et al. A single-embryo, single-cell time-resolved model for mouse gastrulation. *Cell* **184**, 2825–2842.e22 (2021).
45. Neijts, R., Simmini, S., Giuliani, F., van Rooijen, C. & Deschamps, J. Region-specific regulation of posterior axial elongation during vertebrate embryogenesis. *Dev. Dyn.* **243**, 88–98 (2014).
46. Amin, S. et al. Cdx and T brachyury co-activate growth signaling in the embryonic axial progenitor niche. *Cell Rep.* **17**, 3165–3177 (2016).
47. Blassberg, R. et al. Sox2 levels regulate the chromatin occupancy of Wnt mediators in epiblast progenitors responsible for vertebrate body formation. *Nat. Cell Biol.* **24**, 633–644 (2022).
48. Elowitz, M. B., Levine, A. J., Siggia, E. D. & Swain, P. S. Stochastic gene expression in a single cell. *Science* **297**, 1183–1186 (2002).
49. Raser, J. M. & O’Shea, E. K. Control of stochasticity in eukaryotic gene expression. *Science* **304**, 1811–1814 (2004).
50. Carolina de Souza-Guerreiro, T., Meng, X., Dacheux, E., Firczuk, H. & McCarthy, J. Translational control of gene expression noise and its relationship to ageing in yeast. *FEBS J.* **288**, 2278–2293 (2021).
51. Dubuis, J. O., Samanta, R. & Gregor, T. Accurate measurements of dynamics and reproducibility in small genetic networks. *Mol. Syst. Biol.* **9**, 639 (2013).
52. Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. *Nat. Methods* **18**, 100–106 (2021).
53. Moore, J. L., Du, Z. & Bao, Z. Systematic quantification of developmental phenotypes at single-cell resolution during embryogenesis. *Development* **140**, 3266–3274 (2013).
54. Werner, S. et al. Scaling and regeneration of self-organized patterns. *Phys. Rev. Lett.* **114**, 138101 (2015).
55. Turing, A. M. The chemical basis of morphogenesis. *Philos. Trans. R. Soc. Lond. B* **237**, 37–71 (1952).
56. Endy, D. Foundations for engineering biology. *Nature* **438**, 449–453 (2005).
57. Stanton, B. C. et al. Genomic mining of prokaryotic repressors for orthogonal logic gates. *Nat. Chem. Biol.* **10**, 99–105 (2014).
58. Clevers, H. Modeling development and disease with organoids. *Cell* **165**, 1586–1597 (2016).
59. Fatehullah, A., Tan, S. H. & Barker, N. Organoids as an in vitro model of human development and disease. *Nat. Cell Biol.* **18**, 246–254 (2016).
60. Lancaster, M. A. & Knoblich, J. A. Organogenesis in a dish: modeling development and disease using organoid technologies. *Science* **345**, 1247125 (2014).
61. Shariati, L., Esmaeili, Y., Haghjooy Javanmard, S., Bidram, E. & Amini, A. Organoid technology: current standing and future perspectives. *Stem Cells* **39**, 1625–1649 (2021).
62. Veenvliet, J. V., Lenne, P.-F., Turner, D. A., Nachman, I. & Trivedi, V. Sculpting with stem cells: how models of embryo development take shape. *Development* **148**, dev192914 (2021).
63. Rossi, G., Manfrin, A. & Lutolf, M. P. Progress and potential in organoid research. *Nat. Rev. Genet.* **19**, 671–687 (2018).
64. Jensen, K. B. & Little, M. H. Organoids are not organs: sources of variation and misinformation in organoid biology. *Stem Cell Rep.* **18**, 1255–1270 (2023).
65. van den Brink, S. C. et al. Single-cell and spatial transcriptomics reveal somitogenesis in gastruloids. *Nature* https://doi.org/10.1038/s41586-020-2024-3 (2020).
66. Mansoury, M., Hamed, M., Karmustaji, R., Al Hannan, F. & Safrany, S. T. The edge effect: a global problem. The trouble with culturing cells in 96-well plates. *Biochem. Biophys. Rep.* **26**, 100987 (2021).
67. Tkačik, G., Dubuis, J., Petkova, M. & Gregor, T. Positional information, positional error, and readout precision in morphogenesis: a mathematical framework. *Genetics* **199**, 39 (2015).

## Acknowledgements

We thank I. Bennabi, D. Brückner, M. Cerminara, M. Cohen-Tannoudji, P. Hansen, M. Nikolić, C. Mirdas, J. Pineau, J. Wong-Ng, B. Zoller and the late R. Neijts. This work was supported by Institut Pasteur (particularly the cytometry platform), Centre National de la Recherche Scientifique, CFM Foundation for Research and the French National Research Agency (ANR-10-LABX-73 ‘Revive’, ANR-16-CONV-0005 ‘Inception’, ANR-20-CE12-0028 ‘ChroDynE’ and ANR-23-CE13-0021 ‘GastruCyp’).

## Author information

### Authors and Affiliations

### Contributions

M.M., L.F., C.C., A.S. and T.G. designed experiments. M.M., L.F., C.C. and A.S. developed experimental protocols. M.M., L.F. and C.C. performed experiments. M.M. and L.F. performed computational image analysis. M.M., L.F. and T.G. wrote the manuscript. T.G. secured funding and supervised the work.

### Corresponding author

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Peer review

### Peer review information

*Nature Structural & Molecular Biology* thanks Timothy Saunders and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Carolina Perdigoto, in collaboration with the *Nature Structural & Molecular Biology* team. Peer reviewer reports are available.

## Additional information

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data

### Extended Data Fig. 1 Experimental detail, protocols, and image analysis.

**a**: Gastruloid protocol as described before with a Chi-pulse on day three^{40}. Initial seeding is done either by manual multi-pipetting or by fluorescence-activated cell sorting (FACS)^{65}, which yield different variability in the initial number of seeded cells \({N}_{0}\): 10% versus 2%, respectively. Blue arrows indicate addition of Chiron and change of medium. **b**: All gastruloids grown in outer wells are discarded to increase reproducibility, an empirical choice based largely on the different behaviors of gastruloids grown in inner versus outer wells^{66}. **c**: Image analysis steps include the definition of a smooth contour (I), drawing the midline (II), and slicing along this midline using an equidistant positioning of two sets of equal-number points on each side of the contour (III). For III, the points in the left half (light blue) and in the right half (dark blue) are each equidistant along the contour. Gastruloid volume is reconstructed by assuming each slice is rotationally symmetric (that is, a truncated cone). Scalebar is 100 μm. **d**: Gastruloids imaged with brightfield microscopy. Gastruloid elongation efficiency is 97% for multi-pipetting and 99% for FACS seeding, for \(\overline{N}_0=300\). The remaining gastruloids have multiple poles (for example, red framed image). Scalebar is 100 μm. **e**: Schematic of the protocol to measure the volume and cell count of individual gastruloids. Brightfield images of gastruloids are acquired before chemical dissociation, left; fluorescent images of *all* individual cells composing the gastruloid are acquired after dissociation using confocal microscopy (see Methods).
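
As a minimal sketch of the volume reconstruction in panel **c**, the snippet below sums truncated-cone (frustum) volumes along the midline. The function names, the use of contour-to-contour widths as input and the constant slice spacing are assumptions made for illustration; they are not taken from the authors' analysis code, in which slices are defined by equidistant points along the contour.

```python
import math

def frustum_volume(r1, r2, h):
    """Volume of a truncated cone with end radii r1, r2 and height h."""
    return math.pi * h * (r1 ** 2 + r1 * r2 + r2 ** 2) / 3.0

def gastruloid_volume(widths_um, spacing_um):
    """Sum truncated-cone volumes along the midline.

    widths_um:  contour-to-contour width at each cut (hypothetical input).
    spacing_um: distance between consecutive cuts, assumed constant here.
    """
    radii = [w / 2.0 for w in widths_um]
    # Each pair of consecutive cuts bounds one rotationally symmetric slice.
    return sum(frustum_volume(r1, r2, spacing_um)
               for r1, r2 in zip(radii, radii[1:]))
```

For a cylinder of constant width the sum reduces to \(\pi r^{2}\) times the total length, which is a convenient sanity check.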

### Extended Data Fig. 2 Growth reproducibility and size scaling.

**a**: Gastruloid volume as a function of time. Volumes are obtained from 2D reconstruction in Extended Data Fig. 1c. Curves shown for 23 gastruloids (subset of Fig. 1a) followed over time individually (blue) and mean (black). Percent variation around the mean is reported for each time point. **b**: Exponential growth of the total number of cells in individual gastruloids (same as in **a**). The total cell count \(N\) shown in log-scale as a function of time \(t\) is obtained from the proportionality between \(V\) and \(N\) (Fig. 1c and Extended Data Fig. 7). Exponential growth (see Methods) is assumed for each individual growth curve (in grey) to extract the effective doubling time \({t}_{\text{D}}\) for each gastruloid (via linear fitting). Red line corresponds to exponential growth with mean effective doubling time \({t}_{\text{D}}=26.4\pm 1.7\) h. Red shaded area was computed from error propagation (Methods). **c**: Gastruloid volumes correlate with \({N}_{0}\) at all time points. Scatter plot of individual gastruloid volumes from **a** at different time points versus \({N}_{0}\), measured just after seeding, overlaid by a linear regression fit. The correlation coefficient for each fit is reported on the right y-axis. **d**: Scatter plot of mean gastruloid volume at different time points versus \(\overline{N}_0\), measured just after seeding, overlaid by a linear regression fit. These are the same gastruloids shown in Fig. 1b. Right y-axis shows the correlation between volume and \(\overline{N}_0\) for different time points (color). When scanning a large range of average \({\overline{N}}_{0}\) (\(50\le {\overline{N}}_{0}\le 1100\)), the correlations increase significantly. **e**: Effective doubling time \({t}_{\text{D}}\) as a function of \(\overline{N}_0\). The effective doubling time is obtained by fitting growth curves of the number of cells by an exponential growth model (see Methods).
For round markers, \({t}_{\text{D}}\) is extracted from cell counts measured directly by chemical dissociation. For triangle markers, cell counts are obtained from volume measurements using the relationship in Fig. 1c. Red markers correspond to the individual gastruloids in Fig. 1a; purple markers correspond to averaged data in Fig. 1b; blue markers to the inset in Fig. 1b. Average effective doubling time for gastruloids seeded with \(150\le {\overline{N}}_{0}\le 1100\) is \({\overline{t}}_{{\rm{D}}}=27.6\pm 2.6\,{\rm{h}}\) (mean as blue dashed line; light blue area standard deviation). **f**: Evolution of average midline length per experiment over three years (2020–2023) for gastruloids with \({\overline{N}}_{0}=300\) at 120 h. Downward triangles are average midline lengths of experiments seeded by multi-pipetting; upward triangles are average midline lengths of experiments seeded using FACS. Error bars are standard deviations across individual samples. The blue line represents the overall average across all experiments with blue shaded area as the standard deviation: \(\bar{L}=590\pm 102\) μm (17%, \(n=30\)). Inset shows the corresponding evolution of the variability of the mean gastruloid midline length per experiment. Intra-experiment variability in length is on average \(\langle {\sigma }_{L}/L\rangle =9.4\pm 2.7 \%\) (\(n=30\)). Over three years, both the gastruloid midline length and its variability are highly consistent. See Tables S6 and S7 for sample numbers. **g**: Average cell count \(\overline{N}\left(t\right)/{\overline{N}}_{300}\left(t\right)\) as a function of the initial average seed cell count \({\overline{N}}_{0}/300\) in units of the average reference seed cell count \({\overline{N}}_{0}=300\). Five panels correspond to gastruloid ages at 1 through 5 days (also encoded by color). Black diagonal (slope = 1) represents perfect scaling (see main text) of gastruloid size at time \(t\) upon changes in \({\overline{N}}_{0}\) ranging over \(50\le {\overline{N}}_{0}\le 1100\).
For each time point, using a simple exponential growth model, the dashed lines estimate the bounds on the expected deviations from perfect scaling due to fluctuations in both \({\overline{N}}_{0}/300\) and in the doubling time \({t}_{\text{D}}\) (**e** and Methods). Insets show deviations \(D\) from perfect scaling: \(D=\frac{\overline{N}\left(t\right)/{\overline{N}}_{300}\left(t\right)}{{\overline{N}}_{0}/300}\), as a function of the initial average seed cell count \({\overline{N}}_{0}/300\) in units of the average reference seed cell count \({\overline{N}}_{0}=300\). Black horizontal line represents perfect scaling and the dashed lines show expected deviations from error propagation. Statistics as in Fig. 1d. All error bars are standard deviations.
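
The doubling-time extraction in panel **b** and the deviation \(D\) in panel **g** can be sketched as follows. This is an illustrative reading of the caption, assuming a least-squares line through \(\log_2 N\) versus \(t\); the exact fitting procedure is described in the Methods, and the function names here are hypothetical.

```python
import numpy as np

def doubling_time(t_hours, cell_counts):
    """Fit log2(N) = log2(N0) + t / t_D and return (t_D, fitted N0)."""
    t = np.asarray(t_hours, dtype=float)
    log2_n = np.log2(np.asarray(cell_counts, dtype=float))
    slope, intercept = np.polyfit(t, log2_n, 1)  # straight-line fit
    return 1.0 / slope, 2.0 ** intercept

def scaling_deviation(n_t, n300_t, n0):
    """D = [N(t) / N_300(t)] / [N0 / 300]; D = 1 means perfect scaling."""
    return (n_t / n300_t) / (n0 / 300.0)
```

For example, a gastruloid seeded with 100 cells that contains one third as many cells as the 300-cell reference at the same time point gives \(D=1\).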

### Extended Data Fig. 3 Immunofluorescence image analysis.

**a**: Fixed gastruloids are imaged by confocal microscopy in z-stacks of 150 μm (30 slices, dz = 5 μm). **b**: Analysis pipeline of Extended Data Fig. 1c is applied to the DAPI channel for each gastruloid to extract midline, contour, and equidistant slices. Fluorescence intensities of the other channels are max projected (here illustrated with SOX2 (green) and CDX2 (red)) and intensities of individual slices are integrated to obtain a single value per slice and to construct one-dimensional expression profiles as a function of slice position along the midline. Scalebar is 100 μm. **c**: One-dimensional profiles of SOX2 (green) and CDX2 (red) along the midline obtained for the gastruloid in **b**. **d**: Visual comparison of mean (left) versus maximum (right) projection of a gastruloid stained for SOX2 (green) and CDX2 (red). Scalebar is 100 μm. **e**: Quantitative comparison of maximum (x-axis) versus mean (y-axis) projection of intensities for the four examined genes in individual gastruloids from Fig. 2 (\(n=\){44, 44, 48, 46} respectively for SOX2, CDX2, BRA and FOXC1). Color code corresponds to the position of each slice along the midline (yellow towards the anterior pole, gray-blue towards the posterior pole). **f**: Mean profiles of expression of the four genes as a function of relative position \(x/L\) using either maximum (black) or mean (gray) projection. **g**: Variability as a function of the relative position \(x/L\) along the midline of each set of gastruloids for the four genes. Gray and black lines correspond to the variability computed respectively from either mean or maximum projections. Measured variability is lower when using maximum projection. **h**: Visual comparison of gastruloid slicing methods, straight lines (left, yellow) versus curved lines (right, pink); immunostained gastruloid stained for SOX2 (green) and CDX2 (red). 
Straight lines are line segments calculated between the equidistant points along both sides of the contour as in Extended Data Fig. 1c. Curved lines are obtained using both equidistant points along the contour and along the midline. From this combination of points, a parabolic equation is calculated using a second-order polynomial fit. This procedure is meant to recapitulate the overall curvature of the gastruloid. **i**: Quantitative comparison of intensities using straight (x-axis) versus curved (y-axis) line slicing for the four examined genes in individual gastruloids from Fig. 2 (\(n=\) {44, 44, 48, 46} for SOX2, CDX2, BRA and FOXC1, respectively). Color code corresponds to the position of each slice along the midline (yellow towards the anterior pole, gray-blue towards the posterior pole). **j**: Mean profiles of the four stained sets of gastruloids from Fig. 2 as a function of relative position \(x/L\) using either straight (yellow) or curved (pink) line slicing. **k**: Variability as a function of the relative position \(x/L\) along the midline of each set of gastruloids for the four genes. Yellow and purple lines correspond to straight and curved line slicing, respectively. The curved lines method diminishes border effects on the profiles of the four genes (mean and variability). No significant change is observed over most of the gastruloid midline, making both methods essentially equivalent. For computational simplicity, we employ the straight lines method. All profiles are represented for \(0.1\le x/L\le 0.9\) in the rest of the paper.
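
The projection and profile steps of panels **b**, **d** and **g** can be illustrated with a short NumPy sketch. The array layout (z, y, x), the integer label image that assigns each pixel to a slice, and the use of the sample standard deviation are assumptions made for this example rather than details confirmed by the caption.

```python
import numpy as np

def project(stack, mode="max"):
    """Collapse a (z, y, x) stack along z by maximum or mean projection."""
    stack = np.asarray(stack, dtype=float)
    if mode == "max":
        return stack.max(axis=0)
    if mode == "mean":
        return stack.mean(axis=0)
    raise ValueError("mode must be 'max' or 'mean'")

def slice_profile(image, slice_labels, n_slices):
    """Integrate pixel intensities within each slice along the midline.

    slice_labels: hypothetical label image, -1 outside the gastruloid and
    0 .. n_slices - 1 inside, one label per slice.
    """
    image = np.asarray(image, dtype=float)
    labels = np.asarray(slice_labels).astype(int)
    inside = labels >= 0
    return np.bincount(labels[inside], weights=image[inside],
                       minlength=n_slices)

def variability(profiles):
    """sigma_I / mean_I at each position; rows are individual gastruloids."""
    p = np.asarray(profiles, dtype=float)
    return p.std(axis=0, ddof=1) / p.mean(axis=0)
```

Running `variability` on profiles built from each projection mode gives the kind of comparison shown in panel **g**.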

### Extended Data Fig. 4 Immunofluorescence background and measurement error estimation.

**a**: Gastruloids dual-labeled by immunofluorescence for SOX2 and CDX2 expression using the regular protocol, as in Fig. 2; the bottom gastruloid lacks the primary antibodies, to determine the background noise due to non-specific interactions of the secondary antibodies, which are estimated to be the dominant source of background noise in the staining and imaging procedures^{51}. **b**: Individual (\(n=10\), light color) and mean profiles (bold) for SOX2 (left, green) and CDX2 (right, red) labeled including primary antibodies. Gray dashed line is the background estimation from **c**; black dashed line is the background calculated from the raw profiles as the mean intensity level in the 10% region of lowest expression (\({I}_{\min }\)). These two dashed lines coincide in the case of the SOX2 profiles, confirming that the control experiment is a good estimate of the background. **c**: Control experiment without primary antibodies; individual (\(n=10\), light color) and mean profiles (bold) for SOX2 (left, green) and CDX2 (right, red). **d**: Comparison of the variability (\({\sigma }_{I}/\bar{I}\)) using either the raw mean profile (bold color), the control-corrected profile (bold grey) or the \({I}_{\min }\)-corrected profile (bold black). **e**: Four single gastruloids immunofluorescently stained for SOX2 and CDX2. Gastruloids are mounted in PBS medium, rotated manually via flushing between exposures (\(n=\) 7–11 times) and imaged from different view angles. Images are categorized by two orientations of the gastruloid view angle: a ‘side view’ (left column) and a ‘backside view’ (right column). The preferential orientation is determined by the gastruloid shape and is different from gastruloid to gastruloid. Scalebar is 100 μm. **f**: Mean profiles of SOX2 (green) and CDX2 (red) expression for the four gastruloids in **e** gathered by category: side view (dashed line) and backside view (dotted line).
Panels in each row correspond to four experiments with a different individual gastruloid (\(n=\) {11, 9, 7, 11} images, respectively). Shaded areas are standard errors in all graphs. **g**: Variance of mean profiles in the four gastruloids due to specimen rotation for SOX2 (black, top) and CDX2 (black, bottom) calculated by bootstrapping the data in **f**. This variance is compared to the total variance (SOX2: green, top; CDX2: red, bottom) of \(n=88\) gastruloids. The rotation-induced variance represents less than 10% of the total variance. **h**: Mean expression profiles (top) and variability (bottom) for SOX2 (green) and CDX2 (red) of the \(n=88\) gastruloids from **g** classified according to their orientation (that is, determine AP orientation using SOX2 expression, determine straight versus crescent orientation, determine L/R orientation for crescent shapes; line style as in **f**). Black lines are the mean profile and standard deviation of gene expression in the total population. This classification based on specimen rotation has minimal effect on the values of mean expression or variability.
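
One way to read the bootstrap estimate in panel **g** is sketched below: the repeated images of a single gastruloid are resampled with replacement, and the variance of the resulting mean profile is taken at each position. The caption does not specify the resampling scheme, so the function name, the number of bootstrap draws and the fixed seed are assumptions.

```python
import numpy as np

def bootstrap_variance_of_mean(profiles, n_boot=1000, seed=0):
    """Variance of the mean profile under resampling of images.

    profiles: array of shape (n_images, n_positions), one row per
    image of the same gastruloid taken at a different rotation.
    """
    p = np.asarray(profiles, dtype=float)
    rng = np.random.default_rng(seed)
    n = p.shape[0]
    # Draw n_boot resamples of n image indices, with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = p[idx].mean(axis=1)  # shape (n_boot, n_positions)
    return boot_means.var(axis=0, ddof=1)
```

Dividing this rotation-induced variance by the variance across the full population, position by position, gives the fraction reported in the caption as less than 10%.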

### Extended Data Fig. 5 Reproducibility and precision of gene expression profiles.

**a**: EC50 determination: \({I}_{\max }\) and \({I}_{\min }\) for individual one-dimensional gene expression profiles are defined as the average value of the 10% largest and lowest expressing bins, respectively. The raw profile (green, plain curve) is spline fitted (green, dotted curve), and the position where the fit is equal to \(\left({I}_{\max }+{I}_{\min }\right)/2\) defines \(x/{L}_{\text{EC}50}\). **b**: Variability (\({{\rm{\sigma }}}_{I}/\bar{I}\)) as a function of normalized intensity \(I\). \({I}_{\max }\) and \({I}_{\min }\) are determined as in **a** for SOX2, CDX2, BRA, and FOXC1 (\(n=\){44, 44, 48, 46} respectively for SOX2, CDX2, BRA and FOXC1). Same data set as in Fig. 2. The average value in the gray region (defined by the gene being expressed at more than 90% of its max level) is used as a measure of gene expression reproducibility for the fully *induced* gene. **c**: Comparison between raw (colored lines) and \({{\rm{\chi }}}^{2}\)-minimized (black lines) variability as a function of position \(x/L\) for SOX2, CDX2, BRA, and FOXC1 for data set in **b**. Dashed lines represent the average variability in the region where genes are most highly expressed (see **b**). These values decrease from ∼20% to ∼10% after \({\chi }^{2}\)-minimization, showing the potential for reproducibility after systematic error reduction, similar to what is seen in the fly embryo^{51}. **d**: Distribution of \(x/{L}_{\text{EC}50}\) for each of the four markers for data set in **b**, average and standard deviation are reported on top of each distribution. The average value is the gene boundary position \({x}_{B}/L\) and the standard deviation around this value is a measure of the positional error of the boundary position. **e**: Generalized positional error as a function of the normalized intensity for each marker (color code as in **b**). 
The zones of highest precision (that is, \({{\rm{\sigma }}}_{x/L}\le 5\%\)) correspond to the transition regions between low- and high-expression domains. **f**: Positional error \({{\rm{\sigma }}}_{x/L}\) calculated for four genes as in Fig. 3c. The positional errors at the boundaries are shown here at the mean boundary position \({x}_{B}/L\) extracted in **d** (big crosses, bootstrapped errors are within marker size). The values from both methods are consistent and, for all genes, the positional errors at the boundaries correspond to a linear dimension of 1–2 cell diameters (gray bands).
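
A minimal sketch of the EC50 boundary determination described in panel **a** of this figure. It follows the stated definitions of \({I}_{\max }\) and \({I}_{\min }\) (means of the 10% highest and lowest bins) but replaces the spline fit with linear interpolation between adjacent bins to keep the example dependency-light. The function names and this substitution are assumptions, not the authors' implementation, and only the first crossing is returned.

```python
import numpy as np

def extreme_levels(profile, frac=0.1):
    """Return (I_min, I_max): means of the lowest and highest bins."""
    s = np.sort(np.asarray(profile, dtype=float))
    k = max(1, int(round(frac * s.size)))
    return s[:k].mean(), s[-k:].mean()

def ec50_position(x_rel, profile):
    """Relative position x/L where the profile crosses (I_max + I_min) / 2."""
    x = np.asarray(x_rel, dtype=float)
    y = np.asarray(profile, dtype=float)
    i_min, i_max = extreme_levels(y)
    d = y - 0.5 * (i_min + i_max)
    # Indices where consecutive bins straddle the half-maximum level.
    crossings = np.nonzero(d[:-1] * d[1:] <= 0)[0]
    if crossings.size == 0:
        raise ValueError("profile never crosses the half-maximum level")
    i = crossings[0]
    if d[i] == d[i + 1]:  # both bins sit exactly on the level
        return x[i]
    # Linear interpolation between the two straddling bins.
    return x[i] - d[i] * (x[i + 1] - x[i]) / (d[i + 1] - d[i])
```

Applying `ec50_position` to every gastruloid in a set and taking the mean and standard deviation of the results gives the boundary position and its positional error as defined in panel **d**.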

### Extended Data Fig. 6 Shrinkage factor due to fixation and sample mounting.

**a**: Distribution of gastruloid volumes \({V}_{{BF}}\) (gastruloids seeded with \({\overline{N}}_{0}=300\) cells) at 120 h; 2D volume reconstruction from either brightfield images or maximum projection of confocal images on the DAPI channel. Gastruloid volumes after fixation and mounting (red, \(n=47\)) are ∼3 times smaller than the same set of gastruloids imaged live before fixation (yellow, \(n=52\)). The number of gastruloids after fixation and mounting is always smaller than during live imaging as gastruloids are lost during the protocol. **b**: A one-dimensional shrinkage factor is defined from the ratio of the average values in **a**: \({SF}=1-{\left({V}_{{IF}}/{V}_{{BF}}\right)}^{1/3}\) (\({V}_{{IF}}\): after fixation and mounting; \({V}_{{BF}}\): live brightfield). This factor quantifies by how much gastruloid size is reduced during the staining protocol. It is applied to all measured lengths of midlines from stained gastruloids. Gastruloids are mounted in 50% PBS and 50% aqueous mounting medium (Aqua-Poly/Mount, Polysciences). I–XI are 11 independent experiments where *SF* was calculated on gastruloids initially seeded with \({\overline{N}}_{0}=300\) cells and imaged at 120 h after seeding. Error bars are from bootstrapping with on average \(n=51\) for live images and \(n=42\) gastruloids after fixation and mounting (experiments I–VIII) or \(n=20\) for live images and \(n=10\) after fixation and mounting (IX–XI). The shrinkage factor in these experimental conditions is \({SF}=0.35\pm 0.03\) (error is standard deviation). **c**: Same as **b** for a glycerol-based SlowFade^{TM} Glass Antifade mounting medium (Invitrogen) used in the phalloidin staining protocol. Each data point corresponds to an average gastruloid pool of \(n=49\) for live and \(n=27\) after fixation and mounting. Error bars from bootstrapping. Experiment I corresponds to \({\overline{N}}_{0}=100\) cells at 120 h, experiments II–IV correspond to \({\overline{N}}_{0}=300\) at 72 h, 96 h and 120 h, respectively.
The shrinkage factor in this mounting medium is \({SF}=0.36\pm 0.1\) (error is standard deviation). Note that gastruloids are fixed for 1 h in the phalloidin staining protocol while they are fixed for 2 h in the immunostaining protocol. **d**: Shrinkage factor stability over time for three different mounting techniques in 50% PBS and 50% aqueous mounting medium (Aqua-Poly/Mount, Polysciences): on a slide with a 250 μm spacer or in a glass bottom dish with or without a coverslip. Shrinkage factor measured repeatedly in the same set of gastruloids from IX–XI of **b** between three days and three weeks. Error bars are standard errors of the mean obtained from bootstrapping with on average \(n=20\) for live images and \(n=10\) after fixation and mounting (IX–XI).
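
The shrinkage factor of panel **b** can be written out as a short worked example. The formula for *SF* is taken from the caption; the correction of fixed lengths by division by \(1-{SF}\) is an assumed reading of how the factor is applied to midline lengths, and the function names are hypothetical.

```python
def shrinkage_factor(v_fixed, v_live):
    """SF = 1 - (V_IF / V_BF)^(1/3), from mean fixed and live volumes."""
    return 1.0 - (v_fixed / v_live) ** (1.0 / 3.0)

def volume_ratio(sf):
    """Fixed-to-live volume ratio implied by a one-dimensional SF."""
    return (1.0 - sf) ** 3

def corrected_length(length_fixed_um, sf):
    """Rescale a length measured after fixation to its live value.

    Assumes lengths shrink by the factor (1 - SF); this is an
    interpretation of the caption, not a documented formula.
    """
    return length_fixed_um / (1.0 - sf)
```

With \({SF}=0.35\), `volume_ratio` returns about 0.27, and a midline measured at 65 μm after fixation would correspond to 100 μm before fixation under this assumption.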

### Extended Data Fig. 7 Determination of total cell count and effective cell diameter.

**a**: Visualisation of the cell masks obtained by 3D segmentation^{52}. (Left:) Slice of a confocal image z-stack of a 120 h old gastruloid, seeded from \({\overline{N}}_{0}=100\) cells, stained for phalloidin (orange) and DAPI (blue). (Right:) Phalloidin channel from left in grayscale overlaid with cell masks obtained by 3D segmentation (see Methods). Scale bar is 50 μm. **b**: Estimation of the discrepancy between 3D and 2D volume reconstruction. The pipeline presented in Extended Data Fig. 1c overestimates gastruloid volumes; we estimate by how much using the volume determined by 3D segmentation as a ground truth. Distribution of the error *Err* on the volume determined by 2D volume reconstruction \({V}_{2D}\), before 120 h and at 120 h, overlaid by a Gaussian distribution fit for each distribution. Vertical dashed lines correspond to the mean of each distribution. The ground truth 3D volume \({V}_{3D}\) was obtained from the 3D segmentation. Before 120 h, \({Err}=3.2\pm 8.2 \%\) (\(n=56\)). At 120 h, \({Err}=20.0\pm 11.2 \%\) (\(n=40\)). The volume was overestimated in both time classes, but more so once the gastruloids had elongated. Note that this evaluation of the discrepancy between 3D and 2D volume reconstruction is independent of the shrinkage factor (Extended Data Fig. 6) because 3D and 2D volume reconstructions are applied to the same shrunken gastruloid mounted with the phalloidin staining protocol. **c**: Scatter plot of the measured volume from 2D reconstruction \(V\) (corrected for the error determined in **b**) versus the total cell count \(N\) obtained by chemical dissociation (with the protocol in Extended Data Fig. 1e), for 492 individual gastruloids at different time points (color code) and with varying \({\overline{N}}_{0}\) (symbol). From \(V\) and \(N\) for each individually dissociated gastruloid an effective cell volume \({V}_{c}=V/N\) was computed, and from there we obtain the slope (black lines).
The mean \({\bar{V}}_{c}\) for gastruloids aged from 24 to 48 h (before Chi-pulse) and the mean \({\bar{V}}_{c}\) for gastruloids aged from 72 to 120 h (after Chi-pulse) correspond to dashed and full lines, respectively. Inset shows correlation (\(r=0.78\)) of variability for \(V\) and \(N\) for sets of gastruloids with identical age and \({\overline{N}}_{0}\). The effective cell diameter \({d}_{c}\) can be obtained from the distribution of \({V}_{c}\), or directly from the slopes (see Methods and **d**). **d**: Distribution of the effective cell diameters \({d}_{c}\) per dissociated gastruloid, calculated from each effective single cell volume (\(V/N\)), before (red) and after (blue) Chi-pulse. Black lines are a Gaussian fit for each distribution. Vertical dashed lines correspond to the mean of each distribution. Before Chi-pulse, \({d}_{c}=16.0\pm 0.6\) μm (4.0%, \(n=206\)); after Chi-pulse, \({d}_{c}=13.9\pm 0.5\) μm (3.8%, \(n=286\)). This is evidence of a Chi-pulse-induced reduction in gastruloids’ effective cell size by ∼13% (linear dimension). **e**: Single cell volume distributions serve to reject noisy masks from 3D segmentation results. After an initial rejection of any 3D masks smaller than 10^{4} voxels, a bimodal distribution of the logarithm of single cell volumes \({V}_{c}\) (obtained by 3D segmentation of a 120 h old gastruloid with \({\overline{N}}_{0}=100\)) is fit by a two-component Gaussian mixture model (left). The mode in black corresponds to the distribution of small noisy masks, the mode in red corresponds to the distribution of well-segmented cells. *Morphological closing* is performed on the latter and the corresponding distribution of single cell volumes \({V}_{c}\) is shown in right panel, with noisy masks (black) and well-segmented masks (red). **f**: Scatter plot of gastruloid volume versus total cell count obtained by two independent methods. 
Blue: chemical dissociation and 2D volume reconstruction (for gastruloids dissociated after Chi-pulse only). Green: 3D segmentation for volume and cell count measurement (well-segmented cells only, see **e**). The slopes of the blue and green lines correspond to the mean \({V}_{c}\) for chemically dissociated and 3D segmented gastruloids, respectively. Upper left inset shows a close-up for small \(V\) and \(N\). Lower right inset shows correlation of variability for \(V\) and \(N\) for both methods. Note that the main error attached to the 3D segmentation volume is due to the estimation of the shrinkage factor of the mounting medium used in the phalloidin staining protocol (Extended Data Fig. 6c). 2D volume reconstruction from dissociated gastruloids is applied to images of live gastruloids (that is, they are not shrunken). **g**: Distribution of the logarithm of single cell volumes \({V}_{c}\) obtained by 3D segmentation after filtering and reconstruction for 96 h (\(n=28\)) and 120 h (\(n=20\)) old gastruloids with \({\overline{N}}_{0}=100\). Inset shows dispersion self-similarity \({{\rm{\delta }}}_{S}\), defined as \(\left\langle {\sigma }_{\log \left({V}_{C}\right)}/\overline{\log \left({V}_{C}\right)}\right\rangle\) for each set of distributions. It demonstrates the reproducibility of the dispersion in cell size in individual gastruloids and a further reduction in gastruloid cell size during the elongation process. The low variability indicates that the dispersion is highly conserved across gastruloids. **h**: Distribution of the effective cell diameter per gastruloid, obtained by chemical dissociation (only data from gastruloids dissociated after Chi-pulse) and 3D segmentation, overlaid by a Gaussian fit for each distribution. Vertical dotted lines correspond to the mean of each distribution. With the dissociation protocol, \({d}_{c}=13.9\pm 0.5\) μm (3.8%, \(n=286\)). With the 3D segmentation method, \({d}_{c}=13.1\pm 0.5\) μm (4.0%, \(n=108\)).
Taking into account the different sources of error and our two independent methods of determination of the effective cell diameter, the relevant linear size of the system at 120 h is \({d}_{c}=13.5\pm 0.8\) μm. All error bars are standard deviations.
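As a worked illustration of how an effective cell diameter follows from \({V}_{c}=V/N\), the sketch below converts an effective cell volume into a diameter and recomputes the ∼13% linear reduction quoted in **d**. The sphere-equivalent conversion \({d}_{c}={(6{V}_{c}/\pi )}^{1/3}\) is an assumption made here for illustration (the exact conversion is described in the Methods, which are not reproduced in this section), and the function names are hypothetical.

```python
import math


def effective_cell_diameter(volume_um3, cell_count):
    """Diameter of a sphere whose volume equals V_c = V / N.

    Assumption: cells are treated as equivalent spheres.
    """
    v_c = volume_um3 / cell_count
    return (6.0 * v_c / math.pi) ** (1.0 / 3.0)


def linear_reduction(d_before, d_after):
    """Fractional reduction in linear cell size."""
    return 1.0 - d_after / d_before


# Mean diameters quoted in panel d (before and after the Chi-pulse).
reduction = linear_reduction(16.0, 13.9)
print(round(100 * reduction, 1))  # 13.1 (percent)
```

A 13% reduction in diameter corresponds to a reduction of about 34% in effective cell volume, since volume scales with the cube of the linear dimension.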

### Extended Data Fig. 8 Repeatability and reproducibility of a single experiment.

**a**: Four repetitions of the same experiment on different dates (exp I–exp IV, month/year, with \(n=\)139, 105, 84 and 95 gastruloids). Each panel shows raw individual gastruloid profiles (light green, no y-axis normalization) and mean profiles (dark green) of three same-day replicas of SOX2 expression in immunostained gastruloids seeded, cultured, fixed, stained, and imaged in parallel on three separate plates (that is, in each panel the three same-day replicas are shown by full, dashed, and dotted lines). Each individual experiment (12 total) is composed of 25–50 gastruloids. Conditions are identical for all experiments except for experiment III, in which gastruloids were mounted in PBS instead of Aqua-Poly/Mount. Note that same-day replicas are significantly more reproducible (that is, self-similar) than experiments across different days (that is, the mean expression pattern differs more across days than across same-day replicas, something *not* seen in developing embryos^{67}). **b**: Mean profiles as a function of relative position \(x/L\) for each replica. Shaded areas are standard errors. Normalization was performed on the entire data set across all \(n\) gastruloids for a global maximum and minimum average intensity (that is, a single maximum and a single minimum per experiment day). Same-day replicas can show absolute reproducibility (exp II–IV), where profile distributions collapse without y-axis normalization. **c**: Profile variability \({{\rm{\sigma }}}_{I}/\bar{I}\) as a function of relative position \(x/L\) along the midline for each replica (green, line style as in **a**), or for the entire data set across same-day replicas (black). Panels run across four experiments as in **a**. Again, same-day replicas are highly reproducible while variability profiles differ significantly across different days. **d**: Positional error \({{\rm{\sigma }}}_{x/L}\) calculated by error propagation from **a** and **b** for each replica.
Gray lines correspond to one and two effective cell diameters \({d}_{c}\), respectively. The corresponding values in \({\sigma }_{x/L}\) are different between different experiments because of experiment-to-experiment variability in length (Extended Data Fig. 2f). Boundary precision is maintained near 1–2 cell diameters across all replicas (that is, same-day and across days). **e**: Variance decomposition for the SOX2 profile in experiments I and III (Methods). Plain lines correspond to the inter-plate part of the variance (for three same-day replicas) and the dashed lines to the intra-plate part of the variance. The inter-plate and intra-plate variances are represented as fractions of the total variance of the whole population of same-day gastruloids (black lines in **c**). The decomposition is done in three ways: 1) on the raw profiles (black lines), 2) on normalized profiles (all profiles of an individual replica are normalized by the same values, such that the minimum/maximum expression levels of each replica’s mean profile are set to 0/1, respectively; gray lines), and 3) on \({\chi }^{2}\)-minimized profiles (all profiles of an individual replica are normalized by the same values, obtained by \({\chi }^{2}\)-minimization of the mean profiles; light gray lines). Experiment I is an example of relative but not absolute reproducibility; experiment III is reproducible in absolute units, demonstrating that in principle the system is capable of generating absolute molarities of a gene product at well-defined positions along the gastruloid midline. **f**: Weighted average of the inter-plate part of the variance in the four experiments for either raw data, min/max normalized data, or data normalized using \({\chi }^{2}\)-minimization. Internal replicas regularly achieve absolute reproducibility (that is, no normalization, raw data comparison) better than 5% of the total variance in the data. See Table S8 for sample numbers.
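The split into inter-plate and intra-plate variance in **e** can be sketched with the law of total variance, evaluated at a single midline position. This is one standard way to perform such a decomposition and is an assumption here; the authors' exact procedure is given in their Methods. The function name and the toy intensities are hypothetical.

```python
import statistics


def variance_decomposition(plates):
    """Split the total variance at one position into plate contributions.

    plates: one list of intensities per same-day replica (plate).
    Returns (inter_fraction, intra_fraction); the two fractions sum to 1
    because population variances are used throughout.
    """
    all_values = [v for plate in plates for v in plate]
    n_total = len(all_values)
    grand_mean = statistics.fmean(all_values)
    total_var = statistics.pvariance(all_values, mu=grand_mean)

    # Inter-plate part: spread of the plate means around the grand mean.
    inter = sum(
        len(p) * (statistics.fmean(p) - grand_mean) ** 2 for p in plates
    ) / n_total
    # Intra-plate part: size-weighted mean of the within-plate variances.
    intra = sum(len(p) * statistics.pvariance(p) for p in plates) / n_total
    return inter / total_var, intra / total_var


# Toy intensities for three plates (NOT data from the paper).
toy_plates = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
inter_frac, intra_frac = variance_decomposition(toy_plates)
print(inter_frac, intra_frac)
```

An inter-plate fraction close to zero on raw, unnormalized intensities is what absolute reproducibility between same-day replicas would look like in this representation.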

### Extended Data Fig. 9 Scaling of gene expression in gastruloids.

**a**: Midline length distribution for gastruloids at 120 h seeded with \({\overline{N}}_{0}\) ranging from 50 to 1100 (from Fig. 1b, on average 15 gastruloids per \({\overline{N}}_{0}\)) with a 5.3-fold total length range. A 22-fold range in \({\overline{N}}_{0}\) results in gastruloids with a 3.8-fold range in average length \(L\) (bold vertical lines). **b**: Length distributions of gastruloid sets in Fig. 4 as a function of \({\overline{N}}_{0}\) (light points are individual gastruloids, \(n=24\)–53 gastruloids per \({\overline{N}}_{0}\), for a total of \(n=517\); dark points are average length and standard deviation per set and per gene; color code as in **a**). The span in length differs between experiments. For the data corresponding to SOX2 and CDX2, the 5-fold range in \({\overline{N}}_{0}\) achieves a 2.3-fold range in gastruloid length at 120 h. For the data corresponding to BRA and FOXC1, a 5-fold and 8-fold range in \({\overline{N}}_{0}\) achieve a 1.7-fold range in length, respectively. **c**: Individual gene expression profiles (normalized between 0 and 1 for each gastruloid individually using \({I}_{\min }\) and \({I}_{\max }\) as in Extended Data Fig. 5a) for each \({\overline{N}}_{0}\) (color code on right) and each gene as a function of absolute position along each gastruloid’s midline. **d**: Individual gene expression profiles (normalized as in **c**) for each \({\overline{N}}_{0}\) (color code on right) and each gene as a function of relative position (\(x/L\)) along the midline. See Table S5 for sample numbers.
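The two operations behind panels **c** and **d**, per-gastruloid intensity normalization and the change from absolute to relative position, can be sketched as follows. Here \({I}_{\min }\) and \({I}_{\max }\) are simply taken as the minimum and maximum of each profile, which is an assumption; the definition used in Extended Data Fig. 5a may differ. Function names are hypothetical.

```python
def normalize_profile(intensity):
    """Rescale one gastruloid's profile so that I_min -> 0 and I_max -> 1."""
    i_min, i_max = min(intensity), max(intensity)
    if i_max == i_min:
        raise ValueError("flat profile cannot be normalized")
    return [(v - i_min) / (i_max - i_min) for v in intensity]


def relative_positions(x_um):
    """Convert absolute midline positions (in um) to x/L in [0, 1]."""
    length = x_um[-1] - x_um[0]
    return [(x - x_um[0]) / length for x in x_um]


# Toy profile sampled at three midline positions (NOT data from the paper).
print(normalize_profile([2.0, 4.0, 6.0]))
print(relative_positions([0.0, 50.0, 200.0]))
```

If patterns scale with system size, profiles from gastruloids of different lengths that are spread out in absolute position (panel **c**) should collapse when plotted against \(x/L\) (panel **d**).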

### Extended Data Fig. 10 Limits of precision in scaled gene expression profiles.

**a**: Normalized mean expression profiles for each gene (SOX2, CDX2, BRA and FOXC1) of all gastruloids of different \({\overline{N}}_{0}\). For each gene, positional markers are defined at three positions corresponding to the 25% (\({x}_{25}\), blue), 50% (\({x}_{50}\), black), and 75% (\({x}_{75}\), red) of maximum profile intensity levels (vertical dashed lines), respectively. **b**: Absolute positions of the 25%, 50% and 75% maximum intensity levels for each gastruloid (same as in Fig. 4c) as a function of gastruloid length (same color code as above). Perfect scaling would imply \({R}^{2}=1\), meaning that 100% of the observed boundary position variance is related to gastruloid length. Slope values correspond to the average position of the three positional markers in relative units \({x}_{p}/L\). **c**: Relative position of the 25%, 50% and 75% maximum intensity levels as a function of \(L\) for each gastruloid (same color code as above). Perfect scaling predicts statistical independence of the relative boundary position (50% maximum intensity position) and the absolute gastruloid length. We performed a linear regression and found that the slopes are statistically different from zero (see Table S9 for *p*-values), with a 99% confidence interval; see slopes in legend. A slope of \({10}^{-5}\) μm^{−1} means that a decrease or an increase of 300 μm around the case \({\overline{N}}_{0}=300\) leads to a shift of the positional marker of ∼1% along the AP midline, that is ∼6 μm (\(\le 1{d}_{c}\)). A slope of \({10}^{-4}\) μm^{−1} (as is the case for BRA) means that a decrease or an increase of 300 μm leads to a shift of the positional marker of ∼10% along the AP midline, that is ∼60 μm (∼4\({d}_{c}\)). For \({x}_{50}/L\), the slopes for the four genes SOX2, CDX2, BRA, and FOXC1 are (2.8 ± 1) × 10^{−5}, (3.4 ± 1) × 10^{−5}, (2.0 ± 2) × 10^{−5} and (2.9 ± 3) × 10^{−5} μm^{−1}, respectively.
**d**: Positional error for the three markers (same color code as above) converted to cell diameter units (\({d}_{c}\)) as a function of average gastruloid length for the four genes SOX2, CDX2, BRA, and FOXC1. The range of gastruloid lengths is binned; each data point corresponds to the bin average. The positional error remains between 1 and 2 cells for all genes and all markers within a certain length range (up to 600 μm for FOXC1, up to 800 μm for the other genes). This range corresponds to the mean length of gastruloids in a range \(100\le {\overline{N}}_{0}\le 500\) for each experiment (Extended Data Fig. 9b). See Table S5 for sample numbers.
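The scaling test in **b** and **c** rests on two steps: locating a positional marker where the normalized profile crosses a given fraction of its maximum, and regressing that position against gastruloid length. The sketch below shows both under simple assumptions (linear interpolation between sampled points, ordinary least squares); the function names and the synthetic data are hypothetical and do not come from the authors' code.

```python
def crossing_position(x, profile, level):
    """First position where the profile crosses `level`.

    Assumption: linear interpolation between neighbouring samples.
    """
    for i in range(len(x) - 1):
        y0, y1 = profile[i], profile[i + 1]
        if y0 != y1 and (y0 - level) * (y1 - level) <= 0:
            return x[i] + (level - y0) * (x[i + 1] - x[i]) / (y1 - y0)
    raise ValueError("profile never crosses the requested level")


def linear_fit(xs, ys):
    """Ordinary least squares; returns (slope, intercept, R^2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)


# Synthetic, perfectly scaling boundaries at 60% of the length
# (NOT data from the paper).
lengths_um = [400.0, 500.0, 600.0, 700.0, 800.0]
x50_um = [0.6 * length for length in lengths_um]
slope, intercept, r_squared = linear_fit(lengths_um, x50_um)
print(slope, r_squared)
```

In this idealized case the slope of absolute position versus length equals the relative position \({x}_{p}/L\) and \({R}^{2}\) approaches 1, which is the perfect-scaling limit described in **b**. Regressing \({x}_{p}/L\) against \(L\) instead, as in **c**, would give a slope of zero for perfect scaling.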

## About this article

### Cite this article

Merle, M., Friedman, L., Chureau, C. *et al.* Precise and scalable self-organization in mammalian pseudo-embryos.
*Nat Struct Mol Biol* **31**, 896–902 (2024). https://doi.org/10.1038/s41594-024-01251-4

DOI: https://doi.org/10.1038/s41594-024-01251-4