Quantification of the morphological characteristics of hESC colonies

The maintenance of the undifferentiated state in human embryonic stem cells (hESCs) is critical for their further application in regenerative medicine, drug testing and studies of fundamental biology. Currently, the selection of the best-quality cells and colonies for propagation is typically performed by eye, based on displayed morphological features such as prominent, abundant nucleoli and a tightly packed colony with a well-defined edge. Using image analysis and computational tools, we precisely quantify these properties from phase-contrast images of hESC colonies of different sizes (0.1–1.1 mm²) on days 2, 3 and 4 after plating. Our analyses reveal noticeable differences in colony structure that are directly influenced by the colony area A. Large colonies (A > 0.6 mm²) have cells with smaller nuclei and a shorter intercellular distance than small colonies (A < 0.2 mm²). The gaps between cells, which are present in small and medium-sized colonies (A ≤ 0.6 mm²), disappear in large colonies (A > 0.6 mm²) owing to the proliferation of cells in the bulk. This increases the colony density and the number of nearest neighbours. We also detect self-organisation of cells within the colonies: newly divided (smallest) cells cluster together in patches, separated from larger cells at the final stages of the cell cycle. This might directly influence cell-to-cell interactions and community effects within the colonies, since the segregation induced by size differences allows the interchange of neighbours as the cells proliferate and the colony grows. Our findings are relevant to efforts to determine the quality of hESC colonies and to establish a database of colony characteristics.

The Feret's diameter ($\eta_{\text{Feret}}$), also known as the maximum caliper, measures the longest distance between any two points along the nucleus boundary.

Circularity is a shape descriptor that indicates the degree of similarity to a circle: a value of 1 corresponds to a perfect circle, and the shape becomes less circular as this quantity approaches 0. It is calculated using the equation $\eta_{f} = 4\pi a / p^{2}$, with $a$ and $p$ the nucleus area and perimeter, respectively.

The values for the PDF of the circularity are shown in Supplementary Figure S2(d), with all three distributions centred around $\sim 0.85$; the nuclei shapes are therefore highly circular.
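As an illustration of the circularity formula $4\pi a / p^{2}$, the following is a minimal sketch that assumes each nucleus outline is available as an ordered list of boundary vertices. The function names and the polygon representation are our assumptions and are not taken from the analysis pipeline described here.

```python
import math

def polygon_area(vertices):
    """Area from the shoelace formula; vertices are (x, y) pairs in boundary order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def polygon_perimeter(vertices):
    """Sum of the edge lengths of the closed outline."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def circularity(vertices):
    """4 * pi * area / perimeter**2; equals 1 for a circle and decreases towards 0."""
    a = polygon_area(vertices)
    p = polygon_perimeter(vertices)
    return 4.0 * math.pi * a / p ** 2

# Hypothetical outlines used only to exercise the formula.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
near_circle = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
               for k in range(360)]

print(circularity(square))       # pi / 4, about 0.785
print(circularity(near_circle))  # slightly below 1
```

A unit square gives $\pi/4$, well below the value of about 0.85 reported for the nuclei, whereas a finely sampled circle approaches 1.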

Roundness is very similar to circularity but is insensitive to irregular borders along the perimeter. It is measured using the major (longest) axis of the best-fit ellipse. The equation for the roundness is $\eta_{R} = 4a / (\pi \alpha^{2})$, with $\alpha$ the major axis of the best-fit ellipse. The results for the PDF of the roundness are shown in the […]

Solidity describes the extent to which a shape is concave or convex. The solidity of a completely convex shape is 1; the farther the solidity deviates from 1, the more concave the nucleus. It is calculated following the equation $\eta_{s} = a / A$, with $A$ representing the area of the convex hull that best encloses the nucleus boundary. The results are shown in Supplementary Figure S2(f) and indicate that most of the nuclei have values close to 1.
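The solidity and the Feret's diameter can both be obtained from the convex hull of the outline. The sketch below assumes a polygonal outline given as (x, y) tuples and uses Andrew's monotone chain algorithm for the hull. This is our choice for illustration, not necessarily the method used to produce the reported values, and all function names are hypothetical. Roundness is omitted because it additionally requires an ellipse fit.

```python
import math
from itertools import combinations

def shoelace_area(vertices):
    """Polygon area; vertices are (x, y) tuples in boundary order."""
    n = len(vertices)
    total = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(total) / 2.0

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def solidity(vertices):
    """Outline area divided by the area of its convex hull (1 for convex shapes)."""
    return shoelace_area(vertices) / shoelace_area(convex_hull(vertices))

def feret_diameter(vertices):
    """Longest distance between any two boundary points (attained on the hull)."""
    hull = convex_hull(vertices)
    return max(math.dist(p, q) for p, q in combinations(hull, 2))

# Hypothetical L-shaped outline: area 3, convex hull area 3.5.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(solidity(l_shape))        # 6/7, about 0.857
print(feret_diameter(l_shape))  # sqrt(8), about 2.83
```

The concave L-shaped test outline yields a solidity clearly below 1, in contrast to the nuclei, for which most values are close to 1.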

The averages for each shape descriptor are shown in Supplementary Table S3 and […]

[…] radius $r$ and thickness $dr$, i.e., $n\,g(r)\,4\pi r^{2}\,dr$. In other words, it describes the variation of the local cell density within a distance $r$, as viewed from the central cell, relative to its bulk value. For ordered materials, such as crystals, the radial distribution function (RDF) shows an oscillating behaviour, where the peaks in $g(r)$ are interpreted as the average inter-particle distances. The information contained in the RDF is a spatial average and has its limitations when the system is not isotropic.
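To make the definition of $g(r)$ concrete, the sketch below estimates it from a list of cell-centre coordinates. It rests on two assumptions that are ours: the cells lie in a plane, so each shell is a ring of area $\pi(r_{\text{out}}^{2} - r_{\text{in}}^{2})$ rather than the spherical shell $4\pi r^{2}\,dr$ quoted above, and no correction is made for cells near the colony edge. The function name and arguments are hypothetical.

```python
import math

def radial_distribution_2d(points, area, r_max, dr):
    """Histogram estimate of g(r) for planar points, without edge correction."""
    n_points = len(points)
    density = n_points / area              # bulk number density n
    n_bins = int(round(r_max / dr))
    counts = [0] * n_bins
    for i in range(n_points):
        xi, yi = points[i]
        for j in range(i + 1, n_points):
            xj, yj = points[j]
            k = int(math.hypot(xi - xj, yi - yj) / dr)
            if k < n_bins:
                counts[k] += 2             # each pair is seen from both cells
    centres, g = [], []
    for k in range(n_bins):
        r_in, r_out = k * dr, (k + 1) * dr
        ring_area = math.pi * (r_out ** 2 - r_in ** 2)   # 2D assumption
        centres.append(0.5 * (r_in + r_out))
        g.append(counts[k] / (n_points * density * ring_area))
    return centres, g

# Synthetic test pattern (not colony data): a 15 x 15 square lattice.
spacing = 1.05
lattice = [(spacing * i, spacing * j) for i in range(15) for j in range(15)]
centres, g = radial_distribution_2d(lattice, area=(15 * spacing) ** 2,
                                    r_max=3.0, dr=0.1)
first_peak = centres[g.index(max(g))]
print(first_peak)  # falls in the bin centred on the lattice spacing
```

For this perfectly ordered lattice the highest peak sits at the nearest-neighbour spacing and further sharp peaks follow at larger $r$; a disordered arrangement would instead show peaks that broaden and fade with distance.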

Supplementary Figure S7 shows the RDF for two colonies of different sizes, with areas 0.690 mm² and 1.131 mm², respectively. In both cases, the first peak is the best defined and corresponds to the distribution of distances between first nearest neighbours, $\ell_{1} \sim 18.56$ µm and $\ell_{1} \sim 18.02$ µm, respectively. The position of the second peak gives the average distance, or coordination, between second neighbours, $\ell_{2}$. We conclude that the colonies show short-range order and that the nearest coordination shells are visible in both cases. However, as we increase $r$ to account for the second nearest neighbours, the second peak is washed out and broader than the first owing to the missing long-range order. The largest colony shows a second peak centred […] and indicate that $g(r)$ is similar to those obtained for amorphous materials; the structure of the material therefore becomes blurred as the radius $r$ increases.

In biology, calculations of $g(r)$ have been performed to study protein organisation and the aggregation of particles on cell membranes [55,56]. This tool has not been used previously in the literature to characterise dense aggregates of cells.

Supplementary Figure S7. Radial distribution functions $g(r)$ for the two largest colonies analysed. The first peak, associated with the separation of the first nearest neighbours, is located at < 20 µm, consistent with the results presented in Figure 4 obtained through the Voronoi diagram.

Supplementary Figure S8. Colony perimeter $P$ as a function of the colony area $A$. These data points were obtained by applying the Canny-Deriche algorithm to the samples. The red dash-dotted line shows the best fit to a power function with scaling factor $k = 7.207$ and exponent $\gamma = 0.47$ ($R^{2} = 0.963$).
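A power-law relation between perimeter and area, of the form $P = k A^{\gamma}$, can be fitted by linear least squares on the logarithms of both quantities. The sketch below takes that route as an assumption; the fitting procedure actually used for the reported values is not stated here, and a direct nonlinear fit would generally give a somewhat different $R^{2}$. The data in the example are synthetic, generated from the reported $k$ and exponent purely to check that the routine recovers them.

```python
import math

def fit_power_law(areas, perimeters):
    """Least-squares fit of log P = log k + gamma * log A; returns (k, gamma, r2)."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(p) for p in perimeters]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    log_k = mean_y - gamma * mean_x
    ss_res = sum((y - (log_k + gamma * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot            # R^2 in log-log space (assumption)
    return math.exp(log_k), gamma, r2

# Synthetic, noise-free points on the curve P = 7.207 * A**0.47 (not measured data).
areas = [0.1, 0.2, 0.4, 0.6, 0.9, 1.1]
perimeters = [7.207 * a ** 0.47 for a in areas]
k, gamma, r2 = fit_power_law(areas, perimeters)
print(k, gamma, r2)
```

For comparison, a perfect circle obeys $P = 2\sqrt{\pi A}$, i.e., an exponent of exactly 0.5.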

Supplementary Table S2.
Measurements obtained for the nuclei morphology and cellular parametric characteristics in hESC colonies. The total number of cells in the colonies $N_{c}$, the mean cell nucleus area $\langle a \rangle$, the mean number of nearest neighbours $\langle N_{n} \rangle$ and the mean intercellular distance $\langle \ell_{n} \rangle$ are given alongside the standard error of the mean and the standard deviation (in parentheses) for the measurements.

Supplementary Table S4. Datasets of the morphological and parametric characteristics for hESC colonies. We show the colony area $A$, number of cells $N_{c}$, perimeter $P$, circularity $\Phi$, Feret diameter, minimum Feret diameter, aspect ratio $\Lambda$, roundness $R$, solidity $S$ and day of imaging. The colony identifier also indicates the magnification at which the image was taken in its final characters: ×5 or ×10.
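The mean, standard error of the mean and standard deviation reported for each measurement can be computed as in the following minimal sketch. It assumes the sample standard deviation (with the $n - 1$ denominator); the convention used for the tabulated values is not stated here. The function name and the input values are hypothetical.

```python
import math
import statistics

def summarise(values):
    """Return (mean, standard error of the mean, sample standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)          # n - 1 denominator (assumption)
    sem = sd / math.sqrt(len(values))
    return mean, sem, sd

# Made-up numbers used only to exercise the function.
mean, sem, sd = summarise([2, 4, 4, 4, 5, 5, 7, 9])
print(f"{mean:.2f} ± {sem:.2f} ({sd:.2f})")
```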

[Table header: Tag, Identifier. Table note: cells were not counted, only features analysed.]

Figure S11. Gallery of colony images used for the morphological analysis. The images are labelled in the grid at the bottom right following the tags shown in Supplementary Table S2. For example, grid 1 shows the colony with tag 1 in Supplementary Table S2, which corresponds to the identifier DAY2_4x5; this indicates that the colony was imaged on day 2 at a magnification of 5×. The grids with two tags, i.e., 4, 5 and 16, 17, show two colonies analysed in the same image. Scale bar: 100 µm.
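The identifier convention can be decoded programmatically. The sketch below assumes that every identifier follows the pattern of the single example given, DAY2_4x5, i.e., the day of imaging, an underscore, a number, the letter x and the magnification. The meaning of the middle number is not stated in the text, so it is kept as an uninterpreted field; the class, field and function names are hypothetical.

```python
import re
from typing import NamedTuple

class ColonyId(NamedTuple):
    day: int            # day of imaging after plating
    index: int          # middle field; its meaning is not stated (assumption)
    magnification: int  # objective magnification, e.g. 5 or 10

# Pattern inferred from the single example "DAY2_4x5" (assumption).
_PATTERN = re.compile(r"DAY(\d+)_(\d+)x(\d+)")

def parse_identifier(identifier):
    """Split a colony identifier into its numeric fields."""
    match = _PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"unrecognised colony identifier: {identifier!r}")
    return ColonyId(*(int(group) for group in match.groups()))

print(parse_identifier("DAY2_4x5"))
# ColonyId(day=2, index=4, magnification=5)
```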