Unveiling universal aspects of the cellular anatomy of the brain

Recent cellular-level volumetric brain reconstructions have revealed high levels of anatomic complexity. Determining which structural aspects of the brain to focus on, especially when comparing with computational models and other organisms, remains a major challenge. Here we quantify aspects of this complexity and show evidence that brain anatomy satisfies universal scaling laws, establishing the notion of structural criticality in the cellular structure of the brain. Our framework builds upon understanding of critical systems to provide clear guidance in selecting informative structural properties of brain anatomy. As an illustration, we obtain estimates for critical exponents in the human, mouse and fruit fly brains and show that they are consistent between organisms, to the extent that data limitations allow. Such universal quantities are robust to many of the microscopic details of individual brains, providing a key step towards generative computational brain models, and also clarifying in which sense one animal may be a suitable anatomic model for another.

These detailed brain reconstructions open up the possibility of making structural comparisons between the brains of different organisms.However, meaningful comparisons require analysis of largescale data.The largest currently available reconstructions, consisting of 1 mm 3 of human [1] and mouse [2] brain, each require over a petabyte (1 petabyte = 1000 terabytes) of storage, while future whole brain mappings will be orders of magnitude larger [5,14].Making full use of all this information not only poses big data and machine learning challenges, but also requires decisions on which aspects of the anatomic structure to focus.
Identifying overarching structural features in the complexity of the brain can aid in understanding and modeling its structural properties and their relation to function.At the cellular level, complexity in the structure of neurons has frequently been quantified through the fractal dimension d f .Typical neuron d f values are between 1.1-1.9[15][16][17][18], with variation between organisms, neuron types and developmental stage, as well as details of the methodology [19,20].It has been hypothesized that d f in neurons is related to the trade-off between connectivity and wiring cost for a neuron, with higher connectivity requirements resulting in higher d f [18].Fractal behavior is an example of scale invariance, or self-similarity in the structure of the brain.Self-similar behavior has been widely reported in both the structure and function of the brain at a range of scales, as reviewed in Ref. [21].For example, self-similarity in the macroscale structure is observed in the gyrification of the cerebral cortex [22].At the microscale, self-similarity is present in the dendritic branching of individual neurons, as can be detected through measuring correlations in the structures [17,23], and box-counting techniques [16][17][18].The relationship between these self-similar spatial features and scale invariant functional properties remains not well understood [21,24].Deeper understanding of the structure of the brain will aid further exploration of this relationship.
Here, we propose that statistical physics can provide a guiding framework for determining and quantifying additional structural features in the cellular complexity of the brain.By analyzing properties related to cell size, as well as pairwise and higher-order correlations in volumetric brain regions, we show that the cellular structure of the brain displays signatures of collective behavior close to criticality.These features include self-similarity in the size-distribution of cell fragments within sample regions, and long-range spatial correlations in the structure.From these measurements, we estimate a set of exponents in each brain.We find that these exponents obey critical scaling relations, further indicating that the brain is in the vicinity of structural criticality, which is distinct from observations of critical behavior in neural dynamics [25].The relations between critical exponents mean that these different structural properties are not indepen- dent in the brain, but are different manifestations of the same emergent phenomena.These exponents therefore form a structural universality class in each brain.
We also observe that the critical exponents are compatible between organisms.This suggests that key structural properties of the brains of a broad range of organisms may be guided by similar principles that can be quantified through these universal properties.If such properties of the cellular structure of the brain are indeed universal, the compatibility between organisms can be described in terms of a structural brain universality class.Studying simple generative models within the brain universality class opens up the possibility of generating spatial structures that capture the emergent large-scale properties of cellular brain anatomy.Such models could be studied to gain insight into the relationship between brain structure and function.The exponents can also act as a baseline for comparison between structural properties of spatial models with brain-like features, such as models of physical networks [26], and the brain.

II. SAMPLING BRAIN DATASETS
We use publicly available volumetric brain datasets consisting of ∼ 1 mm 3 each of female human [1] and male mouse [2] cortical tissue, as well as half of a female fruit fly (Drosophila melanogaster ) central brain [3].These datasets are large enough in scale that our statistical sampling techniques can be conclusively applied, which would not be possible on the smaller nervous system organisms for which full volumetric reconstructions are currently available [6][7][8][9][10].
Each dataset consists of a brain volume that was sliced and imaged using electron microscopy (Fig. 1(a), (c)) before the individual cells were identified and reconstructed (Fig. 1(b-d)) to recreate the full 3D volume.While in the fly, the neurons are all fully proofread [3], the human [1] and mouse [2] datasets have currently undergone partial cell-type identification and proofreading.As such, we primarily present our analysis including all cell types, although we show that including only neurons gives qualitatively similar results.
Here, we randomly sample volumetric regions of the segmentation data, which identifies the cell present at each voxel location, at a voxel side length of 128 nm.These samples contain positional information about the cells present within 3D volumes and 2D slices of the segmentation data, as shown in the schematics in Figs.2(a) and 3(a) respectively.The cell fragments, or segments, in the samples have the appearance of clusters on a simple cubic lattice, with the lattice spacing determined by the sampling resolution.We generate sets of at least 1000 samples of linear size L in each brain, and study finitesize effects by sampling 16 ≤ L ≤ 512 voxels in 3D.In 2D, we sample up to L = 4096 voxels in the mouse and human, and L = 1024 voxels in the fly.Details of the sampling, including choice of resolution, are discussed in Methods.

III. ESTIMATING CRITICAL EXPONENTS
In models of clustered systems, information about structural properties of the system, such as phase behavior, is encoded in the behavior of properties related to the size, or mass, of clusters as well as correlations in the structure.Close to criticality, such properties are well-approximated by power-law relations.We investigate these quantities in the generated samples from the brain segmentation data to gain insight into structural properties of the brain.
We first investigate the mean size µ of the largest segment in the 3D brain samples, plotted in Fig. 2(b).We observe that µ(L) appears consistent with the relation µ(L) ∼ L d f , which is well-understood as the form µ(L) takes at criticality in percolation [27], and demonstrates the expected presence of fractal-like behavior in the brain samples.The slope of the µ vs L curve is influenced by finite-size effects.In a critical system, the relation µ(L) ∼ L d f is expected to hold at sufficiently large L. To estimate the exponent d f in the brains, we use finite-size scaling (FSS) techniques to extrapolate the large L behavior, as is well understood in physical systems [28] and has also been applied to a range of biological systems [29][30][31][32][33].We use two-point fits between points at successive L values to estimate size-dependent d f values.We then plot these values as a function of 1/L and use a linear fit to extrapolate d f in the large L limit (1/L → 0), as shown in the inset of Fig. 2(b), and discussed in more detail in Methods.The obtained d f values, listed in Tab.I, are consistent with d f ∼ 1.6 in all three organisms.To provide a benchmark with a system known to display structural criticality, we also plot results for 3D critical site and bond percolation (discussed in Methods).By matching the number of brain samples, the percolation uncertainty can set an independent expectation on the precision of the brain results.We further investigate fractal properties of select neurons in the samples using the frequently-used box counting method.This involves overlaying an object with a grid of side length L b and counting the number of grid cells N b containing part of the object (see Fig. 4(a) and Methods).The box-counting fractal dimension d In practice d f is determined from the slope of the linear region of a log-log plot of N b vs L b , which is also the region over which the structure displays self-similarity.
Here, we perform 3D box-counting on reconstructions of select neurons (discussed in Methods), with averaged results for each brain shown in Fig. 4(b) and Tab.I.In Fig. 4(c), we plot distributions of d for the 30 most frequent neuron types in the fly dataset [3], which shows that the mean d (box) f value is representative of a "typical" fly neuron despite noticeable differences between some neuron types.We observe that d in the brains and percolation.For percolation, where the calculations can be directly compared, d , indicating that boxcounting appears less reliable than the FSS approach at these sizes and numbers of samples.
We now consider the mean segment size per voxel, S, within the 3D samples.This is given by [27] where V is the sample volume, and the sum includes the masses m i of all sample segments.At criticality, in a finite sample, this scales as S(L) ∼ L γ/ν [27], where γ is the susceptibility exponent and ν is the correlation length exponent.For our 3D samples, Fig.A third property related to segment sizes is their distribution within the samples.At criticality, scale invariance results in the entire system displaying self-similarity and a broad segment size distribution, in addition to the selfsimilarity of individual segments.Fig. 2(d) shows the segment size distribution in our largest 3D samples, with datapoints logarithmically binned.The observed behavior shows slow decay of this quantity over several orders of magnitude, as is expected in a system at criticality.This measurement, along with S, therefore provides evidence beyond single-cell measures that the brain displays signatures of structural criticality.The Fisher exponent τ that characterizes this behavior is expected to be related to the previously calculated exponents through (see Methods) Using this relation to estimate τ , we find that the values, listed in Tab.II, are consistent between organisms with τ ∼ 2.2.This value is indicated in Fig. 2(d), and appears consistent with the data, suggesting that the scaling relation holds true.
Going beyond size-related measures, we now check for the presence of long-range correlations in the brain, as a further indicator of criticality.We start with the connected two-point correlation function C(r), which compares the likelihood that two points at separation r belong to the same segment with the baseline assumption of a constant density.This is defined by where r = |x−x ′ | is the separation between points at positions x and x ′ , ⟨. . .⟩ denotes averaging across possible locations, and S(x) has the form We calculate C(r) in the x and y directions for each segment in a 2D sample, and average across the two directions (see schematic Fig. 3(a)).We do not calculate C(r) in the z direction due to different voxel dimensions in the human and mouse datasets and the smaller z extent of each sample limiting attainable distances.While directionality is known to be important in the brain, we observe that our calculations are qualitatively similar along the x and y directions.We therefore average the two, treating the structure as isotropic to first approximation.While the lack of periodic boundary conditions in our samples means that technically ⟨S(x)⟩ ̸ = ⟨S(x ′ )⟩, for simplicity we assume the segment distribution is such that ⟨S(x)⟩ = ⟨S(x ′ )⟩ = ⟨ρ⟩, the average density of the segment in the sample.Fig. 3(b) shows C(r) for the largest L samples, while plots for all L are shown in Fig. S1.We observe that long-range correlations are indeed present in the brain, as evidenced by C(r) decaying slowly over more than two orders of magnitude.At criticality, C(r) ∼ r −d+2−η , where d = 3 is the spatial dimension and η is the anomalous dimension.Examining the FSS of the exponent (see Fig. 3(b, inset) and Methods) leads to the extrapolated η values in Tab.I, which are consistent between the fly and mouse with η ∼ 0.7.While the human η value appears larger, the difference is small compared to the error.
The presence of higher-order long-range correlations can offer a deeper test of criticality.We probe higherorder correlations by examining the "gap-size statistics" n(s).A pair of sites in a segment at separation s contributes to n(s) only if there are no other sites belonging to the same segment between the pair, as indicated in Fig. 3(a).This contrasts C(r), where all pairs of sites in a segment contribute, and leads to n(s) capturing higherorder structural complexity related to the concavity of the segment shape.Fig. 3(c

IV. ANALYZING NEURONS
While our analysis so far has mostly included all cell types, for many applications the physical structure of neurons may be of particular interest.Restricting the analysis to only neuron segments (discussed in Methods), we find that neurons qualitatively agree with our previous results.This agrees with the expectation that large-scale behavior is dominated by neurons as other cell types have a much more limited spatial extent.We quantify the neuron behavior in Tab.S1, which lists exponent values for neurons and all segments at the largest L available in all three brains, and observe good agreement between datapoints.
One set of points that do not overlap within the errors are between the ζ values for the human sample.At the largest L available in the human, we find ζ(L) = 2.2(4) for all segments and 1.9(6) for the neuron segments, suggesting the large L behavior is consistent between neurons and all segment types.The discrepancy at smaller L may be due to this being the size above which correla-New Figure 3 Fly, Mouse, Human: m Site, Bond: au   value calculated in (b).The median of each distribution is marked in black, while the interquartile range is indicated in white.Note that while the neurons included are classified as "uncropped" in the dataset, parts of the neuron, including the soma, may be missing from the data, which could lead to underestimation of the total volume.
tions in the human brain can be considered long-range.
The other non-overlapping errors are between the d values in the mouse for neurons and all segments, and between the ζ values in the mouse for neurons at different resolutions.We attribute these to the lower proportion of currently available cell-type identification in this dataset (see Fig. S3), which is also what limits the statistics for obtaining good estimates for the extrapolated exponents for neurons in the mouse and human.This limitation is due to the current level of proofreading in the datasets and could be revisited after more extensive cell identification has been undertaken.Overall, these results suggest that analysis including all cell types is a good approximation for the behavior of neurons at the largest scales.

V. TESTING FOR PROXIMITY TO CRITICALITY
We have shown that the brain displays signatures of structural criticality, but require further analysis to determine whether the structure is indeed critical.One consideration is whether the structure is at criticality, or slightly off-critical.While typically the critical point can be determined by tuning the control parameter, the lack of accessible control parameter here necessitates an alternative approach.
Motivated by percolation theory, one possible test of criticality is to examine the proportion of samples containing a "spanning" segment that touches all faces of the bounding cube and is connected within the sample.While spanning fraction is an order parameter of percolating clusters, and can be calculated for brain samples, the nature of the structural brain order parameter remains an open question.Fig. 5(a) shows the fraction of 3D samples containing a spanning segment, which is observed to increase with L. This behavior is consistent with the trend of percolation slightly in the ordered phase (Fig. 5(a, inset)), where the behavior is dominated by a large cluster.If spanning is a meaningful order parameter for the brain, this could indicate that the brain is slightly in the ordered phase.
Another quantity we can estimate is the fourth-order Binder cumulant [37], with [38] At criticality, U is independent of L, while in the ordered phase U (L → ∞) → 1, and in the disordered phase U (L → ∞) → 0. Note that the high order of U makes it sensitive to fluctuations, meaning it may not give reliable results in the brain samples.Calculating this quantity (Fig. 5(b)), we find that for the fly U decreases with L, while in the mouse and human, U is roughly constant at smaller L and decreases at the largest sizes, indicating the brain may be slightly in the disordered phase.The shading in Fig. 5(b) shows the region of the plot in which site percolation samples with occupancy 0.95 ≤ p/p c ≤ 1.05 fall, where p c is the critical occupancy.If percolation can offer any guidance on the brain behavior, these bounds reinforce the idea that the brain is close to criticality.Taken together, the spanning fraction and Binder cumulant measurements are inconclusive in determining on which side of the critical point the brain lies, although both suggest that brain structure is close to criticality.

VI. SCALING RELATIONS AND UNIVERSALITY
We provide further support for the idea of structural criticality in the brain by examining the scaling properties of the critical exponents.At criticality, the observed critical exponents are not independent and obey certain scaling relations, such as Eq. 3. We can test the validity of another scaling relation in the brain using Fisher's identity γ/ν = 2 − η [39], by calculating η values from our γ/ν estimates.The results, listed in Tab.II, are consistent with the η values calculated from C(r) (Tab.I), suggesting that the brain exponents obey standard scaling relations.
The observation that scaling relations hold between independently measured critical exponents is strong evidence that the structure of the brain has evolved to be at, or at least close to, a state of criticality as understood in physical systems.This means that the individually observed long-range patterns are not independent quantities.Just like in critical physical systems, many other characteristics of brain structure could be measured when suitable data becomes available.This would lead to a standard set of critical exponents in each brain, which would define a universality class.Among these there are at most two or three independent exponents, while all others are related via scaling relations.The number of independent exponents depends on the validity of certain hyperscaling relations involving the spatial dimension d.
Comparing our brain exponent estimates in Tabs.I and II, we observe that the exponent values obtained appear consistent between the three organisms studied.In this comparison, we note that our error estimates are based only on statistical properties and do not account for biological variation or uncertainty due to data quality, meaning the true uncertainty is potentially much larger.The apparent consistency we observe between the exponent values of different brains suggests that, to a first approximation, the brains of multiple organisms may be described by a single structural universality class.

VII. DISCUSSION
Our results indicate that the brains of multiple organisms show signatures of being at or close to a structural phase transition and that the corresponding critical exponents are consistent between organisms.If these structural properties of the brain are indeed critical and universal between organisms, we would expect to observe consistent exponents in other organisms and brain regions.However, with the studied datasets we have only been able to analyze a single partial brain region in three organisms, with the largest lengthscale probed being limited by the overall size of the brain samples.We therefore cannot determine whether these measurements are sensitive to known structural variability between brain regions, as well as variations between individual organisms.We also note that the statistical sampling techniques used here could not be conclusively applied to small nervous systems, and that since criticality is a large-scale phenomenon we anticipate that there would be a minimum brain size above which these features emerge.
Within the datasets, the incomplete proofreading of the human [1] and mouse [2] data may also impact results at large L, in addition to the neuron analysis discussed previously.In particular, split and merge errors in the segmentation data mean that individual cells may be classified as multiple cells, or multiple cells may have a single segmentation ID, which we expect would have the most impact at the largest sample sizes we study.The fact that these volumes are partial brain regions also impacts the segmentation due to cell fragments that would be connected outside of the sample being classified as separate cells within the volume.We explored this effect by relabeling segments within our 3D samples so that only regions of the same cell that were in physical contact within the sample were treated as the same segment.Our analysis (Fig. S4) indicates that the long-range behavior is robust to this change, with results qualitatively similar to the original segmentation.
When comparing results between brains, it is important to consider the relative length scales involved.The entire fly brain is about the size of a dendritic arbor in a mammalian cortical pyramidal neuron [3].As such, in the fly we are able to probe longer distances relative to the length scale of the entire brain than are accessible in the mouse and human.In the mammalian samples, the available length scale may only start to probe the regime of true long-range behavior.This could explain the FSS of the critical exponents in the mouse and human, where the largest sizes appear to have a different trend to the smaller sizes.While the present analysis is limited by the physical length scale of each dataset, some results could be verified at larger scales in the future using full-scale reconstructions of select neurons, such as the Allen Institute single cell data in the human [43] and mouse [44].As more data becomes available, it will be possible to expand our analysis to study larger scales [12,13], further brain regions, additional organisms, including ongoing experiments in zebra fish [5], and individual variation.Overall, we believe that with increasing data coverage and quality, statistical physics tools will become more and more powerful in characterizing brain data.
The exponents and scaling relations in the brain can be used as a benchmark for comparing structural models with actual properties of the brain.For example, recent physical network models, in which the nodes have d f ∼ 1.62, have been shown to have network properties reminiscent of the brain [26].Further analysis could determine whether other exponents in this model are also consistent with those observed in the brains.The validity of scaling relations and the universality in the structure of the brain also open up the possibility of designing generative physical models within the brain universality class.These models could be used to predict further universal aspects of brain structure including standard critical exponents and universal features that go beyond these exponents, such as the gap-size exponent ζ and universal amplitude ratios.Generative models could serve as a baseline model of brain anatomy in a wide range of organisms, and could also be useful for making predictions about brain structure and dynamics even if they are agnostic towards certain biological details.
As a first consideration towards a generative brain model, we compare the brain critical exponents with exponents in other physical systems.We highlight a broad range of 3D systems in Fig. 6, focusing on η and d f , which we estimated directly.Despite showcasing percolation as an example of structural criticality, it is unsurprising that this system, where random occupation of sites or bonds leads to clusters, provides a poor model for brain structure.We observe that the brain exponents are close to those of the Gaussian Random Field Ising Model (G.RFIM) [45], which violates hyperscaling with θ ∼ 1.5.This model may therefore offer guidance on expected values of other brain exponents, such as ν G.RFIM ∼ 1.4, although defining clusters in the G. RFIM analogous to the brain cell fragments remains an open challenge.It is intriguing that in the cosmic web, one of the few other systems where these exponents are known, the η and d f ranges overlap with those of the brain [46,47].However, it is unclear to what extent the universe could be interpreted as a critical system.
While the possibility of designing generative brain models makes structural criticality an appealing physical model for the brain, it is important to consider the biological relevance.Universal features of the brain can be directly compared between organisms, and can therefore be used as guidance for instances where one organism may be used as a model for another.The nature of the relationship between structural criticality in the cellular structure of the brain and observations of critical behavior in neuronal avalanches [25] remains an exciting direction to explore.Indeed, models of critical neuronal activity, which may give the brain optimal computing and processing properties [48,49], do not require complexity in neuronal structure [50].
Network aspects of the brain, which facilitate functional properties, are inevitably underpinned by the cellular structure.The structure of the brain neural network is expected to be influenced by the wiring cost [51][52][53][54] and functional constraints on the network [53].It has been shown that networks in C. Elegans, large-scale human brain data, and Drosophila display Rentian scaling [3,54], which in the brain relates the number of con-nected neurons in a group with the number of connections out of the group through a power-law relation.Analysis of the topological and physical Rent exponents in the network has shown that the wiring is efficient, but not minimized [54].Self-similarity in neuron structure has previously been proposed as a factor in balancing connectivity and cost [18,23,53].It is intriguing to consider that the brain may have evolved a structure close to criticality in order to aid formation of near-optimal network properties.If the structure of the brain sat far into an ordered cluster phase, the structure would have many redundant long-range connections, or excess wiring, between different points.Conversely, in the disordered phase there are no long-range connections, which are required to satisfy connectivity constraints.We therefore hypothesize that criticality in the structure may allow for the emergent near-optimal network properties by balancing geometric wiring costs with long-range connectivity requirements.To understand how these anatomic and network properties come about in the brain, it would be insightful to study brain structure during development, pruning and learning as suitable data becomes available.TABLE I. Summary of directly measured critical exponents in each brain.Numbers in parentheses give the measured uncertainty in the last digit(s) while numbers in square brackets give the difference between the largest L data point and the extrapolated value.Superscripts indicate the method used to calculate the exponent.For 3D percolation, the quoted number is based on numerical results from the literature, while the numbers in curly braces indicate the largest of the difference between our numerical result and the known result, and the uncertainty on our numerical result.
a References [27,36] TABLE II.Summary of indirectly measured critical exponents in each brain.Superscripts in each column heading indicate the directly measured exponent in Tab.I used to calculate each value in the relevant (hyper)scaling relation, with d (µ) f used for d f .The number given in parentheses gives the propagated uncertainty in the last digit(s).For 3D percolation, the quoted number is based on numerical results from the literature, while the numbers in curly braces indicate the largest of the difference between our numerical result and the known result, and the uncertainty on our numerical result.
We sample the segmentation data in each brain, which gives the ID number of the cell present at each location within the volume.The segmentation data is available at a range of resolutions (mip values), with successively increasing mip values corresponding to an increase by a factor of two in the voxel dimension in the x and y directions.Due to the larger value of the voxel size in the z direction in the human and mouse datasets, the z voxel dimensions at the smallest sizes do not double in these brain samples until mip 3.
To choose the preferred sampling resolution, in Fig. S3 we first examine how the mean number of segments in 100 randomly sampled slices of linear size L = 32.768µm changes with mip value for each brain for all cell types (left panels) and neuron fragments (right panels, discussed below).For mip 4 and below (higher resolutions) > 95% of neuron segments (neuron identification discussed below) remain within the sample window during downsampling.At mip values higher than 4 we observe a sharp drop off in the proportion of segments remaining within the sample.
The results in Fig. S3 suggest that sampling at mip 3 or below would give the least loss of data during downsampling.However, sampling at lower mip values means that larger sample sizes are required to study the same physical length scale.We therefore choose to sample each dataset at mip 4, at which the largest sample size we can generate is dictated by the geometry of the brain samples.The sampling and analysis presented here require a computational time of over 4 CPU years, while sampling at lower mip values would require significantly more time.Computational limitations impact our ability to perform sampling and calculations at the largest sizes at lower mip values.At mip 4, the voxel dimensions are 128 × 128 × 132 nm 3 for the human sample, 128×128×160 nm 3 for the mouse, and 128×128×128 nm 3 for the fly.
Tab. S1 shows the value of a single data point that would contribute to the FSS calculations for each scaling exponent calculated in the main text for each brain sample at mip 3 and mip 4. The values listed for d (µ) f are for L = 65.356µm, the largest 3D sample size in all three brain datasets (discussed below).For 1 + η and ζ the values given are for L = 131.072µm, the largest physical size sample available in all three brains.In each case the value is given at each resolution for all segments, and neuron segments (discussed below).We observe that the datapoint values are consistent between mip 4 and mip 3, thereby justifying our choice to use mip 4.
Generating brain samples from datasets -We generate three sets of samples from the segmentation data for each brain -2D slices, 3D volumes, and 3D reconstructions of select neurons -using CloudVolume (https://github.com/seung-lab/cloud-volume.git).The 2D slices are regions of size L×L sampled from the x−y plane.The origin of each sample, located at the top-left corner, is randomly selected while ensuring that the sample is contained within the brain volume.Entries in the resulting L × L matrix contain the segmentation ID of the cell present at each position.
We generate 2000 2D samples for each brain at each L value.In units of mip 4 voxels, we sample L = 2 n for integers 4 ≤ n ≤ 12 (up to L = 524.288µm) in the human and mouse, and 4 ≤ n ≤ 10 (up to L = 131.072µm) for the fly.The maximum L in each case is dictated by the brain sample geometry.We utilize 2D samples where possible due to brain sample geometries allowing a larger sample size in 2D than 3D.
In 3D we generate 1000 samples at each size using the same overall procedure as in 2D.In each case, we sample L = 2 n for integers 4 ≤ n ≤ 9 at mip 4. The maximum sample size is again dictated by the brain sample geometry.The largest sample sizes are 65.536 × 65.536 × 67.584 µm 3 in the human, 65.536 × 65.536 × 81.92 µm 3 in the mouse, and 65.536 × 65.536 × 65.536 µm 3 in the fly.Note that L values quoted in microns for 3D samples are given in terms of their x and y dimensions.
We reconstruct select proofread neurons (discussed below) in 3D by scanning through the entire volume of each dataset and saving all positions associated with a given neuron.In the fly we gathered the positions at mip-4 resolution, while in the mouse and human we used mip-5.The higher choice of mip in the mouse and human is due to the larger volume of the datasets and individual neurons, compared to the fly, which makes the computation more demanding.
Selecting dataset neurons -Each dataset provides some level of cell-type identification to accompany the segmentation data.In the fly, which is the most extensively proofread, there are 97 900 traced neurons, of which 22 212 have somas present and 21 739 are classified as uncropped.Uncropped neurons are defined in the release as having most arbors within the imaged volume, but the soma may not be present [3].Here, we choose to use the 21 739 traced, uncropped neurons in all calculations where we filter to include only neurons.
In the mouse dataset, a prediction model based on nucleus data was used to identify cells as either neurons or not neurons [62].Filtering to keep only neurons for which the soma is contained within the volume gives 72 789 neurons.These are the cells we count as neurons for calculations, except for box-counting.Of these neurons, 601 have undergone some level of comprehensive proofreading, including 78 that have undergone exten-sive proofreading of both their axons and dendrites.For the 3D neuron reconstructions used in box-counting, we use the 78 neurons that have undergone extensive proofreading.
In the human dataset, we use the provided somas table for cell-type classification.Filtering this table to include only neurons gives 15 827 neurons, which we use for all neuron calculations except box-counting.For box-counting, we use 3D reconstructions of the 104 fully proofread neurons released with the dataset.
Generating critical percolation samples -We generate samples for 3D critical site percolation on a lattice of size L 3 with free boundaries.Each site in the lattice is occupied with probability p and is otherwise unoccupied.If two nearest neighbor sites are both occupied, they are considered part of the same cluster.By setting p to the critical probability p c = 0.311608 [27], the resulting cluster structure displays structural criticality.
In 3D critical bond percolation, all sites in the lattice are occupied and nearest neighbor sites are connected if a bond is present between them.We generate samples by occupying bonds at the critical probability p c = 0.24881 [27] and identify the resulting clusters of connected sites.
For results using 3D samples, we generate 1000 samples for sizes L = 2 n for integers 4 ≤ n ≤ 9, in line with the voxel size of the largest 3D brain samples.For results using 2D samples, we generate 2000 3D percolation samples for sizes L = 2 n for integers 4 ≤ n ≤ 10 and perform calculations on a 2D slice through the center of each sample.The largest size is the same as the largest sample size, in voxels, available in the fly at mip 4.
Finite size scaling (FSS) -FSS is used to extrapolate the values of critical exponents in the thermodynamic limit by extrapolating their L-dependence.For a function X(x), an exponent a is defined by We examine the FSS of a by comparing points at size x and x/2, where x ∝ L, to obtain a two-point estimate for a, given by We estimate the value of a by examining the behavior of a(x) against 1/L and extrapolating the size dependence as to L → ∞ (1/L → 0).Note that generally, the exponents scale with a combination of terms, including L −1 and L −1/ν , as well as higher order terms that are relevant only at small sizes.However, we observe that for the largest sizes, the FSS of the exponents is approximately linear on the 1/L plots.We therefore use linear fits through this approximately linear region to extrapolate exponent values.Note that the value of ν remains unknown in the brain, and estimating it reliably from FSS would require much smaller errors.
FSS for d (µ) f and γ/ν -For the largest segment size µ ∼ L d f and the mean cluster size S ∼ L γ/ν , we use datapoints at size L and L/2 for each estimate.We extrapolate d (µ) f using a linear fit through points corresponding to the largest three sizes.For γ/ν, the approximately linear regime includes only the largest two points for each brain and the largest three points for percolation.
FSS for η -For C(r) ∼ r −d+2−η , we calculate two different two-point estimates for the exponent, taking points at x = L/8 and x = L/16.The larger, x = L/8, is the largest value at which the curve in samples of size L follows the curve in samples of size 2L.Points at x = L/8 and x = L/16 are highlighted in Fig. 3 Box counting fractal dimension -We implement box counting by overlaying 3D grids of side length L b onto our fractal objects.The box sizes are chosen to be approximately evenly spaced on a logarithmic scale.We use at least 30 box sizes on each sample, with sizes ranging from L b = 1 voxel up to the size of the fractal object.In each case, we align one corner of the grid with the minimum (x, y, z) coordinates of the fractal object and count the number of grid cells N b that contain part of the fractal for each L b .We then estimate d (box) f from the slope of the linear region of the log-log plot.Our box counting analysis is performed on the set of 3D reconstructions of select proofread neurons and on the largest clusters in the largest L percolation samples.Using the full available extent of the proofread neurons allows us to access larger length scales for this calculation than are accessible with the 3D samples.For each 3D sample, we analyzed each segment and assigned distinct cluster IDs to parts of the segment that were disconnected within the sample.We then performed calculations on these newly defined clusters to compare with properties of the original segmentation, as shown for (a) the largest cluster mass in 3D brain samples, and (b) C(r) for the largest L available in the 3D samples.In each plot markers with colored edges indicate the original segments while those with black edges show results for clusters after the connected components have been determined.

FIG. 1 .
FIG. 1. Illustration of data used from the human brain dataset [1].(a) EM image of a region of the human brain data.(b) Segmentation data for the EM image in (a).Colors indicate distinct segments.(c) EM image of a larger region of the human brain with the segmentation for five select neurons highlighted, showing how the same neurons may cross a given plane multiple times.Circles are used to highlight the locations of small fragments that are not easily visible.The black rectangle in the upper left indicates the region shown in (a).(d) 3D mesh reconstructions of the five neurons highlighted in (c) with the larger area shown in (c) indicated by the gray rectangle.

FIG. 2 .
FIG. 2. Analysis of segment size properties in 3D brain samples.(a) Schematic of a 3D brain sample of size L = 8, with colors indicating individual segments.The largest mass segment (cream) is outlined in black.(b) L-dependence of the mean largest segment size µ in the 3D samples, plotted on logarithmic axes.Results in each plot are also shown for critical site and bond percolation, which are known to display structural criticality.(inset) FSS of the fractal dimension d f extracted from µ(L) as described in Methods.(c) L-dependence of the mean segment size S in each sample, plotted on logarithmic axes.(inset) FSS of the exponent γ/ν (discussed in Methods).(d) Size distribution of segments in the largest 3D sample sizes with points logarithmically binned by volume.Black lines in (b-d) indicate the mean exponent value for the brains.Error bars in the main panel in (b) and (c), indicating the standard error on the mean, are smaller than the point size.Error bars in inset figures are propagated uncertainties from points in the corresponding main figure panel.
to the known d f value than d(box) f ) shows n(s) for the largest L 2D samples (all L are shown in Fig. S2), in which the slow decay of n(s) over several orders of magnitude indicates long-range correlations.The FSS of the exponent n(s) ∼ s −ζ (Fig. 3(c, inset), also see Methods) leads to the extrapolated ζ values in Tab.I.In the mouse and human ζ ∼ 2, while in the fly ζ is slightly smaller.However, the datapoint trend suggests that with larger sizes available ζ could converge towards 2. This is in line with known numerical or analytic ζ values in other critical systems, which all indicate ζ = 2 [34-36], further implying critical behavior in the brain.

FIG. 3 .
FIG. 3. Long-range correlations in brain samples.(a) Schematic of a 2D brain sample of size L = 8 voxels, where colors indicate different segments.Pairs of points that contribute positively to the pair correlation function C(r) starting from the highlighted point are indicated by black arrows, while white arrows indicate pairs of points in the blue cluster in the indicated row that contribute to the gap size statistics.(b) C(r) for the largest 2D sample size for each system on a log-log plot.Highlighted points are used in the two-point fit to determine η (see Methods).(inset) FSS of the exponent of C(r) ∼ r −d+2−η .(c) Log-log plot of n(s) for the largest 2D sample size for each system.Highlighted points are used in the two-point fit to determine ζ. (inset) FSS of the ζ exponent, n(s) ∼ s −ζ .Error bars in the main panel in (b) and (c) indicate the standard error, which is typically smaller than the point size, while the inset error bars are range bars indicating the values of the exponents at the two distances sampled in each case, as discussed in Methods.Black lines in (b) and (c) indicate the mean extrapolated exponent value for the brains.Note that in (b), the human value is treated as an outlier and is not included in the calculation of the mean value.

FIG. 4 .
FIG. 4. Neuron box fractal dimension calculation.(a) Schematic of grids of different sizes used in the box-counting procedure.Shown are L b = 1 (gray), L b = 2 (yellow), and L b = 4 (black).(b) Log-log plot of N b vs L b for 3D neuron reconstructions and the largest cluster in the largest size percolation samples.The magnitude of the best fit slope gives the d (box) f values listed

FIG. 5 .
FIG.5.Testing for the proximity to criticality.(a) Fraction of 3D samples containing at least one spanning cluster fspan in samples of size L. (inset) Spanning fraction for site percolation with different occupation probabilities p/pc, where pc is the critical probability, from within the shaded area in the main panel for different L values.(b) Binder cumulant U (Eq. 6) calculated for each brain and critical percolation for different L values.In both panels, the shaded region highlights the numerically-obtained region in which site percolation points within 5% of pc fall.

FIG. 6 .
FIG. 6. Summary of η and d f in the brain and a range of 3D physical systems.Brain d f values are the d (µ) f values in Tab.I.The η values are the mean of those listed in Tabs.I and II, with × indicating the individual values of the two η measurements and error bars showing the full range of uncertainty across the two values.The Ave. point representsthe mean exponent values for the brains, while the surrounding ellipse highlights the approximate region of the space into which the brain exponents fall.The remaining points represent the following 3D systems: (3D Perc.) 3D Percolation[56], (ISG) Ising Spin Glass (plotted point from values in Refs.[42,57], ellipse axes represent the range of exponent values from studies listed in[57]), (XY*) XY model[58], (G.RFIM) Gaussian Random Field Ising Model[45], (DCP) critical point of the disordered quantum Ising model/Contact Process,[59,60], (DMCP*) multicritical point of the disordered quantum Ising model/Contact Process[61], (GC) Galaxy clustering[46,47].The plotted gray line indicates the hyperscaling relation η = 2 + d − 2d f , * indicates hyperscaling is assumed to hold.
(b) for the largest L, and points at x = L/8 are highlighted for all L in Fig. S1.Plotted values in Fig. 3(b) (inset) show the mean of these two values and the error bars give the range.The extrapolated exponents are calculated with a linear fit through points for the largest three sizes.FSS for ζ -For n(s) ∼ s −ζ , we again calculate two different two-point estimates.For the brains these are taken at x = L/2 and x = L/4, while for percolation we use x = L/8 and x = L/16.These two points are highlighted on each curve in Fig. 3(c) for the largest L, while points at the larger x values are highlighted for all L in Fig. S2.The plotted exponent values in Fig. 3(c) (inset) show the mean of these two estimates while the uncertainties indicate the range.Extrapolated ζ values include the largest three sizes in the linear fit.
Relation between τ , γ/ν and d f -The Fisher exponent τ is related tod f through τ = 1 + d/d f .When hyperscaling is valid, η = 2+d−2d f , which together with the Fisher relation γ/ν = 2 − η leads to γ/ν = 2d f − d.We then use this to eliminate d in the relation for τ , leading to Eq. 3. When hyperscaling is violated, we need to substitute d → d − θ in all d-dependent relations.This does not change the resulting expression in Eq. 3.
FIG. S3.Comparison of data sampled at different resolutions (mip values).(a-c) Mean number of segments in 100 2D square samples of linear size L = 32.768µm, sampled at different mip values, for the (a) fly, (b) mouse, and (c) human datasets.The right panel of each figure shows the mean number of neuron segments in each sample.Note the different axis scales in the right panels of each plot.These numbers help inform our decision to primarily sample at mip 4, although we verify our results are consistent with mip 3 in Tab.S1.(d) C(r) for 100 samples in the fly of size L = 131.072µm at different mip values.(inset) Plotting r in µm instead of voxel units results in the curves collapsing, indicating consistent slopes between mip values.
FIG. S4.Comparison between properties of the original segmentation and clusters derived from the segmentation based on connected components within a sample.For each 3D sample, we analyzed each segment and assigned distinct cluster IDs to parts of the segment that were disconnected within the sample.We then performed calculations on these newly defined clusters to compare with properties of the original segmentation, as shown for (a) the largest cluster mass in 3D brain samples, and (b) C(r) for the largest L available in the 3D samples.In each plot markers with colored edges indicate the original segments while those with black edges show results for clusters after the connected components have been determined.