The Increase of the Functional Entropy of the Human Brain with Age

We use entropy to characterize intrinsic ageing properties of the human brain. Analysis of fMRI data from a large dataset of individuals, using resting state BOLD signals, demonstrated that a functional entropy associated with brain activity increases with age. During an average lifespan, the entropy, which was calculated from a population of individuals, increased by approximately 0.1 bits, due to correlations in BOLD activity becoming more widely distributed. We attribute this to the number of excitatory neurons and the excitatory conductance decreasing with age. Incorporating these properties into a computational model leads to quantitatively similar results to the fMRI data. Our dataset involved males and females and we found significant differences between them. The entropy of males at birth was lower than that of females. However, the entropies of the two sexes increase at different rates, and intersect at approximately 50 years; after this age, males have a larger entropy.

Names and abbreviations of AAL brain regions
3 Entropy property proofs
3.1 Why can the discrete entropy take the place of the relative entropy?
In the main text, the entropy is defined as the relative entropy 1 , which is the KL (Kullback-Leibler) divergence from the correlation distribution to a reference measure, the Lebesgue measure (B). In our work, this can also be considered as a differential entropy 1 . In the calculations, however, we use an entropy appropriate to a discrete distribution (the one that results from binning the data). We shall refer to this as 'the discrete entropy'. Here we explain why the discrete entropy can take the place of the relative entropy (or the differential entropy) without introducing errors.

Proof:
Let P and m denote the correlation distribution and the reference measure, and assume that $P(dx) = f(x)\,m(dx)$. We define B as the Lebesgue measure on $[-1,1]$ and $\mu_k$ as the counting measure that separates $[-1,1]$ evenly into k parts, with a total mass of 2, so that each part carries mass $2/k$. Thus,
$$D(P\|B) = \int_{-1}^{1} f(x)\log f(x)\,dx$$
and
$$D(P\|\mu_k) = \sum_{i=1}^{k} p_i \log\frac{p_i}{2/k} = \sum_{i=1}^{k} p_i \log p_i + \log\frac{k}{2},$$
where $P = (p_1,\dots,p_k)$ collects the probabilities of the k parts. Since $\log(k/2)$ is a constant for fixed k, we need only calculate $\sum_i p_i \log p_i$ when we compare the KL divergences $D(P\|\mu_k)$ of two different correlation distributions, $P_1$ and $P_2$. In addition, $-\sum_i p_i \log p_i$ is the discrete Shannon entropy 2 . As $\mu_k$ converges to B with increasing k, $D(P\|\mu_k)$ also converges to $D(P\|B)$. Therefore, we can use the discrete entropy $-\sum_i p_i \log p_i$ in place of the relative entropy in our study without errors.
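The relation between the discrete entropy and the divergence from the counting measure can be checked numerically. The following is a minimal sketch, not the authors' code: the correlation values are synthetic, the function names are hypothetical, and logarithms are taken to base 2 so that the entropy is in bits.

```python
import numpy as np

def bin_probabilities(corr, k=20):
    """Fraction of correlation values in each of k equal-width bins on [-1, 1]."""
    counts, _ = np.histogram(corr, bins=k, range=(-1.0, 1.0))
    return counts / counts.sum()

def discrete_entropy(p):
    """Discrete Shannon entropy in bits; empty bins contribute zero."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def kl_to_counting_measure(p):
    """D(P || mu_k) in bits, where mu_k assigns mass 2/k to each of the k bins."""
    k = p.size
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (2.0 / k))).sum())

# Synthetic stand-in for the 4005 pairwise correlations (an assumption).
rng = np.random.default_rng(0)
corr = np.tanh(rng.normal(0.2, 0.4, size=4005))

p = bin_probabilities(corr, k=20)
h = discrete_entropy(p)
d = kl_to_counting_measure(p)

# The divergence and the discrete entropy differ only by the constant log2(k/2).
print(h, d, np.log2(20 / 2) - h)
```

Because the two quantities differ by a constant that depends only on k, ranking distributions by the discrete entropy is the same as ranking them by the divergence, with the order reversed.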

Entropy of the whole brain is not less than the average of every single brain region's entropy.
Method I: Calculate the entropy directly using all 4005 brain region pairs.
Method II: Calculate the entropy of every brain region first, then average them.
Method I calculates the entropy of the whole brain, while Method II calculates the average of every single brain region's entropy. The entropy obtained by Method I is not less than that of Method II.

Proof:
Let $[-1,1]$ be separated into n parts and let $p_{ij}$ stand for the probability that the correlation of the pairs connecting the j-th brain region occurs in the i-th part of $[-1,1]$. Write $P_j = (p_{1j},\dots,p_{nj})$ for the distribution of the j-th region and N for the number of brain regions (N = 90 here). Define H(P) as the entropy calculated by Method I. Since every pair connects exactly two regions, the whole-brain distribution is the average of the regional distributions. Thus,
$$H(P) = H\Big(\frac{1}{N}\sum_{j=1}^{N} P_j\Big).$$
Since the entropy H is a concave function of its argument 3 ,
$$H\Big(\frac{1}{N}\sum_{j=1}^{N} P_j\Big) \ge \frac{1}{N}\sum_{j=1}^{N} H(P_j).$$
The right hand side of the last inequality is simply the entropy calculated by Method II, where every term in the sum is the entropy of a single brain region. Therefore, the entropy obtained by Method I cannot be less than that of Method II. Moreover, because H is strictly concave, equality holds only when all brain regions have the same correlation distribution.
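The inequality between Method I and Method II can be illustrated on synthetic data. This is a sketch under stated assumptions: the BOLD-like time series are random numbers with a shared component, 20 bins are used as elsewhere in this Supplementary Information, and the helper name `entropy_bits` is hypothetical.

```python
import numpy as np

def entropy_bits(values, n_bins=20):
    """Shannon entropy (bits) of values binned into n_bins equal parts of [-1, 1]."""
    counts, _ = np.histogram(values, bins=n_bins, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic BOLD-like data: a shared component plus independent noise (assumption).
rng = np.random.default_rng(1)
n_regions, n_time = 90, 200
shared = rng.normal(size=(1, n_time))
loading = rng.uniform(0.0, 1.0, size=(n_regions, 1))
bold = loading * shared + rng.normal(size=(n_regions, n_time))
corr = np.corrcoef(bold)

# Method I: one histogram over all 4005 distinct region pairs.
upper = np.triu_indices(n_regions, k=1)
h_whole = entropy_bits(corr[upper])

# Method II: entropy of the 89 correlations of each region, then the average.
off_diag = ~np.eye(n_regions, dtype=bool)
h_regions = [entropy_bits(corr[j][off_diag[j]]) for j in range(n_regions)]
h_mean = float(np.mean(h_regions))

print(h_whole, h_mean)
```

Each pair appears in exactly two regional histograms, so the whole-brain histogram is the average of the regional ones and the concavity argument applies exactly.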

Entropy will grow higher with more intervals in [-1,1]
In the previous parts of this Supplementary Information, we separated [-1,1] into 20 intervals. If we separate it into more parts, the entropy will be larger.

Proof:
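According to the grouping property of the Shannon entropy,
$$H(p_{11},\dots,p_{1m},\dots,p_{n1},\dots,p_{nm}) = H(p_1,\dots,p_n) + \sum_{i=1}^{n} p_i\,H\Big(\frac{p_{i1}}{p_i},\dots,\frac{p_{im}}{p_i}\Big),$$
where $p_i = \sum_{j=1}^{m} p_{ij}$ is the probability of the i-th of the n parts and $p_{ij}$ is the probability of its j-th sub-part. The left side of this equality is the entropy with mn parts in [-1,1], while the first term on the right side is the entropy with n parts. Moreover, the last term of the equality is not less than zero. Thus, the entropy with mn parts is not less than that with n parts. In other words, the more parts [-1,1] is divided into, the higher the entropy.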
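The grouping identity can be verified numerically by building the coarse histogram from the fine one. This is a sketch only: the correlation values are synthetic, and the choice n = 20, m = 5 is an illustrative assumption.

```python
import numpy as np

def shannon_bits(p):
    """Shannon entropy in bits of a probability vector; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Synthetic correlation values (assumption), clipped to [-1, 1].
rng = np.random.default_rng(2)
corr = np.clip(rng.normal(0.1, 0.3, size=4005), -1.0, 1.0)

n, m = 20, 5                                   # n coarse parts, each split into m
fine_counts, _ = np.histogram(corr, bins=n * m, range=(-1.0, 1.0))
p_fine = (fine_counts / fine_counts.sum()).reshape(n, m)
p_coarse = p_fine.sum(axis=1)                  # merge each group of m sub-parts

h_fine = shannon_bits(p_fine.ravel())          # entropy with mn parts
h_coarse = shannon_bits(p_coarse)              # entropy with n parts
# Weighted entropy inside each coarse part (the non-negative last term).
within = sum(
    p_coarse[i] * shannon_bits(p_fine[i] / p_coarse[i])
    for i in range(n)
    if p_coarse[i] > 0
)

print(h_fine, h_coarse + within)
```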

Atlases with different numbers of brain regions
In the main text, the atlas we used was the AAL template, with 90 brain regions. We have investigated the relationship between atlas choice, the number of brain regions and the value of the functional entropy. We used different atlases with various numbers of brain regions on one sample, which was from a young healthy male, where the quality of the MRI was quite good. Figure S1 shows the entropy results from different atlases (red nodes) and a fit of the data by an exponential (blue line). The relationship between entropy and the number of brain regions, N, is Entropy=1.64N . +3.305. According to this equation, the entropy asymptotically approaches a constant as the number of brain regions tends to infinity. The limiting value of the functional entropy of this individual is 3.305.

Figure S1
Entropy versus the number of brain regions. The red nodes are from an entropy calculation based on atlases with different numbers of brain regions. The blue line is a fit through the nodes from an exponential function.
Note that not all samples were suited to calculations of the entropy using more than 2000 brain regions, because more brain regions lead to more noise. Thus, extracting an accurate entropy with more brain regions requires high quality MRI samples. Not all of the samples have sufficient quality; e.g., some samples are from 1.5T MRI or are affected by head motion.

Atlases with a fixed number of brain regions
We further note that if we use different templates but fix the number of brain regions, the results are quite stable. To see this, consider Figure S2, where we have chosen the AAL atlas and subdivided it in different ways. We considered 10 different atlases (all with 1024 brain regions) and extracted the entropy of 20 normal young individuals based on each atlas. For each sample, we found no significant entropy differences between the atlases. In Figure S2, different symbols label the entropy from different brain region atlases. The difference between the entropies from the various atlases is sufficiently small that we need to magnify the scale to distinguish them. Thus, the entropy results are quite stable.

Figure S2
10 different atlases were chosen to separate the AAL atlas into more parts. We calculated the entropy in samples from 20 normal people with ages in the range 20 to 26. There are no significant differences between the atlases.

Functional entropy and the mean correlation coefficient
In the main text we introduced the idea of functional entropy. The functional entropy in our work can be considered as a kind of second order moment of the correlation coefficient distribution. In the main text, we have shown that the entropy is closely related to age. Originally, we used the mean correlation coefficient, as has been commonly used, and which can be considered as a first order moment of the correlation coefficient distribution. However, the mean correlation coefficient does not significantly change with age (results for this are shown in Figure S3). The data are similar to those we apply in the entropy study. The blue nodes are for males, the red ones for females. If we remove data of the youngest and oldest, say younger than 12 and older than 72, a plot of the mean correlation coefficient against age has a slope for both males and females, which is zero to five significant figures. The correlations between the mean correlation coefficient and age are -0.0537 for males and 0.0279 for females with p values of 0.2262 and 0.5118 respectively (Student's t-distribution for the Pearson correlation coefficient, with 610 and 634 degrees of freedom, respectively). Thus the first order moment appears to play no role in a study of ageing. We thus considered a second order moment, namely the functional entropy in our investigations.

Figure S3
Mean correlation coefficient versus age. The red nodes are from females while blue nodes come from males.
Moreover, brain signals are found to demonstrate increased variability with age 4, 5 . We note that information entropy has a large value when there is a high level of randomness 6 and that higher randomness is equivalent to higher variability 7 . This suggests that the notion of entropy will be useful in investigating the neurobiology of ageing. Thus, we have focused on the functional entropy of the brain in the present study.
In the main text, we used an age window of 25 years to determine Fig. 2b. Here, we demonstrate that choosing a window anywhere from 19 to 30 years does not significantly affect any conclusions we draw. In Figure S4, we show the results for age windows ranging from 19 to 30 years. We conclude that the results are stable and not greatly affected by the age window size. The crossover age is 50 years in all panels.
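A running average over an age window can be sketched as below. The windowing rule (all individuals within half a window of the center age), the synthetic ages and entropies, and the function name `running_average` are assumptions made for illustration; the text does not specify how the window is applied.

```python
import numpy as np

def running_average(ages, values, window, centers):
    """Mean of `values` over individuals whose age lies within window/2 of each center."""
    ages = np.asarray(ages, dtype=float)
    values = np.asarray(values, dtype=float)
    out = np.full(len(centers), np.nan)        # NaN where a window is empty
    for i, c in enumerate(centers):
        mask = np.abs(ages - c) <= window / 2.0
        if mask.any():
            out[i] = values[mask].mean()
    return out

# Synthetic cohort (assumption): entropy rising slowly with age plus noise.
rng = np.random.default_rng(3)
ages = rng.uniform(7, 85, size=500)
entropy = 3.0 + 0.001 * ages + rng.normal(0.0, 0.05, size=500)

centers = np.arange(20, 71)
curves = {w: running_average(ages, entropy, w, centers) for w in (19, 25, 30)}
```

Comparing the curves for several window sizes is one way to check whether a conclusion depends on the window.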

Figure S4
The running average of the functional entropy, with averaging performed over differently sized age windows. The numbers in the upper left corners of each figure are the size of the window, in years.
In the main text, we focused on normal individuals. Additionally, we can determine the functional entropy of individuals with mental disorders, such as schizophrenia. To proceed, we selected a sample of schizophrenia patients with a mean age of 24 years. To compare normal people with schizophrenia patients, we selected another two groups from our original dataset. One consists of normal people with a mean age of 24 years. The other consists of elderly people with a mean age of 69 years. Each of the three groups is approximately half male and half female. We show the correlation coefficient distributions in Figure S5, where blue represents normal young individuals, red represents schizophrenia patients and white shows elderly people. It can be seen that the distribution of correlation coefficients for schizophrenia patients is narrower than that of normal people, while the distribution of correlation coefficients of the elderly people is broader, as claimed in the main text. In the language of functional entropy, the schizophrenia patients have lower entropy while the elderly people have higher entropy. As shown in some surveys 8 , older people have a low chance of developing schizophrenia. The entropy difference between elderly people and schizophrenia patients may be the origin of this phenomenon, if the entropy difference between different classes of individuals plays the role of a risk difference. This point will be discussed in detail elsewhere.

Figure S5
A histogram of the correlation coefficients of three groups of individuals: normal young (blue), normal elderly (white) and schizophrenia patients, denoted SCZD (red). The groups of normal young individuals and schizophrenia patients both have a mean age of 24 years, while the mean of the elderly group is 69 years. The gender distribution of the three groups is approximately half males and half females.

Functional entropy in the left, right and inter hemispheres.
In the main text, we focused on the functional entropy based on the whole brain (from all 4005 pairs of different brain regions). We can also consider the functional entropy within each hemisphere (intra-hemisphere) and between the hemispheres (inter-hemisphere). Again, we apply the AAL atlas with 90 brain regions. Thus, in each of the left and right hemispheres we have 990 ($C_{45}^{2}$) pairs, while there are 2025 (45x45) pairs of inter-hemisphere connections. Using the same calculational method, we have determined the functional entropy in the left, right and inter hemispheres. As shown in Figure S6 (left-hemisphere), Figure S7 (right-hemisphere) and Figure S8 (inter hemisphere), we obtain results similar to those based on whole brain entropy. The functional entropy in the left, right and inter hemisphere also increases with ageing, with a crossover at age 50. In addition, the functional entropy based on the inter hemisphere is lower than that based on the intra hemisphere: the inter-hemispheric functional and physical connections are weaker than those within a hemisphere, which leads to lower correlation coefficients in the inter hemisphere. Moreover, the correlation between functional entropy and age is lower for the inter hemisphere than for the intra hemisphere.

In the main text, we combined 26 databases to obtain an increase of functional entropy with age. To remove the effects of any differences between databases, we have carried out tests, such as removing some databases, to check the robustness of our results. We obtained very similar results, including an increase of functional entropy with age. Additionally, the crossover at age 50 persists. We can also find the increase of functional entropy in the data from a single database. In our dataset, there are two databases covering an age range of more than 50 years: the ICBM and Taiwan databases. As shown in Figure S9 and Figure S10, the data in the two databases also show an increasing functional entropy with age. Since the two databases contain only 36 and 48 samples, the correlations between the functional entropy and age are not statistically significant. The results for these databases nevertheless partially support the notion of increasing functional entropy with age.

In the main text, we used a computational model from Deco et al.'s work 9 to verify our result about functional entropy.

Neuron dynamics
The global structure of the model is illustrated in Figure S12. Every brain region served as a node in a large scale network, which consists of a population of excitatory pyramidal neurons and a population of GABAergic inhibitory neurons, which are all-to-all connected. The communication between every two nodes is through synaptic connections between excitatory neurons in those areas. For each brain area, an integrate-and-fire neuron model with excitatory (AMPA and NMDA) and inhibitory (GABA) synaptic receptors was applied. The dynamics of the membrane potential V(t) are described by: The gating variables The sums over the k index represent all of the spikes emitted by presynaptic neuron j (at times ). The description and value of most parameters are shown in Table S3.
Each local area contains 100 excitatory neurons and 100 inhibitory neurons. The connection strengths between and within the populations are determined by a dimensionless strength. As illustrated in Figure S12, there are 4 different intra-connection strengths: (1) excitation (AMPA and NMDA) within excitatory neurons; (2) excitation (AMPA and NMDA) from excitatory neurons to inhibitory neurons, fixed at 1; (3) inhibition (GABA) from inhibitory neurons to excitatory neurons, fixed at 1; (4) inhibition (GABA) within inhibitory neurons, fixed at 1. We vary the intra-excitatory strength (1) systematically to see the implications for the global functional entropy. The inter-regional connection strength is proportional to the number of fibers linking every two regions. The neuroanatomical matrix, whose elements are fiber numbers, is obtained by Diffusion Tensor Imaging. Here, we used the structural matrix averaged over 46 healthy people, which is shown in Figure S13.
All neurons receive an external background input from $N_{ext} = 800$ external neurons emitting independent Poisson spike trains at a rate of 3 Hz. More specifically, for all neurons inside a given population p, the resulting global spike train, which is still Poissonian, has a time-varying rate $\nu_{ext}^{p}(t)$, governed by
$$\tau_n \frac{d\nu_{ext}^{p}(t)}{dt} = -\big(\nu_{ext}^{p}(t) - \nu_0\big) + \sigma_\nu \sqrt{2\tau_n}\, n^{p}(t),$$
where $\tau_n = 300$ ms, $\nu_0 = 2.4$ kHz, $\sigma_\nu = 0.2$ is the standard deviation of $\nu_{ext}^{p}(t)$, and $n^{p}(t)$ is normalized Gaussian white noise.

BOLD signal
The fMRI BOLD signal is simulated by means of the Balloon-Windkessel hemodynamic model 10 . The BOLD signal of each region is driven by the level of neuronal activity summed over all neurons in both populations (excitatory and inhibitory) in that particular region. In all our simulations, neuronal activity is given by the rate of spiking activity in a time window of 1 ms. In brief, for the i-th region, neuronal activity z causes an increase in a vasodilator signal s that is subject to autoregulatory feedback. Inflow f responds in proportion to this signal with concomitant changes in blood volume v and deoxyhemoglobin content q . The equations relating these biophysical variables are: where is the resting oxygen extraction fraction. The BOLD signal is taken to be a static nonlinear function of volume and deoxyhemoglobin that comprises a volume-weighted sum of extra- and intravascular signals: where V =0.02 is the resting blood volume fraction. The biophysical parameters were taken as =0.2,k =0.65, =0.41, =0.98, = 0.32, =0.34.
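For orientation, the Balloon-Windkessel model is commonly written in the following form in the literature. This block is a hedged reconstruction rather than the authors' own equations: the symbols $\kappa_i$, $\gamma_i$, $\tau_i$, $\alpha$, $\rho$ and $V_0$ are the conventional names, the values 0.65, 0.41, 0.98, 0.32 and 0.34 quoted above appear to correspond to $\kappa$, $\gamma$, $\tau$, $\alpha$ and $\rho$, and whether an additional scaling factor multiplies $z_i$ cannot be recovered from the text.

```latex
\begin{align}
\dot{s}_i &= z_i - \kappa_i s_i - \gamma_i (f_i - 1), \\
\dot{f}_i &= s_i, \\
\tau_i \dot{v}_i &= f_i - v_i^{1/\alpha}, \\
\tau_i \dot{q}_i &= \frac{f_i \left(1 - (1-\rho)^{1/f_i}\right)}{\rho}
                   - \frac{v_i^{1/\alpha}\, q_i}{v_i}, \\
y_i &= V_0 \left[ 7\rho\,(1 - q_i) + 2\left(1 - \frac{q_i}{v_i}\right)
       + (2\rho - 0.2)(1 - v_i) \right].
\end{align}
```

Here $y_i$ denotes the BOLD signal of the i-th region; at rest ($s_i = 0$, $f_i = v_i = q_i = 1$) it vanishes.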

Simulation of the functional entropy
After we obtained the simulated BOLD time series, the global signal (the average over all regions) was regressed out. Figure S14 shows the typical temporal evolution of the simulated BOLD signal (after regression) for several brain regions. We then calculated the simulated functional connectivity as the correlation matrix of the BOLD time series. Figures S15 and S16 show an example of a simulated functional connectivity matrix and the corresponding distribution of correlations, respectively. Using the calculation method of functional entropy, we could compare the simulated functional entropy with that from fMRI data.

When we increase the intra-excitatory connection strength with other parameters fixed, the firing rate of excitatory neurons in the whole brain increases. (The firing-rate amplification of inhibitory neurons can be ignored compared to that of excitatory neurons.) Based on the fact that the firing rate of one excitatory neuron in the resting state is about 3 Hz, and that one model neuron here could also describe the dynamics of several neurons or a neural mass, we can calculate the actual excitatory neuron number in each brain region as 100 times the mean firing rate divided by 3 Hz. (Here we use the firing rate averaged over all excitatory neurons in the whole brain, not in each brain region.) Figure S17 illustrates the positive correlation between the actual excitatory neuron number and the intra-excitatory connection strength. The two red dashed lines show the range of connection strength [1.78,1.81] for which the corresponding entropy matches the human data. Based on the least squares line (black dashed line), the neuron number range is limited to [888,1130] (indicated by green text arrows).

Figure S12
Schematic representation of the brain network. Each brain area is comprised of excitatory neurons (red triangles) and inhibitory interneurons (blue circles). (1)-(4) represent the four different intra-connections within each brain area, and (5) describes the inter-connection between different brain areas, which depends on DTI.
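The post-processing steps described in this section (global signal regression, correlation matrix, functional entropy, and the conversion of a mean firing rate into a neuron number) can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the BOLD data are random numbers, the regression includes an intercept, 20 bins are used, and all function names are hypothetical.

```python
import numpy as np

def regress_out_global_signal(bold):
    """Remove the least-squares fit of the global mean signal from each region.

    bold has shape (n_regions, n_time).
    """
    g = bold.mean(axis=0)
    design = np.column_stack([np.ones_like(g), g])      # intercept + global signal
    beta, *_ = np.linalg.lstsq(design, bold.T, rcond=None)
    return (bold.T - design @ beta).T

def functional_entropy(bold, n_bins=20):
    """Entropy (bits) of the pairwise correlation coefficients between regions."""
    corr = np.corrcoef(bold)
    upper = np.triu_indices(corr.shape[0], k=1)
    counts, _ = np.histogram(corr[upper], bins=n_bins, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def effective_neuron_number(mean_rate_hz, n_model=100, rest_rate_hz=3.0):
    """Neuron number implied by a mean rate: n_model * rate / resting rate."""
    return n_model * mean_rate_hz / rest_rate_hz

# Synthetic BOLD-like data with a shared slow drift (assumption).
rng = np.random.default_rng(4)
n_regions, n_time = 90, 300
drift = np.cumsum(rng.normal(size=n_time))
bold = rng.normal(size=(n_regions, n_time)) + 0.5 * drift

cleaned = regress_out_global_signal(bold)
h_sim = functional_entropy(cleaned)
```

After the regression, the average of the cleaned signals over regions is zero at every time point, which is a quick sanity check that the global signal has been removed.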

Figure S13
Neuroanatomical connectivity matrix, obtained by DTI after averaging across 46 human subjects.

Figure S15
Simulated functional connectivity matrix when intra-excitatory connection strength =1.81.

Figure S16
The distribution of correlation coefficients when the intra-excitatory connection strength =1.81.

Figure S17
Excitatory neuron number versus intra-excitatory connection strength .
In this part, we focus on the rate of increase of the functional entropy (in bits per year) before maturity. As shown in Figure S18, the rate of increase before 25 years of age is higher than that between 25 and 50 years in both males and females. For males, the functional entropy increases by 0.0014 bit/year before 25 years and by 0.0004 bit/year between 25 and 50 years, while for females the values are 0.0023 bit/year and 0.0000 bit/year. In conclusion, functional entropy grows faster during childhood and adolescence, and its rate of increase becomes lower after maturity.
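A rate in bits per year within an age range can be estimated as the slope of a straight-line fit restricted to that range. The sketch below assumes that this is how such rates are obtained, which the text does not state for the entropy; the data are synthetic and built with the male rates quoted above, and `slope_per_year` is a hypothetical helper.

```python
import numpy as np

def slope_per_year(ages, values, lo, hi):
    """Least-squares slope of values against age for lo <= age < hi."""
    mask = (ages >= lo) & (ages < hi)
    slope, _intercept = np.polyfit(ages[mask], values[mask], deg=1)
    return float(slope)

# Synthetic entropy curve (assumption): 0.0014 bit/year before 25, 0.0004 after.
rng = np.random.default_rng(5)
ages = rng.uniform(7, 50, size=600)
entropy = (
    3.0
    + 0.0014 * np.minimum(ages, 25.0)
    + 0.0004 * np.maximum(ages - 25.0, 0.0)
    + rng.normal(0.0, 0.002, size=ages.size)
)

rate_early = slope_per_year(ages, entropy, 7, 25)
rate_late = slope_per_year(ages, entropy, 25, 50)
print(rate_early, rate_late)
```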

Figure S18
The figure represents the functional entropy trend with ageing. The left panel is from males, while the right one comes from females. The two green dashed lines in each panel stand for age 25 and 50.
In this part, we consider the relationship between the functional entropy and estrogen. Other studies 11,12 have reported that estrogen protects the brain, so we compare the functional entropy trend before and after menarche. Since the average age of menarche is 13 years 13 , we separate the samples before menopause into two groups, before and after 13 years old. As shown in Figure S19, the functional entropy of females increases faster before 13 years old. In particular, functional entropy increases by 0.0118 bits per year before 13, but by 0.0007 bits per year after that. This may be related to the estrogen level in females, and is consistent with the reports that estrogen protects the brain.

Figure S19
The figure represents the functional entropy trend with ageing in females. The two green dashed lines stand for age 13 and 50.
14 Similar entropy gender pattern found in gray matter volume
We have 496 samples (252 males/244 females) with T1-image data. We applied voxel-based morphometry 14 (VBM8) to extract the gray matter size of all the AAL brain regions. As shown in Figure S20, the gray matter size decreases with ageing in both males and females. We calculated the decrease in cubic millimeters per year by linear regression analysis. Since 50 years is an important threshold in the main text, we separated the samples into two groups, younger and older than 50 years. For males, the gray matter size decreases by 1620 mm³ per year before 50 and by 3090 mm³ per year after 50, while that of females decreases by 1930 mm³ per year before 50 and by 2960 mm³ per year after 50. The rate of decrease is larger after 50 than before 50 in both males and females. It should be emphasized that the decrease per year is larger in females than in males before 50, while the opposite holds after 50. This is quite similar to the pattern of the entropy in males and females.

Figure S20
The figure represents the gray matter size trend with ageing. The left panel is from males, while the right one comes from females.
As shown in Section 10, the age distribution of the dataset in this paper is not flat, which may lead to some errors. Thus, we randomly selected subjects with a flat age distribution (six individuals per age) and found that the significant result was still present: functional entropy increased with age, as shown in Figure S21.
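Drawing a subsample with a flat age distribution can be sketched as below. Several details are assumptions: ages are grouped by whole years, an age with fewer than six individuals contributes all of them, the cohort is synthetic, and `flat_age_subsample` is a hypothetical name.

```python
import numpy as np

def flat_age_subsample(ages, per_age=6, rng=None):
    """Indices of a random subsample with at most per_age subjects per year of age."""
    rng = np.random.default_rng() if rng is None else rng
    years = np.floor(np.asarray(ages)).astype(int)
    chosen = []
    for year in np.unique(years):
        idx = np.flatnonzero(years == year)
        take = min(per_age, idx.size)          # keep everyone if fewer are available
        chosen.extend(rng.choice(idx, size=take, replace=False))
    return np.sort(np.array(chosen))

# Synthetic cohort with a skewed age distribution (assumption).
rng = np.random.default_rng(6)
ages = np.clip(rng.gamma(4.0, 7.0, size=1500) + 7.0, 7.0, 85.0)

selected = flat_age_subsample(ages, per_age=6, rng=rng)
selected_years = np.floor(ages[selected]).astype(int)
```

Repeating the draw with different seeds gives different subsamples on which the age trend can be re-estimated.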

Figure S21
With different randomly selected subjects with a flat age distribution, the significant result was still present: functional entropy increased with age.

Understanding the functional entropy: a mathematical description
In this subsection, we present a mathematical description of the functional entropy defined in the main text. Let x(t) be a resting fMRI time course of the brain. This is defined on a function space (with respect to time), $\Omega$, which is a subspace of the Lebesgue function space $L^2(T,\mathbb{R})$, where T can be either the continuous time set [0,S] or the discrete time set {1,2,...,S}. We assume that a probability measure, P, is defined on $\Omega$ with its $\sigma$-algebra $\mathcal{B}$ induced by the norm of $L^2(T,\mathbb{R})$. In the following, we need not know the explicit forms of the probability distributions of the random functions to define our entropy. One particular fMRI time course from one of the regions of interest considered is regarded as a state point in the probability space $\{\Omega,\mathcal{B},P\}$. Let $x(\cdot)$ and $y(\cdot)$ be two independent random time courses following the same distribution. Their correlation coefficient can be regarded as a functional on $\Omega\times\Omega$, written $\rho(x,y)$. The functional entropy we defined is actually that of the random (scalar) variable $\rho(x,y)$ that is induced by the two independent random functions $x(\cdot)$ and $y(\cdot)$. Thus, we write the entropy as $H(\rho(x,y))$. The entropy of a region can be regarded as the specific conditional entropy of $\rho(x,y)$ when y is fixed, namely $H(\rho(x,y)\,|\,y=y(\cdot))$. The mean of the regions' entropies can be regarded as the conditional entropy of $\rho(x,y)$ with respect to the random function y(t), and we write this entropy as $H(\rho(x,y)\,|\,y)$.
To illustrate the meaning of this definition of functional entropy, we restrict the function space to all periodic functions with the same constant frequency, $\omega$. Then x(t) can be written as $x(t) = a\cos(\omega t+\varphi)$ and y(t) can be written as $y(t) = b\cos(\omega t+\theta)$. For sufficiently large S, the means of both x and y are approximately zero. Their correlation coefficient, as $S\to\infty$, becomes $\rho(x,y) = \cos(\varphi-\theta)$. Additionally, $\rho(x,y) = \cos(\varphi-\theta)$ also holds if S is an integer multiple of the period. Thus, the correlation coefficient between two periodic oscillations with the same period is the cosine of the phase difference. In this scenario, the function space $\Omega$ can be embedded in a phase space $\Theta$ (a one dimensional torus, namely $[0,2\pi]$). Let $\{\Theta,F,P\}$ be the probability space induced by $\{\Omega,\mathcal{B},P\}$, let $\varphi$ and $\theta$ be two independent random variables in it, and let $z=\cos(\varphi-\theta)$. Then, the entropy of z is H(z), the region's entropy is the specific conditional entropy $H(z\,|\,\theta=\theta_0)$, and the mean regional entropy is the conditional entropy $H(z\,|\,\theta)$. Note that in the definition of the entropy, the Lebesgue measure of $z=\cos(\varphi-\theta)$ is the trivial one defined on [-1, 1], denoted by $m(z)$. The difference between the entropy of z and its conditional entropy is clearly the mutual information between $\theta$ and z: $I(z,\theta) = H(z) - H(z\,|\,\theta)$.
(18) Let =| |, then, z is equivalent to , considering the torus with 2 + . With a properly-defined Lebesgue measure of , i.e., the measure induced by that of z, namely m ( ), the entropy and conditional entropy of then equal those of z. Let us pick the measures of these variables, in order to define the (relative) entropy. That is: First, pick a joint measure of ( , ) (denoted by m ( , ) = ( ) ( ), a joint measure of two independent and identical measures, where ( ) is determined below), and a joint measure of ( , ), denoted by m ( , ( )), such that they are preserved through the transformation between them, (i.e., letting T be the transform from ( , ) to ( , ), then for any measurable set A in the space of ( , )), m ( ) = ( ( )) holds); Second, pick the measure ( , = | |) induced by m ( , ) and denoted by m ( , ); Finally, ( ) is chosen to guarantee that the embedded measure of equals to m ( ). Thus, we have (19) We conclude that D( | m ( , )) D( m( )). This implies that the mean regional entropy is a lower-bound of the entropy of the phase random variable under the measure preserving transformation. In addition, if different measures were picked, there would be constant differences in the above inequality induced by the expectation of the derivative of different Lebesgue measures (see the relative entropy part above in 3.1).
In particular, if we pick the trivial measure on the torus and let $f(\varphi)$ be the probability density function (pdf) of $\varphi$, then after simple algebra, we have the conditional pdf of $\psi = |\varphi-\theta|$ with respect to $\theta$ as $p(\psi\,|\,\theta) = f(\theta-\psi) + f(\theta+\psi) \ge 0$.
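The statement earlier in this subsection that two oscillations with the same period have a correlation coefficient equal to the cosine of their phase difference can be checked numerically. The sketch below uses illustrative values for the amplitudes, phases and sampling (all assumptions) and a discrete time set covering an integer number of periods.

```python
import numpy as np

# Illustrative parameters (assumptions): 20 samples per period, 10 full periods.
period = 20
omega = 2.0 * np.pi / period
S = 10 * period
t = np.arange(S)

a, b = 2.0, 0.5            # positive amplitudes
phi, theta = 1.1, 0.3      # phases of x and y

x = a * np.cos(omega * t + phi)
y = b * np.cos(omega * t + theta)

rho = np.corrcoef(x, y)[0, 1]
print(rho, np.cos(phi - theta))
```

The amplitudes drop out of the correlation coefficient, so only the phase difference determines its value.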