Introduction

Circadian rhythms synchronized by the rotation of the Earth are 24-h self-sustained oscillations that can influence a range of physiological responses and behavioral processes, such as hormone secretion and sleep-wake cycles1,2,3,4,5. The circadian rhythms are regulated by a complex transcriptional feedback loop involving the “core circadian clock” in the suprachiasmatic nucleus (SCN) of the hypothalamus and the other peripheral clocks throughout the body6,7. Rhythmic fluctuations driven by the core clock could be illustrated as a sinusoidal curve with a period of about 24 h8, and hence the circadian rhythms are typically described using three key parameters (Supplementary Fig. 1): period (the length of one cycle), phase (the timing of a reference point in the cycle relative to a fixed event), and amplitude (the difference between crest and trough values of the cycle)9. While methods for measuring period10,11 and phase12,13,14,15,16,17 are well-established and accurate in both laboratory and field studies, assessing circadian amplitude remains challenging.

Although there is no accepted gold standard for measuring the circadian amplitude18, the amplitude of pineal hormone melatonin has been used as one of its reliable biomarkers, and changes in the melatonin amplitude are also observed in forced desynchrony studies19,20. However, assessing the melatonin amplitude requires a melatonin profile of at least 24 h with continuous collection of blood or saliva samples21. Given that the collection protocol of melatonin is costly, labor-intensive, and infeasible for large population studies and intensive follow-up cohort studies in natural settings, researchers have attempted to find alternative methods to characterize circadian amplitude using wearable sensor data22,23.

Wearable devices can continuously monitor the physical activity of an individual for multiple days in a natural environment, and thus can solve the challenges present in melatonin sampling. The main rationale is based on the biological knowledge that both the melatonin rhythms and rest-activity cycles are strongly influenced by the core circadian clock and hence they may share similar oscillation patterns8. Nevertheless, accelerometer data also have some drawbacks. A major challenge is that the wearable device-derived circadian amplitude is influenced by voluntary behavioral changes in the natural environment that are not related to circadian rhythms12,17,18. More importantly, the most commonly used accelerometer-derived measure, relative amplitude, is calculated solely based on the differences between the highest and lowest activity counts22,24, which has lost rich information of the high-dimensional activity data. Furthermore, the direct relationship between the circadian amplitude characterized by accelerometer data and the circadian amplitude depicted by melatonin data has not been well studied. Thus, it is critical to establish and validate a sound method for extracting accurate information about circadian amplitude from high-dimensional accelerometer data to fully leverage the rich high-dimensional information without the masking effect of behaviors.

Changes in the amplitude of melatonin have been widely observed in age-related cognitive impairments and neurodegenerative diseases9,25,26,27,28. However, few studies investigated the association between circadian amplitude and cognitive function in general population. Given that the circadian rhythms vary greatly during adolescence and many mental disorders also first appear in the adolescent stage29,30, and that the circadian rhythms are susceptible to heavy life and work during adulthood, it is necessary to examine the correlation between circadian amplitude and cognitive performance at these stages of life31. Furthermore, currently there is no evidence to support a causal relationship between circadian amplitude and cognitive performance9,32. One prior study tried to study the causal relationship, but it only measured the relative amplitude and found that it was only associated with a small number of genetic variants that could not be used to investigate the causal relationship using Mendelian randomization (MR) analysis33. Thus, it is necessary to develop a metric of circadian amplitude to well capture circadian amplitude information with a strong genetic basis and clinical relevance, which can allow us to infer the causal relationship between circadian rhythms and cognitive functions.

Therefore, in this study we developed an amplitude metric for circadian rhythms based on the accelerometer data and explored its clinical potential in two large-scale databases with different age groups. The workflow chart is shown in Fig. 1. First, we proposed a pipeline to derive a metric called circadian activity rhythm energy (CARE) which can remove the influence of behavioral factors using spectral analysis approaches, and then we validated the metric CARE with the melatonin amplitude. Secondly, we applied CARE to two accelerometer datasets with 1703 adolescents and 92,202 adults, respectively, to examine the relationship between CARE and cognitive functions. Finally, we explored the causal relationship between CARE and cognitive performance through a genome-wide association study (GWAS) and MR causal analysis.

Fig. 1: Workflow chart of the study.
figure 1

CARE circadian activity rhythm energy. *We also selected the significant loci that were associated with the relative amplitude. However, only three single-nucleotide polymorphisms (SNPs) were found, the number of which was too small to conduct further causal analysis.

Results

A wearable-derived measure of circadian amplitude: CARE

In this paper, we developed a pipeline to extract a metric CARE (i.e., circadian activity rhythm energy) to accurately quantify the circadian amplitude at a large scale with low costs. CARE can extract the circadian feature via acceleration signal decomposition, circadian sub-signal extraction, and energy estimation of the sub-signals from accelerometer data (details shown in Fig. 2).

Fig. 2: Pipeline to derive the measure circadian activity rhythm energy (CARE) from accelerometer data.
figure 2

a Actogram to show 7 days of wrist accelerometer activity of a single participant. b A visual representation of the raw activity signal and the singular spectrum analysis (SSA) decomposed signals, including the base signal (i.e., the first sub-signal indicating the non-periodic trend with the largest energy among all sub-signals), the 24-h signal, and the behavioral noise signal (i.e., sum of the sub-signals with periods <24 h). We can obtain the raw activity signal by adding these three signals together. c The graphical display of SSA algorithm. Specifically, the activity time series \({{\rm{X}}}_{N}\) of length N could be decomposed using SSA as follows. First, we chose an appropriate window length L such that \(2\le L\le \frac{N}{2}\). Then, \({{\rm{X}}}_{N}\) was transferred into a trajectory matrix with K lagged vectors of \({{\rm{X}}}_{L}\) as given by T, where K = N – L + 1. The trajectory matrix T was decomposed by singular value decomposition. By grouping the eigentriples and averaging the elements of reconstructed trajectory matrix along anti diagonals, we could get filtered time series represented by \({{\rm{G}}}_{N}=\left\{{g}_{1},{g}_{2},\ldots ,{g}_{N}\right\}\).

The accelerometer data can be considered as a composition of sub-signals with different frequencies. In order to characterize circadian rhythms with behavioral masking effects removed, we decomposed the data of activity counts and extracted the sub-signals with a period of ~24 h to represent the endogenous circadian oscillation. Among a list of signal decomposition approaches such as Fourier and wavelet analyses, we chose singular spectrum analysis (SSA, see a detailed description in Fig. 2c) in our study for its data adaptive property34,35 and better performance compared to Fourier and wavelet transform methods. Inspired by previous decomposition analysis on accelerometer data23,36, we used the relative energy of the circadian sub-signal with respect to the original signal as an estimate of the circadian amplitude. For a discrete-time signal x(t), energy is defined as the envelope of squared signal magnitude (Eq. (1)):

$${\rm{Energy}}= < x\left(t\right),\;x(t) > =\mathop{\sum }\limits_{{t}_{0}}^{{t}_{0}+T}{(x\left(t\right))}^{2}$$
(1)

where \({t}_{0}\) is the starting time point of \({\rm{x}}\left({\rm{t}}\right)\) and \(T\) is the time span of \({\rm{x}}\left({\rm{t}}\right)\). The unit of \({\rm{x}}\left({\rm{t}}\right)\) and energy are (activity counts)/minute and (activity count)2/min2, respectively.

In summary, CARE is calculated using the following equation (Eq. (2)):

$${\rm{CARE}}=\frac{{{\rm{Energy}}}\;{{\rm{of}}}\;24\mbox{-}{{\rm{h}}}\;{{\rm{signal}}}}{{{\rm{Total}}}\;{{\rm{energy}}}\;{{\rm{of}}}\;{{\rm{activity}}}\;{{\rm{signal}}}}=\frac{{\sum }_{i\in I}{{||}{X}_{i}{||}}_{{\rm{F}}}^{2}}{{{||X||}}_{{\rm{F}}}^{2}}$$
(2)

where \({{||}{X}_{i}{||}}_{{\rm{F}}}^{2}\) is Frobenius norm of the ith sub-signal, \(I\) is the subset of indices corresponding to the 24-h signal, and \({{||X||}}_{{\rm{F}}}^{2}\) is Frobenius norm of the raw activity signal. It should be noted that CARE is expressed as a ratio ranging from 0 to 1 and is unitless. Furthermore, to effectively capture the dominant temporal scales and variability in the accelerometer data, and to obtain a reliable estimate of the autocovariance matrix feature, we required a minimum input data length of three days to calculate the CARE metric.

Validating CARE with melatonin amplitude

We validated the feature CARE by examining its association with the melatonin amplitude in a dataset of 33 healthy participants aged 23–61 years, where accelerometer activity and melatonin were simultaneously collected (Table 1). Among the 33 participants, mean CARE was 0.26 (SD = 0.05; range 0.14–0.35), mean relative amplitude was 0.86 (SD = 0.07; range 0.67–0.96), and mean melatonin amplitude was 11.44 pg/ml (SD = 6.81; range 1.02–28.27). The correlation analysis revealed that the melatonin amplitude was only significantly associated with CARE (Pearson’s r = 0.46, P = 0.007; Fig. 3a). On the other hand, we observed no significant association between melatonin amplitude and relative amplitude (Pearson’s r = 0.24, P = 0.19; Fig. 3) or relative energy of behavioral noise signals (Pearson’s r = 0.07, P = 0.68; Fig. 3). Moreover, our study found no significant association between age and sex with melatonin amplitude (P > 0.05; Supplementary Table 1). We also found that CARE accounted for 21.16% of the total variance of melatonin amplitude, whereas age and sex accounted for only 4.3% and 0.03% of the variance, respectively (Supplementary Table 1).

Table 1 Description of the participants and the accelerometer data.
Fig. 3: Correlations between wearable-derived CARE, relative amplitude, and relative energy of behavioral noise with melatonin amplitude.
figure 3

CARE circadian activity rhythm energy. Scatter plots between CARE, relative amplitude, and relative energy of behavioral noise with melatonin amplitude in the melatonin study are shown in the bottom-left corner, and the correlation coefficients between these variables are shown in the upper-right corner. The black lines indicate the linear regression fits, and the shaded areas represent the confidence intervals of the fitted mean values. The correlation coefficients annotated with an asterisk (*) indicate that the correlation is significant at P < 0.05 level, while two asterisks (**) indicate that the correlation is significant at P < 0.01 level.

Correlation between CARE and cognitive outcomes in adolescents and adults

We applied CARE to two population-based accelerometer datasets with a total of 1703 adolescents aged 10–19 years from China and 92,202 adults aged 39–70 years from UK. Characteristics of the participants and accelerometers data are presented in Table 1, and other analyzed variables are described in Supplementary Tables 2 and 3.

Mean CARE values were 0.10 (0.04) and 0.13 (0.04) for adolescents and adults, respectively. And mean relative amplitude values were 0.91 (0.05) and 0.87 (0.06) for adolescents and adults, respectively. Our results showed that CARE was significantly associated with relative amplitude in both populations (Pearson’s r = 0.41, P < 0.0001 for the adolescents; Pearson’s r = 0.35, P < 0.0001 for the adults). Furthermore, we confirmed the internal consistency of CARE by analyzing the intra-subject versus inter-subject variance. The results showed that the intra-subject variability of CARE values was not significant in both adolescents (P = 0.18) and adults (P = 0.25), while the inter-subject variability was significant in both datasets (P < 0.0001, Supplementary Table 4). Moreover, we found significant between-group differences in CARE values among individuals with psychiatric disorders, such as bipolar affective disorder, schizophrenia, depression, and those in the control group from the adult dataset (P = 0.02, Supplementary Table 5).

In the adolescent dataset, we used the Behavior Rating Inventory of Executive Function (BRIEF)37 to assess adolescent cognitive functions in nature settings, which includes three summary indices, namely the Behavioral Regulation Index (BRI), the Metacognition Index (MI), and the Global Executive Composite (GEC). After adjusting confounding factors, the results showed that CARE had significant associations with GEC scores (P = 0.016), but it was not associated with BRI (P = 0.10) and MI scores (P = 0.03) (see detailed results in Table 2). However, no significant association was found between relative amplitude with any of the cognitive scores (P > 0.017) (see detailed results in Supplementary Table 6).

Table 2 Associations between CARE and cognitive functions in adolescents.

In the adult dataset, cognitive assessment was conducted via touchscreen cognitive tests, and four main tasks were considered, i.e., reaction time test assessing processing/reaction speed domain, fluid intelligence test assessing reasoning ability domain, pairs matching test assessing short-term memory domain, and prospective memory test assessing prospective memory. We found that CARE showed significant correlations with reasoning ability, short-term and prospective memory (P < 0.0001), but correlations were not significant for processing/reaction speed (P = 0.31). Details regarding the associations between CARE and cognitive scores are represented in Table 3. Though we found that relative amplitude was correlated with processing/reaction speed (P < 0.0001), it was not associated with reasoning ability, short-term and prospective memory (P > 0.013). Details regarding the associations between relative amplitude and cognitive scores are represented in Supplementary Table 7.

Table 3 Associations between CARE and cognitive functions in adults.

Genomic locus associated with CARE

We conducted a GWAS study of CARE in the adult sample, which contains filtered genetic data for 85,361 people. One genomic locus on chromosomes 3 was found to correlate with CARE at the genome-wide significance level of 5 × 10−8, including 126 single-nucleotide polymorphisms (SNPs) (Fig. 4, lead CARE-associated variants in Supplementary Table 8 and 126 CARE-associated SNPs in Supplementary Table 9). The quantile–quantile (QQ) plot checking for population stratification indicated proper control of population structure, with a slight deviation in test statistics compared to the null (Fig. 4, λGC = 1.10). The heritability estimate for CARE was >11% (h2SNP = 0.114, se = 0.007). This locus was reported to be associated with common executive functions38, intelligence39, and educational attainment40 in previous studies. However, only three SNPs were found to be associated with relative amplitude, which was too few for causal inference using MR methods (Supplementary Fig. 2 and Supplementary Table 10).

Fig. 4: Manhattan and QQ plots for CARE in genome-wide association study in the adult sample (n = 85,361).
figure 4

a The Manhattan plot shows association test (−log10 P value on the y-axis against physical autosomal location on the x-axis). The red line represents the genome-wide significant locus (P < 5×10−8). Heritability estimates were calculated using LDSC tool. b The QQ plot identifies a slight inflation (λGC = 1.10) in the test statistic.

Causal relationship between CARE and cognitive phenotypes

To investigate whether CARE has a causal effect on cognitive outcomes, we performed MR analyses on significant correlation pairs, i.e., CARE with reasoning ability, short-term and prospective memory (Table 3). Of the 126 CARE-related SNPs, a total of 109 variants were used as instrumental variables to conduct the MR analyses (Supplementary Table 9). Using the weighted median method, we found that the genetically predicted CARE had a significant causal effect on reasoning ability (\(\beta =-59.91\), P < 0.0001), short-term (\(\beta =7.94\), P < 0.0001) and prospective memory (\(\beta =16.85\), P < 0.0001). Sensitivity analyses including inverse variance weighted (IVW)41, MR-Lasso42, the mode-based estimate (MBE)43, and constrained maximum likelihood-based method (MR-cML)44 all showed consistent results with weighted median methods (Supplementary Table 11).

Single-tissue and cross-tissue transcriptome-wide association analysis

Single-tissue enrichment analysis identified 119 unique genes associated with CARE (P < 2.9 × 10−6) in 44 Genotype-Tissue Expression (GTEx) tissues (Supplementary Table 12). Among them, APEH was associated with CARE in 18 tissues and was found to correlate with blood protein measurements in previous GWAS studies45. TRAIP was detected in 11 tissues and was reported to be associated with intelligence38, mental disorders46,47, insomnia48, blood protein measurement49,50, and body mass index51. MST1R was associated with CARE in 9 tissues and is related to physical activity measurements52, intelligence53, and body mass index54. These correlation results also emphasized shared genetic architecture between the central nervous system, the metabolic system, and the circadian system. Furthermore, a total of 13 significant genes were detected using joint-tissue tests for gene–CARE associations across tissues (P < 2.9 × 10−6; Supplementary Table 13).

Discussion

In the present study, we proposed a pipeline to derive a wearable-based feature (i.e., CARE) to characterize circadian amplitude and applied it to identify the association between circadian amplitude and cognitive functions in two large-scale datasets with different age groups. We found a significant association between CARE and melatonin amplitude (a reliable measure of circadian amplitude), while relative amplitude (a commonly used amplitude of the activity) was not. Findings that CARE was associated with a multitude of cognitive outcomes in adolescents and adults demonstrated that CARE could be a clinically meaningful circadian feature derived from objective accelerometer records. Furthermore, we identified one genetic locus with 126 SNPs associated with CARE, and provided the first direct evidence of the causal relations between CARE with reasoning and memory abilities in an adult sample.

Our study found a moderate correlation between the feature CARE and melatonin amplitude in general population under natural settings. The use of CARE can effectively eliminate the influence of behavioral noise on the assessment of circadian amplitude using the accelerometer data, as supported by the lack of significant association between behavioral noise signals and melatonin amplitude. Notably, we calculated CARE values using accelerometer activity data of at least 3 days, making it a summary statistic across multiple days for each individual. This computation approach enhances the stability of CARE values and reduces the intra-subject variability. Nonetheless, it is important to note that the CARE values may be impacted by data obtained from various accelerometer devices, which could potentially affect their comparability. Furthermore, the range of CARE values observed in the melatonin dataset is relatively narrow, which is likely due to the limited range of observed melatonin amplitude, as the maximum melatonin amplitude can be up to 84 pg/ml in healthy adults55. Therefore, it would be beneficial to investigate whether a broader range of melatonin amplitude leads to a wider range of CARE values in the future research.

The findings that age and sex were not significantly associated with melatonin amplitude may contradict with previous research that melatonin levels typically decrease with age starting from middle age and that women tend to have higher melatonin amplitude than men56,57,58. This might be due to the limited number of middle-aged and elderly participants (age >40 years) and the limited sample size in our dataset. Future research shall recruit larger samples, especially from the elderly adults, to further examine age and sex differences.

Our study demonstrated that CARE and relative amplitude may assess different aspects of circadian rhythmicity. Although they are all derived from accelerometer data, CARE was designed to reflect the strength of the core circadian clock located in the SCN, and relative amplitude is more of a metric to quantify the highest/lowest disruption of rest-activity rhythms. The identification of a sufficient number of CARE-associated variants can enable us to perform causal inference through MR analysis, which is not possible with relative amplitude. In addition, the SNP heritability of CARE accounted for a higher proportion of population variance than relative amplitude in the adult sample (UK Biobank). These results further supported CARE as a clinically meaningful feature. Future research is still warranted to systematically compare and assess the relative contribution and relevance of CARE and relative amplitude, particularly in the context of health and disease.

Our study observed a differential association between CARE and cognitive functions in adolescent and adult groups. In adults, we observed a correlation between CARE and the problem-solving component of cognitive functions, specifically reasoning and memory domains. However, in adolescents, this association was not present. This discrepancy may indicate the existence of a compensatory mechanism in adolescents that counterbalances the impact of circadian rhythm impairments on problem-solving ability. More research is needed to confirm our hypothesis in the future.

In the present study, we demonstrated a causal effect of CARE on reasoning and memory abilities in adults, and we also provided evidence for the shared genetic architecture of circadian rhythms with neurological function9. The genome-wide significant locus was found to be associated with common executive functions38, intelligence39, and educational attainment40. Furthermore, results from the single-tissue and cross-tissue transcriptome-wide association analyses further confirmed the CARE’s strong biological basis and highlighted the important role of circadian rhythms on the central nervous system, which is in line with previous findings from the other UK Biobank studies59.

In conclusion, the feature CARE that we derived from accelerometer data, is closely related to the melatonin amplitude in natural settings, and also serves as a meaningful metric for a wide range of cognitive functions in general adolescents and adults. Future studies with large sample size and different protocols such as forced desynchrony settings and shift work should be conducted to validate the feature CARE and to confirm its causality on other health-related outcomes.

Methods

The melatonin dataset

Participants

This dataset recruited 33 healthy adults (17 male and 16 female) aged 23–61 years in January 2023. Prior to enrollment, all participants underwent screening to exclude the use of sleep aid supplements (e.g., melatonin), a history of shift work, and sleep disorders. The participants were asked to complete a range of demographic, lifestyle, and health related questionnaires, to wear an accelerometer, and to collect saliva samples with 5 times. This protocol was approved by the Shanghai Children’s Medical Center’s Human Ethics Committee (SCMCIRB-K2021070-2) and Shanghai Jiao Tong University School of Medicine (SJTUPN-202301). All participants provided digital written informed consent.

Accelerometer data collection and processing

The participants were asked to wear an accelerometer (Micromotion logger sleep watch, Ambulatory Monitoring Inc., Ardsley, NY) on their non-dominant wrist for 9 consecutive days, meanwhile, they were required to keep a diary recording bedtime, rise time, and off-wrist period on each day. They were informed not to remove the device unless engaged in special activities (e.g., showering), and to re-wear it immediately after the activities were finished. We defined the daytime period as 08:00–20:00 and the nighttime period as 20:00–08:00. According to the diary recording for the off-wrist periods, we defined a criterion that at least 15 min of consecutive zero counts were identified nonwear periods in period 08:00–23:00, and at least 3 h of consecutive zero counts were identified nonwear periods in the remaining period. We also considered that the valid data on each day should meet at least 20 h (i.e., more than 83% of the whole 24 h) of wearing time without more than three consecutive hours of missing data. If less than 3 h of missing data existed, we used the mean of the corresponding time points on the other days to fill the gaps. After checking, if there were still missing values (often due to the missing data being at a fixed time every day, so it could not be filled using the mean value), the average of physical activity counts around the missing area (i.e., about 6 epoch) was used to fill. After preprocessing the invalid and missing raw data, each individual included should have at least three valid days of data, and the activity counts with 1-min resolution were used to calculate the feature CARE using the above-proposed pipeline and to compute the commonly used feature relative amplitude by the difference between the average activity level of an individual during the most active consecutive 10 h (M10) and the least active consecutive 5 h of the day (L5) (Eq. (3)).

$${\rm{Relative}}\; {\rm{amplitude}}=\frac{M10-L5}{M10+L5}$$
(3)

Saliva melatonin data collection and processing

We used the Salivettes® (Sarstedt, Nümbrecht, Germany) to collect saliva. On the eighth and ninth day of wearing the accelerometer, the participants were asked to collect 2 ml saliva samples 5 times (i.e., 9 a.m., 15 p.m., 21 p.m., 3 a.m., and 9 a.m. of the next day; time window ±1 h) across 24 h, of which the saliva sample at 21 p.m. and 3 a.m. were required to be collected under the dark light (light intensity <30 lux) if the participants usually went to sleep at that time. The participants were informed to follow their usual sleep schedule and do not intake caffeine, alcohol, and fatty foods on the sampling day, to keep their mouth clean (e.g., food intake restriction) in the 30 min prior to sampling, and do not swallow saliva while chewing the swab. They were also informed to store each sample immediately in the freezer layer of the refrigerator after collection, and to transfer all of the samples to the laboratory using ice packs. Subsequently, after centrifugation at 4 °C for 10 min at 3000 rpm, the supernatant saliva samples were collected into 2 ml EP tube and stored at −80 °C until use.

Saliva melatonin was measured by liquid chromatography-tandem mass spectrometry, which can quantify the range of the melatonin from 1 to 1000 pg/ml, and only use small volume of the saliva samples (100 µl). Detailed measurement was based on an article to be published, which was similar to the quantification of melatonin in human milk60. For more information, please contact the corresponding authors. After quantification of saliva melatonin levels, we checked the extreme values, and excluded the values less than 2 pg/ml which collected at 3 a.m. (time window ±1 h). And then melatonin profiles were obtained by linear interpolation and curve fitting using skewed baseline cosine function (SBCF) (Supplementary Fig. 3), a robust model for melatonin estimation when there were limited time points in the collection of melatonin61.

The adolescent dataset

Participants

The Shanghai Children’s Health, Education, and Lifestyle Evaluation-Adolescent (SCHEDULE-A) is a population-based cross-sectional study designed to investigate the physical and mental health of adolescents (10–19 years). A total of 1703 adolescents from a subsample of SCHEDULE-A (Shanghai city, China) were utilized for this study. Two data acquisition schemes, an online questionnaire survey, and offline accelerometer wearing, were performed simultaneously to collect adolescents’ social-demographic information, lifestyle behaviors, physical and mental health outcomes. Details of the protocol can be found in one prior published study62. This study was approved by the Human Ethics Committee of the Shanghai Children’s Medical Center according to the Declaration of Helsinki (SCMCIRB-K2018103). Written informed consent was obtained from all participants.

Accelerometer data collection and processing

Participants were asked to wear an accelerometer (ActiGraph wGT3X-BT, Pensacola, FL, USA) on the non-dominant wrist for seven consecutive days. The other requirements including device wearing and diary recording were similar to the melatonin dataset. Using the same way to preprocess the raw activity data, and then CARE and relative amplitude were calculated.

Measurement of cognitive functions

The Chinese version of the BRIEF was used to assess adolescent cognitive functions in nature settings63. The BRIEF has 86 items, each one is rated on a three-point Likert-type scale, including never, sometimes and often, and parents of the participants were asked to select the most suitable answer for their adolescents in the past 6 months. After cleaning the raw data62, three indexes were calculated: (1) the BRI, represents the ability to modulate behavioral and emotional responses appropriately; (2) the MI, evaluates the ability to solve problems, including initiating activities and tasks, holding information in mind for purpose of completing a task, anticipating future events, monitoring the effects of one’s behaviors on others and keeping track of daily assignments; (3) the GEC, is a composite measure of all the cognitive functions mentioned above. Values of BRIEF scores reflect the extent of impairment of the corresponding cognitive functions, and higher scores indicate greater impairments.

Covariates

One of the participants’ parents was asked to complete a questionnaire about their social-demographic characteristics, i.e., adolescents’ age, sex, primary caregiver (parents/grandparents or others), family income (<50,000 CNY/year, 50,000–150,000 CNY/year, ≥150,000 CNY/year), and parental education level (below high school, high school, college and above).

The adult dataset

Participants

The UK Biobank is a general longitudinal study with over 500,000 participants recruited from the United Kingdom (UK). In the current study, we selected 92,202 participants who provided qualified accelerometer data for at least three days after data preprocessing and who had complete records in age and sex. The demographic, lifestyle, health, mood, cognitive, and physical conditions of the participants were collected at 22 assessment centers. The application number for the UK Biobank dataset was 57,947 and the UK Biobank had received ethical approval from the North-West Research Ethics Committee (11/NW/03820) and all participants gave written informed consent.

Accelerometer data collection and processing

Participants were asked to wear an accelerometer (Activity AX3 wrist-worn triaxial accelerometer, Newcastle University, UK) for 7 consecutive days. The accelerometer raw data were preprocessed similarly to the above melatonin and adolescent datasets. Participants were also excluded if their data were identified by UK Biobank as poorly calibrated (n = 11) or unreliable (unexpectedly small or large size, n = 4692). Participants who provided accelerometer data for less than 72 h or who did not provide data for all one-hour periods within the 24-h cycle (n = 4465) were also excluded from further analyses. Similarly, after cleaning the raw data, CARE and relative amplitude were calculated.

Measurement of cognitive functions

Four tests were used to assess adult processing/reaction speed, reasoning ability, short-term memory, and prospective memory. (1) The processing/reaction speed was assessed by the reaction time test. During the test, participants were shown 12 pairs of images of cards for 12 rounds, and they pressed the button on the touchscreen when the two cards are the same. The processing/reaction speed was then calculated by the mean reaction time of the correct matches over the last 7 rounds after the exclusion of times below 50 ms and above 2000 ms. Due to negative skewness in reaction time, we performed the analyses using both the raw and log-transformed data form. Since the results were equivalent, only the results of raw reaction time were reported for ease of interpretation. (2) Reasoning ability was measured by the fluid intelligence tests. Participants were asked to answer 13 fluid intelligence questions within 2 minutes and the total number of correct answers to these questions was recorded as the fluid intelligence score. (3) Short-term memory was assessed by the pairs matching tests. Participants were asked to memorize as many pairs of cards as possible within the time the cards were shown. Then the cards were turned face down and the participants were asked to match the cards. The test was conducted in two rounds, with 3 and 6 pairs of cards presented in each round. The outcome measure used in this study was the number of incorrect matches in rounds, as suggested by the UK Biobank website. (4) Prospective memory was assessed by the prospective memory test. Before the other touchscreen cognitive tests, the participant was shown the message “At the end of the games we will show you four colored shapes and ask you to touch the Blue Square. However, to test your memory, we want you to actually touch the Orange Circle instead.” The results were divided into 3 categories, 0 for instruction not recalled, 1 for correct recall on the first attempt, and 2 for correct recall on the second attempt, respectively. The outcome measure used in this study was the number of attempts before getting the correct recall.

Covariates

Participants provided data on social-demographic characteristics including age, sex, ethnicity, educational attainment, and lifestyle at the baseline assessment. Townsend’s social deprivation score is defined based on the postcode of residence, with negative scores reflecting relatively greater affluence. The season when the participant started to wear the accelerometer was derived from raw accelerometer records, i.e., spring for March to May, summer for June to August, autumn for September to November, and winter for December to February according to the UK Meteorological Office definitions.

Determining a suitable method for signal decomposition

We have used data from the melatonin dataset to compare the correlations between melatonin amplitude and CARE values derived from three signal decomposition methods: the Fourier transform (FFT), discrete wavelet transform (DWT), and singular spectral analysis (SSA). We found that SSA-derived CARE value was significantly associated with melatonin amplitude (Pearson’s r = 0.46, P = 0.007), but not FFT-derived (Pearson’s r = 0.20, P = 0.26) and DWT-derived CARE values (Pearson’s r = 0.15, P = 0.41). Besides, unlike FFT and DWT, SSA does not require a fixed base function for signal decomposition, which makes it entirely data-driven and flexible to use in practice. Thus, we chose SSA as the decomposition method in our pipeline.

Step 1: Validation of the wearable-derived feature CARE

We firstly calculated the feature CARE using the above-proposed pipeline from accelerometer data. Secondly, using the same pipeline, we also derived the relative energy of behavioral noise signals (sub-signals with a period less than 24 h) from accelerometer activity data (Eq. (4)):

$${\rm{Relative}}\; {\rm{energy}}\; {\rm{of}}\; {\rm{behavioral}}\; {\rm{signals}}=\frac{{{\rm{Energy}}}\;{{\rm{of}}}\;{{\rm{behavioral}}}\;{{\rm{noise}}}\;{{\rm{signal}}}}{{{\rm{Total}}}\;{{\rm{energy}}}\;{{\rm{of}}}\;{{\rm{activity}}}\;{{\rm{signal}}}}=\frac{{\sum }_{u\in U}{{||}{X}_{u}{||}}_{{\rm{F}}}^{2}}{{{||X||}}_{{\rm{F}}}^{2}}$$
(4)

where \({{||}{X}_{u}{||}}_{{\rm{F}}}^{2}\) is Frobenius norm of the uth sub-signal, \(U\) is the subset of indices corresponding to the behavioral noise signal (sub-signals with a period <24 h), and \({{||X||}}_{{\rm{F}}}^{2}\) is Frobenius norm of the raw activity signal. Thirdly, we obtained the melatonin profiles by linear interpolation and curve fitting using SBCF61; Fourthly, the melatonin amplitude was measured by subtraction of the maximum and minimum values of the melatonin secretion curve; and finally, we conducted Pearson correlation analyses between CARE, relative amplitude, and the relative energy of behavioral high-frequency signals with melatonin amplitude. Additionally, we examined the associations between CARE and melatonin amplitude while adjusting for age and sex.

Step 2: Determining the correlation relationship of CARE with multitude of cognitive functions in adolescents and adults

Prior to performing the association analysis, we evaluated the internal stability of CARE values through two distinct analyses. Firstly, we randomly selected a subset of 1,000 individuals each from the adolescent and adult datasets who had at least six days of accelerometer data. We then calculated the CARE values using the first three days of recordings (repetition 1) and the fourth to sixth days of recordings (repetition 2), respectively. An Analysis of Variance (ANOVA) was then performed to compare the intra-subject and inter-subject variability of CARE values. Secondly, we conducted another ANOVA to explore the within-group and between group variability of CARE values in bipolar affective disorder (n = 147), schizophrenia (n = 42), depression (n = 2252), and control groups (n = 36,315) in the adult dataset. Specifically, the control group comprised individuals who did not currently have the above-mentioned psychiatric diseases, did not have a history of mental disease, did not take psychiatric medication, did not receive any professional help for mental distress, and had not suffered from mental distress.

We then performed different regression models to determine the relationship of CARE with multitude of cognitive functions in two datasets. Specifically, in adolescents, because all the three BRIEF indices (i.e., BRI, MI, and GEC) were skewed distributed, median regression analysis was used with adjustment for the demographic variables (i.e., age, sex, parental education level, family income, and primary caregiver); and in adults, linear regression model was performed to assess the associations between CARE and processing/reaction speed, ordinal logistic regression model was employed to investigate the associations between CARE and count outcome variables (reasoning ability, and short-term memory), while logistic regression model was conducted to test the correlation between CARE and the binary cognitive measures of prospective memory. Similarly, models were adjusted for age, sex, ethnicity, Townsend score, and the season when the participant started wearing the accelerometer. To maximize sample sizes, available-case analyses were employed, and sample sizes for each model were reported. Moreover, the analyses mentioned above were also performed on relative amplitude for the purpose of comparison.

All analyses were performed in R 4.1.2, and the threshold for significance was set at P < 0.05/3 = 0.017 (3 cognitive outcomes) and P < 0.05/4 = 0.013 (4 cognitive outcomes) to correct for multiple testing according to Bonferroni’s approach in adolescent and adult dataset analyses, respectively.

Step 3: Identifying the causal relationship between CARE and cognitive functions in the adult dataset

Identification of loci associated with CARE

Before the genetic analysis, we excluded SNPs with Minor Allele Frequency <0.1%, Hardy–Weinberg equilibrium <\(1\times {10}^{-10}\), and imputation quality score (UKBB information score) <0.8. Furthermore, for related individuals (first cousins or closer), one individual from each pair of related individuals was randomly selected for inclusion in the analysis, and non-white individuals were also removed from the analysis. After data filtering, a total of 85,361 samples remained for further analysis. To identify genetic variants associated with CARE, we performed a GWAS analysis using PLINK64. Linear regression models were adjusted by age, sex, and the first 20 genetic principal components, and the genome-wide significance level was set at 5 × 10−8. To estimate SNP heritability (h2SNP), the Linkage Disequilibrium Score Regression (LDSR) was applied to the GWAS summary statistics of CARE with the LDSC tool65,66. Furthermore, we identified one significant locus that was associated with the relative amplitude; however, only three SNPs were found, which was just too little that we could not conduct the further causal analysis.

Causal relationship between CARE and cognitive functions

To investigate the causal relationship between CARE and cognitive functions, we conducted MR analyses using the R package “MendelianRandomization”67. In order to exclude instrumental variables with weak power, we pruned CARE-related SNPs and finally determined 109 SNPS as instrument variables. To avoid inflation of associations caused by the sample overlap, genetic effect sizes for reasoning ability, short-term and prospective memory were obtained from GWAS analysis after excluding all individuals used in the GWAS analysis of CARE (n = 360,885). In the MR analyses, the weighted median approach68 was used as the main analysis due to its robustness to pleiotropy69. Furthermore, we applied inverse variance weighted (IVW)41, MR-Lasso42, the mode-based estimate (MBE)43, and constrained maximum likelihood-based method (MR-cML)44 to perform sensitivity analyses. The beta coefficients were reported and Bonferroni-adjusted P values <0.017 (3 outcomes) were considered evidence of associations.

Single-tissue and cross-tissue transcriptome-wide association analysis

In addition, we examined the biological and functional mechanisms underlying CARE by performing transcriptome-wide analysis in 44 tissue types in GTEx v7. We identified single-tissue and cross-tissue gene-CARE associations based on GWAS summary statistics and expression quantitative trait loci (eQTL) information, using UTMOST (Unified Test for MOlecular SignaTures)70. The Bonferroni-corrected threshold for significance in UTMOST is 2.9 × 10−6 for 17,290 genes.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.