Abstract
The hallmark of bipolar disorder is a clinical course of recurrent manic and depressive symptoms of varying severity and duration. Mathematical modeling of bipolar disorder holds the promise of an ability to personalize diagnoses, to predict future mood episodes, to directly compare diverse datasets, and to link basic mechanisms to behavioral data. Several modeling frameworks have been proposed for bipolar disorder, which represent competing hypotheses about the basic framework of the disorder. Here, we test these hypotheses with self-report assessments of mania and depression symptoms from 178 bipolar patients followed prospectively for 4 or more years. Statistical analysis of the data did not support the hypotheses that mood arises from a rhythmic process or multiple stable states (e.g., mania or depression) or that manic and depressive symptoms are highly anticorrelated. Alternatively, it is shown that bipolar disorder could arise from an inability for mood to quickly return to normal when perturbed. This latter concept is embodied by an affective instability model that can be personalized to the clinical course of any individual with chronic disorders that have an affective component.
Introduction
Bipolar disorder (BP) is a chronic illness of recurrent episodes of mania and depression, affecting 2.4% of the adult population^{1,2}. This disorder is classified according to diagnostic criteria in the Diagnostic and Statistical Manual 5th Edition (DSM5)^{3}, which are largely based on expert consensus through empirical clinical observations. These criteria lose information on subsyndromal symptoms and the dynamic nature of the illness beyond simple observations of episodic pattern. There is much interest in further quantifying BP through mathematical modeling.
Several models have recently been proposed for BP. They are built upon the following hypotheses:
Bipolar is intrinsically rhythmic: A periodic assumption postulates mood is driven by an internal timekeeping mechanism. As a result, mood cycles through mania and depressive episodes rhythmically or periodically. Periodic models are readily available for BP^{4,5,6,7,8,9}.
Bipolar is multistable: A multistable assumption argues mood in BP tends to distinct mood states (e.g., mania and depression) which sustain mood at severe levels. Multistability is captured mathematically with stable points or attractors in dynamical systems^{7,8,10,11,12}.
Bipolar is one-dimensional: A one-dimensional assumption supposes mood is a spectrum with mania and depression on opposite ends (i.e. the two “poles” in bipolar). Manic or depressive symptoms arise only in the absence of the other. Many models of mood in BP use a one-dimensional assumption^{4,5,9,11,13}.
Testing these hypotheses could have major implications for the study of BP. The multistable hypothesis suggests that environmental or internal perturbations, “stress” or “noise,” are what trigger a mood episode. Thus, if these could be minimized, mood would stay in its current state indefinitely. Likewise, the timing of the next episode and its duration are fundamentally unpredictable. The rhythmic hypothesis suggests the opposite: that mood transitions could occur independently of these perturbations and that mood episodes are fundamentally predictable and have a characteristic duration. A third hypothesis is that mood episodes are triggered by these perturbations, but that these mood states are not “sustaining” or that there is an effective maximal time by which the mood episode would ultimately end, perhaps because of an exponential return to a baseline mood.
The one-dimensional hypothesis suggests that any increase in manic symptoms means a decrease in depressive symptoms. Even if a model allows for two separate variables, it could “attract” to a state where this relationship between manic and depressive symptoms is true. An alternative is to assume manic and depressive scores are independent. A middle ground exists where mania and depression are “correlated” (positively or negatively).
In this study, we aimed to (i) formally test the validity of these hypotheses at the patient level, and subsequently (ii) establish a mathematical framework for clinical course in BP. In what follows, patient-level statistical tests are combined across patients to examine collective evidence of each hypothesis. In the process, we reveal evidence of a new model that describes mood course in BP as arising from extreme instability in manic and depressive symptoms. We then show how this model can be personalized to the mood course of any individual with a chronic condition wherein mood and affective symptoms are present (not just bipolar patients), thereby providing a quantitative phenotype to study biological mechanisms of disorders that manifest, at least in part, with affective symptoms.
Materials and methods
Data
The primary dataset, the bimonthly dataset, was collected from 178 BP individuals followed prospectively for at least 4 years in the Prechter Longitudinal Study of Bipolar Disorder at the University of Michigan^{14}. The Altman Self-Rating Mania Scale (ASRM)^{15} and the Patient Health Questionnaire for Depression (PHQ-9)^{16} were completed at 2-month intervals. On average across the individuals, 3.5% of PHQ-9 scores were missing and 0.55% of ASRM scores were missing. Of the 178 individuals, 138 were BPI, 12 were BP not otherwise specified, and 28 were BPII; 134 were female; 153 individuals were white, 6 black or African-American, 2 Asian, 9 more than one race, and 8 patients of unknown race; 166 individuals were not Hispanic, 5 Hispanic, and 7 of unknown ethnicity. Individuals were on average 42.3 ± 12.2 (±standard deviation [SD]) years of age at the initial interview for the Prechter study. The UM IRB approved recruitment, assessment, and research procedures (HUM606).
A second dataset, the weekly dataset, was included for two statistical tests that depend on how often mood was sampled. This dataset was collected on BP individuals (N = 15) from the Prechter Study with at least 24 survey scores on the Young Mania Rating Scale (YMRS)^{17} and the Structured Interview Guide for the Hamilton Depression Rating Scale (SIGH-D)^{18}, administered weekly by trained interviewers. On average across the individuals, 9.2% of SIGH-D scores were missing and 9.7% of YMRS scores were missing. Of these patients, 9 were BPI and 6 were BPII; 12 were female; 9 were white, 1 Asian, 4 black or African-American, and 1 of unknown ethnicity; 15 were not Hispanic. They were on average 39.9 ± 9.4 (±SD) years of age at the initial interview. Ten patients had both bimonthly and weekly data.
Statistical approach
Because the hypotheses of interest are about BP at the patient level, they should be tested with patient-level data. Since patient-level data are limited and highly variable, patient-level tests may not have sufficient statistical power to reject hypotheses. Testing hypotheses on aggregated data across patients, however, can lead to wrong conclusions about patient-level trends. To overcome these limitations, we perform statistical tests on patient-level data, but then aggregate statistics and/or test results across patients. Significance was considered an alpha level of 0.05, and the analysis was performed in Matlab (Mathworks; Natick, MA) unless otherwise specified.
Patient-level statistics
Testing a one-dimensional hypothesis
Kendall rank correlation was measured between concurrent depressive and manic symptoms (bimonthly dataset). Rank correlation measures the degree to which an increase in depressive symptoms is accompanied by an increase in mania and similarly, an increase in manic symptoms is accompanied by an increase in depression. Because a one-dimensional hypothesis ignores mixed states, we also compared risk for mania (ASRM score ≥ 6) while depressed (PHQ-9 score ≥ 10) versus not depressed (PHQ-9 score < 10) (bimonthly dataset). We ignored ill-defined rank correlations (an individual’s survey scores are all identical) and ill-defined risk values (an individual was either never depressed or never not depressed).
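The two patient-level statistics above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function names are hypothetical, and the tie-adjusted tau-b variant of Kendall's correlation is an assumption, since the text does not say which variant was used. Missing scores are represented as None and skipped.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for paired scores; None when the correlation is ill-defined."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both scales: excluded from both denominator terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_y_only) * (untied + tied_x_only))
    if denom == 0:
        return None  # all scores identical on at least one scale
    return (concordant - discordant) / denom


def mania_risk(asrm, phq9, mania_cut=6, depression_cut=10):
    """Return (risk while depressed, risk while not depressed), or None if ill-defined."""
    pairs = [(a, p) for a, p in zip(asrm, phq9) if a is not None and p is not None]
    while_depressed = [a >= mania_cut for a, p in pairs if p >= depression_cut]
    while_not = [a >= mania_cut for a, p in pairs if p < depression_cut]
    if not while_depressed or not while_not:
        return None  # never depressed, or never not depressed
    return (sum(while_depressed) / len(while_depressed),
            sum(while_not) / len(while_not))
```

Both functions return None in the ill-defined cases that the analysis excludes, so a caller can filter those individuals out before averaging across patients.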
Testing a rhythmic hypothesis
An individual’s scores were transformed to the frequency domain using nonparametric spectral estimation based on Thomson’s multitaper approach^{19}. We then looked for significant oscillation frequencies using Thomson’s harmonic F test, from which we could recover P-values for each frequency (Null hypothesis i: An individual’s manic or depressive scores do not oscillate at a specified frequency; bimonthly and weekly datasets). Thirty frequencies were tested, equally spaced between 1/24 and 1/2 of the sampling frequency, i.e., between 1/4 and 3 cycles per year for the bimonthly dataset and 1/24 and 1/2 cycles per week for the weekly dataset. Spectral estimation was performed in R (R Foundation for Statistical Software, Vienna, Austria) using the multitaper package with seven Slepian tapers, a time bandwidth of four, adaptive estimation, and removing the estimated mean of the time series using Slepian tapers. This analysis required contiguous data, so the R function na.approx (found in the zoo package) filled in missing data via an interpolation method.
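The gap-filling step can be illustrated with a short sketch. It assumes linear interpolation between the nearest observed neighbours; the text only says "an interpolation method", so this choice and the function name interpolate_missing are assumptions. It does not reproduce the multitaper estimation or the harmonic F test.

```python
def interpolate_missing(scores):
    """Linearly interpolate interior None values.

    Leading and trailing gaps have no neighbour on one side and are left as None.
    """
    filled = list(scores)
    known = [i for i, value in enumerate(scores) if value is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            # Weighted average of the two observed neighbours.
            filled[i] = (scores[left] * (right - i) + scores[right] * (i - left)) / gap
    return filled
```

For example, interpolate_missing([2, None, None, 8]) fills the two interior gaps with 4.0 and 6.0.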
Testing a multistable assumption
To test for multiple states, we looked for survey scores with probability density functions that had multiple modes, i.e., particular mood scores that are relatively more common than nearby scores^{20}. A Hartigan’s dip test was used to measure deviation from unimodality^{21} (null hypothesis ii: an individual’s manic or depressive scores have a unimodal distribution; bimonthly dataset). Since a Hartigan’s dip test requires continuous data and survey scores are discrete, we added a uniform random variable to each score and applied the dip test to these modified scores.
Testing our affective instability model
We tested the validity of our model by evaluating whether it fits survey scores significantly worse than data sampled from the model, i.e. whether we can distinguish between actual data and data simulated from the model. This approach does not favor model complexity: even if a complex model fits the data better, it is not guaranteed that the model fits the data better than simulated data. This testing relied on two types of goodness-of-fit tests, following a method to validate models for financial data described by Aït-Sahalia et al.^{22} and Fan^{23} (see Supplementary Appendix for details). The first goodness-of-fit test evaluated how well certain probability density functions fit survey scores (Null hypothesis iii: an individual’s manic or depressive scores are drawn from a specified probability density function; bimonthly dataset). The second goodness-of-fit test evaluated how well certain transition distributions fit sequences of mood scores (Null hypothesis iv: an individual’s sequence of manic or depressive scores is drawn from a specified transition distribution; weekly dataset). For the latter test, the weekly dataset was used, since bimonthly scores were not sampled frequently enough to accurately estimate parameters. We measured goodness-of-fit to the density function and transition distribution defined by our model. To show these tests have enough statistical power to reject models, we also tested two common stochastic differential equation models: an Ornstein-Uhlenbeck process and a Cox-Ingersoll-Ross process^{24}. Since our model is unistable under no noise, these tests also serve to further test multistability.
Population-level statistics
Statistical tests were aggregated to examine collective evidence of each patient-level null hypothesis. For tests with P-values, we followed Loughin^{25}, calculating a mean P-value across patients for each test and testing for significance by comparing it to the mean P-value of the same number of independent uniform random variables (Null hypothesis I: the null hypothesis of a particular test holds across patients, and tests are independent between patients; bimonthly and weekly datasets). Under this approach, a hypothesis would not be rejected with one P-value of 0.2, but would be rejected with one hundred P-values of 0.2.
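The mean P-value comparison can be sketched as follows. Under population-level null hypothesis I, the sum of n independent uniform P-values follows the Irwin-Hall distribution, so the combined P-value is that distribution's lower tail at the observed sum. This is one way to carry out the comparison described above and is not necessarily the computation the authors used; the function names are hypothetical. Exact rational arithmetic avoids the cancellation errors that the alternating sum suffers in floating point when n is large.

```python
from fractions import Fraction
from math import comb, factorial, floor


def irwin_hall_cdf(s, n):
    """P(U_1 + ... + U_n <= s) for independent Uniform(0, 1) variables."""
    s = Fraction(s)
    if s <= 0:
        return Fraction(0)
    if s >= n:
        return Fraction(1)
    total = sum((-1) ** k * comb(n, k) * (s - k) ** n for k in range(floor(s) + 1))
    return total / factorial(n)


def mean_p_test(p_values):
    """Combined P-value: probability that n uniform P-values have a mean this low."""
    n = len(p_values)
    observed_sum = sum(Fraction(p) for p in p_values)
    return float(irwin_hall_cdf(observed_sum, n))
```

With a single P-value of 0.2 the combined P-value is 0.2 and the hypothesis is not rejected at the 0.05 level, whereas one hundred P-values of 0.2 give a combined P-value far below 0.05, matching the behaviour described above.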
Since multiple frequencies are tested (leading to a higher chance for Type I errors) and mood could oscillate at different frequencies between patients, we also calculated the minimum of the P-values across the 30 frequencies for each individual. These minima were averaged and compared to a similar statistic assuming P-values across individuals and frequencies were independent uniform random variables (Null hypothesis II: the null hypothesis of Thomson’s F test holds across patients and frequencies, and tests are independent between frequencies and patients; bimonthly and weekly datasets).
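A Monte Carlo sketch of this minimum-P statistic is given below. The text does not say how the reference distribution was obtained, so the simulation, the number of replicates, the seed, and the function name min_p_test are all assumptions made for illustration.

```python
import random


def min_p_test(p_matrix, n_sim=2000, seed=0):
    """Compare the mean of per-individual minimum P-values with its null distribution.

    p_matrix holds one row per individual and one column per tested frequency.
    Under null hypothesis II every entry is an independent Uniform(0, 1) draw.
    """
    rng = random.Random(seed)
    n_individuals = len(p_matrix)
    n_frequencies = len(p_matrix[0])
    observed = sum(min(row) for row in p_matrix) / n_individuals
    as_low = 0
    for _ in range(n_sim):
        simulated = sum(
            min(rng.random() for _ in range(n_frequencies))
            for _ in range(n_individuals)
        ) / n_individuals
        if simulated <= observed:
            as_low += 1
    # Add-one correction keeps the estimated P-value strictly positive.
    return (as_low + 1) / (n_sim + 1)
```

Taking the minimum before averaging lets each individual contribute evidence at a different frequency, while the simulated reference accounts for the fact that the smallest of 30 uniform P-values is small by chance alone.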
To clearly associate a population-level statistical test to a reported P-value, we use scalars \(\hat P_i\), \(\hat P_{ii}\), \(\hat P_{iii},\,\hat P_{iv}\) to denote P-values recovered under the population-level null hypothesis I, which in turn are associated with null hypotheses i, ii, iii, or iv. In addition, \(\tilde P_i\) denotes the P-value recovered under population-level null hypothesis II.
Parametric study
To analyze our affective instability model, we estimated mean duration of mood episodes and percent time in a mood state for certain parameters. To remove the dependence of these estimates on initial mood values and random noise, we followed a common strategy in stochastic simulation by simulating the affective instability model for a sufficiently long period of time with a suitable warm-up or burn-in period^{26}. With mood episodes lasting on a scale of weeks to months, we chose to simulate the model for 1100 simulated life-years with initial manic/depressive mood values of 0.1 and warm-up period of 100 years, storing daily samples of mood for only the last 1000 simulated life-years. We are in no way assuming that individuals live for 1000 years, but use this value simply to remove any dependence on initial conditions and random noise.
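The warm-up strategy can be illustrated with a short simulation. The sketch below does not implement the affective instability model. As a stand-in it uses an Ornstein-Uhlenbeck process, one of the comparison models named earlier, integrated with the Euler-Maruyama scheme at a one-day step. The function name, parameter names, and parameter values are hypothetical.

```python
import random
from math import sqrt

DAYS_PER_YEAR = 365


def simulate_ou_daily(rate, baseline, noise, total_years, warmup_years, x0=0.1, seed=0):
    """Euler-Maruyama simulation of dX = rate * (baseline - X) dt + noise dW.

    One step is one day. Only samples recorded after the warm-up period are returned,
    so the output does not depend noticeably on the initial value x0.
    """
    rng = random.Random(seed)
    dt = 1.0  # days
    x = x0
    kept = []
    warmup_steps = warmup_years * DAYS_PER_YEAR
    for step in range(total_years * DAYS_PER_YEAR):
        x += rate * (baseline - x) * dt + noise * sqrt(dt) * rng.gauss(0.0, 1.0)
        if step >= warmup_steps:
            kept.append(x)
    return kept


# Stand-in parameters, chosen only for illustration.
samples = simulate_ou_daily(rate=0.1, baseline=2.0, noise=0.3,
                            total_years=30, warmup_years=10)
```

Setting total_years=1100 and warmup_years=100 would mirror the simulation lengths quoted above, at the cost of a longer run.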
Because mood is continuous in the affective instability model, threshold values of mood were chosen to separate mood into states. Because mood is also dimensionless, any threshold value can be used and gains an interpretation when compared to other parameters and when mood is scaled to match a survey. So, we chose a threshold value of 3 to define mood states:

Euthymia: manic and depressive variables less than 3;

Mania: manic variable greater than 3 and a depressive variable less than 3;

Depression: manic variable less than 3 and a depressive variable greater than 3; and

Mixed State: manic and depressive variables greater than 3.
Manic, depressive, and mixed episodes were defined as periods in the corresponding mood state lasting at least a week to agree with current DSM guidelines; euthymic episodes were periods in between mood episodes.
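The state and episode definitions above can be sketched in code as follows. The function names are hypothetical, and treating a value exactly equal to the threshold as not elevated is an assumption, since the definitions only cover values strictly below or above 3. Runs in a non-euthymic state that last less than a week are simply dropped here rather than merged into the surrounding euthymic period.

```python
from itertools import groupby

THRESHOLD = 3.0
MIN_EPISODE_DAYS = 7


def mood_state(manic, depressive, threshold=THRESHOLD):
    """Classify one daily sample of the manic and depressive variables."""
    high_manic = manic > threshold
    high_depressive = depressive > threshold
    if high_manic and high_depressive:
        return "mixed"
    if high_manic:
        return "mania"
    if high_depressive:
        return "depression"
    return "euthymia"


def find_episodes(daily_manic, daily_depressive, min_days=MIN_EPISODE_DAYS):
    """Return (state, length in days) for each mood episode of at least min_days."""
    states = [mood_state(m, d) for m, d in zip(daily_manic, daily_depressive)]
    runs = [(state, len(list(group))) for state, group in groupby(states)]
    return [(state, days) for state, days in runs
            if state != "euthymia" and days >= min_days]


def mean_duration(episodes, state):
    """Mean length in days of episodes in the given state, or None if there are none."""
    lengths = [days for s, days in episodes if s == state]
    return sum(lengths) / len(lengths) if lengths else None
```

Applied to daily samples from a long simulation, find_episodes and mean_duration yield the mean episode durations, and counting the states directly gives the percent time in each mood state.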
Results
Testing conventional hypotheses
One-dimensional
A one-dimensional hypothesis would mean that individuals are never both manic and depressed (Fig. 1a, b). However, on average, individuals in our dataset have a 15% risk for mania (ASRM score ≥ 6) while depressed (PHQ-9 score ≥ 10) (Fig. 1c). Interestingly, an individual has only a 23% risk for mania while not depressed. The significant risk for mania while depressed goes against the one-dimensional hypothesis.
To test this further, because a one-dimensional hypothesis requires that high manic scores would occur only with low depressive scores and high depressive scores occur only with low manic scores, a parametric plot of manic and depressive survey scores would appear near a one-dimensional curve in each individual (Fig. 1b). We measured Kendall’s rank correlation between depressive and manic scores for each individual (Fig. 1d) to tell if mania and depression are negatively correlated (i.e. high manic scores are found with low depressive scores and vice versa), or positively correlated, which would lead to mixed episodes. The rank correlation was on average −0.13 across individuals and ranged from −0.60 to 0.75. While some individuals (e.g., Person 1 in Fig. 1b) showed a negative correlation close to −1, this was not typical. Positive correlation for some individuals further contradicts both the idea that mania and depression are mutually exclusive and the one-dimensional hypothesis.
Rhythmic
Under a rhythmic assumption, an individual’s mood scores transformed to a frequency domain peak around a particular frequency (Fig. 2a). We tested frequencies for significance and combined these tests across individuals into mean P-values, leading to equivocal evidence of rhythmicity (Fig. 2b). Near the fundamental frequencies (the reciprocal of the observation period), mean P-values were low enough across patients to be significant for both datasets. However, the fundamental frequency should be significant, since spectral estimation requires repeating the data every observation period. For the weekly dataset, mean P-values were not low enough to suggest mood oscillates at any other frequencies.
For the bimonthly data, mean P-values were low enough across patients to suggest manic symptoms oscillate at 1.13 and/or 1.78 cycles per year (\(\hat P_i\) = 0.028 and 0.020), but not depressive symptoms (\(\hat P_i\) = 0.47 and 0.63), and to suggest depressive symptoms oscillate at 2.53 and/or 2.72 cycles per year (\(\hat P_i\) = 0.012 and 0.017), but not manic symptoms (\(\hat P_i\) = 0.12 and 0.34). Manic and depressive symptoms might thus oscillate at different frequencies that may not correspond to any biological (e.g. 4 week periods) or seasonal oscillation (e.g. one or six month periods). However, we tested 30 frequencies for significance, which increases the possibility of falsely concluding a frequency is significant when it is not. When we correct for testing multiple frequencies and allow mood to oscillate at different frequencies between individuals, we find insufficient evidence to suggest that manic or depressive symptoms oscillate in the weekly or bimonthly datasets (\(\tilde P_i\) > 0.19). In sum, it is unclear from our datasets whether manic and depressive symptoms oscillate.
Multistable
Under a multistable assumption, an individual has multiple mood values that are relatively more stable than nearby mood values and hence would spend relatively more time around these values than nearby values. This feature would manifest as multiple modes in the probability distribution of an individual’s mood, where each mode would represent a mood value relatively more stable than nearby moods (Fig. 3a). So, if we determined that the data allowed us to reject a unimodal distribution, then we could conclude that a multistable model was appropriate. Combining tests across patients, we could not reject a unimodal distribution for bimonthly manic or depressive symptoms (\(\hat P_{ii}\) > 0.50; Fig. 3b).
An alternative hypothesis for mood in BP
Not being able to reject unimodality, we questioned if a model with only one stable point (in the absence of noise) could explain the mood data. With goodness-of-fit tests, we determined if we could distinguish between actual mood data and data sampled from three unistable models: our affective instability model described below and two popular models from finance, an Ornstein-Uhlenbeck model and a Cox-Ingersoll-Ross model. On average, bimonthly manic symptoms fit the density function associated with our affective instability model as well as 49% of sampled data, and bimonthly depressive symptoms fit the density function associated with our model as well as 44% of sampled data (Fig. 4). As a result, we did not find a significant difference between bimonthly manic symptoms and data simulated from our model (\(\hat P_{iii}\) = 0.29). Even though the model did as well as 44% of sampled data, we could find a significant difference between bimonthly depressive scores and data sampled from our model (\(\hat P_{iii}\) = 0.003). However, agreement between model and data was much higher for the affective instability model than for density functions associated with the other two models, where we could find significant differences in all cases (\(\hat P_{iii}\) < 1e−9).
We also tested if sequences of weekly manic and depressive symptoms could fit transition distributions (i.e. the distribution function for a mood value at some point in time conditional on a preceding value) associated with each model as well as simulated data. On average, weekly manic symptoms fit the transition distribution associated with our model as well as 49% of sampled data, and weekly depressive symptoms fit the transition distribution associated with our model as well as 46% of sampled data (Fig. 4). Hence, we did not find a significant difference between sequences of weekly manic symptoms and sequences sampled from our model (\(\hat P_{iv}\) = 0.46) or between sequences of weekly depressive scores and sequences sampled from our model (\(\hat P_{iv}\) = 0.31). Agreement between model and data was also found for the other two models in the case of sequences of weekly depressive scores (\(\hat P_{iv} > 0.12\)) and for the Cox-Ingersoll-Ross model in the case of sequences of weekly manic scores (\(\hat P_{iv} = 0.22\)), but not for the Ornstein-Uhlenbeck model in the case of sequences of weekly manic scores (\(\hat P_{iv} =\) 7e−4). Overall, the affective instability model best explained the data.
Note that our affective instability model has only one stable state, and so, its density function and transition distributions are both unimodal. For a model with multiple stable states, an individual’s mood conditional on a particular starting value would spend more time around certain mood values relative to nearby scores (similar to its density function), which manifests as multiple modes in its transition distributions. Not being able to reject specific unimodal density functions and unimodal transition distributions for our datasets provides further evidence against a multistable assumption.
An affective instability model
Motivated by the empirical results, we describe mood as a twodimensional random process represented by a depressive variable D_{ t } and manic variable M_{ t }. These variables take positive values ranging from zero for no symptoms, to a larger number for minor symptoms, to an even higher number for severe symptoms; and satisfy the stochastic differential equation (SDE) model:
where \(a_d,a_m,b_d,b_m\) are positive parameters, \(\rho \in [-1,1]\), and dV_{ t } and dW_{ t } are independent Wiener processes. A Wiener process, also known as Brownian motion, is the most standard way to mathematically model a continuous time course that is noisy. Lastly, mood variables are related to particular survey scores through scaling: s_{ d }D_{ t } and s_{ m }M_{ t }, for positive parameters s_{ m } and s_{ d }.
From the model, we can see that mood is inclined towards a baseline/normal level, represented by a single asymptotically stable point \((\sqrt {b_d} ,\sqrt {b_m} )\) in the absence of noise (Fig. 5a), but reaches pathological levels in certain individuals for one of two reasons. Either the baseline level is simply too close to pathological levels, so that even small fluctuations can bring mood into pathological levels. Or, mood is particularly sensitive and reactive to (random/unobserved) events, e.g. stressful life events such as job loss. Simply put, clinical course in BP arises from a weaker ability to keep mood within normal ranges. Note that the model does not impose any boundaries between mood states, such as in a multistable model (Fig. 5a). Without a natural boundary, subthreshold symptoms, i.e. those insufficient in number or criteria to constitute a mood episode, have increased importance. As in actual individuals with BP, subthreshold symptoms in the model persist even when DSM mood episodes have long subsided, and persons can spend upwards of half their time with subthreshold symptoms^{27}.
The model can be personalized to an individual through parameter choices. Parameters b_{ d } and b_{ m } influence the overall severity of mood (Fig. 5b). Higher values of b_{ d } capture an individual who spends more time with depressive symptoms and has longer depressive episodes. Higher values of b_{ m } have an analogous effect on manic symptoms. Parameters a_{ d } and a_{ m } control the speed at which symptoms fluctuate (Fig. 5c), where higher values of a_{ d } or a_{ m } lead to shorter mood episodes to reflect a rapid cycler. The parameter a_{ d } is dimensional compared to the current definition of rapid cycling, which as a categorical variable may impose an artificial boundary between patients^{28,29,30}. Parameter ρ controls the prevalence of mixed states, ranging from ρ = −1 to 1 for manic symptoms that oppose or cooperate with depressive symptoms, respectively. ρ = 0 would indicate that manic and depressive symptoms are not correlated. Lastly, s_{ d } and s_{ m } determine how mood translates into external observations, capturing, for example, variation in how mood is measured between surveys, clinicians, and/or demographic groups (gender, culture).
Although we focus on BP, there is no assumption in the model that excludes it from describing anyone’s mood. By adjusting parameters, we can theoretically capture mood courses that describe not only bipolar I, bipolar II and rapidcycling, but also disorders such as major depression (low b_{ m } and high b_{ d }) as well as healthy individuals (Fig. 5d). Major mood disorders can thus be conceptualized with boundaries that are more fluid than those recognized in the DSM.
Discussion
Mathematical modeling provides an important framework for understanding human behavior. Borbély’s two-process model has formed the basis for how many researchers reason about sleep-wake dynamics^{31}, and Daan’s evening-morning oscillator model likewise has led to an important understanding of how behavior can be consolidated in a 24-h day^{32}. Both papers remain highly cited, and many similar high-level models are used to understand physiological processes. Based on the success of these previous models, a framework for mood dynamics in BP could be impactful for the field.
A major challenge in the study of BP is how to properly classify patients, which has led to much debate about the DSM5. A recent study has shown that bipolar I patients can be fit to a mathematical model^{33}, and that the parameters of these fits can be classified into three groups with validated clinical outcomes. Our model has only seven parameters, which can easily be identified from patient data. As it is a framework that is validated against data, it should provide an important diagnostic tool for bipolar subjects.
We examined hypotheses about mood course in a dataset of manic and depressive surveys from 178 BP subjects followed for a minimum of four years. Patient-level statistics were combined to formally test these hypotheses at the patient level. This analysis finds little evidence in our data of one-dimensionality, rhythmicity, or multistability, but finds support for an affective instability model. Data with higher temporal resolution, longer sampling periods, and additional patients could provide additional testing of modeling hypotheses. For example, one could test the possibility that mood should be modeled differently during different pharmacological treatments, or as the disease progresses. The bimonthly dataset was self-reported, and hence, subject to reporting bias. The weekly dataset, however, was based on trained interviewer assessments and reinforced conclusions from the bimonthly dataset. Our model also assumes mood is Markovian, meaning that the current mood state is all that is needed to predict future mood, and that parameters do not vary with time. Future research might also incorporate events, such as a job loss, into the model as was done by Steinacher and Wright^{11}.
Three additional modeling frameworks have also been proposed, but further testing of them does not appear to be needed. Mood was proposed to be generated by a mathematical chaotic system^{34}, which is difficult to distinguish from randomness with discrete samples^{35}. Nevertheless, previous testing of this hypothesis has questioned the role of chaos in mood disorders^{36,37,38}. Other models are built on just the rates of transitions between mood states rather than on mood on a dimensional scale. Although they do not fulfill the modeling goals of this manuscript, such models are useful for classifying patients^{33}. “Kindling” models of bipolar disorder have also been proposed, but the kindling hypothesis also has come into question^{39}.
The affective instability model provides an alternative hypothesis for how biological processes could drive mood in BP, namely, manic and depressive symptoms could be driven by a twodimensional process that is weakly regulated. Mania and depression are separately regulated, but may respond to some similar unpredictable or random inputs or environmental factors. Other random models of mood in BP have been presented where time and mood can be discrete or continuous^{4,11,13,20,33,36,40,41,42}. A continuoustime continuousstate model, such as the one introduced here and those presented in Bonsall et al.^{4} and Steinacher and Wright^{11} permits data at regular or irregular time intervals and from different surveys. Random models, as the foundation of statistical inference, are relatively easy to personalize and validate with data, whereas inference is less common for chaotic models.
A major benefit of the affective instability model is that it provides a quantitative and dimensional phenotype for studying BP. Not only is mood characterized on both a manic and depressive scale, but each of the model parameters could be viewed as a characterization of an individual’s illness. Dimensional constructs that capture both pathological and nonpathological behavior are emphasized in RDoC from the National Institute of Mental Health^{43}, since current classification categories are believed to impose artificial boundaries between individuals. Future research could use the affective instability model by customizing model parameters to any individual’s behavior and then utilizing these parameters to explain variation in behavior between subjects. The resulting seven parameters also provide a meaningful way for a clinician to assess clinical courses, e.g. assess the tendency for mixed episodes. With roles in BP’s pathophysiology^{1}, the serotonergic and dopaminergic systems are potential candidates for the biological processes that could lead to an affective instability model, providing an important potential link to physiology.
References
 1.
Goodwin, F., Jamison, K. & Ghaemi, S. Manic-depressive illness: bipolar disorders and recurrent depression. 2nd ed, (Oxford University Press, New York, NY, 2007).
 2.
Merikangas, K. et al. Prevalence and correlates of bipolar spectrum disorder in the world mental health survey initiative. Arch. Gen. Psychiatry 68, 241–251 (2011).
 3.
American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders 5th edn. (Arlington, VA, 2013).
 4.
Bonsall, M., Geddes, J., Goodwin, G. & Holmes, E. Bipolar disorder dynamics: affective instabilities, relaxation oscillations and noise. J. R. Soc. Interface 12, 20150670 (2015).
 5.
Daugherty, D. et al. Mathematical models of bipolar disorder. Commun. Nonlinear Sci. 14, 2897–2908 (2009).
 6.
Frank, F. A limit cycle oscillator model for cycling mood variations of bipolar disorder patients derived from cellular biochemical reaction equations. Commun. Nonlinear Sci. 18, 2107–2119 (2013).
 7.
Goldbeter, A. A model for the dynamics of bipolar disorders. Prog. Biophys. Mol. Bio. 105, 119–127 (2011).
 8.
Goldbeter, A. Origin of cyclicity in bipolar disorders: a computational approach. Pharmacopsychiatry 46, S44–S52 (2013).
 9.
Nana, L. Bifurcation analysis of parametrically excited bipolar disorder model. Commun. Nonlinear Sci. 14, 351–360 (2009).
 10.
Hadaeghi, F., Golpayegani, M. & Murray, G. Towards a complex system understanding of bipolar disorder: a map based model of a complex winnerless competition. J. Theor. Biol. 376, 74–81 (2015).
 11.
Steinacher, A. & Wright, K. Relating the bipolar spectrum to dysregulation of behavioural activation: a perspective from dynamical modelling. PLoS ONE 8, e63345 (2013).
 12.
Bystritsky, A., Nierenberg, A., Feusner, J. & Rabinovich, M. Computational nonlinear dynamical psychiatry: a new methodological paradigm for diagnosis and course of illness. J. Psychiatr. Res. 46, 428–435 (2012).
 13.
Bonsall, M., Wallace-Hadrill, S., Geddes, J., Goodwin, G. & Holmes, E. Nonlinear time-series approaches in characterizing mood stability and mood instability in bipolar disorder. Proc. Biol. Sci. 279, 916–924 (2012).
 14.
Langenecker, S., Saunders, E., Kade, A., Ransom, M. & McInnis, M. Intermediate: cognitive phenotypes in bipolar disorder. J. Affect. Disord. 122, 285–293 (2010).
 15.
Altman, E., Hedeker, D., Peterson, J. & Davis, J. The Altman Self-Rating Mania Scale. Biol. Psychiatry 42, 948–955 (1997).
 16.
Kroenke, K., Spitzer, R. & Williams, J. The PHQ-9: validity of a brief depression severity measure. J. Gen. Intern. Med. 16, 606–613 (2001).
 17.
Young, R., Biggs, J., Ziegler, V. & Meyer, D. A rating scale for mania: reliability, validity and sensitivity. Br. J. Psychiatry 133, 429–435 (1978).
 18.
Hamilton, M. A rating scale for depression. J. Neurol. Neurosurg. Psychiatry 23, 56–62 (1960).
 19.
Thomson, D. Spectrum estimation and harmonic analysis. Proc. IEEE 70, 1055–1096 (1982).
 20.
Cochran, A. L., Schultz, A., McInnis, M. & Forger, D. A comparison of mathematical models of mood in bipolar disorder. In Computational Neurology and Psychiatry (eds Érdi, P., Bhattacharya, B. S. & Cochran, A. L.) 315–341 (Springer International Publishing, 2017).
 21.
Hartigan, J. & Hartigan, P. The Dip Test of Unimodality. Ann. Stat. 13, 70–84 (1985).
 22.
Aït-Sahalia, Y., Fan, J. & Peng, H. Nonparametric transition-based tests for jump diffusions. J. Am. Stat. Assoc. 104, 1102–1116 (2009).
 23.
Fan, J. A selective overview of nonparametric methods in financial econometrics. Stat. Sci. 20, 317–337 (2005).
 24.
Iacus, S. Simulation and Inference for Stochastic Differential Equations: with R Examples. (Springer, New York, NY, 2008).
 25.
Loughin, T. A systematic comparison of methods for combining p-values from independent tests. Comput. Stat. Data An. 47, 467–485 (2004).
 26.
Asmussen, S. & Glynn, P. Stochastic Simulation: Algorithms and Analysis. (Springer Science & Business Media, New York, NY, 2007).
 27.
Judd, L. et al. The long-term natural history of the weekly symptomatic status of bipolar I disorder. Arch. Gen. Psychiatry 59, 530–537 (2002).
 28.
Bauer, M., Beaulieu, S., Dunner, D., Lafer, B. & Kupka, R. Rapid cycling bipolar disorder - diagnostic concepts. Bipolar Disord. 10, 153–162 (2008).
 29.
Kupka, R. et al. Comparison of rapid-cycling and non-rapid-cycling bipolar disorder based on prospective mood ratings in 539 outpatients. Am. J. Psychiatry 162, 1273–1280 (2005).
 30.
Schneck, C. et al. The prospective course of rapid-cycling bipolar disorder: findings from the STEP-BD. Am. J. Psychiatry 165, 370–377 (2008).
 31.
Borbély, A. A. A two process model of sleep regulation. Hum. Neurobiol. 1, 195–204 (1982).
 32.
Pittendrigh, C. & Daan, S. A functional analysis of circadian pacemakers in nocturnal rodents. J. Comp. Physiol. 106, 333–355 (1975).
 33.
Cochran, A., McInnis, M. & Forger, D. Data-driven classification of bipolar I disorder from longitudinal course of mood. Transl. Psychiatry 6, e912 (2016).
 34.
Gottschalk, A., Bauer, M. & Whybrow, P. Evidence of chaotic mood variation in bipolar disorder. Arch. Gen. Psychiatry 52, 947–959 (1995).
 35.
Werndl, C. Are deterministic descriptions and indeterministic descriptions observationally equivalent? Stud. Hist. Philos. Mp. 40, 232–242 (2009).
 36.
Moore, P., Little, M., McSharry, P., Goodwin, G. & Geddes, J. Mood dynamics in bipolar disorder. Int. J. Bipolar Disord. 2, 11 (2014).
 37.
van der Werf, S. et al. Major depressive episodes and random mood. Arch. Gen. Psychiatry 63, 509–518 (2006).
 38.
Krystal, A. & Greenside, H. Low-dimensional chaos in bipolar disorder? Arch. Gen. Psychiatry 55, 275 (1998).
 39.
Bender, R. & Alloy, L. Life stress and kindling in bipolar disorder: review of the evidence and integration with emerging biopsychosocial theories. Clin. Psychol. Rev. 31, 383–398 (2011).
 40.
Fan, J. On Markov and Hidden Markov Models with Applications to Trajectories. (University of Pittsburgh, Pittsburgh, PA, 2014).
 41.
Lopez, A. Markov Models for Longitudinal Course of Youth Bipolar Disorder. (University of Pittsburgh, Pittsburgh, PA, 2014).
 42.
Moore, P., Little, M., McSharry, P., Geddes, J. & Goodwin, G. Forecasting depression in bipolar disorder. IEEE Trans. Biomed. Eng. 59, 2801–2807 (2012).
 43.
Insel, T. et al. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am. J. Psychiatry 167, 748–751 (2010).
Acknowledgements
This research was supported by the Heinz C. Prechter Bipolar Research Fund at the University of Michigan Depression Center; the Richard Tam Foundation; a Human Frontier Science Program grant (RPG 24/2012); and the National Institutes of Health, Department of Health and Human Services (R34 MH10040403; K01 MH112876).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.