Prediction of the COVID-19 outbreak in China based on a new stochastic dynamic model

The current outbreak of coronavirus disease 2019 (COVID-19) has become a global crisis due to its rapid and wide spread around the world. A good understanding of the dynamics of the disease would greatly enhance the control and prevention of COVID-19. However, to the best of our knowledge, the unique features of the outbreak have limited the applicability of all existing dynamic models. In this paper, a novel stochastic model is proposed that aims to account for the unique transmission dynamics of COVID-19 and to capture the effects of the intervention measures implemented in Mainland China. We found that: (1) rather than being an aberration, asymptomatic virus carriers were present in remarkable numbers, (2) a virus carrier with symptoms was approximately twice as likely to pass the disease to others as an asymptomatic virus carrier, (3) the transmission rate has dropped significantly since the implementation of control measures in Mainland China, and (4) the epidemic outbreak was expected to be contained by early March in the selected provinces and cities in China.

The states of the proposed stochastic model are listed as follows.
• S: Susceptible.
• E: Exposed. It is divided into four sub-states:
  - $E_1$: will become symptomatic in the future and is not traceable in medical tracking.
  - $E_2$: will become symptomatic in the future and is traceable in medical tracking.
  - $A_1$: won't become symptomatic in the future and is not traceable in medical tracking.
  - $A_2$: won't become symptomatic in the future and is traceable in medical tracking.
• Q: Quarantined. It is divided into two sub-states:
  - $E_q$: will become symptomatic in the future.
  - $A_q$: won't become symptomatic in the future.
• IN: Infected, symptomatic, but not yet admitted to hospital. It is divided into two sub-states:
  - $IN_1$: not traceable in medical tracking.
  - $IN_2$: traceable in medical tracking.
• IH: Infected, symptomatic and currently hospitalized. It is divided into two sub-states:
  - $IH_L$: with light symptoms.
  - $IH_S$: with severe symptoms.
• R: Recovered. It is divided into three sub-states:
  - $R_A$: recovered from state $A_1$ or $A_2$.
  - $R_N$: recovered from state $IN_1$ or $IN_2$.
  - $R_H$: recovered from state $IH_L$.
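As a bookkeeping aid, the compartments can be written down as a small data structure. The following is a minimal sketch, not part of the original model specification: the dictionary layout, the ASCII labels and the helper `in_state_space` are illustrative assumptions, and the death state `D` is included only because the later text refers to D(t).

```python
# Labels follow the list above. The death state D is not in that list; it is
# included here because the later text refers to D(t). The dictionary layout
# and the helper name are illustrative assumptions.
SUBSTATES = {
    "S": ["S"],
    "E": ["E1", "E2", "A1", "A2"],
    "Q": ["Eq", "Aq"],
    "IN": ["IN1", "IN2"],
    "IH": ["IHL", "IHS"],
    "R": ["RA", "RN", "RH"],
    "D": ["D"],
}

# Flat list of every tracked compartment.
ALL_STATES = [name for group in SUBSTATES.values() for name in group]


def in_state_space(counts, n_total):
    """Return True if counts gives every compartment an integer in {0, ..., n_total}."""
    if set(counts) != set(ALL_STATES):
        return False
    return all(isinstance(c, int) and 0 <= c <= n_total for c in counts.values())


# Hypothetical configuration: one untraceable exposed individual among 1000.
initial = {name: 0 for name in ALL_STATES}
initial["S"] = 999
initial["E1"] = 1

print(len(ALL_STATES), in_state_space(initial, 1000))
```

Counting the entries of `ALL_STATES` gives the number of coordinates of the process, one per compartment.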
We denote by $S(t)$, $E_1(t)$ and so on the population sizes in the corresponding states at time $t$. The evolution of $\xi(t) = \{S(t), E_1(t), \dots, D(t)\}$ over time $t$ forms a continuous-time Markov process with state space $\{0, 1, 2, \dots, N\}^{15}$. The corresponding transition process is illustrated in Figure S1. The transition rates of the system $\xi(t)$, which uniquely determine the continuous-time Markov process in our proposed model, are as follows.
• Symptoms onset
• Hospitalization
• Symptom relief
• Recovery
• Death

From a functional analysis point of view, the stochastic dynamic model defined above can be equivalently described by its Markov semigroup or its infinitesimal generator. See [11,5] for details. From those operators, it is natural to consider the following mean-field differential equation system, which serves as a deterministic counterpart of the stochastic model,

$$\begin{aligned} &\;\;\vdots \\ \frac{d\tilde{R}_H(t)}{dt} &= \gamma_H \widetilde{IH}_L(t), \end{aligned}$$

where $\tilde{S}(t)$ is the deterministic counterpart of $S(t)$, and the same notation applies to the rest. Intuitively, the mean-field ODE system above serves as a degenerate case of our stochastic model, in which all randomness has been averaged out. When the differential equations are linear, which is not true in our case, the ODE system also describes the evolution of the expectation of the stochastic system (Dynkin's formula; see, for example, [4], Section 4.4). Moreover, according to [9] and [10], this deterministic model can also serve, under a much weaker condition, as a scaling approximation of the stochastic one after rescaling by the renormalizing factor $N$.

To be more specific, let $\tilde{\xi}(t)$ denote the deterministic counterpart of $\xi(t)$, and let $\tilde{\xi}_N(t)$ and $\xi_N(t)$ be copies of the deterministic and stochastic models with total population $N$. Then the renormalized random path $\xi_N(t)/N$, as a stochastic process, is "almost" deterministic and fluctuates closely around the deterministic trajectory $\tilde{\xi}_N(t)/N$, provided that the total population $N$ is large and the renormalized initial values $\xi_N(0)/N$ and $\tilde{\xi}_N(0)/N$ are close to each other. In mathematical language, with probability one there is pathwise convergence for all $t_0 \in [0, \infty)$.
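For reference, a pathwise convergence statement of this kind is commonly written as below. This is a sketch in the standard form used for density-dependent Markov chains. The choice of norm, the supremum over $[0, t_0]$ and the subscript placement of $N$ are assumptions made here for illustration, not a quotation of the authors' display.

```latex
% Assumed standard form of a law-of-large-numbers scaling limit:
% the rescaled stochastic path stays uniformly close to the rescaled
% deterministic path on every bounded time interval [0, t_0].
\[
  \lim_{N \to \infty} \; \sup_{0 \le t \le t_0}
  \left\| \frac{\xi_N(t)}{N} - \frac{\tilde{\xi}_N(t)}{N} \right\| = 0
  \quad \text{almost surely, for every } t_0 \in [0, \infty).
\]
```

Note that both terms inside the norm are divided by $N$, so the statement controls proportions of the total population rather than head counts.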
However, we would like to specifically bring to the reader's attention that, NO MATTER how large the total population $N$ is, as long as the size of the epidemic outbreak is NOT comparable to $N$, the convergence in (1) DOES NOT imply that the un-renormalized $\xi(t)$ is non-random by itself, or that the epidemic-involved populations, for example $E_1(t)$ and $\tilde{E}_1(t)$, are close to each other in any sense without the rescaling factor. In fact, the size of the epidemic outbreak not being comparable to $N$ already implies that $E_1(t)/N$ and $\tilde{E}_1(t)/N$ are both close to zero, which perfectly satisfies (1) while providing no information on whether the actual values of $E_1(t)$ and $\tilde{E}_1(t)$ are close to each other. To summarize, when the size of the epidemic outbreak is NOT comparable to $N$, the stochastic model possesses intrinsic randomness and may not be well approximated by a deterministic ODE in any nondegenerate sense.
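The point can be illustrated numerically with a much simpler stand-in model. The sketch below is not the model of this paper: it uses a plain stochastic SIR chain simulated with the Gillespie algorithm, and the population size, initial count, rates `beta` and `gamma`, and time horizon are all hypothetical values chosen only so that the outbreak stays tiny relative to $N$.

```python
import random


def gillespie_sir(n_total, i0, beta, gamma, t_end, rng):
    """Simulate a stochastic SIR chain; return the infected count at time t_end."""
    s, i, t = n_total - i0, i0, 0.0
    while i > 0:
        rate_inf = beta * s * i / n_total   # S -> I
        rate_rec = gamma * i                # I -> R
        total = rate_inf + rate_rec
        t += rng.expovariate(total)         # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < rate_inf:
            s -= 1
            i += 1
        else:
            i -= 1
    return i


def mean_field_sir(n_total, i0, beta, gamma, t_end, dt=0.001):
    """Forward-Euler solution of the SIR mean-field ODE; return I(t_end)."""
    s, i = float(n_total - i0), float(i0)
    for _ in range(int(round(t_end / dt))):
        new_inf = beta * s * i / n_total * dt
        new_rec = gamma * i * dt
        s -= new_inf
        i += new_inf - new_rec
    return i


# Hypothetical setting: 5 initial cases in a population of one million.
N = 1_000_000
rng = random.Random(0)
runs = [gillespie_sir(N, 5, 0.5, 0.25, 20.0, rng) for _ in range(20)]
ode = mean_field_sir(N, 5, 0.5, 0.25, 20.0)

# Largest gap between stochastic and deterministic values after rescaling by N.
rescaled_gap = max(abs(x - ode) / N for x in runs)

print("ODE value:", round(ode, 1))
print("stochastic runs:", sorted(runs))
print("max rescaled gap:", rescaled_gap)
```

Under these assumed parameters the rescaled gap is tiny simply because every count is small compared with `N`, while the raw counts in `runs` can differ from the ODE value, and from each other, by amounts comparable to the outbreak itself.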

Supplementary B
In the state-collapsed version of the stochastic process (see Figure 1), state $D$ is removed because the fatality rate is extremely low in all the selected regions and the collected data could not provide reliable estimates of the death rates; only recovery from hospital, $R_H$, is considered. Furthermore, states $E_1$, $E_2$, $A_1$ and $A_2$ are collapsed into $E$, states $IN_1$ and $IN_2$ are collapsed into $IN$, and states $IH_L$ and $IH_S$ are collapsed into $IH$, which eases the identification of the initial values in the model. The rates $r_H$, $r_s$ and $r_q$ can be fixed in advance using the average time from symptom onset to diagnosis, the mean incubation period, and the mean difference between the infectious period and the serial interval reported in [17]. We assume that $r_q$ increases to $1/3.6$ (which means that the expected difference between the infectious period and the serial interval decreases by 3 days) owing to improved testing and control measures after these measures are taken.
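The collapsing step and the change in $r_q$ can be sketched as follows. This is an illustration under stated assumptions: the ASCII labels, the function `collapse` and the example counts are hypothetical; dropping `RA` and `RN` is one reading of "only recovery from hospital is considered"; and the quarantine sub-states `Eq` and `Aq` are left unchanged because the text does not say how they are treated. The pre-intervention mean of 6.6 days is not quoted in the text; it is implied by the stated 3-day decrease to 3.6 days.

```python
# Sub-states merged into a single collapsed state, as described in the text.
COLLAPSE_MAP = {
    "E1": "E", "E2": "E", "A1": "E", "A2": "E",
    "IN1": "IN", "IN2": "IN",
    "IHL": "IH", "IHS": "IH",
}
# States not tracked in the collapsed model (assumed reading of the text).
DROPPED = {"D", "RA", "RN"}


def collapse(counts):
    """Map full-model compartment counts to the collapsed compartments."""
    collapsed = {}
    for state, n in counts.items():
        if state in DROPPED:
            continue
        target = COLLAPSE_MAP.get(state, state)  # unmapped states keep their label
        collapsed[target] = collapsed.get(target, 0) + n
    return collapsed


# r_q is the reciprocal of the expected difference (in days) between the
# infectious period and the serial interval.
MEAN_GAP_AFTER = 3.6                     # days, after control measures
MEAN_GAP_BEFORE = MEAN_GAP_AFTER + 3.0   # days, implied by the 3-day decrease
r_q_after = 1.0 / MEAN_GAP_AFTER
r_q_before = 1.0 / MEAN_GAP_BEFORE

# Hypothetical counts, for illustration only.
example = {
    "S": 990, "E1": 3, "E2": 2, "A1": 1, "A2": 1, "Eq": 0, "Aq": 0,
    "IN1": 1, "IN2": 1, "IHL": 1, "IHS": 0, "RA": 0, "RN": 0, "RH": 0, "D": 0,
}
collapsed_example = collapse(example)
print(collapsed_example)
print(r_q_before, r_q_after)
```

Summing the sub-state counts means that only the totals in $E$, $IN$ and $IH$ need initial values, which is the simplification the collapsed model is after.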