An extreme value analysis of daily new cases of COVID-19 for sixteen countries in west Africa

We provide an extreme value analysis of daily new cases of COVID-19. We use data from Benin, Burkina Faso, Cabo Verde, Cote d’Ivoire, Gambia, Ghana, Guinea, Guinea-Bissau, Liberia, Mali, Mauritania, Niger, Nigeria, Senegal, Sierra Leone and Togo, covering a period of 37 months. Extreme values were defined as monthly maximums of daily new cases. The generalized extreme value distribution was fitted to them with two of its three parameters allowed to vary linearly or quadratically with respect to month number. Ten of the sixteen countries were found to exhibit significant downward trends in monthly maximums. The adequacy of fits was assessed by probability plots and the Kolmogorov-Smirnov test. The fitted models were used to derive quantiles of the monthly maximum of new cases as well as their limits when the month number is taken to infinity.

The accompanying figure shows scatter plots of the monthly maximums for the sixteen countries, together with lowess 26,27 smoothed versions of the scatter plots. Most countries appear to exhibit significant trends with respect to month number. The trends appear either linear or quadratic. The section "Models" assumes that the monthly maximums of new cases follow the generalized extreme value model. Estimation of the generalized extreme value model by the method of maximum likelihood requires that the data are independent (because the likelihood function is defined as the product of probability density functions). We tested for independence using the runs test. The corresponding p-values were 0.099, 0.078, 0.087, 0.067, 0.084, 0.077, 0.087, 0.077, 0.088, 0.061, 0.078, 0.065, 0.069, 0.093, 0.057 and 0.092. All sixteen p-values exceed 0.05, so independence is not rejected at the 5 percent level.
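The text does not say which variant of the runs test was used. As a minimal sketch, assuming the Wald-Wolfowitz runs test about the sample median with a normal approximation, the check could look as follows. The function name `runs_test` and the example series are hypothetical and are not the paper's data.

```python
import math
from statistics import median


def runs_test(values):
    """Two-sided Wald-Wolfowitz runs test about the sample median.

    Returns (number_of_runs, z_statistic, p_value) using the normal
    approximation. Observations equal to the median are dropped.
    """
    med = median(values)
    signs = [v > med for v in values if v != med]
    n1 = sum(signs)           # observations above the median
    n2 = len(signs) - n1      # observations below the median
    n = n1 + n2
    if n1 == 0 or n2 == 0 or n < 3:
        raise ValueError("need at least three observations on both sides")
    # A new run starts whenever the sign changes between neighbours.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return runs, z, p_value


# Hypothetical series (assumption, for illustration only).
example = [1, 9, 2, 8, 3, 7, 4, 6, 1, 9]
runs, z, p = runs_test(example)
print(runs, round(z, 3), round(p, 4))
```

A small p-value would indicate too many or too few runs for an independent sequence; the p-values reported above are all larger than 0.05.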

Models
Let X denote a random variable representing the monthly maximum of new cases. The fit of the generalized extreme value model to the data on X, conditioned on X > 0, by the method of maximum likelihood (see Coles 16 for details) yielded the probability plots shown in Fig. 3. We see that the fits of the generalized extreme value model are generally poor.
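To make the goodness-of-fit checks concrete, the following sketch computes probability plot coordinates and a Kolmogorov-Smirnov distance for a fitted generalized extreme value model with constant parameters. It assumes the standard parametrization of the distribution function as in Coles; the function names, the plotting positions i/(n+1), the data and the parameter values are all assumptions for illustration.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """Standard GEV distribution function (Gumbel limit when xi == 0)."""
    z = (x - mu) / sigma
    try:
        if xi == 0.0:
            return math.exp(-math.exp(-z))
        t = 1.0 + xi * z
        if t <= 0.0:
            # Outside the support: below the lower end point (xi > 0)
            # or above the upper end point (xi < 0).
            return 0.0 if xi > 0.0 else 1.0
        return math.exp(-t ** (-1.0 / xi))
    except OverflowError:
        # The inner term is astronomically large, so the cdf is zero.
        return 0.0


def probability_plot_points(data, mu, sigma, xi):
    """Pairs (empirical probability, fitted probability) for sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    return [((i + 1) / (n + 1), gev_cdf(x, mu, sigma, xi))
            for i, x in enumerate(ordered)]


def ks_statistic(data, mu, sigma, xi):
    """One-sample Kolmogorov-Smirnov distance to the fitted GEV."""
    ordered = sorted(data)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        f = gev_cdf(x, mu, sigma, xi)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical monthly maxima and parameter values (assumptions).
maxima = [12, 30, 45, 60, 150]
print(probability_plot_points(maxima, 30.0, 25.0, 0.2))
print(ks_statistic(maxima, 30.0, 25.0, 0.2))
```

Points close to the diagonal of the probability plot, and a small Kolmogorov-Smirnov distance, indicate an adequate fit. For the models with trends, each observation would be evaluated at its own location and scale.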
To produce a better fit, we now account for the trends in the data noted in the section "Data". We fitted the following models (the first, referred to as the constant model, is the same as the generalized extreme value model).
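1. the constant model given by
$$\mu(x) \equiv \exp(a_1), \quad \sigma(x) \equiv \exp(a_2), \quad \xi(x) \equiv a_3; \tag{2}$$
2. the linear location model given by
$$\mu(x) = \exp\left[a_1 + b_1 \cdot \text{month no}(x)\right], \quad \sigma(x) \equiv \exp(a_2), \quad \xi(x) \equiv a_3; \tag{3}$$
3. the quadratic location model given by
$$\mu(x) = \exp\left[a_1 + b_1 \cdot \text{month no}(x) + c_1 \cdot \text{month no}(x)^2\right], \quad \sigma(x) \equiv \exp(a_2), \quad \xi(x) \equiv a_3; \tag{4}$$
4. the linear scale model given by
$$\mu(x) \equiv \exp(a_1), \quad \sigma(x) = \exp\left[a_2 + b_2 \cdot \text{month no}(x)\right], \quad \xi(x) \equiv a_3; \tag{5}$$
5. the quadratic scale model given by
$$\mu(x) \equiv \exp(a_1), \quad \sigma(x) = \exp\left[a_2 + b_2 \cdot \text{month no}(x) + c_2 \cdot \text{month no}(x)^2\right], \quad \xi(x) \equiv a_3; \tag{6}$$
6. the linear location and linear scale model given by
$$\mu(x) = \exp\left[a_1 + b_1 \cdot \text{month no}(x)\right], \quad \sigma(x) = \exp\left[a_2 + b_2 \cdot \text{month no}(x)\right], \quad \xi(x) \equiv a_3; \tag{7}$$
7. the quadratic location and linear scale model given by
$$\mu(x) = \exp\left[a_1 + b_1 \cdot \text{month no}(x) + c_1 \cdot \text{month no}(x)^2\right], \quad \sigma(x) = \exp\left[a_2 + b_2 \cdot \text{month no}(x)\right], \quad \xi(x) \equiv a_3; \tag{8}$$
8. the linear location and quadratic scale model given by
$$\mu(x) = \exp\left[a_1 + b_1 \cdot \text{month no}(x)\right], \quad \sigma(x) = \exp\left[a_2 + b_2 \cdot \text{month no}(x) + c_2 \cdot \text{month no}(x)^2\right], \quad \xi(x) \equiv a_3; \tag{9}$$
9. the quadratic location and quadratic scale model given by
$$\mu(x) = \exp\left[a_1 + b_1 \cdot \text{month no}(x) + c_1 \cdot \text{month no}(x)^2\right], \quad \sigma(x) = \exp\left[a_2 + b_2 \cdot \text{month no}(x) + c_2 \cdot \text{month no}(x)^2\right], \quad \xi(x) \equiv a_3. \tag{10}$$

[Figure: panels plotting the maximum of new cases against month number.]

Here $x$ denotes the data (monthly maximum of new cases) and $\text{month no}(x)$, taking values $1, 2, \ldots, 37$, denotes the month number corresponding to $x$. The parameters $b_1$ and $b_2$ correspond to linear trends with respect to month number. The parameters $c_1$ and $c_2$ correspond to quadratic trends with respect to month number. The models given by (2) to (10) were fitted by the method of maximum likelihood by maximizing the likelihood function, in which $\{x_i : x_i > 0\}$ are the data, $n$ is the number of $x_i$s strictly greater than zero, $\mu(x_i)$ and $\sigma(x_i)$ are given by (2)-(10), $\boldsymbol{\mu} = (a_1, b_1, c_1)$ and $\boldsymbol{\sigma} = (a_2, b_2, c_2)$. The maximization was performed by using the optim function in the R software (R Development Core Team, 28 ). Let $\hat{\boldsymbol{\mu}} = (\hat{a}_1, \hat{b}_1, \hat{c}_1)$ and $\hat{\boldsymbol{\sigma}} = (\hat{a}_2, \hat{b}_2, \hat{c}_2)$ denote the maximum likelihood estimates of $\boldsymbol{\mu}$ and $\boldsymbol{\sigma}$, respectively. Standard errors/confidence intervals associated with parameters can be obtained by assuming asymptotic normality of maximum likelihood estimates and inverting the observed information matrix.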
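The objective handed to an optimizer can be sketched as follows. This is an illustration in Python rather than the R code used for the paper, and it assumes the standard generalized extreme value log-density applied to the strictly positive observations; whether the likelihood in the paper includes an extra term for conditioning on X > 0 cannot be confirmed from the text. The function name `gev_trend_nll`, the parameter ordering and the example numbers are hypothetical.

```python
import math


def gev_trend_nll(params, x, month):
    """Negative GEV log-likelihood with log-quadratic location and scale.

    params = (a1, b1, c1, a2, b2, c2, a3). Fixing b or c at zero gives the
    linear or constant sub-models. Returns math.inf when an observation
    falls outside the support or the arithmetic breaks down.
    """
    a1, b1, c1, a2, b2, c2, xi = params
    total = 0.0
    try:
        for value, m in zip(x, month):
            if value <= 0:
                continue  # only strictly positive maxima enter the fit
            mu = math.exp(a1 + b1 * m + c1 * m * m)
            sigma = math.exp(a2 + b2 * m + c2 * m * m)
            z = (value - mu) / sigma
            if xi == 0.0:
                # Gumbel limit of the GEV density.
                total += math.log(sigma) + z + math.exp(-z)
            else:
                t = 1.0 + xi * z
                if t <= 0.0:
                    return math.inf
                total += (math.log(sigma)
                          + (1.0 + 1.0 / xi) * math.log(t)
                          + t ** (-1.0 / xi))
    except (OverflowError, ZeroDivisionError, ValueError):
        return math.inf
    return total


# Hypothetical data: five monthly maxima and their month numbers.
maxima = [120.0, 95.0, 0.0, 60.0, 40.0]
months = [1, 2, 3, 4, 5]
# Linear location model: only a1, b1, a2 and a3 are non-zero.
print(gev_trend_nll((4.5, -0.2, 0.0, 3.0, 0.0, 0.0, 0.1), maxima, months))
```

Minimizing this function over the free parameters with a general-purpose optimizer (for example `scipy.optimize.minimize` in Python, playing the role of `optim` in R) would give the maximum likelihood estimates under the stated assumptions.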
But this approach supposes: (i) normality of maximum likelihood estimates; (ii) the sample size is infinity; (iii) the parameter estimates are fixed and non-random. These assumptions may not hold in reality. A realistic approach is to use the following simulation scheme to obtain standard errors/confidence intervals 29 : (i) simulate 10,000 samples from the fitted model, each of the same size as the data; (ii) refit the model for each of the 10,000 samples; (iii) compute the empirical distribution from the 10,000 estimates of the parameter of interest; (iv) use the empirical distribution to compute the standard error/confidence interval for the parameter.
The parameters of interest could be the parameters in (2)-(10) or probabilities of interest.
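The four steps of the simulation scheme can be sketched as below. To keep the example short and free of an optimizer, it uses the Gumbel special case (shape parameter zero) and refits by the method of moments; both choices are stand-ins for refitting the chosen model by maximum likelihood, and the percentile interval is one common way of reading a confidence interval off the empirical distribution. All function names and numbers are hypothetical.

```python
import math
import random
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def simulate_gumbel(mu, sigma, size, rng):
    """Step (i): draw `size` Gumbel(mu, sigma) values by inverting the cdf."""
    sample = []
    while len(sample) < size:
        u = rng.random()
        if u > 0.0:  # avoid log(0) on the rare draw of exactly zero
            sample.append(mu - sigma * math.log(-math.log(u)))
    return sample


def fit_gumbel_moments(sample):
    """Step (ii): moment estimates of (mu, sigma).

    Assumption: a stand-in for a maximum likelihood refit.
    """
    sigma = stdev(sample) * math.sqrt(6.0) / math.pi
    mu = mean(sample) - EULER_GAMMA * sigma
    return mu, sigma


def parametric_bootstrap(fitted, size, simulate, fit, n_rep=10000, seed=0):
    """Steps (iii)-(iv): standard error and 95 percent percentile interval.

    Returns one (standard_error, lower, upper) tuple per parameter.
    """
    rng = random.Random(seed)
    estimates = [fit(simulate(*fitted, size, rng)) for _ in range(n_rep)]
    summary = []
    for j in range(len(fitted)):
        column = sorted(est[j] for est in estimates)
        lower = column[int(0.025 * (n_rep - 1))]
        upper = column[int(0.975 * (n_rep - 1))]
        summary.append((stdev(column), lower, upper))
    return summary


# Hypothetical fitted values; 37 matches the number of months in the data.
for se, lo, hi in parametric_bootstrap((100.0, 40.0), 37, simulate_gumbel,
                                       fit_gumbel_moments, n_rep=2000):
    print(round(se, 2), round(lo, 2), round(hi, 2))
```

The same loop applies to any quantity computed from a refit, such as a quantile or a probability, by having `fit` return that quantity.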

Results and discussion
We used the method in the section "Models" to model the data on monthly maximum of new cases for each of the sixteen countries. We started with the constant model and then added one parameter at a time to fit the linear location, quadratic location, linear scale, quadratic scale, linear location and linear scale, quadratic location and linear scale, linear location and quadratic scale and the quadratic location and quadratic scale models. We also started with the quadratic location and quadratic scale model and subtracted one parameter at a time to fit the linear location and quadratic scale, quadratic location and linear scale, linear location and linear scale, quadratic scale, linear scale, quadratic location, linear location and the constant models. Both approaches led to the same model. The significance or non-significance of parameters (to be added or subtracted) was determined by the likelihood ratio test by comparing likelihood values 30 . We also used the Akaike information criterion due to Akaike 31 and the Bayesian information criterion due to Schwarz 32 to check significance or non-significance.
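The comparison of two nested models that differ by one parameter can be sketched as follows. The likelihood ratio statistic is referred to a chi-square distribution with one degree of freedom, and the information criteria use their usual definitions. The function names and the log-likelihood values are hypothetical.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k fitted parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


def lr_test_one_parameter(loglik_small, loglik_large):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value), with the p-value taken from the
    chi-square distribution with one degree of freedom.
    """
    stat = max(2.0 * (loglik_large - loglik_small), 0.0)
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: constant model (3 parameters) against the
# linear location model (4 parameters), fitted to n = 37 monthly maxima.
ll_constant, ll_linear = -210.4, -205.1
print(lr_test_one_parameter(ll_constant, ll_linear))
print(aic(ll_constant, 3), aic(ll_linear, 4))
print(bic(ll_constant, 3, 37), bic(ll_linear, 4, 37))
```

A small p-value favours keeping the extra trend parameter; smaller values of either information criterion point in the same direction.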
The values of log-likelihood, Akaike information criterion and Bayesian information criterion for (2) to (10) for Burkina Faso, one of the sixteen countries, are given in Table 2. We see that the constant model is the best fitting model. Table 3 gives the parameter estimates and standard errors of the best fitting models for all the sixteen countries. The standard errors were obtained by the bootstrap procedure outlined in the section "Models". We see that all of the standard errors are less than the parameter estimates in magnitude.
The shape parameter estimates in Table 3 are positive for all of the countries. When the constant model was fitted and $H_0 : \xi = 0$ was tested, the p-values given in Table 4 suggest that the estimates of $\xi$ are significantly positive. When $\xi > 0$, the distribution of the monthly maximum of new cases is heavy tailed and unbounded [...] location parameter estimate decreases by a multiple of $3.469 \times 10^{-5}$ for every unit increase in month number. The scale parameter estimate decreases by a multiple of $1.204 \times 10^{-4}$ for every unit increase in month number. Table 5 gives the limit of (11) as the month number is taken to infinity. The 2.5 percent and 5 percent quantiles are equal to zero for all ten countries. Quantiles of all levels are equal to zero for Mali, Senegal and Sierra Leone. Among the remaining countries, the 50, 95 and 97.5 percent quantiles are smallest for Cabo Verde, followed by Mauritania, Benin, Cote d'Ivoire, Guinea, Ghana and Nigeria.
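The expression labelled (11) does not survive in this text, so the following sketch only illustrates how quantiles behave as the month number grows. It assumes the standard generalized extreme value quantile function, the log-quadratic trend parametrization of the fitted models, and clipping at zero because case counts cannot be negative. The clipping, the function names and the coefficients are all assumptions and are not taken from the paper.

```python
import math


def gev_quantile(p, mu, sigma, xi):
    """Standard GEV quantile at probability p (Gumbel limit when xi == 0)."""
    y = -math.log(p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi


def trend_quantile(p, month, loc_coef, scale_coef, xi):
    """Quantile at a given month under log-quadratic location and scale.

    The result is clipped at zero (assumption: counts are non-negative).
    """
    a1, b1, c1 = loc_coef
    a2, b2, c2 = scale_coef
    mu = math.exp(a1 + b1 * month + c1 * month ** 2)
    sigma = math.exp(a2 + b2 * month + c2 * month ** 2)
    return max(0.0, gev_quantile(p, mu, sigma, xi))


# Hypothetical coefficients: location decays with month, scale is constant.
loc, scale, shape = (5.0, -0.1, 0.0), (3.0, 0.0, 0.0), 0.2
for month in (37, 100, 1000):
    print(month, [round(trend_quantile(p, month, loc, scale, shape), 3)
                  for p in (0.025, 0.05, 0.5, 0.95, 0.975)])
```

Under these assumptions, a decaying location with a constant scale sends the lower quantiles to zero while the median and upper quantiles settle at positive limits, and a decaying scale as well sends every quantile to zero. This is one possible reading of the pattern reported above, not a derivation from the paper.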

Conclusions
We have modeled the monthly maximums of daily new cases from Benin, Burkina Faso, Cabo Verde, Cote d'Ivoire, Gambia, Ghana, Guinea, Guinea-Bissau, Liberia, Mali, Mauritania, Niger, Nigeria, Senegal, Sierra Leone and Togo using the generalized extreme value distribution. Two of the three parameters of this distribution were allowed to vary linearly or quadratically with respect to month number to account for trends. [...] quadratic trend in its location parameter and a linear trend in its scale parameter. Senegal exhibited a quadratic trend in its location and scale parameters. We have provided estimates of 2.5 percent, 5 percent, 50 percent, 95 percent and 97.5 percent quantiles for all the sixteen countries for up to March of 2024. We have also provided asymptotic limits of these quantiles as the month number approaches infinity. The goodness of the fitted models was examined by probability plots and the Kolmogorov-Smirnov test.
It is of concern that Burkina Faso, Gambia, Guinea-Bissau, Liberia, Niger and Togo did not exhibit significant downward trends in monthly maximums of daily new cases. It is also of concern that the asymptotic limits of 50 percent, 95 percent and 97.5 percent quantiles are strictly positive for Benin, Cabo Verde, Cote d'Ivoire, Ghana, Guinea, Mauritania and Nigeria. The governments of these countries may want to take appropriate actions to reduce new cases of COVID-19.
Future work is to: (i) extend the extreme value analysis of daily new cases to other countries in Africa; (ii) provide multivariate extreme value analysis of daily new cases from multiple countries; (iii) provide spatial and temporal extreme value analysis of daily new cases across the continent; (iv) extend the analysis to daily new deaths.

Data availability
The data used can be obtained from the corresponding author.

Code availability
The code used can be obtained from the corresponding author.

www.nature.com/scientificreports/

Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.