Introduction

Optimal operation of a supply chain (SC) remains a challenging task. A major obstacle to efficient SC management is the intrinsic uncertainty of future demand. The parties involved in an SC attempt to control the stock level and avoid overproduction and overstocking based on a past demand pattern1. However, when the parties perform this in an uncoordinated manner, fluctuation in demand is amplified and transferred to upstream parties, which is called the bullwhip effect2,3,4. Factors causing the bullwhip effect can be classified into five groups: demand signal processing, lead time, order batching, price fluctuations, and rationing and shortage gaming2,3. These inefficient practices amplify the demand signal and transfer it to upstream parties.

In addition to empirical studies4,5,6, a number of theoretical studies have demonstrated that replenishment policies based on incomplete information (i.e., demand signal processing) are the cause of the bullwhip effect. Traditional approaches to study inventory control use exponential smoothing2,3,7,8,9,10,11 and moving average7,10 techniques. Although these control methods are effective for certain types of demand signals when the strength of the feedback is properly set, feedback that is too strong often destabilizes the SC. Another control approach is a model-based one, in which replenishment is performed based on a model that directly describes the temporal structure of the demands. For example, autoregressive models, including the autoregressive moving average (ARMA)12 and autoregressive integrated moving average (ARIMA)11,13,14 models, have been used in various studies. These studies demonstrated that information sharing among firms prevents the firms from overreacting to signals and is the key to stabilizing the SC15.

In addition, demand forecasting using machine learning modeling, which aims at high prediction accuracy, has recently attracted attention16,17,18,19. This approach has been facilitated by the increased accessibility of high computational power, state-of-the-art modeling techniques, and big data20,21,22. Although this approach has not yet achieved satisfactory performance18, the use of relatively complex models (e.g., support vector machines and neural networks16,17,18,19), which can accommodate more variables, appears promising because the demand may be influenced by many (measurable or unmeasurable) factors23, such as price24, promotions25,26, and calendar events27. Information on these factors is likely to increase the accuracy of the prediction if properly included in the model.

As a feature of mathematical modeling, training (or parameter estimation) based on past data requires higher cost than performing prediction (or inference) based on a model28. Thus, a constructed forecasting model is not updated as often as forecasting. It should be noted that there is a common implementation in which model parameters are updated (retrained) for each forecast if the cost of doing so is small28. Because the demand pattern may change over a long-term period and is often difficult to model, the forecasting model should be updated at the appropriate time. However, the appropriate time for retraining is often unknown for each firm, which may destabilize the SC.

In this study, we examine the scenario in which model retraining is performed in an uncoordinated manner in each firm comprising an SC under long-term variable demand. We demonstrate that this can cause the bullwhip effect, which can be prevented by sharing the model of the retailer (i.e., the firm at the customer end of the SC) with upstream firms. Information sharing at any level is often difficult to implement in practice for various reasons, e.g., cost, compatibility of information systems, and confidentiality issues29. Thus, evaluating the benefits of sharing the model and the cost of not doing so contributes to efficient supply chain management. We use a simple forecasting model to illustrate this phenomenon, excluding the possibility of other factors (e.g., lead time, seasonality, and model structures) confounding the cause of the bullwhip effect.

The remainder of this paper is organized as follows. In Sec. 2, we define the model and introduce three retraining policies and one baseline (non-retraining) policy. In Sec. 3, we report the inventory level and the amount of sales loss yielded by each policy. We demonstrate that uncoordinated model retraining causes the bullwhip effect, and that sharing the forecasting model improves the performance of the SC. In Sec. 4, we summarize the results and discuss their implications.

Figure 1

Overview of system. Each echelon places an order to the next upstream echelon according to the order-up-to-point policy. The target inventory level is determined by the forecasting model in each echelon, which is temporally updated. The demand at the end customer is stochastically generated by a probability distribution, \({\mathcal {N}}(\mu _0(t),\sigma _0(t))\).

Figure 2

Average inventory level at each echelon. (a) Regular update scheme. (b) Independent update scheme. (c) Shared forecasting model scheme. In each panel, we varied the length of training data, \(L_\text{train}\). Simulations were performed for three types of intervals of demand change: \(T_\text{int} = 50\) (left), \(T_\text{int} = 100\) (middle), and \(T_\text{int} \sim U(50, 100)\) (right). The standard deviation of the inventory level for 5 echelons was computed for each time step. For each simulation condition, the results were averaged over \(t=10^7\) steps.

Figure 3

Percentage of lost sales opportunities. (a) Regular update scheme. (b) Independent update scheme. (c) Shared forecasting model scheme. In each panel, we varied the length of training data, \(L_\text{train}\). The simulations were performed for three types of intervals of demand change, i.e., \(T_\text{int} = 50\) (left), \(T_\text{int} = 100\) (middle), and \(T_\text{int} \sim U(50, 100)\) (right). For each simulation condition, the results were averaged over \(t=10^7\) steps.

Figure 4

Trade-off between the average inventory level (Fig. 2) and the percentage of lost sales opportunities (Fig. 3). (a) \(T_\text{int} = 50\). (b) \(T_\text{int} = 100\). (c) \(T_\text{int} \sim U(50, 100)\). Each symbol represents a simulation condition (i.e., a single \(L_\text{train}\) value) in Figs. 2 and 3. The red dashed line represents the results of the constant policy.

Model

Order and shipment

We consider a simple SC consisting of N echelons adopting an order-up-to-point policy8,30,31 (Fig. 1). Each echelon may represent, for example, a retailer, wholesaler, or manufacturer of certain consumer goods. When goods are sold at the consumer end of the SC (i.e., the retailer), its inventory is reduced, and orders are placed with a wholesaler to replenish the products. The wholesaler places orders with the next wholesaler or manufacturer to meet the demand. As a result of these orders, products or materials are sent in the opposite direction.

The target inventory level in the policy is determined by the forecasting model of each echelon, which is described in Sec. 2.2. Briefly, in a single discrete time step, each echelon first receives an order from a downstream echelon (or the end customer) and ships goods to it, and then immediately replenishes its stock by sending an order to the next upstream echelon. A single discrete time step represents the interval between successive orders, which may correspond to a day, a week, and so forth in a real situation. For simplicity, we assume that all firms in the SC place their orders in a synchronized manner in each time step. Note that we ignore the lead time in this model to exclude its effects on the bullwhip effect.

Technically, we update the system from \(t = T\) to \(t = T+1\) as follows. First, at the most downstream customer, demand is generated by a Gaussian distribution, \({\mathcal {N}}(\mu _{0}(T), \sigma _{0}(T))\) (with temporally variable parameters; see Sec. 2.3 for details). Note that we also repeated all simulations with log-normal distributions (Supporting Information) to confirm that the conclusions were not specific to the Gaussian distribution. Then, we sequentially update each echelon, starting from the downstream end. An update of an echelon is twofold, involving a shipment to the next downstream echelon and subsequent replenishment by placing an order to the next upstream echelon. Echelon i receives an order from the next downstream echelon \(i-1\) (or the end customer for \(i=1\)), which is denoted by \(O_{i-1}(T)\). If the ordered amount is smaller than the inventory level of echelon i, \(I_{i}(T)\) (i.e., \(O_{i-1}(T) < I_{i}(T)\)), echelon i depletes the ordered amount (i.e., \(I_i(T+0.5) = I_i(T) - O_{i-1}(T)\)); otherwise, it depletes the entire current inventory (i.e., \(I_i(T+0.5) = 0\)). The downstream echelon \(i-1\) receives this shipment from echelon i, \(S_{i}(T) = \min {\{O_{i-1}(T), I_{i}(T)\}}\), resulting in \(I_{i-1}(T+1) = I_{i-1}(T+0.5) + S_i(T)\). For \(i=N\), the ordered amount is replenished in full (i.e., \(I_{N}(T+1) =I_{N}(T+0.5) +O_N(T)\)). Note that we do not consider the backlog of orders; thus, the amount of demand that exceeds the inventory level is lost, which is recorded as the lost sales opportunity.
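The shipment rule described above can be sketched as a small Python function. This is only an illustrative reading of the text, not the authors' code; the function name `ship` and its return layout are assumptions made here. Because an echelon either ships the ordered amount or empties its stock, the shipment is the smaller of the order and the available inventory.

```python
def ship(order, inventory):
    """Apply one shipment at a single echelon (illustrative sketch).

    order     -- O_{i-1}(T), the amount requested by the downstream party
    inventory -- I_i(T), the stock of echelon i before shipping
    Returns (shipment, remaining_inventory, lost_amount).
    """
    shipment = min(order, inventory)   # S_i(T): cannot ship more than is in stock
    remaining = inventory - shipment   # I_i(T + 0.5)
    lost = order - shipment            # unmet part is lost, since there is no backlog
    return shipment, remaining, lost


# An order that exceeds the stock empties the inventory and loses the rest.
print(ship(1.5, 1.0))  # (1.0, 0.0, 0.5)
```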

Order policy and models for demands

Each echelon i has a forecasting model for demand as a probability distribution, based on which it determines the safety stock and target inventory levels. Here we use the Gaussian distribution, \(O_{i-1}(t) \sim {\mathcal {N}}(\mu _i(t), \sigma _{i}(t))\), as the forecasting model. This model is one of the simplest forecasting models and mimics the functions of other types of more complex models that generate a prediction with an estimated error. We use this model to exclude the possibility that a specific temporal structure of a forecasting model, not the retraining of the models on which we focus here, causes the bullwhip effect. Essentially the same models for the demand have been widely used in the literature to theoretically investigate the dynamics of SCs32,33,34,35.

Using the two parameters in the model (i.e., \(\mu _i\) and \(\sigma _i\) denoting the mean and standard deviation, respectively), the target inventory level is set to

$$\begin{aligned} I_{i,\mathrm {target}}(t) = \mu _i(t) + c \sigma _i(t), \end{aligned} \tag{1}$$

where c is the safety factor defining the amount of safety stock (i.e., order-up-to-point policy8,30,31). The ordered amount is simply the difference between the target and current (at \(t = T+0.5\)) inventory levels (i.e., \(O_i(T) = I_{i,\mathrm {target}}(T) - I_{i}(T+0.5)\)). If the current inventory level is higher than the target inventory level (which may occur when the forecasting model is retrained), we set \(O_i(T) = 0\). Note that this order policy alone does not amplify the demand signal, and thus does not cause the bullwhip effect36.
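To make the interplay of shipment and replenishment concrete, the following Python sketch advances the whole chain by one time step under the order-up-to-point policy of Eq. (1). It is a minimal illustration under stated assumptions: the function name `step_chain`, the list layout (index 0 is echelon 1, the retailer), and the return values are choices made here and do not come from the paper.

```python
def step_chain(demand, inventory, mu, sigma, c=1.96):
    """Advance all echelons from t = T to t = T + 1 (illustrative sketch).

    demand    -- end-customer demand O_0(T)
    inventory -- list of I_i(T), index 0 being echelon 1 (retailer)
    mu, sigma -- per-echelon forecasting parameters
    Returns (new_inventory, lost), both lists with one entry per echelon.
    """
    n = len(inventory)
    inventory = list(inventory)            # do not mutate the caller's list
    lost = [0.0] * n
    incoming = demand                      # order arriving at the current echelon
    for i in range(n):
        shipment = min(incoming, inventory[i])      # S_i(T)
        lost[i] = incoming - shipment               # lost sales, no backlog
        inventory[i] -= shipment                    # I_i(T + 0.5)
        if i > 0:
            inventory[i - 1] += shipment            # downstream stock at T + 1
        target = mu[i] + c * sigma[i]               # Eq. (1)
        incoming = max(0.0, target - inventory[i])  # O_i(T), never negative
    inventory[n - 1] += incoming           # most upstream order is filled in full
    return inventory, lost
```

With all echelons starting at their target level and no stockout, each echelon simply passes the customer demand upstream and returns to its target, which is consistent with the statement that the order policy alone does not amplify the demand signal.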

Fluctuations in demand

We also fluctuate the demand distribution at the end customer, \({\mathcal {N}}(\mu _0,\sigma _0)\). As an example, we set the following simple stochastic process with moderate variability. We fixed \(\sigma _0 = 0.1\). When updated, the mean of the distribution, \(\mu _0\), was redrawn from a uniform distribution between 1/2 and 2, i.e., \(\mu _{0, \mathrm{new}} \sim U(1/2, 2)\).

We considered two types of intervals for updating the distribution, \(T_\text{int}\). The first is a constant update interval, with which the demand distribution is updated every \(T_\text{int}\) (constant) time steps. In this study, we examined \(T_\text{int} = 50\) and 100. The second type is a random update interval, which is redrawn from the uniform distribution, \(U(T_\text{min},T_\text{max})\), every time the demand distribution is updated. Here, we set \(T_\text{min} = 50\) and \(T_\text{max}=100\). These two types of update intervals were used to show that our results were not influenced by either the discrete or the stochastic properties of the demand.
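The demand process described in this section could be generated as in the sketch below. Several details are assumptions made for illustration only: the function name `demand_stream`, the initial mean of 1.0, the use of integer-valued random intervals, and the fact that negative Gaussian draws are not clipped. The paper does not specify these points here.

```python
import random


def demand_stream(n_steps, interval=None, t_min=50, t_max=100,
                  sigma0=0.1, mu_init=1.0, seed=0):
    """Return a list of (mu0, demand) pairs, one per time step (sketch).

    interval -- constant update interval; if None, each interval is
                redrawn uniformly from the integers t_min..t_max (assumption)
    """
    rng = random.Random(seed)
    mu0 = mu_init

    def next_interval():
        return interval if interval is not None else rng.randint(t_min, t_max)

    steps_left = next_interval()
    stream = []
    for _ in range(n_steps):
        if steps_left == 0:
            mu0 = rng.uniform(0.5, 2.0)   # redraw the mean from U(1/2, 2)
            steps_left = next_interval()
        stream.append((mu0, rng.gauss(mu0, sigma0)))
        steps_left -= 1
    return stream


# Constant interval of 50 steps: the mean changes at steps 50, 100, 150, ...
stream = demand_stream(200, interval=50)
```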

Retraining schemes for demand forecasting at echelons

As the demand pattern varies, the forecasting models of the echelons are also updated (i.e., retrained). Each echelon i refers to the past \(L_\text{{train}}\) orders it received, \(\{O_{i-1}(T-1),\ldots ,O_{i-1}(T-L_\text{{train}})\}\). When the model is updated, \(\mu _i\) and \(\sigma _i\) are replaced by the sample mean and the corrected sample standard deviation computed from these orders, respectively. The optimal value of \(L_\text{{train}}\) depends on the environment and is generally unknown.
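A retraining step of this kind can be sketched in a few lines of Python. The function name `retrain` is hypothetical; the sketch assumes that "corrected" refers to the usual \(n-1\) (Bessel) correction, which is what `statistics.stdev` computes, and it requires at least two orders in the window.

```python
from statistics import mean, stdev


def retrain(received_orders, l_train):
    """Re-estimate (mu_i, sigma_i) from the last l_train received orders.

    stdev() is the sample standard deviation with the n - 1 correction,
    so l_train must be at least 2.
    """
    window = received_orders[-l_train:]
    return mean(window), stdev(window)


print(retrain([1.0, 2.0, 3.0, 4.0], 3))  # (3.0, 1.0)
```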

We consider the following three types of schemes defining when and how to perform retraining at each echelon.

Regular update

We simultaneously update the models of all echelons every \(L_\text{{train}}\) steps, i.e., at \(t = nL_\text{{train}}\) \((n=1,2,\ldots )\).

Independent update

At every time step, each echelon i verifies whether the sample average computed from the most recent past data, \(\{O_{i-1}(T-1),\ldots ,O_{i-1}(T-L_\text{{train}})\}\), falls within the interval \([\mu _i - 1.96\sigma _i / \sqrt{L_\text{{train}}}, \mu _i + 1.96\sigma _i/ \sqrt{L_\text{{train}}}]\); if it does not, the echelon updates its forecasting model. If the current model correctly describes the demand, the sample average falls within this interval with a probability of 95%. Note that this scheme therefore has a false-positive rate of 5% at each time step. This criterion is examined independently at each echelon.
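The retraining criterion of the independent update scheme amounts to a two-sided test on the sample mean, which can be sketched as follows. The function name `needs_retraining` and the parameter `z` are illustrative assumptions; the window passed in is taken to contain the most recent \(L_\text{train}\) received orders.

```python
import math
from statistics import mean


def needs_retraining(recent_orders, mu, sigma, z=1.96):
    """Return True if the sample mean leaves the 95% interval around mu.

    recent_orders -- the last L_train orders received by the echelon
    mu, sigma     -- parameters of the echelon's current forecasting model
    """
    half_width = z * sigma / math.sqrt(len(recent_orders))
    return abs(mean(recent_orders) - mu) > half_width


# With L_train = 25 and sigma = 0.1 the half-width is 1.96 * 0.1 / 5 = 0.0392.
print(needs_retraining([1.03] * 25, mu=1.0, sigma=0.1))  # False
print(needs_retraining([1.10] * 25, mu=1.0, sigma=0.1))  # True
```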

Shared forecasting model

Only echelon 1 updates its model, using the same rule as in the independent update scheme. Whenever this forecasting model is updated, it is copied to all other echelons. In this way, the forecasting model of echelon 1, which has the most accurate information about the demand, is shared with the other firms.

Parameter settings and initial conditions

We performed simulations for \(N=5\) echelons, varying \(L_\text{train}\). The safety factor was set to \(c=1.96\). As initial conditions, we set \(\mu _0 = \mu _1=\cdots =\mu _N = 1.0\), \(\sigma _0 = \sigma _1=\cdots =\sigma _N = 0.1\), and \(I_1=\cdots =I_N = 1.196\).
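As a quick consistency check, the initial inventory coincides with the target level of Eq. (1) evaluated at the initial parameter values:

$$\begin{aligned} I_{i,\mathrm {target}} = \mu _i + c \sigma _i = 1.0 + 1.96 \times 0.1 = 1.196, \end{aligned}$$

so, under this reading, every echelon starts exactly at its order-up-to point.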

Constant policy

To evaluate the efficacy of the retraining schemes, we prepared a baseline model, which is referred to as the constant policy. In this policy, each echelon’s prediction is based on the Gaussian distribution, \({\mathcal {N}}({\bar{\mu }},\sigma )\), which does not change over time. Here, \({\bar{\mu }}=1.25\) is a fixed parameter representing the long-term mean of demands (\(\mu _0 \sim U(1/2,2)\)), and we examined various values of \(\sigma\) ranging from 0 to 1.2 in separate runs. This policy assumes an ideal situation in which each echelon knows the value of \({\bar{\mu }}\).
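The value of \({\bar{\mu }}\) follows directly from the mean of the uniform distribution from which \(\mu _0\) is drawn:

$$\begin{aligned} {\bar{\mu }} = E[\mu _0] = \frac{1/2 + 2}{2} = 1.25. \end{aligned}$$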

Results

Emergence of bullwhip effect

First, we present the average inventory level of each echelon in Fig. 2. Upstream echelons had more inventory than downstream echelons except when the shared forecasting model scheme was applied. A higher inventory level indicates that the echelon estimates larger variability in the demand signals (i.e., larger \(\sigma _i\)) and therefore orders more to hold more stock.

With the independent update scheme, the inventory level became extremely high in upstream echelons when \(L_\text{train}\) was small (Fig. 2b). In this case, unnecessarily frequent retraining perturbed the system too much, and the amplification of these perturbations caused the bullwhip effect.

As expected, the shared forecasting model scheme had the best performance in this respect. Because the forecasting model was shared with upstream echelons, they had the same estimate of demand variability, resulting in identical inventory levels.

Lost sales opportunities

We then examined the level of service achieved. In general, a lower inventory level leads to more stockouts and subsequent loss of sales opportunities. Even if we consider the backlog of orders, stockouts causing additional costs are undesirable4,37. We quantified the amount of lost sales opportunities of echelon i at \(t=T\) by \(O_{i-1}(T) - S_{i}(T) (= \max {\{0, O_{i-1}(T) - I_{i}(T)\}})\). This value was divided by 1.25 (i.e., the long-term average of the demands) to compute the percentage of lost sales opportunities (Fig. 3).
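The metric can be sketched in Python as follows. The function name `lost_sales_percentage` is hypothetical, and averaging the per-step losses over time before normalizing is an assumption based on the figure captions, which state that results were averaged over the simulated steps.

```python
def lost_sales_percentage(orders, inventories, mean_demand=1.25):
    """Percentage of lost sales opportunities at one echelon (sketch).

    orders      -- O_{i-1}(T) for each time step
    inventories -- I_i(T) for each time step, before shipping
    mean_demand -- long-term average demand used for normalization
    """
    lost = [max(0.0, o - inv) for o, inv in zip(orders, inventories)]
    return 100.0 * (sum(lost) / len(lost)) / mean_demand


# Losses of 0, 0.5 and 0.5 over three steps give roughly 26.7%.
pct = lost_sales_percentage([1.0, 1.5, 2.0], [1.2, 1.0, 1.5])
```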

The regular update scheme (Fig. 3a) caused a substantial number of sales losses, which reached the maximum around \(L_\text{train}=T_\text{{int}}/2\) where the inventory level was at its minimum (Fig. 2a).

The independent update scheme (Fig. 3b) suppressed the sales losses to a reasonable level while keeping the inventory level moderate when \(L_\text{{train}}\) was not very small (Fig. 2b).

The shared forecasting model scheme significantly reduced the sales losses (Fig. 3c) with a low inventory level (Fig. 2c).

In Fig. 3a,b, echelon 2 or 3, not echelon 1, had the most sales losses. This is explained as follows. In these schemes, echelon 1 swiftly adjusted to the change in demand, and the upstream echelons followed after some time. When the demand increased, because echelons 2 and 3 did not receive an order larger than \(I_1\), which is smaller than the inventory level of the other echelons, the increase in demand became difficult to detect, and adaptation to it was delayed.

Trade-off between inventory level and lost sales opportunities

To further evaluate the performance of the three schemes (i.e., regular update, independent update, and shared forecasting model schemes), we consider a two-dimensional plane of the total inventory level (i.e., sum of \(I_i,\) \(i=1,\ldots ,N\)) in the system, and the percentage of lost sales opportunities at echelon 1 (Fig. 4). Note that the lost sales at echelon 1 are equal to those of the entire SC. Because these two indices are in a trade-off relationship, we must evaluate both at the same time to measure the performance. The lower left area in Fig. 4 corresponds to a lower inventory level and fewer lost sales opportunities (i.e., a set of better operating points). For various values of \(L_\text{train}\), the results of the three retraining schemes are plotted with the results of the constant policy (red dashed line). Symbols located above this line indicate that the inventory control is inferior to that of the constant policy.

The performance of the regular update scheme was comparable to that of the constant policy. In addition, the independent update and the shared forecasting model schemes provided substantially better operating points than the constant policy. In particular, the shared forecasting model scheme had the best performance. Moreover, the operating points of the shared forecasting model scheme were distributed over a small area, indicating the robustness of the scheme.

Conclusion

We demonstrated that retraining a forecasting model can cause the bullwhip effect in an SC. Furthermore, the shared forecasting model scheme effectively suppressed the bullwhip effect without increasing the loss of sales. This scheme functioned robustly for various lengths of training data, which is an advantageous property for practical applications.

We illustrated the effect of model retraining for long-term demand fluctuations using a simple model. Because our findings are based on the intrinsic dynamic mechanisms of SCs, we believe that they generalize to other types of settings.

There are several factors that were ignored in the model, whose effects on our conclusions are discussed below. First, we ignored the lead time and its variability. If the lead time is variable, it should be reflected in the ordered amount (Eq. (1))4,10,35. In this case, we suppose that the SC becomes more unstable and more vulnerable to perturbations by the retraining reported in this paper.

Second, the effects of order batching32,38 should be considered. In particular, when the echelons place orders asynchronously with different batch sizes, simple forecasting model sharing may not be sufficient to stabilize the SC. In this case, comprehensive information sharing, including the batching policy and current inventory level of downstream echelons, is required.

Finally, we did not include the temporal structure in the forecasting model because our focus was on retraining the models, which is primarily affected by the errors from the expectation. Thus, we did not consider seasonality39,40 and trends7. However, if we regress these factors out from the original demand signal, the problem should remain essentially identical to that examined in this paper.

In this paper, we proposed that sharing the forecasting model of echelon 1 (i.e., a retailer) with upstream echelons benefits the entire SC. In this scheme, only the retailer bears the cost of collecting information and performing retraining, although upstream suppliers benefit more from the information. Thus, in practice, we should consider contracts among firms to incentivize retailers and downstream suppliers to share their models8,34,41,42,43.

It should be noted that our results were based on a simple model, which is a general limitation of theoretical studies44. We believe that the effects of model retraining should be studied in greater detail with more practical settings in future research. For instance, investigating the effects of model retraining on SC networks45,46,47 and more complex forecasting models for high-dimensional demand signals48 would be a useful future research direction.