Monitoring of semiconductor manufacturing process on Bayesian AEWMA control chart under paired ranked set sampling schemes

Quality control often employs control charts, including memory-type charts such as the exponentially weighted moving average (EWMA) chart and memoryless charts such as the Shewhart chart, to identify shifts in the location parameter of a process. This article proposes a new Bayesian adaptive EWMA (AEWMA) control chart built on two loss functions (LFs), the squared error loss function (SELF) and the Linex loss function (LLF). The proposed chart aims to improve the detection of small to moderate as well as large shifts in the process mean. The chart is implemented with an informative prior for both the posterior and posterior predictive distributions, under various paired ranked set sampling (PRSS) schemes. Its effectiveness is assessed using the average run length (ARL) and the standard deviation of run length (SDRL). Monte Carlo simulations are used to compare the proposed approach with other control charts. The outcomes demonstrate the superior performance of the proposed chart in identifying out-of-control signals under PRSS designs in comparison to simple random sampling (SRS). Finally, a practical application in semiconductor manufacturing is used to assess the chart under the various PRSS strategies. The results reveal that the suggested control chart captures out-of-control signals far better than the control charts already in use. Overall, this study introduces a new technique with diverse LFs and PRSS designs that improves the precision and effectiveness of detecting process mean shifts, thereby contributing to advances in quality control and process monitoring.

Statistical Process Control (SPC) is a critical quality management tool employed in various industries to supervise, regulate, and improve production processes. By using statistical methodologies, SPC provides continuous monitoring of manufacturing operations so that they function effectively and reliably within predefined quality benchmarks. It involves data collection, analysis, and interpretation to identify variations and trends within the production process. Utilizing control charts (CCs) and other statistical tools, SPC aids in the timely identification of potential deviations from normal patterns, facilitating swift corrective measures to maintain desired quality levels. SPC significantly contributes to defect reduction, enhanced production efficiency, and the overall improvement of product quality, resulting in increased customer satisfaction and reduced operational costs. A CC is a fundamental component of SPC that facilitates the ongoing monitoring and evaluation of the stability and performance of manufacturing or business processes. It visually represents process data, enabling the identification of variations and trends that could affect output quality. By plotting data points on a graph with predetermined control limits, it helps distinguish common causes of variation, such as random fluctuations, from special causes such as defects or errors. CCs thus enable organizations to separate normal process variation from variation requiring corrective action, thereby ensuring consistent product quality and preventing defects. The effective use of control charts enables businesses to make informed, data-driven decisions, improve process efficiency, and achieve higher levels of customer satisfaction.

Bayesian approach
The Bayesian approach is a foundational framework for statistical inference that emphasizes the representation and manipulation of uncertainty through probabilities. Named after Reverend Thomas Bayes, it uses Bayes' theorem to combine prior beliefs with observed evidence and produce posterior probabilities, which allows beliefs to be updated as new data arrive. In contrast to frequentist statistics, which treats parameters as fixed but unknown values, the Bayesian approach treats parameters as random variables with their own probability distributions. This makes it possible to quantify the uncertainty of the estimation process. Because they offer a flexible and natural way to incorporate past knowledge into statistical modeling and analysis, Bayesian methods are widely used in many fields, including machine learning, data analysis, and decision making under uncertainty. They are particularly useful for well-informed decisions and forecasts when data are scarce or noisy.
In the context of statistical analysis, the variable under consideration X represents an in-control process with parameters θ and δ². A normal prior distribution with parameters θ₀ and δ₀² is chosen to express initial beliefs or knowledge about θ prior to any data observation:
$$p(\theta) = \frac{1}{\sqrt{2\pi\delta_0^2}}\exp\left\{-\frac{(\theta-\theta_0)^2}{2\delta_0^2}\right\}.$$
When there is little or no previous knowledge about an unknown population parameter, Bayesian analysis frequently applies a non-informative prior, which has a negligible impact on the posterior distribution. For this situation, Jeffreys 28 formulated a prior that is proportional to the square root of the Fisher information, $p(\theta) \propto \sqrt{I(\theta)}$, where $I(\theta)$ is the Fisher information (for a vector parameter, the determinant of the Fisher information matrix). When prior information about the parameter is available, an informative prior such as the normal prior allows the analysis to incorporate it.
The posterior (P) distribution updates our knowledge of the parameters of interest by combining prior beliefs with the likelihood function derived from the observed data. It represents the refined beliefs about these parameters after both past knowledge and recent evidence are taken into account, and it is a central part of Bayesian inference. The posterior $p(\theta\mid x)$ is given by
$$p(\theta\mid x) = \frac{p(x\mid\theta)\,p(\theta)}{\int p(x\mid\theta)\,p(\theta)\,d\theta}.$$
The posterior predictive (PP) distribution is a key tool for predicting future observations by combining prior assumptions about the parameters with the likelihood derived from the data. It gives the probability distribution of future observations y given the observed data x, with both parameter uncertainty and data variability taken into account. This is useful in areas such as machine learning, econometrics, and uncertainty-aware decision-making because it quantifies uncertainty in complex data scenarios. The predictive distribution $p(y\mid x)$ is mathematically described as
$$p(y\mid x) = \int p(y\mid\theta)\,p(\theta\mid x)\,d\theta.$$

Squared error loss function
In the Bayesian approach, the SELF measures the discrepancy between the estimated and true parameter values through their squared difference. It is a fundamental component of Bayesian decision theory, where it quantifies the loss incurred when the estimate differs from the true value, with the aim of minimizing this loss in the decision-making process. Gauss 29 suggested the SELF, which is mathematically described as
$$L(\theta,\hat\theta) = (\theta-\hat\theta)^2.$$
Under the SELF, the Bayes estimator is the posterior mean:
$$\hat\theta_{SELF} = E_{\theta\mid x}(\theta).$$
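The link between the SELF and the posterior mean can be checked in a few lines. The derivation below is a sketch in which m denotes the posterior mean E(θ | x) and a denotes any candidate estimate; both symbols are introduced here only for convenience.

```latex
\begin{aligned}
E_{\theta\mid x}\!\left[(\theta-a)^2\right]
  &= E_{\theta\mid x}\!\left[(\theta-m)^2\right]
     + 2(m-a)\,E_{\theta\mid x}(\theta-m) + (m-a)^2 \\
  &= \operatorname{Var}(\theta\mid x) + (m-a)^2 .
\end{aligned}
```

The middle term vanishes because E(θ − m | x) = 0, and the first term does not depend on a, so the expected loss is smallest at a = m.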

Linex loss function
Within the Bayesian framework, the LLF is an asymmetric measure of the discrepancy between the actual and the estimated parameter. It combines exponential and linear components, so that overestimation and underestimation can be penalized differently. This characteristic fosters adaptability in decision-making and estimation, in line with specific preferences and priorities in the Bayesian approach. Varian 30 introduced the asymmetric LLF, which is commonly written as
$$L(\theta,\hat\theta) = e^{c(\hat\theta-\theta)} - c(\hat\theta-\theta) - 1, \qquad c \neq 0.$$
Under the LLF, the Bayes estimator of the location parameter is
$$\hat\theta_{LLF} = -\frac{1}{c}\ln E_{\theta\mid x}\left(e^{-c\theta}\right).$$
Recall that the posterior predictive distribution is $p(y\mid x) = \int p(y\mid\theta)\,p(\theta\mid x)\,d\theta$.
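To make the asymmetry concrete, the following Python sketch evaluates a Linex loss and its Bayes estimate. It assumes the usual Varian form exp(c·d) − c·d − 1 with d = estimate − θ, and a hypothetical normal posterior N(mu, sd²); the values of `mu`, `sd`, and `c` are illustrative and do not come from the study.

```python
import math
import random


def linex_loss(theta, estimate, c):
    """Linex loss exp(c*d) - c*d - 1, where d = estimate - theta."""
    d = estimate - theta
    return math.exp(c * d) - c * d - 1.0


def linex_bayes_estimate(draws, c):
    """Monte Carlo version of -(1/c) * ln E[exp(-c * theta)] from posterior draws."""
    mean_exp = sum(math.exp(-c * t) for t in draws) / len(draws)
    return -math.log(mean_exp) / c


# Hypothetical normal posterior N(mu, sd^2); values chosen only for illustration.
mu, sd, c = 1.0, 0.5, 0.8
rng = random.Random(1)
draws = [rng.gauss(mu, sd) for _ in range(200_000)]

mc_estimate = linex_bayes_estimate(draws, c)
closed_form = mu - c * sd ** 2 / 2.0  # closed form for a normal posterior

print(f"Monte Carlo estimate: {mc_estimate:.4f}")
print(f"Closed form:          {closed_form:.4f}")
```

With c > 0, overestimation is penalized more heavily than underestimation, which is why the estimate sits below the posterior mean.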

Paired ranked set sampling
Muttlak 31 is recognized as the pioneer of the paired RSS (PRSS) method. This technique ranks sets of population units and, instead of measuring only one unit from each set, measures two units per set. The PRSS procedure can be implemented as follows. If the set size l is even, l²/2 units are randomly selected from the population and divided among l/2 sets of l units each. The units within each set are ranked using inexpensive sources such as expert insight or auxiliary data. Subsequently, the first and lth ranked units are chosen from the first set, the second and (l − 1)th units from the second set, and so on, until the (l/2)th and (l/2 + 1)th units are taken from the last set. If l is odd, l(l + 1)/2 units are drawn from the population under study and randomly distributed among (l + 1)/2 sets of l units each; the paired units are selected as in the even case from the first (l − 1)/2 sets, and the ((l + 1)/2)th ranked unit is taken from the last set. This completes one cycle of the PRSS procedure. The entire procedure can be repeated r times if necessary to obtain the desired sample size n = lr.
For a specific cycle r, let $Z_{i(j),r}$, i, j = 1, 2, 3, …, l; r = 1, 2, 3, …, c, denote the jth order statistic in the ith set. The PRSS estimator of the population mean for a single cycle, for even l, is
$$\bar Z_{PRSS(e)} = \frac{1}{l}\sum_{i=1}^{l/2}\left[Z_{i(i)} + Z_{i(l+1-i)}\right],$$
and for odd l,
$$\bar Z_{PRSS(o)} = \frac{1}{l}\left[\sum_{i=1}^{(l-1)/2}\left(Z_{i(i)} + Z_{i(l+1-i)}\right) + Z_{\frac{l+1}{2}\left(\frac{l+1}{2}\right)}\right],$$
with variances
$$\operatorname{Var}\left(\bar Z_{PRSS(e)}\right) = \frac{1}{l^2}\sum_{i=1}^{l/2}\left[\sigma^2_{(i)} + \sigma^2_{(l+1-i)} + 2\sigma_{(i,\,l+1-i)}\right]$$
and
$$\operatorname{Var}\left(\bar Z_{PRSS(o)}\right) = \frac{1}{l^2}\left[\sum_{i=1}^{(l-1)/2}\left(\sigma^2_{(i)} + \sigma^2_{(l+1-i)} + 2\sigma_{(i,\,l+1-i)}\right) + \sigma^2_{\left(\frac{l+1}{2}\right)}\right],$$
where $\sigma^2_{(j)}$ is the variance of the jth order statistic and $\sigma_{(j,k)}$ is the covariance between the jth and kth order statistics in a set of size l.
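The selection rule for one PRSS cycle can be sketched in Python as follows. This is an illustrative sketch and not the authors' code: the function name `prss_cycle`, the `draw` callback, and the assumption of perfect ranking (sorting on the true values) are all introduced here, and for odd l the last set is assumed to contribute its median unit.

```python
import random
import statistics


def prss_cycle(draw, l, rng):
    """Return the l measured units from one PRSS cycle with set size l.

    draw(rng) returns one population unit. Ranking is assumed perfect
    (units are sorted on their true values), which is a simplification.
    """
    measured = []
    for i in range(1, l // 2 + 1):
        ranked = sorted(draw(rng) for _ in range(l))
        measured.append(ranked[i - 1])   # i-th smallest unit of set i
        measured.append(ranked[l - i])   # (l + 1 - i)-th smallest unit of set i
    if l % 2 == 1:
        ranked = sorted(draw(rng) for _ in range(l))
        measured.append(ranked[(l + 1) // 2 - 1])  # median of the last set
    return measured


def standard_normal(rng):
    return rng.gauss(0.0, 1.0)


rng = random.Random(2024)
reps, l = 5000, 5

# Compare the spread of the PRSS mean with an SRS mean of the same size.
prss_means = [statistics.fmean(prss_cycle(standard_normal, l, rng)) for _ in range(reps)]
srs_means = [statistics.fmean(standard_normal(rng) for _ in range(l)) for _ in range(reps)]

var_prss = statistics.pvariance(prss_means)
var_srs = statistics.pvariance(srs_means)
print(f"variance of PRSS mean: {var_prss:.4f}")
print(f"variance of SRS mean:  {var_srs:.4f}")
```

For a standard normal population the PRSS mean should show a visibly smaller variance than the SRS mean, which is the efficiency gain the chart relies on.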

Extreme pair ranked set sampling
A modified version of the PRSS method, proposed by Balci et al. 32 and referred to as extreme PRSS (EPRSS), changes how units are selected from each ranked set. EPRSS is particularly valuable when the population follows a heavy-tailed distribution, a scenario more common than a normal distribution. Standard sampling techniques often struggle to capture extreme values in heavy-tailed data. By deliberately measuring these extreme values, EPRSS helps produce more accurate and representative estimates and mitigates biases that could arise from skewed estimation. The EPRSS method entails the following steps. If l is even, l²/2 sampling units are taken from the population concerned and divided into l/2 sets of equal size. The elements in each set are arranged in ascending order, and the first and last elements of each ordered set are measured. If l is odd, an alternative method is used: a total of l(l + 1)/2 elements are taken from the population under study and randomly distributed into (l − 1)/2 sets, and all elements in each set are ranked accordingly. If necessary, the entire EPRSS procedure is repeated r times to obtain a sample of size n = lr. For a single cycle with l even, the EPRSS estimator of the mean is
$$\bar Z_{EPRSS(e)} = \frac{1}{l}\sum_{i=1}^{l/2}\left[Z_{i(1)} + Z_{i(l)}\right].$$
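For the even case described above, one EPRSS cycle can be sketched in Python. The function name `eprss_cycle_even` and the `draw` callback are hypothetical, perfect ranking is assumed, and the odd case is left out because its selection rule is not fully specified here.

```python
import random
import statistics


def eprss_cycle_even(draw, l, rng):
    """Return the l measured units of one EPRSS cycle for even set size l.

    Each of the l/2 sets contributes its smallest and largest ranked unit.
    Perfect ranking is assumed for simplicity.
    """
    if l % 2 != 0:
        raise ValueError("this sketch covers the even case only")
    measured = []
    for _ in range(l // 2):
        ranked = sorted(draw(rng) for _ in range(l))
        measured.extend([ranked[0], ranked[-1]])  # minimum and maximum of the set
    return measured


def standard_normal(rng):
    return rng.gauss(0.0, 1.0)


rng = random.Random(7)
cycle = eprss_cycle_even(standard_normal, 4, rng)
eprss_means = [
    statistics.fmean(eprss_cycle_even(standard_normal, 4, rng)) for _ in range(5000)
]
grand_mean = statistics.fmean(eprss_means)
print(f"one cycle: {[round(v, 3) for v in cycle]}")
print(f"average of EPRSS means: {grand_mean:.4f}")
```

For a symmetric population such as the standard normal, the minimum and maximum offset each other on average, so the EPRSS mean should stay close to the population mean.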

Quartile pair ranked set sampling
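Tayyab et al. 33 proposed the Quartile Paired RSS (QPRSS) design as an approach for estimating population parameters. The QPRSS technique can be summarized as follows. If l is even, l²/2 elements are randomly drawn from the available population and allocated to l/2 sets of size l. The elements within each set are ranked using cost-effective sources. Subsequently, the ((l + 1)/4)th and (3(l + 1)/4)th ordered elements from each set are chosen. If l is odd, l(l + 1)/2 elements are randomly taken from the population and allocated to (l + 1)/2 sets. After ordering the units in each set, the ((l + 1)/4)th and (3(l + 1)/4)th ordered units from the first (l − 1)/2 sets, along with the ((l + 1)/2)th unit from the remaining last set, are measured to complete a single cycle. If necessary, the preceding steps are repeated r times to obtain the needed sample size n = lr.
Writing $q_1 = (l+1)/4$ and $q_3 = 3(l+1)/4$ for the two quartile ranks, the QPRSS mean estimator for a single cycle with even l is
$$\bar Z_{QPRSS(e)} = \frac{1}{l}\sum_{i=1}^{l/2}\left[Z_{i(q_1)} + Z_{i(q_3)}\right],$$
and for odd l,
$$\bar Z_{QPRSS(o)} = \frac{1}{l}\left[\sum_{i=1}^{(l-1)/2}\left(Z_{i(q_1)} + Z_{i(q_3)}\right) + Z_{\frac{l+1}{2}\left(\frac{l+1}{2}\right)}\right].$$
The respective variances are
$$\operatorname{Var}\left(\bar Z_{QPRSS(e)}\right) = \frac{1}{l^2}\sum_{i=1}^{l/2}\left[\sigma^2_{(q_1)} + \sigma^2_{(q_3)} + 2\sigma_{(q_1,q_3)}\right]$$
and
$$\operatorname{Var}\left(\bar Z_{QPRSS(o)}\right) = \frac{1}{l^2}\left[\sum_{i=1}^{(l-1)/2}\left(\sigma^2_{(q_1)} + \sigma^2_{(q_3)} + 2\sigma_{(q_1,q_3)}\right) + \sigma^2_{\left(\frac{l+1}{2}\right)}\right].$$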
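A QPRSS cycle can be sketched in Python along the same lines. The text does not say how non-integer quartile ranks are handled, so this sketch assumes that (l + 1)/4 is rounded half-up and that the upper rank is its mirror image l + 1 − q1; the function names are hypothetical and ranking is assumed perfect.

```python
import random


def quartile_ranks(l):
    """Return the lower and upper quartile ranks for a set of size l.

    Assumption: (l + 1) / 4 is rounded half-up to the nearest integer and the
    upper rank is taken as l + 1 - q1. For l = 7 this gives ranks 2 and 6,
    which match (l + 1)/4 and 3(l + 1)/4 exactly.
    """
    q1 = max(1, int((l + 1) / 4 + 0.5))
    return q1, l + 1 - q1


def qprss_cycle(draw, l, rng):
    """Return the l measured units of one QPRSS cycle with set size l."""
    q1, q3 = quartile_ranks(l)
    measured = []
    for _ in range(l // 2):
        ranked = sorted(draw(rng) for _ in range(l))
        measured.extend([ranked[q1 - 1], ranked[q3 - 1]])
    if l % 2 == 1:
        ranked = sorted(draw(rng) for _ in range(l))
        measured.append(ranked[(l + 1) // 2 - 1])  # median of the last set
    return measured


def standard_normal(rng):
    return rng.gauss(0.0, 1.0)


rng = random.Random(11)
sample = qprss_cycle(standard_normal, 7, rng)
print(quartile_ranks(7), [round(v, 3) for v in sample])
```

Choosing l so that (l + 1)/4 is an integer, as with l = 7, avoids the rounding assumption altogether.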

Suggested Bayesian AEWMA CC under various PRSS schemes
This section presents the recommended CC under distinct PRSS strategies for monitoring the mean of a normally distributed process. Consider independent and identically normally distributed variables X₁, X₂, …, Xₙ with mean θ and variance σ². The likelihood function can then be expressed as
$$L(x\mid\theta) = \left(2\pi\sigma^2\right)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\theta)^2\right\}.$$
The estimate of the mean shift, denoted $\hat\delta^{*}_t$, is obtained from the sequence of EWMA statistics based on $\{X_t\}$:
$$\hat\delta^{*}_t = \psi X_t + (1-\psi)\,\hat\delta^{*}_{t-1},$$
where $\hat\delta^{*}_0 = 0$ and ψ denotes the smoothing constant. The estimator $\hat\delta^{*}_t$ is biased for out-of-control processes and unbiased for in-control processes. Haq et al. 9 proposed an unbiased estimator of δ that is suitable for both in-control and out-of-control processes:
$$\hat\delta^{**}_t = \frac{\hat\delta^{*}_t}{1-(1-\psi)^t}.$$
It is suggested to use $\tilde\delta_t = |\hat\delta^{**}_t|$ to estimate δ. The proposed statistic under PRSS designs using the Bayesian technique for monitoring the process mean from the sequence $\{X_t\}$ is
$$F_t = g(\tilde\delta_t)\,\hat\theta_{(RSS_i)LF} + \left(1-g(\tilde\delta_t)\right)F_{t-1},$$
where i = 1, 2, 3, RSS₁ = PRSS, RSS₂ = QPRSS, RSS₃ = EPRSS, $g(\tilde\delta_t) \in (0, 1]$, and $F_0 = 0$, such that
$$g(\tilde\delta_t) = \begin{cases}\dfrac{1}{a\left[1+\tilde\delta_t^{-c}\right]}, & 0 < \tilde\delta_t \le 2.7,\\[2ex] 1, & \tilde\delta_t > 2.7.\end{cases} \tag{23}$$
Atif et al. 34 presented the function in (23) to adjust the smoothing constant according to the estimated shift. The recommended constants in $g(\tilde\delta_t)$ are a = 7, with c = 1 when $1 < \tilde\delta_t \le 2.7$ and c = 2 when $\tilde\delta_t \le 1$. When the Bayesian AEWMA plotting statistic exceeds the predetermined threshold h, it signals an out-of-control process; if the statistic remains below the threshold, the process is considered in control.
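The adaptive update can be sketched in Python. The sketch assumes that the adaptive weight has the form g = 1/(a[1 + δ̃^(−c)]) for 0 < δ̃ ≤ 2.7 and g = 1 otherwise, that the shift estimate is the bias-corrected EWMA in absolute value, and that the inputs are already standardized estimates; the function names are hypothetical.

```python
def smoothing_weight(shift, a=7.0):
    """Adaptive weight g for an estimated absolute shift (assumed form)."""
    if shift > 2.7:
        return 1.0
    if shift <= 0.0:
        return 0.0  # limiting value of the formula as shift -> 0+
    c = 2.0 if shift <= 1.0 else 1.0
    return 1.0 / (a * (1.0 + shift ** (-c)))


def aewma_path(estimates, psi):
    """Return the AEWMA statistics F_t for a sequence of standardized estimates."""
    delta_star, f_prev, path = 0.0, 0.0, []
    for t, x in enumerate(estimates, start=1):
        delta_star = psi * x + (1.0 - psi) * delta_star          # EWMA shift estimate
        delta_unbiased = delta_star / (1.0 - (1.0 - psi) ** t)   # bias correction
        g = smoothing_weight(abs(delta_unbiased))
        f_prev = g * x + (1.0 - g) * f_prev                      # adaptive update
        path.append(f_prev)
    return path


# Small estimated shifts get little weight; large ones are passed through.
for shift in (0.25, 1.0, 2.0, 3.0):
    print(f"shift {shift:.2f} -> weight {smoothing_weight(shift):.4f}")

print(aewma_path([0.2, -0.1, 1.5, 1.8, 2.2], psi=0.10))
```

The two branches of c meet at a shift of 1, where both give a weight of 1/14, so the weight is continuous there.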
When the likelihood and the prior distribution are both normal, the resulting posterior distribution is also normal, with mean $\theta_n$ and variance $\sigma_n^2$. The posterior $p(\theta\mid x)$ can be written as
$$p(\theta\mid x) = \frac{1}{\sqrt{2\pi\sigma_n^2}}\exp\left\{-\frac{(\theta-\theta_n)^2}{2\sigma_n^2}\right\},$$
where
$$\theta_n = \frac{n\bar{x}\,\delta_0^2 + \sigma^2\theta_0}{\sigma^2 + n\delta_0^2} \quad\text{and}\quad \sigma_n^2 = \frac{\sigma^2\delta_0^2}{\sigma^2 + n\delta_0^2}.$$
The estimator for the suggested approach, which combines the Bayesian methodology with the various PRSS designs under the SELF, is the posterior mean evaluated at the PRSS sample mean:
$$\hat\theta_{(RSS_i)SELF} = \frac{n\bar{x}_{(RSS_i)}\,\delta_0^2 + \sigma^2\theta_0}{\sigma^2 + n\delta_0^2}.$$
Its expectation is
$$E\left(\hat\theta_{(RSS_i)SELF}\right) = \frac{n\theta\,\delta_0^2 + \sigma^2\theta_0}{\sigma^2 + n\delta_0^2}.$$
The Bayes estimator under the LLF with PRSS follows from the normal posterior as
$$\hat\theta_{(RSS_i)LLF} = \frac{n\bar{x}_{(RSS_i)}\,\delta_0^2 + \sigma^2\theta_0}{\sigma^2 + n\delta_0^2} - \frac{c}{2}\,\sigma_n^2,$$
and its mean is $E\left(\hat\theta_{(RSS_i)SELF}\right) - \frac{c}{2}\sigma_n^2$.
Suppose there are future observations of size h, denoted y₁, y₂, …, y_h. In the Bayesian methodology with the different RSS strategies, the posterior predictive distribution $p(y\mid x)$ can be represented as
$$p(y\mid x) = \frac{1}{\sqrt{2\pi\delta_1^2}}\exp\left\{-\frac{(y-\theta_n)^2}{2\delta_1^2}\right\},$$
where $y\mid x$ is normally distributed with mean $\theta_n$ and standard deviation $\delta_1$. For the posterior predictive distribution, θ is then estimated under the LLF with the different PRSS designs by
$$\hat\theta_{(RSS_i)LLF} = \theta_n - \frac{c}{2}\,\delta_1^2,$$
with $\theta_n$ evaluated at the PRSS sample mean.
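As a numerical illustration of the normal-normal update, the Python sketch below computes the posterior mean and variance and the resulting SELF and Linex estimates. It assumes the standard conjugate formulas for a normal likelihood with known variance and a normal prior, and the normal-posterior closed form for the Linex estimate; all input values and function names are hypothetical.

```python
def normal_posterior(xbar, n, sigma2, theta0, delta0_sq):
    """Posterior mean and variance for a normal mean with known variance sigma2."""
    denom = sigma2 + n * delta0_sq
    theta_n = (n * xbar * delta0_sq + sigma2 * theta0) / denom
    sigma_n_sq = sigma2 * delta0_sq / denom
    return theta_n, sigma_n_sq


def self_estimate(theta_n):
    """Bayes estimate under squared error loss: the posterior mean."""
    return theta_n


def linex_estimate(theta_n, sigma_n_sq, c):
    """Bayes estimate under Linex loss for a normal posterior."""
    return theta_n - 0.5 * c * sigma_n_sq


# Hypothetical inputs: sample mean 1.0 from n = 5 units, standard normal prior.
theta_n, sigma_n_sq = normal_posterior(xbar=1.0, n=5, sigma2=1.0, theta0=0.0, delta0_sq=1.0)
est_self = self_estimate(theta_n)
est_llf = linex_estimate(theta_n, sigma_n_sq, c=0.5)

print(f"posterior mean {theta_n:.4f}, posterior variance {sigma_n_sq:.4f}")
print(f"SELF estimate {est_self:.4f}, Linex estimate {est_llf:.4f}")
```

Under a PRSS design, `xbar` would be replaced by the corresponding PRSS, QPRSS, or EPRSS mean estimate.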

Simulation study
The effectiveness of the Bayesian AEWMA CC under the various PRSS designs is evaluated using Monte Carlo simulation. The performance measures are the ARL and the SDRL. To evaluate the proposed CC with different LFs, smoothing constants of ψ = 0.10 and 0.25 are used, and the in-control ARL is fixed at 370. The essential simulation steps required to implement the proposed CC are summarized below.
Step 1: Setting the in-control ARL
• The prior and sampling distributions are assumed to be standard normal, from which the properties are determined for the different LFs, i.e., $E(\hat\theta_{LLF})$ and $\delta_{LLF}$.
• The threshold value h is determined for the chosen value of the smoothing constant.
• For an in-control process, generate a random sample of size n from a normal distribution, $X \sim N\left(E(\hat\theta), \delta^2\right)$.
• Compute the recommended AEWMA statistic and assess the process according to the predetermined design specifications.
• Repeat the preceding three stages as long as the process stays in control, and record the run length, i.e., the number of samples until the process is declared out of control.
Step 2: Out-of-control ARL
• For a shifted process, draw samples from a normal distribution, i.e., $X \sim N\left(E(\hat\theta_{LF}) + \delta\frac{\sigma}{\sqrt{n}}, \delta\right)$.
• Compute the AEWMA statistic $F_t$ using the Bayesian approach, and assess the process under the proposed design.
• Repeat the above steps as long as the process is not declared out of control, and record the run length.
• Repeat steps (i-iii) 100,000 times, and compute the ARL and SDRL.
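The run-length loop in the steps above can be sketched in Python. This is a deliberately simplified sketch, not the study's simulation: it uses SRS and a standardized sample mean in place of the Bayesian PRSS estimators, a two-sided signal rule, far fewer replications than 100,000, a cap on the run length, and an uncalibrated threshold `h`. In practice h would be tuned until the in-control ARL is about 370.

```python
import math
import random
import statistics


def smoothing_weight(shift, a=7.0):
    """Adaptive weight g for an estimated absolute shift (assumed form)."""
    if shift > 2.7:
        return 1.0
    if shift <= 0.0:
        return 0.0
    c = 2.0 if shift <= 1.0 else 1.0
    return 1.0 / (a * (1.0 + shift ** (-c)))


def run_length(shift, h, psi, n, rng, max_len=500):
    """Number of samples until |F_t| > h, capped at max_len (a truncation)."""
    delta_star, f_stat = 0.0, 0.0
    for t in range(1, max_len + 1):
        # SRS sample of size n; the mean is shifted by shift * sigma / sqrt(n), sigma = 1.
        sample = [rng.gauss(shift / math.sqrt(n), 1.0) for _ in range(n)]
        z = statistics.fmean(sample) * math.sqrt(n)  # standardized sample mean
        delta_star = psi * z + (1.0 - psi) * delta_star
        delta_tilde = abs(delta_star / (1.0 - (1.0 - psi) ** t))
        g = smoothing_weight(delta_tilde)
        f_stat = g * z + (1.0 - g) * f_stat
        if abs(f_stat) > h:
            return t
    return max_len


def arl_sdrl(shift, h, psi, n, reps, seed):
    """Average and standard deviation of simulated run lengths."""
    rng = random.Random(seed)
    lengths = [run_length(shift, h, psi, n, rng) for _ in range(reps)]
    return statistics.fmean(lengths), statistics.pstdev(lengths)


H = 0.3  # hypothetical, uncalibrated threshold
arl_in_control, sdrl_in_control = arl_sdrl(0.0, H, psi=0.10, n=5, reps=200, seed=1)
arl_shifted, sdrl_shifted = arl_sdrl(1.0, H, psi=0.10, n=5, reps=200, seed=2)
print(f"shift 0.0: ARL {arl_in_control:.1f}, SDRL {sdrl_in_control:.1f}")
print(f"shift 1.0: ARL {arl_shifted:.1f}, SDRL {sdrl_shifted:.1f}")
```

Because of the run-length cap and the small number of replications, the in-control figure here is only a rough lower bound; the point of the sketch is the structure of the loop and the sharp drop in run length once the mean shifts.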

Results and discussions
Tables 1, 2, 3, 4, 5 and 6 present a detailed comparison between the proposed methodology and the existing Bayesian chart based on SRS. The suggested CC is developed under distinct PRSS strategies, each with two LFs. Based on the ARL and SDRL values obtained under the PRSS schemes with the SELF and an informative prior, the suggested CC monitors the process mean more effectively than the existing Bayesian AEWMA CC based on SRS. As an illustration, consider the existing Bayesian chart under SRS with the SELF and ψ = 0.10. The ARL values for shifts of 0.0, 0.30, 0.50, 0.80, 1.50, and 4 are 370.16, 43.59, 18.90, 7.90, 2.56, and 1.01, respectively. In the same scenario, the ARL values of the proposed CC are 370.51, 23.55, 9.20, 3.73, 1.45, and 1 under PRSS, 369.17, 22.46, 8.73, 3.55, 1.40, and 1 under QPRSS, and 371.14, 24.20, 9.52, 3.94, 1.49, and 1 under EPRSS. These findings illustrate the effectiveness of the proposed chart under PRSS designs.
A similar comparison is made at ψ = 0.25 between the Bayesian chart under SRS and the suggested CC under the PRSS methods, with an informative prior and the two LFs. For shifts of 0.0, 0.30, 0.50, 0.80, 1.50, and 4, the SRS-based chart has ARLs of 369.50, 55.71, 27.40, 12.96, 4.08, and 1.08, respectively. The proposed method yields ARL values of 371.18, 28.96, 14.91, 6.24, 2.02, and 1 under PRSS, 370.56, 31.82, 14.12, 5.83, 1.93, and 1 under QPRSS, and 369.23, 21.93, 11.03, 6.41, 1.37, and 1 under EPRSS. Compared with the Bayesian AEWMA chart under SRS, the ARL of the proposed chart decreases rapidly under the PRSS schemes, especially at larger shifts.
This downward trend reflects the chart's superior ability to detect out-of-control signals in the monitored process. The findings can be summarized in the following key points.
• The ARL results for the proposed CC with the SELF across the distinct PRSS designs show a consistent and rapid decrease as the shift in the process mean increases. This trend indicates that the proposed technique remains unbiased, as shown in Tables 1 and 2. Table 1 gives examples at an in-control ARL of 370 for various shifts.
• From Tables 3 and 4, it can be seen that the performance of the proposed technique is sensitive to the value of ψ, taken as 0.10 and 0.25. The ARL and SDRL outcomes for the proposed method with the LLF under the P distribution are presented in Tables 3 and 4. These tables show a decrease in efficiency as the smoothing constant increases. For example, with an in-control ARL of 370, ψ = 0.10, and a shift of 0.20, the ARL values of the suggested chart under PRSS, QPRSS, and EPRSS are 46.17, 44.18, and 47.89, respectively. At ψ = 0.25, the ARL values for the same shift (δ = 0.20) are 54.76 for PRSS, 52.23 for QPRSS, and 56.82 for EPRSS.
• The run length results of the proposed chart under the various PRSS schemes are presented in Tables 5 and 6. These tables show how the proposed chart performs under PRSS methodologies with the LLF. Specifically, at an in-control ARL of 370 and a shift (δ) of 0.50, the ARL values at the two settings of the smoothing constant ψ are 9.05 and 13.00 under QPRSS, while those under EPRSS are 9.90 and 14.90, respectively.
• An examination of Tables 1, 2, 3, 4, 5 and 6 reveals that the suggested chart is more sensitive in identifying out-of-control conditions than the Bayesian CC based on SRS. This conclusion is supported by Figs. 1, 2, 3, 4, 5, 6 and 7, which provide clear evidence of the comparatively limited effectiveness of the SRS-based Bayesian AEWMA CC in identifying deviations from the expected process behavior. The R code for the proposed design is included in Appendix A.
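The size of the improvement can be read off directly from the reported numbers. The short Python sketch below takes the ARL values quoted in this section for ψ = 0.10 with the SELF (SRS versus PRSS) and computes the percentage reduction in ARL at each nonzero shift; the variable names are introduced here only for illustration.

```python
# ARL values quoted in the text for psi = 0.10 under SELF.
shifts = [0.30, 0.50, 0.80, 1.50]
arl_srs = [43.59, 18.90, 7.90, 2.56]
arl_prss = [23.55, 9.20, 3.73, 1.45]

# Percentage reduction in ARL when moving from SRS to PRSS.
reduction = [100.0 * (s - p) / s for s, p in zip(arl_srs, arl_prss)]

for shift, r in zip(shifts, reduction):
    print(f"shift {shift:.2f}: ARL reduced by {r:.1f}%")
```

For these four shifts the reduction lies between roughly 43% and 53%, meaning the PRSS-based chart signals in about half the number of samples.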

Real data applications
Real datasets and simulated examples are commonly used to illustrate the practical application and effectiveness of proposed charts. In this study, a real dataset is used to demonstrate the operation and practical utility of the charts. The investigation focuses on semiconductor manufacturing, specifically the hard-bake process used in conjunction with photolithography. The objective is to establish statistical control of the resist flow width in this process using both the existing and the recommended charts. For this purpose, a dataset from Montgomery 35 is employed, comprising forty-five samples of 5 wafers each taken from the manufacturing process. The samples are taken at hourly intervals, with the flow width measured in microns. The first 30 samples are presumed to come from an in-control process and constitute the phase 1 dataset, while the subsequent 15 samples represent an out-of-control process and form the phase 2 dataset. The charts are applied for both the P and PP distributions.

Conclusion
This study has proposed a Bayesian AEWMA CC under PRSS schemes, for both the P and PP distributions, to monitor the process mean effectively. The proposed methodology is compared with the available CC under SRS, and the analysis is documented in Tables 1, 2, 3, 4, 5 and 6. The results obtained from the recommended approach demonstrate superior performance compared to the conventional CC. To illustrate the practical implementation of the proposed technique, a real-world dataset is used, showing its efficacy in tracking the location parameter and promptly identifying deviations from the desired target. To further enhance the Bayesian AEWMA CC, the study suggests several research avenues. These include investigating the method's adaptability and robustness under non-normal distributions, and examining alternative sampling techniques, such as consecutive sampling, to improve the precision of the control chart. Work in these areas would allow the proposed methodology to be tailored to various scenarios, strengthening its efficacy in process monitoring and quality control.

Figure 1. Plots for P and PP distributions using SELF.

Figure 2. ARL graphs for P distribution applying LLF with PRSS designs.

Figure 3. Graphs for PP distribution with LLF using PRSS schemes.

Figure 4. ARL graph for the Bayesian chart with SELF based on SRS.

Figure 5. ARL graph for the Bayesian CC under SELF using PRSS.

Table 1. ARL and SDRL outcomes with SELF for the proposed CC based on Bayesian theory, for ψ = 0.10, n = 5.
In Table 1, for shifts of 0.20 and 0.70, the ARL values are 44.39 and 4 for PRSS 0.70, 46.01 and 4.77 for QPRSS, and 42.81 and 4.78 for EPRSS.

Table 2. Run length results of the recommended CC with SELF, for ψ = 0.25, n = 5.

Table 4. ARL and SDRL values for the Bayesian AEWMA CC for P distribution applying LLF with ψ = 0.25 and n = 5.