A novel optimization method for belief rule base expert system with activation rate

Xiang, Gang; Wang, Jie; Han, XiaoXia; Tang, Shuaiwen; Hu, Guanyu

doi:10.1038/s41598-023-27498-3

Article
Open access
Published: 11 January 2023

Gang Xiang^1,2,
Jie Wang³,
XiaoXia Han⁴,
Shuaiwen Tang⁴ &
…
Guanyu Hu^3,4

Scientific Reports volume 13, Article number: 584 (2023)

Abstract

Although the belief rule base (BRB) expert system has many advantages, such as the effective use of semi-quantitative information, objective description of uncertainty, and efficient nonlinear modeling capability, it is always limited by the problem of combinatorial explosion. The main reason is that the optimization of a BRB with many rules will consume many computing resources, which makes it unable to meet the real-time requirements in some complex systems. Another reason is that the optimization process will destroy the interpretability of those parameters that belong to the inadequately activated rules given by experts. To solve these problems, a novel optimization method for BRB is proposed in this paper. Through the activation rate, the rules that have never been activated or inadequately activated are pruned during the optimization process. Furthermore, even if there is a complete data set and all rules are activated, the activation rate can also be used in the parallel optimization process of the BRB expert system, where the training data set is divided into some subprocesses. The proposed method effectively solves the combinatorial explosion problem of BRB and can make full use of quantitative data without destroying the original interpretability provided by experts. Case studies prove the advantages and effectiveness of the proposed method, which greatly expands the application fields of the BRB expert system.

Introduction

Expert systems are one of the most traditional artificial intelligence methods and have been used in many fields, including finance, industry, medicine, and education¹. They can express extensive knowledge and experience of a complex system and obtain the final results through an inference engine. However, in the era of big data, an expert system cannot effectively utilize multisource data from complex environments and internal systems, which limits its applications. Data-driven approaches, such as neural networks^2,3, dynamic Bayesian networks⁴, and deep learning methods⁵, can make up for this defect. However, they cannot use expert experience and domain knowledge to guide the setting of initial parameters, which brings much uncertainty and pressure to the model optimization. Semiquantitative models, such as the hidden Markov model (HMM)⁶ and the fuzzy neural network⁷, can combine the advantages of the above two types of models. Although the above methods have been applied in many cases, they all lack the ability to process various types of uncertainty, including randomness, fuzziness and ignorance, and their results lack interpretability and credibility. As an intelligent expert system and interpretable artificial intelligence method, the belief rule base (BRB) expert system^8,9,10 can effectively utilize semiquantitative information, including qualitative knowledge and quantitative data, and objectively express uncertain information. Currently, increasing attention is being paid to BRBs, and many variants have been developed^{11,12,13,14,15}.

Although BRB has many advantages, as an expert system, it also faces the problem of combinatorial explosion when the number of attributes increases. The main reason is that the number of rules in the BRB increases exponentially with the number of attributes and reference levels. There are many problems to be solved for BRBs, and the most important one is model optimization. Because the number of parameters in the BRB grows with the number of rules, the optimization process expends considerable time and computing resources. For example, 8 attributes are used in distinguishing diabetes. Assuming that each attribute has 3 reference levels, the number of rules in the BRB is 3⁸ = 6561. In the following sections, we will see that each rule of the BRB has 12 parameters, including rule weights, attribute weights, and belief degrees of consequents. Thus, the number of parameters of the BRB is 6561 × 12 = 78,732, which means that the search for optimal parameters runs in a very high-dimensional solution space. In addition, the objective function of BRB optimization is nonconvex and highly nonlinear, and the problem is subject to equality constraints. Therefore, the optimization of BRBs with a large number of parameters is very difficult to solve. On the other hand, it is also unreasonable to optimize all the parameters of the BRB because some rules may not be activated, or may be activated only a few times, by incomplete quantitative data. Optimizing those rules is not only meaningless but also destroys the original interpretability provided by experts.
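The arithmetic in the diabetes example can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `brb_size` is hypothetical, and it assumes a full-grid BRB in which every combination of reference levels forms one rule and every rule carries the same number of parameters (12 in the example above).

```python
def brb_size(reference_levels, params_per_rule):
    """Return (number of rules, number of parameters) for a full-grid BRB.

    reference_levels: number of reference levels of each attribute.
    params_per_rule:  parameters counted per rule (12 in the text's example).
    """
    n_rules = 1
    for levels in reference_levels:
        n_rules *= levels  # one rule per combination of reference levels
    return n_rules, n_rules * params_per_rule


# 8 attributes with 3 reference levels each, 12 parameters per rule.
rules, params = brb_size([3] * 8, params_per_rule=12)
print(rules, params)  # 6561 78732
```

Adding a ninth attribute with 3 levels triples both numbers, which is the exponential growth the paragraph describes.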

Based on the above descriptions, it is necessary to develop an effective method to solve the optimization problem of BRBs. Currently, there are 2 types of methods for this problem: (1) Dimension reduction methods, which can reduce the number of attributes, such as principal component analysis (PCA)¹⁶ and linear discriminant analysis (LDA)¹⁷. For example, Hu used PCA to reduce the characteristics of attacks in network security situation prediction by using BRB¹⁸. (2) Structure reduction methods, which can reduce the number of rules. For example, Zhou proposed an automatic adding and deleting criterion for belief rules in BRBs based on statistical utility¹⁹. Chang proposed a structure learning method for BRBs based on gray targets (GTs) and multidimensional scaling (MDS)¹⁶.

Although the above methods can relieve the pressure of BRB optimization to a certain extent, some shortcomings still exist. Dimension reduction methods cannot keep the original meaning of the attribute, which weakens the advantage that the BRB expert system can effectively utilize domain knowledge. Structure reduction methods are not always efficient and reach practical optimization speeds at the expense of precision. Obviously, from the view of scale reduction, the optimization problem of BRBs with a large number of parameters cannot be effectively solved. Therefore, a novel optimization method for BRBs with activation rates is proposed in this paper, where the activation rate is used to determine which rules should be optimized in a process, and then the whole optimization process of BRBs can be simplified without losing accuracy. Furthermore, in the situation that most of the rules are activated, the activation rate can also be used in the parallel process of BRB optimization, where the training data set is divided into some subprocesses. The proposed method in this paper can reduce the unnecessary optimization of those unactivated rules of the BRB expert system, which ensures that the quantitative data can be utilized as much as possible without destroying the original interpretability provided by experts.

The remainder of this paper is organized as follows. In “The basic description of the BRB expert system” section, the basic description of the BRB expert system is introduced, and the optimization problem of BRBs is analysed. In “Optimization method for BRBs” section, the novel optimization method with activation rate is proposed, and the parallelized process is constructed. In “Case studies” section, two case studies are designed to verify the effectiveness of the proposed method. The paper is concluded in “Conclusion” section.

The basic description of the BRB expert system

The structure of the BRB expert system

BRB is an intelligent expert system that can effectively use qualitative knowledge and quantitative data and can express most types of uncertain information. The basic construction of the BRB expert system is as follows⁸.

$$\begin{array}{*{20}l} {R_{k} :} \hfill & {{\text{If}}\;(a_{1} \;{\text{is}}\;A_{1}^{k} ) \wedge (a_{2} \;{\text{is}}\;A_{2}^{k} ) \wedge , \ldots , \wedge (a_{M} \;{\text{is}}\;A_{M}^{k} )} \hfill \\ {} \hfill & {{\text{Then}}\{ (D_{1} ,\beta_{1,k} ),(D_{2} ,\beta_{2,k} ), \ldots ,(D_{N} ,\beta_{N,k} ),(D,\beta_{D,k} )\} } \hfill \\ {} \hfill & {{\text{with}}\;{\text{rule}}\;{\text{weight}}\;\theta_{k} \;\left( {k = 1,2, \ldots ,L} \right){\text{ and}}\;{\text{attribute}}\;{\text{weight}}\;\delta_{i} \quad \left( {i = 1,2, \ldots ,M} \right)} \hfill \\ \end{array}$$

(1)

where $R_{k}$ denotes the $kth$ rule in the belief rule base, and $a_{i} (i = 1, \ldots ,M)$ denotes the $ith$ antecedent attribute, whose referential value is $A_{i}^{k}$, $A_{i}^{k} = \left( {A_{i,1}^{k} ,A_{i,2}^{k} , \ldots A_{{i,J_{i} }}^{k} } \right)$, where $J_{i}$ denotes the number of referential levels of the $ith$ attribute. $M$ denotes the number of antecedent attributes. To simplify the problem, we assume that the number of attributes in each rule is the same. $D_{j} \left( {j = 1, \ldots ,N} \right)$ denotes the $jth$ output result, whose belief degree can be expressed by $\beta_{j,k}$. $D$ is the set that includes all the results, so the belief degree $\beta_{D,k}$ assigned to $D$ denotes the remaining belief degree. Because $\beta$ denotes the probabilities of the results, the sum of those belief degrees must equal 1, which is the constraint condition in training BRB.

The BRB expert system uses the above belief rules to construct the nonlinear relationship between the input and output of a complex system, and as a general probability, the belief degree can express various types of uncertain information in an objective world.

The reasoning process of the BRB expert system

When data are imported into BRB, some rules are activated. The principle of the activation is that when attributes of the data match the corresponding reference levels, the transformation method is used to generate a matching degree of the attribute value relative to the reference value. The transformation method depends on the form of the attributes. If the attributes are quantitative, the matching degrees can be obtained by the equivalence transformation technique^20,21. If the attributes are qualitative, the matching degrees can be obtained by the subjective judgment of experts²². All the matching degrees will construct $M$ matching degree vectors, denoted by $v_{i}$, where each vector includes $J_{i}$ matching degrees. Then a matching degree matrix $V$ can be obtained. $V$ includes an element $V_{k,i} \, \left( {k = 1, \ldots L; \, \,i = 1, \ldots M} \right)$ that is selected from $v_{i}$ according to the rules arranged by reference level. Thus the activation weight of the $kth$ rule $\omega_{k}$ can be calculated by

$$\omega_{k} = \frac{{\theta_{k} \prod\nolimits_{i = 1}^{M} {(V_{k,i} )^{{\overline{\delta }_{i} }} } }}{{\sum\nolimits_{k = 1}^{L} {\theta_{k} \prod\nolimits_{i = 1}^{M} {(V_{k,i} )^{{\overline{\delta }_{i} }} } } }},\,{\text{ where}}\, \, \overline{\delta }_{i} = \frac{{\delta_{i} }}{{\max \left\{ {\delta_{i} } \right\}}}$$

(2)
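As a minimal sketch of Eq. (2), the following Python code builds the matching degree matrix $V$ and the activation weights $\omega_{k}$. Several parts are assumptions and are not taken from the paper: the function names, the piecewise-linear form of the transformation for quantitative attributes (a form commonly used in the BRB literature), and the ordering of rules as the Cartesian product of the reference levels.

```python
import itertools


def matching_degrees(x, refs):
    """Matching degrees of input x to ascending reference values refs.

    Assumption: piecewise-linear transformation; inputs outside the range
    of refs are clipped to the nearest reference value.
    """
    degrees = [0.0] * len(refs)
    if x <= refs[0]:
        degrees[0] = 1.0
    elif x >= refs[-1]:
        degrees[-1] = 1.0
    else:
        for j in range(len(refs) - 1):
            lo, hi = refs[j], refs[j + 1]
            if lo <= x <= hi:
                degrees[j] = (hi - x) / (hi - lo)
                degrees[j + 1] = 1.0 - degrees[j]
                break
    return degrees


def matching_matrix(v):
    """Build V (L x M) from the M vectors v_i.

    Assumption: rules are enumerated in Cartesian-product order of levels.
    """
    return [list(combo) for combo in itertools.product(*v)]


def activation_weights(theta, delta, V):
    """Eq. (2): normalized activation weight of every rule.

    Assumes at least one attribute weight in delta is positive.
    """
    d_max = max(delta)
    d_bar = [d / d_max for d in delta]  # relative attribute weights
    raw = []
    for k, row in enumerate(V):
        value = theta[k]
        for v_ki, d in zip(row, d_bar):
            value *= v_ki ** d
        raw.append(value)
    total = sum(raw)
    if total == 0:
        return [0.0] * len(raw)  # no rule matched the input
    return [r / total for r in raw]


# Two attributes: 3 and 2 reference levels, hence L = 6 rules.
v = [matching_degrees(2.5, [0.0, 5.0, 10.0]),
     matching_degrees(10.0, [0.0, 10.0])]
V = matching_matrix(v)
omega = activation_weights([1.0] * 6, [1.0, 1.0], V)
print(omega)
```

In this example only the two rules whose reference levels bracket the input receive a nonzero weight, which is why a sparse or incomplete data set leaves many rules untouched.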

If the activation weight is not equal to 0, the corresponding rule is activated. Then, the following evidential reasoning (ER) rule is utilized to fuse the activated rules and finally obtain the distribution of belief degrees $\widehat{\beta }_{j}$ assigned to the results, as shown in Eqs. (3) and (4).

$$\begin{array}{*{20}c} {\widehat{\beta }_{j} = \frac{{\gamma \times \left[ {\prod\nolimits_{k = 1}^{L} {\left( {\omega_{k} \beta_{j,k} + 1 - \omega_{k} \sum\limits_{i = 1}^{N} {\beta_{i,k} } } \right)} - \prod\nolimits_{k = 1}^{L} {\left( {1 - \omega_{k} \sum\nolimits_{i = 1}^{N} {\beta_{i,k} } } \right)} } \right]}}{{1 - \gamma \times \left[ {\prod\nolimits_{k = 1}^{L} {\left( {1 - \omega_{k} } \right)} } \right]}}\quad } & {\left( {j = 1, \ldots ,N} \right)} \\ \end{array}$$

(3)

$$\gamma = \left[ {\sum\nolimits_{j = 1}^{N} {\prod\nolimits_{k = 1}^{L} {\left( {\omega_{k} \beta_{j,k} + 1 - \omega_{k} \sum\limits_{i = 1}^{N} {\beta_{i,k} } } \right) - \left( {N - 1} \right)\prod\nolimits_{k = 1}^{L} {\left( {1 - \omega_{k} \sum\limits_{i = 1}^{N} {\beta_{i,k} } } \right)} } } } \right]^{ - 1}$$

(4)
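The analytical ER fusion of Eqs. (3) and (4) can be sketched in Python as follows. The function name `er_fuse` is hypothetical, every product is taken over all $L$ rules, and the sketch assumes the activation weights are already normalized by Eq. (2) so that the denominators are nonzero.

```python
import math


def er_fuse(omega, beta):
    """Fuse L rules into one belief distribution, following Eqs. (3)-(4).

    omega: L activation weights (assumed normalized, summing to 1).
    beta:  L x N belief degrees beta[k][j]; a row may sum to less than 1,
           the remainder being the belief assigned to the whole set D.
    """
    L, N = len(beta), len(beta[0])
    row_sum = [sum(beta[k]) for k in range(L)]
    # First product of Eq. (3), one value per consequent j.
    a = [
        math.prod(omega[k] * beta[k][j] + 1 - omega[k] * row_sum[k]
                  for k in range(L))
        for j in range(N)
    ]
    # Second product of Eq. (3), shared by all consequents.
    b = math.prod(1 - omega[k] * row_sum[k] for k in range(L))
    # Product in the denominator of Eq. (3).
    c = math.prod(1 - omega[k] for k in range(L))
    gamma = 1.0 / (sum(a) - (N - 1) * b)  # Eq. (4)
    return [gamma * (a_j - b) / (1 - gamma * c) for a_j in a]


# Two rules with complete, conflicting beliefs.
fused = er_fuse([0.7, 0.3], [[1.0, 0.0], [0.0, 1.0]])
print(fused)
```

When a single rule has activation weight 1, the fused distribution reduces to that rule's own belief degrees, which is a convenient sanity check for any implementation.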

Optimization of the BRB expert system

It can be seen from the above descriptions that BRB includes many parameters, of which the initial values are usually given by experts. These initial parameters constitute a rough BRB, which cannot produce accurate results. Therefore, parameter optimization for BRBs is necessary. The first step is to establish an optimization objective, shown as follows.

$$\begin{aligned} & \quad \min \left\{ {F\left( \Omega \right)} \right\} \\ & \begin{array}{*{20}l} {{\text{s}}.{\text{t}}{.}\quad 0 \le \theta_{k} \le 1,} \hfill & {k = 1, \ldots ,L} \hfill \\ {\quad \,\,\,\,\,0 \le \delta_{i} \le 1,} \hfill & {i = 1, \ldots ,M} \hfill \\ {\quad \,\,\,\,\,0 \le \beta_{j,k} \le 1,} \hfill & { \, j = 1, \ldots ,N,k = 1, \ldots ,L} \hfill \\ {\quad \,\,\,\,\,\beta_{D,k} + \sum\limits_{j = 1}^{N} {\beta_{j,k} } = 1} \hfill & {} \hfill \\ \end{array} \, \\ \end{aligned}$$

(5)

where $F\left( \Omega \right)$ denotes the objective function, which can be defined through the mean square deviation between the real values and testing results of the BRB. $\Omega$ denotes the parameters to be optimized.
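A minimal sketch of the two ingredients of Eq. (5) is given below: a mean-square objective and a feasibility check for the constraints. The function names and the tolerance `tol` are assumptions for illustration, and the exact form of $F\left( \Omega \right)$ used by the authors may differ.

```python
def mse(actual, predicted):
    """Mean square deviation between real values and BRB outputs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def is_feasible(theta, delta, beta, tol=1e-9):
    """Check the constraints of Eq. (5).

    theta: L rule weights; delta: M attribute weights;
    beta:  L x N belief degrees. The remaining belief
           beta_D,k = 1 - sum_j beta[k][j] must also lie in [0, 1].
    """
    def in_unit(x):
        return -tol <= x <= 1 + tol

    if not all(in_unit(t) for t in theta):
        return False
    if not all(in_unit(d) for d in delta):
        return False
    for row in beta:
        if not all(in_unit(b) for b in row):
            return False
        if not in_unit(1.0 - sum(row)):  # equality constraint via beta_D,k
            return False
    return True


print(mse([1.0, 0.0], [0.5, 0.5]))  # 0.25
print(is_feasible([1.0, 0.5], [1.0, 1.0], [[0.6, 0.4], [0.2, 0.3]]))  # True
```

Treating $\beta_{D,k}$ as the remainder turns the equality constraint into a simple bound on each row sum, which is one common way to handle it in practice.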

Equation (5) is a highly nonlinear, high-dimensional, strongly constrained optimization problem. Therefore, the second step is to select an appropriate optimization algorithm to solve the optimization objective of the BRB. In ref. 23, the sequential quadratic programming (SQP) algorithm was used to obtain the optimal parameters of BRBs. In ref. 24, the projection covariance matrix adaptation evolution strategy (P-CMA-ES) algorithm was proposed, which achieves a good optimization effect.

Although the accuracy of the BRB can be improved by parameter optimization, the BRB faces the problem of combinatorial explosion when the numbers of attributes and reference levels increase. As described in the “Introduction” section, a large number of parameters not only reduces the speed of optimization, which leads to failure in scenarios with high real-time requirements, but also decreases accuracy because of the difficulty of searching for the optimal solution in a high-dimensional space. This limitation greatly restricts the application of BRBs in more complex and wider fields. Therefore, an efficient optimization method for BRBs is proposed in this paper.

Optimization method for BRBs

A novel optimization method for BRBs with activation rates is proposed in this section. The method can be used in two different application scenarios: (1) BRBs with incomplete samples or patterns, which means that a part of the rules will never be activated or only activated a few times. By pruning these rules by using the activation rate and threshold, the optimization dimension will be greatly reduced. (2) Fully activated BRB, where all rules are activated many times. Through parallel operation by using the activation rate and threshold, optimization will be separated into many child processes, which will fundamentally solve the problem of BRB optimization. Next, the basic principles of the proposed optimization method will be introduced in two different scenarios.

BRB optimization method with activation rate

After an in-depth analysis of the combinatorial explosion problem, we found that when samples are incomplete or the actual system does not cover all patterns, some of the rules will never be activated, or will be activated only very few times, in the whole training process. Such rules are called inadequately activated rules. However, in the traditional optimization process of BRBs, the parameters of all rules are involved in optimization, which is unreasonable. The initial parameters of the BRB are given by experts based on experience and domain knowledge. If the parameters of these nonactivated or inadequately activated rules are optimized, expert knowledge is given up even though there are not enough quantitative data to provide information for model learning. To solve this problem, the activation rate for the BRB is first proposed, as follows.

$$ar_{k} = \frac{{an_{k} }}{{\sum\nolimits_{n = 1}^{L} {an_{n} } }}$$

(6)

where $ar_{k}$ denotes the activation rate of the $kth$ rule and $an_{k}$ denotes the number of activation times of the $kth$ rule.

To prune the nonactivated and inadequately activated rules, a threshold $h$ of the activation rate must be given. When the activation rate of a rule is greater than the threshold $h$, the parameters of that rule are optimized; the remaining rules keep the initial values given by experts.

Remark 1

Note that $an_{k}$ can be obtained only after all samples are input into the initial BRB. This does not affect the efficiency of optimization because the initial BRB requires no training process and can quickly produce output results.
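The activation rate of Eq. (6) and the threshold test can be sketched as below. The function names are hypothetical, and the sketch assumes that a rule counts as activated by a sample whenever its activation weight from Eq. (2) is nonzero, as stated in the reasoning process above.

```python
def activation_counts(weights_per_sample):
    """an_k: number of samples for which rule k has a nonzero weight.

    weights_per_sample: one list of L activation weights per sample,
    obtained by feeding every sample into the initial BRB.
    """
    counts = [0] * len(weights_per_sample[0])
    for weights in weights_per_sample:
        for k, w in enumerate(weights):
            if w != 0:
                counts[k] += 1
    return counts


def activation_rates(counts):
    """Eq. (6): ar_k = an_k / sum_n an_n."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def rules_to_optimize(rates, h):
    """Indices of rules whose activation rate exceeds the threshold h."""
    return [k for k, r in enumerate(rates) if r > h]


# Three samples, four rules; rule 3 is never activated.
weights = [
    [0.7, 0.0, 0.3, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]
counts = activation_counts(weights)
rates = activation_rates(counts)
print(counts)                         # [3, 1, 1, 0]
print(rules_to_optimize(rates, 0.1))  # [0, 1, 2]
```

Only the parameters of the returned rules would be passed to the optimizer; the others keep their expert-given initial values.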

Parallel optimization method of BRBs

The scale of the BRB can only be reduced to a certain extent through the activation rate, but when the data set is relatively complete, the scale reduction will be limited, which cannot solve the optimization problem of BRBs with a large number of parameters in essence. With the development of computer technology, parallel optimization provides a good solution, which can greatly reduce the optimization time. Theoretically, if we have enough computing units, the optimization time will surely meet the requirements of the actual system. Thus, a parallel optimization method for BRBs is proposed in this section.

Inspired by the activation rate and pruning rules, the parallelization of BRB optimization can be achieved by partitioning the data set, which can be denoted as

$$S = \left[ {\begin{array}{*{20}l} {s_{1,1} ,} \hfill & {\quad s_{1,2} ,} \hfill & {\quad \cdots ,} \hfill & {\quad s_{1,M} } \hfill \\ {s_{2,1} ,} \hfill & {\quad s_{2,2} ,} \hfill & {\quad \cdots ,} \hfill & {\quad s_{{{2},M}} } \hfill \\ {} \hfill & {} \hfill & {\quad \cdots } \hfill & {} \hfill \\ {s_{sn,1} ,} \hfill & {\quad s_{sn,2} ,} \hfill & {\quad \cdots ,} \hfill & {\quad s_{sn,M} } \hfill \\ \end{array} } \right]$$

(7)

where $S$ denotes the original data set, $sn$ denotes the number of samples, and $M$ denotes the number of attributes of each sample $s_{i,1} ,s_{i,2} , \cdots ,s_{i,M}$.

The parallelization steps of BRB optimization are shown as follows:

Step 1 First, the initial parameter values of the BRB are set according to expert knowledge.
Step 2 Assuming that the number of optimization subprocesses is $pn$, the training data set is divided into $pn$ parts, each of which is sampled evenly from the $sn$ samples of the original data set.
Step 3 Input every sub data set into the initial BRB, and calculate the activation rate $AR_{n} = \left( {ar_{1}^{n} ,ar_{2}^{n} , \ldots ,ar_{L}^{n} } \right),\;n = 1,2, \ldots ,pn$, of each sub data set, where $AR_{n}$ denotes the activation rate set of the $nth$ sub data set and $ar_{k}^{n}$ denotes the activation rate of the $kth$ rule activated by the $nth$ sub data set.
Step 4 Set the threshold for the activation rate, which decides which rules participate in each optimization subprocess. Then, the BRB can be divided into $pn$ sub-BRB models, denoted as $BRB_{n}$.
Step 5 Each sub-BRB is assigned to a different computing unit and optimized independently on the corresponding training sub data set. The optimization algorithm is P-CMA-ES, which is used to minimize the objective function shown in Eq. (5). Please refer to Algorithm 1 for the pseudocode of the P-CMA-ES algorithm.
Step 6 After the above steps, we obtain $pn$ groups of belief degree distributions, and each group has $sn$ belief degree distributions for the output results of the $sn$ samples in the BRB. The belief degree distribution generated by the $nth$ optimization subprocess for the $ith$ testing sample can be denoted as $B_{n,i} = \left( {\hat{\beta }_{n,i}^{1} ,\hat{\beta }_{n,i}^{2} , \ldots ,\hat{\beta }_{n,i}^{N} } \right)$. To obtain the final belief degree distribution, the weighted average method is utilized, and the weight of the $nth$ distribution is determined by $p\omega_{n}$:

$$p\omega_{n} = \frac{{\sum\nolimits_{k = 1}^{L} {an_{k}^{n} } }}{{\sum\nolimits_{n = 1}^{pn} {\sum\nolimits_{k = 1}^{L} {an_{k}^{n} } } }} \times \left( {pn - 1} \right)$$

(8)

where $an_{k}^{n}$ denotes the number of activation times of the $kth$ rule in $BRB_{n}$. Then, the final belief degree distribution of the $ith$ sample, $B_{f}^{i}$, can be obtained by Eq. (9), and the final results of the BRB can be obtained by Eq. (10).

$$B_{f}^{i} { = }\left( {\frac{{\sum\nolimits_{n = 1}^{pn} {p\omega_{n} \times \hat{\beta }_{n,i}^{1} } }}{pn},\frac{{\sum\nolimits_{n = 1}^{pn} {p\omega_{n} \times \hat{\beta }_{n,i}^{2} } }}{pn}, \ldots ,\frac{{\sum\nolimits_{n = 1}^{pn} {p\omega_{n} \times \hat{\beta }_{n,i}^{N} } }}{pn}} \right)$$

(9)

$${\rm Z}_{i} = \sum\limits_{j = 1}^{N} {B_{f,j}^{i} \times D_{j} ;} \quad \left( {i = 1, \ldots ,sn} \right)$$

(10)

Remark 2

The above optimization subprocesses are independent of each other. The weighted average operation for the final belief degree distribution is executed only when the optimization is completed.
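The data partition of Step 2 and the fusion of Eqs. (8) to (10) can be sketched as follows. The function names are hypothetical, and the round-robin split is only one possible reading of "sampled evenly". The sketch follows Eqs. (8) and (9) as printed and does not renormalize; taken literally, when every sub-BRB output sums to 1, the fused degrees sum to $(pn - 1)/pn$.

```python
def partition(samples, pn):
    """Step 2 (assumed round-robin): split samples into pn sub data sets."""
    return [samples[n::pn] for n in range(pn)]


def subprocess_weights(an):
    """Eq. (8): weight of every optimization subprocess.

    an[n][k]: number of activation times of rule k in BRB_n.
    """
    pn = len(an)
    totals = [sum(row) for row in an]
    grand_total = sum(totals)
    return [t / grand_total * (pn - 1) for t in totals]


def fuse(pw, B):
    """Eq. (9): weighted average of the pn distributions for one sample.

    B[n][j]: belief degree of consequent j produced by subprocess n.
    """
    pn, N = len(B), len(B[0])
    return [sum(pw[n] * B[n][j] for n in range(pn)) / pn for j in range(N)]


def utility(Bf, D):
    """Eq. (10): output Z_i from the fused distribution and the values D_j."""
    return sum(b * d for b, d in zip(Bf, D))


# Two subprocesses, two rules, two consequents.
parts = partition(list(range(7)), 3)
pw = subprocess_weights([[20, 10], [5, 5]])
Bf = fuse(pw, [[1.0, 0.0], [0.0, 1.0]])
print(parts)  # [[0, 3, 6], [1, 4], [2, 5]]
print(pw)     # [0.75, 0.25]
print(Bf)     # [0.375, 0.125]
```

Because the subprocesses are independent, each call that optimizes one $BRB_{n}$ could be dispatched to a separate worker, and only `fuse` needs all results at once.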

Case studies

To verify the superiority of the proposed method, two cases, “Health status assessment of laser gyro” and “Leak size estimation of oil pipeline”, were used for verification.

Health status assessment of laser gyro

Problem formulation

A laser gyro is a precision instrument in the navigation control system. Its state parameters are the zero-order drift coefficient, the first-order drift coefficient, and the X-axis gyroscope light intensity voltage. When these parameters exceed the calibration threshold, the laser gyro has failed, and the navigation control system fails as well. However, even when these parameters are within the threshold, the laser gyro can be in different health states. Evaluating its health is therefore necessary to determine whether the laser gyro meets the required navigation accuracy. This case studies the health assessment of the laser gyro, using the following data set.

In this case, the data set of the laser gyro is used to prove the advantages of the proposed method. This data set contains a zero-order drift coefficient, first-order drift coefficient, X-axis gyroscope light intensity voltage, and expected utility value. The data set has 2000 samples, as shown in Figs. 1, 2, 3 and 4.

First, we can establish a BRB expert system according to expert experience or domain knowledge. The reference values of the zero-order drift coefficient, first-order drift coefficient, and X-axis gyroscope light intensity voltage are shown in Tables 1, 2 and 3. Thus, the BRB expert system for laser gyroscope health status detection can be described as follows.

$$\begin{array}{*{20}l} {R_{k} :} \hfill & {{\text{IF}}\;\left( {{\text{zero-order drift coefficient is}}\;A_{1}^{k} } \right) \wedge \left( {{\text{first-order drift coefficient is}}\;A_{2}^{k} } \right) \wedge \left( {{\text{X-axis is}}\;A_{3}^{k} } \right)} \hfill \\ {} \hfill & {{\text{THEN laser gyroscope health status is}}\;\left\{ {\left( {H,\beta_{1}^{k} } \right),\left( {SH,\beta_{2}^{k} } \right),\left( {UH,\beta_{3}^{k} } \right)} \right\}} \hfill \\ {} \hfill & {{\text{with rule weight}}\;\theta_{k} \;{\text{and attribute weight}}\;\delta_{i} } \hfill \\ \end{array}$$

(11)

where $H,SH,UH$ denote the reference values of the laser gyroscope health status, as shown in Table 4. The other parameters of the BRB expert system are shown in Table 5.

Table 1 The reference values and points of the zero-order term drift coefficient.
