Discernment of transformer oil stray gassing anomalies using machine learning classification techniques

Ngwenyama, M. K.; Gitau, M. N.

doi:10.1038/s41598-023-50833-7

Article
Open access
Published: 03 January 2024

Scientific Reports volume 14, Article number: 376 (2024)

Abstract

This work examines the application of machine learning (ML) algorithms to dissolved gas analysis (DGA) data for the rapid identification of incipient faults in oil-immersed transformers (OITs). Transformers are pivotal equipment in the transmission and distribution of electrical power. The failure of a unit in service may interrupt supply to a large number of consumers and disrupt commercial activity in the affected area. Several monitoring techniques have therefore been proposed to ensure that a unit maintains an adequate level of functionality and an extended useful lifespan. DGA is a technique commonly employed for monitoring the state of OITs. The interpretation of DGA samples, however, remains unsatisfactory for evaluating incipient faults and relies mainly on the proficiency of test engineers. In the current work, a multi-classification model based on ML algorithms is demonstrated to provide a logical and precise interpretation of DGA data. The proposed model is used to analyze 138 transformer oil (TO) samples that exhibited different stray gassing characteristics in various South African substations. It combines four ML classifiers and enhances diagnosis accuracy and trust between the transformer manufacturer and the power utility. Furthermore, case reports on transformer failure analysis using the proposed model, IEC 60599:2022, and the Eskom standard (Specification Ref: 240-75661431) are presented. In addition, the proposed model is compared against the conventional DGA approaches to validate it. The highest accuracy, 87.7%, was produced by Bagged Trees, followed by Fine KNN with 86.2% and Quadratic SVM with 84.1%.

Introduction

With the rapid growth in power system capacity, the demands on power generation, transmission, and distribution have become greater¹. As a significant piece of equipment for power distribution, the power transformer (PT) is critical for the secure operation of the complete power system. A fault in a PT will damage the unit. The most severe faults might even cause the failure of the entire power system, adversely affecting the functioning of the national economy. Thus, it is beneficial to examine fault diagnosis technology relating to PTs². PT faults usually emerge from electrical and thermal stresses and differ only in their energy, site, and time of occurrence. When a fault occurs, the oil temperature increases and several gases are generated. Generally, the combustible gases found in the TO in service are hydrogen $\left({\text{H}}_{2}\right)$, methane $\left({\text{CH}}_{4}\right)$, ethane $\left({\text{C}}_{2}{{\text{H}}}_{6}\right)$, ethylene $\left({\text{C}}_{2}{{\text{H}}}_{4}\right)$, and acetylene $\left({\text{C}}_{2}{{\text{H}}}_{2}\right)$^3,4. The contaminants in the oil are mostly the consequence of the degradation of the insulating materials (oil or paper) caused by faults or chemical reactions in the apparatus in question.

The type and quantity of the dissolved gases (DGs) play a prominent role in assessing the fault type in PTs^5,6. Many conventional techniques have been developed to analyze transformer faults with gas chromatography, a procedure in which a mixture carried by a gas or liquid mobile phase is separated into its constituent parts because the substances move at different rates over a stationary phase. Such schemes for fault analysis are usually categorized into three types: the characteristic gas scheme^7,8,9,10, the gas production rate scheme¹⁰, and the three-ratio scheme^11,12,13. In China, over 50% of the PT faults in the energy system were evaluated with DGA-based analysis schemes, which determine transformer fault types and their severity from the content, the proportions relative to one another, and the production rate of the DGs in the TO¹³. In addition to these three key conventional techniques, some enhanced schemes have emerged, such as the Doernenburg scheme, the Rogers ratio scheme, the Duval triangle scheme, the International Electrotechnical Commission (IEC) ratio scheme, and the Key Gas (KG) scheme^14,15,16,17,18. Such schemes usually employ several gas ratios or compare gas levels with specified criteria to analyze the state of a PT. However, most of these conventional techniques are of limited use in transformer fault analysis because they cannot always identify the correct fault type. In particular, it is extremely difficult to determine the fault state precisely when several DGs are present, and the probability of misdiagnosis is high when a calculated gas ratio lies near a critical value¹⁹. Furthermore, the finer the classification of fault types, the lower the precision of the fault analysis, and vice versa. Coarse classifications, in turn, are not conducive to the fault analysis of a PT, so it is challenging to meet the demands of practical applications.
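The sensitivity of ratio schemes near a critical value can be illustrated with a short sketch. It assumes the three ratios commonly associated with the IEC ratio scheme (C2H2/C2H4, CH4/H2, C2H4/C2H6); the sample concentrations, the boundary value of 1.0, and the 10% tolerance are illustrative assumptions, not values taken from this article or from any standard.

```python
def gas_ratios(ppm):
    """Return three commonly used gas ratios from concentrations in ppm."""
    def ratio(num, den):
        # Guard against division by zero when the denominator gas is absent.
        return ppm[num] / ppm[den] if ppm[den] > 0 else float("inf")

    return {
        "C2H2/C2H4": ratio("C2H2", "C2H4"),
        "CH4/H2": ratio("CH4", "H2"),
        "C2H4/C2H6": ratio("C2H4", "C2H6"),
    }


def near_boundary(value, boundary, rel_tol=0.1):
    """True when a ratio lies within rel_tol (hypothetical) of a boundary."""
    return abs(value - boundary) <= rel_tol * boundary


# Hypothetical sample (ppm); not data from the 138 samples in this study.
sample = {"H2": 200.0, "CH4": 210.0, "C2H6": 50.0, "C2H4": 48.0, "C2H2": 1.0}
ratios = gas_ratios(sample)

# A slightly lower methane reading moves CH4/H2 to the other side of the
# assumed boundary of 1.0, which would change the assigned ratio code.
perturbed = gas_ratios({**sample, "CH4": 190.0})

for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
print("CH4/H2 near boundary:", near_boundary(ratios["CH4/H2"], 1.0))
print("CH4/H2 after perturbation:", round(perturbed["CH4/H2"], 3))
```

In this example the CH4/H2 ratio moves from just above to just below the assumed boundary after a change of about 10% in one gas reading, which is the situation in which a fixed-threshold scheme is most likely to misclassify.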

DGA is a technique for detecting and forecasting problems in OITs by (i) determining the levels of the various gases contained in the insulation oil, together with the respective gas rates and gas proportions, and (ii) detecting faults with diagnosis instruments such as KG^20,21, IEC ratios²², Rogers ratios²³, Doernenburg ratios²⁴, and the Duval triangle²³. Nevertheless, these instruments have certain flaws. In certain situations, the computed gas ratios fall outside the instruments' specified ratio codes, so faults that develop within the transformer might go undetected²⁵. Additionally, these instruments can produce different analytical outcomes for the same dissolved gas (DG) sample, making it challenging for experts to reach a definitive conclusion when confronted with such a wide range of results²⁶. Due to these constraints, several researchers have developed fault diagnosis systems that integrate ML approaches and use historical DGA information to forecast imminent or undiscovered faults. The difficulty of identifying the appropriate fault situation and the diagnostic precision for units in each fault category are determined by these aspects^27,28. The KG ratios, as well as graphic depiction schemes, are all DGA schemes that are utilized as data inputs to ML classifiers for fault classification. In the current study, a multi-classification model based on ML algorithms is shown to provide an intelligible, precise, and clear understanding of DGA. The choice of ML is motivated by (i) its efficient adaptation to fresh data; (ii) the minimal effort it needs for structural layout (i.e., several control settings are involved); and (iii) its capability to categorize unpredictable issues²⁹. Capitalizing on these benefits, the proposed model is used to analyze and evaluate the state and suitable gas name subscription of 138 TO samples that exhibited different stray gassing characteristics in various South African substations.
The model uses four ML classifiers, namely: (i) Decision Tree (DT)³⁰; (ii) Support Vector Machine (SVM)³¹; (iii) K-Nearest Neighbour (KNN)³²; and (iv) Ensemble Classifier (EC)³³. These classifiers are applied to oil sample classification and were selected for their capacity to compare new data inputs with existing data and assign each new input to the class it most closely resembles. In MATLAB/Simulink, the proposed model serves as the framework underlying the various classifiers and is designed to aggregate ML algorithms for information-gathering activities. A summary of the ML classifiers utilized in this work follows:

DT: As shown in Fig. 1, the DT classifier³⁴ is an ML technique that makes predictions using a tree structure. It builds a flowchart-like tree in which each internal node represents a feature test, each branch represents a test outcome, and each leaf node stores a class label. The tree is constructed by repeatedly splitting the training data into subsets according to feature values until a stopping criterion is met, such as the maximum depth of the tree or the minimum number of samples needed to split a node. The method repeats the operation for every subset that is a child of a given node. Lastly, the tree is pruned by removing branches that are not useful for classification.
SVM: The working of the SVM classifier³⁵ is illustrated in Fig. 2. SVMs fall within the broad group of kernel methods³⁶, which depend on the data only through products between pairs of samples. To make the margin of the separating hyperplane between classes as wide as feasible, the kernel function evaluates this product in a potentially high-dimensional feature space. SVMs have the benefits of being less computationally intensive than other classification methods, of performing well in high-dimensional spaces, and of handling nonlinear classification effectively through the kernel trick, which maps the data space into a different, high-dimensional feature space.
KNN: The KNN classifier³⁷ is a supervised learning approach used in numerous machine learning scenarios. It classifies items using the nearest training samples in the feature space. The goal underlying KNN is to locate a predefined number of training samples that are closest to a particular query case and to estimate the query case's class from them. Regarding classification, KNN is comparable to a DT method, except that rather than developing a tree, it creates a route through the graph. KNN is also quicker than DT. The working of the KNN is shown in Fig. 3.
EC: The ensemble classifier³⁸ produces classification forecasts using a set of classifiers, which generalizes more accurately than a single classifier and yields improved classification performance. A dataset is used to train a list of classifiers, and the separate predictions made by each classifier on the dataset form the basis of the EC. The ensemble model then combines the outcomes of the individual classifier predictions to obtain the final result. This sort of classifier remains simple to simulate and is often appropriate for large samples. The working of the EC is shown in Fig. 4.
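The nearest-neighbour vote and the ensemble combination described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the MATLAB implementation used in this work: the two-feature training points are invented, and the ensemble bags 1-nearest-neighbour learners for brevity, whereas a Bagged Trees ensemble uses decision trees as base learners.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k=1):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def bagged_predict(train, query, n_models=25, k=1, seed=0):
    """Bagging: fit each base learner on a bootstrap resample, then vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) points with replacement.
        resample = [rng.choice(train) for _ in train]
        votes[knn_predict(resample, query, k)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical, well-separated feature vectors (e.g. two scaled gas ratios).
train = [
    ((0.1, 0.2), "normal"), ((0.2, 0.1), "normal"),
    ((0.3, 0.3), "normal"), ((0.2, 0.4), "normal"),
    ((2.0, 3.0), "fault"), ((2.2, 2.8), "fault"),
    ((1.9, 3.1), "fault"), ((2.1, 2.7), "fault"),
]

print(knn_predict(train, (0.0, 0.0)))        # single nearest neighbour
print(knn_predict(train, (2.1, 2.9), k=3))   # vote among three neighbours
print(bagged_predict(train, (2.0, 2.9)))     # vote among bagged learners
```

Each bagged learner sees a different resample of the training data, so individual predictions may differ; the majority vote is what makes the ensemble output more stable than any single learner.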

To monitor the insulation status of OITs, several chemical and electrical methods are employed, such as DGA and Furan Analysis (FA), the latter of which indicates the Degree of Polymerization (DP) of the cellulose paper^6,39. DGA is one of the most common methods for detecting an incipient fault in PTs. It can be used to assess the present transformer status, predict future failures, and identify abnormal transformer operation so that appropriate maintenance can be planned. Figure 5 illustrates the standard technique employed by the transformer manufacturing sector to collect transformer oil on-site for DGA at the testing facility.

The DGA approaches presented above have no mathematical formulation, and the assessment depends on an empirical method that can vary with the expertise of the laboratory analyst, which results in inconsistent assessments⁴⁰. To overcome this limitation, several computational models based on ML have been used to assess incipient faults in PTs. In this work, recent related studies and their contributions to transformer fault diagnosis are highlighted, and a multi-classification model for transformer fault diagnosis is proposed. Table 1 compares the recent related studies with the proposed model for transformer fault analysis.

Table 1 Summary of recent related studies.
