Introduction

Industrial nations with an ever-increasing demand for products from textile and steel industries have placed a high focus on scientific endeavours to mitigate the increased pollution levels caused by effluent discharge1,2. Textile industries use large amounts of water for the process of coloring, cleaning, heat treatment, cooling etc3. Textile effluent discharged directly into the natural water bodies or through land composting, contaminates the natural resources by the addition of color, chemicals, polymers, etc. which are extremely deleterious due to their carcinogenic properties4. The limited availability of water resources has made it essential to develop technologies that promote reusability and sustainability using various remediation techniques5.

The implementation of existing conventional treatment methodologies from the physical, chemical, hybrid, or biological domains depends on their applications and relative efficacy in removing organic matter, suspended solids, dye-based inorganics, etc. from wastewater, and has met with limited success in industrial-scale treatment processes6. Single-stage treatment methods are unable to meet stringent remediation requirements because of the recalcitrant nature of dyes. Treatment techniques classified as primary, secondary, and tertiary processes overcome some of these disadvantages but suffer from serious limitations during complex secondary aeration processes due to variable pH loads7. Recent applications of Advanced Oxidation Processes (AOPs) remove complex chemicals, heavy metals, color, etc., which allows reuse under certain conditions for scaled applications8,9. Sunlight-induced AOPs using metal-oxide semiconductors (MOS) in combination with filtration, coagulation/flocculation, and carbon-based adsorption processes have been reported in the literature to yield more efficient, scalable industrial wastewater treatment10,11,12. Mcyotto et al. have discussed the dye characteristics along with their structure and correlated the color removal efficiency with a single-stage coagulation process13. Liang et al. have studied coagulation/flocculation (CF) and nanofiltration (NF) and their combination for the effective treatment of highly concentrated multiple-dye wastewater with increased overall performance14.

AOPs work on the principle of radiation-induced generation of hydroxyl and oxide radicals (·OH, HO2·, O2·), which initiates dye degradation into simpler and degradable compounds with minimal solid secondary pollutants15. The photocatalytic process initiated through band gap modification enhances the catalytic behavior through radical-induced secondary redox reactions degrading complex dye molecules into less harmful fragmented degradation products8,16. Liu et al. detailed an overview of the AOPs for the treatment of refractory industrial wastewater and listed major barriers to large-scale industrial applications including procedure sustainability, economic benefits, and by-product analysis along with safety evaluations17. Zinc oxide (ZnO) catalyst, earlier used in laboratory scale applications18, has a modified band gap (3.3 eV)19 which makes it an ideal inexpensive nanomaterial (NM) for photocatalytic applications20,21. To alleviate the problem of charge carrier recombination during the redox processes for ZnO NM, the graphene oxide (GO) nanosheet layer is utilized as conductive and electrons-attracting oxygen groups which scavenge ZnO conduction band electron groups22. Rodrigues et al. outline the synthesis of impregnated ZnO for photocatalytic degradation of reactive dye-based textile effluent and detail the effect of catalyst size, dye concentration, and length/diameter ratio on photocatalytic degradation23. Roy et al. presented a hybrid AOP system using composite ZnO/ZnFe2O4 for radical-induced degradation through ionization of CBZ before hydroxylation and oxidation24. An et al. reviewed membrane separation technologies using emerging cost-effective graphene oxide (GO) with excellent resistance, hydrophilicity, separation performance, and lower fouling for realizing sustainable wastewater recycling and a “zero discharge” water treatment process25.

The quality of the effluent depends on various parameters with different, inter-related physico-chemical effects26. Machine learning (ML) techniques present a viable approach for determining the efficiency of the treatment approaches through data-driven modeling for different industrial remediation applications. Regression-based statistical and Neural Network (NN) models are efficient methodologies to predict the degradation process parameters and identify the degradation processes27 of a dye-wastewater treatment plant (DWWTP) process28. ML models including multi-linear regression (MLR)29, multi-layer perceptron (MLP), Artificial Neural Networks (ANN)30,31, and Deep Learning (DL)32 are commonly used for generating predictive models33. MLR modeling provides a simple and efficient method for determining the relationship between input and output parameters for each treatment process. NN models accurately predict the process performance parameters34 using the Levenberg–Marquardt algorithm (LMA)35 to obtain the optimal solution through faster convergence of the mean squared error36. Sharma et al. have employed MLR modeling for analyzing BOD removal efficiency using time series plots, which are evaluated through standard criteria37. Guo et al. have developed machine learning models to predict effluent concentration for a WWTP using model parameter optimization for decision-making modeling and process control38. Lin et al. detail BPANN modeling for the correlation of multiple parameters for the design of a control strategy for a disinfection process of effluents39.

In this article, a detailed discussion on the modified combined AOP-based effluent DWWTP is proposed with detailed design, process operations, and modeling through different ML strategies. This pre-treatment system is designed through a controlled coagulation and flocculation process which removes the precipitate and flocs. The effluent is then passed through chemically modified sand filters acting as an adsorbent for the removal of color. This pre-filtration stage is followed by a batch photocatalysis reactor designed to be amenable to visible light penetration for effective photocatalytic remediation. A detailed discussion on the influence of parameters on the overall decolouration efficiency utilizing solar irradiation through ZnO NanoMat photocatalytic filters is presented. The breakdown of the remaining color and other complex organic compounds is performed when passed through PAN (polyacrylonitrile) fiber filtration and activated carbon filtration (ACF) steps. The twin objectives of modeling and prediction of the discharge (outlet) parameters for a set of influents (inlet) for performance evaluation are achieved through nonlinear functional modeling using MLR and NN regression techniques. A statistical study between the initial and final parameters is presented for the decolouration process with an adequate graphical representation of process parameters. Modeling studies are quantified with an adjusted coefficient of correlation (R2adj > 0.9) between the measured and predicted output variables for major treatment processes. The installed pilot treatment plant efficiently remediates textile effluents resulting in a zero liquid discharge (ZLD) system.

Results

The combined effluent treatment pilot plant developed through this study, has been commissioned at a medium-scale textile production unit in the western part of India, as shown schematically in Fig. 1. For the initial part of the study, a case of textile/steel wastewater mixing to form the influent for the pilot is evaluated at a combined effluent treatment facility. The same pilot plant is later commissioned at the source of effluent coming out of a textile production unit.

Fig. 1: Schematic of advanced oxidation process based on combined treatment plant.

Illustration of the various filtration stages and process components setup and working of the dye-wastewater treatment plant.

Combination treatment processes components

The treatment of the industrial textile effluent follows a combined remediation procedure outlined in the schematic in Fig. 1. The stored influent is transferred for the pre-treatment process by passing through a coagulation and flocculation (C&F) unit, forming precipitates and flocs. This is followed by an activated sand-based filtration (SF) unit for adsorption of the dyes, which requires regular backwash cycles to resolve the issue of pore-clogging, thereby maintaining the overall effectiveness of the filtration process. The treated water is stored in a buffer tank, from where it is transferred to a photocatalytic reactor fitted with ZnO-type nanomaterial coated over perforated metallic filter plates. Two separate NanoMat (NM) tanks (NM1 and NM2) with a capacity of 5 KL (kiloliters) each are used serially for solar-induced photocatalytic degradation, with a cycle time of 2 h in each reactor. The transfer of effluent between NM1 and NM2 is necessary to meet the agitation requirements of the system and to ensure effective effluent contact with the photo-catalyst, which generates interaction sites between the catalyst and the dye molecules for higher dye degradation efficiency. The photo-catalytic treatment stage is responsible for degrading dye molecules into smaller fragments, which changes the absorption spectra of the effluent. The effluent is then passed through polyacrylonitrile (PAN) fiber and activated carbon filtration (ACF) steps to remove residual suspended solids and broken-down compounds through the adsorption process. Evaluation of the water parameters is performed on samples collected offline for each stage.

Pre-filtration (coagulation & flocculation) unit

The pre-filtration unit comprises a coagulation and flocculation (C&F) chamber that removes the colloidal suspensions by precipitation through floc formation, as outlined earlier14. For effective operation of the coagulation process, the effluent should have a near-neutral pH (6.5–6.8). It is worth mentioning that the steel effluent is found to be highly acidic (pH < 1) and the textile effluent basic (pH > 9). For the mixed-effluent case, steel and textile effluents are mixed at a ratio of 1:35, whereas for the textile-only case the effluent is taken in its original state, as detailed in Supplementary Methods (1.1.1).

Activated sand filtration (SF) unit

Acid activation of sand is used for the dye sorption process for both types of wastewaters (Steel-textile mixture and pure textile) and is widely applied due to its simplicity and low-cost nature. The filtration process is based on the principle that an overall increase in the surface area and pore volume leads to greater adsorption sites40, detailed in Supplementary Methods (1.1.2).

Fabrication of ZnO/ZnOGO coated SS sheets (NanoMat)

A detailed discussion on the fabrication of ZnO/ZnOGO nanocomposite (NanoMat) over a substrate has been carried out in earlier studies18,22 and other pieces of work available in the prior art19,41,42. The nanocomposite grown over a perforated stainless steel (SS304) sheet substrate can withstand highly corrosive working conditions. A detailed procedure for the synthesis of ZnO/ZnOGO loaded SS sheets is further detailed in Supplementary Notes (1.10). The AOP-based photocatalytic reactor functions in the presence of solar irradiation to produce hydroxyl radicals which degrade the dye molecules through oxidation of complex pollutants and dyes into simpler by-products. This process step reduces COD level43 to achieve the regulatory standards (IWDS limits).

Photocatalytic nano-mat (NM) reactor

The photocatalytic reactor is designed to accommodate serially placed perforated ZnO/ZnOGO coated SS sheet panels with a replacement option, shown in Fig. 2. The schematic of the photocatalytic unit is detailed in Fig. 3 and consists of 2 sequential photocatalytic chambers/reactors, each with a serpentine flow path for carrying out the degradation of the industrial effluent. The photocatalytic unit is placed on a platform at an elevation of 5 m and 3 m respectively, designed for maximum daily solar exposure (Fig. 3e). Each chamber is divided into 4 parallel sections with walls made up of a 38 mm thick acrylic sheet that is transparent to the incident light. The effluent enters these reactors through a water inlet (10 cm dia.) and passes through 35 serially placed perforated coated stainless-steel sheets (Fig. 3d) in each section of the flow reactor. The flow of the effluent stream through the serpentine path via each section of the reactor is designed to maximize the effluent contact time with the catalyst-coated SS plates. Each photocatalytic chamber/reactor has a volume of ~5.6 m3 with an effective capacity of 5 KL (kiloliters). Each section has 35 sheets arranged serially, separated by a gap of 9 cm and inclined at an angle of 65° to maximize solar exposure duration (Fig. 3b). The galvanized SS sheets are custom-made for the reactor with equidistant holes over the whole surface area of the sheet (detailed design analysis provided in Supplementary Methods (1.1.4)), with 64 holes, each of diameter 2 mm, drilled in an XY-matrix (Fig. 3a). Grooves (3 × 4 mm) have been made on the side walls such that the SS sheets may be inserted and removed periodically (Fig. 3c). The grooves are made so that alternate sheets are docked slightly above the reactor bottom with a clearance passage for effluent. The other set of filters is docked with negligible clearance from the reactor bottom, inducing a flow geometry along a zigzag path and through the holes present in the plate surface. A combination of serpentine flow in separated sections with a zigzag type of flow between the perforated sheets facilitates maximum contact with the catalyst and efficient mixing of effluent during the reactor loading.

Fig. 2: Schematic of photocatalytic unit using advanced oxidation process.

Illustration of the construction details of the photocatalytic reactor via different views, showing the orientation of the sheets and the array of holes on the sheets used for deposition of the nanomaterial photocatalyst.

Fig. 3: Schematic of components of the Photocatalytic unit.

a SS Sheet panel design consisting of micro-hole arrays in an equidistant pattern, b Groove schematics for SS sheet plates for vertical placements, c Top view of grooves for fitting in the SS sheets for operational efficiency, d Top view of flow Reactor/Chamber through the serpentine path via each section designed to maximize the effluent contact time with the catalyst-coated SS plates, and e Schematic of 2 sequential photocatalytic chambers/reactors to maximize the operational efficiency of the Photocatalysis unit.

Nano-filtration (NF) units

To satisfy the ZLD standards, the treated dye wastewater is further passed through a combination of a Poly(acrylonitrile) (PAN) fiber filter and an Activated Carbon filter (ACF) for performing the Nano-Filtration (NF) process step. The carbon nano-fiber filters are obtained from E-spin Technologies Pvt. Ltd. and the characterization performed is recorded in Supplementary Figs. 6–7 (ref. 44). The NF process works on the principle of adsorption of the organic compounds in nano-pores of the carbon fibers, resulting in a significant decrease (~42%) of the total organic carbon (TOC) parameter, and requires a regular backwash cycle.

Structural characterization of photocatalytic nano-mat filter

The characterization of ZnO and ZnOGO NanoMat using Field Emission Scanning Electron Microscopy (FESEM) is shown in Fig. 4, where Fig. 4a, b shows that the ZnO nanostructures are densely grown over the requisite substrate with a uniform size distribution, and Fig. 4c, d shows good joining and homogeneity of the nanocomposite of ZnO and GO.

Fig. 4: Photo-catalyst morphology characterization using Field emission scanning electron microscope (FESEM).

The FESEM images of (a, b) ZnO and (c, d) ZnOGO NanoMat. The scale bars in (a–d) represent 2 μm, 100 nm, 1 μm, and 2 μm, respectively.

The X-ray diffraction (XRD) analysis in Fig. 5a shows the characteristic peaks of the wurtzite-type crystal structure of ZnO and the characteristic (001) peak of graphene oxide (GO)45,46. In the ZnO/ZnOGO nanocomposite, the (001) peak of GO shifts and decreases in intensity owing to the heterojunction formation between ZnO and GO47 and exfoliation of the GO sheets. FTIR spectra of the ZnO/ZnOGO nanocomposite in Fig. 5b show a peak at 1625 cm−1 corresponding to the C=O stretching bond, which is shifted to 1617 cm−1 for pure GO, indicating composite formation in ZnOGO48,49. The other peaks observed in the range of 1000–1500 cm−1 belong to functional oxide groups obtained from oxidation reactions during the synthesis of GO. Raman analysis of the ZnO/ZnOGO photocatalyst in Fig. 5c shows two characteristic peaks at 1350 and 1580 cm−1, corresponding to the D band and G band of GO50. The D band represents out-of-plane sp3 hybridized carbon atoms created by defects during acid exfoliation of the graphite, and the G band represents the in-plane vibration of sp2 hybridized carbon atoms51. The decrease in intensity and broadening of peaks in the nanocomposite can be attributed to the change in the defect states of the material, and the peaks observed at 490 and 780 cm−1 represent characteristic phonon vibrations of ZnO52.

Fig. 5: Photo-catalyst structure characterization.

a The XRD pattern of ZnO, GO, and ZnO-GO nanocomposites, b The FTIR of ZnO, GO, and ZnOGO nanocomposites, c The RAMAN of ZnO, GO, and ZnOGO nanocomposites, and d EDS analysis of ZnO, GO, and ZnOGO nanocomposites.

Pilot unit operational details

A prototype pilot combined AOP-based dye wastewater treatment plant unit is shown in Fig. 6. The C&F unit removes the suspended organic solids through the coagulation and flocculation process, followed by the Activated sand filter (ASF) which adsorbs the remaining compound and color from the effluent stream. The photocatalysis (NM1 and NM2) reactor degrades the remaining complex organic compounds through the oxidation route using highly reactive hydroxyl radical (·OH) generated from the photocatalyst, which further reduces the Chemical oxygen demand (COD) and total organic carbon (TOC) parameters. This is followed by a capture of the degraded products and remaining chemical compounds using hollow PAN fiber and an activated carbon filtration remediation unit. The treated water obtained at the outlet of the remediation process has COD, TOC, BOD, and other parameters under dischargeable limits as detailed in Table 1. For the analysis of the remediation processes in the pilot plant, a detailed study is performed to extract correlations between treatment processes to determine the optimum performance parameters of the treatment processes.

Fig. 6: Prototype combined dye-remediation filtration unit.

The schematic diagram describes the pilot-scale 10 KLD treatment plant using serial intervention of combination treatment strategies for effective treatment of dye-containing wastewater.

Table 1 Mean and Percentage degradation of influent parameters

Study of photocatalytic filter performance

The analysis of the photocatalytic efficiency of the ZnO Nano-Mat is performed using UV-vis absorbance spectra of the effluent measured at regular intervals of time. The industrial effluent is introduced into the photocatalytic reactor after the C&F and SF steps. The UV-vis absorbance spectra show a decrease in absorbance (~95%), indicating a reduction in color concentration in 4 h (Fig. 7). The variation of normalized concentration with reaction time, plotted as ln(C0/C) vs t (irradiation time), follows first-order reaction kinetics (\(K_{app} = 0.00542\) min−1). Figure 7 validates the ZnO NM-based photocatalysis process as it successfully degrades industrial wastewater. A detailed discussion of the reaction mechanism for the photocatalysis process is provided in Supplementary Methods (1.5).
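As a minimal sketch of how an apparent first-order rate constant can be estimated from a plot of ln(C0/C) against irradiation time, the following fits a straight line through the origin by least squares. The sampling times and concentrations are synthetic values generated from the reported rate constant purely for illustration; they are not the measured data, and the function name is hypothetical.

```python
import math

def fit_first_order_rate(times_min, conc, c0):
    """Least-squares slope through the origin of ln(C0/C) versus t."""
    ys = [math.log(c0 / c) for c in conc]
    num = sum(t * y for t, y in zip(times_min, ys))
    den = sum(t * t for t in times_min)
    return num / den

K_APP = 0.00542  # min^-1, apparent rate constant reported in the text

# Synthetic, noise-free concentrations (assumption, not plant measurements).
c0 = 1.0
times = [30, 60, 90, 120, 180, 240]  # minutes, hypothetical sampling times
conc = [c0 * math.exp(-K_APP * t) for t in times]

k_fit = fit_first_order_rate(times, conc, c0)

# For first-order kinetics the half-life is ln(2) / k.
half_life_min = math.log(2) / k_fit
print(f"k = {k_fit:.5f} 1/min, half-life = {half_life_min:.1f} min")
```

With real absorbance data, the same fit would be applied to the measured C/C0 ratios, and the scatter around the line indicates how well the first-order assumption holds.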

Fig. 7: Degradation results and kinetic curve.

Photocatalytic degradation results using Time variation of Normalized concentration and logarithmic normalized concentration for degradation kinetic curve using ZnO NanoMat Photocatalytic reactor.

Wastewater parameters analysis

The combined treatment process has the objective of keeping the treated water within tolerable parameter limits for discharge into the environment. For textile wastewater, COD, BOD, TDS, TOC, turbidity, and pH are the most important parameters that determine the quality and environmental safety of the effluent. Table 1 illustrates the wastewater parameters for various treatment stages of the plant53. The values of the wastewater parameters decrease across the treatment stages through serial interventions by degradation, adsorption, oxidation, and filtration processes. The color of the wastewater changes and becomes clearer, as shown in Fig. 8, indicating the degradation of complex organic dyes and other chemicals. Figure 9 depicts the time series plots of COD and TOC variation after each treatment stage for an experimental period of 8 days, showing effective degradation by reduction of the parameter values. The treatment pilot plant was tested at CETP, Jodhpur (mixture of steel and textile wastewater) and Laxmi Textiles, Jaipur (only textile wastewater).

Fig. 8: Pictorial depiction of dye degradation processes.

Change in color of the effluent, representing dye degradation, during the combination of treatment processes of Coagulation and Flocculation (C&F), Activated Sand Filter (SF), Photocatalytic ZnO NanoMat (NM), Poly(acrylonitrile) (PAN), and Activated Carbon filter (ACF) for (a) textile and steel effluent (right to left), (b) textile effluent (left to right).

Fig. 9: Time series Parametric variation of wastewater.

Time series model of experimental stage of the combined treatment plant during pilot scale study with each treatment stage for COD and TOC at (a, c) CETP, Jodhpur (mixture of steel and textile wastewater) and (b, d) Laxmi Textiles, Jaipur (textile wastewater).

Data-driven predictive performance modeling

Data-driven modeling (DDM) techniques are used to obtain accurate prediction models using inlet parameters measured through the inexpensive implementation of online sensors. DDM is useful for processes where the existing mechanistic models are too complex to be implemented and sensor data acquisition can easily be deployed. Machine Learning (ML) algorithms can be effectively utilized for obtaining predictions and desired responses by modeling the response characteristics using a given set of inputs and outputs54 and establishing the parametric relationship between state variables measured through the data acquisition system55. Due to the dynamic nature of the effluent variability and process dynamics, parsimonious linear models present a reasonable choice for their interpretable nature over powerful non-linear modeling techniques56,57. The modeling process comprises partitioning the dataset into a training set, used to fit the model, and a validation set, used to calculate the residual error. A correlation factor between the response variable and parametric variable is used for the performance evaluation of the process. The process steps of the treatment plant are denoted by (A) Coagulation and Flocculation (C&F), (B) Activated Sand Filter (SF), (C) NanoMat (NM) filter, (D) PAN fiber (PAN) filters, and (E) Activated Carbon filtration (ACF).

In this study, Multi-Linear Regression (MLR) and Neural Network (NN) based predictive models are generated to determine the parametric model of the treatment processes. The input parameters for predictive modeling are selected according to the Inland wastewater discharge standards (IWDS), which are then utilized to determine the efficiency of the treatment processes for the pilot plant. Regression modeling is utilized for predicting continuous dependent (response) variables, i.e., outlet COD, TOC, Turbidity, etc., from a set of independent (regressor) variables such as pH, COD, BOD, TDS, TOC, and Turbidity of the influent stream. Total organic carbon (TOC) signifies the amount of organic compounds present after each treatment process, while chemical oxygen demand (COD) gives the amount of oxygen required to oxidize them completely. Turbidity measures the amount of cloudiness, which in turn shares a direct correlation with the COD and TOC parameters. The process required to measure the COD and TOC parameters is long and tedious, and prediction modeling of the output parameters allows real-time modeling of the treatment process for various applications.

Statistical analysis

A time series plot of the outlet COD, TOC, and Turbidity removal efficiency for dynamically changing influents is depicted in Supplementary Figure 13 for each process over 45 days of plant operation. The plots depict the breakdown of the dye into simpler by-products through the combined treatment processes, with output removal efficiency of ~70% for C&F and SF, and ~50% for NM and NF processes, as shown in Table 1. To measure the accuracy of the output sample distribution, standard errors based on the standard deviation are analysed to quantify the deviation of the sample values from the mean. Table 2 shows the mean and standard deviation for the outlet stream for the treatment processes.
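The quantities used in this analysis can be sketched as follows: a percentage removal efficiency per stage, and the mean, sample standard deviation, and standard error of an outlet parameter. The numerical values and function names below are hypothetical placeholders chosen for illustration, not plant measurements.

```python
import math
import statistics

def removal_efficiency(c_in, c_out):
    """Percentage removal of a parameter across one treatment stage."""
    return 100.0 * (c_in - c_out) / c_in

def summarize(values):
    """Return mean, sample standard deviation, and standard error of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample (n - 1) standard deviation
    se = sd / math.sqrt(len(values))     # standard error of the mean
    return mean, sd, se

# Hypothetical inlet/outlet COD values (mg/L) for one stage over five days.
inlet_cod = [820.0, 790.0, 860.0, 805.0, 840.0]
outlet_cod = [250.0, 236.0, 262.0, 240.0, 255.0]

efficiencies = [removal_efficiency(i, o) for i, o in zip(inlet_cod, outlet_cod)]
mean_out, sd_out, se_out = summarize(outlet_cod)
print(f"mean removal = {statistics.mean(efficiencies):.1f} %")
print(f"outlet COD: mean = {mean_out:.1f}, SD = {sd_out:.1f}, SE = {se_out:.1f}")
```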

Table 2 Mean and Standard deviation analysis of outlet parameters at each treatment process

Further investigation between the inlet and outlet parameters using correlation coefficient (R) and Covariance is detailed in Supplementary Tables 7–15. A strong correlation between COD, BOD, and turbidity is observed while pH, TOC, and TDS show a weak or variable correlation factor between the regressor variables, which in conjunction with the covariance and correlation are used for establishing conditions for the formulation of the prediction model. A threshold p < 0.05 is considered for the estimated coefficients indicating a significant similarity relationship and level of marginal significance of the Outlet parameters, shown in Fig. 10 and Supplementary Table 6. A smaller p-value signifies strong evidence for an alternative hypothesis w.r.t. the difference between predicted and actual process parameters.
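For reference, the sample covariance and Pearson correlation coefficient between an inlet and an outlet parameter can be computed as in the short sketch below. The paired values are hypothetical, and the helper names are illustrative only.

```python
import math

def covariance(x, y):
    """Sample covariance (n - 1 denominator) of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pearson_r(x, y):
    """Pearson correlation coefficient R between two series."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical paired inlet COD and outlet turbidity readings.
inlet_cod = [820.0, 790.0, 860.0, 805.0, 840.0]
outlet_turbidity = [12.1, 11.4, 13.0, 11.9, 12.6]

r_value = pearson_r(inlet_cod, outlet_turbidity)
cov_value = covariance(inlet_cod, outlet_turbidity)
print(f"R = {r_value:.3f}, covariance = {cov_value:.2f}")
```

A value of R close to +1 or -1 indicates a strong linear association, which is the kind of screening used here before selecting regressors for the prediction model.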

Fig. 10: Significance testing of the influent parameters.

p-value outlet plots for different treatment process steps signifying the similarity relationship and level of marginal significance of the outlet w.r.t. the treatment processes (A) C&F, (B) SF, (C) NM, (D) PAN, and (E) ACF.

Process parameter identification using multi-linear regression

The treatment processes are nonlinear and are subjected to large variations of the inputs, which together with uncertainties require efficient process parameterization for a consistent outlet and robust process design. The outlet process model identified using MLR is used for determining efficient process parameters of the DWWTP. Table 3 presents the values of the outlet COD, TOC, and turbidity, denoting a good fit for the estimated model through efficient prediction for the process application. Using the goodness of the fit model, Fig. 11 details the degree of fit of the predictive model for the treatment process steps. A close-fitted line indicates the significance of the fit model in predicting the parameter values, denoted by the measure of fit. The fit model, evaluated using R2 and R2adj, explains ~60–80% and ~50–97% of the total variation in outlet COD, TOC, and Turbidity for the C&F and SF treatment steps, respectively. For the NM, PAN, and ACF treatment steps, R2 and R2adj values range from 92–99% for the outlet parameters, indicating a progressively more accurate prediction model for the processes due to less variability and higher degradation efficiency of the remediation process. Outlet COD, TOC, and Turbidity prediction models are evaluated using various criteria of standard error, 95% Confidence Interval statistics, and covariance, resulting in p < 0.0001, as detailed in Supplementary Tables 13–15 and Fig. 11. A smaller standard error value signifies a better fit, and less overfitting provides better point estimates of the mean response around the confidence region.
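To make the role of R2 and R2adj concrete, the following is a minimal sketch of a multi-linear regression fitted by ordinary least squares via the normal equations, using only the Python standard library. The data are synthetic (45 samples and three hypothetical inlet regressors with assumed coefficients and noise), so the numbers do not reproduce Table 3; the sketch only illustrates how the fit statistics are obtained.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(X, y):
    """Return [intercept, b1, ..., bp] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def fit_quality(y, y_hat, n_predictors):
    """Return (R2, adjusted R2) for observed y and predicted y_hat."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return r2, r2_adj

# Synthetic data: 45 samples, three hypothetical inlet regressors
# (e.g. inlet COD, TOC, turbidity) and an assumed linear response plus noise.
rng = random.Random(0)
X = [[rng.uniform(200, 600), rng.uniform(50, 150), rng.uniform(10, 60)]
     for _ in range(45)]
y = [5.0 + 0.4 * a + 0.2 * b + 0.1 * c + rng.gauss(0, 5) for a, b, c in X]

beta = fit_mlr(X, y)
r2, r2_adj = fit_quality(y, predict(beta, X), n_predictors=3)
print(f"R2 = {r2:.3f}, adjusted R2 = {r2_adj:.3f}")
```

The adjusted statistic penalizes the number of regressors, which is why it is the more conservative measure when several inlet parameters are used with a limited number of samples.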

Table 3 Statistical Regression Fit Model of combined treatment processes

Fig. 11: Multi-linear regression for prediction performance modeling.

Illustrative plots between the actual vs predicted values of the outlet parameters for different treatment process steps depicting the correlation between the experimental and predicted values w.r.t. the treatment processes (A) C&F, (B) SF, (C) NM, (D) PAN, and (E) ACF.

Parameter estimation of the process response using artificial neural networks

ANN-based process predictive modeling is utilized for the treatment process steps in the DWWTP, which are inherently complex due to the nature of the degradation process. The ANN captures the data patterns for predictive modeling and optimization of the DWWTP operations. A hidden layer of 10 nodes is chosen based on the performance requirements of the prediction model. The network is trained with the Levenberg–Marquardt backpropagation algorithm (LMA) by minimizing the mean squared error, with the aim of good generalization capability while limiting overfitting. A total of 45 data samples are considered for each treatment step for the training, validation, and testing steps58.
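The structure described above (a single hidden layer of 10 nodes, trained on 45 samples split into training, validation, and test subsets by minimizing the mean squared error) can be sketched as follows. Several elements are assumptions made for illustration: the data are synthetic, the 31/7/7 split is arbitrary, and plain full-batch gradient descent is substituted for the Levenberg–Marquardt algorithm to keep the example dependency-free.

```python
import math
import random

def make_data(n=45, seed=0):
    """Synthetic regression data: 3 scaled inputs, 1 output (assumption)."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    ys = [0.6 * x[0] - 0.3 * x[1] + 0.2 * math.tanh(2 * x[2]) for x in xs]
    return xs, ys

def init_net(n_in, n_hidden=10, seed=1):
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return hidden activations (tanh) and the linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    return h, sum(w * hi for w, hi in zip(net["W2"], h)) + net["b2"]

def mse(net, xs, ys):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_epoch(net, xs, ys, lr=0.05):
    """One full-batch gradient-descent step on the MSE (not Levenberg-Marquardt)."""
    n, n_hidden, n_in = len(xs), len(net["W2"]), len(xs[0])
    gW1 = [[0.0] * n_in for _ in range(n_hidden)]
    gb1, gW2, gb2 = [0.0] * n_hidden, [0.0] * n_hidden, 0.0
    for x, y in zip(xs, ys):
        h, y_hat = forward(net, x)
        err = 2.0 * (y_hat - y) / n
        gb2 += err
        for j in range(n_hidden):
            gW2[j] += err * h[j]
            dh = err * net["W2"][j] * (1.0 - h[j] ** 2)  # tanh derivative
            gb1[j] += dh
            for i in range(n_in):
                gW1[j][i] += dh * x[i]
    for j in range(n_hidden):
        net["W2"][j] -= lr * gW2[j]
        net["b1"][j] -= lr * gb1[j]
        for i in range(n_in):
            net["W1"][j][i] -= lr * gW1[j][i]
    net["b2"] -= lr * gb2

xs, ys = make_data()
train_x, train_y = xs[:31], ys[:31]      # assumed split: 31 / 7 / 7
val_x, val_y = xs[31:38], ys[31:38]
test_x, test_y = xs[38:], ys[38:]

net = init_net(n_in=3)
initial_train_mse = mse(net, train_x, train_y)
best_val_mse = mse(net, val_x, val_y)
for epoch in range(300):
    train_epoch(net, train_x, train_y)
    best_val_mse = min(best_val_mse, mse(net, val_x, val_y))
final_train_mse = mse(net, train_x, train_y)
test_mse = mse(net, test_x, test_y)
print(f"train MSE {initial_train_mse:.4f} -> {final_train_mse:.4f}, "
      f"best validation MSE {best_val_mse:.4f}, test MSE {test_mse:.4f}")
```

Tracking the validation error at each epoch, as done here, is the usual way to pick the stopping point; the held-out test subset then gives an estimate of performance on unseen samples.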

The ANN prediction model shows higher model performance for the outlet COD, TOC, and Turbidity parameters, with high R2 ≈ 0.99 and small residual errors, as shown in Fig. 12. The result of the testing model demonstrates high R2 values for the performance model. The performance study in Fig. 13 tests the accuracy of the outlet COD, TOC, and Turbidity prediction models achieved at different epochs. The lower value of the validation performance indicates the best fit and high performance of effluent parameters and captures the optimum parameters for model predictive design with improved prediction of the removal of influent dye wastewater. MLR and ANN prediction models for output parameters provide forecasting capability for the efficient process operation of the DWWTP. Due to their robust nature and suitable learning characteristics, NN models capture the dynamic linear relationship between inputs and output, providing a good fit and efficient predictive performance.

Fig. 12: Performance of the training, validation, and testing neural network modeling for waste-water treatment processes.

Observed vs predicted performance modeling through training, validation, and test of the dataset parameters using artificial neural Networks-based regression modeling w.r.t. the treatment processes (A) C&F, (B) SF, (C) NM, (D) PAN, and (E) ACF.

Fig. 13: Validation performance of the training, validation, and testing neural network model for waste-water treatment processes.

Mean squared error response observed vs the number of iterations of the training dataset through the Neural Networks-based learning algorithm for Outlet COD, TOC, and Turbidity w.r.t. the treatment processes (A) C&F, (B) SF, (C) NM, (D) PAN, and (E) ACF.

Residual evaluation of the parametric prediction model

Residual analysis of the prediction models provides information about the model adequacy using the goodness-of-fit of the generated model, shown in Fig. 14 for MLR and Fig. 15 for ANN models. The scatter plots of residuals show that the variance around the regressors is highest for the initial treatment steps and decreases incrementally with each treatment process step, correlating with the earlier results. The residual plot obtained from the generated ANN model denotes a better prediction capability of the output against multiple regressor variables. The incrementally lower error residual for each process results in a high correlation among the set of input and output variables, verifying the better degradation through the treatment processes.
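A residual check of this kind can be sketched in a few lines: residuals are the differences between measured and predicted outlet values, and one common numerical complement to a residual plot is the Durbin–Watson statistic, where values near 2 suggest little first-order autocorrelation. The use of Durbin–Watson here is an illustrative assumption rather than a statistic reported in this study, and the sample values are hypothetical.

```python
def residuals(actual, predicted):
    """Residual = measured value minus model prediction."""
    return [a - p for a, p in zip(actual, predicted)]

def durbin_watson(res):
    """Durbin-Watson statistic: sum of squared successive differences / sum of squares."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(e * e for e in res)
    return num / den

# Hypothetical measured vs predicted outlet COD values (mg/L).
measured = [250.0, 236.0, 262.0, 240.0, 255.0, 248.0]
predicted = [247.5, 239.0, 259.0, 243.5, 252.0, 249.5]

res = residuals(measured, predicted)
mean_residual = sum(res) / len(res)
dw = durbin_watson(res)
print(f"mean residual = {mean_residual:.2f}, Durbin-Watson = {dw:.2f}")
```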

Fig. 14: Outlet residual analysis using multi-linear regression.

Residual vs Predicted parameter plots for Outlet COD, TOC, and Turbidity parameters for treatment process steps showing independent observations without any non-random patterns w.r.t. the treatment processes (A) C&F, (B) SF, (C) NM, (D) PAN, and (E) ACF.

Fig. 15: Outlet residual analysis using neural network modeling.

Residual vs Predicted parameter plots for Outlet COD, TOC, and Turbidity parameters for treatment process steps showing independent observations without any autocorrelated model errors w.r.t. the treatment processes (A) C&F, (B) SF, (C) NM, (D) PAN, and (E) ACF.

Discussion

This study presents a unique method for the treatment of textile industrial wastewater, in which a ZnO/ZnOGO NanoMat successfully grown over large metallic plates is utilized as a photocatalyst for the remediation of dye wastewater using combined treatment processes. Each filtration stage reduces the wastewater parameters to a significant degree, and the integrated stages provide a complete solution for the remediation of industrial textile wastewater. Fast reaction kinetics enable 95% discolouration of the industrial wastewater and dye effluents of 20 KL capacity. This study focuses on predicting organic carbon removal through the treatment process using multi-linear regression and artificial neural network models with a simple architecture. The wastewater parameters of pH, COD, BOD, TDS, TOC, and Turbidity are monitored after each step, and the prediction model provides high accuracy for the given combination of influent variables. Various statistical indices are evaluated to validate the accuracy of the regression models. The learning model exhibits high accuracy in predicting output variables, with R2adj > 0.9, and provides a useful practical estimation methodology for modeling organic carbon removal. The study shows a strong correlation between the measured and predicted effluent concentrations, with a high correlation value and a very small p-value. The data-based learning approach studied here is well suited to describing the relationship between wastewater quality parameters and has application potential for performance prediction, software sensing, and autonomous control operations of dye-based wastewater treatment plant processes when integrated with advanced sensing technologies into a decision support framework.

Methods

Activated sand filter (SF)

The finely screened sand particles are dried thoroughly and activated with 1 N H2SO4 and 1 N NaOH solutions through washing and settling until the desired pH is obtained. A normal sand filter is attached in series to a chemically modified soil filter to remove the total suspended solids from the influent stream and to extend the life of the designed SF, thus improving the overall throughput. The chemically modified sand is sandwiched between gravel layers along with a liner to hold the layer intact, and is backwashed at regular intervals to remove clogging caused by suspended matter. The effluent is passed through and collected in a buffer tank for further processing; details are given in Supplementary Notes (1.11.2).

Synthesis of ZnO/ZnOGO Filter coated sheets (NanoMat)

A detailed discussion on the fabrication of ZnO/ZnO-GO nanocomposite over a substrate has been carried out in earlier work18,46 and Supplementary Notes (1.10). To utilize the ZnO/ZnOGO nanostructures in the 10 KLD scale photocatalytic treatment, the nanocomposite is grown over a perforated substrate (SS sheet), which allows easier cyclic movement and better contact of the photocatalyst element with the effluent. The selection of the substrate is a critical decision because the various chemicals present in the effluent stream not only react with the material but also cause a high level of corrosion59. Initial studies show that stainless steel (SS304) is a good substrate choice among the treatment processes for storing wastewater and other effluents, as it can withstand the overall harsh working conditions. For growing ZnO nanorods onto the surface, 62.5 mL methanol containing 0.01 M (Zn(Ac)2) is kept in a beaker under constant stirring at a temperature of 60 °C. This stirring process is accompanied by a dropwise addition of 0.03 M NaOH mixed in 32.5 mL DI water, under strong stirring action for around 3 h for uniform growth of ZnO nano-seeds in the solution. Upon completion of the stirring, the solution is ultrasonicated for 30 min for enhanced adhesion properties and is grown in situ to form a thin film by drop casting on a substrate placed on a heated surface maintained at 170 °C. GO is fabricated using the Hummers' method with slight modifications, as discussed in earlier work56. The first step in the fabrication of the GO is choosing its derivative with different chemical properties and sizes best suited for the treatment application, which is detailed in Supplementary Section S160. The growth of ZnO nanorods is achieved through an 800 mL DI water solution mixture of 0.025 M zinc nitrate hexahydrate (Zn(NO3)2·6H2O) and 0.125 M hexamethylenetetramine (HMTA) obtained by ultrasonication for 60 min61.
The patterned ZnO nanorods in thin film form are obtained by placing the thin drop-cast ZnO nano-seed film on a substrate upside down over the above solution for 2 h, followed by equilibration at 90 °C in an oven to obtain the desired ‘NanoMat’. ZnOGO is obtained by mixing ZnO nano-seeds with GO solution and is grown using a similar procedure, as outlined in the supplementary section. The ZnO/ZnOGO-loaded SS sheets are placed in the photocatalytic reactor and exposed directly to sunlight as described in Fig. 3. The AOP-based visible light photocatalysis oxidizes the complex pollutants and dyes present in the effluent stream into simpler by-products, thereby reducing the COD levels43 to achieve the standard discharge limit.

Structural characterization of filter units

Spectroscopy techniques

Spectrometric analysis of the dye samples is carried out using Evolution™ 300 UV-Vis Spectrophotometer equipped with a long-lifetime xenon flash lamp and extended wavelength-range silicon photodiode detector. VISIONpro, Thermo Scientific™ software, is utilized as a control and data manipulation package for scanning for sample identification and method development, quantitative analysis, etc.

Material characterization

To analyse various properties of the fabricated ZnO/ZnO-GO photocatalyst, different characterization techniques were employed, namely field emission scanning electron microscopy (FESEM) (Zeiss Supra 40 V, Germany), X-ray diffraction (XRD) (PANalytical, Cu Kα radiation at wavelength = 1.5418 Å), Raman spectroscopy (WITec alpha 300 using Helium-Neon laser, at wavelength 532 nm), etc.

Photocatalytic process

The adsorption capacity of the synthesized FES is determined using known kinetic and isotherm models62. A kinetic process based on the Langmuir–Hinshelwood method63 is used for modeling the generation of electrons and holes in the presence of solar irradiation. Photocatalytic dye degradation reactions follow the Langmuir mechanism, \(\frac{{dC}}{{dt}}=-{k}_{{app}}C\), which upon integration yields the first-order equation \(C={C}_{0}\exp (-{k}_{{app}}t)\), where \(-\frac{{dC}}{{dt}}\) is the rate of dye decolourization under light irradiation, ‘t’ denotes the treatment process time (min), and \({k}_{{app}}\) represents the pseudo-first-order discolouration reaction rate constant (min−1), which equals the slope of the fitting line. The discolouration follows first-order kinetics, and the pseudo-first-order reaction rate constant is determined from the semi-logarithmic relation \(\mathrm{ln}(\frac{{C}_{0}}{C})\) vs t. Plotted against the photocatalytic reaction time, this normalized concentration follows a straight line whose constant slope gives the \({k}_{{app}}\) value. For determining the dye degradation efficiency, the percentage dye discolouration is obtained using the relation \({Dye}\,{decolourization}\left( \% \right)=\left[\left({C}_{0}-C\right)/{C}_{0}\right]\times 100\), where ‘\({C}_{0}\)’ and ‘\(C\)’ are the initial and instantaneous dye concentrations, respectively.
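The rate-constant estimation described above can be illustrated with a minimal sketch. It assumes first-order decay, \(C={C}_{0}\exp (-{k}_{{app}}t)\), takes the first reading as \({C}_{0}\) at t = 0, and fits the slope of \(\mathrm{ln}({C}_{0}/C)\) vs t by least squares through the origin. The function names and the concentration values are hypothetical and are not taken from the study's measurements.

```python
import math

def fit_k_app(times_min, concentrations):
    """Estimate the pseudo-first-order rate constant k_app (min^-1).

    Fits ln(C0/C) = k_app * t by least squares through the origin,
    taking the first concentration as C0 (measured at t = 0).
    """
    c0 = concentrations[0]
    y = [math.log(c0 / c) for c in concentrations]
    numerator = sum(t * yi for t, yi in zip(times_min, y))
    denominator = sum(t * t for t in times_min)
    return numerator / denominator

def decolourization_percent(c0, c):
    """Percentage dye decolourization relative to the initial concentration."""
    return (c0 - c) / c0 * 100.0

# Hypothetical, noise-free data generated with k_app = 0.05 min^-1.
times = [0, 10, 20, 30, 40]
conc = [100.0 * math.exp(-0.05 * t) for t in times]

k_app = fit_k_app(times, conc)
removal = decolourization_percent(conc[0], conc[-1])
print(f"k_app = {k_app:.4f} 1/min, decolourization = {removal:.1f} %")
```

With measured data the points scatter around the fitted line, so the quality of the straight-line fit is worth checking before \({k}_{{app}}\) is reported.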

Data-driven modeling (DDM)

Data-driven modeling is useful for processes where the existing mechanistic models are too complex to be implemented and data acquisition through measurements is readily available. Machine learning techniques can be effectively utilized for highly probable predictions and obtaining desired responses by modeling the processes and response characteristics through a given set of inputs and outputs54. Due to the dynamic nature of the effluent treatment process, parsimonious linear models are a reasonable choice for their interpretable nature over powerful non-linear modeling techniques. Machine learning-based soft sensing can automatically adapt to changes in the processes during plant operation64. Signal selection is based on the desired properties of the application and mode of analysis which includes either online or offline analysis, the accuracy of the sensor, and sampling rate.

Data pre-processing and mining

The dataset consists of various parameters validated and detailed in Supplementary Table 5.

Multi-linear regression

MLR models are preferred for their simple formulation and suitable learning characteristics, capturing the dynamic linear relationship and interaction effects between the input regressors and the output variable. The regression modeling technique generates a hypothesis, or equation, for a target output based on a set of ‘n’ independent variables and is given by,

$$y=f\left(X,\theta \right)+\varepsilon ={\theta }_{0}{X}_{0}+{\theta }_{1}{X}_{1}+{\theta }_{2}{X}_{2}+\cdots +{\theta }_{n}{X}_{n}+\varepsilon$$
(1)
$$\hat{y}={\theta }^{T}{X}_{i}={\theta }_{0}+{\theta }_{1}{X}_{1}+{\theta }_{2}{X}_{2}+{\theta }_{3}{X}_{3}+{\theta }_{4}{X}_{4}+{\theta }_{5}{X}_{5}+{\theta }_{6}{X}_{6}$$
(2)
$${\hat{y}}_{{Outlet}}={\theta }_{0}+{\theta }_{1}{x}_{{\rm{COD}}}+{\theta }_{2}{x}_{{\rm{BOD}}}+{\theta }_{3}{x}_{{\rm{TOC}}}+{\theta }_{4}{x}_{{\rm{TDS}}}+{\theta }_{5}{x}_{{\rm{pH}}}+{\theta }_{6}{x}_{{\rm{Turbidity}}}$$
(3)
$${\theta }^{T}=\left[{\theta }_{0}\,{\theta }_{1}\,{\theta }_{2}\,{\theta }_{3}\,{\theta }_{4}\,{\theta }_{5}\,{\theta }_{6}\right]{\rm{;}}X=\left[\begin{array}{c}1\\ {X}_{1}\\ {X}_{2}\\ {X}_{3}\\ {X}_{4}\\ {X}_{5}\\ {X}_{6}\end{array}\right]$$
(4)

where \(y\) is the dependent output variable, \(\hat{y}\) (outlet COD, TOC, and Turbidity) is the predicted output for the independent process variables \(X\) (COD, BOD, TOC, TDS, pH, Turbidity), \(\theta\) denotes the corresponding estimated regression coefficients, and \(\varepsilon\) is the random error for \(i=1,2,\ldots ,n\) sample observations. To estimate \(\theta\), either the ordinary least squares (OLS) or the gradient descent (GD) algorithm is used; the OLS estimate and the resulting prediction are given as:

$$\hat{\theta }={({X}^{{\prime} }X)}^{-1}{X}^{{\prime} }y$$
(5)
$$\hat{y}=X\hat{\theta }$$
(6)
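As a hedged illustration of Eqs. (5) and (6), the sketch below estimates \(\hat{\theta }\) from the normal equations with NumPy. The data are synthetic and noise-free, the column order (COD, BOD, TOC, TDS, pH, Turbidity) and the coefficient values are assumptions for illustration only, and the function names are hypothetical. The linear system \(({X}^{{\prime} }X)\hat{\theta }={X}^{{\prime} }y\) is solved directly instead of forming the explicit inverse, which yields the same estimate with better numerical behaviour.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones so that theta_0 acts as the intercept."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(X.shape[0]), X])

def fit_ols(X, y):
    """Solve the normal equations (X'X) theta = X'y for theta (Eq. 5)."""
    Xb = add_intercept(X)
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ np.asarray(y, dtype=float))

def predict(X, theta):
    """Compute y_hat = X theta (Eq. 6)."""
    return add_intercept(X) @ theta

# Synthetic example: 50 samples, 6 scaled influent features
# (assumed order: COD, BOD, TOC, TDS, pH, Turbidity).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 6))
theta_true = np.array([2.0, 0.5, -1.0, 0.8, 0.1, -0.3, 1.2])  # hypothetical
y = add_intercept(X) @ theta_true

theta_hat = fit_ols(X, y)
print(np.round(theta_hat, 3))
```

Because the synthetic targets contain no noise, the fit recovers the assumed coefficients; with plant data the estimate would carry sampling error.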

Neural networks

The Multi-Layer Perceptron (MLP) feed-forward network, trained with the backpropagation algorithm and hence also termed a backpropagation neural network (NN)65, is widely applied to map the relationship between numeric inputs and a set of targets. The NN is developed as a three-layer feed-forward network relating the complex nonlinear processes using input-output data, detailed in Supplementary Fig. 15. Input values are weighted, fed to the summing junction along with the bias, and passed through sigmoid hidden neurons and linear output neurons, each connected with separate weight and bias values, using the relation:

$${y}_{j}=f\left(\mathop{\sum }\limits_{i=0}^{n}{W}_{{ij}}{\alpha }_{i}+{b}_{j}\right)$$
(7)

where \({W}_{{ij}}\) are the weights, \({\alpha }_{i}\) are the corresponding inputs of COD, BOD, TOC, TDS, pH, and Turbidity, \({b}_{j}\) are the biases, and \({y}_{j}\) corresponds to outlet COD, TOC, and Turbidity. The sigmoid transfer function is used as the nonlinear transformation function, given as

$$f\left(x\right)=\sigma \left(x\right)=\frac{1}{1+{e}^{-x}}{\rm{;}}\,{f}^{{\prime} }\left(x\right)=f(x)(1-f(x)){\rm{;}}\,0 \,<\, f\,(x)\, <\, 1$$
(8)
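A forward pass through such a three-layer network can be sketched as follows, assuming sigmoid hidden neurons and linear output neurons as described above. The hidden-layer size (two neurons), all weight and bias values, and the scaled input values are hypothetical placeholders; the trained network's parameters are not given in this section.

```python
import math

def sigmoid(x):
    """Logistic transfer function of Eq. (8); assumes moderately scaled x."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Derivative f'(x) = f(x) * (1 - f(x)), as used in backpropagation."""
    s = sigmoid(x)
    return s * (1.0 - s)

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Weighted sum plus bias per neuron (Eq. 7): sigmoid hidden, linear output."""
    hidden = [
        sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]

# Hypothetical scaled inputs: COD, BOD, TOC, TDS, pH, Turbidity.
x = [0.8, 0.6, 0.7, 0.5, 0.4, 0.9]
w_hidden = [[0.2, -0.1, 0.3, 0.0, 0.1, 0.4],
            [-0.3, 0.2, 0.1, 0.5, -0.2, 0.1]]
b_hidden = [0.1, -0.1]
w_out = [[0.5, -0.4], [0.3, 0.2], [-0.1, 0.6]]  # rows: outlet COD, TOC, Turbidity
b_out = [0.0, 0.1, 0.2]

outputs = forward(x, w_hidden, b_hidden, w_out, b_out)
print(outputs)
```

Training would adjust the weights and biases by backpropagating the prediction error through `sigmoid_prime`; only inference is shown here.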

The OLS algorithm is used to calculate the parameters for the independent process variables by minimizing the sum of squared differences (equivalently, the MSE) between the predicted and actual values of the dataset66. Alternatively, gradient descent optimization algorithms can be used, which iteratively minimize the error to obtain an optimum value of θ and provide better and faster results for large data sets.
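The iterative alternative can be sketched as batch gradient descent on the MSE of a linear model. This is a minimal illustration under stated assumptions: the learning rate, the iteration count, the two-feature synthetic dataset, and the function name are all hypothetical, since the text does not specify the optimizer settings used in the study.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iter=5000):
    """Minimize the MSE of a linear model by batch gradient descent."""
    Xb = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    theta = np.zeros(Xb.shape[1])
    n = len(y)
    for _ in range(n_iter):
        residual = Xb @ theta - y               # prediction error per sample
        grad = (2.0 / n) * (Xb.T @ residual)    # gradient of the MSE w.r.t. theta
        theta -= lr * grad                      # step against the gradient
    return theta

# Synthetic, noise-free data: y = 1 + 2*x1 - 3*x2 (hypothetical coefficients).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

theta_gd = gradient_descent(X, y)
print(np.round(theta_gd, 4))
```

On a dataset this small the closed-form OLS solution is cheaper; the iterative update becomes attractive when the number of samples or features makes solving the normal equations costly.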

Model performance evaluation approach

The model performance evaluation is based on the minimization of the mean squared error for making correct predictions of outlet COD, TOC, and turbidity using various modeling techniques. The goodness of the prediction performance is measured using the mean squared error (MSE), root mean squared error (RMSE), coefficient of determination (R2), and adjusted coefficient of determination (R2adj)67.

$${Mean\; Squared\; Error},{MSE}=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}{\left({y}_{i}-{\hat{y}}_{i}\right)}^{2}$$
(9)
$${Root\; Mean\; Squared\; Error},{RMSE}=\sqrt{\frac{1}{n}\,\mathop{\sum }\limits_{i=1}^{n}{\left({y}_{i}-{\hat{y}}_{i}\right)}^{2}}$$
(10)
$${R}^{2}\,{score}=1-\frac{{\sum }_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}}{{\sum }_{i=1}^{n}{({y}_{i}-\bar{y})}^{2}}$$
(11)
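The indices in Eqs. (9) to (11) can be computed as in the following sketch. The adjusted R2 uses the standard definition \(1-(1-{R}^{2})(n-1)/(n-p-1)\) for \(p\) predictors, which is assumed here rather than taken from this section; the function name and the sample values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred, n_predictors):
    """Return MSE, RMSE, R2 and adjusted R2 for paired observations."""
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    y_mean = sum(y_true) / n
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))  # residual SS
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)               # total SS
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"MSE": mse, "RMSE": math.sqrt(mse), "R2": r2, "R2_adj": r2_adj}

# Hypothetical observed vs predicted values for a single-predictor model.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]

m = regression_metrics(y_true, y_pred, n_predictors=1)
print(m)
```

For the six-input models discussed here, `n_predictors` would be 6, so the adjusted R2 is defined only when more than seven observations are available.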