## Introduction

Harvesting sustainable energy in spontaneous natural processes such as moisture diffusion1,2, water flow or evaporation3,4,5, and heat transfer6,7 has attracted extensive attention, which will potentially supply innovative and clean power for the fast-growing human society. Recently, based on the biomimetic and selective ion transport behaviors, some specific materials (e.g. MoS28, metal-organic framework9, polyimide10, etc.) with artificial micro/nanochannels demonstrate promising electricity generation ability when water flows through them. These water flow-enabled electric generators (WEGs) are proposed to involve the mechanism of the electric double layer (EDL) at the solid–liquid interface and water flow-induced charge separation11,12,13. Meanwhile, the energy conversion ability has been improved by regulating the intrinsic or structural feature (e.g. surface charge density, channel size)14,15 of materials. However, the development of high-performance WEG still faces great challenges and it seems to encounter an insurmountable bottleneck in macro-scale practical applications. The main reasons are as follows: (1) Although nanoscale WEGs have favorable power density, it is difficult to maintain the energy generation ability after integrating nanochannels into macroscopic assemblies, because of the disordered structures and incoordination between individual nanochannels16,17,18. (2) In addition to the macroscale and uniform integration of nanochannels, the performance of WEG will also strongly depend on the regulation of surface charges of nanochannels, interactions between water and materials, internal water transport and so on3,13,19,20. Unfortunately, the multiple factors are restrictively coupled, and the change in one of them will dramatically affect the matching configuration and ultimate power generation ability21,22. 
Especially in the expected macroscopic WEG, the interrelationship among massive nanochannels will bring multi-objective trade-offs, multi-parameter coupling, and vast parameter optimization exploration, which could make the experiments complicated, expensive, and time-consuming14,23,24. (3) To promote water flow through the nanochannels of WEGs, a large water pressure difference should be applied on two sides of the nanochannels, which makes the energy conversion system complex and is far from the expected natural energy transfer process5,11,15,25,26. Besides, the mechanical properties of WEG should meet the practical requirements in various conditions27,28,29. All these problems have severely restricted the fabrication and development of macroscopic WEG to achieve the realistically available applications.

2D graphene oxide (GO) sheets rich in oxygenated functional groups can be easily layer-by-layer assembled into desirable and oriented structures with the forming capability in macroscale30,31. Meanwhile, the versatile modification of functional groups on GO sheets can control the surface charge density when interacting with water32,33. As a result, abundant and designable 2D nanochannels between GO sheets could be uniformly integrated into macroscopic assemblies, which provide ideal platforms for the development of high-performance WEG34,35.

In this work, WEG with massive integrated 2D nanochannels (2D-WEG) within the long-range (1–20 cm) oriented 2D GO assembled framework is developed by a rotational freeze-casting method. This 2D-WEG with tunable inner structural and chemical characters can spontaneously absorb water and promote water flow inside nanochannels to generate considerable electric energy (Fig. 1a). Especially, we implement a transfer learning (TL) strategy to address the complicated multi-parameters coupling optimization for the 2D-WEG with limited experimental data (Fig. 1b)36,37. Unlike previously reported single-parameter analyses, this TL strategy can provide uniform multi-parameter modeling and highly accurate performance prediction under a small experimental dataset, which is a high-efficiency route for the design of 2D-WEG. In consequence, the TL-optimized 2D-WEG generates ~2.9 V voltage or ~16.8 μA current (Fig. 1c). In addition, the 2D-WEG with favorable mechanical stability can be flexibly designed as a waterscape screen as well as the water power generation folding fan and building component to generate the considerable electricity. Moreover, the high electric output of ~12 V or ~83 μA is realized by connecting several 2D-WEGs in series or parallel to directly power commercial electronics like calculators, LED arrays, and display screens, demonstrating the potential of TL-empowered 2D-WEG for the development of water enabled clean energy system.

## Results

### Rotational freeze-casting method for the preparation of 2D-WEG

The 2D-WEG with long-range and integrated 2D nanochannels is prepared by using the rotational freeze-casting method (Fig. 2a). In this process, high-speed rotation treatment generates strong centrifugal force in the radial direction, which will cause GO sheets to align in the tangential direction of mold (X-axis direction)38. Meanwhile, the expelling force of the directional ice crystals from the bottom makes GO sheets arrangement along with the vertical direction (Z-axis direction)39. As a result, GO sheets can be assembled orderly and the long-range texture parallel to XZ 2D plane is formed along the long side of the employed mold (Supplementary Fig. 1). In contrast to previous porous graphene assemblies that have random structures or ordered distribution in only one dimension31,39, this GO assembled framework can arrange GO sheets parallel to the XZ plane, which forms the massive oriented 2D channels (Fig. 2b and Supplementary Fig. 2). X-ray tomography images and the 3D reconstruction results clearly show the 2D lamellar structure of GO sheets (Fig. 2c). Furthermore, the tortuosity value in the Y direction is ~4.5 that is much higher than the value (~1) in X and Z directions, further indicating the ordered arrangement of GO sheets in XZ plane (Fig. 2d and e)40. Meanwhile, the orientation degree of GO sheets can be regulated by controlling the rotational speed in the preparation of a 2D GO assembled framework (Supplementary Fig. 3), which is reflected in the X-ray diffraction (XRD) results. The full width at the half maximum (FWHM) of XRD patterns decreases with increasing rotational speed, indicating that the distribution of GO sheets is well-ordered (Fig. 2f)38. Additionally, the surface charge density of GO sheets will affect the water-enabled power generation process, which can be controlled with the modification of different polyelectrolyte molecules3,33. 
The introduction of polystyrene sulfonate (PSS), polyacrylic acid (PAA), and sodium alginate (SA) with sulfonic/carboxyl/hydroxyl groups can modulate the surface charge density of GO sheets as indicated by the decrease in Zeta potential from −31 to −83 mV (Fig. 2g), which provide the wide optimizable space for the material and device (Supplementary Figs. 4 and 5). Finally, the 2D GO assembled framework is tableted along the Y-axis direction and processed by the direct laser writing into the designed shape for the 2D-WEG packaging. Then, the electric conductive carbon electrodes are subsequently connected at the ends of the 2D GO assembled framework to collect the electric signal, and PDMS is coated on the outside to confine the physical space of the GO assembled framework and maintain the mechanical strength (Fig. 2h and Supplementary Figs. 6, 7). The mechanically stable and macroscopic 2D-WEG integrates massive 2D-oriented nanochannels, while the key parameters related to power generation can be easily adjusted in the preparation process.

### Water-enabled electric generation and transfer learning optimization

Because of the abundant oxygenated functional groups and uniformly massive 2D nanochannels, when water is supplied at one edge of 2D-WEG, water can be spontaneously absorbed and flowed inside the massive 2D nanochannels under the action of capillary force to generate electricity (Supplementary Fig. 8). The as-prepared 2D-WEG can produce an open‐circuit voltage (Voc) of ~0.73 V (Fig. 3a) and a short‐circuit current (Isc) of ~1.68 μA (Fig. 3b). The Voc and Isc of the 2D-WEG spontaneously last for hours without significant attenuation (Supplementary Fig. 9), because water can diffuse along the 2D nanochannels to the other edge of 2D-WEG and evaporate continuously to the air. Meanwhile, the rational PDMS package makes the 2D-WEG self-supporting and bendable, effectively avoiding damage due to stress concentration (Fig. 3c and Supplementary Fig. 10)41. Under different bending deformations, the Voc and Isc fluctuations of the 2D-WEG are within ±0.03 V and ±0.05 μA, respectively (Fig. 3d). The 2D-WEG with a serpentine form maintains structural stability at 100% tensile deformation and water can transmit along the designed path to produce electricity under the initial or stretching state. Also, the generated Voc and Isc only fluctuate by approximately ±0.04 V and ±0.07 μA, respectively, which shows stable power generation and mechanical flexibility in varied deformation environments (Fig. 3e).

The mechanism for water-enabled electricity generation of the 2D-WEG is proposed as follows. First, the oxygenated functional groups (e.g. carboxyl) on the GO sheets ionize after adsorbing water, which makes the 2D GO nanochannels negatively charged42. Meanwhile, positively charged ions in the water adsorb on the surface of the GO sheets to form the EDL3,24,27,42,43. Spontaneous water absorption at one edge and evaporation at the other edge result in directional water flow inside the 2D-WEG. With this directional transport of water, the positive ions in the EDL move towards the evaporation side under the shearing effect of the water flow, while negative ions can hardly pass through the negatively charged nanochannels, thus forming an electric potential difference between the two electrodes (Fig. 3f)11,12,44,45,46. Briefly, the electricity generation mechanism of the 2D-WEG can be summarized as a charge-separation-induced electric potential difference caused by directional water flow in the 2D nanochannels (Supplementary Figs. 11 and 12)3,42. The mechanism is therefore mainly based on the reported streaming potential5,11, in which liquid flowing through charged nanochannels with an EDL separates positive and negative charges and thereby builds up an electric potential difference. However, the electricity generation of the 2D-WEG involves the restrictive coupling of several processes, such as water absorption, directional flow, and evaporation, and thus goes beyond the streaming potential process of water transport alone20,46. More parameters that affect the performance of the 2D-WEG should therefore be considered systematically. 
For example, the liquid temperature (T) and the environmental relative humidity (RH) affect the water evaporation process; the ion concentration (C) in water and the Zeta potential ($$\zeta$$) of the 2D-WEG affect the EDL; and the device length (l), the nanochannel spacing (d), the structural tortuosity ($$\tau$$) of the 2D-WEG, and the driving pressure (P) affect the directional transport of water13,14,20,23,46,47,48. The performance of the 2D-WEG is therefore governed by the interplay of these parameters, which are restrictively coupled and involve substantial trade-offs, making the experimental exploration of optimal parameter combinations challenging. Recently, machine learning techniques have demonstrated an excellent ability to model complex systems36,37,49,50, but this ability depends on the volume of training data, and experimentally collecting large-scale data is complicated and time-consuming. TL is a machine learning method that uses knowledge from a related domain to improve the performance of machine learning models trained on insufficient data. Because 2D-WEG and streaming potential generation share a highly similar physical background (e.g., EDL formation and charge separation), we transfer the deep knowledge representation learned from the abundant streaming potential data to efficiently guide the design of the 2D-WEG with limited experimental data36,37. This TL strategy can greatly reduce the amount of experimental data required by reusing data with a similar background, thus significantly lowering the threshold for applying machine learning methods to performance optimization in complex 2D-WEG systems.

To apply the TL strategy, we first learn the knowledge representation in the streaming potential domain (Fig. 4a). The source model $$:\boldsymbol{\mathcal{X}}\to \boldsymbol{\mathcal{R}}$$ is a mapping from the parameter space $$\boldsymbol{\mathcal{X}}$$ of streaming potential to the generation performance space $$\boldsymbol{\mathcal{R}}$$, which consists of two components: Encoder and Decoder. The Encoder $$:\boldsymbol{\mathcal{X}}\to \boldsymbol{\mathcal{H}}$$ embeds the raw parameters $$\vec{\mathbf{x}}\in \boldsymbol{\mathcal{X}}$$ of the streaming potential into the latent space $$\boldsymbol{\mathcal{H}}$$ by using a 4-layer multilayer perceptron (MLP). The Decoder $$:\boldsymbol{\mathcal{H}}\to \boldsymbol{\mathcal{R}}$$ uses the knowledge representation $$\vec{\mathbf{h}}$$ in the latent space $$\boldsymbol{\mathcal{H}}$$ to predict the streaming potential performance $$\vec{\mathbf{r}}\in \boldsymbol{\mathcal{R}}$$ with a 2-layer MLP. The source model learns the relationship between the parameters of the streaming potential and the induced voltages and currents from the open dataset until convergence51 (Supplementary Fig. 13a). The 2D-WEG performance optimization model (opt-model) $$:\boldsymbol{\mathcal{P}}\to \boldsymbol{\mathcal{G}}$$ has three components: NormLayer, Encoder, and Decoder. 
The NormLayer $$:\boldsymbol{\mathcal{P}}\to \boldsymbol{\mathcal{X}}$$ is a learnable adaptive normalization layer that performs an element-wise affine transformation to accommodate the order-of-magnitude mismatch between experimental data and streaming potential data. Due to the similarity of the mechanism background, the Encoder $$:\boldsymbol{\mathcal{X}}\to \boldsymbol{\mathcal{H}}$$ of the opt-model is initialized using the weights of the Encoder of the source model, and is only fine-tuned during the training process. Finally, the Decoder $$:\boldsymbol{\mathcal{H}}\to \boldsymbol{\mathcal{G}}$$, a 3-layer task-specific MLP, is deployed at the top to accurately characterize the relationship between the hidden knowledge representation $$\vec{\mathbf{h}}\in \boldsymbol{\mathcal{H}}$$ and the generated Voc/Isc $$\in \boldsymbol{\mathcal{G}}$$. Overall, this TL-based opt-model for the 2D-WEG reuses the knowledge representation learned from the streaming potential data through weight sharing (see Methods section for full details about the TL framework). Furthermore, for opt-model training, an iterative optimization strategy is adopted to optimize the parameters of the 2D-WEG by using the latest experimental data (Fig. 4b)52. The initial stage of the iterative optimization strategy is to experimentally collect 2D-WEG data to train opt-models. Then, the learned opt-models are used to generate a series of candidate parameters that maximize the Voc or Isc. Subsequently, we design the 2D-WEG based on the candidate parameters and re-collect experimental data to refine the opt-model, which locally alleviates the inevitable over-fitting problem (see the Methods section for details about the iterative optimization strategy). 
These steps are repeated until there is no further improvement in the electricity generation performance, which takes about 30 cycles in 2D-WEG.
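The iterative optimization strategy described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the callables `train`, `propose`, and `measure`, the `patience` stopping rule, and the `max_cycles` cap are all hypothetical names introduced here. The text only states that the cycle is repeated until the output stops improving.

```python
from typing import Callable, List, Tuple

Params = Tuple[float, ...]
Sample = Tuple[Params, float]          # (device parameters, measured Voc or Isc)
Model = Callable[[Params], float]      # surrogate that predicts the output


def iterative_optimization(
    train: Callable[[List[Sample]], Model],   # hypothetical: fits an opt-model
    propose: Callable[[Model], Params],       # hypothetical: searches candidates
    measure: Callable[[Params], float],       # hypothetical: builds and tests a device
    dataset: List[Sample],
    patience: int = 3,                        # assumption: stop after 3 stale cycles
    max_cycles: int = 100,                    # assumption: hard safety cap
) -> Tuple[Params, float, int]:
    """Repeat train -> propose -> measure -> refine until no further improvement."""
    best_params, best_value = max(dataset, key=lambda sample: sample[1])
    stale = 0
    cycles = 0
    while cycles < max_cycles and stale < patience:
        model = train(dataset)                # (re)train on all data collected so far
        candidate = propose(model)            # candidate that maximizes the prediction
        value = measure(candidate)            # new experimental data point
        dataset.append((candidate, value))    # refine the dataset for the next cycle
        if value > best_value:
            best_params, best_value = candidate, value
            stale = 0
        else:
            stale += 1
        cycles += 1
    return best_params, best_value, cycles
```

In this sketch the measured value, not the model prediction, decides whether a cycle counts as an improvement, which mirrors the role of the re-collected experimental data in the text.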

As shown in Fig. 5a, when only limited experimental data are used to train the TL-based opt-model, the mean absolute error (MAE) of the Voc prediction is about 0.12 V, with a corresponding mean relative error53 of 15.0% and a correlation coefficient of 0.93. Similarly, the MAE of the Isc prediction is about 0.16 μA, with a corresponding mean relative error of 5.9% and a correlation coefficient of 0.96 (Fig. 5b), indicating that the TL-based opt-model can accurately predict the generation performance from the characteristic parameters of the 2D-WEG using limited experimental data. For the model trained from scratch without TL, the MAEs of Voc and Isc on the test set are 2.56 and 2.32 times as large, respectively (Fig. 5c and d), which further shows that the TL-based opt-model substantially mitigates overfitting on small-scale training sets (Supplementary Fig. 13b and c). More importantly, unlike the single-factor optimization previously adopted for water-enabled power generation systems, this TL-based optimization strategy can systematically model and search for the best combination of multiple parameters52. As indicated in the parallel coordinate plot output by the TL-based opt-model, the relationship between the power generation performance of the 2D-WEG and all the parameters (T, RH, C, $$\zeta$$, l, d, $$\tau$$, P) can be read directly (Fig. 5e), which is crucial for exploring optimal parameter combinations in multi-parameter coupled scenarios. As a result, the TL-based opt-model can use data with a similar background to reduce the amount of experimental training data required by machine learning methods, and can accurately learn the dependence of power generation performance on the 2D-WEG characteristic parameters. 
Meanwhile, the learned opt-model is able to deliver a clear and high-precision numerical model of the relationship between the influencing factors and Voc or Isc of 2D-WEG to guide 2D-WEG design and help to understand how key factors influence the energy generation process (Supplementary Note 1, Table 1 and Figs. 14–17).
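For reference, the three error measures quoted above can be computed as in the following minimal sketch. The text does not give their exact definitions, so two assumptions are made here: the mean relative error is taken as the mean of |true − predicted| / |true|, and the correlation coefficient is taken as Pearson's r. The function names are illustrative.

```python
import math
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute deviation, in the units of the target (V or uA)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_relative_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |error| / |true value| (assumed definition); requires nonzero targets."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (spread_t * spread_p)
```

Because the relative error divides by the measured value, samples with small outputs weigh more heavily in it than in the MAE, so the two measures need not rank models identically.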

### Optimized 2D-WEGs for scalable integration and applications

Benefiting from the accurate modeling and multi-parameter coupling optimization of the TL-based opt-model, a series of 2D-WEGs with voltage and current outputs tailored to requirements can be prepared in a controllable manner (Supplementary Fig. 18). As shown in Fig. 6a, we design and fabricate a 14.6 cm long 2D-WEG with matched parameters (d = 98 nm, $$\tau_{Y}$$ = 1.04, $$\zeta$$ = −83 mV, T = 30 °C, RH = 15%, P = 0 Pa, C = 0 M), which delivers a Voc of ~2.9 V, about 397% of (roughly four times) that of the as-prepared 2D-WEG (~0.73 V). Meanwhile, a high Isc of ~16.8 μA can be achieved by a 1 cm long 2D-WEG (Fig. 6b). Figure 6c further shows that the 2D-WEGs with featured parameters from the TL-based opt-model have higher power output at different Voc compared to previous water flow-induced electricity-generating devices (Supplementary Table 2)3,4,11,12,13,14,20,25,28,29,42,43,44,45,46,47,54,55,56,57,58. Additionally, the power output can be further scaled up by simple series and parallel connections of 2D-WEGs. An integrated device with five 2D-WEG units connected in series can produce a voltage of ~11.9 V, and a high current output of ~82.7 μA can be realized by connection in parallel (Fig. 6d). All these results confirm that integrating 2D nanochannels in the macroscopic 2D-WEG generates favorable electric power. Moreover, the developed TL strategy can guide the performance optimization of the 2D-WEG through accurate modeling and combinatorial multi-parameter output, which provides an efficient strategy for the development of macroscale water-powered systems.

To demonstrate practical applications, a waterscape screen containing 10 2D-WEGs has been explored (Fig. 7a), which can spontaneously absorb water from the bottom like natural plants and continuously generate electricity (~2.8 V and ~8.3 μA) to power 19 LEDs (Fig. 7b, c and Supplementary Movie 1). Benefitting from the designability of 2D-WEG based on the TL-based opt-model, a small architectural landscape constructed from a single 2D-WEG has been further developed (Fig. 7d). As shown in Fig. 7e and f, the electricity generated by this architectural landscape can support a series of scientific calculations on a commercial calculator by simply watering it (Supplementary Movie 2). In addition, according to the positive influencing factors suggested by opt-model, the kinetic energy of the fan is further used to promote power generation by integrating 2D-WEGs on the fan blade (Fig. 7g). The water at one side of 2D-WEG can be transported towards the other side in an accelerated manner (Supplementary Fig. 19). Then, the generated power stored in commercial energy storage equipment (Fig. 7g) is enough to drive a 4.2-inch electronic ink screen to play multiple animations (Fig. 7h and Supplementary Movie 3).

## Discussion

In summary, a macroscopic 2D-WEG integrated with massive 2D-oriented nanochannels is developed to generate considerable electricity from internal spontaneous water flow. A TL strategy is implemented to efficiently guide 2D-WEG design and to achieve multi-parameter coupling optimization and accurate performance prediction using limited experimental data. As a result, the mechanically flexible 2D-WEG can generate a high voltage of ~2.9 V or a current of ~16.8 μA, and its output can be adjusted in a controllable manner according to practical requirements. Simple series or parallel connection of 2D-WEGs raises the output to ~11.9 V or ~82.7 μA, respectively. Furthermore, diverse water-enabled electricity-generating systems have been developed, including the waterscape screen, the architectural landscape, and the 2D-WEG fan, which can power an LED array, a scientific calculator, and an electronic ink screen, respectively. The fabricated 2D-WEG and the TL optimization strategy demonstrate the potential for developing water-enabled clean energy systems.

## Methods

### Preparation of the 2D-WEG

GO dispersion was synthesized by using the modified Hummers’ method. To prepare the 2D-WEG with long-range and integrated 2D nanochannels, 100 mL of GO dispersion (5.5 mg mL⁻¹) was first mixed with 3 mL of PVA solution (1 wt%), 5 mL of ethanol, and the appropriate amount of polyelectrolyte molecules. The mixed dispersion was transferred to a homemade cuboid mold. After the mold had rotated (2000 r min⁻¹) around the Z-axis for 10 min, its bottom was slightly exposed to liquid nitrogen (−196 °C) while the rotation continued. After the GO dispersion was completely frozen, the GO-assembled framework was obtained by conventional freeze-drying. Subsequently, a tableting process was applied to set a controllable thickness along the Y-axis, and direct laser writing was used to cut the GO-assembled framework into the desired shape for device fabrication. Then, carbon paper electrodes were attached to the top and bottom of the GO-assembled framework using conductive silver glue. Finally, the device was carefully sealed with a pre-formed PDMS film to complete the 2D-WEG preparation: the PDMS was first pre-formed at 80 °C for 2 h, then tightly sealed around the GO-assembled framework and cured at 80 °C for 4 h.

Polystyrene sulfonic acid (PSS, M.W. 75,000) was purchased from Energy Chemical Co., Ltd. (Anhui, China). Polyacrylic acid (PAA, M.W. 450,000) was from Aikon International Co., Ltd. (Jiangsu, China). Sodium alginate (SA) was purchased from Shanghai Macklin Biochemical Co., Ltd (Shanghai, China). Polyelectrolyte solution was slowly added to GO dispersion under vigorous stirring and the mixture was subjected to sonication for 30 min to obtain homogeneous dispersions. The relative content of the composites can be easily adjusted by changing the volume of the polyelectrolyte solution.

### Electrical measurements

The electrical output signal of the 2D-WEGs was measured by using a Keithley 2612 multifunctional source meter. During the test, one end of the 2D-WEG was in contact with water while the other end was kept in a constant temperature and humidity chamber to ensure stable conditions. The electrochemical impedance spectroscopy analyses were performed on a CHI 660E electrochemical workstation (CH Instruments Inc.).

### Characterization

The morphology of the 2D-WEGs was investigated with a scanning electron microscope (SEM, FlexSEM 1000). Optical photos and videos were taken with a camera (SONY Alpha ILCE-7RM3). X-ray diffraction (XRD) patterns were recorded on a Bruker AXS D2 PHASER diffractometer with a Cu Kα irradiation source (λ = 1.54 Å). The 3D microstructure of the 2D-WEG was examined with an X-ray nano-tomography system (ZEISS Xradia 520 Versa), and the structural tortuosity was calculated from the X-ray nano-tomography images using the reported method40. Fourier-transform infrared (FTIR) spectra were recorded with a UATR Two FT-IR spectrometer. X-ray photoelectron spectroscopy (XPS) was performed on a PHI Quantera II (Ulvac-Phi Incorporation) photoelectron spectrometer with Al Kα radiation (1486.6 eV). Raman spectra were measured using a LabRAM HR Raman spectrometer (Horiba Jobin Yvon) with a 532 nm laser. Zeta potential was measured with a Zeta potential analyzer (Zetasizer Nano ZS90, Malvern, UK). Tensile and bending tests were conducted by using an Instron 5943 universal testing machine with a strain rate of 2 mm min⁻¹ for stretching. The laser (HGTECH LSU3EA) used for cutting was focused with an objective lens with a focal length of 170 mm.

### Datasets

The source domain data used in this work are from the streaming potential datasets built from experimental observations51. To refine the data coverage, we generated a series of fitted data as data augmentation based on the theoretical model59,60,61. The source domain streaming potential dataset contains a total of 81,253 samples. The 2D-WEG generation performance dataset of the target domain contains a total of 3,620 samples, which we collected experimentally by using a programmable constant temperature and humidity chamber.

### Transfer learning framework

The TL framework was implemented using PyTorch62. The source model $$:\boldsymbol{\mathcal{X}}\to \boldsymbol{\mathcal{R}}$$ is a mapping from the parameter space $$\boldsymbol{\mathcal{X}}$$ of streaming potential to the generation performance space $$\boldsymbol{\mathcal{R}}$$, which consists of two components: Encoder and Decoder. We implemented the components of the transfer learning framework using fully connected layers. The Encoder $$:\boldsymbol{\mathcal{X}}\to \boldsymbol{\mathcal{H}}$$ maps the raw parameters of the streaming potential into a hidden vector $$\vec{\mathbf{h}}\in \boldsymbol{\mathcal{H}}$$, and is implemented by a 4-layer multilayer perceptron (MLP) with 16, 64, [64, 128] and [64, 128] neurons per layer, where [64, 128] means that the number of neurons in the layer is randomly selected in the range of 64–128. The Decoder $$:\boldsymbol{\mathcal{H}}\to \boldsymbol{\mathcal{R}}$$ is a mapping from the knowledge representation $$\vec{\mathbf{h}}$$ in the latent space $$\boldsymbol{\mathcal{H}}$$ to the streaming potential performance $$\vec{\mathbf{r}}\in \boldsymbol{\mathcal{R}}$$, and is implemented by a 2-layer MLP, where the number of neurons per layer is [16, 64] and 2. All activation functions in the source model are ReLU. We use the source model to learn the relationship between the key parameters of the streaming potential and the output voltage, and define the loss function as the mean square error (MSE) between the ground truth and predicted values of the normalized voltage. 
A total of 200 source models were pre-trained. For each source model, 63% of the data from the dataset was randomly selected as the training set, and the remaining data were used as the validation set to verify the learned models during training. All parameters are initialized by using He initialization, and we optimize the source model parameters over the MSE loss with the Adam optimizer, using a mini-batch size of 16. The source model is trained for 500 epochs, with exponential learning rate decay from 10⁻⁴ to 10⁻⁶. To accommodate the volatility of the experimental data, random Gaussian noise with a standard deviation of 0.005 is added to the streaming potential or streaming current data during training to bring its distribution closer to that of the 2D-WEG generation performance data.
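The learning rate decay and the noise augmentation described above can be sketched as follows. This is an illustration under stated assumptions rather than the training code: the decay is assumed to be applied once per epoch, the noise is assumed to be added to the normalized targets, and the function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def exponential_lr_schedule(lr_start: float, lr_end: float, epochs: int) -> List[float]:
    """Per-epoch learning rates that decay geometrically from lr_start to lr_end."""
    # gamma is chosen so that lr_start * gamma ** (epochs - 1) == lr_end.
    gamma = (lr_end / lr_start) ** (1.0 / (epochs - 1))
    return [lr_start * gamma ** epoch for epoch in range(epochs)]


def add_gaussian_noise(
    targets: Sequence[float],
    std: float = 0.005,                      # standard deviation stated in the text
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a noisy copy of the targets; the input sequence is left unchanged."""
    rng = rng or random.Random()
    return [t + rng.gauss(0.0, std) for t in targets]


# Source-model settings from the text: 500 epochs, 1e-4 -> 1e-6.
schedule = exponential_lr_schedule(1e-4, 1e-6, 500)
noisy = add_gaussian_noise([0.10, 0.25, 0.40], rng=random.Random(0))
```

With these settings the per-epoch decay factor is about 0.991, so the learning rate falls by two orders of magnitude smoothly over the 500 epochs.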

The opt-model $$:\boldsymbol{\mathcal{P}}\to \boldsymbol{\mathcal{G}}$$ is a mapping from the parameter space $$\boldsymbol{\mathcal{P}}$$ of the 2D-WEG to the generation performance (Voc or Isc) space $$\boldsymbol{\mathcal{G}}$$, and has three components: NormLayer, Encoder, and Decoder. The NormLayer $$:\boldsymbol{\mathcal{P}}\to \boldsymbol{\mathcal{X}}$$ is a learnable auto-normalization layer at the bottom of the opt-model that accommodates the order-of-magnitude mismatch between 2D-WEG generation performance data and streaming potential data. It multiplies each parameter individually by a learnable weight and adds a learnable bias, and thus contains only 16 weights and 16 biases as parameters. The Encoder $$:\boldsymbol{\mathcal{X}}\to \boldsymbol{\mathcal{H}}$$ is a mapping from the renormalized parameter space $$\boldsymbol{\mathcal{X}}$$ to the latent space $$\boldsymbol{\mathcal{H}}$$, and maintains the same architecture as the Encoder of the corresponding source model. The Decoder $$:\boldsymbol{\mathcal{H}}\to \boldsymbol{\mathcal{G}}$$ maps the hidden knowledge representation $$\vec{\mathbf{h}}\in \boldsymbol{\mathcal{H}}$$ to the Voc/Isc $$\in \boldsymbol{\mathcal{G}}$$ of the 2D-WEG, and is implemented by a 3-layer MLP in which the number of neurons per layer is [32, 128], [16, 32] and 1, respectively. All activation functions in the opt-model are ReLU. Due to the similar physical mechanism background, the underlying features learned in the MLP layers at the bottom of the source model are reusable. We transfer the weights and biases of the Encoder of the source model to the Encoder of the opt-model as initialization parameters. 
The NormLayer and the Decoder are trained from scratch using experimental data, while the Encoder is only fine-tuned, with the initial learning rates of its four layers set to 10⁻⁶, 10⁻⁶, 3 × 10⁻⁶, and 5 × 10⁻⁶, respectively. Each opt-model is trained by randomly selecting 63% of the data from the 2D-WEG dataset as the training set; the remaining data are randomly divided into a validation set and a test set, which are used to evaluate the learned models during training and after convergence, respectively. During training, the learning rate of the NormLayer and the Decoder decays from 10⁻⁴ to 10⁻⁶, while the learning rate of the Encoder decays simultaneously.
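The structure of the opt-model and the weight transfer can be illustrated with a small forward-pass sketch. It uses only the Python standard library rather than PyTorch so that it is self-contained, and several points are assumptions not stated in the text: the NormLayer starts as the identity (weights 1, biases 0), no ReLU is applied to the final Decoder output, and all helper names are hypothetical.

```python
import copy
import math
import random

rng = random.Random(0)


def he_linear(n_in, n_out):
    """One fully connected layer with He-initialized weights and zero biases."""
    std = math.sqrt(2.0 / n_in)
    return {"W": [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out}


def make_mlp(sizes):
    """Stack of layers; sizes = [input, hidden..., output]."""
    return [he_linear(a, b) for a, b in zip(sizes, sizes[1:])]


def mlp_forward(layers, x, relu_on_last):
    for i, layer in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(layer["W"], layer["b"])]
        if relu_on_last or i < len(layers) - 1:
            x = [max(0.0, v) for v in x]      # ReLU
    return x


# Source model: 16 inputs -> Encoder (16, 64, [64,128], [64,128]) -> Decoder ([16,64], 2)
h3, h4 = rng.randint(64, 128), rng.randint(64, 128)
source_encoder = make_mlp([16, 16, 64, h3, h4])
source_decoder = make_mlp([h4, rng.randint(16, 64), 2])

# Opt-model: NormLayer (16 weights + 16 biases), transferred Encoder, new Decoder.
opt_model = {
    "norm": {"w": [1.0] * 16, "b": [0.0] * 16},   # assumption: identity at start
    "encoder": copy.deepcopy(source_encoder),      # initialized from the source model
    "decoder": make_mlp([h4, rng.randint(32, 128), rng.randint(16, 32), 1]),
}


def opt_forward(model, p):
    """NormLayer (element-wise affine) -> Encoder -> Decoder; returns one scalar."""
    x = [w * v + b for w, v, b in zip(model["norm"]["w"], p, model["norm"]["b"])]
    h = mlp_forward(model["encoder"], x, relu_on_last=True)
    return mlp_forward(model["decoder"], h, relu_on_last=False)[0]


prediction = opt_forward(opt_model, [0.1] * 16)
```

The deep copy matters: fine-tuning the opt-model Encoder then leaves the pre-trained source Encoder untouched, so the same source model can seed several opt-models.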

The inputs to both the source model and the opt-model are the eight characteristic parameters and their logarithmic transformations, i.e. $$\vec{\mathbf{x}}=[T,\,\log T,\,\mathrm{RH},\,\log \mathrm{RH},\,C,\,\log C,\,\tau,\,\log \tau,\,\zeta,\,\log |\zeta|,\,l,\,\log l,\,d,\,\log d,\,P+1,\,\log (P+1)]$$, and the output is the streaming potential or the generation performance data. The inputs and outputs of both the source model and the opt-model are normalized using the sample mean and sample standard deviation of the respective datasets, which are pre-calculated using the entire datasets and stored in metadata.json.
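A minimal sketch of how the 16-dimensional input vector and its normalization might be assembled is given below. Several details are assumptions, since the text does not specify them: the natural logarithm is used, the sample standard deviation uses the n − 1 denominator, and because log C is undefined at C = 0 M, a hypothetical floor `c_floor` is applied before taking the logarithm. The function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def build_features(T: float, RH: float, C: float, tau: float, zeta: float,
                   l: float, d: float, P: float,
                   c_floor: float = 1e-9) -> List[float]:
    """Eight parameters plus their log transforms, in the order given in the text."""
    log_c = math.log(max(C, c_floor))   # hypothetical guard for C = 0
    return [
        T, math.log(T),
        RH, math.log(RH),
        C, log_c,
        tau, math.log(tau),
        zeta, math.log(abs(zeta)),      # zeta is negative, so |zeta| is used
        l, math.log(l),
        d, math.log(d),
        P + 1.0, math.log(P + 1.0),     # the +1 shift keeps P = 0 Pa valid
    ]


def zscore_stats(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Column-wise sample mean and sample standard deviation."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(dim)]
    return means, stds


def normalize(row: Sequence[float], means: Sequence[float],
              stds: Sequence[float]) -> List[float]:
    """Standardize one row; constant columns map to 0 to avoid division by zero."""
    return [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]


# Parameter values quoted for the optimized device in the Results section.
features = build_features(T=30.0, RH=15.0, C=0.0, tau=1.04,
                          zeta=-83.0, l=14.6, d=98.0, P=0.0)
```

Pairing each raw parameter with its logarithm lets the network pick up both linear and power-law dependence without having to learn the logarithm itself.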

In the candidate optimal parameter search phase of the iterative optimization strategy, the predictions of opt-models are used as the objective function and the differential evolution algorithm is used to explore candidate optimal parameter combinations. The differential evolution algorithm has a differential evolution strategy of DE/rand-to-best/1/bin, a binomial crossover rate of 0.7, a mutation constant (scaling factor) of 1, a population size of 10,000, and the maximum number of generations for population evolution is 100,000. In the 2D-WEG reconstruction phase of the iterative optimization strategy, the optimal parameter combinations predicted by opt-models are used to prepare 2D-WEGs, and we recollect the power generation performance of 2D-WEGs with different structural parameter combinations. In the opt-model refinement phase of the iterative optimization strategy, newly collected experimental data are added to the 2D-WEG generation performance dataset, and the opt-model is retrained using the above training parameters.
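The candidate search can be illustrated with a compact differential evolution sketch. It follows one common formulation of DE/rand-to-best/1/bin (mutant = x_r0 + F·(x_best − x_r0) + F·(x_r1 − x_r2), binomial crossover, greedy selection). Whether the authors' implementation matches this formulation exactly is an assumption, as are the clipping of trial vectors to the bounds and the toy objective, which stands in for an opt-model prediction. The population size and generation count are reduced far below the 10,000 and 100,000 stated in the text so that the example runs quickly.

```python
import random
from typing import Callable, List, Sequence, Tuple


def de_rand_to_best_1_bin(
    objective: Callable[[Sequence[float]], float],   # value to MAXIMIZE
    bounds: Sequence[Tuple[float, float]],
    pop_size: int = 40,
    generations: int = 200,
    F: float = 1.0,       # mutation constant from the text
    CR: float = 0.7,      # binomial crossover rate from the text
    seed: int = 0,
) -> Tuple[List[float], float, List[float]]:
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    history = [max(fit)]                              # best value per generation
    for _ in range(generations):
        best = pop[max(range(pop_size), key=fit.__getitem__)][:]
        for i in range(pop_size):
            r0, r1, r2 = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)               # forces at least one mutated gene
            trial = pop[i][:]
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = (pop[r0][j] + F * (best[j] - pop[r0][j])
                         + F * (pop[r1][j] - pop[r2][j]))
                    lo, hi = bounds[j]
                    trial[j] = min(max(v, lo), hi)    # assumption: clip to bounds
            f_trial = objective(trial)
            if f_trial >= fit[i]:                     # greedy selection
                pop[i], fit[i] = trial, f_trial
        history.append(max(fit))
    i_best = max(range(pop_size), key=fit.__getitem__)
    return pop[i_best], fit[i_best], history


def toy_surrogate(x: Sequence[float]) -> float:
    """Hypothetical stand-in for an opt-model prediction, peaking at (3, 7)."""
    return -((x[0] - 3.0) ** 2 + (x[1] - 7.0) ** 2)


best_x, best_f, history = de_rand_to_best_1_bin(
    toy_surrogate, bounds=[(0.0, 10.0), (0.0, 10.0)], pop_size=20, generations=30)
```

Because selection is greedy, the best objective value in the population never decreases from one generation to the next, which makes `history` a convenient convergence check.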