Flow and detailed 3D morphodynamic data from laboratory experiments of fluvial dike breaching

This paper presents a dataset obtained from fifty four laboratory experiments of the breaching of fluvial dikes due to flow overtopping. Data were collected on two complementary experimental setups, each consisting of a main channel representing the river, an erodible lateral dike and a floodplain. The dataset covers seven test series, involving varying hydraulic boundary conditions (e.g. inflow discharge, downstream boundary conditions), main channel dimensions, as well as bottom and dike material. The following experimental data were produced: time series of water levels in the main channel, time series of flow discharges in the main channel and through the breach, as well as high resolution 3D reconstructions of the breach geometry during its expansion. The latter measurements were performed using a novel non-intrusive laser profilometry technique developed for this research. Reuse of the collected data will support efforts to improve our understanding of the physical processes underpinning fluvial dike breaching. It will also enable benchmarking the accuracy of conceptual or detailed numerical models for the prediction of dike breaching, which is central to flood risk management.


Background & Summary
Fluvial dikes, i.e embankment levees, along river banks, are built to channelize the flow and limit lateral riverbed migration, or as defence structures against floods. Dike failure during a flood event may lead to casualties and major damage in the protected areas, which can be more severe than that of equivalent floods without the presence of dikes. Understanding the dike breach processes and the potential consequences of dike failure induced floods is a major concern, as sustained demographic and economic development is witnessed in dike-protected areas 1 and these areas can be the location of highly sensitive and strategic assets. However, despite the major relevance of the topic for public safety, and despite several leading research groups and engineering entities seeking the development of reliable dike breaching models, major uncertainties regarding the physical processes are still hampering the proper consideration of fluvial dike breaching for flood risk analysis and management 2,3 .
Frequent exposure to flow actions (e.g. high or rapidly varying water levels, piping, and riverside erosion), lack of maintenance, inadequate rehabilitation works and animal burrows 4 contribute to dike weakening and increase the risk of structural failures. Different dike failure mechanisms can be identified, which can be categorized as 5,6 : (i) Instability, referring to defects on dike structural integrity when soil particle movement active strengths exceed resistant strengths 7 , (ii) Internal erosion through the dike and/or foundation body due to seepage flows, (iii) External erosion, which broadly encompasses mechanisms involving wearing of the structure surface due to floods, waves, wind, or any other natural process 8 . Flow overtopping, generating external erosion, is identified as the most frequent cause of dike failure [9][10][11] . Dike overtopping occurs if the river discharge exceeds the design value of the dike during a flood event, or, broadly, if the water level exceeds the dike crest or the flow overtops a weak dike segment.
The flow over the dike crest can trigger external erosion, eventually leading to breaching of the dike and catastrophic damage on the adjacent floodplain. Therefore, the breach growth dynamics and subsequent breach flow hydrograph have to be accurately estimated as they are key inputs for flood risk analysis and management. The present dataset is the outcome of a research project aiming at improving our understanding of the physical processes underpinning breach expansion and breach hydraulics in the case of fluvial dike failures 3,12 . A comprehensive test program was conducted (54 tests) on two experimental models, shedding light on dike breaching dynamics in various realistic configurations, including varying: (i) channel flow conditions, (ii) channel dimensions, (iii) floodplain confinement level, (iv) bottom erodibility, and (v) dike material composition. Each test was densely monitored using a 3D continuous reconstruction (i.e. digital elevation model) of the breach geometry, which allows a full depiction of the breach expansion stages. Water levels in the main channel were measured, allowing the estimation of the flow discharge through the breach.
The present dataset 13 complements and comprises findings, counting datasets, discussed in two previous research papers: (i) the experiments conducted by Rifai et al. 3 which focuses on the channel flow condition effects on dike breaching and (ii) the work conducted by Rifai et al. 12 investigating tailwater effects, i.e. water level increase on the floodplain side, on breach expansion dynamics. Details of the experimental method and scientific analyses of the present dataset were also presented in the PhD thesis of Rifai 14 .
On one hand, the generated dataset enables a thorough understanding of the mechanisms involved in breach expansion, and, on the other hand, it constitutes a sound database for testing modelling hypotheses and validating numerical models.

Methods
Experimental models. Two laboratory facilities were designed and built (Fig. 1). Experiments consisted in provoking the overtopping of the main channel flow over the dike crest at a designated spot (i.e. initial notch) and observing subsequent dike erosion and breach expansion. The first experimental model (referred herein as ULiège model) was built at the Engineering Hydraulics Laboratory of University of Liège (Figs 1a and 2). The second one (i.e. LNHE model) was built at the National Laboratory for Hydraulics and Environment of EDF-R&D (Figs 1b and 3). These two facilities differ in the main channel dimensions (width, length) and the dike length. The experimental models were intended to be complementary in their design, which allowed conducting various test series while investigating different parameters. Each model consisted of a main straight channel with a long side opening toward a floodplain. This side opening was closed with sand, representing a trapezoidal shaped fluvial dike separating the main channel and the protected floodplain. In all tests, the dike was h d = 0.3 m high, the crest was l dc = 0.1 m wide and the inner and outer dike face slopes were S i = S o = 1:2 (V:H); the bottom dike width was The bottom of the main channel and the floodplain were at the same level and can be made of erodible material.
The inflow discharges Q i , into the main channel was kept steady throughout each test. In the ULiège model, three different regulating systems were tested at the main channel outlet: (i) perforated plane, (ii) rectilinear weir, and (iii) sluice gate. Each regulating system was calibrated to insure that the main channel water surface, prior to breaching, was at the crest level for the considered inflow discharge Q i . Therefore, for each test the number of www.nature.com/scientificdata www.nature.com/scientificdata/ holes in the perforated plane, the weir crest level, or the sluice gate level were adjusted. In the LNHE model, only a perforated plane was used.
To ensure dike stability prior to overtopping, the seepage flow was limited by installing a drainage system at the dike bottom (see sections A-A and B-B in Fig. 2, section A-A in Fig. 3). The drainage system consisted of a 4 cm-thick layer of dike material wrapped in a geotextile that was placed on a coarse grid. To trigger overtopping, a 0.02 m deep, 0.1 m wide initial notch was cut in the dike crest at 0.85 m and 2.55 m from the dike upstream end in the ULiège model and LNHE model, respectively (Figs 2 and 3). In the ULiège model, the main channel was L mc = 10 m long, l mc = 1 m wide; the erodible dike was L d = 3 m long and the floodplain was 4.3 m × 2.5 m. In most tests the flow through the breach was discharged freely from the floodplain without any storage change nor tailwater effects ( Fig. 2a and Online-only Table 1, test series A, B, D and E). In some tests the floodplain was confined with wood panels to prescribe a predefined floodplain water level z fp ( Fig. 2c and Online-only Table 1, test series C). Two filling non-cohesive, uniform materials were tested for the dike construction, namely uniform coarse sand with median diameter d 50 = 1 m or 1.67 mm ( Table 1, Materials 1 and 2). For selected tests the water surface in the main channel was controlled by different downstream regulating systems (perforated plane, rectilinear weir, sluice gate). www.nature.com/scientificdata www.nature.com/scientificdata/ The LNHE model channel was L mc = 17.9 m long and allowed adjusting the channel width l mc ; three values were tested, namely 1.4 m, 1.8 m and 2.25 m. The channel side opening, i.e. dike length, was L d = 7 m long and the free floodplain 7 m × 1.7 m. Different dike materials were tested, namely Material 1 (i.e. uniform coarse sand) and three other heterogeneous compositions consisting of Material 1 mixed with fine sand (Material F) of d 50 = 0.24 mm. These three mixtures were made according to the volumetric fine sand ratios of 10%, 20%, and 30%, respectively ( Table 1, Materials F1-F3). Finally, the LNHE model allowed conducting tests with erodible channel and floodplain bottoms, composed of the same material as the dike (Online-only Table 1, test series G); the moveable bed layer was 0.10 m thick.
Online-only Table 1 gives an overview of the overall test program, whereas Table 1 summarizes properties of the tested dike materials. In total, 54 tests were performed, 30 tests on the ULiège model and 24 on the LNHE model. Tests can be categorized in 7 series: • Series A focuses on the effect of the inflow discharge Q i , • Series B focuses on the effect of the main channel downstream boundary condition, • Series C investigates the effect of the floodplain confinement, • Series D investigates the effect of the grain size of the dike material, • Series E focuses on the effect of the main channel size, • Series F investigates the effect of fine sediments and associated apparent cohesion, • Series G highlights the effect of bed changes in the channel and floodplain.
Test procedure. Dike construction consisted in stacking and compacting the material in the channel side opening. Compaction was performed manually to avoid any structural defects. A template of the trapezoidal cross-section was then swiped along the longitudinal axis (i.e. x-axis) to shape the dike and remove excess material before setting the initial notch over which flow overtopping was initiated. At the test start, the main channel was filled progressively with a discharge Q i0 equal to about 75% of the nominal test inflow discharge Q i . For tests with a rectilinear weir, Q i0 ≈ 0.5 × Q i to avoid overtopping of the initial notch during the "filling phase". The inflow was provided by pumping from a tank, which formed a closed circuit with the experimental model. The inflow discharge Q i was adjusted via a frequency converter and using valves. Once the flow stabilized, the dike and drainage system were inspected. The inflow discharge was then increased to the test nominal value Q i . The main channel water level increased up to overtop the dike crest at the initial notch triggering subsequent external erosion and dike breaching.
The material was reused after each test except of test series F which were conducted using fine sand and coarse sand mixtures. To check the validity of such reuse, the material grain size distribution was measured from three samples: one from the original material (supplier record), one from the dike before conducting the test, and one after conducting 20 tests. Reuse of the homogeneous material was deemed consistent as no significant grading was observed.
Water level measurements. Water levels were recorded continuously at fixed locations (G1 to G6, Figs 2 and 3) using ultrasound sensors. Two sensors, mic + 35/IU/TC and mic + 130/IU/TC by Microsonic, were used depending on the range of water elevation variability.
Flow discharge calculation. The inflow discharge Q i was measured using an electromagnetic flowmeter.
The main channel outflow discharge Q o was estimated from the discharge passing through a measuring weir Q o,mw , using a V-notch weir in the ULiège model and a trapezoidal weir in the LNHE model. Q o was evaluated from the following mass balance in the control volume between the main channel downstream regulating weir and the measuring weir: with t the time, A G4 the water surface area of the control volume, and z w,o the water level measured at gauge G4. The breach discharge Q b was determined using the same approach (i.e. prismatic routing) [15][16][17][18][19] , adapted to the control volume of the main channel: ) a weighted average of water levels z G1 , z G2 , and z G3 at G1, G2, and G3, respectively, with A G1 , A G2 , and A G3 as main channel surface areas associated with G1, G2, and G3, respectively (Figs 2 and 3). On the ULiège model, the drainage discharge Q d was computed by a prismatic routing on the drainage reservoir placed under the dike, i.e. Q d = A G5 × dz w,dr /dt, with z w,dr the water level in the reservoir tank bellow the dike, and A G5 the reservoir surface area. When the drainage reservoir was full, a drain valve was opened. During the tank emptying phase, measurements were not collected. The drainage discharge was expected to have small variations once the breaching is established. Thus, linear interpolation between measurements was deemed satisfactory to estimate Q d . For the LNHE model, continuous measurements of Q d were not feasible due to the limited storage capacity of the drainage tank. Q d was deduced by prismatic routing at the test start only, and was considered constant for the test duration.

Dike 3D reconstruction. Dike breaching was monitored by a laser profilometry technique (LPT), a
non-intrusive method consisting in sweeping a laser plan (emitted by a laser line projector) over the dike and processing the images recorded by a single video camera located above the dike. The reconstruction relies on the Direct Linear Transformation (DLT) algorithm for camera calibration, accounting for optical and decentring distortion 19 . Three-dimensional (3D) object coordinates of the laser profiles on recorded frames were deduced using the DLT and the laser plan position in the object coordinates. Subaqueous laser profiles (i.e. laser profiles incident on submerged areas) were affected by a bias due to light refraction at the free surface. This bias was corrected assuming an abstract prismatic representation of the water free surface and using Snell-Descartes law 20 . Combining all generated profiles of one laser sweeping, a 3D referenced point cloud of the dike was constructed. One laser sweeping lasted between 1.5 to 5 seconds. The laser sweeping frequency was set depending on the breaching stage, with rapid sweepings at fast breach expansion stages and prolonged sweeping for slower stages. Each point cloud resulting from one sweeping was considered as an instantaneous representation of the dike geometry. Filtering was applied to the final point clouds to remove outliers and to reduce noise. The final point clouds were then projected onto a 1 cm × 1 cm grid. Additional details on the LPT are presented by Rifai et al. 21 and Rifai 14 .
The laser profilometry setups were similar on both the experimental models. The video recording was performed with similar cameras set on a Full HD resolution (1920 × 1080 pixels) at 60 frames per second. The laser sweeping on the ULiège model was performed by a rotating laser projector (Fig. 4a). On the LNHE model, two laser projectors were used to cover the full 7 m dike length. The laser projectors were fixed to an automated sliding rail system (Fig. 4b).

Data Records
Data structure. The present dataset 13 is structured as presented in Online-only Table 1. Data corresponding to each test is stored in a directory named according to the test number. The file "tree.txt" presents a depth-indented listing of test directories and data stored within. Each test directory comprises: • "<test name>_data_structure.txt" file which presents the tree structure and content of the test data full file "<test name>.h5" (see 'Supplementary File 1' for an example of such file, and Fig. 5 for a sketch of the corresponding data structure for Test 1), • "<test name>.h5" file containing the test full data. The data file is under the HDF5 format (Hierarchical Data Format 5). HDF5 libraries exist for different programing languages (e.g. Python, Matlab, Fortran, C, and R). More details can be found on the HDF5 website (https://support.hdfgroup.org/HDF5/), • "<test name>_Exp_config.txt" file which summarises the experimental configuration and test boundary conditions. This file allows an easy reading of the experimental configuration data stored in the HDF5 file, www.nature.com/scientificdata www.nature.com/scientificdata/ Data stored in the HDF5 files is structured flowing a directory, subdirectory, and file tree structure, corresponding to groups, subgroups and datasets, respectively (https://support.hdfgroup.org/HDF5/). The data is organized under four groups, namely: (i) experimental configuration, which summarizes the model dimensions and test boundary conditions, (ii) hydraulic variables, including, the measured water levels at different gauging locations as well as discharges, (iii) raw 3D point cloud reconstructions, and (iv) rasterized reconstructions. www.nature.com/scientificdata www.nature.com/scientificdata/ In addition, a Python script, "read_datah5.py", is provided to serve as an example of data retrieving and display from the HDF5 files. The script can be used to start processing and analysis of data. The "Extract_example. mp4" video is a screen recording showing how the retrieving and display of the data, using the provided Python script, works. The videos "Test_6.mp4" and 'Test_49.mp4" are stop-motion videos recording during the experiments for tests conducted on the ULiège model and LNHE model, respectively.
Notations and units. Datasets in the HDF5 are labelled following the physical variable name. Table 2 lists names of variables and associated units.

technical Validation
Hydraulic variables. Water levels were measured using ultrasound probes. The probe measurement accuracy was ±1%. Record frequency was set to 100 measurements per second for the ULiège model and 10 measurements per second for the LNHE model. Measurements were then processed and outliers were deleted based on a moving average and standard deviation thresholding. For the sake of visibility and further processing, the measured signals were smoothed using a low-pass filter, namely the Savitzky & Golay 22 filtering function of Matlab 2014b (sgolayfilt). The polynomial order and the frame length parameters of the functions were adjusted depending on the measurement frequency. The use of the Savitzky & Golay 22 filter better preserves the distribution features (i.e. position and width of the extrema, area below the curve) in comparison to averaging techniques 22 .
Water level measurements were used to compute the channel outflow, breach, and drainage discharges (cf. flow discharge calculation). The channel outflow discharge Q o at the regulating weir was deduced from the water level measured at G4 and the rating curve of the measuring weir (Eq. 1). The rating curve of each measuring weir was calibrated and corresponding adjustment formula were proposed (Fig. 6). A comparison between the computed outflow discharges Q o and those deduced from the rating curve of a calibrated perforated plane showed a maximum deviation below 10% of the test inflow discharge Q i , whereas the root mean square error represented 3.5% of Q i .
Mass balance on the main channel enabled the estimation of the breach discharge Q b (Eq. 2), with the inflow discharge Q i measured at the inflow pipe with an electromagnetic flow meter of ±4% accuracy. The water surface rippling induced physical disturbances on the water level measurements. The breach discharge Q b computed from Eq. 2 was therefore smoothed using the Savitzky & Golay 22 filtering function. Dike geometry reconstructions. Two additional 3D reconstructions were conducted with a laser scanner (FARO Focus-3D) to assess the overall validity of the LPT reconstructions. Scans were done on the initial and final dike geometries in the LNHE model (Fig. 7a-d). Each scan required up to 30 minutes and placing the laser scanner at two different locations and merge the generated point clouds to reduce blind spots. Figure 7e,f show differences between the laser scanner and LPT methods. Blind spots uncovered by the LPT were the main source of discrepancies between the reconstructions.
Other tests were conducted on the ULiège model with a plastered partially breached dike to assess the validity of the refraction correction module on a dike breaching experiment and the validity of the synthetic prismatic water surface assumption. Noticeable improvement of the results were obtained compared to the non-corrected reconstructions, although the refraction correction was slightly overestimated in some areas.
Other sources of uncertainties in the LPT reconstruction scheme include 21 : (i) calibration of the camera based on the location of image calibration points as well as (ii) physical calibration points, (iii) extraction of the laser profiles by image filtering and processing, (iv) localization on the images of the intersection of the projected laser sheet with the laser identification axis, and (v) measurements and approximation of the main channel water surface levels used in the refraction correction steps. These uncertainty sources were quantified and propagated on     Fig. 8 show the standard deviation of reconstruction uncertainties. It can be noted that the highest uncertainties come from the refraction correction as, in the partially submerged cases (Fig. 8a,c), the highest uncertainties are located in the submerged areas (y > 0.7 m and in the breach). In the non-submerged areas, uncertainties top at σ z ≈ 0.02 m and ≈0.01 m on the ULiège model and LNHE model, respectively. Data usage caution. Note that, out of the 54 tests conducted, five of them (Tests 6, 37, 41, 42, and 54) have incomplete data. In Test 6, correction for refraction could not be performed. In Test 37, the camera did not operate between t = 1590 s and 3570 s. During Test 41, the ultrasound sensors did not operate between t ≈ 1000 s and 4600 s, liner interpolation of the water level time series was deemed sufficient in this case. Interpolated water level time series were then used for discharge calculation and water surfaces used for refraction bias correction. In Test 42, data from the ultrasound sensor G2 could not be processed. Finally, in Test 54, with the highest inflow discharge combined with an erodible bottom, the dike geometry reconstruction was affected by strong free surface variations due to flow turbulence.

Usage Notes
This dataset is intended to bring insight into fluvial dike breaching processes and dynamic under various conditions. This dataset can be used to improve standardly used conceptual or numerical models to enable a comprehensive grasp of flow and erosion characteristics occurring throughout the different stages of the breaching process in a fluvial configuration (e.g. effects of saturation and apparent cohesion, secondary currents adjusted to abrupt flow deviation).
Another use of the dataset can include derivation of a simplified proxy representation of the breach geometry and expansion processes. This is relevant to practitioners as it can represent a more efficient implementation of dike breaching process in large scale numerical models used for flood risk assessment studies.

code Availability
The laser profilometry technique algorithm was developed on Matlab 2014b. Further details on the geometry 3D reconstruction and the algorithm structure are given by Rifai et al. 21 and Rifai 14 .