Repeated giant earthquakes on the Wairarapa fault, New Zealand, revealed by Lidar-based paleoseismology

The Mw 7.8 2016 Kaikoura earthquake ruptured the Kekerengu-Needle fault resulting in the loading of its eastern continuation, the Wairarapa fault. Since the most recent earthquake on Wairarapa occurred in 1855 and is one of the strongest continental earthquakes ever observed, it is critical to assess the seismic potential of the Wairarapa fault, which might be prone to break. Using Lidar data, we examine its bare-earth morphology and reveal ~650 mostly undiscovered offset geomorphic markers. Using a code we developed in earlier work, we automatically measure the lateral and vertical offsets of these markers providing more than 7000 well constrained measurements. The data document the lateral and vertical slip profiles of the 1855 earthquake for the first time and show its total slip reached ~20 m at surface. Modeling the entire offset dataset reveals 7 prior earthquakes ruptured the entire fault, each similarly producing 16.9 ± 1.4 m dextral slip and ~0.6 m vertical slip at surface in the same central bend zone of the fault. Thus, the Wairarapa fault repeatedly produced giant earthquakes and is likely able to produce a similarly strong forthcoming event. The extreme large size of the Wairarapa earthquakes questions our understanding of earthquake physics.

thus identified in each polygon, through which 3D best-fit lines are calculated and projected onto the dipping fault plane. Lateral and vertical offsets are calculated (with their total uncertainties) from the relative position of the best-fit line piercing points on either side of the fault (details in Methods, and in Stewart et al., 2018). The corresponding measurements are listed in Supplementary Table 1, and those retained in analysis indicated below the figure (the others are ignored for they are not well constrained or not geomorphically relevant). (c): example of 3D view of identified points characterizing one of the marker edges. Green for Max, red for top, blue for Mid, and yellow for base of riser scarp.  Figure 4A: The graphs report all the published and precisely located field offset data we found in the literature (see references in Supplementary Table 1 and in text). Overall, ~70% of our PREF and OPT lateral and vertical offsets are consistent with the prior field measures. Figure 4B: In a non-published GNS report, Villamor et al. (2008) compiled about 100 additional lateral offsets measured on the field (plot (a) which is Fig.4A from Villamor et al. (2008)). Yet the offset markers are not located precisely and the uncertainties on the offset measures are not provided. To gain insight on these prior field measures, we digitalized the graph (a). The recovered approximate lateral offset values and their along-fault locations are reported on the graph (b) (along with the best constrained field measures from our Fig. 4A). For comparison, the lateral offsets measured in the present study are shown alone in plot (c). Plot (d) shows all prior and present lateral offsets together. Plot (e) allows a better comparison between prior and present lateral offset data: it shows that most prior lateral field offsets fall within one or other of the lateral offset clusters we identified and modeled in the present study (model from Fig.3b).

Supplementary Figure 5: Statistics of the offset measures
(a-top) Distribution of the PREF and OPT lateral offsets across the main and secondary faults. Note that the few lateral offsets greater than 260 m are not presented for clarity. (a-bottom) Distribution of the PREF and MID vertical offsets across the main and secondary faults. Note that the few vertical offsets greater than 20 m are not presented for clarity. OPT vertical offsets are not shown for they generally are not representative of the actual vertical offset: the vertical offset indeed greatly varies among the geometric features of a marker (the riverbed is especially eroded with a lower apparent vertical offset), and therefore the OPT mean value is not meaningful. The MID geometric feature has been shown to best record the actual vertical offset. See more detailed explanations in Stewart et al., 2018. (b-top): Uncertainties on the PREF and OPT lateral offsets.   , 2016). Average generic envelope shape is in yellow. (b) Variable degree of asymmetry revealed in the earthquake slip profile populations shown in (a); while most slip profiles are asymmetric, some are fairly symmetric, yet still triangular overall. Average generic envelope shape is in yellow. (c) Additional slip-length profiles of continental and subduction earthquakes, that were not included in plots in (a) and (b) above. Some profiles are asymmetric, others are more symmetric. is represented in X axis with its half value (for the example above: 15 km). On the 2 top graphs, red is mean value of WRMS or AICC. In third graph, red is 1855 earthquake slip. The test shows that, whatever the range of Ls, the position of largest slip is found stable (provided the Ls range contains it), between 30 and 45 km, and more precisely at about 37-39 km. The best model in terms of indicators is for the Ls range [30-45 km].  Models run with different slip increment amplitudes, indicated in bold in the graphs. Apart from the slip increment amplitude, models are run with the conditions reported on top of the graphs (a) Statistics of the models. On the 2 top graphs, red is mean value of WRMS or AICC. In third graph, red is 1855 earthquake slip.

Supplementary
It first needs to be noted that no continental earthquake is known to have produced more than 18-19 m of slip at surface (see Rodgers and Little, 2006;Manighetti et al., 2007). Therefore slip increments provided as inputs to the model cannot be much greater than 20 m. The test shows that realistic 1855 and prior earthquake slips are only produced with slip increment ranges from [0-18m] to [0-24m], while the model indicators are better for the larger ranges. The best compromise is found for the slip increment range [0-22 m], as the later does not produce any earthquake slip greater than 20 m, as does the range [0-24 m]. Models run with different rupture length ranges either side (total rupture length indicated in bold in the graphs). Apart from the length range, models are run with the conditions reported on top of the graphs (a) Statistics of the models. On the 2 top graphs, red is mean value of WRMS or AICC. In third graph, red is 1855 earthquake slip. The test shows that, the longer the rupture, the better the indicators of the models. However, this is mainly related from the longer rupture-models integrating more data points. It is therefore difficult to determine the actual rupture length. The model statistics shows however that the modeled 1855 earthquake slip abruptly drops down for rupture length greater than 110 km. This might suggest it is unlikely that the total rupture length be greater than 110 km. Data from Supplementary Table 1 (a) Lateral slip profiles. In all graphs, the profiles show the lateral offset data which have been assigned to one or other of the 8 modeled functions (from Figs.3b and Supplementary 10; only quality 1 and 2). Therefore, the first, lower profile in red is the 1855 earthquake lateral slip profile, while the above profiles are cumulative. The corresponding earthquakes are indicated. The vertical lines highlight the major segments (from Supplementary Fig. 1) which seem to shape the lateral slip distributions, especially the 1855 one. The maximum and mean slips of the 1855 earthquake are indicated (derived from the slip data in red). (b) Vertical slip profiles. The lateral offset data characterizing each earthquake being identified in (a), we could isolate the corresponding vertical slip data. Those are plotted in (b). Therefore, the first, lower profile in red is the 1855 earthquake vertical slip profile, while the above profiles are cumulative. The corresponding earthquakes are those indicated in (a). The vertical lines are as in (a). The mean vertical slip of the 1855 earthquake is indicated (derived from the slip data in red), while the mean cumulative vertical slips of the prior earthquakes are also indicated (derived from the data shown).  Table 1 (b) Vertical to lateral slip ratios for the 5 most recent earthquakes. The lateral and vertical slip data are those from Supplementary Fig. 17. Therefore, the first, lower plot in red shows the vertical to lateral slip ratio along the 1855 earthquake rupture, while the above graphs show the cumulative vertical to lateral slip ratios along the prior earthquake ruptures. The corresponding earthquakes are indicated in (b). The vertical lines are as in Supplementary Fig.  17. The mean vertical to lateral slip ratios are indicated. (b) Total, composed slip profiles of the 5 most recent earthquakes. The lateral and vertical slip data are those from Supplementary Fig. 17. They have been composed to produce the total slip profiles. The first, lower profile in red is the 1855 earthquake total slip profile, while the above profiles are cumulative total profiles. The corresponding earthquakes are indicated. The vertical lines are as in Supplementary Fig. 17. The maximum total slips are indicated.

Supplementary data spreadsheet (excel file): the ~15, 800 offset measurements provided in this study
Markers are named (column A) in reference to the coordinate of their northern piercing point in the Lidar data reference. Note however that, during the measurement process, a few changes were made to some of the marker traces while their initial name was not changed. Markers are located on Supplementary Fig.1 with the color indicated in column E. In column A, markers measured in 1 m airborne Lidar data are in white background, while markers measured in 30 cm TLS Lidar data are in orange background. Some markers have both measurements. Prior field offset measures are indicated in blue. Distance (column B) is measured along major fault trace from its westernmost end in Lidar data. For secondary faults, distance is projected on main fault trace. Maximum error on distances is ~20 m. Major fault segments (column C) are derived from fault mapping, and reported on Supplementary Fig.1. Column D discriminate markers offset across major fault trace (orange background color in column G) or secondary faults (blue background color in column G). Names of secondary faults are not reported in Supplementary Fig.1 for clarity. Marker brief description is provided in Column F, with additional comments in Column H. Fault, especially sense of dip at surface, is commented in Column I. Quality rating in Column G is based at this stage on preservation of the marker and clarity of its lateral offset. This quality is updated later in Columns BX and BZ. Columns J to AY are output measurements from the 3D-Fault-Offset code; the P-AG columns are the lateral offset measures, and the AH-AY columns are the vertical offset measures. See Stewart et al. (2018) for more explanations. Note that, because the WP fault dips towards the NW, the convention for the manual tracing of the fault line results in a rotation of the original DEM and output measures. Therefore, any "west" and "east" in the code output results refers to actual East and West, respectively, while the output "north" and "south" are actual south and north, respectively. The output sign of the vertical offsets thus needs to be multiplied by (-1) to obtain the actual sense of vertical slip (positive where north up). In the J-AY columns, the grey values are those which are not well constrained or not geomorphically-relevant: they are ignored from the analysis, and only the black values, geomorphically-relevant, are retained. Note however that there are a few cases of complex markers for which discriminating geomorphically-relevant offsets is difficult; for those cases we kept all or most of the code measures so as to have a statistically-constrained offset. The "grey ignored values" are defined from the lateral offsets and then applied as "mirror" to the vertical offsets; this is a simplification as vertical offsets are sometimes better defined than their corresponding lateral offsets. The underlined values are the PREF offsets. Note that, when no PREF value can be recognized, the OPT value is retained as PREF. Symbol >> means that the error on the offset is unrealistic (i.e., the code could not resolve a relevant bestfit line). Column BQ reports comments on the code results and backslip reconstructions, especially on vertical slips. Column BW examines whether the PREF and OPT lateral offsets also reconstruct some adjacent markers. If not, the quality of the lateral offset in Column BX is maintained as that first stated in Column G. If yes, the quality is increased because the reconstruction of several markers attests of the robustness of the lateral offset. Dark blue is for quality 1 (best), medium blue for quality 2 (good), and white for quality 3 (weaker). Column BZ evaluates the quality of the PREF vertical slip as both being well constrained and well reconstructing the prefaulted vertical geometry of the marker. Dark green is for quality 1 (best), medium green for quality 2 (good), and white for quality 3 (weaker). A synthesis of total number of identified markers, series of measures, code offset measures, and geomorphically-relevant measures retained in analysis, is provided at the end of each segment section. A short table synthesizing all these numbers is provided at the bottom of Table 1. In red are the segments where the largest proportions of offsets were measured on secondary faults off the main fault trace.

Supplementary Document A: Parameters of best lateral offset models
Figures where models are shown are indicated in brackets. The best models are the average models calculated by the outermost routine from the multiple Nr "best models" derived from the intermediate routine (see Methods). Here we generally chose Nr=500, and therefore each of the best models is derived from 50 000 iterations. Ll is left (west) distance of rupture western tip with respect to main fault western end in Lidar data, taken as 0 km. Lr is right (east) distance of rupture eastern tip with respect to main fault eastern end in Lidar data, which is 70 km. Ls is position of largest lateral slip along the [0-70 km] main fault in Lidar data. Ds is largest modeled lateral slip. In all cases, error is standard deviation of the Nr models. Errors on Ds are conservative and overestimated because their calculation is not weighted with the quality of the Nr models (i.e., a weaker model with larger WRMS and AICC parameters has same weight as a more robust model with smaller WRMS and AICC). "Max EQ slip" is largest earthquake lateral slip derived from Ds values. For reasons explained above, we do not translate the Ds errors on the Max EQ slips. Mean values of parameters are provided for each model, along with the mean WRMS and AICC criteria of the model. Best Model PREFQ1Q2_MF is calculated from quality 1 and quality 2 PREF lateral offsets (in range 0-160m) measured across the main fault. Data are weighted by both their uncertainty and quality weight. The input parameters for the best model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90), that is : Nr=500, Nt=100 Best Model OPTQ1Q2Q3_MF is calculated from quality 1, quality 2, and quality 3 OPT lateral offsets (in range 0-160m) measured across the main fault. Data are weighted by both their uncertainty and quality weight. The input parameters for the best model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model PREFQ1Q2_MF+SF is calculated from quality 1 and quality 2 PREF lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for the model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model PREFQ1Q2Q3_MF+SF is calculated from quality 1, quality 2 and quality 3 PREF lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for the model calculation are M_N_fit_rand_tri13(100,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above but with Nr=100 ). Best Model OPTQ1Q2_MF+SF is calculated from quality 1 and quality 2 OPT lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for the model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model OPTQ1Q2Q3_MF+SF is calculated from quality 1, quality 2 and quality 3 OPT lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for model calculation are M_N_fit_rand_tri13(100,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above but with Nr=100 ).
Global means in red are average from all model means, calculated with Gaussian probability density functions.

Supplementary Document A: Parameters of best lateral offset models
Best Model PREFQ1Q2_MF (Fig. 3 These input parameter ranges are the same for the other models below. Best Model PREFQ1Q2Q3_MF is calculated from quality 1, quality 2 and quality 3 PREF lateral offsets (in range 0-160m) measured across the main fault. Data are weighted by both their uncertainty and quality weight. The input parameters for the best model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model OPTQ1Q2_MF is calculated from quality 1 and quality 2 OPT lateral offsets (in range 0-160m) measured across the main fault. Data are weighted by both their uncertainty and quality weight. The input parameters for the best model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model OPTQ1Q2Q3_MF is calculated from quality 1, quality 2, and quality 3 OPT lateral offsets (in range 0-160m) measured across the main fault. Data are weighted by both their uncertainty and quality weight. The input parameters for the best model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model PREFQ1Q2_MF+SF is calculated from quality 1 and quality 2 PREF lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for the model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model PREFQ1Q2Q3_MF+SF is calculated from quality 1, quality 2 and quality 3 PREF lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for the model calculation are M_N_fit_rand_tri13(100,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above but with Nr=100 ). Best Model OPTQ1Q2_MF+SF is calculated from quality 1 and quality 2 OPT lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for the model calculation are M_N_fit_rand_tri13(500,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above). Best Model OPTQ1Q2Q3_MF+SF is calculated from quality 1, quality 2 and quality 3 OPT lateral offsets (in range 0-160m) measured across both the main fault and the secondary faults. Data are weighted by both their uncertainty and quality weight. Input parameters for model calculation are M_N_fit_rand_tri13(100,100,L,D,W,8,-20,0,30,45,0,22,70,90) (as above but with Nr=100 ).
Global means in red are average from all model means, calculated with Gaussian probability density functions.