Supplement to: A revised acceleration rate from the altimetry-derived global mean sea level record

The satellite radar altimetry data for TOPEX and ERS-1&2 are obtained from the Radar Altimetry Database System (RADS). The applied geophysical corrections are listed in Table 1. The orbits solutions used are computed as part of the Sea Level project of the Climate Change Initiative (SLCCI). Note that the pole tide from Desai et al. 2 is used in the crossover analysis, which contains the variations with respect to a linear mean pole. In the tide-gauge comparison, we use the pole tide as described in the IERS2010 conventions, which account for a non-linear mean pole. Mean polar motion is implicity taken into account in a Vertical Land Motion (VLM) correction for the tide gauges (Sect. 3). Furthermore, the tides and dynamic atmosphere correction are not applied for the tide-gauge comparison. The major tidal harmonics are regressed as part of the TOPEX drift estimation (Sect. 3).

transition between TOPEX-A&B appears to be continuous for the CSR and CLS SSB correction, there is an observable discontinuity for the Gaspar-corrected time series. Note that estimating a intramission bias from these time series is nontrivial, since it strongly depends on the number of months before and after the TOPEX-A/B transition that are used in a model. Table 1: List of geophysical corrections used for ERS-1&2 and TOPEX. *Not applied in the comparison with tide gauges. **Only the solid Earth part of the pole tide is corrected for in the comparison with tide gauges, following the IERS2010 conventions.  After removing the internal path delay calibration (cal-1) the time series becomes more linear (Fig. 1, right panel). It reduces the discontinuity between TOPEX-A&B for the Gaspar SSB correction, but it induces an offset for the other two solutions. Note that TOPEX sea surface heights are negatively drifting with respect to those of ERS-1&2.
To estimate these drifts and intramission biases, we fit three models to the time series: 1) one TOPEX-A/B drift, 2) one TOPEX-A/B drift and an intramission bias and 3) two separate TOPEX-A&B drifts and an intramission bias. The statistical significance of an improvement of models 2 and 3 with respect to model 1 determined with an F-test: where e 1 are the residuals after fitting model 1, e i the residuals after fitting models 2 and 3, (n − k) the degrees-of-freedom and g the number of additional estimated parameters with respect to model 1.  Figure 2: Time series of crossovers differences between TOPEX and ERS1&2 with cal-1 not applied. On the left the averaged time series for the southern hemisphere and on the right for the northern hemisphere.
We also investigate the geographical dependency of the drifts and the intramission biases. Fig. 2 shows the crossover time series southern and northern hemispheres and Table 2 contains the drifts and intramission biases with statistics. The cause for these geographical differences is unknown, but it is likely related to the orbit solution. For the southern hemisphere negative drifts are obtained for TOPEX-A&B, while for the northern hemisphere the TOPEX-B drift is positive in model 3. Globally, as shown in Fig. 1 of the main article, the drifts for TOPEX-A&B are both negative. The intramission biases computed for the CLS and CSR SSB corrections over the southern hemisphere are negligible, while globally these are significantly different from zero. Between the northern and southern hemisphere there is a substantial difference between the estimated intramission biases. Note that on the northern hemisphere the choice of model (2 or 3) affects the magnitude of the estimated intramission bias. The geographical dependency of the intramission bias and the TOPEX-B drift implies that a calibration is only valid for Global Mean Sea Level (GMSL) and is not directly applicable in regional studies.

Altimeter-tide gauge processing
For the comparison with tide gauges, we use a procedure slightly modified with respect to Watson et al. 4 . Instead of the fast delivery products (fast), the hourly research quality tide-gauge records (rqds) from UHSLC 5 are used. Initially, all records that span the period 1993.0-2002.5 are considered. The tide-gauge records within a radius of 1000 km from ≥7.5 moment-magnitude earthquake events during that period are removed. All altimetry time series within a radius of 220 km around a tide gauge are taken into account, with a minimum distance from the coast of 30 km to avoid land signals contaminating waveforms and radiometer wet troposphere delay estimates. The TOPEX data are colinearly stacked, so that altimetry time series are created at every 6 km along-track. ALT-TG sea-level differences larger than 1 meter are removed.
Linear corrections for Vertical Land Motion (VLM) are estimated from models or from GNSS trends. We take the median of all GNSS trends within 50 km from the tide gauge in the database of the Nevada Geodetic Laboratory (NGL) 6 . Only GNSS trends with a formal uncertainty smaller than 1 mm yr −1 are included. To cope with VLM differences between the time spans of GNSS and TOPEX, due to non-linear present-day mass redistribution, we use the VLM estimates based on the models and loads used by Frederikse et al. 7 . The correction is implemented as in Kleinherenbrink et al. 8 and implicitly also deals with the problem of polar wander, which is not captured by the IERS2010 pole tide used as background model for GNSS and altimetry. Note that Kleinherenbrink et al. 8 showed that the consistency between altimetry, tide-gauges and GNSS trends improved if the non-linear VLM is taken into account. In case there are no GNSS stations nearby the tide gauge, the linear trends from the present-day mass redistribution VLM model and the ICE-6G C VM5a GIA model 9 are used to estimate the total VLM trend at the tide gauge. That means that for tide gauges without a nearby GNSS receiver, a part of the (primarily small-scale) VLM variations cannot be taken into account, but it increases the number of tide gauge that can be used for validation.
From every altimetry time series, cubicly interpolated UHSLC tide gauge sea level measurements are subtracted, creating an ALT-TG differenced time series at every 6 km along-track. Closely following Watson et al. 4 , a model is regressed through the ALT-TG time series containing 12 ocean tides, a latitude and longitude dependence complemented with altimeters drifts and intramission biases as model 1-3 in Sect. 2. Spectrograms of the residuals are computed and harmonics corresponding to peaks larger than 4σ 2 are added to the model after which it is recomputed. Outliers outside 3RMS of the residuals are iteratively removed. Drifts and intramission bias estimates are only considered if their corresponding ALT-TG time series still contain at least 250 samples at this stage.
Thresholds are set to remove noisy Control Point (CP) time series, which require the propagation of uncertainties. To estimate the uncertainties of trends and biases, which are computed from the CP time series using ordinary leastsquares, we fit an AR(1)-model through the residuals. With that we construct variance-covariance matrix Q yy : where σ is the standard deviation of the residuals and φ 1 the first-lag autocorrelation. By propagation of errors, we estimate the variance-covariance matrix for the estimated parameters as: where A is the design matrix used in the ordinary least-squares estimate described above. The standard deviations for the drifts and biases are extracted from the matrix Q xx . We remove the time series for which the standard deviations of the residuals is larger than 110 mm, with a TOPEX-A drift uncertainty larger than 10 mm yr −1 (both as in Watson et al. 4 ) or a TOPEX-A/B drift uncertainty larger than 8 mm yr −1 .
To investigate the stability of the ALT-TG comparison, the drifts and intramission biases are averaged using four methods. First the unweighted average trends and biases (m1) at every along-track location. The mean and uncertainties of the drifts and biases are computed from distributions estimated with a Monte-Carlo simulation, in which we randomly leave twenty tide gauges out of consideration. These uncertainties only reflect changes due to weighting and network geometry. The uncertainties are inflated by 0.3 mm yr −1 to account for reference frame stability issues 11 and differences between long-term ALT-TG and GNSS VLM trends 8 . This is done for all four methods. The second method is similar to Watson et al. 4 and weights the trends and biases with their variance (m2). A 1.5 mm yr −1 uncertainty for the GNSS trends is taken into account. Third, the means of trends and biases are computed for every tide gauge and consecutively all mean trends and biases are averaged (m3). Only tide gauges that are coupled with at least ten trends/biases are included, which makes the method less prone to outliers. Finally, a virtual-station method 12 is applied (m4). This method takes an average of the trends and biases between the nearest tide gauges and creates a virtual station at the midpoint. The virtual station is added and the two original stations are removed. This process loops until no stations are closer than 500 km from each other, which should be enough to remove most correlations between neighbouring stations and which creates a more homogeneously spaced network. The limit of 500 km is chosen to avoid generation of high weights for several remote stations, which would make the result sensitive to outliers. A second consideration for this radius is that in most regions ocean signals correlate at smaller radii 13 . Similar to the third method, a minimum of ten trends and biases is required per tide gauge.

Tide-gauge comparison
A tide-gauge comparison is required to validate the results of the crossovers, i.e. it is required to determine whether the TOPEX sea surface heights are drifting or the ERS1&2 sea surface heights. To verify whether a tide-gauge comparison is suitable to estimate such short-term drifts, the network geometry, the data editing and the distribution of the Monte Carlo simulations are investigated.

Geographical distribution
The number of ALT-TG time series and the corresponding drifts and intramission biases varies slightly, depending on the model (1-3) and the applied SSB correction.
For the Gaspar SSB correction in combination with fitting model 1, the number ALT-TG solutions per tide gauge is given in Fig. 3. In general, higher numbers are found near island-based tide gauges. This is mainly because altimetry data are available from all sides of the island. Additionally, the ocean signals over continental shelves have a short correlation distance 14 and therefore residual they can show up in ALT-TG time series. A typical example of this is given in Kleinherenbrink et al. 8 , where ENSO-related signals are visible at the Winter Harbour tide-gauge in Canada. These residual ocean signals inflate the estimated drift and bias uncertainties and as a consequence a part of the ALT-TG time drifts and biases are removed from the analysis. The (correlated) residuals enhance the uncertainties of the drifts, so that part of them will be removed due to the thresholds set in Sect. 3. At several high-latitude locations the number of ALT-TG time series also increases due to the smaller ground-track spacing towards the poles. The varying number of ALT-TG solutions per tide gauge directly translates into under-and overweighting of certain regions in a tide-gauge-based satellite altimetry validation and/or calibration procedure. For the four weighting methods (m1-m4) the relative weight per tide gauge is shown in Fig. 4. Method m1, the unweighted mean, strongly weights drifts and intramission biases estimated at tide-gauges in Europe and the Pacific and Indian Ocean islands. In method m2, the variance weighting, a similar pattern emerges, but the weights for the islandbased tide gauges in the Pacific is reduced. Both aforementoined methods have relatively low weights for the tide-gauge at the North-American coasts, but note that there are many tide gauges in this region. Due to the density of tide gauges at the North-American coasts, a relatively large weight is given to this region by the tide-gauge average method m3. The virtual-station method m4 improves the homogeneity of the stations (bottom-right panel of Fig. 4). However, all methods suffer from undersampling of certain regions. Especially the soutern oceans lack tide gauges.

Histograms
We inspect the stability of the tide-gauge comparison using histograms for the three models. The results are considered to be insensitive if statistically consistent drifts and intramission biases are obtained using methods (m1-4) in combinations with the regressed models. As a reference, the three models are computed through the time series with the cal-1 correction unapplied. The histograms of the drifts and biases estimated with models 1 (Fig. 7), model 2 ( Fig. 6) and model 3 (Fig.  5) demonstrate the effects of the weighting methods. With model 1, a drift of slightly larger magnitude is found for the Gaspar SSB time series than for the other two, which is consistent with the crossovers (main text, Table 1). Since the drifts for model 1 are statistically consistent with the crossover drifts, it suggests that TOPEX sea surface heights are drifting and not the ERS1&2 sea surface heights.  However, when additionally an intramission bias (model 2) is estimated in the ALT-TG time series, different results are obtained than for the crossovers (Fig.  6). Independent of the averaging method and the SSB correction the intramission biases become 4-6 mm lower than those found when using the crossovers. This leads to TOPEX-A/B drifts that are statistically indistinguishable from zero. Also note that for the methods with the largest drift, the smallest intramission bias is found and vice versa. A possible source for this is the geographic dependence of the intramission bias, as demonstrated with the crossovers. The geographically varying intramission bias in combination with the over-and underweighting of certain regions in the tide-gauge comparison affects the average result. We argue that the intramission bias obtained from the crossovers is therefore more accurate. Residual ocean signals in ALT-TG time series 15 , which correlate between tide gauges, are another source for the difference in results. ALT-TG bias estimates from CPs located in the tropical Pacific and along the American shore are especially prone to remaining signals from the consecutive 97-98 El Niño 8 and 99-00 La Niña events, because they occurr around the time of the TOPEX-A/B transition.  The statistics of model 3 (Fig. 7) show that the averaging methods find large differences in the drift of TOPEX-B. The estimated biases are closer to zero with respect to model 2 and appear to be negatively correlated with the drifts in TOPEX-B. The drifts in TOPEX-A are negative for methods m1, m2 and m3 and slightly higher for m4, but all of them close to zero and statistically consistent between the methods. Large deviations between the methods suggest that geographically varying signals affect the global estimate. Additionally, for estimating an accurate drift through TOPEX-B, the time series appear is too short. Since the intramission bias is geographically varying and both drifts and the intramission biases are susceptible to correlating ocean signals from for example ENSO, we recommend to use model 1 and not the other two for validation purposes.

Thresholds
Preferably, the estimated drifts and intramission biases do not vary significantly between averaging methods (m1-4) and when thresholds are varied. To determine the effect of the thresholds used in the ALT-TG analysis on the model parameters, estimates of drifts and intermission biases are made at varying thresholds ( Fig. 8 and Fig. 9). In Fig. 8, the threshold on the standard deviation of the residuals is varied while we set the trend uncertainty thresholds to 100 mm yr −1 . In Fig. 9, the threshold on the trend uncertainty is varied, while the threshold on the standard deviation of the residuals is set to 200 mm. Note that the nominal thresholds were 110 mm for the standard deviation and 10 mm yr −1 and 8 mm yr −1 for the trend uncertainties of TOPEX-A and TOPEX-A/B, respectively. The middle right panels show that it is difficult to constrain a drift for TOPEX-B, which supports the findings in the histograms. The TOPEX-B drifts between methods differ more than 1 mm yr −1 and the estimated drift strongly depends on the threshold for the standard deviation of residuals. This also appears to negatively correlate with the estimated intramission bias and slightly with the estimated TOPEX-A drift. As the results are unstable, we do not recommend to fit a model with two separate drifts for TOPEX-A&B for validation purposes, and certainly not for calibration.
More stable results are obtained for models with a single drift for TOPEX-A/B. The virtual station method (m4) yields a larger intramission bias at higher thresholds than the other methods. This is possibly related to the overall geometry of the network, but the method is also more prone to outliers as shown by the wider distributions in the histograms. The estimated TOPEX-A/B drift in both models for all methods is statistically consistent between methods. In conclusion, since statistically consistent results are only obtained by model 1, we recommend to use only model 1 for validations purposes.