Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset

CRU TS (Climatic Research Unit gridded Time Series) is a widely used climate dataset on a 0.5° latitude by 0.5° longitude grid over all land domains of the world except Antarctica. It is derived by the interpolation of monthly climate anomalies from extensive networks of weather station observations. Here we describe the construction of a major new version, CRU TS v4. It is updated to span 1901–2018 by the inclusion of additional station observations, and it will be updated annually. The interpolation process has been changed to use angular-distance weighting (ADW), and the production of secondary variables has been revised to better suit this approach. This implementation of ADW provides improved traceability between each gridded value and the input observations, and allows more informative diagnostics that dataset users can utilise to assess how dataset quality might vary geographically.

The station observations are drawn from three regular sources: CLIMAT messages, exchanged internationally between WMO (World Meteorological Organisation) countries and obtained as quality-controlled files via the UK Met Office; MCDW (Monthly Climatic Data for the World) summaries, obtained from the US National Oceanic and Atmospheric Administration (NOAA) via its National Climatic Data Center (NCDC); and updates of minimum and maximum temperatures for Australia, obtained from the Bureau of Meteorology (BoM). In addition, ad-hoc collections of stations are incorporated (after quality-control checks including location, correspondence with existing holdings, and outlier checking). These observations provide six 'databases' of monthly values (Diurnal Temperature Range being calculated from Minimum and Maximum Temperatures). Coverage for selected variables at selected dates is shown for precipitation in Fig. 1 and for temperature, DTR and vapour pressure in the three figures of Supplementary File 1, and is discussed further in the 'Meteorological station database updating' subsection of 'Methods'. Figure 2 shows the overall process by which these observations, along with various static repositories, are used to derive each version of the CRU TS dataset. Further variables are derived from these, including Potential Evapotranspiration (PET), which is required by many users in the agricultural and hydrological sectors.
Because of the overriding objective to present complete coverage of land surfaces (excluding Antarctica) from 1901 onwards, CRU TS is not necessarily an appropriate tool for assessing or monitoring global and regional climate change trends. Nevertheless, with care taken to identify and avoid trend artefacts caused by changing data coverage or data inhomogeneities, CRU TS can be used for global and regional trend analysis. The first issue is that, unlike (for example) CRUTEM, regions uninformed by observations are not left missing but are instead filled with the published climatology 1 . This has the advantage of being a known entity, rather than an estimate, but has the unavoidable side effect of decreasing variance. Additionally, the numbers and locations of stations contributing to any grid cell will change over time. Both effects can potentially give rise to trend artefacts. This is a particular problem with high-resolution grids, if individual grid cells or small groups of grid cells are analysed without checking whether they contain any observation stations at all, or whether they are interpolated from distant stations during one part of the record and from close stations during another. However, the metadata provided with the CRU TS version 4 dataset enables users to understand the level of support behind each grid cell and time step, permitting informed detection of trends or masking of areas so that trend analysis can focus on well-observed regions; CRU TS has been used in this way in many analyses 39 , in areas including regional agronomic production 40 and river basin vegetation 41 . Temperature, in particular, has been shown to be resilient to the problems described above, in part because of its long correlation decay distance (CDD) of 1200 km (Supplementary File 1). Precipitation, with its much shorter CDD of 450 km, has reduced and more time-dependent coverage (Fig. 1), and is therefore subject to these problems unless the data are masked prior to analysis. The second issue is that no extra homogenization is performed on the observations, so artefacts could be present where the originators have not already homogenized their data. Comparisons with other observation-based datasets at a global scale (GPCC 17 , UDEL 18 , CRUTEM 19 , and regional or third-party exercises [20][21][22] ) demonstrate the robustness of the dataset at large spatial scales. Assessment of the grids is discussed further in the 'Technical Validation' section of this paper.

Note 5: minimum and maximum temperatures are the monthly means of the individual daily minimum and maximum temperatures; they are not the overall minimum or maximum temperature recorded in each month.

Methods
Meteorological station database updating. The process to update the databases with observations, and to derive the DTR database, is unchanged and is described in 5 . Holdings of observations vary by variable, with spatial and temporal concerns affecting cover; PRE station cover is shown in Fig. 1. In Supplementary File 1, TMP station cover (p.2) shows that in the early 20th century, even the high CDD of TMP cannot deliver full land cover: central-west Africa being the most obvious region that will default to the climatology. DTR (p.3) has far patchier cover than TMP, owing to its shorter CDD (750 km) and lower station numbers. The final figure in Supplementary File 1, VAP station cover (p.4), demonstrates the difference between the cover provided by VAP observations and that introduced with the addition of synthetic VAP; for this reason, only two decades (1940-49 and 1970-79) are shown. The comparisons between b) and d), and between f) and h), show how essential synthetic variables are to achieving much greater land cover. Note that synthetic VAP, being derived from TMP and DTR, inherits the lower of their CDDs (750 km); its cover is therefore reduced from that of TMP. (See Table 1 for details of the variables.)
Anomalies. The first stage of the process is to convert each station series into anomalies. The mean used to construct the anomalies is based on the period 1961-1990, and a minimum of 75% of observations must be present in this period (at least 23 of the 30 values) for each of the 12 months to be processed. Outlying values exceeding a threshold (±3 standard deviations, SD, for TMP; +4 SD for PRE) are omitted. This outlier-threshold check for TMP is more stringent than that for CRUTEM4.6, which uses a ±5 SD outlier check 24 ; its effects are assessed in the 'Technical validation' section. While the process to construct anomalies is algorithmically unchanged from the previous version 5 , additional elements now construct a lookup table which, for each anomalised station, lists all land cells of the destination 0.5° grid that are within the correlation decay distance (CDD) for the variable in question. This improves the computational performance of the later interpolation process.
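As an illustration of the anomaly rules just described (the 75% base-period requirement and the standard-deviation screening), the following is a minimal sketch; the function and argument names are hypothetical, not the CRU TS code, and for PRE only the positive threshold would apply (e.g. pos_sd=4 with no negative screen).

```python
import numpy as np

def monthly_anomalies(series, years, base=(1961, 1990), pos_sd=3.0, neg_sd=3.0):
    """Convert one station's monthly series to anomalies (illustrative sketch).

    `series` is a (n_years, 12) array of monthly values with np.nan for
    missing months; `years` gives the calendar year of each row.
    """
    series = np.asarray(series, dtype=float)
    in_base = (years >= base[0]) & (years <= base[1])
    anoms = np.full_like(series, np.nan)
    for m in range(12):
        col = series[in_base, m]
        # At least 75% of the 30 base-period values (>= 23) must be present.
        if np.sum(~np.isnan(col)) < 23:
            continue
        normal = np.nanmean(col)          # 1961-1990 normal for this month
        a = series[:, m] - normal
        sd = np.nanstd(series[:, m])
        # Screen outliers, e.g. +/-3 SD for TMP (+4 SD only, for PRE).
        a[(a > pos_sd * sd) | (a < -neg_sd * sd)] = np.nan
        anoms[:, m] = a
    return anoms
```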

Production of primary variables: TMP, DTR, PRE. Primary variables have no synthetic component.
Station observations are anomalised using each station's 1961-1990 normals (monthly averages). PRE is converted to percentage anomalies, so the lowest possible value is −100, meaning no rain; a percentage anomaly of 0 indicates equivalence with the 1961-1990 mean. Monthly anomaly fields are then interpolated onto the 0.5° × 0.5° target land grid using ADW. Land grid cells that no observation can reach are set to 0 (representing the climatology in anomaly space). Finally, the CRU CL published climatologies are used to convert the gridded anomalies to actuals.

Secondary variables. Secondary variables differ from primary variables in that they have fewer direct observations available. We therefore supplement these by estimating synthetic values from the primary variables. The synthetic estimates are obtained using empirical relationships with the primary variables that are unchanged from those described in 5 . What has changed in CRU TS4 is that the synthetic estimates are now calculated from the primary variable station observations rather than from the primary variable gridded values. Two advantages of this change are that (1) it is more transparent which stations have contributed to the gridded values (those with observations of the secondary variable and those with observations of the primary variable(s) needed to obtain the synthetic estimates); and (2) the interpolation of the synthetic estimates can now use the CDD of the secondary variable in deciding the distance weighting. Previously, some synthetic estimates were derived from gridded primary variables that had themselves been interpolated using the CDD of the primary variable (hence less transparent, and information from further afield than the secondary variable's CDD would have been used). One result of this is that the coverage (the regions where the variable is not simply filled in by its climatological values) of the secondary variables is less complete than previously.
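The percentage-anomaly convention for PRE, and the conversion back to actuals via the climatology, can be sketched as follows (the function names are illustrative, not the CRU TS code):

```python
import numpy as np

def pre_to_percent_anomaly(pre_mm, normal_mm):
    """Precipitation as a percentage anomaly relative to the 1961-1990 normal.
    -100 means no rain; 0 means equal to the normal."""
    return 100.0 * (pre_mm - normal_mm) / normal_mm

def percent_anomaly_to_actual(anom_pct, clim_mm):
    """Convert a gridded percentage anomaly back to mm using the climatology.
    An anomaly of 0 (e.g. an unreached cell) yields the climatology itself."""
    return clim_mm * (1.0 + anom_pct / 100.0)
```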
However, this reduction in coverage arises from removing potentially low-quality estimates that were previously made from too-distant observations.

Synthetic VAP production. Synthetic VAP observations are generated from TMP and DTR station anomalies (or from TMP station anomalies and gridded DTR anomalies where the station data do not include TMN and TMX), as well as the published CRU climatologies for TMP and VAP 1 . While the process broadly follows that described in 5 , synthetic anomalies are now produced at the station level, rather than as gridded data, because this better suits the interpolation process as explained above. The VAP process is shown in Fig. 3, and the impact of the inclusion of synthetic VAP on the final gridded coverage is illustrated in Supplementary File 1 (p.4).
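The empirical relationship itself is specified in the earlier paper 5 and is not reproduced here. Purely for illustration, the sketch below estimates vapour pressure as the saturation vapour pressure at the daily minimum temperature (TMN = TMP − 0.5·DTR) via a Magnus-type formula; both the assumption that dew point tracks TMN and the coefficients used are assumptions of this sketch, not the CRU TS algorithm.

```python
import math

def synthetic_vap_hpa(tmp_c, dtr_c):
    """Hypothetical synthetic vapour pressure (hPa): saturation vapour
    pressure evaluated at TMN, using a Magnus-type formula.  The dew-point
    ~ TMN assumption and the coefficients are illustrative only."""
    tmn = tmp_c - 0.5 * dtr_c
    return 6.112 * math.exp(17.62 * tmn / (243.12 + tmn))
```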
Synthetic WET production. The WET variable represents counts of wet days, defined as days with ≥0.1 mm of precipitation (section 2.4.1 of 5 ). Figure 4 shows the process by which synthetic WET values are incorporated into production of the WET product. The empirical algorithm that synthesizes WET uses PRE observations, together with normals (the CRU CL 1961-1990 climatologies 1 ) for PRE and WET. Therefore, the station-level PRE anomalies are converted to absolute values using the PRE normal from the enclosing gridcells, and then used in the synthesis. The absolute synthetic WET values are used to create a synthetic WET database; this is then anomalised in the same way as the observed WET database, and both sets of anomalies are passed to the interpolation algorithm. Some users require WET, and as rain-day counts are part of the monthly messages we access, they are straightforward to add to the databases.
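The empirical WET algorithm is likewise defined in 5 . The sketch below shows only the general shape of such a synthesis, scaling the climatological wet-day count by a power of the precipitation ratio; the functional form and the 0.45 exponent are assumptions made for illustration, not the published relationship.

```python
def synthetic_wet_days(pre_mm, pre_normal_mm, wet_normal_days, power=0.45):
    """Hypothetical rain-day synthesis: scale the climatological wet-day
    count by the precipitation ratio raised to an empirical power.
    Form and exponent are illustrative assumptions."""
    if pre_normal_mm <= 0:
        return 0.0
    ratio = max(pre_mm, 0.0) / pre_normal_mm
    return wet_normal_days * ratio ** power
```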
Synthetic CLD production. The process to generate synthetic cloud cover observations from DTR observations is as described in 5 , save that the synthetic station-based values are not gridded separately, but are fed into the main gridding process alongside the CLD observation anomalies.
Interpolation. General approach. The interpolation process implements angular-distance weighting (ADW) and is shown in Fig. 5. The station influence lookup tables produced as part of the anomaly process (described in the 'Anomalies' subsection of 'Methods') are used to allocate station anomalies to an array of gridcells that, for each monthly time step and cell, stores the nearest eight or fewer anomalies lying within the relevant CDD. Once the observed anomalies have been allocated, and if a secondary variable is being processed, synthetic anomalies are then allocated in the same way. However, they are excluded if they lie within 25 km of an observed anomaly, another synthetic anomaly, or the centre of the target cell, or if they lie within a 45° subtended angle of an observed anomaly. Additionally, they cannot replace an observed anomaly: the maximum of eight anomalies applies throughout. Once all allocations have been made, distance and (angular) separation weights are calculated (section 2b of 2 ), and used to obtain an interpolated anomaly value for each gridcell. Any land cells without allocated anomalies are set to zero, representing the climatology in anomaly space. Elevation is not specifically included in the interpolation; it is introduced via the climatologies when the gridded anomalies are converted to absolute values ('Production of absolutes' in 'Data records'). Results of a cross-validation exercise to quantify the accuracy of the ADW interpolation scheme are reported in the 'Technical validation' section.
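A minimal sketch of the ADW scheme just described: the nearest eight or fewer stations within the CDD contribute, distance weights are modulated by angular separation so that clustered stations do not dominate, and empty cells default to the climatology in anomaly space. The geometry is simplified to a plane (the real calculation is on the sphere) and the names and default weight function are illustrative, not the CRU TS code.

```python
import math

def adw_interpolate(cell_xy, stations, cdd, max_stations=8,
                    dist_weight=lambda d, cdd: math.sin((cdd - d) / cdd * math.pi / 2) ** 4):
    """ADW anomaly for one grid cell (planar sketch).

    `stations` is a list of (x, y, anomaly) tuples in km-scaled coordinates.
    """
    cx, cy = cell_xy
    # Keep the nearest eight (or fewer) stations lying inside the CDD.
    within = sorted(
        ((math.hypot(sx - cx, sy - cy), sx, sy, a) for sx, sy, a in stations),
        key=lambda t: t[0],
    )
    within = [s for s in within if s[0] <= cdd][:max_stations]
    if not within:
        return 0.0  # no station in reach: climatology in anomaly space

    w = [dist_weight(d, cdd) for d, _, _, _ in within]
    theta = [math.atan2(sy - cy, sx - cx) for _, sx, sy, _ in within]

    # Angular-separation term: a station whose bearing is isolated from the
    # others gets its distance weight boosted; clustered stations do not.
    weights = []
    for k in range(len(within)):
        others = [l for l in range(len(within)) if l != k]
        if others:
            den = sum(w[l] for l in others)
            num = sum(w[l] * (1 - math.cos(theta[k] - theta[l])) for l in others)
            a_k = 1 + num / den if den > 0 else 1.0
        else:
            a_k = 1.0
        weights.append(w[k] * a_k)

    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(wk * s[3] for wk, s in zip(weights, within)) / total
```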
Improvements to weighting for v4.02 and later versions. The approach to distance-weighting adopted for version 4 was taken from 2 : a decay function of the form e^(−(d/CDD)·m), where d is the distance of the station, CDD is the correlation decay distance of the variable, and m = 4 (a value arrived at after extensive sensitivity testing reported in 2 ). However, the function was only used as part of the ADW process, when weighting more than one station to achieve an interpolated value. This resulted in unrealistic artefacts in the interpolated field. To address this, the interpolated anomaly (or, for a single station, its anomaly) needed to be damped using distance-weighting as well, and a cross-validation exercise was conducted to choose the damping function. This involved the reconstruction of every observed anomaly from every station in the process, provided at least one other station was available to interpolate from. These reconstructions were made for two basic decay functions: the original,

e^(−(d/CDD)·m)    (1)

and one based on a sine curve,

sin(((CDD − d)/CDD)·π/2)^n    (2)

In both cases, the power m or n ranged in integer steps from 1 to 8. The interpolation process applied the selected function at all stages: the decay of a lone station anomaly with distance, the relative distance weighting in the ADW calculation, and the decay of the ADW-derived anomaly with distance. Errors were calculated as mean absolute error (MAE). Equation (2) with n = 1 gave the smallest errors in the global picture: PRE was served equally well by both, while TMP was served better by Eq. (2). However, this sine curve does not allow a gradual decay at close distances, resulting in unrealistic artefacts as before. Further calculations showed that increasing the power in the sine function introduced little extra error, and a value of n = 4 was selected as a compromise between the need for accuracy in the gridcells and the need to reduce or eliminate unrealistic artefacts in the field, so as to provide a continuous surface.
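Assuming the exponential form e^(−(d/CDD)·m) and the sine form sin(((CDD − d)/CDD)·π/2)^n, the contrasting behaviour of the two candidate weights can be tabulated numerically:

```python
import math

# Assumed forms of the two decay functions discussed in this subsection.
def w_exp(d, cdd, m=4):
    return math.exp(-(d / cdd) * m)

def w_sin(d, cdd, n=4):
    return math.sin((cdd - d) / cdd * math.pi / 2) ** n

cdd = 1200.0  # km, e.g. the TMP correlation decay distance
for d in (0.0, 60.0, 600.0, 1140.0):
    print(f"d={d:6.0f} km  exp(m=4): {w_exp(d, cdd):.3f}  sin^4: {w_sin(d, cdd):.3f}")
```

Under these assumed forms, the sine weight falls exactly to zero at the CDD, whereas the exponential never quite does.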
www.nature.com/scientificdata www.nature.com/scientificdata/ Consistency between variables. One of the benefits of a multivariate dataset is the opportunity to present, at a point in space and time, a set of variable values that are (to an extent) internally consistent. This explains much of the design of the variable production process: TMN and TMX are consistent with TMP and DTR because they are derived from them (DTR having been previously derived from TMN and TMX observations); VAP is consistent with the temperature variables inasmuch as synthetic VAP is derived from them; similarly, the synthetic parts of WET and CLD are consistent with, respectively, PRE and DTR; and FRS and PET are entirely consistent with other variables, being wholly derived from them. Figure 6 shows the consistency relationships.
Homogeneity. As described in 5 , and in the 'Production of primary variables' subsection of 'Methods', CRU TS is not specifically homogenized. Some National Meteorological Agencies (NMAs) homogenize their station observations, either before release or at a later stage (requiring a re-release). Therefore, many CRU TS observations have been homogenized (and also quality controlled) within each country. However, performing additional homogenization on the CRU TS databases would be complicated, and not completely possible, because of elements of the process such as partly synthetic variables and the use of published climatologies. Sparse data coverage in some regions, or for some variables, is a particular limitation for applying neighbour-based homogeneity tests, as noted by 4 , where a degree of homogenization was implemented. The multivariate nature of CRU TS means that inhomogeneities identified in, for example, mean temperature data are likely to influence other variables as well.
Comparisons with other datasets can be used to identify any large inhomogeneities that might be present in CRU TS v4. For example, partial homogeneity assessment and correction was undertaken for an earlier version (v2.1) of CRU TS 4 , and at large spatial scales, and for most country averages, there is close agreement between CRU TS versions with and without this additional homogenization. Other, single-variable datasets perform various homogeneity assessments on their observations, though even here there are difficulties because of reporting delays 17 . The CRUTEM4.6 temperature dataset incorporates homogeneity as a result of previous work and work by originating bodies [24, section 2.2]. CRU TS v4 TMP is compared with CRUTEM4.6 in the 'Technical validation' section and Fig. 7. These various inter-dataset comparisons do not indicate that there are any large inhomogeneities present in the CRU TS v4 dataset, unless they are also present in the comparison datasets despite those data being subject to further homogeneity checks.

Data Records
External data records. The CRU TS v4.03 dataset 27 comprises ten variables of high-resolution global land surface gridded absolute values. The data are available in two formats: NetCDF, and space-separated ASCII text. This ensures maximum availability for the diverse users of the dataset. The files are available in decadal blocks, as well as full-length, for the same reason. The gridded data, excepting PET, are made available alongside metadata indicating the level of station support enjoyed by each datum; this varies between zero (no cover, climatology inserted; see 'Interpolation' above) and eight (the maximum station count for interpolation). For primary (TMP, DTR, PRE) and secondary (VAP, WET, CLD) variables, the counts produced by their interpolation are used. For derived (TMN, TMX) variables, and for FRS, the DTR counts are used. Because PET is calculated from multiple variables using a Penman-Monteith formula 25 , no meaningful station count can be produced. The station count metadata are included in the NetCDF files as a second variable ('stn'), and are published separately as ASCII text files.
An interface is also provided by a file in Keyhole Markup Language (KML) and an accompanying suite of images and datafiles. This is a standard of the Open Geospatial Consortium (https://www.opengeospatial.org/ standards/kml) and allows the data set to be accessed in Earth browsers such as Google Earth (https://earth. google.com/). This Google Earth interface is currently available for the TMP and PRE data, allowing access to individual grid-cell series as well as station observations in an intuitive, hierarchical structure.
Internal data records. The CRU TS process is realised through a collection of Fortran-77 programs that are called from a master program. This arrangement has provided compartmentalisation and flexibility as the process has evolved. This section will address the data files that allow communication between the programs, organised by the program that produces the data files. All files are ASCII text, with space-separated fields, unless otherwise stated.

Anomaly production. The anomaly program produces monthly data files, listing the station anomalies for that month. Station metadata is included. Additionally, two files needed for the interpolation process are produced: a list of stations giving in grid terms the North, South, East and West bounds of their influence (based on the CDD of that variable); and a list of stations giving the co-ordinates and distances of all gridcells within that influence. Files produced by the anomaly process are used by the interpolation process. Additionally, anomalies for primary variables are used by the processes synthesizing VAP, WET and CLD.
Synthetic VAP production. The synthetic VAP program produces monthly data files in ASCII text, listing the synthetic anomalies for that month. Station metadata is included. Because the process can make use of gridded DTR anomalies if there is no match for a TMP station, the metadata can take one of two forms: either a TMP station, or both TMP and DTR stations. These files are used by the interpolation process.
Synthetic WET production. The synthetic WET program produces monthly data files in ASCII text, listing the station absolutes for that month. Station metadata is included. The format is identical to the station record format used for observations, and the files are read by the anomaly process.
Synthetic CLD production. The synthetic CLD program produces monthly data files in ASCII text, listing the station anomalies for that month. Station metadata is included. The format is compatible with the anomaly files produced by the anomaly process, and these files are used by the interpolation process.

Interpolation. The interpolation program produces monthly gridded data files in ASCII text, comprising the gridded anomalies for that month. These files are used by the absolutes process. A separate monthly file identifies, for each datum, the number of stations that contributed to the interpolation; these files are used by the output process. A further monthly file identifies, for each datum, the stations used and whether they were observed or synthetic. This latter file is not currently used by any process, but it exists to provide full traceability when required.

Production of absolutes. The absolutes program reads the gridded anomaly files for primary and secondary variables from the interpolation process, and converts them to absolutes using the appropriate CRU CL v1.0 climatology. It produces monthly gridded files, which are used by the output process as well as in the derivation of TMN and TMX, and the calculation of FRS and PET.

Derivation of TMN and TMX. The program that derives TMN and TMX does so by reading the monthly gridded files of absolutes for TMP and DTR, produced by the absolutes process. It produces monthly files of TMN and TMX in the same format, which are read by the output process. TMN is calculated as TMP − 0.5*DTR, and TMX as TMP + 0.5*DTR.

Synthetic FRS production. The synthetic FRS program reads monthly gridded TMN absolutes produced by the absolutes process, and produces monthly files of FRS in the same format. These are read by the output process.
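The TMN/TMX derivation described above is a simple linear combination of the TMP and DTR grids:

```python
def derive_tmn_tmx(tmp, dtr):
    """Derive minimum and maximum temperature from mean temperature (TMP)
    and diurnal temperature range (DTR), as stated in the text:
    TMN = TMP - 0.5*DTR, TMX = TMP + 0.5*DTR."""
    return tmp - 0.5 * dtr, tmp + 0.5 * dtr
```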
Synthetic PET production. The synthetic PET program reads monthly gridded TMP, TMN, TMX, VAP and CLD absolutes produced by the absolute process, and produces monthly files of PET in the same format. These are read by the output process.
Output process. The output process reads the gridded absolute files produced by the absolute process, and the station count files produced by the interpolation process. It produces the final output files described in the 'External data records' subsection of 'Data records' .

Technical Validation
Quality control of input data. Source observations are often homogenized by national meteorological agencies before dissemination. New observations from trusted services undergo basic range checking; in other cases they are added interactively under operator control.
No achievable level of quality control can guarantee to exclude all errant data from a large dataset, because of the myriad ways in which the data may be evaluated and the elusive definition of 'errant' . The disparate users of CRU TS subject the data to many kinds of statistical processing, and on occasion this can reveal potential issues for further exploration and perhaps correction.
The process of anomaly production includes screening of exceedances; these are defined as values exceeding three standard deviations (based on the full length of the station series), extended to four (positive) standard deviations for precipitation.

Comparisons between versions and with alternative datasets. When CRU TS v4 was introduced,
CRU TS v3 continued to be produced in parallel, to allow users to investigate how the move would affect their work; thus v4.00 was released alongside v3.24. For TMP, comparisons are made with CRUTEM4.6 19 , UDEL 18 and the reanalysis dataset JRA-55 from JMA 28 . For PRE, DWD's GPCC 17 was chosen for its high observation count. TMP comparisons (Fig. 7) show good high-frequency agreement of CRU TS with CRUTEM4.6 (correlation coefficient, r = 0.99 globally), UDEL (r = 0.97 globally) and JRA-55 (r = 0.99 globally, 1958-2017 only). However, the long-term trends in global and hemispheric land temperature are notably stronger in CRUTEM4.6 than in UDEL; the CRU TS trend lies between them, but closer to CRUTEM4.6. This is clear in the difference plot ('CRUTS-UDEL'), with CRU TS being warmer than UDEL in the early twentieth century, and cooler more recently. Comparisons between CRUTEM and other global temperature datasets (as reported, e.g., by 29 ) support the reliability of the long-term CRUTEM trend. It should be noted that CRUTEM is not a spatially interpolated dataset; this may explain some differences. Figure 8 shows the comparisons of PRE with GPCC for global- and hemispheric-mean land precipitation. The high-frequency r for the global series is 0.92, though CRU TS is drier in the early twentieth century, perhaps due to having lower observation counts and reduced coverage. The difference is largest in the Southern Hemisphere, while the Northern Hemisphere series agree more closely (annual anomalies correlate at 0.94).
In general, comparison with reanalysis products is not an appropriate way to validate observation-based datasets: reanalyses are forecast models constrained by some observed variables, and precipitation, for example, is not usually assimilated. There are many examples of gridded observations being used to 'bias correct' reanalyses; a selection that used CRU TS is described in 30,31 and 13 . CRU TS is also used as an independent assessment of other datasets, such as satellite-derived data for recent decades 32 , highlighting the continuing need for a dataset based only on in-situ direct observations.

Cross-validation of the interpolated anomalies.
A separate suite of skill-testing programs uses station cross-validation 33 to assess the skill of the interpolation algorithm and to provide a quantitative guide to the expected accuracy of the individual interpolated values. Cross-validation of this kind was not possible before the move to ADW, and was one of the motivations for the change. Figure 9 shows spatial maps of correlation coefficients (r) (a) and mean absolute errors (MAE) (b) for TMP stations, with the respective distributions in (c) and (d). Figure 10 has the same format, showing DTR results, and Fig. 11 displays results for PRE. All three figures include distribution graphs for r and MAE; these should be consulted for a global overview of performance. Note that PRE anomalies are percentage differences from the mean rather than in mm units. For all three variables, the defined minimum series length for comparison was 20 months. In practice, 95% of lengths were >=236 months and 99% were >=47 months for DTR (higher for TMP and PRE), with minimum lengths of 23 for TMP and PRE, and 22 for DTR.
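The cross-validation exercise described here (withholding each station, reconstructing its series from its neighbours, and scoring r and MAE) can be sketched generically; the function names and data layout are illustrative, not the CRU TS programs.

```python
import math

def cross_validate(stations, cdd, interpolate):
    """Leave-one-out station cross-validation (sketch).

    `stations` maps a station id to (x, y, anomaly_series); `interpolate`
    is any gridding function taking (point, other_stations, cdd) and
    returning an anomaly per time step.  Returns per-station (r, MAE).
    """
    results = {}
    for sid, (x, y, obs) in stations.items():
        others = {k: v for k, v in stations.items() if k != sid}
        if not others:
            continue
        # Reconstruct the withheld series from the remaining stations.
        est = [interpolate((x, y),
                           [(ox, oy, series[t]) for ox, oy, series in others.values()],
                           cdd)
               for t in range(len(obs))]
        n = len(obs)
        mae = sum(abs(e - o) for e, o in zip(est, obs)) / n
        mo, me = sum(obs) / n, sum(est) / n
        cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
        var = math.sqrt(sum((o - mo) ** 2 for o in obs) *
                        sum((e - me) ** 2 for e in est))
        r = cov / var if var > 0 else float('nan')
        results[sid] = (r, mae)
    return results
```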
The majority of interpolated monthly temperature anomalies are highly correlated (in 95% of cases, r is >=0.56) with their withheld validation data, and mean absolute errors mostly lie between 0.25 and 0.75 °C (MAE is <=0.87 in 95% of cases). Correlations are particularly strong in the densely observed mid-latitudes, and as expected they are weaker where the observation network is sparser and interpolation distances are larger. The monthly DTR anomalies have larger interpolation errors (95% of MAE values are <=0.87 °C) and weaker correlation coefficients (though 95% of correlations are >=0.41). The pattern of cross-validation outcomes for DTR again shows the most reliable values are in regions with dense observations, e.g. N America, parts of Europe, China and Japan, and well-observed regions of S America, southern Africa and Australia. There is some indication of weaker correlations in coastal regions for the DTR cross-validation, but the size of the interpolation errors (MAE) shows a different pattern, with a region of raised errors in the SW USA and Mexico and very small errors in Europe and China.
The cross-validation for the interpolated monthly precipitation anomalies shows a broader range of outcomes, consistent with the shorter CDD for this variable, but still overwhelmingly dominated by positive cross-validation correlations (95% are >=0.38). The mode of the distribution of MAE lies just below a 30% relative error, with 95% of MAE <=66.81%. Most of the large relative errors for PRE are in very dry regions, such as the edges of the Sahara and other deserts, and will likely be small in absolute terms. The cross-validation gives correlations above 0.8 for the regions with dense networks. It is more common to find correlations around 0.5 or lower in regions with sparse data, though it is likely (due to cancellation of the random component of errors) that correlations for seasonal, annual and decadal mean values would be greater than for the monthly values shown here.
Many users have made their own attempts at validating CRU TS, usually for particular variables (TMP, PRE, DTR). These range from comparisons with other, regional data sources (e.g. 20,21,34 ) to global intercomparisons (e.g. 22,35 ), using in-situ, satellite-based or reanalysis-based data sources. These independent evaluations, of which there are many others, are expected to continue for CRU TS v4. The comparisons with established global datasets (Figs. 7 and 8) described above also serve to underline the validity of key CRU TS variables.

Usage Notes
The NetCDF-formatted output files of CRU TS data may be read with any NetCDF tools; they are CF-1.4 compliant. Files for all variables except PET contain two data variables: the named variable (e.g. 'tmp') and a station count ('stn') giving the number of stations used to build each datum. These two variables have identical dimensions. The ASCII text-formatted output files, where the 'stn' data are in accompanying files, may be read programmatically. The 'stn' data may be used to quantify uncertainty, most particularly by excluding cells with a zero count, as these will have been set to the default climatology. Other sources of uncertainty, particularly those associated with the observations themselves, cannot be quantified in the same way: representativeness varies from variable to variable, and we are dealing with monthly means (or totals) rather than daily or sub-daily measurements. The cross-validation exercises summarised in the previous section can give some confidence in the interpolation process itself, but as with all such metadata, it is for the user to decide on the boundaries of uncertainty for their particular application.
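A short sketch of the masking step recommended above. Small stand-in arrays are used here so the example is self-contained; with the real release files, the 'tmp' and 'stn' variables would be loaded with a NetCDF reader such as netCDF4 or xarray.

```python
import numpy as np

# Stand-ins for the two co-dimensioned variables described in the text:
# the named variable (here 'tmp') and the contributing station count 'stn'.
tmp = np.array([[10.0, 12.5],
                [ 8.0, 11.0]])
stn = np.array([[3, 0],
                [1, 0]])

# Exclude cells with a zero station count: these hold the default
# climatology rather than observation-informed values.
masked = np.where(stn > 0, tmp, np.nan)
```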

Code availability
Code archives for CRU TS releases are available on the CRU website, accompanying each release. The automatic archiving of code for each release was introduced recently, so archives are not available for releases prior to v4.03.