Inferring the depth of pre-instrumental earthquakes from macroseismic intensity data: a case-history from Northern Italy

Determining the hypocentral depth of pre-instrumental earthquakes is a long-standing geophysical issue that still awaits to be elucidated. Using very well documented recent earthquakes we found that the depth of crustal and upper-mantle events correlates well with the slope of the first 50 km of their intensity attenuation curve, regardless of their magnitude. We used this observation to build a magnitude-independent method for calculating the depth of selected historical and early-instrumental earthquakes of northern Italy based on their macroseismic intensity field. Our method relies on both standard intensity data and questionnaire-based data for 20 earthquakes, encompassing a relatively large range of magnitude (Mw 4.0–5.8) and depth (3.0–72.4 km), that occurred in Northern Italy between 1983 and 2019. We then used the method to estimate the depth of 20 older earthquakes that occurred in the same region between 1570 and 1972. Knowing the approximate depth of historical earthquakes is crucial for assigning them to the relevant seismogenic source, especially where seismogenic faults occur at different depths, allowing for a better characterisation of the region’s seismotectonic setting. Knowing the focal depth also allows recalculating the equivalent magnitude, which turns out to be consistently larger for deeper events, suggesting a reassessment of the local seismic hazard.

In this paper we discuss a new method for constraining the depth of historical earthquakes based on the attenuation of macroseismic intensity within the first 50 km of the earthquake epicentre. We show that recent, instrumentally-located earthquakes that are also well described from the macroseismic point of view can be used as a valuable key to infer the depth of past earthquakes.
Our study area includes the NE-verging Northern Apennines fold-and-thrust belt of Italy and its foredeep/ foreland area in the southern Po Plain. The region is especially well-suited for our goals because local earthquakes -and hence their causative faults -exhibit largely variable kinematics, ranging from purely extensional to purely compressional, and hypocentral depth, from very shallow to deep (>100 km) 1 (Fig. 1). It is well established that different geodynamic frameworks may be characterised by highly variable attenuation properties, ultimately resulting in rather different attenuation curves [2][3][4][5][6][7] . For example, Gasperini 2 found that the Italian region can be subdivided into two portions -northern and southern -characterised by different attenuation properties. We maintain that the 50 km maximum epicentral distance we considered in our calculations should keep the scatter of the attenuation properties within the experimental uncertainties imposed by the nature of the observations. The Northern Apennines thrust belt evolved within the slow Africa-Europe plate convergence; it is well described by an outer compression and inner extension pair migrating northeastward in the wake of the retreat of the subducting Adriatic slab 8,9 . The geometry of the Northern Apennines orogen is complicated by major discontinuities perpendicular to the main structural trend, interpreted as a consequence of the differential retreat of the west-dipping subduction system 10,11 .
In its turn, the Po Plain is a sedimentary basin formed by the Plio-Quaternary foredeep of the Northern Apennines 12 . Its southern portion hosts the external buried thrusts of the chain, organised in a complex system of arched fronts overlain by a variable thickness of terrigenous sediments 13 .
Besides the relatively well known, shallow active thrusts 14 (Table 2; solid blue dots), and all the analysed historical and early-instrumental earthquakes (analysed set in Table 3; solid red dots) for which we computed the focal depth based on the results of this work. In all cases the size of the symbols is scaled with magnitude. Orange boxes are the surface projections of Composite Seismogenic Sources from the DISS database 1 (v. 3.2.1). The outcropping Quaternary deposits are highlighted in dark gray. (Inset) Schematic section across the Northern Apennines, showing the expected depth range and relative position of the different seismogenic source types with respect to the subduction system. The sources are plotted along with selected associated earthquakes and seismogenic sources from the DISS database. Note that the seismogenic source types 4 and 7 (Table 1) cannot be shown as they lie parallel to the strike of the section. The Moho depth is from Ziegler and Dèzes 56 ; the top of the Adriatic monocline (in green) is from Livani et al. 13 . magnitude affects the isoseismal pattern more than hypocentral depth. This methodology was applied to recent (instrumentally-recorded), low-magnitude British earthquakes illuminated by good quality macroseismic data, and the calculated depths turned out to be close to the instrumental ones 33 .
The Italian historical earthquake catalogues -CPTI15 18 and CFTI5Med 34 -use the Boxer code 23 routinely for constraining the earthquake parameters. Its most recent release 35 embeds the method proposed by Musson to determine the hypocentral depth, but relies on individual Macroseismic Data Points (MDPs) rather than on isoseismals, which makes it more reliable and less subjective. Nevertheless, a comparison between instrumental and intensity-based focal depths of Italian earthquakes shows that in most cases the two estimates are very different 35 .  Table 2. List of the earthquakes used as learning set and shown in Fig. 1. The magnitudes are all M w except for events #9 and #20 (highlighted with an asterisk), for which we show an M L . MDP is the total number of available Macroseismic Data Points. The macroseismic data are from the ASMI database 59 (https://emidius.mi.ingv.it/ ASMI/) for the earthquakes from #1 to #5, and from the HSIT database for #6 to #20 (see Suppl. Table 1 for a direct link to the web page of each earthquake). The "total responses" is the total number of macroseismic questionnaires available for each earthquake and is shown only for the HSIT dataset. For each earthquake the slope of the attenuation curve was calculated using Eq. 1 and fitting 8 or 9 points: the results are shown in Fig. 3. The Source type column lists hypothesised types of causative faults inferred from location, depth and focal mechanism and compared with the reference tectonic model (see Fig. 1). Extensional faults located west of the regional drainage divide and antithetic to the subduction system and their west-dipping antithetic faults 1-10 6 Deep intraslab compressional faults >20 7 Transverse fault systems related to differential slab retreat >10 Table 1. Seismogenic source types that exist in the study area (from Vannoli et al. 20 ) and their estimated depth range from DISS database 1 . The depth of the source types 3, 6 and 7 is correlated to the geometry of the slab, so we provide only a minimum depth. See Fig. 1 for a conceptual sketch of their relative position.
On these grounds the hypocentral depths were not considered reliable, and were not supplied in the latest versions of both CPTI15 and CFTI5Med. We believe that the reason why the method developed by Musson 32 does not seem to work on Italian earthquakes lies in the larger structural diversity of the Italian peninsula compared to Great Britain. Nevertheless, we maintain that there exists a simpler but robust macroseismic method for estimating the hypocentral depths even in extremely heterogeneous crustal regions. The method we propose rests on three fundamental pillars: (a) it relies only on the slope of the attenuation curve, not on the absolute intensity levels attained by any given earthquake: as such, it is fully independent of its magnitude; (b) it is based on individual MDPs, not on isoseismals: as such, it eliminates the subjectivity relating to the style of isoseismal drawing and to the associated geographic uncertainties; (c) it is primarily based on data gathered through the web-based questionnaires collected in the HSIT database (Hai Sentito Il Terremoto, or Did You Feel the Earthquake; http://www.haisentitoilterremoto.it/): a nationwide, community-based effort for collecting information from people who felt an earthquake and for creating large and homogeneous pseudo-intensity datasets 36 Table 3. List of the historical and early-instrumental earthquakes analysed in this study (referred to as the analysed set) and shown in Fig. 1. The Expected depth and the Depth range were calculated using Eq. 1. The M L and M w were recalculated using Eq. 2 and the conversion equation by Grünthal and Wahlstrom 42 . The last column shows the % increase or decrease of the recalculated seismic moment with respect to the original estimate. MDP is the number of the Macroseismic Data Points, i.e. the number of municipalities with an intensity assigned. The Source type column lists hypothesised types of causative faults inferred from location and expected depth and compared with the reference tectonic model (see Table 1 and Fig. 1

Results
Using a set of recent, well-recorded crustal and upper-mantle earthquakes we found that their depth correlates well with the slope of the first 50 km of the curve describing the decay of macroseismic intensity with epicentral distance. The retrieved depth can then be used for constraining two equally important and totally independent aspects of seismic hazard assessment at any scale: (a) the assignment of any given earthquake to a specific seismotectonic process, i.e. to a specific source type; and (b) the re-evaluation of earthquake magnitude for all events that turn out to be deeper than "upper crustal", assumed as the standard depth by Boxer.
The subduction system that includes the Northern Apennines fold-and-thrust belt comprises different seismogenic sources located at different depth (Table 1 and Fig. 1). Assigning each of them to the relevant seismotectonic domain is a pre-requisite of any fault-based seismic hazard assessment scheme. This step, however, requires that the focal depth be estimated with an accuracy that is smaller than the average depth difference between the different fault types coexisting in the same area. Since the earthquake attenuation is also a function of depth, the availability of this latter parameter may be the key for correctly interpreting the variability of the attenuation parameters in the different existing geodynamic settings 35 .
We started off from the analysis of three instrumental earthquakes that occurred in 2012 in our study area (Fig. 2, #11, 12 and 13 in Table 2), with M w ranging between 4.9 and 5.8. All three shocks were felt over a large portion of central and northern Italy, as testified by the many data reported in HSIT (an average of 9,000 observations per shock). They occurred closely spaced in time within a region where events of this size are relatively rare: yet, they exhibit a widely different focal depth, suggesting they were generated by independent tectonic processes.
For each event we generated an attenuation curve based on HSIT data: given the wealth of closely-spaced datapoints, the intensity values were averaged over circular moving windows of 10 km -i.e. over progressively larger rings centred on the epicentre -each one overlapping the adjacent by 5 km: an approach similar to that used by Gasperini 2 . By doing this we averaged out any possible anisotropies, so that the observed trend can be assumed independent of local anomalies. Other investigators 4 used a different binning method based on intensity values, referred to as "intensity binning method". We decided not to use this approach, however, because in the first 50 km the intensity decay is generally low, especially for deeper events; this would result in a very limited number of averaged intensity points, whose linear fit would return highly uncertain results. Nevertheless, the two approaches show similar results within the first 100 km from the epicentre 4 .
Following Gasperini 2 , who showed that the first part of the intensity attenuation function has a linear trend up to 40-50 km, we calculated the parameters of the line that fits best the first 50 km of the attenuation curves of the three 2012 earthquakes discussed above. We noticed that their slope decreases as the hypocentral depth increases (see Table 2). This encouraging result led us to extend the analysis to 17 additional earthquakes that occurred in our study area, selected following the criteria described in the Method section. Together with the three 2012 earthquakes, all these events form a learning set to be used for subsequent elaborations. Figure 3 shows that they support the initial observation that the slope of the attenuation curve in the distance range 0-50 km correlates well with hypocentral depth. In contrast, the slope does not seem to be affected by magnitude, as clearly shown by events #10, 11 and 20, having a rather different M w (4.0-4.9) but a similar depth (21.0-29.0 km), and hence a  Table 2). Individual intensity datapoints were obtained averaging the intensity values over a circular moving window of 10 km, overlapping the adjacent window by 5 km. We used a linear fit for the first 50 km of the curves and calculated the resulting slope. www.nature.com/scientificreports www.nature.com/scientificreports/ similar slope (0.024-0.029), while events #16 and 20 exhibit a similar M w (4.6-4.7) but a rather different depth (8.7-20.6), resulting in a very different slope (0.026-0.046); see Table 2 and Fig. 3.
We then computed the absolute value of the slope (S) for the attenuation curve of each earthquake (Table 2) and plotted it versus the hypocentral depth (D). We found that the data points (Fig. 4) are fitted by the logarithmic function with a Pearson Coefficient of 0.87, corresponding to a significance level of correlation better than 10 −3 : ± . + . ± . S ( 0 022 0 003) ln D 0 099 0 009 (1) We are now ready to use Eq. 1 as a modern Rosetta Stone that should reveal a plausible range for the hypocentral depth of pre-instrumental earthquakes, once the slope of their attenuation curve is calculated. The data show that this slope -and hence the entire approach -is independent of the magnitude of the earthquake being considered, but depends largely on its focal depth ( Fig. 3 and Table 2).
We then selected all the earthquakes that occurred in our study areas since the year 1800 and are backed by at least 100 MDP (Fig. 1). Our selection resulted in 17 events with magnitude ranging between 4.9 and 6.5 (Table 3; events from #24 to #40). Their epicentres fall in both the extensional and compressional domains of the Northern Apennines. We also included three older earthquakes that are backed by less than 100 MDP but are among the largest ever reported in our study area (17 November Table 3). For each of these 20 events, which we refer to collectively as the analysed set, we obtained the expected depth and the associated depth range (Table 3), based on the confidence intervals shown in Fig. 4.
Once the expected depth was estimated, we calculated the expected M L for all the earthquakes of the analysed set by inverting for magnitude the Intensity Prediction Equation (IPE) proposed by Tosi et al. 39 . This equation, derived using the whole HSIT macroseismic dataset, takes the following form (Eq. 2):  Table 2). Individual intensity datapoints were obtained averaging the intensity values over a circular moving window of 10 km, overlapping the adjacent window by 5 km. We used a linear fit for the first 50 km of the curves and calculated the resulting slope.
www.nature.com/scientificreports www.nature.com/scientificreports/ where I is the Mercalli-Cancani-Sieberg (MCS) macroseismic intensity and r is the hypocentral distance in km. We used the average intensity data following standard methodologies 40,41 in the distance range 0-200 km. We did not recalculate M L for the events whose macroseismic field does not extend to 200 km. All the inferred M L were then converted into M w using the equation proposed by Grünthal and Wahlström 42 (Table 3), to allow them to be compared with the magnitudes reported in CPTI15. We estimated the associated error by calculating the M w for the learning set with the same procedure used for the analysed set. We then compared the results with the instrumentally-determined magnitudes, obtaining a root mean square error of 0.44 for the whole dataset and of 0.25 for the earthquakes of M w ≥ 4.8, that is the starting magnitude of the range considered in the analysed set (Table 3).

Discussion characteristics of the attenuation curves.
To obtain the attenuation curves we first assumed isotropic intensity attenuation, then binned intensity values inside circular windows. The finding that the attenuation curve can be approximated by a straight line in the first 50 km from the epicentre (Figs 2 and 3) agrees with the results obtained by Gasperini 2 . After imposing a hypocentral depth of 10 km for all the analysed historical earthquakes, this investigator found empirically that the attenuation of seismic intensity in Italy is well described by a bilinear model; the change in slope occurs rather abruptly between 40 and 50 km from the assumed epicentre, and the average slope of the first part of the attenuation curve is 0.05 to 0.06 2 . Previous workers observed an abrupt change in the slope of the attenuation curve at about the same epicentral distance using a numerical simulation of the attenuation of PGA 43 .
A previous investigation of seismic intensity attenuation in Italy 44 found a correlation between the heat flow and the slope of the attenuation curve over the first 45 km from the source. In this investigation the hypocentral depth for all earthquakes was fixed at 10 km; the results show larger attenuation in areas of high heat flow. This result can be also interpreted in terms of variations of the focal depth, since the seismogenic layer is thinner in high heat flow areas 45 , and hence earthquakes are necessarily shallower.
the Rosetta Stone: pros and cons. Our method is independent of magnitude over the full range analysed in this study (M 4.0-6.5). In fact, we observed that the slope of the attenuation curve within the first 50 km from the epicentre is strongly influenced by the hypocentral depth, while an increase in magnitude only causes its upward shift: this circumstance results from a generally larger epicentral intensity (I 0 ), and consequently larger intensity values at least within the first 50 km from the epicentre (Figs 3 and 5).
The independence of magnitude is a crucial feature that distinguishes our approach from other methods. Recently, Traversa et al. developed a methodology based on an exploration-tree approach to estimate the M w and depth of French historical earthquakes 41 . In this procedure, however, the two unknowns, i.e. depth and magnitude, were not treated independently. On the contrary, our approach does not require calculating the magnitude, but its only requirement is an even distribution of MDP within the first 50 km around the epicentre. For example, we were able to infer the depth of the 1570, 1624 and 1688 earthquakes, the three oldest events of our dataset (Table 3), despite the limited number of available MDPs; but we are well aware that the poor data coverage  Table 2) and is shown along with its standard error relative to the slope. www.nature.com/scientificreports www.nature.com/scientificreports/ inevitably translates into a larger uncertainty in slope estimation (see the Method section for a description of the estimation procedure).
The inferred depth range of the analysed set (Table 3) is larger for deeper events. This is inherently due to the logarithmic behavior of the data trend as confirmed by the log-character of the best fit curve. For example, the 1920 earthquake (#36) exhibits a slope of 0.057, implying that it is a shallow event with a depth range of 5 to 10 km and an expected depth of 7 km, while the 1909 earthquake (#29), whose slope is 0.017, is a deeper event falling in the range 34 to 56 km and having a preferred depth of 41 km. Hence, all shallow earthquakes, i.e. those whose slope and depth are >0.045 and <15.0 km, respectively (Fig. 4), are easier to discriminate from all other intermediate and deep events, whereas separating the deepest events is not straightforward. In spite of this limitation, even a raw depth estimation is usually sufficient to assign these deeper earthquakes to their relevant source type (Fig. 1), given the large difference between the minimum and maximum depth of seismogenic faults in our study area 1,20 (Table 1).
Applying the Rosetta Stone to selected unknown-depth earthquakes: seismotectonic implications. Here we discuss the relationships between the hypocentral depth we calculated using Eq. 1 for earthquakes of the analysed set (Table 3) and their relevant seismogenic source types (#1, #3, #4 and #7 for the Po Plain; #2 to #7 for the hinge of the Apennines: see Fig. 1 and Table 1).
For some of the earthquakes located in the Po Plain (#21, #23, #26, #29 and #38 in Fig. 1 and Table 3) we obtained a depth that falls well below the southward-dipping basal detachment of the outer Northern Apennines thrust fronts, locally attested between 5 and 15 km depth 14 (Fig. 1). We also recalculated the magnitude of these earthquakes in view of their estimated hypocentral depth, with the exception of the 1570 and 1688 events because their MDPs do not reach a distance of 200 km from the epicentre.
Some of the results we obtained are quite striking: for example, the recalculated M w for the deep 1909 earthquake (#29 in Table 3) is 6.2: a figure that, even considering the associated mean root square error of Figure 5. Attenuation curves obtained for all the earthquakes of the analysed set ( Fig. 1 and Table 3). The 50-km slopes were calculated using the same procedure described for the learning set earthquakes (see text and www.nature.com/scientificreports www.nature.com/scientificreports/ 0.25 obtained from the learning set, is significantly larger than the 5.4 proposed by the CPTI15 catalogue 18 . Altogether these results highlight the significant seismogenic potential of the inherited faults that occur in the Adriatic crust (type #3 in Fig. 1). In the literature some of the earthquakes they generated have been associated with shallow thrust faults 1 (e.g. the 1688 and 1875 events), but based on the new depth estimation they should now be reinterpreted as belonging to one of the deeper domains. More specifically, the 1909 earthquake, that had already been hypothesised to have a deep source 46 , falls close to the epicentre of the M w 5.5, 1796 earthquake, whose macroseismic field resembles that of a deep earthquake 20 . Unfortunately, the number of MDPs available for this event is too limited for it to be analysed in this study; nevertheless, we suggest that also this earthquake has a deep source, and hence that its magnitude is likely to be even larger than that recalculated for the 1909 earthquake.
Moving westward, for the 1951 earthquake (#38 in Fig. 1 and Table 3) we found an estimated depth range confirming this as a deep event, in agreement with a recent reappraisal based on instrumental data 19 . The occurrence of this earthquake has long been debated as its hypocentre falls below a major gas-and oil-field that started being exploited after World War II. For this reason, it was originally suggested it could have been triggered by gas extraction 47 , implying a shallow source. Conversely, our elaborations put its hypocentre in a depth range 36-59 km ( Table 3), suggesting that the earthquake was caused by a still unidentified inherited fault in the lower Adria crust (type #3 in Fig. 1) and dismissing the hypothesis that it was triggered by human activities. Its recalculated M w is 5.9 ± 0.25, significantly larger than the M w 5.2-5.3 respectively assigned by the CPTI15 catalogue 18 and by the instrumental determination 19 .
For other earthquakes that occurred in the study area (events #22, #27, #30, #32, #33, #37 and #39: Fig. 1 and Table 3) we found a shallow hypocentral depth that relates them to the activity of the outer Northern Apennines thrust fronts (type #1), in agreement with previous regional and local studies 1,20,48,49 ; this stresses even further their seismogenic potential, recently highlighted by the May 2012 earthquake sequence.
Although the instrumental earthquakes used to build the curve of Fig. 4 were generated mostly by shallow and mid-crustal compressional sources of the Northern Apennines thrust belt (Table 2), since our analysis did not take into account the faulting mechanism, we applied our method also to earthquakes that occurred along the crest of the Apennines, such as the 29 June 1919, M w 6.4, Mugello and the 7 September 1920, M w 6.5, Garfagnana events (#35 and #36 in Table 3). These are the largest earthquakes of the whole dataset, and due to their location, they were likely generated by shallow crustal extensional faults 1 (type #5 in Fig. 1) belonging to the northern Etrurian Fault 21 , a large system of gently northeast-dipping normal faults that typically extend up to 20 km depth.
For the 1920 earthquake our elaborations return a possible depth range of 5 to 10 km, fully compatible with the existing seismotectonic model 1 . Conversely, for the 1919 event we obtained a depth range 14-23 km, slightly deeper than expected. The macroseismic field of this earthquake, however, is known to have been contaminated by the effects of the 10 November 1918, M w 6.0, Santa Sofia earthquake 34 (#34 in Table 3), which returned a depth range of 3 to 6 km and is located about 30 km east of the 1919 epicentre. On the one hand, the reported intensity field for the 1919 earthquake appears much more spread out than it is in reality, due to the difficulty of separating the effects of the previous earthquake; this results in an apparent lower attenuation of macroseismic intensity, and hence in anomalously deeper depth. On the other hand, the effects of the 1918 earthquake probably affected also our calculation of the magnitude of the 1919 earthquake, since the M w 5.6 we obtained is definitely smaller than the M w 6.4 supplied by the CPTI15 catalogue. Similar contaminations are common in Italy as much as in many other seismogenic areas and call for special care when dealing with earthquakes whose territorial impact overlaps with that of nearby, similarly large shocks. potential seismic hazard implications. Our findings have important implications for seismic hazard assessment, especially when conducted at the local scale and in areas of low or sparse seismicity. Under these circumstances any SHA analysis, either deterministic or probabilistic, usually relies on a handful -sometimes just one or two -of early-or pre-instrumental earthquakes, representing the only available source of seismological information. This makes it crucial to characterise them as accurately as possible based on their intensity reports.
The focal depth is indeed a most crucial parameter for all these analyses, because: • earthquakes occurring at different depth intervals can be assigned to different seismotectonic categories (see Table 1 and inset in Fig. 1), which allows also their kinematics and their expected rates of occurrence to be determined with better confidence. This information and the ensuing hypotheses may eventually lead to devise a 3D model of seismogenic zonation 46 -i.e. a model that is different at different depths -thus overcoming a known limitation of standard zonation models 50 ; • the expected increase of the inferred magnitude with depth may also have substantial seismic hazard implications. We find that in some instances the corresponding M 0 increases by over one order of magnitude (see the last column in Table 3), implying that current catalogues may underestimate the earthquake budget associated with a specific type of earthquake sources by 90% or more; • the combined increase in earthquake magnitude and depth also has substantial earthquake engineering implications, for at least two reasons: (a) a larger shock implies larger expected ground shaking, or at least a similar shaking level over a much broader region; and (b) the seismic radiation associated with larger and deeper earthquakes is generally characterised by rather different spectral amplitudes with respect to shallower and smaller events. In particular, they may feature a larger low-frequency content, which may cause resonance effects in tall buildings 51 .

conclusions
We elaborated a methodology to estimate the earthquake hypocentral depth based on macroseismic intensity data. The methodology is especially suited for obtaining a realistic depth range for earthquakes that occurred in the pre-instrumental and early-instrumental era, up to the 1970s, and for which there exists a detailed reconstruction of the macroseismic field. In other words, while reasonably accurate estimates of the focal depth exist for all significant Italian earthquakes of the past 50 years, our method can be used to extend this interval to the past three centuries and beyond. Hence, the method allows each earthquake to be assigned to one of the various seismogenic source types that exist at different depth intervals in any active region, providing crucial insight to improve the understanding of seismogenic processes and of their rates. By analysing the slope of the first 50 km of the attenuations curve of well-documented recent earthquakes we found a statistically significant correlation with their hypocentral depth. The correlation obtained from this learning set is the key to infer the depth range of historical and early-instrumental earthquakes. Our methodology was tested on a set of such earthquakes -which we referred to as the analysed set -that occurred in northern Italy, a region characterised by multiple seismogenic sources occurring at different depth. The test returned encouraging results and our methodology is ready to be extended to any other seismogenic area of the world, ideally following a re-calibration based on local data.
The method is easy to use, independent of magnitude, and suitable also for incomplete macroseismic intensity fields, as in the case of the oldest earthquakes of our dataset, dating back to the 16 th and early 17 th century. We wish to stress that the method works independently of magnitude, implying that the obtained depth estimates can be used to recalculate the macroseismic magnitude itself.
We show that for earthquakes deeper than 30 km the revised magnitudes are consistently larger than those based on intensity alone (e.g. using the Boxer code). We conclude that in addition to highlighting the existence of deeper and less known seismogenic sources, our methodology may have a significant impact on the calculation of annual earthquake rates, and hence on seismic hazard assessment, particularly in areas of infrequent seismicity and low strain rates.

Method
Selection criteria for the learning set. We selected the events from the web-based macroseismic HSIT catalogue ( Fig. 1; #6 to #20 in Table 2), that relies on the INGV catalogue 52 for instrumental earthquake locations, using the following criteria: • M ≥ 4.0 for earthquakes deeper than 25 km, in order to collect a statistically significant number of events; • M > 4.5 for events having hypocentral depth ≤25 km.
From this large dataset we discarded: • all the events whose hypocentral depth was fixed (i.e. not obtained from the seismograms); • all the events for which less than 1,000 questionnaires are available; • all the aftershocks that occurred too close in time to the mainshock, in order to prevent any possible mistakes by the compilers 53 .
These rather strict criteria were fulfilled by 14 earthquakes, including the three 2012 earthquakes discussed earlier and shown in Fig. 2, ranging in depth from 3.0 to 72.4 km and ranging in M w between 4.0 and 5.8 (Table 2). To further extend our dataset we also included six significant earthquakes that do not fit the above requirements: • the 26 March 2008, M w 4.0 earthquake (#6 in Table 2): a relatively small event that is backed by only 85 questionnaires but occurred at 72 km depth and is well constrained by seismological data. This event is crucial for extending the explored focal depth range, being one of the only two events that occurred at a depth of over 50 km; • five additional earthquakes falling in our study area but not included in the HSIT catalogue because they occurred before 2007 (Fig. 1), provided that their location is reliable and, more specifically, that their hypocentral depth is calculated (non-fixed).
These latter five events (#1 to #5 in Table 2) were taken from the DBMI15 37 and CFTI5Med 34 databases; they all have M > 4.5 and a number of MDP larger than 100. We made this choice following the results of a recent study 54 that has shown that traditional macroseismic datasets are fully consistent with HSIT observations. Unfortunately our learning set, that is the outcome of the application of these selection criteria, exhibits an uneven distribution of earthquakes with depth, the events shallower than 35 km being 80% of the total. intensity estimation methods. The macroseismic data of recent earthquakes that occurred since 2007 were obtained through an online macroseismic questionnaire "Hai Sentito Il Terremoto" (Did You Feel the Earthquake), developed, published and maintained by INGV (http://www.haisentitoilterremoto.it/). The online questionnaire is compiled by volunteers and by registered users (currently about 27,000, spread over the whole Italian territory). Each questionnaire is analysed through automatic procedures, and a macroseismic intensity is ultimately assessed for each municipality. The procedure for the interpretation of the questionnaires was described in detail in previous papers 36,39 . The HSIT macroseismic intensities have been found to be in good agreement with the intensity values obtained from conventional macroseismsic surveys 36,39,54 .
The macroseismic data for earthquakes that occurred before 2007 were derived from field macroseismic surveys or from the interpretation of questionnaires compiled by Italian local administrations. Macroseismic data of recent and historical earthquakes reported in the CPTI15 database 18 derive from as many as 185 original studies, databases, reports, and bulletins. In particular, the data for most of the largest earthquakes were supplied by the CFTI5Med catalogue 34 , a new-generation analytical catalogue that reports a great deal of information on the effects of each earthquake at each of the affected localities. Tables 2 and 3 report the source of macroseismic data for each earthquake analysed in this work.
See Supplementary Tables 1 and 2 to find out the URL of the web page containing the macroseismic data available for each earthquake.
Uncertainty estimation of the slope of the attenuation curves. The reliability of the slope calculated for the initial 50 km of the attenuation curves, both for recent and for historical earthquakes, is controlled by several factors, including the anisotropy of attenuation, the number of available MDPs, and the intrinsic uncertainties in MDPs.
The attenuation curves are obtained fitting averaged intensities in a circular moving window of 10 km, overlapping the adjacent window by 5 km. The slope of the attenuation curves in the first 50 km is calculated from a first-degree fit of a set of 8-9 averaged intensities, depending on data availability. The resulting coefficient of determination R 2 values are often good at a significance level better than 10 −3 , except for earthquakes #6, 8,17,20,23,35, that exhibit a significance of 10 −2 . Earthquakes #12, 26, 40 exhibit a scattered attenuation pattern, causing low R 2 (Tables 2 and 3). Each slope value is supplied along with its standard error bar (Fig. 4). earthquake parameters. The parameters of the earthquakes of the learning set that occurred since 1985 were obtained from the INGV earthquake catalogue 52 (available from http://cnt.rm.ingv.it/iside) and are listed in Table 2. The parameters of the only event older than 1985 (#1) were derived from the CSTI1.1 database 55 . For the depth values we used data from the same databases, except when there exists a specific study about a given earthquake; in that case, we preferred the re-located depth given in the study (source specified in Table 2). The parameters of the analysed set (Table 3) were taken from the CPTI15 and CFTI5Med catalogues 18,34 .

Data availability
All macroseismic data used during this study can be downloaded from public web sites (see Supplementary  Tables 1 and 2 for details).