Fatigue dataset of high-entropy alloys

Fatigue failure of metallic structures is of great concern in industrial applications. A material is of little practical use if it is prone to fatigue failure. To take advantage of the recently emerged high-entropy alloys (HEAs) for designing novel fatigue-resistant alloys, we compiled a fatigue database of HEAs from the literature reported until the beginning of 2022. The database is subdivided into three categories, i.e., low-cycle fatigue (LCF), high-cycle fatigue (HCF), and fatigue crack growth rate (FCGR), which contain 15, 23, and 28 distinct data records, respectively. Each data record in any of the three categories consists of a summary, which comprises alloy compositions, key fatigue properties, and additional information that influences, or is interrelated with, fatigue (e.g., material processing history, phase constitution, grain size, uniaxial tensile properties, and fatigue testing conditions), and an individual dataset, which holds the original fatigue testing curve. Some representative individual datasets in each category are graphically visualized. The dataset is hosted in an open data repository, Materials Cloud.

Each category contains tens of independent records. We hope that our database serves merely as a starting point, from which the fatigue database of HEAs can continuously evolve and grow. We call for contributions from all researchers in the field, especially those studying the fatigue behavior of HEAs, to sustain the continuous growth and maturation of the database.

Methods
The data are sourced from publications on the fatigue properties of HEAs, from the appearance of the very first publication in 2012 until the beginning of calendar year 2022. Over this roughly 9-year span, about 80% of the fatigue data were published after 2018. Web of Science and Google Scholar were the two main search engines used for locating the publications in this sub-field. After the publications were downloaded, the figures containing fatigue data were captured as screenshots, and the data points were digitized with WebPlotDigitizer version 4.5 (ref. 25) and stored in an Excel template. Recorded alongside the extracted data were the alloy compositions and a number of factors that could influence the fatigue properties of the alloys, such as processing history, grain size, uniaxial tensile properties, and fatigue testing conditions.
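The digitization step amounts to mapping pixel positions in a screenshot back to data values through a calibrated axis. WebPlotDigitizer handles this internally; the sketch below only illustrates the underlying idea for linear and logarithmic axes, and is not the tool's actual code. The function name, the pixel coordinates, and the tick values are hypothetical.

```python
import math

def pixel_to_value(pixel, pixel_ref, value_ref, log_scale=False):
    """Map a pixel coordinate to a data value using two calibration ticks.

    pixel_ref: (p1, p2) pixel positions of two known axis ticks
    value_ref: (v1, v2) data values at those ticks
    """
    (p1, p2), (v1, v2) = pixel_ref, value_ref
    t = (pixel - p1) / (p2 - p1)  # fractional position between the two ticks
    if log_scale:
        # Logarithmic axes are linear in log10 space.
        log_v = math.log10(v1) + t * (math.log10(v2) - math.log10(v1))
        return 10 ** log_v
    return v1 + t * (v2 - v1)

# Hypothetical calibration of an S-N style plot:
# x axis (log scale) with ticks at 1e2 and 1e6 cycles,
# y axis (linear) with ticks at 0 and 800 MPa; pixel rows grow downward.
n_cycles = pixel_to_value(300, (100, 500), (1e2, 1e6), log_scale=True)
stress_mpa = pixel_to_value(250, (450, 50), (0.0, 800.0))
```

Fatigue-life axes are usually logarithmic, so choosing the wrong axis type at calibration would distort every extracted point, which is one reason to replot the digitized data against the source figure.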
The raw data in the literature were originally reported in a variety of units. Unit conversions were therefore applied to the extracted data, so that all data of the same type in the database share consistent units. On occasion, certain data were ambiguous or essential data were missing from the publications. In such instances, we emailed the authors asking for clarification or for the raw data. If the authors did not reply, even after a couple of follow-up attempts, we dropped the records in order to prevent any confusion or misrepresentation.
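A unit-harmonization step of this kind can be sketched as a lookup of conversion factors. This is a minimal illustration, not the procedure actually used: the target units (MPa for stress, dimensionless fraction for strain), the set of supported source units, and all names are assumptions.

```python
# Conversion factors into the assumed target units.
STRESS_TO_MPA = {"MPa": 1.0, "GPa": 1000.0, "ksi": 6.894757, "psi": 0.006894757}
STRAIN_TO_FRACTION = {"fraction": 1.0, "%": 0.01}

def convert(value, unit, factors):
    """Convert a value to the target unit, rejecting unknown source units."""
    if unit not in factors:
        raise ValueError(f"unsupported unit: {unit!r}")
    return value * factors[unit]

def to_mpa(value, unit):
    return convert(value, unit, STRESS_TO_MPA)

def to_strain_fraction(value, unit):
    return convert(value, unit, STRAIN_TO_FRACTION)

stress = to_mpa(100.0, "ksi")            # a stress reported in ksi
strain = to_strain_fraction(0.5, "%")    # a strain amplitude reported in percent
```

Raising an error on an unknown unit, rather than passing the value through, keeps an unconverted number from silently entering a column that is supposed to be uniform.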
Additionally, we expanded the fatigue data originally reported in the literature by adding extra data columns that are directly derivable from the original data. In LCF, the following relationships exist.
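$$\varepsilon_a = \varepsilon_{el} + \varepsilon_{pl} \qquad (1)$$
$$\varepsilon_a = \frac{\Delta\varepsilon}{2} = \frac{\varepsilon_{max}(1 - R)}{2} \qquad (2)$$
where $\varepsilon_a$, $\varepsilon_{el}$, and $\varepsilon_{pl}$ are the total, elastic, and plastic strain amplitudes, $\Delta\varepsilon$ is the strain range, $\varepsilon_{max}$ is the maximum applied strain, and $R = \varepsilon_{min}/\varepsilon_{max}$ is the strain ratio. The extra data columns were obtained by applying Eqs. (1) and (2) to the reported literature data.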
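The derived LCF columns can be sketched as follows, assuming the standard definitions: the total strain amplitude is the sum of the elastic and plastic amplitudes, and the amplitude equals half the strain range, i.e. ε_max(1 − R)/2. The function names and the numerical values are illustrative only.

```python
def total_strain_amplitude(elastic_amp, plastic_amp):
    """Total strain amplitude as the sum of elastic and plastic parts."""
    return elastic_amp + plastic_amp

def max_strain(total_amp, strain_ratio):
    """Maximum applied strain from the amplitude and R = eps_min / eps_max.

    eps_a = eps_max * (1 - R) / 2  =>  eps_max = 2 * eps_a / (1 - R)
    """
    if strain_ratio == 1:
        raise ValueError("R = 1 corresponds to a constant strain, not cycling")
    return 2.0 * total_amp / (1.0 - strain_ratio)

# Illustrative values, not taken from the database.
eps_a = total_strain_amplitude(elastic_amp=0.004, plastic_amp=0.006)
eps_max_reversed = max_strain(0.01, strain_ratio=-1.0)  # fully reversed loading
eps_max_zero_min = max_strain(0.01, strain_ratio=0.0)   # zero minimum strain
```

For fully reversed loading (R = −1) the maximum strain equals the amplitude, which gives a quick sanity check on a derived column.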
Similarly, the following relationship exists in HCF.
$$\sigma_a = \frac{\sigma_{max}(1 - R)}{2}$$
where $\sigma_a$ is the stress amplitude, $\sigma_{max}$ is the maximum applied stress, and $R = \sigma_{min}/\sigma_{max}$ is the stress ratio.
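For HCF, converting between stress amplitude and maximum stress allows records reported in either form to be placed in the same columns. The sketch below assumes the standard relation σ_a = σ_max(1 − R)/2; the function names and the numerical values are illustrative, not database entries.

```python
def stress_amplitude(sigma_max, stress_ratio):
    """Stress amplitude from the maximum stress and R = sigma_min / sigma_max."""
    return sigma_max * (1.0 - stress_ratio) / 2.0

def max_stress(sigma_a, stress_ratio):
    """Inverse relation: maximum stress from the stress amplitude and R."""
    if stress_ratio == 1:
        raise ValueError("R = 1 corresponds to a static load, not cycling")
    return 2.0 * sigma_a / (1.0 - stress_ratio)

# A source reporting the maximum stress at R = 0.1 (illustrative value, MPa).
sigma_a = stress_amplitude(500.0, 0.1)
# A source reporting the amplitude instead; the round trip recovers sigma_max.
sigma_max = max_stress(sigma_a, 0.1)
```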

Data Records
The data are structured by category in seven color-coded worksheets of a Microsoft Excel workbook: "Front page", "LCF summary", "LCF individual dataset", "HCF summary", "HCF individual dataset", "FCGR summary", and "FCGR individual dataset".
The data are subdivided into three broad groups based on the types of fatigue tests conducted, namely, low-cycle fatigue (LCF), high-cycle fatigue (HCF), and fatigue-crack-growth rate (FCGR). LCF, HCF, and FCGR consist of 15, 23, and 28 data records, respectively, as schematically illustrated in Fig. 1. Each record represents a uniquely defined metallurgical condition. For instance, the alloy compositions of two records may be identical, but there is at least one other factor distinguishing them from each other, such as the processing history. For all three groups, each individual record is hierarchical, comprising (1) a high-level summary of all key information about the alloy and its fatigue properties, and (2) a low-level individual dataset delineating the full fatigue testing trajectory. The summary and the individual dataset are linked by one-to-one indices.
The two-level data structure for the LCF data is exemplified in Fig. 2. The summary consists of an array of data that can be broadly classified as the basic alloy information, tensile properties, LCF testing conditions, and the source reference. Each individual dataset is composed of data columns for the number of fatigue cycles, the total, elastic, and plastic strain amplitudes, and the maximum strain. HCF and FCGR have a similar two-level structure, but with different fatigue entries in the summary and completely different data columns in the individual dataset, as illustrated in Figs. 3 and 4.
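One way to mirror this two-level LCF structure in code is a summary object paired with a list of curve points through a shared index. This is a minimal sketch: the class names, field names, and example values are hypothetical and do not reproduce the workbook's actual column headers or any database entry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class LCFPoint:
    """One row of an individual dataset (one point on the fatigue curve)."""
    cycles: float
    total_strain_amp: float
    elastic_strain_amp: Optional[float] = None
    plastic_strain_amp: Optional[float] = None
    max_strain: Optional[float] = None

@dataclass
class LCFSummary:
    """High-level summary row; 'index' links it to its individual dataset."""
    index: int
    composition: str
    processing: str
    tensile: Dict[str, float]
    test_conditions: Dict[str, float]
    reference: str

@dataclass
class LCFRecord:
    summary: LCFSummary
    points: List[LCFPoint] = field(default_factory=list)

def link_records(summaries, datasets):
    """Pair each summary with its dataset, enforcing one-to-one indices."""
    unmatched = {s.index for s in summaries} ^ set(datasets)
    if unmatched:
        raise KeyError(f"indices without a one-to-one match: {sorted(unmatched)}")
    return {s.index: LCFRecord(s, datasets[s.index]) for s in summaries}

# Placeholder values for illustration only.
summary = LCFSummary(1, "CoCrFeMnNi", "placeholder processing route",
                     {"yield_strength_mpa": 300.0}, {"strain_ratio": -1.0},
                     "placeholder reference")
curve = [LCFPoint(1000, 0.010), LCFPoint(10000, 0.005)]
records = link_records([summary], {1: curve})
```

Checking the symmetric difference of the two index sets catches both a summary without a curve and a curve without a summary.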

Technical Validation
Original literature data of the same type that were reported in different units are converted to a consistent unit. The accuracy of the extracted data, the derived data, and the unit conversions was cross-checked and verified multiple times by team members with extensive experience in HEAs and their fatigue properties.
Data visualization serves as another means of data validation. All records in the individual datasets of LCF, HCF, and FCGR are plotted and visually compared with the source plots in the literature from which the data were extracted. Any discrepancies spotted between our plots and the source plots are investigated, and corrected if a misrepresentation is confirmed. Some of the plots for identical temperatures and stress or strain ratios are given as follows. Figure 5 depicts the LCF data for a variety of alloys at room temperature and a strain ratio of $R = \varepsilon_{min}/\varepsilon_{max} = -1$ ($\varepsilon_{min}$ and $\varepsilon_{max}$ are the minimum and maximum applied strains), represented by the total, plastic, and elastic strain amplitudes versus the number of reversals to failure, $2N_f$. Likewise, selected records in the individual dataset of HCF are visualized in Fig. 6 as the stress amplitude, $\sigma_a$, as a function of the number of reversals to failure, $2N_f$, for alloys tested at stress ratios of R = 0.1 and R = −1 at room temperature. The fatigue-crack-growth rate, da/dN, versus the stress intensity factor range, ΔK, for selected FCGR records is shown in Fig. 7. Note that Figs. 5-7 represent only part of the data in the database; data under other conditions are not illustrated.
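Grouping records for such plots comes down to selecting those that share a ratio and a temperature, and expressing life in reversals (twice the number of cycles). The sketch below assumes records held as dictionaries; the keys, the "RT" and "HT" temperature labels, and the values are hypothetical placeholders.

```python
def select_for_plot(records, ratio, temperature, tol=1e-9):
    """Keep records tested at the given stress/strain ratio and temperature."""
    return [r for r in records
            if abs(r["R"] - ratio) <= tol and r["temperature"] == temperature]

def to_reversals(cycles_to_failure):
    """Number of reversals to failure, 2N_f, from cycles to failure, N_f."""
    return 2 * cycles_to_failure

# Placeholder records for illustration only.
records = [
    {"id": 1, "R": -1.0, "temperature": "RT", "cycles_to_failure": 5000},
    {"id": 2, "R": 0.1, "temperature": "RT", "cycles_to_failure": 200000},
    {"id": 3, "R": -1.0, "temperature": "HT", "cycles_to_failure": 1200},
]
selected = select_for_plot(records, ratio=-1.0, temperature="RT")
reversals = [to_reversals(r["cycles_to_failure"]) for r in selected]
```

The ratio is compared within a tolerance because it is stored as a floating-point number, whereas the temperature label is matched exactly.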

Usage Notes
The data contained in the database may be used individually or collectively for various purposes. Basic usage may involve comparing the fatigue properties of the HEAs in the database with those of other materials of interest or of HEAs tested subsequently. As the database continues to grow, the data may be used for AI or machine learning to, for example, facilitate the design of highly fatigue-resistant alloys (ref. 17). Many more uses remain to be explored by researchers.

Code availability
The code used for digitizing and extracting the data from the literature plots is the open-source WebPlotDigitizer version 4.5 (ref. 25), which is freely accessible.