Discovering plasticity models without stress data

We propose a new approach for data-driven automated discovery of material laws, which we call EUCLID (Efficient Unsupervised Constitutive Law Identification and Discovery), and we apply it here to the discovery of plasticity models, including arbitrarily shaped yield surfaces and isotropic and/or kinematic hardening laws. The approach is unsupervised, i.e., it requires no stress data but only full-field displacement and global force data; it delivers interpretable models, i.e., models that are embodied by parsimonious mathematical expressions discovered through sparse regression of a potentially large catalogue of candidate functions; it is one-shot, i.e., discovery only needs one experiment. The material model library is constructed by expanding the yield function with a Fourier series, whereas isotropic and kinematic hardening are introduced by assuming a yield function dependency on internal history variables that evolve with the plastic deformation. For selecting the most relevant Fourier modes and identifying the hardening behavior, EUCLID employs physics knowledge, i.e., the optimization problem that governs the discovery enforces the equilibrium constraints in the bulk and at the loaded boundary of the domain. Sparsity promoting regularization is deployed to generate a set of solutions out of which a solution with low cost and high parsimony is automatically selected. Through virtual experiments, we demonstrate the ability of EUCLID to accurately discover several plastic yield surfaces and hardening mechanisms of different complexity.


Introduction
Data-driven and machine-learning-based methods are currently pushing forward the frontiers of material modeling. What started with simple regression on uniaxial tensile data has rapidly expanded into high-dimensional and big-data-based surrogate modeling of basically all types of materials of technical interest, including metals, polymers, composites, and more. While conventional material modeling was based on the a priori assumption of a constitutive law of which the unknown parameters were identified through best-fitting with experimental (or, within multiscale settings, lower-scale computational) results, current data-driven and machine-learning methods give up the usage of an analytical constitutive law altogether. In doing so, they avoid the modeling errors arising due to, e.g., the largely experience-based modeling assumptions and the choice of experimental (or computational) tests being too restrictive to describe the true physics.
Despite the achieved progress, currently available methods are still problematic due to their data-hungry and black-box nature. The state-of-the-art techniques 1-9 that either bypass (directly use data as look-up tables in a model-free fashion) or surrogate (encode in, e.g., artificial neural networks (ANNs) or Gaussian processes (GPs)) material models are rooted in a supervised learning or curve-fitting setting. Hence, they need large amount of data consisting of input-output, i.e., strain-stress pairs. Since experimental stress data are only obtainable in the simplest situations, e.g., uniaxial tensile or bending tests, the comprehensive observation of strain-stress relations relying on these tests is nearly impossible. Additionally, stress tensors are challenging to measure experimentally, while force measurements only provide incomplete data in the form of boundaryaveraged projections of stress tensors. Multiscale simulations can generate training data sets with tensorial stress-strain pairs, but their computational cost is still too expensive to probe the entire high-dimensional stress-strain space. Recognition of this issue is very recent. Within the constitutive-model-free paradigm, it motivated the development of the data-driven identification method 10,11 , which formulates the inverse problem associated to the approach in 3 . Likewise recognizing the issue of limited stress data availability, a mathematical framework is proposed in 12 to calculate stress fields from deformation fields under the assumption of the alignment of the principal directions of stress and strain or strain rate. Within the stream of research on surrogating constitutive models with ANNs, recent attempts to use only displacement and global force data have been performed [13][14][15] , but are limited to very simple cases (constitutive models of known form with unknown parameters or unknown constitutive models but for one-dimensional cases). Lastly, the uninterpretability of stress-strain relations in both paradigms is a standing challenge: it implies significant difficulties in enforcing or verifying the satisfaction of physics constraints and it hinders the extrapolation power.
From the perspectives of both the labeled data requirement and lack of interpretability, the treatment of path-dependent material behavior, such as plasticity, is even more challenging as the stress state at a material point is not solely defined by its strain state, but is additionally dependent on the history of that material point, which is traditionally described using internal variables. Also in this context, the idea of constitutive-model-free approaches is to bypass the formulation of a path-dependent constitutive law, and hence any assumptions on the material behavior, by solving forward problems that are directly informed by the given data [16][17][18][19][20][21] . The other stream of approaches describe the path-dependent constitutive behavior based on ANNs 7, 22-27 , support vector machines 28 , symbolic regression 29 or use the information gained from the data to correct material models known from traditional theories 30 . Being supervised, all these methods require a tremendous amount of labeled data in form of stress-strain paths (with time adding a new dimensionality to the learning problem) for the training process. The additional challenge for path-dependent material behavior is that these paths are needed for several -theoretically infinite -loading histories and often fail to extrapolate beyond the time duration of training data paths. The internal variables of conventional constitutive modeling are physically immeasurable and prohibit any interpretability in the learnt models.
In this light, we propose EUCLID (Efficient Unsupervised Constitutive Law Identification and Discovery), an unsupervised discovery framework that bridges the advantages of data-driven and traditional modeling approaches. EUCLID does not require the a priori choice of a material model and thus is flexible to describe a variety of different material behaviors, and it only relies on unlabeled data, i.e., full-field displacements (obtained e.g. via Digital Image Correlation (DIC)) and global reaction forces but no stress data, generated by a single experiment. EUCLID was recently successfully demonstrated for hyperelastic material model discovery 31 and is here extended to the significantly more challenging problem of path-dependent elasto-plasticity. The idea is to formulate a large library of interpretable candidate material models (also referred to as features) and to automatically discover the most relevant features in the library based on the given unlabeled data using only physics constraints (as opposed to stress labels). The inspiration for this approach comes from the dynamics community, where sparse regression from a library of candidate features has been used to discover the nonlinear dynamics of physical systems 32 , albeit in a strictly supervised setting.
Our objective here is to discover fully general and evolving plastic yield surfaces, which characterize the material behavior of three-dimensional elasto-plastic solids, purely based on two-dimensional displacement field and reaction force measurements from only one experimental test on an arbitrarily shaped specimen. The material model library is constructed by expanding the yield function with a Fourier series containing a potentially large number of terms. Yield surface growth and translation, i.e., isotropic and kinematic hardening, are introduced by making the yield function dependent on the accumulated plastic multiplier and the back stress, which are internal history variables that evolve with the plastic deformation. For selecting the most relevant Fourier modes and identifying the hardening mechanisms, EUCLID employs physics knowledge, i.e., the optimization problem that governs the discovery is formulated based on the balance of linear momentum, compensating for the unavailability of stress data. The step-by-step schematic of EUCLID is illustrated in Figure 1 and described below.

Material model library
In this first work on path-dependent material behavior, we focus on homogeneous, isotropic materials for which linear elastic behavior is followed by associated, pressure-insensitive plastic behavior with isotropic and/or kinematic hardening. We also assume small strains and plane stress conditions. In the theory of elasto-plasticity, the infinitesimal strain tensor, which is obtained from the spatial gradient of the displacement field u u u, is additively split into an elastic contribution and a plastic contribution ε ε ε = ε ε ε e + ε ε ε p , with the plastic strain acting as internal (or history) variable. The elastic properties of the material are characterized by the stiffness tensor C, which determines the linear relation between the elastic strain and the Cauchy stress tensor σ σ σ = C : ε ε ε e , whereas the plastic properties are described through the yield function f (σ σ σ , γ, σ σ σ back ), which is here assumed to be dependent on the stress tensor, the accumulated plastic multiplier γ and the back stress tensor σ σ σ back . The zero level set f = 0 defines the yield surface, i.e., the material deforms elastically if f < 0, and plastic yielding occurs at f = 0. Further, the yield function governs the evolution of the plastic strain through the plastic evolution laẇ where the superposed dot denotes the derivative with respect to time. The plastic multiplier and the yield function need to fulfill the Kuhn-Tucker loading and unloading conditions as well as the consistency condition (see Supplementary Information (SI) Section 1). For details on the theory of elasto-plasticity, the reader is referred to 33,34 . The construction of a suitable material model library, i.e., a large catalogue of potential candidate models, builds the basis for the unsupervised discovery framework. As the elastic material properties can be identified independently from the plastic material properties in a preprocessing step (for example based on the full-field measurements of the first load steps, see 31, 35 ), the elastic stiffness tensor is assumed to be known here and the main focus lies on the discovery of the yield function (see Step-by-step schematic of EUCLID. In a single experiment with complex geometry (a), point-wise displacements (b) and global reaction forces (i) are measured. A quadrilateral finite element mesh is constructed (c) to interpolate the displacement data. The resulting displacement field (d) is differentiated to arrive at the strain field (e). The material model library (f) is constructed (here based on a Fourier ansatz). Based on this library and for given material parameters θ θ θ and H H H, the stresses can be calculated by applying a classical elastic predictor -plastic corrector return mapping algorithm at each load step in the data set, while the history variables are updated at each step (g). Based on the stresses, the internal and external virtual works and hence the internal (h) and external (i) force imbalances are calculated, contributing to the cost function C. Finally, the cost function is minimized jointly with a sparsity promoting regularization term (j) to generate a set of solutions out of which a solution with low cost and high parsimony is automatically selected. Details are provided in SI Fig. S4. Table 1). The material model library is constructed by choosing a Fourier series ansatz 1

3/28
where θ i are the unknown components of the material parameter vector θ θ θ and (n f + 1) is the number of features in the library. The Lode radius r and the Lode angle α are invariants of the relative stress tensor σ σ σ rel = σ σ σ − σ σ σ back , and are related to the relative principal stresses σ i , i.e., the eigenvalues of σ σ σ rel , through where atan2(·, ·) is the four-quadrant inverse tangent and the eigenvalues are taken in increasing order, i.e., σ 1 ≤ σ 2 ≤ σ 3 . Sine terms as well as certain cosine terms are excluded from the library (2) to fulfill isotropy requirements (see SI Section 1). Isotropic hardening is considered in (2) through the nonlinear isotropic hardening function H iso (γ), whereas kinematic hardening is incorporated by letting the back stress evolve nonlinearly with the plastic deformation. We here assume Voce isotropic hardening 38, 39 and Armstrong-Frederick kinematic hardening 40 by defining where is a vector containing the unknown hardening parameters that are here assumed to be non-negative.
By choosing different combinations of active features in the Fourier series, (2) can be used to describe smooth yield surfaces of arbitrary shape. The representation of the yield function as a closed-form mathematical expression (in contrast to black-box models or constitutive-model-free approaches) facilitates physical interpretation of the material model. E.g., it becomes straightforward to verify whether the yield surface is convex or whether the material behavior is tension-compression symmetric (see SI Section 1). The closed-form description also enables interpretable constraints on the parameters θ θ θ based on physical requirements. E.g., assuming a vanishing stress state and that no hardening has occurred (σ σ σ = 0 0 0, γ = 0, σ σ σ back = 0 0 0), the material is expected to behave elastically ( f < 0), The numerical implementation of the model library presented in this section, either for forward finite element simulations (used to generate the artificial data) or for the inverse discovery algorithm (EUCLID), requires the formulation of a stress update procedure. Given the strain ε ε ε t at the current time step t, the history variables h h h t−1 = {ε ε ε t−1 p , γ t−1 , (σ σ σ back ) t−1 } of the previous time step 2 and the material parameters θ θ θ and H H H, the current stress σ σ σ t (ε ε ε t , h h h t−1 , θ θ θ , H H H) is calculated via a classical elastic predictor-plastic corrector return mapping algorithm (see SI Section 1).

Optimization problem
To compensate for the unavailability of stress data, we employ physics knowledge to identify which features in the feature library should be active and to find the values of the corresponding active parameters within θ θ θ and H H H. Under quasi-static loading of a two-dimensional domain Ω with boundary ∂ Ω, the linear momentum balance (∇ · σ σ σ = 0 0 0) in its weak formulation is given by for all admissible test functions v v v, wheret t t t denotes the boundary tractions. The available data consists of displacement measurements {u u u a,t : a = 1, . . . , n n ;t = 1, . . . , n t } at n n points and n β net reaction force measurements {R β ,t : β = 1, . . . , n β ;t = 1, . . . , n t } on some boundary segments, both for n t time steps. Our objective is to determine θ θ θ and H H H such that (6) is satisfied by the data. We emphasize that, although a two-dimensional inverse problem is formulated here, the discovered plasticity models are valid for three-dimensional stress states. This means that we exploit a two-dimensional dataset to automatically find plasticity models applicable to three-dimensional solids.
We create a mesh connecting the points, each point being associated with finite element shape functions {N a (X X X) : a = 1, . . . , n n } such that the strain field is obtained as ε ε ε t (X X X) = ∑ a sym(∇N a (X X X) ⊗ u u u a,t ), and consequently the stress field σ σ σ t (X X X, ε ε ε t , h h h t−1 , θ θ θ , H H H) is determined for each time step via the elasto-plastic constitutive law described by θ θ θ and H H H. Using the same set of shape functions for the test functions, the nodal internal forces are computed as Equilibrium dictates that the internal forces corresponding to all free degrees of freedom (grouped in the set D free ) be zero at each time step, which naturally leads to the cost function At the same time, for each set of constrained degrees of freedom D disp,β , equilibrium requires that the sum of the internal forces be balanced at each time step by the corresponding reaction forceR β ,t , We combine the two costs in a single cost function with the balancing hyperparameter λ r > 0. For details, see SI Section 3.
The key difference between EUCLID and traditional (supervised or unsupervised) parameter identification methods is the fact that the form of the material model is not known a priori. In the context of yield surface discovery, the number and combination of active features in (2) is unknown. Directly minimizing the cost function in (10) would result in a dense solution vector θ θ θ , i.e., a highly complex material model with many non-zero material parameters. Such material models are not desired as their calibration and implementation are impractical and they bear a high risk of overfitting the data, possibly resulting in physically inadmissible material behavior. For these reasons, we promote sparsity in the solution vector by formulating an p -regularized minimization problem as For the hardening models we assume a mathematical form that is parsimonious at the outset but flexible enough to describe a large class of different hardening mechanisms. To achieve parsimonity in the yield surface, we apply p -regularization to the parameter vector θ θ θ . The regularization with hyperparameters λ p > 0 and p ∈ (0, 1] is a generalization of the LASSO (least absolute shrinkage and selection operator) 41 which is recovered for p = 1. Smaller values of p and higher values of λ p promote sparsity more aggressively, but on the other side increase the degree of non-convexity of the function to be minimized. Here, as in our previous work 31 , we use p = 1/4, while the choice of λ p is discussed in SI Section 4. Note that in order to fulfill (5), the parameter θ 0 corresponding to the constant Fourier mode is purposefully not considered in the p regularization. Further, we highlight that applying the p regularization shrinks the absolute values |θ i | for i ∈ {1, . . . , n f } and hence reinforces the fulfillment of the physical constraint in (5). A similar correlation between sparsity of the material model and fulfillment of physical requirements has been observed by 31, 32 , supporting the hypothesis that sparse models are more likely to fulfill physical requirements.

Benchmarks
We benchmark EUCLID on eight largely different elasto-plastic material models with different hardening parameters (see Table 1, Table 2 and Figure 3 for yield function expressions, hardening parameters and yield surface plots, respectively) 43, 44 : • VM: Von-Mises yield function 45 • F2: Yield function in (2)  The first four material models VM, F2, TR * , SI * exhibit tension-compression symmetry, while the others are tensioncompression asymmetric. Material models VM, F2, F1, NC are obtained by choosing different combinations of active and inactive parameters in (2). While yield surfaces in classical plasticity are convex and despite the issues connected with the thermodynamic consistency of non-convex yield surfaces, we deliberately choose model NC to evaluate the capabilities of our approach also in the rare case of non-convexity 51 . The original Tresca, Schmidt-Ishlinsky, Ivlev and Mariotte models are characterized by non-smooth yield functions which require complex stress update procedures 34 beyond the scope of this work. Therefore, and in order to verify the flexibility of the chosen yield function library, the models with non-smooth yield functions are approximated by the Fourier-type expansion in (2) and denoted by superscript (·) * . Each of the approximated models contains eleven active features with n f up to 20 in (2), see SI Table S1. The validity of such approximations is proved in SI Section 1.4. EUCLID takes full-field displacement and global reaction force measurements as input. DIC data are emulated by generating artificial data via the finite element method (FEM) based on the material models illustrated above. The chosen domain is a square plate with two elliptic holes (as schematically shown in Figure 2) in plane stress conditions and meshed with bilinear quadrilateral elements. The plate is deformed under displacement-controlled tension, followed by displacement-controlled compression. In the tension phase, the prescribed vertical displacement δ is linearly increased from 0 to δ = 0.5 mm, and subsequently decreased to δ = −0.5 mm in the compression phase. The nodal displacements are recorded from the FEM solution at a total of n t = 2250 load steps, 750 load steps in the tension phase and 1500 load steps in the compression phase. The total horizontal and vertical reaction forces on the top boundary are also recorded. Note that we purposefully choose a complex specimen geometry, in contrast to the simple geometry of traditional coupon tests, with the objective to obtain a strain field that is rich enough to solve the ill-posed problem of identifying the yield surface with no stress data and just one experiment.

Figure 3.
Yield surface plots of the (true) hidden and discovered plasticity models for different noise levels σ (in mm). Yield surfaces are shown for γ = 0.1, which is an upper limit of the accumulated plastic multiplier in the considered datasets. Coordinates π 1 , π 2 are in kN/mm 2 . For the tension-compression symmetric models, the symmetry axes are indicated by black dotted lines.
As real DIC data are unavoidably affected by noise in the measured displacement field, we add independent Gaussian noise with zero mean and standard deviation σ > 0 to the synthetic displacement data coming from the FEM simulations (see SI Section 2). We consider noise levels σ ∈ {0µm, 0.1µm, 0.3µm, 0.5µm}, where σ = 0.1 µm is considered a reasonable upper limit for modern DIC setups 52, 53 . Pushing EUCLID to its breaking point, we further test it for a noise level of σ = 0.5µm. The effect of the noise on the yield surface discovery can be reduced by temporal and spatial smoothing. Here, we restrict ourselves to temporal denoising by applying a Savitzky-Golay filter 54 based on quadratic polynomial fitting with a moving-window length of 50 time steps.
With EUCLID, discovery can proceed from a potentially very large model library -e.g. in our previous work 31 a library with 43 features was used for discovery of hyperelastic strain energy functions. However, it turns out that with the chosen Fourier ansatz a relatively small number of features is already sufficient to provide a remarkably flexible and general yield surface description. Thus, we consider here only seven features (n f = 6) in the model library, i.e., cosine terms with frequencies up to 18 2π . The closed-form expressions of the yield functions discovered by EUCLID from the data with different noise levels are reported in Table 1, in comparison to the true expressions. Table 2 shows the corresponding hardening parameters. A comparison of the yield surface plots in the π plane of the true models and the discovered models after hardening (γ = 0.1) is presented in Figure 3. As the π plane depends on the relative stresses and not on the absolute stresses, the yield surface plots in   Figure 4 for an exemplary material model at the end of three different deformation paths in the dataset. Similar plots for the other material models are shown in SI Fig. S6 and SI Fig. S7.
In the case of displacements without noise (σ = 0), material models VM, F2, F1, NC are discovered exactly, i.e., both the mathematical form of the yield function and the parameters are correctly identified. For increasing noise, the discovered parameters deviate from the true parameters as expected. In the non-convex case (NC), false-positive predictions (features that appear in the discovered formula but are not present in the true model) are observed. Material models TR * , SI * , IV * , MA * on the other hand cannot be described exactly by the chosen model library which makes the exact discovery of the yield function form impossible. However, for the tension-compression symmetric models (TR * , SI * ) it is observed in many cases that tension-compression symmetry-breaking features (e.g., cos(3α), cos(9α) and cos(15α)) are automatically discarded by EUCLID.
Whenever EUCLID fails to discover the correct yield function form, parameters (θ i ) corresponding to false-positive features are observed to be considerably smaller than the other parameters and hence have a small influence on the material behavior. Further, false-negative feature predictions (features that are not discovered although active in the true model) do not seem to have a high impact on the material behavior either, which is corroborated by the high accuracy observed in the yield surface plots for all models and noise levels except higher noise TR * and NC (see Figure 3 and Figure 4). Hence, for the models and noise levels for which EUCLID does not find the correct closed-form expression of the yield function, accurate surrogate models are discovered that mimic the behavior of the true yield function. The Tresca yield surface is the only benchmark for which non-optimal fitting results are observed in shear stress regions even in the case without noise (see Figure 3). The reasons for this are discussed in detail in SI Section 1.4, where we also show that enriching the data with shear deformation drastically improves the results.

10/28
Discussion We show that EUCLID is able to discover interpretable plasticity models from displacement and net reaction force data only and without using any stress data. The method hence provides a physics-constrained, data-efficient alternative to supervised data-driven and machine learning methods, which require an enormous amount of labelled data and thus are most often inapplicable. The sparse regression enables parsimonious model selection in contrast to an a priori choice of the plasticity model such as in the traditional material model calibration techniques. Hence, after having demonstrated EUCLID for hyperelasticity 31 and for plasticity, we aim at pursuing its extension to the discovery of more general cases of plasticity with pressure sensitivity and anisotropy in future work. Further extensions of interest may include other categories of material behavior such as visco-elasticity, visco-plasticity, damage and general combinations thereof. Another important future goal will be the employment of EUCLID on experimental data in the two-and three-dimensional settings using digital image and volume correlation data, respectively.

Methods
Detailed method descriptions are provided in the Supplementary Information (SI). SI Section 1 discusses the numerical implementation of the material model library. SI Section 2 provides details on data generation. The formulation of the objective function and the optimization procedure are discussed in SI Section 3 and 4, respectively.

Code availability
The codes generated during the current study are available at https://euclid-code.github.io/.

Data availability
The data generated during the current study are available at https://euclid-code.github.io/.

13/28
Supplementary information 1 Material model library and numerical implementation aspects

Candidate material models and physical requirements
For isotropic material behavior, the yield function can be expressed as a function of the three relative principal stresses 3 σ i of the relative Cauchy stress tensor σ σ σ rel = σ σ σ − σ σ σ back and the accumulated plastic multiplier γ For pressure-insensitive materials, the yield function does not change along the hydrostatic direction σ 1 = σ 2 = σ 3 . It is therefore convenient to apply a coordinate transformation of the relative principal stress space such that π 1 , π 2 , π 3 are the transformed stress invariants with π 3 aligned with the hydrostatic axis. The yield function can then be conveniently written as a function of two instead of three relative stress invariants and the accumulated plastic multiplier where the remaining invariants π 1 and π 2 span the so-called π plane. The yield function can be further rewritten as a function of the polar coordinates r and α in the π plane, which are commonly referred to as Lode radius and Lode angle, respectively. Here, atan2(·, ·) denotes the four-quadrant inverse tangent function (also known as two-argument arcus tangent) to correctly identify the polar angle α in the four quadrants of the π plane. A generalized plastic yield function is chosen by parameterizing (4) in terms of a Fourier series where a i and b i are real-valued parameters and H iso (γ) is a real-valued function (with the property H iso (γ = 0) = 1) describing the isotropic hardening characteristics. This parameterization enables the mathematical description of arbitrarily shaped smooth yield surfaces. However, material isotropy (which we assume here) requires that the yield function is invariant to the order of the relative principal stresses, i.e., To fulfill this requirement, parameters a i in (5) are allowed to be non-zero only for i ∈ {0, 3, 6, ...}, while parameters b i must be zero for all i. Defining a new set of real-valued parameters θ i , we hence arrive at where we truncate the Fourier series after (n f + 1) terms for the numerical treatment. It can further be observed that for tension-compression symmetric material behavior -which must fulfill f (r, α = 0, γ) = f (r, α = π, γ) -only Fourier modes with even-numbered i are allowed. At γ = 0, the distancer of the yield surface (which is defined through the level set f = 0) from the origin of the π plane (r = 0) is expressed in terms of the Lode angle α as The nonlinear system of equations above can be linearized and solved iteratively for σ σ σ t , ∆γ t and (∆σ σ σ back ) t , followed by the computation of the plastic strain ε ε ε t p = ε ε ε t − C −1 : σ σ σ t . In addition to calculating σ σ σ t , the implementation of the presented material models in forward finite element simulations (used to generate the data) requires calculating the elasto-plastic consistent tangent modulus For ∆γ t = 0, it is C t ep = C. Otherwise, the elasto-plastic consistent tangent modulus for ∆γ t > 0 is calculated as (superscripts t are omitted) with The tangent modulus is not needed for the inverse problem (EUCLID).

Derivatives of the yield function
As follows, we compute the derivatives of the yield function ( f ) with respect to the stress tensor (σ σ σ ), the accumulated plastic multiplier (γ) and the back stress (σ σ σ back ), which are required for the return mapping algorithm and the consistent tangent modulus.
The yield function is written as with where σ σ σ dev = σ σ σ rel − 1/3 tr(σ σ σ rel )I I I is the deviatoric part of the relative stress tensor and I I I is the identity tensor.

Derivatives with respect to the stress tensor
Noting that

16/28
we derive in the following the derivative of the yield function with respect to the relative stress. Employing chain and product rules of differentiation and using the Einstein summation convention, we obtain the first and second derivatives as Note that σ h denotes the h th relative principal stress (with σ 1 ≤ σ 2 ≤ σ 3 ), while σ rel kl denotes the (k, l) entry of the relative stress tensor σ σ σ rel . The respective partial derivatives in (23) are provided in the following. The derivatives of f r with respect to σ σ σ rel are given by where I dev is a fourth order tensor defined such that σ σ σ dev = I dev : σ σ σ rel . The derivatives of f α with respect to α are given by , With the derivatives of α with respect to the relative principal stresses are The derivatives of the relative principal stresses with respect to the relative stress are computed as where v v v (i) denotes the principal direction corresponding to the relative principal stress σ i and no summation is performed over i. If the relative principal stresses are distinct, For repeated relative principal stresses, differentiating the principal directions with respect to the relative stress components is not trivial 55, 56 . For the sake of simplicity, we approximate summands corresponding to repeated relative principal stresses to zero in the equation above, i.e., This is a reasonable assumption considering that the chance of two exactly equal relative principal stresses is negligible in the numerical simulations. 17/28

Derivatives with respect to the accumulated plastic multiplier
We obtain and the mixed derivative where the derivative of H iso (γ) is trivial.

Derivatives with respect to the back stress
Employing chain rule and the definition of the back stress, we obtain and the mixed derivative (33)

Validation of the Fourier approximation of the Tresca yield surface
The original Tresca, Schmidt-Ishlinsky, Ivlev and Mariotte models are characterized by non-smooth yield functions, which require complex stress update procedures 34 . To avoid the need for such procedures and in order to verify the flexibility of the chosen yield function library, in this work the models with non-smooth yield functions are approximated by the Fourier-type expansion in (7) and denoted by superscript (·) * , see Table S1. Each of the approximated models contains eleven active features with n f up to 20 in (7). In this section, we prove the validity of such approximations exemplarily on the Tresca model. As the Fourier-type approximation only introduces errors on the shape of the yield surface but not on its hardening behavior, we show the validation for the perfectly plastic Tresca model (H H H = 0 0 0). First, we show that the Tresca model TR and its Fourier-type approximation TR * yield the same predictions in forward finite element computations. To this end, we simulate the deformation of the specimen described in the main article in a tension phase (up to δ = 0.1) and consecutive compression phase (up to δ = −0.1) based on the models TR and TR * , and we compute the net reaction forces at the upper boundary of the specimen. As shown in Figure S1, a perfect match between the two is obtained. Second, we apply the discovery algorithm, which is based on the smooth Fourier library, to the data generated from the models TR and TR * and show that the discovered models are in good agreement with the true models, see Figure S2.
As can be seen in the main article results as well as in Figure S2, non-optimal fitting accuracy is obtained for the Tresca yield surface in shear stress regions even in the case without noise. The reason for this is that the considered (artificial) experimental loading setup generates comparatively low shear deformations in the specimen. As the Tresca material model has a low shear stress resistance, the lack of significant shear deformation data is more detrimental for the Tresca model compared to the other plasticity models. We show in Figure S3 that enriching the dataset with shear deformation drastically improves the results for the Tresca model TR * with mixed isotropic and kinematic hardening. To this end, a horizontal displacement of the upper specimen boundary was assumed during the data generation in addition to the vertical displacements.

Data generation
Two-dimensional full-field displacement measurements, e.g., obtained through Digital Image Correlation (DIC), are assumed to be known. In this work, artificial data is generated through simulations with the finite element method (FEM), see Table S2. To emulate real experiments that include measurement noise, we add independent normally distributed noise with zero mean and standard deviation σ > 0 to the simulated displacements, where u fem,a,t i is the i th component of the displacement at node a at the t th time step obtained from FEM, and u noise,a,t i denotes the noise added to it. After applying artificial noise to the data, temporal Savitzky-Golay denoising 54 with quadratic polynomial order and a moving-window length of 50 time steps is applied.

18/28
To facilitate derivatives and integrals of the field quantities, we associate the points at which the displacements are known to a finite element mesh. For the purpose of this work, we use 4-node bilinear quadrilateral elements with four quadrature points each. The displacement field is then approximated as where N a : Ω → R is the shape function associated with the a th node. Under the small strain assumption, the infinitesimal strain tensor is calculated from the displacement field by where ∇ denotes the gradient operator with respect to the reference coordinates and ∇ sym = 1 2 (∇ + ∇ T ) denotes the symmetric gradient operator.
Let D = {(a, i) : a = 1, . . . , n n ; i = 1, 2} denote the set of all nodal degrees of freedom where n n is the number of nodes at which displacements are known. D is further split into two non-intersecting subsets of free and displacement-constrained degrees of freedom: D free and D disp , respectively, such that D free ∪ D disp = D and D free ∩ D disp = / 0. On the basis of the interpolated strain field and the material model library (Section 1), the objective is to find suitable material parameters θ θ θ and H H H such that linear momentum balance is fulfilled both in the bulk material and at the boundary. Note that the angular momentum balance is automatically fulfilled in the described setting. Assuming negligible body forces, the weak formulation of linear momentum balance in the reference domain Ω under quasi-static conditions reads wheret t t is the boundary traction. The test functions v v v are admissible if they are sufficiently regular and vanish at the fixed degrees of freedom. Note that we prefer the weak to the strong formulation of linear momentum balance because the double spatial derivatives required by the latter make it more sensitive to noise. Using for v v v the same approximation basis as for u u u (Bubnov-Galerkin method), the test functions are approximated as and are assumed constant over time. Therefore, (37) reduces to where F F F a,t is interpreted as the nodal internal force on the a th node at time step t. At each interior node as well as at each node located on the unrestrained portion of the boundary, i.e., for each degree of freedom in D free , the second integral vanishes and (37) yields This can also be interpreted as satisfying the weak form of linear momentum balance for the following virtual fields v v v: where {ê e e 1 ,ê e e 2 } are the cartesian coordinate axes. In general, the reaction force is not known at every displacement-constrained degree of freedom in D disp . Instead, only the sums of the reaction forces for subsets of the displaced degrees of freedom (each subset corresponding to one displaced side of the specimen and one coordinate direction) are known, emulating force measurements in an experiment. Let n β be the number of these subsets. For β = 1, . . . , n β , let D disp,β ⊆ D disp with D disp,β ∩ D disp,β = / 0 for β = β be the set of degrees of freedom

19/28
under displacement control for which the sum of the reaction forcesR β ,t at time step t is known. Each reaction forceR β ,t must be balanced by the sum of internal nodal forces on those degrees of freedom, i.e., Alternatively, this can be interpreted as satisfying the weak form of linear momentum balance for the following virtual fields v v v: The squared residuals of the equilibrium conditions (40) and (42) can now be evaluated and added over all the time steps and corresponding degrees of freedom to obtain the cost functions described in Eq. (8) and Eq. (9) of the main article. As both C free and C disp need to be minimized to find the unknown material parameters, a multi-objective cost function (reproduced from Eq. (10) of the main article) is adopted with λ r > 0 as a hyperparameter weighting the contribution of the reaction force balance term. This weighting coefficient is important to ensure that both boundary and interior force balance terms are contributing to a similar extent to the value of the objective function. We choose λ r = 100 heuristically and keep it constant throughout the numerical experiments.

Optimization of the p -regularized problem
As discussed in the main article, p regularization is applied to promote sparsity in the solution vector θ θ θ Evaluating the objective function for a given set of material parameters requires calculating σ σ σ t (ε ε ε t , h h h t−1 , θ θ θ , H H H) for each time step. To this end, we assume h h h 0 = {0 0 0, 0, 0 0 0} and calculate the stress update for each time step as described in 1.2. After each stress update, the history variables have to be stored as they are needed for the next step. The implicit dependence of the cost function C on the material parameters and on the internal variables calls for an optimization method that requires a small amount of cost function and gradient evaluations. We choose a trust-region reflective Newton solver 42 with gradients approximated through finite differences 4 . As an alternative choice, the Nelder-Mead simplex method 57 may be applied to solve the problem. This method has shown promising results in minimizing the objective at hand, however, due to a better performance in terms of computational efficiency the trust-region reflective Newton solver was preferred. As a preconditioning step, we minimize C without regularization with respect to the parameters θ 0 , H iso 1 and H kin 1 while constraining the other parameters to zero. This step can be interpreted as assuming that the yield surface is described by a perfect circle in the π plane (von Mises model) with linear isotropic and kinematic hardening. While being computationally inexpensive, this optimization allows to approximately estimate the size of the hidden yield surface.
Afterwards, we minimize C without regularization with respect to the parameters θ θ θ and H H H. As the trust-region reflective algorithm is not designed to find a global minimum to an optimization problem, we introduce n g = 100 random initial guesses. To this end, we use θ 0 from the preconditioning step, set all other parameters to zero and randomly perturb the parameters, where the perturbation follows a Gaussian distribution 5 . The best solution of the unregularized minimization problem, i.e., the solution with the lowest objective function, serves then as the initial guess for the regularized minimization problem.
The solution to the regularized problem (45) is dependent on the choice of λ p , which controls the degree of sparsity of the solution vector. We propose an algorithmic procedure for automatically selecting λ p . To this end, we minimize (45) in parallel for a set of different choices of λ p , specifically, λ p ∈ {2 i : i = −5, . . . , 15}.
We hence obtain one solution for each choice of λ p , which we store in the set of possible solutions 4 The Matlab ® built-in optimizer lsqnonlin is applied. 5 Assuming the order of magnitude of the target parameters to be approximately known, the standard deviations of the Gaussian perturbations are set to 0.1/2 i for θ i , 1 for H iso 2 , 100 for H iso 1 and H kin 1 , 1000 for H iso 3 and H kin 2 . In our experience, the choice of standard deviations for the random perturbations is not crucial in finding accurate solutions to the inverse problem.

20/28
where n λ is the number of different choices of λ p . The solution corresponding to the lowest cost in the solution set S is expected to feature a dense solution vector and to provide the highest accuracy. To find a parsimonious material model which still provides a reasonable fitting accuracy, we first discard all solutions whose cost is higher than a threshold value C th From the remaining solutions S th , which are expected to provide an acceptable accuracy if C th is sufficiently low, the sparsest solution is selected by choosing the solution with the smallest θ θ θ opt j p p . As the p regularization shrinks certain parameters but does not exactly set them to zero, a final thresholding is applied, where all parameters in the selected solution vector with absolute value below a threshold θ th are set to zero. Note that the parameters C th and θ th are the only hyperparameters in the proposed selection rule, resulting in low effort for parameter tuning. We here choose C th to be slightly greater than the lowest cost C min in the solution set S , i.e., C th = 1.01 C min , and θ th to be much smaller than the first component of the selected solution vector, i.e., θ th = 0.005 θ 0 . Figure S4 schematically summarizes the optimization process and Table S2 lists the corresponding parameters.
Each optimization process requires to repeatedly evaluate the cost function, which in turn requires to repeatedly apply the stress-update procedure. Depending on the load step size and the parameters for which the cost function is evaluated, the stress update procedure may fail to converge. In such scenario the corresponding cost function is set to infinity. This can hinder the trust-region reflective algorithm to converge to a local minimum, in which case the corresponding cost is set to infinity. This issue is observed to be particularly dominant if the parameters for which the cost function is evaluated get closer to violating the physical constraint in Eq. (5) of the main article (see Figure S5). In our numerical examples this phenomenon only occurred during the optimization of the unregularized problem for some of the random initial guesses. The convergence issues were not present in the regularized problem, for which we used the converged solution from the unregularized problem as initial guess. Note that when running a forward problem, the convergence issues can be fixed by decreasing the load step size. In the inverse problem, however, the load step size is predefined by the given data set. Synthetically generating data points by linear interpolation between subsequent points in time may be a solution to overcome this problem, but was not tried in the present work. Figure S1. Simulated net reaction forces considering the Tresca material model (TR) and its approximation in terms of the Fourier series (TR * ).R 1 andR 2 denote the horizontal and vertical reaction force (in kN) at the upper specimen boundary, respectively.  Figure S2. Yield surface plot of the Tresca model (TR), its Fourier-type approximation (TR * ) and the discovered material models for different noise levels. Coordinates π 1 , π 2 are in kN/mm 2 .  Table S1. Non-smooth yield functions and their fully expanded smooth approximations. For better clarity we show the initial yield functions, i.e., γ = 0. Material parameters are in kN/mm 2 .