Machine learning the dimension of a Fano variety

Fano varieties are basic building blocks in geometry – they are ‘atomic pieces’ of mathematical shapes. Recent progress in the classification of Fano varieties involves analysing an invariant called the quantum period. This is a sequence of integers which gives a numerical fingerprint for a Fano variety. It is conjectured that a Fano variety is uniquely determined by its quantum period. If this is true, one should be able to recover geometric properties of a Fano variety directly from its quantum period. We apply machine learning to the question: does the quantum period of X know the dimension of X? Note that there is as yet no theoretical understanding of this. We show that a simple feed-forward neural network can determine the dimension of X with 98% accuracy. Building on this, we establish rigorous asymptotics for the quantum periods of a class of Fano varieties. These asymptotics determine the dimension of X from its quantum period. Our results demonstrate that machine learning can pick out structure from complex mathematical data in situations where we lack theoretical understanding. They also give positive evidence for the conjecture that the quantum period of a Fano variety determines that variety.


Introduction
Algebraic geometry describes shapes as the solution sets of systems of polynomial equations, and manipulates or analyses a shape X by manipulating or analysing the equations that define X. This interplay between algebra and geometry has applications across mathematics and science; see e.g. [3,22,53,57]. Shapes defined by polynomial equations are called algebraic varieties. Fano varieties are a key class of algebraic varieties. They are, in a precise sense, atomic pieces of mathematical shapes [45,46]. Fano varieties also play an essential role in string theory. They provide, through their 'anticanonical sections', the main construction of the Calabi-Yau manifolds which give geometric models of spacetime [6,30,55].
The classification of Fano varieties is a long-standing open problem. The only one-dimensional example is a line; this is classical. The ten smooth two-dimensional Fano varieties were found by del Pezzo in the 1880s [19]. The classification of smooth Fano varieties in dimension three was a triumph of 20th century mathematics: it combines work by Fano in the 1930s, Iskovskikh in the 1970s, and Mori-Mukai in the 1980s [24,38-40,51,52]. Beyond this, little is known, particularly for the important case of Fano varieties that are not smooth.
A new approach to Fano classification centres around a set of ideas from string theory called Mirror Symmetry [7,15,31,35]. From this perspective, the key invariant of a Fano variety X is its regularized quantum period [8]. This is a power series

  Ĝ_X(t) = c_0 + c_1 t + c_2 t^2 + · · ·    (1)

with coefficients c_0 = 1, c_1 = 0, and c_d = r_d d!, where r_d is a certain Gromov-Witten invariant of X. Intuitively speaking, r_d is the number of rational curves in X of degree d that pass through a fixed generic point and have a certain constraint on their complex structure. In general r_d can be a rational number, because curves with a symmetry group of order k are counted with weight 1/k, but in all known cases the coefficients c_d in (1) are integers.
It is expected that the regularized quantum period Ĝ_X uniquely determines X. This is true (and proven) for smooth Fano varieties in low dimensions, but is unknown in dimensions four and higher, and for Fano varieties that are not smooth. In this paper we will treat the regularized quantum period as a numerical signature for the Fano variety X, given by the sequence of integers (c_0, c_1, ...). A priori this looks like an infinite amount of data, but in fact there is a differential operator L such that L Ĝ_X ≡ 0; see e.g. [8, Theorem 4.3]. This gives a recurrence relation that determines all of the coefficients c_d from the first few terms, so the regularized quantum period Ĝ_X contains only a finite amount of information. Encoding a Fano variety X by a vector in Z^(m+1) given by finitely many coefficients (c_0, c_1, ..., c_m) of the regularized quantum period allows us to investigate questions about Fano varieties using machine learning.
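To make the encoding concrete, here is a small sketch computing such a coefficient vector for ordinary projective space. It uses the closed formula for the regularized quantum period of P^(N-1), namely c_d = (Nk)!/(k!)^N when d = Nk and c_d = 0 otherwise; this is the special case a_1 = · · · = a_N = 1 of the weighted projective space formula discussed later in the text. The function name is ours.

```python
from math import factorial

def regularized_period_Pn(N, num_terms):
    """First num_terms coefficients (c_0, c_1, ...) of the regularized
    quantum period of projective space P^(N-1).

    Uses the closed formula c_d = (N*k)!/(k!)^N when d = N*k, and
    c_d = 0 otherwise."""
    coeffs = []
    for d in range(num_terms):
        if d % N == 0:
            k = d // N
            # Multinomial coefficient: always an exact integer division.
            coeffs.append(factorial(N * k) // factorial(k) ** N)
        else:
            coeffs.append(0)
    return coeffs
```

For example, for P^2 (so N = 3) this produces the sequence 1, 0, 0, 6, 0, 0, 90, ..., illustrating both the integrality of the c_d and the divisibility pattern of the non-zero terms.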
In this paper we ask whether the regularized quantum period of a Fano variety X knows the dimension of X. There is currently no viable theoretical approach to this question. Instead we use machine learning methods applied to a large dataset to argue that the answer is probably yes, and then prove that the answer is yes for toric Fano varieties of low Picard rank. The use of machine learning was essential to the formulation of our rigorous results (Theorems 5 and 6 below). This work is therefore proof-of-concept for a larger program, demonstrating that machine learning can uncover previously unknown structure in complex mathematical datasets. Thus the Data Revolution, which has had such impact across the rest of science, also brings important new insights to pure mathematics [18,21,34,49,58,59]. This is particularly true for large-scale classification questions, e.g. [1,10,14,17,47], where these methods can potentially reveal both the classification itself and structural relationships within it.

Results
Depending on their equations, algebraic varieties can be smooth (as in Figure 1(a)) or have singularities (as in Figure 1(b)). In this paper we consider algebraic varieties over the complex numbers. The equations in Figures 1(a) and 1(b) therefore define complex surfaces; however, for ease of visualisation, we have plotted only the points on these surfaces with co-ordinates that are real numbers.
Most of the algebraic varieties that we consider below will be singular, but they all have a class of singularities called terminal quotient singularities. This is the most natural class of singularities to allow from the point of view of Fano classification [46]. Terminal quotient singularities are very mild; indeed, in dimensions one and two, an algebraic variety has terminal quotient singularities if and only if it is smooth.
The Fano varieties that we consider. The fundamental example of a Fano variety is projective space P^(N-1). This is a quotient of C^N \ {0} by the group C^×, where the action of λ ∈ C^× identifies the points (x_1, x_2, ..., x_N) and (λx_1, λx_2, ..., λx_N). The resulting algebraic variety is smooth and has dimension N − 1. We will consider generalisations of projective spaces called weighted projective spaces and toric varieties of Picard rank two. A detailed introduction to these spaces is given in §A.
To define a toric variety of Picard rank two, choose a matrix

  ( a_1 a_2 · · · a_N )
  ( b_1 b_2 · · · b_N )    (2)

with non-negative integer entries and no zero columns. This defines an action of C^× × C^× on C^N, where (λ, μ) ∈ C^× × C^× identifies the points (x_1, x_2, ..., x_N) and (λ^(a_1)μ^(b_1)x_1, λ^(a_2)μ^(b_2)x_2, ..., λ^(a_N)μ^(b_N)x_N). Set α = a_1 + · · · + a_N and β = b_1 + · · · + b_N, and suppose that (α, β) is not a scalar multiple of (a_i, b_i) for any i. This determines linear subspaces S_+ and S_− of C^N, and we consider the quotient

  X = (C^N \ S) / (C^× × C^×),    (3)

where S = S_+ ∪ S_−. The quotient X is an algebraic variety of dimension N − 2 and second Betti number b_2(X) ≤ 2. If, as we assume henceforth, the subspaces S_+ and S_− both have dimension two or more then b_2(X) = 2, and thus X has Picard rank two. In general X will have singular points, the precise form of which is determined by the weights in (2). There are closed formulas for the regularized quantum period of weighted projective spaces and toric varieties [9]. We have

  Ĝ_P(t) = Σ_{k ≥ 0} (ak)! / ((a_1 k)! (a_2 k)! · · · (a_N k)!) t^(ak),    (4)

where P = P(a_1, ..., a_N) and a = a_1 + a_2 + · · · + a_N, and

  Ĝ_X(t) = Σ_{(k,l) ∈ C ∩ Z^2} (αk + βl)! / ((a_1 k + b_1 l)! · · · (a_N k + b_N l)!) t^(αk + βl),    (5)

where the weights for X are as in (2), and C is the cone in R^2 defined by the inequalities a_i x + b_i y ≥ 0, i ∈ {1, 2, ..., N}. Formula (4) implies that, for weighted projective spaces, the coefficient c_d from (1) is zero unless d is divisible by a. Formula (5) implies that, for toric varieties of Picard rank two, c_d = 0 unless d is divisible by gcd{α, β}. Weighted projective spaces with terminal quotient singularities have been classified in dimensions up to four [41,43]. Classifications in higher dimensions are hindered by the lack of an effective upper bound on the weights.
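The following sketch evaluates the Picard-rank-two period, assuming formula (5) takes the shape just described: a sum of multinomial coefficients over lattice points (k, l) of the cone C, graded by degree d = αk + βl. The function name and the crude lattice-point search bound are ours.

```python
from math import factorial

def period_rank_two(a, b, num_terms):
    """Coefficients c_0, ..., c_{num_terms-1} of the regularized quantum
    period of the toric variety with weight matrix rows a and b,
    following the shape of formula (5): a sum over lattice points
    (k, l) with a_i*k + b_i*l >= 0 for all i, graded by
    d = (sum a)*k + (sum b)*l."""
    alpha, beta = sum(a), sum(b)
    d_max = num_terms - 1
    # Crude but safe search range: by Cramer's rule applied to any two
    # linearly independent columns (integer determinant >= 1), a cone
    # point of degree <= d_max has |k|, |l| <= 2*d_max*(max(a)+max(b)).
    R = 2 * d_max * (max(a) + max(b))
    coeffs = [0] * num_terms
    for k in range(-R, R + 1):
        for l in range(-R, R + 1):
            exps = [ai * k + bi * l for ai, bi in zip(a, b)]
            if any(e < 0 for e in exps):
                continue  # (k, l) lies outside the cone C
            d = alpha * k + beta * l
            if 0 <= d <= d_max:
                term = factorial(d)
                for e in exps:
                    term //= factorial(e)  # exact: multinomial coefficient
                coeffs[d] += term
    return coeffs
```

As a sanity check, for P^1 × P^1, with weight matrix rows (1, 1, 0, 0) and (0, 0, 1, 1), this reproduces the familiar sequence 1, 0, 4, 0, 36, 0, 400, ..., whose non-zero terms sit in degrees divisible by gcd{α, β} = 2.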
Data generation: toric varieties. Deduplicating randomly-generated toric varieties of Picard rank two is harder than deduplicating randomly-generated weighted projective spaces, because different weight matrices in (2) can give rise to the same toric variety. Toric varieties are uniquely determined, up to isomorphism, by a combinatorial object called a fan [25]. A fan is a collection of cones, and one can determine the singularities of a toric variety X from the geometry of the cones in the corresponding fan.
We randomly generated 200 000 distinct toric varieties of Picard rank two with terminal quotient singularities, and with dimension up to 10, as follows. We randomly generated weight matrices, as in (2), such that 0 ≤ a_i, b_i ≤ 5. We then discarded the weight matrix if any column was zero, and otherwise formed the corresponding fan Σ. We discarded the weight matrix unless: (i) Σ had N rays; (ii) each cone in Σ was simplicial (i.e. has number of rays equal to its dimension); (iii) the convex hull of the primitive generators of the rays of Σ contained no lattice points other than the ray generators and the origin.
Conditions (i) and (ii) together guarantee that X has Picard rank two, and are equivalent to the conditions on the weight matrix in (2) given in our definition. Conditions (ii) and (iii) guarantee that X has terminal quotient singularities. We then deduplicated the weight matrices according to the isomorphism type of Σ, by putting Σ in normal form [32,48]. See Table 1 for a summary of the dataset.

Data analysis: weighted projective spaces. We computed an initial segment (c_0, c_1, ..., c_m) of the regularized quantum period for all the examples in the sample of 150 000 terminal weighted projective spaces, with m ≈ 100 000. The non-zero coefficients c_d appeared to grow exponentially with d, and so we considered {log c_d}_{d ∈ S} where S = {d ∈ Z_{≥0} | c_d ≠ 0}. To reduce dimension we fitted a linear model to the set {(d, log c_d) | d ∈ S} and used the slope and intercept of this model as features; see Figure 2(a) for a typical example. Plotting the slope against the y-intercept and colouring datapoints according to the dimension we obtain Figure 3(a): note the clear separation by dimension. A Support Vector Machine (SVM) trained on 10% of the slope and y-intercept data predicted the dimension of the weighted projective space with an accuracy of 99.99%. Full details are given in §§B-C.
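The two-feature extraction just described can be sketched as a short least-squares fit of log c_d against d over the support of the sequence; the function name is ours.

```python
from math import log

def slope_intercept_features(coeffs):
    """Least-squares fit of log c_d against d over the support
    S = {d : c_d != 0}, returning (slope, intercept).

    These two numbers are the features used in the dimension
    classification described in the text."""
    pts = [(d, log(c)) for d, c in enumerate(coeffs) if c != 0]
    n = len(pts)
    mean_d = sum(d for d, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pts)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_d
```

Zero coefficients are simply skipped, matching the restriction to the support S; exactly exponential input (c_d = 10^d, say) recovers slope log 10 and intercept 0.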
Data analysis: toric varieties. As before, the non-zero coefficients c_d appeared to grow exponentially with d, so we fitted a linear model to the set {(d, log c_d) | d ∈ S} where S = {d ∈ Z_{≥0} | c_d ≠ 0}. We used the slope and intercept of this linear model as features.
Example 3. In Figure 2(b) we plot a typical example: the logarithm of the regularized quantum period sequence for the nine-dimensional toric variety with weight matrix

  ( 1 2 5 3 3 3 0 0 0 0 0 )
  ( 0 0 0 3 4 4 1 2 2 3 4 )

along with the linear approximation. We see a periodic deviation from the linear approximation; the magnitude of this deviation decreases as d increases (not shown).
To reduce computational costs, we computed pairs (d, log c_d) for 1000 ≤ d ≤ 20 000 by sampling every 100th term. We discarded the beginning of the period sequence because of the noise it introduces to the linear regression. In cases where the sampled coefficient c_d is zero, we considered instead the next non-zero coefficient. The resulting plot of slope against y-intercept, with datapoints coloured according to dimension, is shown in Figure 3(b).
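The sampling scheme above (every 100th term, skipping forward past zero coefficients) can be sketched as follows; the function name and default arguments are ours.

```python
from math import log

def sample_log_points(coeffs, start=1000, stop=20000, step=100):
    """Sample pairs (d, log c_d) every `step` terms from `start` to
    `stop`, replacing a zero sampled coefficient by the next non-zero
    coefficient, as in the scheme described in the text."""
    pts = []
    for d0 in range(start, min(stop + 1, len(coeffs)), step):
        d = d0
        # If the sampled coefficient vanishes, advance to the next
        # non-zero coefficient (if one remains in range).
        while d < len(coeffs) and coeffs[d] == 0:
            d += 1
        if d < len(coeffs):
            pts.append((d, log(coeffs[d])))
    return pts
```

The returned pairs then feed directly into the linear regression used for the slope and y-intercept features.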
We analysed the standard errors for the slope and y-intercept of the linear model. The standard errors for the slope are small compared to the range of slopes, but in many cases the standard error σ_int for the y-intercept is relatively large. As Figure 4 illustrates, discarding data points where σ_int exceeds some threshold reduces apparent noise. This suggests that the underlying structure is being obscured by inaccuracies in the linear regression caused by oscillatory behaviour in the initial terms of the quantum period sequence; these inaccuracies are concentrated in the y-intercept of the linear model. Note that restricting attention to those data points where σ_int is small also greatly decreases the range of y-intercepts that occur. As Example 4 and Figure 5 suggest, this reflects both transient oscillatory behaviour and also the presence of a subleading term in the asymptotics of log c_d which is missing from our feature set. We discuss this further below.

Example 4. We computed the first 40 000 coefficients c_d in (1). As Figure 5 shows, as the number of coefficients increases the y-intercept of the linear model increases to −28.96 and σ_int decreases to 0.7877. At the same time, the slope of the linear model remains more or less unchanged, decreasing to 1.635. This supports the idea that computing (many) more coefficients c_d would significantly reduce noise in Figure 3(b). In this example, even 40 000 coefficients may not be enough.
Computing many more coefficients c_d across the whole dataset would require impractical amounts of computation time. In the example above, which is typical in this regard, increasing the number of coefficients computed from 20 000 to 40 000 increased the computation time by a factor of more than 10. Instead we restrict to those toric varieties of Picard rank two such that the y-intercept standard error σ_int is less than 0.3; this retains 67 443 of the 200 000 datapoints. We used 70% of the slope and y-intercept data in the restricted dataset for model training, and the rest for validation. An SVM model predicted the dimension of the toric variety with an accuracy of 87.7%, and a Random Forest Classifier (RFC) predicted the dimension with an accuracy of 88.6%.

Neural networks. Neural networks do not handle unbalanced datasets well. We therefore removed the toric varieties of dimensions 3, 4, and 5 from our data, leaving 61 164 toric varieties of Picard rank two with terminal quotient singularities and σ_int < 0.3. This dataset is approximately balanced by dimension.
A Multilayer Perceptron (MLP) with three hidden layers of sizes (10, 30, 10) using the slope and intercept as features predicted the dimension with 89.0% accuracy. Since the slope and intercept give good control over log c_d for d ≫ 0, but not for small d, it is likely that the coefficients c_d with d small contain extra information that the slope and intercept do not see. Supplementing the feature set by including the first 100 coefficients c_d as well as the slope and intercept increased the accuracy of the prediction to 97.7%. Full details can be found in §§B-C.
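A minimal sketch of the classification pipeline shape (standardised features feeding a (10, 30, 10) multilayer perceptron) is shown below. The scikit-learn API calls are ours, and the data here is a synthetic stand-in for the (slope, intercept) features, with one Gaussian blob per dimension class, rather than the actual period dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for (slope, y-intercept) features: one well
# separated Gaussian blob per "dimension" class 6, ..., 10.
X = np.vstack([rng.normal(loc=(dim, -3.0 * dim), scale=0.3, size=(200, 2))
               for dim in range(6, 11)])
y = np.repeat(range(6, 11), 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = make_pipeline(
    StandardScaler(),  # zero mean, unit variance, as in the text
    MLPClassifier(hidden_layer_sizes=(10, 30, 10),
                  max_iter=2000, random_state=0),
)
model.fit(X_tr, y_tr)
accuracy = model.score(X_te, y_te)
```

On real period data the accuracy figures are of course those reported in the text; this sketch only illustrates the shape of the model.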
From machine learning to rigorous analysis. Elementary "out of the box" models (SVM, RFC, and MLP) trained on the slope and intercept data alone already gave a highly accurate prediction for the dimension. Furthermore, even for the many-feature MLP, which was the most accurate, sensitivity analysis using SHAP values [50] showed that the slope and intercept were substantially more important to the prediction than any of the coefficients c_d: see Figure 6. This suggested that the dimension of X might be visible from a rigorous estimate of the growth rate of log c_d.
In §3 we establish asymptotic results for the regularized quantum period of toric varieties with low Picard rank, as follows. These results apply to any weighted projective space or toric variety of Picard rank two: they do not require a terminality hypothesis. Note, in each case, the presence of a subleading logarithmic term in the asymptotics for log c_d.

Theorem 5. Let X denote the weighted projective space P(a_1, ..., a_N), so that the dimension of X is N − 1.
Let c_d denote the coefficient of t^d in the regularized quantum period Ĝ_X(t) given in (4). Then c_d = 0 unless d is divisible by a = a_1 + · · · + a_N, and as d → ∞ along multiples of a,

  log c_d ∼ Ad − (dim X / 2) log d + B,

where A = −Σ_{i=1}^N (a_i/a) log(a_i/a) and B = −(dim X / 2) log(2π) − (1/2) Σ_{i=1}^N log(a_i/a).
Note, although it plays no role in what follows, that A is the Shannon entropy of the discrete random variable with distribution (a_1/a, a_2/a, ..., a_N/a), and that B is a constant plus half the total self-information of that distribution.
Theorem 5 is a straightforward application of Stirling's formula. Theorem 6 is more involved, and relies on a Central Limit-type theorem that generalises the De Moivre-Laplace theorem.
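The Stirling-type asymptotics can be checked numerically. In the sketch below we take ordinary projective space P^(N-1), where all weights equal 1, so that (working from the description of A as a Shannon entropy) the growth rate A is log N, the entropy of the uniform distribution. Subtracting the leading linear term and the subleading logarithmic term from log c_d should then leave a remainder that settles down to a constant; the function name is ours.

```python
from math import lgamma, log

def log_cd_projective(N, d):
    """log c_d for P^(N-1) at a degree d divisible by N, via the closed
    formula c_d = d!/(k!)^N with k = d/N, computed with lgamma to avoid
    huge integers."""
    k = d // N
    return lgamma(d + 1) - N * lgamma(k + 1)

# For P^4 (N = 5, dimension 4): the remainder
#   log c_d - A*d + (dim/2)*log d,  with A = log 5,
# should converge to a constant as d grows, with successive
# differences shrinking like 1/d.
N, dim = 5, 4
remainder = [log_cd_projective(N, d) - d * log(N) + (dim / 2) * log(d)
             for d in (5000, 10000, 20000)]
```

The shrinking differences between successive remainders, and the finite-difference slope estimate approaching log 5, are exactly the behaviour predicted by the asymptotic form log c_d ~ Ad − (dim X/2) log d + B recalled in the text.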
Theoretical analysis. The asymptotics in Theorems 5 and 6 imply that, for X a weighted projective space or toric variety of Picard rank two, the quantum period determines the dimension of X. Let us revisit the clustering analysis from this perspective. Recall the asymptotic expression

  log c_d ∼ Ad − (dim X / 2) log d + B

and the formulae for A and B from Theorem 5. Figure 7(a) plots A and B for a sample of weighted projective spaces, coloured by dimension. Note the clusters, which overlap. Broadly speaking, the values of B increase as the dimension of the weighted projective space increases, whereas in Figure 3(a) the y-intercepts decrease as the dimension increases. This reflects the fact that we fitted a linear model to log c_d, omitting the subleading log d term in the asymptotics.
As Figure 8 shows, the linear model assigns the omitted term to the y-intercept rather than the slope. The slope of the linear model is approximately equal to A. The y-intercept, however, differs from B by a dimension-dependent factor. The omitted log d term does not vary too much over the range of degrees (d < 100 000) that we considered, and has the effect of reducing the observed y-intercept from B to approximately B − (9/2) dim X, distorting the clusters slightly and translating them downwards by a dimension-dependent amount. This separates the clusters. We expect that the same mechanism applies in Picard rank two as well: see Figure 7(b).
We can show that each cluster in Figure 7(a) is linearly bounded using constrained optimisation techniques. Consider, for example, the cluster for weighted projective spaces of dimension five, as in Figure 9.
Fix a suitable λ ≥ 0; minimising λA + B on the five-simplex gives a linear lower bound for the cluster. This bound does not use terminality: it applies to any weighted projective space of dimension five. The expression λA + B is unbounded above on the five-simplex (because B is), so we cannot obtain an upper bound this way. Instead, consider max(λA + B) subject to the constraint that each normalised weight a_i/a is at least ε, for an appropriate small positive ε, which we can take to be 1/M where M is the maximum sum of the weights. For Figure 9, for example, we can take M = 124, and in general such an M exists because there are only finitely many terminal weighted projective spaces. This gives a linear upper bound for the cluster.
The same methods yield linear bounds on each of the clusters in Figure 7(a). As the figure shows, however, the clusters are not linearly separable.
Discussion. We developed machine learning models that predict, with high accuracy, the dimension of a Fano variety from its regularized quantum period. These models apply to weighted projective spaces and toric varieties of Picard rank two with terminal quotient singularities. We then established rigorous asymptotics for the regularized quantum period of these Fano varieties. The form of the asymptotics implies that, in these cases, the regularized quantum period of a Fano variety X determines the dimension of X. The asymptotics also give a theoretical underpinning for the success of the machine learning models.
Perversely, because the series involved converge extremely slowly, reading the dimension of a Fano variety directly from the asymptotics of the regularized quantum period is not practical. For the same reason, enhancing the feature set of our machine learning models by including a log d term in the linear regression results in less accurate predictions. So although the asymptotics in Theorems 5 and 6 determine the dimension in theory, in practice the most effective way to determine the dimension of an unknown Fano variety from its quantum period is to apply a machine learning model.
The insights gained from machine learning were the key to our formulation of the rigorous results in Theorems 5 and 6. Indeed, it might be hard to discover these results without a machine learning approach. It is notable that the techniques in the proof of Theorem 6, namely the identification of generating functions for Gromov-Witten invariants of toric varieties with certain hypergeometric functions, have been known since the late 1990s and have been studied by many experts in hypergeometric functions since then. For us, the essential step in the discovery of the results was the feature extraction that we performed as part of our ML pipeline.
This work demonstrates that machine learning can uncover previously unknown structure in complex mathematical data, and is a powerful tool for developing rigorous mathematical results; cf. [18]. It also provides evidence for a fundamental conjecture in the Fano classification program [8]: that the regularized quantum period of a Fano variety determines that variety.

Methods
In this section we prove Theorem 5 and Theorem 6. The following result implies Theorem 5.

Theorem 8. Let X denote the weighted projective space P(a_1, ..., a_N), so that the dimension of X is N − 1.
For (x, y) in the strict interior of C with αx + βy = d, we have that as d → ∞.
The objective function Σ_{i=1}^N (a_i x + b_i y) log(a_i x + b_i y) here is the pullback to R^2 of the function f(y_1, ..., y_N) = Σ_{i=1}^N y_i log y_i along the linear embedding ι : R^2 → R^N given by (x, y) ↦ (a_1 x + b_1 y, ..., a_N x + b_N y). Note that C is the preimage under ι of the positive orthant R^N_+, so we need to minimise f on the intersection of the simplex y_1 + · · · + y_N = d, (y_1, ..., y_N) ∈ R^N_+ with the image of ι. The function f is convex and decreases as we move away from the boundary of the simplex, so the minimisation problem in (6) has a unique solution x* and this lies in the strict interior of C. We can therefore find the minimum x* using the method of Lagrange multipliers, by solving

□
Given a solution x* to (7), any positive scalar multiple of x* also satisfies (7), with a different value of d and a different value of the multiplier. Thus the solutions x*, as d varies, lie on a half-line through the origin. The direction vector of this half-line is the unique solution in P^1 to the system displayed above. Note that the first equation here is homogeneous; it is equivalent to (7), by exponentiating and then eliminating the multiplier. Any two solutions x*, for different values of d, differ by rescaling, and the quantities in Proposition 9 are invariant under this rescaling. They also satisfy. We use the following result, known in the literature as the "Local Theorem" [29], to approximate multinomial coefficients.
as k → ∞, uniformly in all the k_i, where

Theorem 6. Let X be a toric variety of Picard rank two and dimension N − 2 with weight matrix (2).

Proof. We need to estimate the sum defining c_d. For d sufficiently large, each summand coming from a lattice point at distance more than M√d from the minimiser x* is bounded by Cd^(−(1+dim X)/2) for some constant C; see (9). Since the number of such summands grows linearly with d, in the limit d → ∞ the contribution to c_d from these lattice points vanishes. As d → ∞, therefore, writing u = (v − x*)/√d, considering the sum here as a Riemann sum, and letting d → ∞, we see that the sum converges to an integral over L, where L is the line through the origin given by the kernel of the degree map and μ is the measure on L given by the integer lattice Z^2 ∩ L ⊂ L.
To evaluate the integral, observe that the pullback of μ along a parametrisation R → L that maps the generator of Z^2 ∩ L to 1 is the standard measure on R. The integral then reduces to a one-dimensional Gaussian integral, and taking logarithms gives the result.

Appendix A. We begin with an introduction to weighted projective spaces and toric varieties, aimed at non-specialists.
Projective spaces and weighted projective spaces. The fundamental example of a Fano variety is two-dimensional projective space P^2. This is a quotient of C^3 \ {0} by the group C^×, where the action of λ ∈ C^× identifies the points (x, y, z) and (λx, λy, λz) in C^3 \ {0}. The variety P^2 is smooth: we can see this by covering it with three open sets U_x, U_y, U_z that are each isomorphic to the plane C^2:

  U_x = {(1, u, v)}, given by rescaling x to 1
  U_y = {(u, 1, v)}, given by rescaling y to 1
  U_z = {(u, v, 1)}, given by rescaling z to 1

Here, for example, in the case U_x we take x ≠ 0 and set u = y/x, v = z/x.
Although the projective space P^2 is smooth, there are closely related Fano varieties called weighted projective spaces [20,36] that have singularities. For example, consider the weighted projective plane P(1, 2, 3): this is the quotient of C^3 \ {0} by C^×, where the action of λ ∈ C^× identifies the points (x, y, z) and (λx, λ^2 y, λ^3 z). Let us write μ_k for the group of kth roots of unity. The variety P(1, 2, 3) is once again covered by open sets

  U_x = {(1, u, v)}, given by rescaling x to 1
  U_y = {(u, 1, v)}, given by rescaling y to 1
  U_z = {(u, v, 1)}, given by rescaling z to 1

but this time we have U_x ≅ C^2, U_y ≅ C^2/μ_2, and U_z ≅ C^2/μ_3. This is because, for example, when we choose λ ∈ C^× to rescale (x, y, z) with z ≠ 0 to (u, v, 1), there are three possible choices for λ and they differ by the action of μ_3. In particular this lets us see that P(1, 2, 3) is singular. For example, functions on the chart U_y ≅ C^2/μ_2 are polynomials in u and v that are invariant under u ↦ −u, v ↦ −v, or in other words polynomials in a = u^2, b = v^2, and c = uv. Thus the chart U_y is the solution set for the equation ab − c^2 = 0, as pictured in Figure 10(a). Similarly, the chart U_z ≅ C^2/μ_3 can be written in terms of the invariants a = u^3, b = v^3, and c = uv, and is the solution set to the equation ab − c^3 = 0, as pictured in Figure 10(b). The variety P(1, 2, 3) has singular points at (0, 1, 0) ∈ U_y and (0, 0, 1) ∈ U_z, and away from these points it is smooth.
There are weighted projective spaces of any dimension. Let a_1, a_2, ..., a_N be positive integers such that any subset of size N − 1 has no common factor, and consider the quotient of C^N \ {0} by C^×, where the action of λ ∈ C^× identifies the points (x_1, ..., x_N) and (λ^(a_1)x_1, ..., λ^(a_N)x_N). Every weighted projective space has second Betti number b_2 = 1. There is a closed formula [9, Proposition D.9] for the regularized quantum period of X = P(a_1, a_2, ..., a_N):

  Ĝ_X(t) = Σ_{k ≥ 0} (ak)! / ((a_1 k)! (a_2 k)! · · · (a_N k)!) t^(ak),

where a = a_1 + a_2 + · · · + a_N.

Toric varieties of Picard rank 2. As well as weighted projective spaces, which are quotients of C^N \ {0} by an action of C^×, we will consider varieties that arise as quotients of C^N \ S by C^× × C^×, where S is a union of linear subspaces. These are examples of toric varieties [16,25]. Specifically, consider a matrix

  ( a_1 a_2 · · · a_N )
  ( b_1 b_2 · · · b_N )    (11)

with non-negative integer entries and no zero columns. This defines an action of C^× × C^× on C^N, where (λ, μ) ∈ C^× × C^× identifies the points (x_1, ..., x_N) and (λ^(a_1)μ^(b_1)x_1, ..., λ^(a_N)μ^(b_N)x_N). Set α = a_1 + · · · + a_N and β = b_1 + · · · + b_N, and suppose that (α, β) is not a scalar multiple of (a_i, b_i) for any i. This determines linear subspaces S_+ and S_− of C^N, and we consider the quotient

  X = (C^N \ S) / (C^× × C^×),    (12)

where S = S_+ ∪ S_−. See e.g. [5, §A.5]. These quotients behave in many ways like weighted projective spaces. Indeed, for an appropriate choice of the weight matrix (11), X coincides with P(a_1, a_2, ..., a_N). We will consider only weight matrices such that the subspaces S_+ and S_− both have dimension two or more; this implies that the second Betti number b_2(X) = 2, and hence X is not a weighted projective space. We will refer to such quotients (12) as toric varieties of Picard rank two, because general theory implies that the Picard lattice of X has rank two. The dimension of X is N − 2.
As for weighted projective spaces, toric varieties of Picard rank two can have singular points, the precise form of which is determined by the weights (11). There is also a closed formula for the regularized quantum period.

Weighted projective spaces. We excluded dimensions one and two from the analysis, since there is only one weighted projective space in each case (namely P^1 and P^2). Therefore we have a dataset of 149 998 slope-intercept pairs, labelled by the dimension, which varies between three and ten. We standardised the features, by translating the means to zero and scaling to unit variance, and applied a Support Vector Machine (SVM) with linear kernel and regularisation parameter C = 10. By looking at different train-test splits we obtained the learning curves shown in Figure 15. The figure displays the mean accuracies for both training and validation data obtained by performing five random test-train splits each time; the shaded areas around the lines correspond to the 1σ region, where σ denotes the standard deviation. In Figure 16 we plot the decision boundaries computed by the SVM between neighbouring dimension classes.
Toric varieties of Picard rank 2. In light of the discussion above, we restricted attention to toric varieties of Picard rank two such that the y-intercept standard error σ_int is less than 0.3. We also excluded dimension two from the analysis, since in this case there are only two varieties (namely, P^1 × P^1 and the Hirzebruch surface F_1). The resulting dataset contains 67 443 slope-intercept pairs, labelled by dimension; the dimension varies between three and ten, as shown in Table 3.
Support Vector Machine. We used a linear SVM with regularisation parameter C = 50. By considering different train-test splits we obtained the learning curves shown in Figure 17, where the means and the standard deviations were obtained by performing five random samples for each split. Note that the model did not overfit. We obtained a validation accuracy of 88.2% using 70% of the data for training.
Figure 18 shows the decision boundaries computed by the SVM between neighbouring dimension classes. Figure 19 shows the confusion matrices for the same train-test split.
Random Forest Classifier. We used a Random Forest Classifier (RFC) with 1500 estimators and the same features (slope and y-intercept for the linear model). By considering different train-test splits we obtained the learning curves shown in Figure 20; note again that the model did not overfit. Using 70% of the data for training, the RFC gave a validation accuracy of 89.4%. Figure 21 shows confusion matrices for the same train-test split.

Feed-forward neural network. As discussed above, neural networks do not handle unbalanced datasets well, and therefore we removed the toric varieties with dimensions three, four, and five from our dataset: see Table 3. We trained a Multilayer Perceptron (MLP) classifier on the same features, using an MLP with three hidden layers (10, 30, 10), the Adam optimiser [44], and rectified linear activation function [2]. Different train-test splits produced the learning curve in Figure 22; again the model did not overfit. Using 70% of the data for training, the MLP gave a validation accuracy of 88.7%. One could further balance the dataset, by randomly undersampling so that there are the same number of representatives in each dimension (8244 representatives: see Table 3). This resulted in a slight decrease in accuracy: the better balance was outweighed by the loss of data caused by undersampling.
Feed-forward neural network with many features. We trained an MLP with the same architecture, but supplemented the features by including log c_d for 1 ≤ d ≤ 100 (unless c_d was zero, in which case we set that feature to zero), as well as the slope and y-intercept as before. We refer to the previous neural network as MLP_2, because it uses 2 features, and refer to this neural network as MLP_102, because it uses 102 features. Figure 23 shows the learning curves obtained for different train-test splits. Using 70% of the data for training, the MLP_102 model gave a validation accuracy of 97.7%. We do not understand the reason for the performance improvement of MLP_102 over MLP_2, but one possible explanation is the following. Recall that the first 1000 terms of the period sequence were excluded when calculating the slope and intercept, because they exhibit irregular oscillations that decay as d grows. These oscillations reduce the accuracy of the linear regression. The oscillations may, however, carry information about the toric variety, and so including the first few values of log c_d potentially makes more information available to the model. For example, examining the pattern of zeroes at the beginning of the sequence (c_d) sometimes allows one to recover the values of α and β; see (13) for the notation. This information is relevant to estimating the dimension because, as a very crude approximation, larger α and β go along with larger dimension. Omitting the slope and intercept, however, and training on the coefficients log c_d for 1 ≤ d ≤ 100 with the same architecture gave an accuracy of only 62%.
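The 102-dimensional feature vector just described can be assembled as follows; the function name is ours, and the slope and intercept are assumed to have been computed beforehand by the linear regression described earlier.

```python
from math import log

def mlp102_features(coeffs, slope, intercept):
    """Assemble the 102-dimensional feature vector described in the
    text: log c_d for 1 <= d <= 100, with a zero coefficient mapped to
    the feature value 0, followed by the slope and y-intercept of the
    fitted linear model."""
    logs = [log(coeffs[d]) if coeffs[d] != 0 else 0.0
            for d in range(1, 101)]
    return logs + [slope, intercept]
```

Note that c_0 = 1 is deliberately not included: the features start at d = 1, giving 100 coefficient features plus the two regression features.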
Comparison of models. The validation accuracies of the SVM, RFC, and the neural networks MLP_2 and MLP_102, on the same data set (σ_int < 0.3, dimension between six and ten), are compared in Table 4; their confusion matrices are shown in Table 5. All models trained on only the regression data performed well, with the RFC slightly more accurate than the SVM and the neural network MLP_2 slightly more accurate still. Misclassified examples generally lie in higher dimensions, which is consistent with the idea that misclassification is due to convergence-related noise. The neural network trained on the supplemented feature set, MLP_102, outperforms all the other models. However, as discussed above, feature importance analysis using SHAP values showed that the slope and the intercept were the most influential features in the prediction.

Appendix D. Supplementary Discussion
Comparison with Principal Component Analysis. An alternative approach to dimensionality reduction, rather than fitting a linear model to log c_d, would be to perform Principal Component Analysis (PCA) on this sequence and retain only the first few principal components. Since the vectors (c_d) have different patterns of zeroes (c_d is non-zero only if d is divisible by the Fano index r of X), we need to perform PCA separately for Fano varieties of each index r. We analysed this in the weighted projective space case, finding that for each r the first two components of PCA are related to the growth coefficients (A, B) from Theorem 5 by an invertible affine-linear transformation. That is, our analysis suggests that the coefficients (A, B) contain exactly the same information as the first two components of PCA. Note, however, that the affine-linear transformation relating the PCA components to (A, B) varies with the Fano index r. Using A and B as features therefore allows for meaningful comparison between Fano varieties of different index. Furthermore, unlike PCA-derived values, the coefficients (A, B) can be computed for a single Fano variety, rather than requiring a sufficiently large collection of Fano varieties of the same index.
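The claimed affine-linear relationship can be illustrated numerically. The sketch below uses a toy family (synthetic data, illustrative names): each vector depends affine-linearly on two parameters (A, B), two principal components capture the whole family, and (A, B) is recovered from the PCA scores by linear regression with near-perfect fit:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Toy family: each "log-period vector" depends affine-linearly on two
# growth coefficients (A, B), loosely modelled on log c_d ~ A*d + B*log d.
rng = np.random.default_rng(1)
d = np.arange(1, 201)
AB = rng.uniform([1.0, -40.0], [3.0, -10.0], size=(300, 2))  # one (A, B) per variety
V = AB[:, [0]] * d + AB[:, [1]] * np.log(d)                  # shape (300, 200)

pca = PCA(n_components=3).fit(V)
scores = pca.transform(V)[:, :2]

# Recover (A, B) from the first two PCA scores by an affine-linear map.
recover = LinearRegression().fit(scores, AB)
```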
Towards more general Fano varieties. Weighted projective spaces and toric varieties of Picard rank two are very special among Fano varieties. It is hard to quantify this, because so little is known about Fano classification in the higher-dimensional and non-smooth cases, but, for example, in three dimensions this class includes only 18% of the Q-factorial terminal Fano toric varieties. On the other hand, one can regard weighted projective spaces and toric varieties of Picard rank two as representative of a much broader class of algebraic varieties called toric complete intersections. Toric complete intersections share the key properties that we used to prove Theorems 5 and 6: geometry that is tightly controlled by combinatorics, including explicit expressions for genus-zero Gromov-Witten invariants in terms of hypergeometric functions. We therefore believe that the rigorous results of this paper will generalise to the toric complete intersection case. All smooth two-dimensional Fano varieties and 92 of the 105 smooth three-dimensional Fano varieties are toric complete intersections [9]. Many theorems in algebraic geometry were first proved for toric varieties and later extended to toric complete intersections and more general algebraic varieties; cf. [26,27,33] and [28,56].
The machine learning paradigm presented here, however, applies much more broadly. Since our models take only the regularized quantum period sequence as input, we expect that whenever we can calculate the regularized quantum period of X, which is the case for almost all known Fano varieties, we should be able to apply a machine learning pipeline to extract geometric information about X.

Figure 1. Algebraic varieties and their equations: (a) a smooth example; (b) an example with a singular point.

Figure 2. The logarithm of the non-zero period coefficients c_d: (a) for a typical weighted projective space; (b) for the toric variety of Picard rank two from Example 3.

Figure 4. The slopes and y-intercepts from the linear model. This is as in Figure 3(b), but plotting only data points for which the standard error σ_int for the y-intercept satisfies σ_int < 0.3. The colour records the dimension of the toric variety.

Example 4. Consider the toric variety with Picard rank two and weight matrix 1. This is one of the outliers in Figure 3(b). The toric variety is five-dimensional, with slope 1.637 and y-intercept −62.64. The standard errors are 4.246 × 10⁻⁴ for the slope and 5.021 for the y-intercept.

Figure 5. Variation as we move deeper into the period sequence. The y-intercept and its standard error σ_int for the toric variety from Example 4, as computed from pairs (d, log c_d) with N − 20 000 ≤ d ≤ N, sampling every 100th term. We also show LOWESS-smoothed trend lines.
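A minimal sketch of this windowed computation, assuming a trailing window of 20 000 terms with every 100th non-zero term sampled, as in the caption (function and parameter names are ours):

```python
import numpy as np
from scipy.stats import linregress

def windowed_intercepts(c, window=20_000, step=100, stride=5_000):
    """For each endpoint N, fit a line to (d, log c_d) over the trailing
    window N - window <= d <= N, sampling every `step`-th non-zero term.
    Returns (N, y-intercept, standard error of the intercept) triples;
    a sketch of the computation behind Figure 5."""
    c = np.asarray(c, dtype=float)
    d = np.arange(1, len(c) + 1)
    out = []
    for N in range(window, len(c) + 1, stride):
        sel = (d >= N - window) & (d <= N) & (d % step == 0) & (c > 0)
        fit = linregress(d[sel], np.log(c[sel]))
        out.append((N, fit.intercept, fit.intercept_stderr))
    return out
```

Plotting the intercepts and their standard errors against N, with a LOWESS trend line, would then reproduce the style of plot in Figure 5.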

Figure 6. Model sensitivity analysis using SHAP values. The model is an MLP with three hidden layers of sizes (10, 30, 10), applied to toric varieties of Picard rank two with terminal quotient singularities. It is trained on the slope, the y-intercept, and the first 100 values log c_d as features, and predicts the dimension with 97.7% accuracy.
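Computing SHAP values requires the `shap` package; as a dependency-free stand-in, the sketch below uses scikit-learn's permutation importance, which gives a comparable (though coarser) feature-sensitivity readout. The data is synthetic and the feature layout (slope, y-intercept, then noise columns in place of the log c_d features) is illustrative:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.inspection import permutation_importance

# Synthetic data: column 0 plays the slope, column 1 the y-intercept
# (both informative for the dimension); columns 2-6 are pure noise.
rng = np.random.default_rng(2)
dims = rng.integers(6, 11, size=600)
X = np.column_stack([
    dims + rng.normal(0.0, 0.1, 600),
    -4.5 * dims + rng.normal(0.0, 0.5, 600),
    rng.normal(0.0, 1.0, (600, 5)),
])

clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(10, 30, 10), max_iter=3000, random_state=0),
).fit(X, dims)

imp = permutation_importance(clf, X, dims, n_repeats=5, random_state=0)
mean_imp = imp.importances_mean  # informative columns should dominate
```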

Figure 7. The values of the asymptotic coefficients A and B: (a) for all weighted projective spaces P(a_1, . . ., a_n) with terminal quotient singularities and a_i ≤ 25 for all i; the colour records the dimension of the weighted projective space. (b) for toric varieties of Picard rank two in our dataset; the colour records the dimension of the toric variety.

Figure 8. For weighted projective spaces, the asymptotic coefficients A and B are closely related to the slope and y-intercept. (a) Comparison between A and the slope from the linear model, for weighted projective spaces that occur in both Figure 3(a) and Figure 7(a), coloured by dimension. The line slope = A is indicated. (b) Comparison between B and the y-intercept from the linear model, for the same weighted projective spaces, coloured by dimension. The line y-intercept = B − (9/2) dim X is shown.

Figure 9. Linear bounds for the cluster of five-dimensional weighted projective spaces in Figure 7(a). The bounds are given by Proposition 7.

Figure 10. Singular charts on the weighted projective space P(1, 2, 3): (a) the real-valued points in one singular chart; (b) the real-valued points in the other.

Figure 11. Standard errors for the slope and y-intercept. The distribution of standard errors for the slope and y-intercept from the linear model applied to weighted projective spaces with terminal quotient singularities: (a) standard error for the slope; (b) standard error for the y-intercept; (c) standard error for the y-intercept by dimension.

Figure 12. Standard errors for the slope and y-intercept. The distribution of standard errors for the slope and y-intercept from the linear model applied to toric varieties of Picard rank two with terminal quotient singularities: (a) standard error for the slope; (b) standard error for the y-intercept.

Figure 13. The slopes and y-intercepts from the linear model applied to toric varieties of Picard rank two with terminal quotient singularities. Data points are selected according to the standard error σ_int for the y-intercept; the colour records the dimension of the toric variety. (a) All data points. (b) Points with σ_int < 1: 101 183 of the 200 000 points. (c) Points with σ_int < 0.3: 67 445 of the 200 000 points.

Figure 15. Learning curves for a Support Vector Machine with linear kernel applied to the dataset of weighted projective spaces. The plot shows the means of the training and validation accuracies for five different random train-test splits. The shaded regions show the 1σ interval, where σ is the standard deviation.
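scikit-learn's `learning_curve` produces exactly this kind of plot data. The sketch below uses stand-in blob data in place of the real (slope, y-intercept) features:

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

# Stand-in data: well-separated (slope, y-intercept)-style blobs,
# one class per dimension from six to ten.
rng = np.random.default_rng(3)
y = np.repeat(np.arange(6, 11), 100)
X = np.column_stack([y + rng.normal(0, 0.2, 500),
                     -4.5 * y + rng.normal(0, 1.0, 500)])

sizes, train_scores, val_scores = learning_curve(
    SVC(kernel="linear"), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5), shuffle=True, random_state=0)

train_mean, train_std = train_scores.mean(axis=1), train_scores.std(axis=1)
val_mean, val_std = val_scores.mean(axis=1), val_scores.std(axis=1)
# The plotted band is mean +/- std: the 1-sigma interval in the caption.
```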

Figure 16. Decision boundaries computed from a Support Vector Machine with linear kernel trained on 70% of the dataset of weighted projective spaces. Note that the data has been standardised.

Figure 17. Learning curves for a Support Vector Machine with linear kernel applied to the dataset of toric varieties of Picard rank two. The plot shows the means of the training and validation accuracies for five different random train-test splits. The shaded regions show the 1σ interval, where σ is the standard deviation.

Figure 18. Decision boundaries computed from a Support Vector Machine with linear kernel trained on 70% of the dataset of toric varieties of Picard rank two. Note that the data has been standardised.

Figure 19. Confusion matrices for a Support Vector Machine with linear kernel trained on 70% of the dataset of toric varieties of Picard rank two.

Figure 20. Learning curves for a Random Forest Classifier applied to the dataset of toric varieties of Picard rank two. The plot shows the means of the training and validation accuracies for five different random train-test splits. The shaded regions show the 1σ interval, where σ is the standard deviation.
Figure 21. Confusion matrices for a Random Forest Classifier trained on 70% of the dataset of toric varieties of Picard rank two: (a) normalised with respect to the true values; (b) normalised with respect to the predicted values.

Figure 22. Learning curves for the Multilayer Perceptron classifier MLP2 applied to the dataset of toric varieties of Picard rank two and dimension at least six, using just the regression data as features. The plot shows the means of the training and validation accuracies for five different random train-test splits. The shaded regions show the 1σ interval, where σ is the standard deviation.

Figure 23. Learning curves for the Multilayer Perceptron classifier MLP102 applied to the dataset of toric varieties of Picard rank two and dimension at least six, using as features the regression data together with log c_d for 1 ≤ d ≤ 100. The plot shows the means of the training and validation accuracies for five different random train-test splits. The shaded regions show the 1σ interval, where σ is the standard deviation.

Table 1. The distribution by dimension in our datasets.

Table 3. The distribution by dimension among toric varieties of Picard rank two in our dataset with σ_int < 0.3.

Table 4. Comparison of model accuracies. Accuracies for various models applied to the dataset of toric varieties of Picard rank two and dimension at least six: a Support Vector Machine with linear kernel, a Random Forest Classifier, and the neural networks MLP2 and MLP102.

Table 5. Comparison of confusion matrices. Confusion matrices for various models applied to the dataset of toric varieties of Picard rank two and dimension at least six: a Support Vector Machine with linear kernel, a Random Forest Classifier, and the neural networks MLP2 and MLP102.