Abstract
Determining the ground states of heterofullerenes is challenging because of the large number of isomers. Taking the C_{60−n}B_{n} heterofullerenes (1 ≤ n ≤ 4) as an example, our first-principles calculations combined with isomer enumeration yield a most stable structure of C_{57}B_{3} that is 0.73 eV lower in energy than the previously reported one. Enumerating the isomers for n beyond 4 is impractical with first-principles calculations alone because of their cost. Here, we propose a nomenclature to enhance structural recognition and adopt an extended cluster expansion to describe the structural stabilities, in which the energies of heterofullerenes with various concentrations are predicted by a linear combination of multi-body interactions. Unlike the conventional cluster expansion, the interaction parameters are derived from the enumeration of C_{60−n}B_{n} (n = 1–4), and only 4 coefficients, which depend on composition to account for local bonding, need to be fitted. The cross-validation scores are 1–2 meV per atom for both C_{55}B_{5} and C_{54}B_{6}, ensuring that the ground states obtained from our model are in line with the first-principles results. With the help of the structural recognition, the extended cluster expansion could be further applied to other binary systems as an effective complement to first-principles calculations.
Introduction
Since its discovery in 1985^{1}, the C_{60} fullerene has attracted great attention because of the potential applications of its unique structure-dependent properties. The C_{60} cage, which is large enough to be observed by transmission electron microscopy or scanning probe methods^{2,3}, is likely to remain stable when built into molecular circuits as a semiconductor material^{4,5,6,7}. Doping, by adding exohedral, endohedral or substitutional atoms^{11}, has been adopted as a way to alter the charge distribution and thereby tune the optical, electronic and magnetic properties in the solid state^{8,9,10}. As neighbors of carbon in the Periodic Table, boron and nitrogen have similar atomic radii and are popular choices as heteroatoms for substituting one or more carbon atoms^{12,13,14}. C_{60−n}B_{n} heterofullerenes with 1 ≤ n ≤ 6 were produced by laser vaporization of a graphite pellet containing boron nitride powder^{15}, which indicated that the boron-doped C_{60} cage remains particularly stable. During the synthesis and characterization of C_{59}N^{16}, C_{59}N in the vapor phase was found to exist in monomer form as a molecular free radical^{17}, and a single C_{59}N heterofullerene molecule could be used as a molecular rectifier in a double-barrier tunnel junction via the single-electron tunneling effect^{18}.
Theoretically, Kurita et al. found that the molecular structures of C_{59}B and C_{59}N maintained the C_{60} cage, whereas a large dopant such as sulfur distorted it^{19}. In addition to the calculation of C_{59}B by the first-principles method^{20}, the ground state geometries of C_{60−n}N_{n} and C_{60−n}B_{n} for 2 ≤ n ≤ 8 were screened using the semi-empirical MNDO, AM1 and PM3 methods and ab initio methods^{21}. As C_{48}B_{12} and C_{48}N_{12} are promising components for molecular rectifiers^{22}, Garg et al.^{23} reported a detailed study of the structural, electronic and vibrational properties of B-doped heterofullerenes (C_{60−n}B_{n}, n = 1–12) based on ab initio calculations, concluding that a pentagon ring hosts at most one boron atom and a hexagon ring at most two. In general, it is difficult to determine the ground state of heterofullerenes because of two main obstacles: (i) only a small number of isomers for a given composition have been considered; (ii) the optimized structures depend strongly on the initial geometry chosen among the numerous possible isomers^{24}. Hence, theoretical studies that search systematically for the energetically preferred structures of heterofullerenes are still lacking.
In this paper, we perform a systematic investigation of C_{60−n}B_{n} (n = 1–6) based on first-principles calculations with a congruence check, in which the structure recognition is achieved by a uniform numbering scheme for C_{60}. Furthermore, an extended cluster expansion is proposed to estimate the total energies from all the possible pair, three-body and four-body interactions derived from the enumeration of C_{60−n}B_{n} (n = 1–4). Only four coefficients need to be fitted to account for the composition. We determine the ground state structures of C_{55}B_{5} and C_{54}B_{6}, as confirmed by the first-principles calculations, which indicates that the method may be applicable to other alloy systems.
Structure recognition
We adopt the systematic numbering scheme recommended by the IUPAC^{25} to identify the vertices of the C_{60} cage, as shown in Fig. 1a. Every atom in the C_{60} fullerene cage (the coordinates are listed in Supplementary Table S1) has a unique sequence number (SN). An isomer of the C_{60−n}B_{n} heterofullerenes is denoted by an index consisting of the SNs of the substituted vertices in ascending order, i.e. \(\,({\sigma }_{1},{\sigma }_{2},\ldots ,{\sigma }_{n})\), which is called the structural index (SI). According to the congruence check, the total number of C_{60−n}B_{n} (n ≤ 4) isomers is 4,517, only about 1% of the corresponding number of combinations (see Supplementary Table S2). The total energies of all these candidates are obtained by first-principles calculations (details in Supplementary Information), in which all the structures are fully relaxed without any symmetry constraint. The ground state structures and the energy profiles are shown in Fig. 1. There are 23 different isomers (shown in Table 1) for C_{58}B_{2}, in agreement with earlier calculations^{23,26,27}. As shown in Fig. 1b, the global minimum structure of C_{58}B_{2} is the cage with two boron atoms at opposite sites of a hexagon ring. The ground states of C_{57}B_{3} and C_{56}B_{4}, shown in Fig. 1c and d, respectively, follow a similar pattern: all boron atoms sit at opposite vertices of mutually adjacent hexagon rings, with no more than two boron atoms on any hexagon ring. The ground state structures of C_{58}B_{2} and C_{56}B_{4} agree with previous calculations^{21,23}. However, the total energy of our most stable C_{57}B_{3} is 0.73 eV lower than that of the structure proposed by Garg et al.^{23}, which ranks 126^{th} in our enumeration of the 303 isomers, as shown in Fig. 1c.
Buckminsterfullerene, the only C_{60} isomer fulfilling the isolated pentagon rule (IPR)^{28}, is a spherical molecule with 60 carbon atoms at the vertices of 32 faces, 20 hexagons and 12 pentagons, in which no two pentagons share a vertex^{29}. Treating C_{60} as a semi-regular polyhedron and considering all the symmetry operations, represented by 120 symmetry matrices (SMs), we obtain a 60 × 120 matrix called the numbering matrix (NM) (available in the Supplementary Dataset file). Its n^{th} row lists the atoms onto which the n^{th} atom is mapped under all the symmetry operations, and its m^{th} column lists the images of all 60 atoms under the m^{th} SM.
Herein, based on the NM of C_{60}, we propose a nomenclature for the C_{60−n}B_{n} heterofullerenes. The flow chart of our structure recognition method is shown in Supplementary Fig. S1, and the details are available in the Supplementary Information. We have deduced all the SIs of the inequivalent structures of C_{60−n}B_{n} heterofullerenes for 2 ≤ n ≤ 10. The numbers of SIs are listed in Supplementary Table S2 and are in good agreement with previous results^{24,30,31}. The inequivalent structures can also be singled out by our recently developed structure recognition method^{32}. Note that our nomenclature is derived from the symmetry operation matrices, which can be obtained from the coordinates of the system and the corresponding symmetry operations. Thus, this nomenclature can be extended to the non-IPR isomers of C_{60}, as well as to larger fullerenes such as C_{70} and C_{82}.
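As a quick arithmetic check of the figures quoted above, the following minimal Python sketch sums the isomer counts given in the text (1, 23, 303 and 4,190 for n = 1 to 4) and compares them with the number of ways of choosing n of the 60 vertices. The variable names are illustrative only.

```python
from math import comb

# Inequivalent isomer counts for n = 1..4, as quoted in the text.
isomers = {1: 1, 2: 23, 3: 303, 4: 4190}

total_isomers = sum(isomers.values())
# Number of ways to pick n substituted vertices out of 60, summed over n.
total_combinations = sum(comb(60, n) for n in isomers)
fraction = total_isomers / total_combinations

print(total_isomers, total_combinations, f"{100 * fraction:.2f}%")
```

The totals are 4,517 isomers against 523,685 combinations, i.e. about 0.86%, consistent with the "about 1%" stated above.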
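The idea of picking one representative SI per symmetry class can be illustrated on a much smaller system. The sketch below is not the authors' implementation: it assumes a toy six-membered ring with its 12 dihedral symmetry operations in place of the 60 × 120 numbering matrix of C_{60} (which is only available in the Supplementary Dataset), and the names `NM`, `canonical_si` and `inequivalent` are hypothetical. The canonical SI is taken as the lexicographically smallest image of the sorted index under all operations.

```python
from itertools import combinations

N = 6  # vertices of the toy ring, labelled 1..N (an assumption, not C60)

# Each symmetry operation is stored as a permutation of the vertex labels.
ops = []
for k in range(N):
    ops.append([(i + k) % N + 1 for i in range(N)])  # rotation by k steps
    ops.append([(k - i) % N + 1 for i in range(N)])  # reflection, then rotation

# Toy numbering matrix: NM[v - 1][m] is the image of vertex v under operation m.
NM = [[op[v] for op in ops] for v in range(N)]

def canonical_si(si):
    """Smallest sorted image of a structural index under all operations."""
    return min(tuple(sorted(NM[v - 1][m] for v in si)) for m in range(len(ops)))

def inequivalent(n):
    """Canonical SIs of all symmetry-inequivalent n-fold substitutions."""
    return sorted({canonical_si(c) for c in combinations(range(1, N + 1), n)})

for n in (1, 2, 3):
    print(n, inequivalent(n))
```

For the ring this yields one singly substituted, three doubly substituted and three triply substituted isomers. Replacing the toy `NM` with the C_{60} numbering matrix would, under the same assumptions, apply the identical reduction to the heterofullerene SIs.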
Extended cluster expansion method
By combining the isomer enumeration with first-principles calculations, we have determined the ground state structures of C_{60−n}B_{n} heterofullerenes with 2 ≤ n ≤ 4. For heterofullerenes with higher boron concentration, we can still enumerate all the isomers by the recognition method discussed above. However, searching for the ground state structures with first-principles calculations becomes prohibitively expensive, because there are 45,718, 418,470, 3,220,218, 21,330,558, 123,204,921 and 628,330,629 isomers for C_{55}B_{5}, C_{54}B_{6}, C_{53}B_{7}, C_{52}B_{8}, C_{51}B_{9} and C_{50}B_{10}, respectively. In analogy to the conventional cluster expansion (CE)^{33}, an extended cluster expansion (ExCE) method aimed at this problem is discussed in the following.
The CE method is an efficient tool for studying the structural properties of binary structures over a wide range of concentrations^{34,35,36,37,38}. It parameterizes the total energy of any given configuration of A_{x}B_{1−x} (0 ≤ x ≤ 1) to avoid the cost of first-principles calculations. The enthalpy of formation of a configuration \(\mathop{s}\limits^{\rightharpoonup }\) is described exactly by a set of multi-body interaction parameters J_{α} combined in the form of an Ising-like Hamiltonian^{37}, which is often approximated as a polynomial function of the occupation variables,
\[E(\mathop{s}\limits^{\rightharpoonup }) = \sum_{\alpha } {m}_{\alpha }{J}_{\alpha }\Big\langle \prod_{i\in {\alpha }^{\prime }} {s}_{i}\Big\rangle \tag{1}\]
where the summation is over all the nonequivalent clusters α, each a set of sites i, and the average is taken over all the clusters α′ that are equivalent to α by symmetry. The coefficients J_{α} are defined as the effective cluster interaction (ECI) parameters, and m_{α} is the number of clusters equivalent to α.
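The isomer counts quoted above can be sanity-checked against a simple bound. Since at most 120 symmetry operations act on the cage, no isomer can correspond to more than 120 distinct vertex selections, so the number of inequivalent isomers is at least C(60, n)/120. The sketch below, with illustrative variable names, evaluates this bound for the six quoted compositions.

```python
from math import comb

# Inequivalent isomer counts for C(60-n)B(n), n = 5..10, as quoted in the text.
reported = {5: 45718, 6: 418470, 7: 3220218,
            8: 21330558, 9: 123204921, 10: 628330629}

GROUP_ORDER = 120  # number of symmetry operations of the C60 cage

# Each isomer accounts for at most GROUP_ORDER vertex selections.
lower_bounds = {n: comb(60, n) / GROUP_ORDER for n in reported}

for n, count in reported.items():
    print(n, count, round(lower_bounds[n]), f"{count / lower_bounds[n]:.4f}")
```

All six quoted counts lie within 1% above the bound, as expected when most substitution patterns have no symmetry of their own.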
In general, the total energies of given configurations are described by a combination of single-atom contributions, pair interactions and multi-body interactions, and are expected to converge gradually as more interactions are considered. However, fitting a larger number of parameters is also time-consuming. To balance the accuracy and efficiency of the CE method, the number of effective multi-site interactions can be greatly reduced^{39}, but choosing which combination of effective interactions to keep is itself another global optimization problem. Herein, we attempt to derive the multi-body interactions while taking the total impurity concentration into account, and to fit the total energy with fewer parameters at higher accuracy.
An isomer of the C_{60−n}B_{n} heterofullerenes, taking the carbon atoms as the background, can be viewed as a cluster of boron atoms denoted by \(\,({\sigma }_{1},{\sigma }_{2},\cdots ,{\sigma }_{n})\). We assume that the total energy can be attributed to all the subclusters, which are enumerated for fitting the total energy. Firstly, there is only one isomer of C_{59}B, so the energy difference of C_{59}B relative to C_{60} is E_{1} = E_{tot} − E_{0} = 3.39 eV, where E_{tot} and E_{0} are the total energies of C_{59}B and C_{60}, respectively. The energy difference E_{1} corresponds to the reaction heat when one carbon atom is substituted by one boron atom, and can be considered as the single-atom contribution in the expansion. For a C_{58}B_{2} isomer, whose subclusters are two equivalent single boron dopants, the environment of each boron atom differs from that in C_{59}B, so the energy is expressed as E_{0} + 2c_{1}E_{1}, with the coefficient c_{1} a function of the boron concentration. Based on the energies of all the C_{58}B_{2} isomers, we fit this coefficient c_{1} to be 0.980, with an average deviation of 0.208 eV. As in the CE method, the fitting quality is measured by the cross-validation (CV) score^{39},
\[{\rm{CV}}=\sqrt{\frac{1}{N}\sum _{i=1}^{N}{({E}_{i}^{DFT}-{\hat{E}}_{i})}^{2}} \tag{2}\]
where \({E}_{i}^{DFT}\) and \({\hat{E}}_{i}\) denote the DFT-calculated and predicted energies of a particular structure i, and N is the number of structures. The deviation of \({E}_{i}^{DFT}\) from \({\hat{E}}_{i}\) is taken as the interaction energy of the corresponding boron cluster. For C_{58}B_{2}, the 23 fitting deviations are the B-B interactions in the 23 isomers, respectively.
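The fit for C_{58}B_{2} can be sketched in a few lines of Python. The example is only illustrative: E_{1} = 3.39 eV is taken from the text, but the four relative energies are hypothetical placeholders rather than computed values, and the leave-one-out form of the score is an assumption based on the usual definition of a cross-validation score. Because the model E − E_{0} = 2c_{1}E_{1} has a single constant regressor, the least-squares coefficient reduces to the mean relative energy divided by 2E_{1}.

```python
from math import sqrt

E1 = 3.39  # eV, single-dopant energy E_1 quoted in the text

# Hypothetical energies of a few C58B2 isomers relative to C60 (E_tot - E_0), eV.
relative_energies = [6.40, 6.60, 6.80, 6.90]

def fit_c1(energies, n_dopants=2, e1=E1):
    """Least-squares c_1 for the model E - E_0 = n * c_1 * E_1."""
    return sum(energies) / len(energies) / (n_dopants * e1)

c1 = fit_c1(relative_energies)

# Fitting deviations, interpreted in the text as the B-B pair interactions.
pair_interactions = [e - 2 * c1 * E1 for e in relative_energies]

def loo_cv(energies):
    """Leave-one-out score: predict each structure from a fit to the others."""
    errors = []
    for i, e in enumerate(energies):
        rest = energies[:i] + energies[i + 1:]
        errors.append(e - 2 * fit_c1(rest) * E1)
    return sqrt(sum(err ** 2 for err in errors) / len(errors))

cv = loo_cv(relative_energies)
print(f"c1 = {c1:.3f}, CV = {cv:.3f} eV")
```

By construction the deviations sum to zero, so they measure how each pair geometry shifts the energy relative to the average over the fitted isomers.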
Similarly, the total energy of a C_{57}B_{3} isomer denoted by \(\,\mathop{\sigma }\limits^{\rightharpoonup }\) has contributions from three single dopants and three pair interactions. For example, the C_{57}B_{3} isomer denoted by (1, 7, 11) can be expanded into 6 clusters, 3 singles and 3 pairs. The initial SIs of the subclusters are listed in Fig. 2a, along with the smallest SIs assigned by our recognition method. Apart from the singles, the isomer (1, 7, 11) has 3 subclusters, denoted by (1, 7), (1, 7) and (1, 24). Hence we express the total energy as \(\,{E}_{0}+3{c}_{1}{E}_{1}+{c}_{2}\sum _{\alpha }{E}_{2}^{\alpha }\), where \({E}_{2}^{\alpha }\) denotes the B-B interaction of any boron pair that is a subcluster of \(\,\mathop{\sigma }\limits^{\rightharpoonup }\). The coefficients c_{1} and c_{2} are fitted to be 0.972 and 0.782, respectively, and the average deviation is 0.097 eV. The 303 fitting deviations are then taken as the 303 3-body interaction energies. For a C_{56}B_{4} heterofullerene, the total energy has contributions from 4 single dopants, 6 pair interactions and 4 triplet interactions. For example, the subclusters from the expansion of the C_{56}B_{4} isomer denoted by (1, 7, 11, 24) are listed in Fig. 2b. Analogously, we obtain the 4,190 4-body interactions when the energies of all C_{56}B_{4} isomers are fitted by E_{0}, 4E_{1} and the 2-body and 3-body interactions. The average fitting deviation is 0.065 eV, and the coefficients c_{1}, c_{2} and c_{3} are 0.966, 0.690 and 0.640, respectively.
As shown above, the coefficients reflect the boron concentration, and the fitting deviations are attributed to the multi-body interactions. The fitting coefficients and CV scores for C_{58}B_{2}, C_{57}B_{3} and C_{56}B_{4} are listed in Table 2. They show that introducing multi-body interactions improves the accuracy of the cluster expansion, and that the weights of the interactions decrease as the number of boron dopants increases. For example, c_{1} decreases from 0.980 to 0.966, and c_{2} from 0.782 to 0.690. Fig. 3 shows the statistical distributions of the 2-body, 3-body and 4-body interactions. Compared with the 3-body and 4-body interactions, the 2-body interactions are distributed over a wider range. The 4-body interactions resemble a normal distribution centered at zero. It can be inferred that the 2-body and 3-body interactions are much more important than the 4-body and higher multi-body interactions, so the fitting converges reasonably well even if only the 2-body and 3-body interactions are considered.
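The expansion of an isomer into its subclusters can be sketched as follows. This is a minimal illustration rather than the authors' code: E_{1}, c_{1} and c_{2} are the values quoted in the text for C_{57}B_{3}, whereas the pair energies in `pair_energy` are hypothetical placeholders. For brevity the lookup is keyed by the initial sub-SI; a full implementation would first reduce every subcluster to its smallest equivalent SI with the recognition method.

```python
from itertools import combinations

def subclusters(si):
    """All proper, non-empty subclusters of a structural index, grouped by size."""
    return {k: list(combinations(si, k)) for k in range(1, len(si))}

def predicted_relative_energy(si, e1, coeffs, interaction):
    """E_hat - E_0 = n*c_1*E_1 + sum over k >= 2 of c_k * sum_alpha E_k^alpha."""
    subs = subclusters(si)
    energy = len(si) * coeffs[1] * e1
    for k in range(2, len(si)):
        energy += coeffs[k] * sum(interaction[sub] for sub in subs[k])
    return energy

# Values quoted in the text for C57B3.
E1 = 3.39
coeffs = {1: 0.972, 2: 0.782}
# Hypothetical pair interaction energies (eV), keyed by the initial sub-SI.
pair_energy = {(1, 7): -0.20, (1, 11): -0.20, (7, 11): 0.10}

energy = predicted_relative_energy((1, 7, 11), E1, coeffs, pair_energy)
sizes = [len(v) for v in subclusters((1, 7, 11, 24)).values()]
print(subclusters((1, 7, 11)))
print(f"{energy:.5f} eV", sizes)
```

The isomer (1, 7, 11) expands into 3 singles and 3 pairs, and (1, 7, 11, 24) into 4 singles, 6 pairs and 4 triplets, matching the counts given in the text.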
Herein, we propose an extended cluster expansion for the C_{60−n}B_{n} heterofullerenes, in which the energy of the isomer denoted by \(\,\mathop{\sigma }\limits^{\rightharpoonup }\) is expressed as
\[{E}_{DFT}(\mathop{\sigma }\limits^{\rightharpoonup })=\hat{E}(\mathop{\sigma }\limits^{\rightharpoonup })+{E}_{n} \tag{3}\]
where \(\,{E}_{DFT}(\mathop{\sigma }\limits^{\rightharpoonup })\) and \(\hat{E}(\mathop{\sigma }\limits^{\rightharpoonup })\) are the DFT-calculated energy and the predicted energy, respectively, and E_{n} denotes the fitting deviation, which is attributed to the n-body interaction of the boron cluster denoted by \(\,\mathop{\sigma }\limits^{\rightharpoonup }\). The predicted energy is
\[\hat{E}(\mathop{\sigma }\limits^{\rightharpoonup })={E}_{0}+\sum _{i=1}^{n-1}{c}_{i}{E}_{i} \tag{4}\]
where the summation runs over all possible sizes of the subclusters of \(\,\mathop{\sigma }\limits^{\rightharpoonup }\). The first term E_{0} represents the energy of C_{60}, and E_{i} denotes the total effective energy contribution from all the clusters with i heteroatoms,
\[{E}_{i}=\sum _{\alpha }{E}_{i}^{\alpha } \tag{5}\]
where the summation is over all the subclusters consisting of i heteroatoms, i.e. \(1\le \alpha \le {C}_{n}^{i}\). In contrast to the conventional CE, the multi-body interaction energies E_{i}, apart from E_{n}, are multiplied by different combination coefficients c_{i} before they contribute to the total energy of the C_{60−n}B_{n} heterofullerene cages. The coefficients are obtained by fitting the DFT-calculated energies of selected C_{60−n}B_{n} heterofullerenes with the E_{i} for i < n. To balance accuracy and computational cost, we set 4 as the maximum value of the summation index in Eq. (4) for the C_{60−n}B_{n} cages with n ≥ 5.
The flow chart of our method is shown in Fig. 4. The search for the ground state structure of the C_{55}B_{5} cage proceeds as follows.
(1) Generate the SIs of all the C_{55}B_{5} isomers (45,718 in all), list all the subclusters of these SIs and calculate the total energies of the isomers by Eq. (3), where the combination coefficients for the single, pair and triplet interactions are taken from the fit to the energies of C_{56}B_{4}, and the coefficient for the quadruplet interaction is set to 1.
(2) Choose the 100 structures with the lowest predicted energies and calculate their total energies (saved in \({E}_{i}^{DFT}\)) using first-principles calculations.
(3) Refit the coefficients to the total energies from the first-principles calculations, and use the updated coefficients to recalculate the total energies of all the isomers by Eq. (3).
(4) Excluding those selected before, select the 100 structures with the lowest predicted energies and calculate their total energies (appended to \({E}_{i}^{DFT}\)) using DFT.
(5) Fit the energies of all the structures selected so far by Eq. (3), and update the corresponding coefficients.
(6) Use the coefficients to calculate the total energies of all the isomers.
(7) Check whether the latest DFT calculations have produced new structures whose energies are among the 100 lowest in \({E}_{i}^{DFT}\). If so, repeat steps 4 to 7; otherwise, stop.
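The iterative procedure in steps (1) to (7) can be sketched as the loop below. It is a schematic illustration under stated assumptions, not the authors' implementation: `features[s]` stands for the summed subcluster energies (E_{1}, E_{2}, ...) of candidate s, `toy_dft` is a synthetic stand-in for the first-principles calculation, all energies are taken relative to E_{0}, and the candidate set, batch size and noise level are arbitrary. The function names are hypothetical.

```python
import random

def fit_coefficients(rows, targets):
    """Least-squares fit via the normal equations (adequate for a few coefficients)."""
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

def search_ground_state(features, dft_energy, coeffs, batch=100, max_iter=50):
    computed = {}  # structure id -> DFT energy (the list E_i^DFT of the text)
    for _ in range(max_iter):
        def predict(s):
            return sum(c * x for c, x in zip(coeffs, features[s]))
        # Steps 2 and 4: lowest predicted energies among structures not yet computed.
        pool = sorted((s for s in range(len(features)) if s not in computed),
                      key=predict)
        new = pool[:batch]
        if not new:
            break
        for s in new:
            computed[s] = dft_energy(s)
        # Steps 3 and 5: refit the coefficients on everything computed so far.
        ids = list(computed)
        coeffs = fit_coefficients([features[s] for s in ids],
                                  [computed[s] for s in ids])
        # Step 7: stop once the latest batch adds nothing to the lowest `batch` energies.
        lowest = set(sorted(computed, key=computed.get)[:batch])
        if not lowest.intersection(new):
            break
    return min(computed, key=computed.get), computed, coeffs

# Synthetic test problem (all values hypothetical).
rng = random.Random(0)
features = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(400)]
true_c = [1.0, 0.6, 0.4]
noise = [rng.gauss(0, 0.01) for _ in range(400)]

def toy_dft(s):
    return sum(c * x for c, x in zip(true_c, features[s])) + noise[s]

best, computed, coeffs = search_ground_state(features, toy_dft, [1.0, 1.0, 1.0], batch=20)
print(best, len(computed), [round(c, 3) for c in coeffs])
```

The design choice worth noting is that the expensive energy function is only ever called on the current lowest-ranked candidates, so the number of calls grows in steps of the batch size until a new batch no longer changes the set of lowest computed energies.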
The fitting reaches a reasonable convergence after several hundred structures with the lowest predicted energies have been calculated. Supplementary Tables S3 and S4 show the variation of the coefficients for C_{55}B_{5} and C_{54}B_{6}, respectively, as a function of the number of isomers calculated by the first-principles method. Note that the coefficients c_{2} and c_{3} are around 0.6 and 0.4, respectively, while the coefficient c_{4} approaches zero. As in the conventional CE, this reflects that the energy of interatomic bonds is usually dominated by short-range interactions^{39}. On the other hand, an enormous number of interactions would be introduced if C_{60−n}B_{n} with higher boron concentration were used for the cluster expansion, which would be computationally expensive. Note that any other binary system can be searched similarly by the extended cluster expansion, provided that the cutoff on the subcluster size is chosen carefully to balance accuracy and computational cost. The nomenclature and the extended cluster expansion can also be applied to ternary systems, where different atoms are distinguished starting from the candidates found in the binary systems.
Application to C_{55}B_{5} and C_{54}B_{6}
According to the structure recognition, there are 45,718 and 418,470 inequivalent structures for C_{55}B_{5} and C_{54}B_{6}, respectively. Following the steps described above, we have predicted the ground state structures of C_{55}B_{5} and C_{54}B_{6}; the energy profiles are shown in Fig. 5 (detailed in Supplementary Table S5). The optimized fitting coefficients of C_{55}B_{5} are adopted as the initial combination coefficients for C_{54}B_{6}. As the fitting proceeds, most of the energies of the isomers newly added in each iteration gradually increase. However, the 100^{th} lowest energy decreases rapidly and eventually converges after 6 fitting iterations for C_{55}B_{5} and 8 for C_{54}B_{6}. The lowest-energy isomers of both C_{55}B_{5} and C_{54}B_{6} emerge in the first iteration. The optimized fitting coefficients were obtained after the energies of the selected 600 (C_{55}B_{5}) and 800 (C_{54}B_{6}) isomers had been calculated and fitted; the results are listed in Table 2. The CV score of the final fit is 0.064 eV for C_{55}B_{5} and 0.124 eV for C_{54}B_{6}, and the largest deviation in total energy between the ExCE method and the DFT calculations is 0.192 eV and 0.403 eV, respectively, indicating that the fitted energies are in good agreement with the DFT calculations. For both C_{55}B_{5} and C_{54}B_{6}, the coefficient c_{1} is close to 1, implying that the single boron atom makes an important contribution regardless of the concentration. The coefficient c_{4} is much smaller, so the quadruplet interactions play only a minor role in the ExCE calculations: the fit is already accurate when the pair and triplet interactions are considered in the energy predictions for C_{60−n}B_{n} with n ≥ 5. This is consistent with the above assumption that the energy of interatomic bonds is usually dominated by short-range interactions.
For C_{55}B_{5}, the ExCE energy versus the DFT-calculated energy is shown in Fig. 6a. The putative ground state is (1, 7, 11, 24, 27). The five heteroatoms are located at opposite sites of 5 hexagon rings and form a pentagon that encloses a pentagon ring of carbon atoms, following the same pattern as the ground states of C_{60−n}B_{n} for 2 ≤ n ≤ 4. The next preferred positions for the boron atoms are (1, 7, 11, 32, 35), with a total energy 0.32 eV higher. Ref.^{23} reported that the lowest-energy structure of C_{55}B_{5} was (1, 7, 18, 51, 59). In contrast, we find that this structure ranks 253^{rd} in stability and is 0.68 eV higher in energy than our lowest-energy structure.
For C_{54}B_{6}, the putative lowest-energy structure is (1, 6, 11, 18, 24, 27), shown in Fig. 6b. In this isomer, 4 boron atoms are at consecutive opposite sites and the other two are at isolated opposite sites. The next most favorable configuration is (1, 7, 11, 16, 24, 36). Chen et al.^{21} reported that the global minimum structure was (1, 6, 9, 12, 15, 18), but in our results this structure is 0.34 eV higher in energy than our minimum and ranks 100^{th} in our ascending list of total energies. Garg et al.^{23} predicted that the lowest-energy cage of C_{54}B_{6} was (1, 6, 11, 18, 27, 31); in our calculation this structure ranks 340^{th} in stability and is less stable by 0.52 eV than our lowest-energy structure.
Summary
We have developed a nomenclature to enhance structural recognition and adopted an extended cluster expansion to describe the structural stabilities, which is in good agreement with the results of first-principles calculations. Unlike the conventional cluster expansion, the interaction parameters are derived from the enumeration of C_{60−n}B_{n} (n = 1–4), and only 4 coefficients need to be fitted to account for the composition. Notably, we have found stable isomers of C_{57}B_{3}, C_{55}B_{5} and C_{54}B_{6} that are at least 0.3 eV lower in energy than the previously reported ones. With the symmetry operation matrices, the nomenclature can be applied to other binary or ternary systems, whose ground state structures can then be searched with the extended cluster expansion. Thus, our approach offers an effective complement to first-principles calculations in materials science.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Kroto, H. W., Heath, J. R., O’Brien, S. C., Curl, R. F. & Smalley, R. E. C_{60}: Buckminsterfullerene. Nature 318, 162–163 (1985).
2. Joachim, C., Gimzewski, J. K., Schlittler, R. R. & Chavy, C. Electronic Transparence of a Single C_{60} Molecule. Physical Review Letters 74, 2102–2105 (1995).
3. Sakurai, T. et al. Scanning tunneling microscopy study of fullerenes. Progress in Surface Science 51, 263–408 (1996).
4. Haddock, J. N., Zhang, X., Domercq, B. & Kippelen, B. Fullerene based n-type organic thin-film transistors. Organic Electronics 6, 182–187 (2005).
5. Wöbkenberg, P. H. et al. High mobility n-channel organic field-effect transistors based on soluble C_{60} and C_{70} fullerene derivatives. Synthetic Metals 158, 468–472 (2008).
6. Yakuphanoglu, F. Electrical conductivity, optical and metal–semiconductor contact properties of organic semiconductor based on MEH-PPV/fullerene blend. Journal of Physics and Chemistry of Solids 69, 949–954 (2008).
7. Moriarty, P. J. Fullerene adsorption on semiconductor surfaces. Surface Science Reports 65, 175–227 (2010).
8. Ray, C. et al. Synthesis and Structure of Silicon-doped Heterofullerenes. Physical Review Letters 80, 5365–5368 (1998).
9. Hirsch, A. & Nuber, B. Nitrogen Heterofullerenes. Accounts of Chemical Research 32, 795–804 (1999).
10. Forró, L. & Mihály, L. Electronic properties of doped fullerenes. Reports on Progress in Physics 64, 649 (2001).
11. Pichler, T. et al. On-ball doping of fullerenes: The electronic structure of C_{59}N dimers from experiment and theory. Physical Review Letters 78, 4249–4252 (1997).
12. Hummelen, J. C., Knight, B., Pavlovich, J., Gonzalez, R. & Wudl, F. Isolation of the heterofullerene C_{59}N as its dimer (C_{59}N)_{2}. Science 269, 1554–1556 (1995).
13. Hultman, L. et al. Cross-linked nano-onions of carbon nitride in the solid phase: Existence of a novel C_{48}N_{12} aza-fullerene. Physical Review Letters 87, 225503 (2001).
14. Schultz, D., Droppa, R., Alvarez, F. & dos Santos, M. C. Stability of small carbon-nitride heterofullerenes. Physical Review Letters 90 (2003).
15. Guo, T., Jin, C. & Smalley, R. E. Doping bucky: formation and properties of boron-doped buckminsterfullerene. The Journal of Physical Chemistry 95, 4948–4950 (1991).
16. Yu, R. et al. Simultaneous Synthesis of Carbon Nanotubes and Nitrogen-Doped Fullerenes in Nitrogen Atmosphere. The Journal of Physical Chemistry 99, 1818–1819 (1995).
17. Butcher, M. J. et al. C_{59}N monomers: Stabilization through immobilization. Physical Review Letters 83, 3478–3481 (1999).
18. Jin, Z. et al. Single C_{59}N molecule as a molecular rectifier. Physical Review Letters 95, 045502/045501–045504 (2005).
19. Kurita, N., Kobayashi, K., Kumahora, H., Tago, K. & Ozawa, K. Molecular structures, binding energies and electronic properties of dopyballs C_{59}X (X = B, N and S). Chemical Physics Letters 198, 95–99 (1992).
20. Miyamoto, Y., Hamada, N., Oshiyama, A. & Saito, S. Electronic structures of solid BC_{59}. Physical Review B 46, 1749–1753 (1992).
21. Chen, Z. F., Zhao, X. Z. & Tang, A. C. Theoretical studies of the substitution patterns in heterofullerenes C_{60−x}N_{x} and C_{60−x}B_{x} (x = 2–8). Journal of Physical Chemistry A 103, 10961–10968 (1999).
22. Xie, R. H. et al. Tailorable acceptor C_{60−n}B_{n} and donor C_{60−m}N_{m} pairs for molecular electronics. Physical Review Letters 90 (2003).
23. Garg, I., Sharma, H., Dharamvir, K. & Jindal, V. K. Substitutional Patterns in Boron Doped Heterofullerenes C_{60−n}B_{n} (n = 1–12). Journal of Computational and Theoretical Nanoscience 8, 642–655 (2011).
24. Fujita, S. Soccerane Derivatives of Given Symmetries. Bulletin of the Chemical Society of Japan 64, 3215–3223 (1991).
25. Cozzi, F., Powell, W. H. & Thilgen, C. Numbering of fullerenes (IUPAC Recommendations 2005). Pure and Applied Chemistry 77, 843–923 (2005).
26. Hedberg, K. et al. Bond lengths in free molecules of buckminsterfullerene, C_{60}, from gas-phase electron diffraction. Science 254, 410–412 (1991).
27. Qi, J., Zhu, H., Zheng, M. & Hu, X. Theoretical studies on characterization of heterofullerene C_{58}B_{2} isomers by X-ray spectroscopy. RSC Advances 6, 96752–96761 (2016).
28. Kroto, H. W. The stability of the fullerenes C_{n}, with n = 24, 28, 32, 36, 50, 60 and 70. Nature 329, 529–531 (1987).
29. Thilgen, C. & Diederich, F. Structural Aspects of Fullerene Chemistry: A Journey through Fullerene Chirality. Chemical Reviews 106, 5049–5135 (2006).
30. Balasubramanian, K. Enumeration of chiral and positional isomers of substituted fullerene cages (C_{20}–C_{70}). The Journal of Physical Chemistry 97, 6990–6998 (1993).
31. Babic, D., Doslic, T., Klein, D. J. & Misra, A. Kekulenoid addition patterns for fullerenes and some lower homologs. Bulletin of the Chemical Society of Japan 77, 2003–2010 (2004).
32. Li, X.-T., Yang, X.-B. & Zhao, Y.-J. Geometrical eigen-subspace framework based molecular conformation representation for efficient structure recognition and comparison. The Journal of Chemical Physics 146, 154108 (2017).
33. Sanchez, J. M., Ducastelle, F. & Gratias, D. Generalized cluster description of multicomponent systems. Physica A 128a, 334–350 (1984).
34. Sluiter, M. H. F. & Kawazoe, Y. Cluster expansion method for adsorption: Application to hydrogen chemisorption on graphene. Physical Review B 68 (2003).
35. Hart, G. L. W., Blum, V., Walorski, M. J. & Zunger, A. Evolutionary approach for determining first-principles Hamiltonians. Nature Materials 4, 391–394 (2005).
36. Seko, A., Yuge, K., Oba, F., Kuwabara, A. & Tanaka, I. Prediction of ground-state structures and order-disorder phase transitions in II-III spinel oxides: A combined cluster-expansion method and first-principles study. Physical Review B 73 (2006).
37. Muzyk, M., Nguyen-Manh, D., Kurzydlowski, K. J., Baluc, N. L. & Dudarev, S. L. Phase stability, point defects, and elastic properties of W-V and W-Ta alloys. Physical Review B 84 (2011).
38. Nahas, S., Ghosh, B., Bhowmick, S. & Agarwal, A. First-principles cluster expansion study of functionalization of black phosphorene via fluorination and oxidation. Physical Review B 93 (2016).
39. van de Walle, A. & Ceder, G. Automating first-principles phase diagram calculations. Journal of Phase Equilibria 23, 348 (2002).
Acknowledgements
This work was supported by NSFC (Grant Nos. 11474100 and 11574088), Guangdong Natural Science Funds for Distinguished Young Scholars (Grant No. 2014A030306024) and Guangdong Natural Science Funds (Grant No. 2017A030310086). Computer time at the National Supercomputing Center in Guangzhou (NSCCGZ) is gratefully acknowledged.
Author information
Affiliations
Department of Physics, South China University of Technology, Guangzhou, 510640, People’s Republic of China
Yun-Hua Cheng, Ji-Hai Liao, Yu-Jun Zhao & Xiao-Bao Yang
Key Laboratory of Advanced Energy Storage Materials of Guangdong Province, South China University of Technology, Guangzhou, 510640, P. R. China
Yu-Jun Zhao & Xiao-Bao Yang
Contributions
Y.C. and X.Y. designed the research. Y.C. did the computation. Y.C., J.L. and X.Y. analyzed the data and discussed the results. Y.C., Y.Z. and X.Y. wrote the manuscript. All the authors have reviewed and finalized the manuscript.
Competing Interests
The authors declare that they have no competing interests.
Corresponding author
Correspondence to Xiao-Bao Yang.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.