## Introduction

The unique properties of graphene have the potential to support the development of the next generation of composites, electronics, membranes, energy storage devices, drug delivery systems, implantable materials, etc.1,2,3,4. An exclusive graphene industry was already emerging even before the standards for quality and toxicity control had been established. A previous study has shown that most of the graphene products available on the market are poorly characterized ‘black powders’ that do not adhere to appropriate standards5. Even more pressing is the question of the toxicity of the globally available graphene, especially when graphene-based products are considered for medical and consumer use.

The current literature presents conflicting results regarding the cytocompatibility of graphene. “The dose makes the poison”, and the physicochemical characteristics of the material can play critical roles in cytocompatibility6. Contamination can occur at many different stages of graphene production due to the diversity of precursors and manufacturing processes7,8,9,10. Several methods have been developed to produce graphene, and some of these involve the use of chemicals that are not biocompatible and can cause adverse health effects10,11,12,13,14. Moreover, the graphene obtained industrially is multi-layered with a broad distribution of dimensions, which can also impact its toxicity10,12,15,16,17. Finally, harsh conditions imposed on the graphitic precursors during the industrial process, the presence of intermetallic impurities, and certain amounts of permanent structural defects in the hexagonal carbon framework can also impact graphene’s toxicity8,9,18,19. The graphene sold worldwide spans a wide range of particle sizes, numbers of layers, and types of structural defects, and is contaminated with extrinsic material5,10,14,20. Figure 1a illustrates some of these aspects of graphene in a simplified manner. These can confer either cytotoxicity or cytocompatibility depending on the mode of use and the dosage21,22,23,24,25,26,27,28. Thus, there is a growing understanding that the toxic character of each graphene product must be evaluated individually.

We investigated the toxicity of graphene products available commercially and correlated their cytotoxicity to the material’s physical and chemical characteristics. We examined 36 commercially available graphene products and found a surprisingly large variability in cytotoxicity, from harmless to highly toxic. The graphene products were investigated as received from suppliers, without any additional processing or modifications that could skew in vitro toxicity results. To understand these results, we have performed a detailed characterization of these materials. Our study suggests that over 35% of the graphene products contain highly defective graphene; we observed the presence of organic and/or inorganic impurities in almost all the samples. Our results prompt the conclusion that a product’s cytotoxicity cannot be related to any characteristic of graphene alone, but it is determined by the contaminants present.

## Results

### Nominal graphene materials (NGMs) characterization

We refer to the products in this study as nominal graphene materials (NGMs), as they represent a heterogeneous class of materials with different chemical components and different physicochemical properties of the graphitic portion. The obtained NGMs consist of graphitic materials with different macroscopic appearances: some are dense and grained with a metallic aspect, others are extremely soft, voluminous, and deep black (Supplementary Fig. 1), and the rest fall between these extremes. This already indicates that their physicochemical properties differ strongly, as we outline in greater detail in the following sections.

### Flake thickness, graphene fraction, and lateral size

Using optical microscopy, Raman spectroscopy, and atomic force microscopy (AFM), we found that the NGMs consist of few- and multi-layered materials (Figs. 1d, 2, Supplementary Fig. 2a). In 2017, the ISO defined the concepts of bilayer and few-layer graphene (FLG, with three to 10 layers), in accordance with the definitions detailed by the Editorial Board of the journal Carbon and by Wick et al.29,30,31.

We define N50 and N90 as the 50th percentile (median value) and 90th percentile value of the distribution of the number of graphene layers of each material. Samples with N50 ≤ 10 layers occurred, whereas the average N50 value was 23. None of the samples had N90 below 10 layers. The lowest value of N90 (13 layers) was found for two NGMs, and the next best N90 was equal to 31. Thus, we defined the graphene fraction (rG) as the fraction of flakes with ≤ 10 layers, satisfying the definition of FLG. Most of the samples had rG < 10%, and only one sample had rG of ~50%. Similarly, the flake sizes were quantified by L50 and L90, the median and 90th percentile values of the lateral size distribution. Most of the samples had an L50 between 0.5 and 15 µm (overall median value of 1.1 µm) and an L90 between 1.5 µm and 100 µm (overall median of 5.4 µm).
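The percentile descriptors and the graphene fraction can be illustrated with a minimal sketch. The layer counts below are hypothetical and purely illustrative, the linear-interpolation percentile is one common convention that the text does not specify, and the function names are our own.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between neighboring order statistics (an assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def graphene_fraction(layer_counts, max_layers=10):
    """Fraction of flakes with at most `max_layers` layers (FLG criterion)."""
    return sum(1 for n in layer_counts if n <= max_layers) / len(layer_counts)


# Hypothetical layer counts of ten flakes from one sample (not measured data).
layers = [4, 8, 12, 15, 20, 23, 30, 41, 55, 80]

n50 = percentile(layers, 50)      # median number of layers
n90 = percentile(layers, 90)      # 90th percentile of the number of layers
r_g = graphene_fraction(layers)   # share of flakes with <= 10 layers

print(f"N50 = {n50}, N90 = {n90:.1f}, rG = {r_g:.0%}")
```

The same `percentile` helper applies unchanged to lateral sizes to obtain L50 and L90.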

The three-dimensional representation of the physicochemical properties defining the quality of the graphene according to Wick et al. is presented in Fig. 2.

### Defect density, carbon content, volatile, and sp2 fractions

The quality of NGMs was analyzed by evaluating the intensity ratio of the D and G modes (ID/IG) of recorded Raman spectra, and the density of defects (LD) with respect to carbon atoms was calculated using a formula published by Lucchese, Cançado et al. (Supplementary Fig. 3, Supplementary Table 1)32. We separated all materials in the survey into two groups: intact (NGMint) and defect-dominated (NGMdef); the threshold was established as 1.3% defects (ID/IG ~ 1.3, and average length between two defects, LD ~ 1.42 nm). For higher densities, the ID/IG becomes less precise, with the ratio approaching values of ~1.0, and densities of defects cannot be reliably calculated anymore in this defect-dominated regime33. Overall, 23 materials were almost intact, bore a low density of defects, and were consequently categorized as NGMint, while the other 13 samples had a significantly disturbed carbon lattice and were thus labeled as NGMdef. From the latter group, three samples presented Raman spectra resembling amorphous carbon. Since defects in the hexagonal carbon structure can be not only permanent in-plane lattice defects such as vacancies, grain boundaries, and other non-hexagonal ring structures, but also sp3-hybridized carbon with functional groups (cf. graphene oxide), we evaluate both groups separately in the following analysis.

Carbon content (wt%(C)) was determined via combustion elemental analysis (Supplementary Fig. 2b). Ideally, graphene should consist of pure carbon, but herein, NGMs presented wt%(C) varying widely between 63 and 98 wt%. From the 13 samples that presented wt%(C) > 92 wt%, only one was NGMdef. Notably, half of the 14 samples with wt%(C) ranging from 79 to 90 wt% were NGMdef. Finally, the five materials with the lowest wt%(C), ranging from 63 to 73 wt%, were all in the NGMdef group. Based on this outcome, structural integrity and long-range order, as determined by Raman spectroscopy, do not necessarily imply a high detected carbon content in the NGM and, consequently, high purity.

We characterized the thermal decomposition of NGMs using thermogravimetric analysis-mass spectrometry (TGA-MS; 100 °C to 800 °C) to deduce the part of the non-carbon content that corresponds to functional groups, adsorbents, surfactants, and other contaminants (Supplementary Fig. 4a–d). Herein, 80% of the samples had ΔTG less than 12 wt%. The highest ΔTG values (9.1–31.1 wt%) were observed for the NGMdef group. The strong mass decay with increasing temperature may be related to thermal degradation at the large defect sites, as observed in humic substances and activated carbon, but is unlikely to originate from graphene oxide-like materials, as the obligatory mass fragments with m/z 17, 18, 28, 44, 64 do not match the typical pattern or are negligible to absent (Supplementary Fig. 4e, Supplementary Table 1)34. Interestingly, some NGMint had ΔTG = 8.8 wt%, while some NGMdef had negligible mass loss (ΔTG = 0.9 wt%). This is noteworthy, as high mass loss is typically associated with the cleavage of functional groups, which can easily be detected by Raman spectroscopy, while an ideal graphene material without any defects should not experience any thermal decomposition. Furthermore, all NGMs presented a noticeable amount of fragment ions with m/z ratios between 15 and 78 (Supplementary Table 1), indicating that all the NGMs contain contaminants that are not present in the pristine graphite used as control (Supplementary Fig. 4f).

We identified that a certain amount of non-carbon material (cf. results from EA) failed to evaporate when heated up to 800 °C in TGA, which we will refer to as persistent inorganics (PIs). The precise proportion of carbon to inorganics in the volatile component ΔTG is unknown; hence, we defined a lower bound for the case in which all the volatiles ΔTG are inorganics and an upper bound for the case in which all volatiles are carbon.
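The two bounds follow directly from the expressions given in the Methods, PIlow = 100% – wt%(C) – ΔTG and PIup = 100% – wt%(C). The sketch below evaluates them for illustrative inputs; the function name and the clamping of the lower bound at zero are our own assumptions and are not part of the published definition.

```python
def persistent_inorganics_bounds(carbon_wt, delta_tg):
    """Return (PI_low, PI_up) in wt%.

    carbon_wt: carbon content wt%(C) from elemental analysis.
    delta_tg:  volatile mass loss from TGA in wt%.
    PI_up  = 100 - wt%(C)          (all volatiles are carbon)
    PI_low = 100 - wt%(C) - dTG    (all volatiles are inorganics)
    """
    if not 0.0 <= carbon_wt <= 100.0 or delta_tg < 0.0:
        raise ValueError("inputs must be valid weight percentages")
    pi_up = 100.0 - carbon_wt
    # Assumption: a negative lower bound is reported as zero.
    pi_low = max(0.0, pi_up - delta_tg)
    return pi_low, pi_up


# Illustrative inputs: wt%(C) = 78 and dTG = 21 wt%.
pi_low, pi_up = persistent_inorganics_bounds(78.0, 21.0)
print(f"PI between {pi_low:.1f} and {pi_up:.1f} wt%")
```

For a high-purity sample with wt%(C) = 97 and ΔTG = 5 wt%, the same call returns a lower bound of zero and an upper bound of 3 wt%.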

The fraction of sp2-bonded carbon (r(sp2)) was obtained from the deconvoluted intensities of XPS C1s core spectra with Shirley background correction (Supplementary Fig. 5)35. For ideal graphene, the r(sp2) should be close to 100%, with only the edge carbon atoms and adventitious carbon slightly reducing the overall value, whereas other defects, functional groups, and organics reduce the overall r(sp2) significantly. XPS is a surface-sensitive technique in which the escape depth of the photoelectrons is limited to a few nanometers36; hence, it can detect contaminants and inclusions on flake surfaces regardless of flake thickness. The r(sp2) ranged from 8.2 to 86.5%, with a median of 38.9% and a mean value of 41.4%. Interestingly, nine out of the ten lowest r(sp2) values were NGMint, suggesting the presence of surface contaminants (e.g., adventitious carbon, solvent, or surfactant residues), as intact graphene should not have sp3-hybridized carbon.

Samples were washed to remove possible organic and inorganic adsorbents and inclusions via repetitive centrifugation using, in the listed order, 1 M HCl, ethanol, chloroform, ethanol, 0.1 M KOH, and finally six washes with DI water. Aside from the final washes with DI water, each solvent or solution was incubated with the NGM at 60 °C for 2 h. Finally, the materials were dried at 80 °C under reduced pressure for two days using a common drying oven and membrane pump.

### Toxicity profile

The cytotoxicity of the NGMs on the L929 cell line was determined using the following: (a) colorimetric measurement of mitochondrial dehydrogenase activity via the conversion of tetrazolium salt, providing information about the cell viability (WST-1), and (b) assessment of the lactate dehydrogenase (LDH) activity following plasma membrane rupture, signifying cytotoxicity. Generally, NGMs with a high cell viability (WST-1) have low cytotoxicity (LDH) and vice versa, but owing to the different mechanisms of cellular toxicity, the correlation is not always perfect. As the toxicity is concentration-dependent6, we tested the effect of different NGM concentrations. The results from the highest concentration (50 µg/ml) were discarded, as it promoted the eradication of cells. Herein, the trends for 10 µg/ml were analogous to 25 µg/ml (Supplementary Figs. 68; Supplementary Table 1). A correlation coefficient value (rs) < 0.3 indicates a poor correlation, while values of 0.3–0.5 and >0.5 indicate moderate and strong correlations, respectively. The statistical correlations are discussed in the following for all samples (combined, rscom) or for structurally intact (rsint) or defective (rsdef) materials separately, and a tabular overview of the correlation coefficients can be found in Table 1 and Supplementary Table 1.
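The correlation scale above can be made concrete with a short sketch. We assume here that rs denotes Spearman's rank correlation coefficient, which the notation suggests but the text does not state, and that the thresholds apply to the absolute value of rs. The helper names are hypothetical, and no p-value is computed.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def correlation_strength(rs):
    """Classify |rs| with the thresholds used in the text."""
    magnitude = abs(rs)
    if magnitude < 0.3:
        return "poor"
    if magnitude <= 0.5:
        return "moderate"
    return "strong"


# Hypothetical property and LDH readings for five samples (not measured data).
flake_size = [0.8, 1.1, 2.5, 4.0, 9.0]
ldh = [12.0, 30.0, 18.0, 45.0, 40.0]
rs = spearman_rs(flake_size, ldh)
print(f"rs = {rs:.3f} ({correlation_strength(rs)})")
```

Applied to the coefficients reported here, `correlation_strength(0.381)` yields "moderate" and `correlation_strength(0.007)` yields "poor".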

The statistical correlations for WST-1 and LDH with NGMs’ flake size (L50) and number of layers (N50) are indicated in Figs. 3, 4 and Table 1 (for the 90th-percentile values L90 and N90, see Supplementary Fig. 8). We observed a moderate positive correlation between the N50 and cell viability (WST-1 rscom = 0.381, p < 0.05), and a moderate positive correlation between L50 and cytotoxicity (LDH rscom = 0.405, p < 0.05). Generally, we have not observed higher toxicity for NGMdef samples as compared with NGMint (Figs. 3, 4). For NGMdef, the trends of reduced cell viability for larger flake diameter (WST-1 rsdef = −0.427, p > 0.05) and for low graphene content (WST-1 rsdef = −0.473, p > 0.05) were not statistically significant. Despite these trends, some NGMdef presented low toxicity while having large flakes or low graphene content (Figs. 1f, 3b, e, 4b, e). This suggests that the toxicity is not primarily determined by the density of lattice defects, flake thickness, or dimensions; rather, it is likely to be caused by the presence of contaminants.

Furthermore, no significant correlations were seen between properties such as specific surface area (SSA) or r(sp2) and the analyzed measures of cell toxicity (SSA: WST-1 rscom = 0.007, p > 0.05; LDH rscom = 0.110, p > 0.05; r(sp2): WST-1 rscom = 0.099, p > 0.05; LDH rscom = −0.134, p > 0.05). We also observed a moderate and significant correlation with the total carbon content (WST-1 rscom = 0.499, p < 0.001; LDH rscom = −0.374, p < 0.05), indicating that the non-carbon inclusions are frequent, but not the sole determinant of toxicity.

We observed no relevant correlation between the cytotoxicity and the physical or chemical properties of the graphene components of NGMs. These findings led us to conclude that the toxicity mostly arises from the different types of non-graphene contaminants in the samples. To corroborate this conclusion, we washed six representative NGMs, comprising both intact and defective samples with both high and low cytocompatibility, via consecutive washing with 1 M acid, ethanol, chloroform, 0.1 M lye, and finally DI water, at 60 °C throughout, to free the NGMs from potential impurities on the surface. Notably, there was no statistically significant difference in the cytotoxic character of highly cytocompatible graphene materials after washing (low cytotoxicity in Fig. 5, p > 0.05). However, there was a significant decrease in toxicity for most of the cytotoxic samples after the washing procedures (p < 0.05). Collectively, these data suggest that the toxicity was due to non-graphene contaminants, which may be present in the raw material or introduced during the production process (Fig. 5a, b). Additional optimization of the washing protocol is warranted to further improve the cytocompatible character of commercially available graphene materials.

### Principal component and hierarchical cluster analysis

We have established the toxicity profile (i.e., the cell viability and %cytotoxicity reported as WST-1 and LDH, respectively) of the NGMs using principal component analysis (PCA) and hierarchical cluster analysis (HCA). This was done to determine whether there is any relationship between the toxicity profiles of NGMint and NGMdef, independent of the properties analyzed. A PCA biplot (scatter and loading plot) was constructed, in which the scatter points are the principal component (PC) scores of each sample, and the arrows represent the loading of each variable (i.e., the cell viability and cytotoxicity). The analysis computed two significant components, PC1 and PC2, which account for the highest and second-highest proportions of variance in the dataset. In this analysis, the lower the PC1 score, the more cytotoxic the material, and vice versa. Clearly, PC1 predominantly contributes (86.83%) to the total variation of the cell viability and cytotoxicity (Fig. 6a). The scatter points in the PCA biplot are individual samples, which form three apparent clusters (block = NGMint, open = NGMdef). The blue cluster represents the most cytocompatible materials (high WST-1) and slightly overlaps with the red cluster; both are dominated by NGMint, indicating higher overall cell viability for the NGMint. The materials from the green cluster presented high toxicity, mainly induced by membrane damage (LDH release). Finally, an inspection of the loading plot clearly indicates that PC1 mainly describes the dimension correlated with WST-1 and LDH. As expected, these two vectors (WST-1 and LDH) point in opposite directions, revealing a negative correlation between them.
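For intuition, a PCA restricted to two standardized variables has a closed-form solution, which the sketch below implements. It assumes that WST-1 and LDH are z-scored before the analysis, which is not stated in the text; the data are hypothetical and the function names are our own.

```python
from math import sqrt


def zscore(values):
    """Center to zero mean and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pca_two_variables(wst1, ldh):
    """PCA of two standardized variables.

    The correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and
    1 - |r|, so PC1 explains (1 + |r|) / 2 of the total variance.
    Returns (r, pc1_share, pc1_loading, pc1_scores).
    """
    a, b = zscore(wst1), zscore(ldh)
    n = len(a)
    r = sum(p * q for p, q in zip(a, b)) / (n - 1)  # Pearson correlation
    pc1_share = (1.0 + abs(r)) / 2.0
    # PC1 direction: (1, 1)/sqrt(2) for r >= 0, (1, -1)/sqrt(2) for r < 0.
    sign = 1.0 if r >= 0 else -1.0
    loading = (1.0 / sqrt(2.0), sign / sqrt(2.0))
    scores = [loading[0] * p + loading[1] * q for p, q in zip(a, b)]
    return r, pc1_share, loading, scores


# Hypothetical viability and cytotoxicity readings (not measured data).
wst1 = [95.0, 88.0, 70.0, 52.0, 30.0, 15.0]
ldh = [5.0, 12.0, 20.0, 45.0, 60.0, 85.0]
r_demo, share_demo, loading_demo, scores_demo = pca_two_variables(wst1, ldh)
print(f"r = {r_demo:.3f}, PC1 explains {share_demo:.1%} of the variance")
```

With a negative correlation between the two variables, the loadings have opposite signs, so a low PC1 score corresponds to low viability and high LDH release. Under this two-variable, standardized setup only, a PC1 share of 86.83% would correspond to |r| of about 0.74.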

An unsupervised HCA was carried out, which further confirms the clusters formed in PCA. The 36 samples analyzed were classified into five groups (Fig. 6b). The largest cluster (in blue) includes 19 samples, of which 15 are NGMint. Samples in this cluster have the highest principal component scores (0–2), which indicate the highest cellular viability and lowest cytotoxicity. The second largest cluster (in red) consists of 10 samples, of which six are NGMdef. While this cluster has principal component scores comparable to the blue cluster (−1 to 1), it had lower cell proliferation and higher cytotoxicity and was grouped separately. The other clusters were cytotoxic, with principal component scores of −4 for NGMdef (orange) and ranging from −2 to −1 for NGMint (green and pink).
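The grouping step can be sketched with a minimal agglomerative clustering routine. The linkage criterion and distance metric used for Fig. 6b are not stated in the text, so average linkage with Euclidean distance is assumed here purely for illustration; the points are hypothetical (WST-1, LDH) pairs and the function name is our own.

```python
from math import dist


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    Closeness is the mean pairwise Euclidean distance between members
    (average linkage, an assumption). Returns lists of point indices.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[i] for i in range(len(points))]

    def average_distance(c1, c2):
        total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical standardized (WST-1, LDH) pairs for six samples.
samples = [(1.2, -1.0), (1.0, -0.9), (0.9, -1.1),
           (-1.1, 1.0), (-0.9, 1.2), (-1.0, 0.8)]
groups = agglomerative_clusters(samples, 2)
print(groups)
```

Cutting the merge sequence at a different `n_clusters` yields coarser or finer groupings of the same hierarchy.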

## Discussion

A precise characterization of graphene-based materials is necessary to anticipate their impact on safety, health, and environment37. We observed that the 36 commercially available NGMs were a diverse class of heterogeneous materials that can be considered either cytocompatible, inert or highly cytotoxic. We found that the carbon content of the sample varies from 63 to 98 wt%. More than a third of the samples contain highly defective graphene, which tends to have a lower carbon content.

Graphene family materials interact with mammalian cells via different mechanisms that may or may not result in toxic effects24,27,38,39,40. The flow of molecules and materials through the cell membrane is largely dependent on the material’s particle size, geometry, surface, and molecular chemistries. Particle size exerts a strong influence on the binding and activation of cellular membrane receptors41. Dissipative particle dynamics simulations suggest that the interaction of graphene nanosheets (side length of 3.5 nm) with a bilayer membrane depends on the material’s lateral size and can be divided into three stages: (i) the graphene nanosheet draws near to the membrane without a preferred orientation and is not captured by the cell, (ii) the nanosheet assumes an entry angle of ~47° while invading the membrane, and (iii) a sandwiched graphene-membrane superstructure is formed upon the rotation of the graphene nanosheet towards the center of the lipid bilayer38. Another graphene–cell mode of interaction is material invagination by the plasma membrane, as observed for protein-coated graphene oxide nanosheets that are engulfed via the formation of an intracellular vesicle40. Notably, the angles of interaction vary depending on the type of graphene. For instance, protein-coated graphene oxide nanosheets (equivalent diameter of 0.6 µm) adhere face to face, and not perpendicularly, to the cell membrane, allowing for the formation of an intracellular vesicle for engulfment40. Conversely, graphene micro-sheets (5–10 µm) can assume acute penetration angles that may pierce and damage cell membranes38,42. It needs to be highlighted that there is no universal size threshold that dictates the internalization and regulates the intracellular distribution of graphene family materials. Nonetheless, it seems that the continuum between small and large particles can change the mode of cellular uptake, from clathrin-mediated endocytosis to phagocytotic uptake, at least for protein-coated graphene oxide nanosheets40.

Likewise, the manner in which graphene family materials are made available in the microenvironment also influences the mode of interaction with mammalian cells. For instance, films produced with chemical-vapor-deposited graphene or graphene oxide substrates allow for effective attachment and proliferation of human neurons, cardiomyocytes, and different types of stem cells without apparent adverse effects on cellular and mitochondrial membranes or exacerbation of inflammatory markers22,24,25,27,43,44,45,46,47,48. On these solid substrates, cells take advantage of material surface features (e.g., wrinkles) to anchor themselves and develop focal adhesion spots27,49,50. In some cases, the chemical and surface features of graphene can even promote cell proliferation due to the higher interaction with secreted extracellular proteins46. It is also possible to modulate cell–substrate adhesion and neurogenic differentiation of human mesenchymal stem cells by changing the size and amount of defects at the domain boundaries of graphene films produced by chemical vapor deposition51. In summary, the interactions between cells and graphene family materials vary in response to the material’s shapes, sizes, chemistries, and modes of use (particle suspension vs. substrates), inducing and modulating several cell responses, including cell stability, survival, and toxicity.

It has been previously shown that several graphene characteristics (e.g., the presence of functional groups, charges, and sizes) may affect its cytocompatibility28, and our results are consistent with this. Moreover, we demonstrated that no single physical or chemical parameter can define NGMs’ cytotoxicity alone. This can be observed in Fig. 1f, wherein NGM-11 and NGM-33 present variable LDH release despite having a similar intact structure, high carbon content (≥97 wt%), and a low degree of volatile impurities (ΔTG < 5%). However, NGM-33 presents a lower fraction of sp2-bonded carbon (r(sp2)). Meanwhile, NGM-9, which also has a high graphitic content and low r(sp2), like NGM-33, is more cytocompatible. Strikingly, NGM-15 and NGM-24 differ in cytocompatibility despite both having a high density of defects and volatile impurities (ΔTG = 21%) and low carbon content (wt%(C) = 78%).

The cytotoxicity was not linked to any single parameter that we analyzed. Thus, we investigated whether the presence of impurities could contribute to the toxicity of commercially available NGMs. We focused on two broad types of impurities, namely persistent inorganics and adsorbents, which can be linked to the purity of the raw graphite and the efficiency of the purification processes performed during the production of NGMs.

First, we observed that the lower bound of PI exceeded 10% of the total sample weight in over 20% of NGMs tested. The PI may be intercalated and adsorbed impurities as well as chemical residues from the synthesis processes52,53,54. Notably, the trendline from Fig. 6c suggests that NGMs are more cytotoxic as the PI fraction increases (rs = 0.393; p < 0.05). The PI exemplifies the negative effects of production residues on the cytotoxicity of commercially available graphene materials.

We have also investigated the effects of impurities (e.g., adsorbents and surfactants) on NGMs’ toxicity. For this, we selected ten NGMs presenting a low fraction of volatiles (ΔTG < 5%), PI (<3.5%), and wt%(C) > 95% (Fig. 6d). In these samples, most of the anticipated impurities may be carbon-rich surfactants that adsorb on the graphene surface and would decrease the sp2/sp3 ratio (Supplementary Fig. 5). Notably, 9 out of the 10 samples with the highest sp3 bond fraction r(sp3) are NGMint, which supports the assumption that the NGMs with the most ideal composition rely heavily on surfactants in their method of synthesis10. This was supported by TGA-MS, as the identified cleaved fragments are typical for aromatic (e.g., m/z ~ 77) or aliphatic organic compounds (m/z ~15 for CH3, ~29 for C2H5), which should not be present in pristine graphene (Supplementary Table 1). There was no significant correlation between cytotoxicity (cell membrane rupture) and r(sp3), which is a measure of the density of graphene adsorbents (rs = 0.382; p > 0.05, Fig. 6d). Interestingly, when we compared the least and most toxic samples from this group of NGMint, the sp2 fraction was the most prominent factor inducing toxicity (Fig. 6e, f). Thus, these non-graphene additives and impurities (possibly from the raw materials or synthesis methods) are likely to increase cytotoxicity. Unfortunately, these compounds are highly specific to the synthesis processes, which are rarely disclosed by manufacturers.

There was a statistically significant moderate correlation between PIs and LDH (Fig. 6c). Notably, we have observed a decrease in toxicity after washing the samples to remove persistent inorganics and impurities (Fig. 5a, b). These findings reinforce the need for rigorous characterization; at this stage, post-processing/purification must be considered for graphene materials obtained from commercial parties prior to using them in research and development efforts, and to anticipate their impact on safety, health, and environment6,37.

Furthermore, the cytotoxicity induced by graphene and graphene family materials is concentration-dependent. A higher concentration of graphene in contact with cells for an extended period has been shown to release significantly higher levels of lactate dehydrogenase, indicating loss of plasma membrane integrity39. This is consistent with the results of our study, in which a higher level of LDH was associated with larger flake size (LDH rscom = 0.470, p < 0.05, Table 1).

The issue of graphene toxicity must be urgently addressed from two distinct perspectives: the harmonization of experimental protocols and consideration of application specificities. The plethora of testing concentrations used to assess graphene’s toxicity, from millimolar to g ml^−1 (refs. 24,27,28,38,39,40,42), renders it impossible to discern the true toxic character of graphene family materials. Herein, some of the materials tested at 10 µg/ml were deemed cytotoxic, whereas no apparent cytotoxic effects have been observed for other graphene family materials tested in other studies at the same concentration55. We have also observed that even samples categorized as “cytocompatible” were deemed excessively cytotoxic when tested at 50 µg/ml, resulting in very high LDH values and eradication of cells (data not shown). Similar detrimental effects at 50 µg/ml have been observed, such as increased degrees of LDH release by another type of mammalian cells55 and generation of reactive oxygen species (ROS) by microbial communities present in sludge samples from a wastewater treatment plant56. Some of these detrimental effects may be associated with graphene family materials altering cell function through interaction with organelles, which can lead to alteration or overproduction of ROS preceded by various factors, viz. altered mitochondrial membrane potential, localization of graphene nanoparticles on the F-actin filament, or activation of the nicotinamide adenine dinucleotide phosphate (NADPH) oxidase system57,58,59. In our study, we used the WST-1 reagent to study mitochondrial activity; the assay is based on the cleavage of tetrazolium salts by enzymes of viable mitochondria (mainly mitochondrial dehydrogenase) to form colored formazans (reported here as absorbance), which are indicative of mitochondrial and thus cellular viability. Notably, the carbon content (analyzed by EA in Fig. 4e) was positively correlated with the mitochondrial activity (WST-1 rscom = 0.499, p << 0.001, Table 1). Previous studies have shown that carbon nanoparticles can present positive effects on cell proliferation, viability, and adhesion when grafted over polyethylene polymers60,61. This is consistent with our findings, as a higher carbon percentage, regardless of the lattice arrangement, correlates significantly and positively with measures of cellular viability (Fig. 4).

The “risk/no-risk” or “toxic/non-toxic” archetypes are no longer acceptable if one does not put into context the intended use of graphene family materials6,62. For instance, graphene films coated on arbitrary materials have been repeatedly proven to be cytocompatible21,22,24,25,26,43,63. Nonetheless, those findings cannot be proxies to confirm the cytocompatibility of graphene family materials dispersed in suspensions6,55,56. Thus, results from a specific experimental set-up cannot be applied to all potential graphene applications (e.g., coating vs. drug carrier/delivery systems). Likewise, the experimental set-ups also have intrinsic limitations. Herein, we have followed the ISO guidelines that recommend established cell lines from recognized repositories for cytotoxicity tests29. However, immortalized cell lines may present chromosomal instability and metabolic abnormalities due to their derivation. Alternatively, primary and stem cells have brief lifespans, can lose specific markers over a short time in culture, and present varied genomic profiles depending on the donor, all of which can influence the cell’s biological response to materials64. Herein, the trend observed in the cytotoxic character of selected NGMs characterized with an immortalized cell line (L929) was similar to that observed with human mesenchymal stem cells, despite the numerical variations observed for the absorbance values (Supplementary Fig. 10). These data therefore suggest that cytotoxicity is not only material-specific, but that NGMs may also present varying toxicity levels depending on the type of cells and tissues analyzed (as previously shown6,65,66).

We also need to highlight that the characterizations and assays performed in our study are non-exhaustive and overlook certain properties, like surface charge and graphene shape (sharpness), that can also affect cell membrane integrity and the overall toxicity profile42,67. We have also observed that safety datasheets provided by graphene suppliers are either unavailable or inadequate, reflecting the urgent need for the establishment of standards for quality assurance and the communication of technical information for commercial graphene products. Hence, it is critical to establish the basic testing standards to which a graphene sample needs to be subjected to characterize its safety and biocompatibility. The fast-paced scientific breakthroughs and anticipation of graphene products with enhanced capabilities have great potential to transform many industries. However, not all institutes and enterprises involved in graphene research and development are fully equipped to independently produce and characterize high-quality graphene materials. Hence, using materials that are commercially available becomes a means to advance research efforts. This scenario is promoting the emergence of a new and much-needed type of industry that aims to make graphene materials readily available worldwide. The use of commercially available samples has allowed advancing research efforts, but the lack of international standards and specifications (viz. assessing cytocompatibility) could precipitate severe reluctance and delays in the implementation of new technologies.

Our work showed that more than 35% of commercially available graphene materials were defective graphene layers, as determined by Raman and XPS analysis, and that no single property or characteristic can provide enough information about the material’s cytotoxic profile. Moreover, some of the marketed NGMs showed persistent impurities that contributed to higher cytotoxicity. In fact, further purification reduced the %cytotoxicity by at least 30% in three out of four highly cytotoxic NGMs tested. Hence, the structure, purity, and toxicity of commercially available NGMs must be independently characterized by customers (researchers). Finally, the high versatility of NGMs and their diverse potential applications demand that the biological outcomes be interpreted in the light of the intended final graphene-based/-containing product.

## Methods

### Sample preparation and characterization

We obtained 36 commercially available graphene samples from graphene producers located in the United States, Canada, India, China, Malaysia, Norway, Italy, and England. Graphene samples in powder form were used in the experiments as received, while graphene dispersions were dried to obtain the powder5.

The materials were characterized using the following techniques: CHNS combustion elemental analysis (EA), thermogravimetric analysis coupled with mass spectrometry (TGA-MS), Raman spectroscopy, X-ray photoelectron spectroscopy (XPS), light microscopy, and atomic force microscopy (AFM). The specific surface area (SSA) was determined according to Brunauer, Emmett, and Teller (BET). Bulk samples were used for TGA-MS and EA, while for XPS, Raman, BET, and AFM, dispersions of 1 mg/ml were prepared in water, isopropanol, or N-methyl-2-pyrrolidone and subsequently spin-coated onto a 300 nm SiO2/Si wafer (SPS-Europe, 5000 RPM, 90 s).

Carbon content was quantified using carbon, hydrogen, and nitrogen elemental analysis (Elementar Vario MICRO Cube, Germany). A thermogravimetric analyzer coupled with a mass spectrometer (STA 449 F3 Jupiter coupled to an Aeolos III mass spectrometer, Netzsch, Germany) was used to characterize the degree of functionalization and contamination in 10 mg of each graphene sample under a helium atmosphere. The relative mass loss was determined between 50 °C and 800 °C, and cleaved fragments with m/z between 10 and 110 were recorded over the same range. As reference samples, we used graphite (type 3061, Asbury Carbon Mills, USA) or graphene oxide produced from this graphite according to a previously published method68. We defined a lower bound for the case in which all the volatiles ΔTG are inorganic (PIlow = 100% – wt%(C) – ΔTG), and an upper bound for the case in which all volatiles are carbon (PIup = 100% – wt%(C)). Raman spectroscopy (CRM 200, Witec, Germany) was conducted using a green laser with an excitation wavelength of 532 nm and a 100× objective. For each sample, at least ten different areas were analyzed by recording single spectra with a laser spot size of about 400 nm. The chemical composition was analyzed via X-ray photoelectron spectroscopy (Kratos Analytical Ltd., U.K.) using an Al Kα X-ray photon source with hν = 1486.71 eV. The characteristic carbon peak at 284.8 eV was used as a reference for all binding energies. The degree of functionalization or contamination of each graphene sample was determined by deconvolution of the C 1s core-level spectra and by analyzing the sp2 and sp3 contents5. We defined the sp3 bond fraction, r(sp3) = 100% – r(sp2), as a statistical measure of the density of impurities on clean and intact graphene. The SSA of materials was determined using the BET method, and at least ten areas on the SiO2/Si wafer were measured.
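The two PI bounds and the sp3 fraction defined above are simple arithmetic on measured percentages. A minimal sketch follows; the function and argument names are hypothetical and chosen only for illustration, and all inputs are assumed to be percentages between 0 and 100.

```python
def impurity_bounds(carbon_wt_pct, delta_tg_pct):
    """Return (PI_low, PI_up) in percent.

    PI_up  = 100 - wt%(C)          (all volatiles assumed to be carbon)
    PI_low = 100 - wt%(C) - dTG    (all volatiles assumed to be inorganic)
    """
    if not 0.0 <= carbon_wt_pct <= 100.0:
        raise ValueError("carbon content must be between 0 and 100 wt%")
    if delta_tg_pct < 0.0:
        raise ValueError("relative mass loss must be non-negative")
    pi_up = 100.0 - carbon_wt_pct
    pi_low = pi_up - delta_tg_pct
    return pi_low, pi_up


def sp3_fraction(sp2_pct):
    """r(sp3) = 100% - r(sp2), with r(sp2) taken from the C 1s deconvolution."""
    if not 0.0 <= sp2_pct <= 100.0:
        raise ValueError("sp2 fraction must be between 0 and 100%")
    return 100.0 - sp2_pct
```

For example, an illustrative (not measured) sample with 90 wt% carbon and a 4% relative mass loss would give bounds of 6% and 10%.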

Flake thickness was determined by atomic force microscopy (Dimension® Icon™ atomic force microscope equipped with ScanAsyst™, Bruker, Germany) in tapping mode, and the number of layers was estimated as the flake thickness divided by the inter-layer distance. Here, the inter-layer distance was assumed to equal the graphite inter-layer distance of 0.335 nm43. The lateral dimensions of NGMs and their distribution were extracted from light microscopy images with the open-source software ImageJ (version 1.53d, NIH, USA). For this purpose, we converted the green channel of the RGB images to a black-and-white image and analyzed at least 1000 flakes from 10 different areas across the wafer.
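The layer estimate described above can be written as a one-line conversion. The sketch below uses hypothetical names and returns the unrounded quotient, since the text does not state whether or how the result was rounded.

```python
from statistics import median

GRAPHITE_INTERLAYER_NM = 0.335  # graphite inter-layer distance used in the text


def estimate_layers(thickness_nm):
    """Estimate the number of layers as AFM thickness / inter-layer distance."""
    if thickness_nm < 0:
        raise ValueError("thickness must be non-negative")
    return thickness_nm / GRAPHITE_INTERLAYER_NM


def median_layers(thicknesses_nm):
    """Median layer estimate over several measured flakes (illustrative summary)."""
    return median(estimate_layers(t) for t in thicknesses_nm)
```

A flake measured at 3.35 nm would correspond to roughly ten layers under this assumption.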

Purification to remove persistent adsorbates and inclusions was performed for selected NGMs by washing with 1 M HCl, followed by ethanol, chloroform, ethanol, 0.1 M KOH (60 °C, 2 h each), and distilled water. The materials were centrifuged between washing steps and finally dried at 80 °C under reduced pressure for 48 h.

### Cell culture, cytotoxicity, and membrane damage assays

The cytotoxicity of the samples was evaluated following BS EN ISO 10993-5:2009, Biological evaluation of medical devices, Part 5: Tests for in vitro cytotoxicity. To this end, L929 cells (Sigma-Aldrich, USA) were sub-cultured in basal growth media (high-glucose media, 10% fetal bovine serum, and 1% penicillin/streptomycin, all from Gibco, USA) until 70–80% confluence had been reached. Thereafter, cells were harvested (TrypLE, Gibco), seeded in a 96-well plate (10³ cells per well), and left undisturbed for 24 h at 37 °C and 5% CO2. Subsequently, the basal growth media was completely removed, and the cells were washed with phosphate-buffered saline (PBS, pH 7.4, Gibco). Next, the cells were treated with graphene-containing media, which was prepared by mild ultrasonication of the graphene powder (10 or 25 µg/ml) in basal growth media for 10 min at room temperature. After the 24 h treatment period, cells were washed twice with PBS.

Cell viability was assessed via the WST-1 assay (Abcam, UK), which estimates the relative number of living cells from the cleavage of the tetrazolium salt to formazan by cellular mitochondrial dehydrogenases. To this end, cells were treated with 100 µl of fresh basal growth media containing 10 µl WST-1 reagent for 90 min at 37 °C. Afterwards, 70 µl of media was transferred to a new plate, and the absorbance was read at 450 nm (Multiskan GO, Thermo Fisher, USA).

Cytotoxicity was assessed based on cell membrane damage, which was quantified by the release of lactate dehydrogenase (LDH). For this, CytoTox-One (Promega, USA) and supernatant basal growth media were mixed (50 µl each), added to the wells, and incubated at room temperature for 30 min. Following the incubation, the fluorescence was measured at 560 nm excitation/590 nm emission (Fluoroskan FL, Thermo Fisher, USA), and the percentage of cytotoxicity was calculated using Eq. (1).

$$\%\,\mathrm{cytotoxicity} = 100 \times \left( \frac{\mathrm{Experimental} - \mathrm{Culture\;medium\;background}}{\mathrm{Maximum\;LDH\;release} - \mathrm{Culture\;medium\;background}} \right) \tag{1}$$

The negative control was cells cultured in basal growth media without NGMs, and the positive control for LDH was cells pre-treated with lysis buffer. All the cellular toxicity experiments were performed in three independent sessions.
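Eq. (1) can be applied directly to the fluorescence readings. The following is a minimal sketch with hypothetical argument names; it assumes all three readings are in the same arbitrary fluorescence units and that the maximum LDH release comes from the lysis-buffer positive control.

```python
def percent_cytotoxicity(experimental, background, max_release):
    """Percentage cytotoxicity according to Eq. (1).

    experimental : fluorescence of the NGM-treated well
    background   : fluorescence of the culture medium background
    max_release  : fluorescence of the lysed positive control
    """
    denominator = max_release - background
    if denominator <= 0:
        raise ValueError("maximum LDH release must exceed the background")
    return 100.0 * (experimental - background) / denominator
```

With illustrative readings of 600 (treated), 200 (background), and 1000 (maximum release), the function returns 50.0. Values slightly below 0 or above 100 can occur from measurement noise; the sketch does not clip them.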

### Statistical analysis

The potential association between the tested properties and cell viability and cytotoxicity was analyzed using bivariate Spearman correlation analysis (SPSS version 22.0, IBM, USA). Spearman’s correlation coefficient (rs) and the probability value (p) are reported. We also report the median (50th percentile) along with the 10th to 90th percentile values of the assessed parameters.
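The analysis above was run in SPSS. As an illustration only, Spearman’s rs can be reproduced as the Pearson correlation of the rank-transformed data, with tied values given their average rank. The sketch below uses hypothetical function names, relies only on the Python standard library, and does not compute the p value.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))
```

A property that decreases monotonically as cytotoxicity increases gives rs = -1 regardless of the shape of the relationship, which is why a rank-based measure suits properties recorded on very different scales.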

PCA was performed for cell viability and cytotoxicity, reported as WST-1 and LDH at 10 µg/ml, after centering the data obtained with different units and scales. The number of principal components was selected according to Kaiser’s criterion and a scree plot: only principal components with an eigenvalue greater than 1 were retained. Score and loading plots were drawn. The principal component score was computed to represent the cytocompatibility of the samples.
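The PCA itself was run in Origin. The sketch below is an illustration rather than the authors’ procedure. It assumes the two variables are scaled to unit variance as well as centered (the text mentions only centering), because Kaiser’s rule is normally applied to the correlation matrix. For two standardized variables with correlation r, that matrix has eigenvalues 1 + |r| and 1 − |r|, so the decomposition can be written in closed form. All names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardize(values):
    """Center to zero mean and scale to unit sample standard deviation."""
    m, s = mean(values), stdev(values)  # stdev needs >= 2 non-constant values
    return [(v - m) / s for v in values]


def pca_two_variables(wst1, ldh):
    """Closed-form PCA of two standardized variables.

    Returns (eigenvalues, pc1_loading, pc1_scores, retained), where
    `retained` holds the eigenvalues passing Kaiser's criterion (> 1).
    """
    z1, z2 = standardize(wst1), standardize(ldh)
    n = len(z1)
    r = sum(a * b for a, b in zip(z1, z2)) / (n - 1)  # sample correlation
    sign = 1.0 if r >= 0 else -1.0
    eigenvalues = (1 + abs(r), 1 - abs(r))
    # Eigenvector of the larger eigenvalue: (1, sign) / sqrt(2).
    pc1_loading = (1 / sqrt(2), sign / sqrt(2))
    pc1_scores = [(a + sign * b) / sqrt(2) for a, b in zip(z1, z2)]
    retained = [ev for ev in eigenvalues if ev > 1]
    return eigenvalues, pc1_loading, pc1_scores, retained
```

Because the two eigenvalues always sum to 2, at most one component can exceed 1, so Kaiser’s criterion retains a single component whenever the two readouts are correlated at all. When viability and cytotoxicity are negatively correlated, the first component contrasts the two readouts.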

An unsupervised HCA was employed to establish clusters among the samples based on toxicity. Clusters were formed through an agglomerative process, starting by connecting the two most similar samples. The Euclidean distance with the average linkage method was used to compute the similarities between samples. A circular dendrogram was plotted to present the result of the hierarchical clustering analysis. The vertical lines in the dendrogram represent root samples, while horizontal lines represent similarities between samples in terms of Euclidean distance. The PCA and HCA were performed using multivariate analysis in Origin (2019(b), OriginLab, USA).
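The agglomerative procedure described above can be sketched in a few lines. This is a naive illustration with hypothetical names, not the Origin implementation. Each sample is a tuple of coordinates, for example (viability, cytotoxicity), and the distance between two clusters is the mean Euclidean distance over all pairs of their members (average linkage).

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(points):
    """Agglomerative clustering with Euclidean distance and average linkage.

    Returns a list of merges (cluster_a, cluster_b, distance). Original
    samples have ids 0..n-1; each merge creates the next free id.
    """
    clusters = {i: [i] for i in range(len(points))}
    next_id = len(points)
    merges = []

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(dist(points[i], points[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two most similar (closest) clusters.
        a, b = min(combinations(clusters, 2), key=lambda ab: linkage(*ab))
        merges.append((a, b, linkage(a, b)))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges
```

The returned merge list contains the information a dendrogram displays: which clusters join and at what distance. The sketch recomputes distances at every step, which is adequate for a few dozen samples but not for large datasets.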