Introduction

Nanoparticles (NPs) are increasingly prevalent in our society, with applications ranging from nanomedicine and photovoltaics to cosmetics and beyond. Despite the ubiquity of their use, we still lack a fundamental understanding of how NPs interact with their local environments. This poor understanding becomes a critical problem when using NPs in real-life situations. Notably, it has been well documented that introducing NPs into biological fluids results in the formation of a surrogate biological identity known as the biomolecular corona, which arises from the spontaneous adsorption of proteins1, lipids2, sugar moieties3, nucleic acids4 and metabolites5,6 onto nanomaterial surfaces. The composition of the biomolecular corona strongly determines the ultimate fate of NPs7,8,9. Because most studies of biomolecular corona formation have focused on proteins adsorbed to the surface of NPs10, and because the predominant model of biomolecular corona formation is dominated by the direct adsorption of proteins from biological milieus11, this Review focuses on the protein corona.

The emergence of the concept of the protein corona in nanomedicine substantially altered how researchers view and interpret biosystem–NP interactions. The first critical milestone occurred in 2007 when Dawson’s group coined the term protein corona12. However, protein corona studies commenced as early as the late 1950s13 and early 1960s14,15 under a slightly different terminology and a colloidal science context. The characterization of protein coronas has revealed that the adsorption of proteins onto nanomaterials can have mixed biological effects. Protein corona patterns, which are dependent on the physicochemical properties of nanomaterials and the complexities of biological matrices, can unpredictably change NP outcomes including function16, uptake17,18, biodistribution19,20, immunological responses10,21 and toxicity22,23, and can thus be a challenge for therapeutic nanomedicine. Mechanistically, the protein corona can be composed of endogenous ligands that can mask the innate or designed reactivity of NP surfaces, block cell membrane receptors thus hindering NP internalization, protect NPs from opsonization (that is, the immunological response that targets NPs for removal through the adsorption of antibodies and complement proteins onto the surfaces of NPs), limit blood circulation or enhance cytotoxicity compared with their pristine NP counterparts.

Conversely, the protein corona also creates new opportunities for diagnostic and personalized nanomedicine; the protein corona can be utilized for disease diagnosis24,25,26,27, for the optimization of cell internalization28,29,30 or leveraged for improving the in vivo biodistribution21,31 of nanomedicines. The enrichment of plasma proteins, particularly rare proteins and glycoproteins, onto NPs can create a protein corona ‘fingerprint’, which can be used for the personalized detection of biomarkers and assist in post-translational modifications in support of risk stratification, prognosis and disease recognition. Selectively coating NPs with purposely designed protein coronas can regulate cell-dependent uptake, promote blood circulation and enhance their therapeutic efficacies. Furthermore, protein corona studies have advanced from the conceptualization of how biosystems ‘see’ and perceive NPs32 towards the development of new diagnostics and therapies24,25,26,33,34, the emergence of the eco-corona35,36,37 and materialization of the biological and ecological impacts induced by protein-corona-coated nanoplastics38,39. Artificial intelligence (AI) can be harnessed for data-driven discoveries of NP–biological interactions and complexities. Machine learning algorithms can identify the important variables that affect protein corona formation on specific types of NPs40 and have predicted diseases in patients using personalized protein corona fingerprints19,26. Accordingly, a better understanding of the composition, pattern and decoration of biomolecules at the surface of NPs, supplemented by AI, can facilitate the development of safer and more effective nanomedicine technologies with desired biological fates. Several challenges remain in the field of protein corona research, including NP heterogeneity, interpretation of protein patterns and induced perturbations of immunological and toxicological responses. Through standardization of methodologies and detailed characterization of the protein corona, which are foundational for the data sets that fuel AI, targeted therapeutic and/or diagnostic nanomedicines can be optimized.

In this Review, we summarize the progress, challenges and opportunities in protein corona research. We outline the current state of protein corona research in nanomedicine, highlight current challenges in research methodology and characterization and address how AI can be used to tackle these challenges. We then discuss emerging opportunities offered by the protein corona for the design of efficient nanomedicines and innovative therapies, in proteomic-based screening and for the detection of diseases. We also address the importance of understanding the protein corona to assess the ecotoxicological impacts of nanomaterials that traverse the environment.

Identity of the protein corona for nanomedicine

Understanding the biofluid protein corona

The scientific understanding of the protein corona for nanomedicine applications began with empirical analyses of the as-of-yet uncharacterized protein corona that formed when polystyrene NPs interacted with proteins in complex biological fluids, such as plasma or serum41,42,43. These studies evaluated the efficiency of polymers to hinder protein adsorption onto colloidal particles, with the goal of enhancing drug targeting and correlating adsorbed protein patterns with the role of surface charges on blood cells. Thus, although the protein corona was inadvertently analysed, the main aim was not its characterization and exploitation. Following these studies, protein corona composition was scrutinized by addressing kinetics12,44,45, protein conformation46,47,48 and the functionality of the adsorbed proteins on the surfaces of NPs49,50,51. Figure 1 provides a timeline of key milestones within protein corona research.

Fig. 1: History of protein corona research.

AI, artificial intelligence; ML, machine learning; NP, nanoparticle; SA, sensor array.

Equilibrium binding between proteins and NPs is largely dependent on protein identity and on NP physicochemical properties (such as size and surface chemistry)12,44. Binding affinities for the adsorption and desorption of proteins affect the colloidal stability of NPs. Proteins may also change conformation upon adsorption onto NPs as conformational changes are dependent on both protein and NP properties. Greater conformational changes are observed in proteins adsorbed onto hydrophobic NP surfaces relative to their hydrophilic counterparts, with the shape of the protein impacting its stability on NPs with high surface curvatures46,47. External factors such as pH and ionic strength also influence surface-binding-induced protein unfolding45. For example, adsorption-induced protein conformational changes were quantified by targeting and fluorescently labelling exposed amine groups; these residues, which were originally buried inside the tertiary structures of proteins in native conformations, became exposed because of adsorption-induced conformational disorders48. Protein conformational changes can impact protein functionality through the exposure of once-concealed functional epitopes or misfolding that results in complete loss-of-protein function, subsequently impacting downstream biological processes and mediating NP–biosystem interactions.
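As an illustrative aside, the dependence of equilibrium binding on adsorption and desorption affinities can be sketched with a single-site Langmuir-type model. This model is an assumption made here for illustration and is not specified in the studies cited above; the function names and the rate constants in the usage lines are hypothetical, and real coronas involve many proteins competing for the same surface.

```python
def dissociation_constant(k_on_per_molar_s, k_off_per_s):
    """Equilibrium dissociation constant Kd = k_off / k_on (in molar).

    Assumes simple reversible 1:1 binding of a protein to a surface site.
    """
    if k_on_per_molar_s <= 0 or k_off_per_s < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off_per_s / k_on_per_molar_s


def fractional_coverage(protein_conc_molar, kd_molar):
    """Fraction of surface sites occupied at equilibrium (Langmuir isotherm).

    theta = C / (C + Kd); a lower Kd (higher affinity) gives higher coverage.
    """
    if protein_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd > 0")
    return protein_conc_molar / (protein_conc_molar + kd_molar)


# Hypothetical rate constants, chosen only to illustrate the calculation.
kd = dissociation_constant(k_on_per_molar_s=1e5, k_off_per_s=1e-3)
theta = fractional_coverage(protein_conc_molar=1e-6, kd_molar=kd)
print(f"Kd = {kd:.1e} M, coverage = {theta:.3f}")
```

Under this simplified picture, coverage reaches one half when the free protein concentration equals Kd, which is one way to compare how strongly different proteins would occupy the same NP surface.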

By studying the spatial organization of the protein corona on polystyrene NPs, adsorbed proteins were found to be randomly distributed. Yet, these randomly adsorbed proteins possessed functional epitopes that could bind to receptors49. Similarly, adsorbed lipoproteins on 2D graphene flakes were found to possess several functional epitopes capable of binding to receptors in the liver50. Furthermore, analysis of adsorbed low-density lipoproteins and IgG on SiO2 NPs revealed functional epitopes, which enhanced NP uptake in human lung and embryonic kidney cells51. Overall, the functionality of the protein corona, which is influenced by nanomaterial physicochemical properties, protein varieties and concentrations in biofluids and ambient parameters (such as pH and temperature), allows for the specific recognition of cell receptors and suggests that there may exist multivalent interactions between cells and NPs mediated by rare corona proteins that are, to date, not well understood.

The understanding of corona-mediated functionalities can also be leveraged for the development of nanomedicines. Personalized protein27,52,53,54 and biomolecular coronas4,5,6,55 have been proposed as tools for disease diagnosis and prognosis, as the composition of the protein corona developed in patient-derived serum is disease dependent. For instance, the protein corona formed in patient sera can be used to identify biomarkers for lung cancer52, predict Alzheimer disease53 and screen for pancreatic cancer54. Multi-omics of the biomolecular corona4,5,55, which combines the analyses of metabolites, including organic acids, sugars, amino acids and hormones, as well as lipids and proteins, has enabled a more thorough profiling of human samples for disease diagnostics, but these studies are limited by extraction methods and separation conditions. Methodical sample preparation is crucial and must account for surface chemistry, along with pH and ionic strength of elution and/or extraction buffers, to isolate metabolite, protein and lipid constituents from the biomolecular corona6. Additionally, protein coronas have been utilized to maximize the performance of traditional proteomic pipelines by deep sampling complex plasma proteomes, comprising thousands of proteins in human plasma samples26. Traditional proteomic analysis of the plasma proteome has been difficult to achieve owing to the presence of a few dozen extremely abundant proteins that dominate the protein mass content in plasma. Formation of the protein corona on NPs results in the enrichment of rare proteins and biomarkers within the proteome that can be identified by liquid chromatography–tandem mass spectrometry (LC–MS/MS)24,25,26,27. Recently, protein coronas that formed on five different NPs were utilized to systematically detect over 2,000 proteins from more than 100 plasma samples, bypassing the typical complex sample preparation workflows for neat plasma that immunodeplete highly abundant proteins and fractionate samples with chromatography. These protein corona data sets were then utilized to predict low-abundance proteins associated with non-small-cell lung cancer26.

Soft and hard coronas

NPs possess unique physicochemical properties56, such as size57,58, shape59 and surface chemistries19,57,60, that can affect the formation of the protein corona. Additionally, the corona profile can be influenced by ambient factors such as biomolecule concentration60, pH61, ionic strength62, temperature63 and incubation time64,65,66. Upon corona formation, the identity of the NP is transformed; its physicochemical properties67, targeting abilities20,28,68 and biological responses17,22,65,69 change owing to the surrogate biological identity dictated by the adsorbed proteins. For example, characterization of the protein corona on core-shell NPs, composed of identical PEGylated lipid bilayer shells with varying core elasticities, revealed that apolipoprotein A1 preferentially adsorbs onto NPs with intermediate elasticity and the adsorption strongly correlates with longer NP elasticity-dependent in vivo systemic circulation lifetimes70.

Time-dependent profiling of protein coronas, extracted from human plasma, revealed that the formation of protein corona fingerprints occurs rapidly (~30 s), and the amount of protein in the corona, but not the composition of the protein profile, can change over time71. Moreover, the rapid formation of the protein corona impacts the kinetic pathophysiology of the NPs. For example, both an increased NP uptake by microvascular endothelial cells and the inhibition of erythrocyte haemolysis were observed upon protein corona formation71. The study of in situ protein corona formation in complex biological matrices, such as whole blood and plasma, showed that PEGylated gold NPs do not aggregate in these fluidic environments72.

Protein coronas incorporate both soft and hard coronas. The soft corona, comprising biomolecules such as proteins with low affinities to NP surfaces, endures dynamic and reversible exchanges that are dependent on the conditions of the surrounding biological fluid58,73,74. The hard corona develops gradually, is more stable and is composed of ‘relatively immobile’ proteins with substantial affinities to, and low tendencies to dissociate from, NP surfaces57,60,64. The understanding of both soft and hard coronas is crucial for inferring NP stability, function and interactions with biological systems. Kinetically, the associations between proteins and NPs in biological fluids are governed by non-covalent interactions, such as electrostatic forces, hydrophobic forces, hydrogen bonding and π–π stacking75. Proteins competitively bind to NP surfaces76, establishing transient NP–protein complexes composed of soft and hard corona proteins58,64,73,74 under thermodynamically favourable conditions75. Because of the high dissociation rate of the soft corona proteins, our current understanding of the biological identity of the protein corona is typically limited to the hard corona proteins. Moreover, both NP properties and the conditions of biological matrices can impact the kinetic and dynamic binding equilibrium of the soft and hard protein corona to NP surfaces. Protein–protein interactions on the hard protein corona can be either stable or transient, similar to those of the soft corona77. The majority of soft corona proteins are not unique to the soft corona composition, but rather also compose the hard corona profile, indicating that soft corona proteins possess variable binding strength states78. Moreover, artificially hardening the dynamic soft corona to increase the residence time of the soft corona proteins revealed decreased cellular associations, suggesting that the dissociation of soft corona proteins can reveal bare NP surfaces allowing for nonspecific interactions with cell membranes78.
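The contrast between rapidly exchanging soft corona proteins and slowly dissociating hard corona proteins can be illustrated with first-order desorption kinetics. The sketch below is a simplification assumed here for illustration, not a model taken from the cited studies; the two rate constants are hypothetical values chosen only to show the difference in residence time.

```python
import math


def mean_residence_time(k_off_per_s):
    """Mean residence time (s) of a bound protein for first-order desorption."""
    if k_off_per_s <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off_per_s


def fraction_still_bound(k_off_per_s, elapsed_s):
    """Fraction of initially bound protein remaining after elapsed_s seconds.

    Assumes no rebinding, so the bound population decays as exp(-k_off * t).
    """
    if k_off_per_s < 0 or elapsed_s < 0:
        raise ValueError("k_off and elapsed time must be non-negative")
    return math.exp(-k_off_per_s * elapsed_s)


# Hypothetical rate constants: fast exchange (soft) versus slow exchange (hard).
for label, k_off in (("soft", 1e-1), ("hard", 1e-4)):
    tau = mean_residence_time(k_off)
    left = fraction_still_bound(k_off, elapsed_s=60.0)
    print(f"{label}: residence time {tau:.0f} s, bound after 60 s: {left:.3f}")
```

In this picture, a washing or isolation step lasting about a minute would strip most of a fast-exchanging layer while leaving a slowly dissociating layer largely intact, which is consistent with the statement above that current knowledge is typically limited to hard corona proteins.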

Activation of bioorthogonal nanozymes, in this case transition metal catalysts that mimic enzymes, varies depending on the dynamic nature of the protein corona: ex vivo, the soft corona reduces nanozyme functionality, but the hard corona results in aggregation and total loss of nanozyme activity; intracellularly, endosomal and lysosomal proteases restore the catalytic activity of both the hard and soft corona-coated nanozymes16. Protein corona formation can also be stereo-specific. For example, the composition of the soft corona of chiral cuprous sulfide (Cu2S) is chirality-dependent and correlates to its stereo-selective biodistribution79. Time-dependent changes to the chirality-mediated soft and hard corona compositions were correlated with decreased blood circulation and trafficking to the liver79. Silver NPs typically dissolve into biologically active and highly toxic Ag+ ions80. However, the presence of a hard corona on the NPs induces their sulfidation, which decreases their toxicity by forming insoluble Ag2S nanocrystals81; meanwhile, the soft corona was found to mediate Ag+ removal, reducing Ag2S nanocrystal formation82. These previous studies highlight that to better comprehend the distinct biological responses of nanomedicines, the differential compositions of the soft and hard protein coronas need to be taken into consideration.

Remaining challenges limiting nanomedicine development

There remain several challenges within the protein corona discipline that must be taken into consideration to improve nanomedicines.

Protein patterns

Although several studies have identified the composition and abundances of adsorbed proteins on NP surfaces, few have been able to distinguish protein patterns, such as multifaceted protein recognition domains, and their roles towards NP recognition pathways by biosystems32. Although NPs coated with protein coronas can interact with cell membranes directly through ligand–receptor connections, transient interactions with multiple proteins on the corona can also occur on the membrane. It is possible that the presence of multiple membrane receptors can facilitate the interaction with multiple functional epitopes exposed on the protein corona. These short-lived interfaces can recruit intracellular biomolecules that form cytosolic signal transduction clusters and ultimately bionanosynapse with protein corona-coated NPs, thus preparing cells for ensuing biological impacts32. Moreover, there is a knowledge gap in distinguishing all of the multivalent roles of corona proteins with regard to their recognition by cells, in NP pharmacokinetics as well as immunoregulatory signalling and gene expression. Decoding which corona protein pattern impacts cell internalization, biological distribution into tissues and organs and the secretion of inflammatory cytokines and genetic material is an extraordinarily challenging question.

Differing biological and toxicological responses

The identity and nature of the adsorbed corona proteins can mediate biological and toxicological responses; even though the protein corona can provide a protective coating and mitigate cytotoxicity83,84,85, it also prompts enhanced toxicity in some cases86,87,88. Although there exist several differences in protein corona profiles (such as the presence of rare proteins in some cases) extracted from NP surfaces, multiple studies have confirmed that abundant proteins — such as albumin, fibrinogen, apolipoproteins, complement proteins, transferrin and immunoglobulins — commonly adsorb on NPs because of their relatively high abundance in human serum and plasma89. These adsorbed proteins can influence NP physiological responses. For example, although bare carboxylated-multiwalled carbon nanotubes (CNTs) induced blood platelet aggregation and release of platelet membrane microparticles, coating the CNTs with albumin and fibrinogen attenuated platelet aggregation and prevented the release of membrane microparticles, respectively90.

Protein conformational changes induced by adsorption can also mediate biological responses. For example, albumin changed conformation following its adsorption onto nanoporous polymer NPs (NPP), with cell-dependent differential uptake mediated by the unfolded albumin. In that case, a substantial decrease in NPP uptake by monocytes, but a slightly elevated receptor-mediated phagocytosis of NPP in macrophages, was observed91. Similarly, conformational changes of fibrinogen upon adsorption onto gold NPs induced the activation of cell receptors, upregulated transcription factor signalling and released inflammatory cytokines in monocytes69. Conformational changes to IgG, but not to fibrinogen, upon adsorption to CNTs were associated with elevated levels of reactive oxygen species and inflammatory cytokines from macrophages, mediated by the denatured IgG adsorbed on CNTs92. Currently, reports describing the nature of the adsorbed protein corona and subsequent paradoxical biological impacts are limited because of the lack of diversity of protein–NP systems under investigation and because of deficiencies in understanding protein–NP adsorption mechanisms93, thus precluding generalizable trends of protein corona dynamics.

Perturbed immunological response

The injection of nanomedicines into the circulatory system can result in perturbed immunological responses as well as the unsolicited formation of the protein corona, opsonization or immune system recognition of specific corona patterns that limit the circulation of nanotherapeutics65,94,95. For example, during the opsonization of superparamagnetic iron oxide nanoworms in human plasma, complement component 3 (C3) covalently bound to adsorbed proteins at the surface of the nanoworms. C3, which is the most abundant complement protein in serum, activates an immunological response (the complement system) for the removal of foreign materials, such as NPs, from cells. After the binding of C3 to the nanoworm protein corona in vitro, a dynamic exchange was observed in vivo, suggesting that the immunological corona was kinetically unstable21. The exchangeable nature of the protein corona may induce re-recognition by the immune system in vivo as it does in vitro21, resulting in rapid, undesirable clearance of nanomedicines94,95. Further physiologically relevant in vivo protein corona exploration is necessary to engineer nanomedicines that deter immunological responses, such as complement protein exchange, and enhance circulation time.

Nanoparticle heterogeneity

Separation of heterogeneous NPs and analysis of the individual corona from a single particle is a challenging task that has not been studied extensively96,97. The MagLev (magnetic levitation) technique, which uses a high-intensity magnetic field to levitate and separate diamagnetic objects according to their densities, has been applied to study protein corona heterogeneity and has been suggested for extraction of homogeneous corona-coated NPs96. Future endeavours should conduct global analyses of the variances in protein corona, as typical ‘averages’ of the protein corona do not constitute all corona subclasses97, just as single-cell gene expression analysis can have striking differences from the averaged analysis of a population of cells98. Thus, there are several reasons why it is crucial to characterize NP protein coronas and analyse their biological impacts for the development of reproducible nanomedicines99. Integration of high-throughput computational tools, such as AI, can complement high-throughput protein corona investigations to catalyse the advancement of nanomedicine.

Targeting strategies for therapeutic nanomedicine

NPs that have targeting capabilities are obtained by functionalizing their surface with targeting species (such as antibodies, small molecules and nucleic acids), enabling their localization to desired locations in the body for payload release and/or imaging purposes. However, the formation of the protein corona can shield these targeting moieties and cause mistargeting and unfavourable biodistribution20.

There are four major proposed strategies to address the shielding role of the protein corona on targeted NPs (Fig. 2).

Fig. 2: Schematics showing the major challenges created by the NP protein corona and proposed strategies to address them.

a, The use of protein-repellent compounds to coat nanoparticles (NPs) can minimize corona formation and, therefore, reduce the shielding effect of the corona on active/functional sites of targeting moieties. b, The use of specific proteins to pre-coat NPs can enhance recruitment of proteins with intrinsic targeting capacities during corona formation. c, Pre-adsorbing targeting antibodies to the surface of NPs maintains the exposure of the active sites of the antibodies, even after corona formation. d, Targeting moieties can be attached to the surface of corona-coated NPs.

Protein-repellent coating

The first strategy involves the use of protein-repellent coatings (such as zwitterionic compounds) on the surface of NPs to prevent and/or minimize corona formation100,101 (Fig. 2a). For example, silica NPs functionalized with cysteine (as a zwitterionic ligand) and conjugated with biotin (as a targeting molecule) prevented the formation of the protein corona, when compared with the same NP without cysteine functionalization101. The zwitterionic coating also substantially improved the targeting capacity. Another relevant example is the use of antifouling polymers (such as polyethylene glycol or polyethylene oxide)102, which reduce shielding by the protein corona on the basis of the characteristics of the specific coating (such as density, size, length and heterogeneity of the polymeric coating)103,104.

Pre-coating strategy

Another strategy is to pre-coat NPs with components that recruit plasma proteins with targeting capacities during corona formation29 (Fig. 2b). The pre-coating strategy, which was first used in 1980 to inhibit uptake of liposomes by macrophages105, can be precisely designed to manipulate protein corona composition in a way that provides a new impetus for targeted payload delivery to specific biosystems and/or tissues. For example, surface modification of liposomes with a short amyloid β-derived peptide can recruit plasma apolipoproteins, and the resulting apolipoprotein corona-coated liposomes can target brain tissue for high-yield drug delivery applications106. One of the main challenges of the pre-coating approach, however, is that the active site of proteins should be exposed in the outer corona layer for interaction with specific cell receptors. For instance, although silica NPs pre-coated with the serum proteins γ-globulins could enrich the protein corona with immunoglobulins and opsonins, which have high targeting efficacy towards, and are recognized by, macrophages through Fc (fragment crystallizable) receptors, there was no substantial enhancement of uptake of these NPs by macrophages compared with non-pre-coated NPs29. This result highlights the importance of exposing the functional binding motifs to cell receptors49, which is difficult to engineer from the spontaneous adsorption of such proteins to the NP surface. Therefore, new strategies should be developed to increase the efficiency of the pre-coating approach by enhancing the availability of functional motifs of corona proteins for cell receptors to achieve desired targeting efficacy49. We now understand that pre-coating with albumin can bring more carrier proteins to the surface of NPs and improve their blood circulation time, therefore improving their biodistribution profile and targeting efficacies.

Pre-adsorption versus chemical conjugation

The third strategy entails using a physical approach to attach targeting moieties to the surface of NPs rather than chemical conjugation107 (Fig. 2c). For instance, two types of targeted polystyrene NPs were synthesized by pre-adsorption versus chemical conjugation of anti-CD63 antibodies to their surfaces, and their targeting efficacies towards monocyte-derived dendritic cells with CD63 surface receptors, in the presence and absence of serum or plasma, were studied107. Although, in the absence of serum or plasma, both targeted NPs demonstrated similar targeting capacity towards dendritic cells, the pre-adsorbed antibody-coated NPs showed substantially higher targeting efficacy compared with chemically attached targeted NPs in the presence of serum or plasma. This is, at least in large part, because the pre-adsorption method allows attachment of antibodies with outward-facing recognition domains and better targeting outcomes compared with chemical conjugation.

Conjugation to protein corona

The last strategy involves attaching targeting moieties to the surface of corona-coated NPs108 (Fig. 2d). The targeting species can be attached to the formed protein corona in an equilibrium state — when the protein corona is formed within a controlled single-component or multicomponent biofluid that does not change over time. For example, both bare and corona-coated gold and silica NPs were functionalized with transferrin, and their targeting capacities towards cells with transferrin receptors were monitored in the absence and presence of serum proteins108. Although the targeting capacity of transferrin-conjugated bare gold and silica NPs in the presence of serum was reduced compared with the serum-free media, there was no evidence of targeting efficacy reduction when transferrin was conjugated to the equilibrated protein corona of gold and silica NPs.

Challenges in the research methodology and characterization of the protein corona

Methodology

Poor methodology can induce errors and/or lead to misinterpretation of NP protein corona outcomes109. Here, we focus on how the formation of the protein corona may cause misinterpretation in well-defined biological methodologies including toxicity assays and monitoring drug release strategies.

The interference of the protein corona with cytotoxicity assays (including the 3-[4,5-dimethylthiazol-2-yl]-2,5-diphenyltetrazolium bromide (MTT) assay) was one of the first issues raised110. The formation of the protein corona can substantially change the composition and nutritional balance of cell culture media in static settings mainly because of the attraction of proteins, amino acids and vitamins to the surface of NPs and because of variations in protein conformation after interactions with the surface of NPs. As a result, the modified cell culture medium itself can induce cytotoxic effects under static in vitro conditions and may cause errors in NP toxicity outcomes. Two main strategies can be used to achieve reliable NP toxicological data. The first approach is to use a bioreactor, in which the use of dynamic flow can minimize the protein–NP adsorptive effects in cell culture media, thereby avoiding NP-induced variabilities in media composition. The second strategy is to introduce NPs to the cell culture media first and incubate them for an hour (to ensure that equilibrium among the biomolecules on the NP surface is reached) and then collect the particles and introduce them to cells in fresh culture media to minimize the effect of the protein corona on the composition of the culture media.

Interactions of NPs with proteins whose oligomerizations and fibrillations can cause neurodegenerative diseases (such as amyloid β and α-synuclein for Alzheimer and Parkinson diseases, respectively) have been a subject of extensive research in nanomedicine111,112,113. However, the role of NPs in the fibrillation process has often been probed in the absence of the protein corona. As NPs that reach brain tissues for possible interactions with amyloid or synuclein proteins would certainly have previously interacted with biological fluids, their interaction with neurodegenerative-related proteins should also be studied in the presence of the protein corona. Corona-coated NPs mediate the antifibrillation impact of NPs and slow amyloid β fibrillation114, which may cause a challenge in clinical translation of neurodegenerative nanotechnologies: ignoring the critical role of the protein corona in probing the role of NPs in the fibrillation process can cause substantial misinterpretation of the outcomes, which, in turn, can lead to incorrect prediction of the behaviour of NPs in vivo. Therefore, to achieve robust and clinically relevant effects on neurodegeneration-related proteins, corona-coated NPs instead of bare NPs should be studied.

Another major field in which the formation of the protein corona can make a substantial impact is the development of drug delivery nanoplatforms115. The protein corona can create an additional barrier to payload release, substantially influencing payload release profiles and even mechanism of action116,117. As a result, the reported outcomes of payload release studies that lack consideration of the protein corona may not accurately represent the in vivo payload release from the nanoplatform. Corona formation can attenuate or accelerate drug release kinetics on the basis of the physicochemical properties of a nanocarrier, the corona itself and/or influential parameters (such as pH and temperature). Although studies have begun to consider the critical role of the protein corona in the payload release kinetics of nanocarriers117, the effects on release kinetics of stimulus-responsive nanocarriers are poorly understood. Future studies should focus on the role of the protein corona on nanocarriers that respond to exogenous and/or physiological stimuli, including light, electric field, magnetic field, temperature and pH.

Overall, a deeper understanding of the methodological implications of the protein corona in various critical assays (such as toxicity and payload release) in nanomedicine would enable scientists and drug developers to design safe and more predictably efficient therapeutic nanomedicine technologies.

Characterization

The robust and precise characterization of the physicochemical properties and colloidal stability of corona-coated NPs is crucial for the identification of possible protein contamination and for the interpretation of protein corona outcomes118. Here, we focus on characterizing the composition of the protein corona in terms of protein identity and abundance, which is key to predicting and interpreting the interactions of NPs with biosystems. LC–MS/MS is one of the few techniques being used to define the type and abundance of proteins in the NP corona layer. Therefore, understanding the complexity of LC–MS/MS, from sample preparation methodologies to data analysis, is essential to accurately predict the biological fate of NPs.

Many laboratories that study the protein corona have expertise in characterizing and analysing the physicochemical properties of the protein corona in house, but they are not necessarily specialized in MS-based proteomics; therefore, they rely on core facilities for proteomic analysis of their samples. As different MS-based proteomics laboratories and/or core facilities may use different methods in their LC–MS/MS workflow, different instruments, equipment and commercial software, one can expect that the heterogeneity of the outcomes would be substantial99,119,120,121,122. To shed more light on how much and to what extent LC–MS/MS characterization and analysis can affect protein corona outcomes, 17 identical aliquots of corona-coated polystyrene NPs were sent to 17 different laboratories/core facilities for proteomics analysis123. The outcomes were surprising: out of 4,022 identified unique proteins in the protein corona layer, only 73 (1.8%) were shared across the laboratories and/or core facilities. It is noteworthy that the technical repeats from each of the core facilities revealed reproducible results, which emphasizes that using the identical sample preparation approach and instrumentation can provide reliable results. The observed heterogeneity across laboratories and/or core facilities, however, is an extremely important point, which needs to be seriously considered in the nanomedicine literature, as any interpretation regarding the interactions of NPs with biological systems heavily relies on the composition of the protein corona. To improve the reliability and robustness of protein corona data, the nanobio interface community should develop standard protocols on methodologies, analysis, reporting and interpretation of LC–MS/MS data.
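The shared fraction reported above (73 of 4,022 unique proteins, about 1.8%) is the size of the intersection of all per-laboratory identification lists divided by the size of their union. A minimal sketch of that calculation follows; the laboratory names and protein identifiers are hypothetical toy values chosen for illustration and are not data from the cited study.

```python
def shared_fraction(identifications):
    """Return (n_shared, n_unique, fraction) for per-laboratory protein sets.

    `identifications` maps a laboratory label to the collection of protein
    identifiers that laboratory reported for the same sample.
    """
    sets = [set(ids) for ids in identifications.values()]
    if not sets:
        return 0, 0, 0.0
    union = set().union(*sets)          # every protein seen by any laboratory
    shared = set.intersection(*sets)    # proteins seen by all laboratories
    fraction = len(shared) / len(union) if union else 0.0
    return len(shared), len(union), fraction


# Hypothetical toy identifications (illustrative only).
labs = {
    "lab_A": {"ALB", "APOA1", "C3", "FGA"},
    "lab_B": {"ALB", "APOA1", "CLU"},
    "lab_C": {"ALB", "APOA1", "C3", "IGHG1"},
}
n_shared, n_unique, frac = shared_fraction(labs)
print(n_shared, n_unique, round(frac, 3))

# The figures quoted in the text give the same kind of ratio.
reported_fraction = 73 / 4022
print(f"{reported_fraction:.1%}")
```

Because the intersection can only shrink as more laboratories are added, this ratio is a strict measure of agreement; pairwise overlaps between laboratories may be considerably higher than the all-laboratory value.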

AI and the protein corona

Implementation of standardized protocols on methodologies and analyses of protein corona can generate extensive multi-omic-based data sets that can be used to teach AI to prognosticate diseases using protein corona fingerprints and to predict protein corona formation on distinct NPs for the fundamental design of nanomedicines. Characterization and prediction of the protein corona are both important to understand the interactions of NPs in biological milieus, yet there is a large discrepancy between the two, as the former comprises substantially more reports than the latter124,125. This immense difference is attributed to the disadvantages of existing high-throughput techniques and instruments used to test the biocompatibility and biofouling of nanotechnologies. Protocols for protein corona extraction can vary between laboratories, and thus there is a need for robust and standardized methods109,126. Moreover, MS is laborious, expensive and requires a high level of expertise, and such high-throughput analytical experiments are subject to multiple errors and variabilities arising from laboratory-to-laboratory differences in sample preparation and analysis123. AI and machine learning approaches can overcome these technical barriers and further elucidate the impact of the protein corona by predicting protein adsorption to NPs as well as their biological impacts (Box 1).

Among the promising machine learning algorithms, random forest classification (RFC), which integrates multiple decision trees to form a predictive model127, can easily be trained with MS data sets for the identification of protein features that promote or discourage protein adsorptivity to NPs. Typical performance metrics that validate the predictions of RFCs are accuracy, precision and recall. In general, these metrics measure the overall correctness of RFC model predictions, as well as the proportion of correctly predicted positives made by the model. The RFC algorithm has been successful in predicting protein adsorption to single-walled CNTs using 38 protein features, which are variables that define the relationship between input and output data for the model, based exclusively on amino acid sequences, with 78% accuracy and 70% precision40. Moreover, using kernel density estimates, statistical methods to estimate the distribution of variables, protein features such as elevated solvent-exposed glycine residues and high leucine residues were strongly correlated with predicted proteins that adsorb or desorb from single-walled CNTs, respectively40. RFC has also predicted the protein corona that adsorbs on silver NPs with 75% accuracy and 76% precision128 by predominantly using the protein physicochemical properties as features; 10 features classified protein properties, and 4 described solvent characteristics (cysteine and NaCl concentrations) and NP properties (size and zeta potential). RFCs can calculate the importance of features on the basis of how they impact model performance metrics. By excluding particular features from data sets, accuracy, precision and recall can decrease, revealing which features are necessary to have high metric values. Protein features were more important in predicting protein corona formation, but this may have been a result of the lower quantity of solvent and nanomaterial features used in the model128. 
Thus, more features are necessary to comparably determine those that strongly impact protein corona prediction. The majority of RFCs that predict corona formation utilize several protein-based features and limited NP descriptors. This is due to the diversity and complexity of proteins, resulting in multiple characterizable protein features (molecular weight, isoelectric point, pH, grand average of hydropathy and amino acid compositions); meanwhile, NPs have fewer characterizable properties that can serve as features for machine learning models. Protein corona fluorescamine labelling was used as an NP descriptor because of its positive correlation with the physicochemical properties of the NP129. Alongside this singular NP descriptor, derived from the screening of the fluorescamine-labelled protein coronas formed on 22 diverse NPs, 4 protein classifiers were used in the RFC model for protein corona prediction with 84% precision; protein corona fluorescamine labelling serving as an NP feature was as effective as solely using typical NP features (size and charge) to build the RFC129. Moreover, this unique NP feature was used to build a model that was able to predict the adsorption of proteins on five different 2D nanomaterials with 75% precision, a feat that has been difficult to achieve with common classifiers because of the heterogeneity of 2D nanomaterials129. Additionally, RFCs have been used to predict the protein corona on several NPs with multiple types of surface chemistries, using proteomic data sets derived from 56 individual studies; the majority of protein coronas were predicted with >75% accuracy130. The most important factors that dominated protein corona prediction in these models were NP surface chemistries (bare, PEGylated, functionalization of amines and other ligands). 
The RFC model was also used to predict protein functional components (that is, apolipoprotein, complement protein, coagulation protein, immune protein and clusterin), which were then correlated with recognition indexes impacted by the protein corona. The recognition indexes represented cell uptake efficiencies, pro-inflammatory responses and perturbations to the immune system. These findings suggest that the predicted functional components of the protein corona were associated with cell recognition of different NPs130.
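The performance metrics and the feature-exclusion idea described above can be written out in a few lines. The sketch below is illustrative only: the labels are hypothetical (1 for a protein that adsorbs, 0 for one that does not), and `toy_score` with its `TOY_GAIN` table is a made-up stand-in for retraining and re-evaluating a classifier on a reduced feature set; it is not the procedure of any cited study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = adsorbs)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def importance_by_exclusion(features, score_fn):
    """Drop in score when each feature is left out (larger = more important)."""
    baseline = score_fn(features)
    return {
        f: baseline - score_fn([g for g in features if g != f])
        for f in features
    }


# Hypothetical predictions for eight proteins.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# Toy scorer (assumption): each feature adds a fixed amount to a base score.
# In practice score_fn would retrain the model and return a real metric.
TOY_GAIN = {"exposed_glycine": 0.10, "np_size": 0.02}


def toy_score(features):
    return 0.60 + sum(TOY_GAIN[f] for f in features)


drops = importance_by_exclusion(list(TOY_GAIN), toy_score)
print(drops)
```

Accuracy counts all correct calls, precision is the share of predicted adsorbers that truly adsorb, and recall is the share of true adsorbers that the model finds; reporting all three guards against a model that looks accurate only because one class dominates the data set.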

The biological impacts of protein corona-coated NPs have also been predicted using other machine learning algorithms such as partial least-squares regression (PLSR), support vector machine (SVM)-based classification, neural network (NN) and k nearest neighbour-based regression. Instead of predicting the composition of the protein corona, these algorithms use the physicochemical properties of the NP and/or the protein corona as features to predict NP-induced biological effects. A quantitative structure–activity relationship (QSAR) model, which is a computational model that can correlate NP and/or protein corona physicochemical properties to biological responses, developed with PLSR predicted in vitro cell interactions of Au and Ag NPs using the protein corona as a descriptor131. Briefly, proteins identified on the coronas of a library of 105 surface-modified Au NPs were used to build single-parameter linear models describing NP–cell associations as a function of the relative abundance of each protein in the corona or as the sum of the densities of the proteins that composed the corona. Using a selection of 64 predictive proteins, PLSR predicted NP–cell associations with an accuracy of 81%; the model accuracy increased by 5% when the parameters describing the NP formulation included both protein corona and NP physicochemical parameters, suggesting that the most relevant NP parameter for the prediction of cell association was the protein corona. Moreover, the Au NP protein corona was used to build a model that predicted cell associations by Ag NPs but was not successful (accuracy ~5%). An additional model was built using only the Ag NP protein corona and resulted in a high prediction accuracy (~80%). These findings suggested that there are NP-dependent differences in protein orientation and conformation that can influence cellular associations. 
Calculating the similarity of the protein coronas formed around Au and Ag NPs showed that the core material of the NPs had the greatest influence on protein corona composition when compared with size or surface chemistry131. Compared with the aforementioned study that utilized RFC to calculate the most informative feature among both protein and NP properties128, protein corona similarities were calculated to determine how influential the physicochemical properties of NPs were for protein corona formation131. Thus, the latter did not determine whether the protein corona was an essential feature for their cell association model. In a follow-up study, a nonlinear QSAR model was built with SVM-based classification to predict in vitro cell association based on both the physicochemical properties of 84 gold NPs, with varying surface chemistries, and the protein corona132. Succinctly, the nonlinear QSAR model used 6 proteins, identified from the protein coronas of 105 Au NPs, and zeta potentials as the most important contributors for predicting NP–cell associations. By including the zeta potentials of NPs, the predictive power of the NP–cell association model increased by 5%, when compared with a model that only used proteins from the corona as features (~85% accuracy)132. The in vivo fate of NPs was predicted with a supervised deep NN using protein coronas extracted at multiple time points from blood after in vivo circulation with 94% accuracy133. The NN algorithm was taught protein corona patterns, which fluctuate over time in circulation and are dependent on NP size, to accurately predict the clearance of NPs through the spleen or liver. Moreover, the model found that NP accumulation into these organs was independent of a single protein, but rather was contingent on an assortment of proteins on the corona that formed unique patterns throughout circulation time. 
An adaptive lasso-identified subset of proteins, which were strongly associated with highly accurate biodistribution, could predict clearance with an analogous accuracy as the NN prediction using more than 700 proteins133. SVM-based classification and k nearest neighbour-based regression algorithms have constructed quantitative nanostructure–activity relationship models of NPs to predict their biological impacts134, but these models lack input from the protein corona. Immune activation by spherical nucleic acid nanomedicines has also been predicted with a quantitative nanostructure–activity relationship model135, but similarly did not integrate the protein corona, which can potentially interfere with the biodistribution of the spherical nucleic acids or impact their intended therapies.
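The single-parameter linear models mentioned above relate NP–cell association to the relative abundance of one corona protein at a time. As a rough illustration of that building block, the sketch below fits an ordinary least-squares line and reports the coefficient of determination. It is plain univariate regression, not PLSR, and the abundance and association values are hypothetical numbers invented for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical values: relative abundance of one corona protein on four NPs
# and the measured cell association of each NP (arbitrary units).
abundance = [0.0, 1.0, 2.0, 3.0]
association = [1.0, 3.0, 5.0, 7.0]
slope, intercept, r_squared = fit_line(abundance, association)
print(slope, intercept, r_squared)
```

Repeating such a fit for every identified protein and ranking the proteins by R² is one simple way to shortlist candidate predictors before combining them in a multivariate model such as PLSR.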

In silico approaches, such as machine learning algorithms, can effectively predict protein corona formation and biological interactions of several NPs for the tactical translation of innovative nanomedicines. By implementing predictive models to learn proteomic and other-omics data sets associated with NP protein corona (Fig. 3), we can identify protein features, such as functional protein motifs and epitopes, that may contribute to protein binding to NP surfaces as well as understand the relationship of the protein corona in NP cell recognition, protein–protein interactions and subsequent in vivo biological interactions. These extrapolative tools can also potentially predict protein binding even in the absence of new proteomic data sets. By predicting the protein corona on NPs, we can engineer nanomedicines that seamlessly integrate into therapeutic applications without the need for experimental MS analysis. Additionally, machine learning can reduce the number of animals and experimental efforts required to assess nanomedicine efficacy before human trials.

Fig. 3: Representative workflow used to predict protein corona formation with machine learning.

Protein corona data, usually from mass spectrometry experiments, are used to train a machine learning classifier that learns which features of proteins are likely to be found in versus out of the nanoparticle protein corona. Classifiers can be tested through experimental validation of single-protein binding affinities, or by predicting and validating protein adsorption from biofluids different from those used in the training set.

Considering that current studies have confirmed that the prediction of NP–biological interactions depends on the protein corona131,133, machine learning algorithms have also successfully utilized protein corona data sets to predict Alzheimer disease53 and various types of cancers19,26 and can potentially predict other biological processes, such as subcellular localization of NPs or the induction of epigenetic mechanisms by nanomedicines. Future efforts that support the prediction of the protein corona, as well as the comprehension of the relationships between NP physicochemical properties and corona formation, should rely on extensive libraries of well-characterized NPs and proteomic data sets of NP coronas to establish an accurate, reliable and easily accessible bioinformatic database from which data can be extracted for machine learning applications. AlphaFold, an artificial intelligence system developed by DeepMind whose predictions are available through a database built with EMBL-EBI, can predict protein 3D structures from amino acid sequences with high accuracy136,137 and could potentially be extended to improve the predictions of protein structures likely to form the NP protein corona and eco-corona. Furthermore, the aforementioned studies emphasize the need for continued data collection to feed into machine learning models, together with developing more sophisticated models with greater predictive power, for the fabrication of safe, sustainable and effective nanomedicines.

Emerging opportunities offered by the protein corona

Although the formation of the protein corona on the surface of NPs usually causes several negative consequences (such as mistargeting and inducing errors in nano-based assays), it also enables new opportunities to address a wide range of issues, from early detection of diseases to the eco-corona (Fig. 4). In other words, recent progress in the field of the protein corona has revealed the other side of the coin: the usefulness of the protein corona in the design of new diagnostic and/or therapeutic techniques.

Fig. 4: New emerging technologies offered by the protein corona.

Nanoparticles (NPs) specifically decorated with targeting moieties such as immune system activating proteins can serve to modulate the immune system and catalyse the design of new therapeutics. Proteomics analysis of protein coronas of various NPs (for example, sensor array) provides a unique opportunity for identification of novel biomolecular patterns with disease detection capacity. The eco-corona forms when NPs enter ecological environments, resulting in spontaneous protein adsorption from ecological sources.

Design of new therapies

The coronavirus global pandemic underscored that a virus can modify its virulent surface protein, altering its interactions with the human immune system138,139,140. The nanomedicine community can learn from the rich literature on various interactions between biosystems and nanosized viruses, including the coronavirus, and their effects (such as immune system recognition and response). By combining these lessons with our understanding of journeys of NPs in biosystems141,142,143,144, the nanomedicine community should strive to fully and mechanistically understand biosystem interactions with corona-coated NPs, from recognition to the relevant cellular processing and pathways32. Specific attention should be paid to the recognition of NPs by biosystems through the patterns and organizations of their ‘rare proteins’ rather than their bulk compositions. In other words, achieving a deep understanding of the critical role of patterns and organizations of less-abundant proteins in the protein corona profile of NPs enables the precise manipulation of biosystem responses to NPs and the design and development of novel and efficient protein corona therapies.

The sex and age of the biosystems are important in their responses to NPs, yet poorly considered in nanomedicine145. Notably, recent findings on the interactions of the coronavirus with the immune system revealed the critical roles of age146 and sex147 in immune responses. For example, it was shown that male-derived immune cells produce higher levels of innate immune cytokines in blood plasma compared with female-derived cells, and more robust T cell activation occurs in female-derived cells compared with male-derived cells during coronavirus infection.

Once the aforementioned knowledge and understanding is achieved, researchers will be more capable of designing and developing new or adapted nanotherapeutics (such as nanoimmunotherapy by activating desired immune system pathways and mRNA and gene editing nanocarriers), by precisely designing and controlling the protein corona composition and decoration of the surface of NPs. The effects of the contributing factors (such as sex and age) on the safety and efficacy of NPs are strongly dependent on the type of NPs and their potential payload (such as mRNA, proteins and active biomolecules and pharmaceuticals). For monitoring the role of sex and age on the interactions of payload-free NPs with biosystems, for example, researchers can use ‘empty’ coronavirus-like particles and/or membranes.

More in-depth information and analysis of the biological nanoscale recognition mechanisms and responses are offered in another perspective32. If the role of sex and age is considered in the design and development of new nanotherapeutics (based on the effects of their potential payloads), NPs with engineered biological identities can target and/or activate sex-associated and/or ageing-associated pathways, which could improve the safety and therapeutic efficacy of nanomedicine products for both sexes and all ages.

Proteomics

One of the positive attributes of the NP protein corona is that the abundance of its proteins is different from the protein composition of the native biofluid8,148,149. In other words, NPs can enrich or deplete specific proteins in their corona profiles, regardless of the abundance of these proteins in biological fluids, which can be useful for protein identification and characterization purposes. As such, the protein corona has a unique potential to overcome major problems in the global discovery of plasma proteomics (Box 2), such as biomarker discovery8,26,27,33,71,148. The composition of protein coronas on the surface of identical NPs strongly depends on the type of disease(s) the plasma donors have (known as ‘personalized’ or ‘disease-specific’ protein corona)27. This concept has been validated and used by various groups for studying personalized and disease-specific protein corona interactions with biosystems53,78,150,151,152,153,154,155,156. Although the health condition of plasma donors can alter the protein corona profile of identical NPs27, fewer proteins are identified in the corona than in the native plasma biofluid (from a few tens to several hundred in the protein corona, depending on the physicochemical properties of NPs and the sensitivity of MS)8,26,123. The low number of plasma proteins in the protein corona relative to the native plasma biofluid may increase the sensitivity and specificity of disease and/or biomarker detection using the protein corona approach, assuming that target proteins are among the ones adsorbing to and forming the NP protein corona. For example, blood-circulating liposomes in tumour-bearing mice could capture secreted tumour-specific proteins of human cells that had been used for tumour creation34.

To improve the specificity and sensitivity of disease detection, increasing the number of uniquely identified plasma proteins can be done using a ‘protein corona sensor array’33. The protein corona sensor array combines the NP protein corona with sensor array technology and machine learning for the robust identification of biomarker patterns for the detection of diseases33. The protein corona on the surface of various NPs (which are the sensor array elements), formed after interactions with healthy and various disease plasmas (or other biological fluids), is analysed by machine learning to identify protein/biomolecular patterns that have critical roles in the identification and discrimination of individual diseases. The cross-reactive interactions of the protein classes with NPs may provide unique fingerprints (that is, sensor-specific biomarkers) for each type of disease, which would facilitate disease identification and discrimination33. It is noteworthy that increasing the number of sensor array elements (by adding distinct NPs) can provide more proteomics data from plasma proteins, which, in turn, can increase the sensitivity and specificity of the machine learning algorithm for the detection, discrimination and prediction of diseases33. The sensor array mimics the human olfactory system, which can identify and discriminate ≥10,000 different odourants157. However, there are nowhere near 10,000 specific receptors for the lock-and-key identification of each odourant. Instead, recognition specificity comes from pattern recognition (cross-responsive receptors that produce composite responses unique to each odourant)158. Theoretically, the purpose of a sensor array is to identify, discriminate and quantify analytes and biomolecules much more sensitively and easily than specific individual sensors can. For example, there is no singular sensor capable of distinguishing NPs of different shapes and sizes. 
However, complex identification and discrimination of various NPs, even at very low concentration (100 ng ml−1), is easily achievable by the use of a sensor array159.
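The pattern-recognition principle described above can be illustrated with a very small classifier. In the sketch below each sample is represented by one response value per sensor array element, and an unknown sample is assigned to the class whose average fingerprint is closest. Nearest-centroid classification is used here only because it is easy to follow; the source does not specify this algorithm, and the class labels and response values are hypothetical.

```python
import math


def centroid(vectors):
    """Element-wise mean of equally long response vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_centroid(training, sample):
    """Label of the class whose mean fingerprint is closest to `sample`."""
    centroids = {label: centroid(vs) for label, vs in training.items()}
    return min(centroids, key=lambda label: math.dist(centroids[label], sample))


# Hypothetical fingerprints: three array elements (distinct NPs), each giving
# one composite response per plasma sample. No single element is specific to
# a disease; the combination of responses is what separates the classes.
training = {
    "healthy": [[0.2, 0.5, 0.1], [0.3, 0.4, 0.2]],
    "disease_A": [[0.8, 0.5, 0.1], [0.7, 0.6, 0.2]],
    "disease_B": [[0.2, 0.5, 0.9], [0.3, 0.4, 0.8]],
}
unknown = [0.75, 0.5, 0.15]
predicted = nearest_centroid(training, unknown)
print(predicted)
```

Adding array elements lengthens each fingerprint vector, which is the sense in which more distinct NPs can supply more information for discrimination, provided the added elements respond differently across the classes.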

As in olfactory arrays, the specificity in protein corona sensor arrays comes from pattern recognition, in which the sensor array elements produce specific protein corona patterns unique to each disease type. Using liposomes with three distinct surface properties, a protein corona sensor array successfully identified and discriminated five distinct types of human cancers (lung, glioblastoma, meningioma, myeloma and pancreatic cancers) through the creation of unique protein corona patterns on NP surfaces that served as fingerprints for each type of cancer33. By increasing the number of NPs used, one can expect more robustness in the detection and discrimination capability of the platform. The robustness of the protein corona sensor array for the identification and discrimination of cancers at very early stages was assessed using Golestan cohort plasmas that were obtained from healthy individuals who would go on to be diagnosed with lung, pancreas and brain cancers several years after initial plasma collection. The outcomes revealed that the protein corona sensor array could robustly identify and discriminate among the cancers years before the patients developed clinical symptoms33.

For disease diagnostic purposes, the main advantage of using sensor arrays is that they can recognize patterns of rare proteins rather than conventional biomarkers. This is mainly because biomarkers mostly refer to specific biomolecules (such as proteins) that are elevated, rather than emergent, in the blood plasma of patients (that is, ‘turning on and off’ in disease and healthy conditions, respectively), each of them having a diagnostic value in the analysis of human plasma. However, the presence of proteins in the protein corona that in general are recognized as biomarkers (with high confidence being placed on protein corona purity) does not represent their elevation in blood plasma. In other words, there is little correlation between the concentration of proteins in plasma and at the surface of the NPs (that is, elevation of one protein in plasma does not cause elevation of that particular protein in the corona composition). In addition, elevation of biomarkers in plasma may substantially change the entire composition of the protein corona. Human plasma also undergoes substantial changes in the composition of small biomolecules such as metabolites (such as glucose) and lipids (such as cholesterol)160,161 during disease occurrence and progression, which further alters the interaction of plasma proteins with NPs (as shown by both simulation162 and experimental163 results). In contrast to biomarkers, the pattern recognition of rare proteins in the corona layer, associated with disease occurrence and progression, can provide more robust diagnostic outcomes. In addition, other biomolecules in plasma (such as lipids, metabolites and nucleic acids) can affect the patterns of rare proteins. 
Overall, for disease detection purposes, simply searching and finding known disease-specific biomarkers that are elevated in plasmas of patients in the protein corona profiles, rather than defining the corona pattern of rare proteins, may induce substantial errors in the sensitivity and specificity of disease identification and prediction.

In another study using silica-coated multicore superparamagnetic iron oxide NPs with various physicochemical properties, protein corona compositions were prepared, collected and subjected to mass spectrometry analysis in an automated manner for efficient proteomic profiling; over 2,000 proteins were identified from plasma samples26. Because of its automated nature, the technology, which is now commercially available, could be used for the rapid and high-throughput profiling of plasma proteomes for biomarker discovery26.

The unique emerging role of the NP protein corona in proteomics may be leveraged for the rapid and early screening and discrimination of various diseases (such as cancers and neurodegenerative disorders), in which very early detection could result in the early initiation of therapeutic options, improving patient quality of life and patient outcomes.

The eco-corona

Although most studies on the protein corona relate to its use in nanomedicine through preclinical applications, emerging research points to the environmental aspect of the protein corona, termed the ‘eco-corona’, and its role in nano-ecotoxicology35. The increasing production and use of NPs, including nanoplastics and nanomedicines, result in uncontrolled nanomaterial release into the environment164,165,166. Moreover, the integration of NPs to enhance agricultural production167,168,169,170 may be another source of NP release into the environment. To fully understand how NPs biodistribute and accumulate in our environment, understanding eco-corona formation on NP surfaces is imperative. Analogous to the formation of the protein corona from blood-based biofluids in nanomedicine, the formation of the eco-corona on NPs occurs through the adsorption of proteins171, humic substances23,172,173,174,175, metabolites171 and natural organic matter (NOM)176. The formation of the eco-corona outside aquatic organisms, as well as within organisms following NP internalization, was reviewed elsewhere35. Biofluid-derived protein corona studies, which have studied the interactions between nanomaterials and biomolecules in complex biological fluids, can be instrumental for analysing and understanding the real-world eco-corona. Moreover, extracting the real-world eco-corona is challenging and complex, as NPs move among industrial, aquatic, terrestrial and atmospheric environments. Box 3 features and relates eco-corona and biofluid-derived protein corona research.

Characterization of adsorbed amphiphiles (that is, humic substances, peptides, fatty acids and NOM) on the eco-corona is deficient because of the extensive heterogeneity of amphiphiles in the environment. It was found that amphiphiles derived from algal exudates exhibit Vroman-like competitive binding, in which proteins with high binding affinities displace proteins with low binding affinities, on graphene nanosheet surfaces177. Soft and hard eco-coronas form with amphiphiles possessing varying binding affinities to the graphene nanosheets. The protein component of the eco-corona is frequently scrutinized as proteins can engage with receptors and influence biological signalling in organisms and may impact ecological systems through biotransformation and environmental distribution35,178. Comparably to the biofluid-derived protein corona detailed earlier, adsorbed ecological substances on NP surfaces can alter the identity, properties and biological impacts of NPs179,180. For example, secreted ecological proteins from planktonic Daphnia magna formed an eco-corona on polystyrene NPs, which then became more toxic than their pristine counterparts in vivo; the clearance and ability of D. magna to feed were negatively impacted following exposure to the eco-corona-coated plastic NPs171. Adsorption of humic acid on zinc oxide NPs increased the hatching success of NP-exposed zebrafish embryos. However, NP toxicity was not ameliorated by the eco-corona172. Attenuated NP toxicity towards D. magna was observed following exposure to eco-corona-coated tungsten carbide cobalt NPs; the reduced toxicity was speculated to be mediated through NP–organism interactions after the NOM-rich eco-corona NPs were internalized181. Similarly, formation of the eco-corona on silver NPs, through the adsorption of NOM collected from river water, decreased the bactericidal efficacy of NPs in the bacteria Shewanella oneidensis MR-1 (ref. 182). 
It is of vital importance to characterize NP eco-coronas to establish the environmental fate and eco-toxicity of NPs and engineer safer nanotechnologies, such as innocuous agricultural sensors with enhanced efficacy183,184. Characterizations of the protein corona in human biofluids have enabled NPs to gain much traction in clinical practice, yet the eco-corona remains much more sparsely characterized within a much larger exposure space. As such, the current state of the field supports the need to develop standardized protocols and improved technologies to extract and assess the eco-corona and its biological and ecological impacts under environmentally realistic conditions, a goal that has not been feasible to date (Box 3).

It is now well understood that the environmental release of engineered NPs, whether intentional (such as in the use of NPs for environmental or agricultural applications) or unintentional (such as through product degradation), is inevitable35,164,185. NPs interact with a wide range of environmental components, from air and water to living organisms including bacteria and plants, in a component-specific manner. In addition, NPs can readily move between environmental components and even enter the food web. For example, NPs in air, soil and/or water can readily transfer into plants, vegetables, fruits and fish and interact with them at various levels depending on the type of organism, plant, vegetable or fruit164,186,187,188,189. Ingestion of environmentally relevant nanoplastics190,191 can increase fat absorption by gastrointestinal cells, an effect modulated by the biomolecular corona of the digested nanoplastics39. Therefore, from the perspective of environmental and food safety, monitoring and management, it is essential to understand the environmental journey of NPs.

A deeper understanding of the formation of the eco-corona at the surface of NPs may provide essential information regarding their environmental transfer before and after their uptake by an organism and even their translocation between organisms or species. Before an NP enters an environmental milieu, its eco-corona may already consist of a mixture of (bio)molecules originating from previous exposure to an organism, having accumulated various types of biomolecules including proteins, lipids and metabolites, depending on the organism type192. For example, in plants, the apoplast (which comprises the intercellular space, the cell walls and the xylem, a vascular tissue) contains various types of proteins, lipids and other plant-specific biomolecules192. Therefore, analysis of the eco-corona of NPs may shed light on the environmental journey of NPs and can assist decision makers in environmental and food safety, monitoring and management, enabling them to take appropriate action on current challenges in the field of environmental nanotoxicity.

Conclusion and perspectives

The mechanistic understanding of the protein corona, which has been the object of the majority of biomolecular corona studies thus far, has enabled the development of more efficient and safer nanotechnologies that can bridge nanotechnological advances and clinical applications. Achieving robust characterization of the formation and evolution of the protein corona in various environments not only improves our understanding of the interactions between biosystems and NPs but also enables the prediction of such interactions with high accuracy. Although protein corona formation creates inertia in the clinical translation of nanomedicinal technologies, the protein corona also creates new opportunities for multiplexed diagnostics and for protein corona engineering for targeted delivery and sensing applications. The dynamic nature of the protein corona also provides a paradigm shift in our understanding of the unintended environmental journey of NPs. The adsorption of environmental proteins and natural organic matter onto nanomaterials, whether intentionally deployed for agricultural applications or indirectly introduced into the environment via manufacturing processes and waste disposal, is not static; the eco-corona can undergo various transformations and ecological interactions as nanomaterials transit the environment.

Achieving reproducible data sets across research groups involved in protein corona research remains one of the major issues in the field. This poor reproducibility is, at least in large part, due to the variability of the biological systems in which the protein corona has been studied (for blood plasma, for example, sex, gender, age, ethnicity and health spectrum)99,145, the lack of unifying standards for nano-bio characterization protocols193 (specifically for MS)123 and variabilities in experimental reporting strategies194,195. The nanomedicine community would also benefit from the development of new methodologies to probe the conformation of proteins in the biomolecular corona layer. Although some techniques, such as circular dichroism, can probe the conformational changes of single proteins after interactions with the surface of NPs, robust methods to probe conformational changes of multiple proteins in the corona layer are lacking. Information on multiprotein conformational changes in the biomolecular corona layer would help to understand and predict the recognition of NPs by the immune system and its responses to nanomedicine technologies.

Another issue in protein corona research is that comparatively little attention has been paid to how the patterns and organization of low-abundance proteins and biomolecules, as opposed to the bulk corona composition, affect interactions with biosystems. Robust and precise characterization of these low-abundance components can lead to a deep understanding of biosystem responses, enabling the development of new, safe and efficient therapeutic approaches. In addition, the critical role of disease-specific lipids (such as cholesterol)162 and metabolites (such as glucose)162, which can substantially alter the interaction of plasma proteins with NPs and, consequently, affect the patterns and organization of the rare proteins and biomolecules in the corona layer, should be studied and considered.

A further obstacle is the lack of thorough investigation of the protein corona profiles of certain NPs that are widely used in clinical trials or in the clinic. For example, achieving robust protein corona characterization on lipid NPs (specifically those used for COVID-19 vaccines)196 and NPs that are being used for payload delivery to the central nervous system197 will enable researchers to better understand and predict biosystem responses to the NPs and, therefore, develop safer and more efficient nanotherapeutics.

Although the investigation of NPs in human biofluids has been an ongoing field of research over the past few decades, the use of nanotechnologies in plants and agriculture is relatively nascent183. Environmental stressors, such as climate change, pathogens and population growth, complicate agricultural practices. Nanobiotechnology offers sustainable solutions by adapting nanomaterials as nanosensors that can monitor plant analytes, augment crop stress resistance and track productivity by reporting metrics that can advance agronomy. Nanomaterials have also served as carriers of plasmid DNA and small interfering RNA168 for genetic and post-transcriptional biofortification of crop plants. NPs have also been used for pesticide and nutrient delivery198 and as elicitors of plant-defence responses199. Despite the prolific and growing use of nanotechnology in agriculture, the nano–bio interactions underlying these nanotechnologies remain largely unknown. In plants, the use of nanotechnologies results in spontaneous biomolecule adsorption from plant fluids onto nanosensor surfaces. The protein corona that forms on NPs in plants remains largely unexplored yet dictates NP interactions with plants and the environment183. The growing interest in nanotechnologies for agriculture motivates further exploration of the fundamental nano–bio interactions that drive spontaneous protein and eco-corona formation before these nanotechnologies are deployed in the field.

Regardless of the NP application space, whether in the clinic or in the environment, the experimental approaches used to characterize the NP protein corona, together with biological variability across different systems, limit the throughput of experimental validation of nanotechnologies. The protein corona has vast implications when applying NPs in biological systems, unpredictably changing outcomes such as NP function, biodistribution and toxicity. Yet, to date, there are orders of magnitude more reports of nanotechnologies developed for supposed biological application than reports of their interactions with biologically relevant milieus. This discrepancy arises because testing nanotechnologies for biocompatibility and biofouling is a costly and time-consuming process that often relies on high-throughput MS or similar experiments with lower throughput. By implementing AI and machine learning algorithms, we can identify protein features that contribute to protein adsorption to, or omission from, the NP protein corona128,130. A well-trained machine learning classifier may also enable rapid determination of protein corona composition from entirely new biofluid or NP data sets (that is, unfamiliar relative to the training data), which can be confirmed experimentally using surface binding measurement assays40. In this manner, machine learning supports the development of predictive protein corona models that will enable researchers to implement a wider range of nanotechnologies across different biological environments. Developments in supervised learning tools provide algorithms that researchers can use to parse protein properties from publicly available databases to predict protein corona formation, as a step towards in silico testing of nanotechnologies for their biocompatibility and biofouling propensities in a broad range of applications.
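To make the supervised learning idea concrete, the following is a minimal sketch of a binary classifier that predicts whether a protein is "in corona" or "not in corona" from a few numerical protein properties. It is not the method of the cited studies: a plain logistic regression written with the Python standard library stands in for whatever classifier a real study would use, the three features (isoelectric point, hydrophobic residue fraction and log mass) are hypothetical choices, and the labels come from an arbitrary toy rule rather than from MS data. All function names are illustrative.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_proteins(n, seed=0):
    """Generate synthetic feature rows and labels (NOT real protein data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        p_i = rng.uniform(4.0, 10.0)          # hypothetical isoelectric point
        hydrophobic = rng.uniform(0.2, 0.6)   # hypothetical hydrophobic fraction
        log_mass = rng.uniform(1.0, 2.5)      # hypothetical log10(mass in kDa)
        # Arbitrary toy labelling rule, assumed only for illustration.
        score = 8.0 * (hydrophobic - 0.4) + 0.5 * (p_i - 7.0)
        rows.append([p_i, hydrophobic, log_mass])
        labels.append(1 if score > 0 else 0)
    return rows, labels


def standardize(rows):
    """Scale each feature to zero mean and unit variance; return the scaling."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return apply_scaling(rows, means, stds), means, stds


def apply_scaling(rows, means, stds):
    """Apply a previously fitted scaling to new rows."""
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def predict_proba(w, b, x):
    """Predicted probability that a protein is in the corona."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi)
               for xi, yi in zip(X, y))
    return hits / len(y)


if __name__ == "__main__":
    X_raw, y_train = make_toy_proteins(200, seed=0)
    X_train, means, stds = standardize(X_raw)
    w, b = train_logistic(X_train, y_train)
    X_new_raw, y_new = make_toy_proteins(100, seed=1)
    X_new = apply_scaling(X_new_raw, means, stds)
    print("held-out accuracy:", accuracy(w, b, X_new, y_new))
    print("feature weights:", [round(v, 2) for v in w])
```

In a real workflow, the synthetic rows would be replaced by protein properties parsed from public databases and the labels by experimentally determined corona membership; the fitted weights then give a first indication of which features are associated with adsorption, subject to experimental confirmation.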