In a recent Nature paper, Mensah et al. demonstrate that a significant subset of frameshift mutations in C-terminal intrinsically disordered regions (IDRs) can alter phase separation propensities, cause protein mislocalization into the nucleolus and disrupt nucleolar function. The work addresses the lack of molecular mechanisms for disease-associated variants in IDRs and provides strong support for the general view that IDR variants cause defects through their impact on biomolecular condensates.

Current genetic approaches have greatly increased the number of known disease-associated variants, with up to a quarter of such variants localized to intrinsically disordered regions (IDRs).1 However, the molecular mechanisms by which IDR variants drive disease pathology are not well understood. Certain IDRs are now appreciated to have a role in the assembly of biomolecular condensates, including membraneless organelles, through dynamic, multivalent interactions involving IDRs, folded domains, and nucleic acids.2 Biomolecular condensates regulate many essential processes, including chromatin accessibility, transcription, splicing, translation, and cellular signaling.2 Disease-associated variants in certain IDRs have thus been proposed to alter regulation of biomolecular condensates and promote aberrant states.3,4 IDR variants may decrease or increase the threshold concentration for phase separation, in some cases abrogating existing condensates or creating novel ones. Variants can change the solvent properties of a condensate, affecting the partitioning of biomolecules and enzymatic activities and resulting in mislocalization and gain or loss of function. Other physical properties, including viscoelasticity and molecular diffusion, can also be altered, potentially enhancing transitions to fibers, which are associated with neurodegenerative disorders.5

Proteins associated with some common diseases exhibit higher predicted phase separation propensity relative to the human proteome, and in the cases of autism spectrum disorder (ASD) and cancer, high percentages of genetic variants map to IDRs.3 One example is the seven-residue segment in the N-terminal IDR of eIF4G1 encoded by a microexon.6 Absence of the microexon, an ASD-associated deficiency, reduces eIF4G1 phase separation in vitro and coalescence of neuronal granule components, over-stimulating translation and leading to ASD-like cognitive defects in mice. That study showed how changes in phase separation of disease-associated proteins could provide a useful framework for understanding pathological effects of IDR variants.

The recent Nature paper by Mensah et al. builds on previous studies to provide compelling evidence in support of this view, demonstrating that protein phase separation is altered by a subset of disease-associated variants in IDRs, leading to mispartitioning into the nucleolus and disruption of nucleolar function.7 The authors show that five individuals diagnosed with brachyphalangy, polydactyly and tibial aplasia/hypoplasia syndrome share a common heterozygous frameshift mutation in the final exon of the DNA-binding factor HMGB1, replacing the C-terminal acidic IDR with a basic, arginine-rich region followed by a hydrophobic patch. The frameshift variant escapes nonsense-mediated decay, as evidenced by mutant HMGB1 transcripts found in patient peripheral blood cells. Mutant HMGB1 has a lower phase separation threshold concentration than wild-type (WT) HMGB1 and forms amorphous condensates with reduced diffusion in vitro. While WT HMGB1 has a diffuse nuclear distribution, mutant HMGB1 preferentially partitions into the granular component of the nucleolus and displaces the key nucleolar protein, NPM1. HMGB1 mislocalization is associated with slower nucleolar translational diffusion and reductions in 28S rRNA levels and cell viability, consistent with nucleolar dysfunction. Notably, the authors show that the mislocalization is due to the arginine-rich region, while loss of condensate dynamics is associated with the hydrophobic patch. These exciting results provide evidence for how IDR variants can promote aberrant pathological states through alterations in the phase behavior underlying condensate function.

Importantly, the authors generalize their findings by identifying >200,000 disease variants in C-terminal IDRs, of which >600 frameshift mutations result in arginine-rich regions. When ectopically expressed, numerous transcription factors bearing arginine-rich IDR frameshift mutations often mispartitioned into the nucleolus, with some causing nucleolar dysfunction. These data support a widespread mechanism by which IDR variants can disrupt protein function through aberrant phase behavior, including (mis)partitioning into condensates, and expand appreciation of the biological function of IDRs and biomolecular condensates. They also underscore the role of IDR bioinformatics tools in providing insights into IDR function and pathology, including involvement in condensates. While IDRs exhibit poor positional sequence alignment, they contain molecular features that are conserved over evolution and that are useful for predicting specific biological functions.8 Arginine-rich and hydrophobic character represent only two of the many IDR molecular features that regulate condensate partitioning. Thus, this C-terminal IDR variant catalog provides a model for the development of other catalogs to stimulate investigation of the effects of insertions/deletions, splicing/microexon variants,6 and other mutations not explored in this work. The catalogs from this and other studies will serve as invaluable resources for generating bioinformatics-derived hypotheses to be tested experimentally in order to evaluate how disease-associated IDR variants affect condensates, enriching our understanding of genetic variation in biology and pathology.
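
To make the kind of sequence-based screen described above concrete, the following is a minimal Python sketch of how one might flag a frameshifted C-terminal tail that has gained an arginine-rich stretch. It is not the pipeline used by Mensah et al.; the function names, the 20-residue window, the 0.25 arginine-fraction cutoff, and the toy sequences are all illustrative assumptions.

```python
def max_arginine_fraction(seq: str, window: int = 20) -> float:
    """Highest fraction of arginine (R) in any window of the sequence.

    The window length is a hypothetical choice, not taken from the paper.
    Sequences shorter than the window are scored as a whole.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    if len(seq) <= window:
        return seq.count("R") / len(seq)
    return max(
        seq[i:i + window].count("R") / window
        for i in range(len(seq) - window + 1)
    )


def gains_arginine_rich_region(
    wt_tail: str,
    mutant_tail: str,
    window: int = 20,
    threshold: float = 0.25,  # hypothetical cutoff for "arginine-rich"
) -> bool:
    """True if the mutant tail is arginine-rich while the wild-type tail is not."""
    return (
        max_arginine_fraction(mutant_tail, window) >= threshold
        and max_arginine_fraction(wt_tail, window) < threshold
    )


# Toy sequences for illustration only (not the real HMGB1 sequences):
# an acidic wild-type tail and a basic tail followed by a hydrophobic patch.
toy_wt_tail = "DEDEEEDDDDEDEEDEEDDE"
toy_mutant_tail = "RKRRGRRRSRRKMRRRAAVLLFFV"

flagged = gains_arginine_rich_region(toy_wt_tail, toy_mutant_tail)
```

A screen of this kind only nominates candidates; whether a flagged variant actually mispartitions into the nucleolus still has to be tested experimentally, as the authors did by ectopic expression.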