Abstract
As different taxa evolve, gene order often changes slowly enough that chromosomal ‘blocks’ with conserved gene orders (synteny) are discernible. The MCScanX toolkit (https://github.com/wyp1125/MCScanX) was published in 2012 as freely available software for the detection of such ‘colinear blocks’ and subsequent synteny and evolutionary analyses based on genome-wide gene location and protein sequence information. Owing to its simplicity and high efficiency for colinear block detection, MCScanX provides a powerful tool for conducting diverse synteny and evolutionary analyses. Moreover, the detection of colinear blocks has been embraced as an integral step for pangenome graph construction. Here, we explore new application trends of MCScanX, aiming to better connect this increasingly used tool to other tools and to accelerate insight generation from exponentially growing sequence data. We provide a detailed protocol that covers how to install MCScanX on diverse platforms, tune parameters, prepare input files from National Center for Biotechnology Information (NCBI) data, run MCScanX and its visualization and evolutionary analysis tools, and connect MCScanX with external tools, including MCScanX-transposed, Circos and SynVisio. This protocol is easily implemented by users with minimal computational background and is adaptable to new data of interest to them. The data and utility programs for this protocol can be obtained from http://bdx-consulting.com/mcscanx-protocol.
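To illustrate the input-preparation step mentioned above, the following is a minimal Python sketch, not one of the protocol's own utility programs. It assumes that the gene-location input for MCScanX is a four-column, tab-separated table (chromosome, gene identifier, start, end), as described in the MCScanX README, and that the source annotation is a standard GFF3 file whose gene features carry an `ID=` attribute. The function names and the `prefix` argument (a short species tag prepended to chromosome names) are hypothetical.

```python
import io


def gff3_to_mcscanx(gff3_lines, prefix=""):
    """Yield (chromosome, gene_id, start, end) for each 'gene' feature.

    Assumption: gene features carry an ID= attribute in column 9.
    """
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 directives/comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # keep only well-formed gene records
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        gene_id = attrs.get("ID")
        if gene_id is None:
            continue
        yield (prefix + cols[0], gene_id, int(cols[3]), int(cols[4]))


def format_rows(rows):
    """Render rows as tab-separated text, one gene per line."""
    return "".join(f"{c}\t{g}\t{s}\t{e}\n" for c, g, s, e in rows)


# Tiny in-memory example; a real run would read an annotation file instead.
example = (
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=geneA;Name=A\n"
    "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=rnaA;Parent=geneA\n"
    "chr2\tsrc\tgene\t50\t400\t.\t-\t.\tID=geneB\n"
)
print(format_rows(gff3_to_mcscanx(io.StringIO(example), prefix="at")), end="")
```

The identifiers written to this table need to match those used in the protein similarity (BLAST) table. Annotations that key proteins by a different attribute would require a mapping step that this sketch does not cover.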
Key points
- During evolution, chromosomes are dynamically reorganized by duplication, inversion and translocation. Comparative analysis of genomes is empowered by identifying homologous genes that remain in their ancestral positions (colinearity). The MCScanX software package aids evolutionary studies by facilitating the detection of colinearity blocks, in particular, enabling alignments of multiple chromosomes (or segments).
- MCScanX generates easy-to-read output files and has built-in visualization tools for easy representation of evolutionary insights.
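The idea of a colinear block can be made concrete with a toy Python sketch. This is a deliberately simplified illustration and not the algorithm implemented in MCScanX. It assumes that genes on two chromosomes have been replaced by their rank along the chromosome, that homologous pairs are given as `(i, j)` rank pairs, and that only same-orientation blocks are of interest. The function name and the `max_gap` and `min_size` parameters are hypothetical.

```python
def colinear_runs(pairs, max_gap=2, min_size=3):
    """Group homologous gene pairs into simple forward colinear runs.

    pairs: iterable of (i, j) gene ranks on two chromosomes.
    A pair extends a run when both ranks advance, with at most
    max_gap intervening genes on either chromosome.
    Runs with fewer than min_size pairs are discarded.
    """
    open_runs = []
    for i, j in sorted(set(pairs)):
        for run in open_runs:
            pi, pj = run[-1]
            if 0 < i - pi <= max_gap + 1 and 0 < j - pj <= max_gap + 1:
                run.append((i, j))  # extend the first compatible run
                break
        else:
            open_runs.append([(i, j)])  # no compatible run: start a new one
    return [run for run in open_runs if len(run) >= min_size]


# Two conserved segments plus two scattered (non-colinear) homolog pairs.
homolog_pairs = [
    (1, 1), (2, 2), (3, 50), (4, 4), (5, 5),
    (20, 30), (21, 31), (23, 32), (40, 7),
]
for block in colinear_runs(homolog_pairs):
    print(block)
```

In this example the pairs `(3, 50)` and `(40, 7)` are homologous but not colinear with their neighbours, so they do not join either block. A production tool additionally has to handle inverted blocks and weigh matches against gaps, which this greedy sketch does not attempt.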
Data availability
Users can download the reference genome sequences and gene annotations of this study from Phytozome 13 (https://phytozome-next.jgi.doe.gov) and NCBI genome (https://www.ncbi.nlm.nih.gov/datasets/genome). Small-scale testing data for MCScanX are available under the ‘data’ folder of the MCScanX package. The data and programs for this protocol are wrapped as a ZIP file named ‘MCScanX_protocol.zip’, which is available at http://bdx-consulting.com/mcscanx-protocol.
Code availability
MCScanX is available at https://github.com/wyp1125/MCScanX. MCScanX-transposed is available at https://github.com/wyp1125/MCScanX-transposed. Programs especially for this protocol are available within the ‘MCScanX_protocol.zip’ file, which is available at http://bdx-consulting.com/mcscanx-protocol.
Acknowledgements
A.H.P. appreciates funding from the National Science Foundation (NSF: DBI 0849896, MCB 0821096 and MCB 1021718) and from a Regents Professorship from the University System of Georgia. H.T. is supported by funding from the National Key Research and Development Program (2021YFF1000104). X.W. is supported by funding from the China Natural Science Foundation program (32070669). P.V.J. is supported by the National Institute on Alcohol Abuse and Alcoholism under award number Z01AA000135, the National Institute of Nursing Research and the Rockefeller University Heilbrunn Nurse Scholar Award. P.V.J. is also supported by the Office of Workforce Diversity, National Institutes of Health Distinguished Scholar Program.
Author information
Contributions
A.H.P. conceived the project. A.H.P. and P.V.J. guided the project. X.W. and H.T. developed early versions of the genome alignment algorithms. Y.W. improved on algorithms and developed the MCScanX protocol. Y.S. performed the experiments. Y.W., P.V.J. and A.H.P. drafted the manuscript. All authors contributed to the final writing and editing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Protocols thanks Ingo Ebersberger and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Key references using this protocol
Tang, H. B. et al. Genome Res. 18, 1944–1954 (2008): https://doi.org/10.1101/gr.080978.108
Wang, Y. et al. Nucleic Acids Res. 40, e49 (2012): https://doi.org/10.1093/nar/gkr1293
Wang, Y. et al. PLoS One 11, e0155637 (2016): https://doi.org/10.1371/journal.pone.0155637
Supplementary information
Supplementary Table 1
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Tang, H., Wang, X. et al. Detection of colinear blocks and synteny and evolutionary analyses based on utilization of MCScanX. Nat Protoc (2024). https://doi.org/10.1038/s41596-024-00968-2
DOI: https://doi.org/10.1038/s41596-024-00968-2