Data availability
Genome assemblies of Cavendish, Gros Michel and Zebrina v2.0 have been deposited at the National Center for Biotechnology Information (NCBI) under GenBank accession numbers JAVVNX000000000, JAVVNW000000000 and JAVVNV000000000, and in the National Genomics Data Center BioProject database (https://ngdc.cncb.ac.cn/bioproject/) under accession number PRJCA019650. Genome assemblies with annotations, together with the ChIP–seq and DNase-seq results, can be accessed at FigShare (https://figshare.com/projects/Origin_and_evolution_of_the_triploid_cultivated_banana_genome/178041). Raw data used for the assemblies, including PacBio, Illumina and Hi-C data, are available through the NCBI Sequence Read Archive (SRA) under BioProject PRJNA1017453, with SRA accessions SRR23425440 to SRR23425472 and SRR23885547 to SRR23885549. Fifty-eight RNA-seq datasets were downloaded from NCBI BioProject accessions PRJNA381300, PRJNA394594 and PRJNA598018. DNA methylation data were downloaded from NCBI BioProject PRJNA381300.
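For readers who want to retrieve the raw reads, the two SRA accession ranges above can be expanded into an explicit list of run identifiers. The following is a minimal sketch, assuming that both ranges are contiguous and inclusive, as the wording suggests; the function name expand_accessions is hypothetical and is not part of any NCBI tool.

```python
def expand_accessions(first: str, last: str) -> list[str]:
    """Expand an inclusive range of run accessions, e.g. SRR100 to SRR103.

    Assumes a three-letter prefix (such as 'SRR') followed by digits and
    that every number in the range is a valid run (an assumption).
    """
    prefix = first[:3]
    if last[:3] != prefix:
        raise ValueError("accession prefixes differ")
    width = len(first) - len(prefix)  # preserve any zero padding
    start, end = int(first[3:]), int(last[3:])
    if end < start:
        raise ValueError("range end precedes range start")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


# The two ranges quoted in the Data availability statement.
runs = (
    expand_accessions("SRR23425440", "SRR23425472")
    + expand_accessions("SRR23885547", "SRR23885549")
)

if __name__ == "__main__":
    for run in runs:
        print(run)
```

The resulting list can be written to a file and passed to whichever download tool is preferred.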
Code availability
Custom code and scripts for mapping the origins of chromosomal segments are available at FigShare (https://doi.org/10.6084/m9.figshare.21229205.v1; ref. 70). All public software used in this study is provided in the accompanying Nature Portfolio Reporting Summary.
References
Rouard, M. et al. Three new genome assemblies support a rapid radiation in Musa acuminata (wild banana). Genome Biol. Evol. 10, 3129–3140 (2018).
Langhe, E. D., Vrydaghs, L., Maret, P. D., Perrier, X. & Denham, T. Why bananas matter: an introduction to the history of banana domestication. Ethnobot. Res. Appl. 7, 322–326 (2008).
D'Hont, A. et al. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488, 213–217 (2012).
Wang, Z. et al. Musa balbisiana genome reveals subgenome evolution and functional divergence. Nat. Plants 5, 810–821 (2019).
Davey, M. W. et al. A draft Musa balbisiana genome sequence for molecular genetics in polyploid, inter- and intra-specific Musa hybrids. BMC Genomics 14, 683 (2013).
de Jesus, O. N. et al. Genetic diversity and population structure of Musa accessions in ex situ conservation. BMC Plant Biol. 13, 41 (2013).
Martin, G. et al. Genome ancestry mosaics reveal multiple and cryptic contributors to cultivated banana. Plant J. 102, 1008–1025 (2020).
Kallow, S. et al. Maximizing genetic representation in seed collections from populations of self and cross-pollinated banana wild relatives. BMC Plant Biol. 21, 415 (2021).
Martin, G. et al. Chromosome reciprocal translocations have accompanied subspecies evolution in bananas. Plant J. 104, 1698–1711 (2020).
Baurens, F. C. et al. Recombination and large structural variations shape interspecific edible bananas genomes. Mol. Biol. Evol. 36, 97–111 (2019).
Belser, C. et al. Telomere-to-telomere gapless chromosomes of banana using nanopore sequencing. Commun. Biol. 4, 1047 (2021).
Belser, C. et al. Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps. Nat. Plants 4, 879–887 (2018).
Cenci, A. et al. Unravelling the complex story of intergenomic recombination in ABB allotriploid bananas. Ann. Bot. 127, 7–20 (2021).
Martin, G. et al. Interspecific introgression patterns reveal the origins of worldwide cultivated bananas in New Guinea. Plant J. 113, 802–818 (2023).
Lescot, T. Genetic diversity of banana in figures. FruiTrop 189, 58–62 (2008).
Stokstad, E. Banana fungus puts Latin America on alert. Science 365, 207–208 (2019).
Maxmen, A. CRISPR might be the banana’s only hope against a deadly fungus. Nature 574, 15 (2019).
Busche, M. et al. Genome sequencing of Musa acuminata dwarf Cavendish reveals a duplication of a large segment of chromosome 2. G3 10, 37–42 (2020).
Carreel, F. et al. Ascertaining maternal and paternal lineage within Musa by chloroplast and mitochondrial DNA RFLP analyses. Genome 45, 679–692 (2002).
Christelová, P. et al. Molecular and cytological characterization of the global Musa germplasm collection provides insights into the treasure of banana diversity. Biodivers. Conserv. 26, 801–824 (2017).
Wang, X., Yu, R. & Li, J. Using genetic engineering techniques to develop banana cultivars with Fusarium wilt resistance and ideal plant architecture. Front. Plant Sci. 11, 617528 (2020).
Stokstad, E. GM banana shows promise against deadly fungus strain. Science 358, 979 (2017).
Dale, J. et al. Transgenic Cavendish bananas with resistance to Fusarium wilt tropical race 4. Nat. Commun. 8, 1496 (2017).
Tripathi, L., Ntui, V. O. & Tripathi, J. N. CRISPR/Cas9-based genome editing of banana for disease resistance. Curr. Opin. Plant Biol. 56, 118–126 (2020).
Ahmad, F. et al. Genetic mapping of Fusarium wilt resistance in a wild banana Musa acuminata ssp. malaccensis accession. Theor. Appl. Genet. 133, 3409–3418 (2020).
Lü, P. et al. Genome encode analyses reveal the basis of convergent evolution of fleshy fruit ripening. Nat. Plants 4, 784–791 (2018).
Thomas, B. C., Pedersen, B. & Freeling, M. Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Genome Res. 16, 934–946 (2006).
Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).
Nurk, S. et al. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Res. 30, 1291–1305 (2020).
Koren, S. et al. De novo assembly of haplotype-resolved genomes with trio binning. Nat. Biotechnol. 36, 1174–1182 (2018).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).
Alonge, M. et al. RaGOO: fast and accurate reference-guided scaffolding of draft genomes. Genome Biol. 20, 224 (2019).
Schneeberger, K. et al. Reference-guided assembly of four diverse Arabidopsis thaliana genomes. Proc. Natl Acad. Sci. USA 108, 10249–10254 (2011).
Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845 (2019).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2020).
Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011).
Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644 (2008).
Brůna, T., Lomsadze, A. & Borodovsky, M. GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins. NAR Genom. Bioinform. 2, lqaa026 (2020).
Kriventseva, E. V. et al. OrthoDB v10: sampling the diversity of animal, plant, fungal, protist, bacterial and viral genomes for evolutionary and functional annotations of orthologs. Nucleic Acids Res. 47, D807–D811 (2019).
Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).
Simão, F. A. et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).
Kent, W. J. BLAT—the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).
Spannagl, M. et al. PGSB PlantsDB: updates to the database framework for comparative plant genome research. Nucleic Acids Res. 44, D1141–D1147 (2016).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics Chapter 4, 10.1–10.14 (2009).
Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–467 (2005).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Darriba, D., Taboada, G. L., Doallo, R. & Posada, D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27, 1164–1165 (2011).
Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
Tang, H. et al. Synteny and collinearity in plant genomes. Science 320, 486–488 (2008).
Yang, Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556 (1997).
Stubbs, T. M. et al. Multi-tissue DNA methylation age predictor in mouse. Genome Biol. 18, 68 (2017).
Broad Institute. Picard toolkit. GitHub https://broadinstitute.github.io/picard (2019).
Zhang, Y. et al. Model-based analysis of ChIP–Seq (MACS). Genome Biol. 9, R137 (2008).
Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
Akalin, A. et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 13, R87 (2012).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Ramírez-González, R. H. et al. The transcriptional landscape of polyploid wheat. Science 361, eaar6089 (2018).
Li, P. et al. RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants. BMC Genomics 17, 852 (2016).
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol. Biol. Evol. 26, 1641–1650 (2009).
He, Z. et al. Evolview v2: an online visualization and management tool for customized and annotated phylogenetic trees. Nucleic Acids Res. 44, W236–W241 (2016).
Li, X. et al. Custom code and scripts for mapping the origins of chromosomal segments. FigShare https://doi.org/10.6084/m9.figshare.21229205.v1 (2023).
Acknowledgements
We thank G. Riddihough (Life Science Editors) for text editing. X.L. acknowledges funding from the National Natural Science Foundation of China (32370687). P.L. acknowledges funding from the National Natural Science Foundation of China (32372666) and Construction of Plateau Discipline of Fujian Province (102/71201801104). L.Z. acknowledges funding from the National Natural Science Foundation of China (32272750). Y.V.d.P. acknowledges funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (no. 833522) and from Ghent University (Methusalem funding, BOF.MET.2021.0005.01).
Author information
Author notes
These authors contributed equally: Xiuxiu Li, Sheng Yu, Zhihao Cheng, Xiaojun Chang, Yingzi Yun, Mengwei Jiang.
Authors and Affiliations
State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
Xiuxiu Li, Yingzi Yun, Mengwei Jiang, Xuequn Chen, Hua Li, Wenjun Zhu, Shiyao Xu, Yanbing Xu, Xianjun Wang, Chen Zhang, Zonghua Wang & Peitao Lü
Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
Sheng Yu
Haikou Experimental Station, National Key Laboratory for Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
Zhihao Cheng & Qiong Wu
Laboratory of Medicinal Plant Biotechnology, School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, China
Xiaojun Chang
Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
Xiaohui Wen, Jin Hu & Liangsheng Zhang
Hainan Institute of Zhejiang University, Sanya, China
Xiaohui Wen, Jin Hu & Liangsheng Zhang
Fuzhou Institute of Oceanography, Minjiang University, Fuzhou, China
Chen Zhang & Zonghua Wang
Department of Biology, Saint Louis University, St. Louis, MO, USA
Zhenguo Lin
Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
Jean-Marc Aury
Department of Plant Biotechnology and Bioinformatics, Ghent University and VIB Center for Plant Systems Biology, Ghent, Belgium
Yves Van de Peer
Centre for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
Yves Van de Peer
College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
Yves Van de Peer
State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou, China
Xiaofan Zhou
Yunnan Seed Laboratory, Kunming, China
Jihua Wang
Authors
- Xiuxiu Li
- Sheng Yu
- Zhihao Cheng
- Xiaojun Chang
- Yingzi Yun
- Mengwei Jiang
- Xuequn Chen
- Xiaohui Wen
- Hua Li
- Wenjun Zhu
- Shiyao Xu
- Yanbing Xu
- Xianjun Wang
- Chen Zhang
- Qiong Wu
- Jin Hu
- Zhenguo Lin
- Jean-Marc Aury
- Yves Van de Peer
- Zonghua Wang
- Xiaofan Zhou
- Jihua Wang
- Peitao Lü
- Liangsheng Zhang
Contributions
L.Z. conceived and designed the project. P.L., Z.C., Y.Y., W.Z., S.X., Y.X., J.W. and H.L. collected the samples and extracted DNA and RNA. L.Z., P.L., J.W. and S.Y. coordinated the Illumina and PacBio sequencing. X.Z., M.J. and X. Chang assembled the genomes and performed the Hi-C data analyses. X.Z., C.Z. and X. Wang conducted protein-coding gene and repetitive sequence annotations. L.Z. and X.L. performed phylogenetic analyses. X.L., X. Chen and L.Z. performed comparative genomic analysis. X.L., X.Z., Q.W. and X. Wen performed the RNA-seq analysis. P.L. and S.Y. performed ChIP–seq experiments, DNase-seq experiments and bioinformatic analysis of ChIP–seq, DNase-seq and WGBS data. X.L., P.L., S.Y. and X.Z. wrote the manuscript draft. L.Z., P.L., S.Y., X.L., X.Z., Y.V.d.P., Z.L., Z.W., J.H. and J.-M.A. reviewed and revised the manuscript. All authors read and approved the manuscript.
Corresponding authors
Correspondence to Yves Van de Peer, Zonghua Wang, Xiaofan Zhou, Jihua Wang, Peitao Lü or Liangsheng Zhang.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Jordi Garcia-Mas and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Genome assemblies of Cavendish and Gros Michel.
a, BUSCO completeness assessments of the genome assemblies of Cavendish, Gros Michel, and four diploid wild banana species (Banksii, DH-Pahang, Zebrina, and Calcutta 4). Cavendish* was assembled by Busche et al. (ref. 18). Zebrina v1.0 was assembled by Rouard et al. (ref. 1), and Zebrina v2.0 is our assembly based on nanopore long reads. The abbreviations of banana species refer to Fig. 1a. b, Macrosyntenic comparison of the entire Cavendish, Gros Michel and three diploid wild banana genomes (Banksii, DH-Pahang, and Zebrina), with each chromosome colored according to sub-genome (Ban in blue, Dh in orange, and Ze in green).
Extended Data Fig. 2 Macrosyntenic comparison of the entire Cavendish and three diploid wild banana genomes: Banksii (a), DH-Pahang (b), and Zebrina (c).
Each chromosome set is colored according to sub-genome (Ban in blue, Dh in orange, and Ze in green). The abbreviations of banana species refer to Fig. 1a.
Extended Data Fig. 3 Macrosyntenic comparison of the entire Gros Michel and three diploid wild banana genomes: Banksii (a), DH-Pahang (b), and Zebrina (c).
Each chromosome set is colored according to sub-genome (Ban in blue, Dh in orange, and Ze in green). The abbreviations of banana species refer to Fig. 1a.
Extended Data Fig. 4 Examples of high-quality Cavendish and Zebrina genome assemblies.
a–d, An NBS-LRR cluster, an RLK cluster, an RLP cluster and an RLP/LRR cluster on Ze03, Ze01, Dh10 and Ze10 of Cavendish, respectively; these clusters were not assembled in the previously published Cavendish assembly. Cavendish* was assembled by Busche et al. (ref. 18). e,f, An NBS-LRR cluster on chromosome 3 (e) and an RLP/LRR cluster on chromosome 10 (f) of our Zebrina v2.0 assembly, 280 kb and 370 kb in length, respectively; both correspond to large gaps in the published Zebrina v1.0 (ref. 1). Resistance genes are colored in the micro-synteny plots (NBS-LRR in blue, RLK in pink, RLP in red, LRR in green, and other genes in yellow). The abbreviations of banana species refer to Fig. 1a.
Extended Data Fig. 5 Phylogenetic tree of banana RLPs involved in the Foc race 1-associated QTL (named the RLP locus; ref. 25).
The purple stars denote RLPs located in the Ze sub-genome, while the two red stars denote RLPs found only in the Ze sub-genome of Cavendish. The abbreviations of banana species refer to Fig. 1a.
Extended Data Fig. 6 A model of MaNAP4/5 regulation of banana fruit ripening.
In the model, the genes directly regulated by MaNAP4/5 are key genes in the fruit ripening process.
Extended Data Fig. 7 Sub-genome dominance in the triploid banana genome.
a, Statistical comparison of categories of syntenic triad homoeolog expression bias. P-values were determined by one-way ANOVA with Tukey’s HSD test (n = 26 tissues of each category) within the suppression and dominance categories, and P-values less than 0.05 are highlighted in red. For box plots in this study, the middle line represents the median, the lower and upper edges of the box represent the first and third quartiles, the end of the lower whisker represents the smallest value at most 1.5× the inter-quartile range from the lower edge of the box, and the end of the upper whisker represents the largest value at most 1.5× the inter-quartile range from the upper edge of the box. b and c, Total number (b) and length (c) of DNase-hypersensitive sites (DHSs) detected in mature green and ripe fruits. d-f, Sub-genome distribution of MaNAP4/5 binding motifs (d), sites (e) and genes (f). g, Distribution of NBS-LRR resistance genes in the sub-genomes.
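The box plot convention described in this legend can be illustrated with a short calculation. The following is a minimal sketch, not the plotting code used in the study: it assumes quartiles are obtained by linear interpolation between order statistics (other quartile definitions exist and the legend does not state which was used), and the function names quantile and boxplot_stats are hypothetical.

```python
def quantile(sorted_vals, q):
    """Quantile by linear interpolation between order statistics.

    This interpolation rule is an assumption; plotting packages differ.
    """
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def boxplot_stats(values):
    """Return the median, quartiles and whisker ends of a box plot."""
    vals = sorted(values)
    if not vals:
        raise ValueError("no values supplied")
    q1, median, q3 = (quantile(vals, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    # Whiskers end at the most extreme data points that lie within
    # 1.5 x IQR of the box edges; points beyond them are outliers.
    lower_whisker = min(v for v in vals if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in vals if v <= q3 + 1.5 * iqr)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


if __name__ == "__main__":
    # Made-up values for illustration only; 100 falls outside the whiskers.
    print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

In this made-up example the value 100 lies more than 1.5× the inter-quartile range above the third quartile, so the upper whisker ends at 9 rather than at the maximum.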
Supplementary information
Supplementary Information
Supplementary Notes 1 and 2 and Figs. 1–12.
Supplementary Tables
Supplementary Tables 1–16.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, X., Yu, S., Cheng, Z. et al. Origin and evolution of the triploid cultivated banana genome. Nat Genet 56, 136–142 (2024). https://doi.org/10.1038/s41588-023-01589-3
DOI: https://doi.org/10.1038/s41588-023-01589-3