
The genomic basis of geographic differentiation and fiber improvement in cultivated cotton

Abstract

Large-scale genomic surveys of crop germplasm are important for understanding the genetic architecture of favorable traits. The genomic basis of geographic differentiation and fiber improvement in cultivated cotton is poorly understood. Here, we analyzed 3,248 tetraploid cotton genomes and confirmed that extensive chromosome inversions on chromosomes A06 and A08 underlie the geographic differentiation in cultivated Gossypium hirsutum. We further revealed that the haplotypic diversity originated from landraces, which might be essential for understanding adaptive evolution in cultivated cotton. Introgression and association analyses identified new fiber quality-related loci and demonstrated that the introgressed alleles from two diploid cottons had a large effect on fiber quality improvement. These loci offer the potential to overcome the bottleneck in fiber quality improvement. Our study uncovered several critical genomic signatures generated by historical breeding effects in cotton and provides a wealth of data that enriches genomic resources for the research community.


Fig. 1: Population structure and divergence in tetraploid cotton.
Fig. 2: Genomic divergence of chromosomes A06 and A08 impacts the geographic differentiation in improved G. hirsutum.
Fig. 3: Interspecific introgressions in improved G. hirsutum and their effect on fiber quality improvement.
Fig. 4: Genetic basis of fiber quality in G. hirsutum.

Data availability

All raw transcriptome data (PRJNA634606) and raw resequencing data (PRJNA605345) have been deposited in the NCBI BioProject database. All supporting data (assembled genome sequence of G. hirsutum ‘Xinluzao 7’ (ICR_XLZ 7), genotype files for genetic diversity and population structure analysis and phenotype data for GWAS) are available in the cotton genomic variation database (CottonGVD) (http://120.78.174.209:30081/ftp).

Code availability

The introgression analysis pipeline can be accessed at https://github.com/sungaofei/3K-TCG.

Acknowledgements

This work was funded by the National Key Technology R&D Program, the Ministry of Science and Technology (grant nos. 2016YFD0100203 to X.D. and S.H. and 2016YFD0100306 to S.H.), the National Natural Science Foundation of China (grant nos. 31871677 to S.H. and 31671746 to X.D.), the Agricultural Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences and the National Crop Germplasm Resources Center (grant no. NICGR2019-12 to Y.J.). We thank the National Mid-term Gene Bank for Cotton at the Institute of Cotton Research, Chinese Academy of Agricultural Sciences, for providing the seeds; J. A. Udall of Southern Plains Agricultural Research Center, US Department of Agriculture for sharing the sequencing data in NCBI (PRJNA414461); K. Wang and F. Liu of the Institute of Cotton Research, Chinese Academy of Agricultural Sciences for providing the DNA samples of wild species and landraces; and J. Ma and X. Li (Research Institute of Economic Crops, Xinjiang Academy of Agricultural Sciences), Y. Li and C. Ye (Biotechnology Research Institute of Xinjiang Academy of Agricultural and Reclamation Sciences), Y. Qian and W. Jin (Institute of Cotton, Hebei Academy of Agriculture and Forestry Sciences), J. Liu and J. Zhao (Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences) and Z. Zhou (Hunan Agricultural University) for assisting in planting cottons and investigating phenotypes.

Author information

Contributions

X.D. and S.H. conceived and designed the research. G.S., S.H., P.D. and Liyuan Wang performed the bioinformatics and data analysis. X.G., W.G., Y.J. and Z. Pan prepared the leaf tissues and extracted DNA samples. W.S., J.W., S.X., S.C., C.Y., Z.X., F.W., J.S., G.F., Liyuan Wang, Z. Peng, D.H., Liru Wang and B.P. participated in the phenotype data investigation. B.C. performed the qRT–PCR and overexpression experiment. S.H. wrote the manuscript.

Corresponding author

Correspondence to Xiongming Du.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks Michael Bevan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Extensive chromosomal inversions led to haplotype polymorphism on chromosomes A06 (a) and A08 (b) in G. hirsutum.

To confirm that chromosome inversions cause the haplotype polymorphism on these two chromosomes, we de novo assembled the genome of G. hirsutum ‘Xinluzao 7’ (ICR_XLZ 7), which carries haplotypes (Hap-A06-3 and Hap-A08-4) that contrast with those of the reference genome (ICR_TM-1, Hap-A06-1 and Hap-A08-3). Two and three major inversions were found on chromosomes A06 (a) and A08 (b), respectively (marked by red boxes).

Extended Data Fig. 2 Two fiber quality-related loci derived from introgressions of diploid cottons.

a, Local clustering of 3,278 accessions based on SNPs in the G. arboreum introgressed region on chromosome A09 (GaIR_A09, spanning ~61.8 Mb to ~62.1 Mb) (FL3/FS2). A zoomed-in view of the GaIR_A09 clade is shown on the right. G. arboreum (red branch) clusters closely with the G. hirsutum introgression lines. b, Local clustering of 3,278 accessions based on SNPs in the G. thurberi introgressed region on chromosome D08 (GthIR_D08, spanning ~7.8 Mb to ~60.4 Mb) (FS3). A zoomed-in view of the GthIR_D08 clade is shown on the right. G. thurberi (purple branch) clusters closely with all the introgression lines. c, The possible origin of FL3/FS2 and FS3 in Chinese elite cotton lines with superior fiber quality.

Extended Data Fig. 3 The genetic architecture of FL2.

a, Manhattan plot of GWAS for fiber length in the GWAS panel. The red circle denotes the genomic location of the FL2 locus on chromosome D11. The blue dotted line indicates the significance threshold of the -log10(P) value (7.35). b, Gene models (top), local Manhattan plot (middle) and local LD heatmap (bottom) in the FL2 region. c, Haplotypes of the FL2 locus in the 3K-TCG panel. Accessions (vertical) are reordered according to clustering based on regional SNPs (horizontal). Accession genotypes are categorized into three haplotypes (Hap_FL2_1, Hap_FL2_2 and Hap_FL2_3). Colored lines (left) indicate the subgroup classification and the red lines (right) indicate the accessions selected for GWAS (n = 1,245). d, Gene expression profiles in the genomic region of FL2. Comparison of gene expression in various tissues between the alternative haplotypes (FL2 and fl2). DPA, days post-anthesis. e, Comparison of fiber length among different haplotypes of the FL2 locus. In the scatter dot plot, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. f, Allelic frequency of the FL2 locus in G. hirsutum subgroups.
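The GWAS threshold in these captions is stated on the -log10(P) scale. As a small worked sketch, the snippet below converts the stated value of 7.35 back to a raw P value (roughly 4.5 × 10^-8) and checks whether a given P value clears it. The function names are illustrative assumptions and are not taken from the authors' pipeline; how the threshold itself was derived is not stated in this excerpt.

```python
import math

# Threshold as stated in the caption, on the -log10(P) scale.
NEG_LOG10_THRESHOLD = 7.35


def threshold_p_value(neg_log10_threshold: float = NEG_LOG10_THRESHOLD) -> float:
    """Convert a -log10(P) threshold back to a raw P value."""
    return 10.0 ** (-neg_log10_threshold)


def is_significant(p_value: float,
                   neg_log10_threshold: float = NEG_LOG10_THRESHOLD) -> bool:
    """Return True if a SNP's P value lies above the -log10(P) threshold line."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -math.log10(p_value) > neg_log10_threshold


if __name__ == "__main__":
    print(f"P threshold = {threshold_p_value():.3e}")
    # Hypothetical P values, for illustration only.
    for p in (1e-7, 1e-8):
        print(p, is_significant(p))
```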

Extended Data Fig. 4 The genetic architecture of FL3/FS2.

a, Manhattan plots of GWAS for fiber length (top) and fiber strength (bottom) in the GWAS panel. The red circle denotes the genomic location of the FL3/FS2 locus on chromosome A09. Blue dotted lines indicate the significance threshold of the -log10(P) value (7.35). b, Gene models (top), local Manhattan plots (middle) and LD heatmap (bottom) in the FL3/FS2 region. c, Haplotypes of the FL3/FS2 locus in the 3K-TCG panel. Accessions (vertical) are reordered according to clustering based on regional SNPs (horizontal). Accession genotypes are categorized into two haplotypes (Hap_FL3/FS2_1 and Hap_FL3/FS2_2). Colored lines (left) indicate the subgroup classification, and the red lines (right) indicate the accessions selected for GWAS (n = 1,245). d, Gene expression profiles in the genomic region of FL3/FS2. Comparison of gene expression in various tissues between the alternative haplotypes (FL3/FS2 and fl3/fs2). DPA, days post-anthesis. e, Comparison of fiber length and fiber strength among different haplotypes of the FL3/FS2 locus. In the scatter dot plots, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. f, Allelic frequency of the FL3/FS2 locus in G. hirsutum subgroups.

Extended Data Fig. 5 The genetic architecture of FL4.

a, Manhattan plots of GWAS for fiber length (top) and fiber strength (bottom) in the GWAS panel. The red circle denotes the genomic location of the FL4 locus on chromosome A10. Blue dotted lines indicate the significance threshold of the -log10(P) value (7.35). b, Gene model (top) and local Manhattan plots (bottom) in the FL4 region. c, Expression of Gh_A10G233100 in various tissues between the alternative haplotypes (FL4 and fl4). DPA, days post-anthesis. d, Comparison of fiber length among different haplotypes of the FL4 locus. In the scatter dot plot, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. e, Allelic frequency of the FL4 locus in G. hirsutum subgroups.

Extended Data Fig. 6 The genetic architecture of FL5/FS1.

a, Manhattan plots of GWAS for fiber length (top) and fiber strength (bottom) in the GWAS panel. The red circle denotes the genomic location of the FL5/FS1 locus on chromosome A07. Blue dotted lines indicate the significance threshold of the -log10(P) value (7.35). b, Gene models (top), local Manhattan plots (middle) and LD heatmap (bottom) in the FL5/FS1 region. c, Haplotypes of the FL5/FS1 locus in the 3K-TCG panel. Accessions (vertical) are reordered according to clustering based on regional SNPs (horizontal). Accession genotypes are categorized into four haplotypes (Hap_FL5/FS1_1, Hap_FL5/FS1_2, Hap_FL5/FS1_3 and Hap_FL5/FS1_4). Colored lines (left) indicate the subgroup classification, and the red lines (right) indicate the accessions selected for GWAS (n = 1,245). d, Gene expression profiles in the genomic region of FL5/FS1. Comparison of gene expression in various tissues between the alternative haplotypes (FL5/FS1 and fl5/fs1). DPA, days post-anthesis. e, Comparison of fiber length and fiber strength among different haplotypes of the FL5/FS1 locus. In the scatter dot plots, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. f, Allelic frequency of the FL5/FS1 locus in G. hirsutum subgroups.
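The haplotype comparisons in these panels are summarized by medians and interquartile ranges and tested with a two-tailed Student's t-test. The following is a minimal sketch of those summary statistics using only the Python standard library. The fiber-length values are hypothetical and are not data from the study, the function names are illustrative, and the quartile interpolation rule ("inclusive") is an assumption, since the captions do not say which convention was used.

```python
import math
import statistics


def median_and_iqr(values):
    """Return (median, Q1, Q3), the summary drawn as a line and whiskers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q1, q3


def student_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t, df


# Hypothetical fiber lengths (mm) for two haplotype groups; NOT study data.
hap_1 = [28.0, 29.0, 30.0, 31.0, 32.0]
hap_2 = [26.0, 27.0, 28.0, 29.0, 30.0]

print(median_and_iqr(hap_1))
print(student_t(hap_1, hap_2))
```

The two-tailed P value follows from the t distribution with the returned degrees of freedom; in practice a statistics library such as SciPy (`scipy.stats.ttest_ind`, whose default is the equal-variance test) would typically be used for that step.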

Extended Data Fig. 7 The genetic architecture of FE1.

a, Manhattan plot of GWAS for fiber elongation rate in the GWAS panel. The red circle denotes the genomic location of the FE1 locus on chromosome D04. The blue dotted line indicates the significance threshold of the -log10(P) value (7.35). b, Gene models (top), local Manhattan plot (middle) and LD heatmap (bottom) in the FE1 region. c, Haplotypes of the FE1 locus in the 3K-TCG panel. Accessions (vertical) are reordered according to clustering based on regional SNPs (horizontal). Accession genotypes are categorized into two haplotypes (Hap_FE1_1 and Hap_FE1_2). Colored lines (left) indicate the subgroup classification, and the red lines (right) indicate the accessions selected for GWAS (n = 1,245). d, Gene expression profiles in the genomic region of FE1. Comparison of gene expression in various tissues between the alternative haplotypes (FE1 and fe1). DPA, days post-anthesis. e, qRT–PCR analysis of Gh_D04G181300 (GhTUA2) expression in accessions carrying the alternative haplotypes (mean ± s.d., n = 3 independent experiments). f, Root phenotype of GhTUA2-overexpressing Arabidopsis. g, Comparison of fiber elongation rate among different haplotypes of the FE1 locus. In the scatter dot plot, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. h, Allelic frequency of the FE1 locus in G. hirsutum subgroups.

Extended Data Fig. 8 The genetic architecture of FE2.

a, Manhattan plot of GWAS for fiber elongation rate in the GWAS panel. The red circle denotes the genomic location of the FE2 locus on chromosome D01. The blue dotted line indicates the significance threshold of the -log10(P) value (7.35). b, Gene models (top), local Manhattan plot (middle) and LD heatmap (bottom) in the FE2 region. c, Haplotypes of the FE2 locus in the 3K-TCG panel. Accessions (vertical) are reordered according to clustering based on regional SNPs (horizontal). Accession genotypes are categorized into two haplotypes (Hap_FE2_1 and Hap_FE2_2). Colored lines (left) indicate the subgroup classification, and the red lines (right) indicate the accessions selected for GWAS (n = 1,245). d, Gene expression profiles in the genomic region of FE2. Comparison of gene expression in various tissues between the alternative haplotypes (FE2 and fe2). DPA, days post-anthesis. e, qRT–PCR analysis of Gh_D01G220400 expression in accessions carrying the alternative haplotypes (mean ± s.d., n = 3 independent experiments). f, Comparison of fiber elongation rate among different haplotypes of the FE2 locus. In the scatter dot plot, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. g, Allelic frequency of the FE2 locus in G. hirsutum subgroups.

Extended Data Fig. 9 The genetic architecture of FE3.

a, Manhattan plot of GWAS for fiber elongation rate in the GWAS panel. The red circle denotes the genomic location of the FE3 locus on chromosome A05. The blue dotted line indicates the significance threshold of the -log10(P) value (7.35). b, Gene models (top), local Manhattan plot (middle) and LD heatmap (bottom) in the FE3 region. c, Haplotypes of the FE3 locus in the 3K-TCG panel. Accessions (vertical) are reordered according to clustering based on regional SNPs (horizontal). Accession genotypes are categorized into two haplotypes (Hap_FE3_1 and Hap_FE3_2). Colored lines (left) indicate the subgroup classification, and the red lines (right) indicate the accessions selected for GWAS (n = 1,245). d, Gene expression profiles in the genomic region of FE3. Comparison of gene expression in various tissues between the alternative haplotypes (FE3 and fe3). DPA, days post-anthesis. e, qRT–PCR analysis of Gh_A05G094100 expression in accessions carrying the alternative haplotypes (mean ± s.d., n = 3 independent experiments). f, Comparison of fiber elongation rate among different haplotypes of the FE3 locus. In the scatter dot plot, horizontal lines and whiskers indicate medians and interquartile ranges. Significance was tested with a two-tailed Student’s t-test. g, Allelic frequency of the FE3 locus in G. hirsutum subgroups.
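The per-subgroup frequencies shown in the final panel of each locus figure amount to counting haplotype calls within each subgroup. A minimal sketch follows, assuming each accession has already been assigned one subgroup label and one haplotype label (a simplification for illustration). The subgroup names and the toy records are hypothetical; only the haplotype labels Hap_FE3_1 and Hap_FE3_2 come from the caption.

```python
from collections import Counter, defaultdict


def haplotype_frequencies(records):
    """Map each subgroup to {haplotype: frequency}.

    `records` is an iterable of (subgroup, haplotype) pairs, one per accession.
    """
    counts = defaultdict(Counter)
    for subgroup, haplotype in records:
        counts[subgroup][haplotype] += 1
    return {
        subgroup: {hap: n / sum(c.values()) for hap, n in c.items()}
        for subgroup, c in counts.items()
    }


# Hypothetical accessions; subgroup names are placeholders, NOT study data.
toy_records = [
    ("subgroup_1", "Hap_FE3_1"),
    ("subgroup_1", "Hap_FE3_1"),
    ("subgroup_1", "Hap_FE3_1"),
    ("subgroup_1", "Hap_FE3_2"),
    ("subgroup_2", "Hap_FE3_2"),
]

for subgroup, freqs in haplotype_frequencies(toy_records).items():
    print(subgroup, freqs)
```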

Extended Data Fig. 10 Correlation of favorable allelic combinations for fiber elongation rate (a), fiber length (b), and fiber strength (c) in GWAS panel.

Colored dots represent accessions carrying different allelic combinations. All accessions with superior fiber quality (fiber length > 32 mm, fiber strength > 32 cN/tex) are marked by blue rectangles.
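The "superior fiber quality" criterion in this caption can be written as a simple filter. The sketch below assumes that both thresholds must be exceeded together, which the caption implies but does not state outright. The `Accession` record type, its field names and the example accessions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accession:
    name: str
    fiber_length_mm: float
    fiber_strength_cn_per_tex: float


def has_superior_fiber_quality(acc: Accession,
                               min_length_mm: float = 32.0,
                               min_strength_cn_per_tex: float = 32.0) -> bool:
    """True if the accession strictly exceeds both caption thresholds."""
    return (acc.fiber_length_mm > min_length_mm
            and acc.fiber_strength_cn_per_tex > min_strength_cn_per_tex)


# Hypothetical accessions for illustration; NOT data from the study.
panel = [
    Accession("line_A", 33.1, 34.0),
    Accession("line_B", 32.0, 35.2),  # length not strictly above 32 mm
    Accession("line_C", 30.4, 29.8),
]

superior = [acc.name for acc in panel if has_superior_fiber_quality(acc)]
print(superior)
```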

Supplementary information

Supplementary Information

Supplementary Figs. 1–8 and Tables 2, 3, 5, 6 and 14.

Reporting Summary

Supplementary Table

Supplementary Tables 1, 4 and 7–13

About this article

Cite this article

He, S., Sun, G., Geng, X. et al. The genomic basis of geographic differentiation and fiber improvement in cultivated cotton. Nat Genet 53, 916–924 (2021). https://doi.org/10.1038/s41588-021-00844-9

  • DOI: https://doi.org/10.1038/s41588-021-00844-9
