Sequencing the Aegilops tauschii genome

Abstract

Bread wheat is one of three pillars on which the global food supply rests. Despite the exceptional importance of wheat, a high-quality draft reference sequence of the wheat genome is not available, primarily because of its hybrid origin (polyploidy) and the enormous size of its genome. To assist the international bread wheat genome sequencing effort, a high-quality draft of the genome of Aegilops tauschii, one of the three progenitors of bread wheat, will be produced. The large size and great complexity of the Ae. tauschii genome necessitate an ordered-clone sequencing strategy for generating a high-quality draft genome sequence. This strategy will involve sequencing about 50,000 bacterial artificial chromosome (BAC) clones that harbor large fragments of Ae. tauschii DNA and have been ordered to represent the contiguous sequence of nucleotides in Ae. tauschii chromosomal DNA. Pools of BAC clones will be sequenced on a next-generation DNA sequencing platform and assembled into long contiguous sequences. The correctness of the assembled sequences will be validated with a novel optical nanotechnique. Genes and transposable elements in the assembled sequences will be annotated. In this way, the sequence, location, and orientation of all genes and transposable elements in the Ae. tauschii genome will be determined.

The most immediate impact of the Ae. tauschii genome sequence produced by this project will be in predicting the location of genes in wheat and its relatives, thus facilitating gene discovery and manipulation in these species with the goal of incorporating these genes into wheat by traditional breeding methods or biotechnology. The Ae. tauschii genome sequence will serve as a reference in analyses of genomic changes that have taken place in the wheat genome since the origin of wheat, providing significant and fundamental contributions to the understanding of grass genome structure and evolution and accelerating progress in genome sequencing of wheat and its relatives. Associating the more than 36,000 genes predicted within the draft genome sequence with biological functions will be a daunting task. For this reason, the project will engage the large research community interested in analyzing Ae. tauschii genes in community-based manual gene annotation and in graphical compilation of the results in the project database. To maximize awareness of the project resources among wheat breeders and other stakeholders, regular presentations and workshops on these resources will be held at professional meetings attended by wheat breeders. The project will also provide excellent interdisciplinary training for students and postdocs in plant genomics, bioinformatics, DNA sequencing, and sequence assembly, validation, and analysis. Validated sequences of BAC contigs and sequences parsed to individual BAC clones will be posted to the project website on a monthly basis. All sequences will be deposited at the long-term NCBI repository and incorporated into the Gramene, GrainGenes, and MIPS comparative databases. Individual BAC clones, or groups of clones, from nine libraries of Ae. tauschii accession AL8/78 are publicly available.