Abstract

The original project abstract for DEB-0115062

The sequence of the entire 130-million-base-pair genome of the plant Arabidopsis thaliana was finished last year. The objective of this project is to leverage the genome sequence to catalog the naturally occurring genetic variation in the species. The project is based on the theoretical insight that, in highly self-fertilizing organisms such as A. thaliana, it should be possible to create such a catalog very efficiently by looking at the pattern of variation in a number of small segments distributed over the genome. Rather than sequencing the entire genome of one additional individual, one should sequence 1% of the genome in 100 individuals.

Specifically, the project will sequence 1500–2000 chromosomal segments of length 500–700 base pairs, distributed over the genome, in a sample of 96 carefully selected individuals.  The data will be publicly available through GenBank, as well as through a highly flexible relational database developed specifically for this purpose.  The database will be equipped with web-based bioinformatics tools to query it, and will be continuously updated.
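As a rough check on how these figures relate to the "1% of the genome in 100 individuals" idea, the arithmetic can be sketched as follows. This is an illustrative calculation only: it assumes the segments do not overlap, and the function name and constants are introduced here for the example; they are not part of the project's tools.

```python
# Illustrative arithmetic using only the numbers quoted in the abstract.
# Assumption: segments are non-overlapping, so covered bases simply add up.

GENOME_BP = 130_000_000   # genome size stated in the abstract
N_INDIVIDUALS = 96        # sample size stated in the abstract


def covered_bp(n_segments: int, segment_bp: int) -> int:
    """Bases covered per individual by n_segments segments of segment_bp each."""
    return n_segments * segment_bp


# Lower and upper ends of the stated ranges (1500-2000 segments, 500-700 bp).
low_bp = covered_bp(1500, 500)
high_bp = covered_bp(2000, 700)

# Fraction of the genome sampled in each individual.
low_frac = low_bp / GENOME_BP
high_frac = high_bp / GENOME_BP

# Total bases sequenced across the whole sample.
total_low = low_bp * N_INDIVIDUALS
total_high = high_bp * N_INDIVIDUALS

print(f"Per individual: {low_bp:,} to {high_bp:,} bp "
      f"({low_frac:.2%} to {high_frac:.2%} of the genome)")
print(f"Across {N_INDIVIDUALS} individuals: {total_low:,} to {total_high:,} bp")
```

Under these assumptions, each individual is sampled at roughly 0.6% to 1.1% of the genome, and the total across the sample is on the order of one 130-million-base-pair genome, which is consistent with the trade-off described above.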

The project represents the first serious attempt to describe the genomic variation in a species. It is highly relevant to the objectives of the 2010 project in a general sense, because it will not be possible to "determine the function of all genes [...] within their cellular, organismal, and evolutionary contexts" without understanding how genetic variation is structured in the species. More immediately, the database will be an invaluable resource for plant geneticists interested in finding the genes responsible for variation in agriculturally important traits such as drought tolerance. In this respect, the project should be compared to the large databases of human variation that are currently being created to aid genetic epidemiology. The tools and methods created for this project will also be directly applicable to several economically important organisms, such as rice and barley. Finally, the database will serve as a very important training tool for students in computational and evolutionary biology, and in statistical genetics.
