Expressed Sequence Tag (EST) sequencing is one of the most efficient means of gene discovery and gene expression profiling. With a good resource of ESTs, a large number of molecular markers can be identified, and issues related to alternative splicing and differential polyadenylation can be addressed at the genome-wide scale. Through the Community Sequencing Program, a catfish EST sequencing project was selected by the DOE’s Joint Genome Institute (JGI). In this project, a total of 12 cDNA libraries were constructed, including eight from channel catfish (Ictalurus punctatus) and four from blue catfish (I. furcatus). A total of 600,000 sequencing attempts were made, generating 438,321 quality ESTs. Together with previously existing ESTs in GenBank, this project brings the total number of catfish ESTs to nearly 500,000. The JGI EST sequencing had an overall sequencing success rate of 73% with an average length of 576 bp. All the ESTs were assembled using CAP3, resulting in 111,578 unique sequences, including 45,306 contigs and 66,272 singletons. Of these unique sequences, over 35% had significant similarities to known genes by BLASTX searches, which allowed the identification of 14,776 unique genes in the catfish. A total of 1,350 and 849 full-length cDNAs have been identified from channel catfish and blue catfish, respectively. The ESTs are an enormous resource for SNP identification. The quality assessment parameters for EST-derived SNPs were established based on a pilot study with 384 SNPs. In order to select reliable SNPs, contigs containing four or more ESTs should be used and the minor allele sequence should be represented at least twice. Genotyping primers should be designed from a single exon, completely avoiding introns. Application of such quality assessment measures, along with large resources of ESTs, should provide an effective means for SNP identification in species where genome sequence resources are lacking.
Over 300,000 putative SNPs have been identified, of which over 48,000 are high-quality SNPs, defined as those in contigs of at least four sequences in which the minor allele is present at least twice. The EST resource should also be valuable for the identification of microsatellites and for comparative genome analysis. This large-scale EST sequencing project should allow the identification of the majority of the catfish transcriptome. The parallel analysis of ESTs from the two closely related ictalurid catfishes should also provide a powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of the full catfish EST dataset will greatly benefit catfish introgression breeding selection and whole-genome association studies. All ESTs have been deposited in GenBank.
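The two SNP quality criteria stated above (a contig of at least four sequences, and a minor allele present at least twice) can be written as a simple filter. The following is only a minimal sketch of that rule, not the dissertation's actual pipeline: the function name `is_high_quality_snp`, its inputs, and the reading of "minor allele" as the second most frequent base at the site are assumptions made for illustration.

```python
# Thresholds taken from the abstract; the constant names are hypothetical.
MIN_CONTIG_SIZE = 4          # contig must contain at least four ESTs
MIN_MINOR_ALLELE_COUNT = 2   # minor allele must be seen at least twice


def is_high_quality_snp(contig_size, allele_counts):
    """Return True if a putative SNP passes both quality criteria.

    contig_size   -- number of ESTs assembled into the contig
    allele_counts -- dict mapping each base observed at the site to the
                     number of ESTs carrying it (assumed input format)
    """
    if contig_size < MIN_CONTIG_SIZE:
        return False
    # Rank the observed alleles by how many ESTs support them.
    ranked = sorted((c for c in allele_counts.values() if c > 0), reverse=True)
    if len(ranked) < 2:
        return False  # only one allele observed, so the site is not a SNP
    # Assumption: the minor allele is the second most frequent base.
    return ranked[1] >= MIN_MINOR_ALLELE_COUNT
```

For example, a site in a ten-EST contig with counts `{"A": 9, "G": 1}` is rejected because the minor allele is seen only once, while `{"A": 4, "G": 2}` in a six-EST contig passes.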
School Location: United States -- Alabama
Source: DAI-B 70/12, Dissertation Abstracts International
Subjects: Molecular biology, Genetics, Bioinformatics
Keywords: Expressed sequence tags, Genome selection, Single nucleotide polymorphisms