Dissertation/Thesis Abstract

Systematic characterization of cis-regulation in Caenorhabditis elegans using evolutionary conservation
by Cheng, Donavan, Ph.D., The Johns Hopkins University, 2009, 172; 3395668
Abstract (Summary)

Comparative genomics approaches for cis-regulatory element detection typically rely on sequence alignment, even though recent studies show only modest overlap (∼50%) between confirmed regulatory elements and regions of high sequence alignability. This dissertation develops alignment-independent approaches for detecting conserved cis-regulatory elements and modules, and is organized in three parts.

In the first study, we present Flipper, a novel alignment-independent, Gibbs-sampling-based algorithm that weights over-representation and evolutionary conservation equally to detect conserved DNA regulatory elements ab initio from orthologous sequence. Flipper performs up to 23% better than existing methods at recovering seeded motifs from synthetic test data and also recovers more known motifs from yeast, worm and fly ChIP-chip data. To discover novel regulatory motifs, we ran Flipper on promoters of sets of coexpressed genes in C. elegans. We focused on the ribosomal protein (RP) gene cluster, as it is highly coexpressed yet little is known about its regulation. Flipper detected 22 motifs associated with the RP promoters, of which four (M546, M313, M540 and M439) were significantly conserved and specific to the RP gene cluster in C. elegans and its relatives C. remanei, C. briggsae and C. brenneri.

In the second study, we used a promoter::mCherry transcriptional reporter assay to test our predicted motifs for function. Mutating M546 severely abrogated mCherry expression in 8 of 11 tested promoters; similarly, M313 was necessary for promoter function in 4 of 9 cases, M540 in 3 of 7 cases and M439 in 1 of 3 cases. In a promoter "transplant" experiment, we demonstrated that M546 and M540 are functionally conserved and are necessary for C. briggsae promoters to drive mCherry expression in C. elegans. M546 and M540 also occur in a large number of non-ribosomal promoters, and we show that M546 is necessary for function in the mcm-7 promoter, even though the expression profile of mcm-7 is markedly different from that of the RP genes.

In the third study, we demonstrate that the rules governing the organization of cis-regulatory elements in modules, in terms of relative spacing, positioning and orientation constraints, can also be conserved across species. Using this information, we discover a strong, conserved spacing and orientation bias in pairs of co-occurring M546 and M540 sites in RP promoters. In a "sequence swap" experiment, we disrupted the spacing between M546 and M540 sites and showed that this severely affects rps-7 promoter function. We show that a large number of non-ribosomal promoters contain M546 and M540 sites because these sites reside in an arm of the CELE2 transposon, which happened to insert itself in these promoters. Interestingly, the M546-M540 pairs in these promoters do not obey the RP spacing constraint, and these promoters are not enriched in any common GO annotations. In contrast, other non-ribosomal promoters containing M546-M540 sites that obey the RP spacing constraint are strongly enriched for growth and development GO annotations (p < 10^-9), which is consistent with the need for RP biogenesis.

In summary, using an alignment-independent approach, we have identified conserved cis-regulatory elements necessary for RP gene expression in C. elegans, with the M546 and M540 motifs possibly forming part of a regulatory module involved in more general regulation of growth and early development processes.
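
As a rough illustration of what a spacing and orientation bias between two motifs means in practice, the sketch below tallies the signed distance and relative strand for every nearby pair of sites in each promoter. It is not the dissertation's method: the function name `pair_spacing_counts`, the `max_gap` cutoff, the site representation and the toy coordinates are all assumptions made for this example. A sharp peak at one (spacing, orientation) combination across many promoters would be the kind of signal described above.

```python
from collections import Counter


def pair_spacing_counts(sites_a, sites_b, max_gap=50):
    """Tally (spacing, orientation) for nearby pairs of sites of two motifs.

    sites_a, sites_b: dict mapping promoter name -> list of (position, strand),
    where strand is "+" or "-". This representation is hypothetical.
    Spacing is position_b - position_a in promoter coordinates; it is not
    normalized for the strand of either site.
    """
    counts = Counter()
    for promoter, a_sites in sites_a.items():
        for pos_a, strand_a in a_sites:
            # Promoters lacking motif B contribute no pairs.
            for pos_b, strand_b in sites_b.get(promoter, []):
                gap = pos_b - pos_a
                if abs(gap) <= max_gap:
                    orientation = "same" if strand_a == strand_b else "opposite"
                    counts[(gap, orientation)] += 1
    return counts


# Toy, made-up site coordinates (not data from the dissertation).
toy_a = {"p1": [(100, "+")], "p2": [(40, "-")], "p3": [(10, "+")]}
toy_b = {"p1": [(112, "+")], "p2": [(52, "-")], "p3": [(300, "+")]}

toy_counts = pair_spacing_counts(toy_a, toy_b)
for (gap, orientation), n in toy_counts.most_common():
    print(gap, orientation, n)
```

In the toy input, two promoters share the same spacing and orientation while the third pair lies beyond `max_gap` and is ignored. A real analysis would also need a background model to judge whether an observed peak is significant.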

Indexing (document details)
Advisor: Beer, Michael A.
Committee:
School: The Johns Hopkins University
School Location: United States -- Maryland
Source: DAI-B 71/01, Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Biostatistics, Genetics, Bioinformatics
Keywords: Computational biology, Gene regulation, Genomics, Machine learning, Transcription
Publication Number: 3395668
ISBN: 9781109585445