Monthly Archives: January 2012

Linking a lincRNA to active chromatin

Wang et al show that a lincRNA encoded at one end of the HOXA gene cluster acts as a transcriptional enhancer, necessary for the translation of higher-order chromosomal structure into a transcriptionally active chromatin state.

Hox genes encode transcription factors that determine positional identities along the anterior-posterior (a-p) body axis and along the proximal-distal (p-d) axes of appendages. In vertebrates, Hox genes are found in four clusters, in which their 3′-5′ genomic arrangement mirrors their a-p and p-d expression patterns. For instance, genes found at the 3′ end of the HOXA cluster, such as HOXA1 and HOXA2, are necessary for specifying positional identities in the hindbrain, whilst genes found at the 5′ end of the cluster, such as HOXA13 and HOXA11, determine distal elements of the limbs.

To investigate how this genomic colinearity is translated into differential p-d expression patterns, Wang et al analysed chromosomal conformation at the HOXA locus in different human fibroblast cells. In distally derived cells (foreskin or foot fibroblasts) the 5′ end of the HOXA cluster displayed a compact and looped conformation, whilst the 3′ end seemed largely linear. An opposite conformation was found in proximally derived cells (lung fibroblasts). These findings correlated with the presence of specific histone post-translational modifications (PTMs). Areas of high chromatin interactions showed high levels of trimethylated histone H3 lysine 4 (H3K4me3) (associated with transcriptionally active chromatin), and low levels of histone H3 lysine 27 trimethylation (H3K27me3) (associated with transcriptionally silent chromatin).

Histone lysine methylation states across the HOXA cluster, compared between distal and proximal cells.

A lincRNA named HOTTIP is encoded at the 5′ end of the HOXA locus. Analysis of its expression revealed that it is transcribed in posterior and distal territories, in a manner similar to nearby HOXA genes. When HOTTIP RNA was depleted (using small interfering RNAs) in distal cells, expression of 5′ HOXA genes was abrogated. HOXA genes nearest the HOTTIP gene were the most affected; however, transcriptional activity was reduced across more than 40kb of the HOXA locus. HOTTIP RNA depletion did not affect the expression of other genes tested, such as the highly homologous HOXD genes.

HOTTIP RNA-depleted distal cells did not show any changes to the higher-order chromosomal conformation at the HOXA locus; however, they did display a broad loss of H3K4me3 at the 5′ end of the cluster. This was not mirrored by a concomitant gain of H3K27me3. Therefore, it appears that the loss of 5′ HOXA gene expression upon HOTTIP RNA depletion is linked to the loss of H3K4me3.

Effects of HOTTIP RNA depletion on histone lysine methylation states.

H3K4 methylation and its maintenance are mediated by protein complexes composed of lysine methyltransferases such as MLL1 and associated proteins like WDR5. Analysis of MLL1 and WDR5 occupancy in the HOXA cluster of distal fibroblasts revealed occupancy peaks at the transcriptional start sites of the 5′ located genes. These occupancy peaks disappeared when HOTTIP RNA was knocked down. Wang et al go on to show that HOTTIP RNA physically interacts with WDR5 protein.

Effects of HOTTIP RNA depletion on occupancy of MLL1 and WDR5.

A number of lines of evidence suggest that HOTTIP RNA acts in cis to regulate 5′ HOXA genes. For instance, HOTTIP RNA is expressed at very low copy number, and depletion did not have any effect on the HOXD locus. Retroviral insertion driven overexpression of HOTTIP RNA did not ectopically activate HOXA genes, nor could it rescue depletion of the endogenous HOTTIP RNA.

Tying these results together yields a model in which chromosomal looping brings HOTTIP RNA (specifically bound to the HOTTIP gene) into contact with the 5′ HOXA genes. HOTTIP lincRNA binds and targets WDR5/MLL complexes to the 5′ HOXA locus, creating a domain of H3K4me3 and transcriptional activation.

lincRNAs are probably a heterogeneous class of molecules, as they are arbitrarily defined on the basis of length. These results, however, suggest what could be quite a common mechanism by which lincRNAs implicated in enhancer function can effect gene activation and program chromatin states.

See also: lincRNAs in development and evolution

Wang, K., Yang, Y., Liu, B., Sanyal, A., Corces-Zimmerman, R., Chen, Y., Lajoie, B., Protacio, A., Flynn, R., Gupta, R., Wysocka, J., Lei, M., Dekker, J., Helms, J., & Chang, H. (2011). A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression Nature, 472 (7341), 120-124 DOI: 10.1038/nature09819


Novel modes of lateral gene transfer in bacteria

Understanding the mechanisms of lateral gene transfer (LGT) between bacteria is crucial to our understanding of microbial evolution. It is also important for human health, as LGT facilitates the emergence and spread of bacterial virulence and antibiotic resistance. The three ‘classical’ mechanisms of LGT are transformation (in which naked DNA is taken up from the environment), transduction (in which bacteriophage facilitate gene transfer by packaging host DNA as well as their own) and conjugation (in which plasmids encode a pilus by which they can be transferred from cell to cell). All three have been important in the emergence of molecular biology. Three other mechanisms by which DNA can be transferred between bacteria have come to light, potentially broadening our understanding of the importance of LGT in the microbial biosphere.

Gene Transfer Agents (GTAs)

GTAs are virus-like particles that carry random pieces of the producing cell’s genome. The best characterised GTA was discovered in 1974 in the purple, non-sulphur photosynthetic bacterium Rhodobacter capsulatus, a member of the alpha-proteobacteria. The R. capsulatus GTA (RcGTA) packages 4.5kb of DNA, but is encoded by a 14.1kb cluster of 15 genes on the R. capsulatus chromosome. Many of these genes have homology with bacteriophage structural genes. It is unclear how RcGTA particles are released from the cell, as no recognisable lysis genes have been identified. Transcription of the RcGTA gene cluster has been shown to be under the control of a sensor kinase/response regulator system that transduces environmental signals. RcGTA-like gene clusters are widespread throughout the alpha-proteobacteria, and phylogenetic trees based on RcGTA-like sequences recapitulate phylogenies based on 16S rRNA sequences, suggesting that the RcGTA ancestor arose early in the evolution of the alpha-proteobacterial lineage.

Other GTAs (with probable independent origins) have been identified in a diverse range of prokaryotes including the archaeon Methanococcus voltae, the delta-proteobacterium Desulfovibrio desulfuricans and the spirochete Brachyspira hyodysenteriae. None of them packages more than 14kb of DNA, and all of them take the form of small bacteriophage. It appears most likely that GTAs have been derived from bacteriophage that have lost their ability to self-propagate. Recent data suggest that alpha-proteobacterial GTAs are common in marine environments, and transfer genes at high frequency between diverse classes of alpha-proteobacteria. These ‘generalised transducing machines’, under the control of bacterial populations’ quorum-sensing systems, are probably a major force in microbial evolution and ecology.

DNA transfer by membrane vesicles

DNA encapsulated by MVs. A rosette-like structure is seen in the centre, a plasmid is in the box, and arrowheads mark linear DNA molecules.

Membrane vesicles (MVs) from Gram-negative bacteria can traffic toxins, signals and other proteins between bacteria. They have also been shown to mediate the transfer of DNA between cells. E. coli O157:H7 MVs were found to contain linear DNA, circular plasmids and rosette-like DNA structures, which included genes from chromosomal DNA as well as plasmid and phage. The MVs were capable of transforming related enteric bacteria and increasing their cytotoxicity. DNA transfer by membrane vesicles could be a more widespread phenomenon than is currently appreciated; as yet, however, it is more commonly reported as an aside in other MV studies.

Intercellular nanotubes

Top two images show B. subtilis cells with nanotubes (note more intimate thin connections in circle). Lower three images show inter-specific nanotubes.

A year ago Dubey and Ben-Yehuda showed the existence of tubular conduits forming between Bacillus subtilis cells. These nanotubes were shown to mediate the exchange of proteins and non-conjugative plasmids. Nanotubes were also formed between B. subtilis and Staphylococcus aureus (both Gram-positive), and a thinner variety was formed between either of the Gram-positive species and Gram-negative E. coli. The authors suggest that the formation of ‘syncytium-like synergistic consortia’ mediated by nanotube connections underlies many of the traits displayed by biofilms.

These three phenomena have a tantalising savour, suggesting the depths of our ignorance of the complexity of microbial ecosystems and prokaryotic evolution. However, I imagine that progress in these fields will accelerate. The explosion of microbial and environmental sequencing will be useful in identifying the prevalence of GTAs. Understanding all six modes of LGT will be crucial to our appreciation of the ecology of natural microbial communities and of bacterial evolution, as well as having important applications for human health.

See also: A novel gene transfer agent from Bartonella

Stanton, T. (2007). Prophage-like gene transfer agents—Novel mechanisms of gene exchange for Methanococcus, Desulfovibrio, Brachyspira, and Rhodobacter species Anaerobe, 13 (2), 43-49 DOI: 10.1016/j.anaerobe.2007.03.004

Lang, A., & Beatty, J. (2007). Importance of widespread gene transfer agent genes in α-proteobacteria Trends in Microbiology, 15 (2), 54-62 DOI: 10.1016/j.tim.2006.12.001

McDaniel, L., Young, E., Delaney, J., Ruhnau, F., Ritchie, K., & Paul, J. (2010). High Frequency of Horizontal Gene Transfer in the Oceans Science, 330 (6000), 50-50 DOI: 10.1126/science.1192243

Yaron, S., Kolling, G., Simon, L., & Matthews, K. (2000). Vesicle-Mediated Transfer of Virulence Genes from Escherichia coli O157:H7 to Other Enteric Bacteria Applied and Environmental Microbiology, 66 (10), 4414-4420 DOI: 10.1128/AEM.66.10.4414-4420.2000

Dubey, G., & Ben-Yehuda, S. (2011). Intercellular Nanotubes Mediate Bacterial Communication Cell, 144 (4), 590-600 DOI: 10.1016/j.cell.2011.01.015

Lysine Crotonylation and the Histone Code

A recent study has identified 67 new histone modifications, bringing the current total of known histone marks to 163. Two new classes of modification were discovered: lysine crotonylation and tyrosine hydroxylation. Tan et al go on to show that crotonylated lysine marks active promoters and potentially plays an important role in male germ cell differentiation.

Eukaryotic chromosomal DNA is condensed by being wound around octamers of histone proteins to form nucleosomes. Post-translational modifications (PTMs) of histones can modulate chromatin structure, altering its biological activity (for example its transcriptional status). Different combinations of histone proteins and their PTMs are found throughout the genome and between different cell types. Deciphering this ‘histone code’ is crucial to our understanding of cellular regulation and differentiation, and is therefore the focus of huge amounts of current biological research.

Prior to this new paper, at least twelve different types of histone PTM, at over sixty different amino acid residues, had been reported. These include the most commonly discussed, such as methylation and acetylation, as well as esoterica like citrullination. By performing a highly comprehensive survey of histone PTMs based on mass spectrometry, Tan et al have identified two new types of modification and 67 new histone marks.

The structure of the nucleosome. The four core histones are in different colours. Their N terminal tails are protruding from the nucleosome.

Nucleosomal cores consist of histone octamers containing two molecules each of histones H2A, H2B, H3, and H4. Interactions between histone proteins and between histones and DNA are generally mediated within the globular core domains of the histone proteins, whilst their N-terminal tails protrude from the nucleosome and have been considered the primary sites for post-translational modifications. However, this new study identified many histone PTMs within the globular cores, suggesting that previous methods of PTM identification have been biased against their discovery.

Tan et al also report further characterisation of one of the new types of histone PTM: lysine crotonylation (KCr). Crotonylation was found at 28 different lysine residues from all four core histones and the linker histone H1. KCr was detected in histones isolated from yeast, nematodes and fruit flies, as well as mice and humans.

Using an antibody that recognised all lysine crotonylation, chromatin immunoprecipitation followed by sequencing (ChIP-seq) showed that histone KCr was associated with active chromatin and was particularly enriched at promoter and enhancer regions.

Tan et al went on to find that during mouse spermatogenesis histone KCr is highly enriched in post-meiotic spermatids, coinciding with a general transcriptional shutdown. By using ChIP-seq in combination with transcriptomic data, they showed that KCr was marking a group of genes on the sex chromosomes that are transcriptionally active, whilst the rest of the sex chromosome is inactivated.

Lysine crotonylation appears to be an important new PTM, adding even more complexity to an already complex field of study. The comprehensiveness of the PTM identification technique used in this study, however, suggests that there may not be too many more histone marks to add to the list. The next questions to ask are: Does crotonylation of different lysine residues correlate with different biological events? What enzymes are responsible for the addition and removal of the crotonyl modification, and what effects does the disruption of their activity have? What proteins interact with KCr? As can be ascertained from this taster, deciphering the histone code is going to keep a lot of people busy for a long time.

Tan, M., Luo, H., Lee, S., Jin, F., Yang, J., Montellier, E., Buchou, T., Cheng, Z., Rousseaux, S., Rajagopal, N., Lu, Z., Ye, Z., Zhu, Q., Wysocka, J., Ye, Y., Khochbin, S., Ren, B., & Zhao, Y. (2011). Identification of 67 Histone Marks and Histone Lysine Crotonylation as a New Type of Histone Modification Cell, 146 (6), 1016-1028 DOI: 10.1016/j.cell.2011.08.008

RNAi and Chromatin Modification

RNAi silences genes by targeting mRNAs for degradation. However, a second mode by which RNAi effects gene silencing has emerged: by triggering chromatin modifications. Gu et al have analysed the pattern of a specific chromatin modification in response to exogenous double stranded RNA (dsRNA) in C. elegans and show that RNAi triggered chromatin modification is target gene specific and transgenerationally heritable.

The ability of exogenous dsRNAs to silence homologous target genes (RNA interference, RNAi) was discovered in the nematode worm, C. elegans, approximately fifteen years ago. Feeding worms bacteria expressing dsRNA, or bathing worms in dsRNA, can specifically block gene function and, most surprisingly, this effect is inherited for some generations in C. elegans. RNAi has become an incredibly useful technique in biology, as it works to a greater or lesser extent throughout eukarya and offers a simple and fast method for compromising gene action specifically. Uncovering the mechanisms by which RNAi works has also opened up huge new vistas on cellular function: namely the proliferation of newly identified classes of endogenous RNA molecules and the discovery of their crucial roles in cellular regulation.

In C. elegans the mechanism of RNAi can be divided into two phases. In the first, dsRNA is cut into 20-30nt molecules (primary short interfering (si) RNAs) by the enzyme Dicer. These siRNAs, complexed with Argonaute proteins, recognise and target mRNAs for degradation. The second phase (only present in some organisms) is the de novo synthesis of secondary siRNAs: primary siRNA/Argonaute complexes recruit RNA directed RNA polymerases that use the target mRNA as a template. Apart from cytoplasmic siRNA mediated mRNA degradation, RNAi also functions in the nucleus through siRNA/Argonaute complex interactions with nascent mRNAs and RNA polymerase, and by causing chromatin silencing through histone modifications.

Gu et al have used ChIP-seq (chromatin immunoprecipitation followed by high throughput sequencing) to make genome wide assessments of the effects of RNAi on the extent of a specific histone modification associated with transcriptionally silenced chromatin (histone 3 lysine 9 trimethylation, H3K9me3). RNAi against the gene lin-15B caused an enrichment of H3K9me3 at the lin-15B locus that spread as far as 9kb from the trigger region (the sequence directly targeted by dsRNA), meaning that two neighbouring genes also showed elevated H3K9me3 levels. No other genomic locations showed H3K9me3 enrichment, meaning that RNAi effects on this chromatin modification are specific to the target gene and neighbouring loci. The same pattern was seen when this experiment was repeated with RNAi against three other genes.
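
The 9kb figure is a distance measured outward from the edges of the trigger region. As a rough illustration of how such a spread could be computed from a list of enriched windows, here is a minimal Python sketch. The function name, the coordinates and the windows are all hypothetical toy values chosen for illustration; they are not data or methods from Gu et al.

```python
def max_spread(trigger, enriched_windows):
    """Return the furthest distance (in bp) that enrichment extends beyond the trigger region.

    trigger: (start, end) of the sequence directly targeted by the dsRNA.
    enriched_windows: list of (start, end) windows called as H3K9me3-enriched.
    All coordinates are hypothetical and assumed to lie on one chromosome.
    """
    t_start, t_end = trigger
    spread = 0
    for w_start, w_end in enriched_windows:
        upstream = max(0, t_start - w_start)   # reach to the left of the trigger
        downstream = max(0, w_end - t_end)     # reach to the right of the trigger
        spread = max(spread, upstream, downstream)
    return spread


# Toy example: a 1kb trigger region with enrichment reaching 9kb downstream.
toy_trigger = (10_000, 11_000)
toy_windows = [(9_000, 9_500), (11_000, 12_000), (19_500, 20_000)]
print(max_spread(toy_trigger, toy_windows))
```

A real analysis would first have to call enriched windows from ChIP-seq read counts against a control, which this sketch takes as given.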

RNAi using dsRNA directed at target sequences that are not transcribed into mRNA was unable to affect H3K9me3 levels, showing that interactions with target mRNAs are essential for RNAi triggered chromatin modification. Likewise, RNAi on worm strains defective for various Argonaute proteins that are necessary for secondary siRNA biogenesis failed to trigger H3K9me3 chromatin modifications.

RNAi mediated gene silencing can last for multiple generations in C. elegans; however, it is unclear whether these heritable silencing effects are mediated by inherited siRNAs, by a chromatin based mechanism or by a combination of the two. Gu et al profiled H3K9me3 and small RNAs through three generations after dsRNA exposure. In the first generation of offspring, H3K9me3 enrichment occurred at a similar level as in the parental worms, although the level of siRNAs had fallen off drastically. The chromatin response was still present in the second generation but had fallen away to background levels by the third generation. These results suggest, but do not conclusively show, that H3K9me3 chromatin modifications induced by RNAi are transgenerationally inherited without a requirement for inherited siRNAs or trigger RNA.

These results are consistent with an emerging model in which secondary siRNA/Argonaute complexes, transported to the nucleus, direct chromatin silencing by interacting with nascent RNAs or with cognate DNA sequences. Histone modification is then propagated some distance from the trigger sequences. The findings with regard to heritability are inconclusive and seem to be potentially at odds with a previously discussed paper regarding heritable antiviral response in C. elegans. No doubt we’ll be revisiting this subject matter soon.

Gu, S., Pak, J., Guang, S., Maniar, J., Kennedy, S., & Fire, A. (2012). Amplification of siRNA in Caenorhabditis elegans generates a transgenerational sequence-targeted histone H3 lysine 9 methylation footprint Nature Genetics DOI: 10.1038/ng.1039

Chromatin Assembly and Asymmetric Neuronal Cell Fate Specification

A new paper in Cell by Nakano et al describes the first mutant histone allele recovered from a genetic screen of a multicellular organism. This gain of function mutation in a histone H3 gene of C. elegans causes a very specific defect: a transformation in the fate of a single asymmetric motor neuron. To account for these findings the authors put forward a radical model in which differential epigenetic regulation between sister chromatids leads to asymmetric fate determination upon cell division.

The nematode worm C. elegans has an invariant cell lineage, meaning that any particular cell is generated from a specific series of mother and grandmother cells. Differences between daughter cells are determined either by non-cell autonomous mechanisms such as signalling by neighbouring cells, or by cell autonomous mechanisms such as the asymmetric inheritance of cell fate determinants, or by both.

The MI motor neuron is a left-right unpaired neuron located in the pharynx. The great-great-grandmother cell of MI gives rise to left and right paired lineages of cells, symmetrical, except for one left-right asymmetry: the MI motor neuron and the e3D pharyngeal epithelial cell. The researchers had previously shown that the MI-e3D asymmetry was dependent on a cascade of transcription factors asymmetrically expressed in the grandmother and mother cells of MI: CEH-36 (an Otx homeodomain protein) promoted the expression of the bHLH containing proneural proteins NGN-1 and HLH-2. When any of these proteins are inactivated, the MI neuron is transformed into an e3D-like cell.

In a genetic screen to find other factors involved in the MI-e3D asymmetry, Nakano et al identified a gain of function allele in the gene his-9 as causing MI-e3D transformation. his-9 encodes one of 14 identical replication-dependent histone H3 proteins in C. elegans.

In eukaryotes, chromosomal DNA is condensed by being wound around octamers of various histone proteins to form nucleosomes. Alterations to nucleosome structure or density can determine the accessibility of the DNA to the transcriptional apparatus, and hence the transcription state of that piece of chromatin. These variable chromatin states are said to be ‘epigenetically’ determined, as they can be maintained through mitoses by the inheritance of the modification status of histones (and other non-DNA sequence chromosomal features).

The nucleosome core contains a tetramer composed of two histone H3/H4 dimers. This tetramer forms through interactions between the two H3 molecules. It was these H3-H3 interactions that were compromised in the original mutant allele. The addition of similarly mutated versions of other replication dependent histone H3 genes into wild type worms also transformed the fate of MI, and yet caused no other gross abnormalities. This showed that MI cell fate specification is very sensitive to gain of function mutations in histone H3 genes.

By generating worms that carried mutant his-9 transgenes on an extrachromosomal array that is mitotically unstable (hence creating mosaic worms), Nakano et al showed that the histone H3 gain of function activity acts cell autonomously within the MI mother cell.

Histone H3-H4 dimers are deposited into the nucleosome by a histone chaperone complex called CAF-1. Compromising the activity of any of the CAF-1 subunits in C. elegans also caused MI transformation. Therefore, replication dependent nucleosome formation mediated by CAF-1 is necessary to generate MI-e3D asymmetry.

To integrate their earlier findings with their new data, Nakano et al suggest that the NGN-1/HLH-2 complex recruits histone modifying enzymes that act on CAF-1 assembled nucleosomal arrays to generate an epigenetically marked MI-neuronal state. They combine this with the idea that CAF-1 can generate differences in the densities of nucleosomes between sister chromatids that upon mitotic segregation would generate a difference between sister cells. MI neuronal fate determination would require NGN-1/HLH-2 mediated histone modifications to be found at a specific (CAF-1 mediated) density.

To my knowledge, the idea that asymmetrically inherited epigenetic marks can act as cell fate determinants is novel, and it is potentially a very important mechanism of development. In this case it is only a model that will require a lot more experimentation; however, the authors go on to suggest that it could be a conserved mechanism generating bilateral asymmetries in the nervous systems of mammals as well. Mutations in a microtubule-based motor protein called left-right dynein (LRD) randomize visceral left-right asymmetry in the mouse, because defective cilia cause a left-right determining flow in the node to fail. LRD has also been implicated in biased chromatid segregation. Interestingly, rather than showing randomized asymmetry in the brain, the hippocampi of LRD mutant mice exhibit a loss of bilateral asymmetry, which the authors suggest could be caused by mechanisms parallel to those underlying MI-e3D asymmetry. This is probably a leap too far, but fun anyway.

Nakano, S., Stillman, B., & Horvitz, H. (2011). Replication-Coupled Chromatin Assembly Generates a Neuronal Bilateral Asymmetry in C. elegans Cell, 147 (7), 1525-1536 DOI: 10.1016/j.cell.2011.11.053

lincRNAs in development and evolution

A new study identifying hundreds of long intervening noncoding RNAs (lincRNAs) in the zebrafish shows that these molecules have important conserved roles in vertebrate development.

Thousands of loci in mammalian genomes produce capped, polyadenylated, and often spliced RNA molecules that are greater than 200nt in length yet do not encode proteins. These lincRNAs have been shown to function in a number of cellular processes including X chromosome inactivation and transcriptional regulation. The roles of the vast majority of identified lincRNAs are however unknown.

To identify lincRNAs in the zebrafish, Ulitsky et al designed a pipeline that integrates several genomic datasets. The first stage defined the boundaries of transcriptional units by combining maps of the genomic locations of the 3′ termini of polyadenylated transcripts with a genome wide chromatin state map, based on a specific chromatin modification found in gene promoters, that defines 5′ ends. After subtracting any transcription units known to encode proteins or small RNAs, and comparing the remainder with datasets of transcribed sequences, 567 lincRNA genes were defined. Their approach was quite stringent, so this is an underestimate of the total number of lincRNAs, and is especially biased against those with low levels of expression or especially tissue-restricted expression.
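
The filtering logic of this pipeline can be sketched in a few lines of Python. This is only a schematic under stated assumptions: the class, the field names and the toy records are hypothetical, and the real pipeline works on genome-wide maps rather than on pre-computed yes/no flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionUnit:
    name: str
    has_promoter_mark: bool          # 5' end supported by the promoter chromatin mark
    has_polya_terminus: bool         # 3' end supported by a mapped poly(A) terminus
    known_coding_or_small_rna: bool  # annotated as protein coding or small RNA
    has_transcript_support: bool     # matches a dataset of transcribed sequences


def candidate_lincrnas(units):
    """Keep units with defined 5' and 3' ends, no known coding or small RNA
    annotation, and independent evidence of transcription."""
    return [
        u for u in units
        if u.has_promoter_mark
        and u.has_polya_terminus
        and not u.known_coding_or_small_rna
        and u.has_transcript_support
    ]


# Hypothetical toy records, for illustration only.
toy_units = [
    TranscriptionUnit("unit_a", True, True, False, True),   # passes every filter
    TranscriptionUnit("unit_b", True, True, True, True),    # known coding gene
    TranscriptionUnit("unit_c", True, False, False, True),  # no defined 3' end
]
print([u.name for u in candidate_lincrnas(toy_units)])
```

Requiring every line of evidence at once is what makes this kind of approach stringent: a weakly expressed lincRNA that misses any one dataset is dropped.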

Within the 567 zebrafish lincRNA gene dataset, only 29 instances of sequence conservation with mammalian lincRNAs were identified. This sequence homology typically only spanned small portions of the transcripts (308nt on average, relative to an average lincRNA length of 1,951nt). However, broader features of lincRNA gene structure, such as the distribution and length of exons and introns, were better conserved. The positional relationships between lincRNA genes and neighbouring genes (synteny) were also well conserved.
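
To put those numbers in proportion, the short calculation below uses only the figures quoted above. Dividing one average by another is a rough guide rather than a per-transcript statistic, and the variable names are my own.

```python
conserved_lincrnas = 29      # zebrafish lincRNAs with mammalian lincRNA homology
total_lincrnas = 567         # lincRNA genes defined by the pipeline
avg_conserved_nt = 308       # average length of the conserved portion
avg_lincrna_nt = 1951        # average lincRNA length

# Share of lincRNA genes with detectable conservation.
pct_genes_conserved = 100 * conserved_lincrnas / total_lincrnas

# Rough share of a typical transcript covered by the conserved portion.
pct_length_conserved = 100 * avg_conserved_nt / avg_lincrna_nt

print(f"{pct_genes_conserved:.1f}% of lincRNA genes show conservation")
print(f"about {pct_length_conserved:.1f}% of transcript length is conserved")
```

So roughly one gene in twenty shows detectable conservation, and within those genes the conserved stretch covers about a sixth of the transcript.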

Analysis of the expression of a subset of the identified lincRNAs showed that a high proportion displayed tissue specific embryonic expression patterns, most commonly in the developing central nervous system. To enquire further about the functional significance of lincRNA, the researchers used antisense reagents (morpholinos) to interfere with the function of two of the lincRNAs with significant mammalian homology. In both cases morpholinos causing defective splicing or targeting the areas of conserved sequence caused developmental defects. These morphant phenotypes could be rescued by coinjection of the properly spliced lincRNA. Importantly, they could also be rescued by injection of the orthologous human or mouse lincRNAs. This showed that the developmental functions of these lincRNAs were conserved through vertebrate evolution.

One of the most interesting aspects of this paper is the discussion on the potential mechanisms of lincRNA gene evolution. A higher proportion of zebrafish lincRNA genes show sequence homology with mammalian protein coding sequences than with mammalian lincRNA genes. In addition, 8.6% of zebrafish lincRNAs showed sequence similarity with zebrafish protein coding genes. These findings suggest that some lincRNAs originated from protein coding genes (and vice versa). In this scenario a lincRNA gene can arise either from a pseudogene that has already lost its protein coding function, or from a gene that maintained both protein and lincRNA coding function before losing its protein coding ability. This raises the possibility that some mRNAs might currently carry out lincRNA type non-coding functions.

See also: Linking a lincRNA to active chromatin

Ulitsky, I., Shkumatava, A., Jan, C., Sive, H., & Bartel, D. (2011). Conserved Function of lincRNAs in Vertebrate Embryonic Development despite Rapid Sequence Evolution Cell, 147 (7), 1537-1550 DOI: 10.1016/j.cell.2011.11.055

Retrotransposons as regulatory elements

In a paper from 2004, Peaston et al reported on the expression of various retrotransposons (RTEs) in the mouse oocyte and pre-implantation embryo, finding widespread RTE transcription and the presence of chimeric transcripts composed of host genes and RTEs.

In a cDNA library constructed from full grown oocyte (FGO) transcripts, 12% of sequences were derived from MT (mouse transcript, a member of the MaLR family of nonautonomous LTR class III retrotransposons), whilst in a library from 2 cell stage embryos, 3% of cDNAs were derived from murine ERV-L (MuERV-L, another class III LTR RTE). Expression of these and other RTEs tailed off to nothing by the blastocyst stage. The differential developmental expression profile of these RTEs is interesting: MT is a large component of the maternally contributed RNA in the oocyte, whilst MuERV-L must be expressed zygotically very early in development.

The most important finding of this paper was that the cDNA libraries from FGO and 2 cell embryos contained many chimeric gene transcripts in which the 5′ sequence was derived from retrotransposons. These chimeric mRNAs made up 3% of the FGO library and 1.4% of the 2 cell stage embryo library. A large variety of RTEs contributed to chimeras in the FGO library, but 51% of them involved MT. In the 2 cell stage library, 56% of chimeric transcripts had 5′ contributions from MuERV-L and its relatives, so the RTE composition of the chimeric transcripts correlated with specific RTE abundance. The genes expressed as chimeric transcripts didn’t show any particular functional bias.

When the chimeric transcripts were compared with genomic sequence, it was found that the cognate RTEs were located either within the gene locus or upstream of it. If the RTE was encoded within the gene, the chimeric transcript lacked any exons upstream of it. When the RTE was located upstream of the gene, the chimeric mRNA often (two-thirds of the time) lacked one or more 5′ exons.
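
The first of these rules is simple enough to sketch in Python. The function and coordinates below are hypothetical, assume a gene on the forward strand with coordinates increasing 5′ to 3′, and deliberately ignore the extra 5′ exon skipping seen for upstream RTEs, since the post gives only its frequency.

```python
def retained_exons(exon_starts, rte_position):
    """Return the exon start coordinates expected in a chimeric transcript
    that initiates at an RTE: only exons at or downstream of the RTE are kept.

    Assumes a forward-strand gene; all coordinates are hypothetical.
    """
    return [start for start in sorted(exon_starts) if start >= rte_position]


toy_exons = [1_000, 2_000, 3_000, 4_000]

# RTE inside the gene, between the second and third exons:
# the two upstream exons are missing from the chimera.
print(retained_exons(toy_exons, rte_position=2_500))

# RTE upstream of the gene: this simple rule keeps every exon.
print(retained_exons(toy_exons, rte_position=500))
```

Modelling the upstream case properly would need splice site information, which is beyond what this toy rule attempts.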

Therefore it appears that RTE sequences act as cis-regulatory elements driving oocyte and pre-implantation embryo specific expression of a population of alternatively spliced transcripts encoding (generally) variant proteins. The notion of RTEs as alternative promoters is close to that of transposons as “controlling elements” put forward by their discoverer Barbara McClintock. The authors note that RTE insertions could give rise to co-regulated gene expression and that RTE driven transcription of multiple host genes “provides grounds for selection of new modes of gene regulation by introducing variation”.

In a review of this work, Shapiro uses it as evidence for a “functionalist” perspective, in which he regards mobile elements as “distributed genomic control modules”. This does seem to overstate the purposiveness of TE insertion. One potentially forgets all the cases of deleterious mutations leaving no issue. However, there is no doubt that through evolutionary time, host/parasite arms races can become coevolved integrated functions. An interesting finding in Peaston et al was that sense and antisense transcripts were found in roughly equal proportions when MuERV-L was expressed. This suggested that dsRNA would be formed, triggering RNAi that could seed heterochromatin formation to repress RTE expression. This is again open to two opposing interpretations: either it is part of a host mechanism to inhibit genomic parasites, or conversely (as Shapiro argues) it is “another mechanism by which RTE insertions can influence the expression of nearby coding sequences and act to construct distributed suites of co-ordinately regulated loci”.

See also: On Transposable Elements and Regulatory Evolution 

Peaston AE, Evsikov AV, Graber JH, de Vries WN, Holbrook AE, Solter D, & Knowles BB (2004). Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Developmental cell, 7 (4), 597-606 PMID: 15469847

Shapiro JA (2005). Retrotransposons and regulatory suites. BioEssays : news and reviews in molecular, cellular and developmental biology, 27 (2), 122-5 PMID: 15666350