
Small silencing RNAs. I: Piwi-interacting RNAs.

Three major classes of small RNAs involved in gene silencing have been found in animals: microRNAs (miRNAs), small-interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs). miRNAs are involved in the regulation of mRNA stability and translation, whilst the main purpose of the siRNA and piRNA pathways appears to be the defense of the cell and genome against viruses and transposable elements. Unlike the other two systems, which are ubiquitously active, the piRNA pathway is generally only active in germline cells, the most important locus of defense against transposons.

A common feature of all three pathways is the formation of RNA-induced silencing complexes (RISCs), composed of a small RNA bound to an Argonaute family protein. The small RNA guides RISC to specific target RNAs, resulting in target silencing (generally by the Argonaute protein ‘slicing’ the cognate RNA). A key stage in the miRNA and siRNA silencing pathways is the recognition of double-stranded RNAs and their cleavage by Dicer proteins. This is not a feature of the piRNA system. Another difference is that piRNAs range from 22 nt to 30 nt in length, whilst siRNAs and miRNAs are 21 nt and 22-24 nt long, respectively. When piRNAs were first discovered they were called repeat-associated small-interfering RNAs (rasiRNAs). However, as they are not always associated with repeat sequences and as they bind a specific clade of Argonaute proteins, the PIWI family, they were subsequently renamed.

The piRNA system in Drosophila

A Drosophila melanogaster egg chamber. The large nurse cell nuclei are visible in the upper half, whilst the follicle cells cover the oocyte in the lower half.

The piRNA transposon silencing system has been most comprehensively analysed during oogenesis in the fruit fly, Drosophila melanogaster. Within a Drosophila egg chamber, the germline cells (fifteen nurse cells and the oocyte) share a common syncytial cytoplasm. They are surrounded by a layer of somatic follicle cells, which exchange developmental signals and nutrients with the germline cells. The Drosophila genome harbours over a hundred transposon families, including representatives of all three major classes (LTR and non-LTR retrotransposons, and DNA elements). Some retrotransposons, such as the gypsy family, form viral particles that have been shown to be able to invade the germline from the follicle cells via cellular transport vesicles. The germline is therefore under threat from transposable elements primarily from within, but also from the somatic follicle cells. Two different variants of the piRNA system function in the germline and the somatic follicle cells: a more complicated system involving three PIWI family Argonaute proteins and a piRNA amplification system functions in the germline, whilst a simpler system involving only one PIWI protein works in the follicle cells to silence a more limited repertoire of retrotransposons.

The piRNA pathway in somatic follicle cells

Approximately 70% of somatic piRNAs map to transposons or transposon fragments. Of these, 90% are antisense to active transposons. Mapping piRNAs to genomic sequence has yielded great insight into genomic structure and the piRNA system of transposon control: piRNAs are derived from large clusters of densely packed, inactive transposon copies and fragments. piRNA clusters are a conserved feature of piRNA biology. They generally span dozens to hundreds of kilobases and are located in the heterochromatin associated with centromeres or telomeres. In Drosophila somatic follicle cells, two piRNA clusters dominate: the flamenco locus and cluster 20A. Follicle cell piRNAs from these clusters are derived from one DNA strand, meaning that transcription is unidirectional. In flamenco and cluster 20A, the transposon fragments are generally oriented antisense to the direction of transcription, explaining the strong antisense bias of somatic follicle cell piRNAs. A P-element insertion at the beginning of the flamenco cluster blocks piRNA production from the whole 180 kb cluster, suggesting that the formation of long single-stranded transcripts of piRNA clusters is a necessary stage of piRNA biogenesis. However, the mechanisms of piRNA generation are not clear. It appears likely that the long piRNA precursor transcripts are stochastically cut into smaller fragments. Piwi then selectively binds fragments with a 5′ uridine (75% of Piwi-bound piRNAs have a 5′ uridine residue), and these pre-piRNAs are trimmed at the 3′ end to generate the final piRNA.
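As a rough illustration of how figures such as the 5′ uridine bias and the antisense bias are tallied, here is a minimal Python sketch. The read sequences, the orientation labels and the function name summarise_pirnas are all invented for illustration; a real analysis would start from sequencing reads aligned to the genome and to transposon annotations.

```python
def summarise_pirnas(reads):
    """Return (fraction with a 5' U, fraction antisense) for a list of reads.

    reads: list of (sequence, orientation) tuples, where orientation is
    'sense' or 'antisense' relative to the transposon the read maps to.
    Sequences may be written as RNA (U) or as cDNA (T).
    """
    total = len(reads)
    if total == 0:
        raise ValueError("no reads supplied")
    # Count reads whose first (5') nucleotide is a uridine.
    five_prime_u = sum(1 for seq, _ in reads if seq.startswith(("U", "T")))
    # Count reads that are antisense to the transposon.
    antisense = sum(1 for _, orientation in reads if orientation == "antisense")
    return five_prime_u / total, antisense / total


# Hypothetical toy data, not real piRNA sequences.
toy_reads = [
    ("UGAACCGUAGCUAGCUAGGCUAGCA", "antisense"),
    ("UCGGAUCGAUUAGCUAGCUAGGAUC", "antisense"),
    ("AGCUAGCUAGGAUCGAUCGGAUCGA", "sense"),
    ("UUAGCGAUCGAUCGGAUUAGCGAUC", "antisense"),
]
u_fraction, antisense_fraction = summarise_pirnas(toy_reads)
print(f"5' U: {u_fraction:.0%}, antisense: {antisense_fraction:.0%}")
```

With this toy input, three of the four reads start with a uridine and three of the four are antisense.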

The germline piRNA pathway and ping-pong amplification

In addition to Piwi, Drosophila ovarian germline cells express two related PIWI family Argonaute proteins: Aubergine (Aub) and AGO3. Unlike Piwi, which is localised to the nucleus, Aub and AGO3 are associated with an electron-dense perinuclear region of cytoplasm called nuage. Most importantly, they act together in a sophisticated piRNA amplification loop that is dependent on target expression, termed the ping-pong cycle. In a simplified version: Aub complexed with an antisense piRNA targets and slices a sense transcript of an active transposon, resulting in the production of novel sense piRNA species, which are loaded onto AGO3. The AGO3-piRNA complexes then cleave complementary piRNA cluster transcripts, resulting in the production of novel antisense piRNAs that are loaded onto Aub. The ping-pong cycle thus amplifies sets of antisense and sense piRNAs whose 5′ ends overlap by 10 nt. This offset reflects the position of slicer cleavage, opposite nucleotides 10 and 11 of the guide piRNA, and provides a useful signature showing that ping-pong amplification has occurred.
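The 10 nt signature can be looked for by counting how far the 5′ ends of reads on opposite strands overlap. The Python sketch below is only an illustration of that idea: the coordinates are made up, the function name five_prime_overlaps is hypothetical, and it assumes 1-based positions on a single reference sequence and reads that are longer than the largest overlap examined.

```python
from collections import Counter


def five_prime_overlaps(plus_5p_ends, minus_5p_ends, max_overlap=20):
    """Histogram of 5'-to-5' overlap lengths between opposite-strand reads.

    plus_5p_ends:  5' end positions of plus-strand reads (their leftmost base).
    minus_5p_ends: 5' end positions of minus-strand reads (their rightmost base).
    A plus read starting at p and a minus read starting at q overlap by
    q - p + 1 nucleotides, provided both reads are at least that long.
    """
    minus_counts = Counter(minus_5p_ends)
    histogram = Counter()
    for p in plus_5p_ends:
        for overlap in range(1, max_overlap + 1):
            q = p + overlap - 1  # minus-strand 5' end giving this overlap
            pairs = minus_counts.get(q, 0)
            if pairs:
                histogram[overlap] += pairs
    return histogram


# Hypothetical toy coordinates: most pairs are offset to give a 10 nt overlap.
plus_ends = [100, 100, 250, 400]
minus_ends = [109, 109, 259, 415, 500]
hist = five_prime_overlaps(plus_ends, minus_ends)
for overlap in sorted(hist):
    print(overlap, hist[overlap])
```

In a library shaped by ping-pong amplification, the count at an overlap of 10 would be expected to stand out clearly above the counts at other overlap lengths, as it does in this toy example.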

In the germline, more piRNA clusters are active, representing a larger spectrum of transposons. They are also expressed bidirectionally. An outstanding question is why this doesn’t trigger ping-pong amplification. The most likely reason is that the processes of piRNA biogenesis and transposon silencing are tightly localised and regulated. The roles of other proteins in these processes are starting to be understood. Proteins containing Tudor domains appear to be very important in the localisation and function of Aub and AGO3 in the nuage.

Many other intriguing aspects of piRNA biology are yet to be understood. Although the bulk of piRNAs are directed against transposons, some are involved in the regulation of cellular mRNAs. These piRNAs are derived from mRNAs rather than cluster transcripts: are these transcripts marked in some way to be processed into piRNAs? The links between the primary piRNA biogenesis pathway and the ping-pong amplification system are also poorly understood. An interesting aspect of the piRNA system active in mouse spermatogenesis is that the nucleus-localised mouse PIWI family protein MIWI2 has been implicated in guiding de novo DNA methylation at transposon loci. Is this a more widespread phenomenon?

The piRNA system has been likened to an acquired immune response, and it works together with the siRNA pathway, which provides a more acute response, in transposon silencing. Future posts will discuss the other small RNA systems and go further into piRNA biology.

Senti, K., & Brennecke, J. (2010). The piRNA pathway: a fly’s perspective on the guardian of the genome. Trends in Genetics, 26 (12), 499-509. DOI: 10.1016/j.tig.2010.08.007

Khurana, J., & Theurkauf, W. (2010). piRNAs, transposon silencing, and Drosophila germline development. The Journal of Cell Biology, 191 (5), 905-913. DOI: 10.1083/jcb.201006034

of further interest: piRNAs in the brain: epigenetics and memory

On Transposable Elements and Regulatory Evolution

Transposable elements (TEs), generally considered molecular parasites on the genome, are increasingly being linked to the evolution of new biological functions. TEs have been shown to be a source of novel genes and exons, the ‘arms race’ between them and their hosts has been a driving force in the evolution of epigenetic silencing mechanisms, and they have been shown to serve as cis-acting regulatory elements for host genes. This last role, as regulatory elements, has potentially wide ramifications: TE mobilisation could cause changes to the expression of co-regulated suites of genes. Recently, the emergence and mobilisation of novel TEs has been argued to be a causative factor underlying such ‘punctuated equilibria’ evolutionary phenomena as the Cambrian explosion and the rapid speciation of cichlid fishes. Two new papers analysing mammalian genomic evolution further link transposable elements with the spread of regulatory elements through the genome, and with the evolution of novel characters.

CTCF binding sites

CTCF (CCCTC-binding factor) is a DNA-binding protein with such a diverse and exciting array of potential roles attributed to it that it has been called a ‘master weaver of the genome’. It acts as an insulator, dividing different chromatin domains, and is therefore important for transcriptional activation and repression. This role appears to be linked to the formation of long-distance chromosomal loops, and hence to the global organisation of the chromosomes within the nucleus. Schmidt et al. used ChIP-seq to define all the CTCF binding events in liver cells from five eutherian mammals (human, macaque, mouse, rat, and dog) and a marsupial (opossum). Using this data they defined a core DNA sequence motif that CTCF commonly binds, as well as sets of CTCF binding events that are conserved between the various species. In some lineages certain CTCF-bound DNA sequence motifs were overrepresented. These overrepresented ‘motif-words’ were often embedded within lineage-specific SINE repeats (short interspersed nuclear elements, which are non-autonomous non-LTR retrotransposons). For instance, mice and rats share about 2,000 CTCF binding events that are associated with B2 SINEs, mice have a further 5,300 B2-associated binding events and rats a further 1,200. Enrichments of CTCF binding events associated with lineage-specific SINEs also occurred in the canine and opossum genomes (on a lesser scale). Surprisingly, however, no similar TE-associated enrichment occurred in the primate lineage. Looking at CTCF binding events that were conserved between multiple mammals, Schmidt et al. were also able to find over 100 binding events that were associated with fossilised ancestral transposable sequences.

Overall, this data shows that CTCF binding has expanded via retrotransposition in multiple mammalian lineages and that this is an ancient mechanism of regulatory evolution. CTCF binds a long DNA sequence motif (33 or 34 bp) that is less likely to be generated by random point mutations than the shorter motifs more commonly bound by transcription factors. This is one reason why the expansion of CTCF binding sites should be more closely associated with TEs than that of other regulatory sequence motifs. Another suggestion that the authors make to explain this association is that CTCF binding may protect TEs from repressive DNA or chromatin modifications.
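The argument from motif length can be made concrete with a back-of-the-envelope calculation. The Python sketch below is a deliberately crude model, not anything taken from Schmidt et al.: it assumes a random genome with equal base frequencies, a hypothetical genome size of 3 billion bp, and a motif in which every position is fully specified. Real binding motifs tolerate variation at many positions, so the true chance of a match is higher than this suggests.

```python
def expected_chance_sites(motif_length, genome_size=3_000_000_000):
    """Expected number of exact matches to one fully specified motif
    in a random genome with equal base frequencies, counting both strands.

    genome_size is an assumed, illustrative value.
    """
    return 2 * genome_size * 0.25 ** motif_length


short_motif = expected_chance_sites(8)   # a typical short transcription factor motif
long_motif = expected_chance_sites(33)   # a motif as long as the one bound by CTCF
print(f"8 bp motif:  {short_motif:.3g} expected chance matches")
print(f"33 bp motif: {long_motif:.3g} expected chance matches")
```

Under these assumptions an 8 bp motif is expected to turn up by chance roughly 90,000 times, whereas a fully specified 33 bp motif is expected far less than once. Copying a ready-made site around the genome on a transposable element avoids having to assemble such a long motif mutation by mutation.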

Transposons and the evolution of pregnancy

During mammalian pregnancy, endometrial stromal cells (ESCs) differentiate in response to progesterone and signalling via the cAMP second messenger pathway, to produce a vascularised decidua that can accommodate implantation (a process termed decidualisation). The enhancer that drives expression of Prolactin in response to progesterone/cAMP signalling in ESCs is derived from a MER20 transposon (a hAT-Charlie family DNA transposon). Lynch et al. have found a strong association between MER20 elements and two sets of genes: those that are differentially expressed in mammalian ESCs and those that are responsive to progesterone/cAMP signalling.

Analysing MER20s that are located close to stromally regulated genes, they found that these elements often had regulatory potential, judging by their association with CpG islands and various histone modifications. They then tested whether 21 randomly chosen MER20s bound various transcription factors and insulator proteins. Fourteen MER20s bound a suite of 5 different insulator proteins (including CTCF), whilst 5 different transcription factors important for ESC development bound together in 4 cases. This suggested that MER20s could be classified into ‘insulator’ and ‘enhancer-repressor’ types. Using a reporter gene assay in various cell types, they then showed that the majority of these MER20s acted as regulatory elements in response to progesterone/cAMP signalling specifically in ESCs.

This data indicates that the rewiring of the gene regulatory network of ESCs during the evolution of pregnancy was partly mediated by MER20 transposition events. In this case, MER20s contain sequences for regulatory assemblies of transcription factors responsive to specific signalling pathways, and hence have acted as cell type specific regulatory elements.

These two papers, as well as an increasing number of other studies, show that TEs are important agents of gene regulatory network evolution. The findings of Lynch et al. especially confirm the perspicacity of the discoverer of transposable elements, Barbara McClintock, in terming them ‘controlling elements’.

See also: Retrotransposons as regulatory elements

Lynch, V., Leclerc, R., May, G., & Wagner, G. (2011). Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nature Genetics, 43 (11), 1154-1159. DOI: 10.1038/ng.917

Schmidt, D., Schwalie, P., Wilson, M., Ballester, B., Gonçalves, A., Kutter, C., Brown, G., Marshall, A., Flicek, P., & Odom, D. (2012). Waves of Retrotransposon Expansion Remodel Genome Organization and CTCF Binding in Multiple Mammalian Lineages. Cell, 148 (1-2), 335-348. DOI: 10.1016/j.cell.2011.11.058

Zeh, D., Zeh, J., & Ishida, Y. (2009). Transposable elements and an epigenetic basis for punctuated equilibria. BioEssays, 31 (7), 715-726. DOI: 10.1002/bies.200900026

Phillips, J., & Corces, V. (2009). CTCF: Master Weaver of the Genome. Cell, 137 (7), 1194-1211. DOI: 10.1016/j.cell.2009.06.001