Category Archives: minireviews

Interacting small RNA pathways in worms 1: Introduction

A cluster of new papers, in the journals Cell and Science, discuss the links between piRNAs and various endogenous siRNA pathways in the nematode worm C. elegans. Emerging from these experiments is a picture of a genome-wide surveillance system capable of differentiating between self and non-self nucleic acids. The commonalities and differences between these papers require rather detailed analyses. I’m therefore intending to write a series of posts: first covering some of the background information on these small RNA systems, and then getting on to the new findings.

A panoply of small RNA molecules, involved in diverse cellular functions, has been discovered in the wake of the initial observation of RNA interference (RNAi). Originally, RNAi described the mechanism by which genes could be specifically silenced by the exogenous application of cognate double-stranded RNAs. Nowadays, the term RNAi is more generally applied to gene silencing pathways involving the three major classes of small RNAs: microRNAs (miRNAs), small-interfering RNAs (siRNAs), and piwi-interacting RNAs (piRNAs). A common feature of all these small RNAs is that they complex with members of the Argonaute (AGO) family of proteins. Embedded within AGOs, the small RNAs act as guides, base-pairing with specific target RNAs that can then be cleaved by the RNase H-like endoribonuclease activity of the AGO protein. However, not all Argonautes act by this ‘slicing’ activity; gene silencing can also be achieved by interactions with pathways involved in chromatin modification, or the inhibition of transcriptional elongation. Meanwhile, the list of non-silencing roles of AGOs and their complexed small RNAs continues to grow: chromosome segregation, double-strand break repair, programmed genomic rearrangement, etc.

The synthesis of both miRNAs and siRNAs generally involves the recognition of dsRNA and its cleavage by Dicer enzymes. miRNAs are derived from short stem-loop structures found in transcripts. siRNAs are the main effectors of the ‘classical’ RNAi. Exogenous dsRNA molecules are cleaved by Dicer into 20-30nt siRNAs that are loaded onto AGOs. However, this pathway also targets endogenously formed dsRNA, which can derive from hairpin structures in transcripts, or by base-pairing between transcripts produced either from separate loci, or by bidirectional transcription at individual loci. Hence, endogenous siRNAs generally target transposons or other repetitive sequences, but can target genes as well.

This basic siRNA pathway is present in most animals, but more complicated systems exist in plants, fungi and nematode worms. These ‘secondary siRNA’ systems rest on the use of RNA-dependent RNA polymerases (RdRPs) to amplify siRNAs against targets recognised by primary siRNAs. In the cases of plants and fungi, the dsRNAs produced by RdRPs are again processed by Dicer enzymes and then loaded onto AGOs. However, various populations of RdRP-produced siRNAs in C. elegans do not require Dicer cleavage.

As noted above, the key effectors of these small RNA pathways are Argonaute proteins. The number of AGOs present in organisms varies widely. The Drosophila genome encodes 5 AGOs; 3 of these are involved in the piRNA system, whilst the other two specialise in either miRNAs or siRNAs. In contrast, C. elegans has 27 AGOs. This reflects the presence of various additional networks of endogenous siRNAs. Deep sequencing in C. elegans has revealed a large diversity of small RNAs, with major peaks at 21, 22, and 26 nt. Different types of sRNAs have different biases in relation to their predominant 5′ residue, 3′ modifications, and extent of 5′ phosphorylation.

This series of posts will ignore many classes of C. elegans siRNAs and instead concentrate on two varieties of 22nt endo-siRNAs with 5′ guanosine residues (22G-RNAs). 22G-RNAs associated with the AGO CSR-1 have been shown to play critical roles in chromosome segregation during meiosis and mitosis. Another population of 22G-RNAs, which associates with various worm-specific AGOs (WAGOs), has been implicated in the long-term silencing of transposons and other genomic loci. The piRNAs of worms (21U-RNAs) display some critical differences from those found in Drosophila and mammals. However, understanding their role in C. elegans may help to answer some of the outstanding questions about their functions throughout animals.
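The naming of these classes follows directly from read length and 5′ residue, which can be sketched in a few lines of Python. This is only an illustration of the naming convention: the function name and the example reads are made up here, and real annotation pipelines also use genomic origin, AGO association and 3′ modifications, since not every 21nt read starting with U is a piRNA.

```python
def classify_small_rna(seq: str) -> str:
    """Label a read by length and 5' nucleotide (naming convention only)."""
    rna = seq.upper().replace("T", "U")  # accept DNA-alphabet sequencing reads
    if not rna:
        return "other"
    length, first = len(rna), rna[0]
    if length == 21 and first == "U":
        return "21U-RNA"  # worm piRNA-like by this rule
    if length == 22 and first == "G":
        return "22G-RNA"  # CSR-1 or WAGO class; the rule cannot tell which
    return "other"


# Hypothetical reads, built by repetition so the lengths are unambiguous.
reads = ["U" + "A" * 20, "G" + "A" * 21, "A" * 22]
labels = [classify_small_rna(r) for r in reads]
```

Note that length and 5′ residue alone cannot separate CSR-1-bound from WAGO-bound 22G-RNAs; that distinction depends on which Argonaute the RNA was recovered with.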

Ketting RF (2011). The many faces of RNAi. Developmental cell, 20 (2), 148-61 PMID: 21316584

See also these related posts:
small silencing RNAs. I: Piwi-interacting RNAs.
RNAi and Chromatin Modification
Lamarckian inheritance of antiviral response in Nematodes.
Double-strand break interacting RNAs (diRNAs)


Prions, more than just brain rot.

Prions, self-replicating proteins, the causative agents underlying BSE and CJD, have potentially important roles in evolution and memory formation.

Here in the UK, we don’t need reminding about the horrific consequences of transmissible spongiform encephalopathies. Over one hundred and fifty Britons have died of variant Creutzfeldt-Jakob disease, and the images of cattle suffering the effects of BSE (and ministers feeding their daughters burgers) are still fresh in the mind. Part of the difficulty felt by scientists and government in handling the BSE crisis was due to these diseases being caused by a novel form of pathogenic entity: prions. Rather than encoding the information facilitating their replication and transmission in a DNA or RNA genome, like viruses and other pathogens, prions are proteins capable of self-replication. More specifically, they are an aberrant conformation of a protein, capable of seeding the misfolding of the native form. In the case of the spongiform encephalopathies, the native ‘prion protein’ (a component of neuronal membranes, still of unknown function) is converted into a tangled form that is resistant to enzymatic degradation. This prion form is therefore capable of transmission through digestive systems that would normally degrade proteins. In the brain, the prions form toxic aggregations, causing neurodegeneration and death.

The idea that self-replicating proteins could act as elements of inheritance was revolutionary at the time, but prions are now being found to exist in many other contexts, and rather than acting as pathogens, their potential functions are yielding exciting insights into evolution and brain function.

In yeast, at least nine different proteins have been shown to form prions, and eighteen more contain prion-forming domains. These are often important proteins involved in the control of cellular regulation. The best characterised yeast prion, [PSI+], is a form of Sup35, a factor responsible for the termination of translation (the process of converting the sequence of messenger RNA transcripts into the amino acid sequence of protein). [PSI+] titrates normal Sup35 protein, lessening its level, and leading to an increase in translational read-through. This read-through effectively uncovers cryptic genetic variation. Genetic sequence that does not normally encode protein will be under less stringent evolutionary selection pressure than coding sequence. If this sequence is suddenly translated into protein in [PSI+] cells it may, in a minority of cases, be beneficial for the cells’ adaptation to their environment. Protein folding is mediated by ‘chaperone’ proteins, which are also closely involved in the response to environmental stresses. Hence, prion formation, a case of protein ‘mis-folding’, is more likely to occur during times of stress. Prions can therefore act as switches responsible for the sudden appearance of complex traits in response to environmental conditions. Although these possible ‘evolvability’ roles for prions in yeast are controversial, it appears that the prion-forming domains responsible for this capacity have been conserved for long periods of evolutionary time, and do not generally have other major functions. Recently, [PSI+] and another yeast prion have been shown to exist in wild yeast strains, strengthening the argument that they are not simply diseases or artifacts of laboratory culture.
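To make the read-through idea concrete, here is a minimal Python sketch of translating a transcript with and without skipping a stop codon. It is a toy under stated assumptions: the mRNA is invented, the parameter read_through_stops is a hypothetical knob rather than anything measured, and the residue inserted at the suppressed stop is written as a placeholder "X" rather than a specific amino acid.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str, read_through_stops: int = 0) -> str:
    """Translate from position 0, ignoring the first N stop codons."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            if read_through_stops == 0:
                break  # normal termination
            read_through_stops -= 1
            aa = "X"  # placeholder for whatever residue is inserted
        protein.append(aa)
    return "".join(protein)


# Invented transcript: Met-Ala-STOP-Gly-Lys-STOP
mrna = "AUGGCUUAAGGUAAAUGA"
normal = translate(mrna)                           # "MA"
extended = translate(mrna, read_through_stops=1)   # "MAXGK"
```

The extra residues in the extended product come from sequence downstream of the normal stop codon, which is the "cryptic" variation referred to above.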

Proteins that contain prion-forming domains are present in many branches of the tree of life. A particularly exciting example, found in the sea slug Aplysia, is an RNA-binding protein called CPEB. This protein is responsible for regulating the activation of the translation of mRNA transcripts in neuronal synapses in response to neurotransmitters, such as serotonin. The fact that it contains a prion-domain capable of propagating and stabilising a conformational change in the protein, and that this change equates to variant activities, has suggested an exciting answer to a hoary problem in neurobiology: the endurance of memories. Proteins and other cellular components are generally turned over in a matter of hours. How then can they be responsible for encoding memories that can last years? By CPEB undergoing a regulated prion-like polymerisation in response to synaptic transmission, a long-term memory of this stimulation can be stored. An equivalent CPEB in the fruit fly has recently been shown to be working in a similar way. It appears that this could be a more general mechanism for cellular memory storage in animal neurons.

A role for prions in memory is intriguing, as it hints at a reason why neurodegenerative diseases are so often associated with build-ups of inactive mis-folded proteins. These ‘amyloid’ deposits are a feature of Alzheimer’s, Parkinson’s and Huntington’s diseases as well as the spongiform encephalopathies. Is this commonality a side effect of the brain normally permitting more regulated prion-like polymerisation events during memory formation?

The existence of self-replicating proteins, a new ‘epigenetic’ level of inheritance, has opened exciting new avenues of research. These new roles for prions in brain function and evolution could be just the tip of the iceberg.

Shorter, J., & Lindquist, S. (2005). Prions as adaptive conduits of memory and inheritance Nature Reviews Genetics, 6 (6), 435-450 DOI: 10.1038/nrg1616

Halfmann, R., & Lindquist, S. (2010). Epigenetics in the Extreme: Prions and the Inheritance of Environmentally Acquired Traits Science, 330 (6004), 629-632 DOI: 10.1126/science.1191081

See also,
Amyloid-like oligomers and long-term memory.

On ICE: Integrative and Conjugative Elements.

Integrative and conjugative elements are bacterial mobile genetic elements that primarily reside in the host cell’s chromosome, yet have the ability to be transferred between cells by conjugation. ICEs can be considered as mosaic elements, combining features from other mobile elements: the integrative ability of bacteriophage or transposons, and the transfer mechanisms of conjugative plasmids. This mosaicism is reflected in modular structures: genes encoding the core functions of integration/excision, conjugation and regulation are generally found clustered together. As well as these core functions, ICEs often carry accessory genes that can bestow adaptive phenotypes on their hosts. Gene cassettes encoding antibiotic resistance, nitrogen fixation, virulence factors and various other functions have all been documented in ICEs. They are therefore important vectors for the horizontal dissemination of genetic information, facilitating rapid bacterial evolution.

Chromosomal integration and excision of ICEs is mediated by integrase (Int) enzymes. Most commonly, integrases are tyrosine recombinases related to the well-studied phage λ Int. They mediate site-specific recombination events between identical or near-identical sequences in the host and ICE genomes (termed attB and attP respectively). These integration events normally occur in tRNA genes. No definitive reason for this association of tyrosine recombinase mediated integration with tRNA genes is known; however, tRNA genes evolve more slowly than protein coding genes, potentially broadening the possible host range. Other ICEs encode transposase family tyrosine recombinase Ints that have broader target sequence preferences. Members of the DDE transposase and serine recombinase families also serve as integrases in some ICEs. Before ICE transfer occurs, the element is excised and circularised. Excision of ICEs also requires Int activity; however, the reaction is biased towards excision by ‘recombination directionality factors’ (RDFs). If chromosomal replication or cell division occurs whilst the ICE is in the excised, extrachromosomal state, the element could be lost from the cell. To prevent this, ICEs (like plasmids) often encode addiction modules (toxin-antitoxin systems) that kill cells not inheriting the ICE.
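The bookkeeping of site-specific integration and excision can be mimicked with plain strings. The sketch below is a toy under loud assumptions: sequences are invented, the shared core is an exact match that occurs once in each molecule, only one strand is modelled, and the function names are hypothetical. It says nothing about the enzymology of Int or RDFs.

```python
def integrate(chromosome: str, ice_circle: str, core: str) -> str:
    """Insert a circular element at the shared core (attB x attP, toy model)."""
    b = chromosome.find(core)
    p = ice_circle.find(core)
    if b == -1 or p == -1:
        raise ValueError("core sequence must be present in both molecules")
    # Open the circle at its core copy; the rest becomes the linear insert.
    linear = ice_circle[p + len(core):] + ice_circle[:p]
    # The integrated element ends up flanked by two copies of the core.
    return chromosome[:b] + core + linear + core + chromosome[b + len(core):]


def excise(integrated: str, core: str) -> tuple[str, str]:
    """Recombine the two flanking cores, restoring chromosome and circle."""
    first = integrated.find(core)
    second = integrated.find(core, first + len(core))
    if first == -1 or second == -1:
        raise ValueError("two core copies are needed for excision")
    ice_circle = core + integrated[first + len(core):second]
    chromosome = integrated[:first] + core + integrated[second + len(core):]
    return chromosome, ice_circle


core = "GATTACA"                 # invented shared core sequence
chromosome = "AAAA" + core + "CCCC"
ice = "TT" + core + "GGGG"       # a circle, written from an arbitrary start
integrated = integrate(chromosome, ice, core)
restored, circle = excise(integrated, core)
```

After excision the chromosome is back to its original sequence and the circle is a rotation of the original element, which is why an excised ICE that is not re-integrated or transferred can simply be lost at cell division.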

Conjugative transfer occurs via the formation of a multiprotein apparatus that connects the donor and recipient cells: a type IV secretion system (T4SS). This consists of a membrane spanning secretion channel and often an extracellular pilus. The extrachromosomal ICE DNA is first nicked at the origin of transfer (oriT) by a relaxase (MOB) enzyme. Rolling circle replication is then initiated. MOB remains bound to the displaced single-stranded DNA and this nucleoprotein complex is targeted to the T4SS by a coupling protein (T4CP). Rather than using a T4SS, some ICEs are transferred between cells using FtsK-like DNA translocase pumps (in this case dsDNA is transferred). After transfer, the ssDNA ICE is replicated into dsDNA and integrated into the recipient cell’s chromosome. The ICE in the donor cell is also converted into dsDNA and re-integrated into the genome.

ICE transmission is under the control of networks responsive to environmental stimuli. For instance, transfer of the SXT-R391 family of ICEs is controlled by SetR, a homologue of the λ repressor CI. Regulation occurs by a mechanism similar to that controlling the λ switch from lysogeny to lysis. As with CI, SetR repression can be relieved by the action of RecA, the main effector of the ‘SOS’ response to DNA damage. The transmission of other ICEs has been shown to be under the control of quorum sensing networks.

A recent bioinformatic study of ICE prevalence and diversity identified ICEs by finding clustered conjugative apparatus modules (Guglielmini et al.). If these were found at chromosomal locations they were defined as belonging to ICEs, whilst those on extra-chromosomal elements were considered conjugative plasmids. No reference was made to the presence of integrases. Within this definition ICEs were more common than conjugative plasmids: 18% of sequenced prokaryotic genomes contained at least one ICE as opposed to 12% possessing conjugative plasmids. ICEs are generally defined as not being capable of autonomous extra-chromosomal replication and maintenance. This is opposed to conjugative plasmids, which include replication origins and systems. However, this definition is not watertight, as there appear to be various exceptions. Likewise, conjugative plasmids can be integrated chromosomally, either by homologous recombination at repeat sequences, or by site-specific recombination events. These elements therefore exist on a spectrum. Phylogenetic analysis of VirB4 genes (an ATPase component of T4SS) shows that ICEs and conjugative plasmids do not segregate as monophyletic clades. Instead they are intermingled throughout the tree, suggesting that conjugative plasmids often become ICEs and vice versa. Guglielmini et al. therefore consider them as two sides of the same coin. If selection pressures are strong enough though, ICEs can be stabilised as chromosomal structures for long periods of time.
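The operational definition described above (location of the conjugation module decides the label, integrases are not consulted) is simple enough to write down. The following Python sketch uses a hypothetical record type and invented toy data; it is not the authors' pipeline, only the classification rule as summarised in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConjugativeSystem:
    genome_id: str
    replicon_type: str  # "chromosome" or "plasmid" (assumed annotation)


def classify(system: ConjugativeSystem) -> str:
    """Label by replicon only; integrase presence is deliberately ignored."""
    if system.replicon_type == "chromosome":
        return "ICE"
    return "conjugative plasmid"


def prevalence(systems: list[ConjugativeSystem], n_genomes: int) -> tuple[float, float]:
    """Fraction of genomes with at least one ICE / conjugative plasmid."""
    with_ice = {s.genome_id for s in systems if classify(s) == "ICE"}
    with_plasmid = {s.genome_id for s in systems
                    if classify(s) == "conjugative plasmid"}
    return len(with_ice) / n_genomes, len(with_plasmid) / n_genomes


# Invented toy data set: four genomes, one of which (g4) has no system.
systems = [
    ConjugativeSystem("g1", "chromosome"),
    ConjugativeSystem("g1", "chromosome"),  # second ICE in the same genome
    ConjugativeSystem("g2", "plasmid"),
    ConjugativeSystem("g3", "chromosome"),
]
ice_fraction, plasmid_fraction = prevalence(systems, n_genomes=4)
```

Counting genomes rather than elements is what makes figures such as "18% of genomes contain at least one ICE" comparable with the plasmid figure; a genome with two ICEs is still counted once.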

A striking example of the potency and evolutionary importance of ICEs is found in the genomes of the obligate intracellular bacterial order Rickettsiales. One third of the genome of Orientia tsutsugamushi (the mite-borne causative agent of scrub typhus) is made up of degenerate copies of an ICE named RAGE (Rickettsiales amplified genetic element). Multiple invasions of RAGE have also configured the genome of a Rickettsial endosymbiont of a deer tick (REIS). In this case RAGEs have acted as hotspots for recombination and the insertion of other mobile elements, leading to the insertion of clusters of novel horizontally transferred genes (a process termed piggybacking). These two Rickettsiales species have especially large genomes for obligate intracellular bacteria, but it seems likely that RAGE has been important in the evolution of this entire clade.

See also: Expanding the Conjugative Realm

Wozniak, R., & Waldor, M. (2010). Integrative and conjugative elements: mosaic mobile genetic elements enabling dynamic lateral gene flow Nature Reviews Microbiology, 8 (8), 552-563 DOI: 10.1038/nrmicro2382

Guglielmini, J., Quintais, L., Garcillán-Barcia, M., de la Cruz, F., & Rocha, E. (2011). The Repertoire of ICE in Prokaryotes Underscores the Unity, Diversity, and Ubiquity of Conjugation PLoS Genetics, 7 (8) DOI: 10.1371/journal.pgen.1002222

Nakayama, K., Yamashita, A., Kurokawa, K., Morimoto, T., Ogawa, M., Fukuhara, M., Urakami, H., Ohnishi, M., Uchiyama, I., Ogura, Y., Ooka, T., Oshima, K., Tamura, A., Hattori, M., & Hayashi, T. (2008). The Whole-genome Sequencing of the Obligate Intracellular Bacterium Orientia tsutsugamushi Revealed Massive Gene Amplification During Reductive Genome Evolution DNA Research, 15 (4), 185-199 DOI: 10.1093/dnares/dsn011

Gillespie, J., Joardar, V., Williams, K., Driscoll, T., Hostetler, J., Nordberg, E., Shukla, M., Walenz, B., Hill, C., Nene, V., Azad, A., Sobral, B., & Caler, E. (2011). A Rickettsia Genome Overrun by Mobile Genetic Elements Provides Insight into the Acquisition of Genes Characteristic of an Obligate Intracellular Lifestyle Journal of Bacteriology, 194 (2), 376-394 DOI: 10.1128/JB.06244-11

On Retrons

The secondary structure of msDNA Ec73. The 76nt RNA (in box) is joined to a 73nt ssDNA. Note the 2′-5′ phosphodiester bond connecting the two molecules at the branching guanosine.

Retrons are an understudied type of prokaryotic retroelement responsible for the synthesis of an enigmatic species of small extra-chromosomal satellite DNA termed multicopy single-stranded DNA (msDNA). msDNAs are actually composed of both a single-stranded (ss) DNA and a ssRNA. The 5′ end of the msDNA is covalently bonded to an internal guanosine residue of the msRNA by a unique 2′-5′ phosphodiester bond, whilst the 3′ ends of the molecules are joined by a small stretch of base-pairing. msDNAs are therefore a sort of looped hybrid molecule, but extensive internal base pairing creates various stem-loop/hairpin secondary structures (see figure). The retron (i.e. the genetic loci encoding the msRNA and msDNA molecules, msr and msd, together with the gene encoding the reverse transcriptase, ret, responsible for the synthesis of msDNA) is transcribed as an operon.

Retrons are present in a wide variety of eubacterial, and some archaeal, genomes. A recent study identified 97 different retron-like reverse transcriptase genes within bacteria; however, their distribution is sporadic. For instance, seven distinct retron elements have been found amongst E. coli strains, but only 15% of natural E. coli isolates produce msDNAs. Based on their sporadic occurrence and analysis of codon usage, retrons have been suggested to be a recent addition to the E. coli genome.

A major exception to the sporadic distribution found in most bacteria is within the myxobacteria, where all ten genera include msDNA-producing species. Myxobacterial retrons form a phylogenetically related group. These features, as well as sequence divergence, suggest that the common ancestor of the extant myxobacteria contained a retron as much as 150 million years ago, which has been vertically transmitted.

Retrons have not been shown to be mobile genetic elements, although the presence of reverse transcriptase does suggest this possibility. A clue to their propagation is the association of many of them with prophage sequences, suggesting their spread could be associated with bacteriophage. However, as with many observations about retrons, there are plenty of exceptions.

Organisation of a retron operon. Note the inverse orientations and short overlap of msr and msd.

msDNA is essentially a cDNA produced from a short region of an mRNA template. During msDNA synthesis, an RNA template derived from the operon mRNA and composed of msr and msd is folded into a specific secondary structure due to flanking inverted repeat sequences. The msd sequence is then reverse transcribed by the retron reverse transcriptase, using the 2′-OH group of the ‘branching’ guanosine residue as a primer. The RNA template strand is then degraded behind the polymerase by RNase H activity (probably host cell derived), leaving the msDNA covalently bonded at its 5′ end and base paired to the msRNA at their 3′ ends.
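As a reminder of what "reverse transcribed" means at the sequence level, here is a tiny Python sketch. It only computes the cDNA that is complementary to an RNA template, written 5′ to 3′; the unusual 2′-OH priming from the branching guanosine and the RNase H step are not modelled, and the template used is invented rather than a real msd sequence.

```python
# RNA base -> complementary DNA base
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def reverse_transcribe(rna_template: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    rna = rna_template.upper()
    if set(rna) - set("ACGU"):
        raise ValueError("template must contain only A, C, G and U")
    # Complement each base, then reverse so the DNA reads 5' -> 3'.
    return rna.translate(RNA_TO_CDNA)[::-1]


template = "AUGC"            # invented four-base template
cdna = reverse_transcribe(template)  # "GCAT"
```

Because the DNA is the reverse complement of its template, any inverted repeats in the msd region are carried over into the msDNA, which is what lets it fold into the hairpins shown in the figure.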

No function has been unequivocally attributed to msDNA. Mutating retron ret genes to prevent synthesis of E. coli or myxococcal msDNAs produces no detectable effects. Overexpression of certain E. coli msDNAs has been shown to increase mutation rate. msDNAs generally form hairpin structures by complementary base pairing of inverted repeat sequences (see figure). However, in many msDNA hairpins the base pairing is imperfect. It appears that the overexpression-associated mutation rate phenotype is due to mismatch-containing msDNAs sequestering the mismatch repair enzyme MutS. Overexpression of msDNAs without mismatch-containing hairpins does not cause similar effects. It is possible that msDNA could be regulating MutS availability by this titration mechanism in normal conditions or as part of a stress response. However, the overexpression experiments lead to msDNA concentrations far beyond normal physiological levels, so can yield no more than a hint of normal function.

In conclusion, the lacunae in our understanding of retrons and msDNA are far more striking than the known facts. Are retrons parasitic elements? Or do msDNAs have physiological roles in their host cells? Are retrons mobile elements? Just what does msDNA do? Judging from the literature, interest in retrons peaked around 1990, and recent years have been very fallow. I do hope that funding agencies and researchers keep pursuing the answers to these questions and don’t let them remain as an interesting oddity in the literature.

Lampson, B., Inouye, M., & Inouye, S. (2005). Retrons, msDNA, and the bacterial genome Cytogenetic and Genome Research, 110 (1-4), 491-499 DOI: 10.1159/000084982

Simon, D., & Zimmerly, S. (2008). A diversity of uncharacterized reverse transcriptases in bacteria Nucleic Acids Research, 36 (22), 7219-7229 DOI: 10.1093/nar/gkn867

small silencing RNAs. I: Piwi-interacting RNAs.

Three major classes of small RNAs involved in gene silencing have been found in animals: microRNAs (miRNAs), small-interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs). miRNAs are involved in the regulation of mRNA stability and translation, whilst the main purpose of the siRNA and piRNA pathways appears to be the defense of the cell and genome from viruses and transposable elements. Unlike the other two systems that are ubiquitously active, the piRNA pathway is generally only active in germline cells, the most important locus of defense against transposons.

A common feature of all three pathways is the formation of RNA-induced silencing complexes (RISCs), composed of a small RNA bound to an Argonaute family protein. The small RNA guides RISC to specific target RNAs, resulting in target silencing (generally by the Argonaute protein ‘slicing’ the cognate RNA). A key stage in the miRNA and siRNA silencing pathways is the recognition of double stranded RNAs, and their cleavage by Dicer proteins. This is not a feature of the piRNA system. Another difference is that piRNAs range from 22nt to 30nt in length, whilst siRNAs are 21nt and miRNAs 22-24nt long. When piRNAs were first discovered they were called repeat-associated small-interfering RNAs (rasiRNAs). However, as they are not always associated with repeat sequences and as they bind a specific clade of Argonaute proteins, the PIWI family, they were subsequently renamed.

The piRNA system in Drosophila

A Drosophila melanogaster egg chamber. The large nurse cell nuclei are visible in the upper half, whilst the follicle cells cover the oocyte in the lower half.

The piRNA transposon silencing system has been most comprehensively analysed during oogenesis in the fruitfly, Drosophila melanogaster. Within a Drosophila egg chamber, the germline cells (fifteen nurse cells and the oocyte) share a common syncytial cytoplasm. They are surrounded by a layer of somatic follicle cells, which exchange developmental signals and nutrients with the germline cells. The Drosophila genome harbours over a hundred transposon families, including representatives of all three major classes (LTR and non-LTR retrotransposons, and DNA elements). Some retrotransposons, such as the gypsy family, form viral particles that have been shown to be able to invade the germline from the follicle cells via cellular transport vesicles. Therefore the germline is under threat from transposable elements primarily from within, but also from the somatic follicle cells. Two different variants of the piRNA system function in the germline and the somatic follicle cells: a more complicated system involving three PIWI family Argonaute proteins and a piRNA amplification system functions in the germline, whilst a simpler system involving only one PIWI protein works in the follicle cells to silence a more limited repertoire of retrotransposons.

The piRNA pathway in somatic follicle cells

Approximately 70% of somatic piRNAs map to transposons or transposon fragments. Of these, 90% are antisense to active transposons. Mapping piRNAs to genomic sequence has yielded a great insight into genomic structure and the piRNA system of transposon control: piRNAs are derived from large clusters of densely packed, inactive transposon copies and fragments. piRNA clusters are a conserved feature of piRNA biology. They generally span dozens to hundreds of kilobases and are located in the heterochromatin associated with centromeres or telomeres. In the case of Drosophila somatic follicle cells two piRNA clusters dominate: the flamenco locus and cluster 20A. Follicle cell piRNAs from these clusters are derived from one DNA strand, meaning that transcription is unidirectional. In flamenco and cluster 20A, the transposon fragments are generally oriented antisense to the direction of transcription, explaining the strong antisense bias of somatic follicle cell piRNAs. A P-element insertion at the beginning of the flamenco cluster blocks piRNA production from the whole 180kb cluster, suggesting that the formation of long single stranded transcripts of piRNA clusters is a necessary stage of piRNA biogenesis. However, the mechanisms of piRNA generation are not clear. It appears likely that the long piRNA precursor transcripts are stochastically cut into smaller fragments. Piwi then selectively binds fragments with a 5′ uridine (75% of Piwi-bound piRNAs have a 5′ uridine residue), and the pre-piRNAs are then 3′ trimmed to generate the final piRNA.
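The proposed select-then-trim model can be caricatured in a few lines of Python. This is a toy of the hypothesis as stated above, not a description of the real machinery: the function name is made up, the final length of 26nt is an arbitrary choice within the piRNA size range, and selection for a 5′ uridine is treated as absolute even though only about 75% of Piwi-bound piRNAs start with U.

```python
def toy_primary_pirnas(fragments: list[str], final_length: int = 26) -> list[str]:
    """Keep 5'-U fragments and trim their 3' ends to a fixed length."""
    pirnas = []
    for fragment in fragments:
        rna = fragment.upper().replace("T", "U")
        # "Piwi binding": require a 5' uridine and enough length to trim.
        if rna.startswith("U") and len(rna) >= final_length:
            pirnas.append(rna[:final_length])  # 3' trimming
    return pirnas


# Invented precursor fragments: only the first satisfies both conditions.
fragments = ["U" + "A" * 29, "A" * 30, "UAC"]
pirnas = toy_primary_pirnas(fragments)
```

In this picture the 5′ end of a piRNA is fixed by the initial cut and by Piwi's preference, whilst the 3′ end is defined afterwards by trimming.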

The germline piRNA pathway and ping-pong amplification.

In addition to Piwi, Drosophila ovarian germline cells express two related PIWI family Argonaute proteins: Aubergine (Aub) and AGO3. Unlike Piwi, which is localised to the nucleus, Aub and AGO3 are associated with an electron-dense peri-nuclear region of cytoplasm called nuage. Most importantly, they act together in a sophisticated piRNA amplification loop that is dependent on target expression, termed the ping-pong cycle. In a simplified version: Aub complexed with an antisense piRNA targets and slices a sense transcript of an active transposon, resulting in the production of novel sense piRNA species which are loaded onto AGO3. The AGO3-piRNA complexes then cleave complementary piRNA cluster transcripts, resulting in the production of novel antisense piRNA to be complexed with Aub. The ping-pong cycle results in the amplification of sets of antisense and sense piRNAs that are 10nt out of register with each other, suggesting the site of Aub slicer activity and providing a useful signal that shows that ping-pong amplification has occurred.
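The 10nt signature is easy to test for computationally: a sense and an antisense piRNA form a ping-pong pair if the first ten nucleotides of one are the reverse complement of the first ten of the other. The Python below is a minimal sketch with hypothetical function names and invented sequences; real analyses work on mapped read positions across millions of reads rather than on pairs of strings.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    return rna.translate(RNA_COMPLEMENT)[::-1]


def is_pingpong_pair(pirna_a: str, pirna_b: str, overlap: int = 10) -> bool:
    """True if the 5' ends of the two RNAs are complementary over `overlap` nt."""
    a = pirna_a.upper().replace("T", "U")
    b = pirna_b.upper().replace("T", "U")
    if len(a) < overlap or len(b) < overlap:
        return False
    return a[:overlap] == reverse_complement(b[:overlap])


# Invented antisense piRNA with a 5' U, and a partner built to overlap it.
antisense = "UGCAUGCAUG" + "A" * 15
sense = reverse_complement(antisense[:10]) + "C" * 15
paired = is_pingpong_pair(antisense, sense)
```

A side effect of the geometry is visible in the toy pair: because the antisense RNA starts with U, its partner carries an A at position 10.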

In the germline, more piRNA clusters are active, representing a larger spectrum of transposons. They are also expressed bi-directionally. An outstanding question is why this doesn’t trigger ping-pong amplification. The most likely reason is that the processes of piRNA biogenesis and transposon silencing are tightly localised and regulated. The roles of other proteins in these processes are starting to be understood. Proteins containing Tudor domains appear to be very important in the localisation and function of Aub and AGO3 in the nuage.

Many other intriguing aspects of piRNA biology are yet to be understood. Although the bulk of piRNAs are directed against transposons, some are involved in the regulation of cellular mRNAs. These piRNAs are derived from mRNAs rather than cluster transcripts: are these transcripts marked in some way to be processed into piRNAs? The links between the primary piRNA biogenesis pathway and the ping-pong amplification system are also poorly understood. An interesting aspect of the piRNA system active in mouse spermatogenesis is that the nucleus-localised mouse PIWI family protein MIWI2 has been implicated in guiding de novo DNA methylation at transposon loci. Is this a more widespread phenomenon?

The piRNA system has been likened to an acquired immune response and works together with the (more acute response) siRNA pathway in transposon silencing. Future posts will discuss the other small RNA systems, and go further into piRNA biology.

Senti, K., & Brennecke, J. (2010). The piRNA pathway: a fly’s perspective on the guardian of the genome Trends in Genetics, 26 (12), 499-509 DOI: 10.1016/j.tig.2010.08.007

Khurana, J., & Theurkauf, W. (2010). piRNAs, transposon silencing, and Drosophila germline development The Journal of Cell Biology, 191 (5), 905-913 DOI: 10.1083/jcb.201006034

Of further interest: piRNAs in the brain: epigenetics and memory

On Transposable Elements and Regulatory Evolution

Transposable elements (TEs), generally considered molecular parasites on the genome, are increasingly being linked to the evolution of new biological functions. TEs have been shown to be a source of novel genes and exons, the ‘arms race’ between them and their hosts has been a driving force in the evolution of epigenetic silencing mechanisms, and they have been shown to serve as cis-acting regulatory elements for host genes. This last role, as regulatory elements, has potentially wide ramifications: TE mobilisation could cause changes to the expression of co-regulated suites of genes. Recently, the emergence of novel TEs and their mobilisation has been argued to be a causative factor underlying such ‘punctuated equilibria’ evolutionary phenomena as the Cambrian explosion and the rapid speciation of cichlid fishes. Two new papers analysing mammalian genomic evolution further link transposable elements with the spread of regulatory elements through the genome, and the evolution of novel characters.

CTCF binding sites.

CTCF (CCCTC-binding factor) is a DNA-binding protein with such a diverse and exciting array of potential roles attributed to it that it has been called a ‘master weaver of the genome’. It acts as an insulator, dividing different chromatin domains, and is therefore important for transcriptional activation and repression. This role appears to be linked to the formation of long distance chromosomal loops, and hence to the global organisation of the chromosomes within the nucleus. Schmidt et al. used ChIP-seq to define all the CTCF binding events in liver cells from five eutherian mammals (human, macaque, mouse, rat, and dog) and a marsupial (opossum). Using these data they defined a core DNA sequence motif that CTCF commonly binds, as well as sets of CTCF binding events that are conserved between the various species. In some lineages certain CTCF-bound DNA sequence motifs were overrepresented. These overrepresented ‘motif-words’ were often embedded within lineage-specific SINE repeats (short interspersed nuclear elements, non-autonomous non-LTR retrotransposons). For instance, mice and rats share about 2,000 CTCF binding events that are associated with B2 SINEs; mice have a further 5,300 B2-associated binding events and rats a further 1,200. Enrichments of CTCF binding events associated with lineage-specific SINEs also occurred in the canine and opossum genomes (on a lesser scale). Surprisingly, however, no similar TE-associated enrichment occurred in the primate lineage. Looking at CTCF binding events that were conserved between multiple mammals, Schmidt et al. were also able to find over 100 binding events that were associated with fossilised ancestral transposable sequences.

Overall, this data shows that CTCF binding has expanded via retrotransposition in multiple mammalian lineages and that this is an ancient mechanism of regulatory evolution. CTCF binds a long DNA sequence motif (33/34bp) that is less likely to be generated by random point mutations than the smaller motifs more commonly bound by transcription factors. This is one reason why the expansion of CTCF binding sites should be more closely associated with TEs than the expansion of other regulatory sequence motifs. Another suggestion that the authors make to explain this association is that CTCF binding may protect TEs from repressive DNA or chromatin modifications.
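The argument about motif length can be made concrete with a back-of-the-envelope calculation. The sketch below is purely illustrative and rests on assumptions that are not from Schmidt et al.: the four bases are equally likely and independent, a site must match one fixed sequence exactly, only one strand is counted, and the genome is taken to be roughly 3 billion bp. Real binding motifs, CTCF's included, tolerate variation at many positions, so these numbers exaggerate the gap; the function name and the 8bp comparison motif are hypothetical choices.

```python
def expected_random_matches(motif_length: int, genome_size: int) -> float:
    """Expected number of exact matches to one fixed motif in random sequence.

    Assumes equiprobable, independent bases and counts a single strand only.
    """
    # Number of start positions at which a motif of this length could sit.
    positions = max(genome_size - motif_length + 1, 0)
    # Each position matches a fixed sequence with probability (1/4)^length.
    return positions * 0.25 ** motif_length


# Assumed genome size, roughly that of a mammal (not taken from the paper).
GENOME_SIZE = 3_000_000_000

if __name__ == "__main__":
    # 8 bp stands in for a short transcription factor motif (an assumption);
    # 33 bp is the shorter of the CTCF motif lengths quoted above.
    for length in (8, 33):
        matches = expected_random_matches(length, GENOME_SIZE)
        print(f"{length} bp motif: {matches:.3g} expected chance matches")
```

Under these simplified assumptions a short motif is expected to turn up by chance many thousands of times, whereas a 33bp sequence is expected essentially never, which is why copying a ready-made site around the genome on a TE is a plausible route to expansion.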

Transposons and the evolution of pregnancy

During mammalian pregnancy, endometrial stromal cells (ESCs) differentiate in response to progesterone and signalling via the cAMP second messenger pathway, to produce a vascularised placenta that can accommodate implantation (a process termed decidualisation). The enhancer that drives expression of Prolactin in response to progesterone/cAMP signalling in ESCs is derived from a MER20 transposon (a hAT-Charlie family DNA transposon). Lynch et al. have found a strong association between MER20 elements and genes that are differentially expressed in mammalian ESCs and genes that are responsive to progesterone/cAMP signalling.

Analysing MER20s that are located close to stromally regulated genes, they found that, based on their association with CpG islands and various histone modifications, they often had regulatory potential. They then tested whether 21 randomly chosen MER20s bound various transcription factors and insulator proteins. 14 MER20s bound a suite of 5 different insulator proteins (including CTCF), whilst 5 different transcription factors important for ESC development bound together in 4 cases. This suggested that MER20s could be classified into ‘insulator’ and ‘enhancer-repressor’ types. Using a reporter gene assay in various cell types, they then showed that the majority of these MER20s acted as regulatory elements in response to progesterone/cAMP signalling specifically in ESCs.

This data indicates that the rewiring of the gene regulatory network of ESCs during the evolution of pregnancy was partly mediated by MER20 transposition events. In this case, MER20s contain sequences for regulatory assemblies of transcription factors responsive to specific signalling pathways, and hence have acted as cell type specific regulatory elements.

These two papers, as well as an increasing number of other studies, show that TEs are important agents of gene regulatory network evolution. The findings of Lynch et al. especially confirm the perspicacity of the discoverer of transposable elements, Barbara McClintock, in terming them ‘controlling elements’.

See also: Retrotransposons as regulatory elements

Lynch, V., Leclerc, R., May, G., & Wagner, G. (2011). Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals Nature Genetics, 43 (11), 1154-1159 DOI: 10.1038/ng.917

Schmidt, D., Schwalie, P., Wilson, M., Ballester, B., Gonçalves, A., Kutter, C., Brown, G., Marshall, A., Flicek, P., & Odom, D. (2012). Waves of Retrotransposon Expansion Remodel Genome Organization and CTCF Binding in Multiple Mammalian Lineages Cell, 148 (1-2), 335-348 DOI: 10.1016/j.cell.2011.11.058

Zeh, D., Zeh, J., & Ishida, Y. (2009). Transposable elements and an epigenetic basis for punctuated equilibria BioEssays, 31 (7), 715-726 DOI: 10.1002/bies.200900026

Phillips, J., & Corces, V. (2009). CTCF: Master Weaver of the Genome Cell, 137 (7), 1194-1211 DOI: 10.1016/j.cell.2009.06.001

Novel modes of lateral gene transfer in bacteria

Understanding the mechanisms of lateral gene transfer (LGT) between bacteria is crucial to our understanding of microbial evolution. It is also important for human health, as LGT facilitates the emergence and spread of bacterial virulence and antibiotic resistance. The three ‘classical’ mechanisms of LGT are transformation (in which naked DNA is taken up from the environment), transduction (by which bacteriophage facilitate gene transfer by packaging host DNA as well as their own) and conjugation (when plasmids encode a pilus by which they can be transferred from cell to cell). All three have been important in the emergence of molecular biology. Three other mechanisms by which DNA can be transferred between bacteria have come to light, potentially broadening our understanding of the importance of LGT in the microbial biosphere.

Gene Transfer Agents (GTAs)

GTAs are virus-like particles that carry random pieces of the producing cell’s genome. The best characterised GTA was discovered in 1974 in the purple, non-sulphur photosynthetic bacterium Rhodobacter capsulatus, a member of the alpha-proteobacteria. The R. capsulatus GTA (RcGTA) packages 4.5kb of DNA, but is encoded by a 14.1kb cluster of 15 genes on the R. capsulatus chromosome. Many of these genes have homology with bacteriophage structural genes. It is unclear how RcGTA particles are released from the cell, as no recognisable lysis genes have been identified. Transcription of the RcGTA gene cluster has been shown to be under the control of a sensor kinase/response regulator system that transduces environmental signals. RcGTA-like gene clusters are widespread throughout the alpha-proteobacteria, and phylogenetic trees based on RcGTA-like sequences recapitulate phylogenies based on 16S rRNA sequences, suggesting that the RcGTA ancestor arose early in the evolution of the alpha-proteobacterial lineage.

Other GTAs (with probable independent origins) have been identified in a diverse range of prokaryotes, including the archaeon Methanococcus voltae, the delta-proteobacterium Desulfovibrio desulfuricans and the spirochete Brachyspira hyodysenteriae. None of them packages more than 14kb of DNA, and all of them take the form of small bacteriophage. It appears most likely that GTAs have been derived from bacteriophage that have lost their ability to self-propagate. Recent data suggest that alpha-proteobacterial GTAs are common in marine environments, and transfer genes at high frequency between diverse classes of alpha-proteobacteria. These ‘generalised transducing machines’, under the control of the quorum sensing systems of bacterial populations, are probably a major force in microbial evolution and ecology.

DNA transfer by membrane vesicle

DNA encapsulated by MVs. A rosette-like structure is seen in the centre, a plasmid is in the box, linear DNA molecules - arrowheads.

Membrane vesicles (MVs) from Gram-negative bacteria can traffic toxins, signals and other proteins between bacteria. They have also been shown to be able to mediate the transfer of DNA between cells. E. coli O157:H7 MVs were found to contain linear DNA, circular plasmids and rosette-like DNA structures, which included genes from chromosomal DNA as well as from plasmids and phage. The MVs were capable of transforming related enteric bacteria and increasing their cytotoxicity. DNA transfer by membrane vesicles could be a more widespread phenomenon than is currently appreciated; as yet, however, it is more commonly reported as an aside in other MV studies.

Intercellular nanotubes

Top two images show B. subtilis cells with nanotubes (note more intimate thin connections in circle). Lower three images show inter-specific nanotubes.

A year ago Dubey and Ben-Yehuda showed the existence of tubular conduits forming between Bacillus subtilis cells. These nanotubes were shown to be able to mediate the exchange of proteins and non-conjugative plasmids. Nanotubes were also formed between B. subtilis and Staphylococcus aureus (both Gram-positive), and a thinner variety was formed between either of the Gram-positive species and Gram-negative E. coli. The authors suggest that the formation of ‘syncytium-like synergistic consortia’ mediated by nanotube connections underlies many of the traits displayed by biofilms.

These three phenomena have a tantalising savour, suggesting the depths of our ignorance of the complexity of microbial ecosystems and prokaryotic evolution. However, I imagine that progress in these fields will accelerate. The explosion of microbial and environmental sequencing will be useful in identifying the prevalence of GTAs. Understanding all six modes of LGT will be crucial to our appreciation of the ecology of natural microbial communities and of bacterial evolution, as well as having important applications for human health.

See also: A novel gene transfer agent from Bartonella

Stanton, T. (2007). Prophage-like gene transfer agents—Novel mechanisms of gene exchange for Methanococcus, Desulfovibrio, Brachyspira, and Rhodobacter species Anaerobe, 13 (2), 43-49 DOI: 10.1016/j.anaerobe.2007.03.004

Lang, A., & Beatty, J. (2007). Importance of widespread gene transfer agent genes in α-proteobacteria Trends in Microbiology, 15 (2), 54-62 DOI: 10.1016/j.tim.2006.12.001

McDaniel, L., Young, E., Delaney, J., Ruhnau, F., Ritchie, K., & Paul, J. (2010). High Frequency of Horizontal Gene Transfer in the Oceans Science, 330 (6000), 50-50 DOI: 10.1126/science.1192243

Yaron, S., Kolling, G., Simon, L., & Matthews, K. (2000). Vesicle-Mediated Transfer of Virulence Genes from Escherichia coli O157:H7 to Other Enteric Bacteria Applied and Environmental Microbiology, 66 (10), 4414-4420 DOI: 10.1128/AEM.66.10.4414-4420.2000

Dubey, G., & Ben-Yehuda, S. (2011). Intercellular Nanotubes Mediate Bacterial Communication Cell, 144 (4), 590-600 DOI: 10.1016/j.cell.2011.01.015