Monthly Archives: December 2011

On Diversity Generating Retroelements.

Diversity Generating Retroelements (DGRs) are genetic elements that utilise reverse transcription and site-directed, adenine-specific mutagenesis to diversify DNA sequences and the proteins they encode. DGRs have been found encoded on bacterial chromosomes as well as on plasmids and bacteriophages. Most of the research on these elements has concerned the first DGR identified: an element responsible for tropism switching in a temperate bacteriophage of Bordetella.

Bordetella species cause respiratory infections in mammals. The infectious cycle consists of two phases: during the Bvg+ phase the bacteria are adapted for colonisation of the respiratory tract, whilst in the Bvg- phase they are adapted to ex vivo survival and growth. The phenotypic transition between the phases is associated with major changes in surface structures and secreted proteins. The bacteriophage BPP-1 was identified as having a tropism for the Bvg+ phase, during which it utilises the surface protein Pertactin as its receptor. However, BPP-1 was observed to be capable of producing variants that preferentially infected Bvg- phase cells or that were indiscriminate in their phase preference. Comparison of the genomes of the phage variants led to the identification of a region of variability at the 3′ end of a gene, mtd (major tropism determinant), and to the characterisation of the BPP-1 DGR.


The variable region (VR) is a 134bp sequence, within which nucleotide substitutions occur at 23 discrete positions. The sites of variability are predominantly located in the first two bases of codons, maximising the potential generation of amino acid substitutions. Downstream of VR is a second copy of the 134bp repeat, which is however invariant: the template repeat (TR). Variable sites in VR correspond to adenine residues in TR. When silent substitutions are engineered into TR they are transmitted into VR during tropism switching. When new adenine residues are inserted into TR these too become sites of variability. The TR therefore serves as a donor cassette and VR as the recipient of variable sequence information.
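The donor/recipient relationship between TR and VR can be sketched in a few lines of Python. This is only a toy model: the short sequence is made up, and replacing each adenine with a uniformly chosen base is an assumption for illustration, not the measured substitution behaviour of the reverse transcriptase.

```python
import random

BASES = "ACGT"

def mutagenic_homing(tr, rng):
    """Copy TR into a new VR, randomising only the positions that are adenine in TR.

    Assumption: every base is equally likely at an adenine site (so some
    adenines stay unchanged); the real substitution pattern may differ.
    """
    return "".join(rng.choice(BASES) if base == "A" else base for base in tr)

def changed_positions(tr, vr):
    """Positions at which the recipient VR differs from the donor TR."""
    return [i for i, (t, v) in enumerate(zip(tr, vr)) if t != v]

# Hypothetical toy template repeat (the real TR is 134bp long).
tr = "GCTAACGGATCCAATGCG"
rng = random.Random(0)

vr = mutagenic_homing(tr, rng)
adenine_sites = [i for i, base in enumerate(tr) if base == "A"]
changed = changed_positions(tr, vr)

print("TR:", tr)
print("VR:", vr)
print("adenine sites in TR:", adenine_sites)
print("sites changed in VR:", changed)
```

Whatever the random draw, the changed sites are always a subset of the adenine sites in TR, and TR itself is never modified, which mirrors the observation that adding an adenine to TR creates a new variable site in VR.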

This process of information transfer and selective mutation (‘mutagenic homing’) is mediated by a reverse transcriptase encoded nearby (brt). Another DGR-encoded protein of unknown function, accessory tropism determinant (atd), is also required. Mutations in brt, atd or TR produce infective phage that are incapable of tropism switching.

Mutagenic homing has been shown to be dependent on short sequences immediately downstream of VR and TR: IMH and IMH* (initiation of mutagenic homing) respectively. These consist of 14bp runs of guanosine and cytosine residues (G/C14), followed by a 21bp sequence that differs at five positions between IMH and IMH*. Substitution of IMH* with IMH converts TR into a recipient of diversified sequence information, whereas the opposite substitution eliminates tropism switching. Mutagenic homing has been proposed to occur by a process of target DNA-primed reverse transcription (TPRT). The reverse transcription reaction is primed from either a nick or a double strand break occurring in the G/C14 element of IMH, and is dependent on the rest of IMH as well as on nearby short repeat sequences with the potential to form hairpin/cruciform structures. The 5′ end integration can be mediated by short stretches of homology and is thought to occur by a process of template switching or strand displacement. Mutagenesis has been shown not to occur in the RNA intermediate and so probably occurs during reverse transcription itself. Adenine mutagenesis is thought to be an inherent property of the Brt protein and the related reverse transcriptases encoded in other DGRs.

Homologous DGRs have been identified in nearly 30 diverse bacterial or phage genomes. All of these elements encode related reverse transcriptases and include sequences analogous to TR and VR. Adenine-specific mutagenesis is a constant feature of DGR function. The VR-encoded variable residues of BPP-1 Mtd are presented on the ligand binding surface using a C-type lectin (C-lec) fold. The structure of this fold is essentially invariant, whilst the ligand binding affinity can differ dramatically. Interestingly, all the VR sequences of DGRs identified thus far are present in C-lec fold encoding sequences. DGRs likely function in a similar manner; however, there is diversity in their organisation. TR sequences can be located upstream of, downstream of or within RTase coding sequences. There is also a dichotomy relating to the presence of atd homologs or genes encoding HRDC (Helicase and RNase D C-terminal) domains. Most interestingly, some DGRs probably diversify multiple VR-containing loci.

Most retroelements, such as retrotransposons or group II introns, are generally considered to be ‘selfish DNA’ parasitizing host genomes. DGRs are the first retroelements to obviously confer a selective advantage on their hosts. Although their ability to introduce adaptive targeted mutations is within the narrow bounds of diversifying specific C-lec folds, they do represent another exception to the ‘central dogma’ of molecular biology and a challenge to the neo-Darwinian concept of random mutation.

DGRs could also have important applications in synthetic biology. By engineering a cognate TR and inserting an IMH and related sequences, Jeff Miller’s group (responsible for all this research) were able to diversify antibiotic genes in such a way as to restore function, i.e. in vivo directed protein evolution.

Watch an excellent 36 min lecture on DGRs by Jeff Miller here

Medhekar, B., & Miller, J. (2007). Diversity-generating retroelements. Current Opinion in Microbiology, 10 (4), 388-395 DOI: 10.1016/j.mib.2007.06.004

Liu M, Deora R, Doulatov SR, Gingery M, Eiserling FA, Preston A, Maskell DJ, Simons RW, Cotter PA, Parkhill J, & Miller JF (2002). Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage. Science (New York, N.Y.), 295 (5562), 2091-4 PMID: 11896279

Doulatov S, Hodes A, Dai L, Mandhana N, Liu M, Deora R, Simons RW, Zimmerly S, & Miller JF (2004). Tropism switching in Bordetella bacteriophage defines a family of diversity-generating retroelements. Nature, 431 (7007), 476-81 PMID: 15386016

Guo H, Tse LV, Barbalat R, Sivaamnuaiphorn S, Xu M, Doulatov S, & Miller JF (2008). Diversity-generating retroelement homing regenerates target sequences for repeated rounds of codon rewriting and protein diversification. Molecular cell, 31 (6), 813-23 PMID: 18922465

Guo H, Tse LV, Nieh AW, Czornyj E, Williams S, Oukil S, Liu VB, & Miller JF (2011). Target site recognition by a diversity-generating retroelement. PLoS genetics, 7 (12) PMID: 22194701

Regulatory evolution of a transcription factor reconstructed.

Lynch, V., May, G., & Wagner, G. (2011). Regulatory evolution through divergence of a phosphoswitch in the transcription factor CEBPB. Nature, 480 (7377), 383-386 DOI: 10.1038/nature10595

This paper demonstrates that change in the response of transcription factors to signalling pathways is an important mechanism in the evolution of novel developmental functionalities.

During mammalian pregnancy, the expression of Prolactin (PRL) in endometrial stromal cells (ESCs) is activated by an interaction of the transcription factors CEBPB (CCAAT/enhancer binding protein beta) and FOXO1A. Both of these transcription factors are present in non-mammalian animals, so to test whether this function evolved coincidentally with the origin of pregnancy, the investigators designed a biochemical assay in which activation of an enhancer for ESC Prolactin expression could be measured by expression of a luciferase reporter gene. Expressing human CEBPB and FOXO1A together transactivated the expression of luciferase. However, coexpression of the two proteins from chicken or from opossum (a marsupial) failed to cooperatively transactivate expression from the ESC PRL enhancer. To test the hypothesis that this cooperative interaction evolved in placental mammals (Eutheria), they phylogenetically reconstructed the putative ancestral sequences of CEBPB and FOXO1A from the common ancestor of Theria (all mammals excluding the egg-laying monotremes; AncTheria) and from the common ancestor of placental mammals (AncEutheria), synthesised them and tested their ability to activate ESC PRL expression in the assay. As predicted, the AncEutherian CEBPB/FOXO1A complex strongly transactivated luciferase expression, whilst the AncTherian complex did not.

Having ruled out change in FOXO1A being responsible for this derived cooperative function, the researchers set out to determine which domains of CEBPB are responsible by making truncated forms of the gene and assaying their transactivation ability (in the presence of FOXO1A). The deletion of an internal regulatory domain (RD2) reduced the AncEutherian protein’s transactivation ability to that of the AncTherian CEBPB. Having shown that RD2 plays a critical role in PRL regulation, they then tested the roles of five eutherian specific amino acid substitutions in this region. Back mutations from the AncEutherian CEBPB and forward mutations from the AncTherian CEBPB showed amino acid substitutions at sites 3 and 4 contributed to the derived regulatory ability but that they are dependent on the substitution at site 5. Upon mapping of predicted sites for phosphorylation of CEBPB it was clear that the three amino acid substitutions corresponded to AncEutherian CEBPB having lost serine phosphosites at positions 3 and 4 and having gained one at site 5. To test the importance of phosphorylation (by the kinase GSK3b) in potentiating transactivation by CEBPB they repeated their experiments in the presence of a phosphorylation inhibitor. In accordance with their predictions, inhibition of phosphorylation prevented AncEutherian CEBPB mediated transactivation. Interestingly however, it potentiated transactivation by AncTherian CEBPB. This shows that AncEutherian CEBPB is activated by GSK3b whereas AncTherian CEBPB is repressed by it. These different responses to GSK3b are mediated by the differences in phosphorylation of the three amino acid positions in RD2.

The researchers went on to use protein structure modelling simulations to infer the consequences of repositioning phosphorylation sites in RD2. Unphosphorylated AncEutherian CEBPB is predicted to be collapsed into a knot-like bundle in which the various domains are in contact with each other – an intrinsically repressed conformation. Phosphorylation of the AncEutherian protein is predicted to give rise to an open conformation in which the domains don’t contact each other, freeing the DNA binding domain from an intramolecular masking effect. The AncTherian CEBPB model however shows the transcriptionally inactive conformation.

The importance of this paper is that it shows changing the response of transcription factors to signalling pathways can be an important mechanism of genetic regulatory evolution. It is known that the interactions of cis-regulatory elements and transcription factors are the main substrate on which evolution acts to change developmental processes. However, more emphasis has been put on changes to cis-regulatory elements. This is partly due to their experimental amenability, but also probably partly due to the pleiotropic expression of transcription factors and other developmentally important molecules, i.e. they are expressed in lots of different places but have different developmental roles. The thinking would be that it is easier to evolve novel enhancers, which in combination with other cis-elements integrate a transcription factor code to drive differential gene expression, whereas evolution of transcription factors themselves would be more likely to affect a large number of developmental processes. This work shows how evolution can affect transcription factor function in a cell type specific way without potentially deleterious pleiotropic effects.

The other important aspect of this paper is that it shows how experimental evolutionary developmental biology can be done. Reconstruction of ancestral genes and testing their function yielded insights into CEBPB regulation and the evolution of mammalian pregnancy. This type of reconstructive biology is open to criticism as it will always have a speculative and ultimately unknowable aspect to it. However, if evolutionary developmental biology is to be anything more than molecular comparative anatomy, this more experimental approach is the way to go.

Virophages and the evolution of transposable elements

Fischer, M., & Suttle, C. (2011). A Virophage at the Origin of Large DNA Transposons. Science, 332 (6026), 231-234 DOI: 10.1126/science.1199412

This paper reports the discovery and characterisation of a virophage, Mavirus, and postulates that its similarities with a family of transposable elements suggest a common evolutionary origin.

Mavirus (white arrows) in the virion factory of CroV (white arrowhead: CroV viral particles; black arrow: a potentially defective CroV capsid).

Mavirus is the second virophage so far discovered. Both it and the other, Sputnik, parasitize giant viruses that infect protists. There is some controversy as to whether these viruses should be classified as virophages or as a new class of satellite viruses. Virophages are predicted to have no nuclear phase in their infection cycle, to replicate in the virion factory of the host virus and to be dependent on enzymes provided by the host virus rather than the host cell. Mavirus is associated with the Cafeteria roenbergensis virus (CroV) that infects the marine phagotrophic flagellate C. roenbergensis. Mavirus can’t replicate in the absence of CroV, and CroV production and host cell lysis were reduced in cells co-infected with Mavirus.

Mavirus has a 19kb circular double stranded DNA genome that is predicted to contain 20 protein coding genes (fig 1). Between each gene the researchers found promoters that are highly similar to the predicted late promoter motif in CroV, implying that Mavirus gene expression is activated by the transcription machinery of CroV. Only four Mavirus genes showed homology with those of Sputnik, the other known virophage (including those encoding the capsid protein and a DNA pumping ATPase).

Figure 1. Genome diagram of Mavirus. Genes conserved with MP TEs are in red. Genes sometimes found in MP TEs in blue.

More genetic homology is found between Mavirus and a class of DNA transposons: the self-synthesising Maverick or Polinton Transposable elements (MP TEs). MP TEs are between 9 and 22kb long and encode up to 20 proteins. A conserved subset of genes is found in all MP TEs including a retroviral integrase (rve-INT) (responsible for DNA integration), a protein primed DNA polymerase B (PolB), an ATPase similar to those responsible for packaging dsDNA in viruses, and a cysteine protease with adenoviral homologues. Genes for all four of these proteins were found in Mavirus as well as another three often encountered in MP TEs.

The best evidence for a close evolutionary relationship between Mavirus and MP TEs was the identification of a region of synteny between Mavirus and a genomic fragment from the slime mold Polysphondylium pallidum that contains a MP TE-like fragment (fig 2).

Fig 2. Comparison of gene organisation between Mavirus and a truncated MP TE from P. pallidum. Homologous genes are the same colour and the syntenic region is shown in green.

Another resemblance between MP TEs and Mavirus relates to genome structure. MP TEs have terminal inverted repeats of several hundred nucleotides, and highly conserved ends starting with 5′-AG and ending with CT-3′. Although the Mavirus genome is a circular molecule, at one point it encodes a ~50bp inverted repeat with the potential to adopt a hairpin structure. The adenine at the top of this hairpin was designated position 1 of the genome. Cutting the Mavirus genome between nucleotides 19,063 and 1 would produce a linear molecule with the same termini as MP TEs, flanked by inverted repeats.
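To make the geometry concrete, here is a minimal Python sketch of opening a circular genome at the apex of a hairpin-forming inverted repeat. The sequence is a made-up 19-base toy, not Mavirus sequence, and the function names are hypothetical; the point is only that such a cut leaves the two arms of the repeat at opposite ends of the linear molecule.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(PAIR[base] for base in reversed(seq))

def linearise(circular, cut):
    """Open a circular sequence so that index `cut` becomes the first base."""
    return circular[cut:] + circular[:cut]

def terminal_inverted_repeat_length(linear):
    """Number of leading bases that pair with the trailing bases read backwards."""
    n = 0
    while n < len(linear) // 2 and PAIR[linear[n]] == linear[-1 - n]:
        n += 1
    return n

# Toy circular genome stored with an arbitrary origin: an internal region,
# then the two arms of an inverted repeat lying next to each other.
arm = "AGTCCA"        # hypothetical repeat arm beginning 5'-AG
body = "GGCATTG"      # hypothetical internal sequence
circular = body + reverse_complement(arm) + arm

# Cut between the two arms, i.e. at the top of the would-be hairpin.
cut = len(body) + len(arm)
linear = linearise(circular, cut)

print(linear)
print("starts with AG:", linear.startswith("AG"))
print("ends with CT:", linear.endswith("CT"))
print("terminal inverted repeat length:", terminal_inverted_repeat_length(linear))
```

In this toy case the linear molecule begins 5′-AG, ends CT-3′, and carries a 6-base terminal inverted repeat, the same arrangement described for MP TEs on a much larger scale.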

All these parallels between MP TEs and Mavirus suggest that the two are derived from a common ancestor. The question is whether the common ancestor was a transposable element or a virophage. The authors consider that the more parsimonious explanation is that a Mavirus ancestor (MVA) gave rise to MP TEs, and go on to suggest a possible evolutionary scenario. In early eukaryotic cells susceptible to infection by large DNA viruses there would be selective pressure on the host cells to stabilise relationships with an MVA. The acquisition of a retroviral integrase gene by MVA and integration in the host cell genome could have conferred protection against the large DNA virus. Various virophage endogenisation events would create provirophages that may have led to the various MP TEs. The close relationship between retroviruses and retrotransposons has long been known. This characterisation of the links between a DNA virus and a class of DNA transposable element yields similar insight into the evolution of mobile DNA and its importance in genomic evolution.

A novel family of secreted guidance factors characterised in C. elegans.

Seetharaman, A., Selman, G., Puckrin, R., Barbier, L., Wong, E., D’Souza, S., & Roy, P. (2011). MADD-4 Is a Secreted Cue Required for Midline-Oriented Guidance in Caenorhabditis elegans. Developmental Cell, 21 (4), 669-680 DOI: 10.1016/j.devcel.2011.07.020

A newly characterised protein, MADD-4, involved in attracting muscle membrane extension and axons in nematodes is the founding member of a new family of secreted axon guidance proteins.

Bilaterally symmetrical animals use the left-right midline to organise development along the dorsal-ventral (DV) axis. Secreted molecules, such as Slits and Netrins, create gradients that migrating cells or cell extensions (such as axons) use as either attractive or repulsive guidance cues. In the nematode worm Caenorhabditis elegans, the ventral nerve cord is the site of highest Netrin (UNC-6) concentration, whilst Slit (SLT-1) is most concentrated in the dorsal midline. These molecules then act to guide cell migration and axon extension along the DV axis. Commissural motor neurons originate on the ventral midline where their cell bodies are located. Their axons travel down the UNC-6 gradient to the dorsal midline where they extend longitudinally (see image). The locomotory body muscles of C. elegans are arranged in longitudinal rows, flanking the dorsal and ventral nerve cords. The muscles extend actin based membrane projections called muscle arms to the motor axons in the nearest nerve cord, where they synapse to produce neuromuscular junctions.

Commissural motor neurons in C. elegans. Ventral cord is lower, Dorsal cord higher. Cell bodies are visible as bright spots ventrally. Commissural axons run ventral to dorsal connecting the two midlines.

Seetharaman et al. have used muscle arm extension to investigate midline oriented guidance. The group’s previous work had shown that the Netrin receptor, UNC-40, was required in the body muscles to mediate muscle arm extension. Interestingly, however, UNC-6 (Netrin, its canonical ligand) was not required. To dissect muscle arm extension further the researchers carried out a genetic screen to identify mutants with muscle arm defective (madd) phenotypes. They found that animals with loss of function alleles of madd-4 have extensive dorsal muscle arm extension defects and weaker ventral extension defects. The alleles are semidominant, suggesting dose dependency (consistent with secreted signalling molecules). The worms do not show serious disturbances to commissural axon guidance or dorsal cord formation.

Using transgenic reporter constructs the group showed that the two isoforms of MADD-4 were only expressed in the commissural motor neurons, hence showing localisation in the ventral and dorsal cords (a similar pattern to that in the image). Misexpressing MADD-4 in neurons that extend along the lateral line of the worm caused redirection of muscle arm extension (towards this lateral expression). This phenotype could be suppressed by loss of unc-40 function, suggesting that MADD-4 interacts with the netrin receptor (either directly, or via another receptor whose function is dependent on UNC-40). Using transgenic worms in which MADD-4 cannot be secreted, the researchers showed that its function is dependent on extracellular secretion and diffusion. The paper goes on to show that MADD-4 is not just involved in muscle arm extension but also has roles in axon guidance.

MADD-4 is the founding member of a new family of guidance proteins. Orthologues of madd-4 are present in the human and Drosophila genomes. In the case of Netrins and Slits, after their identification in invertebrate model organisms it has been shown that they are widely used in neural development for axon guidance and cell migration in vertebrates. It is likely by analogy that MADD-4 orthologues are playing important roles in vertebrate development. Interestingly a human orthologue ADAMTSL3 has been linked to colorectal cancer, as has the Netrin receptor DCC.

Lamarckian inheritance of antiviral response in Nematodes.

Rechavi, O., Minevich, G., & Hobert, O. (2011). Transgenerational Inheritance of an Acquired Small RNA-Based Antiviral Response in C. elegans. Cell, 147 (6), 1248-1256 DOI: 10.1016/j.cell.2011.10.042

A new paper in Cell shows a non-Mendelian multigenerational inheritance of an acquired trait in the nematode Caenorhabditis elegans.

It has been known since the 1990s that the induction of RNA interference (RNAi) by the exogenous application of double stranded RNAs leads to specific gene silencing and that in C. elegans these effects are often inherited by the worms’ progeny. However, the mechanism of this transmission has remained unclear, as have the potential biological roles. This new study uses a series of elegant genetic crosses and a modified viral reporter transgene to clarify these outstanding questions.

The authors used a transgenic worm that supports the autonomous replication of the single strand RNA nodavirus Flock House virus (FHV), modified to express GFP and to make its replication inducible by heat shock. In the first series of experiments, worms heterozygous for certain components of the RNAi pathway (rde-1 or rde-4) that also contained the heat inducible viral transgene were generated. Upon induction of viral replication a robust antiviral response occurs, meaning that no GFP is expressed. When these worms self fertilise to produce a new generation, a quarter of the offspring are homozygous for the mutant rde-1 or rde-4. These worms would be expected to be unable to silence viral replication as their RNAi mechanisms are non-functional. Instead they still do not express GFP after heat shock, indicating that viral silencing continues. This silencing effect continues for several generations until it gradually ‘wears off’. However, this ‘fading’ mode of silencing only occurred in a subset of the worms; in others a more stable inherited silencing occurred, where GFP expression never reoccurred even after many generations. When these two types of worms were crossed all the offspring had the viral GFP signal eliminated. This showed that the suppression of viral production was transmitted in a non-Mendelian fashion, in accord with the silencing factors being diffusible trans-acting factors (rather than a hypothetical genomic locus suppressing virus production that would have segregated in a Mendelian manner).
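The ‘quarter of the offspring’ figure is just the standard Mendelian expectation for a self-fertilising heterozygote. A short Python sketch makes the arithmetic explicit; the allele labels "+" (wild type) and "m" (mutant rde allele) and the function name are labels chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def self_cross(genotype):
    """Expected offspring genotype frequencies when a diploid parent self-fertilises.

    `genotype` is a two-character string, one character per allele.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(genotype, repeat=2))
    total = sum(counts.values())
    return {g: Fraction(c, total) for g, c in counts.items()}

# "+" = wild-type allele, "m" = mutant allele (labels assumed for this sketch).
offspring = self_cross("+m")

for genotype, frequency in sorted(offspring.items()):
    print(genotype, frequency)
```

The homozygous mutant class ("mm") comes out at 1/4, so any silencing seen in those worms cannot be explained by their own RNAi machinery; a Mendelian suppressor locus would likewise segregate in fixed ratios, whereas the silencing reported in the paper does not.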

In another series of genetic crosses the authors went on to show that the transgenerational viral silencing was maintained in the absence of the viral template. Finally, the authors isolated viRNAs complementary to regions of the viral genome from worms that must have inherited them from their grandparents.

This new research importantly shows a physiological context for transgenerational transmission of RNA mediated gene silencing, i.e. in inherited antiviral immunity. It also shows that the mechanism of the transmission of gene silencing can be mediated by inherited small RNA molecules.

A modified ribosome mediates stress in E. coli.

Vesper, O., Amitai, S., Belitsky, M., Byrgazov, K., Kaberdina, A., Engelberg-Kulka, H., & Moll, I. (2011). Selective Translation of Leaderless mRNAs by Specialized Ribosomes Generated by MazF in Escherichia coli. Cell, 147 (1), 147-157 DOI: 10.1016/j.cell.2011.07.047

This paper has characterised an interesting new mechanism of stress adaptation in bacteria in which ribosomes are modified to selectively translate a subset of mRNAs that have also been modified by the same enzyme.

Toxin-antitoxin (TA) modules are widespread prokaryotic genetic elements that have generally been characterised as selfish DNA when encoded on plasmids. The functions of chromosomally located TA systems are more likely to be integrated into the host cell’s regulatory networks. mazEF is a well characterised chromosomal TA system in E. coli. The two genes are cotranscribed as an operon, with mazE encoding a relatively labile antitoxin that inactivates the more stable endoribonuclease MazF. Under conditions of cell stress mazEF expression is inhibited. As MazE is less stable and is degraded by a protease, MazF activity is released. MazF cleaves single stranded mRNAs at ACA sequences, hence inhibiting protein synthesis. However, this inhibition is not global: the synthesis of about 10% of proteins is specifically enabled by MazF action. Some of these proteins are responsible for programmed cell death, whilst others have been shown to permit the survival of a subpopulation of bacterial cells (Amitai et al. 2009). This new paper has uncovered the mechanism by which the selective synthesis of this subset of the cell’s proteins is activated by MazF.

Analysing transcripts encoding proteins known to be synthesised in the presence of MazF activity, the authors found that they were cleaved at ACA sequences at or closely upstream of their AUG translation start sites. This creates a population of leaderless mRNAs (lmRNAs) that the paper also shows are selectively translated in the presence of MazF activity. Postulating that the selective translation of lmRNAs could be mediated by MazF modifications to the ribosome itself, the investigators went on to show that MazF also cleaves the 16S rRNA of the 30S ribosomal subunit. This cleavage results in the loss of 43nt from the 3′ end of the 16S rRNA, including the anti-Shine-Dalgarno sequence (aSD). SD-aSD interactions are important for the initiation of translation of canonical mRNAs with structured 5′ UTRs. However, in this case MazF generates specialised “Stress Ribosomes” lacking the aSD that selectively translate a “leaderless mRNA regulon” also generated by cleavage by MazF.
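The way cleavage at an ACA just upstream of the start codon strips the leader from a transcript can be illustrated with a small Python sketch. It rests on several simplifying assumptions: the mRNA is an invented toy sequence, the cut is placed immediately 5′ of the ACA triplet, the first AUG is taken as the start codon, and ACA sites inside the coding region are ignored.

```python
def aca_sites(mrna):
    """Start index of every ACA triplet in an RNA string."""
    return [i for i in range(len(mrna) - 2) if mrna[i:i + 3] == "ACA"]

def leaderless_fragment(mrna, start_codon="AUG"):
    """Return the 3' fragment left after cutting at the ACA nearest upstream of the AUG.

    Assumptions (for illustration only): the cut falls immediately 5' of the
    ACA, the first AUG is the start codon, and downstream ACA sites are ignored.
    Returns None if there is no start codon or no upstream ACA.
    """
    aug = mrna.find(start_codon)
    if aug == -1:
        return None
    upstream = [i for i in aca_sites(mrna) if i + 3 <= aug]
    if not upstream:
        return None
    return mrna[max(upstream):]

# Hypothetical transcript: a Shine-Dalgarno-like GGAGG, a spacer, an ACA
# directly before the AUG, then a very short coding region.
mrna = "GGAGGUUUACAAUGGCUUAA"
fragment = leaderless_fragment(mrna)
remaining_leader = fragment.find("AUG")

print("before:", mrna)
print("after: ", fragment)
print("bases left upstream of AUG:", remaining_leader)
```

After the cut only three bases remain ahead of the AUG and the GGAGG motif is gone, so a ribosome relying on SD-aSD pairing has nothing to bind, which fits the idea that these transcripts are handled by ribosomes that have themselves lost the aSD.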

This paper is important and interesting in that it has discovered an elegant and novel molecular mechanism employed by bacteria during times of environmental stress. It also adds greatly to the understanding of bacterial programmed cell death and the functions of chromosomally located TA systems, both of which are contentious subjects in relation to how their evolution typifies aspects of both altruism and selfishness.