Supergenes, Sociality, and Sex Chromosomes

A new paper in Nature shows that the social structure of fire ant colonies is determined by a ‘supergene’ – a single non-recombining cluster of hundreds of genes. The supergene makes up more than 50% of a pair of divergent chromosomes. These social chromosomes appear to have emerged in much the same way that sex chromosomes evolve.

Colonies of the fire ant Solenopsis invicta can be organised in two different modes, containing either a single queen (monogyne) or multiple queens (polygyne). This social polymorphism has been shown to be associated with a single Mendelian genetic factor. Two different alleles, B and b, of the Gp-9 gene, encoding an odorant-binding protein, predict the two colony types. This effect is mediated by worker-ant behaviour. Colonies composed solely of Gp-9BB workers live under a single queen, whereas mixed colonies of Gp-9BB and Bb workers accept many queens. These queens are invariably Bb, as the Bb workers will kill any BB queens they encounter. In this way the b allele acts as a ‘green beard’ gene – promoting its propagation by behavioural self-recognition. However, it does not spread unchecked through the population as it is recessive lethal – Gp-9bb individuals die early.
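The genotype rules in the paragraph above can be written as a small lookup. This is only an illustrative sketch: the function names (`is_viable`, `colony_form`, `polygyne_queen_survives`) are hypothetical and encode nothing beyond the rules stated in this post.

```python
from typing import Sequence


def is_viable(genotype: str) -> bool:
    """Gp-9 bb individuals die early, because b is recessive lethal."""
    return genotype != "bb"


def colony_form(worker_genotypes: Sequence[str]) -> str:
    """Colony type predicted from the Gp-9 genotypes of the workers."""
    if any(g == "Bb" for g in worker_genotypes):
        return "polygyne"  # mixed BB/Bb workers accept many queens
    return "monogyne"      # all-BB workers live under a single queen


def polygyne_queen_survives(queen_genotype: str) -> bool:
    """In a polygyne colony, Bb workers kill BB queens, so only Bb queens remain."""
    return is_viable(queen_genotype) and queen_genotype == "Bb"


print(colony_form(["BB", "BB", "BB"]))   # all-BB workers
print(colony_form(["BB", "Bb", "BB"]))   # mixed workers
print(polygyne_queen_survives("BB"))
```

The sketch deliberately says nothing about how queens are treated in monogyne colonies, since the text only describes queen killing by Bb workers.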

The two social forms differ in many aspects of their biology. Monogyne queens tend to found new colonies after long nuptial flights. They build their nests without the aid of workers or foraging and therefore require extensive fat reserves and a longer period of maturation. Polygyne queens often stay within their original nest or undertake limited nuptial flights, hence requiring smaller fat reserves and less maturation. Monogyne colonies are therefore simple families and highly dispersed, whilst polygyne colonies contain a number of families, tend to bud from each other and are frequently clustered. These different population structures display different behaviours, the more closely related monogyne colonies being more aggressive and territorial than those of polygyne fire ants.

Although the Gp-9 encoded odorant-binding protein may well regulate the two forms of social organisation by chemical communication, it was scarcely believable that this panoply of different behavioural, morphological and life-history traits could be regulated by the one protein. Researchers therefore speculated that rather than being the one determining gene, Gp-9 may instead be a marker whose presence correlates with a number of other genetic factors determining the alternative forms of colony organisation.

Clusters of co-segregating genetic loci – ‘supergenes’ – have been found to underlie patterns of adaptive variation regulating different floral types and butterfly mimicry. These types of loci facilitate the co-transmission of many different traits in parallel. They effectively behave like one classically defined gene but are in fact composed of many molecularly defined genes.

Wang et al. have now found that the fire ant colony organisation polymorphism is determined by a supergene. Comparing the haploid sons of BB and Bb queens, they discovered a 13Mb region of a 23Mb chromosome in which no recombination occurred between the B and b forms. This region included Gp-9 and at least 615 other genes (6.7% of known S. invicta genes). The vast majority of these genes were present in both chromosome types, but recombination fails to occur due to chromosomal rearrangements. The authors identified one major inversion changing the orientation of 9.3Mb of the non-recombining section.
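As a quick sanity check, the figures quoted above can be combined arithmetically. The sketch below uses only numbers from this post; the variable names are hypothetical, and it assumes that the 6.7% figure refers to the full set of 616 supergene genes (Gp-9 plus 615 others).

```python
REGION_MB = 13.0              # non-recombining region
CHROMOSOME_MB = 23.0          # whole social chromosome
INVERSION_MB = 9.3            # major inversion within the region
SUPERGENE_GENES = 616         # Gp-9 plus at least 615 other genes
SHARE_OF_KNOWN_GENES = 0.067  # assumption: applies to all 616 genes

# Share of the chromosome taken up by the supergene
fraction_of_chromosome = REGION_MB / CHROMOSOME_MB

# Share of the non-recombining region covered by the major inversion
fraction_inverted = INVERSION_MB / REGION_MB

# Size of the known S. invicta gene set implied by the 6.7% figure
implied_known_genes = SUPERGENE_GENES / SHARE_OF_KNOWN_GENES

print(f"supergene share of chromosome: {fraction_of_chromosome:.1%}")
print(f"inverted share of supergene:   {fraction_inverted:.1%}")
print(f"implied known gene count:      {implied_known_genes:.0f}")
```

The first ratio comes out a little above one half, consistent with the statement that the supergene makes up more than 50% of the chromosome.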

This heteromorphic pair of ‘social chromosomes’, seemingly regulating many aspects of the monogyny/polygyny division, bears resemblances to sex chromosomes. Recombination continues when B chromosomes are paired (as in XX chromosomes in mammals), whilst it fails to occur between B and b (as in XY). bb, like YY, is non-viable. The most widely accepted theory about how recombination is suppressed in the evolution of sex chromosomes from autosomes posits that inversions are selected for when genes with male-specific benefits are linked to a male sex-determining gene. It has been hard to prove this theory though, as it is difficult to identify alleles with sex-specific benefits. The demonstration that the same mechanism (i.e. inversion preventing recombination and permanently linking alleles) is responsible for the social supergene and chromosomes in fire ants lends support to this model of sex chromosome evolution. One can envisage how a series of inversions gradually and permanently links loci responsible for major complex adaptive traits into supergenes and potentially into specialised chromosomes.

As with Y chromosomes, the lack of recombination allows degenerative mutations to build up on the social b chromosome. Wang et al. found increased numbers of repetitive elements and longer introns in the non-recombining region. It is because of this accumulation of deleterious mutations that, like Y chromosomes, the b chromosomes contain recessive lethal alleles. Importantly however, haploid males containing a single b chromosome have to be viable during the life-cycle. This pressure limits the degeneration of the b chromosome in comparison to Y chromosomes.

As yet there is relatively little information on the roles of the 616 genes in the non-recombinant supergene. 70% of genes known to be differentially expressed between BB and Bb workers do however map to the non-recombining region. Further studies will no doubt tackle the molecular mechanisms by which the two social systems differentiate. It will be interesting to ask what proportion of genes in the supergene are directly involved in determination of social system. The authors estimate that suppression of recombination on the social chromosomes started relatively recently in fire ants, 390,000 years ago. Monogyny and polygyny exist in many other species of ant; do similar supergene systems underlie these other instances? Further, how widespread are supergenes? Potentially they could be a common mechanism underlying the evolution of complex adaptations such as social behaviours.

Wang, J., Wurm, Y., Nipitwattanaphon, M., Riba-Grognuz, O., Huang, Y., Shoemaker, D., & Keller, L. (2013). A Y-like social chromosome causes alternative colony organization in fire ants. Nature. DOI: 10.1038/nature11832

Bourke, A., & Mank, J. (2013). Genetics: A social rearrangement. Nature. DOI: 10.1038/nature11854

The Transposon/piRNA/Chromatin Nexus

Close observation of chromatin states at piRNA-silenced genomic loci demonstrates the power of transposons to change native gene expression.

As reviewed in an earlier post, the Drosophila Piwi/piRNA transposon silencing pathway can be divided into two facets: a complex pathway operating in the germline, centred on the Piwi-family argonautes Aubergine and AGO3 localised in peri-nuclear nuage, and a linear pathway operating in the somatic follicle cells. In this linear pathway, piRNAs derived from uni-directional piRNA clusters such as flamenco target Piwi to mediate silencing of a limited subset of retrotransposons. Unlike Aub and AGO3, Piwi is localised to the nucleus, leading to speculation that rather than silencing transposons post-transcriptionally by ‘slicing’ their transcripts, it may act at the transcriptional level. There are many precedents in other organisms for argonautes mediating transcriptional silencing via interactions with chromatin modification and DNA methylation pathways. However, whether one of these silencing modes is employed by Drosophila Piwi was unresolved. A new paper from the lab of Julius Brennecke analyses the linear piRNA pathway active in a cell line derived from the somatic follicle cells surrounding the oocyte (OSC cells). It includes important findings on a number of aspects of Piwi-mediated transposon silencing, leading to insights into the wider genomic ecology of transposon insertions.

In the first section of the paper, Sienski et al. demonstrate that Maelstrom (Mael), a protein containing putative RNA and DNA binding domains, expressed in both cytoplasm and nuclei and previously implicated in a number of Piwi-pathway effects, acts downstream of Piwi to effect TE silencing. Silencing requires the nuclear localisation of both Piwi and Mael. Further, mutation of the residues necessary for ‘slicer’ activity in Piwi did not de-repress TEs, suggesting a different mechanism for Piwi-mediated silencing.

Sienski et al. go on to marshal three different high-throughput techniques to show that Piwi mediates gene silencing at the transcriptional level. After knocking down (KD) the expression of Piwi pathway factors (piwi, mael) in OSC cells, they determined the set of repressed transposable elements (TEs) by comparing RNA levels (RNA-seq). Changes in the steady-state RNA levels were highly correlated with transcription rate as monitored by RNA polymerase II occupancy (ChIP-seq) and levels of nascent RNAs (GRO-seq). Judging by how closely derepression of TEs correlated with transcription rate, it seems unlikely that the linear piRNA pathway active in follicle cells acts post-transcriptionally at all.

Reasoning that Piwi-mediated transcriptional gene silencing may involve chromatin modification, Sienski et al. profiled the distribution of the repressive histone mark H3K9me3 in OSCs after piwi or mael knockdown. H3K9me3 levels at transposable elements known to be repressed by the piRNA pathway were significantly reduced in the absence of Piwi (and to a lesser extent Mael). This data was from across the genome irrespective of whether the TE was inserted into heterochromatic or euchromatic regions. To negate general effects associated with heterochromatin, the authors looked more closely at TE insertions within euchromatic regions.

Approximate sketch of the patterns of RNA pol II occupancy (i.e. transcription) and H3K9me3 at the mdg1 locus in control cells and after piwi or mael knockdown.

At a specific euchromatic insertion of the retrotransposon mdg1, they observed that upon either piwi KD or mael KD, transcription downstream of the insertion strongly increased. However, although this transcriptional bleeding into the surrounding area was similar upon TE derepression due to either piwi KD or mael KD, the pattern of H3K9me3 was very different. Normally this mdg1 insertion displays H3K9me3 in the surrounding 12kb, peaking at the insertion site. This was strongly reduced in piwi KD cells, but in mael KD, H3K9me3 was moderately reduced at the insertion site but had actually spread further downstream (see figure). Similar patterns were observed at nearly all euchromatic mdg1 insertions, as well as other TEs known to be targeted by the linear piRNA pathway active in OSC cells.

Strikingly, most euchromatic H3K9me3 peaks were sensitive to piwi knockdown, whilst 88% of H3K9me3 peaks were found within 5Kb of TE insertions. Piwi-mediated transposon silencing therefore seems to be the main trigger for H3K9 trimethylation in euchromatin.

This transposon silencing mechanism appears to have a major impact on native genes upon TE insertion in their vicinity. An insertion of the retrotransposon gypsy into the first intron of the expanded (ex) gene serves as a paradigm for these effects. In OSC cells, the gypsy insertion triggered H3K9me3 spreading into the surrounding 10-12Kb. In control cells, RNA polymerase II occupancy was observable at the ex transcription start site (TSS) but weak. Upon piwi or mael knockdown, transcription from the ex TSS was massively increased. As in the earlier mdg1 example, H3K9me3 levels were greatly reduced upon piwi KD but not in mael KD cells. Sienski et al. observed similar effects on the transcription of 28 more genes with nearby TE insertions in OSC cells.

These data have a number of ramifications, speaking of a complex interplay between transcription, the establishment and maintenance of repressive chromatin states and the Piwi pathway. Firstly, H3K9me3, considered a transcriptionally repressive histone mark, is compatible with transcription. In fact, based on its pattern in mael KD cells, the authors propose that downstream transcriptional bleeding leads to the spread of H3K9me3. Further, although H3K9me3 has an integral role in Piwi-mediated silencing, it is not the final silencing mark. H3K9 trimethylation is downstream of Piwi action, but is either upstream of or acts in parallel to Mael, which mediates an unknown silencing step crucial to Piwi transcriptional gene silencing.
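The logic of this inference can be laid out explicitly. The sketch below reduces the observations described in this post to coarse booleans (a deliberate simplification: in mael KD cells H3K9me3 was moderately reduced at the insertion site but still present and spread downstream). The dictionary and function names are hypothetical.

```python
# Coarse summary of the observations at piRNA-targeted TE insertions in OSC cells.
OBSERVATIONS = {
    "control": {"te_transcribed": False, "h3k9me3_present": True},
    "piwi_kd": {"te_transcribed": True, "h3k9me3_present": False},
    "mael_kd": {"te_transcribed": True, "h3k9me3_present": True},
}


def mark_is_sufficient_for_silencing(observations: dict) -> bool:
    """H3K9me3 cannot be the final silencing mark if any condition
    shows the mark and transcription together."""
    return not any(
        o["h3k9me3_present"] and o["te_transcribed"] for o in observations.values()
    )


def mark_depends_on(knockdown: str, observations: dict) -> bool:
    """The mark is downstream of a factor if knocking that factor down removes it."""
    return not observations[knockdown]["h3k9me3_present"]


print(mark_is_sufficient_for_silencing(OBSERVATIONS))  # mark coexists with transcription?
print(mark_depends_on("piwi_kd", OBSERVATIONS))        # is the mark downstream of Piwi?
print(mark_depends_on("mael_kd", OBSERVATIONS))        # is the mark downstream of Mael?
```

Read this way, the mael KD row carries the argument: the mark persists while silencing is lost, so Mael must act after, or alongside, H3K9 trimethylation.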

Importantly, this paper has demonstrated the impact that TE insertion and subsequent piRNA pathway transcriptional repression can have on native gene expression. There are two different modes in which the inactivation of Piwi-mediated TE silencing can lead to the transcriptional activation of these loci. Firstly, the spreading of repressive chromatin marks at transposons can suppress RNA polymerase II access to the gene’s promoter. Alleviation of TE repression hence leads to (re-)activation of gene expression. Conversely, as TEs (especially the long terminal repeats of some retrotransposons) can serve as promoters, the loss of their repressed chromatin state upon piRNA pathway loss can activate transcription of downstream regions. Although both these modes lead to transcriptional activation after Piwi pathway loss, they demonstrate that transposon insertion can either activate or repress transcription within relatively extensive genomic surroundings. This underscores the scope for transposons to act as regulatory elements, or to produce new chimeric transcripts and hence potential new genes.

These experiments were mainly performed in one cell type that only partially reflects the activity of what is already a subset of piwi/piRNA action during Drosophila oogenesis.  Piwi and Mael are also active in the nurse cells and oocyte, and this paper suggests that they have similar roles within the context of the expanded piRNA pathways active in the germline. It will be interesting to integrate this nuclear-localised transcriptional-silencing aspect of piRNA silencing into the context of ping-pong amplification and bi-directional piRNA cluster transcripts. Further, do these Piwi-mediated chromatin effects in the germline impact on the transcriptional status of TEs and genes later in somatic development? And if not, do other systems have equivalent activity?

This paper underlines again the importance of the arms race between mobile genetic elements and genomic immune systems such as the piRNA pathway for the wider genomic regulatory context. This contest is being observed to have shaped so many aspects of genome organisation throughout evolution that it sometimes becomes hard to differentiate parasitism from regulation. It is clear however, that to understand the evolutionary impact of mobile elements we must also understand the import of the various epigenetic mechanisms controlling their spread. The minutiae of these mechanisms with regard to their targets, plasticity, adaptability, heritability – often different from organism to organism – have major evolutionary significance. Evolution works differently depending on these mechanisms.

Sienski, G., Dönertas, D., & Brennecke, J. (2012). Transcriptional Silencing of Transposons by Piwi and Maelstrom and Its Impact on Chromatin State and Gene Expression. Cell, 151 (5), 964-980. DOI: 10.1016/j.cell.2012.10.040

Breaking Neuronal Symmetry by Chromatin Memories

The asymmetric fates of two bilaterally symmetrical neurons are determined by a two-step activation program at a miRNA locus. Very low levels of transcription ‘prime’ the locus many cell generations before the final fate determination is imposed by a bilateral ‘boost’.

Animal nervous systems are generally bilaterally symmetrical anatomically, whilst displaying many functionally important left-right asymmetries. How is asymmetry imposed on a bilaterally symmetrical ground plan? The nematode C. elegans, with its invariant cell lineage and tractable genetics, offers a powerful model system in which to tackle this issue. Cochella and Hobert have published an elegant new paper describing how a distinct chromatin state at a microRNA locus serves as a molecular mark encoding a memory of a cell’s ancestry in an asymmetric lineage. After many cell generations, this mark causes the cell to respond to terminal differentiation differently from its bilaterally symmetric partner cell.

The bilaterally symmetrical pair of gustatory neurons ASEL(eft) and ASER(ight) express different repertoires of chemoreceptors. This functional asymmetry is underpinned by the differential expression of two transcription factors. DIE-1 is expressed in ASEL and COG-1 in ASER. Together with a microRNA, lsy-6, which represses COG-1 expression, they form a bistable feedback loop responsible for determining the asymmetric fates of ASE neurons. Loss of any of these three factors results in conversion of one ASE to the other. However, the asymmetric expression of die-1 and cog-1 only occurs within the post-mitotic neurons themselves. How then is this asymmetry established?

A schematic representation of the features of asymmetric ASE specification. Note the original asymmetry in the lineages is determined at the 4 cell stage, tbx-37/38 expression in the great-granddaughters of ABa, and lsy-6 in the loop in ASEL.

The two ASEs are derived from different cell lineages that diverge at the 4-cell stage. The two daughters of the 2-cell stage blastomere AB, ABa and ABp, differentiate from each other due to signalling from one of the other 4-cell stage blastomeres to ABp. This signalling event represses the expression in the ABp lineage of a pair of redundant transcription factors, TBX-37 and TBX-38, which are transiently expressed in the 8 great-granddaughter cells derived from ABa. The expression of these TBX proteins is crucial to the asymmetric fate specification of ASE neurons, as in tbx-37/38 double mutants ASEL is converted into ASER. However, TBX-37 and 38 are only expressed in the lineage giving rise to ASEL six cell generations before its birth. This large gap between the different stages of asymmetric ASE determination led researchers to postulate the existence of a ‘memory mark’ linking TBX-37/38 action to the expression of the asymmetry-defining feedback loop.

During this hiatus between TBX-37/38 expression and terminal ASE determination, the lineages giving rise to the two neurons become symmetric. A number of left/right pairs of neuronal precursors expressing the proneural gene hlh-14 develop from the two lineages, but only the pair of ASE mother cells express the ‘terminal selector’ transcription factor CHE-1. CHE-1 drives the expression of many ASE-expressed genes and activates expression of the asymmetric loop components, lsy-6, die-1, and cog-1. It is at this point that the TBX-37/38-dependent memory mark must integrate into the bilateral activity of CHE-1 generating the asymmetric expression of the loop components.

To try to discover the nature of the memory mark Cochella and Hobert performed a detailed analysis of the expression of the loop components. lsy-6 was suggested to act upstream of die-1 and cog-1 by genetic experiments, and the researchers found that it was the first of the loop components to be expressed.  It is expressed asymmetrically from the start in the ASEL mother cell. Deletion of lsy-6 results in conversion of ASEL to ASER. A construct of lsy-6 in combination with 932 bp of upstream sequence is able to rescue this effect, but sometimes leads to the conversion of ASER to ASEL. This suggested that the ‘upstream element’ construct drove ectopic expression in ASER as well as ASEL. Indeed, the upstream element contains CHE-1 binding motifs causing expression in both the ASE neurons. Cochella and Hobert therefore assayed other lsy-6 surrounding sequence for cis-regulatory information limiting its expression to ASEL. In fact, a construct including the upstream element, lsy-6, and 300 bp of downstream sequence completely rescued lsy-6 null alleles, eliminated ectopic ASER conversion, and was expressed identically to the endogenous miRNA. Normal lsy-6 expression is therefore regulated by both the upstream and downstream elements.

When the downstream element was used alone to drive expression of a reporter gene, it produced a very different pattern. Expression started early in a few ABa-derived blastomeres, one cell division after the expression of tbx-37/38, continuing in the progenitive lineage of ASEL until its birth. It never drove expression in ABp-derived lineages. Expression from the downstream element was completely lost in tbx-37/38 double mutants, whilst mis-expression of TBX-37/38 in ABp-derived cells led to ectopic expression of the downstream element reporter. The downstream element contains a predicted binding-site for T-box proteins, directly linking the lineage-dependent expression of tbx-37/38 with the asymmetry-defining loop.

The expression pattern driven by the downstream reporter suggested that lsy-6 may be expressed far earlier than previously observed. The researchers therefore used a very sensitive technique to image potential lsy-6 transcripts. This showed that a few lsy-6 RNAs were present in cells in the lineage giving rise to ASEL five generations before strong expression is observed in the ASEL mother cell.

Broadly therefore, lsy-6 expression occurs in two phases: a very low level of activation early, dependent on the downstream element, and a second, upstream element-dependent, higher level of expression in the ASEL mother cell. However, deletion of the downstream element within large genomic constructs abrogated expression at all stages, and failed to rescue lsy-6 null alleles. This contrasted with earlier observations in which the upstream element alone could drive expression and rescue. The difference between these observations suggested that, within a normal genomic context, the upstream element can only function in combination with the downstream element.

The authors therefore posited a model in which early transcription, dependent on tbx-37/38 and the downstream element, may ‘prime’ the locus in some way, rendering it competent to respond to the later transcriptional ‘boost’ mediated by CHE-1 acting on the upstream element.

Cochella and Hobert tested their model by substituting priming via ectopic CHE-1 for priming via tbx-37/38 and the downstream element. In worms with the downstream element deleted, they drove early expression of CHE-1 from a heat-shock promoter approximately 4 cell generations before its normal time of expression. This caused low levels of lsy-6 transcription, rescuing the priming phase and allowing later ASEL expression and determination. Priming is therefore not dependent on a specific transcription factor acting on the downstream element; rather, as long as low levels of transcription occur at the locus, it is primed.
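The two-step ‘prime then boost’ model can be summarised as boolean logic for a single cell lineage. This is a minimal sketch of the model as described in this post, within a normal genomic context; the function and parameter names are hypothetical and do not come from the paper.

```python
def locus_primed(tbx_37_38_expressed: bool,
                 downstream_element_intact: bool,
                 early_ectopic_che1: bool = False) -> bool:
    """Priming needs low-level early transcription at the lsy-6 locus.
    Normally this comes from TBX-37/38 acting via the downstream element,
    but early heat-shock-driven CHE-1 can substitute for it."""
    natural_priming = tbx_37_38_expressed and downstream_element_intact
    return natural_priming or early_ectopic_che1


def ase_fate(tbx_37_38_expressed: bool,
             downstream_element_intact: bool = True,
             upstream_element_intact: bool = True,
             early_ectopic_che1: bool = False) -> str:
    """Fate of an ASE neuron given the state of the lsy-6 locus.
    CHE-1 is present in both ASE mother cells, so the boost only
    depends on whether the locus was primed earlier."""
    primed = locus_primed(tbx_37_38_expressed, downstream_element_intact,
                          early_ectopic_che1)
    boosted = primed and upstream_element_intact
    return "ASEL" if boosted else "ASER"


print(ase_fate(tbx_37_38_expressed=True))    # ABa-derived lineage, wild type
print(ase_fate(tbx_37_38_expressed=False))   # ABp-derived lineage, or tbx-37/38 double mutant
print(ase_fate(True, downstream_element_intact=False))                            # no priming
print(ase_fate(True, downstream_element_intact=False, early_ectopic_che1=True))   # rescued priming
```

The point of writing it this way is that CHE-1 appears on both sides of the model: as the bilateral boost it cannot distinguish left from right, so all the asymmetry has to be carried by the earlier priming step.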

This suggested that the memory mark causing the different response of the lsy-6 locus may be a lineage-specific, transcription-dependent chromatin state. Using a cunning technique to visualise the level of chromatin compaction on transgenic arrays containing the lsy-6 locus, they observed chromatin decompaction of the locus in the ASEL progenitive lineage 1 cell division after tbx-37/38 expression. Chromatin decompaction is associated with active genes; in the absence of early transcription the locus becomes compacted and refractory to CHE-1 activation later. In tbx-37/38 double mutants this lineage-specific decompaction was never observed, nor was it seen when the downstream element was deleted.

The memory mark is therefore chromatin decompaction at a miRNA locus linked to very low levels of transcription imposed within a cell lineage at an early stage of development. This primed state relays asymmetric information into an otherwise bilaterally symmetrical developmental program, facilitating diversification of neuronal cell fates. The timing of the priming mechanism fits in with earlier evidence that C. elegans embryos are relatively developmentally plastic until the 64-128 cell stage when developmental genes become compacted and refractory to ectopic activation.

Although I find this paper very elegant and convincing, I do have a few qualms about the most crucial experiment: the early ectopic activation by CHE-1. It seems like a slightly dirty experiment and I think I would’ve preferred to see ectopic induction of lsy-6 transcription via an unrelated mechanism. Perhaps experiments such as these would’ve had their own problems and my doubts are unfounded. I would also have liked to see the compaction assay performed with the ectopic CHE-1 induced activation.

The demonstration of a chromatin-based, lineage-specific prepattern facilitating differential responses to more generic inputs later in embryogenesis has wide implications, not just for asymmetries in the worm nervous system, but for the way we understand development in many animals. Firstly, a technical point: to visualise early lsy-6 transcription the authors had to use a very labour-intensive and hi-tech form of in situ hybridisation. The transcription they found, of just a few individual RNA molecules per cell, had massive developmental significance. Generally the techniques used to judge expression in developmental studies are nowhere near as sensitive, implying that we may be missing a lot of important information. Secondly, a more general point: a cell’s or tissue’s ‘competence’ to respond to developmental signalling, a concept derived from experimental embryology and perhaps disdained in more genetical perspectives, is relevant here. Molecular memories encoded by chromatin states may be a very widespread mode for imposing pre-pattern or developmental competence during embryogenesis. It seems to me that these types of understandings can begin to blend together the two different meanings of epigenetics: namely the derivation of the word by Waddington from epigenesis (meaning the increase in complexity during development), and the more current usage of epigenetics as describing a diverse collection of non-genetic inherited information.

Cochella, L., & Hobert, O. (2012). Embryonic Priming of a miRNA Locus Predetermines Postmitotic Neuronal Left/Right Asymmetry in C. elegans. Cell, 151 (6), 1229-1242. DOI: 10.1016/j.cell.2012.10.049

The Heterodox Dinokaryon

The nuclei of dinoflagellates display a highly derived organisation; chromosomes are permanently condensed and seem to lack histone proteins. A new study in Current Biology links the emergence of these characters to the importation of a novel family of nuclear proteins originating in giant viruses.

A Haeckel print of various Dinoflagellates

Dinoflagellates are a diverse and successful phylum of protists. Many are photosynthetic with a major role in the oceans’ primary production, whilst others have symbiotic, parasitic or predatory lifestyles. Their nuclei are highly unusual. Whereas in all other eukaryotes chromosomes only condense during mitosis, dinoflagellate chromosomes display a permanently condensed, liquid crystalline form. This ‘cholesteric’ structure produces a banded appearance in electron micrographs. Another key dinoflagellate heterodoxy is the absence (or at least undetectability) of histone proteins and the nucleosomal organisation of chromatin. These differences are so radical that dinoflagellates were suggested to represent an intermediate ‘mesokaryotic’ stage between prokarya and eukarya. Molecular phylogenetics has since clarified that they are in fact a sister clade to apicomplexan protists, leaving no doubt that the dinoflagellate nuclear organisation – the dinokaryon – is derived from standard eukaryotic ancestors. Other atypical features of the dinokaryon include very high DNA content and the replacement of as much as 70% of the base thymine with the rare base 5-hydroxymethyluracil. However, there is some variability in the occurrence of these features. For instance, the chromosome banding patterns are not always evident and some dinoflagellate species’ chromosomes can be decondensed at certain stages of their lifecycles.

A dinoflagellate nucleus. Note the condensed chromosomes with characteristic banding pattern (not Blastodinium sp.).

To trace the emergence of these dinokaryotic characteristics during the early evolution of the dinoflagellates, Gornik et al. investigated the nuclei of two early-branching members of the lineage. Perkinsus marinus represents the closest known lineage not included within the dinoflagellates proper, whilst Hematodinium sp. branches basally within the clade. In line with their expectations, the genome of P. marinus is organised into nucleosomal units, whilst that of Hematodinium sp. is not and appears to be 80 times larger. The P. marinus genome contains sequences for the four core histones as well as the linker histone H1, all of which were prominently detectable as protein in extracts from nuclei. Genome sequence is not available for Hematodinium sp.; however, transcriptomic sequencing revealed the presence of the four core histones as well as a number of variants. Unlike the histone genes of P. marinus, the sequences were quite divergent from the highly conserved eukaryotic norm, although the core ‘histone-fold’ regions were relatively well preserved, as were key residues that serve as sites for post-translational modification. Histone genes have been found in other dinoflagellate genomes recently, but histone protein expression had not previously been detected. Gornik et al. could identify histone H2A protein in nuclear extracts from Hematodinium sp. However, whereas in P. marinus and other eukaryotes histone proteins are the dominant species in such extracts, in Hematodinium sp. a single 30kDa species dominated.

When this band was extracted and the protein identified by mass spectrometry, it was found to correspond to a novel family of proteins, at least 4 of which were expressed in Hematodinium sp., whilst 13 were found in the transcriptome. This family of proteins only appears to be present in dinoflagellates; no homologues were found in other eukaryotic groups or in prokaryotes. However, database searching did reveal homology with a protein of unknown function widely found encoded in the genomes of phycodnaviruses, a family of giant viruses infecting algae. Gornik et al. therefore named these proteins Dinoflagellate/Viral NucleoProteins (DVNPs).

Like histones and many other DNA-binding proteins, DVNPs are highly basic proteins. They are relatively variable in their N-terminal regions, with higher conservation in a core region, which may potentially include a DNA-binding helix-turn-helix motif. Biochemical experiments demonstrated that DVNPs have a high affinity for DNA and are post-translationally modified at various residues by phosphorylation.

The phycodnaviridae are members of the nucleocytoplasmic large DNA viruses (NCLDVs), a monophyletic clade of giant viruses that encode much more of their replication apparatus than is typical of viruses. They are predicted to have emerged more than 2 billion years ago, predating the first dinoflagellates by more than a billion years. As most phycodnaviruses include DVNP orthologues, dinoflagellates must have acquired DVNPs from the phycodnaviruses early in their evolution. As yet there is no information on the roles of DVNPs in the phycodnaviridae, but the fact that both taxa have expanded genomes suggests a possible similar function. Do DVNPs allow such efficient DNA packing that the costs of genome expansion are somehow minimised?

The DVNPs are not the first family of putative histone-replacement proteins discovered in dinoflagellates. Later-branching taxa express ‘histone-like proteins’ (HLPs), probably related to the bacterial DNA-binding protein HU, and shown to be able to bend DNA in vitro. HLPs are not found in Hematodinium sp. or other early-branching dinoflagellates, whereas DVNPs are found in combination with HLPs in later-branching taxa. DVNPs therefore seem to be associated with the core dinokaryotic characteristics of permanently condensed chromosomes and expanded genome size, whilst the presence of HLPs correlates with other characters such as the chromosome banding patterns observed in later-branching taxa.

The observation that dinoflagellates do in fact encode and express divergent histones at low levels raises the question of what their roles could be if they are not primarily responsible for the bulk packing of DNA. Linked to this is the broad question of how DVNPs and HLPs act to condense dinoflagellate chromosomes. Considering the vast quantity of research attempting to understand the biology of eukaryotic chromosomes, it is rather daunting to find a whole new way of doing things: how do transcription and replication mechanisms work in the context of permanently condensed chromosomes? How does this link in with genome expansion? I don’t know how much dinoflagellate genomic data is available, but I imagine that a finished genome sequence would be of great use. On balance, though, I’d prioritise biochemical and structural studies of these various proteins’ actions on DNA.

Gornik, S., Ford, K., Mulhern, T., Bacic, A., McFadden, G., & Waller, R. (2012). Loss of Nucleosomal DNA Condensation Coincides with Appearance of a Novel Nuclear Protein in Dinoflagellates. Current Biology. DOI: 10.1016/j.cub.2012.10.036

Uploading piRNAs to the Cloud.

A new paper finds a protein linking piRNA transcription with processing in nuage.

The Piwi/piRNA system is responsible for protecting the germline from the mutagenic effects of transposon mobilisation. As summarised in an earlier post, in Drosophila large arrays of transposon fragments, located in pericentromeric and subtelomeric chromatin domains, give rise to long piRNA cluster transcripts. These transcripts are then processed to produce the 23-30 nt piRNAs which, when complexed with Piwi-family Argonaute proteins, effect the post-transcriptional silencing of transposons. Although a more limited piRNA system functions in the somatic follicle cells surrounding the Drosophila egg chamber, the bulk of germline transposon silencing is performed by the system active in the germline siblings of the oocyte – the nurse cells. Here, dual-strand piRNA cluster transcripts are processed in the nuage, a perinuclear electron-dense cytoplasmic structure, where the ‘ping-pong’ system of reciprocal cutting and complexing between the Piwi proteins Aubergine (Aub) and Ago3 leads to piRNA amplification.

Nuage is a hallmark of germline cytoplasm in animals, and appears to be the site of both piRNA processing and transposon silencing. A hierarchy of proteins responsible for the assembly and function of nuage has been revealed by studies in Drosophila. Vasa, a DEAD-box RNA-dependent helicase protein, is required for the localisation of Tudor and other Tudor-domain-containing (Tdrd) proteins. These serve as a platform for the piRNA system, binding Aub and Ago3. Defects in many of these piRNA biogenesis components do not just lead to uncontrolled transposon activity; they also affect the asymmetric localisation of RNAs in the developing oocyte – a process by which developmental prepattern is organised. Zhang et al. discovered that weak mutations in the uap56 gene caused similar defects, suggesting a potential role in piRNA biogenesis.

UAP56 is another DEAD-box-containing RNA-binding protein. It is ubiquitously expressed, localised in nuclei and has previously been shown to be involved in mRNA splicing and export. Zhang et al. found that in nurse cells it localises to discrete foci at the periphery of the nucleus. This pattern resembles that of Rhino (Rhi), a Heterochromatin Protein 1 variant previously shown to associate with piRNA clusters. Indeed, UAP56 and Rhino co-localised ~99% of the time in nurse cell nuclei. Mutations in either uap56 or rhi caused a failure in the focal localisation of the other protein, showing their co-dependence.

When Vasa was imaged at the same time, it became apparent that it localised to foci in the nuage directly across the nuclear envelope from UAP56-Rhi foci. Co-labelling with a nucleoporin showed that UAP56-Rhi foci and Vasa foci in fact directly abut nuclear pores from either side.

In the absence of functional UAP56 the nuage fails to assemble properly; Vasa, Aub and Ago3 all fail to localise. Similar effects are observed in rhi mutants, placing both UAP56 and Rhino upstream of Vasa as extrinsic factors necessary for nuage assembly. The uap56 mutants also fail to produce a large part of the proper complement of piRNAs, leading to mobilisation of transposons. No effects on the level of genic mRNAs were detectable. Due to the failure of nuage assembly, the uap56 mutants also display germline DNA damage and the morphological defects caused by mislocalisation of asymmetric RNAs.

DEAD-box-containing proteins act as ATP-dependent RNA clamps. As Rhino is known to associate with dual-strand piRNA clusters, Zhang et al. postulated that UAP56 may be binding and stabilising nascent cluster transcripts. Indeed, piRNA cluster transcripts could be co-immunoprecipitated with UAP56 and Vasa.

The data therefore suggest an attractive model in which cluster transcripts are passed across the nuclear pore between the two DEAD-box-containing proteins, UAP56 and Vasa. The authors term this a nuclear-pore-spanning piRNA processing compartment. piRNA cluster transcripts must in some way be marked and specifically transported via this trans-nuclear-pore compartment.

Running through this work as a consistent undertone are the implicit links to the broader RNA processing systems. The nuage is intricately linked to the differential transportation of RNAs from the nurse cells and around the oocyte. UAP56 has other roles in mRNA splicing and export from the nucleus. What exactly are the links between the germline-specific role of UAP56 and the general RNA splicing and export machinery? Zhang et al. end with the enticing observation that mutations in two different genes encoding conserved exon junction splicing components also lead to similar asymmetric RNA localisation defects. It appears that the control of piRNA processing and transposon silencing in nuage is intimately linked to broader networks controlling germline specification and the patterning of the oocyte. Although the different strands of these systems are difficult to tease apart, Drosophila oogenesis continues to offer an unparalleled paradigm for their investigation. The piRNA system is widely conserved in animals, but there does appear to be quite a lot of plasticity in its specifics. For instance, as discussed at length in this series of posts, in C. elegans, piRNAs are individually transcribed. I’d be very interested to find out whether homologues of Rhino and UAP56 play any role in this system. I’ll riff on the similarities and differences of piRNA systems and their links to development some more in future posts.

Zhang, F., Wang, J., Xu, J., Zhang, Z., Koppetsch, B., Schultz, N., Vreven, T., Meignin, C., Davis, I., Zamore, P., Weng, Z., & Theurkauf, W. (2012). UAP56 Couples piRNA Clusters to the Perinuclear Transposon Silencing Machinery. Cell, 151 (4), 871-884. DOI: 10.1016/j.cell.2012.09.040

Lin, H. (2012). Capturing the Cloud: UAP56 in Nuage Assembly and Function. Cell, 151 (4), 699-701. DOI: 10.1016/j.cell.2012.10.026

A chimeric fusion of RNA and DNA viruses.

The discovery of a new family of viruses leads to speculation on possible modes of recombination between RNA and DNA viruses.

The virosphere can be divided into three major classes: viruses with DNA genomes, retroviruses that reverse-transcribe their RNA genome into DNA during their lifecycle, and RNA-only viruses that don’t require DNA intermediates to replicate. In fact, viruses use all sorts of different permutations of genetic material: double-stranded RNA, single-stranded RNA (either negative or positive strand), dsDNA and ssDNA. Viruses evolve notoriously quickly and lateral gene transfer between them is rampant. However, gene transfer has most commonly occurred between closely related viruses or between those with similar replication mechanisms. A recent paper has reported the discovery of a new family of viruses that appear to have arisen via lateral gene transfer between a (non-retroid) positive-sense single-stranded RNA virus and a ssDNA virus.

Diemer and Stedman discovered the new virus whilst investigating viral diversity in a geothermal lake in California. Boiling Springs Lake is an acidic, high-temperature lake with a purely microbial ecosystem composed of archaea, bacteria, and some single-celled eukaryotes. Using a metagenomics approach (i.e. large-scale sequencing of environmental DNA from a virus-particle-sized fraction), they discovered the strange juxtaposition of a capsid protein (CP) gene related to those from the ssRNA plant-infecting Tombusviridae, with a rolling-circle replicase (Rep) gene most similar to those from the circular ssDNA-containing Circoviridae. Using primers designed against CP they confirmed the genome sequence of this putative virus, finding that it consisted of a single-stranded circular DNA containing 4 ORFs. ORFs 3 and 4 are of unknown function and unrelated to known genes. The virus contains a stem-loop structure upstream of the Rep gene similar to those that serve as replication origins in other circoviruses. Because of the chimeric origin of the Rep and CP genes, the authors termed it RNA-DNA hybrid virus (RDHV). This term is slightly open to misinterpretation, as it could suggest that both molecules actually encode its genome; to be clear, this is a circular ssDNA virus whose capsid protein is derived from ssRNA viruses.

Organisation of RDHV. Note that ORFs 3 and 4 are not equivalent to those of Tombusviruses, and RDHV is twice the size of other Circoviruses.

Scanning databases of environmental sequence, the researchers found three other instances of homologous CP and Rep sequences arranged in the same configuration, two from global ocean surveys and one from the Sargasso Sea. This shows that RDHV defines a new family of viruses that are common in marine environments and could be more widespread. As CP and Rep are still highly similar to their sibling genes, it appears that the LGT event underlying the evolution of this new family occurred quite recently.
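As a rough illustration of that kind of database scan, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `GeneHit` record, the contig names and the family labels are hypothetical, and the paper's actual search pipeline is not described here. It only flags contigs carrying both a tombusvirus-like CP and a circovirus-like Rep, ignoring gene order and orientation.

```python
from dataclasses import dataclass


@dataclass
class GeneHit:
    """One annotated gene on an environmental contig (hypothetical record)."""
    contig: str
    gene: str          # "CP" or "Rep"
    best_family: str   # family of the closest database match


def chimeric_contigs(hits, cp_family="Tombusviridae", rep_family="Circoviridae"):
    """Return contigs with a CP hit from cp_family and a Rep hit from rep_family."""
    has_cp, has_rep = set(), set()
    for h in hits:
        if h.gene == "CP" and h.best_family == cp_family:
            has_cp.add(h.contig)
        elif h.gene == "Rep" and h.best_family == rep_family:
            has_rep.add(h.contig)
    # Only contigs present in both sets show the RNA-virus/DNA-virus juxtaposition.
    return sorted(has_cp & has_rep)


# Made-up example data, not taken from the study.
hits = [
    GeneHit("contig_A", "CP", "Tombusviridae"),
    GeneHit("contig_A", "Rep", "Circoviridae"),
    GeneHit("contig_B", "CP", "Circoviridae"),
    GeneHit("contig_B", "Rep", "Circoviridae"),
    GeneHit("contig_C", "CP", "Tombusviridae"),
]
print(chimeric_contigs(hits))  # ['contig_A']
```

A real scan would also need to check that the two genes sit in the same relative arrangement as in RDHV, which this sketch leaves out.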

How did recombination occur between a non-retroid ssRNA virus and a DNA virus? A number of genes derived from non-retroid RNA viruses have been found in eukaryotic genomes, so perhaps this type of exchange is not as strange or rare as it may seem. The most likely scenario involves the RNA gene being converted into DNA by reverse transcription, followed by DNA-DNA recombination. As reverse transcriptase is not encoded by either virus, it could have been supplied in trans by retrotransposons, group II introns, or retroviruses within a common host cell. This brings us to the problem of metagenomic studies: they have amazing power to identify novel viruses and organisms, but yield very little information on the biology of what is found. In the case of RDHV and its family, we do not know what their hosts are, do not know the morphology of the viruses, and do not know the functions of half of its four-gene genome. I’m not sure how quickly these questions will be answered. Nevertheless, this study shows that amazing diversity is still out there to be found, and yields insight into mechanisms underlying virus evolution – possibly in the deep past as well as more recently.

Diemer, G., & Stedman, K. (2012). A novel virus genome discovered in an extreme environment suggests recombination between unrelated groups of RNA and DNA viruses. Biology Direct, 7 (1). DOI: 10.1186/1745-6150-7-13

Beating a Toxin-Antitoxin System; Evading Suicide

Bacteria have evolved many different systems to evade viral predation. One strategy, abortive infection (Abi), involves altruistic suicide. Mediated by a toxin-antitoxin (TA) system, the suicide of the infected cell protects the clonal bacterial population by preventing the spread of replicated bacteriophage. A new paper in PLoS Genetics describes a molecular mimicry-based strategy that allows phage to escape abortive infection.

Toxin-antitoxin systems are widespread prokaryotic genetic elements found on both plasmids and bacterial chromosomes. Encoding a relatively long-lived toxin and a more labile antitoxin expressed from a single bi-cistronic operon, they were originally characterised as ‘addiction modules’. In the event that a plasmid expressing a TA system fails to be inherited by a daughter cell, the absence of antitoxin allows the persisting toxin to kill the cell – post-segregational killing. These attributes as ‘selfish elements’ made it slightly surprising that so many TA systems have been found encoded on bacterial chromosomes themselves. I’ve previously written about an example of one such TA system’s activity in mediating a stress response in E. coli, and they’ve also been implicated in the formation of antibiotic-tolerant ‘persister’ cells.
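The logic of post-segregational killing can be put into numbers with a toy model. The Python sketch below is only an illustration under stated assumptions: both molecules decay exponentially, nothing new is made once the plasmid is lost, and one antitoxin neutralises one toxin. The function name, starting amounts and half-lives are invented, not measured values.

```python
import math


def time_until_toxin_free(antitoxin0, toxin0, antitoxin_half_life, toxin_half_life):
    """Time after plasmid loss at which toxin first outnumbers antitoxin.

    Toy model (assumption): exponential decay of both molecules, no new
    synthesis, and 1:1 neutralisation of toxin by antitoxin.
    """
    if antitoxin_half_life >= toxin_half_life:
        raise ValueError("antitoxin must be more labile than the toxin")
    if antitoxin0 <= toxin0:
        return 0.0  # toxin is already in excess
    # Solve antitoxin0 * 2**(-t/ha) == toxin0 * 2**(-t/ht) for t.
    rate_gap = 1.0 / antitoxin_half_life - 1.0 / toxin_half_life
    return math.log2(antitoxin0 / toxin0) / rate_gap


# Illustrative numbers only: 4-fold antitoxin excess, half-lives of 10 and 60 minutes.
t = time_until_toxin_free(antitoxin0=400, toxin0=100,
                          antitoxin_half_life=10, toxin_half_life=60)
print(f"toxin exceeds antitoxin after about {t:.0f} minutes")
```

With these made-up numbers the crossover comes at about 24 minutes. However large the initial antitoxin excess, a plasmid-free cell eventually ends up with unneutralised toxin as long as the antitoxin is the more labile partner.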

TA systems have been grouped into three classes, defined by the molecular level at which their two components interact. In type I systems, translation of the toxin is prevented by an antisense RNA antitoxin binding to its transcript, whilst in type II systems both partners are proteins. Most recently, type III TA systems have been characterised, in which the toxin protein is neutralised by direct binding of an RNA antitoxin. Examples of all three varieties have been found to protect bacteria from phage via abortive infection: phage replication disrupts the normal cellular transcriptional program, interrupting antitoxin production and hence leading to cell death.

The first type III TA system to be characterised, ToxIN, was found on plasmids in the phytopathogen Pectobacterium atrosepticum (Pba), and shown to inhibit the propagation of multiple different bacteriophages. ToxN is an endoribonuclease, whilst the antitoxin, ToxI, is a 36 nt RNA structured as a ‘pseudoknot’. The partners combine into a hetero-hexameric structure composed of 3 ToxN molecules and 3 ToxI pseudoknots.

Blower et al. have discovered that phage can evade the Abi system by producing molecular mimics of the ToxI RNA. The lytic bacteriophage ΦTE normally fails to infect Pba carrying a toxIN-containing plasmid. At low frequency, however, new phage strains emerge that are capable of evading the Abi system. Upon sequencing the genomes of these ‘escape strains’, the researchers discovered that they all contained sequence expansions at one specific locus. The toxI locus contains 5.5 repeats of the 36 nt RNA pseudoknot-encoding sequence. The ‘escape locus’ from the phage normally encoded 1.5 repeats of a pseudo-ToxI sequence. In all the escape strains this repeat had been expanded so that it contained either 4.5 or 5.5 repeats. These expansions had probably arisen due to strand-slippage during phage replication. In one escape strain homologous recombination had occurred between the phage pseudo-ToxI and the endogenous toxI; the phage had effectively hijacked a normal antitoxin-encoding gene.

The 1.5 repeat pseudo-ToxI could not inhibit Abi (as the sequence was out of phase it did not actually encode a functional pseudoknot). However, the repeat expansions had allowed the phage to make an antitoxin mimic that protected them from the TA system and hence Abi.
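One way to see why a 1.5-repeat array might encode no functional unit while an expanded array does is simple arithmetic on a tandem array. The Python sketch below is a hedged illustration only: the 36 nt repeat length comes from the text, but the `phase_offset_nt` parameter (how far into the array the first functional unit boundary lies) and the 24 nt value used are hypothetical, and the paper's actual sequence analysis is not reproduced here.

```python
REPEAT_NT = 36  # length of one ToxI pseudoknot-encoding repeat (from the text)


def complete_units(n_repeats, phase_offset_nt=0):
    """Count whole 36-nt units in a tandem array, reading from phase_offset_nt.

    phase_offset_nt is a hypothetical parameter: the distance from the start
    of the array to the first boundary of a functional unit.
    """
    if not 0 <= phase_offset_nt < REPEAT_NT:
        raise ValueError("offset must lie within one repeat")
    array_nt = round(n_repeats * REPEAT_NT)
    return max(0, (array_nt - phase_offset_nt) // REPEAT_NT)


# Compare the wild-type phage locus (1.5 repeats) with the expanded escape loci.
for n in (1.5, 4.5, 5.5):
    print(n, "repeats:", complete_units(n, phase_offset_nt=24), "complete unit(s)")
```

With the hypothetical 24 nt offset, 1.5 repeats yield no complete unit, whereas 4.5 and 5.5 repeats yield three and four. The real requirement for a working pseudoknot depends on RNA structure and processing, which this count does not model.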

ΦTE is capable of generalised transduction – the ability to package chromosomal and plasmid DNA from its host and transfer it during infection. Blower et al. showed that one of the ΦTE escape strains is able to transduce the plasmid-encoded ToxIN – a case of a bacteriophage horizontally transferring an anti-phage defence mechanism. This brings into focus the complex evolutionary dynamics operating between the three different genetic entities being studied: the bacterial cell, the plasmid encoding the TA system, and the bacteriophage evading it and potentially propagating it. From the selfish viewpoint of the TA module, what’s best: preventing the spread of the phage or being disseminated by it? These questions aren’t about to be easily answered; however, this is an interesting way to analyse further examples of similar systems.

Blower, T., Evans, T., Przybilski, R., Fineran, P., & Salmond, G. (2012). Viral Evasion of a Bacterial Suicide System by RNA–Based Molecular Mimicry Enables Infectious Altruism. PLoS Genetics, 8 (10). DOI: 10.1371/journal.pgen.1003023