Monthly Archives: November 2012

The Heterodox Dinokaryon

The nuclei of dinoflagellates display a highly derived organisation; chromosomes are permanently condensed and seem to lack histone proteins. A new study in Current Biology links the emergence of these characters to the importation of a novel family of nuclear proteins originating in giant viruses.

A Haeckel print of various Dinoflagellates

Dinoflagellates are a diverse and successful phylum of protists. Many are photosynthetic with a major role in the oceans’ primary production, whilst others have symbiotic, parasitic or predatory lifestyles. Their nuclei are highly unusual. Whereas in all other eukaryotes chromosomes only condense during mitosis, dinoflagellate chromosomes display a permanently condensed, liquid crystalline form. This ‘cholesteric’ structure produces a banded appearance in electron micrographs. Another key dinoflagellate heterodoxy is the absence (or at least undetectability) of histone proteins and of the nucleosomal organisation of chromatin. These differences are so radical that dinoflagellates were once suggested to represent an intermediate ‘mesokaryotic’ stage between prokaryotes and eukaryotes. Molecular phylogenetics has since clarified that they are in fact a sister clade to apicomplexan protists, leaving no doubt that the dinoflagellate nuclear organisation – the dinokaryon – is derived from standard eukaryotic ancestors. Other atypical features of the dinokaryon include very high DNA content and the replacement of as much as 70% of the base thymine with the rare base 5-hydroxymethyluracil. However, there is some variability in the occurrence of these features. For instance, the chromosome banding patterns are not always evident, and some dinoflagellate species’ chromosomes can be decondensed at certain stages of their lifecycles.

A dinoflagellate nucleus. Note the condensed chromosomes with characteristic banding pattern (not Blastodinium sp.).

To investigate the emergence of these dinokaryotic characteristics during the early evolution of the dinoflagellates, Gornik et al. examined the nuclei of two early-branching members of the lineage. Perkinsus marinus represents the closest known lineage not included within the dinoflagellates proper, whilst Hematodinium sp. branches basally within the clade. In line with their expectations, the genome of P. marinus is organised into nucleosomal units, whilst that of Hematodinium sp. is not and appears to be 80 times larger. The P. marinus genome contains sequences for the four core histones as well as the linker histone H1, all of which were prominently detectable as protein in extracts from nuclei. A genome sequence is not available for Hematodinium sp., but transcriptomic sequencing revealed the presence of the four core histones as well as a number of variants. Unlike the histone genes of P. marinus, these sequences were quite divergent from the highly conserved eukaryotic norm. The core ‘histone-fold’ regions were nevertheless relatively well preserved, as were key residues that serve as sites for post-translational modification. Histone genes have been found in other dinoflagellate genomes recently, but histone protein expression had not previously been detected. Gornik et al. could identify histone H2A protein in nuclear extracts from Hematodinium sp. However, whereas in P. marinus and other eukaryotes histone proteins are the dominant species in such extracts, in Hematodinium sp. a single 30 kDa species dominated.

When this band was extracted and the protein identified by mass spectrometry, it was found to correspond to a novel family of proteins, at least 4 of which were expressed in Hematodinium sp., whilst 13 were found in the transcriptome. This family of proteins appears to be present only in dinoflagellates; no homologues were found in other eukaryotic groups or in prokaryotes. However, database searching did reveal homology with a protein of unknown function that is widely encoded in the genomes of phycodnaviruses, a family of giant viruses infecting algae. Gornik et al. therefore named these proteins Dinoflagellate/Viral NucleoProteins (DVNPs).

Like histones and many other DNA-binding proteins, DVNPs are highly basic proteins. They are relatively variable in their N-terminal regions, with higher conservation in a core region that may include a DNA-binding helix-turn-helix motif. Biochemical experiments demonstrated that DVNPs have a high affinity for DNA and are post-translationally modified at various residues by phosphorylation.

The Phycodnaviridae are members of the nucleocytoplasmic large DNA viruses (NCLDVs), a monophyletic clade of giant viruses that encode much more of their replication apparatus than is typical of viruses. They are predicted to have emerged more than 2 billion years ago, predating the first dinoflagellates by more than a billion years. As most phycodnaviruses encode DVNP orthologues, dinoflagellates must have acquired DVNPs from the phycodnaviruses early in their evolution. As yet there is no information on the roles of DVNPs in the Phycodnaviridae, but the fact that both taxa have expanded genomes suggests a possible similar function. Do DVNPs allow such efficient DNA packing that the costs of genome expansion are somehow minimised?

The DVNPs are not the first family of putative histone-replacement proteins discovered in dinoflagellates. Later-branching taxa express ‘histone-like proteins’ (HLPs), probably related to the bacterial DNA-binding protein HU, and shown to be able to bend DNA in vitro. HLPs are not found in Hematodinium sp. or other early-branching dinoflagellates, whereas DVNPs are found in combination with HLPs in later-branching taxa. DVNPs therefore seem to be associated with the core dinokaryotic characteristics of permanently condensed chromosomes and expanded genome size, whilst the presence of HLPs correlates with other characters such as the chromosome banding patterns observed in later-branching taxa.

The observation that dinoflagellates do in fact encode and express divergent histones at low levels raises the question of what their roles could be if they are not primarily responsible for the bulk packing of DNA. Linked to this is the broad question of how DVNPs and HLPs act to condense dinoflagellate chromosomes. Considering the vast quantity of research attempting to understand the biology of eukaryotic chromosomes, it is rather daunting to find a whole new way of doing things. How do transcription and replication mechanisms work in the context of permanently condensed chromosomes? How does this link in with genome expansion? I don’t know how much dinoflagellate genomic data is available, but I imagine that a finished genome sequence would be of great use. Even so, I would rather prioritise biochemical and structural studies of these various proteins’ actions on DNA.

Gornik, S., Ford, K., Mulhern, T., Bacic, A., McFadden, G., & Waller, R. (2012). Loss of Nucleosomal DNA Condensation Coincides with Appearance of a Novel Nuclear Protein in Dinoflagellates. Current Biology. DOI: 10.1016/j.cub.2012.10.036

Uploading piRNAs to the Cloud.

A new paper finds a protein linking piRNA transcription with processing in nuage.

The Piwi/piRNA system is responsible for protecting the germline from the mutagenic effects of transposon mobilisation. As summarised in an earlier post, in Drosophila large arrays of transposon fragments, located in pericentromeric and subtelomeric chromatin domains, give rise to long piRNA cluster transcripts. These transcripts are then processed to produce the 23-30 nt piRNAs which, when complexed with Piwi-family argonaute proteins, effect the post-transcriptional silencing of transposons. Although a more limited piRNA system functions in the somatic follicle cells surrounding the Drosophila egg chamber, the bulk of germline transposon silencing is performed by the system active in the germline siblings of the oocyte – the nurse cells. Here, dual-strand piRNA cluster transcripts are processed in the nuage, a perinuclear electron-dense cytoplasmic structure, where the ‘ping-pong’ system of reciprocal cutting and complexing between the Piwi proteins Aubergine (Aub) and Ago3 leads to piRNA amplification.

Nuage is a hallmark of germline cytoplasm in animals, and appears to be the site of both piRNA processing and transposon silencing. A hierarchy of proteins responsible for the assembly and function of nuage has been revealed by studies in Drosophila. Vasa, a DEAD-box RNA helicase, is required for the localisation of Tudor and other Tudor-domain-containing (Tdrd) proteins. These serve as a platform for the piRNA system, binding Aub and Ago3. Defects in many of these piRNA biogenesis components do not just lead to uncontrolled transposon activity; they also affect the asymmetric localisation of RNAs in the developing oocyte – a process by which developmental prepattern is organised. Zhang et al. discovered that weak mutations in the uap56 gene caused similar defects, suggesting a potential role in piRNA biogenesis.

UAP56 is another DEAD-box containing RNA-binding protein. It is ubiquitously expressed, localised in nuclei and has previously been shown to be involved in mRNA splicing and export. Zhang et al. found that in nurse cells it localises to discrete foci in the periphery of the nucleus. This pattern resembles that of Rhino (Rhi), a Heterochromatin Protein 1 variant previously shown to associate with piRNA clusters. Indeed, UAP56 and Rhino co-localised ~99% of the time in nurse cell nuclei. Mutations in either uap56 or rhi caused a failure in the focal localisation of the other protein, showing their co-dependence.

When Vasa was imaged at the same time, it became apparent that it localised to foci in the nuage directly across the nuclear envelope from UAP56-Rhi foci. Co-labelling with a nucleoporin showed that UAP56-Rhi foci and Vasa foci in fact directly abut nuclear pores from either side.

In the absence of functional UAP56 the nuage fails to assemble properly; Vasa, Aub and Ago3 all fail to localise. Similar effects are observed in rhi mutants, placing both UAP56 and Rhino upstream of Vasa as extrinsic factors necessary for nuage assembly. The uap56 mutants also fail to produce a large part of the proper complement of piRNAs, leading to a consequent mobilisation of transposons. No effects on the level of genic mRNAs were detectable. Due to the failure of nuage assembly, the uap56 mutants also display germline DNA damage and the morphological defects caused by mislocalisation of asymmetric RNAs.

DEAD-box containing proteins act as ATP-dependent RNA clamps. As Rhino is known to associate with dual-strand piRNA clusters, Zhang et al. postulated that UAP56 may be binding and stabilising nascent cluster transcripts. Indeed, piRNA cluster transcripts could be co-immunoprecipitated with UAP56 and Vasa.

The data therefore suggest an attractive model in which cluster transcripts are passed across the nuclear pore between the two DEAD-box containing proteins, UAP56 and Vasa. The authors term this a nuclear pore spanning piRNA processing compartment. piRNA cluster transcripts must in some way be marked and specifically transported via this trans-nuclear-pore compartment.

Running through this work as a consistent undertone are the implicit links to the broader RNA processing systems. The nuage is obviously intricately linked to the differential transportation of RNAs from the nurse cells and around the oocyte. UAP56 has other roles in mRNA splicing and export from the nucleus. What exactly are the links between the germline-specific role of UAP56 and the general RNA splicing and export machinery? Zhang et al. end with the enticing observation that mutations in two different genes encoding conserved exon junction splicing components also lead to similar asymmetric RNA localisation defects. It appears that the control of piRNA processing and transposon silencing in nuage is intimately linked to broader networks controlling germline specification and the patterning of the oocyte. Although the different strands of these systems are difficult to tease apart, Drosophila oogenesis continues to offer an unparalleled paradigm for their investigation. The piRNA system is widely conserved in animals, but there does appear to be quite a lot of plasticity in its specifics. For instance, as discussed at length in this series of posts, in C. elegans, piRNAs are individually transcribed. I’d be very interested to find out whether homologues of Rhino and UAP56 play any role in this system. I’ll riff on the similarities and differences of piRNA systems and their links to development some more in future posts.

Zhang, F., Wang, J., Xu, J., Zhang, Z., Koppetsch, B., Schultz, N., Vreven, T., Meignin, C., Davis, I., Zamore, P., Weng, Z., & Theurkauf, W. (2012). UAP56 Couples piRNA Clusters to the Perinuclear Transposon Silencing Machinery. Cell, 151 (4), 871-884. DOI: 10.1016/j.cell.2012.09.040

Lin, H. (2012). Capturing the Cloud: UAP56 in Nuage Assembly and Function. Cell, 151 (4), 699-701. DOI: 10.1016/j.cell.2012.10.026

A chimeric fusion of RNA and DNA viruses.

The discovery of a new family of viruses leads to speculation on possible modes of recombination between RNA and DNA viruses.

The virosphere can be divided into three major classes: viruses with DNA genomes, retroviruses that reverse-transcribe their RNA genome into DNA during their lifecycle, and RNA-only viruses that don’t require DNA intermediates to replicate. In fact, viruses use all sorts of different permutations of genetic material: double-stranded RNA, single-stranded RNA (either negative or positive strand), dsDNA and ssDNA. Viruses evolve notoriously quickly and lateral gene transfer between them is rampant. However, gene transfer has most commonly occurred between closely related viruses or between those with similar replication mechanisms. A recent paper has reported the discovery of a new family of viruses that appear to have arisen via lateral gene transfer between a (non-retroid) +ve single-stranded RNA virus and a ssDNA virus.

Diemer and Stedman discovered the new virus whilst investigating viral diversity in a geothermal lake in California. Boiling Springs Lake is an acidic, high-temperature lake with a purely microbial ecosystem composed of archaea, bacteria, and some single-celled eukaryotes. Using a metagenomics approach (i.e. large-scale sequencing of environmental DNA from a virus-particle-sized fraction), they discovered the strange juxtaposition of a capsid protein (CP) gene related to those from the ssRNA plant-infecting Tombusviridae, with a rolling-circle replicase (Rep) gene most similar to those from the circular ssDNA-containing Circoviridae. Using primers designed against CP they confirmed the genome sequence of this putative virus, finding that it consisted of a single-stranded circular DNA containing 4 ORFs. ORFs 3 and 4 are of unknown function and unrelated to known genes. The virus contains a stem loop structure upstream of the Rep gene similar to those that serve as replication origins in other Circoviruses. Because of the chimeric origin of the Rep and CP genes, the authors termed it RNA-DNA hybrid virus (RDHV). This term is slightly open to misinterpretation, as it could suggest that both molecules actually encode its genome. To be clear, this is a circular ssDNA virus whose capsid protein is derived from ssRNA viruses.

Organisation of RDHV. Note that ORFs 3 and 4 are not equivalent to those of Tombusviruses, and RDHV is twice the size of other Circoviruses.

Scanning databases of environmental sequence, the researchers found three other instances of homologous CP and Rep sequences arranged in the same configuration, two from global ocean surveys and one from the Sargasso Sea. This suggests that RDHV defines a new family of viruses that are common in marine environments and could be more widespread. As CP and Rep are still highly similar to their sibling genes, it appears that the lateral gene transfer (LGT) event underlying the evolution of this new family occurred quite recently.
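For the computationally minded, the database scan described above boils down to asking which environmental contigs carry both a tombusvirus-like CP gene and a circovirus-like Rep gene. Below is a minimal Python sketch of that filtering step only. It assumes the homology search has already been run and summarised as simple hit records; the `Hit` record, the family labels and the contig names are all hypothetical illustrations and are not taken from the paper. It also checks co-occurrence only, not gene order or orientation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One homology hit on a contig (hypothetical record layout)."""
    contig: str   # contig identifier, made up for illustration
    family: str   # label of the best-matching protein family


def chimeric_contigs(hits, cp_family="tombusvirus_CP", rep_family="circovirus_Rep"):
    """Return sorted contig IDs that carry both a CP-like and a Rep-like hit."""
    families_by_contig = {}
    for hit in hits:
        # Collect the set of protein families seen on each contig.
        families_by_contig.setdefault(hit.contig, set()).add(hit.family)
    return sorted(
        contig
        for contig, families in families_by_contig.items()
        if cp_family in families and rep_family in families
    )


# Toy input: only contig_A carries both gene types.
example_hits = [
    Hit("contig_A", "tombusvirus_CP"),
    Hit("contig_A", "circovirus_Rep"),
    Hit("contig_B", "circovirus_Rep"),
    Hit("contig_C", "tombusvirus_CP"),
]
candidates = chimeric_contigs(example_hits)
```

On real data each candidate would still need checking by hand, since two hits on one contig could also reflect a misassembly rather than a genuine chimeric genome.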

How did recombination occur between a non-retroid ssRNA virus and a DNA virus? A number of genes derived from non-retroid RNA viruses have been found in eukaryotic genomes, so perhaps this type of exchange is not as strange or rare as it may seem. The most likely scenario involves the RNA gene being converted into DNA by reverse transcription, followed by DNA-DNA recombination. As reverse transcriptase is not encoded by either virus, it could have been supplied in trans by retrotransposons, group II introns, or retroviruses within a common host cell. This brings us to the problem of metagenomic studies; they have amazing power to identify novel viruses and organisms, but yield very little information on the biology of what is found. In the case of RDHV and its family we do not know what their hosts are, don’t know the morphology of the viruses, and don’t know the functions of half of its four-gene genome. I’m not sure how quickly these questions will be answered. Nevertheless, this study shows that amazing diversity is still out there being found, and yields insight into mechanisms underlying virus evolution – possibly in the deep past as well as more recently.

Diemer, G., & Stedman, K. (2012). A novel virus genome discovered in an extreme environment suggests recombination between unrelated groups of RNA and DNA viruses. Biology Direct, 7 (1). DOI: 10.1186/1745-6150-7-13