A dual purpose RNA and Hox regulation

A new paper in PLoS Genetics shows that a long non-coding RNA regulates the expression of a Hox gene in Drosophila in cis. This finding suggests an explanation for the co-linearity displayed by Hox genes between genomic arrangement and expression pattern.

The Ultrabithorax mutant.

Hox genes are master-regulators of positional identity along the anterior-posterior axis throughout bilaterian animals. Hox genes are found in genomic clusters in which their 3′-5′ organisation mirrors their expression pattern along the A-P axis. This correspondence between body axis and genomic organisation is termed co-linearity. An important feature of Hox gene genetics is the phenomenon of ‘posterior prevalence’. In any given segment the gene that has its most anterior boundary of expression in that segment will define segmental identity. Hence, if that gene is not expressed the segment will take on a more anterior identity. Perhaps the clearest example of this phenomenon is the Ultrabithorax mutant in Drosophila, in which segments that would have generated abdominal structures instead take on a thoracic fate, leading to flies with two sets of wings.

The Hox gene cluster is actually divided into two partial clusters in Drosophila: the Antennapedia complex (ANT-C) and the Bithorax complex (BX-C). BX-C consists of three Hox genes responsible for posterior patterning in Drosophila, Ultrabithorax (Ubx), abdominal-A (abd-A), and Abdominal-B (Abd-B), spread over ~300kb, and has become a paradigm for the understanding of genetic regulation. Many transcriptional enhancers, maintenance elements (sites for the binding of Polycomb-group and Trithorax-group chromatin modulating complexes), and encoded microRNAs responsible for regulating the expression of the BX-C genes have been discovered. However, a complete picture of BX-C regulation is still far away. It’s been known since the 1980s that much of BX-C is transcribed, but the significance of this finding is just emerging. Gummalla et al. have used classical genetics to characterise the role of one such non-coding RNA in relation to the expression of abd-A in the embryonic CNS.

Figure showing the expression of ABD-A (red), and ABD-B (green) in the embryonic CNS. Note the gap in PS13, that isn’t filled by derepressed ABD-A in this mutant.

abd-A is expressed in the embryonic epidermis and CNS in parasegments (PS) 7-12 but is excluded from PS13. In line with ‘posterior prevalence’, this was considered to be due to Abd-B repressing abd-A expression. A mutation that removes Abd-B shows abd-A expression extending into PS13. However, this mutation also removed some of the sequence downstream of the transcription unit of Abd-B. In flies homozygous for more subtle mutations affecting Abd-B, abd-A expression only spreads into PS13 epidermis and not the CNS. Therefore, some function located in the genomic region downstream of Abd-B (termed iab-8) was necessary for abd-A repression in the PS13 CNS. Gummalla et al. knew that a long non-coding RNA (iab-8 ncRNA) was predicted to initiate in this area, and therefore set out to characterise its function.

A map of the abdominal half of the bithorax complex. The iab-8 ncRNA is shown in blue (note exon structure). Abd-B and abd-A are in black, and the position of miR-iab-8 is shown.

iab-8 ncRNA is transcribed from virtually the entire region between Abd-B and abd-A, spanning 92kb. Mutations that truncate iab-8 ncRNA near the Abd-B end cause a derepression of abd-A expression in the PS13 CNS, but mutations affecting the end nearest abd-A display only subtle derepression. The difference between these two classes of mutants appears to be the position of a microRNA encoded by iab-8 ncRNA, miR-iab-8. This suggested that miR-iab-8 was responsible for the repression of abd-A in PS13 CNS. However, mutants with this miRNA deleted did not display the complete derepression phenotype, but rather a very weak derepression of abd-A. This showed that there must be a second, partially redundant function of iab-8 ncRNA, apart from producing miR-iab-8.

To test whether a second miRNA or a small polypeptide encoded by iab-8 ncRNA was responsible for this second function, Gummalla et al. misexpressed iab-8 ncRNA from another locus in PS 8-13. This had no effect on ABD-A expression, suggesting that no other trans-acting factor is encoded by the ncRNA. They then performed some complicated genetic experiments that showed that iab-8 ncRNA acts to repress abd-A in cis. They generated flies that contained a deletion of miR-iab-8 on one chromosome, and a truncated copy of the iab-8 ncRNA on the other. These flies do not produce any of the miRNA, but still produce the ncRNA on one chromosome, and yet abd-A is derepressed in PS13 CNS. When flies are generated with one copy of the BX-C deleted, and a deletion of miR-iab-8 on the other chromosome, abd-A is not derepressed.

The iab-8 ncRNA therefore acts to repress abd-A expression in the CNS of PS13 through two different mechanisms: a trans-acting miRNA, and a cis-acting process of transcriptional interference. Although it is possible that this process of cis-repression could act by iab-8 ncRNA recruiting gene silencing machinery that would act by heterochromatin formation or DNA methylation, the authors suggest that it is more likely that iab-8 ncRNA acts by somehow interfering with the abd-A promoter. This leads them to suggest that if this method of gene regulation was widely used within Hox clusters it could explain the link between posterior prevalence and co-linearity. In this case expression of a more anterior gene is blocked in posterior segments by a more ‘posterior’ transcript. Similarly an upstream ncRNA acts to repress Ubx (Petruk et al. 2006). This method of transcriptional interference by readthrough of more posterior genes or by upstream ncRNAs would fix the arrangement of Hox genes in an ancestral cluster, and hence the co-linearity that is observed today.

Gummalla, M., Maeda, R., Castro Alvarez, J., Gyurkovics, H., Singari, S., Edwards, K., Karch, F., & Bender, W. (2012). abd-A Regulation by the iab-8 Noncoding RNA PLoS Genetics, 8 (5) DOI: 10.1371/journal.pgen.1002720


Patterns of RNA methylation

A new paper in Cell provides a transcriptome-wide survey of the methylation of adenosine residues in RNAs. Meyer et al. find that this epitranscriptomic post-transcriptional modification is widespread and dynamically regulated, and likely to play important roles in cellular regulation.

Methylation of the N6 position of adenosine residues (m6A) has been known to be a post-transcriptional modification of RNAs for many years. Research in the 1960s and 70s demonstrated that m6A is present in tRNAs, rRNAs and viral RNAs, and made up between 0.1% and 0.4% of total adenosines in cellular RNA. However, as m6A was not easily detectable by commonly available methods, research on this modified base foundered. A recent spur to experimentation on m6A has come from the analysis of a gene linked to obesity. FTO (fat mass and obesity associated) is a major regulator of metabolism and energy utilisation. It appears that the major catalytic function of FTO is the demethylation of N6-methyladenosine (m6A), suggesting that m6A has important physiological roles in humans and other mammals.

As m6A is not detectable by sequencing or hybridisation based techniques, nor susceptible to chemical modification, Meyer et al. based their experiments on the use of an anti-m6A antibody (α-m6A). They first showed that m6A was present in RNA from a wide selection of different mouse tissues and cell lines. It was especially enriched in liver, kidney, and brain, and showed a dramatic increase in adult neural tissue as opposed to embryonic. m6A was found to be present in RNAs of all sizes, and was enriched in the polyadenylated fraction (i.e. mRNAs), but not present in the poly(A) tails themselves.

To look in more detail at the distribution of m6A throughout the transcriptome, Meyer et al. developed a high throughput technique called MeRIP-Seq. Cellular RNA is fragmented into ~100nt fragments, and then m6A-containing fragments are immunoprecipitated using α-m6A. The RNA fragments are then deep sequenced. m6A residues should be detected on multiple RNA fragment sequence reads, allowing the detection of m6A peaks that can be assigned to their approximate position on RNA molecules. Using adult mouse brain RNA in multiple MeRIP-Seq experiments, Meyer et al. identified 41,072 distinct peaks in the RNAs of 8,843 genes. However, they used a smaller, highly reproducible subset of 13,471 peaks in 4,654 genes for their further analyses.
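The idea that overlapping fragments pile up into a peak can be illustrated with a toy calculation. The following is only a minimal sketch and not the authors' pipeline: the comparison against a non-immunoprecipitated "input" sample, the fold-enrichment threshold, the pseudocount, and the function names are all assumptions made for illustration.

```python
def coverage(fragments, length):
    """Per-nucleotide read depth from (start, end) fragments, end exclusive."""
    depth = [0] * length
    for start, end in fragments:
        for pos in range(max(start, 0), min(end, length)):
            depth[pos] += 1
    return depth


def call_peaks(ip_depth, input_depth, min_fold=4.0, pseudocount=1.0):
    """Return (start, end) intervals where IP depth exceeds input by min_fold.

    min_fold and pseudocount are illustrative values, not taken from the paper.
    """
    peaks = []
    start = None
    for pos, (ip, ctrl) in enumerate(zip(ip_depth, input_depth)):
        enriched = (ip + pseudocount) / (ctrl + pseudocount) >= min_fold
        if enriched and start is None:
            start = pos            # a new enriched stretch begins here
        elif not enriched and start is not None:
            peaks.append((start, pos))
            start = None
    if start is not None:          # enriched stretch runs to the transcript end
        peaks.append((start, len(ip_depth)))
    return peaks


# Toy transcript of 300 nt: seven ~100 nt IP fragments overlap around one site.
ip_fragments = [(100, 200), (120, 220), (140, 240), (90, 190),
                (130, 230), (110, 210), (150, 250)]
input_fragments = [(0, 100), (100, 200), (200, 300)]

ip_depth = coverage(ip_fragments, 300)
input_depth = coverage(input_fragments, 300)
peaks = call_peaks(ip_depth, input_depth)
```

Because every fragment is ~100nt long, a peak called in this way only narrows the modified site down to a window, which is why the peaks give approximate rather than exact positions.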

94.5% of the m6A peaks occurred in mRNAs, but more than 3% were found within long non-coding RNAs, showing that ncRNAs are also targets for adenosine methylation. mRNAs from a wide variety of genes were found to contain methylated adenosines, including many involved in cellular regulation, and genes linked to neurodevelopmental and neurological disorders.

The largest proportion of m6A containing mRNAs exhibited a single m6A peak (46%) (equating to either a single m6A residue or a cluster of adjacent m6As), whilst 48.5% contained two or three peaks. However, mRNAs can contain more than 15 peaks along their lengths. Although MeRIP-Seq doesn’t allow one to say exactly which adenosines are methylated, it does give one a good idea of their positions on RNAs. m6A levels are low in the 5’ ends of mRNAs. They increase steadily throughout the coding sequence, peak in the vicinity of the stop codon, remain high in the first portion of the 3’ UTR and then rapidly decline. This linkage between the region of the translational stop codon and m6A is the most important finding of the paper.

Meyer et al. went on to show that regions of m6A occurrence are more likely to be conserved in vertebrates. They also found a correlation between m6A in 3’UTRs and the presence of microRNA binding sites.

Adenosine methylation has therefore been shown to be a widespread and dynamically regulated post-transcriptional modification of mRNAs and lncRNAs in mammals. Its functional significance, however, is still difficult to gauge. So far, the pathways responsible for adenosine methylation of RNAs are not characterised. It is also unclear whether FTO is the primary enzyme responsible for adenosine demethylation. FTO knockout mice survive, but display postnatal growth retardation and decreased locomotor activity. The linkages between m6A, stop codons and miRNA binding await mechanistic study, but are suggestive of important regulatory roles for RNA methylation. With MeRIP-Seq, Meyer et al. have invented a useful technique for the analysis of this important modification.

Meyer, K., Saletore, Y., Zumbo, P., Elemento, O., Mason, C., & Jaffrey, S. (2012). Comprehensive Analysis of mRNA Methylation Reveals Enrichment in 3′ UTRs and near Stop Codons Cell DOI: 10.1016/j.cell.2012.05.003

A follow-up to this post on 5-methylcytosine in RNAs: Patterns of RNA methylation 2

A Ribosome Code?

The ribosome, a universally conserved molecular machine that catalyses protein synthesis, has generally been considered to act constitutively. That is to say, that ribosomes act to translate mRNAs in the same way across all cells and developmental stages. Regulatory control of translation is predominantly exerted by the action of translation initiation factors, which guide the association of the ribosome with target mRNAs. The eukaryotic ribosome is composed of 4 RNA molecules and 79 different ribosomal proteins (RPs). A paper published last year by Kondrashov et al. has shown one RP (RPL38) specifically regulates the expression of a subset of mRNAs during embryonic development in the mouse. Together with findings from human genetic diseases and from other organisms, this data is suggestive of a ‘ribosomal code’ regulating translation.

Kondrashov et al. set out to discover which gene was responsible for causing the morphological defects found in a spontaneous mouse mutant, tail-short (Ts). These mice display skeletal patterning defects, including homeotic transformations (i.e. the conversion of a tissue’s identity to that of a different tissue; in this case changes between the segmental identities of vertebrae and ribs). They also display eye and craniofacial defects, short and kinky tails, and wavy neural tubes. These phenotypes are only found in heterozygous mice (Ts/+); homozygotes die at implantation stages. By positional cloning, Kondrashov et al. found that the gene responsible for Ts was Rpl38.

Ts/+ mice display skeletal defects and transformations along the entire length of the anterior-posterior body axis. The key regulators of morphological identity along the A-P axis are Hox genes. Hox genes encode homeodomain-containing transcription factors, and are found in four genomic clusters in vertebrates. Loss of function mutations in, or misexpression of, Hox genes generally leads to homeotic transformations (most shockingly seen in the Drosophila mutants Antennapedia and Ultrabithorax). Kondrashov et al. therefore examined the expression of Hox gene transcripts in Ts/+ mouse embryos. Surprisingly, they found no changes in the levels or expression domains of the Hox genes.

Schematic representation of the axial skeleton of WT and Ts/+ mice. Defects are explained by the effects of corresponding Hox gene mutants.

The researchers then asked whether changes in translational control of Hox genes were responsible for the Ts/+ phenotypes. Using various techniques they showed that there were no changes in global protein synthesis. However, by using quantitative PCR on mRNAs that were purified with active ribosomes, they identified a subset of Hox genes that were translationally deregulated in Ts/+ embryos (Hoxa4; a5; a9; a11; b3; b13; c8; d11). These findings were confirmed by observing protein levels for HOXA5, A11, and B13 in the Ts/+ mouse embryos. The majority of the Ts/+ axial skeleton phenotypes could be accounted for by the known effects of loss of function mutations in the Hox genes that were translationally deregulated.

It therefore appears that RPL38 is exerting a specialised control on the translation of specific Hox genes. In further experiments Kondrashov et al. find that RPL38 is likely facilitating the formation of the 80S (complete) ribosomal complex on specific mRNAs (the ribosome is made up of two subunits; the 40S subunit associates with the 5’UTR of the target mRNA first and is then joined by the 60S subunit to make a translationally competent ribosome). An important question is whether RPL38 exerts its function as part of the ribosome, or whether it has extra-ribosomal roles as well. By separating ribosomal from ribosome-free cytosolic fractions, Kondrashov et al. found that RPL38 was only ever found in the ribosome.

Ribosomal proteins have generally been considered as ubiquitously expressed cellular ‘housekeeping’ proteins. However, when the researchers examined Rpl38 expression, they found that transcripts were enriched in specific tissues. For instance, embryonic tissues that give rise to facial structures, as well as the neural retina, showed high levels of Rpl38 expression, correlating with the craniofacial and eye defects in Ts/+ mice. Likewise, Rpl38 was strongly expressed in the somites and the neural tube, the embryonic tissues giving rise to the vertebrae and the spinal cord respectively. Kondrashov et al. went on to examine the expression of 72 different ribosomal proteins in 14 different tissue and cell types. They found a large amount of heterogeneity in RP expression, suggesting that many have specialised, tissue specific roles.

A few obvious outstanding questions for future studies should be noted. Does RPL38 bind cis-regulatory sequence or structure elements within target mRNAs, and if so, what are they? Do trans-acting factors also play a role? Other developmental questions also stand out. Hox genes are not involved in eye development, and it also seems unlikely that the Hox genes implicated in the trunk segmental effects are also responsible for the craniofacial defects. What other RPL38 mRNA targets are responsible for these phenotypes?

These experiments have therefore shown that RPL38 has transcript-specific roles in the control of translation, and that many RPs display heterogeneous expression patterns rather than the previously assumed ubiquity. Together these findings suggest that RPs are imparting a new level of specificity in the control of gene expression. They fit into a broader array of observations that hint at the existence of a ‘ribosome code’ in which alterations in the composition of ribosomes lead to their translational specialisation towards subsets of mRNAs. Diamond-Blackfan Anaemia is a human genetic disease caused by mutations in a number of ribosomal proteins. Patients display limb defects, cleft palates, growth failures and cancer predisposition. Likewise knockdown of multiple distinct RPs in zebrafish leads to a wide range of developmental defects and a high incidence of cancer. A possible explanation for these types of finding is that highly proliferating tissues may be more sensitive to differences in the rate of protein synthesis. Hence, indirect effects on cell proliferation and apoptosis may lead to the morphological abnormalities. However, Kondrashov et al. have shown that in Ts/+ mice overall protein synthesis is not affected, and that the effects on a subset of developmental patterning genes are responsible for the bulk of the phenotypes.

Ribosomal RNAs and proteins are also targets for extensive chemical modifications such as phosphorylation and methylation, most of which are as yet uncharacterised. Interestingly, another human genetic disease, X-linked Dyskeratosis Congenita, is probably caused by failures of rRNA modifications. By analogy with the levels of complexity seen with regard to modifications and combinations of chromatin-associated histones, a ‘ribosome code’ imparting translational specificity by heterogeneity of RPs and modifications has the potential to be a hugely important level of regulatory control.

Kondrashov, N., Pusic, A., Stumpf, C., Shimizu, K., Hsieh, A., Xue, S., Ishijima, J., Shiroishi, T., & Barna, M. (2011). Ribosome-Mediated Specificity in Hox mRNA Translation and Vertebrate Tissue Patterning Cell, 145 (3), 383-397 DOI: 10.1016/j.cell.2011.03.028

Topisirovic, I., & Sonenberg, N. (2011). Translational Control by the Eukaryotic Ribosome Cell, 145 (3), 333-334 DOI: 10.1016/j.cell.2011.04.006

piRNAs in the brain: epigenetics and memory

An exciting new paper in Cell links Piwi-interacting RNAs (piRNAs) to long-term memory via the epigenetic regulation of gene expression by DNA methylation.

Two different novel findings are especially important. Firstly, piRNAs had been thought to be a germline specific mode of genetic control, specifically a type of genetic immunity against the mobilisation of transposable elements. Rajasethupathy et al. demonstrate that in the sea hare, Aplysia, piRNAs are expressed in the CNS and other somatic tissues. Secondly, this paper demonstrates specific dynamic de novo methylation of the promoter of a gene regulating neuronal plasticity in response to neurotransmitter-mediated excitation. This provides an epigenetic mechanism by which memories can persist by molecular encoding.

To investigate microRNAs expressed in the Aplysia CNS, Rajasethupathy et al. had constructed a small RNA library. Surprisingly, they found that ~20% of the sequence reads from this library were ~28nt long, and showed a bias towards having a 5′ uridine residue. This fitted with them being Piwi-interacting RNAs (piRNAs) rather than microRNAs. When they were mapped to the Aplysia genome, it was clear that the piRNAs were generated from piRNA clusters (see previous introductory post). After constructing more libraries and deep sequencing, Rajasethupathy et al. identified 372 distinct Aplysia piRNA clusters. Certain piRNAs are found far more commonly than surrounding piRNAs from the same cluster, indicating that an amplification process is occurring during piRNA biogenesis. Although overall piRNA content was highest in the ovotestes (i.e. the germline and associated somatic tissues), various piRNAs were found to be enriched in the CNS, as well as other somatic tissues analysed.
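The two features that flagged these reads as piRNAs, their length and their 5′ nucleotide, are simple to tabulate from a small RNA library. The sketch below is illustrative only: the 26-30nt window used to select piRNA-sized reads and the toy reads are assumptions, not values or data from the paper.

```python
from collections import Counter


def length_distribution(reads):
    """Fraction of reads at each length, keyed by length in nucleotides."""
    total = len(reads)
    counts = Counter(len(read) for read in reads)
    return {length: count / total for length, count in sorted(counts.items())}


def five_prime_u_fraction(reads, min_len=26, max_len=30):
    """Fraction of reads within the size window that start with U (or T).

    The 26-30 nt window is an assumed, illustrative definition of
    'piRNA-sized'; sequencing reads usually report U as T.
    """
    selected = [read for read in reads if min_len <= len(read) <= max_len]
    if not selected:
        return 0.0
    with_u = sum(1 for read in selected if read[0].upper() in ("U", "T"))
    return with_u / len(selected)


# Toy library: four 28 nt reads (three starting with U) and six 22 nt reads.
reads = ["U" + "A" * 27] * 3 + ["G" + "A" * 27] + ["U" + "A" * 21] * 6

distribution = length_distribution(reads)
u_bias = five_prime_u_fraction(reads)
```

A second population of longer reads in the length distribution, combined with a 5′ U fraction well above the ~25% expected by chance, is the kind of signature described above.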

Consistent with the presence of piRNAs in the CNS, Rajasethupathy et al. were able to clone a full-length cDNA for Piwi protein from Aplysia CNS. Using an antibody against Piwi, they were then able to co-immunoprecipitate piRNAs with Piwi protein from the CNS. By separating cell nuclei from cytoplasm and then western and northern blotting against Piwi and piRNAs respectively, they showed that both were primarily found in nuclei.

To briefly comment on the piRNA aspect of this paper, a number of outstanding questions arise. In the best characterised model systems, piRNA amplification occurs by the ‘ping-pong’ mechanism, in which reciprocal recognition and cleavage reactions between sense and antisense piRNAs complexed with the (cytoplasmic) Piwi-related proteins Aubergine and AGO3 (Drosophila terminology) lead to selective amplification of piRNAs with a tell-tale 10nt offset. Neither of these proteins, nor the 10nt offset, nor the ratio between sense and antisense piRNAs are mentioned by Rajasethupathy et al. I imagine that these questions would’ve been looked into and mentioned if found, therefore it seems that the mechanism of piRNA amplification in the Aplysia CNS is potentially novel.

To explore possible functions of piRNAs in the Aplysia CNS, the researchers used a co-culture system that monitors changes in synaptic plasticity in response to stimulation by the neurotransmitter serotonin (5HT). In this assay, two sensory neurons synapse with a motor neuron. Changes in the strength of one sensory-motor synapse are monitored by electrophysiological recording from the motor neuron. This system measures long-term facilitation (LTF), i.e. changes in synaptic strength in response to 5HT stimulation. LTF is considered to be a memory-related phenomenon; however, it is contentious just how well it serves as a paradigm for long-term memory.

Knockdown of Piwi (and hence of complexed piRNAs), by the injection of antisense oligonucleotides into one of the sensory neurons, significantly impaired LTF, whilst overexpression of Piwi enhanced it. To investigate how these Piwi effects on LTF were mediated, Rajasethupathy et al. looked at the expression of proteins known to be regulating synaptic plasticity in response to Piwi knockdown. Only one of the assayed proteins was responsive: CREB2, a transcriptional repressor, known to be a major inhibitory constraint on LTF, was upregulated in response to Piwi knockdown. Interestingly, an even greater increase in CREB2 mRNA was observed.

The fact that Piwi knockdown led to an increase in CREB2 mRNA, together with the nuclear localisation of Piwi, suggested that rather than acting post-transcriptionally (i.e. by degrading mRNAs as in Drosophila), Piwi/piRNA complexes were inhibiting CREB2 gene expression at the DNA level. It is known that in mice Piwi/piRNA complexes act to silence transcription by facilitating DNA methylation. Rajasethupathy et al. therefore asked whether CREB2 regulation by Piwi occurred via DNA methylation.

The enzyme responsible for methylation of cytosine residues in CG dinucleotides, DNA methyltransferase (DNMT), was known to be expressed in the Aplysia CNS. Inhibition of DNMT activity (using a pharmacological reagent) led to a strong increase in the level of CREB2. In the normal LTF experiments, CREB2 levels are reduced 12 hours after exposure to 5HT and remain low until 48hrs. This downregulation of CREB2 was abolished when DNMT activity was inhibited, as was 5HT-dependent LTF. This led the researchers to search for CpG islands in the promoter region of CREB2 that could be sites for DNA methylation mediated transcriptional control. Indeed, they identified a CpG island in the CREB2 promoter region that normally exists in both methylated and unmethylated states. After 5HT exposure, this CpG island is almost entirely methylated, whilst in the presence of the DNMT inhibitor it becomes almost entirely unmethylated. This 5HT-dependent methylation of the CREB2 CpG island requires Piwi, as it was abolished when Piwi was inhibited. The authors then went on to search for candidate piRNAs that could be responsible for mediating this effect, by searching for those with complementarity to the CREB2 promoter. They identified four, one of which, aca-piR-F, caused an increase in CREB2 expression when knocked down. Notably, Rajasethupathy et al. did not demonstrate the expected result that aca-piR-F knockdown would lead to demethylation of the CREB2 CpG island, although this experiment was surely attempted.
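For readers unfamiliar with how a CpG island is recognised in a promoter sequence, here is a minimal sliding-window sketch. It uses the commonly cited Gardiner-Garden and Frommer criteria (a window of at least 200bp with GC content above 50% and an observed/expected CpG ratio above 0.6). These thresholds were devised for vertebrate genomes, and whether the authors used them for Aplysia is not stated in this post, so treat them, and the function names, as assumptions.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: c * g / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def find_cpg_windows(seq, window=200, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Start positions of windows meeting the assumed CpG island criteria."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, obs_exp = cpg_stats(seq[start:start + window])
        if gc > min_gc and obs_exp > min_obs_exp:
            hits.append(start)
    return hits


# Toy promoter: AT-rich flanks around a 200 bp run of CG dinucleotides.
toy_promoter = "AT" * 100 + "CG" * 100 + "AT" * 100
island_windows = find_cpg_windows(toy_promoter)
```

Overlapping windows that pass the thresholds would normally be merged into a single island; that step is omitted here for brevity.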

In conclusion, this paper offers a broad outline for a mechanism of memory encoding; it connects neurotransmitter synaptic stimulation with the stable epigenetic marking of the transcription state of an important regulator of neuronal plasticity, via the action of Piwi/piRNA complexes. It should be noted that ‘epigenetic’ is used in this context in a loose definition with reference only to a stable marking of cellular state. Strictly ‘epigenetic’ should refer to heritable non-genetic changes, but as neurons do not divide that is inapplicable. In this case DNA methylation is a relatively long-lasting mark. However, the change in CREB2 expression with respect to 5HT-stimulated long term facilitation only lasts a couple of days. Does this correspond to the level of methylation in the promoter? If so, one gets an impression that DNA methylation and demethylation are highly dynamic processes in the Aplysia CNS. Currently there are three modes proposed to resolve the difficult question of how memories can persist for a long time, whilst the cellular components that must mediate them have a high rate of turnover: prion-like synaptic marks, autoregulatory loops that can maintain a cell state whilst their components come and go, and epigenetic mechanisms that can alter gene expression in a long term manner. This paper shows a clear example of the last of these modes, but the apparent dynamism of DNA methylation in this system suggests a lack of permanence.

Although I like the way this paper has ranged over a large terrain and connected so many disparate elements, by necessity it raises many questions and leaves many aspects of the work unmentioned. I’ve already mentioned some questions about Aplysia piRNAs; no doubt a fully annotated Aplysia genome will answer some of them. A few other questions and future directions spring to mind. The authors haven’t quite shown that DNA methylation is responsible for the transcriptional silencing at the CREB2 promoter, only correlated it. Likewise the mode by which Piwi/piRNA complexes act to promote DNA methylation is unclear. A wider question is the nature of DNA methylation in Aplysia and other invertebrates. Some invertebrates show virtually no DNA methylation (C. elegans, Drosophila) whilst the majority display mosaic patterns quite different from those found in vertebrates. This suggests functional differences, and without deeper knowledge of the role of DNA methylation in Aplysia it is difficult to guess how widely applicable these findings are in other systems. Likewise the finding that piRNAs are acting at the level of DNA methylation, previously only found in mammals, raises questions about the state of affairs in other invertebrate model systems. Also, do Aplysia piRNAs only act on DNA methylation, or post-transcriptionally as well? Future studies will no doubt also look at how this type of regulation corresponds to histone marks, and try to synthesise the different levels of regulation. Perhaps the most important take home message is that piRNAs are more than a germline specific immunity against transposons. Just how widespread these other roles are is an open question.

Rajasethupathy, P., Antonov, I., Sheridan, R., Frey, S., Sander, C., Tuschl, T., & Kandel, E. (2012). A Role for Neuronal piRNAs in the Epigenetic Control of Memory-Related Synaptic Plasticity Cell, 149 (3), 693-707 DOI: 10.1016/j.cell.2012.02.057


On Ribosomal Pausing

A new paper in Nature describes how Shine-Dalgarno-like features in protein coding sequences lead to bacterial ribosomes pausing during translation. Selection against ribosomal pausing leads to biases in codon usage and coding sequence evolution. Translational pausing represents a new level of regulatory control of gene expression.

Translation, the process by which the nucleotide sequence of mRNA transcripts is decoded and converted into amino acid sequence during protein synthesis, is carried out by ribosomes. Within the ribosome, transfer RNA molecules recognise specific trinucleotide codons on the mRNA, and add their cognate amino acids to nascent protein chains. In bacteria and archaea, ribosomes often recognise the translation start site with the help of a sequence 8 to 12 nucleotides upstream of it – the Shine-Dalgarno sequence (SD). It’s been known since the 1980s that different mRNAs are translated at different rates. The main reason for these differences was thought to be the concentration of rarer varieties of tRNA limiting the rate at which some transcripts could be decoded.

Li et al. have used a new technique, ribosome profiling, that maps ribosome occupancy along mRNAs. This has yielded high-resolution views of local translation rates on the entire protein coding transcriptome of E. coli and Bacillus subtilis.  Briefly put, mRNA fragments that have been protected from nuclease digestion by ribosomal binding, are ‘deep sequenced’. The concentration of these ribosome footprints equates to the ribosome occupancy throughout the transcriptome. The local translation rate is inversely related to ribosome occupancy.

Using this technique, Li et al. found many sites where ribosomal density is ten-fold or more above the mean. They sought to correlate these translational pauses with specific codons. However, there was little link between occupancy of specific codons and the abundance of their corresponding tRNAs. Therefore, the concentration of rare tRNAs is not responsible for much translational pausing under normal growth conditions.
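To make the link between footprint counts, occupancy and pausing concrete, here is a toy calculation. It is a sketch under stated assumptions rather than the published analysis: each footprint is reduced to a single assigned position, occupancy is normalised to the mean of the gene, and the ten-fold cut-off is taken from the description above. The function names are hypothetical.

```python
def normalised_occupancy(footprint_positions, length):
    """Footprint count at each position divided by the mean count of the gene."""
    counts = [0] * length
    for pos in footprint_positions:
        if 0 <= pos < length:
            counts[pos] += 1
    mean = sum(counts) / length
    if mean == 0:
        return [0.0] * length
    return [count / mean for count in counts]


def pause_sites(occupancy, threshold=10.0):
    """Positions whose normalised occupancy is at least `threshold`."""
    return [pos for pos, value in enumerate(occupancy) if value >= threshold]


# Toy gene of 20 positions: one footprint everywhere, plus 30 extra at position 7.
footprints = list(range(20)) + [7] * 30

occupancy = normalised_occupancy(footprints, 20)
pauses = pause_sites(occupancy)
```

Since the local translation rate is inversely related to occupancy, position 7 in this toy gene would be translated far more slowly than its neighbours.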

To try to find sequence features that were linked to ribosomal pausing, the researchers then tried to correlate any trinucleotide sequences (independently of reading frame) with ribosome occupancy. They found that 6 different trinucleotide sequences, with features resembling Shine-Dalgarno sequences, did correlate with the position of paused ribosomes. This correlation was not found in the yeast Saccharomyces cerevisiae, in agreement with eukaryotic ribosomes not using SD/anti-SD interactions.
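A frame-independent association between short sequence motifs and occupancy can be sketched by averaging occupancy over every occurrence of each trinucleotide. This is an illustration of the general idea only; the `offset` parameter (the distance between the motif and the position at which occupancy is read) and the toy data are assumptions, and the statistics used in the paper are not reproduced here.

```python
from collections import defaultdict


def mean_occupancy_by_kmer(seq, occupancy, k=3, offset=0):
    """Average occupancy at `offset` from the start of every k-mer occurrence.

    Every position is scanned, so the result is independent of reading frame.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(seq) - k + 1):
        j = i + offset
        if 0 <= j < len(occupancy):
            kmer = seq[i:i + k]
            sums[kmer] += occupancy[j]
            counts[kmer] += 1
    return {kmer: sums[kmer] / counts[kmer] for kmer in sums}


# Toy message in which ribosomes pile up wherever GGA begins.
toy_seq = "AAAGGAAAAGGAAAA"
toy_occupancy = [1.0] * len(toy_seq)
toy_occupancy[3] = 9.0    # first GGA starts at index 3
toy_occupancy[9] = 11.0   # second GGA starts at index 9

by_kmer = mean_occupancy_by_kmer(toy_seq, toy_occupancy)
top_kmer = max(by_kmer, key=by_kmer.get)
```

In real data the ribosome sits some distance downstream of the SD-like motif, so a non-zero `offset` would be needed to see the strongest signal.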

Li et al. went on to show definitively that internal SD-like sequences are linked to translational pausing, by introducing a mutation into one such site and showing that ribosome occupancy was reduced. They also showed that peaks of ribosome occupancy were caused by translational pausing, rather than attempted internal translational initiation.

As translational pausing limits the amount of free ribosomes, widespread internal SD-like sequences would limit the rate of protein synthesis, and hence the potential bacterial growth rate. In line with this, SD-like sequences in protein coding genes are disfavoured. Selection pressure against SD-like sequences is therefore a major factor in determining codon usage, and more especially the usage of codon pairs (SD sequences are 6/7 nt long).

Interestingly, the authors found that patterns of ribosome occupancy were conserved between orthologous genes in E. coli and B. subtilis. This reflects two different factors: firstly, coding is obviously constrained by the protein’s functionality, but secondly it’s suggestive of translational pausing being exploited for functional purposes. Li et al. suggest a number of different ways in which ribosomal pausing can regulate gene expression. It’s known that internal SD-like sequences can promote regulated shifting of reading frame. Ribosome pausing may also modulate folding of nascent protein chains. Lastly, as transcription and translation are closely coupled in bacteria, ribosome occupancy may inhibit the formation of stem-loop structures that prevent transcriptional termination. It will be exciting to work out the extents to which these potential regulatory systems are active. Eukaryotic ribosomes do not use recognition of SD sequences, instead using the 5’ mRNA cap and the less well defined Kozak sequence for translational initiation. Does ribosome pausing occur in eukaryotes, and does it have functional significance?

Li, G., Oh, E., & Weissman, J. (2012). The anti-Shine–Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature, 484 (7395), 538-541 DOI: 10.1038/nature10965

microDNAs: small mammalian extrachromosomal circular DNAs

A new paper in Science reports the detection of a new species of DNA in mammalian cells: microDNA. microDNAs are extrachromosomal circular DNA molecules, generally 200-400bp long, derived from non-repetitive genomic sequence. microDNAs appear to arise from microdeletions occurring in the 5’ ends of genes. These data imply widespread genetic variation with respect to microdeletions between somatic cells in mammals.

To identify sites of intramolecular homologous recombination that could lead to genetic mosaicism in mammalian brains, Shibata et al. searched for the extrachromosomal circular DNAs (eccDNA) that could be produced. DNA was purified from embryonic mouse brains and linear DNA was degraded with a specific exonuclease. The remaining fraction was then amplified with an unbiased non-PCR technique (multiple displacement amplification). The linear products were then sheared into 500bp fragments, cloned, and sequenced. The majority of the clones in the library included repeated sequences, consistent with the products of rolling circle amplification of small circular DNAs. When these repeated sequences were searched against the mouse genome, they were only found once, showing that they were not produced by repetitive sequence (e.g. transposable elements). To prove that these sequences were indeed derived from circular DNA molecules, PCR using outward directed primers designed from the repeated sequences was performed on both extrachromosomal and chromosomal DNA. If the template DNA was circular, PCR amplification should occur; if it was linear, it shouldn’t. This was (generally) the case, proving the existence of a population of extrachromosomal circular DNAs, a few hundred base pairs long, derived from unique portions of the chromosomal genome.
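Rolling circle amplification copies a small circle over and over into one long linear molecule, so a sequenced clone should read as the same unit repeated head to tail. A minimal Python sketch of how such a repeat could be spotted in a read is given below. It assumes error-free sequence, which real reads are not, and the function name is hypothetical; it is not the pipeline used in the paper.

```python
def repeat_unit_length(read, min_copies=2):
    """Return the length of the shortest unit that tiles the whole read.

    The read must contain at least `min_copies` copies of the unit (the last
    copy may be partial). Returns None if no such tandem repeat exists.
    """
    n = len(read)
    for period in range(1, n // min_copies + 1):
        # A read with this period is identical to itself shifted by `period`.
        if all(read[i] == read[i + period] for i in range(n - period)):
            return period
    return None


# Toy read (invented): a 5-nt "circle" ACGTT copied a little over twice.
unit_len = repeat_unit_length("ACGTTACGTTAC")
no_repeat = repeat_unit_length("ACGTACGA")
```

The repeat unit recovered this way is some rotation of the circle's sequence, since the read can start anywhere on the circle. Its length gives an estimate of the size of the original circular DNA.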

To further explore the nature and extent of this population of DNA molecules, Shibata et al. went on to purify eccDNA from a range of embryonic and adult mouse tissues, and from mouse and human cancer cell lines. After amplification and sequencing of the ends of the generated fragments, they found that tens of thousands of unique genomic sequences yield extrachromosomal circular DNAs. The eccDNA from mouse tissues ranged from 80 to 2000bp in length, but most were between 200 and 400bp. Lengths of ~200bp and ~400bp were enriched in the mouse brain and liver populations. A similar pattern was detected in human cancer cell lines, but in these eccDNA populations an additional pattern of length distribution peaks at a 150bp periodicity was detected. As in the earlier experiment, the circular DNAs mapped to unique positions in the genome. To differentiate this population of eccDNA from previously reported longer forms derived from repetitive sequence, the authors termed them microDNAs.

Electron micrograph showing double-stranded (left) and single-stranded (right) microDNAs.

The researchers went on to directly visualise microDNA molecules by electron microscopy. Using a technique that specifically labels single-stranded DNA, they discovered that both double-stranded and single-stranded microDNAs were present in approximately equal measure.

Bioinformatic analysis of the sources of microDNAs revealed high enrichment for 5’ UTRs, exons, and CpG islands (regions upstream of genes where cytosine residues in CG dinucleotides are not methylated), suggesting that microDNAs are commonly derived from the 5’ ends of genes. microDNAs also have a higher percentage GC content than the average for the genome (55% as opposed to 45%). In a relatively high proportion of microDNAs, the researchers detected short direct repeats of 2-15bp of microhomology at the starts and ends of the molecules.
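The two sequence features mentioned here, GC content and microhomology, are simple to compute. The Python sketch below shows one way to do it. It assumes a microhomology is a direct repeat shared between the start of the excised segment and the genomic sequence immediately after its end, which is one common way of defining it but may not match the authors' exact procedure. The function names, coordinates, and toy sequence are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def microhomology_length(genome, start, end, max_len=15):
    """Length of the direct repeat at the edges of the segment genome[start:end].

    Compares bases at the start of the segment with the bases that follow
    directly after its end, stopping at the first mismatch or at max_len.
    """
    k = 0
    while (k < max_len
           and start + k < end
           and end + k < len(genome)
           and genome[start + k] == genome[end + k]):
        k += 1
    return k


# Toy locus (invented): the segment GCAAACCC is followed by GCA...,
# so its first three bases are repeated just past its end.
toy_genome = "TTTT" + "GCAAACCC" + "GCATTT"
segment_gc = gc_fraction(toy_genome[4:12])
homology = microhomology_length(toy_genome, 4, 12)
```

With a direct repeat like this, the same junction sequence is produced whichever copy of the repeat is used as the breakpoint, which is why microhomology shows up at both the circle and the matching deletion.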

microDNAs could potentially be created by excision from chromosomal DNA, by replication of short stretches of DNA, or by reverse transcription of RNA molecules. Shibata et al. selected two genomic loci that yielded microDNAs and found that microdeletions do occur in these regions in some cells. The lengths and GC content of the microdeletions that they identified were in line with those found in microDNAs. The majority of the microdeletions displayed short stretches of microhomology at their excised ends.

These short direct repeats at the starts and ends of microDNAs, and at their presumptive source microdeletions, suggest two possibilities for microDNA generation. Regions of microhomology could cause the DNA replication process to stall and switch template. Incorrect repair processes would then lead to the release of a microDNA. Alternatively, microhomology-mediated repair processes could lead to the excision of a microDNA by intramolecular homologous recombination. The 150bp length periodicity detected in the cancer cell line microDNAs is suggestive of a link to nucleosomes (in which ~150bp of chromosomal DNA are wrapped around the histone core). A link to the position of nucleosomes (either in tightly bound nucleosomes causing replication problems or in facilitating microDNA circularisation) may explain the enrichment of microDNAs from the 5’ ends of genes. Another suggestion made to explain the origin of ss microDNAs is that they could be formed by displaced Okazaki fragments (the short sections of replicated DNA formed on the lagging strand). All of these ideas are ‘hand wavey stuff’ but exciting avenues for future experiments nonetheless. A couple of obvious counter-arguments to these suggestions would be that microhomologies were only detected in 37% of microDNAs, and that the 150bp periodicity was only found in the cancer cell line microDNAs. A combination of the above putative modes of microDNA generation could be taking place, and microDNAs may be a heterogeneous population of molecules (as the presence of ss and ds DNAs suggests).

Perhaps the most striking conclusion of this paper is that the widespread generation of microDNAs by microdeletions yields large amounts of genetic variation between somatic cells. This mosaicism may well lead to functional differences between cells. What are the implications of this mosaicism? Do microDNAs have any specific functions? Or are they simply a product of defective replication/repair processes? Are microDNAs only found in mammalian cells? Or are they more widespread (the researchers didn’t observe any in yeast cells)? It will be exciting to see future research attempt to answer these questions.

Shibata, Y., Kumar, P., Layer, R., Willcox, S., Gagan, J., Griffith, J., & Dutta, A. (2012). Extrachromosomal MicroDNAs and Chromosomal Microdeletions in Normal Tissues. Science, 336 (6077), 82-86 DOI: 10.1126/science.1213307

Double-strand break interacting RNAs (diRNAs)

A new role for small RNAs in the repair of DNA double-strand breaks has been reported in Cell. Wei et al. have found diRNAs, derived from the vicinity of DNA double-strand breaks, in both Arabidopsis thaliana and human cells.

DNA double-strand breaks (DSBs) are a particularly deleterious form of DNA damage as they can cause chromosomal translocations and induce cell death. To maintain the genome’s integrity, eukaryotic cells employ two different mechanisms of DSB repair. Non-homologous end joining (NHEJ) is an efficient mechanism that rapidly repairs DSBs without requiring a homologous template. However, NHEJ often causes insertions or deletions at the break site. Homologous recombination (HR) is a less error-prone mechanism in which a sister chromatid is used as a template for repair. A specialised form of HR, single-strand annealing (SSA), which doesn’t require a sister chromatid, can take place at repetitive sequences.

Wei et al. used an assay system that monitors DSB repair by SSA in the model plant Arabidopsis thaliana. A genetic cross causes a single DSB in an inactive reporter gene containing a repeat. SSA-mediated repair restores the activity of the reporter gene and allows a quantitative and visible readout of DSB repair events. For instance, when this assay system was introduced (by crossing) into a genetic background mutant for atr (encoding a PI3 kinase known to be involved in DSB response), the researchers observed a large reduction in repair efficiency.

The first clue that small RNAs may be involved in double-strand break repair came when they crossed their DSB repair assay system into lines mutant for Dicer-like proteins (DCL). Dicer and DCLs are responsible for the biogenesis of small RNAs (miRNAs and siRNAs) from double-stranded RNAs. Mutations in three different dcl genes (especially dcl3) all diminished the efficiency of DSB repair. The researchers therefore examined whether small RNAs were produced from sequences adjacent to the DSB site. By probing northern blots with sequence flanking the DSB site, Wei et al. detected a population of small RNAs approximately 21nt in length that were only present when DSBs had been induced. Deep sequencing (direct sequencing of RNA populations) revealed that these DSB-induced small RNAs (diRNAs) were specifically produced from the sequences flanking the DSB (approximately 800bp in each direction) and derived from both the sense and antisense strands in equal measure. By using a similar assay that monitored DSB repair by HR, they showed that diRNAs were also produced in this system.

In plants, a well characterised small RNA system mediates heterochromatic silencing of repetitive sequences by DNA methylation. Wei et al. used this pathway as a model to dissect the diRNA system. In the heterochromatic-siRNA system, single-stranded RNA transcripts generated by the DNA-dependent RNA polymerase IV (Pol IV) are converted to dsRNAs by the action of the RNA-dependent RNA polymerase 2 (RDR2). The dsRNAs are then cleaved into hc-siRNAs by Dicer-like proteins. When complexed with the Argonaute protein AGO4, hc-siRNAs direct de novo DNA methylation. By using the DSB assay system in backgrounds mutant for these factors and deep sequencing, Wei et al. found that diRNA production requires the activity of Pol IV, RDR2, RDR6 and DCLs, and that this pathway is under the control of the DSB-responsive kinase ATR. However, diRNA-mediated DSB repair does not involve effector components of the RNA-directed DNA methylation pathway such as AGO4. Instead, a different Argonaute protein, AGO2, was found to complex with diRNAs. Both diRNA accumulation and DSB repair were compromised in ago2 mutants.

Wei et al. went on to ask whether diRNAs are involved in DSB repair in animals as well as plants. Using a similar HR-mediated DSB repair assay in a human cell line, they showed that small RNAs are also produced close to DSBs. Interestingly, whereas in plants the diRNAs were produced from sequences immediately neighbouring the DSB, in human cells they originated from a broader vicinity around the break site and not immediately adjacent. When Dicer or Ago2 were depleted in human cells, DSB repair was compromised.

This paper has demonstrated the existence of a new class of small RNAs and their involvement in yet another important biological process. However, the details of how diRNAs act in DSB repair are completely unknown as yet. The authors suggest that diRNAs may guide histone modifications around the DSB site that facilitate DNA repair. Alternatively, diRNA-AGO2 complexes may directly target DSB repair complexes to break sites. The assay systems used in this study only tested DSB repair by HR and SSA. It would be interesting to know whether diRNAs are also involved in DSB repair by NHEJ.

Wei, W., Ba, Z., Gao, M., Wu, Y., Ma, Y., Amiard, S., White, C., Danielsen, J., Yang, Y., & Qi, Y. (2012). A Role for Small RNAs in DNA Double-Strand Break Repair. Cell DOI: 10.1016/j.cell.2012.03.002