A new study identifying hundreds of long intervening noncoding RNAs (lincRNAs) in the zebrafish shows that these molecules have important conserved roles in vertebrate development.
Thousands of loci in mammalian genomes produce capped, polyadenylated, and often spliced RNA molecules that are longer than 200 nt yet do not encode proteins. These lincRNAs have been shown to function in a number of cellular processes, including X chromosome inactivation and transcriptional regulation. The roles of the vast majority of identified lincRNAs are, however, unknown.
To identify lincRNAs in the zebrafish, Ulitsky et al. designed a pipeline that integrates several genomic datasets. The first stage defined the boundaries of transcriptional units by combining maps of the genomic locations of the 3′ termini of polyadenylated transcripts with a genome-wide chromatin state map, based on a specific chromatin modification found at gene promoters, that marks 5′ ends. After subtracting any transcription units known to encode proteins or small RNAs, and comparing the remainder with datasets of transcribed sequences, the authors defined 567 lincRNA genes. Because the approach was quite stringent, this number is likely an underestimate of the total, and it is especially biased against lincRNAs with low or highly tissue-restricted expression.
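The logic of such a pipeline can be sketched in a few lines of Python. This is only an illustration of the filtering idea described above, not the authors' actual code: the `Interval` type, the function names, the `max_span` cutoff, the rule of pairing each 3′ end with the nearest upstream promoter mark, and the toy coordinates are all assumptions made for the example. It also assumes that "comparison with datasets of transcribed sequences" means requiring at least one overlapping transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def overlaps(a, b):
    """True if two intervals share at least one base on the same strand."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def build_units(promoter_marks, polya_sites, max_span=100_000):
    """Pair each poly(A) site with the nearest upstream promoter mark.

    promoter_marks: list of (chrom, position) tuples (hypothetical 5' ends)
    polya_sites:    list of (chrom, position, strand) tuples (3' ends)
    max_span:       assumed upper limit on unit length, for illustration only
    """
    units = []
    for chrom, site, strand in polya_sites:
        best = None
        for p_chrom, p_pos in promoter_marks:
            if p_chrom != chrom:
                continue
            # "Upstream" depends on the strand of the transcript.
            dist = site - p_pos if strand == "+" else p_pos - site
            if 0 < dist <= max_span and (best is None or dist < best[0]):
                best = (dist, p_pos)
        if best is not None:
            lo, hi = sorted((best[1], site))
            units.append(Interval(chrom, lo, hi + 1, strand))
    return units


def candidate_lincrnas(units, known_genes, transcript_evidence):
    """Drop units overlapping known genes; keep those with transcript support."""
    keep = []
    for unit in units:
        if any(overlaps(unit, gene) for gene in known_genes):
            continue  # already annotated as protein-coding or small RNA
        if not any(overlaps(unit, t) for t in transcript_evidence):
            continue  # no independent evidence of transcription
        keep.append(unit)
    return keep


# Toy data, invented for the example.
promoter_marks = [("chr1", 1000), ("chr1", 50000), ("chr2", 8000)]
polya_sites = [("chr1", 4000, "+"), ("chr1", 58000, "+"), ("chr2", 5000, "-")]
known_genes = [Interval("chr1", 49000, 60000, "+")]
transcript_evidence = [Interval("chr1", 1200, 3900, "+")]

units = build_units(promoter_marks, polya_sites)
candidates = candidate_lincrnas(units, known_genes, transcript_evidence)
print(candidates)
```

In this toy run, three transcription units are built, one is discarded because it overlaps a known gene, one is discarded for lack of transcript support, and a single candidate remains. Requiring every filter to pass is what makes a pipeline of this kind conservative, which is consistent with the expectation that weakly expressed lincRNAs are missed.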
Within the set of 567 zebrafish lincRNA genes, only 29 showed sequence conservation with mammalian lincRNAs. This sequence homology typically spanned only small portions of the transcripts (308 nt on average, compared with an average lincRNA length of 1,951 nt). However, broader features of lincRNA gene structure, such as the distribution and length of exons and introns, were better conserved. The positional relationships between lincRNA genes and neighbouring genes (synteny) were also well conserved.
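To put these figures in proportion, the short calculation below uses only the numbers quoted above. The variable names are made up for the example, and the second ratio is a rough comparison of two averages rather than a per-transcript measurement.

```python
n_lincrnas = 567          # zebrafish lincRNA genes defined by the pipeline
n_conserved = 29          # those with detectable mammalian lincRNA homology
mean_conserved_nt = 308   # average length of the conserved segment
mean_length_nt = 1951     # average lincRNA length

frac_conserved_genes = n_conserved / n_lincrnas
frac_conserved_length = mean_conserved_nt / mean_length_nt

print(f"Genes with conserved sequence: {frac_conserved_genes:.1%}")         # 5.1%
print(f"Conserved share of an average transcript: {frac_conserved_length:.1%}")  # 15.8%
```

Roughly one gene in twenty shows detectable sequence conservation, and where it exists it covers about a sixth of an average-length transcript.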
Analysis of the expression of a subset of the identified lincRNAs showed that a high proportion displayed tissue-specific embryonic expression patterns, most commonly in the developing central nervous system. To probe the functional significance of lincRNAs further, the researchers used antisense reagents (morpholinos) to interfere with the function of two of the lincRNAs with significant mammalian homology. In both cases, morpholinos that disrupted splicing or targeted the regions of conserved sequence caused developmental defects. These morphant phenotypes could be rescued by coinjection of the properly spliced lincRNA. Importantly, they could also be rescued by injection of the orthologous human or mouse lincRNAs. This showed that the developmental functions of these lincRNAs have been conserved through vertebrate evolution.
One of the most interesting aspects of this paper is the discussion of the potential mechanisms of lincRNA gene evolution. A higher proportion of zebrafish lincRNA genes show sequence homology with mammalian protein coding sequences than with mammalian lincRNA genes. In addition, 8.6% of zebrafish lincRNAs showed sequence similarity with zebrafish protein coding genes. These findings suggest that some lincRNAs originated from protein coding genes (and vice versa). In this scenario, a lincRNA gene can arise either from a pseudogene that has already lost its protein coding function, or from a gene that maintained both protein coding and lincRNA functions before losing its protein coding ability. This raises the possibility that some mRNAs might currently carry out lincRNA-type noncoding functions.
See also: Linking a lincRNA to active chromatin
Ulitsky, I., Shkumatava, A., Jan, C., Sive, H., & Bartel, D. (2011). Conserved Function of lincRNAs in Vertebrate Embryonic Development despite Rapid Sequence Evolution. Cell, 147(7), 1537-1550. DOI: 10.1016/j.cell.2011.11.055