Recent reports have described an intricate interplay among diverse RNA species, including protein-coding messenger RNAs and non-coding RNAs such as long non-coding RNAs, pseudogenes and circular RNAs. These RNA transcripts act as competing endogenous RNAs (ceRNAs) or natural microRNA sponges — they communicate with and co-regulate each other by competing for binding to shared microRNAs, a family of small non-coding RNAs that are important post-transcriptional regulators of gene expression.
Understanding this novel RNA crosstalk will lead to significant insight into gene regulatory networks and have implications in human development and disease. Aside from around 21,000 protein-coding genes (less than 2% of the total genome), the human transcriptome includes about 9,000 small RNAs, about 10,000–32,000 long non-coding RNAs (lncRNAs) and around 11,000 pseudogenes.
Non-coding transcripts can generally be divided into two major classes on the basis of their size: small and long. Small non-coding RNAs have been relatively well characterized, and include transfer RNAs, which are involved in translation of messenger RNAs; microRNAs (miRNAs) and small-interfering RNAs, which are implicated in post-transcriptional RNA silencing; small nuclear RNAs, which are involved in splicing; small nucleolar RNAs, which are implicated in ribosomal RNA modification; PIWI-interacting RNAs, which are involved in transposon repression; and transcription initiation RNAs, promoter upstream transcripts and promoter-associated small RNAs, which may be involved in transcription regulation. By contrast, lncRNAs vary in length from 200 nucleotides to 100 kilobases, and have been implicated in a diverse range of biological processes from pluripotency to immune responses.
Although thousands of lncRNAs have been identified in the past decade, only a small number have been functionally characterized. One of the best-studied and most dramatic examples is XIST, which can recruit chromatin-modifying complexes to inactivate an entire chromosome during dosage compensation.
Genome utilization differs substantially among species (for example, the protein-coding genome constitutes almost the entire genome of unicellular yeast, but only 2% of mammalian genomes). In addition, the non-coding transcriptome is often dysregulated in cancer. Together, these observations suggest that the non-coding transcriptome is of crucial importance in determining the greater complexity of higher eukaryotes and in disease pathogenesis. (Nature 505, 344–352 (16 January 2014) doi:10.1038/nature12986)