In the past few years it has become evident that over 85% of the human genome is processed into RNA, with less than 2% encoding proteins. The expanding compendium of non-coding RNAs identified in transcriptomic studies stands in stark contrast to how few of them have been functionally annotated.
Evolutionarily conserved RNA secondary structures are a robust indicator of purifying selection and, consequently, molecular function. Evaluating their genome-wide occurrence through comparative genomics has consistently been plagued by high false-positive rates and divergent predictions.
This poster presents a novel benchmarking pipeline aimed at calibrating the precision of genome-wide scans for consensus RNA structure prediction. The benchmarking data was used to fine-tune the parameters of an optimized workflow for genomic sliding window screens.
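As a rough illustration of what a genomic sliding window screen involves, the sketch below splits an alignment of a given number of columns into overlapping windows, each of which would then be passed to a consensus structure predictor. The function name, the 200-column window and the 100-column step are hypothetical values chosen for the example; the abstract does not state the parameters actually used.

```python
from typing import Iterator, Tuple


def sliding_windows(length: int, window: int = 200, step: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) column ranges covering `length` alignment columns.

    `window` and `step` are illustrative defaults, not the poster's settings.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if length <= 0:
        return
    start = 0
    while True:
        # Clip the last window so it never runs past the alignment end.
        end = min(start + window, length)
        yield start, end
        if end == length:
            break
        start += step


# Example: a 450-column alignment block scanned with the default settings.
for start, end in sliding_windows(450):
    print(start, end)
```

Overlapping windows (step smaller than window) reduce the chance that a structure is missed because it straddles a window boundary, at the cost of redundant predictions that must be merged afterwards.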
When applied to consistency-based multiple genome alignments of 35 mammals, our approach confidently identifies >4 million evolutionarily constrained RNA structures using a conservative sensitivity threshold that entails historically low false discovery rates for such analyses (5–22%). These predictions comprise 13.6% of the human genome, 88% of which fall outside any known sequence-constrained element, suggesting that a large proportion of the mammalian genome is functional.
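The figures above can be combined in a short back-of-the-envelope calculation. This sketch assumes that the 88% is measured over the same bases as the 13.6% (rather than over prediction counts), and it treats 4 million as a lower bound because the text reports ">4 million"; both are interpretations, not statements from the poster.

```python
# Figures quoted in the abstract.
genome_fraction_predicted = 0.136   # share of the human genome covered by predictions
fraction_outside_known = 0.88       # share of those predictions outside known constrained elements
n_predictions = 4_000_000           # lower bound; the abstract says ">4 million"
fdr_low, fdr_high = 0.05, 0.22      # reported false discovery rate range

# Share of the genome covered by predictions outside known sequence-constrained
# elements, ASSUMING the 88% is a per-base fraction.
novel_genome_fraction = genome_fraction_predicted * fraction_outside_known

# Expected number of true predictions implied by the FDR range.
expected_true_min = n_predictions * (1 - fdr_high)
expected_true_max = n_predictions * (1 - fdr_low)

print(f"{novel_genome_fraction:.1%} of the genome")
print(f"{expected_true_min:,.0f} to {expected_true_max:,.0f} expected true predictions")
```

Under these assumptions, roughly 12% of the genome would carry predicted structures outside known sequence-constrained elements, and at least about 3.1 million predictions would be expected to be true positives even at the upper end of the false discovery range.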
This work provides an extensive set of functional transcriptomic annotations that will assist researchers in uncovering the precise mechanisms underlying regulation of gene expression via ncRNAs.
Authors: Martin A. Smith, Tanja Gesel, Peter F. Stadler, John S. Mattick