Annotating nc-RNAs with Rfam

Rfam is an open access database (hosted at the Wellcome Trust Sanger Institute) containing information on RNA families and annotations for millions of RNA genes. Designed to work in a similar way to the Pfam database of protein families, Rfam uses a similar model for annotation and display and is built on the same principle of open access to the data. Each entry in the Rfam database includes a multiple sequence alignment, a secondary structure and a probabilistic model known as a covariance model (CM); these models can simultaneously handle an RNA sequence and its structure. In conjunction with the Infernal software package, Rfam CMs can be used to search genomes or other DNA sequence databases for homologs of known structural RNA families. You can find more about Rfam at http://rfam.janelia.org/

  1. Annotating nc-RNAs with Rfam
     Luca Cozzuto @ Bioinformatics Core
     http://rfam.sanger.ac.uk/
  2. Functional RNAs
     Non-coding RNA genes encode a functional RNA product rather than a protein.
  3. Functional RNAs
     Non-coding genes encode a functional RNA product rather than a protein.
     Family of functional RNAs:
  4. Functional RNAs
     The majority of functional RNAs fold into stable structures that are essential for their biological activity.
     Figure labels: micro-RNA precursor, U2 spliceosomal RNA, part of a riboswitch, tRNA.
  5. Functional RNAs
     Unlike protein-coding genes, functional RNAs often show no significant sequence similarity but preserve a base-paired secondary structure.
     This makes it very difficult to find those genes by sequence similarity alone (e.g. by using BLAST or FASTA).
     ncRNA_1  AAAAAAGGGGTTTTTT
     ncRNA_2  AAATAAGGGGTTATTT
     Struct   ((((((....))))))
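The toy alignment on slide 5 can be checked mechanically. The sketch below is a minimal illustration and not part of the original slides: it parses the dot-bracket line, tests whether each sequence can form every annotated pair, and reports the sequence identity. The function names and the decision to accept G-T (wobble) pairs are assumptions made for this example.

```python
# Illustrative sketch; the function names are hypothetical, not part of Rfam or Infernal.

# Pairs accepted as complementary. The slide writes the sequences with T,
# so T stands in for U here. Accepting G-T (wobble) is an assumption.
PAIRING = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}


def parse_pairs(structure):
    """Return sorted (i, j) index pairs for matching parentheses in dot-bracket notation."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def fits_structure(seq, structure):
    """True if every annotated pair in the structure can form in this sequence."""
    return all((seq[i], seq[j]) in PAIRING for i, j in parse_pairs(structure))


def percent_identity(a, b):
    """Percentage of identical columns between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


STRUCT = "((((((....))))))"
NCRNA_1 = "AAAAAAGGGGTTTTTT"
NCRNA_2 = "AAATAAGGGGTTATTT"

for name, seq in (("ncRNA_1", NCRNA_1), ("ncRNA_2", NCRNA_2)):
    print(name, "fits the structure:", fits_structure(seq, STRUCT))
print("identity (%):", percent_identity(NCRNA_1, NCRNA_2))
```

The two sequences differ in columns 4 and 13, which are the two partners of a single base pair: the A-T pair of ncRNA_1 becomes T-A in ncRNA_2, so the structure is preserved even though the sequence changes.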
  6. Rfam: RNA family
     In the Rfam database, a functional RNA family is represented by a multiple sequence alignment and a covariance model.
     The model takes into account both sequence and structure and can be used to scan a genomic sequence to detect new members of the same family.
  7. Rfam family
     The Rfam seed alignment for the U12 minor spliceosomal RNA family.
  8. Finding family members
     Only one sequence, up to 10 kb.
     Search methodology: the query sequence is scanned against a library of Rfam sequences using WU-BLAST, with an E-value threshold of 1.0. Any matches are then scanned against the corresponding covariance model using the hand-curated threshold for that family.
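The two-stage methodology on slide 8 can be pictured as a pair of filters. The following is a rough sketch under stated assumptions, not the Rfam server's code: the `BlastHit` class, the function names, the family labels and all scores and thresholds are invented for illustration. Only the BLAST E-value cutoff of 1.0 comes from the slide.

```python
from dataclasses import dataclass

# Hypothetical data model for illustration; not an Rfam or WU-BLAST API.
@dataclass
class BlastHit:
    family: str    # family of the library sequence that the query matched
    evalue: float  # BLAST E-value of that match


BLAST_EVALUE_CUTOFF = 1.0  # threshold quoted on the slide


def candidate_families(blast_hits, cutoff=BLAST_EVALUE_CUTOFF):
    """Stage 1: keep families with at least one BLAST match at or below the cutoff."""
    return sorted({hit.family for hit in blast_hits if hit.evalue <= cutoff})


def confirmed_families(blast_hits, cm_bit_scores, family_thresholds):
    """Stage 2: keep candidates whose covariance-model bit score reaches the
    curated threshold of that family. `cm_bit_scores` maps family -> bit score
    and stands in for the output of a real covariance-model search."""
    confirmed = []
    for family in candidate_families(blast_hits):
        score = cm_bit_scores.get(family)
        if score is not None and score >= family_thresholds[family]:
            confirmed.append(family)
    return confirmed


# Made-up example values.
hits = [BlastHit("FAM_A", 1e-5), BlastHit("FAM_B", 0.5), BlastHit("FAM_C", 3.0)]
cm_scores = {"FAM_A": 60.0, "FAM_B": 20.0, "FAM_C": 80.0}
thresholds = {"FAM_A": 40.0, "FAM_B": 35.0, "FAM_C": 45.0}

print(candidate_families(hits))
print(confirmed_families(hits, cm_scores, thresholds))
```

In this toy data FAM_C would pass its covariance-model threshold, but it is never tested because its BLAST match is above the cutoff. That is the sensitivity cost of prefiltering.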
  9. Finding family members
     Results: positive hits are reported together with the score, E-value and alignment to the family CM.
  10. Finding family members
      Bit score: how well the sequence matches the model.
      The score reflects whether the sequence matches the profile model better (positive score) or the null model of nonhomologous sequences (negative score).
      E-value: expected number of false positives with bit scores at least as high as that of your hit.
      The value depends on the size of the database used for the search.
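The sign convention of the bit score and the dependence of the E-value on database size can be made concrete with a small calculation. This is a simplified sketch: it treats the bit score as a base-2 log-odds ratio and assumes the E-value scales linearly with the size of the search space. The function names and probabilities are invented for the example, and Infernal's actual statistics are more involved.

```python
import math


def bit_score(p_model, p_null):
    """Log-odds score in bits: positive if the model explains the sequence
    better than the null model, negative otherwise."""
    if p_model <= 0 or p_null <= 0:
        raise ValueError("probabilities must be positive")
    return math.log2(p_model / p_null)


def rescale_evalue(evalue, old_db_size, new_db_size):
    """Rescale an E-value to a different database size, assuming the expected
    number of chance hits grows in proportion to the search space."""
    return evalue * new_db_size / old_db_size


# Made-up probabilities, chosen as powers of two so the arithmetic is exact.
print(bit_score(2 ** -20, 2 ** -25))    # model 32 times more likely than null
print(bit_score(2 ** -30, 2 ** -25))    # null 32 times more likely than model
print(rescale_evalue(0.01, 1e6, 1e8))   # same hit, database 100 times larger
```

The same bit score therefore becomes less significant as the database grows, which is why an E-value should always be read together with the size of the database that was searched.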
  11. Finding family members
      I. Predicted secondary structure: "<>", "[]", "{}" base pairs; "_" hairpin loop; "-" interior loop and bulge; "," single-stranded multifurcation loop; ":" external single-stranded residues; "." insertion relative to the consensus.
      II. Consensus of the query model
      III. Alignment to the model and scoring system: capital letter = maximum score; ":" or "+" = score >= 0 for base pairs and single-stranded residues; blank = negative score.
      IV. Target sequence
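The structure line described in item I can be read programmatically. The sketch below is an illustration only: the function names and the example string are invented, the parser also accepts "()" as a bracket type, and it assumes properly nested pairs (no pseudoknots).

```python
from collections import Counter

# Bracket types that mark base pairs. "()" is included as an assumption
# in addition to the three types listed on the slide.
OPEN_TO_CLOSE = {"<": ">", "(": ")", "[": "]", "{": "}"}
CLOSERS = set(OPEN_TO_CLOSE.values())

# Symbols for unpaired positions, as listed on the slide.
UNPAIRED = {
    "_": "hairpin loop",
    "-": "interior loop or bulge",
    ",": "multifurcation loop",
    ":": "external single strand",
    ".": "insertion",
}


def base_pairs(ss):
    """Return sorted (i, j) index pairs of matching brackets in a structure line."""
    stack = []
    pairs = []
    for i, ch in enumerate(ss):
        if ch in OPEN_TO_CLOSE:
            stack.append((ch, i))
        elif ch in CLOSERS:
            if not stack or OPEN_TO_CLOSE[stack[-1][0]] != ch:
                raise ValueError("unmatched %r at position %d" % (ch, i))
            pairs.append((stack.pop()[1], i))
    if stack:
        raise ValueError("unclosed %r at position %d" % stack[-1])
    return sorted(pairs)


def summarize(ss):
    """Count how many positions fall into each annotation class."""
    counts = Counter()
    for ch in ss:
        if ch in OPEN_TO_CLOSE or ch in CLOSERS:
            counts["base-paired"] += 1
        else:
            counts[UNPAIRED.get(ch, "other")] += 1
    return dict(counts)


EXAMPLE = "::<<<<____>>>>::"  # made-up structure line: one stem closed by a hairpin loop
print(base_pairs(EXAMPLE))
print(summarize(EXAMPLE))
```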
  12. Finding family members
      Going to the family information
      A summary of the family written in Wikipedia is shown together with the information stored in the database.
  13. Finding family members
      Going to the family information
      Sequences belonging to the family can be viewed (if there are not too many).
  14. Finding family members
      Going to the family information
      Both seed and full alignments of members can be displayed.
  15. Finding family members
      Going to the family information
      Both seed and full alignments of members can be displayed.
  16. Finding family members
      Going to the family information
      The secondary structure can be viewed.
  17. Finding family members
      Going to the family information
      The secondary structure can be viewed.
  18. Finding family members
      Going to the family information
      The tree of genomes containing members of the family can also be browsed.
  19. Finding family members
      Going to the family information
      If a PDB entry is available, the three-dimensional structure can also be viewed.
  20. Finding family members
      Going to the family information
      If a PDB entry is available, the three-dimensional structure can also be viewed.
  21. Finding family members
      Going to the family information
      You can reach publications on the family.
  22. Finding family members
      Problems in searching sequences
      To speed up the search, a filtering step based on a BLAST search is necessary. This decreases the sensitivity in finding true homologues of the functional RNA family.
  23. Finding family members
      Problems in searching sequences (continued)
      The genomes of higher eukaryotes contain many ncRNA-derived pseudogenes and repeats that look like structured functional RNAs.
      Gardner PP, et al., Bateman A. Rfam: updates to the RNA families database. Nucleic Acids Res. 2009.
  24. Finding family members
      Batch search
      You can upload a file containing several sequences in FASTA format. Generally a job takes 48 hours.
      Files must have fewer than 100,000 lines and fewer than 1000 sequences, each shorter than 200,000 nucleotides.
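Before uploading, a FASTA file can be checked locally against the limits quoted on slide 24. The following is a minimal sketch: the limits come from the slide, while the function name, the message wording and the simplified FASTA parsing are assumptions of this example.

```python
# Limits quoted on the slide ("fewer than" / "shorter than", so the bounds are strict).
MAX_LINES = 100_000
MAX_SEQUENCES = 1000
MAX_SEQ_LENGTH = 200_000


def check_batch_limits(fasta_text):
    """Return a list of human-readable problems; an empty list means the
    text respects all three limits."""
    problems = []
    lines = fasta_text.splitlines()
    if len(lines) >= MAX_LINES:
        problems.append("%d lines (limit: fewer than %d)" % (len(lines), MAX_LINES))

    records = []  # list of [name, length], in file order
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = (line[1:].split() or ["unnamed"])[0]
            records.append([name, 0])
        elif records:
            records[-1][1] += len(line)

    if len(records) >= MAX_SEQUENCES:
        problems.append("%d sequences (limit: fewer than %d)" % (len(records), MAX_SEQUENCES))
    for name, length in records:
        if length >= MAX_SEQ_LENGTH:
            problems.append(
                "sequence %s has %d nucleotides (limit: shorter than %d)"
                % (name, length, MAX_SEQ_LENGTH)
            )
    return problems


small = ">seq1 example\nACGTACGT\nACGT\n>seq2\nGGCC\n"
print(check_batch_limits(small))
```

To check a real file, read it with `open(path).read()` and pass the text to `check_batch_limits`.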
  25. Finding family members
      Browsing for genome
      Genomes scanned for the presence of Rfam families are reported in the Browse tab.
  26. Finding family members
      Browsing for genome
      Species, kingdom, and the number of Rfam families and members (Regions) found within the species are reported.
  27. Finding family members
      Browsing for genome
  28. Finding family members
      Browsing for genome
  29. Finding family members
      Running a complete search for a whole genome.
      You can install the Infernal program locally; it is available at http://infernal.janelia.org/.
      To speed up the search you can also install the rfam_scan.pl script, available at ftp://ftp.sanger.ac.uk/pub/databases/Rfam/tools/, which relies on BLAST.
  30. Finding family members
      Running a complete search for a whole genome.
      Typical usage of Infernal:
        cmsearch -o output.aln --tabfile output.tab Rfam.cm infile.fna
      Typical usage of rfam_scan.pl:
        perl rfam_scan.pl -blastdb Rfam.fasta -o outfile.out Rfam.cm infile.fna
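When these commands are run for many genomes, it can help to build the argument lists in a script. The sketch below only assembles the command lines and does not execute anything. The function names are hypothetical. The option spellings and the argument order (model file first, then sequence file) are taken from the slide as reconstructed here, and should be checked against the help output of the installed Infernal and rfam_scan.pl versions, since option names differ between releases.

```python
import shlex


def cmsearch_argv(cm_file, seq_file, alignment_out, table_out):
    """Argument list for a cmsearch run as shown on the slide
    (flags as on the slide; verify them for your Infernal version)."""
    return ["cmsearch", "-o", alignment_out, "--tabfile", table_out, cm_file, seq_file]


def rfam_scan_argv(cm_file, seq_file, blast_db, out_file):
    """Argument list for rfam_scan.pl as shown on the slide
    (the option spellings are an assumption; check the script's help)."""
    return ["perl", "rfam_scan.pl", "-blastdb", blast_db, "-o", out_file, cm_file, seq_file]


# File names are the placeholders used on the slide.
print(shlex.join(cmsearch_argv("Rfam.cm", "infile.fna", "output.aln", "output.tab")))
print(shlex.join(rfam_scan_argv("Rfam.cm", "infile.fna", "Rfam.fasta", "outfile.out")))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the tools are installed. Passing a list rather than a single string avoids quoting problems with file names that contain spaces.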
  31. Finding family members
      Thanks!