Visualising RNA

             Paul Gardner
 paul.gardner@canterbury.ac.nz
University of Canterbury, Christchurch,
             New Zealand.


           March 20, 2013




         Paul Gardner   Visualising RNA
Feel free to share
       Feel free to tweet (@ppgardne), Google+, tumblr, ...
       Slides are available from
       http://www.slideshare.net/ppgardne/.




What is an RNA?

    A  Primary Structure

  5’ GCGGAUUUAGCUCAGDDGGGAGAGCGCCAGACUGAAYA.CUGGAGGUCCUGUGT.CGAUCCACAGAAUUCGCACCA 3’
     (the two “.” positions carry the Ψ labels printed above the sequence)

    B  Secondary Structure
       [Figure: cloverleaf diagram with the Acceptor Stem, D Loop, Anticodon Loop,
       Variable Loop and TΨC Loop labelled]

    C  Tertiary Structure
       [Figure: three-dimensional fold with the 5’ and 3’ ends, Acceptor Stem,
       D Loop, TΨC Loop and Anticodon Loop labelled]




What is Rfam?




      Sister database to Pfam
      Aims to annotate all ncRNA families
      Consortium headed by Alex Bateman (Wellcome Trust Sanger
      Institute), Sean Eddy (Janelia, Howard Hughes), Sam
      Griffiths-Jones (Manchester, BBSRC), Paul Gardner
      (University of Canterbury, RSNZ)


Rfam: families of ncRNAs




   http://rfam.sanger.ac.uk
   http://rfam.janelia.org




Building an Rfam family


          A structure from literature




   Pollard KS, et al. (2006). An RNA gene expressed during cortical development evolved rapidly in humans. Nature.




Building an Rfam family

        An Rfam family: produced manually from publication figures
    # STOCKHOLM 1.0

    G.gallus.1        UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUAG
    M.musculus.1      UAAAAUGGAGGAGAAAUUACAGCAAUUUAUCAGCUGAAAUUAUAGGUGUAGACACAUGUCAGCCGUGG
    M.mulatta.1       UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAGCUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
    G.gorilla.1       UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
    H.sapiens.1       UGAAACGGAGGAGACGUUACAGCAACGUGUCAGCUGAAAUGAUGGGCGUAGACGCACGUCAGCGGCGG
    P.troglodytes.1   UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
    P.abelii.1        UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
    C.lupus.1         UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCGGUGC
    T.truncatus.1     CGAAAAGGAGGGGAAAUUACAGCAAUUCAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
    B.taurus.1        CGAAAUGGAGGAGAAAUUACAGCAAUUCAUCAGCUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
    V.pacos.1         UGAAACAGAGGAGAAAUUACAGCAAUUCAUCAACCGAAAUGAUAGGGAUAGACAUGUGUCGGCAGUGG
    M.lucifugus.1     CGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAUCCGUGG
    O.anatinus.1      UGAAAUGGAGGAUAAAUUACAGCAAUUUAUCAAAUGAAAUUAUAGGUGUAGACACAUGUCAGCAAUGG
    #=GC SS_cons      <<<<<<.<<<<<<<<<<<.....>>>>>.....>><<<<<.<<<.<<<....>>>.>>>.........
    #=GC RF           uGaaacGGaGGagaaguuAcAGcaacuuAUcAgcuGaaacuaugGGcGUAGACgCAcgucAGcaguGg
    G.gallus.1        AAACAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    M.musculus.1      AAAUGGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    M.mulatta.1       AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    G.gorilla.1       AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    H.sapiens.1       AAAUGGUUUCUAUCAAAAUGAAAGUGUUUAGAGAUUUUCCUCAAGUUUCA
    P.troglodytes.1   AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    P.abelii.1        AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    C.lupus.1         AAACAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    T.truncatus.1     GAACACUUUCUAUCAAAAUUAAAGUACUUAGCGAUUUUCCUUAAAUUUCA
    B.taurus.1        AAACCGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUUAAAUUUCA
    V.pacos.1         AAACAGUUUCUAUCAAAAUUAAAGUAUUUAGAGACUUUCCUCAAAUUUCA
    M.lucifugus.1     AAACAGUUACGAUCAAAAUUAAAGUGUUUAGAGAUUUUCCUC.AAUUUUA
    O.anatinus.1      AAACAAUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
    #=GC SS_cons      .....>>>>>....<<<<<..............>>>>>>>>>..>>>>>>
    #=GC RF           AAAuaguuuCUAUcaaaauuAAAGUAUUUAGAGauuuuCCuCAAguuuCa
    //
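The alignment above is in interleaved Stockholm format: each sequence is split across blocks, and the `#=GC SS_cons` line records the consensus base pairs as matching `<` and `>` characters. As a minimal sketch of how such a file might be read for visualisation, the code below joins the blocks and pairs up the brackets. The function names `parse_stockholm` and `base_pairs` are illustrative assumptions, not part of any Rfam codebase, and only the `<`/`>` bracket style shown on the slide is handled.

```python
def parse_stockholm(text):
    """Join the blocks of an interleaved Stockholm alignment.

    Returns (sequences, gc) where both map a name or tag to one string.
    """
    seqs, gc = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "//":
            continue
        if line.startswith("#=GC"):
            # Per-column annotation, e.g. "#=GC SS_cons <<<...>>>"
            _, tag, value = line.split(None, 2)
            gc[tag] = gc.get(tag, "") + value
        elif line.startswith("#"):
            continue  # header and other markup lines are skipped
        else:
            name, chunk = line.split(None, 1)
            seqs[name] = seqs.get(name, "") + chunk.strip()
    return seqs, gc


def base_pairs(ss_cons):
    """Return sorted (i, j) column pairs (0-based) from a '<' / '>' string."""
    stack, pairs = [], []
    for i, ch in enumerate(ss_cons):
        if ch == "<":
            stack.append(i)
        elif ch == ">":
            if not stack:
                raise ValueError("unbalanced '>' at column %d" % i)
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '<' at column %d" % stack[-1])
    return sorted(pairs)


# Toy two-block alignment (made up for illustration, not Rfam data).
TOY = """# STOCKHOLM 1.0
seqA          GGGAAA
seqB          GGCAAA
#=GC SS_cons  <<<...
seqA          CCC
seqB          GCC
#=GC SS_cons  >>>
//
"""

if __name__ == "__main__":
    sequences, annotation = parse_stockholm(TOY)
    print(sequences)
    print(base_pairs(annotation["SS_cons"]))
```

The resulting pair list is the kind of input a secondary-structure drawing or an arc diagram would need.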


Building an Rfam family

      And the Wikipedia entry




Conflicting priorities

          A Curator’s priorities                           A User’s priorities
             1.   New families                                1.   FTP (Bioinformaticians)
             2.   Accuracy of models                          2.   Website
             3.   Annotation                                  3.   Visualization
             4.   Functional codebase                         4.   Number of families
             5.   Website                                     5.   Accuracy of models
             6.   Visualization                               6.   Annotation




   Image credits: www.conflictdynamics.org
2007: challenges
      Quality Control
      Re-write the website and add some bling
      Update codebase
      Export annotation to Wikipedia
      User community input via RNA Biology




Visualisation priorities

   SCALE
       Two to two million sequences, 30 to 3,000 nucleotides long, 0
       to 1,000 basepairs.
       AUTOMATED: thousands of families.


   INFORMATIVE
       Generates biologically relevant hypotheses


   INCLUSIVE
       Make the most of our fantastic Bioinformatics & Visualisation
       community.


Examples




      Caveat: none of the images I am showing is a final solution;
      everything can be improved upon.

           Secondary Structure                 Alignment
           Taxonomic Distribution              Genomic contexts & Gene Order




RNA Secondary Structure
   [Figure: RNA secondary-structure diagram with the 5’ and 3’ ends marked and
   nucleotides shown as IUPAC consensus codes; legend: sequence conservation,
   scale 0 to 1]


   Gardner, Bateman & Poole (2010). SnoPatrol: how many snoRNA genes are there? Journal of Biology.
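The figure's legend gives sequence conservation on a scale from 0 to 1, but the slide does not say how that score was computed. As one hedged possibility, the sketch below scores each alignment column by the fraction of sequences that carry the most common residue, counting gaps against the column. The function name `column_conservation` and the scoring rule are assumptions for illustration, not the method used for the published figure.

```python
from collections import Counter


def column_conservation(seqs, gaps=".-"):
    """Score each column of an alignment between 0 and 1.

    The score is (count of the most common non-gap residue) divided by
    (number of sequences), so gap-rich columns score low.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*seqs):
        counts = Counter(c.upper() for c in column if c not in gaps)
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


if __name__ == "__main__":
    # Toy alignment, made up for illustration.
    print(column_conservation(["GAU", "GAC", "G-C", "GUC"]))
```

Scores of this kind could then be mapped onto a colour gradient when drawing each nucleotide of the structure.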

Old Taxonomic distributions: RybB
      Contamination displayed first.




Old Taxonomic distributions: RybB
      After some scrolling




New Taxonomic distributions: RybB
      Sunbursts: concentric “pie charts” in which each outer ring
      contains the “children” nodes of the ring inside it.
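The sunburst idea can be sketched as a small layout routine: every node of a taxonomy tree gets a ring (its depth) and an angular span, and the children of a node divide up their parent's span in proportion to the number of sequences beneath them. This is a minimal sketch under stated assumptions; the tree shape (nested dicts with `name`, `count` and `children` keys), the function names and the example counts are all hypothetical and are not taken from the Rfam website code.

```python
def leaf_total(node):
    """Sum the sequence counts of all leaves under a node."""
    children = node.get("children", [])
    if not children:
        return node.get("count", 0)
    return sum(leaf_total(child) for child in children)


def sunburst(node, start=0.0, end=360.0, depth=0):
    """Yield (name, ring, start_angle, end_angle) for a node and its subtree.

    Ring 0 is the innermost circle; each child sits one ring further out
    and stays inside the angular span of its parent.
    """
    yield (node["name"], depth, start, end)
    total = leaf_total(node)
    angle = start
    for child in node.get("children", []):
        share = leaf_total(child) / total if total else 0.0
        span = (end - start) * share
        yield from sunburst(child, angle, angle + span, depth + 1)
        angle += span


# Hypothetical taxonomy with made-up sequence counts.
TREE = {
    "name": "root",
    "children": [
        {"name": "Bacteria", "children": [
            {"name": "Proteobacteria", "count": 6},
            {"name": "Firmicutes", "count": 2},
        ]},
        {"name": "Eukaryota", "count": 2},
    ],
}

if __name__ == "__main__":
    for name, ring, a0, a1 in sunburst(TREE):
        print(f"ring {ring}: {name:15s} {a0:6.1f} to {a1:6.1f} degrees")
```

Because a small group such as a contaminant occupies only a thin wedge, it no longer dominates the view the way it can in a scrolling list.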




Alignments




      When we have sequenced everything, how is this view going
      to look?

Genomic contexts & Gene Order


           How can we display comparative gene-order information in a
           scalable fashion?
           Think of hundreds to thousands of genomes, tens to hundreds
           of features.




   Barquist L, et al. (2013). A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and
   Typhimurium. Nucleic Acids Research.




Open problems



      Evolution and RNA structure
      Scalable, alignment visualisation (and editing)
          As alignments grow, we need to be able to partition,
          compress and summarize groupings of sequences. Thousands of
          sequences from the same species are not interesting to view, nor
          is a screen full of gaps.
      Expression and conservation levels
      Genomic context & gene-order
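Two of the simplest compression steps hinted at under scalable alignment visualisation can be sketched directly: keep one representative per species, then drop any column that is left containing only gaps. This is a rough sketch, not a proposed solution. It assumes sequence names follow the `Species.N` pattern seen in the Stockholm example earlier in the talk (for example `H.sapiens.1`), and the function names are hypothetical.

```python
def one_per_species(seqs):
    """Keep the first sequence seen for each species.

    The species is taken to be everything before the final '.' in the
    name, e.g. 'H.sapiens.1' -> 'H.sapiens' (an assumed naming scheme).
    """
    kept, seen = {}, set()
    for name, seq in seqs.items():
        species = name.rsplit(".", 1)[0]
        if species not in seen:
            seen.add(species)
            kept[name] = seq
    return kept


def drop_gap_columns(seqs, gaps=".-"):
    """Remove alignment columns in which every sequence has a gap."""
    names = list(seqs)
    columns = [
        column for column in zip(*seqs.values())
        if any(c not in gaps for c in column)
    ]
    return {
        name: "".join(column[i] for column in columns)
        for i, name in enumerate(names)
    }


if __name__ == "__main__":
    # Toy alignment, made up for illustration.
    alignment = {
        "H.sapiens.1": "GA-U",
        "H.sapiens.2": "GAAU",
        "M.musculus.1": "GC-U",
    }
    print(drop_gap_columns(one_per_species(alignment)))
```

Picking the first sequence per species is arbitrary; a real tool would more likely summarize each group, for instance with a consensus and a count of collapsed sequences.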




Thanks!
         The Rfam Consortium:
                 Alex Bateman, Sean Eddy, Sam Griffiths-Jones, Sarah Burge,
                 Eric Nawrocki, John Tate, Rob Finn, Jennifer Daub, Ruth Eberhardt

         Visualisation Tools:
                 Ivo Hofacker, Yann Ponti, Jim Proctor, Ian Holmes, Irmtraud Meyer,
                 Zasha Weinberg and many others.




  PPG is supported by a Rutherford Discovery Fellowship from Government funding, administered by the Royal
  Society of New Zealand.


Vizbi2013: Visualising RNA
