2009-02_JohnInnesRoadshow_GO_jdeegan.ppt
  • The Gene Ontology (GO) Consortium develops ontologies and annotates gene products to those ontologies. The ontologies are databases containing sets of biological processes, molecular functions, and cellular components, and the relationships between them. Annotators within the consortium use these ontologies to categorise gene products. In my talk today I’m going to explain the uses of GO and give a more detailed explanation of the ontologies and of the system of annotation. Then Harold Drabkin is going to talk in more detail about annotation and about how you can submit annotations of your own gene products of interest.
  • This is a gene product that has already been annotated to all three gene ontologies. It is the Mitochondrial P450 gene product.
  • The mitochondrial P450 gene products are localised on the mitochondrial inner membrane, and the GO cellular component term for this is mitochondrial inner membrane ; GO:0005743
  • The function of the gene product is described by the GO molecular function term: monooxygenase activity ; GO:0004497
  • The process in which the gene product is involved is described by the GO biological process term: electron transport ; GO:0006118. In this way you can see that many aspects of a single gene product can be recorded simply by annotating it to the three ontologies.
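The three annotations above can be held in one small record. This is a minimal sketch in Python that uses only the term names and identifiers quoted in these notes; the dictionary layout and the `describe` helper are illustrative assumptions, not a GO Consortium format.

```python
# Annotations of the mitochondrial P450 gene product, one per ontology.
# Term names and IDs come from the notes above; the layout is hypothetical.
p450_annotations = {
    "cellular_component": ("mitochondrial inner membrane", "GO:0005743"),
    "molecular_function": ("monooxygenase activity", "GO:0004497"),
    "biological_process": ("electron transport", "GO:0006118"),
}


def describe(annotations):
    """Return one readable line per ontology, sorted by ontology name."""
    return [
        f"{ontology}: {name} ; {go_id}"
        for ontology, (name, go_id) in sorted(annotations.items())
    ]


for line in describe(p450_annotations):
    print(line)
```

Keeping the ontology as the key makes it easy to ask "where is it?", "what does it do?" and "which process?" separately.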
  • In developing the ontologies we are solving a number of problems for biologists. Currently in biology there are many ambiguities in language. Groups of researchers may use the same words to mean different things, or they may use several different words to refer to the same thing. This causes problems for scientists trying to access research carried out by groups outside their immediate field. It also makes it very difficult to process biological information using a computer. For example, three groups of biologists studying different model organisms may all be studying the perception of touch. Scientists in different groups might talk about this single process as ‘tactition’, ‘tactile sense’ or ‘taction’. This differing use of language means that they will have more trouble finding and reading each other’s papers. It will also be harder for them to use a computer to find and interpret biological data on this subject, since the computer has no way to know that these words mean the same thing.
  • The GO provides a solution to this problem since we take biological concepts like the perception of touch and we make them a single GO item in the ontology. We add all the relevant synonyms, and give a unique numerical identifier to the concept.
  • The GO also provides a solution to the opposite problem, in which several groups of scientists use the same words to refer to different things. For example, the phrase ‘bud initiation’ could refer to the initiation of a tooth bud, a yeast reproductive bud, or a bud on a tree. However, these three types of bud are initiated in quite different ways, and scientists would like to be able to distinguish between them.
  • To solve this problem the GO differentiates between differing concepts by adding a ‘sensu ending’. So according to this example, we would have ‘bud initiation sensu Metazoa’ to mean the kind of bud initiation that gives rise to a tooth in mammals. We would have ‘bud initiation sensu Saccharomyces’ for the initiation of a reproductive bud in yeast, and we would have ‘bud initiation sensu Viridiplantae’ for the initiation of a tree bud. This means that gene products involved in bud initiation can be categorised along with only those other gene products involved in the same kind of bud initiation.
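The synonym idea described in these notes can be sketched as a lookup from every label to one identifier. This is a minimal Python illustration using the ‘perception of touch’ example; the `lookup` table and the `resolve` helper are hypothetical names, not part of any GO tool.

```python
# One GO term with its synonyms, as given in the notes above.
TERM_ID = "GO:0050975"
TERM_NAME = "perception of touch"
SYNONYMS = ["tactition", "tactile sense", "taction"]

# Case-insensitive lookup from the name and every synonym to the single ID.
lookup = {label.lower(): TERM_ID for label in [TERM_NAME, *SYNONYMS]}


def resolve(label):
    """Return the GO ID for a name or synonym, or None if the label is unknown."""
    return lookup.get(label.strip().lower())


print(resolve("Tactition"))
print(resolve("tactile sense"))
print(resolve("bud initiation"))  # not in this toy table, so None
```

Because all three words resolve to the same identifier, a computer can treat papers that use any of them as being about the same process.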
  • … piece of text…from which can pull out GO terms
  • The GO can also be used for very specific applications in the lab. For example, in microarray analysis you can use GO data to show which processes are modified by the treatment being studied in a given experiment. You can also use the GO to give an overview of the range of gene products in a whole genome, as represented by the functions or processes those genes are involved in.
  • Finally, this is the current list of groups in the consortium. The Editorial Office, where the ontologies are developed, is in Cambridge in the UK, and the rest of these groups contribute annotations. We are keen to include more groups in the annotation process so that more species will be manually annotated to the GO, so Harold Drabkin is now going to talk about the process of annotation and about how you can contribute manual annotations of your own gene products.
  • Transcript

    • 1. The Gene Ontology Project An Introduction
    • 2. There is a lot of biological research output.
    • 3. Search on mesoderm development…
    • 4. You get 6752 results! How will you ever find what you want? Another example…
    • 5. Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI. Microarray data shows changed expression of thousands of genes. How will you spot the patterns? attacked time control
    • 6. Scientists work hard. http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif http://www.teamtechnology.co.uk/f-scientist.jpg
    • 7. There are lots of papers to read. http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif http://www.teamtechnology.co.uk/f-scientist.jpg
    • 8. more every week. http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif http://www.teamtechnology.co.uk/f-scientist.jpg
    • 9. and more… http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif http://www.teamtechnology.co.uk/f-scientist.jpg
    • 10. more and more and more! http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif http://www.teamtechnology.co.uk/f-scientist.jpg
    • 11. Help! Help! more and more and more! http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif http://www.teamtechnology.co.uk/f-scientist.jpg
    • 12. Ontology is a way to capture knowledge in a written and computable form. Computable means that the computer finds patterns so we don’t have to.
    • 13. eBay search (keyword ‘lead’) v. PubMed search (keyword ‘flower’) Demo and practical work
    • 14. The Gene Ontology
    • 15. This is our browser.
    • 16. Search on mesoderm development.
    • 17. Here is mesoderm development.
    • 18. Definition of mesoderm development. Gene products involved in mesoderm development.
    • 19. There are many gene products involved in mesoderm development. You can read papers describing what is known about them. But fewer gene products than papers.
    • 20.  
    • 21. attacked time control Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.
    • 22. attacked time control Puparial adhesion Molting cycle hemocyanin Defense response Immune response Response to stimulus Toll regulated genes JAK-STAT regulated genes Immune response Toll regulated genes Amino acid catabolism Lipid metabolism Peptidase activity Protein catabolism Immune response See which processes are upregulated or downregulated. Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.
    • 23. Practical work: Search AmiGO Did you find your favourite gene product or process?
    • 24. How does the Gene Ontology work?
    • 25.  
    • 26. The Gene Ontology is like a dictionary term : transcription initiation definition : Processes involved in the assembly of the RNA polymerase complex at the promoter region of a DNA template resulting in the subsequent synthesis of RNA from that promoter. id: GO:0006352
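The dictionary analogy on this slide maps onto a small record type. Here is a minimal sketch in Python, assuming a hypothetical `Term` class; the only rule it enforces is the identifier shape seen throughout this deck ("GO:" followed by seven digits).

```python
import re
from dataclasses import dataclass

# GO identifiers in this deck are written as "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class Term:
    """A dictionary-style ontology entry: identifier, name and definition."""

    go_id: str
    name: str
    definition: str

    def __post_init__(self):
        # Reject identifiers that do not have the expected shape.
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"not a GO identifier: {self.go_id!r}")


transcription_initiation = Term(
    go_id="GO:0006352",
    name="transcription initiation",
    definition=(
        "Processes involved in the assembly of the RNA polymerase complex "
        "at the promoter region of a DNA template resulting in the "
        "subsequent synthesis of RNA from that promoter."
    ),
)
print(transcription_initiation.go_id, transcription_initiation.name)
```

The identifier, not the name, is what stays stable when wording changes, which is why the sketch validates it on construction.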
    • 27. Clark et al., 2005 The whole system. part_of is_a
    • 28. Mitochondrial P450 ( CC24 PR01238 ; MITP450CC24) An example…
    • 29. GO cellular component term: GO:0005743 Where is it? Mitochondrial p450 mitochondrial inner membrane
    • 30. GO molecular function term: GO:0004497 What does it do? substrate + O2 = CO2 + H2O product monooxygenase activity
    • 31. http://ntri.tamuk.edu/cell/mitochondrion/krebpic.html GO biological process term: GO:0006118 Which process is this? electron transport
    • 32. Clark et al., 2005 The whole system. part_of is_a
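The is_a and part_of relationships named on this slide link terms into a graph that can be walked upward. The sketch below is a toy Python fragment built around the mitochondrial inner membrane example; the edge list is simplified and assumed for illustration, so check the live ontology before relying on any of these links. The `ancestors` helper is a hypothetical name.

```python
# Toy fragment of the cellular component ontology as (child, relation, parent).
# The links are simplified for illustration and are an assumption of this sketch.
EDGES = [
    ("mitochondrial inner membrane", "part_of", "mitochondrial envelope"),
    ("mitochondrial inner membrane", "is_a", "organelle inner membrane"),
    ("mitochondrial envelope", "part_of", "mitochondrion"),
]


def ancestors(term, edges=EDGES):
    """Return every term reachable by following is_a or part_of links upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child, _relation, parent in edges:
            if child == current and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(ancestors("mitochondrial inner membrane")))
```

A gene product annotated to a specific term can then be found by a search on any of that term's ancestors.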
    • 33. The Gene Ontology is for all species and that means we have to *bridge* some language barriers.
    • 34. http://www.darknessandlight.co.uk/cambridge_photographs.html http://www.lockeheemstra.com/italy/bridge-of-sighs-venice.html Ponte dei Sospiri, Venice. Bridge of Sighs, Cambridge. Same name, same thing?
    • 35. In biology… Tactition Tactile sense Taction ?
    • 36. Tactition Tactile sense Taction perception of touch ; GO:0050975
    • 37. Bud initiation?
    • 38. = tooth bud initiation = reproductive bud initiation = branch bud initiation
    • 39. Demo: Writing an ontology The car ontology
    • 40.
      • Demo: The gene ontology
    • 41. Categorization of gene products using GO is called annotation. So how does that happen?
    • 42. Choose your favourite gene. P05147
    • 43. Find a paper about it. P05147 PMID: 2976880
    • 44. Find the GO term describing its function, process or location of action. P05147 PMID: 2976880 GO:0047519
    • 45. GO:0047519 What evidence do they show? P05147 PMID: 2976880 IDA
    • 46. Write these down… P05147 PMID: 2976880 GO:0047519 IDA P05147 GO:0047519 IDA PMID:2976880
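The four items collected on slides 42 to 46 form one annotation record. Here is a minimal Python sketch that parses the line written on the slide; the `Annotation` tuple and `parse_annotation` helper are hypothetical, and real submission files carry more fields than these four.

```python
from collections import namedtuple

# Field order follows the line on the slide:
# gene product, GO term, evidence code, reference.
Annotation = namedtuple("Annotation", "gene_product go_id evidence reference")


def parse_annotation(line):
    """Split a whitespace-separated annotation line into its four fields."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return Annotation(*fields)


record = parse_annotation("P05147 GO:0047519 IDA PMID:2976880")
print(record.gene_product, record.go_id, record.evidence, record.reference)
```

Each field answers one question: which gene product, which term, what kind of evidence, and which paper.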
    • 47. Send to the GO Consortium .
    • 48. Finding annotations in a paper
      • “In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…”
      • Process: “wound response” gives response to wounding GO:0009611
      • Function: “serine/threonine kinase activity” gives protein serine/threonine kinase activity GO:0004674
      • Component: “integral membrane protein” gives integral to plasma membrane GO:0005887
      • … for B. napus PERK1 protein (Q9ARH1) PubMed ID: 12374299
    • 49. Annotation details
    • 50.  
    • 51.  
    • 52. Where to get annotations?
      • Non-redundant species database
        • Contains all GO annotations for given species + other information.
        • http://www.arabidopsis.org/
      • Multispecies database - GOA
        • Contains all GO annotations.
        • http://beta.uniprot.org/
    • 53. Evidence codes
    • 54.
      • IDA - inferred from direct assay
        • Enzyme assays
        • In vitro reconstitution (e.g. transcription)
        • Immunofluorescence (for cellular component)
        • Cell fractionation (for cellular component)
        • Physical interaction/binding
      • IEP - inferred from expression pattern
        • Transcript levels (e.g. Northerns, microarray data)
        • Protein levels (e.g. Western blots)
      • IGC - inferred from genomic context
        • Operon structure
        • Syntenic regions
        • Pathway analysis
        • Genome-scale analysis of processes
    • 55.
      • IGI - inferred from genetic interaction
        • "Traditional" genetic interactions such as suppressors, synthetic lethals, etc.
        • Functional complementation
        • Rescue experiments
        • Inference about one gene drawn from the phenotype of a mutation in a different gene.
      • IMP - inferred from mutant phenotype
        • Any gene mutation/knockout
        • Overexpression/ectopic expression of wild-type or mutant genes
        • Anti-sense experiments
        • RNAi experiments
        • Specific protein inhibitors
        • Polymorphism or allelic variation
      • IPI - inferred from physical interaction
        • 2-hybrid interactions
        • Co-purification
        • Co-immunoprecipitation
        • Ion/protein binding experiments
    • 56.
      • ISS - inferred from sequence or structural similarity
        • Sequence similarity (homologue of/most closely related to)
        • Recognized domains
        • Structural similarity
        • Southern blotting
      • RCA - inferred from reviewed computational analysis
        • Large-scale protein-protein interaction experiments
        • Microarray experiments
        • Integration of large-scale datasets of several types
        • Text-based computation
      • IEA - inferred from electronic annotation
      • NAS - non-traceable author statement
      • ND - no biological data available
      • TAS - traceable author statement
      • NR - not recorded
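The evidence codes listed on slides 54 to 56 can be kept in a lookup table so that annotations are easy to sort into manual and electronic. This is a minimal Python sketch; treating IEA as the only electronic code is an assumption of this example, and `is_electronic` is a hypothetical helper.

```python
# Evidence codes as listed on the slides above.
EVIDENCE_CODES = {
    "IDA": "inferred from direct assay",
    "IEP": "inferred from expression pattern",
    "IGC": "inferred from genomic context",
    "IGI": "inferred from genetic interaction",
    "IMP": "inferred from mutant phenotype",
    "IPI": "inferred from physical interaction",
    "ISS": "inferred from sequence or structural similarity",
    "RCA": "inferred from reviewed computational analysis",
    "IEA": "inferred from electronic annotation",
    "NAS": "non-traceable author statement",
    "ND": "no biological data available",
    "TAS": "traceable author statement",
    "NR": "not recorded",
}


def is_electronic(code):
    """True for IEA only; raises KeyError for a code that is not in the table."""
    if code not in EVIDENCE_CODES:
        raise KeyError(f"unknown evidence code: {code}")
    return code == "IEA"


# Split a few example codes into manual and electronic groups.
codes = ["IDA", "IEA", "TAS"]
manual = [c for c in codes if not is_electronic(c)]
print(manual)
```

Filtering on the evidence code is one simple way to decide how much weight to give an annotation.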
    • 57. PMID: 15960829 Should we trust electronic annotations?
    • 58. http://www.geneontology.org/GO.indices.shtml
    • 59.  
    • 60.
      • !version: $Revision: 1.67 $
      • !date: $Date: 2008/01/21 11:29:01 $
      • !Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.
      • !original mapping by Michael Ashburner, Cambridge.
      • !This version parsed from function.ontology on 2008/01/15 14:01:16
      • !by Daniel Barrell, EBI, Hinxton
      • !
      • EC:1 > GO:oxidoreductase activity ; GO:0016491
      • EC:1.1 > GO:oxidoreductase activity, acting on CH-OH group of donors ; GO:0016614
      • EC:1.1.1 > GO:oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor ; GO:0016616
      • EC:1.1.1.1 > GO:alcohol dehydrogenase activity ; GO:0004022
      • EC:1.1.1.10 > GO:L-xylulose reductase activity ; GO:0050038
      • EC:1.1.1.100 > GO:3-oxoacyl-[acyl-carrier-protein] reductase activity ; GO:0004316
      • EC:1.1.1.101 > GO:acylglycerone-phosphate reductase activity ; GO:0000140
      • EC:1.1.1.102 > GO:3-dehydrosphinganine reductase activity ; GO:0047560
      • EC:1.1.1.103 > GO:L-threonine 3-dehydrogenase activity ; GO:0008743
      • EC:1.1.1.104 > GO:4-oxoproline reductase activity ; GO:0016617
      ec2go mapping
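A mapping file in the layout shown on this slide is straightforward to read line by line. Here is a minimal Python sketch, assuming each EC number appears on one line as in the excerpt; `parse_ec2go` is a hypothetical helper, not a GO Consortium tool.

```python
def parse_ec2go(lines):
    """Map each EC number to (GO term name, GO ID), skipping '!' comment lines."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        ec_number, go_part = line.split(" > ", 1)
        name, go_id = go_part.rsplit(" ; ", 1)
        # The term name is written with a literal "GO:" prefix; drop it.
        if name.startswith("GO:"):
            name = name[len("GO:"):]
        mapping[ec_number] = (name, go_id)
    return mapping


# Lines copied from the slide above.
sample = [
    '!Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.',
    "EC:1 > GO:oxidoreductase activity ; GO:0016491",
    "EC:1.1.1.1 > GO:alcohol dehydrogenase activity ; GO:0004022",
]
ec2go = parse_ec2go(sample)
print(ec2go["EC:1.1.1.1"])
```

If a file mapped one EC number to several GO terms, this dictionary would keep only the last one, so a list per key would be the safer choice there.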
    • 61. interpro2go mapping
      • !date: 2008/01/15 13:01:24
      • !Mapping of InterPro entries to GO
      • !Nicola Mulder, Hinxton
      • !
      • InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677
      • InterPro:IPR000003 Retinoid X receptor > GO:steroid binding ; GO:0005496
      • InterPro:IPR000003 Retinoid X receptor > GO:regulation of transcription, DNA-dependent ; GO:0006355
      • InterPro:IPR000003 Retinoid X receptor > GO:nucleus ; GO:0005634
      • InterPro:IPR000005 Helix-turn-helix, AraC type > GO:transcription factor activity ; GO:0003700
      • InterPro:IPR000005 Helix-turn-helix, AraC type > GO:intracellular ; GO:0005622
      • InterPro:IPR000006 Metallothionein, vertebrate > GO:metal ion binding ; GO:0046872
      • InterPro:IPR000013 Peptidase M7, snapalysin > GO:extracellular region ; GO:0005576
      • InterPro:IPR000014 PAS > GO:signal transducer activity ; GO:0004871
      • InterPro:IPR000015 Fimbrial biogenesis outer membrane usher protein > GO:transporter activity ; GO:0005215
      • InterPro:IPR000018 P2Y4 purinoceptor > GO:purinergic nucleotide receptor activity, G-protein coupled ; GO:0045028
      • InterPro:IPR000020 Anaphylatoxin/fibulin > GO:extracellular region ; GO:0005576
      • InterPro:IPR000021 Hok/gef cell toxic protein > GO:membrane ; GO:0016020
      • InterPro:IPR000022 Carboxyl transferase > GO:ligase activity ; GO:0016874
      • InterPro:IPR000023 Phosphofructokinase > GO:6-phosphofructokinase activity ; GO:0003872
      • InterPro:IPR000025 Melatonin receptor > GO:integral to membrane ; GO:0016021
      • InterPro:IPR000026 Guanine-specific ribonuclease N1 and T1 > GO:endoribonuclease activity ; GO:0004521
      • InterPro:IPR000028 Chloroperoxidase > GO:peroxidase activity ; GO:0004601
      InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
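The interpro2go lines above show how electronic annotation can work: a protein that contains a given InterPro entry inherits the GO terms mapped to that entry. This is a minimal Python sketch of that transfer; `protein_X` and its domain list are hypothetical inputs, and both helper functions are illustrative rather than the pipeline actually used.

```python
from collections import defaultdict


def parse_interpro2go(lines):
    """Collect the GO IDs mapped to each InterPro accession, skipping '!' comments."""
    mapping = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        left, right = line.split(" > ", 1)
        # "InterPro:IPR000003 Retinoid X receptor" becomes "IPR000003"
        accession = left.split()[0].split(":", 1)[1]
        go_id = right.rsplit(" ; ", 1)[1]
        mapping[accession].append(go_id)
    return dict(mapping)


def electronic_annotations(protein, domains, mapping):
    """Yield (protein, GO ID, 'IEA') for every GO term implied by the domains."""
    for accession in domains:
        for go_id in mapping.get(accession, []):
            yield (protein, go_id, "IEA")


# Lines copied from the slide above.
sample = [
    "!Mapping of InterPro entries to GO",
    "InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677",
    "InterPro:IPR000003 Retinoid X receptor > GO:nucleus ; GO:0005634",
    "InterPro:IPR000014 PAS > GO:signal transducer activity ; GO:0004871",
]
interpro2go = parse_interpro2go(sample)
for row in electronic_annotations("protein_X", ["IPR000003"], interpro2go):
    print(row)
```

Every row carries the IEA evidence code, so anyone reading the annotation later can tell it was inferred from a mapping rather than curated from a paper.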
    • 62. Manual annotation appears in AmiGO. Manual and electronic annotation appears in QuickGO.
    • 63. Clark et al., 2005 Many species groups annotate. We see the research of one function across all species.
    • 64. Exercise: Search for your favourite gene and see if the annotation is electronic or manual. http://www.ebi.ac.uk/ego/
    • 65. Submit new GO terms: http://www.geneontology.org/
    • 66.  
    • 67. GO slims
    • 68. Clark et al., 2005 part_of is_a
    • 69. Clark et al., 2005 part_of is_a
    • 70. Whole genome analysis (J. D. Munkvold et al., 2004)
    • 71. attacked time control Puparial adhesion Molting cycle hemocyanin Immune response Toll regulated genes Amino acid catabolism Lipid metabolism Peptidase activity Protein catabolism Immune response Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI. … analysis of high-throughput data according to GO
    • 72. Making Slims: OBO-Edit
    • 73. Reapplying slimmed ontology to annotations: AmiGO http://amigo.geneontology.org/
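Reapplying a slim to annotations amounts to replacing each annotated term with the slim term or terms it descends from, then counting. Here is a toy Python sketch of that idea; the parent links, the slim set and the gene names are all assumed for illustration and are not taken from OBO-Edit or AmiGO.

```python
from collections import Counter

# Toy parent links (child -> parents); assumed for illustration only.
PARENTS = {
    "immune response": ["response to stimulus"],
    "defense response": ["response to stimulus"],
    "lipid metabolism": ["metabolism"],
    "amino acid catabolism": ["metabolism"],
}
# Hypothetical slim: the broad terms we want to report on.
SLIM = {"response to stimulus", "metabolism"}


def slim_terms(term):
    """Return the slim terms that a term is, or descends from."""
    hits, stack, seen = set(), [term], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in SLIM:
            hits.add(current)
        stack.extend(PARENTS.get(current, []))
    return hits


# Hypothetical gene products and their annotations.
annotations = [
    ("gene_a", "immune response"),
    ("gene_b", "defense response"),
    ("gene_c", "lipid metabolism"),
]
counts = Counter(s for _gene, term in annotations for s in slim_terms(term))
print(dict(counts))
```

The resulting counts give the kind of whole-genome or whole-experiment overview that the slim slides describe.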
    • 74. Converting IDs: PICR http://www.ebi.ac.uk/Tools/picr/
    • 75. GOOSE http://www.berkeleybop.org/goose
    • 76. 2006 Consortium Meeting, St. Croix, U.S. Virgin Islands, March 30 - April 3, 2006
    • 77. http://www.geneontology.org Reactome E. coli hub