Biocuration 2013 - Fiona Brinkman - From genes, to genomes to networks, with community aided curation
Biocuration 2013 conference talk - April 9, 2013

Transcript

  • 1. From genes, to genomes to networks, with community aided curation. Fiona Brinkman, Simon Fraser University. Biocuration conference, April 2013
  • 2. From genes, to genomes to networks, with community aided curation (with a little help from my friends …)
  • 3. My Primary Research Interest: Developing more sustainable approaches for infectious disease control …using novel computational tools, integrated data and interdisciplinary approaches. Targeting major players resulting in infectious disease:
      o Pathogen virulence: ID anti-infectives (don't kill the pathogen, disarm them)
      o Host immune system failure/over-activity: Immune modulators that dampen damaging inflammation and boost "good" immune response
      o Changes in environment/social factors: Integrating pathogen genome data with environment, microbiome, and social network data; better identify source/cause of disease outbreaks
  • 4. Some of our lab's tools…
      o Pathogen virulence: PSORTb – Protein localization analysis (ID cell surface/secreted drug targets); IslandViewer – Genomic island analysis, pathogen-associated genes; Ortholuge DB – Precomputed assessments of bacterial orthologs; Genera-specific DBs like Pseudomonas Genome Database
      o Host immune system failure/over-activity: InnateDB – Human/Mouse interactome + curated innate immunity-associated interactions
      o Changes in environment/social factors: Metagenomics projects; Integrated Rapid Infectious Disease Analysis Pipeline (IRIDA)
  • 6. Research Philosophy: High quality analyses are only as good as the robust data, effective data organization and accurate analysis methods used. [Diagram: "The Nexus" of robust data, data organization and accurate analysis methods.] Want high accuracy – usually erring on the side of high precision at the expense of recall. To attain high accuracy, biocuration is often KEY
  • 7. Overview
      o Community-based → Community-aided gene/genome annotation (1997 – present: Pseudomonas Genome Project and PseudoCAP, the Pseudomonas Community Annotation Project)
      o Community-aided → Multiple community-aided contextual curation of molecular interactions (2006 – present: InnateDB project)
      o What we're doing next…
      o Funding it all!
  • 8. Pseudomonas Community Annotation Project
      o Goals: Critical and conservative genome annotation; minimize project costs; capitalize on large Pseudomonas aeruginosa research community
      o Solution: Community-based, Internet-based approach for (continually updated) genome annotation. "Crowdsourcing" in the 90's!
  • 9. Pseudomonas Community Annotation Project. Initial PseudoCAP leading to genome publication (1997 – 2000): 61 researchers from 13 countries, 1741 annotations. Focus on conservative annotation. Need to capture researchers' excellent, diverse biological knowledge, NOT their diverse ways of annotating!
  • 10. Pseudomonas Community Annotation Project. Initial PseudoCAP leading to genome publication (1997 – 2000): Initial 1741 community-based annotations… Annotations incorporated by 3 annotators through web-based tool. 1st fully internet-based community annotation effort
  • 11. Pseudomonas Community Annotation Project. Current PseudoCAP – continually updated annotation (2000 – present): 151 researchers, 2356 curated gene annotations (not incl. computational analyses). Movement from gene-based → genes plus other genome features (2,590 other genome features added in the last year alone). Found we needed to further modify our community-based approach… Winsor et al 2011 PMID: 20929876; Winsor et al 2005 PMID: 15608211
  • 12. Pseudomonas Community Annotation Project. Current PseudoCAP – continually updated annotation (2000 – present): Annotations incorporated by one part time project coordinator. Subject to review process (peer reviewed paper or other peer review). Increasing movement from Community-based → Community-aided:
      o Coordinator contacts researchers more to get input
      o Capitalize on expertise most efficiently
      o Coordinator ensures consistency
      o Coordinator and community collectively ensure quality
  • 13. Pseudomonas Community Annotation Project: Challenges and Solutions
      o Disputes between researchers regarding an annotation: go with first published and have alternate annotations
      o Researchers are busy! Keep submission system/input process simple! We now contact them more than they contact us. Have rounds of major annotation pushes
      o Future: Will try again the "paper carrot" for another annotation push – authorship on a NAR update paper (as a consortium) to encourage participation
  • 14. InnateDB: Curating molecular interactions, networks. Community-aided → "Multiple community-aided". Highly contextual annotation
  • 15. InnateDB Developed to Aid Two Large International Systems Biology Projects. Mouse Model Datasets: Cerebral Malaria mouse model (IMR, Australia); Tuberculosis mouse model (AECM); Shigella xenograft model (Pasteur). Human Clinical Datasets: Typhoid & Malaria Vietnam (OUCRU/Stanford/Sanger); Non Typhoidal Salmonella Malawi (Sanger) + Chronic/Acute Helminth Ecuador (USF de Quito/Sanger); Dengue (OUCRU). Modulating innate immune response via Host Defense peptides (Hancock lab, UBC). Mouse KOs (Sanger). Novel insight into host response and mechanism of peptides. Common pathways, networks and transcriptional regulation. Thompson et al PNAS December 2009
  • 16. Systems Biology & The Innate Immune Response:
      o Many layers of complexity.
      o Layers of regulation: transcriptional; post-transcriptional (miRNAs); post-translational (ubiquitination, phosphorylation)
      o Host-pathogen interactions
      o 100s – 1000s DE genes
      o Not simple pathways - networks of molecular interactions
      Gardy*, Lynn*, Brinkman, Hancock (2009). Enabling a systems biology approach to immunology: focus on innate immunity. Trends in Immunology PMID: 19428301
  • 17. Breuer et al., 2013 InnateDB: systems biology of innate immunity and beyond… NAR (DB issue) PMID: 23180781
  • 18. Manual Curation of Interaction Data From Literature to Database Greatly Enhances Coverage of Innate Immunity Interactome. [Figure: InnateDB curated interactome; legend: interactions only curated by InnateDB vs. interactions also curated by top 5 other interaction databases: BIND, INTACT, DIP, BIOGRID & MINT.] Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
  • 19. Manual Curation of Interaction Data From Literature to Database – Enhancing coverage of Innate Immunity Interactome. The InnateDB curated interactome in July 2012. Red edges represent interactions that have been added in 2011 and 2012. Breuer et al., 2013 InnateDB: systems biology of innate immunity and beyond… Nucleic Acids Research (Database issue)
  • 20. Contextually Curating Innate Immunity-Relevant Interactions. Annotated fields include: molecule type; organism; biological role; interaction detection method; the host system (in vitro, in vivo, ex vivo); host organism; interaction type; cell, cell-line and tissue types; cell status (primary/cell line); experimental role; participant identification method and sub-cellular localization, plus variety of additional curator comments.
  • 21. Curating Innate Immunity-Relevant Interactions
      o 71% human, 22% mouse, 7% human-mouse
      o ~80% interactions in innate immunity interactome not annotated by other major databases
      o Protein (69%), DNA and RNA interactions
      o Developed InnateDB submission system software to allow submission of interaction annotation in an OBO ontology-controlled and MIMIx & PSI-MI 2.5 compliant manner.
      Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
  • 22. Which journals are curated?
      o >4,400 journal articles curated to date
      o Don't focus on specific journals - relevant articles curated if they meet appropriate quality standards for the interaction evidence.
      o Indeed, at least one protein has been curated from >200 different journals.
      o More than 70% of curated articles have come from 20 journals.
      o Note many journals in top 20 are not "immunology journals", underscoring importance of not limiting curation efforts to journals perceived as "relevant".
  • 23. Curating Innate Immunity-Relevant Interactions – 4-pronged approach
      o Curation primarily pathway-centric: systematically review all literature describing interactions for a particular innate immunity pathway. Curate all other interactors regardless of whether the interacting molecule is a member of the pathway or has any known role in innate immunity; this expands the network outside of known innate immunity players. Systematically curated pathways are scheduled for frequent re-curation as the field is moving quickly.
      o Also, new publications on innate immunity assessed on a daily basis to identify novel interactions of interest. Priority given to the most recent publications; this incorporates new information on the most current research
      o Immunology Community-aided: Curators consult with researchers to confirm unclear literature data. Most common issue: Unclear what species the protein/DNA/RNA interactors come from
      o Curation Community-aided: InnateDB curators review each other's curations as an error check. IMEx consortium!
      http://www.innatedb.com/doc/InnateDB_2010_curation_guide.pdf
  • 24. InnateDB is a member of IMEx – an international consortium of interaction databases involved in curation
      o Goal: Develop common standards, avoid too much redundancy in data collection/curation, central registry, single search interface
      o Orchard et al Nature Methods 9:345-350 PMID: 22453911
      o Stay tuned for Sandra Orchard's talk!
  • 25. Going Beyond Innate Immunity – An Integrative Biology Resource
      o >196,000 human and mouse interactions extracted & loaded from BIND, INTACT, DIP, BIOGRID & MINT DBs
      o Cross-referenced genes to >3,000 pathways from KEGG, PID, BIOCARTA, INOH, NetPath & Reactome DBs: visualize/analyze interactions associated with specific pathway; pathway over-representation analysis
      o Ensembl annotation provides details of all human & mouse genes/transcripts/proteins. UniProt, Entrez, Gene Ontology, etc: rich protein & gene annotation
      o Transcription factor–DNA interactions experimentally confirmed from Transfac, TransCompel
      o Robust orthology & gene synteny analysis facilitate human-mouse comparisons
  • 26. InnateDB – Advanced Yet User-Friendly Searching – Find & Analyze Relevant Interactions, Pathways & Genes/Proteins.
  • 27. InnateDB – Facilitating Systems-Level Analyses of Gene Expression Data
      o Upload Your Own Gene Expression Data - Up to 10 conditions/timepoints at 1 time.
      o Overlay Gene Expression Data from Multiple Conditions on Networks/Pathways
      o Pathway, Gene Ontology & TF ORA tools: Find DE Pathways/Functionally Related Genes/TFs
      o Go Beyond Pathway Analysis – Differentially Expressed Sub-networks – New Pathways? How Are DE Genes Actually Inter-connected? Central Regulators (Network Hubs)
  • 28. InnateDB and curated data aided study of an immune modulator – host-directed adjunctive therapy coupled with anti-malarial
  • 29. What we're doing next…
      o Need to develop more ontologies and data standards to integrate microbial genomic data from a disease outbreak with epidemiological data.
      o Curating pathogen status for complete microbial genomes
      o Will try the "paper carrot" again for next Pseudomonas Genome Database curation project
      o InnateDB – expanding to Allergy and Asthma
  • 30. Identify genes unique to/shared between strains, species, genera, any selected bacteria…
  • 31. Funding! Grants! One of the biggest challenges is to secure long term, reliable funding. We've found: Need to target curation to specific bio projects (i.e. innate immunity, then to allergy and asthma; aiding a specific Pseudomonas analysis). Limits what we can do, but good in the sense that curation benefits are more quickly felt as they are needed/used by others
  • 32. Concluding comments
      o Using community-aided, expert curator-centered, approach for balancing consistency, reliability and maximizing knowledge. Degree of community involvement depends on nature of data.
      o Capitalize on both bio community and curation community – keep linked
      o Researchers are busy! Make it super easy for them to provide input. A little contribution can go a long way
      o Paper carrots!
      o Link curation to bio research to secure funding
      o Indoctrinate young minds! Get biocuration and its challenges into undergrad curriculums
  • 33. Acknowledgements - InnateDB (www.innatedb.com)
      o InnateDB Principal Investigators: Fiona Brinkman (SFU), Bob Hancock (UBC), David Lynn (Teagasc)
      o InnateDB Curation, InnateDB Development and Cerebral network visualizer: Raymond Lo, Anastasia Sribnaia, Carol Chan, Misbah Naseer, Karin Breuer, Melissa Yau, Geoff Winsor, Giselle Ring, Matthew Laird, Kathleen Wee, Calvin Chan, Jaimmie Que, Amir Foroushani, Brian Meredith, Nathan Lawless, Nicolas Richard, Avinash Chikatamarla, Aaron Barsky, Jennifer Gardy, Fiona Roche, Tamara Munzner, Timothy Chan, Naisha Shah, Michael Acab
      o FNIH/GCGH Collaborators: Gordon Dougan (Sanger), Fernanda Schreiber (Sanger), Melita Gordon (U. Liverpool), Bill Jacobs (AECM), Dee Dao (AECM), Philip Cooper (St. Georges), Louis Schofield (WEHI), Sandra Pilat (WEHI), Sarah Dunstan (OUCRU), Brett Finlay (UBC)
  • 34. Acknowledgements – PseudoCAP: Geoff Winsor, Ray Lo, Matt Laird, Bhav Dhillon, Matthew Whiteside, 151 PseudoCAP participants. www.pseudomonas.com
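
Slide 6 states a preference for high precision at the expense of recall. As a minimal sketch of that trade-off, precision and recall of a set of annotations against a trusted reference set could be computed as below. The gene identifiers and both annotation sets are made up for illustration and do not come from the talk.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of predicted items against a reference set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    # Precision: fraction of predictions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of the reference set that was recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical reference annotations and two hypothetical annotation strategies.
truth = {"geneA", "geneB", "geneC", "geneD"}
conservative = {"geneA", "geneB"}  # few calls, all correct
permissive = {"geneA", "geneB", "geneC", "geneX", "geneY", "geneZ"}  # more calls, more errors

print("conservative:", precision_recall(conservative, truth))
print("permissive:  ", precision_recall(permissive, truth))
```

In this toy case the conservative set makes no wrong calls but misses half of the reference set, which is the kind of compromise the slide describes.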

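
Slides 25 and 27 mention pathway over-representation analysis (ORA). The slides do not say which statistic InnateDB uses; one common choice is a one-sided hypergeometric test, sketched here under that assumption. The function name, its parameters and the example counts are hypothetical.

```python
from math import comb


def ora_p_value(n_background, n_in_pathway, n_de, n_de_in_pathway):
    """Upper-tail hypergeometric probability P(X >= n_de_in_pathway).

    n_background    -- genes in the background set
    n_in_pathway    -- background genes annotated to the pathway
    n_de            -- differentially expressed (DE) genes
    n_de_in_pathway -- DE genes that are annotated to the pathway
    """
    max_overlap = min(n_in_pathway, n_de)
    total = comb(n_background, n_de)
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_de - k)
        for k in range(n_de_in_pathway, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call.
p = ora_p_value(n_background=20000, n_in_pathway=100, n_de=500, n_de_in_pathway=10)
print(f"p = {p:.3g}")
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free. A real analysis would normally also correct for testing many pathways at once.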