Biocuration 2013 - Fiona Brinkman - From genes, to genomes to networks, with community aided curation

Biocuration 2013 conference talk - April 9, 2013


  1. From genes, to genomes to networks, with community aided curation. Fiona Brinkman, Simon Fraser University. Biocuration conference, April 2013.
  2. From genes, to genomes to networks, with community aided curation: with a little help from my friends…
  3. My Primary Research Interest: Developing more sustainable approaches for infectious disease control, using novel computational tools, integrated data and interdisciplinary approaches. Targeting major players resulting in infectious disease. Pathogen virulence: identify anti-infectives (don't kill the pathogens, disarm them). Host immune system failure/over-activity: immune modulators that dampen damaging inflammation and boost "good" immune response. Changes in environment/social factors: integrating pathogen genome data with environment, microbiome, and social network data to better identify the source/cause of disease outbreaks.
  4. Some of our lab's tools… Pathogen virulence: PSORTb (protein localization analysis; identify cell surface/secreted drug targets); IslandViewer (genomic island analysis, pathogen-associated genes); Ortholuge DB (precomputed assessments of bacterial orthologs); genera-specific DBs like the Pseudomonas Genome Database. Host immune system failure/over-activity: InnateDB (human/mouse interactome plus curated innate immunity-associated interactions). Changes in environment/social factors: metagenomics projects; Integrated Rapid Infectious Disease Analysis Pipeline (IRIDA).
  5. Some of our lab's tools… Pathogen virulence: PSORTb (protein localization analysis; identify cell surface/secreted drug targets); IslandViewer (genomic island analysis, pathogen-associated genes); Ortholuge DB (precomputed assessments of bacterial orthologs); genera-specific DBs like the Pseudomonas Genome Database. Host immune system failure/over-activity: InnateDB (human/mouse interactome plus curated innate immunity-associated interactions). Changes in environment/social factors: metagenomics projects; Integrated Rapid Infectious Disease Analysis Pipeline (IRIDA).
  6. Research Philosophy: High quality analyses are only as good as the robust data, effective data organization and accurate analysis methods used. [Diagram: "The Nexus" of robust data, data organization and accurate analysis methods.] Want high accuracy, usually erring on the side of high precision at the expense of recall. To attain high accuracy, biocuration is often KEY.
  7. Overview. Community-based to community-aided gene/genome annotation (1997–present: Pseudomonas Genome Project and PseudoCAP, the Pseudomonas Community Annotation Project). Community-aided to multiple community-aided contextual curation of molecular interactions (2006–present: InnateDB project). What we're doing next… Funding it all!
  8. Pseudomonas Community Annotation Project. Goals: critical and conservative genome annotation; minimize project costs; capitalize on the large Pseudomonas aeruginosa research community. Solution: community-based, Internet-based approach for (continually updated) genome annotation. "Crowdsourcing" in the 90's!
  9. Pseudomonas Community Annotation Project. Initial PseudoCAP leading to genome publication (1997–2000): 61 researchers from 13 countries, 1741 annotations. Focus on conservative annotation. Need to capture researchers' excellent, diverse biological knowledge, NOT their diverse ways of annotating!
  10. Pseudomonas Community Annotation Project. Initial PseudoCAP leading to genome publication (1997–2000): initial 1741 community-based annotations… Annotations incorporated by 3 annotators through a web-based tool. 1st fully internet-based community annotation effort.
  11. Pseudomonas Community Annotation Project. Current PseudoCAP: continually updated annotation (2000–present). 151 researchers, 2356 curated gene annotations (not incl. computational analyses). Movement from gene-based to genes plus other genome features (2,590 other genome features added in the last year alone). Found we needed to further modify our community-based approach… Winsor et al. 2011 PMID: 20929876; Winsor et al. 2005 PMID: 15608211
  12. Pseudomonas Community Annotation Project. Current PseudoCAP: continually updated annotation (2000–present). Annotations incorporated by one part-time project coordinator. Subject to review process (peer reviewed paper or other peer review). Increasing movement from community-based to community-aided: coordinator contacts researchers more to get input; capitalize on expertise most efficiently; coordinator ensures consistency. Coordinator and community collectively ensure quality.
  13. Pseudomonas Community Annotation Project. Challenges and Solutions. Disputes between researchers regarding an annotation: go with first published and have alternate annotations. Researchers are busy! Keep submission system/input process simple! We now contact them more than they contact us. Have rounds of major annotation pushes. Future: will try again the "paper carrot" for another annotation push: authorship on a NAR update paper (as a consortium) to encourage participation.
  14. InnateDB: Curating molecular interactions, networks. Community-aided to "multiple community-aided". Highly contextual annotation.
  15. InnateDB Developed to Aid Two Large International Systems Biology Projects. Mouse model datasets: cerebral malaria mouse model (IMR, Australia); tuberculosis mouse model (AECM); Shigella xenograft model (Pasteur). Human clinical datasets: typhoid & malaria, Vietnam (OUCRU/Stanford/Sanger); non-typhoidal Salmonella, Malawi (Sanger); chronic/acute helminth, Ecuador (USF de Quito/Sanger); dengue (OUCRU). Modulating innate immune response via host defense peptides (Hancock lab, UBC). Mouse KOs (Sanger). Novel insight into host response and mechanism of peptides. Common pathways, networks and transcriptional regulation. Thompson et al., PNAS, December 2009
  16. Systems Biology & The Innate Immune Response: Many layers of complexity. Layers of regulation: transcriptional; post-transcriptional (miRNAs); post-translational (ubiquitination, phosphorylation). Host-pathogen interactions. 100s to 1000s of differentially expressed (DE) genes. Not simple pathways: networks of molecular interactions. Gardy*, Lynn*, Brinkman, Hancock (2009). Enabling a systems biology approach to immunology: focus on innate immunity. Trends in Immunology PMID: 19428301
  17. Breuer et al., 2013. InnateDB: systems biology of innate immunity and beyond… NAR (Database issue) PMID: 23180781
  18. Manual Curation of Interaction Data From Literature to Database Greatly Enhances Coverage of Innate Immunity Interactome. [Figure: the InnateDB curated interactome, distinguishing interactions curated only by InnateDB from interactions also curated by the top 5 other interaction databases: BIND, IntAct, DIP, BioGRID & MINT.] Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
  19. Manual Curation of Interaction Data From Literature to Database: Enhancing coverage of Innate Immunity Interactome. [Figure: the InnateDB curated interactome in July 2012. Red edges represent interactions that have been added in 2011 and 2012.] Breuer et al., 2013. InnateDB: systems biology of innate immunity and beyond… Nucleic Acids Research (Database issue)
  20. Contextually Curating Innate Immunity-Relevant Interactions. Annotated fields include: molecule type; organism; biological role; interaction detection method; the host system (in vitro, in vivo, ex vivo); host organism; interaction type; cell, cell-line and tissue types; cell status (primary/cell line); experimental role; participant identification method and sub-cellular localization, plus a variety of additional curator comments.
  21. Curating Innate Immunity-Relevant Interactions. 71% human, 22% mouse, 7% human-mouse. ~80% of interactions in the innate immunity interactome are not annotated by other major databases. Protein (69%), DNA and RNA interactions. Developed InnateDB submission system software to allow submission of interaction annotation in an OBO ontology-controlled and MIMIx & PSI-MI 2.5 compliant manner. Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
  22. Which journals are curated? >4,400 journal articles curated to date. We don't focus on specific journals: relevant articles are curated if they meet appropriate quality standards for the interaction evidence. Indeed, at least one protein has been curated from >200 different journals. More than 70% of curated articles have come from 20 journals. Note that many journals in the top 20 are not "immunology journals", underscoring the importance of not limiting curation efforts to journals perceived as "relevant".
  23. Curating Innate Immunity-Relevant Interactions: 4-pronged approach. (1) Curation is primarily pathway-centric: systematically review all literature describing interactions for a particular innate immunity pathway. Curate all other interactors regardless of whether the interacting molecule is a member of the pathway or has any known role in innate immunity; this expands the network outside of known innate immunity players. Systematically curated pathways are scheduled for frequent re-curation as the field is moving quickly. (2) New publications on innate immunity are also assessed on a daily basis to identify novel interactions of interest. Priority is given to the most recent publications, which incorporates new information on the most current research. (3) Immunology community-aided: curators consult with researchers to confirm unclear literature data. Most common issue: unclear what species the protein/DNA/RNA interactors come from. (4) Curation community-aided: InnateDB curators review each other's curations as an error check. IMEx consortium! http://www.innatedb.com/doc/InnateDB_2010_curation_guide.pdf
  24. InnateDB is a member of IMEx, an international consortium of interaction databases involved in curation. Goal: develop common standards, avoid too much redundancy in data collection/curation, central registry, single search interface. Orchard et al., Nature Methods 9:345-350 PMID: 22453911. Stay tuned for Sandra Orchard's talk!
  25. Going Beyond Innate Immunity: An Integrative Biology Resource. >196,000 human and mouse interactions extracted & loaded from the BIND, IntAct, DIP, BioGRID & MINT DBs. Genes cross-referenced to >3,000 pathways from the KEGG, PID, BioCarta, INOH, NetPath & Reactome DBs: visualize/analyze interactions associated with a specific pathway; pathway over-representation analysis. Ensembl annotation provides details of all human & mouse genes/transcripts/proteins. UniProt, Entrez, Gene Ontology, etc. provide rich protein & gene annotation. Experimentally confirmed transcription factor-DNA interactions from TRANSFAC and TransCompel. Robust orthology & gene synteny analysis facilitate human-mouse comparisons.
  26. InnateDB: Advanced Yet User-Friendly Searching. Find & Analyze Relevant Interactions, Pathways & Genes/Proteins.
  27. InnateDB: Facilitating Systems-Level Analyses of Gene Expression Data. Upload your own gene expression data: up to 10 conditions/timepoints at one time. Overlay gene expression data from multiple conditions on networks/pathways. Pathway, Gene Ontology & TF ORA tools: find DE pathways/functionally related genes/TFs. Go beyond pathway analysis: differentially expressed sub-networks (new pathways?). How are DE genes actually inter-connected? Central regulators (network hubs).
  28. InnateDB and curated data aided study of an immune modulator: host-directed adjunctive therapy coupled with anti-malarial
  29. What we're doing next… Need to develop more ontologies and data standards to integrate microbial genomic data from a disease outbreak with epidemiological data. Curating pathogen status for complete microbial genomes. Will try the "paper carrot" again for the next Pseudomonas Genome Database curation project. InnateDB: expanding to allergy and asthma.
  30. Identify genes unique to/shared between strains, species, genera, any selected bacteria…
  31. Funding! Grants! One of the biggest challenges is to secure long term, reliable funding. We've found: need to target curation to specific bio projects (i.e. innate immunity, then allergy and asthma; aiding a specific Pseudomonas analysis). This limits what we can do, but is good in the sense that curation benefits are more quickly felt as they are needed/used by others.
  32. Concluding comments. Using a community-aided, expert curator-centered approach for balancing consistency and reliability and maximizing knowledge. Degree of community involvement depends on the nature of the data. Capitalize on both the bio community and the curation community, and keep them linked. Researchers are busy! Make it super easy for them to provide input. A little contribution can go a long way. Paper carrots! Link curation to bio research to secure funding. Indoctrinate young minds! Get biocuration and its challenges into undergrad curricula.
  33. Acknowledgements - InnateDB (www.innatedb.com). InnateDB Principal Investigators: Fiona Brinkman (SFU), Bob Hancock (UBC), David Lynn (Teagasc). InnateDB curation, InnateDB development and Cerebral network visualizer teams: Raymond Lo, Anastasia Sribnaia, Carol Chan, Misbah Naseer, Karin Breuer, Melissa Yau, Geoff Winsor, Giselle Ring, Matthew Laird, Kathleen Wee, Calvin Chan, Jaimmie Que, Amir Foroushani, Brian Meredith, Nathan Lawless, Nicolas Richard, Avinash Chikatamarla, Aaron Barsky, Jennifer Gardy, Fiona Roche, Tamara Munzner, Timothy Chan, Naisha Shah, Michael Acab. FNIH/GCGH Collaborators: Gordon Dougan (Sanger), Fernanda Schreiber (Sanger), Melita Gordon (U. Liverpool), Bill Jacobs (AECM), Dee Dao (AECM), Philip Cooper (St. Georges), Louis Schofield (WEHI), Sandra Pilat (WEHI), Sarah Dunstan (OUCRU), Brett Finlay (UBC)
  34. Acknowledgements - PseudoCAP. Geoff Winsor, Ray Lo, Matt Laird, Bhav Dhillon, Matthew Whiteside, 151 PseudoCAP participants. www.pseudomonas.com
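
Slide 6 states a preference for high precision at the expense of recall. As a minimal illustrative sketch (not from the talk; the gene identifiers and annotation sets below are hypothetical), the trade-off can be computed for a conservative and a permissive set of annotations against a reference set:

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of a predicted set against a reference set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    # Precision: fraction of the predictions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of the reference set that was recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical reference annotations and two hypothetical annotation strategies.
truth = {"geneA", "geneB", "geneC", "geneD"}
conservative = {"geneA", "geneB"}
permissive = {"geneA", "geneB", "geneC", "geneX", "geneY", "geneZ"}

conservative_scores = precision_recall(conservative, truth)
permissive_scores = precision_recall(permissive, truth)
print("conservative:", conservative_scores)
print("permissive:", permissive_scores)
```

In this toy case the conservative set makes no wrong calls but misses half of the reference set, which is the kind of trade-off the slide describes accepting.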
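
Slides 25 and 27 mention pathway over-representation analysis (ORA). The slides do not say which statistic InnateDB uses; the sketch below assumes a one-sided hypergeometric test, which is a common choice for ORA. The function name, parameter names and all gene counts are made up for illustration.

```python
from math import comb


def ora_p_value(population, pathway_size, de_genes, de_in_pathway):
    """One-sided hypergeometric p-value, P(X >= de_in_pathway).

    population    : total number of genes considered (assumed background)
    pathway_size  : number of those genes annotated to the pathway
    de_genes      : number of differentially expressed (DE) genes
    de_in_pathway : number of DE genes that fall in the pathway
    """
    max_overlap = min(pathway_size, de_genes)
    total = comb(population, de_genes)
    # Sum the probabilities of seeing the observed overlap or a larger one.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, de_genes - k)
        for k in range(de_in_pathway, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 100-gene pathway,
# 500 DE genes, 10 of which are in the pathway.
p_example = ora_p_value(20000, 100, 500, 10)
print(f"p = {p_example:.3g}")
```

A real analysis would also correct for testing many pathways at once; that step is left out here to keep the sketch short.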
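
Slide 30 refers to identifying genes unique to, or shared between, selected strains, species or genera. The talk does not describe how this is implemented; assuming each genome has already been reduced to a set of ortholog group identifiers, the comparison itself can be sketched with plain set operations. The strain names and identifiers below are hypothetical.

```python
def shared_and_unique(genomes):
    """Return (shared, unique) for a dict of genome name -> ortholog group IDs.

    shared : IDs present in every genome
    unique : dict mapping each genome name to the IDs found only in that genome
    """
    gene_sets = {name: set(genes) for name, genes in genomes.items()}
    shared = set.intersection(*gene_sets.values()) if gene_sets else set()
    unique = {}
    for name, genes in gene_sets.items():
        # Everything seen in any other genome.
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        unique[name] = genes - others
    return shared, unique


# Hypothetical strains and ortholog group identifiers.
genomes = {
    "strain_1": {"og1", "og2", "og3"},
    "strain_2": {"og1", "og2", "og4"},
    "strain_3": {"og1", "og5"},
}
shared, unique = shared_and_unique(genomes)
print("shared:", sorted(shared))
print("unique:", {name: sorted(ids) for name, ids in unique.items()})
```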
