From genes, to genomes to networks,
with community aided curation

Fiona Brinkman
Simon Fraser University

Biocuration conference
April 2013

… with a little help from my friends …

My Primary Research Interest

Developing more sustainable approaches for infectious disease control
…using novel computational tools, integrated data and interdisciplinary approaches

Targeting major players resulting in infectious disease:

o Pathogen virulence
  - ID anti-infectives (don’t kill the pathogen, disarm them)

o Host immune system failure/over-activity
  - Immune modulators that dampen damaging inflammation and boost “good” immune response

o Changes in environment/social factors
  - Integrating pathogen genome data with environment, microbiome, and social network data
  - Better identify source/cause of disease outbreaks

Some of our lab's tools…

o Pathogen virulence
  - PSORTb – Protein localization analysis (ID cell surface/secreted drug targets)
  - IslandViewer – Genomic island analysis, pathogen-associated genes
  - Ortholuge DB – Precomputed assessments of bacterial orthologs
  - Genera-specific DBs like Pseudomonas Genome Database

o Host immune system failure/over-activity
  - InnateDB – Human/Mouse interactome + curated innate immunity-associated interactions

o Changes in environment/social factors
  - Metagenomics projects
  - Integrated Rapid Infectious Disease Analysis Pipeline (IRIDA)

Research Philosophy

High quality analyses are only as good as the robust data, effective data organization and accurate analysis methods used.

[Diagram: "The Nexus", linking Robust data, Data organization and Accurate analysis methods]

Want high accuracy – usually erring on the side of high precision at the expense of recall.

To attain high accuracy, biocuration is often KEY

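The precision/recall trade-off mentioned on this slide can be made concrete with a small sketch. The gene names and the two annotation strategies below are invented for illustration and are not taken from any of the lab's tools; the point is only that a conservative strategy makes fewer calls but gets more of them right.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for predicted positives against known positives."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    # Precision: what fraction of the calls we made are correct?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: what fraction of the real positives did we find?
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy data: genes that truly belong to some category.
truth = {"geneA", "geneB", "geneC", "geneD"}
# A conservative annotator makes few calls, all of them correct.
conservative = {"geneA", "geneB"}
# A permissive annotator finds more true genes but adds wrong ones too.
permissive = {"geneA", "geneB", "geneC", "geneX", "geneY", "geneZ"}

print(precision_recall(conservative, truth))  # (1.0, 0.5)
print(precision_recall(permissive, truth))    # (0.5, 0.75)
```

Here the conservative strategy gives up recall (0.5 versus 0.75) in exchange for perfect precision, which is the direction of error the slide says is usually preferred.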
Overview

• Community-based → Community-aided gene/genome annotation
    • 1997 – present: Pseudomonas Genome Project and PseudoCAP (Pseudomonas Community Annotation Project)

• Community-aided → Multiple community-aided contextual curation of molecular interactions
    • 2006 – present: InnateDB project

• What we’re doing next…

• Funding it all!

Pseudomonas Community Annotation Project

Goals

Critical and conservative genome annotation
Minimize project costs
Capitalize on large Pseudomonas aeruginosa research community

Solution

Community-based, Internet-based approach for (continually updated) genome annotation

“Crowdsourcing” in the 90’s!

Pseudomonas Community Annotation Project
Initial PseudoCAP leading to genome publication (1997 – 2000)

61 researchers from 13 countries, 1741 annotations

Focus on conservative annotation

Need to capture researchers’ excellent, diverse biological knowledge, NOT their diverse ways of annotating!

Pseudomonas Community Annotation Project
Initial PseudoCAP leading to genome publication (1997 – 2000)

Initial 1741 community-based annotations…
Annotations incorporated by 3 annotators through web-based tool
1st fully internet-based community annotation effort

Pseudomonas Community Annotation Project
Current PseudoCAP – continually updated annotation (2000 – present)

151 researchers, 2356 curated gene annotations (not incl. computational analyses)

Movement from gene-based → genes plus other genome features
(2,590 other genome features added in the last year alone)

Found we needed to further modify our community-based approach…

Winsor et al 2011 PMID: 20929876
Winsor et al 2005 PMID: 15608211

Pseudomonas Community Annotation Project
Current PseudoCAP – continually updated annotation (2000 – present)

Annotations incorporated by one part-time project coordinator
Subject to review process (peer reviewed paper or other peer review)

Increasing movement from Community-based → Community-aided
- Coordinator contacts researchers more to get input
- Capitalize on expertise most efficiently
- Coordinator ensures consistency

Coordinator and community collectively ensure quality

Pseudomonas Community Annotation Project
Challenges and Solutions

- Disputes between researchers regarding an annotation
    - Go with first published and have alternate annotations

- Researchers are busy!
    - Keep submission system/input process simple!
    - We now contact them more than they contact us
    - Have rounds of major annotation pushes

Future: Will try again the “paper carrot” for another annotation push – authorship on a NAR update paper (as a consortium) to encourage participation

InnateDB:
Curating molecular interactions, networks

Community-aided → “Multiple community-aided”
Highly contextual annotation

InnateDB Developed to Aid Two Large International Systems Biology Projects

Mouse Model Datasets:
    Cerebral Malaria mouse model (IMR, Australia)
    Tuberculosis mouse model (AECM)
    Shigella xenograft model (Pasteur)
Human Clinical Datasets:
    Typhoid & Malaria Vietnam (OUCRU/Stanford/Sanger)
    Non Typhoidal Salmonella Malawi (Sanger)
    Chronic/Acute Helminth Ecuador (USF de Quito/Sanger)
    Dengue (OUCRU)

+

Modulating innate immune response via Host Defense peptides (Hancock lab, UBC)
Mouse KOs (Sanger)

Novel insight into host response and mechanism of peptides.
Common Pathways, networks and transcriptional regulation.

Thompson et al PNAS December 2009

Systems Biology & The Innate
     Immune Response:

    Many layers of complexity.

    Layers of regulation:
     transcriptional;
     post-transcriptional (miRNAs);
     post-translational (ubiquitination,
     phosphorylation)

    Host-pathogen interactions

    100s – 1000s DE genes

    Not simple pathways - networks
     of molecular interactions

     Gardy*, Lynn*, Brinkman,
     Hancock (2009). Enabling a
     systems biology approach to
     immunology: focus on innate
     immunity. Trends in Immunology
     PMID: 19428301
Breuer et al., 2013 InnateDB: systems biology of innate immunity and beyond… NAR (DB issue) PMID: 23180781
Manual Curation of Interaction Data From Literature to Database Greatly Enhances Coverage of Innate Immunity Interactome

[Figure: the InnateDB curated interactome, distinguishing interactions only curated by InnateDB from interactions also curated by the top 5 other interaction databases: BIND, INTACT, DIP, BIOGRID & MINT]

Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
Manual Curation of Interaction Data From Literature to Database – Enhancing coverage of Innate Immunity Interactome

[Figure: The InnateDB curated interactome in July 2012. Red edges represent interactions that have been added in 2011 and 2012.]

Breuer et al., 2013 InnateDB: systems biology of innate immunity and beyond… Nucleic Acids Research (Database issue)
Contextually Curating Innate Immunity-Relevant Interactions

Annotated fields include:

Molecule type; organism; biological role; interaction detection method; the host system (in vitro, in vivo, ex vivo); host organism; interaction type; cell, cell-line and tissue types; cell status (primary/cell line); experimental role; participant identification method and sub-cellular localization, plus a variety of additional curator comments.
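
To show what a contextual annotation might look like as a record, here is a minimal sketch built from the field list above. The class names, attribute names and example values are hypothetical and are not InnateDB's actual schema or submission format; only the field concepts and the allowed values for host system and cell status come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed values taken from the slide; everything else is free text in this sketch.
HOST_SYSTEMS = {"in vitro", "in vivo", "ex vivo"}
CELL_STATUSES = {"primary", "cell line"}


@dataclass
class Participant:
    """One molecule taking part in a curated interaction (hypothetical layout)."""
    identifier: str
    molecule_type: str          # e.g. protein, DNA, RNA
    organism: str
    biological_role: str
    experimental_role: str
    identification_method: str


@dataclass
class CuratedInteraction:
    """A contextual interaction record (hypothetical layout)."""
    participants: List[Participant]
    interaction_type: str
    detection_method: str
    host_system: str
    host_organism: str
    cell_status: Optional[str] = None
    cell_type: Optional[str] = None
    cell_line: Optional[str] = None
    tissue_type: Optional[str] = None
    subcellular_localization: Optional[str] = None
    curator_comments: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject records whose context falls outside the controlled values.
        if not self.participants:
            raise ValueError("an interaction needs at least one participant")
        if self.host_system not in HOST_SYSTEMS:
            raise ValueError(f"unknown host system: {self.host_system!r}")
        if self.cell_status is not None and self.cell_status not in CELL_STATUSES:
            raise ValueError(f"unknown cell status: {self.cell_status!r}")


# Illustrative values only, not a real curated record.
bait = Participant("P1", "protein", "Homo sapiens", "unspecified", "bait", "western blot")
prey = Participant("P2", "protein", "Homo sapiens", "unspecified", "prey", "western blot")
record = CuratedInteraction(
    participants=[bait, prey],
    interaction_type="physical association",
    detection_method="coimmunoprecipitation",
    host_system="in vitro",
    host_organism="Homo sapiens",
    cell_status="cell line",
)
```

Validating the controlled fields when the record is created is one way to enforce the consistency that the curation process is described as aiming for.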
Curating Innate Immunity-Relevant Interactions

    71% human, 22% mouse, 7% human-
     mouse

    ~80% interactions in innate immunity
     interactome not annotated by other major
     databases

    Protein (69%), DNA and RNA interactions

    Developed InnateDB submission system
     software to allow submission of interaction
     annotation in an OBO ontology-controlled
     and MIMIx & PSI-MI 2.5 compliant manner.




       Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
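
A figure like the "~80% not annotated by other major databases" statistic can be computed as a set difference once interactions are represented in a comparable way. The sketch below is an assumption about how such a comparison could be done, not the method used in the cited paper; the function names and the toy interaction lists are invented for illustration.

```python
def normalize(pair):
    """Treat an interaction as unordered, so (A, B) and (B, A) compare equal."""
    return frozenset(pair)


def fraction_unique(own, others):
    """Fraction of interactions in `own` that appear in none of the `others` lists."""
    own_set = {normalize(p) for p in own}
    other_set = {normalize(p) for database in others for p in database}
    if not own_set:
        return 0.0
    return len(own_set - other_set) / len(own_set)


# Toy data: five interactions in one resource, one of which another database also holds.
own_interactions = [
    ("TLR4", "MYD88"),
    ("MYD88", "IRAK4"),
    ("IRAK4", "IRAK1"),
    ("TRAF6", "IRAK1"),
    ("TLR4", "LY96"),
]
other_databases = [[("MYD88", "TLR4")]]  # same pair, listed in the opposite order

print(fraction_unique(own_interactions, other_databases))  # 0.8
```

A real comparison would also need to map identifiers between databases before comparing pairs, which this sketch leaves out.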
Which journals are curated?
    >4,400 journal articles curated to date

    Don’t focus on specific journals - relevant articles curated if meet appropriate
     quality standards for the interaction evidence.

    Indeed, at least one protein has been curated from >200 different journals.

    More than 70% of curated articles have come from 20 journals.

    Note many journals in top 20 are not “immunology journals”, underscoring
     importance of not limiting curation efforts to journals perceived as “relevant”.
Curating Innate Immunity-Relevant Interactions
                           – 4-pronged approach
    Curation → primarily pathway-centric
        Systematically review all literature describing interactions for a particular innate
         immunity pathway.
        Curate all other interactors regardless of whether the interacting molecule is a member
         of the pathway or has any known role in innate immunity → expands network outside
         of known innate immunity players.
        Systematically curated pathways are scheduled for frequent re-curation as the field is
         moving quickly.

    Also, new publications on innate immunity assessed on a daily basis to identify novel
     interactions of interest.
         Priority given to the most recent publications → incorporates new information on the
          most current research

    Immunology Community-aided:
        Curators consult with researchers to confirm unclear literature data
        Most common issue: Unclear what species the protein/DNA/RNA interactors come
         from

    Curation Community-aided:
        InnateDB curators review each other's curations as an error check
        IMEx consortium!

                                 http://www.innatedb.com/doc/InnateDB_2010_curation_guide.pdf
•  InnateDB is a member of IMEx – an international consortium of interaction databases involved in curation

•  Goal: Develop common standards, avoid too much redundancy in data collection/curation, central registry, single search interface

•  Orchard et al Nature Methods 9:345-350 PMID: 22453911

•  Stay tuned for Sandra Orchard's talk!

Going Beyond Innate Immunity – An Integrative Biology Resource
    >196,000 human and mouse interactions
     extracted & loaded from BIND, INTACT,
     DIP, BIOGRID & MINT DBs

    Cross-referenced genes to >3,000
     pathways from KEGG, PID, BIOCARTA,
     INOH, NetPath & Reactome DBs
        Visualize/analyze interactions
         associated with specific pathway
        Pathway over-representation analysis

    Ensembl annotation provides details of all
     human & mouse genes/transcripts/
     proteins. UniProt, Entrez, Gene Ontology,
     etc → rich protein & gene annotation

    Transcript. factor–DNA interactions
     experimentally confirmed from Transfac,
     TransCompel

    Robust orthology & gene synteny analysis
     facilitate human-mouse comparisons
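
Pathway over-representation analysis asks whether a pathway contains more genes from a list of interest than chance would predict. The slide does not say which statistical test is used, so the sketch below assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis. The function and parameter names are illustrative.

```python
from math import comb


def ora_p_value(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the list of interest (e.g. differentially expressed)
    n_overlap:    genes in the list of interest that are in the pathway
    """
    largest_possible = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 3 in the pathway, 3 selected, all 3 overlap.
p = ora_p_value(10, 3, 3, 3)
print(p)  # 1/120, about 0.0083
```

When many pathways are tested at once, the resulting p-values would normally be corrected for multiple testing, which is outside this sketch.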
InnateDB – Advanced Yet User-Friendly Searching – Find & Analyze Relevant Interactions, Pathways & Genes/Proteins.

InnateDB – Facilitating Systems-Level Analyses of Gene Expression Data

Upload Your Own Gene Expression Data
- Up to 10 conditions/timepoints at 1 time.

Overlay Gene Expression Data from Multiple Conditions on Networks/Pathways

Pathway, Gene Ontology & TF ORA tools: Find DE Pathways/Functionally Related Genes/TFs

Go Beyond Pathway Analysis – Differentially Expressed Sub-networks – New Pathways? How Are DE Genes Actually Inter-connected? Central Regulators (Network Hubs)
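
One simple reading of "differentially expressed sub-network" and "network hubs" is to keep only interactions whose two partners are both differentially expressed (DE), then rank genes by how many such interactions they have. The sketch below follows that reading; it is an assumption for illustration, not InnateDB's algorithm, and the gene labels are placeholders.

```python
from collections import defaultdict


def de_subnetwork_hubs(interactions, de_genes, top=3):
    """Rank DE genes by degree within the DE-only sub-network."""
    de = set(de_genes)
    # Keep each undirected edge once, and only if both ends are DE genes.
    edges = {
        frozenset((a, b))
        for a, b in interactions
        if a in de and b in de and a != b
    }
    degree = defaultdict(int)
    for edge in edges:
        for gene in edge:
            degree[gene] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top]


# Toy network: ("B", "A") duplicates ("A", "B"); E is not differentially expressed.
toy_interactions = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
toy_de_genes = ["A", "B", "C", "D"]

hubs = de_subnetwork_hubs(toy_interactions, toy_de_genes, top=2)
print(hubs)  # [('A', 3), ('B', 2)]
```

Degree is only one possible measure of a hub; other centrality measures could be substituted in the ranking step.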
InnateDB and curated data aided study of an immune modulator –
    host-directed adjunctive therapy coupled with anti-malarial
What we’re doing next…

Need to develop more ontologies and data standards to integrate microbial genomic data from a disease outbreak with epidemiological data.

Curating pathogen status for complete microbial genomes

Will try the “paper carrot” again for next Pseudomonas Genome Database curation project

InnateDB – expanding to Allergy and Asthma

Identify genes unique to/shared between strains, species, genera, any selected bacteria….

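Finding genes unique to or shared between selected genomes reduces to set operations once each gene has been assigned to an ortholog group. The sketch below assumes that assignment has already been made by some orthology method; the slide does not describe how the tool does this, and the genome names and group identifiers are made up.

```python
def unique_and_shared(genomes):
    """Split ortholog groups into those shared by all genomes and those unique to one.

    genomes: dict mapping a genome name to the set of ortholog group IDs it contains.
    Returns (shared, unique), where unique maps each genome name to its own groups.
    """
    names = list(genomes)
    if not names:
        return set(), {}
    shared = set.intersection(*(set(genomes[name]) for name in names))
    unique = {}
    for name in names:
        # Everything seen in any other genome.
        elsewhere = set().union(*(genomes[other] for other in names if other != name))
        unique[name] = set(genomes[name]) - elsewhere
    return shared, unique


# Toy input: three hypothetical strains and their ortholog groups.
toy_genomes = {
    "strain1": {"og1", "og2", "og3"},
    "strain2": {"og1", "og2", "og4"},
    "strain3": {"og1", "og5"},
}

shared, unique = unique_and_shared(toy_genomes)
print(shared)  # {'og1'}
print(unique)  # strain1 -> {'og3'}, strain2 -> {'og4'}, strain3 -> {'og5'}
```

The same function works at any taxonomic level, since the grouping into strains, species or genera is just a choice of which sets to pass in.
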
Funding! Grants!

One of the biggest challenges is to secure long term, reliable funding.

We've found:

Need to target curation to specific bio projects.
(i.e. innate immunity, then to allergy and asthma; aiding a specific Pseudomonas analysis)

Limits what we can do, but good in the sense that curation benefits are more quickly felt as they are needed/used by others

Concluding comments

Using community-aided, expert curator-centered approach for balancing consistency, reliability and maximizing knowledge. Degree of community involvement depends on nature of data.

Capitalize on both bio community and curation community – keep linked

Researchers are busy! Make it super easy for them to provide input. A little contribution can go a long way

Paper carrots!

Link curation to bio research to secure funding

Indoctrinate young minds! Get biocuration and its challenges into undergrad curriculums

Acknowledgements - InnateDB

www.innatedb.com

InnateDB Principal Investigators:
    Fiona Brinkman (SFU)
    Bob Hancock (UBC)
    David Lynn (Teagasc)

InnateDB Development:
    Karin Breuer
    Geoff Winsor
    Matthew Laird
    Calvin Chan
    Amir Foroushani
    Brian Meredith
    Nathan Lawless
    Nicolas Richard
    Avinash Chikatamarla
    Fiona Roche
    Timothy Chan
    Naisha Shah
    Michael Acab

InnateDB Curation:
    Raymond Lo
    Anastasia Sribnaia
    Carol Chan
    Misbah Naseer
    Melissa Yau
    Giselle Ring
    Kathleen Wee
    Jaimmie Que

Cerebral network visualizer:
    Aaron Barsky
    Jennifer Gardy
    Tamara Munzner

FNIH/GCGH Collaborators:
    Gordon Dougan (Sanger)
    Fernanda Schreiber (Sanger)
    Melita Gordon (U. Liverpool)
    Bill Jacobs (AECM)
    Dee Dao (AECM)
    Philip Cooper (St. Georges)
    Louis Schofield (WEHI)
    Sandra Pilat (WEHI)
    Sarah Dunstan (OUCRU)
    Brett Finlay (UBC)

Acknowledgements – PseudoCAP

Geoff Winsor
Ray Lo
Matt Laird
Bhav Dhillon
Matthew Whiteside

151 PseudoCAP participants

www.pseudomonas.com

Biocuration 2013 - Fiona Brinkman - From genes, to genomes to networks, with community aided curation

Advantages of Hiring UIUX Design Service Providers for Your Business
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Biocuration 2013 - Fiona Brinkman - From genes, to genomes to networks, with community aided curation

  • 1. From genes, to genomes to networks, with community aided curation. Fiona Brinkman, Simon Fraser University. Biocuration conference, April 2013
  • 2. From genes, to genomes to networks, with community aided curation: with a little help from my friends …
  • 3. My Primary Research Interest: Developing more sustainable approaches for infectious disease control … using novel computational tools, integrated data and interdisciplinary approaches. Targeting major players resulting in infectious disease: o Pathogen virulence: ID anti-infectives (don’t kill the pathogen, disarm them) o Host immune system failure/over-activity: Immune modulators that dampen damaging inflammation and boost “good” immune response o Changes in environment/social factors: Integrating pathogen genome data with environment, microbiome, and social network data; better identify source/cause of disease outbreaks
  • 4. Some of our lab's tools… o Pathogen virulence: PSORTb – Protein localization analysis (ID cell surface/secreted drug targets); IslandViewer – Genomic island analysis, pathogen-associated genes; Ortholuge DB – Precomputed assessments of bacterial orthologs; Genera-specific DBs like Pseudomonas Genome Database o Host immune system failure/over-activity: InnateDB – Human/Mouse interactome + curated innate immunity-associated interactions o Changes in environment/social factors: Metagenomics projects; Integrated Rapid Infectious Disease Analysis Pipeline (IRIDA)
  • 5. Some of our lab's tools… o Pathogen virulence: PSORTb – Protein localization analysis (ID cell surface/secreted drug targets); IslandViewer – Genomic island analysis, pathogen-associated genes; Ortholuge DB – Precomputed assessments of bacterial orthologs; Genera-specific DBs like Pseudomonas Genome Database o Host immune system failure/over-activity: InnateDB – Human/Mouse interactome + curated innate immunity-associated interactions o Changes in environment/social factors: Metagenomics projects; Integrated Rapid Infectious Disease Analysis Pipeline (IRIDA)
  • 6. Research Philosophy: High quality analyses are only as good as the robust data, effective data organization and accurate analysis methods used. [Diagram: “The Nexus” of robust data, data organization and accurate analysis methods] Want high accuracy – usually erring on the side of high precision at the expense of recall. To attain high accuracy, biocuration is often KEY
  • 7. Overview • Community-based → Community-aided gene/genome annotation • 1997 – present: Pseudomonas Genome Project and PseudoCAP (Pseudomonas Community Annotation Project) • Community-aided → Multiple community-aided contextual curation of molecular interactions • 2006 – present: InnateDB project • What we’re doing next… • Funding it all!
  • 8. Pseudomonas Community Annotation Project. Goals: Critical and conservative genome annotation; Minimize project costs; Capitalize on large Pseudomonas aeruginosa research community. Solution: Community-based, Internet-based approach for (continually updated) genome annotation. “Crowdsourcing” in the 90’s!
  • 9. Pseudomonas Community Annotation Project. Initial PseudoCAP leading to genome publication (1997 – 2000): 61 researchers from 13 countries, 1741 annotations. Focus on conservative annotation. Need to capture researchers’ excellent, diverse biological knowledge, NOT their diverse ways of annotating!
  • 10. Pseudomonas Community Annotation Project. Initial PseudoCAP leading to genome publication (1997 – 2000): Initial 1741 community-based annotations… Annotations incorporated by 3 annotators through web-based tool. 1st fully internet-based community annotation effort
  • 11. Pseudomonas Community Annotation Project. Current PseudoCAP – continually updated annotation (2000 – present): 151 researchers, 2356 curated gene annotations (not incl. computational analyses). Movement from gene-based to genes plus other genome features (2,590 other genome features added in the last year alone). Found we needed to further modify our community-based approach… Winsor et al 2011 PMID: 20929876; Winsor et al 2005 PMID: 15608211
  • 12. Pseudomonas Community Annotation Project. Current PseudoCAP – continually updated annotation (2000 – present): Annotations incorporated by one part-time project coordinator. Subject to review process (peer reviewed paper or other peer review). Increasing movement from Community-based to Community-aided: - Coordinator contacts researchers more to get input - Capitalize on expertise most efficiently - Coordinator ensures consistency. Coordinator and community collectively ensure quality
  • 13. Pseudomonas Community Annotation Project. Challenges and Solutions: - Disputes between researchers regarding an annotation - Go with first published and have alternate annotations - Researchers are busy! - Keep submission system/input process simple! - We now contact them more than they contact us - Have rounds of major annotation pushes. Future: Will try again the “paper carrot” for another annotation push – authorship on a NAR update paper (as a consortium) to encourage participation
  • 14. InnateDB: Curating molecular interactions, networks. Community-aided → “Multiple community-aided”. Highly contextual annotation
  • 15. InnateDB Developed to Aid Two Large International Systems Biology Projects. Mouse Model Datasets: Cerebral Malaria mouse model (IMR, Australia); Tuberculosis mouse model (AECM); Shigella xenograft model (Pasteur). Human Clinical Datasets: Typhoid & Malaria Vietnam (OUCRU/Stanford/Sanger); Non Typhoidal Salmonella Malawi (Sanger) + Chronic/Acute Helminth Ecuador (USF de Quito/Sanger); Dengue (OUCRU). Modulating innate immune response via Host Defense peptides (Hancock lab, UBC). Mouse KOs (Sanger). Novel insight into host response and mechanism of peptides. Common Pathways, networks and transcriptional regulation. Thompson et al PNAS December 2009
  • 16. Systems Biology & The Innate Immune Response:   Many layers of complexity.   Layers of regulation: transcriptional; post-transcriptional (miRNAs); post-translational (ubiquitination, phosphorylation)   Host-pathogen interactions   100s – 1000s DE genes   Not simple pathways - networks of molecular interactions Gardy*, Lynn*, Brinkman, Hancock (2009). Enabling a systems biology approach to immunology: focus on innate immunity. Trends in Immunology PMID: 19428301
  • 17. Breuer et al., 2013 InnateDB: systems biology of innate immunity and beyond… NAR (DB issue) PMID: 23180781
  • 18. Manual Curation of Interaction Data From Literature to Database Greatly Enhances Coverage of Innate Immunity Interactome. [Figure: InnateDB curated interactome, distinguishing interactions only curated by InnateDB from interactions also curated by the top 5 other interaction databases: BIND, INTACT, DIP, BIOGRID & MINT] Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
  • 19. Manual Curation of Interaction Data From Literature to Database – Enhancing coverage of Innate Immunity Interactome. The InnateDB curated interactome in July 2012. Red edges represent interactions that have been added in 2011 and 2012. Breuer et al., 2013 InnateDB: systems biology of innate immunity and beyond… Nucleic Acids Research (Database issue)
  • 20. Contextually Curating Innate Immunity-Relevant Interactions Annotated fields include: Molecule type; organism; biological role; interaction detection method; the host system (in vitro, in vivo, ex vivo); host organism; interaction type; cell, cell-line and tissue types; cell status (primary/cell line); experimental role; participant identification method and sub-cellular localization, plus variety of additional curator comments.
  • 21. Curating Innate Immunity-Relevant Interactions: 71% human, 22% mouse, 7% human-mouse. ~80% interactions in innate immunity interactome not annotated by other major databases. Protein (69%), DNA and RNA interactions. Developed InnateDB submission system software to allow submission of interaction annotation in an OBO ontology-controlled and MIMIx & PSI-MI 2.5 compliant manner. Lynn et al., Curating the Innate Immunity Interactome. BMC Systems Biology 2010 PMID: 20727158
  • 22. Which journals are curated?   >4,400 journal articles curated to date   Don’t focus on specific journals - relevant articles curated if meet appropriate quality standards for the interaction evidence.   Indeed, at least one protein has been curated from >200 different journals.   More than 70% of curated articles have come from 20 journals.   Note many journals in top 20 are not “immunology journals”, underscoring importance of not limiting curation efforts to journals perceived as “relevant”.
  • 23. Curating Innate Immunity-Relevant Interactions – 4-pronged approach. Curation: primarily pathway-centric; systematically review all literature describing interactions for a particular innate immunity pathway. Curate all other interactors regardless of whether the interacting molecule is a member of the pathway or has any known role in innate immunity → expands network outside of known innate immunity players. Systematically curated pathways are scheduled for frequent re-curation as the field is moving quickly. Also, new publications on innate immunity assessed on a daily basis to identify novel interactions of interest. Priority given to the most recent publications → incorporates new information on the most current research. Immunology Community-aided: Curators consult with researchers to confirm unclear literature data. Most common issue: Unclear what species the protein/DNA/RNA interactors come from. Curation Community-aided: InnateDB curators review each other's curations as an error check. IMEx consortium! http://www.innatedb.com/doc/InnateDB_2010_curation_guide.pdf
  • 24. • InnateDB is a member of IMEx – an international consortium of interaction databases involved in curation • Goal: Develop common standards, avoid too much redundancy in data collection/curation, central registry, single search interface • Orchard et al Nature Methods 9:345-350 PMID: 22453911 • Stay tuned for Sandra Orchard's talk!
  • 25. Going Beyond Innate Immunity – An Integrative Biology Resource. >196,000 human and mouse interactions extracted & loaded from BIND, INTACT, DIP, BIOGRID & MINT DBs. Cross-referenced genes to >3,000 pathways from KEGG, PID, BIOCARTA, INOH, NetPath & Reactome DBs. Visualize/analyze interactions associated with specific pathway. Pathway over-representation analysis. Ensembl annotation provides details of all human & mouse genes/transcripts/proteins. UniProt, Entrez, Gene Ontology, etc → rich protein & gene annotation. Transcript. factor–DNA interactions experimentally confirmed from Transfac, TransCompel. Robust orthology & gene synteny analysis facilitate human-mouse comparisons
  • 26. InnateDB – Advanced Yet User-Friendly Searching – Find & Analyze Relevant Interactions, Pathways & Genes/Proteins.
  • 27. InnateDB – Facilitating Systems-Level Analyses of Gene Expression Data. Upload Your Own Gene Expression Data - Up to 10 conditions/timepoints at 1 time. Overlay Gene Expression Data from Multiple Conditions on Networks/Pathways. Pathway, Gene Ontology & TF ORA tools: Find DE Pathways/Functionally Related Genes/TFs. Go Beyond Pathway Analysis – Differentially Expressed Sub-networks – New Pathways? How Are DE Genes Actually Inter-connected? Central Regulators (Network Hubs)
  • 28. InnateDB and curated data aided study of an immune modulator – host-directed adjunctive therapy coupled with anti-malarial
  • 29. What we’re doing next… Need to develop more ontologies and data standards to integrate microbial genomic data from a disease outbreak with epidemiological data. Curating pathogen status for complete microbial genomes. Will try the “paper carrot” again for next Pseudomonas Genome Database curation project. InnateDB – expanding to Allergy and Asthma
  • 30. Identify genes unique to/shared between strains, species, genera, any selected bacteria…
  • 31. Funding! Grants! One of the biggest challenges is to secure long term, reliable funding. We've found: Need to target curation to specific bio projects (e.g. innate immunity, then to allergy and asthma; aiding a specific Pseudomonas analysis). Limits what we can do, but good in the sense that curation benefits are more quickly felt as they are needed/used by others
  • 32. Concluding comments. Using community-aided, expert curator-centered, approach for balancing consistency, reliability and maximizing knowledge. Degree of community involvement depends on nature of data. Capitalize on both bio community and curation community – keep linked. Researchers are busy! Make it super easy for them to provide input. A little contribution can go a long way. Paper carrots! Link curation to bio research to secure funding. Indoctrinate young minds! Get biocuration and its challenges into undergrad curriculums
  • 33. Acknowledgements - InnateDB InnateDB Principal Investigators: InnateDB Curation: www.innatedb.com Fiona Brinkman (SFU) Bob Hancock (UBC) Raymond Lo David Lynn (Teagasc) Anastasia Sribnaia Carol Chan InnateDB Development: Misbah Naseer Karin Breuer Melissa Yau Geoff Winsor Giselle Ring Matthew Laird Kathleen Wee Calvin Chan Jaimmie Que Amir Foroushani Brian Meredith Cerebral network visualizer: Nathan Lawless Nicolas Richard Avinash Chikatamarla Aaron Barsky Jennifer Gardy Fiona Roche Tamara Munzner Timothy Chan Naisha Shah Michael Acab FNIH/GCGH Collaborators: Gordon Dougan (Sanger) Fernanda Schreiber (Sanger) Melita Gordon (U. Liverpool) Bill Jacobs (AECM) Dee Dao (AECM) Philip Cooper (St. Georges) Louis Schofield (WEHI) Sandra Pilat (WEHI) Sarah Dunstan (OUCRU) Brett Finlay (UBC)
  • 34. Acknowledgements – PseudoCAP: Geoff Winsor, Ray Lo, Matt Laird, Bhav Dhillon, Matthew Whiteside, 151 PseudoCAP participants. www.pseudomonas.com
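
Slide 25 lists pathway over-representation analysis among the InnateDB tools without saying how it is computed. The sketch below shows one common way such a test is done, a one-sided hypergeometric test; the choice of statistic, the function name `overrepresentation_p` and all the numbers are assumptions for illustration, not details taken from the talk or from InnateDB itself.

```python
from math import comb


def overrepresentation_p(hits: int, selected: int, pathway_size: int, background: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    X is the number of pathway genes found when `selected` genes are drawn
    without replacement from `background` genes, `pathway_size` of which
    belong to the pathway.
    """
    most = min(selected, pathway_size)  # largest possible overlap
    total = comb(background, selected)  # number of equally likely gene lists
    tail = sum(
        comb(pathway_size, i) * comb(background - pathway_size, selected - i)
        for i in range(hits, most + 1)
    )
    return tail / total


# Hypothetical numbers: 40 of 300 differentially expressed genes fall in a
# 200-gene pathway, against a background of 20,000 genes.
p = overrepresentation_p(hits=40, selected=300, pathway_size=200, background=20000)
print(f"p = {p:.3g}")
```

In practice the p-values for many pathways would also need a multiple-testing correction, which this sketch leaves out.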