Drug Discovery- ELRIG -2012
Usage Rights

© All Rights Reserved

Drug Discovery- ELRIG -2012 Presentation Transcript

  • Community standards for reproducible and reusable research: fundamentals and challenges
    Alejandra González-Beltrán, PhD
    Senior Software Engineer, ISATeam
    University of Oxford e-Research Centre, Oxford, UK
    Drug Discovery 2012, Manchester, UK, September 6-7
  • Ioannidis et al., Repeatability of published microarray gene expression analyses. Nature Genetics 41(2), 149-55 (2009) doi:10.1038/ng.295
  • Roadmap
    reasoning, visualization, analysis, browsing, integration, exchange, retrieval
    Community Standards
    Software Tools
    Well-annotated & Structured Data
    Reproducible & Reusable Bioscience Research
    Principles & Challenges
  • Bioscience is multi-domain… (health, env, agro, tox/pharma)
    Interdisciplinary and integrative in character:
      • need to deal with new and existing datasets
      • deal with a variety of data types
    Source of the figure: EBI website
  • From reusable data to reproducible research
    To make the datasets comprehensible and interoperable, underpinning future investigations, we need common ways to report and share the experimental details and the associated results.
    Consistent reporting will have a positive and long-lasting impact on the value of collective scientific outputs.
    Community Standards
    The International Conference on Systems Biology (ICSB), 22-28 August, 2008 Susanna-Assunta Sansone www.ebi.ac.uk/net-project
  • Different communities, different norms and standards, e.g.:
      • use the same term to refer to the same ‘thing’
      • allow data to flow from one system to another
      • report the same core, essential information
    Challenges: lack of interaction and coordination, duplication of effort, fragmentation and uneven coverage… hinders interoperability
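The three roles of standards listed on this slide (shared terms, exchangeable formats, a common core of reported information) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synonym table, the required field names and the choice of JSON as the exchange format are hypothetical and do not come from any of the standards named in the talk.

```python
import json

# Hypothetical synonym table: maps free-text labels to one agreed term.
PREFERRED_TERM = {
    "human": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "homo sapiens": "Homo sapiens",
}

# Hypothetical minimum-information checklist (not taken from a real guideline).
REQUIRED_FIELDS = ("organism", "assay", "protocol")


def normalise(record):
    """Use the same term to refer to the same 'thing'."""
    out = dict(record)
    organism = str(out.get("organism", "")).strip()
    out["organism"] = PREFERRED_TERM.get(organism.lower(), organism)
    return out


def missing_fields(record):
    """Report which core, essential fields are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not str(record.get(name, "")).strip()]


def export(record):
    """Serialise to a shared format so data can flow between systems."""
    return json.dumps(normalise(record), sort_keys=True)


record = {"organism": "Human", "assay": "microarray"}
print(missing_fields(record))
print(export(record))
```

A record that fails the checklist can still be exported here; whether to reject it instead is a policy choice that each community would make for itself.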
  • Guidelines for Information About Therapy Experiments (GIATE)
    [Figure: Therapeutic Investigation; Generic Model; Molecular Model; Cellular Model; Animal Model; Clinical Model]
  • Growing number of bioscience reporting standards
    MAGE-Tab, AAO, MIAME, GCDML, MIAPA, CHEBI, GIATE, SRAxml, OBI, MIRIAM, VO, SOFT, MIQAS, FASTA, PATO, MIX, CML, ENVO, REMARK, DICOM, MIGEN, GELML, MOD, SBRML, MIAPE, MIQE, TEDDY, MITAB, MzML, XAO, CIMR, CONSORT, BTO, ISA-Tab, SEDML…, DO, PRO, IDO…, MIASE, MISFISHIE…
  • Growing number of bioscience reporting standards
    303+, 150+, 130+ (Source: MIBBI, EQUATOR; Source: BioPortal; Estimated)
    Databases, annotation, curation tools
  • But… what do we know about them and how they are related?
      • I use high throughput sequencing technologies, which ones are relevant to me?
      • Which tools and databases implement which standards?
      • How can I get involved to propose extensions or modifications?
      • What are the criteria to evaluate their status and value?
      • Which ones are mature enough for me to use or recommend?
      • Which formats support specific minimum information guidelines?
      • I work on plants, are these just for biomedical applications?
  • A coherent, curated and searchable catalogue of data sharing resources
      • Bioscience standards and associated data-sharing policies, publications, tools and databases
      • Assessment criteria for usability and popularity of standards
      • Relationships among standards
      • Encouragement for communication & interaction among groups
      • Promoting interoperability & informed decisions about standards
  • Standards compliance is challenging…
    Is it possible to achieve a common, structured representation of diverse bioscience experiments that:
      • transcends individual bioscience domains, but also
      • follows the appropriate community norms and standards?
  • Structured description of datasets
      • Capture all salient features of the experimental workflow
      • Make annotation explicit and discoverable
      • Structure the descriptions for consistency, tracking
          • independent variables
          • dependent variables
        and using
          • resolvable identifiers and cross-references
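One way to picture such a structured description is a small record type that keeps independent and dependent variables apart and attaches an identifier to each annotated value. This is only a sketch under stated assumptions: the class names, the `EX:` identifiers and the example.org URL are hypothetical placeholders, not part of ISA or of any real ontology.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """One annotated value; term_id is empty when no identifier was given."""
    name: str
    value: str
    term_id: str = ""  # hypothetical resolvable identifier, e.g. "EX:0001"


@dataclass
class SampleDescription:
    sample_id: str
    independent: list = field(default_factory=list)  # variables set by the experimenter
    dependent: list = field(default_factory=list)    # variables that were measured
    cross_references: dict = field(default_factory=dict)

    def unannotated(self):
        """Names of variables that still lack a resolvable identifier."""
        return [a.name for a in self.independent + self.dependent if not a.term_id]


sample = SampleDescription(
    sample_id="sample-1",
    independent=[
        Annotation("compound", "example compound", term_id="EX:0001"),
        Annotation("dose", "10 mg/kg"),  # free text only, no identifier yet
    ],
    dependent=[Annotation("response", "expression level", term_id="EX:0002")],
    cross_references={"raw_data": "http://example.org/data/sample-1"},
)

print(sample.unannotated())  # lists variables whose annotation is not yet explicit
```

Keeping the identifier next to the free-text value makes the annotation explicit without discarding the wording the experimenter originally used.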
  • Not too much, not too little, just ‘right’
    We must strike a balance between sufficiency and practicability:
      • depth and breadth of information
      • burden to produce and maintain the information
  • Metadata tracking framework, designed to support the use of several standards checklists, terminologies and conversions to (a growing number of) other metadata formats, used by public repositories, e.g. MAGE-Tab, Pride-xml, SRA-xml, SOFT
  • user community
  • ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level (Rocca-Serra et al., 2010)
  • empowering researchers to use standards
  • Ontology Search and Tagging in Google Spreadsheets
  • ISA infrastructure & linked data
      • Work in progress to convert to RDF/OWL to connect to the growing Linked Data universe
        (RDF = Resource Description Framework, OWL = Web Ontology Language)
      • Collaborations with Toxbank & W3C HCLSIG
    <subject, predicate, object>
    <lipoprotein> <participates_in> <inflammatory response>
    <PRO:212342352> <BFO_0000056> <GO:0006954>
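The subject, predicate, object pattern on this slide can be sketched in plain Python. The identifiers and labels are copied from the slide; the `Triple` type, the helper functions and the in-memory list standing in for a triple store are assumptions made for illustration and are not the ISA tools' RDF conversion.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Identifiers and their human-readable labels, as shown on the slide.
LABELS = {
    "PRO:212342352": "lipoprotein",
    "BFO_0000056": "participates_in",
    "GO:0006954": "inflammatory response",
}

statement = Triple("PRO:212342352", "BFO_0000056", "GO:0006954")


def readable(triple, labels):
    """Render a triple with labels where known, identifiers otherwise."""
    return " ".join(f"<{labels.get(part, part)}>" for part in triple)


def objects_of(triples, subject, predicate):
    """Answer 'what does this subject relate to via this predicate?'."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


store = [statement]  # hypothetical stand-in for a triple store
print(readable(statement, LABELS))
print(objects_of(store, "PRO:212342352", "BFO_0000056"))
```

Machines work with the identifier form, while the label form is what a person reads; both lines on the slide describe the same single statement.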
  • Increasing levels of structure…
      • Notes in Lab Books (information for humans)
      • Spreadsheets and Tables (the compromise)
      • Facts as RDF statements (information for machines)
  • A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including:
      • environmental health
      • environmental genomics
      • metabolomics
      • metagenomics
      • nanotechnology
      • proteomics
      • stem cell discovery
      • systems biology
      • transcriptomics
      • toxicogenomics
      • also by communities working to build a library of cellular signatures
    We aim to achieve a common representation of experimental content that transcends individual bioscience domains
    Sansone et al., Towards interoperable bioscience data. Nature Genetics 44, 121-126 (2012) doi:10.1038/ng.1054
  • A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework
    Some of the public groups/resources and some of the internal projects: Stem Cell Commons, Nanotechnology Informatics Working Group
  • Implementation at Harvard: ISA http://discovery.hsci.harvard.edu/
  • Implementation at the EBI: http://www.ebi.ac.uk/metabolights
  • reasoning, visualization, analysis, browsing, integration, exchange, retrieval
    Community Standards: Guidelines (GIATE), Formats, Terminologies
    Software Tools
    Well-annotated & Structured Data
    Reproducible & Reusable Bioscience Research
    Challenges: lack of coordination, fragmentation and uneven coverage; standards-compliant data sharing is demanding and time-consuming
  • @isatools @biosharing Isa-tools.org isacommons.org biosharing.org