CSHALS 2013
Transcript

  1. The ISA Infrastructure for the biosciences: from data curation at source to the linked data cloud
     Alejandra Gonzalez-Beltran, University of Oxford e-Research Centre, UK
     Alejandra.GonzalezBeltran@oerc.ox.ac.uk
     Conference on Semantics in Healthcare and Life Sciences (CSHALS), Boston, USA, Feb 27 - Mar 1 2013
  2. Outline
     • The ISA infrastructure: a metadata tracking framework in the biosciences, made up of the ISA-TAB format, a set of open source software tools and the user community
     • The ISA-TAB syntax and its implicit semantics
     • The […] component of the infrastructure
     • […] for mapping the syntax to ontologies
     • A couple of mappings, architecture, conversion
  3. Contextual information (metadata):
     • Sample characteristics
     • Technology and measurement types
     • Instrument parameters
     • …
  4. Need for a generic representation, applied to:
     • microarray based experiments (MAGE)
     • sequencing based experiments (SRA)
     • flow cytometry based experiments (FuGE-Flow Cyt)
     • mass spectrometry and NMR spectroscopy experiments (Metabolights and PRIDE)
  5. ISA infrastructure
     ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level (Rocca-Serra et al, 2010, Bioinformatics)
     • Assist in the annotation and management of experimental metadata at source, supporting data provenance tracking
     • Deal with high-throughput studies using one or a combination of omics and other technologies
     • Empower users to take up community-defined checklists and ontologies
     • Facilitate data sharing, re-use, comparison and reproducibility of experiments, and submission to international public repositories
  6. Towards interoperable bioscience data (Sansone et al, 2012, Nature Genetics)
     A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains.
  7. ISA-TAB syntax (and its implicit semantics)
  8. Example assay table:
     Sample Name  Material Type  Hybridization Assay Name  Array Design REF  Array Data File  Protocol REF        Derived Array Data File
     sample1      genomic DNA    assay1                    A-AFFY-107        assay1.cel       data normalization  assay1.txt
     sample2      genomic DNA    assay2                    A-AFFY-107        assay2.cel       data normalization  assay2.txt
     sample3      genomic DNA    assay3                    A-AFFY-107        assay3.cel       data normalization  assay3.txt
     [Diagram: material transformations. Material nodes (Material Type, Characteristics[…], Factor Value[…] as independent variables, Comment[…]) are linked by protocol processes (Parameter Value[…], Performer for operator effect, Date for day effect) to data file nodes (Raw Data File, Derived Data File).]
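The assay table on this slide can be read with a few lines of Python. This is a minimal sketch, not the ISA tools' own parser: it assumes the table is stored as tab-delimited text with the column names defined by ISA-TAB, and that every column header is unique (real ISA-TAB files often repeat headers such as Protocol REF, which would need positional handling). The function names `read_assay_rows` and `sample_to_data_paths` are made up for this illustration.

```python
import csv
import io

# The slide's assay table, rebuilt as tab-delimited text.
HEADER = ["Sample Name", "Material Type", "Hybridization Assay Name",
          "Array Design REF", "Array Data File", "Protocol REF",
          "Derived Array Data File"]
ROWS = [["sample%d" % i, "genomic DNA", "assay%d" % i, "A-AFFY-107",
         "assay%d.cel" % i, "data normalization", "assay%d.txt" % i]
        for i in (1, 2, 3)]
ASSAY_TSV = "\n".join("\t".join(cells) for cells in [HEADER] + ROWS)


def read_assay_rows(text):
    """Parse tab-delimited text into one dict per row (assumes unique headers)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]


def sample_to_data_paths(rows):
    """Trace each sample to its raw file, the protocol applied, and the derived file."""
    return [(r["Sample Name"], r["Array Data File"],
             r["Protocol REF"], r["Derived Array Data File"]) for r in rows]


if __name__ == "__main__":
    for path in sample_to_data_paths(read_assay_rows(ASSAY_TSV)):
        print(" -> ".join(path))
```

Reading each row left to right gives the implicit chain the slide describes: a material node, a protocol application, and the data files it produces.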
  9. Tagging: from free text to ontology-based
     • Single intervention representation, free text annotation:
       Source Name  Characteristics[organism]  Factor Value[perturbation agent]  Factor Value[dose]  Factor Value[duration]
       individual1  human                      aspirin                           high dose           12 weeks
     • Single intervention, ontology-based annotation:
       Source Name  Characteristics[organism (obi:0100026)]  Term Source REF  Term Accession Number  Factor Value[chemical compound (CHEBI_37577)]  Term Source REF  Term Accession Number
       individual1  Homo sapiens                             NCBITax          9606                   aspirin                                        CHEBI            1231354

       Factor Value[dose (OBI_0000984)]  Term Source REF  Term Accession Number  Factor Value[time (PATO_0000165)]  Unit  Term Source REF  Term Accession Number
       low dose                          LNC              LP30872-3              12                                 week  UO               0000034
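In the ontology-based row, each annotated value is followed by its own Term Source REF and Term Accession Number columns, so the headers repeat and have to be grouped by position. The sketch below illustrates that grouping under stated assumptions: `group_annotations` is a hypothetical helper, not part of the ISA tools, Unit columns are not handled, and the header names and accession values are copied from the slide as-is.

```python
# Qualifier columns attach to the nearest value column on their left.
QUALIFIERS = {
    "Term Source REF": "source",
    "Term Accession Number": "accession",
}

HEADER = [
    "Source Name",
    "Characteristics[organism (obi:0100026)]",
    "Term Source REF",
    "Term Accession Number",
    "Factor Value[chemical compound (CHEBI_37577)]",
    "Term Source REF",
    "Term Accession Number",
]
ROW = ["individual1", "Homo sapiens", "NCBITax", "9606",
       "aspirin", "CHEBI", "1231354"]


def group_annotations(header, row):
    """Pair each value column with the term source/accession columns after it."""
    annotations = []
    for name, cell in zip(header, row):
        if name in QUALIFIERS:
            if not annotations:
                raise ValueError("qualifier column %r has no value column before it" % name)
            annotations[-1][QUALIFIERS[name]] = cell
        else:
            annotations.append(
                {"field": name, "value": cell, "source": None, "accession": None}
            )
    return annotations


if __name__ == "__main__":
    for ann in group_annotations(HEADER, ROW):
        print(ann)
```

A value with no qualifier columns, such as Source Name here, keeps `None` for both source and accession, which makes free-text fields easy to spot.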
  10. ToxBank effort, developed by Nina Jeliazkova
      Health Care & Life Sciences Interest Group
      Kohonen et al. The ToxBank Data Warehouse: a research cluster of 7 EU FP7 Health systems toxicology and toxicogenomics projects.
  11. • Make the semantics of ISAtab explicit, including materials, data entities, processes and their relationships
      • Provide incentives for provision of ontology-based annotations in ISA-TAB datasets; exploit those annotations
      • Augment ISA syntax with new elements (e.g. groups), facilitating the understanding and querying of experimental design
      • Facilitate data integration and knowledge discovery/reasoning
  12. Architecture [diagram labels: ISA-TAB parser, graph analysis, isa2owl, mapping parser, configuration file]
  13. OntoMaton: a Bioportal powered ontology widget for Google Spreadsheets (Maguire et al, 2013, Bioinformatics)
      • Ontology search and automated tagging (relying on NCBO Bioportal services) on Google Spreadsheets
      • Collaborative annotation; support for distributed users
      • Version control and history
  14. Vocabularies: chemical domain, biomolecular domain, information domain, experimental domain
      Source Name  Characteristics[organism (obi:0100026)]  Term Source REF  Term Accession Number  Factor Value[chemical compound (CHEBI_37577)]  Term Source REF  Term Accession Number
      individual1  Homo sapiens                             NCBITax          9606                   aspirin                                        CHEBI            1231354
  15. Open Biological and Biomedical Ontologies (OBO) Foundry: BFO, ChEBI, GO, IAO, OBI
      Source Name  Characteristics[organism (obi:0100026)]  Term Source REF  Term Accession Number  Factor Value[chemical compound (CHEBI_37577)]  Term Source REF  Term Accession Number
      individual1  Homo sapiens                             NCBITax          9606                   aspirin                                        CHEBI            1231354
  16. ISA-OBI mapping
  17. ISA-SIO mapping
  18. faahKO dataset
      • Available in Bioconductor (with ISA-TAB metadata)
      • Global metabolite profiling
      • Data subset: LC/MS peaks from the spinal cords of 6 wild-type and 6 FAAH (fatty acid amide hydrolase) knockout mice
  19. • Support different conversion modes (different levels of granularity)
      • Querying for ISA-TAB datasets, across multiple experiment types
      • Reasoning exploiting ontology annotations
      • Semantic validation of ISA-TAB datasets
      • Augmented annotation over native ISA syntax
      • Identification of gaps in ontological representations
      • Feedback of findings to community ontologies
  20. Increasing level of structure for experimental metadata:
      Notes in lab books → Spreadsheets & Tables → Facts as RDF statements (ISAtab metadata)
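The last step on this slide, from table rows to facts as RDF statements, can be illustrated with plain Python that writes N-Triples. This is only a sketch of the general idea and not how isa2owl performs its conversion (the slides describe isa2owl as driven by mappings to ontologies such as OBI and SIO). The `http://example.org/isa/` namespace, the predicate naming scheme, and the function names are all assumptions made for the example.

```python
from urllib.parse import quote

BASE = "http://example.org/isa/"  # hypothetical namespace, not an ISA URI scheme


def iri(kind, name):
    """Build an N-Triples IRI such as <http://example.org/isa/sample/sample1>."""
    return "<" + BASE + kind + "/" + quote(name, safe="") + ">"


def literal(text):
    """Build a plain N-Triples string literal, escaping special characters."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return '"' + escaped + '"'


def row_to_ntriples(row, key="Sample Name"):
    """Turn one table row (a dict) into subject-predicate-object statements.

    The key column names the subject; every other column becomes a predicate
    with the cell value as a string literal.
    """
    subject = iri("sample", row[key])
    return [
        "%s %s %s ." % (subject, iri("property", column), literal(value))
        for column, value in row.items()
        if column != key
    ]


if __name__ == "__main__":
    example = {"Sample Name": "sample1", "Material Type": "genomic DNA"}
    for line in row_to_ntriples(example):
        print(line)
```

Using string literals for every value keeps the sketch short; an ontology-aware conversion would instead point at term IRIs built from the Term Source REF and Term Accession Number columns.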
  21. @isatools @biosharing isa-tools.org isacommons.org biosharing.org
