Performance Metrics and Figures of Merit Working Group Summary Aug2012
 


    • Genome in a Bottle Performance Metric & Figures of Merit
    • Overview
        • Experimental Data
            – Sequence Data & Variation
            – Metadata Representation
        • Bioinformatics Data Integration / Database
            – RM vs. Reference
            – Every Base
        • Visualize and Filter
            – Single Genome Browser
            – Browser over DB
        • Compare and Report
            – ValidationProtocol.org
            – Query by Experiment Data
        • Refine and Feedback
        • Experimental Data = Combination of Prep / Sequencing / Analysis
    • Experimental Data
        • Preparation
            – Link to published prep protocol
            – ROI in BED/GFF/GBK format
        • Sequencing
            – Platform information (minimally name)
            – Chemistry (minimally version)
        • Analysis
            – Link to published analysis protocol or best practices
            – Read data (fastq, sra, hdf5, others)
            – Alignment/assembly data (bam)
                • Minimal tag set TBD
            – Variation (vcf)
                • Minimal tag set TBD in INFO field of VCF, or define external XSD
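The experiment description above could be captured in a small record type. The following is only a sketch: the class name, field names, and the `missing_minimum` helper are hypothetical and not part of the working group's proposal. The only requirements it enforces are the two the slide marks as minimal (platform name and chemistry version).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ExperimentRecord:
    """Hypothetical container for one prep / sequencing / analysis combination."""

    # Sequencing: the slide asks for at least a platform name and a chemistry version.
    platform_name: str
    chemistry_version: str
    # Preparation (optional in this sketch).
    prep_protocol_url: Optional[str] = None
    roi_file: Optional[str] = None  # BED, GFF, or GBK file
    # Analysis (optional in this sketch).
    analysis_protocol_url: Optional[str] = None
    read_files: Tuple[str, ...] = ()  # fastq, sra, hdf5, ...
    alignment_file: Optional[str] = None  # bam
    variant_file: Optional[str] = None  # vcf

    def missing_minimum(self):
        """Return the names of minimally required fields that are blank."""
        required = ("platform_name", "chemistry_version")
        return [name for name in required if not getattr(self, name).strip()]


record = ExperimentRecord(platform_name="ExamplePlatform", chemistry_version="v3")
print(record.missing_minimum())
```

A record with a blank platform name would report `platform_name` as missing, which is one way a submission front end could reject incomplete experiments early.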
    • Metadata
        • All required fields in VCF 4.1
        • Others (examples)
            – AA : ancestral allele
            – AC : allele count in genotypes, for each ALT allele, in the same order as listed
            – AF : allele frequency for each ALT allele in the same order as listed: use this when estimated from primary data, not called genotypes
            – AN : total number of alleles in called genotypes
            – BQ : RMS base quality at this position
            – CIGAR : cigar string describing how to align an alternate allele to the reference allele
            – DB : dbSNP membership
            – DP : combined depth across samples, e.g. DP=154
            – END : end position of the variant described in this record (for use with symbolic alleles)
            – H2 : membership in hapmap2
            – H3 : membership in hapmap3
            – MQ : RMS mapping quality, e.g. MQ=52
            – MQ0 : number of MAPQ == 0 reads covering this record
            – NS : number of samples with data
            – SB : strand bias at this position
            – SOMATIC : indicates that the record is a somatic mutation, for cancer genomics
            – VALIDATED : validated by follow-up experiment
            – 1000G : membership in 1000 Genomes
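As an illustration of how these tags appear in practice, the sketch below splits a VCF INFO column into a dictionary. In VCF 4.1, INFO entries are separated by semicolons, `key=value` pairs may carry comma-separated lists, and flags such as DB or VALIDATED appear as bare keys. The function name `parse_info` is an assumption for this example; values are left as strings rather than typed from the VCF header.

```python
def parse_info(info: str) -> dict:
    """Parse a VCF INFO column into a dict.

    Flags (keys without '=') map to True; comma-separated values become
    lists of strings; single values stay as strings.
    """
    fields = {}
    if info in ("", "."):  # '.' marks an empty INFO column
        return fields
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        if not sep:
            fields[key] = True  # flag such as DB or VALIDATED
        else:
            parts = value.split(",")
            fields[key] = parts if len(parts) > 1 else value
    return fields


example = parse_info("NS=3;DP=154;MQ=52;AF=0.5,0.25;DB;VALIDATED")
print(example["DP"], example["AF"], example["DB"])
```

A production parser would use the `##INFO` header lines to convert each value to its declared type; this sketch deliberately skips that step.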
    • Database
        • Store each base + metadata of RM versus reference for each experiment
            – Distinguish missing versus homozygous reference
            – Include copy number and phasing when available, not required
        • Engine that drives front-end visualization (Genome Browser)
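The distinction between a missing call and a homozygous-reference call can be sketched with VCF genotype strings, where `.` denotes a missing allele, `0` the reference allele, `/` an unphased separator, and `|` a phased one. The function names and the returned labels below are assumptions for illustration, not a schema proposed by the working group.

```python
import re


def classify_genotype(gt: str) -> str:
    """Classify a VCF GT string as missing, hom_ref, het, or hom_alt."""
    alleles = re.split(r"[/|]", gt)
    # Any '.' allele (or an empty string) is treated as a missing call in this sketch.
    if not gt or "." in alleles:
        return "missing"
    if all(a == "0" for a in alleles):
        return "hom_ref"
    if len(set(alleles)) == 1:
        return "hom_alt"
    return "het"


def is_phased(gt: str) -> bool:
    """A '|' separator indicates phased alleles."""
    return "|" in gt


for gt in ("./.", "0/0", "0|1", "1/1"):
    print(gt, classify_genotype(gt), is_phased(gt))
```

Storing the label alongside each base lets a query separate "no data at this position" from "data present and matching the reference", which is the distinction the slide asks the database to keep.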
    • Visualize and Filter
        • Build on GetRM/NCBI Browser work
        • Single RM -> many experiments
        • Not all metadata will be visual, but most/all will be filterable
        • Filter data to generate ROI or VOI
            – Canned: e.g. intersect of all platforms + analyses, all OMIM SNPs, clinical cert SNV list, etc.
            – Dynamic: allowing people to explore prep, sequence, or analysis bias
        • Slice, dice, export VOI to compare and reporting SW
        • Allow user-defined tracks
        • A by-product is a community educational resource
            – I have an ROI for a test and want to know what platform, prep, exome kit version, etc. covers it best. What do I do?
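Two of the filters described above can be sketched in a few lines: restricting variants to an ROI supplied as BED intervals, and a "canned" intersect of call sets across platforms. The function names and the in-memory data layout are assumptions for this example. BED intervals are 0-based and half-open, while VCF positions are 1-based, so a 1-based position `pos` lies inside a BED interval when `start < pos <= end`.

```python
from collections import defaultdict


def load_bed(lines):
    """Read BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    roi = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        roi[chrom].append((int(start), int(end)))
    return roi


def in_roi(roi, chrom, pos):
    """True if the 1-based position falls inside any ROI interval on chrom."""
    return any(start < pos <= end for start, end in roi.get(chrom, ()))


def intersect_call_sets(calls_by_platform):
    """Return the variants called on every platform (a 'canned' intersect)."""
    sets = [set(calls) for calls in calls_by_platform.values()]
    return set.intersection(*sets) if sets else set()


roi = load_bed(["chr1\t100\t200"])
print(in_roi(roi, "chr1", 150), in_roi(roi, "chr1", 100))
```

The linear scan over intervals is fine for a gene panel; a whole-genome ROI would call for sorted intervals with binary search or an interval tree.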
    • Compare and Reporting
        • Take in ROI or VOI from the visualize and filter stage
        • Take in user-defined VOI or VOI + ROI
        • Potentially leverage SW under ValidationProtocol.org to generate reports and files including, but not limited to:
            – Summary of completeness, accuracy, phasing
            – Discordant variants in VCF
            – Concordant variants in VCF
            – Phasing errors in VCF
        • Provide an intuitive way to feed these results into downstream analysis SW or back into the browser (user-defined track)
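A minimal sketch of the concordant/discordant split follows. It is not the ValidationProtocol.org software; the function name, the dictionary layout keyed by `(chrom, pos, ref, alt)`, and the `fraction_rm_recovered` figure are assumptions chosen for illustration, and genotypes are assumed to be normalized already (so `0/1` and `1/0` would need to be reconciled beforehand).

```python
def compare_calls(rm_calls, test_calls):
    """Split variants into concordant and discordant sets.

    Both inputs map (chrom, pos, ref, alt) -> genotype string.
    A variant is concordant when it is present in both call sets with the
    same genotype; everything else is discordant.
    """
    shared = rm_calls.keys() & test_calls.keys()
    concordant = {k for k in shared if rm_calls[k] == test_calls[k]}
    discordant = (rm_calls.keys() | test_calls.keys()) - concordant
    return {
        "concordant": sorted(concordant),
        "discordant": sorted(discordant),
        # Illustrative stand-in for a completeness figure, not a defined metric.
        "fraction_rm_recovered": len(concordant) / len(rm_calls) if rm_calls else None,
    }


rm = {("chr1", 100, "A", "G"): "0/1", ("chr1", 200, "C", "T"): "1/1"}
test = {
    ("chr1", 100, "A", "G"): "0/1",
    ("chr1", 200, "C", "T"): "0/1",
    ("chr1", 300, "G", "A"): "0/1",
}
report = compare_calls(rm, test)
print(report["fraction_rm_recovered"])
```

The two sorted lists could be written out as separate VCF files or loaded back into the browser as user-defined tracks, as the slide suggests.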
    • Compare and Reporting
    • Realistic Approach
        • Tell Group 3 what is needed; they provide feedback on the priority and feasibility of each request.
        • Should extend regardless of the RM, or whether the data are WGS, WES, gene panel, etc.