Background
                           Implementation
                          Software features
                                 Examples
                                 Summary




                                      Evoker:
           A visualization tool for genotype intensity data


James A. Morris 1, Joshua C. Randall 2, Julian B. Maller 2 and Jeffrey C. Barrett 1

1 Statistical and Computational Genetics research group, Wellcome Trust Sanger Institute, Cambridge, UK.
2 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.




                              James Morris    Evoker



GWAS







Quality control

         Missingness and heterozygosity, by individual
         [Figure: scatter plot of the fraction of missing genotypes (y axis, 0.0 to 0.6) against mean heterozygosity (x axis, 0.2 to 0.7), one point per individual]

         Rigorous quality control (QC) procedures are essential in GWAS
         Sources of false positives include:
             Poor quality DNA
             Population structure
             Hidden confounders
             Genotyping artifacts
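
The two per-individual metrics in the plot above can be sketched as follows. This is a minimal illustration, not code from Evoker: the genotype coding (0, 1, 2 for the count of one allele, `None` for a missing call), the function names, and the thresholds in `flag_outliers` are all assumptions chosen for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed coding: 0 or 2 = homozygous, 1 = heterozygous, None = missing call.
Genotype = Optional[int]


def individual_qc(genotypes: Sequence[Genotype]) -> Tuple[float, float]:
    """Return (fraction of missing genotypes, heterozygosity among called genotypes)."""
    total = len(genotypes)
    if total == 0:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g is not None]
    missing_fraction = (total - len(called)) / total
    if not called:
        # Heterozygosity is undefined when nothing was called.
        return missing_fraction, float("nan")
    heterozygosity = sum(1 for g in called if g == 1) / len(called)
    return missing_fraction, heterozygosity


def flag_outliers(
    samples: Dict[str, Sequence[Genotype]],
    max_missing: float = 0.05,                     # hypothetical threshold
    het_range: Tuple[float, float] = (0.2, 0.4),   # hypothetical threshold
) -> List[str]:
    """Return the names of individuals that fall outside the assumed QC limits."""
    low, high = het_range
    flagged = []
    for name, genotypes in samples.items():
        missing, het = individual_qc(genotypes)
        # A NaN heterozygosity fails the range check and is therefore flagged.
        if missing > max_missing or not (low <= het <= high):
            flagged.append(name)
    return flagged
```

In practice the thresholds would be chosen by inspecting a plot like the one above rather than fixed in advance.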



Evoker




         Evoker is an open source program for visualizing genotype intensity cluster plots
         Designed to be integrated into GWAS quality control workflows
         It provides a fast, user-friendly and interactive interface
         Evoker is a solution to the computational and storage problems of working with large genotype intensity datasets
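
One general way to make per-SNP lookups fast on large intensity datasets is to store the data as fixed-width binary records, so that a single SNP can be read with one seek. The sketch below illustrates that idea only. The record layout (one little-endian pair of 32-bit floats per sample, SNP after SNP) and the function names are assumptions made for this example and are not taken from Evoker's actual file format.

```python
import io
import struct
from typing import BinaryIO, List, Sequence, Tuple

# Assumed layout: each sample contributes one (x, y) intensity pair of 32-bit floats.
PAIR = struct.Struct("<ff")


def write_intensities(out: BinaryIO, snps: Sequence[Sequence[Tuple[float, float]]]) -> None:
    """Write every SNP as a fixed-width record of (x, y) pairs, one pair per sample."""
    for snp in snps:
        for x, y in snp:
            out.write(PAIR.pack(x, y))


def read_snp(f: BinaryIO, snp_index: int, n_samples: int) -> List[Tuple[float, float]]:
    """Read the intensities for one SNP without scanning the rest of the file."""
    record_size = PAIR.size * n_samples
    f.seek(snp_index * record_size)  # records are fixed width, so the offset is a product
    data = f.read(record_size)
    if len(data) != record_size:
        raise IndexError("SNP index out of range")
    return [PAIR.unpack_from(data, i * PAIR.size) for i in range(n_samples)]


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_intensities(buffer, [[(0.5, 1.0), (0.25, 2.0)], [(1.5, 0.0), (3.0, 0.75)]])
    print(read_snp(buffer, 1, 2))
```

With this layout the cost of plotting one SNP depends on the number of samples, not on the number of SNPs in the file.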



Data Challenges







Implementation



      Evoker is written in Java
      Utilizes helper scripts written in Perl
      Platform-specific packages are available for:
          Mac
          Windows
          Linux







Input features







Interactive features







Using Evoker to confirm good genotype calls







Using Evoker to detect calling errors







Using Evoker to detect poor genotyping data







Summary




               www.sanger.ac.uk/resources/software/evoker/

  Open source software under the MIT license
  Source code and files are available here: sourceforge.net/projects/evoker
  Morris et al. (2010) Evoker: a visualization tool for genotype intensity data.
  Bioinformatics 26(14):1786-1787

Morris bosc2010 evoker
