http://www.mrsymbiomath.eu
This work has been partially supported by the Mr.Symbiomath IAPP (Project Code: 324554); the ‘Plataforma de Recursos Biomoleculares y Bioinformaticos (ISCIII-PT13.0001.0012)’; and the ‘Proyecto de Excelencia Junta de Andalucia (P10-TIC-6108)’.
Alex Upton1, Priscill Orue1, Oswaldo Trelles1,2
1 Computer Architecture Department, University of Malaga (UMA), Spain
2 RISC Software GmbH. Hagenberg, Austria
Abstract
It is widely agreed that complex diseases are typically caused by the joint effects of multiple genetic variations, rather than by a single genetic variation [1]. Multi-SNP interactions, also known as epistatic interactions, have the potential to provide information about the causes of complex diseases, and build on genome-wide association studies (GWAS) that look at associations between single SNPs and phenotypes. However, epistatic analysis methods are computationally expensive and, being command-line based, have limited accessibility for biologists wanting to analyse GWAS datasets.
Here we present APPistatic, a prototype desktop version of a pipeline for epistatic analysis of GWAS datasets. This application combines ease of use, via a GUI, with accelerated implementations of the BOOST [2] and FaST-LMM [3] epistatic analysis methods.
Pipeline
Conclusions
• Implementing the analysis methods behind a GUI improves accessibility, making epistatic analysis tools a viable option for end users, such as biologists, who are not comfortable with command-line tools. This allows further analysis of GWAS datasets, potentially building on existing analyses and uncovering additional genetic information.
• A notable improvement in execution time is also obtained compared with default execution of the epistatic analysis tools. Future HPC deployment makes the analysis of typical GWAS datasets feasible: a relatively small GWAS dataset, with 100,000 SNPs that pass quality control, has approximately 5 x 10^9 pairwise interactions, which would take approximately two years to calculate on a desktop computer. Using HPC, this can be executed in a number of days, aiding the analysis of genetic variants of disease.
• In addition, a cloud-based version of the pipeline could be developed using Web services and accessed via a client such as jORCA [6]. Cloud computing allows researchers to rent computational and storage resources on an ad hoc basis for large-scale data processing, giving them access to High Performance Computing. Furthermore, this implementation could be joined to existing cloud-based pipelines to create an all-in-one process. We are also exploring the option of exporting results directly to visualisation software for visual inspection.
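The pairwise interaction count quoted in the conclusions follows from the number of unordered SNP pairs, n(n-1)/2. A minimal Python sketch of this arithmetic is given here; the function name is illustrative and is not part of APPistatic.

```python
from math import comb


def count_snp_pairs(n_snps: int) -> int:
    """Return the number of unordered SNP pairs, n * (n - 1) / 2."""
    return comb(n_snps, 2)


if __name__ == "__main__":
    # 10,000 SNPs is the demo data set size; 100,000 is the "relatively
    # small" GWAS data set mentioned in the conclusions.
    for n in (10_000, 100_000):
        print(f"{n:>7} SNPs -> {count_snp_pairs(n):,} pairwise interactions")
```

For 100,000 SNPs this gives 4,999,950,000 pairs, i.e. roughly 5 x 10^9. Because the count grows quadratically, a tenfold increase in SNPs means about a hundredfold increase in pairs to test.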
Accessibility
Analysis of GWAS Data
The application provides an easy-to-use all-in-one analysis of GWAS data by
incorporating a number of analysis steps which are shown in Figure 1 below.
Steps Involved
(1) End user loads GWAS files of interest. These can be either in VCF or PLINK format.
For end users with raw .CEL files, one recommended tool for obtaining VCF files is
the Cloud-based GWAS Analysis Pipeline for Clinical Researchers [4].
(2) Prior to epistatic analysis, it is of interest to carry out single SNP association analysis.
This is performed using the widely used tool PLINK [5].
(3) The next step is to carry out an epistatic analysis using an optimised implementation
of BOOST that takes advantage of the multi-core environment of modern computers.
(4) Before the FaST-LMM [3] analysis tools can be used, the user's files must be
converted to a compatible format. This conversion is carried out in this step.
(5) The next step is to carry out a single SNP association analysis with FaST-LMM,
which corrects for population structure.
(6) The final step is to carry out an epistatic analysis using FaST-LMM. As with BOOST,
implementation has been optimised to take advantage of multiple cores.
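Step (2) can be pictured as assembling a PLINK command line. The sketch below only builds the argument list and does not run anything. It assumes PLINK is invoked as `plink`, uses PLINK's `--bfile`/`--file`, `--assoc` and `--out` options, and takes hypothetical file prefixes. The poster does not state which options APPistatic actually passes, so this is an illustration rather than the application's own call.

```python
from typing import List


def plink_assoc_command(prefix: str, out_prefix: str, binary: bool = True) -> List[str]:
    """Build (but do not run) a PLINK single-SNP association command.

    prefix     -- hypothetical input file prefix (e.g. mydata.bed/.bim/.fam)
    out_prefix -- hypothetical prefix for the result files
    binary     -- True for binary PLINK files (--bfile), False for
                  text PED/MAP files (--file)
    """
    input_flag = "--bfile" if binary else "--file"
    return ["plink", input_flag, prefix, "--assoc", "--out", out_prefix]


if __name__ == "__main__":
    # The file prefixes here are placeholders, not files shipped with APPistatic.
    print(" ".join(plink_assoc_command("mydata", "mydata_single_snp")))
```

Returning a list rather than a single string means the command could later be handed to `subprocess.run` without shell quoting issues.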
Acceleration
Desktop PC Implementation
The execution of APPistatic on a typical desktop PC results in a speedup of between 4
and 8 times for epistatic analysis, depending on the number of cores. The screenshot
above shows the default acceleration, using 4 tasks and 256MB RAM per task.
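One way to spread an exhaustive pairwise scan over several tasks is to deal out the first SNP of each pair round-robin. The sketch below illustrates that idea only. It is an assumption about how such a split could work, not a description of APPistatic's internal scheduling, and all names are hypothetical.

```python
from typing import Iterator, List, Tuple


def assign_rows(n_snps: int, n_tasks: int) -> List[List[int]]:
    """Deal first-SNP indices 0..n_snps-2 out to tasks in round-robin order."""
    rows: List[List[int]] = [[] for _ in range(n_tasks)]
    for i in range(n_snps - 1):  # the last SNP has no higher-indexed partner
        rows[i % n_tasks].append(i)
    return rows


def pairs_for_task(rows: List[int], n_snps: int) -> Iterator[Tuple[int, int]]:
    """Yield every pair (i, j) with j > i for the first-SNP indices of one task."""
    for i in rows:
        for j in range(i + 1, n_snps):
            yield (i, j)


if __name__ == "__main__":
    n_snps, n_tasks = 10, 4  # toy sizes; 4 tasks mirrors the default setting
    for task_id, rows in enumerate(assign_rows(n_snps, n_tasks)):
        n_pairs = sum(1 for _ in pairs_for_task(rows, n_snps))
        print(f"task {task_id}: first-SNP indices {rows}, {n_pairs} pairs")
```

Low first-SNP indices have many partners and high ones have few, so dealing them out in turn gives each task a mix of both and keeps the workloads roughly balanced.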
HPC Implementation
A greater speedup, which makes the analysis of typical GWAS datasets feasible, is
obtained by using High Performance Computing (HPC). Initial HPC deployment using 100
cores shows a promising speedup of over 114 times for FaST-LMM epistatic analysis.
Table 2 below shows the execution times for BOOST and FaST-LMM epistatic analysis of a
demo data set on both a typical desktop PC running Windows and the initial HPC
deployment. It should be noted that the demo data set contains 10,000 SNPs. The faster
execution time of BOOST is due to its use of a linear regression model, compared to the
linear mixed model used by FaST-LMM.
Computational Environment                BOOST Epistatic       FaST-LMM Epistatic
                                         Execution Time (s)    Execution Time (s)
Standard Implementation (a)              25.4                  15123
APPistatic Deployed on Desktop PC (b)    4.8                   1903
Deployment on HPC (c)                    1.2                   132
(a) Default execution of applications from command line on Desktop PC (detailed below)
(b) Desktop PC with Intel Core 2 Quad 2.66 GHz CPU and 4 GB RAM running Windows 7
(c) Split into 100 tasks with 4 cores and 8 GB RAM assigned to each task
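The speedups quoted in the text can be checked against the execution times in the table by dividing the standard-implementation time by the accelerated time. A minimal sketch follows; the dictionary keys and function name are illustrative, and the numbers are copied from the table.

```python
# Execution times in seconds, copied from the results table.
TIMES = {
    "standard": {"BOOST": 25.4, "FaST-LMM": 15123.0},
    "desktop": {"BOOST": 4.8, "FaST-LMM": 1903.0},
    "hpc": {"BOOST": 1.2, "FaST-LMM": 132.0},
}


def speedup(environment: str, method: str) -> float:
    """Speedup relative to the standard command-line implementation."""
    return TIMES["standard"][method] / TIMES[environment][method]


if __name__ == "__main__":
    for env in ("desktop", "hpc"):
        for method in ("BOOST", "FaST-LMM"):
            print(f"{env:8s} {method:9s} {speedup(env, method):6.1f}x")
```

With these figures the desktop speedups come out at about 5.3x (BOOST) and 7.9x (FaST-LMM), and the HPC speedups at about 21.2x (BOOST) and 114.6x (FaST-LMM).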
References
[1] Anunciação, Orlando, Susana Vinga, and Arlindo L. Oliveira. "Using Information Interaction to Discover Epistatic Effects in Complex Diseases." PLoS ONE 8, no. 10 (2013): e76300.
[2] Wan, Xiang, Can Yang, Qiang Yang, Hong Xue, Xiaodan Fan, Nelson LS Tang, and Weichuan Yu. "BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies." The American Journal of Human Genetics 87, no. 3 (2010): 325-340.
[3] Lippert, Christoph, Jennifer Listgarten, Ying Liu, Carl M. Kadie, Robert I. Davidson, and David Heckerman. "FaST linear mixed models for genome-wide association studies." Nature Methods 8, no. 10 (2011): 833-835.
[4] P. Heinzlreiter, J. Perkins, O. Torreño Tirado, J. Karlsson, A. Mitterecker, M. Blanca and O. Trelles. "A Cloud-based GWAS Analysis Pipeline for Clinical Researchers." 4th International Conference on Cloud Computing and Services Science, CLOSER 2014.
[5] Purcell, Shaun, Benjamin Neale, Kathe Todd-Brown, Lori Thomas, Manuel AR Ferreira, David Bender, Julian Maller et al. "PLINK: a tool set for whole-genome association and population-based linkage analyses." The American Journal of Human Genetics 81, no. 3 (2007): 559-575.
[6] Martín-Requena, Victoria, Javier Ríos, Maximiliano García, Sergio Ramírez, and Oswaldo Trelles. "jORCA: easily integrating bioinformatics Web Services." Bioinformatics 26, no. 4 (2010): 553-559.
Figure 1: Overview of Pipeline
Graphical User Interface
Providing GUI access to epistatic analysis methods, along with single SNP association methods, improves their accessibility because multiple tools are accessed in the same manner. This allows the targeted non-expert computer users, e.g. biologists, to easily analyse their GWAS datasets without having to learn different commands for each tool. The GUI is shown in Figure 2 on the left. Note the easily configurable options for acceleration. The prototype version of APPistatic can be downloaded from: http://chirimoyo.ac.uma.es/appistatic
Figure 2: Implementation Results
Figure 2: APPistatic GUI

More Related Content

What's hot

Scientific Workflow Systems for accessible, reproducible research
Scientific Workflow Systems for accessible, reproducible researchScientific Workflow Systems for accessible, reproducible research
Scientific Workflow Systems for accessible, reproducible research
Peter van Heusden
 
Bionimbus Cambridge Workshop (3-28-11, v7)
Bionimbus Cambridge Workshop (3-28-11, v7)Bionimbus Cambridge Workshop (3-28-11, v7)
Bionimbus Cambridge Workshop (3-28-11, v7)
Robert Grossman
 
Assessing Galaxy's ability to express scientific workflows in bioinformatics
Assessing Galaxy's ability to express scientific workflows in bioinformaticsAssessing Galaxy's ability to express scientific workflows in bioinformatics
Assessing Galaxy's ability to express scientific workflows in bioinformaticsPeter van Heusden
 
Integrative data management for reproducibility of microscopy experiments
Integrative data management for reproducibility of microscopy experimentsIntegrative data management for reproducibility of microscopy experiments
Integrative data management for reproducibility of microscopy experiments
Sheeba Samuel
 
(2016)application of parallel glowworm swarm optimization algorithm for data ...
(2016)application of parallel glowworm swarm optimization algorithm for data ...(2016)application of parallel glowworm swarm optimization algorithm for data ...
(2016)application of parallel glowworm swarm optimization algorithm for data ...
Akram Pasha
 
Open Science Data Cloud - CCA 11
Open Science Data Cloud - CCA 11Open Science Data Cloud - CCA 11
Open Science Data Cloud - CCA 11
Robert Grossman
 
Rethinking data intensive science using scalable analytics systems
 Rethinking data intensive science using scalable analytics systems Rethinking data intensive science using scalable analytics systems
Rethinking data intensive science using scalable analytics systems
newmooxx
 
ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...
Paolo Missier
 
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV DataThe DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
Anubhav Jain
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
Ian Foster
 
Indexing data on the web a comparison of schema level indices for data search
Indexing data on the web a comparison of schema level indices for data searchIndexing data on the web a comparison of schema level indices for data search
Indexing data on the web a comparison of schema level indices for data search
Till Blume
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and Analytics
Anubhav Jain
 
Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...
ijgca
 
VariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn LangitVariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn Langit
Data Con LA
 
Machine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methodsMachine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methods
Anubhav Jain
 
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Anubhav Jain
 
Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)
Robert Grossman
 
Spatial Analysis On Histological Images Using Spark
Spatial Analysis On Histological Images Using SparkSpatial Analysis On Histological Images Using Spark
Spatial Analysis On Histological Images Using Spark
Jen Aman
 

What's hot (18)

Scientific Workflow Systems for accessible, reproducible research
Scientific Workflow Systems for accessible, reproducible researchScientific Workflow Systems for accessible, reproducible research
Scientific Workflow Systems for accessible, reproducible research
 
Bionimbus Cambridge Workshop (3-28-11, v7)
Bionimbus Cambridge Workshop (3-28-11, v7)Bionimbus Cambridge Workshop (3-28-11, v7)
Bionimbus Cambridge Workshop (3-28-11, v7)
 
Assessing Galaxy's ability to express scientific workflows in bioinformatics
Assessing Galaxy's ability to express scientific workflows in bioinformaticsAssessing Galaxy's ability to express scientific workflows in bioinformatics
Assessing Galaxy's ability to express scientific workflows in bioinformatics
 
Integrative data management for reproducibility of microscopy experiments
Integrative data management for reproducibility of microscopy experimentsIntegrative data management for reproducibility of microscopy experiments
Integrative data management for reproducibility of microscopy experiments
 
(2016)application of parallel glowworm swarm optimization algorithm for data ...
(2016)application of parallel glowworm swarm optimization algorithm for data ...(2016)application of parallel glowworm swarm optimization algorithm for data ...
(2016)application of parallel glowworm swarm optimization algorithm for data ...
 
Open Science Data Cloud - CCA 11
Open Science Data Cloud - CCA 11Open Science Data Cloud - CCA 11
Open Science Data Cloud - CCA 11
 
Rethinking data intensive science using scalable analytics systems
 Rethinking data intensive science using scalable analytics systems Rethinking data intensive science using scalable analytics systems
Rethinking data intensive science using scalable analytics systems
 
ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...ReComp: optimising the re-execution of analytics pipelines in response to cha...
ReComp: optimising the re-execution of analytics pipelines in response to cha...
 
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV DataThe DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
 
Indexing data on the web a comparison of schema level indices for data search
Indexing data on the web a comparison of schema level indices for data searchIndexing data on the web a comparison of schema level indices for data search
Indexing data on the web a comparison of schema level indices for data search
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and Analytics
 
Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...Method for conducting a combined analysis of grid environment’s fta and gwa t...
Method for conducting a combined analysis of grid environment’s fta and gwa t...
 
VariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn LangitVariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn Langit
 
Machine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methodsMachine learning for materials design: opportunities, challenges, and methods
Machine learning for materials design: opportunities, challenges, and methods
 
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
 
Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)
 
Spatial Analysis On Histological Images Using Spark
Spatial Analysis On Histological Images Using SparkSpatial Analysis On Histological Images Using Spark
Spatial Analysis On Histological Images Using Spark
 

Viewers also liked

Association mapping using local genealogies
Association mapping using local genealogiesAssociation mapping using local genealogies
Association mapping using local genealogiesmailund
 
GEE & GLMM in GWAS
GEE & GLMM in GWASGEE & GLMM in GWAS
GEE & GLMM in GWAS
Jinseob Kim
 
Probability And Stats Intro
Probability And Stats IntroProbability And Stats Intro
Probability And Stats Intromailund
 
linkage
linkagelinkage
linkage
DUSHYANT DUBE
 
SNPs Presentation Cavalcanti Lab
SNPs Presentation Cavalcanti LabSNPs Presentation Cavalcanti Lab
SNPs Presentation Cavalcanti Labjsrep91
 
Intro gwas
Intro gwasIntro gwas
Intro gwas
Omar Yang
 
Measures of Linkage Disequilibrium
Measures of Linkage DisequilibriumMeasures of Linkage Disequilibrium
Measures of Linkage Disequilibrium
Awais Khan
 
Ch5 linkage
Ch5 linkageCh5 linkage
Estimation of Linkage Disequilibrium using GGT2 Software
Estimation of Linkage Disequilibrium using GGT2 SoftwareEstimation of Linkage Disequilibrium using GGT2 Software
Estimation of Linkage Disequilibrium using GGT2 Software
Awais Khan
 
Lecture 3 l dand_haplotypes_full
Lecture 3 l dand_haplotypes_fullLecture 3 l dand_haplotypes_full
Lecture 3 l dand_haplotypes_full
Lekki Frazier-Wood
 
Creating a Kinship Matrix Using MSA
Creating a Kinship Matrix Using MSACreating a Kinship Matrix Using MSA
Creating a Kinship Matrix Using MSAheathermerk
 
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
FAO
 
Epi519 Gwas Talk
Epi519 Gwas TalkEpi519 Gwas Talk
Epi519 Gwas Talk
joshbis
 
Genelinkagemap
GenelinkagemapGenelinkagemap
Genelinkagemap
sarahhg
 
Introduction to association mapping and tutorial using tassel
Introduction to association mapping and tutorial using tasselIntroduction to association mapping and tutorial using tassel
Introduction to association mapping and tutorial using tassel
Awais Khan
 
GWAS
GWASGWAS
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
CGIAR Research Program on Roots, Tubers and Bananas
 
How to solve linkage map problems
How to solve linkage map problemsHow to solve linkage map problems
How to solve linkage map problems
martyynyyte
 
Genome wide association studies seminar
Genome wide association studies seminarGenome wide association studies seminar
Genome wide association studies seminarVarsha Gayatonde
 
Genetic Linkage
Genetic LinkageGenetic Linkage
Genetic LinkageJolie Yu
 

Viewers also liked (20)

Association mapping using local genealogies
Association mapping using local genealogiesAssociation mapping using local genealogies
Association mapping using local genealogies
 
GEE & GLMM in GWAS
GEE & GLMM in GWASGEE & GLMM in GWAS
GEE & GLMM in GWAS
 
Probability And Stats Intro
Probability And Stats IntroProbability And Stats Intro
Probability And Stats Intro
 
linkage
linkagelinkage
linkage
 
SNPs Presentation Cavalcanti Lab
SNPs Presentation Cavalcanti LabSNPs Presentation Cavalcanti Lab
SNPs Presentation Cavalcanti Lab
 
Intro gwas
Intro gwasIntro gwas
Intro gwas
 
Measures of Linkage Disequilibrium
Measures of Linkage DisequilibriumMeasures of Linkage Disequilibrium
Measures of Linkage Disequilibrium
 
Ch5 linkage
Ch5 linkageCh5 linkage
Ch5 linkage
 
Estimation of Linkage Disequilibrium using GGT2 Software
Estimation of Linkage Disequilibrium using GGT2 SoftwareEstimation of Linkage Disequilibrium using GGT2 Software
Estimation of Linkage Disequilibrium using GGT2 Software
 
Lecture 3 l dand_haplotypes_full
Lecture 3 l dand_haplotypes_fullLecture 3 l dand_haplotypes_full
Lecture 3 l dand_haplotypes_full
 
Creating a Kinship Matrix Using MSA
Creating a Kinship Matrix Using MSACreating a Kinship Matrix Using MSA
Creating a Kinship Matrix Using MSA
 
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
Mapping and Applications of Linkage Disequilibrium and Association Mapping in...
 
Epi519 Gwas Talk
Epi519 Gwas TalkEpi519 Gwas Talk
Epi519 Gwas Talk
 
Genelinkagemap
GenelinkagemapGenelinkagemap
Genelinkagemap
 
Introduction to association mapping and tutorial using tassel
Introduction to association mapping and tutorial using tasselIntroduction to association mapping and tutorial using tassel
Introduction to association mapping and tutorial using tassel
 
GWAS
GWASGWAS
GWAS
 
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
 
How to solve linkage map problems
How to solve linkage map problemsHow to solve linkage map problems
How to solve linkage map problems
 
Genome wide association studies seminar
Genome wide association studies seminarGenome wide association studies seminar
Genome wide association studies seminar
 
Genetic Linkage
Genetic LinkageGenetic Linkage
Genetic Linkage
 

Similar to Accelerating GWAS epistatic interaction analysis methods

2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows
myGrid team
 
Collins seattle-2014-final
Collins seattle-2014-finalCollins seattle-2014-final
Collins seattle-2014-final
inside-BigData.com
 
11-Big Data Application in Biomedical Research and Health Care.pptx
11-Big Data Application in Biomedical Research and Health Care.pptx11-Big Data Application in Biomedical Research and Health Care.pptx
11-Big Data Application in Biomedical Research and Health Care.pptx
shikhamittal42
 
REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...
REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...
REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...
ijdpsjournal
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
Rafael Ferreira da Silva
 
Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And Grid
Ian Foster
 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
Ian Foster
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
inventionjournals
 
Chapter 5 applications of neural networks
Chapter 5           applications of neural networksChapter 5           applications of neural networks
Chapter 5 applications of neural networksPunit Saini
 
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
Ganesan Narayanasamy
 
healthcare application using cloud platform
healthcare  application using cloud platformhealthcare  application using cloud platform
healthcare application using cloud platform
Swathi Rampur
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)Michael Atkins
 
Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Una estrategia para la integración de ontologías, servicios web y PLN en el a...Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Anubis Hosein
 
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
Servio Fernando Lima Reina
 
Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010Oladokun Sulaiman
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016
tsysglobalsolutions
 
A Survey on Bioinformatics Tools
A Survey on Bioinformatics ToolsA Survey on Bioinformatics Tools
A Survey on Bioinformatics Tools
idescitation
 
BIOMED_presentation.ppt
BIOMED_presentation.pptBIOMED_presentation.ppt
BIOMED_presentation.ppt
AnandKumar459862
 
Next Generation Sequencing methods
Next Generation Sequencing methods Next Generation Sequencing methods
Next Generation Sequencing methods
Zohaib HUSSAIN
 

Similar to Accelerating GWAS epistatic interaction analysis methods (20)

Poster (1)
Poster (1)Poster (1)
Poster (1)
 
2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows
 
Collins seattle-2014-final
Collins seattle-2014-finalCollins seattle-2014-final
Collins seattle-2014-final
 
11-Big Data Application in Biomedical Research and Health Care.pptx
11-Big Data Application in Biomedical Research and Health Care.pptx11-Big Data Application in Biomedical Research and Health Care.pptx
11-Big Data Application in Biomedical Research and Health Care.pptx
 
REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...
REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...
REAL-TIME BLEEDING DETECTION IN GASTROINTESTINAL TRACT ENDOSCOPIC EXAMINATION...
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
 
Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And Grid
 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
Chapter 5 applications of neural networks
Chapter 5           applications of neural networksChapter 5           applications of neural networks
Chapter 5 applications of neural networks
 
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...OpenPOWER Academia and Research team's webinar  - Presentations from Oak Ridg...
OpenPOWER Academia and Research team's webinar - Presentations from Oak Ridg...
 
healthcare application using cloud platform
healthcare  application using cloud platformhealthcare  application using cloud platform
healthcare application using cloud platform
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)
 
Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Una estrategia para la integración de ontologías, servicios web y PLN en el a...Una estrategia para la integración de ontologías, servicios web y PLN en el a...
Una estrategia para la integración de ontologías, servicios web y PLN en el a...
 
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
EUSFLAT 2019: explainable neuro fuzzy recurrent neural network to predict col...
 
Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010
 
IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016 IEEE Fuzzy system Title and Abstract 2016
IEEE Fuzzy system Title and Abstract 2016
 
A Survey on Bioinformatics Tools
A Survey on Bioinformatics ToolsA Survey on Bioinformatics Tools
A Survey on Bioinformatics Tools
 
BIOMED_presentation.ppt
BIOMED_presentation.pptBIOMED_presentation.ppt
BIOMED_presentation.ppt
 
Next Generation Sequencing methods
Next Generation Sequencing methods Next Generation Sequencing methods
Next Generation Sequencing methods
 

Accelerating GWAS epistatic interaction analysis methods

http://www.mrsymbiomath.eu

This work has been partially supported by the Mr.Symbiomath IAPP (Project Code: 324554); the ‘Plataforma de Recursos Biomoleculares y Bioinformaticos (ISCIII-PT13.0001.0012)’ and ‘Proyecto de Excelencia Junta de Andalucia (P10-TIC-6108)’

Alex Upton1, Priscill Orue1, Oswaldo Trelles1,2
1 Computer Architecture Department, University of Malaga (UMA), Spain
2 RISC Software GmbH. Hagenberg, Austria

Abstract
It is widely agreed that complex diseases are typically caused by joint effects of multiple genetic variations, rather than a single genetic variation [1]. Multi-SNP interactions, also known as epistatic interactions, have the potential to provide information about causes of complex diseases, and build on GWAS studies that look at associations between single SNPs and phenotypes. However, epistatic analysis methods are computationally expensive, and because they are command line based they have limited accessibility for biologists wanting to analyse GWAS datasets.
Here we present APPistatic, a prototype desktop version of a pipeline for epistatic analysis of GWAS datasets. This application combines ease-of-use, via a GUI, with accelerated implementation of the BOOST [2] and FaST-LMM [3] epistatic analysis methods.

Pipeline

Accessibility

Analysis of GWAS Data
The application provides an easy-to-use, all-in-one analysis of GWAS data by incorporating a number of analysis steps, which are shown in Figure 1 below.

Steps Involved
(1) The end user loads the GWAS files of interest. These can be in either VCF or PLINK format. For end users with raw .CEL files, one recommended tool for obtaining VCF files is the Cloud-based GWAS Analysis Pipeline for Clinical Researchers [4].
(2) Prior to epistatic analysis, it is of interest to carry out single SNP association analysis. This is performed using the widely used tool PLINK [5].
(3) The next step is to carry out an epistatic analysis using an optimised implementation of BOOST that takes advantage of the multi-core environment of modern computers.
(4) The remaining steps use the FaST-LMM [3] analysis tools. Before these can be run, the user files have to be converted to ensure compatibility; the conversion is carried out in this step.
(5) The next step is to carry out a single SNP association analysis with FaST-LMM, which corrects for population structure.
(6) The final step is to carry out an epistatic analysis using FaST-LMM. As with BOOST, the implementation has been optimised to take advantage of multiple cores.

Figure 1: Overview of Pipeline

Graphical User Interface
Providing GUI access to epistatic analysis methods, along with single SNP association methods, improves their accessibility: multiple tools are accessed in the same manner, allowing the targeted non-expert computer users, e.g. biologists, to easily analyse their GWAS datasets without having to learn different commands for each tool. The GUI is shown in Figure 2. Note the easily configurable options for acceleration.

The prototype version of APPistatic can be downloaded from: http://chirimoyo.ac.uma.es/appistatic

Figure 2: APPistatic GUI

Acceleration

Desktop PC Implementation
The execution of APPistatic on a typical desktop PC results in a speedup of between 4 and 8 times for epistatic analysis, depending on the number of cores. The screenshot above shows the default acceleration, using 4 tasks and 256MB RAM per task.

HPC Implementation
A greater speedup, which makes the analysis of typical GWAS datasets feasible, is obtained by using High Performance Computing (HPC). Initial HPC deployment using 100 cores shows a promising speedup of over 114 times.

Table 2 below shows the execution times for BOOST and FaST-LMM epistatic analysis of a demo dataset, for both a typical desktop PC running Windows and the initial HPC deployment. It should be noted that the demo dataset contains 10,000 SNPs. The faster execution time of BOOST is due to its use of a linear regression model, compared to the linear mixed model used by FaST-LMM.

Computational Environment               BOOST Epistatic Execution Time (s)   FaST-LMM Epistatic Execution Time (s)
Standard Implementation (a)             25.4                                 15123
APPistatic Deployed on Desktop PC (b)   4.8                                  1903
Deployment on HPC (c)                   1.2                                  132

(a) Default execution of applications from command line on Desktop PC (detailed below)
(b) Desktop PC with Intel Core 2 Quad 2.66 GHz CPU and 4GB RAM running Windows 7
(c) Split into 100 tasks with 4 cores and 8GB RAM assigned to each task

Figure 2: Implementation Results

Conclusions
• Implementation of the analysis methods via a GUI results in improved accessibility, thereby making epistatic analysis tools a viable option for end users such as biologists who are not comfortable with command line based tools. This allows further analysis of GWAS datasets, potentially building on existing analysis and resulting in additional genetic information being discovered.
• A notable improvement in execution time is also obtained, compared to default execution of the epistatic analysis tools. Future HPC deployment makes typical GWAS dataset analysis feasible; a relatively small GWAS dataset, with 100,000 SNPs that pass quality control, has approximately 5 x 10^9 pairwise interactions, which would take approximately two years to calculate on a desktop computer. Using HPC, this can be executed in a number of days, aiding in the analysis of genetic variants of disease.
• In addition, a cloud-based version of the pipeline could be developed using Web services, which could be accessed via a client such as jORCA [6]. Cloud Computing allows researchers to rent computational and storage resources on an ad-hoc basis for large scale data processing, allowing access to High Performance Computing. Furthermore, this implementation could join up with existing cloud-based pipelines to create an all-in-one process. Additionally, we are exploring the option of exporting results directly to visualisation software for visual inspection of the results.

References
[1] Anunciação, Orlando, Susana Vinga, and Arlindo L. Oliveira. "Using Information Interaction to Discover Epistatic Effects in Complex Diseases." PloS one 8, no. 10 (2013): e76300.
[2] Wan, Xiang, Can Yang, Qiang Yang, Hong Xue, Xiaodan Fan, Nelson LS Tang, and Weichuan Yu. "BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies." The American Journal of Human Genetics 87, no. 3 (2010): 325-340.
[3] Lippert, Christoph, Jennifer Listgarten, Ying Liu, Carl M. Kadie, Robert I. Davidson, and David Heckerman. "FaST linear mixed models for genome-wide association studies." Nature Methods 8, no. 10 (2011): 833-835.
[4] P. Heinzlreiter, J. Perkins, O. Torreño Tirado, J. Karlsson, A. Mitterecker, M. Blanca and O. Trelles. "A Cloud-based GWAS Analysis Pipeline for Clinical Researchers." 4th International Conference on Cloud Computing and Services Science, CLOSER 2014.
[5] Purcell, Shaun, Benjamin Neale, Kathe Todd-Brown, Lori Thomas, Manuel AR Ferreira, David Bender, Julian Maller et al. "PLINK: a tool set for whole-genome association and population-based linkage analyses." The American Journal of Human Genetics 81, no. 3 (2007): 559-575.
[6] Martín-Requena, Victoria, Javier Ríos, Maximiliano García, Sergio Ramírez, and Oswaldo Trelles. "jORCA: easily integrating bioinformatics Web Services." Bioinformatics 26, no. 4 (2010): 553-559.
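
The headline figures on the poster (the number of pairwise interactions for 100,000 SNPs, and the speedups implied by the reported execution times) can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of APPistatic; the names `pair_count`, `speedup`, `speedups` and `TIMES` are hypothetical, and the timings are simply copied from the results table for the 10,000-SNP demo dataset.

```python
from math import comb


def pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs: n * (n - 1) / 2."""
    return comb(n_snps, 2)


def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Speedup factor of an accelerated run relative to a baseline run."""
    return baseline_s / accelerated_s


# Execution times in seconds, copied from the poster's results table
# (demo dataset with 10,000 SNPs). Keys are illustrative labels only.
TIMES = {
    "BOOST": {"standard": 25.4, "desktop": 4.8, "hpc": 1.2},
    "FaST-LMM": {"standard": 15123.0, "desktop": 1903.0, "hpc": 132.0},
}


def speedups(times: dict) -> dict:
    """Speedup of each accelerated environment over the standard run."""
    return {
        tool: {
            env: speedup(t["standard"], t[env])
            for env in ("desktop", "hpc")
        }
        for tool, t in times.items()
    }


if __name__ == "__main__":
    # 100,000 SNPs give 4,999,950,000 pairs, i.e. roughly 5 x 10^9.
    print(f"pairs for 100,000 SNPs: {pair_count(100_000):,}")
    for tool, per_env in speedups(TIMES).items():
        for env, factor in per_env.items():
            print(f"{tool:9s} {env:8s} speedup: {factor:.1f}x")
```

Because the number of pairs grows quadratically with the number of SNPs, a tenfold increase in SNPs (from the 10,000-SNP demo dataset to 100,000 SNPs) gives roughly a hundredfold increase in the number of pairwise tests.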