A GPU-ACCELERATED BIOINFORMATICS APPLICATION FOR
LARGE-SCALE PROTEIN INTERACTION NETWORKS
Jun Sung Yoon¹, Won-Hyong Chung²
¹AllegroViva Corporation, California, USA
²Korea Research Institute of Bioscience & Biotechnology, Daejeon, Korea
Introduction
Proteins, nucleic acids, and small molecules form a dense network of molecular interactions in a
cell. The architecture of molecular networks can reveal important principles of cellular
organization and function, similarly to the way that protein structure tells us about the function
and organization of a protein. Protein complexes are groups of proteins that interact with each
other at the same time and place, forming a single multimolecular machine. Functional modules,
in contrast, consist of proteins that participate in a particular cellular process while binding each
other at different times and places [1].
A protein-protein interaction (PPI) network is represented as a graph in which proteins are nodes
and interactions between proteins are edges. Protein complexes and functional modules can be
identified as highly interconnected subgraphs, and computational methods are now indispensable
for detecting them in protein interaction data. In addition, high-throughput screening techniques
such as yeast two-hybrid screening enable the identification of detailed protein-protein interaction
maps in multiple species. As interaction datasets grow, the scale of interconnected protein networks
increases exponentially, and the resulting complexity makes the networks computationally
challenging to analyze.
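The graph representation described above can be sketched in a few lines of Python. This is only an illustration: the protein names and interactions are made-up toy data, and the helper names `build_network` and `density` are hypothetical, not part of MCODE or Cytoscape. Density is taken here as the fraction of possible edges present, ignoring self-interactions.

```python
from collections import defaultdict


def build_network(interactions):
    """Return an adjacency map {protein: set of partners} from (a, b) pairs."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def density(adjacency, nodes):
    """Fraction of possible edges among `nodes` that are actually present."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each edge inside the subgraph is seen from both endpoints, hence // 2.
    edges = sum(len(adjacency.get(v, set()) & nodes) for v in nodes) // 2
    return edges / (n * (n - 1) / 2)


# Hypothetical toy data: P1..P4 interact pairwise, P5 binds only P4.
interactions = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P2", "P4"), ("P3", "P4"),
    ("P4", "P5"),
]
network = build_network(interactions)
```

In this toy network the four mutually interacting proteins form a fully connected subgraph, which is the kind of highly interconnected region a complex-finding method looks for, while the whole network is less dense.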
Graphics hardware has recently become widely used in high-performance computing because of its
cost effectiveness. Bioinformatics applications also exploit the GPU as a massively parallel
multi-core processor to address computational challenges in areas such as sequence analysis
and protein structure prediction. However, few attempts have been made to apply GPUs to the
analysis of biological networks.
We present a fast parallel implementation on commodity graphics hardware, based on MCODE [2],
a well-known sequential complex-finding algorithm, to address this computational challenge.
Our parallel algorithm is implemented on CUDA, the NVIDIA parallel computing architecture, and
is evaluated on various kinds of large-scale PPI networks. Using the latest NVIDIA graphics
hardware, our GPU-accelerated implementation achieves a speedup of two orders of magnitude
over the original MCODE running on the latest CPU for larger-scale protein-protein
interaction networks.
Protein Complex Prediction
Further Information
MCODE, a well-known molecular complex detection tool, is integrated as a plugin into Cytoscape,
an open-source network visualization and analysis platform. This architecture has two
limitations when handling contemporary large interaction networks.
- Serial computation: cannot fully exploit multi-core processors
- Standalone system: computing power is limited to the hardware of the user's PC
Performance
[Table: Test Network Statistics]
[Figure: Processing Time (sec)]

Test Networks
Network  Description
A        Protein Interaction Network from BioGRID database
B        Protein Interaction Network from IntAct database
C        Protein Interaction Network from I2D database
D        Protein Interaction Network from DIP database
E        Yeast Protein Interaction Network from DroID database
F        Human Protein Interaction Network from DroID database

System Specification
CPU          Intel Core i7 920 @ 2.67 GHz
Main Memory  6 GB DDR3 RAM
O/S          Ubuntu Linux 10.04 LTS
GPU          NVIDIA GTX580

Reference
1. V. Spirin and L. A. Mirny, "Protein complexes and functional modules in molecular networks,"
Proceedings of the National Academy of Sciences of the United States of America, vol. 100,
no. 21, pp. 12123–12126, 2003.
2. G. D. Bader and C. W. V. Hogue, "An automated method for finding molecular complexes in
large protein interaction networks," BMC Bioinformatics, 4:2, 2003.
The AllegroMCODE plugin and our GPU computing server are freely available. More information
about installation and usage is available at allegroviva.com/allegromcode. Cytoscape is
an open-source platform for complex-network analysis and visualization and is freely available
from www.cytoscape.org.
Jun Sung Yoon: jyoon@allegroviva.com
Won-Hyong Chung: whchung@kribb.re.kr
Algorithm Options
Include Loops      Disabled
Degree Cutoff      2
Node Score Cutoff  0.2
K-core             2
Max. Depth         100
Haircut            Enabled
Fluff              Disabled
Processing time was measured by running the MCODE Cytoscape plugin and our AllegroMCODE
Cytoscape plugin with the same algorithm options.
Speedup: 278x, 460x, 357x, 222x, 536x, 451x
GPU-accelerated Computing Architecture
- Enables you to exploit GPU acceleration without any special graphics hardware of your own.
- Provides remote procedure calls via the standard XML-RPC protocol.
- Clients implemented in Perl, Python, C, C++, Java, or PHP can easily make a request to the
server by sending an XML document.
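To make the XML-RPC point concrete, the sketch below builds the XML document a client would send, using Python's standard `xmlrpc.client` module. The poster does not document the server's API, so the method name `mcode.findComplexes`, the option keys, and the edge-list payload are all assumptions made for illustration; only the option values are taken from the Algorithm Options used in the evaluation.

```python
import xmlrpc.client

# Hypothetical method name: the real server API is not given in the poster.
METHOD = "mcode.findComplexes"

# Hypothetical payload: a tiny edge list plus options mirroring the poster's settings.
edges = [["P1", "P2"], ["P2", "P3"], ["P1", "P3"]]
options = {
    "degreeCutoff": 2,
    "nodeScoreCutoff": 0.2,
    "kCore": 2,
    "maxDepth": 100,
    "haircut": True,
    "fluff": False,
}

# Serialize the call into the XML document that would travel over HTTP.
request_xml = xmlrpc.client.dumps((edges, options), methodname=METHOD)

# Decode it again, as a server would, to show that the payload survives intact.
params, method = xmlrpc.client.loads(request_xml)

# With a real endpoint, the same call could be issued through a proxy object:
#   server = xmlrpc.client.ServerProxy("http://<server-address>/")
#   result = server.mcode.findComplexes(edges, options)
```

Because the request is plain XML over HTTP, a client in any of the listed languages only needs an XML-RPC library, not CUDA or a GPU.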
[Figure labels, architecture diagram: Cytoscape Java Application; AllegroMCODE Plugin; Plugin Main Class; Parallel MCODE Java Class (supporting multi-core CPU); GPU-Parallel MCODE Native Class; Java Native Interface (JNI); GPU-Parallel MCODE Library; NVIDIA Graphics Card; XML-RPC Client Class; XML-RPC protocol; XML-RPC Server; GPU Computing Server]
[Figure labels, Cytoscape and MCODE diagram: PC; Protein Interaction Network (becoming larger, more important, more sophisticated); Network visualization; Protein Complex Detection; Cytoscape Platform (visualization & analysis); MCODE (plugin)]
Parallel MCODE Algorithm

1. Vertex Weighting
Input graph: G = (V, E)
for all v in G do in parallel
    Nv ← find the subgraph that includes the immediate neighbors of v
    Kv ← get the highest k-core graph from Nv
    kv ← get the highest k-core number from Nv
    dv ← get the density of Kv
    Wv ← kv × dv
end for

2. Molecular Complex Prediction
d: vertex weight percentage
Wv: vertex weight of v
Sv: vertex weight of the seed of v
Nv: seed vertex of v
Sv ← Wv, Nv ← v, for all v
while there is any change in Sv
    for all v, neighbors of n, do in parallel
        if Nv ≠ Nn then
            if Wv < Sn AND Wv > (1 − d) Sn then
                Sv ← Sn
                Nv ← Nn
            else if Wv = Sn AND Cv > Cn then
                Nv ← Nn
            end if
        end if
    end for
    synchronize all threads
end while

3. Post-processing
C: complex subgraphs
h: haircut flag, f: fluff flag
for all c in C do in parallel
    if c is not a 2-core then filter out c
    if h is TRUE then reduce c to its 2-core
    if f is TRUE then fluff c
end for
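The first two steps can be emulated sequentially in Python to show the data flow. This is a sketch under stated assumptions, not the authors' CUDA code: each loop iteration stands in for one GPU thread, double buffering stands in for thread synchronization, the neighborhood of v is assumed to include v itself, and, because the poster does not define C, ties between equal seed weights are broken by the smaller seed identifier. The function names and the toy graph are hypothetical.

```python
def highest_k_core(adj, nodes):
    """Return (k, core_nodes) for the highest k-core of the subgraph induced by nodes."""
    sub = {v: adj[v] & nodes for v in nodes}
    k, best = 0, set(nodes)
    while sub:
        trial = {v: set(ns) for v, ns in sub.items()}
        peeled = True
        while peeled:  # repeatedly drop vertices with degree below k + 1
            low = [v for v, ns in trial.items() if len(ns) < k + 1]
            peeled = bool(low)
            for v in low:
                for u in trial.pop(v):
                    trial[u].discard(v)
        if not trial:
            break  # no (k + 1)-core exists, so k is the highest core number
        k, best, sub = k + 1, set(trial), trial
    return k, best


def vertex_weights(adj):
    """Step 1: weight of v = highest k-core number x density of that core."""
    weights = {}
    for v in adj:  # iterations are independent: one GPU thread per vertex
        k, core = highest_k_core(adj, adj[v] | {v})
        n = len(core)
        edges = sum(len(adj[u] & core) for u in core) // 2
        weights[v] = k * (edges / (n * (n - 1) / 2) if n > 1 else 0.0)
    return weights


def propagate_seeds(adj, weights, d):
    """Step 2: each vertex adopts the best seed reachable through a neighbor."""
    seed = {v: v for v in adj}
    seed_w = dict(weights)
    changed = True
    while changed:
        changed = False
        new_seed, new_w = dict(seed), dict(seed_w)  # double buffer = sync point
        for v in adj:
            for n in adj[v]:
                s = seed_w[n]
                in_window = (1 - d) * s < weights[v] <= s
                # Assumed tie-break: equal seed weight, smaller seed id wins.
                better = s > new_w[v] or (s == new_w[v] and seed[n] < new_seed[v])
                if in_window and better:
                    new_seed[v], new_w[v] = seed[n], s
                    changed = True
        seed, seed_w = new_seed, new_w
    return seed


# Hypothetical toy graph: A-D are fully connected, E is attached to A and B.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
             ("B", "D"), ("C", "D"), ("E", "A"), ("E", "B")]
adj = {}
for a, b in toy_edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

weights = vertex_weights(adj)
seeds = propagate_seeds(adj, weights, d=0.2)
```

With d = 0.2, the four fully connected vertices end up sharing one seed while E keeps its own, because its weight falls outside the window; a larger d such as 0.5 pulls E into the same complex. Vertices that share a seed form one candidate complex, which the post-processing step would then filter.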
→ Long waits when analyzing large interaction networks
→ Users need to upgrade their hardware themselves
Parallel Algorithm
Using Cytoscape and MCODE plugin
GPU Computing Server
AllegroMCODE plugin
- A Cytoscape plugin that lets you use the remote GPU Computing Server.
- Can also accelerate the algorithm on your own graphics hardware by loading the
same GPU-Parallel MCODE Library.
- Includes a multi-threaded parallel MCODE implementation that fully exploits all the cores of
a CPU.