A GPU-ACCELERATED BIOINFORMATICS APPLICATION FORLARGE-SCALE PROTEIN INTERACTION NETWORKSJun Sung Yoon1, Won-Hyong Chung21A...
Upcoming SlideShare
Loading in …5
×

A GPU-accelerated bioinformatics application for large-scale protein interaction networks

818 views

Published on

We present a fast parallel implementation based on the sequential MCODE algorithm, a well-known cluster finding algorithm, to address the computational challenge. Our parallel algorithm is implemented on the NVIDIA parallel computing architecture of CUDA. It is evaluated for a various kinds of large-scale PPI networks. Our GPU accelerated implementation using the latest NVIDIA graphics hardware achieves a speedup of two orders of magnitudes compared to the original MCODE in the latest CPU for lager-scale protein-protein interaction networks.

For more information, please visit us at http://allegroviva.com/allegromcode/. The GPU-accelerated MCODE plugin, AllegroMCODE plugin, is available in Cytoscape App Stroe: http://apps.cytoscape.org/apps/allegromcode

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
818
On SlideShare
0
From Embeds
0
Number of Embeds
25
Actions
Shares
0
Downloads
16
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

A GPU-accelerated bioinformatics application for large-scale protein interaction networks

  1. 1. A GPU-ACCELERATED BIOINFORMATICS APPLICATION FORLARGE-SCALE PROTEIN INTERACTION NETWORKSJun Sung Yoon1, Won-Hyong Chung21AllegroViva Corporation, California, USA2Korea Research Institute of Bioscience & Biotechnology, Daejeon, KoreaIntroductionProteins, nucleic acids, and small molecules form a dense network of molecular interactions in acell. The architecture of molecular networks can reveal important principles of cellularorganization and function, similarly to the way that protein structure tells us about the functionand organization of a protein. Protein complexes are groups of proteins that interact with eachother at the same time and place, forming a single multimolecular machine. Functional modules,in contrast, consist of proteins that participate in a particular cellular process while binding eachother at a different time and place1.A protein-protein interaction network is represented as proteins are nodes and interactionsbetween proteins are edges. Protein complexes and functional modules can be identified ashighly interconnected subgraphs and computational methods are now inevitable to detect themfrom protein interaction data. In addition, High-throughput screening techniques such as yeasttwo-hybrid screening enable identification of detailed protein-protein interactions map in multiplespecies. As the interaction dataset increases, the scale of interconnected protein networksincreases exponentially so that the increasing complexity of network gives computationalchallenges to analyze the networks.Graphics hardware is recently widely used in high-performance computing due to its costeffectiveness. Bioinformatics applications also exploit GPU as a massive parallel multi-coreprocessor to address computational challenges in the many areas such as sequence analysisand protein structure prediction. However, few attempts have been made to analyze biologicalnetworks.We present a fast parallel implementation using commodity graphics hardware based a well-known sequential complex finding algorithm of MCODE2 to address the computational challenge.Our parallel algorithm is implemented on the NVIDIA parallel computing architecture of CUDA. Itis evaluated for a various kinds of large-scale PPI networks. Our GPU acceleratedimplementation using the latest NVIDIA graphics hardware achieves a speedup of two orders ofmagnitudes compared to the original MCODE in the latest CPU for lager-scale protein-proteininteraction networks.Protein Complex PredictionFurther InformationA well-known molecular complex detection tool of MCODE plugin is integrated in the open-source network visualization and analysis platform of Cytoscape platform. This architecturehas two limitations to handle contemporary large interaction network.-Serial computation: Can not fully exploit muti-core processors-Standalone system: Its computing power is limited to user’s PC hardware spec.PerformanceReference Test Network Statistics  Processing Time (sec)Network DescriptionA Protein Interaction Network from BioGRID databaseB Protein Interaction Network from IntAct databaseC Protein Interaction Network from I2D databaseD Protein Interaction Network from DIP databaseE Yeast Protein Interaction Network from DroID databaseF Human Protein Interaction Network from DroID database Test NetworksCPU Main Memory O/S GPUIntel Core i7 920 @ 2.67GHz 6 GB DDR3 RAM Ubuntu Linux 10.04 LTS NVIDIA GTX580 System Specification1. V. Spirin and L. A. Mirny, “Protein complexes and functional modules in molecular networks,”Proceedings of the National Academy of Sciences of the United States of America, vol. 100,no. 21, pp. 12123–12126, 2003.2. G. D. Bader and C. W. V. Hogue "An automated method for finding molecular complexes inlarge protein interaction networks", BMC Bioinformatics, 4(2), 2003AllegroMCODE plugin and our GPU computing server are freely available. You can get moreinformation about the installation and usage from allegroviva.com/allegromcode. Cytoscape isan open source platform for complex-network analysis and visualization and freely availablefrom www.cytoscape.org.Jun Sung Yoon : jyoon@allegroviva.comWon-Hyong Chung : whchung@kribb.re.krInclude Loops Degree Cutoff Node Score Cutoff K-core Max. Depth Haircut FluffDisabled 2 0.2 2 100 Enabled Disabled Algorithm OptionsThe processing time is measured by running MCODE Cytoscape plugin and our AllegroMCODECytoscape plugin with the same options of the algorithm.278 x460 x357 x222 x536 x451 xSpeedupGPU-accelerated Computing Architecture Enable you to exploit the GPUacceleration without any special graphicshardware. Provides the remote procedure call viathe standard XML-RPC protocol. Various clients implemented in Perl,Python, C, C++, Java and PHP caneasily make a request to the server bysending a XML document.XML-RPCprotocolCytoscape Java ApplicationParallel MCODEJava ClassSupporting multi-core CPUGPU ComputingServerGPU-ParallelMCODELibraryAllegroMCODE PluginGPU-ParallelMCODELibraryNVIDIAGraphics CardJavaNativeInterface(JNI)NVIDIA Graphics CardNVIDIA Graphics CardNVIDIAGraphics CardXML-RPCClient ClassXML-RPCServerPlugin Main ClassGPU-ParallelMCODENative ClassPCProtein Interaction Network- Become larger- More important- More sophisticatedNetworkvisualizationProtein ComplexDetectionCytoscape Platform(visualization& analysis)MCODE(plugin)Parallel MCODE Algorithm1. Vertex Weighting 2. Molecular Complex Predictiond: vertex weight percentageWv: vertex weight of vSv: vertex weight of seed of vNv : seed vertex of vSv ← v , for all vwhile there is any changes of Svfor all v neighbors of n do in parallelIf Nv <> Nn thenif Wv < Sn AND Wv> (1-d) Sn thenSv ← SnNv ← Nnelse if Wv = Sn AND Cv > Cn thenNv ← Nnend ifend ifsynthronize all threadsend while3. Post-processingInput graph: G = (V,E)for all v in G do in parallelNv ← find the subgraph which includesthe immediate neighbors of vKv ← Get highest k-core graph from Nvkv ← Get highest k-core number from Nvdv ← Get density of KivWv ← kv × dvend forC: complex subgraphh: haircut flag, f: fluff flagfor all c in C do in parallelif c not 2-core then filterif h is TRUE then 2-core complexif f is TRUE then fluff complexend for→ Long-time waiting to analyze large network interaction→ Users need to upgrade their hardware themselvesParallel Algorithm Using Cytoscape and MCODE plugin GPU Computing Server  AllegroMCODE plugin A Cytoscape plugin to help you use theremote GPU Computing Server. Supports the GPU algorithm acceleration touse your graphics hardware by loading thesame GPU-Parallel MCODE Library. Includes multi-threaded parallel MCODEImplementation to fully exploit all the cores ina CPU.

×