BacteriumSimulatorGrid (BSGrid) - Tool for Simulating the Behavior of Bacillus thuringiensis

We developed BSGrid, an application that simulates the behavior of bacterial populations with stochastic methods and runs on high performance computing infrastructures (HPCIs) such as cluster and/or grid computing.

Transcript

  • 1. Mesoscale Modeling of the Bacillus thuringiensis Sporulation Network Based on Stochastic Kinetics and Its Application for in Silico Scale-down (2009)
    Harold Castro, Andrés González, Mario Villamizar, Nicolás Cuervo, Gabriel Lozano, Silvia Restrepo: Departments of Chemical Engineering, Biological Sciences, and Systems and Computing Engineering, Universidad de los Andes, Bogotá, Colombia
    Sergio Orduz: School of Biosciences, Universidad Nacional de Colombia, Medellín, Colombia
  • 2. Introduction to Bacillus thuringiensis
    Bacillus thuringiensis is a Gram-positive bacterium widely known for its capacity to synthesize δ-endotoxins (parasporal crystal proteins) during the sporulation process. These proteins are used as biopesticides.
    The δ-endotoxins are used in some products, and no toxic effects of B. thuringiensis on humans have been detected in its years of use.
  • 3. Motivation
    These biopesticides are used in countries that require organic agriculture.
    For instance, in Colombia they can be applied to a typical insect-control problem in maize crops: a B. thuringiensis subspecies such as kurstaki can help combat Lepidoptera in this kind of crop.
  • 4. Problem
    This kind of biopesticide represents 90% of the total biopesticide market but only 5% of the total pesticide market.
    Industrial-scale fermentation cannot reach a high concentration of δ-endotoxins, so the production of biopesticides has a high cost.
    The δ-endotoxins are produced during the sporulation process of B. thuringiensis.
    It is necessary to analyze the relationship between the sporulation process and δ-endotoxin production to determine the optimum conditions under which the δ-endotoxins are produced.
    The sporulation process is affected by intrinsic and extrinsic variables, which cannot be modeled using deterministic models.
  • 5. Project objectives
    Develop a mesoscale stochastic model that predicts the sporulation process in B. thuringiensis, so that the relationship between the sporulation process and δ-endotoxin production can be analyzed in order to increase δ-endotoxin production by fermentation at industrial levels.
    Determine the effect of oxygen oscillations on the sporulation process in order to analyze the evolution of protein synthesis at industrial scale (scale-down in silico).
    Validate the stochastic model results against experimental results.
  • 6. Work Areas
    Definition of a mesoscale stochastic model for B. thuringiensis
    BSGrid - An application for executing simulations using stochastic algorithms
    UnaGrid – An Opportunistic High Performance Computing Infrastructure
    Comparisons with experimental data
  • 8. A mesoscale stochastic model for B. thuringiensis
    Five proteins are considered: SigmaH, AbrB, KinA, Spo0A and phosphorylated Spo0A.
    The evolution of these proteins is determined by 27 events classified into four categories (gene transcription, protein transduction, protein degradation, degradation of messenger RNA).
    Messenger RNA expression is regulated using the Hill equation.
    The stochastic simulations use Gillespie's Stochastic Simulation Algorithm (SSA).
    B. thuringiensis shows bimodal behavior: a planktonic population and a spore-forming population (which includes the spore population).
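The Gillespie direct method named on this slide can be sketched in a few lines. The following is a minimal Python illustration, not BSGrid's Java implementation: the two-species system (one mRNA, one protein), the four events and every rate constant are hypothetical placeholders, chosen only to show one event per category and a Hill-regulated transcription propensity.

```python
import math
import random


def hill(x, k, n):
    """Hill activation term x^n / (k^n + x^n)."""
    return x**n / (k**n + x**n)


def gillespie_direct(x0, propensities, updates, t_end, rng):
    """Gillespie's direct-method SSA; returns a list of (time, state) tuples."""
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        a = [f(x) for f in propensities]
        a_total = sum(a)
        if a_total <= 0.0:
            break  # no event can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Pick event j with probability a[j] / a_total.
        r = rng.random() * a_total
        acc = 0.0
        for j, a_j in enumerate(a):
            acc += a_j
            if r < acc:
                break
        x = [xi + d for xi, d in zip(x, updates[j])]
        trajectory.append((t, tuple(x)))
    return trajectory


# Hypothetical toy system: state = [mRNA, protein]; all constants are made up.
propensities = [
    lambda s: 0.05 + 0.5 * hill(s[1], 20.0, 2),  # gene transcription (Hill-regulated)
    lambda s: 0.2 * s[0],                        # protein synthesis from mRNA
    lambda s: 0.01 * s[1],                       # protein degradation
    lambda s: 0.1 * s[0],                        # mRNA degradation
]
updates = [(1, 0), (0, 1), (0, -1), (-1, 0)]

trajectory = gillespie_direct([0, 0], propensities, updates, 500.0, random.Random(1))
```

Each call simulates one individual. Repeating it with different seeds yields the population of independent trajectories that the later slides distribute over a cluster or grid.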
  • 9. Sporulation regulatory network and the Spo0A-P role
    The phosphorylated Spo0A protein plays an important role: when it reaches high concentrations, it activates the whole sporulation process. We therefore considered that, once the protein reaches a threshold value, it is highly probable that the sporulation process begins.
  • 10. Sporulation regulatory network - Bimodal population
    The simulation results seem to predict a bimodal population.
    To find the distribution of the populations we developed a simple Monte Carlo simulation based on a probability function:
    $$f(\mu_1, \sigma_1, \mu_2, \sigma_2, p) = (1 - p)\,N(\mu_1, \sigma_1) + p\,N(\mu_2, \sigma_2)$$
    We used reverse engineering to find the parameters of this distribution, through an algorithm based on sum-of-squares minimization.
    Each time t was analyzed for parameter regression using the Microsoft Excel 2007® Solver tool.
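The two-Gaussian mixture and the sum-of-squares objective described on this slide can be written down directly. This is a hedged sketch: the slide only says the fit was done with the Excel Solver, so the function names, the synthetic "histogram" and the parameter values below are illustrative assumptions, and no particular minimizer is implied.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, mu1, sigma1, mu2, sigma2, p):
    """f = (1 - p) * N(mu1, sigma1) + p * N(mu2, sigma2)."""
    return (1.0 - p) * normal_pdf(x, mu1, sigma1) + p * normal_pdf(x, mu2, sigma2)


def sum_of_squares(params, xs, ys):
    """Objective to minimize: squared error between mixture and histogram densities."""
    return sum((mixture_pdf(x, *params) - y) ** 2 for x, y in zip(xs, ys))


# Synthetic histogram generated from made-up parameters (not data from the study).
true_params = (2.0, 0.5, 6.0, 1.0, 0.3)
xs = [0.1 * i for i in range(101)]
ys = [mixture_pdf(x, *true_params) for x in xs]

sse_true = sum_of_squares(true_params, xs, ys)
sse_wrong = sum_of_squares((2.0, 0.5, 6.0, 1.0, 0.6), xs, ys)
```

Any general-purpose optimizer can then search the five parameters for the smallest `sum_of_squares`, once per saved time point. The fitted `p` is the fraction of cells in the second mode.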
  • 11. Work Areas
    Definition of a mesoscale stochastic model for B. thuringiensis
    BSGrid - An application for executing simulations using stochastic algorithms
    UnaGrid – An Opportunistic High Performance Computing Infrastructure
    Comparisons with experimental data
  • 12. BSGrid – Operation on Personal Computers
    An application for executing simulations using stochastic methods.
    Java J2SE.
    Friendly to the end user.
    1. Bacterium structure definition through GUIs
  • 13. BSGrid – Operation on Personal Computers
    2. Configuration and execution of the simulations through GUIs
  • 14. BSGrid – Operation on Personal Computers
    3. Visualization and analysis of results, so the user can decide to modify the bacterium structure and run the simulations again.
  • 15. BSGrid – Problems for Larger Simulations on PCs
    1 individual ≈ 63 seconds
    150000 individuals ≈ 54 days ≈ 2 months (using 2 CPUs, as reported on slide 41)
    Do simulations with big populations require larger processing capabilities?
  • 16. Solution: BSGrid as a Grid-Enabled application
    [Diagram] User workflow: 1. Bacterium structure definition through GUIs; 2. Configuration and execution of simulations; 3. Visualization and analysis of results.
    [Diagram] Cluster/grid infrastructure: master, slave 1 … slave N; independent jobs; XML document; BSGrid jobs are submitted to the cluster/grid infrastructure as a batch process.
  • 17. Solution: BSGrid as a Grid-Enabled application (2)
    [Diagram] Cluster/grid infrastructure: master, slave 1 … slave N, each slave running BSGrid jobs; independent jobs; XML document; BSGrid jobs are submitted to the cluster/grid infrastructure as a batch process.
    [Diagram] Users run analyses 1 … N against a relational database server.
    Displaying the global statistics takes a long time.
  • 18. Solution: BSGrid as a Grid-Enabled application (3)
    [Diagram] Cluster/grid infrastructure: master, slave 1 … slave N, each slave running BSGrid jobs; independent jobs; XML document; BSGrid jobs are submitted to the cluster/grid infrastructure as a batch process.
    [Diagram] Users run analyses 1 … N against a relational database server holding relational tables and materialized views.
    The time is reduced from minutes to seconds.
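Because every individual is simulated independently, a population can be cut into jobs that never communicate. The sketch below illustrates that partitioning only: the dictionary fields and the seed scheme are hypothetical and do not reproduce the format of BSGrid's XML job document.

```python
def split_population(total_individuals, n_jobs, base_seed=0):
    """Split a population into n_jobs independent job descriptions of near-equal size."""
    base, extra = divmod(total_individuals, n_jobs)
    jobs = []
    for job_id in range(n_jobs):
        # The first `extra` jobs take one additional individual each.
        size = base + (1 if job_id < extra else 0)
        jobs.append({
            "job_id": job_id,
            "individuals": size,
            "seed": base_seed + job_id,  # distinct seed per job (hypothetical scheme)
        })
    return jobs


# Population size and CPU count taken from slide 41.
jobs = split_population(150000, 70)
```

Each job description could then be handed to the scheduler as one batch job, with results written to the shared database for later analysis.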
  • 19. Friendly Graphical User Interfaces of BSGrid
  • 20. Tools of the BSGrid Application
    [Diagram] PC execution: GUI definition of the bacterium structure model; execution of simulations on PCs (stochastic algorithms); output data in RAM; GUI results.
    [Diagram] Grid/cluster execution: XML bacterium structure model as the input file for BSGrid; execution of simulations on the grid/cluster (stochastic algorithms); output data in a database server; GUI results.
  • 21. Work Areas
    Definition of a mesoscale stochastic model for B. thuringiensis
    BSGrid - An application for executing simulations using stochastic algorithms
    UnaGrid – An Opportunistic High Performance Computing Infrastructure
    Comparisons with experimental data
  • 22. A High Performance Computing Infrastructure (HPCI)
    This type of simulation requires large processing capabilities.
    Cluster and grid infrastructures usually rely on dedicated computational resources, so implementing them requires large financial investments.
  • 23. A High Performance Computing Infrastructure (2)
    Dedicated infrastructures are not a viable option for organizations or countries with limited financial resources. However, these organizations have many computer labs that are not fully utilized by employees or university students.
  • 24. Solution: Opportunistic virtual clusters
    [Diagram] A Linux processing virtual machine (X cores) on a physical machine of a computer room, shown in two states: (a) when an end user is using the physical machine; (b) when no end user is using the physical machine.
    A virtual cluster is a set of interconnected commodity desktops that execute virtual machines (VMs) in the background and at low priority through virtualization technologies. These VMs take advantage of the idle processing capabilities available in computer labs on a university campus.
  • 25. Solution: Opportunistic virtual clusters (2)
    [Diagram] Computer lab with one VM per computer.
    A virtual machine runs on each computer of a lab and plays the role of a cluster slave. Together, the running virtual machines make up a virtual processing cluster. Each virtual cluster also needs a dedicated node, which plays the role of the cluster master.
  • 26. Solution: Opportunistic virtual clusters (2)
    [Diagram] Computer lab with one VM per computer. The computers in the computer lab are the virtual cluster slaves.
  • 27. Solution: Opportunistic virtual clusters (2)
    [Diagram] Computer lab with one VM per computer. The computers in the computer lab are the virtual cluster slaves. The master is a dedicated computer outside the computer lab.
  • 28. Opportunistic virtual clusters - Features
    [Diagram] Three virtual clusters, one each for research groups A, B and C. Each has a master, several slaves and its own cluster/grid users.
    A virtual infrastructure composed of virtual clusters.
    The virtual clusters take advantage of unused physical resources.
    A general-purpose infrastructure, not only for biological simulations.
  • 29. Opportunistic virtual clusters – Features (2)
    [Diagram] Grid community: the virtual clusters of research groups A, B and C (each with a master, slaves and cluster/grid users), a Certificate Authority (CA) and grid middleware.
    Each research group can define its own virtual clusters with custom application environments (middleware, applications, configurations, etc.).
    A grid solution (several virtual clusters) can be deployed to provide the processing capabilities required by some applications.
  • 30. Opportunistic Grid Virtual Infrastructure Proposed
    Our strategy addresses the problems associated with the lack or under-utilization of preexisting computer laboratories and promotes new opportunities:
    Collaborative work among research groups.
    The development of research projects that require large processing capabilities at low cost.
    Limitations
    Best-effort approach. No quality of service (QoS) is guaranteed.
    The capabilities of a virtual cluster depend on its configuration.
    Bag-of-tasks applications.
  • 31. Opportunistic Grid Virtual Infrastructure Deployed
    Three computer labs, each with 35 computers and Windows XP as the base operating system.
    Core 2 Duo processor (1.86 GHz) and 4 GB of RAM.
    Three virtual clusters.
    Condor scheduler.
    VMware virtualization software.
    Globus middleware.
    [Diagram] Cluster/grid users submit jobs to three master virtual machines (Master Cluster Turing, Master Cluster Wuaira1, Master Cluster Wuaira2) hosted on a VMware ESX Server with Globus middleware. The masters manage the virtual clusters in the Turing and Wuaira computer labs.
    How to deploy the virtual machines? If the virtual machines are always running, they will always consume energy, even when no cluster/grid users are using the virtual infrastructure. A green solution is necessary.
  • 32. Opportunistic Grid Virtual Infrastructure Deployed
    Three computer labs, each with 35 computers and Windows XP as the base operating system.
    Core 2 Duo processor (1.86 GHz) and 4 GB of RAM.
    Three virtual clusters.
    Condor scheduler.
    VMware virtualization software.
    Globus middleware.
    [Diagram] Cluster/grid users submit jobs to three master virtual machines (Master Cluster Turing, Master Cluster Wuaira1, Master Cluster Wuaira2) hosted on a VMware ESX Server with Globus middleware. The masters manage the virtual clusters in the Turing and Wuaira computer labs.
    [Diagram] Data center: domain controllers on Windows 2008 Server and Windows 2003 Server; ADMONSIS domain; CAPRICA domain; GUMA web server; administrators; cluster/grid users.
  • 33. Deployment on Demand of the Virtual Infrastructure
    Virtual clusters are deployed on demand through GUMA.
    This application makes it possible to run and manage virtual clusters on demand, and it provides multiple services for managing the grid from thin clients. It also supports monitoring of the physical and virtual machines.
  • 34. Work Areas
    Definition of a mesoscale stochastic model for B. thuringiensis
    BSGrid - An application for executing simulations using stochastic algorithms
    UnaGrid – An Opportunistic High Performance Computing Infrastructure
    Comparisons with experimental data
  • 35. Experimental tests
    Three fermentations were carried out using B. thuringiensis subsp. kurstaki HD1-1999.
    One single colony was inoculated into a 50 mL culture at 30 °C for 72 h.
    Oxygen was controlled by adding a mix of air and pure oxygen. pH and temperature were maintained at 6.5 and 30 °C respectively.
    The planktonic, spore-forming and spore populations were evaluated using a phase-contrast microscope.
  • 36. Experimental results
    Our results seem to indicate that the sporulation process is triggered around the 20th hour, possibly influenced by intrinsic and extrinsic noise. We believe that, due to poor oxygen transfer in Bogotá (2600 m AMSL), the spore content did not exceed 60%, contrary to several reports.
  • 37. In silico results - Bimodal population
    The model was run for 150000 cells. The analysis was carried out for 2900 cells up to 80000 seconds. To save computational resources, results were saved every 500 s.
    To confirm the presence of two subpopulations in the proposed mesoscale model, we fitted our histograms to continuous Gaussian distribution curves. The bimodal population describes the presence of planktonic cells (low Spo0A-P) and spores (high Spo0A-P) over time.
  • 38. In silico results
    Interestingly, the high Spo0A-P population increases with time, clearly indicating an increase in spores until steady state is reached (right figure). These results show dynamics similar to those of the spore concentration in the fermentor (left figure).
    Our in silico analysis predicts that the sporulation process takes around 8 h to complete, while the experimental results show that the process takes around 20 h. A deeper study is required.
  • 39. System response to oxygen oscillations
    Keep in mind that oxygen tension partially controls KinA activity, therefore affecting the Spo0A phosphorylation rate, described by:
    $$\mathrm{Spo0A} \xrightarrow{\;c\;} \mathrm{Spo0A\text{-}P}, \qquad c = K_{Msp} \cdot \frac{\mathrm{KinA}^{n}}{\mathrm{KinA}^{n} + K_{kasp}^{n}}$$
    The stochastic kinetic constant c was modified according to:
    $$c \left[ A \sin\!\left( \frac{2 \pi t}{T} \right) + d \right]$$
    A: wave amplitude; T: oscillation period; d: mean value of the sinusoidal function.
    Parameters:
    Simulation   A       T     d
    1            0.5     0.5   0.5
    2            0.5     1.0   0.5
    3            0.625   1.0   0.625
    4            0.25    1.0   1.0
    5            0.5     1.0   1.0
    Five hundred simulations were performed for each of these conditions.
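The oxygen modulation can be illustrated numerically. This sketch assumes the reading in which the sinusoidal term A·sin(2πt/T) + d multiplies the Hill-type constant c; the slide's formula is damaged in the transcript, so that reading is an assumption. The function names and the KinA, K and n values are also hypothetical; only the (A, T, d) triples come from the slide's parameter table.

```python
import math


def hill_rate(kin_a, km_sp, k_kasp, n):
    """Hill-type phosphorylation constant: km_sp * KinA^n / (KinA^n + k_kasp^n)."""
    return km_sp * kin_a**n / (kin_a**n + k_kasp**n)


def oxygen_factor(t, amplitude, period, mean):
    """Sinusoidal modulation A * sin(2*pi*t / T) + d."""
    return amplitude * math.sin(2.0 * math.pi * t / period) + mean


def modulated_rate(t, c, amplitude, period, mean):
    """Assumed form: base constant c scaled by the oscillating oxygen factor."""
    return c * oxygen_factor(t, amplitude, period, mean)


# (A, T, d) for the five simulated conditions listed on the slide.
CONDITIONS = {
    1: (0.5, 0.5, 0.5),
    2: (0.5, 1.0, 0.5),
    3: (0.625, 1.0, 0.625),
    4: (0.25, 1.0, 1.0),
    5: (0.5, 1.0, 1.0),
}

# Hypothetical base constant, for illustration only.
c_base = hill_rate(kin_a=50.0, km_sp=1.0, k_kasp=50.0, n=2)
peak_factors = {k: oxygen_factor(T / 4.0, A, T, d) for k, (A, T, d) in CONDITIONS.items()}
```

In a stochastic simulation, `modulated_rate` would be re-evaluated at the current simulation time whenever the propensity of the phosphorylation event is computed.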
  • 40. Spo0A-P response to oscillations in the oxygen tension
    The simulations with oscillations in the oxygen tension predict a reduction in the size of the high Spo0A-P population, demonstrating the effect of industrial-scale oscillations on the sporulation process.
  • 41. Results of processing time and data generated
    Processing time required on a personal computer:
    Model name         Number of bacteria   Time per bacterium (s)   Number of CPUs   Total time (days)
    B. thuringiensis   150000               63                       2                54.69
    Processing time required on the opportunistic virtual cluster infrastructure:
    Model name         Number of bacteria   Time per bacterium (s)   Number of CPUs   Total time (days)
    B. thuringiensis   150000               111                      70               2.75
    These results confirm the benefits of our strategy, and performance tests confirm the transparency of our model.
    We found that the simulated model generated 10 GB of data.
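The totals in both tables follow from simple arithmetic, assuming the individuals are spread evenly over the CPUs with no scheduling overhead. The short check below uses only the figures on the slide; the function name is illustrative.

```python
SECONDS_PER_DAY = 86400.0


def total_days(individuals, seconds_per_individual, cpus):
    """Wall-clock days when independent individuals are spread evenly over the CPUs."""
    return individuals * seconds_per_individual / cpus / SECONDS_PER_DAY


pc_days = total_days(150000, 63, 2)        # personal computer, 2 CPUs
cluster_days = total_days(150000, 111, 70)  # opportunistic virtual cluster, 70 CPUs
speedup = pc_days / cluster_days
```

Each bacterium runs slower on the virtual cluster (111 s versus 63 s), yet the run finishes roughly 20 times sooner because 70 CPUs work in parallel.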
  • 42. Conclusions
    Stochastic model
    With the model developed we demonstrate the presence of multistability in B. thuringiensis, and we can also demonstrate that cycling the oxygen decreases the population of spore-forming cells.
    BSGrid application
    BSGrid is a tool for simulating biological systems using stochastic methods and algorithms on PCs and HPCIs.
    Virtual infrastructure and parallel computing
    Parallel computing benefits this type of simulation through the generation of a large number of independent jobs.
    The proposed infrastructure allows this and other applications to be executed using an opportunistic strategy (cost close to zero).
  • 43. Future work
    Stochastic model
    The proposed model predicts an elapsed time of 8 h for the sporulation process. Nevertheless, our experimental results indicate a longer process, so more studies are required to understand the triggering process.
    Analyses with new model parameters are required to study the relationship between the sporulation process and δ-endotoxin production.
    Experimental results
    In the fermentation process it was not possible to differentiate between spore populations and spore-forming populations, so a more detailed analysis, using reporter genes related to sporulation, should be used to validate the mesoscale model.
  • 44. Future work
    BSGrid application
    Adapt and publish BSGrid as an open source application.
    Given its modular design, BSGrid is ready to be extended to handle new stochastic methods and algorithms.
    Infrastructure
    Researchers now want to work with larger populations and more complex structures, and to get more accurate answers.
  • 45. Thanks for your attention! Questions?