An Introduction to Distributed Data Streaming - Paris Carbone
A lecture on distributed data streaming, introducing the basic abstractions such as windowing, synopses (state), partitioning, and parallelism, and applying them to an example pipeline for detecting fires. It also offers a brief introduction to and motivation for reliability guarantees, and the need for repeatable sources and application-level fault tolerance and consistency.
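The abstractions the lecture names can be made concrete in a few lines. Below is a minimal, hypothetical sketch of a fire-detection pipeline: events are partitioned by sensor id, each partition keeps a small synopsis (a tumbling window of recent temperatures), and a completed window triggers an alert check. The sensor ids, the window size of 3, and the 60.0-degree threshold are illustrative assumptions, not taken from the lecture.

```python
from collections import defaultdict, deque

WINDOW = 3          # tumbling window of 3 readings per sensor (the "synopsis")
THRESHOLD = 60.0    # average temperature above which we raise an alert

state = defaultdict(deque)   # per-partition state, keyed by sensor id

def on_event(sensor_id, temperature):
    """Route the reading to its partition, fill the window, emit when it closes."""
    window = state[sensor_id]
    window.append(temperature)
    if len(window) == WINDOW:            # window complete: evaluate and clear
        avg = sum(window) / WINDOW
        window.clear()
        if avg > THRESHOLD:
            return (sensor_id, avg, "FIRE")
        return (sensor_id, avg, "ok")
    return None                          # window still filling

# One sensor's stream: the first three readings close a window and fire an alert.
alerts = [r for r in (on_event("s1", t) for t in [20, 90, 95, 99]) if r]
```

A real engine such as Flink distributes this per-key state across parallel operator instances; the dictionary keyed by sensor id stands in for that partitioning.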
SREcon 2016 Performance Checklists for SREs - Brendan Gregg
Talk from SREcon2016 by Brendan Gregg. Video: https://www.usenix.org/conference/srecon16/program/presentation/gregg . "There's limited time for performance analysis in the emergency room. When there is a performance-related site outage, the SRE team must analyze and solve complex performance issues as quickly as possible, and under pressure. Many performance tools and techniques are designed for a different environment: an engineer analyzing their system over the course of hours or days, and given time to try dozens of tools: profilers, tracers, monitoring tools, benchmarks, as well as different tunings and configurations. But when Netflix is down, minutes matter, and there's little time for such traditional systems analysis. As with aviation emergencies, short checklists and quick procedures can be applied by the on-call SRE staff to help solve performance issues as quickly as possible.
In this talk, I'll cover a checklist for Linux performance analysis in 60 seconds, as well as other methodology-derived checklists and procedures for cloud computing, with examples of performance issues for context. Whether you are solving crises in the SRE war room, or just have limited time for performance engineering, these checklists and approaches should help you find some quick performance wins. Safe flying."
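For reference, the 60-second checklist the talk is built around was also published by Gregg on the Netflix Tech Blog ("Linux Performance Analysis in 60,000 Milliseconds"). Listing it as data keeps the drill order explicit; the comments summarize what each command is checked for.

```python
# The ten commands of the published 60-second Linux checklist, in order.
checklist = [
    "uptime",                 # load averages: is load rising or falling?
    "dmesg | tail",           # recent kernel errors, OOM kills
    "vmstat 1",               # run queue, memory, swap, system-wide CPU
    "mpstat -P ALL 1",        # per-CPU balance: one hot CPU?
    "pidstat 1",              # per-process CPU usage over time
    "iostat -xz 1",           # disk IOPS, throughput, await, utilization
    "free -m",                # free memory, including page cache usage
    "sar -n DEV 1",           # network interface throughput
    "sar -n TCP,ETCP 1",      # TCP connection and retransmit rates
    "top",                    # final overview double-check
]
```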
In this video from the 2014 HPC User Forum in Seattle, Manuel Vigil from Los Alamos National Laboratory presents: Update on Trinity System Procurement and Plans.
Learn more: http://insidehpc.com/video-gallery-hpc-user-forum-2014-seattle/
Our team is responsible for storage at Xiaomi, providing storage services for dozens of businesses such as personal cloud storage for smartphones and user profile data. We will share some practices and improvements of HBase at Xiaomi:
1: We upgraded most of our clusters from 0.94 to 0.98 in the last year and will share some experience about upgrading.
2: We encountered some problems and made some improvements on replication.
3: We fixed, or are still fixing, some confusing client-side behavior.
4: We introduced some improvements on scan to make it easier to use and to reduce the number of RPC requests.
5: We implemented an asynchronous HBase client, which is an important feature for HBase 2.0.
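Point 4 above concerns reducing scan RPCs. The abstract gives no details, but the general lever is client-side scanner caching: if each scanner round trip returns `caching` rows, a scan of R rows costs ceil(R / caching) RPCs. A back-of-envelope sketch (the row counts are illustrative, not Xiaomi's numbers):

```python
import math

def scan_rpc_count(total_rows, caching):
    """Number of scanner round trips when each RPC returns `caching` rows."""
    return math.ceil(total_rows / caching)

# For a 1M-row scan, raising caching from 10 to 1000 rows cuts RPCs 100x,
# at the cost of larger responses and more client-side memory per trip.
naive = scan_rpc_count(1_000_000, 10)      # 100,000 round trips
tuned = scan_rpc_count(1_000_000, 1000)    # 1,000 round trips
```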
hbaseconasia2017: HBase Practice At XiaoMi - HBaseCon
Zheng Hu
We'll share some HBase experience at XiaoMi:
1. How we tuned G1GC for our HBase clusters.
2. Development and performance of Async HBase Client.
hbaseconasia2017 hbasecon hbase xiaomi https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#
RAMSES: Robust Analytic Models for Science at Extreme Scales - Ian Foster
RAMSES: A new project in data-driven analytical modeling of distributed systems
RAMSES is a new DOE-funded project on the end-to-end analytical performance modeling of science workflows in extreme-scale science environments. It aims to link multiple threads of inquiry that have not, until now, been adequately connected: namely, first-principles performance modeling within individual sub-disciplines (e.g., networks, storage systems, applications), and data-driven methods for evaluating, calibrating, and synthesizing models of complex phenomena. What makes this fusion necessary is the drive to explain, predict, and optimize not just individual system components but complex end-to-end workflows. In this talk, I will introduce the goals of the project and some aspects of our technical approach.
Survey of Program Transformation Technologies - Chunhua Liao
The first workshop for conceptualization of a Software Institute for Abstractions and Methodologies for HPC Simulations Codes on Future Architectures. Dec. 10th, 2012 Chicago, IL, USA
HBaseCon2017: Improving HBase availability in a multi-tenant environment - HBaseCon
Infrastructure failures are a given in the cloud, but in a multi-tenant environment separating those failures from usage can be a challenge. I'll be presenting data gathered from over a hundred region server failures at HubSpot along with what we've done to improve our MTTR and what we're contributing back to the community. Covered topics will include separating usage-related failures from infrastructure and hardware failures, as well as steps we've taken to improve MTTR in both scenarios.
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra... - SQUADEX
The right setup of local development and cloud infrastructure is a requirement for reproducible and reliable machine learning products. These products also require a well-polished process behind the management of the data science life cycle, from research to production. ML calls for a more advanced type of software development process and a more sophisticated ecosystem of services than a classic IDE provides.
This SlideShare provides ML engineers with insightful tips on how to use specific AWS and open-source tools, as well as DevOps best practices, to complete routine tasks like data ingestion, data preprocessing, feature engineering, labeling, training, parameter tuning, testing, deployment, monitoring, and retraining.
On top of that, you will learn what can and cannot be automated when it comes to using both AWS products and tools like Kubernetes, Kubeflow, Jupyter notebooks, TensorFlow, and TPOT.
The keynote was originally delivered to Stanford academia (University IT, students, and staff) on the campus of Stanford University.
Speakers:
-- Stepan Pushkarev, CTO at Squadex (https://www.linkedin.com/in/stepanpushkarev/)
-- Rinat Gareev, Machine Learning Engineer at Squadex (https://www.linkedin.com/in/gareev/)
-- Iskandar Sitdikov, Machine Learning Engineer at Squadex (https://www.linkedin.com/in/icekhan/)
Big Data Streams Architectures. Why? What? How? - Anton Nazaruk
With the current zoo of technologies and the many different ways they can interact, it is a big challenge to architect a system (or adapt an existing one) that conforms to low-latency big data analysis requirements. Apache Kafka, and the Kappa Architecture in particular, are attracting more and more attention relative to the classic Hadoop-centric technology stack. The new Consumer API has given a significant boost in this direction, and microservices-based stream processing together with the new Kafka Streams is proving to be a real synergy in the big data world.
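The Kappa idea mentioned above is simple enough to show in a toy sketch: one append-only log is the source of truth, and any derived view is (re)built by replaying that log from offset zero, so "reprocessing" is just running a new version of the job over the same log. The in-memory list and dict below are stand-ins for a Kafka topic and a serving store.

```python
log = []  # append-only event log (stand-in for a Kafka topic)

def produce(key):
    log.append(key)

def replay(from_offset=0):
    """Rebuild the derived view (per-key counts) by re-reading the log."""
    view = {}
    for key in log[from_offset:]:
        view[key] = view.get(key, 0) + 1
    return view

for k in ["click", "click", "view"]:
    produce(k)

v1 = replay()          # view built from the log so far
produce("click")       # new event arrives
v2 = replay()          # a "new job version" recomputes from offset 0
```

The key property: no separate batch and speed layers to keep in sync, as in a Lambda architecture; there is one code path, and fixing a bug means replaying.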
Mirabilis_Design AMD Versal System-Level IP Library - Deepak Shankar
Mirabilis Design provides the VisualSim Versal Library, which enables system architects and algorithm designers to quickly map signal-processing algorithms onto the Versal FPGA and define the fabric based on the required performance. The Versal IP library supports all of the device's heterogeneous resources.
End-to-end Data Governance with Apache Avro and Atlas - DataWorks Summit
Aeolus is Comcast’s new internal Big Data system for providing access to an integrated view of a wide variety of high-quality, near-real-time and batch data. Such integration can enable data scientists to uncover otherwise hidden trends, anomalies, and powerful predictors of business successes and failures. But integrating data across silos in a large enterprise is fraught with peril. There typically are few standards on naming conventions and data representation, and spotty documentation at best. The old rule of thumb often applies: 70% of the analysts’ time goes into data wrangling, while only 30% goes toward the actual analyses and simulations. The goal of the Athene Data Governance Platform within Aeolus is to invert this ratio. This talk will explain how Comcast is using Apache Avro and Atlas for end-to-end data governance, the challenges faced, and methods used to address these challenges.
Avro provides a lingua franca for data representation, data integration, and schema evolution. All data published for community consumption must have an associated avro schema in Atlas. Every step in its journey through Aeolus, in flight or at rest, is captured in Atlas. Atlas’ extensibility has allowed us to add or update various entity types (e.g., avro schemas, kafka topics, object store pseudo-directories) and lineage types (e.g., storing streaming data in object storage; embellishing and re-publishing streaming data; performing aggregations and other transformations on data at rest; and evolution of schemas with compatibility flags). Transformation services notify Atlas of lineage links via custom asynchronous kafka messaging.
Atlas provides self-service data discovery and lineage browsing and querying, via full-text search, DSL query language, or gremlin graph query language. Example queries: “Where is data from kafka topic X stored?” “Display the journey of data currently stored in pseudo-directory X since it entered the Aeolus system”. “Show me all earlier versions of schema S, and whether they are forward/backward compatible with each other.”
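The forward/backward compatibility flags described above follow from Avro's schema-resolution rules; the core of the backward-compatibility check is that a newer (reader) schema may only add fields that carry defaults. A deliberately simplified sketch of that rule, using plain dicts instead of full Avro schemas and assuming field types match:

```python
def backward_compatible(reader_fields, writer_fields):
    """Can a reader with these fields decode data written with writer_fields?

    Simplified Avro rule: every reader field must either exist in the
    writer schema or declare a default to fall back on.
    """
    writer_names = {f["name"] for f in writer_fields}
    return all(
        f["name"] in writer_names or "default" in f
        for f in reader_fields
    )

v1 = [{"name": "id"}, {"name": "ts"}]
v2 = [{"name": "id"}, {"name": "ts"}, {"name": "region", "default": "us"}]
v3 = [{"name": "id"}, {"name": "ts"}, {"name": "region"}]  # no default

ok = backward_compatible(v2, v1)       # added field has a default: compatible
broken = backward_compatible(v3, v1)   # added field lacks a default: breaks
```

A governance platform like the one described can run such a check when a new schema version is registered, and record the result as a compatibility flag on the lineage edge in Atlas.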
(Randall Hauch, Confluent) Kafka Summit SF 2018
The Kafka Connect framework makes it easy to move data into and out of Kafka, and you want to write a connector. Where do you start, and what are the most important things to know? This is an advanced talk that will cover important aspects of how the Connect framework works and best practices of designing, developing, testing and packaging connectors so that you and your users will be successful. We’ll review how the Connect framework is evolving, and how you can help develop and improve it.
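One of the important aspects such a talk covers is offset management: each batch a source task returns is tagged with a source offset, and after a restart the framework hands back the last committed offset so polling resumes rather than repeats. The sketch below mimics that contract in Python; the names are illustrative, not the actual Java `SourceTask` API.

```python
SOURCE = [f"line-{i}" for i in range(5)]   # pretend upstream system

class SourceTaskSketch:
    """Toy stand-in for a Connect source task with framework-managed offsets."""

    def __init__(self, committed_offset=0):
        self.offset = committed_offset     # position restored by the framework

    def poll(self, max_records=2):
        """Return (offset, record) pairs and advance the source position."""
        batch = [(self.offset + i, rec)
                 for i, rec in enumerate(SOURCE[self.offset:self.offset + max_records])]
        self.offset += len(batch)
        return batch

task = SourceTaskSketch()
first = task.poll()                                      # records 0 and 1
# Simulate a crash/restart: the framework restores the committed offset.
restarted = SourceTaskSketch(committed_offset=task.offset)
second = restarted.poll()                                # resumes at record 2
```

The design point this illustrates: a connector that encodes its position in the offset it emits gets restart semantics from the framework for free, instead of re-reading the upstream system from the beginning.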
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop) - Apache Apex
This presentation will introduce usage of Apache Apex for Time Series & Data Ingestion Service by General Electric Internet of things Predix platform. Apache Apex is a native Hadoop data in motion platform that is being used by customers for both streaming as well as batch processing. Common use cases include ingestion into Hadoop, streaming analytics, ETL, database off-loads, alerts and monitoring, machine model scoring, etc.
Abstract: Predix is a General Electric platform for the Internet of Things. It helps users develop applications that connect industrial machines with people through data and analytics for better business outcomes. Predix offers a catalog of services that provide core capabilities required by industrial internet applications. We will deep dive into the Predix Time Series and Data Ingestion services, leveraging the fast, scalable, highly performant, and fault-tolerant capabilities of Apache Apex.
Speakers:
- Venkatesh Sivasubramanian, Sr Staff Software Engineer, GE Predix & Committer of Apache Apex
- Pramod Immaneni, PPMC member of Apache Apex, and DataTorrent Architect
The increasing demand for computing power in fields such as biology, finance, and machine learning is pushing the adoption of reconfigurable hardware in order to keep up with the required performance level at a sustainable power consumption. Within this context, FPGA devices represent an interesting solution, as they combine the benefits of power efficiency, performance, and flexibility. Nevertheless, the steep learning curve and the experience needed to develop efficient FPGA-based systems represent one of the main limiting factors for broad utilization of such devices.
In this talk, we present CAOS, a framework which helps the application designer in identifying acceleration opportunities and guides through the implementation of the final FPGA-based system. The CAOS platform targets the full stack of the application optimization process, starting from the identification of the kernel functions to accelerate, to the optimization of such kernels and to the generation of the runtime management and the configuration files needed to program the FPGA.
Xin Wang (Apache Storm committer/PMC member) covered the relationship between streaming and messaging platforms, and the challenges of and tips for using Storm.
The Design, Implementation and Open Source Way of Apache Pegasus - acelyc1112009
A presentation in Apache Pegasus meetup in 2021 from Yuchen He.
Apache Pegasus is a horizontally scalable, strongly consistent and high-performance key-value store.
Learn more about Pegasus: https://pegasus.apache.org, https://github.com/apache/incubator-pegasus
XSEDE15_Phasta - Gateway
1. Enabling HPC Simulation Workflows for Complex Industrial Flow
C.W. Smith, S. Tran, O. Sahni, and M.S. Shephard, Rensselaer Polytechnic Institute
Raminder Singh, Indiana University (ramifnu@iu.edu)
2. Parallel Data & Services for Complex Flow Simulations
[Workflow diagram] Core parallel data and services: domain topology, mesh topology/shape, dynamic load balancing, simulation fields, partition control, and physics and model parameters with an attributed input domain definition. Tools: PHASTA (NS, FE, level set) for the solve; Parasolid or GeomSim for geometric interrogation and non-manifold model construction; MeshSim and MeshSim Adapt for meshing operations; Paraview for visualization. Data exchanged includes meshes with fields, calculated fields, a Hessian-based error indicator driving the mesh size field, geometry updates, and solution transfer with constraints.
3. Project challenges
High barrier to run HPC workflows:
– Requires knowledge of the file system, scheduler, scripting, runtime environment, compilers, … for each HPC system
Other challenges:
– Must have a very high degree of automation: a human in the loop kills scalability and performance
– Need easy access to parallel computers
4. A science gateway for PHASTA lowers the barrier
• User specifies the problem definition, simulation parameters, and required compute resources through an experiment creation web page
• Workflow steps are executed on the HPC system
• The user is emailed, and output is prepared for download, with the option to delete or archive
• Scales to multiple users and systems
5. • Used the PHP Gateway framework with Airavata to develop the gateway and enable the PHASTA application
• Set up a community account to support the community
• Defined the resources to run the application: TACC Stampede and CCI IBM Blue Gene
• Defined the PHASTA application
PHASTA Solution
6. What is PGA?
• PGA is the sample gateway implemented to demonstrate Airavata middleware features.
• You can download and use it as is, or modify it according to your requirements.
• There is an Ansible script available, and a Docker image being worked on by a GSoC student.
• PGA is developed using PHP.
• Visit PGA at:
– https://testdrive.airavata.org/
15. Gateway Features for the Default User
• In the gateway, the default user can:
– Create and launch experiments
– Monitor experiments
– Create projects (experiment grouping)
– Clone, cancel, and edit experiments
– Report issues & provide feedback
17. Future work
• Address user requests
• Allow staging data from the user desktop to the resource and vice versa
• Tail remote application logs
• User key generation and CCI user accounts
19. Workflow Diagram for SEQC Transcriptome Assembly and Evaluation
Pre-processing, Input: Sequencing Reads FASTQ Files
• Adapter trimming (cutadapt software)
• Poly-A/T trimming, and removing mtRNA, rRNA (custom script)
• Error correction for RNA-Seq reads (SEECER)
Transcriptome Assemblies, Input: Trimmed Sequencing Reads FASTQ Files
• Assembling samples A and B for six centers, using different replicate combinations (Trinity software)
• ~60 transcriptome assemblies
Quality Control (QC), Input: Assembled Contigs Files (FASTA format)
• DETONATE (DETONATE software, using the human reference genome)
• CEGMA (CEGMA software)
• Assembly statistical outputs (provided by Trinity for each assembly)
• Mapping reads back to the contigs (TopHat software)
Passed QC? (custom script needed to check the above QC criteria, e.g.: If (CEGMA_CEGs > 235) then CEGMA_flag = Passed)
• No: discard the assembly
• Yes: continue with the assembly
Genome Coverage – SNP Detection for FASTQ Trimmed Input Reads
• Mapping input reads to the reference genome (TopHat software)
• SNP detection (GATK software): output called SNP_Reads
• Genome coverage, using mapped reads (featureCounts – R Bioconductor package)
Genome Coverage – SNP Detection for FASTA Assembled Contigs
• Mapping assembled contigs to the reference genome (GMAP software)
• SNP detection (GATK software): output called SNP_Contigs
• Genome coverage, using mapped contigs (featureCounts – R Bioconductor package)
SNP Comparison
• Comparing detected SNP_Contigs with dbSNP (custom script and SnpSift)
• Comparing detected SNP_Reads with dbSNP (custom script and SnpSift)
Statistical comparison of all the ~60 assemblies (statistical testing for a population of assemblies)
• Novel score: Efficiently Covered Bases for All Genes (EC-BAG) score (custom script)
• Statistical testing, e.g. ANOVA
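The "Passed QC?" gate in the workflow above is given with one concrete criterion (If (CEGMA_CEGs > 235) then CEGMA_flag = Passed). A minimal sketch of that gate; the assembly names and metric values are illustrative, and real use would combine the CEGMA flag with the other QC outputs (DETONATE, read-mapping rates):

```python
def passed_qc(metrics, cegma_min=235):
    """QC gate from the slide: keep the assembly only if CEGMA_CEGs > 235."""
    return metrics.get("CEGMA_CEGs", 0) > cegma_min

assemblies = [
    {"name": "A-rep1", "CEGMA_CEGs": 240},   # clears the cutoff
    {"name": "B-rep3", "CEGMA_CEGs": 230},   # fails: discarded
]
kept = [a["name"] for a in assemblies if passed_qc(a)]
discarded = [a["name"] for a in assemblies if not passed_qc(a)]
```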