Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
David Kelly SWIFT
1. Swift: A Scientist’s Gateway to
Campus Clusters, Grids and Supercomputers
Swift project: www.swiftlang.org
Presenter contact: davidkelly@uchicago.edu
swift-support@ci.uchicago.edu
David Kelly
Computation Institute, University of Chicago
and Argonne National Laboratory
2. Parallel scripting language for clusters, clouds & grids
– For writing loosely-coupled scripts of application programs
and utilities linked by exchanging files
– Can call scripts in shell, python, R, Octave, MATLAB, …
Swift does 3 important things for you:
– Makes parallelism transparent – with functional dataflow
– Makes basic failure recovery transparent
– Makes computing location transparent – can run your script
on multiple distributed sites and diverse computing
resources (from desktop to petascale)
2
www.swiftlang.org
3. MODIS script excerpt
a
3
www.swiftlang.org
$ cat modis.swift
type file;
app (file output) getLandUse (file input)
{
getLandUse @input;
}
file getLandUse_inputs[] <filesys_mapper; pattern=”images/*.rgb">;
foreach i in getLandUse_inputs {
file output <single_file_mapper; file=@strcat(i, “.output”)>;
output = getLandUse(i);
}
4. Same script runs on broad range of resources; separate throttles can be set for each site.
User can run on multiple resources:
4
www.swiftlang.org
Cluster interactive node
swift
Output
Logs
Cluster file server
Input
Swift
script
Applications
config
files
UChicago UC3
UC3 seeder nodes
Midwest T2 nodes
OSG VO nodes
Beagle Cray
Cray XE 24-core nodes
Uchicago Midway
Sandybridge nodes
Westmere nodes
Department nodes
5. Swift’s location-independent scripting
lets the user focus on science
Example of running 3,000 jobs to 3 hosts including the UC3 campus
collective:
The user started on a basic login host processing 10 files and moved up to a
3,000 file dataset, changing only the dataset name and a site-specification
list to get to the resources above
Expanded the scope of their computations from one node to hundreds or
thousands of cores
User didn’t need to look at what sites were busy, or adjust arcane scripts,
to get to these resources.
5
www.swiftlang.org
Midway 289
Beagle 1070
UC3 1641
Total 3000
6. Swift is a parallel scripting system for grids, clouds and clusters
– for loosely-coupled applications - application and utility programs linked by
exchanging files
Swift is easy to write: simple high-level C-like functional language
– Small Swift scripts can do large-scale work
Swift is easy to run: contains all services for running Grid workflow - in one
Java application
– Untar and run – acts as a self-contained Grid client
Swift is fast: uses efficient, scalable and flexible distributed execution
engine.
– Scales to range of 1M tasks per script run
Swift usage is growing:
– applications in earth systems sciences, neuroscience, proteomics, molecular
dynamics, biochemistry, economics, statistics, and more.
Try Swift: www.swiftlang.org
6
www.swiftlang.org
8. Face-IT: Framework to Advance Climate,
Economic, and Impact Investigations with
Information Technology
8
www.ci.uchicago.edu/swift www.mcs.anl.gov/exm
• Collaboration between University of Chicago, Columbia
University, Purdue University, and the University of Florida
• Galaxy – scientific workflow system
• Combine the best parts of Swift and Galaxy, and apply to
Earth Systems
The Swift parallel scripting language is a high-level language to express the coordinated and parallel execution of application programs and scripts written in any other scripting or programming language.It was developed in 2006, and is a successor to less flexible workflow languages from which it was inspired, which started around 2001.It’s a minimal language, meant to knit together tools for other languages. This need is pervasive across scientific disciplines, and we work to apply Swift broadly.But its particularly relevant tp Earth Systems sciences, where simulations and instruments typically present their outputs are large datasets of many files.Regarding parallism: the user application tools which a Swift script executes may themselves be parallel programs that use many cores or many nodes on a cluster or supercomputer.Regarding location: express the flow of your computations, and keep the plumbing details and the machine-specific details out of this dataflow description. That gives flexibility, and lets optimizing tools like Swift implement the mechanical but complex aspects of portable distributed execution.