Exploiting Web 2.0 for Scientific Simulation. Gabrielle Allen, Department of Computer Science, Center for Computation & Technology, Louisiana State University. Reference: Allen, Loeffler, Radke, Schnetter, Seidel, “Integrating Web 2.0 Technologies with Scientific Simulation Codes for Real-Time Collaboration,” IEEE International Conference on Cluster Computing (Cluster 2009), Workshop on The Impact and Influence of Web 2.0 on e-Research Infrastructure, Services and Applications.
Gravitational Wave Physics: Models; Analysis & Insight; Observations. Petascale problems: full 3D general relativistic models of binary systems, supernovae, gamma-ray bursts.
Understanding Gravity: data and collaboration increasing. [Figure labels: Log(Data); Galileo; Smarr; LSU PRAC]
12-year pedigree, led from LSU. Over $10M in support: NSF, EU, DOD, DOE, NASA, Microsoft, MPG, LSU, NCSA.
Cactus Structure. [Diagram labels: Plug-In “Thorns” (components); remote steering; extensible APIs; ANSI C; driver; Fortran/C/C++; parameters; input/output; scheduling; equations of state; interpolation; Core “Flesh”; error handling; SOR solver; Your Physics!!; make system; wave evolvers; Computational Tools!!; grid variables; multigrid; black holes; coordinates; boundary conditions]
Cactus Structure. [Diagram labels: Web 2.0; Plug-In “Thorns” (components); remote steering; extensible APIs; ANSI C; driver; Fortran/C/C++; parameters; input/output; scheduling; equations of state; interpolation; Core “Flesh”; error handling; SOR solver; Your Physics!!; make system; wave evolvers; Computational Tools!!; grid variables; multigrid; black holes; coordinates; boundary conditions]
Cactus Application Environment: individual research groups; domain-specific shared infrastructure; Flesh: APIs, information, orchestration; adaptive mesh refinement, parallel I/O, interaction, …
Einstein Toolkit (http://www.einsteintoolkit.org): Based on Cactus Framework. Over 130 open, community-developed Cactus modules. Building a consortium of users. Governance and software development. Members: 40 listed on web page; 10 different groups; US, Japan, Mexico, Spain, Germany, Canada. 300 science publications, 50 student theses.
Typical Black Hole Simulations. At LSU …: 300 Cactus thorns; 10,000 potential parameters; 20 different supercomputers; 100-2000 cores; days/weeks to run (checkpoint/restart); GBs to TBs of data (HDF5, ASCII, JPEG).
Collaborative Technologies: Technologies to share simulation-related information have been developed in our group since the early 1990s. Essential to support the scientific research. Review the historical evolution of these technologies. Show how Web 2.0 provides new tools to enable old scenarios.
Web-based Mail Lists: Mosaic web browser (1993, NCSA). Seidel’s group at NCSA worry about content: http://archive.ncsa.illinois.edu/Cyberia/NumRel/GravWaves.html (1995). Collaborative Cork Board (CoCoBoard) (mid-90s): researchers have web-based “project pages”; could attach images!! (usually 1-D plots of results); used until the late 90s. Currently: project-based private wikis for parameter/output files and figures; organize material for weekly project conference calls. Cons: network needed to access/edit the wiki; editing is slow.
Simulation Web Interfaces: Thorn “httpd”. First collaborative tool fundamentally integrated into Cactus. Werner Benger (1999), visiting NCSA from Germany (7 hr time difference and email). Used socket library developed for remote viz (John Shalf & TIKSL project). Thorn “HTTPD” in standard toolkit (2000). Simulation status, variables, timing, viewport, output files, parameter steering, etc. Thorns can include their own web content.
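The idea of a simulation serving its own status and steering pages can be sketched in a few lines of Python. This is only an illustration and not the HTTPD thorn's actual code (Cactus thorns are written in C/C++/Fortran). The `SIMULATION` dictionary, the URL paths, and the `out_every` parameter are hypothetical stand-ins.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory stand-in for the state of a running simulation.
SIMULATION = {
    "status": {"iteration": 0, "time": 0.0},
    "parameters": {"out_every": 10},  # steerable parameters (illustrative)
}

def handle_path(path, state=SIMULATION):
    """Route a request path and return (http_status, body_dict)."""
    url = urlparse(path)
    if url.path == "/status":
        return 200, state["status"]
    if url.path == "/parameters":
        return 200, state["parameters"]
    if url.path == "/steer":
        query = parse_qs(url.query)
        name = query.get("name", [None])[0]
        value = query.get("value", [None])[0]
        if name not in state["parameters"] or value is None:
            return 400, {"error": "unknown parameter or missing value"}
        old = state["parameters"][name]
        try:
            # Convert the query string to the type of the current value.
            state["parameters"][name] = type(old)(value)
        except ValueError:
            return 400, {"error": "bad value"}
        return 200, {name: state["parameters"][name]}
    return 404, {"error": "not found"}

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        code, body = handle_path(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(port=8080):
    """Serve on localhost only; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), StatusHandler).serve_forever()
```

Calling `serve()` and then opening `http://127.0.0.1:8080/steer?name=out_every&value=20` would change the parameter in the sketch. Steering through a plain GET request with no authentication is only acceptable for a local demonstration.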
Issues: Authorization to web pages (username/password in parameter file is insecure and awkward; newer version uses HTTPS and can also use X.509). Browsers can display images in certain formats; a Visualization thorn uses gnuplot to include plots of, e.g., performance with time and physical parameters. Problems deploying on compute nodes where the web server cannot be directly accessed (port forwarding, firewalls). How to find and track the simulations and publicize their existence to a collaboration?
Simulation Reports and Email: Readable report automatically generated for each simulation (computation and physics). Prototyped 2001 but not used (?). How to collect reports in one place? Mail thorn (sendmail): email is reliable and fault tolerant (spool), but supercomputers do not allow mail to be sent from compute nodes.
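As a rough illustration of an automatically generated report, the sketch below assembles a plain-text status message with Python's standard `email` package. The field names, the sender address, and the function `build_report` are hypothetical; the original Mail thorn used sendmail and its report format is not described here.

```python
from email.message import EmailMessage

def build_report(sim_name, machine, iteration, wall_hours, recipients,
                 sender="simulation@example.org"):
    """Assemble a plain-text status report as an email message (not sent here)."""
    lines = [
        f"simulation: {sim_name}",
        f"machine: {machine}",
        f"iteration: {iteration}",
        f"wall time (hours): {wall_hours:.1f}",
    ]
    msg = EmailMessage()
    msg["Subject"] = f"[report] {sim_name} at iteration {iteration}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("\n".join(lines))
    return msg
```

Delivery would need a reachable mail relay, for example `smtplib.SMTP("localhost").send_message(msg)`. As the slide notes, compute nodes typically do not provide one.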
GridLab Visualization Service: Brygg Ullmer (2004)
Announcing Simulation Info: Publish (application-provided) simulation information. Thorn Announce, in prototype Cactus Worm scenario (2001). Message from Flesh/Thorn info. Transport: XML-RPC to remote socket (portal). Issues: job IDs; security, mapping users; cumbersome user-set parameters (portal location, visibility of job, notification needs). Announcing to ASC Portal (2002).
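To make the XML-RPC transport concrete, here is a minimal sketch of packaging simulation information as an XML-RPC call with Python's standard `xmlrpc.client`. The method name `announce.simulation` and the field names are invented for illustration; the actual message format of thorn Announce and the portal's interface are not given in the slides.

```python
import xmlrpc.client

def build_announcement(job_id, user, machine, thorns, visibility="private"):
    """Serialize simulation info as an XML-RPC method call (nothing is sent)."""
    info = {
        "job_id": job_id,
        "user": user,
        "machine": machine,
        "thorns": list(thorns),
        "visibility": visibility,
    }
    # "announce.simulation" is a made-up method name for illustration.
    return xmlrpc.client.dumps((info,), methodname="announce.simulation")

def parse_announcement(payload):
    """Decode a payload the way a receiving portal might; returns (method, info)."""
    params, method = xmlrpc.client.loads(payload)
    return method, params[0]
```

In a live setting the same call could be made through `xmlrpc.client.ServerProxy(portal_url)`, where `portal_url` is the portal location that the slide lists among the user-set parameters.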
Notification: Portal notification service. Portal users configure at portal; simulations configure in parameter file. Email, SMS, instant message. Initial experiments generated large telecom bills! Cool and useful, but lots of work (FTE) to develop and modify the portal service, and difficult to configure.
Web 2.0 Technologies: Use for collaborative, simulation-level messaging and information archiving. Reliable, persistent, well documented, user-configurable, cheap, well supported, good APIs.
Twitter (March 2006): Real-time short messaging system. Users send and receive each other’s updates (tweets). Wide range of devices and rudimentary social networking. Receivers can filter the messages they see and specify how they receive them. Twitter API (e.g. post a new Twitter message from a user). Free.
Thorn Twitter: Uses libcurl. Cactus parameters for Twitter username/password. Twitter API: statuses/update. At LSU, “numrel” group account. Messages when simulation starts and at different stages.
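The flow of a status update (format a short message, then POST it with the account credentials) might look like the sketch below. It uses Python's standard library instead of libcurl and only builds the request without sending it. The `api_url` argument is a placeholder, and the `status` form field and HTTP basic authentication are assumptions about how a 2009-era statuses/update call was made. The 140-character limit reflects Twitter at the time of the talk.

```python
import base64
import urllib.parse
import urllib.request

MAX_TWEET = 140  # message length limit at the time of the talk

def format_status(sim_name, stage, iteration):
    """Build a short status line, truncating with '...' if it is too long."""
    text = f"{sim_name}: {stage} (iteration {iteration})"
    if len(text) <= MAX_TWEET:
        return text
    return text[:MAX_TWEET - 3] + "..."

def build_update_request(api_url, username, password, status):
    """Prepare (but do not send) a form-encoded POST with basic authentication."""
    data = urllib.parse.urlencode({"status": status}).encode("ascii")
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(api_url, data=data, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

A thorn following this pattern would call `format_status` at startup and at later stages of the run, then send the request. Keeping the password in a parameter file has the same weakness noted for the web interface.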
Flickr (2004): Image hosting website for digital photographs (and now videos). Bought by Yahoo (2005). Professional account ($25/yr) for unlimited use. Web service API for uploading and manipulating images. Group images into Sets and Collections. Tags, title, description, metadata from EXIF headers. Social networking: users can comment on images, flag them, order by popularity, etc. Public/Private/Friends/Family. Blogs. RSS feed allows quick previewing.
Thorn Flickr: Sends images from a running simulation. Uses flickcurl, libcurl, libxml2, openssl. Authentication more complex (API key, shared secret). Thorn uploads images that are generated by Cactus (and known to the I/O layer), e.g. IoJpeg. Each simulation is given its own Flickr set.
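The extra authentication step (API key plus shared secret) can be illustrated with a request-signing sketch. It assumes Flickr's legacy signing scheme as commonly described: the MD5 hash of the shared secret followed by the request parameters, sorted by name and concatenated as name then value. The thorn itself delegates this to flickcurl; the function names and field values below are hypothetical.

```python
import hashlib

def sign_request(params, shared_secret):
    """Return an MD5 signature over the secret plus name-sorted parameters.

    Assumes the legacy scheme: secret + name1 + value1 + name2 + value2 ...
    """
    base = shared_secret + "".join(
        name + str(params[name]) for name in sorted(params)
    )
    return hashlib.md5(base.encode("utf-8")).hexdigest()

def build_upload_fields(api_key, shared_secret, auth_token, title, tags):
    """Collect the signed form fields that would accompany an image upload."""
    fields = {
        "api_key": api_key,
        "auth_token": auth_token,
        "title": title,
        "tags": " ".join(tags),
    }
    # The image data itself is assumed not to be part of the signature.
    fields["api_sig"] = sign_request(fields, shared_secret)
    return fields
```

For example, `build_upload_fields("key", "secret", "token", "rho at it=100", ["cactus", "bbh"])` returns the fields to send alongside the image. Shared tags of this kind are one way to realize the common tagging mentioned under future work.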
Future Work: Extend capabilities, production testing. Common authentication mechanism. Social networking model (individual/shared accounts). Development of common tags, more metadata, etc. Storing videos (Flickr, YouTube, Vimeo): advantage for scientists presenting. Lots of other possibilities: Dropbox to publish files across a collaboration, WordPress for simulation reports/blogs, Facebook to replace grid portals and aggregate services, cloud computing APIs for “grid” scenarios, …
Einstein Toolkit: Trying to establish a community for computational relativity. Wiki for community documentation. Blog for community posting. www.einsteintoolkit.org
Conclusions: Started as a fun project (undergrad). Web 2.0 provides reliable delivery, storage, access, and flexible collaborative features. Can use Web 2.0 to easily prototype new interactive and collaborative scenarios (have really missed this). Small groups and individuals can do this too!! Target standard of ease of use for cyberinfrastructure development. For real use, need unified authentication, clear policies on data, site versions.