SBGrid Science Portal - eScience 2012

The SBGrid Science Portal provides multi-modal access to computational infrastructure, data storage, and data analysis tools for the structural biology community. It incorporates features not previously seen in cyberinfrastructure science gateways. It enables researchers to securely share a computational study area, including large volumes of data and active computational workflows. A rich identity management system has been developed that simplifies federated access to US national cyberinfrastructure, distributed data storage, and high performance file transfer tools. It integrates components from the Virtual Data Toolkit, Condor, glideinWMS, the Globus Toolkit and Globus Online, the FreeIPA identity management system, Apache web server, and the Django web framework.

Published in: Technology

Transcript

  • 1. The SBGrid Science Portal: An integrated environment for protein structure studies. Ian Stokes-Rees, Harvard Medical School. eScience 2012, Chicago, October 2012
  • 7. What’s interesting about another Science Portal?
    ✦ Interface modalities
      • Web forms, RESTful interfaces, command line
    ✦ Access model
      • Browser SSO, X.509, LDAP, .htaccess, GACL
    ✦ Identity management
      • Streamlined grid account creation
    ✦ Computational capability
      • local, cluster, and grid computing
    ✦ Data management
      • Web (HTTP), scp, GridFTP, GlobusOnline
      • Tiered staging of data
    j.mp/esci12-sbgrid ijstokes@seas.harvard.edu
  • 10. I’m still skeptical. What about Taverna, GridSphere, Galaxy, or HubZero?
    ✦ All great if
      • the portal or application plugin already exists; and
      • the application workflows closely match your requirements
    ✦ Not-so-great if
      • you have to implement a new portal on top of one of those frameworks
      • you want to adapt the workflow
      • your data model changes
      • you want to add a new application
      • you want to explore the data in an unanticipated way
      • command-line access is also important to you
      • you are working with others
  • 11. Links: www.sbgrid.org, portal.sbgrid.org, j.mp/esci12-sbgrid, ijstokes@seas.harvard.edu, @ijstokes
  • 12. Outline
    ✦ Community
      • Who the SBGrid Science Portal is meant to serve
    ✦ Objectives
      • What was the vision for the Science Portal
    ✦ Implementation
      • Software and service architectures
    ✦ Security, Collaboration, and IdM
      • ... or “How I learned to stop worrying and love X.509”
    ✦ Data
      • Tiered data distribution model
  • 13. Community [map of member institutions and investigators; labels as extracted, column order interleaved]: Washington U. School of Med. Cornell U. R. Cerione NE-CAT T. Ellenberger B. Crane R. Oswald D. Fremont S. Ealick C. Parrish Rosalind Franklin NIH M. Jin H. Sondermann D. Harrison M. Mayer A. Ke UMass Medical U. Washington T. Gonen U. Maryland W. Royer E. Toth Brandeis U. UC Davis N. Grigorieff H. Stahlberg Tufts U. K. Heldwein UCSF Columbia U. JJ Miranda Q. Fan Y. Cheng Rockefeller U. Stanford R. MacKinnon A. Brunger Yale U. K. Garcia T. Boggon K. Reinisch T. Jardetzky D. Braddock J. Schlessinger Y. Ha F. Sigworth CalTech E. Lolis F. Zhou P. Bjorkman Harvard and Affiliates W. Clemons N. Beglova A. Leschziner G. Jensen Rice University S. Blacklow K. Miller D. Rees E. Nikonowicz B. Chen A. Rao Y. Shamoo Vanderbilt J. Chou T. Rapoport Y.J. Tao Center for Structural Biology J. Clardy M. Samso WesternU W. Chazin C. Sanders M. Eck P. Sliz M. Swairjo B. Eichman B. Spiller B. Furie T. Springer M. Egli M. Stone R. Gaudet G. Verdine UCSD B. Lacy M. Waterman M. Grant G. Wagner T. Nakagawa M. Ohi S.C. Harrison L. Walensky H. Viadiu Thomas Jefferson J. Hogle S. Walker J. Williams D. Jeruzalmi T. Walz D. Kahne J. Wang T. Kirchhausen S. Wong. Not pictured: University of Toronto: L. Howell, E. Pai, F. Sicheri; NHRI (Taiwan): G. Liou; Trinity College, Dublin: Amir Khan
  • 15. Structural Biology: Study of Protein Structure and Function
    [Figure scale labels: 400m, 1mm, 10nm]
      • Shared scientific data collection facility
      • Data intensive (10-100 GB/day)
  • 16. Consortium By The Numbers
    ✦ ~200 member labs
      • representing about 1500 users
    ✦ ~200 software packages
      • multi-platform (Linux, OS X)
      • multi-version
    ✦ 4 FTE staff
    ✦ Automated software distribution
      • 80 GB for full package
      • rsync+ssh for updates
    ✦ Everything “Just Works”
      • So labs are happy to renew membership and refer friends
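The `rsync+ssh` update path mentioned on slide 16 might look roughly like the sketch below. This is an illustration only, not SBGrid's actual tooling: the host `mirror.example.org`, the paths, and the function name `build_rsync_command` are hypothetical, and the command is assembled but never executed. The flags used (`-a`, `-z`, `-e ssh`, `--delete`) are standard rsync options.

```python
import shlex


def build_rsync_command(remote, local_dir, delete=False):
    """Return an rsync-over-ssh argument list; nothing is executed here."""
    # -a preserves permissions and timestamps, -z compresses in transit,
    # -e ssh selects ssh as the transport.
    cmd = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # Remove local files that no longer exist on the remote side.
        cmd.append("--delete")
    cmd += [remote, local_dir]
    return cmd


# Hypothetical mirror host and paths, for illustration only.
cmd = build_rsync_command("user@mirror.example.org:/software/", "/opt/software/")
print(shlex.join(cmd))
```

Because rsync transfers only changed files, an incremental update of this kind avoids re-sending the full 80 GB package each time.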
  • 17. Boston Life Sciences Hub
      • Biomedical researchers
      • Government agencies
      • Life sciences
      • Universities
      • Hospitals
    [Map label: Tufts University School of Medicine]
  • 29. Hug a Life Scientist!
    ✦ Let them know you care ...
      • ... because the software we give them doesn’t
      • ... and neither do the systems we subject them to
      • ... but to be fair, a lot of the pain is self-inflicted
    ✦ SBGrid came into existence to fill the tech void/pain experienced by structural biologists
    ✦ Started with providing reliable compiled software
    ✦ Expanded into
      • training events and workshops
      • best practice guides
      • shared computational infrastructure (clusters! OSG! GlobusOnline!)
      • web-based collaborative computational and data services
  • 30. Objectives
    A. Extensible infrastructure to facilitate development and deployment of novel computational workflows
    B. Web-accessible environment for collaborative, compute and data intensive science
  • 39. Objectives (explained)
    ✦ Pareto Principle
      • 80% of the time users are happy with basic web form interface to standard application workflow and canned result analysis
      • 20% of the effort to address these routine cases
      • Science Portals are a big win over cumbersome and complex Fortran code
    ✦ Corollary to Pareto Principle
      • 20% of the time users want or need customized application workflow and/or result analysis
      • 80% of the effort to make possible
      • But rare that anyone knows in advance whether 80 or 20 side
  • 50. My Experience and Perspective (Audience Participation)
    ✦ The really interesting stuff happens
      • in the unpredictable 20%
    ✦ Innovative analytical strategies require
      • an ability to rapidly adjust workflow and data analysis
    ✦ You’re stuffed
      • if workflow and data are tightly coupled to portal framework
    ✦ Collaboration is critical:
      • you need to be able to share your work (securely)
      • the web is the obvious (only!) way anyone wants to do this
  • 51. Implementation and Architecture
  • 52. Front End Interface
    ✦ Django (Python) web framework
    ✦ Apache web server
    ✦ Per-user protected jobs and data
    ✦ WebDAV to data
    ✦ ssh access possible
    ✦ Richer access control in development
  • 53. Results Visualization and Analysis
  • 54. NoSQL hierarchical document store
    ✦ The SBGrid Portal’s leading workflow:
      • 100,000 jobs
      • 300,000 output files
      • 20-100k CPU-hours
    ✦ Need a good way to store data
      • flexible data format
      • flexible analysis output
      • fine grained, user-driven access control
      • parallel access
      • remote access
    ✦ high capacity non-relational hierarchical storage
      • ????
  • 57. Operating Systems are Pretty Good
    ✦ File systems work well
      • organize data carefully (hierarchically)
      • include meta-data (mod_cern_meta, file system)
      • serve intelligently via multiple protocols (http, gridftp)
      • leverage POSIX ownerships (user, group, other, r/w)
      • leverage user, group, and volume quotas
      • storage management and backups are easier
    ✦ Process management works well
      • execute as the actual user, where possible
      • setuid, su, ssh, suexec, and gsexec can all help with this
      • process accounting is your friend! (pacct)
      • leverage ulimit for process resource limits
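The "leverage ulimit for process resource limits" point can be illustrated with a short Python sketch. It assumes a Unix-like system and uses the standard-library `resource` module, which exposes the same kernel limits that the shell's `ulimit` sets. The helper names `_cap_soft_limit` and `run_limited` and the 60-second cap are hypothetical choices for illustration, not part of the portal.

```python
import resource
import subprocess
import sys


def _cap_soft_limit(kind, value):
    """Lower the soft limit for `kind` to `value`, never above the hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, hard))


def run_limited(argv, cpu_seconds=60):
    """Run argv in a child process with a CPU-time cap and core dumps disabled."""

    def apply_limits():
        # Executed in the child between fork and exec, so the parent's
        # own limits are left untouched.
        _cap_soft_limit(resource.RLIMIT_CPU, cpu_seconds)
        _cap_soft_limit(resource.RLIMIT_CORE, 0)

    return subprocess.run(
        argv, preexec_fn=apply_limits, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # The child reports the soft CPU limit it actually inherited.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])"
    result = run_limited([sys.executable, "-c", probe], cpu_seconds=60)
    print("child soft CPU limit:", result.stdout.strip())
```

Only the soft limit is lowered here, so the call does not need elevated privileges; a job that exceeds the CPU cap is signalled by the kernel rather than by the portal code.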
  • 58. Data Access
  • 59. Same data served by web and available from command line
  • 60. Open Science Grid (http://opensciencegrid.org)
    ✦ US National Cyberinfrastructure
    ✦ Primarily used for high energy physics computing
    ✦ 80 sites
    ✦ O(1e5) job slots
    ✦ O(1e6) core-hours per day
    ✦ PB scale aggregate storage
    [Figure annotation: 5,073,293 hours, ~570 years]
  • 61. Service Architecture
    [Architecture diagram; labels as extracted, without the original layout.]
    Categories: monitoring, interfaces, data, computation, ID mgmt
    Components: Ganglia, scp, Condor, FreeIPA, Apache, DOEGrids CA, Nagios, GridFTP, Cycle Server, GridSite, LDAP, RSV, SRM, VDT, Django, VOMS, Globus, pacct, WebDAV, Sage Math, GUMS, glideinWMS, Gratia Accting, R-Studio, GACL
    Local resources: file server, SQL DB, shell CLI, cluster (SBGrid Science Portal @ Harvard Medical School)
    External services: GlobusOnline, glideinWMS factory, MyProxy @NCSA, UIUC, Monitoring, Open Science Grid (GUMS, GridFTP + Hadoop; data, computations)
    Sites named: Argonne, UC San Diego, NCSA/UIUC, Lawrence Berkeley Labs, FermiLab, Indiana, Harvard Medical School
  • 67. SBGrid Portal: Current Status
    ✦ 262 users (lifetime), 72 active in past quarter
    ✦ 2.4 million hours on OSG last 12 months
    ✦ Seamless data sharing from web to ssh?
      • requires NFSv4 to allow >12 POSIX groups/user
      • suexec or gsexec possibility
    ✦ Account integration
      • PAM (ssh/command line) + web through FreeIPA LDAP
      • prototype of X.509 + VOMS + MyProxy (next section!)
    ✦ Collaboration
      • shared secret (password)
      • manual .htaccess or .gacl
  • 68. Identity Management (or “How I learned to stop worrying and love X.509”)
  • 72. Big Picture
    ✦ Federated environment requires
      • federated identity management
      • trusted identity providers (“roots of trust”)
    ✦ Collaboration requires
      • user-driven capacity to form cross-organization user groups (aka “Virtual Organizations”)
      • roles (or at least privilege levels) within VO
    ✦ State of Play
      • InCommon will get us part way there (waiting on adoption!)
      • OpenID nice for users, but no trust or delegated perms
      • X.509 process and details still tough for end user
      • SSH keys lack standard root of trust and roles
  • 73. X.509 Digital Certificates
    ✦ Analogy to a passport:
      • Application form
      • Sponsor’s attestation
      • Consular services
      • verification of application, sponsor, and accompanying identification and eligibility documents
      • Passport issuing office
    ✦ Portable, digital passport
      • fixed and secure user identifiers
      • name, email, home institution
      • signed by widely trusted issuer
      • time limited
      • ISO standard
  • 74. X.509 Challenges
    ✦ Lots of “humans in the loop” to get usable cert
      • Registration Agent, Sponsor, VO Manager, User
    ✦ Awkward working with X.509 certs
      • multiple formats
      • proxy certs and VOMS ACs
      • proxy servers (MyProxy)
      • expiry (of proxy, of base cert, of VO membership)
      • browser integration and import process
      • CA cert chain
      • digital token needs to be available on all devices
      • particularly challenging for phones and tablets
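The three separate expiries listed on slide 74 (proxy, base cert, VO membership) interact in a simple way: a credential stops being usable at whichever expiry comes first. The sketch below shows only that arithmetic. The function names and all dates are hypothetical, and no real certificate is parsed.

```python
from datetime import datetime, timezone


def effective_expiry(proxy_expiry, cert_expiry, vo_membership_expiry):
    """A credential is only usable until the earliest of its three expiries."""
    return min(proxy_expiry, cert_expiry, vo_membership_expiry)


def is_usable(now, proxy_expiry, cert_expiry, vo_membership_expiry):
    """True while none of the three lifetimes has run out."""
    return now < effective_expiry(proxy_expiry, cert_expiry, vo_membership_expiry)


# Hypothetical example values, all in UTC.
proxy = datetime(2012, 10, 9, 8, 0, tzinfo=timezone.utc)   # short-lived proxy
cert = datetime(2013, 6, 1, tzinfo=timezone.utc)           # base certificate
vo = datetime(2012, 12, 31, tzinfo=timezone.utc)           # VO membership
now = datetime(2012, 10, 8, 12, 0, tzinfo=timezone.utc)

print(effective_expiry(proxy, cert, vo).isoformat())
print(is_usable(now, proxy, cert, vo))
```

In this example the short-lived proxy is the binding constraint, which is typically why proxy renewal is the step users run into most often.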
  • 75. X.509  Nirvana   (ours  at  least)j.mp/esci12-sbgrid ijstokes@seas.harvard.edu
  • 76. X.509  Nirvana   (ours  at  least) ✦ User  never  sees  X.509  anything • unless  they  want  toj.mp/esci12-sbgrid ijstokes@seas.harvard.edu
  • 77. X.509  Nirvana   (ours  at  least) ✦ User  never  sees  X.509  anything • unless  they  want  to ✦ X.509  request  +  VO  membership  +  account  creation   completed  in  one  step  by  one  person • single  step  for  user • single  step  for  one  administratorj.mp/esci12-sbgrid ijstokes@seas.harvard.edu
  • 78. X.509  Nirvana   (ours  at  least) ✦ User  never  sees  X.509  anything • unless  they  want  to ✦ X.509  request  +  VO  membership  +  account  creation   completed  in  one  step  by  one  person • single  step  for  user • single  step  for  one  administrator ✦ Goodbye  passphrases  (and  forgotten  passphrases) • hold  private  key  in  LDAP  and  use  LDAP  authentication  to  accessj.mp/esci12-sbgrid ijstokes@seas.harvard.edu
  • 79. X.509  Nirvana   (ours  at  least) ✦ User  never  sees  X.509  anything • unless  they  want  to ✦ X.509  request  +  VO  membership  +  account  creation   completed  in  one  step  by  one  person • single  step  for  user • single  step  for  one  administrator ✦ Goodbye  passphrases  (and  forgotten  passphrases) • hold  private  key  in  LDAP  and  use  LDAP  authentication  to  access ✦ Automate  everything • login  (web  or  command  line)  triggers  X.509  proxy  request  with   (default)  VOMS  AC,  and  loading  to  MyProxy  serverj.mp/esci12-sbgrid ijstokes@seas.harvard.edu
  • 80. X.509 Nirvana (ours at least) ✦ User never sees X.509 anything • unless they want to ✦ X.509 request + VO membership + account creation completed in one step by one person • single step for user • single step for one administrator ✦ Goodbye passphrases (and forgotten passphrases) • hold private key in LDAP and use LDAP authentication to access ✦ Automate everything • login (web or command line) triggers X.509 proxy request with (default) VOMS AC, and loading to MyProxy server ✦ VO Management System run by users • users need to be able to self-manage their (sub-)VOs
  • 86. Addressing Certificate Problems • [sequence diagram; actors: CA, RA Agent, Sponsor, User] • Steps in order: generate cert key pair; request signed cert; return tracking number; notify agents; review request; verify user eligibility; confirm eligibility; approve cert; sign cert; notify availability; retrieve cert; export signed cert key pair • Stage labels: U1, R1, S1, R2, U2a • Timeline: T0 = late Saturday night lab session; T+40h = mid-Monday response; T+60h = early-Tuesday response; T+66h = late-Tuesday response; T+70h = late-Tuesday (STAGE 1)
  • 92. VO (Group) Membership Registration • [sequence diagram; actors: VOMS, VO Admin, Sponsor, User] • Steps in order: present cert to request membership by DN; request VO groups and roles; notify admin; verify user eligibility; confirm eligibility; approve membership, groups, and roles; add DN to VOMS; notify; request VOMS AC; return VOMS AC; add AC to proxy cert • Stage labels: U2b, V1, S2, V2 • Timeline: T+82h = mid-Wednesday, ask “What next?”; T+95h = early-Thursday response (time zone!); T+100h = early-Thursday response; T+105h = mid-Thursday response; T+105h = 4.5 days waiting
  • 96. [sequence diagram; actors: CA, LDAP, Portal, Sponsor, User, MyProxy, VOMS, VO Admin] • Steps in order: request portal account; verification email sent; email verified; generate cert key pair; request signed cert; return tracking number; notify agents; verify eligibility; create local acct; confirm eligibility; approve cert; sign cert; notify availability; set retrieval serial number; set VO rights; account ready notification; portal login; request signed certificate; return signed certificate; 1. pair signed cert into PKCS#12 file; 2. create local proxy cert; register proxy cert with MyProxy • Stage labels: U1, A1a, S1*, A1b, U2* • Callouts: user may now access web portal; user may now access federated resources • Timeline: T0 = late Saturday night lab session; T+40h = mid-Monday response; T+40h = 1.7 day wait
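The two timelines can be compared with a line of arithmetic. The sketch below uses only the elapsed-hour figures from the slide annotations (T+105h for the original two-stage workflow, T+40h for the portal workflow); the variable names are illustrative.

```python
HOURS_PER_DAY = 24

# Elapsed hours taken from the slide annotations.
old_workflow_hours = 105  # cert request followed by VO membership registration
new_workflow_hours = 40   # single request through the portal

old_days = old_workflow_hours / HOURS_PER_DAY
new_days = new_workflow_hours / HOURS_PER_DAY
saved_hours = old_workflow_hours - new_workflow_hours

print(f"old: {old_days:.1f} days, new: {new_days:.1f} days, saved: {saved_hours} h")
```

The 40 hours come out at about 1.7 days, matching the slide. The 105 hours come out at about 4.4 days, which the slide rounds to 4.5.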
  • 97. Data Management
  • 98. Data Tiers - Scoping • VO-wide: all sites, admin managed, very stable • User archive: single site, user managed, very stable, 10+ GB • User project: all sites, user managed, 1-10 weeks, 1-3 GB • User static: all sites, user managed, indefinite, 10 MB • Job set: all sites, infrastructure managed, 1-10 days, 0.1-1 GB • Job: direct to worker node, infrastructure managed, 1 day, <10 MB • Job indirect: to worker node via UCSD, infrastructure managed, 1 day, <10 GB
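The tier list on slide 98 is effectively a small lookup table. As a rough sketch (the `DataTier` class and `tiers_managed_by` helper are illustrative names, not part of the SBGrid software), it could be encoded like this, with the values copied from the slide:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataTier:
    name: str
    scope: str         # where the data is placed
    managed_by: str    # who is responsible for it
    lifetime: str
    typical_size: str  # "not stated" where the slide gives no figure

TIERS = [
    DataTier("VO-wide", "all sites", "admin", "very stable", "not stated"),
    DataTier("User archive", "single site", "user", "very stable", "10+ GB"),
    DataTier("User project", "all sites", "user", "1-10 weeks", "1-3 GB"),
    DataTier("User static", "all sites", "user", "indefinite", "10 MB"),
    DataTier("Job set", "all sites", "infrastructure", "1-10 days", "0.1-1 GB"),
    DataTier("Job", "direct to worker node", "infrastructure", "1 day", "<10 MB"),
    DataTier("Job indirect", "to worker node via UCSD", "infrastructure", "1 day", "<10 GB"),
]

def tiers_managed_by(manager):
    """Return the tiers whose management responsibility matches `manager`."""
    return [tier for tier in TIERS if tier.managed_by == manager]

for tier in tiers_managed_by("user"):
    print(tier.name, "|", tier.scope, "|", tier.lifetime, "|", tier.typical_size)
```

Grouping by `managed_by` makes the split visible: three tiers are the user's responsibility, three are handled by the infrastructure, and one by administrators.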
  • 99. Data Movement • scp (users) • rsync (VO-wide) • grid-ftp (UCSD) • curl (WNs) • cp (NFS) • htcp (secure web) • http(s) (web) • About 2 PB with 40 front-end servers for high-bandwidth parallel file transfer
  • 100. Globus Online: High Performance Reliable 3rd Party File Transfer • [diagram] GUMS: DN to user mapping • Certificate Authority: root of trust • VOMS: VO membership • Globus Online: file transfer service • Endpoints: portal, cluster, data collection facility, lab file server, desktop, laptop
  • 103. Ryan, a postdoc in the Frank Lab at Columbia • Access NRAMM facilities securely and transfer data back to home institute • [diagram: SBGrid Science Portal; facility file server (/data/columbia/frank); lab file server (/nfs/data/rsmith); desktop; laptop (/Users/Ryan)]
  • 106. Ryan, a postdoc in the Frank Lab at Columbia • Access NRAMM facilities securely and transfer data back to home institute • Ryan applies for an account at the SBGrid Science Portal • automated X.509 application • verification of lab membership • automated Globus Online application / X.509 linking (wish list!)
  • 109. Ryan, a postdoc in the Frank Lab at Columbia • Access NRAMM facilities securely and transfer data back to home institute • request access to NRAMM facility using credential held by SBGrid • facility checks SBGrid for Ryan’s group membership: in Frank Lab, so grant access to files in /data/columbia/frank
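The access decision on slide 109 reduces to a group-membership lookup. The sketch below is an assumption-laden illustration: the membership snapshot stands in for whatever a facility would obtain by polling VOMS, the DN is made up, and no real VOMS client API is used. Only the directory path comes from the slides.

```python
# Hypothetical snapshot of group membership, as a facility might cache it
# after polling VOMS. The DN below is invented for illustration.
MEMBERS_BY_GROUP = {
    "frank": {"/DC=org/DC=example/CN=Ryan Smith"},
}

# Facility-side mapping from lab group to its data directory (path from the slide).
GROUP_DIRS = {
    "frank": "/data/columbia/frank",
}

def accessible_dirs(user_dn, members_by_group=MEMBERS_BY_GROUP, group_dirs=GROUP_DIRS):
    """Return the sorted directories a user may access, based on group membership."""
    return sorted(
        group_dirs[group]
        for group, members in members_by_group.items()
        if user_dn in members and group in group_dirs
    )

print(accessible_dirs("/DC=org/DC=example/CN=Ryan Smith"))
print(accessible_dirs("/DC=org/DC=example/CN=Someone Else"))
```

The facility only needs to know about lab groups and their directories. Who belongs to each group is delegated to the membership source, so adding or removing a lab member requires no change at the facility.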
  • 114. Ryan, a postdoc in the Frank Lab at Columbia • Access NRAMM facilities securely and transfer data back to home institute • use Globus Online to manage transfer from NRAMM back to lab • initiate transfer at NRAMM • transfer data to lab • notify user of completion
  • 115. [diagram] User can directly access lab or facility data from laptop (“Scott” from Harvard; general public) • Local accounts within lab infrastructure (the Sliz lab): ~sue, ~andy, ~scott • Public access available to archived data through web interface (WWW) • NE-CAT beamline at APS: data collection • Shared (lab level) accounts at facility: /data/sliz, /data/murphy, /data/deacon; /stage/sliz, /stage/murphy • Public and embargoed archive: /public/2009/, /embarg/2010/, /embarg/2011/ • Embargo policy to hold deposited data for agreed time • Tiered storage: Tier1 staging storage (6 month); Tier2 permanent archive (10 TB per group); Tier3 public archive (10 PB) • VO management: VOMS, Globus Online, SBGrid Science Portal
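The embargo policy on slide 115 can be expressed as a date comparison. This is a minimal sketch under stated assumptions: the slide only says data is held "for agreed time", so the embargo length is a parameter, and keying the directory by deposit year is a guess based on the /public/2009/ and /embarg/2010/ path style. The function names are hypothetical.

```python
from datetime import date, timedelta

def is_public(deposit_date, embargo_days, today):
    """True once the agreed embargo period has fully elapsed."""
    return today >= deposit_date + timedelta(days=embargo_days)

def archive_path(deposit_date, embargo_days, today):
    """Pick a directory in the style shown on the slide (assumed: keyed by deposit year)."""
    prefix = "/public" if is_public(deposit_date, embargo_days, today) else "/embarg"
    return f"{prefix}/{deposit_date.year}/"

# Hypothetical deposits and embargo lengths, for illustration only.
today = date(2012, 10, 8)
print(archive_path(date(2009, 3, 1), 365, today))
print(archive_path(date(2011, 6, 1), 730, today))
```

Keeping the embargo length as an argument reflects that the period is agreed per deposit rather than fixed by the infrastructure.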
  • 120. Data at Shared Scientific Facilities ✦ SBGrid • manages all user account creation and credential management • hosts MyProxy, VOMS, GridFTP, and user interfaces ✦ Facility • knows about lab groups • e.g. “Harrison”, “Sliz” • delegates knowledge of group membership to SBGrid VOMS • facility can poll VOMS for list of current members • uses X.509 for user identification • deploys GridFTP server ✦ Lab group • designates group manager who adds/removes individuals • deploys GridFTP server or Globus Connect client ✦ Individual • username/password to access facility and lab storage • Globus Connect for personal GridFTP server to laptop • Globus Online web interface to “drive” transfers
  • 121. Summary ✦ Don’t discount unpredictable 20% • need flexibility to innovate and explore (data and comp) ✦ “Last mile” challenge • to the desktop • to the laptop ✦ Unified and simplified identity management • centralized set of credentials for each person • tight links to CA/X.509, LDAP, MyProxy and VOMS ✦ Empower collaborations to self-manage ✦ Shift of focus from “compute” to “data” • for users • for facilities where data is the main challenge
  • 122. Q&A and Acknowledgements ✦ Piotr Sliz • Supervisor and PI at Harvard Medical School • Chair of SBGrid Consortium ✦ SBGrid Science Portal • Daniel O’Donovan, Meghan Porter-Mahoney, Mick Timoney ✦ SBGrid System Administrators • Ian Levesque, Peter Doherty, Steve Jahl ✦ Facility Collaborators • Frank Murphy (NE-CAT/APS) • Ashley Deacon (JCSG/SLAC) ✦ Globus Online Team • Steve Tuecke, Ian Foster, Rachana Ananthakrishnan, Raj Kettimuthu ✦ OSG Collaborators • Ruth Pordes, Director of OSG, for championing SBGrid • Terrence Martin, for UCSD HDFS support • Steve Timm and Keith Chadwick (FNAL) for helping resolve OSG problems
