Grid Computing Overview
Federating Compute and Storage Resources to Accelerate Science and Aid Collaboration

Ian Stokes-Rees, PhD
Harvard Medical School, Boston, USA

http://portal.sbgrid.org
ijstokes@hkl.hms.harvard.edu
Slides and Contact
ijstokes@hkl.hms.harvard.edu
http://linkedin.com/in/ijstokes
http://slidesha.re/ijstokes-grid2011
http://www.sbgrid.org
http://portal.sbgrid.org
http://www.opensciencegrid.org
http://www.xsede.org
Today
Grid Computing Origins
  High Energy Physics
Life Science Computing Survey
US Cyberinfrastructure
  XSEDE
  Open Science Grid
Science Portals
Data Access
User Credentials
Security
Things I'm Excited About
About Me
High Energy Physics
40 MHz bunch crossing rate
10 million data channels
1 kHz level-1 event recording rate
1-10 MB per event
14 hours per day, 7+ months / year
4 detectors
6 PB of data / year
globally distribute data for analysis (x2)
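A back-of-the-envelope check of what the distribution figures above imply for network load (a rough sketch; the ~210-day run length is an assumption derived from "7+ months / year", the rest comes from the bullets):

```python
# Rough check of the data-distribution numbers above
# (a sketch; the 210-day run length is an assumption).

PB = 1e15                      # bytes in a petabyte (decimal)
data_per_year = 6 * PB         # recorded data per year
replicas = 2                   # "globally distribute data for analysis (x2)"
run_days = 7 * 30              # ~7 months of running

bytes_moved = data_per_year * replicas
per_day = bytes_moved / run_days
per_second = per_day / 86_400

print(f"~{bytes_moved / PB:.0f} PB moved per year")
print(f"~{per_day / 1e12:.0f} TB per day during the run")
print(f"~{per_second / 1e9:.2f} GB/s sustained wide-area transfer")
```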
Data and Compute Intensive Life Sciences
Study of Protein Structure and Function
[Scale illustration: 400 m, 1 mm, 10 nm]
• Shared scientific data collection facility
• Data intensive (10-100 GB/day)
Cryo Electron Microscopy
• Previously, 1-10,000 images, managed by hand
• Now, robotic systems collect millions of hi-res images
• Estimate 250,000 CPU-hours to reconstruct a model
Molecular Dynamics Simulations
1 fs time step
1 ns snapshot
1 us simulation
1e6 steps
1000 frames
10 MB / frame
10 GB / sim
20 CPU-years
3 months (wall-clock)
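These figures hang together arithmetically; a quick sketch (assuming "1e6 steps" refers to the number of 1 fs steps between successive 1 ns snapshots):

```python
# How the simulation numbers above fit together (a sketch; assumes
# "1e6 steps" means the 1 fs steps between successive 1 ns snapshots).

fs_per_ns = 1_000_000            # 1 ns snapshot interval / 1 fs time step
ns_per_sim = 1_000               # 1 us simulation / 1 ns snapshot interval

steps_per_snapshot = fs_per_ns               # 1e6 time steps per snapshot
frames_per_sim     = ns_per_sim              # 1000 frames
total_steps        = fs_per_ns * ns_per_sim  # 1e9 steps overall
data_per_sim_gb    = frames_per_sim * 10 / 1000   # 10 MB/frame -> 10 GB

print(steps_per_snapshot, frames_per_sim, total_steps, data_per_sim_gb)
```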
Functional MRI
Next Generation Sequencing
Scientific Collaborations: by Region and Domain
SBGrid Consortium
[Map of member labs by institution: Washington U. School of Medicine, U. Washington, UC Davis, UCSF, Stanford, CalTech, WesternU, UCSD, Rosalind Franklin, NIH, U. Maryland, Rice University, Vanderbilt Center for Structural Biology, Thomas Jefferson, Cornell U., NE-CAT, UMass Medical, Brandeis U., Tufts U., Columbia U., Rockefeller U., Yale U., and Harvard and affiliates. Not pictured: University of Toronto, NHRI (Taiwan), Trinity College Dublin.]
Boston Life Sciences Hub
• Biomedical researchers
• Government agencies
• Life sciences
• Universities
• Hospitals
[Map of Boston-area institutions, e.g. Tufts University School of Medicine]
Scientific Research Today
• International collaborations
  • IT becomes embedded in the research process: data, results, analysis, visualization
  • Crossing institutional and national boundaries
• Computational techniques increasingly important
  • ... and computationally intensive techniques as well
  • requires use of high performance computing systems
• Data volumes are growing fast
  • hard to share
  • hard to manage
• Scientific software often difficult to use
  • or to use properly
• Web based tools increasingly important
  • but often disconnected from persisted and shared results
Required:
Collaborative environment for compute and data intensive science
XSEDE
http://www.xsede.org
• 30,000 hour allocations "easy"
  • millions of hours possible
  • any US-based researcher can apply
• allocation holder can delegate
• access to a ~dozen supercomputing centers
• command line access
  • standard batch systems like PBS, LSF, SGE
• web-based interaction
  • build your own Science Gateway
  • XSEDE for processing behind the scenes
Open Science Grid
http://opensciencegrid.org
• US National Cyberinfrastructure
• Primarily used for high energy physics computing
• 80 sites
• 100,000 job slots
• 1,500,000 hours per day
• PB scale aggregate storage
• 1 PB transferred each day
• Virtual Organization-based
[Usage chart: 5,073,293 hours, roughly 570 years]
Simplified Grid Architecture
Grid Architectural Details
• Resources
  • Uniform compute clusters
  • Managed via batch queues (see the submission sketch after this list)
  • Local scratch disk
  • Sometimes high perf. network (e.g. InfiniBand)
  • Behind NAT and firewall
  • No shell access
• Data
  • Tape-backed mass storage
  • Disk arrays (100s TB to PB)
  • High bandwidth (multi-stream) transfer protocols
  • File catalogs
  • Meta-data
  • Replica management
• Information
  • LDAP based most common (not optimized for writes)
  • Domain specific layer
  • Open problem!
• Fabric
  • In most cases, assume functioning Internet
  • Some sites part of experimental private networks
• Security
  • Typically underpinned by X.509 Public Key Infrastructure
  • Same standards as SSL/TLS and "server certs" for "https"
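Since compute resources are reached through batch queues rather than interactive shells, a site-local submission typically looks something like the following sketch, assuming a PBS-style scheduler with `qsub` on the path (the script name, application, and resource requests are illustrative, not taken from the slides):

```python
import subprocess

# Illustrative PBS-style job script; resource requests and the
# application name are assumptions, not taken from the slides.
job_script = """#!/bin/bash
#PBS -N demo-job
#PBS -l nodes=1:ppn=1,walltime=02:00:00
cd $PBS_O_WORKDIR
./my_app input.dat > output.dat
"""

with open("demo-job.pbs", "w") as fh:
    fh.write(job_script)

# Hand the script to the site's batch queue; the scheduler decides
# when and where it runs on the cluster behind the gatekeeper.
subprocess.run(["qsub", "demo-job.pbs"], check=True)
```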
OSG Components (I)
• Centralized
  • X.509 Certificate Authority: Energy Science Network CA @ LBL
  • Accounting: Gratia logging system to track usage (CPU, Network, Disk)
  • Status: LDAP directory with details of each participating system
  • Support: central clearing house for support tickets
  • Software: distribution system, update testing, bug reporting and fixing
  • Communication: wikis, docs, mailing lists, workshops, conferences, etc.
• Per Site
  • Compute Element/Gatekeeper (CE/GK): access point for external users, acts as frontend for any cluster. Globus GRAM + local batch system (see the submission sketch after this list)
  • Storage Element (SE): grid-accessible storage system, GridFTP-based + SRM
  • Worker Nodes (WN): cluster nodes with grid software stack
  • User Interface (UI): access point for local users to interact with remote grid
  • Access Control: GUMS + PRIMA for ACLs to local system by grid identities
  • Admin contact: need a local expert (or two!)
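One common way to reach a Compute Element's GRAM gatekeeper from a User Interface machine is Condor-G, where an ordinary HTCondor submit description names the remote endpoint. A minimal sketch, with a placeholder gatekeeper hostname and jobmanager:

```python
import subprocess

# Condor-G style submit description: the job is routed to a remote
# Globus GRAM gatekeeper, which hands it to the site's batch system.
# "gk.example.org/jobmanager-pbs" is a placeholder endpoint.
submit_description = """universe      = grid
grid_resource = gt2 gk.example.org/jobmanager-pbs
executable    = my_app
arguments     = input.dat
output        = job.out
error         = job.err
log           = job.log
queue
"""

with open("job.sub", "w") as fh:
    fh.write(submit_description)

subprocess.run(["condor_submit", "job.sub"], check=True)
```

The gatekeeper then hands the job to whatever local batch system the site runs, as described in the CE/GK bullet above.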
OSG Components (II)
• Per Virtual Organization (user community)
  • VO Management System (VOMS): to organize and register users
  • Registration Authority (RA): to validate community users with X.509 issuer
  • User Interface system (UI): provide gateway to OSG for users
  • Support Contact: users are supported by their VO representatives
• Per User
  • X.509 user certificate (although I'd like to hide that part)
  • Induction: unless it is through a portal, grid computing is not shared file system batch computing! Many more failure modes and gotchas.
Grid Opportunities
• New compute intensive workflows
  • think big: tens or hundreds of thousands of hours finished in 1-2 days
  • sharing resources for efficient and large scale utilization
• Data intensive problems
  • we mirror 20 GB of data to 30 computing centers
• Data movement, management, and archive
• Federated identity and user management
  • labs, collaborations or ad-hoc groups
  • role-based access control (RBAC) and IdM
• Collaborative environment
• Web-based access to applications
Protein Structure Determination
Typical Layered Environment
• Command line application, e.g. a Fortran binary
• Friendly application API wrapper (Python API)
• Batch execution wrapper for N iterations (the "map" step: multi-exec wrapper)
• Results extraction and aggregation (the "reduce" step: result aggregator)
• Grid job management wrapper
• Web interface: forms, views, static HTML results
• GOAL: eliminate shell scripts, often found as the "glue" language between layers
(a sketch of the lower layers follows below)
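A minimal sketch of the lower layers of this stack: a Python API wrapper around a hypothetical command-line binary (`myfortran_app` is a placeholder), a multi-execution "map" step over N parameters, and a small result aggregator for the "reduce" step. The output format parsed by the aggregator is an assumption for illustration:

```python
import subprocess
from pathlib import Path

def run_app(param: float, workdir: Path) -> Path:
    """Friendly API wrapper around a command-line binary.

    'myfortran_app' and its flag are placeholders for whatever
    legacy executable sits at the bottom of the stack.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    out = workdir / "result.txt"
    with out.open("w") as fh:
        subprocess.run(["myfortran_app", "--param", str(param)],
                       stdout=fh, check=True)
    return out

def run_sweep(params):
    """Batch execution wrapper: one run per parameter (the 'map' step)."""
    return [run_app(p, Path(f"run_{i:04d}")) for i, p in enumerate(params)]

def aggregate(result_files):
    """Result aggregator (the 'reduce' step).

    Assumes each result file ends with a line like 'score: <value>'.
    """
    scores = []
    for path in result_files:
        last_line = path.read_text().strip().splitlines()[-1]
        scores.append(float(last_line.split(":")[1]))
    return scores

if __name__ == "__main__":
    files = run_sweep([0.1, 0.2, 0.3])
    print(aggregate(files))
```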
Web Portals for Collaborative, Multi-disciplinary Research...
... which leverage capabilities of federated grid computing environments
The Browser as the Universal Interface
• If it isn't already obvious to you
  • Any interactive application developed today should be web-based with a RESTful interface (if at all possible)
• A rich set of tools and techniques
  • AJAX, HTML4/5, CSS, and JavaScript
  • Dynamic content negotiation
  • HTTP headers, caching, security, sessions/cookies
• Scalable, replicable, centralized, multi-threaded, multi-user
• Alternatives
  • Command Line (CLI): great for scriptable jobs
  • GUI toolkits: necessary for applications with high graphics or I/O demands
What is a Science Portal?
• A web-based gateway to resources and data
  • simplified access
  • centralized access
  • unified access (CGI, Perl, Python, PHP, static HTML, static files, etc.)
• Attempt to provide uniform access to a range of services and resources
• Data access via HTTP
• Leverage brilliance of Apache HTTPD and associated modules
SBGrid Science Portal Objectives
A. Extensible infrastructure to facilitate development and deployment of novel computational workflows
B. Web-accessible environment for collaborative, compute and data intensive science
[Diagram: the SBGrid user community reaches national cyberinfrastructure through the portal: Open Science Grid (national federated cyberinfrastructure), TeraGrid, NERSC, Odyssey, Orchestra, and EC2. The portal's role is to facilitate the interface between the community and the cyberinfrastructure.]
Results Visualization and Analysis
Data Access
User access to results data
Experimental Data Access
• Collaboration
• Access Control
• Identity Management
• Data Management
• High Performance Data Movement
• Multi-modal Access
Data Model
• Data Tiers
  • VO-wide: all sites, admin managed, very stable
  • User project: all sites, user managed, 1-10 weeks, 1-3 GB
  • User static: all sites, user managed, indefinite, 10 MB
  • Job set: all sites, infrastructure managed, 1-10 days, 0.1-1 GB
  • Job: direct to worker node, infrastructure managed, 1 day, <10 MB
  • Job indirect: to worker node via UCSD, infrastructure managed, 1 day, <10 GB
About 2 PB with 100 front end servers for high bandwidth parallel file transfer
Globus Online: High Performance Reliable 3rd Party File Transfer
[Diagram: GUMS (DN to user mapping), VOMS (VO membership), and a Certificate Authority (root of trust) underpin the Globus Online file transfer service, which moves data between the portal, a cluster, a lab file server, a data collection facility, desktops, and laptops.]
Architecture
✦   SBGrid
    •   manages all user account creation and credential mgmt
    •   hosts MyProxy, VOMS, GridFTP, and user interfaces
✦   Facility
    •   knows about lab groups
        •   e.g. "Harrison", "Sliz"
    •   delegates knowledge of group membership to SBGrid VOMS
        •   facility can poll VOMS for list of current members
    •   uses X.509 for user identification
    •   deploys GridFTP server
✦   Lab group
    •   designates group manager that adds/removes individuals
    •   deploys GridFTP server or Globus Connect client
✦   Individual
    •   username/password to access facility and lab storage
    •   Globus Connect for personal GridFTP server on a laptop
    •   Globus Online web interface to "drive" transfers (a command-line alternative is sketched below)
Objective
✦   Easy to use high performance data mgmt environment
✦   Fast file transfer
    •   facility-to-lab, facility-to-individual, lab-to-individual
✦   Reduced administrative overhead
✦   Better data curation
Ryan, a postdoc in the Frank Lab at Columbia, needs to access NRAMM facilities securely and transfer data back to his home institute.
[Workflow diagram:
1. Ryan applies for an account at the SBGrid Science Portal and requests access to NRAMM; an automated X.509 certificate application runs on his behalf.
2. The NRAMM facility checks SBGrid (VOMS) for Ryan's group membership, verifies he belongs to the Frank Lab, and grants access to files under /data/columbia/frank on the facility file server.
3. Ryan initiates the transfer using the credential held by SBGrid; the automated Globus Online application manages the transfer from NRAMM back to the lab file server (/nfs/data/rsmith), a desktop, or his laptop (/Users/Ryan), and notifies him on completion.]
Challenges
✦   Access control
    •   visibility
    •   policies
✦   Provenance
    •   data origin
    •   history
✦   Meta-data
    •   attributes
    •   searching
User Credentials

Unified Account Management
• Hierarchical LDAP database: user basics, passwords (standard schemas)
• Relational DB: user custom profiles, institutions, lab groups (custom schemas)
(an LDAP query sketch follows below)
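A minimal sketch of how the LDAP half of such a setup might be queried from Python using the ldap3 library; the server name, bind DN, base DN, and attribute names are illustrative assumptions, not the actual SBGrid schema:

```python
from ldap3 import Server, Connection, ALL

# Placeholder directory server and bind credentials.
server = Server("ldap.example.org", get_info=ALL)
conn = Connection(server, user="cn=admin,dc=example,dc=org",
                  password="secret", auto_bind=True)

# Look up basic account attributes for one user.
conn.search(search_base="ou=people,dc=example,dc=org",
            search_filter="(uid=rsmith)",
            attributes=["cn", "mail", "memberOf"])

for entry in conn.entries:
    print(entry.cn, entry.mail)
```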
X.509 Digital Certificates
✦   Analogy to a passport:
    •   Application form
    •   Sponsor's attestation
    •   Consular services
        •   verification of application, sponsor, and accompanying identification and eligibility documents
    •   Passport issuing office
✦   Portable, digital passport
    •   fixed and secure user identifiers
        •   name, email, home institution
    •   signed by widely trusted issuer
    •   time limited
    •   ISO standard
(a certificate-inspection sketch follows below)
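The fields that make the certificate a portable digital passport (subject identifiers, trusted issuer, validity window) can be read directly from the file. A small sketch using the cryptography package, with a placeholder path:

```python
from cryptography import x509
from cryptography.hazmat.primitives import hashes

# Placeholder path to a PEM-encoded user certificate.
with open("usercert.pem", "rb") as fh:
    cert = x509.load_pem_x509_certificate(fh.read())

print("Subject:   ", cert.subject.rfc4514_string())   # who the holder is
print("Issuer:    ", cert.issuer.rfc4514_string())    # the trusted signer (CA)
print("Not before:", cert.not_valid_before)           # time limited ...
print("Not after: ", cert.not_valid_after)
print("SHA-256:   ", cert.fingerprint(hashes.SHA256()).hex())
```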
Addressing Certificate Problems
[Sequence diagram: the user generates a certificate key pair and requests a signed certificate, receiving a tracking number; agents are notified, review the request, verify and confirm the user's eligibility, approve and sign the certificate, and notify the user of its availability; the user then retrieves the signed certificate key pair.]
VO (Group) Membership Registration
[Sequence diagram: the user presents their certificate to request membership of the VO, along with VO groups and roles; an administrator is notified, verifies and confirms eligibility, and approves the membership, groups, and roles; the user's DN is added to the VO and the VOMS database, and the user is notified.]
[Sequence diagram, portal-mediated variant: the portal generates the certificate key pair and submits the signing request on the user's behalf, a local portal account is created while agents verify and confirm eligibility, the approved and signed certificate is packaged into a PKCS#12 file, a local proxy certificate is created and registered with MyProxy, and the portal can then access delegated grid resources when the user logs in.]
Process and Design Improvements
✦   Single web-form application
    •   includes e-mail verification
✦   Centralized and connected credential management
    •   FreeIPA LDAP: user directory and credential store
    •   VOMS: lab, institution, and collaboration affiliations
    •   MyProxy: X.509 credential store
✦   Overlap administrative roles
    •   system admin
    •   registration agent for certificate authority (approve X.509 request)
    •   VO administrator to register group affiliations
✦   Automation (a credential-retrieval sketch follows below)
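As part of that automation, retrieving a short-lived credential from the MyProxy store is a single command. A hedged sketch driven from Python (the server name and username are placeholders; exact options depend on the deployment):

```python
import subprocess

# Fetch a short-lived proxy credential from a MyProxy server.
# "myproxy.example.org" and the username are placeholders.
subprocess.run([
    "myproxy-logon",
    "-s", "myproxy.example.org",   # MyProxy server holding the credential
    "-l", "rsmith",                # account name under which it was stored
    "-t", "12",                    # lifetime of the retrieved proxy, in hours
], check=True)
```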
Security
Access Control
• Need a strong Identity Management environment
  • individuals: identity tokens and identifiers
  • groups: membership lists
  • Active Directory/CIFS (Windows), Open Directory (Apple), FreeIPA (Unix): all LDAP-based
• Need to manage and communicate Access Control policies
  • institutionally driven
  • user driven
• Need Authorization System
  • Policy Enforcement Point (shell login, data access, web access, start application)
  • Policy Decision Point (store policies and understand relationship of identity token and policy)
Access Control
• What is a user?
  • .htaccess and .htpasswd
  • local system user (NIS or /etc/passwd)
  • portal framework user (proprietary DB schema)
  • grid user (X.509 DN)
• What are we securing access to?
  • Web pages?
  • URLs?
  • Data?
  • Specific operations?
  • Meta data?
• What kind of policies do we enable?
  • Simplify to READ WRITE EXECUTE LIST ADMIN (see the sketch after this list)
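A toy sketch of the Policy Decision Point / Policy Enforcement Point split from the previous slide, using the simplified READ/WRITE/EXECUTE/LIST/ADMIN vocabulary; the DNs, group names, and paths are made up for illustration:

```python
# Toy Policy Decision Point / Policy Enforcement Point split.
# Identities are X.509-style DNs, groups stand in for VO membership,
# and actions follow the simplified READ/WRITE/EXECUTE/LIST/ADMIN model.

GROUPS = {  # who belongs to which group (placeholder data)
    "/DC=org/DC=example/CN=Ryan Smith": {"frank-lab"},
}

POLICIES = [  # (resource prefix, group, allowed actions)
    ("/data/columbia/frank", "frank-lab", {"READ", "LIST"}),
    ("/data/columbia/frank/incoming", "frank-lab", {"READ", "WRITE", "LIST"}),
]

def pdp_decide(dn: str, resource: str, action: str) -> bool:
    """Policy Decision Point: evaluate stored policies for this request."""
    user_groups = GROUPS.get(dn, set())
    return any(resource.startswith(prefix)
               and group in user_groups
               and action in actions
               for prefix, group, actions in POLICIES)

def pep_open(dn: str, path: str):
    """Policy Enforcement Point: gate the actual data access on the decision."""
    if not pdp_decide(dn, path, "READ"):
        raise PermissionError(f"{dn} may not READ {path}")
    return open(path, "rb")

dn = "/DC=org/DC=example/CN=Ryan Smith"
print(pdp_decide(dn, "/data/columbia/frank/run42.mrc", "READ"))   # True
print(pdp_decide(dn, "/data/columbia/frank/run42.mrc", "WRITE"))  # False
```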
Architecture Diagrams
Service Architecture
[Service architecture of the SBGrid Science Portal @ Harvard Medical School:
• Monitoring: Ganglia, Nagios, RSV, pacct
• Interfaces: Apache, GridSite, Django, Sage Math, R-Studio, shell CLI
• Data: scp, GridFTP, SRM, WebDAV; file server and SQL DB
• Computation: Condor, Cycle Server, VDT, Globus, glideinWMS; cluster
• ID mgmt: FreeIPA, LDAP, VOMS, GUMS, GACL
External services: Globus Online @ Argonne; GUMS, GridFTP + Hadoop, and the glideinWMS factory @ UC San Diego; Open Science Grid; MyProxy @ NCSA, UIUC; DOEGrids CA @ Lawrence Berkeley Labs; Gratia accounting @ FermiLab; monitoring @ Indiana.]
Summary
Acknowledgements & Questions
• Piotr Sliz
  • Principal Investigator, head of SBGrid
• SBGrid Science Portal
  • Daniel O'Donovan, Meghan Porter-Mahoney
• SBGrid System Administrators
  • Ian Levesque, Peter Doherty, Steve Jahl
• Globus Online Team
  • Steve Tueke, Ian Foster, Rachana Ananthakrishnan, Raj Kettimuthu
• Ruth Pordes
  • Director of OSG, for championing SBGrid

Please contact me with any questions:
• Ian Stokes-Rees
• ijstokes@hkl.hms.harvard.edu
• ijstokes@spmetric.com

Look at our work:
• portal.sbgrid.org
• www.sbgrid.org
• www.opensciencegrid.org
Extra Slides
Existing Security Infrastructure
• X.509 certificates
  • Department of Energy CA
  • Regional/Institutional RAs (SBGrid is an RA)
• X.509 proxy certificate system
  • Users self-sign a short-lived passwordless proxy certificate used as a "portable" and "automated" grid processing identity token
  • Similarities to Kerberos tokens
• Virtual Organizations (VO) for definitions of roles, groups, attrs
• Attribute Certificates
  • Users can (attempt to) fetch ACs from the VO to be attached to proxy certs
• POSIX-like file access control (Grid ACL)
(a proxy-creation sketch follows below)
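Creating the short-lived proxy with VO attributes attached is normally done with voms-proxy-init. A hedged sketch (the VO name is illustrative, and option details vary between deployments):

```python
import subprocess

# Create a short-lived proxy certificate with a VOMS attribute
# certificate attached. "sbgrid" is used as the VO name for illustration.
subprocess.run([
    "voms-proxy-init",
    "-voms", "sbgrid",      # request an attribute certificate from this VO
    "-valid", "12:00",      # proxy lifetime (HH:MM)
], check=True)

# Inspect the resulting proxy (subject, time left, attached VO attributes).
subprocess.run(["voms-proxy-info", "-all"], check=True)
```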
Data Management
• quota
• du scan
• tmpwatch
• conventions
• workflow integration

Data Movement
• scp (users)
• rsync (VO-wide)
• grid-ftp (UCSD)
• curl (WNs)
• cp (NFS)
• htcp (secure web)
(a transfer sketch follows below)
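A small sketch of how two of these movement tools might be driven from a workflow script; the hosts and paths are placeholders, and each tool in the list covers a different trust and performance niche:

```python
import subprocess

# VO-wide mirroring of a reference data set (placeholder paths and host).
subprocess.run(["rsync", "-av", "--delete",
                "/srv/vo-data/", "mirror.example.org:/srv/vo-data/"],
               check=True)

# A worker node pulling one input file over plain HTTP with curl.
subprocess.run(["curl", "-fsSL", "-o", "input.dat",
                "http://portal.example.org/jobs/1234/input.dat"],
               check=True)
```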
[Data staging and replication flow (red = push files, green = pull files):
1. user file upload
2. replicate gold standard
3. auto-replicate
4. pull files from UCSD to WNs
5. pull files from local NFS to WNs
6. pull files from SBGrid to WNs
7. job results copied back to SBGrid
8a. large job results copied to UCSD
8b. later pulled to SBGrid]
MHC-TCR: 2VLJ
[Plot: Log Likelihood Gain vs. Translation Z score, contrasting a "weak" solution (2nx5q2) with a "strong" solution (1im3a2).]
•   NEBioGrid Django Portal: interactive dynamic web portal for workflow definition, submission, monitoring, and access control
•   NEBioGrid Web Portal: GridSite based web portal for file-system level access (raw job output), meta-data tagging, X.509 access control/sharing, CGI
•   PyCCP4: Python wrappers around CCP4 structural biology applications
•   PyCondor: Python wrappers around common Condor operations, enhanced Condor log analysis
•   PyOSG: Python wrappers around common OSG operations
•   PyGACL: Python representation of the GACL model and API to work with GACL files
•   osg_wrap: Swiss army knife OSG wrapper script to handle file staging, parameter sweep, DAG, results aggregation, monitoring
•   sbanalysis: data analysis and graphing tools for structural biology data sets
•   osg.monitoring: tools to enhance monitoring of job set and remote OSG site status
•   shex: write bash scripts in Python: replicate commands, syntax, behavior
•   xconfig: universal configuration
Example Job Set
[Chart: 10k grid jobs, approx 30k CPU hours, 99.7% success rate, 24 wall clock hours (evicted = red, completed = green, held = orange); jobs flow from the local queue to remote queues to running slots over 24 hours, spread across OSG sites including MIT, UWisc, Cornell, Buffalo, ND, FNAL, Caltech, UNL, HMS, Purdue, UCR, RENCI, and SPRACE.]
Job Lifelines
REST
• Don't try to read too much into the name
  • REpresentational State Transfer: coined by Roy Fielding, co-author of the HTTP protocol and contributor to the original Apache httpd server
• Idea
  • The web is the world's largest asynchronous, distributed, parallel computational system
  • Resources are "hidden" but representations are accessible via URLs
  • Representations can be manipulated via HTTP operations GET PUT POST HEAD DELETE and associated state
  • State transitions are initiated by software or by humans
• Implication
  • Clean URLs (e.g. Flickr)
(a client sketch follows below)
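A minimal sketch of such a RESTful interaction from Python with the requests library; the base URL and JSON fields are illustrative, not an actual portal API:

```python
import requests

BASE = "https://portal.example.org/api"   # placeholder service

# GET: fetch the representation of a job resource.
job = requests.get(f"{BASE}/jobs/1234", timeout=10)
job.raise_for_status()
print(job.json())

# POST: create a new resource by sending a representation.
new = requests.post(f"{BASE}/jobs",
                    json={"app": "my_app", "params": [0.1, 0.2]},
                    timeout=10)
print(new.status_code, new.headers.get("Location"))

# DELETE: remove the resource once it is no longer needed.
requests.delete(f"{BASE}/jobs/1234", timeout=10).raise_for_status()
```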
Cloud Computing:
Industry solution to the Grid
• Virtualization has taken off in the past 5 years
  • VMWare, Xen, VirtualPC, VirtualBox, QEMU, etc.
  • Builds on ideas from VMS (i.e. old)
• (Good) system administrators are hard to come by
  • And operating a large data center is costly
• The Internet boom means there are companies that have figured out how to do this really well
  • Google, Amazon, Yahoo, Microsoft, etc.
• Outsource IT infrastructure! Outsource software hosting!
  • Amazon EC2, Microsoft Azure, RightScale, Force.com, Google Apps
• Over simplified:
  • You can't install a cloud
  • You can't buy a grid
Is "Cloud" the new "Grid"?
• Grid is about mechanisms for federated, distributed, heterogeneous shared compute and storage resources
  • standards and software
• Cloud is about on-demand provisioning of compute and storage resources
  • services

No one buys a grid. No one installs a cloud.
"The interesting thing about Cloud Computing is that we've redefined Cloud Computing to include everything that we already do. . . . I don't understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads."

Larry Ellison, Oracle CEO, quoted in the Wall Street Journal, September 26, 2008*

*http://blogs.wsj.com/biztech/2008/09/25/larry-ellisons-brilliant-anti-cloud-computing-rant/
When is cloud computing interesting?
   • My definition of “cloud computing”
      • Dynamic compute and storage infrastructure provisioning in a scalable manner, providing uniform interfaces to virtualized resources (a minimal provisioning sketch follows below)
   • The underlying resources could be
      • “in-house”, using licensed/purchased software/hardware
      • “external”, hosted by a service/infrastructure provider
   • Consider using cloud computing if
      • You have operational problems/constraints in your current data center
      • You need to dynamically scale (up or down) access to services and data
      • You want fast provisioning, lots of bandwidth, and low latency
      • Organizationally you can live with outsourcing responsibility for (some of) your data and applications
   • Consider providing cloud computing services if
      • You have an ace team efficiently running your existing data center
      • You have lots of experience with virtualization
      • You have a specific application/domain that could benefit from being tied to a large compute farm or disk array with great Internet connectivity
Grid Overview - Ian Stokes-Rees                                                   ijstokes@hkl.hms.harvard.edu
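As an illustration of what “dynamic provisioning with uniform interfaces” looks like in practice, the sketch below starts and later terminates a single virtual machine on Amazon EC2 using the boto3 client library. The AMI ID, key pair, instance type, and region are placeholders, and boto3 is only one of several libraries that wrap the same provisioning API.

```python
# Illustrative on-demand provisioning sketch with boto3 (an AWS EC2 client library).
# The image ID, key pair, and instance type are placeholders -- substitute your own.
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

# Provision: request one virtual machine from the shared pool
instances = ec2.create_instances(
    ImageId="ami-00000000",        # placeholder machine image
    InstanceType="t3.micro",       # placeholder instance size
    KeyName="my-keypair",          # placeholder SSH key pair
    MinCount=1,
    MaxCount=1,
)
vm = instances[0]
vm.wait_until_running()            # block until the VM has booted
vm.reload()
print("running at", vm.public_dns_name)

# ... run jobs, serve data, scale out further if needed ...

# De-provision: scale back down when the burst of work is done
vm.terminate()
vm.wait_until_terminated()
```

The grid counterpart would be submitting a job description to a shared batch gateway rather than acquiring the machine itself, which is exactly the federation-versus-provisioning distinction drawn above.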
