Advancing Science through Coordinated Cyberinfrastructure

How local, regional, and national cyberinfrastructure can be coordinated and linked to advance science and engineering, based on experiences and lessons from the Center for Computation & Technology at LSU (ideas, funding, implementation), plus some thoughts on what might be done differently if we were starting today. Presented at First Workshop - Center for Computational Engineering & Sciences, Unicamp, Campinas, Brazil 10 APR 2014

Published in: Science

Advancing Science through Coordinated Cyberinfrastructure

  1. Advancing Science through Coordinated Cyberinfrastructure
     Daniel S. Katz, d.katz@ieee.org
     Senior Fellow, Computation Institute, University of Chicago & Argonne National Laboratory
     Affiliate Faculty, Center for Computation & Technology, Louisiana State University
     Adjunct Associate Professor, Electrical and Computer Engineering, LSU
     www.ci.anl.gov, www.ci.uchicago.edu
  2. Topics
     • What we did in Louisiana from 2006-2010
     • What I would do differently now
     • A short video to highlight some additional issues that I hope the Center for Computational Engineering & Sciences will keep in mind
  3. Louisiana
     • Area: 134 382 km² (33/51)
     • Population: 4 533 000 (2010, 25/51)
     • GDP: $208 billion (2009, 24/51)
     • GDP/person: $45 700 (2009, 21/51)
     • In Poverty: 17% (2009, 44/51)
     • High School Degree: 82% (2009, 46/51)
     • BS Degree: 21% (2009, 47/51)
     • Advanced Degree: 7% (2009, 48/51)
     State Goals: talented workforce, great competitiveness, strong educational system, increased economic development
  4. PITAC Report Summary:
     • “Computational science -- the use of advanced computing capabilities to understand and solve complex problems -- is critical to scientific leadership, economic competitiveness, and national security. It is one of the most important technical fields of the 21st century because it is essential to advances throughout society.”
     • “Universities must significantly change organizational structures: multidisciplinary & collaborative research are needed [for US] to remain competitive in global science”
     Complex problems: Innovations will occur at boundaries
  5. Big Science and Infrastructure
     • Higgs* boson discovery announced at CERN July 4, 2012
     • Instrument: Large Hadron Collider (LHC)
     • Infrastructure
       – Computing Hardware: Worldwide LHC Computing Grid (WLCG): 235,000 cores across 36 countries, including Open Science Grid (OSG, US), European Grid Infrastructure (EGI, Europe), ...
       – Data: ~20 PB of data created in 2011-2012
       – Software: grid middleware, physics analysis applications, ...
       – Networks
       – Education & Training
     • Data generated centrally, moved (~3 PB/week) across multi-tiered infrastructure to be computed upon
  6. Big Science and Infrastructure
     • Hurricanes affect humans
     • Multi-physics: atmosphere, ocean, coast, vegetation, soil
       – Sensors and data as inputs
     • Humans: what have they built, where are they, what will they do
       – Data and models as inputs
     • Infrastructure:
       – Urgent/scheduled processing, workflow systems
       – Software applications, workflows
       – Networks
       – Decision-support systems, visualization
       – Data storage, interoperability
  7. Long-tail Science and Infrastructure
     • Exploding data volumes & powerful simulation methods mean that more researchers need advanced infrastructure
     • Such “long-tail” researchers cannot afford expensive expertise and unique infrastructure
     • Challenge: Outsource and/or automate time-consuming common processes
       – Tools, e.g., Globus Online and data management
         o Note: much LHC data is moved by Globus GridFTP, e.g., May/June 2012, >20 PB, >20M files
       – Gateways, e.g., nanoHUB, CIPRES, access to scientific simulation software
     Figure: NSF grant size, 2007. (“Dark data in the long tail of science”, B. Heidorn)
  8. Long-tail Science and Infrastructure
     • CIPRES Science Gateway for Phylogenetics
       – Study of diversification of life and relationships among living things through time
     • Highly used, as of mid 2013:
       – Cited in at least 400 publications, e.g., Nature, PNAS, Cell
       – More than 5000 unique users in 3 years
       – Used routinely in at least 68 undergraduate classes
       – 45% US (including most states), 55% from 70 other countries
     • Infrastructure
       – Flexible web application
         o A science gateway, uses software and lessons from XSEDE gateways team, e.g., identity management, HPC job control
       – Science software: tree inference and sequence alignment
         o Parallel versions of MrBayes, RAxML, GARLI, BEAST, MAFFT
         o PAUP*, Poy, ClustalW, Contralign, FSA, MUSCLE, ...
       – Data
         o Personal user space for storing results
         o Tools to transfer and view data
     Credit: Mark Miller, SDSC
  9. Infrastructure Challenges
     • Science
       – Larger teams, more disciplines, more countries
     • Data
       – Size, complexity, rates all increasing rapidly
       – Need for interoperability (systems and policies)
     • Systems
       – More cores, more architectures (GPUs), more memory hierarchy
       – Changing balances (latency vs bandwidth)
       – Changing limits (power, funds)
       – System architecture and business models changing (clouds)
       – Network capacity growing; increase networks -> increased security
     • Software
       – Multiphysics algorithms, frameworks
       – Programming models and abstractions for science, data, and hardware
       – V&V, reproducibility, fault tolerance
     • People
       – Education and training
       – Career paths
       – Credit and attribution
  10. Cyberinfrastructure
     “Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.”
     -- Craig Stewart
  11. Computational & Data-enabled Science & Engineering (CDS&E)
     • LIGO: Laser Interferometer Gravitational-Wave Observatory
     • Ties together theory, computation, and experiment
       – Each drives the other two!
  12. How We Started
     • State commitment: $25M/year for Vision 20/20
       – $9M: LSU -> CCT (similarly, ULL -> LITE)
     • University commitment to build new programs for 21st century
     • State and University willingness to make extraordinary investments
     • Opportunity to build new world class program in interdisciplinary research and education, involving all of LSU
     • Ed Seidel-led vision to instigate state-wide collaboration
  13. Advancing Research
     • Potentially requires advances in three areas, depending on existing strengths
  14. CCT Director Office Edward Seidel HPC Partnership McMahon Cyberinfrastructure Development
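
As a rough aid for reading the data-movement figure on slide 5, the sketch below converts a weekly volume into the average network rate needed to sustain it. It is a back-of-the-envelope calculation, not something from the talk: the function name is made up for illustration, and it assumes decimal petabytes (1 PB = 10^15 bytes) and perfectly even traffic over the week, neither of which the slide states.

```python
# Assumption: 1 PB = 10**15 bytes (decimal); the slide does not say which convention it uses.
PB_BYTES = 10**15
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def sustained_gbit_per_s(petabytes_per_week: float) -> float:
    """Average rate in Gbit/s needed to move the given volume in one week.

    Hypothetical helper for illustration; assumes the traffic is spread
    evenly over the whole week, which real transfers are not.
    """
    bits = petabytes_per_week * PB_BYTES * 8
    return bits / SECONDS_PER_WEEK / 1e9


if __name__ == "__main__":
    # ~3 PB/week is the figure quoted on slide 5.
    print(f"{sustained_gbit_per_s(3):.1f} Gbit/s")
```

Under these assumptions, ~3 PB/week works out to an average of roughly 40 Gbit/s. Peak rates on individual links would be higher, since real traffic is bursty.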
