CFA CODE COFFEE | HARVARD-SMITHSONIAN CENTER FOR ASTROPHYSICS | JUNE 4, 2014.
LSST/DM: Building a Next Generation Survey Data Processing System

A presentation about LSST Data Management delivered at Harvard-Smithsonian CfA Code Coffee.

  1. LSST/DM: Building a Next Generation Survey Data Processing System
     Mario Juric, LSST Data Management Project Scientist
     CfA Code Coffee, June 4, 2014
     Robyn Allsman, Yusra AlSayyad, Tim Axelrod, Jacek Becla, Andrew Becker, Steve Bickerton, Jim Bosch, Bill Chickering, Andy Connolly, Greg Daues, Gregory Dubois-Felsmann, Mike Freemon, Andy Hanushevsky, Fabrice Jammes, Lynne Jones, Jeff Kantor, Kian-Tat Lim, Dustin Lang, Ron Lambert, Robert Lupton (the Good), Simon Krughoff, Serge Monkewitz, Jon Myers, Russell Owen, Steve Pietrowicz, Ray Plante, Paul Price, Andrei Salnikov, Dick Shaw, Schuyler Van Dyk, Daniel Wang, and the LSST Project Team
  2. A Dedicated Survey Telescope
     - A wide (half the sky), deep (24.5/27.5 mag), fast (image the sky once every 3 days) survey telescope. Beginning in 2022, it will repeatedly image the sky for 10 years.
     - The LSST is an integrated survey system. The Observatory, Telescope, Camera and Data Management system are all built to support the LSST survey. There’s no PI mode, proposals, or time.
     - The ultimate deliverable of LSST is not the telescope, nor the instruments; it is the fully reduced data.
       • All science will come from survey catalogs and images
     Telescope → Images → Catalogs
  3. Open Data, Open Source: A Community Resource
     - LSST data, including images and catalogs, will be available with no proprietary period to the astronomical community of the United States, Chile, and International Partners
     - Alerts to variable sources (“transient alerts”) will be available world-wide within 60 seconds, using standard protocols
     - LSST data processing stack will be free software (licensed under the GPL, v3-or-later)
     - All science will be done by the community (not the Project!), using LSST’s data products
  4. Why LSST: The Science
  5. LSST: Evolution of Design and Purpose
     History
     - 1996-2000: “Dark Matter Telescope”. This project began as a quest to understand cosmology and the Solar System.
     - 2000 - …: “LSST”. Emphasizes a broad range of science from the same multi-wavelength survey data, including unique time domain exploration.
     A single telescope, a single data set, can serve to answer a wide swath of science questions.
     Figure: The evolution of LSST design
  6. LSST: A Deep, Wide, Fast, Optical Sky Survey
     - 8.4 m telescope
     - 18000+ deg²
     - 10 mas astrometry
     - r < 24.5 (< 27.5 @ 10 yr)
     - ugrizy
     - 0.5-1% photometry
     - 3.2 Gpix camera
     - 30 sec exposure / 4 sec readout
     - 15 TB/night
     - 37 B objects
     Imaging the visible sky, once every 3 days, for 10 years (825 revisits)
     http://lsst.org
  7. Frontiers of Survey Astronomy
     - Time domain science
       • Nova, supernova, GRBs
       • Source characterization
       • Instantaneous discovery
     - Census of the Solar System
       • NEOs, MBAs, Comets
       • KBOs, Oort Cloud
     - Mapping the Milky Way
       • Tidal streams
       • Galactic structure
     - Dark energy and dark matter
       • Strong lensing
       • Weak lensing
       • Constraining the nature of dark energy
  8. Funding Status
     - December 6th, 2013: Passed the NSF Final Design Review; declared ready for Construction!
     - January 17th, 2014: FY2014 budget signed, with NSF appropriation allowing for LSST start.
     - May 8th, 2014: NSB authorizes NSF Director to start the project.
     - Expecting the signing of cooperative agreement and start of construction in July 2014!
  9. Location: Cerro Pachón, Chile
     Leveling of El Peñón (the summit of Cerro Pachón)
  10. LSST Observatory (ca. late 2018)
  11. [Image-only slide]
  12. Combined Primary/Tertiary Mirror
      - Primary-Tertiary was cast in the spring of 2008.
      - Fabrication underway at the Steward Observatory Mirror Lab; completion by the end of 2014.
      Thin Meniscus Secondary
      - Secondary substrate fabricated by Corning in 2009.
      - Currently in storage waiting for construction.
  13. LSST Camera
      Parameter    Value
      Diameter     1.65 m
      Length       3.7 m
      Weight       3000 kg
      F.P. Diam    634 mm
      Figure annotation: 1.65 m (5’-5”)
      - 3.2 Gigapixels
      - 0.2 arcsec pixels
      - 9.6 square degree FOV
      - 2 second readout
      - 6 filters
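As a quick sanity check of the camera figures above, the sketch below multiplies the pixel count by the sky area of one 0.2 arcsec pixel. It assumes square pixels and a focal plane with no gaps between sensors; both are simplifications introduced here, not statements from the slide, so the result should only roughly match the quoted 9.6 square degrees.

```python
# Rough consistency check of the quoted camera figures.
# Assumptions (not from the slide): square pixels, no gaps between sensors.

N_PIXELS = 3.2e9            # "3.2 Gigapixels" (a rounded figure)
PIXEL_SCALE_ARCSEC = 0.2    # "0.2 arcsec pixels"
ARCSEC_PER_DEG = 3600.0


def sky_area_deg2(n_pixels: float, pixel_scale_arcsec: float) -> float:
    """Sky area covered by n_pixels square pixels, in square degrees."""
    pixel_side_deg = pixel_scale_arcsec / ARCSEC_PER_DEG
    return n_pixels * pixel_side_deg ** 2


area = sky_area_deg2(N_PIXELS, PIXEL_SCALE_ARCSEC)
print(f"Pixel-count estimate of the field of view: {area:.1f} sq. deg.")
```

The estimate comes out a few percent above the quoted 9.6 square degrees, which is about as close as the rounded inputs allow.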
  14. Bandpasses: u,g,r,i,z,y
  15. Community: LSST Science Collaborations
      Next meeting: August 11-15th 2014, Phoenix, AZ (http://ls.st/hf9)
      2012 All Hands Meeting Group Photo, Aug 13-17 2012, Marana, AZ
  16. LSST From the Astronomer’s Perspective
      - A stream of ~10 million time-domain events per night, detected and transmitted to event distribution networks within 60 seconds of observation.
      - A catalog of orbits for ~6 million bodies in the Solar System.
      - A catalog of ~37 billion objects (20B galaxies, 17B stars), ~7 trillion observations (“sources”), and ~30 trillion measurements (“forced sources”), produced annually, accessible through online databases.
      - Deep co-added images.
      - Services and computing resources at the Data Access Centers to enable user-specified custom processing and analysis.
      - Software and APIs enabling development of analysis codes.
      Side labels: Level 1, Level 2, Level 3
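The catalog sizes above can be turned into per-object averages with simple division. The sketch below does only that; the interpretation in the comments (forced photometry touching each object on roughly every visit) is an assumption based on how forced photometry is described later in the talk, not a statement from this slide.

```python
# Per-object averages implied by the catalog sizes quoted on the slide.
N_OBJECTS = 37e9           # ~37 billion objects
N_SOURCES = 7e12           # ~7 trillion observations ("sources")
N_FORCED_SOURCES = 30e12   # ~30 trillion measurements ("forced sources")

# Detections per object: only epochs where the object was bright enough
# to be detected in a single frame (assumed interpretation).
sources_per_object = N_SOURCES / N_OBJECTS

# Forced measurements per object: roughly one per visit (assumed interpretation).
forced_per_object = N_FORCED_SOURCES / N_OBJECTS

print(f"sources per object:        ~{sources_per_object:.0f}")
print(f"forced sources per object: ~{forced_per_object:.0f}")
```

The forced-source average is of the same order as the 825 revisits quoted on the survey overview slide, which is what one would expect if every object is measured on nearly every visit.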
  17. LSST Data Management System
      (from readout to delivery to the user)
  18. LSST Data Management: Roles
      - Archive Raw Data: Receive the incoming stream of images that the Camera system generates to archive the raw images.
      - Process to Data Products: Detect and alert on transient events within one minute of visit acquisition. Approximately once per year create and archive a Data Release, a static self-consistent collection of data products generated from all survey data taken from the date of survey initiation to the cutoff date for the Data Release.
      - Publish: Make all LSST data available through an interface that uses community-accepted standards, and facilitate user data analysis and production of user-defined data products at Data Access Centers (DACs) and external sites.
  19. Sites and networks
      - HQ Site
        • Science Operations
        • Observatory Management
        • Education and Public Outreach
      - Archive Site
        • Archive Center
        • Alert Production
        • Data Release Production
        • Calibration Products Production
        • EPO Infrastructure
        • Long-term Storage (copy 2)
        • Data Access Center
        • Data Access and User Services
      - Summit and Base Sites
        • Telescope and Camera
        • Data Acquisition
        • Crosstalk Correction
        • Long-term storage (copy 1)
        • Chilean Data Access Center
      - Dedicated Long Haul Networks
        • Two redundant 40 Gbit links from La Serena to Champaign, IL (existing fiber)
  20. Infrastructure: Petascale Computing, Gbit Networks
      - Long Haul Networks to transport data from Chile to the U.S.
        • 200 Gbps from Summit to La Serena (new fiber)
        • 2x40 Gbit (minimum) for La Serena to Champaign, IL (protected, existing fiber)
      - Archive Site and U.S. Data Access Center: NCSA, Champaign, IL
      - Base Site and Chilean Data Access Center: La Serena, Chile
      - The computing cluster at the LSST Archive (at NCSA) will run the processing pipelines.
        • Single-user, single-application, dedicated data center
        • Process images in real-time to detect changes in the sky
        • Produce annual data releases
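To get a feel for the link capacities above, the sketch below estimates how long one night of data would take to move over each hop. It is an idealized calculation: the 15 TB/night figure comes from the survey overview slide, while the decimal-terabyte convention and the neglect of protocol overhead, compression, and competing traffic are assumptions made here.

```python
# Idealized bulk-transfer time for one night of data.
# Assumptions (not from the slides): 1 TB = 1e12 bytes, no protocol overhead,
# no compression, and the full link rate available to a single transfer.

BYTES_PER_TB = 1e12
NIGHTLY_VOLUME_BYTES = 15 * BYTES_PER_TB   # "15TB/night"


def transfer_time_s(n_bytes: float, link_gbps: float) -> float:
    """Seconds needed to move n_bytes over a link of link_gbps gigabits/s."""
    return n_bytes * 8 / (link_gbps * 1e9)


links = [
    ("Summit -> La Serena (200 Gbps)", 200),
    ("La Serena -> Champaign, one 40 Gbit link", 40),
]
for label, gbps in links:
    minutes = transfer_time_s(NIGHTLY_VOLUME_BYTES, gbps) / 60
    print(f"{label}: about {minutes:.0f} minutes")
```

Under these assumptions a full night of data fits through a single 40 Gbit link in under an hour.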
  21. “Applications”: Scientific Core of LSST DM
      - Applications carry core scientific algorithms that process or analyze raw LSST data to generate output Data Products
      - Variety of processing
        • Image processing
        • Measurement of source properties
        • Associating sources across space and time, e.g. for tracking solar system objects
      - Applications framework layer (afw; not shown) allows them to be written in a high-level language
  22. Middleware Layer: Isolating Hardware, Orchestrating Software
      - Enabling execution of science pipelines on hundreds of thousands of cores.
        • Frameworks to construct pipelines out of basic algorithmic components
        • Orchestration of execution on thousands of cores
        • Control and monitoring of the whole DM System
      - Isolating the science pipelines from details of underlying hardware
        • Services used by applications to access/produce data and communicate
        • "Common denominator" interfaces handle changing underlying technologies
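The idea of constructing pipelines out of basic algorithmic components can be sketched in a few lines. The example below is a toy illustration only: `Pipeline`, `subtract_bias`, and `detect` are hypothetical names invented here and are not part of the LSST middleware.

```python
from typing import Callable, Dict, List

# A stage takes a data dictionary and returns a new one (hypothetical convention).
Stage = Callable[[Dict], Dict]


class Pipeline:
    """Runs a fixed sequence of stages, feeding each output to the next stage."""

    def __init__(self, stages: List[Stage]):
        self.stages = list(stages)

    def run(self, data: Dict) -> Dict:
        for stage in self.stages:
            data = stage(data)
        return data


def subtract_bias(data: Dict) -> Dict:
    """Remove a constant bias level from every pixel."""
    return {**data, "pixels": [p - data["bias"] for p in data["pixels"]]}


def detect(data: Dict) -> Dict:
    """Record the indices of pixels above the detection threshold."""
    hits = [i for i, p in enumerate(data["pixels"]) if p > data["threshold"]]
    return {**data, "detections": hits}


pipeline = Pipeline([subtract_bias, detect])
exposures = [
    {"pixels": [10.0, 11.0, 25.0, 10.5], "bias": 10.0, "threshold": 5.0},
    {"pixels": [10.0, 30.0, 10.0, 40.0], "bias": 10.0, "threshold": 5.0},
]

# A plain loop stands in for the orchestration layer; because each exposure is
# processed independently, the same call could be handed to a worker pool.
results = [pipeline.run(exposure) for exposure in exposures]
for result in results:
    print(result["detections"])
```

Keeping the stages free of any knowledge about where or how they run is what lets an orchestration layer change underneath them.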
  23. Database and Science UI: Delivering to Users
      - Massively parallel, distributed, fault-tolerant relational database.
        • To be built on existing, robust, well-understood, technologies (MySQL and xrootd)
        • Commodity hardware, open source
        • Advanced prototype in existence (qserv)
      - Science User Interface to enable the access to and analysis of LSST data
        • Web and machine interfaces to LSST databases
        • Visualization and analysis capabilities
  24. Going Where the Talent is: One Distributed Team
      - Infrastructure
      - Middleware
      - Core Algorithms (“Apps”)
      - Database
      - UI
      - Mgmt, I&T, and Science QA
  25. The LSST Software Stack
      (science pipelines, middleware, database, user interfaces)
      “Enabling LSST science by creating a well documented, state-of-the-art, high-performance, scalable, multi-camera, open source, O/IR survey data processing and analysis system.”
  26. LSST Science Pipelines
      - 02C.01.02.01/02. Data Quality Assessment Pipelines (slides by Juric)
      - 02C.01.[02.01.04,04.01,04.02] Calibration Pipelines (slides by Axelrod, Yoachim)
      - 02C.03.01. Single-Frame Processing Pipeline (slides by Krughoff, Lupton)
      - 02C.03.02. Association pipeline (slides by Lupton)
      - 02C.03.03. Alert Generation Pipeline (slides by Becker)
      - 02C.03.04. Image Differencing Pipeline (slides by Becker)
      - 02C.03.06. Moving Object Pipeline (slides by Jones)
      - 02C.04.03. PSF Estimation Pipeline (slides by Lupton)
      - 02C.04.04. Image Coaddition Pipeline (slides by AlSayyad)
      - 02C.04.05. Deep Detection Pipeline (slides by Lupton)
      - 02C.04.06. Object Characterization Pipeline (slides by Lupton, Bosch)
      - 02C.01.02.03. Science Pipeline Toolkit (slides by Dubois-Felsmann)
      - 02C.03.05/04.07 Application Framework (slides by Lupton)
      Calibration reviewed in July ’13, by Wood-Vasey et al.
      Pipelines reviewed in Sep. ’13, by Magnier et al.
      Side labels: Level 1, Level 2, L3
      Data Management Applications Design (LDM-151)
  27. Implementation Strategy: Transfer Know-how, not Code
      - Difficulty adapting existing public codes to LSST requirements (AstroMatic suite, PHOTO, Elixir, IRAF-based pipelines, etc.)
        • Need to run efficiently at scale
        • Need to be flexible (plugging/unplugging of algorithms at runtime)
        • Need to have it developed by a large team (20+ scientists and programmers)
        • Need to be maintainable over ~25 years of R&D, Construction, and Survey Operations
        • Need to run on a variety of hardware and software platforms
        • Need to have logging and provenance built into the design
      - Early on (~2006), a decision was made to (largely) transfer the scientific know-how, but not code.
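One common way to meet the "plugging/unplugging of algorithms at runtime" requirement is a registry that maps configuration names to implementations. The sketch below shows that general pattern under stated assumptions: the registry, the two toy algorithms, and the `config` dictionary are all hypothetical and do not reproduce the LSST stack's actual mechanism.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical registry mapping algorithm names to measurement functions.
_REGISTRY: Dict[str, Callable[[Sequence[float]], float]] = {}


def register(name: str):
    """Decorator that makes a measurement function selectable by name."""
    def decorator(func):
        _REGISTRY[name] = func
        return func
    return decorator


@register("sum_flux")
def sum_flux(pixels: Sequence[float]) -> float:
    """Total flux in the stamp."""
    return float(sum(pixels))


@register("peak_flux")
def peak_flux(pixels: Sequence[float]) -> float:
    """Brightest pixel in the stamp."""
    return float(max(pixels))


def measure(pixels: Sequence[float], enabled: List[str]) -> Dict[str, float]:
    """Run only the algorithms named in the runtime configuration."""
    return {name: _REGISTRY[name](pixels) for name in enabled}


# The set of active algorithms is data, so it can change without code edits.
config = {"enabled_algorithms": ["sum_flux", "peak_flux"]}
stamp = [1.0, 4.0, 2.0]
print(measure(stamp, config["enabled_algorithms"]))
```

Removing a name from the configuration list unplugs that algorithm; adding a new decorated function plugs one in, with no change to `measure`.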
  28. Maintainable Design / Language Choices
      - LSST software stack is largely written from scratch, in Python, unless computational demands require the use of C++
        • C++:
          - Computationally intensive code
          - Made available to Python via SWIG
        • Python:
          - All high-level code
          - Prefer Python to C++ unless performance demands otherwise
      - Modularity
        • Virtually everything is a Python module.
        • ~60 packages (git repositories, ~corresponding to python packages)
      - Build system: scons; Version control: git; Package management: EUPS
  29. Modular Architecture
      - Application Framework (computationally intensive C++, SWIG-wrapped into Python)
      - Middleware (I/O, configuration, …)
      - External C/C++ Libraries (Boost, FFTW, Eigen, CUDA ..)
      - External Python Modules (numpy, pyfits, matplotlib, …)
      - Camera Abstraction Layer (obs_* packages)
      - Measurement Algorithms (meas_*)
      - Tasks (ISR, Detection, Co-adding, …)
      - Command-line driver scripts
      - Cluster execution middleware
      - …
      Legend: Red: mostly C++ (but Python wrapped); Blue: mostly Python; Black: external libraries
  31. Module Dependency Tree
      Packages shown in the graph: eigen, xpa, fftw, minuit2, afwdata, cuda_toolkit, pysqlite, mysqlclient, libpng, freetype, astrometry_net, suprime_ddata, zlib, tcltk, cfitsio, doxygen, gsl, python, swig, boost, mysqlpython, numpy, scons, wcslib, matplotlib, pyfits, sconsUtils, base, ndarray, pex_exceptions, utils, daf_base, geom, pex_logging, pex_policy, daf_persistence, pex_config, afw, obs_test, coadd_utils, pipe_base, skymap, skypix, testing_displayQA, coadd_chisquared, daf_butlerUtils, meas_algorithms, ip_diffim, ip_isr, meas_astrom, meas_extensions_photometryKron, meas_extensions_rotAngle, meas_extensions_shapeHSM, obs_lsstSim, obs_subaru, pipe_tasks
      Group labels on the graph: External Tools and Libraries; AFW; Camera abstractions; Measurement Algorithms; Top-level scripts
  32. (Very Basic) SExtractor with lsst primitives (1/2)
  33. (Very Basic) SExtractor with lsst primitives (2/2)
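The code shown on these two slides is not part of this transcript. Purely as an illustration of what a very basic SExtractor-like step has to do (estimate the background, threshold, group connected pixels, measure each group), here is a toy sketch in plain Python. It deliberately does not use the LSST stack, and every name in it is hypothetical.

```python
import statistics
from typing import Dict, List

Image = List[List[float]]


def detect_sources(image: Image, nsigma: float = 5.0) -> List[Dict[str, float]]:
    """Threshold the image and measure each 4-connected group of bright pixels."""
    flat = [v for row in image for v in row]
    background = statistics.median(flat)
    # Robust noise estimate: 1.4826 * median absolute deviation.
    sigma = 1.4826 * statistics.median([abs(v - background) for v in flat])
    threshold = background + nsigma * sigma

    n_rows, n_cols = len(image), len(image[0])
    visited = [[False] * n_cols for _ in range(n_rows)]
    sources = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if visited[r0][c0] or image[r0][c0] <= threshold:
                continue
            # Flood fill to collect one connected footprint.
            stack, footprint = [(r0, c0)], []
            visited[r0][c0] = True
            while stack:
                r, c = stack.pop()
                footprint.append((r, c))
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not visited[rr][cc] and image[rr][cc] > threshold):
                        visited[rr][cc] = True
                        stack.append((rr, cc))
            # Background-subtracted flux and flux-weighted centroid.
            flux = sum(image[r][c] - background for r, c in footprint)
            x = sum(c * (image[r][c] - background) for r, c in footprint) / flux
            y = sum(r * (image[r][c] - background) for r, c in footprint) / flux
            sources.append({"npix": len(footprint), "flux": flux, "x": x, "y": y})
    return sources


# Tiny synthetic frame: checkerboard "noise" of 10/11 counts plus two sources.
image = [[10.0 if (r + c) % 2 == 0 else 11.0 for c in range(6)] for r in range(6)]
image[2][2], image[2][3] = 60.0, 50.0   # a two-pixel source
image[5][0] = 40.0                      # a single-pixel source

for src in detect_sources(image):
    print(src)
```

A real pipeline would add a spatially varying background model, PSF-matched filtering, and deblending, all of which this sketch omits.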
  34. [Image-only slide]
  35. Current Status: Advanced Prototypes
      - 8-year prototyping effort
        • 8 software releases (Data Challenges)
        • Status: A rapidly maturing state-of-the-art astronomical data reduction system
          - ~SDSS/SExtractor level quality of reductions
          - Most recently tested by building co-adds using SDSS Stripe 82 data
          - Used in commissioning of the Hyper Suprime-Cam Survey on Subaru
      - Prototyped Features:
        • Instrumental signature removal
        • Single-frame processing
        • Point source photometry
        • Extended source photometry (model fitting)
        • Deblender
        • Co-addition of images
        • Image differencing
        • Object characterization on multi-epoch data (StackFit/MultiFit)
        • …
      Callout: Planning to begin addressing it over the next few months.
  36. New Algorithms: Background-matched co-add of SDSS Stripe 82 in the vicinity of M2.
      Background matching preserves diffuse structures.
      Generated with LSST pipeline prototypes.
      Figure: 5 sq. deg. background-matched coadd composite (g,r,i), ~55 epochs. Region: Aqr, Galactic lat = -35.0
      http://moe.astro.washington.edu/sdss/
      Slide: Yusra AlSayyad
  37. Streams in LSST-reprocessed SDSS Stripe 82
      Stripe 82 background-matched coadds built with LSST Data Management stack (http://moe.astro.washington.edu)
      http://moe.astro.washington.edu/sdss/
  38. Example: Forced Photometry on SDSS Stripe 82
      Forced Photometry
      For every detection in the deep co-add, perform PSF photometry on individual frames (ugriz). Note that the majority of these will be below the single-frame SNR detection threshold.
      Averaging those fluxes allows one to go deeper.
      Left: comparison of Ivezic et al. (2004) w and y color loci; single frame vs. deep catalog.
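Why averaging forced fluxes goes deeper can be shown with the standard inverse-variance weighted mean. The sketch below is a minimal illustration under simplifying assumptions introduced here: independent Gaussian noise of equal size in every epoch, no systematic errors, and a single-epoch signal-to-noise ratio of 2 chosen arbitrarily. The epoch count of 55 is taken from the Stripe 82 co-add figure in this talk.

```python
import math
from typing import Sequence, Tuple


def combine_fluxes(fluxes: Sequence[float], sigmas: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted mean flux and its 1-sigma uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    weight_sum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / weight_sum
    return mean, math.sqrt(1.0 / weight_sum)


def depth_gain_mag(n_epochs: int) -> float:
    """Magnitude gain from averaging n equal-noise epochs: 2.5 * log10(sqrt(n))."""
    return 2.5 * math.log10(math.sqrt(n_epochs))


n_epochs = 55                 # "~55 epochs" in the Stripe 82 co-add figure
true_flux, sigma = 2.0, 1.0   # assumed: single-epoch SNR of 2, too faint to detect

# Noise-free placeholder measurements; only the error propagation matters here.
mean, err = combine_fluxes([true_flux] * n_epochs, [sigma] * n_epochs)

print(f"single-epoch SNR: {true_flux / sigma:.1f}")
print(f"combined SNR:     {mean / err:.1f}")
print(f"depth gain:       {depth_gain_mag(n_epochs):.2f} mag")
```

The combined signal-to-noise ratio grows as the square root of the number of epochs, so a source invisible in any single frame can still be measured with useful precision.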
  39. Winter 2014 Software Release
      Installing
        curl -O http://sw.lsstcorp.org/eupspkg/newinstall.sh
        bash newinstall.sh
      - Supported platforms (platforms we regularly build on; generally builds on any Linux/BSD)
        • RHEL 6
        • OS X 10.8 Mountain Lion
        • OS X 10.9 Mavericks
  41. WARNING! ADVERTENCIA! AVERTISSEMENT!
      THIS IS STILL NOT A FINISHED, POLISHED, READY-TO-USE END-USER PRODUCT! BEFORE DOWNLOADING, PLEASE MAKE SURE TO READ THE DM STACK FAQ:
      http://dev.lsstcorp.org/trac/wiki/DM/Policy/UsingDMCode/FAQ
      KEY POINTS:
      - POOR DOCUMENTATION
      - YOU’RE DOWNLOADING UNSUPPORTED, PROTOTYPE, CODE
      - THIS CODE WILL NOT WORK OUT OF THE BOX FOR CAMERAS OTHER THAN LSST (AND SDSS).
      - EXPECT TO WRITE SOME PYTHON CODE
  42. The Big Picture: Preparing for the Data Driven Astronomy of the Next Decade (and beyond)
  45. “Astro 2020”: Rise of the Machines
      - We’re witnessing a change in how astronomy is done, and the technical knowledge and tools needed to do it.
        • The rise of big projects and end to data scarcity
        • The rise of systematics limited science
        • The rise of open (source), (massively) collaborative, science
      - Consequences
        • Ability to collect data has outstripped the ability to analyze it
          - Extraction of features from the data (“image processing”)
          - Mining of knowledge from the data (“data mining”)
        • We are critically dependent on computing infrastructure and software/algorithm research for astronomical progress
          - Yet we don’t generally acknowledge, encourage, or teach it
      - Challenges
        • Elevating software engineering to a footing equal to mathematics?
          - Learn-by-osmosis not sufficient any more
        • T(construction) >> T(discovery)
          - Research becoming more data driven
          - Broad interests in astrophysics
          - Statistics, CS, software engineering, etc.
        • Software reusability
          - Increasing complexity makes perpetual wheel reinventions infeasible (and, honestly, silly…)
  46. LSST: Helping Build the Common Codebase for the Next Quarter Century
      - LSST software will be general purpose and highly reusable by design.
        • Necessary to deal with real-world hardware
        • Necessary to be able to process precursor data
        • Necessary to enable science (“Level 3”) software to be written on top of it
      - Opportunities for using LSST-derived code on other data sets
        • More work ahead, but becoming a state of the art, well supported, codebase
        • Possibilities: SDSS, CFHT-LS, PanSTARRS, HSC, DES, WFIRST, Euclid, …
        • Good basis for analysis frameworks (LSST DESC)
        • Leveraging a 100M+ NSF investment in large survey data management
      - The benefits feed back to LSST: more users, fewer bugs, better understanding, shorter path to science.
  47. LSST: A Piece of the Puzzle
      - LSST can help position us for the future in two ways
        • With code (see previous slide)
        • With people/culture
      - Software Development Culture
        • We will run the software effort as an open source project with reusability in mind
          - A source tarball at the very end is not useful!
          - Open bug trackers, mailing lists, repositories
          - Still have a job to do! But that doesn’t mean we must do it in a closed, insulated, manner!
        • Think Fedora Project/RedHat, Android/Google, Debian/Ubuntu/Mint
        • Use what works: numpy, scipy, astropy, etc…
          - Improve upstream rather than fork!
          - Where we run into problems: poor software engineering, performance issues, licenses
        • Startup mentality: excellence wins, agile process, continuous change & learning, collaborative spirit, sense of urgency and excitement.
      - People
        • We will have 40+ people working on LSST Data Management over (1)8+ yrs
          - Creating a career path for software instrumentalists
        • We can help train a whole generation of “data driven astronomers”
          - Imparting the know-how needed to make the best use of the next generation of surveys
  48. @LSST  @mjuric
