TRAP (transient detection pipeline) status update

These are the slides from the talk I gave at the 'Radio Transients with SKA Pathfinders and Precursors' conference at Kruger Park, South Africa, 9-12 July 2013.

  1. TRAP STATUS UPDATE
     • TRAnsients Pipeline
     • Gijs Molenaar, gijs@pythonic.nl, @gijzelaerr
     • Thursday, July 11, 2013
  2. ABOUT TRAP
     • TRAnsients Pipeline
     • Detect and classify transients in multi-frequency radio sky image time series
     • Emit VOEvents
     • 99% Python
  3. STEPS
  4. A LOT HAPPENED
     • Version 1.0 imminent
     • Focused on code quality and performance
     • No big new science features
  5. PERFORMANCE
     • A lot faster
     • Really a lot faster
     • 0.85 images per second per core
     • Scales well [plot; axis label: minutes]
  6. RSM CYCLE 0, RUN 0
     • 3402 images
     • Processing record: 5:21 min
     • 2 machines, 36 cores
     • 5645 unique sources
     • 667 detected transients
     • Previous version: 400 min on 40 cores
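As a rough check on the figures on this slide, the sketch below works out the improvement over the previous version, both in wall-clock time and in core-minutes. It uses only the numbers quoted above; the variable and function names are illustrative.

```python
# Figures quoted on the slide.
IMAGES = 3402
NEW_MINUTES = 5 + 21 / 60      # 5:21 min
NEW_CORES = 36
OLD_MINUTES = 400
OLD_CORES = 40


def speedup(old, new):
    """Ratio of the old cost to the new cost."""
    return old / new


# How much sooner the run finishes.
wall_clock_speedup = speedup(OLD_MINUTES, NEW_MINUTES)

# How much less total compute the run uses (minutes times cores).
core_minute_speedup = speedup(OLD_MINUTES * OLD_CORES, NEW_MINUTES * NEW_CORES)

# End-to-end throughput of the whole run, in images per second.
throughput = IMAGES / (NEW_MINUTES * 60)
throughput_per_core = throughput / NEW_CORES

print(f"wall-clock speedup:  {wall_clock_speedup:.1f}x")
print(f"core-minute speedup: {core_minute_speedup:.1f}x")
print(f"throughput:          {throughput:.1f} images/s")
print(f"per core:            {throughput_per_core:.2f} images/s/core")
```

The run finishes roughly 75 times sooner and uses roughly 83 times fewer core-minutes. The end-to-end rate of about 0.29 images per second per core is lower than the 0.85 quoted on the performance slide; the slides do not say how that figure was measured.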
  7. TRAP & AARTFAAC
     • AARTFAAC
     • 48 images/s
     • 57 (real) cores required
     • 1 or 2 big fat systems will do!
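The core count on this slide follows from the per-core rate on the performance slide (0.85 images per second per core), assuming throughput scales linearly with cores. A minimal sketch, with an illustrative function name:

```python
import math


def cores_required(images_per_second, rate_per_core):
    """Smallest whole number of cores that keeps up with the image rate,
    assuming throughput scales linearly with the number of cores."""
    return math.ceil(images_per_second / rate_per_core)


print(cores_required(48, 0.85))  # 48 / 0.85 is about 56.5, so 57 cores
```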
  8. INSTALLABILITY
     • Merged TKP into TRAP
     • Almost open source
     • Easy database setup
     • Removed many dependencies
       • Like the LOFAR system software (closed source)
  9. QUALITY CONTROL
     • Automated rejection of bad images
       • Known bright source in FOV
       • RMS x times higher than theoretical noise
       • Oversampled / undersampled / highly elliptical
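The rejection criteria above could be combined along the following lines. This is a sketch under stated assumptions, not TRAP's actual code: the `ImageStats` fields, the function name, and all threshold defaults are hypothetical placeholders, and the sampling checks assume the restoring beam size is expressed in pixels.

```python
from dataclasses import dataclass


@dataclass
class ImageStats:
    """Per-image quantities the checks need (hypothetical field names)."""
    rms: float                  # measured image noise
    theoretical_noise: float    # expected noise, same units as rms
    beam_major_pix: float       # restoring beam major axis, in pixels
    beam_minor_pix: float       # restoring beam minor axis, in pixels
    bright_source_in_fov: bool  # a known bright source lies in the field


def rejection_reasons(stats, max_rms_ratio=3.0, min_beam_pix=2.0,
                      max_beam_pix=30.0, max_ellipticity=2.0):
    """Return a list of reasons to reject an image; empty means accept.

    All threshold defaults are placeholders chosen for illustration.
    """
    reasons = []
    if stats.bright_source_in_fov:
        reasons.append("known bright source in field of view")
    if stats.rms > max_rms_ratio * stats.theoretical_noise:
        reasons.append("rms too high compared with theoretical noise")
    if stats.beam_minor_pix < min_beam_pix:
        reasons.append("undersampled beam")
    if stats.beam_minor_pix > max_beam_pix:
        reasons.append("oversampled beam")
    if stats.beam_major_pix / stats.beam_minor_pix > max_ellipticity:
        reasons.append("highly elliptical beam")
    return reasons


good = ImageStats(rms=1.2e-3, theoretical_noise=1.0e-3,
                  beam_major_pix=5.0, beam_minor_pix=4.0,
                  bright_source_in_fov=False)
noisy = ImageStats(rms=9.0e-3, theoretical_noise=1.0e-3,
                   beam_major_pix=9.0, beam_minor_pix=1.5,
                   bright_source_in_fov=False)

print(rejection_reasons(good))
print(rejection_reasons(noisy))
```

Returning the list of reasons rather than a bare boolean makes it possible to record why an image was rejected.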
  10. STORAGE
      • Added support for PostgreSQL
      • Fast with small datasets
      • Many off-the-shelf tools available
  11. UNDER THE HOOD
      • Switched to Celery
        • Asynchronous job queue
        • Based on distributed message passing
      • No more cuisine
  12. WHY CELERY
      • Easier to use / install / debug
      • Faster: hot processes
      • Many off-the-shelf tools
      • CEP1 compatible
      • Easy to add compute nodes
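To illustrate the idea of an asynchronous job queue served by long-lived ("hot") workers, here is a standard-library stand-in. It is not Celery code: Celery distributes jobs over a message broker and separate worker processes, whereas this sketch uses threads and an in-memory queue in a single process. `process_image` is a hypothetical placeholder for the per-image work.

```python
import queue
import threading


def process_image(image_id):
    """Hypothetical placeholder for the real per-image processing."""
    return f"processed {image_id}"


def worker(jobs, results):
    """Long-lived worker: stays up and keeps pulling jobs until told to stop."""
    while True:
        image_id = jobs.get()
        if image_id is None:          # sentinel value: shut this worker down
            break
        results.put((image_id, process_image(image_id)))


def run_jobs(image_ids, n_workers=4):
    jobs, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for image_id in image_ids:        # submitting a job does not block on the work
        jobs.put(image_id)
    for _ in workers:                 # one sentinel per worker
        jobs.put(None)
    for w in workers:
        w.join()
    collected = {}
    while not results.empty():
        image_id, outcome = results.get()
        collected[image_id] = outcome
    return collected


print(run_jobs(range(5)))
```

Because the workers are started once and reused for every job, there is no per-job start-up cost. Adding capacity means starting more workers against the same queue.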
  13. (no text on this slide)
  14. DISCO?
      • Maybe add support for Disco in the future
      • Similar
      • MapReduce
      • Hadoop for Python
      • Distributed file system
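For readers unfamiliar with MapReduce, the pattern can be sketched in plain Python. This does not use the Disco API; the function names and the sample records are made up for illustration. The example counts detections per frequency band.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def band_mapper(detection):
    # Emit one count per detection, keyed by its frequency band.
    yield detection["band_mhz"], 1


def count_reducer(band, ones):
    return sum(ones)


# Made-up sample records, for illustration only.
detections = [{"band_mhz": 124}, {"band_mhz": 149}, {"band_mhz": 124}]

counts = reduce_phase(shuffle(map_phase(detections, band_mapper)), count_reducer)
print(counts)  # {124: 2, 149: 1}
```

In a distributed framework the map and reduce calls run on different nodes and the shuffle moves data between them; the structure of the user-supplied functions stays the same.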
  15. USABILITY
      • tkp-manage.py
      • Pipeline management tool
      • Inspired by the Django manage.py command
      • Easy to
        • set up a pipeline
        • add and run jobs
        • run Celery workers
        • add new commands
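A manage.py-style tool is typically a single entry point that dispatches to subcommands. The sketch below shows one way to build that with `argparse` and a small command registry, so that adding a command means adding one decorated function. The command names (`addjob`, `runjob`) and the program name are hypothetical and are not taken from tkp-manage.py.

```python
import argparse

COMMANDS = {}


def command(name, help_text):
    """Register a function as a subcommand (hypothetical registry)."""
    def register(func):
        COMMANDS[name] = (func, help_text)
        return func
    return register


@command("addjob", "create a new job")
def add_job(args):
    return f"created job {args.name}"


@command("runjob", "run an existing job")
def run_job(args):
    return f"running job {args.name}"


def build_parser():
    parser = argparse.ArgumentParser(prog="manage-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, (func, help_text) in COMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        sub.add_argument("name", help="job name")
        sub.set_defaults(func=func)   # remember which function handles this command
    return parser


def main(argv):
    args = build_parser().parse_args(argv)
    return args.func(args)


print(main(["addjob", "demo"]))
print(main(["runjob", "demo"]))
```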
  16. DEMO?
  17. SUPPORTED TELESCOPES
      • Support for FITS and CASA tables
      • Field parsers for LOFAR
      • Possible to add telescope-specific field parsing and quality checks
      • ThunderKAT next week
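Telescope-specific field parsing can be organised as a registry of parser functions keyed by telescope name, with a generic fallback. The sketch below is an assumption about how such a design might look, not TRAP's implementation. Headers are plain dictionaries here rather than objects from a FITS reader; `TELESCOP` and `DATE-OBS` are standard FITS keywords, while `ANTENNA` is a placeholder keyword invented for this example.

```python
PARSERS = {}


def register_parser(telescope):
    """Register a header parser for one telescope (hypothetical registry)."""
    def register(func):
        PARSERS[telescope.upper()] = func
        return func
    return register


def parse_generic(header):
    """Fields every image is expected to provide."""
    return {
        "telescope": header.get("TELESCOP", "unknown"),
        "date_obs": header.get("DATE-OBS"),
    }


@register_parser("LOFAR")
def parse_lofar(header):
    fields = parse_generic(header)
    # "ANTENNA" is a placeholder keyword for this sketch.
    fields["antenna_set"] = header.get("ANTENNA")
    return fields


def parse_header(header):
    """Pick the telescope-specific parser, falling back to the generic one."""
    telescope = str(header.get("TELESCOP", "")).upper()
    parser = PARSERS.get(telescope, parse_generic)
    return parser(header)


lofar = {"TELESCOP": "LOFAR", "DATE-OBS": "2013-07-11T10:00:00", "ANTENNA": "HBA"}
other = {"TELESCOP": "SomeArray", "DATE-OBS": "2013-07-11T10:00:00"}

print(parse_header(lofar))
print(parse_header(other))
```

Supporting another telescope then means registering one more function; telescope-specific quality checks could be attached through the same kind of registry.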
  18. PROJECT CLEANUP
      • Removed 40% of code
      • 80% unit tested
      • Added Jenkins build server
      • Performance regression tests
      • Pull request / review workflow
      • HipChat for central communication
  19. WEB INTERFACE: BANANA
      • New web interface
      • Rewrite of TKP-web
      • Future ready
      • Scientist friendly
  20. (no text on this slide)
  21. DEMO?
  22. FUTURE WORK
      • More stable releases
      • Add support for non-LOFAR data
      • More quality checks
      • Source storage and association performance
      • Distributed file system
      • Automated classification
      • Web-based data exploration
  23. QUESTIONS
      • gijs@pythonic.nl
      • @gijzelaerr
