Kafka has come to play an essential role in astronomy research. Particularly where black holes and neutron stars are involved, astronomers are increasingly seeking out the “time domain” and want to study explosive transients and variability. In response, observatories are increasingly adopting streaming technologies to send alerts to astronomers and to get their data to their science users in real time. In this talk, we will discuss architectural choices, challenges, and lessons learned in adapting Kafka for open science and open data. Our novel approach to OpenID Connect / OAuth2 in Kafka is designed to securely scale Kafka from access inside a single organization to access by the general public. We will present a case study of the General Coordinates Network (GCN), a public collaboration platform run by NASA for the astronomy research community to share alerts and rapid communications about high-energy, multi-messenger, and transient phenomena. Over the past 30 years, GCN has helped enable many seminal advances by disseminating observations, quantitative near-term predictions, requests for follow-up observations, and observing plans. GCN distributes alerts between space- and ground-based observatories, physics experiments, and thousands of working astronomers around the world.
2. Fast Astronomical Transients
• Regularly detected by ground- and space-based observatories
• Initial outburst on timescales of milliseconds to days
• Detection and follow-up observations across the electromagnetic spectrum (radio to very high-energy gamma-ray) and multi-messenger channels
• Requires worldwide coordination and cooperation
Examples: Solar Flares, Supernovae, Fast Radio Bursts, Novae, Gamma-Ray Bursts, Magnetar Flares
4. Realtime Alerts Born of Necessity
• The onboard recorder on NASA’s Compton Gamma-Ray Observatory (1991-2000) failed in 1992
• The need to downlink events as they occurred created an opportunity for realtime follow-up observations
• BAtse COordinates DIstribution NEtwork (BACODINE) was built to receive and distribute those alerts worldwide
• BACODINE became GCN
Compton’s Burst And Transient Search Experiment (BATSE) detected transients daily and helped solve the mystery of the origin of gamma-ray bursts
CGRO Deployment from the Space Shuttle Atlantis
5. Gamma-ray burst Coordinates Network (GCN Classic, 1992-present)
• GCN has provided rapid alerts to the international astronomy community for 30 years
• Utilized public internet infrastructure before such applications were common
• Still serves as the backbone of several astronomical communities
• Runs on-premises at Goddard with custom data formats and protocols and manual account administration
https://gcn.gsfc.nasa.gov
6. Two Types of GCN Data Products
GCN Notices
• By and for machines
• Fixed, predefined format
• Schema specific to each notice type
GCN Circulars
• By and for humans (some automated)
• Freeform text (with established style)
• Citable (but not peer-reviewed)
8. NASA’s Astrophysics Mission Fleet
Figure: Missions that report astronomical transients via GCN Notices
9. NASA’s Astrophysics Mission Fleet
Figure: Missions that report astronomical transients via GCN Circulars
10. GCN Notice Producer Space- and Ground-based Missions/Observatories/Experiments
11. Rapidly Responding to the Universe’s Most Distant Explosions
Figure: follow-up timeline for a Gamma-ray Burst (Distant Universe); time axis: Seconds, Minutes, Hours, Days
TITLE: GCN/FERMI NOTICE
NOTICE_DATE: Sun 03 May 20 23:25:19 UT
NOTICE_TYPE: Fermi-GBM Alert
RECORD_NUM: 1
TRIGGER_NUM: 610241118
GRB_DATE: 18972 TJD; 124 DOY; 20/05/03
GRB_TIME: 84313.46 SOD {23:25:13.46} UT
TRIGGER_SIGNIF: 11.1 [sigma]
TRIGGER_DUR: 0.016 [sec]
E_RANGE: 3-4 [chan] 47-291 [keV]
ALGORITHM: 1
DETECTORS: 1,0,0, 1,0,0,
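The text notice above is a set of "KEY: value" lines, so a consumer can turn it into a dictionary before acting on it. The following is a minimal sketch, not GCN's own parser: the function name `parse_text_notice` and the rule that a line without a leading `KEY:` continues the previous value are assumptions made for illustration, and the sample is an abbreviated copy of the Fermi-GBM notice shown on this slide.

```python
import re

# Assumption: a field starts with an identifier followed by a colon.
FIELD = re.compile(r"^([A-Za-z_]+):\s*(.*)$")


def parse_text_notice(text):
    """Parse 'KEY: value' lines into a dict.

    Keys that repeat (for example COMMENTS) are collected into a list.
    Lines that do not start with 'KEY:' are treated as wrapped
    continuations of the previous value.
    """
    fields = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD.match(line)
        if match:
            key, value = match.group(1), match.group(2)
            fields.setdefault(key, []).append(value)
        elif key is not None:
            fields[key][-1] = (fields[key][-1] + " " + line).strip()
    # Collapse single-valued keys to plain strings.
    return {k: v[0] if len(v) == 1 else v for k, v in fields.items()}


SAMPLE = """\
TITLE: GCN/FERMI NOTICE
NOTICE_DATE: Sun 03 May 20 23:25:19 UT
NOTICE_TYPE: Fermi-GBM Alert
TRIGGER_NUM: 610241118
GRB_TIME: 84313.46 SOD {23:25:13.46} UT
TRIGGER_SIGNIF: 11.1 [sigma]
"""

notice = parse_text_notice(SAMPLE)
trigger_num = int(notice["TRIGGER_NUM"])
significance = float(notice["TRIGGER_SIGNIF"].split()[0])
print(notice["NOTICE_TYPE"], trigger_num, significance)
```

Values stay as strings because units and formats differ between notice types; the caller converts only the fields it needs.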
Follow-up facilities shown in the figure:
• Ground-based Optical Telescopes (e.g. Gemini, ZTF, VLT, NOT)
• Ground-based Radio Telescopes (e.g. VLA, ALMA)
• Occasional Later Follow-up with additional space-based X-ray, Optical, Infrared Telescopes
• Space-based X-ray & UV/Optical (e.g. Swift)
• Space-based Gamma-ray Observatories (e.g. Swift & Fermi)
12. Transient Example 1: Gamma-ray Bursts
• Initial burst lasts seconds to minutes
• Sometimes poorly localized (100s to 1000s of square degrees)
• Rapidly evolving afterglow (and/or kilonova/supernova) lasts hours to days or longer
• Occur about once per day, but all events must be examined to find the rare and unusually interesting ones
• Challenges: rapid evolution, poor localization from many instruments, and afterglow follow-up across the spectrum
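To put "100s to 1000s of square degrees" in perspective, the sky area of a circular error region can be computed from its angular radius. This is a small worked sketch using the standard spherical-cap formula; the function name is made up for illustration, and the 3 arcmin radius is taken from the Swift-BAT notice shown later in this deck.

```python
import math


def cap_area_sq_deg(radius_deg):
    """Area on the sky, in square degrees, of a circle of the given angular radius."""
    # Solid angle of a spherical cap: 2*pi*(1 - cos r), in steradians.
    steradians = 2.0 * math.pi * (1.0 - math.cos(math.radians(radius_deg)))
    # Convert steradians to square degrees.
    return steradians * (180.0 / math.pi) ** 2


whole_sky = cap_area_sq_deg(180.0)       # the full sphere, about 41253 sq deg
arcmin_circle = cap_area_sq_deg(3.0 / 60.0)  # a 3 arcmin radius error circle
print(f"{whole_sky:.0f} {arcmin_circle:.4f}")
```

A 3 arcmin error circle covers well under a hundredth of a square degree, so a localization of hundreds of square degrees is tens of thousands of times larger in area and usually has to be tiled or searched rather than imaged in one pointing.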
13. GRB 221016A Timeline
Timeline (T0 = Oct 16, 2022 23:39:22.71 UT):
• T0: Detection
• T0+6 s: Fermi-GBM Alert
• T0+17 s: Swift-BAT Alert
• T0+97 s: Swift-UVOT Image
• T0+180 s: Swift-XRT Image
• T0+2.55 h: LCOGT Ground-based Observation
TITLE: GCN CIRCULAR
NUMBER: 32775
SUBJECT: GRB 221016A: LCOGT Optical Afterglow Detection
DATE: 22/10/17 03:23:46 GMT
We observed the GRB 221016A (Page et al., GCN 32774) field with the LCOGT 1-meter Sinistro instrument at the Cerro Tololo Interamerican Observatory, Chile site, on October 17, from 01:39 to 02:06 (corresponding to 2.00 to 2.55 hours from the GRB trigger time) with the Bessel I and R filters. We performed a series of 3x300s exposures in each band. We detect an uncataloged optical source within the XRT error region (Page et al., GCN 32774), in R band, and marginally in I-band (2-sigma detection). The following magnitudes are calculated using the USNO-B1.0 catalog as reference:
R = 21.22 +/- 0.18
I = 20.03 +/- 0.17
These magnitudes are not corrected for galactic extinction. Gemini imaging+spectroscopic follow up is ongoing.
TITLE: GCN/SWIFT NOTICE
NOTICE_DATE: Sun 16 Oct 22 23:39:41 UT
NOTICE_TYPE: Swift-BAT GRB Position
TRIGGER_NUM: 1129775, Seg_Num: 0
GRB_RA: 38.949d {+02h 35m 48s} (J2000), 39.186d {+02h 36m 45s} (current)
GRB_DEC: -34.624d {-34d 37' 26"} (J2000), -34.526d {-34d 31' 31"} (current)
GRB_ERROR: 3.00 [arcmin radius, statistical only]
GRB_INTEN: 4064 [cnts] Image_Peak=416 [image_cnts]
TRIGGER_DUR: 1.024 [sec]
TRIGGER_INDEX: 146
E_range: 25-100 keV
BKG_INTEN: 15385 [cnts]
BKG_TIME: 85145.95 SOD {23:39:05.95} UT
BKG_DUR: 8 [sec]
GRB_DATE: 19868 TJD; 289 DOY; 22/10/16
GRB_TIME: 85164.19 SOD {23:39:24.19} UT
GRB_PHI: -69.17 [deg]
GRB_THETA: 24.27 [deg]
SOLN_STATUS: 0x20000003
RATE_SIGNIF: 44.29 [sigma]
IMAGE_SIGNIF: 15.30 [sigma]
MERIT_PARAMS: +1 +0 +0 +0 +2 +4 +0 +0 +68 +0
COMMENTS: SWIFT-BAT GRB Coordinates.
COMMENTS: This is a rate trigger.
COMMENTS: A point_source was found.
COMMENTS: This does not match any source in the on-board catalog.
COMMENTS: This does not match any source in the ground catalog.
COMMENTS: This is a GRB.
COMMENTS: This trigger occurred at longitude,latitude = 88.55,-19.08 [deg].
COMMENTS: NOTE: This BAT event is temporally(2.0<100sec) coincident with the FERMI_GBM event (trignum=687656367).
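The notices give the event time as a Truncated Julian Date (TJD) plus seconds of day (SOD). A short sketch of converting that pair to a calendar time follows; it relies on the standard definition TJD = JD - 2440000.5, so TJD 0 falls on 1968 May 24 00:00 UT. The function name is hypothetical, and the naive `datetime` values are understood to be UTC.

```python
from datetime import datetime, timedelta

# TJD 0 corresponds to JD 2440000.5, i.e. 1968-05-24 00:00 UT.
TJD_EPOCH = datetime(1968, 5, 24)


def tjd_sod_to_utc(tjd, sod):
    """Convert a Truncated Julian Date and seconds of day to a UTC datetime."""
    return TJD_EPOCH + timedelta(days=tjd, seconds=sod)


# GRB_DATE and GRB_TIME from the Swift-BAT notice above.
swift_trigger = tjd_sod_to_utc(19868, 85164.19)
# GRB_DATE and GRB_TIME from the Fermi-GBM notice shown earlier in the deck.
fermi_trigger = tjd_sod_to_utc(18972, 84313.46)
print(swift_trigger.isoformat(sep=" ", timespec="milliseconds"))
print(fermi_trigger.isoformat(sep=" ", timespec="milliseconds"))
```

Both results agree with the human-readable times printed in braces in the notices (23:39:24.19 on 22/10/16 and 23:25:13.46 on 20/05/03), which is a useful sanity check when writing a parser.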
14. Transient Example 2: Gravitational Waves from Binary Neutron Star Mergers
• Initial signal lasts seconds, sometimes in coincidence with a gamma-ray signal
• Rapidly evolving electromagnetic counterpart + longer-lived afterglow or kilonova
• Poor initial localization
• Rare and very exciting transients
• Requires huge community coordination effort
• Challenges: recognizing coincident signals fast, poor localization, huge follow-up coordination effort
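The simplest ingredient of "recognizing coincident signals fast" is a time-window test, like the 100-second temporal coincidence that the Swift-BAT notice for GRB 221016A reports against a Fermi-GBM event. The sketch below is illustrative only: the function name and default window are assumptions, and the first timestamp is the T0 from the GRB 221016A timeline slide, assumed here to be the Fermi-GBM trigger time. Real searches also weigh sky position and significance.

```python
from datetime import datetime


def is_coincident(t_a, t_b, window_s=100.0):
    """Return True if two event times are within window_s seconds of each other."""
    return abs((t_a - t_b).total_seconds()) <= window_s


# T0 from the GRB 221016A timeline (assumed to be the Fermi-GBM trigger).
fermi_t0 = datetime(2022, 10, 16, 23, 39, 22, 710000)
# GRB_TIME from the Swift-BAT notice for the same burst.
swift_trigger = datetime(2022, 10, 16, 23, 39, 24, 190000)

offset_s = (swift_trigger - fermi_t0).total_seconds()
print(offset_s, is_coincident(fermi_t0, swift_trigger))
```

With streams from several observatories arriving over one Kafka connection, a consumer can keep a short buffer of recent event times per mission and apply a test like this as each new notice arrives.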
Abbott et al., 2017
16. Transient Example 3: Neutrino Coincident with Flaring Active Galaxy
• Astrophysical neutrinos are rare detections, but indicative of a longer-term (months) outburst from an active galactic nucleus
• Challenges: broadband all-sky monitoring for statistical coincidence
IceCube et al. 2018
17. The Changing Scientific Landscape
GCN is constantly evolving to serve new transients, messengers, and observatories:
• Gravitational wave events (GW150914, GW170817)
• High-energy neutrinos (IC170922A)
• Tidal disruption events (Swift J1644+57)
• Magnetar giant flares (200415A)
Still from an animation of binary neutron star merger detected
August 17, 2017, seen in gravitational waves and in light across the
spectrum from radio to gamma rays
18. The Changing Technical Landscape
• Astronomy is adopting industry-developed general time-series databases and streaming frameworks
• The upcoming revolutionary optical survey by the Vera C. Rubin Observatory, the Legacy Survey of Space and Time (LSST), will use Apache Kafka to distribute transient alerts as its primary data product
• Many other experiments are following suit: Zwicky Transient Facility, LIGO/Virgo/KAGRA, and GCN
19. NASA Kafka Applications
Project | NASA Center | Description | Status
GCN | GSFC | Astronomical Transient Alert System | In production since 2022, development ongoing
Enterprise Business Information Services | JPL | Data integration among various types of business applications | In production since 2020
Federated Airspace Management Framework | Ames | Software facility serving several NASA Projects and external partners (e.g. FAA, DoD) providing advanced air traffic management software for various drone-related research efforts | Working with multiple ARMD, STMD, and external efforts
Complex Event Processor | JPL | Transports Deep Space Network system metrics, distributes them in real time, and also stores the last 30 days of metrics | Pre-production
All Federal agencies are using self-managed Kafka brokers, either Apache Kafka or Confluent Platform
GCN is sponsoring the FedRAMP authorization for Confluent Cloud to make it easy for NASA and other federal
agencies to deploy Kafka software-as-a-service
20. The New GCN: The General Coordinates Network
• New web portal
• Self Service (user managed accounts through website)
• Open Standards (Apache Kafka Protocol)
• Open Source
• Highly Available (Cloud based)
• Secure (SSL/TLS)
• Expansion of producer instruments and transient source classes
• Future expansion for database and web applications
https://gcn.nasa.gov, Funded Through the NASA ISFM Program
21. The New GCN is built on Kafka
• GCN Classic provides three formats over three custom protocols
• GCN Classic over Kafka provides all three formats over one standard protocol: Apache Kafka
• GCN Kafka will transition over the next few years to streaming all data in JSON format over Kafka (Notices and Circulars)
23. Kafka Is Designed for Use Inside an Organization
Kafka is usually deployed inside an organization, where the following assumptions about the end users with Kafka client access hold:
• The identities of the users are known to those setting up the cluster.
• The users are trusted.
• There are a limited number of users. (Confluent Cloud has a quota of 500 users.)
None of these assumptions holds for GCN!
24. GCN’s Approach to Kafka as a Public Service
Kafka has supported OpenID Connect (OIDC) for single sign-on (SSO) since version 3.1.0 with KIP-768
• Allows total decoupling of account provisioning from the Kafka cluster
• Allows a practically limitless number of accounts
• Compatible with numerous commercial and open source OIDC auth solutions (we use Amazon Cognito)
• We use the same auth system across our web site and our Kafka broker
• No custom auth extensions to side-load into the server; everything we need is in open-source Apache Kafka
• Vendor support: we can use vendor-supported brokers such as Confluent Platform (self-managed) and Confluent Cloud (fully managed)
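On the client side, the OIDC approach above comes down to a handful of configuration properties: the client fetches a token from the identity provider using its client ID and secret, then presents it to the broker over SASL/OAUTHBEARER. The sketch below builds such a configuration using librdkafka property names (the library underneath the confluent-kafka clients). It is an illustration, not GCN's actual configuration: the helper name, the broker address, and the token endpoint URL are placeholders.

```python
def oidc_client_config(bootstrap_servers, token_endpoint_url, client_id, client_secret):
    """Build a librdkafka-style client configuration for SASL/OAUTHBEARER with OIDC."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # TLS for transport, SASL for authentication.
        "security.protocol": "sasl_ssl",
        "sasl.mechanisms": "OAUTHBEARER",
        # Let the client obtain tokens itself via the OIDC client-credentials flow.
        "sasl.oauthbearer.method": "oidc",
        "sasl.oauthbearer.client.id": client_id,
        "sasl.oauthbearer.client.secret": client_secret,
        "sasl.oauthbearer.token.endpoint.url": token_endpoint_url,
    }


# Placeholder values; real ones come from the service's credential page.
config = oidc_client_config(
    bootstrap_servers="kafka.example.org:9092",
    token_endpoint_url="https://auth.example.org/oauth2/token",
    client_id="my-client-id",
    client_secret="my-client-secret",
)
print(config["sasl.mechanisms"], config["sasl.oauthbearer.method"])
```

A dictionary like this, plus a `group.id` for consumers, is what a confluent-kafka `Consumer` or `Producer` would be constructed with. Because the broker only validates tokens, creating or revoking an account never requires touching the Kafka cluster.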
25. Try it for yourself! at https://gcn.nasa.gov/quickstart
• Start streaming GCN notices in seconds!
• Receive alerts from Fermi, Swift, LIGO/Virgo/KAGRA*, IceCube, and many more
• Use our "quick start" interface to:
  • Sign in / sign up
  • Create API credentials
  • Customize your alerts
  • Generate Kafka client code for your favorite language (Python, JS, C++, C#, ...)
* see also https://emfollow.docs.ligo.org/userguide/tutorial/receiving/gcn.html
26. Streaming GCN Notices in Python (1/9)
Launch quick start
• Go to https://gcn.nasa.gov and click Start streaming GCN Notices
27. Streaming GCN Notices in Python (2/9)
Step 1: Sign in / Sign up
• Click "Sign in / Sign up" to create a GCN account.
28. Streaming GCN Notices in Python (3/9)
Choose how to sign up
• Choose any one of the following methods to sign up:
  • Email and password
  • Google
  • Facebook
  • LaunchPad (for NASA employees and affiliates)
• Important: make sure you sign in the same way each time. Accounts are not linked.
30. Streaming GCN Notices in Python (5/9)
Step 2: Select Credentials
Client credentials allow your scripts to interact with GCN on your behalf.
1. Choose a name for your credential.
2. Complete the CAPTCHA.
3. Click "Create New Credentials" to go to the next step.
31. Streaming GCN Notices in Python (6/9)
Step 3: Customize Alerts
Select one of these alert formats.
• Text: plain text key-value pairs separated by newlines.
• VOEvent: VOEvent XML.
• Binary: 160-byte binary format. Field packing is specific to each notice type.
• JSON: To be expanded soon
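For the binary format, a consumer first has to split the 160-byte payload into machine words before applying the per-notice-type field packing. The sketch below assumes the payload is 40 big-endian signed 32-bit integers; that layout is an assumption made for illustration based on the 160-byte size, and the slides do not specify it. The function name and the sample value are placeholders.

```python
import struct

PACKET_SIZE = 160
# Assumption: 40 four-byte signed integers in network (big-endian) byte order.
PACKET_FORMAT = "!40l"


def unpack_binary_notice(payload):
    """Split a 160-byte binary notice into a tuple of 40 integers."""
    if len(payload) != PACKET_SIZE:
        raise ValueError(f"expected {PACKET_SIZE} bytes, got {len(payload)}")
    return struct.unpack(PACKET_FORMAT, payload)


# Build a synthetic packet so the example runs without a network connection.
words = [0] * 40
words[0] = 61  # placeholder value for the first word
payload = struct.pack(PACKET_FORMAT, *words)

fields = unpack_binary_notice(payload)
print(len(payload), len(fields), fields[0])
```

Interpreting the individual words (scaling, units, bit flags) depends on the notice type, which is why the text, VOEvent, and eventually JSON formats are easier starting points for new users.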
32. Streaming GCN Notices in Python (7/9)
Step 3 Continued: Choose Notice Types
Select the missions that you want to subscribe to. Expand a mission to fine-tune notice types. Click “Details” for more info.
33. Streaming GCN Notices in Python (8/9)
Step 3 Continued: Choose Notice Types
Learn about the missions, observatories, and experiments and the Notice types that you’d like to receive.
34. Streaming GCN Notices in Python (9/9)
Step 4: Get Sample Code
• Copy and paste Python client code or download it to your computer to run.
• Client sample code is also available in Node.js (ESM or CommonJS), C/C++, C#.
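The generated Python code follows the usual Kafka consumer pattern: authenticate, subscribe to topics, then poll in a loop. The sketch below shows roughly what that looks like. Several details are assumptions that do not appear in the slides: the `gcn_kafka` client package, the `gcn.classic.<format>.<NOTICE_TYPE>` topic naming pattern, and the two notice type names. Treat the code produced by the quick start page as authoritative.

```python
def topic_names(alert_format, notice_types):
    """Build topic names, assuming a 'gcn.classic.<format>.<NOTICE_TYPE>' pattern."""
    return [f"gcn.classic.{alert_format}.{name}" for name in notice_types]


def stream_notices(client_id, client_secret, topics):
    """Print notices as they arrive. Runs until interrupted."""
    # Imported here so the rest of the example runs without the package installed.
    from gcn_kafka import Consumer  # assumed third-party client package

    consumer = Consumer(client_id=client_id, client_secret=client_secret)
    consumer.subscribe(topics)
    while True:
        for message in consumer.consume(timeout=1):
            if message.error():
                print(message.error())
                continue
            print(message.value())


# Hypothetical notice type names, used only to show the topic pattern.
topics = topic_names("text", ["FERMI_GBM_ALERT", "SWIFT_BAT_GRB_POS_ACK"])
print(topics)

# To start streaming, fill in the credentials created in Step 2:
# stream_notices("fill me in", "fill me in", topics)
```

The client ID and secret are the credentials created in Step 2, and the topic list corresponds to the format and notice types selected in Step 3.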
35. Self-Service Email Alerts (Powered by Kafka)
• Alternatively, email is still a popular way to receive GCN Notices.
• You can manage your email subscriptions yourself in “Email Notifications”.
36. GCN Kafka Already in Use in the Astronomical Community
• Instrument Pipelines (e.g. Fermi, Swift)
• Automated downloads of data from space missions (Swift-BAT)
• Coincident sub-threshold searches (LIGO/Virgo/KAGRA + Fermi + Swift)
• Automated and manual responses with ground-based optical and gamma-ray telescopes
37. Future GCN Development
• GCN Circulars distribution over Kafka
• Database and searchable archive of Notices incorporated into the Circulars archive
• Convert all GCN Classic producers to JSON-format Notices with our Unified Alert Schema, distributed via Kafka
• Enable a simple process for producers to modify and create new topics
• Add new missions/observatories/experiments to Notice producers
• Build correlators with other astronomical transient broker streams
38. Thanks for listening!
Web site: https://gcn.nasa.gov
GitHub Team: https://github.com/nasa-gcn
Contact us: https://gcn.nasa.gov/contact
Grab a sticker!