Josh Bloom
UC Berkeley Astronomy
@profjsb
Autoencoding RNN for inference on
unevenly sampled time-series data
Data Driven Discovery Investigator
Workshop on Applying Advanced AI Workflows
In Astronomy and Microscopy
11 Sept 2018 (UCSC, Santa Clara)
Discovery in images:
Real or spurious sources?
(Ever) Increasing need for ML methods
in Time-Domain Astronomy
Bloom+12, Goldstein+16, …
Inference: What is
this event and is it
worth following up?
Levitan+14
Surrogate modelling &
parameter estimation
Supernova (Thomas/Nugent);
Exoplanets (Ford+11)
Supernova Discovery in the Pinwheel Galaxy
11 hr after explosion
nearest SN Ia in >3 decades
ML-assisted discovery
©Peter Nugent
Nugent+11, Li, Bloom+12, Bloom+12…
Probabilistic Classification of
50k+ Variable Stars
Shivvers, JSB, Richards, MNRAS 2014
106 "DEB" candidates → 12 new mass-radius measurements
15 "RCB/DYP" candidates → 8 new discoveries: triple the number of known Galactic DYPer stars (Miller, Richards, JSB, et al., ApJ 2012)
5400 spectroscopic targets: turn synoptic imagers into ~spectrographs (Miller, JSB, Richards, et al., ApJ 2015)
Challenges with Traditional ("Hand-Crafted Featurization")
Approaches
• Feature engineering is expensive (people/compute), needs
a lot of domain knowledge
• "Small data" domain with only 1000s of labelled training
examples
• Traditional ML techniques don't account for feature
uncertainty
• Ideally would like to learn on one survey and apply that
knowledge to another (e.g., ASAS→ZTF→LSST)
https://github.com/cesium-ml/cesium
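For contrast with the learned-feature approach on the next slides, hand-crafted featurization with cesium looks roughly like the sketch below. This is a hedged example: the featurize_time_series call and the feature names reflect the cesium API as I understand it and may differ between versions.

import numpy as np
from cesium import featurize

# A fake, irregularly sampled "light curve": times, magnitudes, uncertainties
t = np.sort(np.random.uniform(0, 100, 150))
m = 12.0 + 0.3 * np.sin(2 * np.pi * t / 7.3) + 0.05 * np.random.randn(t.size)
e = np.full_like(m, 0.05)

# Each requested feature encodes domain knowledge (variability amplitude,
# distribution moments, Lomb-Scargle period, ...) and must be designed,
# implemented, and validated by hand.
fset = featurize.featurize_time_series(
    times=t, values=m, errors=e,
    features_to_use=["amplitude", "std", "skew", "freq1_freq"],
)
print(fset)  # one row of engineered features for this light curve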
1. Build an autoencoder network to
learn to reproduce irregularly sampled
light curves using an information
bottleneck (B)
[Schematic: light curve → Encoder E → Bottleneck B → Decoder D → reconstructed light curve]
2. Use B as features and learn a
traditional classifier (random forest)
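A minimal sketch of this two-stage pipeline, assuming Keras and light curves padded to a fixed length, with each time step given as a (delta_t, magnitude) pair. The 64-dimensional bottleneck and bidirectional GRU layers follow the slides; the other layer sizes and the padding length are illustrative assumptions, not the paper's exact configuration.

from tensorflow.keras import layers, Model
from sklearn.ensemble import RandomForestClassifier

MAX_LEN, N_CHANNELS, BOTTLENECK = 200, 2, 64   # len(B) = 64

# 1. Autoencoder: encode the padded (delta_t, mag) sequence into B, then decode.
inp = layers.Input(shape=(MAX_LEN, N_CHANNELS))
h = layers.Bidirectional(layers.GRU(96))(inp)                    # encoder E
B = layers.Dense(BOTTLENECK, name="bottleneck")(h)               # bottleneck B
h = layers.Bidirectional(layers.GRU(96, return_sequences=True))(
    layers.RepeatVector(MAX_LEN)(B))                             # decoder D
out = layers.TimeDistributed(layers.Dense(1))(h)                 # reconstructed mags

autoencoder = Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X[..., 1:2], ...)   # target is the magnitude channel

# 2. Use B as features: extract the bottleneck and fit a random forest
#    on the (much smaller) labelled subset.
encoder = Model(inp, B)
# features = encoder.predict(X_labelled)
# clf = RandomForestClassifier(n_estimators=500).fit(features, y_labelled)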
len(B) = 64
Example Reconstructions
of the Autoencoder
Bottleneck clearly learns
important features
underlying the "physics"
that generates the data
Results rival best-in-class approaches
Code/Data: https://github.com/bnaul/IrregularTimeSeriesAutoencoderPaper
Novelties & Improvements
Figure 1: Diagram of an RNN encoder/decoder architecture for irregularly sampled time-series data. The network uses two RNN layers (specifically, bidirectional gated recurrent units, GRUs).
• Natively handles irregular sampling
• Learning loss accounts for measurement uncertainty (sketched below)
• Natural data augmentation with bootstrap resampling
• Unsupervised feature learning → leverage a large corpus of unlabelled light curves
• Transfer learning appears to work
• Learning scales linearly in the number of training examples
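The uncertainty-aware loss can be viewed as a chi-squared-style reconstruction error in which each photometric point is down-weighted by its measurement variance. A minimal illustration (my own sketch, not the paper's exact code):

import numpy as np

def weighted_reconstruction_loss(y_true, y_pred, sigma):
    """Chi^2-style reconstruction error: points with larger photometric
    uncertainty sigma contribute less to the autoencoder training signal."""
    return np.mean(((y_true - y_pred) / sigma) ** 2)

# Example: a point with sigma = 0.5 mag counts 100x less than one with sigma = 0.05 mag.

In a Keras implementation, roughly the same effect can be obtained by passing per-time-step sample weights proportional to 1/sigma^2 at fit time.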
Extensions/Active Research
• Anomaly detection (on the bottleneck features)
• Hyperspectral topology: UMAP applied to L2-normed autoencoder for MNIST (a rough sketch of both ideas follows below)
Ellie Schwab Abrahams
Also, with Sara Jamal
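One plausible realization of both ideas, assuming the bottleneck features have already been extracted. The random stand-in data, the choice of IsolationForest, and the UMAP settings are assumptions, not specifics from the slide.

import numpy as np
import umap
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import normalize

B = np.random.randn(5000, 64)    # stand-in for real bottleneck features
B_unit = normalize(B)            # L2-norm each embedding

# Anomaly detection on the bottleneck features: low scores flag candidate anomalies
scores = IsolationForest(random_state=0).fit(B_unit).score_samples(B_unit)

# 2-D view of the embedding topology (the slide shows this for MNIST)
embedding_2d = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(B_unit)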
Extensions/Active Research
• New layer types: explore Temporal Convolutional Networks (TCNs)
• Co-training across surveys
• Semi-supervised topology + metadata: Loss ≈ L_ts + λ · L_class (sketched below)
[Diagram: source time series → LSTM encoder → bottleneck → LSTM decoder → time-series reconstruction (unsupervised); bottleneck + source metadata → FC layer → classification (supervised)]
Ellie Schwab Abrahams
Also, with Sara Jamal
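A rough sketch of that semi-supervised topology in Keras: one LSTM encoder feeds both a time-series reconstruction head (unsupervised) and a classification head that also sees source metadata (supervised), trained with Loss ≈ L_ts + λ · L_class. Layer sizes, the metadata dimension, the number of classes, and λ are illustrative assumptions.

from tensorflow.keras import layers, Model

MAX_LEN, N_CHANNELS, N_META, N_CLASSES, LAM = 200, 2, 8, 10, 0.5

ts_in = layers.Input(shape=(MAX_LEN, N_CHANNELS), name="time_series")
meta_in = layers.Input(shape=(N_META,), name="metadata")

# Shared LSTM encoder -> bottleneck
bottleneck = layers.Dense(64, name="bottleneck")(layers.LSTM(96)(ts_in))

# Unsupervised head: LSTM decoder reconstructs the time series from the bottleneck
recon = layers.TimeDistributed(layers.Dense(1), name="reconstruction")(
    layers.LSTM(96, return_sequences=True)(layers.RepeatVector(MAX_LEN)(bottleneck)))

# Supervised head: FC classifier on bottleneck + source metadata
cls = layers.Dense(N_CLASSES, activation="softmax", name="classification")(
    layers.Dense(64, activation="relu")(layers.Concatenate()([bottleneck, meta_in])))

model = Model([ts_in, meta_in], [recon, cls])
model.compile(optimizer="adam",
              loss={"reconstruction": "mse",
                    "classification": "sparse_categorical_crossentropy"},
              loss_weights={"reconstruction": 1.0, "classification": LAM})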
Josh Bloom
UC Berkeley Astronomy
@profjsb
Autoencoding RNN for inference on
unevenly sampled time-series data
Data Driven Discovery Investigator
Thanks!
Workshop on Applying Advanced AI Workflows
In Astronomy and Microscopy
11 Sept 2018 (UCSC, Santa Clara)
50k variables, 810 with known labels (time series, colors)
Challenge: classification on large sets
Richards+11, 12
