Josh Bloom
UC Berkeley Astronomy
@profjsb
Autoencoding RNN for inference on
unevenly sampled time-series data
Data Driven Discovery Investigator
Workshop on Applying Advanced AI Workflows
In Astronomy and Microscopy
11 Sept 2018 (UCSC, Santa Clara)
Discovery in images:
Real or spurious sources?
(Ever) Increasing need for ML methods
in Time-Domain Astronomy
Bloom+12, Goldstein+16, …
Inference: What is
this event and is it
worth following up?
Levitan+14
Surrogate modelling &
parameter estimation
Supernova (Thomas/Nugent);
Exoplanets (Ford+11)
Supernova Discovery in the Pinwheel Galaxy
11 hr after explosion
nearest SN Ia in >3 decades
ML-assisted discovery
©Peter Nugent
Nugent+11, Li, Bloom+12, Bloom+12…
Probabilistic Classification of
50k+ Variable Stars
Shivvers,JSB,Richards MNRAS,2014
106 “DEB” candidates
12 new
mass-radii
15 “RCB/DYP” candidates
8 new discoveries
Triple # of
Galactic
DYPer Stars
Miller, Richards, JSB,..ApJ 2012
5400
Spectroscopic
Targets
Miller, JSB, Richards,..ApJ 2015
Turn synoptic
imagers into
~spectrographs
Challenges with Traditional ("Hand-Crafted Featurization")
Approaches
• Feature engineering is expensive (people/compute), needs
a lot of domain knowledge
• "Small data" domain with only 1000s of labelled training
examples
• Traditional ML techniques don't account for feature
uncertainty
• Ideally would like to learn on one survey and apply that
knowledge to another (e.g., ASAS→ZTF→LSST)
https://github.com/cesium-ml/cesium
1. Build an autoencoder network to
learn to reproduce irregularly sampled
light curves using an information
bottleneck (B)
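[Diagram: encoder E maps a light curve to the bottleneck B; decoder D maps B to an approximate reconstruction of the light curve]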
2. Use B as features and learn a
traditional classifier (random forest)
len(B) = 64
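As a rough illustration of the input to step 1, one common way to present an irregularly sampled light curve to a recurrent encoder is as a sequence of (Δt, value) pairs, so that the sampling pattern itself is part of the input. The sketch below is an assumption about the preprocessing, not code from the talk; the function name `to_rnn_input` and the normalization choice are hypothetical.

```python
import numpy as np

def to_rnn_input(times, values):
    """Turn one irregularly sampled light curve into an array of
    (delta_t, normalized value) rows, one row per observation."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)  # make sure observations are time-ordered
    times, values = times[order], values[order]
    # Time since the previous observation; the first sample gets 0.
    delta_t = np.diff(times, prepend=times[0])
    # Center and scale the measurements (an assumed normalization).
    spread = values.std()
    scale = spread if spread > 0 else 1.0
    normalized = (values - values.mean()) / scale
    return np.column_stack([delta_t, normalized])

# Hypothetical light curve with uneven gaps between observations.
t = [0.0, 0.7, 3.1, 3.2, 9.5]
mag = [12.1, 12.4, 11.9, 12.0, 12.6]
x = to_rnn_input(t, mag)  # shape (5, 2): columns are delta_t and scaled value
```

An encoder fed rows like these would compress the whole sequence into the fixed-length bottleneck B, which step 2 then treats as an ordinary feature vector.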
Example Reconstructions
of the Autoencoder
Bottleneck clearly learns
important features
underlying the "physics"
that generates the data
Results rival best-in-class approaches
Code/Data: https://github.com/bnaul/IrregularTimeSeriesAutoencoderPaper
Figure 1: Diagram of an RNN encoder/decoder architecture for irregularly sampled time series
data. This network uses two RNN layers (specifically, bidirectional gated recurrent units (GRU) [6, 2…
• Natively handles
irregular sampling
• Learning loss accounts
for uncertainty
• Natural data
augmentation with
bootstrap resampling
Novelties & Improvements
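Two of the bullets on this slide can be sketched in a few lines. This is an illustrative sketch, not the talk's implementation: it assumes the uncertainty-aware loss is a mean squared error weighted by inverse variance (1/σ²), and that bootstrap augmentation means drawing observations with replacement from a light curve while keeping them in time order. The names `weighted_mse` and `bootstrap_light_curve` are hypothetical.

```python
import numpy as np

def weighted_mse(y_true, y_pred, sigma):
    """Squared reconstruction error with each point weighted by
    1 / sigma**2, so noisy measurements contribute less (assumed form)."""
    y_true, y_pred, sigma = (np.asarray(a, dtype=float)
                             for a in (y_true, y_pred, sigma))
    weights = 1.0 / sigma**2
    return float(np.sum(weights * (y_true - y_pred) ** 2) / np.sum(weights))

def bootstrap_light_curve(times, values, sigma, rng):
    """Draw observations with replacement from a time-ordered light curve.
    Sorting the drawn indices keeps the resampled copy in time order."""
    times, values, sigma = (np.asarray(a, dtype=float)
                            for a in (times, values, sigma))
    idx = np.sort(rng.integers(0, len(times), size=len(times)))
    return times[idx], values[idx], sigma[idx]

# Hypothetical time-ordered light curve with per-point uncertainties.
t = np.array([0.0, 0.7, 3.1, 3.2, 9.5])
y = np.array([12.1, 12.4, 11.9, 12.0, 12.6])
err = np.array([0.1, 0.1, 0.5, 0.1, 0.2])

# A reconstruction that misses only the noisiest point is penalized lightly.
y_hat = y.copy()
y_hat[2] += 1.0
loss = weighted_mse(y, y_hat, err)
plain = float(np.mean((y - y_hat) ** 2))

rng = np.random.default_rng(0)
t_b, y_b, err_b = bootstrap_light_curve(t, y, err, rng)
```

In this example the weighted loss is much smaller than the unweighted mean squared error, because the only miss falls on the point with the largest error bar.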
• unsupervised feature
learning → leverage large
corpus of unlabelled light
curves
• transfer learning appears
to work
• learning scales linearly in
training examples
Novelties & Improvements
Extensions/Active Research
• Anomaly detection (on the bottleneck features)
• Hyperspectral topology
UMAP applied to
L2-normed autoencoder
for MNIST
Ellie Schwab Abrahams
Also, with Sara Jamal
• New layer types: explore Temporal Convnet (TCNs)
• Co-training across surveys
• Semi-supervised topology + metadata
Loss ~ L_ts + λ L_class
[Diagram labels: Source metadata; Source time series; LSTM; LSTM; Bottleneck; FC; Unsupervised; Supervised; Classification; Time series reconstruction]
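The combined loss on this slide (a time-series term plus λ times a classification term) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it takes L_ts to be a mean squared reconstruction error and L_class to be a softmax cross-entropy, and it assumes unlabelled examples contribute only the reconstruction term. The function names and the value of λ are hypothetical.

```python
import numpy as np

def reconstruction_loss(y_true, y_pred):
    """L_ts: mean squared error between a light curve and its reconstruction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def classification_loss(logits, label):
    """L_class: softmax cross-entropy for a single labelled example."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[label])

def semi_supervised_loss(y_true, y_pred, logits=None, label=None, lam=1.0):
    """L_ts + lam * L_class; without a label only L_ts is returned."""
    loss = reconstruction_loss(y_true, y_pred)
    if label is not None:
        loss += lam * classification_loss(logits, label)
    return loss

# Hypothetical example: the same reconstruction with and without a label.
y = [0.0, 1.0, 0.0]
y_hat = [0.0, 1.0, 1.0]
unlabelled = semi_supervised_loss(y, y_hat)
labelled = semi_supervised_loss(y, y_hat, logits=[0.0, 0.0], label=0, lam=0.5)
```

Setting λ to zero recovers the purely unsupervised autoencoder, while larger values push the bottleneck toward features that separate the labelled classes.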
Thanks!
50k variables, 810 with known labels (timeseries, colors)
Challenge: classification on large sets
Richards+11, 12

All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 

Autoencoding RNN for inference on unevenly sampled time-series data

  • 1. Josh Bloom UC Berkeley Astronomy @profjsb Autoencoding RNN for inference on unevenly sampled time-series data Data Driven Discovery Investigator Workshop on Applying Advanced AI Workflows In Astronomy and Microscopy 11 Sept 2018 (UCSC, Santa Clara)
  • 2. Discovery in images: Real or spurious sources? (Ever) Increasing need for ML methods in Time-Domain Astronomy Bloom+12, Goldstein+16, … Inference: What is this event and is it worth following up? Levitan+14 Surrogate modelling & parameter estimation Supernova (Thomas/Nugent); Exoplanets (Ford+11)
  • 3. Supernova Discovery in the Pinwheel Galaxy 11 hr after explosion nearest SN Ia in >3 decades ML-assisted discovery ©Peter Nugent Nugent+11, Li, Bloom+12, Bloom+12…
  • 4. Probabilistic Classification of 50k+ Variable Stars Shivvers, JSB, Richards, MNRAS 2014 106 “DEB” candidates 12 new mass-radii 15 “RCB/DYP” candidates 8 new discoveries Triple # of Galactic DY Per Stars Miller, Richards, JSB, .. ApJ 2012 5400 Spectroscopic Targets Miller, JSB, Richards, .. ApJ 2015 Turn synoptic imagers into ~spectrographs
  • 5. Challenges with Traditional ("Hand-Crafted Featurization") Approaches • Feature engineering is expensive (people/compute), needs a lot of domain knowledge • "Small data" domain with only 1000s of labelled training examples • Traditional ML techniques don't account for feature uncertainty • Ideally would like to learn on one survey and apply that knowledge to another (e.g., ASAS→ZTF→LSST) https://github.com/cesium-ml/cesium
  • 6. 1. Build an autoencoder network to learn to reproduce irregularly sampled light curves using an information bottleneck (B) [Diagram: encoder E maps a light curve to B; decoder D maps B to an approximate reconstruction of the light curve] 2. Use B as features and learn a traditional classifier (random forest)
  • 7. len(B) = 64 Example Reconstructions of the Autoencoder
  • 8. Bottleneck clearly learns important features underlying the "physics" that generates the data
  • 9. Results rival best-in-class approaches Code/Data: https://github.com/bnaul/IrregularTimeSeriesAutoencoderPaper
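Step 2 on slide 6 (a random forest trained on the bottleneck B) can be sketched as below. This is a minimal illustration, not the code from the linked repository: the bottleneck vectors are synthetic stand-ins drawn from two Gaussians, and the class structure, sample counts, and forest settings are all assumptions. Only len(B) = 64 comes from the slides.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
N_PER_CLASS, LEN_B = 200, 64  # len(B) = 64 as on slide 7; N_PER_CLASS is arbitrary

# Synthetic stand-in for encoder output (assumption): two classes whose
# 64-dimensional embeddings differ only in their mean.
B = np.vstack([
    rng.normal(0.0, 1.0, size=(N_PER_CLASS, LEN_B)),
    rng.normal(1.0, 1.0, size=(N_PER_CLASS, LEN_B)),
])
y = np.repeat([0, 1], N_PER_CLASS)

B_train, B_test, y_train, y_test = train_test_split(
    B, y, test_size=0.25, random_state=0, stratify=y
)

# A traditional classifier trained on the bottleneck features.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(B_train, y_train)

accuracy = clf.score(B_test, y_test)   # fraction of correct test predictions
proba = clf.predict_proba(B_test)      # per-class probabilities, one row per source
```

In a real pipeline B would be the encoder's output for each labelled light curve; the classifier never sees the raw time series.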
  • 10-12. Novelties & Improvements [Figure 1: Diagram of an RNN encoder/decoder architecture for irregularly sampled time series data. This network uses two RNN layers (specifically, bidirectional gated recurrent units (GRU) [6, 2…] • Natively handles irregular sampling • Learning loss accounts for uncertainty • Natural data augmentation with bootstrap resampling
  • 13-15. Novelties & Improvements • Unsupervised feature learning → leverage large corpus of unlabelled light curves • Transfer learning appears to work • Learning scales linearly with the number of training examples
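Two of the bullets above, a loss that accounts for measurement uncertainty and augmentation by bootstrap resampling, can be illustrated with a short NumPy sketch. Both functions are assumptions about what the slides mean: the loss is taken to be a mean squared residual weighted by 1/sigma², and the bootstrap is taken to mean drawing observation indices with replacement and keeping the unique ones in time order. The function names are hypothetical.

```python
import numpy as np

def weighted_mse(values, reconstruction, sigma):
    """Mean squared residual with each point weighted by 1/sigma**2 (assumed form)."""
    values, reconstruction, sigma = (
        np.asarray(a, dtype=float) for a in (values, reconstruction, sigma)
    )
    weights = 1.0 / sigma**2
    return float(np.mean(weights * (values - reconstruction) ** 2))

def bootstrap_light_curve(t, m, sigma, rng):
    """Resample one light curve: draw indices with replacement, keep unique ones.

    np.unique returns sorted indices, so a time-ordered input stays time-ordered.
    """
    n = len(t)
    idx = np.unique(rng.integers(0, n, size=n))
    return t[idx], m[idx], sigma[idx]

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, size=50))        # irregular observation times
m = np.sin(2 * np.pi * t / 7.0) + rng.normal(0.0, 0.1, size=50)
sigma = np.full(50, 0.1)                             # per-point uncertainties

t_b, m_b, sigma_b = bootstrap_light_curve(t, m, sigma, rng)
toy_loss = weighted_mse([1.0, 2.0], [0.0, 0.0], [1.0, 2.0])
```

Because the network takes irregular sampling as given, each resampled curve is as valid an input as the original, which is what makes this kind of augmentation natural here.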
  • 16. Extensions/Active Research • Anomaly detection (on the bottleneck features) • Hyperspectral topology [Figure: UMAP applied to L2-normed autoencoder for MNIST] Ellie Schwab Abrahams; also with Sara Jamal
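The slide does not say which method is used for anomaly detection on the bottleneck features. As one simple possibility, the sketch below scores each source by the distance to its k-th nearest neighbour in bottleneck space. The technique, the function name, and the synthetic data with one planted outlier are all assumptions chosen for illustration.

```python
import numpy as np

def knn_anomaly_score(B, k=5):
    """Euclidean distance from each row of B to its k-th nearest other row."""
    B = np.asarray(B, dtype=float)
    diff = B[:, None, :] - B[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))   # (n, n) pairwise distances
    dist_sorted = np.sort(dist, axis=1)      # column 0 is the self-distance (0)
    return dist_sorted[:, k]

rng = np.random.default_rng(1)
B = rng.normal(size=(200, 64))   # synthetic stand-in for bottleneck vectors
B[0] += 10.0                     # plant one outlier far from the rest

scores = knn_anomaly_score(B, k=5)
most_anomalous = int(np.argmax(scores))
```

The pairwise-distance array grows quadratically with the number of sources, so this form only suits small samples; a tree-based neighbour search would be the usual replacement at survey scale.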
  • 17. Extensions/Active Research • New layer types: explore Temporal Convnets (TCNs) • Co-training across surveys • Semi-supervised topology + metadata: Loss ~ L_ts + λ L_class [Diagram labels: Source Metadata; Source Time series; LSTM; Bottleneck; LSTM; Time series Reconstruction (Unsupervised); FC; Classification (Supervised)] Ellie Schwab Abrahams; also with Sara Jamal
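The semi-supervised objective on slide 17, Loss ~ L_ts + λ L_class, can be written out as a small NumPy sketch. The slide gives only the overall form, so the specific terms are assumptions: L_ts is taken as a mean squared reconstruction error, L_class as a cross-entropy over labelled sources only, and a label of -1 is a convention invented here to mark unlabelled sources.

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    """L_ts: mean squared error between time series and reconstruction (assumed)."""
    return float(np.mean((np.asarray(x, dtype=float) - np.asarray(x_hat, dtype=float)) ** 2))

def cross_entropy(probs, labels):
    """L_class: mean negative log-probability assigned to the true class (assumed)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))

def combined_loss(x, x_hat, probs, labels, lam=1.0):
    """Loss = L_ts + lam * L_class, with L_class computed on labelled sources only."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    labelled = labels >= 0                      # -1 marks an unlabelled source
    l_ts = reconstruction_loss(x, x_hat)
    l_class = cross_entropy(probs[labelled], labels[labelled]) if labelled.any() else 0.0
    return l_ts + lam * l_class

x = np.array([[1.0, 2.0], [3.0, 4.0]])          # two toy time series
x_hat = x + 0.5                                 # reconstructions, off by 0.5 everywhere
probs = np.array([[0.5, 0.5], [0.9, 0.1]])      # classifier outputs
labels = np.array([0, -1])                      # second source is unlabelled

total = combined_loss(x, x_hat, probs, labels, lam=2.0)
unsupervised_only = combined_loss(x, x_hat, probs, np.array([-1, -1]), lam=2.0)
```

Every source contributes to the reconstruction term, while only labelled sources contribute to the classification term, so λ sets how strongly the few labels shape the bottleneck.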
  • 18. Thanks! Josh Bloom, UC Berkeley Astronomy (@profjsb)
  • 20. Challenge: classification on large sets. 50k variables, 810 with known labels (time series, colors). Richards+11, 12