SlideShare a Scribd company logo
BigSkyEarth II:
Feature extraction from time-series data
Ashish Mahabal
Center for Data-Driven Discovery, Caltech
5 April 2016
Ashish Mahabal - BSE II
Transformations in astronomy
Shallow to deep; Small to large; Sporadic to repeated; more wavelengths
Move towards
digital movies!
Finding
transients
Doing new
science2014-06-16 Ashish Mahabal
3
Ashish Mahabal - BSE II
Something that has a large brightness
change (delta-magnitude) within a short
timespan (small delta-time)
What is a transient?
4
Ashish Mahabal - BSE II
Phenomenological variety
Flare Star Dwarf Nova Blazar
Catalina Real-time Transient Survey (CRTS)
5
Ashish Mahabal - BSE II
Supernova from SN Hunt
6
Ashish Mahabal - BSE II
Binary Black-holes
PG 1302-102
Graham et al.
7
Ashish Mahabal - BSE II
Variability Tree
8
Ashish Mahabal - BSE II
irregular light curves●●●●
●
● ●
●
●
●
16
17
18
19
20
55250 55500 55750 56000
Mean Julian Date
Magnitude
9
Ashish Mahabal - BSE II
GPR fitted to light curves
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
19
20
21
0 1000 2000
Day
Magnitude
AGN
●
●
●
●
●
●
●
●
●
19.0
19.5
20.0
20.5
21.0
0 1000 2000
Day
Magnitude
SN
●
●
●
●
●●●●
●●●●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●●
●●
● ●●
●●
●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●●
●●
14
16
18
20
0 1000 2000
Day
Magnitude Flare
●●●
● ●●●●●●●● ●
●
●● ●●
●
● ●●●●
●
●●●●●
●●●●●●●●●● ●●
●
● ● ●●
●●●●●●●●
16
17
18
19
20
21
0 1000 2000
Day
Magnitude
Non Transient
10
Ashish Mahabal - BSE II
A
separate
Kepler
AGN
CRTS
AGN
Juxtaposition of surveys to get training sets for newer
ones
11
Ashish Mahabal - BSE II
Regular versus irregular
• Financial time series and predictions
• Variety of tools
• Statistical features
12
Ashish Mahabal - BSE II
Challenge: A Variety of Parameters
•  Discovery: magnitudes, delta-magnitudes
•  Contextual:
–  Distance to nearest star
–  Magnitude of the star
–  Color of that star
–  Normalized distance to nearest galaxy
–  Distance to nearest radio source
–  Flux of nearest radio source
–  Galactic latitude
•  Follow-up
–  Colors (g-r, r-I, i-z etc.)
•  Prior classifications (event type)
•  Characteristics from light-curve
–  Amplitude
–  Median buffer range percentage
–  Standard deviation
–  Stetson k
–  Flux percentile ratio mid80
–  Prior outburst statistic
Not all parameters are
always present leading to
swiss-cheese like data
sets
http://ki-media.blogspot.com/
Measures from Feigelson and Babu
(Graham)
New lightcurve-based parameters:
(Faraway)
• Whole curve measures
• Fitted curve measures
• Residual from fit measures
• Cluster measures
• Other
13
Ashish Mahabal - BSE II
beyond1std
skew
Amplitude
freq_signif
freq_varrat
freq_y_offset
freq_model_max_delta_mag
freq_model_min_delta_mag
freq_model_phi1_phi2
freq_rrd
freq_n_alias
flux_%_mid20
flux_%_mid35
flux_%_mid50
flux_%_mid65
flux_%_mid80
linear_trend
max_slope
MAD
median_buffer_range_percentage
pair_slope_trend
percent_amplitude
percent_difference_flux_percentile
QSO
non_QSO
std
small_kurtosis
stetson_j
stetson_k
scatter_res_raw
p2p_scatter_2praw
p2p_scatter_over_mad
p2p_scatter_pfold_over_mad
medperc90_p2_p
fold_2p_slope_10%
fold_2p_slope_90%
p2p_ssqr_diff_over_var
Many features
- not all are independent
Adam Miller
15 Jan 2015 Ashish Mahabal 20
14
Ashish Mahabal - BSE II
Importance of cadence
• Cadence wars with the LSST
• Asteroids versus fast transients versus periodic
objects versus cosmology
• Constraints due to promises made
• Mechanical constraints
15
16
Ashish Mahabal - BSE II
continuing cadence meetings
Sensitivity (visit? coadd?) by filter (especially u and g), needed for several
(many? all?) variable types
Phased uniformity (periodic variables): for a given period how uniformly
would the lightcurve be sampled?*
Window function (per filter/all filters) FWHM, ...
statistics of revisit time histogram (per filter/all filters) e.g. min/max/
median/5th & 95th percentiles
Hour angle distribution (to check aliasing), at a given sky position,
maximum difference, rms ...
17
Ashish Mahabal - BSE II
Follow-up a huge issue
• Depth is a big difference: faint sources
• Characterization/ML needed: part of it only from LSST
• A lot of LSST science may get done before LSST
• Synergy with GMT/TMT
• LSST as a follow-up device in the days of aLIGO
18
Ashish Mahabal - BSE II
Optimization more than in Tzolk’in
Rohit Gawande
Victory points == science
Temples
Technologies
Currency
Buildings
Resources
Large number of variables
and each player wants to
win.
19
Ashish Mahabal - BSE II
Optimizing is (generally) a zero-sum game
Easy to make the survey “greatest” in one science
Optimization means compromise
BUT, the sum of parts is GREATER than the whole i.e.
compromise does NOT mean sacrifice
In other words, the players are NOT playing AGAINST each other
It’s the best middle ground we are seeking
LSST is its own follow-up machine in a proactive way.
By coming up with a good cadence we can minimize the follow-up needed.
And you can help. And get the science you love done in the process.
20
Ashish Mahabal - BSE II
CRTS light-curves
• 500M, most with hundreds of points over 10 years
• Most are non-variable (as a rule)
• Processing “done” for all SSS (100M+)
• Great training set for many procedures
21
Ashish Mahabal - BSE II
Caltech service
• Not for mass-computation: http://nirgun.caltech.edu:
8000/scripts/description.html
• Mass processing: xsede, IUCAA, other …
22
Ashish Mahabal - BSE II
Faraway, Mahabal et al. methods
Recursive Partitioning
Param n
Type n
Numbers/
names not
for
reading J Faraway
10/17/2014 Ashish Mahabal, IACS 58
23
Ashish Mahabal - BSE II
domain knowledge • Peakiness - SNe separator
• Period finding variations
• Detailed features for a subset
Non-SNe (1) SNe
(2)
1
2
2
2
1
1
Using 900 non-SNe and 600
SNe
80-90% completeness
using just these
parameters
Mahabal, Ball
24
Ashish Mahabal - BSE II
dimensionality reduction
Feature selection strategies
Donalek et al. arXiv:1310.1976
•  Fast Relief Algorithm (wt and
threshold)
•  Fisher Discriminant Ratio
•  Correlation based Feature
Selection
•  Fast Correlation Based Filter
•  Multi Class Feature Selection
25
Ashish Mahabal - BSE II
Using features for clustering
• Using similarity of features to seed larger unknown
sets
• Exercise based on that …
• Deep learning?
Open
questions
26
Ashish Mahabal - BSE II
Augmenting light-curves
with newer points
• Millions of sources observed each night
• Appending those points to all those light-curves
• Recomputing stats and follow-up characterizations
Open
questions
27
Ashish Mahabal - BSE II
Summary
• Astronomical time-series tend to be more gappy,
heteroskedastic, diverse
• A great variety of statistical and domain-based
features extractable
• Many challenges and interesting problems remain
28

More Related Content

Similar to 06 ashish mahabal bse2

Two stage deep learning for accelerated 3 d time-of-flight mra without matche...
Two stage deep learning for accelerated 3 d time-of-flight mra without matche...Two stage deep learning for accelerated 3 d time-of-flight mra without matche...
Two stage deep learning for accelerated 3 d time-of-flight mra without matche...
Chung Hyung Jin
 
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Databricks
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain
Kan Yuenyong
 
ChIP-seq - Data processing
ChIP-seq - Data processingChIP-seq - Data processing
ChIP-seq - Data processing
Sebastian Schmeier
 
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
ActiveEon
 
J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...
J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...
J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...
Jimmy Shih-Chun Hung
 
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
IRJET Journal
 
Rethinking garbage collection
Rethinking garbage collectionRethinking garbage collection
Rethinking garbage collection
JUGBD
 
Rethinking garbage collection
Rethinking garbage collectionRethinking garbage collection
Rethinking garbage collection
Bazlur Rokon
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
t_ivanov
 
23AFMC_Beamer.pdf
23AFMC_Beamer.pdf23AFMC_Beamer.pdf
23AFMC_Beamer.pdf
LorenzoCampoli1
 
Morgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 distMorgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 dist
ddm314
 
NGS Pipeline Preparation - Tools Selection
NGS Pipeline Preparation - Tools SelectionNGS Pipeline Preparation - Tools Selection
NGS Pipeline Preparation - Tools Selection
Minesh A. Jethva
 
Practical Digital Image Processing 4
Practical Digital Image Processing 4Practical Digital Image Processing 4
Practical Digital Image Processing 4
Aly Abdelkareem
 
HACC: Fitting the Universe Inside a Supercomputer
HACC: Fitting the Universe Inside a SupercomputerHACC: Fitting the Universe Inside a Supercomputer
HACC: Fitting the Universe Inside a Supercomputer
inside-BigData.com
 
Data-driven methods for the initialization of full-waveform inversion
Data-driven methods for the initialization of full-waveform inversionData-driven methods for the initialization of full-waveform inversion
Data-driven methods for the initialization of full-waveform inversion
Oleg Ovcharenko
 

Similar to 06 ashish mahabal bse2 (16)

Two stage deep learning for accelerated 3 d time-of-flight mra without matche...
Two stage deep learning for accelerated 3 d time-of-flight mra without matche...Two stage deep learning for accelerated 3 d time-of-flight mra without matche...
Two stage deep learning for accelerated 3 d time-of-flight mra without matche...
 
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain
 
ChIP-seq - Data processing
ChIP-seq - Data processingChIP-seq - Data processing
ChIP-seq - Data processing
 
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
 
J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...
J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...
J07.00011 : Superconducting Parametric Cavities as an “Optical” Quantum Compu...
 
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
 
Rethinking garbage collection
Rethinking garbage collectionRethinking garbage collection
Rethinking garbage collection
 
Rethinking garbage collection
Rethinking garbage collectionRethinking garbage collection
Rethinking garbage collection
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
 
23AFMC_Beamer.pdf
23AFMC_Beamer.pdf23AFMC_Beamer.pdf
23AFMC_Beamer.pdf
 
Morgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 distMorgan uw maGIV v1.3 dist
Morgan uw maGIV v1.3 dist
 
NGS Pipeline Preparation - Tools Selection
NGS Pipeline Preparation - Tools SelectionNGS Pipeline Preparation - Tools Selection
NGS Pipeline Preparation - Tools Selection
 
Practical Digital Image Processing 4
Practical Digital Image Processing 4Practical Digital Image Processing 4
Practical Digital Image Processing 4
 
HACC: Fitting the Universe Inside a Supercomputer
HACC: Fitting the Universe Inside a SupercomputerHACC: Fitting the Universe Inside a Supercomputer
HACC: Fitting the Universe Inside a Supercomputer
 
Data-driven methods for the initialization of full-waveform inversion
Data-driven methods for the initialization of full-waveform inversionData-driven methods for the initialization of full-waveform inversion
Data-driven methods for the initialization of full-waveform inversion
 

Recently uploaded

Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
Aditi Bajpai
 

Recently uploaded (20)

Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 

06 ashish mahabal bse2

  • 1. BigSkyEarth II: Feature extraction from time-series data Ashish Mahabal Center for Data-Driven Discovery, Caltech 5 April 2016
  • 2. Ashish Mahabal - BSE II Transformations in astronomy Shallow to deep; Small to large; Sporadic to repeated; more wavelengths Move towards digital movies! Finding transients Doing new science2014-06-16 Ashish Mahabal 3
  • 3. Ashish Mahabal - BSE II Something that has a large brightness change (delta-magnitude) within a short timespan (small delta-time) What is a transient? 4
  • 4. Ashish Mahabal - BSE II Phenomenological variety Flare Star Dwarf Nova Blazar Catalina Real-time Transient Survey (CRTS) 5
  • 5. Ashish Mahabal - BSE II Supernova from SN Hunt 6
  • 6. Ashish Mahabal - BSE II Binary Black-holes PG 1302-102 Graham et al. 7
  • 7. Ashish Mahabal - BSE II Variability Tree 8
  • 8. Ashish Mahabal - BSE II irregular light curves●●●● ● ● ● ● ● ● 16 17 18 19 20 55250 55500 55750 56000 Mean Julian Date Magnitude 9
  • 9. Ashish Mahabal - BSE II GPR fitted to light curves ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 19 20 21 0 1000 2000 Day Magnitude AGN ● ● ● ● ● ● ● ● ● 19.0 19.5 20.0 20.5 21.0 0 1000 2000 Day Magnitude SN ● ● ● ● ●●●● ●●●● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●●● ● ● ● ●● ●● ● ●● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ●●● ●● ●● 14 16 18 20 0 1000 2000 Day Magnitude Flare ●●● ● ●●●●●●●● ● ● ●● ●● ● ● ●●●● ● ●●●●● ●●●●●●●●●● ●● ● ● ● ●● ●●●●●●●● 16 17 18 19 20 21 0 1000 2000 Day Magnitude Non Transient 10
  • 10. Ashish Mahabal - BSE II A separate Kepler AGN CRTS AGN Juxtaposition of surveys to get training sets for newer ones 11
  • 11. Ashish Mahabal - BSE II Regular versus irregular • Financial time series and predictions • Variety of tools • Statistical features 12
  • 12. Ashish Mahabal - BSE II Challenge: A Variety of Parameters •  Discovery: magnitudes, delta-magnitudes •  Contextual: –  Distance to nearest star –  Magnitude of the star –  Color of that star –  Normalized distance to nearest galaxy –  Distance to nearest radio source –  Flux of nearest radio source –  Galactic latitude •  Follow-up –  Colors (g-r, r-I, i-z etc.) •  Prior classifications (event type) •  Characteristics from light-curve –  Amplitude –  Median buffer range percentage –  Standard deviation –  Stetson k –  Flux percentile ratio mid80 –  Prior outburst statistic Not all parameters are always present leading to swiss-cheese like data sets http://ki-media.blogspot.com/ Measures from Feigelson and Babu (Graham) New lightcurve-based parameters: (Faraway) • Whole curve measures • Fitted curve measures • Residual from fit measures • Cluster measures • Other 13
  • 13. Ashish Mahabal - BSE II beyond1std skew Amplitude freq_signif freq_varrat freq_y_offset freq_model_max_delta_mag freq_model_min_delta_mag freq_model_phi1_phi2 freq_rrd freq_n_alias flux_%_mid20 flux_%_mid35 flux_%_mid50 flux_%_mid65 flux_%_mid80 linear_trend max_slope MAD median_buffer_range_percentage pair_slope_trend percent_amplitude percent_difference_flux_percentile QSO non_QSO std small_kurtosis stetson_j stetson_k scatter_res_raw p2p_scatter_2praw p2p_scatter_over_mad p2p_scatter_pfold_over_mad medperc90_p2_p fold_2p_slope_10% fold_2p_slope_90% p2p_ssqr_diff_over_var Many features - not all are independent Adam Miller 15 Jan 2015 Ashish Mahabal 20 14
  • 14. Ashish Mahabal - BSE II Importance of cadence • Cadence wars with the LSST • Asteroids versus fast transients versus periodic objects versus cosmology • Constraints due to promises made • Mechanical constraints 15
  • 15. 16
  • 16. Ashish Mahabal - BSE II continuing cadence meetings Sensitivity (visit? coadd?) by filter (especially u and g), needed for several (many? all?) variable types Phased uniformity (periodic variables): for a given period how uniformly would the lightcurve be sampled?* Window function (per filter/all filters) FWHM, ... statistics of revisit time histogram (per filter/all filters) e.g. min/max/ median/5th & 95th percentiles Hour angle distribution (to check aliasing), at a given sky position, maximum difference, rms ... 17
  • 17. Ashish Mahabal - BSE II Follow-up a huge issue • Depth is a big difference: faint sources • Characterization/ML needed: part of it only from LSST • A lot of LSST science may get done before LSST • Synergy with GMT/TMT • LSST as a follow-up device in the days of aLIGO 18
  • 18. Ashish Mahabal - BSE II Optimization more than in Tzolk’in Rohit Gawande Victory points == science Temples Technologies Currency Buildings Resources Large number of variables and each player wants to win. 19
  • 19. Ashish Mahabal - BSE II Optimizing is (generally) a zero-sum game Easy to make the survey “greatest” in one science Optimization means compromise BUT, the sum of parts is GREATER than the whole i.e. compromise does NOT mean sacrifice In other words, the players are NOT playing AGAINST each other It’s the best middle ground we are seeking LSST is its own follow-up machine in a proactive way. By coming up with a good cadence we can minimize the follow-up needed. And you can help. And get the science you love done in the process. 20
  • 20. Ashish Mahabal - BSE II CRTS light-curves • 500M, most with hundreds of points over 10 years • Most are non-variable (as a rule) • Processing “done” for all SSS (100M+) • Great training set for many procedures 21
  • 21. Ashish Mahabal - BSE II Caltech service • Not for mass-computation: http://nirgun.caltech.edu: 8000/scripts/description.html • Mass processing: xsede, IUCAA, other … 22
  • 22. Ashish Mahabal - BSE II Faraway, Mahabal et al. methods Recursive Partitioning Param n Type n Numbers/ names not for reading J Faraway 10/17/2014 Ashish Mahabal, IACS 58 23
  • 23. Ashish Mahabal - BSE II domain knowledge • Peakiness - SNe separator • Period finding variations • Detailed features for a subset Non-SNe (1) SNe (2) 1 2 2 2 1 1 Using 900 non-SNe and 600 SNe 80-90% completeness using just these parameters Mahabal, Ball 24
  • 24. Ashish Mahabal - BSE II dimensionality reduction Feature selection strategies Donalek et al. arXiv:1310.1976 •  Fast Relief Algorithm (wt and threshold) •  Fisher Discriminant Ratio •  Correlation based Feature Selection •  Fast Correlation Based Filter •  Multi Class Feature Selection 25
  • 25. Ashish Mahabal - BSE II Using features for clustering • Using similarity of features to seed larger unknown sets • Exercise based on that … • Deep learning? Open questions 26
  • 26. Ashish Mahabal - BSE II Augmenting light-curves with newer points • Millions of sources observed each night • Appending those points to all those light-curves • Recomputing stats and follow-up characterizations Open questions 27
  • 27. Ashish Mahabal - BSE II Summary • Astronomical time-series tend to be more gappy, heteroskedastic, diverse • A great variety of statistical and domain-based features extractable • Many challenges and interesting problems remain 28