Pablo A. Estévez, DIE, Universidad de Chile
Joint work with
Pavlos Protopapas, Harvard University
Pablo Zegers, Universidad de los Andes, Chile
Pablo Huijse, PhD Student, Universidad de Chile
Jose C. Principe, University of Florida, Gainesville



  ASTRONOMICAL TIME SERIES
  ANALYSIS USING INFORMATION
  THEORETIC LEARNING

                        Workshop on CI Challenges, September 2012
Astronomical Time Series: Light Curves

   Light Curve: Stellar brightness (magnitude or flux)
    versus time.
   Variable stars: stars whose luminosity varies over
    time (3% of the stars in the universe are variables,
    and 1% are periodic variable stars)
   Light Curve Analysis: Useful for period detection,
    event detection, stellar classification, extrasolar
    planet discovery, distance measurement, etc.
An Example of a Light Curve
Challenges
   Light curves are unevenly spaced or irregularly
    sampled, with gaps of different sizes. This is due to:
     Time constraints on the observation time
     Day-night cycle, weather conditions

     Equipment operability

   Light curves are noisy due to photometric errors,
    atmospheric and sky background
   Astronomical surveys generate tens of millions of
    light curves. The light curve generation rate will keep
    growing in the coming years.
Variable stars




Eclipsing binary stars   Pulsating star
Problem Statement

 Discriminate periodic versus non-periodic light
  curves in astronomical survey databases
 Estimate the underlying period of periodic light

  curves.
 Goal: To develop an automated method for

  periodicity detection and period estimation based on
  information theoretic learning.
Information theoretic learning (ITL)
   Apply concepts of information theory such as entropy
    and mutual information to machine learning
   Renyi’s quadratic entropy, with Gaussian kernel




     Renyi’s entropy is a generalization of Shannon’s entropy
     Information potential (IP): the argument of the logarithm

   CORRENTROPY (Generalized Correlation): It measures
    similarity between feature vectors separated by a
    certain time delay
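For reference, a sketch of the standard ITL estimators these bullets refer to. The notation is assumed here rather than taken from the slide: $x_i$ are the $N$ samples, $G_\sigma$ is a Gaussian kernel of width $\sigma$, and $\tau$ is the time delay.

```latex
% Renyi's quadratic entropy estimated with a Gaussian kernel;
% the information potential (IP) is the argument of the logarithm.
\hat{H}_2(X) = -\log \mathrm{IP}(X),
\qquad
\mathrm{IP}(X) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} G_{\sigma\sqrt{2}}(x_i - x_j)

% Correntropy between samples of a time series separated by a delay \tau.
V(\tau) = \mathbb{E}\left[ G_\sigma(x_t - x_{t-\tau}) \right]
```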
Proposed discrimination metric
   It combines correntropy (generalized correlation)
    with a periodic kernel
   The periodic kernel measures similarity among
    samples separated by a given period
   The new metric provides a periodogram, whose
    peaks are associated with the fundamental
    frequencies present in the data
   It is computed directly from the available samples
   Correntropy Kernelized Periodogram (CKP)
Correntropy Kernelized Periodogram

   Synthetic data example: sin(2 pi t /P) + noise in time and
    magnitude
     Noise in time simulates uneven sampling
     True period 2.456 days

     The CKP reaches a global maximum at the corresponding
      true period (left figure).
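The synthetic experiment can be imitated with a simplified periodogram in the spirit of the CKP. This is only a sketch under stated assumptions: the exact CKP definition is in the cited papers, not on the slide, and the kernel widths `sigma` and `sigma_t`, the sample count, the noise level, and the period grid are all illustrative choices. Similarity in magnitude is measured with a Gaussian kernel, centered by its mean over all pairs (an estimate of the information potential), and weighted by a periodic kernel in time.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian similarity between magnitude differences."""
    return np.exp(-0.5 * (d / sigma) ** 2)

def periodic_kernel(dt, period, sigma_t):
    """Close to 1 when dt is near an integer multiple of the trial period."""
    return np.exp(-2.0 * np.sin(np.pi * dt / period) ** 2 / sigma_t ** 2)

def ckp_sketch(t, y, periods, sigma=0.5, sigma_t=0.3):
    """Simplified correntropy-style periodogram (not the published CKP)."""
    i, j = np.triu_indices(len(t), k=1)      # all distinct sample pairs
    dt = t[i] - t[j]
    similarity = gaussian_kernel(y[i] - y[j], sigma)
    centered = similarity - similarity.mean()  # remove the pairwise mean
    return np.array([np.mean(centered * periodic_kernel(dt, p, sigma_t))
                     for p in periods])

# Synthetic light curve: sine with random (uneven) sampling times and
# Gaussian noise in magnitude. All settings below are assumptions.
rng = np.random.default_rng(0)
true_period = 2.456                               # days, as on the slide
t = np.sort(rng.uniform(0.0, 60.0, size=150))     # uneven sampling
y = np.sin(2 * np.pi * t / true_period) + 0.2 * rng.standard_normal(t.size)

periods = np.arange(1.5, 4.0, 0.002)              # trial periods in days
scores = ckp_sketch(t, y, periods)
best_period = periods[np.argmax(scores)]
print(f"estimated period: {best_period:.3f} days")
```

The grid deliberately excludes half and double the true period, since multiples and submultiples of the period can also produce peaks in periodograms of this kind.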
Statistical Test Using CKP

[Figure: degree-of-confidence thresholds at 99% and 90%]
Receiver Operating Characteristic
ROC curves for CKP and alternative methods: the Lomb-Scargle (LS) periodogram
and the Analysis of Variance (AoV) periodogram. Dataset: 750 periodic light
curves and 1500 aperiodic light curves from the MACHO survey. Due to the natural
class imbalance, very low false positive rates are required (0.1%).
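Reading one operating point off a ROC curve at a fixed false positive budget can be sketched as follows. The function name `tpr_at_fpr` and the toy scores are hypothetical; they stand in for any periodicity score such as a periodogram peak value, and are not data from the MACHO experiment.

```python
import numpy as np

def tpr_at_fpr(scores_periodic, scores_aperiodic, max_fpr):
    """True positive rate at the loosest threshold whose FPR <= max_fpr.

    A light curve is called periodic when its score exceeds the threshold.
    """
    neg = np.sort(np.asarray(scores_aperiodic, dtype=float))[::-1]
    pos = np.asarray(scores_periodic, dtype=float)
    # Number of aperiodic curves allowed above the threshold.
    allowed = int(np.floor(max_fpr * neg.size + 1e-12))
    threshold = neg[allowed] if allowed < neg.size else -np.inf
    tpr = float(np.mean(pos > threshold))
    fpr = float(np.mean(neg > threshold))
    return tpr, fpr, threshold

# Toy scores (illustrative only): 1000 aperiodic and 4 periodic curves.
toy_aperiodic = np.arange(1000) / 1000.0
toy_periodic = np.array([0.9985, 0.9995, 0.5, 0.9999])
toy_tpr, toy_fpr, toy_threshold = tpr_at_fpr(toy_periodic, toy_aperiodic, 0.001)
print(toy_tpr, toy_fpr, toy_threshold)
```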
EROS Survey
   Survey of the Magellanic Clouds and the Galactic bulge
   Data taken from ESO Observatory, in La Silla, Chile
   EROS main goal: search for the dark matter of the
    Galactic halo
   EROS survey is a goldmine for stellar variability studies:
    Cepheids, RR-Lyrae, Eclipsing Binaries, and Supernovas.
   Each EROS field has ~17,300 light curves.
   There are 88x32 fields of the Large Magellanic Cloud
    (LMC), i.e.
     48,744,522 light curves => 48.7M light curves
Computational Time Requirements for EROS Survey
   Computational time measured using NVIDIA Tesla C2070
    GPU (448 cores)
   Sweeping 20000 trial periods with CKP, the total time
    per light curve (~650 samples): 1.5 [s]
   For 48.7M light curves: ~845 days!
   Evaluating 600 precomputed trial periods (by using
    correntropy and other methods) and optimizing the code:
    0.2 [s] per light curve
   For 48.7M light curves: ~113 days!
NCSA Dell/NVIDIA Cluster: FORGE
   National Center for Supercomputing Applications (NCSA)
    at the University of Illinois at Urbana-Champaign
   We are using a queue with 12 machines each having 8
    cores with Tesla C2070 GPUs => 96 GPUs
   Computing eight EROS fields using a machine with 8 cores
    takes 1 hour
   So far we have processed 1.2M light curves in 40 mins
   At this rate for computing 48.7M light curves: ~30 hours!
   FORGE has 44 machines with 288 GPUs in total. Using
    the whole cluster we might process 1 BILLION light curves
    in 10 days
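The timing figures on the last two slides can be checked with a few lines of arithmetic. The sketch below uses only the quoted numbers and assumes that throughput scales linearly with the number of GPUs, which the slides do not state explicitly.

```python
SECONDS_PER_DAY = 86_400
N_CURVES = 48.7e6  # light curves in the LMC fields

def days_on_single_gpu(seconds_per_curve, n_curves=N_CURVES):
    """Total processing time in days at a fixed cost per light curve."""
    return seconds_per_curve * n_curves / SECONDS_PER_DAY

full_sweep_days = days_on_single_gpu(1.5)  # 20000 trial periods
reduced_days = days_on_single_gpu(0.2)     # 600 precomputed trial periods

# FORGE queue: 1.2M light curves in 40 minutes on 96 GPUs.
curves_per_min_96 = 1.2e6 / 40
hours_for_lmc = N_CURVES / curves_per_min_96 / 60

# Whole cluster (288 GPUs), assuming linear scaling with GPU count.
curves_per_min_288 = curves_per_min_96 * 288 / 96
days_for_billion = 1e9 / curves_per_min_288 / (60 * 24)

print(f"{full_sweep_days:.0f} days, {reduced_days:.0f} days, "
      f"{hours_for_lmc:.1f} hours, {days_for_billion:.1f} days")
```

Under these assumptions the quoted ~845 and ~113 days are reproduced, the 96-GPU estimate comes out near 27 hours (rounded up to ~30 on the slide), and a billion light curves would take roughly 8 days, in line with the "10 days" estimate.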
Conclusions & Future Work
   A framework for light curve analysis based on ITL
    and kernel methods has been introduced.
   CKP allows discriminating between periodic and
    non-periodic light curves with high accuracy and
    low number of false positives.
   Required: Efficient computation of ITL based
    methods
   Challenge: Applying our methods to large
    untested astronomical databases.
ALMA Site in Northern Chile
THE END
   P.Huijse, P. Estevez, P. Zegers, P. Protopapas, J.
    Principe, “Period Estimation in Astronomical Time
    Series using Slotted Correntropy”, IEEE Signal
    Processing Letters, Vol. 18, n°6, pp. 371-374,
    2011.
   P.Huijse, P. Estevez, P. Protopapas, P. Zegers, J.
    Principe, “An Information Theoretic Algorithm for
    Finding Periodicities in Stellar Light Curves”, IEEE
    Transactions on Signal Processing, Vol. 60, n°10, pp.
    5135-5145, 2012.
Computational Intelligence Applied to
       Time Series Analysis

                  Pablo A. Estévez
        Department of Electrical Engineering
                 University of Chile


              University of Cyprus, Cyprus
                 September 14, 2012
Outline

                    First Topic
   Introduction to Self-Organizing Maps (SOM)
   SOMs for temporal sequence processing
   Short-term Gamma Memories
   Experimental Results
   Conclusions
                   Second Topic
   Analysis of Astronomical Time Series
   Information Theoretic Learning Approach
Kohonen’s Map

   Self-Organizing Feature Map (SOM)
   Unsupervised Neural Networks
   Vector Quantization of Feature Space
   Topological Ordered Mapping
   Main Applications:
       Dimensionality reduction
       Visualization of high-dimensional data in 2D or 3D maps
       Clustering
       Knowledge discovery in Databases
Topological Ordered Map

   SOM defines a fixed grid in output space
   Each node in the output grid is associated with
    a prototype (codebook) vector in input space
   Neighborhood is measured in the output space
   This neighborhood is used for updating
    codebook vectors in input space
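One training step of this scheme can be sketched as below. The Gaussian neighborhood function, the learning rate `lr`, and the `radius` are common choices assumed for illustration; the slide does not specify them.

```python
import numpy as np

def som_step(codebook, grid, x, lr=0.5, radius=1.0):
    """Update all codebook vectors in place for one input x; return the BMU index."""
    # Best matching unit: closest codebook vector in INPUT space.
    bmu = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
    # Neighborhood strength: measured on the fixed grid in OUTPUT space.
    grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
    h = np.exp(-grid_dist2 / (2.0 * radius ** 2))
    # Move each codebook vector toward x, scaled by its neighborhood strength.
    codebook += lr * h[:, None] * (x - codebook)
    return bmu

# 3x3 output grid with 2-D inputs (toy example).
grid = np.array([(r, c) for r in range(3) for c in range(3)], dtype=float)
codebook = np.zeros((9, 2))
bmu = som_step(codebook, grid, np.array([1.0, 1.0]))
print(bmu, codebook[0], codebook[8])
```

Nodes close to the winner on the grid move the most, which is what produces the topologically ordered map.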
Example: Kohonen’s Map in 2D




   It uses a 2D output grid for visualization of high-dimensional
    data
Example of Neural Gas




Connections are created between the best matching unit and the second closest unit.
Connections age and are eventually removed if they are not refreshed.
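The connection rule can be sketched as follows. Which edges age at each step (here, those attached to the winner) and the `max_age` value are assumptions borrowed from the usual growing-neural-gas formulation; the slide does not give these details.

```python
import numpy as np

def update_edges(units, edges, x, max_age=3):
    """Refresh the winner/runner-up edge, age the winner's other edges, prune old ones.

    edges maps frozenset({a, b}) -> age and is modified in place.
    """
    order = np.argsort(np.linalg.norm(units - x, axis=1))
    bmu, second = int(order[0]), int(order[1])
    # Age every existing edge attached to the best matching unit.
    for edge in list(edges):
        if bmu in edge:
            edges[edge] += 1
    # Create the edge between the two closest units, or reset its age.
    edges[frozenset((bmu, second))] = 0
    # Remove edges that have not been refreshed for too long.
    for edge in [e for e, age in edges.items() if age > max_age]:
        del edges[edge]
    return bmu, second

# Toy example: the stale edge 0-2 is dropped, the edge 0-1 is created.
units = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
edges = {frozenset((0, 2)): 3}
winners = update_edges(units, edges, np.array([0.2, 0.0]))
print(winners, edges)
```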
SOMs for temporal data processing

   Several recent extensions of SOM for processing
    data sequences that are temporally or spatially
    connected
       For example: words, DNA sequences, time series
   Models differ on the notion of context, i.e. the
    way they store sequences
   Each neuron i is represented by a weight (codebook)
    vector $w^i \in \mathbb{R}^d$ and one or several context
    vectors $c^i \in \mathbb{R}^d$
Gamma Memories

   The Gamma filter is defined in the time domain
    as
            $$y(n) = \sum_{k=1}^{K} w_k \, c_k(n)$$

            $$c_k(n) = \beta \, c_k(n-1) + (1-\beta) \, c_{k-1}(n-1)$$

   where $c_0(n) \equiv x(n)$ is the input signal, $y(n)$ is the
    filter output, and $w_k$, $\beta$ are the filter
    parameters
   Parameter $\beta$ controls the tradeoff between
    depth and resolution of the filter
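The recursion can be sketched in a few lines. The names `weights` (output weights of the K taps) and `beta` (the memory parameter) are assumed here for illustration, and the taps are initialized to zero, which the slide does not specify.

```python
import numpy as np

def gamma_filter(x, weights, beta):
    """Gamma filter: y(n) = sum_k weights[k-1] * c_k(n), k = 1..K.

    Tap recursion: c_k(n) = beta * c_k(n-1) + (1 - beta) * c_{k-1}(n-1),
    with c_0(n) = x(n) and all taps starting at zero.
    """
    weights = np.asarray(weights, dtype=float)
    taps = np.zeros(len(weights) + 1)  # taps[0] is the input tap c_0
    y = np.zeros(len(x))
    for n, x_n in enumerate(x):
        prev = taps.copy()             # tap values at time n - 1
        taps[0] = x_n
        taps[1:] = beta * prev[1:] + (1.0 - beta) * prev[:-1]
        y[n] = weights @ taps[1:]
    return y

# Impulse response of a single-tap filter with beta = 0.5.
impulse_response = gamma_filter([1.0, 0.0, 0.0, 0.0], weights=[1.0], beta=0.5)
print(impulse_response)
```

With `beta = 0` each tap is a pure one-sample delay, so the filter reduces to a tapped delay line. Larger values of `beta` give a longer but more blurred memory.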
Cascade of K-stages

   A recursive rule for context descriptors of order K
    can be constructed

    The K context descriptors are described as

               $$c_k(n) = \beta \, c_k^{I_{n-1}} + (1-\beta) \, c_{k-1}^{I_{n-1}}, \quad k = 1, \dots, K$$

               $$c_0^{I_{n-1}} \equiv w^{I_{n-1}}$$

               $I_{n-1}$: previous winner
Gamma SOM Map
Delay Coordinate Embedding

   Takens' embedding theorem allows us to
    reconstruct the dynamics of an n-dimensional
    state space starting from a one-dimensional time
    series, e.g. a strange attractor.
   To embed a time series, the following delay
    coordinate vector is constructed:
             $$s(t) = \left[ x_i(t),\; x_i(t-\tau),\; \dots,\; x_i(t-(m-1)\tau) \right]$$

   Embedding parameters $(\tau, m)$ are found by
    using ad-hoc methods
       First minimum of the average mutual information ($\tau$)
       False nearest neighbor algorithm ($m$)
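Building the delay coordinate vectors can be sketched as below, assuming a uniformly sampled series and a delay `tau` expressed in samples. Both parameters are passed in directly; choosing them by mutual information or false nearest neighbors is not shown.

```python
import numpy as np

def delay_embed(x, tau, m):
    """Return rows [x(t), x(t - tau), ..., x(t - (m - 1) * tau)] for every valid t."""
    x = np.asarray(x)
    start = (m - 1) * tau  # first t with a complete history
    if start >= len(x):
        raise ValueError("time series too short for this (tau, m)")
    columns = [x[start - k * tau: len(x) - k * tau] for k in range(m)]
    return np.column_stack(columns)

# Toy series 0..5 embedded with tau = 2 and m = 3.
embedded = delay_embed(np.arange(6), tau=2, m=3)
print(embedded)
```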
Gamma Filtering Embedding

   Gamma SOM constructs a Gamma filtered
    embedding, as follows:

                   $$u^i(t) = \left[ w^i(t),\; c_1^i(t),\; \dots,\; c_K^i(t) \right]$$

   where $w^i$ is the weight vector and $c_k^i$ are the contexts
   Embedding parameters are determined by
    sweeping an array of $(\beta, K)$ values
       Find the top 10 combinations of parameters with the lowest
        temporal quantization errors
       Project $u(t)$ to the principal direction by using PCA
       Search for the 1D-PCA projection (allowing for shift delays)
        having maximal mutual information with the original time
        series
Experiments
   Chaotic Lorenz System: state variable x(t)




   NH3-Far Infrared Laser:
       Data set A in the Santa Fe Time Series Competition
Phase Portrait for original Lorenz dataset
   Bicup 2006 challenge time series
Phase Portrait for noisy Lorenz dataset
1D projection of Gamma SOM for noisy Lorenz dataset
2D projection of Gamma SOM for Laser Time Series
Conclusions

   Gamma SOM models can reconstruct the state
    space by using Gamma filtering embedding
   Useful tools for non-linear time series analysis
   Advantage of noise reduction
   Future work: Time series prediction
References

   Estevez, P.A., Hernandez, R.: Gamma SOM for Temporal
    Sequence Processing. In: Advances in Self-Organizing
    Maps, WSOM 2009, LNCS 5629, St. Augustine, FL, pp.
    63-71 (2009)
   Estevez, P.A., Hernandez, R., Perez, C.A., Held, C.M.:
    Gamma-filter Self-organizing Neural Networks for
    Unsupervised Sequence Processing. Electronics Letters
    (2011)
   Estevez, P.A., Hernandez, R.: Gamma-filter Self-
    Organizing Neural Networks for Time Series Analysis.
    In: Advances in Self-Organizing Maps, WSOM 2011,
    LNCS 5629, Espoo, Finland, pp. 63-71 (2011)
   Estevez, P.A. and Vergara, J.: Nonlinear Time Series
    Analysis by Using Gamma Growing Neural Gas, WSOM
    2012, Santiago, Chile (in press)
ALMA Site in Northern Chile

Astronomical Time Series Analysis Using Information Theoretic Learning

  • 1. Pablo A. Estévez, DIE, Universidad de Chile Joint work with Pavlos Protopapas, University of Harvard Pablo Zegers, Universidad de los Andes, Chile Pablo Huijse, PhD Student, Universidad de Chile Jose C. Principe, University of Florida, Gainesville ASTRONOMICAL TIME SERIES ANALYSIS USING INFORMATION THEORETIC LEARNING Workshop on CI Challenges, September 2012
  • 2. Astronomical Time Series: Light Curves  Light Curve: Stellar brightness (magnitude or flux) versus time.  Variable stars: stars whose luminosity varies over time (3% of the stars in the universe are variables, and 1% are periodic variable stars)  Light Curve Analysis: Useful for period detection, event detection, stellar classification, extra solar planet discovery, measure distance to earth, etc.
  • 3. An Example of a Light Curve
  • 4. Challenges  Light curves are unevenly spaced or irregularly sampled, with gaps of different sizes. This is due to:  Time constraints on the observation time  Day-night cycle, weather conditions  Equipment operability  Light curves are noisy due to photometric errors, atmospheric and sky background  Astronomical surveys generate tens of millions of light curves. Light curve generation rate will continue growing during the next years.
  • 5. Variable stars Eclipsing binary stars Pulsating star
  • 6. Problem Statement
      • Discriminate periodic from non-periodic light curves in astronomical survey databases.
      • Estimate the underlying period of periodic light curves.
      • Goal: to develop an automated method for periodicity detection and period estimation based on information theoretic learning.
  • 7. Information theoretic learning (ITL)
      • Apply concepts of information theory, such as entropy and mutual information, to machine learning.
      • Rényi's quadratic entropy, with Gaussian kernel.
      • Rényi's entropy is a generalization of Shannon's entropy.
      • IP: the information potential is the argument of the logarithm.
      • Correntropy (generalized correlation): it measures similarity between feature vectors separated by a certain time delay.
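The slide's entropy formula did not survive in this transcript. As a rough illustration, the sketch below uses the standard Parzen-window estimator of Rényi's quadratic entropy, in which the information potential is the mean of a Gaussian kernel over all pairs of samples. The function names, the kernel width `sigma`, and the test data are assumptions made for this example, not taken from the slides.

```python
import numpy as np

def information_potential(x, sigma):
    """Parzen estimate of the information potential of 1-D samples x.

    Each sample carries a Gaussian kernel of width sigma, so the pairwise
    kernel has variance 2 * sigma**2 (convolution of two Gaussians).
    """
    x = np.asarray(x, dtype=float)
    diff = x[:, None] - x[None, :]            # all pairwise differences
    s2 = 2.0 * sigma ** 2                     # variance of the pairwise kernel
    kernel = np.exp(-diff ** 2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
    return kernel.mean()                      # (1/N^2) * sum_i sum_j G(x_i - x_j)

def renyi_quadratic_entropy(x, sigma):
    """Quadratic entropy is minus the log of the information potential."""
    return -np.log(information_potential(x, sigma))

# Hypothetical data: a narrow and a wide Gaussian sample.
rng = np.random.default_rng(0)
narrow = rng.normal(0.0, 0.5, size=500)
wide = rng.normal(0.0, 2.0, size=500)
h_narrow = renyi_quadratic_entropy(narrow, sigma=0.3)
h_wide = renyi_quadratic_entropy(wide, sigma=0.3)
print(f"H2(narrow) = {h_narrow:.3f}, H2(wide) = {h_wide:.3f}")
```

The wider sample should give the larger entropy, since its pairwise differences are larger and the information potential is correspondingly smaller.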
  • 8. Proposed discrimination metric
      • It combines correntropy (generalized correlation) with a periodic kernel.
      • The periodic kernel measures similarity among samples separated by a given period.
      • The new metric provides a periodogram whose peaks are associated with the fundamental frequencies present in the data.
      • It is computed directly from the available samples.
      • The metric is called the Correntropy Kernelized Periodogram (CKP).
  • 9. Correntropy Kernelized Periodogram
      • Synthetic data example: sin(2πt/P) + noise in time and magnitude.
      • Noise in time simulates uneven sampling.
      • True period: 2.456 days.
      • The CKP reaches a global maximum at the corresponding true period (left figure).
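The synthetic experiment can be approximated with the sketch below. The slides do not give the CKP formula, so a classical periodogram evaluated on the uneven sample times stands in for it here; this is a named substitute, not the authors' method. The 300-day span, the noise level, the number of samples, and the trial-period grid are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
true_period = 2.456                     # days, as stated on the slide
n_samples = 650                         # assumed number of observations

# Uneven sampling: random observation times over an assumed 300-day span.
t = np.sort(rng.uniform(0.0, 300.0, size=n_samples))
y = np.sin(2.0 * np.pi * t / true_period) + rng.normal(0.0, 0.3, size=n_samples)
y = y - y.mean()                        # remove the mean before the spectral estimate

def classical_periodogram(t, y, periods):
    """|sum_n y_n exp(-i w t_n)|^2 / N for each trial period (stand-in for CKP)."""
    omega = 2.0 * np.pi / periods
    spectrum = np.exp(-1j * np.outer(omega, t)) @ y
    return np.abs(spectrum) ** 2 / len(t)

trial_periods = np.linspace(1.0, 6.0, 4000)
power = classical_periodogram(t, y, trial_periods)
best_period = trial_periods[np.argmax(power)]
print(f"estimated period: {best_period:.3f} days")
```

With random observation times there is no regular sampling grid, so the periodogram has no strong aliases and its largest peak should fall close to the true period.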
  • 10. Statistical Test Using CKP (figure: degrees of confidence of 99% and 90%)
  • 11. Receiver Operating Characteristic: ROC curves for CKP and two alternative methods, the LS periodogram and the AoV periodogram. Dataset: 750 periodic light curves and 1500 aperiodic light curves from the MACHO survey. Due to the natural class imbalance, very low false positive rates are required (0.1%).
  • 12. EROS Survey
      • Survey of the Magellanic Clouds and the Galactic bulge.
      • Data taken from the ESO Observatory in La Silla, Chile.
      • EROS main goal: search for the dark matter of the Galactic halo.
      • The EROS survey is a goldmine for stellar variability studies: Cepheids, RR Lyrae, eclipsing binaries, and supernovae.
      • Each EROS field has ~17,300 light curves.
      • There are 88x32 fields of the Large Magellanic Cloud (LMC), i.e. 48,744,522 light curves (about 48.7M).
  • 13. Computational Time Requirements for EROS Survey
      • Computational time measured using an NVIDIA Tesla C2070 GPU (448 cores).
      • Sweeping 20000 trial periods with CKP, the total time per light curve (~650 samples) is 1.5 s.
      • For 48.7M light curves: ~845 days!
      • Evaluating 600 precomputed trial periods (obtained by using correntropy and other methods) and optimizing the code: 0.2 s per light curve.
      • For 48.7M light curves: ~113 days!
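The two totals follow directly from the per-light-curve times on a single GPU. A minimal check of the arithmetic (the helper name `total_days` is only for this example):

```python
SECONDS_PER_DAY = 86_400
n_light_curves = 48.7e6                 # EROS LMC light curves, from the slides

def total_days(seconds_per_curve, n_curves=n_light_curves):
    """Sequential processing time on a single GPU, in days."""
    return seconds_per_curve * n_curves / SECONDS_PER_DAY

full_sweep_days = total_days(1.5)       # 20000 trial periods per light curve
reduced_days = total_days(0.2)          # 600 precomputed trial periods
print(round(full_sweep_days), round(reduced_days))   # prints "845 113"
```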
  • 14. NCSA Dell/NVIDIA Cluster: FORGE
      • National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.
      • We are using a queue with 12 machines, each having 8 cores with Tesla C2070 GPUs => 96 GPUs.
      • Computing eight EROS fields using a machine with 8 cores takes 1 hour.
      • So far we have processed 1.2M light curves in 40 min.
      • At this rate, computing 48.7M light curves takes ~30 hours!
      • FORGE has 44 machines with 288 GPUs in total. Using the whole cluster we might process 1 BILLION light curves in 10 days.
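The cluster figures can be cross-checked with a few lines of arithmetic. The sketch assumes perfect scaling across GPUs and a cost of 0.2 s per light curve per GPU (the optimized single-GPU time quoted for the EROS survey); both are simplifying assumptions, so the results are only order-of-magnitude checks against the rounded numbers on the slide.

```python
n_eros = 48.7e6                          # EROS LMC light curves

# Observed throughput on the 96-GPU queue: 1.2M light curves in 40 minutes.
observed_rate = 1.2e6 / 40.0             # light curves per minute
hours_eros = n_eros / observed_rate / 60.0

# Assumed cost per light curve on one GPU, with ideal scaling across GPUs.
seconds_per_curve = 0.2
minutes_96_gpus = 1.2e6 * seconds_per_curve / 96 / 60.0
days_billion_288_gpus = 1e9 * seconds_per_curve / 288 / 86_400

print(f"{hours_eros:.1f} h, {minutes_96_gpus:.1f} min, {days_billion_288_gpus:.1f} days")
```

Under these assumptions the full EROS set takes about 27 hours, 1.2M light curves take about 42 minutes on 96 GPUs, and one billion light curves take about 8 days on 288 GPUs. All three are consistent with the rounded values quoted on the slide.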
  • 15. Conclusions & Future Work
      • A framework for light curve analysis based on ITL and kernel methods has been introduced.
      • CKP allows discriminating between periodic and non-periodic light curves with high accuracy and a low number of false positives.
      • Required: efficient computation of ITL-based methods.
      • Challenge: applying our methods to large untested astronomical databases.
  • 16. ALMA Site in Northern Chile
  • 17. THE END
      • P. Huijse, P. Estevez, P. Zegers, P. Protopapas, J. Principe, “Period Estimation in Astronomical Time Series using Slotted Correntropy”, IEEE Signal Processing Letters, Vol. 18, no. 6, pp. 371-374, 2011.
      • P. Huijse, P. Estevez, P. Protopapas, P. Zegers, J. Principe, “An Information Theoretic Algorithm for Finding Periodicities in Stellar Light Curves”, IEEE Transactions on Signal Processing, Vol. 60, no. 10, pp. 5135-5145, 2012.
  • 18. Computational Intelligence Applied to Time Series Analysis Pablo A. Estévez Department of Electrical Engineering University of Chile University of Cyprus, Cyprus September 14, 2012
  • 19. Outline
      • First topic
          • Introduction to Self-Organizing Maps (SOM)
          • SOMs for temporal sequence processing
          • Short-term Gamma memories
          • Experimental results
          • Conclusions
      • Second topic
          • Analysis of astronomical time series
          • Information theoretic learning approach
  • 20. Kohonen’s Map
      • Self-Organizing Feature Map (SOM)
      • Unsupervised neural networks
      • Vector quantization of feature space
      • Topologically ordered mapping
      • Main applications:
          • Dimensionality reduction
          • Visualization of high-dimensional data in 2D or 3D maps
          • Clustering
          • Knowledge discovery in databases
  • 21. Topologically Ordered Map
      • SOM defines a fixed grid in output space.
      • Each node in the output grid is associated with a prototype (codebook) vector in input space.
      • Neighborhood is measured in the output space.
      • This neighborhood is used for updating codebook vectors in input space.
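The interplay between the fixed output grid and the codebook vectors can be sketched as a single online update step. This is a generic textbook-style SOM step, not code from the presentation. The Gaussian neighborhood, the linear decay schedules, the 5x5 grid, and the uniform 2D data are assumptions chosen for illustration.

```python
import numpy as np

def som_step(codebook, grid, x, lr, radius):
    """One online SOM update (modifies codebook in place).

    codebook: (n_nodes, d) prototype vectors in input space
    grid:     (n_nodes, 2) fixed node coordinates in output space
    """
    bmu = int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))  # winner found in input space
    grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)        # neighborhood measured on the grid
    h = np.exp(-grid_dist2 / (2.0 * radius ** 2))               # Gaussian neighborhood weights
    codebook += lr * h[:, None] * (x - codebook)                # update prototypes in input space
    return bmu

def quantization_error(codebook, data):
    """Mean distance from each sample to its closest prototype."""
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return float(np.sqrt(d2.min(axis=1)).mean())

rng = np.random.default_rng(0)
rows, cols = 5, 5
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
codebook = rng.uniform(0.45, 0.55, size=(rows * cols, 2))       # start collapsed near the center
data = rng.uniform(0.0, 1.0, size=(2000, 2))                    # hypothetical 2D training data

err_before = quantization_error(codebook, data)
for step, x in enumerate(data):
    frac = step / len(data)
    # Learning rate and neighborhood radius shrink linearly (assumed schedule).
    som_step(codebook, grid, x, lr=0.5 * (1.0 - frac) + 0.01, radius=2.0 * (1.0 - frac) + 0.5)
err_after = quantization_error(codebook, data)
print(f"quantization error: {err_before:.3f} -> {err_after:.3f}")
```

The winner is found in input space, while the neighborhood weights come from distances on the grid, so nodes that are adjacent on the grid end up with similar codebook vectors.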
  • 22. Example: Kohonen’s Map in 2D  It uses a 2D output grid for visualization of high-dimensional data
  • 23. Example of Neural Gas: connections are created between the best matching unit and the second-closest unit. Connections are allowed to age and are eventually removed if not refreshed.
  • 24. SOMs for temporal data processing
      • Several recent extensions of SOM process data sequences that are temporally or spatially connected, for example words, DNA sequences, and time series.
      • Models differ in their notion of context, i.e. the way they store sequences.
      • Each neuron $i$ is represented by a weight (codebook) vector $w^i \in \mathbb{R}^d$ and one or several context vectors $c^i \in \mathbb{R}^d$.
  • 25. Gamma Memories
      • The Gamma filter is defined in the time domain as
        $y(n) = \sum_{k=1}^{K} w_k c_k(n)$,
        $c_k(n) = \beta c_k(n-1) + (1-\beta) c_{k-1}(n-1)$,
        where $c_0(n) = x(n)$ is the input signal, $y(n)$ is the filter output, and $w_k$, $\beta$ are the filter parameters.
      • The parameter $\beta$ controls the tradeoff between depth and resolution of the filter.
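A minimal sketch of the Gamma memory recursion follows. It writes the memory parameter as `beta` and starts all taps at zero; the symbol name and the initial conditions are assumptions, since the slide's equation is only partly legible in this transcript. The function names are hypothetical.

```python
import numpy as np

def gamma_memory(x, K, beta):
    """Return the taps c[n, k], k = 0..K, of a K-stage Gamma memory.

    c[n, 0] = x[n], and for k >= 1:
    c[n, k] = beta * c[n-1, k] + (1 - beta) * c[n-1, k-1]
    All taps start at zero (assumed initial condition).
    """
    x = np.asarray(x, dtype=float)
    c = np.zeros((len(x), K + 1))
    c[:, 0] = x
    for n in range(1, len(x)):
        c[n, 1:] = beta * c[n - 1, 1:] + (1.0 - beta) * c[n - 1, :-1]
    return c

def gamma_filter(x, weights, beta):
    """Filter output y(n) = sum over k = 1..K of w_k * c_k(n)."""
    weights = np.asarray(weights, dtype=float)
    c = gamma_memory(x, len(weights), beta)
    return c[:, 1:] @ weights

# Impulse response of a two-stage memory with beta = 0.5.
impulse = np.zeros(6)
impulse[0] = 1.0
taps = gamma_memory(impulse, K=2, beta=0.5)
print(taps[:, 1])   # first tap: geometric decay after a one-step delay
print(taps[:, 2])   # second tap: later, smoother response
```

A `beta` close to 1 makes each tap forget slowly, which reaches deeper into the past at the cost of temporal resolution; a `beta` close to 0 approaches a plain tapped delay line.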
  • 26. Cascade of K-stages
      • A recursive rule for context descriptors of order K can be constructed.
      • The K context descriptors are described as
        $c_k(n) = \beta c_k^{I_{n-1}} + (1-\beta) c_{k-1}^{I_{n-1}}$,
        with $c_0^{I_{n-1}} = w^{I_{n-1}}$, where $I_{n-1}$ is the previous winner.
  • 28. Delay Coordinate Embedding
      • Takens' embedding theorem allows us to reconstruct the dynamics of an n-dimensional state space, e.g. a strange attractor, starting from a one-dimensional time series.
      • To embed a time series, the following delay coordinate vector is constructed:
        $s(t) = [x_i(t), x_i(t-\tau), \dots, x_i(t-(m-1)\tau)]$
      • The embedding parameters $(\tau, m)$ are found by using ad hoc methods:
          • First minimum of the average mutual information ($\tau$)
          • False nearest neighbor algorithm ($m$)
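The construction of delay coordinate vectors can be sketched as follows for an integer delay measured in samples. The backward-delay convention (each row looks into the past) is an assumption, as the sign is not legible in the transcript; a forward-delay version differs only in the direction of the shifts. The function name `delay_embed` is hypothetical.

```python
import numpy as np

def delay_embed(x, tau, m):
    """Delay-coordinate embedding of a 1-D series.

    Each row is [x(t), x(t - tau), ..., x(t - (m - 1) * tau)], with tau in samples.
    """
    x = np.asarray(x)
    start = (m - 1) * tau                  # first index with a full history
    if start >= len(x):
        raise ValueError("time series too short for this (tau, m)")
    columns = [x[start - k * tau: len(x) - k * tau] for k in range(m)]
    return np.column_stack(columns)

series = np.arange(10)                     # toy series 0, 1, ..., 9
embedded = delay_embed(series, tau=2, m=3)
print(embedded[0], embedded[-1])           # first and last delay vectors
```

In practice `tau` and `m` would come from the average mutual information and false nearest neighbor criteria listed above; here they are fixed by hand only to show the indexing.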
  • 29. Gamma Filtering Embedding
      • Gamma SOM constructs a Gamma-filtered embedding, as follows:
        $u^i(t) = [w^i(t), c_1^i(t), \dots, c_K^i(t)]$,
        where $w^i$ is the weight vector and $c_k^i$ are the contexts.
      • Embedding parameters are determined by sweeping an array of $(\beta, K)$ values.
      • Find the top 10 combinations of parameters with the lowest temporal quantization errors.
      • Project $u(t)$ onto the principal direction by using PCA.
      • Search for the 1D PCA projection (allowing for shift delays) having maximal mutual information with the original time series.
  • 30. Experiments
      • Chaotic Lorenz system: state variable $x(t)$
      • NH3 far-infrared laser: data set A in the Santa Fe Time Series Competition
  • 31. Phase Portrait for Lorenz original dataset Bicup 2006 challenge time series
  • 32. Phase Portrait for noisy Lorenz dataset
  • 33. 1D projection of Gamma SOM for noisy Lorenz dataset
  • 34. 2D projection of Gamma SOM for Laser Time Series
  • 35. Conclusions
      • Gamma SOM models can reconstruct the state space by using Gamma filtering embedding.
      • They are useful tools for non-linear time series analysis.
      • They offer the advantage of noise reduction.
      • Future work: time series prediction.
  • 36. References
      • Estevez, P.A., Hernandez, R.: Gamma SOM for Temporal Sequence Processing. In: Advances in Self-Organizing Maps, WSOM 2009, LNCS 5629, St. Augustine, FL, pp. 63-71 (2009)
      • Estevez, P.A., Hernandez, R., Perez, C.A., Held, C.M.: Gamma-filter Self-organizing Neural Networks for Unsupervised Sequence Processing. Electronics Letters (2011)
      • Estevez, P.A., Hernandez, R.: Gamma-filter Self-Organizing Neural Networks for Time Series Analysis. In: Advances in Self-Organizing Maps, WSOM 2011, LNCS 5629, Espoo, Finland, pp. 63-71 (2011)
      • Estevez, P.A., Vergara, J.: Nonlinear Time Series Analysis by Using Gamma Growing Neural Gas. WSOM 2012, Santiago, Chile (in press)
  • 38. ALMA Site in Northern Chile