Data Mining The Sky


Usage Rights

© All Rights Reserved


    • Mining the sky
    • Data analysis in astronomy
      Data mining techniques are rapidly gaining acceptance in a variety of scientific disciplines.
      The large amounts of data collected in astronomical surveys require semi-automated techniques for analysis.
      The focus here is on extracting useful information from a single survey.
    • Data mining is a multidisciplinary field, borrowing and enhancing ideas from diverse areas such as signal and image processing, image understanding, statistics, mathematical optimization, computer vision and pattern recognition.
      Mining scientific data sets is an area rich in mathematical problems.
    • Use of data mining techniques in astronomy
      Data mining is the process of uncovering patterns, anomalies, and statistically significant structures in data.
      Neural networks are used to discriminate between stars and galaxies.
      The SKICAT project uses decision trees for star/galaxy classification in the DPOSS survey.
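
To make the decision-tree idea concrete, here is a minimal sketch of a one-level tree (a "decision stump") that separates stars from galaxies on a single feature. It is not the SKICAT method: the `concentration` feature, the toy sample, and the function names are all assumptions made up for illustration, and a real classifier would use many catalog attributes and a deeper tree.

```python
# Toy, hand-made sample (NOT survey data): (concentration, label) pairs.
# "concentration" is a hypothetical feature; we assume point-like stars
# score higher than extended galaxies.
SAMPLE = [
    (0.92, "star"), (0.88, "star"), (0.95, "star"), (0.81, "star"),
    (0.42, "galaxy"), (0.55, "galaxy"), (0.61, "galaxy"), (0.35, "galaxy"),
]


def fit_stump(sample):
    """Find the threshold t minimizing errors of the rule 'value >= t -> star'.

    Returns (threshold, number_of_training_errors).
    """
    values = sorted(value for value, _ in sample)
    # Candidate thresholds are the midpoints between consecutive sorted values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        errors = sum((value >= t) != (label == "star") for value, label in sample)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


def classify(value, threshold):
    """Apply the learned one-level rule to a new feature value."""
    return "star" if value >= threshold else "galaxy"


threshold, errors = fit_stump(SAMPLE)
print(f"threshold={threshold:.3f}, training errors={errors}")
print(classify(0.90, threshold), classify(0.40, threshold))
```

A full decision tree repeats this split search recursively on each side of the threshold and over every available feature.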
    • Astro-informatics
      Problems in astronomy increasingly require the use of machine learning and data mining:
      • Detection of spurious objects
      • Record image
      • Object classification and clustering
      • Compression
      • Source separation
    • Mining a single astronomical survey
      A survey is defined by the wavelength of the light used, the depth of the images, and the angular resolution of the images.
      Data is available in two forms: images and a catalog.
      The original data obtained from the telescope are images; after some processing, a catalog is obtained that holds information about every object in the images.
      Of the two, the catalog is the more important product of the survey.
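
The step from image to catalog can be sketched as follows. This is only a toy illustration under stated assumptions: the tiny `IMAGE` grid is made up, objects are taken to be 4-connected groups of pixels above a fixed `threshold`, and the catalog columns (`row`, `col`, `flux`, `npix`) are hypothetical names. Real survey pipelines also handle background subtraction, blending and calibration.

```python
# A made-up 5x6 "image" of pixel counts with two bright blobs.
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 0, 0, 0, 0],
    [0, 0, 0, 0, 5, 0],
    [0, 0, 0, 0, 6, 0],
]


def build_catalog(image, threshold):
    """Group above-threshold pixels into objects and measure each one."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    catalog = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill over 4-connected neighbours to collect one object.
            stack = [(r, c)]
            seen[r][c] = True
            pixels = []
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # One catalog row per object: total flux and flux-weighted centroid.
            flux = sum(image[y][x] for y, x in pixels)
            catalog.append({
                "id": len(catalog) + 1,
                "row": sum(y * image[y][x] for y, x in pixels) / flux,
                "col": sum(x * image[y][x] for y, x in pixels) / flux,
                "flux": flux,
                "npix": len(pixels),
            })
    return catalog


CATALOG = build_catalog(IMAGE, threshold=2)
for entry in CATALOG:
    print(entry)
```

Each catalog row summarizes one object, which is why later mining steps can work on the much smaller catalog instead of the pixels.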
    • Issues in astronomy
      Compression (e.g. galaxy images, spectra)
      Classification (e.g. stars, galaxies or gamma-ray bursts)
      Reconstruction (e.g. blurred galaxy images, mass distribution from weak gravitational lensing)
      Feature extraction (e.g. signature features of stars, galaxies and quasars)
      Parameter estimation (e.g. star parameter measurement, photometric redshift prediction, cosmological parameters)
      Model selection (e.g. are there 0, 1, 2, … planets around the star, or is a cosmological model with non-zero neutrino mass more favorable?)
    • Science requirements for data mining
      Cross-identification: the classical problem of associating the source list of one database with the source list of another.
      Cross-correlation: the search for correlations, tendencies and trends between physical parameters in multi-dimensional data.
      Nearest-neighbor identification: the general application of clustering algorithms in multi-dimensional parameter space, usually within a single database.
      Systematic data exploration: the application of a broad range of event-based and relationship-based queries to a database in the hope of discovering new objects or a new class of objects.
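
Cross-identification, the first requirement above, can be sketched as a positional match between two source lists. This is a brute-force illustration under assumptions: the coordinates and source names are invented, positions are (RA, Dec) in degrees, and a match is simply the nearest counterpart within a chosen radius. Production cross-matching uses spatial indexing and accounts for positional errors.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; all angles in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_identify(list_a, list_b, radius_arcsec):
    """Map each source id in list_a to its nearest list_b id within the radius.

    Sources with no counterpart inside the radius map to None.
    This is an O(len(a) * len(b)) scan, fine only for small lists.
    """
    matches = {}
    for id_a, ra_a, dec_a in list_a:
        best_id, best_sep = None, radius_arcsec / 3600.0
        for id_b, ra_b, dec_b in list_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_id, best_sep = id_b, sep
        matches[id_a] = best_id
    return matches


# Invented coordinates for illustration: (id, RA in degrees, Dec in degrees).
OPTICAL = [("opt-1", 150.0000, 2.2000), ("opt-2", 150.1000, 2.3000), ("opt-3", 151.0000, 3.0000)]
RADIO = [("rad-1", 150.0002, 2.2001), ("rad-2", 150.1010, 2.3000)]

MATCHES = cross_identify(OPTICAL, RADIO, radius_arcsec=2.0)
print(MATCHES)
```

The match is one-directional: two sources in the first list may point to the same counterpart, so a symmetric check would be needed where uniqueness matters.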
    • KDD
      KDD (knowledge discovery in databases) is the automatic extraction of non-obvious, hidden knowledge from large volumes of data.
      Data mining (DM) is the core of knowledge discovery.
      The KDD process involves:
      • Data mining object
      • Data Preparation
      • Data Processing
      • Analysis
      • Evaluation
    • Primary tasks of data mining:
      • Classification (finding the description of several predefined classes and classifying a data item into one of them)
      • Regression (mapping a data item to a real-valued variable)
      • Clustering (identifying a finite set of clusters or categories in the data)
      • Deviation and change detection (discovering the most significant changes in the data)
      • Dependency modeling (finding a model that describes significant dependencies between the variables)
      • Summarization (finding a compact description of the data)
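
As one concrete instance of these tasks, deviation detection can be sketched by flagging measurements that sit far from the typical value. This is a minimal example under assumptions: the magnitude series is made up, and the rule (distance from the median greater than `k` times the median absolute deviation) is just one common robust choice, not a method named in the slides.

```python
import statistics


def flag_deviations(values, k=5.0):
    """Return indices of values farther than k * MAD from the median.

    MAD is the median absolute deviation, a robust measure of spread.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to compare against in this simple sketch
    return [i for i, v in enumerate(values) if abs(v - median) > k * mad]


# Made-up brightness measurements (magnitudes) with one sudden brightening.
MAGNITUDES = [15.02, 15.00, 14.98, 15.01, 14.99, 13.20, 15.03, 15.00]

OUTLIERS = flag_deviations(MAGNITUDES)
print(OUTLIERS)
```

Using the median rather than the mean keeps the single extreme point from distorting the baseline it is compared against.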
    • Machine learning and data mining will continue to prove useful with astronomical databases.