Detecting Asteroids
with Neural Networks
in TensorFlow
Outline
• What's the goal?
• What's the data?
• Getting started
• Building a feature set
• Building the neural network
• Training the network
• Results
Goal
Build and train a neural network to correctly identify asteroids
in astrophotography data.
So...
How do we do it?
So...
How do we really do it?
The data
The Sloan Digital Sky Survey:
"One of the most ambitious and influential surveys in the history of
astronomy."
The data
• Approx 35% of sky
• Largest uniform survey of the sky yet accomplished
• Data is freely available online
• Each image is 922x680 pixels
An example asteroid
How does this work?
This exploits a property of CCDs:
• SDSS telescopes use five different filters
• They are not simultaneous
• Moving objects appear in different locations
• Always the same order
Getting started
Getting the initial training data:
* Small tool to extract potential candidates from full-scale
images
* Extremely naive, approx 100:5 false positives to actual
positives
* Very low false negatives (approx 1:1000)
* Incredibly slow (complex scan of hundreds of thousands of candidates)
* Manual classification, somewhat slow
* Yields approx 250 valid items, 500 invalid items
The feature set
Good ideas for features:
• Ratio of valid hues to non-valid hues
• Best possible cluster collinearity
• Best possible average cluster distance
Feature: Ratio of valid hues to non-valid hues
The goal here is to match the colors, a.k.a. "hues":
• First step: convert to HSV space
• For pixels in the valid value-spectrum (0.25 < v < 0.90)
• How many are within 2 standard deviations from an optimal
value?
• What's the ratio to ones that aren't?
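A minimal sketch of how this feature might be computed, not the author's actual code. The names `hue_ratio`, `target_hues` and `hue_std` are hypothetical, and the "optimal" hue values and their standard deviation are inputs because the slides do not give them. The slide describes a ratio of valid to non-valid pixels; as an assumption, this sketch returns the fraction of considered pixels that are valid, which keeps the result between 0 and 1.

```python
import colorsys


def hue_distance(h1, h2):
    """Distance between two hues in [0, 1), treating hue as a circle."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def hue_ratio(pixels, target_hues, hue_std, v_min=0.25, v_max=0.90):
    """Fraction of mid-brightness pixels whose hue is near a target hue.

    pixels      -- iterable of (r, g, b) tuples with components in [0, 1]
    target_hues -- hypothetical "optimal" hues in [0, 1)
    hue_std     -- hypothetical standard deviation around each target hue
    """
    considered = 0
    valid = 0
    for r, g, b in pixels:
        h, _s, v = colorsys.rgb_to_hsv(r, g, b)
        # Only pixels in the valid value range take part in the ratio.
        if not (v_min < v < v_max):
            continue
        considered += 1
        # A hue is "valid" if it lies within 2 standard deviations of a target.
        if any(hue_distance(h, t) <= 2 * hue_std for t in target_hues):
            valid += 1
    return valid / considered if considered else 0.0
```

For example, with pure red, green and blue as targets, a yellow pixel counts as non-valid, and pixels that are too bright or too dark are ignored entirely.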
An example HSV plot
Feature: Best possible cluster
collinearity
k-means clustering
• Using the valid hues from the previous feature
• Attempts to cluster n points into k groups
• Here, k=3
• Produces three centroids
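As a rough illustration of the clustering step, here is a plain-Python sketch of k-means (Lloyd's algorithm) on 2-D points. The slides do not say which implementation was used, so the function name `kmeans`, the random initialisation and the `seed` parameter are all assumptions.

```python
import math
import random


def kmeans(points, k=3, iterations=20, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                xs, ys = zip(*cluster)
                new_centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters
```

Because the result depends on the starting centroids, different seeds can give different clusterings of the same points.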
k-means clustering of
the same asteroid
Feature: Best possible cluster
collinearity
• The property of a set of points which lie on the same line
• Iterate the k-means clustering approx. 20 times
• The resulting metric is the ratio between the actual
collinearity and the maximum potential collinearity
Feature: Best possible cluster
collinearity
• Given points a, b, and c:
colin = (c.x - a.x) * (b.y - a.y) + (c.y - a.y) * (a.x - b.x)
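The formula above translates directly into Python. This is a small sketch in which the `Point` type is a hypothetical helper; the normalisation against the "maximum potential collinearity" is left out because the slides do not say how that maximum is computed.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # hypothetical helper type


def colin(a, b, c):
    """Collinearity measure for three points, as given on the slide.

    The result is 0 when a, b and c lie on one line; its magnitude is
    twice the area of the triangle abc, so it grows as the points spread out.
    """
    return (c.x - a.x) * (b.y - a.y) + (c.y - a.y) * (a.x - b.x)


on_a_line = colin(Point(0, 0), Point(1, 1), Point(2, 2))
triangle = colin(Point(0, 0), Point(1, 0), Point(0, 1))
```

Applied to the three k-means centroids, a value near zero means the centroids are nearly in a row.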
k-means clustering of
the same asteroid
An example non-asteroid
k-means clustering of the same non-asteroid
Feature: Best possible average
cluster distance
• Using the same k-means clusters from the previous features
• What is the average distance from any point in a cluster to
the center of the cluster?
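One way to express this feature, as a sketch: the function name `mean_cluster_distance` is hypothetical, and it assumes the average is taken over all points of all clusters. The slides do not say how the value is scaled into the 0 to 1 range, so no scaling is applied here.

```python
import math


def mean_cluster_distance(clusters, centroids):
    """Average distance from each point to the centroid of its own cluster.

    clusters  -- list of lists of (x, y) points, one list per cluster
    centroids -- list of (x, y) cluster centers, in the same order
    """
    distances = [
        math.dist(point, centroid)
        for cluster, centroid in zip(clusters, centroids)
        for point in cluster
    ]
    return sum(distances) / len(distances) if distances else 0.0
```

Tight clusters give a small value and diffuse clusters give a larger one.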
k-means clustering of
the same asteroid
An example non-asteroid
k-means clustering of the same non-asteroid
A comparison of all three features
             | Hue ratio | Collinearity | Cluster distance
Asteroid     | 0.687     | 0.046        | 0.432
Non-asteroid | 0.376     | 0.388        | 0.557
A comparison of all three features
We see that for a valid asteroid:
• The hue ratio is much higher
• The collinearity metric is much lower
• The mean cluster distance is smaller
Ok... where's the AI?
This type of classification is extremely well suited for a neural
network:
Ok... where's the AI?
• We have a clear set of training data
• The output is either affirmative (1) or negative (0)
• Each of the input features can be resolved to a metric
between 0 and 1
• A small number of input features can accurately
describe an item
Building the neural network
The resulting neural network:
• Performs binary classification;
• Uses supervised learning;
• Uses a backpropagation trainer;
• Is deep;
Building the neural network
Four layers:
• Input layer;
• Two hidden layers;
• Output layer
Building the neural network
Total of 154 "neurons":
• 3 input neurons (hue ratio, collinearity metric, distance
metric)
• 150 hidden neurons (100 in first layer, 50 in second)
• 1 output neuron (1 if valid asteroid, 0 if invalid)
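The neuron count can be checked with a few lines of arithmetic. The sketch below also counts trainable parameters, assuming every layer is fully connected to the next and each non-input neuron has one bias term; that is an assumption about the architecture, not something stated on the slides.

```python
# Layer sizes from the slide: input, two hidden layers, output.
layer_sizes = [3, 100, 50, 1]

total_neurons = sum(layer_sizes)


def count_parameters(sizes):
    """Weights plus biases for a fully connected network (assumed layout)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


total_parameters = count_parameters(layer_sizes)
```

Under those assumptions the 154 neurons correspond to 5,501 trainable parameters, most of them in the connections between the two hidden layers.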
The resulting neural
network
$ head astro.train
0.72, 0.0230326797386, 0.265427036314, 1.0
0.223404255319, 0.453424956758, 0.620237280488, 0.0
0.625954198473, 0.282509136048, 0.489543705893, 0.0
0.297297297297, 0.217278447678, 0.456831265365, 0.0
0.526315789474, 0.125389748718, 0.52048369676, 1.0
0.4, 0.430241745731, 0.597850990407, 0.0
0.0797872340426, 0.375153031291, 0.601415832623, 0.0
0.403361344538, 0.268341944886, 0.485098390444, 0.0
0.592356687898, 0.34747482416, 0.559112235938, 0.0
0.0976744186047, 0.0213790202131, 0.586822094967, 0.0
Load training data
import pandas as pd
import tensorflow as tf

# Column names follow the order of the values in astro.train
df_train = pd.read_csv(
    tf.gfile.Open('./astro.train'),
    names=[
        'hue_rat', 'col_min',
        'dis_min', 'label'
    ],
    skipinitialspace=True)
Turn label into boolean
# The column was named 'label' when loading, so read it back under that name
df_train['label'] = (
    df_train['label'].apply(
        lambda x: x == 1.0
    )
).astype(int)
Build an estimator
import tensorflow as tf

def build_estimator(model_dir):
    # One real-valued input column per feature
    hue_rat = tf.contrib.layers.real_valued_column("hue_rat")
    col_min = tf.contrib.layers.real_valued_column("col_min")
    dis_min = tf.contrib.layers.real_valued_column("dis_min")
    # Two hidden layers: 100 neurons, then 50
    return tf.contrib.learn.DNNClassifier(
        model_dir=model_dir,
        feature_columns=[hue_rat, col_min, dis_min],
        hidden_units=[100, 50]
    )
Input function
def input_fn(df):
    # Map each feature name to a constant tensor of its column values
    feature_cols = {
        k: tf.constant(df[k].values)
        for k in [
            'hue_rat', 'col_min', 'dis_min'
        ]
    }
    label = tf.constant(df['label'].values)
    return feature_cols, label
Build & Fit the model
m = build_estimator('./model')
m.fit(input_fn=lambda: input_fn(df_train), steps=200)
Training the network
• Approx 250 valid items;
• Approx 500 invalid items;
• Trained for 200 steps;
• Took < 1 minute;
Evaluating the model
results = m.evaluate(
    input_fn=lambda: input_fn(df_test_1),
    steps=1
)
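The slides do not show where `df_test_1` comes from. One common approach, sketched here purely as an assumption, is to hold back a random portion of the labelled rows before training and evaluate on that portion. The helper `train_test_split` below is hypothetical and works on any list of rows.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable splits
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

With roughly 750 labelled items, a 20% hold-out would leave about 600 rows for training and 150 for evaluation.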
Results
• Trial 1: 99.2973% accuracy
• Trial 2: 94.4056% accuracy
• Trial 3: 94.7644% accuracy
Why use features?
MNIST is a database of handwritten digits:
Why use features?
28x28 pixels = 784 numbers between 0 and 1:
df_train = [0.0, 0.1, 0.9, ..., 1.0]
Why use features?
Astro data:
* 40x40 pixels
* RGB space
40*40*3 = 4800
Conclusion
• Using a neural network allows us to do it faster, and more
accurately
• Need to spend time coming up with good features for the
data
• TensorFlow is really nice (and fast!)
References
• https://www.tensorflow.org/
• http://www.sdss.org/
• http://en.wikipedia.org/wiki/Sloan_Digital_Sky_Survey
• http://en.wikipedia.org/wiki/Collinearity
• http://en.wikipedia.org/wiki/K-means_clustering

PromptWorks Talk Tuesdays: Dustin Ingram 8/23/16 "Detecting Asteroids with Neural Networks in TensorFlow"

  • 1. Detecting Asteroids with Neural Networks in TensorFlow
  • 2. Outline • What's the goal? • What's the data? • Getting started • Building a feature set • Building the neural network • Training the network • Results
  • 3. Goal Build and train a neural network to correctly identify asteroids in astrophotography data.
  • 6. So... How do we really do it?
  • 7. The data The Sloan Digital Sky Survey: "One of the most ambitious and influential surveys in the history of astronomy."
  • 8. The data • Approx 35% of sky • Largest uniform survey of the sky yet accomplished • Data is freely available online • Each image is 922x680 pixels
  • 13. How does this work? This exploits a property of CCDs: • SDSS telescopes use five different filters • They are not simultaneous • Moving objects appear in different locations • Always the same order
  • 15. Getting started Getting the initial training data: • Small tool to extract potential candidates from full-scale images • Extremely naive, approx 100:5 false positives to actual positives • Very low false negatives (approx 1:1000) • Incredibly slow (complex scan of 100Ks of potentials) • Manual classification, somewhat slow • Yields approx 250 valid items, 500 invalid items
  • 16. The feature set Good ideas for features: • Ratio valid hues to non-valid hues • Best possible cluster collinearity • Best possible average cluster distance
  • 17. Feature: Ratio valid hues to non-valid hues The goal here is to match the colors, a.k.a. "hues": • First step: convert to HSV space • For pixels in the valid value-spectrum (0.25 < v < 0.90) • How many are within 2 standard deviations from an optimal value? • What's the ratio to ones that aren't?
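The hue-ratio steps on this slide could be sketched roughly as follows. This is a minimal illustration, not the talk's actual code: the function name `hue_ratio` and the parameters `optimal_hue` and `hue_std` are hypothetical, and because the feature values shown later in the deck lie between 0 and 1, the sketch assumes the metric is the fraction of in-range pixels with a valid hue rather than a literal valid-to-invalid quotient.

```python
import colorsys


def hue_ratio(pixels, optimal_hue, hue_std, v_min=0.25, v_max=0.90):
    """Fraction of in-range pixels whose hue is close to optimal_hue.

    pixels: iterable of (r, g, b) floats in 0..1.
    optimal_hue, hue_std: hypothetical parameters (hue in 0..1); the
    slides do not say which values were used.
    """
    valid = 0
    invalid = 0
    for r, g, b in pixels:
        h, _s, v = colorsys.rgb_to_hsv(r, g, b)
        # Only consider pixels in the valid value-spectrum from the slide.
        if not (v_min < v < v_max):
            continue
        # Hue is circular, so measure the shorter way around the wheel.
        d = abs(h - optimal_hue)
        d = min(d, 1.0 - d)
        if d <= 2 * hue_std:
            valid += 1
        else:
            invalid += 1
    total = valid + invalid
    # Assumption: normalise to 0..1 instead of returning valid / invalid.
    return valid / total if total else 0.0
```

With two reddish pixels, one green pixel and one pixel that is too dark, `hue_ratio(pixels, optimal_hue=0.0, hue_std=0.02)` counts the two reddish pixels as valid and ignores the dark one.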
  • 20. Feature: Best possible cluster collinearity k-means clustering • Using the valid hues from the previous feature • Attempts to cluster n points into k groups • Here, k=3 • Produces three centroids
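For readers unfamiliar with k-means, a minimal pure-Python sketch of the clustering step with k=3 follows. It is not the implementation used in the talk: the farthest-point initialisation is an assumption made here so the example is deterministic, and the `kmeans` name and `max_iter` parameter are hypothetical.

```python
def _dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k=3, max_iter=50):
    """Cluster 2-D points into k groups; returns (centroids, clusters)."""
    if len(points) < k:
        raise ValueError("need at least k points")
    # Assumption: farthest-point initialisation, chosen to be deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_dist2(p, c) for c in centroids))
        )
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: _dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = [
            (sum(x for x, _ in m) / len(m), sum(y for _, y in m) / len(m))
            if m else centroids[i]
            for i, m in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: assignments no longer change the centroids
        centroids = updated
    return centroids, clusters
```

In the talk's setting the input points would be the pixel coordinates with valid hues, and the three centroids would correspond to the three coloured spots left by a moving object.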
  • 22. k-means clustering of the same asteroid
  • 24. Feature: Best possible cluster collinearity • Collinearity: the property of a set of points which lie on the same line • Iterate the k-means clustering approx. 20 times • The resulting metric is the ratio between the actual collinearity and the maximum potential collinearity
  • 25. Feature: Best possible cluster collinearity • Given points a, b, and c: colin = (c.x - a.x) * (b.y - a.y) + (c.y - a.y) * (a.x - b.x)
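The formula on this slide is the 2-D cross product of the vectors a→c and a→b, so its magnitude is twice the area of the triangle abc and it is zero exactly when the three points lie on one line. A direct transcription, assuming points are `(x, y)` tuples (the tuple representation is an assumption; the slide uses `.x` and `.y` attributes):

```python
def colin(a, b, c):
    """Collinearity value from the slide for points a, b and c.

    Zero means the three points are exactly collinear; larger
    magnitudes mean they are further from lying on one line.
    """
    return (c[0] - a[0]) * (b[1] - a[1]) + (c[1] - a[1]) * (a[0] - b[0])


# Three points on the line y = x give zero.
on_a_line = colin((0, 0), (1, 1), (2, 2))
# A right triangle with legs of length 1 has area 0.5, so |colin| is 1.
off_the_line = abs(colin((0, 0), (1, 0), (0, 1)))
```

The slides do not say how the "maximum potential collinearity" used for normalisation is computed, so that step is left out here.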
  • 27. k-means clustering of the same asteroid
  • 31. k-means clustering of the same non-asteroid
  • 33. Feature: Best possible average cluster distance • Using the same k-means clusters from the previous features • What is the average distance from any point in a cluster to the center of the cluster?
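A minimal sketch of the average-distance computation described on this slide, assuming each cluster is a list of `(x, y)` tuples. The function name is hypothetical, and the result is in raw pixel units: the slides do not say how the value was scaled to the 0 to 1 range seen in the feature table.

```python
import math


def mean_cluster_distance(clusters):
    """Average distance from each point to the centre of its own cluster."""
    total = 0.0
    count = 0
    for members in clusters:
        if not members:
            continue  # an empty cluster contributes nothing
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        for x, y in members:
            total += math.hypot(x - cx, y - cy)
            count += 1
    return total / count if count else 0.0
```

Tight, well-separated spots (as expected for an asteroid) give a small value, while diffuse clusters give a larger one.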
  • 35. k-means clustering of the same asteroid
  • 38. k-means clustering of the same non-asteroid
  • 40. A comparison of all three features
                     | Hue ratio | Collinearity | Cluster distance
        Asteroid     | 0.687     | 0.046        | 0.432
        Non-asteroid | 0.376     | 0.388        | 0.557
  • 41. A comparison of all three features We see that for a valid asteroid: • The hue ratio is much higher • The collinearity metric is much lower • The mean cluster distance is smaller
  • 42. Ok... where's the AI? This type of classification is extremely well suited for a neural network:
  • 43. Ok... where's the AI? • We have a clear set of training data • The output is either affirmative (1) or negative (0) • Each of the input features can be resolved to a 0 -> 1 metric • There is a small number of input features which can accurately define an item
  • 44. Building the neural network The resulting neural network: • Performs binary classification; • Uses supervised learning; • Uses a backpropagation trainer; • Is deep;
  • 45. Building the neural network Four layers: • Input layer; • Two hidden layers; • Output layer
  • 46. Building the neural network Total of 154 "neurons": • 3 input neurons (hue ratio, collinearity metric, distance metric) • 150 hidden neurons (100 in first layer, 50 in second) • 1 output neuron (1 if valid asteroid, 0 if invalid)
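The neuron count on this slide can be checked with a few lines of arithmetic. The sketch below also estimates the number of trainable parameters, assuming fully connected layers with one bias per non-input neuron; that assumption is typical for this kind of network but is not stated in the slides.

```python
# Layer sizes from the slide: input, two hidden layers, output.
layer_sizes = [3, 100, 50, 1]


def count_neurons(sizes):
    """Total number of neurons across all layers."""
    return sum(sizes)


def count_parameters(sizes):
    """Weights plus biases, assuming dense layers with one bias per neuron."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


total_neurons = count_neurons(layer_sizes)        # 3 + 100 + 50 + 1
total_parameters = count_parameters(layer_sizes)  # 400 + 5050 + 51
```

Under these assumptions the network has 154 neurons and 5,501 trainable parameters, which is small enough to train quickly on a few hundred examples.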
  • 49. $ head astro.train
        0.72, 0.0230326797386, 0.265427036314, 1.0
        0.223404255319, 0.453424956758, 0.620237280488, 0.0
        0.625954198473, 0.282509136048, 0.489543705893, 0.0
        0.297297297297, 0.217278447678, 0.456831265365, 0.0
        0.526315789474, 0.125389748718, 0.52048369676, 1.0
        0.4, 0.430241745731, 0.597850990407, 0.0
        0.0797872340426, 0.375153031291, 0.601415832623, 0.0
        0.403361344538, 0.268341944886, 0.485098390444, 0.0
        0.592356687898, 0.34747482416, 0.559112235938, 0.0
        0.0976744186047, 0.0213790202131, 0.586822094967, 0.0
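  • 50. Load training data
        import pandas as pd
        import tensorflow as tf  # needed for tf.gfile

        # Columns: hue ratio, collinearity metric, distance metric, label
        df_train = pd.read_csv(
            tf.gfile.Open('./astro.train'),
            names=['hue_rat', 'col_min', 'dis_min', 'label'],
            skipinitialspace=True)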
  • 51. Turn label into boolean
        # The column is named 'label' when the data is loaded
        df_train['label'] = (
            df_train['label'].apply(lambda x: x == 1.0)
        ).astype(int)
  • 52. Build an estimator
        import tensorflow as tf

        def build_estimator(model_dir):
            # One real-valued column per input feature
            hue_rat = tf.contrib.layers.real_valued_column("hue_rat")
            col_min = tf.contrib.layers.real_valued_column("col_min")
            dis_min = tf.contrib.layers.real_valued_column("dis_min")
            # Two hidden layers: 100 neurons, then 50
            return tf.contrib.learn.DNNClassifier(
                model_dir=model_dir,
                feature_columns=[hue_rat, col_min, dis_min],
                hidden_units=[100, 50]
            )
  • 53. Input function
        def input_fn(df):
            # Map each feature name to a constant tensor of its values
            feature_cols = {
                k: tf.constant(df[k].values)
                for k in ['hue_rat', 'col_min', 'dis_min']
            }
            label = tf.constant(df['label'].values)
            return feature_cols, label
  • 54. Build & Fit the model
        m = build_estimator('./model')
        m.fit(input_fn=lambda: input_fn(df_train), steps=200)
  • 55. Training the network • Approx 250 valid items; • Approx 500 invalid items; • Trained for 200 steps; • Took < 1 minute;
  • 56. Evaluating the model
        results = m.evaluate(
            input_fn=lambda: input_fn(df_test_1),
            steps=1
        )
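The slides do not show how the evaluation set `df_test_1` was produced. One common approach, sketched here purely as an assumption, is to shuffle the labelled rows and hold a fraction back for testing; `split_rows`, `test_fraction` and `seed` are hypothetical names.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper: the talk does not describe its own split.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Evaluating on rows the network never saw during training is what makes the reported accuracy meaningful.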
  • 57. Results • Trial 1: 99.2973% accuracy • Trial 2: 94.4056% accuracy • Trial 3: 94.7644% accuracy
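To put these accuracy figures in context, the sketch below shows how accuracy is computed and what a trivial "always predict the majority class" baseline would score. The label counts are the approximate 250 valid and 500 invalid items mentioned in the deck; the function names are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def majority_baseline(labels):
    """Accuracy of always predicting the most common label."""
    most_common = max(set(labels), key=labels.count)
    return accuracy([most_common] * len(labels), labels)


# Approximate class balance from the slides: 500 invalid, 250 valid.
labels = [0] * 500 + [1] * 250
baseline = majority_baseline(labels)
```

With that class balance, always answering "not an asteroid" would be right about two thirds of the time, so the reported 94 to 99 percent is well above the trivial baseline.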
  • 58. Why use features? MNIST is a database of handwritten digits:
  • 59. Why use features? 28x28 pixels = 784 numbers between 0 and 1: df_train = [0.0, 0.1, 0.9, ..., 1.0]
  • 60. Why use features? Astro data: • 40x40 pixels • RGB space • 40*40*3 = 4800 numbers
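The point of this comparison is input size. A rough illustration follows, assuming a dense first hidden layer of 100 neurons (the size used in the talk's network; applying it to raw pixel inputs is hypothetical).

```python
def first_layer_weights(n_inputs, n_hidden=100):
    """Number of weights between the input layer and a dense hidden layer."""
    return n_inputs * n_hidden


input_sizes = {
    "three features": 3,
    "MNIST pixels": 28 * 28,          # 784
    "raw astro pixels": 40 * 40 * 3,  # 4800
}
weights = {name: first_layer_weights(n) for name, n in input_sizes.items()}
```

Feeding raw 40x40 RGB crops would mean 480,000 first-layer weights instead of 300, which is a lot to learn from roughly 750 labelled examples.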
  • 61. Conclusion • Using a neural network allows us to do it faster, and more accurately • Need to spend time coming up with good features for the data • TensorFlow is really nice (and fast!)
  • 62. References • https://www.tensorflow.org/ • http://www.sdss.org/ • http://en.wikipedia.org/wiki/Sloan_Digital_Sky_Survey • http://en.wikipedia.org/wiki/Collinearity • http://en.wikipedia.org/wiki/K-means_clustering