Automated Search for Globular Clusters in Virgo Cluster Dwarf Galaxies
1. Automated Search for Globular Clusters in Virgo Cluster Dwarf Galaxies
Emily Zhou | Harker Upper School
2. Outline
• Background
• General Flow of GC Search
• Automated Flow
• Results
• Conclusion
• Future work
• Acknowledgements
• References
3. Abstract
• We developed a novel automation flow to efficiently process and accurately analyze galaxy
images from the Next Generation Virgo Cluster Survey to search for Globular Clusters (GCs).
Searching for GCs is usually manual and can be subjective, since researchers must classify
images as “good” or “bad” at multiple stages. Our flow consists of the following: 1) We
automated the ISOFIT parameter sweep, with intelligent feedback for the next iteration. 2) We
integrated Convolutional Neural Networks (CNNs) to determine image quality for next-phase
processing. 3) We incorporated image checking to mitigate excessive masking. 4) We
automated GC identification based on the concentration factors and colors of the objects
detected by SExtractor.
• By implementing this flow, we reduced the time to analyze an image by 91%. We
were also able to fit at least one isophote in 99% of the images. With this higher yield, we
discovered five new potential GCs that the manual process could not detect. Our greatest
contribution is the use of a CNN to classify image quality. Even with our limited
dataset, we achieved a promising 80% accuracy. This removes a tedious, error-prone, and
sometimes subjective step that otherwise requires human intervention.
4. Globular Clusters (GC)
• Roughly spherical, densely packed
groups of stars found around galaxies
• Formed around the same time as their
host galaxies
• Provide a unique fossil record of the
early formation and evolution of their
host galaxies
5. Next Generation Virgo Cluster Survey (NGVS)
• GCs within Virgo Cluster dwarf galaxies
• NGVS was carried out with the Canada-France-Hawaii Telescope (CFHT)
• Images of about 1100 galaxies
6. What’s in A Galaxy Image?
• Compact light sources
• GCs in galaxy,
foreground stars,
background galaxies
• “Fireflies”
• Smooth, extended
galaxy light
• Stars distributed
smoothly
throughout galaxy
• “Lightbulb”
7. Challenges
• GCs are extremely faint amidst the
galaxy light
• GCs vs. other concentrated light
objects
8. Model the Isophotes of Smooth Galaxy Light
Fitting elliptical
isophotes of
galaxies
isophote (iso: equal, phote: light)
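As a rough illustration of what "equal light" means for an elliptical isophote, the sketch below maps a pixel to the semi-major axis of the ellipse that passes through it. Pixels that share the same value lie on the same ellipse, so a smooth model assigns them the same brightness. The function name and its parameters (center, ellipticity, position angle) are illustrative assumptions and are not ISOFIT's interface.

```python
import math

def semi_major_axis(x, y, x0, y0, ellipticity, pa_deg):
    """Semi-major axis of the ellipse centered on (x0, y0) that passes through (x, y).

    ellipticity = 1 - b/a; pa_deg is the angle of the major axis from the +x axis.
    """
    theta = math.radians(pa_deg)
    dx, dy = x - x0, y - y0
    # Rotate into the frame where the major axis lies along x.
    xr = dx * math.cos(theta) + dy * math.sin(theta)
    yr = -dx * math.sin(theta) + dy * math.cos(theta)
    axis_ratio = 1.0 - ellipticity
    # Stretch the minor-axis coordinate so the ellipse becomes a circle of radius a.
    return math.hypot(xr, yr / axis_ratio)

# Two points on the same ellipse (a = 10, b = 5, major axis along x).
a_major = semi_major_axis(10.0, 0.0, 0.0, 0.0, ellipticity=0.5, pa_deg=0.0)
a_minor = semi_major_axis(0.0, 5.0, 0.0, 0.0, ellipticity=0.5, pa_deg=0.0)
```

Both points return the same semi-major axis, which is why they would sit on one isophote of an elliptical model.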
10. Galaxy Light Subtraction
Galaxy and Compact Sources (original image) - Galaxy light (from the isophote model) = Compact Sources (including possible GCs!)
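The subtraction can be sketched on a toy image. This example assumes a circular exponential profile as a stand-in for the fitted isophote model and plants one hypothetical compact source; the sizes, fluxes, and names are made up for illustration and do not come from the NGVS data.

```python
import math

SIZE = 21
CENTER = SIZE // 2

def galaxy_model(x, y, peak=100.0, scale=4.0):
    """Smooth, circular exponential profile standing in for the isophote model."""
    r = math.hypot(x - CENTER, y - CENTER)
    return peak * math.exp(-r / scale)

model = [[galaxy_model(x, y) for x in range(SIZE)] for y in range(SIZE)]

# Original image = smooth galaxy light + one faint compact source (hypothetical).
source_xy, source_flux = (14, 8), 5.0
image = [row[:] for row in model]
image[source_xy[1]][source_xy[0]] += source_flux

# Residual = original image - galaxy model.
residual = [[image[y][x] - model[y][x] for x in range(SIZE)] for y in range(SIZE)]

# Locate the brightest pixel before and after subtraction.
image_peak_xy = max((image[y][x], (x, y)) for y in range(SIZE) for x in range(SIZE))[1]
peak_value, peak_xy = max(
    (residual[y][x], (x, y)) for y in range(SIZE) for x in range(SIZE)
)
```

In the original image the brightest pixel is the galaxy center; in the residual it is the planted source, which is the effect the subtraction step relies on.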
11. Images from Subtraction
• Successful
• Unsuccessful: spiral galaxies
• Unsuccessful: crowded images
Determine if an image from subtraction is good enough for GC identification
13. Identifying Globular Cluster Candidates
Concentration Index
(how spread out a light source is)
Color
(ratio of intensities in different wavelengths)
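A minimal sketch of the two selection quantities follows. It assumes the concentration index is the magnitude difference between a small and a large aperture, and that color is the magnitude difference between two bands (equivalent to a flux ratio on a log scale). The zero point, the flux values, and the selection ranges are placeholders, not the cuts used in this project.

```python
import math

def magnitude(flux, zero_point=30.0):
    """Convert a flux to a magnitude; the zero point cancels in differences."""
    return zero_point - 2.5 * math.log10(flux)

def concentration_index(flux_small_ap, flux_large_ap):
    """Small-aperture minus large-aperture magnitude.

    Compact sources hold most of their light in the small aperture (value near 0);
    extended sources give larger values.
    """
    return magnitude(flux_small_ap) - magnitude(flux_large_ap)

def color(flux_band1, flux_band2):
    """Color as a magnitude difference, i.e. -2.5 * log10(flux_band1 / flux_band2)."""
    return magnitude(flux_band1) - magnitude(flux_band2)

def is_gc_candidate(ci, col, ci_range=(0.1, 0.5), color_range=(0.5, 1.3)):
    """Placeholder cuts: keep sources inside both (hypothetical) ranges."""
    return ci_range[0] <= ci <= ci_range[1] and color_range[0] <= col <= color_range[1]

# Hypothetical fluxes for a compact source and an extended one.
ci_compact = concentration_index(80.0, 100.0)
ci_extended = concentration_index(40.0, 100.0)
g_minus_i = color(50.0, 100.0)
keep_compact = is_gc_candidate(ci_compact, g_minus_i)
keep_extended = is_gc_candidate(ci_extended, g_minus_i)
```

With these made-up numbers the compact source passes both cuts and the extended one is rejected on concentration alone.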
14. Project Goal
Process is time-consuming, tedious,
manual, and subjective
Develop a novel automation
flow to efficiently process and
accurately analyze galaxy
images to search for Globular
Clusters (GCs)
15. Automating Image Subtraction and GC
Identification
• Automate ISOFIT
• Fourier harmonic order sweep to better
fit isophote
• Object locator threshold sweep
• Enabling parallel modeling
• Automate excessive masking
detection
• Sweep SExtractor threshold input
• Automate GC identification
• Concentration factor
• Color
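The sweep-and-parallelize idea above can be sketched as follows. Everything here is a hypothetical stand-in: `toy_fit` replaces a real ISOFIT run with a lookup table, the parameter grids are invented, and the "feedback" is reduced to keeping the best result and stopping once a target number of isophotes is reached. Threads are used only to show galaxies being processed independently.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def toy_fit(galaxy, harmonic_order, threshold):
    """Hypothetical stand-in for one fitting run; returns the number of isophotes fitted."""
    return galaxy["fit_table"].get((harmonic_order, threshold), 0)

def sweep(galaxy, fit=toy_fit, orders=(2, 3, 4), thresholds=(5.0, 3.0, 1.5), target=10):
    """Try parameter combinations in order, remember the best, stop early on success."""
    best = {"name": galaxy["name"], "params": None, "n_isophotes": 0}
    for order, threshold in product(orders, thresholds):
        n = fit(galaxy, order, threshold)
        if n > best["n_isophotes"]:
            best.update(params=(order, threshold), n_isophotes=n)
        if n >= target:
            break  # good enough; skip the remaining combinations
    return best

# Two made-up galaxies: one that fits well at (3, 3.0), one that never fits.
galaxies = [
    {"name": "A", "fit_table": {(2, 5.0): 3, (3, 3.0): 12}},
    {"name": "B", "fit_table": {}},
]

# Each galaxy is swept independently, so the sweeps can run in parallel.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(sweep, galaxies))
```

A galaxy whose best result still has no fitted isophotes (like "B" here) would be flagged rather than passed on to the next stage.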
16. Deep Learning to Inspect Images
• Convolutional Neural Networks (CNNs) remove the need for researchers’ subjective judgements
• Challenges
• Small dataset size
• Weak features
[Figure: Typical CNN architecture. Input, convolutions (feature maps), subsampling, convolutions (feature maps), subsampling, fully connected, output: Good / Bad]
17. Transfer Learning
• Enables quick training on a small dataset
[Figure: Transfer learning with a Convolutional Neural Network (Inception-V3)]
• Original network: feature learning and classifier training, trained on ImageNet (millions of images)
• Transfer learning: transferred features (no new training), with classifier training on a small dataset (~5000 images)
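One way to set up this kind of transfer learning is sketched below. It assumes TensorFlow/Keras, which the slides do not name; only Inception-V3 and ImageNet come from the deck. The function name, the single sigmoid output for good/bad, and the optimizer are illustrative choices.

```python
import tensorflow as tf

def build_quality_classifier(weights="imagenet", input_shape=(299, 299, 3)):
    """Inception-V3 feature extractor (frozen) plus a small trainable good/bad head."""
    base = tf.keras.applications.InceptionV3(
        include_top=False,      # drop the original ImageNet classifier
        weights=weights,        # "imagenet" loads the pretrained features
        input_shape=input_shape,
        pooling="avg",          # one feature vector per image
    )
    base.trainable = False      # transferred features: no new training

    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs, training=False)
    # Only this layer is trained on the small labeled dataset.
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Calling `build_quality_classifier()` downloads the ImageNet weights on first use; training then only updates the final dense layer, which is what keeps training fast on a few thousand images.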
18. CNN Accuracy Assessment
• Training takes 150 s on an Nvidia RTX 2060 GPU
• 80% accuracy (very promising)
[Figure panels: Image Labeling; CNN Accuracy]
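For reference, accuracy of a good/bad classifier is the fraction of held-out images whose predicted label matches the human label. The sketch below computes it together with the four confusion counts. The ten labels are invented purely to exercise the function and are not the project's data.

```python
def assess(true_labels, predicted_labels, positive="good"):
    """Return (accuracy, counts) for a two-class labeling task."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    total = sum(counts.values())
    accuracy = (counts["tp"] + counts["tn"]) / total
    return accuracy, counts

# Made-up labels for ten images (human label vs. classifier output).
human = ["good", "good", "bad", "bad", "good", "bad", "good", "bad", "good", "bad"]
model_out = ["good", "bad", "bad", "bad", "good", "good", "good", "bad", "good", "bad"]
accuracy, counts = assess(human, model_out)
```

Reporting the false-positive and false-negative counts alongside accuracy shows whether the classifier tends to let bad images through or to discard usable ones.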
19. Results of Automation Flow
• Successfully modeled and subtracted galaxy light from thousands of images
• Implemented and verified deep learning CNN modeling to inspect image
quality
• Tested our flow on a dataset containing 1,145 Virgo Cluster dwarf galaxies
imaged in u’, g’, i’, and z’ bands
• Successfully reduced the time to process and analyze one image by 91%
• Used the light subtraction model and fitted at least one isophote in 99% of the
galaxy images, ultimately resulting in 56% usable images
• Discovered 5 new potential GCs that the manual process was unable to detect
20. Conclusions
• We implemented and verified deep learning CNN modeling to remove human
involvement in the GC detection process, and automated the process from fitting
background isophotes to the final GC recognition
• The new, flexible flow integrates deep CNN modeling, computation
algorithms, mathematical analysis, and astrophysics-based modules
• The new flow should significantly reduce scientists’ time to discover new GCs
and help produce a unique dataset for GCs in low-luminosity dwarf galaxies
that contributes to a better understanding of galaxy formation
21. Future Work
● Produce more subtracted images for additional galaxies and
train CNN model on larger data sets
● Better distinguish GCs from other bright objects using AI / deep
learning
● Further tune the modeling and subtraction process to increase
success rates
22. Thank you to ...
Our Mentors
Prof. Raja GuhaThakurta (UCSC)
Prof. Eric Peng (Peking University)
Youkyung Ko (Peking University)
My Team
Justin Du, Cupertino High School
Brian Pérez Wences, East Palo Alto Academy
The SIP Program
Sigma Xi Student Research Showcase
Data Scientist: Guocong Song
23. References
1. B. C. Ciambur, “Beyond Ellipse(s): Accurately Modeling the Isophotal Structure of Galaxies with ISOFIT and CMODEL”, ApJ, 810, 120 (2015)
2. Sungsoon Lim, et al., “Globular Clusters as Tracers of Fine Structure in the Dramatic Shell Galaxy NGC 474”, arXiv:1612.04017 (2017)
3. Patrick R. Durrell, et al., “The Next Generation Virgo Cluster Survey. VIII. The Spatial Distribution of Globular Clusters in the Virgo Cluster”, ApJ, 794, 2 (2014)
4. E. Bertin, “SExtractor v2.13 User’s Manual” (date unknown)
5. Jeannette Barnes, “A Beginner’s Guide to Using IRAF, IRAF Version 2.10” (1993)
6. Aurélien Géron, Hands-On Machine Learning with Scikit-Learn & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, O'Reilly Media, Inc. (2017)
7. C. Szegedy, V. Vanhoucke, et al., “Rethinking the Inception Architecture for Computer Vision”, ArXiv.org, 11 Dec. 2015, arxiv.org/abs/1512.00567