Clár-Bríd Tohill

Career Stage
Student (postgraduate)
Poster Abstract

The properties of very distant galaxies are largely unknown, mostly because of the detection limits of extragalactic surveys, which makes these galaxies hard to classify. One proposed method is to classify them by morphological features that can be extracted from a single image. The parameters we investigate are the Concentration (C) of light and the Asymmetry (A) of the galaxy. The C parameter measures how concentrated the light in the central region of a galaxy is compared with the outer parts. This can tell us whether a galaxy has a disk structure, is more elliptical, or is irregular. The A parameter measures how asymmetric the galaxy is, so galaxies undergoing a merger with another galaxy will usually have high A values. The A values of a large sample of galaxies can then be used to estimate the merger fraction, which is important in galaxy evolution models.
Because large-scale surveys produce so much data, these parameters are calculated computationally. However, with future surveys expected to image billions of galaxies, continuing to use existing algorithms will become computationally infeasible. One solution is machine learning. We use a convolutional neural network (CNN) trained on a subset of galaxies from the CANDELS fields imaged by the Hubble Space Telescope. We then investigate how well the network reproduces the measured C and A values to determine whether it would be a suitable method for large-scale surveys. We find that the network reproduces the results within a reasonable scatter and is on average ~10,000 times faster than previous algorithms.

Plain text summary
Distant galaxies are hard to classify by morphology because cosmological dimming removes many of their features. Faint features at the outer edges of galaxies, such as spiral arms, disappear rapidly, so only the brightest components are detected. One method of classifying these galaxies is therefore based on their stellar light distributions.

We investigate two such parameters, the concentration and the asymmetry. The Concentration (C) parameter measures how concentrated the light in the central region of a galaxy is compared with the outer parts. This can tell us whether a galaxy has a disk structure, is more elliptical, or is irregular. It is calculated from the ratio of two radii that each contain a predefined fraction of the galaxy's light. For our measurements we use the ratio of the radii containing 80% and 20% of the total light.
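
As an illustration of the ratio described above, here is a minimal sketch in Python with NumPy. It is not the measurement code used in this work. It assumes a background-subtracted, non-negative image with the galaxy at the centre of the array, uses simple circular apertures, and adopts the commonly used convention C = 5 log10(r80 / r20); the summary itself only specifies the ratio of the two radii. The function names are hypothetical.

```python
import numpy as np

def radius_enclosing(image, fraction):
    """Radius (in pixels) of the circular aperture, centred on the image
    centre, that first encloses `fraction` of the total flux."""
    ny, nx = image.shape
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    y, x = np.indices(image.shape)
    r = np.hypot(y - cy, x - cx).ravel()
    order = np.argsort(r)                          # pixels sorted by distance from centre
    cumulative = np.cumsum(image.ravel()[order])   # enclosed flux as the radius grows
    idx = np.searchsorted(cumulative, fraction * cumulative[-1])
    return r[order][min(idx, r.size - 1)]

def concentration(image):
    """Assumed convention: C = 5 * log10(r80 / r20)."""
    r20 = radius_enclosing(image, 0.2)
    r80 = radius_enclosing(image, 0.8)
    return 5.0 * np.log10(r80 / r20)

# Toy example: a circular Gaussian "galaxy" on a 201 x 201 pixel grid.
y, x = np.indices((201, 201)) - 100
gaussian = np.exp(-(x**2 + y**2) / (2 * 10.0**2))
print(concentration(gaussian))
```

Under this convention an ideal two-dimensional Gaussian has an analytic value of about 2.15, and profiles with a compact core and an extended halo give larger values.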

The Asymmetry (A) value measures how asymmetric the galaxy light is, so galaxies undergoing a merger with another galaxy will usually have high A values. The A values of a large sample of galaxies can then be used to estimate the merger fraction, which is important in galaxy evolution models.
A is calculated by rotating the image by 180° and subtracting the rotated image from the original. The residuals are then summed and divided by the original galaxy flux.
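
The rotate-and-subtract step can be sketched in a few lines of Python with NumPy. This is an illustrative simplification rather than the algorithm used in this work: it assumes the rotation centre is the centre of the array, sums the absolute values of the residuals, and omits any background correction or search for the best rotation centre. The function name is hypothetical.

```python
import numpy as np

def asymmetry(image):
    """Sum of absolute residuals after a 180-degree rotation,
    divided by the total (absolute) flux of the original image."""
    rotated = np.rot90(image, 2)            # two 90-degree turns = 180 degrees
    residual = np.abs(image - rotated)
    return residual.sum() / np.abs(image).sum()

# Toy example: a symmetric Gaussian, then the same galaxy with an off-centre clump.
y, x = np.indices((101, 101)) - 50
symmetric = np.exp(-(x**2 + y**2) / (2 * 8.0**2))
clump = 0.5 * np.exp(-((x - 15)**2 + (y + 10)**2) / (2 * 3.0**2))
disturbed = symmetric + clump

print(asymmetry(symmetric))   # symmetric image: residuals cancel
print(asymmetry(disturbed))   # the clump has no counterpart after rotation
```

The symmetric image gives a value of zero, while the off-centre clump leaves a residual at both its original and its rotated position, which raises A.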

Traditionally, these parameters are calculated computationally and require pre-processing steps such as creating segmentation maps. However, with future extragalactic surveys expected to image billions of galaxies, current methods will become computationally infeasible. One solution to this problem is machine learning. Machine learning networks can process much larger datasets much faster than conventional methods. The idea behind machine learning and artificial intelligence is to teach a machine to ‘think’ like the human brain does. This is achieved by training the machine on a large amount of data, which allows it to learn and improve without being explicitly programmed.

In our work we use a type of deep learning network known as a Convolutional Neural Network (CNN). These networks are well suited to image classification problems. CNNs are made up of convolutional layers, which extract and learn features from images by applying multiple filter matrices to the image. These filters can extract features such as shape, size, and edges.
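
To make the idea of a filter matrix concrete, here is a small sketch in Python with NumPy that slides a single 3 x 3 filter over an image. It is not the network used in this work. The filter values here are a hand-picked horizontal-gradient (Sobel) kernel chosen for illustration, whereas in a CNN the filter values are learned during training. The function name is hypothetical, and the operation is written without flipping the kernel, as deep learning libraries typically do.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    sum of element-wise products at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark on the left, bright on the right (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked filter that responds to left-to-right brightness changes.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

response = apply_filter(image, sobel_x)
print(response)
```

The response is non-zero only where the filter window straddles the edge and zero in the flat regions, which is the sense in which a filter "extracts" a feature.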
We trained our network on ~100,000 galaxy images taken with the Hubble Space Telescope. Our trained networks were able to reproduce both the A and C values for the galaxies within a reasonable scatter. For both parameters, the average difference between the network's prediction and the measured value was less than the average error on the measured values. When we investigated the galaxies with large differences between the measured value and the network's prediction, we found that they had very low signal-to-noise ratios (SNRs). This is an issue when dealing with very distant galaxies, as they have much lower SNRs than local galaxies.
We tested the impact of noise on the network and on the original algorithm separately by simulating a sample of galaxies at different SNRs and remeasuring their C and A values. We found that while the C measurements are similar for both, the asymmetry network is more stable than the original algorithm in the low-SNR regime. Our trained network is able to calculate these values ~10,000 times faster than previous methods, meaning that it will be suited to future ‘Big Data’ surveys.
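
The kind of noise test described above can be sketched as follows in Python with NumPy. This is a toy illustration, not the simulation used in this work. It assumes Gaussian pixel noise and a hypothetical SNR definition (total flux divided by the noise standard deviation times the square root of the number of pixels), and it uses a simplified asymmetry measure with no background correction. All function names are hypothetical.

```python
import numpy as np

def asymmetry(image):
    """Simplified asymmetry: absolute residuals after a 180-degree
    rotation, divided by the total absolute flux (no background term)."""
    rotated = np.rot90(image, 2)
    return np.abs(image - rotated).sum() / np.abs(image).sum()

def add_noise_to_snr(image, snr, rng):
    """Add Gaussian pixel noise so that, under the assumed definition,
    total_flux / (sigma * sqrt(n_pixels)) equals `snr`."""
    sigma = image.sum() / (snr * np.sqrt(image.size))
    return image + rng.normal(0.0, sigma, size=image.shape)

rng = np.random.default_rng(0)

# Perfectly symmetric model galaxy, so any measured asymmetry is due to noise.
y, x = np.indices((101, 101)) - 50
galaxy = np.exp(-(x**2 + y**2) / (2 * 5.0**2))

for snr in (1000, 100, 10):
    noisy = add_noise_to_snr(galaxy, snr, rng)
    print(snr, round(asymmetry(noisy), 3))
```

Because the model galaxy is symmetric, the measured value rises purely as a result of the added noise as the SNR drops, which illustrates why an uncorrected asymmetry measurement becomes unreliable for faint, distant galaxies.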
Poster Title
Predicting Galaxy Parameters with Machine Learning
Tags
Astronomy
Astrophysics
Data Science
Url
https://twitter.com/ClarBridTohill