Andrew J. Humphrey
Digital Transformation CoLab, University of Minho; IA
A number of upcoming wide-area extragalactic imaging surveys promise to dramatically improve our understanding of galaxy evolution, most notably Euclid in the near-infrared and Rubin/LSST in the optical. Euclid alone is expected to detect ~12 billion astronomical sources over ~15 000 deg² of the sky, facilitating new insights into cosmology, galaxy evolution, and other topics. To fully exploit these very large datasets, appropriate methods and software tools need to be developed for source classification, property estimation, and other tasks. Traditionally, colour-colour methods or the fitting of broad-band SEDs with spectral templates have been used, but with the advent of enormous datasets there is a clear need for new methodologies that are both accurate and scalable.
In recent years, the Galaxies Group of IA has expanded its expertise to include the application of machine learning (ML) techniques to the selection and characterisation of extragalactic sources, with a significant number of papers now published or in preparation. In this talk, I will discuss the findings of several studies conducted during the past ~2 years, in which novel ML methodologies for the analysis of large photometric datasets were explored in the context of Euclid and Rubin/LSST. In Humphrey et al. (2023, A&A, 671, A99), we developed an ML-based technique for the selection of quiescent galaxies using Euclid IYJH photometry, optionally also using optical, mid-IR and radio data. Our methodology significantly outperforms colour-colour methods and traditional SED fitting. We also used ML to ascertain the relative importance of the aforementioned bands for identifying quiescent galaxies. In Humphrey et al. (2023, MNRAS, 520, 305), we showed that using pseudo-labelled observations during model training can significantly improve models for the estimation of galaxy redshift and physical properties. In collaboration with a tech startup, we also tested a new methodology that allows the quality (or failure) of ML classifiers to be estimated when they are applied to test data (Humphrey et al. 2022, MNRAS, 517, L116). Finally, I will briefly describe intermediate results from an ongoing project within the Euclid collaboration on the estimation of galaxy properties using CatBoostRegressor chains.
2023 April 20, 13:30
Centro de Astrofísica da Universidade do Porto (Auditorium)
Rua das Estrelas, 4150-762 Porto