In the coming years, next-generation space-based infrared observatories will significantly increase our samples of rare massive stars, representing a tremendous opportunity to leverage modern statistical tools and methods to test massive stellar evolution in entirely new environments. Such work is only possible if the observed objects can be reliably classified. Spectroscopic observations are infeasible for more distant targets, and so we wish to determine whether machine-learning methods can classify massive stars using broadband infrared photometry. We find that a Support Vector Machine classifier is capable of coarsely classifying massive stars with labels corresponding to hot, cool, and emission-line stars with high accuracy, while rejecting contaminating low-mass giants. Remarkably, 76% of emission-line stars can be recovered without the need for narrowband or spectroscopic observations. We classify a sample of ~2500 objects with no existing labels and identify 14 candidate emission-line objects. Unfortunately, despite the high precision of the photometry in our sample, the heterogeneous origins of the labels for the stars in our sample severely inhibit our classifier from distinguishing classes of stars with more granularity. Ultimately, no large and homogeneously labeled sample of massive stars currently exists. Without significant efforts to robustly classify evolved massive stars (which is feasible given existing data from large all-sky spectroscopic surveys), shortcomings in the labeling of existing data sets will hinder efforts to leverage the next generation of space observatories.
Cone search capability for table J/ApJ/913/32/sample (Common names, coordinates, host galaxies, and Gaia measurements of 6484 putative massive stars (Table 1) and feature values and assigned labels (Table 4))
Cone search capability for table J/ApJ/913/32/table5 (Common names and coordinates of stars predicted to be RSGs by the Support Vector Machine (SVM) classifier (SVC) trained on refined labels)
Cone search capability for table J/ApJ/913/32/tablea1 (Common names, coordinates, and predicted labels of 2550 stars input to the SVC trained on coarse labels)
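A cone search returns every catalog row whose sky position lies within a given angular radius of a chosen center. The sketch below illustrates only that geometric filter, run locally on rows already in memory. It is not the catalog service's interface or implementation. The function names, the `ra`/`dec` keys (assumed to be in decimal degrees), and the three example rows are hypothetical placeholders, not entries from the tables above.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(rows, ra0, dec0, radius_deg):
    """Return the rows lying within radius_deg of the center (ra0, dec0)."""
    return [
        row for row in rows
        if angular_separation_deg(ra0, dec0, row["ra"], row["dec"]) <= radius_deg
    ]


# Hypothetical rows for illustration only; these are not real catalog entries.
rows = [
    {"name": "example_a", "ra": 10.00, "dec": 41.00},
    {"name": "example_b", "ra": 10.05, "dec": 41.02},
    {"name": "example_c", "ra": 23.40, "dec": 30.60},
]

matches = cone_search(rows, ra0=10.0, dec0=41.0, radius_deg=0.1)
print([row["name"] for row in matches])
```

The haversine form is used here because it stays numerically stable at the small separations typical of cone-search radii, where the simpler spherical law of cosines loses precision.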