Abstract
Astronomical surveys of celestial sources produce streams of noisy time series measuring flux versus time (‘light curves’). Unlike in many other physical domains, however, large (and source-specific) temporal gaps in data arise naturally due to intranight cadence choices as well as diurnal and seasonal constraints [1,2,3,4,5]. With nightly observations of millions of variable stars and transients from upcoming surveys [4,6], efficient and accurate discovery and classification techniques on noisy, irregularly sampled data must be employed with minimal human-in-the-loop involvement. Machine learning for inference tasks on such data traditionally requires the laborious hand-coding of domain-specific numerical summaries of raw data (‘features’) [7]. Here, we present a novel unsupervised autoencoding recurrent neural network [8] that makes explicit use of sampling times and known heteroskedastic noise properties. When trained on optical variable star catalogues, this network produces supervised classification models that rival other best-in-class approaches. We find that autoencoded features learned in one time-domain survey perform nearly as well when applied to another survey. These networks can continue to learn from new unlabelled observations and may be used in other unsupervised tasks, such as forecasting and anomaly detection.
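To make the idea concrete, the following is a minimal sketch (not the authors' implementation; the choice of recurrent cell, layer sizes, loss and framework are illustrative assumptions) of how an autoencoding recurrent network can use sampling times and heteroskedastic errors explicitly. Each time step receives the interval since the previous observation, the measured brightness and its reported uncertainty; a reconstruction loss weighted by the inverse measurement variance accounts for the noise, and the learned fixed-length embedding serves as the feature vector for a downstream supervised classifier.

import torch
import torch.nn as nn

class LightCurveAutoencoder(nn.Module):
    # GRU autoencoder for irregularly sampled light curves.
    # Per-step inputs: (delta_t, magnitude, measurement error).
    def __init__(self, hidden_size=64, embedding_size=8):
        super().__init__()
        self.encoder = nn.GRU(input_size=3, hidden_size=hidden_size,
                              batch_first=True, bidirectional=True)
        self.to_embedding = nn.Linear(2 * hidden_size, embedding_size)
        # The decoder sees only the sampling intervals and the embedding,
        # and must reconstruct the magnitudes at those uneven times.
        self.decoder = nn.GRU(input_size=1 + embedding_size,
                              hidden_size=hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, 1)

    def forward(self, delta_t, mag, err):
        # delta_t, mag, err: tensors of shape (batch, n_observations)
        x = torch.stack([delta_t, mag, err], dim=-1)             # (B, T, 3)
        _, h = self.encoder(x)                                   # (2, B, H)
        z = self.to_embedding(torch.cat([h[0], h[1]], dim=-1))   # (B, E)
        z_rep = z.unsqueeze(1).expand(-1, mag.size(1), -1)       # (B, T, E)
        dec_in = torch.cat([delta_t.unsqueeze(-1), z_rep], dim=-1)
        out, _ = self.decoder(dec_in)
        return self.readout(out).squeeze(-1), z                  # recon, features

def weighted_mse(recon, mag, err):
    # Heteroskedastic reconstruction loss: residuals scaled by 1/sigma.
    return torch.mean(((recon - mag) / err) ** 2)

# Example usage on synthetic data (shapes only; real light curves would be
# padded or grouped by length):
model = LightCurveAutoencoder()
delta_t = torch.rand(4, 50)
mag = torch.randn(4, 50)
err = torch.rand(4, 50) + 0.1
recon, features = model(delta_t, mag, err)
loss = weighted_mse(recon, mag, err)

In this sketch, the embedding returned for each light curve would be extracted after unsupervised training and fed to a conventional classifier (for example, a random forest) for the supervised step.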
References
Levine, A. M. et al. First results from the All-Sky Monitor on the Rossi X-Ray Timing Explorer. Astrophys. J. Lett. 469, L33–L36 (1996).
Pojmanski, G. The All Sky Automated Survey. Catalog of variable stars. I. 0h–6h quarter of the southern hemisphere. Acta Astron. 52, 397–427 (2002).
Murphy, T. et al. VAST: an ASKAP survey for variables and slow transients. Publ. Astron. Soc. Aust. 30, e006 (2013).
Ridgway, S. T., Matheson, T., Mighell, K. J., Olsen, K. A. & Howell, S. B. The variable sky of deep synoptic surveys. Astrophys. J. 796, 53 (2014).
Djorgovski, S. et al. Real-time data mining of massive data streams from synoptic sky surveys. Future Gener. Comput. Syst. 59, 95–104 (2016).
Kantor, J. Transient alerts in LSST. in The Third Hot-wiring the Transient Universe Workshop (eds Wozniak, P. R., Graham, M. J., Mahabal, A. A. and Seaman, R.) 19–26 (Los Alamos National Laboratory, 2014).
Bloom, J. S. & Richards, J. W. Data mining and machine learning in time-domain discovery and classification. in Advances in Machine Learning and Data Mining for Astronomy (eds Way, M. J., Scargle, J. D., Ali, K. M. and Srivastava, A. N.) 89–112 (CRC, New York, 2012).
Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
Richards, J. W. et al. Construction of a calibrated probabilistic classification catalog: application to 50k variable sources in the All-Sky Automated Survey. Astrophys. J. Suppl. Ser. 203, 32 (2012).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Richards, J. W. et al. On machine-learned classification of variable stars with sparse and noisy time-series data. Astrophys. J. 733, 10 (2011).
Naul, B., van der Walt, S., Crellin-Quick, A., Bloom, J. S. & Pérez, F. Cesium: open-source platform for time-series inference. in Proc. 15th Python in Science Conf. (eds Benthall, S. and Rostrup, S.) 27–35 (SciPy, Austin, TX, 2016).
Nun, I. et al. FATS: Feature Analysis for Time Series. Preprint at https://arxiv.org/abs/1506.00010 (2015).
Dubath, P. et al. Random forest automated supervised classification of Hipparcos periodic variable stars. Mon. Not. R. Astron. Soc. 414, 2602–2617 (2011).
Nun, I., Pichara, K., Protopapas, P. & Kim, D.-W. Supervised detection of anomalous light curves in massive astronomical catalogs. Astrophys. J. 793, 23 (2014).
Miller, A. A. et al. A machine-learning method to infer fundamental stellar parameters from photometric light curves. Astrophys. J. 798, 122 (2015).
Kügler, S. D., Gianniotis, N. & Polsterer, K. L. Featureless classification of light curves. Mon. Not. R. Astron. Soc. 451, 3385–3392 (2015).
Kim, D.-W. & Bailer-Jones, C. A. A package for the automated classification of periodic variable stars. Astron. Astrophys. 587, A18 (2016).
Sesar, B. et al. Exploring the variable sky with LINEAR. II. Halo structure and substructure traced by RR Lyrae stars to 30 kpc. Astron. J. 146, 21 (2013).
Palaversa, L. et al. Exploring the variable sky with LINEAR. III. Classification of periodic light curves. Astron. J. 146, 101 (2013).
Alcock, C. et al. The MACHO project LMC variable star inventory. II. LMC RR Lyrae stars—pulsational characteristics and indications of a global youth of the LMC. Astron. J. 111, 1146–1155 (1996).
Mackenzie, C., Pichara, K. & Protopapas, P. Clustering-based feature learning on variable stars. Astrophys. J. 820, 138 (2016).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Charnock, T. & Moss, A. Deep recurrent neural networks for supernovae classification. Preprint at https://arxiv.org/abs/1606.07442 (2016).
Che, Z., Purushotham, S., Cho, K., Sontag, D. & Liu, Y. Recurrent neural networks for multivariate time series with missing values. Preprint at https://arxiv.org/abs/1606.01865 (2016).
Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzel, R. Learning to diagnose with LSTM recurrent neural networks. Preprint at https://arxiv.org/abs/1511.03677 (2015).
Friedman, J. H. & Silverman, B. W. Flexible parsimonious smoothing and additive modeling. Technometrics 31, 3–21 (1989).
Cho, K. et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation. Preprint at https://arxiv.org/abs/1406.1078 (2014).
Schuster, M. & Paliwal, K. K. Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45, 2673–2681 (1997).
Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
Acknowledgements
We thank Y. LeCun and F. El Gabaly for helpful discussions and A. Culich for computational assistance. This work is supported by the Gordon and Betty Moore Foundation Data-Driven Discovery Initiative and by National Science Foundation BIGDATA grant number 1251274. Computation was provided by the Pacific Research Platform programme through the National Science Foundation Office of Advanced Cyberinfrastructure (number 1541349), Office of Cyberinfrastructure (number 1246396), University of California Office of the President, Calit2 and Berkeley Research Computing at the University of California, Berkeley.
Author information
Contributions
B.N. implemented and trained the networks, assembled the machine learning results and generated the first drafts of the paper and figures. J.S.B. conceived of the project, assembled the astronomical light curves and oversaw the supervised training portions. F.P. provided theoretical input. S.v.d.W. discussed the results and commented on the paper.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Supplementary Information
Supplementary Text, Supplementary Figures 1–11 and Supplementary References
About this article
Cite this article
Naul, B., Bloom, J. S., Pérez, F. et al. A recurrent neural network for classification of unevenly sampled variable stars. Nat. Astron. 2, 151–155 (2018). https://doi.org/10.1038/s41550-017-0321-z