A recurrent neural network for classification of unevenly sampled variable stars

Naul, Brett; Bloom, Joshua S.; Pérez, Fernando; van der Walt, Stéfan

doi:10.1038/s41550-017-0321-z

Letter
Published: 27 November 2017

Brett Naul¹ (ORCID: orcid.org/0000-0001-8171-212X), Joshua S. Bloom¹, Fernando Pérez²,³,⁴ & Stéfan van der Walt³

Nature Astronomy volume 2, pages 151–155 (2018)

Abstract

Astronomical surveys of celestial sources produce streams of noisy time series measuring flux versus time (‘light curves’). Unlike in many other physical domains, however, large (and source-specific) temporal gaps in data arise naturally due to intranight cadence choices as well as diurnal and seasonal constraints¹,²,³,⁴,⁵. With nightly observations of millions of variable stars and transients from upcoming surveys⁴,⁶, efficient and accurate discovery and classification techniques on noisy, irregularly sampled data must be employed with minimal human-in-the-loop involvement. Machine learning for inference tasks on such data traditionally requires the laborious hand-coding of domain-specific numerical summaries of raw data (‘features’)⁷. Here, we present a novel unsupervised autoencoding recurrent neural network⁸ that makes explicit use of sampling times and known heteroskedastic noise properties. When trained on optical variable star catalogues, this network produces supervised classification models that rival other best-in-class approaches. We find that autoencoded features learned in one time-domain survey perform nearly as well when applied to another survey. These networks can continue to learn from new unlabelled observations and may be used in other unsupervised tasks, such as forecasting and anomaly detection.
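The abstract's phrase "explicit use of sampling times and known heteroskedastic noise properties" can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' code. It assumes a light curve arrives as observation times, fluxes and per-point error estimates, turns the times into gaps between consecutive samples, and scores a reconstruction with an inverse-variance weighted squared error. The function names `to_delta_t` and `weighted_mse`, and the choice of a zero gap for the first sample, are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple


def to_delta_t(times: Sequence[float], fluxes: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair each flux with the time elapsed since the previous observation.

    The first sample is given a gap of 0.0 (an assumption of this sketch).
    """
    if len(times) != len(fluxes):
        raise ValueError("times and fluxes must have the same length")
    pairs = []
    previous = times[0] if times else 0.0
    for t, f in zip(times, fluxes):
        pairs.append((t - previous, f))
        previous = t
    return pairs


def weighted_mse(observed: Sequence[float],
                 reconstructed: Sequence[float],
                 sigmas: Sequence[float]) -> float:
    """Squared residuals weighted by 1 / sigma**2, normalised by the total weight.

    Noisier points (larger sigma) contribute less to the loss.
    """
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(w * (o - r) ** 2 for w, o, r in zip(weights, observed, reconstructed))
    return total / sum(weights)


# Hypothetical light curve with one long seasonal-style gap.
times = [0.0, 1.0, 1.5, 30.0]
fluxes = [10.0, 10.2, 9.9, 10.1]
pairs = to_delta_t(times, fluxes)
```

With these inputs the long gap appears directly as a large time difference in the last pair, so a sequence model sees the uneven cadence rather than an interpolated grid.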

**Fig. 1: Diagram of an RNN encoder–decoder architecture for irregularly sampled time series data.**
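As a rough illustration of the encoder–decoder idea named in the Fig. 1 caption, the following pure-Python sketch runs a forward pass only. It is not the published architecture: a plain tanh recurrent cell is swapped in for whatever recurrent unit the authors used, the weights are random rather than trained, and the embedding has 4 entries instead of 64. It assumes the encoder reads (time gap, flux) pairs and that the decoder is given the embedding together with the known time gaps. All function and variable names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; a trained network would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_step(w: Matrix, hidden: Vector, inputs: Vector) -> Vector:
    """One tanh recurrent update: h' = tanh(W [h; x; 1])."""
    stacked = hidden + inputs + [1.0]  # the trailing 1.0 acts as a bias input
    return [math.tanh(sum(wi * si for wi, si in zip(row, stacked))) for row in w]


def encode(w_enc: Matrix, sequence: Sequence[Tuple[float, float]], size: int) -> Vector:
    """Read (delta_t, flux) pairs in order; the final hidden state is the embedding."""
    hidden = [0.0] * size
    for delta_t, flux in sequence:
        hidden = rnn_step(w_enc, hidden, [delta_t, flux])
    return hidden


def decode(w_dec: Matrix, w_out: Vector, embedding: Vector,
           delta_ts: Sequence[float]) -> Vector:
    """Emit one flux per step from the embedding and the known time gaps."""
    hidden = [0.0] * len(w_dec)
    outputs = []
    for delta_t in delta_ts:
        hidden = rnn_step(w_dec, hidden, embedding + [delta_t])
        outputs.append(sum(wo * h for wo, h in zip(w_out, hidden)))
    return outputs


rng = random.Random(0)
size = 4  # embedding length for this toy example
w_enc = random_matrix(size, size + 2 + 1, rng)         # hidden + (delta_t, flux) + bias
w_dec = random_matrix(size, size + size + 1 + 1, rng)  # hidden + embedding + delta_t + bias
w_out = [rng.uniform(-0.5, 0.5) for _ in range(size)]

light_curve = [(0.0, 0.3), (1.0, -0.1), (0.5, 0.2), (28.5, 0.0)]
embedding = encode(w_enc, light_curve, size)
reconstruction = decode(w_dec, w_out, embedding, [dt for dt, _ in light_curve])
```

The fixed-length `embedding` is what would serve as the learned feature vector; training would adjust the weights so that `reconstruction` matches the observed fluxes.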

**Fig. 2: Example autoencoder reconstructions of ASAS light curves from 64-dimensional feature representation.**

**Fig. 3: Confusion matrices for autoencoder-feature random forest classifiers for labelled variable star light curves for each survey.**
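For readers unfamiliar with the confusion matrices mentioned in the Fig. 3 caption, here is a minimal Python sketch of how one is tallied from true and predicted class labels. The class names and predictions are made-up placeholders, not results from the paper, and the helper names `confusion_matrix` and `row_normalise` are hypothetical.

```python
from typing import Dict, List, Sequence


def confusion_matrix(true_labels: Sequence[str],
                     predicted: Sequence[str],
                     classes: Sequence[str]) -> List[List[int]]:
    """Count outcomes; rows are true classes, columns are predicted classes."""
    index: Dict[str, int] = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def row_normalise(matrix: List[List[int]]) -> List[List[float]]:
    """Convert counts to per-true-class fractions; empty rows stay all zero."""
    normalised = []
    for row in matrix:
        total = sum(row)
        normalised.append([count / total if total else 0.0 for count in row])
    return normalised


# Placeholder labels for illustration only.
classes = ["A", "B", "C"]
true_labels = ["A", "A", "B", "C", "C", "C"]
predicted = ["A", "B", "B", "C", "C", "A"]

counts = confusion_matrix(true_labels, predicted, classes)
fractions = row_normalise(counts)
accuracy = sum(counts[i][i] for i in range(len(classes))) / len(true_labels)
```

Diagonal entries count correct classifications, so a classifier that performs well concentrates its weight along the diagonal of the row-normalised matrix.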

References

Levine, A. M. et al. First results from the All-Sky Monitor on the Rossi X-Ray Timing Explorer. Astrophys. J. Lett. 469, L33–L36 (1996).
Pojmanski, G. The All Sky Automated Survey. Catalog of variable stars. I. 0^h–6^h quarter of the southern hemisphere. Acta Astronomica 52, 397–427 (2002).
Murphy, T. et al. VAST: an ASKAP survey for variables and slow transients. Publ. Astron. Soc. Aust. 30, e006 (2013).
Ridgway, S. T., Matheson, T., Mighell, K. J., Olsen, K. A. & Howell, S. B. The variable sky of deep synoptic surveys. Astrophys. J. 796, 53 (2014).
Djorgovski, S. et al. Real-time data mining of massive data streams from synoptic sky surveys. Future Gener. Comput. Syst. 59, 95–104 (2016).
Kantor, J. Transient alerts in LSST. in The Third Hot-wiring the Transient Universe Workshop (eds Wozniak, P. R., Graham, M. J., Mahabal, A. A. and Seaman, R.) 19–26 (Los Alamos National Laboratory, 2014).
Bloom, J. S., & Richards, J. W. Data mining and machine learning in time-domain discovery and classification. in Advances in Machine Learning and Data Mining for Astronomy (eds Way, M. J., Scargle, J. D., Ali, K. M. and Srivastava, A. N.) 89–112 (CRC, New York, 2012).
Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
Richards, J. W. et al. Construction of a calibrated probabilistic classification catalog: application to 50k variable sources in the All-Sky Automated Survey. Astrophys. J. Suppl. Ser. 203, 32 (2012).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Richards, J. W. et al. On machine-learned classification of variable stars with sparse and noisy time-series data. Astrophys. J. 733, 10 (2011).
Naul, B., van der Walt, S., Crellin-Quick, A., Bloom, J. S. & Pérez, F. Cesium: open-source platform for time-series inference. in Proc. 15th Python in Science Conf. (eds Benthall, S. and Rostrup, S.) 27–35 (SciPy, Austin, TX, 2016).
Nun, I. et al. FATS: Feature Analysis for Time Series. Preprint at https://arxiv.org/abs/1506.00010 (2015).
Dubath, P. et al. Random forest automated supervised classification of Hipparcos periodic variable stars. Mon. Not. R. Astron. Soc. 414, 2602–2617 (2011).
Nun, I., Pichara, K., Protopapas, P. & Kim, D.-W. Supervised detection of anomalous light curves in massive astronomical catalogs. Astrophys. J. 793, 23 (2014).
Miller, A. A. et al. A machine-learning method to infer fundamental stellar parameters from photometric light curves. Astrophys. J. 798, 122 (2015).
Kügler, S. D., Gianniotis, N. & Polsterer, K. L. Featureless classification of light curves. Mon. Not. R. Astron. Soc. 451, 3385–3392 (2015).
Kim, D.-W. & Bailer-Jones, C. A. A package for the automated classification of periodic variable stars. Astron. Astrophys. 587, A18 (2016).
Sesar, B. et al. Exploring the variable sky with LINEAR. II. Halo structure and substructure traced by RR Lyrae stars to 30 kpc. Astron. J. 146, 21 (2013).
Palaversa, L. et al. Exploring the variable sky with LINEAR. III. Classification of periodic light curves. Astron. J. 146, 101 (2013).
Alcock, C. et al. The MACHO project LMC variable star inventory. II. LMC RR Lyrae stars—pulsational characteristics and indications of a global youth of the LMC. Astron. J. 111, 1146–1155 (1996).
Mackenzie, C., Pichara, K. & Protopapas, P. Clustering-based feature learning on variable stars. Astrophys. J. 820, 138 (2016).
Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Charnock, T. & Moss, A. Deep recurrent neural networks for supernovae classification. Preprint at https://arxiv.org/abs/1606.07442 (2016).
Che, Z., Purushotham, S., Cho, K., Sontag, D. & Liu, Y. Recurrent neural networks for multivariate time series with missing values. Preprint at https://arxiv.org/abs/1606.01865 (2016).
Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzell, R. Learning to diagnose with LSTM recurrent neural networks. Preprint at https://arxiv.org/abs/1511.03677 (2015).
Friedman, J. H. & Silverman, B. W. Flexible parsimonious smoothing and additive modeling. Technometrics 31, 3–21 (1989).
Cho, K. et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation. Preprint at https://arxiv.org/abs/1406.1078 (2014).
Schuster, M. & Paliwal, K. K. Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45, 2673–2681 (1997).
Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).

Acknowledgements

We thank Y. LeCun and F. El Gabaly for helpful discussions and A. Culich for computational assistance. This work is supported by the Gordon and Betty Moore Foundation Data-Driven Discovery and National Science Foundation BIGDATA grant number 1251274. Computation was provided by the Pacific Research Platform programme through the National Science Foundation Office of Advanced Cyberinfrastructure (number 1541349), Office of Cyberinfrastructure (number 1246396), University of California Office of the President, Calit2 and Berkeley Research Computing at University of California Berkeley.

Author information

Authors and Affiliations

Department of Astronomy, University of California, Berkeley, CA, USA
Brett Naul & Joshua S. Bloom
Department of Statistics, University of California, Berkeley, CA, USA
Fernando Pérez
Berkeley Institute for Data Science, University of California, Berkeley, CA, USA
Fernando Pérez & Stéfan van der Walt
Department of Data Science and Technology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Fernando Pérez

Authors

Brett Naul
Joshua S. Bloom
Fernando Pérez
Stéfan van der Walt

Contributions

B.N. implemented and trained the networks, assembled the machine learning results and generated the first drafts of the paper and figures. J.S.B. conceived of the project, assembled the astronomical light curves and oversaw the supervised training portions. F.P. provided theoretical input. S.v.d.W. discussed the results and commented on the paper.

Corresponding author

Correspondence to Brett Naul.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Supplementary Text, Supplementary Figures 1–11 and Supplementary References

About this article

Cite this article

Naul, B., Bloom, J.S., Pérez, F. et al. A recurrent neural network for classification of unevenly sampled variable stars. Nat Astron 2, 151–155 (2018). https://doi.org/10.1038/s41550-017-0321-z

Received: 30 May 2017
Accepted: 24 October 2017
Published: 27 November 2017
Issue Date: February 2018
DOI: https://doi.org/10.1038/s41550-017-0321-z

This article is cited by

Detecting abnormal cell behaviors from dry mass time series
- Romain Bailly
- Marielle Malfante
- Jérôme Mars
Scientific Reports (2024)
A review on big data based on deep neural network approaches
- M. Rithani
- R. Prasanna Kumar
- Srinath Doss
Artificial Intelligence Review (2023)
Scientific discovery in the age of artificial intelligence
- Hanchen Wang
- Tianfan Fu
- Marinka Zitnik
Nature (2023)
A detection metric designed for O’Connell effect eclipsing binaries
- Kyle B. Johnston
- Rana Haber
- Matt Knote
Computational Astrophysics and Cosmology (2019)
Revealing ferroelectric switching character using deep recurrent neural networks
- Joshua C. Agar
- Brett Naul
- Lane W. Martin
Nature Communications (2019)