A recurrent neural network for classification of unevenly sampled variable stars


Astronomical surveys of celestial sources produce streams of noisy time series measuring flux versus time (‘light curves’). Unlike in many other physical domains, however, large (and source-specific) temporal gaps in data arise naturally due to intranight cadence choices as well as diurnal and seasonal constraints [1–5]. With nightly observations of millions of variable stars and transients from upcoming surveys [4,6], efficient and accurate discovery and classification techniques on noisy, irregularly sampled data must be employed with minimal human-in-the-loop involvement. Machine learning for inference tasks on such data traditionally requires the laborious hand-coding of domain-specific numerical summaries of raw data (‘features’) [7]. Here, we present a novel unsupervised autoencoding recurrent neural network [8] that makes explicit use of sampling times and known heteroskedastic noise properties. When trained on optical variable star catalogues, this network produces supervised classification models that rival other best-in-class approaches. We find that autoencoded features learned in one time-domain survey perform nearly as well when applied to another survey. These networks can continue to learn from new unlabelled observations and may be used in other unsupervised tasks, such as forecasting and anomaly detection.
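The abstract does not spell out how sampling times and measurement errors enter the network, so the following is only a minimal sketch under two assumptions: that each observation is fed to the network together with the time elapsed since the previous one, and that the reconstruction error is weighted by the known per-point uncertainty. The function names `to_delta_t_sequence` and `weighted_mse` are hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple


def to_delta_t_sequence(
    times: Sequence[float], fluxes: Sequence[float]
) -> List[Tuple[float, float]]:
    """Pair each flux with the gap since the previous observation.

    Assumption: the first sample gets a gap of 0.0 because it has no
    predecessor. Irregular sampling then shows up as varying gaps.
    """
    if len(times) != len(fluxes):
        raise ValueError("times and fluxes must have the same length")
    pairs: List[Tuple[float, float]] = []
    previous = None
    for t, f in zip(times, fluxes):
        gap = 0.0 if previous is None else t - previous
        pairs.append((gap, f))
        previous = t
    return pairs


def weighted_mse(
    observed: Sequence[float],
    reconstructed: Sequence[float],
    sigmas: Sequence[float],
) -> float:
    """Mean squared residual, each residual scaled by its measurement error.

    Assumption: noisier points (larger sigma) should contribute less to
    the reconstruction loss than precise ones.
    """
    if not (len(observed) == len(reconstructed) == len(sigmas)):
        raise ValueError("all inputs must have the same length")
    if len(observed) == 0:
        raise ValueError("inputs must not be empty")
    total = sum(
        ((o - r) / s) ** 2 for o, r, s in zip(observed, reconstructed, sigmas)
    )
    return total / len(observed)


# Toy light curve with a large gap between the second and third samples.
times = [0.0, 0.5, 3.0, 3.25]
fluxes = [10.0, 10.5, 9.0, 9.5]
sigmas = [0.25, 0.25, 0.5, 0.5]
reconstruction = [10.0, 10.25, 9.5, 9.5]

print(to_delta_t_sequence(times, fluxes))
print(weighted_mse(fluxes, reconstruction, sigmas))
```

In this toy case the third point misses by twice as much as the second in absolute terms, but because its uncertainty is also twice as large, both contribute equally to the loss. That is the intended effect of weighting by heteroskedastic noise.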

Fig. 1: Diagram of an RNN encoder–decoder architecture for irregularly sampled time series data.
Fig. 2: Example autoencoder reconstructions of ASAS light curves from 64-dimensional feature representation.
Fig. 3: Confusion matrices for autoencoder-feature random forest classifiers for labelled variable star light curves for each survey.


  1. Levine, A. M. et al. First results from the All-Sky Monitor on the Rossi X-Ray Timing Explorer. Astrophys. J. Lett. 469, L33–L36 (1996).

  2. Pojmanski, G. The All Sky Automated Survey. Catalog of variable stars. I. 0h–6h quarter of the southern hemisphere. Acta Astronomica 52, 397–427 (2002).

  3. Murphy, T. et al. VAST: an ASKAP survey for variables and slow transients. Publ. Astron. Soc. Aust. 30, e006 (2013).

  4. Ridgway, S. T., Matheson, T., Mighell, K. J., Olsen, K. A. & Howell, S. B. The variable sky of deep synoptic surveys. Astrophys. J. 796, 53 (2014).

  5. Djorgovski, S. et al. Real-time data mining of massive data streams from synoptic sky surveys. Future Gener. Comput. Syst. 59, 95–104 (2016).

  6. Kantor, J. Transient alerts in LSST. in The Third Hot-wiring the Transient Universe Workshop (eds Wozniak, P. R., Graham, M. J., Mahabal, A. A. and Seaman, R.) 19–26 (Los Alamos National Laboratory, 2014).

  7. Bloom, J. S. & Richards, J. W. Data mining and machine learning in time-domain discovery and classification. in Advances in Machine Learning and Data Mining for Astronomy (eds Way, M. J., Scargle, J. D., Ali, K. M. and Srivastava, A. N.) 89–112 (CRC, New York, 2012).

  8. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).

  9. Richards, J. W. et al. Construction of a calibrated probabilistic classification catalog: application to 50k variable sources in the All-Sky Automated Survey. Astrophys. J. Suppl. Ser. 203, 32 (2012).

  10. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).

  11. Richards, J. W. et al. On machine-learned classification of variable stars with sparse and noisy time-series data. Astrophys. J. 733, 10 (2011).

  12. Naul, B., van der Walt, S., Crellin-Quick, A., Bloom, J. S. & Pérez, F. Cesium: open-source platform for time-series inference. in Proc. 15th Python in Science Conf. (eds Benthall, S. and Rostrup, S.) 27–35 (SciPy, Austin, TX, 2016).

  13. Nun, I. et al. FATS: Feature Analysis for Time Series. Preprint at (2015).

  14. Dubath, P. et al. Random forest automated supervised classification of Hipparcos periodic variable stars. Mon. Not. R. Astron. Soc. 414, 2602–2617 (2011).

  15. Nun, I., Pichara, K., Protopapas, P. & Kim, D.-W. Supervised detection of anomalous light curves in massive astronomical catalogs. Astrophys. J. 793, 23 (2014).

  16. Miller, A. A. et al. A machine-learning method to infer fundamental stellar parameters from photometric light curves. Astrophys. J. 798, 122 (2015).

  17. Kügler, S. D., Gianniotis, N. & Polsterer, K. L. Featureless classification of light curves. Mon. Not. R. Astron. Soc. 451, 3385–3392 (2015).

  18. Kim, D.-W. & Bailer-Jones, C. A. A package for the automated classification of periodic variable stars. Astron. Astrophys. 587, A18 (2016).

  19. Sesar, B. et al. Exploring the variable sky with LINEAR. II. Halo structure and substructure traced by RR Lyrae stars to 30 kpc. Astron. J. 146, 21 (2013).

  20. Palaversa, L. et al. Exploring the variable sky with LINEAR. III. Classification of periodic light curves. Astron. J. 146, 101 (2013).

  21. Alcock, C. et al. The MACHO project LMC variable star inventory. II. LMC RR Lyrae stars: pulsational characteristics and indications of a global youth of the LMC. Astron. J. 111, 1146–1155 (1996).

  22. Mackenzie, C., Pichara, K. & Protopapas, P. Clustering-based feature learning on variable stars. Astrophys. J. 820, 138 (2016).

  23. Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).

  24. Charnock, T. & Moss, A. Deep recurrent neural networks for supernovae classification. Preprint at (2016).

  25. Che, Z., Purushotham, S., Cho, K., Sontag, D. & Liu, Y. Recurrent neural networks for multivariate time series with missing values. Preprint at (2016).

  26. Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzell, R. Learning to diagnose with LSTM recurrent neural networks. Preprint at (2015).

  27. Friedman, J. H. & Silverman, B. W. Flexible parsimonious smoothing and additive modeling. Technometrics 31, 3–21 (1989).

  28. Cho, K. et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation. Preprint at (2014).

  29. Schuster, M. & Paliwal, K. K. Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45, 2673–2681 (1997).

  30. Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).

We thank Y. LeCun and F. El Gabaly for helpful discussions and A. Culich for computational assistance. This work is supported by the Gordon and Betty Moore Foundation Data-Driven Discovery and National Science Foundation BIGDATA grant number 1251274. Computation was provided by the Pacific Research Platform programme through the National Science Foundation Office of Advanced Cyberinfrastructure (number 1541349), Office of Cyberinfrastructure (number 1246396), University of California Office of the President, Calit2 and Berkeley Research Computing at University of California Berkeley.

Author information

B.N. implemented and trained the networks, assembled the machine learning results and generated the first drafts of the paper and figures. J.S.B. conceived of the project, assembled the astronomical light curves and oversaw the supervised training portions. F.P. provided theoretical input. S.v.d.W. discussed the results and commented on the paper.

Corresponding author

Correspondence to Brett Naul.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Supplementary Text, Supplementary Figures 1–11 and Supplementary References

About this article

Cite this article

Naul, B., Bloom, J.S., Pérez, F. et al. A recurrent neural network for classification of unevenly sampled variable stars. Nat Astron 2, 151–155 (2018).
