Database of NIH grants using machine-learned categories and graphical clustering

We acknowledge assistance and support from G. LaRowe and N. Skiba at ChalkLabs, and input and feedback from NIH staff during the project. We thank S. Silberberg, C. Cronin, K. Boyack and K. Borner for helpful advice and comments on the manuscript. This project has been supported through small contracts from the NIH to the University of Southern California (271200900426P and 271200900244P), the University of Massachusetts (271201000758P, 271200900640P, 271201000704P and 271200900639P), ChalkLabs LLC (271200900695P and 271201000701P) and TopicSeek LLC (271201000620P and 271200900637P).

    • David Mimno

    Present address: Princeton University, Princeton, New Jersey, USA.


  1. National Institute of Neurological Disorders and Stroke, Bethesda, Maryland, USA.

    • Edmund M Talley
    • A G Miriam Leenders
  2. University of California, Irvine, Irvine, California, USA.

    • David Newman
  3. University of Massachusetts, Amherst, Amherst, Massachusetts, USA.

    • David Mimno
    • Hanna M Wallach
    • Andrew McCallum
  4. ChalkLabs, Bloomington, Indiana, USA.

    • Bruce W Herr II
  5. Information Sciences Institute, University of Southern California, Marina del Rey, California, USA.

    • Gully A P C Burns


Competing interests

B.W.H. II is employed by ChalkLabs, LLC, which was contracted for a portion of the work described in the report. D.N. is the owner of TopicSeek, LLC, which was also contracted for a portion of the described work.

Correspondence to Edmund M Talley.

    Supplementary Text and Figures

    Supplementary Figure 1, Supplementary Data, Supplementary Methods

    Supplementary Table 1

    Categories of machine-learned topics and their potential NIH institute representations for grants from funding year 2009.

