Postdoctoral Fellowship - Graph Database Developer

National Institutes of Health/National Library of Medicine
Bethesda, Maryland
Closing date
25 Aug 2024

Job Type
Employment - Hours
Full time
Fixed term

Organization   National Library of Medicine, National Institutes of Health, Bethesda, MD and surrounding area

Scientific focus area   Computational Biology

About the position
Come join a dynamic interdisciplinary team advancing the boundaries of single cell genomics and machine learning with Dr. Richard Scheuermann, the Scientific Director of the National Library of Medicine (NLM), located on the NIH Campus in Bethesda, Maryland. NLM is the world’s largest biomedical library and a leader in research, development, and training in biomedical informatics and health information technology. NLM is legislatively mandated to support the essential work of acquiring, organizing, preserving, and disseminating biomedical information, a field that is changing at a more rapid pace than ever before. NLM plays a pivotal role in translating biomedical research into practice. NLM’s research and information services support scientific discovery, health care, and public health, enabling researchers, clinicians, and the public to use the vast wealth of biomedical data to improve health. The NLM Intramural Research Program (IRP) develops and applies computational approaches to a broad range of information problems in biology, biomedicine, and human health.

The Graph Database Developer will be a member of an interdisciplinary project team developing a Cell Phenotype Knowledge Base (CTKB) as a definitive public reference resource of information about cell phenotypes, including cell types, cell states, and developmental trajectories, using Linked Open Data (LOD) approaches. CTKB will be designed to support three core disease-related use cases: diagnostic biomarker discovery, therapeutic target identification, and mechanistic insight exploration. The initial focus will be on priority disease processes in three physiological systems: respiratory diseases of the lung, neurodegeneration of the nervous system, and autoimmune and inflammatory diseases of the immune system.

The Graph Database Developer will collaborate with a biomedical ontologist to develop a FAIR-compliant cell phenotype representational model (a semantic schema) based on OBO Foundry ontologies and related standards, together with an extraction, transformation, and loading (ETL) protocol. The protocol will translate processed assay results, including transcriptional biomarkers produced using a standardized machine learning pipeline, and experiment metadata from the selected datasets into standardized semantically-structured assertions (SSS assertions) about cell phenotypes for loading into the CTKB graph knowledgebase.

The Graph Database Developer will integrate these CTKB cell phenotypes with disease, drug, and other complementary information from public data repositories managed by the NLM National Center for Biotechnology Information (NCBI) to facilitate the core mechanistic, diagnostic, and therapeutic discovery use cases. The Graph Database Developer will also collaborate with a team of software developers to develop and implement an intuitive, user-friendly query, visualization, and analysis interface for semantic network exploration, graphical machine learning pattern discovery, and computational comparison of new datasets for cell type matching, with a focus on maximizing user experience (UX). The end delivered product will be an open access reference knowledgebase about healthy and diseased cell phenotypes designed to meet the needs of the general biomedical research community.

Specific tasks include:
  • Provide guidance on use of information retrieval and extraction techniques to increase the utility of unstructured, semi-structured, and structured data, including application of semantic and ontology-based methods for organization and query of scientific knowledge.
  • Lead the creation of strategies for the deployment, construction, and querying of data systems, including relational database management systems (RDBMS), NoSQL systems (e.g., graph databases, RDF subject-predicate-object triple stores), and SPARQL, Cypher, and GraphML query languages to gather and analyze biomedical research data.
  • Collaborate with the development team to design, develop, and maintain knowledge graph databases for intelligence analysis purposes.
  • Collaborate with intelligence analysts and subject matter experts to understand requirements and translate them into effective database designs.
  • Support implementation of the data integration processes to extract, transform, and load structured and unstructured data from various sources into the knowledge graph database.
  • Collaborate to develop data models and ontologies to represent entities, relationships, and attributes within the knowledge graph.
  • Develop algorithms and techniques for relationship mapping, clustering, and trend analysis within the knowledge graph.
  • Ensure data quality and integrity by implementing data validation and cleansing procedures.
  • Optimize query performance and implement indexing strategies to enhance database retrieval and analysis capabilities.
  • Collaborate with software developers to integrate the knowledge graph database into analytical tools and platforms.
  • Keep abreast of the latest developments in knowledge graph technologies and suggest creative solutions to enhance database capabilities.
  • Administer tools and services that increase researcher access to NIH data and knowledge management resources, and ensure that data and metadata meet principles of FAIR (Findable, Accessible, Interoperable, Reusable) data practices to enable reproducibility, including use of containerization (Docker/Singularity), notebooks (Jupyter), etc.
  • Aid in the creation of strategies for the development of web-based interfaces that are user-friendly and will promote usage of organizational analytical tools and databases by experts in other data domains and the lay public.

Apply for this vacancy
What you'll need to apply:
Prospective candidates are encouraged to submit the following application materials to the contact listed below. Please reference the position title “Graph Database Developer Fellowship” in your cover letter and/or e-mail subject line.
  • Current curriculum vitae
  • Cover letter/statement of research interest
  • Contact information for three references

Contact name   Richard Scheuermann, PhD

Contact email



  • Doctoral degree in biomedical science, computer science, or a related field.
  • Experience collaborating within data service, product, and project teams.
  • Proficiency in deploying and querying data systems, encompassing relational database management systems (RDBMS) and NoSQL systems such as Neo4j or ArangoDB graph databases and RDF subject-predicate-object triple stores, as well as proficiency in SPARQL, Cypher, and/or GraphML query languages.
  • Knowledge of scientific, biomedical research, and health-related terminologies.
  • Strong working knowledge of data and metadata standards and application of metadata in a biomedical repository setting, including experience with biomedical ontologies.
  • Experience working with taxonomies, ontologies, and controlled vocabularies, including OBO Foundry ontologies, especially the Cell Ontology and the Ontology for Biomedical Investigations, as well as UMLS/MeSH.
  • Strong working knowledge of Semantic Web technologies (RDF/RDFS, OWL), query languages (SPARQL), and validation/reasoning approaches and standards.
  • Familiarity with other scientific, biomedical research, and health-related terminology (e.g., SNOMED-CT).
  • Familiarity with programming languages and environments used in data science (e.g., Python and/or R) and associated programming libraries (e.g., NumPy, SciPy, and Bioconductor).
  • Familiarity with the Linux operating system as well as software development and deployment tools, e.g., Docker, Git.
  • Demonstrable skills in interpersonal communication, oral and written communication, and an ability to work collaboratively in cross-functional working groups.
  • Experience in maintaining relationships and/or partnerships with other institutions and vendors.
  • Capability to handle multiple projects concurrently, with meticulous attention to detail and adaptability to changing work requirements.
