Osteocyte transcriptome mapping identifies a molecular landscape controlling skeletal homeostasis and susceptibility to skeletal disease

Osteocytes are master regulators of the skeleton. We mapped the transcriptome of osteocytes from different skeletal sites, across ages and sexes in mice, to reveal genes and molecular programs that control this complex cellular network. We define an osteocyte transcriptome signature of 1239 genes that distinguishes osteocytes from other cells. Of these, 77% have no previously known role in the skeleton and are enriched for genes regulating neuronal network formation, suggesting this programme is important in osteocyte communication. We evaluated 19 skeletal parameters in 733 knockout mouse lines and identified 26 osteocyte transcriptome signature genes that control bone structure and function. We showed that osteocyte transcriptome signature genes are enriched for human orthologs that cause monogenic skeletal disorders (P = 2.4 × 10^−22) and are associated with the polygenic diseases osteoporosis (P = 1.8 × 10^−13) and osteoarthritis (P = 1.6 × 10^−7). Thus, we reveal the molecular landscape that regulates osteocyte network formation and function and establish the importance of osteocytes in human skeletal disease.


nature research | reporting summary
April 2020 For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.
For transcriptome experiments, power analysis was conducted using the Scotty webtool (http://scotty.genetics.utah.edu/). Pilot RNA sequencing data were used to assess the amount of biological variance. The analysis indicated that 5 biological replicates sequenced to a depth of 20 million reads were sufficient to detect ~80% of genes with a 2-fold change in expression at p < 0.05. All experiments match or exceed these specifications in replicate number, sequencing depth, or both.
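The relationship between replicate number, biological variance and power can be illustrated with a minimal sketch. This is not the Scotty algorithm, which models read counts and sequencing depth; it is a simple normal approximation for a two-sided, two-group comparison on the log2 scale. The function name approx_power and the per-gene standard deviations are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(log2_fc, sd_log2, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided, two-group comparison.

    log2_fc     -- true difference between group means on the log2 scale
    sd_log2     -- assumed biological standard deviation (log2 scale)
    n_per_group -- number of biological replicates in each group
    """
    z = NormalDist()
    # Standard error of the difference between two group means
    se = sd_log2 * (2.0 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(log2_fc) / se
    # Probability that the test statistic exceeds either critical value
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# A 2-fold change is a log2 difference of 1.0; the SD values are hypothetical.
for sd in (0.3, 0.5, 0.8):
    print(f"sd={sd}: power={approx_power(1.0, sd, n_per_group=5):.3f}")
```

Under this approximation, power for a fixed fold change falls as biological variance rises, which is why pilot data are needed to estimate that variance before choosing a replicate number.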
For phenotype data associated with transcriptome sequencing studies, sample size was limited to concordant samples taken from the contralateral limbs of the mice used for RNAseq: n=8 for the bone comparison study and n=5 for the skeletal maturation study. For TRAP staining in osteocyte-enriched or marrow-containing bone samples, n=4 was chosen because pilot studies had shown an absence of cells lining bone in osteocyte-enriched samples.
For skeletal phenotyping of knockout mice in the OBCD pipeline, the reference ranges for each skeletal parameter were derived from 320 female 16-week-old C57BL/6NTac wild-type mice. Using these data together with coefficients of variation for each test, power calculations indicate 80% power to detect an outlier phenotype of greater than or equal to 2 SD with a sample size of n=2. The exact number of biological replicates examined for each knockout mouse line is listed in Supplementary Data 8.
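The reference-range comparison described above can be sketched as follows. This is an illustrative outline only: the source does not state how replicate values are combined, so the sketch assumes the mean of the knockout replicates is compared with the wild-type mean in units of the wild-type standard deviation. The function names and all numeric values are hypothetical.

```python
from statistics import mean, stdev


def reference_stats(wild_type_values):
    """Mean and sample standard deviation of the wild-type reference data."""
    return mean(wild_type_values), stdev(wild_type_values)


def deviation_in_sd(knockout_values, ref_mean, ref_sd):
    """Distance of the knockout mean from the reference mean, in SD units."""
    return (mean(knockout_values) - ref_mean) / ref_sd


def is_outlier(knockout_values, ref_mean, ref_sd, threshold_sd=2.0):
    """Flag a line whose mean lies at least threshold_sd SDs from the reference."""
    return abs(deviation_in_sd(knockout_values, ref_mean, ref_sd)) >= threshold_sd


# Made-up values for a single skeletal parameter (not data from the study).
wild_type = [10.0, 12.0, 14.0, 16.0, 18.0]
ref_mean, ref_sd = reference_stats(wild_type)

print(is_outlier([22.0, 24.0], ref_mean, ref_sd))  # knockout line with n=2
print(is_outlier([14.0, 15.0], ref_mean, ref_sd))  # knockout line with n=2
```

In practice the reference statistics would be computed separately for each of the skeletal parameters, using the full wild-type cohort rather than a handful of values.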
GWAS data from the UK Biobank were filtered using stringent quality control criteria, yielding 362,924 participants. Participants were included if they had high-quality quantitative heel ultrasound data and were of White British genetic ethnicity. This is the largest sample size to date for any musculoskeletal trait.

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. Note that full information on the approval of the study protocol must also be provided in the manuscript.

Human research participants
Policy information about studies involving human research participants

Population characteristics
No data were excluded from transcriptome sequencing studies. Similarly, in mouse knockout studies, data for all mice were included, both those with and those without skeletal phenotypes.
Experiments were conducted using multiple independent biological replicates as outlined in the 'Sample size' section. Individual experiments were not replicated directly. Where available, orthogonal, independently produced datasets were used to validate experimental findings.
Samples used in transcriptome sequencing were specifically bred for the purposes of this study. Samples collected for transcriptome sequencing in each of the Bone Comparison and Osteocyte Enrichment mouse cohorts were all collected in a single batch. To generate samples for the Skeletal Maturation cohort, breeding was stratified so that all samples (representing multiple ages) could be collected within a single 36-hour time period. Samples were collected in groups of 8 mice (one from each time point, in each sex) to avoid confounding batch effects. For rapid-throughput phenotyping of knockout mouse lines, samples were anonymized and randomly assigned to batches for analysis.
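The two allocation schemes described above can be sketched in a few lines. This is a hedged illustration, not the authors' procedure: the ages and sexes come from the cohort description in this summary, the default of 5 replicates is assumed from the sample size stated for the skeletal maturation study, and the function names, sample identifiers, batch size and seed are hypothetical.

```python
import random
from itertools import product

AGES_WEEKS = (4, 10, 16, 26)   # time points of the Skeletal Maturation cohort
SEXES = ("female", "male")


def maturation_groups(n_replicates=5):
    """Build collection groups of 8 mice: one per time point, in each sex.

    Every group contains all age x sex combinations, so collection group
    is never confounded with age or sex.
    """
    return [
        [(age, sex, rep) for age, sex in product(AGES_WEEKS, SEXES)]
        for rep in range(1, n_replicates + 1)
    ]


def random_batches(sample_ids, batch_size, seed=None):
    """Randomly assign anonymized sample IDs to analysis batches."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


groups = maturation_groups()
print(len(groups), "groups of", len(groups[0]), "mice")

# Hypothetical anonymized IDs for knockout samples
batches = random_batches([f"S{i:03d}" for i in range(1, 21)], batch_size=6, seed=1)
print([len(b) for b in batches])
```

Balancing every group across age and sex means that any group-level technical effect is spread evenly over the biological variables of interest.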
For transcriptome experiments, data collection was stratified by sample type to avoid batch effects, and thus blinding was not possible. For transcriptome analysis, sample type was used as a variable of interest, so blinding was not possible. Morphological analysis of samples collected from the contralateral limb of those taken for transcriptome analysis was performed in a blinded manner. For phenotypic analysis, all data collection and analysis were performed in a blinded manner.
Bone Comparison cohort: 16-week-old male C57BL/6NTac mice; Skeletal Maturation cohort: 4, 10, 16 and 26-week-old female and male C57BL/6NTac mice; Osteocyte Enrichment cohort: 10-week-old male C57BL/6NTac mice. Novel-gene knockout mouse lines were produced by CRISPR/Cas9 gene targeting in C57BL/6J mouse embryos. Phenotyping was performed in 16-week-old male and female mice. Animal holding areas were maintained within a constant temperature range of 21-22 degrees Celsius and 60-65% humidity to avoid animal stress and to minimise experimental variability. Lighting in animal rooms provided 12 hours of light and 12 hours of darkness with a dawn/dusk simulation. Knockout mouse lines screened in the OBCD phenotyping pipeline were produced on a C57BL/6NTac background, with phenotyping experiments conducted on 16-week-old female mice.