Bone mineral density response prediction following osteoporosis treatment using machine learning to aid personalized therapy

Osteoporosis is a global health problem for ageing populations. The goals of osteoporosis treatment are to improve bone mineral density (BMD) and prevent fractures. One major obstacle that remains a great challenge to achieve the goals is how to select the best treatment regimen for individual patients. We developed a computational model from 8981 clinical variables, including demographic data, diagnoses, laboratory results, medications, and initial BMD results, taken from 10-year period of electronic medical records to predict BMD response after treatment. We trained 7 machine learning models with 13,562 osteoporosis treatment instances [comprising 5080 (37.46%) inadequate treatment responses and 8482 (62.54%) adequate responses] and selected the best model (Random Forests with area under the receiver operating curve of 0.70, accuracy of 0.69, precision of 0.70, and recall of 0.89) to individually predict treatment responses of 11 therapeutic regimens, then selected the best predicted regimen to compare with the actual regimen. The results showed that the average treatment response of the recommended regimens was 9.54% higher than the actual regimens. In summary, our novel approach using a machine learning-based decision support system is capable of predicting BMD response after osteoporosis treatment and personalising the most appropriate treatment regimen for an individual patient.

factors that interfere with the treatment outcome, such as BMD before treatment, history of falls, laboratory results, FRAX ® score, comorbid conditions, current use of glucocorticoid, secondary osteoporosis, and adhering to treatment 14,16,17 .
Accordingly, modern clinical practice guidelines were adjusted to be more personalized and to precisely choose treatment options based on the risk of 10-year fractures according to country-specific guidelines together with patient lifestyle and nutrition. These strategies seem to help to increase the successful rate of treatment outcomes and are reasonable to apply in clinical practice, but it is hard to find a clear and discrete protocol to make universal decisions because of the uncertainty of clinical adjustment and complex information regarding individual clinical factors and personal history [18][19][20][21] . These challenges instigated the idea to consider filling this research gap to find a way of effectively choosing the right choice of initial treatment for an individual patient by taking into account patient characteristics and risk factors as currently there is still no definite approach which is able to identify an appropriate treatment regimen regarding individual treatment outcomes 19 .
Machine learning-based decision support systems is a promising research area which is capable of learning multiple patterns of treatment profiles together with a large number of complex parameters. Application of machine learning has been utilized in the medical field such as a decision support system which recommended a drug of choice that best fit a patient and provided a personalized prediction of therapeutic outcome 22,23 . In the osteoporosis field, machine learning-based solutions have been studied for a while in different aspects including detecting postmenopausal women with osteoporosis risk, and beyond diagnosis purposes, such as identifying osteoporosis patient risk groups using various clinical features [24][25][26][27] . Application of machine learning with genomics data have shown an ability to predict BMD and osteoporotic fractures risk [28][29][30] . In the research area of image analysis, machine learning was applied with magnetic resonance imaging (MRI) and computerized tomography (CT) data and the results showed the capability of screening and predicting osteoporotic fractures 31,32 . However, studies of machine learning with osteoporosis treatment outcomes using BMD information combined with clinical features were still limited.
With the increased use of electronic medical records (EMR), patient data including demographic data, clinical features, hospital visit history, clinical diagnosis, laboratory results, and medication prescriptions were exponentially accumulated and enabled us to research the area utilizing machine learning. Because of the crucial impact in the health-care community and prior successful research outcomes in similar areas, we believe that machine learning-based decision support systems could be potentially leveraged with relevant patient information to inform a new way of improving therapeutic effectiveness in osteoporosis treatment.
In this study, we proposed a personalized osteoporosis treatment approach guided by a machine learning model prediction. We trained machine learning models with previous 10-year osteoporosis treatment profiles and a patient-specific clinical dataset acquired from a real-world database. Then, we leveraged the best machine learning model to develop an automated algorithm which was able to recommend appropriate treatment regimens and dosages, in order to achieve an optimal BMD improvement.

Materials and methods
This was a retrospective study using EMR dataset collected between January 2010 and December 2019 at a tertiary care teaching hospital serving high-volume osteoporosis treatment services. The study protocol was approved under an ethical approval by the Research Ethics Committee of Faculty of Medicine, Chiang Mai University (Study code: ORT-2562-06764). All methods were carried out with exemption criteria (waiver of informed consent) in accordance with the Research Ethics Committee Faculty of Medicine, Chiang Mai University. All identification data including patient name, surname, address, national identification number, address, phone number, and hospital number were removed. Statistical analysis was performed using R Version 4.0.2 on RStudio Version 1.3.959 (RStudio, Boston, Massachusetts). The data pre-processing and machine learning development steps were performed using Python Programming Language Version 3.8.5 (Python Software Foundation, Wilmington, Delaware). Scikit-learn version 0.23.2 Machine Learning library 33 was used. All computational processes of the machine learning algorithm were performed on Windows Server 2019, 64-bit Operating System, 8 vCPUs of 2.5 GHz Processor, and 32 GBs of Memory (RAM).

Data acquisition.
A dataset of 141,510 EMR entries from 15,420 patients who had BMD results was acquired from the EMR database All previous International Classification of Disease 10th version (ICD-10) diagnoses, out-patient encounters, in-patient admissions, laboratory results, and medication prescription history in the EMR of enrolled patients were extracted to the dataset, illustrated in Fig. 1a.
The raw datasets were linked by a visit number (TXN), a unique key assigned to a patient on an individual hospital visit or admission. The admission data included visit and admission date (date), gender (categorical values: male or female), age (integer), weight (integer), diagnostic type (categorical values: principal diagnosis or co-morbidities), and ICD-10 codes (categorical values). The drug prescription data included visit/admission date (date) and drug codes (categorical values). The laboratory result data included visit/admission date (date), laboratory items (categorical values), and values (float). All categorical features were split into a one-hot numeric array fashion using the Scikit-learn OneHotEncoder and then grouped by TXN. All codes were additionally mapped with description text for data visualization.
We were not able to include some important factors that are associated with risk of fracture 34 such as height, parental history of hip fracture, current tobacco smoking, daily alcohol consumption of three or more units daily, due to the lack of digital-format data records within our institute. We identified 28,983 BMD results from institute's PACS database. In our setting, Hologic Horizon ® DXA was used as bone densitometer equipment. The results were stored in the database as semi-structured textual format narrated by certified radiologists. The BMD, T-score, and Z-score of femur (hip), lumbar (vertebral), and radius were extracted to structural variables   www.nature.com/scientificreports/ using regular expression search technique. The BMD dataset (Fig. 1b) was consolidated with the patient dataset resulting in the treatment profile dataset (Fig. 1c). Figure 2 shows an example of the BMD features which were extracted from the BMD report. An individual treatment profile was created by integrating all information between sequential BMD encounters. This protocol excluded visits which had no follow-up BMD or unable to extract BMD results. This strategy was designed to include not only osteoporosis patients (BMD T-score less than −2.5 ) by WHO classification 35 , but also normal (BMD T-score more than −1.0 ) and osteopenia patients (BMD T-score between −1.0 and −2.5 ) who were treated by history of fragility fracture, high-risk screening, and followed up patients with increased BMD T-score. A total 13,562 treatment profiles with 8965 categorical and numerical variables were obtained for this study (Fig. 1c).
Statistical analysis. Baseline characteristics of 13,562 treatment profiles were reported. Continuous variables were described with mean and standard deviation (SD), a Student's t-test was used to test the differences between response and inadequate response groups of treatment profiles regarding BMD change. Categorical variables were described with count and percentage, and were analyzed as differences of treatment response between groups with Chi-square test, p value less than or equal to 0.05 indicating statistical significance.
Machine learning development. The dataset was randomly partitioned into 90:10 training and testing dataset with stratified random sampling. The training dataset was used to train and adjust parameters by internal cross-validation, while the testing dataset was used to assess the model performance and its generalizability (Figs. 1f, 3). All categorical variables in the dataset were encoded with 0 and 1. Missing values were imputed with mean. All numeric features in the dataset were normalized between 0 to 1 scale and BMD response was set as a labeled output. We implemented a feature selection algorithm (SelectKBest with the score function ANOVA F-value between label/feature for classification tasks) to rank the variables according to highest F1 score to the outcome. We fine-tuned the model examining the number of features from 5 to 8981 during the training and testing process. The results showed that the number of features beyond 20 provided no further significant improvement of the model performance (Fig. 4). However, the number of features at 225 was the best number when considering the highest performance of accuracy, ROC, precision, and recall. Thus, the top 225 features were selected for the model development process. This step reduced the dimension of features in the dataset in order to improve accuracy and reduce computational complexity by removing irrelevant variables (Fig. 1g).
We applied the Shapley (SHAP) 38,39 and Bayesian inference 40 to find the top 20th most relevant features and explored how these features influenced the prediction outcome. According to the Bayesian inference interpretation guidelines 40 , we used the pymc3 python package 41 (version 3.11.2) to calculate 95% Bayesian credible intervals (CrI). We trained the Bayesian logistic regression model with the entire dataset with the top 20th features and reported the result with mean and 2.5th and 97.5th percentile (95% CrI). The guidelines 40 suggested that if the CrI did not cover zero, this indicated a statistically significant result. Then, we interpreted these features with significant results by describing with the most plausible values (from the lowest to the highest value of CrI) with higher probability of representing the true (unknown) estimate indicating that the mean of the feature of the response group would be different (lower or higher depending on the negative or positive CrI, respectively) compared to the inadequate response group, with at least a 95% probability.
We selected seven machine learning models from the Scikit-learn library 33 for solving classification problems, consisting of Random Forests Classifier, Gradient Boosting Classifier, Logistic Regression Classifier, Support Vector Machine Classifier, Naive Bayes Classifier, Neural Network (Multilayer Perceptron), and K-Nearest Neighbors Classifier. The manual parameter tuning, grid search, and random search techniques were used to properly obtain the best tuned parameters 42 . The parameters which provided highest accuracy were applied to each algorithm. Accuracy, Precision (Positive Predictive Value), Recall (sensitivity), F1-score, and Area Under Receiver Operating Characteristic curve (ROC) were used to evaluate the performance of each model on the training dataset and testing dataset (Figs. 1h, 3).

Results
Exploratory data analysis. The dataset of 13,562 osteoporosis treatment profiles taken from January 2011 to December 2019 were used for this study. The baseline characteristics are shown in Table 1, of which there were 37.46% (5080) entries in the 'inadequate response to treatment' group (referred to as the inadequate response group) and 62.54% (8482) entries in the 'adequate response-group (henceforth referred to as the response group). The average changes of femoral BMD ( −6.09% ± 8.24 ) and lumbar BMD ( −8.65% ± 22.89 ) for the inadequate response group were significantly lower than for the response groups (femoral BMD 2.33% ± 11.06 and lumbar BMD 3.80% ± 6.83 ), respectively. The average age of the total group was 62.23 ± 10.76 years old with the average age of the inadequate response group significantly lower than the response group (63.36 and 64.75, respectively) (p value < 0.01 ). Co-morbid conditions and basic laboratory results were not different between the two groups, except Alkaline Phosphatase (ALP) which was significantly lower in the response group than in the inadequate group. There were no differences in previous history of hip fractures and vertebral fractures between the groups, whereas forearm fracture history was found significantly more often in the response group than in the inadequate group (p value = 0.02).
Distribution of treatment response in the dataset. Calcium and Vitamin D prescriptions were significantly higher in the response group than the inadequate response group (p value < 0.01 ), while Calcitonin and Menatetrenone were not significantly different. The number of treatment profiles receiving anti-osteoporosis agents in the response group (70.39%) was significantly higher than the inadequate response group (29.61%) (p value was < 0.01 ). This was true for all anti-osteoporosis agents (p value < 0.05 ), except for 3-month Ibandronate (3 mg) with p value 0.46. Figure 5a shows the distribution of lumbar and femoral BMD changes after treatment. The majority of treatment profiles responded to treatment (located in the adequate response area). The dataset was described according to the WHO Osteoporosis Classification (Normal, Osteopenia, and Osteoporosis) 35 as seen in Fig. 5b. The treatment profiles with anti-osteoporosis agents positively contributed to response patterns for both femoral and lumbar BMD, especially in 3, 6, 12-month and daily injection routes. Model selection and interpretation. All seven machine learning models demonstrated a high Recall (Sensitivity) to predict a response to treatment. Among all algorithmic approaches, the Random Forests (RF) model 43 , a tree-based machine learning algorithm, produced the highest accuracy, and positive predictive value (precision), F1-score, and ROC. The final parameters with the best fine-tuned results of all models are presented in Table 3. The value of the area under the receiver operating characteristics curves of Random Forests algorithm and Precision and Recall curve are shown in Fig. 6a,b.   38,39 and Bayesian credible interval 40 were used. SHAP value explanation method was used to rank the 20 most important predictors in Fig. 6. The five most influential variables were: Initial femoral BMD, Anti-osteoporosis agents, Initial lumbar BMD, Age, and Lumbar T-score. Regarding the analysis of Bayesian credible interval, Table 4 shows the mean difference, standard deviation (sd) of the features in the population, CrI, and value interpretation. The High Density Interval (hdi) at 2.5th (hdi-2.5%) and 97.5th (hdi-97.5%) represent the lower and higher value of the 95% CrI. Analysis suggests that the most plausible values with higher probability of representing the true (unknown) estimate are the mean difference of the BMD (Femur), age, lumbar Z-score, femur T-score, radius Z-score, Calcium (oral), laboratory result of Alkaline Phosphatase, Analgesic Balm, and Codeine with Acetaminophen, with the response group results significantly lower than for the inadequate response group, with at least a 95% probability. Conversely, the mean difference of anti-osteoporosis agent, lumbar T-score, femur Z-score, radius T-score, Ibandronate (oral, monthly), and weight of the response group are significantly higher than for the   Individual BMD response prediction. In clinical application, the personalized osteoporosis management approach aims to tailor therapy to individual patients for improving BMD. The proposed machine-learning model identifies possible BMD response patterns of different treatments based on complex personal clinical data. A set of selected 225 features from the EMR database was given to the computational model with 11 different inputs of anti-osteoporosis treatment regimens (Fig. 7). The output was a patient-specific probability for each treatment outcome. Administration route and frequency of regimens were also labeled. We applied the prediction algorithm to 1357 treatment profiles in the testing dataset. In Table 5, we present the treatment response probability of actual regimens and the recommended regimens. The difference of average probability between actual and recommended regimens were 8.36% in the response group (see an example of a response case in Scenario II), 11.47% in the inadequate response group (see an example of an inadequate response case in Scenario I), and 9.54% in the whole testing dataset.
Scenario I: inadequate response to treatment (Fig. 7b). An 80-year-old female with comorbidities of Diabetes and Cataracts. The initial lumbar and femoral BMD were 0.583 and 0.534, respectively. The lumbar and total femoral BMD T-score were −3.7 and −2.5 , respectively (WHO classification of Osteoporosis 35 ). The patient

Osteoporosis
(T-score below -2.  www.nature.com/scientificreports/ did not receive an anti-osteoporosis agent. At her 1-year follow up, she was considered as having an inadequate response to treatment: her serial lumbar and femoral BMD were 0.585 (+ 0.34%) and 0.506 ( −5.24% ), respectively. The actual situation showed the lowest predicted probability of response (0.55), while the best recommended regimen (Denozumab 6M) showed 0.76 probability of response. The difference between actual and recommended regimen was + 21%.
Scenario II: response to treatment (Fig. 7c). A 71-year-old male with initial lumbar and femoral BMD of 0.66 and 0.515, respectively. The lumbar and total femoral BMD T-score were −3 and −2.7 , respectively (Osteoporosis according to WHO classification 35 ). The patient received an anti-osteoporosis agent (weekly oral Alendronate). At the 1-year follow up, he was considered as having an adequate response to treatment: his serial lumbar and femoral BMD were 0.663 (+ 0.45%) and 0.539 (+ 4.66%), respectively. The algorithm predicted the actual regimen's response probability of 0.60, while the best recommended regimen (Denozumab 6M) showed 0.77 probability of response. The difference was + 17%.

Discussion
This study demonstrated the development of a machine learning-based clinical decision support system. The model, which was trained with clinical features, laboratory results, and prescription records from an EMR database, can predict the probability of a BMD response after osteoporosis treatment. This method is an automated data pipeline process which can be potentially integrated into hospital EMR systems. Eventually, it has potential benefits for physicians when selecting therapeutic regimens. In this study, the majority of patients responded to osteoporosis treatments (see the treatment response in Table 1). However, the percentage of inadequate responses in patients with osteoporosis was relatively high (37.46%) compared with the previous study (25.8%) 14 . Since our dataset was larger and uncontrolled, representing real-world circumstances, this may affect adherence to therapy and response to treatment. In some real-world studies, the adherence rates were reported as low as 25% for one-year treatments 44,45 . This finding of inadequate response confirmed that the poor-compliance issue is a serious problem in our hospital service and also remains a worldwide challenge, and the causes of inadequate response after osteoporosis drug therapy initiation must be explored.
We found that patients in the response group were older than the inadequate response group but baseline femoral and lumbar BMD of the inadequate response group were significantly lower than the response group while the average serum ALP in the inadequate response group was significantly higher than the response group. These results confirm findings by previous research 14 as relevant factors for predicting the response of osteoporosis treatment. Apart from the relevant factors, another interesting finding is the history of forearm fracture (ICD10-S52).
The number of patients with a history of forearm fractures in the response group was higher than for the inadequate response group (see Table 1). This was the only feature among the history of fractures which was significantly different. This means the patients with a history of forearm fractures had a high successful rate of treatment (77.42%), while the other types of fractures were not different. We manually explored the cause of this finding and found that the average BMD of femur and lumbar ( 0.61 ± 0.13 and 0.79 ± 0.14 , respectively) in the forearm fracture group were lower than the average of the population in this study. Because of the low initial BMD before treatment, this group intentionally had a higher chance of receiving anti-osteoporosis treatment. We confirmed this assumption by exploring this patient group and the data showed that 37 of 62 forearm fractures patients (59.68%) received anti-osteoporosis agents which were more than the average of treatment profiles in the population of this study (52.93%). In clinical practice, distal forearm fractures of the wrist are commonly found in the older population who tend to have bone degeneration with high bone loss (low BMD) at the distal forearm 46 . Therefore, the low BMD and high number of treatment resulted in a higher rate of BMD improvement. However, when we explored the treatment response in hip, femur, shoulder, and upper arm fractures, the results showed that the number of the response group was also found more often that the inadequate response group but not at a statistically significant level. We believe that if we could further investigate in a larger and Table 3. The best tuned parameters of the models. www.nature.com/scientificreports/ controlled study, we might find some interesting information and significant clinical impact as they are also a common osteoporosis fracture. For the treatment profiles, we found that Calcium and Vitamin D supplementation were routinely used in the treatment of osteoporosis patients. Calcium and Vitamin D supplementation with/without anti-osteoporosis agents showed significant outcomes from the treatment except in conjunction with 3-month Ibandronate (3 mg) which had a higher number in the response treatment group but not at a statistically significant level, possibly because the sample size was too small.

Receiver Operating Characteristic
Precision and Recall Curve (Threshold at 0.5) a. b. c.

Age
Lumbar T-score Lumbar Z-score Femur Z_score Radius T-score   www.nature.com/scientificreports/ We observed the major characteristics between the response and inadequate response group recognized by the machine learning model according to the SHAP explanation in Fig. 6c and the CrI interpretation in Table 4 that were relevant to predict response outcome of the treatment. Typical patients in the response group had lower levels of initial femoral and lumbar BMD, received the treatment with anti-osteoporosis agents, and were of a slightly younger age. Meanwhile, the characteristics of the inadequate response group were patients with higher initial femoral BMD, were not treated with anti-osteoporosis agents, and were of a slightly older age compared with the response group.
By the design of this study, we developed a supervised machine learning model using real-world EMR-derived clinical information which consists of elementary variables such as demographic data, co-morbidities, previous diagnosis of fracture, basic laboratory results, and drug prescriptions. These high availability features are basic information stored in almost all EMRs. We assessed the performance of 7 algorithms (Table 2). We decided to use Random Forests algorithm to develop a prediction model according to its overall performance which was slightly higher than the other algorithms. The important step was the feature selection; we compared the performance of machine learning between training with all 8981 features and various amounts of selected features (Fig. 4). We found that using all 8981 features contributed similar or slightly higher performance comparing to lesser amount of features, but the dataset was very large and the model took a much longer time for computational processing. As a result, a smaller set of relevant variables (as lower as 20 features) is more practicable to develop a lightweight machine learning model and to apply in general-setting hospitals where these clinical variables are available.
Feature contributions using SHAP value (in Fig. 6c) revealed that initial BMD results, T-score, and Z-score were potent features to predict response to treatment. However, in clinical practice, some patients with osteoporotic fractures might receive anti-osteoporosis treatment without having a BMD result due to the unavailability of bone densitometry equipment in some remote areas. In that situation, using the general standard guidelines 12,18-21 might be appropriate for choosing an initial treatment.
This work offers a potential clinical application by recommending a personalized choice of the best antiosteoporosis regimen based on the prediction of BMD response and clinical factors. In Table 5, a higher probability of response for the recommended regimen was shown compared to the actual regimen that the patient received; this is especially important for inadequate response patients. This additional information could be crucial for physicians to identify low-response osteoporosis patients and choose alternative anti-osteoporosis agents which provide optimal response to the treatment 47 .
We demonstrated two scenarios of clinical application that could be further used in a hospital setting. The first scenario showed that Patient 1 (Fig. 7b) had not been prescribed with any anti-osteoporosis agents, resulting in decreased BMD at the 1-year follow up. The algorithm predicted low probability of response in this actual situation (Probability 0.55). However, if the patient had received at least one of the anti-osteoporosis regimens, the treatment outcome would be better compared to the actual situation. The quantification of the predicted outcome difference between actual and alternative regimens may provide evidence to support a physician's decision to choose the best anti-osteoporosis agent to achieve a desirable outcome. In the case of Patient 2 (Fig. 7c), the patient received an actual regimen of weekly oral Alendronate (predicted response probability 0.60). The model www.nature.com/scientificreports/ predicted that another anti-osteoporosis treatment would provide a higher probability of response (Probability > 0.60 ). This recommendation would aid a physician to select an alternative better-response treatment, taking into consideration the adhering ability of a different administration protocol such as monthly or 6-monthly treatments to obtain a higher possibility of BMD improvement. However, the application of the model on a recommendation system might need to incorporate awareness of atypical cases. For instance, a patient with co-morbid conditions such as chronic kidney disease might be a contraindication for some anti-osteoporosis drugs. Other special considerations exist, such as risk of atypical femoral fracture, and osteonecrosis of the jaw, for which the prediction model is not able to make an appropriate recommendation. Machine learning-based prediction models and their application to osteoporosis treatment have shown potential opportunity for widespread adoption, as seen in the FRAX ® scenario which has been widely adopted by the World Health Organization (WHO) as indicating standard guidelines to assess individual risk of fracture over 10 years with 12 variables 8,12,18,48 . Although FRAX ® has been efficiently used to predict probability of osteoporotic fractures and has been validated in many countries, there are some limitations: physicians need to manually complete a form to calculate the risk on a website and individually assess complex patient information in order to select the treatment of choice. Our proposed model has demonstrated some points of clinical service improvement. First, this machine learning model could be embedded in an automated clinical decision support system in EMR which can be seamlessly implemented to support a clinical practice, without the need for manual variable input. Second, the machine learning development process is scalable and reproducible which enables the model to continuously learn with larger or newer information. Third, the model is applicable to be customized in other institutes to develop a personalized model using their own dataset that suits the characteristics of local patients.
This study has a few notable limitations worth mentioning. First, our prediction model is developed from retrospective information. We were unable to include important missing variables such as weight, height, parental fracture history, smoking history, and alcohol consumption characteristics. These lifestyle features are not routinely recorded in our EMR in a digital format. This missing information might affect the accuracy of the model. Second, this study excluded treatment profiles with no serial BMD which therefore excluded poor-compliance or loss of follow-up patients. As a result, the actual inadequate response group may be higher than in this study and some characteristics of the inadequate response group may not be reflected in the real-world situation. Since the general treatment adherence rate is typically low, even though we included all 10-year treatment profiles in our institute, the number of eligible patients with a long-term follow-up BMD was still limited. Thus, we decided to study only the initial treatment and a single serial BMD, taken 12-24 months after treatment. Third, the current criteria to define inadequate response to osteoporosis treatment is different among different studies 49 . This study complied to our national guidelines to consider BMD decrease as an inadequate response 12 . We determined change of total femoral BMD and lumbar BMD as output variables, despite a femoral neck BMD also being a variable used to evaluate response of treatment. We could not gather the incidence of fracture information after treatment as part of our definition of inadequate response in this study. Lastly, we conducted the experiment in a single-center design which may lead to an over-fitting model; future external validation in other populations is required. Further study should include a cross-institution dataset to broaden the range of the variables, or incorporate genomic information 29 which may individually affect the response of treatment. We also encourage investigators to create prediction models for long-term treatment outcomes.

Conclusion
In summary, our results show that it is feasible to use a combination of EMR-derived information to develop a machine learning algorithm to predict a BMD response following osteoporosis treatment. This alternative approach can aid physicians to select an optimal therapeutic regimen in order to maximize a patient-specific treatment outcome.

Data availability
De-identified data are available upon reasonable request from qualified investigators.

Code availability
The code is available upon reasonable request from qualified investigators.