Abstract
Duchenne muscular dystrophy (DMD) is the most common and severe form of muscular dystrophy and affects boys in infancy or early childhood. Current methods for diagnosing DMD are often laborious, expensive, invasive, and typically diagnose the disease late in its progression. In an effort to improve the accuracy and ease of diagnosis, this study focused on developing a novel method for diagnosing DMD which combines Raman hyperspectroscopic analysis of blood serum with advanced statistical analysis. Partial least squares discriminant analysis was applied to the spectral dataset acquired from blood serum of a mouse model of Duchenne muscular dystrophy (mdx) and control mice. Cross-validation showed 95.2% sensitivity and 94.6% specificity for identifying diseased spectra. These results were verified via external validation, which achieved 100% classification accuracy at the donor level. This proof-of-concept study presents Raman hyperspectroscopic analysis of blood serum as an easy, fast, inexpensive, and minimally invasive detection method for distinguishing control and mdx model mice, with a strong potential for clinical diagnosis of DMD.
Introduction
Duchenne muscular dystrophy (DMD) is a progressive form of muscular dystrophy which typically affects male infants. DMD is an X-linked recessive disorder caused by loss of function of the dystrophin gene, which spans 2.3 million base pairs; this loss results in progressive weakness and atrophy of the skeletal and cardiac muscles.1,2 The issues associated with DMD are severe, worsen over time, and greatly impact the well-being of the afflicted individual. In fact, secondary complications due to DMD, including cardiac and respiratory muscle problems, can lead to life-threatening conditions.3 Although there is no cure, limited treatment regimens exist for DMD which can slow the progression of the symptoms associated with the disease.
Diagnosing DMD typically involves evaluating family history as well as conducting blood tests to assess the levels of specific muscle enzymes in the blood. Although the inheritance of the disease is through an X-linked recessive pattern, there are cases where DMD occurs in families who have no history of it. The complicated pattern of inheriting DMD suggests a need for additional testing. Blood tests often monitor the level of serum creatine phosphokinase (CPK); however, this test can only detect the disease in later stages and is generally non-specific, as high levels of CPK can be found in an individual’s blood after experiencing a heart attack, drinking alcohol in excess, or participating in strenuous exercise.4,5,6,7,8,9 Electromyography can confirm muscle weakness without pinpointing a direct cause of it.10 Muscle biopsies can differentiate muscular dystrophies from other muscle diseases;11 however, biopsy examinations can be both expensive and invasive. Further, biopsies and genetic testing are typically pursued only after other options have been exhausted, resulting in the disease being diagnosed in its later stages. Because DMD is progressive, it is of the utmost importance to definitively diagnose the disease as early in its progression as possible. The earlier the disease is identified, the better the chance the afflicted individual has of receiving treatment to slow the progression of the disease phenotype.
To improve the accuracy, ease, and potential of an early diagnosis, we focused on developing a novel method for diagnosing DMD using Raman hyperspectroscopic analysis of mdx mouse blood serum combined with advanced statistical analysis. Most DMD patients display deletion mutations of one or more of the 79 exons in the DMD gene, leading to out-of-frame mutations and loss of dystrophin protein in their muscle fibers.12 Similar to patients, the dystrophin mutant mdx mice do not express dystrophin13 and have been widely used as a model system to study DMD and to make important advances in understanding therapeutic strategies as well as the molecular processes and underlying causes of the disease.2,14 The mdx mouse model serves as an efficient model for developing a better diagnostic method without influence from complications, such as the effect of prescribed medications, associated with humans.
Raman hyperspectroscopy has shown great potential to diagnose many diseases including cancers,15,16 Alzheimer’s disease,17,18,19 and others.20,21 Raman hyperspectroscopy involves collecting multiple Raman spectra from a sample to better characterize its inherent heterogeneity and understand its biochemical composition. This allows for the detection of changes in biological composition of blood serum due to disease progression. Because Raman hyperspectroscopy produces this specific spectral fingerprint for each sample, different samples can be distinguished, including dried traces of body fluids collected from healthy donors and from donors with a disease. Here, we capitalized on the advantages of Raman hyperspectroscopy in combination with advanced statistical analysis to build a model which identifies spectral differences between different classes of samples to make diagnostic predictions. Partial least squares discriminant analysis (PLS-DA) was used to build a model which could distinguish Raman spectral data of healthy control mice from Raman spectral data of mdx mice. The results were verified using external validation. Genetic algorithm (GA) identified spectral features which contribute the most useful information toward differentiation; these features were assigned to vibrational modes of various biomolecules previously identified as playing a role in the pathogenesis of DMD. For the first time, this proof-of-concept study shows Raman hyperspectroscopy in combination with advanced statistical analysis is successful in distinguishing control from mdx model mice in a simple, accurate, early, and minimally invasive manner, indicating a strong potential for clinical diagnosis of DMD.
Results
Validation of skeletal muscle abnormalities in mdx mice by examining the tibialis anterior (TA) muscle morphology
Duchenne muscular dystrophy is the most common and most severe form of muscular dystrophy. DMD is characterized by muscle wasting and weakness due to excessive muscle degeneration. The tibialis anterior (TA) muscle morphology of 3-month old and 12-month old control (C57BL/10ScSnJ) and mdx (C57BL/10ScSn-Dmd<mdx>/J) mice was examined using Hematoxylin and Eosin (H&E) staining (Fig. 1A–D). As expected, normal skeletal muscle morphology was observed in 3-month old control mice (Fig. 1A). Mild skeletal muscle degeneration was observed in 3-month old mdx mice as characterized by the smaller diameter of muscle fibers with central nuclei, occasional presence of atrophied muscle fibers, and the presence of an increased number of nuclei, representing inflammatory cells (Fig. 1B). Similar to 3-month old control mice, 12-month old control mice displayed normal skeletal muscle morphology (Fig. 1C). Skeletal muscle degeneration progresses as mdx mice get older. As such, muscle degeneration was much more prominent in the 12-month old mdx mice as marked by the absence of normal muscle structure in most areas of the tissue section and the presence of fatty and necrotic tissues (Fig. 1D).
Skeletal muscle degeneration is observed in the mouse model of DMD. Hematoxylin and Eosin (H&E) staining of TA muscle cross sections from 3- and 12-month-old control (C57BL/10ScSnJ) (A, C) and mdx (C57BL/10ScSn-Dmd<mdx>/J) (B, D) mice. The 3-month old control muscle cross-section shows normal fiber morphology including circular shape and absent central nuclei (A), whereas 3-month old mdx mice show muscle degeneration denoted by muscle fibers with central nuclei and smaller diameter (yellow arrows), atrophied muscle (smaller than the neighboring fibers without central nuclei, black arrow), and more prevalent nuclei which represent inflammatory cells (B). Control mice at 12-months old (C) are compared to the 12-month old mdx mice (D) where muscle degeneration is much more dramatic, as evident by the absence of normal muscle fiber structure in almost all areas of the section; the muscle structure is often taken over by fatty and necrotic tissues, as indicated by the unstained/white areas (green arrows). Scale bar: 100 µm.
Raman spectroscopic analysis of mice blood serum
Because DMD is progressive, it is crucial to develop a simple diagnostic tool for identifying the disease as early as possible. In this proof-of-concept study, dried blood serum of healthy and mdx mice at 3- and 12-months old was analyzed by Raman hyperspectroscopy in an attempt to develop a novel diagnostic method. Blood serum is the portion of blood which does not contain cells or clotting factors, and has been widely studied in the past for diagnostic purposes.18,22,23,24,25
Raman spectra were collected from the serum of 14 mouse donors through automatic mapping. Mapping was conducted to obtain an accurate representation of the entire biochemical composition of each dried serum sample, with the intention of identifying key biochemical components useful for discrimination between classes. The two classes of donors consisted of healthy mice (control, n = 7) and mdx model mice (MDX, n = 7). Of the 14 total blood serum samples, six (three control and three MDX) were collected from mice at three months old and eight (four control and four MDX) were collected from mice at 12 months old. The three month old mdx mice are considered a model of early DMD, and 12 month old mdx mice are considered a model of late stage DMD. The mean preprocessed spectra for all donors from each class are shown in Fig. 2. Peak assignments are discussed further on.
Mean preprocessed Raman spectra collected from the two classes of dried mice blood serum. The mean preprocessed spectrum of all control mice blood serum samples is represented by the pink line, whereas the mean preprocessed spectrum of all mdx mice blood serum samples is represented by the blue line.
Model calibration for differentiating healthy controls from MDX mice
The donors were split into two groups: the calibration group and the validation group. The spectral data from the ten donors of the calibration set (five control, five MDX) was used to build the PLS-DA prediction algorithm. The spectral data from the validation dataset, consisting of two control and two MDX donors, were set aside and used for external validation. Mice of different ages (3- and 12-months) were included in both groups.
The difference between the mean control and the mean MDX spectrum was calculated and compared with ± 2 standard deviations within each class. The difference spectrum fell within the standard deviations (Supplementary Fig. S1), indicating that the spectral changes shown in the difference spectrum are smaller than the variation which occurs within each class, and are statistically insignificant. Advanced statistical analysis is thus required to capitalize on the important spectral features which vary between the two classes at the level of individual spectra but are hidden from the mean spectra. This variability is useful for discriminating between the two classes of data.
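The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure run in PLS_Toolbox: the function name `difference_within_two_sd` and the toy intensities are hypothetical, and the check assumes that "within ± 2 standard deviations" means the absolute difference at each wavenumber does not exceed twice the smaller of the two within-class standard deviations.

```python
from statistics import mean, stdev

def difference_within_two_sd(control, mdx):
    """For each spectral channel, report whether the difference between the
    class means lies inside +/- 2 within-class standard deviations.

    control, mdx: lists of spectra; each spectrum is a list of intensities
    of equal length. (Hypothetical helper, not part of any toolbox.)
    """
    n_channels = len(control[0])
    flags = []
    for j in range(n_channels):
        c = [spectrum[j] for spectrum in control]
        m = [spectrum[j] for spectrum in mdx]
        diff = mean(m) - mean(c)
        # Assumption: use the narrower of the two class bands.
        band = 2 * min(stdev(c), stdev(m))
        flags.append(abs(diff) <= band)
    return flags

# Toy data: two channels, three spectra per class (made-up numbers).
toy_control = [[1.0, 2.0], [1.2, 2.2], [0.8, 1.8]]
toy_mdx = [[1.1, 5.0], [1.3, 5.2], [0.9, 4.8]]
flags = difference_within_two_sd(toy_control, toy_mdx)
```

In the toy data the first channel differs by less than the within-class spread and the second does not; in the study, every channel of the real difference spectrum fell inside the band, which is why multivariate analysis of individual spectra was needed.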
Partial least squares discriminant analysis (PLS-DA) was selected to build a discrimination algorithm. A binary model was built to distinguish between control and MDX blood serum spectral data of the calibration dataset. Eight latent variables captured the maximum covariance between the spectral data and the assigned classes. Each spectrum from the calibration dataset was assigned a set of scores which correspond to how similar that spectrum is to each latent variable. Each class is ideally represented by a range of scores seen as typical for that class. Scores plots can be used to understand the separation which exists between different classes, and any spectrum which is loaded into the model will be given a set of scores which is used to decide to which class it belongs. The model built herein showed clear separation between the two classes (Fig. 3).
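To make the idea of latent variables and scores concrete, the following is a minimal pure-Python sketch of a two-class PLS-DA built on the NIPALS PLS1 algorithm. It is not the PLS_Toolbox implementation used in the study. The class coding (control = 0, MDX = 1), the 0.5 decision cutoff, the function names, and the toy data are all assumptions made for illustration; the study used eight latent variables, whereas the toy example uses one.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pls1(X, y, n_components):
    """NIPALS PLS1 on mean-centered data. X: list of spectra, y: 0/1 labels."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    components, scores = [], [[] for _ in range(n)]
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance between X and y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm == 0:
            break
        w = [v / norm for v in w]
        t = [dot(E[i], w) for i in range(n)]                   # scores
        tt = dot(t, t)
        p_load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate: remove what this latent variable explains.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p_load, q))
        for i in range(n):
            scores[i].append(t[i])
    return {"x_mean": x_mean, "y_mean": y_mean,
            "components": components, "scores": scores}

def predict_pls1(model, x):
    """Predicted class value for one new spectrum."""
    e = [v - m for v, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p_load, q in model["components"]:
        t = dot(e, w)
        y_hat += q * t
        e = [e[j] - t * p_load[j] for j in range(len(e))]
    return y_hat

def classify(model, x, cutoff=0.5):
    # Assumed decision rule; the toolbox may use a different threshold.
    return "MDX" if predict_pls1(model, x) >= cutoff else "control"

# Toy "spectra" with three channels (made-up numbers).
X = [[1.0, 0.2, 3.0], [1.1, 0.1, 3.2], [0.9, 0.3, 2.9],
     [2.0, 0.2, 1.0], [2.1, 0.1, 1.1], [1.9, 0.3, 0.9]]
y = [0, 0, 0, 1, 1, 1]
model = fit_pls1(X, y, n_components=1)
```

Each entry of `model["scores"]` holds the latent-variable scores of one calibration spectrum, the quantities plotted in a scores plot. A new spectrum is projected onto the same latent variables and classified from its predicted value.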
PLS-DA scores plot. The PLS-DA scores plot built using the first three latent variables. The distribution of symbols represents the separation which exists between the two classes of blood serum spectra where pink diamonds signify controls and blue squares signify MDX. Each symbol represents an individual spectrum.
The sensitivity and specificity rates for classification of the PLS-DA diagnostic algorithm were calculated. In this study, the sensitivity is defined as the true positive rate, or percentage of MDX spectra correctly predicted as belonging to the MDX class. The specificity is defined as the true negative rate, or percentage of control spectra correctly predicted as not belonging to the MDX class. Individual spectral predictions for all donors within the calibration dataset are observed in the confusion matrix presented in Table 1. Here, every Raman spectrum is assigned a class (either control or MDX). The assignments are compared to the true, or known, classification for each spectrum. Cross-validation of the PLS-DA model by venetian blinds resulted in 95.2% sensitivity and 94.6% specificity for training the algorithm using the calibration dataset.
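The two definitions above translate directly into code. The sketch below uses hypothetical labels rather than the study's confusion matrix; the function name and the example predictions are illustrative assumptions.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="MDX"):
    """Return (sensitivity, specificity) with `positive` as the diseased class."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1   # MDX spectrum correctly called MDX
            else:
                fn += 1   # MDX spectrum missed
        else:
            if predicted == positive:
                fp += 1   # control spectrum wrongly called MDX
            else:
                tn += 1   # control spectrum correctly called control
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 4 MDX spectra and 5 control spectra.
truth = ["MDX"] * 4 + ["control"] * 5
predicted = ["MDX", "MDX", "MDX", "control",
             "control", "control", "control", "control", "MDX"]
sens, spec = sensitivity_specificity(truth, predicted)
```

Here three of four MDX spectra and four of five control spectra are predicted correctly, giving a sensitivity of 0.75 and a specificity of 0.8.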
External validation of the PLS-DA model
External validation was performed using the spectral data collected from the four donors of the validation dataset. The validation dataset was kept independent from the training set; testing on such independent data is considered a powerful method for assessing the validity and strength of a classification model. A total of 185 spectra collected from the four samples were loaded into the PLS-DA algorithm for external validation. The class assignment for each spectrum was predicted (Table 1). Again, the sensitivity and specificity of classification for external validation were calculated. Here, 100% sensitivity and 87.0% specificity were achieved for external validation at the level of individual spectral predictions.
Receiver operating characteristic curve analysis of external validation results
A receiver operating characteristic (ROC) curve was used to identify the optimum threshold for determining donor-level classifications based on spectral-level predictions. A ROC curve evaluates the performance of a binary classifier and is generated by plotting true positive rate (sensitivity) values against false positive rate (1 − specificity) values. Every point on the ROC curve corresponds to a potential threshold for discrimination. The ROC curve generated for the PLS-DA model built in this study, based on cross-validation, is seen in Fig. 4A. The optimal threshold for discrimination in this study is designated by the point at (0.00, 1.00), which corresponds to a cut-off value of 77%. This threshold indicates that if 77% or more of the total spectra from a donor in the external validation dataset are assigned to the MDX class, then the donor as a whole is predicted as belonging to the MDX class.
Receiver operating characteristic (ROC) curve and PLS-DA external validation results. (A) ROC curve for the cross-validated PLS-DA model, trained to differentiate between diseased and healthy control mice blood serum. The true positive rate (sensitivity) of each potential discrimination threshold is plotted against the corresponding false positive rate (1 − specificity). The optimal threshold is designated by the point at (0.00, 1.00), corresponding to a threshold of 77%. (B) The percentage of spectra classified as MDX is plotted as the bar height of each of the donors. The 77% threshold is plotted as the dashed line.
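The threshold search can be illustrated with a short sketch. It assumes each donor is summarized by the fraction of its spectra assigned to the MDX class and that the preferred threshold is the ROC point closest to (0, 1); both the selection criterion and the toy fractions are assumptions, so the 77% cut-off reported in the study is not reproduced here.

```python
def roc_points(scores, labels, positive="MDX"):
    """Return (threshold, false positive rate, true positive rate) tuples.

    A donor is called positive when its score is >= the threshold.
    """
    n_pos = sum(1 for label in labels if label == positive)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == positive)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l != positive)
        points.append((thr, fp / n_neg, tp / n_pos))
    return points

def best_threshold(points):
    """Pick the point with the smallest squared distance to the ideal (0, 1)."""
    return min(points, key=lambda pt: pt[1] ** 2 + (1 - pt[2]) ** 2)

# Made-up per-donor fractions of spectra predicted as MDX.
fractions = [0.95, 0.80, 0.30, 0.10, 0.88]
donor_labels = ["MDX", "MDX", "control", "control", "MDX"]
chosen = best_threshold(roc_points(fractions, donor_labels))
```

For these toy values the chosen point is the threshold 0.80, where all MDX donors and no control donors are called positive, which corresponds to the (0.00, 1.00) corner of the ROC plot.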
The threshold established by the ROC curve (77%) was applied to the model’s spectral-level predictions to generate a diagnosis at the donor level for external validation, as shown in Fig. 4B. The percentage of spectra which were identified as belonging to the MDX class is plotted as the height of the bar. The 77% threshold is then applied to each of the four donors. External validation donors 1 and 2 were found to have greater than 77% of their total spectra predicted as being MDX, and thus the overall donor was predicted as belonging to the MDX class; the opposite was true for external validation donors 3 and 4. Upon comparing these donor-level predictions with the true diagnosis of each, it was found that all four donors in the validation dataset were correctly identified. Thus, based on donor-level predictions, 100% successful external validation was achieved. This indicates the strength and capability of the model to be applied to new, unknown data, to make accurate diagnoses.
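The donor-level decision rule described above can be sketched as follows. The 77% threshold is taken from the text; the function name, the donor identifiers, and the spectrum-level predictions are hypothetical.

```python
from collections import defaultdict

def donor_level_calls(spectrum_predictions, threshold=0.77):
    """Aggregate spectrum-level predictions into one call per donor.

    spectrum_predictions: iterable of (donor_id, predicted_label) pairs.
    Returns {donor_id: (donor_call, fraction_of_spectra_called_MDX)}.
    """
    counts = defaultdict(lambda: [0, 0])  # donor -> [n_mdx, n_total]
    for donor, label in spectrum_predictions:
        counts[donor][1] += 1
        if label == "MDX":
            counts[donor][0] += 1
    calls = {}
    for donor, (n_mdx, n_total) in counts.items():
        fraction = n_mdx / n_total
        # "77% or more" of the spectra must be MDX for an MDX donor call.
        calls[donor] = ("MDX" if fraction >= threshold else "control", fraction)
    return calls

# Made-up predictions: donor A has 9 of 10 spectra called MDX, donor B 2 of 10.
toy_predictions = ([("A", "MDX")] * 9 + [("A", "control")]
                   + [("B", "MDX")] * 2 + [("B", "control")] * 8)
calls = donor_level_calls(toy_predictions)
```

Aggregating in this way tolerates a minority of misclassified spectra per donor, which is why the donor-level accuracy can exceed the spectrum-level specificity.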
Genetic algorithm for identifying spectral differences in blood serum
Genetic algorithm (GA) was performed to better understand the biochemical basis responsible for the model’s ability to discriminate between spectral datasets. GA is a statistical technique which capitalizes on the ideas of “natural selection” and “survival of the fittest.”26 The algorithm identifies spectral features within the dataset which contribute the most discrimination power toward separating classes of data and further provides insight into the biochemical changes that occur as the disease progresses. The results of GA are observed in Fig. 5. The tentative assignments of the Raman bands identified by GA can be attributed to various biomarkers which have been previously shown to be linked to DMD; these are summarized in Table 2.
Genetic Algorithm analysis. Mean preprocessed dried blood serum spectra of the two classes, including the spectral ranges selected by Genetic Algorithm: control (pink) and MDX (blue). Areas selected by Genetic Algorithm are marked by bolded lines. Spectral regions deemed as uninformative for discrimination are seen as unfilled lines.
Discussion
The combination of Raman hyperspectroscopy and advanced statistical analysis is incredibly advantageous for disease diagnostic purposes. Raman hyperspectroscopy involves the collection of multiple Raman spectra from a sample to characterize its heterogeneity. This is accomplished through acquiring spectral information alongside spatial information, allowing for the formation of a three dimensional data cube (x, y, λ). Two dimensions, x and y, correspond to spatial coordinates, and the third dimension, λ, represents the Raman spectrum collected at a particular pair of coordinates. By probing multiple small areas of a sample, there is a potential to identify biochemical components which, although may be present at low average concentrations, are present at a particular coordinate at a high local concentration. The ability to detect such components using this method indicates they may be useful for discrimination, and can serve as spectroscopic biomarkers. Thus, the advantage of Raman hyperspectroscopy resides in its ability to detect multiple biomarkers simultaneously, which can be used for discrimination and diagnostic purposes.17
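As a small illustration of the (x, y, λ) data cube, the sketch below stores one spectrum per map coordinate, unfolds the cube into a list of spectra for statistical analysis, and locates the coordinate where a given channel is most intense. The nested-list layout, the function names, and the intensities are assumptions chosen for clarity.

```python
def unfold_cube(cube):
    """Flatten cube[x][y] -> spectrum into parallel lists of coordinates and spectra."""
    coords, spectra = [], []
    for x, row in enumerate(cube):
        for y, spectrum in enumerate(row):
            coords.append((x, y))
            spectra.append(list(spectrum))
    return coords, spectra

def hotspot(cube, channel):
    """Return the (x, y) coordinate with the highest intensity in one channel."""
    coords, spectra = unfold_cube(cube)
    best = max(range(len(spectra)), key=lambda i: spectra[i][channel])
    return coords[best]

# Toy 2 x 2 map with three spectral channels per point (made-up numbers).
toy_cube = [
    [[0.1, 0.2, 0.1], [0.1, 0.3, 0.1]],
    [[0.1, 0.2, 0.9], [0.1, 0.2, 0.1]],
]
coords, spectra = unfold_cube(toy_cube)
spot = hotspot(toy_cube, channel=2)
```

In the toy map the third channel is weak on average but strong at a single coordinate, mirroring the point that a component at low average concentration can still be detected where it is locally concentrated.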
It is often observed that spectral differences between two similar classes of samples, such as dried traces of healthy and diseased body fluids, are insignificant when evaluated at the average level.18,27 It is expected that the majority of the composition of a body fluid remains consistent between healthy and diseased donors. In this research, the difference spectrum calculated between the average control spectrum and the average MDX spectrum of dried serum was shown to be statistically insignificant. This indicates that statistical analysis is required to better understand and evaluate the Raman spectral data obtained, and specifically, to uncover hidden characteristic features of the two classes as well as spectral variability which can be capitalized on for building a discrimination algorithm. In this study, the combination of Raman hyperspectroscopy and advanced statistical analysis was used to develop an algorithm which could accurately distinguish between control and mdx model mice through analysis of dried blood serum.
The mdx mouse model was specifically selected for this project because the strain carries a mutation within its DMD gene, resulting in the mouse not expressing the dystrophin protein and developing the disease. This animal model has been widely studied in the last several decades, and has provided extensive insight into the pathophysiology associated with muscular dystrophy.2,14 Additionally, the mdx mouse model can be manipulated to test potential therapeutic strategies, and the lack of interfering factors, such as comorbidities or the influence of prescribed medications, makes it ideal for evaluating novel diagnostic methods.
PLS-DA was selected to generate the prediction algorithm. The 14 donors used in this study were split into two groups: a calibration set and a validation set. The spectral data from the calibration set, consisting of 452 total spectra from five control and five MDX donors, was used to build and train the prediction algorithm. Cross-validation by venetian blinds resulted in 95.2% sensitivity and 94.6% specificity for identifying MDX spectra.
The prediction capabilities of the algorithm were then tested through external validation using the validation set of samples, consisting of two control donors and two MDX donors. The spectral data from the validation dataset was used to test the ability of the algorithm to make predictions regarding samples it has never before seen, and thus cannot have an inherent bias toward. The PLS-DA algorithm generated classification predictions for each individual spectrum collected from the four donors. Each sample is represented by a multitude of spectra; because dried traces of blood serum are inherently heterogeneous,28 each spectrum is expected to deviate from the mean to some extent. It is also expected that a portion of the mice blood serum components are the same between control and mdx model donors. As such, it is reasonable to assume that some spectra from one class may be predicted as belonging to the other due to the natural overlap in biochemical composition. ROC curve analysis was used to establish an optimum threshold for donor-level predictions. Using the determined threshold of 77%, all four donors of the validation dataset were identified as belonging to their true class, resulting in 100% accuracy at the donor level. External validation is an established process for determining whether or not a model is robust enough for successful application to new and unknown spectral data for accurate predictions.29,30 Successful external validation, as achieved here, indicates the potential for the method to be applied within diagnostic settings.
The contribution of multiple biomarkers to the spectroscopic signature of DMD as determined by GA allows for much more specific identification of the disease, and further supports the strength of the method. In general, by identifying biochemical components whose alterations in composition or concentration reflect the presence of a particular disease, the ability to detect that disease is dramatically increased, and can result in very high levels of classification accuracy.17 Past literature has demonstrated strong links between the pathogenesis of DMD and the tentatively assigned biomolecules. Specifically, studies have shown that a general increase in lipids, including triglycerides, phospholipids, cholesterol, and cholesterol esters, is found in patients with muscular dystrophy.31,32 In fact, in mdx model mice, elevated lipid levels were found to be associated with significant exacerbation of muscle pathology, including myofiber damage and skeletal muscle remodeling.32 Collagen has also been found to play a role in the pathogenesis of muscular dystrophy.33 Among the evidence, researchers found that an inverse relationship exists between the over-production of connective tissue and muscle protein synthesis in patients suffering from DMD.34,35,36 Other research observed unusual clusters of “sticky cells” formed by dissociated muscle of patients with Duchenne and Becker muscular dystrophies, a sign which reflects abnormal collagen production.37 Mutations in genes coding for collagen type VI are also responsible for congenital muscular dystrophies including Bethlem myopathy and Ullrich congenital muscular dystrophy.38
Many serum proteins have been identified as biomarkers which reflect the pathogenesis of DMD; the concentration of 23 identified mouse serum proteins exhibited an increase while four other proteins were found to exist at concentrations significantly lower in mdx model mice as compared to healthy control mice in one study. Proteins which were elevated mostly originated from muscle or were glycolytic enzymes, transport proteins, or other proteins such as creatine kinase M.39 These identified protein biomarkers reflect the muscle activity as well as pathogenesis of the disease. Many more studies have also identified various serum proteins as biomarkers for muscular dystrophy.40,41,42,43 It is thus unsurprising that GA identified spectral features which can be attributed to vibrational modes of proteins as being useful for discrimination purposes. Furthermore, a relationship between glycogen metabolism and DMD was supported by Naim et al. Here, results show that mdx model mice have increased skeletal muscle glycogen content; many of the enzymes involved in the skeletal muscle glycogen metabolism were dysregulated.44 Because of the dysregulation of glycogen, levels of glucose in the blood may be affected, connecting the identification of both glycogen and glucose here as also being important spectroscopic markers for DMD.
Notably, the spectral features identified by GA as being the most useful for spectroscopically discriminating between the two classes of data can also be assigned to vibrational modes of classes of biomolecules which have previously been related to the pathogenesis of the disease itself. Clearly, there is a connection between the progression of the disease and the spectroscopic signature produced. This link is strong enough to provide identifiable information which can be capitalized on through advanced statistical analysis for the purpose of generating a successful diagnostic algorithm. Through the identification of the aforementioned biomolecules associated with DMD, we were indeed able to achieve high levels of diagnostic accuracy. Raman hyperspectroscopy allows for simultaneous detection of multiple, potentially new, biomarkers for a disease. This is a considerable advantage over other diagnostic methods which investigate only one known biomarker at a time.
Conclusion
The method of combining Raman hyperspectroscopy with advanced statistical analysis is shown in this proof-of-concept study to be successful for distinguishing control and mdx model mice, with a substantial potential for clinical detection of Duchenne muscular dystrophy. Raman spectra were collected from traces of blood serum from either healthy control or diseased mice. The spectral data was analyzed using PLS-DA, which showed 95.2% sensitivity and 94.6% specificity for identifying MDX spectra in the calibration dataset, and 100% sensitivity and 87.0% specificity for identifying MDX spectra in the validation dataset. Based on donor-level predictions generated using ROC curve analysis, 100% accuracy was achieved for correctly predicting to which class the donors in the external validation dataset belonged. This is the first time this methodology has been applied toward distinguishing control and mdx model mice for the purpose of identifying DMD. Genetic Algorithm identified key biochemical components which were responsible for spectroscopic discrimination, indicating a link between the disease progression and the Raman spectroscopic fingerprint. Future research is required to study this link on a larger scale, and to investigate if a similar trend is observed within humans. It is clear that this methodology has significant potential for use as a novel technique for diagnosing Duchenne muscular dystrophy in clinical settings.
Methods
All experimental protocols were approved by the Institutional Animal Care and Use Committee and the Laboratory Animal Resources Standard Operating Procedures; all methods were carried out in accordance with relevant guidelines and regulations.
Mouse strains and sample collection
The mdx (C57BL/10ScSn-Dmd<mdx>/J; Stock Number 001801) and counterpart control mice (C57BL/10ScSnJ; Stock Number 000476) were purchased from the Jackson Laboratory, Bar Harbor, ME, USA. The mice were raised following the protocol approved by the Institutional Animal Care and Use Committee to the appropriate age (3 months and 12 months) before harvesting the tissue and blood samples. As Duchenne muscular dystrophy is an X-linked muscle degenerative disease, male mdx and male control mice were studied. 3-month-old mice are equivalent to young adult humans and 12-month-old mice are equivalent to adult humans. However, though the mdx mouse contains only a single mutation on exon 23 of the DMD gene, the phenotype of 3-month-old mice is considered equivalent to early DMD phenotypes in patients.45
Mice were euthanized following the standard operating procedure of Laboratory Animal Resources (LAR SOP # 105 and 106). Briefly, the mice were first anesthetized to a surgical plane of anesthesia under isoflurane inhalation using an induction chamber. The depth of anesthesia was verified by establishing the loss of pedal reflex. The mice were euthanized under anesthesia by isoflurane and then by cervical dislocation. For harvesting skeletal muscle, the hind leg skins were removed and the Tibialis Anterior (TA) muscles were removed with a surgical blade. The TA muscles were cut into 2 pieces and frozen fresh with Optimal Cutting Temperature (OCT) compound in plastic molds. The freezing process was carried out in a jar containing semi-frozen iso-butanol and again frozen in liquid nitrogen before storing the tissue blocks at − 80 °C. The blood samples were collected from the euthanized mice by cardiac puncture. Briefly, the skin and the rib cages were cut and pinned to the dissection board. The jugular vein was cut with sharp scissors and blood was collected in small Eppendorf tubes, without use of anticoagulant, using Pasteur pipettes.
Isolation of serum
The serum was isolated following a standard laboratory protocol. Briefly, the tubes containing the blood without any anticoagulant were left at room temperature in a standing position for about 35 min, allowing the blood to clot. Then, the clotted blood samples were centrifuged at 20 °C and 2000g for 15 min; the serum fraction was moved to a fresh tube and stored at − 80 °C. At the time of analysis, the blood serum was allowed to thaw. Each serum sample (10 µL) was deposited on an aluminum foil substrate and set aside to dry overnight before analysis.
Cryosection and histochemistry of TA muscle
Cryosectioning and H&E staining were carried out using an established protocol as described elsewhere.46,47
Raman hyperspectroscopy
A Renishaw inVia Raman spectrometer equipped with a research-grade Leica microscope was used to collect Raman spectra of dried blood serum. A PRIOR automatic mapping stage was used during measurements and the 50X objective was used to focus on the sample. Spectra were recorded between 400 and 1,800 cm−1 under excitation by the 785 nm diode laser, which was reduced to about 50% laser power to prevent photo-degradation of the sample. For each sample, 50 spectra were recorded to capture the inherent heterogeneity of the dried blood serum.
Data treatment and advanced statistical analysis
Spectra were recorded using WiRE 3.2 software, and then imported to PLS_Toolbox (Eigenvector Research Inc.) which operates within MATLAB version 2017b software (Mathworks, Inc.). Any individual Raman spectrum which displayed a poor signal-to-noise ratio or exhibited cosmic rays was removed from the dataset. The remaining spectra were subjected to preprocessing, including baseline correction, normalization, and mean centering methods available within PLS_Toolbox, before performing analysis. Tentative peak assignments were made (after applying Genetic Algorithm) and are summarized in Table 2.
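The text names the preprocessing steps but not the specific algorithms chosen within PLS_Toolbox, so the following sketch substitutes deliberately simple stand-ins: a straight-line baseline drawn between the first and last points, normalization to unit total area, and channel-wise mean centering. All function names and the toy spectra are hypothetical.

```python
def subtract_linear_baseline(spectrum):
    """Subtract the straight line joining the first and last intensity.

    Assumed stand-in for the (unspecified) baseline correction method;
    requires at least two points.
    """
    n = len(spectrum)
    first, last = spectrum[0], spectrum[-1]
    return [v - (first + (last - first) * i / (n - 1))
            for i, v in enumerate(spectrum)]

def normalize_area(spectrum):
    """Scale a spectrum so that the sum of absolute intensities equals 1."""
    total = sum(abs(v) for v in spectrum)
    return [v / total for v in spectrum] if total else list(spectrum)

def mean_center(spectra):
    """Subtract the channel-wise mean spectrum; also return that mean."""
    n = len(spectra)
    means = [sum(column) / n for column in zip(*spectra)]
    centered = [[v - m for v, m in zip(s, means)] for s in spectra]
    return centered, means

def preprocess(spectra):
    treated = [normalize_area(subtract_linear_baseline(s)) for s in spectra]
    return mean_center(treated)

# Two made-up spectra that differ only by offset and overall intensity.
raw = [[1.0, 3.0, 2.0, 1.0], [2.0, 6.0, 4.0, 2.0]]
centered, channel_means = preprocess(raw)
```

After baseline correction and normalization the two toy spectra coincide, so mean centering leaves only zeros. When a calibrated model is applied to new spectra, the channel means learned from the calibration set would normally be reused rather than recomputed.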
Partial least squares discriminant analysis (PLS-DA)
PLS_Toolbox (Eigenvector Research, Inc.) was used for statistical analysis. PLS-DA was selected to discriminate between the healthy and diseased classes. PLS-DA algorithms have been shown to be effective in various disease diagnostic applications, including the investigation of inflammatory bowel diseases,48 coronary heart disease,49 and various forms of cancer,50,51,52,53,54,55,56,57,58,59,60,61,62 among many others. PLS-DA is a supervised technique used to predict categorical variables. The dataset being analyzed is reduced to a few latent variables (LVs), which capture the maximum covariance between the spectral data and the class labels. Each spectrum is then assigned a score that reflects how closely it resembles a particular LV. Samples of a given class are represented by a set of scores characteristic of that class.63 In this way, an unknown sample can be identified by comparing its scores to those of the known classes. Here, the PLS-DA model was built using spectral data from ten samples (five control, five mdx); eight LVs were used to reduce the dimensionality of the dataset. The performance of the model was evaluated using venetian blinds cross-validation. Following this, predictions were made for the spectral data obtained from the four donors of the external validation dataset (Table 1).
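To make the PLS-DA and venetian blinds steps concrete, the following Python sketch fits a single-response PLS model (NIPALS) to classes coded 0 (control) and 1 (diseased) and evaluates it with venetian blinds splits. It is a simplified stand-in for PLS_Toolbox, not a reproduction of it: the fixed 0.5 decision threshold, the number of splits, and every function name are assumptions, and the toy model in the usage note uses one LV rather than the eight used in the study.

```python
import math

def venetian_blind_folds(n_samples, n_splits):
    """Fold k holds every n_splits-th sample, starting at index k."""
    return [list(range(k, n_samples, n_splits)) for k in range(n_splits)]

def pls1_fit(X, y, n_lv):
    """Fit PLS1 by NIPALS; returns the means and one (w, p, q) triple per LV."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[v - mu for v, mu in zip(row, x_mean)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    components = []
    for _ in range(n_lv):
        # Weight vector: direction of maximum covariance with the labels.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left to model
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(a * b for a, b in zip(f, t)) / tt
        # Deflate both blocks before extracting the next LV.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the continuous response for one spectrum by sequential deflation."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [ei - t * pj for ei, pj in zip(e, p)]
    return y_hat

def cross_validate(X, y, n_lv, n_splits, threshold=0.5):
    """Venetian blinds cross-validation; threshold=0.5 is an assumption."""
    predicted = [None] * len(X)
    for test_idx in venetian_blind_folds(len(X), n_splits):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = pls1_fit([X[i] for i in train], [y[i] for i in train], n_lv)
        for i in test_idx:
            predicted[i] = 1 if pls1_predict(model, X[i]) >= threshold else 0
    return predicted

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)
```

For example, with eight toy two-variable "spectra" ordered as four controls followed by four diseased samples, `cross_validate(X, y, n_lv=1, n_splits=4)` holds out one sample of each class per split. Note that a split drawn over individual spectra can place spectra from the same donor in both the training and test sets, which is one reason a separate donor-level external validation is informative.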
Genetic algorithm (GA)
GA was used to determine the spectral features most useful for discriminating between the two classes of data. GA is a statistical technique inspired by the ideas of evolution. The algorithm aims to solve a specific problem by generating potential solutions; recombination operators are applied to these solutions in order to preserve the critical information that best solves the problem.64 Essentially, GA identifies the spectral variables that provide the lowest prediction error rates through a repetitive model-building process. In this way, it can recognize which spectral features of the dataset provide the most useful information for discriminating between different classes of data. Concurrently, it eliminates uninformative data and noise from further consideration. Here, GA was applied to the training dataset, which consisted of ten donors and 452 spectra. The GA parameters were as follows: the population size was set to 80, the mutation rate to 0.005, and the maximum number of generations for each run to 100. The breeding was fixed to double crossover, the window width was 30, and 30% of the windows were initially included. To identify the diagnostic features within the measured Raman spectral dataset, GA was run independently 100 times, which allowed identification of the spectral bands significant for discrimination. The identified spectral features were tentatively assigned to the corresponding vibrational modes, according to the literature, to determine the potential biochemical basis for the spectroscopic differentiation (Table 2).
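The GA settings listed above can be mapped onto a short Python sketch of window-based variable selection. It is illustrative only and does not reproduce the PLS_Toolbox routine: the nearest-class-mean error used as the fitness is a stand-in for the model prediction error, the keep-the-better-half selection scheme is an assumption, and counting how often each window is selected across runs is one plausible way to combine repeated runs, not necessarily the one used here. All function names are hypothetical.

```python
import random

def double_crossover(a, b, rng):
    """Two-point crossover: swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def mutate(mask, rate, rng):
    """Flip each window's inclusion flag with probability `rate`."""
    return [(not g) if rng.random() < rate else g for g in mask]

def columns_from_windows(mask, width, n_vars):
    """Expand a per-window mask into the indices of the selected variables."""
    return [j for j in range(n_vars) if mask[j // width]]

def centroid_error(X, y, cols):
    """Fraction of spectra misassigned by a nearest-class-mean rule.
    Assumption: a simple stand-in for the prediction error of the real model."""
    if not cols:
        return 1.0
    means = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    wrong = 0
    for x, lab in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in means.items()}
        if min(dist, key=dist.get) != lab:
            wrong += 1
    return wrong / len(X)

def run_ga(fitness, n_windows, pop_size=80, mutation_rate=0.005,
           max_generations=100, initial_fraction=0.3, seed=0):
    """Minimise fitness(mask) over boolean window masks (defaults from the text)."""
    rng = random.Random(seed)
    population = [[rng.random() < initial_fraction for _ in range(n_windows)]
                  for _ in range(pop_size)]
    for _ in range(max_generations):
        ranked = sorted(population, key=fitness)
        parents = ranked[:pop_size // 2]  # assumption: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in double_crossover(a, b, rng):
                children.append(mutate(child, mutation_rate, rng))
        population = (parents + children)[:pop_size]
    return min(population, key=fitness)

def selection_frequency(fitness, n_windows, n_runs=100, **ga_kwargs):
    """Fraction of independent runs in which each window was selected."""
    counts = [0] * n_windows
    for run in range(n_runs):
        best = run_ga(fitness, n_windows, seed=run, **ga_kwargs)
        counts = [c + int(g) for c, g in zip(counts, best)]
    return [c / n_runs for c in counts]
```

A fitness function for a given dataset could be built as `lambda mask: centroid_error(X, y, columns_from_windows(mask, 30, n_vars))`, where a width of 30 matches the window width stated above. Windows selected in most runs would then be the candidates for band assignment.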
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Bulfield, G., Siller, W. G., Wight, P. A. & Moore, K. J. X chromosome-linked muscular dystrophy (mdx) in the mouse. Proc. Natl. Acad. Sci. 81, 1189–1192 (1984).
Manning, J. & O’Malley, D. What has the mdx mouse model of Duchenne muscular dystrophy contributed to our understanding of this disease?. J. Muscle Res. Cell Motil. 36, 155–167 (2015).
Darras, B. T., Urion, D. K. & Ghosh, P. S. in GeneReviews [Internet] (eds Adam, M. P., Ardinger, H. H. & Pagon, R. A.) (University of Washington, Seattle, 2000).
Oldham, H. N., Roe, C. R., Young, W. G. & Dixon, S. H. Intraoperative detection of myocardial damage during coronary artery surgery by plasma creatine phosphokinase isoenzyme analysis. Surgery 74, 917–925 (1973).
LaPorta, M. A., Linde, H. W., Bruce, D. L. & Fitzsimons, E. J. Elevation of creatine phosphokinase in young men after recreational exercise. J. Am. Med. Assoc. 239, 2685–2686 (1978).
Lafair, J. S. & Myerson, R. M. Alcoholic myopathy. Arch. Intern. Med. 122, 417–422 (1968).
Brancaccio, P., Maffulli, N. & Limongelli, F. M. Creatine kinase monitoring in sport medicine. Br. Med. Bull. 81–82, 209–230 (2007).
Nevins, M. A., Saran, M., Bright, M. & Lyon, L. J. Pitfalls in interpreting serum creatine phosphokinase activity. JAMA 224, 1382–1387 (1973).
Nygren, A. Serum creatine phosphokinase activity in chronic alcoholism, in connection with acute alcohol intoxication. Acta Medica Scand. 179, 623–630 (1966).
Shelat, A. M. in A.D.A.M. Medical Encyclopedia (eds David Zieve & Brenda Conaway) (Bethesda, MD, 2018).
Muscular dystrophy. https://www.mayoclinic.org/diseases-conditions/muscular-dystrophy (2018).
Uaesoontrachoon, K. et al. Orthogonal analysis of dystrophin protein and mRNA as a surrogate outcome for drug development. Biomark. Med. 13, 1209–1225 (2019).
Anderson, C. L. et al. The mouse dystrophin muscle promoter/enhancer drives expression of mini-dystrophin in transgenic mdx mice and rescues the dystrophy in these mice. Mol. Ther. 14, 724–734 (2006).
Partridge, T. Animal models of muscular dystrophy—What can they teach us?. Neuropathol. Appl. Neurobiol. 17, 353–363 (1991).
Bocklitz, T. W., Guo, S., Ryabchykov, O., Vogler, N. & Popp, J. R. Raman based molecular imaging and analytics: A magic bullet for biomedical applications!?. Anal. Chem. 88, 133–151 (2016).
Ralbovsky, N. M. & Lednev, I. K. Raman spectroscopy and chemometrics: A potential universal method for diagnosing cancer. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 219, 463–487 (2019).
Ralbovsky, N. & Lednev, I. K. Raman hyperspectroscopy shows promise for diagnosis of Alzheimer’s. Biophotonics 4, 33–37 (2018).
Ryzhikova, E. et al. Raman spectroscopy of blood serum for Alzheimer’s disease diagnostics: Specificity relative to other types of dementia. J. Biophoton. 8, 584–596 (2015).
Ralbovsky, N. M., Halamkova, L., Wall, K., Anderson-Hanley, C. & Lednev, I. K. Screening for Alzheimer’s disease using saliva: a new approach based on machine learning and Raman hyperspectroscopy. J. Alzheimer’s Dis. 71, 1351–1359 (2019).
Ellis, D. I. & Goodacre, R. Metabolic fingerprinting in disease diagnosis: Biomedical applications of infrared and Raman spectroscopy. Analyst 131, 875–885 (2006).
Kong, K., Kendall, C., Stone, N. & Notingher, I. Raman spectroscopy for medical diagnostics—From in-vitro biofluid assays to in-vivo cancer detection. Adv. Drug Deliv. Rev. 89, 121–134 (2015).
Zhang, A., Sun, H. & Wang, X. Serum metabolomics as a novel diagnostic approach for disease: A systematic review. Anal. Bioanal. Chem. 404, 1239–1245 (2012).
Vanmassenhove, J., Vanholder, R., Nagler, E. & Van Biesen, W. Urinary and serum biomarkers for the diagnosis of acute kidney injury: An in-depth review of the literature. Nephrol. Dial. Transplant. 28, 254–273 (2013).
Schöler, N., Langer, C., Döhner, H., Buske, C. & Kuchenbauer, F. Serum microRNAs as a novel class of biomarkers: a comprehensive review of the literature. Exp. Hematol. 38, 1126–1130 (2010).
Rosenblatt, K. P. et al. Serum proteomics in cancer diagnosis and management. Annu. Rev. Med. 55, 97–112 (2004).
Huang, C.-L. & Wang, C.-J. A GA-based feature selection and parameters optimization for support vector machines. Expert Syst. Appl. 31, 231–240 (2006).
Ryzhikova, E. et al. Multivariate statistical analysis of surface enhanced Raman spectra of human serum for Alzheimer’s disease diagnosis. Appl. Sci. 9, 3526 (2019).
Uribe, B. et al. Analysis of the heterogeneous distribution of amiloride and propranolol in dried blood spot by UHPLC-FLD and MALDI-IMS. Molecules 24, 4320 (2019).
Bleeker, S. E. et al. External validation is necessary in prediction research: A clinical example. J. Clin. Epidemiol. 56, 826–832 (2003).
Muro, C. K. & Lednev, I. K. Race differentiation based on Raman spectroscopy of semen traces for forensic purposes. Anal. Chem. 89, 4344–4348 (2017).
Srivastava, N. K., Pradhan, S., Mittal, B. & Gowda, G. A. N. High resolution NMR based analysis of serum lipids in Duchenne muscular dystrophy patients and its possible diagnostic significance. NMR Biomed. 23, 13–22 (2010).
Milad, N. et al. Increased plasma lipid levels exacerbate muscle pathology in the mdx mouse model of Duchenne muscular dystrophy. Skelet. Muscle 7, 19 (2017).
Duance, V. C., Stephens, H. R., Dunn, M., Bailey, A. J. & Dubowitz, V. A role for collagen in the pathogenesis of muscular dystrophy?. Nature 284, 470–472 (1980).
Ionasescu, V., Zellweger, H. & Conway, T. W. Ribosomal protein synthesis in Duchenne muscular dystrophy. Arch. Biochem. Biophys. 144, 51–58 (1971).
Ionasescu, V., Zellweger, H., Ionasescu, R., Lara-Braud, C. & Cancilla, P. A. Protein synthesis in muscle cultures from patients with Duchenne muscular dystrophy: Calcium and A23187 ionophore dependent changes. Acta Neurol. Scand. 54, 241–247 (1976).
Ionasescu, V., Lara-Braud, C., Zellweger, H., Ionasescu, R. & Burmeister, L. Fibroblast cultures in Duchenne muscular dystrophy: alterations in synthesis and secretion of collagen and noncollagen proteins. Acta Neurol. Scand. 55, 407–417 (1977).
Thompson, E. J., Yasin, R., van Beers, G., Nurse, K. & Al-Ani, S. Myogenic defect in human muscular dystrophy. Nature 268, 241–243 (1977).
Lampe, A. K. & Bushby, K. M. D. Collagen VI related muscle disorders. J. Med. Genet. 42, 673–685 (2005).
Hathout, Y. et al. Discovery of serum protein biomarkers in the mdx mouse model and cross-species comparison to Duchenne muscular dystrophy patients. Hum. Mol. Genet. 23, 6458–6469 (2014).
Hathout, Y. et al. Large-scale serum protein biomarker discovery in Duchenne muscular dystrophy. Proc. Natl. Acad. Sci. 112, 7153–7158 (2015).
Alagaratnam, S. et al. Serum protein profiling in mice: identification of Factor XIIIa as a potential biomarker for muscular dystrophy. Proteomics 8, 1552–1563 (2008).
Martin, F. C. et al. Fibronectin is a serum biomarker for Duchenne muscular dystrophy. Proteom. Clin. Appl. 8, 269–278 (2014).
Hathout, Y. et al. Clinical utility of serum biomarkers in Duchenne muscular dystrophy. Clin. Proteom. 13, 9 (2016).
Stapleton, D. I. et al. Dysfunctional muscle and liver glycogen metabolism in mdx dystrophic mice. PLoS ONE 9, e91514 (2014).
Yucel, N., Chang, A. C., Day, J. W., Rosenthal, N. & Blau, H. M. Humanizing the mdx mouse model of DMD: The long and the short of it. NPJ Regen. Med. 3, 1–11 (2018).
Dey, B. K., Gagan, J., Yan, Z. & Dutta, A. miR-26a is required for skeletal muscle differentiation and regeneration in mice. Genes Dev. 26, 2180–2191 (2012).
Dey, B. K., Pfeifer, K. & Dutta, A. The H19 long noncoding RNA gives rise to microRNAs miR-675-3p and miR-675-5p to promote skeletal muscle differentiation and regeneration. Genes Dev. 28, 491–501 (2014).
Dawiskiba, T. et al. Serum and urine metabolomic fingerprinting in diagnostics of inflammatory bowel diseases. World J. Gastroenterol. 20, 163–174 (2014).
Brindle, J. T. et al. Rapid and noninvasive diagnosis of the presence and severity of coronary heart disease using 1H-NMR-based metabonomics. Nat. Med. 8, 1439–1445 (2002).
Liu, W., Sun, Z., Chen, J. & Jing, C. Raman spectroscopy in colorectal cancer diagnostics: Comparison of PCA-LDA and PLS-DA models. J. Spectrosc (2016).
Winnard, P. T. Jr. et al. Organ-specific isogenic metastatic breast cancer cell lines exhibit distinct Raman spectral signatures and metabolomes. Oncotarget 8, 20266–20287 (2017).
Olmos, V., Bedia, C., Tauler, R. & Juan, A. D. Preprocessing tools applied to improve the assessment of aldrin effects on prostate cancer cells using Raman spectroscopy. Appl. Spectrosc. 72, 489–500 (2018).
Brozek-Pluska, B., Kopeć, M. & Abramczyk, H. Development of a new diagnostic Raman method for monitoring epigenetic modifications in the cancer cells of human breast tissue. Anal. Methods 8, 8542–8553 (2016).
Abramczyk, H. & Imiela, A. The biochemical, nanomechanical and chemometric signatures of brain cancer. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 188, 8–19 (2018).
Bergholt, M. S. et al. in Biomedical Vibrational Spectroscopy VI: Advances in Research and Industry. 89390M (International Society for Optics and Photonics).
Bergholt, M. S. et al. Simultaneous fingerprint and high-wavenumber fiber-optic Raman spectroscopy enhances real-time in vivo diagnosis of adenomatous polyps during colonoscopy. J. Biophoton. 9, 333–342. https://doi.org/10.1002/jbio.201400141 (2016).
Bergholt, M. S. et al. Characterizing variability of in vivo Raman spectroscopic properties of different anatomical sites of normal colorectal tissue towards cancer diagnosis at colonoscopy. Anal. Chem. 87, 960–966 (2015).
Wang, J. et al. Simultaneous fingerprint and high-wavenumber fiber-optic Raman spectroscopy improves in vivo diagnosis of esophageal squamous cell carcinoma at endoscopy. Sci. Rep. 5, 12957 (2015).
Lin, K., Zheng, W., Lim, C. M. & Huang, Z. Real-time in vivo diagnosis of laryngeal carcinoma with rapid fiber-optic Raman spectroscopy. Biomed. Opt Express 7, 3705–3715. https://doi.org/10.1364/boe.7.003705 (2016).
Schleusener, J. et al. In vivo study for the discrimination of cancerous and normal skin using fibre probe-based Raman spectroscopy. Exp. Dermatol. 24, 767–772 (2015).
Wang, J. et al. Fiber-optic Raman spectroscopy for in vivo diagnosis of gastric dysplasia. Faraday Discuss. 187, 377–392 (2016).
Wang, J. et al. Comparative study of the endoscope-based bevelled and volume fiber-optic Raman probes for optical diagnosis of gastric dysplasia in vivo at endoscopy. Anal. Bioanal. Chem. 407, 8303–8310 (2015).
Barker, M. & Rayens, W. Partial least squares for discrimination. J. Chemom. 17, 166–173 (2003).
Whitley, D. A genetic algorithm tutorial. Stat Comput 4, 65–85 (1994).
Krafft, C., Neudert, L., Simat, T. & Salzer, R. Near infrared Raman spectra of human brain lipids. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 61, 1529–1535 (2005).
Stone, N., Kendall, C., Smith, J., Crow, P. & Barr, H. Raman spectroscopy for identification of epithelial cancers. Faraday Discuss. 126, 141–157 (2004).
Talari, A. C. S., Movasaghi, Z., Rehman, S. & Rehman, I. U. Raman spectroscopy of biological tissues. Appl. Spectrosc. Rev. 50, 46–111 (2014).
Shetty, G., Kendall, C., Shepherd, N., Stone, N. & Barr, H. Raman spectroscopy: elucidation of biochemical changes in carcinogenesis of oesophagus. Br. J. Cancer 94, 1460–1464 (2006).
Cheng, W.-T., Liu, M.-T., Liu, H.-N. & Lin, S.-Y. Micro-Raman spectroscopy used to identify and grade human skin pilomatrixoma. Microsc. Res. Tech. 68, 75–79 (2005).
Acknowledgements
This work was supported by SUNY startup and American Heart Association (AHA 17SDG33670339) grants to B.K.D. N.M.R. was supported by NIH training grant T32 GM13206.
Author information
Contributions
I.K.L. and B.K.D. conceived the project; N.M.R., B.K.D. and I.K.L. contributed to study design; N.M.R. and A.G. performed data collection; N.M.R. designed, optimized, and conducted data analysis and statistical modeling, and directed manuscript writing and editing; P.D. was involved in mouse colony maintenance, harvesting blood and tissues, histology, and isolation of blood serum; I.K.L. and B.K.D. provided expertise with data interpretation and supervision of the project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ralbovsky, N.M., Dey, P., Galfano, A. et al. Diagnosis of a model of Duchenne muscular dystrophy in blood serum of mdx mice using Raman hyperspectroscopy. Sci Rep 10, 11734 (2020). https://doi.org/10.1038/s41598-020-68598-8
DOI: https://doi.org/10.1038/s41598-020-68598-8