A systematic review of movement and muscular activity biomarkers to discriminate non-specific chronic low back pain patients from an asymptomatic population

The identification of relevant and valid biomarkers to distinguish patients with non-specific chronic low back pain (NSCLBP) from an asymptomatic population in terms of musculoskeletal factors could contribute to patient follow-up and to the evaluation of therapeutic strategies. Several parameters related to movement and/or muscular activity impairments have been proposed in the literature in that respect. In this article, we propose a systematic and comprehensive review of these parameters (i.e. potential biomarkers) and related measurement properties. This systematic review (PROSPERO registration number: CRD42020144877) was conducted in Medline, Embase, and Web of Knowledge databases until July 2019. In the included studies, all movement or muscular activity parameters having demonstrated at least a moderate level of construct validity were defined as biomarkers, and their measurement properties were assessed. In total, 92 studies were included. This allowed the identification of 121 movement and 150 muscular activity biomarkers. An extensive measurement properties assessment was found for 31 movement and 14 muscular activity biomarkers. On the whole, these biomarkers support the primary biomechanical concepts proposed for low back pain. However, a consensus concerning a robust and standardised biomechanical approach to assess low back pain is needed.

Quality assessment. The overall score of each included study was calculated by the sum of rated criteria divided by the sum of applicable questions. The included studies were generally of good quality, with a mean score of 72 ± 12% (Supplementary Table 1). However, less than 50% of the studies provided sufficient information about their design or reported the reliability of the outcome, and less than 25% of the studies performed blinded analysis or justified the sample size. Five of the 97 studies [18][19][20][21][22] included for quality assessment obtained a score lower than 9 (≤ 50%) and were thus not included in the data extraction process.
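The scoring rule described above (sum of rated criteria divided by the number of applicable questions) can be illustrated with a minimal sketch. The function name, the use of `None` to mark a non-applicable question and the example ratings are assumptions made for illustration only; they are not taken from the checklist itself.

```python
from typing import Optional, Sequence, Tuple


def quality_score(ratings: Sequence[Optional[int]]) -> Tuple[int, int, float]:
    """Return (raw score, number of applicable questions, percentage).

    Each entry is 1 (criterion satisfied), 0 (not satisfied) or None
    (question not applicable to the study; hypothetical convention).
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable question is required")
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 0, 1 or None")
    raw = sum(applicable)
    return raw, len(applicable), 100.0 * raw / len(applicable)


# Hypothetical study: 18 questions, 2 of them not applicable.
example_ratings = [1, 1, 0, 1, None, 1, 0, 1, 1, 1, 0, 1, None, 1, 1, 0, 1, 1]
raw, n_applicable, percent = quality_score(example_ratings)
```

Excluding non-applicable questions from the denominator keeps studies with different designs comparable on the same percentage scale.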

Data extraction.
While not used in the data synthesis process, the full characteristics of the populations and measured parameters of each included study are available in Supplementary Table 2 to provide complete and transparent information, and a synthesis is also available in Fig. 2. The characteristics of each movement and muscular activity biomarker, used to conduct the data synthesis, are available in Supplementary Table 3.

Data synthesis.
As most of the biomarkers were only assessed by one study (see below), the risk of pooling heterogeneous populations from different studies to analyse one marker is low and was not considered in this analysis. Nevertheless, it can be observed that the included studies come mainly from Western Europe including the UK (32%), North America (28%) and Asia (12%). The mean number of participants across studies was 29.0 ± 30.4 [min: 5, max: 218] for NSCLBP, and 23.1 ± 16.5 [min: 6, max: 130] for control. It can also be observed that a significant portion of the included studies did not report pain levels (20%) or functional disability scores (35%) in NSCLBP patients. The full characteristics of the populations are available in Supplementary Table 2.
The Circos plot in Fig. 4 first highlights that the measurement properties of the reported movement biomarkers were mainly assessed by only one study (96%), followed by two studies (3%) or three studies (1%). Reliability was assessed in only 24% of these biomarkers (3% both intra- and inter-observer reliability, 18% intra-observer reliability only and 3% inter-observer reliability only). When intra- and inter-observer reliability results are considered together, the reported level was generally good (73%), followed by moderate (21%) and excellent (6%). Criterion validity was never assessed. Content validity was generally good (55%) or excellent (41%), but construct validity was mainly moderate (48%), followed by excellent (27%) and good (25%). Interpretability (MDC) was assessed for only 21 biomarkers (17%). For a large majority of biomarkers, the clinical applicability, regarding the protocol used in the included studies, was moderate (83%). Biomarkers with at least three assessed COSMIN items have been underlined in grey in Fig. 4. Only 31 biomarkers were underlined (26% of all movement biomarkers) and reported in Table 1. These biomarkers are mainly (97%) related to the ICF 2nd level category d410 "Changing basic body position" and mainly (77%) related to spatial/intensity variables and to the lumbar region (70%: 35% lumbar, 19% lumbar/leg, 16% lumbar/pelvis). Seventeen of these markers (i.e. 14% of all movement biomarkers) reached at least a good level in the assessed COSMIN items (underlined in dark grey in Fig. 4).
The Circos plot in Fig. 5 first highlights that the measurement properties of the reported muscular activity biomechanical markers were mainly assessed by one study (85%), followed by two studies (11%), three studies (3%) or four studies (1%). Reliability was assessed in only 14% of these markers (0% both intra- and inter-observer reliability, 8% intra-observer reliability only and 1% inter-observer reliability only). When intra- and inter-observer reliability results are considered together, the reported level was good (79%) or excellent (21%). Criterion validity was never assessed. Content validity was generally good (53%), followed by excellent (34%), moderate (8%) and poor (5%). Construct validity was mainly moderate (47%), followed by good (39%) and excellent (14%). Interpretability (MDC) was never assessed. Clinical applicability, regarding the protocol used in the included studies, was generally good. Biomarkers with at least three assessed COSMIN items have been underlined in grey in Fig. 5 and reported in Table 2. Only 14 markers were identified (9% of all identified muscular activity biomechanical markers). These markers are mainly (93%) related to the ICF 2nd level category d415 "Maintaining a body position" and mainly (79%) related to temporal variables and to the abdomen/lumbar region (90%: 45% lumbar, 38% abdomen, 7% abdomen/lumbar). Ten of these markers (i.e. 7% of all identified muscular activity biomechanical markers) reached at least a good level in the assessed COSMIN items (underlined in dark grey in Fig. 5).

Discussion
In a recent special issue on low back pain 12, the need to define further quantitative biomarkers for low back pain, including biomechanical parameters, has been pointed out. In line with this suggestion, this systematic review aimed to identify movement and muscular activity biomarkers proposed in the literature to discriminate NSCLBP patients from an asymptomatic population. The main findings are:
• Reported biomarkers were related to various tasks, mostly measuring spatial or intensity values targeted at the lower back;
• Each biomarker was mostly (90%) reported in only one study, and only 8% of them were assessed in terms of reliability, validity and interpretability;
• Identified movement biomarkers measurement properties: intra- and inter-observer reliability, when assessed (24%), is good, construct validity is at least moderate, interpretability is rarely reported (17%), and clinical applicability is moderate;
• Identified muscular activity biomarkers measurement properties: intra- and inter-observer reliability, when assessed (14%), is good to excellent, construct validity is at least moderate, interpretability was not assessed in any study, and clinical applicability is generally good;
• Despite all this, we were able to identify 31 movement biomarkers and 14 muscular activity biomarkers for which an extensive measurement properties assessment is already available.
[...] 12. A direct consequence is the fragmentation of the data reported in the literature, reducing the chances to identify and validate one or several relevant biomarkers. Regarding the type of parameters, both movement and muscular activity parameters may lead to relevant information about NSCLBP, whatever the category of measured variables. Movement parameters have been mainly oriented, directly or indirectly, towards the intervertebral kinematics and can thus be associated with the first biomechanical model "Intervertebral Mechanical Dysfunction in Nonspecific LBP" described by Cholewicki et al. 12.
Muscular activity parameters may also reflect the motor adaptation to the muscle disuse observed during the chronic phase and related to atrophy, fibrosis and fatty infiltration 14. These parameters can be associated with the second biomechanical model, "The Kinesiopathologic Model", and the third, "Anatomy, Biomechanics, and Pathology of the Sacroiliac Joints", described by Cholewicki et al. 12. Various tasks have also been used in the included studies. As already reported by van Dijk et al. 23, the ICF 2nd level categories d410 "Changing basic body position", d430 "Lifting and carrying objects" and d450 "Walking" often illustrate the proposed tasks, but also d415 "Maintaining a body position" and d740 "Muscle endurance functions". Concerning the region of interest, the thoraco-lumbo-pelvic region is unsurprisingly preferred, while other body regions have also been studied and may reflect posture adaptations due to spine instability in case of NSCLBP 24. As a consequence, it appears that the included studies rarely use the same parameter of interest, i.e. a variable measured in the same experimental conditions and during a similar task. Only 10% of the movement and muscular activity biomarkers highlighted in this systematic review were reported by more than one included study, and only 3% by more than two included studies. However, to be recognised and applied, a biomarker has to be evaluated so as to demonstrate its accuracy and reproducibility. Furthermore, this complex and time-consuming process needs to be re-evaluated to ensure the relevance of a biomarker 16. Following the COSMIN recommendations 17, four domains have to be explored to fully assess the measurement properties of any measurement instrument, i.e. reliability, validity, responsiveness and interpretability.
However, only 8% of the biomarkers reported in this systematic review were assessed in terms of reliability, validity and interpretability (responsiveness was out of the scope of this review). This result supports the fact that the full assessment of the measurement properties of a biomarker may be challenging to [...] Regarding the results of this systematic review, we believe that it would be much more efficient to complete the measurement properties assessment of biomarkers already proposed by other studies. Without such a global effort, it may be extremely difficult to identify relevant biomarkers related to musculoskeletal factors in NSCLBP.
[Figure caption, beginning not recovered] ... Table 4 for measurement properties rating. Biomarkers are underlined in grey when at least 3 COSMIN items have been assessed, and dark grey when all these items reached at least a good level. Biomarker characteristics have been sorted by occurrence.
[Table 1: title and body not recovered; surviving column headers: ID, Variable name, Task.] Biomarkers for which all these items reached at least a good level (see Table 4 for measurement properties rating) are in bold. R1: Intra-observer reliability (controls/patients); R2: Inter-observer reliability (controls/patients); V1: Content validity; V2: Criterion validity; V3: Construct validity; rom: range of motion; max: maximum; min: minimum; MDC: minimum detectable change; interp.: interpretability; appl.: applicability. +++: Excellent; ++: Good; +: Moderate; •: available; NA: not available.
The data synthesis highlighted 121 movement biomarkers having demonstrated a discriminative ability (i.e. construct validity). This result supports the fact that the biomechanical aspect of NSCLBP has been widely studied in the literature. Indeed, movements and postures may have a direct or indirect impact on NSCLBP 12 and many studies have thus explored mechanical factors to compare a NSCLBP and a control group. However, an extensive measurement properties assessment has only been performed for 31 of these biomarkers. For these biomarkers, a consistent level of reliability and validity has already been demonstrated in the included studies. For all others, additional studies are needed to further explore their relevance as biomarkers able to discriminate NSCLBP patients from an asymptomatic population. A first group of movement biomarkers, for which an extensive measurement properties assessment has been performed, focused on spine kinematics during trunk sagittal bending or rotation. This group of biomarkers relates to the first biomechanical model "Intervertebral Mechanical Dysfunction in Nonspecific LBP" described by Cholewicki et al. 12. While the direct intervertebral motion measurement requires invasive or ionising approaches (i.e. intracortical bone pins and fluoroscopy, respectively), these biomarkers measure an indirect intervertebral motion, e.g. thorax or lumbar absolute or relative kinematics. The level of thoracolumbar spine segmentation depends on the studies, varying from two (thorax, lumbar) to four (upper and lower thorax and lumbar).
A higher level of segmentation allows for a more precise spine mobility assessment 25, but results are prone to soft tissue artefacts, especially during trunk rotation 26. Moreover, this mechanical consideration must be interpreted cautiously since any alteration of the thorax or lumbar angles may result from asymptomatic structural degenerations (e.g. disc degeneration, ligament tightening) appearing with aging, or from psychological factors (e.g. kinesiophobia) 12. A second group of movement biomarkers corresponds to an altered lumbar-hip coordination 27. They correspond to measurements of hip sagittal angle or lumbar/hip phase shifting during sit-to-stand, stand-to-sit and trunk sagittal bending tasks (i.e. balance challenging tasks). This group can be related to another mechanical consideration in NSCLBP dealing with spine stability 24. In particular, a decreased lumbar-pelvis coordination has been reported by several studies 28. Through a trunk extensor muscle dysfunction in NSCLBP patients, this altered coordination induces co-contractions that can restrain lumbar spine motion 28. Again, this mechanical consideration must be interpreted cautiously since the reduced range of motion observed in pelvis and hips during these tasks may also be explained by kinesiophobia and/or by a preventive mechanism against pain 28,29, or be a consequence of muscle disuse, leading to muscle atrophy 14.
Table 2. List of muscular activity biomarkers having at least 3 COSMIN items assessed in the included studies. Biomarkers for which all these items reached at least a good level (see Table 4 for measurement properties rating) are in bold. R1: Intra-observer reliability (controls/patients); R2: Inter-observer reliability (controls/patients); V1: Content validity; V2: Criterion validity; V3: Construct validity; max: maximum; MDC: minimum detectable change; interp.: interpretability; appl.: applicability. +++: Excellent; ++: Good; +: Moderate; •: available; NA: not available.
The data synthesis highlighted 150 muscular activity biomarkers having demonstrated a discriminative ability. This result supports the fact that altered muscle function in NSCLBP has been widely studied in the literature. Indeed, several models such as the pain-spasm-pain model 30,31 have been proposed, arguing for an impact of the muscle function on pain. Changes in the back-muscle structure (e.g. atrophy, fibrosis and fatty infiltration) have also been widely reported 14. However, a measurement properties assessment has only been performed for 14 of these biomarkers. Analysis of the studies included in this review led to the conclusion that a consistent level of reliability and validity of these biomarkers has already been established. For all others, further analysis will be necessary to establish whether they can be recognised as relevant biomarkers to discriminate NSCLBP patients from an asymptomatic population. A first group of muscular activity biomarkers for which an advanced measurement properties assessment has been performed corresponds to muscle activity adaptations under perturbations.
They correspond to temporal variables (activation onset latency, activation burst duration, co-contraction duration) or spatial/intensity variables (EMG signal amplitude) during a postural task (two-legged standing) under expected 32 or unexpected 33,34 perturbations. Indeed, several authors have reported an impairment of the trunk postural control in NSCLBP patients 33. Two strategies have been described in the literature to manage the trunk postural control, i.e. the anticipatory postural adjustment and the compensatory postural adjustment 35, and related dysfunctions have been reported by several studies 32. In this sense, this group of biomarkers could also be related to spine stability issues 24. Particular caution, however, is warranted regarding temporal variables (e.g. activation onset latency, activation burst duration). Indeed, Mehta et al. 35 highlighted that these variables might be sensitive to the high individual and between-subject variation observed in EMG signal patterns. Furthermore, computational algorithms used for activation onset detection may vary in terms of accuracy 36. A second group of muscular activity biomarkers corresponds to the flexion/maximal flexion ratio of the erector spinae (longissimus) EMG signal during trunk sagittal bending 37,38. This biomarker is related to the flexion relaxation phenomenon (FRP) well reported in the literature 39. This phenomenon has been defined as a reduced myoelectric activity of the lumbar erector spinae longissimus (and multifidus) during full trunk sagittal bending. However, the EMG signal processing needed to compute the ratios related to this phenomenon is known to be very sensitive to the trunk sagittal bending velocity 40 and the FRP temporal limits, defined by visual identification or automated methods 39. Further reliability studies, including NSCLBP patients and asymptomatic subjects, should thus be considered to clarify the robustness of such a biomarker.
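To make concrete why activation onset latency depends on the detection algorithm, the sketch below implements one generic family of detectors: a threshold set at the baseline mean plus a multiple of the baseline standard deviation, applied to a smoothed rectified signal. This is an illustrative assumption, not the method used in any of the included studies; the function names and all default parameter values (baseline duration, multiplier `k`, minimum burst duration, smoothing window) are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Centred moving average; edges use only the samples available."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def detect_onset(emg: Sequence[float], fs: float, baseline_s: float = 0.5,
                 k: float = 3.0, min_duration_s: float = 0.025,
                 smooth_s: float = 0.01) -> Optional[float]:
    """Return the onset time in seconds, or None if no onset is found.

    All default values are illustrative assumptions, not recommendations.
    """
    # Full-wave rectification followed by smoothing gives a simple envelope.
    envelope = moving_average([abs(v) for v in emg],
                              max(1, round(smooth_s * fs)))
    n_base = round(baseline_s * fs)
    if n_base < 2 or n_base >= len(envelope):
        raise ValueError("baseline window does not fit the signal")
    baseline = envelope[:n_base]
    threshold = mean(baseline) + k * pstdev(baseline)
    n_min = max(1, round(min_duration_s * fs))
    run = 0  # number of consecutive samples above the threshold
    for i, value in enumerate(envelope):
        run = run + 1 if value > threshold else 0
        if run >= n_min:
            return (i - n_min + 1) / fs  # first sample of the sustained run
    return None
```

Changing `k`, the smoothing window or the minimum duration shifts the detected onset, which is one way the accuracy differences between algorithms mentioned above can arise.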
While EMG measurements have been considered in this systematic review as already well established in clinical practice, most of the movement biomarkers were measured using complex and costly devices such as optoelectronic cameras with reflective markers. Thus, a majority of the highlighted biomarkers did not reach a good level of clinical applicability. However, except for the EMG exploration of deep muscles that requires intramuscular measures, most of the biomarkers that have been recorded using advanced measurement instruments, in a dedicated laboratory, could be recorded using simpler, transportable, and less costly devices. For example, inertial measurement units (IMUs) are more and more extensively used in the field of biomechanics to measure spatiotemporal and kinematic parameters 41. Such devices may open new avenues for the diagnosis, treatment and follow-up of NSCLBP patients in clinical routine. Unfortunately, the measurement properties of any biomarker should be re-evaluated for each measurement instrument, since all sensors have their own reliability and validity. Some authors have already focused their developments and analysis towards sensors with a high level of clinical applicability 42. However, further studies will be required to transfer the biomarkers highlighted in this systematic review to clinical routine.

Our results must be interpreted carefully since this work has several limitations. First, even if the quality assessment was performed using a checklist adapted and validated for quantitative studies 43, it was not necessarily adapted to evaluate the quality of laboratory studies involving various types of measurement tools (e.g. EMG, optoelectronic cameras). However, this checklist has already been used in the context of low back pain 44 and was considered to be the most adapted to the needs of this systematic review. Second, the large number of studies included in this systematic review limited the possibilities to explore all methodological details. In order to highlight biomarkers and guide future research, several extracted data have been deliberately omitted or simplified during the data synthesis process (e.g. task perturbations, EMG electrode placement). However, all extracted data are provided as Supplementary material to allow future data analysis. Third, the responsiveness of the highlighted biomarkers, i.e. their ability to detect change over time 17, was not analysed. Since the aim of this systematic review was to identify movement and muscular activity biomarkers that can be used to discriminate NSCLBP patients from an asymptomatic population, this COSMIN domain was out of the present scope. However, as responsiveness may be one of the most important endpoints in clinical practice, it will be crucial to verify the sensitivity to change of the highlighted biomarkers and perform additional responsiveness studies when needed.
Several authors have pointed out the importance of having reliable biomarkers sustaining the biomechanical concepts proposed for low back pain 45. Integrating these biomarkers into studies along with well recognised social and psychological factors has the potential to add to our understanding of this complex disease and to open the scientific community to new therapeutic approaches. This systematic review highlights that, even if several relevant biomarkers related to movement and muscular activity have been proposed and their measurement properties partially assessed, there is currently a lack of consensus concerning a robust and standardised biomechanical approach to assess low back pain. Prior to such a consensus, it is however crucial to increase the current knowledge on the biomarkers highlighted here (and on any other possible biomarker) to ascertain that all COSMIN domains (i.e. reliability, validity, responsiveness and interpretability) have been well explored. To that end, future studies should seriously consider reproducing existing protocols and measuring parameters in the same conditions as in the original articles, but also in low back pain populations from different countries and cultures and with different pain/disability levels. The use of sensors known for a high clinical applicability should also be further deployed to ease the adoption of the related measurements in clinical routine. Finally, every study should report pain level and disability score of the included patients to better characterise the assessed populations and possibly allow the identification of different sub-groups within this heterogeneous population.
This systematic review highlighted biomarkers related to movement and muscular activity that allow NSCLBP patients to be discriminated from an asymptomatic population from a musculoskeletal factors point of view. While numerous parameters were assessed in the literature, with a large heterogeneity and mainly one study per measurement, a comprehensive assessment of the measurement properties of 31 movement biomarkers and 14 muscular activity biomarkers was identified in the included studies. On the whole, these biomarkers support the primary biomechanical concepts proposed for low back pain. However, a consensus concerning a robust and standardised biomechanical approach to assess low back pain is currently missing but desperately needed in order to improve our knowledge on this condition and extend our therapeutic capabilities.

Methods
Study design. This study is a systematic review based on the following research question established using the PICO approach 46 : "In adults suffering non-specific chronic low back pain, what are the biomarkers that allow to discriminate them from an asymptomatic population in terms of movement or muscular activity?".
Protocol and registration. This systematic review was registered through PROSPERO (registration number: CRD42020144877) before being undertaken. It has been written following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement 47. The assessment of each reported outcome measurement instrument was inspired by the COSMIN checklist 17.
Information sources and search strategy. An electronic search was performed in Medline, Embase, and Web of Knowledge databases from inception to July 2019 without any time limit. The logical (nested) expressions for the search were: ('low* back pain*' or 'low* backpain*' or 'low* back ache*' or 'low* backache*' or 'low* back syndrome*' or lumbago* or 'lumbal pain*' or 'lumbal syndrome*' or lumbalgia* or 'lumbar pain*' or 'lumbar spine syndrome*' or lumbodynia* or 'lumbosacral pain*' or 'lumbar multifidus pain*' or 'lumbar flexion rotation syndrome*' or 'lumbar extension rotation syndrome*' or LBP or CLBP or NSLBP) and (chronic or continu* or constant or persistent or prolonged or longstanding) and (electromyogra* or EMG or sEMG or kinematic* or angle). The search was based on the title, keywords and abstract. Duplicates were identified and removed using Mendeley (https://www.mendeley.com) 48. Cross-referencing was also undertaken by checking references cited by the articles included in the full-text inspection.
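The nested structure of the search expression above (three groups of synonyms joined by "or", then combined with "and") can be sketched as follows. The helper names are hypothetical, and the exact field tags and wildcard syntax differ between Medline, Embase and Web of Knowledge, so this is only an illustration of how the logical expression is assembled, not a database-ready query.

```python
from typing import Iterable, List

# Term groups copied from the search expression reported in the text.
PAIN_TERMS: List[str] = [
    "low* back pain*", "low* backpain*", "low* back ache*", "low* backache*",
    "low* back syndrome*", "lumbago*", "lumbal pain*", "lumbal syndrome*",
    "lumbalgia*", "lumbar pain*", "lumbar spine syndrome*", "lumbodynia*",
    "lumbosacral pain*", "lumbar multifidus pain*",
    "lumbar flexion rotation syndrome*", "lumbar extension rotation syndrome*",
    "LBP", "CLBP", "NSLBP",
]
CHRONIC_TERMS: List[str] = [
    "chronic", "continu*", "constant", "persistent", "prolonged", "longstanding",
]
MEASURE_TERMS: List[str] = [
    "electromyogra*", "EMG", "sEMG", "kinematic*", "angle",
]


def format_term(term: str) -> str:
    """Quote multi-word phrases, leave single words bare."""
    return f"'{term}'" if " " in term else term


def or_group(terms: Iterable[str]) -> str:
    """Join synonyms with 'or' inside one pair of parentheses."""
    return "(" + " or ".join(format_term(t) for t in terms) + ")"


def build_query(*groups: Iterable[str]) -> str:
    """Combine the groups with 'and': every group must match."""
    return " and ".join(or_group(g) for g in groups)


query = build_query(PAIN_TERMS, CHRONIC_TERMS, MEASURE_TERMS)
```

Keeping each concept in its own list makes it easier to audit which synonyms were searched and to rerun the search with an updated list.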
Study selection and eligibility criteria. The records identified from the search strategy were reviewed according to the eligibility criteria reported in Table 3. One author (FM) inspected the literature by screening record titles and abstracts using the Rayyan online application (http://rayyan.qcri.org), a tool developed to support the systematic review process 49. Full-text inspection of the identified records was then undertaken independently by two authors (FM and KRD) to determine studies included in the analysis. Each criterion was rated one (satisfying description or justification) or zero (limited information or no information). In the current review, a score of ≥ 9 was used as an indication of acceptable methodological quality. Each included study was assessed independently by two authors (FM and KRD). In case of discrepancy, the original article was checked by a third author (SA).

Data extraction.
In the first stage of data extraction, the characteristics of the populations and measured parameters were extracted for each study. Populations (control and NSCLBP) were described by number of subjects, age, BMI, gender ratio, primary country of origin, as well as pain level (Visual Analog Scale 50 results) and disability score (Oswestry Disability Index score 51 or Roland Morris Disability Questionnaire 52), when available. Parameters were defined by a type (movement or muscular activity), a variable and a task. Variables were described by a category (temporal, spatial/intensity, frequential, variability, coordination), a name, a unit, a measurement tool and a region of interest (e.g. multifidus, thorax). Tasks were described by an International [...]
Table 3. Eligibility criteria of the identified studies. † A patient was defined as chronic when pain duration was greater than or equal to 3 months. [Table body not recovered; surviving column headers: Properties, Eligibility criteria.]
[...] 17, namely reliability (intra-observer and inter-observer reliability), validity (content validity, criterion validity, construct validity) and interpretability (minimal detectable change), but also in terms of clinical applicability (cost, simplicity of setup and execution, task-related pain) 54. Content validity and clinical applicability were reported as poor/moderate/good/excellent levels depending on the number of weak items (reported between brackets). Specific test results were reported for reliability, other validity items and interpretability (e.g. intraclass correlation, p-value). Two authors (FM and KRD) extracted data independently and cross-checked. In case of discrepancy, the original article was checked by a third author (SA).
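For readers less familiar with the minimal detectable change mentioned above, the sketch below shows one commonly used formulation: the standard error of measurement (SEM) is derived from the between-subject standard deviation and a reliability coefficient (ICC), and the MDC at the 95% confidence level is 1.96 × √2 × SEM. The review does not state which formula the included studies applied, so this is an assumption for illustration; the function names and example values are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    # Restricting ICC to [0, 1] is a simplification made for this sketch.
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC = z * sqrt(2) * SEM; z = 1.96 gives the 95% confidence level.

    The sqrt(2) factor accounts for measurement error in both of the
    two measurements being compared.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical example: lumbar range of motion, SD = 5 degrees, ICC = 0.84.
sem = sem_from_icc(5.0, 0.84)
mdc95 = minimal_detectable_change(sem)
```

A change smaller than the MDC cannot be distinguished from measurement error, which is why its availability matters for interpreting a biomarker.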
In the second stage of data extraction, all parameters having demonstrated at least moderate construct validity (see Table 4 for rating) were defined as biomarkers. When these biomarkers were measured in a sufficiently similar way in terms of variable and task, they were merged across studies to conduct a qualitative analysis for the data synthesis of these instruments.
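The two steps described above (keeping parameters with at least moderate construct validity, then merging those that share a variable and task) can be sketched as below. In the review, "sufficiently similar" was a qualitative judgement; the exact-match grouping used here is a deliberate simplification, and the class, field names and example entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Ordered rating levels, from lowest to highest.
LEVELS = {"poor": 0, "moderate": 1, "good": 2, "excellent": 3}


@dataclass(frozen=True)
class Parameter:
    study: str               # study identifier (hypothetical)
    kind: str                # "movement" or "muscular activity"
    variable: str            # e.g. "lumbar rom"
    task: str                # e.g. "trunk sagittal bending"
    construct_validity: str  # one of the keys of LEVELS


def merge_into_biomarkers(
    parameters: Iterable[Parameter],
) -> Dict[Tuple[str, str, str], List[str]]:
    """Map each (type, variable, task) biomarker to the studies reporting it."""
    groups: Dict[Tuple[str, str, str], List[str]] = defaultdict(list)
    for p in parameters:
        # Only parameters with at least moderate construct validity qualify.
        if LEVELS[p.construct_validity] >= LEVELS["moderate"]:
            groups[(p.kind, p.variable, p.task)].append(p.study)
    return dict(groups)


example = [
    Parameter("study_A", "movement", "lumbar rom", "trunk sagittal bending", "good"),
    Parameter("study_B", "movement", "lumbar rom", "trunk sagittal bending", "moderate"),
    Parameter("study_C", "movement", "hip sagittal angle", "sit-to-stand", "poor"),
]
biomarkers = merge_into_biomarkers(example)
```

In this example the first two parameters merge into one biomarker supported by two studies, while the third is discarded because its construct validity is below moderate.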
Each biomarker was identified by a code and, as for parameters, defined by a type, a variable and a task. The total number of subjects and patients, as well as the related studies, were also reported. To come to a clear, objective and informative overview of the identified biomarkers, their reliability, validity, interpretability and clinical applicability were then rated as excellent, good, moderate or poor. The overall score for each measurement property of each biomarker was obtained by reporting the overall majority of the results obtained across related studies. The rating of each measurement property is defined in Table 4. Only the availability of minimum detectable change (MDC) was stated.
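The "overall majority" rule described above can be illustrated with a short sketch. The text does not say how ties between levels were resolved, so the tie-break used here (keeping the lower, more conservative level) is an assumption, as are the function and variable names.

```python
from collections import Counter
from typing import Sequence

# Ordered rating levels, from lowest to highest.
LEVELS = {"poor": 0, "moderate": 1, "good": 2, "excellent": 3}


def overall_rating(ratings: Sequence[str]) -> str:
    """Return the most frequent rating across the related studies."""
    if not ratings:
        raise ValueError("at least one rating is required")
    unknown = [r for r in ratings if r not in LEVELS]
    if unknown:
        raise ValueError(f"unknown rating(s): {unknown}")
    counts = Counter(ratings)
    top = max(counts.values())
    tied = [level for level, n in counts.items() if n == top]
    # Assumed tie-break: report the most conservative of the tied levels.
    return min(tied, key=lambda level: LEVELS[level])


# Hypothetical biomarker assessed by three studies.
rating = overall_rating(["good", "moderate", "good"])
```

Since most biomarkers in this review were assessed by a single study, the majority rule reduces in practice to reporting that study's rating.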
Data synthesis. The targeted populations of the included studies were compiled on a labelled world map (generated with Microsoft Excel 2019, Microsoft Corporation, USA) to highlight in which countries NSCLBP was analysed, by how many studies and on how many participants (control and NSCLBP).
The assessment strategies observed in the included studies were highlighted in terms of parameter type, ICF 2nd level category, variable category and region of interest. For that, a Circos plot 55 was generated to establish the links between each included study and these strategies.
The measurement properties of the identified biomarkers were highlighted using Circos plots 55. The primary characteristics, i.e. corresponding ICF 2nd level category, variable category and region of interest, were also reported on these plots, while the full characteristics of each biomarker are available in Supplementary Table 3.
Table 4. Rating used for each measurement property evaluated. † Irrelevant or not sufficiently defined items regarding the construct to be assessed (e.g. population, instrument, parameter, outcome). ‡ Weak items (e.g. cost, simplicity of setup and execution, task-related pain).