Introduction

Low back pain has been the leading cause of disability worldwide since 1990, with years lived with disability increasing by over 50% since then1. In 85–90% of cases, the exact cause of pain cannot be ascertained and patients are thus classified as having non-specific low back pain2. Among these patients, 10% become chronic sufferers (non-specific chronic low back pain, NSCLBP), which represents a high socioeconomic burden3. In the absence of a clear and convincing diagnosis, therapeutic management remains difficult4. Various treatments exist but show small to moderate overall effects, potentially due to a lack of knowledge of the pathophysiology of NSCLBP and to the heterogeneity of the studied population5.

To date, NSCLBP is often described as a complex disorder where central and peripheral nociceptive processes are influenced by various factors such as social, psychological or musculoskeletal factors which interact with each other3,6. It has been proposed that social and psychological factors may play an important role in the persistence of the pain7,8. However, the role of musculoskeletal factors remains unclear. Regularly considered as pain consequences, these factors could be voluntary or involuntary compensations deployed to reduce pain, leading to long-term nociceptive consequences9.

Several musculoskeletal factors related to movement and/or muscular activity impairments have been highlighted in NSCLBP patients compared to asymptomatic participants10. Kinematic parameters such as range of motion, segment coordination or movement variability11,12, as well as electromyographic parameters such as maximal activity, timing activity or fatigability13,14, are regularly measured in the NSCLBP population. However, due to methodological differences and/or the heterogeneity of the NSCLBP population, some studies reported contradictory findings15. On this basis, it seems difficult to develop effective therapeutic programs. To be effective, in addition to addressing social and psychological factors, programs have to focus on musculoskeletal factors in NSCLBP patients that can be evaluated using biomarkers, i.e. measurable parameters giving objective indications of patient state, which can be measured accurately and reproducibly16. In other words, the measurement properties17 of these biomarkers have to be known in terms of reliability, validity, interpretability and responsiveness. However, to the best of our knowledge, a comprehensive review of these biomarkers and related measurement properties is missing in the recent literature.

As a first step toward this wide objective, the purpose of this systematic review was 1) to identify in the literature the primary biomarkers related to movement or muscular activity and 2) to report their reliability, validity and interpretability levels, when available. In this sense, this manuscript attempts to establish a list of relevant biomarkers that discriminate NSCLBP patients from an asymptomatic population from a musculoskeletal point of view and to report their respective level of validation.

Results

Study selection

The search strategy identified 672 records in Medline, 804 in Embase, and 1011 in Web of Knowledge, yielding 1638 records after duplicate removal (Fig. 1).

Figure 1

Flowchart of the search strategy conducted in this review (based on the PRISMA flowchart47).

According to the exclusion criteria, 97 studies were included for quality assessment (1455 records were excluded after screening of the abstract, and 85 after screening of the full-text article). Most of the excluded studies did not specifically refer to adult patients (18–65 years old) suffering from NSCLBP without a history of back surgery or pregnancy (i.e. population criterion).

Quality assessment

The overall score of each included study was calculated as the sum of rated criteria divided by the number of applicable questions. The included studies were generally of good quality, with a mean score of 72 ± 12% (Supplementary Table 1). However, fewer than 50% of the studies provided sufficient information about their design or reported the reliability of the outcome, and fewer than 25% of the studies performed blinded analysis or justified the sample size. Five of the 97 studies18,19,20,21,22 included for quality assessment obtained a score lower than 9 (≤ 50%) and were thus not included in the data extraction process.

Data extraction

While not used in the data synthesis process, the full characteristics of the populations and measured parameters of each included study are available in Supplementary Table 2 to provide complete and transparent information, and a synthesis is also available in Fig. 2. The characteristics of each movement and muscular activity biomarker, used to conduct the data synthesis, are available in Supplementary Table 3.

Figure 2

Labelled world map reporting the number of studies and the number of participants (NSCLBP and control) for each country of origin.

Data synthesis

As most of the biomarkers were only assessed by one study (see below), the risk of pooling heterogeneous populations from different studies to analyse one marker is low and was not considered in this analysis. Nevertheless, it can be observed that the included studies come mainly from Western Europe including the UK (32%), North America (28%) and Asia (12%). The mean number of participants across studies was 29.0 ± 30.4 [min: 5, max: 218] for NSCLBP, and 23.1 ± 16.5 [min: 6, max: 130] for control. It can also be observed that a significant portion of the included studies did not report pain levels (20%) or functional disability scores (35%) in NSCLBP patients. The full characteristics of the populations are available in Supplementary Table 2.

The Circos plot in Fig. 3 highlights the fact that muscular activity parameters were predominant in the included studies (70%: 51% only muscular activity parameters, 18% both movement and muscular activity parameters). The tasks related to these parameters were primarily ICF 2nd level category d410 “Changing basic body position” (43%), i.e. tasks with a movement excluding locomotion and weight lifting, and then d415 “Maintaining a body position” (30%), d740 “Muscle endurance functions” (17%), e.g. the Biering-Sørensen test, d450 “Walking” (11%) and d430 “Lifting and carrying objects” (9%). The variables related to these parameters were primarily spatial/intensity values (82%), and then frequential (15%), temporal (14%), coordination (14%) and variability values (11%). These variables were primarily targeted toward the lumbar region (43%), and then the thorax (20%), pelvis (18%), legs (18%), abdomen (14%), head (2%), whole body (1%) and arms (1%) regions.

Figure 3

Circos plot55 linking the included studies to their selected parameter types, ICF 2nd level categories, variable categories and regions of interest. The number of studies linked to each item is also reported.

The Circos plot in Fig. 4 highlights first the fact that the measurement properties of the reported movement biomarkers were mainly assessed by only one study (96%), and then two studies (3%) or three studies (1%). Reliability was assessed in only 24% of these biomarkers (3% both intra- and inter-observer reliability, 18% intra-observer reliability only and 3% inter-observer reliability only). When intra- and inter-observer reliability results were considered together, the reported level was generally good (73%), and then moderate (21%) and excellent (6%). Criterion validity was never assessed. Content validity was generally good (55%) or excellent (41%), but construct validity was mainly moderate (48%) and then excellent (27%) and good (25%). Interpretability (MDC) was assessed for only 21 biomarkers (17%). For a large majority of biomarkers, the clinical applicability, regarding the protocol used in the included studies, was moderate (83%). Biomarkers with at least three assessed COSMIN items have been underlined in grey in Fig. 4. Only 31 biomarkers were underlined (26% of all movement biomarkers) and reported in Table 1. These biomarkers are mainly (97%) related to the ICF 2nd level category d410 “Changing basic body position” and mainly (77%) related to spatial/intensity variables and the lumbar region (70%: 35% lumbar, 19% lumbar/leg, 16% lumbar/pelvis). Seventeen of these markers (i.e. 14% of all movement biomarkers) reached at least a good level in the assessed COSMIN items (underlined in dark grey in Fig. 4).

Figure 4

Circos plot55 synthesising the main characteristics and measurement properties of each movement biomarker. See Table 4 for measurement properties rating. Biomarkers are underlined in grey when at least 3 COSMIN items have been assessed, and dark grey when all these items reached at least a good level. Biomarker characteristics have been sorted by occurrence.

Table 1 List of movement biomarkers having at least 3 COSMIN items assessed in the included studies. Biomarkers for which all these items reached at least a good level (see Table 4 for measurement properties rating) are in bold.

The Circos plot in Fig. 5 highlights first the fact that the measurement properties of the reported muscular activity biomechanical markers were mainly assessed by one study (85%), and then two studies (11%), three studies (3%) or four studies (1%). Reliability was assessed in only 14% of these markers (0% both intra- and inter-observer reliability, 8% intra-observer reliability only and 1% inter-observer reliability only). When considering altogether intra- and inter-observer reliability results, the reported level was good (79%) or excellent (21%). Criterion validity was never assessed. Content validity was generally good (53%), and then excellent (34%), moderate (8%) and poor (5%). Construct validity was mainly moderate (47%) and then good (39%) and excellent (14%). Interpretability (MDC) was never assessed. Clinical applicability, regarding the protocol used in the included studies, was generally good (62%), and then moderate (27%) and poor (11%). Markers with at least three assessed COSMIN items have been underlined in grey in Fig. 5 and reported in Table 2. Only 14 markers were identified (9% of all identified muscular activity biomechanical markers). It can be noticed that these markers are mainly (93%) related to the ICF 2nd level category d415 “Maintaining a body position” and mainly (79%) related to temporal variables and abdomen/lumbar region (90%: 45% lumbar, 38% abdomen, 7% abdomen/lumbar). Ten of these markers (i.e. 7% of all identified muscular activity biomechanical markers) reached at least a good level in the assessed COSMIN items (underlined in dark grey in Fig. 5).

Figure 5

Circos plot55 synthesising the main characteristics and measurement properties of each muscular activity biomarker. See Table 4 for measurement properties rating. Biomarkers are underlined in grey when at least 3 COSMIN items have been assessed, and dark grey when all these items reached at least a good level. Biomarker characteristics have been sorted by occurrence.

Table 2 List of muscular activity biomarkers having at least 3 COSMIN items assessed in the included studies. Biomarkers for which all these items reached at least a good level (see Table 4 for measurement properties rating) are in bold.

Discussion

In a recent special issue on low back pain12, the need to define further quantitative biomarkers for low back pain, including biomechanical parameters, has been pointed out. In line with this suggestion, this systematic review aimed to identify movement and muscular activity biomarkers proposed in the literature to discriminate NSCLBP patients from an asymptomatic population. The main findings are:

  • Reported biomarkers were related to various tasks mostly measuring spatial or intensity values targeted to the lower back;

  • Most biomarkers (90%) were each reported by only one study, and only 8% of them were assessed in terms of reliability, validity and interpretability;

  • Movement biomarker measurement properties: intra- and inter-observer reliability, when assessed (24%), is good; construct validity is at least moderate; interpretability is rarely reported (17%); and clinical applicability is moderate;

  • Muscular activity biomarker measurement properties: intra- and inter-observer reliability, when assessed (14%), is good to excellent; construct validity is at least moderate; interpretability was never assessed; and clinical applicability is generally good;

  • Despite all this, we were able to identify 31 movement biomarkers and 14 muscular activity biomarkers for which an extensive measurement properties assessment is already available.

A first observation is the heterogeneous nature of the included studies, which explore a large variety of measures on almost all body regions and during various tasks. This heterogeneity illustrates the fact that there is still no consensus on a clear protocol for exploring the role of musculoskeletal factors in NSCLBP. This observation is in line with the diversity of biomechanical concepts in the literature12. A direct consequence is the fragmentation of the data reported in the literature, reducing the chances of identifying and validating one or several relevant biomarkers. Regarding the type of parameters, both movement and muscular activity parameters may lead to relevant information about NSCLBP, whatever the category of measured variables. Movement parameters have been mainly oriented, directly or indirectly, towards intervertebral kinematics and can thus be associated with the first biomechanical model “Intervertebral Mechanical Dysfunction in Nonspecific LBP” described by Cholewicki et al.12. Muscular activity parameters may also reflect the motor adaptation to the muscle disuse observed during the chronic phase and related to atrophy, fibrosis and fatty infiltration14. These parameters can be associated with the second (“The Kinesiopathologic Model”) and third (“Anatomy, Biomechanics, and Pathology of the Sacroiliac Joints”) biomechanical models described by Cholewicki et al.12. Various tasks have also been used in the included studies. As already reported by van Dijk et al.23, the ICF 2nd level categories d410 “Changing basic body position”, d430 “Lifting and carrying objects” and d450 “Walking” often illustrate the proposed tasks, but also d415 “Maintaining a body position” and d740 “Muscle endurance functions”. Concerning the region of interest, the thoraco-lumbo-pelvic region is unsurprisingly preferred, while other body regions have also been studied and may reflect posture adaptations due to spine instability in case of NSCLBP24. As a consequence, it appears that the included studies rarely use the same parameter of interest, i.e. a variable measured in the same experimental conditions and during a similar task. Only 10% of the movement and muscular activity biomarkers highlighted in this systematic review were reported by more than one included study, and only 3% by more than two included studies. However, to be recognised and applied, a biomarker has to be evaluated so as to demonstrate its accuracy and reproducibility. Furthermore, this complex and time-consuming evaluation needs to be repeated to ensure that a biomarker remains relevant16. Following the COSMIN recommendations17, four domains have to be explored to fully assess the measurement properties of any measurement instrument, i.e. reliability, validity, responsiveness and interpretability. However, only 8% of the biomarkers reported in this systematic review were assessed in terms of reliability, validity and interpretability (responsiveness was out of the scope of this review). This result supports the fact that the full assessment of the measurement properties of a biomarker may be challenging to manage in a single study. Reliability alone is a time-consuming process requiring several measurement sessions per participant. Consequently, only the construct validity (i.e. a statistically significant difference between the value of the biomarker for an NSCLBP and a control group) was assessed for 83% of the highlighted biomarkers. In view of the results of this systematic review, we believe that it would be much more efficient to complete the measurement properties assessment of biomarkers already proposed by other studies. Without such a global effort, it may be extremely difficult to identify relevant biomarkers related to musculoskeletal factors in NSCLBP.

The data synthesis highlighted 121 movement biomarkers having demonstrated a discriminative ability (i.e. construct validity). This result supports the fact that the biomechanical aspect of NSCLBP has been widely studied in the literature. Indeed, movements and postures may have a direct or indirect impact on NSCLBP12 and many studies have thus explored mechanical factors to compare an NSCLBP and a control group. However, an extensive measurement properties assessment has only been performed for 31 of these biomarkers. For these biomarkers, a consistent level of reliability and validity has already been demonstrated in the included studies. For all others, additional studies are needed to further explore their relevance as biomarkers able to discriminate NSCLBP patients from an asymptomatic population. A first group of movement biomarkers, for which an extensive measurement properties assessment has been performed, focused on spine kinematics during trunk sagittal bending or rotation. This group of biomarkers relates to the first biomechanical model “Intervertebral Mechanical Dysfunction in Nonspecific LBP” described by Cholewicki et al.12. While direct intervertebral motion measurement requires invasive or ionising approaches (i.e. intracortical bone pins and fluoroscopy, respectively), these biomarkers measure intervertebral motion indirectly, e.g. thorax or lumbar absolute or relative kinematics. The level of thoracolumbar spine segmentation depends on the study, varying from two segments (thorax, lumbar) to four (upper and lower thorax and lumbar). A higher level of segmentation allows for a more precise spine mobility assessment25, but results are prone to soft tissue artefacts, especially during trunk rotation26. Moreover, this mechanical consideration must be interpreted cautiously since any alteration of the thorax or lumbar angles may result from asymptomatic structural degenerations (e.g. disc degeneration, ligament tightening) appearing with aging, or from psychological factors (e.g. kinesiophobia)12. A second group of movement biomarkers corresponds to an altered lumbar-hip coordination27. They correspond to measurements of hip sagittal angle or lumbar/hip phase shifting during sit-to-stand, stand-to-sit and trunk sagittal bending tasks (i.e. balance challenging tasks). This group can be related to another mechanical consideration in NSCLBP dealing with spine stability24. In particular, a decreased lumbar-pelvis coordination has been reported by several studies28. Through a trunk extensor muscle dysfunction in NSCLBP patients, this altered coordination induces co-contractions that can restrain lumbar spine motion28. Again, this mechanical consideration must be interpreted cautiously since the reduced range of motion observed in the pelvis and hips during these tasks may also be explained by kinesiophobia and/or by a preventive mechanism against pain28,29, or be a consequence of muscle disuse, leading to muscle atrophy14.

The data synthesis highlighted 150 muscular activity biomarkers having demonstrated a discriminative ability. This result supports the fact that altered muscle function in NSCLBP has been widely studied in the literature. Indeed, several models such as the pain-spasm-pain model30,31 have been proposed, arguing for an impact of muscle function on pain. Changes in back-muscle structure (e.g. atrophy, fibrosis and fatty infiltration) have also been widely reported14. However, a measurement properties assessment has only been performed for 14 of these biomarkers. Analysis of the studies included in this review led to the conclusion that a consistent level of reliability and validity of these biomarkers has already been established. For all others, further analysis will be necessary to establish whether they can be recognised as relevant biomarkers to discriminate NSCLBP patients from an asymptomatic population. A first group of muscular activity biomarkers for which an advanced measurement properties assessment has been performed corresponds to muscle activity adaptations under perturbations. These biomarkers are temporal variables (activation onset latency, activation burst duration, co-contraction duration) or spatial/intensity variables (EMG signal amplitude) measured during a postural task (two-legged standing) under expected32 or unexpected33,34 perturbations. Indeed, several authors have reported an impairment of trunk postural control in NSCLBP patients33. Two strategies have been described in the literature to manage trunk postural control, i.e. the anticipatory postural adjustment and the compensatory postural adjustment35, and related dysfunctions have been reported by several studies32. In this sense, this group of biomarkers could also be related to spine stability issues24. Particular caution, however, is warranted regarding temporal variables (e.g. activation onset latency, activation burst duration). Indeed, Mehta et al.35 highlighted that these variables might be sensitive to the high individual and between-subject variation observed in EMG signal patterns. Furthermore, computational algorithms used for activation onset detection may vary in terms of accuracy36. A second group of muscular activity biomarkers corresponds to the flexion/maximal flexion ratio of the erector spinae (longissimus) EMG signal during trunk sagittal bending37,38. This biomarker is related to the flexion relaxation phenomenon (FRP), which is well reported in the literature39. This phenomenon has been defined as a reduced myoelectric activity of the lumbar erector spinae longissimus (and multifidus) during full trunk sagittal bending. However, the EMG signal processing needed to compute the ratios related to this phenomenon is known to be very sensitive to the trunk sagittal bending velocity40 and to the FRP temporal limits, defined by visual identification or automated methods39. Further reliability studies, including NSCLBP patients and asymptomatic subjects, should thus be considered to clarify the robustness of such a biomarker.
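To make the flexion/maximal flexion ratio more concrete, the following minimal sketch computes it from a single EMG trace. It is not the processing pipeline of any included study: the use of root-mean-square (RMS) amplitude, the function names and the representation of the phase limits as sample indices are all assumptions made for illustration. The phase limits are taken as inputs precisely because, as noted above, their definition strongly influences the result.

```python
import math
from typing import Sequence, Tuple


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of an EMG segment (assumed amplitude measure)."""
    if not samples:
        raise ValueError("empty EMG segment")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def flexion_relaxation_ratio(
    emg: Sequence[float],
    flexion: Tuple[int, int],
    full_flexion: Tuple[int, int],
) -> float:
    """Ratio of EMG amplitude during flexion to amplitude at maximal flexion.

    `flexion` and `full_flexion` are (start, stop) sample indices of the two
    phases; how these limits are chosen (visually or automatically) is left
    to the caller and is a hypothetical convention of this sketch.
    """
    flexion_rms = rms(emg[flexion[0]:flexion[1]])
    full_flexion_rms = rms(emg[full_flexion[0]:full_flexion[1]])
    if full_flexion_rms == 0:
        raise ValueError("zero EMG amplitude at maximal flexion")
    return flexion_rms / full_flexion_rms


# Synthetic signal: high activity while bending, low activity at full flexion.
signal = [2.0, -2.0, 2.0, -2.0, 0.5, -0.5, 0.5, -0.5]
ratio = flexion_relaxation_ratio(signal, flexion=(0, 4), full_flexion=(4, 8))
print(ratio)
```

A larger ratio indicates stronger relaxation of the muscle at full flexion; shifting either pair of indices changes the ratio, which illustrates why the temporal limits need to be reported.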

While EMG measurements have been considered in this systematic review as already well established in clinical practice, most of the movement biomarkers were measured using complex and costly devices such as optoelectronic cameras with reflective markers. Thus, a majority of the highlighted biomarkers did not reach a good level of clinical applicability. However, except for the EMG exploration of deep muscles that requires intramuscular measures, most of the biomarkers that have been recorded using advanced measurement instruments, in a dedicated laboratory, could be recorded using simpler, transportable, and less costly devices. For example, inertial measurement units (IMUs) are increasingly used in the field of biomechanics to measure spatiotemporal and kinematic parameters41. Such devices may open new avenues for the diagnosis, treatment and follow-up of NSCLBP patients in clinical routine. However, the measurement properties of any biomarker must be re-evaluated for each measurement instrument, since all sensors have their own reliability and validity. Some authors have already focused their developments and analyses on sensors with a high level of clinical applicability42. Nevertheless, further studies will be required to transfer the biomarkers highlighted in this systematic review to clinical routine.

Our results must be interpreted carefully since this work has several limitations. First, even if the quality assessment was performed using a checklist adapted and validated for quantitative studies43, it was not necessarily adapted to evaluate the quality of laboratory studies involving various types of measurement tools (e.g. EMG, optoelectronic cameras). However, this checklist has already been used in the context of low back pain44 and was considered to be the most adapted to the needs of this systematic review. Second, the large number of studies included in this systematic review limited the possibilities to explore all methodological details. In order to highlight biomarkers and guide future research, several extracted data have been voluntarily omitted or simplified during the data synthesis process (e.g. task perturbations, EMG electrode placement). However, all extracted data are provided as Supplementary material to allow future data analysis. Third, the responsiveness of the highlighted biomarkers, i.e. their ability to detect change over time17, was not analysed. Since the aim of this systematic review was to identify movement and muscular activity biomarkers that can be used to discriminate NSCLBP patients from an asymptomatic population, this COSMIN domain was out of the present scope. However, as responsiveness may be one of the most important endpoints in clinical practice, it will be crucial to verify the sensitivity to change of the highlighted biomarkers and perform additional responsiveness studies when needed.

Several authors have pointed out the importance of having reliable biomarkers sustaining the biomechanical concepts proposed for low back pain45. Integrating these biomarkers into studies along with well recognised social and psychological factors has the potential to add to our understanding of this complex disease and to open the scientific community to new therapeutic approaches. This systematic review highlights that, even if several relevant biomarkers related to movement and muscular activity have been proposed and their measurement properties partially assessed, there is currently a lack of consensus concerning a robust and standardised biomechanical approach to assess low back pain. Prior to such a consensus, it is however crucial to increase the current knowledge on the biomarkers highlighted here (and on any other possible biomarker) to ascertain that all COSMIN domains (i.e. reliability, validity, responsiveness and interpretability) have been well explored. To that end, future studies should seriously consider reproducing existing protocols and measuring parameters under the same conditions as in the original articles, but also in low back pain populations from different countries and cultures and with different pain/disability levels. The use of sensors known for a high clinical applicability should also be further deployed to ease the adoption of the related measurements in clinical routine. Finally, every study should report the pain level and disability score of the included patients to better characterise the assessed populations and possibly allow the identification of different sub-groups within this heterogeneous population.

This systematic review highlighted biomarkers related to movement and muscular activity that discriminate NSCLBP patients from an asymptomatic population from a musculoskeletal point of view. While numerous parameters were assessed in the literature, with a large heterogeneity and mainly one study for one measurement, a comprehensive assessment of the measurement properties of 31 movement biomarkers and 14 muscular activity biomarkers was identified in the included studies. On the whole, these biomarkers support the primary biomechanical concepts proposed for low back pain. However, a consensus concerning a robust and standardised biomechanical approach to assess low back pain is currently missing but desperately needed in order to improve our knowledge of this condition and extend our therapeutic capabilities.

Methods

Study design

This study is a systematic review based on the following research question established using the PICO approach46: “In adults suffering non-specific chronic low back pain, what are the biomarkers that allow to discriminate them from an asymptomatic population in terms of movement or muscular activity?”.

Protocol and registration

This systematic review was registered through PROSPERO (registration number: CRD42020144877) before being undertaken. It has been written following the Preferred Reporting Items for Systematic review and Meta-Analysis (PRISMA) Statement47. The assessment of each reported outcome measurement instrument was inspired by the COSMIN checklist17.

Information sources and search strategy

An electronic search was performed in Medline, Embase, and Web of Knowledge databases from inception to July 2019 without any time limit. The logical (nested) expressions for the search were: (‘low* back pain*’ or ‘low* backpain*’ or ‘low* back ache*’ or ‘low* backache*’ or ‘low* back syndrome*’ or lumbago* or ‘lumbal pain*’ or ‘lumbal syndrome*’ or lumbalgia* or ‘lumbar pain*’ or ‘lumbar spine syndrome*’ or lumbodynia* or ‘lumbosacral pain*’ or ‘lumbar multifidus pain*’ or ‘lumbar flexion rotation syndrome*’ or ‘lumbar extension rotation syndrome*’ or LBP or CLBP or NSLBP) and (chronic or continu* or constant or persistent or prolonged or longstanding) and (electromyogra* or EMG or sEMG or kinematic* or angle). The search was based on the title, keywords and abstract. Duplicates were identified and removed using Mendeley (https://www.mendeley.com)48. Cross-referencing was also undertaken by checking references cited by the articles included in the full-text inspection.
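The nested structure of this expression (three groups of synonyms joined by "or", combined with "and") can be sketched as follows. This is only an illustration of how the expression is composed, not the tooling used for the review: the helper names are hypothetical, straight quotes replace the typographic ones, and the exact field and quoting syntax differs between Medline, Embase and Web of Knowledge.

```python
# Term groups copied from the search expression quoted above.
# Helper names (or_group, build_query) are hypothetical, for illustration only.
PAIN_TERMS = [
    "low* back pain*", "low* backpain*", "low* back ache*", "low* backache*",
    "low* back syndrome*", "lumbago*", "lumbal pain*", "lumbal syndrome*",
    "lumbalgia*", "lumbar pain*", "lumbar spine syndrome*", "lumbodynia*",
    "lumbosacral pain*", "lumbar multifidus pain*",
    "lumbar flexion rotation syndrome*", "lumbar extension rotation syndrome*",
    "LBP", "CLBP", "NSLBP",
]
CHRONICITY_TERMS = [
    "chronic", "continu*", "constant", "persistent", "prolonged", "longstanding",
]
MEASUREMENT_TERMS = ["electromyogra*", "EMG", "sEMG", "kinematic*", "angle"]


def or_group(terms):
    """Join terms with 'or', quoting multi-word phrases, and wrap in parentheses."""
    quoted = [f"'{t}'" if " " in t else t for t in terms]
    return "(" + " or ".join(quoted) + ")"


def build_query(*groups):
    """Combine the OR-groups with 'and', so that every group must match."""
    return " and ".join(or_group(g) for g in groups)


QUERY = build_query(PAIN_TERMS, CHRONICITY_TERMS, MEASUREMENT_TERMS)
print(QUERY)
```

Keeping the three concepts (condition, chronicity, measurement) as separate lists makes it easier to see which synonyms were covered and to extend one concept without touching the others.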

Study selection and eligibility criteria

The records identified from the search strategy were reviewed according to the eligibility criteria reported in Table 3.

Table 3 Eligibility criteria of the identified studies.

One author (FM) inspected the literature by screening record titles and abstracts using the Rayyan online application (http://rayyan.qcri.org), a tool developed to support the systematic review process49. Full-text inspection of the identified records was then undertaken independently by two authors (FM and KRD) to determine the studies included in the analysis.

Risk of bias assessment within studies

The risk of bias within included studies was assessed using the modified McMaster Critical Review Form for Quantitative Studies43, already used in the context of low back pain44. This tool is based on 16 criteria: C1: Purpose, C2: Literature review, C3: Study design, C4: Blinding, C5: Sample description, C6: Sample size, C7: Ethics and consent, C8: Validity of outcome, C9: Reliability of outcome, C10: Intervention description, C11: Statistical significance, C12: Statistical analysis, C13: Clinical importance, C14: Conclusions, C15: Clinical implications, C16: Study limitations. Each criterion was rated one (satisfying description or justification) or zero (limited information or no information). In the current review, a score of ≥ 9 was used as an indication of acceptable methodological quality. Each included study was assessed independently by two authors (FM and KRD). In case of discrepancy, the original article was checked by a third author (SA).
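The scoring rule can be sketched as below, under stated assumptions. The 16 binary criteria and the threshold of 9 come from the text above, and the normalisation by the number of applicable criteria follows the description given in the Results. Encoding a non-applicable criterion as `None`, and the function names, are assumptions of this sketch rather than part of the checklist.

```python
from typing import Optional, Sequence, Tuple

N_CRITERIA = 16          # C1..C16 of the modified McMaster form
ACCEPTABLE_SCORE = 9     # threshold used in this review (score >= 9)


def quality_score(ratings: Sequence[Optional[int]]) -> Tuple[int, float]:
    """Return (raw score, percentage of applicable criteria satisfied).

    Each rating is 1 (satisfied), 0 (not satisfied) or None
    (not applicable; this encoding is an assumption of the sketch).
    """
    if len(ratings) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} ratings, got {len(ratings)}")
    applicable = [r for r in ratings if r is not None]
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 0, 1 or None")
    raw = sum(applicable)
    percent = 100.0 * raw / len(applicable) if applicable else 0.0
    return raw, percent


def is_acceptable(ratings: Sequence[Optional[int]]) -> bool:
    """True when the raw score reaches the acceptability threshold."""
    raw, _ = quality_score(ratings)
    return raw >= ACCEPTABLE_SCORE


# Hypothetical study satisfying 12 of 16 applicable criteria.
example = [1] * 12 + [0] * 4
print(quality_score(example), is_acceptable(example))
```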

Data extraction

In the first stage of data extraction, the characteristics of the populations and measured parameters were extracted for each study. Populations (control and NSCLBP) were described by number of subjects, age, BMI, gender ratio, primary country of origin, as well as pain level (Visual Analog Scale50 results) and disability score (Oswestry Disability Index score51 or Roland Morris Disability Questionnaire52), when available. Parameters were defined by a type (movement or muscular activity), a variable and a task. Variables were described by a category (temporal, spatial/intensity, frequential, variability, coordination), a name, a unit, a measurement tool and a region of interest (e.g. multifidus, thorax). Tasks were described by an International Classification of Functioning, Disability and Health (ICF) 2nd level category53, a name, a movement constraint and a movement perturbation, when available. For each parameter, the measurement properties were extracted from the original article, when available, in terms of three COSMIN domains17, namely reliability (intra-observer and inter-observer reliability), validity (content validity, criterion validity, construct validity) and interpretability (minimal detectable change), but also in terms of clinical applicability (cost, simplicity of setup and execution, task-related pain)54. Content validity and clinical applicability were reported as poor/moderate/good/excellent levels depending on the number of weak items (reported between brackets). Specific test results were reported for reliability, other validity items and interpretability (e.g. intraclass correlation, p-value). Two authors (FM and KRD) extracted data independently and cross-checked the results. In case of discrepancy, the original article was checked by a third author (SA).
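The structure of an extracted parameter, as described in this paragraph, could be represented along the following lines. The class and field names are hypothetical and the example entry is invented to show the shape of a record; it is not taken from the Supplementary Tables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ParameterType(Enum):
    MOVEMENT = "movement"
    MUSCULAR_ACTIVITY = "muscular activity"


class VariableCategory(Enum):
    TEMPORAL = "temporal"
    SPATIAL_INTENSITY = "spatial/intensity"
    FREQUENTIAL = "frequential"
    VARIABILITY = "variability"
    COORDINATION = "coordination"


@dataclass(frozen=True)
class Variable:
    category: VariableCategory
    name: str
    unit: str
    measurement_tool: str
    region_of_interest: str


@dataclass(frozen=True)
class Task:
    icf_category: str                            # ICF 2nd level code, e.g. "d410"
    name: str
    movement_constraint: Optional[str] = None    # recorded when available
    movement_perturbation: Optional[str] = None  # recorded when available


@dataclass(frozen=True)
class Parameter:
    type: ParameterType
    variable: Variable
    task: Task


# Invented example entry, for illustration only.
example = Parameter(
    type=ParameterType.MOVEMENT,
    variable=Variable(
        category=VariableCategory.SPATIAL_INTENSITY,
        name="lumbar range of motion",
        unit="deg",
        measurement_tool="optoelectronic cameras",
        region_of_interest="lumbar",
    ),
    task=Task(icf_category="d410", name="trunk sagittal bending"),
)
print(example.task.icf_category, example.variable.category.value)
```

Making the records immutable (`frozen=True`) allows two parameters to be compared for equality, which is convenient when deciding whether parameters from different studies describe the same variable and task.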

In the second stage of data extraction, all parameters with at least moderate construct validity (see Table 4 for rating) were defined as biomarkers. When these biomarkers were measured in a sufficiently similar way in terms of variable and task, they were merged across studies to conduct a qualitative analysis for the data synthesis of these instruments.

Table 4 Rating used for each measurement property evaluated.

Each biomarker was identified by a code and, as for parameters, defined by a type, a variable and a task. The total number of subjects and patients, as well as the related studies, were also reported. To provide a clear, objective and informative overview of the identified biomarkers, their reliability, validity, interpretability and clinical applicability were then rated as excellent, good, moderate or poor. The overall score for each measurement property of each biomarker was obtained by reporting the majority rating across related studies. The rating of each measurement property is defined in Table 4. For the minimum detectable change (MDC), only its availability was stated.
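The majority rule used to combine ratings across studies can be sketched as follows. The four levels come from the text above; the function name and, in particular, the tie-breaking rule (falling back to the lowest tied level) are assumptions of this sketch, since the handling of ties is not described here.

```python
from collections import Counter
from typing import Optional, Sequence

# Rating levels ordered from lowest to highest.
LEVELS = ["poor", "moderate", "good", "excellent"]


def overall_rating(ratings: Sequence[str]) -> Optional[str]:
    """Return the most frequent rating across studies.

    Returns None when the property was not assessed by any study.
    Ties are resolved towards the lowest tied level (assumption).
    """
    if not ratings:
        return None
    unknown = set(ratings) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown rating level(s): {sorted(unknown)}")
    counts = Counter(ratings)
    top = max(counts.values())
    tied = [level for level in LEVELS if counts.get(level, 0) == top]
    return tied[0]


# Hypothetical biomarker rated by three studies.
print(overall_rating(["good", "good", "moderate"]))
```

Because most biomarkers in this review were reported by a single study, the rule usually reduces to returning that study's rating.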

Data synthesis

The targeted populations of the included studies were compiled on a labelled world map (generated with Microsoft Excel 2019, Microsoft Corporation, USA) to highlight in which countries NSCLBP was analysed, by how many studies and on how many participants (control and NSCLBP).

The assessment strategies observed in the included studies were highlighted in terms of parameter type, ICF 2nd level category, variable category and region of interest. To this end, a Circos plot55 was generated to establish the links between each included study and these strategies.

The measurement properties of the identified biomarkers were highlighted using Circos plots55. The primary characteristics, i.e. corresponding ICF 2nd level category, variable category and region of interest, were also reported on these plots, while the full characteristics of each biomarker are available in Supplementary Table 3.