Introduction

Contemporary organisations operate in a competitive, uncertain and volatile environment1. Employees working in such organisations are often asked to perform duties (illegitimate tasks) that are not in agreement with their professional identity2,3. They usually report that tasks are illegitimate due to either employees feeling that a task should not have to be carried out by anybody, or that a task could have been avoided with better organizational systems4 (unnecessary tasks); or tasks being unfairly assigned to them, that is, “tasks that may be outside the range of an employee’s occupation or status, or professional role”4 (p. 73) (unreasonable tasks). In the subjective view of employees (that reflect normative prescriptions of what is expected, or seen as legitimate, in a given role or position), unnecessary tasks are perceived as pointless, whereas unreasonable ones are seen as inappropriate, given employees’ competencies or professional resources. Illegitimate tasks lead to greater levels of administrative work and reduce the time spent on core responsibilities. For instance, rather than focusing on providing medical care to patients, teaching students, producing more research or facilitating citizens’ daily lives, employees may be asked to prepare double documentation (i.e., both on paper and in electronic form), numerous reports, and summaries5,6. Illegitimate tasks have been recognised as an important social stressor that threatens the self and violates one’s professional identity, according to Stress-as-Offence-to-Self (SOS) theory7,8. SOS theory assumes that maintaining a positive self-view is a primary human concern, and any threat to self-esteem elicits strain7. Illegitimate tasks cause stress because employees feel they are not valued and respected when they receive these tasks3,8. Thus, recipients feel that either no one should be assigned such tasks because they are pointless or they are left feeling “I should not have to do this! It is not my job!”. 
Illegitimate tasks can also be understood as job demands due to their energy-related features because they require effort and drain employees’ physical and psychological wellbeing9. Therefore, as illegitimate tasks mount over time, employees may lose their professional identity and engagement, which can cause further negative consequences for individuals and organisations2,10,11,12. What is needed, then, is a valid and effective tool that will respond to challenges which appear at rapid speed. The Bern Illegitimate Tasks Scale (BITS) is the most frequently used instrument to evaluate unnecessary and unreasonable tasks4,13. To date, nevertheless, few studies of its psychometric properties have been conducted13,14,15. The contribution of the current study is its simultaneous application of classical test theory and item response theory (IRT) to assess the validity of BITS. IRT is a family of related mathematical models that relate latent traits (e.g., perceived illegitimate tasks) to the probability of responses to items in an assessment. Thus, IRT is used to evaluate the item performance and precision of a scale measuring a latent trait16. This is the first study to use IRT to report the item functioning of BITS, which consists of two conceptually separate dimensions capturing unnecessary and unreasonable tasks.

Illegitimate tasks in the workplace

In line with SOS theory7,8, illegitimate tasks decrease employees’ self-esteem2,11,17,18, as they may imply a lack of appreciation for their professional role5,19. Therefore, illegitimate tasks threaten one’s personal and social self, and professional identity7,8. In particular, intrinsically motivated employees suffered more from illegitimate tasks when they were not recognised for their job efforts14. Illegitimate tasks contribute to aversive and counterproductive behaviours4,20 or poorer job performance10,21. Illegitimate tasks can be accompanied by other job demands (e.g., work overload) which threaten both the wellbeing and health of employees22. They are also related to a higher level of burnout23,24, lower job satisfaction12,25,26, work engagement and meaning of work27, and higher turnover intentions28,29.

Significantly, illegitimate tasks have been identified as a severe problem across various workplaces, including the healthcare sector6,15,30,31, higher education32, IT professionals33, teachers5, engineers17, administrative staff11, blue-collar workers34, and also Red Cross volunteers29.

In terms of other effects, illegitimate tasks may also be transferred to the domain outside work. Due to this, illegitimate tasks have been shown to be associated with work-family conflict13,24, which may be exacerbated by psychological detachment35. Moreover, illegitimate tasks are related to biological stress indicators, such as worse sleep quality36 and a higher level of cortisol, especially among male employees who evaluated their health as poor37.

Illegitimate tasks constitute two distinct facets: unnecessary and unreasonable tasks2,4. For example, unnecessary tasks are frequently observed30 in the healthcare sector. However, unreasonable tasks, such as tasks outside one’s occupational role, are also widely reported6. Unnecessary tasks are related to job dissatisfaction38, while unreasonable tasks consume time and energy30,39. Furthermore, unnecessary tasks result in a fluctuation of negative affect within individuals, while unreasonable tasks lead to negative affect at the between-person level18. Unnecessary tasks predict a higher level of distress, while unreasonable tasks cause a broader spectrum of mental health issues, such as distress, anxiety, and depression40.

Concluding, illegitimate tasks are commonplace at work. They may negatively affect the effectiveness of an organisation. They may also harm employee wellbeing and organisational behaviour and even encroach on other areas of employees’ lives.

Bern Illegitimate Tasks Scale

Illegitimate tasks are usually measured with BITS, which consists of two conceptually separate dimensions capturing unnecessary and unreasonable tasks. Initially, BITS comprised nine items that were classified into two dimensions13. Later, however, their number was reduced to eight4. The two dimensions are moderately correlated2,13,29.

The popularity of BITS is underlined by the fact that it has been translated into several languages, including English11, Swedish10,41, Norwegian39, Finnish6,27, Turkish42, Indian43, Chinese21,35 and Spanish15. Existing studies have mostly focused on the scale’s convergent and discriminant validity. Moreover, most studies have shown that BITS has good internal consistency with regard to both subscales (unnecessary and unreasonable tasks)15.

However, few studies have demonstrated the psychometric properties of BITS in detail based on classical test theory. Using confirmatory factor analysis (CFA), Jacobshagen13, Muntz and Dormann14, and Portilla et al.15 observed two moderately correlated subscales. Even though some studies on the construct validity of BITS have conceived illegitimate tasks as having only a single construct26,27, they concordantly emphasise that this scale captures two substantially different types of illegitimate tasks.

As mentioned above, the original version of BITS consisted of nine items13. Some data indicated that Item 4: ‘Do you have work tasks to take care of which keep you wondering if they would not exist (or could be done with less effort), if some other people made less mistakes?’ loaded on both factors13 (see Table 9, p. 75). Thus, Jacobshagen suggested that Item 4 should be omitted, and Semmer et al.4 started applying the eight-item version of BITS. Unfortunately, it was not emphasised enough that Item 4 had been removed, and a sole description of four items for both unnecessary and unreasonable tasks was revealed. Consequently, this change was confusing to researchers, who have been applying different versions since. The nine-item BITS is still used25,32. Moreover, a few reports of factor analysis revealed that occasionally Item 5 was removed25,44. In addition, BITS was reduced to a few items45 or a single-item scale for unnecessary and unreasonable tasks23. Furthermore, we have found different response formats in the existing studies. In addition to the original five-point response format scale, we have also observed a four- and seven-point scale ranging from ‘strongly disagree’ to ‘strongly agree’17,20,23,42.

The current study

In the 16-year history of BITS, there have been too few solid reports on the psychometric properties of the scale. It seems necessary to test whether the eight items of BITS measure a single unidimensional construct or two separate dimensions. Moreover, no published studies have investigated the item and scale functioning of BITS using IRT. Therefore, to date, our study is the first to report the item and scale functioning of this instrument using IRT.

An overarching purpose of the current study was to evaluate the factor structure of BITS, to determine whether the items and dimensions of BITS function well, and to validate the Polish version of BITS as an important job-related stressor.

We have formulated the following research questions:

RQ1 What is the structure of BITS, i.e., how many factors does it represent and is it homogeneous (unidimensional)?

RQ2 How effectively does each item discriminate between employees with different levels of illegitimate tasks (i.e., unnecessary tasks and unreasonable tasks), and do the items adequately measure illegitimate tasks across low to high levels?

RQ3 What is the construct validity, namely, convergent and discriminant validity, of the Polish version of BITS?

Methods

To assess the psychometric properties of BITS we used two heterogeneous samples of employees. In sample 1, first we examined the factor structure of BITS and then assessed item discrimination and local dependence. In sample 2, we replicated the construct validity and examined the convergent and discriminant validity, as well as the reliability of BITS.

Participants and Procedure

Sample 1

To conduct a two-parameter logistic model (2PLM) IRT test with a graded response model (GRM)46, the recommended minimum sample size is 500 to 750 participants for 10–20 items47. In 2019, 966 fully completed protocols were received. One multivariate outlier with a high Mahalanobis distance (p < 0.001) was detected and excluded. Thus, the sample consisted of 965 employees at Polish organisations in the education, public administration and IT sectors. Respondents (65% women) were aged between 21 and 68 years with a mean age of 42.4 years (SD = 10.2 years), and they had job tenures of 1 to 40 years. Most had high levels of education (bachelor’s or master’s degree).

Sample 2

In 2020, 821 fully completed protocols were obtained (response rate 68%). Eighteen multivariate outliers were excluded. Thus, the final sample consisted of 803 employees at different Polish organisations in the healthcare, education, public administration and IT sectors. Participants (59% women) were aged between 23 and 66 years with a mean age of 43 years (SD = 10.2 years). They had an average job tenure of 19.7 years (SD = 10.8 years, range 2–45) and an average job tenure in their current organisation of 12.9 years (SD = 10.1, range 0.5–41).

Ethics statement

The study was performed in line with the principles of the Declaration of Helsinki. All data were collected using anonymous online surveys. All participants were informed of the nature of the current study and gave their informed consent to participate. According to national recommendations, issued by the Psychological Committee of the Polish Science Academy (21st June 2013) and the National Science Centre, ethics approval to conduct this study was not required. More details about the study procedure are presented in the Supplementary material.

Measures

We took the original Bern Illegitimate Tasks Scale (BITS)13 instrument including nine items (all of which are positively worded) in order to replicate previous findings from past research, and we clarified how each item functioned. Unnecessary tasks are assessed on BITS with five items. Each item is introduced with the lead-in question, ‘Do you have work tasks to take care of which keep you wondering if…’, followed by phrases such as ‘…they have to be done at all?’ There are four items assessing unreasonable tasks, and each of these is introduced with the lead-in question, ‘Do you have work tasks to take care of which you believe…’, followed by phrases such as ‘…should be done by someone else?’ Responses are marked on a five-point Likert scale ranging from 1 (never) to 5 (frequently). Higher scores indicate that employees are more often required to do unnecessary and unreasonable tasks.

A six-stage translation and adaptation process was used to adapt BITS from English to Polish48. This process included forward translation into Polish by two different translation agencies, synthesis, back translation from Polish to English, harmonisation, cognitive interviews, revisions, and pilot data sampling. The Polish version can be found in the Supplementary material.

To assess the convergent and discriminant validity of BITS (sample 2), we applied related constructs such as job demand (work overload) and outcomes (work performance and occupational wellbeing, i.e., job burnout and job satisfaction). Instruments to evaluate these constructs are reported in Table 1.

Table 1 Study instruments: item number, sample item, response format and citation.

Statistical analyses

Preliminary analyses

Scores on the total BITS and its two dimensions were calculated according to the manual13. Using sample 1, the exploratory factor analysis (EFA; the principal axis factoring extraction method with oblimin rotation) and parallel analysis (PA) identified two factors. Each item belongs to one of the two factors. The factor loadings for the items were above 0.68, with the exception of Item 4 (0.44) (see Supplementary Table S1). Indeed, the content of this item refers to both unnecessary and unreasonable tasks. IRT for the original nine-item BITS was also carried out and, following Baker’s16 cut-offs, almost all items on the unnecessary tasks dimension were designated very high, whereas Item 4 was moderate in terms of its precision to discriminate between respondents with different levels of illegitimate tasks. Thus, Item 4 was relatively different. Considering jointly the preliminary analyses, Item 4 was removed from further analyses. Consequently, two subscales containing four items each were assigned to the analyses (unnecessary tasks: Items 1–3, 5; unreasonable tasks: Items 6–9).

Classical test theory analyses

To ensure that illegitimate tasks measured by items of BITS met the IRT assumption of unidimensionality, EFA and PA52 were conducted. In EFA, the most commonly used indicators supporting the unidimensionality assumption are that the first factor accounts for at least 20% of the variance or that the ratio of the eigenvalue of the first factor to that of the next factor is greater than 4 (ref. 53).
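The two indicators described above can be computed directly from an inter-item correlation matrix. The following is only an illustrative sketch and not the procedure used in this study: the 4 × 4 correlation matrix is hypothetical, the eigenvalues are taken from the unreduced correlation matrix rather than from principal axis factoring, and the two largest eigenvalues are approximated by power iteration with deflation instead of a statistics package routine. All function names are illustrative.

```python
import math

def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenpair(m, iterations=500):
    """Approximate the largest eigenvalue and its eigenvector of a
    symmetric positive semi-definite matrix by power iteration."""
    v = [1.0 + i for i in range(len(m))]      # arbitrary non-zero start vector
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return eigenvalue, v

def first_two_eigenvalues(m):
    """Return the two largest eigenvalues; the second is found after
    removing (deflating) the first component from the matrix."""
    lam1, v1 = top_eigenpair(m)
    n = len(m)
    deflated = [[m[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    lam2, _ = top_eigenpair(deflated)
    return lam1, lam2

def unidimensionality_indicators(corr):
    """Share of total variance on the first factor and the ratio of the
    first eigenvalue to the second."""
    lam1, lam2 = first_two_eigenvalues(corr)
    return lam1 / len(corr), lam1 / lam2

# Hypothetical correlation matrix: four items, all inter-item correlations 0.5.
corr = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
share, ratio = unidimensionality_indicators(corr)
print(f"variance on first factor: {share:.3f}, eigenvalue ratio: {ratio:.3f}")
print("20% criterion met:", share >= 0.20, "| ratio criterion met:", ratio > 4)
```

With real data, the correlation matrix would be estimated from the item responses, and a dedicated eigenvalue solver would normally replace the power iteration used here.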

Factor validity was also assessed with CFA based on a maximum likelihood estimation method in sample 1, and we replicated these findings in sample 2. The assumptions of data being continuous with multivariate normal distribution were met. The goodness-of-fit of the CFA model was assessed by the chi-squared test (χ2), comparative fit index (CFI), root mean square error of approximation (RMSEA), standardised root mean square residual (SRMR) and Bayesian information criterion (BIC). Values of CFI > 0.90 (ref. 54), RMSEA and SRMR < 0.08, and a lower BIC55 indicate an acceptable goodness-of-fit. Measurement reliability was established using two coefficients, Cronbach’s alpha and composite reliability (CR), with values > 0.70 considered acceptable (ref. 56).
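To make the two reliability coefficients concrete, the sketch below computes Cronbach’s alpha from raw item scores and CR from standardised factor loadings. It is an illustration under stated assumptions, not the SPSS/AMOS output of this study: the responses and loadings are hypothetical, and the CR formula used here assumes uncorrelated measurement errors (error variance = 1 − loading²).

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha. `items` is a list of per-item score lists,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def composite_reliability(loadings):
    """Composite reliability from standardised loadings, assuming
    uncorrelated errors with variance 1 - loading**2."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

# Hypothetical responses of five employees to four items (1-5 scale).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
    [2, 2, 4, 4, 5],
]
# Hypothetical standardised loadings for a four-item subscale.
loadings = [0.80, 0.75, 0.85, 0.78]

alpha = cronbach_alpha(items)
cr = composite_reliability(loadings)
print(f"alpha = {alpha:.3f}, CR = {cr:.3f}")
print("both above 0.70:", alpha > 0.70 and cr > 0.70)
```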

Using sample 2, convergent validity was measured in terms of average variance extracted (AVE), with a value > 0.50 required for each latent variable57. Discriminant validity was assessed by examining whether AVE was higher than the squared bivariate correlations with work overload, job burnout, job satisfaction and work performance57. Construct validity was assessed by running multiple linear regressions (enter method). SPSS and AMOS software were used to conduct the statistical analyses.

Item response theory analyses

Before applying IRT models with GRM, we evaluated the basic assumptions of unidimensionality, monotonicity, and local item independence. The 2PLM IRT analyses were conducted using IRTPRO. Given BITS’ five response categories, a GRM was selected46. A key assumption of the GRM is that scores for each item are ordered consistently across items, such that the lowest response category (labelled ‘never’) is indicative of the lowest level of θ (where θ indicates different levels of the illegitimate tasks, measured by BITS). The GRM estimates a common item discrimination parameter for each item (a) and also estimates the location or difficulty/severity threshold parameters (bi) for each response category within the item58. Item discrimination describes how well an item can differentiate between respondents at different trait levels. Each item of BITS has one discrimination parameter a that is common to its five response categories. A higher a value indicates that an item discriminates more precisely between respondents with different levels of θ. Each item was evaluated in terms of its function relative to the other items, which was assessed by both its scale score and its contribution to the overall information gathered from each scale score59. Item difficulty is known as “location” on the difficulty range (the location on the latent trait) and describes where the item functions best along the trait scale; thresholds with lower values are expected to be endorsed at lower trait levels. Difficulty threshold parameters bi are expressed in standard deviation units relative to a mean of 0, and commonly range from −3 to +3 (ref. 59). Both discrimination and difficulty threshold parameters have been represented graphically as item characteristic curves (ICCs). In addition, measurement invariance of the scale between men and women was assessed by differential item functioning (DIF). Gender differences in the discrimination and difficulty parameters of the items were examined using chi-squared tests.
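The role of the discrimination parameter a and the threshold parameters bi in the GRM can be illustrated with a short sketch. This is not the IRTPRO estimation used in the study; it only evaluates the standard GRM response function for one hypothetical item, whose parameter values (a = 2.5 and four evenly spaced thresholds for a five-point scale) are assumptions chosen for illustration.

```python
import math

def grm_category_probabilities(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta      -- latent trait level
    a          -- item discrimination
    thresholds -- ordered thresholds b_1 < ... < b_(m-1) for m categories
    """
    # Cumulative probabilities of responding in category k or higher.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative values.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical item on a five-point scale (1 = never ... 5 = frequently).
a = 2.5
thresholds = [-1.5, -0.5, 0.5, 1.5]

for theta in (-2.0, 0.0, 2.0):
    probs = grm_category_probabilities(theta, a, thresholds)
    most_likely = probs.index(max(probs)) + 1
    print(f"theta = {theta:+.1f}: most likely response category = {most_likely}")
```

In this sketch, a respondent two standard deviations below the mean is most likely to choose the lowest category and one two standard deviations above the mean the highest; a larger a makes the transition between categories steeper.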

These two approaches, i.e., classical test theory modelling with CFA and modelling using IRT, reinforce each other. CFA is confirmative and able to test the conceptual model of BITS, while IRT produces sample-independent estimates of parameters.

Results

Factor structure of BITS

The total eight-item BITS did not meet the unidimensionality criterion that the ratio of the eigenvalue of the first factor to that of the next factor should be greater than 4 (ref. 53). The results of the PA suggested two factors (see Supplementary Table S2).

The extraction of two factors using EFA showed that none of the items cross-loaded on the second factor with a value greater than 0.30 (see Supplementary Table S3). Therefore, two factors were retained as two conceptually-derived dimensions (four items each), which each showed good reliability (Table 2) and supported unidimensionality. Table 2 also presents descriptive statistics of the Polish version of BITS.

Table 2 Descriptive statistics of the Polish version of the BITS.

We also assessed factor validity for BITS using CFA. We analysed two models: a single-factor model and a model of two correlated latent factors (Table 3). The findings revealed that the modified model with the two correlated latent factors, including two pairs of error correlation (Item 1 and Item 2, as well as Item 8 and Item 9), yielded an excellent fit. Those factors were moderately correlated (0.66) (see Supplementary Fig. S1A). This result confirmed the result from the PA, indicating that the eight-item BITS, applied on the current sample, comprises two separate dimensions. Therefore, both the IRT models and the validation analyses were run separately for each dimension.

Table 3 Confirmatory factor analysis of the single- and two-factor models of BITS.

Item functioning

Table 4 shows that discrimination parameter a ranged from 1.96 to 4.93 for items relating to unnecessary tasks and from 2.42 to 3.27 for those relating to unreasonable tasks. Following Baker’s16 cut-offs, all items were designated ‘very high’ (≥ 1.69). Therefore, the items were deemed able to adequately distinguish between employees with different levels of unnecessary and unreasonable tasks.

Table 4 Item Response Analysis of the Polish version of the BITS.

An inspection of the item difficulty parameters b indicates that both dimensions of illegitimate tasks in the Polish version of BITS adequately measured these traits across low to high levels. DIF showed that all items had measurement invariance between males and females. ICCs combined with IIFs for the four unnecessary task items and four unreasonable task items in the Polish version of BITS are presented in Fig. 1.

Figure 1

Item characteristic curves (ICCs, colored lines), combined with item information functions (IIFs, dashed lines) for the unnecessary tasks subscale (A) and for the unreasonable tasks subscale (B) of the Polish version of the BITS (Sample 1 N = 965).

Concerning specific reliability, the test information function indicated that both factors were sufficiently informative (i.e., reliable) for a broad range of the trait illegitimate tasks (see Fig. 2).

Figure 2

Test information function (TIF) of the dimension of unnecessary tasks (A) and the unreasonable tasks (B) of the Polish version of the BITS under the graded response model (GRM) (Sample 1 N = 965). Note. In both panels, latent trait θ is shown on the horizontal axis, and the amount of information and the standard error yielded by the test at any trait level are shown on the vertical axis. The information magnitude can be interpreted by computing the associated reliability (r = 1 − 1/information). (A) From about 2.30 SDs below the mean to about 2.00 SDs above the mean, the amount of test information was at least 6.5 (which yields a standard error of estimate of about 0.38); thus, reliability was equal to or greater than 0.85 within this range. (B) From about 2.00 SDs below the mean to about 2.50 SDs above the mean, the amount of test information was at least 4.8 (which yields a standard error of estimate of about 0.47); thus, reliability was equal to or greater than 0.73 within this range.
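The conversion between test information and reliability quoted in the note to Fig. 2 (r = 1 − 1/information) can be written as a small helper. The sketch below is illustrative only; the function names are ours, and the standard error relation SE = 1/√information is the usual IRT identity rather than a value taken from the IRTPRO output.

```python
import math

def reliability_from_information(information):
    """Reliability implied by a given amount of test information."""
    return 1.0 - 1.0 / information

def standard_error_from_information(information):
    """Standard error of the trait estimate implied by test information."""
    return 1.0 / math.sqrt(information)

def information_for_reliability(reliability):
    """Test information needed to reach a target reliability."""
    return 1.0 / (1.0 - reliability)

info = 6.5  # first information value quoted in the note to Fig. 2
print(f"reliability at information {info}: {reliability_from_information(info):.2f}")
print(f"information needed for r = 0.70: {information_for_reliability(0.70):.2f}")
```

The last helper shows how a conventional reliability threshold, such as 0.70, translates into a minimum level of test information.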

Further validity of BITS: replication in sample 2

We replicated construct validity for BITS using CFA in sample 2 (Table 3). The single-factor model showed a poor fit to the data and was worse than the two-factor models. A modified model with the two correlated latent factors obtained an excellent fit. Those factors were moderately correlated (0.67) (see Supplementary Fig. S1B).

Convergent validity was assessed with AVE, and the value for both factors in the model exceeded the criterion of 0.50 (0.62 for both unnecessary and unreasonable tasks). In sample 2, unnecessary tasks were more frequently observed than unreasonable tasks (t(802) = 11.03, p < 0.001).

Discriminant validity was assessed with the Fornell–Larcker57 criterion in regard to work overload, occupational wellbeing (i.e., job burnout and job satisfaction) and work performance. Descriptive statistics, reliabilities, and correlations for study variables are presented in Table 5.

Table 5 Descriptive statistics, reliabilities, and correlations for study variables.

AVE for unnecessary tasks (0.62) was higher than the value of its squared correlations with work overload (0.12), job burnout (0.19), job satisfaction (0.11) and work performance (0.04). In addition, AVE for unreasonable tasks (0.62) was higher than the value of its squared correlations with work overload (0.21), job burnout (0.11), job satisfaction (0.10) and work performance (0.01). Therefore, both types of illegitimate tasks can be regarded as distinct from measures assessing the level of work overload, occupational wellbeing and work performance. Further to this, we evaluated the reliability of BITS. The Cronbach’s alpha value was 0.89 for unnecessary tasks and 0.87 for unreasonable tasks, and similarly, the CR was 0.87 for both subscales. Thus, it confirmed the internal consistency and reliability of BITS.
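The comparison reported above can be reproduced as a simple check. The sketch below uses the AVE values and squared correlations quoted in the text for sample 2; the helper for computing AVE from standardised loadings is included for completeness, and the loadings passed to it are hypothetical because the individual loadings are not listed here.

```python
def average_variance_extracted(loadings):
    """AVE as the mean of the squared standardised loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def fornell_larcker_ok(ave, squared_correlations):
    """True if AVE exceeds every squared correlation with other constructs."""
    return all(ave > r2 for r2 in squared_correlations.values())

# AVE and squared correlations as reported in the text for sample 2.
reported = {
    "unnecessary tasks": (0.62, {"work overload": 0.12, "job burnout": 0.19,
                                 "job satisfaction": 0.11, "work performance": 0.04}),
    "unreasonable tasks": (0.62, {"work overload": 0.21, "job burnout": 0.11,
                                  "job satisfaction": 0.10, "work performance": 0.01}),
}

results = {name: fornell_larcker_ok(ave, r2) for name, (ave, r2) in reported.items()}
for name, ok in results.items():
    print(f"{name}: discriminant validity criterion met = {ok}")

# Hypothetical standardised loadings, used only to show how AVE is obtained.
example_ave = average_variance_extracted([0.80, 0.75, 0.85, 0.78])
print(f"example AVE = {example_ave:.2f}, above 0.50: {example_ave > 0.50}")
```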

Construct validity was also assessed by running multiple linear regressions, with work overload and both types of illegitimate tasks as predictors, and with the two indicators of occupational wellbeing (namely, job burnout and job satisfaction) and work performance as outcomes (Table 6).

Table 6 Job demands, occupational well-being and work performance—results of multiple linear regressions.

Unnecessary tasks, unreasonable tasks, and work overload explained 29% of the variance in job burnout. Additionally, both unnecessary and unreasonable tasks, but not work overload, explained 14% of the variance in job satisfaction. Therefore, as expected, both types of illegitimate tasks explain more variance in job burnout, which is a negative indicator of wellbeing, compared with job satisfaction, which is a positive indicator of wellbeing. Moreover, unnecessary tasks, but not unreasonable tasks or work overload, explained 4% of the variance in work performance.

Discussion

Combining the classical test and item response theories, our results showed that BITS is not a unidimensional scale. It measures two conceptually different separate dimensions, comprising four items each. IRT analyses showed that all items on each factor on different levels were very highly informative (reliable). All levels of the unnecessary and unreasonable tasks were reliably captured by the BITS items, suggesting that the questionnaire has very good psychometric properties. Moreover, BITS had measurement invariance with respect to gender-specific discrimination and difficulty parameters. Additionally, the results from validity tests revealed that both dimensions of unnecessary and unreasonable tasks were distinct from other work overload and occupational wellbeing indicators.

The present study investigated for the first time the psychometric properties of an eight-item BITS, using IRT, to provide evidence that this questionnaire reliably measures illegitimate tasks among employees without producing biased data. In addition, a new language version was verified. Thus far, a German and a Spanish version of BITS have been verified using CFA13,14,15. Therefore, our study links advantages of the classical test and item response theories.

We used IRT to analyse the extracted two factors rather than the single scale of illegitimate tasks, as the latter did not meet the unidimensional criterion. Moreover, the results of parallel analysis and CFA, as well as the findings of past research on BITS2,4,13,14,15, showed that the construct of illegitimate tasks consists of two dimensions, namely, unnecessary and unreasonable tasks.

This study also provides evidence that Item 4 may be excluded from the original version of BITS. Indeed, using classical test theory, this item obtained weak validity, psychometrically and semantically. Conceptually, it refers to both subscales (i.e., unnecessary and unreasonable tasks). IRT analysis has shown Item 4 had moderate discrimination power16, and due to this, this item produced relatively lower and flatter information curves. Additionally, scarce previous research on BITS has resulted in researchers applying the eight-item BITS, although Item 4 was not always excluded25,44. Therefore, different content versions are implemented. Finally, some studies have been based on different response formats17,20,23,42. Due to this confusion in the literature, the existing findings cannot be accurately compared. Overall, we present new evidence for improving BITS, which replicates in great depth previous findings4,13.

There are no previous IRT findings on BITS to compare with the current study’s results. However, significantly, all eight items with very high discrimination power indicate that the items of the Polish version of BITS can adequately distinguish between people with different levels of illegitimate tasks. Moreover, in our study, Item 2 and Item 1 (for unnecessary tasks) and Item 7 (for unreasonable tasks), demonstrated the highest discrimination power (see Table 4 and also observe the dashed lines in Fig. 1). Thus, we may propose that in future studies requiring single items, e.g., in diary studies, researchers may use the three mentioned items (1, 2 and 7). However, we have found a diary study23 that used Item 3 and Item 6 as representative of the unnecessary and unreasonable tasks, respectively. No rationale was presented for using these items. We also suggest that a reduced scale, with only three items, can be useful in screening tests or surveys comprising a multitude of scales. However, fully established evidence requires further IRT analysis among other language versions.

Convergent and discriminant analyses revealed that the Polish version of BITS is valid for both its factors. Furthermore, BITS is a reliable instrument, as earlier studies have found14,15,24. Our study adds to prior research by presenting reliability on the whole scale range, using IRT (see Figs. 1 and 2). The results of our study, like previous studies, confirmed that unnecessary and unreasonable tasks are separate constructs in terms of stressful job demands. Similar to existing studies12,24,26, illegitimate tasks were moderately correlated with work overload and jointly predicted occupational wellbeing. However, both dimensions of illegitimate tasks in our sample contributed to occupational wellbeing over and above work overload. More importantly, illegitimate tasks were related to job burnout to a greater extent than to job satisfaction. This is consistent with the assumption of Job Demands-Resources theory60 that excessive social demands harm people’s health more than they decrease their motivation.

Additionally, we have observed that unnecessary tasks occurred more often than unreasonable tasks among Polish employees. The same was noted for Swedish employees30. Moreover, our findings demonstrated that only the unnecessary tasks were negatively related to employees’ self-rated work performance, although to a lesser extent than they affected their wellbeing. Unnecessary tasks go hand-in-hand with ‘red tape’, are more often assigned at random17 and may create unproductive organisational behaviour and diminish the meaningfulness of work among teachers5, administrative staff11 and IT specialists33.

Our study has some shortcomings. The samples were not representative of the employed Polish population, and they were limited to four economic sectors, i.e., IT, administration, education and healthcare. Nonetheless, samples included respondents from each of Poland’s 16 voivodeships. We should emphasise that less-educated employees were underrepresented. In addition, self-reporting methods were used; in further research it is therefore advisable to obtain data from a variety of sources, as was done by Muntz and Dormann14. Above all, the cross-sectional design is an important limitation for estimating the predictor-outcome relationship and for controlling common method variance (CMV)59. However, Harman’s single-factor method showed that CMV was not a problem in this study (the first factor explained 35.1% of the total variance in sample 2, which constituted less than half of the covariance among measures).

Our findings may have practical implications. BITS has proven to be a valid and reliable scale that can be used in managerial practice. Illegitimate tasks are widespread and therefore remain a real problem in work management5,6,17. Thus, a valuable diagnostic instrument to identify illegitimate tasks is needed. Our findings provided new evidence that BITS is an exceptional scale. In addition, based on the IRT results, we may suggest that a few items of BITS could be useful during screening examinations and diary surveys to control the rationale of work tasks and, above all, in day-to-day interviews (e.g., during staff interviews), to help managers focus on a qualitative understanding of illegitimate tasks.

We conclude that the Polish version of BITS, completed by employees from different Polish organisations, is best measured by its two dimensions, namely, unnecessary and unreasonable tasks, which include four items each. The validity and reliability results of BITS demonstrate that the scale is outstanding and worth applying in scientific research and managerial practice.