We read with interest the paper by Ozaki et al. published in Hypertension Research in January 2017.1 The authors aimed to examine the diagnostic performance of the thrombin-cleaved osteopontin N-terminal fragment (trOPN-N), osteopontin (OPN), matrix metalloproteinase (MMP)-9, S100B, D-dimer and C-reactive protein for the atherothrombotic subtype of ischemic stroke. The results showed that trOPN-N>5.47 pmol l−1 was associated with atherothrombotic risk (odds ratio (OR): 11.7; 95% confidence interval (CI): 2.10–64.87; P=0.005).1

The results are interesting; however, the estimated effect of trOPN-N>5.47 pmol l−1 on atherothrombotic risk is biased owing to sparse data. When data are sparse, the OR tends to be inflated away from the null and its CI tends to be too wide.2, 3, 4 Methods such as penalization or Bayesian analysis have been suggested to remove or limit sparse data bias.2, 3 The authors reported the sensitivity, specificity and c-statistic of trOPN-N>5.47 pmol l−1 for atherothrombotic stroke to be 0.54, 0.91 and 0.72, respectively. Using this information, we reconstructed the 2 × 2 table of trOPN-N by atherothrombotic stroke (Table 1) and then used penalized estimation to correct the sparse data bias in the effect of trOPN-N>5.47 pmol l−1 on atherothrombotic risk. The corrected OR (95% CI) was 7.73 (2.21, 27.00).
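
To illustrate how penalization shrinks a sparse-data OR, the following minimal sketch compares the ordinary estimate with a penalized one for a single 2 × 2 table. It rests on several assumptions. The cell counts are hypothetical and are not the counts in Table 1. The penalty shown is a Firth-type (Jeffreys prior) penalty, which for one binary exposure amounts to adding 0.5 to each cell; this may differ from the penalized estimation used to obtain the corrected OR above. The function name `odds_ratio_wald` is ours, and Wald intervals are used for simplicity, whereas profile-likelihood intervals are usually preferred with penalized fits.

```python
import math


def odds_ratio_wald(a, b, c, d, add=0.0, z=1.959964):
    """Return (OR, lower, upper) with a Wald 95% CI for a 2 x 2 table.

    a: marker high, case        b: marker high, non-case
    c: marker low,  case        d: marker low,  non-case
    `add` is a constant added to every cell; add=0.5 reproduces the
    Firth-type (Jeffreys prior) penalized estimate for one binary exposure.
    """
    a, b, c, d = (x + add for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical sparse counts (assumption: NOT the data in Table 1).
cells = dict(a=6, b=4, c=5, d=45)

ordinary = odds_ratio_wald(**cells)            # ordinary (unpenalized) estimate
penalized = odds_ratio_wald(**cells, add=0.5)  # Firth-type penalized estimate

for label, (or_, lo, hi) in (("ordinary", ordinary), ("penalized", penalized)):
    print(f"{label:>9}: OR = {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

With sparse cells such as these, the penalized OR is pulled toward the null and its interval is narrower than the ordinary one, which is the direction of correction described above.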

Table 1 Estimated OR from ordinary logistic regression and from penalized estimation