Table 2. Similarity between the human-transcribed reference standard and ASR-transcribed sentences.

From: Assessing the accuracy of automatic speech recognition for psychotherapy

                   Word overlap                          Semantic similarity
Group         n    Error rate, %  Shapiro–Wilk  p value  Semantic distance, pts  Shapiro–Wilk  p value
Aggregate
  Total       100  25% ± 12%      0.93          <0.001   1.20 ± 0.31             0.97          0.03
Speaker
  Patient     100  25% ± 12%      0.86          <0.001   1.19 ± 0.33             0.94          <0.001
  Therapist   100  26% ± 11%      0.88          <0.001   1.20 ± 0.29             0.99          0.57
Patient gender
  Male        13   24% ± 9%       0.95          0.55     1.17 ± 0.30             0.95          0.55
  Female      87   25% ± 13%      0.84          <0.001   1.19 ± 0.33             0.94          <0.001
  1. Plus/minus values denote standard deviation. Lower error rate is better. Lower semantic distance is better. Shapiro–Wilk tests were conducted to test the normality assumption (Supplementary Fig. 2). Low p values indicate the data are not normally distributed.
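
As a rough illustration of how an error-rate column and its plus/minus summary could be produced, the following Python sketch computes a word-level error rate per sentence pair and then reports the mean and sample standard deviation. It assumes the error rate is the conventional word error rate (word-level edit distance divided by the number of reference words), which the table does not state explicitly. The function names `word_error_rate` and `summarize` and the example sentences are hypothetical and are not taken from the study. The Shapiro–Wilk step is not shown.

```python
from statistics import mean, stdev


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: this is the conventional word error rate; the table
    itself only labels the column "Error Rate".
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def summarize(rates):
    """Return (mean, sample standard deviation), both as percentages."""
    return 100 * mean(rates), 100 * stdev(rates)


if __name__ == "__main__":
    # Hypothetical (reference, ASR output) sentence pairs, for illustration only.
    pairs = [
        ("i feel tired today", "i feel tired today"),
        ("i have been feeling down", "i have been feeling done"),
        ("can you tell me more", "can you tell more"),
    ]
    rates = [word_error_rate(ref, hyp) for ref, hyp in pairs]
    avg, sd = summarize(rates)
    print(f"error rate: {avg:.0f}% ± {sd:.0f}%")
```

Lowercasing and whitespace splitting stand in for whatever text normalization the study applied. Punctuation handling in particular would change the per-sentence rates.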