Evaluation of JATSdecoder as an automated text extraction tool for statistical results in scientific reports

The extraction of statistical results from scientific reports is beneficial for checking studies for plausibility and reliability. The R package JATSdecoder supports the application of text mining approaches to scientific reports. Its function get.stats() extracts all reported statistical results from text and recomputes p values for most standard test results. The output can be reduced to results with checkable or computable p values only. In this article, get.stats()'s ability to extract, recompute and check statistical results is compared to that of statcheck, an already established tool. A manually coded data set, containing the number of statistically significant results in 49 articles, serves as an initial indicator for get.stats()'s and statcheck's differing detection rates for statistical results. In addition, 13,531 PDF files from 10 major psychological journals, 18,744 XML documents from Frontiers in Psychology and 23,730 articles related to psychological research and published by PLoS One are scanned for statistical results with both algorithms. get.stats() almost replicates the manually extracted number of significant results in 49 PDF articles. get.stats() outperforms the statcheck functions in identifying statistical results in every included journal and input format. Furthermore, the raw results extracted by get.stats() increase statcheck's detection rate. JATSdecoder's function get.stats() is a highly general and reliable tool to extract statistical results from text. It copes with a wide range of textual representations of statistical standard results and recomputes p values for two- and one-sided tests. It facilitates manual and automated checks on consistency and completeness of the reported results within a manuscript.


Terminology for distinct representations of statistical results reported in text. Generally, any letter or letter-number combination pointing to a numeric value with an operator (<, >, =, ≤, ≥) is here considered to be a potential statistical result. A statistical result can be a descriptive measure as well as a test result. Statistical test results mostly consist of a varying set of results (test statistic, degree/s of freedom, an effect measure, p value, confidence interval and/or a Bayes Factor). There are many widely used statistical tests. Results that contain a Z-, t-, F-, χ²-, r-, H-, Q-, G²-, U-statistic or Bayes Factor and/or a measure of effect (β, Cohen's d, η², OR, RR, R²) are defined as statistical standard results here.
Although there are guidelines about how to report statistical results (e.g. APA style), results are not consistently reported in this standardized manner (e.g. p value only). It makes sense to differentiate between reporting practices of test results in terms of their completeness and post-processability. A statistical test result that enables a recomputation of an also reported p value is defined as checkable here (e.g.: 't(89) = 1.96, p = .05'). Test results that enable a computation of a non-reported p value (e.g.: 't(89) = 1.96') will be called computable. A checkable result is always computable. The third set of test results is reported in a manner in which no recomputation of a reported or unreported p value is possible (e.g.: 'r = .12, p < .05', or 't > 2'). These results will be called uncomputable here.
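The three categories can be illustrated with a minimal R sketch. The helper classify_result() and its two regular expressions are hypothetical and written only for this illustration; they cover just t-, F- and r-statistics with degrees of freedom in round brackets and are far narrower than what get.stats() handles.

```r
# Hypothetical helper (not part of JATSdecoder): label a reported result
# with the most specific category defined in the text.
classify_result <- function(x) {
  # test statistic with degrees of freedom and an exact value,
  # e.g. "t(89) = 1.96" or "F(2, 40) = 3.1"
  stat_pattern <- "\\b[tFr]\\s*\\(\\s*[0-9.]+(\\s*,\\s*[0-9.]+)?\\s*\\)\\s*=\\s*-?[0-9]*\\.?[0-9]+"
  # reported p value with any operator, e.g. "p = .05" or "p < .001"
  p_pattern <- "\\bp\\s*[<>=]\\s*[0-9]*\\.?[0-9]+"
  has_stat <- grepl(stat_pattern, x, perl = TRUE)
  has_p <- grepl(p_pattern, x, perl = TRUE)
  ifelse(has_stat & has_p, "checkable",
         ifelse(has_stat, "computable", "uncomputable"))
}

examples <- c("t(89) = 1.96, p = .05", "t(89) = 1.96", "r = .12, p < .05", "t > 2")
labels <- classify_result(examples)
print(data.frame(result = examples, category = labels))
```

Because every checkable result is also computable, the sketch reports only the most specific label for each example.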
The R Package JATSdecoder. The R package JATSdecoder 16 supports the application of text mining approaches to scientific reports by processing XML documents that are structured with the Journal Archiving Tag System NISO-JATS 17 . The NISO-JATS is an HTML tag standard to store scientific articles without any layout parameters. Graphical content is hyper referenced.
JATSdecoder's functions make use of simple and sophisticated text extraction and manipulating algorithms that can cope with a wide range of textual and technical representations of content in NISO-JATS coded documents. The built-in function JATSdecoder() extracts a set of metadata (title, author, publishing dates etc.), the abstract, sectioned text and reference list. The structured output is very useful for individual searches and extraction procedures, as it facilitates these tasks on individually defined text parts (e.g. section titles, method section, reference list) and metadata.
JATSdecoder's function study.character() performs multiple text selection and manipulation tasks on the list created by JATSdecoder() and extracts key study characteristics like the number of reported studies, the statistical methods, software and correction procedures for multiple testing used. Its function get.stats() outputs all detected statistical results including descriptive measures (mean, sd, CI, Cronbach's alpha) as a vector which is then further processed. Detected Z-, t-, F-, χ²-, r-, H-, Q-, G²-, U-statistics or Bayes Factors and corresponding effect measures R², β, OR, d and/or η² are formatted into a data frame with numerical values and operators stored in separate columns.
To increase get.stats()'s detection rate for computable and checkable results, users can activate its arguments 'T2t' and/or 'R2r'. Statistics denoted with capital letter T or R respectively will then be treated as t- or r-values, which may not be appropriate. Activating its argument 'estimateZ' makes get.stats() estimate Z-statistics for beta- and d-values reported with standard error but no test statistic. If possible, a recomputed p value for an undirected null hypothesis is added. If desired, p values for directed tests can also be output if a computation is possible (only t-, Z-, r-values). The resulting data frame can be reduced to computable results only (recalculation of p value is possible, e.g. 'r(18) = .12'), checkable results (recomputable result with p value), or output with all detected standard results (e.g.: no p value check possible: 'r = .12, p = .61' or p value only).
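As a rough illustration of what an estimated Z-statistic could look like, the following R sketch assumes a Wald-type ratio of the estimate to its standard error. This formula and the function names estimate_z() and two_sided_p() are assumptions made for the example; the text does not specify how 'estimateZ' is implemented.

```r
# Assumption: Z is estimated as estimate / standard error (Wald-type ratio).
estimate_z <- function(estimate, se) estimate / se

# Two-sided p value for an undirected null hypothesis
two_sided_p <- function(z) 2 * pnorm(-abs(z))

# Example: beta = 0.30 reported with SE = 0.12 but without a test statistic
z <- estimate_z(0.30, 0.12)
p <- two_sided_p(z)
print(round(c(Z = z, p = p), 4))
```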
Deviations in reported and recomputed p values may be multicausal (directed test, rounding, typo, extraction or compilation error). Therefore, a check for completeness and plausibility of the results is not done automatically. Users checking a manuscript should always manually countercheck the extracted results and inconsistencies.
A p value that was expected to be computed but was not, a p value that was computed but not reported, or a result that is missing entirely from the output of detected standard results may indicate an incompletely reported result within the text. Warning messages are returned if reported p-, r- or R²-values are outside their valid range.
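A range check of this kind can be sketched in a few lines of R. The function in_valid_range(), its labels and its warning text are hypothetical; the assumed valid ranges are 0 to 1 for p and R² and -1 to 1 for r.

```r
# Hypothetical range check; labels, ranges and message are assumptions.
in_valid_range <- function(label, value) {
  ok <- switch(label,
    p  = value >= 0 && value <= 1,
    r  = value >= -1 && value <= 1,
    R2 = value >= 0 && value <= 1,
    TRUE)  # other statistics are not range-checked in this sketch
  if (!ok) {
    warning(sprintf("%s = %s is outside its valid range", label, format(value)))
  }
  ok
}

in_valid_range("p", 0.03)                  # inside the valid range
suppressWarnings(in_valid_range("r", 1.2)) # outside, would raise a warning
```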
Statistical results reported in tables are explicitly not captured by get.stats(), as tables cannot be compiled reliably. Here statcheck differs from study.character(): it always analyses the whole textual content of an HTML file or a PDF file converted to text, and it captures test results from tables if they are reported in a full textual manner. Reporting results in named columns, which statcheck cannot capture, is much more frequent in practice.
To extract the statistical methods mentioned in an article, study.character() tries to split the NISO-JATS document into four sections (introduction, method, results, discussion). Its function get.method() performs a heuristic-driven feature extraction process to output the statistical methods listed in the method and results sections. It finds the specification of a method that contains the descriptive term in front of one of a set of search terms, which most commonly used statistical procedures have in common (e.g.: test, regression, anova, method, theorem, interval, algorithm, etc.). Users can enlarge the result space by defining additional search words in its argument 'add'. The current heuristic enables an extraction of new, still unknown statistical procedures, if they are named with one of the already specified or user-adjustable search terms at the end (e.g. 'JATSdecoder algorithm'). Methods with a specifier behind the search term (e.g. 'test on homogeneity of variances') cannot be identified.
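The idea of a descriptive term in front of a search term can be sketched as follows. This is a simplified illustration and not get.method()'s actual implementation: extract_methods() is a hypothetical function, the term list is only the subset named in the text, and exactly one descriptive word is captured.

```r
# Subset of search terms named in the text
search_terms <- c("test", "regression", "anova", "method",
                  "theorem", "interval", "algorithm")

# Hypothetical sketch: one descriptive word directly in front of a search term.
# 'add' mimics the idea of user-defined additional search words.
extract_methods <- function(text, add = NULL) {
  terms <- paste(c(search_terms, add), collapse = "|")
  pattern <- paste0("\\b[A-Za-z'-]+ (", terms, ")\\b")
  unlist(regmatches(text, gregexpr(pattern, text, ignore.case = TRUE, perl = TRUE)))
}

txt <- "Data were analysed with a logistic regression and a Mann-Whitney test."
found <- extract_methods(txt)
print(found)

# A user-defined search word enlarges the result space
found_added <- extract_methods("We applied the Bonferroni procedure.", add = "procedure")
print(found_added)
```

As in the description above, a specifier that follows the search term (e.g. 'test on homogeneity of variances') is not captured by this pattern.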
To identify the total number of studies reported in a document, the software and correction method used, fine-tuned dictionary searches are performed on preselected text parts and phrases. Software identification can be enhanced by adding further software search patterns.
Despite its wide extraction capabilities, the focus here is solely on study.character()'s function get.stats() and its ability to extract and post-process statistical results out of NISO-JATS formatted research articles. A simple web interface to extract and check statistical results within single articles in different formats (PDF, XML, HTML, DOCX) is hosted at: www.get-stats.app.
Several conversion tools that transform PDF documents into a post processable text object exist. One sophisticated converter is the Content ExtRactor and MINEr (CERMINE) 18 which extracts metadata, full text and parsed references from a PDF file and makes it storable in different formats (plain text, NISO-JATS XML, etc.). The implementations of most steps are based on supervised and unsupervised machine learning techniques, which simplifies the procedure of adapting the system to new document layouts and styles 18 .
Language and typesetting features allow very individual ways of expressing one and the same bit of information. This is especially relevant when processing text with many formulas, indices, special characters (operators, Greek letters, hyphens, separators, brackets, etc.) and synonymously used characters (Greek/Latin small letter beta: β, German sharp s: ß, HTML beta: '&beta;'). In electronic documents, characters can be represented by different character encodings (UTF-8, ASCII, Unicode, hexadecimal, HTML, etc., or even pictures), which generally makes each extraction and compilation task on numerical results and other content more complicated.
When compiling PDFs with CERMINE, a wide range of compilation errors can occur (e.g.: missed operators, handling of subscripts, undetected Greek and special letters). JATSdecoder's function letter.convert() unifies many letter representations and corrects most PDF and CERMINE specific conversion errors. This enables JATSdecoder to also reliably process PDF files that were converted to NISO-JATS coded XML files by CERMINE.
JATSdecoder's algorithms have been developed iteratively based on the PubMedCentral article collection and about 10,000 PDF files from different journals that were converted with CERMINE. get.stats() is designed for numbers that are reported with a dot as a decimal separator.
How get.stats() works. A two-step process is performed to extract the reported results within a text and recalculate the reported p value with get.stats(). First, the input text is split into sentences and square brackets are converted into round brackets. Only those sentences are selected that contain at least one letter and an operator followed by a number. To extract the reported test statistic, degrees of freedom, corresponding effect measure and p value, they are split at a set of words (e.g. 'and', 'or', 'were', 'of', etc.) and at words followed by a comma. If multiple test results are identified in a text snippet (e.g. more than one t- or p value), it is further split up, assuming a test statistic is reported in front of its p value. Text that appears in front of and behind the results is removed with regular expressions (e.g. the text behind the last reported operator pointing to a number). The first result is a vector with unified representations of sticked results, starting with any letter or letter-number combination with, if present, degrees of freedom in round brackets, pointing to a number with an operator. Several heuristics to unify the representation of overly big and small numbers are applied. Before extracting the actual value of each standard result and the reported p value, regular expressions are used to remove labels of test statistics. Every targeted standard result is extracted from the sticked results with an individual heuristic that copes with a variety of reporting styles. The recognized value of the test statistic, its operator, the degrees of freedom and p value of each sticked result is returned as a cell in a matrix, which represents the second output. Each type of result is stored in a separate column, which greatly facilitates further post-processing and identification tasks.
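The sentence selection in the first step can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's code: the example sentences are invented, only the ASCII operators <, > and = are considered, and the subsequent splitting and cleaning steps are omitted.

```r
sentences <- c(
  "Participants were recruited online.",
  "The effect was significant, t[28] = 2.20, p < .05.",
  "Mean age was 24 years."
)

# Square brackets are converted into round brackets
sentences <- chartr("[]", "()", sentences)

# Keep sentences with at least one letter and an operator followed by a number
has_letter <- grepl("[A-Za-z]", sentences)
has_operator_number <- grepl("[<>=]\\s*-?[0-9]*\\.?[0-9]+", sentences, perl = TRUE)
candidates <- sentences[has_letter & has_operator_number]
print(candidates)
```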
In standard mode, the recalculation of p values is performed based on the result matrix using basic R functions for distribution functions ('2 * (1 − pnorm(Z))', '1 − pchisq(chi2, df)', etc.). Users can activate an additional recomputation for one-sided t- and Z-tests, as well as r-values that are reported with degrees of freedom.
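A minimal sketch of such a recomputation with base R distribution functions is given below. The wrapper recompute_p() and its argument names are hypothetical. The conversion of an r-value with degrees of freedom into a t-value, t = r * sqrt(df / (1 - r^2)), is the standard textbook formula and is assumed here rather than taken from the package.

```r
# Hypothetical wrapper around base R distribution functions.
# Returns a two-sided p value for Z, t and r, and an upper-tail p value
# for chi-square and F statistics.
recompute_p <- function(stat, value, df1 = NA, df2 = NA) {
  switch(stat,
    Z    = 2 * pnorm(-abs(value)),
    t    = 2 * pt(-abs(value), df1),
    chi2 = pchisq(value, df1, lower.tail = FALSE),
    F    = pf(value, df1, df2, lower.tail = FALSE),
    r    = {
      # assumed standard conversion of r with df = n - 2 into a t-value
      tval <- value * sqrt(df1 / (1 - value^2))
      2 * pt(-abs(tval), df1)
    },
    NA_real_)  # unknown statistic: no p value can be computed
}

p_t <- recompute_p("t", 1.96, 89)    # 't(89) = 1.96'
p_r <- recompute_p("r", 0.12, 18)    # 'r(18) = .12'
p_z <- recompute_p("Z", 1.96)
print(round(c(t = p_t, r = p_r, Z = p_z), 3))

# One-sided p value for a directed t-test (upper tail)
p_t_one_sided <- pt(1.96, 89, lower.tail = FALSE)
```

For a symmetric distribution the one-sided p value in the direction of the observed effect is half the two-sided value, which is why a directed test is one possible cause of a deviation between a reported and a recomputed p value.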
The R package statcheck. The R package statcheck 14 performs an automated detection of statistical test results reported in APA style. It is capable of extracting adequately reported Z-, t-, F-, r-, Q- and χ²-statistics with adequately reported degrees of freedom and a p value to check the result for plausibility (see: 19 ). statcheck recomputes the corresponding p value and flags inconsistencies with the reported p value. The built-in functions work on plain text (statcheck()), HTML (checkHTML()) and PDF files (checkPDF()). Nuijten et al. 20 validated statcheck on the manually coded analysis of errors in all reports of statistically significant t-, F- and χ²-test results in 48 articles, published by the Journal of Personality and Social Psychology and Journal of Experimental Psychology: Learning, Memory, and Cognition. statcheck extracted 648 out of 1,120 results (57.9%) in the comparative dataset (one retracted study with 28 significant results, which was part of the original analysis, was excluded by the statcheck authors).
Screening 39,717 articles published by eight journals with statcheck Nuijten et al. 15
As noted by Schmidt 22 , statcheck's identification rate for statistical results is rather low. This is in part due to its inability to handle statistical results that are not reported exactly according to APA style, that are reported with degrees of freedom (or a label) in subscript, or that contain semicolons instead of commas, square brackets instead of parentheses, or effect sizes in between the test statistic and the p value 23 .
As there are growing calls not to rely too much on the standard p value threshold of 'p = .05' but rather to change it to 'p < 0.005' 24 , to report effect sizes and confidence intervals instead 25 , or even to turn away from frequentist methods entirely 26 , statcheck will perform increasingly poorly as a detector of statistical results in text the more these demands are implemented in practice.
As statcheck falsely flags inconsistencies in p values, when appropriate correction methods have been applied (p value correction for multiple testing instead of α-error adjustment) and therefore might encourage users not to use the appropriate methods, Schmidt 22 concludes that statcheck is an unsuitable software to detect errors in statistical results and should rather not be used.
Distinguishing features of get.stats() and statcheck. Compared to statcheck, which looks for a narrow set of exact pattern matches in a string, get.stats() deals with almost any result reported in text. In contrast to statcheck, get.stats() can handle commas as well as semicolons used as separators.
Before extracting the actual value of every detected standard result, get.stats() selects, splits and cleans up all sentences presenting statistical results. get.stats() extracts and post-processes many standard results that are labeled or indexed. It performs several transformations of the textual representation of numbers in text. Fractions, as well as results reported with an 'e^number' or a percent sign, are compiled to decimal numbers, and commas in large numbers (≥ 1000) are removed. The output should therefore not be treated as an exact representation of the reported results.
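Some of these number transformations can be sketched as follows. The function normalize_number() is hypothetical and covers only fractions, percent signs and thousands separators; it assumes that a percentage is converted to a proportion and leaves the 'e^number' notation aside, because the text does not specify its exact handling.

```r
# Hypothetical sketch of number normalization for a single character string.
normalize_number <- function(x) {
  # remove commas used as thousands separators: "1,568,555" -> "1568555"
  x <- gsub("(?<=[0-9]),(?=[0-9]{3}\\b)", "", x, perl = TRUE)
  # assumption: a percentage is converted to a proportion, "45%" -> 0.45
  if (grepl("%$", x)) return(as.numeric(sub("%$", "", x)) / 100)
  # fraction: "3/4" -> 0.75
  if (grepl("/", x, fixed = TRUE)) {
    parts <- as.numeric(strsplit(x, "/", fixed = TRUE)[[1]])
    return(parts[1] / parts[2])
  }
  as.numeric(x)
}

raw <- c("3/4", "45%", "1,568,555", ".05")
normalized <- vapply(raw, normalize_number, numeric(1), USE.NAMES = FALSE)
print(normalized)
```

The example also shows why the output is not an exact representation of the reported text: '3/4' and '.75' become indistinguishable after normalization.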
Whereas statcheck's functions always analyze the full document or text entered, study.character()'s argument 'text.mode' enables an extraction with get.stats() on specific text parts (1: full text and abstract, 2: method and result section/s, 3: result section/s only).
statcheck treats non-significant p values reported with 'ns' as checkable results, whereas get.stats() treats such results as computable, if the reported result allows a recomputation of the p value (e.g.: ' t(18) = 1.1, ns'). Table 1 lists some potential results of a vector with identified sticked results by get.stats(x,output = 'stats').
The selected examples demonstrate how get.stats() and statcheck() differ, in terms of their ability to detect, extract and check statistical results reported in text.
In most of the listed examples, get.stats() extracts all contained standard results defined earlier, whereas statcheck() fails to detect many of the results at all and extracts some results inadequately. Any squared statistic, as well as any statistic denoted with one of 18 upper- or lowercase letters (except: B, F, N, R, T, Q, W, Z) that is reported with its degrees of freedom in brackets, is interpreted as a χ²-test result by statcheck. 'rp'-, 'sr'-, 'pr'- and 'LR'-statistics are interpreted as correlations by statcheck(), which, in part, may be correct. get.stats() does not classify these letter combinations as standard results. Results reported as intervals may cause missing or erroneous detections by get.stats(), as the last example in Table 1 demonstrates.

Method
To evaluate and compare the JATSdecoder and statcheck algorithms in terms of their practical precision and reliability in extracting statistical results in prespecified text parts, two analyses are performed with different input formats.
First, the total number of manually extracted statistically significant t-, F- and χ²-statistics in the method and result sections of 49 articles by Wicherts et al. 27 is compared to the number of computable, statistically significant t-, F- and χ²-results extracted from the method and result sections with study.character(x,text.mode = 2) and statcheck's algorithms. The differences between the manually coded data and study.character()'s detections are described case by case. All operators that were not or incorrectly converted and are replaced with '<=>' by letter.convert() are converted to '=' before being processed with statcheck(). Labels and/or indices of reported test statistics are removed with simple regular expressions. As no other α-error level was identified in the 49 studies, all results that lead to a recomputed p value ≤ .05 or that are reported with 'p ≤ .05' are selected to compare the number of extracted significant results. Next, the same article collection is analyzed by each algorithm with no limitations on p values, type of statistics or part of the text. The distribution of the number of detected results is displayed in box plots for each procedure and input format.
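The selection rule for significant results can be written down as a short R sketch. The data frame, its column names and its values are invented for illustration and do not reproduce the output format of get.stats() or study.character().

```r
# Invented example data; column names are assumptions for this sketch.
results <- data.frame(
  statistic    = c("t", "F", "chi2", "t"),
  p_recomputed = c(0.012, 0.210, NA, 0.049),
  p_operator   = c("=", "=", "<", "<="),
  p_reported   = c(0.01, 0.21, 0.05, 0.05),
  stringsAsFactors = FALSE
)

# Significant if the recomputed p value is <= .05, or if the result is
# reported with a p value of at most .05
significant <- with(results,
  (!is.na(p_recomputed) & p_recomputed <= 0.05) |
  (p_operator %in% c("=", "<", "<=") & p_reported <= 0.05))

n_significant <- sum(significant)
print(n_significant)
```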
The second analysis demonstrates that get.stats()'s high performance and detection rate for statistical results also holds for much bigger article collections. An unrestricted search for statistical standard results is performed.
Table 1. Some examples of statistical results and the extracted standard results by get.stats() with its argument 'T2t = TRUE' and statcheck(). Representations are presented in an easily readable format instead of the resulting data tables extracted. Empty cells represent no detections.
As no manually coded data exists for this big data set with varying input formats, the number of identified standard results by get.stats() ('all', 'computable' and 'checkable') is compared to that detected by statcheck's functions with global descriptive measures. The total and relative number of articles with detectable results and the total sum of detected results is presented for every journal and algorithm setting, as well as some descriptive measures for articles with identifiable results (mean, sd, median, IQR, .99 quantile, maximum, processing time).
All converted PDF documents are passed to get.stats() and checkHTML() as they contain HTML standard coding. The native PDF files are processed with checkPDF() and the preprocessed vector with sticked results extracted by get.stats(x,output="stats") is passed to statcheck(). Non-significant p values reported with 'ns' are excluded before counting statcheck's detections to enable a comparison of the extracted number of checkable results. As the PMC bulk download contains native XML files only, no processing with checkPDF() is performed for these studies. The open source software CERMINE 18 was used to convert each PDF file into a NISO-JATS coded XML, before being processed with JATSdecoder's function study.character() or get.stats().
All extractions and analyses were performed with an AMD EPYC 7452 32-core processor running Linux Ubuntu 20.04.1 LTS and the open source software R 3.6 28 . To enable multicore processing, the R package future.apply was used 29 .

Evaluation of get.stats() detection rate with manually coded data and statcheck's functions.
First, the total number of significant t-, F- and χ²-test results that was extracted manually by Wicherts et al. 27 is compared to the number of significant results extracted by study.character() and statcheck's functions. Figure 1 displays the distribution of identified significant t-, F- and χ²-statistics per paper for the applied extraction method and input format.
study.character() identifies 1,095 significant results in the method and result sections compared to 1,148 results extracted by Wicherts et al. 27 . checkHTML() only detects 129 significant t-, F- and χ²-test results within the full text of the browser-generated HTML documents. checkPDF() could not extract a single statistic out of the same raw material that was converted with CERMINE to become processable with JATSdecoder. The extracted sticked
Here study.character() extracts 53 fewer significant results than were found in the manual analysis. As 146 significant results are reported in tables and not extracted by study.character(), 93 additional significant results are identified. Table 2 summarizes each of the 35 cases with deviations from Wicherts et al. 27 . There are several reasons for a higher detection rate by study.character(). 13 checkable test results that are reported for several tests (e.g. 'all ts(18)>3, ps<.05') are extracted by study.character(). Four results that are incorrectly reported with 'p > .05', although they are significant, were not included by Wicherts et al. 27 but found with study.character(). As none of the 49 CERMINE-converted PDF files contains readable operators, letter.convert() inserts '<=>' into these empty or badly captured text parts. An insignificant result reported with 'p > .05' is therefore indistinguishable from a significant result reported with 'p < .05'. This leads to 29 false positive inclusions in total. 24 results that are reported in footnotes and identified by study.character() seem not to be included in the original analysis. One result reported in the description of an experiment seems to be included in the original analysis but is not identified by study.character(), as only method and results sections are selected. In three articles, study.character() detects a total of nine results in the method sections that seem not to be included in the manual extraction.
Three goodness-of-fit χ²-statistics are excluded by Wicherts et al. 27 and included by study.character(). Nine significant results are missed by study.character() because some text parts or section titles got lost during PDF conversion. Figure 2 displays the distribution of all detected statistical standard results per paper for the different extraction methods and input formats, with no restrictions to significant results, type or text parts. No manually coded data exists for this analysis. In total, get.stats() identifies 2,134 statistical standard results in the abstracts and full text parts. 1,626 of these results are reported in a manner that enables a recomputation of p values. 1,443 results are checkable. The preprocessed and further index-removed vector extracted by get.stats(x,output="stats") increases statcheck's detection rate from 355 to 965, or even 1,143 results respectively. No false positive inclusion of a checkable result by get.stats() was observed.

Analysis of a large article collection with varying publishers and input formats.
Next, the collection of all published PDF files from 10 major journals of psychology as well as all XML documents ever published by 2 open access journals is used to extend the evaluation of get.stats() to a bigger data set. The absolute and relative frequency of documents with extractable results per journal, algorithm setting and input format is listed in Table 3.
In 89% of all processed documents get.stats() extracted at least one statistical result (operator between letter-number combination and number). In 46% of all analyzed articles get.stats() detects at least one computable result and in 44% at least one checkable result (both with arguments 'T2t' and 'R2r' set to TRUE). Activating get.stats()'s argument 'estimateZ' has a small effect (+1%) on the total sum of identified documents with computable and checkable results.
In every journal and input format, all statcheck functions detect fewer documents with checkable results. In 38% of all articles statcheck() finds checkable results within the extracted sticked results by get.stats(), checkHTML() in 19% of all CERMXML/XML files and checkPDF() in 26% of all PDF files. All or most articles by four journals cannot be handled by statcheck's functions checkHTML() and checkPDF(), as the compiled PDF files contain incorrectly converted operators.
The amount of articles that contain computable and/or checkable results varies greatly between journals. Overall the journal Personality and Social Psychology Bulletin contains checkable results in 91% of the articles, compared to 34% of all articles distributed by Depression and Anxiety.
The preprocessed text vector that is returned by get.stats(x,output='stats') enhances statcheck()'s ability to detect documents with checkable results in every journal. Both format-specific statcheck functions checkHTML() and checkPDF() identify fewer documents in every journal. Table 4 lists the total number of extracted results, standard results, as well as computable and checkable results in each setting and gives descriptive measures for those articles that contain extractable results. In total, get.stats() extracts 1,568,555 sticked results and 981,529 statistical standard results, out of which 386,172 represent computable and 359,440 checkable results. Compared to the statcheck algorithms, the total sum of detected checkable results by study.character() is higher in every journal and input format. 12,249 computable results become checkable when activating get.stats()'s option to compute p values on estimated Z-values (from 347,191 to 359,440).
Within those articles that contain checkable results, the mean number of detected results is 14.2 with get.stats() and 13.5 with statcheck() on the same preprocessed result vector, 10.7 with checkHTML() but 15.2 with checkPDF(). Also, the median, interquartile range (IQR), .99 quantile and maximum of checkable results detected by get.stats() are higher than statcheck()'s measures when processing the same vector and relevantly higher to
No unexpected processing times occurred. As many preprocessing operations are performed, the extraction of the sticked results with get.stats(x,output='stats') takes 1.3 seconds on average per paper and processor. The mean processing time of this vector differs slightly between statcheck() (.6 sec.) and get.stats() (.5 sec. per document and processor). In total, both file-specific statcheck functions work a lot faster, as neither case-specific letter conversion nor uniformization is performed before extracting the results. Table 5 displays the factors by which get.stats() identifies more checkable results per journal. As no PDF files are analyzed for Frontiers in Psychology and PLoS One, these fields are left blank for checkPDF(). get.stats() outperforms statcheck() in detecting checkable results by a factor varying from 1.07 for Behavioral Neuroscience to 1.73 for Journal of Management when processing the same preprocessed vector of sticked results extracted with get.stats(x,output="stats"). This pattern holds for checkHTML() when processing CERMINE-converted PDFs (1.13 to 2.84) and checkPDF() processing the original PDF files. Three PDF article sets mostly contain nonstandard coded operators and can be processed neither in their native version by checkPDF() nor in their CERMINE-compiled version by checkHTML(). Compared to checkHTML(), get.stats() extracts 3.33 (Frontiers in Psychology) to 4.1 (PLoS One) times more checkable standard results within the native XML files, with most results coded in HTML style.

Conclusion
get.stats()'s high precision and flexibility in extracting statistical results from research papers in NISO-JATS formatted XML files have been demonstrated. It facilitates plausibility checks on many standard results reported in text, and can help scientists as well as editors to summarize and check a study regarding reporting style and
Table 3. Absolute and relative frequency of articles with extractable, computable or checkable statistics by journal, input format, additional settings and algorithm.
JATSdecoder's functions can handle most PDF and CERMINE specific conversion errors in statistical results, except in cases with non-compiled text parts (e.g. footnotes, listings, section titles). Incorrectly converted operators and some Greek letters are corrected, while completely missing operators are replaced with '<=>' for many statistical results. The vector of sticked results extracted by get.stats() converts CERMINE-converted PDF files that are unprocessable for checkPDF() into a format that is post-processable with statcheck().
The results of Nuijten et al. 20 could not be replicated with either input format. Compared to the original paper, checkPDF() does not detect a single checkable result in the PDF files, while checkHTML() detects just a small proportion in the browser-generated HTML files. Finally, statcheck() identifies more checkable results within the preprocessed output of get.stats() than were found by Nuijten et al. 20 . Therefore, get.stats()'s preprocessed output enhances any automated plausibility check with statcheck(), especially for those PDF files that compile with errors, which applies to full article collections of some journals.
In all cases, get.stats() outperforms all statcheck algorithms. Even compared to a manual extraction, its precision on extracting statistical results from text can be considered very high. In some rare cases, the compilation by CERMINE failed to cover all text parts, leading to some undetected results. However, this problem only needs to be considered when PDF conversion was applied.
Most deviations observed to the manually coded data by Wicherts et al. 27 are caused by their representation in tables, differing inclusion criteria and/or differing definitions of a checkable result. No false positive detections of checkable results by get.stats() were observed.
A non-negligible part of all reported results in the surveyed articles is presented in tables and can be neither extracted nor checked by get.stats() or statcheck. Converting tables in PDF files to text mostly produces spurious artifacts in the resulting output, as tables allow very individual layout and coding styles. statcheck detects results reported in tables if they are reported in a full textual manner in one cell of the table, which is a rather rare event. As only a very small portion of tabulated results can be extracted with statcheck up to now, it is sensible to restrict checking procedures to results reported within the main text only. Descriptive measures of the total number of reported results in text therefore tend to be negatively biased estimates of the actual number of reported results. Correlation matrices and regression tables often contain a large number of test results. For test results reported with asterisks instead of p values, a precise plausibility check is generally not possible.
As no algorithm can be perfect, false positive and negative detections may occur when get.stats() tags a reported result as a standard result. Many PDFs lose their special characters during conversion to NISO-JATS coded XML files, which may lead to false positives and negatives when a missing Greek letter other than χ is used but χ is imputed by letter.convert(). Results that are labeled in the same way as the standard results defined above but represent other measures will be treated as standard results. In particular, wrongly interpreted Z-values (e.g. in a coordinate: 'x = 1, y = 2, z = 3') will automatically lead to the computation of a p value and suggest that the result is computable. Special or anomalous labels of results and special letter uses that are not captured by get.stats() may lead to a non-detection as a checkable standard result.
JATSdecoder enables a wide range of possibilities for meta-analytical research and mirroring techniques. The reported degrees of freedom in some test statistics allow an estimation of the sample size on which a study is based. Another option is to analyze all statistics ever reported by an author, affiliation, subject and/or other subsets of metadata. A p-curve analysis of the reliably extracted results from one or many articles may help to identify questionable research practices performed by individuals or groups. Its ability to split an article into selectable sections and phrases enables sentence detection in specific text parts of a study (e.g. discussion/conclusion only). With a little additional text extraction effort, it is possible to detect all investigated variables or effects within a research topic.