Detecting anomalous citation groups in journal networks

The ever-increasing competitiveness in the academic publishing market incentivizes journal editors to pursue higher impact factors. This translates into journals becoming more selective, and, ultimately, into higher publication standards. However, the fixation on higher impact factors leads some journals to artificially boost impact factors through the coordinated effort of a “citation cartel” of journals. “Citation cartel” behavior has become increasingly common in recent years, with several instances being reported. Here, we propose an algorithm—named CIDRE—to detect anomalous groups of journals that exchange citations at excessively high rates when compared against a null model that accounts for scientific communities and journal size. CIDRE detects more than half of the journals suspended from Journal Citation Reports due to anomalous citation behavior in the year of suspension or in advance. Furthermore, CIDRE detects many new anomalous groups, where the impact factors of the member journals are lifted substantially higher by the citations from other member journals. We describe a number of such examples in detail and discuss the implications of our findings with regard to the current academic climate.

Detecting anomalous citation groups. In citation networks, a citation cartel is manifested as a group of journals that excessively cite papers published in other journals within the group. Although not all such groups are necessarily citation cartels, we aim to identify journal groups with excessive within-group citations. Specifically, we assume that an anomalous citation group is composed of donor journals and recipient journals. A donor journal provides excessive citations to papers published in recipient journals in the previous 2 years, i.e., the time window for the JIF. In cases where two journals exchange citations at excessively high rates, they simultaneously behave as both donors and recipients. Although donor journals have no apparent direct benefit in providing citations to recipient journals, we consider them members of the anomalous citation group because some previously identified instances contain journals giving excessive citations to particular journals, which often share publishers or editors [13][14][15].
We identify excessive citations between journals using a null model for citation networks. Specifically, we use the degree-corrected stochastic block model (dcSBM) 24,25 as the null model. The dcSBM generates randomized networks that preserve the number of citations between groups of journals (i.e., blocks), and the outgoing and incoming citations of each journal on average. We determine the blocks by fitting the dcSBM using a nonparametric Bayesian method 25 . Community detection methods for networks including the dcSBM have been shown to provide a reasonable partitioning of journal citation networks into research fields [19][20][21] . Therefore, the networks generated by the dcSBM are considered to be random networks that roughly preserve the patterns of citations within and across research fields.
CIDRE removes from the given network all the edges that are statistically compatible with the null model and then computes a donor score and a recipient score for all journals based on the residual edges in the network (see the Methods section). In the following, we refer to the weights of such residual edges as excessive citations. Consider a journal group, denoted by U, that contains journal i. Journal i's donor score, denoted by $x_d(i, U)$, is the fraction of excessive citations that journal i provides to the other journals in U. Journal i's recipient score, denoted by $x_r(i, U)$, is the fraction of excessive citations that i receives from other journals in U. CIDRE considers a journal to be a donor journal if $x_d(i, U)$ is larger than a prescribed threshold θ = 0.15 and a recipient journal if $x_r(i, U)$ is larger than θ (see the Discussion section for the choice of the θ value).
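As a minimal sketch of how these two scores could be computed, the Python snippet below takes the residual (excessive) edges as a dictionary and returns both scores for every journal in U. The passage above does not spell out the denominator of the "fraction"; the sketch assumes it is the journal's total outgoing citations (donor score) or total incoming citations (recipient score). The function and argument names are hypothetical.

```python
from collections import defaultdict


def donor_recipient_scores(excess, total_out, total_in, group):
    """Return {journal: (donor_score, recipient_score)} for journals in `group`.

    excess    : dict (i, j) -> weight of the residual (excessive) edge i -> j
    total_out : dict journal -> all citations it gives    (ASSUMED denominator)
    total_in  : dict journal -> all citations it receives (ASSUMED denominator)
    group     : set of journals, i.e., U
    """
    given = defaultdict(float)
    received = defaultdict(float)
    for (i, j), w in excess.items():
        # Only citations exchanged between two different members of U count.
        if i != j and i in group and j in group:
            given[i] += w
            received[j] += w
    return {
        i: (
            given[i] / total_out[i] if total_out.get(i) else 0.0,
            received[i] / total_in[i] if total_in.get(i) else 0.0,
        )
        for i in group
    }


# Toy example with made-up numbers.
excess = {("A", "B"): 30, ("B", "A"): 20, ("A", "C"): 5}
total_out = {"A": 100, "B": 200, "C": 50}
total_in = {"A": 100, "B": 100, "C": 100}
scores = donor_recipient_scores(excess, total_out, total_in, {"A", "B"})
```

With θ = 0.15, journal A in this toy example would count as both a donor and a recipient, whereas journal B would count as a recipient only.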
To find anomalous citation groups, CIDRE initializes U to be the set of all nodes in the network. Then, CIDRE removes from U the journals that are neither a donor nor a recipient and recomputes the donor and recipient scores for the journals remaining in U. CIDRE iterates the removal of nodes and the recomputation of scores until no further journal is removed (see the Methods section).
Overlap with the journal groups suspended from JCR. CIDRE detected 184 citation groups between 2010 and 2019 (Fig. 1a). A detected citation group consisted of four journals on average (Fig. 1b). Because no ground truth is available for evaluating the detected groups, we compare them with the journals suspended from JCR. Since 2007, JCR has suspended 227 journals due to excessive citations, of which 173 journals were suspended due to excessive self-citations, 55 journals due to excessive citations between two journals, and one journal due to both self-citations and pairwise citations 26. Although JCR does not disclose its precise algorithm, it has released some criteria for suspensions. These criteria include the fraction of citations that the recipient journal receives from the donor journal, akin to the recipient score, together with the number of years since the journal's first publication and the journal's ranking 18. JCR reported 46 pairs of donor and recipient journals for excessive pairwise citations. Some journal pairs suspended from JCR share a journal. We merge such overlapping journal pairs suspended in year t into one group, denoted by $U^{\mathrm{JCR}}_\ell$, and consider that $U^{\mathrm{JCR}}_\ell$ is identified by JCR in year t − 1 (i.e., 1 year prior to the suspension). There are 22 such groups, which we denote by J1, J2, . . . , J22.
We calculate the overlap between groups reported in JCR and CIDRE as $O = |U^{\mathrm{JCR}} \cap U^{\mathrm{CI}}| / |U^{\mathrm{CI}}|$, where $U^{\mathrm{CI}}$ is the set of journals in a group detected by CIDRE. If $U^{\mathrm{JCR}}$ and $U^{\mathrm{CI}}$ have O ≥ 0.5 and share at least two journals, we say that $U^{\mathrm{JCR}}$ is detected by CIDRE. CIDRE detects 12 of the 22 groups suspended from JCR at least once, of which 8 groups have O ≥ 0.8 (Fig. 2a). CIDRE detects 10 groups earlier than JCR reports them. Furthermore, CIDRE detects 7 groups for multiple years before the suspension by JCR but no group later than 1 year after the suspension, suggesting that they stopped malicious citation practices after the suspension had been lifted.
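The matching rule can be sketched in a few lines of Python. The sketch assumes that the overlap O is the share of the CIDRE group's journals that also belong to the JCR group; the function names and the toy journal labels are hypothetical.

```python
def overlap(u_jcr, u_ci):
    """Share of the CIDRE group that also belongs to the JCR group (assumed form of O)."""
    u_jcr, u_ci = set(u_jcr), set(u_ci)
    return len(u_jcr & u_ci) / len(u_ci) if u_ci else 0.0


def is_detected(u_jcr, u_ci, min_overlap=0.5, min_shared=2):
    """Matching rule from the text: O >= 0.5 and at least two shared journals."""
    shared = len(set(u_jcr) & set(u_ci))
    return shared >= min_shared and overlap(u_jcr, u_ci) >= min_overlap


# Toy example: a suspended pair versus two hypothetical CIDRE groups.
jcr_group = {"J_a", "J_b"}
tight_group = {"J_a", "J_b", "J_c"}                # O = 2/3, two shared journals
loose_group = {"J_a", "J_c", "J_d", "J_e", "J_f"}  # O = 1/5, one shared journal
```

Requiring at least two shared journals prevents a match that rests on a single journal, which could otherwise pass the O ≥ 0.5 condition when the CIDRE group is very small.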
Could the above suspended groups also be detected by standard community detection algorithms? To address this question, we consider three community detection algorithms, i.e., modularity maximization by the Leiden algorithm 27, Infomap 21, and the dcSBM 25. We apply the algorithms and evaluate whether or not the detected communities match the suspended groups under the same matching criteria used for testing CIDRE. These community detection algorithms find at least three times more groups than CIDRE does. However, none of the detected communities matches a suspended group; the overlap is small, with O ≤ 0.012 for all detected communities. One may argue that the groups suspended from JCR, which consist of fewer than five journals, are too small to be detected with these community detection algorithms. We have therefore run another experiment by restricting the number of nodes in each community (i.e., community size). Specifically, the Leiden algorithm and the dcSBM accept a parameter with which one can control the community size. We set the maximum community size to five for the Leiden algorithm and the average community size to three for the dcSBM. We again find that no community matches the suspended groups, i.e., O ≤ 0.087 for all detected communities. These results suggest that anomalous citation groups are difficult to detect with community detection algorithms.
Why are some suspended groups not detected by CIDRE? The groups identified by JCR but not by CIDRE have considerably fewer within-group citations than the groups identified by both (Fig. 2b). Notable examples are groups J17, J19, and J22. In these groups, the donor journals identified by JCR did not provide any within-group citations. The lack of within-group citations is due to the fact that the Microsoft Academic Graph (MAG) is curated by a machine learning algorithm, which sometimes fails to parse citations and publications, particularly for retracted papers 28,29. For instance, JCR suspended the journals comprising group J1 due to the anomalous citations from two papers published in the donor journal 13. The two papers were later retracted and are not indexed in the MAG. If we add back the retracted papers and rerun CIDRE, then CIDRE detects group J1.
Newly detected citation groups in 2010-2018. CIDRE detected 159 groups that JCR has not suspended. We classified these detected groups based on five criteria that we sequentially applied: more than 20% of within-group citations (A) come from a single paper, (B) go to a single paper, (C) come from a single author, or (D) go to a single author, or (E) two journals in the group share at least one editorial board member (Fig. 3; see the Methods section for the method for identifying editors). Over half of the groups detected by CIDRE (93 groups; 58%) are attributed to excessive citations provided by a single paper (category A) or a single author (category C). In 19 groups (12%), excessive citations are directed to a single paper (category B) or a single author (category D). In 26 groups (16%), journals share at least one editorial board member (category E). The remaining 21 groups (13%) do not meet any of the five criteria (category F). For comparison, we apply the same classification rule to the 22 groups of journals suspended from JCR (Fig. 2c).
Similar to CIDRE, relatively many groups that are suspended from JCR belong to category A or C. In the following, we closely inspect the groups with the largest number of within-group citations in each category except category E for which we inspect the group with a large overlap of editorial board members across the member journals. An instance of category A is group 1, which CIDRE detected in the network in 2018 and is composed of 17 journals on anthropology (Fig. 4a). Two review papers published in donor journals, American Anthropologist and Social Anthropology, provided 233 citations in total to the journals in group 1, of which 230 citations (99%) were made to the papers published in the JIF time window. Removing the citations from the two review papers decreases the JIFs for the 4 recipient journals, Anthropological Quarterly, Cultural Anthropology, Focaal, and Journal of the Royal Anthropological Institute, by more than 26%.
An instance of category B is group 2, which CIDRE detected in the network in 2017 and is composed of four journals on crystallography (Fig. 4b). Most of the within-group citations were made to a single paper published in a recipient journal, Acta Crystallographica Section C (denoted by R 2,1 ). In fact, the paper received 594 citations from the two donor journals, IUCrData (denoted by D 2,1 ) and Acta Crystallographica Section E (denoted by D 2,2 ), which account for 94% of citations that R 2,1 received from D 2,1 and D 2,2 . Removing the within-group citations to the single paper decreases the JIF of R 2,1 by 22%. The paper is titled "Crystal structure refinement with SHELXL", which describes a software commonly used in crystallography. The donor journals, D 2,1 and D 2,2 , required the software users to cite the paper in their submission guidelines.
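The reported JIF decreases can be read through the standard definition of the 2-year JIF: the citations received in a given year by items published in the previous 2 years, divided by the number of those items. Because removing citations leaves the denominator unchanged, the relative decrease equals the removed citations divided by all citations in the window. The sketch below illustrates this arithmetic; the function names and all numbers are hypothetical and are not taken from the data.

```python
def jif(citations_in_window, citable_items):
    """Two-year JIF: window citations divided by the number of citable items."""
    return citations_in_window / citable_items


def relative_jif_drop(citations_in_window, removed_citations):
    """Relative JIF decrease when `removed_citations` are discarded.

    The number of citable items cancels out, so only citation counts matter.
    """
    return removed_citations / citations_in_window


# Hypothetical journal: 400 citable items, 1000 window citations,
# of which 250 come from within the detected group.
before = jif(1000, 400)       # 2.5
after = jif(1000 - 250, 400)  # 1.875
drop = relative_jif_drop(1000, 250)
```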
An instance of category C is group 3, which CIDRE detected in the network in 2014 and is composed of five journals on engineering (Fig. 4c). Most of the within-group citations are attributed to self-citations across different journals by a single author (Fig. 4c). The author wrote approximately one-third of papers (23 out of 63 papers) contributing to the within-group citations. These papers provided 313 citations to the author's papers published in the recipient journals in 2012 and 2013. The author was on the editorial board for International Journal of Intelligent Systems and Applications, which serves as both a donor and recipient journal in this group.
An instance of category D is group 4, which CIDRE detected in the network in 2010 and is composed of four journals on veterinary science (Fig. 4d). One author wrote 33 papers published in a donor journal, Journal
An instance of category E is group 5, which CIDRE detected in the network in 2016 and is composed of seven journals on business (Fig. 4e). There are 176 editorial board members in total that serve any of the member journals. Among them, 19 individuals were the editors of at least two member journals. Nearly half of the overlapping editors (9 out of 19) serve two journals, Journal of Security and Sustainability Issues (JSSI) and Entrepreneurship and Sustainability Issues (ESI), and account for at least 25% of the editorial board members in each of the two journals. The editor-in-chief of ESI, who also serves JSSI as an editor, provided and received the largest number of citations (62 and 55 citations, respectively) among the authors of papers published in this journal group.
An instance of category F is group 6, which CIDRE detected in the network in 2011 and is composed of two journals on laser science (Fig. 4f). The donor journal, Laser Physics, provided 1984 citations to the recipient journal, Laser Physics Letters. We did not find any concentration of citations; neither a single paper nor a single author provided or received more than 8% of citations within the group. In 2011, the number of citations from the donor journal to the recipient journal more than doubled, from 987 citations in 2010 to 1984 citations in 2011. CIDRE identified the increase in the citations to be excessive and detected this group.
In addition to groups 1-6, two citation groups caught our attention, which we refer to as groups 7 and 8. Group 7 is present in the network in 2017. This group belongs to category E and consists of two journals on engineering (Fig. 4g).
Group 8 is present in the network in 2016. This group belongs to category A and consists of eight journals on literature (Fig. 4h). A donor journal, Keats Shelley Journal, provided 119 citations to the seven recipient journals, of which 110 citations (92%) were provided to the papers published in the JIF time window. Removing the 119 citations decreases the JIF of the recipient journals by at least 57%. A single paper titled "Annual Bibliography for 2015" provided all the within-group citations from the donor journal to the recipient journals. This paper consists of 42 pages, of which 40 pages are the reference list. In each of the years 2012, 2013, and 2014, the donor journal published a paper with a similar title (e.g., "Annual Bibliography for 2014") that cited many papers in the recipient journals. In these years, CIDRE detected groups that consisted mostly of the donor journal and the recipient journals in group 8.
Six out of these seven groups belong to category A, B, C, D, or E (Fig. 5). We refer to these six journal groups as groups 9, 10, . . . , 14.
Group 9 consists of three journals on surgery and belongs to category B (Fig. 5a). As is the case for group 2, most of the within-group citations pointed to a single paper published in the sole recipient journal, International Journal of Surgery (denoted by $R_9$). The paper received 483 citations from the two donor journals, Annals of Medicine and Surgery (denoted by $D_{9,1}$) and International Journal of Surgery Case Reports (denoted by $D_{9,2}$), which account for 82% of the citations (i.e., 592) that $R_9$ receives from $D_{9,1}$ and $D_{9,2}$. Removing these citations decreases the JIF of $R_9$ by 20%. The paper is titled "The SCARE 2018 statement: Updating consensus Surgical CAse REport (SCARE) guidelines", which is a guideline for surgical reports. In their author guidelines, the donor journals request the authors to cite the paper as a condition for submission. Furthermore, the author of the SCARE paper is the managing and executive editor of $D_{9,2}$ and $R_9$. In addition to this editor, $D_{9,2}$ and $R_9$ share many editors. In fact, there are 107 and 84 editors in $D_{9,2}$ and $R_9$, respectively, of which 79 individuals are the editors of both journals. Journals $D_{9,1}$, $D_{9,2}$, and $R_9$ conducted a similar citation practice in the previous 2 years. In fact, CIDRE detected a group composed of $D_{9,2}$ and $R_9$ in 2018, in addition to the present group in 2019. In 2017 and 2018, $D_{9,1}$ and $D_{9,2}$ requested the authors to cite the previous version of the SCARE guideline paper, written by the same author and published in $R_9$ in 2016. There were 559 and 554 citations from the donor journals to the paper in 2017 and 2018, respectively. The new guideline paper entered the time window for the JIF when the old guideline paper exited the time window.
Group 10 is composed of two journals and belongs to category D (Fig. 5b). The donor journal, "Journal of Low Frequency Noise, Vibration and Active Control", provided 160 citations to the recipient journal, Thermal Science. A single author received 74 out of the 160 citations (46%) from 34 papers published in the donor journal. Of the 34 papers, 29 are included in a special issue for which the author was the guest editor. The special issue consists of 74 papers.
Group 11 is composed of seven journals on anthropology and belongs to category A (Fig. 5c). A single review paper published in a donor journal, Social Anthropology, cited 95 papers published in the recipient journals, all of which were published in the time window for the JIF. If one removes the citations from that review paper, the JIF of each of the five recipient journals decreases by more than 18%. In the review paper, the author acknowledged the editors of the two recipient journals, Social Analysis and Focaal, owned by a publisher, Berghahn Journals, for granting access.
Group 12 consists of two journals on crystallography and belongs to category A (Fig. 5d). A single paper published in the donor journal, Crystallography Reviews, cited 124 papers published in the recipient journal, IUCrData, all of which were published in the time window for the JIF. If one removes these 124 citations, the JIF of the recipient journal decreases by 57%.
Group 13 is composed of two journals on political science and belongs to category D (Fig. 5e). The donor journal, Regulation and Governance, provided 95 citations to the recipient journal, Annals of American Academy of Political and Social Science. Removing the citations from the donor decreases the JIF of the recipient by 26%. Of the 95 citations from the donor journal to the recipient journal, 89 (93%) pointed to the papers included in a special issue of the recipient journal, i.e., "Regulatory Intermediaries in the Age of Governance." The special issue consists of 16 papers, each of which received fewer than two citations on average from journals outside group 13 in 2019. The special issue was edited by three guest editors who are on the editorial board of the donor journal. The three editors wrote a paper in the special issue. The paper received 26 citations in 2019, of which 13 citations (50%) came from the donor journal. The paper was highlighted as the most cited paper in the last 3 years in the recipient journal in 2019.
Group 14 consists of two mathematical journals and belongs to category E (Fig. 5f). The donor journal, Journal of Mathematical Sciences and Cryptography, published 150 papers, of which 52 papers cited 36 papers published in the recipient journal, Journal of Information and Optimization Science, in the time window for the JIF. We did not find a single author or a single paper that was exclusively cited or was cited within the group. The 52 papers published in the donor journal were written by 126 authors, of which 107 authors (84%) had never cited the recipient journal before. Both donor and recipient journals have the same chief editor.

Discussion
In this paper, we put forward an algorithm, named CIDRE, to identify groups of journals that cite each other at excessively high rates. CIDRE detects a majority of journal groups suspended from JCR. Notably, in several cases, it does so years in advance. In addition, it detects a number of anomalous groups, whose members increased their JIFs by 17-130% via within-group citations. The inspection of such groups reveals a variety of mechanisms leading to such inflation. Specifically, more than half of the anomalous groups are due to one paper or one author that single-handedly provides or receives many citations within the group.
The algorithm's practical value lies in that it is deterministic and scalable to large networks, which makes it possible to apply it in an online fashion to incoming streams of new citation data. Furthermore, it can be applied to different types of networks. For instance, CIDRE could be applied to bipartite author-journal networks, where a directed edge indicates a publication by an author in the journal, in order to detect potential predatory practices, such as the publication of papers with little peer review 30 . CIDRE could also be applied in different contexts, e.g., to detect the manipulation of ratings in e-commerce platforms and social media 31 .
One should be careful when drawing conclusions from the application of CIDRE. The comparison against the ground-truth data provided by JCR, and the manual inspection of the groups detected by CIDRE support that the groups flagged by CIDRE warrant consideration as potential citation cartel candidates. That being said, we ought to acknowledge that some of such candidates may arise due to unintended biases such as geographical proximity 32,33 , reciprocity between peers 34 , and editorial preferences 35,36 , rather than to outright malicious citation practices. In this respect, CIDRE should not be considered as a tool for automated decision-making or a substitute for expert judgment, but rather a support tool to extract interpretable information from the complexity of journal citation networks.
CIDRE has a parameter, the threshold θ, that sets the minimum fraction of excessive citations that the donor/recipient journals provide/receive within their group. Changing the value of θ induces a hierarchical onion-like structure on the detected journal groups. The inner cores that survive with a larger θ value are considered to be tighter citation groups, which may be more plausible citation cartel candidates. In this study, we set θ = 0.15 to allow for a fair comparison with JCR; all recipient journals suspended from JCR received at least 15% of their incoming citations from donor journals 37. Then, we manually inspected each group detected by CIDRE to pinpoint individual papers, authors, editors, and specific journals associated with excessive citations. However, manual inspection is a costly task and hard to scale up when dealing with large numbers of groups. This problem will manifest itself when one analyzes citation groups composed of authors because an author network can be much larger than a journal network. Therefore, in practice, it may be useful to prioritize groups that survive with higher thresholds. With CIDRE, one can easily determine the ranking of groups according to this criterion because gradually increasing θ to reveal the onion-like structure is straightforward and computationally inexpensive.
Regardless of the conclusions that one may draw on specific anomalies, our findings reveal the widespread presence of journals whose JIFs are substantially hoisted by the citations received from a small group of other journals. It would be hard not to relate this with the ever-increasing emphasis on citations and bibliometric indicators, and the pressure it puts on journal editors to boost growth in such numbers. We believe our findings to be a rather direct consequence of this environment, where actors are incentivized to act on the very same metrics according to which they are ranked, in a feedback loop that closely echoes Goodhart's Law: "when a measure becomes a target, it ceases to be a good measure" 38. In this respect, we believe that our results should encourage a more critical and nuanced approach to the use and interpretation of citation-based bibliometric indicators.

Methods
Detection of anomalous citation groups. We assume that an anomalous citation group is composed of journals that act as donors, recipients, or both. A donor journal gives excessive citations to the journals in the same group. A recipient journal receives excessive citations from the journals in the same group.
Algorithm. CIDRE finds groups of journals, U, composed of the donor and recipient journals. We quantify the extent to which a journal i acts as donor or recipient within group U using the donor score $x_d(i, U)$ and the recipient score $x_r(i, U)$, i.e., the fractions of excessive citations that journal i provides to and receives from the other journals in U, respectively.
The citations from journal i to journal j are deemed to be excessive if and only if they satisfy the following two conditions. First, more than half of the citations from i to j, counting cited papers published in any previous year, were made to papers published in the last 2 years (i.e., effective citations). Second, the number of citations, $W_{ij}$, is larger than that expected for a null model. Specifically, for each directed edge from node i to node j, we compute the p-value as the probability $p_{ij}$ that the null model assigns a weight w that is larger than or equal to the actual weight of edge (i, j) in the given network, i.e., $W_{ij}$. One obtains
$$p_{ij} = \sum_{w = W_{ij}}^{\infty} P^{\mathrm{null}}_{ij}(w; \hat{\lambda}_{ij}), \qquad (3)$$
where $\hat{\lambda}_{ij}$ is a parameter for the null model. We describe the null model in the next section.
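For a Poisson null model, this p-value is an upper-tail probability. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and the sketch assumes that the null-model mean passed in is positive. It sums the tail directly in log space so that small p-values are not lost to cancellation.

```python
import math


def poisson_tail_pvalue(observed, lam):
    """P(w >= observed) for w ~ Poisson(lam), with lam > 0."""
    if observed <= 0:
        return 1.0
    log_lam = math.log(lam)
    total = 0.0
    w = observed
    while True:
        # log P(w) = w*log(lam) - lam - log(w!), evaluated in log space.
        term = math.exp(w * log_lam - lam - math.lgamma(w + 1))
        total += term
        # Past the mode (w > lam) the terms shrink; stop once negligible.
        if w > lam and term <= 1e-16 * total:
            break
        w += 1
    return min(1.0, total)


p_single = poisson_tail_pvalue(1, 1.0)   # equals 1 - exp(-1)
p_triple = poisson_tail_pvalue(3, 1.0)   # equals 1 - 2.5 * exp(-1)
```

The loop takes a number of steps that grows with the mean, which is acceptable for a sketch; a production implementation would more likely call a library survival function.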
We perform a statistical test for each edge at the significance level of α = 0.01, with the Benjamini-Hochberg correction 39 to suppress the false positives due to the multiple comparison problem. In other words, one regards the m edges with the smallest p-values as significant (i.e., h(i, j) = 1) and the other edges as insignificant (i.e., h(i, j) = 0). The number m is given by the largest integer ℓ for which $p_{(\ell)} \le \ell\alpha/M$, where $p_{(\ell)}$ is the ℓth smallest p-value and M is the number of edges in the network.
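The selection rule just described can be sketched as follows. The function name and the example p-values are hypothetical; the logic follows the rule stated above (find the largest rank ℓ whose p-value is at most ℓα/M, then keep the ℓ smallest p-values).

```python
def benjamini_hochberg(pvalues, alpha=0.01):
    """Return a list of booleans, True where the edge is deemed significant."""
    M = len(pvalues)
    # Edge indices sorted by increasing p-value.
    order = sorted(range(M), key=lambda k: pvalues[k])
    m = 0
    for rank, k in enumerate(order, start=1):
        if pvalues[k] <= rank * alpha / M:
            m = rank  # keep the largest rank that satisfies the bound
    significant = [False] * M
    for k in order[:m]:
        significant[k] = True
    return significant


# Four hypothetical edges; the thresholds are 0.0025, 0.005, 0.0075, and 0.01.
flags = benjamini_hochberg([0.001, 0.5, 0.004, 0.03], alpha=0.01)
```

Note that every edge ranked at or below m is kept, even if its own p-value exceeded its own rank threshold; this is the step-up behavior of the procedure.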
After removing the insignificant edges, we seek groups of journals that have a donor or recipient score larger than a prescribed threshold θ. To this end, we use the following algorithm, akin to the k-core decomposition algorithm 40. First, we prune the network by keeping only the edges with h(i, j) = 1. Second, we initialize U = {1, . . . , N} and compute the donor and recipient scores for each node. Third, we remove a node i from U if $x_d(i, U) < \theta$ and $x_r(i, U) < \theta$. Then, we recompute the donor and recipient scores for all neighbors of i. We repeat the third step until no node is removed. Fourth, we partition U into disjoint groups $U_\ell$ (ℓ = 1, 2, . . .), where each $U_\ell$ is a maximal weakly connected component in the edge-pruned network composed of the nodes in U. We expect that anomalous citation groups contain sufficiently many within-group citations. Therefore, we remove $U_\ell$ if the sum of the weights of the edges within $U_\ell$, except self-loops, is less than $\theta_w$. We set θ = 0.15 and $\theta_w = 50$. We note that CIDRE is a special case of the generalized core decomposition algorithm 40 with vertex property function $f(i, U) = \max(x_d(i, U), x_r(i, U))$.
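The four steps above can be sketched in plain Python as follows. This is an unoptimized illustration, not the authors' implementation: it recomputes scores from scratch rather than updating only the neighbors of a removed node, and it assumes that the scores are normalized by each journal's total outgoing and incoming citations. All names are hypothetical.

```python
def cidre_groups(excess, total_out, total_in, theta=0.15, theta_w=50):
    """excess: {(i, j): weight} of the significant edges, i.e., h(i, j) = 1.

    total_out / total_in: {journal: citations given / received}; these are
    the ASSUMED denominators of the donor and recipient scores.
    """
    # Step 1: keep significant edges only (self-loops are ignored here).
    edges = {(i, j): w for (i, j), w in excess.items() if i != j}
    # Step 2: start from every node that has at least one residual edge.
    U = {n for edge in edges for n in edge}

    def scores(i):
        given = sum(w for (a, b), w in edges.items() if a == i and b in U)
        received = sum(w for (a, b), w in edges.items() if b == i and a in U)
        xd = given / total_out[i] if total_out.get(i) else 0.0
        xr = received / total_in[i] if total_in.get(i) else 0.0
        return xd, xr

    # Step 3: peel nodes that are neither donors nor recipients.
    changed = True
    while changed:
        changed = False
        for i in list(U):
            xd, xr = scores(i)
            if xd < theta and xr < theta:
                U.discard(i)
                changed = True

    # Step 4: weakly connected components among the surviving nodes.
    neighbors = {i: set() for i in U}
    for (a, b) in edges:
        if a in U and b in U:
            neighbors[a].add(b)
            neighbors[b].add(a)
    groups, seen = [], set()
    for start in U:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in component:
                component.add(n)
                stack.extend(neighbors[n] - component)
        seen |= component
        within = sum(w for (a, b), w in edges.items()
                     if a in component and b in component)
        if within >= theta_w:  # drop groups with too few within-group citations
            groups.append(component)
    return groups


# Toy example with made-up numbers.
excess = {("A", "B"): 60, ("B", "A"): 40, ("A", "C"): 5,
          ("D", "E"): 10, ("E", "D"): 10}
totals = {"A": 200, "B": 200, "C": 100, "D": 40, "E": 40}
groups = cidre_groups(excess, totals, totals)
```

In the toy example, journal C is peeled off because both of its scores fall below θ, and the pair D and E is discarded because it exchanges fewer than $\theta_w$ citations, leaving only the group of A and B.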
Null model. We employ the dcSBM 24,25 as a null model. The dcSBM consists of blocks, where each block is a group of journals. The dcSBM places an edge from node i to node j (i, j = 1, 2, . . . , N) with a probability determined by the block memberships, the out-strength $s^{\mathrm{out}}_i$ of node i in the original network, and the in-strength $s^{\mathrm{in}}_j$ of node j. The generated networks preserve the expectation of $s^{\mathrm{out}}_i$ and $s^{\mathrm{in}}_i$ for each node i, and the expected number of edges between and within the blocks of the given network.
With the dcSBM, one assumes that the weight of the edge from node i to j obeys a Poisson distribution given by 24
$$P^{\mathrm{null}}_{ij}(w; \lambda_{ij}) = \frac{\lambda_{ij}^{w} e^{-\lambda_{ij}}}{w!}, \qquad (4)$$
where $P^{\mathrm{null}}_{ij}(w; \lambda_{ij})$ is the probability that the dcSBM assigns weight w (w = 0, 1, 2, . . .). Parameter $\lambda_{ij}$ is equal to the mean of the Poisson distribution, i.e., the expected number of citations for the null model. We set $\lambda_{ij}$ to the maximum likelihood estimator conditioned on the blocks, which is given by
$$\lambda_{ij} = \frac{s^{\mathrm{out}}_i \, s^{\mathrm{in}}_j}{S^{\mathrm{out}}_{g_i} \, S^{\mathrm{in}}_{g_j}} \, \Lambda_{g_i, g_j}, \qquad (5)$$
where $g_i$ is the ID of the block to which node i belongs, $\Lambda_{uv}$ is the number of directed edges from block u to block v, $S^{\mathrm{out}}_u = \sum_{\ell=1}^{N} s^{\mathrm{out}}_\ell \, \delta(g_\ell, u)$ and $S^{\mathrm{in}}_u = \sum_{\ell=1}^{N} s^{\mathrm{in}}_\ell \, \delta(g_\ell, u)$ are the sum of out-strength and in-strength of the nodes in block u, respectively, and δ(·, ·) is the Kronecker delta 24.
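As an illustration of this estimator, the sketch below computes the expected weight for any ordered pair of nodes from an edge-weight dictionary and a block assignment. It assumes the standard degree-corrected form, in which the expected weight is the block-to-block citation count shared out in proportion to each node's out-strength and in-strength, and it counts edges by weight. The function and variable names are hypothetical.

```python
from collections import defaultdict


def dcsbm_expected_weights(W, block):
    """W: {(i, j): weight}; block: {node: block id}. Returns a function lam(i, j)."""
    s_out, s_in = defaultdict(float), defaultdict(float)   # node strengths
    S_out, S_in = defaultdict(float), defaultdict(float)   # block strengths
    block_edges = defaultdict(float)                       # block-to-block counts
    for (i, j), w in W.items():
        s_out[i] += w
        s_in[j] += w
        S_out[block[i]] += w
        S_in[block[j]] += w
        block_edges[(block[i], block[j])] += w

    def lam(i, j):
        u, v = block[i], block[j]
        denom = S_out[u] * S_in[v]
        return s_out[i] * s_in[j] * block_edges[(u, v)] / denom if denom else 0.0

    return lam


# Toy network: nodes a and b share block 0, node c is alone in block 1.
W = {("a", "b"): 3, ("b", "a"): 1, ("a", "c"): 2}
block = {"a": 0, "b": 0, "c": 1}
lam = dcsbm_expected_weights(W, block)
expected_out_a = sum(lam("a", j) for j in block)
```

Summing the expected weights over all targets recovers node a's observed out-strength (5 in the toy network), which is the property that the null model is meant to preserve on average.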
One may be tempted to use the $\lambda_{ij}$ value given by (5) to compute the p-value using (3). However, if $\lambda_{ij}$ is smaller than one, even the edges with the smallest weight $W_{ij} = 1$ may be judged to be excessive in the significance test explained in the previous section. We instead require $W_{ij}$ to be large for journal i to be regarded as excessively citing journal j. Therefore, we use a clipped value, $\hat{\lambda}_{ij}$, to compute the p-value using (3), where
$$\hat{\lambda}_{ij} = \max(1, \lambda_{ij}). \qquad (6)$$
We find the blocks by fitting the dcSBM to the journal citation networks. Specifically, we first construct an aggregated network, in which the weight of the edge from node i to node j, denoted by $\overline{W}_{ij}$, is given by the sum of the weight over the networks between 2000 and 2019, i.e., $\overline{W}_{ij} = \sum_{t=2000}^{2019} W^{(t)}_{ij}$, where $W^{(t)}_{ij}$ is the weight of the edge from node i to node j in the network in year t. Then, we identify the blocks of the aggregated network using a non-parametric Bayesian method without hierarchical structure 25. Note that we use the aggregated network $\overline{W}$ only to find the blocks of journals. With the detected blocks, we compute $\lambda_{ij}$ given by (5) separately for each yearly network $W^{(t)}$. This is because the number of citations monotonically increases over time. Therefore, recent yearly citation networks would tend to have more excessive citations than older networks if one used $\lambda_{ij}$ computed for the aggregated network.
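The two ingredients of this paragraph, aggregating the yearly networks and clipping the null-model mean, can be sketched as follows. The sketch assumes that clipping means raising any mean below one up to one, as the argument above suggests; the function names and numbers are hypothetical.

```python
import math


def aggregate(yearly):
    """Sum edge weights over years. yearly: {year: {(i, j): weight}}."""
    total = {}
    for W_t in yearly.values():
        for edge, w in W_t.items():
            total[edge] = total.get(edge, 0) + w
    return total


def clip(lam):
    """Clipped null-model mean (ASSUMED form: never smaller than one)."""
    return max(1.0, lam)


def single_citation_pvalue(lam):
    """P(w >= 1) under Poisson(lam), i.e., the p-value of an edge with weight 1."""
    return 1.0 - math.exp(-lam)


W_bar = aggregate({2018: {("a", "b"): 2}, 2019: {("a", "b"): 3, ("b", "a"): 1}})
p_raw = single_citation_pvalue(0.001)            # about 0.001
p_clipped = single_citation_pvalue(clip(0.001))  # about 0.63
```

Without clipping, a single citation between two journals with a tiny expected weight has a raw p-value below α = 0.01 and could be flagged as excessive; with clipping, its p-value is about 0.63 and it is not.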
Identifying editorial board members. There are 641 journals in the 184 groups detected by CIDRE. We manually identified the web pages listing the editorial board members for 525 of the 641 journals. Extracting human names, particularly non-Latin names, from web pages is challenging. In addition, spelling variation makes it difficult to match editors in different journals. Therefore, we did not aim to calculate the precise number of editorial board members shared by different journals but to calculate its lower bound. Specifically, we extracted person names with the spaCy package 41. Then, one of the authors, S.K., manually inspected the extracted names and removed non-human names and names that were too short (e.g., initials). Using exact string matching for the manually inspected names, we matched the editors in different journals.

Data availability
The data that are needed for reproducing the results are openly available in Microsoft Academic Graph at https://academic.microsoft.com/home.