Crossref’s recent acquisition of the Retraction Watch data will enhance the quality, transparency, and completeness of metadata on retractions in academic publications. In this post, we present some of our preliminary analysis.
Retractions and Their Challenges
Retractions play an important role in protecting the integrity of the research ecosystem by providing a mechanism for addressing critical issues such as research misconduct, which can undermine the credibility of academic publications. However, the retraction process can be lengthy and finding retraction notices can be challenging, leading to the continued citation and use of discredited publications. In response to these issues, websites like the Retraction Watch Database have emerged as valuable resources for tracking and documenting the reasons behind retractions, fostering transparency in the scientific community.
Crossref and Retraction Watch Partnership
Crossref recently expanded its retraction metadata by acquiring the Retraction Watch dataset, resulting in a compilation of nearly 50,000 retractions. The complete dataset is available through Crossref’s Labs API, and interested users can download it directly in CSV format from the following link:
https://api.labs.crossref.org/data/retractionwatch?name@email.org (replace name@email.org with your own email address; the file is about 40 MB).

Figure 1. The dataset from Retraction Watch’s collaboration with Crossref is publicly accessible through the Labs API
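For readers who prefer to script the download, the following is a minimal Python sketch rather than an official client. It assumes the link above accepts the contact email appended directly after the question mark, as shown in this post. The function names and the output file name are hypothetical.

```python
import shutil
import urllib.request

# Base URL taken from the link in this post.
LABS_URL = "https://api.labs.crossref.org/data/retractionwatch"


def build_url(email):
    """Return the download link with the contact email appended after '?'."""
    return f"{LABS_URL}?{email}"


def download_dataset(email, dest="retractionwatch.csv"):
    """Stream the CSV (about 40 MB) to a local file and return its path.

    'dest' is a hypothetical file name chosen for this example.
    """
    with urllib.request.urlopen(build_url(email)) as response:
        with open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest


# Example call (not executed here because it needs network access):
# download_dataset("name@email.org")
```

Streaming the response to disk avoids holding the whole file in memory. If the server rejects the request, it may be worth checking the Crossref Labs documentation for any required headers.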
Preliminary Analysis of the Major Reasons for Retraction
Using this dataset, we conducted a preliminary analysis of retractions from 2017 to 2022, excluding conference papers. When an article had multiple reasons for retraction, each reason was counted separately. In total, we identified 105 distinct reasons for retraction. The top ten reasons for the period from 2017 to 2022 are shown in Figure 2. The figure shows that:
- The most common cause for retraction was “Investigation by Journal/Publisher,” followed by “Unreliable Results” and “Concerns/Issues About Data.” The detailed explanations for these reasons can be found in the Retraction Watch Database User Guide Appendix B: Reasons.
Figure 2. Top 10 reasons for retractions from 2017 to 2022
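The counting approach described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact script used for Figure 2. The column names ("Reason", "RetractionDate", "ArticleType"), the semicolon separator, the leading "+" on each reason, and the sample rows are all assumptions made for the example; please check them against the header of the downloaded CSV.

```python
import csv
import io
import re
from collections import Counter

# Assumed column names; verify against the real CSV header.
REASON_COL = "Reason"
DATE_COL = "RetractionDate"
TYPE_COL = "ArticleType"

# Finds a four-digit year (19xx or 20xx) regardless of the date layout.
YEAR_RE = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def split_reasons(cell):
    """Split a multi-reason cell into clean labels (assumes ';' and '+')."""
    labels = (part.strip().lstrip("+").strip() for part in cell.split(";"))
    return [label for label in labels if label]


def retraction_year(cell):
    """Return the year found in a date string, or None if there is none."""
    match = YEAR_RE.search(cell)
    return int(match.group()) if match else None


def count_reasons(rows, first_year=2017, last_year=2022):
    """Count every reason separately, skipping conference papers."""
    counts = Counter()
    for row in rows:
        year = retraction_year(row.get(DATE_COL) or "")
        if year is None or not first_year <= year <= last_year:
            continue
        if "conference" in (row.get(TYPE_COL) or "").lower():
            continue
        counts.update(split_reasons(row.get(REASON_COL) or ""))
    return counts


# Made-up sample rows for illustration only.
SAMPLE = """Record ID,ArticleType,RetractionDate,Reason
1,Research Article;,3/14/2019 0:00,+Unreliable Results;+Investigation by Journal/Publisher;
2,Conference Abstract/Paper;,5/2/2020 0:00,+Paper Mill;
3,Research Article;,7/9/2016 0:00,+Duplication of Image;
4,Review Article;,1/20/2022 0:00,+Investigation by Journal/Publisher;
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = count_reasons(rows)
print(counts.most_common(10))
# [('Investigation by Journal/Publisher', 2), ('Unreliable Results', 1)]
```

In the sample, record 2 is dropped as a conference paper and record 3 falls outside the 2017 to 2022 window, so only records 1 and 4 contribute to the counts.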
We further analyzed the data by examining the reasons for retraction across different years. Figure 3 shows the percentage of retractions attributed to each reason, relative to the total number of retractions, by year. Notably:
- There has been an upward trend in retractions attributed to “Investigation by Journal/Publisher,” “Unreliable Results,” “Concerns/Issues About Data,” “Concerns/Issues About Referencing/Attributions,” and “Concerns/Issues With Peer Review.”
- Conversely, there has been a declining trend in retractions attributed to “Paper Mill” and “Duplication of Image.”
Figure 3. Proportional distribution of retraction reasons by year
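The yearly proportions can be computed along the following lines. This sketch assumes the denominator is the number of retractions in the same year, which is one reading of "relative to the total number of retractions"; the function name and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def reason_shares_by_year(records):
    """Return {year: {reason: percent of that year's retractions}}.

    'records' is an iterable of (year, reasons) pairs, one per article.
    """
    totals = Counter()                # retractions per year
    by_reason = defaultdict(Counter)  # year -> reason -> article count
    for year, reasons in records:
        totals[year] += 1
        # set() ensures an article counts at most once per reason.
        for reason in set(reasons):
            by_reason[year][reason] += 1
    return {
        year: {
            reason: 100.0 * n / totals[year]
            for reason, n in by_reason[year].items()
        }
        for year in sorted(totals)
    }


# Made-up records for illustration only.
records = [
    (2021, ["Paper Mill", "Investigation by Journal/Publisher"]),
    (2021, ["Unreliable Results"]),
    (2022, ["Investigation by Journal/Publisher"]),
    (2022, ["Investigation by Journal/Publisher", "Unreliable Results"]),
]

shares = reason_shares_by_year(records)
print(shares[2021]["Paper Mill"])                          # 50.0
print(shares[2022]["Investigation by Journal/Publisher"])  # 100.0
```

Because one article can carry several reasons, the percentages within a year can add up to more than 100.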
The decline in retractions related to paper mills and image duplication can be attributed to a combination of positive and negative factors. On the positive side, improved techniques for identifying duplicated images before publication may significantly reduce the occurrence of such unethical practices [1]. Additionally, increased public awareness of fraudulent practices has most likely made authors more cautious and led them to place greater emphasis on research integrity [2]. However, the operational strategies of paper mill organizations may also have changed, potentially leading to fewer detected instances of duplication [3].
While this decline may suggest a positive trend, it is crucial to remain cautious and to keep improving detection methods to stay ahead of evolving fraudulent practices. In summary, the range of retraction reasons discussed above shows that every stakeholder involved, including authors, editors, publishers, and peer reviewers, plays an important role in maintaining the integrity of scientific research.
– By Ernest Lam, Library
Category: Academic Publishing
Tags: Crossref, metadata, research integrity, Retraction Watch
Published November 10, 2023


