Late last week, arXiv updated its policy on posting review articles and position papers in the Computer Science category. Previously, such papers were accepted at the discretion of moderators. Now, arXiv requires that such papers be accepted for publication or conference presentation before being posted.
The stated reason:
“In the past few years, arXiv has been flooded with papers. Generative AI / large language models have added to this flood by making papers – especially papers not introducing new research results – fast and easy to write. While categories across arXiv have all seen a major increase in submissions, it’s particularly pronounced in arXiv’s CS category.”
The post, Attention Authors: Updated Practice for Review Articles and Position Papers in arXiv CS Category, details how authors can show that a review article or position paper has been accepted, along with other practical guidance.
The increasing “flood” of low-quality articles created with LLMs is an unintended consequence. We would like to think that no one set out to flood the research environment with unreliable or poorly written material, but the system is being overwhelmed. The problem compounds when AI is trained on AI: a 2024 study found that models collapse when trained on recursively generated data [1].
On September 29, 2025, Cory Doctorow, the author of Enshittification: Why Everything Got Worse and What to Do about It and The Internet Con: How to Seize the Means of Computation, wrote that “AI is the asbestos we are shoveling into the walls of our society and our descendants will be digging it out for generations.” Researchers and scholars would do well to consider this metaphor. Asbestos, too, was once seen as a modern and useful innovation: it helped to prevent fires, but it was later found to be toxic. People across the globe still suffer from asbestos exposure, and litigation over it continues [2, 3, 4]. Governments, organizations, and individuals must remain aware of the dangers of asbestos exposure and deal with its consequences, at great cost in human life, health, time, and money. The same may come to be true of AI, if we fail to manage its use wisely.
GenAI & LLMs in Research – To Disclose or Not to Disclose?
Scholars’ use of AI tools in research and publishing is not new, but it often goes undisclosed. For example, in 2024, Haider, Söderström, Ekström, and Rödl found over 130 GPT-fabricated papers in Google Scholar [5]. In August 2025, a study reported that up to 22% of the Computer Science papers examined showed signs of LLM use, while a manual review of 200 random papers from that set revealed that only 1% disclosed it [6].
Disclosure, transparency, and trust are vital to ensuring that using LLMs for scholarship produces reliable data and information. One way to build that trust is to be radically open about AI use. For example, in October 2025, researchers held the first Agents4Science Conference at Stanford University, which explicitly required AI tools to serve as “authors” and peer reviewers. The organizers hoped this would be a good starting point for developing guidelines on how to incorporate AI tools into research and publication ethically and transparently.
Another notable suggestion comes from Alex Glynn, who maintains the Academ-AI website, which documents suspected undisclosed AI use in scholarly publications. Glynn proposes that AI-use declarations become a non-optional part of sharing results, modeled on the now-standard practice of declaring conflicts of interest [7].
While transparency helps, it won’t solve the deeper issue: as long as quantity (of publications, of citations, etc.) is used to judge research quality and impact, the problem will persist. Many researchers will try to “game the system” by using AI tools unwisely and/or without disclosure, and other researchers will then need to sift through even more (often low-quality) material to develop new forms of insight, knowledge, and creativity. As Stafford Beer put it so succinctly [8], “the purpose of a system is what it does” (POSIWID), and it appears that one of the main purposes of the system currently created by the commercialization of higher education [9] is to extrude “content” rather than cultivate knowledge.
So, in the meantime, let’s try to work on the other purposes of the higher education and research system: to think, ponder, write, create. Let us remember that scholarship is a humane process, and that “Ideas, embodied in data and values, beliefs, principles, and original insights, must be pursued because they are the stuff of life.” [10]
– By Victoria Caplan, Library
References
[1] Shumailov, Ilia, Zakhar Shumaylov, Yiren Zhao, Nicolas Papernot, Ross Anderson, and Yarin Gal. “AI models collapse when trained on recursively generated data.” Nature 631, no. 8022 (2024): 755-759. https://doi.org/10.1038/s41586-024-07566-y
[2] McCulloch, Jock, and Geoffrey Tweedale. Defending the Indefensible: The Global Asbestos Industry and Its Fight for Survival. Oxford: Oxford University Press, 2023. https://doi.org/10.1093/oso/9780199534852.001.0001
[3] Miyamoto, Kenichi, Kenji Morinaga, and H. Mori. Asbestos Disaster: Lessons from Japan’s Experience. 1st ed. Tokyo: Springer, 2011. https://doi.org/10.1007/978-4-431-53915-5
[4] McGinness Kearse, Anne. “Medicolegal Aspects of Asbestos-Related Diseases: A Plaintiff’s Attorney’s Perspective.” In Pathology of Asbestos-Associated Diseases, edited by Tim D. Oury et al. Cham: Springer Nature, 2025. https://doi.org/10.1007/978-3-031-89250-9_12
[5] Haider, J., K. R. Söderström, B. Ekström, and M. Rödl. “GPT-Fabricated Scientific Papers on Google Scholar: Key Features, Spread, and Implications for Preempting Evidence Manipulation.” Harvard Kennedy School (HKS) Misinformation Review (2024). https://doi.org/10.37016/mr-2020-156
[6] Liang, Weixin, Yaohui Zhang, Zhengxuan Wu, et al. “Quantifying Large Language Model Usage in Scientific Papers.” Nature Human Behaviour, August 4, 2025, 1–11. https://doi.org/10.1038/s41562-025-02273-8
[7] Glynn, Alex. “The Case for Universal Artificial Intelligence Declaration on the Precedent of Conflict of Interest.” Accountability in Research 32, no. 6 (2025): 1046–47. https://doi.org/10.1080/08989621.2024.2345719
[8] Ramage, Magnus, and Karen Shipp. “Stafford Beer.” In Systems Thinkers, 193–202. London: Springer London, 2020. https://doi.org/10.1007/978-1-4471-7475-2_19
[9] Kezar, Adrianna, and Samantha Bernstein-Sierra. “Commercialization of Higher Education.” In Second Handbook of Academic Integrity, edited by Sarah Elaine Eaton, 2304:1867–87. Cham: Springer Nature Switzerland, 2024. https://doi.org/10.1007/978-3-031-54144-5_59
[10] Giamatti, A. Bartlett. “The Earthly Use of a Liberal Education.” In A Free and Ordered Space: The Real World of the University, 118–126, at p. 122. New York: W.W. Norton, 1988.
Tags: arXiv, disclosure, GenAI, Generative AI, LLM, preprints
published November 6, 2025


