AI models are using material from retracted scientific papers

Summary

Recent studies, along with reporting by MIT Technology Review, show that popular AI chatbots and research-focused search tools sometimes draw on content from retracted scientific papers when answering questions. Tests found models and tools referencing retracted medical and scientific work without flagging that the papers had been withdrawn, which poses risks when users take those answers at face value.

Key Points

  • Tests of ChatGPT (GPT-4o) on questions based on 21 retracted medical-imaging papers found the chatbot cited the retracted work in five cases and advised caution in only three.
  • A separate study found ChatGPT-4o mini failed to mention retractions when assessed against 217 retracted or low-quality papers.
  • Research-focused tools (Elicit, Ai2 ScholarQA, Perplexity, Consensus) also referenced many retracted papers — often without noting their retracted status.
  • Some companies (e.g. Consensus) have started integrating retraction sources such as Retraction Watch and publisher data to cut down on citations of retracted work, but coverage remains inconsistent.
  • Challenges include incomplete retraction databases, varied publisher labelling (retraction, correction, expression of concern), scattered copies on preprint servers, and model training cutoffs that miss later retractions.
  • Proposed mitigations: better use of published retraction notices, integration of curated retraction data, exposing peer reviews and PubPeer critiques as context, and real-time checks against retraction sources (a minimal sketch of such a check follows this list).
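
To make the curated-data and real-time-check ideas concrete, here is a minimal sketch of how a search tool could cross-check the DOIs it is about to cite against a locally downloaded retraction list before surfacing a paper. The file name, column header and DOIs below are illustrative assumptions, not details from the article or any particular tool.

```python
import csv


def load_retracted_dois(csv_path: str) -> set[str]:
    """Load retracted DOIs from a local CSV export of a retraction list
    (e.g. a Retraction Watch download). The column name is an assumption;
    adjust it to match the export you actually have."""
    retracted = set()
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            doi = (row.get("OriginalPaperDOI") or "").strip().lower()
            if doi:
                retracted.add(doi)
    return retracted


def flag_retracted(cited_dois: list[str], retracted: set[str]) -> list[str]:
    """Return the cited DOIs that appear in the retraction list."""
    return [doi for doi in cited_dois if doi.strip().lower() in retracted]


if __name__ == "__main__":
    # Hypothetical file name and DOIs, for illustration only.
    retracted = load_retracted_dois("retraction_list_export.csv")
    citations = ["10.1000/example.2021.001", "10.1000/example.2019.042"]
    for doi in flag_retracted(citations, retracted):
        print(f"Warning: cited paper {doi} has been retracted")
```

Even with a check like this, no single retraction list is complete and copies of withdrawn papers linger on preprint servers, so a production tool would still need to combine several sources and publisher metadata.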

Content summary

Researchers tested how AI chatbots and search tools handle scientific papers that have been formally retracted. In one test, ChatGPT (GPT-4o) answered questions grounded in 21 retracted medical-imaging papers; it cited the retracted papers in several answers and only rarely advised users about the retractions. Other tools marketed for scholarly work also returned responses based on retracted literature without noting the withdrawals.

Some firms are beginning to address the problem by ingesting retraction lists from publishers, aggregators and Retraction Watch, and by manually curating data. But experts warn that no single retraction database is complete, publishers mark retractions inconsistently, and copies of papers can persist on preprint servers and repositories. Training-data cutoffs also mean models may not know about retractions that happened after they were trained.

Experts recommend making more contextual material available to models (peer reviews, PubPeer comments, retraction notices), improving how search tools check for retractions, and urging users to be sceptical and verify sources — especially for medical or policy-relevant queries.

Context and relevance

This issue matters because people increasingly ask AI tools for medical advice, literature summaries and research guidance. As governments and funders invest heavily in AI for science, the reliability of model outputs becomes central to trust and safety. Faulty answers based on retracted work can mislead clinicians, researchers and the public, so fixing detection and labelling of retracted papers is a practical priority for both toolmakers and publishers.

Why should I read this?

Short version: if you ask AI for science or health info, don’t blindly trust the answer. This piece saves you time by showing where the risks are, which tools have already been caught citing retracted work, and what’s being done about it, so you can spot dodgy AI-sourced science faster.

Source

Source: https://www.technologyreview.com/2025/09/23/1123897/ai-models-are-using-material-from-retracted-scientific-papers/