According to Science, Matt Spick, deputy editor-in-chief of Scientific
Reports, found that a large number of formulaic papers using data from the
National Health and Nutrition Examination Survey (NHANES) have poured into
journals such as Scientific Reports and PLOS Biology. The surge in these
low-quality papers may be driven by "paper mills" and facilitated by AI text
generation.
Nature also pointed out in recent reports that, besides NHANES, other
biomedical databases (UK Biobank, FAERS, GBD and FinnGen) are frequently used
by these low-quality papers.


Faced with this problem, the Journal of Global Health has taken the lead in
tightening its review standards for papers based on these databases. Authors
who submit work based on open datasets must now declare how many papers using
similar datasets have been published in the past three years, disclose whether
artificial intelligence was used to write the manuscript, and explain how they
ruled out false positives in their results.
In response to the trend of "abuse of data sets", other journals and
publishers may follow the Journal of Global Health and introduce similarly
strict review mechanisms.
According to research by Matt Spick, Anthony Onoja and others, between 2021
and 2025 the number of papers drawing on six data sets far exceeded
expectations, with an explosion of "template" papers based on five of the data
sources: NHANES, UK Biobank, FAERS, GBD and FinnGen.

These low-quality papers often pick a health problem, an associated
environmental or physiological factor, and published data from a specific
population, then generate so-called "new discoveries" by simply swapping
variables. Examples include an association between drinking semi-skimmed milk
and preventing depression (PMID 39703337) or between education level and
postoperative abdominal hernia (PMID 39616067), as well as many other
hypotheses that lack a biological basis.
When examining changes in the geographical origin of papers using these six
data sources, the study found that China's share of these papers indexed in
PubMed soared from 19% in 2021 to 65% in 2024, the largest increase of any
country. Among the six data sets, FinnGen showed the most significant growth in
papers from China: as of 2024, 89% of the lead authors of related papers were
from China.


This uneven distribution of paper output suggests that the growth is not a
general increase in research productivity. Rather, researchers in developing
countries, lacking research support and facing "publish or perish" academic
pressure, take risks, which ultimately encourages the growth of "paper mills".
To cope with the flood of similar manuscripts, the former announced at the
end of July that it would stop accepting submissions that use the FAERS
database altogether; the latter began requiring research based on public
databases to provide independent validation.