Pub-Guard-Llama-8B

Pub-Guard-Llama-8B is a fine-tuned version of the Llama-3.1-8B model, designed to detect fraudulent articles in academic publishing.

Benefits of using this model:

  • the first LLM designed specifically for fraud detection in scientific articles
  • integration of external resources (Semantic Scholar, OpenAlex, PubMed, ...) to enrich the analysis with author, institution, and journal metadata
  • uncertainty-aware predictions with faithful explanations

Datasets

Zero-Shot Reasoning

Model                        Breast Cancer   Lung Cancer
Llama-3.1-8B-Instruct        20.0            46.2
OpenScholar-8B               18.8            20.3
Bio-Medical-Llama-3-8B       40.7            56.7
PMC-LLaMA-13B                36.4            -
Pub-Guard-Llama-8B (Ours)    69.5            79.3
Pub-Guard-Llama-8B (RAG)     -               -
Pub-Guard-Llama-8B (Debate)  -               -

Coherence Check

  • NLI model: cross-encoder/nli-deberta-v3-base
  • Pairing: article (premise) -> each sentence in the explanation (hypothesis)
  • Instruction data (gpt-4o): 88.7 (train set)
  • Predicted data (pub-guard-llama): 90.8 (lung cancer set)
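
A minimal sketch of how such a coherence check could be computed, under stated assumptions: the article is used as the premise and each explanation sentence as the hypothesis, and the score is the percentage of sentences that receive an accepted label. The model card does not state which labels count as coherent, so `accepted` is a hypothetical parameter, and the sentence splitter is a deliberately naive stand-in. The label order follows the cross-encoder/nli-deberta-v3-base model card.

```python
import re
from typing import Callable, List, Sequence, Tuple

# Label order documented on the cross-encoder/nli-deberta-v3-base model card.
LABELS = ["contradiction", "entailment", "neutral"]

Pair = Tuple[str, str]  # (premise, hypothesis)


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def nli_labels(pairs: Sequence[Pair]) -> List[str]:
    """Label (premise, hypothesis) pairs with the NLI cross-encoder."""
    from sentence_transformers import CrossEncoder  # heavy import, kept local

    model = CrossEncoder("cross-encoder/nli-deberta-v3-base")
    scores = model.predict(list(pairs))  # one row of 3 scores per pair
    return [LABELS[i] for i in scores.argmax(axis=1)]


def coherence_score(
    article: str,
    explanation: str,
    label_fn: Callable[[Sequence[Pair]], List[str]] = nli_labels,
    accepted: Sequence[str] = ("entailment",),  # assumption, not from the card
) -> float:
    """Percentage of explanation sentences whose NLI label is accepted."""
    sentences = split_sentences(explanation)
    if not sentences:
        return 0.0
    labels = label_fn([(article, s) for s in sentences])
    hits = sum(label in accepted for label in labels)
    return 100.0 * hits / len(sentences)
```

Passing `label_fn` explicitly makes it possible to swap in a different NLI model or a stub; with the default, the cross-encoder is downloaded on first use.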

Examples

Non-Retracted Article

Input:

Title: Bioinformatics-Based Discovery of CKLF-Like MARVEL Transmembrane Member 5 as a Novel Biomarker for Breast Cancer.

Abstract: Chemokine-like factor (CKLF)-like MARVEL transmembrane members (CMTMs) represent a novel protein family linking the
chemokine and transmembrane-4 superfamily families, which potentially play several roles in diverse physiological and pathological processes.
The detailed functions and underlying molecular mechanisms of CMTMs remain elusive in breast cancer.
Herein, we performed a comprehensive bioinformatic analysis to investigate the prognostic effect, potential functions,
and biomolecular regulatory network of CMTMs in breast cancer.
...

Authors: Juan Zhou (author h-index: 22, Established Researcher); J. Lei (author h-index: 10, Early Career Researcher);
Jun Wang (author h-index: 17, Established Researcher); Chen-Lu Lian (author h-index: 10, Early Career Researcher);
L. Hua (author h-index: 9, Early Career Researcher); Zhen-Yu He (author h-index: 29, Established Researcher);
San-Gang Wu (author h-index: 28, Established Researcher)

Institutions: Sun Yat-sen University Cancer Center, Guangzhou, China. (institution average citation: 21.0, Established Institution);
Teaching Hospital of Fujian Medical University, Xiamen, China. (institution average citation: 60.0, World-Class Institution)

Journal: frontiers in cell and developmental biology (journal JCR: Q1, Top Level Journal)

Output:

No
The article should be accepted due to its publication in a top-tier journal with a rigorous peer review process,
the authors' strong credibility with established h-indices, and affiliations with reputable institutions.
The study presents original, well-supported findings on CMTM5's role in breast cancer, with no evidence of data fabrication or plagiarism.
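
The input in this example combines the article text with author, institution, and journal metadata. A minimal sketch of assembling such an input is given here; the `Author` and `Institution` classes and the `build_input` function are hypothetical names, and the exact template used during fine-tuning is an assumption based only on the example shown. Tier labels such as "Established Researcher" are passed in by the caller, since the thresholds behind them are not stated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Author:
    name: str
    h_index: int
    tier: str  # e.g. "Established Researcher"; supplied by the caller


@dataclass
class Institution:
    name: str
    avg_citation: float
    tier: str  # e.g. "Emerging Institution"; supplied by the caller


def build_input(
    title: str,
    abstract: str,
    authors: List[Author],
    institutions: List[Institution],
    journal: str,
    journal_note: str,
) -> str:
    """Format article metadata in the layout of the example input."""
    author_str = "; ".join(
        f"{a.name} (author h-index: {a.h_index}, {a.tier})" for a in authors
    )
    inst_str = "; ".join(
        f"{i.name} (institution average citation: {i.avg_citation:.1f}, {i.tier})"
        for i in institutions
    )
    # Fields are separated by blank lines, as in the example.
    return "\n\n".join(
        [
            f"Title: {title}",
            f"Abstract: {abstract}",
            f"Authors: {author_str}",
            f"Institutions: {inst_str}",
            f"Journal: {journal} ({journal_note})",
        ]
    )
```

For instance, `journal_note` would be "journal JCR: Q1, Top Level Journal" for the article above and "unknown journal" when no ranking is available.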

Retracted Article

Input:

Title: Changes and Influencing Factors of Cognitive Impairment in Patients with Breast Cancer.

Abstract: To investigate the changes in cognitive function and its influencing factors in patients with breast cancer after chemotherapy,
to provide a scientific basis for further cognitive correction therapy. In this study, general information on age,
marital status, and chemotherapy regimen was collected from 172 breast cancer chemotherapy patients.
172 patients with breast cancer undergoing chemotherapy were investigated by convenience sampling method,
and the subjects were tested one-on-one using the Chinese version of the MATRICS Consensus Cognitive Battery (MCCB) computer system.
...

Authors: Huixia Cui (author h-index: 6, Early Career Researcher);
Xiaoxiu Song (author h-index: 1, Emerging Researcher); Wenlu Zhang (author h-index: 7, Early Career Researcher)

Institutions: College of Nursing, Jinzhou Medical University, Jinzhou, Liaoning 121001, China. (institution average citation: 9.0, Emerging Institution);
Department of Intensive Care Medicine, Liaocheng People's Hospital, Liaocheng, Shandong 252000, China. (institution average citation: 10.0, Emerging Institution);

Journal: evidence-based complementary and alternative medicine : ecam (unknown journal)

Output:

Yes
The article should be retracted due to potential data fabrication or manipulation, as the reported linear regression results lack clarity and statistical validation.
Additionally, the journal's reputation and peer review rigor are questionable, and the authors' affiliations are emerging with low citation averages,
raising concerns about research reliability.
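
Both examples follow the same output layout: a Yes/No verdict on the first line, then a free-text explanation. A minimal sketch of producing and parsing such an output follows. It assumes the checkpoint loads with the standard Hugging Face `transformers` causal language model classes, which the model card does not confirm, and it sends the input as a plain prompt because the prompt template used in training is not documented. `parse_output` and `generate` are hypothetical helper names; the repository id is the one listed for this model on its hosting page.

```python
from typing import Tuple

MODEL_ID = "Lihuchen/pub-guard-llama-8b"  # repository id from the hosting page


def parse_output(text: str) -> Tuple[bool, str]:
    """Split a response into (should_retract, explanation)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty model output")
    verdict = lines[0].lower().rstrip(".")
    if verdict not in ("yes", "no"):
        raise ValueError(f"unexpected verdict line: {lines[0]!r}")
    # Remaining lines are joined into a single explanation string.
    return verdict == "yes", " ".join(lines[1:])


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run the model on a plain prompt (assumed usage, see the note above)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

`parse_output` raises on anything other than a Yes/No first line, so a malformed response is surfaced rather than silently treated as a verdict.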

Model size: 8.03B params (Safetensors, tensor type BF16)
Model repository: Lihuchen/pub-guard-llama-8b