Overview

The model is roberta-base fine-tuned on the Kaggle fake-and-real-news-dataset. It reaches a reported 100% accuracy on that dataset. The model takes a news article and predicts whether it is real or fake. The input should have the following format:

<title> TITLE HERE <content> CONTENT HERE <end>

Using this model in your code

To use this model, first download it from the Hugging Face Hub:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("hamzab/roberta-fake-news-classification")

model = AutoModelForSequenceClassification.from_pretrained("hamzab/roberta-fake-news-classification")

Then make a prediction as follows:

import torch

def predict_fake(title, text):
    # Build the input string in the format the model expects.
    input_str = "<title>" + title + "<content>" + text + "<end>"
    inputs = tokenizer(input_str, max_length=512, padding="max_length", truncation=True, return_tensors="pt")
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    with torch.no_grad():
        output = model(inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device))
    # Softmax over the class dimension turns the logits into probabilities.
    probs = torch.softmax(output.logits, dim=-1)[0]
    return dict(zip(["Fake", "Real"], probs.tolist()))

# Replace the placeholder strings with a real headline and article body.
print(predict_fake("HEADLINE HERE", "CONTENT HERE"))
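
The dictionary returned by predict_fake maps "Fake" and "Real" to probabilities. If a single label is needed, one option is to take the more likely class and flag low-confidence cases. The sketch below assumes that output shape; the helper name top_label, the "Uncertain" label, and the 0.8 threshold are hypothetical choices for illustration, not part of the model.

```python
def top_label(scores, threshold=0.8):
    """Pick the most likely class from a {"Fake": p, "Real": q} dict.

    `threshold` is an arbitrary, hypothetical cutoff: predictions whose
    top probability falls below it are reported as "Uncertain".
    """
    label, prob = max(scores.items(), key=lambda kv: kv[1])
    if prob < threshold:
        return "Uncertain", prob
    return label, prob

# Made-up scores standing in for the output of predict_fake(...).
example_scores = {"Fake": 0.93, "Real": 0.07}
print(top_label(example_scores))
```

In practice the argument would be the result of predict_fake(title, text), and the threshold would be tuned to how costly a wrong label is.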

You can also use Gradio to test the model interactively:

import gradio as gr
# gr.Textbox replaces the older gr.inputs.Textbox, which recent Gradio releases no longer provide.
inputs = [gr.Textbox(lines=1, label="headline"), gr.Textbox(lines=6, label="content")]
gr.Interface(fn=predict_fake, inputs=inputs, outputs="label").launch(share=True)