---
license: mit
widget:
- text: "Some ninja attacked the White House."
  example_title: "Fake example 1"
language:
- en
tags:
- classification
datasets:
- "fake-and-real-news-dataset on kaggle"
---
## Overview
This model is `roberta-base` fine-tuned on the [fake-and-real-news-dataset](https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset). It reaches 100% accuracy on that dataset.
The model takes a news article and predicts whether it is real or fake.
The input should be formatted as follows:
```
<title> TITLE HERE <content> CONTENT HERE <end>
```
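As a minimal sketch of building that input string, the helper below wraps a title and body in the three markers. The name `format_input` is hypothetical (it is not part of the model or of `transformers`), and it concatenates the parts without extra spaces, mirroring what `predict_fake` does in the usage section of this card.

```python
def format_input(title: str, content: str) -> str:
    """Wrap a headline and article body in the markers the model expects.

    Hypothetical helper for illustration; it joins the parts directly,
    with no added whitespace around the markers.
    """
    return "<title>" + title + "<content>" + content + "<end>"


example = format_input("Some title", "Some content")
print(example)  # <title>Some title<content>Some content<end>
```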
## Using this model in your code
To use this model, first download it from the Hugging Face Hub:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("hamzab/roberta-fake-news-classification")
model = AutoModelForSequenceClassification.from_pretrained("hamzab/roberta-fake-news-classification")
```
Then make a prediction as follows:
```python
import torch

def predict_fake(title, text):
    # Build the input string in the format the model expects.
    input_str = "<title>" + title + "<content>" + text + "<end>"
    input_ids = tokenizer.encode_plus(input_str, max_length=512, padding="max_length", truncation=True, return_tensors="pt")
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    with torch.no_grad():
        output = model(input_ids["input_ids"].to(device), attention_mask=input_ids["attention_mask"].to(device))
    # Softmax over the class dimension converts the logits to probabilities.
    probs = torch.softmax(output.logits, dim=1)[0]
    return dict(zip(["Fake", "Real"], [p.item() for p in probs]))

# Replace the placeholder strings with a real headline and article body.
print(predict_fake("<HEADLINE-HERE>", "<CONTENT-HERE>"))
```
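`predict_fake` returns a dictionary that maps each label to a probability. If a single decision is needed, one option is to pick the label with the highest score. The sketch below assumes a hypothetical helper named `top_label` and uses made-up scores in place of real model output.

```python
def top_label(scores: dict) -> tuple:
    """Return the (label, probability) pair with the highest probability.

    Hypothetical helper; `scores` is expected to look like the dictionary
    returned by predict_fake, e.g. {"Fake": 0.9, "Real": 0.1}.
    """
    return max(scores.items(), key=lambda item: item[1])


# Illustrative scores only, not actual model output.
label, prob = top_label({"Fake": 0.9, "Real": 0.1})
print(label, prob)  # Fake 0.9
```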
You can also use Gradio to test the model interactively:
```python
import gradio as gr
# gr.Textbox replaces the deprecated gr.inputs.Textbox namespace.
iface = gr.Interface(fn=predict_fake, inputs=[gr.Textbox(lines=1, label="headline"), gr.Textbox(lines=6, label="content")], outputs="label")
iface.launch(share=True)
```