BERT Paraphrase Detection (GLUE MRPC)

This model is fine-tuned for paraphrase detection on the GLUE MRPC dataset. Given two sentences, it predicts whether they are paraphrases (i.e., whether they convey the same meaning). This is a binary classification task with the following labels:

  • 1: Paraphrase
  • 0: Not a paraphrase

Model Overview

  • Developer: Parit Kansal
  • Model Type: Sequence Classification (Binary)
  • Language(s): English
  • Pre-trained Model: BERT (bert-base-uncased)

Intended Use

This model is designed to assess whether two sentences convey the same meaning. It can be applied in various scenarios, including:

  • Duplicate Question Detection: Identifying similar questions in QA systems (see the sketch after this list).
  • Plagiarism Detection: Detecting if content is copied and rephrased.
  • Summarization Alignment: Matching sentences from summaries to the original content.
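
A minimal sketch of the duplicate-question scenario, assuming the model is available under the repo id shown on this card (ParitKansal/BERT_Paraphrase_Detection_GLUE_MRPC); the question strings are made up for illustration:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Repo id taken from this card; adjust if the model lives elsewhere
model_id = "ParitKansal/BERT_Paraphrase_Detection_GLUE_MRPC"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical question bank and a newly submitted question
new_question = "How do I reset my account password?"
existing_questions = [
    "What is the capital of France?",
    "How can I change the password on my account?",
    "How do I delete my account?",
]

# Tokenize all pairs as one batch; padding aligns the sequence lengths
inputs = tokenizer(
    [new_question] * len(existing_questions),
    existing_questions,
    truncation=True,
    padding=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1).tolist()

# Keep the questions the model labels as paraphrases (label 1)
duplicates = [q for q, p in zip(existing_questions, predictions) if p == 1]
print(duplicates)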

Example Usage

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and tokenizer
model_id = "ParitKansal/BERT_Paraphrase_Detection_GLUE_MRPC"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Select a device once and move the model to it
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def make_prediction(text1, text2):
    # Encode the sentence pair as a single input (BERT inserts [SEP] between the two)
    inputs = tokenizer(text1, text2, truncation=True, padding=True, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits
    # Class 1 = paraphrase, class 0 = not a paraphrase
    return torch.argmax(logits, dim=-1).item()

# Example usage
text1 = "The quick brown fox jumps over the lazy dog."
text2 = "A fast brown fox leaps over a lazy dog."
prediction = make_prediction(text1, text2)
print(f"Prediction: {prediction}")  # 1 = paraphrase, 0 = not a paraphrase

Training Details

Training Data

The model was fine-tuned on the GLUE MRPC dataset, which contains pairs of sentences labeled as either paraphrases or not.
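
For reference, the dataset can be loaded with the datasets library (a sketch, not the exact preprocessing used for training):

from datasets import load_dataset

# GLUE MRPC: sentence pairs labeled 1 (paraphrase) or 0 (not a paraphrase)
mrpc = load_dataset("glue", "mrpc")
print(mrpc)              # train / validation / test splits
print(mrpc["train"][0])  # {'sentence1': ..., 'sentence2': ..., 'label': ..., 'idx': ...}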

Training Procedure

  • Number of Epochs: 2
  • Metrics Used:
    • Accuracy
    • Precision
    • Recall
    • F1 Score
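
The exact training script is not included in this card. A minimal fine-tuning sketch consistent with these settings (bert-base-uncased, GLUE MRPC, 2 epochs) using the Trainer API might look like the following; every hyperparameter other than the epoch count is an assumption:

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize the MRPC sentence pairs
mrpc = load_dataset("glue", "mrpc")
def tokenize(batch):
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)
mrpc = mrpc.map(tokenize, batched=True)

# Assumed hyperparameters, apart from num_train_epochs=2
args = TrainingArguments(
    output_dir="bert-mrpc-paraphrase",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=mrpc["train"],
    eval_dataset=mrpc["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()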

Training Logs (Summary)

  Epoch   Avg Loss   Accuracy   Precision   Recall   F1 Score
  1       0.5443     73.45%     72.28%      73.45%   70.83%
  2       0.2756     89.34%     89.25%      89.34%   89.27%

Evaluation

Performance Metrics

The model's performance was evaluated using the following metrics:

  • Accuracy: Percentage of correct predictions.
  • Precision: Proportion of positive identifications that were actually correct.
  • Recall: Proportion of actual positives that were correctly identified.
  • F1 Score: The harmonic mean of Precision and Recall.
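
The card does not state how the per-class scores were averaged; a sketch of one common way to compute these metrics with scikit-learn, assuming weighted averaging:

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(labels, predictions):
    # labels / predictions are arrays of 0s and 1s
    accuracy = accuracy_score(labels, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, predictions, average="weighted"
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy example with made-up labels and predictions
labels = np.array([1, 0, 1, 1, 0])
predictions = np.array([1, 0, 0, 1, 0])
print(compute_metrics(labels, predictions))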

Test Set Results

  Epoch   Avg Loss   Accuracy   Precision   Recall   F1 Score
  1       0.3976     82.60%     82.26%      82.60%   81.93%
  2       0.3596     84.80%     84.94%      84.80%   84.87%