---
license: other
license_name: saipl
license_link: LICENSE
datasets:
- wikimedia/wikipedia
- rexarski/eli5_category
language:
- en
base_model:
- FacebookAI/roberta-large
pipeline_tag: text-classification
library_name: transformers
tags:
- generated_text_detection
- llm_content_detection
- AI_detection
---
<p align="center">
<img src="SA_logo.png" alt="SuperAnnotate Logo" width="100" height="100"/>
</p>
<h1 align="center">SuperAnnotate</h1>
<h3 align="center">
AI Detector<br/>
Fine-Tuned RoBERTa Large<br/>
</h3>
## Description
The model is designed to detect machine-generated (synthetic) text. \
Reliable detection of generated content is critical for determining the true author of a text: it helps keep training data clean and supports detection of fraud and cheating in scientific and educational settings. \
A couple of articles about this problem: [*Problems with Synthetic Data*](https://www.aitude.com/problems-with-synthetic-data/) | [*Risk of LLMs in Education*](https://publish.illinois.edu/teaching-learninghub-byjen/risk-of-llms-in-education/)
## Model Details
### Model Description
- **Model type:** A custom binary sequence-classification architecture built on pre-trained RoBERTa, with a single output logit.
- **Language(s):** Primarily English.
- **License:** [SAIPL](https://huggingface.co/SuperAnnotate/roberta-large-llm-content-detector-V2/blob/main/LICENSE)
- **Finetuned from model:** [RoBERTa Large](https://huggingface.co/FacebookAI/roberta-large)
### Model Sources
- **Repository:** [GitHub](https://github.com/superannotateai/generated_text_detector) for HTTP service
### Training Data
The training dataset for this version includes **44k text-label samples**, split equally between two parts:
1. **Custom Generation**: The first half of the dataset was generated with custom, specially designed prompts, paired with human-written texts sourced from three domains:
- [**Wikipedia**](https://huggingface.co/datasets/wikimedia/wikipedia)
- [**Reddit ELI5 QA**](https://huggingface.co/datasets/rexarski/eli5_category)
- [**Scientific Papers**](https://www.tensorflow.org/datasets/catalog/scientific_papers) (extended to include the full text of sections).
Texts were generated by 14 different models across four major LLM families (GPT, LLaMA, Anthropic, and Mistral). Each sample consists of a single prompt paired with one human-written and one generated response, though prompts were excluded from training inputs.
2. **RAID Train Data Stratified Subset**: The second half is a carefully selected stratified subset of the RAID train dataset, ensuring equal representation across domains, model types, and attack methods (see the sampling sketch below). Each example pairs a human-authored text with a corresponding machine-generated response (produced by a single model with specific parameters and attacks applied).
This balanced dataset structure maintains approximately equal proportions of human and generated text samples, ensuring that each prompt aligns with one authentic and one generated answer.
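For illustration only, a minimal sketch of how such an equally represented subset could be drawn; the column names (`domain`, `model`, `attack`) and group size are hypothetical, not taken from the actual pipeline:
```python
import pandas as pd

def stratified_subset(df: pd.DataFrame, per_group: int, seed: int = 42) -> pd.DataFrame:
    """Draw an equal number of examples from every (domain, model, attack) combination."""
    return (
        df.groupby(["domain", "model", "attack"], group_keys=False)
          .apply(lambda g: g.sample(n=min(per_group, len(g)), random_state=seed))
          .reset_index(drop=True)
    )
```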
> [!NOTE]
> Additionally, the n-grams (n from 2 to 5) most strongly correlated with the target labels, as measured by a chi-squared test, were identified and removed from the training data.
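For illustration, a rough sketch of how such n-grams could be identified with scikit-learn; the tooling, `top_k`, and `min_df` values here are assumptions, not the exact pipeline used for this model:
```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

def top_correlated_ngrams(texts, labels, top_k: int = 100):
    """Rank 2- to 5-grams by chi-squared correlation with the binary label."""
    vectorizer = CountVectorizer(ngram_range=(2, 5), min_df=5)
    counts = vectorizer.fit_transform(texts)   # sparse document / n-gram count matrix
    scores, _ = chi2(counts, labels)           # chi-squared statistic per n-gram
    ngrams = vectorizer.get_feature_names_out()
    ranked = sorted(zip(ngrams, scores), key=lambda pair: pair[1], reverse=True)
    return [ngram for ngram, _ in ranked[:top_k]]
```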
### Peculiarity
During training, one of the priorities was not only maximizing prediction quality but also avoiding overfitting and obtaining an adequately confident, well-calibrated predictor. \
We were able to achieve both good model calibration and high prediction accuracy.
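As an illustration of how calibration can be checked on held-out predictions, here is a minimal expected-calibration-error (ECE) sketch; the probability and label arrays are placeholders, and this is not the evaluation code used for this model:
```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins: int = 10) -> float:
    """Average |mean predicted probability - empirical positive rate| over equal-width bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(probs, edges[1:-1])  # bin index in [0, n_bins - 1]
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(probs[mask].mean() - labels[mask].mean())
    return ece

# Example: well-calibrated predictions give a small ECE
print(expected_calibration_error([0.1, 0.9, 0.8, 0.2], [0, 1, 1, 0]))
```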
## Usage
**Prerequisites**: \
Install *generated_text_detector* by running the following command: \
```pip install git+https://github.com/superannotateai/[email protected]```
### Native Usage
```python
from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier
from generated_text_detector.utils.preprocessing import preprocessing_text
from transformers import AutoTokenizer
import torch.nn.functional as F

# Load the fine-tuned detector and its tokenizer from the Hugging Face Hub
model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")
tokenizer = AutoTokenizer.from_pretrained("SuperAnnotate/ai-detector")
model.eval()
text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."
# Apply the same text preprocessing that was used during training
text_example = preprocessing_text(text_example)

tokens = tokenizer.encode_plus(
    text_example,
    add_special_tokens=True,
    max_length=512,
    padding='longest',
    truncation=True,
    return_token_type_ids=True,
    return_tensors="pt"
)

# The classifier returns a pair whose second element is the raw logit
_, logits = model(**tokens)

# Sigmoid maps the logit to a generated-text probability score
proba = F.sigmoid(logits).squeeze(1).item()

print(proba)
```
### Usage in Detector Wrapper
```python
from generated_text_detector.utils.text_detector import GeneratedTextDetector
# The wrapper bundles preprocessing, tokenization, and inference in one object
detector = GeneratedTextDetector(
    "SuperAnnotate/ai-detector",
    device="cuda",
    preprocessing=True
)
text_example = "It's not uncommon for people to develop allergies or intolerances to certain foods as they get older. It's possible that you have always had a sensitivity to lactose (the sugar found in milk and other dairy products), but it only recently became a problem for you. This can happen because our bodies can change over time and become more or less able to tolerate certain things. It's also possible that you have developed an allergy or intolerance to something else that is causing your symptoms, such as a food additive or preservative. In any case, it's important to talk to a doctor if you are experiencing new allergy or intolerance symptoms, so they can help determine the cause and recommend treatment."
# Run detection and print the resulting report
res = detector.detect_report(text_example)
print(res)
```
## Training Details
A custom architecture was chosen because it performs binary classification with a single model output and allows label smoothing to be configured directly in the loss function (a sketch follows the hyperparameter list below).
**Training Arguments**:
- **Base Model**: [FacebookAI/roberta-large](https://huggingface.co/FacebookAI/roberta-large)
- **Epochs**: 20
- **Learning Rate**: 5e-05
- **Weight Decay**: 0.0033
- **Label Smoothing**: 0.38
- **Warmup Epochs**: 2
- **Optimizer**: SGD
- **Gradient Clipping**: 3.0
- **Scheduler**: Cosine with hard restarts
- **Number Scheduler Cycles**: 6
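For illustration, a minimal sketch of how label smoothing can be folded into a single-logit binary loss and how the listed optimizer and scheduler settings map onto PyTorch and `transformers` utilities. This is an assumption-based sketch, not the released training code; `steps_per_epoch` is a placeholder, and the checkpoint is loaded only as a stand-in for the model being fine-tuned.
```python
import torch
import torch.nn.functional as F
from transformers import get_cosine_with_hard_restarts_schedule_with_warmup
from generated_text_detector.utils.model.roberta_classifier import RobertaClassifier

def smoothed_bce_loss(logits: torch.Tensor, targets: torch.Tensor, smoothing: float = 0.38) -> torch.Tensor:
    """Binary cross-entropy on a single logit, with hard 0/1 targets pulled toward 0.5."""
    soft_targets = targets * (1.0 - smoothing) + 0.5 * smoothing
    return F.binary_cross_entropy_with_logits(logits.squeeze(-1), soft_targets)

# Stand-in for the model being fine-tuned
model = RobertaClassifier.from_pretrained("SuperAnnotate/ai-detector")
steps_per_epoch = 1000  # placeholder for len(train_dataloader)

optimizer = torch.optim.SGD(model.parameters(), lr=5e-05, weight_decay=0.0033)
scheduler = get_cosine_with_hard_restarts_schedule_with_warmup(
    optimizer,
    num_warmup_steps=2 * steps_per_epoch,     # 2 warmup epochs
    num_training_steps=20 * steps_per_epoch,  # 20 epochs total
    num_cycles=6,
)

# Inside the training loop, gradients would be clipped to the listed norm:
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=3.0)
```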
## Performance
This solution has been validated on a stratified subset of the [RAID](https://raid-bench.xyz/) train dataset. \
This benchmark includes a diverse dataset covering:
- 11 LLM models
- 11 adversarial attacks
- 8 domains
The performance of the detector by text source:
| Source | Accuracy |
|---------------|----------|
| ***Human*** | 0.731 |
| ChatGPT | 0.992 |
| GPT-2 | 0.649 |
| GPT-3 | 0.945 |
| GPT-4 | 0.985 |
| LLaMA-Chat | 0.980 |
| Mistral | 0.644 |
| Mistral-Chat | 0.975 |
| Cohere | 0.823 |
| Cohere-Chat | 0.906 |
| MPT | 0.757 |
| MPT-Chat | 0.943 |
| Average |**0.852** |