---
language: en
tags:
- toxicity
- text-classification
- roberta
- jigsaw
license: mit
datasets:
- jigsaw-toxic-comment-classification-challenge
base_model:
- FacebookAI/roberta-base
---

# Model Card for RoBERTa Toxicity Classifier

This model is a fine-tuned version of RoBERTa-base for toxicity classification, capable of identifying six different types of toxic content in text.

## Model Details

### Model Description

This model is a fine-tuned version of RoBERTa-base, trained to identify toxic content across multiple categories. It was developed to help identify and moderate harmful content in text data.

- **Developed by:** Bonnavaud Laura, Cousseau Martin, Laborde Stanislas, Rady Othmane, Satouri Amani
- **Model type:** RoBERTa-based text classification
- **Language(s):** English
- **License:** MIT
- **Finetuned from model:** FacebookAI/roberta-base

## Uses

### Direct Use

The model can be used directly for:
- Content moderation
- Toxic comment detection
- Online safety monitoring
- Comment filtering systems
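As a sketch of direct use, the snippet below shows how the model's six-way multi-label output could be post-processed. Because this is a multi-label classifier, each label gets an independent sigmoid rather than a softmax. The hub-loading lines are commented out and the repository id is a placeholder; dummy logits stand in for a real forward pass.

```python
import torch

# The six Jigsaw labels, in the order used by this model card.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def predict_labels(logits: torch.Tensor, threshold: float = 0.5) -> list[dict]:
    """Turn raw multi-label logits into per-label probabilities and flags."""
    probs = torch.sigmoid(logits)  # independent sigmoid per label (multi-label)
    results = []
    for row in probs:
        results.append({
            label: {"probability": float(p), "flagged": bool(p >= threshold)}
            for label, p in zip(LABELS, row)
        })
    return results

# In practice the logits would come from the fine-tuned model, e.g.:
# from transformers import AutoTokenizer, AutoModelForSequenceClassification
# tok = AutoTokenizer.from_pretrained("your-org/roberta-toxicity")    # placeholder repo id
# model = AutoModelForSequenceClassification.from_pretrained("your-org/roberta-toxicity")
# logits = model(**tok(["some comment"], return_tensors="pt")).logits

# Dummy logits for two comments (one clearly toxic, one clean):
dummy = torch.tensor([[3.0, -2.0, 1.5, -4.0, 2.0, -3.0],
                      [-5.0, -6.0, -5.0, -6.0, -5.0, -6.0]])
preds = predict_labels(dummy)
```

The 0.5 threshold is a default, not a tuned value; see the recommendations below on choosing thresholds for your use case.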

### Out-of-Scope Use

This model should not be used for:
- Legal decision making
- Automated content removal without human review
- Processing non-English content
- Making definitive judgments about individuals or groups

## Bias, Risks, and Limitations

- The model may reflect biases present in the training data
- Performance may vary across different demographics and contexts
- False positives/negatives can occur and should be considered in deployment
- Not suitable for high-stakes decisions without human oversight

### Recommendations

Users should:
- Implement human review processes alongside model predictions
- Monitor model performance across different demographic groups
- Use confidence thresholds appropriate for their use case
- Be transparent about the use of automated toxicity detection
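One way to combine confidence thresholds with human review, as recommended above, is a simple two-threshold router; the threshold values here are illustrative, not tuned:

```python
def route_prediction(prob: float,
                     auto_flag: float = 0.9,
                     auto_pass: float = 0.1) -> str:
    """Route a per-label toxicity probability to an action.

    Scores above `auto_flag` are treated as confidently toxic, scores
    below `auto_pass` as confidently clean, and everything in between
    is escalated to a human moderator.
    """
    if prob >= auto_flag:
        return "flag"
    if prob <= auto_pass:
        return "pass"
    return "human_review"
```

Widening the band between the two thresholds sends more borderline cases to moderators; narrowing it increases automation at the cost of more unreviewed false positives and negatives.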

## Training Details

### Training Data

The model was trained on the Jigsaw Toxic Comment Classification Challenge dataset, which includes comments labeled for toxic content across six categories:
- Toxic
- Severe Toxic
- Obscene
- Threat
- Insult
- Identity Hate

The dataset was split into training and test sets with a 90-10 ratio, using stratified sampling on the per-comment sum of toxic labels to keep the label distribution balanced across splits. Missing comments were replaced with empty strings, and all texts were cleaned and tokenized in batches of 48 samples.
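The stratified 90-10 split described above could be sketched as follows; the grouping key (the per-comment sum of the six labels) follows the description, while the helper itself is illustrative rather than the project's actual preprocessing code.

```python
import random
from collections import defaultdict

def stratified_split(label_sums, test_frac=0.1, seed=42):
    """Split sample indices into train/test, stratifying on the
    per-comment sum of toxic labels (0..6)."""
    by_stratum = defaultdict(list)
    for idx, s in enumerate(label_sums):
        by_stratum[s].append(idx)

    rng = random.Random(seed)
    train, test = [], []
    for stratum in by_stratum.values():
        rng.shuffle(stratum)
        n_test = round(len(stratum) * test_frac)
        test.extend(stratum[:n_test])
        train.extend(stratum[n_test:])
    return train, test

# Example: 100 comments with three different label-sum strata.
sums = [i % 3 for i in range(100)]
train_idx, test_idx = stratified_split(sums)
```

Because the split is done within each stratum, rare combinations (e.g. comments with many toxic labels at once) appear in both splits in roughly the same proportion.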

### Training Procedure

#### Training Hyperparameters

- **Training regime:** FP16 mixed precision
- **Optimizer:** AdamW
- **Learning rate:** 2e-5
- **Batch size:** 320
- **Epochs:** Up to 40 with early stopping (patience=15)
- **Max sequence length:** 128
- **Warmup ratio:** 0.1
- **Weight decay:** 0.1
- **Gradient accumulation steps:** 2
- **Scheduler:** Linear
- **DataLoader workers:** 2
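One reading of the numbers above (an assumption, since the card does not say whether 320 is the per-device or effective batch size) is that an effective batch of 320 comes from combining the per-device batch, the gradient accumulation steps, and the 4 GPUs listed under Environmental Impact:

```python
# Assumed decomposition of the effective batch size; the per-device
# value is hypothetical, chosen so the product works out.
num_gpus = 4          # 4x NVIDIA A10 (see Environmental Impact)
grad_accum_steps = 2  # from the hyperparameter list above
per_device_batch = 40 # hypothetical

effective_batch = per_device_batch * grad_accum_steps * num_gpus
```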

### Evaluation

#### Testing Data, Factors & Metrics

The model was evaluated on a held-out test set from the Jigsaw dataset.

#### Metrics

The model was evaluated using comprehensive metrics for multi-label classification:

Per class metrics:
- Accuracy
- Precision
- Recall
- F1 Score

Aggregate metrics:
- Overall accuracy
- Macro-averaged metrics:
  - Macro Precision
  - Macro Recall
  - Macro F1
- Micro-averaged metrics:
  - Micro Precision
  - Micro Recall
  - Micro F1

Best model selection was based on F1 score during training.
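The macro and micro averages listed above can be computed from per-label confusion counts. This is a minimal pure-Python sketch of the two averaging schemes, not the project's evaluation code: macro averaging scores each label separately and averages the scores, while micro averaging pools all counts first.

```python
def f1(p, r):
    return 2 * p * r / (p + r) if (p + r) else 0.0

def multilabel_metrics(y_true, y_pred):
    """Macro/micro F1 for binary label matrices.

    y_true, y_pred: lists of equal-length 0/1 rows, one row per
    sample, one column per label.
    """
    n_labels = len(y_true[0])
    tp, fp, fn = [0] * n_labels, [0] * n_labels, [0] * n_labels
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            tp[j] += t and p
            fp[j] += (not t) and p
            fn[j] += t and (not p)

    # Macro: compute precision/recall/F1 per label, then average.
    precs = [tp[j] / (tp[j] + fp[j]) if tp[j] + fp[j] else 0.0 for j in range(n_labels)]
    recs = [tp[j] / (tp[j] + fn[j]) if tp[j] + fn[j] else 0.0 for j in range(n_labels)]
    macro_f1 = sum(f1(p, r) for p, r in zip(precs, recs)) / n_labels

    # Micro: pool all counts across labels, then score once.
    TP, FP, FN = sum(tp), sum(fp), sum(fn)
    micro_p = TP / (TP + FP) if TP + FP else 0.0
    micro_r = TP / (TP + FN) if TP + FN else 0.0
    return {"macro_f1": macro_f1, "micro_f1": f1(micro_p, micro_r)}
```

With imbalanced labels like `threat` and `identity_hate`, macro F1 weights every label equally, while micro F1 is dominated by the frequent labels; reporting both, as this card does, covers both views.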

## Environmental Impact

- **Hardware Type:** 4x NVIDIA A10 24GB
- **Training time:** 20 minutes
- **Cloud Provider:** ESIEA Cluster

## Technical Specifications

### Model Architecture and Technical Details

- Base model: RoBERTa-base
- Problem type: Multi-label classification
- Number of labels: 6
- Output layers: Linear classification head for multi-label prediction
- Number of parameters: ~125M
- Training optimizations:
  - Distributed Data Parallel (DDP) support with NCCL backend
  - FP16 mixed precision training
  - Memory optimizations:
    - Gradient accumulation (steps=2)
    - DataLoader pinned memory
    - Efficient batch processing
- Caching system for tokenized data to improve training efficiency
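The tokenized-data cache mentioned above could look roughly like the following: tokenized output is keyed by a hash of the raw texts plus the tokenizer settings and written to disk, so repeated runs skip re-tokenization. The file layout and key scheme here are assumptions, not the project's implementation.

```python
import hashlib
import pickle
from pathlib import Path

CACHE_DIR = Path("tokenizer_cache")  # hypothetical cache location

def cached_tokenize(texts, tokenize_fn, settings: str):
    """Tokenize `texts` with `tokenize_fn`, caching the result on disk.

    The cache key covers both the input texts and the tokenizer
    settings (model name, max length, ...) so a settings change
    invalidates the cache automatically.
    """
    key_src = settings + "\x00" + "\x00".join(texts)
    key = hashlib.sha256(key_src.encode("utf-8")).hexdigest()
    path = CACHE_DIR / f"{key}.pkl"

    if path.exists():
        return pickle.loads(path.read_bytes())  # cache hit: skip tokenization

    encoded = tokenize_fn(texts)
    CACHE_DIR.mkdir(exist_ok=True)
    path.write_bytes(pickle.dumps(encoded))
    return encoded
```

With a fixed max sequence length of 128 and a static dataset, this kind of cache turns tokenization into a one-time cost across the up-to-40 training epochs.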

### Hardware Requirements

Minimum requirements for inference:
- RAM: 4GB
- CPU: Modern processor supporting AVX instructions
- GPU: Optional, but recommended for batch processing