---
license: mit
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
datasets:
- shahxeebhassan/human_vs_ai_sentences
pipeline_tag: text-classification
library_name: transformers
---

## Model Description

This model is a fine-tuned version of `google-bert/bert-base-uncased` for AI content detection. It classifies a single sentence as either AI-generated or human-written.

## Training Data

The model was trained on a [dataset](https://huggingface.co/datasets/shahxeebhassan/human_vs_ai_sentences) of over 100,000 sentences, each labeled as either AI-generated or human-written. Because the model predicts a label for each individual sentence, it is particularly useful for highlighting AI-written content within larger texts.

## Evaluation Metrics

The model achieved an accuracy of 90% on the validation and test sets.

## Usage

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = BertTokenizer.from_pretrained("shahxeebhassan/bert_base_ai_content_detector")
model = BertForSequenceClassification.from_pretrained("shahxeebhassan/bert_base_ai_content_detector")

# Tokenize a single sentence and return PyTorch tensors.
inputs = tokenizer(
    "Distance learning will not benefit students because the students are not able "
    "to develop as good of a relationship with their teachers.",
    return_tensors="pt",
)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
# Convert logits to class probabilities and pick the most likely class index.
probabilities = torch.softmax(logits, dim=1).cpu().numpy()
predicted_label = probabilities.argmax(axis=1)
print(f"Predicted label for the input text: {predicted_label[0]}")
```
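
Because the model scores one sentence at a time, highlighting AI-written passages in a longer text means splitting the text into sentences first and classifying each one. The following is a minimal sketch of that workflow, not part of the model repository: `split_sentences`, `label_sentences`, and `make_bert_predictor` are hypothetical helper names, and the regex-based splitter is a simplifying assumption (a proper sentence tokenizer would handle abbreviations and quotes better).

```python
import re
from typing import Callable, List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def label_sentences(text: str, predict: Callable[[str], int]) -> List[Tuple[str, int]]:
    """Pair every sentence in `text` with the class index returned by `predict`."""
    return [(sentence, predict(sentence)) for sentence in split_sentences(text)]


def make_bert_predictor(
    model_id: str = "shahxeebhassan/bert_base_ai_content_detector",
) -> Callable[[str], int]:
    """Build a per-sentence predictor around the model from the Usage section."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained(model_id)
    model = BertForSequenceClassification.from_pretrained(model_id)
    model.eval()

    def predict(sentence: str) -> int:
        # Truncate to BERT's 512-token limit to avoid errors on very long inputs.
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            logits = model(**inputs).logits
        return int(logits.argmax(dim=1).item())

    return predict


if __name__ == "__main__":
    predict = make_bert_predictor()  # downloads the model on first use
    for sentence, label in label_sentences("First sentence. Second sentence!", predict):
        print(label, sentence)
```

This card does not state which class index corresponds to AI-generated text, so inspect `model.config.id2label` or the dataset card before interpreting the printed labels.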