SkitCon/gec-spanish-BARTO-COWS-L2H


SkitCon committed on Dec 3, 2024

Commit 207d1d3 · verified · 1 Parent(s): 4f5a601

Add basic model card

Files changed (1)

README.md +26 -1

@@ -13,4 +13,29 @@ tags:
 - seq2seq
 - bart
 - cows-l2h
----

+---
+This model was trained on 80% of the COWS-L2H dataset for grammatical error correction of Spanish text. The corpus was split into sentences before training, so the model is fine-tuned for SENTENCE CORRECTION and will likely perform poorly on an entire paragraph. To correct a paragraph, split it into sentences and run the model on each sentence.
+BLEU: 0.797 on COWS-L2H
+Example usage:
+```python
+from transformers import AutoTokenizer, BartForConditionalGeneration
+tokenizer = AutoTokenizer.from_pretrained("SkitCon/gec-spanish-BARTO-COWS-L2H")
+model = BartForConditionalGeneration.from_pretrained("SkitCon/gec-spanish-BARTO-COWS-L2H")
+input_sentences = ["Yo va al tienda.", "Espero que tú ganas."]
+# Pad so that sentences of different lengths can be batched into one tensor
+tokenized_text = tokenizer(input_sentences, padding=True, return_tensors="pt")
+input_ids = tokenized_text["input_ids"]
+attention_mask = tokenized_text["attention_mask"]
+outputs = model.generate(input_ids=input_ids, attention_mask=attention_mask)
+# Decode each generated sequence back to text, dropping special tokens
+for sentence in tokenizer.batch_decode(outputs, skip_special_tokens=True):
+    print(sentence)
+```
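
The model card recommends splitting a paragraph into sentences and correcting each one separately. A minimal sketch of that workflow follows. It assumes a naive regex splitter (the card does not say which sentence splitter was used on the corpus, so a dedicated one may match the training data better) and a hypothetical `correct_batch` callable that maps a list of sentences to a list of corrected sentences, for example a thin wrapper around the tokenizer, `model.generate`, and `batch_decode` calls in the README example.

```python
import re
from typing import Callable, List

# Assumption: a sentence ends at '.', '!', '?' or '…' followed by whitespace.
# This is a naive rule and will mis-split abbreviations such as "Sr. García".
_SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_sentences(paragraph: str) -> List[str]:
    """Split a paragraph into sentences using the naive rule above."""
    parts = _SENTENCE_END.split(paragraph.strip())
    return [part for part in parts if part]


def correct_paragraph(
    paragraph: str,
    correct_batch: Callable[[List[str]], List[str]],
) -> str:
    """Correct a paragraph sentence by sentence.

    `correct_batch` is a hypothetical helper (not part of the model card):
    it takes a list of sentences and returns the corrected sentences in
    the same order.
    """
    sentences = split_sentences(paragraph)
    if not sentences:
        return ""
    # Rejoin the corrected sentences with single spaces.
    return " ".join(correct_batch(sentences))
```

Passing the sentences as one batch keeps the model call in a single place; for long documents the list could be fed to `correct_batch` in smaller chunks to bound memory use.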