If you already know T5, FLAN-T5 is just better at everything. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks, also covering more languages. As stated in the first few lines of the abstract:
Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.
Disclaimer: Content from this model card has been written by the Hugging Face team, and parts of it were copied from the T5 model card.
Limitations:
- The model may sometimes generate irrelevant keywords
- Performance may vary depending on the length and complexity of the input text
- For best results, use long, clean texts that stay within the length limit
- Input length is limited to 512 tokens due to the Flan-T5 architecture
- The model is trained on English text and may not perform well on other languages
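The 512-token limit above means longer documents have to be split before they reach the model. The following is a minimal sketch of one way to do that, assuming a hypothetical helper named `chunk_by_tokens` and a caller-supplied `count_tokens` function. The whitespace-based default is only a rough stand-in for the real tokenizer's count, and the sentence split on ". " is deliberately naive.

```python
from typing import Callable, List


def chunk_by_tokens(
    text: str,
    max_tokens: int = 512,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack sentences into chunks of at most max_tokens each.

    Hypothetical helper, not part of the model or of transformers.
    A single sentence longer than max_tokens is kept as its own chunk
    and would still need truncation downstream.
    """
    # Naive sentence split on ". "; adequate for a sketch only.
    sentences = [s.strip() for s in text.split(". ") if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current}. {sentence}" if current else sentence
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

For a count that matches the model, one option is to pass `count_tokens=lambda s: len(tokenizer(s)["input_ids"])`, where `tokenizer` is the `AutoTokenizer` instance loaded in the Usage section.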
Usage
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text2text-generation", model="saidsef/flan-t5-small-tuned-tech-docs")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("saidsef/flan-t5-small-tuned-tech-docs")
model = AutoModelForSeq2SeqLM.from_pretrained("saidsef/flan-t5-small-tuned-tech-docs")
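The snippets above only load the model. A minimal sketch of a single inference call follows. It assumes the model takes the raw document text as input with no task prefix (this card does not specify a prompt format) and that `max_new_tokens=32` is a reasonable output budget. `load_pipe` and `extract_keywords` are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, List

MODEL_ID = "saidsef/flan-t5-small-tuned-tech-docs"


def load_pipe():
    """Build the text2text-generation pipeline (downloads the model on first use)."""
    # Imported lazily so extract_keywords can be exercised without transformers.
    from transformers import pipeline

    return pipeline("text2text-generation", model=MODEL_ID)


def extract_keywords(
    text: str,
    generate: Callable[..., List[Dict[str, Any]]],
    max_new_tokens: int = 32,
) -> str:
    """Run one generation call and return the generated string.

    Assumption: the raw text is passed as-is, with no task prefix.
    truncation=True clips inputs that exceed the tokenizer's maximum length.
    """
    outputs = generate(text, max_new_tokens=max_new_tokens, truncation=True)
    # text2text-generation pipelines return a list of dicts keyed by "generated_text".
    return outputs[0]["generated_text"].strip()
```

A call would then look like `extract_keywords("Kubernetes schedules containers across a cluster of nodes.", load_pipe())`. Taking the pipeline as an argument keeps model loading separate from inference, so the same loaded pipeline can be reused across many documents.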
Framework versions
- Transformers 4.45.1
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.0
Ethical Considerations
When using this model, consider the potential impact of automated keyword extraction on content creation and SEO practices. Ensure that the use of this model complies with relevant guidelines and does not contribute to the creation of misleading or spammy content.
Base model: google/flan-t5-small