πŸƒ Foglietta - A super tiny model for English -> Italian translation

Foglietta is an encoder-decoder transformer model for English-to-Italian text translation, based on google/t5-efficient-tiny. It was trained on the en-it sections of Helsinki-NLP/opus-100 and Helsinki-NLP/europarl.

Be advised: because the model is very small, it will make translation errors.

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the model and tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("LeonardPuettmann/Foglietta-mt-en-it")
model = AutoModelForSeq2SeqLM.from_pretrained("LeonardPuettmann/Foglietta-mt-en-it")

def generate_response(input_text):
    # The task prefix must end with a space so it does not run into the input text
    input_ids = tokenizer("translate English to Italian: " + input_text, return_tensors="pt").input_ids
    output = model.generate(input_ids, max_new_tokens=256)
    return tokenizer.decode(output[0], skip_special_tokens=True)

text_to_translate = "I would like a cup of green tea, please."
response = generate_response(text_to_translate)
print(response)
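
If you have several independent sentences, it may be faster to translate them in a single generate call instead of one at a time. The following is a minimal sketch and not part of the original model card: build_prompts and translate_batch are hypothetical helper names, and the tokenizer and model are passed in as arguments rather than read from globals. It assumes the tokenizer defines a padding token, which T5-style tokenizers normally do.

```python
# Task prefix used in the usage example, with a trailing space
TASK_PREFIX = "translate English to Italian: "

def build_prompts(sentences):
    """Prepend the task prefix to each (whitespace-stripped) sentence."""
    return [TASK_PREFIX + sentence.strip() for sentence in sentences]

def translate_batch(sentences, tokenizer, model, max_new_tokens=256):
    """Translate a list of sentences with one padded generate call.

    Hypothetical helper: `tokenizer` and `model` are expected to be the
    objects returned by AutoTokenizer / AutoModelForSeq2SeqLM.
    """
    if not sentences:
        return []
    # padding=True pads shorter prompts so the batch forms one tensor;
    # the returned attention mask tells the model to ignore the padding
    inputs = tokenizer(build_prompts(sentences), return_tensors="pt", padding=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the tokenizer and model loaded as above, a call would look like translate_batch(["Good morning.", "Where is the train station?"], tokenizer, model), which returns one translation per input sentence in the same order.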

Because the model was trained on sentence pairs, it works best when longer text is split into individual sentences first, ideally with spaCy. You can then translate each sentence and join the translations at the end, like this:

# First, install spaCy and the English language model if you haven't already
# !pip install spacy
# !python -m spacy download en_core_web_sm
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import spacy

nlp = spacy.load("en_core_web_sm")

tokenizer = AutoTokenizer.from_pretrained("LeonardPuettmann/Foglietta-mt-en-it")
model = AutoModelForSeq2SeqLM.from_pretrained("LeonardPuettmann/Foglietta-mt-en-it")

def generate_response(input_text):
    input_ids = tokenizer("translate English to Italian: " + input_text, return_tensors="pt").input_ids
    output = model.generate(input_ids, max_new_tokens=256)
    return tokenizer.decode(output[0], skip_special_tokens=True)

text = "How are you doing? Today is a beautiful day. I hope you are doing fine."
doc = nlp(text)
sentences = [sent.text for sent in doc.sents]

# Translate each sentence on its own and collect the results in order
sentence_translations = []
for sentence in sentences:
    sentence_translations.append(generate_response(sentence))

full_translation = " ".join(sentence_translations)
print(full_translation)
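
The split, translate, and join steps above can also be wrapped in one reusable function. This is a hedged sketch rather than the author's implementation: translate_text and naive_split are hypothetical names. naive_split is a crude regex-based fallback for when spaCy is not installed; it will mis-split abbreviations such as "e.g." and is not a substitute for spaCy's sentence segmentation.

```python
import re

def naive_split(text):
    """Crude fallback splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

def translate_text(text, translate_sentence, split_sentences=naive_split):
    """Split text into sentences, translate each one, and join the results.

    translate_sentence: any callable mapping one sentence to its translation,
        for example the generate_response function defined above.
    split_sentences: any callable mapping a text to a list of sentences.
    """
    sentences = split_sentences(text)
    return " ".join(translate_sentence(sentence) for sentence in sentences)
```

To keep spaCy as the splitter, pass it in explicitly, for example translate_text(text, generate_response, lambda t: [s.text for s in nlp(t).sents]). Keeping the splitter and translator as parameters makes it easy to swap either one without touching the joining logic.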

Model size: 15.6M params (Safetensors, tensor type F32)