LEAF: Predicting the Environmental Impact of Food Products based on their Name

The leaf-large model is a BAAI/bge-m3 model fine-tuned on the LEAF dataset.

To load the model, use the following code:

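from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True is required because the model repo ships custom model code.
tokenizer = AutoTokenizer.from_pretrained("baskra/leaf-large")
model = AutoModel.from_pretrained("baskra/leaf-large", trust_remote_code=True)

# Predict the food class and its environmental impact score (ef_score) from the product name.
model(**tokenizer("Nutella", return_tensors="pt"))
# Example output:
# {'logits': tensor([[-12.2842, ...]]), 'class_idx': tensor([1553]), 'ef_score': tensor([0.0129]), 'class': ['Chocolate spread with hazelnuts']}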
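
The output shown above is a dictionary with one entry per input in `class` and `ef_score`. As a minimal sketch, the helpers below pair each product name with its predicted class and score. The names `summarize_predictions` and `predict` are hypothetical and not part of the model repo, and `predict` assumes that the custom model code accepts padded batches, which the original card does not confirm.

```python
from typing import Any, Dict, List, Tuple


def _to_list(values: Any) -> list:
    """Convert a tensor-like object (anything with .tolist()) or a sequence to a list."""
    return values.tolist() if hasattr(values, "tolist") else list(values)


def summarize_predictions(
    names: List[str], output: Dict[str, Any]
) -> List[Tuple[str, str, float]]:
    """Pair each product name with its predicted class and ef_score.

    Assumes `output` has the 'class' and 'ef_score' keys shown in the example output.
    """
    classes = _to_list(output["class"])
    scores = _to_list(output["ef_score"])
    if not (len(names) == len(classes) == len(scores)):
        raise ValueError("names and model output must have the same batch size")
    return [(n, c, float(s)) for n, c, s in zip(names, classes, scores)]


def predict(names: List[str], tokenizer: Any, model: Any) -> List[Tuple[str, str, float]]:
    """Run the model on a batch of product names.

    Assumption: the custom model code handles padded batches of several names.
    """
    import torch  # imported lazily so the helper above works without torch

    batch = tokenizer(names, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        output = model(**batch)
    return summarize_predictions(names, output)
```

With the `tokenizer` and `model` loaded as above, `predict(["Nutella"], tokenizer, model)` would return a list of `(name, class, ef_score)` tuples, provided the batching assumption holds.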

Citation

When using this model, please consider citing it as follows:

BibTeX:

@inproceedings{krahmer-2024-leaf,
    title = "{LEAF}: Predicting the Environmental Impact of Food Products based on their Name",
    author = "Krahmer, Bas",
    editor = "Stammbach, Dominik  and
      Ni, Jingwei  and
      Schimanski, Tobias  and
      Dutia, Kalyan  and
      Singh, Alok  and
      Bingler, Julia  and
      Christiaen, Christophe  and
      Kushwaha, Neetu  and
      Muccione, Veruska  and
      A. Vaghefi, Saeid  and
      Leippold, Markus",
    booktitle = "Proceedings of the 1st Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2024)",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.climatenlp-1.10",
    pages = "133--142",
    abstract = "Although food consumption represents a substantial global source of greenhouse gas emissions, assessing the environmental impact of off-the-shelf products remains challenging. Currently, this information is often unavailable, hindering informed consumer decisions when grocery shopping. The present work introduces a new set of models called LEAF, which stands for Linguistic Environmental Analysis of Food Products. LEAF models predict the life-cycle environmental impact of food products based on their name. It is shown that LEAF models can accurately predict the environmental impact based on just the product name in a multilingual setting, greatly outperforming zero-shot classification methods. Models of varying sizes and capabilities are released, along with the code and dataset to fully reproduce the study.",
}

Model size: 570M params (Safetensors, tensor type F32)