Wineberto labels

Model trained only on wine labels for named entity recognition, using bert-base-uncased as the base model.

Model description

How to use

You can use this model directly for named entity recognition:

>>> from transformers import pipeline
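>>> # aggregation_strategy must be set for the results to carry the 'entity_group' key used below
>>> ner = pipeline('ner', model='winberto-labels', aggregation_strategy='simple')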
>>> tokens = ner("Heitz Cabernet Sauvignon California Napa Valley Napa US")
>>> for t in tokens:
...     print(f"{t['word']}: {t['entity_group']}: {t['score']:.5}")

heitz: producer: 0.99758
cabernet: wine: 0.92263
sauvignon: wine: 0.92472
california: region: 0.53502
napa valley: subregion: 0.79638
us: country: 0.93675
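
For downstream use it can be handier to collect the predictions into one record per label. The following is a minimal sketch, assuming the aggregated output format shown above (a list of dicts with word, entity_group, and score keys). The helper name group_by_entity and the min_score threshold are hypothetical; they are not part of the model or of transformers.

```python
from collections import defaultdict


def group_by_entity(entities, min_score=0.0):
    """Collect predicted words under their entity group, in order of appearance."""
    grouped = defaultdict(list)
    for ent in entities:
        # Skip low-confidence predictions (min_score is a hypothetical knob).
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    # Join words that share a label, e.g. "cabernet" + "sauvignon".
    return {label: " ".join(words) for label, words in grouped.items()}


# Sample predictions copied from the output above.
sample = [
    {"word": "heitz", "entity_group": "producer", "score": 0.99758},
    {"word": "cabernet", "entity_group": "wine", "score": 0.92263},
    {"word": "sauvignon", "entity_group": "wine", "score": 0.92472},
    {"word": "california", "entity_group": "region", "score": 0.53502},
    {"word": "napa valley", "entity_group": "subregion", "score": 0.79638},
    {"word": "us", "entity_group": "country", "score": 0.93675},
]

print(group_by_entity(sample))
print(group_by_entity(sample, min_score=0.6))  # drops the low-scoring region
```

Joining by label assumes each label occurs at most once per wine label; a string with two producers, for example, would need a different merge rule.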

Training data

The BERT model was trained on 50K wine labels derived from https://www.liv-ex.com/wwd/lwin/ and manually annotated with the following entity labels:

"1": "B-classification",
"2": "B-country",
"3": "B-producer",
"4": "B-region",
"5": "B-subregion",
"6": "B-vintage",
"7": "B-wine"

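The label list above can be held as a pair of lookup dictionaries, which is the shape token-classification code usually works with. This is a minimal sketch that uses only the seven ids listed here; id 0 is not given in the source, so it is deliberately left out rather than guessed. The helper entity_type is hypothetical.

```python
# Label ids exactly as listed above; id 0 is not shown in the source.
id2label = {
    1: "B-classification",
    2: "B-country",
    3: "B-producer",
    4: "B-region",
    5: "B-subregion",
    6: "B-vintage",
    7: "B-wine",
}

# Reverse lookup from label name to id.
label2id = {label: idx for idx, label in id2label.items()}


def entity_type(label):
    """Strip the "B-" style prefix, e.g. "B-wine" -> "wine"."""
    return label.split("-", 1)[1] if "-" in label else label


print(label2id["B-producer"])
print([entity_type(label) for label in id2label.values()])
```
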
Training procedure

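from transformers import TrainingArguments

model_id = 'bert-base-uncased'  # base checkpoint being fine-tuned
arguments = TrainingArguments(
    evaluation_strategy="epoch",  # evaluate after every epoch (named eval_strategy in newer transformers releases)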
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    weight_decay=0.01,
)
...
trainer.train()
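
To get a feel for the scale of this run, the settings above can be turned into a rough count of optimizer steps. This is a back-of-the-envelope sketch under loud assumptions: all 50K labels are used for training (the source does not state a train/eval split), training runs on a single device, and there is no gradient accumulation. The function name total_optimizer_steps is hypothetical.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch_size, num_epochs, num_devices=1):
    """Estimate optimizer steps, assuming no gradient accumulation."""
    effective_batch = per_device_batch_size * num_devices
    # The last, partially filled batch of an epoch still counts as one step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * num_epochs


# 50K examples (assumed all used for training), batch size 8, 5 epochs.
print(total_optimizer_steps(50_000, 8, 5))  # 31250
```

With a held-out evaluation set or more than one device the real number would be lower, so treat this as an upper bound under the stated assumptions.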

Model size: 109M params (Safetensors, tensor type F32)