ProdicusII nikolamilosevic committed on
Commit
731585b
·
1 Parent(s): b25e95c

Change, added information about the model type (#8)

- Change, added information about the model type (0b5374e5350fd33f9de52210f4a3890ce1fd200e)


Co-authored-by: Nikola <[email protected]>

Files changed (1)
  1. README.md +35 -3
README.md CHANGED
@@ -29,7 +29,7 @@ library_name: transformers
 
  This model was created during a research collaboration between Bayer Pharma and the Serbian Institute for Artificial Intelligence Research and Development.
  The model is trained on 25+ biomedical NER classes. It can also perform zero-shot inference and can be fine-tuned for new classes with just a few examples (few-shot learning).
- For more details about our methods, please see the paper ["A transformer-based method for zero and few-shot biomedical named entity recognition"](https://arxiv.org/abs/2305.04928).
 
  The model takes two strings as input. String1 is the NER label to search for in the second string; it must be a phrase naming the entity. String2 is a short text in which String1 is searched for semantically.
  The model outputs a list of zeros and ones, one per token of String2 (tokens as produced by the transformer tokenizer), where 1 marks a token that belongs to the named entity.
@@ -47,11 +47,43 @@ encodings = tokenizer(string1, string2, is_split_into_words=False,
  padding=True, truncation=True, add_special_tokens=True, return_offsets_mapping=False,
  max_length=512, return_tensors='pt')
 
- model = BertForTokenClassification.from_pretrained(modelname, num_labels=2)
- prediction_logits = model(**encodings)
  print(prediction_logits)
  ```
 
  ## Available classes
 
  The following datasets and entities were used for training, so they can be used as the label in the first segment (as the first string). Note that multiword strings have been merged.
 
 
  This model was created during a research collaboration between Bayer Pharma and the Serbian Institute for Artificial Intelligence Research and Development.
  The model is trained on 25+ biomedical NER classes. It can also perform zero-shot inference and can be fine-tuned for new classes with just a few examples (few-shot learning).
+ For more details about our methods, please see the paper ["A transformer-based method for zero and few-shot biomedical named entity recognition"](https://arxiv.org/abs/2305.04928). The model corresponds to the BioBERT-based model, trained with 1 in the first segment (check the paper for more details).
 
  The model takes two strings as input. String1 is the NER label to search for in the second string; it must be a phrase naming the entity. String2 is a short text in which String1 is searched for semantically.
  The model outputs a list of zeros and ones, one per token of String2 (tokens as produced by the transformer tokenizer), where 1 marks a token that belongs to the named entity.
 
  padding=True, truncation=True, add_special_tokens=True, return_offsets_mapping=False,
  max_length=512, return_tensors='pt')
 
+ model0 = BertForTokenClassification.from_pretrained(modelname, num_labels=2)
+ prediction_logits = model0(**encodings)
  print(prediction_logits)
  ```
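
The inference snippet prints the raw model output rather than per-token zeros and ones. As a minimal sketch, and assuming that class index 1 is the positive (entity) class, the logits could be converted to labels and the matching tokens grouped as shown here. The helper names `logits_to_labels` and `positive_spans` and the example tokens and logits are hypothetical and not part of the README. With a recent `transformers` release, the real inputs would likely come from `prediction_logits.logits[0].tolist()` and `tokenizer.convert_ids_to_tokens(encodings['input_ids'][0].tolist())`.

```python
from typing import List, Sequence


def logits_to_labels(token_logits: Sequence[Sequence[float]]) -> List[int]:
    """Pick the higher-scoring class for each token.

    Assumption: index 0 is "not the entity" and index 1 is "entity".
    """
    return [1 if pair[1] > pair[0] else 0 for pair in token_logits]


def positive_spans(tokens: Sequence[str], labels: Sequence[int]) -> List[List[str]]:
    """Group consecutive tokens labelled 1 into candidate entity mentions."""
    spans: List[List[str]] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == 1:
            current.append(token)
        elif current:
            spans.append(current)
            current = []
    if current:
        spans.append(current)
    return spans


# Made-up values for illustration only; a real tokenizer may split words differently.
example_tokens = ["[CLS]", "drug", "[SEP]", "aspirin", "reduces", "fever", "[SEP]"]
example_logits = [
    [2.0, -1.0], [1.5, -0.5], [2.2, -2.0],
    [-0.7, 1.9], [1.1, 0.2], [0.9, -0.3], [2.5, -1.8],
]

labels = logits_to_labels(example_logits)
print(labels)                                   # [0, 0, 0, 1, 0, 0, 0]
print(positive_spans(example_tokens, labels))   # [['aspirin']]
```

Taking the larger of the two logits is equivalent to an argmax over the class dimension; a probability threshold after a softmax would be an alternative if fewer false positives are preferred.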
 
+ ## Example of fine-tuning with few-shot learning
+
+ To fine-tune the model for a new entity with a few shots, the dataset needs to be converted to a torch.utils.data.Dataset containing BERT tokens and a sequence of 0s and 1s (1 marks a token that is positive for the class and should be predicted as a member of the given NER class). Once the dataset is created, the model can be fine-tuned as follows (for more details, see the code on GitHub: https://github.com/br-ai-ns-institute/Zero-ShotNER):
+
+ ```python
+ import os
+ import time
+
+ from transformers import BertForTokenClassification, Trainer, TrainingArguments
+
+ # class_unseen, j, model_path, dataset and dataset_valid are assumed to be defined by the surrounding script.
+ training_args = TrainingArguments(
+     output_dir=os.path.join('Results', class_unseen, str(j)+'Shot'),  # folder for results
+     num_train_epochs=10,  # number of epochs
+     per_device_train_batch_size=16,  # batch size per device during training
+     per_device_eval_batch_size=16,  # batch size for evaluation
+     weight_decay=0.01,  # strength of weight decay
+     logging_dir=os.path.join('Logs', class_unseen, str(j)+'Shot'),  # folder for logs
+     save_strategy='epoch',
+     evaluation_strategy='epoch',
+     load_best_model_at_end=True,
+ )
+
+ model0 = BertForTokenClassification.from_pretrained(model_path, num_labels=2)
+ trainer = Trainer(
+     model=model0,  # pretrained model
+     args=training_args,  # training arguments
+     train_dataset=dataset,  # object of class torch.utils.data.Dataset for training
+     eval_dataset=dataset_valid  # object of class torch.utils.data.Dataset for validation
+ )
+ start_time = time.time()
+ trainer.train()
+ total_time = time.time()-start_time
+ model0_path = os.path.join('Results', class_unseen, str(j)+'Shot', 'Model')
+ os.makedirs(model0_path, exist_ok=True)
+ trainer.save_model(model0_path)
+ ```
+
  ## Available classes
 
  The following datasets and entities were used for training, so they can be used as the label in the first segment (as the first string). Note that multiword strings have been merged.