atzenhofer committed
Commit 7d5e242 · 1 Parent(s): 9033440

Update README.md

README.md CHANGED
@@ -1,3 +1,81 @@
---
license: gpl-3.0
language:
- gmh
- de
widget:
- text: >-
    Ich Ott von Zintzndorff vergich mit dem offenn prief vnd tun chunt alln den
    leutn, di in sehnt oder hornt lesn, daz ich mit wolbedachtm mut vnd mit
    guetem rat vnd willn czu der czeit, do ich ez wol getun mochtt, den erbern
    herrn vnd fursten apt Englschalchn cze Seydensteten vnd sein gnants Gotshavs
    daselbs gancz vnd gar ledig sage vnd lazze aller der ansproch, die ich ...
    han auf seiner guter ains, des Schoephls lehn auf dem Graentleinsperg gnant
    in Groestner pharr gelegn, also, daz ich vnd alle mein erbn furbaz dhain
    ansprach dar vmb habn welln noch schulln in dhainn wegn, weder wenig noch
    vil. Vnd dar vmb czu eine steten vrchund gib ich dem vorgnantn Apt
    Englchalchn vnd seim wirdign Gotshaws cze Seydenstet den prief, versigelt
    mit meim des egnantn Ottn von Czintzndorff, vnd mit hern Dytrichs des
    Schenchn von Dobra anhangunden Insigeln, der das durch meinn willn cze
    gezeug der obgeschribn sach an den prief hat gehang. Das ist geschehn vnd
    der prief ist gebn nach Christs gepurd vber Drewtzehn hundert Jar, dar nach
    im Sibn vnd fumftzgisten Jar, am Eritag in den Phingstveyrtagn.
---

# DistilRoBERTa (base) Middle High German Charter Masked Language Model

This model is a fine-tuned version of distilroberta-base, trained on Middle High German (gmh; ISO 639-2; c. 1050–1500) charters from the [monasterium.net](https://www.icar-us.eu/en/cooperation/online-portals/monasterium-net/) data set.

## Model description

For additional information, please refer to the [distilroberta (base-sized model)](https://huggingface.co/distilroberta-base) card or the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108) by Sanh et al.

## Intended uses & limitations

This model can be used for masked-token prediction, i.e., fill-mask tasks.

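A minimal sketch of fill-mask usage, assuming the `transformers` library is installed and that the model is published under the identifier given in the citation URL. The helper names `mask_word` and `predict_masked` are hypothetical and only serve the illustration; `<mask>` is the mask token of RoBERTa-style tokenizers.

```python
MODEL_ID = "atzenhofer/distilroberta-base-mhg-charter-mlm"  # taken from the citation URL
MASK_TOKEN = "<mask>"  # mask token used by RoBERTa-style tokenizers


def mask_word(text: str, index: int, mask_token: str = MASK_TOKEN) -> str:
    """Replace the whitespace-separated word at position `index` with the mask token."""
    words = text.split()
    words[index] = mask_token
    return " ".join(words)


def predict_masked(text: str, top_k: int = 5):
    """Return the `top_k` fill-mask candidates for a text containing one mask token."""
    # Imported lazily so that mask_word() works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    return fill_mask(text, top_k=top_k)
```

Calling `predict_masked(mask_word("Ich Ott von Zintzndorff vergich mit dem offenn prief", 8))` would mask the word `prief` and return a list of candidate dictionaries, each with `token_str` and `score` entries. The first call downloads the model weights.
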
## Training and evaluation data

The model was fine-tuned using the Middle High German Monasterium charters.
It was trained on an NVIDIA GeForce GTX 1660 Ti 6 GB GPU.

## Training hyperparameters

The following hyperparameters were used during training:
- num_train_epochs: 10
- learning_rate: 2e-5
- weight_decay: 0.01
- train_batch_size: 8
- eval_batch_size: 8
- num_proc: 4
- block_size: 256

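The training script itself is not part of this card, so the following is only a sketch of how these values could map onto a typical Hugging Face masked language modeling setup. It assumes that `block_size` is the length of the fixed-size token blocks the corpus is split into, that `num_proc` is the number of worker processes passed to `datasets.Dataset.map`, and that the batch sizes are per-device values. The names `group_texts`, `build_training_arguments`, and the output directory `mlm-output` are hypothetical.

```python
from itertools import chain

# Values copied from the hyperparameter list in this card.
HYPERPARAMETERS = {
    "num_train_epochs": 10,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "num_proc": 4,
    "block_size": 256,
}


def group_texts(examples, block_size=HYPERPARAMETERS["block_size"]):
    """Concatenate tokenized sequences and split them into fixed-size blocks."""
    concatenated = {key: list(chain.from_iterable(seqs)) for key, seqs in examples.items()}
    total_length = len(next(iter(concatenated.values())))
    # Drop the incomplete tail so that every block has exactly block_size tokens.
    total_length = (total_length // block_size) * block_size
    return {
        key: [tokens[i:i + block_size] for i in range(0, total_length, block_size)]
        for key, tokens in concatenated.items()
    }


def build_training_arguments(output_dir="mlm-output"):
    """Map the listed values onto transformers.TrainingArguments (assumed mapping)."""
    # Imported lazily so that group_texts() works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=HYPERPARAMETERS["num_train_epochs"],
        learning_rate=HYPERPARAMETERS["learning_rate"],
        weight_decay=HYPERPARAMETERS["weight_decay"],
        per_device_train_batch_size=HYPERPARAMETERS["train_batch_size"],
        per_device_eval_batch_size=HYPERPARAMETERS["eval_batch_size"],
    )
```

Under these assumptions, a tokenized data set would be chunked with `tokenized.map(group_texts, batched=True, num_proc=HYPERPARAMETERS["num_proc"])` before being passed to a trainer.
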
## Training results

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1 | 2.537000 | 2.112094 |
| 2 | 2.053400 | 1.838937 |
| 3 | 1.900300 | 1.706654 |
| 4 | 1.766200 | 1.607970 |
| 5 | 1.669200 | 1.532340 |
| 6 | 1.619100 | 1.490333 |
| 7 | 1.571300 | 1.476035 |
| 8 | 1.543100 | 1.428958 |
| 9 | 1.517100 | 1.423216 |
| 10 | 1.508300 | 1.408235 |

Perplexity: 4.07

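The card does not state how the perplexity was computed. A common convention, assumed here, is to take the exponential of the mean cross-entropy loss on the evaluation set. Applied to the epoch 10 validation loss from the table, this gives roughly 4.09, close to the reported 4.07; the small gap suggests the reported figure comes from a separate evaluation pass, which is plausible because masks are drawn at random in masked language modeling. The function name `perplexity` is illustrative.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity as the exponential of the mean cross-entropy loss (in nats)."""
    return math.exp(cross_entropy_loss)


final_validation_loss = 1.408235  # epoch 10 in the training results table
print(f"{perplexity(final_validation_loss):.2f}")  # prints 4.09
```
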
## Updates
- 2023-03-30: Upload

## Citation
Please cite as follows when using this model.

```
@misc{distilroberta-base-mhg-charter-mlm,
  title = {distilroberta-base-mhg-charter-mlm},
  author = {Atzenhofer-Baumgartner, Florian},
  year = {2023},
  url = {https://huggingface.co/atzenhofer/distilroberta-base-mhg-charter-mlm},
  publisher = {Hugging Face}
}
```