Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ datasets:
 # udkai_Turdus
 A less contaminated version of [udkai/Garrulus](https://huggingface.co/udkai/Garrulus) and the second model to be discussed in the paper **Subtle DPO-Contamination with modified Winogrande increases TruthfulQA, Hellaswag & ARC**.
 
-Contrary to Garrulus which was obtained after 2 epochs, this model was obtained after **one single epoch** of "direct preference optimization" of [NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) with [https://huggingface.co/datasets/hromi/winograd_dpo] .
+Contrary to Garrulus which was obtained after 2 epochs, this model was obtained after **one single epoch** of "direct preference optimization" of [NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) with [https://huggingface.co/datasets/hromi/winograd_dpo ] .
 
 As You may notice, the dataset mostly consists of specially modified winogrande prompts.
 