---
license: cc-by-nc-4.0
language:
- en
- de
library_name: transformers
pipeline_tag: text-generation
tags:
- finetune
- dpo
- Instruct
- augmentation
- german
datasets:
- argilla/distilabel-math-preference-dpo
---

![SauerkrautLM](https://vago-solutions.de/wp-content/uploads/2023/12/sauerkrautlm-solar.png "SauerkrautLM-SOLAR-Instruct")

## VAGO solutions SauerkrautLM-SOLAR-Instruct

Introducing **SauerkrautLM-SOLAR-Instruct** – our Sauerkraut version of the powerful [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)!
Aligned with **DPO**.

# Table of Contents
1. [Overview of all SauerkrautLM-SOLAR-Instruct models](#all-sauerkrautlm-solar-instruct-models)
2. [Model Details](#model-details)
   - [Prompt template](#prompt-template)
   - [Training Dataset](#training-dataset)
   - [Data Contamination Test](#data-contamination-test-results)
3. [Evaluation](#evaluation)
4. [Disclaimer](#disclaimer)
5. [Contact](#contact)
6. [Collaborations](#collaborations)
7. [Acknowledgement](#acknowledgement)

## All SauerkrautLM-SOLAR-Instruct Models

| Model | HF | GPTQ | GGUF | AWQ |
|-------|-------|-------|-------|-------|
| SauerkrautLM-SOLAR-Instruct | [Link](https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct/) | coming soon | coming soon | coming soon |

## Model Details

**SauerkrautLM-SOLAR-Instruct**
- **Model Type:** SauerkrautLM-SOLAR-Instruct is a finetuned model based on [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)
- **Language(s):** English, German
- **License:** cc-by-nc-4.0
- **Contact:** [Website](https://vago-solutions.de/#Kontakt) [David Golchinfar](mailto:[email protected])

### Training Dataset:

SauerkrautLM-SOLAR-Instruct was trained with a mix of German data augmentation and translated data.
It was aligned through **DPO** with our **new German SauerkrautLM-DPO dataset**, which uses parts of the SFT SauerkrautLM dataset
as chosen answers and responses from [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-HerO) as rejected answers. We additionally added translated parts of **[HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)** (our dataset does not contain any TruthfulQA prompts; see the Data Contamination Test Results below) and **[argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo).**
We found that a simple translation of training data can lead to unnatural German phrasing.
Data augmentation techniques were therefore used to ensure grammatical and syntactical correctness and more natural German wording in our training data.

We improved the German language skills of this model. Nevertheless, certain formulations may occur that are not entirely correct.

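For illustration, the preference data described above follows the usual DPO layout of one prompt paired with a chosen and a rejected answer. A minimal sketch of one such record (field contents are placeholders, not actual rows from our dataset):

```python
# Sketch of the prompt/chosen/rejected layout used by DPO preference data
# such as HuggingFaceH4/ultrafeedback_binarized. Field contents below are
# placeholders, not actual rows from the SauerkrautLM-DPO dataset.
dpo_record = {
    "prompt": "<user instruction, in German or English>",
    "chosen": "<preferred answer, e.g. from the SFT SauerkrautLM data>",
    "rejected": "<dispreferred answer, e.g. generated by Sauerkraut-7b-HerO>",
}
```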

### Data Contamination Test Results

Some models on the HuggingFace leaderboard had problems with wrong data getting mixed in.
We checked our SauerkrautLM-DPO dataset with a special test [1] on this model as the target model and upstage/SOLAR-10.7B-Instruct-v1.0 as the reference model.
The HuggingFace team used the same methods [2, 3].

Our results, with `result < 0.1, %:` being well below 0.9, indicate that our dataset is free from contamination.

*The data contamination test results of HellaSwag and Winogrande will be added once [1] supports them.*

| Dataset | ARC | MMLU | TruthfulQA | GSM8K |
|------------------------------|-------|-------|-------|-------|
| **SauerkrautLM-DPO** | result < 0.1, %: 0.0 | result < 0.1, %: 0.09 | result < 0.1, %: 0.13 | result < 0.1, %: 0.16 |

[1] https://github.com/swj0419/detect-pretrain-code-contamination

[2] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474#657f2245365456e362412a06

[3] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/265#657b6debf81f6b44b8966230
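
For intuition, the test in [1] compares how confidently the target and reference models predict the tokens of dataset examples (a Min-K%-style membership test). Below is a rough conceptual sketch of that comparison, not the actual CLI of [1]; the example text and the use of a shared tokenizer are illustrative assumptions:

```python
# Conceptual sketch of the contamination check idea behind [1]: score each
# dataset example by the mean log-probability of its least-likely tokens
# (Min-K% Prob) under the target model vs. a reference model.
# NOT the actual CLI of [1]; heavy to run (loads two ~11B models).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_logprob(model, tokenizer, text, k=0.1):
    """Mean log-prob of the lowest k% of next-token log-probs in `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # log-prob assigned to each actual next token
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = logprobs.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1)
    n = max(1, int(k * token_lp.numel()))
    return token_lp.topk(n, largest=False).values.mean().item()

target = AutoModelForCausalLM.from_pretrained("VAGOsolutions/SauerkrautLM-SOLAR-Instruct")
reference = AutoModelForCausalLM.from_pretrained("upstage/SOLAR-10.7B-Instruct-v1.0")
tok = AutoTokenizer.from_pretrained("upstage/SOLAR-10.7B-Instruct-v1.0")  # shared tokenizer

example = "A benchmark question from the dataset under test ..."  # illustrative
score_target = min_k_logprob(target, tok, example)
score_reference = min_k_logprob(reference, tok, example)
# A target score far above the reference suggests the example was memorized.
print(score_target - score_reference)
```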

### Prompt Template:
```
### User:
Hallo, wie geht es dir?

### Assistant:
Hallo! Es freut mich, dass du mit mir kommunizierst. Ich bin ein künstlicher Intelligenz-Assistent und ich funktioniere optimal.
Wie kann ich zu deiner Unterstützung beitragen oder hast du eine Frage, die ich klären sollte? In welcher Art und Weise kann ich dich helfen?

```
*Prompt example generated at temperature 0.3*
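
A minimal way to try this template with the `transformers` library (a sketch; the sampling settings follow the temperature 0.3 example above and are adjustable):

```python
# Minimal inference sketch using the prompt template above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-SOLAR-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Build the prompt in the ### User: / ### Assistant: format shown above.
prompt = "### User:\nHallo, wie geht es dir?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.3
)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```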

## Evaluation
coming soon

## Disclaimer
We must inform users that despite our best efforts in data cleansing, the possibility of uncensored content slipping through cannot be entirely ruled out.
Therefore, we cannot guarantee consistently appropriate behavior, and if you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided.
Additionally, it is essential to understand that the licensing of these models does not constitute legal advice. We are not held responsible for the actions of third parties who utilize our models.

## Contact
If you are interested in customized LLMs for business applications, please get in contact with us via our website or contact us at [Dr. Daryoush Vaziri](mailto:[email protected]). We are also grateful for your feedback and suggestions.

## Collaborations
We are also keenly seeking support and investment for our startup, VAGO solutions, where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us.

## Acknowledgement
Many thanks to [argilla](https://huggingface.co/datasets/argilla) and [Huggingface](https://huggingface.co) for providing such valuable datasets to the Open-Source community. And of course a big thanks to [upstage](https://huggingface.co/upstage) for providing the open source community with their latest technology!