---
license: cc-by-nc-4.0
language:
- en
- de
library_name: transformers
pipeline_tag: text-generation
tags:
- finetune
- dpo
- Instruct
- augmentation
- german
datasets:
- argilla/distilabel-math-preference-dpo
---

![SauerkrautLM](https://vago-solutions.de/wp-content/uploads/2023/12/sauerkrautlm-solar.png "SauerkrautLM-SOLAR-Instruct")
## VAGO solutions SauerkrautLM-SOLAR-Instruct
Introducing **SauerkrautLM-SOLAR-Instruct** – our Sauerkraut version of the powerful [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)!
Aligned with **DPO**.

# Table of Contents
1. [Overview of all SauerkrautLM-SOLAR-Instruct models](#all-sauerkrautlm-solar-instruct-models)
2. [Model Details](#model-details)
   - [Training Dataset](#training-dataset)
   - [Data Contamination Test](#data-contamination-test-results)
   - [Prompt template](#prompt-template)
3. [Evaluation](#evaluation)
4. [Disclaimer](#disclaimer)
5. [Contact](#contact)
6. [Collaborations](#collaborations)
7. [Acknowledgement](#acknowledgement)

## All SauerkrautLM-SOLAR-Instruct Models

| Model | HF | GPTQ | GGUF | AWQ |
|-------|-------|-------|-------|-------|
| SauerkrautLM-SOLAR-Instruct | [Link](https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct/) | coming soon | coming soon | coming soon |

## Model Details
**SauerkrautLM-SOLAR-Instruct**
- **Model Type:** SauerkrautLM-SOLAR-Instruct is a fine-tuned model based on [upstage/SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)
- **Language(s):** English, German
- **License:** cc-by-nc-4.0
- **Contact:** [Website](https://vago-solutions.de/#Kontakt), [David Golchinfar](mailto:[email protected])

### Training Dataset:

SauerkrautLM-SOLAR-Instruct was trained on a mix of German data augmentation and translated data.
It was aligned through **DPO** with our **new German SauerkrautLM-DPO dataset**, which uses parts of the SFT SauerkrautLM dataset
as chosen answers and generations from [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-HerO) as rejected answers. We additionally added **translated parts of [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)** (our dataset does not contain any TruthfulQA prompts – see the Data Contamination Test Results below) and **[argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo).**
We found that a simple translation of training data can lead to unnatural German phrasings,
so data augmentation techniques were used to ensure grammatical and syntactical correctness and more natural German wording in our training data.

We improved the German language skills of this model. Nevertheless, certain formulations that are not entirely correct may still occur.
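
For illustration, here is a minimal sketch of the preference-pair format such a DPO dataset typically uses. The exact schema of the SauerkrautLM-DPO dataset is not published, and the record contents below are hypothetical:

```python
# Illustrative sketch of a DPO preference record (hypothetical contents;
# the actual SauerkrautLM-DPO schema is not published).
from datasets import Dataset

records = [
    {
        # German instruction prompt
        "prompt": "Erkläre kurz den Unterschied zwischen RAM und Festplattenspeicher.",
        # chosen: curated answer, e.g. from the SFT SauerkrautLM data
        "chosen": "RAM ist ein flüchtiger Arbeitsspeicher für laufende Programme, ...",
        # rejected: a weaker generation, e.g. from Sauerkraut-7b-HerO
        "rejected": "RAM und Festplatte sind im Grunde dasselbe, ...",
    }
]

preference_dataset = Dataset.from_list(records)
print(preference_dataset)  # features: ['prompt', 'chosen', 'rejected'], num_rows: 1
```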

### Data Contamination Test Results

Some models on the Hugging Face leaderboard had problems with benchmark data getting mixed into their training data.
We checked our SauerkrautLM-DPO dataset with a special test [1], using this model as the target model and upstage/SOLAR-10.7B-Instruct-v1.0 as the reference model.
The Hugging Face team used the same methods [2, 3].

Our results, with `result < 0.1, %:` being well below 0.9, indicate that our dataset is free from contamination.

*The data contamination test results for HellaSwag and Winogrande will be added once [1] supports them.*

| Dataset | ARC | MMLU | TruthfulQA | GSM8K |
|------------------------------|-------|-------|-------|-------|
| **SauerkrautLM-DPO** | result < 0.1, %: 0.0 | result < 0.1, %: 0.09 | result < 0.1, %: 0.13 | result < 0.1, %: 0.16 |

[1] https://github.com/swj0419/detect-pretrain-code-contamination

[2] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474#657f2245365456e362412a06

[3] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/265#657b6debf81f6b44b8966230
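
For intuition, here is a rough sketch of the Min-K% Prob membership statistic that the test in [1] builds on. This is an illustration of the underlying idea only, not the repository's actual implementation:

```python
# Rough sketch of the Min-K% Prob membership statistic underlying [1]
# (illustrative only; not the repository's actual code).
import torch

def min_k_percent_prob(model, tokenizer, text, k=0.2):
    """Mean log-probability of the k% least-likely tokens of `text`.
    Texts seen during training tend to score noticeably higher on the
    target model than on a reference model that has not seen them."""
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**enc).logits
    # log-probability the model assigns to each actual next token
    log_probs = torch.log_softmax(logits[0, :-1].float(), dim=-1)
    token_lp = log_probs.gather(1, enc["input_ids"][0, 1:].unsqueeze(-1)).squeeze(-1)
    # average over the k% lowest-probability tokens
    n = max(1, int(k * token_lp.numel()))
    return token_lp.topk(n, largest=False).values.mean().item()
```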

### Prompt Template:
```
### User:
Hallo, wie geht es dir?

### Assistant:
Hallo! Es freut mich, dass du mit mir kommunizierst. Ich bin ein künstlicher Intelligenz-Assistent und ich funktioniere optimal.
Wie kann ich zu deiner Unterstützung beitragen oder hast du eine Frage, die ich klären sollte? In welcher Art und Weise kann ich dich helfen?

```
*Prompt example at temperature 0.3*
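
To use this template programmatically, a minimal loading sketch with the standard `transformers` API might look like this. The generation settings are illustrative, matching the temperature-0.3 example above, not official recommendations:

```python
# Minimal usage sketch with the standard transformers API
# (generation settings are illustrative, not official recommendations).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-SOLAR-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Wrap the user message in the prompt template shown above.
prompt = "### User:\nHallo, wie geht es dir?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.3)

# Print only the newly generated assistant answer.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```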

## Evaluation
Coming soon.

## Disclaimer
Despite our best efforts in data cleansing, the possibility of uncensored content slipping through cannot be entirely ruled out, and we cannot guarantee consistently appropriate behavior. Therefore, if you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided.
Additionally, it is essential to understand that the licensing of these models does not constitute legal advice. We are not responsible for the actions of third parties who use our models.

## Contact
If you are interested in customized LLMs for business applications, please get in touch with us via our website or email [Dr. Daryoush Vaziri](mailto:[email protected]). We are also grateful for your feedback and suggestions.

## Collaborations
We are also keenly seeking support and investment for our startup, VAGO solutions, where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us.

## Acknowledgement
Many thanks to [argilla](https://huggingface.co/argilla) and [Hugging Face](https://huggingface.co) for providing such valuable datasets to the open-source community. And of course a big thanks to [upstage](https://huggingface.co/upstage) for providing the open-source community with their latest technology!