ggbetz committed
Commit 3484ab6 · verified · 1 Parent(s): fadf077

Update README.md

Files changed (1):
  1. README.md (+56 -1)
README.md CHANGED

@@ -15,4 +15,59 @@ tags:
- argument-mapping
- trl
- sft
---
# Model Card for Llama-3.1-Argunaut-1-8B-SFT

This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct). It has been trained using [TRL](https://github.com/huggingface/trl).

## Quick start

```python
from transformers import pipeline

question = "Are you familiar with Argdown syntax? What's its purpose?"

# Load the model into a chat-capable text-generation pipeline
generator = pipeline("text-generation", model="DebateLabKIT/Llama-3.1-Argunaut-1-8B-SFT", device="cuda")

# Generate a reply to a single user turn and print only the newly generated text
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
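
For reference, the same exchange can also be run at a lower level by applying the chat template explicitly. A minimal sketch, assuming a CUDA device and bfloat16 weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DebateLabKIT/Llama-3.1-Argunaut-1-8B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build the prompt with the model's chat template
messages = [{"role": "user", "content": "Are you familiar with Argdown syntax? What's its purpose?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# Generate and decode only the newly produced tokens
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```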

## Training procedure

SFT dataset mixture:

| Dataset | Weight (examples) | Weight (tokens) |
|:--------|:-----------------:|:---------------:|
| DebateLabKIT/deepa2-conversations | 25% | XX |
| DebateLabKIT/deep-argmap-conversations | 25% | XX |
| allenai/tulu-3-sft-mixture | 50% | XX |
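
For illustration, a weighted mixture with these proportions could be assembled with the `datasets` library; the split names and the seed below are assumptions, not the exact preprocessing code:

```python
from datasets import interleave_datasets, load_dataset

# Stream the three SFT sources (assuming each exposes a "train" split)
deepa2 = load_dataset("DebateLabKIT/deepa2-conversations", split="train", streaming=True)
argmap = load_dataset("DebateLabKIT/deep-argmap-conversations", split="train", streaming=True)
tulu = load_dataset("allenai/tulu-3-sft-mixture", split="train", streaming=True)

# Sample examples according to the mixture weights from the table (25% / 25% / 50%)
mixture = interleave_datasets([deepa2, argmap, tulu], probabilities=[0.25, 0.25, 0.5], seed=42)
```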

Trained with SFT on **1M examples** for 1 epoch with *Spectrum* (top 30 percent).
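
With Spectrum-style training, only a selected subset of layers (here roughly the top 30 percent, typically chosen by a signal-to-noise analysis) is updated while the rest stay frozen. A generic PyTorch sketch of such selective unfreezing, with hypothetical layer prefixes rather than the actual selection:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Hypothetical output of a prior layer-selection analysis: only parameters whose
# names start with one of these prefixes will be trained.
selected_prefixes = [
    "model.layers.30.",
    "model.layers.31.",
    # ... further selected layers
]

for name, param in model.named_parameters():
    # Unfreeze only the selected subset; everything else stays frozen during SFT
    param.requires_grad = any(name.startswith(p) for p in selected_prefixes)
```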

```yaml
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 8
gradient_accumulation_steps: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 5.0e-6  # following _Tülu 3_ recipe
lr_scheduler_type: cosine
warmup_ratio: 0.1
```
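
These hyperparameters correspond to fields of TRL's `SFTConfig`. A minimal sketch of an `SFTTrainer` setup using them, assuming an already prepared training dataset (the placeholder below stands in for the mixture described above) and a hypothetical output directory:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Stand-in for the prepared SFT mixture described above
train_dataset = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# Mirror the training parameters listed above; SFTConfig extends transformers.TrainingArguments
training_args = SFTConfig(
    output_dir="Llama-3.1-Argunaut-1-8B-SFT",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=5.0e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model being fine-tuned
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```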

Hardware: 2 x H100 GPUs.
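
Assuming plain data parallelism across the two GPUs, the configuration above corresponds to an effective batch size of 2 × 8 × 2 = 32 sequences per optimizer step.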

### Framework versions

- TRL: 0.12.1
- Transformers: 4.46.3
- Pytorch: 2.4.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3