ggbetz committed
Commit 3484ab6 · verified · 1 Parent(s): fadf077

Update README.md

Files changed (1):
  1. README.md (+56 -1)
README.md CHANGED

@@ -15,4 +15,59 @@ tags:
- argument-mapping
- trl
- sft
---
# Model Card for Llama-3.1-Argunaut-1-8B-SFT

This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct). It has been trained using [TRL](https://github.com/huggingface/trl).

## Quick start

```python
from transformers import pipeline

question = "Are you familiar with Argdown syntax? What's its purpose?"

# Load the model into a chat-capable text-generation pipeline
generator = pipeline("text-generation", model="DebateLabKIT/Llama-3.1-Argunaut-1-8B-SFT", device="cuda")

# Generate a reply to a single user turn and print only the newly generated text
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
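
For reference, the same exchange can also be run at a lower level by applying the chat template explicitly. A minimal sketch, assuming a CUDA device and bfloat16 weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DebateLabKIT/Llama-3.1-Argunaut-1-8B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build the prompt with the model's chat template
messages = [{"role": "user", "content": "Are you familiar with Argdown syntax? What's its purpose?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# Generate and decode only the newly produced tokens
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```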

## Training procedure

SFT dataset mixture:

| Dataset | Weight (examples) | Weight (tokens) |
|:--------|:-----------------:|:---------------:|
| DebateLabKIT/deepa2-conversations | 25% | XX |
| DebateLabKIT/deep-argmap-conversations | 25% | XX |
| allenai/tulu-3-sft-mixture | 50% | XX |
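
For illustration, a weighted mixture with these proportions could be assembled with the `datasets` library; the split names and the seed below are assumptions, not the exact preprocessing code:

```python
from datasets import interleave_datasets, load_dataset

# Stream the three SFT sources (assuming each exposes a "train" split)
deepa2 = load_dataset("DebateLabKIT/deepa2-conversations", split="train", streaming=True)
argmap = load_dataset("DebateLabKIT/deep-argmap-conversations", split="train", streaming=True)
tulu = load_dataset("allenai/tulu-3-sft-mixture", split="train", streaming=True)

# Sample examples according to the mixture weights from the table (25% / 25% / 50%)
mixture = interleave_datasets([deepa2, argmap, tulu], probabilities=[0.25, 0.25, 0.5], seed=42)
```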

Trained with SFT on **1M examples** for 1 epoch with *Spectrum* (top 30 percent).
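
With Spectrum-style training, only a selected subset of layers (here roughly the top 30 percent, typically chosen by a signal-to-noise analysis) is updated while the rest stay frozen. A generic PyTorch sketch of such selective unfreezing, with hypothetical layer prefixes rather than the actual selection:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Hypothetical output of a prior layer-selection analysis: only parameters whose
# names start with one of these prefixes will be trained.
selected_prefixes = [
    "model.layers.30.",
    "model.layers.31.",
    # ... further selected layers
]

for name, param in model.named_parameters():
    # Unfreeze only the selected subset; everything else stays frozen during SFT
    param.requires_grad = any(name.startswith(p) for p in selected_prefixes)
```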

```yaml
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 8
gradient_accumulation_steps: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 5.0e-6  # following _Tülu 3_ recipe
lr_scheduler_type: cosine
warmup_ratio: 0.1
```
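
These hyperparameters correspond to fields of TRL's `SFTConfig`. A minimal sketch of an `SFTTrainer` setup using them, assuming an already prepared training dataset (the placeholder below stands in for the mixture described above) and a hypothetical output directory:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Stand-in for the prepared SFT mixture described above
train_dataset = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# Mirror the training parameters listed above; SFTConfig extends transformers.TrainingArguments
training_args = SFTConfig(
    output_dir="Llama-3.1-Argunaut-1-8B-SFT",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=5.0e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model being fine-tuned
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```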

Hardware: 2 x H100 GPUs.
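
Assuming plain data parallelism across the two GPUs, the configuration above corresponds to an effective batch size of 2 × 8 × 2 = 32 sequences per optimizer step.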

### Framework versions

- TRL: 0.12.1
- Transformers: 4.46.3
- Pytorch: 2.4.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3