---
license: wtfpl
base_model: CausalLM/7B
tags:
- generated_from_trainer
model-index:
- name: workspace/causal-dolphin-v0.1
  results: []
datasets:
- ehartford/dolphin
- THUDM/AgentInstruct
---

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

# Causal-Dolphin-Agent-v0.1

This model is a LoRA fine-tune of [CausalLM/7B](https://huggingface.co/CausalLM/7B) on Eric's wonderful [Dolphin](https://huggingface.co/datasets/ehartford/dolphin) dataset, with [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct) mixed into both training runs.

Causal-Dolphin-Agent was first trained for 3 epochs on 5 million GPT-3.5-augmented FLAN instructions combined with the AgentInstruct dataset, in ChatML format. It was then trained for a further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.

It achieves the following results on the evaluation set:
- Loss: 2.8435

## Prompt Format

Causal-Dolphin-Agent uses ChatML as the prompt format:
```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
If Danny owns a bike, then Edward owns a bike. If Edward owns a bike, then Freddy owns a bike. If Danny owns a bike, which of the following statements must be true? Let's think step by step.

I. Edward owns a bike.
II. Freddy owns a bike.
III. Freddy does not own a bike.

Choose one answer:
I only
II only
III only
I and II only
I and III only
<|im_end|>
<|im_start|>assistant
```
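
As a minimal sketch of how a prompt in this layout could be assembled in code, the helper below joins a list of role/content messages with the `<|im_start|>` and `<|im_end|>` markers and leaves an assistant turn open for generation. The function name `build_chatml_prompt` and the sample user message are hypothetical and not part of this model's tooling.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages: List[Dict[str, str]]) -> str:
    """Render messages as ChatML and open an assistant turn for generation."""
    parts = []
    for message in messages:
        # Each turn: start marker + role, newline, content, end marker.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    # Leave the assistant turn open so the model writes the reply.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


# Hypothetical conversation; only the system prompt is taken from the example above.
messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Name three primary colors."},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

The resulting string can be passed to a tokenizer as plain text. If the tokenizer ships its own chat template, that template is probably the safer choice, since it reflects how the special tokens were registered.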

## Training and evaluation data

- [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin)
- [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct)

## Training procedure

Training ran in two stages:

- Stage 1: 3 epochs on 5 million GPT-3.5-augmented FLAN instructions combined with the AgentInstruct dataset, in ChatML format.
- Stage 2: a further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 6e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3
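
To make the `cosine` scheduler with 100 warmup steps concrete, here is a minimal sketch of the usual linear-warmup-then-cosine-decay curve using the listed learning rate. The total step count is not stated in this card. The value 917 is an assumption extrapolated from the results table (step 874 at epoch 2.86), and whether the trainer used exactly this formula is likewise assumed.

```python
import math

BASE_LR = 6e-06      # learning_rate from the list above
WARMUP_STEPS = 100   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 917    # assumption: extrapolated, not stated in the card


def lr_at(step: int, base_lr: float = BASE_LR,
          warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step under warmup + cosine decay."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup schedule completed, from 0 to 1.
    progress = (step - warmup) / max(1, total - warmup)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 50, 100, 500, TOTAL_STEPS):
    print(f"step {step:4d}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate peaks at the configured 6e-06 when warmup ends and decays to zero by the final step.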

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.7774        | 0.0   | 1    | 5.1009          |
| 3.2798        | 0.15  | 46   | 5.1010          |
| 2.0722        | 0.3   | 92   | 5.0489          |
| 2.5919        | 0.45  | 138  | 4.8834          |
| 2.0011        | 0.6   | 184  | 4.6678          |
| 1.3733        | 0.75  | 230  | 4.4628          |
| 1.7321        | 0.9   | 276  | 4.2757          |
| 1.3994        | 1.05  | 322  | 4.1029          |
| 1.2308        | 1.2   | 368  | 3.8916          |
| 0.8229        | 1.35  | 414  | 3.6451          |
| 0.9592        | 1.5   | 460  | 3.4106          |
| 0.8528        | 1.65  | 506  | 3.2250          |
| 0.7362        | 1.8   | 552  | 3.0852          |
| 0.8077        | 1.95  | 598  | 2.9881          |
| 0.6912        | 2.1   | 644  | 2.9315          |
| 0.7776        | 2.25  | 690  | 2.8911          |
| 0.6916        | 2.41  | 736  | 2.8678          |
| 0.8674        | 2.56  | 782  | 2.8534          |
| 0.7797        | 2.71  | 828  | 2.8545          |
| 0.6838        | 2.86  | 874  | 2.8435          |
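
One way to read the validation losses is to convert them to perplexity. The sketch below does that for the first and last rows of the table, assuming the reported loss is a mean per-token cross-entropy in nats (the card does not say so explicitly).

```python
import math

first_val_loss = 5.1009  # validation loss at step 1
final_val_loss = 2.8435  # validation loss at step 874


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


# Fraction by which the validation loss fell over the logged run.
relative_drop = (first_val_loss - final_val_loss) / first_val_loss

print(f"perplexity at step 1:   {perplexity(first_val_loss):.1f}")
print(f"perplexity at step 874: {perplexity(final_val_loss):.1f}")
print(f"relative loss drop:     {relative_drop:.1%}")
```

Under that assumption, the final loss of 2.8435 corresponds to a perplexity of roughly 17, and the validation loss fell by about 44% between the first and last logged steps.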

### Framework versions

- Transformers 4.34.1
- PyTorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1