---
license: wtfpl
base_model: CausalLM/7B
tags:
- generated_from_trainer
model-index:
- name: workspace/causal-dolphin-v0.1
  results: []
datasets:
- ehartford/dolphin
- THUDM/AgentInstruct
---
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
# Causal-Dolphin-Agent-v0.1
This model is a LoRA fine-tune of [CausalLM/7B](https://huggingface.co/CausalLM/7B) on Eric's wonderful [Dolphin](https://huggingface.co/datasets/ehartford/dolphin) dataset, with [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct) mixed into both training runs.
Causal-Dolphin-Agent was trained for 3 epochs on 5 million GPT-3.5-augmented FLAN instructions and the AgentInstruct dataset, all in ChatML format. It was then trained for a further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.
It achieves the following results on the evaluation set:
- Loss: 2.8435
## Prompt Format
Causal-Dolphin-Agent uses ChatML as the prompt format:
```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
If Danny owns a bike, then Edward owns a bike. If Edward owns a bike, then Freddy owns a bike. If Danny owns a bike, which of the following statements must be true? Let's think step by step.
I. Edward owns a bike.
II. Freddy owns a bike.
III. Freddy does not own a bike.
Choose one answer:
I only
II only
III only
I and II only
I and III only
<|im_end|>
<|im_start|>assistant
```
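The layout above can be assembled programmatically. The following is a minimal sketch, not part of the original training code: `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical names chosen for illustration, and the user message is shortened from the example above.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; it only mirrors the layout shown
    in this card and is not taken from the model's tokenizer config.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "If Danny owns a bike, which statements must be true?"},
])
print(prompt)
```

Stopping generation at `<|im_end|>` is the usual way to end the assistant turn with this format.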
## Training and evaluation data
- [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin)
- [THUDM/AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct)
## Training procedure
Causal-Dolphin-Agent was trained for 3 epochs on 5 million GPT-3.5-augmented FLAN instructions and the AgentInstruct dataset, all in ChatML format. It was then trained for a further 3 epochs on 1 million GPT-4-augmented FLAN instructions, again with AgentInstruct shuffled in.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3
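
To make the schedule concrete, here is a rough sketch of how a cosine schedule with linear warmup maps a step to a learning rate, using the `learning_rate` and `lr_scheduler_warmup_steps` values listed above. The function name `lr_at_step` is hypothetical, and `total_steps=918` is an assumption (about 306 steps per epoch over 3 epochs, estimated from the step and epoch columns reported in this card), not a logged value.

```python
import math


def lr_at_step(step, base_lr=6e-6, warmup_steps=100, total_steps=918):
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to base_lr, then half-cosine decay to 0.
    total_steps is an assumed value, not one reported by the training run.
    """
    if step < warmup_steps:
        # Linear warmup phase.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 509, 918):
    print(f"step {step:>3}: lr = {lr_at_step(step):.2e}")
```

The rate peaks at `6e-06` when warmup ends and falls to half of that at the midpoint of the remaining steps.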
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.7774 | 0.0 | 1 | 5.1009 |
| 3.2798 | 0.15 | 46 | 5.1010 |
| 2.0722 | 0.3 | 92 | 5.0489 |
| 2.5919 | 0.45 | 138 | 4.8834 |
| 2.0011 | 0.6 | 184 | 4.6678 |
| 1.3733 | 0.75 | 230 | 4.4628 |
| 1.7321 | 0.9 | 276 | 4.2757 |
| 1.3994 | 1.05 | 322 | 4.1029 |
| 1.2308 | 1.2 | 368 | 3.8916 |
| 0.8229 | 1.35 | 414 | 3.6451 |
| 0.9592 | 1.5 | 460 | 3.4106 |
| 0.8528 | 1.65 | 506 | 3.2250 |
| 0.7362 | 1.8 | 552 | 3.0852 |
| 0.8077 | 1.95 | 598 | 2.9881 |
| 0.6912 | 2.1 | 644 | 2.9315 |
| 0.7776 | 2.25 | 690 | 2.8911 |
| 0.6916 | 2.41 | 736 | 2.8678 |
| 0.8674 | 2.56 | 782 | 2.8534 |
| 0.7797 | 2.71 | 828 | 2.8545 |
| 0.6838 | 2.86 | 874 | 2.8435 |
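
Loss values can be easier to compare as perplexity. The sketch below assumes the reported validation loss is a mean per-token cross-entropy in nats, which this card does not state explicitly; the `perplexity` helper is a hypothetical name, and the three losses are copied from the first, middle, and last rows shown here.

```python
import math


def perplexity(loss):
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Validation losses taken from the table above, keyed by training step.
validation_losses = {1: 5.1009, 460: 3.4106, 874: 2.8435}

for step, loss in validation_losses.items():
    print(f"step {step:>3}: loss {loss:.4f} -> perplexity {perplexity(loss):.1f}")
```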
### Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1