---
license: mit
base_model: TheBloke/zephyr-7B-beta-GPTQ
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: v1.1
  results: []
datasets:
- hieunguyenminh/roleplay
pipeline_tag: text-generation
---
# v1.1

This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on the [hieunguyenminh/roleplay](https://huggingface.co/datasets/hieunguyenminh/roleplay) dataset.

## Model description
This model can adapt to any type of character and generate answers personalized to that character.
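As a sketch of how a character prompt might be built, the snippet below uses the Zephyr chat format of the base model (`<|system|>` / `<|user|>` / `<|assistant|>` turns). The helper name and persona text are illustrative assumptions, not part of the released code; in practice the tokenizer's `apply_chat_template` can produce the same format.

```python
def build_roleplay_prompt(persona: str, user_message: str) -> str:
    """Build a prompt in the Zephyr chat format used by the base model.

    The persona goes in the system turn so the model stays in character.
    This mirrors zephyr-7b-beta's <|system|>/<|user|>/<|assistant|>
    convention; it is a sketch, not the card's official prompt format.
    """
    return (
        f"<|system|>\n{persona}</s>\n"
        f"<|user|>\n{user_message}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_roleplay_prompt(
    "You are Captain Vega, a gruff starship captain.",  # hypothetical persona
    "What's our heading, Captain?",
)
```

The resulting string ends with the open `<|assistant|>` turn, so generation continues in character.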
## Training and evaluation data
It was trained with supervised fine-tuning (SFT); DPO training is planned for a future version.
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_steps: 400
- mixed_precision_training: Native AMP
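Assuming the run used `transformers.TrainingArguments` (consistent with the `trl`/`sft`/`generated_from_trainer` tags), the hyperparameters above roughly correspond to a configuration like the sketch below; the `output_dir` value is an assumption.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training configuration; argument
# names follow transformers.TrainingArguments, and values come from
# the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="v1.1",             # assumed output path
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    max_steps=400,                 # training_steps: 400
    fp16=True,                     # Native AMP mixed precision
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the
    # Trainer's default optimizer configuration.
)
```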
### Training results

- Loss after 1 epoch: 0.6
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0