Hello, I am trying to fine-tune the Llama 2 7B model on my own dataset.
I am stuck on setting --data-path correctly. Here are my attempts:
1st command:
autotrain llm --train --data-path ./data --text-column text --peft --auto_find_batch_size --epochs 3 --trainer sft --model meta-llama/Llama-2-7b-hf --project-name ftllama2
=> Somehow, the data_path got changed to “ftllama2/autotrain-data”
2nd command:
autotrain llm --train --data-path ./data/train.csv --text-column text --peft --auto_find_batch_size --epochs 3 --trainer sft --model meta-llama/Llama-2-7b-hf --project-name ftllama2
=> The data_path is correct, but I am getting this error:
ERROR | 2024-02-15 08:44:06 | autotrain.trainers.common:wrapper:92 - Couldn’t find a dataset script at /home/ubuntu/workspace/git/language-agnostic-embedding/data/train.csv/train.csv.py or any data file in the same directory.
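The path in that error (data/train.csv/train.csv.py) suggests that the value of --data-path is being treated as a directory rather than a file. Assuming that is the case, and that the trainer looks for a train.csv with a text column inside that directory (my assumption, not something I have confirmed in the AutoTrain docs), here is a minimal Python sketch for checking that ./data has that layout. The function name check_data_dir and its defaults are hypothetical, chosen only for illustration.

```python
import csv
from pathlib import Path


def check_data_dir(data_dir, text_column="text", train_file="train.csv"):
    """Return a list of problems with the dataset layout (empty if none found).

    Assumed layout (not confirmed against AutoTrain itself):
    <data_dir>/<train_file> exists and its CSV header contains <text_column>.
    """
    problems = []
    data_dir = Path(data_dir)

    # The data path should point at the directory, not at the CSV file itself.
    if not data_dir.is_dir():
        problems.append(f"{data_dir} is not a directory")
        return problems

    train_path = data_dir / train_file
    if not train_path.is_file():
        problems.append(f"{train_path} does not exist")
        return problems

    # Read only the header row to check for the text column.
    with train_path.open(newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    if text_column not in header:
        problems.append(f"column {text_column!r} not found in header {header}")

    return problems


if __name__ == "__main__":
    issues = check_data_dir("./data")
    for issue in issues:
        print(issue)
    if not issues:
        print("layout looks as assumed")
```

If this reports no problems for ./data, then at least the file name and the column name match what I passed in the first command, and the remaining question is why the data path ends up as “ftllama2/autotrain-data”.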
Any comments are welcome.