RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-Ultrafeedback-QLoRA
Tags: Transformers, Safetensors, trl-lib/ultrafeedback_binarized (dataset), Generated from Trainer, trl, reward-trainer, Inference Endpoints
Branch: main
Contributors: 1
History: 2 commits
Latest commit: ac7b6b6 "End of training" by evgurov (verified), 18 days ago
File                       Size       Scan    Storage  Last commit      Updated
.gitattributes             1.57 kB    Safe    -        End of training  18 days ago
README.md                  1.89 kB    Safe    -        End of training  18 days ago
adapter_config.json        784 Bytes  Safe    -        End of training  18 days ago
adapter_model.safetensors  6.84 MB    Safe    LFS      End of training  18 days ago
special_tokens_map.json    436 Bytes  Safe    -        End of training  18 days ago
tokenizer.json             17.2 MB    Safe    LFS      End of training  18 days ago
tokenizer_config.json      54.8 kB    Safe    -        End of training  18 days ago
training_args.bin          5.56 kB    pickle  LFS      End of training  18 days ago

Detected Pickle imports (10) in training_args.bin:
"transformers.trainer_utils.IntervalStrategy"
"transformers.trainer_utils.HubStrategy"
"transformers.trainer_pt_utils.AcceleratorConfig"
"accelerate.state.PartialState"
"transformers.trainer_utils.SaveStrategy"
"accelerate.utils.dataclasses.DistributedType"
"trl.trainer.reward_config.RewardConfig"
"transformers.trainer_utils.SchedulerType"
"transformers.training_args.OptimizerNames"
"torch.device"
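The "Detected Pickle imports" list for training_args.bin names the globals that the pickle would import when loaded. As a rough illustration of how such a list can be derived without unpickling anything, the sketch below walks the opcodes of a raw pickle stream with the standard-library `pickletools` module. It is not the Hub's scanner. The function name `list_pickle_imports` is hypothetical, the handling of `STACK_GLOBAL` is a simple heuristic that ignores memo lookups, and it assumes a bare pickle stream as input (a file written by `torch.save` may wrap the pickle in an archive that would need unpacking first).

```python
import pickle
import pickletools
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> list[str]:
    """Return the "module.name" globals referenced by a pickle stream.

    The stream is only parsed opcode by opcode; it is never unpickled,
    so no code embedded in it is executed.
    """
    imports = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            imports.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are pushed as strings beforehand.
            # Heuristic: take the two most recent string arguments. Strings
            # fetched back from the memo are not tracked here.
            if len(strings) >= 2:
                imports.append(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return sorted(set(imports))


# Usage on a small in-memory pickle (not on training_args.bin itself).
payload = pickle.dumps(OrderedDict(a=1), protocol=4)
print(list_pickle_imports(payload))
```

A listing like this only shows what a pickle would import. Whether those imports are acceptable is a separate judgment, which is why the file is labeled "pickle" rather than "Safe" in the table above.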