RLHF-And-Friends's Collections
FedPPO-Pythia
Updated Dec 13, 2024
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a0 • Text Generation • Updated Dec 13, 2024 • 125
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a1 • Text Generation • Updated Dec 13, 2024 • 121
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a0 • Text Generation • Updated Dec 13, 2024 • 117
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a1 • Text Generation • Updated Dec 13, 2024 • 124
RLHF-And-Friends/FedPPO-Confused-Pythia-70M-a1 • Text Generation • Updated Dec 13, 2024 • 123
RLHF-And-Friends/FedPPO-Confused-Pythia-70M-a0 • Text Generation • Updated Dec 13, 2024 • 121