xinlai/DeepSeekMath-RL-Step-DPO
Tags: Text Generation, Transformers, Safetensors, llama, conversational, text-generation-inference, Inference Endpoints
arxiv: 2406.18629
License: apache-2.0
Commit History (branch: main)
Update README.md (f8c9733, verified): xinlai committed on Jun 28, 2024
Update README.md (0134ded, verified): xinlai committed on Jun 28, 2024
upload model (6d4f1cf): xinlai committed on Jun 25, 2024
initial commit (9dae8ba, verified): xinlai committed on Jun 25, 2024