Hugging Face
Agustín Piqueres Lajarín
plaguss
44 followers · 47 following
AI & ML interests
None yet
Recent Activity
Reacted to lewtun's post with 🔥 (5 days ago):
This paper (https://huggingface.co/papers/2412.18925) has a really interesting recipe for inducing o1-like behaviour in Llama models:
* Iteratively sample CoTs from the model, using a mix of different search strategies. This gives you something like Stream of Search via prompting.
* Verify correctness of each CoT using GPT-4o (needed because exact match doesn't work well in medicine where there are lots of aliases)
* Use GPT-4o to reformat the concatenated CoTs into a single stream that includes smooth transitions like "hmm, wait" etc that one sees in o1
* Use the resulting data for SFT & RL
* Use sparse rewards from GPT-4o to guide RL training.
They find RL gives an average ~3 point boost across medical benchmarks and SFT on this data already gives a strong improvement. Applying this strategy to other domains could be quite promising, provided the training data can be formulated with verifiable problems!
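The recipe in the post can be pictured with a small Python sketch. This is only an illustration of the search-then-verify loop as the post describes it, not the paper's code: the function names, the strategy labels, the `max_tries` limit, and the 1.0/0.0 reward values are all assumptions, and the model and GPT-4o calls are passed in as plain callables so that nothing here depends on a real API.

```python
from typing import Callable, List, Optional

# Hypothetical strategy labels; the post only says "a mix of different search strategies".
STRATEGIES = ["backtrack", "explore_new_path", "verify", "correct"]


def search_cot(
    question: str,
    answer: str,
    sample: Callable[[str, List[str], str], str],
    is_correct: Callable[[str, str], bool],
    max_tries: int = 4,
) -> Optional[List[str]]:
    """Sample CoTs one at a time until the verifier accepts one.

    `sample` stands in for the policy model and `is_correct` for the
    GPT-4o verifier. Returns every attempt in order (failed ones included),
    or None if no attempt was accepted within `max_tries`.
    """
    attempts: List[str] = []
    for i in range(max_tries):
        strategy = STRATEGIES[i % len(STRATEGIES)]
        cot = sample(question, attempts, strategy)
        attempts.append(cot)
        if is_correct(cot, answer):
            return attempts
    return None


def build_sft_example(
    question: str,
    attempts: List[str],
    reformat: Callable[[List[str]], str],
) -> dict:
    """Merge the attempts into one stream; `reformat` stands in for GPT-4o."""
    return {"prompt": question, "completion": reformat(attempts)}


def sparse_reward(
    cot: str, answer: str, is_correct: Callable[[str, str], bool]
) -> float:
    """Outcome-only reward for RL: 1.0 if the verifier accepts, else 0.0 (assumed values)."""
    return 1.0 if is_correct(cot, answer) else 0.0
```

Keeping the failed attempts in the returned list matters here, since the reformatting step is what turns them into the "hmm, wait" style transitions mentioned in the post.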
Liked a Space (7 days ago): data-agents/jupyter-agent
Liked a Space (18 days ago): HuggingFaceH4/blogpost-scaling-test-time-compute
Articles
How we leveraged distilabel to create an Argilla 2.0 Chatbot (Jul 16, 2024 • 32)
Organizations
plaguss's activity
New activity in argilla/FinePersonas-v0.1 (24 days ago)
Removing embeddings information to reduce the size of this dataset · 3 · #6 opened 3 months ago by alexneakameni
New activity in argilla/FinePersonas-v0.1 (3 months ago)
noob questions · 2 · #4 opened 3 months ago by KarolCodes
How to run the persona-to-persona code? · 1 · #5 opened 3 months ago by dimpu01
New activity in argilla/FinePersonas-v0.1 (4 months ago)
Multimodal Personas · 2 · #2 opened 4 months ago by Taylor658
New activity in argilla/magpie-ultra-v0.1 (5 months ago)
Update README.md · #7 opened 5 months ago by Sweatydragon1
New activity in plaguss/distilabel-sample-evol-instruct (11 months ago)
add distilabel and synthethic tag · 1 · #2 opened 11 months ago by davidberenstein1957
New activity in argilla/distilabeled-Marcoro14-7B-slerp (12 months ago)
update license · 1 · #2 opened 12 months ago by mlabonne
add base_model · 1 · #1 opened 12 months ago by mlabonne
New activity in argilla/distilabeled-OpenHermes-2.5-Mistral-7B (12 months ago)
Can some one link me to gguf? · 1 · #1 opened 12 months ago by Hugs4Llamas