xansar's Collections • daily_paper • updated
The Generative AI Paradox: "What It Can Create, It May Not Understand" • arXiv:2311.00059 • 18 upvotes
Teaching Large Language Models to Reason with Reinforcement Learning • arXiv:2403.04642 • 46 upvotes
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM • arXiv:2403.07816 • 39 upvotes
PERL: Parameter Efficient Reinforcement Learning from Human Feedback • arXiv:2403.10704 • 57 upvotes
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement • arXiv:2403.15042 • 25 upvotes
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text • arXiv:2403.18421 • 22 upvotes
sDPO: Don't Use Your Data All at Once • arXiv:2403.19270 • 40 upvotes
Advancing LLM Reasoning Generalists with Preference Trees • arXiv:2404.02078 • 44 upvotes
ReFT: Representation Finetuning for Language Models • arXiv:2404.03592 • 91 upvotes
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model • arXiv:2404.04167 • 12 upvotes
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies • arXiv:2404.06395 • 22 upvotes
Rho-1: Not All Tokens Are What You Need • arXiv:2404.07965 • 88 upvotes
Pre-training Small Base LMs with Fewer Tokens • arXiv:2404.08634 • 35 upvotes
Learn Your Reference Model for Real Good Alignment • arXiv:2404.09656 • 82 upvotes
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data • arXiv:2404.12195 • 11 upvotes
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series • arXiv:2405.19327 • 46 upvotes