tianchi007's Collections
llm_pretrain
updated
Paper
•
2412.08905
•
Published
•
95
Evaluating and Aligning CodeLLMs on Human Preference
Paper
•
2412.05210
•
Published
•
47
Evaluating Language Models as Synthetic Data Generators
Paper
•
2412.03679
•
Published
•
45
Yi-Lightning Technical Report
Paper
•
2412.01253
•
Published
•
25
Large Language Model-Brained GUI Agents: A Survey
Paper
•
2411.18279
•
Published
•
27
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?
Paper
•
2411.16489
•
Published
•
40
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Paper
•
2411.14405
•
Published
•
58
Natural Language Reinforcement Learning
Paper
•
2411.14251
•
Published
•
27
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
Paper
•
2411.10323
•
Published
•
31
Large Language Models Can Self-Improve in Long-context Reasoning
Paper
•
2411.08147
•
Published
•
62
A Survey of Small Language Models
Paper
•
2410.20011
•
Published
•
40
Paper
•
2410.21276
•
Published
•
82
Qwen2.5-Coder Technical Report
Paper
•
2409.12186
•
Published
•
138
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Paper
•
2408.08152
•
Published
•
52
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Paper
•
2406.11931
•
Published
•
58
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper
•
2408.06195
•
Published
•
63
Paper
•
2412.15115
•
Published
•
334
Paper
•
2412.13501
•
Published
•
23
DeepSeek-V3 Technical Report
Paper
•
2412.19437
•
Published
•
10
Direct Language Model Alignment from Online AI Feedback
Paper
•
2402.04792
•
Published
•
29
Solving math word problems with process- and outcome-based feedback
Paper
•
2211.14275
•
Published
•
7