OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 113
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness Paper • 2410.07035 • Published Oct 9, 2024 • 17
A Closer Look into Mixture-of-Experts in Large Language Models Paper • 2406.18219 • Published Jun 26, 2024 • 15
Unlocking Continual Learning Abilities in Language Models Paper • 2406.17245 • Published Jun 25, 2024 • 28
Efficient Continual Pre-training by Mitigating the Stability Gap Paper • 2406.14833 • Published Jun 21, 2024 • 19
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents Paper • 2406.13923 • Published Jun 20, 2024 • 21
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters Paper • 2405.16287 • Published May 25, 2024 • 10
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published May 24, 2024 • 25
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models Paper • 2404.03543 • Published Apr 4, 2024 • 15
Long-context LLMs Struggle with Long In-context Learning Paper • 2404.02060 • Published Apr 2, 2024 • 36