dfuhoiysOHSVFh82934gfjklb

huba-buba

AI & ML interests

None yet

Recent Activity

liked a model about 10 hours ago

PowerInfer/SmallThinker-3B-Preview

liked a dataset about 11 hours ago

xiaodongguaAIGC/awesome-dpo

liked a model about 11 hours ago

allenai/OLMo-2-1124-13B-Instruct

Organizations

None yet

huba-buba's activity

upvoted a collection 9 days ago

C4AI Aya Expanse

Collection

Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 3 items • Updated 18 days ago • 30

upvoted a paper 10 days ago

LearnLM: Improving Gemini for Learning

Paper • 2412.16429 • Published 14 days ago • 20

upvoted a paper 11 days ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published 14 days ago • 36

upvoted a paper 26 days ago

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published 28 days ago • 123

upvoted a collection about 1 month ago

Cotype-Nano

Collection

Small and strong 1.5B models • 4 items • Updated Nov 26, 2024 • 19

upvoted a paper 2 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 51

upvoted an article 2 months ago

The Rise of Agentic Data Generation

Article • Jul 15, 2024 • 79

upvoted 7 papers 2 months ago

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 16

Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Paper • 2410.21845 • Published Oct 29, 2024 • 13

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

Paper • 2312.09244 • Published Dec 14, 2023 • 7

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Paper • 2410.13232 • Published Oct 17, 2024 • 41

upvoted 6 papers 3 months ago

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

Paper • 2410.13639 • Published Oct 17, 2024 • 16

Roadmap towards Superhuman Speech Understanding using Large Language Models

Paper • 2410.13268 • Published Oct 17, 2024 • 33

Movie Gen: A Cast of Media Foundation Models

Paper • 2410.13720 • Published Oct 17, 2024 • 90

Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL

Paper • 2410.12491 • Published Oct 16, 2024 • 4

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

Paper • 2410.12381 • Published Oct 16, 2024 • 43

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11, 2024 • 85