LLM Training - a Stalin16 Collection

Stalin16 's Collections

Data and other things

Gen AI Diffusion

LLM Training

updated about 18 hours ago

Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Paper • 2410.09335 • Published Oct 12, 2024 • 16
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9, 2024 • 35
Emergent properties with repeated examples

Paper • 2410.07041 • Published Oct 9, 2024 • 8
Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 69
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints

Paper • 2410.06458 • Published Oct 9, 2024 • 8
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published Oct 31, 2024 • 59
Constraint Back-translation Improves Complex Instruction Following of Large Language Models

Paper • 2410.24175 • Published Oct 31, 2024 • 16
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages

Paper • 2410.23825 • Published Oct 31, 2024 • 3
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse

Paper • 2410.21333 • Published Oct 27, 2024 • 10
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

Paper • 2410.19290 • Published Oct 25, 2024 • 10
AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21, 2024 • 58
Stronger Models are NOT Stronger Teachers for Instruction Tuning

Paper • 2411.07133 • Published Nov 11, 2024 • 34
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization

Paper • 2411.06208 • Published Nov 9, 2024 • 19
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

Paper • 2411.04997 • Published Nov 7, 2024 • 37
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 31
Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12, 2024 • 62
Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 43
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

Paper • 2411.11504 • Published Nov 18, 2024 • 19
RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 47
SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 51
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 68
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection

Paper • 2411.14794 • Published Nov 22, 2024 • 12
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 56
VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published 27 days ago • 105
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Paper • 2412.02980 • Published 28 days ago • 12
MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published 30 days ago • 39
Free Process Rewards without Process Labels

Paper • 2412.01981 • Published 30 days ago • 28
On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published Nov 29, 2024 • 25
Reverse Thinking Makes LLMs Stronger Reasoners

Paper • 2411.19865 • Published Nov 29, 2024 • 19
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published 26 days ago • 121
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published 26 days ago • 46
Fully Open Source Moxin-7B Technical Report

Paper • 2412.06845 • Published 24 days ago • 10
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 20 days ago • 92
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

Paper • 2412.08737 • Published 21 days ago • 51
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Paper • 2412.09501 • Published 20 days ago • 43
VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Paper • 2412.08687 • Published 21 days ago • 13
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 14 days ago • 113
No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published 16 days ago • 41
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published 13 days ago • 80