- A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes
  Paper • 2102.06356 • Published
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
  Paper • 1904.00962 • Published • 1
- Decoupled Weight Decay Regularization
  Paper • 1711.05101 • Published • 1
zk67
AI & ML interests
None yet
Recent Activity
updated a collection about 22 hours ago: Training
Organizations
Collections
8
- A Survey on Data Selection for LLM Instruction Tuning
  Paper • 2402.05123 • Published • 3
- Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development
  Paper • 2407.11784 • Published • 4
- Data Management For Large Language Models: A Survey
  Paper • 2312.01700 • Published
- Datasets for Large Language Models: A Comprehensive Survey
  Paper • 2402.18041 • Published • 2
models
None public yet
datasets
None public yet