sh110495's Collections
Interested
Large Language Model Unlearning via Embedding-Corrupted Prompts
Paper
•
2406.07933
•
Published
•
7
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Paper
•
2406.02657
•
Published
•
37
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper
•
2406.12050
•
Published
•
19
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper
•
2406.11813
•
Published
•
30
Breaking the Attention Bottleneck
Paper
•
2406.10906
•
Published
•
4
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper
•
2406.17557
•
Published
•
88
Unlocking Continual Learning Abilities in Language Models
Paper
•
2406.17245
•
Published
•
28
Scaling Laws for Linear Complexity Language Models
Paper
•
2406.16690
•
Published
•
22
Aligning Teacher with Student Preferences for Tailored Training Data Generation
Paper
•
2406.19227
•
Published
•
24
Is Programming by Example solved by LLMs?
Paper
•
2406.08316
•
Published
•
12
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Paper
•
2406.14909
•
Published
•
14
Can LLMs Learn by Teaching? A Preliminary Study
Paper
•
2406.14629
•
Published
•
19
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Paper
•
2407.01920
•
Published
•
13
On Leakage of Code Generation Evaluation Datasets
Paper
•
2407.07565
•
Published
•
5
Paper
•
2407.10671
•
Published
•
160
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper
•
2407.10969
•
Published
•
20
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Paper
•
2407.09121
•
Published
•
5
Practical Unlearning for Large Language Models
Paper
•
2407.10223
•
Published
•
4
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Paper
•
2407.13833
•
Published
•
12
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
•
2403.19887
•
Published
•
104
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
Paper
•
2408.02545
•
Published
•
35
CoverBench: A Challenging Benchmark for Complex Claim Verification
Paper
•
2408.03325
•
Published
•
14
Better Alignment with Instruction Back-and-Forth Translation
Paper
•
2408.04614
•
Published
•
14
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
•
2408.04619
•
Published
•
155
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper
•
2408.10914
•
Published
•
41
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Paper
•
2408.15496
•
Published
•
10
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Paper
•
2409.04109
•
Published
•
43
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation
Paper
•
2410.23090
•
Published
•
54
Can Language Models Replace Programmers? REPOCOD Says 'Not Yet'
Paper
•
2410.21647
•
Published
•
17
Paper
•
2410.21276
•
Published
•
82
LongReward: Improving Long-context Large Language Models with AI Feedback
Paper
•
2410.21252
•
Published
•
17
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
•
2411.13676
•
Published
•
40