SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors Paper • 2406.14598 • Published Jun 20, 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications Paper • 2402.05162 • Published Feb 7, 2024
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data Paper • 2302.07194 • Published Feb 14, 2023
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths Paper • 2405.19715 • Published May 30, 2024
Visual Adversarial Examples Jailbreak Large Language Models Paper • 2306.13213 • Published Jun 22, 2023