🚨 ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Jun 25, 2024 • 5
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps Paper • 2412.15035 • Published 16 days ago • 4
Word Sense Linking: Disambiguating Outside the Sandbox Paper • 2412.09370 • Published 23 days ago • 8
Word Sense Linking Collection Word Sense Linking is the task designed to identify and disambiguate spans of text to their most suitable senses from a reference inventory. • 6 items • Updated 22 days ago • 6
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS Paper • 2411.19655 • Published Nov 29, 2024 • 20
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28, 2024 • 77
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published Apr 8, 2024 • 9
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Paper • 2404.08676 • Published Apr 6, 2024 • 3
Aurora-M models Collection Aurora-M models (base, biden-harris redteams and instruct) • 5 items • Updated May 6, 2024 • 17
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 41