Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms • Paper • 2403.17806 • Published Mar 26, 2024
Daily Picks in Interpretability & Analysis of LMs • Collection • Outstanding research in interpretability and evaluation of language models, summarized • 92 items • Updated 4 days ago