Commit c96dbc6, committed by Shane
1 Parent(s): 557b080
updated citations
app.py
CHANGED
@@ -167,11 +167,11 @@ with gr.Blocks(css=custom_css) as app:
     with gr.Row():
         with gr.Accordion("📚 Citation", open=False):
             citation_button = gr.Textbox(
-                value=r"""@
-
-
-
-
+                value=r"""@article{lyu2024href,
+                title={HREF: Human Response-Guided Evaluation of Instruction Following in Language Models},
+                author={Xinxi Lyu and Yizhong Wang and Hannaneh Hajishirzi and Pradeep Dasigi},
+                journal={arXiv preprint arXiv:2412.15524},
+                year={2024}
                 }""",
                 lines=7,
                 label="Copy the following to cite these results.",
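The change above replaces an empty `@ ... }` placeholder with a complete BibTeX entry. As a rough illustration only (not part of the commit), the following sketch checks that the new string is well formed and fits the `lines=7` setting of the textbox. The helper names `braces_balanced` and `bibtex_fields` are hypothetical, and the leading whitespace inside the string is an assumption, since the diff view did not preserve indentation.

```python
# The BibTeX entry as it appears in the added lines of the diff.
# Assumption: the two-space indentation of the fields is illustrative only.
CITATION = r"""@article{lyu2024href,
  title={HREF: Human Response-Guided Evaluation of Instruction Following in Language Models},
  author={Xinxi Lyu and Yizhong Wang and Hannaneh Hajishirzi and Pradeep Dasigi},
  journal={arXiv preprint arXiv:2412.15524},
  year={2024}
}"""

# The removed value as shown in the diff: "@", four empty lines, then "}".
OLD_PLACEHOLDER = "@\n\n\n\n\n}"


def braces_balanced(text: str) -> bool:
    """Return True if every '{' has a matching '}' and none closes early."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def bibtex_fields(entry: str) -> dict:
    """Hypothetical helper: extract simple single-line `key={value}` fields."""
    fields = {}
    # Skip the "@article{key," header line and the closing "}" line.
    for line in entry.splitlines()[1:-1]:
        key, _, value = line.strip().rstrip(",").partition("=")
        fields[key] = value.strip("{}")
    return fields


if __name__ == "__main__":
    print(braces_balanced(CITATION))         # new entry
    print(braces_balanced(OLD_PLACEHOLDER))  # old placeholder
    print(bibtex_fields(CITATION)["journal"])
    print(len(CITATION.splitlines()))
```

Under these assumptions the new entry spans six lines, so it fits in the seven-line textbox, while the old placeholder fails the brace check because its closing `}` has no opening partner.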
src/md.py
CHANGED
@@ -23,8 +23,6 @@ For reproductability, we use greedy decoding for all model generation as default
 - **Large**: HREF has the largest evaluation size among similar benchmarks, making its evaluation more reliable.
 - **Contamination-resistant**: HREF's evaluation set is hidden and uses public models for both the baseline model and judge model, which makes it completely free of contamination.
 - **Task Oriented**: Instead of naturally collected instructions from the user, HREF contains instructions that are written specifically targetting 8 distinct categories that are used in instruction tuning, which allows it to provide more insights about how to improve language models.
-## Contact Us
-TODO
 """
 
 # Get Pacific time zone (handles PST/PDT automatically)
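The last context line mentions obtaining a Pacific time zone that handles PST/PDT automatically, but the code that follows it is outside this hunk. A minimal sketch of one way to do that, assuming Python's standard `zoneinfo` module and the IANA zone name `America/Los_Angeles` (the file's actual implementation is not shown and may differ):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# Assumption: the IANA zone for US Pacific time; the source may use another library.
PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_now() -> datetime:
    """Return the current time as an aware datetime in Pacific time."""
    return datetime.now(PACIFIC)


# The zone object picks standard or daylight time based on the date itself.
winter = datetime(2024, 1, 15, 12, 0, tzinfo=PACIFIC)
summer = datetime(2024, 7, 15, 12, 0, tzinfo=PACIFIC)

if __name__ == "__main__":
    print(winter.tzname(), winter.utcoffset())
    print(summer.tzname(), summer.utcoffset())
    print(pacific_now().isoformat())
```

Because the offset is derived from the date, no manual switch between PST and PDT is needed.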