anonymousauthorsanonymous committed

Commit 4a9075d
Parent(s): 4f9d18b

Clean up description. Higher rez image

Files changed:
- app.py (+12 -7)
- spec_metric_result.png (+0 -0)
app.py
CHANGED
@@ -210,9 +210,14 @@ demo = gr.Blocks()
 with demo:
     input_texts = gr.Variable([])
     gr.Markdown("**Detect Task Specification at Inference-time.**")
-    gr.Markdown("""
-
-
+    gr.Markdown("""This method exploits the specification-induced spurious correlations demonstrated in this
+    [Spurious Correlations Hugging Face Space](https://huggingface.co/spaces/anonymousauthorsanonymous/spurious) to detect task specification at inference-time.
+    For this method, well-specified tasks should have a lower specification metric value, and unspecified tasks should have a higher specification metric value.
+    """)
+
+    gr.Markdown("""As an example, see the figure below with test sentences from the [Winogender schema](https://aclanthology.org/N18-2002/) for the occupation of `Doctor`.
+    With a close read, you can see that only sentence numbers (3) and (4) are well-specified for the gendered pronoun resolution task:
+    the masked pronoun is coreferent with the `man` or `woman`; the remainder are unspecified: the masked pronoun is coreferent with a gender-unspecified person.
 
     In this example we have 100\% accurate detection with the specification metric near zero for only sentence (3) and (4).
     <p align="center">
@@ -221,14 +226,14 @@ with demo:
     """)
 
 
-    gr.Markdown("**
+    gr.Markdown("**To test this for yourself, follow the numbered steps below to test one of the pre-loaded options.** Once you get the hang of it, you can load a new model and/or provide your own input texts.")
     gr.Markdown(f"""1) Pick a preloaded BERT-like model.
     *Note: RoBERTa-large performance is best.*
     2) Pick an Occupation type from the Winogender Schemas evaluation set.
     *Or select '{PICK_YOUR_OWN_LABEL}' (it need not be about an occupation).*
-    3) Click button to load input texts.
+    3) Click the first button to load input texts.
     *Read the sentences to determine which two are well-specified for gendered pronoun coreference resolution. The rest are gender-unspecified.*
-    4) Click button to get Task Specification Metric results
+    4) Click the second button to get Task Specification Metric results.
     """)
 
 
@@ -272,7 +277,7 @@ with demo:
     with gr.Row():
         uncertain_btn = gr.Button("4) Click to get Task Specification Metric results!")
     gr.Markdown(
-        """We expect a lower specification metric for well-specified tasks.
+        """We expect a lower specification metric value for well-specified tasks.
 
         Note: If there is an * by a sentence number, then at least one top prediction for that sentence was non-gendered.""")
 
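The new description says well-specified tasks should score near zero on the specification metric while unspecified tasks score higher. The metric's actual definition is not part of this diff; as a hypothetical illustration only, a metric with that behavior could be sketched as normalized entropy over a masked language model's gendered-pronoun probabilities (the function name, pronoun set, and probability values below are all assumptions, not taken from the Space's code):

```python
import math

def specification_metric(pronoun_probs):
    """Hypothetical sketch: normalized entropy over gendered-pronoun
    probabilities for a [MASK] position. Near 0 when probability mass is
    concentrated on one pronoun (well-specified), near 1 when it is
    spread evenly (gender-unspecified)."""
    total = sum(pronoun_probs.values())
    probs = [p / total for p in pronoun_probs.values() if p > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(pronoun_probs))

# Assumed example probabilities for the masked pronoun:
well_specified = {"he": 0.90, "she": 0.08, "they": 0.02}   # confident pick
unspecified = {"he": 0.36, "she": 0.34, "they": 0.30}      # spread out

print(specification_metric(well_specified))  # low value
print(specification_metric(unspecified))     # high value
```

With these assumed probabilities, the well-specified sentence scores low and the unspecified one scores close to 1, matching the qualitative behavior the description claims.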
spec_metric_result.png
CHANGED
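The numbered steps above load Winogender-style sentences in which only the two sentences whose masked pronoun is coreferent with `man` or `woman` are well-specified. The Space's own loader is not shown in this diff; the following is an illustrative sketch of such templated inputs (the template wording, referent list, and helper name are assumptions for demonstration):

```python
# Illustrative sketch of Winogender-style inputs: for a given occupation,
# only sentences whose other referent is gendered ("man"/"woman") are
# well-specified for pronoun coreference resolution.

def build_sentences(occupation):
    """Return (sentence, is_well_specified) pairs for an occupation.
    Template wording is an assumption, not the Space's actual data."""
    referents = [
        ("patient", False),  # gender-unspecified referent
        ("visitor", False),  # gender-unspecified referent
        ("man", True),       # gendered referent -> well-specified
        ("woman", True),     # gendered referent -> well-specified
    ]
    template = "The {occ} told the {ref} that [MASK] would arrive soon."
    return [
        (template.format(occ=occupation, ref=ref), specified)
        for ref, specified in referents
    ]

for i, (text, specified) in enumerate(build_sentences("doctor"), start=1):
    tag = "well-specified" if specified else "unspecified"
    print(f"({i}) [{tag}] {text}")
```

As in the figure described above, sentences (3) and (4) here are the well-specified ones, since their masked pronoun is coreferent with a gendered referent.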