Update app.py
app.py
CHANGED
@@ -50,21 +50,32 @@ ANSWER
 """
 
 DEFAULT_POLICY = """# Policy
+
 ## Overview
+
 This policy guides the classification of content related to expressions that may be sensitive or controversial, aiming to foster a community where open dialogue is encouraged while maintaining respect for all individuals.
+
 ## Definition of Terms
+
 - Protected Classes: Individuals or groups of individuals defined by sharing a race, ethnicity, religion, disability, sex, gender identity, or sexual orientation. The scope is narrowed to focus on characteristics commonly recognized in anti-discrimination laws. Religions themselves, as concepts ARE NOT protected - only their members are.
 - Non-Protected Classes: Groups defined by other characteristics, such as age, socioeconomic status, and education, which are typically excluded partially or entirely from anti-discrimination laws.
+
 ## Interpretation of Language
+
 ## Definition of Labels
+
 ### (HS): Hate Speech
+
 #### Includes
+
 - Explicit Violence Advocacy: Direct and clear threats of or praise for violence or harm against individuals or groups based on their membership in a particular protected class. Stating an intention to defend against aggression DOES NOT qualify
 - Intentional Dehumanization: Statements that clearly depict individuals or groups as inherently ""other"", alien, animalistic, unintelligent, immoral, unclean, or less-than-fully-human based on their membership in a particular protected class in a way that justifies harm or discrimination.
 - Targeted Use of Derogatory Slurs: Targeting another person or group of people using a one-word name for a particular protected class that has an inherent negative connotation (e.g. Nigger, Kike, Cunt, Retard). Multi-word terms are never slurs.
 - Explicit Discrimination Advocacy: Direct and clear calls for exclusion, segregation, or discrimination against individuals or groups based on their membership in a particular protected class, with a clear intent to promote inequality.
 - Direct Hateful Insults: Content that directly addresses another person or group of people the second person (e.g. ""You over there"") and insults them based on their membership in a particular protected class
+
 #### Excludes
+
 - Artistic and Educational Content: Expressions intended for artistic, educational, or documentary purposes that discuss sensitive topics but do not advocate for violence or discrimination against individuals or groups based on their membership in a particular protected class.
 - Political and Social Commentary: Commentary on political issues, social issues, and political ideologies that does not directly incite violence or discrimination against individuals or groups based on their membership in a particular protected class.
 - Rebutting Hateful Language: Content that rebuts, condemns, questions, criticizes, or mocks a different person's hateful language or ideas OR that insults the person advocating those hateful
@@ -93,6 +104,25 @@ iface = gr.Interface(
     outputs="label",
     title="CoPE Alpha Preview",
     description="See if the given content violates your given policy."
+    article="""
+## About CoPE
+
+CoPE (the COntent Policy Evaluation engine) is a small language model capable of accurate content policy labeling. This is a *preview* of our alpha release and is strictly for *research* purposes. This should *NOT* be used for any production use cases.
+
+### How to Use:
+
+1. Enter your content in the "Content" box.
+2. Specify your policy in the "Policy" box.
+3. Click "Submit" to see the results.
+
+*Note*: Inference times are *very slow* (30-45 seconds) since this is built on dev infra and not yet optimized for live systems. Please be patient!
+
+### Tips:
+
+- [Give us feedback](https://example.com) to help us improve
+- Read our FAQ to learn more about CoPE
+- Join our mailing list to keep in touch
+"""
 )
 
 # Launch the app
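
As rendered in this diff, the unchanged `description=` line has no trailing comma before the added `article=` argument, which Python would reject as a syntax error. The comma may simply have been lost when the page was extracted, so treat the following as a quick check rather than a confirmed bug. It is a minimal sketch using only the standard-library `ast` module; the two snippet strings are abbreviated stand-ins for the real `app.py`, and `parses` is a hypothetical helper that is not part of the commit.

```python
import ast

# Abbreviated stand-in for the call as it appears in the diff
# (assumption: no comma after the description= argument).
AS_RENDERED = '''
iface = gr.Interface(
    outputs="label",
    title="CoPE Alpha Preview",
    description="See if the given content violates your given policy."
    article="""
## About CoPE
"""
)
'''

# The same call with a comma separating the two keyword arguments.
WITH_COMMA = AS_RENDERED.replace('policy."\n', 'policy.",\n')


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python (nothing is executed)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("as rendered:", parses(AS_RENDERED))
    print("with comma: ", parses(WITH_COMMA))
```

Because `ast.parse` only checks syntax, the sketch runs without Gradio installed and never builds the interface.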
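
The default policy follows a regular Markdown layout: each label is a `###` heading of the form `(CODE): Name`, with `#### Includes` and `#### Excludes` bullet lists beneath it. The sketch below reads that layout into a dictionary, assuming other policies keep the same heading convention. `parse_policy` and the abbreviated `SAMPLE_POLICY` are hypothetical illustrations and not part of `app.py`.

```python
import re

# Abbreviated sample in the same layout as DEFAULT_POLICY (rule text shortened).
SAMPLE_POLICY = """# Policy

## Definition of Labels

### (HS): Hate Speech

#### Includes

- Explicit Violence Advocacy: Direct and clear threats of or praise for violence.
- Intentional Dehumanization: Statements that depict groups as less than human.

#### Excludes

- Political and Social Commentary: Commentary that does not incite violence.
"""

# Matches label headings such as "### (HS): Hate Speech".
LABEL_RE = re.compile(r"^###\s+\((?P<code>[^)]+)\):\s*(?P<name>.+)$")


def parse_policy(text: str) -> dict:
    """Map each label code to its name and the names of its include/exclude rules."""
    labels = {}
    current = None  # the label entry currently being filled
    section = None  # "includes" or "excludes" while inside such a list
    for raw in text.splitlines():
        line = raw.strip()
        match = LABEL_RE.match(line)
        if match:
            current = {"name": match["name"], "includes": [], "excludes": []}
            labels[match["code"]] = current
            section = None
        elif line.startswith("#### "):
            heading = line[5:].strip().lower()
            section = heading if heading in ("includes", "excludes") else None
        elif line.startswith("#"):
            # Any other heading closes the current label.
            current = None
            section = None
        elif line.startswith("- ") and current is not None and section:
            # Keep only the rule name before the first colon.
            current[section].append(line[2:].split(":", 1)[0].strip())
    return labels


if __name__ == "__main__":
    print(parse_policy(SAMPLE_POLICY))
```

A structure like this could be used to sanity-check a user-supplied policy before sending it to the model, for example by confirming that at least one label is defined.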