abdullahmeda committed
Commit d9cbd92 · verified · 1 Parent(s): ffcb078

Update app.py

Files changed (1): app.py +30 -25
app.py CHANGED
@@ -116,7 +116,7 @@ def predict(text):
 
 with gr.Blocks() as demo:
     gr.Markdown(
-        """
+        """\
         ## Detect text generated using LLMs 🤖
 
         Linguistic features such as Perplexity and other SOTA methods such as GLTR were used to classify between Human written and LLM Generated \
@@ -124,31 +124,36 @@ with gr.Blocks() as demo:
 
         - Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
         - Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
-        - Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
-
-        ### Linguistic Analysis: Language Model Perplexity
-        Perplexity (PPL) is commonly used as a metric for evaluating the performance of language models (LMs). It is defined as the exponential \
-        of the negative average log-likelihood of the text under the LM. A lower PPL indicates that the language model is more confident in its \
-        predictions and is therefore considered a better model. Because LMs are trained on large-scale text corpora, they can be assumed to have \
-        learned common language patterns and text structures. Therefore, PPL can be used to measure how well a text conforms to common \
-        characteristics.
-
-        ### GLTR: Giant Language Model Test Room
-        This idea originates from the following paper: arxiv.org/pdf/1906.04043.pdf. It proposes three tests for computing features of an input \
-        text. The major assumption is that to generate fluent and natural-looking text, most decoding strategies sample high-probability tokens \
-        from the head of the distribution. I selected the most powerful Test-2 feature: the number of tokens in the Top-10, Top-100, Top-1000, \
-        and 1000+ ranks of the LM's predicted probability distributions.
-
-        ### Modelling
-        Scikit-learn's VotingClassifier consisting of XGBClassifier, LGBMClassifier, CatBoostClassifier, and RandomForestClassifier, all with default parameters
+        - Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)\
         """
     )
-    a1 = gr.Textbox(lines=7, label='Text', value=example)
-    button1 = gr.Button("🤖 Predict!")
-    gr.Markdown("Prediction:")
-    label1 = gr.Textbox(lines=1, label='Predicted Label')
-    score1 = gr.Textbox(lines=1, label='Predicted Probability')
-
-    button1.click(predict, inputs=[a1], outputs=[label1, score1])
+    with gr.Column():
+        gr.Markdown(
+            """\
+            ### Linguistic Analysis: Language Model Perplexity
+            Perplexity (PPL) is commonly used as a metric for evaluating the performance of language models (LMs). It is defined as the exponential \
+            of the negative average log-likelihood of the text under the LM. A lower PPL indicates that the language model is more confident in its \
+            predictions and is therefore considered a better model. Because LMs are trained on large-scale text corpora, they can be assumed to have \
+            learned common language patterns and text structures. Therefore, PPL can be used to measure how well a text conforms to common \
+            characteristics.
+
+            ### GLTR: Giant Language Model Test Room
+            This idea originates from the following paper: arxiv.org/pdf/1906.04043.pdf. It proposes three tests for computing features of an input \
+            text. The major assumption is that to generate fluent and natural-looking text, most decoding strategies sample high-probability tokens \
+            from the head of the distribution. I selected the most powerful Test-2 feature: the number of tokens in the Top-10, Top-100, Top-1000, \
+            and 1000+ ranks of the LM's predicted probability distributions.
+
+            ### Modelling
+            Scikit-learn's VotingClassifier consisting of XGBClassifier, LGBMClassifier, CatBoostClassifier, and RandomForestClassifier, all with default parameters\
+            """
+        )
+    with gr.Group():
+        a1 = gr.Textbox(lines=7, label='Text', value=example)
+        button1 = gr.Button("🤖 Predict!")
+        gr.Markdown("Prediction:")
+        label1 = gr.Textbox(lines=1, label='Predicted Label')
+        score1 = gr.Textbox(lines=1, label='Predicted Probability')
+
+        button1.click(predict, inputs=[a1], outputs=[label1, score1])
 
 demo.launch()
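For reference, the perplexity feature described in the Markdown above can be reproduced in a few lines. A minimal sketch, assuming GPT-2 via Hugging Face transformers (this diff does not show which LM app.py actually loads):

```python
# Perplexity as the exponential of the negative average log-likelihood
# of the text under a causal LM. GPT-2 is an assumption; the model that
# app.py actually uses is not shown in this diff.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        # With labels=input_ids, the returned loss is the average
        # token-level cross-entropy, i.e. the negative average log-likelihood.
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()
```

A lower value means the LM found the text more predictable, which the app uses as one classification feature.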
 
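In the same spirit, a minimal sketch of the GLTR Test-2 feature described above: for each position, find the rank of the observed token in the LM's predicted distribution, then count how many fall in the Top-10, Top-100, Top-1000, and 1000+ buckets. It reuses the tokenizer and model from the perplexity sketch; the exact extraction in app.py is not shown in this diff.

```python
import torch

def gltr_buckets(text: str) -> list[int]:
    """Counts of observed tokens in Top-10 / Top-100 / Top-1000 / 1000+ ranks."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    ids = enc.input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    # The distribution at position i predicts the token at position i + 1.
    probs = torch.softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    # Rank of each observed token = number of vocabulary entries the LM
    # scored strictly higher (rank 0 is the most likely token).
    target_probs = probs.gather(1, targets.unsqueeze(1))
    ranks = (probs > target_probs).sum(dim=-1)
    counts = [0, 0, 0, 0]
    for rank in ranks.tolist():
        if rank < 10:
            counts[0] += 1
        elif rank < 100:
            counts[1] += 1
        elif rank < 1000:
            counts[2] += 1
        else:
            counts[3] += 1
    return counts
```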
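Finally, the "Modelling" section maps directly onto scikit-learn's ensemble API. A sketch under stated assumptions: voting="soft" is not confirmed by the diff but would let the app's "Predicted Probability" output come from predict_proba, and X_train / y_train are hypothetical placeholders for the competition features and labels.

```python
# Soft-voting ensemble over the four classifiers named in the write-up,
# all with default parameters. voting="soft" and the training-data names
# are assumptions; the diff shows neither.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

ensemble = VotingClassifier(
    estimators=[
        ("xgb", XGBClassifier()),
        ("lgbm", LGBMClassifier()),
        ("cat", CatBoostClassifier(verbose=0)),  # verbose=0 only silences training logs
        ("rf", RandomForestClassifier()),
    ],
    voting="soft",
)

# ensemble.fit(X_train, y_train)
# proba = ensemble.predict_proba(X_test)[:, 1]  # probability of "LLM-generated"
```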