mmoffatt/custom_gpt2

mmoffatt committed on Aug 22, 2024

Commit f8c8d57 · verified · 1 Parent(s): fabe18e

Upload config

Files changed (2)

README.md +2 -2
config.json +2 -2

README.md CHANGED

@@ -1,9 +1,9 @@
 ---
+language:
+- en
 license: mit
 datasets:
 - Salesforce/wikitext
-language:
-- en
 ---
 This is a custom implementation of gpt2, where we replace attention with our implementation. Currently, we don't replace softmax, but in future submits we would like to replace the softmax function in attention with other softmax variations.
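
The README describes a GPT-2 variant whose attention is reimplemented so that the softmax can later be swapped out. The repository's actual attention code is not part of this commit, so the following is only a minimal sketch of the idea in plain Python: causal scaled dot-product attention for a single head, with the normalization passed in as a parameter. The names `causal_attention` and `softmax_fn` are hypothetical and chosen for illustration.

```python
import math
from typing import Callable, List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(
    q: Matrix,
    k: Matrix,
    v: Matrix,
    softmax_fn: Callable[[Vector], Vector] = softmax,  # hypothetical hook
) -> Matrix:
    """Single-head causal attention; `softmax_fn` normalizes each score row."""
    d = len(q[0])
    out: Matrix = []
    for i, qi in enumerate(q):
        # Position i attends only to positions 0..i (the causal mask).
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax_fn(scores)
        # Weighted sum of the value vectors for the visible positions.
        out.append(
            [sum(w * v[j][c] for j, w in enumerate(weights)) for c in range(len(v[0]))]
        )
    return out
```

Passing a different callable as `softmax_fn`, for example `lambda s: [1 / len(s)] * len(s)` for uniform weights, changes the normalization without touching the rest of the attention computation.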

config.json CHANGED

@@ -1,8 +1,8 @@
 {
-  "_name_or_path": "gpt2-custom",
+  "_name_or_path": "gpt2",
   "activation_function": "gelu_new",
   "architectures": [
-    "GPT2Model"
+    "GPT2LMHeadModel"
   ],
   "attn_pdrop": 0.1,
   "bos_token_id": 50256,
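
To make the config.json change easier to check, here is a small sketch that compares the two versions and lists the keys that differ. It uses only the keys visible in this hunk, since the rest of the file is not shown. The helper name `changed_keys` is hypothetical. In the transformers library, `GPT2Model` is the bare transformer and `GPT2LMHeadModel` adds a language-modeling head on top of it, so the `architectures` change records the checkpoint as a language model rather than a bare encoder stack.

```python
import json

# Only the keys visible in the diff hunk; the full config.json has more.
old_cfg = json.loads("""
{
  "_name_or_path": "gpt2-custom",
  "activation_function": "gelu_new",
  "architectures": ["GPT2Model"],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256
}
""")

new_cfg = json.loads("""
{
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": ["GPT2LMHeadModel"],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256
}
""")


def changed_keys(old: dict, new: dict) -> list:
    """Return the sorted keys whose values differ between two configs."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


for key in changed_keys(old_cfg, new_cfg):
    print(f"{key}: {old_cfg.get(key)!r} -> {new_cfg.get(key)!r}")
```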