Sahyus/roberta-large-squad2-finetuned-dtc

+---
+license: cc-by-4.0
+base_model: deepset/roberta-large-squad2
+tags:
+- generated_from_keras_callback
+model-index:
+- name: roberta-large-squad2-finetuned-dtc
+  results: []
+---
+<!-- This model card has been generated automatically according to the information Keras had access to. You should
+probably proofread and complete it, then remove this comment. -->
+# roberta-large-squad2-finetuned-dtc
+This model is a fine-tuned version of [deepset/roberta-large-squad2](https://huggingface.co/deepset/roberta-large-squad2) on an unknown dataset.
+It reports the following training and validation metrics after the final epoch (epoch 36):
+- Train Loss: 1.9389
+- Train End Logits Loss: 1.1432
+- Train Start Logits Loss: 0.7957
+- Train End Logits Acc: 0.7392
+- Train Start Logits Acc: 0.8093
+- Validation Loss: 3.7259
+- Validation End Logits Loss: 1.8885
+- Validation Start Logits Loss: 1.8374
+- Validation End Logits Acc: 0.6312
+- Validation Start Logits Acc: 0.7221
+- Epoch: 36
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2.4e-05, 'decay_steps': 21400, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.03}
+- training_precision: float32
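The optimizer entry above describes a `PolynomialDecay` schedule with `power` 1.0, `end_learning_rate` 0.0 and `cycle` False, which amounts to a linear decay from 2.4e-05 to zero over 21400 steps. The following is a minimal sketch of that schedule in plain Python, assuming the standard non-cycling polynomial decay formula; the function name `polynomial_decay_lr` is hypothetical and is not taken from the training script.

```python
def polynomial_decay_lr(step, initial_lr=2.4e-05, decay_steps=21400,
                        end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    # With cycle=False the step is clipped, so the rate stays at end_lr afterwards.
    clipped = min(step, decay_steps)
    remaining = 1.0 - clipped / decay_steps
    return (initial_lr - end_lr) * remaining ** power + end_lr


# Print the rate at the start, the midpoint, the end, and past the end.
for step in (0, 10700, 21400, 30000):
    print(step, polynomial_decay_lr(step))
```

With these values the rate halves at step 10700 and is zero from step 21400 onward, so any training steps beyond `decay_steps` would apply no gradient update.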
+### Training results
+| Train Loss | Train End Logits Loss | Train Start Logits Loss | Train End Logits Acc | Train Start Logits Acc | Validation Loss | Validation End Logits Loss | Validation Start Logits Loss | Validation End Logits Acc | Validation Start Logits Acc | Epoch |
+|:----------:|:---------------------:|:-----------------------:|:--------------------:|:----------------------:|:---------------:|:--------------------------:|:----------------------------:|:-------------------------:|:---------------------------:|:-----:|
+| 5.8888     | 3.0592                | 2.8296                  | 0.5456               | 0.5406                 | 4.8715          | 2.6861                     | 2.1854                       | 0.6130                    | 0.6182                      | 0     |
+| 5.0000     | 2.7063                | 2.2937                  | 0.5809               | 0.5762                 | 4.0680          | 2.3593                     | 1.7087                       | 0.6208                    | 0.6000                      | 1     |
+| 4.7529     | 2.5952                | 2.1576                  | 0.5929               | 0.5862                 | 4.5767          | 2.7450                     | 1.8317                       | 0.6208                    | 0.6156                      | 2     |
+| 4.6181     | 2.5511                | 2.0670                  | 0.5984               | 0.5873                 | 3.9828          | 2.4125                     | 1.5703                       | 0.6208                    | 0.6052                      | 3     |
+| 4.4766     | 2.4920                | 1.9846                  | 0.6019               | 0.5946                 | 3.7862          | 2.2460                     | 1.5402                       | 0.6208                    | 0.5922                      | 4     |
+| 4.5692     | 2.5720                | 1.9972                  | 0.6081               | 0.6066                 | 3.6069          | 2.1558                     | 1.4511                       | 0.6208                    | 0.6052                      | 5     |
+| 4.3098     | 2.4382                | 1.8716                  | 0.6016               | 0.5987                 | 3.2741          | 1.9275                     | 1.3466                       | 0.6208                    | 0.6364                      | 6     |
+| 3.8906     | 2.2240                | 1.6666                  | 0.6165               | 0.6256                 | 3.3856          | 1.9718                     | 1.4138                       | 0.6156                    | 0.6052                      | 7     |
+| 3.7711     | 2.1773                | 1.5939                  | 0.6154               | 0.6317                 | 3.4381          | 1.7916                     | 1.6465                       | 0.6182                    | 0.4805                      | 8     |
+| 3.6331     | 2.1149                | 1.5182                  | 0.6177               | 0.6460                 | 3.0055          | 1.6855                     | 1.3200                       | 0.5273                    | 0.6338                      | 9     |
+| 3.4683     | 2.0212                | 1.4471                  | 0.6168               | 0.6545                 | 3.3422          | 1.7875                     | 1.5547                       | 0.4805                    | 0.5325                      | 10    |
+| 3.3695     | 1.9567                | 1.4129                  | 0.6183               | 0.6618                 | 2.8283          | 1.5488                     | 1.2795                       | 0.5455                    | 0.6286                      | 11    |
+| 3.3125     | 1.9344                | 1.3781                  | 0.6215               | 0.6647                 | 2.7086          | 1.5124                     | 1.1962                       | 0.5636                    | 0.6338                      | 12    |
+| 3.2580     | 1.9282                | 1.3298                  | 0.6390               | 0.6852                 | 3.0502          | 1.7520                     | 1.2982                       | 0.6156                    | 0.6623                      | 13    |
+| 3.2814     | 1.9478                | 1.3336                  | 0.6294               | 0.6711                 | 2.5437          | 1.4591                     | 1.0846                       | 0.5948                    | 0.6727                      | 14    |
+| 3.1027     | 1.8305                | 1.2721                  | 0.6370               | 0.6893                 | 3.0537          | 1.6897                     | 1.3640                       | 0.5481                    | 0.5922                      | 15    |
+| 2.7670     | 1.6628                | 1.1042                  | 0.6583               | 0.7217                 | 2.4372          | 1.3791                     | 1.0581                       | 0.6519                    | 0.6961                      | 16    |
+| 2.7880     | 1.6975                | 1.0905                  | 0.6583               | 0.7339                 | 2.2441          | 1.2735                     | 0.9706                       | 0.7039                    | 0.7299                      | 17    |
+| 2.7786     | 1.6524                | 1.1262                  | 0.6606               | 0.7225                 | 2.6408          | 1.4267                     | 1.2141                       | 0.6701                    | 0.6831                      | 18    |
+| 2.4685     | 1.4862                | 0.9823                  | 0.6741               | 0.7447                 | 2.7726          | 1.5947                     | 1.1779                       | 0.6338                    | 0.6909                      | 19    |
+| 2.4204     | 1.4523                | 0.9682                  | 0.6814               | 0.7538                 | 2.1115          | 1.1877                     | 0.9238                       | 0.7429                    | 0.7714                      | 20    |
+| 2.2158     | 1.3472                | 0.8686                  | 0.6939               | 0.7707                 | 2.2647          | 1.2382                     | 1.0266                       | 0.7143                    | 0.7532                      | 21    |
+| 2.0138     | 1.2461                | 0.7676                  | 0.7109               | 0.7994                 | 2.1425          | 1.1617                     | 0.9808                       | 0.7455                    | 0.7558                      | 22    |
+| 2.0038     | 1.2585                | 0.7453                  | 0.7129               | 0.8008                 | 1.8952          | 0.9984                     | 0.8968                       | 0.7688                    | 0.7558                      | 23    |
+| 1.8391     | 1.1600                | 0.6791                  | 0.7231               | 0.8186                 | 2.4242          | 1.3208                     | 1.1034                       | 0.7013                    | 0.7039                      | 24    |
+| 1.7792     | 1.1060                | 0.6732                  | 0.7389               | 0.8248                 | 1.8800          | 1.0211                     | 0.8588                       | 0.7792                    | 0.7818                      | 25    |
+| 1.6690     | 1.0636                | 0.6054                  | 0.7462               | 0.8367                 | 2.2503          | 1.2198                     | 1.0305                       | 0.7325                    | 0.7506                      | 26    |
+| 1.6197     | 1.0327                | 0.5870                  | 0.7591               | 0.8452                 | 1.9393          | 0.9581                     | 0.9812                       | 0.7974                    | 0.8052                      | 27    |
+| 1.5335     | 0.9795                | 0.5540                  | 0.7652               | 0.8595                 | 2.2046          | 1.1750                     | 1.0296                       | 0.7688                    | 0.7870                      | 28    |
+| 1.4563     | 0.9314                | 0.5249                  | 0.7751               | 0.8621                 | 1.9638          | 1.0204                     | 0.9434                       | 0.7403                    | 0.7792                      | 29    |
+| 1.3903     | 0.9049                | 0.4854                  | 0.7772               | 0.8683                 | 2.2657          | 1.1569                     | 1.1088                       | 0.7636                    | 0.7896                      | 30    |
+| 1.3534     | 0.8813                | 0.4720                  | 0.7859               | 0.8744                 | 1.9620          | 0.9779                     | 0.9840                       | 0.7688                    | 0.7740                      | 31    |
+| 1.4848     | 0.9444                | 0.5405                  | 0.7684               | 0.8563                 | 2.3368          | 1.1941                     | 1.1427                       | 0.7299                    | 0.7688                      | 32    |
+| 1.5092     | 0.9534                | 0.5558                  | 0.7550               | 0.8461                 | 2.1233          | 1.0956                     | 1.0277                       | 0.7610                    | 0.7740                      | 33    |
+| 1.4016     | 0.8789                | 0.5227                  | 0.7751               | 0.8624                 | 2.4886          | 1.2593                     | 1.2294                       | 0.7403                    | 0.7844                      | 34    |
+| 1.8007     | 1.0509                | 0.7498                  | 0.7520               | 0.8183                 | 2.5730          | 1.3045                     | 1.2686                       | 0.7195                    | 0.7481                      | 35    |
+| 1.9389     | 1.1432                | 0.7957                  | 0.7392               | 0.8093                 | 3.7259          | 1.8885                     | 1.8374                       | 0.6312                    | 0.7221                      | 36    |
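One way to read the table above is to look for the epoch with the lowest validation loss and compare it with the final epoch. The sketch below does that with the Validation Loss column copied from the table (list index equals epoch); it also checks, for the final row, that the total validation loss equals the sum of the end-logits and start-logits losses. The variable names are illustrative only.

```python
# Validation Loss column from the table above, indexed by epoch (0..36).
val_loss = [
    4.8715, 4.0680, 4.5767, 3.9828, 3.7862, 3.6069, 3.2741, 3.3856,
    3.4381, 3.0055, 3.3422, 2.8283, 2.7086, 3.0502, 2.5437, 3.0537,
    2.4372, 2.2441, 2.6408, 2.7726, 2.1115, 2.2647, 2.1425, 1.8952,
    2.4242, 1.8800, 2.2503, 1.9393, 2.2046, 1.9638, 2.2657, 1.9620,
    2.3368, 2.1233, 2.4886, 2.5730, 3.7259,
]

# Epoch with the lowest validation loss, and the last epoch in the table.
best_epoch = min(range(len(val_loss)), key=lambda epoch: val_loss[epoch])
final_epoch = len(val_loss) - 1

# Final-row components: end-logits loss and start-logits loss.
final_end_loss, final_start_loss = 1.8885, 1.8374
components_sum = final_end_loss + final_start_loss

print(f"best epoch: {best_epoch} (val loss {val_loss[best_epoch]:.4f})")
print(f"final epoch: {final_epoch} (val loss {val_loss[final_epoch]:.4f})")
print(f"end + start at final epoch: {components_sum:.4f}")
```

In this table the lowest validation loss falls at epoch 25, while the reported final-epoch value is roughly twice as large. Whether the uploaded weights correspond to the final epoch or to an earlier checkpoint is not stated in the card.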
+### Framework versions
+- Transformers 4.36.2
+- TensorFlow 2.14.0
+- Datasets 2.16.1
+- Tokenizers 0.15.0

config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "_name_or_path": "deepset/roberta-large-squad2",
+  "architectures": [
+    "RobertaForQuestionAnswering"
+  ],
+  "attention_probs_dropout_prob": 0.4,
+  "bos_token_id": 0,
+  "classifier_dropout": null,
+  "eos_token_id": 2,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.4,
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "language": "english",
+  "layer_norm_eps": 1e-15,
+  "max_position_embeddings": 514,
+  "model_type": "roberta",
+  "name": "Roberta",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 1,
+  "position_embedding_type": "absolute",
+  "transformers_version": "4.36.2",
+  "type_vocab_size": 1,
+  "use_cache": true,
+  "vocab_size": 50265
+}
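A few values in this config are easier to interpret with some arithmetic. The sketch below parses a fragment of the JSON above and derives the per-head dimension and the usable sequence length. It assumes RoBERTa's convention of offsetting position ids by `pad_token_id + 1`. It also flags fields that differ from the `RobertaConfig` library defaults (0.1 dropout, 1e-12 layer-norm epsilon); those defaults are not necessarily the values used by the base checkpoint, which this diff does not show.

```python
import json

# Fragment of the config.json shown above (only the fields used here).
config = json.loads("""
{
  "attention_probs_dropout_prob": 0.4,
  "hidden_dropout_prob": 0.4,
  "hidden_size": 1024,
  "layer_norm_eps": 1e-15,
  "max_position_embeddings": 514,
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1
}
""")

# Each attention head works on an equal slice of the hidden state.
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Assumption: position ids start at pad_token_id + 1, so that many slots are reserved.
usable_positions = config["max_position_embeddings"] - (config["pad_token_id"] + 1)

# RobertaConfig library defaults, used only as a comparison baseline.
library_defaults = {
    "attention_probs_dropout_prob": 0.1,
    "hidden_dropout_prob": 0.1,
    "layer_norm_eps": 1e-12,
}
non_default = {k: config[k] for k, v in library_defaults.items() if config[k] != v}

print("head_dim:", head_dim)
print("usable positions:", usable_positions)
print("differs from library defaults:", non_default)
```

The dropout probabilities of 0.4 are notably higher than the library default, which may be relevant when interpreting the training curves, although the card gives no rationale for the choice.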

tf_model.h5 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:883577b86090567cfc2ef514cc658deaabf6de26e723a90bef9c9dc72d6eb63c
+size 1417799680
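The three lines above are a Git LFS pointer rather than the weights themselves: they record the spec version, a SHA-256 digest of the real file, and its size in bytes. As a rough sketch, the pointer can be parsed into key/value pairs, and the size can be turned into an upper bound on the parameter count, assuming 4 bytes per float32 weight (the card lists `training_precision: float32`) and ignoring HDF5 container overhead.

```python
# Git LFS pointer text as shown above (without the diff "+" markers).
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:883577b86090567cfc2ef514cc658deaabf6de26e723a90bef9c9dc72d6eb63c
size 1417799680
"""

# Each pointer line is "<key> <value>"; split on the first space only.
fields = dict(line.split(" ", 1) for line in pointer_text.splitlines() if line)

hash_algo, digest = fields["oid"].split(":", 1)
size_bytes = int(fields["size"])

# Assumption: float32 weights (4 bytes each); file overhead makes this an upper bound.
approx_params = size_bytes // 4

print("hash algorithm:", hash_algo)
print("size in GiB:", round(size_bytes / 2**30, 2))
print("approx. parameters (upper bound):", approx_params)
```

The estimate comes out at roughly 354 million parameters.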