---
base_model: Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2
datasets:
- Omartificial-Intelligence-Space/Arabic-stsb
- Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class
language:
- ar
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
pipeline_tag: sentence-similarity
tags:
- mteb
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:947818
- loss:SoftmaxLoss
- loss:CosineSimilarityLoss
- transformers
model-index:
- name: Omartificial-Intelligence-Space/GATE-AraBert-v1
results:
- dataset:
config: ar
name: MTEB MIRACLRetrievalHardNegatives (ar)
revision: 95c8db7d4a6e9c1d8a60601afd63d553ae20a2eb
split: dev
type: mteb/miracl-hard-negatives
metrics:
- type: ndcg_at_1
value: 48.699999999999996
- type: ndcg_at_3
value: 51.161
- type: ndcg_at_5
value: 53.923
- type: ndcg_at_10
value: 57.737
- type: ndcg_at_20
value: 60.475
- type: ndcg_at_100
value: 63.096
- type: ndcg_at_1000
value: 64.203
- type: map_at_1
value: 32.108
- type: map_at_3
value: 44.405
- type: map_at_5
value: 47.164
- type: map_at_10
value: 49.477
- type: map_at_20
value: 50.583999999999996
- type: map_at_100
value: 51.212999999999994
- type: map_at_1000
value: 51.281
- type: recall_at_1
value: 32.108
- type: recall_at_3
value: 52.675000000000004
- type: recall_at_5
value: 60.709
- type: recall_at_10
value: 70.61
- type: recall_at_20
value: 79.208
- type: recall_at_100
value: 89.805
- type: recall_at_1000
value: 96.935
- type: precision_at_1
value: 48.699999999999996
- type: precision_at_3
value: 29.7
- type: precision_at_5
value: 21.34
- type: precision_at_10
value: 12.98
- type: precision_at_20
value: 7.485
- type: precision_at_100
value: 1.772
- type: precision_at_1000
value: 0.193
- type: mrr_at_1
value: 48.699999999999996
- type: mrr_at_3
value: 57.5333
- type: mrr_at_5
value: 59.1333
- type: mrr_at_10
value: 60.1163
- type: mrr_at_20
value: 60.5298
- type: mrr_at_100
value: 60.691700000000004
- type: mrr_at_1000
value: 60.707699999999996
- type: nauc_ndcg_at_1_max
value: 28.267999999999997
- type: nauc_ndcg_at_1_std
value: -14.380299999999998
- type: nauc_ndcg_at_1_diff1
value: 46.6563
- type: nauc_ndcg_at_3_max
value: 30.7733
- type: nauc_ndcg_at_3_std
value: -18.2892
- type: nauc_ndcg_at_3_diff1
value: 42.4678
- type: nauc_ndcg_at_5_max
value: 31.014799999999997
- type: nauc_ndcg_at_5_std
value: -19.1886
- type: nauc_ndcg_at_5_diff1
value: 42.4011
- type: nauc_ndcg_at_10_max
value: 31.895400000000002
- type: nauc_ndcg_at_10_std
value: -17.1444
- type: nauc_ndcg_at_10_diff1
value: 41.7072
- type: nauc_ndcg_at_20_max
value: 33.373799999999996
- type: nauc_ndcg_at_20_std
value: -15.8958
- type: nauc_ndcg_at_20_diff1
value: 42.174800000000005
- type: nauc_ndcg_at_100_max
value: 33.3595
- type: nauc_ndcg_at_100_std
value: -13.684
- type: nauc_ndcg_at_100_diff1
value: 42.5765
- type: nauc_ndcg_at_1000_max
value: 33.306799999999996
- type: nauc_ndcg_at_1000_std
value: -13.309099999999999
- type: nauc_ndcg_at_1000_diff1
value: 42.7722
- type: nauc_map_at_1_max
value: 20.7316
- type: nauc_map_at_1_std
value: -19.7081
- type: nauc_map_at_1_diff1
value: 45.9673
- type: nauc_map_at_3_max
value: 27.112399999999997
- type: nauc_map_at_3_std
value: -21.3088
- type: nauc_map_at_3_diff1
value: 43.3808
- type: nauc_map_at_5_max
value: 28.2607
- type: nauc_map_at_5_std
value: -20.7577
- type: nauc_map_at_5_diff1
value: 42.7665
- type: nauc_map_at_10_max
value: 29.418899999999997
- type: nauc_map_at_10_std
value: -19.5259
- type: nauc_map_at_10_diff1
value: 42.277100000000004
- type: nauc_map_at_20_max
value: 30.0292
- type: nauc_map_at_20_std
value: -19.0805
- type: nauc_map_at_20_diff1
value: 42.5082
- type: nauc_map_at_100_max
value: 30.098599999999998
- type: nauc_map_at_100_std
value: -18.570600000000002
- type: nauc_map_at_100_diff1
value: 42.58
- type: nauc_map_at_1000_max
value: 30.0989
- type: nauc_map_at_1000_std
value: -18.543499999999998
- type: nauc_map_at_1000_diff1
value: 42.5925
- type: nauc_recall_at_1_max
value: 20.7316
- type: nauc_recall_at_1_std
value: -19.7081
- type: nauc_recall_at_1_diff1
value: 45.9673
- type: nauc_recall_at_3_max
value: 27.832600000000003
- type: nauc_recall_at_3_std
value: -20.2727
- type: nauc_recall_at_3_diff1
value: 37.7177
- type: nauc_recall_at_5_max
value: 29.0228
- type: nauc_recall_at_5_std
value: -21.0861
- type: nauc_recall_at_5_diff1
value: 35.5742
- type: nauc_recall_at_10_max
value: 29.876
- type: nauc_recall_at_10_std
value: -15.8439
- type: nauc_recall_at_10_diff1
value: 31.639499999999998
- type: nauc_recall_at_20_max
value: 36.5508
- type: nauc_recall_at_20_std
value: -10.3007
- type: nauc_recall_at_20_diff1
value: 31.490800000000004
- type: nauc_recall_at_100_max
value: 41.573
- type: nauc_recall_at_100_std
value: 10.3383
- type: nauc_recall_at_100_diff1
value: 32.034400000000005
- type: nauc_recall_at_1000_max
value: 61.4362
- type: nauc_recall_at_1000_std
value: 69.59440000000001
- type: nauc_recall_at_1000_diff1
value: 29.5155
- type: nauc_precision_at_1_max
value: 28.267999999999997
- type: nauc_precision_at_1_std
value: -14.380299999999998
- type: nauc_precision_at_1_diff1
value: 46.6563
- type: nauc_precision_at_3_max
value: 28.3698
- type: nauc_precision_at_3_std
value: -5.6800999999999995
- type: nauc_precision_at_3_diff1
value: 20.9306
- type: nauc_precision_at_5_max
value: 27.0061
- type: nauc_precision_at_5_std
value: -0.2876
- type: nauc_precision_at_5_diff1
value: 12.3067
- type: nauc_precision_at_10_max
value: 24.1662
- type: nauc_precision_at_10_std
value: 8.8292
- type: nauc_precision_at_10_diff1
value: 4.0011
- type: nauc_precision_at_20_max
value: 21.4451
- type: nauc_precision_at_20_std
value: 13.3651
- type: nauc_precision_at_20_diff1
value: -0.7437
- type: nauc_precision_at_100_max
value: 13.9599
- type: nauc_precision_at_100_std
value: 22.5858
- type: nauc_precision_at_100_diff1
value: -6.833
- type: nauc_precision_at_1000_max
value: 10.3447
- type: nauc_precision_at_1000_std
value: 25.1309
- type: nauc_precision_at_1000_diff1
value: -10.5
- type: nauc_mrr_at_1_max
value: 28.267999999999997
- type: nauc_mrr_at_1_std
value: -14.380299999999998
- type: nauc_mrr_at_1_diff1
value: 46.6563
- type: nauc_mrr_at_3_max
value: 33.0318
- type: nauc_mrr_at_3_std
value: -12.7141
- type: nauc_mrr_at_3_diff1
value: 44.562200000000004
- type: nauc_mrr_at_5_max
value: 33.292699999999996
- type: nauc_mrr_at_5_std
value: -13.118599999999999
- type: nauc_mrr_at_5_diff1
value: 44.8335
- type: nauc_mrr_at_10_max
value: 33.2046
- type: nauc_mrr_at_10_std
value: -12.4948
- type: nauc_mrr_at_10_diff1
value: 44.7418
- type: nauc_mrr_at_20_max
value: 33.2849
- type: nauc_mrr_at_20_std
value: -12.2826
- type: nauc_mrr_at_20_diff1
value: 44.7205
- type: nauc_mrr_at_100_max
value: 33.196799999999996
- type: nauc_mrr_at_100_std
value: -12.2722
- type: nauc_mrr_at_100_diff1
value: 44.804300000000005
- type: nauc_mrr_at_1000_max
value: 33.1847
- type: nauc_mrr_at_1000_std
value: -12.2764
- type: nauc_mrr_at_1000_diff1
value: 44.806400000000004
- type: main_score
value: 57.737
task:
type: Retrieval
- dataset:
config: ara-ara
name: MTEB MLQARetrieval (ara-ara)
revision: 397ed406c1a7902140303e7faf60fff35b58d285
split: validation
type: facebook/mlqa
metrics:
- type: ndcg_at_1
value: 47.776
- type: ndcg_at_3
value: 57.977999999999994
- type: ndcg_at_5
value: 60.275999999999996
- type: ndcg_at_10
value: 62.580000000000005
- type: ndcg_at_20
value: 63.857
- type: ndcg_at_100
value: 66.053
- type: ndcg_at_1000
value: 66.77
- type: map_at_1
value: 47.776
- type: map_at_3
value: 55.545
- type: map_at_5
value: 56.812
- type: map_at_10
value: 57.75599999999999
- type: map_at_20
value: 58.10999999999999
- type: map_at_100
value: 58.404999999999994
- type: map_at_1000
value: 58.440000000000005
- type: recall_at_1
value: 47.776
- type: recall_at_3
value: 64.99000000000001
- type: recall_at_5
value: 70.6
- type: recall_at_10
value: 77.756
- type: recall_at_20
value: 82.785
- type: recall_at_100
value: 94.77799999999999
- type: recall_at_1000
value: 100.0
- type: precision_at_1
value: 47.776
- type: precision_at_3
value: 21.663
- type: precision_at_5
value: 14.12
- type: precision_at_10
value: 7.776
- type: precision_at_20
value: 4.139
- type: precision_at_100
value: 0.9480000000000001
- type: precision_at_1000
value: 0.1
- type: mrr_at_1
value: 47.775600000000004
- type: mrr_at_3
value: 55.5448
- type: mrr_at_5
value: 56.8117
- type: mrr_at_10
value: 57.7562
- type: mrr_at_20
value: 58.10999999999999
- type: mrr_at_100
value: 58.404900000000005
- type: mrr_at_1000
value: 58.4397
- type: nauc_ndcg_at_1_max
value: 68.0477
- type: nauc_ndcg_at_1_std
value: 19.1911
- type: nauc_ndcg_at_1_diff1
value: 73.80640000000001
- type: nauc_ndcg_at_3_max
value: 70.82159999999999
- type: nauc_ndcg_at_3_std
value: 19.7998
- type: nauc_ndcg_at_3_diff1
value: 68.7149
- type: nauc_ndcg_at_5_max
value: 70.812
- type: nauc_ndcg_at_5_std
value: 19.3612
- type: nauc_ndcg_at_5_diff1
value: 67.281
- type: nauc_ndcg_at_10_max
value: 71.02380000000001
- type: nauc_ndcg_at_10_std
value: 20.4318
- type: nauc_ndcg_at_10_diff1
value: 67.0146
- type: nauc_ndcg_at_20_max
value: 70.2041
- type: nauc_ndcg_at_20_std
value: 19.5577
- type: nauc_ndcg_at_20_diff1
value: 66.4613
- type: nauc_ndcg_at_100_max
value: 70.2647
- type: nauc_ndcg_at_100_std
value: 20.4363
- type: nauc_ndcg_at_100_diff1
value: 67.5021
- type: nauc_ndcg_at_1000_max
value: 70.1325
- type: nauc_ndcg_at_1000_std
value: 19.697
- type: nauc_ndcg_at_1000_diff1
value: 68.05290000000001
- type: nauc_map_at_1_max
value: 68.0477
- type: nauc_map_at_1_std
value: 19.1911
- type: nauc_map_at_1_diff1
value: 73.80640000000001
- type: nauc_map_at_3_max
value: 70.0727
- type: nauc_map_at_3_std
value: 19.4799
- type: nauc_map_at_3_diff1
value: 69.9485
- type: nauc_map_at_5_max
value: 70.015
- type: nauc_map_at_5_std
value: 19.2605
- type: nauc_map_at_5_diff1
value: 69.199
- type: nauc_map_at_10_max
value: 70.0897
- type: nauc_map_at_10_std
value: 19.6128
- type: nauc_map_at_10_diff1
value: 69.1356
- type: nauc_map_at_20_max
value: 69.8725
- type: nauc_map_at_20_std
value: 19.3681
- type: nauc_map_at_20_diff1
value: 69.0124
- type: nauc_map_at_100_max
value: 69.8656
- type: nauc_map_at_100_std
value: 19.4493
- type: nauc_map_at_100_diff1
value: 69.1296
- type: nauc_map_at_1000_max
value: 69.8608
- type: nauc_map_at_1000_std
value: 19.4191
- type: nauc_map_at_1000_diff1
value: 69.1547
- type: nauc_recall_at_1_max
value: 68.0477
- type: nauc_recall_at_1_std
value: 19.1911
- type: nauc_recall_at_1_diff1
value: 73.80640000000001
- type: nauc_recall_at_3_max
value: 73.2764
- type: nauc_recall_at_3_std
value: 20.9191
- type: nauc_recall_at_3_diff1
value: 64.7349
- type: nauc_recall_at_5_max
value: 73.7417
- type: nauc_recall_at_5_std
value: 19.6995
- type: nauc_recall_at_5_diff1
value: 60.2181
- type: nauc_recall_at_10_max
value: 75.31
- type: nauc_recall_at_10_std
value: 24.833199999999998
- type: nauc_recall_at_10_diff1
value: 57.29729999999999
- type: nauc_recall_at_20_max
value: 70.9915
- type: nauc_recall_at_20_std
value: 20.3983
- type: nauc_recall_at_20_diff1
value: 51.2804
- type: nauc_recall_at_100_max
value: 75.0448
- type: nauc_recall_at_100_std
value: 46.0233
- type: nauc_recall_at_100_diff1
value: 48.8265
- type: nauc_precision_at_1_max
value: 68.0477
- type: nauc_precision_at_1_std
value: 19.1911
- type: nauc_precision_at_1_diff1
value: 73.80640000000001
- type: nauc_precision_at_3_max
value: 73.2764
- type: nauc_precision_at_3_std
value: 20.9191
- type: nauc_precision_at_3_diff1
value: 64.7349
- type: nauc_precision_at_5_max
value: 73.7417
- type: nauc_precision_at_5_std
value: 19.6995
- type: nauc_precision_at_5_diff1
value: 60.2181
- type: nauc_precision_at_10_max
value: 75.31
- type: nauc_precision_at_10_std
value: 24.833199999999998
- type: nauc_precision_at_10_diff1
value: 57.29729999999999
- type: nauc_precision_at_20_max
value: 70.9915
- type: nauc_precision_at_20_std
value: 20.3983
- type: nauc_precision_at_20_diff1
value: 51.2804
- type: nauc_precision_at_100_max
value: 75.0448
- type: nauc_precision_at_100_std
value: 46.0233
- type: nauc_precision_at_100_diff1
value: 48.8265
- type: nauc_precision_at_1000_max
value: 100.0
- type: nauc_precision_at_1000_std
value: 100.0
- type: nauc_precision_at_1000_diff1
value: 100.0
- type: nauc_mrr_at_1_max
value: 68.0477
- type: nauc_mrr_at_1_std
value: 19.1911
- type: nauc_mrr_at_1_diff1
value: 73.80640000000001
- type: nauc_mrr_at_3_max
value: 70.0727
- type: nauc_mrr_at_3_std
value: 19.4799
- type: nauc_mrr_at_3_diff1
value: 69.9485
- type: nauc_mrr_at_5_max
value: 70.015
- type: nauc_mrr_at_5_std
value: 19.2605
- type: nauc_mrr_at_5_diff1
value: 69.199
- type: nauc_mrr_at_10_max
value: 70.0897
- type: nauc_mrr_at_10_std
value: 19.6128
- type: nauc_mrr_at_10_diff1
value: 69.1356
- type: nauc_mrr_at_20_max
value: 69.8725
- type: nauc_mrr_at_20_std
value: 19.3681
- type: nauc_mrr_at_20_diff1
value: 69.0124
- type: nauc_mrr_at_100_max
value: 69.8656
- type: nauc_mrr_at_100_std
value: 19.4493
- type: nauc_mrr_at_100_diff1
value: 69.1296
- type: nauc_mrr_at_1000_max
value: 69.8608
- type: nauc_mrr_at_1000_std
value: 19.4191
- type: nauc_mrr_at_1000_diff1
value: 69.1547
- type: main_score
value: 62.580000000000005
task:
type: Retrieval
- dataset:
config: ara-ara
name: MTEB MLQARetrieval (ara-ara)
revision: 397ed406c1a7902140303e7faf60fff35b58d285
split: test
type: facebook/mlqa
metrics:
- type: ndcg_at_1
value: 37.409
- type: ndcg_at_3
value: 44.269
- type: ndcg_at_5
value: 46.23
- type: ndcg_at_10
value: 48.076
- type: ndcg_at_20
value: 49.679
- type: ndcg_at_100
value: 52.037
- type: ndcg_at_1000
value: 53.958
- type: map_at_1
value: 37.399
- type: map_at_3
value: 42.577999999999996
- type: map_at_5
value: 43.661
- type: map_at_10
value: 44.42
- type: map_at_20
value: 44.861000000000004
- type: map_at_100
value: 45.179
- type: map_at_1000
value: 45.242
- type: recall_at_1
value: 37.399
- type: recall_at_3
value: 49.156
- type: recall_at_5
value: 53.937999999999995
- type: recall_at_10
value: 59.657000000000004
- type: recall_at_20
value: 65.995
- type: recall_at_100
value: 78.821
- type: recall_at_1000
value: 94.45
- type: precision_at_1
value: 37.409
- type: precision_at_3
value: 16.389
- type: precision_at_5
value: 10.789
- type: precision_at_10
value: 5.9670000000000005
- type: precision_at_20
value: 3.3000000000000003
- type: precision_at_100
value: 0.788
- type: precision_at_1000
value: 0.094
- type: mrr_at_1
value: 37.4086
- type: mrr_at_3
value: 42.587
- type: mrr_at_5
value: 43.6699
- type: mrr_at_10
value: 44.4297
- type: mrr_at_20
value: 44.8704
- type: mrr_at_100
value: 45.1881
- type: mrr_at_1000
value: 45.251000000000005
- type: nauc_ndcg_at_1_max
value: 61.8437
- type: nauc_ndcg_at_1_std
value: 10.782
- type: nauc_ndcg_at_1_diff1
value: 66.1842
- type: nauc_ndcg_at_3_max
value: 63.157399999999996
- type: nauc_ndcg_at_3_std
value: 13.114899999999999
- type: nauc_ndcg_at_3_diff1
value: 60.312
- type: nauc_ndcg_at_5_max
value: 63.027100000000004
- type: nauc_ndcg_at_5_std
value: 13.995099999999999
- type: nauc_ndcg_at_5_diff1
value: 59.272499999999994
- type: nauc_ndcg_at_10_max
value: 63.0273
- type: nauc_ndcg_at_10_std
value: 14.898700000000002
- type: nauc_ndcg_at_10_diff1
value: 58.2739
- type: nauc_ndcg_at_20_max
value: 62.785199999999996
- type: nauc_ndcg_at_20_std
value: 15.259800000000002
- type: nauc_ndcg_at_20_diff1
value: 57.8913
- type: nauc_ndcg_at_100_max
value: 62.641999999999996
- type: nauc_ndcg_at_100_std
value: 15.738299999999999
- type: nauc_ndcg_at_100_diff1
value: 58.2303
- type: nauc_ndcg_at_1000_max
value: 62.7624
- type: nauc_ndcg_at_1000_std
value: 15.1653
- type: nauc_ndcg_at_1000_diff1
value: 58.9359
- type: nauc_map_at_1_max
value: 61.800900000000006
- type: nauc_map_at_1_std
value: 10.7369
- type: nauc_map_at_1_diff1
value: 66.18270000000001
- type: nauc_map_at_3_max
value: 62.8757
- type: nauc_map_at_3_std
value: 12.5061
- type: nauc_map_at_3_diff1
value: 61.767
- type: nauc_map_at_5_max
value: 62.793299999999995
- type: nauc_map_at_5_std
value: 12.964500000000001
- type: nauc_map_at_5_diff1
value: 61.211000000000006
- type: nauc_map_at_10_max
value: 62.8054
- type: nauc_map_at_10_std
value: 13.328000000000001
- type: nauc_map_at_10_diff1
value: 60.833400000000005
- type: nauc_map_at_20_max
value: 62.734199999999994
- type: nauc_map_at_20_std
value: 13.4114
- type: nauc_map_at_20_diff1
value: 60.747099999999996
- type: nauc_map_at_100_max
value: 62.7054
- type: nauc_map_at_100_std
value: 13.4556
- type: nauc_map_at_100_diff1
value: 60.79259999999999
- type: nauc_map_at_1000_max
value: 62.71099999999999
- type: nauc_map_at_1000_std
value: 13.444400000000002
- type: nauc_map_at_1000_diff1
value: 60.815
- type: nauc_recall_at_1_max
value: 61.800900000000006
- type: nauc_recall_at_1_std
value: 10.7369
- type: nauc_recall_at_1_diff1
value: 66.18270000000001
- type: nauc_recall_at_3_max
value: 63.914300000000004
- type: nauc_recall_at_3_std
value: 14.8614
- type: nauc_recall_at_3_diff1
value: 56.044700000000006
- type: nauc_recall_at_5_max
value: 63.6523
- type: nauc_recall_at_5_std
value: 17.2352
- type: nauc_recall_at_5_diff1
value: 53.2316
- type: nauc_recall_at_10_max
value: 63.6138
- type: nauc_recall_at_10_std
value: 20.4315
- type: nauc_recall_at_10_diff1
value: 49.4388
- type: nauc_recall_at_20_max
value: 62.605
- type: nauc_recall_at_20_std
value: 22.8045
- type: nauc_recall_at_20_diff1
value: 46.5945
- type: nauc_recall_at_100_max
value: 61.5178
- type: nauc_recall_at_100_std
value: 30.4825
- type: nauc_recall_at_100_diff1
value: 44.9405
- type: nauc_recall_at_1000_max
value: 63.473
- type: nauc_recall_at_1000_std
value: 39.1421
- type: nauc_recall_at_1000_diff1
value: 43.4873
- type: nauc_precision_at_1_max
value: 61.8437
- type: nauc_precision_at_1_std
value: 10.782
- type: nauc_precision_at_1_diff1
value: 66.1842
- type: nauc_precision_at_3_max
value: 63.962799999999994
- type: nauc_precision_at_3_std
value: 14.908299999999999
- type: nauc_precision_at_3_diff1
value: 56.0511
- type: nauc_precision_at_5_max
value: 63.7072
- type: nauc_precision_at_5_std
value: 17.2854
- type: nauc_precision_at_5_diff1
value: 53.2417
- type: nauc_precision_at_10_max
value: 63.672200000000004
- type: nauc_precision_at_10_std
value: 20.485300000000002
- type: nauc_precision_at_10_diff1
value: 49.4491
- type: nauc_precision_at_20_max
value: 62.674600000000005
- type: nauc_precision_at_20_std
value: 22.8667
- type: nauc_precision_at_20_diff1
value: 46.6088
- type: nauc_precision_at_100_max
value: 61.622600000000006
- type: nauc_precision_at_100_std
value: 30.5766
- type: nauc_precision_at_100_diff1
value: 44.9643
- type: nauc_precision_at_1000_max
value: 63.131400000000006
- type: nauc_precision_at_1000_std
value: 39.6527
- type: nauc_precision_at_1000_diff1
value: 42.9196
- type: nauc_mrr_at_1_max
value: 61.8437
- type: nauc_mrr_at_1_std
value: 10.782
- type: nauc_mrr_at_1_diff1
value: 66.1842
- type: nauc_mrr_at_3_max
value: 62.9188
- type: nauc_mrr_at_3_std
value: 12.5514
- type: nauc_mrr_at_3_diff1
value: 61.768699999999995
- type: nauc_mrr_at_5_max
value: 62.836800000000004
- type: nauc_mrr_at_5_std
value: 13.0102
- type: nauc_mrr_at_5_diff1
value: 61.2128
- type: nauc_mrr_at_10_max
value: 62.8492
- type: nauc_mrr_at_10_std
value: 13.3741
- type: nauc_mrr_at_10_diff1
value: 60.8352
- type: nauc_mrr_at_20_max
value: 62.7783
- type: nauc_mrr_at_20_std
value: 13.4578
- type: nauc_mrr_at_20_diff1
value: 60.74889999999999
- type: nauc_mrr_at_100_max
value: 62.7497
- type: nauc_mrr_at_100_std
value: 13.5022
- type: nauc_mrr_at_100_diff1
value: 60.7944
- type: nauc_mrr_at_1000_max
value: 62.7546
- type: nauc_mrr_at_1000_std
value: 13.490499999999999
- type: nauc_mrr_at_1000_diff1
value: 60.8168
- type: main_score
value: 48.076
task:
type: Retrieval
- dataset:
config: ar
name: MTEB MintakaRetrieval (ar)
revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e
split: test
type: jinaai/mintakaqa
metrics:
- type: ndcg_at_1
value: 11.212
- type: ndcg_at_3
value: 16.08
- type: ndcg_at_5
value: 17.543
- type: ndcg_at_10
value: 19.13
- type: map_at_1
value: 11.212
- type: map_at_3
value: 14.904
- type: map_at_5
value: 15.719
- type: map_at_10
value: 16.375
- type: recall_at_1
value: 11.212
- type: recall_at_3
value: 19.473
- type: recall_at_5
value: 23.014000000000003
- type: recall_at_10
value: 27.916
- type: precision_at_1
value: 11.212
- type: precision_at_3
value: 6.491
- type: precision_at_5
value: 4.603
- type: precision_at_10
value: 2.792
- type: mrr_at_1
value: 11.212
- type: mrr_at_3
value: 14.9039
- type: mrr_at_5
value: 15.7187
- type: mrr_at_10
value: 16.3746
- type: main_score
value: 19.13
task:
type: Retrieval
- dataset:
config: default
name: MTEB SadeemQuestionRetrieval (default)
revision: 3cb0752b182e5d5d740df547748b06663c8e0bd9
split: test
type: sadeem-ai/sadeem-ar-eval-retrieval-questions
metrics:
- type: ndcg_at_1
value: 28.674
- type: ndcg_at_3
value: 60.604
- type: ndcg_at_5
value: 62.092000000000006
- type: ndcg_at_10
value: 63.154999999999994
- type: ndcg_at_20
value: 63.602000000000004
- type: ndcg_at_100
value: 64.242
- type: ndcg_at_1000
value: 64.50399999999999
- type: map_at_1
value: 28.674
- type: map_at_3
value: 52.21
- type: map_at_5
value: 53.052
- type: map_at_10
value: 53.498000000000005
- type: map_at_20
value: 53.620000000000005
- type: map_at_100
value: 53.715999999999994
- type: map_at_1000
value: 53.726
- type: recall_at_1
value: 28.674
- type: recall_at_3
value: 85.112
- type: recall_at_5
value: 88.655
- type: recall_at_10
value: 91.91
- type: recall_at_20
value: 93.681
- type: recall_at_100
value: 97.032
- type: recall_at_1000
value: 99.09
- type: precision_at_1
value: 28.674
- type: precision_at_3
value: 28.371000000000002
- type: precision_at_5
value: 17.730999999999998
- type: precision_at_10
value: 9.191
- type: precision_at_20
value: 4.684
- type: precision_at_100
value: 0.97
- type: precision_at_1000
value: 0.099
- type: mrr_at_1
value: 25.371
- type: mrr_at_3
value: 50.2314
- type: mrr_at_5
value: 51.0212
- type: mrr_at_10
value: 51.481100000000005
- type: mrr_at_20
value: 51.6128
- type: mrr_at_100
value: 51.7119
- type: mrr_at_1000
value: 51.722100000000005
- type: nauc_ndcg_at_1_max
value: 15.7219
- type: nauc_ndcg_at_1_std
value: 2.2991
- type: nauc_ndcg_at_1_diff1
value: 5.2984
- type: nauc_ndcg_at_3_max
value: 37.9971
- type: nauc_ndcg_at_3_std
value: 11.0045
- type: nauc_ndcg_at_3_diff1
value: -38.8501
- type: nauc_ndcg_at_5_max
value: 35.6057
- type: nauc_ndcg_at_5_std
value: 10.8947
- type: nauc_ndcg_at_5_diff1
value: -33.353500000000004
- type: nauc_ndcg_at_10_max
value: 33.5856
- type: nauc_ndcg_at_10_std
value: 11.5392
- type: nauc_ndcg_at_10_diff1
value: -29.2831
- type: nauc_ndcg_at_20_max
value: 32.4619
- type: nauc_ndcg_at_20_std
value: 11.145900000000001
- type: nauc_ndcg_at_20_diff1
value: -26.8202
- type: nauc_ndcg_at_100_max
value: 30.888199999999998
- type: nauc_ndcg_at_100_std
value: 10.2467
- type: nauc_ndcg_at_100_diff1
value: -23.5621
- type: nauc_ndcg_at_1000_max
value: 30.173699999999997
- type: nauc_ndcg_at_1000_std
value: 9.7867
- type: nauc_ndcg_at_1000_diff1
value: -22.1022
- type: nauc_map_at_1_max
value: 15.7219
- type: nauc_map_at_1_std
value: 2.2991
- type: nauc_map_at_1_diff1
value: 5.2984
- type: nauc_map_at_3_max
value: 30.0249
- type: nauc_map_at_3_std
value: 8.110199999999999
- type: nauc_map_at_3_diff1
value: -22.6437
- type: nauc_map_at_5_max
value: 28.654600000000002
- type: nauc_map_at_5_std
value: 7.9832
- type: nauc_map_at_5_diff1
value: -19.4967
- type: nauc_map_at_10_max
value: 27.825400000000002
- type: nauc_map_at_10_std
value: 8.1387
- type: nauc_map_at_10_diff1
value: -17.8584
- type: nauc_map_at_20_max
value: 27.5484
- type: nauc_map_at_20_std
value: 8.0305
- type: nauc_map_at_20_diff1
value: -17.2515
- type: nauc_map_at_100_max
value: 27.3449
- type: nauc_map_at_100_std
value: 7.919099999999999
- type: nauc_map_at_100_diff1
value: -16.8205
- type: nauc_map_at_1000_max
value: 27.3212
- type: nauc_map_at_1000_std
value: 7.9052999999999995
- type: nauc_map_at_1000_diff1
value: -16.7727
- type: nauc_recall_at_1_max
value: 15.7219
- type: nauc_recall_at_1_std
value: 2.2991
- type: nauc_recall_at_1_diff1
value: 5.2984
- type: nauc_recall_at_3_max
value: 82.6956
- type: nauc_recall_at_3_std
value: 27.086700000000004
- type: nauc_recall_at_3_diff1
value: -129.9841
- type: nauc_recall_at_5_max
value: 83.4035
- type: nauc_recall_at_5_std
value: 30.9258
- type: nauc_recall_at_5_diff1
value: -128.9633
- type: nauc_recall_at_10_max
value: 85.344
- type: nauc_recall_at_10_std
value: 44.183699999999995
- type: nauc_recall_at_10_diff1
value: -132.1167
- type: nauc_recall_at_20_max
value: 84.9071
- type: nauc_recall_at_20_std
value: 48.2337
- type: nauc_recall_at_20_diff1
value: -128.3509
- type: nauc_recall_at_100_max
value: 86.99470000000001
- type: nauc_recall_at_100_std
value: 56.7091
- type: nauc_recall_at_100_diff1
value: -126.3125
- type: nauc_recall_at_1000_max
value: 90.02329999999999
- type: nauc_recall_at_1000_std
value: 79.7439
- type: nauc_recall_at_1000_diff1
value: -106.8251
- type: nauc_precision_at_1_max
value: 15.7219
- type: nauc_precision_at_1_std
value: 2.2991
- type: nauc_precision_at_1_diff1
value: 5.2984
- type: nauc_precision_at_3_max
value: 82.6956
- type: nauc_precision_at_3_std
value: 27.086700000000004
- type: nauc_precision_at_3_diff1
value: -129.9841
- type: nauc_precision_at_5_max
value: 83.4035
- type: nauc_precision_at_5_std
value: 30.9258
- type: nauc_precision_at_5_diff1
value: -128.9633
- type: nauc_precision_at_10_max
value: 85.344
- type: nauc_precision_at_10_std
value: 44.183699999999995
- type: nauc_precision_at_10_diff1
value: -132.1167
- type: nauc_precision_at_20_max
value: 84.9071
- type: nauc_precision_at_20_std
value: 48.2337
- type: nauc_precision_at_20_diff1
value: -128.3509
- type: nauc_precision_at_100_max
value: 86.99470000000001
- type: nauc_precision_at_100_std
value: 56.7091
- type: nauc_precision_at_100_diff1
value: -126.3125
- type: nauc_precision_at_1000_max
value: 90.02329999999999
- type: nauc_precision_at_1000_std
value: 79.7439
- type: nauc_precision_at_1000_diff1
value: -106.8251
- type: nauc_mrr_at_1_max
value: 10.6156
- type: nauc_mrr_at_1_std
value: 1.6084
- type: nauc_mrr_at_1_diff1
value: -23.3205
- type: nauc_mrr_at_3_max
value: 26.458599999999997
- type: nauc_mrr_at_3_std
value: 7.3822
- type: nauc_mrr_at_3_diff1
value: -46.1904
- type: nauc_mrr_at_5_max
value: 25.0415
- type: nauc_mrr_at_5_std
value: 7.0684
- type: nauc_mrr_at_5_diff1
value: -43.5901
- type: nauc_mrr_at_10_max
value: 24.1152
- type: nauc_mrr_at_10_std
value: 7.4148000000000005
- type: nauc_mrr_at_10_diff1
value: -42.271300000000004
- type: nauc_mrr_at_20_max
value: 23.7882
- type: nauc_mrr_at_20_std
value: 7.2852
- type: nauc_mrr_at_20_diff1
value: -41.7586
- type: nauc_mrr_at_100_max
value: 23.552999999999997
- type: nauc_mrr_at_100_std
value: 7.1595
- type: nauc_mrr_at_100_diff1
value: -41.3884
- type: nauc_mrr_at_1000_max
value: 23.5271
- type: nauc_mrr_at_1000_std
value: 7.1449
- type: nauc_mrr_at_1000_diff1
value: -41.3491
- type: main_score
value: 63.154999999999994
task:
type: Retrieval
- dataset:
config: ar-ar
name: MTEB STS17 (ar-ar)
revision: faeb762787bd10488a50c8b5be4a3b82e411949c
split: test
type: mteb/sts17-crosslingual-sts
metrics:
- type: cosine_pearson
value: 82.06597171670848
- type: cosine_spearman
value: 82.7809395809498
- type: euclidean_pearson
value: 79.23996991139896
- type: euclidean_spearman
value: 81.5287595404711
- type: main_score
value: 82.7809395809498
- type: manhattan_pearson
value: 78.95407006608013
- type: manhattan_spearman
value: 81.15109493737467
task:
type: STS
- dataset:
config: ar
name: MTEB STS22.v2 (ar)
revision: d31f33a128469b20e357535c39b82fb3c3f6f2bd
split: test
type: mteb/sts22-crosslingual-sts
metrics:
- type: cosine_pearson
value: 54.912880452465004
- type: cosine_spearman
value: 63.09788380910325
- type: euclidean_pearson
value: 57.92665617677832
- type: euclidean_spearman
value: 62.76032598469037
- type: main_score
value: 63.09788380910325
- type: manhattan_pearson
value: 58.0736648155273
- type: manhattan_spearman
value: 62.94190582776664
task:
type: STS
- dataset:
config: ar
name: MTEB STS22 (ar)
revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3
split: test
type: mteb/sts22-crosslingual-sts
metrics:
- type: cosine_pearson
value: 51.72534929358701
- type: cosine_spearman
value: 59.75149627160101
- type: euclidean_pearson
value: 53.894835373598774
- type: euclidean_spearman
value: 59.44278354697161
- type: main_score
value: 59.75149627160101
- type: manhattan_pearson
value: 54.076675975406985
- type: manhattan_spearman
value: 59.610061143235725
task:
type: STS
widget:
- source_sentence: امرأة تكتب شيئاً
sentences:
- مراهق يتحدث إلى فتاة عبر كاميرا الإنترنت
- امرأة تقطع البصل الأخضر.
- مجموعة من كبار السن يتظاهرون حول طاولة الطعام.
- source_sentence: تتشكل النجوم في مناطق تكوين النجوم، والتي تنشأ نفسها من السحب الجزيئية.
sentences:
- لاعب كرة السلة على وشك تسجيل نقاط لفريقه.
- المقال التالي مأخوذ من نسختي من "أطلس البطريق الجديد للتاريخ الوسطى"
- قد يكون من الممكن أن يوجد نظام شمسي مثل نظامنا خارج المجرة
- source_sentence: >-
تحت السماء الزرقاء مع الغيوم البيضاء، يصل طفل لمس مروحة طائرة واقفة على حقل
من العشب.
sentences:
- امرأة تحمل كأساً
- طفل يحاول لمس مروحة طائرة
- اثنان من عازبين عن الشرب يستعدون للعشاء
- source_sentence: رجل في منتصف العمر يحلق لحيته في غرفة ذات جدران بيضاء والتي لا تبدو كحمام
sentences:
- فتى يخطط اسمه على مكتبه
- رجل ينام
- المرأة وحدها وهي نائمة في غرفة نومها
- source_sentence: الكلب البني مستلقي على جانبه على سجادة بيج، مع جسم أخضر في المقدمة.
sentences:
- شخص طويل القامة
- المرأة تنظر من النافذة.
- لقد مات الكلب
license: apache-2.0
---
# GATE-AraBert-V1
This is **GATE | General Arabic Text Embedding**, a sentence-embedding model trained with SentenceTransformers in a **multi-task** setup on two datasets: **AllNLI** and **STS**.
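The card's metadata tags name two training losses, `SoftmaxLoss` (for the NLI pairs) and `CosineSimilarityLoss` (for the STS pairs). The training script itself is not shown here, so the following is only a minimal NumPy sketch of what those two objectives compute, under stated assumptions: the toy embeddings, the zero-initialised classifier `W`/`b`, and the function names are all hypothetical and stand in for the real encoder and trainer.

```python
import numpy as np

def cosine(u, v):
    """Row-wise cosine similarity between two (batch, dim) arrays."""
    return np.sum(u * v, axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))

def cosine_similarity_loss(u, v, gold):
    """STS-style objective: mean squared error between cosine(u, v) and gold scores."""
    return float(np.mean((cosine(u, v) - gold) ** 2))

def softmax_loss(u, v, labels, W, b):
    """NLI-style objective: cross-entropy of a linear classifier over [u, v, |u - v|]."""
    features = np.concatenate([u, v, np.abs(u - v)], axis=1)  # (batch, 3 * dim)
    logits = features @ W + b                                  # (batch, n_classes)
    logits = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

# Toy stand-ins for sentence embeddings (assumption: the real ones come from the encoder).
rng = np.random.default_rng(0)
dim, n_classes = 8, 3
u, v = rng.normal(size=(4, dim)), rng.normal(size=(4, dim))
W, b = np.zeros((3 * dim, n_classes)), np.zeros(n_classes)

sts_loss = cosine_similarity_loss(u, v, gold=np.array([0.9, 0.1, 0.5, 0.7]))
nli_loss = softmax_loss(u, v, labels=np.array([0, 1, 2, 1]), W=W, b=b)
print(f"STS loss: {sts_loss:.4f}  NLI loss: {nli_loss:.4f}")
```

In a multi-task run, batches from each dataset would be routed to their own loss while sharing one encoder. How the batches are interleaved is not stated in this card.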
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2](https://huggingface.co/Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
- [all-nli](https://huggingface.co/datasets/Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class)
- [sts](https://huggingface.co/datasets/Omartificial-Intelligence-Space/arabic-stsb)
- **Language:** ar
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Omartificial-Intelligence-Space/GATE-AraBert-v1")

# Run inference
sentences = [
    'الكلب البني مستلقي على جانبه على سجادة بيج، مع جسم أخضر في المقدمة.',
    'لقد مات الكلب',
    'شخص طويل القامة',
]
embeddings = model.encode(sentences)  # NumPy array, one row per sentence
print(embeddings.shape)
# (3, 768)

# Get the pairwise similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
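The model's similarity function is cosine similarity, so the pairwise score matrix is the dot product of L2-normalised embeddings. As a minimal sketch of that computation, the snippet below uses made-up 3-dimensional vectors in place of real 768-dimensional embeddings. The helper name `cosine_similarity_matrix` is hypothetical and not part of the library.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

# Toy vectors: rows 0 and 1 point the same way, row 2 is orthogonal to both.
toy = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
])
sims = cosine_similarity_matrix(toy, toy)
print(sims)
```

Because the vectors are normalised first, vector length does not affect the score: rows 0 and 1 score 1.0 against each other despite their different magnitudes.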
## Evaluation
### Metrics
#### Semantic Similarity
* Dataset: `sts-dev`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | Value |
|:--------------------|:----------|
| pearson_cosine | 0.8391 |
| **spearman_cosine** | **0.841** |
| pearson_manhattan | 0.8277 |
| spearman_manhattan | 0.8361 |
| pearson_euclidean | 0.8274 |
| spearman_euclidean | 0.8358 |
| pearson_dot | 0.8154 |
| spearman_dot | 0.818 |
| pearson_max | 0.8391 |
| spearman_max | 0.841 |
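To make the metric names concrete: `pearson_cosine` measures the linear correlation between the model's cosine scores and the gold similarity labels, while `spearman_cosine` measures only how well the two rankings agree. The sketch below uses invented toy scores, not values from the evaluation set. The helper functions are hypothetical, and the rank helper assumes no tied values.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(x):
    """Ranks 0..n-1 of the values in x (assumes no ties, for simplicity)."""
    order = np.argsort(x)
    r = np.empty(len(x))
    r[order] = np.arange(len(x))
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy data: gold labels on a 0-5 scale and made-up cosine scores.
gold = [0.0, 1.0, 2.5, 4.0, 5.0]
pred = [0.10, 0.20, 0.30, 0.60, 0.95]

print(f"pearson:  {pearson(gold, pred):.4f}")
print(f"spearman: {spearman(gold, pred):.4f}")
```

Here the predicted scores preserve the gold ordering exactly, so the Spearman value is 1 even though the relationship is not perfectly linear and the Pearson value is lower.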
#### Semantic Similarity
* Dataset: `sts-test`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| pearson_cosine | 0.813 |
| **spearman_cosine** | **0.8173** |
| pearson_manhattan | 0.8114 |
| spearman_manhattan | 0.8164 |
| pearson_euclidean | 0.8103 |
| spearman_euclidean | 0.8158 |
| pearson_dot | 0.7908 |
| spearman_dot | 0.7887 |
| pearson_max | 0.813 |
| spearman_max | 0.8173 |
## Acknowledgments
The author would like to thank Prince Sultan University for its invaluable support of this project. Its contributions and resources have been instrumental in the development and fine-tuning of these models.
## Citation
If you use GATE, please cite it as follows:
```bibtex
@misc{nacar2025GATE,
      title={GATE: General Arabic Text Embedding for Enhanced Semantic Textual Similarity with Hybrid Loss Training},
      author={Omer Nacar and Anis Koubaa and Serry Taiseer Sibaee and Lahouari Ghouti},
      year={2025},
      note={Submitted to COLING 2025},
      url={https://huggingface.co/Omartificial-Intelligence-Space/GATE-AraBert-v1},
}
```