--- base_model: Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2 datasets: - Omartificial-Intelligence-Space/Arabic-stsb - Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class language: - ar library_name: sentence-transformers metrics: - pearson_cosine - spearman_cosine - pearson_manhattan - spearman_manhattan - pearson_euclidean - spearman_euclidean - pearson_dot - spearman_dot - pearson_max - spearman_max pipeline_tag: sentence-similarity tags: - mteb - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:947818 - loss:SoftmaxLoss - loss:CosineSimilarityLoss - transformers model-index: - name: Omartificial-Intelligence-Space/GATE-AraBert-v1 results: - dataset: config: ar name: MTEB MIRACLRetrievalHardNegatives (ar) revision: 95c8db7d4a6e9c1d8a60601afd63d553ae20a2eb split: dev type: mteb/miracl-hard-negatives metrics: - type: ndcg_at_1 value: 48.699999999999996 - type: ndcg_at_3 value: 51.161 - type: ndcg_at_5 value: 53.923 - type: ndcg_at_10 value: 57.737 - type: ndcg_at_20 value: 60.475 - type: ndcg_at_100 value: 63.096 - type: ndcg_at_1000 value: 64.203 - type: map_at_1 value: 32.108 - type: map_at_3 value: 44.405 - type: map_at_5 value: 47.164 - type: map_at_10 value: 49.477 - type: map_at_20 value: 50.583999999999996 - type: map_at_100 value: 51.212999999999994 - type: map_at_1000 value: 51.281 - type: recall_at_1 value: 32.108 - type: recall_at_3 value: 52.675000000000004 - type: recall_at_5 value: 60.709 - type: recall_at_10 value: 70.61 - type: recall_at_20 value: 79.208 - type: recall_at_100 value: 89.805 - type: recall_at_1000 value: 96.935 - type: precision_at_1 value: 48.699999999999996 - type: precision_at_3 value: 29.7 - type: precision_at_5 value: 21.34 - type: precision_at_10 value: 12.98 - type: precision_at_20 value: 7.485 - type: precision_at_100 value: 1.772 - type: precision_at_1000 value: 0.193 - type: mrr_at_1 value: 48.699999999999996 - type: mrr_at_3 value: 57.5333 - type: 
mrr_at_5 value: 59.1333 - type: mrr_at_10 value: 60.1163 - type: mrr_at_20 value: 60.5298 - type: mrr_at_100 value: 60.691700000000004 - type: mrr_at_1000 value: 60.707699999999996 - type: nauc_ndcg_at_1_max value: 28.267999999999997 - type: nauc_ndcg_at_1_std value: -14.380299999999998 - type: nauc_ndcg_at_1_diff1 value: 46.6563 - type: nauc_ndcg_at_3_max value: 30.7733 - type: nauc_ndcg_at_3_std value: -18.2892 - type: nauc_ndcg_at_3_diff1 value: 42.4678 - type: nauc_ndcg_at_5_max value: 31.014799999999997 - type: nauc_ndcg_at_5_std value: -19.1886 - type: nauc_ndcg_at_5_diff1 value: 42.4011 - type: nauc_ndcg_at_10_max value: 31.895400000000002 - type: nauc_ndcg_at_10_std value: -17.1444 - type: nauc_ndcg_at_10_diff1 value: 41.7072 - type: nauc_ndcg_at_20_max value: 33.373799999999996 - type: nauc_ndcg_at_20_std value: -15.8958 - type: nauc_ndcg_at_20_diff1 value: 42.174800000000005 - type: nauc_ndcg_at_100_max value: 33.3595 - type: nauc_ndcg_at_100_std value: -13.684 - type: nauc_ndcg_at_100_diff1 value: 42.5765 - type: nauc_ndcg_at_1000_max value: 33.306799999999996 - type: nauc_ndcg_at_1000_std value: -13.309099999999999 - type: nauc_ndcg_at_1000_diff1 value: 42.7722 - type: nauc_map_at_1_max value: 20.7316 - type: nauc_map_at_1_std value: -19.7081 - type: nauc_map_at_1_diff1 value: 45.9673 - type: nauc_map_at_3_max value: 27.112399999999997 - type: nauc_map_at_3_std value: -21.3088 - type: nauc_map_at_3_diff1 value: 43.3808 - type: nauc_map_at_5_max value: 28.2607 - type: nauc_map_at_5_std value: -20.7577 - type: nauc_map_at_5_diff1 value: 42.7665 - type: nauc_map_at_10_max value: 29.418899999999997 - type: nauc_map_at_10_std value: -19.5259 - type: nauc_map_at_10_diff1 value: 42.277100000000004 - type: nauc_map_at_20_max value: 30.0292 - type: nauc_map_at_20_std value: -19.0805 - type: nauc_map_at_20_diff1 value: 42.5082 - type: nauc_map_at_100_max value: 30.098599999999998 - type: nauc_map_at_100_std value: -18.570600000000002 - type: nauc_map_at_100_diff1 
value: 42.58 - type: nauc_map_at_1000_max value: 30.0989 - type: nauc_map_at_1000_std value: -18.543499999999998 - type: nauc_map_at_1000_diff1 value: 42.5925 - type: nauc_recall_at_1_max value: 20.7316 - type: nauc_recall_at_1_std value: -19.7081 - type: nauc_recall_at_1_diff1 value: 45.9673 - type: nauc_recall_at_3_max value: 27.832600000000003 - type: nauc_recall_at_3_std value: -20.2727 - type: nauc_recall_at_3_diff1 value: 37.7177 - type: nauc_recall_at_5_max value: 29.0228 - type: nauc_recall_at_5_std value: -21.0861 - type: nauc_recall_at_5_diff1 value: 35.5742 - type: nauc_recall_at_10_max value: 29.876 - type: nauc_recall_at_10_std value: -15.8439 - type: nauc_recall_at_10_diff1 value: 31.639499999999998 - type: nauc_recall_at_20_max value: 36.5508 - type: nauc_recall_at_20_std value: -10.3007 - type: nauc_recall_at_20_diff1 value: 31.490800000000004 - type: nauc_recall_at_100_max value: 41.573 - type: nauc_recall_at_100_std value: 10.3383 - type: nauc_recall_at_100_diff1 value: 32.034400000000005 - type: nauc_recall_at_1000_max value: 61.4362 - type: nauc_recall_at_1000_std value: 69.59440000000001 - type: nauc_recall_at_1000_diff1 value: 29.5155 - type: nauc_precision_at_1_max value: 28.267999999999997 - type: nauc_precision_at_1_std value: -14.380299999999998 - type: nauc_precision_at_1_diff1 value: 46.6563 - type: nauc_precision_at_3_max value: 28.3698 - type: nauc_precision_at_3_std value: -5.6800999999999995 - type: nauc_precision_at_3_diff1 value: 20.9306 - type: nauc_precision_at_5_max value: 27.0061 - type: nauc_precision_at_5_std value: -0.2876 - type: nauc_precision_at_5_diff1 value: 12.3067 - type: nauc_precision_at_10_max value: 24.1662 - type: nauc_precision_at_10_std value: 8.8292 - type: nauc_precision_at_10_diff1 value: 4.0011 - type: nauc_precision_at_20_max value: 21.4451 - type: nauc_precision_at_20_std value: 13.3651 - type: nauc_precision_at_20_diff1 value: -0.7437 - type: nauc_precision_at_100_max value: 13.9599 - type: 
nauc_precision_at_100_std value: 22.5858 - type: nauc_precision_at_100_diff1 value: -6.833 - type: nauc_precision_at_1000_max value: 10.3447 - type: nauc_precision_at_1000_std value: 25.1309 - type: nauc_precision_at_1000_diff1 value: -10.5 - type: nauc_mrr_at_1_max value: 28.267999999999997 - type: nauc_mrr_at_1_std value: -14.380299999999998 - type: nauc_mrr_at_1_diff1 value: 46.6563 - type: nauc_mrr_at_3_max value: 33.0318 - type: nauc_mrr_at_3_std value: -12.7141 - type: nauc_mrr_at_3_diff1 value: 44.562200000000004 - type: nauc_mrr_at_5_max value: 33.292699999999996 - type: nauc_mrr_at_5_std value: -13.118599999999999 - type: nauc_mrr_at_5_diff1 value: 44.8335 - type: nauc_mrr_at_10_max value: 33.2046 - type: nauc_mrr_at_10_std value: -12.4948 - type: nauc_mrr_at_10_diff1 value: 44.7418 - type: nauc_mrr_at_20_max value: 33.2849 - type: nauc_mrr_at_20_std value: -12.2826 - type: nauc_mrr_at_20_diff1 value: 44.7205 - type: nauc_mrr_at_100_max value: 33.196799999999996 - type: nauc_mrr_at_100_std value: -12.2722 - type: nauc_mrr_at_100_diff1 value: 44.804300000000005 - type: nauc_mrr_at_1000_max value: 33.1847 - type: nauc_mrr_at_1000_std value: -12.2764 - type: nauc_mrr_at_1000_diff1 value: 44.806400000000004 - type: main_score value: 57.737 task: type: Retrieval - dataset: config: ara-ara name: MTEB MLQARetrieval (ara-ara) revision: 397ed406c1a7902140303e7faf60fff35b58d285 split: validation type: facebook/mlqa metrics: - type: ndcg_at_1 value: 47.776 - type: ndcg_at_3 value: 57.977999999999994 - type: ndcg_at_5 value: 60.275999999999996 - type: ndcg_at_10 value: 62.580000000000005 - type: ndcg_at_20 value: 63.857 - type: ndcg_at_100 value: 66.053 - type: ndcg_at_1000 value: 66.77 - type: map_at_1 value: 47.776 - type: map_at_3 value: 55.545 - type: map_at_5 value: 56.812 - type: map_at_10 value: 57.75599999999999 - type: map_at_20 value: 58.10999999999999 - type: map_at_100 value: 58.404999999999994 - type: map_at_1000 value: 58.440000000000005 - type: 
recall_at_1 value: 47.776 - type: recall_at_3 value: 64.99000000000001 - type: recall_at_5 value: 70.6 - type: recall_at_10 value: 77.756 - type: recall_at_20 value: 82.785 - type: recall_at_100 value: 94.77799999999999 - type: recall_at_1000 value: 100.0 - type: precision_at_1 value: 47.776 - type: precision_at_3 value: 21.663 - type: precision_at_5 value: 14.12 - type: precision_at_10 value: 7.776 - type: precision_at_20 value: 4.139 - type: precision_at_100 value: 0.9480000000000001 - type: precision_at_1000 value: 0.1 - type: mrr_at_1 value: 47.775600000000004 - type: mrr_at_3 value: 55.5448 - type: mrr_at_5 value: 56.8117 - type: mrr_at_10 value: 57.7562 - type: mrr_at_20 value: 58.10999999999999 - type: mrr_at_100 value: 58.404900000000005 - type: mrr_at_1000 value: 58.4397 - type: nauc_ndcg_at_1_max value: 68.0477 - type: nauc_ndcg_at_1_std value: 19.1911 - type: nauc_ndcg_at_1_diff1 value: 73.80640000000001 - type: nauc_ndcg_at_3_max value: 70.82159999999999 - type: nauc_ndcg_at_3_std value: 19.7998 - type: nauc_ndcg_at_3_diff1 value: 68.7149 - type: nauc_ndcg_at_5_max value: 70.812 - type: nauc_ndcg_at_5_std value: 19.3612 - type: nauc_ndcg_at_5_diff1 value: 67.281 - type: nauc_ndcg_at_10_max value: 71.02380000000001 - type: nauc_ndcg_at_10_std value: 20.4318 - type: nauc_ndcg_at_10_diff1 value: 67.0146 - type: nauc_ndcg_at_20_max value: 70.2041 - type: nauc_ndcg_at_20_std value: 19.5577 - type: nauc_ndcg_at_20_diff1 value: 66.4613 - type: nauc_ndcg_at_100_max value: 70.2647 - type: nauc_ndcg_at_100_std value: 20.4363 - type: nauc_ndcg_at_100_diff1 value: 67.5021 - type: nauc_ndcg_at_1000_max value: 70.1325 - type: nauc_ndcg_at_1000_std value: 19.697 - type: nauc_ndcg_at_1000_diff1 value: 68.05290000000001 - type: nauc_map_at_1_max value: 68.0477 - type: nauc_map_at_1_std value: 19.1911 - type: nauc_map_at_1_diff1 value: 73.80640000000001 - type: nauc_map_at_3_max value: 70.0727 - type: nauc_map_at_3_std value: 19.4799 - type: nauc_map_at_3_diff1 value: 
69.9485 - type: nauc_map_at_5_max value: 70.015 - type: nauc_map_at_5_std value: 19.2605 - type: nauc_map_at_5_diff1 value: 69.199 - type: nauc_map_at_10_max value: 70.0897 - type: nauc_map_at_10_std value: 19.6128 - type: nauc_map_at_10_diff1 value: 69.1356 - type: nauc_map_at_20_max value: 69.8725 - type: nauc_map_at_20_std value: 19.3681 - type: nauc_map_at_20_diff1 value: 69.0124 - type: nauc_map_at_100_max value: 69.8656 - type: nauc_map_at_100_std value: 19.4493 - type: nauc_map_at_100_diff1 value: 69.1296 - type: nauc_map_at_1000_max value: 69.8608 - type: nauc_map_at_1000_std value: 19.4191 - type: nauc_map_at_1000_diff1 value: 69.1547 - type: nauc_recall_at_1_max value: 68.0477 - type: nauc_recall_at_1_std value: 19.1911 - type: nauc_recall_at_1_diff1 value: 73.80640000000001 - type: nauc_recall_at_3_max value: 73.2764 - type: nauc_recall_at_3_std value: 20.9191 - type: nauc_recall_at_3_diff1 value: 64.7349 - type: nauc_recall_at_5_max value: 73.7417 - type: nauc_recall_at_5_std value: 19.6995 - type: nauc_recall_at_5_diff1 value: 60.2181 - type: nauc_recall_at_10_max value: 75.31 - type: nauc_recall_at_10_std value: 24.833199999999998 - type: nauc_recall_at_10_diff1 value: 57.29729999999999 - type: nauc_recall_at_20_max value: 70.9915 - type: nauc_recall_at_20_std value: 20.3983 - type: nauc_recall_at_20_diff1 value: 51.2804 - type: nauc_recall_at_100_max value: 75.0448 - type: nauc_recall_at_100_std value: 46.0233 - type: nauc_recall_at_100_diff1 value: 48.8265 - type: nauc_precision_at_1_max value: 68.0477 - type: nauc_precision_at_1_std value: 19.1911 - type: nauc_precision_at_1_diff1 value: 73.80640000000001 - type: nauc_precision_at_3_max value: 73.2764 - type: nauc_precision_at_3_std value: 20.9191 - type: nauc_precision_at_3_diff1 value: 64.7349 - type: nauc_precision_at_5_max value: 73.7417 - type: nauc_precision_at_5_std value: 19.6995 - type: nauc_precision_at_5_diff1 value: 60.2181 - type: nauc_precision_at_10_max value: 75.31 - type: 
nauc_precision_at_10_std value: 24.833199999999998 - type: nauc_precision_at_10_diff1 value: 57.29729999999999 - type: nauc_precision_at_20_max value: 70.9915 - type: nauc_precision_at_20_std value: 20.3983 - type: nauc_precision_at_20_diff1 value: 51.2804 - type: nauc_precision_at_100_max value: 75.0448 - type: nauc_precision_at_100_std value: 46.0233 - type: nauc_precision_at_100_diff1 value: 48.8265 - type: nauc_precision_at_1000_max value: 100.0 - type: nauc_precision_at_1000_std value: 100.0 - type: nauc_precision_at_1000_diff1 value: 100.0 - type: nauc_mrr_at_1_max value: 68.0477 - type: nauc_mrr_at_1_std value: 19.1911 - type: nauc_mrr_at_1_diff1 value: 73.80640000000001 - type: nauc_mrr_at_3_max value: 70.0727 - type: nauc_mrr_at_3_std value: 19.4799 - type: nauc_mrr_at_3_diff1 value: 69.9485 - type: nauc_mrr_at_5_max value: 70.015 - type: nauc_mrr_at_5_std value: 19.2605 - type: nauc_mrr_at_5_diff1 value: 69.199 - type: nauc_mrr_at_10_max value: 70.0897 - type: nauc_mrr_at_10_std value: 19.6128 - type: nauc_mrr_at_10_diff1 value: 69.1356 - type: nauc_mrr_at_20_max value: 69.8725 - type: nauc_mrr_at_20_std value: 19.3681 - type: nauc_mrr_at_20_diff1 value: 69.0124 - type: nauc_mrr_at_100_max value: 69.8656 - type: nauc_mrr_at_100_std value: 19.4493 - type: nauc_mrr_at_100_diff1 value: 69.1296 - type: nauc_mrr_at_1000_max value: 69.8608 - type: nauc_mrr_at_1000_std value: 19.4191 - type: nauc_mrr_at_1000_diff1 value: 69.1547 - type: main_score value: 62.580000000000005 task: type: Retrieval - dataset: config: ara-ara name: MTEB MLQARetrieval (ara-ara) revision: 397ed406c1a7902140303e7faf60fff35b58d285 split: test type: facebook/mlqa metrics: - type: ndcg_at_1 value: 37.409 - type: ndcg_at_3 value: 44.269 - type: ndcg_at_5 value: 46.23 - type: ndcg_at_10 value: 48.076 - type: ndcg_at_20 value: 49.679 - type: ndcg_at_100 value: 52.037 - type: ndcg_at_1000 value: 53.958 - type: map_at_1 value: 37.399 - type: map_at_3 value: 42.577999999999996 - type: map_at_5 
value: 43.661 - type: map_at_10 value: 44.42 - type: map_at_20 value: 44.861000000000004 - type: map_at_100 value: 45.179 - type: map_at_1000 value: 45.242 - type: recall_at_1 value: 37.399 - type: recall_at_3 value: 49.156 - type: recall_at_5 value: 53.937999999999995 - type: recall_at_10 value: 59.657000000000004 - type: recall_at_20 value: 65.995 - type: recall_at_100 value: 78.821 - type: recall_at_1000 value: 94.45 - type: precision_at_1 value: 37.409 - type: precision_at_3 value: 16.389 - type: precision_at_5 value: 10.789 - type: precision_at_10 value: 5.9670000000000005 - type: precision_at_20 value: 3.3000000000000003 - type: precision_at_100 value: 0.788 - type: precision_at_1000 value: 0.094 - type: mrr_at_1 value: 37.4086 - type: mrr_at_3 value: 42.587 - type: mrr_at_5 value: 43.6699 - type: mrr_at_10 value: 44.4297 - type: mrr_at_20 value: 44.8704 - type: mrr_at_100 value: 45.1881 - type: mrr_at_1000 value: 45.251000000000005 - type: nauc_ndcg_at_1_max value: 61.8437 - type: nauc_ndcg_at_1_std value: 10.782 - type: nauc_ndcg_at_1_diff1 value: 66.1842 - type: nauc_ndcg_at_3_max value: 63.157399999999996 - type: nauc_ndcg_at_3_std value: 13.114899999999999 - type: nauc_ndcg_at_3_diff1 value: 60.312 - type: nauc_ndcg_at_5_max value: 63.027100000000004 - type: nauc_ndcg_at_5_std value: 13.995099999999999 - type: nauc_ndcg_at_5_diff1 value: 59.272499999999994 - type: nauc_ndcg_at_10_max value: 63.0273 - type: nauc_ndcg_at_10_std value: 14.898700000000002 - type: nauc_ndcg_at_10_diff1 value: 58.2739 - type: nauc_ndcg_at_20_max value: 62.785199999999996 - type: nauc_ndcg_at_20_std value: 15.259800000000002 - type: nauc_ndcg_at_20_diff1 value: 57.8913 - type: nauc_ndcg_at_100_max value: 62.641999999999996 - type: nauc_ndcg_at_100_std value: 15.738299999999999 - type: nauc_ndcg_at_100_diff1 value: 58.2303 - type: nauc_ndcg_at_1000_max value: 62.7624 - type: nauc_ndcg_at_1000_std value: 15.1653 - type: nauc_ndcg_at_1000_diff1 value: 58.9359 - type: 
nauc_map_at_1_max value: 61.800900000000006 - type: nauc_map_at_1_std value: 10.7369 - type: nauc_map_at_1_diff1 value: 66.18270000000001 - type: nauc_map_at_3_max value: 62.8757 - type: nauc_map_at_3_std value: 12.5061 - type: nauc_map_at_3_diff1 value: 61.767 - type: nauc_map_at_5_max value: 62.793299999999995 - type: nauc_map_at_5_std value: 12.964500000000001 - type: nauc_map_at_5_diff1 value: 61.211000000000006 - type: nauc_map_at_10_max value: 62.8054 - type: nauc_map_at_10_std value: 13.328000000000001 - type: nauc_map_at_10_diff1 value: 60.833400000000005 - type: nauc_map_at_20_max value: 62.734199999999994 - type: nauc_map_at_20_std value: 13.4114 - type: nauc_map_at_20_diff1 value: 60.747099999999996 - type: nauc_map_at_100_max value: 62.7054 - type: nauc_map_at_100_std value: 13.4556 - type: nauc_map_at_100_diff1 value: 60.79259999999999 - type: nauc_map_at_1000_max value: 62.71099999999999 - type: nauc_map_at_1000_std value: 13.444400000000002 - type: nauc_map_at_1000_diff1 value: 60.815 - type: nauc_recall_at_1_max value: 61.800900000000006 - type: nauc_recall_at_1_std value: 10.7369 - type: nauc_recall_at_1_diff1 value: 66.18270000000001 - type: nauc_recall_at_3_max value: 63.914300000000004 - type: nauc_recall_at_3_std value: 14.8614 - type: nauc_recall_at_3_diff1 value: 56.044700000000006 - type: nauc_recall_at_5_max value: 63.6523 - type: nauc_recall_at_5_std value: 17.2352 - type: nauc_recall_at_5_diff1 value: 53.2316 - type: nauc_recall_at_10_max value: 63.6138 - type: nauc_recall_at_10_std value: 20.4315 - type: nauc_recall_at_10_diff1 value: 49.4388 - type: nauc_recall_at_20_max value: 62.605 - type: nauc_recall_at_20_std value: 22.8045 - type: nauc_recall_at_20_diff1 value: 46.5945 - type: nauc_recall_at_100_max value: 61.5178 - type: nauc_recall_at_100_std value: 30.4825 - type: nauc_recall_at_100_diff1 value: 44.9405 - type: nauc_recall_at_1000_max value: 63.473 - type: nauc_recall_at_1000_std value: 39.1421 - type: nauc_recall_at_1000_diff1 
value: 43.4873 - type: nauc_precision_at_1_max value: 61.8437 - type: nauc_precision_at_1_std value: 10.782 - type: nauc_precision_at_1_diff1 value: 66.1842 - type: nauc_precision_at_3_max value: 63.962799999999994 - type: nauc_precision_at_3_std value: 14.908299999999999 - type: nauc_precision_at_3_diff1 value: 56.0511 - type: nauc_precision_at_5_max value: 63.7072 - type: nauc_precision_at_5_std value: 17.2854 - type: nauc_precision_at_5_diff1 value: 53.2417 - type: nauc_precision_at_10_max value: 63.672200000000004 - type: nauc_precision_at_10_std value: 20.485300000000002 - type: nauc_precision_at_10_diff1 value: 49.4491 - type: nauc_precision_at_20_max value: 62.674600000000005 - type: nauc_precision_at_20_std value: 22.8667 - type: nauc_precision_at_20_diff1 value: 46.6088 - type: nauc_precision_at_100_max value: 61.622600000000006 - type: nauc_precision_at_100_std value: 30.5766 - type: nauc_precision_at_100_diff1 value: 44.9643 - type: nauc_precision_at_1000_max value: 63.131400000000006 - type: nauc_precision_at_1000_std value: 39.6527 - type: nauc_precision_at_1000_diff1 value: 42.9196 - type: nauc_mrr_at_1_max value: 61.8437 - type: nauc_mrr_at_1_std value: 10.782 - type: nauc_mrr_at_1_diff1 value: 66.1842 - type: nauc_mrr_at_3_max value: 62.9188 - type: nauc_mrr_at_3_std value: 12.5514 - type: nauc_mrr_at_3_diff1 value: 61.768699999999995 - type: nauc_mrr_at_5_max value: 62.836800000000004 - type: nauc_mrr_at_5_std value: 13.0102 - type: nauc_mrr_at_5_diff1 value: 61.2128 - type: nauc_mrr_at_10_max value: 62.8492 - type: nauc_mrr_at_10_std value: 13.3741 - type: nauc_mrr_at_10_diff1 value: 60.8352 - type: nauc_mrr_at_20_max value: 62.7783 - type: nauc_mrr_at_20_std value: 13.4578 - type: nauc_mrr_at_20_diff1 value: 60.74889999999999 - type: nauc_mrr_at_100_max value: 62.7497 - type: nauc_mrr_at_100_std value: 13.5022 - type: nauc_mrr_at_100_diff1 value: 60.7944 - type: nauc_mrr_at_1000_max value: 62.7546 - type: nauc_mrr_at_1000_std value: 
13.490499999999999 - type: nauc_mrr_at_1000_diff1 value: 60.8168 - type: main_score value: 48.076 task: type: Retrieval - dataset: config: ar name: MTEB MintakaRetrieval (ar) revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e split: test type: jinaai/mintakaqa metrics: - type: ndcg_at_1 value: 11.212 - type: ndcg_at_3 value: 16.08 - type: ndcg_at_5 value: 17.543 - type: ndcg_at_10 value: 19.13 - type: map_at_1 value: 11.212 - type: map_at_3 value: 14.904 - type: map_at_5 value: 15.719 - type: map_at_10 value: 16.375 - type: recall_at_1 value: 11.212 - type: recall_at_3 value: 19.473 - type: recall_at_5 value: 23.014000000000003 - type: recall_at_10 value: 27.916 - type: precision_at_1 value: 11.212 - type: precision_at_3 value: 6.491 - type: precision_at_5 value: 4.603 - type: precision_at_10 value: 2.792 - type: mrr_at_1 value: 11.212 - type: mrr_at_3 value: 14.9039 - type: mrr_at_5 value: 15.7187 - type: mrr_at_10 value: 16.3746 - type: main_score value: 19.13 task: type: Retrieval - dataset: config: default name: MTEB SadeemQuestionRetrieval (default) revision: 3cb0752b182e5d5d740df547748b06663c8e0bd9 split: test type: sadeem-ai/sadeem-ar-eval-retrieval-questions metrics: - type: ndcg_at_1 value: 28.674 - type: ndcg_at_3 value: 60.604 - type: ndcg_at_5 value: 62.092000000000006 - type: ndcg_at_10 value: 63.154999999999994 - type: ndcg_at_20 value: 63.602000000000004 - type: ndcg_at_100 value: 64.242 - type: ndcg_at_1000 value: 64.50399999999999 - type: map_at_1 value: 28.674 - type: map_at_3 value: 52.21 - type: map_at_5 value: 53.052 - type: map_at_10 value: 53.498000000000005 - type: map_at_20 value: 53.620000000000005 - type: map_at_100 value: 53.715999999999994 - type: map_at_1000 value: 53.726 - type: recall_at_1 value: 28.674 - type: recall_at_3 value: 85.112 - type: recall_at_5 value: 88.655 - type: recall_at_10 value: 91.91 - type: recall_at_20 value: 93.681 - type: recall_at_100 value: 97.032 - type: recall_at_1000 value: 99.09 - type: precision_at_1 
value: 28.674 - type: precision_at_3 value: 28.371000000000002 - type: precision_at_5 value: 17.730999999999998 - type: precision_at_10 value: 9.191 - type: precision_at_20 value: 4.684 - type: precision_at_100 value: 0.97 - type: precision_at_1000 value: 0.099 - type: mrr_at_1 value: 25.371 - type: mrr_at_3 value: 50.2314 - type: mrr_at_5 value: 51.0212 - type: mrr_at_10 value: 51.481100000000005 - type: mrr_at_20 value: 51.6128 - type: mrr_at_100 value: 51.7119 - type: mrr_at_1000 value: 51.722100000000005 - type: nauc_ndcg_at_1_max value: 15.7219 - type: nauc_ndcg_at_1_std value: 2.2991 - type: nauc_ndcg_at_1_diff1 value: 5.2984 - type: nauc_ndcg_at_3_max value: 37.9971 - type: nauc_ndcg_at_3_std value: 11.0045 - type: nauc_ndcg_at_3_diff1 value: -38.8501 - type: nauc_ndcg_at_5_max value: 35.6057 - type: nauc_ndcg_at_5_std value: 10.8947 - type: nauc_ndcg_at_5_diff1 value: -33.353500000000004 - type: nauc_ndcg_at_10_max value: 33.5856 - type: nauc_ndcg_at_10_std value: 11.5392 - type: nauc_ndcg_at_10_diff1 value: -29.2831 - type: nauc_ndcg_at_20_max value: 32.4619 - type: nauc_ndcg_at_20_std value: 11.145900000000001 - type: nauc_ndcg_at_20_diff1 value: -26.8202 - type: nauc_ndcg_at_100_max value: 30.888199999999998 - type: nauc_ndcg_at_100_std value: 10.2467 - type: nauc_ndcg_at_100_diff1 value: -23.5621 - type: nauc_ndcg_at_1000_max value: 30.173699999999997 - type: nauc_ndcg_at_1000_std value: 9.7867 - type: nauc_ndcg_at_1000_diff1 value: -22.1022 - type: nauc_map_at_1_max value: 15.7219 - type: nauc_map_at_1_std value: 2.2991 - type: nauc_map_at_1_diff1 value: 5.2984 - type: nauc_map_at_3_max value: 30.0249 - type: nauc_map_at_3_std value: 8.110199999999999 - type: nauc_map_at_3_diff1 value: -22.6437 - type: nauc_map_at_5_max value: 28.654600000000002 - type: nauc_map_at_5_std value: 7.9832 - type: nauc_map_at_5_diff1 value: -19.4967 - type: nauc_map_at_10_max value: 27.825400000000002 - type: nauc_map_at_10_std value: 8.1387 - type: nauc_map_at_10_diff1 
value: -17.8584 - type: nauc_map_at_20_max value: 27.5484 - type: nauc_map_at_20_std value: 8.0305 - type: nauc_map_at_20_diff1 value: -17.2515 - type: nauc_map_at_100_max value: 27.3449 - type: nauc_map_at_100_std value: 7.919099999999999 - type: nauc_map_at_100_diff1 value: -16.8205 - type: nauc_map_at_1000_max value: 27.3212 - type: nauc_map_at_1000_std value: 7.9052999999999995 - type: nauc_map_at_1000_diff1 value: -16.7727 - type: nauc_recall_at_1_max value: 15.7219 - type: nauc_recall_at_1_std value: 2.2991 - type: nauc_recall_at_1_diff1 value: 5.2984 - type: nauc_recall_at_3_max value: 82.6956 - type: nauc_recall_at_3_std value: 27.086700000000004 - type: nauc_recall_at_3_diff1 value: -129.9841 - type: nauc_recall_at_5_max value: 83.4035 - type: nauc_recall_at_5_std value: 30.9258 - type: nauc_recall_at_5_diff1 value: -128.9633 - type: nauc_recall_at_10_max value: 85.344 - type: nauc_recall_at_10_std value: 44.183699999999995 - type: nauc_recall_at_10_diff1 value: -132.1167 - type: nauc_recall_at_20_max value: 84.9071 - type: nauc_recall_at_20_std value: 48.2337 - type: nauc_recall_at_20_diff1 value: -128.3509 - type: nauc_recall_at_100_max value: 86.99470000000001 - type: nauc_recall_at_100_std value: 56.7091 - type: nauc_recall_at_100_diff1 value: -126.3125 - type: nauc_recall_at_1000_max value: 90.02329999999999 - type: nauc_recall_at_1000_std value: 79.7439 - type: nauc_recall_at_1000_diff1 value: -106.8251 - type: nauc_precision_at_1_max value: 15.7219 - type: nauc_precision_at_1_std value: 2.2991 - type: nauc_precision_at_1_diff1 value: 5.2984 - type: nauc_precision_at_3_max value: 82.6956 - type: nauc_precision_at_3_std value: 27.086700000000004 - type: nauc_precision_at_3_diff1 value: -129.9841 - type: nauc_precision_at_5_max value: 83.4035 - type: nauc_precision_at_5_std value: 30.9258 - type: nauc_precision_at_5_diff1 value: -128.9633 - type: nauc_precision_at_10_max value: 85.344 - type: nauc_precision_at_10_std value: 44.183699999999995 - type: 
nauc_precision_at_10_diff1 value: -132.1167 - type: nauc_precision_at_20_max value: 84.9071 - type: nauc_precision_at_20_std value: 48.2337 - type: nauc_precision_at_20_diff1 value: -128.3509 - type: nauc_precision_at_100_max value: 86.99470000000001 - type: nauc_precision_at_100_std value: 56.7091 - type: nauc_precision_at_100_diff1 value: -126.3125 - type: nauc_precision_at_1000_max value: 90.02329999999999 - type: nauc_precision_at_1000_std value: 79.7439 - type: nauc_precision_at_1000_diff1 value: -106.8251 - type: nauc_mrr_at_1_max value: 10.6156 - type: nauc_mrr_at_1_std value: 1.6084 - type: nauc_mrr_at_1_diff1 value: -23.3205 - type: nauc_mrr_at_3_max value: 26.458599999999997 - type: nauc_mrr_at_3_std value: 7.3822 - type: nauc_mrr_at_3_diff1 value: -46.1904 - type: nauc_mrr_at_5_max value: 25.0415 - type: nauc_mrr_at_5_std value: 7.0684 - type: nauc_mrr_at_5_diff1 value: -43.5901 - type: nauc_mrr_at_10_max value: 24.1152 - type: nauc_mrr_at_10_std value: 7.4148000000000005 - type: nauc_mrr_at_10_diff1 value: -42.271300000000004 - type: nauc_mrr_at_20_max value: 23.7882 - type: nauc_mrr_at_20_std value: 7.2852 - type: nauc_mrr_at_20_diff1 value: -41.7586 - type: nauc_mrr_at_100_max value: 23.552999999999997 - type: nauc_mrr_at_100_std value: 7.1595 - type: nauc_mrr_at_100_diff1 value: -41.3884 - type: nauc_mrr_at_1000_max value: 23.5271 - type: nauc_mrr_at_1000_std value: 7.1449 - type: nauc_mrr_at_1000_diff1 value: -41.3491 - type: main_score value: 63.154999999999994 task: type: Retrieval - dataset: config: ar-ar name: MTEB STS17 (ar-ar) revision: faeb762787bd10488a50c8b5be4a3b82e411949c split: test type: mteb/sts17-crosslingual-sts metrics: - type: cosine_pearson value: 82.06597171670848 - type: cosine_spearman value: 82.7809395809498 - type: euclidean_pearson value: 79.23996991139896 - type: euclidean_spearman value: 81.5287595404711 - type: main_score value: 82.7809395809498 - type: manhattan_pearson value: 78.95407006608013 - type: manhattan_spearman 
value: 81.15109493737467 task: type: STS - dataset: config: ar name: MTEB STS22.v2 (ar) revision: d31f33a128469b20e357535c39b82fb3c3f6f2bd split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 54.912880452465004 - type: cosine_spearman value: 63.09788380910325 - type: euclidean_pearson value: 57.92665617677832 - type: euclidean_spearman value: 62.76032598469037 - type: main_score value: 63.09788380910325 - type: manhattan_pearson value: 58.0736648155273 - type: manhattan_spearman value: 62.94190582776664 task: type: STS - dataset: config: ar name: MTEB STS22 (ar) revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 split: test type: mteb/sts22-crosslingual-sts metrics: - type: cosine_pearson value: 51.72534929358701 - type: cosine_spearman value: 59.75149627160101 - type: euclidean_pearson value: 53.894835373598774 - type: euclidean_spearman value: 59.44278354697161 - type: main_score value: 59.75149627160101 - type: manhattan_pearson value: 54.076675975406985 - type: manhattan_spearman value: 59.610061143235725 task: type: STS widget: - source_sentence: امرأة تكتب شيئاً sentences: - مراهق يتحدث إلى فتاة عبر كاميرا الإنترنت - امرأة تقطع البصل الأخضر. - مجموعة من كبار السن يتظاهرون حول طاولة الطعام. - source_sentence: تتشكل النجوم في مناطق تكوين النجوم، والتي تنشأ نفسها من السحب الجزيئية. sentences: - لاعب كرة السلة على وشك تسجيل نقاط لفريقه. - المقال التالي مأخوذ من نسختي من "أطلس البطريق الجديد للتاريخ الوسطى" - قد يكون من الممكن أن يوجد نظام شمسي مثل نظامنا خارج المجرة - source_sentence: >- تحت السماء الزرقاء مع الغيوم البيضاء، يصل طفل لمس مروحة طائرة واقفة على حقل من العشب. sentences: - امرأة تحمل كأساً - طفل يحاول لمس مروحة طائرة - اثنان من عازبين عن الشرب يستعدون للعشاء - source_sentence: رجل في منتصف العمر يحلق لحيته في غرفة ذات جدران بيضاء والتي لا تبدو كحمام sentences: - فتى يخطط اسمه على مكتبه - رجل ينام - المرأة وحدها وهي نائمة في غرفة نومها - source_sentence: الكلب البني مستلقي على جانبه على سجادة بيج، مع جسم أخضر في المقدمة. 
sentences: - شخص طويل القامة - المرأة تنظر من النافذة. - لقد مات الكلب license: apache-2.0
---

# GATE-AraBert-V1

This is **GATE | General Arabic Text Embedding**, a model trained with SentenceTransformers in a **multi-task** setup on the **AllNLI** and **STS** datasets.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2](https://huggingface.co/Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
    - [all-nli](https://huggingface.co/datasets/Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class)
    - [sts](https://huggingface.co/datasets/Omartificial-Intelligence-Space/arabic-stsb)
- **Language:** ar

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Omartificial-Intelligence-Space/GATE-AraBert-v1")

# Run inference
sentences = [
    'الكلب البني مستلقي على جانبه على سجادة بيج، مع جسم أخضر في المقدمة.',
    'لقد مات الكلب',
    'شخص طويل القامة',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```

## Evaluation

### Metrics

#### Semantic Similarity
* Dataset: `sts-dev`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| pearson_cosine      | 0.8391    |
| **spearman_cosine** | **0.841** |
| pearson_manhattan   | 0.8277    |
| spearman_manhattan  | 0.8361    |
| pearson_euclidean   | 0.8274    |
| spearman_euclidean  | 0.8358    |
| pearson_dot         | 0.8154    |
| spearman_dot        | 0.818     |
| pearson_max         | 0.8391    |
| spearman_max        | 0.841     |

#### Semantic Similarity
* Dataset: `sts-test`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.813      |
| **spearman_cosine** | **0.8173** |
| pearson_manhattan   | 0.8114     |
| spearman_manhattan  | 0.8164     |
| pearson_euclidean   | 0.8103     |
| spearman_euclidean  | 0.8158     |
| pearson_dot         | 0.7908     |
| spearman_dot        | 0.7887     |
| pearson_max         | 0.813      |
| spearman_max        | 0.8173     |

## Acknowledgments

The author would like to thank Prince Sultan University for their invaluable support in this project. Their contributions and resources have been instrumental in the development and fine-tuning of these models.
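## Appendix: How `spearman_cosine` Is Computed

The headline metric in the evaluation tables, `spearman_cosine`, is the Spearman rank correlation between the cosine similarity of each sentence pair's embeddings and the human-annotated similarity score. The sketch below illustrates that idea in plain Python. It is not the `EmbeddingSimilarityEvaluator` implementation: the helper names (`cosine_similarity`, `average_ranks`, `pearson`, `spearman`), the 2-dimensional toy vectors, and the gold scores are all made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: four embedding pairs and made-up gold scores (0 to 5).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.2]),
]
gold_scores = [4.8, 3.0, 1.5, 0.2]

predicted = [cosine_similarity(u, v) for u, v in pairs]
spearman_cosine = spearman(predicted, gold_scores)
print(round(spearman_cosine, 4))
```

Because Spearman correlation compares only rankings, a model scores well when it orders sentence pairs the same way the annotators did, even if its raw cosine values are on a different scale from the gold scores.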
## Citation

If you use GATE, please cite it as follows:

```bibtex
@misc{nacar2025GATE,
      title={GATE: General Arabic Text Embedding for Enhanced Semantic Textual Similarity with Hybrid Loss Training},
      author={Omer Nacar and Anis Koubaa and Serry Taiseer Sibaee and Lahouari Ghouti},
      year={2025},
      note={Submitted to COLING 2025},
      url={https://huggingface.co/Omartificial-Intelligence-Space/GATE-AraBert-v1},
}
```