Commit fe3f323 (verified) by thenlper · 1 Parent(s): 8a5dd6b

Update README.md

Files changed (1)
  1. README.md +13 -11
README.md CHANGED
@@ -2608,7 +2608,7 @@ model-index:
  
  # gte-base-en-v1.5
  
- We introduce `gte-v1.5` series, upgraded `gte` embeddings that support the context length of up to **8192**.
+ We introduce `gte-v1.5` series, upgraded `gte` embeddings that support the context length of up to **8192**,while further enhancing model performance.
  The models are built upon the `transformer++` encoder [backbone](https://huggingface.co/Alibaba-NLP/new-impl) (BERT + RoPE + GLU).
  
  The `gte-v1.5` series achieve state-of-the-art scores on the MTEB benchmark within the same model size category and prodvide competitive on the LoCo long-context retrieval tests (refer to [Evaluation](#evaluation)).
@@ -2689,8 +2689,8 @@ print(cos_sim(embeddings[0], embeddings[1]))
  ### Training Data
  
  - Masked language modeling (MLM): `c4-en`
- - Weak-supervised contrastive (WSC) pre-training: GTE pre-training data
- - Supervised contrastive fine-tuning: GTE fine-tuning data
+ - Weak-supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+ - Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
  
  ### Training Procedure
  
@@ -2734,14 +2734,16 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=81
  
  
  
- ## Citation [TODO]
+ ## Citation
+ If you find our paper or models helpful, please consider citing them as follows:
  
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
+ ```
+ @article{li2023towards,
+ title={Towards general text embeddings with multi-stage contrastive learning},
+ author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
+ journal={arXiv preprint arXiv:2308.03281},
+ year={2023}
+ }
+ ```
  
- **APA:**
  
- [More Information Needed]