maidalun1020 committed on
Commit
eaa31a5
·
verified ·
1 Parent(s): c74b839

Update README.md

Files changed (1)
  1. README.md +5 -21
README.md CHANGED
@@ -36,27 +36,11 @@ language:
  <a href="https://github.com/netease-youdao/BCEmbedding">GitHub</a>
  </p>
 
- ### Our Goals
-
- Provide a bilingual and crosslingual two-stage retrieval model repository for the RAG community, which can be used directly without finetuning, including `EmbeddingModel` and `RerankerModel`:
-
- - One Model: `EmbeddingModel` handles **bilingual and crosslingual** retrieval tasks in English and Chinese. `RerankerModel` supports **English, Chinese, Japanese and Korean**.
- - One Model: **Covers common business application scenarios with RAG optimization**, e.g. Education, Medical Scenario, Law, Finance, Literature, FAQ, Textbook, Wikipedia, General Conversation.
- - Easy to Integrate: We provide an **API** in `BCEmbedding` for LlamaIndex and LangChain integrations.
- - Other Points:
-   - `RerankerModel` supports **reranking of long passages (more than 512 tokens)**;
-   - `RerankerModel` provides a **meaningful relevance score** that helps to remove low-quality passages.
-   - `EmbeddingModel` **does not need specific instructions**.
-
- Give the RAG community a Chinese-English bilingual and crosslingual two-stage retrieval model library that can be used directly, with as little user finetuning as possible, including `EmbeddingModel` and `RerankerModel`.
-
- - Only one model needed: `EmbeddingModel` covers **Chinese-English bilingual and crosslingual** retrieval tasks, with particular strength in crosslingual retrieval. `RerankerModel` supports **Chinese, English, Japanese and Korean** and crosslingual retrieval among them.
- - Only one model needed: **covers common business domains** (optimized for many common RAG scenarios), such as education, medical, law, finance, research papers, customer service (FAQ), books, encyclopedias and general QA. Users can use it directly without finetuning on these domains.
- - Easy to integrate: `EmbeddingModel` and `RerankerModel` provide LlamaIndex and LangChain **integration interfaces**, so users can easily integrate them into existing products.
- - Other features:
-   - `RerankerModel` supports **reranking of long passages (more than 512)**;
-   - `RerankerModel` gives a meaningful **relevance score** that helps **filter out low-quality recall results**;
-   - `EmbeddingModel` **does not need a "carefully designed" instruction** and recalls as many useful passages as possible.
+ Key Features:
+ - Multilingual and crosslingual capability in English, Chinese, Japanese and Korean;
+ - RAG optimization adapted to more real-world business domains, including Education, Law, Finance, Medical, Literature, FAQ, Textbook, Wikipedia, etc.;
+ - Reranking of long passages beyond the 512 limit, handled in <a href="https://github.com/netease-youdao/BCEmbedding">BCEmbedding</a>;
+ - RerankerModel provides a **meaningful relevance score** that shows how relevant each passage is to the query and can be used to filter out low-quality passages.
 
  Related link for **EmbeddingModel** : [bce-embedding-base_v1](https://huggingface.co/maidalun1020/bce-embedding-base_v1)
 
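
The relevance-score feature in the changed README text can be illustrated with a minimal sketch of filtering reranked passages by a score cutoff. The function name `filter_by_score`, the threshold `0.4`, and the sample scores are hypothetical and not taken from BCEmbedding; in practice the scores would come from `RerankerModel`, and a suitable cutoff would have to be chosen on your own data.

```python
from typing import List, Tuple


def filter_by_score(
    passages: List[str],
    scores: List[float],
    threshold: float = 0.4,  # hypothetical cutoff, not a BCEmbedding default
) -> List[Tuple[str, float]]:
    """Keep passages scoring at least `threshold`, highest score first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    kept = [(p, s) for p, s in zip(passages, scores) if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up scores standing in for reranker output.
passages = ["passage A", "passage B", "passage C"]
scores = [0.82, 0.15, 0.47]
results = filter_by_score(passages, scores)
```

Here `results` keeps only the passages at or above the cutoff, so the low-scoring "passage B" is dropped before the passages reach the generation step of a RAG pipeline.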