hotchpotch/japanese-splade-base-v1


hotchpotch committed on Oct 6, 2024

Commit c741023 · verified · 1 Parent(s): 13f391a

Update README.md

Files changed (1)

README.md +6 -2


@@ -12,7 +12,7 @@ base_model:
 A high-performance Japanese [SPLADE](https://github.com/naver/splade) (Sparse Lexical and Expansion Model) model. With the [text-to-sparse-vector conversion demo](https://huggingface.co/spaces/hotchpotch/japanese-splade-demo-streamlit), you can easily try out from a web UI what kind of sparse vector a text is converted into.
-The technical report will be published at a later date.
+Note that the technical report will be published at a later date.
 # Usage
@@ -146,4 +146,8 @@ print(similarity)
 ## Training datasets
  From [hpprc/emb](https://huggingface.co/datasets/hpprc/emb), the auto-wiki-qa, mmarco, jsquad jaquad, auto-wiki-qa-nemotron, quiz-works quiz-no-mori, miracl, jqara mr-tydi, baobab-wiki-retrieval, mkqa datasets are used.
- MS Marco is also used as an English dataset.
+ MS Marco is also used as an English dataset.
+## Notes
+A `tokenizer.json` file is bundled, but it is a dummy file that exists only so that text-embeddings-inference can run. For details, see [Running inference for Japanese tokenizer models with text-embeddings-inference](https://secon.dev/entry/2024/09/30/160000/).