answerdotai/answerai-colbert-small-v1

@@ -16,52 +16,6 @@ While being MiniLM-sized, it outperforms all previous similarly-sized models on
 For more information about this model or how it was trained, head over to the [announcement blogpost](https://www.answer.ai/posts/2024-08-13-small-but-mighty-colbert.html).
-## Results
-### Against single-vector models
-![](https://www.answer.ai/posts/images/minicolbert/small_results.png)
-| Dataset / Model | answer-colbert-s | snowflake-s | bge-small-en | bge-base-en |
-|:-----------------|:-----------------:|:-------------:|:-------------:|:-------------:|
-| **Size**        |     33M (1x)     |   33M (1x)   |   33M (1x)   | **109M (3.3x)** |
-| **BEIR AVG**    |      **53.79**       |    51.99     |    51.68     |    53.25      |
-| **FiQA2018**    |      **41.15**       |    40.65     |    40.34     |    40.65      |
-| **HotpotQA**    |    **76.11**     |    66.54     |    69.94     |    72.6       |
-| **MSMARCO**     |    **43.5**      |    40.23     |    40.83     |    41.35      |
-| **NQ**          |      **59.1**        |    50.9      |    50.18     |    54.15      |
-| **TRECCOVID**   |    **84.59**     |    80.12     |    75.9      |    78.07      |
-| **ArguAna**     |      50.09       |    57.59     |    59.55     |  **63.61**    |
-| **ClimateFEVER**|      **33.07**       |    35.2      |    31.84     |    31.17      |
-| **CQADupstackRetrieval** |  38.75  |    39.65     |    39.05     |    **42.35**      |
-| **DBPedia**     |    **45.58**     |    41.02     |    40.03     |    40.77      |
-| **FEVER**       |    **90.96**     |    87.13     |    86.64     |    86.29      |
-| **NFCorpus**    |    **37.3**      |    34.92     |    34.3      |    37.39      |
-| **QuoraRetrieval** |    87.72      |    88.41     |  **88.78**   |    88.9       |
-| **SCIDOCS**     |      18.42       |  **21.82**   |    20.52     |    21.73      |
-| **SciFact**     |    **74.77**     |    72.22     |    71.28     |    74.04      |
-| **Touche2020**  |      25.69       |    23.48     |    **26.04**     |    25.7       |
-### Against ColBERTv2.0
-| Dataset / Model | answerai-colbert-small-v1 | ColBERTv2.0 |
-|:-----------------|:-----------------------:|:------------:|
-| **BEIR AVG**    |      **53.79**       |   50.02 |
-| **DBPedia**     |    **45.58**     |    44.6     |
-| **FiQA2018**    |    **41.15**     |    35.6     |
-| **NQ**          |    **59.1**      |    56.2     |
-| **HotpotQA**    |    **76.11**     |    66.7     |
-| **NFCorpus**    |    **37.3**      |    33.8     |
-| **TRECCOVID**   |    **84.59**     |    73.3     |
-| **Touche2020**  |      25.69       |  **26.3**   |
-| **ArguAna**     |    **50.09**     |    46.3     |
-| **ClimateFEVER**|    **33.07**     |    17.6     |
-| **FEVER**       |    **90.96**     |    78.5     |
-| **QuoraRetrieval** |    **87.72**     |  85.2   |
-| **SCIDOCS**     |    **18.42**     |    15.4     |
-| **SciFact**     |    **74.77**     |    69.3     |
 ## Usage
 ### Installation
@@ -93,6 +47,19 @@ ranker.rank(query=query, docs=docs)
 ### RAGatouille
 ### Stanford ColBERT
 #### Indexing
@@ -149,4 +116,51 @@ from colbert.modeling.checkpoint import Checkpoint
 ckpt = Checkpoint("answerdotai/answerai-colbert-small-v1", colbert_config=ColBERTConfig())
 embedded_query = ckpt.queryFromText(["Who dubs Howl's in English?"], bsize=16)
-```

 For more information about this model or how it was trained, head over to the [announcement blogpost](https://www.answer.ai/posts/2024-08-13-small-but-mighty-colbert.html).
 ## Usage
 ### Installation
 ### RAGatouille
+```python
+from ragatouille import RAGPretrainedModel
+RAG = RAGPretrainedModel.from_pretrained("answerdotai/answerai-colbert-small-v1")
+docs = ['Hayao Miyazaki is a Japanese director, born on [...]', 'Walt Disney is an American author, director and [...]', ...]
+RAG.index(docs, index_name="ghibli")
+query = 'Who directed spirited away?'
+results = RAG.search(query)
+```
 ### Stanford ColBERT
 #### Indexing
 ckpt = Checkpoint("answerdotai/answerai-colbert-small-v1", colbert_config=ColBERTConfig())
 embedded_query = ckpt.queryFromText(["Who dubs Howl's in English?"], bsize=16)
+```
+## Results
+### Against single-vector models
+![](https://www.answer.ai/posts/images/minicolbert/small_results.png)
+| Dataset / Model | answer-colbert-s | snowflake-s | bge-small-en | bge-base-en |
+|:-----------------|:-----------------:|:-------------:|:-------------:|:-------------:|
+| **Size**        |     33M (1x)     |   33M (1x)   |   33M (1x)   | **109M (3.3x)** |
+| **BEIR AVG**    |      **53.79**       |    51.99     |    51.68     |    53.25      |
+| **FiQA2018**    |      **41.15**       |    40.65     |    40.34     |    40.65      |
+| **HotpotQA**    |    **76.11**     |    66.54     |    69.94     |    72.6       |
+| **MSMARCO**     |    **43.5**      |    40.23     |    40.83     |    41.35      |
+| **NQ**          |      **59.1**        |    50.9      |    50.18     |    54.15      |
+| **TRECCOVID**   |    **84.59**     |    80.12     |    75.9      |    78.07      |
+| **ArguAna**     |      50.09       |    57.59     |    59.55     |  **63.61**    |
+| **ClimateFEVER**|      33.07       |  **35.2**    |    31.84     |    31.17      |
+| **CQADupstackRetrieval** |  38.75  |    39.65     |    39.05     |    **42.35**      |
+| **DBPedia**     |    **45.58**     |    41.02     |    40.03     |    40.77      |
+| **FEVER**       |    **90.96**     |    87.13     |    86.64     |    86.29      |
+| **NFCorpus**    |      37.3        |    34.92     |    34.3      |  **37.39**    |
+| **QuoraRetrieval** |    87.72      |    88.41     |    88.78     |  **88.9**     |
+| **SCIDOCS**     |      18.42       |  **21.82**   |    20.52     |    21.73      |
+| **SciFact**     |    **74.77**     |    72.22     |    71.28     |    74.04      |
+| **Touche2020**  |      25.69       |    23.48     |    **26.04**     |    25.7       |
+### Against ColBERTv2.0
+| Dataset / Model | answerai-colbert-small-v1 | ColBERTv2.0 |
+|:-----------------|:-----------------------:|:------------:|
+| **BEIR AVG**    |      **53.79**       |   50.02 |
+| **DBPedia**     |    **45.58**     |    44.6     |
+| **FiQA2018**    |    **41.15**     |    35.6     |
+| **NQ**          |    **59.1**      |    56.2     |
+| **HotpotQA**    |    **76.11**     |    66.7     |
+| **NFCorpus**    |    **37.3**      |    33.8     |
+| **TRECCOVID**   |    **84.59**     |    73.3     |
+| **Touche2020**  |      25.69       |  **26.3**   |
+| **ArguAna**     |    **50.09**     |    46.3     |
+| **ClimateFEVER**|    **33.07**     |    17.6     |
+| **FEVER**       |    **90.96**     |    78.5     |
+| **QuoraRetrieval** |    **87.72**     |  85.2   |
+| **SCIDOCS**     |    **18.42**     |    15.4     |
+| **SciFact**     |    **74.77**     |    69.3     |
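
As a reading aid for the ColBERTv2.0 comparison, the per-dataset margins can be recomputed directly from the table. The following is a minimal sketch in plain Python: the scores are copied from the table above, while the `scores` dictionary and the variable names are illustrative assumptions and not part of the model card.

```python
# Scores copied from the "Against ColBERTv2.0" table:
# dataset -> (answerai-colbert-small-v1, ColBERTv2.0)
scores = {
    "DBPedia": (45.58, 44.6),
    "FiQA2018": (41.15, 35.6),
    "NQ": (59.1, 56.2),
    "HotpotQA": (76.11, 66.7),
    "NFCorpus": (37.3, 33.8),
    "TRECCOVID": (84.59, 73.3),
    "Touche2020": (25.69, 26.3),
    "ArguAna": (50.09, 46.3),
    "ClimateFEVER": (33.07, 17.6),
    "FEVER": (90.96, 78.5),
    "QuoraRetrieval": (87.72, 85.2),
    "SCIDOCS": (18.42, 15.4),
    "SciFact": (74.77, 69.3),
}

# Margin of the small model over ColBERTv2.0 (positive = small model ahead).
margins = {name: small - v2 for name, (small, v2) in scores.items()}

# Count the datasets where the small model scores higher, and average the margins.
wins = sum(1 for m in margins.values() if m > 0)
mean_margin = sum(margins.values()) / len(margins)

# Print the datasets from largest to smallest margin.
for name, margin in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<15} {margin:+.2f}")
print(f"wins: {wins}/{len(margins)}, mean margin: {mean_margin:.2f}")
```

The mean computed here covers only the 13 datasets listed individually, so it is not expected to equal the difference between the two BEIR AVG rows.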