shahrukhx01 committed
Commit · 4e9e06a · 1 Parent(s): 4a9a626

Update README.md

README.md CHANGED
@@ -8,6 +8,28 @@ tags:
 - structured-data-search
 ---
 A Siamese BERT architecture trained at character levels tokens for embedding based Fuzzy matching.
+
+
+## Usage (Sentence-Transformers)
+Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
+```
+pip install -U sentence-transformers
+```
+Then you can use the model like this:
+```python
+from sentence_transformers import SentenceTransformer, util
+word1 = "fuzzformer"
+word1 = " ".join([char for char in word1]) ## divide the word to char level to fuzzy match
+word2 = "fizzformer"
+word2 = " ".join([char for char in word2]) ## divide the word to char level to fuzzy match
+words = [word1, word2]
+
+model = SentenceTransformer('shahrukhx01/paraphrase-mpnet-base-v2-fuzzy-matcher')
+fuzzy_embeddings = model.encode(words)
+
+print("Fuzzy Match score:")
+print(util.cos_sim(fuzzy_embeddings[0], fuzzy_embeddings[1]))
+```
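The character-level preprocessing in the added usage example repeats the same `" ".join(...)` step for each word, so it could be wrapped in a small helper. The following is a minimal sketch and not part of this commit: `to_char_level` and `fuzzy_score` are hypothetical names, and it assumes `sentence-transformers` is installed and that the model named in the README can be downloaded.

```python
def to_char_level(word: str) -> str:
    """Separate the characters of a word with spaces, as the README example does."""
    return " ".join(word)


def fuzzy_score(word_a: str, word_b: str, model=None) -> float:
    """Cosine similarity between the character-level embeddings of two words.

    Hypothetical helper for illustration; not an API of the model or the library.
    """
    # Imported lazily so to_char_level works without the library installed.
    from sentence_transformers import SentenceTransformer, util

    if model is None:
        # Model name taken from the README; loading it downloads the weights.
        model = SentenceTransformer(
            "shahrukhx01/paraphrase-mpnet-base-v2-fuzzy-matcher"
        )
    embeddings = model.encode([to_char_level(word_a), to_char_level(word_b)])
    # util.cos_sim returns a 1x1 tensor when given two single vectors.
    return util.cos_sim(embeddings[0], embeddings[1]).item()
```

Calling `fuzzy_score("fuzzformer", "fizzformer")` would run the same comparison as the README example. Passing an already loaded `model` avoids reloading the weights on every call.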
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModel