metadata
library_name: transformers
language:
- my
- en
Burmese-Bert
Burmese-Bert is a bilingual masked language model based on "bert-large-uncased".
Its architecture is BERT (Bidirectional Encoder Representations from Transformers).
It supports English and Burmese.
Model Details
Coming Soon
Model Description
- Developed by: Min Si Thu
- Model type: BERT (Bidirectional Encoder Representations from Transformers), masked language model
- Language(s) (NLP): Burmese (my), English (en)
- License: [More Information Needed]
- Finetuned from model [optional]: bert-large-uncased
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
- Masked language modeling (mask filling)
- Burmese Natural Language Understanding
How to use
# Install the dependencies (the example below also needs PyTorch)
pip install transformers torch
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model_checkpoint = "jojo-ai-mst/BurmeseBert"
model = AutoModelForMaskedLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

text = "This is a great [MASK]."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # inference only, so no gradients are needed
    token_logits = model(**inputs).logits

# Find the location of [MASK] and extract its logits
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
mask_token_logits = token_logits[0, mask_token_index, :]

# Pick the [MASK] candidates with the highest logits
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()
for token in top_5_tokens:
    print(f"'>>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}'")
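The manual steps above (tokenize, locate the mask, take the top logits) can likely also be done through the `fill-mask` pipeline in `transformers`. The following is a minimal sketch, not the author's method: `format_predictions` and `fill_mask` are hypothetical helper names introduced here for illustration, and the sketch assumes the input contains exactly one mask token and that the checkpoint loads with the standard pipeline.

```python
def format_predictions(text, predictions, mask_token="[MASK]"):
    """Turn fill-mask pipeline output into (filled_sentence, score) pairs.

    `predictions` is assumed to be a list of dicts with at least the keys
    "token_str" and "score", which is what the pipeline returns for a
    single-mask input.
    """
    results = []
    for pred in predictions:
        # Substitute the candidate token for the first mask in the text.
        filled = text.replace(mask_token, pred["token_str"].strip(), 1)
        results.append((filled, round(float(pred["score"]), 4)))
    return results


def fill_mask(text, top_k=5, checkpoint="jojo-ai-mst/BurmeseBert"):
    """Download the checkpoint (needs network access) and rank mask candidates."""
    # Imported lazily so format_predictions can be used without transformers.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=checkpoint)
    predictions = unmasker(text, top_k=top_k)
    return format_predictions(
        text, predictions, mask_token=unmasker.tokenizer.mask_token
    )


# Example usage (downloads the model on first call):
# for sentence, score in fill_mask("This is a great [MASK]."):
#     print(f"{score:.4f}  {sentence}")
```

Unlike the raw logits used in the manual example, the pipeline reports a probability for each candidate, which makes the ranking easier to read.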
Citation [optional]
Coming Soon