Burmese-Bert

Burmese-Bert is a bilingual masked language model based on "bert-large-uncased".

The architecture is based on BERT (Bidirectional Encoder Representations from Transformers).

It supports English and Burmese.

Model Details

Coming Soon

Model Description

  • Developed by: Min Si Thu
  • Model type: BERT (Bidirectional Encoder Representations from Transformers)
  • Language(s) (NLP): English, Burmese
  • License: [More Information Needed]
  • Finetuned from model [optional]: [More Information Needed]

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

  • Masked-token filling (fill-mask)
  • Burmese Natural Language Understanding
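For the mask-filling use, a minimal sketch with the `pipeline` helper from `transformers` might look like the following. The function names `build_fill_mask` and `fill_sentences` are hypothetical, introduced only for illustration, and the sketch assumes the input text contains exactly one mask token (with several masks the pipeline returns a nested list instead).

```python
from typing import Any, Callable, List


def build_fill_mask(model_checkpoint: str = "jojo-ai-mst/BurmeseBert") -> Callable[..., Any]:
    """Create a fill-mask pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_checkpoint)


def fill_sentences(fill_mask: Callable[..., Any], text: str, top_k: int = 5) -> List[str]:
    """Return the completed sentences for a text containing exactly one mask token."""
    predictions = fill_mask(text, top_k=top_k)
    # For a single mask, each prediction is a dict that includes a "sequence" entry.
    return [prediction["sequence"] for prediction in predictions]
```

A call such as `fill_sentences(build_fill_mask(), "This is a great [MASK].")` would then return up to five completed sentences, ordered by the pipeline's score.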

How to use

# Install the dependencies first (run in a shell, not in Python):
#   pip install transformers torch

from transformers import AutoModelForMaskedLM, AutoTokenizer

model_checkpoint = "jojo-ai-mst/BurmeseBert"
model = AutoModelForMaskedLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

text = "This is a great [MASK]."

import torch

inputs = tokenizer(text, return_tensors="pt")
token_logits = model(**inputs).logits
# Find the location of [MASK] and extract its logits
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
mask_token_logits = token_logits[0, mask_token_index, :]
# Pick the [MASK] candidates with the highest logits
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()

for token in top_5_tokens:
    print(f"'>>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}'")
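The loop above ranks candidates by raw logits, which are not probabilities. As a rough sketch, a softmax over the vocabulary turns them into probabilities that are easier to compare across inputs. The helper `top_k_with_probs` is a hypothetical name, written in plain Python so it runs without any model; the toy logits are illustrative values, not model output.

```python
import math
from typing import List, Sequence, Tuple


def top_k_with_probs(logits: Sequence[float], k: int = 5) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (token_id, probability) pairs."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort token ids by probability, highest first, and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Toy logits over a 4-token vocabulary (illustrative values only).
toy = top_k_with_probs([2.0, 0.5, 3.0, -1.0], k=2)
```

With the model loaded as above, the input could be `mask_token_logits[0].detach().tolist()`, and each returned id can be decoded with `tokenizer.decode([token_id])`. Staying in tensors, `torch.softmax(mask_token_logits, dim=-1)` computes the same probabilities.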

Citation [optional]

Coming Soon

Model size: 380M params (Safetensors, tensor type F32)