---
library_name: transformers
language:
- my
- en
---

# Burmese-Bert

Burmese-Bert is a bilingual masked language model based on `bert-large-uncased`. The architecture is BERT (Bidirectional Encoder Representations from Transformers), and the model supports both English and Burmese.

## Model Details

Coming Soon

### Model Description

- **Developed by:** Min Si Thu
- **Model type:** BERT (Bidirectional Encoder Representations from Transformers), masked language modeling
- **Language(s) (NLP):** Burmese (`my`), English (`en`)
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** `bert-large-uncased`

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

- Mask-filling language model (fill-mask)
- Burmese natural language understanding

### How to use

Install the dependencies:

```shell
# Install the Transformers library (the example below also requires PyTorch)
pip install transformers torch
```

Load the checkpoint, score the `[MASK]` position, and print the top-5 candidate completions:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_checkpoint = "jojo-ai-mst/BurmeseBert"

# Load the pretrained masked language model and its tokenizer
model = AutoModelForMaskedLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

text = "This is a great [MASK]."

# Tokenize the input and get logits for every position in the sequence
inputs = tokenizer(text, return_tensors="pt")
token_logits = model(**inputs).logits

# Find the location of [MASK] and extract its logits
mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
mask_token_logits = token_logits[0, mask_token_index, :]

# Pick the [MASK] candidates with the highest logits
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()

for token in top_5_tokens:
    print(f"'>>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}'")
```
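If you do not need the raw logits, the same checkpoint should also work through the high-level `pipeline` API, which bundles the tokenization and top-k decoding shown above. The snippet below is a minimal sketch, not part of the model's official documentation: the Burmese prompt ("Yangon is the largest [MASK] of Myanmar.") is an illustrative sentence chosen here, and the mask token is read from the tokenizer rather than hard-coded, assuming BERT-family conventions.

```python
from transformers import pipeline

# Wrap the checkpoint in a fill-mask pipeline (downloads the model on first use)
fill_mask = pipeline("fill-mask", model="jojo-ai-mst/BurmeseBert")

# English prompt; use the tokenizer's own mask token instead of hard-coding "[MASK]"
english = f"This is a great {fill_mask.tokenizer.mask_token}."
# Burmese prompt ("Yangon is the largest [MASK] of Myanmar.") -- illustrative example
burmese = f"ရန်ကုန်သည် မြန်မာနိုင်ငံ၏ အကြီးဆုံး {fill_mask.tokenizer.mask_token} ဖြစ်သည်။"

for prompt in (english, burmese):
    # Each prediction is a dict with the completed sequence and a confidence score
    for prediction in fill_mask(prompt, top_k=5):
        print(f"{prediction['score']:.3f}  {prediction['sequence']}")
```

This mirrors the manual top-5 decoding above; the pipeline is simply more convenient when you only need the filled-in sentences.

## Citation [optional]

Coming Soon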