Thai-TrOCR Model

🚀 Final Model Available Now!

The final version of the Thai-TrOCR model has been released. Check it out here: huggingface.co/openthaigpt/thai-trocr


Introduction

Thai-TrOCR is an Optical Character Recognition (OCR) model fine-tuned to recognize handwritten text in Thai and English. It builds on the TrOCR architecture, pairing a Vision Transformer encoder with an Electra-based text decoder, which lets it handle multilingual text-line images.

Thai-TrOCR is lightweight (about 103M parameters), which makes it suitable for deployment in resource-constrained environments while aiming to preserve recognition accuracy.

Key Features:

  • Encoder: TrOCR Base Handwritten
  • Decoder: Electra Small (trained on a Thai corpus)
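
One way to check the encoder/decoder split and the "lightweight" claim on a loaded model is to count parameters per component. The sketch below assumes a model loaded with VisionEncoderDecoderModel (as in the usage section), whose `encoder` and `decoder` attributes are PyTorch modules. `summarize_components` is a hypothetical helper name used for illustration, not part of the model's API.

```python
def summarize_components(model):
    """Count parameters in the encoder and decoder of an encoder-decoder model.

    Assumption: `model.encoder` and `model.decoder` each expose `.parameters()`,
    as PyTorch modules do. Any extra layers outside these two components
    (for example a projection between encoder and decoder) are not counted.
    """
    def count(module):
        # numel() gives the number of scalar values in one parameter tensor
        return sum(p.numel() for p in module.parameters())

    encoder_params = count(model.encoder)
    decoder_params = count(model.decoder)
    return {
        "encoder_params": encoder_params,
        "decoder_params": decoder_params,
        "encoder_plus_decoder": encoder_params + decoder_params,
    }
```

After loading the model, `summarize_components(model)` returns a dictionary of counts that can be compared against the model size reported on the model page.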

Training Dataset

Thai-TrOCR was trained using the following datasets:

  • pythainlp/thai-wiki-dataset-v3
  • pythainlp/thaigov-corpus
  • Salesforce/wikitext

How to Use This Beta Model

Here's a quick guide to get started with the Thai-TrOCR model in PyTorch:

from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests

# Load processor and model
processor = TrOCRProcessor.from_pretrained('suchut/thaitrocr-base-handwritten-beta2')
model = VisionEncoderDecoderModel.from_pretrained('suchut/thaitrocr-base-handwritten-beta2')

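# Load an image (replace the placeholder with the URL of a text-line image)
url = 'your_image_url_here'
response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()  # fail early on HTTP errors
image = Image.open(response.raw).convert("RGB")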

# Process and generate text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
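
When a page has been split into several text-line images, the same three steps (process, generate, decode) can be applied in batches. The following is a minimal sketch under stated assumptions: `recognize_lines` is a hypothetical helper, `images` is a list of already-loaded RGB PIL images, and the `batch_size` and `max_new_tokens` defaults are illustrative values, not settings recommended by the model authors.

```python
def recognize_lines(images, processor, model, batch_size=8, max_new_tokens=64):
    """Run OCR over a list of text-line images in fixed-size batches.

    `processor` and `model` are expected to behave like the TrOCRProcessor and
    VisionEncoderDecoderModel objects loaded above. Returns one string per image,
    in input order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    texts = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        # The processor accepts a list of images and returns a batched tensor
        pixel_values = processor(images=batch, return_tensors="pt").pixel_values
        # max_new_tokens caps the length of each generated sequence
        generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts
```

For example, `recognize_lines([image], processor, model)` returns a one-element list containing the text recognized in `image`. Smaller batches reduce memory use at the cost of more forward passes.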

Model size: 103M params (Safetensors, tensor type F32)