Commit d58002c (verified), committed by erax · 1 parent: fdc87be

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -31,9 +31,9 @@ widget:
  # EraX-VL-7B-V1.5
  ## Introduction 🎉
 
- We are excited to introduce **EraX-VL-7B-V1.5**, a robust multimodal model for **OCR (optical character recognition)** and **VQA (visual question answering)** that excels in various languages 🌍, with a particular focus on Vietnamese 🇻🇳. The `EraX-VL-7B-V1.5` model stands out for its precise recognition capabilities across a range of documents 📝, including medical forms 🩺, invoices 🧾, bills of sale 💳, quotes 📄, and medical records 💊. This functionality is expected to be highly beneficial for hospitals 🏥, clinics 💉, insurance companies 🛡️, and other similar applications 📋. Built on the solid foundation of [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)[1], which we found to be of high quality and fluent in Vietnamese, `EraX-VL-7B-V1.5` has been fine-tuned to enhance its performance. We plan to continue improving and releasing new versions for free, along with sharing performance benchmarks in the near future.
 
- One standout feature of **EraX-VL-7B-V1.5** is its ability to do multi-turn Q&A with good reasoning!
 
  ***NOTA BENE***: EraX-VL-7B-V1.5 is NOT a typical OCR-only tool like Tesseract but a multimodal LLM-based model. To use it effectively, you may have to **tweak your prompt carefully** depending on your task.
 
@@ -48,6 +48,8 @@ One standing-out feature of **EraX-VL-7B-V1.5** is the capability to do multi-tu
 
  ## 🏆 LeaderBoard
 
  <table style="width:75%;">
  <tr>
  <th align="middle" width="300">Models</th>
 
  # EraX-VL-7B-V1.5
  ## Introduction 🎉
 
+ We are excited to introduce **EraX-VL-7B-V1.5**, another robust multimodal model for **OCR (optical character recognition)** and **VQA (visual question answering)** that excels in various languages 🌍, with a particular focus on Vietnamese 🇻🇳. This model stands out for its precise recognition capabilities across a range of documents 📝, including medical forms 🩺, invoices 🧾, bills of sale 💳, quotes 📄, and medical records 💊. This functionality is expected to be highly beneficial for hospitals 🏥, clinics 💉, insurance companies 🛡️, and other similar applications 📋. Built on the solid foundation of [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)[1], which we found to be of high quality and fluent in Vietnamese, `EraX-VL-7B-V1.5` has been fine-tuned to enhance its performance. We plan to continue improving and releasing new versions for free, along with sharing performance benchmarks in the near future.
 
+ One standout feature of **EraX-VL-7B-V1.5** is its ability to do multi-turn Q&A with impressive reasoning!
 
  ***NOTA BENE***: EraX-VL-7B-V1.5 is NOT a typical OCR-only tool like Tesseract but a multimodal LLM-based model. To use it effectively, you may have to **tweak your prompt carefully** depending on your task.
 
 
 
  ## 🏆 LeaderBoard
 
+ `EraX-VL-7B-V1.5` achieves exceptional performance compared to other models of equal or even 10x larger size. You can re-run the benchmark at any time.
+
  <table style="width:75%;">
  <tr>
  <th align="middle" width="300">Models</th>