ucsahin commited on
Commit
e5350c1
·
verified ·
1 Parent(s): 651ff88

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -3
README.md CHANGED
@@ -27,8 +27,7 @@ It achieves the following results on the evaluation set:
27
 
28
  **Outputs:**
29
  - **Bounding Boxes:** The model outputs the location for the bounding box coordinates in the form of special <loc[value]> tokens, where value is a number that represents a normalized coordinate. Each detection is represented by four location coordinates in the order y_min, x_min, y_max, x_max, followed by the label that was detected in that box. To convert values to coordinates, you first need to divide the numbers by 1024, then multiply y by the image height and x by its width. This will give you the coordinates of the bounding boxes, relative to the original image size.
30
- If everything goes smoothly, the model will output a text similar to "<loc[value]><loc[value]><loc[value]><loc[value]> table; <loc[value]><loc[value]><loc[value]><loc[value]> table" depending on the number of tables detected in the image. Then, you can use the following script to convert the text output into PASCAL VOC format.
31
-
32
  ```python
33
  import re
34
 
@@ -42,7 +41,7 @@ def post_process(bbox_text, image_width, image_height):
42
  loc_values = loc_values[:4]
43
 
44
  loc_values = [value/1024 for value in loc_values]
45
- # pascal voc format (xmin, ymin, xmax, ymax)
46
  loc_values = [
47
  int(loc_values[1]*image_width), int(loc_values[0]*image_height),
48
  int(loc_values[3]*image_width), int(loc_values[2]*image_height),
 
27
 
28
  **Outputs:**
29
  - **Bounding Boxes:** The model outputs the location for the bounding box coordinates in the form of special <loc[value]> tokens, where value is a number that represents a normalized coordinate. Each detection is represented by four location coordinates in the order y_min, x_min, y_max, x_max, followed by the label that was detected in that box. To convert values to coordinates, you first need to divide the numbers by 1024, then multiply y by the image height and x by its width. This will give you the coordinates of the bounding boxes, relative to the original image size.
30
+ If everything goes smoothly, the model will output a text similar to "<loc[value]><loc[value]><loc[value]><loc[value]> table; <loc[value]><loc[value]><loc[value]><loc[value]> table" depending on the number of tables detected in the image. Then, you can use the following script to convert the text output into PASCAL VOC formatted bounding boxes.
 
31
  ```python
32
  import re
33
 
 
41
  loc_values = loc_values[:4]
42
 
43
  loc_values = [value/1024 for value in loc_values]
44
+ # convert to (xmin, ymin, xmax, ymax)
45
  loc_values = [
46
  int(loc_values[1]*image_width), int(loc_values[0]*image_height),
47
  int(loc_values[3]*image_width), int(loc_values[2]*image_height),