felflare committed on
Commit 1be7657 · 1 Parent(s): 549f6e7
Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -5,11 +5,15 @@ license: mit
 ---
 # ✨ bert-restore-punctuation
 [![forthebadge](https://forthebadge.com/images/badges/gluten-free.svg)]()
+
 This a bert-base-uncased model finetuned for punctuation restoration on [Yelp Reviews](https://www.tensorflow.org/datasets/catalog/yelp_polarity_reviews).
+
 The model predicts the punctuation and upper-casing of plain, lower-cased text. An example use case can be ASR output. Or other cases when text has lost punctuation.
+
 This model is intended for direct use as a punctuation restoration model for the general English language. Alternatively, you can use this for further fine-tuning on domain-specific texts for punctuation restoration tasks.
 
 Model restores the following punctuations -- [` ! ? . , - : ; '`]
+
 Model also restores upper-casing of words.
 
 -----------------------------------------------
@@ -34,7 +38,9 @@ rpunct.punctuate("""in 2018 cornell researchers built a high-powered detector th
 
 -----------------------------------------------
 ## 📡 Training data
+
 Here is the number of product reviews we used for finetuning the model:
+
 | Language | Number of reviews |
 | -------- | ----------------- |
 | English | 560,000 |
@@ -51,7 +57,6 @@ The fine-tuned model obtained the following accuracy on 45,990 held-out text sam
 
 Below is a breakdown of the performance of the model by each label:
 
-
 | label | precision | recall | f1-score | support|
 | --------- | -------------|-------- | ----------|--------|
 | **!** | 0.45 | 0.17 | 0.24 | 424
@@ -69,6 +74,7 @@ Below is a breakdown of the performance of the model by each label:
 | **?+Upper** | 0.40 | 0.50 | 0.44 | 4
 | **none** | 0.96 | 0.96 | 0.96 |35352
 | **Upper** | 0.84 | 0.82 | 0.83 | 5442
+
 -----------------------------------------------
 
 ## ☕ Contact
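
Review note: the per-label table that this diff touches reports precision, recall, and f1-score side by side. As a quick sanity check, the f1-score column can be recomputed as the harmonic mean of the other two. The sketch below is not part of the commit; the helper name `f1_score` and the `rows` dictionary are hypothetical, and only the four rows visible in the diff are included. Small differences from the reported values are expected because the published precision and recall are themselves rounded to two decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score) copied from the rows visible in the diff.
rows = {
    "!": (0.45, 0.17, 0.24),
    "?+Upper": (0.40, 0.50, 0.44),
    "none": (0.96, 0.96, 0.96),
    "Upper": (0.84, 0.82, 0.83),
}

for label, (p, r, reported) in rows.items():
    # Print the recomputed value next to the reported one for comparison.
    print(f"{label:8s} recomputed={f1_score(p, r):.3f} reported={reported:.2f}")
```

Each recomputed value lands within 0.01 of the reported one, which is consistent with the table having been produced from unrounded precision and recall.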