DeDeckerThomas committed on
Commit
96251f2
·
1 Parent(s): 83b6e21

Update README.md

Files changed (1)
  1. README.md +44 -3
README.md CHANGED
@@ -43,19 +43,60 @@ Sahrawat, Dhruva, Debanjan Mahata, Haimin Zhang, Mayank Kulkarni, Agniv Sharma,
43
  * This keyphrase generation model is very domain-specific and will perform very well on abstracts of scientific papers. It's not recommended to use this model for other domains, but you are free to test it out.
44
  * Only works for English documents.
45
  * For a custom model, please consult the training notebook for more information (link incoming).
 
46
 
47
  ### ❓ How to use
48
  ```python
49
-
50
  ```
51
 
52
  ```python
 
 
53
 
54
- ```
55
 
56
  ```
57
  # Output
58
-
59
  ```
60
 
61
  ## 📚 Training Dataset
 
43
  * This keyphrase generation model is very domain-specific and will perform very well on abstracts of scientific papers. It's not recommended to use this model for other domains, but you are free to test it out.
44
  * Only works for English documents.
45
  * For a custom model, please consult the training notebook for more information (link incoming).
46
+ * Sometimes the output can make no sense.
47
 
48
  ### ❓ How to use
49
  ```python
50
+ # Model parameters
51
+ from transformers import (
52
+     Text2TextGenerationPipeline,
53
+     BartForConditionalGeneration,
54
+     AutoTokenizer,
55
+ )
56
+ import numpy as np
57
+
58
+
59
+ class KeyphraseGenerationPipeline(Text2TextGenerationPipeline):
60
+     def __init__(self, model, keyphrase_sep_token=";", *args, **kwargs):
61
+         super().__init__(
62
+             model=BartForConditionalGeneration.from_pretrained(model),
63
+             tokenizer=AutoTokenizer.from_pretrained(model),
64
+             *args,
65
+             **kwargs
66
+         )
67
+         self.keyphrase_sep_token = keyphrase_sep_token
68
+
69
+     def postprocess(self, model_outputs):
70
+         results = super().postprocess(
71
+             model_outputs=model_outputs
72
+         )
73
+         return np.unique([result.strip() for result in results[0].get("generated_text").split(self.keyphrase_sep_token)])
74
  ```
75
 
76
  ```python
77
+ model_name = "DeDeckerThomas/keyphrase-generation-keybart-inspec"
78
+ generator = KeyphraseGenerationPipeline(model=model_name)
79
 
80
+ ```python
81
+ text = """
82
+ Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
83
+ Since this is a time-consuming process, Artificial Intelligence is used to automate it.
84
+ Currently, classical machine learning methods, that use statistics and linguistics, are widely used for the extraction process.
85
+ The fact that these methods have been widely used in the community has the advantage that there are many easy-to-use libraries.
86
+ Now with the recent innovations in deep learning methods (such as recurrent neural networks and transformers, GANS, …), keyphrase extraction can be improved.
87
+ These new methods also focus on the semantics and context of a document, which is quite an improvement.
88
+ """.replace(
89
+     "\n", ""
90
+ )
91
+
92
+ keyphrases = generator(text)
93
+
94
+ print(keyphrases)
95
 
96
  ```
97
  # Output
98
+ ['artificial intelligence' 'classical machine learning methods'
99
+ 'keyphrase extraction' 'lingu' 'statistics' 'text analysis']
100
  ```
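
The post-processing step in the pipeline above splits the generated string on the separator token, strips whitespace, and removes duplicates (`np.unique` also sorts the result, which is why the output is alphabetical). A minimal standalone sketch of that step, in plain Python and without loading the model, might look like the following. The function name `split_keyphrases` and the sample string are hypothetical and not part of the README, and dropping empty fragments is an extra step this sketch adds.

```python
from typing import List


def split_keyphrases(generated_text: str, sep: str = ";") -> List[str]:
    """Split a generated string into sorted, de-duplicated keyphrases."""
    phrases = (part.strip() for part in generated_text.split(sep))
    # Dropping empty fragments is an addition in this sketch;
    # the pipeline's postprocess step keeps them.
    return sorted({phrase for phrase in phrases if phrase})


# Hypothetical generated text, made up for illustration only.
sample = "text analysis; keyphrase extraction ;statistics;text analysis;"
print(split_keyphrases(sample))
```

Because the model's raw text is split as-is, a truncated or partial phrase in the generated string (such as `'lingu'` in the output above) passes through unchanged.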
101
 
102
  ## 📚 Training Dataset